Document #: | |
Date: | 2024-10-24 |
Project: | Programming Language C++ |
Audience: |
|
Reply-to: |
Sean Baxter <seanbax.circle@gmail.com> |
As for dangling pointers and for ownership, this model detects all possible errors. This means that we can guarantee that a program is free of uses of invalidated pointers.
– A brief introduction to C++’s model for type- and resource- safety[type-and-resource-safety-2015]
Safety Profiles were introduced in 2015 with the promise to detect all lifetime safety defects in existing C++ code. It was a bold claim. But after a decade of effort, Profiles failed to produce a specification, reliable implementation or any tangible benefit for C++ safety. The cause of this failure involves a number of mistaken premises at the core of its design:
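1. “Zero annotation is required by default, because existing C++ source code already contains sufficient information”
2. “We should not require a safe function annotation”[P3446R0]
3. “Do not add a feature that requires viral annotation”[P3446R0]
4. “Do not add a feature that requires heavy annotation”[P3446R0]
The parameters of the problem make success impossible. This paper examines the contradictions in these premises, explains why the design didn’t improve safety in the past and why it won’t improve safety in the future.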
Note that this document is specifically about the lifetime safety claims in [P1179R1] and [P3465R0]. The wider set of analyses in [P3081R0] that deal with rejecting specific operations like reinterpret_cast, union access, variadic arguments, etc., are not considered here.
Zero annotation is required by default, because existing C++ source code already contains sufficient information.
C++ source code does not have sufficient information for achieving memory safety. A C++ function declaration lacks three things that are critical for lifetime safety:
1. Aliasing information.
2. Lifetime information.
3. Safeness information.
Functions involving parameter types with pointer or reference semantics have implicit aliasing, lifetime and safeness requirements. Safety Profiles cannot recover these properties from C++ code, because there are no language facilities to describe them. These requirements are only specified in documentation, if they are specified at all.
A C++ compiler can infer nothing about aliasing from a function declaration. A function parameter with a mutable reference might always alias other parameters, it might never alias other parameters, or it might not care about aliasing other parameters.
// i and j must always alias. They must refer to the same container.
void f1(std::vector<int>::iterator i, std::vector<int>::iterator j) {
  // If i and j point into different vectors, you have real problems.
  std::sort(i, j);
}

// vec must not alias x.
void f2(std::vector<int>& vec, int& x) {
  // Resizing vec may invalidate x if x is a member of vec.
  vec.push_back(5);

  // Potential use-after-free.
  x = 6;
}

// vec may or may not alias x. It doesn't matter.
void f3(std::vector<int>& vec, const int& x) {
  vec.push_back(x);
}
f1 and f2 have aliasing requirements. In f1, both iterators must point into the same container. In f2, x must not come from the container vec. These requirements are only visible as documentation. The compiler cannot infer a function’s aliasing requirements from its declaration or even from its definition. If the safety profile enforces no mutable aliasing, then correct calls to f1 and f3 will fail to compile, breaking your program.
int main() {
  std::vector<int> vec1, vec2;

  // *Incorrectly* permits call.
  // UB, because the iterators point into different containers.
  f1(vec1.begin(), vec2.end());

  // *Incorrectly* rejects call.
  // This is the correct usage, but mutable aliasing prevents compilation.
  f1(vec1.begin(), vec1.end());

  // *Correctly* rejects call.
  f2(vec1, vec1[2]);

  // *Incorrectly* rejects call.
  f3(vec1, vec1[2]);
}
Profiles chose the wrong convention for several uses. The convention permits the incorrect call to f1 to compile, but rejects a correct usage of f1 on the grounds of mutable aliasing. An unsound call to f2 is correctly rejected, but a sound call to f3 is also rejected. Rejecting or permitting code (rightly or wrongly) is a matter of coincidence, not intelligence.
Without language-level aliasing information, compile-time memory safety is not possible. This requirement is the motivation for Rust’s borrow type. A mutable borrow cannot alias other borrows. That’s enforced by the borrow checker. Raw pointers have no aliasing requirements, but are unsafe to dereference. In general, things that can be checked by the compiler are checked, and things that can’t be checked are unsafe to use.
#include <vector>
#include <iostream>

void func(std::vector<int>& vec, int& x) {
  vec.push_back(1);
  x = 2; // A write-after-free when x is a member of vec!
}

int main() {
  std::vector<int> vec;
  vec.push_back(1);
  func(vec, vec[0]);

  std::cout<< vec[0]<< "\n";
  std::cout<< vec[1]<< "\n";
}
Program returned: 0
1
1
The Safety Profiles partial reference implementation can’t prevent aliasing-related undefined behavior because C++ doesn’t provide aliasing information.
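To make the hazard concrete in today's C++, the following is a minimal sketch, not part of Profiles or Safe C++, of a run-time alias check. The helper names points_into and push_then_set are hypothetical, and the range test assumes the total pointer order that std::less provides on mainstream implementations. It illustrates that aliasing is a property of a particular call that has to be checked by hand, because the declaration carries no such information.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper: true when x is stored inside vec's current buffer.
// std::less gives a strict total order over pointers, even unrelated ones.
inline bool points_into(const std::vector<int>& vec, const int& x) {
  const int* p = &x;
  const int* first = vec.data();
  const int* last = first + vec.size();
  std::less<const int*> lt;
  return !lt(p, first) && lt(p, last);
}

// Hypothetical variant of func that tolerates aliasing. When x lives in vec,
// it remembers the element's index, because push_back may move the buffer.
inline void push_then_set(std::vector<int>& vec, int& x) {
  if (points_into(vec, x)) {
    std::size_t index = static_cast<std::size_t>(&x - vec.data());
    vec.push_back(1);
    assert(index < vec.size());
    vec[index] = 2;  // Write through the vector, not the stale reference.
  } else {
    vec.push_back(1);
    x = 2;           // x is outside vec, so the reference is still valid.
  }
}
```

Calling push_then_set(v, v[0]) on a one-element vector holding 1 leaves v holding {2, 1} rather than writing through a stale reference. The check costs run time and discipline on every such function, which is the burden that language-level aliasing information removes.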
A C++ compiler can infer nothing about lifetimes from a function declaration. A reference return type may be constrained by the lifetimes of any number of reference parameters, by none of the reference parameters, or by some other lifetime.
// The returned reference is only constrained by the lifetime of the map
// parameter.
// It is not constrained by the lifetime of the key parameter.
const int& f4(std::map<int, int>& map, const int& key) {
return map[key];
}
// The returned reference is constrained by the lifetime of both x and y
// parameters.
const int& f5(const int& x, const int& y) {
return std::min(x, y);
}
// The returned reference is not constrained by the lifetime of any
// reference parameter.
const int& f6(const int& key) {
static std::map<int, int> map;
return map[key];
}
These three functions have different lifetime requirements, which are indicated by comments. This information is available to developers but not to the compiler. What’s the strategy to uphold these lifetime requirements? Read the documentation, read the code, and don’t make mistakes.
int main() {
  std::map<int, int> map;

  // r4 is constrained by lifetimes of map and 40.
  const int& r4 = f4(map, 40);
  // *Incorrectly* rejects usage of r4. r4 is constrained to the lifetime
  // of the temporary 40, which expired at the end of the above statement.
  int x = r4;
  // r5 is constrained by lifetimes of 50 and 51.
  const int& r5 = f5(50, 51);
  // *Correctly* rejects usage of r5. The reference refers to one of the
  // two expired temporaries. This use would be a use-after-free.
  int y = r5;
  // r6 is constrained by the lifetime of 60.
  const int& r6 = f6(60);
  // *Incorrectly* rejects usage of r6.
  // The return reference r6 should not be constrained by the lifetime of 60.
  int z = r6;
}
Profiles take a similarly conservative approach to lifetimes as they do with aliasing. The lifetime of a returned reference is constrained by the lifetimes of all of its arguments. This is fortuitous for a function like std::min, which returns a reference to either of its function parameters. It’s bad for a function like std::map<T>::operator[], which takes a key argument by reference but returns a reference that’s only constrained by the lifetime of this.
Since the compiler has no information about function parameter lifetimes, it can’t accurately flag out-of-contract function calls. f4 and f6 take references to temporary objects but return references that should not be constrained to that temporary. In both cases, the safety profile rejects a subsequent use of the reference as a use-after-free, because it applies a too-conservative convention.
The need for explicit lifetime information in function types is the motivation for Rust’s lifetime arguments. A returned reference must be annotated with a lifetime parameter that is constrained by a function parameter on the same function, or it must be static. The alternative is to be deluged with an impossible quantity of use-after-free false positives.
#include <map>
#include <utility>

const int& f4(std::map<int, int>& map, const int& key) {
  return map[key];
}

int main() {
  std::map<int, int> map;
  const int& ref = f4(map, 200);
  int x = ref;
}
<source>:11:11: warning: dereferencing a dangling pointer [-Wlifetime]
  int x = ref;
          ^~~
<source>:10:32: note: temporary was destroyed at the end of the full expression
  const int& ref = f4(map, 200);
                               ^
The Safety Profiles reference implementation can’t accurately deal with lifetimes because C++ doesn’t provide lifetime information. The tool doesn’t test for correctness, it only tests if your code conforms to a pre-chosen convention.
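One workaround available in existing C++, at the cost of a copy, is to return by value so that the result carries no lifetime constraint at all. The sketch below uses hypothetical names (min_value, lookup_or_zero, by_value_demo) and only illustrates that trade-off. It does not recover the reference-returning APIs discussed above, and lookup_or_zero deliberately differs from f4 in that it never inserts into the map.

```cpp
#include <algorithm>
#include <cassert>
#include <map>

// Hypothetical by-value variant of f5: the result is a copy, so it cannot
// dangle when the arguments are temporaries.
inline int min_value(const int& x, const int& y) {
  return std::min(x, y);
}

// Hypothetical by-value lookup: copies the mapped value out, or returns 0
// when the key is absent. Unlike map[key], it does not insert.
inline int lookup_or_zero(const std::map<int, int>& map, const int& key) {
  auto it = map.find(key);
  return it == map.end() ? 0 : it->second;
}

// Usage: the temporaries 50 and 51 expire at the end of the statement,
// but y is an independent copy.
inline void by_value_demo() {
  int y = min_value(50, 51);
  assert(y == 50);
}
```

This sidesteps the ambiguity only for cheaply copyable results. An API that must hand back a reference into a container still needs its lifetime relationship stated somewhere the compiler can see it.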
We should not require a safe function annotation that has the semantics that a safe function can only call other safe functions.
– (Re)affirm design principles for future C++ evolution[P3446R0]
Recall what “safe” actually means:
A C++ compiler can infer nothing about safeness from a function declaration. It can’t tell by looking what constitutes an out-of-contract call and what doesn’t. A safe-specifier indicates the absence of soundness preconditions. An unsafe-block permits the user to escape the safe context, prove the preconditions, and call the unsafe function.
template<typename T>
class vector {
public:
  size_t size() const noexcept safe {
    return _len;
  }

  T& operator[](size_t index) noexcept safe {
    // Can call size() because it's a safe function.
    if(index >= size())
      panic("Out-of-bounds vector::operator[]");

    unsafe {
      // Pointer operations only allowed in unsafe context.
      // Safety proof:
      // The allocation has size() valid elements and index < size().
      return _data[index];
    }
  }

private:
  T* _data;
  size_t _len, _cap;
};
Let’s take a really simple case: vector::operator[]. Profiles have to reject pointer arithmetic, because there’s no static analysis protection against indexing past the end of the allocation. How is the compiler told to permit the raw pointer subscript in the return-statement in vector::operator[]? In Rust and Safe C++, enter an unsafe-block.
This design distinguishes safe functions, which have no soundness preconditions and can be called from other safe functions, and unsafe functions, which require an unsafe-block escape to use, just like pointer operations.
Separation of safe and unsafe functions is common in memory-safe languages. Rust and C#[csharp] include an unsafe function specifier and an unsafe-block construct. This is a human- and tooling-readable tag for auditing potential origins of soundness defects. Aliasing and lifetimes are transitive properties that must be recoverable from a function declaration in order to be upheld. Safeness (the lack of soundness preconditions) is another transitive property that must be marked in a function declaration. The way to do that is with a safe-specifier.
template< class RandomIt >
void sort( RandomIt first, RandomIt last );
Let’s consider another example: the std::sort API that takes two random-access iterators. This is an unsafe function because it exhibits undefined behavior if called with the wrong arguments. But there’s nothing in the type system to indicate that it has soundness preconditions, so the compiler doesn’t know to reject calls in safe contexts.
What are sort’s preconditions?
1. The first and last iterators must point at elements from the same container.
2. first must not indicate an element that appears after last.
3. first and last may not be dangling iterators.
In the absence of enforced safeness information, it’s up to the user to follow the documentation and satisfy the requirements. Guidance for calling unsafe functions is essentially “don’t write bugs.”
void func(std::vector<int> vec1, std::vector<int> vec2) {
  // #1 - *Incorrectly* rejects correct call for mutable aliasing
  sort(vec1.begin(), vec1.end());

  // #2 - *Incorrectly* permits out-of-contract call.
  sort(vec1.begin(), vec2.end());
}
In the Profiles model, the correct call to sort #1 is rejected due to mutable aliasing. That’s bad, but permitting the out-of-contract call #2 is worse, because it’s a soundness bug. There’s no realistic static analysis technology to verify that a call to sort meets its preconditions. Even the safety profile with the most conservative aliasing setting lets this call through.
This is where safe and unsafe specifiers play an important role. From the caller’s perspective, sort is unsafe because it has preconditions that must be upheld without the compiler’s help. From the callee’s perspective, sort is unsafe because it’s written with unsafe operations. Pointer differencing computes a pivot for the sort, and pointer differencing is undefined when its operands point to different allocations.
// No safe-specifier means unsafe.
void sort(vector<int>::iterator begin, vector<int>::iterator end);

// A safe-specifier means it can only call safe functions.
void func(vector<int> vec1, vector<int> vec2) safe {
  // Ill-formed: sort is an unsafe function.
  // Averts potential undefined behavior.
  sort(vec1.begin(), vec2.end());

  unsafe {
    // Well-formed: call unsafe function from unsafe context.
    // Safety proof:
    // sort requires both iterators point into the same container.
    // Here, they both point into vec1.
    sort(vec1.begin(), vec1.end());
  }
}
The only way to enforce memory safety is to separate safe and unsafe functions with a safe-specifier. In this example, func is safe because it’s defined for all valid inputs. It cannot call sort, because that has soundness preconditions: the two iterators must point into the same container. A call to sort in a safe context leaves the program ill-formed, because the compiler cannot guarantee that the preconditions are satisfied. But by entering an unsafe-block, the user can prove the preconditions and make the unsafe call without the compiler’s soundness guarantees.
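Standard C++ has no safe-specifier, but the same idea can be approximated by wrapping the unsafe call in an interface whose preconditions the caller cannot violate. The following is a minimal sketch under that assumption. The names sort_all and sort_range are hypothetical, and the comments inside play the role of the safety proof that an unsafe-block would carry.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical wrapper: both iterators are derived from one container inside
// the function, so a caller cannot pass a mismatched pair.
inline void sort_all(std::vector<int>& vec) {
  // Safety proof: begin() and end() come from the same live container.
  std::sort(vec.begin(), vec.end());
  assert(std::is_sorted(vec.begin(), vec.end()));
}

// Hypothetical sub-range variant: indices are validated at run time instead
// of trusting the caller. Returns false when the range is out of contract.
inline bool sort_range(std::vector<int>& vec, std::size_t first,
                       std::size_t last) {
  if (first > last || last > vec.size())
    return false;
  // Safety proof: first <= last <= size(), so both iterators are in vec.
  std::sort(vec.begin() + static_cast<std::ptrdiff_t>(first),
            vec.begin() + static_cast<std::ptrdiff_t>(last));
  return true;
}
```

Nothing in the language stops other code from calling std::sort directly with mismatched iterators, so this is a convention rather than a guarantee. That gap is what the safe-specifier is meant to close.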
[P3081R0] does float a [[suppress(profile)]] attribute to turn off certain Profiles checks. It looks like the equivalent of an unsafe-block. It may permit pointer operations in a definition, but it doesn’t address the other side of the call: without a safe-specifier, how does the Profiles design deal with functions like sort that are inherently unsafe? They must be separated from provably safe functions. User intervention, wrapped up in unsafe-blocks, is needed to satisfy their preconditions. Without this bump of impedance the language cannot guarantee safety, as the property that a safe function contains no undefined behavior is not transitively upheld.
#include <memory>
#include <vector>
#include <algorithm>

int main() {
  std::vector<int> v1, v2;
  v1.push_back(1);
  v2.push_back(2);

  // UB!
  std::sort(v1.end(), v2.end());
}
Program returned: 139
double free or corruption (out)
Program terminated with signal: SIGSEGV
The Safety Profiles reference implementation can’t deal with unsafe functions, because C++ doesn’t know which functions are unsafe. This out-of-contract call produces a heap double-free and then segfaults.
Do not add a feature that requires viral annotation.
– (Re)affirm design principles for future C++ evolution[P3446R0]
Rust’s safety model incorporates lifetime arguments on every reference (or struct with reference semantics) that occurs in a function type. The authors of Profiles disparagingly call these “viral annotations.” Don’t be scared. C++ has always been full of viral annotations: types are viral annotations.
Types establish type safety properties that are enforced by both the caller and callee. These properties are transitive (i.e. viral) because they’re enforced through any number of function calls, creating a network of reasoning from the point where an object is created to all of its uses.
Languages that treat types as viral annotations are statically-typed languages. Languages that don’t are dynamically-typed languages. These have well-known trade-offs. Statically-typed languages exhibit higher performance and provide more information to developers; programs in a statically-typed language may be easier to reason about. Dynamically-typed languages are much simpler and can be more productive.
Lifetime parameters, which provide crucial information to the compiler to enable rigorous safety analysis, define another axis of typing. Rust has static lifetimes, which is a high-performance, high-information approach to memory safety. Users can reason about lifetimes and aliasing because those concepts are built into the language. The compiler has sufficient information to rigorously enforce lifetime safety with borrow checking.
Most other memory-safe languages use dynamic lifetimes, of which garbage collection is an implementation. Instead of enforcing lifetimes and exclusivity at compile time, the garbage collector manages objects on the heap and extends their scope as long as there are live references to them. This has the same basic trade-off as dynamic typing: simplicity at the cost of performance.
| | Static lifetimes | Dynamic lifetimes |
|---|---|---|
| Static types | Rust | Java, Go |
| Dynamic types | - | JavaScript, Python |
The static types/static lifetimes quadrant is a new area of language design, at least for languages widely used in production. The principles may be unfamiliar. Lifetime annotations feel different than type annotations because they establish relationships between parameters and return types rather than on individual parameters and objects. Instead of answering the question “What are the properties of this entity?” they answer “How does this entity relate to other entities?”.
Profiles fail because they reject, as a design principle, the specific language improvements that provide necessary lifetime information for compile-time safety.
Annotations are distracting, add verbosity, and some can be wrong (introducing the kind of errors they are assumed to help eliminate).
– Profile invalidation - eliminating dangling pointers[P3446R0]
This is not right. In a memory-safe language you can’t introduce undefined behavior with mere coding mistakes. That’s the whole point of memory safety. If you put the wrong lifetime annotation on a parameter, your program becomes ill-formed, not undefined. A mistaken use of lifetime parameters can be an ergonomics bug, or it can mask undefined behavior when wrapping an unsafe function in a safe interface, but it can’t cause undefined behavior.
fn f1<'a, 'b>(x:&'a i32, y:&'b i32) -> &'b i32 {
return x;
}
error: lifetime may not live long enough
 --> lifetime1.rs:5:10
  |
4 | fn f1<'a, 'b>(x:&'a i32, y:&'b i32) -> &'b i32 {
  |       --  -- lifetime `'b` defined here
  |       |
  |       lifetime `'a` defined here
5 |   return x;
  |          ^ function was supposed to return data with lifetime `'b` but it is returning data with lifetime `'a`
  |
  = help: consider adding the following bound: `'a: 'b`
Lifetime constraints are a contract between the caller and callee. If either side violates the contract, the program is ill-formed. In the code above, the lifetime constraints are violated by the callee. The lifetime of the x parameter does not outlive the lifetime of the returned reference. We used the wrong annotation, but instead of leading to undefined behavior, the compiler produces a detailed message that explains how the lifetime contract was not met.
fn f2<'a, 'b>(x:&'a i32, y:&'b i32) -> &'b i32 {
  // Well-formed. The lifetime on y outlives the lifetime on
  // the return reference.
  return y;
}

fn f3() {
  let x = 1;
  let r:&i32;
  {
    let y = 2;
    r = f2(&x, &y);
  }
  // Ill-formed: r depends on y, which is out of scope.
  let z = *r;
}
error[E0597]: `y` does not live long enough
  --> lifetime2.rs:15:16
   |
14 |     let y = 2;
   |         - binding `y` declared here
15 |     r = f2(&x, &y);
   |                ^^ borrowed value does not live long enough
16 |   }
   |   - `y` dropped here while still borrowed
...
19 |   let z = *r;
   |           -- borrow later used here
Let’s fix the implementation of the callee and test a broken version of the caller. The returned reference depends on y, but it’s used after y goes out of scope. The compiler rejects the program and tells us “y does not live long enough.”
The use of lifetime annotations on parameters is the same as the use of type annotations on parameters: it turns an intractable whole-program analysis problem into an easy-to-enforce local-analysis problem. Lifetime annotations, which exist to guarantee safety, do not jeopardize safety.
Do not add a feature that requires heavy annotation. “Heavy” means something like “more than 1 annotation per 1,000 lines of code.”
– (Re)affirm design principles for future C++ evolution[P3446R0]
We have an implemented approach that requires near-zero annotation of existing source code.
Central to Safety Profiles is the claim that annotations are exceptional rather than the norm. For this to be true, the great bulk of C++ would need to be written according to some preferred convention. [P1179R1] chooses “no mutable aliasing” and constrains reference return types to all reference parameters. Let’s consider a number of Standard Library functions and compare their aliasing and exclusivity requirements to those conventions. Functions that don’t adhere to these conventions must be annotated, and those annotations must be virally propagated up the stack to all callers, as aliasing and lifetime requirements are transitive. Only functions that have no soundness preconditions can be considered safe.
Let’s start in <algorithm> and work through alphabetically, indicating how functions deviate from the Safety Profile’s aliasing and lifetime conventions:
// Unsafe!
// Precondition: `first` and `last` must alias.
template< class InputIt, class UnaryPred >
bool all_of( InputIt first, InputIt last, UnaryPred p );
template< class InputIt, class UnaryPred >
bool any_of( InputIt first, InputIt last, UnaryPred p );
template< class InputIt, class UnaryPred >
bool none_of( InputIt first, InputIt last, UnaryPred p );

// Unsafe!
// Precondition 1: `first` and `last` must alias.
// Lifetime: The return type is not constrained by the lifetime of `value`
template< class InputIt, class T >
InputIt find( InputIt first, InputIt last, const T& value );
template< class InputIt, class UnaryPred >
InputIt find_if( InputIt first, InputIt last, UnaryPred p );
template< class InputIt, class UnaryPred >
InputIt find_if_not( InputIt first, InputIt last, UnaryPred q );

// Unsafe!
// Precondition 1: `first` and `last` must alias.
// Precondition 2: `s_first` and `s_last` must alias.
// Lifetime: The return type is not constrained by the lifetime of `s_first`
//   or `s_last`.
template< class InputIt, class ForwardIt >
InputIt find_first_of( InputIt first, InputIt last,
                       ForwardIt s_first, ForwardIt s_last );

// Unsafe!
// Precondition 1: `first` and `last` must alias.
template< class ForwardIt >
ForwardIt adjacent_find( ForwardIt first, ForwardIt last );

// Unsafe!
// Precondition 1: `first1` and `last1` must alias.
// Lifetime: The returned InputIt1 is constrained only by `first1` and `last1`
// Lifetime: The returned InputIt2 is constrained only by `first2`.
template< class InputIt1, class InputIt2 >
std::pair<InputIt1, InputIt2> mismatch( InputIt1 first1, InputIt1 last1,
                                        InputIt2 first2 );

// Unsafe!
// Precondition 1: `first` and `last` must alias.
// Precondition 2: `s_first` and `s_last` must alias.
// Lifetime: The returned ForwardIt1 is constrained only by `first` and `last`
template< class ForwardIt1, class ForwardIt2 >
ForwardIt1 search( ForwardIt1 first, ForwardIt1 last, ForwardIt2 s_first,
                   ForwardIt2 s_last );
The functions in <algorithm> mostly involve iterators, which are inherently unsafe. Additionally, the lifetime convention chosen by Profiles is frequently wrong: the lifetime of a returned reference rarely is constrained by the lifetimes of all its parameters. You’d need annotations in all of these cases.
Consider these conventions against the API for a container. Let’s look at <map>:
// Aliasing: the `key` parameter may alias `*this`.
// Lifetimes: the returned T& is only constrained by `*this` and not by `key`.
T& map<Key, T>::at( const Key& key );
T& map<Key, T>::operator[]( const Key& key );

// Aliasing: the `key` parameter may alias `*this`.
// Lifetimes: the returned iterator is only constrained by `*this` and not by
//   `key`.
iterator map<Key, T>::find( const Key& key );
iterator map<Key, T>::lower_bound( const Key& key );
iterator map<Key, T>::upper_bound( const Key& key );

// Aliasing: the `value` parameter may alias `*this`.
// Lifetimes: the returned iterator is only constrained by `*this` and not by
//   `value`.
std::pair<iterator, bool> map<Key, T>::insert( const value_type& value );

// Unsafe!
// Precondition 1: `pos` must point into `*this`
// Aliasing: the `value` parameter may alias `*this` or `pos`
// Lifetimes: The returned iterator is only constrained by `*this` and not by
//   `value`.
iterator map<Key, T>::insert( iterator pos, const value_type& value );

// Aliasing: The `k` and `obj` parameters may alias `*this`.
// Lifetimes: The returned iterator is only constrained by `*this` and not by
//   `k` or `obj`.
template< class M >
std::pair<iterator, bool> map<Key, T>::insert_or_assign( const Key& k, M&& obj );

// Unsafe!
// Precondition 1: `hint` must point into `*this`
// Aliasing: The `k` and `obj` parameters may alias `*this` and `hint`.
// Lifetimes: The returned iterator is only constrained by `*this` and not by
//   `k` or `obj`.
template< class M >
iterator map<Key, T>::insert_or_assign( const_iterator hint, const Key& k, M&& obj );
These are only a few of the map APIs which would either be unsafe or require annotations in the Profiles model. The conservative aliasing rules get most member functions wrong: a reference returned from a member function is typically constrained only by the *this/self parameter. That’s what Rust’s lifetime elision rules do. Regardless of the convention chosen, expect annotations every time the function does something different. With C++ code, it does something different very often.
#include <map>

int main() {
  std::map<int, int> m;
  m[1] = 2;

  // Temporary 1 expires. Profiles considers `value` a dangling reference.
  int& value = m[1];

  // Profiles should flag this apparent use-after-free.
  value = 2;
}
Profiles’ inability to deal accurately with lifetimes means that an implementation would reject much valid code. In this example the subscript to map::operator[] is a temporary. It goes out of scope at the end of the statement. Under Profiles’ conservative lifetime convention, the returned reference (stored in value) would be considered a dangling reference and the subsequent use would make the program ill-formed.
I do not believe that C++ code, with its countless unstated soundness preconditions and inconsistent aliasing and lifetime requirements, can be made memory safe with fewer than “1 annotation per 1,000 lines of code.” In fact, legacy C++ code will have many more annotations than equivalent Rust code. Rust often chooses object relocation to pass parameters by value rather than pass them by reference. This reduces the number of lifetime constraints that the system deals with. Additionally, it has simpler, safe versions of facilities which are unsafe in C++: the Rust iterator, for example, keeps both the data pointer and length in the same struct to completely alleviate the aliasing concerns that prevent safety analysis in C++.
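As a rough illustration of the last point, the sketch below pairs the data pointer and the remaining length in one object, loosely in the spirit of a Rust slice iterator. IntCursor and sum are hypothetical names, and the code assumes C++17 for std::optional. It removes the two-iterator aliasing precondition, but it does nothing about lifetimes: the vector must still outlive the cursor, and no C++ compiler enforces that.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical cursor that carries both ends of its range in one object,
// so there is no second iterator that could point into another container.
class IntCursor {
public:
  explicit IntCursor(const std::vector<int>& vec)
    : _data(vec.data()), _len(vec.size()) {
    assert(_len == 0 || _data != nullptr);
  }

  // Returns the next element, or std::nullopt once the range is exhausted.
  std::optional<int> next() {
    if (_len == 0)
      return std::nullopt;
    int value = *_data;
    ++_data;
    --_len;
    return value;
  }

private:
  const int* _data;
  std::size_t _len;
};

// Sums a vector by draining a cursor. The loop ends on std::nullopt, not on
// a comparison against a separately supplied end iterator.
inline int sum(const std::vector<int>& vec) {
  int total = 0;
  IntCursor cursor(vec);
  while (auto value = cursor.next())
    total += *value;
  return total;
}
```

With this shape there is no way to express the mismatched-container call that std::sort(v1.end(), v2.end()) makes possible, which is why such an interface needs no aliasing annotation.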
The density of annotations required to vet existing code is not the biggest problem facing Profiles. C++ overload resolution has created a knot that cannot be untangled. Its standard conversion rules are one reason why C++ is considered inherently unsafe.
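For many accessor-style C++ APIs, there are two overloads:
1. A const overload, which binds a const object and returns a const reference.
2. A mutable overload, which binds a mutable object and returns a mutable reference.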
If the mutable candidate can be chosen, it is chosen, no matter what the result object is used for.
void f1(const int& x, const int& y);

void f2(std::vector<int> vec) {
  // The mutable overload of operator[] is called here.
  f1(vec[0], vec[1]);
}
This code will not pass an exclusivity test. vec is a mutable object, so vec[0] calls the mutable version of operator[] and produces a mutable reference result object. While that mutable loan is in scope (it remains in scope until f1 returns), vec[1] calls the mutable version of operator[] to produce its mutable reference result object. But you’re not allowed more than one mutable reference to the same place. This is an exclusivity error!
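The overload choice can be observed directly in standard C++. In this sketch, Tracked and read_sum are hypothetical names, and the counters record which operator[] overload ran. Reading through a mutable object selects the mutable overload even though nothing is written, while std::as_const (C++17) forces the const one.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical two-element container that counts calls to each overload.
struct Tracked {
  int data[2] = {10, 20};
  int mutable_calls = 0;
  mutable int const_calls = 0;

  // Mutable overload: chosen whenever the object expression is non-const.
  int& operator[](std::size_t i) {
    assert(i < 2);
    ++mutable_calls;
    return data[i];
  }

  // Const overload: chosen only when the object expression is const.
  const int& operator[](std::size_t i) const {
    assert(i < 2);
    ++const_calls;
    return data[i];
  }
};

// Only reads its arguments, like f1 in the text.
inline int read_sum(const int& x, const int& y) {
  return x + y;
}
```

Given a non-const Tracked t, read_sum(t[0], t[1]) increments mutable_calls twice and leaves const_calls untouched. Writing read_sum(std::as_const(t)[0], std::as_const(t)[1]) instead increments only const_calls. The caller has to opt out of mutation explicitly, which is the opposite of the default a borrow checker wants.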
Rust avoids this problem in two ways:
1. There is no function overloading. Const and mutable accessors have different names, and the mutable one conventionally carries a _mut suffix.
2. The subscript operator resolves to either index or index_mut. The latter is chosen in a mutable context, which is the left-hand side of an assignment.
We can’t ditch function overloading and remain C++. But we can change how overload resolution evaluates candidates. The standard conversion is responsible for binding references to expressions. C++ chooses the wrong (for safety purposes) subscript candidate because the standard conversion is able to bind mutable references to lvalue expressions.
void f3(const int^ x, const int^ y) safe;

int main() safe {
  std2::vector<int> vec { };

  // Okay.
  f3(vec[0], vec[1]);

  // Ill-formed: mutable borrow of vec between its mutable borrow and its use.
  f3(mut vec[0], mut vec[1]);
}
safety: during safety checking of int main() safe
  borrow checking: example.cpp:13:22
    f3(mut vec[0], mut vec[1]);
                       ^
  mutable borrow of vec between its mutable borrow and its use
  loan created at example.cpp:13:10
    f3(mut vec[0], mut vec[1]);
           ^
Safe C++ changes the standard conversion to work around this language defect. In this extension, standard conversions do not bind mutable references. vec[0] chooses the const candidate, which permits aliasing, and mut vec[0] chooses the mutable candidate, which does not. By opting in to mutation, you get aliasing by default.
#feature on safety
int main() safe {
int x = 1;
int^ ref = x; // Ill-formed! Can't bind mutable reference to lvalue.
}
error: example.cpp:5:14
  int^ ref = x;
             ^
cannot implicitly bind borrow int^ to lvalue int
The mut keyword[mutation] puts the subexpression into the mutable context and restores the restricted functionality. In the mutable context, the compiler will bind mutable references to expressions:
#feature on safety
int main() safe {
int x = 1;
int^ ref = mut x; // Ok. Can bind mutable references in mutable context.
}
Now, the const overload of a function is chosen unless the user escapes with the mut keyword. This addresses a language defect head-on.
What option does Profiles have? In its full generality, the mutable binding default makes for an exceptionally thorny analysis problem. Does Profiles replace calls to mutable candidates with calls to similarly-named const candidates? That’s a presumption. Does it retroactively classify mutable loans as shared loans depending on usage? I’m not a soundness maverick. This is getting close to touching a live wire.
Legacy C++ errs on the side of mutability, making it too unconstrained to test for soundness. Old code is what it is.
The development of new product lines for use in service of critical infrastructure or NCFs (national critical functions) in a memory-unsafe language (e.g., C or C++) … is dangerous and significantly elevates risk to national security, national economic security, and national public health and safety.
– CISA, Product Security Bad Practices[cisa]
[P3466R0] insists that “we want to make sure C++ evolution … hews to C++’s core principles.” But these are bad principles. They make C++ extra vulnerable to memory safety defects that are prevented in memory-safe languages. The US Government implicates C++’s core principles as a danger to national security and public health.
| | Static lifetimes | Dynamic lifetimes |
|---|---|---|
| Static types | Rust | Java, Go |
| Dynamic types | - | JavaScript, Python |
Reconsider this table. We want to evolve C++ to live in the static types/static lifetimes quadrant. Since Rust is the only species in that design family (at least among production languages), a new entry is necessarily going to resemble Rust (at least in its memory safety treatment) more than it does other languages. An earnest effort to pursue [P1179R1] as a Lifetime TS[P3465R0] will compromise on C++’s outdated and unworkable core principles and adopt mechanisms more like Rust’s. In the compiler business this is called carcinization: a tendency of non-crab organisms to evolve crab-like features.
I think it is worth pursuing this compatible path first before, or at least at the same time as, trying to graft another foreign language’s semantics onto C++ which turns C++ into “something else” and/or build an off-ramp from C++.
Who does this provincialism serve? The latest Android security study “prioritizes transitioning to memory-safe languages.”[android-security] The off-ramp from C++ is an increasingly viable and attractive strategy for projects looking to reduce CVE exposure. The off-ramp is happening and its benefits are measurable. As the Android study observes, “once we turn off the tap of new vulnerabilities, they decrease exponentially, making all of our code safer.”
All focus should be on turning off the tap of new vulnerabilities. Incorporating Rust’s safety model into C++ helps in two ways:
C++ can be made memory safe, but not by dismissing everything that works, which is what the authors of Safety Profiles do. The language must evolve to be more explicit in how it expresses aliasing, lifetime and safeness properties. C++ can meet the security needs of its users, both in a principal role, and, for those projects determined to take the off-ramp, in an important supporting role.