Rust Is Not a Memory-Safe Language

David Jeske (Artificial Necessity LLC), with Claude (Anthropic)

Status: Draft
Last Updated: 2026-07-02

Abstract

Rust is not a memory-safe language. Memory safety, as the term has always meant in language design — the property that programs in the language cannot violate memory integrity — is a compositional property of a language and its runtime: it holds for all programs, over all data shapes, unconditionally. Rust does not have this property. What Rust has is a static checker that verifies an ownership discipline over tree-shaped data only, plus a set of unsafe escape hatches whose use is not optional but structurally mandatory: any cyclic, self-referential, or densely shared data structure — doubly-linked lists, graphs, DOMs, scene graphs, caches, entity systems — is inexpressible under the checker and forces the program into hand-written unsafe, runtime reference counting, or unchecked index schemes. Because these structures dominate real systems software, the unverified surface of a Rust program grows with the program's structural complexity, and at engine scale the observed end state — in Servo, Bevy, and Zed alike — is that every large Rust project builds its own bespoke, unverified object-lifetime runtime. A language whose safety property is conditional on universally quantified, undischarged proof obligations across its entire ecosystem, whose checker cannot express the core data structures of the domain it targets, and whose flagship projects manage memory through hand-audited runtimes, is not a memory-safe language under any definition of the term that does not also admit reference-counted C++. Garbage-collected platforms are memory safe. Rust is a memory-disciplined language with a memory-safe subset for trees — and the difference is not pedantry; it is the difference between a property of the language and a property of small programs.

1. Thesis

Rust is not a memory-safe language. This paper defends that statement literally, under the ordinary meaning "memory-safe language" has carried since the term was coined: a language in which programs cannot commit memory-safety violations — no use-after-free, no dangling dereference, no out-of-bounds access — as a property of the language, holding for all programs a developer can write in it.

The prevailing description of Rust is wrong, and wrong in a way that matters. That description appears in the project's own materials, in vendor adoption cases [8], and in US federal cybersecurity guidance [9], all of which classify Rust alongside garbage-collected languages under the label "memory-safe". It equates two safety architectures that differ not in degree but in kind, and it awards the categorical label to a language whose guarantee is conditional, non-compositional, incomplete over data shapes, and (decisively) weakest in exactly the software domains the label is invoked to justify.

Three facts, developed in the sections that follow, establish the thesis:

(1) The checked subset is incomplete. Rust's borrow checker verifies programs whose data ownership forms a tree. Cyclic and self-referential structures are inexpressible under it. (§3)
(2) The guarantee is non-compositional. Code fully accepted by the checker can trigger undefined behavior by calling unsound unsafe code anywhere in its dependency closure, and the ecosystem measurably fails these proof obligations, including in the standard library. (§2, [1])
(3) At scale, Rust converges to C++'s shape. Every flagship large Rust system has been forced to construct its own unverified object-lifetime runtime, arriving at the same safety architecture the equivalent C++ systems use. The claimed categorical difference from C++ disappears precisely in the domains it was claimed for. (§4)

A language of which all three are true does not satisfy the definition. The remainder of the paper is the demonstration.

2. What "Memory Safe" Means, and Why Rust Does Not Meet It

2.1 The definition

A memory-safe language is one in which memory-safety violations are inexpressible by construction for all programs. Garbage-collected languages meet this definition through their runtime: the collector guarantees no reachable object is reclaimed, bounds checks cover access, and consequently:

Completeness: every data shape — cyclic, shared, self-referential — is safe. class Node { Node? Parent; List<Node> Children; } is safe, done.
Compositionality: safe code calling safe code is safe, unconditionally. No library can export an API that safe callers can use to corrupt memory.
Fixed trusted base: the trusted surface is one runtime artifact, institutionally hardened for decades, whose failure modes are uncorrelated with application logic — a JIT bug is triggered by codegen patterns, not by an application "holding an API wrong."

This is the property the term "memory safe" was minted for. C#, Java, Go, and JavaScript have it. (C# undermines its own lexical boundary — IntPtr, Marshal.*, and MemoryMarshal are callable without the unsafe keyword, an indefensible design defect that makes its trusted surface ungreppable — but this is a policing failure over a known API list, not a structural incompleteness. The compositional property over the safe API surface stands.)

2.2 Rust's actual property

Rust's property is different in kind, on every axis:

It is conditional — and the conditions are the wrong kind. The precise statement of what the borrow checker delivers is: code it accepts is violation-free provided the compiler is sound, the standard library's internal unsafe is correct, and every unsafe block in the transitive dependency closure correctly discharges its proof obligations. This paper concedes the first two conditions without argument: compiler soundness is analogous to JIT soundness, a fixed, centralized residual that every safety architecture carries and that §6's definition absorbs. (Open rustc soundness holes exist and are demonstrable [2], as JIT bugs exist for the CLR; neither is load-bearing here.) The condition that has no analogue — the one the thesis rests on — is the third: safety quantified over every unsafe author in the dependency closure, a surface that is distributed across thousands of unaffiliated maintainers rather than centralized, that grows with the program rather than staying fixed, and whose failures are triggered by the application's own usage patterns rather than uncorrelated with them. And it measurably fails: the RUDRA study [1] found 264 new memory-safety bugs across crates.io — including in the standard library — many reachable from callers that never leave the checked subset.

It is non-compositional — because its boundary is a promise, not a mechanism. In a memory-safe language, unsafety is contained at the boundary by machinery: the collector and the bounds checks sit at the line and enforce safety dynamically, no matter what the caller does. Because enforcement is mechanical, the safe side can be offered arbitrary generality — alias anything, mutate any graph — and the boundary holds regardless. In Rust, the boundary around an unsafe internal is enforced by practice: the wrapper is sound only if no possible sequence of safe calls can violate the internal invariants — a proof-shaped obligation — but nothing in the toolchain requires, checks, or even represents that proof. What stands at the line is the author's informal reasoning, review, and convention; the compiler takes the wrapper's signature on faith. (Genuine machine-checked proof exists for Rust — the RustBelt project formally verified a handful of standard-library abstractions — and its rarity is the tell: it is a per-abstraction academic research effort covering a rounding error of the ecosystem, against aliasing rules whose formal statement is itself unfinished. If proof were the operative enforcement, RustBelt would not be a research program; it would be part of the compiler.) The author therefore has exactly one lever for making the promise true: narrow the exposed API until the dangerous usage patterns become inexpressible. Soundness is purchased with generality; even a perfectly implemented boundary is a constraint of use, because rendering behavior consumable by checked contexts means amputating its generality away from the uncheckable. (Type-level techniques — branded lifetimes, GhostCell-style encodings — can push some invariants into the boundary, but the encodable frontier is famously narrow and paid for in exactly this ergonomic and expressive loss.)

This leaves an unsafe wrapper author a trichotomy: amputate the generality, hand-build runtime enforcement (dynamic borrow flags, index validity checks — i.e., locally rebuild the memory-safe language's mechanical fence, per structure, unaudited), or ship the proof obligation to the callers. The RUDRA study [1] is the measured rate of the third option across the ecosystem: obligations shipped and botched, so that code fully accepted by the checker triggers undefined behavior through its own usage patterns. "My code contains no unsafe" bounds authorship, not exposure — safety becomes a universally quantified claim over thousands of unaffiliated maintainers upholding aliasing invariants (Stacked/Tree Borrows) subtler than C++'s, under a memory model that remains unfinalized [4]. A guarantee contingent on an ecosystem-wide universal quantifier is not a language property. It is a hope.
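The point that the boundary is enforced by the author rather than by the toolchain can be illustrated with a minimal sketch. The type SmallBuf and its methods are hypothetical names invented for this illustration; the only library call assumed is the standard slice method get_unchecked. The sketch takes the "hand-build runtime enforcement" branch of the trichotomy: the bounds test inside get is the whole safety argument, and the compiler would accept the function just as readily without it.

```rust
/// Hypothetical fixed-capacity buffer, used only for illustration.
pub struct SmallBuf {
    data: [u8; 8],
    len: usize, // invariant the compiler does not know about: len <= 8
}

impl SmallBuf {
    pub fn new() -> Self {
        SmallBuf { data: [0; 8], len: 0 }
    }

    /// Returns false when the buffer is full, which keeps `len <= 8` true.
    pub fn push(&mut self, byte: u8) -> bool {
        if self.len == self.data.len() {
            return false;
        }
        self.data[self.len] = byte;
        self.len += 1;
        true
    }

    /// Safe signature over an unsafe internal. The `index < self.len` test is
    /// the entire safety argument: deleting it still compiles, and safe
    /// callers could then read out of bounds.
    pub fn get(&self, index: usize) -> Option<u8> {
        if index < self.len {
            // SAFETY: index < len <= 8, an invariant upheld by `push` above.
            Some(unsafe { *self.data.get_unchecked(index) })
        } else {
            None
        }
    }
}

fn main() {
    let mut buf = SmallBuf::new();
    buf.push(7);
    println!("{:?} {:?}", buf.get(0), buf.get(5));
}
```

Nothing in the signature of get records that the check exists, which is the sense in which the wrapper is taken on faith.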

And it is incomplete — which is the structural core of the thesis, and the subject of the next section.

3. The Checker Cannot Express Real Data Structures

The borrow checker's ownership model — one owner per value, borrows forming statically scoped tree regions, aliasing XOR mutation — verifies exactly the programs whose data is tree-shaped. Therefore:

Any structure containing a cycle, a back-edge, or self-reference is inexpressible under the checker.

The excluded set is not exotic. It is: doubly-linked lists, graphs, intrusive lists, trees with parent pointers, DOMs, scene graphs, observer registries, cross-referencing caches, buffer pools, and entity systems — the load-bearing structures of engines, browsers, databases, editors, and servers. The community's own canonical text, Learning Rust With Entirely Too Many Linked Lists [3], is a book-length demonstration that a first-semester data structure requires expert unsafe in Rust.
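For contrast, here is a minimal sketch of the shape the checker does cover, assuming a hypothetical Node type with owned children. Construction, traversal, and mutation all follow a single ownership path from the root, and no unsafe is needed. The comment at the end marks where a back-edge would have to go.

```rust
/// Tree-shaped ownership: each node is owned by exactly one parent.
struct Node {
    name: String,
    children: Vec<Node>,
}

impl Node {
    fn leaf(name: &str) -> Node {
        Node { name: name.to_string(), children: Vec::new() }
    }

    /// Counts this node plus all descendants by walking owned children.
    fn count(&self) -> usize {
        1 + self.children.iter().map(Node::count).sum::<usize>()
    }

    /// Mutation follows the same tree: one `&mut` path from the root down.
    fn rename_all(&mut self, prefix: &str) {
        self.name = format!("{}{}", prefix, self.name);
        for child in &mut self.children {
            child.rename_all(prefix);
        }
    }
}

fn build() -> Node {
    let mut root = Node::leaf("root");
    let mut a = Node::leaf("a");
    a.children.push(Node::leaf("a1"));
    root.children.push(a);
    root.children.push(Node::leaf("b"));
    root
}

fn main() {
    let mut tree = build();
    tree.rename_all("n-");
    println!("{} nodes, root = {}", tree.count(), tree.name);
    // A back-edge has no natural place in this shape: a `parent: Box<Node>`
    // field would make each child an owner of its own owner, and a
    // `parent: &mut Node` field would alias the unique `&mut` path that
    // `rename_all` relies on.
}
```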

When the program needs a non-tree shape — and every interesting program does — there are exactly three exits. Each one surrenders the property.

Exit 1: hand-written unsafe. The programmer assumes proof obligations subtler than C++'s, under an aliasing model still being defined [4], with RUDRA [1] as the measured ecosystem failure rate. This is not memory safety; this is manual memory management with stricter, less-specified rules.
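A minimal sketch of what Exit 1 looks like in practice, assuming a hypothetical List type built on raw pointers. It is not taken from any cited project and is far from a production design. Every SAFETY comment is an obligation the author discharges by reasoning alone; the compiler checks none of them.

```rust
use std::ptr;

struct Node {
    value: i32,
    prev: *mut Node,
    next: *mut Node,
}

/// Hypothetical minimal doubly-linked list; illustration only.
pub struct List {
    head: *mut Node,
    tail: *mut Node,
}

impl List {
    pub fn new() -> Self {
        List { head: ptr::null_mut(), tail: ptr::null_mut() }
    }

    pub fn push_back(&mut self, value: i32) {
        let node = Box::into_raw(Box::new(Node {
            value,
            prev: self.tail,
            next: ptr::null_mut(),
        }));
        if self.tail.is_null() {
            self.head = node;
        } else {
            // SAFETY: tail is non-null, came from Box::into_raw, and is not yet freed.
            unsafe { (*self.tail).next = node };
        }
        self.tail = node;
    }

    pub fn pop_front(&mut self) -> Option<i32> {
        if self.head.is_null() {
            return None;
        }
        // SAFETY: head is non-null, came from Box::into_raw, and is unlinked
        // below, so it is freed exactly once.
        let node = unsafe { Box::from_raw(self.head) };
        self.head = node.next;
        if self.head.is_null() {
            self.tail = ptr::null_mut();
        } else {
            // SAFETY: the new head is a live node; clear its now-dangling back-edge.
            unsafe { (*self.head).prev = ptr::null_mut() };
        }
        Some(node.value)
    }

    /// Walks the back-edges from tail to head.
    pub fn to_vec_reversed(&self) -> Vec<i32> {
        let mut out = Vec::new();
        let mut cur = self.tail;
        while !cur.is_null() {
            // SAFETY: every non-null link points at a live node owned by this list.
            unsafe {
                out.push((*cur).value);
                cur = (*cur).prev;
            }
        }
        out
    }
}

impl Drop for List {
    fn drop(&mut self) {
        while self.pop_front().is_some() {}
    }
}

fn main() {
    let mut list = List::new();
    for v in [1, 2, 3] {
        list.push_back(v);
    }
    println!("{:?}", list.to_vec_reversed());
    println!("{:?}", list.pop_front());
}
```

Forgetting to clear prev in pop_front, or freeing a node twice, would compile without complaint.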

Exit 2: Rc<RefCell<T>> / Weak. The defect here is not that reference counting is unsound — it is that in Rust it is elective. Python's reference counting is a language mechanism: universal, embedded in the runtime, and completed by a cycle detector, so its guarantees are properties of the language. Rust's Rc is a library type, applied per-structure by programmer discipline, with cycles broken by manual Weak placement — and pervasive application is impractical for the same reasons it is with shared_ptr in C++: refcount traffic on hot paths, runtime borrow-panic hazards in deep call stacks, and pervasive ergonomic drag. Nobody builds whole systems this way, in either language, for the same reasons. A safety story that holds only where the programmer opted in, cannot practically be opted into everywhere, and leaks cycles wherever Weak placement is missed, is conditional on usage — a property of disciplined programs, not of the language. Had Rust wanted reference counting to ground a language-level claim, it would be embedded and cycle-collected, as Python's is; it is not, deliberately, as a performance trade — a legitimate trade, but one that trades away the label.
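The two hazards named above, runtime borrow conflicts and leaked cycles, can be seen in a short sketch. The Node type and the helper functions are hypothetical; Rc, Weak, and RefCell are the standard-library types. The parent link is Weak only because the author chose to make it so.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

struct Node {
    name: String,
    parent: RefCell<Weak<Node>>,      // back-edge: Weak by convention, not by rule
    children: RefCell<Vec<Rc<Node>>>, // forward edges: strong
}

fn node(name: &str) -> Rc<Node> {
    Rc::new(Node {
        name: name.to_string(),
        parent: RefCell::new(Weak::new()),
        children: RefCell::new(Vec::new()),
    })
}

fn adopt(parent: &Rc<Node>, child: &Rc<Node>) {
    *child.parent.borrow_mut() = Rc::downgrade(parent);
    parent.children.borrow_mut().push(Rc::clone(child));
}

/// Aliasing XOR mutation is now checked at run time: a second mutable borrow
/// requested while the first is live fails instead of being rejected by the compiler.
fn second_borrow_fails(n: &Rc<Node>) -> bool {
    let _first = n.children.borrow_mut();
    n.children.try_borrow_mut().is_err()
}

/// Two nodes holding strong references to each other: the miss that `Weak`
/// placement is meant to avoid.
fn strong_cycle_counts() -> (usize, usize) {
    let a = node("a");
    let b = node("b");
    a.children.borrow_mut().push(Rc::clone(&b));
    b.children.borrow_mut().push(Rc::clone(&a));
    (Rc::strong_count(&a), Rc::strong_count(&b))
    // When `a` and `b` go out of scope each count drops to 1, never to 0,
    // so neither node is ever freed.
}

fn main() {
    let root = node("root");
    let kid = node("kid");
    adopt(&root, &kid);
    let parent = kid.parent.borrow().upgrade();
    println!("parent of kid: {:?}", parent.map(|p| p.name.clone()));
    println!("second mutable borrow refused: {}", second_borrow_fails(&root));
    println!("strong counts in a cycle: {:?}", strong_cycle_counts());
}
```

Using borrow_mut in place of try_borrow_mut in the second helper would panic rather than return an error, which is the deep-call-stack hazard mentioned above.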

Exit 3: arenas and indices (slotmap, petgraph, generational arenas, ECS). This exit is not memory safe, full stop — and seeing why requires only stating what an arena is. An arena is a heap: a user-space allocator in which slots are allocations, indices are pointers, and slot reuse is free followed by malloc recycling an address. It satisfies the borrow checker because indices are opaque integers — pointers the compiler cannot see — which is the defect, not the feature. A stale index is therefore not analogous to use-after-free; it is use-after-free, executed against the program's operative allocator, which merely happens to be implemented one level above malloc. And it carries use-after-free's signature security payload: the dead handle reads whatever object now occupies the slot — another session's credentials, another player's state, another document's contents — private information disclosed across an object-lifetime boundary, with corruption available on the write side. This is the vulnerability class the term "memory safety" exists to name; disclosure of recycled storage through a dead reference does not require a wild hardware pointer, as the industry's own hardening work acknowledges by treating intra-pool reuse as first-class attack surface [10]. The only way to certify this exit as memory safe is to gerrymander the definition so that the hand-rolled heap does not count as memory. Generational indices mitigate — converting silent disclosure into a lookup panic — but they are optional, conventional, and unverified: a runtime check the author may or may not have installed, enforced by practice, invisible to the checker. Note also that the container-index defense available to memory-safe languages does not apply here: a stale index into a C# list is a bug inside a memory-safe object model; the arena is not a container inside an object model — at scale it is the object model, load-bearing as the allocator, so its use-after-free is allocator use-after-free.
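The stale-index behaviour described above can be reproduced in safe Rust with a minimal sketch. Arena, GenArena, and Handle are hypothetical types written for this illustration and are not the API of slotmap or any other crate. The plain arena hands a dead handle whatever now occupies the slot. The generational variant refuses it, but only because its author wrote the comparison.

```rust
/// Hypothetical minimal arena: handles are plain indices and slots are recycled.
struct Arena {
    slots: Vec<Option<String>>,
    free: Vec<usize>,
}

impl Arena {
    fn new() -> Self {
        Arena { slots: Vec::new(), free: Vec::new() }
    }

    fn insert(&mut self, value: &str) -> usize {
        let value = Some(value.to_string());
        match self.free.pop() {
            Some(index) => {
                self.slots[index] = value; // recycle a freed slot
                index
            }
            None => {
                self.slots.push(value);
                self.slots.len() - 1
            }
        }
    }

    fn remove(&mut self, index: usize) {
        if let Some(slot) = self.slots.get_mut(index) {
            if slot.take().is_some() {
                self.free.push(index);
            }
        }
    }

    fn get(&self, index: usize) -> Option<&str> {
        self.slots.get(index)?.as_deref()
    }
}

/// Generational variant: a handle is valid only for the generation it was issued in.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Handle {
    index: usize,
    generation: u32,
}

struct GenArena {
    slots: Vec<(u32, Option<String>)>,
    free: Vec<usize>,
}

impl GenArena {
    fn new() -> Self {
        GenArena { slots: Vec::new(), free: Vec::new() }
    }

    fn insert(&mut self, value: &str) -> Handle {
        let value = Some(value.to_string());
        match self.free.pop() {
            Some(index) => {
                self.slots[index].1 = value;
                Handle { index, generation: self.slots[index].0 }
            }
            None => {
                self.slots.push((0, value));
                Handle { index: self.slots.len() - 1, generation: 0 }
            }
        }
    }

    fn remove(&mut self, handle: Handle) {
        if let Some(slot) = self.slots.get_mut(handle.index) {
            if slot.0 == handle.generation && slot.1.take().is_some() {
                slot.0 += 1; // invalidate every outstanding handle to this slot
                self.free.push(handle.index);
            }
        }
    }

    fn get(&self, handle: Handle) -> Option<&str> {
        let slot = self.slots.get(handle.index)?;
        if slot.0 == handle.generation { slot.1.as_deref() } else { None }
    }
}

fn main() {
    let mut plain = Arena::new();
    let alice = plain.insert("alice-token");
    plain.remove(alice);
    let _bob = plain.insert("bob-token");
    println!("plain arena, stale handle: {:?}", plain.get(alice));

    let mut checked = GenArena::new();
    let alice = checked.insert("alice-token");
    checked.remove(alice);
    let bob = checked.insert("bob-token");
    println!("generational, stale handle: {:?}", checked.get(alice));
    println!("generational, live handle: {:?}", checked.get(bob));
}
```

In the plain arena the stale handle returns Some("bob-token"); in the generational arena it returns None. The borrow checker accepts both programs equally.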

The consequence: the unverified surface of a Rust program is not a constant, as a runtime's is. It grows with the structural complexity of the program's data. The more graph-shaped the problem, the less the checker covers — a scaling law pointed directly at systems software.

4. At Scale, Every Project Builds Its Own Runtime

Section 3's scaling law makes a prediction: at the scale of an Unreal, a Unity, a Blender, a MySQL, a BigTable, a League-of-Legends-class server — codebases whose cores are densely cyclic, mutably-shared object graphs mutated under tight budgets from deep call stacks — the three exits individually collapse. Rc<RefCell> becomes refcount traffic on every hot path plus a nested-borrow panic minefield; arenas become a bespoke object model with hand-enforced handle discipline; raw unsafe becomes C++ with harder rules. The prediction is that large Rust projects will be forced to build their own object-lifetime runtimes.

The flagship projects confirm it, unanimously:

Servo — Mozilla's browser engine, the language's original motivating project — and the reframing it forces. The marketed claim is that Rust is categorically different from C++. Compare shapes: Blink, the C++ engine in Chrome, manages its DOM with Oilpan, a tracing garbage collector, surrounded by hand-disciplined C++ [10]. Servo, unable to express the DOM under the checker, manages its DOM with SpiderMonkey's tracing garbage collector, integrated through extensive unsafe FFI, policed by a bespoke custom-lint suite enforcing invariants the borrow checker cannot see [5]. The same shape: cyclic core on a tracing collector, practice-enforced glue at the boundary. This is not evidence that Servo's engineers erred — a DOM interoperating with a JavaScript heap plausibly demands collection in any language, which is itself the §5 point. It is evidence that in the domain that motivated the language's creation, the language and C++ converge to the same safety architecture. A categorical difference that vanishes at the destination was never categorical.
Bevy — the largest Rust-native game engine. Its ECS is presented as a performance architecture; it is equally the arena/index exit imposed engine-wide, because a conventional scene graph is inexpressible under the checker. The ECS core is itself dense with unsafe and carries a multi-year public trail of soundness bugs in precisely that code [6].
Zed — the editor. Its gpui framework implements a custom entity system: reference-counted handles with runtime lease semantics [7] — a hand-built dynamic ownership runtime, constructed because static ownership could not express an editor.
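To make "runtime lease semantics" concrete, here is a minimal sketch of the general pattern. It is not gpui's API, and the names App, EntityId, Counter, and update are all hypothetical. The assumption is only that a lease can be modeled by taking an entity out of its slot for the duration of an update, so that a re-entrant update of the same entity is detected at run time.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct EntityId(u64);

struct Counter {
    value: i32,
}

/// Hypothetical entity store. A slot holding `None` means "currently leased".
struct App {
    entities: HashMap<EntityId, Option<Counter>>,
    next_id: u64,
}

impl App {
    fn new() -> Self {
        App { entities: HashMap::new(), next_id: 0 }
    }

    fn insert(&mut self, counter: Counter) -> EntityId {
        let id = EntityId(self.next_id);
        self.next_id += 1;
        self.entities.insert(id, Some(counter));
        id
    }

    /// Leases the entity out of its slot, runs `f` with both the entity and
    /// the whole app, then puts the entity back. Returns `None` if the entity
    /// does not exist or is already leased.
    fn update<R>(
        &mut self,
        id: EntityId,
        f: impl FnOnce(&mut Counter, &mut App) -> R,
    ) -> Option<R> {
        let mut leased = self.entities.get_mut(&id)?.take()?;
        let result = f(&mut leased, self);
        if let Some(slot) = self.entities.get_mut(&id) {
            *slot = Some(leased); // end of lease
        }
        Some(result)
    }
}

/// Tries to update an entity from inside its own update.
fn reentrant_update_is_refused(app: &mut App, id: EntityId) -> bool {
    app.update(id, |_, app| app.update(id, |c, _| c.value).is_none()) == Some(true)
}

fn main() {
    let mut app = App::new();
    let a = app.insert(Counter { value: 1 });
    let bumped = app.update(a, |c, _| {
        c.value += 1;
        c.value
    });
    println!("after update: {:?}", bumped);
    println!("re-entrant update refused: {}", reentrant_update_is_refused(&mut app, a));
}
```

The design choice is the one the surrounding text describes: exclusive access is decided by a flag inspected at run time, since the static checker cannot see which entity an id refers to.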

The pattern is a memory-management instance of Greenspun's Tenth Rule: every sufficiently large Rust program contains an ad-hoc, informally specified, unverified reimplementation of an object-lifetime runtime — different in every codebase, invariants living in comments, fuzzed by no one. And §2.2 explains why this outcome is forced rather than accidental: mechanism is the only thing that actually contains unsafety at a boundary, and when a system needs generality that cannot be amputated away, its authors have no option left but to build the mechanism themselves.

Now state the comparison honestly. A large C# system and a large Rust system both end up with a trusted lifetime runtime in the load-bearing role — but the two occupants of that role are not the same kind of thing. The C# system's runtime is a mechanism in the §2.2 sense: a precise moving collector for which reachability is ground truth, complete over all data shapes, compositional, its failure modes uncorrelated with application logic, its trusted surface fixed no matter what the application does — and it ships with the platform, hardened by twenty-five years of institutional fuzzing, with the application's million lines entirely on the safe side of it. The Rust system's stand-in is a practice-enforced promise: per-structure wrappers around unsafe internals, covering only the shapes their authors anticipated, sound only against the caller patterns their authors imagined, with a trusted surface that grows with every extension — project-local, young, audited by no one. The role is the same; the occupant differs in kind. The language did not remove the trusted runtime from systems programming. It replaced a mechanism with a promise, and privatized the promise.

This is the final blow to the label. A memory-safe language does not require its largest programs to hand-build the machinery of memory safety. That the machinery must be built is the proof that the language does not contain it.

5. Objections

"Most Rust code never writes unsafe." Authorship is not exposure (§2.2). Non-compositionality means the application is hostage to the unsafe it transitively calls, and §3 shows that volume is set by the program's data shapes, not the author's discipline.

"ECS and arenas are good architecture regardless — data-oriented design wins on cache behavior." Sometimes true on performance, and irrelevant to the safety claim. An architecture mandated by the verifier's expressiveness gap, whose handle discipline the verifier cannot check, is not evidence of the verifier's success. The checker verified the tree-shaped parts and was absent for the hard parts.

"Panics beat UB." Genuinely true, and a different claim. "Fails better than C++" is not "memory safe" — and GC platforms deliver the better failure modes and compositionality and completeness, without per-structure manual schemes.

"The Android data shows Rust reducing vulnerabilities." The cited datasets [8] aggregate Rust with Java and Kotlin as "memory-safe languages" versus C/C++. They prove the value of leaving C++. They say nothing about Rust versus GC platforms — this paper's comparison — on which no isolating data exists.

"GC is unacceptable in systems software." Examine the systems software that actually exists. Chrome ships multiple collectors on its hottest paths: V8's generational GC for the JavaScript heap, and Oilpan — a tracing garbage collector Google built in C++, specifically to manage Blink's DOM — adopted after years of reference-counting that cyclic graph convinced the most performance-scrutinized C++ team in the industry that a tracing collector was the correct mechanism [10]. Firefox pairs SpiderMonkey's GC with a cycle collector. Unreal Engine — the reference C++ game engine — ships a mark-and-sweep collector for its UObject graph; Unity's gameplay layer runs on the .NET GC. The planet's infrastructure layer is collected: Kubernetes, Docker, etcd, CockroachDB on Go's GC; Kafka, Cassandra, Elasticsearch on the JVM's. Garbage collection is not grudgingly tolerated in systems software; it is standard equipment in the flagship systems, adopted independently by C++ teams with every anti-GC incentive, because graph-shaped cores demand mechanism. Meanwhile the inventory of systems software written in Rust consists of excellent components — Stylo and WebRender inside Firefox, Firecracker, Pingora, kernel drivers — which are tree-shaped or embedded in larger hosts, exactly per §6, while the whole-system flagships embed a garbage collector at their graph-shaped core: Servo runs its DOM on SpiderMonkey's collector; Deno wraps V8's. There is no shipped Rust browser, database engine, mainstream operating system, or major game engine. The objection's premise is refuted by every system it gestures at, and §4 already explained why: the large Rust systems did not escape runtime lifetime management — they reimplemented it. At scale the choice is not "GC vs. no GC." It is "a mechanism or a promise" — the platform's fence, or your own.

"So the term is just being used loosely — why does it matter?" Because the label is now written into federal procurement guidance [9] as an undifferentiated category, steering rewrites of exactly the graph-shaped systems where the property does not hold. A category error at policy scale is not pedantry.

6. What Rust Actually Is

Rust is a memory-disciplined language: a static ownership checker of real value, whose verified domain is tree-shaped data, in programs small enough that the transitive unsafe surface stays auditable, conditional on compiler soundness and on ecosystem-wide proof obligations that are measurably not being met. Within that domain — parsers, serializers, codecs, compression kernels, CLI tools, straight-line pipelines — the checker covers essentially the whole program and delivers no-runtime performance with genuine static assurance. ripgrep and serde are honest exemplars, and nothing here diminishes them.

But examine even the granted niche closely, and the advantage — as distinct from the guarantee — narrows toward the vanishing point. The marketed value proposition is "safety without garbage collection, where garbage collection hurts." Now cross the two axes. Where the checker's guarantee holds — batch tools, parsers, pipelines: bounded working sets, reused buffers, no latency contract, process exits — is precisely where a collector costs nothing; a ripgrep-shaped workload is the GC's easiest case, and its actual performance derives from algorithmic engineering (literal-prefix SIMD scanning, lazy-DFA regex, parallel traversal) available in any compiled language. Where a collector genuinely hurts — long-lived, latency-bound, graph-shaped systems — is precisely where §3 and §4 showed the guarantee is gone. The sweet spot the label advertises — safety without GC, where GC hurts — is a near-empty quadrant: where the safety holds, the collector was free; where the collector costs, the safety has left. The honest residual quadrant is hard-real-time, allocation-forbidden, tree-shaped kernels — embedded targets, codecs under latency contracts, kernel modules — real, valuable, and a small fraction of what the label is being applied to.

The comparison lands harder still because the memory-safe platforms did not cede the niche. Modern .NET ships a compiler-enforced static lifetime checker for exactly the hot-window pattern: ref struct semantics (Span<T>, ReadOnlySpan<T>) with escape rules the compiler proves — no boxing, no heap capture, no crossing async or closure boundaries, scoped lifetime narrowing, stackalloc feeding directly into checked stack windows [11]. This is borrow checking deployed at the scope where it is complete: stack-lifetime views are tree-shaped by nature, so the checker in that role is total, and it operates inside a language whose object graph is covered by mechanism. The architecture the evidence in this paper points to is thus not hypothetical — it ships: mechanism for the graph, checked lifetimes for the windows. Rust's core idea is vindicated at window scope and indicted at object-model scope; its error was not the borrow checker but the claim that the borrow checker could be the object model.
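For comparison, the same window-scope checking can be sketched in Rust itself, where it is equally complete. The function names below are hypothetical. The view returned by first_token borrows from its input, and the compiler rejects any attempt to let that view outlive the stack buffer it points into.

```rust
/// Returns the first whitespace-delimited token as a view into `input`.
/// The elided lifetime ties the result to `input`, so the view cannot
/// outlive the buffer it was cut from.
fn first_token(input: &[u8]) -> &[u8] {
    let end = input
        .iter()
        .position(|b| b.is_ascii_whitespace())
        .unwrap_or(input.len());
    &input[..end]
}

/// Parses inside a stack buffer and lets only an owned value (the length) escape.
fn token_len_on_stack() -> usize {
    let buf: [u8; 16] = *b"GET /index.html ";
    let token = first_token(&buf);
    token.len()
    // Returning `token` itself would be rejected at compile time:
    // it points into `buf`, which is dropped when this function returns.
}

fn main() {
    println!("{}", token_len_on_stack());
    println!("{:?}", first_token(b"no-spaces"));
}
```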

But a memory-safe language is one whose property holds for the language — all programs, all data shapes, compositionally — assuming only that the trusted mechanism is itself correct and transmits its guarantees faithfully to the native boundary. Every safety property bottoms out at such an assumption; the honest question is what shape the assumption takes. For a GC platform it is a single one: one runtime artifact, fixed in size regardless of what the application does, uncorrelated with application logic, institutionally hardened. Rust fails the definition even granting it the analogous assumption — grant compiler soundness and a correct standard library, and the property still does not follow, because the remaining conditions are of a different character entirely: distributed across every unsafe author in the dependency closure, growing with the program's structural complexity, and triggered by the application's own usage patterns. Rust's property is conditional in a way the assumption cannot absorb, non-compositional where the definition requires compositional, incomplete where it requires total, and absent at scale where the marketing aimed it. The tree-shaped subset is safe. The language is not. Conflating the two hands the categorical label earned by garbage-collected runtimes to a language that, at the scale of the systems it was built to replace, asks every project to build the runtime itself.

Rust is not a memory-safe language. It is a good language wearing the wrong label — and the systems being rewritten under that label deserve the accurate one.

References

[1] Bae, Y., Kim, Y., Askar, A., Lim, J., Kim, T. RUDRA: Finding Memory Safety Bugs in Rust at the Ecosystem Scale. SOSP 2021. https://dl.acm.org/doi/10.1145/3477132.3483570
[2] cve-rs. Memory-safety violations constructed with no unsafe, on stable compilers, via open soundness issues. https://github.com/Speykious/cve-rs ; underlying issue: https://github.com/rust-lang/rust/issues/25860
[3] Beingessner, A. Learning Rust With Entirely Too Many Linked Lists. https://rust-unofficial.github.io/too-many-lists/
[4] Jung, R., et al. Stacked Borrows: An Aliasing Model for Rust (POPL 2020); Tree Borrows successor work; Unsafe Code Guidelines effort (memory model unfinalized). https://plv.mpi-sws.org/rustbelt/stacked-borrows/ ; https://github.com/rust-lang/unsafe-code-guidelines ; RustBelt (machine-checked verification of selected std abstractions): https://plv.mpi-sws.org/rustbelt/
[5] Servo project. DOM design: SpiderMonkey GC integration and custom lint enforcement. https://github.com/servo/servo/blob/main/components/script/dom/mod.rs
[6] Bevy Engine. bevy_ecs internals and public soundness-issue history. https://github.com/bevyengine/bevy/tree/main/crates/bevy_ecs ; https://github.com/bevyengine/bevy/issues?q=label%3AC-Unsound
[7] Zed Industries. gpui entity/ownership model. https://github.com/zed-industries/zed/tree/main/crates/gpui
[8] Google Security Blog. Memory Safe Languages in Android 13. https://security.googleblog.com/2022/12/memory-safe-languages-in-android-13.html
[9] ONCD. Back to the Building Blocks: A Path Toward Secure and Measurable Software (Feb 2024); CISA et al. The Case for Memory Safe Roadmaps (Dec 2023). https://bidenwhitehouse.archives.gov/oncd/briefing-room/2024/02/26/press-release-technical-report/ ; https://www.cisa.gov/resources-tools/resources/case-memory-safe-roadmaps
[10] Chromium. Oilpan: a tracing garbage collector for Blink's DOM objects (C++); MiraclePtr/PartitionAlloc use-after-free hardening (intra-pool reuse as attack surface). https://chromium.googlesource.com/chromium/src/+/main/third_party/blink/renderer/platform/heap/BlinkGCDesign.md ; https://v8.dev/blog/oilpan-library ; https://security.googleblog.com/2022/09/use-after-freedom-miracleptr.html
[11] Microsoft. C# ref struct semantics and ref-safety rules (compiler-enforced escape analysis for stack-referential types); Span<T> and low-level struct improvements. https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/builtin-types/ref-struct ; https://learn.microsoft.com/en-us/dotnet/csharp/advanced-topics/performance/ref-safety

unsolicited Dave
