SOLVED Re: Memory leak in hush with NOMMU busybox-1.37.0

Henrique de Moraes Holschuh henrique at nic.br
Mon Aug 11 12:46:41 UTC 2025


On 10/08/2025 06:13, David Laight wrote:
> On Mon, 4 Aug 2025 09:20:43 -0300
> Henrique de Moraes Holschuh <henrique at nic.br> wrote:
>> volatile isn't "read once" or "write once": the compiler can output code
>> that will read a volatile variable many times even if the load shows up
>> only once in the source code.  In fact, declaring something volatile
>> INCREASES the chances the compiler will load it more than once.
> 
> Are you sure, that sounds wrong to me.

The "increases the chances" part?

We're used to the optimizer in gcc and clang doing its level best to 
avoid memory loads it doesn't have to do, so in a way you're correct. 
In fact, one usually uses volatile (probably with something like 
sig_atomic_t to ensure no unexpected read-modify-write) *exactly* to 
defeat that.

But when you're declaring something volatile, you are actually asking 
the compiler to never use stale values (and never elide writes across 
sequence points), so it should increase the chances it will need to 
access that variable more than once depending on how it unfolds control 
flows like loops mixed with conditional execution.
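
To make that concrete, here is a minimal sketch (mine, not anything from 
busybox; the names stop_requested, on_signal and spin_until_stopped are 
made up for illustration).  The flag shows up once in the loop condition, 
yet the compiler has to emit a fresh load for every evaluation of it:

```c
#include <signal.h>

/* Hypothetical flag, set asynchronously by a signal handler. */
static volatile sig_atomic_t stop_requested = 0;

static void on_signal(int signo)
{
    (void)signo;
    stop_requested = 1;     /* plain store to a sig_atomic_t object */
}

/*
 * Spins until the flag is set and returns the number of iterations.
 * stop_requested appears once in the source of the loop condition,
 * but because it is volatile the compiler may not cache it in a
 * register: each evaluation of the condition is a new load.
 * raise_at is only there to make the sketch terminate on its own.
 */
static long spin_until_stopped(long raise_at)
{
    long iterations = 0;

    while (!stop_requested) {
        if (++iterations == raise_at)
            raise(SIGINT);  /* handler runs before raise() returns */
    }
    return iterations;
}
```

Without the volatile qualifier, the compiler would be allowed to hoist 
the load out of the loop and spin forever on a stale value.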

I don't think it will ever duplicate writes, unless you actually have 
two writes in the code flow (since it can't elide them).  But I could be 
wrong about that.

> What catches people out is that without volatile the compiler can (is allowed
> to) read a variable twice even if it only appears once in the source.
> (Although I can't remember whether the gcc/clang maintainers have said
> that the current versions will actually do it.)

It really depends on the code flow and register pressure.  ia32 (i386 to 
i686) was kinda good at triggering such issues due to the small number 
of architectural registers the compiler could use.

>> volatile isn't a compile barrier: the compiler can still move the
>> loads/stores to volatile variables around, unless something else in the
>> code forbids it to.  It must not reorder loads and stores to the same
>> volatile variable, and it must not elide loads because it just stored
>> something and that data is still in a register, but that's it.
> 
> I'm sure volatile is a compile barrier w.r.t other volatiles.

You are correct.

A quick look at section 6.7.3 of the C0x draft (which is arguably rather 
hard to read) seems to imply that volatile acts as a compiler barrier 
with respect to accesses (reads and writes) to other volatile variables 
*across sequence points*.  I don't think this is new behavior, either.

> (This doesn't mean the cpu will execute the accesses in order.)

Indeed.
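
As an illustration of both points, a small sketch assuming a made-up 
device layout (struct fake_device and send_word are hypothetical, not 
taken from any real driver):

```c
#include <stdint.h>

/* Hypothetical register block; both fields are volatile. */
struct fake_device {
    volatile uint32_t data;
    volatile uint32_t doorbell;
};

/*
 * The two stores are volatile accesses separated by a sequence point,
 * so the compiler must emit them in this order.  That is a statement
 * about the generated code only: on a weakly ordered CPU, another
 * observer could still see them out of order unless the platform's
 * memory barriers are used as well.  Accesses to non-volatile objects
 * may still be moved across these two stores by the compiler.
 */
static void send_word(struct fake_device *dev, uint32_t word)
{
    dev->data = word;
    dev->doorbell = 1;
}
```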

>> volatile loads and stores are not atomic, and they are not synchronizing
>> (memory barriers) either.  A store to a volatile variable can result
>> in the compiler generating read-modify-write code for some types, for
>> example.
> 
> I think loads and stores (but not increments) need to be atomic.

They don't, at least not for every type.  This is why "sig_atomic_t" 
exists in ISO C (and therefore in POSIX) in the first place, AFAIK.

gcc (and clang) will try hard to access volatile variables with a single 
CPU instruction where possible, and with luck (single cache-line access, 
CPU-aligned single-instruction access, non-RMW at uOP level like simple 
stores and loads for any sane micro-architecture), it can end up being 
effectively a relaxed atomic access at the hardware level.

> Certainly any compiler used to build the Linux kernel requires that a
> volatile access (done by accessing through a volatile cast) generates a
> single cpu memory access instruction, MSVC does the same.

It depends on type size and architecture.  POSIX needs it for at least 
one simple type (that ends up used for sig_atomic_t, etc), and as you 
said, Linux has some requirements in that area as well.
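
For reference, the "volatile cast" idiom looks roughly like the sketch 
below.  The macro names are made up here (the kernel's real READ_ONCE / 
WRITE_ONCE do more than this), and __typeof__ is a gcc/clang extension, 
so take it as an assumption-laden illustration rather than portable ISO C:

```c
#include <stdint.h>

/*
 * Sketch of a volatile-cast access.  Assumptions: __typeof__ is
 * available (gcc/clang extension), and the compiler treats an access
 * through a volatile-qualified lvalue as a volatile access even when
 * the underlying object is not declared volatile (gcc and clang do).
 */
#define READ_ONCE_SKETCH(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE_SKETCH(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

static uint32_t shared_counter;     /* ordinary, non-volatile object */

/*
 * Takes exactly one snapshot and then works on the local copy.  The
 * local is private, so the compiler cannot turn its two uses into two
 * loads of shared_counter without violating the volatile access.
 */
static uint32_t snapshot_twice_sum(void)
{
    uint32_t snap = READ_ONCE_SKETCH(shared_counter);

    return snap + snap;
}
```

Whether that single access ends up as a single CPU instruction still 
depends on the type size, alignment and architecture, as said above.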

> But a variable of type 'atomic_t' is guaranteed to be read/written
> in a manner that multiple threads will never see 'corrupt' values.
> While there are alternatives, it will be just a volatile variable.

When you need atomic behavior (especially when it includes visibility 
control / multi-thread synchronization), you should use the appropriate 
atomics, instead of just "volatile".
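
A minimal sketch of what I mean, using C11 <stdatomic.h> (the names 
payload, ready, publish and try_consume are invented for the example):

```c
#include <stdatomic.h>
#include <stdbool.h>

static int payload;             /* plain data, published via the flag */
static atomic_bool ready;       /* static storage: starts out false   */

/* Writer side: fill in the data, then publish it with a release store. */
static void publish(int value)
{
    payload = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

/*
 * Reader side: the acquire load pairs with the release store above, so
 * a reader that sees ready == true is also guaranteed to see the
 * payload written before it.  A volatile flag gives no such guarantee.
 */
static bool try_consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return false;
    *out = payload;
    return true;
}
```

The release/acquire pair is what provides the visibility control between 
threads; volatile alone only constrains the compiler.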

We certainly agree on that, that's pretty much what I was trying to explain.

-- 
Henrique de Moraes Holschuh
Project Analyst
Centro de Estudos e Pesquisas em Tecnologias de Redes e Operações 
(Ceptro.br)
+55 11 5509-3537 R.:4023
INOC 22548*625
www.nic.br

