[git commit] Add a "Use less RAM" HOWTO style document

Denys Vlasenko vda.linux at googlemail.com
Fri Apr 22 16:33:43 UTC 2016


commit: https://git.busybox.net/busybox-website/commit/?id=38f41eec400d213c85f1976e73be314157621a80
branch: https://git.busybox.net/busybox-website/commit/?id=refs/heads/master

Signed-off-by: Denys Vlasenko <vda.linux at googlemail.com>
---
 header.html       |   1 +
 use_less_ram.html | 174 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 175 insertions(+)

diff --git a/header.html b/header.html
index 8b71093..1dc1dda 100644
--- a/header.html
+++ b/header.html
@@ -66,6 +66,7 @@
         <!--li><a href="/downloads/patches/recent.html">Recent Changes</a></li-->
         <li><a href="lists.html">Mailing Lists</a></li>
         <li><a href="https://bugs.busybox.net/">Bug Tracking</a></li>
+        <li><a href="use_less_ram.html">Use less RAM</a></li>
         <li><a href="developer.html">Contributing</a></li>
     </ul>
     <p><b>Links</b>
diff --git a/use_less_ram.html b/use_less_ram.html
new file mode 100644
index 0000000..d414d81
--- /dev/null
+++ b/use_less_ram.html
@@ -0,0 +1,174 @@
+<!--#include file="header.html" -->
+
+<h3>How to build a Busybox binary with reduced memory usage</h3>
+
+<p>
+Busybox is designed to be frugal with memory usage. However, it is still
+written in C, and compilers and linkers are usually not written with
+a strong focus on minimizing memory usage. This can be helped, though.
+</p>
+
+<h4>Overview of RAM usage in Linux</h4>
+<p>
+Executables have the following parts:
+read-only executable code and constants, also known as "text",
+read-write initialized data, and
+read-write non-initialized (zeroed on demand) data, also known as "bss".
+</p><p>
+At runtime, all text pages are mapped RO and executable.
+The RO mapping, like any other, can only cover whole pages, not parts of them.
+What happens to the last page of text? The entire page is mapped RO,
+including the part beyond the last text byte.
+</p><p>
+Data pages are mapped RW and they are file-backed
+(this includes a small portion of bss which may live in the last
+partial page of data). Pages which are fully in bss are mapped
+to anonymous memory: when a page fault requires the kernel to allocate
+a real page, a free page is found and zero-filled. No reads from
+storage are needed.
+</p><p>
+File-backed RO pages are shared among all running copies of the process.
+RW pages, both data and bss, exist as separate copies for each process.
+(They may be shared as long as they are not modified. However,
+well-designed programs don't put constant data into RW sections,
+therefore in practice almost all RW data or bss pages which were
+touched by the program are modified).
+</p><p>
+It is important to minimize the number of RW pages your program touches.
+Data, bss and stack pages are never freed, therefore for large,
+and especially for temporary allocations, it's best to use malloc()
+or mmap().
+</p><p>
+[describe stack]
+</p><p>
+[describe heap]
+</p><p>
+</p><p>
+</p><p>
+</p><p>
+</p><p>
+</p><p>
+</p><p>
+You can see these mappings with the following command, where busybox
+shows its own memory map:
+</p><p>
+<pre>$ sh -c 'exec busybox pmap $$'
+1628: busybox pmap 1628
+08048000     900K r-xp  /bin/busybox  - text
+08129000       4K rw-p  /bin/busybox  - data
+0812a000      12K rw-p    [ anon ]    - bss
+08854000       8K rw-p  [heap]        - malloc space
+f76fe000       8K r--p  [vvar]
+f7700000       8K r-xp  [vdso]
+ff82a000     132K rw-p  [stack]       - stack</pre>
+</p>
+<h4>Optimizing start of data</h4>
+<p>
+A simple solution for RO/RW mappings for text and data would be to
+pad text to a full page. However, on some architectures pages are
+quite large and this would cause a significant growth of on-disk size
+of the binary. The GNU linker employs a hack to avoid that: sometimes
+it starts data immediately after text, without padding.
+The last, "mixed" page gets mapped twice: once as an RO page,
+and a second time as an RW page (usually at the position of the very next page
+in the virtual address space).
+</p><p>
+This reduces image size on-disk, yes, but it wastes RAM: now there is
+an unused part of the first data page which takes up RAM. This may push
+the opposite, tail end of the data/bss sections far enough that they
+now need one more page.
+</p><p>
+This can be fixed if we make the GNU linker use our own linker script
+instead of the built-in one. We will do this by modifying its default script.
+Every busybox build generates a "busybox_unstripped.out" file. Among other
+information, it contains a part which says: "using internal linker script:",
+and the script follows.
+</p><p>
+Cut out the script and put it into a file named "busybox_ldscript".
+The build will now use this script; next time you should see the message
+"Custom linker script 'busybox_ldscript' found, using it".
+</p><p>
+To unconditionally align data to the next page boundary, find a part
+which looks similar to:
+</p><p>
+<pre>/* Adjust the address for the data segment.  We want to adjust up to
+   the same address within the page on the next page up.  */
+. = ALIGN (0x1000) - ((0x1000 - .) & (0x1000 - 1)); . = DATA_SEGMENT_ALIGN (0x1000, 0x1000);</pre>
+</p><p>
+and replace it with:
+</p><p>
+<pre>. = ALIGN (0x1000); . = DATA_SEGMENT_ALIGN (0x1000, 0x1000);</pre>
+</p>
+<h4>Reducing padding between data</h4>
+<p>
+GCC, even with -Os, compiles byte and 16-bit arrays into data object definitions
+which require word alignment (at least 4 bytes). Some versions even require
+32-byte alignment for arrays more than 32 bytes long. This includes explicitly
+declared string arrays.
+</p><p>
+As a result, these arrays often have padding added at the end when they are packed
+by the linker into the binary. As of 2016, the linker is not yet clever enough to use that
+padding for tiny data objects (e.g. bool variables).
+</p><p>
+The alignment can be relaxed by explicit ALIGN1 or ALIGN2 attributes on such arrays:
+</p><p>
+<pre>static const char extn[][5] ALIGN1 = { ".zip", ".ZIP" };</pre>
+</p><p>
+The above prevents 2 bytes of padding in the binary: the array is 10 bytes long,
+and word alignment would round it up to 12.
+</p><p>
+Busybox code is evolving: new arrays constantly pop up, and coders forget
+to add ALIGNn to them. You can run "grep -F -B3 '*fill*' busybox_unstripped.map"
+to find all linker-added padding in your binary, and add the forgotten ALIGNn's.
+Please send a patch if you do.
+</p>
+<h4>Converting bss to data</h4>
+<p>
+Busybox's data and bss sections are small already, some 4-12 kilobytes.
+But they still require two mappings (VMAs in kernel-speak) because
+they have different attributes: data mapping is file-backed, while bss
+is anonymous. We can save a bit of memory on the kernel side, for every process,
+by not requiring two VMAs. This can be achieved by changing the linker script
+to place all formerly-bss variables into data:
+</p><p>
+<pre>  .data           :
+  {
+    *(.data .data.* .gnu.linkonce.d.*)
+    SORT(CONSTRUCTORS)
+  }
+  ......................
+  .bss            :
+  {
+   *(.dynbss)                       ---
+   *(.bss .bss.* .gnu.linkonce.b.*) --- move these lines to .data {} block
+   *(COMMON)                        ---
+   . = ALIGN(. != 0 ? 64 / 8 : 1);
+  }</pre>
+</p><p>
+The result of this change is visible in pmap as the absence of the "[ anon ]" mapping:
+</p><p>
+<pre>
+08048000     856K r-xp  /app/busybox-1.22.1/busybox
+0811e000       8K rw-p  /app/busybox-1.22.1/busybox -- former bss is in here
+08bbf000       4K rw-p  [heap]
+f77a2000       8K r--p  [vvar]
+f77a4000       8K r-xp  [vdso]
+ffa18000     132K rw-p  [stack]</pre>
+</p><p>
+The drawback is that the on-disk binary now contains a few zeroed pages, and they
+will need to be read from storage when formerly-bss variables are touched. In other words,
+this has a small speed penalty.
+</p>
+<h4>Use space at the end of bss: FEATURE_USE_BSS_TAIL</h4>
+<p>
+The end of bss is usually not page-aligned, so there is unused space in the last page.
+The linker marks its start with the _end symbol.
+</p><p>
+The FEATURE_USE_BSS_TAIL option attempts to use that space for the bb_common_bufsiz1[]
+array. If it fits after _end, that space will be used, and COMMON_BUFSIZE
+will be enlarged from its guaranteed minimum size of 1 kbyte.
+This may require building a second time, since the position of _end
+is known only after the final link.
+</p>
+<br>
+
+<!--#include file="footer.html" -->

