FAQ Linux Memory Management

From Gentoo Linux Wiki
                                               
This article is part of the FAQ series.
Contents
  • 1 Overview of memory management
  • 2 Virtual Memory Area
  • 3 The mysterious 880 MB limit on x86
  • 4 The difference among VIRT, RES, and SHR in top output
  • 5 The difference between buffers and cache
  • 6 Swappiness (2.6 kernels)
    • 6.1 Autoregulation
  • 7 Credits
  • 8 Feedback

Overview of memory management
Traditional Unix tools like 'top' often report a surprisingly small amount of free memory after a system has been running for a while. For instance, after about 3 hours of uptime, the machine I'm writing this on reports under 60 MB of free memory, even though I have 512 MB of RAM on the system. Where does it all go?
The biggest share goes to the disk cache, which is currently over 290 MB. top reports this as "cached". Cached memory is essentially free, in that it can be reclaimed quickly if a running (or newly starting) program needs the memory.
Linux uses so much memory for disk cache because RAM is wasted if it isn't used. Keeping the cache means that if something needs the same data again, there's a good chance it will still be in memory. Fetching the data from there is around 1,000 times quicker than getting it from the hard disk. If the data is not in the cache, the hard disk has to be read anyway, so no time has been lost.
To get a better estimate of how much memory is really free for applications to use, run the command free -m:
Code: free -m
             total       used       free     shared    buffers     cached
Mem:           503        451         52          0         14        293
-/+ buffers/cache:        143        360
Swap:         1027          0       1027

The -/+ buffers/cache line shows how much memory is used and free from the perspective of the applications. Generally speaking, if little swap is being used, memory usage isn't impacting performance at all.
Notice that I have 512 MB of memory in my machine, but free lists only 503 MB as the total. This is mainly because the kernel can't be swapped out, so the memory it occupies could never be freed. There may also be regions of memory reserved for or by the hardware for other purposes, depending on the system architecture. Of that total, 360 MB are free for application consumption.
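The -/+ buffers/cache line can be reproduced by hand from the Mem: line. The sketch below is only an illustration: the helper names app_used_mb and app_free_mb are made up for this example and are not part of free. The results can differ by 1 MB from what free prints, because free does the arithmetic on kB values and rounds afterwards.

```shell
#!/bin/sh
# Figures (in MB) copied from the "Mem:" line of the free -m output above.
used=451 free=52 buffers=14 cached=293

# Hypothetical helpers, named for this sketch only.
app_used_mb() { echo $(( $1 - $2 - $3 )); }   # used - buffers - cached
app_free_mb() { echo $(( $1 + $2 + $3 )); }   # free + buffers + cached

echo "used by applications:  $(app_used_mb "$used" "$buffers" "$cached") MB"
echo "free for applications: $(app_free_mb "$free" "$buffers" "$cached") MB"
```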

Virtual Memory Area
Virtual memory allows non-contiguous memory to be addressed as if it were contiguous. Each process has a memory map made up of (at least):
  • The program's executable code (called text)
  • Areas for data: initialized data (assigned a value at the beginning of execution), uninitialized data (BSS), and the program stack.
  • One area for each active memory mapping
To inspect the memory areas of a process, we first have to identify the process. We can use ps -A for that:
# ps -A
  PID TTY          TIME CMD
    1 ?        00:00:00 init
    2 ?        00:00:00 ksoftirqd/0
We are going to look at init, which has PID 1. The memory areas of a process are listed in /proc/<process pid>/maps. In this case:
# cat /proc/1/maps
08048000-08050000 r-xp 00000000 16:46 493923     /sbin/init (executable code)
08050000-08051000 rw-p 00007000 16:46 493923     /sbin/init (data)
08051000-08072000 rw-p 08051000 00:00 0          [heap]
b7e2b000-b7e2c000 rw-p b7e2b000 00:00 0
b7e2c000-b7f4c000 r-xp 00000000 16:46 3770390    /lib/libc-2.5.so
b7f4c000-b7f4d000 r--p 00120000 16:46 3770390    /lib/libc-2.5.so
b7f4d000-b7f4f000 rw-p 00121000 16:46 3770390    /lib/libc-2.5.so
b7f4f000-b7f53000 rw-p b7f4f000 00:00 0
b7f6f000-b7f70000 r-xp b7f6f000 00:00 0          [vdso]
b7f70000-b7f8a000 r-xp 00000000 16:46 3770498    /lib/ld-2.5.so
b7f8a000-b7f8b000 r--p 00019000 16:46 3770498    /lib/ld-2.5.so
b7f8b000-b7f8c000 rw-p 0001a000 16:46 3770498    /lib/ld-2.5.so
bf8fc000-bf911000 rw-p bf8fc000 00:00 0          [stack]
The columns correspond to:
start-end perm offset major:minor inode image
Meaning:
  • Start and end of virtual address.
  • Permissions (read, write, execute, private/shared).
  • Offset into the mapped file at which the area begins.
  • Major and minor numbers of the device holding the mapped file.
  • Inode number
  • Name of the mapped file.
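As a rough illustration of reading the first column, the size of a mapping is its end address minus its start address. The helper name region_kb is hypothetical, and the sketch assumes a shell whose arithmetic accepts hexadecimal constants and is wide enough for these addresses (true of common shells on 64-bit systems).

```shell
#!/bin/sh
# region_kb is a hypothetical helper: prints the size in kB of one
# mapping, given its "start-end" address column.
region_kb() {
    start=${1%-*}   # text before the dash
    end=${1#*-}     # text after the dash
    echo $(( (0x$end - 0x$start) / 1024 ))
}

# Three lines copied from the /proc/1/maps listing above.
while read -r range perm offset dev inode image; do
    printf '%6s kB  %s  %s\n' "$(region_kb "$range")" "$perm" "${image:-[anonymous]}"
done <<'EOF'
08048000-08050000 r-xp 00000000 16:46 493923     /sbin/init
08051000-08072000 rw-p 08051000 00:00 0          [heap]
b7e2b000-b7e2c000 rw-p b7e2b000 00:00 0
EOF
```

On a live system the here-document can be replaced with a redirect from /proc/<process pid>/maps.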
The physical address ranges of the machine, including RAM and device memory, are listed in /proc/iomem. On my AMD XP:
# cat /proc/iomem
00000000-0009fbff : System RAM
0009fc00-0009ffff : reserved
000a0000-000bffff : Video RAM area
000c0000-000ccfff : Video ROM
000f0000-000fffff : System ROM
00100000-5ffeffff : System RAM
  00100000-00405770 : Kernel code
  00405771-0054148b : Kernel data
5fff0000-5fff7fff : ACPI Tables
5fff8000-5fffffff : ACPI Non-volatile Storage
70000000-7001ffff : 0000:00:04.0
afa00000-cfbfffff : PCI Bus #01
  b8000000-bfffffff : 0000:01:00.1
  c0000000-c7ffffff : 0000:01:00.0
cfd00000-cfefffff : PCI Bus #01
  cfec0000-cfedffff : 0000:01:00.0
  cfee0000-cfeeffff : 0000:01:00.1
  cfef0000-cfefffff : 0000:01:00.0
cfffb800-cfffbfff : 0000:00:0c.0
  cfffb800-cfffbfff : ohci1394
cfffc000-cfffcfff : 0000:00:04.0
  cfffc000-cfffcfff : sis900
cfffd000-cfffdfff : 0000:00:03.0
  cfffd000-cfffdfff : ohci_hcd
cfffe000-cfffefff : 0000:00:03.1
  cfffe000-cfffefff : ohci_hcd
cffff000-cfffffff : 0000:00:03.2
  cffff000-cfffffff : ehci_hcd
d0000000-d3ffffff : 0000:00:00.0
fec00000-fec00fff : reserved
fee00000-fee00fff : reserved
ffee0000-ffefffff : reserved
fffc0000-ffffffff : reserved

This shows how the different devices are mapped into memory. Take my video card:
# lspci -vvv
01:00.0 VGA compatible controller: ATI Technologies Inc RV280 [Radeon 9200] (rev 01) (prog-if 00 [VGA])
Region 0: Memory at c0000000 (32-bit, prefetchable) [size=128M]
Region 1: I/O ports at a800
Region 2: Memory at cfef0000 (32-bit, non-prefetchable) [size=64K]
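The regions reported by lspci match /proc/iomem: Region 0 (128M at c0000000) is the c0000000-c7ffffff entry for 0000:01:00.0, and Region 2 (64K at cfef0000) is the cfef0000-cfefffff entry.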
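The sizes that lspci reports can be cross-checked against the /proc/iomem ranges with a little arithmetic. This is only a sketch: iomem_kb is a made-up helper name, and it assumes a shell with hexadecimal support and arithmetic wider than 32 bits. Unlike the ranges in /proc/<process pid>/maps, the end address in /proc/iomem is inclusive.

```shell
#!/bin/sh
# iomem_kb is a hypothetical helper: prints the size in kB of one
# /proc/iomem range. The end address is inclusive, hence the "+ 1".
iomem_kb() {
    start=${1%-*}
    end=${1#*-}
    echo $(( (0x$end - 0x$start + 1) / 1024 ))
}

iomem_kb c0000000-c7ffffff   # Region 0 of the video card (128M)
iomem_kb cfef0000-cfefffff   # Region 2 of the video card (64K)
```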

The mysterious 880 MB limit on x86
By default, the Linux kernel runs in and manages only low memory. This makes managing the page tables slightly easier, which in turn makes memory accesses slightly faster. The downside is that it can't use all of the memory once the total RAM reaches the neighborhood of 880 MB. This has historically not been a problem, especially for desktop machines.
To use all the RAM on a machine with 1 GB or more, the kernel needs to be recompiled. Go into 'make menuconfig' (or whichever config tool is preferred) and set the following option:
Linux Kernel Configuration: Large amounts of memory
Processor Type and Features ---->
High Memory Support ---->
(*) 4GB
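
Before recompiling, it may help to check which high memory option an existing configuration already selects. The sketch below assumes the 32-bit x86 option names CONFIG_NOHIGHMEM, CONFIG_HIGHMEM4G and CONFIG_HIGHMEM64G; highmem_mode is a hypothetical helper name, and the here-document is a made-up fragment standing in for a real .config file.

```shell
#!/bin/sh
# highmem_mode is a hypothetical helper: reads kernel config text on
# stdin and reports which high memory option is enabled.
highmem_mode() {
    awk -F= '
        $1 == "CONFIG_NOHIGHMEM"  && $2 == "y" { mode = "off" }
        $1 == "CONFIG_HIGHMEM4G"  && $2 == "y" { mode = "4GB" }
        $1 == "CONFIG_HIGHMEM64G" && $2 == "y" { mode = "64GB" }
        END { print (mode ? mode : "unknown") }
    '
}

# Sample fragment, not taken from a real machine.
highmem_mode <<'EOF'
# CONFIG_NOHIGHMEM is not set
CONFIG_HIGHMEM4G=y
# CONFIG_HIGHMEM64G is not set
CONFIG_HIGHMEM=y
EOF
```

On a real system the input would typically be the .config file in the kernel source tree, for example highmem_mode < /usr/src/linux/.config.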

This applies to both 2.4 and 2.6 kernels. Turning on high memory support theoretically slows down accesses slightly, but according to Joseph_sys and log, there is no practical difference.
Also, the ck-sources kernel has a patch for 1 GB high memory support.

The difference among VIRT, RES, and SHR in top output
VIRT stands for the virtual size of a process, which is the sum of memory it is actually using, memory it has mapped into itself (for instance the video card's RAM for the X server), files on disk that have been mapped into it (most notably shared libraries), and memory shared with other processes. VIRT represents how much memory the program is able to access at the present moment.
RES stands for the resident size, which is an accurate representation of how much actual physical memory a process is consuming. (This also corresponds directly to the %MEM column.) This will virtually always be less than the VIRT size, since most programs depend on the C library.
SHR indicates how much of the VIRT size is actually sharable (memory or libraries). In the case of libraries, it does not necessarily mean that the entire library is resident. For example, if a program only uses a few functions in a library, the whole library is mapped and will be counted in VIRT and SHR, but only the parts of the library file containing the functions being used will actually be loaded in and be counted under RES.
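The same three figures can be approximated straight from /proc/<process pid>/statm, whose first three fields are the total, resident, and shared sizes counted in pages. The sketch below assumes those fields correspond to top's VIRT, RES, and SHR columns; statm_kb is a hypothetical helper name and the sample line is made up.

```shell
#!/bin/sh
# statm_kb is a hypothetical helper: $1 is the page size in bytes,
# $2 is one line in /proc/<pid>/statm format (all counts are in pages).
statm_kb() {
    set -- "$1" $2          # split the statm line into separate words
    echo "VIRT=$(( $2 * $1 / 1024 )) RES=$(( $3 * $1 / 1024 )) SHR=$(( $4 * $1 / 1024 ))"
}

# Made-up sample line, assuming 4096-byte pages.
statm_kb 4096 "2048 512 256 10 0 300 0"
```

For a running process, something like statm_kb "$(getconf PAGESIZE)" "$(cat /proc/self/statm)" should give values in kB comparable to what top shows.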

The difference between buffers and cache
Buffers are allocated by various processes to use as input queues and the like. Most of the time, buffers hold some process's output, and they are file buffers. A simplistic explanation is that buffers let a process temporarily store input in memory until the process can deal with it.
Cache typically holds frequently requested disk I/O. If multiple processes are accessing the same files, much of those files will be cached to improve performance (RAM being so much faster than hard drives). It is the disk cache.
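Both figures come from /proc/meminfo, which reports them in kB on the Buffers: and Cached: lines. A minimal sketch of pulling them out follows; meminfo_mb is a hypothetical helper name, and the sample values are made up to match the free -m output shown earlier.

```shell
#!/bin/sh
# meminfo_mb is a hypothetical helper: reads /proc/meminfo-style text on
# stdin and prints the named field converted from kB to whole MB.
meminfo_mb() {
    awk -v key="$1" '$1 == (key ":") { print int($2 / 1024) }'
}

# Made-up sample values, not read from a real machine.
sample='Buffers:           14336 kB
Cached:           300032 kB'

echo "buffers: $(printf '%s\n' "$sample" | meminfo_mb Buffers) MB"
echo "cached:  $(printf '%s\n' "$sample" | meminfo_mb Cached) MB"
```

On a live system, meminfo_mb Cached < /proc/meminfo reads the current value.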

Swappiness (2.6 kernels)
Since 2.6, there has been a way to tune how much Linux favors swapping out to disk compared to shrinking the caches when memory gets full.
When an application needs memory and all the RAM is fully occupied, the kernel has two ways to free some memory: it can either reduce the disk cache in RAM by discarding the oldest data, or it can swap some less used memory (anonymous pages) of processes out to the swap partition on disk. It is not easy to predict which method would be more efficient. The kernel makes a choice by roughly guessing the effectiveness of the two methods at a given instant, based on the recent history of activity.
Before the 2.6 kernels, the user had no means of influencing the calculations, and there were situations where the kernel often made the wrong choice, leading to thrashing and slow performance. The addition of swappiness in 2.6 changes this. Thanks, ghoti!
Swappiness takes a value between 0 and 100 to change the balance between swapping out processes' anonymous pages and freeing cache. At 100, the kernel will always prefer to find inactive pages and swap them out; in other cases, whether a swapout occurs depends on how much application memory is in use and how poorly the cache is doing at finding and releasing inactive items.
The default swappiness is 60. A value of 0 gives something close to the old behavior, where applications that wanted memory could shrink the cache to a tiny fraction of RAM. For laptops, which would prefer to let their disk spin down, a value of 20 or less is recommended.
As a sysctl, the swappiness can be set at runtime with either of the following commands:
sysctl -w vm.swappiness=30
echo 30 >/proc/sys/vm/swappiness
The default when Gentoo boots can also be set in /etc/sysctl.conf:
File: /etc/sysctl.conf
# Control how much the kernel should favor swapping out applications (0-100)
vm.swappiness = 30
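
A small wrapper can guard against typos before a value is written. This is a sketch only: valid_swappiness and set_swappiness are hypothetical helper names, and the 0-100 range is the one described in this article. The sysctl call needs root to take effect.

```shell
#!/bin/sh
# valid_swappiness is a hypothetical helper: succeeds only for
# integers from 0 to 100, the range described in this article.
valid_swappiness() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
    esac
    [ "$1" -le 100 ]
}

# set_swappiness is also hypothetical; it refuses values out of range.
set_swappiness() {
    if valid_swappiness "$1"; then
        sysctl -w vm.swappiness="$1"
    else
        echo "swappiness must be an integer from 0 to 100: $1" >&2
        return 1
    fi
}

valid_swappiness 30  && echo "30 is acceptable"
valid_swappiness 150 || echo "150 is rejected"
```

The value currently in effect can be read back with cat /proc/sys/vm/swappiness.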

Some patchsets (e.g. Con Kolivas' ck-sources patchset) allow the kernel to auto-tune the swappiness level as it sees fit; they may not keep a user-set value.

Autoregulation
gentoo-sources (and probably other Gentoo 2.6 kernels) prior to 2.6.7-gentoo contain the Con Kolivas autoregulated swappiness patch. This means that the kernel automatically adjusts the /proc/sys/vm/swappiness value as needed during runtime, so any changes you make will be clobbered the next time it updates. A good explanation of this patch and how it works is on KernelTrap.
I repeat: With gentoo-sources (prior to 2.6.7-gentoo) it is neither necessary nor possible to permanently adjust the swappiness value. It's taken care of automatically, so there is no need to worry.
gentoo-sources no longer contains this patch as of 2.6.7-gentoo. The maintainer of gentoo-sources, Greg, pulled the autoregulation patch from the ebuild: http://bugs.gentoo.org/show_bug.cgi?id=54560

Credits
Original Forum Post by sapphirecat
Original Forum Post about autoregulation by bk0
Linux Device Drivers by Jonathan Corbet, Alessandro Rubini, and Greg Kroah-Hartman. Published by O’Reilly Media.