How Much Memory Is My Program Really Using?
Modern Linux kernel memory measurements can help
February 10, 2022
It sounds like such a simple question - but virtual memory makes measuring real memory use complicated. Popular tools like ps and top report virtual, resident and shared memory consumption, but these are unsatisfactory:
- Shared memory is not apportioned between the processes using it, so it is overstated.
- Resident memory includes shared memory, so it overstates as well.
- Virtual memory includes resident memory plus memory that is not in main memory at all, such as data held in swap or pages that are mapped but have never been touched.
Fortunately, unless you are running a Linux kernel older than the last financial crisis, you have a better option: Proportional Set Size (PSS)¹. This is the amount of private memory the process is using, plus its proportional share of shared memory: a page shared by four processes adds a quarter of a page to the PSS of each. The kernel reports this data per PID in /proc/$pid/smaps. The file lists each memory block mapped to the process, and its properties. You could wrangle the smaps data with shell code, but you don’t have to! The smem tool does that, and it can filter/format too. Modern kernels also aggregate the data in /proc/$pid/smaps_rollup, which is convenient to grep and less error prone than rolling your own aggregator².
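To make the difference between the two files concrete, here is a minimal sketch that reads PSS both ways. The helper name smaps_sum is made up for this illustration, and the script assumes a kernel new enough to provide smaps_rollup and a PID you have permission to read. The two totals can differ slightly because the per-mapping values in smaps are rounded and the two files are read at different moments.

```shell
#!/usr/bin/env bash
# smaps_sum is a hypothetical helper name, not part of any existing tool.
# It totals every "<Field>: N kB" line read on stdin and prints the sum in kB.
smaps_sum() {
    awk -v f="$1:" '$1 == f { kb += $2 } END { print kb + 0 }'
}

pid=$$   # inspect this shell; substitute any PID you are allowed to read

if [ -r "/proc/$pid/smaps_rollup" ]; then
    # The kernel has already aggregated the mappings: there is one Pss line.
    echo "PSS (smaps_rollup):  $(smaps_sum Pss < "/proc/$pid/smaps_rollup") kB"
fi

if [ -r "/proc/$pid/smaps" ]; then
    # Roll-your-own aggregation: add up the Pss line of every mapping.
    echo "PSS (smaps, summed): $(smaps_sum Pss < "/proc/$pid/smaps") kB"
fi
```

The field name is matched exactly, so asking for Pss does not accidentally pick up related fields such as Pss_Anon.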
Example
Memory concepts can be a bit abstract, so let’s look at an example. This Perl script creates a big array of numbers, forks a child process and prints both processes’ RSS and PSS using Linux::Smaps:
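#!/usr/bin/env perl
use strict;
use warnings;
use Linux::Smaps;

# Flush STDOUT after every print so the header is not duplicated by the
# child's copy of the buffer when output is piped or redirected.
$| = 1;

# Print the label, PID, RSS and PSS (both in KB) of the current process.
sub print_memusage {
    my $smaps = Linux::Smaps->new;
    printf "% 6s % 9d % 9d KB % 9d KB\n", $_[0], $$, $smaps->rss, $smaps->pss;
}

my @bigarray = (1..1_000_000);
print " LABEL PID RSS PSS\n";
my $pid = fork;
die "failed to fork: $!" unless defined $pid;
if ($pid == 0) {
    print_memusage("CHILD");
    exit;
}
print_memusage("PARENT");
waitpid $pid, 0;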
Now what do you think the output will show? The child process receives a copy of its parent’s memory - so has real memory use doubled? No! The parent and child share the memory; if the child tries to write to any memory it inherited, the kernel will copy the memory page for the child to write to (this is called copy-on-write).
Running the script, I get this output:
LABEL PID RSS PSS
CHILD 1393612 81924 KB 40661 KB
PARENT 1393611 85356 KB 42178 KB
The total resident memory is 167,280 KB but we know that’s a lie - there’s one big array in memory and both processes are sharing it. The total PSS of 82,839 KB is more accurate.
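The totals quoted above are easy to check. As a small sketch, the hypothetical helper below (sum_columns is a name invented here) adds up the RSS and PSS columns of the script's table; it assumes the column layout shown in the output above, with RSS in field 3 and PSS in field 5.

```shell
#!/usr/bin/env bash
# sum_columns is a hypothetical helper: it totals the RSS (field 3) and
# PSS (field 5) columns of the table printed by the Perl script.
# Rows whose RSS or PSS field is not a number (the header) are skipped.
sum_columns() {
    awk '$3 ~ /^[0-9]+$/ && $5 ~ /^[0-9]+$/ { rss += $3; pss += $5 }
         END { printf "RSS total: %d KB\nPSS total: %d KB\n", rss, pss }'
}

# The rows from the run shown above.
sum_columns <<'EOF'
LABEL PID RSS PSS
CHILD 1393612 81924 KB 40661 KB
PARENT 1393611 85356 KB 42178 KB
EOF
```

Fed those rows, it prints an RSS total of 167280 KB and a PSS total of 82839 KB. The script's output can also be piped straight into sum_columns.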
If my program exits, how much memory can be reclaimed?
Whilst Proportional Set Size is useful for gauging how much memory a process is using, it overstates how much memory will be freed when the process exits, as memory shared with other processes stays in use and cannot be reclaimed.
To measure free-able memory, use Unique Set Size (USS): the sum of all memory pages private to the process. The kernel does not report USS directly; you have to calculate it, for example from /proc/$pid/pagemap. Conveniently, smem already reports USS so there’s no need to write your own solution unless you’re curious³.
bash -c $'smem -c \'pid vss rss pss uss\' --processfilter bash | grep "PID\|$$"'
PID VSS RSS PSS USS
1412528 9500 3184 341 260
This Bash one-liner prints its own memory statistics by running smem and filtering the output to itself. The USS is 260 KB so I expect to free that much real memory when it exits.
At 341 KB, PSS is 81 KB higher; the difference is its proportional share of /bin/bash (already in memory as I’m launching this from a Bash shell), libc and other shared libraries, and the system locales. The Bash program starts quicker because, by sharing the libraries already in memory, the kernel doesn’t have to copy them into the program’s memory. Additionally, shared memory reduces overall memory use. Virtual memory complicates operating systems but it has a lot of benefits.
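If smem is not installed, a rough USS figure can still be had from the shell. The sketch below is not the pagemap method mentioned above: it assumes that adding the Private_Clean and Private_Dirty fields of smaps_rollup is a good enough stand-in for "pages private to the process", and it ignores huge pages. The helper name uss_kb is made up for this illustration.

```shell
#!/usr/bin/env bash
# uss_kb is a hypothetical helper. It approximates USS (in kB) as
# Private_Clean + Private_Dirty, read from smaps or smaps_rollup data on stdin.
# Assumption: private pages as reported by the kernel stand in for USS.
uss_kb() {
    awk '$1 == "Private_Clean:" || $1 == "Private_Dirty:" { kb += $2 }
         END { print kb + 0 }'
}

pid=$$   # inspect this shell; substitute any PID you are allowed to read
if [ -r "/proc/$pid/smaps_rollup" ]; then
    echo "USS (approx.): $(uss_kb < "/proc/$pid/smaps_rollup") kB"
fi
```

Because the helper reads from stdin, the same function works on the full smaps file, where it sums the private fields of every mapping.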
Notes
1. PSS was created by Matt Mackall - this LWN article has more info on its origins.
2. The motivation for smaps_rollup was Android taking too long to sample large processes' memory in order to balance memory pools.
3. And if you are curious, this neat golang script by Viacheslav Biriukov calculates the Unique Set Size for a given PID. The pagemap interface and USS algorithm are described here.
Tags: virtual-memory resident-set-size proportional-set-size unique-set-size smem smaps perl linux