Malloc Lab: Memory is just a big array
Any bit, piece, excerpt, or line coming from an implementation other than the one provided, any code that you did not write yourself, showing up at any point on your account, will result in an immediate failing grade for the course.
1 Overview
In this lab you will be replacing the GNU libc memory allocator with your own implementation. You will be implementing:
malloc(3)
free(3)
realloc(3)
malloc_aligned
(Bonus Part)
Except for debugging, the only external function you are allowed to use is
sbrk(2)
(and memcpy(3)
in realloc
). The system call
sbrk
increments the heap break and returns the former value of the break. As
a special case sbrk(0)
returns the current value of the break.
—Note
The GNU libc info manual has a section on Replacingmalloc
. In there, they
mention that a full reimplementation of malloc
should provide:
malloc(3)
, free(3)
, calloc(3)
,
realloc(3)
, aligned_alloc(3)
,
malloc_usable_size(3)
, memalign(3)
,
posix_memalign(3)
, pvalloc(3)
, valloc(3)
Most of them can be implemented using a single line using just the functions you
are asked to implement in this lab. These single-line implementations are
provided and automatically added to your library through the mallocless.c
file.
2 Downloading the assignment
Start by extracting ~prof/malloclab-handout.tar.bz2
in your home:
$ cd ~/
$ tar xvf ~prof/malloclab-handout.tar.bz2
This will cause a number of files to be unpacked into the directory. You
should only modify the file src/malloc.c
and this is the only file that is
submitted. The tests
folder contains the different tests and grading examples
for your program. Use the make(1)
command to compile your code and the command
make test
to run the test driver. This would run all the tests for all parts.
This is more useful during development than in our previous labs, but short
tests are still provided: see Section Driver for more finely grained testing
options.
3 Part 0: Correctness (0pts)
In Part 0, your implementation is run in a simulated environment. This environment crashes as soon as it detects an error, printing an error message.
Part 0 is mandatory: no point is awarded to the lab if your implementation crashes.
—Note
There are multiple approaches to running amalloc
implementation in a
simulated environment. The approach taken here is to intercept your calls to
sbrk
using interpositioning (see Ch. 7, Lec. 7). At the beginning of the
execution, some limited space is allocated using mmap(2)
(see Ch. 9, Lec. 10)
and each time your implementation calls sbrk
, it returns a portion of that
buffer. In addition, a page is allocated before that buffer, and its rights
changed to PROT_NONE
(no read nor write accesses, see Cache Lab, Sec. Watching Memory
for more on page protection). Any access to that page would crash the program.
4 Part 1: Performance on traces (60+ points)
In this part, each test executes a series of malloc
, free(1)
, and realloc
.
For each test, you implementation is run twice:
- Once measuring the time taken to complete all the operations, resulting in a throughput measure: the number of operations per second. This is measured in thousands of operations per second (Kops/s).
- Once measuring the utilization of your implementation: this is the ratio
between the maximum amount
malloc
’ed at any point and the maximum heap size.
With \(T\) and \(U\) your average throughput and utilization over all tests, your final grade for this part is (roughly) computed as:
\[ 60 \times \left( { T \over 1500 } \times 0.6 + { U \over 65} \times 0.4 \right)\enspace. \]
This means that a throughput of 1500Kops/s and a utilization of 65% will give you all the points, but also that anything higher would give you extra points (capped at 54 points for throughput and 36 points for utilization). As a reference point, implicit list should give you 0 points, explicit list 45, segregated list 60, and any tweaking would push you above that. The GNU libc implementation scores 84 points.
—Note
Most of the traces executed come from actual (although partial) executions of some programs. For instance,tests/progs/grading-p1-firefox.c
contains the successive
calls to malloc
given by Firefox. These accesses were gathered by using a
library that was intercepting the calls to malloc
using LD_PRELOAD
(see
Ch. 7, Lec. 7). A few lines of that file read:
void* p848045194363482441 = malloc (31); free (p848045194363482386); free (p848045194363482387); void* p848045194363482442 = malloc (11);
5 Part 2: Performance running Python (50+ points)
In Part 1, we compiled long C files that contain a trace of the memory
allocations made by a program. Even for simple programs, these traces grow in
the several mebibytes, making them impracticable to compile. In this part, we
simply run Python scripts, replacing the GNU libc malloc
with your
implementation using LD_PRELOAD
.
In this part, only speed is important. The target speed is such that a normal implementation of segregated list would give all points, although you can score up to 75 points by providing a better implementation.
6 Bonus Part (40 points)
In this part, your implementation will be used to start Emacs (using
LD_PRELOAD
). The specificity here is that, in addition to requesting memory
with malloc
, Emacs also asks for aligned memory with diverse alignments.
You will have to implement:
void* malloc_aligned (size_t alignment, size_t size);
This function should return a memory address that is a multiple of alignment
.
You have a guarantee that alignment
is a power of 2 larger than 128
.
You earn all the points if Emacs does not crash.
7 Evaluation
7.1 Methodology
The tests are mostly programs found in tests/progs
. These are dynamically
linked with your implementation. This means that you do not
have to recompile the test programs when you update your implementation.
Your code is compiled into two libraries: one with all optimization flags (used
for the tests) and one with no optimization and with all debug symbols (for your
debugging). These are compiled as src/libmalloc.opt.so
and
src/libmalloc.so
, respectively.
Each C file is compiled in 5 different versions. For instance, the
file grading-p0-sed.c
is compiled into:
grading-p0-sed
: This is linked with the debug version of your library, it is used for your debugging. Note that calls tomalloc
andfree(1)
are not “direct”; they go through some functions that I implemented to help you debug, and malloc is run in a “virtual environment”. Use this with GDB if you want to trace your program.grading-p0-sed.stu.tim
: This variant runs the program and outputs the throughput of your implementation. (Used in Part 1.)grading-p0-sed.stu.utl
: This variant runs the program and outputs the utilization of your implementation. (Used in Part 1.)grading-p0-sed.glc.{tim,utl}
: These are the throughput/utilization tests, but running the GNU libc implementation.
For Part 2 and the Bonus Part, the programs, Python or Emacs, are run with your
implementation using LD_PRELOAD
.
7.2 Driver
This project is evaluated by a test driver. You can run the full driver by typing, at the root of your project:
$ make test
This makes sure that your program is compiled with the latest sources, moves to
the directory tests
, and runs the driver ./driver.sh
.
As you work on each successive parts, you may want to run the driver on each part separately and on smaller test files—this is less useful in this lab than in previous labs, except for Part 0:
$ cd tests/
$ ./driver.sh -h
usage: ./driver.sh [-sg] [-P PART]...
Grade the malloclab.
-P PART Grade only part PART (0, 1, 2, B). Default is all.
-s Use short tests (not for grading).
-g Run the tests on the GNU libc implementation.
$ ./driver.sh -s -P 0
* PART 0: Correctness
[OK] progs/short-p0-10.stu
[OK] progs/short-p0-11.stu
...
For each test, the driver prints what is the command executed. The result, for each test, is readily printed; in bracket, there’s a shorthand notation for the result:
OK
: Test passed.KO
: Test failed, your program printed something unexpected.TO
: Time out, your program took too long to answer, it’s usually caused by an infinite loop.
8 Tools & Libraries
8.1 Size to bins
In the implementation of segregated lists, you will want to map the size of free
blocks to their lists. These lists are generally known as bins. The optimal
number NBINS
of bins and the optimal mapping SIZE_TO_BIN
are up to debate,
however, you are allowed to use the GNU libc malloc
’s versions. They are
provided as macros NBINS
and SIZE_TO_BIN
in the file malloc.h
, you can
thus use, for instance:
size_t bin_number = SIZE_TO_BIN (payload_size (h)); for (bin_number++; bin_number < NBINS; ++bin_number) ...
8.2 GNU libc implementation
8.3 GDB
Refer to the Dict Lab writeup for info on GDB. To run one of the tests using
your implementation of malloc
, use the program name printed by the driver:
$ make test
* PART 0: Correctness
[KO] progs/grading-p0-amptjp
CTRL-c
$ cd tests
$ gdb progs/grading-p0-amptjp
(gdb) b malloc
Breakpoint 1 at 0x1040
(gdb) r
Starting program: /home/micha/src/malloclab/tests/progs/grading-p0-amptjp
Breakpoint 1, malloc (plsize=2040) at malloc.c:170
170 if (params.head == NULL)
You can also run any program in GDB and attach your library:
$ gdb emacs
(gdb) set environment LD_PRELOAD src/libmalloc.so
(gdb) b malloc
Breakpoint 1 at 0x43a70
(gdb) r -nw -Q
Starting program: /usr/local/bin/emacs -nw -Q
Breakpoint 1, malloc (plsize=4) at malloc.c:170
170 if (params.head == NULL)
8.4 Efficient debugging
Debugging an implementation of malloc
is an art in itself. The tests
of Part 0 can only detect so much and you may end up with unexpected crashes.
As with any piece of code, it is better to make it crash yourself than having
the user do it.
8.4.1 Checking assumptions
You should make sure that your assumptions about the state of memory are always correct; there are two ways to do that:
Use
assert(3)
. For instance, when you assume that a function returns a free blockb
, you may want to check this:block = get_free_block (size); assert (block->free);
Your program will crash with a useful error message if this condition is not satisfied:
grading-p0-ls: malloc.c:128: main: Assertion `block->free' failed. [1] 650575 abort (core dumped) ./a.out
You can then run your program in GDB, have it crash, and go up the call stack to see what caused the assertion to be false.
- Use a consistency checker. This is a function you would call at the beginning
and end of each of your
malloc
,free(1)
, andrealloc
. It would check that the memory is in the state you expect. For instance, you may want to check that all the blocks in the list of free blocks is marked free (traversing the list), and that every block that is free is in the free list (traversing the memory linearly).
The calls to assert
are automatically removed when compiled as the optimized
library: as indicated in the man page assert(3)
, if the macro NDEBUG
is defined, then assert()
generates no code. Your optimized library is
compiled using the command line flag -DNDEBUG
, which defines the macro
NDEBUG
. However, the consistency checker is yours, so you should make sure
that it does nothing when NDEBUG
is defined:
void check_consistency () { #ifndef NDEBUG //... #endif }
Note that all the tests run by the driver are using your library with
NDEBUG
. If your program fails a test, you will want to run the version
linked with the nonoptimized library (see Evaluation Methodology).
8.4.2 Printing debug information
9 Handing in Your Work
When you have completed the lab, you will submit it as follows:
$ make submit
This will package your whole src/
folder into a tarball that is saved in
~/submissions/
, where the instructor will get it. Since this is simply saved
on your account, you can use make submit
as many times as you want up until
the deadline.