As with previous assignments, we wil be using GitHub to distribute skeleton code and collect submissions. Please refer to our Git Workflow guide for more details. Note that we will be using multiple tags for this assignment, for each deliverable part.
NOTE: If at all possible, please try to submit using x86. If one of your group
members owns an x86 machine, test on that machine prior to submitting, and do not
commit a .armpls
file. This will make grading much easier for us.
For students on arm64 computers (e.g. M1/M2 machines): if you want your
submission to be built/tested for ARM, you must create and submit a file called
.armpls
in the top-level directory of your repo; feel free to use the
following one-liner:
cd "$(git rev-parse --show-toplevel)" && touch .armpls && git add -f .armpls && git commit .armpls -m "ARM pls"
You should do this first so that this file is present in all parts.
There is a script in the skeleton code named run_checkpatch.sh
. It is a
wrapper over linux/scripts/checkpatch.pl
, which is a Perl script that comes
with the Linux kernel that checks if your code conforms to the kernel coding
style.
Execute run_checkpatch.sh
to see if your code conforms to the kernel style –
it’ll let you know what changes you should make. You must make these changes
before pushing a tag. Passing run_checkpatch.sh
with no warnings and no errors
is required for this assignment.
In addition to the pristine Linux kernel source tree (under linux/
), we’ve
provided a patch file which will create the syscall stubs for you. You will need
to apply this patch to your repo.
The patch is under the following path:
patch/fridge.patch
You can use git apply
to apply this patch. First, check which files will be
modified by the patch:
git apply --stat patch/fridge.patch
You should also inspect what the patch is doing by reading the diffs inside. Finally, you can apply the patch with the following:
git apply patch/fridge.patch
Now, when you run git status
, you should see some files modified, as well as a
file added at linux/kernel/kkv.c
. After verifying that these changes worked as
intended, commit them.
Now, when you build your kernel, you should have the four fridge system call
stubs in your kernel. Make sure you’re building with a local version that is
different from your fallback (-cs4118
), so you don’t overwrite it; set your
local version to your UNI (i.e. -<uni>-fridge
).
Your entire implementation should take place under the user/
directory. No
changes should be made under linux/
(besides applying the patch and creating a
.config
file).
When we grade your submission, we will not compile the kernel in your linux/
directory. We will only attempt to insert your module into our pre-compiled
fridge kernel which is the standard linux kernel with our patch applied. If your
submission depends on subsequent modifications to the kernel your module will
not work with our patched kernel, and we will not grade your submission.
For this assignment, you will be using libfridge, a user space wrapper library for the fridge system calls.
In your skeleton repo, you will find the source code for libfridge under the
directory user/lib/libfridge/
. Follow the instructions in the README
to
build and install this library.
After you install libfridge, you should be able to #include <fridge.h>
in your
userspace programs when you compile, and link against libfridge by passing the
-lfridge
flag when you link.
In this part, we will implement an in-kernel key-value store, fridge. The
fridge will be accessed using four systems calls that you will
add to the Linux kernel. Here’s the userspace API for those four system calls
(see user/lib/libfridge/fridge.h
):
/*
* Initialize the Kernel Key-Value store.
*
* Returns 0 on success.
* Returns -1 on failure, with errno set accordingly.
* The flags parameter is currently unused.
*
* The result of initializing twice (without an intervening
* kkv_destroy() call) is undefined.
*/
int kkv_init(int flags);
/*
* Destroy the Kernel Key-Value store, removing all entries
* and deallocating all memory.
*
* Returns the number of entries removed on success.
* Returns -1 on failure, with errno set accordingly.
* The flags parameter is currently unused.
*
* After calling kkv_destroy(), the Kernel Key-Value store can be
* re-initialized by calling kkv_init() again.
*
* The result of destroying before initializing is undefined.
*/
int kkv_destroy(int flags);
/*
* If a key-value pair is found for the given key, the pair is
* REMOVED from the Kernel Key-Value store.
*
* The value is copied into the buffer pointed to by the "val" parameter,
* up to "size" bytes. Note that if the value was a string of
* length >= "size", the string will be truncated and it will not be
* null-terminated.
*
* Returns 0 if a key-value pair was found, removed and returned.
* Returns -1 with errno set to ENOENT if the key was not found.
* Returns -1 on other failures, with errno set accordingly.
* The flags parameter is currently unused; later used for specifying
* blocking behavior.
*
* The result of calling kkv_get() before initializing the Kernel Key-Value
* store is undefined.
*/
int kkv_get(uint32_t key, void *val, size_t size, int flags);
/*
* Insert a new key-value pair. The previous value for the key, if any, is
* replaced by the new value.
*
* The "size" bytes from the buffer pointed to by the "val" parameter
* will be copied into the Kernel Key-Value store.
*
* If "val" is a string, users of this syscall should make sure that
* size == strlen(val) + 1, so that the null character is stored as part
* of the value.
*
* Returns 0 on success.
* Returns -1 on failure, with errno set accordingly.
* The flags parameter is currently unused.
*
* The result of calling kkv_put() before initializing the Kernel
* Key-Value store is undefined.
*/
int kkv_put(uint32_t key, void *val, size_t size, int flags);
Note that, while these userspace wrappers return int
, the underlying syscalls
should return long
. The syscalls have to return long
for errno
to be set
properly.
Alongside a kernel module stub, we’ve provided a header file in your skeleton code which defines some hash table data structures. You must use these data structures in your fridge implementation; do not modify them in any way.
Note that some of the struct members are for part 4 only; you may leave these unused. The relevant fields for this part are the following:
#define HASH_TABLE_LENGTH 17
#define KKV_NONBLOCK 0
/*
* A key-value pair.
*/
struct kkv_pair {
void *val;
uint32_t key;
size_t size;
};
/*
* A node in a linked list. Contains a key-value pair.
*/
struct kkv_ht_entry {
struct list_head entries;
struct kkv_pair kv_pair;
};
/*
* A bucket in the hash table.
* The hash table is an array of HASH_TABLE_LENGTH buckets.
*/
struct kkv_ht_bucket {
spinlock_t lock;
struct list_head entries;
uint32_t count;
};
Please read the following pages on kernel locks:
You will also want to refresh your memory on using kernel linked lists and
allocating kernel memory using kmalloc()
. We used them previously, in the
Linux list assignment.
Have a look at fridge.h
and libfridge.c
from the libfridge source code,
and how they invoke the syscalls.
Implement the four system calls.
Now implement locking, so that put
and get
work with concurrent puts and
gets.
kmalloc()
,
copy_from/to_user()
, etc.) while holding a spin lock.Please implement fridge syscalls in a kernel module using the technique from HW4.
In the Linux kernel source tree, you will find the system call stubs
implemented for you in linux/kernel/kkv.c
.
You will find a module stub in your skeleton repo at the path
user/module/fridge/
. Please implement your modularized system call here in
fridge.c
, and don’t forget to use the supplied hash table data structure
definitions in fridge_data_structures.h
.
For this part, while kkv_put()
and kkv_get()
must be concurrent, you can
assume that kkv_init()
and kkv_destroy()
are called in an orderly,
synchronous manner. That is, only a single call to kkv_init()
is made before
any calls to kkv_put()
and kkv_get()
, and only a single call to
kkv_destroy()
is right before the program exits. We will deal with
concurrency and misuse of kkv_init()
and kkv_destroy()
in the next part.
We’ve provided some simple test programs in user/test/
to help you get
started, but you must test your syscalls as rigorously as you can.
You’ll find a small C test suite under user/test/con-ed/
, and some Python
bindings under user/test/py-in-fridge
. You should find con-ed
’s
simple_sequential
and hot_potato
tests useful for this part.
Please write your own test programs, and include them in your submission
under the user/test/<group-name>-p1-test/
. Feel free to link in any of the
boilerplate code from con-ed
or py-in-fridge
.
For now, pass in KKV_NONBLOCK
(defined in libfridge) for the flags
parameter of kkv_get()
; pass 0
for the flags
parameter of all other
fridge syscalls.
Make sure to include a Makefile
so that we can build and run your test
programs.
Include the following information in your README.txt
:
Description of your system call implementation.
Description of your test programs and results.
To submit this part, push the hw5p1handin
tag with the following:
git tag -a -m "Completed hw5 part1." hw5p1handin
git push origin master
git push origin hw5p1handin
What should happen when a user program misuses the system calls? What happens
when kkv_init()
gets called twice in a row? What happens when kkv_destroy()
is called while the kernel is executing kkv_put()
?
In the API function comments shown above, certain behaviors are left undefined. For user level library functions, an easy way to implement undefined behaviors is to not do anything special, and possibly let the program crash.
This is not the case for system calls. System calls must be robust against misuse and abuse. A user program should never be able to crash the operating system or corrupt kernel data structures no matter what it does.
Modify the code so that the four system calls can be called in any order and in
any degree of concurrency between them. Of course, if a sequence of calls
doesn’t make sense, they don’t have to succeed. Returning -EPERM
is a
reasonable course of action in those cases.
Your solution in this part should not sacrifice concurrency among puts and gets. Multiple puts and gets should be able to occur simultaneously if they operate on different buckets.
Your solution must allow concurrent init, destroy, put, and get calls. For
example, calling kkv_destroy()
and kkv_put()
at the same time should not
cause a kernel panic or corrupt internal data structures.
As in the previous part, you should write test programs to stress-test your system calls.
Please include your new/updated test programs in
user/test/<group-name>-p2-test/
.
As before, please include a Makefile
to build these test programs.
Include the following information in your README.txt
:
Description of your system call implementation.
Description of your test programs and results.
To submit this part, push the hw5p2handin
tag with the following:
git tag -a -m "Completed hw5 part2." hw5p2handin
git push origin master
git push origin hw5p2handin
The Linux kernel provides the SLAB layer (also called the SLAB allocator) as a
more efficient way to manage memory for kernel-level data structures. It is
specifically designed as a memory management interface to be used for repeated
allocations and deallocations of the same structures, as we are doing here with
kkv_ht_entry
.
Read the following pages on the SLAB allocator:
Set up a SLAB cache for struct kkv_ht_entry
, and replace calls to
kmalloc()
and kfree()
with the appropriate SLAB allocator calls for this
struct.
To submit this part, push the hw5p3handin
tag with the following:
git tag -a -m "Completed hw5 part3." hw5p3handin
git push origin master
git push origin hw5p3handin
In this part, we will modify the fridge implementation so that kkv_get()
system call supports blocking. We will now use the flags
parameter to specify
blocking behavior, which will be set to either KKV_NONBLOCK
or KKV_BLOCK
(both defined in fridge_data_structures.h
.
When flags == KKV_NONBLOCK
, kkv_get()
will behave exactly the same as
before, as described in the API documentation in part 1.
When flags == KKV_BLOCK
, kkv_get()
will change its behavior as follows:
Returns 0 if a key-value pair was found, removed, and written to value
.
(This is the same behavior as the non-blocking case.)
When a key-value pair for the given key is not present in the kernel key-value store, the calling process will block, waiting for a key-value pair for the key to be added to the key-value store.
When another thread adds a key-value pair for the key by calling kkv_put()
,
the waiting process will then wake up, remove the key-value pair, copy the
value to user land up to max_size
bytes, and returns 0.
When multiple threads are waiting for the same key, and another thread adds a key-value pair for the key by calling kkv_put(), only one of the waiting threads will wake up and remove the value. The other threads will keep waiting.
kkv_get()
.Note that when multiple threads are waiting on the same key, you should
NOT expect that each kkv_put()
will be matched by a waiting thread.
kkv_put()
will replace an existing key-value pair. It is
entirely possible that two consecutive kkv_put()
calls will happen
without an intervening removal of the key-value pair by another thread.
Keep this in mind when you write your test driver.A thread blocked on kkv_get()
should return when it receives a signal.
kkv_get()
returns -EINTR
in that case.
signal_pending()
kernel function will come in handy here.The API descriptions for kkv_put()
, kkv_init()
, and kkv_destroy()
remain
the same. That is, they will behave exactly the same as in part 1. However, you
may need to change their implementations to support the new behavior of
kkv_get()
. The flags
parameter for the remaining three syscalls will remain
unused.
You should now fully utilize the data structures defined in
fridge_data_structures.h
. The fields that were previously unused are
highlighted below:
#define KKV_NONBLOCK 0
#define KKV_BLOCK 1
/*
* A node in a linked list. Contains a key-value pair.
*/
struct kkv_ht_entry {
struct list_head entries;
struct kkv_pair kv_pair;
wait_queue_head_t q; /* only for part 4 (empty fridge) */
uint32_t q_count; /* only for part 4 (empty fridge) */
};
Implement blocking behavior for kkv_get()
using the the fields defined for
struct kkv_ht_entry
in fridge_data_structures.h
.
Do not lose the concurrency implemented in part 2. That is, a user program should be able to call the four system calls in any order without crashing the kernel or corrupting internal data structures.
Note that when kkv_get()
blocks waiting for a key-value pair, this is a slow
system call because the key-value pair may never come. Make sure that you are
not holding any lock when sleep in kkv_get()
.
The other test programs (blocking_simple
and blocking_signal
in con-ed
,
and simple_as.py
in py-in-fridge
) will come in handy for this part, but
you should also write some more tests of your own to verify that the blocking
kkv_get()
works as specified.
Please include your empty fridge test programs in
user/test/<group-name>-p4-test/
.
Make sure to pass KKV_BLOCK
to the flags
parameter for kkv_get()
to
test its blocking behavior.
Don’t forget to run regression tests with the test programs you wrote for previous parts to ensure that they still work!
You should pass in KKV_NONBLOCK
to the functions for which the flags
parameter remains unused.
As before, please include a Makefile
to build your test programs.
Include the following information in your README.txt
:
Description of your system call implementation.
Description of your test programs and results.
To submit this part, push the hw5p4handin
tag with the following:
git tag -a -m "Completed hw5 part4." hw5p4handin
git push origin master
git push origin hw5p4handin
The initial Fridge API and test driver suite was designed and implemented by the following TAs of COMS W4118 Operating Systems I, Spring 2015, Columbia University:
The Empty Fridge extension to the original Fridge API and the test driver suite was written by the following TAs of COMS W4118 Operating Systems I, Spring 2016, Columbia University:
The Fridge assignment skeleton code, assignment, solutions, and test suite were updated to support 64-bit ArchLinux, the GitHub-based repo management, and fridge Python bindings by the following TAs of COMS W4118 Operating Systems I, Spring 2018, Columbia University:
The Fridge assignment was updated for 64-bit Linux version 5.10.57 by the following TAs of COMS W4118 Operating Systems I, Fall 2021, Columbia University:
The Fridge assignment was updated for 64-bit Linux version 5.10.158 by the following TAs of COMS W4118 Operating Systems I, Spring 2023, Columbia University:
Last updated: 2023-03-08