fridge

Fridge

Submission

As with previous assignments, we wil be using GitHub to distribute skeleton code and collect submissions. Please refer to our Git Workflow guide for more details. Note that we will be using multiple tags for this assignment, for each deliverable part.

NOTE: If at all possible, please try to submit using x86. If one of your group members owns an x86 machine, test on that machine prior to submitting, and do not commit a .armpls file. This will make grading much easier for us.

For students on arm64 computers (e.g. M1/M2 machines): if you want your submission to be built/tested for ARM, you must create and submit a file called .armpls in the top-level directory of your repo; feel free to use the following one-liner:

cd "$(git rev-parse --show-toplevel)" && touch .armpls && git add -f .armpls && git commit .armpls -m "ARM pls"

You should do this first so that this file is present in all parts.

Code Style

There is a script in the skeleton code named run_checkpatch.sh. It is a wrapper over linux/scripts/checkpatch.pl, which is a Perl script that comes with the Linux kernel that checks if your code conforms to the kernel coding style.

Execute run_checkpatch.sh to see if your code conforms to the kernel style – it’ll let you know what changes you should make. You must make these changes before pushing a tag. Passing run_checkpatch.sh with no warnings and no errors is required for this assignment.

Part 0: Apply kernel syscall stubs + install libfridge

Kernel syscall stubs

In addition to the pristine Linux kernel source tree (under linux/), we’ve provided a patch file which will create the syscall stubs for you. You will need to apply this patch to your repo.

The patch is under the following path:

patch/fridge.patch

You can use git apply to apply this patch. First, check which files will be modified by the patch:

git apply --stat patch/fridge.patch

You should also inspect what the patch is doing by reading the diffs inside. Finally, you can apply the patch with the following:

git apply patch/fridge.patch

Now, when you run git status, you should see some files modified, as well as a file added at linux/kernel/kkv.c. After verifying that these changes worked as intended, commit them.

Now, when you build your kernel, you should have the four fridge system call stubs in your kernel. Make sure you’re building with a local version that is different from your fallback (-cs4118), so you don’t overwrite it; set your local version to your UNI (i.e. -<uni>-fridge).

Your entire implementation should take place under the user/ directory. No changes should be made under linux/ (besides applying the patch and creating a .config file).

When we grade your submission, we will not compile the kernel in your linux/ directory. We will only attempt to insert your module into our pre-compiled fridge kernel which is the standard linux kernel with our patch applied. If your submission depends on subsequent modifications to the kernel your module will not work with our patched kernel, and we will not grade your submission.

libfridge

For this assignment, you will be using libfridge, a user space wrapper library for the fridge system calls.

In your skeleton repo, you will find the source code for libfridge under the directory user/lib/libfridge/. Follow the instructions in the README to build and install this library.

After you install libfridge, you should be able to #include <fridge.h> in your userspace programs when you compile, and link against libfridge by passing the -lfridge flag when you link.

Part 1: Fridge, a Kernel Key-Value Store

In this part, we will implement an in-kernel key-value store, fridge. The fridge will be accessed using four systems calls that you will add to the Linux kernel. Here’s the userspace API for those four system calls (see user/lib/libfridge/fridge.h):

/*
 * Initialize the Kernel Key-Value store.
 *
 * Returns 0 on success.
 * Returns -1 on failure, with errno set accordingly.
 * The flags parameter is currently unused.
 *
 * The result of initializing twice (without an intervening
 * kkv_destroy() call) is undefined.
 */
int kkv_init(int flags);

/*
 * Destroy the Kernel Key-Value store, removing all entries
 * and deallocating all memory.
 *
 * Returns the number of entries removed on success.
 * Returns -1 on failure, with errno set accordingly.
 * The flags parameter is currently unused.
 *
 * After calling kkv_destroy(), the Kernel Key-Value store can be
 * re-initialized by calling kkv_init() again.
 *
 * The result of destroying before initializing is undefined.
 */
int kkv_destroy(int flags);

/*
 * If a key-value pair is found for the given key, the pair is
 * REMOVED from the Kernel Key-Value store.
 *
 * The value is copied into the buffer pointed to by the "val" parameter,
 * up to "size" bytes. Note that if the value was a string of
 * length >= "size", the string will be truncated and it will not be
 * null-terminated.
 *
 * Returns 0 if a key-value pair was found, removed and returned.
 * Returns -1 with errno set to ENOENT if the key was not found.
 * Returns -1 on other failures, with errno set accordingly.
 * The flags parameter is currently unused; later used for specifying
 * blocking behavior.
 *
 * The result of calling kkv_get() before initializing the Kernel Key-Value
 * store is undefined.
 */
int kkv_get(uint32_t key, void *val, size_t size, int flags);

/*
 * Insert a new key-value pair. The previous value for the key, if any, is
 * replaced by the new value.
 *
 * The "size" bytes from the buffer pointed to by the "val" parameter
 * will be copied into the Kernel Key-Value store.
 *
 * If "val" is a string, users of this syscall should make sure that
 * size == strlen(val) + 1, so that the null character is stored as part
 * of the value.
 *
 * Returns 0 on success.
 * Returns -1 on failure, with errno set accordingly.
 * The flags parameter is currently unused.
 *
 * The result of calling kkv_put() before initializing the Kernel
 * Key-Value store is undefined.
 */
int kkv_put(uint32_t key, void *val, size_t size, int flags);

Note that, while these userspace wrappers return int, the underlying syscalls should return long. The syscalls have to return long for errno to be set properly.

Alongside a kernel module stub, we’ve provided a header file in your skeleton code which defines some hash table data structures. You must use these data structures in your fridge implementation; do not modify them in any way.

Note that some of the struct members are for part 4 only; you may leave these unused. The relevant fields for this part are the following:

#define HASH_TABLE_LENGTH 17
#define KKV_NONBLOCK 0

/*
 * A key-value pair.
 */
struct kkv_pair {
    void *val;
    uint32_t key;
    size_t size;
};

/*
 * A node in a linked list. Contains a key-value pair.
 */
struct kkv_ht_entry {
    struct list_head entries;
    struct kkv_pair kv_pair;
};

/*
 * A bucket in the hash table.
 * The hash table is an array of HASH_TABLE_LENGTH buckets.
 */
struct kkv_ht_bucket {
    spinlock_t lock;
    struct list_head entries;
    uint32_t count;
};

Tasks

  1. Please read the following pages on kernel locks:

    • LKD chapter 10: pages 183 (from Spin Locks) - 202

    You will also want to refresh your memory on using kernel linked lists and allocating kernel memory using kmalloc(). We used them previously, in the Linux list assignment.

  2. Have a look at fridge.h and libfridge.c from the libfridge source code, and how they invoke the syscalls.

  3. Implement the four system calls.

    • For this step, don’t worry about locking; just make it work for single-threaded test programs.
  4. Now implement locking, so that put and get work with concurrent puts and gets.

    • Use the spin lock embedded in each bucket. Spin locks should be grabbed and released as quickly as possible. Make sure to structure your code to minimize the locked region, and only hold locks only when necessary. Recall that you cannot call any kernel functions that may block (e.g. kmalloc(), copy_from/to_user(), etc.) while holding a spin lock.

Requirements

Submission

To submit this part, push the hw5p1handin tag with the following:

git tag -a -m "Completed hw5 part1." hw5p1handin
git push origin master
git push origin hw5p1handin

Part 2: Undefined? Yes. Crash? No Way

What should happen when a user program misuses the system calls? What happens when kkv_init() gets called twice in a row? What happens when kkv_destroy() is called while the kernel is executing kkv_put()?

In the API function comments shown above, certain behaviors are left undefined. For user level library functions, an easy way to implement undefined behaviors is to not do anything special, and possibly let the program crash.

This is not the case for system calls. System calls must be robust against misuse and abuse. A user program should never be able to crash the operating system or corrupt kernel data structures no matter what it does.

Modify the code so that the four system calls can be called in any order and in any degree of concurrency between them. Of course, if a sequence of calls doesn’t make sense, they don’t have to succeed. Returning -EPERM is a reasonable course of action in those cases.

Requirements

Submission

To submit this part, push the hw5p2handin tag with the following:

git tag -a -m "Completed hw5 part2." hw5p2handin
git push origin master
git push origin hw5p2handin

Part 3: Using the SLAB allocator

The Linux kernel provides the SLAB layer (also called the SLAB allocator) as a more efficient way to manage memory for kernel-level data structures. It is specifically designed as a memory management interface to be used for repeated allocations and deallocations of the same structures, as we are doing here with kkv_ht_entry.

Tasks

  1. Read the following pages on the SLAB allocator:

    • LKD Chapter 12, pages 245 - 252
  2. Set up a SLAB cache for struct kkv_ht_entry, and replace calls to kmalloc() and kfree() with the appropriate SLAB allocator calls for this struct.

Requirements

Submission

To submit this part, push the hw5p3handin tag with the following:

git tag -a -m "Completed hw5 part3." hw5p3handin
git push origin master
git push origin hw5p3handin

Part 4: Empty Fridge

In this part, we will modify the fridge implementation so that kkv_get() system call supports blocking. We will now use the flags parameter to specify blocking behavior, which will be set to either KKV_NONBLOCK or KKV_BLOCK (both defined in fridge_data_structures.h.

When flags == KKV_NONBLOCK, kkv_get() will behave exactly the same as before, as described in the API documentation in part 1.

When flags == KKV_BLOCK, kkv_get() will change its behavior as follows:

  1. Returns 0 if a key-value pair was found, removed, and written to value. (This is the same behavior as the non-blocking case.)

  2. When a key-value pair for the given key is not present in the kernel key-value store, the calling process will block, waiting for a key-value pair for the key to be added to the key-value store.

    When another thread adds a key-value pair for the key by calling kkv_put(), the waiting process will then wake up, remove the key-value pair, copy the value to user land up to max_size bytes, and returns 0.

  3. When multiple threads are waiting for the same key, and another thread adds a key-value pair for the key by calling kkv_put(), only one of the waiting threads will wake up and remove the value. The other threads will keep waiting.

    • Those other threads might have woken up, found that there is no key-value pair for the key, and gone back to sleep, but that all happens in the kernel; those threads do not return from kkv_get().

    Note that when multiple threads are waiting on the same key, you should NOT expect that each kkv_put() will be matched by a waiting thread.

    • Recall that kkv_put() will replace an existing key-value pair. It is entirely possible that two consecutive kkv_put() calls will happen without an intervening removal of the key-value pair by another thread. Keep this in mind when you write your test driver.
  4. A thread blocked on kkv_get() should return when it receives a signal. kkv_get() returns -EINTR in that case.

    • The signal_pending() kernel function will come in handy here.
    • Make sure you are not holding a lock or leaking memory when you return because of signals.

The API descriptions for kkv_put(), kkv_init(), and kkv_destroy() remain the same. That is, they will behave exactly the same as in part 1. However, you may need to change their implementations to support the new behavior of kkv_get(). The flags parameter for the remaining three syscalls will remain unused.

You should now fully utilize the data structures defined in fridge_data_structures.h. The fields that were previously unused are highlighted below:

#define KKV_NONBLOCK 0
#define KKV_BLOCK 1

/*
 * A node in a linked list. Contains a key-value pair.
 */
struct kkv_ht_entry {
    struct list_head entries;
    struct kkv_pair kv_pair;
    wait_queue_head_t q;        /* only for part 4 (empty fridge) */
    uint32_t q_count;           /* only for part 4 (empty fridge) */
};

Requirements

Submission

To submit this part, push the hw5p4handin tag with the following:

git tag -a -m "Completed hw5 part4." hw5p4handin
git push origin master
git push origin hw5p4handin

Acknowledgements

The initial Fridge API and test driver suite was designed and implemented by the following TAs of COMS W4118 Operating Systems I, Spring 2015, Columbia University:

The Empty Fridge extension to the original Fridge API and the test driver suite was written by the following TAs of COMS W4118 Operating Systems I, Spring 2016, Columbia University:

The Fridge assignment skeleton code, assignment, solutions, and test suite were updated to support 64-bit ArchLinux, the GitHub-based repo management, and fridge Python bindings by the following TAs of COMS W4118 Operating Systems I, Spring 2018, Columbia University:

The Fridge assignment was updated for 64-bit Linux version 5.10.57 by the following TAs of COMS W4118 Operating Systems I, Fall 2021, Columbia University:

The Fridge assignment was updated for 64-bit Linux version 5.10.158 by the following TAs of COMS W4118 Operating Systems I, Spring 2023, Columbia University:


Last updated: 2023-03-08