To obtain the skeleton files that we have provided for you, you need to download
the lab tarball, hw1.tgz
, from Courseworks, under the Files tab. Inside
your VM, you may unpack it with the following command:
tar xzf hw1.tgz
This will unpack into a directory named hw1
, which is a git repo.
The TAs will use scripts to download, extract, build and display your code. It is essential that you DO NOT change the names of the skeleton files provided to you. If you deviate from the requirements, the grading scripts will fail and you will not receive credit for your work.
For students on ARM Mac computers (e.g. with M1 chip): if you want your
submission to be built/tested for ARM, you must create and submit a file called
.armpls
in the top-level directory of your repo; feel free to use the
following one-liner:
cd "$(git rev-parse --show-toplevel)" && touch .armpls && git add -f .armpls && git commit .armpls -m "ARM pls"
You should do this right after unpacking the tarball.
You need to have at least 5 git commits total, but we encourage you to have many
more. All your work should be committed to the master
branch. If you have not
used git before, this tutorial can get you started.
At a minimum, your README.txt
should contain the following info:
The description should indicate whether your solution for the part is working or not. You may also want to include anything else you would like to communicate to the TA grader, such as extra functionality you implemented or how you tried to fix your non-working code.
Once you have completed the assignment, you must create a tarball of your work.
Everything should still be in a directory named hw1/
. Create the tarball with
the following:
tar czf hw1-<UNI>.tgz hw1
where <UNI>
is replaced with your UNI. Upload hw1-<UNI>.tgz
to Courseworks
to submit your assignment.
Answers to written questions must be added to the part1.txt
file provided.
Please limit your explanations to around 40 words each.
Please read the following two chapters from OSTEP:
4: The Abstraction: The Process
6: Mechanism: Limited Direct Execution
Then answer the following questions:
What actions does a kernel perform to context switch between processes?
What process state does a process enter immediately after it terminates? What is this state for, and when does it leave this state?
Read chapter 6 of LKD (pp.85-96) on Linux’s linked list implementation.
If the LKD book’s explanation left you still confused, you can take a look at another book, Understanding the Linux Kernel, 3rd Edition, by Bovet & Cesati, (pp.87-90) for another explanation.
Read the container_of()
macro code shown in LKD, page 89 carefully. Answer the
following questions:
What is typeof
? Is it a C language keyword, function, macro, or something
else? What does it do?
What does offsetof
do?
What does the container_of()
macro do?
The container_of()
macro, in the way it’s written in LKD, is not portable
across C compilers because it uses GNU-specific language extensions. Can you
identify what they are?
The container_of()
macro in LKD has two lines. Is the 1st line (the
declaration of __mptr
) really necessary? Is it possible to rewrite the
macro without it? What is the purpose of that 1st line?
The answers to all these questions can be found by Googling, which you are allowed to do in this case. However, I urge you to make a serious attempt to answer these questions on your own. The point of this exercise is to get you ready for the kind of crazy C code you will encounter in the Linux kernel. Take this opportunity to flex your C legs.
If you find it difficult to answer these questions after reading the books, come back to this after you have completed the programming assignment below.
Before starting the programming portion of this assignment, familiarize yourself
with writing and using Linux kernel modules. You will be
writing your own module using Linux’s linked list implementation, included from
<linux/list.h>
. Be sure to test your module for memory leaks using KEDR
(the TAs will also check for memory leaks using KEDR while grading all module-based
assignments in this course).
We define a kernel data structure to represent Pokémon:
struct pokemon {
char name[32];
int dex_no;
struct list_head list;
};
Note that the struct list_head
field is embedded within our data structure.
Your task is to write a kernel module that will manipulate a Pokédex, which is a list of Pokémon. On initialization, it will add some Pokémon to the list, and then print them out. On exit, it will print out those Pokémon again, then delete those Pokémon from the list.
The skeleton code contains the outline of the kernel module, including the above
struct pokemon
definition, and the entry and exit points, pokedex_init
and
pokedex_exit
. You should not modify these.
Your job is to fill in the functions called by these entry and exit points.
First, you should declare a single struct list_head
named pokedex
, which
will serve as the head of the list of struct pokemon
s.
Then, you must fill in following functions, using the macros and functions
provided by <linux/list.h>
:
void add_pokemon(char *name, int dex_no)
adds a new Pokémon to the end of
the global pokedex
. The new pokemon’s fields should be populated with the
corresponding arguments. The list_add_tail()
function may come in handy
here.
void print_pokedex(void)
prints each Pokémon in the pokedex
, using the
given print_pokemon()
function. The list_for_each_entry
macro may be
useful here.
void delete_pokedex(void)
deletes all Pokémon from the global pokedex
. If
the pokedex
is empty, this function does nothing. You should use the
list_for_each_entry_safe()
macro and list_del()
function here.
Note that your Pokémon will need to persist between your module’s entry point
and exit point – you may find the functions kmalloc()
and kfree()
useful.
These serve the same purpose as malloc()
and free()
, except in the kernel
execution context. Note that kmalloc()
takes an additional flag
parameter –
for this assignment, you may just use GFP_KERNEL
for that argument. See LKD
pp.238-244 for more details.
Rewrite part2 without using the macros or functions defined in <linux/list.h>
.
In other words, replace the calls to the following functions and macros from
<linux/list.h>
with your own inline code, directly manipulating pointers. None
of the following functions or macros should appear in your code:
list_add_tail
list_for_each_entry
list_del
list_for_each_entry_safe
LIST_HEAD
LIST_HEAD_INIT
list_entry
container_of
offsetof
READ_ONCE
WRITE_ONCE
Do not call any linked-list related functions or macros in the linux kernel. Do not define any additional macros or functions in your code other than those already defined in the skeleton code.
Your linked-list should still be implemented using struct list_head
and
preserve the same semantics as the kernel’s implementation. That is, your code
should add/remove/access/traverse list elements in the same way that the kernel
macros/functions would.
The purpose of this exercise is to understand how the Linux linked-list implementation works under the hood. Study the implementations of the macros/functions that you used in part2. Once you understand how they work, you should be able to write the equivalent C code for this part. You may find it helpful to familiarize yourself with bootlin, a popular online kernel-source navigator. Be sure to select the correct Linux kernel version. In the Spring 2023 semester, you will be working with version v5.10.158.
Note that the linux macros/functions contain type/error-checking code to make
the generic implementation safer to use. You may not need that for your
type-specific code. Also, don’t worry about replicating READ/WRITE_ONCE
– the
kernel uses these macros to ensure safe concurrent access.
In Linux, processes are represented by the struct task_struct
, which is
defined in <linux/sched.h>
. Each task_struct
represents a node in the
process tree. To implement this tree data structure, it uses two struct
list_head
fields: children
and sibling
. children
is the list head to the
process’s list of children, while sibling
connects each process to its
siblings, as well to its parent’s children
field.
For this part, the skeleton code we have provided is a minimal kernel module
boilerplate. Your task is to write a kernel module, tasklist
. Upon
initialization, it performs a depth-first traversal of the process tree, and
prints out the process tree. You’re not supposed to use recursion in the linux
kernel, but for the sake of simplicity, it’s fine for this assignment. This
module should do nothing on exit.
You should start at the root of the process tree, which you may access via the
struct task_struct init_task
variable exposed by <linux/sched/task.h>
. For
each process, you should print out its PID and the process name. You may obtain
these from the pid
and comm
fields of the task_struct
.
When you print out the processes, you should also display the hierarchical tree. Here is an example output:
[ 9303.501734] Removing Module
[ 9304.846569] Loading Module
[ 9304.846574] -- [0] swapper/0
[ 9304.846576] \_ [1] systemd
[ 9304.846578] \_ [95] systemd-journal
[ 9304.846579] \_ [119] systemd-udevd
[ 9304.846581] \_ [14948] systemd-udevd
[ 9304.846583] \_ [150] VBoxService
[ 9304.846585] \_ [152] systemd-logind
[ 9304.846586] \_ [153] avahi-daemon
[ 9304.846588] \_ [160] avahi-daemon
[ 9304.846590] \_ [154] dbus-daemon
[ 9304.846592] \_ [155] dhcpcd
[ 9304.846593] \_ [162] login
[ 9304.846595] \_ [365] bash
[ 9304.846597] \_ [441] startx
[ 9304.846599] \_ [458] xinit
[ 9304.846601] \_ [459] X
[ 9304.846603] \_ [462] sh
[ 9304.846605] \_ [474] xfce4-session
[ 9304.846607] \_ [488] xfwm4
[ 9304.846609] \_ [490] Thunar
[ 9304.846611] \_ [491] xfce4-panel
[ 9304.846614] \_ [512] panel-16-systra
[ 9304.846616] \_ [516] panel-17-action
[ 9304.846618] \_ [493] xfdesktop
[ 9304.846621] \_ [496] xfce4-terminal
[ 9304.846623] \_ [556] gnome-pty-helpe
[ 9304.846625] \_ [558] bash
[ 9304.846627] \_ [570] bash
[ 9304.846629] \_ [581] bash
[ 9304.846631] \_ [587] bash
[ 9304.846633] \_ [15189] sudo
[ 9304.846635] \_ [15190] insmod
[ 9304.846638] \_ [8939] bash
[ 9304.846639] \_ [167] sshd
[ 9304.846641] \_ [182] rpcbind
[ 9304.846642] \_ [193] ntpd
[ 9304.846644] \_ [174] automount
[ 9304.846645] \_ [195] rpc.gssd
[ 9304.846646] \_ [362] systemd
[ 9304.846648] \_ [364] (sd-pam)
[ 9304.846649] \_ [468] dbus-daemon
[ 9304.846651] \_ [467] dbus-launch
[ 9304.846652] \_ [476] polkitd
[ 9304.846654] \_ [484] xfconfd
[ 9304.846655] \_ [487] gpg-agent
[ 9304.846657] \_ [494] xfsettingsd
[ 9304.846659] \_ [503] xfce4-power-man
[ 9304.846660] \_ [525] upowerd
[ 9304.846661] \_ [548] VBoxClient
[ 9304.846662] \_ [561] VBoxClient
[ 9304.846663] \_ [578] VBoxClient
[ 9304.846664] \_ [593] VBoxClient
[ 9304.846665] \_ [632] rpc.statd
[ 9304.846667] \_ [1233] at-spi-bus-laun
[ 9304.846668] \_ [1220] firefox
[ 9304.846669] \_ [2] kthreadd
[ 9304.846670] \_ [3] ksoftirqd/0
[ 9304.846671] \_ [5] kworker/0:0H
[ 9304.846672] \_ [6] kworker/u2:0
[ 9304.846673] \_ [7] migration/0
[ 9304.846674] \_ [8] rcu_bh
[ 9304.846675] \_ [9] rcu_sched
[ 9304.846676] \_ [10] watchdog/0
[ 9304.846677] \_ [11] khelper
[ 9304.846679] \_ [12] kdevtmpfs
[ 9304.846680] \_ [13] netns
[ 9304.846681] \_ [14] writeback
[ 9304.846682] \_ [15] bioset
[ 9304.846683] \_ [16] kblockd
[ 9304.846684] \_ [18] khungtaskd
[ 9304.846685] \_ [19] kswapd0
[ 9304.846686] \_ [20] ksmd
[ 9304.846687] \_ [21] khugepaged
[ 9304.846688] \_ [22] fsnotify_mark
[ 9304.846689] \_ [23] crypto
[ 9304.846690] \_ [27] kthrotld
[ 9304.846691] \_ [28] deferwq
[ 9304.846693] \_ [52] ata_sff
[ 9304.846694] \_ [53] khubd
[ 9304.846695] \_ [54] scsi_eh_0
[ 9304.846696] \_ [55] scsi_eh_1
[ 9304.846697] \_ [56] kworker/u2:2
[ 9304.846698] \_ [57] scsi_eh_2
[ 9304.846699] \_ [64] kworker/0:1H
[ 9304.846700] \_ [72] jbd2/sda1-8
[ 9304.846701] \_ [73] ext4-dio-unwrit
[ 9304.846702] \_ [99] rpciod
[ 9304.846703] \_ [107] nfsiod
[ 9304.846704] \_ [112] iprt
[ 9304.846705] \_ [142] kpsmoused
[ 9304.846706] \_ [8798] lockd
[ 9304.846707] \_ [11501] kworker/0:1
[ 9304.846709] \_ [12139] kworker/0:2
[ 9304.846710] \_ [12956] kworker/0:3
The format doesn’t have to be exactly the same, but the output must clearly show the hierarchical tree.
To avoid printing too much whitespace to your kernel log buffer, you must set a
maximum indentation level of 20. This means that processes deeper down in the
tree will only receive 20 levels worth of indentation. Use the following formula
to calculate the indentation per level, where each indent is 4 spaces:
num_spaces = 4 * min(cur_level, 20)
.
You can test your module by running the following command: ps -e --forest
.
This will display a process tree which you can compare against your module’s
output. Note that ps
doesn’t show the swapper
task, so the output will look
slightly different. You can also start new processes and verify that your module
displays them.
Parts 2 and 4 adapted from the programming projects in chapters 2 and 3 of Operating Systems Concepts by Abraham Silberschatz, Peter B. Galvin, and Greg Gagne.
Last updated: 2023-02-03