Linux Kernel Modules

Kernel Modules Primer

Operating systems implement a lot of functionality that must execute in kernel mode: device drivers, system services, file systems, etc. However, not every system uses every one of these subsystems, so for most people, including them all in the kernel image would only add bloat to the OS distribution. Instead, they are left out of the base kernel image, and each user may opt to link in only the code they need.

To avoid having to recompile the kernel each time to link in a new piece of code, Linux exposes a kernel module interface. This feature allows users to dynamically link object files into a running kernel, as well as remove loaded modules. However, for the kernel to make sense of otherwise arbitrary code, kernel modules must strictly abide by Linux’s module protocol, which defines entry points, exit points, and various pieces of metadata for module bookkeeping.

In this guide, we introduce how to create a simple kernel module, as well as ways to interact with it. Note that the module code will run in kernel mode, meaning it has privileges that allow it to corrupt a running Linux kernel. If your code crashes, it may bring down the entire operating system! Since you are working within a virtual machine, you can recover from a module failure by simply rebooting your system. But do carefully consider the code you are about to load, and make sure to save any work you may have open in your VM.

Creating a Module

A basic kernel module might look something like this:

#include <linux/module.h>
#include <linux/printk.h>

/* This function is called when the module is loaded. */
static int hello(void)
{
        printk(KERN_INFO "Loading module... Hello World!\n");

        return 0;
}

/* This function is called when the module is removed. */
static void goodbye(void)
{
        printk(KERN_INFO "Removing module... Goodbye World!\n");
}

/* Macros for registering module entry and exit points */
module_init(hello);
module_exit(goodbye);

/* Macros for declaring module metadata */
MODULE_DESCRIPTION("A basic Hello World module");
MODULE_AUTHOR("cs4118");
MODULE_LICENSE("GPL");

This module does nothing but print a “Hello World!” message when it is loaded, and a “Goodbye World!” message when it is removed, by respectively calling the hello() and goodbye() functions.

The kernel knows to call these functions because we declared them as the entry and exit points using the module_init and module_exit macros. These functions may be named anything as long as their names are given to these macros. Neither the entry point nor the exit point takes any arguments. The entry point must return an error code, with 0 representing success and a negative errno value (such as -ENOMEM) representing failure; the exit point does not return anything.
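As a minimal sketch of an entry point that can fail, the module below allocates a small buffer when it is loaded and returns -ENOMEM if the allocation fails. The names demo_buf, demo_init, and demo_exit and the 64-byte size are hypothetical, chosen only for illustration; kmalloc() and kfree() come from <linux/slab.h>, which the hello example does not need.

```c
#include <linux/errno.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/printk.h>
#include <linux/slab.h>

/* Hypothetical buffer owned by the module for its whole lifetime. */
static char *demo_buf;

/* Entry point: return 0 on success or a negative errno on failure. */
static int __init demo_init(void)
{
        demo_buf = kmalloc(64, GFP_KERNEL);
        if (!demo_buf)
                return -ENOMEM; /* loading is aborted */

        printk(KERN_INFO "demo: buffer allocated\n");
        return 0;
}

/* Exit point: only runs if the entry point returned 0. */
static void __exit demo_exit(void)
{
        kfree(demo_buf);
        printk(KERN_INFO "demo: buffer freed\n");
}

module_init(demo_init);
module_exit(demo_exit);

MODULE_DESCRIPTION("Sketch of an entry point that reports failure");
MODULE_LICENSE("GPL");
```

When the entry point returns a nonzero value, insmod reports the failure, the module is not loaded, and the exit point is never called. An entry point that fails partway through must therefore undo its own partial setup before returning.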

We declare some module metadata using the MODULE_DESCRIPTION, MODULE_AUTHOR, and MODULE_LICENSE macros. The description and author aren’t strictly necessary, but just like a README, it is good practice to include them. MODULE_LICENSE matters more: loading a module that does not declare a GPL-compatible license taints the kernel, and such a module cannot use kernel symbols that are exported only to GPL code. The MODULE_DESCRIPTION can be a short synopsis of what your module is trying to accomplish. For the purposes of your assignments, MODULE_AUTHOR should always be your UNI for individual assignments, or the team number and UNIs of each team member for team assignments. You may leave MODULE_LICENSE set to GPL to keep Richard Stallman happy.

Note that we #include <linux/module.h> at the top of this small program. This header defines the module macros that we used. In a more complex module, you may need to include more kernel header files.

Building a Module

Now that we have written our module, we must still compile it before it can be loaded into a running Linux system. We shall name our example module hello, so we save its source code in a file named hello.c.

Linux kernel modules are built using the GNU Make build system. More precisely, they are built using the kernel’s own build system (Kbuild), a set of Makefiles that ships with the kernel source and headers. We may hook into these Makefiles from a Makefile of our own:

obj-m += hello.o
all:
	$(MAKE) -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules
clean:
	$(MAKE) -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean

Make requires each indented command line to begin with a tab character rather than spaces. The Makefile should be located in the same directory as hello.c, your build directory. That is:

./
|_ hello.c
|_ Makefile

It isn’t too important to understand the details of exactly how this Makefile works, as long as it does. In your build directory, you may use make all (or just make) to build your module, and use make clean to clean your build directory of any build artifacts. You should be able to build without root privileges:

$ make

Doing so will produce several files – the kernel module object that we are interested in will be named hello.ko.

Note that the obj-m variable is used to tell the Linux module build system which module to build, so if you are building a module named foo (with its source code in foo.c), you will need to change the first line of your Makefile to:

obj-m += foo.o

Loading and Removing Modules

To insert your kernel module, hello.ko, we run the following command, with root privileges:

# insmod hello.ko

At this point, the function we passed to module_init (hello in the above example) will be executed.

We can check that our module is present by running the following command:

$ lsmod

This will list all modules currently loaded into your kernel, which may include modules that provide other services, depending on your system setup. At this point, our hello module should be present.

To remove our running module, hello, we run the following command, with root privileges:

# rmmod hello

Note that we do not need to include the .ko file extension here. Now, running lsmod should no longer show the hello module.

Reading the Kernel Log Buffer

In our above example, the hello() entry point invokes printk(). This is the Linux kernel equivalent of printf(): it is called with the format string as the first argument, followed by a variable number of arguments used by the format string. printk() supports most of the formatting directives of printf(), e.g. %d, %s, %x, etc., but not floating-point conversions such as %f, since kernel code generally avoids floating-point arithmetic. It is defined in linux/printk.h, which our example includes directly.
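As a small illustration of format arguments, the entry point below prints a string and an integer. The module name fmt_demo and its variables are hypothetical; only printk() and KERN_INFO are taken from the example above.

```c
#include <linux/init.h>
#include <linux/module.h>
#include <linux/printk.h>

/* Entry point: format a string and an integer into one log line. */
static int __init fmt_demo_init(void)
{
        const char *name = "fmt_demo"; /* hypothetical label */
        int answer = 42;               /* hypothetical value */

        /* %s and %d behave as they do in printf(). */
        printk(KERN_INFO "%s: the answer is %d\n", name, answer);

        return 0;
}

/* Exit point: nothing to clean up, but needed so the module can be removed. */
static void __exit fmt_demo_exit(void)
{
        printk(KERN_INFO "fmt_demo: unloaded\n");
}

module_init(fmt_demo_init);
module_exit(fmt_demo_exit);

MODULE_DESCRIPTION("Sketch of printk() format arguments");
MODULE_LICENSE("GPL");
```

After loading this module, the kernel log should contain a line ending in fmt_demo: the answer is 42.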

printk() cannot output to stdout, since kernel code does not run as a user process and has no standard streams. Instead, its output goes to the kernel log buffer. You can read this log buffer using the following command:

# dmesg

You will likely find a number of messages from other system services as well – these all share the same log buffer – ordered from the earliest message to the latest. If you just loaded the above example module, you should find the message Loading module... Hello World! at the bottom; if you also removed it, you should find Removing module... Goodbye World! there too.

You may also find it helpful to keep dmesg open, and see kernel output as it is being produced by printk() (similar to the behavior of tail -f). You may do this by running dmesg with the -w flag.

Since the kernel log buffer is shared amongst many services, it is often full of verbose, noisy messages. This can get rather unwieldy, so to print the log buffer one last time and then clear it, we can use the following command (use -C instead to clear it without printing):

# dmesg -c

To help you sift through the verbosity, printk() also supports logging priorities. These are specified by prefixing the format string with a macro such as KERN_INFO, with no comma in between; these macros are made available by <linux/printk.h>. You may filter the output of dmesg by priority using the -l flag, e.g. dmesg -l err,warn.
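It may help to see how a priority macro ends up inside the message. In the kernel headers, KERN_INFO expands to a short string literal (a start-of-header byte followed by the digit 6), which the compiler concatenates with the format string that follows it. The userspace sketch below mirrors that encoding; the DEMO_* macros and the demo_level() and demo_text() helpers are hypothetical stand-ins written for this guide, not kernel code.

```c
/* Hypothetical stand-ins mirroring the kernel's KERN_* definitions. */
#define DEMO_SOH     "\001"        /* ASCII start-of-header byte */
#define DEMO_ERR     DEMO_SOH "3"  /* error conditions */
#define DEMO_WARNING DEMO_SOH "4"  /* warning conditions */
#define DEMO_INFO    DEMO_SOH "6"  /* informational messages */

/*
 * Return the numeric priority encoded at the start of msg,
 * or -1 if msg carries no priority prefix.
 */
int demo_level(const char *msg)
{
        if (msg[0] == '\001' && msg[1] >= '0' && msg[1] <= '7')
                return msg[1] - '0';
        return -1;
}

/* Return the message text that follows the priority prefix, if any. */
const char *demo_text(const char *msg)
{
        return demo_level(msg) >= 0 ? msg + 2 : msg;
}
```

Here demo_level(DEMO_INFO "Hello World!\n") evaluates to 6, the number behind the info level that dmesg -l filters on, while a string with no prefix yields -1.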

For more detailed usage of dmesg, please check its man pages.


Acknowledgements

Adapted from Chapter 2 of Operating System Concepts by Abraham Silberschatz, Peter B. Galvin, and Greg Gagne; Programming Projects; Linux Kernel Modules.