COMS W4118 Operating Systems I

Linux File System Architecture

Virtual File System (VFS) Overview

[Figure: vfs-schematic]

Many file system types and device types can coexist on the same system. How can we unify them all under one uniform interface? Let’s consider three levels of the storage schematic presented above:

File System Interface

This is the API that userspace programs use to interact with files. Recall the file I/O syscalls from L03-file-io: open(), close(), read(), etc. A userspace program receives a file descriptor from the kernel and uses syscalls to operate on the file through that descriptor.

Prefer to give userspace programs a simple and unified interface instead of exposing implementation details! As such, delegate fs-specific details to the kernel.

Storage Level

We’ve learned about several file systems at this point: FFS, ext2, ext3, lfs, reiserfs, etc. We can format a storage device (HDD/SSD/etc.), or a single partition of one, with a file system.

There are even distributed file systems where data is not stored locally, but on some remote server!

However, we don’t want to burden userspace programs with fs-specific details. We need to hook into the kernel so that userspace programs can simply use the unified syscall API!

VFS Interface

An FS abstraction layer that transparently and uniformly supports multiple file systems. VFS specifies an interface that a given FS implements to hook into the kernel.

VFS carries out file I/O operations from userspace by dispatching each operation to the FS’s implementation of the VFS interface.

VFS Data Structures

Four high-level VFS data structures: struct file, struct dentry, struct inode, struct super_block.

struct file

Recall from L03-file-io/HW4: struct file represents an instance of an open file.

VFS interface: struct file_operations *f_op

struct dentry

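Basically a “hard link”: pairs the name of a link with the inode it refers to. (An on-disk directory entry stores the inode number; the in-memory struct dentry holds the name in d_name and a pointer to the struct inode in d_inode.)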

Break up an absolute path into dentries, one per component. For example, /home/hans/foo has four components: /, home, hans, and foo.

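Path resolution is expensive! To open /home/hans/foo, the kernel has to walk the path one component at a time, starting at / and reading each directory from disk to find the dentry and inode of the next component.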

Linux employs some cool caching to help improve performance… more on this later.

VFS interface: const struct dentry_operations *d_op

struct inode

Unique descriptor of a file or directory

i_ino: inode # unique per mounted filesystem
Can refer to fs-specific data via i_private (will be used for HW8)

VFS interface: const struct inode_operations *i_op

struct super_block

Descriptor of a mounted filesystem.

VFS interface: const struct super_operations *s_op

Dentry Cache

The Linux kernel makes path resolution efficient by employing a dentry cache (dcache).

dcache (simplified version from this online book)

  1. Mount an instance of ext2 at /home
    • s_root field of super_block refers to root dentry of the mount
  2. P1 opens "/home/hans/foo" for reading
    • Recall path resolution above: need to read several inodes/dentries from disk
    • Along the way, cache them in the dcache
  3. P2 opens "/home/hans/foo" for writing
    • Different struct file because P2 is an independent process
    • Same dentry and same inode
    • Before consulting directory data blocks for dentries, check if present in dcache! Speeds up path resolution!
  4. P3 opens "/home/hans/bar"
    • Different path than the one P1 and P2 opened
    • /home/hans/ path resolution cached in dcache
    • Need to read in hans/ directory data block to find dentry for bar
    • …only to find it refers to the same inode as foo
    • bar and foo are hard links to the same inode!

Last updated: 2023-04-18