COMS W4118 Operating Systems I

Interprocess communication in UNIX

Pipes

Unnamed Pipe

#include <unistd.h>

int pipe(int fd[2]);
    // Returns: 0 if OK, -1 on error

After calling pipe():

Figure 15.2, APUE

fd[0] is opened for reading, fd[1] is opened for writing

Recall blocking semantics of pipe – see L03-file-io or man 7 pipe.
Contents are stored in the kernel (not userspace) – special “pipe file”

After calling pipe() and then fork():

Figure 15.3, APUE

Recall from L03-file-io: when forking, all open file descriptors are “dup’d”. Child gets a reference to the read and write ends of the pipe.

Diagram is slightly misleading: pipes are only half-duplex (one-way communication). You can only do one of the following:

Parent writes to fd[1], child reads from fd[0]
Child writes to fd[1], parent reads from fd[0]

You can’t use pipe as a full-duplex (two-way communication) channel, e.g.:

If parent writes to fd[1], and then reads from fd[0] expecting to block until child writes something, it actually ends up just reading back what it just wrote.
Pipe is basically a single bounded buffer: readers block until there are contents to read, writers block until there is enough space to write. There is no notion of who read/wrote what.
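A minimal sketch of the read-back behavior described above, using a single process and no fork(). The helper name read_back_own_write() is made up for illustration; out must have room for at least one byte.

```c
#include <string.h>
#include <unistd.h>

// Writes msg into a fresh pipe, then reads from that same pipe in the same
// process. Returns the number of bytes read back, or -1 on error.
ssize_t read_back_own_write(const char *msg, char *out, size_t outsz)
{
    int fd[2];
    if (pipe(fd) < 0)
        return -1;

    // A short message fits in the empty pipe buffer, so this write() does
    // not block even though nobody is reading yet.
    if (write(fd[1], msg, strlen(msg)) < 0) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }

    // The bytes are already sitting in the buffer, so read() returns right
    // away with our own data instead of waiting for another process.
    ssize_t n = read(fd[0], out, outsz - 1);
    if (n >= 0)
        out[n] = '\0';

    close(fd[0]);
    close(fd[1]);
    return n;
}
```

Calling read_back_own_write("hello", buf, sizeof(buf)) from main() should hand "hello" straight back, which is why one pipe cannot serve as a two-way channel.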

Since pipe is only half-duplex, close() the unused ends of the pipe after forking, depending on who you want to read/write. E.g., where child reads and parent writes:

int fd[2];
pipe(fd);

if (fork() == 0) {
    close(fd[1]);  // close unused write end
    // ...
    read(fd[0], ...);
} else {
    close(fd[0]);  // close unused read end
    // ...
    write(fd[1], ...);
}

Note dependence on fork() for sharing ends of pipe via dup’d file descriptors. This form of IPC only works for related processes (e.g. parent-child)
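A filled-in sketch of the parent-writes/child-reads pattern, assuming a short message (under 256 bytes) so the child can report its byte count through its exit status. The helper name parent_writes_child_reads() is hypothetical. It also shows why closing unused ends matters: the child only sees EOF once every write end is closed.

```c
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>

// Parent writes msg into the pipe; child reads until EOF and reports the
// number of bytes it saw via its exit status.
// Returns the child's count, or -1 on error.
int parent_writes_child_reads(const char *msg)
{
    int fd[2];
    if (pipe(fd) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }

    if (pid == 0) {
        close(fd[1]);  // close unused write end, or read() never sees EOF
        char buf[64];
        ssize_t n;
        int total = 0;
        while ((n = read(fd[0], buf, sizeof(buf))) > 0)
            total += (int)n;
        _exit(total);
    }

    close(fd[0]);  // close unused read end
    ssize_t w = write(fd[1], msg, strlen(msg));
    close(fd[1]);  // once every write end is closed, child's read() returns 0

    int status;
    if (waitpid(pid, &status, 0) < 0 || w < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

If the child forgot close(fd[1]), it would hold a write end itself and its read() loop would block forever instead of returning 0.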

connect2 demo: how the shell stitches together two processes when you run a pipeline: p1 | p2.

Note usage of dup2(): allows you to target a specific newfd to copy oldfd into. If newfd is already open, dup2() close()s it and copies oldfd into it as a single atomic step.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>

int main(int argc, char **argv)
{
    int fd[2];
    pid_t pid1, pid2;

    // Split arguments ["cmd1", ..., "--", "cmd2", ...] into
    //                 ["cmd1", ...] and ["cmd2", ...]

    char **argv1 = argv + 1; // argv for the first command
    char **argv2;            // argv for the second command

    for (argv2 = argv1; *argv2; argv2++) {
        if (strcmp(*argv2, "--") == 0) {
            *argv2++ = NULL;
            break;
        }
    }
    if (*argv1 == NULL || *argv2 == NULL) {
        fprintf(stderr, "%s\n", "separate two commands with --");
        exit(1);
    }

    if (pipe(fd) < 0) {
        perror("pipe");
        exit(1);
    }

    if ((pid1 = fork()) == 0) {
        close(fd[0]);   // Close read end of pipe
        dup2(fd[1], 1); // Redirect stdout to write end of pipe
        close(fd[1]);   // stdout already writes to pipe, close spare fd
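        execvp(*argv1, argv1);
        // Only reached if execvp() failed; exit so this child
        // does not fall through and fork a second child itself
        perror(*argv1);
        _exit(1);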
    }

    if ((pid2 = fork()) == 0) {
        close(fd[1]);   // Close write end of pipe
        dup2(fd[0], 0); // Redirect stdin from read end of pipe
        close(fd[0]);   // stdin already reads from pipe, close spare fd
        execvp(*argv2, argv2);
        // Only reached if execvp() failed
        perror(*argv2);
        _exit(1);
    }

    // Parent does not need either end of the pipe
    close(fd[0]);
    close(fd[1]); 

    waitpid(pid1, NULL, 0);
    waitpid(pid2, NULL, 0);
    return 0;
}

Named pipe (FIFO)

#include <sys/stat.h>

int mkfifo(const char *path, mode_t mode);
    // Returns: 0 if OK, -1 on error

mkfifo(): create a new named pipe on the filesystem

Afterwards, use file I/O syscalls (open(), read(), write(), close()) to interact with the special pipe file. Shares semantics with unnamed pipe: still half-duplex.

Unlike unnamed pipe, FIFO can be used for IPC between unrelated processes because they can use the filesystem as a rendezvous point. Both processes just need to know the path to the FIFO.
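A rough sketch of the rendezvous-by-path idea. For brevity the two sides are created with fork(), but the writer finds the FIFO only through its path, as an unrelated process would. The helper name fifo_roundtrip() and the /tmp/fifo_demo.<pid> path are made up for illustration; out must have room for at least one byte.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/stat.h>
#include <sys/wait.h>

// Sends msg through a FIFO and stores what the reader received in out.
// Returns the number of bytes received, or -1 on error.
ssize_t fifo_roundtrip(const char *msg, char *out, size_t outsz)
{
    char path[64];  // hypothetical location, unique per process
    snprintf(path, sizeof(path), "/tmp/fifo_demo.%ld", (long)getpid());

    if (mkfifo(path, 0600) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        unlink(path);
        return -1;
    }

    if (pid == 0) {
        // Writer: opens the FIFO by path, not via an inherited descriptor.
        int wfd = open(path, O_WRONLY);  // blocks until a reader opens
        if (wfd < 0)
            _exit(1);
        ssize_t w = write(wfd, msg, strlen(msg));
        close(wfd);
        _exit(w < 0);
    }

    // Reader: read until EOF (writer closed its end) or until out is full.
    ssize_t total = -1;
    int rfd = open(path, O_RDONLY);  // blocks until the writer opens its end
    if (rfd >= 0) {
        size_t len = 0;
        ssize_t n;
        while (len < outsz - 1 &&
               (n = read(rfd, out + len, outsz - 1 - len)) > 0)
            len += (size_t)n;
        out[len] = '\0';
        total = (ssize_t)len;
        close(rfd);
    }

    waitpid(pid, NULL, 0);
    unlink(path);  // the FIFO stays on the filesystem until removed
    return total;
}
```

Unlike an unnamed pipe, the FIFO outlives the processes that used it, so the sketch unlink()s the path when it is done.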

Another example of IPC between unrelated processes: lock-file from L03-file-io

POSIX Semaphores

Semaphore Theory

Semaphore: fundamentally, just an integer value, manipulated mainly through two operations.

Increment: increase the integer

a.k.a. verhoog, V(), up, sem_post()

Decrement: wait until value > 0, then decrease the integer value

a.k.a. probeer, P(), down, sem_wait()
Note blocking semantics: unlike increment (which can run whenever), decrement blocks until value is positive.

Initial value affects semaphore semantics!

Binary semaphore (a.k.a. lock): initial value is 1. Protects one resource.

Before acquiring resource, run sem_wait(). Value decremented to 0.
Use resource
Run sem_post() to release the resource. Value incremented to 1.

Resource is limited to 1 user at a time. Concurrent access while resource is locked sees value of 0, blocks, and is woken up when value is incremented back to 1.

Counting semaphore: initial value is N > 1. Protects N resources.

Before acquiring resource, run sem_wait(). Value is decremented by 1.
Use resource
Run sem_post() to release the resource. Value incremented by 1.

Since there are N resources, concurrent access is only blocked once N users hold the semaphore (i.e. when the value hits 0). The (N + 1)th concurrent access sees a value of 0, blocks, and is woken up when the value becomes positive again (i.e. some user posted the semaphore).
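A small sketch of the counting behavior, assuming a single process and the non-blocking sem_trywait() (covered under the POSIX API below) so that the demo never actually blocks. The helper name count_free_slots() is hypothetical.

```c
#include <errno.h>
#include <semaphore.h>

// Initializes a semaphore to n, then grabs slots with sem_trywait() until
// it fails. Returns how many slots were acquired, or -1 on an unexpected
// error.
int count_free_slots(unsigned int n)
{
    sem_t sem;
    if (sem_init(&sem, /*pshared=*/0, /*value=*/n) < 0)
        return -1;

    int acquired = 0;
    while (sem_trywait(&sem) == 0)  // each success decrements the value
        acquired++;
    int saved_errno = errno;        // EAGAIN: value was 0, would have blocked

    // Release every slot we took so the value is back at n.
    for (int i = 0; i < acquired; i++)
        sem_post(&sem);

    sem_destroy(&sem);
    return saved_errno == EAGAIN ? acquired : -1;
}
```

With an initial value of 3, the first three attempts succeed and the fourth is the one that would block with sem_wait().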

Ordering semaphore: take advantage of blocking semantic to implement “events”. e.g.:

sem = 0  // initial value is 0

P1: 1 -> 2 -> sem_wait() -> 4 -> 5

P2: A -> B -> C -> D -> sem_post()

P1 completes tasks 1-2 then blocks until P2 completes tasks A-D before moving on to tasks 4 and 5. P1 has to wait until P2 increments the semaphore value.

POSIX API

Initializing and destroying unnamed POSIX semaphores:

#include <semaphore.h>

int sem_init(sem_t *sem, int pshared, unsigned int value);
        // Returns: 0 if OK, -1 on error

int sem_destroy(sem_t *sem);
        // Returns: 0 if OK, -1 on error

sem_t *sem: pointer to shared semaphore object. Declared by user and initialized/destroyed by API.

int pshared: If semaphore is meant to be shared between processes, pass in a non-zero value. Otherwise (e.g. shared only between threads of one process), pass in 0.

unsigned int value: Initial value for the semaphore

If an unnamed semaphore is to be shared by related processes, where should the semaphore be declared?

Recall unnamed pipe: parent opens, child obtains file descriptors after fork() because of dup()ing semantics.
Can process-shared unnamed semaphore simply be declared globally?
See discussion of “shared memory” under mmap() below.

Creating, opening, closing, and removing named POSIX semaphores:

#include <semaphore.h>

sem_t *sem_open(const char *name, int oflag, ...
                /* mode_t mode, unsigned int value  */ );
        // Returns: Pointer to semaphore if OK, SEM_FAILED on error

int sem_close(sem_t *sem);
        // Returns: 0 if OK, -1 on error

int sem_unlink(const char *name);
        // Returns: 0 if OK, -1 on error

Similar semantics to file API syscalls (recall L03-file-io).

Named semaphores are meant to be used by unrelated processes – use semaphore name as “rendezvous” point.

On Linux, named semaphores are stored in the filesystem under /dev/shm
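A sketch of the name-as-rendezvous idea. The two sides are created with fork() for brevity, but the child looks the semaphore up by name the way an unrelated process would. The helper name named_sem_rendezvous() and the /sem_demo.<pid> name are hypothetical; older glibc versions may need -pthread when linking.

```c
#include <fcntl.h>
#include <semaphore.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

// Creates a named semaphore with value 0, lets a child post it by name,
// and waits for that post. Returns 0 on success, -1 on error.
int named_sem_rendezvous(void)
{
    char name[64];  // hypothetical name, unique per process
    snprintf(name, sizeof(name), "/sem_demo.%ld", (long)getpid());

    // O_CREAT | O_EXCL: fail rather than reuse a stale semaphore.
    sem_t *sem = sem_open(name, O_CREAT | O_EXCL, 0600, 0);
    if (sem == SEM_FAILED)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        sem_close(sem);
        sem_unlink(name);
        return -1;
    }

    if (pid == 0) {
        // Peer: only needs to know the name to reach the same semaphore.
        sem_t *peer = sem_open(name, 0);
        if (peer == SEM_FAILED)
            _exit(1);
        sem_post(peer);
        sem_close(peer);
        _exit(0);
    }

    int rc = sem_wait(sem);  // blocks until the peer posts
    waitpid(pid, NULL, 0);

    sem_close(sem);     // drop this process's reference
    sem_unlink(name);   // remove the name, like unlink() on a file
    return rc;
}
```

As with files, sem_close() only releases this process's handle; the semaphore itself persists until sem_unlink() removes its name.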

Decrement the value of semaphores:

#include <semaphore.h>
#include <time.h>

int sem_trywait(sem_t *sem);
int sem_wait(sem_t *sem);
        // Both return: 0 if OK, -1 on error

int sem_timedwait(sem_t *restrict sem,
                    const struct timespec *restrict tsptr);
        // Returns: 0 if OK, -1 on error

Blocking semantics:

sem_trywait() does NOT block: if the semaphore value is 0, it returns -1 immediately with errno set to EAGAIN.
sem_wait() blocks until semaphore value is positive
- Returns -1 with errno set to EINTR if interrupted by a signal
sem_timedwait() blocks until it times out or semaphore value is positive, whichever happens first
- Timeout is an absolute time (not a duration); on timeout, returns -1 with errno set to ETIMEDOUT
- Can sem_timedwait() be safely implemented using SIGALRM? Recall L04-signals.

Increment the value of semaphores:

#include <semaphore.h>

int sem_post(sem_t *sem);
        // Returns: 0 if OK, -1 on error

Memory-mapped I/O

Mapping a file into memory

Using the file I/O syscalls can be annoying. Consider the example of opening a file with O_RDWR:

Editing/accessing different parts of the file: have to keep calling lseek()
Reading from the file requires read() to copy contents out of kernel to userspace buffer
Writing to the file requires write() to copy contents out of userspace buffer into kernel

Alternative: map region of file into your virtual address space! Figure 14.26, APUE

Memory-mapped region is backed by disk. That is, updates to the memory-mapped region go to memory first and are then (eventually) flushed to disk.

Furthermore, mappings can be private or shared.

Private mappings are copy-on-write: they start out with the contents of the file, but changes are not flushed to disk and are not seen by other processes that map the same region.
Shared mappings reference the same memory. Processes with shared mappings see each other’s updates

#include <sys/mman.h>

void *mmap(void *addr, size_t len, int prot, int flag, int fd, off_t off);
        // Returns: starting address of mapped region if OK, MAP_FAILED on error

Noteworthy parameters:

void *addr: Virtual address to place the mapping at. Prefer to pass NULL and let mmap() decide for you (address is the return value).
int prot: Protection of the mapped region (read, write, exec, none)
int flag: Visibility (shared/private) + other modifiers
int fd: file descriptor attached to file we want to map
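A minimal sketch of editing a file through a shared mapping instead of lseek()/write(). It assumes a throwaway temp file created with mkstemp(); the helper name edit_file_via_mmap() and the /tmp/mmap_demo.XXXXXX template are hypothetical, and out must hold at least 6 bytes.

```c
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/mman.h>

// Creates a temp file containing "hello", maps it MAP_SHARED, changes the
// first byte through memory, then re-reads the file with read().
// Stores the file contents in out. Returns 0 on success, -1 on error.
int edit_file_via_mmap(char *out)
{
    char path[] = "/tmp/mmap_demo.XXXXXX";  // hypothetical temp file
    int fd = mkstemp(path);                 // opened O_RDWR
    if (fd < 0)
        return -1;
    unlink(path);  // file stays alive until fd is closed

    int rc = -1;
    if (write(fd, "hello", 5) == 5) {
        char *p = mmap(/*addr=*/NULL, 5, PROT_READ | PROT_WRITE,
                       MAP_SHARED, fd, /*off=*/0);
        if (p != MAP_FAILED) {
            p[0] = 'H';              // plain store: no lseek(), no write()
            msync(p, 5, MS_SYNC);    // push the update to the file now
            munmap(p, 5);

            // Check the result through the ordinary file I/O path.
            if (lseek(fd, 0, SEEK_SET) == 0 && read(fd, out, 5) == 5) {
                out[5] = '\0';
                rc = 0;
            }
        }
    }
    close(fd);
    return rc;
}
```

Swapping MAP_SHARED for MAP_PRIVATE here would leave the file untouched, since private changes are never written back.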

Mapping a file with MAP_SHARED is a form of IPC for unrelated processes

again, rendezvous point is the file on the filesystem, like FIFO/named semaphore

Sometimes we want to map memory that is not backed by a file (kinda like malloc()):

specify fd = -1 and flag = MAP_ANON | ...

However, mapping visibility makes this more powerful than malloc()! Consider a process that creates an anonymous memory mapping and then fork()s. We know that child will inherit all of the parent’s memory mappings, but…

MAP_PRIVATE: child gets its own independent copy of the mapping (like malloc())
MAP_SHARED: child shares memory mapping with parent, both see each other’s updates

Mapping some anonymous memory with MAP_SHARED is a form of IPC for related processes

again, fork() facilitates the sharing, like for pipe()

Example: counter.c – note that unnamed semaphore is placed in shared memory so both parent and child have access to it.

#include <stdio.h>
#include <unistd.h>
#include <assert.h>
#include <semaphore.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>

#define LOOPS 2059

struct counter {
    sem_t sem;
    int cnt;
};

static struct counter *counter = NULL;

static void inc_loop(void) {
    for (int i = 0; i < LOOPS; i++) {
        sem_wait(&counter->sem);

        // Not an atomic operation, needs lock!
        // 1) Load counter->cnt into tmp
        // 2) Increment tmp
        // 3) Store tmp into counter->cnt
        counter->cnt++;

        sem_post(&counter->sem);
    }
}

int main(int argc, char **argv) {
    // Create a shared anonymous memory mapping, set global pointer to it
    counter = mmap(/*addr=*/NULL, sizeof(struct counter),
                   // Region is readable and writable
                   PROT_READ | PROT_WRITE,
                   // Want to share anonymous mapping with forked child
                   MAP_SHARED | MAP_ANON,
                   /*fd=*/-1,  // No associated file
                   /*offset=*/0);
    assert(counter != MAP_FAILED);

    // Mapping is already zero-initialized.
    assert(counter->cnt == 0);

    sem_init(&counter->sem, /*pshared=*/1, /*value=*/1);

    pid_t pid;
    if ((pid = fork()) == 0) {
        inc_loop();
        return 0;
    }

    inc_loop();
    waitpid(pid, NULL, 0);

    printf("Total count: %d, Expected: %d\n", counter->cnt, LOOPS * 2);

    sem_destroy(&counter->sem);
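    // Call munmap() outside assert() so it still runs when NDEBUG is defined
    if (munmap(counter, sizeof(struct counter)) != 0) {
        perror("munmap");
        return 1;
    }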
    return 0;
}

XSI IPC

Skim briefly just to appreciate how great POSIX IPC is :^)

They share a common naming and interface scheme:

XSI Message queues

  int msgget(key_t key, int flag);
  int msgctl(int msqid, int cmd, struct msqid_ds *buf);
  int msgsnd(int msqid, const void *ptr, size_t nbytes, int flag);
  ssize_t msgrcv(int msqid, void *ptr, size_t nbytes, long type, int flag);

XSI Semaphores

  int semget(key_t key, int nsems, int flag);
  int semctl(int semid, int semnum, int cmd, ... /* union semun arg */ );
  int semop(int semid, struct sembuf semoparray[], size_t nops);

XSI Shared memory

  int shmget(key_t key, size_t size, int flag);
  int shmctl(int shmid, int cmd, struct shmid_ds *buf);
  void *shmat(int shmid, const void *addr, int flag);
  int shmdt(const void *addr);

And they all suck…

Hard to clean up because there is no reference counting
- by contrast, pipes get automatically removed when the last process using them terminates
- likewise, data left in a FIFO is removed when the last process using it terminates
Hard to use
- complex and inelegant interfaces that don’t fit into UNIX file system paradigm
- stupid naming scheme: IPC identifiers, keys, and project IDs – are you serious?

They have been widely used for lack of alternatives. Fortunately we do have alternatives these days:

Instead of XSI message queues, use:
- UNIX domain sockets
- POSIX message queues (still not widely available, so not covered in APUE; see man 7 mq_overview)
Instead of XSI semaphores, use:
- POSIX semaphores
Instead of XSI shared memory, use:
- memory mapping using mmap()

Last updated: 2023-02-02