CS444 Lab 2

Hacking XV6

With the shell completed, your operating system is just about usable! Browsing through the source, you even notice it's quite robust—as every kernel should be—but also rather...spartan. While your shell is more-or-less fine, it's going to take some work before it can run your favorite programs—web browsers, email clients, games, and so forth. You can cheat on at least some of the interface, but you'll need to have at least a little more kernel-mode code to get everything going. The typical way to do that, of course, is the humble syscall...

A "system call", or syscall, is a software-generated request for the kernel's attention. In order to talk about how it works, and why it's implemented the way it is, a little history is in order...

Back in the ol' days of computing, computers were expensive, hulking machines, with a (human) supervisor who was tasked with keeping things running smoothly while various operators put in decks of punch cards or, heaven forbid, toggled in binary on the switches of a debug console. While that was sufficient for a little while, as computers became steadily more popular, powerful, and useful, it was soon realized that the (human) supervisor was the bottleneck, and that most of the logical duties of their job could be reduced to another program. These "supervisory programs" are the ancestors of today's kernels.

Better utilization of these expensive machines could be achieved if the computer knew to switch tasks, but until software supervisors were prevalent, it was common for operators (through trial and error) to have to arrange their programs' use of resources manually. If only it were possible for each of those operators to have their own computer, with its own resources! Hardware being prohibitively expensive at the time, whereas software was relatively cheap, engineers of various sorts came up with the concept of virtualization—where running programs would be isolated in an environment where they could pretend they had access to an entire machine, possibly even more "machine" than was actually there, but where all the hardware was emulated by the supervisory program.

While that was a major advance, there were still issues, particularly with devices. While devices could be "faked", many of them are unfortunately complex, and the "real" devices attached to the machine had no idea that they were actually dealing with multiple discrete, virtualized programs! In order to keep everything working smoothly, the supervisory programs also had to control access to resources, and, in doing so, the interface to devices had to be changed from the usual arbitrary, virtualized one to a consistent interface independent of what devices were actually there. As a result, programs gained portability (the ability to run on different systems) for free, requiring only that the local supervisor knew how to service the requests. The modern kernel and process emerge from these metaphors.

Since interrupts are the typical way that devices signal that they want attention from the supervisor, it was natural that user programs would use interrupts analogously, to signal that they want attention too. In this way, the supervisor becomes "merely" a program that deals with events caused by hardware and by the software it is emulating, while simultaneously arranging the resources necessary to keep up the charade. Four decades later, while computers have gotten faster and more complicated, the details of this process have barely changed.
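To make the mechanism concrete, here is a minimal user-space sketch of what a supervisor does once a program has trapped into it: read a call number, look up a handler in a table, and hand back a result. This is a model, not xv6's actual code. Every name prefixed with mock_ or MOCK_ is hypothetical, and the handlers return made-up values. The one detail borrowed from xv6 on x86 is the convention that the call number arrives in the saved eax register and the return value is written back to it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the saved user registers; only eax is modeled. */
struct mock_trapframe {
  int eax;
};

/* Hypothetical handlers returning fixed values, purely for illustration. */
static int mock_getpid(void) { return 42; }
static int mock_uptime(void) { return 1000; }

/* Hypothetical call numbers; slot 0 is deliberately left empty. */
enum { MOCK_SYS_getpid = 1, MOCK_SYS_uptime = 2 };

typedef int (*mock_syscall_fn)(void);

/* Dispatch table indexed by call number. */
static const mock_syscall_fn mock_syscalls[] = {
  [MOCK_SYS_getpid] = mock_getpid,
  [MOCK_SYS_uptime] = mock_uptime,
};

/* Look up the requested call and store its result (or -1) back in eax. */
void mock_syscall(struct mock_trapframe *tf)
{
  assert(tf != NULL);

  size_t ncalls = sizeof(mock_syscalls) / sizeof(mock_syscalls[0]);
  int num = tf->eax;

  if (num > 0 && (size_t)num < ncalls && mock_syscalls[num] != NULL)
    tf->eax = mock_syscalls[num]();
  else
    tf->eax = -1; /* unknown call number */
}
```

A caller would set tf.eax to MOCK_SYS_getpid, call mock_syscall(&tf), and then read the result out of tf.eax. Adding a new call in this model means writing one handler, picking one number, and adding one table entry.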

Material

XV6 is a pedagogical operating system descended from V6 UNIX, the subject of the classic Lions Book. However, XV6 has been updated for modern(ish) 32-bit, multicore x86 machines. It is relatively simple, and completely implemented in less than 10k lines of code, including comments and whitespace! Additionally, it comes with a book describing its design.

For this lab, you should read (if you haven't already) chapters 0 and 1 of the XV6 book.

Assignment

Due Thursday, 2017 February 16, at 5:00PM

Implement a getprocs syscall, and a getprocs userspace program which calls it and prints out the return value. The syscall should walk the process table and count the number of processes that are not in the UNUSED or ZOMBIE state.
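The counting rule itself can be tried out in ordinary user-space C before touching the kernel. The sketch below is a model only: struct mock_proc and count_procs are hypothetical names, and the table is a plain array passed in by the caller. The state names follow xv6's enum procstate. In the real kernel the table is shared between CPUs, so the walk would presumably need to hold the process table lock; that part is not modeled here.

```c
#include <assert.h>
#include <stddef.h>

/* Process states, named as in xv6's proc.h. */
enum procstate { UNUSED, EMBRYO, SLEEPING, RUNNABLE, RUNNING, ZOMBIE };

/* Hypothetical cut-down stand-in for xv6's struct proc. */
struct mock_proc {
  enum procstate state;
  int pid;
};

/* Count the table slots that are neither UNUSED nor ZOMBIE. */
int count_procs(const struct mock_proc *table, size_t n)
{
  assert(table != NULL || n == 0);

  int count = 0;
  for (size_t i = 0; i < n; i++) {
    if (table[i].state != UNUSED && table[i].state != ZOMBIE)
      count++;
  }
  return count;
}
```

For example, a five-slot table holding one RUNNING, one SLEEPING, one ZOMBIE, one UNUSED, and one EMBRYO entry yields a count of 3, since EMBRYO slots are in neither excluded state.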

Submit your work to your GitLab repository to hand it in. The repository should be named exactly xv6, all lowercase. You may work on the code itself anywhere—in the ITL, on Odin in your home directory, etc.—but only submissions to GitLab will be graded. A copy of the code you'll start with is already set up for your use in your ~/cs444-sp17/xv6/ directory. Please also demo your getprocs userspace program to our TAs; details on the test will be published here soon!

We'll be using the same xv6 repository for the rest of the labs, and building upon all the previous labs. So, be sure to get the work done!