Project 1: The Anthropic Processing Engine

(c) 1995, Howard E. Motteler

Assigned Tue Sept 19; Due Tue Oct 10; 100 pts


Project Goals

The purpose of this project is to learn more about Unix processes, and to get hands-on experience with inter-process communication, data-flow, semaphores, and shared memory.

A Semi-True Story

It is a little-known fact that Charles Babbage, the inventor of the famous Difference and Analytical Engines, had a brother P.D.Q. Babbage, who came up with his own scheme for performing complex calculations--the `Anthropic Processing Engine.'

At first glance the Anthropic Processing Engine seemed to be little more than a typical office environment of the day. It consisted of a room with a number of desks and a row of baskets, labeled b1, b2, b3, etc., along one wall. The distinguishing feature of the APE was carefully defined rules of operation. A worker with a fixed task sat at each desk. A typical worker's job might be to get up, take a slip of paper from the bottom of basket 4 and another from the bottom of basket 7, return to his desk and add the numbers, write down the result on a new slip of paper, and place this result on top, in basket 5. The task was performed repeatedly, as long as there was something in both baskets 4 and 7. If there was no paper in baskets 4 or 7, the worker would simply wait.

PDQ's real contribution to computer science, where he perhaps surpassed even his brother Charles, if not Charles' friend Ada, was in the creation of a simple, elegant programming language for the APE. PDQ realized that the instructions for the addition worker described above could be specified with the expression

                        b5 = b4 + b7 .
More generally, he saw that any binary operation "op" performed on the contents of baskets i and j, and placed in basket k, could be expressed as

                        bk = bi op bj .
In the simplest version of PDQ's scheme, there was one such instruction for each worker, which was performed repeatedly. There was also a restriction that no two workers would take things from the same basket, and that no two workers would put things into the same basket. Certain workers had the job of taking things from PDQ and placing them in the appropriate baskets, while others had the job of taking things from certain baskets and giving them to PDQ. In modern terms we would think of these workers as providing the inputs to and outputs from the system.

A typical run of the APE would proceed as follows. First, a list of instructions was made up; then a big enough room, with sufficient desks and baskets was found, and the instructions were handed out to the workers. PDQ then gave the `input' workers some data, and sat back and waited for the results from the `output' workers; upon receiving this, everyone was sent home.

Alas, the concept of testing, not to mention "alpha" and "beta" releases, had not yet been developed, and PDQ's first major program, intended to calculate potato futures, contained some serious bugs; also, some of his workers were not entirely reliable. During a demonstration for the Royal Academy some of the workers, faced with empty baskets, fell asleep; others became restless and began writing whatever they pleased and placing the results in random baskets; a few pelted their fellows with spit-wads made from the paper slips. PDQ's sponsors were unimpressed by his attempt at office automation, and he disappeared into historical obscurity, grinding gears for his more famous brother.

The Project

PDQ was actually onto something, and his reputation is sorely in need of rehabilitation. To help with this, and to pass the class, you are to write a program "wp" (for worker process) that acts as one of PDQ's workers, communicating via queues that implement the baskets. With a bit of care, your worker processes will be more reliable that PDQ's, and will never pelt their fellows with unexpected surprises.

The baskets, or queues, are implemented with shared memory and semaphores, and are discussed below. To test your workers and to get the computation started, a special program "pdq" is provided. This program writes to queues 1, 2, and 3, the input baskets, and reads from queue 0, the output basket.

Your worker program should be able to add, subtract, multiply, and divide the values from two queues, and write the results to a third queue. In addition, the worker program should be able to copy an input to two outputs, and merge two inputs into one output. Worker programs are given commands in PDQ's language, with one command per worker. A typical worker invocation would be

                     wp b5 = b4 + b7 .
Given this command, your worker program "wp" should parse the string "b5 = b4 + b7", read values from queues 4 and 7, add the values, and place the result in queue 5, repeatedly. The full set of commands that your program "wp" should be able to parse and perform are

                        bi = bj + bk
                        bi = bj - bk
                        bi = bj * bk
                        bi = bj / bk
                        bi = bj = bk
                        bi = bj M bk
where i, j, and k are queue numbers, in the range 0 to NBUFS-1. You can assume that all input is well-formed, and each of the tokens {+, -, *, /, =, M} are separated by one or more spaces. Tokens appear as `argv' strings, making parsing the input very easy.

The command bi = bj = bk is a `copy' command, and has the effect of taking a value from queue bk and copying it to queues bi and bj.

The command bi = bj M bk is a `merge' command, and works slightly differently from the others. For the merge, the worker simply takes a value from either of the specified input queues, whenever a value is present, and places it in the specified output queue. More precisely, if bj has a value and bk is empty, then merge moves the value from bj to bi. If bk has a value and bj is empty, then merge moves the value from bk to bi. If both bj and bk are non-empty, then merge moves one of the two values (you can choose which) to bi. Your merge should be `fair', in that if there are values in both queues bj and bk, then neither will be bypassed more than once.

Your wp program should continue to run, either performing its specified command or waiting for input, until a special END value is received. (END is defined in pdq.h.) When an END is received on either input of an arithmetic command {+, -, *, /}, the process first sends an END to its output queue, and then terminates.

The copy command should work in a similar way, except that it sends an END to both its output queues. The merge is different, in that it should stop receiving values on a queue when an END is received on that queue, but it does not send an END and terminate until it has seen an END on both its inputs.

Queues

You are to implement the queues used by the wp and pdq programs. Queues are bounded buffers, kept in shared memory, with semaphores used for mutual exclusion. The file bufskel.c contains a partial implementation of NBUFS bounded queues, numbered 0 to NBUFS-1, and the file pdq.h defined a structure `bufrec' used by the routines in buf.c. The implementation in bufskel.c works correctly, except that there is no mutual exclusion. Part of the project is to add mutual exclusion to these routines.

Semaphores and Shared Memory

The file sem.c contains routines that are linked with your program to provide NSEMS semaphores and a block of shared memory. The signal and wait procedures defined in sem.c are called ssig() and swait(). Semaphore `names' are simply integers in the range 0 to NSEMS-1. The routine sbase() returns a pointer to the start of the shared memory block. See the file sem.c for further information.

The pdq Program

The pdq program is used to test your wp program. The pdq program writes various test sequences to queues 1, 2, and 3, and reads values from queue 0. Pdq forks into a sending and a receiving process. The sender prints values as they are sent, and the receiver prints values as they are received.

The pdq program performs several sorts of tests. Briefly, these include a test where values are sent out in repeated triples b1-b2-b3, a test where a sequence of values are first sent to b1, then to b2, and finally to b3, and a test where the next channel for transmission is selected at random. For further information on the tests, see pdq.c.

Initialization

The pdq program initializes semaphores and shared memory. Pdq sets semaphores 0 through NSEM-2 to the value 1, and writes zeros to the shared memory, so that all queues start out initially as empty. A special semaphore `READY' is defined as semaphore number NSEMS-1. The pdq program sends NBUFS-3 signals (the maximum possible number of workers) on this READY semaphore, after all initialization is complete. Your wp program should wait on READY after calling sinit(), before performing any operations with buffers or semaphores.

An Example Run

Here is an example from a simple test, where the wp processes are used to add up the number in queues 1-3 and sending the result to queue 0.

../proj1 > wp b0 = b3 + b4 &
../proj1 > wp b4 = b1 + b2 &
../proj1 > pdq 
PDQ: sent 1  1  1
PDQ: sent 2  2  2
PDQ: sent 3  3  3
PDQ: sent 4  4  4
PDQ: received 3
PDQ: received 6
PDQ: received 9
PDQ: received 12
PDQ: received END
PDQ: received 4 values
PDQ: Deleting semaphore block . . .
In this example all of the sends occurred before any of the receives. With longer tests, it is not unusual to see sends and receives mixed together.

Test Data

Some example test scripts will be provided. Your project will be graded using scripts that are similar but may not be exactly the same as the sample test data.

Submitting Programs

You are to submit two source files: your program wp.c, and your improved implementation of bounded buffers, buf.c. These programs should be submitted with the UCS submit program. For section 101, the command to submit both your "wp" and "buf" routines in one step is

         submit cs421_101 proj1 wp.c buf.c
Submit has a number of other options; see the man page for details. The submit program copies your program into a protected directory in my account, which has the same name as your user id. Do not submit from more than one account, or by email.

Grading

You should read the general information on programming projects; as mentioned there, your name, SSN, and "project 1" should appear at the top of your project. Approximately 20% of your grade is for documentation, including a brief (around 1/2 page) description of how your program wp works. This should be in comments at the beginning of the program. Another 30% percent is for design, and the remainder is based on how well things work. You can get partial credit even if you can't run all of the test data successfully.

Projects that do not compile get 0 points. You program should work with the routines pdq.c, pdq.h, sem.c, and the makefile as provided; when I compile and test you program for grading, I will be using the original versions. The makefile will compile your wp program, as well as pdq and the example programs. If you want to use C++, or have some other valid reason for modifying the makefile, check with me first, and submit your makefile in addition to your other files.

You must do the project described here. Doing some other similar or dissimilar project, no matter how difficult or clever, may be worth 0 points. This is not a group project. Please do your own work, and be careful about sharing your code.

Getting Started

A tar file is available, containing files for this project. Rename the `skeleton files' wpskel.c to wp.c, and bufskel.c to buf.c; these give you an outline to to get you started. The procedures in bufskel.c are complete except for mutual exclusion, while there is very little in wpskel.c except for include files and semaphore initialization.

You should read the comments in sem.c and pdq.c. You may want to play with the signal and wait examples in sig.c, wai.c, and show.c. The program show.c can be useful for debugging code that uses semaphores.

Relevant Files

The following files are in the tar file mentioned above.
  pdq.c       tests a wp program; includes documentation

  wpskel.c    a `skeletal' version of your program wp.c
              to help get you started

  bufskel.c   a partially correct implementation of
              bounded buffers procedures for buf.c

  pdq.h       definitions for wp.c, buf.c and pdq.c

  sem.c       semaphore package; includes documentation

  sig.c       example using ssig()
  wai.c       example using swai()
  sclear.c    release semaphores
  show.c      display semaphores and shared memory

  makefile    compile and link everything

  reset       shell script to release semaphores and to kill
              off any lingering wp processes

  test1-4     sample test scripts