CMSC 483/691c, Parallel Programming
Project 1: Neural Nets with Threads
(c) 2001, Howard E. Motteler
Assigned Thu 15 Feb; Due Thu 8 Mar; 100 pts
Project Goals
The purpose of this project is to get hands-on experience
programming matrix operations with threads and to provide an
introduction to neural networks and back-propagation training.
The Project
You are to write two programs using threads, one to do a three-layer
"forward" neural network calculation, and one to train a three-layer
network by back-propagation. The two programs may share procedures
and/or headers.
Forward Calculation
A three-layer forward calculation can be represented as
y = f(x) = W3*t(W2*t(W1*x + b1) + b2) + b3
where x is a column vector input, y a column vector output, W1, W2,
and W3 are matrices, b1, b2, and b3 are vectors, and t() represents
the application of the hyperbolic tangent function to each element
of a matrix or vector. The function f can be generalized to an
operation from matrices X to matrices Y, if we duplicate columns of
the bias vectors so that the additions conform.
Your forward program will read the X matrix, the W matrices and b
vectors, and some stats on X and Y used to normalize the data, and
will calculate and then write the Y matrix. All the matrices and
vectors are saved as simple ASCII-format data, in column order. Sample C procedures are provided to read and write
such data.
Your program to do forward calculations should be called "nnfwd",
and should
- read filenames for the input and output matrices (X and Y in
the formula above) and the number of threads from the command
line; if no number of threads is given, it should default to
the actual number of processors,
- read the input matrix (X in the formula, above),
- read the matrices "W1", "W2", and "W3", vectors "B1", "B2", and
"B3", and the training data statistics vectors "Xmean", "Xstd",
"Ymean", and "Ystd",
- rescale the input data: subtract the mean, and divide by the
standard deviation,
- do the three-layer forward calculation, as described above,
- rescale the output data: multiply by the standard deviation and
add the mean, and
- write out the result matrix (Y in the formula above).
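A natural way to thread the forward calculation is to give each thread a contiguous block of columns of X, since the columns are independent. The sketch below shows the partitioning and thread bookkeeping with POSIX threads; the per-column work is a stand-in (here just doubling a value), and the names job_t, worker, and run_threads are illustrative:

```c
#include <pthread.h>

/* One thread's share of the work: a half-open range of columns.
   In nnfwd, the struct would also carry pointers to the weight
   matrices and bias vectors. */
typedef struct {
    int lo, hi;              /* columns [lo, hi) belong to this thread */
    const double *x;         /* input columns (one row here, for brevity) */
    double *y;               /* output columns */
} job_t;

static void *worker(void *arg)
{
    job_t *j = arg;
    for (int c = j->lo; c < j->hi; c++)
        j->y[c] = 2.0 * j->x[c];   /* stand-in for the per-column forward calc */
    return NULL;
}

/* Split ncols columns as evenly as possible among nthreads threads. */
void run_threads(int nthreads, int ncols, const double *x, double *y)
{
    pthread_t tid[nthreads];
    job_t job[nthreads];
    for (int t = 0; t < nthreads; t++) {
        job[t].lo = t * ncols / nthreads;
        job[t].hi = (t + 1) * ncols / nthreads;
        job[t].x = x;
        job[t].y = y;
        pthread_create(&tid[t], NULL, worker, &job[t]);
    }
    for (int t = 0; t < nthreads; t++)
        pthread_join(tid[t], NULL);
}
```

Because each thread writes only its own columns of Y, no locking is needed in the forward calculation.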
A typical invocation would be
nnfwd X1 Y1 4
where "X1" and "Y1" are matrix file names, and Y1 is to be calculated
using 4 threads.
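For defaulting to the actual number of processors when no thread count is given, one option is the POSIX sysconf call; the helper name pick_nthreads and the argument position are illustrative:

```c
#include <stdlib.h>
#include <unistd.h>

/* Return the thread count from argv[arg_index] if present, otherwise
   default to the number of online processors (POSIX sysconf). */
int pick_nthreads(int argc, char **argv, int arg_index)
{
    if (argc > arg_index)
        return atoi(argv[arg_index]);
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return n > 0 ? (int)n : 1;   /* fall back to 1 if sysconf fails */
}
```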
Training
The weights W and bias vectors B are chosen so that a forward
calculation gives a reasonable fit to some particular set of
training data. Let X and Y be matrices, and suppose that each column
of X represents an input and each column of Y a desired output of our
network. We want to find weight matrices and bias vectors such that
the output of the forward calculation F(X) is as close to the
supplied data Y as possible. This is done by "training" the
network.
If e is an error term, for example, e = |F(X) - Y|, then we want to
minimize e.
This is done by finding de/dw_ij for each weight
w_ij, adjusting each weight by a small
increment along the gradient, and checking the error with the
new weights. Details of this adjustment may vary; a sample training program is provided in
Matlab that uses both "momentum" and an "adaptive learning rate".
You do not have to use the training algorithm used in the demo
program nntrain.m; improvements are welcome.
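The per-weight update with momentum can be sketched as follows; this is only the gradient step itself (the adaptive learning rate of nntrain.m is omitted), and the name update_weights and the lr/mu parameters are illustrative:

```c
/* One gradient-descent step with momentum over n weights.
   grad[i] holds de/dw for weight i, dw[i] carries the previous
   step (the momentum term), lr is the learning rate, and mu the
   momentum coefficient. */
void update_weights(int n, double *w, const double *grad,
                    double *dw, double lr, double mu)
{
    for (int i = 0; i < n; i++) {
        dw[i] = mu * dw[i] - lr * grad[i];  /* momentum-smoothed step */
        w[i] += dw[i];                      /* move down the gradient */
    }
}
```

The same update applies to the bias vectors; in a threaded version, each thread can own a disjoint slice of the weights.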
Your training program should be called "nntrain", and should
- read a limit on the number of "epochs" (training iterations)
and the number of threads from the command line; if the number
of threads is not given, it should default to the actual number
of processors,
- read the training data arrays "X" and "Y", the stats on the
training data, "Xmean", "Xstd", "Ymean", "Ystd", the weight
matrices "W1", "W2", and "W3", and the bias vectors "B1",
"B2", and "B3",
- rescale the training data: subtract the mean, and divide by the
standard deviation, for both X and Y.
- do back-propagation training, for at most the requested number
of epochs, to improve the weight matrices and bias vectors,
- write out the new weight matrices "W1", "W2", and "W3" and bias
vectors "B1", "B2", and "B3".
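The rescaling step used by both programs (subtract the mean, divide by the standard deviation, per row) can be written as one small procedure over a column-order array; the name normalize is an illustrative assumption:

```c
/* Rescale a rows-by-cols matrix X in place, column by column:
   X[i][j] = (X[i][j] - mean[i]) / std[i], where mean and std have
   one entry per row, and X is stored in column order. */
void normalize(int rows, int cols, double *X,
               const double *mean, const double *std)
{
    for (int j = 0; j < cols; j++)
        for (int i = 0; i < rows; i++)
            X[i + j * rows] = (X[i + j * rows] - mean[i]) / std[i];
}
```

The inverse (multiply by std, add mean) undoes this for the nnfwd output.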
A typical invocation would be
nntrain 100 4
where the training should be done for at most 100 epochs, using 4
threads.
Getting Started
Since this is a course in parallel processing rather than neural
networks, demo Matlab procedures are provided
for both the forward calculation and for back-propagation training;
nnfwd.m is a Matlab implementation of nnfwd, and nntrain.m a Matlab
implementation of nntrain.
The Matlab procedures are
nnfwd.m -- demo program to do a forward calculation
nntrain.m -- demo program to train a network
nninit.m -- initial values for weight and bias matrices
nnparams.m -- prompt user for various net test parameters
nntest.m -- demo net test program
The program nntest.m prompts the user for network parameters,
generates some sample test data, calls nninit.m to generate initial
values for the weight matrices and bias and to calculate the stats
vectors, calls nntrain to train up the net, and nnfwd to check the
accuracy of the trained net.
Note that the project is to implement nnfwd and nntrain with
threads; you can use the provided nninit.m and nnparams.m to set
things up, and you can use nntest.m as a prototype for testing your
own networks.
Test Data
The Matlab procedures nninit.m to initialize
weight matrices and bias vectors and nntest.m to generate test data
are provided. Relatively small matrices--say, 4 or 5 element
inputs, outputs, and hidden layers, and maybe 100 epochs of
training--are fine for doing debugging and testing. The default
values are for a larger test, and are useful for benchmarks against
the Matlab code.
Timing
Both the training and forward calculation programs should report
"wall" runtime (not CPU usage) to the nearest second, with the Unix
time() system call. Start counting time after all the matrices are
read, and before any calculation is done, and stop counting time
after the calculations are done, and before any data is written.
Your time message should be printed to stderr.
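The timing described above can be sketched with a pair of small helpers around the time() call; the names timer_start and timer_report are illustrative:

```c
#include <stdio.h>
#include <time.h>

static time_t t_start;

/* Call after all matrices are read, before any calculation. */
void timer_start(void)
{
    t_start = time(NULL);
}

/* Call after the calculations, before any data is written;
   prints wall-clock seconds to stderr and returns the count. */
long timer_report(void)
{
    long elapsed = (long)(time(NULL) - t_start);
    fprintf(stderr, "wall time: %ld s\n", elapsed);
    return elapsed;
}
```

Note that time() has one-second resolution, which the assignment's "nearest second" requirement allows.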
Note that the forward calculation is very fast, in comparison to
typical training times; when I test your nnfwd I will use a large
enough input matrix, on the order of 100 x 1000, to get a
significant measurement.
Your programs should be at least as fast as the Matlab demo
programs when run on a single processor, and significantly faster
when run on multiple processors.
Submitting Your Project
Make a tar file containing the project files, that is, your *.c,
*.h, and Makefile, and submit this as an email attachment, with
subject "project 1". Make sure that your name and "project 1" are
at the top of your main files.
Grading
Make sure you read the page concerning
general information on programming projects. About 60% of your
grade is based on how well your code works, with the remainder of
the points divided between design and documentation. Projects are
due by midnight of the assigned date; there is a 5% bonus for each
day a project is turned in early (for up to two days early), and a
5% penalty for each day the project is turned in late.