Big FFT with Pthreads

CMSC 483/691P, Parallel Programming

Project 3: Big FFT with Pthreads

Assigned Monday, May 1
Due Monday, May 22
80 points

Project Goals

The purpose of this project is to gain experience with Pthreads and the Fast Fourier Transform.

Project Summary

You are to write a program using Pthreads that (1) reads a file containing an N by K array of single precision complex numbers, (2) does a parallel FFT on each column, and (3) saves the results in another file.

Specifications

The file format consists of two 32-bit integers giving the values of N and K, followed by an N by K array of complex numbers. The complex number are represented as pairs of 32-bit floats and the array is organized in column-order as follows:

                    vectors
                1   2   3  ...  K 
              +----------- ... ---+
     x1 real  |                   |
     x1 imag  |                   |
     x2 real  |                   |
     x2 imag  |                   |
         :    :                   :
     xN real  |                   |
     xN imag  |                   |
              +----------- ... ---+

You can assume that N is a power of two, and that the number of threads you use will divide N evenly. Your program should accept the following command line parameters:

  -i fin   specify an input file name, default fft_in.dat
  -o fout  specify an output file name, default fft_out.dat
  -p nn    run with nn threads, default 4

When K is greater than 1 you have more than one FFT to do, but do not parallelize by doing multiple FFTs (columns from the input matrix) at the same time. You should be able to get a speedup close to the number of processors on a single large transform, for example when N is 2^24 and K is 1.

The largest value of N is 2^24, but the number of columns K may be any value. Because there's no limit on K, don't try to read all the input data at once. You should read an input column, do the FFT, and write the output column, before reading the next input column. To save space, you may want to overwrite the storage you use to hold an input column with working or output data.

Getting Started

You should write a sequential version of the program first. This will resolve basic issues like I/O and managing arrays and indices. Your algorithms text, Cormen, Leiserson, and Rivest, is a good resource for this topic. When you have a sequential implementation, divide up the work done in the inner loop (or loops) among a set of threads to finish the project. Example Matlab and C sequential implementations are now available.

Test data

A basic set of test data is available as a tar file. This includes a README and input and output files for N=2^10 and K=16, and N=2^24 and K=1.

Submitting Your Project

Make a tar file with everything necessary to build and run your code, including a Makefile, optional README, and any supporting documentation you think is relevant. Please don't include any ".doc" files or use any other wonky formats. At the beginning of your comments you should (1) say how well your project works, including any missing or extra features, and (2) give your time for the 2^24 point data set with 2 processors.

Make sure that your name and "Project 3" are at the top of your main files. Mail this to me as an email attachment, with subject "Project 3". Do not mail data sets unless they are relatively small and there is a good reason to do so.

Grading

Please read the notes on general information on programming projects. Half your grade will be based on how well your code works, one quarter on design and code sanity, and one quarter on documentation. The documentation should include an honest evaluation of how well your code works, as best you are able to determine. Projects are due by midnight of the assigned date. There is a 5% bonus for each day a project is turned in early, for up to two days early, and a 5% penalty for each day the project is turned in late.

Return to Course Information