Assignment 11: Where others fear to thread

Due Sunday, December 1 before midnight (No late submissions accepted)

The goals for this assignment are:

  • Implement multi-threaded algorithms using the pthread library

  • Implement a tree data structure

  • Work with OS system calls

Update your repository

Do a fetch upstream to obtain the basecode for this assignment.

Using the command line

  • Open terminal and change your current directory to your assignment repository.

  • Run the command git fetch upstream

  • Run the command git merge upstream/master

Your repository should now contain a new folder named A11.

The fetch and merge commands update your repository with any changes from the original.

1. Grep

This question implements a simplified version of the bash command grep. Grep searches a list of files for a given keyword or expression. For example, the following command searches for the keyword public in a list of source files. We use the find command to generate the list of source files to search.

$ grep public find code -name "*.h"
code/Seal.h:class Seal : public Animal
code/Seal.h:public:
code/Walk.h:class Walk : public Locomotion
code/Walk.h:public:
code/Animal.h:public:
code/Duck.h:class Duck : public Bird
code/Duck.h:public:
code/Locomotion.h:public:
code/Fly.h:class Fly : public Locomotion
code/Fly.h:public:
code/Bird.h:class Bird : public Animal
code/Bird.h:public:
code/Swim.h:class Swim : public Locomotion
code/Swim.h:public:
code/Fish.h:class Fish : public Animal
code/Fish.h:public:
code/Whale.h:class Whale : public Fish
code/Whale.h:public:

In the file, grep.c, implement a program that uses N threads to search for a keyword in a set of files.

$ make grep
gcc -g -Wall -Wvla -Werror grep.c -o grep -lpthreads
$ ./grep
usage: ./grep   
$ ./grep 3 public find code -name "*.h"
Searching 10 files for keyword: public
Thread 0 searching 4 files (3 to 7)
Thread 1 searching 3 files (7 to 10)
Thread 2 searching 3 files (10 to 13)
0) code/Seal.h:class Seal : public Animal
0) code/Seal.h:public:
0) code/Walk.h:class Walk : public Locomotion
0) code/Walk.h:public:
0) code/Animal.h:public:
0) code/Duck.h:class Duck : public Bird
2) code/Swim.h:class Swim : public Locomotion
1) code/Locomotion.h:public:
2) code/Swim.h:public:
0) code/Duck.h:public:
1) code/Fly.h:class Fly : public Locomotion
1) code/Fly.h:public:
2) code/Fish.h:class Fish : public Animal
2) code/Fish.h:public:
2) code/Whale.h:class Whale : public Fish
2) code/Whale.h:public:
1) code/Bird.h:class Bird : public Animal
1) code/Bird.h:public:
Elapsed time is 0.000979
Thread 0 found 7 lines containing keyword: public
Thread 1 found 5 lines containing keyword: public
Thread 2 found 6 lines containing keyword: public

Requirements:

  • Your program should take the number of threads, keyword, and a list of files as command line arguments

  • Subdivide the files among the N threads. Note that the number of files may not divide evenly between threads.

  • Use a mutex to ensure that threads do not interrupt each others printing

  • Print the totals and sub-division of files similarly to the output above

  • Use the fgets to process the lines in each file

  • Use the function strstr defined in string.h to search lines of text for the keyword. Documentation is here

2. Code Analysis

This question has two parts. In the first part, you will implement a search tree with string keys. In the second part, you will use the tree to store the source files from a directory, along with their dependency information. Similarly to your grep implementation, you will build the tree using multiple threads.

2.1. Tree

In the file, tree.c, implement a binary search tree. Do not change the header tree.h, nor the function declarations in tree.c. But you can add functions to tree.c! We recommend using the book "Data Structures and Algorithm Analysis in C" by Mark Allen Weiss. See the course webpage for a link.

$ **make ./tree_tests**
gcc -g -Wall -Wvla -Werror -Wno-unused-variable -Wno-unused-but-set-variable tree_tests.c tree.c -o tree_tests -lpthread
$ **./tree_tests**
Running tests...
test 1: insert for empty tree: PASSED
test 2: name set correctly: PASSED
test 3: left is null: PASSED
test 4: right is null: PASSED
test 5: name set correctly: PASSED
test 6: right is null: PASSED
test 7: name set correctly: PASSED
test 8: item not in tree: PASSED
test 9: item in tree: PASSED
test 10: found correct object: PASSED
M
 l:A
 r:X
  l:P
A
M
P
X

Requirements:

  • You must not modify the basecode for tree.h or tree.c, except to add functions to tree.c

  • Your output should match the above and pass all tests

2.2. Dependencies

A dependency is a general software term for any software required by another. In this question, we will store files in a binary search tree so we can list their file dependencies, as determined by #include statements. For example, the file "Animal.h" depends on "Locomotion.h".

In the file, dependency.c, implement a program that uses N threads to build a binary search tree of a given set of files. After building the tree, the program should give the user a prompt where they can list the processed files in alphabetical order and then query the dependencies of the file by giving the filename.

$ ./dependency 3 find code -name ".h" -o -name ".cpp"
Processing 11 files
Thread 0 processing 4 files (arguments 2 to 6)
Thread 1 processing 4 files (arguments 6 to 10)
Thread 2 processing 3 files (arguments 10 to 13)
Elapsed time is 0.000496
$ list
code/Animal.h
code/Bird.h
code/Duck.h
code/Fish.h
code/Fly.h
code/Locomotion.h
code/Seal.h
code/Swim.h
code/Walk.h
code/Whale.h
code/Zoo.cpp
$ code/Animal.cpp
code/Animal.cpp not found
$ code/Animal.h
code/Animal.h has the following dependencies
  string
  vector
  Locomotion.h
$ code/Zoo.cpp
code/Zoo.cpp has the following dependencies
  Animal.h
  Bird.h
  Duck.h
  Fish.h
  Seal.h
  Whale.h
  Locomotion.h
  Walk.h
  Fly.h
  Swim.h
$ apple
apple not found
$ quit

Requirements:

  • Check that a file is valid before adding it to the tree

  • Support the commands: quit, list, <filename>

  • Use a mutex to avoid race conditions when setting up the tree

  • Your program should take the number of threads and a list of files as command line arguments

  • Subdivide the files among the N threads. Note that the number of files may not divide evenly between threads.

  • Use a mutex to ensure that threads do not have a race conditions when constructing the tree

  • Use strstr and fgets to parse the source file and find lines containing #include

  • Strip the filenames from the #include lines when printing dependencies.

3. Submit your work

Submit both your code and images.

1) Push your code work to github

$ git status
$ git add .
$ git status
$ git commit -m "assignment complete"
$ git status
$ git push
$ git status

4. Grading Rubric

Assignment rubrics

Grades are out of 4 points.

  • (1.5 points) Grep

    • (0.15 points) style and header comment

    • (0.1 points) Correct command line arguments: number of threads, keyword, and a list of files

    • (0.2 points) Subdivide the files among the N threads

    • (0.3 points) Correct output, uses mutex

    • (0.75 points) no memory errors

  • (1.5 points) Tree

    • (0.15 points) style and header comment

    • (0.6 points) implements all functions, passes all tests and does not modify basecode

    • (0.75 points) no memory errors

  • (1 points) Dependency

    • (0.1 points) style and header comment

    • (0.1 points) Correct command line arguments: number of threads, keyword, and a list of files

    • (0.1 points) Subdivide the files among the N threads, builds tree using multiple threads using mutex

    • (0.2 points) Correct behavior

    • (0.5 points) no memory errors

Code rubrics

For full credit, your C programs must be feature-complete, robust (e.g. run without memory errors or crashing) and have good style.

  • Some credit lost for missing features or bugs, depending on severity of error

  • -5% for style errors. See the class coding style here.

  • -50% for memory errors

  • -100% for failure to checkin work to Github

  • -100% for failure to compile on linux using make