Assignment 11: Where others fear to thread
Due Sunday, December 1 before midnight (No late submissions accepted)
The goals for this assignment are:
-
Implement multi-threaded algorithms using the pthread library
-
Implement a tree data structure
-
Work with OS system calls
Update your repository
Do a fetch upstream to obtain the basecode for this assignment.
Using the command line
-
Open terminal and change your current directory to your assignment repository.
-
Run the command
git fetch upstream
-
Run the command
git merge upstream/master
Your repository should now contain a new folder named A11
.
The fetch
and merge
commands update your repository with any changes from the original.
1. Grep
This question implements a simplified version of the bash command grep
. Grep searches a list of files
for a given keyword or expression. For example, the following command searches for the keyword public
in a list of source files. We use the find
command to generate the list of source files to search.
$ grep public find code -name "*.h"
code/Seal.h:class Seal : public Animal
code/Seal.h:public:
code/Walk.h:class Walk : public Locomotion
code/Walk.h:public:
code/Animal.h:public:
code/Duck.h:class Duck : public Bird
code/Duck.h:public:
code/Locomotion.h:public:
code/Fly.h:class Fly : public Locomotion
code/Fly.h:public:
code/Bird.h:class Bird : public Animal
code/Bird.h:public:
code/Swim.h:class Swim : public Locomotion
code/Swim.h:public:
code/Fish.h:class Fish : public Animal
code/Fish.h:public:
code/Whale.h:class Whale : public Fish
code/Whale.h:public:
In the file, grep.c
, implement a program that uses N threads to search for a keyword
in a set of files.
$ make grep
gcc -g -Wall -Wvla -Werror grep.c -o grep -lpthreads
$ ./grep
usage: ./grep
$ ./grep 3 public find code -name "*.h"
Searching 10 files for keyword: public
Thread 0 searching 4 files (3 to 7)
Thread 1 searching 3 files (7 to 10)
Thread 2 searching 3 files (10 to 13)
0) code/Seal.h:class Seal : public Animal
0) code/Seal.h:public:
0) code/Walk.h:class Walk : public Locomotion
0) code/Walk.h:public:
0) code/Animal.h:public:
0) code/Duck.h:class Duck : public Bird
2) code/Swim.h:class Swim : public Locomotion
1) code/Locomotion.h:public:
2) code/Swim.h:public:
0) code/Duck.h:public:
1) code/Fly.h:class Fly : public Locomotion
1) code/Fly.h:public:
2) code/Fish.h:class Fish : public Animal
2) code/Fish.h:public:
2) code/Whale.h:class Whale : public Fish
2) code/Whale.h:public:
1) code/Bird.h:class Bird : public Animal
1) code/Bird.h:public:
Elapsed time is 0.000979
Thread 0 found 7 lines containing keyword: public
Thread 1 found 5 lines containing keyword: public
Thread 2 found 6 lines containing keyword: public
Requirements:
-
Your program should take the number of threads, keyword, and a list of files as command line arguments
-
Subdivide the files among the N threads. Note that the number of files may not divide evenly between threads.
-
Use a mutex to ensure that threads do not interrupt each others printing
-
Print the totals and sub-division of files similarly to the output above
-
Use the
fgets
to process the lines in each file -
Use the function
strstr
defined instring.h
to search lines of text for the keyword. Documentation is here
2. Code Analysis
This question has two parts. In the first part, you will implement a search tree with string keys.
In the second part, you will use the tree to store the source files from a directory, along with their
dependency information. Similarly to your grep
implementation, you will build the tree using multiple
threads.
2.1. Tree
In the file, tree.c
, implement a binary search tree. Do not change the header
tree.h
, nor the function declarations in tree.c
. But you can add functions to tree.c
!
We recommend using the book
"Data Structures and Algorithm Analysis in C" by Mark Allen Weiss. See the
course webpage for a link.
$ **make ./tree_tests** gcc -g -Wall -Wvla -Werror -Wno-unused-variable -Wno-unused-but-set-variable tree_tests.c tree.c -o tree_tests -lpthread $ **./tree_tests** Running tests... test 1: insert for empty tree: PASSED test 2: name set correctly: PASSED test 3: left is null: PASSED test 4: right is null: PASSED test 5: name set correctly: PASSED test 6: right is null: PASSED test 7: name set correctly: PASSED test 8: item not in tree: PASSED test 9: item in tree: PASSED test 10: found correct object: PASSED M l:A r:X l:P A M P X
Requirements:
-
You must not modify the basecode for
tree.h
ortree.c
, except to add functions totree.c
-
Your output should match the above and pass all tests
2.2. Dependencies
A dependency is a general software term for any software required by another. In this question,
we will store files in a binary search tree so we can list their file dependencies,
as determined by #include
statements. For example,
the file "Animal.h" depends on "Locomotion.h".
In the file, dependency.c
, implement a program that uses N threads to build a binary search tree of a
given set of files. After building the tree, the program should give the user a prompt where they
can list the processed files in alphabetical order and then query the dependencies of the file by
giving the filename.
$ ./dependency 3 find code -name ".h" -o -name ".cpp"
Processing 11 files
Thread 0 processing 4 files (arguments 2 to 6)
Thread 1 processing 4 files (arguments 6 to 10)
Thread 2 processing 3 files (arguments 10 to 13)
Elapsed time is 0.000496
$ list
code/Animal.h
code/Bird.h
code/Duck.h
code/Fish.h
code/Fly.h
code/Locomotion.h
code/Seal.h
code/Swim.h
code/Walk.h
code/Whale.h
code/Zoo.cpp
$ code/Animal.cpp
code/Animal.cpp not found
$ code/Animal.h
code/Animal.h has the following dependencies
string
vector
Locomotion.h
$ code/Zoo.cpp
code/Zoo.cpp has the following dependencies
Animal.h
Bird.h
Duck.h
Fish.h
Seal.h
Whale.h
Locomotion.h
Walk.h
Fly.h
Swim.h
$ apple
apple not found
$ quit
Requirements:
-
Check that a file is valid before adding it to the tree
-
Support the commands: quit, list, <filename>
-
Use a mutex to avoid race conditions when setting up the tree
-
Your program should take the number of threads and a list of files as command line arguments
-
Subdivide the files among the N threads. Note that the number of files may not divide evenly between threads.
-
Use a mutex to ensure that threads do not have a race conditions when constructing the tree
-
Use
strstr
andfgets
to parse the source file and find lines containing#include
-
Strip the filenames from the
#include
lines when printing dependencies.
3. Submit your work
Submit both your code and images.
1) Push your code work to github
$ git status
$ git add .
$ git status
$ git commit -m "assignment complete"
$ git status
$ git push
$ git status
4. Grading Rubric
Assignment rubrics
Grades are out of 4 points.
-
(1.5 points) Grep
-
(0.15 points) style and header comment
-
(0.1 points) Correct command line arguments: number of threads, keyword, and a list of files
-
(0.2 points) Subdivide the files among the N threads
-
(0.3 points) Correct output, uses mutex
-
(0.75 points) no memory errors
-
-
(1.5 points) Tree
-
(0.15 points) style and header comment
-
(0.6 points) implements all functions, passes all tests and does not modify basecode
-
(0.75 points) no memory errors
-
-
(1 points) Dependency
-
(0.1 points) style and header comment
-
(0.1 points) Correct command line arguments: number of threads, keyword, and a list of files
-
(0.1 points) Subdivide the files among the N threads, builds tree using multiple threads using mutex
-
(0.2 points) Correct behavior
-
(0.5 points) no memory errors
-
Code rubrics
For full credit, your C programs must be feature-complete, robust (e.g. run without memory errors or crashing) and have good style.
-
Some credit lost for missing features or bugs, depending on severity of error
-
-5% for style errors. See the class coding style here.
-
-50% for memory errors
-
-100% for failure to checkin work to Github
-
-100% for failure to compile on linux using make