Homework Assignment 2; CS 450

Part 1: Plan for new features

I have some good news... and some bad news. The good news is that your program to find the most common word was a big hit. The bad news is that the users want some more features.

In addition to displaying the most common word, they want to be able to display the total number of distinct words and the total number of words in the file. Also, they need to be able to generate a list of all the words along with the number of times they occur.

The users would like to be able to call this tool from a script, so prompting the user interactively for what they want to do won't work.

They also tell you that it is essential that your program work quickly even on really big files. Consider what type of data structures will be necessary to support this requirement.
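As one possible starting point, here is a minimal sketch of a hash-table approach. It is written in Python purely for illustration (the assignment does not prescribe a language), the function names count_words and summarize are hypothetical, and splitting on whitespace is only a placeholder for your own definition of a word. A hash table gives expected constant-time updates per word, so a single pass over the file is enough to answer all of the users' questions.

```python
from collections import Counter

def count_words(lines):
    """Count words from an iterable of text lines, one line at a time.

    Reading line by line keeps memory proportional to the number of
    distinct words, not to the size of the file.
    """
    counts = Counter()  # hash table: word -> number of occurrences
    for line in lines:
        # Placeholder word definition: whitespace-separated tokens.
        counts.update(line.split())
    return counts

def summarize(counts):
    """Return (total words, distinct words, most common word)."""
    total = sum(counts.values())
    distinct = len(counts)
    most_common = counts.most_common(1)[0][0] if counts else None
    return total, distinct, most_common
```

Alternatives worth comparing in your README include a sorted array or a balanced tree (logarithmic updates, but ordered output for free) and an unsorted list (linear lookups, which will not scale to big files).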

Incorporate into your README document a discussion of the data structures you've chosen. Compare your choice with other alternatives.

Also in your README document, carefully specify your definition of a word. Consider this discussion of what is a word and compare this to your definition of a word.

I would advise against hard-coded limits on word size etc. Consider how long words can get!
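To make the point about limits concrete, here is a small Python sketch of word extraction with no fixed maximum word length. The definition it uses (runs of ASCII letters, optionally joined by internal apostrophes, folded to lower case) is only an assumption for illustration, and the names WORD_RE and words_in are hypothetical; your own definition may reasonably differ.

```python
import re

# Assumed definition: one or more letters, optionally followed by
# apostrophe-plus-letters groups (so "don't" is a single word).
WORD_RE = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)*")

def words_in(text):
    """Return the words found in text, lower-cased, in order of appearance."""
    return [match.group(0).lower() for match in WORD_RE.finditer(text)]
```

Because the words are built by the regular expression engine rather than copied into a fixed-size buffer, a 100,000-letter "word" is handled the same way as a 3-letter one.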

How will the user specify which of these new options they would like?

Consider that users have expressed a strong desire to be able to automate the use of this tool with a script.
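One common way to make a tool scriptable is to take every choice from command-line options instead of prompts. The Python sketch below shows the idea using the standard argparse module. The program name wordfreq and all of the flag names are hypothetical, chosen only for this example; you are free to design a different interface.

```python
import argparse

def build_parser():
    """Build a parser for a hypothetical word-frequency command line."""
    parser = argparse.ArgumentParser(
        prog="wordfreq",  # hypothetical program name
        description="Report word statistics for a text file.",
    )
    parser.add_argument("filename", help="file to analyze")
    parser.add_argument("-m", "--most-common", action="store_true",
                        help="print the most common word")
    parser.add_argument("-d", "--distinct", action="store_true",
                        help="print the number of distinct words")
    parser.add_argument("-t", "--total", action="store_true",
                        help="print the total number of words")
    parser.add_argument("-l", "--list", dest="list_words", action="store_true",
                        help="print every word with its count")
    return parser
```

With an interface like this, a script could run something such as "wordfreq -d -t input.txt" and read the results from standard output without any interaction.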

Add a section to your MANPAGE document regarding your new user interface.

Also, add to your MANPAGE some examples of valid and invalid words and specify any hardcoded limits on filename size, etc. embedded in your code.

Part 2: Estimate

Before you begin, read the directions carefully and then estimate how long it will take to complete the entire assignment. Record this estimate so you can compare it with the reality when you are done.


Part 3: Implement the New Features

Modify the program you submitted for homework 1 to count the number of distinct words, the number of total words and to generate the list of words with their frequencies.
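As an illustration of the word list feature, the Python sketch below turns a table of counts into printable lines. The output layout (word, a tab, then the count) and the ordering (most frequent first, ties broken alphabetically) are assumptions made for this example, not requirements, and the name frequency_report is hypothetical.

```python
def frequency_report(counts):
    """Return one 'word<TAB>count' line per word.

    counts is a mapping from word to number of occurrences.
    Lines are ordered by descending count, then alphabetically.
    """
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [f"{word}\t{count}" for word, count in ordered]
```

A fixed, tab-separated layout like this is easy for other scripts to parse, which fits the users' request for automation.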

If you have multiple source code files, describe their purpose and their relationship to each other in your README.

Part 4: CHANGE_LOG

Keep a running list of the changes you made and how long it took you to make each one. Place the description of your changes in the file CHANGE_LOG. When I grade your assignment, I will use this file as a guide to answer the question "What changes have been made since hw1 to improve the program?"

Organize these changes into version descriptions. Describe version 1 (homework 1). Describe changes for version 2. Describe changes for version 4 (homework 4).

Part 5: Bug tracking

As you find and fix bugs in your code, make a note of them in a new file, BUG_LOG.

For each bug, note the date found, a brief description of what you did to fix it, how long it took you, and what you could have done differently to avoid it in the first place.

If you haven't yet fixed a given bug, make a note to that effect.

Part 6: TIME_LOG

Update your TIME_LOG file. Categorize the time into categories like design, implementation, debugging, etc. This file should simply be in plain ASCII text format.

You should include a total time in minutes for each of the following categories:

planning/preparation
coding
debugging/testing
document preparation

Add or update the following lines:

your_username TAB TOTAL_HW2 TAB time_in_minutes
your_username TAB ESTIMATE_HW2 TAB time_in_minutes
your_username TAB TOTAL_ALL_HW TAB time_in_minutes
your_username TAB LINES_OF_CODE_HW2 TAB lines_of_code
your_username TAB LINES_OF_DOCUMENTATION_HW2 TAB lines_of_documentation
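
To show what a literal TAB-separated line looks like, here is a tiny Python sketch that formats one such line. The helper name time_log_line is hypothetical, and the username and number used in the usage note are placeholders, not real data.

```python
def time_log_line(username, key, value):
    """Format one TIME_LOG summary line: username, key and value
    separated by single tab characters."""
    return f"{username}\t{key}\t{value}"
```

For example, time_log_line("jdoe", "TOTAL_HW2", 345) produces the text jdoe, a tab, TOTAL_HW2, a tab, and 345 on one line.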

Don't forget to estimate how long it will take you to do this homework before you start!

Part 7: Submit your assignment.

You must check the following files into /afs/clarkson.edu/class/cs450/students/YOUR_USERNAME/hw2 in our class account: README, MANPAGE, TIME_LOG, BUG_LOG, HW2_TESTING, Makefile, and all your source files. Remember to remove unnecessary files to save space.

These names *must* be exact!!!