This assessed programming task is worth 40% of the marks for credit CSM210 (Programming in C).
The following regulations are IMPORTANT. Please read them, and if in any doubt, seek clarification from me PRIOR to the submission of the assessed practical task.
Your programs MUST be compilable using the version of UNIX gcc installed on a UNIX server of the Department of Computer Science and AI. If the examiner is unable to compile the program on at least one of the UNIX servers provided by the Department of Computer Science and AI, the examiner will penalise the submission accordingly.
Plagiarism will not be tolerated. Students found to have plagiarised will fail the credit and will risk being expelled from their respective degree course. THIS IS FOR REAL.
The deadline for the assessed practical task will be the time of the credit test for CSM210. The task must be submitted to Room 202, New Computing Building, University of Malta, Tal-Qroqq, Msida, and must be signed in as proof of submission. Late submission of tasks will attract an immediate 50% penalty (regardless of the reason for lateness) with an additional 10% penalty for each subsequent day of late submission, weekends included. In the event that a candidate is sick on the day of the credit test, the candidate must ensure that the assessed practical task is delivered to the location specified above in conjunction with the medical certificate (which must arrive within 1 hour of the start of the exam). Note that the penalties referred to in this document apply only to the 40% allocated to the assessed practical task.
The examiner reserves the right to ask *any* candidate to defend his or her submission via an oral examination prior to the results being published for this credit.
Failing candidates: in the event of a resit, the marks awarded for the assessed practical task in the first sit will stand, unless the candidate gives notice that they intend to resubmit the assignment at the time they register for the resit. In this case, the submission will be worth only 20%, with the written part of the examination worth 60%. It is not possible for the marks awarded to the first submission to be disclosed to resitting candidates.
It is important that your implementation of wc (called wcPlus) follows the requirements detailed in this description. Failure to do so could result in the deduction of marks.
SYNOPSIS
wcPlus [ -clLws ] [ name ... ]
DESCRIPTION
wcPlus counts lines, words, and characters in the named text files, or in the standard
input if no filenames appear. It also keeps a total count for all named files. A word
is a string of characters delimited by a SPACE, TAB, or by any other
character that the library function iswspace() classifies as white space (see
the UNIX manual pages for a description of iswspace()). wcPlus also returns an
alphabetically sorted list of
words in each file and a numerically sorted list in descending order of word
frequency. It also returns the words and frequency for the most frequently
occurring words in each named file.
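The counting rule in this description can be sketched as a single pass that tracks whether the scan is currently inside a word. This is an illustrative sketch only: struct wc_counts and count_text() are hypothetical names, and the sketch works on a wide-character string already in memory rather than on a file, so reading the input (for example with fgetwc()) is left out.

```c
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical container for the three basic counts. */
struct wc_counts {
    unsigned long lines;
    unsigned long words;
    unsigned long chars;
};

/* Count lines, words, and characters in a NUL-terminated wide string. */
struct wc_counts count_text(const wchar_t *text)
{
    struct wc_counts c = {0, 0, 0};
    int in_word = 0;   /* nonzero while the scan is inside a word */

    for (; *text != L'\0'; text++) {
        c.chars++;
        if (*text == L'\n')
            c.lines++;                 /* a line ends at each newline */
        if (iswspace((wint_t)*text)) {
            in_word = 0;               /* a delimiter ends the current word */
        } else if (!in_word) {
            in_word = 1;               /* first character of a new word */
            c.words++;
        }
    }
    return c;
}
```

Note that a final line without a terminating newline is not counted as a line here, which matches the usual behaviour of wc.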
OPTIONS
When filenames are specified on the command line, they are printed along with
the counts.
If no option is specified, the default is -lwc (count lines, words, and characters).
-c Count characters.
-l Count lines.
-w Count words delimited by white space characters or new line characters. Delimiting characters are Extended Unix Code (EUC) characters from any code set defined by iswspace().
-L Return an alphabetically sorted list of words that appear in the input. Words must not appear more than once in the list. Capital letters are not significant (i.e., "West" and "west" are considered identical) and acronyms are treated as whole words (e.g., "A.R.C" is one word). Words composed of digits (e.g., telephone numbers) should be sorted according to their numerical value (e.g., 12345 is greater than 213).
-s Return a list of words and their frequency in descending order of word frequency.
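The ordering rules for -L (capital letters not significant, words made of digits ordered by numerical value) can be captured in one comparison function. The sketch below is one possible reading, not a prescribed solution: compare_words() and its helpers are hypothetical names, and the specification does not say how a digit-only word compares with an alphabetic one, so the sketch simply assumes a case-insensitive character comparison in that case.

```c
#include <ctype.h>
#include <string.h>

/* Return 1 if s is non-empty and consists only of decimal digits. */
static int is_all_digits(const char *s)
{
    if (*s == '\0')
        return 0;
    for (; *s != '\0'; s++)
        if (!isdigit((unsigned char)*s))
            return 0;
    return 1;
}

/* Compare two digit strings by value without converting them,
   so arbitrarily long numbers (e.g. telephone numbers) cannot overflow. */
static int compare_numeric(const char *a, const char *b)
{
    size_t la, lb;

    while (*a == '0' && a[1] != '\0') a++;   /* skip leading zeros */
    while (*b == '0' && b[1] != '\0') b++;
    la = strlen(a);
    lb = strlen(b);
    if (la != lb)
        return la < lb ? -1 : 1;             /* more digits means larger */
    return strcmp(a, b);                     /* same length: digit by digit */
}

/* Negative, zero, or positive, in the manner of strcmp(). */
int compare_words(const char *a, const char *b)
{
    if (is_all_digits(a) && is_all_digits(b))
        return compare_numeric(a, b);

    /* Assumption: everything else is compared ignoring case. */
    for (; *a != '\0' && *b != '\0'; a++, b++) {
        int ca = tolower((unsigned char)*a);
        int cb = tolower((unsigned char)*b);
        if (ca != cb)
            return ca - cb;
    }
    return (unsigned char)*a - (unsigned char)*b;
}
```

With this function, "West" and "west" compare as equal, and "213" sorts before "12345".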
Apart from the usual error checks (that the named files exist, that the user has permission to read them, that the command line arguments given are supported, etc.) the program must check that the named files are text files.
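This description does not define what counts as a text file, so any check is necessarily a heuristic. The sketch below shows one commonly used approach under stated assumptions: the hypothetical function looks_like_text() inspects a buffer holding the first bytes of a file and rejects it if it finds a NUL byte or an ASCII control character other than white space. Bytes with the high bit set are accepted, on the assumption that they may belong to multibyte (e.g., EUC) characters.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Heuristic only: return 1 if the first n bytes in buf look like text,
 * 0 otherwise. An empty buffer is treated as text.
 */
int looks_like_text(const unsigned char *buf, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        unsigned char ch = buf[i];

        if (ch == '\0')
            return 0;   /* NUL bytes are typical of binary files */
        if (ch < 0x80 && !isprint(ch) && !isspace(ch))
            return 0;   /* ASCII control character other than white space */
        /* Assumption: bytes >= 0x80 may be part of multibyte characters. */
    }
    return 1;
}
```

A caller might fill the buffer with a single fread() of the first few hundred bytes of each named file before counting begins.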
The functions which cater for the -L and -s command line arguments must use linked lists of data structures to store the words and word frequencies. Memory allocated for the lists must be managed dynamically. If you wish to implement a more complex data structure (e.g., a balanced tree), you must first seek approval from your lecturer (i.e., me!).
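As an illustration of the linked-list requirement, the following sketch keeps a singly linked list in alphabetical order, stores each word once, and counts repeats. It is a sketch under assumptions rather than a model answer: struct word_node, list_add(), list_free(), and compare_ci() are hypothetical names, and the numeric ordering of digit-only words is omitted for brevity.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical list node: one distinct word and its frequency. */
struct word_node {
    char *word;
    unsigned long count;
    struct word_node *next;
};

/* Case-insensitive comparison in the manner of strcmp(). */
static int compare_ci(const char *a, const char *b)
{
    for (; *a != '\0' && *b != '\0'; a++, b++) {
        int ca = tolower((unsigned char)*a);
        int cb = tolower((unsigned char)*b);
        if (ca != cb)
            return ca - cb;
    }
    return (unsigned char)*a - (unsigned char)*b;
}

/*
 * Insert word in alphabetical position, or increment its count if it is
 * already present. Returns 0 on success, -1 if memory allocation fails.
 */
int list_add(struct word_node **head, const char *word)
{
    struct word_node **link = head;
    struct word_node *node;
    int cmp = 1;

    /* Advance until the next node is not alphabetically before word. */
    while (*link != NULL && (cmp = compare_ci((*link)->word, word)) < 0)
        link = &(*link)->next;

    if (*link != NULL && cmp == 0) {
        (*link)->count++;              /* word already in the list */
        return 0;
    }

    node = malloc(sizeof *node);
    if (node == NULL)
        return -1;
    node->word = malloc(strlen(word) + 1);
    if (node->word == NULL) {
        free(node);
        return -1;
    }
    strcpy(node->word, word);
    node->count = 1;
    node->next = *link;                /* splice in before the larger node */
    *link = node;
    return 0;
}

/* Release every node and the word it owns. */
void list_free(struct word_node *head)
{
    while (head != NULL) {
        struct word_node *next = head->next;
        free(head->word);
        free(head);
        head = next;
    }
}
```

The counts gathered this way are the same data that -s needs, so a frequency-ordered list could be built from these nodes once the input has been read.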
The C source code must be compilable by the UNIX gcc compiler, and the executable must run on one of the UNIX hosts belonging to the Department of Computer Science and AI (e.g., babe.cs.um.edu.mt).
1. Electronic version of the C source code, which must be compilable by gcc on one of the department's UNIX servers. The source code should be adequately commented.
2. Electronic version of the C object code, which must have been generated by gcc on one of the department's UNIX servers.
3. Written documentation to include: