CSA2060 Assessed Project for the January 2004 exam session
Date due: Monday 26th January 2004 at midday
This assessed programming task is worth 100% of the study-unit
CSA2060 (Introduction to C).
Changes!!!!
13/11/03... following lecture...
I have introduced the concept of CPU and Resource clock ticks to make
the separation between the two easier to understand (I hope!). One
resource clock tick can be the equivalent of several CPU clock ticks
(because CPUs are generally faster than resources!). I have changed
every reference to clock tick to either CPU clock tick or resource
clock tick. Apart from that, the changed parts of this specification
are indicated by "==>".
Rules and Regulations
These are IMPORTANT. Please read them, and if in any doubt,
seek clarification from me PRIOR to the submission of the
assessed practical task.
Your programs MUST be compilable using the version of UNIX gcc
installed on a UNIX server of the Department of Computer Science and
AI. If the examiner is unable to compile the program on at least one of
the UNIX servers provided by the Department of Computer Science and AI,
the examiner will penalise the submission accordingly.
Plagiarism will not be tolerated. Evidence of plagiarism in
the assignment will attract a Fail grade for the candidate, and may
result in further disciplinary action being taken against the candidate
in accordance with University guidelines. For more information please
visit the departmental Web site on plagiarism.
The deadline for the assessed practical task is midday on Monday 26th January 2004.
The task must be submitted to Room 202, New Computing Building,
University of Malta, Tal-Qroqq, Msida, and must be signed in as proof
of submission. Late submission of tasks will attract an immediate 50%
penalty (regardless of the reason for lateness) with an additional 10%
penalty for each subsequent day of late submission, weekends included.
In the event that a candidate is sick on the day of the deadline, the
candidate must ensure that the assessed practical task is delivered to
the location specified above in conjunction with the medical
certificate (which must arrive by midday on Monday 26th January
2004). Missing "final touches" due to illness on and slightly before
the day of the deadline will be taken into consideration. If you are
ill during the semester, and the illness prevents you from working on
this assignment, then you must inform me as soon as possible (and the
Chair of the Board of Studies for IT, or the Dean of the Faculty of
Science, if it is affecting other courses). In any case, as with
anything else, you must be prepared for all sorts of eventualities.
Short-term illnesses (of a few days' duration) should not really have
any significant impact on your ability to complete all tasks - assuming
that you have planned your work well, and you work steadily throughout
the semester :-) If you leave everything till the last moment and then
get sick, well...
The examiner reserves the right to ask *any* candidate to defend his
or her submission via an oral examination prior to the results
being published for this credit.
Failing candidates: in the event of a resit, candidates will
sit for a written paper the following September. Expect compulsory
questions about the assessed assignment :-) Candidates who have the
right to a delayed first sit, because a legitimate reason prevented
them from working on and submitting the project by the deadline at the
end of this semester will also sit for a written paper the following
September. If the September session contains resitting and first
sitting students, they will all take the same written paper. All
students will be expected to have sufficient knowledge of the
assignment.
Submission Guidelines
Please follow these instructions carefully. Failure to comply with
these instructions may lead to loss of marks. If in doubt, please seek
the lecturer's advice.
Your project will be anonymous. Your name, ID number, student
registration number, etc. must not appear on anything that will be
given to the examiner. Disciplinary action may be taken against
students who identify themselves in their submission.
When you submit your project to Rm 202 (see above), you will be
given a cover sheet to fill in with your personal details. This
information will not be passed on to the examiner.
Project Description
Good, you've made it this far :-)
An Operating System environment typically consists of a number of
processes all competing for computer resources
to complete their task. For example, a program which reads data from a
file, processes it, and then produces a hard copy of its results
requires, along with the CPU and access to memory (RAM) to execute in
the first place, access to long-term storage (typically a hard
disk) and a printer. If any of these resources is denied to the
process, then the program's task cannot complete.
In your program, you will simulate requests being generated which must
then be processed by the resource manager. Although some resources can
be shared (used by many
processes simultaneously), some resources are non-shareable. A non-shareable
resource is one, like a printer, which must complete one job in its
entirety before it is able to start the next one. For example, it would
be considered to be an error
if a printer began printing somebody else's job before it had finished
printing the current job (even if it was possible to return to it
later!). On the other hand, disks, for example, can be shared because although a
program may request a file to be written (or read), it is possible to
attend to another request (to read from or write to another file stored
on the same disk) before the
first file has been written (or read) in its entirety.
In a real operating system environment, most resources can be used
simultaneously by different processes. For instance, it is possible for
one program to read data off a CD while another prints a file. The
resources are managed independently of each other. However, in your
simulated environment you will not be managing real resources, so we
will use a single resource manager to control several different types
of resource.
NB: This specification may contain
mistakes! If you encounter something that looks like it may be a
mistake please e-mail me at cstaff@cs.um.edu.mt.
The Resource Manager
Resources hang around waiting for requests. Upon receiving a request the
resource will service it. A non-shareable resource must service the
request to completion before it can process another request. This is
not the case with a shareable resource, which is able to service many
requests from different processes apparently concurrently.
At the heart of the Resource Manager is a random event generator
which generates events. These events include: the id of the currently
running process and a request for a resource.
Whereas the request for a resource is
generated by the currently running process, the resource itself will
generate an interrupt to
indicate that a request has been serviced. If we assume a uniprocessor
environment then only one process at a time (in a CPU clock tick) can generate a
single
request. However, technically resources (that do not require the
central
processing unit to operate) can each generate an interrupt at the same
point in time to indicate that they have completed some task. To model
this, each of your CPU clock ticks will last long enough for a single
resource request to be generated, and to check whether a resource has
generated an interrupt.
If a resource is currently servicing a request, what happens if a new
request of the same resource is generated? Each resource will maintain
its own queue of incoming
requests. What happens to the incoming request depends on whether the
resource is shareable or non-shareable. If the resource is
non-shareable, then the queued requests must wait their turn for the
resource to become available. Each process that is waiting for a
resource is suspended until
the request is serviced and the resource signals that the resource has
completed the task. On the other hand, with a shareable resource,
waiting requests can be attended to one at a time in sequence with a
little bit of work done on each in turn. The requesting process will
also be suspended waiting for the resource to signal that its request
has been completed. This means that whereas the queue for non-shareable
resources will be a FIFO queue, the queue for shareable resources will
resemble a circular queue.
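The circular queue for a shareable resource could be sketched as a doubly linked ring, as below. This is only one possible layout, not a required design. The names Node, ring_insert, ring_advance and ring_remove_current are hypothetical, and the sketch assumes that a new request joins the ring just behind the request currently being serviced, so that it is the last one to be reached.

```c
#include <stdlib.h>

/* One queued request in the ring (hypothetical layout). */
typedef struct Node {
    short pid;           /* requesting process */
    long bytes_left;     /* data still to transfer */
    struct Node *prev;
    struct Node *next;
} Node;

/* Insert a new request just before *current, i.e. at the logical end
   of the ring. Returns the new node, or NULL if allocation fails. */
Node *ring_insert(Node **current, short pid, long bytes)
{
    Node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->pid = pid;
    n->bytes_left = bytes;
    if (*current == NULL) {          /* empty ring: node points at itself */
        n->prev = n;
        n->next = n;
        *current = n;
    } else {
        n->next = *current;
        n->prev = (*current)->prev;
        (*current)->prev->next = n;
        (*current)->prev = n;
    }
    return n;
}

/* Round robin: move on so that the next request gets its turn. */
void ring_advance(Node **current)
{
    if (*current != NULL)
        *current = (*current)->next;
}

/* Remove the current request (it has completed) and move to the next. */
void ring_remove_current(Node **current)
{
    Node *done = *current;
    if (done == NULL)
        return;
    if (done->next == done) {        /* last remaining node */
        *current = NULL;
    } else {
        done->prev->next = done->next;
        done->next->prev = done->prev;
        *current = done->next;
    }
    free(done);
}
```

A FIFO queue for a non-shareable resource can reuse the same node type: always service the current node to completion and only then call ring_remove_current.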
In general, shareable resources are requested more frequently than
non-shareable ones. Also, non-shareable resources require significantly
more CPU clock ticks to complete a task than shareable resources.
Several
megabytes of data can be written to disk much faster than a printer can
print a few tens of kilobytes of data.
Whenever a process requests a resource you will need to know the
following things:
- The Process ID (PID) of the requesting process (this uniquely
identifies a process)
- The Resource ID (RID) that is requested
- The number of bytes of data to transfer (the size of the task)
- The time at which the request occurred
You will also need to know the following about each resource:
- The resource type (e.g., PRINTER, DISK - it is possible to have
more than one resource of the same type)
- The Resource ID (RID) of the resource (unique)
- Whether the resource is shareable or non-shareable (resources of
the same type must either be all shareable or all non-shareable)
- The number of bytes that can be transferred in one resource clock
tick
(shareable resources are likely to have higher data transfer rates than
non-shareable resources)
==> Each resource clock tick, the number of bytes left to transfer
per resource is
the number of bytes that were still left to transfer minus the number of
bytes that can be transferred in a resource clock tick. Once a resource has
transferred all
of the data that it can process in a single resource clock tick, then
an interrupt can be generated.
When your program is run, the first thing it will do is locate and
open a configuration file that contains certain data which will control
how the environment behaves. The information in the configuration file
should be read into an appropriate data structure.
The program will then enter a loop in which it will scan the input
channel to determine if a 't', 's', or 'g' has been entered by the user
(the user should not need to enter any input to cause the program to
continue executing). If a 't' has been input, then the program will
terminate. If an 's' has been input then the program will enter
single-step mode and report on the current status of all resources,
pausing at each iteration until the user types a <return> or a
'g'. The report will be printed to a file as well as to the screen. If
the user types 'g', the program will enter go mode and stop reporting
the status of each resource. If the user presses <return> while
in
single-step mode, then the program will continue to report on the status of
each resource after each iteration (and prompt the user for input at
the
end of each iteration). An iteration consists of the following steps,
which are explained further below.
An iteration is the equivalent of a CPU clock tick. In a typical CPU
clock
tick, the following will logically happen:
- Generate the id of the currently running process.
This is the process that may make a request.
- Generate a request for a resource, and add the request to the
appropriate queue.
- For each resource in the resource structure:
- If it has been servicing a
request, has it finished?
- If
it has finished, then generate an interrupt and start servicing the
next request
in the resource queue
- If it has not finished, then:
- If it is a nonshareable resource, then:
- Continue processing the current request
- Otherwise, if it is a shareable resource:
- Start or resume servicing the next request in the
circular queue
Generate the ID of the currently running process
Generate a random PID as a short integer.
Generate a request for a resource
There is a 30% chance that a process will request a resource, and the
chances of requesting a shareable resource are 3 times higher than the
chance of the process requesting a non-shareable resource. However,
there is then an equal chance of any of the shareable (or nonshareable)
resources being requested. As an example, consider the following. On
average, a resource request will be generated only 3 times in every 10
"goes" (CPU clock ticks). However, for every 4 requests that are
made
3 requests will be for a shareable resource and only 1 will be for a
non-shareable resource. If your system has 5 shareable
resources, then each one has an equal opportunity of being requested if
the process has made a request for a shareable resource.
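The odds described above could be turned into code along the following lines. This is a sketch under assumptions: the function names chance and pick_request are hypothetical, the two percentages are expected to come from the configuration file, and rand() % n is used for simplicity even though it is very slightly biased. The caller is assumed to have seeded the generator once with srand().

```c
#include <stdlib.h>

enum { NO_REQUEST = -1 };

/* Returns 1 with roughly the given percentage chance (0..100), else 0. */
static int chance(int percent)
{
    return (rand() % 100) < percent;
}

/* Decide whether the running process makes a request this CPU clock tick.
   request_pct:  odds of any request at all (e.g. 30)
   nonshare_pct: odds that a request is for a non-shareable resource (e.g. 25)
   Returns NO_REQUEST, or an index in 0..count-1 within the chosen class,
   and sets *shareable to 1 (shareable) or 0 (non-shareable). */
int pick_request(int request_pct, int nonshare_pct,
                 int n_shareable, int n_nonshareable, int *shareable)
{
    int count;

    if (!chance(request_pct))
        return NO_REQUEST;

    *shareable = !chance(nonshare_pct);
    count = *shareable ? n_shareable : n_nonshareable;
    if (count <= 0)                 /* no resource of that class configured */
        return NO_REQUEST;

    return rand() % count;          /* equal chance within the class */
}
```

With request_pct = 30 and nonshare_pct = 25, about 3 ticks in 10 produce a request, and about 3 requests in 4 are for a shareable resource.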
Once a request for a specific resource has been made the request is
sent to the Resource Manager which checks if the resource is free. If
it is, the status of the resource is set to busy, the resource is
"given" as much data as it can process in a single resource clock tick,
and the
rest of the data, if any, is placed on the request queue. If the
resource is currently busy, then the request is added to the
appropriate queue. If there is more than one resource of a
given type (e.g., 3 identical printers), then those resources will share a queue so that the next job
can be given to the first resource in its class that becomes available.
You will also need to generate the size of the data that needs to be
transferred, remembering that a nonshareable resource will normally be
given significantly less data to transfer than a shareable resource.
Also, record the time at which the request was added to the resource
queue, together with two other time slots, one which will be used to
record the number of CPU clock ticks spent being serviced and the other
to
record the number of CPU clock ticks spent waiting in the queue.
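A request record holding the items listed above, including the three time slots, might look like the following. The type and function names (Request, make_request, request_tick) are hypothetical and the field widths are assumptions.

```c
/* One resource request, as it sits in a resource queue. */
typedef struct Request {
    short pid;             /* PID of the requesting process */
    int rid;               /* RID of the requested resource */
    long bytes_left;       /* size of the task still to transfer */
    long time_created;     /* iteration at which the request was queued */
    long ticks_serviced;   /* CPU clock ticks spent being serviced */
    long ticks_waiting;    /* CPU clock ticks spent waiting in the queue */
} Request;

/* Build a new request with both tick counters at zero. */
Request make_request(short pid, int rid, long size, long now)
{
    Request r;
    r.pid = pid;
    r.rid = rid;
    r.bytes_left = size;
    r.time_created = now;
    r.ticks_serviced = 0;
    r.ticks_waiting = 0;
    return r;
}

/* Called once per CPU clock tick for every queued request. */
void request_tick(Request *r, int being_serviced)
{
    if (being_serviced)
        r->ticks_serviced++;
    else
        r->ticks_waiting++;
}
```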
Managing Resources
Each resource is represented by a resource
descriptor that records the following information:
- Type (Symbolic, e.g. PRINTER1)
- Resource ID (unique number)
- Number in class (e.g., if it is the only resource of its type,
then 1, otherwise, if there are e.g., 2 printers of the same type the
first will be 1 and the second will be 2)
- Status (busy, free - 0, 1)
- Size of data block (in bytes) that can be processed in one
resource clock
tick
- ==> Number of CPU clock ticks equivalent to 1 resource clock tick (see note).
- PID of the process currently being serviced
- Address of the request queue for this resource (multiple
resources of the same type will point to the same address)
- Address of interrupt location for this resource in the interrupt
vector table
Note that multiple resources of the same type will each require a separate resource descriptor, but
each resource descriptor will have the same Type. For example, you can
have three PRINTER1 resources each with a unique Resource ID.
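The resource descriptor could be declared roughly as follows. This is a sketch only: the field names, the 32-character limit on the type string and the helper functions are assumptions, and the queue and interrupt types are left as forward declarations because their layout is up to you. The next pointer reflects the fact that the descriptors are kept in a linked list.

```c
#include <stddef.h>
#include <string.h>

struct RequestQueue;      /* hypothetical: request queue, defined elsewhere */
struct InterruptEntry;    /* hypothetical: slot in the interrupt vector table */

enum { STATUS_BUSY = 0, STATUS_FREE = 1 };

typedef struct ResourceDescriptor {
    char type[32];                     /* symbolic type, e.g. "PRINTER1" */
    int rid;                           /* unique Resource ID */
    int number_in_class;               /* 1 for the first of its type, 2 for the second... */
    int status;                        /* STATUS_BUSY or STATUS_FREE */
    long block_size;                   /* bytes processed per resource clock tick */
    int cpu_ticks_per_resource_tick;   /* CPU clock ticks in 1 resource clock tick */
    short current_pid;                 /* PID being serviced, if busy */
    struct RequestQueue *queue;        /* shared by resources of the same type */
    struct InterruptEntry *interrupt;  /* this resource's interrupt location */
    struct ResourceDescriptor *next;   /* next descriptor in the list */
} ResourceDescriptor;

/* Find a descriptor by its RID; returns NULL if there is none. */
ResourceDescriptor *find_resource(ResourceDescriptor *head, int rid)
{
    for (; head != NULL; head = head->next)
        if (head->rid == rid)
            return head;
    return NULL;
}

/* Count how many resources of a given type are in the list. */
int count_type(const ResourceDescriptor *head, const char *type)
{
    int n = 0;
    for (; head != NULL; head = head->next)
        if (strcmp(head->type, type) == 0)
            n++;
    return n;
}
```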
==> Note: Mapping resource clock
ticks to CPU clock ticks...
In a real operating system, resources operate slower than the
computer's CPU. If Time is real time (measured in nanoseconds) then,
for example, a single CPU clock tick can occur in one Time interval.
However, a printer may take a whole second of Time to register a single
one of its resource clock ticks. The resource descriptor records how
many bytes of data can be processed in a single resource clock tick.
However, one resource's clock tick may last the equivalent of several
CPU clock ticks. The resource descriptor will record the number of
equivalent CPU clock ticks that are required to process the data that
can be processed in one resource clock tick.
Resource descriptors are linked together through the resource structure. The resource
structure is a linked list of resource descriptors. The interrupt
vector table is a table of pairs, each holding an interrupt flag together
with a pointer to the function which acts as the resource
handler. The resource handler is responsible for transferring a
block of data to the resource so that it can be processed. The
resource handler is invoked by the resource manager when a resource has
generated an interrupt at the appropriate location in the interrupt
vector table. The resource handler will reset the interrupt in the
vector table and will change the status of the resource to "free", if
the resource has completed its current task and there are no more tasks
in the resource's request queue.
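One way to picture the interrupt vector table is as an array of flag and function-pointer pairs, as sketched below. The fixed table size, the names (InterruptEntry, vector_table, raise_interrupt, dispatch_interrupts) and the trivial example_handler are all hypothetical. A real handler would also hand the next block of data to the resource and update its status.

```c
#include <stddef.h>

#define MAX_RESOURCES 16              /* hypothetical upper limit */

typedef void (*ResourceHandler)(int slot);

/* One pair in the table: is the interrupt set, and who handles it? */
typedef struct InterruptEntry {
    int raised;                       /* 1 if the resource has interrupted */
    ResourceHandler handler;          /* resource handler for this slot */
} InterruptEntry;

static InterruptEntry vector_table[MAX_RESOURCES];
static int handled_count[MAX_RESOURCES];   /* for illustration only */

/* A resource signals that it has finished its current block. */
void raise_interrupt(int slot)
{
    vector_table[slot].raised = 1;
}

/* Minimal handler: reset the interrupt and note that it was handled. */
void example_handler(int slot)
{
    vector_table[slot].raised = 0;
    handled_count[slot]++;
}

/* Traverse the table once; call the handler of every slot whose
   interrupt is set. Returns the number of handlers called. */
int dispatch_interrupts(void)
{
    int slot;
    int dispatched = 0;

    for (slot = 0; slot < MAX_RESOURCES; slot++) {
        if (vector_table[slot].raised && vector_table[slot].handler != NULL) {
            vector_table[slot].handler(slot);
            dispatched++;
        }
    }
    return dispatched;
}
```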
Each CPU clock tick the interrupt vector table is traversed. If an
interrupt is set, then the Resource Manager will call the function for
the associated resource handler. For a
nonshareable resource, if the
resource is busy, the resource handler will reduce the size of data
remaining to be transferred
(stored with the resource request in the resource queue) by the size of
the data block that can be processed in one resource clock tick. If the
resulting value is 0 or negative, the request has been completed. The
resource handler will generate an interrupt (through the interrupt
vector table), and do
whatever else you consider to be good housekeeping (remember to
document what you do!). For each shareable
resource, do the same as for nonshareable resources but even if there is still data
remaining
to be transferred (after reducing it by the amount that can be
processed in a single resource clock tick) you will now transfer
control to the
next waiting process so that it can make some progress the next time
around. You will need to implement this structure at least as a
circular queue. However, because you will want to add new requests to
the "end" of the list, you'll also need to implement it as a doubly
linked circular list (because you know that the logical end of the list
is just before the process you are currently servicing, because it is
the one which will take you longest to reach). ==> Do remember that
one resource clock tick may be equivalent to several CPU clock ticks.
When a resource begins to process a block of data, you will need to
initialise an associated counter to either the number of CPU equivalent
clock ticks that it will take to transfer the maximum amount of data
that the resource can process in one resource clock tick, or if the
amount of data to transfer is less than the maximum amount that can be
handled in a single resource clock tick, then the actual number of CPU
equivalent clock ticks that will be required. With each CPU clock tick that
passes, the counter will be decremented. When the counter reaches 0, an
interrupt will be generated signalling the completion of the current
task.
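The counter and the "bytes remaining" bookkeeping described above might be sketched as follows. The function names are hypothetical. For a final, partial block, the sketch assumes that the number of CPU clock ticks is scaled in proportion to the amount of data and rounded up; the specification does not prescribe the rounding, so document whatever you choose.

```c
/* CPU clock ticks needed for the next block of a request.
   bytes_left:      data still to transfer for the request
   block_size:      bytes the resource handles per resource clock tick
   ticks_per_block: CPU clock ticks in one resource clock tick
   A full block costs ticks_per_block; a smaller final block costs
   proportionally less, rounded up (assumption). */
long cpu_ticks_for_block(long bytes_left, long block_size, long ticks_per_block)
{
    long chunk;

    if (block_size <= 0 || bytes_left <= 0)
        return 0;
    chunk = bytes_left < block_size ? bytes_left : block_size;
    return (chunk * ticks_per_block + block_size - 1) / block_size;
}

/* Decrement the counter once per CPU clock tick.
   Returns 1 when it has reached 0, i.e. an interrupt should be generated. */
int countdown(long *counter)
{
    if (*counter > 0)
        (*counter)--;
    return *counter == 0;
}

/* Reduce the data remaining by one block.
   Returns 1 if the request is now complete (0 or negative remaining). */
int transfer_block(long *bytes_left, long block_size)
{
    *bytes_left -= block_size;
    return *bytes_left <= 0;
}
```

For example, with DATABLOCKSIZE=50 and CLOCKTICKS=7, a full block takes 7 CPU clock ticks, while a final block of 20 bytes takes 3 under the rounding assumed here.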
If the resource is free, then you'll need to see if there's anything in
the resource queue waiting to be serviced. If there is then load it
into the resource descriptor. For each request, whether or not it is
being serviced, update the appropriate time serviced/time waiting time
slots.
Scanning the input buffer
Your program normally runs non-interactively. With no user interaction,
the program
terminates after a number of iterations (read in from the configuration
file). Just before the program
terminates, it produces a short report on the statistics that have been
kept.
At each iteration, however, the program will scan its input buffer,
without pausing, to see if the user has entered a directive. The
possible directives are:
- t: terminate program
- s: enter single-step mode (ignore if already in single-step mode)
- <return>: continue single-step mode, otherwise ignore input
- g: go mode - exit single-step mode, if in it, otherwise ignore
input
All other inputs are discarded. If there is more than one character in
the input buffer, obey the first legitimate directive, and discard all
the others. For instance if the input buffer contains 'ast', ignore the
'a' (because it is invalid), obey the single-step directive ('s'), and
discard the remaining input. Similarly, if the program is in go mode
and the input buffer contains '<return>gts', then the
<return> ('\n') and 'g' should be ignored (because they both
require the program to first be in single-step mode), the 't' directive
will be obeyed, and the remaining characters will be discarded.
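The rule "obey the first legitimate directive and discard the rest" can be sketched as a small function over whatever characters were found in the input buffer. The names Mode and first_directive are hypothetical. How the buffer is read without pausing is platform-specific and is not shown here.

```c
typedef enum { MODE_GO, MODE_STEP } Mode;

/* Scan buf and return the first directive that is legitimate in the
   given mode: 't', 's', 'g' or '\n'. Returns 0 if there is none.
   Everything after the returned directive is simply ignored. */
char first_directive(const char *buf, Mode mode)
{
    for (; *buf != '\0'; buf++) {
        switch (*buf) {
        case 't':                    /* always valid */
            return 't';
        case 's':                    /* ignored if already single-stepping */
            if (mode == MODE_GO)
                return 's';
            break;
        case 'g':                    /* only valid in single-step mode */
        case '\n':
            if (mode == MODE_STEP)
                return *buf;
            break;
        default:                     /* invalid input is discarded */
            break;
        }
    }
    return 0;
}
```

With this sketch, "ast" in go mode yields 's', and "\ngts" in go mode yields 't', matching the two examples above.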
t: terminate program
Forces the program to end before it reaches the maximum number of
possible iterations. The behaviour of the program should be identical
to a normal termination. It will still print the vital statistics, but
obviously, they will have been collected over a smaller number of
iterations.
s: enter single-step mode
The current state of all processes and statistical information are
printed to the screen and appended to file. Pause for user input. The
user is allowed to enter <return>, 't', and 'g' only. All other
input is rejected.
The following information will be printed (to screen and to file):
Current iteration number:
Resource [Resource_ID] (Resource Name):
Current Status: <busy, free>
Current PID serviced: (if busy)
Last PID serviced:
Average wait for service:
%age of up-time spent in free state:
Queued Requests: [PID, job size[,PID, job size]...]
...
(obviously, time is measured in iterations. Time created will be an
absolute number, whereas the other time references will be an offset
from the time of creation).
<return>: continue single-step mode
Iterate once through the process queue and then display (and append to
file) the updated statistics and information about each process. Pause
for user input. The user is allowed to enter <return>, 't', or
'g' only. All other input is rejected.
g: go mode
Only allowed in single-step mode. Interactivity is switched off, and
reporting to the screen and file is disabled. In this state, only 't'
and 's' are valid directives.
When your program terminates
Your program can terminate for two legitimate reasons. Either the user
has input 't', to force early termination, or else the total number of
iterations has exceeded the maximum value specified in the
configuration file.
The statistics that you will
print (to screen and to file) are:
Total number of iterations:
[The next lines are repeated for each resource]
Resource Resource_ID (Resource Type):
%age of up-time spent in free state:
Number of requests serviced:
Average length of service per request:
Average length of time request spent waiting for service:
Resource Resource_ID (Resource Type):
...
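The per-resource figures in the final report are simple ratios over counters kept during the run. A possible sketch follows. The ResourceStats layout and the function names are hypothetical, and it assumes that both averages are taken over the requests that were serviced. Passing the output stream as a parameter lets the same function write to the screen and to the file.

```c
#include <stdio.h>

/* Counters kept for one resource during the run (hypothetical layout). */
typedef struct ResourceStats {
    int rid;
    const char *type;
    long ticks_free;            /* iterations spent in the free state */
    long requests_serviced;
    long total_service_ticks;   /* summed over serviced requests */
    long total_wait_ticks;      /* summed over serviced requests */
} ResourceStats;

/* part as a percentage of whole; 0 if whole is not positive. */
double percent_of(long part, long whole)
{
    return whole > 0 ? 100.0 * (double)part / (double)whole : 0.0;
}

/* total divided by count; 0 if count is not positive. */
double average_of(long total, long count)
{
    return count > 0 ? (double)total / (double)count : 0.0;
}

/* Print the final statistics for one resource to the given stream. */
void print_resource_stats(FILE *out, const ResourceStats *s, long iterations)
{
    fprintf(out, "Resource %d (%s):\n", s->rid, s->type);
    fprintf(out, "%%age of up-time spent in free state: %.1f\n",
            percent_of(s->ticks_free, iterations));
    fprintf(out, "Number of requests serviced: %ld\n", s->requests_serviced);
    fprintf(out, "Average length of service per request: %.1f\n",
            average_of(s->total_service_ticks, s->requests_serviced));
    fprintf(out, "Average length of time request spent waiting for service: %.1f\n",
            average_of(s->total_wait_ticks, s->requests_serviced));
}
```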
Configuration file
The contents of the configuration file are:
The initial state: {s | g} (s = single step mode, g = go mode)
Maximum number of iterations: {9999 = infinite}
Odds of a process generating a request for a resource: 30%
Odds of a nonshareable resource being requested, if a resource has been
requested: 25%
List of Resource Descriptors
Resource Type: string
Resource ID: unique integer
Shareable: 0 = nonshareable, 1 = shareable
Data Block Size: (in bytes, for max data transfer size in one resource
clock
tick)
==> Clock ticks: the number of CPU clock ticks that are executed for
one resource clock tick to pass
Max Job Size: the maximum size of job that can be handled by the
resource (artificial limit, in bytes)
Resources of the same type will have the same string value, e.g.,
PRINTER.
You should use the following representations of information in the
configuration file:
INIT=g
MAXITER=30000
REQUEST=30
REQUESTNONSHARE=25
RESOURCESEPARATOR
RESOURCETYPE=PRINTER
RESOURCEID=1
SHAREABLE=0
DATABLOCKSIZE=50
==> CLOCKTICKS=7
MAXJOBSIZE=1000
RESOURCESEPARATOR
RESOURCETYPE=...
You should check that the token on the left hand side is recognisable,
rejecting any errors in the input. You may ignore tokens that are not
recognised without aborting, but you must have all of the expected
tokens to continue. For example, you can ignore ITER=9, but if
DATABLOCKSIZE is missing for the first resource, then you must abort
the program. You should also ensure that legal
values are provided for each resource. You should also ensure that
MAXJOBSIZE, which is the maximum size of job that a resource will
accept, is reasonable (MAXJOBSIZE/ DATABLOCKSIZE << MAXITER). On
average, jobs processed by nonshareable resources will tend to be much
smaller than those processed by shareable resources, otherwise
requests for nonshareable resources will be left unserviced for
unreasonable lengths of time.
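Reading one line of the configuration file could be sketched as below. The helper names (split_line, parse_long, job_size_reasonable) are hypothetical. The "<<" test is expressed with a caller-chosen factor, because the specification does not say how much smaller MAXJOBSIZE / DATABLOCKSIZE must be than MAXITER. Lines with no '=' (such as RESOURCESEPARATOR) are reported as "not a pair" so the caller can treat them separately.

```c
#include <stdlib.h>
#include <string.h>

/* Split "KEY=VALUE" (optionally ending in a newline) into key and value.
   Returns 1 on success, 0 if there is no '=', the key is empty,
   or either part does not fit in its buffer. */
int split_line(const char *line, char *key, size_t key_size,
               char *value, size_t value_size)
{
    const char *eq = strchr(line, '=');
    size_t klen, vlen;

    if (eq == NULL)
        return 0;
    klen = (size_t)(eq - line);
    vlen = strcspn(eq + 1, "\r\n");       /* stop at the end of the line */
    if (klen == 0 || klen >= key_size || vlen >= value_size)
        return 0;
    memcpy(key, line, klen);
    key[klen] = '\0';
    memcpy(value, eq + 1, vlen);
    value[vlen] = '\0';
    return 1;
}

/* Convert text to a long, rejecting empty strings and trailing junk. */
int parse_long(const char *text, long *out)
{
    char *end;
    long v;

    if (*text == '\0')
        return 0;
    v = strtol(text, &end, 10);
    if (*end != '\0')
        return 0;
    *out = v;
    return 1;
}

/* Check MAXJOBSIZE / DATABLOCKSIZE << MAXITER, reading "<<" as
   "at least factor times smaller" (the factor is an assumption). */
int job_size_reasonable(long max_job, long block_size, long max_iter, long factor)
{
    return block_size > 0 && (max_job / block_size) * factor < max_iter;
}
```

A caller would compare the returned key against the expected tokens with strcmp, ignore unknown keys, and abort if any expected token for a resource is still missing when the next RESOURCESEPARATOR or the end of the file is reached.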
Deliverables
C Source code
Documentation, including a brief section on weaknesses (if any) of your
approach. If your solution has no weaknesses, please explain why.
Evidence that your program has been adequately tested.
An answer to the
question posed at the end of
this document.
Guidelines for the documentation
Your documentation should be between 15 and 30 pages in length (excluding
source code listing). You will normally describe, in your own words,
the problem you are trying to solve; the solution you implemented, and
why that particular solution, rather than any other solution; problems
you encountered, and how you solved them; major data structures used
and the operations on those data structures; evidence that your program
works; an example session (with screen shots, if appropriate);
weaknesses of your approach (including things required but not
implemented); and future enhancements.
The documentation should also contain a section which reports
comparisons of the final vital statistics of the program (which should
be allowed to terminate normally) when it has been run with different
values in the configuration file for the odds of occurrence of different
events, different maximum loads and a different number of resources.
You should provide final statistics, an explanation of why the
statistics differ, and which configuration file appears to result in
"better system behaviour" for at least the following two experiments.
==> Each configuration file also contains references to CLOCKTICKS
which are the number of CPU clock ticks that are equivalent to a single
resource clock tick.
Experiment 1: Initial configuration file
INIT=g
MAXITER=3000
REQUEST=30
REQUESTNONSHARE=25
RESOURCESEPARATOR
RESOURCETYPE=PRINTER
RESOURCEID=1
SHAREABLE=0
DATABLOCKSIZE=5
MAXJOBSIZE=50
CLOCKTICKS=10
RESOURCESEPARATOR
RESOURCETYPE=DISK
RESOURCEID=2
SHAREABLE=1
CLOCKTICKS=3
DATABLOCKSIZE=250
MAXJOBSIZE=5000
RESOURCESEPARATOR
RESOURCETYPE=PRINTER
RESOURCEID=3
SHAREABLE=0
DATABLOCKSIZE=5
CLOCKTICKS=10
MAXJOBSIZE=50
RESOURCESEPARATOR
RESOURCETYPE=RAM
RESOURCEID=4
SHAREABLE=1
DATABLOCKSIZE=500
CLOCKTICKS=1
MAXJOBSIZE=10000
Experiment 2: Initial configuration file
INIT=g
MAXITER=5000
REQUEST=30
REQUESTNONSHARE=25
RESOURCESEPARATOR
RESOURCETYPE=PRINTER1
RESOURCEID=1
SHAREABLE=0
DATABLOCKSIZE=5
MAXJOBSIZE=50
CLOCKTICKS=10
RESOURCESEPARATOR
RESOURCETYPE=DISK
RESOURCEID=2
SHAREABLE=1
DATABLOCKSIZE=250
MAXJOBSIZE=5000
CLOCKTICKS=3
RESOURCESEPARATOR
RESOURCETYPE=PRINTER2
RESOURCEID=3
SHAREABLE=0
DATABLOCKSIZE=5
MAXJOBSIZE=50
CLOCKTICKS=8
RESOURCESEPARATOR
RESOURCETYPE=RAM
RESOURCEID=4
SHAREABLE=1
DATABLOCKSIZE=500
MAXJOBSIZE=10000
CLOCKTICKS=1
You should, of course, also test your program with more resources, fewer
and more processes, different data transfer rates, etc.
When the program terminates, one statistic
that is given for each
resource is the amount of time it was free (idle). Of course, it is not
efficient for resources to be idle for most of their time. Another
statistic is the average amount of time a request is waiting to be
serviced. It is not efficient for requests to be delayed indefinitely,
because the processes they belong to cannot continue executing,
resulting in many frustrated users. Experiment with combinations
of values and resources in the initial configuration files. Which
combinations appear to approach the ideal situation of all resources
being in constant use and requests being serviced almost immediately?
On average, and assuming a reasonable level of competence with C,
this program should take you approximately 2.5 days of effort to code
and test, and another 1.5 days of effort to document.
Have fun!