INTRODUCTION TO OPERATING SYSTEMS
Lecture 15: PROTECTION AND RELIABILITY
CHRIS STAFF
Dept. of Computer Science and Artificial Intelligence
University of Malta
Lecture Outline
Aims and Objectives
Motivation for Protection
Worms and Viruses
Protection Mechanisms
Capabilities
Reliability
Fault Avoidance
Error Detection
Fault Treatment
Error Recovery
Multilevel Error Handling
Summary
Aims and Objectives
- Motivation for Protection
- Protection Mechanisms
- Motivation for Reliability
- Faults and Errors
- A Standard Means of Handling Errors
Motivation for Protection
- Already looked at some protection issues
- Need to protect against faults and malice
Protection from Faults
- Faults occur due to software bugs or malfunctioning hardware
- User processes and OS need protection from these faults
- Example: 'War Games'
Protection from Malice
- Unauthorised attempts to access information / resources
- Difficult to distinguish between authorised and unauthorised access
attempts
Worms and Viruses
- A worm is a program that can travel across networks
- A virus infects a computer installation, consuming system resources and
potentially compromising data
- A virus can hide by attaching itself to another file - when the file is
copied the virus is copied with it!
- Viruses can be eliminated by sweeper programs
- Infection can be prevented by detecting attempts to modify supposedly
unmodifiable files
- Trusted software must be virus free
- But worms and viruses can access and infect computer systems by other means
if the system's internals are known by the programmer
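Detecting attempts to modify supposedly unmodifiable files can be sketched by recording a digest of each file and comparing it later. This is a minimal illustration only: the names fingerprint, record_baseline and modified_files are hypothetical, the files are modelled as in-memory byte strings rather than real files, and SHA-256 is an assumed choice of digest.

```python
import hashlib
from typing import Dict, List


def fingerprint(contents: bytes) -> str:
    """Return a SHA-256 digest of the file contents."""
    return hashlib.sha256(contents).hexdigest()


def record_baseline(files: Dict[str, bytes]) -> Dict[str, str]:
    """Record a digest for every file that is supposed to stay unmodified."""
    return {name: fingerprint(data) for name, data in files.items()}


def modified_files(baseline: Dict[str, str], files: Dict[str, bytes]) -> List[str]:
    """List files that are missing or whose contents no longer match the baseline."""
    return sorted(
        name
        for name, digest in baseline.items()
        if name not in files or fingerprint(files[name]) != digest
    )
```

A file that has had extra bytes attached to it no longer matches its recorded digest, so it shows up in the list returned by modified_files. The baseline itself must be stored where the infecting program cannot rewrite it.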
Protection Mechanisms
- External Security Mechanisms - access to building, fire- and water-proofing
building, etc.
- We will cover securing access to computer services through login and
protecting computer services against malicious access attempts
Secure Login
- Give authorised users password-protected accounts, and limit the number of
failed access attempts
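A password-protected account with a limit on failed attempts might look like the sketch below. The Account class, the limit of 3 attempts and the use of salted PBKDF2 hashing are all assumptions made for illustration; the lecture does not prescribe a particular scheme.

```python
import hashlib
import hmac
import os

MAX_FAILED_ATTEMPTS = 3  # assumed limit, chosen for illustration


class Account:
    """Hypothetical account that locks after too many failed logins."""

    def __init__(self, password: str):
        self.salt = os.urandom(16)
        self.digest = self._hash(password)  # store a digest, never the password
        self.failed_attempts = 0
        self.locked = False

    def _hash(self, password: str) -> bytes:
        return hashlib.pbkdf2_hmac("sha256", password.encode(), self.salt, 100_000)

    def login(self, password: str) -> bool:
        if self.locked:
            return False  # even the correct password is refused once locked
        if hmac.compare_digest(self._hash(password), self.digest):
            self.failed_attempts = 0
            return True
        self.failed_attempts += 1
        if self.failed_attempts >= MAX_FAILED_ATTEMPTS:
            self.locked = True
        return False
```

Once the account is locked, an administrator (or a timeout, in some systems) would have to unlock it; that step is left out of the sketch.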
Monitor Activity
- Keep a log of access attempts to sensitive information
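A log of access attempts can be as simple as an append-only list of records. The sketch below is illustrative only; AccessRecord, AuditLog and their fields are hypothetical names, and a real system would write the log to protected storage.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AccessRecord:
    subject: str   # who made the attempt
    obj: str       # what they tried to access
    granted: bool  # whether the attempt succeeded


class AuditLog:
    """Hypothetical append-only log of access attempts."""

    def __init__(self) -> None:
        self.records: List[AccessRecord] = []

    def record(self, subject: str, obj: str, granted: bool) -> None:
        self.records.append(AccessRecord(subject, obj, granted))

    def denied_attempts(self, subject: str) -> int:
        """Count refused attempts by one subject, e.g. to spot probing."""
        return sum(1 for r in self.records if r.subject == subject and not r.granted)
```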
Capabilities
- Resources are called Objects
- Processes are called Subjects
- In order for a subject to access an object, it must possess a
capability
- A capability is a token which allows the bearer to access the object to which
the capability applies
- Objects are created by subjects. When a subject creates an object, it also
creates a capability for the object and may pass the capability on to other
subjects
- The access rights carried by a capability may never be increased, only
copied or reduced
- The capability that is given to other subjects should comply with the
principle of least privilege
- Capabilities are created and modified using highly protected system
functions
- As long as all resources are protected using capabilities, it should not be
easy to gain unauthorised access to an object
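The rules above (a capability names an object, carries rights, and can only be copied or reduced) can be sketched as follows. The Capability class, the restrict method and the right names are hypothetical. In a real system capabilities are created and checked by protected system functions; plain Python cannot make a token unforgeable, so this only models the rules.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class Capability:
    """Hypothetical token: the object it applies to plus the rights it grants."""
    object_id: str
    rights: FrozenSet[str]

    def restrict(self, keep: Iterable[str]) -> "Capability":
        # Intersection can drop rights but never add any, so the
        # derived capability is never stronger than the original.
        return Capability(self.object_id, self.rights & frozenset(keep))

    def permits(self, right: str) -> bool:
        return right in self.rights


def access(cap: Capability, object_id: str, right: str) -> bool:
    """A subject may access an object only with a matching capability."""
    return cap.object_id == object_id and cap.permits(right)
```

Following the principle of least privilege, a creator holding read and write rights would hand another subject owner.restrict({"read"}) rather than a full copy. Asking restrict for a right the original lacks has no effect.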
Reliability
- "The reliability of an operating system is the degree to which it
meets its specifications in respect of service to its users, even when subjected
to unexpected and hostile conditions." (Lister & Eager, 1988.)
- Users trust a reliable system, but won't trust an unreliable one
Issues
- Graceful degradation
- Shield user / operator errors from other users / programs
- Build reliability into the design of the system - don't just add it
afterwards
- The degree of reliability should match the needs of the environment - high
reliability = high cost
Terminology
- Error - a departure of the system from the specified behaviour
- Fault - the cause of the error. Can be caused by users, user programs,
malfunctioning hardware. The error does not always occur immediately after the
fault.
- Damage - the extent to which corruption of the system has occurred as a
result of the fault
Fault Avoidance
- Reduce, but not eliminate, user command line errors by supplying a friendly
user interface
- Hardware problems can be reduced by using `zero defects' components or
replacing components at regular time intervals.
- Majority polling (replicate a component and accept the output that most
replicas agree on) - for High Reliability systems
- Reduce software faults by using standards and methodologies, and by
thoroughly testing applications
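Majority polling can be sketched as a vote over the outputs of replicated components. The function name majority_vote and the choice to raise an error when no value has a strict majority are assumptions made for this illustration.

```python
from collections import Counter
from typing import Hashable, Sequence


def majority_vote(outputs: Sequence[Hashable]) -> Hashable:
    """Return the value produced by more than half of the replicas."""
    if not outputs:
        raise ValueError("no replica outputs to vote on")
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 > len(outputs):
        return value
    # No strict majority: the fault cannot be masked by voting.
    raise RuntimeError("no majority among replicated components")
```

With three replicas, one faulty output is outvoted by the other two; if all three disagree, the vote fails and the fault has to be handled some other way.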
Error Detection
- Use redundant information - parity bits, etc.
- Use doubly-linked data structures for operating system structures
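Both kinds of redundancy can be illustrated briefly. The sketch below computes an even parity bit for a byte, and checks that the forward and backward pointers of a doubly-linked list agree. All names (even_parity_bit, is_intact, Node, links_consistent) are hypothetical, and a single parity bit only detects an odd number of flipped bits.

```python
from typing import Optional


def even_parity_bit(byte: int) -> int:
    """Bit that makes the total number of 1s (data plus parity) even."""
    return bin(byte & 0xFF).count("1") % 2


def is_intact(byte: int, parity: int) -> bool:
    """True if the stored parity bit still matches the data."""
    return even_parity_bit(byte) == parity


class Node:
    def __init__(self, value):
        self.value = value
        self.prev: Optional["Node"] = None
        self.next: Optional["Node"] = None


def links_consistent(head: Optional[Node]) -> bool:
    """Check that every forward link is mirrored by a backward link."""
    seen = set()
    node = head
    while node is not None:
        if id(node) in seen:
            return False  # a cycle means the list is corrupted
        seen.add(id(node))
        if node.next is not None and node.next.prev is not node:
            return False  # the redundant back pointer disagrees
        node = node.next
    return True
```

The back pointers carry no new information when the list is healthy, which is exactly why a disagreement between the two directions signals an error.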
Fault Treatment
- An error is a sign of a fault - fault may be obvious (broken large hardware)
or not (incorrect assignment in a program)
- A fault is easier to locate if the error is reproducible
- Repair faults as soon as they are discovered: either replace the hardware,
amend the program, or recover a correct version from backup
Error Recovery
- Must assess the damage first
- Can recover by aborting process and starting again, if cheap, or by
recovering from checkpoint - but watch out for synchronised processes!!
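Recovery from a checkpoint can be sketched as saving a copy of a process's state and restoring it after damage is detected. CheckpointedTask and its methods are hypothetical names, and the sketch covers a single process only: it does not deal with the synchronised-process problem noted above, where rolling back one process can leave the others inconsistent.

```python
import copy


class CheckpointedTask:
    """Hypothetical task whose state can be saved and rolled back."""

    def __init__(self, state: dict):
        self.state = state
        self._saved = copy.deepcopy(state)  # initial checkpoint

    def checkpoint(self) -> None:
        """Save a snapshot of the current (believed correct) state."""
        self._saved = copy.deepcopy(self.state)

    def rollback(self) -> None:
        """Discard the damaged state and resume from the last checkpoint."""
        self.state = copy.deepcopy(self._saved)
```

The deep copies matter: if the checkpoint shared structure with the live state, later damage would corrupt the checkpoint too.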
Multilevel Error Handling
- OS is made up of many layers providing different functions
- Calls to lower layers are made through well-defined system calls
- Operations of the lower layers are hidden from the upper layers, and the
user
- So should errors that occur in those layers!!
- If an error occurs in a layer, then that layer should attempt to repair it
(it may be a transient fault) and the error should only be reported back to the
calling layer if the fault is permanent
- Some errors should not be hidden from the user (e.g., division by zero, hard
disk space exceeded, etc) even though the OS could protect the user from them
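The rule that a layer should try to repair an error itself, and report it upwards only if the fault is permanent, can be sketched with a retry loop. TransientFault, PermanentFault, read_block_with_retry and the simulated device are all hypothetical, and the limit of 3 attempts is an assumption.

```python
from typing import Callable


class TransientFault(Exception):
    """Raised by the lower layer for a fault that may clear on retry."""


class PermanentFault(Exception):
    """Reported to the calling layer when retries do not help."""


def read_block_with_retry(read_block: Callable[[int], str],
                          block_no: int, max_attempts: int = 3) -> str:
    """Hide transient faults from the caller; report only permanent ones."""
    for _ in range(max_attempts):
        try:
            return read_block(block_no)
        except TransientFault:
            continue  # try again; the fault may have cleared
    raise PermanentFault(f"block {block_no} unreadable after {max_attempts} attempts")


def make_flaky_device(failures: int) -> Callable[[int], str]:
    """Simulated device that fails a given number of times, then works."""
    remaining = [failures]

    def read_block(block_no: int) -> str:
        if remaining[0] > 0:
            remaining[0] -= 1
            raise TransientFault()
        return f"data-{block_no}"

    return read_block
```

A device that fails twice and then succeeds looks perfectly healthy to the calling layer, while a device that keeps failing surfaces as a single PermanentFault.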
Summary
- Want to protect Computer System from faults and malice
- Worms and Viruses, and how to protect against them
- Mechanisms that can be implemented to protect computer system
- Capabilities, as a standard means of encoding protection information
- Why computer environments should be reliable
- How computer environments can be made reliable (through Fault Avoidance, Error
Detection, Fault Treatment and Error Recovery)
- How to protect the layers of the OS, and the user, from faults and errors.
Next Lecture...
Case Study: UNIX