INTRODUCTION
TO
OPERATING SYSTEMS
Lecture 17: INTRODUCTION TO DISTRIBUTED OPERATING SYSTEMS
CHRIS STAFF
Dept. of Computer Science and Artificial Intelligence
University of Malta
Lecture Outline
Aims and Objectives
What's a Distributed System?
Motivation
Hardware Concepts
Bus-Based Multiprocessors
Switched Multiprocessors
Bus-based multicomputers
Switched Multicomputers
Software
Types of Operating System
Design Issues
Aims and Objectives
- What's a Distributed System?
- General Issues
- Hardware for Distributed Systems
- Software for Distributed Systems
- Design Issues
What's a Distributed System?
- A collection of independent computers which can cooperate, but which appear
to users of the system as a uniprocessor computer.
- Two aspects: hardware and software
Examples
- Users sharing a processor pool. System dynamically decides where processes
are executed
- Distributed Banking
Motivation
Advantages of Distributed over Centralised Systems
- Economics - 10,000 CPUs each executing 50 MIPS yield a system executing 500,000
MIPS, which no single CPU can achieve.
- Some problems need distributed solutions - e.g., Computer-Supported Cooperative Work (CSCW)
- Reliability of Distributed Systems, Load Sharing
- Incremental Growth
Advantages of Distributed Systems over Independent PCs
- Sharing of Data and Resources
- Communication
- Flexibility (e.g., I have a Mac, but can also use the department's compute power
when I need it)
Disadvantages of Distributed Systems
- Software
- Networking - overloading, data loss
- Security
Hardware Concepts
- Although all Distributed Systems consist of multiple CPUs, they differ in how
the CPUs are interconnected and how they communicate
- Flynn (1972) identified two essential characteristics to classify multiple
CPU computer systems: the number of instruction streams and the number of data
streams
Uniprocessors are SISD (Single Instruction stream, Single Data stream)
Array processors are SIMD (Single Instruction, Multiple Data streams) - processors
cooperate on a single problem
MISD (Multiple Instruction, Single Data stream) - no known computer fits this model
Distributed Systems are MIMD (Multiple Instruction, Multiple Data streams) - a group
of independent computers, each with its own program counter, program and data
- MIMD can be split into two classifications: multiprocessors, which share a common
memory, and multicomputers, in which each machine has its own private memory
- Each of these can be further subclassified by interconnect as
Bus - All machines connected by single medium (e.g., LAN, bus, backplane,
cable)
Switched - Individual wires from machine to machine, with possibly different wiring
patterns (e.g., the Internet)
- A further classification is
Tightly-coupled - short delay in communication between computers, high data
rate (e.g., parallel computers working on related computations)
Loosely-coupled - large delay in communication, low data rate (e.g., Distributed
Systems working on unrelated computations)
Bus-Based Multiprocessors
Switched Multiprocessors
- For more than 64 CPUs
- Split memory into smaller modules
- Connect all CPUs to all memory modules; two common methods are
- Crossbar switch
- Omega network
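To make the two methods concrete: a crossbar needs n x n crosspoint switches, whereas
an omega network routes each request through log2(n) stages of 2 x 2 switches. The
following sketch (not from the lecture; layout and names are illustrative) traces
destination-tag routing through an 8-way omega network, where at each stage one bit
of the destination memory module's address selects a switch output.

/* Destination-tag routing through an omega network connecting
 * N CPUs to N memory modules (N a power of two).
 * Toy illustration only; names and layout are invented. */
#include <stdio.h>

#define STAGES 3              /* log2(N) */
#define N      (1 << STAGES)  /* 8 CPUs, 8 memory modules */

/* Perfect shuffle: rotate the low STAGES bits left by one. */
static unsigned shuffle(unsigned x)
{
    unsigned msb = (x >> (STAGES - 1)) & 1u;
    return ((x << 1) | msb) & (N - 1);
}

/* Print the path a request takes from CPU `src` to memory module `dst`. */
static void route(unsigned src, unsigned dst)
{
    unsigned line = src;
    printf("CPU %u -> memory %u:", src, dst);
    for (int stage = 0; stage < STAGES; stage++) {
        line = shuffle(line);                      /* wiring between stages   */
        unsigned bit = (dst >> (STAGES - 1 - stage)) & 1u;
        line = (line & ~1u) | bit;                 /* 2x2 switch: pick output */
        printf(" stage %d -> line %u;", stage, line);
    }
    printf(" arrived at %u\n", line);              /* line == dst on arrival  */
}

int main(void)
{
    route(2, 5);   /* example: CPU 2 reads from memory module 5 */
    route(7, 0);
    return 0;
}

For comparison, with 1024 CPUs and 1024 memory modules a crossbar needs about a
million crosspoints, while an omega network needs 10 stages of 512 switches each.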
Bus-based multicomputers
Switched Multicomputers
- Each CPU has direct and exclusive access to its own private memory
Software
- Operating Systems for multiprocessors and multicomputers
- Tightly-coupled and Loosely-coupled operating systems
- Loosely-coupled
E.g., a LAN where users have their own independent machines, but are still able
to interact in a limited way when necessary
- Tightly-coupled
E.g., a multiprocessor dedicated to solving a particular problem in parallel
Types of Operating System
Network Operating Systems
- Loosely-coupled software on loosely-coupled hardware
- E.g., LAN with file server
- Users are aware that they are using independent hardware, but share a
consistent view of the filing system with other network users
True Distributed Systems
- Tightly-coupled software on loosely-coupled hardware
- No known commercial examples
- Give users impression that collection of computers is a single timesharing
system - the virtual uniprocessor
- Processes can be executed on any computer on the network, without users
realising
- Characteristics
Global interprocess communication mechanism
Global protection scheme
Uniform Process Management system
Uniform File System
Identical Operating System kernels resident on each computer - kernel
responsible for memory management and process scheduling on each computer
Multiprocessor Timesharing Systems
- Tightly-coupled software on tightly-coupled hardware
- E.g., UNIX workstation with several CPUs
- Single RUN Queue
- Scheduler must run as a critical section to prevent two CPUs from selecting
the same process to run (see the sketch after this list)
- Doesn't matter which CPU executes a process, because all CPUs share the same
memory
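A rough sketch of the critical-section point above (a toy illustration, not the
lecture's code; all names are invented): each "CPU" is a thread that must hold a lock
on the shared run queue before dequeuing, so no two CPUs can select the same process.

/* Toy sketch of a single run queue shared by several CPUs.
 * The dequeue is a critical section, so two CPUs can never
 * select the same process. Illustrative only. */
#include <pthread.h>
#include <stdio.h>

#define NPROCS 10
#define NCPUS  4

static int run_queue[NPROCS];               /* process IDs waiting to run */
static int head = 0, tail = NPROCS;
static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;

/* Pick the next process, or -1 if the queue is empty. */
static int dequeue_next_process(void)
{
    int pid = -1;
    pthread_mutex_lock(&queue_lock);        /* enter critical section */
    if (head < tail)
        pid = run_queue[head++];
    pthread_mutex_unlock(&queue_lock);      /* leave critical section */
    return pid;
}

static void *cpu_loop(void *arg)
{
    long cpu = (long)arg;
    int pid;
    while ((pid = dequeue_next_process()) != -1)
        printf("CPU %ld runs process %d\n", cpu, pid);
    return NULL;
}

int main(void)
{
    pthread_t cpus[NCPUS];
    for (int i = 0; i < NPROCS; i++)
        run_queue[i] = i;                   /* fill the run queue */
    for (long c = 0; c < NCPUS; c++)
        pthread_create(&cpus[c], NULL, cpu_loop, (void *)c);
    for (int c = 0; c < NCPUS; c++)
        pthread_join(cpus[c], NULL);
    return 0;
}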
Design Issues
Transparency
- Giving the impression that the processor pool is acting as a uniprocessor
(WWW is a good-ish example)
- Location Transparency - don't need to know where resources are
- Migration Transparency - Resources can be moved without their names being
changed
- Replication Transparency - System free to make multiple copies of files
without users being affected (e.g., caching)
- Concurrency Transparency - Multiple users can share resources automatically
- Parallelism Transparency - Activities can happen in parallel without users
knowing
Flexibility
- Kernel should do as little as possible. User-level servers provide the bulk of
operating system services
- In this case, the kernel is called the microkernel
- Microkernel responsible for
Interprocess communication
Some memory management
Some low-level process management and scheduling
Low-level input and output
- All other services are obtained by sending messages to the appropriate
server
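As a toy sketch of this message-passing structure (not a real microkernel API; the
message layout and names are invented), a client obtains a file-system service by
building a request message and handing it to a user-level file server; a plain
function call stands in here for the kernel's IPC primitive.

/* Toy illustration of the microkernel idea: a client gets a service
 * by sending a message to a user-level server rather than trapping
 * into a large kernel. All names are invented for illustration. */
#include <stdio.h>
#include <string.h>

enum msg_op { OP_READ, OP_WRITE };

struct message {
    enum msg_op op;          /* requested operation           */
    char        path[64];    /* which file                    */
    char        data[128];   /* reply (or write) payload      */
    int         status;      /* 0 = OK, non-zero = error code */
};

/* User-level file server: handles one request message. */
static void file_server_handle(struct message *m)
{
    if (m->op == OP_READ) {
        snprintf(m->data, sizeof m->data, "<contents of %s>", m->path);
        m->status = 0;
    } else {
        m->status = -1;      /* writes not implemented in this toy */
    }
}

/* Stand-in for the microkernel's send/receive IPC primitive. */
static void send_to_file_server(struct message *m)
{
    file_server_handle(m);
}

int main(void)
{
    struct message req = { .op = OP_READ };
    strncpy(req.path, "/etc/motd", sizeof req.path - 1);

    send_to_file_server(&req);               /* "system call" by message */
    printf("status=%d data=%s\n", req.status, req.data);
    return 0;
}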
Advantages
- Modular
- Every client has equal opportunity to access servers, regardless of
location
Reliability
- Availability
- Protection against unauthorised access and data loss
- Data consistency if maintaining multiple copies of data
- Fault tolerance
Performance
- A Distributed System should not run an application slower than it would run
on an independent computer
- How is performance measured? E.g., response time, throughput, system
utilisation, network capacity consumed, etc.
- But performance is affected by communications...
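As a back-of-the-envelope sketch (numbers made up): if a job needs T seconds of
computation and each of the n cooperating machines also pays a fixed communication
cost of c seconds, the elapsed time is roughly T/n + c, so the speedup T / (T/n + c)
can never exceed T/c no matter how many machines are added.

/* Made-up numbers: speedup of a job split across n machines when each
 * machine also pays a fixed communication cost. Shows why communication
 * limits the performance of a distributed system. */
#include <stdio.h>

int main(void)
{
    double compute_s = 100.0;   /* total CPU time needed, seconds      */
    double comm_s    = 2.0;     /* per-machine communication overhead  */

    for (int n = 1; n <= 64; n *= 2) {
        double elapsed = compute_s / n + comm_s;   /* time with n machines */
        printf("n=%2d  elapsed=%6.2fs  speedup=%5.2f\n",
               n, elapsed, compute_s / elapsed);
    }
    return 0;
}

With T = 100 s and c = 2 s the speedup levels off well below the machine count,
approaching 50 regardless of how many machines are used.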
Scalability
- Current distributed systems are designed to work with a few hundred CPUs.
Will they still work with thousands or hundreds of thousands?
- Don't centralise components
- Don't centralise tables
- Don't centralise algorithms
- Don't assume all machines are synchronised
Next Lecture...
Communication in Distributed Operating
Systems