Tour of Computer-Systems

Compilation

Compiler: Program are translated into different forms.

A classic compiler work in following sequence:

  1. Preprocessing Phase: Source program (hello.c) $$\Rightarrow$$ Pre-Processor (cpp) $\Rightarrow$ Modified source program (hello.i).
  2. Compilation Phase: Compiler (cc1) $$\Rightarrow$$ Assembly program (hello.s).
  3. Assembly Phase: Assembler (as) $$\Rightarrow$$ Relocatable object programs (hello.o).
  4. Linking Phase: Linker (ld) $$\Rightarrow$$ Executable object program (hello)

We need to understand compilation systems work:

  1. Optimizing program performance;
  2. Understanding link-time errors;
  3. Avoiding security holes;

Hardware Organization

Buses (总线):

  • Buses are typically designed to transfer fixed-sized chunks of bytes known as word;
  • 一个 word 的长度是计算机系统的基本度量单位,大多数现代机器都是 4 bytes (32-bits machine) 或 8 bytes (64-bits machine);

I/O Devices (IO 设备):

  • Each I/O device is connected to the I/O bus by either a controller or an adapter:
    1. controllers: chip sets in the device itself or on the motherboard;
    2. adapter: a card that plugs into a slot on the motherboard;

Main Memory (主存):

  • Main memory consists of a collection of dynamic random access memory (DRAM) chips

Processor, central processing unit (CPU):

  • CPU 核心由一个 word 大小的寄存器 PC (program counter),它指向主存中的某个机器指令。
  • 到目前为止,大部分计算机都是单处理器机器(which is known as a uniprocessor system),相对于多处理器机器(multiprocessor system),现在都是用多核单处理器系统。

Storage Hierarchy

L0: Register;

L1: L1 cache, SRAM;

L2: L2 cache, SRAM;

L3: L3 cache, SRAM;

L4: Main memory, DRAM;

L5: Local secodary storage, local disks;

L6: Remote secondary storage (distributed file systems, web servers);

Operating System

Parallelism

并发 Concurrency 与并行 Prallelism:

  • We use the term concurrency to refer to the general concept of a system with multiple simultaneous activities.
  • The term parallelism to refer to the use of concurrency to make a system run faster.

三层并行,通过三种不同类型的并发技术实现:

  1. Thread-Level Concurrency:线程是操作系统调度的基本单位。线程间的切换大约需要 20000 左右的时钟周期;
  2. Instruction-Level Parallelism:有的处理器一个时钟周期可以处理同时处理多个指令(被称为 superscalar processors),同时理论上也可以通过流水线设计来实现指令级别的并行;
  3. Single-Instruction Multiple-Data (SIMD) Parallelism:有的处理器提供了执行一个指令,得到多个输入数据结果的功能,比如最新的 AMD 和 Intel 处理器都提供了同时处理四对浮点数运算的功能。