Compilation
Compiler: programs are translated into different forms.
A classic compilation system works in the following sequence:
- Preprocessing Phase: Source program (hello.c) $\Rightarrow$ Preprocessor (cpp) $\Rightarrow$ Modified source program (hello.i).
- Compilation Phase: Compiler (cc1) $\Rightarrow$ Assembly program (hello.s).
- Assembly Phase: Assembler (as) $\Rightarrow$ Relocatable object program (hello.o).
- Linking Phase: Linker (ld) $\Rightarrow$ Executable object program (hello).
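To make the four phases concrete, here is a minimal sketch of what a source file like hello.c might contain, with the matching GCC driver invocations in comments. The macro GREETING and the functions greeting() and greeting_length() are hypothetical names chosen for illustration. The -E, -S, and -c flags are standard GCC options that stop the driver after preprocessing, compilation, and assembly respectively.

```c
/* hello.c: a hypothetical, minimal translation unit for following the phases.
 *
 *   gcc -E hello.c -o hello.i    preprocessing: expand #include and #define
 *   gcc -S hello.i -o hello.s    compilation:   translate C to assembly
 *   gcc -c hello.s -o hello.o    assembly:      produce a relocatable object
 *   gcc hello.o ... -o hello     linking:       needs a definition of main
 */
#include <string.h>  /* strlen; its declarations are pasted into hello.i */

/* The preprocessor replaces every use of GREETING with this string literal. */
#define GREETING "hello, world\n"

/* Returns the greeting text. In hello.i the body reads
 * return "hello, world\n"; because the macro has already been expanded. */
const char *greeting(void)
{
    return GREETING;
}

/* Returns the number of characters in the greeting, including the newline. */
size_t greeting_length(void)
{
    return strlen(GREETING);
}
```

This file deliberately defines no main, so the first three commands succeed on their own, while the link step typically fails with an undefined reference to main unless another object file supplies one. That is a small instance of the link-time errors that knowing the compilation system helps diagnose.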
We need to understand how compilation systems work for the following reasons:
- Optimizing program performance;
- Understanding link-time errors;
- Avoiding security holes;
Hardware Organization
Buses:
- Buses are typically designed to transfer fixed-size chunks of bytes known as words;
- The word size (the number of bytes in a word) is a fundamental system parameter. Most modern machines have a word size of either 4 bytes (32-bit machines) or 8 bytes (64-bit machines);
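As a rough way to check which case a given build falls into, the sketch below reports the size of a pointer. Treating pointer size as the word size is an assumption that holds on common 32-bit and 64-bit platforms but is not guaranteed by the C standard. The function names word_size_bytes() and word_size_bits() are hypothetical.

```c
#include <limits.h>  /* CHAR_BIT: number of bits in a byte */
#include <stddef.h>  /* size_t */

/* Size of a pointer in bytes, used here as a stand-in for the word size.
 * Assumption: the C standard itself does not define a "word size". */
size_t word_size_bytes(void)
{
    return sizeof(void *);
}

/* The same quantity expressed in bits. */
size_t word_size_bits(void)
{
    return word_size_bytes() * CHAR_BIT;
}
```

On a typical 64-bit Linux or macOS build word_size_bytes() returns 8. Compiling the same code for a 32-bit x86 target (for example with gcc -m32) typically yields 4.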
I/O Devices:
- Each I/O device is connected to the I/O bus by either a controller or an adapter:
  - controller: a chip set in the device itself or on the motherboard;
  - adapter: a card that plugs into a slot on the motherboard;
Main Memory:
- Main memory consists of a collection of dynamic random access memory (DRAM) chips;
Processor, central processing unit (CPU):
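- At the core of the CPU is a word-size register called the program counter (PC), which points at some machine instruction in main memory;
- For a long time most computers were single-processor machines (known as uniprocessor systems), as opposed to multiprocessor systems. Today, multi-core processors are the norm;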
Storage Hierarchy
L0: Register;
L1: L1 cache, SRAM;
L2: L2 cache, SRAM;
L3: L3 cache, SRAM;
L4: Main memory, DRAM;
L5: Local secondary storage, local disks;
L6: Remote secondary storage (distributed file systems, web servers);
Operating System
Parallelism
Concurrency and Parallelism:
- We use the term concurrency to refer to the general concept of a system with multiple simultaneous activities.
- We use the term parallelism to refer to the use of concurrency to make a system run faster.
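Parallelism is exploited at three levels, each built on a different kind of concurrency technique:
- Thread-Level Concurrency: the thread is the basic unit the operating system schedules. Switching between threads costs roughly 20,000 clock cycles;
- Instruction-Level Parallelism: some processors can execute more than one instruction per clock cycle (these are known as superscalar processors). Pipelining is another way to achieve instruction-level parallelism;
- Single-Instruction Multiple-Data (SIMD) Parallelism: some processors provide instructions that apply one operation to several data inputs at once. For example, recent AMD and Intel processors have instructions that operate on four pairs of floating-point numbers in parallel.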