分布式存储的困难点:
- 高性能:High performance in many server;
- 多机器:System with many machine could cause “constant fault”;
- 一致性错误:To avoid contact fault, we will need replication;
- 数据同步:During the replication, potential inconsistencies will occur;
- 一致性与高性能矛盾:To get better consistency, low performance occur;
分布式存储的一个大的课题就是在“一致性”与“高性能”之间的 tradeoff
.
GFS Master:
- RAM 中存储:
- 一个
filename
到handlers array
的映射表; - 每个
handler
都包含version
/chunk servers list
/primary
/least time
信息;
- 一个
- 磁盘中存储着
log
/checkpoint
;
READ,客户端读取流程:
- C send
filename
andoffset
to the master M; - M finds
chunk handle
for thatoffset
; - M replies list of
chunkservers
(aka CS) only withlastest versions
; - C caches this response;
- C sends request to
nearest CS
withoffset
; - CS read the chuck file from disk and returns;
APPEND,客户端追加流程:
C ask M about
filename
’s last chunk;If M see
filename
has noprimary
hanlder:- Pick a
primary
(aka P) andsecondaries
(aka S) with latest version (Only these server was allowed to handle storing filename); - Increment
lastest version
; - Tell P/S who they are;
- Pick a
M response C with
primary
,secondaries
andversion
;C sends append data to all:
- The paper change thier tone after, C only sends data to the nearest replica and chain the data to all replicas which can reduce trans cost;
C tells
primary
P to executeappend
;P checks that lease hasn’t expired, and chunk has space. And then P picks an
offset
(at end of chunk), writes chunk file (a Linux file) at the offset;P tells each secondary the
offset,
tells to exexuteappend
to chunk fileP waits for all secondaries to reply, or timeout;
P tells C “ok” or “error”. C retries from start if error
Split Brain 问题:
- 描述:一个
filename
同时对应了多个primary
处理; - 原因:通常是因为 network partition 导致的(部分机器之间无法通信);
- 解决方案:在指派
primary
时同时指派一个lease
表示过期时间,master
与primary
同时维护这个过期时间,过期后master
重新指派primary
;
总结:GFS 优点:
- Global cluster file system as universal infrastructure;
- Seperation of naming system (master) and storage system (chunk servers)
- Sharding file into chunk for parallel throughput;
- Huge files/chunks to reduce overheads;
- Designation primary to achieve sequential writes;
- Lease to prevent split-brain chunk servers primaries;
GFS 缺点:
- Single master performance: ran out of RAM and CPU
- Chunkservers not very efficient for small files
- Lack of automatic fail-over to master replica;
- Maybe consistency was too relaxed