Draft LM
Version 1 (Lois Taulelle, 04/02/2016 11:52)
1 | 1 | Lois Taulelle | h1. Draft |
---|---|---|---|
2 | 1 | Lois Taulelle | |
3 | 1 | Lois Taulelle | § in no particular order |
4 | 1 | Lois Taulelle | |
5 | 1 | Lois Taulelle | |
6 | 1 | Lois Taulelle | h2. Choosing the hardware |
7 | 1 | Lois Taulelle | |
8 | 1 | Lois Taulelle | h3. Disks! More disks! |
9 | 1 | Lois Taulelle | |
10 | 1 | Lois Taulelle | This file system was designed first and foremost with servers in mind, with disks, plural (lots of disks), so wanting to run ZFS on a single hard drive borders on the ridiculous. The latest ZoL release (RC14) settles the matter for good: a pool can only be built from whole disks, not from partitions (reference needed). |
11 | 1 | Lois Taulelle | |
12 | 1 | Lois Taulelle | multipath, for Kevin: |
13 | 1 | Lois Taulelle | |
14 | 1 | Lois Taulelle | http://anothersysadmin.wordpress.com/2008/11/17/howto-debian-and-scsi-multipathing-with-multipath-tools/ |
15 | 1 | Lois Taulelle | |
16 | 1 | Lois Taulelle | h2. Treats for system administrators |
17 | 1 | Lois Taulelle | |
18 | 1 | Lois Taulelle | h3. Holy cow! Copy-On-Write! |
19 | 1 | Lois Taulelle | |
20 | 1 | Lois Taulelle | All ZFS operations are copy-on-write transactions, so the on-disk state is always valid. |
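To make the copy-on-write idea concrete, here is a minimal toy model in Python. It is only a sketch of the principle, not how ZFS is implemented: the `CowStore` class and its methods are hypothetical names invented for this illustration. New data always goes to fresh blocks, and a transaction becomes visible only when a single root pointer is switched, so a reader holding the old root still sees a complete, consistent state.

```python
class CowStore:
    """Toy copy-on-write store (hypothetical): blocks are never overwritten in place."""

    def __init__(self):
        self.blocks = {}      # block id -> stored value (data bytes or a block map)
        self.next_id = 0
        self.root = None      # id of the block map that is currently committed

    def _alloc(self, data):
        """Store data in a brand-new block and return its id."""
        block_id = self.next_id
        self.next_id += 1
        self.blocks[block_id] = data
        return block_id

    def commit(self, changes):
        """Apply {name: bytes} as one transaction and return the new root id."""
        old_map = self.blocks[self.root] if self.root is not None else {}
        new_map = dict(old_map)                 # copy the map, never mutate the old one
        for name, data in changes.items():
            new_map[name] = self._alloc(data)   # new data lands in fresh blocks
        new_root = self._alloc(new_map)
        self.root = new_root                    # the single pointer switch is the commit
        return new_root

    def read(self, name, root=None):
        """Read a name from the current root, or from an older root if given."""
        block_map = self.blocks[self.root if root is None else root]
        return self.blocks[block_map[name]]


store = CowStore()
r1 = store.commit({"a": b"v1"})
r2 = store.commit({"a": b"v2"})
print(store.read("a"))           # current state
print(store.read("a", root=r1))  # the previous state is still intact
```

Because the old root `r1` is never modified, keeping a reference to it behaves like a cheap snapshot in this toy model.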
21 | 1 | Lois Taulelle | |
22 | 1 | Lois Taulelle | |
23 | 1 | Lois Taulelle | h3. It's Troll Friday, anything goes! |
24 | 1 | Lois Taulelle | |
25 | 1 | Lois Taulelle | *Why btrfs is theoretically better, while ZFS is practically better* |
26 | 1 | Lois Taulelle | |
27 | 1 | Lois Taulelle | There is no real competition between the two: one is usable now, with all the good things, but is not mainstream. The other will be usable soon and will be 1000% mainstream, so it has to be watched closely. |
28 | 1 | Lois Taulelle | |
29 | 1 | Lois Taulelle | btrfs: Pre-history |
30 | 1 | Lois Taulelle | |
31 | 1 | Lois Taulelle | Imagine you are a Linux file system developer. It's 2007, and you are at the Linux Storage and File systems workshop. Things are looking dim for Linux file systems: Reiserfs, plagued with quality issues and an unsustainable funding model, has just lost all credibility with the arrest of Hans Reiser a few months ago. ext4 is still in development; in fact, it isn't even called ext4 yet. Fundamentally, ext4 is just a straightforward extension of a 30-year-old format and is light-years behind the competition in terms of features. At the same time, companies are clamping down on funding for Linux development; IBM's Linux division is coming to the end of its grace period and needs to show profitability now. Other companies are catching wind of an upcoming recession and are cutting research across the board. They want projects with time to results measured in months, not years. |
32 | 1 | Lois Taulelle | |
33 | 1 | Lois Taulelle | Ever hopeful, the file systems developers are meeting anyway. Since the workshop is co-located with USENIX FAST '07, several researchers from academia and industry are presenting their ideas to the workshop. One of them is Ohad Rodeh. He's invented a kind of btree that is copy-on-write (COW) friendly [PDF]. To start with, btrees in their native form are wildly incompatible with COW. The leaves of the tree are linked together, so when the location of one leaf changes (via a write - which implies a copy to a new block), the link in the adjacent leaf changes, which triggers another copy-on-write and location change, which changes the link in the next leaf... The result is that the entire btree, from top to bottom, has to be rewritten every time one leaf is changed. |
34 | 1 | Lois Taulelle | |
35 | 1 | Lois Taulelle | Rodeh's btrees are different: first, he got rid of the links between leaves of the tree - which also "throws out a lot of the existing b-tree literature", as he says in his slides [PDF] - but keeps enough btree traits to be useful. (This is a fairly standard form of btrees in file systems, sometimes called "B+trees".) He added some algorithms for traversing the btree that take advantage of reference counts to limit the amount of the tree that has to be traversed when deleting a snapshot, as well as a few other things, like proactive split and merge of interior nodes so that inserts and deletes don't require any backtracking. The result is a simple, robust, generic data structure which very efficiently tracks extents (groups of contiguous data blocks) in a COW file system. Rodeh successfully prototyped the system some years ago, but he's done with that area of research and just wants someone to take his COW-friendly btrees and put them to good use. |
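The effect of dropping the links between leaves can be illustrated with path copying: when nodes only point downwards, changing one leaf requires copying just the nodes on the path from the root to that leaf, and everything else is shared with the previous version. The sketch below is an assumption-laden simplification: it uses a plain binary search tree rather than a real B-tree, and `Node` and `insert` are hypothetical names, not code from btrfs or from Rodeh's work.

```python
from typing import NamedTuple, Optional


class Node(NamedTuple):
    """Immutable tree node with downward pointers only (no sibling links)."""
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(node, key, copied):
    """Return a new root containing key; append every newly created node to copied."""
    if node is None:
        new = Node(key)
    elif key < node.key:
        new = node._replace(left=insert(node.left, key, copied))
    elif key > node.key:
        new = node._replace(right=insert(node.right, key, copied))
    else:
        return node  # key already present: reuse this node as-is
    copied.append(new)
    return new


# Build a small tree of 7 keys, then insert one more key copy-on-write style.
root = None
for k in (50, 30, 70, 20, 40, 60, 80):
    root = insert(root, k, [])

copied = []
new_root = insert(root, 65, copied)

print(len(copied))                 # nodes written for this single insert
print(new_root.left is root.left)  # the untouched subtree is shared, not copied
```

The old `root` is left untouched, so it remains a valid view of the tree as it was before the insert.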
36 | 1 | Lois Taulelle | |
37 | 1 | Lois Taulelle | + btrfs: A brief comparison with ZFS |
38 | 1 | Lois Taulelle | |
39 | 1 | Lois Taulelle | algorithmic difference (https://lwn.net/Articles/342892/ , btrfs: A brief comparison with ZFS) |
40 | 1 | Lois Taulelle | |
41 | 1 | Lois Taulelle | * pros/cons: http://rudd-o.com/linux-and-free-software/ways-in-which-zfs-is-better-than-btrfs ? (loïs: in my opinion, a bad idea; they may be comparable, but they are not competitors) |
42 | 1 | Lois Taulelle | |
43 | 1 | Lois Taulelle | We observed a striking difference, which we cannot really explain other than by the internal algorithms: btrfs is faster at writing, whereas ZFS is better at reading (on the same hardware platform). |
44 | 1 | Lois Taulelle | |
45 | 1 | Lois Taulelle | h3. ZFS on partitions: not good! |
46 | 1 | Lois Taulelle | |
47 | 1 | Lois Taulelle | <pre> |
48 | 1 | Lois Taulelle | root@ocean:~# zpool status |
49 | 1 | Lois Taulelle | pool: data |
50 | 1 | Lois Taulelle | state: UNAVAIL |
51 | 1 | Lois Taulelle | status: One or more devices could not be used because the label is missing |
52 | 1 | Lois Taulelle | or invalid. There are insufficient replicas for the pool to continue |
53 | 1 | Lois Taulelle | functioning. |
54 | 1 | Lois Taulelle | action: Destroy and re-create the pool from |
55 | 1 | Lois Taulelle | a backup source. |
56 | 1 | Lois Taulelle | see: http://zfsonlinux.org/msg/ZFS-8000-5E |
57 | 1 | Lois Taulelle | scan: none requested |
58 | 1 | Lois Taulelle | config: |
59 | 1 | Lois Taulelle | |
60 | 1 | Lois Taulelle | NAME STATE READ WRITE CKSUM |
61 | 1 | Lois Taulelle | data UNAVAIL 0 0 0 insufficient replicas |
62 | 1 | Lois Taulelle | raidz2-0 UNAVAIL 0 0 0 insufficient replicas |
63 | 1 | Lois Taulelle | sdc3 FAULTED 0 0 0 corrupted data |
64 | 1 | Lois Taulelle | sdd3 FAULTED 0 0 0 corrupted data |
65 | 1 | Lois Taulelle | sdh3 FAULTED 0 0 0 corrupted data |
66 | 1 | Lois Taulelle | sde3 FAULTED 0 0 0 corrupted data |
67 | 1 | Lois Taulelle | sdg3 ONLINE 0 0 0 |
68 | 1 | Lois Taulelle | sdf3 ONLINE 0 0 0 |
69 | 1 | Lois Taulelle | root@ocean:~# zpool export data |
70 | 1 | Lois Taulelle | root@ocean:~# zpool status |
71 | 1 | Lois Taulelle | no pools available |
72 | 1 | Lois Taulelle | root@ocean:~# zpool import data |
73 | 1 | Lois Taulelle | cannot import 'data': pool may be in use from other system |
74 | 1 | Lois Taulelle | use '-f' to import anyway |
75 | 1 | Lois Taulelle | root@ocean:~# zpool import -f data |
76 | 1 | Lois Taulelle | root@ocean:~# zpool status |
77 | 1 | Lois Taulelle | pool: data |
78 | 1 | Lois Taulelle | state: ONLINE |
79 | 1 | Lois Taulelle | scan: scrub repaired 0 in 0h21m with 0 errors on Fri Mar 1 04:36:07 2013 |
80 | 1 | Lois Taulelle | config: |
81 | 1 | Lois Taulelle | |
82 | 1 | Lois Taulelle | NAME STATE READ WRITE CKSUM |
83 | 1 | Lois Taulelle | data ONLINE 0 0 0 |
84 | 1 | Lois Taulelle | raidz2-0 ONLINE 0 0 0 |
85 | 1 | Lois Taulelle | sdd3 ONLINE 0 0 0 |
86 | 1 | Lois Taulelle | sde3 ONLINE 0 0 0 |
87 | 1 | Lois Taulelle | sdc3 ONLINE 0 0 0 |
88 | 1 | Lois Taulelle | sdh3 ONLINE 0 0 0 |
89 | 1 | Lois Taulelle | sdg3 ONLINE 0 0 0 |
90 | 1 | Lois Taulelle | sdf3 ONLINE 0 0 0 |
91 | 1 | Lois Taulelle | spares |
92 | 1 | Lois Taulelle | sdb3 AVAIL |
93 | 1 | Lois Taulelle | sda3 AVAIL |
94 | 1 | Lois Taulelle | |
95 | 1 | Lois Taulelle | errors: No known data errors |
96 | 1 | Lois Taulelle | </pre> |
97 | 1 | Lois Taulelle | |
98 | 1 | Lois Taulelle | The first time, it is scary. The following times, it is just annoying. |
99 | 1 | Lois Taulelle | |
100 | 1 | Lois Taulelle | You can force a scrub, just to be sure: |
101 | 1 | Lois Taulelle | |
102 | 1 | Lois Taulelle | <pre> |
103 | 1 | Lois Taulelle | root@ocean:~# zpool scrub data |
104 | 1 | Lois Taulelle | |
105 | 1 | Lois Taulelle | [wait ~5 mn] |
106 | 1 | Lois Taulelle | |
107 | 1 | Lois Taulelle | root@ocean:~# zpool status |
108 | 1 | Lois Taulelle | pool: data |
109 | 1 | Lois Taulelle | state: ONLINE |
110 | 1 | Lois Taulelle | scan: scrub in progress since Fri Mar 1 11:18:49 2013 |
111 | 1 | Lois Taulelle | 105G scanned out of 713G at 292M/s, 0h35m to go |
112 | 1 | Lois Taulelle | 0 repaired, 14,71% done |
113 | 1 | Lois Taulelle | config: |
114 | 1 | Lois Taulelle | |
115 | 1 | Lois Taulelle | NAME STATE READ WRITE CKSUM |
116 | 1 | Lois Taulelle | data ONLINE 0 0 0 |
117 | 1 | Lois Taulelle | raidz2-0 ONLINE 0 0 0 |
118 | 1 | Lois Taulelle | sdd3 ONLINE 0 0 0 |
119 | 1 | Lois Taulelle | sde3 ONLINE 0 0 0 |
120 | 1 | Lois Taulelle | sdc3 ONLINE 0 0 0 |
121 | 1 | Lois Taulelle | sdh3 ONLINE 0 0 0 |
122 | 1 | Lois Taulelle | sdg3 ONLINE 0 0 0 |
123 | 1 | Lois Taulelle | sdf3 ONLINE 0 0 0 |
124 | 1 | Lois Taulelle | spares |
125 | 1 | Lois Taulelle | sdb3 AVAIL |
126 | 1 | Lois Taulelle | sda3 AVAIL |
127 | 1 | Lois Taulelle | |
128 | 1 | Lois Taulelle | errors: No known data errors |
129 | 1 | Lois Taulelle | </pre> |
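As a sanity check on the scrub progress line above, the percentage and the remaining time can be recomputed from the three figures it reports. This is a small sketch, assuming that `G` and `M` are binary multiples (an assumption, the output does not say so); `to_bytes` and `scrub_progress` are hypothetical helper names. The slight difference from the reported 14,71% is expected, since the displayed sizes are rounded.

```python
import re

# Assumption: size suffixes are binary multiples (K = 2**10, M = 2**20, ...).
UNIT = {"K": 2**10, "M": 2**20, "G": 2**30, "T": 2**40}


def to_bytes(text):
    """Convert a size such as '105G' or '292M' to a number of bytes."""
    return float(text[:-1]) * UNIT[text[-1]]


def scrub_progress(line):
    """Return (percent_done, seconds_left) from a '... scanned out of ...' line."""
    m = re.search(r"(\S+) scanned out of (\S+) at (\S+)/s", line)
    if m is None:
        raise ValueError("not a scrub progress line")
    scanned, total, rate = (to_bytes(g) for g in m.groups())
    return 100.0 * scanned / total, (total - scanned) / rate


percent, seconds = scrub_progress("105G scanned out of 713G at 292M/s, 0h35m to go")
hours, minutes = divmod(int(seconds // 60), 60)
print(f"{percent:.2f}% done, {hours}h{minutes:02d}m to go")
# prints "14.73% done, 0h35m to go"
```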
130 | 1 | Lois Taulelle | |
131 | 1 | Lois Taulelle | to read: http://www.binaries.fr/files/slides/index.html