Draft LM

Version 1 (Lois Taulelle, 04/02/2016 11:52)

h1. Draft

§ Miscellaneous notes

h2. Choosing the hardware

h3. Disks! More disks!

Since this file system was designed first and foremost with servers in mind, with disks in the plural (lots of disks), wanting to run ZFS on a single hard drive borders on the ridiculous. The latest ZoL version (RC14) is said to settle the matter for good: a pool can only be built from whole disks, not from partitions (citation needed).

multipath, for Kevin:

http://anothersysadmin.wordpress.com/2008/11/17/howto-debian-and-scsi-multipathing-with-multipath-tools/

h2. Sysadmin candy

h3. Holy cow! Copy-On-Write!

All ZFS operations are copy-on-write transactions, so the on-disk state is always valid.
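
As a rough illustration of why copy-on-write keeps the stored state valid, here is a minimal Python sketch. It is not how ZFS is implemented: the @CowStore@ class, its @blocks@ dictionary and its @root@ pointer are hypothetical names invented for this example. A transaction writes the modified data to a fresh block and only then switches the root pointer in a single step, so a crash before the switch leaves the previous state intact.

```python
class CowStore:
    """Toy copy-on-write store (hypothetical, for illustration only)."""

    def __init__(self, data):
        self.blocks = {0: dict(data)}  # block id -> stored state
        self.next_id = 1
        self.root = 0                  # id of the block holding the current valid state

    def read(self):
        return dict(self.blocks[self.root])

    def transaction(self, updates, crash_before_commit=False):
        # 1. Copy the current state into a new block; the live block is never modified in place.
        new_state = dict(self.blocks[self.root])
        new_state.update(updates)
        new_id = self.next_id
        self.next_id += 1
        self.blocks[new_id] = new_state
        # 2. A crash here leaves `root` pointing at the old, still consistent block.
        if crash_before_commit:
            return False
        # 3. Commit: a single pointer switch makes the new state visible.
        self.root = new_id
        return True


store = CowStore({"a": 1})
store.transaction({"a": 2, "b": 3}, crash_before_commit=True)
state_after_crash = store.read()     # old state, untouched
store.transaction({"a": 2, "b": 3})
state_after_commit = store.read()    # new state, visible all at once
```

In this toy model a reader sees either the complete old state or the complete new one, never a half-applied update.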

h3. It's troll Friday, anything goes!

*Why btrfs is theoretically better, while zfs is practically better*

This is not a real head-to-head competition: one is usable today, with all the good features, but is not mainstream. The other will be usable soon, will be 1000% mainstream, and should be watched closely.

btrfs: Pre-history

Imagine you are a Linux file system developer. It's 2007, and you are at the Linux Storage and File systems workshop. Things are looking dim for Linux file systems: Reiserfs, plagued with quality issues and an unsustainable funding model, has just lost all credibility with the arrest of Hans Reiser a few months ago. ext4 is still in development; in fact, it isn't even called ext4 yet. Fundamentally, ext4 is just a straightforward extension of a 30-year-old format and is light-years behind the competition in terms of features. At the same time, companies are clamping down on funding for Linux development; IBM's Linux division is coming to the end of its grace period and needs to show profitability now. Other companies are catching wind of an upcoming recession and are cutting research across the board. They want projects with time to results measured in months, not years.

Ever hopeful, the file systems developers are meeting anyway. Since the workshop is co-located with USENIX FAST '07, several researchers from academia and industry are presenting their ideas to the workshop. One of them is Ohad Rodeh. He's invented a kind of btree that is copy-on-write (COW) friendly. To start with, btrees in their native form are wildly incompatible with COW. The leaves of the tree are linked together, so when the location of one leaf changes (via a write - which implies a copy to a new block), the link in the adjacent leaf changes, which triggers another copy-on-write and location change, which changes the link in the next leaf... The result is that the entire btree, from top to bottom, has to be rewritten every time one leaf is changed.

Rodeh's btrees are different: first, he got rid of the links between leaves of the tree - which also "throws out a lot of the existing b-tree literature", as he says in his slides - but keeps enough btree traits to be useful. (This is a fairly standard form of btrees in file systems, sometimes called "B+trees".) He added some algorithms for traversing the btree that take advantage of reference counts to limit the amount of the tree that has to be traversed when deleting a snapshot, as well as a few other things, like proactive split and merge of interior nodes so that inserts and deletes don't require any backtracking. The result is a simple, robust, generic data structure which very efficiently tracks extents (groups of contiguous data blocks) in a COW file system. Rodeh successfully prototyped the system some years ago, but he's done with that area of research and just wants someone to take his COW-friendly btrees and put them to good use.
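
The effect of dropping the links between leaves can be sketched in Python. This is a simplified binary tree rather than a real B-tree, and the @Node@ class and the @cow_insert@ and @lookup@ functions are hypothetical names for this example, not btrfs code. Because no node points at its neighbours, an update only copies the nodes on the path from the root to the changed leaf; everything else is shared with the previous version, which stays readable like a snapshot.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    """Immutable tree node with no sibling links (illustrative only)."""
    key: int
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def cow_insert(node, key, value):
    """Return a new root; only the root-to-leaf path is copied."""
    if node is None:
        return Node(key, value)
    if key < node.key:
        # Copy this node, reuse the untouched right subtree as-is.
        return Node(node.key, node.value, cow_insert(node.left, key, value), node.right)
    if key > node.key:
        # Copy this node, reuse the untouched left subtree as-is.
        return Node(node.key, node.value, node.left, cow_insert(node.right, key, value))
    return Node(key, value, node.left, node.right)  # same key: replace the value


def lookup(node, key):
    """Walk down from the given root and return the value stored under key, or None."""
    while node is not None:
        if key == node.key:
            return node.value
        node = node.left if key < node.key else node.right
    return None


old_root = None
for k, v in [(50, "a"), (30, "b"), (70, "c"), (20, "d")]:
    old_root = cow_insert(old_root, k, v)

new_root = cow_insert(old_root, 20, "changed")  # copies only the path 50 -> 30 -> 20
```

After the update, @old_root@ still returns the old value for key 20, while the subtree under key 70 is the very same object in both versions.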

+ btrfs: A brief comparison with ZFS

Algorithmic differences (https://lwn.net/Articles/342892/ , section "btrfs: A brief comparison with ZFS")

* pros/cons: http://rudd-o.com/linux-and-free-software/ways-in-which-zfs-is-better-than-btrfs ? (loïs: a bad idea in my opinion; the two are comparable, but they are not competitors)

We observed a striking difference, which we cannot really explain other than by the internal algorithms: on the same hardware platform, btrfs is faster at writing, whereas zfs is better at reading.

h3. ZFS on partitions: not good!

<pre>
root@ocean:~# zpool status
  pool: data
 state: UNAVAIL
status: One or more devices could not be used because the label is missing
	or invalid.  There are insufficient replicas for the pool to continue
	functioning.
action: Destroy and re-create the pool from
	a backup source.
   see: http://zfsonlinux.org/msg/ZFS-8000-5E
 scan: none requested
config:

	NAME        STATE     READ WRITE CKSUM
	data        UNAVAIL      0     0     0  insufficient replicas
	  raidz2-0  UNAVAIL      0     0     0  insufficient replicas
	    sdc3    FAULTED      0     0     0  corrupted data
	    sdd3    FAULTED      0     0     0  corrupted data
	    sdh3    FAULTED      0     0     0  corrupted data
	    sde3    FAULTED      0     0     0  corrupted data
	    sdg3    ONLINE       0     0     0
	    sdf3    ONLINE       0     0     0
root@ocean:~# zpool export data
root@ocean:~# zpool status
no pools available
root@ocean:~# zpool import data
cannot import 'data': pool may be in use from other system
use '-f' to import anyway
root@ocean:~# zpool import -f data
root@ocean:~# zpool status
  pool: data
 state: ONLINE
 scan: scrub repaired 0 in 0h21m with 0 errors on Fri Mar  1 04:36:07 2013
config:

	NAME        STATE     READ WRITE CKSUM
	data        ONLINE       0     0     0
	  raidz2-0  ONLINE       0     0     0
	    sdd3    ONLINE       0     0     0
	    sde3    ONLINE       0     0     0
	    sdc3    ONLINE       0     0     0
	    sdh3    ONLINE       0     0     0
	    sdg3    ONLINE       0     0     0
	    sdf3    ONLINE       0     0     0
	spares
	  sdb3      AVAIL
	  sda3      AVAIL

errors: No known data errors
</pre>

The first time, it is scary. After that, it is merely tedious.
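
When this keeps happening, the faulted devices can be picked out of the @zpool status@ output with a few lines of Python. This is only a sketch under assumptions: @unhealthy_devices@ is a hypothetical helper, and it relies on the column layout shown in the listing above (NAME, STATE, READ, WRITE, CKSUM), which may differ between ZFS versions.

```python
def unhealthy_devices(status_text):
    """Return (name, state) pairs for config entries whose state is not ONLINE or AVAIL."""
    results = []
    in_config = False
    for line in status_text.splitlines():
        fields = line.split()
        if fields[:2] == ["NAME", "STATE"]:
            in_config = True          # the device table starts after this header
            continue
        if not in_config or not fields:
            continue
        if fields[0] == "errors:":
            in_config = False         # end of this pool's device table
            continue
        # Lines with a single field (such as "spares") carry no state and are skipped.
        if len(fields) >= 2 and fields[1] not in ("ONLINE", "AVAIL"):
            results.append((fields[0], fields[1]))
    return results


# Excerpt copied from the first listing above.
sample = """\
config:

\tNAME        STATE     READ WRITE CKSUM
\tdata        UNAVAIL      0     0     0  insufficient replicas
\t  raidz2-0  UNAVAIL      0     0     0  insufficient replicas
\t    sdc3    FAULTED      0     0     0  corrupted data
\t    sdg3    ONLINE       0     0     0
\t    sdf3    ONLINE       0     0     0
"""

faulted = unhealthy_devices(sample)
```

On a live system the text would come from running @zpool status@ itself; the sample string is used here so that the sketch runs without a pool.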

A scrub can be forced, just to be sure:

<pre>
root@ocean:~# zpool scrub data

[wait ~5 min]

root@ocean:~# zpool status
  pool: data
 state: ONLINE
 scan: scrub in progress since Fri Mar  1 11:18:49 2013
    105G scanned out of 713G at 292M/s, 0h35m to go
    0 repaired, 14,71% done
config:

	NAME        STATE     READ WRITE CKSUM
	data        ONLINE       0     0     0
	  raidz2-0  ONLINE       0     0     0
	    sdd3    ONLINE       0     0     0
	    sde3    ONLINE       0     0     0
	    sdc3    ONLINE       0     0     0
	    sdh3    ONLINE       0     0     0
	    sdg3    ONLINE       0     0     0
	    sdf3    ONLINE       0     0     0
	spares
	  sdb3      AVAIL
	  sda3      AVAIL

errors: No known data errors
</pre>
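
The figures reported during the scrub can be cross-checked with simple arithmetic. The sketch below uses the values from the listing above (105G scanned out of 713G at 292M/s) and assumes that G and M are binary units (1G = 1024M); the function name @scrub_estimate@ is made up for this example. Since the displayed sizes are rounded, the result only approximately matches the 14,71% and 0h35m printed by @zpool status@.

```python
def scrub_estimate(scanned_g, total_g, rate_m_per_s):
    """Return (percent done, minutes remaining) from rounded zpool status figures."""
    percent = 100.0 * scanned_g / total_g
    remaining_m = (total_g - scanned_g) * 1024       # assumption: 1G = 1024M
    minutes_left = remaining_m / rate_m_per_s / 60
    return percent, minutes_left


percent, minutes_left = scrub_estimate(105, 713, 292)
print(f"{percent:.1f}% done, about {minutes_left:.0f} minutes to go")
```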

Further reading: http://www.binaries.fr/files/slides/index.html