Revision 6e0010bc doc/sphinx_doc/build/text/tuto.txt
b/doc/sphinx_doc/build/text/tuto.txt | ||
---|---|---|
44 | 44 |
the 53 samples is identified by a unique identifier. The file |
45 | 45 |
*CSV_SAMPLE_FILE* sums up this information. |
46 | 46 |
|
47 |
configurator.CSV_SAMPLE_FILE = None |
|
48 |
|
|
49 |
Path to the CSV file that contains sample information. |
|
50 |
|
|
51 | 47 |
We use a convention to link each sample to its Illumina fastq outputs. |
52 | 48 |
Illumina output files of the sample *ID* will be stored in the |
53 | 49 |
directory *ILLUMINA_OUTPUTFILE_PREFIX* + *ID*. For example, sample 41 |
54 | 50 |
outputs will be stored in the directory |
55 | 51 |
*data/2012-09-05/FASTQ/Sample_Yvert_Bq41/*. |
56 | 52 |
|
57 |
configurator.ILLUMINA_OUTPUTFILE_PREFIX = None |
|
58 |
|
|
59 |
Prefix for Illumina fastq output files. |
|
60 |
|
|
61 | 53 |
For BY (resp. RM and YJM) we use the following reference genome |
62 | 54 |
*saccharomyces_cerevisiae_BY_S288c_chromosomes.fasta* (resp. |
63 | 55 |
*saccharomyces_cerevisiae_rm11-1a_1_supercontigs.fasta* and |
64 | 56 |
*saccharomyces_cerevisiae_YJM_789_screencontig.fasta*). The index |
65 | 57 |
*FASTA_REFERENCE_GENOME_FILES* stores this information. |
66 | 58 |
|
67 |
configurator.FASTA_REFERENCE_GENOME_FILES = None |
|
68 |
|
|
69 |
Dictionary where each FASTA reference genome is indexed by |
|
70 |
the reference strain it corresponds to. |
|
71 |
|
|
72 | 59 |
Each chromosome/contig is identified in the fasta file by an obscure |
73 | 60 |
identifier. For example, BY chromosome I is identified by |
74 | 61 |
*gi|144228165|ref|NC_001133.7|* whereas TemplateFilter expects an |
75 | 62 |
integer. So, we translate it. The index *FASTA_INDEXES* stores this |
76 | 63 |
translation. |
77 | 64 |
|
78 |
configurator.FASTA_INDEXES = None |
|
79 |
|
|
80 |
Dictionary indexed by strain whose values are dictionaries where keys are |
|
81 |
chromosome identifiers from the FASTA file and values are their |
|
82 |
TemplateFilter equivalents. |
|
83 |
|
|
84 | 65 |
For pragmatic reasons, we discard some parts of the genome |
85 | 66 |
(repeated sequences, etc.). The list of blacklisted areas is |
86 | 67 |
given explicitly in *AREA_BLACK_LIST*. |
87 | 68 |
|
88 |
configurator.AREA_BLACK_LIST = None |
|
89 |
|
|
90 |
Dictionary where keys are strains and values are lists of blacklisted |
|
91 |
genome regions. |
|
92 |
|
|
93 | 69 |
For BY-RM (resp. BY-YJM and RM-YJM) genome sequence alignment we use |
94 | 70 |
a previously computed .c2c file |
95 | 71 |
*data/2012-03_primarydata/BY_RM_gxcomp.c2c* (resp. |
... | ... | |
98 | 74 |
*NucleoMiner*, the old version of *NucleoMiner2* (http://www.ens- |
99 | 75 |
lyon.fr/LBMC/gisv/NucleoMiner_Manual/manual.pdf). |
100 | 76 |
|
101 |
configurator.C2C_FILES = None |
|
102 |
|
|
103 |
Dictionary where each strain combination indexes its genome alignment. |
|
104 |
|
|
105 | 77 |
*nucleominer* works in specific directories; these are described |
106 | 78 |
in *INDEX_DIR*, *ALIGN_DIR* and *LOG_DIR*. |
107 | 79 |
|
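
The settings touched by this diff all default to None, so the following is only a minimal sketch of how two of them might be filled in and used. The prefix value is inferred from the sample 41 example in the tutorial text, the integer 1 for BY chromosome I is an assumption, and the helpers `sample_directory` and `translate_chromosome` are hypothetical names that are not part of nucleominer.

```python
# Hypothetical values, inferred from the examples quoted in the tutorial text.
# The real project may store them differently.
ILLUMINA_OUTPUTFILE_PREFIX = "data/2012-09-05/FASTQ/Sample_Yvert_Bq"

# strain -> {identifier found in the fasta file -> integer for TemplateFilter}
FASTA_INDEXES = {
    "BY": {
        # BY chromosome I; mapping it to 1 is an assumption.
        "gi|144228165|ref|NC_001133.7|": 1,
    },
}


def sample_directory(sample_id, prefix=ILLUMINA_OUTPUTFILE_PREFIX):
    """Apply the convention: directory = ILLUMINA_OUTPUTFILE_PREFIX + ID."""
    return "{}{}".format(prefix, sample_id)


def translate_chromosome(strain, fasta_id, indexes=FASTA_INDEXES):
    """Translate a fasta identifier into the integer used for a given strain."""
    return indexes[strain][fasta_id]


if __name__ == "__main__":
    print(sample_directory(41))
    print(translate_chromosome("BY", "gi|144228165|ref|NC_001133.7|"))
```

Keeping the strain as the outer key mirrors the description of *FASTA_INDEXES* as a dictionary of strains that indexes per-strain dictionaries.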