Statistiques
| Branche: | Révision :

root / doc / R_tutorial / NucleoMiner2.Rnw @ 780f632a

Historique | Voir | Annoter | Télécharger (12,5 ko)

1 780f632a Florent Chuffart
% Generated by Sphinx.
2 780f632a Florent Chuffart
% \def\sphinxdocclass{report}
3 780f632a Florent Chuffart
\documentclass[letterpaper,10pt,english]{article}
4 780f632a Florent Chuffart
\usepackage[utf8]{inputenc}
5 780f632a Florent Chuffart
\DeclareUnicodeCharacter{00A0}{\nobreakspace}
6 780f632a Florent Chuffart
\usepackage{cmap}
7 780f632a Florent Chuffart
\usepackage[T1]{fontenc}
8 780f632a Florent Chuffart
\usepackage{babel}
9 780f632a Florent Chuffart
\usepackage{times}
10 780f632a Florent Chuffart
\usepackage[Bjarne]{fncychap}
11 780f632a Florent Chuffart
\usepackage{longtable}
12 780f632a Florent Chuffart
\usepackage{multirow}
13 780f632a Florent Chuffart
\usepackage{url}
14 780f632a Florent Chuffart
15 780f632a Florent Chuffart
\addto\captionsenglish{\renewcommand{\figurename}{Fig. }}
16 780f632a Florent Chuffart
\addto\captionsenglish{\renewcommand{\tablename}{Table }}
17 780f632a Florent Chuffart
18 780f632a Florent Chuffart
19 780f632a Florent Chuffart
20 780f632a Florent Chuffart
\title{NucleoMiner2 Documentation}
21 780f632a Florent Chuffart
\date{\today}
22 780f632a Florent Chuffart
\author{Florent CHUFFART, Jean-Baptiste VEYRIERAS, Gael YVERT}
23 780f632a Florent Chuffart
\newcommand{\sphinxlogo}{}
24 780f632a Florent Chuffart
\makeindex
25 780f632a Florent Chuffart
26 780f632a Florent Chuffart
\def\PYGZus{\char`\_}
27 780f632a Florent Chuffart
28 780f632a Florent Chuffart
29 780f632a Florent Chuffart
\begin{document}
30 780f632a Florent Chuffart
31 780f632a Florent Chuffart
\maketitle
32 780f632a Florent Chuffart
\tableofcontents
33 780f632a Florent Chuffart
\textbf{}\label{index::doc}
34 780f632a Florent Chuffart
35 780f632a Florent Chuffart
36 780f632a Florent Chuffart
\section{Inferring Nucleosome Position and Extracting Read Counts}
37 780f632a Florent Chuffart
\label{tuto:inferring-nucleosome-position-and-extracting-read-counts}
38 780f632a Florent Chuffart
The second part of the tutorial uses R (\url{http://http://www.r-project.org}{http://http://www.r-project.org}).
39 780f632a Florent Chuffart
It consists in 5 main steps: 
40 780f632a Florent Chuffart
\begin{itemize}
41 780f632a Florent Chuffart
42 780f632a Florent Chuffart
\item {} 
43 780f632a Florent Chuffart
Seting Up Configuration
44 780f632a Florent Chuffart
45 780f632a Florent Chuffart
\item {} 
46 780f632a Florent Chuffart
Computing CURs
47 780f632a Florent Chuffart
48 780f632a Florent Chuffart
\item {} 
49 780f632a Florent Chuffart
Extracting Nucleosome Maps from TemplateFilter Outputs
50 780f632a Florent Chuffart
51 780f632a Florent Chuffart
\item {} 
52 780f632a Florent Chuffart
Building Count Tables from TemplateFilter Inputs
53 780f632a Florent Chuffart
54 780f632a Florent Chuffart
\item {} 
55 780f632a Florent Chuffart
Analysing Count Tables using DESeq
56 780f632a Florent Chuffart
57 780f632a Florent Chuffart
\end{itemize}
58 780f632a Florent Chuffart
59 780f632a Florent Chuffart
60 780f632a Florent Chuffart
61 780f632a Florent Chuffart
62 780f632a Florent Chuffart
\subsection{Seting Up Configuration}
63 780f632a Florent Chuffart
64 780f632a Florent Chuffart
This first step is in charge of:
65 780f632a Florent Chuffart
\begin{itemize}
66 780f632a Florent Chuffart
\item {} 
67 780f632a Florent Chuffart
launching libraries used in the scripts
68 780f632a Florent Chuffart
69 780f632a Florent Chuffart
\item {} 
70 780f632a Florent Chuffart
launching configuration (design, strain, marker...)
71 780f632a Florent Chuffart
72 780f632a Florent Chuffart
{\footnotesize
73 780f632a Florent Chuffart
<<echo=TRUE, results=hide>>=
74 780f632a Florent Chuffart
library(nucleominer)
75 780f632a Florent Chuffart
@
76 780f632a Florent Chuffart
<<echo=TRUE, results=verbatim>>=
77 780f632a Florent Chuffart
package.version("nucleominer")
78 780f632a Florent Chuffart
79 780f632a Florent Chuffart
library(rjson)
80 780f632a Florent Chuffart
json_conf_file = "nucleominer_config.json"
81 780f632a Florent Chuffart
config = fromJSON(paste(readLines(json_conf_file), collapse=""))
82 780f632a Florent Chuffart
83 780f632a Florent Chuffart
all_samples = read.csv(config$CSV_SAMPLE_FILE, sep=";", head=TRUE, stringsAsFactors=FALSE)
84 780f632a Florent Chuffart
head(all_samples)
85 780f632a Florent Chuffart
86 780f632a Florent Chuffart
markers = unique(all_samples$marker)
87 780f632a Florent Chuffart
print(markers)
88 780f632a Florent Chuffart
89 780f632a Florent Chuffart
strains = unique(all_samples$strain)
90 780f632a Florent Chuffart
print(strains)
91 780f632a Florent Chuffart
92 780f632a Florent Chuffart
combis = combn(strains, 2, simplify=FALSE)
93 780f632a Florent Chuffart
print(combis)
94 780f632a Florent Chuffart
@
95 780f632a Florent Chuffart
}
96 780f632a Florent Chuffart
97 780f632a Florent Chuffart
98 780f632a Florent Chuffart
\subsection{Computing CURs}
99 780f632a Florent Chuffart
100 780f632a Florent Chuffart
This second step is in charge of computing and caching Common Uinterrupted Regions (CURs). Caching means storing the information in the computer's memory.
101 780f632a Florent Chuffart
102 780f632a Florent Chuffart
\end{itemize}
103 780f632a Florent Chuffart
104 780f632a Florent Chuffart
Note that you can customize the function “translate”. This function allows you to use the alignments between genomes when performing various tasks.
105 780f632a Florent Chuffart
\begin{itemize}
106 780f632a Florent Chuffart
\item {} 
107 780f632a Florent Chuffart
You may want to analyze data of a single strain (e.g. treatment/control, or only few mutations). In this case, the genome is identical across all samples and you do not need to define particular CURs (CURs are chromosomes). Simply use the default translate function which is neutral.
108 780f632a Florent Chuffart
109 780f632a Florent Chuffart
\item {} 
110 780f632a Florent Chuffart
If you are analyzing data from two or more strains (as NucleoMiner2 was designed for), then you need to translate coordinates of one genome into the coordinates of another one. You must do this by aligning the two genomes, which will produce a .c2c file (see Appendice ``Generate .c2c Files'').  thenuse it to produce the list of regions and customise “translate”.
111 780f632a Florent Chuffart
112 780f632a Florent Chuffart
\end{itemize}
113 780f632a Florent Chuffart
114 780f632a Florent Chuffart
In our tutorial, we are in the second case and to perform all these steps run the following command line in your R console:
115 780f632a Florent Chuffart
116 780f632a Florent Chuffart
117 780f632a Florent Chuffart
{\footnotesize
118 780f632a Florent Chuffart
<<echo=TRUE, results=hide>>=
119 780f632a Florent Chuffart
curs = compute_curs(diff_allowed=30, min_cur_width=4000, combis, config)
120 780f632a Florent Chuffart
@
121 780f632a Florent Chuffart
<<echo=TRUE, results=verbatim>>=
122 780f632a Florent Chuffart
head(curs)
123 780f632a Florent Chuffart
# Reduce first CUR length to accelerate demo, TODO: remove it before submission.
124 780f632a Florent Chuffart
curs [1,3] = curs[1,3] - 110000
125 780f632a Florent Chuffart
curs [1,5] = curs[1,5] - 110000
126 780f632a Florent Chuffart
@
127 780f632a Florent Chuffart
}
128 780f632a Florent Chuffart
129 780f632a Florent Chuffart
130 780f632a Florent Chuffart
\subsection{Extracting Nucleosome Maps from TemplateFilter Outputs}
131 780f632a Florent Chuffart
132 780f632a Florent Chuffart
This third step is in charge of extracting Maps for well-positioned and sensitive nucleosomes. First of all, this script computes intra and inter-strain matches of nucleosome maps for each CUR. This step can be executed in parallel on many cores using the BoT library. Next, it collects results and produces maps of  well-positioned nucleosomes, sensitive nucleosomes and Unaligned Nucleosomal Regions .
133 780f632a Florent Chuffart
134 780f632a Florent Chuffart
The map of well-positioned nucleosomes for BY is collected in the result directory and is called \emph{BY\_wp.tab}. It is composed of following columns:
135 780f632a Florent Chuffart
\begin{itemize}
136 780f632a Florent Chuffart
\item {} 
137 780f632a Florent Chuffart
chr, the number of the chromosome
138 780f632a Florent Chuffart
139 780f632a Florent Chuffart
\item {} 
140 780f632a Florent Chuffart
lower\_bound, the lower bound of the nucleosome
141 780f632a Florent Chuffart
142 780f632a Florent Chuffart
\item {} 
143 780f632a Florent Chuffart
upper\_bound, the upper bound of the nucleosome
144 780f632a Florent Chuffart
145 780f632a Florent Chuffart
\item {} 
146 780f632a Florent Chuffart
cur\_index, index of the CUR
147 780f632a Florent Chuffart
148 780f632a Florent Chuffart
\item {} 
149 780f632a Florent Chuffart
index\_nuc, the index of the nucleosome in the CUR
150 780f632a Florent Chuffart
151 780f632a Florent Chuffart
\item {} 
152 780f632a Florent Chuffart
wp, 1 if it is a well positioned nucleosome, 0 otherwise
153 780f632a Florent Chuffart
154 780f632a Florent Chuffart
\item {} 
155 780f632a Florent Chuffart
nb\_reads, the number of reads that support this nucleosome
156 780f632a Florent Chuffart
157 780f632a Florent Chuffart
\item {} 
158 780f632a Florent Chuffart
nb\_nucs, the number of TemplateFilter nucleosome across replicates (= the number of replicates in which it is a well-positioned nucleosome)
159 780f632a Florent Chuffart
160 780f632a Florent Chuffart
\item {} 
161 780f632a Florent Chuffart
llr\_1, for a well-positioned nucleosome, it is the LLR1 (log-likelihood ratio) between the first and the second TemplateFilter nucleosome on the chain.
162 780f632a Florent Chuffart
163 780f632a Florent Chuffart
\item {} 
164 780f632a Florent Chuffart
llr\_2, for a well-positioned nucleosome, it is the LLR1 between the second and the third TemplateFilter nucleosome on the chain.
165 780f632a Florent Chuffart
166 780f632a Florent Chuffart
\item {} 
167 780f632a Florent Chuffart
wp\_llr, for a well-positioned nucleosome, it is the LLR2 that compares consistency of the positioning over all TemplateFilter nucleosomes.
168 780f632a Florent Chuffart
169 780f632a Florent Chuffart
\item {} 
170 780f632a Florent Chuffart
wp\_pval, for a well-positioned nucleosome, it is the p-value chi square test obtained from LLR2 (\emph{1-pchisq(2.LLR2, df=4)})
171 780f632a Florent Chuffart
172 780f632a Florent Chuffart
\item {} 
173 780f632a Florent Chuffart
dyad\_shift, for a well-positioned nucleosome, it is the shift between the two extreme TemplateFilter nucleosome dyad positions.
174 780f632a Florent Chuffart
175 780f632a Florent Chuffart
\end{itemize}
176 780f632a Florent Chuffart
177 780f632a Florent Chuffart
The sensitive map for BY is collected in the result directory and is called \emph{BY\_fuzzy.tab}. It is composed of following columns:
178 780f632a Florent Chuffart
\begin{itemize}
179 780f632a Florent Chuffart
\item {} 
180 780f632a Florent Chuffart
chr, the number of the chromosome
181 780f632a Florent Chuffart
182 780f632a Florent Chuffart
\item {} 
183 780f632a Florent Chuffart
lower\_bound, the lower bound of the nucleosome
184 780f632a Florent Chuffart
185 780f632a Florent Chuffart
\item {} 
186 780f632a Florent Chuffart
upper\_bound, the upper bound of the nucleosome
187 780f632a Florent Chuffart
188 780f632a Florent Chuffart
\item {} 
189 780f632a Florent Chuffart
cur\_index, index of the CUR
190 780f632a Florent Chuffart
191 780f632a Florent Chuffart
\end{itemize}
192 780f632a Florent Chuffart
193 780f632a Florent Chuffart
The map of common well-positioned nucleosomes aligned between the BY and RM strains is collected in the result directory and is called \emph{BY\_RM\_common\_wp.tab}. It is composed of following columns:
194 780f632a Florent Chuffart
\begin{itemize}
195 780f632a Florent Chuffart
\item {} 
196 780f632a Florent Chuffart
cur\_index, the index of the CUR
197 780f632a Florent Chuffart
198 780f632a Florent Chuffart
\item {} 
199 780f632a Florent Chuffart
index\_nuc\_BY, the index of the BY nucleosome in the CUR
200 780f632a Florent Chuffart
201 780f632a Florent Chuffart
\item {} 
202 780f632a Florent Chuffart
index\_nuc\_RM, the index of the RM nucleosome in the CUR
203 780f632a Florent Chuffart
204 780f632a Florent Chuffart
\item {} 
205 780f632a Florent Chuffart
llr\_score, , the LLR3 score that estimates conservation between the positions in BY and RM
206 780f632a Florent Chuffart
207 780f632a Florent Chuffart
\item {} 
208 780f632a Florent Chuffart
common\_wp\_pval,  the p-value chi square test obtained from LLR3 (\emph{1-pchisq(2.LLR3, df=2)})
209 780f632a Florent Chuffart
210 780f632a Florent Chuffart
\item {} 
211 780f632a Florent Chuffart
diff, the dyads shift between the positions in the two strains (in bp)
212 780f632a Florent Chuffart
213 780f632a Florent Chuffart
\end{itemize}
214 780f632a Florent Chuffart
215 780f632a Florent Chuffart
The common UNR map for BY and RM strains is collected in the result directory and is called \emph{BY\_RM\_common\_unr.tab}. It is composed of the following columns:
216 780f632a Florent Chuffart
\begin{itemize}
217 780f632a Florent Chuffart
\item {} 
218 780f632a Florent Chuffart
cur\_index, the index of the CUR
219 780f632a Florent Chuffart
220 780f632a Florent Chuffart
\item {} 
221 780f632a Florent Chuffart
index\_nuc\_BY, the index of the BY nucleosome in the CUR
222 780f632a Florent Chuffart
223 780f632a Florent Chuffart
\item {} 
224 780f632a Florent Chuffart
index\_nuc\_RM,the index of the RM nucleosome in the CUR
225 780f632a Florent Chuffart
226 780f632a Florent Chuffart
\end{itemize}
227 780f632a Florent Chuffart
228 780f632a Florent Chuffart
To execute this script, run the following command in your R console:
229 780f632a Florent Chuffart
230 780f632a Florent Chuffart
231 780f632a Florent Chuffart
{\footnotesize
232 780f632a Florent Chuffart
<<echo=TRUE, results=hide>>=
233 780f632a Florent Chuffart
build_maps(strains, combis, all_samples, curs, config)
234 780f632a Florent Chuffart
@
235 780f632a Florent Chuffart
<<echo=TRUE, results=verbatim>>=
236 780f632a Florent Chuffart
BY_wp = read.table("outputs/BY_wp.tab", header=TRUE)
237 780f632a Florent Chuffart
head(BY_wp)
238 780f632a Florent Chuffart
RM_wp = read.table("outputs/RM_wp.tab", header=TRUE)
239 780f632a Florent Chuffart
head(RM_wp)
240 780f632a Florent Chuffart
BY_RM_common_wp = read.table("outputs/BY_RM_common_wp.tab", header=TRUE)
241 780f632a Florent Chuffart
head(BY_RM_common_wp)
242 780f632a Florent Chuffart
@
243 780f632a Florent Chuffart
}
244 780f632a Florent Chuffart
245 780f632a Florent Chuffart
\subsection{Building Count Tables from TemplateFilter Inputs
246 780f632a Florent Chuffart
}
247 780f632a Florent Chuffart
\label{tuto:the-script-count-reads-r}
248 780f632a Florent Chuffart
To associate a number of observations (read) to each nucleosome we run the script \emph{count\_reads.R}. It produces the files \emph{BY\_RM\_H3K14ac\_wp\_and\_nbreads.tab}, \emph{BY\_RM\_H3K14ac\_unr\_and\_nbreads.tab} \emph{BY\_RM\_Mnase\_Seq\_wp\_and\_nbreads.tab} and \emph{BY\_RM\_Mnase\_Seq\_unr\_and\_nbreads.tab}
249 780f632a Florent Chuffart
for H3K14ac common well-positioned nucleosomes, H3K14ac UNRs, Mnase common well-positioned nucleosomes and Mnase UNRs respectively.
250 780f632a Florent Chuffart
251 780f632a Florent Chuffart
For example, the file \emph{BY\_RM\_H3K14ac\_unr\_and\_nbreads.tab} contains counted reads for well-positioned nucleosomes with the experimental condition ChIP H3K14ac. It is composed of the following columns:
252 780f632a Florent Chuffart
\begin{itemize}
253 780f632a Florent Chuffart
\item {} 
254 780f632a Florent Chuffart
chr\_BY, the number of the chromosome for BY
255 780f632a Florent Chuffart
256 780f632a Florent Chuffart
\item {} 
257 780f632a Florent Chuffart
lower\_bound\_BY, the lower bound of the nucleosome for BY
258 780f632a Florent Chuffart
259 780f632a Florent Chuffart
\item {} 
260 780f632a Florent Chuffart
upper\_bound\_BY, the upper bound of the nucleosome  for BY
261 780f632a Florent Chuffart
262 780f632a Florent Chuffart
\item {} 
263 780f632a Florent Chuffart
index\_nuc\_BY, the index of the BY nucleosome in the CUR for BY
264 780f632a Florent Chuffart
265 780f632a Florent Chuffart
\item {} 
266 780f632a Florent Chuffart
chr\_RM, the number of the chromosome for RM
267 780f632a Florent Chuffart
268 780f632a Florent Chuffart
\item {} 
269 780f632a Florent Chuffart
lower\_bound\_RM, the lower bound of the nucleosome for RM
270 780f632a Florent Chuffart
271 780f632a Florent Chuffart
\item {} 
272 780f632a Florent Chuffart
upper\_bound\_RM, the upper bound of the nucleosome  for RM
273 780f632a Florent Chuffart
274 780f632a Florent Chuffart
\item {} 
275 780f632a Florent Chuffart
index\_nuc\_RM,the index of the RM nucleosome in the CUR for RM
276 780f632a Florent Chuffart
277 780f632a Florent Chuffart
\item {} 
278 780f632a Florent Chuffart
cur\_index, index of the CUR
279 780f632a Florent Chuffart
280 780f632a Florent Chuffart
\item {} 
281 780f632a Florent Chuffart
BY\_H3K14ac\_36, the number of reads for the current nucleosome for the sample 36
282 780f632a Florent Chuffart
283 780f632a Florent Chuffart
\item {} 
284 780f632a Florent Chuffart
BY\_H3K14ac\_37, \#reads for sample 37
285 780f632a Florent Chuffart
286 780f632a Florent Chuffart
\item {} 
287 780f632a Florent Chuffart
BY\_H3K14ac\_53, \#reads for sample 53
288 780f632a Florent Chuffart
289 780f632a Florent Chuffart
\item {} 
290 780f632a Florent Chuffart
RM\_H3K14ac\_38, \#reads for sample 38
291 780f632a Florent Chuffart
292 780f632a Florent Chuffart
\item {} 
293 780f632a Florent Chuffart
RM\_H3K14ac\_39, \#reads for sample 39
294 780f632a Florent Chuffart
295 780f632a Florent Chuffart
\end{itemize}
296 780f632a Florent Chuffart
297 780f632a Florent Chuffart
To execute this script, run the following command in your R console:
298 780f632a Florent Chuffart
299 780f632a Florent Chuffart
300 780f632a Florent Chuffart
{\footnotesize
301 780f632a Florent Chuffart
<<echo=TRUE, results=hide>>=
302 780f632a Florent Chuffart
for (form in c("wp", "unr")) {
303 780f632a Florent Chuffart
	for (marker in markers) {
304 780f632a Florent Chuffart
    build_count_table(marker, combis[[1]], form, curs, all_samples, config)
305 780f632a Florent Chuffart
  }
306 780f632a Florent Chuffart
}			
307 780f632a Florent Chuffart
@
308 780f632a Florent Chuffart
<<echo=TRUE, results=verbatim>>=
309 780f632a Florent Chuffart
a_count_table = read.table("outputs/BY_RM_Mnase_Seq_wp_and_nbreads.tab", header=TRUE)
310 780f632a Florent Chuffart
head(a_count_table)
311 780f632a Florent Chuffart
@
312 780f632a Florent Chuffart
}
313 780f632a Florent Chuffart
314 780f632a Florent Chuffart
\subsection{Analysing Count Tables using DESeq}
315 780f632a Florent Chuffart
316 780f632a Florent Chuffart
Finally, the script \emph{launch\_deseq.R} perform statistical analysis on each nucleosome using \emph{DESeq}. It produces files:
317 780f632a Florent Chuffart
\begin{itemize}
318 780f632a Florent Chuffart
\item {} 
319 780f632a Florent Chuffart
results/current/BY\_RM\_H3K14ac\_wp\_snep.tab
320 780f632a Florent Chuffart
321 780f632a Florent Chuffart
\item {} 
322 780f632a Florent Chuffart
results/current/BY\_RM\_H3K14ac\_unr\_snep.tab
323 780f632a Florent Chuffart
324 780f632a Florent Chuffart
\item {} 
325 780f632a Florent Chuffart
results/current/BY\_RM\_H3K14ac\_wpunr\_snep.tab
326 780f632a Florent Chuffart
327 780f632a Florent Chuffart
\end{itemize}
328 780f632a Florent Chuffart
329 780f632a Florent Chuffart
These files are organised with the following columns (see file \emph{BY\_RM\_H3K14ac\_wp\_snep.tab} for an example):
330 780f632a Florent Chuffart
\begin{itemize}
331 780f632a Florent Chuffart
\item {} 
332 780f632a Florent Chuffart
chr\_BY, the number of the chromosome for BY
333 780f632a Florent Chuffart
334 780f632a Florent Chuffart
\item {} 
335 780f632a Florent Chuffart
lower\_bound\_BY, the lower bound of the nucleosome for BY
336 780f632a Florent Chuffart
337 780f632a Florent Chuffart
\item {} 
338 780f632a Florent Chuffart
upper\_bound\_BY, the upper bound of the nucleosome  for BY
339 780f632a Florent Chuffart
340 780f632a Florent Chuffart
\item {} 
341 780f632a Florent Chuffart
index\_nuc\_BY, the index of the BY nucleosome in the CUR for BY
342 780f632a Florent Chuffart
343 780f632a Florent Chuffart
\item {} 
344 780f632a Florent Chuffart
chr\_RM, the number of the chromosome for RM
345 780f632a Florent Chuffart
346 780f632a Florent Chuffart
\item {} 
347 780f632a Florent Chuffart
lower\_bound\_RM, the lower bound of the nucleosome for RM
348 780f632a Florent Chuffart
349 780f632a Florent Chuffart
\item {} 
350 780f632a Florent Chuffart
upper\_bound\_RM, the upper bound of the nucleosome  for RM
351 780f632a Florent Chuffart
352 780f632a Florent Chuffart
\item {} 
353 780f632a Florent Chuffart
index\_nuc\_RM,the index of the RM nucleosome in the CUR for RM
354 780f632a Florent Chuffart
355 780f632a Florent Chuffart
\item {} 
356 780f632a Florent Chuffart
cur\_index, index of the CUR
357 780f632a Florent Chuffart
358 780f632a Florent Chuffart
\item {} 
359 780f632a Florent Chuffart
form
360 780f632a Florent Chuffart
361 780f632a Florent Chuffart
\item {} 
362 780f632a Florent Chuffart
BY\_Mnase\_Seq\_1, the number of reads for the current nucleosome for the sample 1
363 780f632a Florent Chuffart
364 780f632a Florent Chuffart
\end{itemize}
365 780f632a Florent Chuffart
366 780f632a Florent Chuffart
Next columns concern indicators for each sample:
367 780f632a Florent Chuffart
\begin{itemize}
368 780f632a Florent Chuffart
\item {} 
369 780f632a Florent Chuffart
BY\_Mnase\_Seq\_2, \#reads for sample 2
370 780f632a Florent Chuffart
371 780f632a Florent Chuffart
\item {} 
372 780f632a Florent Chuffart
BY\_Mnase\_Seq\_3, \#reads for sample 3
373 780f632a Florent Chuffart
374 780f632a Florent Chuffart
\item {} 
375 780f632a Florent Chuffart
RM\_Mnase\_Seq\_4, \#reads for sample 4
376 780f632a Florent Chuffart
377 780f632a Florent Chuffart
\item {} 
378 780f632a Florent Chuffart
RM\_Mnase\_Seq\_5, \#reads for sample 5
379 780f632a Florent Chuffart
380 780f632a Florent Chuffart
\item {} 
381 780f632a Florent Chuffart
RM\_Mnase\_Seq\_6, \#reads for sample 6
382 780f632a Florent Chuffart
383 780f632a Florent Chuffart
\item {} 
384 780f632a Florent Chuffart
BY\_H3K14ac\_36, \#reads for sample 36
385 780f632a Florent Chuffart
386 780f632a Florent Chuffart
\item {} 
387 780f632a Florent Chuffart
BY\_H3K14ac\_37, \#reads for sample 37
388 780f632a Florent Chuffart
389 780f632a Florent Chuffart
\item {} 
390 780f632a Florent Chuffart
BY\_H3K14ac\_53, \#reads for sample 53
391 780f632a Florent Chuffart
392 780f632a Florent Chuffart
\item {} 
393 780f632a Florent Chuffart
RM\_H3K14ac\_38, \#reads for sample 38
394 780f632a Florent Chuffart
395 780f632a Florent Chuffart
\item {} 
396 780f632a Florent Chuffart
RM\_H3K14ac\_39, \#reads for sample 39
397 780f632a Florent Chuffart
398 780f632a Florent Chuffart
\end{itemize}
399 780f632a Florent Chuffart
400 780f632a Florent Chuffart
The 7 last columns concern DESeq analysis:
401 780f632a Florent Chuffart
\begin{itemize}
402 780f632a Florent Chuffart
\item {} 
403 780f632a Florent Chuffart
manip{[}a\_manip{]} strain{[}a\_strain{]} manip{[}a\_strain{]}:strain{[}a\_strain{]}, the manip (marker) effect, the strain effect and the snep effect. These are the coefficients of the fitted generalized linear model.
404 780f632a Florent Chuffart
405 780f632a Florent Chuffart
\item {} 
406 780f632a Florent Chuffart
pvalsGLM, the pvalue resulting from the comparison of the GLM model considering the interaction term \emph{marker:strain} to the GLM model that does not consider it. This is the statsitcial significance of the interaction term and therefore the statistical significance of the SNEP.
407 780f632a Florent Chuffart
408 780f632a Florent Chuffart
\item {} 
409 780f632a Florent Chuffart
snep\_index, a boolean set to TRUE if the pvalueGLM value is under the threshold computed with FDR function with a rate set to 0.0001.
410 780f632a Florent Chuffart
411 780f632a Florent Chuffart
\item {} 
412 780f632a Florent Chuffart
mnase\_l2fc, this is the coefficients of the fitted generalized linear model based on MNASE samples only.
413 780f632a Florent Chuffart
414 780f632a Florent Chuffart
\item {} 
415 780f632a Florent Chuffart
mnase\_l2fc\_pval, the pvalue resulting from the analysis of the simple model based on MNASE samples only. This is the statsitcial significance of the \emph{strain} term and therefore the statistical significance of the strain effect.
416 780f632a Florent Chuffart
417 780f632a Florent Chuffart
\end{itemize}
418 780f632a Florent Chuffart
419 780f632a Florent Chuffart
To execute this script, run the following command in your R console:
420 780f632a Florent Chuffart
421 780f632a Florent Chuffart
422 780f632a Florent Chuffart
{\footnotesize
423 780f632a Florent Chuffart
<<echo=TRUE, results=hide>>=
424 780f632a Florent Chuffart
for (form in c("wp", "unr", "wpunr")) {
425 780f632a Florent Chuffart
  sneps = analyse_count_table(markers[2], combis[[1]], form, all_samples, config=config)
426 780f632a Florent Chuffart
}
427 780f632a Florent Chuffart
@
428 780f632a Florent Chuffart
<<echo=TRUE, results=verbatim>>=
429 780f632a Florent Chuffart
a_sneps_table = read.table("outputs/BY_RM_H3K14ac_wp_snep.tab", header=TRUE)
430 780f632a Florent Chuffart
head(a_sneps_table)
431 780f632a Florent Chuffart
432 780f632a Florent Chuffart
@
433 780f632a Florent Chuffart
}
434 780f632a Florent Chuffart
435 780f632a Florent Chuffart
Here are the number of computed SNEPs for each forms.
436 780f632a Florent Chuffart
437 780f632a Florent Chuffart
438 780f632a Florent Chuffart
439 780f632a Florent Chuffart
440 780f632a Florent Chuffart
\end{document}