References
**********


Python Reference
================


R Reference
===========


Arabic to Roman pair list.
--------------------------


Description
~~~~~~~~~~~

Utility to convert Arabic numerals to Roman numerals.


Usage
~~~~~

    
   ARAB2ROM()
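
A pair list of this kind could be built as in the sketch below. This is only an illustration: the name arab2rom_sketch, the range 1 to 16 and the use of utils::as.roman are assumptions, not the package's actual implementation.

```r
# Hypothetical sketch: map Arabic numerals (as strings) to Roman numerals.
# The range 1:16 is an assumption chosen for illustration.
arabic = 1:16
roman = as.character(utils::as.roman(arabic))
arab2rom_sketch = setNames(as.list(roman), as.character(arabic))

# Look up one entry by its Arabic key.
print(arab2rom_sketch[["4"]])
```

Keys are stored as strings so that a lookup such as arab2rom_sketch[["4"]] selects by name rather than by position.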

    
Author(s)
~~~~~~~~~

Florent Chuffart

    
False Discovery Rate
--------------------


Description
~~~~~~~~~~~

From a vector x of independent p-values, extract the cutoff
corresponding to the specified FDR. See Benjamini & Hochberg (1995).


Usage
~~~~~

   FDR(x, FDR)


Arguments
~~~~~~~~~

"x"

A vector of independent p-values.

"FDR"

The specified false discovery rate.


Value
~~~~~

Returns the corresponding cutoff.


Author(s)
~~~~~~~~~

Gael Yvert, Florent Chuffart


Examples
~~~~~~~~

    
   print("example")
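
The placeholder above does not exercise the function. As a rough illustration of the Benjamini-Hochberg step-up rule that the description refers to, here is a minimal sketch. bh_cutoff_sketch is a hypothetical helper, not the package's FDR(); whether FDR() returns its cutoff in exactly this form is an assumption.

```r
# Hypothetical helper: Benjamini-Hochberg cutoff for a target FDR.
# Returns the largest p-value p_(i) such that p_(i) <= (i / m) * fdr,
# or NA when no p-value satisfies the bound.
bh_cutoff_sketch = function(x, fdr) {
  p = sort(x)                         # p-values in ascending order
  m = length(p)
  ok = which(p <= (seq_len(m) / m) * fdr)
  if (length(ok) == 0) {
    return(NA)
  }
  p[max(ok)]
}

pvals = c(0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216)
print(bh_cutoff_sketch(pvals, 0.05))
```

With these ten p-values and a target of 0.05, only the two smallest satisfy the bound, so the sketch returns 0.008.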

    
Roman to Arabic pair list.
--------------------------


Description
~~~~~~~~~~~

Utility to convert Roman numerals to Arabic numerals.


Usage
~~~~~

   ROM2ARAB()


Author(s)
~~~~~~~~~

Florent Chuffart

    
Aggregate replicated sample's nucleosomes.
------------------------------------------


Description
~~~~~~~~~~~

This function aggregates nucleosomes for replicated samples. It uses
the TemplateFilter output of each sample as a replicate. Each sample
owns a set of nucleosomes computed using TemplateFilter and ordered by
the position of their center. Adjacent nucleosomes are compared two by
two. The comparison is based on a log-likelihood ratio score. The
outcome of each comparison is either the merging or the separation of
the adjacent nucleosomes. Finally, the function returns a list of
clusters and all computed *lod_scores*. Each cluster owns an attribute
*wp* for "well positioned". This attribute is set to *TRUE* if the
cluster is composed of exactly one nucleosome from each sample.


Usage
~~~~~

   aggregate_intra_strain_nucs(samples, lod_thres = 20, coord_max = 2e+07)


Arguments
~~~~~~~~~

"samples"

A list of samples. Each sample is a list like *sample = list(id=...,
marker=..., strain=..., roi=..., inputs=..., outputs=...)* with *roi =
list(name=..., begin=..., end=..., chr=..., genome=...)*.

"lod_thres"

Log-likelihood ratio threshold.

"coord_max"

A value too large to be the coordinate of a nucleosome lower bound.


Value
~~~~~

Returns a list of clustered nucleosomes and all computed LOD scores.


Author(s)
~~~~~~~~~

Florent Chuffart


Examples
~~~~~~~~

   # Dealing with a region of interest
   roi = list(name="example", begin=1000, end=1300, chr="1", genome=rep("A",301))
   samples = list()
   for (i in 1:3) {
       # Create TF output
       tf_nuc = list("chr"=paste("chr", roi$chr, sep=""), "center"=(roi$end + roi$begin)/2, "width"= 150, "correlation.score"= 0.9)
       outputs = dfadd(NULL,tf_nuc)
       outputs = filter_tf_outputs(outputs, roi$chr, roi$begin, roi$end)
       # Generate corresponding reads
       nb_reads = round(runif(1,170,230))
       reads = round(rnorm(nb_reads, tf_nuc$center,20))
       u_reads = sort(unique(reads))
       strands = sample(c(rep("R",ceiling(length(u_reads)/2)),rep("F",floor(length(u_reads)/2))))
       counts = apply(t(u_reads), 2, function(r) { sum(reads == r)})
       shifts = apply(t(strands), 2, function(s) { if (s == "F") return(-tf_nuc$width/2) else return(tf_nuc$width/2)})
       u_reads = u_reads + shifts
       inputs = data.frame(list("V1" = rep(roi$chr, length(u_reads)),
                                "V2" = u_reads,
                                "V3" = strands,
                                "V4" = counts), stringsAsFactors=FALSE)
       samples[[length(samples) + 1]] = list(id=1, marker="Mnase_Seq", strain="strain_ex", total_reads = 10000000, roi=roi, inputs=inputs, outputs=outputs)
   }
   print(aggregate_intra_strain_nucs(samples))

    
Aligns nucleosomes between 2 strains.
-------------------------------------


Description
~~~~~~~~~~~

This function aligns nucs between two strains for a given genome
region.


Usage
~~~~~

   align_inter_strain_nucs(replicates, wp_nucs_strain_ref1 = NULL,
       wp_nucs_strain_ref2 = NULL, corr_thres = 0.5, lod_thres = 100,
       config = NULL, ...)


Arguments
~~~~~~~~~

"replicates"

Set of replicates, ideally 3 per strain.

"wp_nucs_strain_ref1"

List of aggregated nucleosomes for strain 1. If it is NULL, this list
will be computed.

"wp_nucs_strain_ref2"

List of aggregated nucleosomes for strain 2. If it is NULL, this list
will be computed.

"corr_thres"

Correlation threshold.

"lod_thres"

LOD cutoff.

"config"

GLOBAL config variable.

"..."

A list of parameters that will be passed to
*aggregate_intra_strain_nucs* if needed.


Value
~~~~~

Returns a list of clustered nucleosomes and all computed LOD scores.


Author(s)
~~~~~~~~~

Florent Chuffart


Examples
~~~~~~~~

   # Define a new translate_roi function...
   translate_roi = function(roi, strain2, big_roi=NULL, config=NULL) {
     return(roi)
   }
   # ...and bind it into the nucleominer package and namespace.
   unlockBinding("translate_roi", as.environment("package:nucleominer"))
   unlockBinding("translate_roi", getNamespace("nucleominer"))
   assign("translate_roi", translate_roi, "package:nucleominer")
   assign("translate_roi", translate_roi, getNamespace("nucleominer"))
   lockBinding("translate_roi", getNamespace("nucleominer"))
   lockBinding("translate_roi", as.environment("package:nucleominer"))

   # Dealing with a region of interest
   roi = list(name="example", begin=1000, end=1300, chr="1", genome=rep("A",301), strain_ref1 = "STRAINREF1")
   roi2 = translate_roi(roi, roi$strain_ref1)
   replicates = list()
   for (j in 1:2) {
       samples = list()
       for (i in 1:3) {
           # Create TF output
           tf_nuc = list("chr"=paste("chr", roi$chr, sep=""), "center"=(roi$end + roi$begin)/2, "width"= 150, "correlation.score"= 0.9)
           outputs = dfadd(NULL,tf_nuc)
           outputs = filter_tf_outputs(outputs, roi$chr, roi$begin, roi$end)
           # Generate corresponding reads
           nb_reads = round(runif(1,170,230))
           reads = round(rnorm(nb_reads, tf_nuc$center,20))
           u_reads = sort(unique(reads))
           strands = sample(c(rep("R",ceiling(length(u_reads)/2)),rep("F",floor(length(u_reads)/2))))
           counts = apply(t(u_reads), 2, function(r) { sum(reads == r)})
           shifts = apply(t(strands), 2, function(s) { if (s == "F") return(-tf_nuc$width/2) else return(tf_nuc$width/2)})
           u_reads = u_reads + shifts
           inputs = data.frame(list("V1" = rep(roi$chr, length(u_reads)),
                                    "V2" = u_reads,
                                    "V3" = strands,
                                    "V4" = counts), stringsAsFactors=FALSE)
           samples[[length(samples) + 1]] = list(id=1, marker="Mnase_Seq", strain=paste("strain_ex",j,sep=""), total_reads = 10000000, roi=roi, inputs=inputs, outputs=outputs)
       }
       replicates[[length(replicates) + 1]] = samples
   }
   print(align_inter_strain_nucs(replicates))

    
Launch deseq methods.
---------------------


Description
~~~~~~~~~~~

This function is based on the DESeq example. It normalizes the data,
fits them to a GLM with and without the interaction term, and compares
the two models.


Usage
~~~~~

   analyse_design(snep_design, reads)


Arguments
~~~~~~~~~

"snep_design"

The design to consider.

"reads"

The data to consider.


Author(s)
~~~~~~~~~

Florent Chuffart

    
Stage replicates data
---------------------


Description
~~~~~~~~~~~

This function loads into memory the data corresponding to the given
experiments.


Usage
~~~~~

   build_replicates(expe, roi, only_fetch = FALSE, get_genome = FALSE,
       all_samples, config = NULL)


Arguments
~~~~~~~~~

"expe"

A list of vectors, each vector holding the sample ids of one set of
replicates.

"roi"

The region that we are interested in.

"only_fetch"

Whether to only fetch the inputs, without filtering them.

"get_genome"

Whether to load the corresponding genome.

"all_samples"

Global list of samples.

"config"

GLOBAL config variable.


Author(s)
~~~~~~~~~

Florent Chuffart


Examples
~~~~~~~~

   # library(rjson)
   # library(nucleominer)
   #
   # # Read config file
   # json_conf_file = "nucleo_miner_config.json"
   # config = fromJSON(paste(readLines(json_conf_file), collapse=""))
   # # Read sample file
   # all_samples = get_content(config$CSV_SAMPLE_FILE, "cvs", sep=";", head=TRUE, stringsAsFactors=FALSE)
   # # here are the sample ids in a list
   # expes = list(c(1))
   # # here is the region whose coverage we want to see
   # cur = list(chr="8", begin=472000, end=474000, strain_ref="BY")
   # # it displays the coverage
   # replicates = build_replicates(expes, cur, all_samples=all_samples, config=config)
   # out = watch_samples(replicates, config$READ_LENGTH,
   #       plot_coverage = TRUE,
   #       plot_squared_reads = FALSE,
   #       plot_ref_genome = FALSE,
   #       plot_arrow_raw_reads = FALSE,
   #       plot_arrow_nuc_reads = FALSE,
   #       plot_gaussian_reads = FALSE,
   #       plot_gaussian_unified_reads = FALSE,
   #       plot_ellipse_nucs = FALSE,
   #       plot_wp_nucs = FALSE,
   #       plot_wp_nuc_model = FALSE,
   #       plot_common_nucs = FALSE,
   #       height = 50)

    
Reformat an "apply manipulated" list of regions
-----------------------------------------------


Description
~~~~~~~~~~~

Utility to reformat an "apply manipulated" list of regions.


Usage
~~~~~

   collapse_regions(regions)


Arguments
~~~~~~~~~

"regions"

The "apply manipulated" list of regions to reformat.


Author(s)
~~~~~~~~~

Florent Chuffart

    
Compute Common Uninterrupted Regions (CUR)
------------------------------------------


Description
~~~~~~~~~~~

CURs are regions that can be aligned between the genomes.


Usage
~~~~~

   compute_inter_all_strain_curs(diff_allowed = 10, min_cur_width = 200,
       config = NULL, plot = FALSE)


Arguments
~~~~~~~~~

"diff_allowed"

The maximum indel width allowed in a CUR.

"min_cur_width"

The minimum width of a CUR.

"config"

GLOBAL config variable.

"plot"

Whether to plot the CURs.


Author(s)
~~~~~~~~~

Florent Chuffart

    
Crop bound of regions according to region of interest bound
-----------------------------------------------------------


Description
~~~~~~~~~~~

This function is no longer necessary since the "big_roi" bug in the
translate_roi function was removed.


Usage
~~~~~

   crop_fuzzy(tmp_fuzzy_nucs, roi, strain, config = NULL)


Arguments
~~~~~~~~~

"tmp_fuzzy_nucs"

The regions to be cropped.

"roi"

The region of interest.

"strain"

The strain to consider.

"config"

GLOBAL config variable.


Author(s)
~~~~~~~~~

Florent Chuffart

    
Adding list to a dataframe.
---------------------------


Description
~~~~~~~~~~~

Add a list *l* to a dataframe *df*. Create it if *df* is *NULL*.
Return the dataframe *df*.


Usage
~~~~~

   dfadd(df, l)


Arguments
~~~~~~~~~

"df"

A dataframe.

"l"

A list.


Value
~~~~~

Returns the dataframe *df*.


Author(s)
~~~~~~~~~

Florent Chuffart


Examples
~~~~~~~~

   ## Here dataframe is NULL
   df = NULL
   print(df)

   # Initialize df
   df = dfadd(df, list(key1 = "value1", key2 = "value2"))
   print(df)

   # Adding elements to df
   df = dfadd(df, list(key1 = "value1'", key2 = "value2'"))
   print(df)
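
For readers without the package at hand, the behaviour described above can be approximated with base R. The helper dfadd_sketch below is hypothetical and only mirrors the documented contract (create the dataframe when it is NULL, otherwise append one row); the real dfadd may differ in details such as column type handling.

```r
# Hypothetical stand-in for dfadd(), using only base R.
dfadd_sketch = function(df, l) {
  # Turn the list into a one-row dataframe, keeping strings as strings.
  row = as.data.frame(l, stringsAsFactors = FALSE)
  if (is.null(df)) {
    return(row)        # first call: the row becomes the dataframe
  }
  rbind(df, row)       # later calls: append the row
}

d = NULL
d = dfadd_sketch(d, list(key1 = "value1", key2 = "value2"))
d = dfadd_sketch(d, list(key1 = "value1'", key2 = "value2'"))
print(d)
```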

    
Extract wp nucs from nuc map.
-----------------------------


Description
~~~~~~~~~~~

Function based on common wp nuc index and roi_index.


Usage
~~~~~

   extract_wp(strain_maps, roi_index, strain, tmp_common_nucs)


Arguments
~~~~~~~~~

"strain_maps"

Nuc maps.

"roi_index"

The region of interest index.

"strain"

The strain to consider.

"tmp_common_nucs"

The list of wp nucs.


Author(s)
~~~~~~~~~

Florent Chuffart

    
Prefetch data
-------------


Description
~~~~~~~~~~~

Fetch and filter inputs and outputs per region of interest. Organize
them per replicate.


Usage
~~~~~

   fetch_mnase_replicates(strain, roi, all_samples, config = NULL,
       only_fetch = FALSE, get_genome = FALSE, get_ouputs = TRUE)


Arguments
~~~~~~~~~

"strain"

The strain for which MNase replicates are wanted. List of replicates;
each replicate is a vector of sample ids.

"roi"

Region of interest.

"all_samples"

Global list of samples.

"config"

GLOBAL config variable.

"only_fetch"

If TRUE, only fetch, without filtering. It is used to load sample
files into memory before forking.

"get_genome"

If TRUE, load the corresponding genome sequence.

"get_ouputs"

If TRUE, also get the corresponding TF output files.


Author(s)
~~~~~~~~~

Florent Chuffart

    
Filter TemplateFilter inputs
----------------------------


Description
~~~~~~~~~~~

This function filters TemplateFilter inputs according to the
properties of the observed genome area. It takes into account reads
that lie at the boundary of this area, as well as the strand of these
reads.


Usage
~~~~~

   filter_tf_inputs(inputs, chr, x_min, x_max, nuc_width = 160,
       only_f = FALSE, only_r = FALSE, filter_for_coverage = FALSE)


Arguments
~~~~~~~~~

"inputs"

TF inputs to be filtered.

"chr"

Chromosome observed, here chr is an integer.

"x_min"

Coordinate of the first bp observed.

"x_max"

Coordinate of the last bp observed.

"nuc_width"

Nucleosome width.

"only_f"

Filter only F reads.

"only_r"

Filter only R reads.

"filter_for_coverage"

Whether to filter for the coverage plot.


Value
~~~~~

Returns filtered inputs.


Author(s)
~~~~~~~~~

Florent Chuffart

    
Filter TemplateFilter outputs
-----------------------------


Description
~~~~~~~~~~~

This function filters TemplateFilter outputs according not only to the
properties of the observed genome area, but also to correlation and
overlap thresholds.


Usage
~~~~~

   filter_tf_outputs(tf_outputs, chr, x_min, x_max, nuc_width = 160,
       ol_bp = 59, corr_thres = 0.5)


Arguments
~~~~~~~~~

"tf_outputs"

TemplateFilter outputs.

"chr"

Chromosome observed, here chr is an integer.

"x_min"

Coordinate of the first bp observed.

"x_max"

Coordinate of the last bp observed.

"nuc_width"

Nucleosome width.

"ol_bp"

Overlap threshold.

"corr_thres"

Correlation threshold.


Value
~~~~~

Returns filtered TemplateFilter outputs.


Author(s)
~~~~~~~~~

Florent Chuffart

    
Flatten aggregate_intra_strain_nucs function output
---------------------------------------------------


Description
~~~~~~~~~~~

This function builds a dataframe of all clusters obtained from the
aggregate_intra_strain_nucs function.


Usage
~~~~~

   flat_aggregated_intra_strain_nucs(partial_strain_maps, roi_index)


Arguments
~~~~~~~~~

"partial_strain_maps"

The output of the aggregate_intra_strain_nucs function.

"roi_index"

The index of the roi involved.


Value
~~~~~

Returns a dataframe of all clusters obtained from the
aggregate_intra_strain_nucs function.


Author(s)
~~~~~~~~~

Florent Chuffart

    
Flat reads
----------


Description
~~~~~~~~~~~

Extract read coordinates from the TemplateFilter input sequence.


Usage
~~~~~

   flat_reads(reads, nuc_width)


Arguments
~~~~~~~~~

"reads"

TemplateFilter input reads.

"nuc_width"

Width used to shift F and R reads.


Value
~~~~~

    
Returns a list of F reads, R reads and joint/shifted F and R reads.
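
To make the shifting step concrete, here is a minimal sketch. It assumes reads are laid out as in the examples of this reference (V2 for coordinates, V3 for strand, V4 for counts), with F reads placed half a nucleosome width before the center and R reads half a width after it. flat_reads_sketch is a hypothetical stand-in, and the package's flat_reads may order or name its outputs differently.

```r
# Hypothetical sketch of read flattening, under the layout assumed above.
flat_reads_sketch = function(reads, nuc_width) {
  # Repeat each coordinate as many times as it was counted (column V4).
  pos = rep(reads$V2, reads$V4)
  strand = rep(reads$V3, reads$V4)
  f_reads = pos[strand == "F"]
  r_reads = pos[strand == "R"]
  # Shift both strands toward the presumed nucleosome center.
  joint = sort(c(f_reads + nuc_width / 2, r_reads - nuc_width / 2))
  list(f_reads, r_reads, joint)
}

reads = data.frame(V1 = "1", V2 = c(1075, 1225), V3 = c("F", "R"),
                   V4 = c(2, 1), stringsAsFactors = FALSE)
out = flat_reads_sketch(reads, 150)
print(out[[3]])
```

In this toy input, two F reads at 1075 and one R read at 1225 all map to the same center, 1150, once shifted by half of the 150 bp width.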

    
Author(s)
~~~~~~~~~

Florent Chuffart

    
Retrieve Reads
--------------


Description
~~~~~~~~~~~

Retrieve reads for a given marker, combi, form.


Usage
~~~~~

   get_all_reads(marker, combi, form = "wp", config = NULL)


Arguments
~~~~~~~~~

"marker"

The marker to consider.

"combi"

The strain combination to consider.

"form"

The nuc form to consider.

"config"

GLOBAL config variable.


Author(s)
~~~~~~~~~

Florent Chuffart

    
Get comp strand
---------------


Description
~~~~~~~~~~~

Compute the complementary strand.


Usage
~~~~~

    
   get_comp_strand(strand)
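
As an illustration only, and assuming strands are encoded as "F" and "R" as in the examples elsewhere in this reference, the operation might be sketched as follows. get_comp_strand_sketch is a hypothetical name, not the package function.

```r
# Hypothetical sketch: swap "F" and "R" strand labels (vectorized).
# Any label other than "F" is mapped to "F" here, which is an assumption.
get_comp_strand_sketch = function(strand) {
  ifelse(strand == "F", "R", "F")
}

print(get_comp_strand_sketch(c("F", "R", "F")))
```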

    
Arguments
~~~~~~~~~

"strand"

The original strand.


Value
~~~~~

Returns the complementary strand.


Author(s)
~~~~~~~~~

Florent Chuffart

    
Build the design for deseq
--------------------------


Description
~~~~~~~~~~~

This function builds the design according to sample properties.


Usage
~~~~~

   get_design(marker, combi, all_samples)


Arguments
~~~~~~~~~

"marker"

The marker to consider.

"combi"

The strain combination to consider.

"all_samples"

Global list of samples.


Author(s)
~~~~~~~~~

Florent Chuffart

    
Compute the fuzzy nucs.
-----------------------


Description
~~~~~~~~~~~

This function aggregates the non-common wp nucs of each strain and
subtracts the common wp nucs. It does not take the size of the
resulting fuzzy regions into account; that is handled in the
read-counting part of the pipeline.


Usage
~~~~~

   get_fuzzy(combi, roi, roi_index, strain_maps, common_nuc_results,
       config = NULL)


Arguments
~~~~~~~~~

"combi"

The strain combination to consider.

"roi"

The region of interest.

"roi_index"

The region of interest index.

"strain_maps"

Nuc maps.

"common_nuc_results"

Common wp nuc maps.

"config"

GLOBAL config variable.


Author(s)
~~~~~~~~~

Florent Chuffart

    
Compute the list of SNEPs for a given set of marker, strain combination and nuc form.
-------------------------------------------------------------------------------------


Description
~~~~~~~~~~~

This function uses


Usage
~~~~~

   get_sneps(marker, combi, form, all_samples, config = NULL)


Arguments
~~~~~~~~~

"marker"

The marker involved.

"combi"

The strain combination involved.

"form"

The nuc form involved.

"all_samples"

Global list of samples.

"config"

GLOBAL config variable.


Author(s)
~~~~~~~~~

Florent Chuffart


Examples
~~~~~~~~

   marker = "H3K4me1"
   combi = c("BY", "YJM")
   form = "wpfuzzy" # "wp" | "fuzzy" | "wpfuzzy"
   # foo = get_sneps(marker, combi, form)
   # foo = get_sneps("H4K12ac", c("BY", "RM"), "wp")

    
Likelihood ratio
----------------


Description
~~~~~~~~~~~

Compute the log-likelihood ratio of two sets of values fitted by two
separate models versus a single common model.


Usage
~~~~~

   lod_score_vecs(x, y)


Arguments
~~~~~~~~~

"x"

First vector.

"y"

Second vector.


Value
~~~~~

Returns the likelihood ratio.


Author(s)
~~~~~~~~~

Florent Chuffart


Examples
~~~~~~~~

   # LOD score for 2 sets of values
   mean1=5; sd1=2; card1 = 200
   mean2=6; sd2=3; card2 = 250
   x1 = rnorm(card1, mean1, sd1)
   x2 = rnorm(card2, mean2, sd2)
   min = floor(min(c(x1,x2)))
   max = ceiling(max(c(x1,x2)))
   hist(c(x1,x2), xlim=c(min, max), breaks=min:max)
   lines(min:max,dnorm(min:max,mean1,sd1)*card1,col=2)
   lines(min:max,dnorm(min:max,mean2,sd2)*card2,col=3)
   # Pooled model, scaled by the total number of values
   lines(min:max,dnorm(min:max,mean(c(x1,x2)),sd(c(x1,x2)))*(card1+card2),col=4)
   lod_score_vecs(x1,x2)
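
The comparison of "two models versus a single model" can be written out explicitly. The sketch below is one plausible reading, not the package's code: it assumes normal models fitted by maximum likelihood and natural logarithms, and the names norm_loglik, mle_sd and lod_sketch are hypothetical.

```r
# Log-likelihood of values v under a normal model N(m, s).
norm_loglik = function(v, m, s) sum(dnorm(v, mean = m, sd = s, log = TRUE))

# Maximum-likelihood standard deviation (divides by n, not n - 1).
mle_sd = function(v) sqrt(mean((v - mean(v))^2))

# Hypothetical sketch: log-likelihood of x and y under their own normal
# models, minus the log-likelihood of the pooled values under one model.
lod_sketch = function(x, y) {
  z = c(x, y)
  two_models = norm_loglik(x, mean(x), mle_sd(x)) +
    norm_loglik(y, mean(y), mle_sd(y))
  one_model = norm_loglik(z, mean(z), mle_sd(z))
  two_models - one_model
}

set.seed(1)
a = rnorm(200, 5, 2)
b = rnorm(250, 6, 3)
print(lod_sketch(a, b))
```

Because the separate fits can always do at least as well as the pooled fit under maximum likelihood, this score is non-negative up to rounding, and it grows as the two sets of values become easier to tell apart.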

    
nm
--


Description
~~~~~~~~~~~

It provides a set of useful functions for performing quantitative
analysis of the nucleosomal epigenome.


Details
~~~~~~~

+-----------------+-----------------------------------------------------+
| Package:        | nucleominer                                         |
+-----------------+-----------------------------------------------------+
| Maintainer:     | Florent Chuffart <florent.chuffart@ens-lyon.fr>     |
+-----------------+-----------------------------------------------------+
| Author:         | Florent Chuffart                                    |
+-----------------+-----------------------------------------------------+
| Version:        | 2.3.28                                              |
+-----------------+-----------------------------------------------------+
| License:        | CeCILL                                              |
+-----------------+-----------------------------------------------------+
| Title:          | nm                                                  |
+-----------------+-----------------------------------------------------+
| Depends:        | seqinr, plotrix, DESeq, cachecache                  |
+-----------------+-----------------------------------------------------+


Author(s)
~~~~~~~~~

Florent Chuffart

    
Performing ANOVAs
-----------------


Description
~~~~~~~~~~~

Counts reads and performs ANOVAs for each common nucleosome involved.


Usage
~~~~~

   perform_anovas(replicates, aligned_inter_strain_nucs, inputs_name = "Mnase_Seq",
       plot_anova_boxes = FALSE)


Arguments
~~~~~~~~~

"replicates"

Set of replicates, each replicate is a list of samples (ideally 3).
Each sample is a list like *sample = list(id=..., marker=...,
strain=..., roi=..., inputs=..., outputs=...)* with *roi =
list(name=..., begin=..., end=..., chr=..., genome=...)*. In the
*perform_anovas* context, we need 4 replicates (4 * (3 samples)): 2
strains * (1 marker + 1 input (Mnase_Seq)).

"aligned_inter_strain_nucs"

List of common nucleosomes.

"inputs_name"

Name of the input.

"plot_anova_boxes"

Plot (or not) a boxplot for each nuc.


Value
~~~~~

Returns ANOVA results and counted reads.


Author(s)
~~~~~~~~~

Florent Chuffart

    
Plot the distribution of reads.
-------------------------------


Description
~~~~~~~~~~~

This function uses the DESeq normalization feature to compare the
distributions qualitatively.


Usage
~~~~~

   plot_dist_samples(strain, marker, res, all_samples, NEWPLOT = TRUE)


Arguments
~~~~~~~~~

"strain"

The strain to consider.

"marker"

The marker to consider.

"res"

Data.

"all_samples"

Global list of samples.

"NEWPLOT"

If FALSE, the curve is added to the current plot.


Author(s)
~~~~~~~~~

Florent Chuffart

    
Remove wp nucs from common nucs list.
-------------------------------------


Description
~~~~~~~~~~~

It is based on common wp nucs index on nucs and region.


Usage
~~~~~

   remove_aligned_wp(strain_maps, roi_index, tmp_common_nucs, strain)


Arguments
~~~~~~~~~

"strain_maps"

Nuc maps.

"roi_index"

The region of interest index.

"tmp_common_nucs"

The list of wp nucs.

"strain"

The strain to consider.


Author(s)
~~~~~~~~~

Florent Chuffart

    
Sign from strand
----------------


Description
~~~~~~~~~~~

Get the sign of the strand.


Usage
~~~~~

    
   sign_from_strand(strands)
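
A minimal sketch of this mapping, assuming the forward strand is encoded as "F" (as in the examples elsewhere in this reference). sign_from_strand_sketch is a hypothetical name, not the package function.

```r
# Hypothetical sketch: 1 for forward ("F") strands, -1 for anything else.
sign_from_strand_sketch = function(strands) {
  ifelse(strands == "F", 1, -1)
}

print(sign_from_strand_sketch(c("F", "R", "F")))
```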

    
1439

    
1440
Arguments
1441
~~~~~~~~~
1442

    
"strands"

The strands whose sign is returned.

    
Value
~~~~~

Returns 1 if the strand is forward, -1 otherwise.


Author(s)
~~~~~~~~~

Florent Chuffart
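

Examples
~~~~~~~~

The rule stated under Value can be illustrated with a minimal
standalone sketch. This is not the package's own code: the name
sign_from_strand_sketch and the "F"/"R" strand encoding are
assumptions made for illustration only.

```r
# Hypothetical re-implementation; the "F" (forward) / "R" (reverse)
# encoding is an assumption, not taken from the package.
sign_from_strand_sketch = function(strands) {
  # 1 for forward strands, -1 for anything else
  ifelse(strands == "F", 1, -1)
}

signs = sign_from_strand_sketch(c("F", "R", "F"))
print(signs)
```

Because ifelse is vectorized, the sketch accepts a single strand or a
vector of strands.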

    
Subtract from a list of regions another list of regions that intersect it.
--------------------------------------------------------------------------


Description
~~~~~~~~~~~

This function includes a recursive step, which occurs when a
subtracted region splits an original region in two.


Usage
~~~~~

   substract_region(region1, region2)


Arguments
~~~~~~~~~

"region1"

Original regions.

"region2"

Regions to subtract.


Author(s)
~~~~~~~~~

Florent Chuffart
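

Examples
~~~~~~~~

The splitting case described above can be sketched as follows. This is
a standalone illustration, not the package's implementation: the
helper names, the list(begin = , end = ) representation of a region
and the inclusive integer coordinates are all assumptions. Here the
recursion runs over the remaining regions to subtract, so that both
halves of a split region are checked again; the package's actual
recursion may be organised differently.

```r
# Hypothetical sketch: a region is list(begin = , end = ) with
# begin <= end, using inclusive integer coordinates (assumption).
subtract_one = function(region, sub) {
  # No intersection: the region is kept unchanged.
  if (sub$end < region$begin || sub$begin > region$end) {
    return(list(region))
  }
  out = list()
  # Part of the region left of the subtracted region.
  if (sub$begin > region$begin) {
    out[[length(out) + 1]] = list(begin = region$begin, end = sub$begin - 1)
  }
  # Part of the region right of the subtracted region.
  if (sub$end < region$end) {
    out[[length(out) + 1]] = list(begin = sub$end + 1, end = region$end)
  }
  out
}

subtract_sketch = function(regions, subs) {
  if (length(subs) == 0) {
    return(regions)
  }
  # Subtract the first region of subs from every original region...
  pieces = list()
  for (region in regions) {
    pieces = c(pieces, subtract_one(region, subs[[1]]))
  }
  # ...then recurse on the pieces with the remaining regions to subtract.
  subtract_sketch(pieces, subs[-1])
}

regions = list(list(begin = 1, end = 100))
subs = list(list(begin = 40, end = 60))
pieces = subtract_sketch(regions, subs)
str(pieces)
```

With these inputs the original region is split in two, giving one
region from 1 to 39 and another from 61 to 100.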

    
Switch a pairlist
-----------------


Description
~~~~~~~~~~~

Takes a key:value pairlist and returns the switched value:key pairlist.


Usage
~~~~~

   switch_pairlist(l)


Arguments
~~~~~~~~~

"l"

The pairlist to switch.


Value
~~~~~

The switched pairlist.


Author(s)
~~~~~~~~~

Florent Chuffart


Examples
~~~~~~~~

   l = list(key1 = "value1", key2 = "value2")
   print(switch_pairlist(l))
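
To make the expected result concrete without loading the package, the
inversion can be sketched in base R. The name switch_pairlist_sketch
is hypothetical, and the sketch assumes that the values are unique
character strings; it is not the package's own code.

```r
# Hypothetical re-implementation for illustration; assumes unique
# character values (assumption, not stated by the package).
switch_pairlist_sketch = function(l) {
  # Former values become the names, former names become the values.
  setNames(as.list(names(l)), unlist(l, use.names = FALSE))
}

l = list(key1 = "value1", key2 = "value2")
switched = switch_pairlist_sketch(l)
print(switched$value1)
```

Here switched$value1 is "key1" and switched$value2 is "key2".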

    
Translate a list of regions from a strain ref to another.
---------------------------------------------------------


Description
~~~~~~~~~~~

This function is an elaborate call to translate_roi.


Usage
~~~~~

   translate_regions(regions, combi, roi_index, config = NULL, roi)


Arguments
~~~~~~~~~

"regions"

Regions to be translated.

"combi"

Combination of strains.

"roi_index"

The region of interest index.

"config"

GLOBAL config variable.

"roi"

The region of interest.


Author(s)
~~~~~~~~~

Florent Chuffart

    
Translate coords of a genome region.
------------------------------------


Description
~~~~~~~~~~~

This function is used in the examples. Usually you have to define your
own translation function and overwrite this one using the
*unlockBinding* feature. Please refer to the example.


Usage
~~~~~

   translate_roi(roi, strain2, config = NULL, big_roi = NULL)


Arguments
~~~~~~~~~

"roi"

Original genome region of interest.

"strain2"

The strain in which you want the genome region of interest.

"config"

GLOBAL config variable.

"big_roi"

A region larger than roi, used to filter c2c if needed.


Author(s)
~~~~~~~~~

Florent Chuffart


Examples
~~~~~~~~

   # Define new translate_roi function...
   translate_roi = function(roi, strain2, config) {
       strain1 = roi$strain_ref
       if (strain1 == strain2) {
           return(roi)
       } else {
           stop("Here is my new translate_roi function...")
       }
   }
   # Bind it by uncommenting the following lines.
   # unlockBinding("translate_roi", as.environment("package:nm"))
   # unlockBinding("translate_roi", getNamespace("nm"))
   # assign("translate_roi", translate_roi, "package:nm")
   # assign("translate_roi", translate_roi, getNamespace("nm"))
   # lockBinding("translate_roi", getNamespace("nm"))
   # lockBinding("translate_roi", as.environment("package:nm"))

    
Aggregate regions that intersect each other.
--------------------------------------------


Description
~~~~~~~~~~~

This function sorts the regions by lower bound to detect those that
intersect: the lower bound of each region is compared with the upper
bound of the previous item. It runs a while loop that breaks once the
regions list becomes stable.


Usage
~~~~~

   union_regions(regions)


Arguments
~~~~~~~~~

"regions"

The regions to be aggregated.


Author(s)
~~~~~~~~~

Florent Chuffart
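

Examples
~~~~~~~~

The sort-and-compare approach described above can be sketched in base
R. This is a standalone illustration under stated assumptions, not the
package's implementation: the name union_sketch and the
list(begin = , end = ) representation of a region are hypothetical.

```r
# Hypothetical sketch: a region is list(begin = , end = ) with
# begin <= end (assumption, not the package's data structure).
union_sketch = function(regions) {
  if (length(regions) < 2) {
    return(regions)
  }
  # Sort regions by lower bound.
  lower = sapply(regions, function(r) r$begin)
  regions = regions[order(lower)]
  stable = FALSE
  # Repeat until a full pass merges nothing.
  while (!stable) {
    stable = TRUE
    i = 2
    while (i <= length(regions)) {
      # Compare the lower bound with the upper bound of the previous item.
      if (regions[[i]]$begin <= regions[[i - 1]]$end) {
        # Intersection: extend the previous region and drop this one.
        regions[[i - 1]]$end = max(regions[[i - 1]]$end, regions[[i]]$end)
        regions[[i]] = NULL
        stable = FALSE
      } else {
        i = i + 1
      }
    }
  }
  regions
}

regions = list(
  list(begin = 50, end = 80),
  list(begin = 1, end = 10),
  list(begin = 5, end = 20),
  list(begin = 70, end = 90)
)
merged = union_sketch(regions)
str(merged)
```

The four input regions collapse into two: one from 1 to 20 and one
from 50 to 90.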

    
Watching analysis of samples
----------------------------


Description
~~~~~~~~~~~

This function displays the analysis for a particular region of the
genome.


Usage
~~~~~

   watch_samples(replicates, read_length, plot_ref_genome = TRUE,
       plot_arrow_raw_reads = TRUE, plot_arrow_nuc_reads = TRUE,
       plot_squared_reads = TRUE, plot_coverage = FALSE, plot_gaussian_reads = TRUE,
       plot_gaussian_unified_reads = TRUE, plot_ellipse_nucs = TRUE,
       change_col = TRUE, plot_wp_nucs = TRUE, plot_wp_nuc_model = TRUE,
       plot_common_nucs = TRUE, plot_anovas = FALSE, plot_anova_boxes = FALSE,
       plot_wp_nucs_4_nonmnase = FALSE, plot_chain = FALSE, aggregated_intra_strain_nucs = NULL,
       aligned_inter_strain_nucs = NULL, height = 10, config = NULL)

    
Arguments
~~~~~~~~~

"replicates"

Replicates under the form...

"read_length"

Length of the reads.

"plot_ref_genome"

Plot (or not) reference genome.

"plot_arrow_raw_reads"

Plot (or not) arrows for raw reads.

"plot_arrow_nuc_reads"

Plot (or not) arrows for reads associated with a nucleosome.

"plot_squared_reads"

Plot (or not) reads in the square fashion.

"plot_coverage"

Plot (or not) reads in the coverage fashion.

"plot_gaussian_reads"

Plot (or not) gaussian model of F and R reads.

"plot_gaussian_unified_reads"

Plot (or not) gaussian model of a nuc.

"plot_ellipse_nucs"

Plot (or not) ellipse for a nuc.

"change_col"

Change the color of each nucleosome.

"plot_wp_nucs"

Plot (or not) cluster of nucs.

"plot_wp_nuc_model"

Plot (or not) gaussian model for a cluster of nucs.

"plot_common_nucs"

Plot (or not) aligned reads.

"plot_anovas"

Plot (or not) scatter for each nuc.

"plot_anova_boxes"

Plot (or not) boxplot for each nuc.

"plot_wp_nucs_4_nonmnase"

Plot (or not) clusters for non-input samples.

"plot_chain"

Plot (or not) clustered nucleosomes between MNase samples.

"aggregated_intra_strain_nucs"

List of aggregated intra-strain nucs. If NULL, it will be computed.

"aligned_inter_strain_nucs"

List of aligned inter-strain nucs. If NULL, it will be computed.

"height"

Number of reads per million reads for each sample; graphical
parameter for the y axis.

"config"

GLOBAL config variable.


Author(s)
~~~~~~~~~

Florent Chuffart