Arabic to Roman pair list.
--------------------------

Description
~~~~~~~~~~~

Utility to convert Arabic numerals to Roman numerals.

Usage
~~~~~

::

    ARAB2ROM()

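
As an illustration only, a pair list of this kind can be built with base R's ``as.roman``. The sketch assumes that the pair list is a named list keyed by Arabic numerals stored as strings; ``arab2rom_sketch`` and ``n_max`` are hypothetical names, not part of the package.

::

    # Hypothetical stand-in for ARAB2ROM(): "1" -> "I", "2" -> "II", ...
    arab2rom_sketch = function(n_max = 16) {
        arabic = seq_len(n_max)
        roman = as.character(utils::as.roman(arabic))
        # Keys are the Arabic numerals as strings, values the Roman numerals
        setNames(as.list(roman), as.character(arabic))
    }

    pairs = arab2rom_sketch()
    print(pairs[["8"]])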

    
Author(s)
~~~~~~~~~

Florent Chuffart

    
False Discovery Rate
--------------------

Description
~~~~~~~~~~~

From a vector x of independent p-values, extract the cutoff
corresponding to the specified FDR. See the Benjamini & Hochberg (1995) paper.

Usage
~~~~~

::

    FDR(x, FDR)

Arguments
~~~~~~~~~

``x``

A vector x of independent p-values.

``FDR``

The specified FDR.

Value
~~~~~

Returns the corresponding cutoff.

Author(s)
~~~~~~~~~

Gael Yvert, Florent Chuffart

Examples
~~~~~~~~

::

    print("example")

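
The Benjamini-Hochberg step-up rule referred to in the description can be sketched as follows. This is a minimal illustration, not the package implementation: ``fdr_cutoff_sketch`` is a hypothetical name, and returning ``NA`` when no p-value passes is an assumption.

::

    # Largest sorted p-value p_(k) such that p_(k) <= (k / m) * fdr
    fdr_cutoff_sketch = function(x, fdr) {
        p = sort(x)
        m = length(p)
        passing = which(p <= (seq_len(m) / m) * fdr)
        if (length(passing) == 0) {
            return(NA)  # assumption: no cutoff when nothing passes
        }
        p[max(passing)]
    }

    x = c(0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216)
    cutoff = fdr_cutoff_sketch(x, 0.05)
    print(cutoff)
    # Cross-check: number of p-values kept by base R's BH adjustment
    print(sum(p.adjust(x, method = "BH") <= 0.05))

Every p-value at or below the returned cutoff is declared significant at the requested FDR.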

    
Roman to Arabic pair list.
--------------------------

Description
~~~~~~~~~~~

Utility to convert Roman numerals to Arabic numerals.

Usage
~~~~~

::

    ROM2ARAB()

Author(s)
~~~~~~~~~

Florent Chuffart

    
Aggregate replicated sample's nucleosomes.
------------------------------------------

Description
~~~~~~~~~~~

This function aggregates nucleosomes for replicated samples. It uses the
TemplateFilter output of each sample as a replicate. Each sample owns a set
of nucleosomes computed using TemplateFilter and ordered by the position
of their center. Adjacent nucleosomes are compared two by two.
Comparison is based on a log likelihood ratio score. The outcome of each
comparison is either the merging or the separation of the adjacent
nucleosomes. Finally the function returns a list of clusters and all
computed *lod\_scores*. Each cluster owns an attribute *wp* for "well
positioned". This attribute is set to *TRUE* if the cluster is composed
of exactly one nucleosome of each sample.

Usage
~~~~~

::

    aggregate_intra_strain_nucs(samples, lod_thres = -20, coord_max = 2e+07)

Arguments
~~~~~~~~~

``samples``

A list of samples. Each sample is a list like *sample = list(id=...,
marker=..., strain=..., roi=..., inputs=..., outputs=...)* with *roi =
list(name=..., begin=..., end=..., chr=..., genome=...)*.

``lod_thres``

Log likelihood ratio threshold.

``coord_max``

A value too large to be a valid coordinate for a nucleosome lower bound.

Value
~~~~~

Returns a list of clusterized nucleosomes, and all computed lod scores.

Author(s)
~~~~~~~~~

Florent Chuffart

Examples
~~~~~~~~

::

    
    # Dealing with a region of interest
    roi = list(name="example", begin=1000, end=1300, chr="1", genome=rep("A",301))
    samples = list()
    for (i in 1:3) {
        # Create TF output
        tf_nuc = list("chr"=paste("chr", roi$chr, sep=""), "center"=(roi$end + roi$begin)/2, "width"= 150, "correlation.score"= 0.9)
        outputs = dfadd(NULL,tf_nuc)
        outputs = filter_tf_outputs(outputs, roi$chr, roi$begin, roi$end)
        # Generate corresponding reads
        nb_reads = round(runif(1,170,230))
        reads = round(rnorm(nb_reads, tf_nuc$center,20))
        u_reads = sort(unique(reads))
        strands = sample(c(rep("R",ceiling(length(u_reads)/2)),rep("F",floor(length(u_reads)/2))))
        counts = apply(t(u_reads), 2, function(r) { sum(reads == r)})
        shifts = apply(t(strands), 2, function(s) { if (s == "F") return(-tf_nuc$width/2) else return(tf_nuc$width/2)})
        u_reads = u_reads + shifts
        inputs = data.frame(list("V1" = rep(roi$chr, length(u_reads)),
                                 "V2" = u_reads,
                                 "V3" = strands,
                                 "V4" = counts), stringsAsFactors=FALSE)
        samples[[length(samples) + 1]] = list(id=1, marker="Mnase_Seq", strain="strain_ex", total_reads = 10000000, roi=roi, inputs=inputs, outputs=outputs)
    }
    print(aggregate_intra_strain_nucs(samples))

    
Aligns nucleosomes between 2 strains.
-------------------------------------

Description
~~~~~~~~~~~

This function aligns nucleosomes between two strains for a given genome region.

Usage
~~~~~

::

    align_inter_strain_nucs(replicates, wp_nucs_strain_ref1 = NULL,
        wp_nucs_strain_ref2 = NULL, corr_thres = 0.5, lod_thres = -100,
        config = NULL, ...)

Arguments
~~~~~~~~~

``replicates``

Set of replicates, ideally 3 per strain.

``wp_nucs_strain_ref1``

List of aggregated nucleosomes for strain 1. If it is NULL, this list will
be computed.

``wp_nucs_strain_ref2``

List of aggregated nucleosomes for strain 2. If it is NULL, this list will
be computed.

``corr_thres``

Correlation threshold.

``lod_thres``

LOD cutoff.

``config``

GLOBAL config variable

``...``

A list of parameters that will be passed to
*aggregate\_intra\_strain\_nucs* if needed.

Value
~~~~~

Returns a list of clusterized nucleosomes, and all computed lod scores.

Author(s)
~~~~~~~~~

Florent Chuffart

Examples
~~~~~~~~

::

    
    # Define new translate_roi function...
    translate_roi = function(roi, strain2, big_roi=NULL, config=NULL) {
        return(roi)
    }
    # ...and bind it in place of the package version.
    unlockBinding("translate_roi", as.environment("package:nucleominer"))
    unlockBinding("translate_roi", getNamespace("nucleominer"))
    assign("translate_roi", translate_roi, "package:nucleominer")
    assign("translate_roi", translate_roi, getNamespace("nucleominer"))
    lockBinding("translate_roi", getNamespace("nucleominer"))
    lockBinding("translate_roi", as.environment("package:nucleominer"))

    # Dealing with a region of interest
    roi = list(name="example", begin=1000, end=1300, chr="1", genome=rep("A",301), strain_ref1 = "STRAINREF1")
    roi2 = translate_roi(roi, roi$strain_ref1)
    replicates = list()
    for (j in 1:2) {
        samples = list()
        for (i in 1:3) {
            # Create TF output
            tf_nuc = list("chr"=paste("chr", roi$chr, sep=""), "center"=(roi$end + roi$begin)/2, "width"= 150, "correlation.score"= 0.9)
            outputs = dfadd(NULL,tf_nuc)
            outputs = filter_tf_outputs(outputs, roi$chr, roi$begin, roi$end)
            # Generate corresponding reads
            nb_reads = round(runif(1,170,230))
            reads = round(rnorm(nb_reads, tf_nuc$center,20))
            u_reads = sort(unique(reads))
            strands = sample(c(rep("R",ceiling(length(u_reads)/2)),rep("F",floor(length(u_reads)/2))))
            counts = apply(t(u_reads), 2, function(r) { sum(reads == r)})
            shifts = apply(t(strands), 2, function(s) { if (s == "F") return(-tf_nuc$width/2) else return(tf_nuc$width/2)})
            u_reads = u_reads + shifts
            inputs = data.frame(list("V1" = rep(roi$chr, length(u_reads)),
                                     "V2" = u_reads,
                                     "V3" = strands,
                                     "V4" = counts), stringsAsFactors=FALSE)
            samples[[length(samples) + 1]] = list(id=1, marker="Mnase_Seq", strain=paste("strain_ex",j,sep=""), total_reads = 10000000, roi=roi, inputs=inputs, outputs=outputs)
        }
        replicates[[length(replicates) + 1]] = samples
    }
    print(align_inter_strain_nucs(replicates))

    
Launch deseq methods.
---------------------

Description
~~~~~~~~~~~

This function is based on the DESeq example. It normalizes the data, fits
the data to a GLM model with and without the interaction term, and compares
the two models.

Usage
~~~~~

::

    analyse_design(snep_design, reads)

Arguments
~~~~~~~~~

``snep_design``

The design to consider.

``reads``

The data to consider.

Author(s)
~~~~~~~~~

Florent Chuffart

    
Stage replicates data
---------------------

Description
~~~~~~~~~~~

This function loads into memory the data corresponding to the given
experiments.

Usage
~~~~~

::

    build_replicates(expe, roi, only_fetch = FALSE, get_genome = FALSE,
        all_samples, config = NULL)

Arguments
~~~~~~~~~

``expe``

A list of vectors, each vector corresponding to a set of replicates.

``roi``

The region that we are interested in.

``only_fetch``

Whether or not to filter the inputs.

``get_genome``

Whether or not to load the corresponding genome.

``all_samples``

Global list of samples.

``config``

GLOBAL config variable.

Author(s)
~~~~~~~~~

Florent Chuffart

Examples
~~~~~~~~

::

    # library(rjson)
    # library(nucleominer)
    #
    # # Read config file
    # json_conf_file = "nucleo_miner_config.json"
    # config = fromJSON(paste(readLines(json_conf_file), collapse=""))
    # # Read sample file
    # all_samples = get_content(config$CSV_SAMPLE_FILE, "cvs", sep=";", head=TRUE, stringsAsFactors=FALSE)
    # # here are the sample ids in a list
    # expes = list(c(1))
    # # here is the region for which we want to see the coverage
    # cur = list(chr="8", begin=472000, end=474000, strain_ref="BY")
    # # it displays the coverage
    # replicates = build_replicates(expes, cur, all_samples=all_samples, config=config)
    # out = watch_samples(replicates, config$READ_LENGTH,
    #       plot_coverage = TRUE,
    #       plot_squared_reads = FALSE,
    #       plot_ref_genome = FALSE,
    #       plot_arrow_raw_reads = FALSE,
    #       plot_arrow_nuc_reads = FALSE,
    #       plot_gaussian_reads = FALSE,
    #       plot_gaussian_unified_reads = FALSE,
    #       plot_ellipse_nucs = FALSE,
    #       plot_wp_nucs = FALSE,
    #       plot_wp_nuc_model = FALSE,
    #       plot_common_nucs = FALSE,
    #       height = 50)

    
reformat an "apply manipulated" list of regions
-----------------------------------------------

Description
~~~~~~~~~~~

Utility to reformat an "apply manipulated" list of regions.

Usage
~~~~~

::

    collapse_regions(regions)

Arguments
~~~~~~~~~

+---------------+----+
| ``regions``   |    |
+---------------+----+

Author(s)
~~~~~~~~~

Florent Chuffart

    
Compute Common Uninterrupted Regions (CUR)
------------------------------------------

Description
~~~~~~~~~~~

CURs are regions that can be aligned between the genomes.

Usage
~~~~~

::

    compute_inter_all_strain_curs(diff_allowed = 10, min_cur_width = 200,
        config = NULL, plot = FALSE)

Arguments
~~~~~~~~~

``diff_allowed``

The maximum indel width allowed in a CUR.

``min_cur_width``

The minimum width of a CUR.

``config``

GLOBAL config variable

``plot``

Whether or not to plot the CURs.

Author(s)
~~~~~~~~~

Florent Chuffart

    
Crop bound of regions according to region of interest bound
-----------------------------------------------------------

Description
~~~~~~~~~~~

This function is no longer necessary since the "big\_roi" bug in the
translate\_roi function was removed.

Usage
~~~~~

::

    crop_fuzzy(tmp_fuzzy_nucs, roi, strain, config = NULL)

Arguments
~~~~~~~~~

``tmp_fuzzy_nucs``

The regions to be cropped.

``roi``

The region of interest.

``strain``

The strain to consider.

``config``

GLOBAL config variable

Author(s)
~~~~~~~~~

Florent Chuffart

    
Adding list to a dataframe.
---------------------------

Description
~~~~~~~~~~~

Add a list *l* to a dataframe *df*. Create it if *df* is *NULL*. Return
the dataframe *df*.

Usage
~~~~~

::

    dfadd(df, l)

Arguments
~~~~~~~~~

``df``

A dataframe

``l``

A list

Value
~~~~~

Return the dataframe *df*.

Author(s)
~~~~~~~~~

Florent Chuffart

Examples
~~~~~~~~

::

    ## Here dataframe is NULL
    df = NULL
    print(df)

    # Initialize df
    df = dfadd(df, list(key1 = "value1", key2 = "value2"))
    print(df)

    # Adding elements to df
    df = dfadd(df, list(key1 = "value1'", key2 = "value2'"))
    print(df)

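
The behaviour described above (create the dataframe when it is *NULL*, otherwise append a row) can be sketched in base R. This is an assumption about how such a helper might work, not the package source; ``dfadd_sketch`` is a hypothetical name.

::

    # Hypothetical re-implementation of the documented behaviour
    dfadd_sketch = function(df, l) {
        # Turn the list into a one-row dataframe, keeping strings as strings
        row = as.data.frame(l, stringsAsFactors = FALSE)
        if (is.null(df)) {
            return(row)        # first call: the row becomes the dataframe
        }
        rbind(df, row)         # later calls: append the row
    }

    demo_df = dfadd_sketch(NULL, list(key1 = "value1", key2 = "value2"))
    demo_df = dfadd_sketch(demo_df, list(key1 = "value1'", key2 = "value2'"))
    print(demo_df)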

    
Extract wp nucs from nuc map.
-----------------------------

Description
~~~~~~~~~~~

Function based on common wp nuc index and roi\_index.

Usage
~~~~~

::

    extract_wp(strain_maps, roi_index, strain, tmp_common_nucs)

Arguments
~~~~~~~~~

``strain_maps``

Nuc maps.

``roi_index``

The region of interest index.

``strain``

The strain to consider.

``tmp_common_nucs``

The list of wp nucs.

Author(s)
~~~~~~~~~

Florent Chuffart

    
Prefetch data
-------------

Description
~~~~~~~~~~~

Fetch and filter inputs and outputs per region of interest. Organize them
per replicate.

Usage
~~~~~

::

    fetch_mnase_replicates(strain, roi, all_samples, config = NULL,
        only_fetch = FALSE, get_genome = FALSE, get_ouputs = TRUE)

Arguments
~~~~~~~~~

``strain``

The strain for which we want MNase replicates. List of replicates. Each
replicate is a vector of sample ids.

``roi``

Region of interest.

``all_samples``

Global list of samples.

``config``

GLOBAL config variable

``only_fetch``

If TRUE, only fetch, without filtering. It is used to load sample files
into memory before forking.

``get_genome``

If TRUE, load the corresponding genome sequence.

``get_ouputs``

If TRUE, also get the corresponding TF output files.

Author(s)
~~~~~~~~~

Florent Chuffart

    
Filter TemplateFilter inputs
----------------------------

Description
~~~~~~~~~~~

This function filters TemplateFilter inputs according to the properties
of the observed genome area. It takes into account reads that are at the
boundary of this area and the strand of these reads.

Usage
~~~~~

::

    filter_tf_inputs(inputs, chr, x_min, x_max, nuc_width = 160,
        only_f = FALSE, only_r = FALSE)

Arguments
~~~~~~~~~

``inputs``

TF inputs to be filtered.

``chr``

Chromosome observed, here chr is an integer.

``x_min``

Coordinate of the first bp observed.

``x_max``

Coordinate of the last bp observed.

``nuc_width``

Nucleosome width.

``only_f``

Filter only F reads.

``only_r``

Filter only R reads.

Value
~~~~~

Returns filtered inputs.

Author(s)
~~~~~~~~~

Florent Chuffart

    
Filter TemplateFilter outputs
-----------------------------

Description
~~~~~~~~~~~

This function filters TemplateFilter outputs according not only to the
properties of the observed genome area, but also to correlation and
overlap thresholds.

Usage
~~~~~

::

    filter_tf_outputs(tf_outputs, chr, x_min, x_max, nuc_width = 160,
        ol_bp = 59, corr_thres = 0.5)

Arguments
~~~~~~~~~

``tf_outputs``

TemplateFilter outputs.

``chr``

Chromosome observed, here chr is an integer.

``x_min``

Coordinate of the first bp observed.

``x_max``

Coordinate of the last bp observed.

``nuc_width``

Nucleosome width.

``ol_bp``

Overlap threshold.

``corr_thres``

Correlation threshold.

Value
~~~~~

Returns filtered TemplateFilter outputs.

Author(s)
~~~~~~~~~

Florent Chuffart

    
flat reads
----------

Description
~~~~~~~~~~~

Extract read coordinates from the TemplateFilter input sequence.

Usage
~~~~~

::

    flat_reads(reads, nuc_width)

Arguments
~~~~~~~~~

``reads``

TemplateFilter input reads

``nuc_width``

Width used to shift F and R reads.

Value
~~~~~

Returns a list of F reads, R reads and joint/shifted F and R reads.

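
A rough sketch of this flattening is given below. It assumes the input layout used in the examples of this reference (``V2`` = coordinate, ``V3`` = strand, ``V4`` = count) and assumes that F reads are shifted by plus half of ``nuc_width`` and R reads by minus half of it; ``flat_reads_sketch`` is a hypothetical name and the package may order or shift its outputs differently.

::

    flat_reads_sketch = function(reads, nuc_width) {
        f = reads[reads$V3 == "F", ]
        r = reads[reads$V3 == "R", ]
        # Repeat each coordinate as many times as it was counted
        f_reads = rep(f$V2, f$V4)
        r_reads = rep(r$V2, r$V4)
        # Shift both strands towards the presumed nucleosome center
        joint = sort(c(f_reads + nuc_width / 2, r_reads - nuc_width / 2))
        list(f_reads, r_reads, joint)
    }

    demo_reads = data.frame(V1 = "1", V2 = c(1075, 1225), V3 = c("F", "R"),
                            V4 = c(2, 1), stringsAsFactors = FALSE)
    flat = flat_reads_sketch(demo_reads, 150)
    print(flat[[3]])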

    
Author(s)
~~~~~~~~~

Florent Chuffart

    
Retrieve Reads
--------------

Description
~~~~~~~~~~~

Retrieve reads for a given marker, combi, form.

Usage
~~~~~

::

    get_all_reads(marker, combi, form = "wp")

Arguments
~~~~~~~~~

``marker``

The marker to consider.

``combi``

The strain combination to consider.

``form``

The nuc form to consider.

Author(s)
~~~~~~~~~

Florent Chuffart

    
get comp strand
---------------

Description
~~~~~~~~~~~

Compute the complementary strand.

Usage
~~~~~

::

    get_comp_strand(strand)

Arguments
~~~~~~~~~

``strand``

The original strand.

Value
~~~~~

Returns the complementary strand.

Author(s)
~~~~~~~~~

Florent Chuffart

    
Build the design for deseq
--------------------------

Description
~~~~~~~~~~~

This function builds the design according to sample properties.

Usage
~~~~~

::

    get_design(marker, combi, all_samples)

Arguments
~~~~~~~~~

``marker``

The marker to consider.

``combi``

The strain combination to consider.

``all_samples``

Global list of samples.

Author(s)
~~~~~~~~~

Florent Chuffart

    
Compute the fuzzy nucs.
-----------------------

Description
~~~~~~~~~~~

This function aggregates non-common wp nucs for each strain and subtracts
common wp nucs. It does not take into account the size of the resulting
fuzzy regions. That is taken into account in the read-counting part of
the pipeline.

Usage
~~~~~

::

    get_fuzzy(combi, roi, roi_index, strain_maps, common_nuc_results,
        config = NULL)

Arguments
~~~~~~~~~

``combi``

The strain combination to consider.

``roi``

The region of interest.

``roi_index``

The region of interest index.

``strain_maps``

Nuc maps.

``common_nuc_results``

Common wp nuc maps

``config``

GLOBAL config variable

Author(s)
~~~~~~~~~

Florent Chuffart

    
Compute the list of SNEPs for a given set of marker, strain combination and nuc form.
-------------------------------------------------------------------------------------

Description
~~~~~~~~~~~

This function uses

Usage
~~~~~

::

    get_sneps(marker, combi, form, all_samples)

Arguments
~~~~~~~~~

``marker``

The marker involved.

``combi``

The strain combination involved.

``form``

The nuc form involved.

``all_samples``

Global list of samples.

Author(s)
~~~~~~~~~

Florent Chuffart

Examples
~~~~~~~~

::

    marker = "H3K4me1"
    combi = c("BY", "YJM")
    form = "wpfuzzy" # "wp" | "fuzzy" | "wpfuzzy"
    # foo = get_sneps(marker, combi, form)
    # foo = get_sneps("H4K12ac", c("BY", "RM"), "wp")

    
Likelihood ratio
----------------

Description
~~~~~~~~~~~

Compute the log likelihood ratio of two sets of values under two separate
models versus a single common model.

Usage
~~~~~

::

    lod_score_vecs(x, y)

Arguments
~~~~~~~~~

``x``

First vector.

``y``

Second vector.

Value
~~~~~

Returns the likelihood ratio.

Author(s)
~~~~~~~~~

Florent Chuffart

Examples
~~~~~~~~

::

    # LOD score for 2 sets of values
    mean1=5; sd1=2; card1 = 200
    mean2=6; sd2=3; card2 = 250
    x1 = rnorm(card1, mean1, sd1)
    x2 = rnorm(card2, mean2, sd2)
    min = floor(min(c(x1,x2)))
    max = ceiling(max(c(x1,x2)))
    hist(c(x1,x2), xlim=c(min, max), breaks=min:max)
    lines(min:max,dnorm(min:max,mean1,sd1)*card1,col=2)
    lines(min:max,dnorm(min:max,mean2,sd2)*card2,col=3)
    # The common model describes all card1 + card2 values
    lines(min:max,dnorm(min:max,mean(c(x1,x2)),sd(c(x1,x2)))*(card1+card2),col=4)
    lod_score_vecs(x1,x2)

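
The comparison of "two models versus a single model" can be illustrated with Gaussian fits, as sketched below. This is only one plausible reading of the score: the Gaussian model, the maximum-likelihood estimates, the sign convention (single-model log likelihood minus two-model log likelihood, so values are never positive) and the name ``lod_score_sketch`` are all assumptions, not the package definition.

::

    # Gaussian log likelihood of v for a given mean and standard deviation
    loglik_norm = function(v, mu, sigma) {
        sum(dnorm(v, mean = mu, sd = sigma, log = TRUE))
    }
    # Maximum-likelihood standard deviation (divides by n, not n - 1)
    mle_sd = function(v) sqrt(mean((v - mean(v))^2))

    lod_score_sketch = function(x, y) {
        xy = c(x, y)
        one_model = loglik_norm(xy, mean(xy), mle_sd(xy))
        two_models = loglik_norm(x, mean(x), mle_sd(x)) +
                     loglik_norm(y, mean(y), mle_sd(y))
        one_model - two_models
    }

    set.seed(1)
    near = lod_score_sketch(rnorm(200, 5, 2), rnorm(200, 5, 2))
    far = lod_score_sketch(rnorm(200, 5, 2), rnorm(200, 50, 2))
    print(c(near = near, far = far))

Under this convention a strongly negative score indicates that the two sets are better described separately, which is consistent with the negative default values of ``lod_thres`` documented above.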

    
nm
--

Description
~~~~~~~~~~~

It provides a set of useful functions for performing quantitative
analysis of the nucleosomal epigenome.

Details
~~~~~~~

+---------------+---------------------------------------------------+
| Package:      | nucleominer                                       |
+---------------+---------------------------------------------------+
| Maintainer:   | Florent Chuffart <florent.chuffart@ens-lyon.fr>   |
+---------------+---------------------------------------------------+
| Author:       | Florent Chuffart                                  |
+---------------+---------------------------------------------------+
| Version:      | 2.3.19                                            |
+---------------+---------------------------------------------------+
| License:      | CeCILL                                            |
+---------------+---------------------------------------------------+
| Title:        | nm                                                |
+---------------+---------------------------------------------------+
| Depends:      | seqinr, plotrix, DESeq, cachecache                |
+---------------+---------------------------------------------------+

Author(s)
~~~~~~~~~

Florent Chuffart

    
Performing ANOVAs
-----------------

Description
~~~~~~~~~~~

Counts reads and performs ANOVAs for each common nucleosome involved.

Usage
~~~~~

::

    perform_anovas(replicates, aligned_inter_strain_nucs, inputs_name = "Mnase_Seq",
        plot_anova_boxes = FALSE)

Arguments
~~~~~~~~~

``replicates``

Set of replicates, each replicate is a list of samples (ideally 3). Each
sample is a list like *sample = list(id=..., marker=..., strain=...,
roi=..., inputs=..., outputs=...)* with *roi = list(name=..., begin=...,
end=..., chr=..., genome=...)*. In the *perform\_anovas* context, we
need 4 replicates (4 \* (3 samples)): 2 strains \* (1 marker + 1 input
(Mnase\_Seq)).

``aligned_inter_strain_nucs``

List of common nucleosomes.

``inputs_name``

Name of the input.

``plot_anova_boxes``

Plot (or not) a boxplot for each nuc.

Value
~~~~~

Returns ANOVA results and counted reads.

Author(s)
~~~~~~~~~

Florent Chuffart

    
Plot the distribution of reads.
-------------------------------

Description
~~~~~~~~~~~

This function uses the DESeq normalization feature to compare
the distributions qualitatively.

Usage
~~~~~

::

    plot_dist_samples(strain, marker, res, all_samples, NEWPLOT = TRUE)

Arguments
~~~~~~~~~

``strain``

The strain to consider.

``marker``

The marker to consider.

``res``

Data

``all_samples``

Global list of samples.

``NEWPLOT``

If FALSE, the curve will be added to the current plot.

Author(s)
~~~~~~~~~

Florent Chuffart

    
Remove wp nucs from common nucs list.
-------------------------------------

Description
~~~~~~~~~~~

It is based on the common wp nucs index on nucs and region.

Usage
~~~~~

::

    remove_aligned_wp(strain_maps, roi_index, tmp_common_nucs, strain)

Arguments
~~~~~~~~~

``strain_maps``

Nuc maps.

``roi_index``

The region of interest index.

``tmp_common_nucs``

The list of wp nucs.

``strain``

The strain to consider.

Author(s)
~~~~~~~~~

Florent Chuffart

    
sign from strand
----------------

Description
~~~~~~~~~~~

Get the sign of a strand.

Usage
~~~~~

::

    sign_from_strand(strands)

Arguments
~~~~~~~~~

+---------------+----+
| ``strands``   |    |
+---------------+----+

Value
~~~~~

Returns 1 if the strand is forward, -1 otherwise.

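
The mapping can be sketched in one line of base R. The sketch assumes that forward strands are coded ``"F"``, as in the examples of this reference; ``sign_from_strand_sketch`` is a hypothetical name.

::

    # Vectorised: 1 for forward ("F") strands, -1 for anything else
    sign_from_strand_sketch = function(strands) {
        ifelse(strands == "F", 1, -1)
    }

    strand_signs = sign_from_strand_sketch(c("F", "R", "F"))
    print(strand_signs)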

    
Author(s)
~~~~~~~~~

Florent Chuffart

    
Subtract from a list of regions another list of regions that intersect it.
--------------------------------------------------------------------------

Description
~~~~~~~~~~~

This function contains a recursive part, which is triggered when a
subtracted region splits an original region in two.

Usage
~~~~~

::

    substract_region(region1, region2)

Arguments
~~~~~~~~~

``region1``

Original regions.

``region2``

Regions to subtract.

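
The splitting case mentioned in the description can be illustrated with plain intervals. The sketch below makes several assumptions: each region is a list with inclusive integer ``begin`` and ``end`` fields, and the subtraction is written iteratively rather than recursively. The names ``subtract_interval_sketch`` and ``subtract_regions_sketch`` are hypothetical and the package's region structure may carry more fields.

::

    # Subtract one interval r2 from one interval r1: yields 0, 1 or 2 pieces
    subtract_interval_sketch = function(r1, r2) {
        if (r2$end < r1$begin || r2$begin > r1$end) {
            return(list(r1))                      # no overlap: r1 is unchanged
        }
        out = list()
        if (r2$begin > r1$begin) {                # piece left of r2 survives
            out[[length(out) + 1]] = list(begin = r1$begin, end = r2$begin - 1)
        }
        if (r2$end < r1$end) {                    # piece right of r2 survives
            out[[length(out) + 1]] = list(begin = r2$end + 1, end = r1$end)
        }
        out
    }

    # Subtract every region of regions2 from every region of regions1
    subtract_regions_sketch = function(regions1, regions2) {
        for (r2 in regions2) {
            pieces = lapply(regions1, subtract_interval_sketch, r2 = r2)
            regions1 = unlist(pieces, recursive = FALSE)
        }
        regions1
    }

    remaining = subtract_regions_sketch(list(list(begin = 100, end = 200)),
                                        list(list(begin = 140, end = 160)))
    str(remaining)

Here the subtracted region lies strictly inside the original one, so the result contains two regions.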

    
Author(s)
~~~~~~~~~

Florent Chuffart

    
Switch a pairlist
-----------------

Description
~~~~~~~~~~~

Take a pairlist key:value and return the switched pairlist value:key.

Usage
~~~~~

::

    switch_pairlist(l)

Arguments
~~~~~~~~~

``l``

The pairlist to switch.

Value
~~~~~

The switched pairlist.

Author(s)
~~~~~~~~~

Florent Chuffart

Examples
~~~~~~~~

::

    l = list(key1 = "value1", key2 = "value2")
    print(switch_pairlist(l))

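
For readers without the package at hand, the switch itself can be sketched in base R. The sketch assumes that every value is a single string and that values are unique; ``switch_pairlist_sketch`` is a hypothetical name.

::

    # Keys become values and values become keys
    switch_pairlist_sketch = function(l) {
        setNames(as.list(names(l)), unname(unlist(l)))
    }

    switched = switch_pairlist_sketch(list(key1 = "value1", key2 = "value2"))
    print(switched)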

    
Translate a list of regions from one strain reference to another.
-----------------------------------------------------------------

Description
~~~~~~~~~~~

This function is an elaborate call to translate\_roi.

Usage
~~~~~

::

    translate_regions(regions, combi, roi_index, config = NULL, roi)

Arguments
~~~~~~~~~

``regions``

Regions to be translated.

``combi``

Combination of strains.

``roi_index``

The region of interest index.

``config``

GLOBAL config variable.

``roi``

The region of interest.

Author(s)
~~~~~~~~~

Florent Chuffart

    
Translate the coordinates of a genome region.
---------------------------------------------

Description
~~~~~~~~~~~

This function is used in the examples. Usually, you have to define your
own translation function and override this one using the *unlockBinding*
mechanism. Please refer to the example.

Usage
~~~~~

::

    translate_roi(roi, strain2, config = NULL, big_roi = NULL)

Arguments
~~~~~~~~~

``roi``

Original genome region of interest.

``strain2``

The strain in which you want the genome region of interest.

``config``

GLOBAL config variable.

``big_roi``

A region larger than roi, used to filter c2c if needed.

Author(s)
~~~~~~~~~

Florent Chuffart

Examples
~~~~~~~~

::

    # Define a new translate_roi function...
    # (same arguments as the documented usage, so existing calls still match)
    translate_roi = function(roi, strain2, config = NULL, big_roi = NULL) {
        strain1 = roi$strain_ref
        if (strain1 == strain2) {
            # Same strain: nothing to translate.
            return(roi)
        } else {
            stop("Here is my new translate_roi function...")
        }
    }
    # Bind it by uncommenting the following lines.
    # unlockBinding("translate_roi", as.environment("package:nm"))
    # unlockBinding("translate_roi", getNamespace("nm"))
    # assign("translate_roi", translate_roi, "package:nm")
    # assign("translate_roi", translate_roi, getNamespace("nm"))
    # lockBinding("translate_roi", getNamespace("nm"))
    # lockBinding("translate_roi", as.environment("package:nm"))

    
Aggregate regions that intersect each other.
--------------------------------------------

Description
~~~~~~~~~~~

This function sorts the regions by lower bound to detect those that
intersect: the lower bound of each region is compared with the upper
bound of the previous item. The function runs a while loop that breaks
once the list of regions becomes stable.

Usage
~~~~~

::

    union_regions(regions)

Arguments
~~~~~~~~~

``regions``

The regions to be aggregated.

Author(s)
~~~~~~~~~

Florent Chuffart
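
Examples
~~~~~~~~

The comparison of each lower bound with the previous upper bound can be
sketched as follows. This is not the package implementation: it uses a
single pass over the sorted regions instead of the while loop described
above, the name ``union_sketch`` is hypothetical, and regions are
assumed to be lists with ``lower_bound`` and ``upper_bound`` fields
located on the same chromosome.

::

    union_sketch = function(regions) {
        if (length(regions) < 2) {
            return(regions)
        }
        # Sort the regions by lower bound.
        lower = sapply(regions, function(r) r$lower_bound)
        regions = regions[order(lower)]
        merged = list(regions[[1]])
        for (r in regions[-1]) {
            last = merged[[length(merged)]]
            if (r$lower_bound <= last$upper_bound) {
                # Intersects the previous item: extend its upper bound.
                last$upper_bound = max(last$upper_bound, r$upper_bound)
                merged[[length(merged)]] = last
            } else {
                # No intersection: start a new aggregated region.
                merged[[length(merged) + 1]] = r
            }
        }
        merged
    }

    regions = list(
        list(lower_bound = 10, upper_bound = 20),
        list(lower_bound = 1, upper_bound = 5),
        list(lower_bound = 15, upper_bound = 30),
        list(lower_bound = 4, upper_bound = 8)
    )
    merged = union_sketch(regions)
    # merged holds two regions: [1, 8] and [10, 30]
    print(merged)

Because the regions are sorted first, one pass is enough in this sketch
to reach a stable list.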

    
Watching analysis of samples
----------------------------

Description
~~~~~~~~~~~

This function displays the analysis for a particular region of the
genome.

Usage
~~~~~

::

    watch_samples(replicates, read_length, plot_ref_genome = TRUE,
        plot_arrow_raw_reads = TRUE, plot_arrow_nuc_reads = TRUE,
        plot_squared_reads = TRUE, plot_coverage = FALSE, plot_gaussian_reads = TRUE,
        plot_gaussian_unified_reads = TRUE, plot_ellipse_nucs = TRUE,
        plot_wp_nucs = TRUE, plot_wp_nuc_model = TRUE, plot_common_nucs = TRUE,
        plot_anovas = FALSE, plot_anova_boxes = FALSE, plot_wp_nucs_4_nonmnase = FALSE,
        aggregated_intra_strain_nucs = NULL, aligned_inter_strain_nucs = NULL,
        height = 10, config = NULL)

    
Arguments
~~~~~~~~~

``replicates``

Replicates under the form...

``read_length``

Length of the reads.

``plot_ref_genome``

Plot (or not) the reference genome.

``plot_arrow_raw_reads``

Plot (or not) arrows for raw reads.

``plot_arrow_nuc_reads``

Plot (or not) arrows for reads associated with a nucleosome.

``plot_squared_reads``

Plot (or not) reads in the square fashion.

``plot_coverage``

Plot (or not) reads in the coverage fashion.

``plot_gaussian_reads``

Plot (or not) the gaussian model of F and R reads.

``plot_gaussian_unified_reads``

Plot (or not) the gaussian model of a nuc.

``plot_ellipse_nucs``

Plot (or not) an ellipse for a nuc.

``plot_wp_nucs``

Plot (or not) clusters of nucs.

``plot_wp_nuc_model``

Plot (or not) the gaussian model for a cluster of nucs.

``plot_common_nucs``

Plot (or not) aligned reads.

``plot_anovas``

Plot (or not) a scatter plot for each nuc.

``plot_anova_boxes``

Plot (or not) a boxplot for each nuc.

``plot_wp_nucs_4_nonmnase``

Plot (or not) clusters for non-input samples.

``aggregated_intra_strain_nucs``

List of aggregated intra strain nucs. If NULL, it will be computed.

``aligned_inter_strain_nucs``

List of aligned inter strain nucs. If NULL, it will be computed.

``height``

Number of reads per million reads for each sample; graphical parameter
for the y axis.

``config``

GLOBAL config variable.

Author(s)
~~~~~~~~~

Florent Chuffart