Arabic to Roman pair list.
2
--------------------------
3

    
4
Description
5
~~~~~~~~~~~
6

    
7
Utility to convert Arabic numbers to Roman numbers
8

    
9
Usage
10
~~~~~
11

    
12
::
13

    
14
    ARAB2ROM()
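
As an illustration only, a pair list of this kind could be built with base R's ``utils::as.roman``. The name ``arab2rom_sketch`` and the range 1 to 16 are assumptions made for this example; the actual content returned by ``ARAB2ROM()`` is not shown in this page.

```r
# Hypothetical sketch: Arabic numbers are the names, Roman numerals the values.
arab2rom_sketch = function(n = 16) {
  romans = as.character(utils::as.roman(seq_len(n)))
  setNames(as.list(romans), as.character(seq_len(n)))
}

pairs = arab2rom_sketch()
print(pairs[["4"]])
```

Such a list can be handy to convert chromosome identifiers between the two notations.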
15

    
16
Author(s)
17
~~~~~~~~~
18

    
19
Florent Chuffart
20

    
21
R: False Discovery Rate
22

    
23
False Discovery Rate
24
--------------------
25

    
26
Description
27
~~~~~~~~~~~
28

    
29
From a vector x of independent p-values, extract the cutoff
corresponding to the specified FDR. See the Benjamini & Hochberg (1995) paper.
31

    
32
Usage
33
~~~~~
34

    
35
::
36

    
37
    FDR(x, FDR)
38

    
39
Arguments
40
~~~~~~~~~
41

    
42
``x``
43

    
44
A vector x of independent p-values.
45

    
46
``FDR``
47

    
48
The specified FDR.
49

    
50
Value
51
~~~~~
52

    
53
Returns the corresponding cutoff.
54

    
55
Author(s)
56
~~~~~~~~~
57

    
58
Gael Yvert, Florent Chuffart
59

    
60
Examples
61
~~~~~~~~
62

    
63
::
64

    
65
    print("example")
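
A slightly more telling example is the textbook Benjamini-Hochberg step-up rule, sketched below. The function name ``bh_cutoff_sketch`` and the p-values are made up for the illustration, and the package's own ``FDR`` function may differ in its details (for instance in how it scans the sorted p-values).

```r
# Textbook Benjamini-Hochberg step-up cutoff (a sketch, not the package code).
bh_cutoff_sketch = function(x, fdr) {
  p = sort(x[!is.na(x)])
  m = length(p)
  # Indices i whose i-th smallest p-value is below the line (i / m) * fdr.
  ok = which(p <= seq_len(m) / m * fdr)
  if (length(ok) == 0) return(NA)  # no p-value passes at this FDR
  p[max(ok)]
}

pvals = c(0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.36)
cutoff = bh_cutoff_sketch(pvals, 0.05)
print(cutoff)
print(sum(pvals <= cutoff))  # number of p-values declared significant
```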
66

    
67
R: Roman to Arabic pair list.
68

    
69
Roman to Arabic pair list.
70
--------------------------
71

    
72
Description
73
~~~~~~~~~~~
74

    
75
Utility to convert Roman numbers into Arabic numbers
76

    
77
Usage
78
~~~~~
79

    
80
::
81

    
82
    ROM2ARAB()
83

    
84
Author(s)
85
~~~~~~~~~
86

    
87
Florent Chuffart
88

    
89
R: Aggregate replicated sample's nucleosomes.
90

    
91
Aggregate replicated sample's nucleosomes.
92
------------------------------------------
93

    
94
Description
95
~~~~~~~~~~~
96

    
97
This function aggregates nucleosomes from replicated samples. It uses
the TemplateFilter output of each sample as a replicate. Each sample owns a
set of nucleosomes computed using TemplateFilter and ordered by the position
of their center (dyad). A chain of nucleosomes is built across all
replicates. Adjacent nucleosomes of the chain are compared two by two.
The comparison is based on a log likelihood ratio (LLR1). Depending on the
LLR1 value, nucleosomes are merged (low LLR) or separated (high LLR).
Finally, the function returns a list of clusters and all computed
llr\_scores. Each cluster owns an attribute wp for "well positioned".
This attribute is set to TRUE if the cluster is composed of exactly one
nucleosome of each sample.
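
The chain-and-merge logic described above can be sketched as follows. This is a simplified illustration, not the package implementation: the helper names ``llr2`` and ``aggregate_sketch``, the representation of a nucleosome as ``list(sample=..., reads=...)`` and the Gaussian form of the LLR are all assumptions.

```r
# Gaussian log likelihood ratio: two separate fits versus one pooled fit.
llr2 = function(x, y) {
  ll = function(v) {
    s = sqrt(mean((v - mean(v))^2))  # maximum-likelihood standard deviation
    sum(dnorm(v, mean(v), s, log = TRUE))
  }
  ll(x) + ll(y) - ll(c(x, y))
}

aggregate_sketch = function(nucs, nb_samples, llr_thres = 20) {
  # Order nucleosomes by their center (here, the mean read position).
  centers = sapply(nucs, function(n) mean(n$reads))
  nucs = nucs[order(centers)]
  clusters = list(list(nucs[[1]]))
  llr_scores = c()
  for (i in seq_along(nucs)[-1]) {
    score = llr2(nucs[[i - 1]]$reads, nucs[[i]]$reads)
    llr_scores = c(llr_scores, score)
    if (score < llr_thres) {
      # Low LLR: merge into the current cluster.
      k = length(clusters)
      clusters[[k]] = c(clusters[[k]], list(nucs[[i]]))
    } else {
      # High LLR: start a new cluster.
      clusters[[length(clusters) + 1]] = list(nucs[[i]])
    }
  }
  # wp: exactly one nucleosome from each sample.
  wp = sapply(clusters, function(cl) {
    ids = sapply(cl, function(n) n$sample)
    length(ids) == nb_samples && length(unique(ids)) == nb_samples
  })
  list(clusters = clusters, wp = wp, llr_scores = llr_scores)
}

nucs = list(
  list(sample = 1, reads = seq(100, 200, by = 5)),
  list(sample = 2, reads = seq(102, 202, by = 5)),
  list(sample = 3, reads = seq(98, 198, by = 5)),
  list(sample = 1, reads = seq(400, 500, by = 5))  # isolated nucleosome
)
res = aggregate_sketch(nucs, nb_samples = 3)
print(res$wp)
```

In this toy data set the three overlapping nucleosomes form one cluster flagged as well positioned, while the isolated one forms a second cluster that is not.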
108

    
109
Usage
110
~~~~~
111

    
112
::
113

    
114
    aggregate_intra_strain_nucs(samples, llr_thres = 20, coord_max = 2e+07)
115

    
116
Arguments
117
~~~~~~~~~
118

    
119
``samples``
120

    
121
A list of samples. Each sample is a list like *sample = list(id=...,
122
marker=..., strain=..., roi=..., inputs=..., outputs=...)* with *roi =
123
list(name=..., begin=..., end=..., chr=..., genome=...)*.
124

    
125
``llr_thres``
126

    
127
Log likelihood ratio threshold to decide between merging and separating
128

    
129
``coord_max``
130

    
131
A value too large to be the coordinate of a nucleosome lower bound.
132

    
133
Value
134
~~~~~
135

    
136
Returns a list of clusterized nucleosomes, and all computed llr scores.
137

    
138
Author(s)
139
~~~~~~~~~
140

    
141
Florent Chuffart
142

    
143
Examples
144
~~~~~~~~
145

    
146
::
147

    
148
    # Dealing with a region of interest
    roi = list(name="example", begin=1000, end=1300, chr="1", genome=rep("A",301))
    samples = list()
    for (i in 1:3) {
        # Create TF output
        tf_nuc = list("chr"=paste("chr", roi$chr, sep=""), "center"=(roi$end + roi$begin)/2, "width"= 150, "correlation.score"= 0.9)
        outputs = dfadd(NULL,tf_nuc)
        outputs = filter_tf_outputs(outputs, roi$chr, roi$begin, roi$end)
        # Generate corresponding reads
        nb_reads = round(runif(1,170,230))
        reads = round(rnorm(nb_reads, tf_nuc$center,20))
        u_reads = sort(unique(reads))
        strands = sample(c(rep("R",ceiling(length(u_reads)/2)),rep("F",floor(length(u_reads)/2))))
        counts = apply(t(u_reads), 2, function(r) { sum(reads == r)})
        shifts = apply(t(strands), 2, function(s) { if (s == "F") return(-tf_nuc$width/2) else return(tf_nuc$width/2)})
        u_reads = u_reads + shifts
        inputs = data.frame(list("V1" = rep(roi$chr, length(u_reads)),
                                 "V2" = u_reads,
                                 "V3" = strands,
                                 "V4" = counts), stringsAsFactors=FALSE)
        samples[[length(samples) + 1]] = list(id=1, marker="Mnase_Seq", strain="strain_ex", total_reads = 10000000, roi=roi, inputs=inputs, outputs=outputs)
    }
    print(aggregate_intra_strain_nucs(samples))
171

    
172
R: Aligns nucleosomes between 2 strains.
173

    
174
Aligns nucleosomes between 2 strains.
175
-------------------------------------
176

    
177
Description
178
~~~~~~~~~~~
179

    
180
This function aligns nucleosomes between two strains for a given genome
181
region.
182

    
183
Usage
184
~~~~~
185

    
186
::
187

    
188
    align_inter_strain_nucs(replicates, wp_nucs_strain_ref1 = NULL, 
189
        wp_nucs_strain_ref2 = NULL, corr_thres = 0.5, llr_thres = 100, 
190
        config = NULL, ...)
191

    
192
Arguments
193
~~~~~~~~~
194

    
195
``replicates``
196

    
197
Set of replicates, ideally 3 per strain.
198

    
199
``wp_nucs_strain_ref1``
200

    
201
List of aggregated nucleosomes for strain 1. If it is NULL, this list will
be computed.
203

    
204
``wp_nucs_strain_ref2``
205

    
206
List of aggregated nucleosomes for strain 2. If it is NULL, this list will
be computed.
208

    
209
``corr_thres``
210

    
211
Correlation threshold.
212

    
213
``llr_thres``
214

    
215
Log likelihood ratio threshold to decide between merging and separating
216

    
217
``config``
218

    
219
GLOBAL config variable
220

    
221
``...``
222

    
223
A list of parameters that will be passed to
224
*aggregate\_intra\_strain\_nucs* if needed.
225

    
226
Value
227
~~~~~
228

    
229
Returns a list of clusterized nucleosomes, and all computed llr scores.
230

    
231
Author(s)
232
~~~~~~~~~
233

    
234
Florent Chuffart
235

    
236
Examples
237
~~~~~~~~
238

    
239
::
240

    
241

    
242
        # Define new translate_cur function...
243
        translate_cur = function(roi, strain2, big_cur=NULL, config=NULL) {
244
          return(roi)
245
        }
246
        # Bind it using the following lines.
247
        unlockBinding("translate_cur", as.environment("package:nucleominer"))
248
        unlockBinding("translate_cur", getNamespace("nucleominer"))
249
        assign("translate_cur", translate_cur, "package:nucleominer")
250
        assign("translate_cur", translate_cur, getNamespace("nucleominer"))
251
        lockBinding("translate_cur", getNamespace("nucleominer"))
252
        lockBinding("translate_cur", as.environment("package:nucleominer"))
253

    
254
    # Dealing with a region of interest
255
    roi =list(name="example", begin=1000,  end=1300, chr="1", genome=rep("A",301), strain_ref1 = "STRAINREF1")
256
    roi2 = translate_cur(roi, roi$strain_ref1)
257
    replicates = list()
258
    for (j in 1:2) {
259
        samples = list()
260
        for (i in 1:3) {
261
            # Create TF output
262
            tf_nuc = list("chr"=paste("chr", roi$chr, sep=""), "center"=(roi$end + roi$begin)/2, "width"= 150, "correlation.score"= 0.9)
263
            outputs = dfadd(NULL,tf_nuc)
264
            outputs = filter_tf_outputs(outputs, roi$chr, roi$begin, roi$end)
265
            # Generate corresponding reads
266
            nb_reads = round(runif(1,170,230))
267
            reads = round(rnorm(nb_reads, tf_nuc$center,20))
268
            u_reads = sort(unique(reads))
269
            strands = sample(c(rep("R",ceiling(length(u_reads)/2)),rep("F",floor(length(u_reads)/2))))
270
            counts = apply(t(u_reads), 2, function(r) { sum(reads == r)})
271
            shifts = apply(t(strands), 2, function(s) { if (s == "F") return(-tf_nuc$width/2) else return(tf_nuc$width/2)})
272
            u_reads = u_reads + shifts
273
            inputs = data.frame(list("V1" = rep(roi$chr, length(u_reads)),
274
                                     "V2" = u_reads,
275
                                                             "V3" = strands,
276
                                                             "V4" = counts), stringsAsFactors=FALSE)
277
            samples[[length(samples) + 1]] = list(id=1, marker="Mnase_Seq", strain=paste("strain_ex",j,sep=""), total_reads = 10000000, roi=roi, inputs=inputs, outputs=outputs)
278
        }
279
        replicates[[length(replicates) + 1]] = samples
280
    }
281
    print(align_inter_strain_nucs(replicates))
282

    
283
R: Launch DESeq methods.
284

    
285
Launch DESeq methods.
286
---------------------
287

    
288
Description
289
~~~~~~~~~~~
290

    
291
This function is based on the DESeq example. It normalizes the data, fits it
to GLM models with and without an interaction term, and compares the two models.
293

    
294
Usage
295
~~~~~
296

    
297
::
298

    
299
    analyse_design(snep_design, reads)
300

    
301
Arguments
302
~~~~~~~~~
303

    
304
``snep_design``
305

    
306
The design to consider.
307

    
308
``reads``
309

    
310
The data to consider.
311

    
312
Author(s)
313
~~~~~~~~~
314

    
315
Florent Chuffart
316

    
317
R: Stage replicates data
318

    
319
Stage replicates data
320
---------------------
321

    
322
Description
323
~~~~~~~~~~~
324

    
325
This function loads into memory the data corresponding to the given
experiments.
327

    
328
Usage
329
~~~~~
330

    
331
::
332

    
333
    build_replicates(expe, roi, only_fetch = FALSE, get_genome = FALSE, 
334
        all_samples, config = NULL)
335

    
336
Arguments
337
~~~~~~~~~
338

    
339
``expe``
340

    
341
a list of vectors corresponding to replicates.
342

    
343
``roi``
344

    
345
the region that we are interested in.
346

    
347
``only_fetch``
348

    
349
If TRUE, only fetch the inputs without filtering them.
350

    
351
``get_genome``
352

    
353
Whether to load the corresponding genome.
354

    
355
``all_samples``
356

    
357
Global list of samples.
358

    
359
``config``
360

    
361
GLOBAL config variable.
362

    
363
Author(s)
364
~~~~~~~~~
365

    
366
Florent Chuffart
367

    
368
Examples
369
~~~~~~~~
370

    
371
::
372

    
373
    # library(rjson)
374
    # library(nucleominer)
375
    #
376
    # # Read config file
377
    # json_conf_file = "nucleominer_config.json"
378
    # config = fromJSON(paste(readLines(json_conf_file), collapse=""))
379
    # # Read sample file
380
    # all_samples = get_content(config$CSV_SAMPLE_FILE, "cvs", sep=";", head=TRUE, stringsAsFactors=FALSE)
381
    # # here are the sample ids in a list
382
    # expes = list(c(1))
383
    # # here is the region whose coverage we want to see
384
    # cur = list(chr="8", begin=472000, end=474000, strain_ref="BY")
385
    # # it displays the coverage
386
    # replicates = build_replicates(expes, cur, all_samples=all_samples, config=config)
387
    # out = watch_samples(replicates, config$READ_LENGTH,
388
    #       plot_coverage = TRUE,
389
    #       plot_squared_reads = FALSE,
390
    #       plot_ref_genome = FALSE,
391
    #       plot_arrow_raw_reads = FALSE,
392
    #       plot_arrow_nuc_reads = FALSE,
393
    #       plot_gaussian_reads = FALSE,
394
    #       plot_gaussian_unified_reads = FALSE,
395
    #       plot_ellipse_nucs = FALSE,
396
    #       plot_wp_nucs = FALSE,
397
    #       plot_wp_nuc_model = FALSE,
398
    #       plot_common_nucs = FALSE,
399
    #       height = 50)
400

    
401
R: Extract a sub part of the corresponding c2c file
402

    
403
Extract a sub part of the corresponding c2c file
404
------------------------------------------------
405

    
406
Description
407
~~~~~~~~~~~
408

    
409
This function gives access to a specific part of the c2c file.
410

    
411
Usage
412
~~~~~
413

    
414
::
415

    
416
    c2c_extraction(strain1, strain2, chr = NULL, lower_bound = NULL, 
417
        upper_bound = NULL, config = NULL)
418

    
419
Arguments
420
~~~~~~~~~
421

    
422
``strain1``
423

    
424
the key strain
425

    
426
``strain2``
427

    
428
the target strain
429

    
430
``chr``
431

    
432
if defined, the c2c will be filtered according to the chromosome value
433

    
434
``lower_bound``
435

    
436
If defined, the c2c will be filtered to keep the part of the genome above
lower\_bound.
438

    
439
``upper_bound``
440

    
441
If defined, the c2c will be filtered to keep the part of the genome below
upper\_bound.
443

    
444
``config``
445

    
446
GLOBAL config variable
447

    
448
Author(s)
449
~~~~~~~~~
450

    
451
Florent Chuffart
452

    
453
R: reformat an "apply manipulated" list of regions
454

    
455
reformat an "apply manipulated" list of regions
456
-----------------------------------------------
457

    
458
Description
459
~~~~~~~~~~~
460

    
461
Utils to reformat an "apply manipulated" list of regions
462

    
463
Usage
464
~~~~~
465

    
466
::
467

    
468
    collapse_regions(regions)
469

    
470
Arguments
471
~~~~~~~~~
472

    
473
+---------------+----+
474
| ``regions``   |    |
475
+---------------+----+
476

    
477
Author(s)
478
~~~~~~~~~
479

    
480
Florent Chuffart
481

    
482
R: Compute Common Uninterrupted Regions (CUR)
483

    
484
Compute Common Uninterrupted Regions (CUR)
485
------------------------------------------
486

    
487
Description
488
~~~~~~~~~~~
489

    
490
CURs are regions that can be aligned between the genomes
491

    
492
Usage
493
~~~~~
494

    
495
::
496

    
497
    compute_inter_all_strain_curs(diff_allowed = 30, min_cur_width = 4000, 
498
        config = NULL)
499

    
500
Arguments
501
~~~~~~~~~
502

    
503
``diff_allowed``
504

    
505
The maximum indel width allowed in a CUR.
506

    
507
``min_cur_width``
508

    
509
The minimum width of a CUR
510

    
511
``config``
512

    
513
GLOBAL config variable
514

    
515
Author(s)
516
~~~~~~~~~
517

    
518
Florent Chuffart
519

    
520
R: Crop bound of regions according to region of interest bound
521

    
522
Crop bound of regions according to region of interest bound
523
-----------------------------------------------------------
524

    
525
Description
526
~~~~~~~~~~~
527

    
528
This function is no longer necessary since the "big\_cur" bug was removed
from the translate\_cur function.
530

    
531
Usage
532
~~~~~
533

    
534
::
535

    
536
    crop_fuzzy(tmp_fuzzy_nucs, roi, strain, config = NULL)
537

    
538
Arguments
539
~~~~~~~~~
540

    
541
``tmp_fuzzy_nucs``
542

    
543
The regions to be cropped.
544

    
545
``roi``
546

    
547
The region of interest.
548

    
549
``strain``
550

    
551
The strain to consider.
552

    
553
``config``
554

    
555
GLOBAL config variable
556

    
557
Author(s)
558
~~~~~~~~~
559

    
560
Florent Chuffart
561

    
562
R: Adding list to a dataframe.
563

    
564
Adding list to a dataframe.
565
---------------------------
566

    
567
Description
568
~~~~~~~~~~~
569

    
570
Add a list *l* to a dataframe *df*. Create it if *df* is *NULL*. Return
571
the dataframe *df*.
572

    
573
Usage
574
~~~~~
575

    
576
::
577

    
578
    dfadd(df, l)
579

    
580
Arguments
581
~~~~~~~~~
582

    
583
``df``
584

    
585
A dataframe
586

    
587
``l``
588

    
589
A list
590

    
591
Value
592
~~~~~
593

    
594
Return the dataframe *df*.
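
The behaviour described above can be approximated in a few lines of base R. The name ``dfadd_sketch`` is hypothetical and this is only a minimal sketch; the package function may handle column types or names differently.

```r
# Minimal sketch: append a list as a new row, creating the data frame if needed.
dfadd_sketch = function(df, l) {
  row = data.frame(l, stringsAsFactors = FALSE)
  if (is.null(df)) return(row)
  rbind(df, row)
}

d = NULL
d = dfadd_sketch(d, list(key1 = "a", key2 = 1))
d = dfadd_sketch(d, list(key1 = "b", key2 = 2))
print(d)
```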
595

    
596
Author(s)
597
~~~~~~~~~
598

    
599
Florent Chuffart
600

    
601
Examples
602
~~~~~~~~
603

    
604
::
605

    
606
    ## Here dataframe is NULL
    df = NULL
    print(df)
609

    
610
    # Initialize df
611
    df = dfadd(df, list(key1 = "value1", key2 = "value2"))
612
    print(df)
613

    
614
    # Adding elements to df
615
    df = dfadd(df, list(key1 = "value1'", key2 = "value2'"))
616
    print(df)
617

    
618
R: Prefetch data
619

    
620
Prefetch data
621
-------------
622

    
623
Description
624
~~~~~~~~~~~
625

    
626
Fetch and filter inputs and outputs per region of interest, and organize
them per replicate.
628

    
629
Usage
630
~~~~~
631

    
632
::
633

    
634
    fetch_mnase_replicates(strain, roi, all_samples, config = NULL, 
635
        only_fetch = FALSE, get_genome = FALSE, get_ouputs = TRUE)
636

    
637
Arguments
638
~~~~~~~~~
639

    
640
``strain``
641

    
642
The strain for which we want MNase replicates. List of replicates. Each
replicate is a vector of sample ids.
644

    
645
``roi``
646

    
647
Region of interest.
648

    
649
``all_samples``
650

    
651
Global list of samples.
652

    
653
``config``
654

    
655
GLOBAL config variable
656

    
657
``only_fetch``
658

    
659
If TRUE, only fetch without filtering. It is used to load sample files
into memory before forking.
661

    
662
``get_genome``
663

    
664
If TRUE, load corresponding genome sequence.
665

    
666
``get_ouputs``
667

    
668
If TRUE, also get the corresponding TF output files.
669

    
670
Author(s)
671
~~~~~~~~~
672

    
673
Florent Chuffart
674

    
675
R: Filter TemplateFilter inputs
676

    
677
Filter TemplateFilter inputs
678
----------------------------
679

    
680
Description
681
~~~~~~~~~~~
682

    
683
This function filters TemplateFilter inputs according to the properties of
the observed genome area. It takes into account reads that are at the
border of this area and the strand of these reads.
686

    
687
Usage
688
~~~~~
689

    
690
::
691

    
692
    filter_tf_inputs(inputs, chr, x_min, x_max, nuc_width = 160, 
693
        only_f = FALSE, only_r = FALSE, filter_for_coverage = FALSE, 
694
        USE_DPLYR = TRUE)
695

    
696
Arguments
697
~~~~~~~~~
698

    
699
``inputs``
700

    
701
TF inputs to be filtered.
702

    
703
``chr``
704

    
705
Chromosome observed, here chr is an integer.
706

    
707
``x_min``
708

    
709
Coordinate of the first bp observed.
710

    
711
``x_max``
712

    
713
Coordinate of the last bp observed.
714

    
715
``nuc_width``
716

    
717
Nucleosome width.
718

    
719
``only_f``
720

    
721
Filter only F reads.
722

    
723
``only_r``
724

    
725
Filter only R reads.
726

    
727
``filter_for_coverage``
728

    
729
Does it filter for plot coverage?
730

    
731
``USE_DPLYR``
732

    
733
Use dplyr lib to filter reads.
734

    
735
Value
736
~~~~~
737

    
738
Returns filtered inputs.
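
One way to take border reads and their strand into account is to shift each read by half a nucleosome width before testing it against the area, as sketched below. The function name ``filter_reads_sketch`` and the toy data are hypothetical. The column layout (V1 chromosome, V2 position, V3 strand, V4 count) and the shift direction follow the examples of this page, but the package function may apply additional rules.

```r
# Sketch: keep reads whose strand-shifted center falls inside [x_min, x_max].
filter_reads_sketch = function(inputs, chr, x_min, x_max, nuc_width = 160) {
  sign = ifelse(inputs$V3 == "F", 1, -1)
  centers = inputs$V2 + sign * nuc_width / 2
  inputs[inputs$V1 == chr & centers >= x_min & centers <= x_max, ]
}

reads = data.frame(V1 = c(1, 1, 1, 1, 2),
                   V2 = c(900, 930, 1200, 1370, 1100),
                   V3 = c("F", "F", "F", "R", "F"),
                   V4 = c(1, 3, 2, 1, 5), stringsAsFactors = FALSE)
kept = filter_reads_sketch(reads, 1, 1000, 1300)
print(kept)
```

The reads at 930 (F) and 1370 (R) lie outside the area but are kept, because their shifted centers (1010 and 1290) fall inside it.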
739

    
740
Author(s)
741
~~~~~~~~~
742

    
743
Florent Chuffart
744

    
745
R: Filter TemplateFilter outputs
746

    
747
Filter TemplateFilter outputs
748
-----------------------------
749

    
750
Description
751
~~~~~~~~~~~
752

    
753
This function filters TemplateFilter outputs according not only to the
properties of the observed genome area, but also to correlation and
overlap thresholds.
756

    
757
Usage
758
~~~~~
759

    
760
::
761

    
762
    filter_tf_outputs(tf_outputs, chr, x_min, x_max, nuc_width = 160, 
763
        ol_bp = 59, corr_thres = 0.5)
764

    
765
Arguments
766
~~~~~~~~~
767

    
768
``tf_outputs``
769

    
770
TemplateFilter outputs.
771

    
772
``chr``
773

    
774
Chromosome observed, here chr is an integer.
775

    
776
``x_min``
777

    
778
Coordinate of the first bp observed.
779

    
780
``x_max``
781

    
782
Coordinate of the last bp observed.
783

    
784
``nuc_width``
785

    
786
Nucleosome width.
787

    
788
``ol_bp``
789

    
790
Overlap Threshold.
791

    
792
``corr_thres``
793

    
794
Correlation threshold.
795

    
796
Value
797
~~~~~
798

    
799
Returns filtered TemplateFilter Outputs
800

    
801
Author(s)
802
~~~~~~~~~
803

    
804
Florent Chuffart
805

    
806
R: to flat aggregate\_intra\_strain\_nucs function output
807

    
808
to flat aggregate\_intra\_strain\_nucs function output
809
------------------------------------------------------
810

    
811
Description
812
~~~~~~~~~~~
813

    
814
This function builds a dataframe of all clusters obtained from the
aggregate\_intra\_strain\_nucs function.
816

    
817
Usage
818
~~~~~
819

    
820
::
821

    
822
    flat_aggregated_intra_strain_nucs(partial_strain_maps, cur_index, 
823
        nb_tracks = 3)
824

    
825
Arguments
826
~~~~~~~~~
827

    
828
``partial_strain_maps``
829

    
830
the output of aggregate\_intra\_strain\_nucs function
831

    
832
``cur_index``
833

    
834
the index of the roi involved
835

    
836
``nb_tracks``
837

    
838
the number of replicates
839

    
840
Value
841
~~~~~
842

    
843
Returns a dataframe of all clusters obtained from the
aggregate\_intra\_strain\_nucs function.
845

    
846
Author(s)
847
~~~~~~~~~
848

    
849
Florent Chuffart
850

    
851
R: flat reads
852

    
853
flat reads
854
----------
855

    
856
Description
857
~~~~~~~~~~~
858

    
859
Extract read coordinates from the TemplateFilter input sequence.
860

    
861
Usage
862
~~~~~
863

    
864
::
865

    
866
    flat_reads(reads, nuc_width)
867

    
868
Arguments
869
~~~~~~~~~
870

    
871
``reads``
872

    
873
TemplateFilter input reads
874

    
875
``nuc_width``
876

    
877
Width used to shift F and R reads.
878

    
879
Value
880
~~~~~
881

    
882
Returns a list of F reads, R reads and joint/shifted F and R reads.
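
A possible shape for this result is sketched below. The function name ``flat_reads_sketch``, the element names ``f``, ``r`` and ``joint``, and the half-width shift are assumptions; the input columns (V2 position, V3 strand, V4 count) follow the examples of this page.

```r
# Sketch: expand counted reads into flat coordinate vectors.
flat_reads_sketch = function(reads, nuc_width) {
  f = reads[reads$V3 == "F", ]
  r = reads[reads$V3 == "R", ]
  f_flat = rep(f$V2, f$V4)  # one entry per F read
  r_flat = rep(r$V2, r$V4)  # one entry per R read
  # Shift both strands toward the nucleosome center and pool them.
  joint = sort(c(f_flat + nuc_width / 2, r_flat - nuc_width / 2))
  list(f = f_flat, r = r_flat, joint = joint)
}

reads = data.frame(V2 = c(100, 300), V3 = c("F", "R"), V4 = c(2, 3),
                   stringsAsFactors = FALSE)
flat = flat_reads_sketch(reads, 160)
print(flat$joint)
```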
883

    
884
Author(s)
885
~~~~~~~~~
886

    
887
Florent Chuffart
888

    
889
R: Retrieve Reads
890

    
891
Retrieve Reads
892
--------------
893

    
894
Description
895
~~~~~~~~~~~
896

    
897
Retrieve reads for a given marker, combi, form.
898

    
899
Usage
900
~~~~~
901

    
902
::
903

    
904
    get_all_reads(marker, combi, form = "wp", config = NULL)
905

    
906
Arguments
907
~~~~~~~~~
908

    
909
``marker``
910

    
911
The marker to consider.
912

    
913
``combi``
914

    
915
The strain combination to consider.
916

    
917
``form``
918

    
919
The nuc form to consider.
920

    
921
``config``
922

    
923
GLOBAL config variable
924

    
925
Author(s)
926
~~~~~~~~~
927

    
928
Florent Chuffart
929

    
930
R: get comp strand
931

    
932
get comp strand
933
---------------
934

    
935
Description
936
~~~~~~~~~~~
937

    
938
Compute the complementary strand.
939

    
940
Usage
941
~~~~~
942

    
943
::
944

    
945
    get_comp_strand(strand)
946

    
947
Arguments
948
~~~~~~~~~
949

    
950
``strand``
951

    
952
The original strand.
953

    
954
Value
955
~~~~~
956

    
957
Returns the complementary strand.
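
As a hedged illustration, and assuming that ``strand`` is a character vector of nucleotide letters (this page does not state the expected input), the complement could be computed with base R's ``chartr``. The name ``comp_strand_sketch`` is hypothetical.

```r
# Sketch, assuming the strand is given as nucleotide letters.
comp_strand_sketch = function(strand) {
  chartr("acgtACGT", "tgcaTGCA", strand)
}

print(comp_strand_sketch(c("a", "c", "g", "t")))
```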
958

    
959
Author(s)
960
~~~~~~~~~
961

    
962
Florent Chuffart
963

    
964
R: Build the design for DESeq
965

    
966
Build the design for DESeq
967
--------------------------
968

    
969
Description
970
~~~~~~~~~~~
971

    
972
This function builds the design according to sample properties.
973

    
974
Usage
975
~~~~~
976

    
977
::
978

    
979
    get_design(marker, combi, all_samples)
980

    
981
Arguments
982
~~~~~~~~~
983

    
984
``marker``
985

    
986
The marker to consider.
987

    
988
``combi``
989

    
990
The strain combination to consider.
991

    
992
``all_samples``
993

    
994
Global list of samples.
995

    
996
Author(s)
997
~~~~~~~~~
998

    
999
Florent Chuffart
1000

    
1001
R: Compute the fuzzy list for a given strain.
1002

    
1003
Compute the fuzzy list for a given strain.
1004
------------------------------------------
1005

    
1006
Description
1007
~~~~~~~~~~~
1008

    
1009
This function grabs the nucleosomes detected by template\_filter that
have been rejected by aggregate\_intra\_strain\_nucs as well positioned.
1011

    
1012
Usage
1013
~~~~~
1014

    
1015
::
1016

    
1017
    get_intra_strain_fuzzy(wp_map, roi, strain, config = NULL)
1018

    
1019
Arguments
1020
~~~~~~~~~
1021

    
1022
``wp_map``
1023

    
1024
Well-positioned nucleosome map.
1025

    
1026
``roi``
1027

    
1028
The region of interest.
1029

    
1030
``strain``
1031

    
1032
The strain for which we want to extract the fuzzy map.
1033

    
1034
``config``
1035

    
1036
GLOBAL config variable.
1037

    
1038
Author(s)
1039
~~~~~~~~~
1040

    
1041
Florent Chuffart
1042

    
1043
R: Compute the list of SNEPs for a given set of marker, strain...
1044

    
1045
Compute the list of SNEPs for a given set of marker, strain combination and nuc form.
1046
-------------------------------------------------------------------------------------
1047

    
1048
Description
1049
~~~~~~~~~~~
1050

    
1051
This function uses
1052

    
1053
Usage
1054
~~~~~
1055

    
1056
::
1057

    
1058
    get_sneps(marker, combi, form, all_samples, FDR = 1e-04, config = NULL)
1059

    
1060
Arguments
1061
~~~~~~~~~
1062

    
1063
``marker``
1064

    
1065
The marker involved.
1066

    
1067
``combi``
1068

    
1069
The strain combination involved.
1070

    
1071
``form``
1072

    
1073
the nuc form involved.
1074

    
1075
``all_samples``
1076

    
1077
Global list of samples.
1078

    
1079
``FDR``
1080

    
1081
``config``
1082

    
1083
GLOBAL config variable
1084

    
1085
Author(s)
1086
~~~~~~~~~
1087

    
1088
Florent Chuffart
1089

    
1090
Examples
1091
~~~~~~~~
1092

    
1093
::
1094

    
1095
    marker = "H3K4me1"
1096
    combi = c("BY", "YJM")
1097
    form = "wpunr" # "wp" | "unr" | "wpunr"
1098
    # foo = get_sneps(marker, combi, form)
1099
    # foo = get_sneps("H4K12ac", c("BY", "RM"), "wp")
1100

    
1101
R: Compute the unaligned nucleosomal regions (UNRs).
1102

    
1103
Compute the unaligned nucleosomal regions (UNRs).
1104
-------------------------------------------------
1105

    
1106
Description
1107
~~~~~~~~~~~
1108

    
1109
This function aggregates non-common wp nucs for each strain and subtracts
common wp nucs. It does not take into account the size of the resulting
UNR. This is handled in the read counting part of the pipeline.
1113

    
1114
Usage
1115
~~~~~
1116

    
1117
::
1118

    
1119
    get_unrs(combi, roi, cur_index, wp_maps, fuzzy_maps, common_nuc_results, 
1120
        config = NULL)
1121

    
1122
Arguments
1123
~~~~~~~~~
1124

    
1125
``combi``
1126

    
1127
The strain combination to consider.
1128

    
1129
``roi``
1130

    
1131
The region of interest.
1132

    
1133
``cur_index``
1134

    
1135
The region of interest index.
1136

    
1137
``wp_maps``
1138

    
1139
Well-positioned nucleosome maps.
1140

    
1141
``fuzzy_maps``
1142

    
1143
Fuzzy nucleosomes maps.
1144

    
1145
``common_nuc_results``
1146

    
1147
Common wp nuc maps
1148

    
1149
``config``
1150

    
1151
GLOBAL config variable
1152

    
1153
Author(s)
1154
~~~~~~~~~
1155

    
1156
Florent Chuffart
1157

    
1158
R: Returns the intersection of 2 lists of regions.

Returns the intersection of 2 lists of regions.
-----------------------------------------------
1162

    
1163
Description
1164
~~~~~~~~~~~
1165

    
1166
This function...
1167

    
1168
Usage
1169
~~~~~
1170

    
1171
::
1172

    
1173
    intersect_region(region1, region2)
1174

    
1175
Arguments
1176
~~~~~~~~~
1177

    
1178
``region1``
1179

    
1180
Original regions.
1181

    
1182
``region2``
1183

    
1184
Regions to intersect.
1185

    
1186
Author(s)
1187
~~~~~~~~~
1188

    
1189
Florent Chuffart
1190

    
1191
R: Likelihood ratio
1192

    
1193
Likelihood ratio
1194
----------------
1195

    
1196
Description
1197
~~~~~~~~~~~
1198

    
1199
Compute the log likelihood ratio of two or more sets of values.
1200

    
1201
Usage
1202
~~~~~
1203

    
1204
::
1205

    
1206
    llr_score_nvecs(xs)
1207

    
1208
Arguments
1209
~~~~~~~~~
1210

    
1211
``xs``
1212

    
1213
list of vectors.
1214

    
1215
Value
1216
~~~~~
1217

    
1218
Returns the log likelihood ratio.
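
One common way to define such a ratio, assuming each set is modelled by a Gaussian, is to compare the log likelihood of fitting every set separately with that of a single pooled fit. The sketch below follows that assumption. The name ``llr_sketch`` is hypothetical and the package function may use a different estimator.

```r
# Sketch: Gaussian log likelihood ratio, separate fits versus one pooled fit.
llr_sketch = function(xs) {
  # Log likelihood of v under its own maximum-likelihood Gaussian fit.
  loglik = function(v) {
    s = sqrt(mean((v - mean(v))^2))
    sum(dnorm(v, mean(v), s, log = TRUE))
  }
  separate = sum(sapply(xs, loglik))
  pooled = loglik(unlist(xs))
  separate - pooled
}

x1 = c(1, 2, 3, 4, 5)
same = llr_sketch(list(x1, x1))         # identical sets
apart = llr_sketch(list(x1, x1 + 100))  # well-separated sets
print(c(same, apart))
```

With maximum-likelihood fits the ratio cannot be negative: it is close to zero when the sets look alike and grows as they move apart, which is what makes a threshold such as ``llr_thres`` usable.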
1219

    
1220
Author(s)
1221
~~~~~~~~~
1222

    
1223
Florent Chuffart
1224

    
1225
Examples
1226
~~~~~~~~
1227

    
1228
::
1229

    
1230
    # LLR score for 2 sets of values
    mean1=5; sd1=2; card1 = 200
    mean2=6; sd2=3; card2 = 250
    x1 = rnorm(card1, mean1, sd1)
    x2 = rnorm(card2, mean2, sd2)
    min = floor(min(c(x1,x2)))
    max = ceiling(max(c(x1,x2)))
    hist(c(x1,x2), xlim=c(min, max), breaks=min:max)
    lines(min:max,dnorm(min:max,mean1,sd1)*card1,col=2)
    lines(min:max,dnorm(min:max,mean2,sd2)*card2,col=3)
    lines(min:max,dnorm(min:max,mean(c(x1,x2)),sd(c(x1,x2)))*card2,col=4)
    llr_score_nvecs(list(x1,x2))
1242

    
1243
R: nm
1244

    
1245
nm
1246
--
1247

    
1248
Description
1249
~~~~~~~~~~~
1250

    
1251
It provides a set of useful functions for the quantitative
analysis of the nucleosomal epigenome.
1253

    
1254
Details
1255
~~~~~~~
1256

    
1257
+---------------+---------------------------------------------------+
| Package:      | nucleominer                                       |
+---------------+---------------------------------------------------+
| Maintainer:   | Florent Chuffart <florent.chuffart@ens-lyon.fr>   |
+---------------+---------------------------------------------------+
| Author:       | Florent Chuffart                                  |
+---------------+---------------------------------------------------+
| Version:      | 2.3.47                                            |
+---------------+---------------------------------------------------+
| License:      | CeCILL                                            |
+---------------+---------------------------------------------------+
| Title:        | nm                                                |
+---------------+---------------------------------------------------+
| Depends:      | seqinr, plotrix, DESeq, cachecache, dplyr         |
+---------------+---------------------------------------------------+
1272

    
1273
Author(s)
1274
~~~~~~~~~
1275

    
1276
Florent Chuffart
1277

    
1278
R: Plot the distribution of reads.
1279

    
1280
Plot the distribution of reads.
1281
-------------------------------
1282

    
1283
Description
1284
~~~~~~~~~~~
1285

    
1286
This function uses the DESeq normalization feature to compare the
distributions qualitatively.
1288

    
1289
Usage
1290
~~~~~
1291

    
1292
::
1293

    
1294
    plot_dist_samples(strain, marker, res, all_samples, NEWPLOT = TRUE)
1295

    
1296
Arguments
1297
~~~~~~~~~
1298

    
1299
``strain``
1300

    
1301
The strain to consider.
1302

    
1303
``marker``
1304

    
1305
The marker to consider.
1306

    
1307
``res``
1308

    
1309
Data
1310

    
1311
``all_samples``
1312

    
1313
Global list of samples.
1314

    
1315
``NEWPLOT``
1316

    
1317
If FALSE, the curve will be added to the current plot.
1318

    
1319
Author(s)
1320
~~~~~~~~~
1321

    
1322
Florent Chuffart
1323

    
1324
R: sign from strand
1325

    
1326
sign from strand
1327
----------------
1328

    
1329
Description
1330
~~~~~~~~~~~
1331

    
1332
Get the sign of strand
1333

    
1334
Usage
1335
~~~~~
1336

    
1337
::
1338

    
1339
    sign_from_strand(strands)
1340

    
1341
Arguments
1342
~~~~~~~~~
1343

    
1344
+---------------+----+
1345
| ``strands``   |    |
1346
+---------------+----+
1347

    
1348
Value
1349
~~~~~
1350

    
1351
If the strand is forward, returns 1; otherwise returns -1.
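
A vectorised one-liner is enough to express this rule. The sketch below assumes forward strands are coded ``"F"``, as in the examples of this page; the name ``sign_from_strand_sketch`` is hypothetical.

```r
# Sketch: +1 for forward ("F") reads, -1 for anything else.
sign_from_strand_sketch = function(strands) {
  ifelse(strands == "F", 1, -1)
}

print(sign_from_strand_sketch(c("F", "R", "F")))
```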
1352

    
1353
Author(s)
1354
~~~~~~~~~
1355

    
1356
Florent Chuffart
1357

    
1358
R: Subtract from a list of regions another list of regions that...

Subtract from a list of regions another list of regions that intersect it.
--------------------------------------------------------------------------
1362

    
1363
Description
1364
~~~~~~~~~~~
1365

    
1366
This fucntion embed a recursive part. It occurs when a substracted
1367
region split an original region on two.
1368
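
To illustrate why a split calls for recursion, here is a minimal sketch
that subtracts a list of regions from a single region. It is not the
package implementation: the name ``subtract_sketch`` and the
representation of a region as ``list(begin, end)`` with inclusive bounds
are assumptions made for this example.

::

    # Hypothetical representation: a region is list(begin, end), inclusive bounds.
    subtract_sketch = function(region, cuts) {
        # Nothing left to subtract: keep the region.
        if (length(cuts) == 0) {
            return(list(region))
        }
        first = cuts[[1]]
        rest = cuts[-1]
        # No intersection with this cut: try the remaining cuts.
        if (first$end < region$begin || first$begin > region$end) {
            return(subtract_sketch(region, rest))
        }
        pieces = list()
        # Piece on the left of the cut, if any, checked against the other cuts.
        if (first$begin > region$begin) {
            left = list(begin = region$begin, end = first$begin - 1)
            pieces = c(pieces, subtract_sketch(left, rest))
        }
        # Piece on the right of the cut, if any, checked against the other cuts.
        if (first$end < region$end) {
            right = list(begin = first$end + 1, end = region$end)
            pieces = c(pieces, subtract_sketch(right, rest))
        }
        pieces
    }

    region = list(begin = 1, end = 100)
    cuts = list(list(begin = 20, end = 30), list(begin = 50, end = 60))
    pieces = subtract_sketch(region, cuts)

With these inputs the first cut splits the region in two, and each half is
processed recursively, leaving three pieces: 1 to 19, 31 to 49 and 61 to
100. A list of original regions could be handled by applying the sketch to
each of them with ``lapply``.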

    
Usage
~~~~~

::

    substract_region(region1, region2)

Arguments
~~~~~~~~~

``region1``

Original regions.

``region2``

Regions to subtract.

Author(s)
~~~~~~~~~

Florent Chuffart

    
Switch a pairlist
-----------------

Description
~~~~~~~~~~~

Take a pairlist key:value and return the switched pairlist value:key.

Usage
~~~~~

::

    switch_pairlist(l)

Arguments
~~~~~~~~~

``l``

The pairlist to switch.

Value
~~~~~

The switched pairlist.
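
For readers without the package at hand, the key and value swap can be
sketched in base R as follows. The name ``switch_pairlist_sketch`` is
hypothetical, and the sketch assumes that every value is a single
character string. The package function may differ in details.

::

    # Hypothetical base R equivalent of the key:value swap.
    switch_pairlist_sketch = function(l) {
        # Former names become the values, former values become the names.
        setNames(as.list(names(l)), unlist(l, use.names = FALSE))
    }

    l = list(key1 = "value1", key2 = "value2")
    switched = switch_pairlist_sketch(l)

After this call, ``switched$value1`` is ``"key1"`` and ``switched$value2``
is ``"key2"``.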

    
Author(s)
~~~~~~~~~

Florent Chuffart

Examples
~~~~~~~~

::

    l = list(key1 = "value1", key2 = "value2")
    print(switch_pairlist(l))

    
Translate coords of a genome region.
------------------------------------

Description
~~~~~~~~~~~

This function is used in the examples. Usually you have to define your
own translation function and overwrite this one using the *unlockBinding*
feature. Please refer to the example.

Usage
~~~~~

::

    translate_cur(roi, strain2, config = NULL, big_cur = NULL)

Arguments
~~~~~~~~~

``roi``

Original genome region of interest.

``strain2``

The strain in which you want the genome region of interest.

``config``

GLOBAL config variable.

``big_cur``

A region larger than roi, used to filter c2c if needed.

Author(s)
~~~~~~~~~

Florent Chuffart

    
Examples
~~~~~~~~

::

    # Define a new translate_cur function...
    translate_cur = function(roi, strain2, config) {
        strain1 = roi$strain_ref
        if (strain1 == strain2) {
            return(roi)
        } else {
            stop("Here is my new translate_cur function...")
        }
    }
    # Bind it by uncommenting the following lines.
    # unlockBinding("translate_cur", as.environment("package:nm"))
    # unlockBinding("translate_cur", getNamespace("nm"))
    # assign("translate_cur", translate_cur, "package:nm")
    # assign("translate_cur", translate_cur, getNamespace("nm"))
    # lockBinding("translate_cur", getNamespace("nm"))
    # lockBinding("translate_cur", as.environment("package:nm"))

    
Translate a list of regions from a strain ref to another.
---------------------------------------------------------

Description
~~~~~~~~~~~

This function is an elaborate call to translate\_cur.

Usage
~~~~~

::

    translate_regions(regions, combi, cur_index, config = NULL, roi)

Arguments
~~~~~~~~~

``regions``

Regions to be translated.

``combi``

Combination of strains.

``cur_index``

The region of interest index.

``config``

GLOBAL config variable.

``roi``

The region of interest.

Author(s)
~~~~~~~~~

Florent Chuffart

    
Aggregate regions that intersect each other.
--------------------------------------------

Description
~~~~~~~~~~~

This function sorts regions by lower bound to detect those that
intersect: the lower bound of each region is compared with the upper
bound of the previous item. The function embeds a while loop that breaks
when the region list becomes stable.
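
The sort-and-compare idea described above can be sketched as follows. This
is an illustration only, not the package code: the name
``union_regions_sketch`` and the representation of regions as a data frame
with ``begin`` and ``end`` columns are assumptions.

::

    # Hypothetical representation: one region per row, columns begin and end.
    union_regions_sketch = function(regions) {
        stable = FALSE
        while (!stable) {
            stable = TRUE
            # Sort by lower bound so that intersecting regions become neighbours.
            regions = regions[order(regions$begin), ]
            i = 2
            while (i <= nrow(regions)) {
                # Compare the lower bound with the upper bound of the previous item.
                if (regions$begin[i] <= regions$end[i - 1]) {
                    # Merge row i into the previous row, then drop row i.
                    regions$end[i - 1] = max(regions$end[i - 1], regions$end[i])
                    regions = regions[-i, ]
                    stable = FALSE
                } else {
                    i = i + 1
                }
            }
        }
        regions
    }

    regions = data.frame(begin = c(1, 50, 8, 100), end = c(10, 60, 20, 120))
    merged = union_regions_sketch(regions)

In this example the regions 1 to 10 and 8 to 20 are merged into 1 to 20,
while 50 to 60 and 100 to 120 are left unchanged. The outer loop stops as
soon as a full pass merges nothing.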

    
Usage
~~~~~

::

    union_regions(regions)

Arguments
~~~~~~~~~

``regions``

The regions to be aggregated.

Author(s)
~~~~~~~~~

Florent Chuffart

    
Watching analysis of samples
----------------------------

Description
~~~~~~~~~~~

This function displays the analysis for a particular region of the
genome.

    
Usage
~~~~~

::

    watch_samples(replicates, read_length, plot_ref_genome = TRUE,
        plot_arrow_raw_reads = TRUE, plot_arrow_nuc_reads = TRUE,
        plot_squared_reads = TRUE, plot_coverage = FALSE, plot_gaussian_reads = TRUE,
        plot_gaussian_unified_reads = TRUE, plot_ellipse_nucs = TRUE,
        change_col = TRUE, plot_wp_nucs = TRUE, plot_fuzzy_nucs = TRUE,
        plot_wp_nuc_model = TRUE, plot_common_nucs = FALSE, plot_common_unrs = FALSE,
        plot_wp_nucs_4_nonmnase = FALSE, plot_chain = FALSE, plot_sample_id = FALSE,
        aggregated_intra_strain_nucs = NULL, aligned_inter_strain_nucs = NULL,
        height = 10, main = NULL, xlab = NULL, ylab = "#reads (per million reads)",
        config = NULL)

    
Arguments
~~~~~~~~~

``replicates``

Replicates under the form...

``read_length``

Length of the reads.

``plot_ref_genome``

Plot (or not) reference genome.

``plot_arrow_raw_reads``

Plot (or not) arrows for raw reads.

``plot_arrow_nuc_reads``

Plot (or not) arrows for reads associated with a nucleosome.

``plot_squared_reads``

Plot (or not) reads in the square fashion.

``plot_coverage``

Plot (or not) reads in the coverage fashion.

``plot_gaussian_reads``

Plot (or not) gaussian model of F and R reads.

``plot_gaussian_unified_reads``

Plot (or not) gaussian model of a nuc.

``plot_ellipse_nucs``

Plot (or not) ellipse for a nuc.

``change_col``

Change the color of each nucleosome.

``plot_wp_nucs``

Plot (or not) cluster of nucs.

``plot_fuzzy_nucs``

Plot (or not) cluster of fuzzy nucs.

``plot_wp_nuc_model``

Plot (or not) gaussian model for a cluster of nucs.

    
``plot_common_nucs``

Plot (or not) aligned reads.

``plot_common_unrs``

Plot (or not) unaligned nucleosomal regions (UNRs).

``plot_wp_nucs_4_nonmnase``

Plot (or not) clusters for non-input samples.

``plot_chain``

Plot (or not) clustered nucleosomes between MNase samples.

``plot_sample_id``

Plot (or not) the sample id for each sample.

``aggregated_intra_strain_nucs``

List of aggregated intra-strain nucs. If NULL, it will be computed.

``aligned_inter_strain_nucs``

List of aligned inter-strain nucs. If NULL, it will be computed.

``height``

Number of reads per million reads for each sample; graphical parameter
for the y axis.

``main``

Main title of the produced plot.

``xlab``

x-axis label of the produced plot.

``ylab``

y-axis label of the produced plot.

``config``

GLOBAL config variable.

Author(s)
~~~~~~~~~

Florent Chuffart