1
Arabic to Roman pair list.
2
--------------------------
3

    
4
Description
5
~~~~~~~~~~~
6

    
7
Utility to convert Arabic numerals to Roman numerals.
8

    
9
Usage
10
~~~~~
11

    
12
::
13

    
14
    ARAB2ROM()
15

    
16
Author(s)
17
~~~~~~~~~
18

    
19
Florent Chuffart
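
Examples
~~~~~~~~

A pair list of this kind can be sketched with base R's ``as.roman``. The
helper name ``arab2rom_sketch`` and the range 1 to 20 are assumptions made
for illustration; the exact content returned by ``ARAB2ROM()`` is not
documented here.

::

    # Hypothetical helper: map Arabic numbers (character keys) to Roman numerals.
    arab2rom_sketch = function(n = 20) {
      rom = as.character(utils::as.roman(seq_len(n)))
      names(rom) = as.character(seq_len(n))
      as.list(rom)
    }

    pairs = arab2rom_sketch()
    print(pairs[["4"]])   # Roman numeral stored under the key "4"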
20

    
21
22

    
23
False Discovery Rate
24
--------------------
25

    
26
Description
27
~~~~~~~~~~~
28

    
29
From a vector x of independent p-values, extract the cutoff
corresponding to the specified FDR. See the Benjamini & Hochberg (1995) paper.
31

    
32
Usage
33
~~~~~
34

    
35
::
36

    
37
    FDR(x, FDR)
38

    
39
Arguments
40
~~~~~~~~~
41

    
42
``x``
43

    
44
A vector x of independent p-values.
45

    
46
``FDR``
47

    
48
The specified FDR.
49

    
50
Value
51
~~~~~
52

    
53
Returns the corresponding cutoff.
54

    
55
Author(s)
56
~~~~~~~~~
57

    
58
Gael Yvert, Florent Chuffart
59

    
60
Examples
61
~~~~~~~~
62

    
63
::
64

    
65
    print("example")
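
The Benjamini-Hochberg step-up rule referred to in the description can be
sketched as follows. The name ``bh_cutoff_sketch`` is hypothetical, and
whether ``FDR()`` returns exactly this value is an assumption; the sketch
only illustrates the rule itself.

::

    # Hypothetical helper, not the package's FDR() implementation.
    bh_cutoff_sketch = function(x, fdr) {
      p = sort(x)
      m = length(p)
      # Step-up rule: keep the largest p-value with p_(i) <= (i / m) * fdr.
      ok = which(p <= (seq_len(m) / m) * fdr)
      if (length(ok) == 0) return(NA_real_)
      p[max(ok)]
    }

    x = c(0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216)
    cutoff = bh_cutoff_sketch(x, 0.05)
    print(cutoff)
    # Cross-check with base R: same number of rejections as p.adjust.
    print(sum(x <= cutoff) == sum(p.adjust(x, method = "BH") <= 0.05))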
66

    
67
68

    
69
Roman to Arabic pair list.
70
--------------------------
71

    
72
Description
73
~~~~~~~~~~~
74

    
75
Utility to convert Roman numerals to Arabic numerals.
76

    
77
Usage
78
~~~~~
79

    
80
::
81

    
82
    ROM2ARAB()
83

    
84
Author(s)
85
~~~~~~~~~
86

    
87
Florent Chuffart
88

    
89
90

    
91
Aggregate replicated sample's nucleosomes.
92
------------------------------------------
93

    
94
Description
95
~~~~~~~~~~~
96

    
97
This function aggregates nucleosomes from replicated samples. It uses the
TemplateFilter output of each sample as a replicate. Each sample owns a set
of nucleosomes computed using TemplateFilter and ordered by the position
of their center (dyad). A chain of nucleosomes is built across all
replicates. Adjacent nucleosomes of the chain are compared two by two.
The comparison is based on a log likelihood ratio (LLR1). Depending on the
LLR1 value, nucleosomes are merged (low LLR) or separated (high LLR).
Finally, the function returns a list of clusters and all computed
llr\_scores. Each cluster owns an attribute wp for "well positioned".
This attribute is set to TRUE if the cluster is composed of exactly one
nucleosome of each sample.
108

    
109
Usage
110
~~~~~
111

    
112
::
113

    
114
    aggregate_intra_strain_nucs(samples, llr_thres = 20, coord_max = 2e+07)
115

    
116
Arguments
117
~~~~~~~~~
118

    
119
``samples``
120

    
121
A list of samples. Each sample is a list like *sample = list(id=...,
122
marker=..., strain=..., roi=..., inputs=..., outputs=...)* with *roi =
123
list(name=..., begin=..., end=..., chr=..., genome=...)*.
124

    
125
``llr_thres``
126

    
127
Log likelihood ratio threshold used to decide between merging and
separating adjacent nucleosomes.
128

    
129
``coord_max``
130

    
131
A value too large to be the coordinate of a nucleosome lower bound.
132

    
133
Value
134
~~~~~
135

    
136
Returns a list of clustered nucleosomes and all computed LLR scores.
137

    
138
Author(s)
139
~~~~~~~~~
140

    
141
Florent Chuffart
142

    
143
Examples
144
~~~~~~~~
145

    
146
::
147

    
148
    # Dealing with a region of interest
149
    roi =list(name="example", begin=1000,  end=1300, chr="1", genome=rep("A",301))
150
    samples = list()
151
    for (i in 1:3) {
152
        # Create TF output
153
        tf_nuc = list("chr"=paste("chr", roi$chr, sep=""), "center"=(roi$end + roi$begin)/2, "width"= 150, "correlation.score"= 0.9)
154
        outputs = dfadd(NULL,tf_nuc)
155
        outputs = filter_tf_outputs(outputs, roi$chr, roi$begin, roi$end)
156
        # Generate corresponding reads
157
        nb_reads = round(runif(1,170,230))
158
        reads = round(rnorm(nb_reads, tf_nuc$center,20))
159
        u_reads = sort(unique(reads))
160
        strands = sample(c(rep("R",ceiling(length(u_reads)/2)),rep("F",floor(length(u_reads)/2))))
161
        counts = apply(t(u_reads), 2, function(r) { sum(reads == r)})
162
        shifts = apply(t(strands), 2, function(s) { if (s == "F") return(-tf_nuc$width/2) else return(tf_nuc$width/2)})
163
        u_reads = u_reads + shifts
164
        inputs = data.frame(list("V1" = rep(roi$chr, length(u_reads)),
                                 "V2" = u_reads,
                                 "V3" = strands,
                                 "V4" = counts), stringsAsFactors=FALSE)
168
        samples[[length(samples) + 1]] = list(id=1, marker="Mnase_Seq", strain="strain_ex", total_reads = 10000000, roi=roi, inputs=inputs, outputs=outputs)
169
    }
170
    print(aggregate_intra_strain_nucs(samples))
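
The chaining logic described above (order by dyad, compare neighbours,
merge or separate, flag "wp" clusters) can be sketched on toy data. This is
a simplified illustration, not the package's code: the score used here is a
plain distance between dyads standing in for the LLR, and all names
(``cluster_chain_sketch``, ``dist_score``, the ``sample`` and ``center``
columns) are assumptions.

::

    # Toy input: one row per nucleosome, with sample id and dyad position.
    nucs = data.frame(sample = c(1, 2, 3, 1, 2, 3, 3),
                      center = c(1000, 1004, 998, 1170, 1165, 1168, 1400))

    # Stand-in score (NOT the package's LLR): distance between two dyads.
    dist_score = function(a, b) abs(a$center - b$center)

    cluster_chain_sketch = function(nucs, score_fun, thres) {
      nucs = nucs[order(nucs$center), ]
      n_samples = length(unique(nucs$sample))
      id = rep(1L, nrow(nucs))
      for (i in seq_len(nrow(nucs))[-1]) {
        # A high score separates neighbours; a low score merges them.
        new_cluster = score_fun(nucs[i - 1, ], nucs[i, ]) >= thres
        id[i] = id[i - 1] + as.integer(new_cluster)
      }
      lapply(split(nucs, id), function(cl) {
        # "wp": exactly one nucleosome from each sample.
        wp = nrow(cl) == n_samples && !any(duplicated(cl$sample))
        list(nucs = cl, wp = wp)
      })
    }

    clusters = cluster_chain_sketch(nucs, dist_score, thres = 50)
    print(sapply(clusters, function(cl) cl$wp))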
171

    
172
173

    
174
Aligns nucleosomes between 2 strains.
175
-------------------------------------
176

    
177
Description
178
~~~~~~~~~~~
179

    
180
This function aligns nucleosomes between two strains for a given genome
181
region.
182

    
183
Usage
184
~~~~~
185

    
186
::
187

    
188
    align_inter_strain_nucs(replicates, wp_nucs_strain_ref1 = NULL, 
189
        wp_nucs_strain_ref2 = NULL, corr_thres = 0.5, llr_thres = 100, 
190
        config = NULL, ...)
191

    
192
Arguments
193
~~~~~~~~~
194

    
195
``replicates``
196

    
197
Set of replicates, ideally 3 per strain.
198

    
199
``wp_nucs_strain_ref1``
200

    
201
List of aggregated nucleosomes for strain 1. If NULL, this list will
be computed.
203

    
204
``wp_nucs_strain_ref2``
205

    
206
List of aggregated nucleosomes for strain 2. If NULL, this list will
be computed.
208

    
209
``corr_thres``
210

    
211
Correlation threshold.
212

    
213
``llr_thres``
214

    
215
Log likelihood ratio threshold used to decide between merging and
separating nucleosomes.
216

    
217
``config``
218

    
219
GLOBAL config variable
220

    
221
``...``
222

    
223
A list of parameters that will be passed to
224
*aggregate\_intra\_strain\_nucs* if needed.
225

    
226
Value
227
~~~~~
228

    
229
Returns a list of clustered nucleosomes and all computed LLR scores.
230

    
231
Author(s)
232
~~~~~~~~~
233

    
234
Florent Chuffart
235

    
236
Examples
237
~~~~~~~~
238

    
239
::
240

    
241

    
242
        # Define new translate_cur function...
243
        translate_cur = function(roi, strain2, big_cur=NULL, config=NULL) {
244
          return(roi)
245
        }
246
        # Bind it in the package environment and in the namespace.
247
        unlockBinding("translate_cur", as.environment("package:nucleominer"))
248
        unlockBinding("translate_cur", getNamespace("nucleominer"))
249
        assign("translate_cur", translate_cur, "package:nucleominer")
250
        assign("translate_cur", translate_cur, getNamespace("nucleominer"))
251
        lockBinding("translate_cur", getNamespace("nucleominer"))
252
        lockBinding("translate_cur", as.environment("package:nucleominer"))
253

    
254
    # Dealing with a region of interest
255
    roi =list(name="example", begin=1000,  end=1300, chr="1", genome=rep("A",301), strain_ref1 = "STRAINREF1")
256
    roi2 = translate_cur(roi, roi$strain_ref1)
257
    replicates = list()
258
    for (j in 1:2) {
259
        samples = list()
260
        for (i in 1:3) {
261
            # Create TF output
262
            tf_nuc = list("chr"=paste("chr", roi$chr, sep=""), "center"=(roi$end + roi$begin)/2, "width"= 150, "correlation.score"= 0.9)
263
            outputs = dfadd(NULL,tf_nuc)
264
            outputs = filter_tf_outputs(outputs, roi$chr, roi$begin, roi$end)
265
            # Generate corresponding reads
266
            nb_reads = round(runif(1,170,230))
267
            reads = round(rnorm(nb_reads, tf_nuc$center,20))
268
            u_reads = sort(unique(reads))
269
            strands = sample(c(rep("R",ceiling(length(u_reads)/2)),rep("F",floor(length(u_reads)/2))))
270
            counts = apply(t(u_reads), 2, function(r) { sum(reads == r)})
271
            shifts = apply(t(strands), 2, function(s) { if (s == "F") return(-tf_nuc$width/2) else return(tf_nuc$width/2)})
272
            u_reads = u_reads + shifts
273
            inputs = data.frame(list("V1" = rep(roi$chr, length(u_reads)),
                                     "V2" = u_reads,
                                     "V3" = strands,
                                     "V4" = counts), stringsAsFactors=FALSE)
277
            samples[[length(samples) + 1]] = list(id=1, marker="Mnase_Seq", strain=paste("strain_ex",j,sep=""), total_reads = 10000000, roi=roi, inputs=inputs, outputs=outputs)
278
        }
279
        replicates[[length(replicates) + 1]] = samples
280
    }
281
    print(align_inter_strain_nucs(replicates))
282

    
283
284

    
285
Launch DESeq methods.
286
---------------------
287

    
288
Description
289
~~~~~~~~~~~
290

    
291
This function is based on the DESeq example. It normalizes the data, fits
the data to a GLM model with and without an interaction term, and compares
the two models.
293

    
294
Usage
295
~~~~~
296

    
297
::
298

    
299
    analyse_design(snep_design, reads)
300

    
301
Arguments
302
~~~~~~~~~
303

    
304
``snep_design``
305

    
306
The design to consider.
307

    
308
``reads``
309

    
310
The data to consider.
311

    
312
Author(s)
313
~~~~~~~~~
314

    
315
Florent Chuffart
316

    
317
318

    
319
Stage replicates data
320
---------------------
321

    
322
Description
323
~~~~~~~~~~~
324

    
325
This function loads in memory the data corresponding to the given
326
experiments.
327

    
328
Usage
329
~~~~~
330

    
331
::
332

    
333
    build_replicates(expe, roi, only_fetch = FALSE, get_genome = FALSE, 
334
        all_samples, config = NULL)
335

    
336
Arguments
337
~~~~~~~~~
338

    
339
``expe``
340

    
341
a list of vectors corresponding to replicates.
342

    
343
``roi``
344

    
345
the region that we are interested in.
346

    
347
``only_fetch``
348

    
349
If TRUE, only fetch the inputs without filtering them.
350

    
351
``get_genome``
352

    
353
If TRUE, load the corresponding genome.
354

    
355
``all_samples``
356

    
357
Global list of samples.
358

    
359
``config``
360

    
361
GLOBAL config variable.
362

    
363
Author(s)
364
~~~~~~~~~
365

    
366
Florent Chuffart
367

    
368
Examples
369
~~~~~~~~
370

    
371
::
372

    
373
    # library(rjson)
374
    # library(nucleominer)
375
    #
376
    # # Read config file
377
    # json_conf_file = "nucleominer_config.json"
378
    # config = fromJSON(paste(readLines(json_conf_file), collapse=""))
379
    # # Read sample file
380
    # all_samples = get_content(config$CSV_SAMPLE_FILE, "cvs", sep=";", head=TRUE, stringsAsFactors=FALSE)
381
    # # here are the sample ids in a list
382
    # expes = list(c(1))
383
    # # here is the region for which we want to see the coverage
384
    # cur = list(chr="8", begin=472000, end=474000, strain_ref="BY")
385
    # # it displays the coverage
386
    # replicates = build_replicates(expes, cur, all_samples=all_samples, config=config)
387
    # out = watch_samples(replicates, config$READ_LENGTH,
388
    #       plot_coverage = TRUE,
389
    #       plot_squared_reads = FALSE,
390
    #       plot_ref_genome = FALSE,
391
    #       plot_arrow_raw_reads = FALSE,
392
    #       plot_arrow_nuc_reads = FALSE,
393
    #       plot_gaussian_reads = FALSE,
394
    #       plot_gaussian_unified_reads = FALSE,
395
    #       plot_ellipse_nucs = FALSE,
396
    #       plot_wp_nucs = FALSE,
397
    #       plot_wp_nuc_model = FALSE,
398
    #       plot_common_nucs = FALSE,
399
    #       height = 50)
400

    
401
402

    
403
Extract a sub part of the corresponding c2c file
404
------------------------------------------------
405

    
406
Description
407
~~~~~~~~~~~
408

    
409
This function gives access to a specific part of the c2c file.
410

    
411
Usage
412
~~~~~
413

    
414
::
415

    
416
    c2c_extraction(strain1, strain2, chr = NULL, lower_bound = NULL, 
417
        upper_bound = NULL, config = NULL)
418

    
419
Arguments
420
~~~~~~~~~
421

    
422
``strain1``
423

    
424
the key strain
425

    
426
``strain2``
427

    
428
the target strain
429

    
430
``chr``
431

    
432
if defined, the c2c will be filtered according to the chromosome value
433

    
434
``lower_bound``
435

    
436
If defined, the c2c is filtered to keep the part of the genome above
lower\_bound.
438

    
439
``upper_bound``
440

    
441
If defined, the c2c is filtered to keep the part of the genome below
upper\_bound.
443

    
444
``config``
445

    
446
GLOBAL config variable
447

    
448
Author(s)
449
~~~~~~~~~
450

    
451
Florent Chuffart
452

    
453
454

    
455
reformat an "apply manipulated" list of regions
456
-----------------------------------------------
457

    
458
Description
459
~~~~~~~~~~~
460

    
461
Utility to reformat an "apply manipulated" list of regions.
462

    
463
Usage
464
~~~~~
465

    
466
::
467

    
468
    collapse_regions(regions)
469

    
470
Arguments
471
~~~~~~~~~
472

    
473
+---------------+----+
474
| ``regions``   |    |
475
+---------------+----+
476

    
477
Author(s)
478
~~~~~~~~~
479

    
480
Florent Chuffart
481

    
482
483

    
484
Compute Common Uninterrupted Regions (CUR)
485
------------------------------------------
486

    
487
Description
488
~~~~~~~~~~~
489

    
490
CURs are regions that can be aligned between the genomes
491

    
492
Usage
493
~~~~~
494

    
495
::
496

    
497
    compute_inter_all_strain_curs(diff_allowed = 30, min_cur_width = 4000, 
498
        config = NULL)
499

    
500
Arguments
501
~~~~~~~~~
502

    
503
``diff_allowed``
504

    
505
The maximum indel width allowed in a CUR.
506

    
507
``min_cur_width``
508

    
509
The minimum width of a CUR
510

    
511
``config``
512

    
513
GLOBAL config variable
514

    
515
Author(s)
516
~~~~~~~~~
517

    
518
Florent Chuffart
519

    
520
521

    
522
Crop bound of regions according to region of interest bound
523
-----------------------------------------------------------
524

    
525
Description
526
~~~~~~~~~~~
527

    
528
This function is no longer necessary since the "big\_cur" bug in the
translate\_cur function was removed.
530

    
531
Usage
532
~~~~~
533

    
534
::
535

    
536
    crop_fuzzy(tmp_fuzzy_nucs, roi, strain, config = NULL)
537

    
538
Arguments
539
~~~~~~~~~
540

    
541
``tmp_fuzzy_nucs``
542

    
543
The regions to be cropped.
544

    
545
``roi``
546

    
547
The region of interest.
548

    
549
``strain``
550

    
551
The strain to consider.
552

    
553
``config``
554

    
555
GLOBAL config variable
556

    
557
Author(s)
558
~~~~~~~~~
559

    
560
Florent Chuffart
561

    
562
563

    
564
Adding list to a dataframe.
565
---------------------------
566

    
567
Description
568
~~~~~~~~~~~
569

    
570
Add a list *l* to a dataframe *df*. Create it if *df* is *NULL*. Return
571
the dataframe *df*.
572

    
573
Usage
574
~~~~~
575

    
576
::
577

    
578
    dfadd(df, l)
579

    
580
Arguments
581
~~~~~~~~~
582

    
583
``df``
584

    
585
A dataframe
586

    
587
``l``
588

    
589
A list
590

    
591
Value
592
~~~~~
593

    
594
Return the dataframe *df*.
595

    
596
Author(s)
597
~~~~~~~~~
598

    
599
Florent Chuffart
600

    
601
Examples
602
~~~~~~~~
603

    
604
::
605

    
606
    ## Here dataframe is NULL
    df = NULL
    print(df)
609

    
610
    # Initialize df
611
    df = dfadd(df, list(key1 = "value1", key2 = "value2"))
612
    print(df)
613

    
614
    # Adding elements to df
615
    df = dfadd(df, list(key1 = "value1'", key2 = "value2'"))
616
    print(df)
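
For readers without the package at hand, the documented behaviour (create
the dataframe when it is *NULL*, otherwise append the list as a new row) can
be sketched in base R. ``dfadd_sketch`` is a hypothetical stand-in, not the
package's ``dfadd``.

::

    # Hypothetical stand-in for dfadd(), written with base R only.
    dfadd_sketch = function(df, l) {
      row = as.data.frame(l, stringsAsFactors = FALSE)
      # Create the dataframe on first use, append a row afterwards.
      if (is.null(df)) return(row)
      rbind(df, row)
    }

    d = dfadd_sketch(NULL, list(key1 = "value1", key2 = "value2"))
    d = dfadd_sketch(d, list(key1 = "value1'", key2 = "value2'"))
    print(d)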
617

    
618
619

    
620
Prefetch data
621
-------------
622

    
623
Description
624
~~~~~~~~~~~
625

    
626
Fetch and filter inputs and outputs per region of interest. Organize them
per replicate.
628

    
629
Usage
630
~~~~~
631

    
632
::
633

    
634
    fetch_mnase_replicates(strain, roi, all_samples, config = NULL, 
635
        only_fetch = FALSE, get_genome = FALSE, get_ouputs = TRUE)
636

    
637
Arguments
638
~~~~~~~~~
639

    
640
``strain``
641

    
642
The strain for which we want MNase replicates. List of replicates. Each
replicate is a vector of sample ids.
644

    
645
``roi``
646

    
647
Region of interest.
648

    
649
``all_samples``
650

    
651
Global list of samples.
652

    
653
``config``
654

    
655
GLOBAL config variable
656

    
657
``only_fetch``
658

    
659
If TRUE, only fetch and do not filter. It is used to load sample files
into memory before forking.
661

    
662
``get_genome``
663

    
664
If TRUE, load corresponding genome sequence.
665

    
666
``get_ouputs``
667

    
668
If TRUE, also get the corresponding TF output files.
669

    
670
Author(s)
671
~~~~~~~~~
672

    
673
Florent Chuffart
674

    
675
676

    
677
Filter TemplateFilter inputs
678
----------------------------
679

    
680
Description
681
~~~~~~~~~~~
682

    
683
This function filters TemplateFilter inputs according to the properties of
the observed genome area. It takes into account reads that are at the
frontier of this area and the strand of these reads.
686

    
687
Usage
688
~~~~~
689

    
690
::
691

    
692
    filter_tf_inputs(inputs, chr, x_min, x_max, nuc_width = 160, 
693
        only_f = FALSE, only_r = FALSE, filter_for_coverage = FALSE)
694

    
695
Arguments
696
~~~~~~~~~
697

    
698
``inputs``
699

    
700
TF inputs to be filtered.
701

    
702
``chr``
703

    
704
Chromosome observed, here chr is an integer.
705

    
706
``x_min``
707

    
708
Coordinate of the first bp observed.
709

    
710
``x_max``
711

    
712
Coordinate of the last bp observed.
713

    
714
``nuc_width``
715

    
716
Nucleosome width.
717

    
718
``only_f``
719

    
720
Filter only F reads.
721

    
722
``only_r``
723

    
724
Filter only R reads.
725

    
726
``filter_for_coverage``
727

    
728
Does it filter for plot coverage?
729

    
730
Value
731
~~~~~
732

    
733
Returns filtered inputs.
734

    
735
Author(s)
736
~~~~~~~~~
737

    
738
Florent Chuffart
739

    
740
741

    
742
Filter TemplateFilter outputs
743
-----------------------------
744

    
745
Description
746
~~~~~~~~~~~
747

    
748
This function filters TemplateFilter outputs according not only to the
properties of the observed genome area, but also to correlation and overlap
thresholds.
751

    
752
Usage
753
~~~~~
754

    
755
::
756

    
757
    filter_tf_outputs(tf_outputs, chr, x_min, x_max, nuc_width = 160, 
758
        ol_bp = 59, corr_thres = 0.5)
759

    
760
Arguments
761
~~~~~~~~~
762

    
763
``tf_outputs``
764

    
765
TemplateFilter outputs.
766

    
767
``chr``
768

    
769
Chromosome observed, here chr is an integer.
770

    
771
``x_min``
772

    
773
Coordinate of the first bp observed.
774

    
775
``x_max``
776

    
777
Coordinate of the last bp observed.
778

    
779
``nuc_width``
780

    
781
Nucleosome width.
782

    
783
``ol_bp``
784

    
785
Overlap Threshold.
786

    
787
``corr_thres``
788

    
789
Correlation threshold.
790

    
791
Value
792
~~~~~
793

    
794
Returns filtered TemplateFilter outputs.
795

    
796
Author(s)
797
~~~~~~~~~
798

    
799
Florent Chuffart
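
Examples
~~~~~~~~

The region and correlation parts of such a filter can be sketched on a toy
table. This is only an illustration under stated assumptions: the column
names (``chr``, ``center``, ``correlation.score``) and the ``"chr"`` prefix
follow the examples elsewhere in this document, ``filter_outputs_sketch`` is
a hypothetical name, filtering on the nucleosome center is a guess, and the
overlap threshold (``ol_bp``) is deliberately left out.

::

    # Hypothetical helper, not the package's filter_tf_outputs().
    filter_outputs_sketch = function(tf_outputs, chr, x_min, x_max, corr_thres = 0.5) {
      keep = tf_outputs$chr == paste("chr", chr, sep = "") &
             tf_outputs$center >= x_min &
             tf_outputs$center <= x_max &
             tf_outputs$correlation.score >= corr_thres
      tf_outputs[keep, ]
    }

    tf_outputs = data.frame(chr = c("chr1", "chr1", "chr2", "chr1"),
                            center = c(1150, 1200, 1150, 5000),
                            width = 150,
                            correlation.score = c(0.9, 0.3, 0.9, 0.9),
                            stringsAsFactors = FALSE)
    kept = filter_outputs_sketch(tf_outputs, 1, 1000, 1300)
    print(kept)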
800

    
801
802

    
803
to flat aggregate\_intra\_strain\_nucs function output
804
------------------------------------------------------
805

    
806
Description
807
~~~~~~~~~~~
808

    
809
This function builds a dataframe of all clusters obtained from the
aggregate\_intra\_strain\_nucs function.
811

    
812
Usage
813
~~~~~
814

    
815
::
816

    
817
    flat_aggregated_intra_strain_nucs(partial_strain_maps, cur_index, 
818
        nb_tracks = 3)
819

    
820
Arguments
821
~~~~~~~~~
822

    
823
``partial_strain_maps``
824

    
825
the output of aggregate\_intra\_strain\_nucs function
826

    
827
``cur_index``
828

    
829
the index of the roi involved
830

    
831
``nb_tracks``
832

    
833
the number of replicates
834

    
835
Value
836
~~~~~
837

    
838
Returns a dataframe of all clusters obtained from the
aggregate\_intra\_strain\_nucs function.
840

    
841
Author(s)
842
~~~~~~~~~
843

    
844
Florent Chuffart
845

    
846
847

    
848
flat reads
849
----------
850

    
851
Description
852
~~~~~~~~~~~
853

    
854
Extract read coordinates from the TemplateFilter input sequence.
855

    
856
Usage
857
~~~~~
858

    
859
::
860

    
861
    flat_reads(reads, nuc_width)
862

    
863
Arguments
864
~~~~~~~~~
865

    
866
``reads``
867

    
868
TemplateFilter input reads
869

    
870
``nuc_width``
871

    
872
Width used to shift F and R reads.
873

    
874
Value
875
~~~~~
876

    
877
Returns a list of F reads, R reads and joint/shifted F and R reads.
878

    
879
Author(s)
880
~~~~~~~~~
881

    
882
Florent Chuffart
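
Examples
~~~~~~~~

A minimal sketch of what "flattening" might look like, assuming the input
layout used in the examples of this document (``V2`` = position, ``V3`` =
strand, ``V4`` = count) and assuming that F reads are shifted by
``+nuc_width/2`` and R reads by ``-nuc_width/2``. The function name and the
names of the returned elements are hypothetical.

::

    # Hypothetical helper, not the package's flat_reads().
    flat_reads_sketch = function(reads, nuc_width) {
      # Repeat each position according to its count (column V4).
      pos = rep(reads$V2, reads$V4)
      strand = rep(reads$V3, reads$V4)
      f = pos[strand == "F"]
      r = pos[strand == "R"]
      # Shift both strands towards the presumed nucleosome center.
      list(f = f, r = r, shifted = c(f + nuc_width / 2, r - nuc_width / 2))
    }

    reads = data.frame(V1 = "1", V2 = c(1075, 1225), V3 = c("F", "R"),
                       V4 = c(2, 1), stringsAsFactors = FALSE)
    out = flat_reads_sketch(reads, 150)
    print(out)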
883

    
884
885

    
886
Retrieve Reads
887
--------------
888

    
889
Description
890
~~~~~~~~~~~
891

    
892
Retrieve reads for a given marker, combi, form.
893

    
894
Usage
895
~~~~~
896

    
897
::
898

    
899
    get_all_reads(marker, combi, form = "wp", config = NULL)
900

    
901
Arguments
902
~~~~~~~~~
903

    
904
``marker``
905

    
906
The marker to consider.
907

    
908
``combi``
909

    
910
The strain combination to consider.
911

    
912
``form``
913

    
914
The nuc form to consider.
915

    
916
``config``
917

    
918
GLOBAL config variable
919

    
920
Author(s)
921
~~~~~~~~~
922

    
923
Florent Chuffart
924

    
925
926

    
927
get comp strand
928
---------------
929

    
930
Description
931
~~~~~~~~~~~
932

    
933
Compute the complementary strand.
934

    
935
Usage
936
~~~~~
937

    
938
::
939

    
940
    get_comp_strand(strand)
941

    
942
Arguments
943
~~~~~~~~~
944

    
945
``strand``
946

    
947
The original strand.
948

    
949
Value
950
~~~~~
951

    
952
Returns the complementary strand.
953

    
954
Author(s)
955
~~~~~~~~~
956

    
957
Florent Chuffart
958

    
959
960

    
961
Build the design for DESeq
962
--------------------------
963

    
964
Description
965
~~~~~~~~~~~
966

    
967
This function builds the design according to sample properties.
968

    
969
Usage
970
~~~~~
971

    
972
::
973

    
974
    get_design(marker, combi, all_samples)
975

    
976
Arguments
977
~~~~~~~~~
978

    
979
``marker``
980

    
981
The marker to consider.
982

    
983
``combi``
984

    
985
The strain combination to consider.
986

    
987
``all_samples``
988

    
989
Global list of samples.
990

    
991
Author(s)
992
~~~~~~~~~
993

    
994
Florent Chuffart
995

    
996
997

    
998
Compute the fuzzy list for a given strain.
999
------------------------------------------
1000

    
1001
Description
1002
~~~~~~~~~~~
1003

    
1004
This function grabs the nucleosomes detected by template\_filter that
have been rejected by aggregate\_intra\_strain\_nucs as well positioned.
1006

    
1007
Usage
1008
~~~~~
1009

    
1010
::
1011

    
1012
    get_intra_strain_fuzzy(wp_map, roi, strain, config = NULL)
1013

    
1014
Arguments
1015
~~~~~~~~~
1016

    
1017
``wp_map``
1018

    
1019
Well-positioned nucleosome map.
1020

    
1021
``roi``
1022

    
1023
The region of interest.
1024

    
1025
``strain``
1026

    
1027
The strain for which we want to extract the fuzzy map.
1028

    
1029
``config``
1030

    
1031
GLOBAL config variable.
1032

    
1033
Author(s)
1034
~~~~~~~~~
1035

    
1036
Florent Chuffart
1037

    
1038
1039

    
1040
Compute the list of SNEPs for a given set of marker, strain combination and nuc form.
1041
-------------------------------------------------------------------------------------
1042

    
1043
Description
1044
~~~~~~~~~~~
1045

    
1046
This function uses
1047

    
1048
Usage
1049
~~~~~
1050

    
1051
::
1052

    
1053
    get_sneps(marker, combi, form, all_samples, FDR = 1e-04, config = NULL)
1054

    
1055
Arguments
1056
~~~~~~~~~
1057

    
1058
``marker``
1059

    
1060
The marker involved.
1061

    
1062
``combi``
1063

    
1064
The strain combination involved.
1065

    
1066
``form``
1067

    
1068
the nuc form involved.
1069

    
1070
``all_samples``
1071

    
1072
Global list of samples.
1073

    
1074
``FDR``
1075

    
1076
``config``
1077

    
1078
GLOBAL config variable
1079

    
1080
Author(s)
1081
~~~~~~~~~
1082

    
1083
Florent Chuffart
1084

    
1085
Examples
1086
~~~~~~~~
1087

    
1088
::
1089

    
1090
    marker = "H3K4me1"
1091
    combi = c("BY", "YJM")
1092
    form = "wpunr" # "wp" | "unr" | "wpunr"
1093
    # foo = get_sneps(marker, combi, form)
1094
    # foo = get_sneps("H4K12ac", c("BY", "RM"), "wp")
1095

    
1096
1097

    
1098
Compute the unaligned nucleosomal regions (UNRs).
1099
-------------------------------------------------
1100

    
1101
Description
1102
~~~~~~~~~~~
1103

    
1104
This function aggregates the non-common wp nucs of each strain and subtracts
the common wp nucs. It does not take into account the size of the resulting
UNR; this is handled in the read-counting part of the pipeline.
1108

    
1109
Usage
1110
~~~~~
1111

    
1112
::
1113

    
1114
    get_unrs(combi, roi, cur_index, wp_maps, fuzzy_maps, common_nuc_results, 
1115
        config = NULL)
1116

    
1117
Arguments
1118
~~~~~~~~~
1119

    
1120
``combi``
1121

    
1122
The strain combination to consider.
1123

    
1124
``roi``
1125

    
1126
The region of interest.
1127

    
1128
``cur_index``
1129

    
1130
The region of interest index.
1131

    
1132
``wp_maps``
1133

    
1134
Well-positioned nucleosome maps.
1135

    
1136
``fuzzy_maps``
1137

    
1138
Fuzzy nucleosomes maps.
1139

    
1140
``common_nuc_results``
1141

    
1142
Common wp nuc maps
1143

    
1144
``config``
1145

    
1146
GLOBAL config variable
1147

    
1148
Author(s)
1149
~~~~~~~~~
1150

    
1151
Florent Chuffart
1152

    
1153
1154

    
1155
Returns the intersection of 2 lists of regions.
-----------------------------------------------
1157

    
1158
Description
1159
~~~~~~~~~~~
1160

    
1161
This function...
1162

    
1163
Usage
1164
~~~~~
1165

    
1166
::
1167

    
1168
    intersect_region(region1, region2)
1169

    
1170
Arguments
1171
~~~~~~~~~
1172

    
1173
``region1``
1174

    
1175
Original regions.
1176

    
1177
``region2``
1178

    
1179
Regions to intersect.
1180

    
1181
Author(s)
1182
~~~~~~~~~
1183

    
1184
Florent Chuffart
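
Examples
~~~~~~~~

The idea of intersecting two lists of regions can be sketched with closed
intervals. The representation used here (data frames with ``lower_bound``
and ``upper_bound`` columns) and the name ``intersect_sketch`` are
assumptions made for illustration; the structure actually expected by
``intersect_region`` is not described in this document.

::

    # Hypothetical helper, not the package's intersect_region().
    intersect_sketch = function(region1, region2) {
      out = NULL
      for (i in seq_len(nrow(region1))) {
        for (j in seq_len(nrow(region2))) {
          lo = max(region1$lower_bound[i], region2$lower_bound[j])
          hi = min(region1$upper_bound[i], region2$upper_bound[j])
          # Keep only non-empty overlaps.
          if (lo <= hi) {
            out = rbind(out, data.frame(lower_bound = lo, upper_bound = hi))
          }
        }
      }
      out
    }

    r1 = data.frame(lower_bound = c(100, 500), upper_bound = c(300, 700))
    r2 = data.frame(lower_bound = c(250, 800), upper_bound = c(600, 900))
    common = intersect_sketch(r1, r2)
    print(common)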
1185

    
1186
1187

    
1188
Likelihood ratio
1189
----------------
1190

    
1191
Description
1192
~~~~~~~~~~~
1193

    
1194
Compute the log likelihood ratio of two or more sets of values.
1195

    
1196
Usage
1197
~~~~~
1198

    
1199
::
1200

    
1201
    llr_score_nvecs(xs)
1202

    
1203
Arguments
1204
~~~~~~~~~
1205

    
1206
``xs``
1207

    
1208
list of vectors.
1209

    
1210
Value
1211
~~~~~
1212

    
1213
Returns the log likelihood ratio.
1214

    
1215
Author(s)
1216
~~~~~~~~~
1217

    
1218
Florent Chuffart
1219

    
1220
Examples
1221
~~~~~~~~
1222

    
1223
::
1224

    
1225
    # LLR score for 2 sets of values
1226
    mean1=5; sd1=2; card2 = 250
1227
    mean2=6; sd2=3; card1 = 200
1228
    x1 = rnorm(card1, mean1, sd1)
1229
    x2 = rnorm(card2, mean2, sd2)
1230
    min = floor(min(c(x1,x2)))
1231
    max = ceiling(max(c(x1,x2)))
1232
    hist(c(x1,x2), xlim=c(min, max), breaks=min:max)
1233
    lines(min:max,dnorm(min:max,mean1,sd1)*card1,col=2)
1234
    lines(min:max,dnorm(min:max,mean2,sd2)*card2,col=3)
1235
    lines(min:max,dnorm(min:max,mean(c(x1,x2)),sd(c(x1,x2)))*(card1+card2),col=4)
1236
    llr_score_nvecs(list(x1,x2))
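
One common way to define such a score, shown here only as a sketch, compares
a model with one Gaussian per set against a single Gaussian fitted to the
pooled values. Whether ``llr_score_nvecs`` uses exactly this formulation is
an assumption; ``gauss_loglik``, ``sd_mle`` and ``llr_sketch`` are
hypothetical names.

::

    # Gaussian log likelihood of x for a given mean and standard deviation.
    gauss_loglik = function(x, mu, sigma) {
      sum(dnorm(x, mean = mu, sd = sigma, log = TRUE))
    }
    # Maximum likelihood estimate of the standard deviation (divides by n).
    sd_mle = function(x) sqrt(mean((x - mean(x))^2))

    # Hypothetical helper, not the package's llr_score_nvecs().
    llr_sketch = function(xs) {
      pooled = unlist(xs)
      l_separate = sum(sapply(xs, function(x) gauss_loglik(x, mean(x), sd_mle(x))))
      l_pooled = gauss_loglik(pooled, mean(pooled), sd_mle(pooled))
      l_separate - l_pooled
    }

    set.seed(1)
    x1 = rnorm(200, 5, 2)
    x2 = rnorm(250, 6, 3)
    print(llr_sketch(list(x1, x2)))   # clearly positive: the sets differ
    print(llr_sketch(list(x1, x1)))   # numerically zero: identical sets

With maximum likelihood estimates the score cannot be negative, because the
pooled model is a special case of the separate one.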
1237

    
1238
1239

    
1240
nm
1241
--
1242

    
1243
Description
1244
~~~~~~~~~~~
1245

    
1246
It provides a set of useful functions for performing quantitative
analysis of the nucleosomal epigenome.
1248

    
1249
Details
1250
~~~~~~~
1251

    
1252
+---------------+---------------------------------------------------+
1253
| Package:      | nucleominer                                       |
1254
+---------------+---------------------------------------------------+
1255
| Maintainer:   | Florent Chuffart <florent.chuffart@ens-lyon.fr>   |
1256
+---------------+---------------------------------------------------+
1257
| Author:       | Florent Chuffart                                  |
1258
+---------------+---------------------------------------------------+
1259
| Version:      | 2.3.51                                            |
1260
+---------------+---------------------------------------------------+
1261
| License:      | CeCILL                                            |
1262
+---------------+---------------------------------------------------+
1263
| Title:        | nm                                                |
1264
+---------------+---------------------------------------------------+
1265
| Depends:      | seqinr, plotrix, DESeq, cachecache, dplyr         |
1266
+---------------+---------------------------------------------------+
1267

    
1268
Author(s)
1269
~~~~~~~~~
1270

    
1271
Florent Chuffart
1272

    
1273
1274

    
1275
Plot the distribution of reads.
1276
-------------------------------
1277

    
1278
Description
1279
~~~~~~~~~~~
1280

    
1281
This function uses the DESeq normalization feature to compare the
distributions qualitatively.
1283

    
1284
Usage
1285
~~~~~
1286

    
1287
::
1288

    
1289
    plot_dist_samples(strain, marker, res, all_samples, NEWPLOT = TRUE)
1290

    
1291
Arguments
1292
~~~~~~~~~
1293

    
1294
``strain``
1295

    
1296
The strain to consider.
1297

    
1298
``marker``
1299

    
1300
The marker to consider.
1301

    
1302
``res``
1303

    
1304
Data
1305

    
1306
``all_samples``
1307

    
1308
Global list of samples.
1309

    
1310
``NEWPLOT``
1311

    
1312
If FALSE, the curve will be added to the current plot.
1313

    
1314
Author(s)
1315
~~~~~~~~~
1316

    
1317
Florent Chuffart
1318

    
1319
1320

    
1321
sign from strand
1322
----------------
1323

    
1324
Description
1325
~~~~~~~~~~~
1326

    
1327
Get the sign of strand
1328

    
1329
Usage
1330
~~~~~
1331

    
1332
::
1333

    
1334
    sign_from_strand(strands)
1335

    
1336
Arguments
1337
~~~~~~~~~
1338

    
1339
+---------------+----+
1340
| ``strands``   |    |
1341
+---------------+----+
1342

    
1343
Value
1344
~~~~~
1345

    
1346
Returns 1 if the strand is forward, -1 otherwise.
1347

    
1348
Author(s)
1349
~~~~~~~~~
1350

    
1351
Florent Chuffart
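
Examples
~~~~~~~~

The documented return values can be sketched in one line, assuming that the
forward strand is coded ``"F"`` as in the examples of this document.
``sign_sketch`` is a hypothetical name.

::

    # Hypothetical helper: 1 for forward ("F") strands, -1 for anything else.
    sign_sketch = function(strands) ifelse(strands == "F", 1, -1)

    signs = sign_sketch(c("F", "R", "F"))
    print(signs)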
1352

    
1353
1354

    
1355
Substract to a list of regions an other list of regions that intersect it.
1356
--------------------------------------------------------------------------
1357

    
1358
Description
1359
~~~~~~~~~~~
1360

    
1361
This fucntion embed a recursive part. It occurs when a substracted
1362
region split an original region on two.
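
The recursive case can be sketched as follows. This is a minimal sketch
under stated assumptions, not the code of ``substract_region``: the name
``subtract_sketch`` is hypothetical, and regions are assumed to be lists
with inclusive integer bounds named ``begin`` and ``end``.

::

    # Hypothetical sketch: subtract a list of regions from one region.
    # Assumed region format: list(begin = ..., end = ...), inclusive bounds.
    subtract_sketch = function(region, to_remove) {
        if (length(to_remove) == 0) {
            return(list(region))
        }
        hole = to_remove[[1]]
        rest = to_remove[-1]
        # No overlap with this hole: move on to the remaining ones.
        if (hole$end < region$begin || hole$begin > region$end) {
            return(subtract_sketch(region, rest))
        }
        pieces = list()
        # The part on the left of the hole survives: recurse on it.
        if (hole$begin > region$begin) {
            left = list(begin = region$begin, end = hole$begin - 1)
            pieces = c(pieces, subtract_sketch(left, rest))
        }
        # The part on the right of the hole survives: recurse on it.
        if (hole$end < region$end) {
            right = list(begin = hole$end + 1, end = region$end)
            pieces = c(pieces, subtract_sketch(right, rest))
        }
        pieces
    }

    # The hole 40..60 splits the region 1..100 in two.
    pieces = subtract_sketch(
        list(begin = 1, end = 100),
        list(list(begin = 40, end = 60))
    )

When the hole lies strictly inside the region, both branches run and the
function returns two pieces, which is the split described above.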

Usage
~~~~~

::

    substract_region(region1, region2)

Arguments
~~~~~~~~~

``region1``

Original regions.

``region2``

Regions to subtract.

Author(s)
~~~~~~~~~

Florent Chuffart

Switch a pairlist
-----------------

Description
~~~~~~~~~~~

Take a pairlist key:value and return the switched pairlist value:key.

Usage
~~~~~

::

    switch_pairlist(l)

Arguments
~~~~~~~~~

``l``

The pairlist to switch.

Value
~~~~~

The switched pairlist.

Author(s)
~~~~~~~~~

Florent Chuffart

Examples
~~~~~~~~

::

    l = list(key1 = "value1", key2 = "value2")
    print(switch_pairlist(l))

Translate coords of a genome region.
------------------------------------

Description
~~~~~~~~~~~

This function is used in the examples. Usually, you have to define your
own translation function and overwrite this one using the *unlockBinding*
feature. Please refer to the example.

Usage
~~~~~

::

    translate_cur(roi, strain2, config = NULL, big_cur = NULL)

Arguments
~~~~~~~~~

``roi``

Original genome region of interest.

``strain2``

The strain in which you want the genome region of interest.

``config``

GLOBAL config variable

``big_cur``

A region larger than roi, used to filter c2c if needed.

Author(s)
~~~~~~~~~

Florent Chuffart

Examples
~~~~~~~~

::

    # Define new translate_cur function...
    translate_cur = function(roi, strain2, config) {
        strain1 = roi$strain_ref
        if (strain1 == strain2) {
            return(roi)
        } else {
            stop("Here is my new translate_cur function...")
        }
    }
    # Bind it by uncommenting the following lines.
    # unlockBinding("translate_cur", as.environment("package:nm"))
    # unlockBinding("translate_cur", getNamespace("nm"))
    # assign("translate_cur", translate_cur, "package:nm")
    # assign("translate_cur", translate_cur, getNamespace("nm"))
    # lockBinding("translate_cur", getNamespace("nm"))
    # lockBinding("translate_cur", as.environment("package:nm"))

Translate a list of regions from a strain ref to another.
---------------------------------------------------------

Description
~~~~~~~~~~~

This function is an elaborate call to translate\_cur.

Usage
~~~~~

::

    translate_regions(regions, combi, cur_index, config = NULL, roi)

Arguments
~~~~~~~~~

``regions``

Regions to be translated.

``combi``

Combination of strains.

``cur_index``

The region of interest index.

``config``

GLOBAL config variable

``roi``

The region of interest.

Author(s)
~~~~~~~~~

Florent Chuffart

Aggregate regions that intersect themselves.
--------------------------------------------

Description
~~~~~~~~~~~

This function sorts regions by their lower bounds to detect regions that
intersect: the lower bound of each region is compared with the upper bound
of the previous item. The function embeds a while loop that breaks when the
list of regions becomes stable.
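
The approach described above can be sketched as follows. This is a minimal
sketch, not the code of ``union_regions``: the name ``union_sketch`` is
hypothetical, and regions are assumed to be rows of a data frame with
columns named ``begin`` and ``end``.

::

    # Hypothetical sketch: merge intersecting regions.
    # Assumed region format: data.frame with columns begin and end.
    union_sketch = function(regions) {
        stable = FALSE
        while (!stable) {
            stable = TRUE
            # Sort by lower bound so overlaps only involve neighbours.
            regions = regions[order(regions$begin), ]
            i = 2
            while (i <= nrow(regions)) {
                if (regions$begin[i] <= regions$end[i - 1]) {
                    # Overlap: extend the previous region, drop the current one.
                    regions$end[i - 1] = max(regions$end[i - 1], regions$end[i])
                    regions = regions[-i, ]
                    stable = FALSE
                } else {
                    i = i + 1
                }
            }
        }
        regions
    }

    regions = data.frame(begin = c(50, 1, 30, 200), end = c(120, 40, 60, 250))
    merged = union_sketch(regions)

In this toy input the first three regions chain into a single region from
1 to 120, while the region from 200 to 250 is left untouched.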

Usage
~~~~~

::

    union_regions(regions)

Arguments
~~~~~~~~~

``regions``

The regions to be aggregated.

Author(s)
~~~~~~~~~

Florent Chuffart

Watching analysis of samples
----------------------------

Description
~~~~~~~~~~~

This function displays the analysis for a particular region of the
genome.

Usage
~~~~~

::

    watch_samples(replicates, read_length, plot_ref_genome = TRUE,
        plot_arrow_raw_reads = TRUE, plot_arrow_nuc_reads = TRUE,
        plot_squared_reads = TRUE, plot_coverage = FALSE, plot_gaussian_reads = TRUE,
        plot_gaussian_unified_reads = TRUE, plot_ellipse_nucs = TRUE,
        change_col = TRUE, plot_wp_nucs = TRUE, plot_fuzzy_nucs = FALSE,
        plot_wp_nuc_model = TRUE, plot_common_nucs = FALSE, plot_common_unrs = FALSE,
        plot_wp_nucs_4_nonmnase = FALSE, plot_chain = FALSE, plot_sample_id = FALSE,
        aggregated_intra_strain_nucs = NULL, aligned_inter_strain_nucs = NULL,
        height = 10, main = NULL, xlab = NULL, ylab = "#reads (per million reads)",
        config = NULL)

Arguments
~~~~~~~~~

``replicates``

replicates under the form...

``read_length``

Length of the reads.

``plot_ref_genome``

Plot (or not) reference genome.

``plot_arrow_raw_reads``

Plot (or not) arrows for raw reads.

``plot_arrow_nuc_reads``

Plot (or not) arrows for reads associated with a nucleosome.

``plot_squared_reads``

Plot (or not) reads in the square fashion.

``plot_coverage``

Plot (or not) reads in the coverage fashion.

``plot_gaussian_reads``

Plot (or not) gaussian model of F and R reads.

``plot_gaussian_unified_reads``

Plot (or not) gaussian model of a nuc.

``plot_ellipse_nucs``

Plot (or not) ellipse for a nuc.

``change_col``

Change the color of each nucleosome.

``plot_wp_nucs``

Plot (or not) cluster of nucs.

``plot_fuzzy_nucs``

Plot (or not) cluster of fuzzy nucs.

``plot_wp_nuc_model``

Plot (or not) gaussian model for a cluster of nucs.

``plot_common_nucs``

Plot (or not) aligned reads.

``plot_common_unrs``

Plot (or not) unaligned nucleosomal regions (UNRs).

``plot_wp_nucs_4_nonmnase``

Plot (or not) clusters for non-input samples.

``plot_chain``

Plot (or not) clustered nucleosomes between MNase samples.

``plot_sample_id``

Plot (or not) the sample id for each sample.

``aggregated_intra_strain_nucs``

List of aggregated intra-strain nucs. If NULL, it will be computed.

``aligned_inter_strain_nucs``

List of aligned inter-strain nucs. If NULL, it will be computed.

``height``

Number of reads (per million reads) for each sample; graphical parameter
for the y axis.

``main``

Main title of the produced plot.

``xlab``

xlab of the produced plot.

``ylab``

ylab of the produced plot.

``config``

GLOBAL config variable

Author(s)
~~~~~~~~~

Florent Chuffart