Diff - NucleoMiner - Forge du Centre Blaise Pascal

Revision e5603c3f

     *****************************************
     *NucleoMiner2* offers a Python API and an R package to perform
     quantitative analysis of nucleosomal epigenome. It is especially well
     suited for scripting to extract natural Single-Nucleosome Epi-
     Polymorphisms (SNEP) from ChIP-Seq data.
     quantitative analysis of epigenetic marks on individual nucleosomes.
     It was developed to detect natural Single-Nucleosome Epi-Polymorphisms
     (SNEP) from MNase-seq and ChIP-seq data.
     License
-...
     of the economic rights,  and the successive licensors  have only
     limited liability.
     In this respect, the user's attention is drawn to the risks associated
     with loading,  using,  modifying and/or developing or reproducing the
     software by the user in light of its specific status of free software,
     that may mean  that it is complicated to manipulate,  and  that  also
     therefore means  that it is reserved for developers  and  experienced
     professionals having in-depth computer knowledge. Users are therefore
     encouraged to load and test the software's suitability as regards
     their requirements in conditions enabling the security of their
     systems and/or data to be ensured and,  more generally, to use and
     operate it in the same conditions as regards security.
     This software is provided with absolutely NO WARRANTY. The authors can
     not be held responsible, even partially, for any damage, loss,
     financial loss or any other undesired facts resulting from the use of
     the software. In this respect, the user's attention is drawn to the
     risks associated with loading,  using,  modifying and/or developing or
     reproducing the software by the user in light of its specific status
     of free software, that may mean  that it is complicated to manipulate,
     and  that  also therefore means  that it is reserved for developers
     and  experienced professionals having in-depth computer knowledge.
     Users are therefore encouraged to load and test the software's
     suitability as regards their requirements in conditions enabling the
     security of their systems and/or data to be ensured and,  more
     generally, to use and operate it in the same conditions as regards
     security.
     The fact that you are presently reading this means that you have had
     knowledge of the CeCILL license and that you accept its terms.
-...
     Links
     -----
     *NucleoMiner2* home page and documentation: https://forge.cbp.ens-
     lyon.fr/redmine/projects/nucleominer
     *NucleoMiner2* home page and documentation are available here:
     Gael Yvert lab page: http://www.ens-lyon.fr/LBMC/gisv/
        * https://forge.cbp.ens-lyon.fr/redmine/projects/nucleominer
     The Yvert lab web page is accessible here:
        * http://www.ens-lyon.fr/LBMC/gisv/
     Installation
     ------------
        * Download archive
        * Compile bowtie2
     Prerequisites
     ~~~~~~~~~~~~~
     To work properly, NucleoMiner2 needs that the following free software
     are installed and made available on your system:
        * Bowtie2 http://bowtie-bio.sourceforge.net/bowtie2
        * Compile samtools
        * SAMtools http://samtools.sourceforge.net
        * Compile bedtools
        * bedtools http://code.google.com/p/bedtools/
        * Compile TemplateFilter
        * TemplateFilter
          http://compbio.cs.huji.ac.il/NucPosition/TemplateFiltering
     Required R packages:
        * bot
     It also requires the following R packages to be installed on your
     system:
        * fork
-...
        * seqinr
        * cachecache
        cd src/r_packages/
              tar xfvz R-latest.tar.gz
              cd R-patched
              ./configure --with-x=no PDFLATEX="ls"
              make
        cd ../../..
        R_BIN=src/r_packages/R-patched/bin/R
              $R_BIN CMD INSTALL src/r_packages/rjson_0.2.12.tar.gz
              $R_BIN CMD INSTALL src/r_packages/seqinr_3.0-7.tar.gz
              $R_BIN CMD INSTALL src/r_packages/plotrix_3.4-5.tar.gz
              $R_BIN CMD INSTALL src/r_packages/nm_2.0.tar.gz
              $R_BIN CMD INSTALL src/r_packages/fork_1.2.4.tar.gz
              $R_BIN CMD INSTALL src/r_packages/bot_0.9.tar.gz
              $R_BIN CMD INSTALL src/r_packages/DESeq_1.14.0.tar.gz
     ...
        * cachecache https://forge.cbp.ens-
          lyon.fr/redmine/projects/cachecache
     usage
     =====
        * bot https://forge.cbp.ens-lyon.fr/redmine/projects/bot
     See html documentation for *NucleoMiner2*: http://www.ens-
     lyon.fr/LBMC/gisv/
        * nucleominer https://forge.cbp.ens-
          lyon.fr/redmine/projects/nucleominer

     Welcome to *NucleoMiner2*
     *************************
     * Readme / Documentation for *NucleoMiner2*
       * License
       * Installation Instructions
       * usage
     * Tutorial
       * Python and R Common Configuration File
       * Dataset and Configuration Variables
       * Preprocessing Illumina Fastq Reads for Each Sample
       * Inferring Nucleosome Position and Extracting Read Counts
       * Results
     * References
       * Python Reference
       * R Reference
     Indices and tables
     ******************
     * *Index*
     * *Search Page*

     Readme / Documentation for *NucleoMiner2*
     *****************************************
     *NucleoMiner2* offers a Python API and an R package for the
     quantitative analysis of the nucleosomal epigenome. It is especially
     well suited to scripting the extraction of natural Single-Nucleosome
     Epi-Polymorphisms (SNEP) from ChIP-Seq data.
     License
     =======
     Copyright CNRS 2012-2013
     * Florent CHUFFART
     * Jean-Baptiste VEYRIERAS
     * Gael YVERT
     This software is a computer program whose purpose is to perform
     quantitative analysis of epigenetic marks at single nucleosome
     resolution.
     This software is governed by the CeCILL license under French law and
     abiding by the rules of distribution of free software.  You can  use,
     modify and/ or redistribute the software under the terms of the CeCILL
     license as circulated by CEA, CNRS and INRIA at the following URL
     "http://www.cecill.info".
     As a counterpart to the access to the source code and  rights to copy,
     modify and redistribute granted by the license, users are provided
     only with a limited warranty  and the software's author,  the holder
     of the economic rights,  and the successive licensors  have only
     limited liability.
     In this respect, the user's attention is drawn to the risks associated
     with loading,  using,  modifying and/or developing or reproducing the
     software by the user in light of its specific status of free software,
     that may mean  that it is complicated to manipulate,  and  that  also
     therefore means  that it is reserved for developers  and  experienced
     professionals having in-depth computer knowledge. Users are therefore
     encouraged to load and test the software's suitability as regards
     their requirements in conditions enabling the security of their
     systems and/or data to be ensured and,  more generally, to use and
     operate it in the same conditions as regards security.
     The fact that you are presently reading this means that you have had
     knowledge of the CeCILL license and that you accept its terms.
     Installation Instructions
     =========================
     Links
     -----
     *NucleoMiner2* home page and documentation: https://forge.cbp.ens-
     lyon.fr/redmine/projects/nucleominer
     Gael Yvert lab page: http://www.ens-lyon.fr/LBMC/gisv/
     Installation
     ------------
        * Download archive
        * Compile bowtie2
        * Compile samtools
        * Compile bedtools
        * Compile TemplateFilter
     Required R packages:
        * bot
        * fork
        * rjson
        * seqinr
        * cachecache
        cd src/r_packages/
              tar xfvz R-latest.tar.gz
              cd R-patched
              ./configure --with-x=no PDFLATEX="ls"
              make
        cd ../../..
        R_BIN=src/r_packages/R-patched/bin/R
              $R_BIN CMD INSTALL src/r_packages/rjson_0.2.12.tar.gz
              $R_BIN CMD INSTALL src/r_packages/seqinr_3.0-7.tar.gz
              $R_BIN CMD INSTALL src/r_packages/plotrix_3.4-5.tar.gz
              $R_BIN CMD INSTALL src/r_packages/nm_2.0.tar.gz
              $R_BIN CMD INSTALL src/r_packages/fork_1.2.4.tar.gz
              $R_BIN CMD INSTALL src/r_packages/bot_0.9.tar.gz
              $R_BIN CMD INSTALL src/r_packages/DESeq_1.14.0.tar.gz
     ...
     usage
     =====
     See html documentation for *NucleoMiner2*: http://www.ens-
     lyon.fr/LBMC/gisv/

     References
     **********
     Python Reference
     ================
     configurator.CSV_SAMPLE_FILE = None
        Path to the CSV file that contains sample information.
     configurator.BOWTIE_BUILD_BIN = None
        Path for bowtie2 build bin.
     configurator.BOWTIE2_BIN = None
        Path for bowtie2 bin.
     configurator.SAMTOOLS_BIN = None
        Path for samtools bin.
     configurator.BEDTOOLS_BIN = None
        Path for bedtools bin.
     configurator.TF_BIN = None
        Path for TemplateFilter bin.
     configurator.TF_TEMPLATES_FILE = None
        Path for TemplateFilter templates file.
     configurator.ILLUMINA_OUTPUTFILE_PREFIX = None
        Prefix for Illumina fastq output files.
     configurator.INDEX_DIR = None
        Path for index dir.
     configurator.ALIGN_DIR = None
        Path for align dir.
     configurator.LOG_DIR = None
        Path for log dir.
     configurator.CACHE_DIR = None
        Path for cache dir.
     configurator.RESULTS_DIR = None
        Path for results dir.
     configurator.FASTA_REFERENCE_GENOME_FILES = None
        Dictionary where each FASTA reference genome is indexed by the
        reference strain it corresponds to.
     configurator.AREA_BLACK_LIST = None
        Dictionary where keys are strains and values are blacklisted
        genome regions.
     configurator.FASTA_INDEXES = None
        Dictionary indexed by strain. Each value is a dictionary where
        keys are chromosome references from the Fastq file and values
        are their correspondence for TemplateFilter.
     configurator.C2C_FILES = None
        Dictionary where each strain combination indexes a genome alignment.
     configurator.READ_LENGTH = None
        Length of Illumina reads.
     configurator.MAPQ_THRES = None
        Alignment quality threshold.
     configurator.TF_CORR = None
        TemplateFilter Template correlation threshold.
     configurator.TF_MINW = None
        TemplateFilter minimum width of a nucleosome.
     configurator.TF_MAXW = None
        TemplateFilter maximum width of a nucleosome.
     configurator.TF_OL = None
        TemplateFilter maximum allowed overlap for two nucleosomes.
     wf.json_conf_file = 'src/current/nucleominer_config.json'
        Path to the json configuration file.
     wf.samples = []
        List of samples where a sample is identified by an id (key: *id*)
        and a strain name (key: *strain*).
     wf.samples_mnase = []
        List of MNase samples.
     wf.strains = []
        List of reference strains.
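
A minimal sketch of how the shared JSON configuration file could be read from Python. The key names come from the configurator variables listed above; the values, the `REQUIRED_KEYS` tuple and the `load_config` helper are illustrative assumptions, not part of the NucleoMiner2 API.

```python
import json

# Keys are taken from the configurator reference above; values are made up.
EXAMPLE_CONFIG = """
{
  "CSV_SAMPLE_FILE": "data/samples.csv",
  "READ_LENGTH": 50,
  "MAPQ_THRES": 30,
  "TF_CORR": 0.5,
  "TF_MINW": 100,
  "TF_MAXW": 200,
  "TF_OL": 100
}
"""

# Hypothetical choice of keys that this sketch treats as mandatory.
REQUIRED_KEYS = ("CSV_SAMPLE_FILE", "READ_LENGTH", "MAPQ_THRES")


def load_config(text):
    """Parse a JSON configuration string and check that required keys exist."""
    config = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError("missing configuration keys: " + ", ".join(missing))
    return config


config = load_config(EXAMPLE_CONFIG)
print(config["READ_LENGTH"])
```

Checking for required keys up front makes a misconfigured run fail early rather than in the middle of an alignment.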
     libcoverage.create_bowtie_index(strain, strain_fasta_ref, index_dir, bowtie_build_bin)
        Creates bowtie index for a strain *strain*.
        Parameters:
           * **strain** -- the strain reference.
           * **strain_fasta_ref** -- fasta reference genome.
           * **index_dir** -- directory where the bowtie index is stored.
           * **bowtie_build_bin** -- bowtie2 build binary.
     libcoverage.align_reads(sample, align_dir, log_dir, index_dir, illumina_outputfile_prefix, bowtie2_bin, samtools_bin, bedtools_bin)
        Aligns reads to reference genomes. It produces .sam files, that are
        converted to .bam, that are converted to .bed.
        Parameters:
           * **sample** -- a dict that describes a sample.
           * **align_dir** -- directory where aligned reads will be
             stored.
           * **log_dir** -- directory where logs will be stored.
           * **illumina_outputfile_prefix** -- prefix of Illumina
             sequencer fastq.gz output files.
           * **bowtie2_bin** -- bowtie2 binary.
           * **samtools_bin** -- samtools binary.
           * **bedtools_bin** -- bedtools binary.
           * **index_dir** -- bowtie index directory.
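
The indexing step and the .sam to .bam to .bed chain described above can be sketched as a list of command lines. This is only an illustration: the file naming scheme and both helper functions are assumptions, and the exact options used by `libcoverage` are not shown in this document. The command-line flags themselves (`bowtie2 -x/-U/-S`, `samtools view -b -S -o`, `bedtools bamtobed -i`) are standard options of those tools.

```python
def build_index_command(strain, strain_fasta_ref, index_dir, bowtie_build_bin):
    """Command that builds a bowtie2 index for one strain."""
    index_base = f"{index_dir}/{strain}"  # hypothetical index naming
    return [bowtie_build_bin, strain_fasta_ref, index_base]


def build_alignment_commands(sample_id, strain, fastq_file, align_dir,
                             index_dir, bowtie2_bin, samtools_bin,
                             bedtools_bin):
    """Commands that align reads, then convert .sam to .bam to .bed."""
    prefix = f"{align_dir}/sample_{sample_id}_{strain}"  # hypothetical naming
    sam, bam = prefix + ".sam", prefix + ".bam"
    return [
        # Align unpaired reads against the strain index, writing SAM.
        [bowtie2_bin, "-x", f"{index_dir}/{strain}", "-U", fastq_file, "-S", sam],
        # Convert SAM to BAM.
        [samtools_bin, "view", "-b", "-S", "-o", bam, sam],
        # Convert BAM to BED; bedtools writes the BED records to stdout.
        [bedtools_bin, "bamtobed", "-i", bam],
    ]


commands = build_alignment_commands(
    1, "BY", "reads.fastq.gz", "align", "index",
    "bowtie2", "samtools", "bedtools")
for command in commands:
    print(" ".join(command))
```

Each command is kept as a list of arguments so that it can be handed to `subprocess.run` without shell quoting issues; the last one needs its standard output redirected to the target .bed file.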
     libcoverage.split_fr_4_TF(sample, align_dir, fasta_indexes, area_black_list, read_length, mapq_thres)
        Creates TemplateFilter input files from bed files. This function
        proceeds in two steps. First, it collects reads from bed files
        and feeds a data structure.
        Parameters:
           * **sample** -- a dict that describes a sample.
           * **align_dir** -- directory where aligned reads will be
             stored.
           * **fasta_indexes** -- the chr reference from the illumina
             output file.
           * **area_black_list** -- the description of the genome regions
             that will be omitted.
           * **read_length** -- Length of Illumina reads.
           * **mapq_thres** -- mapping quality criterion threshold, see
             MAPQ in BED/BAM file format.
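
A sketch of the read selection implied by the `area_black_list` and `mapq_thres` parameters. It assumes six-column BED records as produced by `bedtools bamtobed` (where the score column carries the MAPQ), a blacklist given as `(chromosome, begin, end)` tuples, and an inclusive threshold; the real data structures used by `split_fr_4_TF` are not described here.

```python
def keep_read(bed_line, mapq_thres, black_list):
    """Return True if a BED record passes the MAPQ and blacklist filters."""
    chrom, start, end, _name, score, _strand = bed_line.split("\t")[:6]
    start, end = int(start), int(end)
    # Assumption: reads with MAPQ equal to the threshold are kept.
    if int(score) < mapq_thres:
        return False
    # Drop reads that overlap any blacklisted region on the same chromosome.
    for b_chrom, b_start, b_end in black_list:
        if chrom == b_chrom and start < b_end and end > b_start:
            return False
    return True


black_list = [("chr1", 5000, 6000)]  # made-up blacklisted region
lines = [
    "chr1\t100\t150\tread1\t42\t+",   # kept
    "chr1\t200\t250\tread2\t3\t-",    # low MAPQ
    "chr1\t5990\t6040\tread3\t42\t+", # overlaps the blacklist
]
kept = [line for line in lines if keep_read(line, 30, black_list)]
print(len(kept))
```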
     libcoverage.template_filter(sample, align_dir, log_dir, tf_bin, tf_templates_file, corr, minw, maxw, ol)
        Runs TemplateFilter on a specific sample. It produces a .tab file.
        Parameters:
           * **sample** -- a dict that describes a sample.
           * **align_dir** -- directory where aligned reads will be
             stored.
           * **log_dir** -- directory where logs will be stored.
           * **tf_bin** -- path to the TemplateFilter binary.
           * **tf_templates_file** -- path to the TemplateFilter
             templates file.
           * **corr** -- correlation threshold, passed to
             TemplateFilter.
           * **minw** -- minimum width of a nuc, passed to
             TemplateFilter.
           * **maxw** -- maximum width of a nuc, passed to
             TemplateFilter.
           * **ol** -- maximum overlap for 2 nucs, passed to
             TemplateFilter.
     R Reference
     ===========
     Arabic to Roman pair list.
     --------------------------
     Description
     ~~~~~~~~~~~
     Utility to convert Arabic to Roman
     Usage
     ~~~~~
        ARAB2ROM()
     Author(s)
     ~~~~~~~~~
     Florent Chuffart
     R: False Discovery Rate
     False Discovery Rate
     --------------------
     Description
     ~~~~~~~~~~~
     From a vector x of independent p-values, extract the cutoff
     corresponding to the specified FDR. See Benjamini & Hochberg 1995
     paper.
     Usage
     ~~~~~
        FDR(x, FDR)
     Arguments
     ~~~~~~~~~
     "x"
     A vector x of independent p-values.
     "FDR"
     The specified FDR.
     Value
     ~~~~~
     Returns the corresponding cutoff.
     Author(s)
     ~~~~~~~~~
     Gael Yvert, Florent Chuffart
     Examples
     ~~~~~~~~
        print("example")
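
For readers who want to see the cutoff rule spelled out, here is a sketch of the Benjamini and Hochberg step-up procedure in Python. It is not the package's R implementation: the function name `fdr_cutoff` and the choice to return `None` when no p-value qualifies are assumptions made for this illustration.

```python
def fdr_cutoff(p_values, fdr):
    """Return the largest p-value p_(k) such that p_(k) <= k / m * fdr.

    p_(k) is the k-th smallest of the m p-values. Returns None when no
    p-value satisfies the condition.
    """
    ordered = sorted(p_values)
    m = len(ordered)
    cutoff = None
    for rank, p in enumerate(ordered, start=1):
        if p <= rank / m * fdr:
            cutoff = p
    return cutoff


p_values = [0.001, 0.008, 0.039, 0.041, 0.042,
            0.060, 0.074, 0.205, 0.212, 0.216]
print(fdr_cutoff(p_values, 0.05))  # prints 0.008
```

With these ten p-values and an FDR of 0.05, only the two smallest satisfy the rank condition, so the cutoff is the second one.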
     R: Roman to Arabic pair list.
     Roman to Arabic pair list.
     --------------------------
     Description
     ~~~~~~~~~~~
     Utility to convert Roman to Arabic
     Usage
     ~~~~~
        ROM2ARAB()
     Author(s)
     ~~~~~~~~~
     Florent Chuffart
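
The two pair lists can be pictured as a lookup table and its inverse. The sketch below builds such tables in Python for the numbers 1 to 20; the range, the use of string keys and the `to_roman` helper are assumptions, since the content of `ARAB2ROM()` and `ROM2ARAB()` is not shown in this document.

```python
def to_roman(number):
    """Convert an integer between 1 and 39 to a Roman numeral."""
    symbols = [(10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]
    result = ""
    for value, symbol in symbols:
        while number >= value:
            result += symbol
            number -= value
    return result


# Arabic to Roman pairs, and the inverse table.
ARAB2ROM = {str(n): to_roman(n) for n in range(1, 21)}
ROM2ARAB = {roman: arab for arab, roman in ARAB2ROM.items()}

print(ARAB2ROM["16"], ROM2ARAB["IV"])
```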
     R: Aggregate replicated sample's nucleosomes.
     Aggregate replicated sample's nucleosomes.
     ------------------------------------------
     Description
     ~~~~~~~~~~~
     This function aggregates nucleosomes for replicated samples. It uses
     the TemplateFilter output of each sample as a replicate. Each sample
     owns a set of nucleosomes computed using TemplateFilter and ordered
     by the position of their center. Adjacent nucleosomes are compared
     two by two. The comparison is based on a log likelihood ratio score.
     The outcome of each comparison is either the merging or the
     separation of the adjacent nucleosomes. Finally, the function returns
     a list of clusters and all computed *llr_scores*. Each cluster owns
     an attribute *wp* for "well positioned". This attribute is set to
     *TRUE* if the cluster is composed of exactly one nucleosome from each
     sample.
     Usage
     ~~~~~
        aggregate_intra_strain_nucs(samples, llr_thres = 20, coord_max = 2e+07)
     Arguments
     ~~~~~~~~~
     "samples"
     A list of samples. Each sample is a list like *sample = list(id=...,
     marker=..., strain=..., roi=..., inputs=..., outputs=...)* with *roi =
     list(name=..., begin=..., end=..., chr=..., genome=...)*.
     "llr_thres"
     Log likelihood ratio threshold.
     "coord_max"
     A value too large to be a valid coordinate for a nucleosome lower bound.
     Value
     ~~~~~
     Returns a list of clusterized nucleosomes, and all computed llr
     scores.
     Author(s)
     ~~~~~~~~~
     Florent Chuffart
     Examples
     ~~~~~~~~
        # Dealing with a region of interest
        roi =list(name="example", begin=1000,  end=1300, chr="1", genome=rep("A",301))
        samples = list()
        for (i in 1:3) {
            # Create TF output
            tf_nuc = list("chr"=paste("chr", roi$chr, sep=""), "center"=(roi$end + roi$begin)/2, "width"= 150, "correlation.score"= 0.9)
            outputs = dfadd(NULL,tf_nuc)
            outputs = filter_tf_outputs(outputs, roi$chr, roi$begin, roi$end)
            # Generate corresponding reads
            nb_reads = round(runif(1,170,230))
            reads = round(rnorm(nb_reads, tf_nuc$center,20))
            u_reads = sort(unique(reads))
            strands = sample(c(rep("R",ceiling(length(u_reads)/2)),rep("F",floor(length(u_reads)/2))))
            counts = apply(t(u_reads), 2, function(r) { sum(reads == r)})
            shifts = apply(t(strands), 2, function(s) { if (s == "F") return(-tf_nuc$width/2) else return(tf_nuc$width/2)})
            u_reads = u_reads + shifts
            inputs = data.frame(list("V1" = rep(roi$chr, length(u_reads)),
                                     "V2" = u_reads,
                                     "V3" = strands,
                                     "V4" = counts), stringsAsFactors=FALSE)
            samples[[length(samples) + 1]] = list(id=1, marker="Mnase_Seq", strain="strain_ex", total_reads = 10000000, roi=roi, inputs=inputs, outputs=outputs)
        }
        print(aggregate_intra_strain_nucs(samples))
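
The pairwise merging logic described for this function can be sketched in Python as follows. This is a simplified illustration only: the score used here (`toy_score`, a squared scaled distance between centers) is a stand-in and not the log likelihood ratio computed by the package, and the rule "merge when the score is below the threshold" is an assumption about the direction of the test.

```python
def toy_score(center_a, center_b, sd=20.0):
    """Stand-in dissimilarity score; NOT the package's log likelihood ratio."""
    return ((center_a - center_b) / sd) ** 2


def aggregate_nucs(nucs_by_sample, score, thres):
    """Cluster nucleosome centers from several samples.

    nucs_by_sample maps a sample id to a list of nucleosome centers.
    Returns (clusters, scores); each cluster has a "wp" flag that is True
    when it holds exactly one nucleosome from every sample.
    """
    tagged = sorted(
        (center, sample_id)
        for sample_id, centers in nucs_by_sample.items()
        for center in centers
    )
    groups, scores = [], []
    for center, sample_id in tagged:
        if groups:
            previous_center = groups[-1][-1][0]
            s = score(previous_center, center)
            scores.append(s)
            if s < thres:  # assumed direction: low score means merge
                groups[-1].append((center, sample_id))
                continue
        groups.append([(center, sample_id)])
    all_samples = set(nucs_by_sample)
    clusters = []
    for members in groups:
        ids = [sample_id for _, sample_id in members]
        wp = len(ids) == len(all_samples) and set(ids) == all_samples
        clusters.append({"nucs": members, "wp": wp})
    return clusters, scores


nucs = {"s1": [1000, 1400], "s2": [1010, 1395], "s3": [995]}
clusters, scores = aggregate_nucs(nucs, toy_score, 20)
print([cluster["wp"] for cluster in clusters])
```

In this toy data set the first cluster gathers one nucleosome from each of the three samples and is flagged as well positioned, while the second one lacks a nucleosome from `s3`.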
     R: Aligns nucleosomes between 2 strains.
     Aligns nucleosomes between 2 strains.
     -------------------------------------
     Description
     ~~~~~~~~~~~
     This function aligns nucs between two strains for a given genome
     region.
     Usage
     ~~~~~
        align_inter_strain_nucs(replicates, wp_nucs_strain_ref1 = NULL,
            wp_nucs_strain_ref2 = NULL, corr_thres = 0.5, llr_thres = 100,
            config = NULL, ...)
     Arguments
     ~~~~~~~~~
     "replicates"
     Set of replicates, ideally 3 per strain.
     "wp_nucs_strain_ref1"
     List of aggregated nucleosomes for strain 1. If it is NULL, this list
     will be computed.
     "wp_nucs_strain_ref2"
     List of aggregated nucleosomes for strain 2. If it is NULL, this list
     will be computed.
     "corr_thres"
     Correlation threshold.
     "llr_thres"
     LOD cut off.
     "config"
     GLOBAL config variable
     "..."
     A list of parameters that will be passed to
     *aggregate_intra_strain_nucs* if needed.
     Value
     ~~~~~
     Returns a list of clusterized nucleosomes, and all computed llr
     scores.
     Author(s)
     ~~~~~~~~~
     Florent Chuffart
     Examples
     ~~~~~~~~
            # Define new translate_cur function...
            translate_cur = function(roi, strain2, big_cur=NULL, config=NULL) {
              return(roi)
            }
            # Bind it using the following lines.
            unlockBinding("translate_cur", as.environment("package:nucleominer"))
            unlockBinding("translate_cur", getNamespace("nucleominer"))
            assign("translate_cur", translate_cur, "package:nucleominer")
            assign("translate_cur", translate_cur, getNamespace("nucleominer"))
            lockBinding("translate_cur", getNamespace("nucleominer"))
            lockBinding("translate_cur", as.environment("package:nucleominer"))
        # Dealing with a region of interest
        roi =list(name="example", begin=1000,  end=1300, chr="1", genome=rep("A",301), strain_ref1 = "STRAINREF1")
        roi2 = translate_cur(roi, roi$strain_ref1)
        replicates = list()
        for (j in 1:2) {
            samples = list()
            for (i in 1:3) {
                # Create TF output
                tf_nuc = list("chr"=paste("chr", roi$chr, sep=""), "center"=(roi$end + roi$begin)/2, "width"= 150, "correlation.score"= 0.9)
                outputs = dfadd(NULL,tf_nuc)
                outputs = filter_tf_outputs(outputs, roi$chr, roi$begin, roi$end)
                # Generate corresponding reads
                nb_reads = round(runif(1,170,230))
                reads = round(rnorm(nb_reads, tf_nuc$center,20))
                u_reads = sort(unique(reads))
                strands = sample(c(rep("R",ceiling(length(u_reads)/2)),rep("F",floor(length(u_reads)/2))))
                counts = apply(t(u_reads), 2, function(r) { sum(reads == r)})
                shifts = apply(t(strands), 2, function(s) { if (s == "F") return(-tf_nuc$width/2) else return(tf_nuc$width/2)})
                u_reads = u_reads + shifts
                inputs = data.frame(list("V1" = rep(roi$chr, length(u_reads)),
                                         "V2" = u_reads,
                                         "V3" = strands,
                                         "V4" = counts), stringsAsFactors=FALSE)
                samples[[length(samples) + 1]] = list(id=1, marker="Mnase_Seq", strain=paste("strain_ex",j,sep=""), total_reads = 10000000, roi=roi, inputs=inputs, outputs=outputs)
            }
            replicates[[length(replicates) + 1]] = samples
        }
        print(align_inter_strain_nucs(replicates))
     R: Launch deseq methods.
     Launch deseq methods.
     ---------------------
     Description
     ~~~~~~~~~~~
     This function is based on the DESeq example. It normalizes the data,
     fits it to GLM models with and without an interaction term, and
     compares the two models.
     Usage
     ~~~~~
        analyse_design(snep_design, reads)
     Arguments
     ~~~~~~~~~
     "snep_design"
     The design to consider.
     "reads"
     The data to consider.
     Author(s)
     ~~~~~~~~~
     Florent Chuffart
     R: Stage replicates data
     Stage replicates data
     ---------------------
     Description
     ~~~~~~~~~~~
     This function loads in memory data corresponding to the given
     experiments.
     Usage
     ~~~~~
        build_replicates(expe, roi, only_fetch = FALSE, get_genome = FALSE,
            all_samples, config = NULL)
     Arguments
     ~~~~~~~~~
     "expe"
     A list of vectors, each vector corresponding to a set of replicates.
     "roi"
     The region that we are interested in.
     "only_fetch"
     If TRUE, only fetch the inputs without filtering them.
     "get_genome"
     Load or not corresponding genome.
     "all_samples"
     Global list of samples.
     "config"
     GLOBAL config variable.
     Author(s)
     ~~~~~~~~~
     Florent Chuffart
     Examples
     ~~~~~~~~
        # library(rjson)
        # library(nucleominer)
        #
        # # Read config file
        # json_conf_file = "nucleo_miner_config.json"
        # config = fromJSON(paste(readLines(json_conf_file), collapse=""))
        # # Read sample file
        # all_samples = get_content(config$CSV_SAMPLE_FILE, "cvs", sep=";", head=TRUE, stringsAsFactors=FALSE)
        # # here are the sample ids in a list
        # expes = list(c(1))
        # # here is the region for which we want to see the coverage
        # cur = list(chr="8", begin=472000, end=474000, strain_ref="BY")
        # # it displays the coverage
        # replicates = build_replicates(expes, cur, all_samples=all_samples, config=config)
        # out = watch_samples(replicates, config$READ_LENGTH,
        #       plot_coverage = TRUE,
        #       plot_squared_reads = FALSE,
        #       plot_ref_genome = FALSE,
        #       plot_arrow_raw_reads = FALSE,
        #       plot_arrow_nuc_reads = FALSE,
        #       plot_gaussian_reads = FALSE,
        #       plot_gaussian_unified_reads = FALSE,
        #       plot_ellipse_nucs = FALSE,
        #       plot_wp_nucs = FALSE,
        #       plot_wp_nuc_model = FALSE,
        #       plot_common_nucs = FALSE,
        #       height = 50)
     R: Extract a sub part of the corresponding c2c file
     Extract a sub part of the corresponding c2c file
     ------------------------------------------------
     Description
     ~~~~~~~~~~~
     This function gives access to a specific part of the c2c file.
     Usage
     ~~~~~
        c2c_extraction(strain1, strain2, chr = NULL, lower_bound = NULL,
            upper_bound = NULL, config = NULL)
     Arguments
     ~~~~~~~~~
     "strain1"
     the key strain
     "strain2"
     the target strain
     "chr"
     If defined, the c2c will be filtered according to the chromosome value.
     "lower_bound"
     If defined, the c2c will be filtered for the part of the genome above
     lower_bound.
     "upper_bound"
     If defined, the c2c will be filtered for the part of the genome below
     upper_bound.
     "config"
     GLOBAL config variable
     Author(s)
     ~~~~~~~~~
     Florent Chuffart
     R: reformat an "apply manipulated" list of regions
     reformat an "apply manipulated" list of regions
     -----------------------------------------------
     Description
     ~~~~~~~~~~~
     Utils to reformat an "apply manipulated" list of regions
     Usage
     ~~~~~
        collapse_regions(regions)
     Author(s)
     ~~~~~~~~~
     Florent Chuffart
     R: Compute Common Uninterrupted Regions (CUR)
     Compute Common Uninterrupted Regions (CUR)
     ------------------------------------------
     Description
     ~~~~~~~~~~~
     CURs are regions that can be aligned between the genomes
     Usage
     ~~~~~
        compute_inter_all_strain_curs(diff_allowed = 30, min_cur_width = 4000,
            config = NULL)
     Arguments
     ~~~~~~~~~
     "diff_allowed"
     The maximum indel width allowed in a CUR
     "min_cur_width"
     The minimum width of a CUR
     "config"
     GLOBAL config variable
     Author(s)
     ~~~~~~~~~
     Florent Chuffart
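
To make the roles of `diff_allowed` and `min_cur_width` concrete, here is a hypothetical Python sketch of how aligned blocks between two genomes could be chained into CURs. The block representation, the definition of the indel width as the difference between the gaps on the two genomes, and the helper name `compute_curs` are all assumptions; the actual R implementation works from the c2c files and is not reproduced here.

```python
def compute_curs(blocks, diff_allowed=30, min_cur_width=4000):
    """Chain aligned blocks into common uninterrupted regions.

    blocks is a list of (begin_a, end_a, begin_b, end_b) tuples sorted by
    position, describing segments of genome A aligned to genome B.
    """
    curs = []
    current = None
    for block in blocks:
        if current is not None:
            gap_a = block[0] - current[1]
            gap_b = block[2] - current[3]
            # Assumed indel width: difference between the two gaps.
            if abs(gap_a - gap_b) <= diff_allowed:
                current = (current[0], block[1], current[2], block[3])
                continue
            curs.append(current)
        current = block
    if current is not None:
        curs.append(current)
    # Keep only the regions that are wide enough on genome A.
    return [cur for cur in curs if cur[1] - cur[0] >= min_cur_width]


blocks = [
    (0, 3000, 0, 3000),
    (3010, 5000, 3020, 5010),   # indel of 10 bp: same CUR
    (5200, 6000, 5500, 6300),   # indel of 290 bp: starts a new region
]
print(compute_curs(blocks))
```

In this example the first two blocks are chained into one 5000 bp region, and the third block is discarded because it is narrower than `min_cur_width`.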
     R: Crop bound of regions according to region of interest bound
     Crop bound of regions according to region of interest bound
     -----------------------------------------------------------
     Description
     ~~~~~~~~~~~
     This function is no longer necessary since the "big_cur" bug was
     removed from the translate_cur function.
     Usage
     ~~~~~
        crop_fuzzy(tmp_fuzzy_nucs, roi, strain, config = NULL)
     Arguments
     ~~~~~~~~~
     "tmp_fuzzy_nucs"
     The regions to be cropped.
     "roi"
     The region of interest.
     "strain"
     The strain to consider.
     "config"
     GLOBAL config variable
     Author(s)
     ~~~~~~~~~
     Florent Chuffart
     R: Adding list to a dataframe.
     Adding list to a dataframe.
     ---------------------------
     Description
     ~~~~~~~~~~~
     Add a list *l* to a dataframe *df*. Create it if *df* is *NULL*.
     Return the dataframe *df*.
     Usage
     ~~~~~
        dfadd(df, l)
     Arguments
     ~~~~~~~~~
     "df"
     A dataframe
     "l"
     A list
     Value
     ~~~~~
     Return the dataframe *df*.
     Author(s)
     ~~~~~~~~~
     Florent Chuffart
     Examples
     ~~~~~~~~
        ## Here dataframe is NULL
        df = NULL
        print(df)
        # Initialize df
        df = dfadd(df, list(key1 = "value1", key2 = "value2"))
        print(df)
        # Adding elements to df
        df = dfadd(df, list(key1 = "value1'", key2 = "value2'"))
        print(df)
     R: Prefetch data
     Prefetch data
     -------------
     Description
     ~~~~~~~~~~~
     Fetch and filter inputs and outputs per region of interest. Organize
     them per replicate.
     Usage
     ~~~~~
        fetch_mnase_replicates(strain, roi, all_samples, config = NULL,
            only_fetch = FALSE, get_genome = FALSE, get_ouputs = TRUE)
     Arguments
     ~~~~~~~~~
     "strain"
     The strain for which we want MNase replicates. List of replicates;
     each replicate is a vector of sample ids.
     "roi"
     Region of interest.
     "all_samples"
     Global list of samples.
     "config"
     GLOBAL config variable
     "only_fetch"
     If TRUE, only fetch without filtering. It is used to load sample
     files into memory before forking.
     "get_genome"
     If TRUE, load corresponding genome sequence.
     "get_ouputs"
     If TRUE, also get the corresponding TF output files.
     Author(s)
     ~~~~~~~~~
     Florent Chuffart
     R: Filter TemplateFilter inputs
     Filter TemplateFilter inputs
     ----------------------------
     Description
     ~~~~~~~~~~~
     This function filters TemplateFilter inputs according to the
     properties of the observed genome area. It takes into account reads
     that lie at the border of this area and the strand of these reads.
     Usage
     ~~~~~
        filter_tf_inputs(inputs, chr, x_min, x_max, nuc_width = 160,
            only_f = FALSE, only_r = FALSE, filter_for_coverage = FALSE)
     Arguments
     ~~~~~~~~~
     "inputs"
     TF inputs to be filtered.
     "chr"
     Chromosome observed, here chr is an integer.
     "x_min"
     Coordinate of the first bp observed.
     "x_max"
     Coordinate of the last bp observed.
     "nuc_width"
     Nucleosome width.
     "only_f"
     Filter only F reads.
     "only_r"
     Filter only R reads.
     "filter_for_coverage"
     Does it filter for plot coverage?
     Value
     ~~~~~
     Returns filtered inputs.
     Author(s)
     ~~~~~~~~~
     Florent Chuffart
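
One way to picture the strand-aware handling of reads at the border of the observed area is sketched below in Python. It assumes that an F read marks the left edge of a nucleosome footprint and an R read its right edge, and that a read is kept when this footprint overlaps the window; this rule, the tuple layout and the function name are assumptions and may differ from what `filter_tf_inputs` actually does.

```python
def filter_reads(reads, chrom, x_min, x_max, nuc_width=160):
    """Keep reads whose implied nucleosome footprint overlaps [x_min, x_max].

    reads is a list of (chromosome, position, strand, count) tuples.
    """
    kept = []
    for read in reads:
        r_chrom, pos, strand, _count = read
        if r_chrom != chrom:
            continue
        if strand == "F":
            low, high = pos, pos + nuc_width   # footprint extends rightwards
        else:
            low, high = pos - nuc_width, pos   # footprint extends leftwards
        if high >= x_min and low <= x_max:
            kept.append(read)
    return kept


reads = [
    ("1", 900, "F", 2),    # starts before the window but reaches into it
    ("1", 900, "R", 1),    # footprint lies entirely before the window
    ("1", 1350, "R", 3),   # starts after the window but reaches back into it
    ("2", 1100, "F", 1),   # other chromosome
    ("1", 1500, "F", 1),   # footprint lies entirely after the window
]
print(filter_reads(reads, "1", 1000, 1300))
```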
     R: Filter TemplateFilter outputs
     Filter TemplateFilter outputs
     -----------------------------
     Description
     ~~~~~~~~~~~
     This function filters TemplateFilter outputs according not only to
     the properties of the observed genome area, but also to correlation
     and overlap thresholds.
     Usage
     ~~~~~
        filter_tf_outputs(tf_outputs, chr, x_min, x_max, nuc_width = 160,
            ol_bp = 59, corr_thres = 0.5)
     Arguments
     ~~~~~~~~~
     "tf_outputs"
     TemplateFilter outputs.
     "chr"
     Chromosome observed, here chr is an integer.
     "x_min"
     Coordinate of the first bp observed.
     "x_max"
     Coordinate of the last bp observed.
     "nuc_width"
     Nucleosome width.
     "ol_bp"
     Overlap Threshold.
     "corr_thres"
     Correlation threshold.
     Value
     ~~~~~
     Returns filtered TemplateFilter Outputs
     Author(s)
     ~~~~~~~~~
     Florent Chuffart
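
The three criteria named above (observed area, correlation threshold, overlap threshold) can be combined as in the following Python sketch. The greedy strategy that favours the best-correlated nucleosome when two overlap, the dictionary layout and the function name are assumptions made for illustration, not a description of the R code.

```python
def filter_nucs(nucs, x_min, x_max, ol_bp=59, corr_thres=0.5):
    """Filter nucleosome calls by region, correlation and overlap.

    nucs is a list of dicts with "center", "width" and "correlation" keys.
    """
    candidates = [
        nuc for nuc in nucs
        if nuc["correlation"] >= corr_thres and x_min <= nuc["center"] <= x_max
    ]
    kept = []
    # Assumed strategy: examine the best-correlated calls first.
    for nuc in sorted(candidates, key=lambda n: n["correlation"], reverse=True):
        start = nuc["center"] - nuc["width"] / 2
        end = nuc["center"] + nuc["width"] / 2
        too_close = False
        for other in kept:
            o_start = other["center"] - other["width"] / 2
            o_end = other["center"] + other["width"] / 2
            overlap = min(end, o_end) - max(start, o_start)
            if overlap > ol_bp:
                too_close = True
                break
        if not too_close:
            kept.append(nuc)
    return sorted(kept, key=lambda n: n["center"])


nucs = [
    {"center": 1100, "width": 150, "correlation": 0.9},
    {"center": 1150, "width": 150, "correlation": 0.7},   # overlaps the first by 100 bp
    {"center": 1250, "width": 150, "correlation": 0.6},
    {"center": 1200, "width": 150, "correlation": 0.4},   # below corr_thres
    {"center": 2000, "width": 150, "correlation": 0.95},  # outside the region
]
print([nuc["center"] for nuc in filter_nucs(nucs, 1000, 1300)])
```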
     R: to flat aggregate_intra_strain_nucs function output
     to flat aggregate_intra_strain_nucs function output
     ---------------------------------------------------
     Description
     ~~~~~~~~~~~
     This function builds a dataframe of all clusters obtained from the
     aggregate_intra_strain_nucs function.
     Usage
     ~~~~~
        flat_aggregated_intra_strain_nucs(partial_strain_maps, cur_index)
     Arguments
     ~~~~~~~~~
     "partial_strain_maps"
     the output of aggregate_intra_strain_nucs function
     "cur_index"
     the index of the roi involved
     Value
     ~~~~~
     Returns a dataframe of all clusters obtained from the
     aggregate_intra_strain_nucs function.
     Author(s)
     ~~~~~~~~~
     Florent Chuffart
     R: flat reads
     flat reads
     ----------
     Description
     ~~~~~~~~~~~
     Extract read coordinates from the TemplateFilter input sequence
     Usage
     ~~~~~
        flat_reads(reads, nuc_width)
     Arguments
     ~~~~~~~~~
     "reads"
     TemplateFilter input reads
     "nuc_width"
     Width used to shift F and R reads.
     Value
     ~~~~~
     Returns a list of F reads, R reads and joint/shifted F and R reads.
     Author(s)
     ~~~~~~~~~
     Florent Chuffart
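
The shifting of F and R reads can be illustrated with a short Python sketch. It mirrors the R examples in this document, where reads are generated by shifting a nucleosome center by minus half the width for F reads and plus half the width for R reads, so the inverse shift is applied here. Expanding each read according to its count, and the tuple layout, are assumptions.

```python
def flat_reads(reads, nuc_width):
    """Return F positions, R positions and the joint shifted positions.

    reads is a list of (chromosome, position, strand, count) tuples.
    """
    f_reads, r_reads = [], []
    for _chrom, pos, strand, count in reads:
        # Assumption: a read seen `count` times is repeated `count` times.
        target = f_reads if strand == "F" else r_reads
        target.extend([pos] * count)
    # Shift F reads rightwards and R reads leftwards by half a nucleosome,
    # so that both estimate the nucleosome center.
    joint = sorted(
        [pos + nuc_width / 2 for pos in f_reads]
        + [pos - nuc_width / 2 for pos in r_reads]
    )
    return f_reads, r_reads, joint


reads = [("1", 1075, "F", 2), ("1", 1225, "R", 1), ("1", 1080, "F", 1)]
f_reads, r_reads, joint = flat_reads(reads, 150)
print(joint)
```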
     R: Retrieve Reads
     Retrieve Reads
     --------------
     Description
     ~~~~~~~~~~~
     Retrieve reads for a given marker, combi and form.
     Usage
     ~~~~~
        get_all_reads(marker, combi, form = "wp", config = NULL)
     Arguments
     ~~~~~~~~~
     "marker"
     The marker to consider.
     "combi"

... This diff was truncated because it exceeds the maximum size that can be displayed.
