References
**********

Python Reference
================

configurator.CSV_SAMPLE_FILE = None
   Path to the CSV file that contains sample information.

configurator.BOWTIE_BUILD_BIN = None
   Path to the bowtie2 build binary.

configurator.BOWTIE2_BIN = None
   Path to the bowtie2 binary.

configurator.SAMTOOLS_BIN = None
   Path to the samtools binary.

configurator.BEDTOOLS_BIN = None
   Path to the bedtools binary.

configurator.TF_BIN = None
   Path to the TemplateFilter binary.

configurator.TF_TEMPLATES_FILE = None
   Path to the TemplateFilter templates file.

configurator.ILLUMINA_OUTPUTFILE_PREFIX = None
   Prefix of Illumina fastq output files.

configurator.INDEX_DIR = None
   Path to the index directory.

configurator.ALIGN_DIR = None
   Path to the alignment directory.

configurator.LOG_DIR = None
   Path to the log directory.

configurator.CACHE_DIR = None
   Path to the cache directory.

configurator.RESULTS_DIR = None
   Path to the results directory.

configurator.FASTA_REFERENCE_GENOME_FILES = None
   Dictionary mapping each reference strain to its fasta reference
   genome file.

configurator.AREA_BLACK_LIST = None
   Dictionary where keys are strains and values are the black-listed
   regions of the genome.

configurator.FASTA_INDEXES = None
   Dictionary keyed by strain. Each value is a dictionary whose keys
   are chromosome references from the Fastq file and whose values are
   the corresponding names for TemplateFilter.

configurator.C2C_FILES = None
   Dictionary mapping each strain combination to its genome alignment.

configurator.READ_LENGTH = None
   Length of Illumina reads.

configurator.MAPQ_THRES = None
   Alignment quality threshold.

configurator.TF_CORR = None
   TemplateFilter template correlation threshold.

configurator.TF_MINW = None
   TemplateFilter minimum width of a nucleosome.

configurator.TF_MAXW = None
   TemplateFilter maximum width of a nucleosome.

configurator.TF_OL = None
   TemplateFilter maximum allowed overlap for two nucleosomes.

wf.json_conf_file = 'src/current/nucleominer_config.json'
   Path to the json configuration file.
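As an illustration of how such a configuration file might be read, here is a minimal sketch. It assumes the JSON file holds a flat object keyed by the configurator names listed above. The "load_config" helper, the choice of required keys and the sample values are hypothetical and are not part of the package.

```python
import json

# Hypothetical subset of keys, taken from the configurator attributes listed above.
REQUIRED_KEYS = ("CSV_SAMPLE_FILE", "INDEX_DIR", "ALIGN_DIR", "READ_LENGTH", "MAPQ_THRES")


def load_config(text):
    """Parse a JSON configuration string and check that the required keys exist."""
    config = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError("missing configuration keys: " + ", ".join(missing))
    return config


# Example values are made up for illustration only.
example = json.dumps({
    "CSV_SAMPLE_FILE": "data/samples.csv",
    "INDEX_DIR": "data/index",
    "ALIGN_DIR": "data/align",
    "READ_LENGTH": 50,
    "MAPQ_THRES": 30,
})
config = load_config(example)
print(config["READ_LENGTH"])
```

Failing early on a missing key is a design choice of this sketch: it surfaces an incomplete configuration before any alignment step is launched.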

wf.samples = []
   List of samples, where a sample is identified by an id (key: *id*)
   and a strain name (key: *strain*).

wf.samples_mnase = []
   List of MNase samples.

wf.strains = []
   List of reference strains.

libcoverage.create_bowtie_index(strain, strain_fasta_ref, index_dir, bowtie_build_bin)
   Creates the bowtie index for a strain *strain*.

   Parameters:
      * **strain** -- the strain reference.
      * **strain_fasta_ref** -- fasta reference genome.
      * **index_dir** -- directory where the bowtie index is written.
      * **bowtie_build_bin** -- bowtie2 build binary.

libcoverage.align_reads(sample, align_dir, log_dir, index_dir, illumina_outputfile_prefix, bowtie2_bin, samtools_bin, bedtools_bin)
   Aligns reads to reference genomes. It produces .sam files, which
   are converted to .bam, which are in turn converted to .bed.

   Parameters:
      * **sample** -- a dict that describes a sample.
      * **align_dir** -- directory where aligned reads will be stored.
      * **log_dir** -- directory where logs will be stored.
      * **illumina_outputfile_prefix** -- prefix of Illumina sequencer
        fastq.gz output files.
      * **bowtie2_bin** -- bowtie2 binary.
      * **samtools_bin** -- samtools binary.
      * **bedtools_bin** -- bedtools binary.
      * **index_dir** -- bowtie index directory.

libcoverage.split_fr_4_TF(sample, align_dir, fasta_indexes, area_black_list, read_length, mapq_thres)
   Creates TemplateFilter input files from bed files. This function
   happens in two stages. First, it collects reads from bed files and
   feeds a data structure.

   Parameters:
      * **sample** -- a dict that describes a sample.
      * **align_dir** -- directory where aligned reads will be stored.
      * **fasta_indexes** -- the chr reference from the Illumina
        output file.
      * **area_black_list** -- the description of the genome areas
        that will be omitted.
      * **read_length** -- length of Illumina reads.
      * **mapq_thres** -- mapping quality criterion threshold, see
        MAPQ in BED/BAM file format.
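
The read selection described by the *mapq_thres* and *area_black_list* parameters can be sketched as follows. This is an illustration only, not the package's implementation: it assumes six-column BED records in which column 5 carries the mapping quality (as written by "bedtools bamtobed"), that reads are kept when their score is greater than or equal to the threshold, and that the black list maps a chromosome name to (begin, end) tuples. The helper names are hypothetical.

```python
def overlaps(start, end, regions):
    """Return True if the half-open interval [start, end) overlaps any (begin, end) region."""
    return any(start < r_end and r_start < end for r_start, r_end in regions)


def filter_bed_lines(lines, mapq_thres, black_list):
    """Keep BED records whose score (column 5) reaches mapq_thres and that avoid black-listed areas.

    black_list maps a chromosome name to a list of (begin, end) tuples (assumed layout).
    """
    kept = []
    for line in lines:
        chrom, start, end, _name, score, strand = line.rstrip("\n").split("\t")[:6]
        start, end = int(start), int(end)
        if int(score) < mapq_thres:
            continue  # mapping quality too low
        if overlaps(start, end, black_list.get(chrom, [])):
            continue  # read falls in a black-listed area
        kept.append((chrom, start, end, strand))
    return kept


bed = [
    "chrI\t100\t150\tread1\t42\t+",
    "chrI\t200\t250\tread2\t3\t-",   # low mapping quality
    "chrI\t900\t950\tread3\t40\t+",  # inside a black-listed area
]
reads = filter_bed_lines(bed, mapq_thres=30, black_list={"chrI": [(800, 1000)]})
print(reads)
```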

libcoverage.template_filter(sample, align_dir, log_dir, tf_bin, tf_templates_file, corr, minw, maxw, ol)
   Runs TemplateFilter on a specific sample. It produces a .tab file.

   Parameters:
      * **sample** -- a dict that describes a sample.
      * **align_dir** -- directory where aligned reads will be stored.
      * **log_dir** -- directory where logs will be stored.
      * **tf_bin** -- path to the TemplateFilter binary.
      * **tf_templates_file** -- path to the TemplateFilter templates
        file.
      * **corr** -- correlation threshold, passed to TemplateFilter.
      * **minw** -- minimum width of a nuc, passed to TemplateFilter.
      * **maxw** -- maximum width of a nuc, passed to TemplateFilter.
      * **ol** -- maximum overlap for 2 nucs, passed to
        TemplateFilter.


R Reference
===========

Arabic to Roman pair list.
--------------------------

Description
~~~~~~~~~~~

Util to convert Arabic to Roman

Usage
~~~~~

   ARAB2ROM()

Author(s)
~~~~~~~~~

Florent Chuffart

False Discovery Rate
--------------------

Description
~~~~~~~~~~~

From a vector x of independent p-values, extract the cutoff
corresponding to the specified FDR. See Benjamini & Hochberg 1995
paper

Usage
~~~~~

   FDR(x, FDR)

Arguments
~~~~~~~~~

"x"
   A vector x of independent p-values.

"FDR"
   The specified FDR.

Value
~~~~~

Returns the corresponding cutoff.

Author(s)
~~~~~~~~~

Gael Yvert, Florent Chuffart

Examples
~~~~~~~~

   print("example")

Roman to Arabic pair list.
--------------------------

Description
~~~~~~~~~~~

Util to convert Roman to Arabic

Usage
~~~~~

   ROM2ARAB()

Author(s)
~~~~~~~~~

Florent Chuffart

Aggregate replicated sample's nucleosomes.
------------------------------------------

Description
~~~~~~~~~~~

This function aggregates nucleosomes for replicated samples. It uses
the TemplateFilter output of each sample as a replicate. Each sample
owns a set of nucleosomes computed using TemplateFilter and ordered by
the position of their center.
Adjacent nucleosomes are compared two by two. The comparison is based
on a log likelihood ratio score. The outcome of the comparison is
either a merge or a separation of the adjacent nucleosomes. Finally
the function returns a list of clusters and all computed
*llr_scores*. Each cluster has an attribute *wp* for "well
positioned". This attribute is set to *TRUE* if the cluster is
composed of exactly one nucleosome from each sample.

Usage
~~~~~

   aggregate_intra_strain_nucs(samples, llr_thres = 20, coord_max = 2e+07)

Arguments
~~~~~~~~~

"samples"
   A list of samples. Each sample is a list like *sample =
   list(id=..., marker=..., strain=..., roi=..., inputs=...,
   outputs=...)* with *roi = list(name=..., begin=..., end=...,
   chr=..., genome=...)*.

"llr_thres"
   Log likelihood ratio threshold.

"coord_max"
   A value too large to be the coordinate of a nucleosome lower bound.

Value
~~~~~

Returns a list of clusterized nucleosomes, and all computed llr
scores.

Author(s)
~~~~~~~~~

Florent Chuffart

Examples
~~~~~~~~

   # Dealing with a region of interest
   roi = list(name="example", begin=1000, end=1300, chr="1", genome=rep("A",301))
   samples = list()
   for (i in 1:3) {
     # Create TF output
     tf_nuc = list("chr"=paste("chr", roi$chr, sep=""), "center"=(roi$end + roi$begin)/2, "width"= 150, "correlation.score"= 0.9)
     outputs = dfadd(NULL,tf_nuc)
     outputs = filter_tf_outputs(outputs, roi$chr, roi$begin, roi$end)
     # Generate corresponding reads
     nb_reads = round(runif(1,170,230))
     reads = round(rnorm(nb_reads, tf_nuc$center,20))
     u_reads = sort(unique(reads))
     strands = sample(c(rep("R",ceiling(length(u_reads)/2)),rep("F",floor(length(u_reads)/2))))
     counts = apply(t(u_reads), 2, function(r) { sum(reads == r)})
     shifts = apply(t(strands), 2, function(s) { if (s == "F") return(-tf_nuc$width/2) else return(tf_nuc$width/2)})
     u_reads = u_reads + shifts
     inputs = data.frame(list("V1" = rep(roi$chr, length(u_reads)),
                              "V2" = u_reads,
                              "V3" = strands,
                              "V4" = counts), stringsAsFactors=FALSE)
     samples[[length(samples) + 1]] = list(id=1,
       marker="Mnase_Seq", strain="strain_ex", total_reads = 10000000, roi=roi, inputs=inputs, outputs=outputs)
   }
   print(aggregate_intra_strain_nucs(samples))

Aligns nucleosomes between 2 strains.
-------------------------------------

Description
~~~~~~~~~~~

This function aligns nucs between two strains for a given genome
region.

Usage
~~~~~

   align_inter_strain_nucs(replicates, wp_nucs_strain_ref1 = NULL,
       wp_nucs_strain_ref2 = NULL, corr_thres = 0.5, llr_thres = 100,
       config = NULL, ...)

Arguments
~~~~~~~~~

"replicates"
   Set of replicates, ideally 3 per strain.

"wp_nucs_strain_ref1"
   List of aggregated nucleosomes for strain 1. If it is NULL, this
   list will be computed.

"wp_nucs_strain_ref2"
   List of aggregated nucleosomes for strain 2. If it is NULL, this
   list will be computed.

"corr_thres"
   Correlation threshold.

"llr_thres"
   LOD cut off.

"config"
   GLOBAL config variable

"..."
   A list of parameters that will be passed to
   *aggregate_intra_strain_nucs* if needed.

Value
~~~~~

Returns a list of clusterized nucleosomes, and all computed llr
scores.

Author(s)
~~~~~~~~~

Florent Chuffart

Examples
~~~~~~~~

   # Define new translate_cur function...
   translate_cur = function(roi, strain2, big_cur=NULL, config=NULL) {
     return(roi)
   }
   # Bind it using the following lines.
   unlockBinding("translate_cur", as.environment("package:nucleominer"))
   unlockBinding("translate_cur", getNamespace("nucleominer"))
   assign("translate_cur", translate_cur, "package:nucleominer")
   assign("translate_cur", translate_cur, getNamespace("nucleominer"))
   lockBinding("translate_cur", getNamespace("nucleominer"))
   lockBinding("translate_cur", as.environment("package:nucleominer"))

   # Dealing with a region of interest
   roi = list(name="example", begin=1000, end=1300, chr="1", genome=rep("A",301), strain_ref1 = "STRAINREF1")
   roi2 = translate_cur(roi, roi$strain_ref1)
   replicates = list()
   for (j in 1:2) {
     samples = list()
     for (i in 1:3) {
       # Create TF output
       tf_nuc = list("chr"=paste("chr", roi$chr, sep=""), "center"=(roi$end + roi$begin)/2, "width"= 150, "correlation.score"= 0.9)
       outputs = dfadd(NULL,tf_nuc)
       outputs = filter_tf_outputs(outputs, roi$chr, roi$begin, roi$end)
       # Generate corresponding reads
       nb_reads = round(runif(1,170,230))
       reads = round(rnorm(nb_reads, tf_nuc$center,20))
       u_reads = sort(unique(reads))
       strands = sample(c(rep("R",ceiling(length(u_reads)/2)),rep("F",floor(length(u_reads)/2))))
       counts = apply(t(u_reads), 2, function(r) { sum(reads == r)})
       shifts = apply(t(strands), 2, function(s) { if (s == "F") return(-tf_nuc$width/2) else return(tf_nuc$width/2)})
       u_reads = u_reads + shifts
       inputs = data.frame(list("V1" = rep(roi$chr, length(u_reads)),
                                "V2" = u_reads,
                                "V3" = strands,
                                "V4" = counts), stringsAsFactors=FALSE)
       samples[[length(samples) + 1]] = list(id=1, marker="Mnase_Seq", strain=paste("strain_ex",j,sep=""), total_reads = 10000000, roi=roi, inputs=inputs, outputs=outputs)
     }
     replicates[[length(replicates) + 1]] = samples
   }
   print(align_inter_strain_nucs(replicates))

Launch deseq methods.
---------------------

Description
~~~~~~~~~~~

This function is based on the DESeq example. It normalizes the data,
fits it to GLM models with and without an interaction term, and
compares the two models.

Usage
~~~~~

   analyse_design(snep_design, reads)

Arguments
~~~~~~~~~

"snep_design"
   The design to consider.

"reads"
   The data to consider.

Author(s)
~~~~~~~~~

Florent Chuffart

Stage replicates data
---------------------

Description
~~~~~~~~~~~

This function loads into memory the data corresponding to the given
experiments.

Usage
~~~~~

   build_replicates(expe, roi, only_fetch = FALSE, get_genome = FALSE,
       all_samples, config = NULL)

Arguments
~~~~~~~~~

"expe"
   A list of vectors, each vector holding the sample ids of a set of
   replicates.

"roi"
   The region that we are interested in.

"only_fetch"
   If TRUE, inputs are only fetched, not filtered.

"get_genome"
   Load or not the corresponding genome.

"all_samples"
   Global list of samples.

"config"
   GLOBAL config variable.

Author(s)
~~~~~~~~~

Florent Chuffart

Examples
~~~~~~~~

   # library(rjson)
   # library(nucleominer)
   #
   # # Read config file
   # json_conf_file = "nucleo_miner_config.json"
   # config = fromJSON(paste(readLines(json_conf_file), collapse=""))
   # # Read sample file
   # all_samples = get_content(config$CSV_SAMPLE_FILE, "cvs", sep=";", head=TRUE, stringsAsFactors=FALSE)
   # # here are the sample ids in a list
   # expes = list(c(1))
   # # here is the region for which we want to see the coverage
   # cur = list(chr="8", begin=472000, end=474000, strain_ref="BY")
   # # it displays the coverage
   # replicates = build_replicates(expes, cur, all_samples=all_samples, config=config)
   # out = watch_samples(replicates, config$READ_LENGTH,
   #       plot_coverage = TRUE,
   #       plot_squared_reads = FALSE,
   #       plot_ref_genome = FALSE,
   #       plot_arrow_raw_reads = FALSE,
   #       plot_arrow_nuc_reads = FALSE,
   #       plot_gaussian_reads = FALSE,
   #       plot_gaussian_unified_reads = FALSE,
   #       plot_ellipse_nucs = FALSE,
   #       plot_wp_nucs = FALSE,
   #       plot_wp_nuc_model = FALSE,
   #       plot_common_nucs = FALSE,
   #       height = 50)

Extract a sub part of the corresponding c2c file
------------------------------------------------

Description
~~~~~~~~~~~

This function gives access to a specific part of
the c2c file.

Usage
~~~~~

   c2c_extraction(strain1, strain2, chr = NULL, lower_bound = NULL,
       upper_bound = NULL, config = NULL)

Arguments
~~~~~~~~~

"strain1"
   The key strain.

"strain2"
   The target strain.

"chr"
   If defined, the c2c is filtered according to the chromosome value.

"lower_bound"
   If defined, the c2c is filtered to keep the part of the genome
   above lower_bound.

"upper_bound"
   If defined, the c2c is filtered to keep the part of the genome
   below upper_bound.

"config"
   GLOBAL config variable

Author(s)
~~~~~~~~~

Florent Chuffart

reformat an "apply manipulated" list of regions
-----------------------------------------------

Description
~~~~~~~~~~~

Utils to reformat an "apply manipulated" list of regions

Usage
~~~~~

   collapse_regions(regions)

Arguments
~~~~~~~~~

"regions"
   The "apply manipulated" list of regions to reformat.

Author(s)
~~~~~~~~~

Florent Chuffart

Compute Common Uninterrupted Regions (CUR)
------------------------------------------

Description
~~~~~~~~~~~

CURs are regions that can be aligned between the genomes

Usage
~~~~~

   compute_inter_all_strain_curs(diff_allowed = 30, min_cur_width = 4000,
       config = NULL)

Arguments
~~~~~~~~~

"diff_allowed"
   The maximum indel width allowed in a CUR.

"min_cur_width"
   The minimum width of a CUR.

"config"
   GLOBAL config variable

Author(s)
~~~~~~~~~

Florent Chuffart

Crop bound of regions according to region of interest bound
-----------------------------------------------------------

Description
~~~~~~~~~~~

This function is no longer necessary since the "big_cur" bug in the
translate_cur function was removed.

Usage
~~~~~

   crop_fuzzy(tmp_fuzzy_nucs, roi, strain, config = NULL)

Arguments
~~~~~~~~~

"tmp_fuzzy_nucs"
   The regions to be cropped.

"roi"
   The region of interest.

"strain"
   The strain to consider.

"config"
   GLOBAL config variable

Author(s)
~~~~~~~~~

Florent Chuffart

Adding list to a dataframe.
---------------------------

Description
~~~~~~~~~~~

Add a list *l* to a dataframe *df*. Create it if *df* is *NULL*.
Return the dataframe *df*.

Usage
~~~~~

   dfadd(df, l)

Arguments
~~~~~~~~~

"df"
   A dataframe

"l"
   A list

Value
~~~~~

Return the dataframe *df*.

Author(s)
~~~~~~~~~

Florent Chuffart

Examples
~~~~~~~~

   ## Here dataframe is NULL
   df = NULL
   print(df)
   # Initialize df
   df = dfadd(df, list(key1 = "value1", key2 = "value2"))
   print(df)
   # Adding elements to df
   df = dfadd(df, list(key1 = "value1'", key2 = "value2'"))
   print(df)

Prefetch data
-------------

Description
~~~~~~~~~~~

Fetch and filter inputs and outputs per region of interest. Organize
them per replicate.

Usage
~~~~~

   fetch_mnase_replicates(strain, roi, all_samples, config = NULL,
       only_fetch = FALSE, get_genome = FALSE, get_ouputs = TRUE)

Arguments
~~~~~~~~~

"strain"
   The strain for which we want MNase replicates. List of replicates.
   Each replicate is a vector of sample ids.

"roi"
   Region of interest.

"all_samples"
   Global list of samples.

"config"
   GLOBAL config variable

"only_fetch"
   If TRUE, only fetch, without filtering. It is used to load sample
   files into memory before forking.

"get_genome"
   If TRUE, load the corresponding genome sequence.

"get_ouputs"
   If TRUE, also get the corresponding TF output files.

Author(s)
~~~~~~~~~

Florent Chuffart

Filter TemplateFilter inputs
----------------------------

Description
~~~~~~~~~~~

This function filters TemplateFilter inputs according to the
properties of the observed genome area. It takes into account reads
that are at the frontier of this area and the strand of these reads.

Usage
~~~~~

   filter_tf_inputs(inputs, chr, x_min, x_max, nuc_width = 160,
       only_f = FALSE, only_r = FALSE, filter_for_coverage = FALSE)

Arguments
~~~~~~~~~

"inputs"
   TF inputs to be filtered.

"chr"
   Chromosome observed, here chr is an integer.

"x_min"
   Coordinate of the first bp observed.

"x_max"
   Coordinate of the last bp observed.
"nuc_width" Nucleosome width. "only_f" Filter only F reads. "only_r" Filter only R reads. "filter_for_coverage" Does it filter for plot coverage? Value ~~~~~ Returns filtred inputs. Author(s) ~~~~~~~~~ Florent Chuffart R: Filter TemplateFilter outputs Filter TemplateFilter outputs ----------------------------- Description ~~~~~~~~~~~ This function filters TemplateFilter outputs according, not only genome area observerved properties, but also correlation and overlapping threshold. Usage ~~~~~ filter_tf_outputs(tf_outputs, chr, x_min, x_max, nuc_width = 160, ol_bp = 59, corr_thres = 0.5) Arguments ~~~~~~~~~ "tf_outputs" TemplateFilter outputs. "chr" Chromosome observed, here chr is an integer. "x_min" Coordinate of the first bp observed. "x_max" Coordinate of the last bp observed. "nuc_width" Nucleosome width. "ol_bp" Overlap Threshold. "corr_thres" Correlation threshold. Value ~~~~~ Returns filtered TemplateFilter Outputs Author(s) ~~~~~~~~~ Florent Chuffart R: to flat aggregate_intra_strain_nucs function output to flat aggregate_intra_strain_nucs function output --------------------------------------------------- Description ~~~~~~~~~~~ This function builds a dataframe of all clusters obtain from aggregate_intra_strain_nucs function. Usage ~~~~~ flat_aggregated_intra_strain_nucs(partial_strain_maps, cur_index) Arguments ~~~~~~~~~ "partial_strain_maps" the output of aggregate_intra_strain_nucs function "cur_index" the index of the roi involved Value ~~~~~ Returns a dataframe of all clusters obtain from aggregate_intra_strain_nucs function. Author(s) ~~~~~~~~~ Florent Chuffart R: flat reads flat reads ---------- Description ~~~~~~~~~~~ Extract reads coordinates from TempleteFilter input sequence Usage ~~~~~ flat_reads(reads, nuc_width) Arguments ~~~~~~~~~ "reads" TemplateFilter input reads "nuc_width" Width used to shift F and R reads. Value ~~~~~ Returns a list of F reads, R reads and joint/shifted F and R reads. 

Author(s)
~~~~~~~~~

Florent Chuffart

Retrieve Reads
--------------

Description
~~~~~~~~~~~

Retrieve reads for a given marker, combi, form.

Usage
~~~~~

   get_all_reads(marker, combi, form = "wp", config = NULL)

Arguments
~~~~~~~~~

"marker"
   The marker to consider.

"combi"
   The strain combination to consider.

"form"
   The nuc form to consider.

"config"
   GLOBAL config variable

Author(s)
~~~~~~~~~

Florent Chuffart

get comp strand
---------------

Description
~~~~~~~~~~~

Compute the complementary strand.

Usage
~~~~~

   get_comp_strand(strand)

Arguments
~~~~~~~~~

"strand"
   The original strand.

Value
~~~~~

Returns the complementary strand.

Author(s)
~~~~~~~~~

Florent Chuffart

Build the design for deseq
--------------------------

Description
~~~~~~~~~~~

This function builds the design according to sample properties.

Usage
~~~~~

   get_design(marker, combi, all_samples)

Arguments
~~~~~~~~~

"marker"
   The marker to consider.

"combi"
   The strain combination to consider.

"all_samples"
   Global list of samples.

Author(s)
~~~~~~~~~

Florent Chuffart

Compute the fuzzy list for a given strain.
------------------------------------------

Description
~~~~~~~~~~~

This function grabs the nucleosomes detected by template_filter that
have been rejected by aggregate_intra_strain_nucs as well positioned.

Usage
~~~~~

   get_intra_strain_fuzzy(wp_map, roi, strain, config = NULL)

Arguments
~~~~~~~~~

"wp_map"
   Well positioned nucleosomes map.

"roi"
   The region of interest.

"strain"
   The strain for which we want to extract the fuzzy map.

"config"
   GLOBAL config variable.

Author(s)
~~~~~~~~~

Florent Chuffart

Compute the list of SNEPs for a given set of marker, strain combination and nuc form.
-------------------------------------------------------------------------------------

Description
~~~~~~~~~~~

This function uses

Usage
~~~~~

   get_sneps(marker, combi, form, all_samples, config = NULL)

Arguments
~~~~~~~~~

"marker"
   The marker involved.

"combi"
   The strain combination involved.

"form"
   The nuc form involved.

"all_samples"
   Global list of samples.

"config"
   GLOBAL config variable

Author(s)
~~~~~~~~~

Florent Chuffart

Examples
~~~~~~~~

   marker = "H3K4me1"
   combi = c("BY", "YJM")
   form = "wpunr" # "wp" | "unr" | "wpunr"
   # foo = get_sneps(marker, combi, form)
   # foo = get_sneps("H4K12ac", c("BY", "RM"), "wp")

Compute the unaligned nucleosomal regions (UNRs).
-------------------------------------------------

Description
~~~~~~~~~~~

This function aggregates non-common wp nucs for each strain and
subtracts common wp nucs. It does not take the size of the resulting
UNR into account. The size is taken into account in the read-counting
part of the pipeline.

Usage
~~~~~

   get_unrs(combi, roi, cur_index, wp_maps, fuzzy_maps,
       common_nuc_results, config = NULL)

Arguments
~~~~~~~~~

"combi"
   The strain combination to consider.

"roi"
   The region of interest.

"cur_index"
   The region of interest index.

"wp_maps"
   Well positioned nucleosomes maps.

"fuzzy_maps"
   Fuzzy nucleosomes maps.

"common_nuc_results"
   Common wp nuc maps

"config"
   GLOBAL config variable

Author(s)
~~~~~~~~~

Florent Chuffart

Returns the intersection of 2 list on regions.
----------------------------------------------

Description
~~~~~~~~~~~

This function...

Usage
~~~~~

   intersect_region(region1, region2)

Arguments
~~~~~~~~~

"region1"
   Original regions.

"region2"
   Regions to intersect.

Author(s)
~~~~~~~~~

Florent Chuffart

Likelihood ratio
----------------

Description
~~~~~~~~~~~

Compute the log likelihood ratio of two or more sets of values.
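
To make the idea of a log likelihood ratio between sets of values concrete, here is a Python sketch of one common definition. It compares "one Gaussian per vector" against "one Gaussian for all values pooled", using maximum likelihood estimates. Whether llr_score_nvecs uses exactly this formula is an assumption; the function names below are hypothetical.

```python
import math
from statistics import fmean, pvariance


def gaussian_loglik(xs, mean, var):
    """Log-likelihood of xs under a normal distribution with the given mean and variance."""
    n = len(xs)
    return -0.5 * n * math.log(2 * math.pi * var) - sum((x - mean) ** 2 for x in xs) / (2 * var)


def llr_score(vectors):
    """Log likelihood ratio of 'one Gaussian per vector' against 'one Gaussian for all values'."""
    pooled = [x for xs in vectors for x in xs]
    # Each vector is fitted with its own mean and (population) variance.
    separate = sum(gaussian_loglik(xs, fmean(xs), pvariance(xs)) for xs in vectors)
    # All values are fitted together with a single mean and variance.
    joint = gaussian_loglik(pooled, fmean(pooled), pvariance(pooled))
    return separate - joint


same = llr_score([[1, 2, 3, 4, 5], [1, 2, 3, 4, 5]])
apart = llr_score([[1, 2, 3, 4, 5], [101, 102, 103, 104, 105]])
print(same, apart)
```

With this definition, identical vectors give a score close to zero, and the score grows as the vectors move apart, which is the behaviour a merge-or-separate threshold relies on.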

Usage
~~~~~

   llr_score_nvecs(xs)

Arguments
~~~~~~~~~

"xs"
   List of vectors.

Value
~~~~~

Returns the log likelihood ratio.

Author(s)
~~~~~~~~~

Florent Chuffart

Examples
~~~~~~~~

   # LOD score for 2 sets of values
   mean1=5; sd1=2; card1 = 200
   mean2=6; sd2=3; card2 = 250
   x1 = rnorm(card1, mean1, sd1)
   x2 = rnorm(card2, mean2, sd2)
   min = floor(min(c(x1,x2)))
   max = ceiling(max(c(x1,x2)))
   hist(c(x1,x2), xlim=c(min, max), breaks=min:max)
   lines(min:max,dnorm(min:max,mean1,sd1)*card1,col=2)
   lines(min:max,dnorm(min:max,mean2,sd2)*card2,col=3)
   lines(min:max,dnorm(min:max,mean(c(x1,x2)),sd(c(x1,x2)))*card2,col=4)
   llr_score_nvecs(list(x1,x2))

nm
--

Description
~~~~~~~~~~~

It provides a set of useful functions for performing quantitative
analysis of the nucleosomal epigenome.

Details
~~~~~~~

+-----------------+-----------------------------------------------------+
| Package:        | nucleominer                                         |
+-----------------+-----------------------------------------------------+
| Maintainer:     | Florent Chuffart                                    |
+-----------------+-----------------------------------------------------+
| Author:         | Florent Chuffart                                    |
+-----------------+-----------------------------------------------------+
| Version:        | 2.3.45                                              |
+-----------------+-----------------------------------------------------+
| License:        | CeCILL                                              |
+-----------------+-----------------------------------------------------+
| Title:          | nm                                                  |
+-----------------+-----------------------------------------------------+
| Depends:        | seqinr, plotrix, DESeq, cachecache                  |
+-----------------+-----------------------------------------------------+

Author(s)
~~~~~~~~~

Florent Chuffart

Plot the distribution of reads.
-------------------------------

Description
~~~~~~~~~~~

This function uses the DESeq normalization feature to compare the
distributions qualitatively.

Usage
~~~~~

   plot_dist_samples(strain, marker, res, all_samples, NEWPLOT = TRUE)

Arguments
~~~~~~~~~

"strain"
   The strain to consider.
"marker" The marker to considere. "res" Data "all_samples" Global list of samples. "NEWPLOT" If FALSE the curve will be add to the current plot. Author(s) ~~~~~~~~~ Florent Chuffart R: sign from strand sign from strand ---------------- Description ~~~~~~~~~~~ Get the sign of strand Usage ~~~~~ sign_from_strand(strands) Arguments ~~~~~~~~~ +-----------------+------+ +-----------------+------+ Value ~~~~~ If strand in forward then returns 1 else returns -1 Author(s) ~~~~~~~~~ Florent Chuffart R: Substract to a list of regions an other list of regions that... Substract to a list of regions an other list of regions that intersect it. -------------------------------------------------------------------------- Description ~~~~~~~~~~~ This fucntion embed a recursive part. It occurs when a substracted region split an original region on two. Usage ~~~~~ substract_region(region1, region2) Arguments ~~~~~~~~~ "region1" Original regions. "region2" Regions to substract. Author(s) ~~~~~~~~~ Florent Chuffart R: Switch a pairlist Switch a pairlist ----------------- Description ~~~~~~~~~~~ Take a pairlist key:value and return the switched pairlist value:key. Usage ~~~~~ switch_pairlist(l) Arguments ~~~~~~~~~ "l" The pairlist to switch. Value ~~~~~ The switched pairlist. Author(s) ~~~~~~~~~ Florent Chuffart Examples ~~~~~~~~ l = list(key1 = "value1", key2 = "value2") print(switch_pairlist(l)) R: Translate coords of a genome region. Translate coords of a genome region. ------------------------------------ Description ~~~~~~~~~~~ This function is used in the examples, usualy you have to define your own translation function and overwrite this one using *unlockBinding* features. Please, refer to the example. Usage ~~~~~ translate_cur(roi, strain2, config = NULL, big_cur = NULL) Arguments ~~~~~~~~~ "roi" Original genome region of interest. "strain2" The strain in wich you want the genome region of interest. 
"config" GLOBAL config variable "big_cur" A largest region than roi use to filter c2c if it is needed. Author(s) ~~~~~~~~~ Florent Chuffart Examples ~~~~~~~~ # Define new translate_cur function... translate_cur = function(roi, strain2, config) { strain1 = roi$strain_ref if (strain1 == strain2) { return(roi) } else { stop("Here is my new translate_cur function...") } } # Binding it by uncomment follwing lines. # unlockBinding("translate_cur", as.environment("package:nm")) # unlockBinding("translate_cur", getNamespace("nm")) # assign("translate_cur", translate_cur, "package:nm") # assign("translate_cur", translate_cur, getNamespace("nm")) # lockBinding("translate_cur", getNamespace("nm")) # lockBinding("translate_cur", as.environment("package:nm")) R: Translate a list of regions from a strain ref to another. Translate a list of regions from a strain ref to another. --------------------------------------------------------- Description ~~~~~~~~~~~ This function is an eloborated call to translate_cur. Usage ~~~~~ translate_regions(regions, combi, cur_index, config = NULL, roi) Arguments ~~~~~~~~~ "regions" Regions to be translated. "combi" Combination of strains. "cur_index" The region of interest index. "config" GLOBAL config variable "roi" The region of interest. Author(s) ~~~~~~~~~ Florent Chuffart R: Aggregate regions that intersect themnselves. Aggregate regions that intersect themnselves. --------------------------------------------- Description ~~~~~~~~~~~ This function is based on sort of lower bounds to detect regions that intersect. We compare lower bound and upper bound of the porevious item. This function embed a while loop and break break regions list become stable. 

Usage
~~~~~

   union_regions(regions)

Arguments
~~~~~~~~~

"regions"
   The regions to be aggregated.

Author(s)
~~~~~~~~~

Florent Chuffart

Watching analysis of samples
----------------------------

Description
~~~~~~~~~~~

This function displays the analysis for a particular region of the
genome.

Usage
~~~~~

   watch_samples(replicates, read_length, plot_ref_genome = TRUE,
       plot_arrow_raw_reads = TRUE, plot_arrow_nuc_reads = TRUE,
       plot_squared_reads = TRUE, plot_coverage = FALSE,
       plot_gaussian_reads = TRUE, plot_gaussian_unified_reads = TRUE,
       plot_ellipse_nucs = TRUE, change_col = TRUE, plot_wp_nucs = TRUE,
       plot_fuzzy_nucs = TRUE, plot_wp_nuc_model = TRUE,
       plot_common_nucs = FALSE, plot_common_unrs = FALSE,
       plot_wp_nucs_4_nonmnase = FALSE, plot_chain = FALSE,
       plot_sample_id = FALSE, aggregated_intra_strain_nucs = NULL,
       aligned_inter_strain_nucs = NULL, height = 10, main = NULL,
       xlab = NULL, ylab = "#reads (per million reads)", config = NULL)

Arguments
~~~~~~~~~

"replicates"
   replicates under the form...

"read_length"
   Length of the reads.

"plot_ref_genome"
   Plot (or not) reference genome.

"plot_arrow_raw_reads"
   Plot (or not) arrows for raw reads.

"plot_arrow_nuc_reads"
   Plot (or not) arrows for reads associated with a nucleosome.

"plot_squared_reads"
   Plot (or not) reads in the square fashion.

"plot_coverage"
   Plot (or not) reads in the coverage fashion.

"plot_gaussian_reads"
   Plot (or not) gaussian model of F and R reads.

"plot_gaussian_unified_reads"
   Plot (or not) gaussian model of a nuc.

"plot_ellipse_nucs"
   Plot (or not) ellipse for a nuc.

"change_col"
   Change the color of each nucleosome.

"plot_wp_nucs"
   Plot (or not) cluster of nucs

"plot_fuzzy_nucs"
   Plot (or not) cluster of fuzzy

"plot_wp_nuc_model"
   Plot (or not) gaussian model for a cluster of nucs

"plot_common_nucs"
   Plot (or not) aligned reads.

"plot_common_unrs"
   Plot (or not) unaligned nucleosomal regions (UNRs).

"plot_wp_nucs_4_nonmnase"
   Plot (or not) clusters for non inputs samples.
"plot_chain" Plot (or not) clusterised nuceosomes between mnase samples. "plot_sample_id" Plot (or not) the sample id for each sample. "aggregated_intra_strain_nucs" list of aggregated intra strain nucs. If NULL, it will be computed. "aligned_inter_strain_nucs" list of aligned inter strain nucs. If NULL, it will be computed. "height" Number of reads in per million read for each sample, graphical parametre for the y axis. "main" main title of the produced plot "xlab" xlab of the produced plot "ylab" ylab of the produced plot "config" GLOBAL config variable Author(s) ~~~~~~~~~ Florent Chuffart