spike_in_calibration

Removes experimental bias by normalizing fragment counts based on the sequencing depth of a spike-in genome, and visualizes the results.

Arguments 

Required arguments:

-b, --bed: Input folder of filtered BED files for normalization.
-ss, --sam_spike_in: Input folder of SAM files exported from alignment to a spike-in genome.
-cs, --chromosome_sizes: Input file with sorted chromosome size information.

Optional arguments:

-tbl, --fragment_table: Alignment summary table. Input CSV file containing the following columns: [“Sample”, “Replication”, “SequencingDepth”, “MappedFragments”, “AlignmentRate”, “MappedFragments_SpikeIn”, “AlignmentRate_SpikeIn”], with the corresponding information for each sample. Default = “bowtie2_alignment_ref_and_spike_in.csv”, which is exported by this pipeline's bowtie2_alignment function.
-o, --out_dir: Output directory, default = current working directory.

The function assumes that the “bowtie2_alignment_ref_and_spike_in.csv” file is present in $out_dir/Epimapper/summary_tables. It is therefore important to use the same output directory (-o/--out_dir) as the one you used for the spike-in alignment, so that the function can find the table.

$ epimapper spike_in_calibration -b /Users/me/results/Epimapper/alignment/bed -ss /Users/me/results/Epimapper/alignment/sam_spike_in -cs /Users/me/in_folder/hg38_chromosome_sizes.txt -o /Users/me/results
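
Before running the command, it can help to confirm that the summary table sits where the function will look for it. The following is a minimal sketch, not part of EpiMapper: the helper name default_table_path is hypothetical, and the path layout is taken from the description above.

```python
from pathlib import Path

# File name of the table exported by bowtie2_alignment, as documented above.
TABLE_NAME = "bowtie2_alignment_ref_and_spike_in.csv"


def default_table_path(out_dir):
    """Return where the summary table is expected for a given -o/--out_dir value.

    Hypothetical helper for illustration; it only rebuilds the documented path.
    """
    return Path(out_dir) / "Epimapper" / "summary_tables" / TABLE_NAME


if __name__ == "__main__":
    table = default_table_path("/Users/me/results")
    status = "found" if table.is_file() else "missing, pass the path with -tbl instead"
    print(f"{table}: {status}")
```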

If you want a different output directory, you can pass the path to the table manually with -tbl:

$ epimapper spike_in_calibration -b /Users/me/results/Epimapper/alignment/bed -tbl /Users/me/results/Epimapper/summary_tables/bowtie2_alignment_ref_and_spike_in.csv -ss /Users/me/results/Epimapper/alignment/sam_spike_in -cs /Users/me/in_folder/hg38_chromosome_sizes.txt -o /Users/me/results

If you have not used this pipeline's bowtie2_alignment function to perform the reference genome and spike-in alignment, you must manually create a summary table containing the following columns: [“Sample”, “Replication”, “SequencingDepth”, “MappedFragments”, “AlignmentRate”, “MappedFragments_SpikeIn”, “AlignmentRate_SpikeIn”], with the corresponding information for each sample. It is therefore recommended to use the pipeline as a whole, which avoids this manual work.

$ epimapper spike_in_calibration -b /Users/me/results/Epimapper/alignment/bed -tbl /Users/me/results/my_table.csv -ss /Users/me/results/sam_spike_in -cs /Users/me/in_folder/hg38_chromosome_sizes.txt -o /Users/me/results
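
If the table does have to be built by hand, a short script keeps the column names consistent. This is only a sketch using Python's standard csv module: the column names come from the list above, while the helper write_summary_table, the sample name, and all numeric values are placeholders invented for illustration. The value formats EpiMapper expects (for example, whether alignment rates are percentages) are an assumption and should be checked against a table exported by bowtie2_alignment.

```python
import csv

# Column names exactly as required by spike_in_calibration.
COLUMNS = [
    "Sample", "Replication", "SequencingDepth", "MappedFragments",
    "AlignmentRate", "MappedFragments_SpikeIn", "AlignmentRate_SpikeIn",
]


def write_summary_table(rows, path):
    """Write one CSV row per sample; fail early if a required column is missing."""
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS)
        writer.writeheader()
        for row in rows:
            missing = [name for name in COLUMNS if name not in row]
            if missing:
                raise ValueError(f"row for {row.get('Sample')!r} lacks columns: {missing}")
            writer.writerow(row)


# Placeholder values for illustration only, not real measurements.
example_rows = [
    {
        "Sample": "sampleA", "Replication": "rep1",
        "SequencingDepth": 1000000, "MappedFragments": 950000,
        "AlignmentRate": 95.0, "MappedFragments_SpikeIn": 5000,
        "AlignmentRate_SpikeIn": 0.5,
    },
]

if __name__ == "__main__":
    write_summary_table(example_rows, "my_table.csv")
```

The resulting file can then be passed with -tbl as in the command above.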

Output 

Like all other functions in the EpiMapper Python package, this function creates a main Epimapper output directory if one is not already present in the chosen output directory. It also creates a “bedgraph” folder to store the spike-in normalized files, as well as a summary table and a PNG figure with boxplots of the spike-in scaling factors and normalized fragment counts.

Epimapper
|- alignment
|   |- bedgraph
|   |   |- "sample-name".fragments.normalized.bedgraph
|- summary_tables
|   |- spike_in_calibration_summary.csv
|   |- spike_in_calibration.png
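
To give a feel for the two quantities shown in the boxplots, here is a minimal sketch of one common spike-in calibration convention: the scaling factor is a constant divided by the number of fragments mapped to the spike-in genome, and the normalized count is the mapped fragment count multiplied by that factor. Both the formula and the constant of 10,000 are assumptions made for illustration; the source does not state the exact computation EpiMapper uses, and the function names are hypothetical.

```python
# Assumed arbitrary constant; the value EpiMapper actually uses may differ.
SCALE_CONSTANT = 10_000


def scaling_factor(spike_in_fragments, constant=SCALE_CONSTANT):
    """Scaling factor under the assumed convention: constant / spike-in fragments."""
    if spike_in_fragments <= 0:
        raise ValueError("spike-in fragment count must be positive")
    return constant / spike_in_fragments


def normalized_fragment_count(mapped_fragments, spike_in_fragments):
    """Mapped fragments rescaled by the sample's spike-in scaling factor."""
    return mapped_fragments * scaling_factor(spike_in_fragments)


if __name__ == "__main__":
    # Placeholder numbers, not real data.
    print(scaling_factor(5_000))
    print(normalized_fragment_count(1_000_000, 5_000))
```

Under this convention, a sample with fewer spike-in fragments receives a larger scaling factor, so its fragment counts are scaled up relative to samples with more spike-in fragments.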
