spike_in_calibration
Removes experimental bias by normalizing fragment counts based on sequencing depth to a spike-in genome, and visualizes the results.
Arguments
Required arguments:
-b, --bed: Input folder of filtered BED files for normalization.
-ss, --sam_spike_in: Input folder of SAM files exported from alignment to a spike-in genome.
-cs, --chromosome_sizes: Input file of sorted chromosome sizes information.
Optional arguments:
-tbl, --fragment_table: Alignment summary table. Input CSV file containing the columns [“Sample”, “Replication”, “SequencingDepth”, “MappedFragments”, “AlignmentRate”, “MappedFragments_SpikeIn”, “AlignmentRate_SpikeIn”] with the corresponding information for each sample. Default = “bowtie2_alignment_ref_and_spike_in.csv”, exported by this pipeline's bowtie2_alignment function.
-o, --out_dir: Output directory, default = current working directory.
Example usage
The function assumes that the “bowtie2_alignment_ref_and_spike_in.csv” file is present in $out_dir/Epimapper/summary_tables. It is therefore important to use the same output directory (-o/--out_dir) as the one used for the spike-in alignment, so that the function can find the table.
$ epimapper spike_in_calibration -b /Users/me/results/Epimapper/alignment/bed -ss /Users/me/results/Epimapper/alignment/sam_spike_in -cs /Users/me/in_folder/hg38_chromosome_sizes.txt -o /Users/me/results
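Before running the command above, it can help to confirm that the summary table sits where the function expects it. The following is a minimal sketch, not part of EpiMapper: the helper names default_table_path and table_is_present are hypothetical, and the path layout is taken from the description above.

```python
from pathlib import Path

# File name of the default summary table, as stated in this documentation.
TABLE_NAME = "bowtie2_alignment_ref_and_spike_in.csv"


def default_table_path(out_dir):
    """Return the location where the table is expected for a given -o/--out_dir.

    Hypothetical helper: it only mirrors the layout
    $out_dir/Epimapper/summary_tables described in the text.
    """
    return Path(out_dir) / "Epimapper" / "summary_tables" / TABLE_NAME


def table_is_present(out_dir):
    """Return True if the default summary table exists under out_dir."""
    return default_table_path(out_dir).is_file()


if __name__ == "__main__":
    out_dir = "/Users/me/results"
    if table_is_present(out_dir):
        print("Found summary table:", default_table_path(out_dir))
    else:
        print("Summary table not found; pass it explicitly with -tbl/--fragment_table")
```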
If you want a different output directory, you can pass the path to the table manually with -tbl/--fragment_table:
$ epimapper spike_in_calibration -b /Users/me/results/Epimapper/alignment/bed -tbl /Users/me/results/Epimapper/summary_tables/bowtie2_alignment_ref_and_spike_in.csv -ss /Users/me/results/Epimapper/alignment/sam_spike_in -cs /Users/me/in_folder/hg38_chromosome_sizes.txt -o /Users/me/results
If you did not use this pipeline's bowtie2_alignment function to perform the reference genome and spike-in alignment, you must manually create a summary table containing the following columns: [“Sample”, “Replication”, “SequencingDepth”, “MappedFragments”, “AlignmentRate”, “MappedFragments_SpikeIn”, “AlignmentRate_SpikeIn”] with the corresponding information for each sample.
To avoid this manual work, it is recommended to use the pipeline as a whole.
$ epimapper spike_in_calibration -b /Users/me/results/Epimapper/alignment/bed -tbl /Users/me/results/my_table.csv -ss /Users/me/results/sam_spike_in -cs /Users/me/in_folder/hg38_chromosome_sizes.txt -o /Users/me/results
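A manually created summary table of the kind described above could be written with Python's standard csv module. This is only a sketch: the column names come from this documentation, but the helper names, the example values, and the number formats (for instance, whether AlignmentRate is a percentage) are assumptions and should be replaced with your own alignment results.

```python
import csv

# Column names required by spike_in_calibration, as listed in this documentation.
COLUMNS = [
    "Sample",
    "Replication",
    "SequencingDepth",
    "MappedFragments",
    "AlignmentRate",
    "MappedFragments_SpikeIn",
    "AlignmentRate_SpikeIn",
]

# Placeholder row with made-up values, for illustration only.
EXAMPLE_ROWS = [
    {
        "Sample": "sampleA",
        "Replication": "rep1",
        "SequencingDepth": 1000000,
        "MappedFragments": 950000,
        "AlignmentRate": 95.0,
        "MappedFragments_SpikeIn": 5000,
        "AlignmentRate_SpikeIn": 0.5,
    },
]


def write_summary_table(rows, path):
    """Write rows (a list of dicts keyed by COLUMNS) to a CSV file with a header."""
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(rows)


def missing_columns(path):
    """Return the required columns that are absent from the header of a CSV file."""
    with open(path, newline="") as handle:
        header = next(csv.reader(handle))
    return [column for column in COLUMNS if column not in header]
```

Calling write_summary_table(EXAMPLE_ROWS, "my_table.csv") produces a file that can then be passed with -tbl, and missing_columns("my_table.csv") returns an empty list when every required column is present.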
Output
Like all other functions in the EpiMapper Python package, this function creates a main Epimapper output directory if one is not already present in the chosen output directory. It also creates a “bedgraph” folder to store the spike-in normalized files, as well as a summary table and a PNG figure with boxplots of the spike-in scaling factors and normalized fragment counts.
Epimapper
|- alignment
| |- bedgraph
| | |- "sample-name".fragments.normalized.bedgraph
|- summary_tables
| |- spike_in_calibration_summary.csv
| |- spike_in_calibration.png
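This section does not state how the scaling factors in the summary table are computed. One convention often used for spike-in calibration is to divide an arbitrary constant by the number of fragments mapped to the spike-in genome, and to multiply fragment counts by that factor. The sketch below illustrates that convention only; the constant of 10,000 and the function names are assumptions, and EpiMapper's own calculation may differ.

```python
# Assumed constant; any fixed value shared by all samples gives comparable factors.
SCALE_CONSTANT = 10_000


def scale_factor(spike_in_fragments, constant=SCALE_CONSTANT):
    """Return constant / spike-in fragment count (an assumed convention)."""
    if spike_in_fragments <= 0:
        raise ValueError("spike-in fragment count must be positive")
    return constant / spike_in_fragments


def normalize_count(fragment_count, spike_in_fragments, constant=SCALE_CONSTANT):
    """Scale a raw fragment count by the spike-in scaling factor."""
    return fragment_count * scale_factor(spike_in_fragments, constant)


if __name__ == "__main__":
    # A sample with 5,000 spike-in fragments gets a factor of 2.0,
    # so a raw count of 300 fragments becomes 600.0 after scaling.
    print(scale_factor(5000))
    print(normalize_count(300, 5000))
```

Under this convention, a sample with fewer spike-in fragments receives a larger factor, so its counts are scaled up relative to samples with more spike-in fragments.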