Demos

There are currently two demos available for the EpiMapper Python package: one for CUT&Tag data and one for ATAC-seq data.

CUT&Tag Human Histone Modification Demo

The data used in this demo were obtained from GEO accession GSE145187, derived from Kaya-Okur et al. (2020).

This dataset consists of six samples: two replicates each for the histone modifications H3K4me3 and H3K27me3, plus two IgG control samples.

You need to download and create two folders:

Additionally, you need to create an “out” folder where the output will be stored. Folders can be created by using:

$ mkdir out
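Before running the demo, it can help to confirm that the working directory has the expected layout. The sketch below assumes the two downloaded folders are named `fastq` and `in`, as the paths in the demo commands suggest. The function name `check_demo_layout` is hypothetical and not part of EpiMapper.

```shell
#!/bin/sh
# Hypothetical helper, not part of EpiMapper: checks the demo working directory.
# Assumes the input folders are called "fastq" and "in" (inferred from the
# paths used in the demo commands) and creates "out" if it is missing.
check_demo_layout() {
    dir="${1:-.}"
    status=0
    for sub in fastq in; do
        if [ ! -d "$dir/$sub" ]; then
            echo "missing folder: $dir/$sub" >&2
            status=1
        fi
    done
    # -p makes mkdir succeed even when "out" already exists
    mkdir -p "$dir/out" || status=1
    return "$status"
}
```

A possible call from the demo directory is `check_demo_layout . && echo "layout OK"`.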

The script to run the demo is shown below:

# 1. fastqc

epimapper fastqc -f fastq -o out

# 2. bowtie2_alignment - to the reference genome (hg38)

epimapper bowtie2_alignment -f fastq -i in/bowtie2_index_hg38 -m True -o out

# bowtie2_alignment - to the spike-in genome (E. coli)

epimapper bowtie2_alignment -f fastq -s True -i in/bowtie2_index_ecoli -m True -o out

Example table: After running bowtie2_alignment, the result (Table 1) appears as follows:

Table 1: Example bowtie2_alignment_ref_and_spike_in

Sample      Replication  SequencingDepth  MappedFragments  AlignmentRate  MappedFragments_SpikeIn  AlignmentRate_SpikeIn
d-H3K27me3  rep1         729951           729478           99.94%         473                      0.06%
d-H3K27me3  rep2         695765           695534           99.97%         231                      0.03%
d-H3K4me3   rep1         358119           357764           99.9%          355                      0.1%
d-H3K4me3   rep2         472641           468294           99.08%         4347                     0.92%
d-IgG       rep1         134669           59177            43.94%         75492                    56.06%
d-IgG       rep2         603373           482561           79.98%         120812                   20.02%
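The rate columns in Table 1 are consistent with each mapped-fragment count divided by the sequencing depth. A minimal sketch of that arithmetic follows, using a hypothetical helper `alignment_rate`. It mirrors the table values and is not EpiMapper's own code.

```shell
#!/bin/sh
# Hypothetical helper: percentage of fragments that mapped, to two decimals.
# Usage: alignment_rate MAPPED_FRAGMENTS SEQUENCING_DEPTH
alignment_rate() {
    awk -v mapped="$1" -v depth="$2" 'BEGIN {
        # Guard against division by zero for empty samples
        if (depth <= 0) { print "NA"; exit }
        printf "%.2f\n", 100 * mapped / depth
    }'
}

# d-H3K27me3 rep1 from Table 1: reference genome, then spike-in genome
alignment_rate 729478 729951   # prints 99.94
alignment_rate 473 729951      # prints 0.06
```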

# 3. remove_duplicates

epimapper remove_duplicates -s out/Epimapper/alignment/sam -o out

# 4. fragment_length

epimapper fragment_length -s out/Epimapper/alignment/removeDuplicate/sam_duplicates_removed -o out

Example Plot: After running fragment_length, the fragment length distribution plot (Figure 1 and Figure 2) appears as follows:

[Figure: ViolinPlot Fragment Length]

# 5. filtering

epimapper filtering -s out/Epimapper/alignment/removeDuplicate/sam_duplicates_removed \
-cs in/hg38.chrom.sizes.clear.sorted -bl in/blacklist.bed  -sn True -o out

# 6. spike_in_calibration

epimapper spike_in_calibration -b out/Epimapper/alignment/bed -cs in/hg38.chrom.sizes.clear.sorted \
-ss out/Epimapper/alignment/sam_spike_in -o out
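Spike-in calibration generally scales each sample by a factor inversely proportional to its spike-in fragment count. The sketch below assumes the convention scale = C / spike-in fragments with C = 10000, which is used in some CUT&Tag processing workflows. The constant, the helper name `spike_in_scale`, and whether EpiMapper computes its factor in exactly this way are all assumptions.

```shell
#!/bin/sh
# Hypothetical helper, not EpiMapper's implementation.
# Usage: spike_in_scale SPIKE_IN_FRAGMENTS [CONSTANT]
# CONSTANT defaults to 10000 (an assumed convention, see the text above).
spike_in_scale() {
    awk -v spike="$1" -v c="${2:-10000}" 'BEGIN {
        # A sample without spike-in fragments cannot be calibrated
        if (spike <= 0) { print "NA"; exit }
        printf "%.2f\n", c / spike
    }'
}

# Spike-in fragment counts taken from Table 1
spike_in_scale 473    # d-H3K27me3 rep1, prints 21.14
spike_in_scale 4347   # d-H3K4me3 rep2, prints 2.30
```

Under this convention, samples with fewer spike-in fragments get a larger scale factor.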

# 7. peak_calling

epimapper peak_calling  -soft seacr -f out/Epimapper/alignment/bed -bg out/Epimapper/alignment/bedgraph \
-c IgG -o out

# 8. heatmap

epimapper heatmap -b out/Epimapper/alignment/bam  -p out/Epimapper/peakCalling/seacr/control \
-bl in/blacklist.bed -r in/hg38.refFlat.txt  -o out

Example Plot: After running heatmap, the heatmap of histone enrichment around genes (Figure 3 and Figure 4) appears as follows. Figure 4 is a composite heatmap constructed from individual peak heatmaps of single samples, represented as a single image in the original file.

[Figure: Heatmap]

# 9. differential_analysis

epimapper differential_analysis -p out/Epimapper/peakCalling/seacr/control \
-bg out/Epimapper/alignment/bedgraph \
-bl in/blacklist.bed -r in/hg38.refFlat.txt -cs in/hg38.chrom.sizes.clear.sorted \
-la H3K27me3_rep1 H3K27me3_rep2 -lb H3K4me3_rep1 H3K4me3_rep2 -an True \
-e  in/hg38_all_enhancers_merged_hglft_genome_327b3_4dmr.bed -o out

ATAC-seq Demo

The data used in this demo come from an ATAC-seq experiment on healthy and diabetic pancreatic islets, from Brysani et al. (2020), GEO accession GSE129383.

To save space, the demo data contain only chr21 from the original data. The dataset consists of ATAC-seq data from 6 diabetic donors and 9 healthy donors, with one replicate per sample. The data for this demo are publicly available on Zenodo:

You need to download and create two folders:

Additionally, you need to create an “out” folder where the output will be stored. Folders can be created by using:

$ mkdir out

The script to run the demo is shown below:

# EpiMapper demo run on human ATAC-seq data (only chr21)

# 1. fastqc

epimapper fastqc -f fastq -o out

# 2. bowtie2_alignment

epimapper bowtie2_alignment -f fastq -i in/hg19_chr21_bowtie2_index -o out

# 3. remove_duplicates

epimapper remove_duplicates -s out/Epimapper/alignment/sam -o out

# 4. fragment_length

epimapper fragment_length -s out/Epimapper/alignment/removeDuplicate/sam_duplicates_removed -o out

# 5. filtering

epimapper filtering -s out/Epimapper/alignment/removeDuplicate/sam_duplicates_removed \
-cs in/hg19_chromosome_sizes_sorted.txt -bl in/hg19-blacklist_sorted.bed -atac True -o out

# 6. peak_calling

epimapper peak_calling -soft macs2 -f out/Epimapper/alignment/bed -b out/Epimapper/alignment/bam \
-gs 2.7e9 -o out

# 7. heatmap

epimapper heatmap -b out/Epimapper/alignment/bam -bl in/hg19-blacklist_sorted.bed \
-p out/Epimapper/peakCalling/macs2/top_peaks -r in/hg19.refFlat_chr21.txt -o out


# 8. differential_analysis

epimapper differential_analysis -p out/Epimapper/peakCalling/macs2/top_peaks \
-r in/hg19.refFlat_chr21.txt  -bl in/hg19-blacklist_sorted.bed -cs in/hg19_chromosome_sizes_sorted_filtered.txt \
-fold True -an True -e in/hg19_all_enhancers_merged_4dmr.bed -o out \
-la diabetic-1_rep1 diabetic-2_rep1 diabetic-3_rep1 diabetic-4_rep1 diabetic-5_rep1 diabetic-6_rep1 \
-lb healthy-1_rep1 healthy-2_rep1 healthy-3_rep1 healthy-4_rep1 healthy-5_rep1 healthy-6_rep1 healthy-7_rep1 healthy-8_rep1 healthy-9_rep1
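Typing the fifteen sample labels by hand is error-prone. The sketch below builds the same `-la` and `-lb` lists with a loop. The helper `make_labels` is hypothetical, and the label pattern `<group>-<n>_rep1` is taken from the command above.

```shell
#!/bin/sh
# Hypothetical helper: prints "<group>-1_rep1 ... <group>-N_rep1" on one line.
# Usage: make_labels GROUP COUNT
make_labels() {
    group="$1"
    count="$2"
    labels=""
    i=1
    while [ "$i" -le "$count" ]; do
        # Add a separating space only when labels is already non-empty
        labels="${labels:+$labels }${group}-${i}_rep1"
        i=$((i + 1))
    done
    printf '%s\n' "$labels"
}

la=$(make_labels diabetic 6)
lb=$(make_labels healthy 9)

# Show the two argument lists that would be passed to differential_analysis
echo "-la $la"
echo "-lb $lb"
```

When substituting these into the command, leave `$la` and `$lb` unquoted so that each label is passed as a separate argument.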