========
Demos
========

There are currently two demos available for the EpiMapper Python package: one for CUT&Tag data and one for ATAC-seq data.

CUT&Tag Human Histone Modification Demo
========================================

The data used in this demo come from GEO accession GSE145187, derived from Kaya-Okur et al. (2020).
The dataset consists of 6 samples targeting two histone modifications, H3K4me3 and H3K27me3, with two replicates each, as well as two IgG control samples.

You need to download the data and create two folders:

- fastq: the demo FASTQ files from CUT&Tag, downloaded from https://zenodo.org/records/10822274
- in: all input files needed to run EpiMapper, downloaded from https://zenodo.org/records/10822349

Additionally, you need to create an "out" folder where the output will be stored. Folders can be created with:

.. code-block:: bash

   $ mkdir out

The script to run the demo is shown below:

.. code-block:: bash

   # 1. fastqc
   epimapper fastqc -f fastq -o out

   # 2. bowtie2_alignment - To reference genome Hg38
   epimapper bowtie2_alignment -f fastq -i in/bowtie2_index_hg38 -m True -o out

   # bowtie2_alignment - To spike-in genome (E.coli)
   epimapper bowtie2_alignment -f fastq -s True -i in/bowtie2_index_ecoli -m True -o out

**Example table**: After running ``bowtie2_alignment``, the result (Table 1) appears as follows:

.. table:: Table 1: Example bowtie2_alignment_ref_and_spike_in
   :widths: auto
   :align: center
   :class: my-custom-class

   ========== =========== =============== =============== ============= ======================= =====================
   Sample     Replication SequencingDepth MappedFragments AlignmentRate MappedFragments_SpikeIn AlignmentRate_SpikeIn
   ========== =========== =============== =============== ============= ======================= =====================
   d-H3K27me3 rep1        729951          729478          99.94%        473                     0.06%
   d-H3K27me3 rep2        695765          695534          99.97%        231                     0.03%
   d-H3K4me3  rep1        358119          357764          99.9%         355                     0.1%
   d-H3K4me3  rep2        472641          468294          99.08%        4347                    0.92%
   d-IgG      rep1        134669          59177           43.94%        75492                   56.06%
   d-IgG      rep2        603373          482561          79.98%        120812                  20.02%
   ========== =========== =============== =============== ============= ======================= =====================

.. code-block:: bash

   # 3. remove_duplicates
   epimapper remove_duplicates -s out/Epimapper/alignment/sam -o out

   # 4. fragment_length
   epimapper fragment_length -s out/Epimapper/alignment/removeDuplicate/sam_duplicates_removed -o out

**Example Plot**: After running ``fragment_length``, the fragment length distribution plot (Figure 1 and Figure 2) appears as follows:

.. figure:: ../content/figures/fragment_histone.png
   :alt: ViolinPlot Fragment Length
   :align: center
   :width: 100%

.. code-block:: bash

   # 5. filtering
   epimapper filtering -s out/Epimapper/alignment/removeDuplicate/sam_duplicates_removed \
       -cs in/hg38.chrom.sizes.clear.sorted -bl in/blacklist.bed -sn True -o out

   # 6. spike_in_calibration
   epimapper spike_in_calibration -b out/Epimapper/alignment/bed -cs in/hg38.chrom.sizes.clear.sorted \
       -ss out/Epimapper/alignment/sam_spike_in -o out

   # 7. peak_calling
   epimapper peak_calling -soft seacr -f out/Epimapper/alignment/bed -bg out/Epimapper/alignment/bedgraph \
       -c IgG -o out

   # 8. heatmap
   epimapper heatmap -b out/Epimapper/alignment/bam -p out/Epimapper/peakCalling/seacr/control \
       -bl in/blacklist.bed -r in/hg38.refFlat.txt -o out

**Example Plot**: After running ``heatmap``, the heatmap of histone enrichment around genes (Figure 3 and Figure 4) appears as follows.
Figure 4 is a composite heatmap built from the individual peak heatmaps of single samples; in the original file it is stored as a single image.

.. figure:: ../content/figures/heatmap_histone.png
   :alt: Heatmap
   :align: center
   :width: 100%

.. code-block:: bash

   # 9. differential_analysis
   epimapper differential_analysis -p out/Epimapper/peakCalling/seacr/control \
       -bg out/Epimapper/alignment/bedgraph \
       -bl in/blacklist.bed -r in/hg38.refFlat.txt -cs in/hg38.chrom.sizes.clear.sorted \
       -la H3K27me3_rep1 H3K27me3_rep2 -lb H3K4me3_rep1 H3K4me3_rep2 -an True \
       -e in/hg38_all_enhancers_merged_hglft_genome_327b3_4dmr.bed -o out

ATAC-seq Demo
========================

The data used in this demo come from an ATAC-seq experiment on pancreatic islets from healthy and diabetic donors, collected by Brysani et al. (2020) under GEO accession GSE129383.
To save space, the demo data contain only chr21 from the original data.
The dataset consists of ATAC-seq data from 6 diabetic donors and 9 healthy donors, with a single replicate per sample.
The demo data are publicly available on Zenodo.

You need to download the data and create two folders:

- fastq: the demo FASTQ files from ATAC-seq, downloaded from https://zenodo.org/records/10818453
- in: all input files needed to run EpiMapper, downloaded from https://zenodo.org/records/10818469

Additionally, you need to create an "out" folder where the output will be stored. Folders can be created with:

.. code-block:: bash

   $ mkdir out

The script to run the demo is shown below:

.. code-block:: bash

   # EpiMapper demo run on human ATAC-seq data (only chr21)

   # 1. fastqc
   epimapper fastqc -f fastq -o out

   # 2. bowtie2_alignment
   epimapper bowtie2_alignment -f fastq -i in/hg19_chr21_bowtie2_index -o out

   # 3. remove_duplicates
   epimapper remove_duplicates -s out/Epimapper/alignment/sam -o out

   # 4. fragment_length
   epimapper fragment_length -s out/Epimapper/alignment/removeDuplicate/sam_duplicates_removed -o out

   # 5. filtering
   epimapper filtering -s out/Epimapper/alignment/removeDuplicate/sam_duplicates_removed \
       -cs in/hg19_chromosome_sizes_sorted.txt -bl in/hg19-blacklist_sorted.bed -atac True -o out

   # 6. peak_calling
   epimapper peak_calling -soft macs2 -f out/Epimapper/alignment/bed -b out/Epimapper/alignment/bam \
       -gs 2.7e9 -o out

   # 7. heatmap
   epimapper heatmap -b out/Epimapper/alignment/bam -bl in/hg19-blacklist_sorted.bed \
       -p out/Epimapper/peakCalling/macs2/top_peaks -r in/hg19.refFlat_chr21.txt -o out

   # 8. differential_analysis
   epimapper differential_analysis -p out/Epimapper/peakCalling/macs2/top_peaks \
       -r in/hg19.refFlat_chr21.txt -bl in/hg19-blacklist_sorted.bed -cs in/hg19_chromosome_sizes_sorted_filtered.txt \
       -fold True -an True -e in/hg19_all_enhancers_merged_4dmr.bed -o out \
       -la diabetic-1_rep1 diabetic-2_rep1 diabetic-3_rep1 diabetic-4_rep1 diabetic-5_rep1 diabetic-6_rep1 \
       -lb healthy-1_rep1 healthy-2_rep1 healthy-3_rep1 healthy-4_rep1 healthy-5_rep1 healthy-6_rep1 healthy-7_rep1 healthy-8_rep1 healthy-9_rep1
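
Both demos assume the same working-directory layout: a ``fastq`` and an ``in`` folder holding the Zenodo downloads, plus an ``out`` folder for results. The following is a minimal sketch of a pre-flight check that could be run before either demo script. The function name ``check_demo_layout`` is hypothetical and is not part of EpiMapper; it only inspects folders and never calls ``epimapper`` itself.

.. code-block:: bash

   # Hypothetical helper (not part of EpiMapper): verify the demo folder layout.
   # Usage: check_demo_layout [demo_dir]   (defaults to the current directory)
   check_demo_layout() {
       local demo_dir="${1:-.}"
       local missing=0
       local sub

       # "fastq" and "in" must already exist and hold the downloaded files.
       for sub in fastq in; do
           if [ ! -d "$demo_dir/$sub" ]; then
               echo "Missing folder: $demo_dir/$sub" >&2
               missing=1
           elif [ -z "$(ls -A "$demo_dir/$sub")" ]; then
               echo "Folder is empty: $demo_dir/$sub" >&2
               missing=1
           fi
       done

       # Stop here if an input folder is absent or empty.
       if [ "$missing" -ne 0 ]; then
           return 1
       fi

       # Create the output folder; -p makes this a no-op if it already exists.
       mkdir -p "$demo_dir/out"
   }

Calling ``check_demo_layout`` from the demo directory, and continuing only if it succeeds, would catch a missing or empty download before the first ``epimapper`` step starts.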