In these exercises we will review some of the functionality of ChIPQC, reading in peaks and annotating peaks to genes.
We will be using data directly downloaded from the ENCODE consortium.
Precomputed ChIPQC results, stored as a list for two ENCODE CTCF samples and their input, can be found in the data directory.
data/CTCFQC.RData
We will also perform our own QC on human pancreas CTCF data. We should download the BAM file here.
We can also retrieve the relevant peak calls from here.
Load the CTCF QC .RData into R
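A minimal sketch of this step, assuming the working directory contains the `data` folder. The name of the object stored in the file is not stated in these exercises, so the helper below (`load_rdata`, a hypothetical name and not part of ChIPQC) loads the file into a fresh environment and reports whatever it finds.

```r
# Hypothetical helper: load an .RData file into its own environment so the
# stored object names can be inspected without overwriting anything.
load_rdata <- function(path) {
  env <- new.env()
  loaded <- load(path, envir = env)  # load() returns the names it restored
  mget(loaded, envir = env)          # named list of the loaded objects
}

qc_path <- "data/CTCFQC.RData"       # path given in the exercise text
if (file.exists(qc_path)) {
  qc_objects <- load_rdata(qc_path)
  print(names(qc_objects))           # which objects the file contained
}
```

Calling `load(qc_path)` directly also works; the wrapper is only a convenience for discovering the object name first.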
Produce a cross-coverage plot from these samples using ChIPQC. Add metadata for the antibody.
## Scale for 'fill' is already present. Adding another scale for 'fill', which
## will replace the existing scale.
## Using Sample as id variables
## Scale for 'x' is already present. Adding another scale for 'x', which will
## replace the existing scale.
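One possible way to approach the cross-coverage exercise is sketched here. It assumes the loaded list of QC results is called `ctcf_qc` (a hypothetical name) and that the antibody labels are supplied in the same order as the samples in that list; both should be checked against the actual data. The helper `make_antibody_meta` is illustrative and not part of ChIPQC.

```r
# Hypothetical helper: build the metadata table passed to ChIPQC plotting
# functions. One row per sample, matched on the Sample column.
make_antibody_meta <- function(sample_names, antibody) {
  stopifnot(length(sample_names) == length(antibody))
  data.frame(Sample = sample_names, Antibody = antibody,
             stringsAsFactors = FALSE)
}

# Usage, assuming `ctcf_qc` is the loaded list of ChIPQC results and that
# the labels below match the order of names(ctcf_qc) (an assumption):
# library(ChIPQC)
# meta <- make_antibody_meta(names(ctcf_qc), c("CTCF", "CTCF", "Input"))
# plotCC(ctcf_qc, addMetaData = meta, colourBy = "Antibody")
```

Colouring by antibody makes it easy to compare the ChIP samples against the input on one set of axes.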
## >> preparing features information... 2021-08-02 03:29:47 PM
## >> identifying nearest features... 2021-08-02 03:29:48 PM
## >> calculating distance from peak to TSS... 2021-08-02 03:29:53 PM
## >> assigning genomic annotation... 2021-08-02 03:29:53 PM
## >> adding gene annotation... 2021-08-02 03:30:29 PM
## 'select()' returned 1:many mapping between keys and columns
## >> assigning chromosome lengths 2021-08-02 03:30:30 PM
## >> done... 2021-08-02 03:30:30 PM
Export annotated peaks to a tab separated file.
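The export itself needs only base R. The sketch below uses a small toy data frame as a stand-in for the annotated peaks, with placeholder column values and a hypothetical output file name; in the exercise, the table would instead come from calling `as.data.frame()` on the annotation result.

```r
# Toy stand-in for the annotated peaks; in the exercise this table would
# come from as.data.frame() applied to the peak annotation result.
anno_df <- data.frame(
  seqnames = c("chr10", "chr11"),
  start = c(100L, 500L),
  end = c(200L, 650L),
  annotation = c("Promoter", "Distal Intergenic"),
  SYMBOL = c("GENE_A", "GENE_B"),  # placeholder gene symbols
  stringsAsFactors = FALSE
)

# Hypothetical output path; tab-separated, no quoting, no row names
out_file <- file.path(tempdir(), "annotated_peaks.txt")
write.table(anno_df, out_file, sep = "\t", quote = FALSE, row.names = FALSE)
```

Setting `quote = FALSE` and `row.names = FALSE` keeps the file easy to open in a spreadsheet or to read back with `read.delim()`.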
Download the blacklist for hg38 and QC our newly downloaded BAM file with ChIPQC. To save time, run ChIPQC only on chromosomes chr10, chr11 and chr12. Create a cross-coverage plot using ChIPQC.
## Reads Map% Filt% Dup% ReadL FragL RelCC
## 4984428.00 100.00 5.56 0.00 76.00 264.00 2.34
## SSD RiP% RiBL%
## 0.93 NA 2.61
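A rough sketch of how this last exercise could be set up. The wrapper `run_chr_qc` and the file names in the usage comments are hypothetical, and the sketch assumes the installed ChIPQC version accepts `"hg38"` as an annotation and that the blacklist has been saved locally as a BED file. No peak file is passed, which is consistent with the `NA` reported for RiP% above.

```r
# Restrict QC to three chromosomes to save time
chroms <- paste0("chr", 10:12)

# Hypothetical wrapper around ChIPQCsample(); requires ChIPQC to be
# installed when it is called.
run_chr_qc <- function(bam_file, blacklist, chromosomes = chroms) {
  ChIPQC::ChIPQCsample(bam_file,
                       annotation = "hg38",      # assumes hg38 is supported
                       blacklist = blacklist,
                       chromosomes = chromosomes)
}

# Usage, with hypothetical local file names:
# library(ChIPQC)
# blacklist_gr <- rtracklayer::import("hg38_blacklist.bed.gz")
# pancreas_qc <- run_chr_qc("pancreas_CTCF.bam", blacklist_gr)
# QCmetrics(pancreas_qc)  # summary metrics like those shown above
# plotCC(pancreas_qc)     # cross-coverage plot
```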