Working with Genomic Intervals

In this exercise we will work with ChIP-seq peak calls for the ENCODE experiment ENCSR000ERN.

  1. Install the GenomicRanges and rtracklayer packages.

  2. Read in the file data/Myc_Ch12_1_withInput_Input_Ch12_peaks.xls and create a GRanges object which includes the values of fold_enrichment and -log10 p-value provided in the file.

  3. Create a boxplot of the fold enrichments in genomic intervals over every chromosome.
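
As a minimal sketch of the GRanges construction and boxplot steps, the toy example below builds a GRanges with two metadata columns and plots fold enrichment per chromosome. The data frame values and the column names chr and minusLog10Pval are invented for illustration; the real table would be read from the .xls file (for example with read.delim()) and its column names may differ.

```r
# Packages can be installed from Bioconductor if needed:
# BiocManager::install(c("GenomicRanges", "rtracklayer"))
library(GenomicRanges)

# Toy stand-in for the peak table (values and column names are assumptions).
peaksDF <- data.frame(
  chr             = c("chr1", "chr1", "chr2", "chr2"),
  start           = c(100, 500, 200, 900),
  end             = c(300, 800, 450, 1200),
  fold_enrichment = c(12.5, 4.2, 8.0, 15.1),
  minusLog10Pval  = c(6.1, 2.3, 4.8, 9.0)
)

# Extra named arguments to GRanges() become metadata columns.
peaks <- GRanges(
  seqnames        = peaksDF$chr,
  ranges          = IRanges(start = peaksDF$start, end = peaksDF$end),
  fold_enrichment = peaksDF$fold_enrichment,
  minusLog10Pval  = peaksDF$minusLog10Pval
)

# One box per chromosome; as.data.frame() exposes seqnames and the
# metadata columns as ordinary data frame columns.
peaksForPlot <- as.data.frame(peaks)
boxplot(fold_enrichment ~ seqnames, data = peaksForPlot,
        xlab = "Chromosome", ylab = "Fold enrichment")
```

Building the GRanges explicitly, rather than relying on automatic column detection, keeps the mapping from file columns to ranges and metadata visible.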

  4. Create a GRanges object of genomic intervals on chromosome 1 with scores greater than 10 and p-value less than 0.0001, and export it as the BED file filteredMyc.bed.

  5. Read in the TXT file of gene positions (containing the contig, named as the seqnames column, and the genomic start and end) from the file data/mm10_GenePosForIGV.txt and export it to the BED file Genes.bed.

  6. Create a GRanges of the transcriptional start site (TSS) positions of every gene (exact 1 bp TSS).

  7. Extend this GRanges to be +/- 500 bp around the transcriptional start sites.

  8. Create a BED file (called filteredMycOnTSS.bed) of our Myc peaks from question 4 which overlap our new GRanges of +/- 500 bp around transcriptional start sites.

  9. Import the generated BED files (filteredMycOnTSS.bed, filteredMyc.bed, Genes.bed) into IGV on the mm10 genome.

     Download the signal p-value bigWig for replicate 1 of experiment ENCSR000ERN from the ENCODE portal and capture an image of the bigWig signal over our BED intervals at the Igfbp2 gene.
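
The filtering, TSS, and overlap steps above can be sketched on toy data as follows. All coordinates are hypothetical, "scores" is assumed here to mean fold enrichment, and the gene table is assumed to carry strand information (if it does not, the TSS cannot be told apart from the gene end). The output path uses tempdir() only so that the sketch is safe to run anywhere.

```r
library(GenomicRanges)
library(rtracklayer)

# Toy peaks with the two metadata columns used for filtering (assumed names).
peaks <- GRanges(
  seqnames        = c("chr1", "chr1", "chr1", "chr2"),
  ranges          = IRanges(start = c(900, 3000, 6800, 950), width = 200),
  fold_enrichment = c(12, 15, 20, 30),
  minusLog10Pval  = c(5, 6, 3, 8)
)

# A p-value below 0.0001 is equivalent to -log10(p) above 4.
keep <- as.character(seqnames(peaks)) == "chr1" &
  peaks$fold_enrichment > 10 &
  peaks$minusLog10Pval > -log10(0.0001)
filteredPeaks <- peaks[keep]

# Toy genes, one per strand.
genes <- GRanges(
  seqnames = c("chr1", "chr1"),
  ranges   = IRanges(start = c(1000, 5000), end = c(2000, 7000)),
  strand   = c("+", "-")
)

# resize() is strand-aware: fix = "start" keeps the 5' end, so the TSS of a
# minus-strand gene is its highest coordinate.
tss <- resize(genes, width = 1, fix = "start")

# 1 bp TSS plus 500 bp on each side gives a 1001 bp window.
tssWindow <- resize(tss, width = 1001, fix = "center")

# Keep filtered peaks that overlap any TSS window, then write them as BED.
filteredOnTSS <- filteredPeaks[filteredPeaks %over% tssWindow]
outFile <- file.path(tempdir(), "filteredMycOnTSS.bed")
export.bed(filteredOnTSS, con = outFile)
```

BED files use 0-based starts, while GRanges are 1-based; export.bed() and import.bed() handle that conversion, so coordinates should not be shifted by hand.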

10. Import the data/Myc_Ch12_1_withInput_Input_Ch12_summits.bed BED file into a GRanges object. The 5th column in the file represents summit height.

11. Generate density plots of summit heights for summits overlapping and not overlapping our extended TSS positions.

12. Filter the summits GRanges to the top 500 ranked by the GRanges score column.

13. Extend these top 500 summits to 50 bp around the centre of the GRanges intervals. Extract the sequences under the peaks and write them to a file.
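
A minimal sketch of the summit steps on toy data is shown below. The coordinates and scores are invented, the top 2 summits stand in for the top 500, and "50 bp around the centre" is assumed to mean a total width of 50 bp. Sequence extraction is not shown because it needs a genome package; one possible route is getSeq() on BSgenome.Mmusculus.UCSC.mm10 followed by writeXStringSet() from Biostrings, provided the chromosome names match between the summits and the genome.

```r
library(GenomicRanges)

# Toy 1 bp summits with a score column (hypothetical values).
summits <- GRanges(
  seqnames = "chr1",
  ranges   = IRanges(start = c(100, 400, 900, 1500), width = 1),
  score    = c(12, 85, 40, 7)
)

# Toy stand-in for the extended TSS windows.
tssWindow <- GRanges("chr1", IRanges(start = 1, end = 500))

# Split summit heights by whether the summit falls in a TSS window.
onTSS <- summits %over% tssWindow
plot(density(summits$score[onTSS]), main = "Summit height",
     xlab = "Score")
lines(density(summits$score[!onTSS]), lty = 2)
legend("topright", legend = c("On TSS", "Off TSS"), lty = c(1, 2))

# Rank by score and keep the top N (N = 2 here, 500 in the exercise).
topN <- 2
ranked <- summits[order(summits$score, decreasing = TRUE)]
top <- ranked[seq_len(topN)]

# Resize around the centre to a fixed total width of 50 bp.
topWindows <- resize(top, width = 50, fix = "center")
```

With real data the two density curves will usually have different ranges, so it may be worth setting xlim and ylim explicitly in the first plot() call.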