Working with BSgenome objects using Biostrings

In these exercises you will gain some experience working with the BSgenome packages.

  1. Load the library BSgenome.Hsapiens.UCSC.hg19

  2. Count the number of contigs.

## [1] 298
  3. Give the sum of lengths of the 3 smallest chromosomes.
## [1] 35839
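
As a hint for the two exercises above: `seqlengths()` on a BSgenome object returns a named integer vector, so ordinary vector functions are enough. The sketch below uses a made-up vector as a stand-in for `seqlengths(BSgenome.Hsapiens.UCSC.hg19)`; the names and values are hypothetical and only show the pattern.

```r
# Toy stand-in for seqlengths(genome): a named integer vector.
# Names and lengths are invented for illustration only.
lens <- c(chrA = 500L, chrB = 120L, chrC = 75L, chrD = 300L, chrE = 40L)

n_seqs <- length(lens)             # number of sequences (contigs)
smallest3 <- head(sort(lens), 3)   # three shortest sequences, ascending
total_smallest3 <- sum(smallest3)  # combined length of those three

n_seqs           # 5
smallest3        # chrE 40, chrC 75, chrB 120
total_smallest3  # 235
```

On the real genome, replacing `lens` with the output of `seqlengths()` should give the values shown above.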
  4. How many unknown bases (N) are in chromosome 20?
##       N 
## 3520000
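
One possible approach to counting letters is `letterFrequency()` from Biostrings. The sketch below runs on a short toy `DNAString` (an assumption made so the example is quick to run); the same calls should work on a chromosome pulled from the genome object, for example `genome$chr20`, where `genome` is a hypothetical variable holding the loaded BSgenome.

```r
suppressPackageStartupMessages(library(Biostrings))

# Toy sequence standing in for a chromosome (not real data)
dna <- DNAString("ACGTNNACGTTN")

# Count a single letter: returns a named integer vector
n_count <- letterFrequency(dna, "N")

# Count several letters at once: one named count per letter
base_counts <- letterFrequency(dna, c("A", "C", "G", "T"))

n_count      # N = 3
base_counts  # A = 2, C = 2, G = 2, T = 3
```

A named vector such as `base_counts` can be passed directly to a plotting function like `barplot()`.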
  5. Create a bar chart of the total number of A, T, C and G bases on chromosome 20.

  6. Extract the sequence from chromosome 20 at positions 1,000,000 to 1,000,020 and retrieve the complement sequence.
## 21-letter DNAString object
## seq: CACCCTCTCTTGACCTTGTTC
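
A minimal sketch of extracting a region, complementing it and saving it as FASTA, again on a toy `DNAString` rather than the real chromosome. The record name `toy_complement` and the temporary file are placeholders. Note that `complement()` and `reverseComplement()` give different results, so it is worth checking which one a task calls for.

```r
suppressPackageStartupMessages(library(Biostrings))

# Toy sequence standing in for a chromosome (not real data)
dna <- DNAString("AACCGGTTAACC")

# Extract positions 3 to 8 (1-based, inclusive)
region <- subseq(dna, start = 3, end = 8)   # CCGGTT

comp <- complement(region)                  # GGCCAA
revcomp <- reverseComplement(region)        # AACCGG

# FASTA writing works on sets, so wrap the sequence and give it a name
out <- DNAStringSet(comp)
names(out) <- "toy_complement"
fa <- tempfile(fileext = ".fa")
writeXStringSet(out, filepath = fa)

# Read the file back to confirm the round trip
back <- readDNAStringSet(fa)
```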
  7. Write this complement sequence to a FASTA file.

  8. Look up the position of MYC in IGV (Human hg19) and find the genomic coordinates of its first exon.

  9. Extract the sequence for the first exon.

  10. Compare the sequence to that shown in IGV and identify the start of the translated region of the gene.

  11. Count the number of classical start codons (ATG) in the first exon.

## [1] 0
  12. Use IGV to review the translation start codon for GAPDH and similarly count the occurrences of ATG in exon 2 of the NM_001289745 transcript (chr12:6643976-6644027).
## [1] 1
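
For the codon-counting exercises, one option is `countPattern()` from Biostrings, with `matchPattern()` when the positions are also of interest. The sketch below uses a toy sequence, not an exon from the genome. Both functions search only the sequence as given, so a gene on the reverse strand may need `reverseComplement()` first.

```r
suppressPackageStartupMessages(library(Biostrings))

# Toy sequence standing in for an exon (not real data)
exon <- DNAString("CCATGGATGA")

# Number of exact ATG occurrences on the given strand
atg_count <- countPattern("ATG", exon)

# Where those occurrences start (1-based positions)
atg_starts <- start(matchPattern("ATG", exon))

atg_count   # 2
atg_starts  # 3 7
```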