In these exercises we will gain some experience working with gene models and annotation using the TxDb, OrgDb and GenomicFeatures packages.
Load the library TxDb.Mmusculus.UCSC.mm10.knownGene
Count the number of genes and transcripts
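One possible approach is sketched here. It assumes the TxDb package is already installed from Bioconductor. The variable names (`txdb`, `mm10_genes`, `mm10_transcripts`) are illustrative only. `genes()` and `transcripts()` are GenomicFeatures accessors, which become available when the TxDb package is loaded.

```r
# Load the mouse (mm10) UCSC knownGene transcript database
library(TxDb.Mmusculus.UCSC.mm10.knownGene)

# A shorter alias for the TxDb object (illustrative name)
txdb <- TxDb.Mmusculus.UCSC.mm10.knownGene

# Extract gene and transcript ranges as GRanges objects
mm10_genes <- genes(txdb)
mm10_transcripts <- transcripts(txdb)

# The length of each GRanges object is the number of features
length(mm10_genes)
length(mm10_transcripts)
```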
## 66 genes were dropped because they have exons located on both strands
## of the same reference sequence or on more than one reference sequence,
## so cannot be represented by a single genomic range.
## Use 'single.strand.genes.only=FALSE' to get all the genes in a
## GRangesList object, or use suppressMessages() to suppress this message.
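The message above names two options. As a hedged sketch, assuming the installed GenomicFeatures version still accepts the `single.strand.genes.only` argument quoted in the message, they could be tried like this (variable names are illustrative):

```r
library(TxDb.Mmusculus.UCSC.mm10.knownGene)
txdb <- TxDb.Mmusculus.UCSC.mm10.knownGene

# Option 1: keep the default behaviour (one range per gene) and silence the message
single_strand_genes <- suppressMessages(genes(txdb))

# Option 2: keep every gene, including those spanning both strands or
# several sequences; the result is a GRangesList rather than a GRanges
all_genes <- genes(txdb, single.strand.genes.only = FALSE)

# Compare how many genes each call returns
length(single_strand_genes)
length(all_genes)
```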
Read in the expression table GM12878_minus_HeLa.csv from the Data directory. The table contains human (hg19) expression data.
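A minimal sketch of reading the table with base R follows. The relative path `Data/GM12878_minus_HeLa.csv` is an assumption about where the working directory sits, so adjust it to your own setup. The names `expr_file` and `expr_table` are illustrative.

```r
# Build the path to the expression table (assumes Data/ is in the working directory)
expr_file <- file.path("Data", "GM12878_minus_HeLa.csv")

if (file.exists(expr_file)) {
  # Read the CSV, keeping text columns as character vectors
  expr_table <- read.csv(expr_file, stringsAsFactors = FALSE)

  # Inspect the first rows and the table dimensions
  print(head(expr_table))
  print(dim(expr_table))
} else {
  message("File not found: ", expr_file, " (adjust the path to your setup)")
}
```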
In the GM12878_minus_HeLa.csv file, column 1 contains Entrez IDs and column 2 contains gene symbols. Add a column of gene names to the table.
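One way to do this, sketched under assumptions: the human OrgDb package `org.Hs.eg.db` is installed, and the Entrez IDs sit in the first column as stated above. The helper `add_gene_names()` and the new column name `GeneName` are hypothetical, chosen for illustration only.

```r
# Human organism-level annotation; loading it also makes AnnotationDbi available
library(org.Hs.eg.db)

# Hypothetical helper: append a gene name column to a table whose
# first column holds Entrez IDs
add_gene_names <- function(expr_table) {
  # OrgDb keys are character strings, so convert the IDs first
  entrez_ids <- as.character(expr_table[[1]])

  # Look up the full gene name for each distinct Entrez ID
  annotation <- AnnotationDbi::select(
    org.Hs.eg.db,
    keys = unique(entrez_ids),
    keytype = "ENTREZID",
    columns = "GENENAME"
  )

  # Align the names back to the original row order with match()
  expr_table$GeneName <- annotation$GENENAME[match(entrez_ids, annotation$ENTREZID)]
  expr_table
}

# Usage, assuming expr_table holds the table read from GM12878_minus_HeLa.csv:
# expr_table <- add_gene_names(expr_table)
```

Calling `AnnotationDbi::select()` with the namespace prefix avoids clashes with `select()` from other packages such as dplyr, and `match()` keeps the row order intact even if an ID appears more than once.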
## 'select()' returned 1:1 mapping between keys and columns