These exercises cover the sections of Data wrangling with tidy.
All files can be found in the “dataset” directory.
Exercise 7
Hint:
Counts per million (CPM) are the gene counts normalized to total counts in a sample, multiplied by a million to give you a sensible number.
gene_A_CPM = (gene_A_counts / sum(all_genes_counts)) * 1,000,000
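As a worked illustration of the CPM formula, here is a minimal sketch in plain Python. The gene names and counts are made up for the example and do not come from the exercise files; the helper name `cpm` is also hypothetical.

```python
# Hypothetical raw counts for three genes in one sample (made-up numbers).
counts = {"gene_A": 100, "gene_B": 300, "gene_C": 600}


def cpm(sample_counts):
    """Return counts per million for every gene in a single sample."""
    total = sum(sample_counts.values())  # sum(all_genes_counts)
    return {gene: n / total * 1_000_000 for gene, n in sample_counts.items()}


sample_cpm = cpm(counts)
for gene, value in sample_cpm.items():
    print(gene, round(value, 1))
```

Here gene_A holds 100 of the 1,000 counts in the sample, so its CPM is one tenth of a million. The same per-sample calculation applies to each column of a counts table.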
Transcripts per million (TPM) are the gene counts first divided by gene length, then normalized to the total of these length-corrected counts in a sample, and multiplied by a million to give you a sensible number.
gene_A_TPM = (gene_A_counts / gene_A_length) / sum(all_genes_counts / all_genes_lengths) * 1,000,000
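The TPM formula can be checked on a small example in the same way. This is a sketch in plain Python under stated assumptions: the counts and gene lengths are made-up numbers, not values from the exercise files, and the helper name `tpm` is hypothetical.

```python
# Hypothetical raw counts and gene lengths (made-up numbers).
counts = {"gene_A": 100, "gene_B": 300, "gene_C": 600}
lengths = {"gene_A": 1000, "gene_B": 2000, "gene_C": 3000}


def tpm(sample_counts, gene_lengths):
    """Return transcripts per million for every gene in a single sample."""
    # Step 1: correct each gene's counts for its length.
    rates = {g: sample_counts[g] / gene_lengths[g] for g in sample_counts}
    # Step 2: normalize to the sample total of length-corrected counts.
    total_rate = sum(rates.values())
    return {g: rate / total_rate * 1_000_000 for g, rate in rates.items()}


sample_tpm = tpm(counts, lengths)
for gene, value in sample_tpm.items():
    print(gene, round(value, 1))
```

Because the length appears in both the numerator and the sum in the denominator, its unit (bases or kilobases) cancels out. TPM values within one sample always add up to a million, which is a quick sanity check for your own solution.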
More info on RNAseq counts quantification here: http://luisvalesilva.com/datasimple/rna-seq_units.html