These exercises cover the sections of Data wrangling with tidy.

All files can be found in the “dataset” directory.

 

Exercise 9

 

  1. How many Hox genes (symbols starting with HOX) and how many ATPase genes (symbols starting with ATP) are in our expressed genes list (tidy_counts_expressed_norm)?
  2. Subset the data frame to just the ATPase genes.
  3. Create a new lowercase variable that has ‘ATP’ removed from the symbol.

ANSWERS

Answer 1 (ATPase gene counts per sample first, then Hox gene counts per sample)

## $CD34_1
## [1] 15
## 
## $CD34_2
## [1] 15
## 
## $ORTHO_1
## [1] 15
## 
## $ORTHO_2
## [1] 15
## Warning: Unknown or uninitialised column: 'HOX'.

## Warning: Unknown or uninitialised column: 'HOX'.

## Warning: Unknown or uninitialised column: 'HOX'.

## Warning: Unknown or uninitialised column: 'HOX'.
## $CD34_1
## [1] 0
## 
## $CD34_2
## [1] 0
## 
## $ORTHO_1
## [1] 0
## 
## $ORTHO_2
## [1] 0
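
The code behind this output is not shown, so the following is only a minimal sketch of one way to count genes by symbol prefix per sample. It assumes dplyr and stringr are installed and uses a small made-up tibble, toy_counts, as a hypothetical stand-in for tidy_counts_expressed_norm. It uses summarise() instead of whatever produced the list-style output above. The "Unknown or uninitialised column: 'HOX'" warnings are what a tibble emits when a column named HOX is requested with $, so the zero counts above may not reflect a pattern match on SYMBOL.

```r
library(dplyr)
library(stringr)

# Hypothetical toy data standing in for tidy_counts_expressed_norm
# (symbols and counts are made up for illustration only)
toy_counts <- tibble(
  Sample = rep(c("CD34_1", "ORTHO_1"), each = 4),
  SYMBOL = rep(c("ATP1A1", "ATP1A2", "HOXA9", "GATA1"), times = 2),
  counts = c(10L, 3L, 7L, 50L, 12L, 1L, 0L, 80L)
)

# For each sample, count the symbols that start with each prefix.
# "^" anchors the match to the start of the symbol.
gene_family_counts <- toy_counts %>%
  group_by(Sample) %>%
  summarise(
    n_ATP = sum(str_detect(SYMBOL, "^ATP")),
    n_HOX = sum(str_detect(SYMBOL, "^HOX"))
  )

print(gene_family_counts)
```

Matching on the SYMBOL column with str_detect() avoids any lookup of a column called HOX, which is what the warnings above point to.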

Answer 2

## # A tibble: 60 x 11
## # Groups:   Sample [4]
##    ENTREZ Sample  CellType Rep   counts count_total     CPM SYMBOL CHR   LENGTH     TPM
##    <chr>  <chr>   <chr>    <chr>  <int>       <int>   <dbl> <chr>  <chr>  <int>   <dbl>
##  1 476    CD34_1  CD34     1       4952       14023 33065.  ATP1A1 chr1    5912 15132. 
##  2 476    ORTHO_1 ORTHO    1       3453       14023 41990.  ATP1A1 chr1    5912 18205. 
##  3 476    CD34_2  CD34     2       4202       14023 39760.  ATP1A1 chr1    5912 16803. 
##  4 476    ORTHO_2 ORTHO    2       1416       14023 43033.  ATP1A1 chr1    5912 19342. 
##  5 477    CD34_1  CD34     1         13          44    86.8 ATP1A2 chr1    5972    39.3
##  6 477    ORTHO_1 ORTHO    1          3          44    36.5 ATP1A2 chr1    5972    15.7
##  7 477    CD34_2  CD34     2         26          44   246.  ATP1A2 chr1    5972   103. 
##  8 477    ORTHO_2 ORTHO    2          2          44    60.8 ATP1A2 chr1    5972    27.0
##  9 478    CD34_1  CD34     1         78         266   521.  ATP1A3 chr19   4054   348. 
## 10 478    ORTHO_1 ORTHO    1        121         266  1471.  ATP1A3 chr19   4054   930. 
## # … with 50 more rows
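
A subset like the one above could be produced along the following lines. This is a sketch under assumptions: the tibble toy_counts is a made-up, hypothetical stand-in for tidy_counts_expressed_norm, and dplyr and stringr are assumed to be installed.

```r
library(dplyr)
library(stringr)

# Hypothetical toy data standing in for tidy_counts_expressed_norm
toy_counts <- tibble(
  Sample = rep(c("CD34_1", "ORTHO_1"), each = 3),
  SYMBOL = rep(c("ATP1A1", "HOXA9", "GATA1"), times = 2),
  counts = c(10L, 7L, 50L, 12L, 0L, 80L)
)

# Keep only the rows whose symbol starts with "ATP"
atp_genes <- toy_counts %>%
  filter(str_detect(SYMBOL, "^ATP"))

print(atp_genes)
```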

Answer 3

## # A tibble: 60 x 12
## # Groups:   Sample [4]
##    ENTREZ Sample  CellType Rep   counts count_total     CPM SYMBOL CHR   LENGTH     TPM ATPtype
##    <chr>  <chr>   <chr>    <chr>  <int>       <int>   <dbl> <chr>  <chr>  <int>   <dbl> <chr>  
##  1 476    CD34_1  CD34     1       4952       14023 33065.  ATP1A1 chr1    5912 15132.  1a1    
##  2 476    ORTHO_1 ORTHO    1       3453       14023 41990.  ATP1A1 chr1    5912 18205.  1a1    
##  3 476    CD34_2  CD34     2       4202       14023 39760.  ATP1A1 chr1    5912 16803.  1a1    
##  4 476    ORTHO_2 ORTHO    2       1416       14023 43033.  ATP1A1 chr1    5912 19342.  1a1    
##  5 477    CD34_1  CD34     1         13          44    86.8 ATP1A2 chr1    5972    39.3 1a2    
##  6 477    ORTHO_1 ORTHO    1          3          44    36.5 ATP1A2 chr1    5972    15.7 1a2    
##  7 477    CD34_2  CD34     2         26          44   246.  ATP1A2 chr1    5972   103.  1a2    
##  8 477    ORTHO_2 ORTHO    2          2          44    60.8 ATP1A2 chr1    5972    27.0 1a2    
##  9 478    CD34_1  CD34     1         78         266   521.  ATP1A3 chr19   4054   348.  1a3    
## 10 478    ORTHO_1 ORTHO    1        121         266  1471.  ATP1A3 chr19   4054   930.  1a3    
## # … with 50 more rows
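
The ATPtype column above could be derived roughly as follows. This is a sketch, not necessarily the code used for the answer: toy_atp is a hypothetical three-row stand-in for the ATPase subset, and dplyr and stringr are assumed to be installed.

```r
library(dplyr)
library(stringr)

# Hypothetical toy data standing in for the ATPase subset from Answer 2
toy_atp <- tibble(
  Sample = c("CD34_1", "CD34_1", "CD34_1"),
  SYMBOL = c("ATP1A1", "ATP1A2", "ATP1A3")
)

# Strip the leading "ATP" from the symbol, then lowercase what remains
atp_types <- toy_atp %>%
  mutate(ATPtype = str_to_lower(str_remove(SYMBOL, "^ATP")))

print(atp_types)
```

Anchoring the pattern with "^" removes only a leading "ATP", so any later occurrence inside a symbol would be left untouched.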