These exercises cover logical operators, reading and writing files, and scripts from Session 2.
Exercise 1 - Reading/Writing
import numpy as np
geneExpression = np.genfromtxt("data/GeneExpression.txt", delimiter="\t", skip_header=True)
See the skip_header argument in the help page.
geneExpression = np.genfromtxt("data/GeneExpressionWithMethods.txt", delimiter="\t", skip_header=4)
geneExpression
## array([[ 5.74251 , 3.214303 , 4.11682 , 3.212353 , 5.742333 ,
## 5.9350948],
## [ 6.444368 , 5.896076 , 2.592581 , 5.089549 , 3.624812 ,
## 2.6313925],
## [ 3.083392 , 3.414723 , 3.706069 , 4.535536 , 5.104273 ,
## 5.7149521],
## [ 4.726498 , 3.023746 , 3.033173 , 8.017895 , 8.0988 ,
## 8.1964109],
## [ 9.909185 , 9.174323 , 9.957153 , 2.053501 , 3.276533 ,
## 0.7332521],
## [10.680459 , 9.951243 , 8.985412 , 3.360963 , 3.566663 ,
## 3.8519471],
## [10.516534 , 10.176163 , 9.778173 , 11.78152 , 9.005437 ,
## 11.1733928],
## [ 9.01702 , 9.342291 , 9.895636 , 12.046704 , 11.00324 ,
## 9.90325 ]])
## np.float64(4.660568966666667)
## np.float64(4.379796416666666)
## np.float64(4.259824183333333)
## np.float64(5.849420483333333)
## np.float64(5.85065785)
## np.float64(6.732781183333333)
## np.float64(10.405203300000002)
## np.float64(10.201356833333334)
## array([ 4.66056897, 4.37979642, 4.25982418, 5.84942048, 5.85065785,
## 6.73278118, 10.4052033 , 10.20135683])
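The code that produced the eight individual means and the final array is not shown here. A minimal sketch that gives the same values, assuming each row holds one gene and the mean is taken across its columns (the array is copied from the printed output above rather than read from the file):

```python
import numpy as np

# Values copied from the geneExpression array printed above (8 genes x 6 samples).
geneExpression = np.array([
    [5.74251, 3.214303, 4.11682, 3.212353, 5.742333, 5.9350948],
    [6.444368, 5.896076, 2.592581, 5.089549, 3.624812, 2.6313925],
    [3.083392, 3.414723, 3.706069, 4.535536, 5.104273, 5.7149521],
    [4.726498, 3.023746, 3.033173, 8.017895, 8.0988, 8.1964109],
    [9.909185, 9.174323, 9.957153, 2.053501, 3.276533, 0.7332521],
    [10.680459, 9.951243, 8.985412, 3.360963, 3.566663, 3.8519471],
    [10.516534, 10.176163, 9.778173, 11.78152, 9.005437, 11.1733928],
    [9.01702, 9.342291, 9.895636, 12.046704, 11.00324, 9.90325],
])

# One mean at a time, looping over the rows (genes).
for i in range(geneExpression.shape[0]):
    print(geneExpression[i].mean())

# The same means in a single call: axis=1 averages across the columns of each row.
gene_means = geneExpression.mean(axis=1)
print(gene_means)
```

The loop and the `axis=1` call give the same numbers; the second form is shorter and is the one the next step builds on.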
sub_idx = geneExpression.mean(axis=1) > 6
geneNames = np.genfromtxt("data/GeneNames.txt", delimiter="\t", dtype="U6")
geneNames_sub = geneNames[sub_idx]
np.savetxt("GeneNames_highexpression.txt", geneNames_sub, delimiter="\t", fmt='%s')
Exercise 2 - Scripts
Let's try to put together as much as possible of what we have learnt so far. This will be a multistep challenge. Break it down and use pseudocode to help. Start by working through the code interactively, then turn it into a script.
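The solution code that follows calls a `zscore` function and uses `my_mean` and `my_std`, none of which are defined in this extract. One possible sketch of those pieces, assuming a z-score of the form (value - mean) / standard deviation computed per gene; the toy array and the choice of the population standard deviation (NumPy's default, `ddof=0`) are assumptions for illustration:

```python
import numpy as np


def zscore(values, mean, std):
    """Assumed form of the helper: centre on the mean, scale by the std."""
    return (values - mean) / std


# Toy data for illustration only: 2 genes (rows) x 3 samples (columns).
toy = np.array([[1.0, 2.0, 3.0],
                [2.0, 4.0, 9.0]])

my_mean = toy.mean(axis=1)  # one mean per gene
my_std = toy.std(axis=1)    # one standard deviation per gene (ddof=0)

# z-scores for the first gene only.
first_row = zscore(toy[0], my_mean[0], my_std[0])
print(first_row)
```

With the real data, `toy` would be replaced by `geneExpression`, and the function would be called once per row.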
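# Build the z-score matrix one gene (row) at a time.
for i in range(geneExpression.shape[0]):
    geneExpression_zscore = zscore(geneExpression[i], my_mean[i], my_std[i])
    if i == 0:
        # First gene: start the result array.
        my_zscore = np.array(geneExpression_zscore)
    else:
        # Later genes: stack each new row underneath.
        my_zscore = np.vstack((my_zscore, geneExpression_zscore))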
my_abs = abs(my_zscore)
top_values = my_abs.max(axis=1)
top_value = top_values.max()
my_top_index = top_values == top_value
geneNames = np.genfromtxt("data/GeneNames.txt", delimiter="\t", dtype="U6")
geneNames_sub = geneNames[my_top_index]
print(geneNames_sub)
## ['Gene_h']
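As an alternative to building the z-score matrix row by row, the same search can be sketched with broadcasting. This is not the course's solution; it assumes the population standard deviation, and the gene labels `Gene_a` to `Gene_h` are hypothetical stand-ins for the contents of data/GeneNames.txt (only `Gene_h` appears in the output above). The expression values are copied from the printed array in Exercise 1.

```python
import numpy as np

# Values copied from the geneExpression array printed in Exercise 1.
geneExpression = np.array([
    [5.74251, 3.214303, 4.11682, 3.212353, 5.742333, 5.9350948],
    [6.444368, 5.896076, 2.592581, 5.089549, 3.624812, 2.6313925],
    [3.083392, 3.414723, 3.706069, 4.535536, 5.104273, 5.7149521],
    [4.726498, 3.023746, 3.033173, 8.017895, 8.0988, 8.1964109],
    [9.909185, 9.174323, 9.957153, 2.053501, 3.276533, 0.7332521],
    [10.680459, 9.951243, 8.985412, 3.360963, 3.566663, 3.8519471],
    [10.516534, 10.176163, 9.778173, 11.78152, 9.005437, 11.1733928],
    [9.01702, 9.342291, 9.895636, 12.046704, 11.00324, 9.90325],
])

# Hypothetical labels; the real ones are read from data/GeneNames.txt.
geneNames = np.array(["Gene_" + letter for letter in "abcdefgh"])

# keepdims=True keeps shape (8, 1) so the subtraction broadcasts across columns.
row_mean = geneExpression.mean(axis=1, keepdims=True)
row_std = geneExpression.std(axis=1, keepdims=True)  # assumption: ddof=0
my_zscore = (geneExpression - row_mean) / row_std

# Largest absolute z-score per gene, then the gene holding the overall maximum.
top_values = np.abs(my_zscore).max(axis=1)
top_gene = geneNames[top_values.argmax()]
print(top_gene)
```

Note that `argmax` returns only the first position of the maximum, whereas the boolean comparison `top_values == top_value` used above would select every gene tied for the top value.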