Standardizing an Array

One of the most common task is standardizing a vector. This can be used for example to compare expression data or to standardize ranges before applying machine learning methods.

In standard python it could be written like this:

#Suppose a list l with many elements
mean_tmp = std_tmp = 0.0
for val in l:
mean_tmp += val
std_tmp += val*val
mean = mean_tmp/len(l)
std = std_tmp/len(l) - mean*mean

standardized_l = []
for val in l:
standardized_l.append((val-mean)/std)

By using List comprehensions the task is somewhat shorter:

mean_tmp = std_tmp = 0.0
for val in l:
mean_tmp += val
std_tmp += val*val
mean = mean_tmp/len(l)
std = std_tmp/len(l) - mean*mean

standardized_l = [(val-mean)/std for val in l]

With numpy:

#The list must first be converted to a numpy array, otherwise
#no methods nor vectorized operations would be available
l_np = np.array(l)
standardized_l = (l_np-l_np.mean())/l_np.std()

Condividi: