Data Science Basics Requirements you should know.
How to stack your array horizontally and vertically, find unique values, split your array and some more tips to use Numpy effectively.
NumPy is arguably one of the most used libraries in data science, right alongside pandas. It’s usually among the first libraries you import in a data science project, and in my experience, it’s quite helpful to know a handful of functions by heart so you can quickly modify your arrays with the least amount of Googling.
import numpy as np
So, spin up your Jupyter notebook as well as your favourite note-taking app, and jot down these functions as you read this article. They will come in handy in most of your future projects.
Let’s get going! 👇
Making special arrays
This is a category in which I include three numpy functions:
- Make an array of zeroes:
np.zeros((2, 3)) # make a 2x3 matrix of zeroes
- Make an array of ones:
np.ones((2,3))
- Make an identity matrix (a square array with ones on the diagonal and zeroes elsewhere):
# identity matrix
np.eye(3, 3, dtype=int)
[Image: special numpy array functions]
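To see all three in one place, here is a minimal sketch. The variable names are my own and the shapes are only examples, so swap in whatever you need:

```python
import numpy as np

# 2x3 array of zeroes; the default dtype is float64
zeros = np.zeros((2, 3))

# 2x3 array of ones; dtype is optional, here we ask for integers
ones = np.ones((2, 3), dtype=int)

# 3x3 identity matrix: ones on the diagonal, zeroes elsewhere
identity = np.eye(3, dtype=int)

print(zeros)
print(ones)
print(identity)
```

Note that np.eye also accepts a second argument for the number of columns, which is why np.eye(3, 3, dtype=int) and np.eye(3, dtype=int) give the same result.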
Finding unique values in an array
To find and get a list of all the unique values in an array, you only need to remember a simple function:
np.unique(arr) # get unique values in arr
[Image: unique values in a numpy array]
Quite handy, isn’t it?
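Here is a small sketch of it in action. The array below is made up for illustration, and the return_counts flag is an optional extra that also tells you how often each value occurs:

```python
import numpy as np

# Hypothetical example data with repeated values
arr = np.array([3, 1, 2, 3, 1, 3])

# np.unique returns the distinct values in sorted order;
# return_counts=True also returns how many times each one appears
values, counts = np.unique(arr, return_counts=True)

print(values)  # [1 2 3]
print(counts)  # [2 1 3]
```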
Splitting an array vertically and horizontally
Splitting an array into two equal parts can be useful for randomly sampling some part of the data you’re working with.
We define a simple array first:
[Image: array definition]
Then, we split it:
np.hsplit(arr, indices_or_sections) # an int for equal parts, or a tuple of column indices to split at
[Image: numpy array splitting]
Similarly, you can split vertically too via the vsplit function.
np.vsplit(arr, 5)
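Since the original array only lives in the screenshot, here is a rough sketch with a 5x4 array of my own (an assumption, not the one from the image). Keep in mind that splitting into equal parts only works when the number of rows or columns divides evenly:

```python
import numpy as np

# Hypothetical 5x4 array holding the numbers 0..19
arr = np.arange(20).reshape(5, 4)

# Split column-wise into two equal halves (4 columns -> 2 + 2)
left, right = np.hsplit(arr, 2)

# Split column-wise at a given index: columns [0:1] and [1:]
first_col, rest = np.hsplit(arr, (1,))

# Split row-wise into five parts (5 rows -> five 1x4 arrays)
rows = np.vsplit(arr, 5)

print(left.shape, right.shape)      # (5, 2) (5, 2)
print(first_col.shape, rest.shape)  # (5, 1) (5, 3)
print(len(rows), rows[0].shape)     # 5 (1, 4)
```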
Finding minimum and maximum
Consider an array such as:
[Image: sample numpy array]
Then, we can get the max and min values via:
[Image: numpy arrays max, min values]
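If you pass axis=0, the max and min are computed down the rows, giving one value per column. With axis=1, they are computed across the columns, giving one value per row. With no axis argument at all, you get a single max or min for the whole array.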
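The axis argument is easy to mix up, so here is a quick sketch with a small 2x3 array of my own (not the one from the screenshot) showing what each choice returns:

```python
import numpy as np

# Hypothetical 2x3 array
arr = np.array([[4, 9, 2],
                [7, 1, 8]])

# No axis: a single value for the whole array
print(arr.max())  # 9
print(arr.min())  # 1

# axis=0: collapse the rows, one result per column
print(arr.max(axis=0))  # [7 9 8]
print(arr.min(axis=0))  # [4 1 2]

# axis=1: collapse the columns, one result per row
print(arr.max(axis=1))  # [9 8]
print(arr.min(axis=1))  # [2 1]
```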
Stacking Arrays
Whether you want one array on top of another or one beside the other, numpy has functions for stacking two or more arrays.
Let’s define two simple arrays first:
[Image: two numpy arrays]
Then, we can stack them horizontally via:
np.hstack((arr, arr2))
[Image: horizontal stacking of arrays]
Similarly, for vertical stacking, you can do this:
np.vstack((arr, arr2))
[Image: vertical stacking of arrays]
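To try both kinds of stacking yourself, here is a minimal sketch. The two 2x2 arrays are stand-ins I made up, since the originals are only in the screenshot:

```python
import numpy as np

# Two hypothetical 2x2 arrays
arr = np.array([[1, 2],
                [3, 4]])
arr2 = np.array([[5, 6],
                 [7, 8]])

# Side by side: the row counts must match, columns are appended
side_by_side = np.hstack((arr, arr2))
print(side_by_side)
# [[1 2 5 6]
#  [3 4 7 8]]

# One on top of the other: the column counts must match, rows are appended
stacked = np.vstack((arr, arr2))
print(stacked.shape)  # (4, 2)
```

Both functions take a single tuple (or list) of arrays, which is why there are double parentheses in the call.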
Concluding…
What’s next? Should you go ahead and memorise these methods? Not really, because once you start practising them on some data in your next project, you’ll automatically come to remember them in due time. No additional effort will be needed.
Now that you know about these cool numpy functions, I recommend you go ahead and experiment with them on your own. Maybe you’ll find some neat little trick that I didn’t explore in this article? It all comes down to some practice. It can be wondrous and quite fun too!