Data Science Basics: Requirements You Should Know

Amit Kuraria
4 min read · Apr 26, 2021

How to stack your array horizontally and vertically, find unique values, split your array and some more tips to use Numpy effectively.

NumPy is arguably one of the most used libraries in data science, right alongside Pandas. It’s usually among the first libraries you import in a data science project, and in my experience, it helps to know a handful of functions well enough to modify your arrays quickly with the least amount of Googling.

import numpy as np

So, spin up your Jupyter notebook and your favourite note-taking app, and jot down these functions as you read this article. They will come in handy in most of your future projects.

Let’s get going! 👇

Making special arrays

This is a category in which I include three numpy functions:

  • Make an array of zeroes:
np.zeros((2, 3)) # make a 2x3 matrix of zeroes
  • Make an array of ones:
np.ones((2,3))

and,

  • Make an identity matrix (ones on the diagonal, zeroes everywhere else):
# identity matrix
np.eye(3, 3, dtype=int)

[Image: special NumPy array functions]
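To make the three constructors above concrete, here is a minimal sketch that builds each array and prints it. The variable names are only illustrative; the calls are the same ones listed above.

```python
import numpy as np

zeros = np.zeros((2, 3))          # 2 rows, 3 columns, float64 by default
ones = np.ones((2, 3))            # same shape, filled with 1.0
identity = np.eye(3, dtype=int)   # 3x3 identity; the column count defaults to the row count

print(zeros)
# [[0. 0. 0.]
#  [0. 0. 0.]]
print(ones)
# [[1. 1. 1.]
#  [1. 1. 1.]]
print(identity)
# [[1 0 0]
#  [0 1 0]
#  [0 0 1]]
```

Note that np.zeros and np.ones take the shape as a single tuple, while np.eye takes the number of rows and (optionally) columns as separate arguments.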

Finding unique values in an array

To find and get a list of all the unique values in an array, you only need to remember a simple function:

np.unique(arr) # get unique values in arr

[Image: unique values in a NumPy array]
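As a quick illustration, the sketch below runs np.unique on a small made-up array (the values are an assumption, not the data from the original screenshot). It also shows the optional return_counts argument, which reports how often each unique value occurs.

```python
import numpy as np

arr = np.array([3, 1, 2, 3, 1, 3])   # hypothetical sample data

values = np.unique(arr)              # unique values, returned in sorted order
print(values)                        # [1 2 3]

values, counts = np.unique(arr, return_counts=True)
print(counts)                        # [2 1 3]
```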

Quite handy, isn’t it?

Splitting an array vertically and horizontally

Splitting an array into two equal parts can be useful for randomly sampling some part of the data you’re working with.

We define a simple array first:

[Image: array definition]

Then, we split it:

np.hsplit(arr, indices) # indices: a tuple of column indices to split at, or an int for that many equal parts

[Image: NumPy array splitting]
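Since the original array is not reproduced here, the following sketch uses a made-up 5x4 array to show both ways of calling np.hsplit: with an integer (equal parts) and with a tuple of column indices.

```python
import numpy as np

arr = np.arange(20).reshape(5, 4)    # hypothetical 5x4 array holding 0..19

# An integer splits the columns into that many equal parts.
left, right = np.hsplit(arr, 2)
print(left.shape, right.shape)       # (5, 2) (5, 2)

# A tuple gives the column indices to cut at: [:, 0:1], [:, 1:3], [:, 3:]
a, b, c = np.hsplit(arr, (1, 3))
print(a.shape, b.shape, c.shape)     # (5, 1) (5, 2) (5, 1)
```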

Similarly, you can split vertically too via the vsplit function.

np.vsplit(arr, 5) # split into 5 equal parts row-wise; the number of rows must be divisible by 5
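Here is a small sketch of np.vsplit, again assuming a made-up 5x4 array so that splitting into 5 parts divides the rows evenly.

```python
import numpy as np

arr = np.arange(20).reshape(5, 4)    # hypothetical 5x4 array holding 0..19

rows = np.vsplit(arr, 5)             # a list of five arrays, one row each
print(len(rows), rows[0].shape)      # 5 (1, 4)
print(rows[0])                       # [[0 1 2 3]]

# A tuple of row indices works too: rows [0:2] and rows [2:]
top, bottom = np.vsplit(arr, (2,))
print(top.shape, bottom.shape)       # (2, 4) (3, 4)
```

If the row count is not divisible by the integer you pass, np.vsplit raises a ValueError; np.array_split is an alternative that allows uneven parts.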

Finding minimum and maximum

Consider an array such as:

[Image: sample NumPy array]

Then, we can get the max and min values via:

[Image: max and min values of a NumPy array]

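If you pass axis=0, NumPy collapses the rows and returns the max and min of each column; with axis=1, it collapses the columns and returns one value per row. Without an axis argument, you get a single value for the whole array.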
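The axis behaviour is easy to mix up, so here is a short sketch on a made-up 2x3 array (an assumption, since the original sample is not reproduced here).

```python
import numpy as np

arr = np.array([[1, 8, 3],
                [7, 2, 9]])          # hypothetical 2x3 sample

print(arr.max(), arr.min())          # 9 1   (whole array)

print(arr.max(axis=0))               # [7 8 9]   one value per column
print(arr.min(axis=0))               # [1 2 3]

print(arr.max(axis=1))               # [8 9]     one value per row
print(arr.min(axis=1))               # [1 2]
```

A handy way to remember it: the axis you name is the one that disappears from the result's shape.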

Stacking Arrays

Whether you want one array on top of another or side by side, NumPy has functions for stacking two or more arrays.

Let’s define two simple arrays first:

[Image: two NumPy arrays]

Then, we can stack them horizontally via:

np.hstack((arr, arr2))

[Image: horizontal stacking of arrays]
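For a concrete picture of horizontal stacking, the sketch below uses two made-up 2x2 arrays named arr and arr2 (the values are assumptions, not the ones from the original screenshot).

```python
import numpy as np

arr = np.array([[1, 2],
                [3, 4]])             # hypothetical 2x2 array
arr2 = np.array([[5, 6],
                 [7, 8]])            # hypothetical 2x2 array

side_by_side = np.hstack((arr, arr2))   # note the tuple: one argument, not two
print(side_by_side)
# [[1 2 5 6]
#  [3 4 7 8]]
print(side_by_side.shape)               # (2, 4)
```

For 2-D inputs, the arrays must have the same number of rows for np.hstack to work.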

Similarly, for vertical stacking, you can do this:

np.vstack((arr, arr2))

[Image: vertical stacking of arrays]
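Vertical stacking can be sketched the same way, assuming two made-up 2x2 arrays. For 2-D inputs like these, np.vstack gives the same result as np.concatenate along axis 0, which the last line checks.

```python
import numpy as np

arr = np.array([[1, 2],
                [3, 4]])             # hypothetical 2x2 array
arr2 = np.array([[5, 6],
                 [7, 8]])            # hypothetical 2x2 array

stacked = np.vstack((arr, arr2))     # arr2 goes underneath arr
print(stacked)
# [[1 2]
#  [3 4]
#  [5 6]
#  [7 8]]

same = np.array_equal(stacked, np.concatenate((arr, arr2), axis=0))
print(same)                          # True
```

Here the arrays must agree on the number of columns instead.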

Concluding…

What’s next? Should you go ahead and memorise these methods? Not really, because once you start practising them on some data in your next project, you’ll automatically come to remember them in due time. No additional effort will be needed.

Now that you know these cool NumPy functions exist, I recommend you go ahead and experiment with them on your own. Maybe you’ll find some neat little trick that I didn’t explore in this article? It all comes down to some practice. It can be wondrous and quite fun too!
