Introduction to NumPy Library (Numerical Python):

Anurag Singh Choudhary
16 min read · Sep 23, 2022

NumPy is a Python library widely used for working with arrays. It can handle large, multidimensional arrays and matrices, along with a large collection of mathematical operations for working with these arrays. The name stands for Numerical Python. NumPy provides an array object that can be up to about 50 times faster than conventional Python lists for numerical work.

An array takes up less memory and is more convenient to use than a Python list. Additionally, it has a mechanism for specifying data types. NumPy can operate on the individual elements of an array without explicit loops or list comprehensions.

Now, before we dive deep into the concept of NumPy arrays, it’s important to note that Python lists can perform all the actions that NumPy arrays perform; NumPy arrays are simply faster and more convenient for large-scale calculations, which makes them extremely useful when you’re working with large amounts of data.

NumPy provides a high-performance multidimensional array object and tools for working with those arrays. It is the base package for scientific computing with Python, and it is open-source software. Its most important features include:

  • A powerful N-dimensional array object
  • Sophisticated (broadcasting) functions
  • Tools for integrating C/C++ and Fortran code
  • Useful linear algebra, Fourier transform, and random number capabilities

Array:

An array is a data structure consisting of a collection of elements (values or variables), each identified by at least one array index or key.

The array is the central data structure of the NumPy library. An array in NumPy is called a NumPy array.

Comparison between a Python list and a NumPy array:

A Python list can contain elements with different data types, while the elements of a NumPy array are always homogeneous (same data type).

NumPy arrays are much faster and more compact than Python lists.

Why are NumPy arrays faster than lists?

1. A NumPy array stores its elements in fixed-size slots of a single data type, so it uses less memory than a Python list.

2. A NumPy array is allocated in contiguous memory, which lets operations run over it in fast compiled loops.
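To make the memory point concrete, here is a rough sketch rather than a rigorous benchmark. It assumes an explicit 64-bit integer dtype, and sys.getsizeof only approximates what a list really costs; the variable names are illustrative.

```python
import sys
import numpy as np

n = 1000
py_list = list(range(n))
np_arr = np.arange(n, dtype=np.int64)  # dtype set explicitly for a predictable size

# A list stores pointers to separate int objects, so count both parts.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

# An array stores the raw values back to back in one buffer.
array_bytes = np_arr.nbytes  # 1000 elements * 8 bytes each

print("list :", list_bytes, "bytes (approximate)")
print("array:", array_bytes, "bytes")
```

The exact list figure depends on the Python version and platform, but it should come out several times larger than the array's 8000 bytes.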

Coding Part of the NumPy For Data Science:

1. Importing NumPy Library:

NumPy Library is usually used to perform a wide variety of mathematical operations on arrays. It adds powerful data structures to Python that guarantee efficient calculations with arrays and matrices and supplies a huge library of high-level math functions that work with those arrays and matrices.

In[1] : import numpy as np
In[2] : a = [1,2,3,4,5]
In[3] : print(a)
output : [1, 2, 3, 4, 5]
In[4] : print(type(a))
output : <class 'list'>

2. ndarray object Creation:

An array object represents a multidimensional array of fixed-size items. The associated data type object describes the format of each element in the array (its byte order, how many bytes it takes up in memory, whether it’s an integer, a floating-point number, or something else, etc.)

Create a NumPy ndarray object

NumPy is used to work with arrays. The array object in NumPy is called ndarray. We can create a NumPy ndarray object using the array() function.

The basic ndarray is created using the array function in NumPy as follows:

Basic Structure: numpy.array(object, dtype = None)

NOTE: In a 2-D NumPy array, axis = 0 runs down the rows, so it operates on each column, and axis = 1 runs across the columns, so it operates on each row.

In[1] : np_a = np.array(a)
In[2] : print(np_a)
Output : [1 2 3 4 5]
In[3] : print(type(np_a))
Output : <class 'numpy.ndarray'>
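The dtype argument in the basic structure above can be illustrated with a small sketch. The variable names are made up for this example; float and np.int8 are just two of the data types NumPy accepts.

```python
import numpy as np

values = [1, 2, 3]

as_float = np.array(values, dtype=float)     # request 64-bit floats
as_small = np.array(values, dtype=np.int8)   # request 1-byte integers

# itemsize is the number of bytes each element occupies.
print(as_float.dtype, as_float.itemsize)     # float64 8
print(as_small.dtype, as_small.itemsize)     # int8 1
```

Choosing a smaller dtype saves memory, at the cost of a narrower range of representable values.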

#Timing an operation on a NumPy array

In[1] : a = np.arange(1000)
In[2] : %timeit a**2
output : 1.71 µs ± 57.8 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

Creating NumPy Arrays From a List

A list in Python is a linear data structure that can contain heterogeneous elements, does not require declaration, and can grow and shrink flexibly. An array, on the other hand, is a data structure that contains homogeneous elements; arrays are implemented in Python using the NumPy library. Arrays require less memory than lists.

The similarity between an array and a list is that the elements of both can be accessed by their index value.

In Python, a list can be converted to an array by passing it to the array() function from the NumPy library:

1-D : One-Dimensional Numpy Array

In[1] : a = np.array([0, 1, 2, 3])
In[2] : print(a)
output : [0 1 2 3]
#for printing the dimensions:
In[3] : a.ndim
output : 1
#for checking the shape of the NumPy array:
In[4] : a.shape
output : (4,)

2-D : Two-Dimensional Numpy Array:

So far we have been dealing with a 1-D array. Let’s move on to 2-D arrays. We can create a 2-D array from a list of lists.

In[1] : b = np.array([[0,1,2],[3,4,5]])
In[2] : print(b)
output :
[[0 1 2]
 [3 4 5]]
In[3] : b.ndim
output : 2
In[4] : b.shape
output : (2, 3)

# We can also convert a matrix stored as a list of lists to an array

In[1] : my_matrix = [[1,2,3],[4,5,6],[7,8,9]]
In[2] : b = np.array(my_matrix)
In[3] : print(b)
output :
[[1 2 3]
 [4 5 6]
 [7 8 9]]
#Printing the shape
In[4] : b.shape
output : (3, 3)

3-D : Three-Dimensional Numpy Array

In[1] : c = np.array([[[0, 1], [2, 3]], [[4, 5], [6, 7]]])
In[2] : c
output :
array([[[0, 1],
        [2, 3]],

       [[4, 5],
        [6, 7]]])
#Printing the shape
In[3] : c.shape
output : (2, 2, 2)
#Printing the dimensions
In[4] : c.ndim
output : 3

Functions for creating arrays:

1. arange:

arange([start,] stop[, step,][, dtype]) : Returns an array of evenly spaced values within a given interval. The interval is half-open, i.e. [start, stop). arange is the array-valued version of the built-in Python range function:

Syntax: np.arange(start,end,step)
In[1] : np.arange(10, 100, 10)
output : array([10, 20, 30, 40, 50, 60, 70, 80, 90])
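A few more calls may help show how the optional arguments and the half-open interval behave. This is only an illustrative sketch; the variable names are not from the article.

```python
import numpy as np

only_stop = np.arange(5)               # start defaults to 0 -> 0, 1, 2, 3, 4
with_step = np.arange(2, 10, 3)        # stop (10) is excluded -> 2, 5, 8
counting_down = np.arange(10, 0, -2)   # negative step -> 10, 8, 6, 4, 2
float_step = np.arange(0, 1, 0.25)     # floats work too -> 0.0, 0.25, 0.5, 0.75

for arr in (only_stop, with_step, counting_down, float_step):
    print(arr)
```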

2. linspace:

The NumPy linspace function (called as np.linspace) is a tool for creating numeric sequences. It is similar to the NumPy arange function in that it creates sequences of evenly spaced numbers stored as a NumPy array.

It returns evenly spaced numbers over a specified interval. Unlike arange, it takes the number of samples instead of the step, and the end value is included by default.

Syntax: np.linspace(start, end, num_of_samples)
In[1] : np.linspace(0, 5, 6)
output : array([0., 1., 2., 3., 4., 5.])
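The difference from arange can be sketched with the endpoint and retstep keyword arguments of np.linspace. The variable names here are illustrative.

```python
import numpy as np

# The end value is included by default.
with_end = np.linspace(0, 1, 5)                      # 0, 0.25, 0.5, 0.75, 1

# endpoint=False leaves the end value out, like arange does.
without_end = np.linspace(0, 1, 5, endpoint=False)   # 0, 0.2, 0.4, 0.6, 0.8

# retstep=True also returns the spacing between samples.
samples, step = np.linspace(0, 1, 5, retstep=True)

print(with_end)
print(without_end)
print(step)
```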

3. zeros:

The zeros() function returns a new array of the given shape and type, where the value of the element is 0.0

Generate arrays of zeros

Syntax: np.zeros(shape)

In[1] : np.zeros(3)
output : array([0., 0., 0.])
In[2] : np.zeros((3,3)) #Zero Matrix or Null Matrix
output :
array([[0., 0., 0.],
[0., 0., 0.],
[0., 0., 0.]])

4. ones:

The ones() function returns a new array of the given shape and data type where the value of every element is set to 1. It works exactly like the zeros() function, apart from the fill value.

Generate arrays of ones

Syntax: np.ones(shape)

In[1] : np.ones ((4, 3))
output :
array([[1.,1.,1.],
[1.,1.,1.] ,
[1.,1.,1.] ,
[1.,1.,1.]])

5. identity matrix:

The eye() function returns a two-dimensional array with ones (1) on the diagonal and zeros (0) elsewhere. With one argument it returns a square identity matrix; an optional second argument sets the number of columns.

Syntax: np.eye(rows, columns)
In[1] : np.eye(2)
output :
array([[1., 0.],
[0., 1.]])
In[2] : np.eye(3,2)
output :
array([[1., 0.],
[0., 1.],
[0., 0.]])

6. diagonal matrix:

The diag() function is used to construct a diagonal array or to extract a diagonal. Given a 1-D sequence, it returns a 2-D array with those values on the diagonal and zeros elsewhere. Given a 2-D array, it returns a copy of its k-th diagonal (the main diagonal by default).

Syntax: np.diag(diagonal values)

In[1] : a = np.diag([7, 9, 3, 4])
In[2] : print(a)
output :
[[7 0 0 0]
 [0 9 0 0]
 [0 0 3 0]
 [0 0 0 4]]
In[3] : np.diag(a)
output : array([7, 9, 3, 4])
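The k-th diagonal mentioned above can be selected with the second argument of np.diag. The following sketch uses an illustrative 3x3 matrix that is not part of the article's examples.

```python
import numpy as np

m = np.arange(1, 10).reshape(3, 3)   # rows: [1 2 3], [4 5 6], [7 8 9]

main = np.diag(m)          # main diagonal -> 1, 5, 9
upper = np.diag(m, k=1)    # one step above the main diagonal -> 2, 6
lower = np.diag(m, k=-1)   # one step below the main diagonal -> 4, 8

# With a 1-D input, k shifts where the values are placed.
shifted = np.diag([1, 2], k=1)   # 3x3 array with 1 and 2 above the main diagonal

print(main, upper, lower)
print(shifted)
```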

7. random.randint:

np.random.randint() is one of the functions for performing random sampling in NumPy. It returns an array of the given shape filled with random integers from low (inclusive) to high (exclusive), i.e. in the interval [low, high).

Syntax: np.random.randint(start, end, num_of_samples)

In[1] : np.random.randint(1,40,2)
output : array([35, 26])
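Because the output changes on every run, it can be useful to fix the seed while experimenting. The sketch below uses np.random.seed from the same legacy interface as the article; the seed value 0 is an arbitrary choice, and newer code may prefer np.random.default_rng instead.

```python
import numpy as np

np.random.seed(0)                       # fix the seed (0 is arbitrary)
first = np.random.randint(1, 40, 5)

np.random.seed(0)                       # same seed again
second = np.random.randint(1, 40, 5)    # reproduces the same five integers

print(first)
print(second)
print(first.min() >= 1, first.max() < 40)   # low is inclusive, high is exclusive
```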

8. random.rand:

np.random.rand() generates random numbers from a standard uniform distribution (i.e., a uniform distribution from 0 to 1) and returns those numbers as a NumPy array.

Create an array with random samples from a uniform distribution over [0, 1)

Syntax: np.random.rand(num_of_samples)
In[1] : np.random.rand(40)
output:
array([0.15968365,0.10103724,0.74335488,0.96066994,0.26934805,
0.8884551,0.31741055,0.97859335,0.61937223,0.20354284,
0.69652774,0.4224069,0.18444385,0.12748784,0.75832597,
0.27040267,0.14500459,0.32695968,0.04512633,0.51910825,
0.97725918,0.01549304,0.30444066,0.95476832,0.84277896,
0.35330099,0.04976558,0.27740822,0.83415149,0.10576513,
0.35660972,0.50923459,0.05434971,0.6417897,0.1070533 ,
0.04990192,0.78016956,0.2083521 ,0.91456974,0.03827678])

9. random.normal:

NumPy’s random.normal() function generates a sample of numbers drawn from a normal distribution, otherwise called a Gaussian distribution.

Create an array of the given shape and populate it with random samples from a normal distribution.

Syntax: np.random.normal(mean_value, std_value, num_of_samples)
In[1] : np.random.normal(10, 1, 100)
output :
array([ 9.2519856 , 9.88568656, 10.53069748, 9.60640501, 9.38029484,
8.62014416, 12.62993012, 10.05362153, 10.28989837, 10.25986058,
11.20238987, 10.00577682, 9.27764198, 9.55022348, 9.17623389,
11.52214179, 9.70926248, 10.64414598, 9.24480658, 9.32420587,
9.73279543, 9.29621687, 9.03739096, 10.98896823, 11.30835943,
11.29882984, 11.21593227, 11.02886698, 11.42913613, 8.21073102,
10.75264072, 12.14991599, 9.32070331, 9.86434776, 10.1276733 ,
10.62044623, 11.27497313, 8.82817136, 9.87249989, 9.8813644 ,
10.78280971, 10.38980641, 10.4600693 , 9.12948334, 9.31397425,
11.21950835, 9.41585927, 7.40326127, 9.9387721 , 11.34396186,
11.60145401, 11.60840649, 10.73343093, 9.46060605, 8.29029835,
9.95457239, 9.52078355, 10.1921334 , 9.49608252, 9.15529378,
10.48640243, 10.05622719, 9.73819979, 10.8924576 , 11.06789028,
10.80159032, 10.02361934, 9.55717953, 9.71773057, 11.18940336,
9.23418238, 10.81406215, 8.6335664 , 10.37060308, 11.94899351,
10.1693854 , 9.20944851, 8.2884319 , 10.85905469, 9.54971309,
8.62332921, 9.17037822, 10.33593733, 9.0067163 , 10.43022183,
9.71086052, 9.61819985, 8.80892456, 11.0464651 , 9.52446284,
11.17117767, 10.1346511 , 10.08155151, 8.94251017, 10.33008117,
9.1487683 , 11.52131465, 9.86151899, 10.16629038, 10.76639116])
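One way to sanity-check such a sample is to compare its mean and standard deviation with the values passed in. This is a sketch only: the seed and the larger sample size are assumptions made here so that the estimates are stable.

```python
import numpy as np

np.random.seed(1)                              # arbitrary seed for repeatability
samples = np.random.normal(10, 1, 100_000)     # mean 10, standard deviation 1

sample_mean = samples.mean()
sample_std = samples.std()

# Both estimates should land close to the requested 10 and 1.
print(sample_mean, sample_std)
```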

NumPy Operations:

1. Flattening:

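The flatten() function returns a copy of the given array collapsed into one dimension. The related ravel() function, used below, also returns a flattened array but avoids copying the data when it can.

In[1] : a = np.array([[1,2,3],[4,5,6]])
In[2] : a.ravel()
output : array([1, 2, 3, 4, 5, 6])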
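The practical difference between flatten() and ravel() shows up when the result is modified. A minimal sketch, assuming a contiguous array so that ravel() can return a view:

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

flat_copy = a.flatten()   # always a new, independent array
flat_view = a.ravel()     # a view of a's data here, because a is contiguous

flat_copy[0] = 100        # does not touch a
flat_view[1] = 200        # writes through to a

print(a)                                # a[0, 1] is now 200, a[0, 0] is still 1
print(np.shares_memory(a, flat_view))   # True
print(np.shares_memory(a, flat_copy))   # False
```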

2. Reshape array:

NumPy is basically used to create n-dimensional arrays. Reshaping a NumPy array means changing its shape without changing its data. The shape gives the number of elements along each dimension; by reshaping we can add or remove dimensions or change the number of elements in each dimension, as long as the total number of elements stays the same.

In[1] : arr = np.arange(25)
In[2] : print(arr)
In[3] : print(arr.shape)
output :
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24]
(25,)
In[4] : a1 = arr.reshape(5,5)
In[5] : print(a1)
# Notice the two sets of brackets in the output
output :
[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]
 [20 21 22 23 24]]
In[6] : a = np.array([[1,2,3],[4,5,6]])
In[7] : print(a)
output :
[[1 2 3]
 [4 5 6]]
In[8] : b = a.reshape(3, 2)
In[9] : print(b)
output :
[[1 2]
[3 4]
[5 6]]
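Two details of reshape are worth a short sketch: one dimension can be given as -1 so that NumPy infers it, and a shape whose total size does not match raises an error. The array and names below are illustrative.

```python
import numpy as np

arr = np.arange(12)

auto_cols = arr.reshape(3, -1)   # -1 is inferred as 4 -> shape (3, 4)
auto_rows = arr.reshape(-1, 6)   # -1 is inferred as 2 -> shape (2, 6)
print(auto_cols.shape, auto_rows.shape)

# 12 elements cannot fill a 5x5 array, so this raises ValueError.
try:
    arr.reshape(5, 5)
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print("reshape failed:", err)
```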

3. Transpose Array :

The transpose() function permutes the dimensions of an array. By default it reverses the dimensions; otherwise it permutes the axes according to the given values. The .T attribute used below is shorthand for the default behaviour.

In[1] : a = np.array([[1,2,3],[4,5,6]])
In[2] : print(a)
output :
[[1 2 3]
[4 5 6]]
In[3] : c = a.T
In[4] : print(c)
output :
[[1 4]
[2 5]
[3 6]]

4. Copy array:

The copy() function copies a NumPy array (ndarray) to another array. Called on the array you want to copy, it returns a new array with the same contents. The copy owns its data, so any changes made to the copy will not affect the original array. Plain assignment, by contrast, only creates a second name for the same array.

In[1]: a = np.array([1,2,3])
b = a
b[0] = 100
print(a)
print(b)
output :
[100 2 3]
[100 2 3]
In[2]:
a = np.array([1,2,3])
b = a.copy()
b[0] = 100
print(a)
print(b)
output :
[1 2 3]
[100 2 3]

5. sort :

In some cases we need a sorted array for calculation. For this purpose, NumPy provides a function called numpy.sort(). This function returns a sorted copy of the input array.

Sorting along an axis:

axis = 1 sorts the values within each row
axis = 0 sorts the values within each column
In[1] : a = np.array([[5, 4, 6], [2, 8, 2]])
In[2] : b = np.sort(a, axis=0)
In[3] : print(b)
output :
[[2 4 2]
 [5 8 6]]
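For comparison, the same array can be sorted along the other axis, and np.argsort gives the ordering rather than the sorted values. The sketch below reuses the article's array; the remaining names are illustrative.

```python
import numpy as np

a = np.array([[5, 4, 6], [2, 8, 2]])

by_row = np.sort(a, axis=1)        # sort inside each row -> [4 5 6] and [2 2 8]
order = np.argsort(a[0])           # indices that would sort [5 4 6] -> 1, 0, 2
descending = np.sort(a[0])[::-1]   # reverse the sorted copy -> 6, 5, 4

print(by_row)
print(order)
print(descending)
print(a)   # np.sort returned copies, so a itself is unchanged
```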

6. Add/Remove operations on a numpy array:

In[1] : a = np.array([4,7,8])
In[2] : a = np.append(a,[2,4,5])
In[3] : print('Numpy array after addition:',a)
output : Numpy array after addition: [4 7 8 2 4 5]
In[4] : a = np.delete(a,2)
In[5] : print('Numpy array after deletion of one element:',a)
output : Numpy array after deletion of one element: [4 7 2 4 5]

7. Indexing

Elements of a NumPy array are accessed by index, starting from 0; a multidimensional array takes one index per axis. An array can also be indexed with another array or sequence of integers. A slice returns a view of the original data, whereas indexing with an index array returns a copy.

In[1] : a = np.arange(3,10)
In[2] : print(a)
In[3] : print(a[5])
output :
[3 4 5 6 7 8 9]
8
In[4] : a = np.diag([1, 2, 3])
In[5] : print(a)
In[6] : print(a[2,1])
output :
[[1 0 0]
 [0 2 0]
 [0 0 3]]
0
#assigning a value
In[7] : a[2, 1] = 5
In[8] : print(a)
output :
[[1 0 0]
 [0 2 0]
 [0 5 3]]

8. Indexing with an array of integers:

An array can be indexed with a list or array of integers, each of which is an index into the given dimension. The result has the same shape as the index array and is a copy of the selected elements. When a multidimensional array is indexed with one integer array per dimension, the arrays are paired up element by element, which makes it possible, for example, to select one element from a chosen column of each row.

In[1] : a = np.arange(0, 101, 10)
In[2] : print(a)
output : [  0  10  20  30  40  50  60  70  80  90 100]
In[3] : a[[0,2,3,6,10]]
output : array([  0,  20,  30,  60, 100])
#indexing can be done with an array of integers where the same index is repeated several times:
In[4] : a[[2, 3, 2, 4, 2]]
output : array([20, 30, 20, 40, 20])
#new values can be assigned:
In[5] : a[[9, 7]] = -200
In[6] : print(a)
output : [   0   10   20   30   40   50   60 -200   80 -200  100]
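The case of one integer array per dimension can be sketched on a small 2-D array. The array and the row and column lists are illustrative, not taken from the article.

```python
import numpy as np

x = np.array([[1, 2],
              [3, 4],
              [5, 6]])

rows = [0, 1, 2]   # one row index per selected element
cols = [0, 1, 0]   # the column to take from each of those rows

# Pairs up (0, 0), (1, 1), (2, 0) and returns those three elements.
picked = x[rows, cols]
print(picked)      # [1 4 5]
```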

9. Mask indexing or Fancy indexing :

Mask indexing (also called boolean indexing) selects elements using an array of True/False values with the same shape as the array being indexed. Only the elements where the mask is True are returned. The mask is usually built with a comparison, as in the example below.

In[1] : a = np.array([15,6,2,16,4])
In[2] : a
output : array([15, 6, 2, 16, 4])
In[3] : a % 2 == 0
output : array([False, True, True, True, True])
In[4] : a[a%2==0]
output : array([ 6, 2, 16, 4])
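Masks can also be combined and used for assignment. The sketch below reuses the article's array; note that conditions are combined with & (and |) and each one needs its own parentheses. The other names are illustrative.

```python
import numpy as np

a = np.array([15, 6, 2, 16, 4])

# Even AND greater than 4.
even_and_big = a[(a % 2 == 0) & (a > 4)]   # 6, 16

# True counts as 1, so summing a mask counts the matches.
count_even = int(np.sum(a % 2 == 0))       # 4

# Assigning through a mask changes only the selected elements.
clipped = a.copy()
clipped[clipped < 5] = 0                   # 15, 6, 0, 16, 0

print(even_and_big, count_even, clipped)
```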

10. Slicing:

Slicing extracts part of a sequence using the start:stop:step notation. You can specify where the slice starts and where it ends (the end index is excluded). You can also specify a step that allows you to, for example, take only every other item.

In[1] : a = np.arange(4,18)
In[2] : print(a)
output : [ 4  5  6  7  8  9 10 11 12 13 14 15 16 17]
In[3] : a[1 : 8 : 2]
output : array([5, 7, 9, 11])
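A few more slices, including negative indices and the fact that a slice is a view on the original data, can be sketched as follows. The variable names are illustrative.

```python
import numpy as np

a = np.arange(4, 18)

last_three = a[-3:]      # negative start counts from the end -> 15, 16, 17
reversed_a = a[::-1]     # negative step walks backwards -> 17 down to 4

window = a[:3]           # a view on the first three elements
window[0] = -1           # this also changes a[0]

independent = a[:3].copy()   # an explicit copy is detached from a
independent[1] = 99          # a[1] keeps its value

print(last_three)
print(a[:3])
```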

Arithmetic, Comparison and Transcendental Operations:

1.Arithmetic operations:

NumPy applies arithmetic operators element by element, both between an array and a scalar and between two arrays of compatible shape. The following operations are supported:

  1. Addition
  2. Subtraction
  3. Multiplication
  4. Division
  5. Modulus
  6. Exponentiation
  7. Floor division
In[1] :
a = np.array( [1, 2, 3, 4]) #create an array
print(a)
print(a+1)
output :
[1 2 3 4]
[2 3 4 5]
In[2] : a**2
output : array([ 1, 4, 9, 16], dtype=int32)
In[3] : b = np.ones(4) + 1
In[4] : print(b)
output : [2. 2. 2. 2.]
In[5] : a+b
output : array([3., 4., 5., 6.])
In[6] : a-b
output : array([-1., 0., 1., 2.])
In[7] : a*b
output : array([2., 4., 6., 8.])
In[8] : a = np.array([[1,2],[3,4],[5,6]])
In[9] : b = np.array([[4,4],[4,4],[4,4]])
# adding a and b
In[10] : print(a+b)
output :
[[ 5 6]
[ 7 8]
[ 9 10]]

# Elementwise multiplication of a and b

In[11] : print(a*b)
output :
[[4 8]
[12 16]
[20 24]]

# matrix multiplication

a = np.array([[1,2,1],[3,4,3],[5,6,5]])
b = np.array([[4,4,1],[4,4,2],[4,4,3]])
print('Matrix multiplication:\n',np.matmul(a,b))
output :
Matrix multiplication:
[[16 16 8]
[40 40 20]
[64 64 32]]
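The operators from the list above that the examples do not show (division, modulus, floor division) behave the same way, and the @ operator is shorthand for np.matmul. A small sketch with illustrative names:

```python
import numpy as np

a = np.array([1, 2, 3, 4])

divided = a / 2      # true division always gives floats -> 0.5, 1.0, 1.5, 2.0
remainder = a % 2    # modulus -> 1, 0, 1, 0
floored = a // 2     # floor division -> 0, 1, 1, 2
squared = a ** 2     # exponentiation -> 1, 4, 9, 16

m = np.array([[1, 2], [3, 4]])
elementwise = m * m  # multiplies matching entries -> [[1, 4], [9, 16]]
matrix_prod = m @ m  # matrix product -> [[7, 10], [15, 22]]

print(divided, remainder, floored, squared)
print(elementwise)
print(matrix_prod)
```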

2. Comparison operations:

A comparison operator in Python, also called a relational operator, compares the values of two operands and returns True or False based on whether a condition is met. With NumPy arrays, the comparison is made element by element and the result is an array of booleans.

There are six of them: less than, greater than, less than or equal to, greater than or equal to, equal to, and not equal to.

In[12] : a = np.array([1,2,3,4])
In[13] : b = np.array([5,2,2,4])
In[14] : a == b
output : array([False, True, False, True])
In[15] : a>b
output : array([False, False, True, False])
#array-wise comparison:
In[16] :
a = np.array([1,2,3,4])
b = np.array([5,2,2,4])
c = np.array([1,2,3,4])
In[17] : np.array_equal(a, b)
output : False
In[18] : np.array_equal(a, c)
output : True

3. Transcendental operations:

Transcendental functions include logarithms, exponentials, sine and cosine. NumPy applies them to an array element by element.

In[1] : a = np.arange(0,6)
In[2] : np.sin(a)
output : array([ 0. ,0.84147098, 0.90929743, 0.14112001, -0.7568025, -0.95892427])
#log(0) evaluates to -inf, and NumPy emits a RuntimeWarning
In[3] : np.log(a)
output : array([-inf, 0. , 0.69314718, 1.09861229, 1.38629436,1.60943791])
In[4] : np.exp(a)
output : array([1., 2.71828183, 7.3890561 , 20.08553692, 54.59815003, 148.4131591 ])

Statistics:

NumPy has several useful statistical functions to find the minimum, maximum, percentile, standard deviation, variance, etc. of the elements in an array. The functions are explained as follows:

In[1] : array = np.arange(1, 11)
In[2] : print(array)
output : [ 1 2 3 4 5 6 7 8 9 10]

Finding the Minimum Value:

NumPy provides the min() function to find the minimum value in an array:

In[3] : np.min(array)
output : 1

index of minimum element:

The argmin() function returns the index of the minimum element of an array. It takes the array and, optionally, axis and out parameters; when no axis is given, the index refers to the flattened array.

In[4] : np.argmin(array)
output : 0

Finding the Maximum Value:

NumPy provides the max() function to find the maximum value in an array:

In[5] : np.max(array)
output : 10

Index of the maximum element:

NumPy’s argmax() returns the indices of the maximum elements of an array along the specified axis. It works on one-dimensional arrays and, with the axis argument, along the rows or columns of multidimensional arrays.

In[6] : np.argmax(array)
output : 9

Finding the sum of the array:

The sum() function adds up the elements of an array. If no axis is given, the sum of all elements is returned. If the axis is a tuple of ints, the sum is taken over all the given axes.

In[7] : np.sum(array)
output : 55

Finding the mean of the array:

The mean() function calculates the arithmetic mean of the elements of an array along a specified axis or multiple axes. The mean is the sum of all the values in the array divided by the total number of values.

In[8] : np.mean(array)
output : 5.5

Finding the Median of the array:

NumPy has many mathematical functions that allow you to do computational work very quickly, and median() is one of them. It computes the median of the array elements, optionally along a given axis. For an even number of elements, the median is the average of the two middle values.

In[9] : np.median(array)
output : 5.5

Finding the Variance:

The var() function calculates the variance of the elements of a NumPy array. The variance is the mean of the squared deviations from the mean.

In[10] : np.var(array)
output : 8.25

Finding the Standard Deviation:

NumPy provides a function called numpy.std(), which calculates the standard deviation of the array elements along the specified axis. The standard deviation is the square root of the variance, i.e. of the mean squared deviation from the mean.

In[11] : np.std(array)
output : 2.8722813232690143
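The definitions of variance and standard deviation given above can be checked by computing them by hand. This sketch uses the same values 1 to 10; the ddof=1 call at the end is an extra illustration of the sample variance, which divides by n - 1 instead of n.

```python
import numpy as np

data = np.arange(1, 11)

mean = data.sum() / data.size               # 5.5
variance = ((data - mean) ** 2).mean()      # mean of squared deviations -> 8.25
std = np.sqrt(variance)                     # square root of the variance

print(variance, np.var(data))   # both 8.25
print(std, np.std(data))        # both about 2.872

# Sample variance: the sum of squared deviations (82.5) divided by n - 1 (9).
sample_var = np.var(data, ddof=1)
print(sample_var)
```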

Finding the percentile:

A percentile is a measure used in statistics that indicates the value below which a given percentage of the observations in a group falls.

np.percentile() calculates the q-th percentile of the input array along the specified axis.

In[12] : np.percentile(array,25)
output : 3.25
In[13] : x = np.array([[1,1], [2,2]])
In[14] : print(x)
output :
[[1 1]
 [2 2]]

Finding the Column Sum:

Sum along the first axis (axis=0)

In[15] : x.sum(axis = 0)
output : array([3, 3])

Finding the row sum:

Sum along the second axis (axis=1)

In[16] : x.sum(axis=1)
output : array([2, 4])
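The axis argument is easier to follow on a non-square array, where the length of the result shows which dimension was collapsed. The array below is illustrative, and the same argument works for other reductions such as mean() and max().

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])   # 2 rows, 3 columns

col_sums = m.sum(axis=0)    # collapses the rows: one sum per column -> 5, 7, 9
row_sums = m.sum(axis=1)    # collapses the columns: one sum per row -> 6, 15
col_means = m.mean(axis=0)  # 2.5, 3.5, 4.5
row_max = m.max(axis=1)     # 3, 6

print(col_sums, row_sums, col_means, row_max)
```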

Conclusion to NumPy:

Take a deep breath. We have covered a lot of things in this article. You should now be familiar with using NumPy arrays and ready to incorporate them into your daily analysis tasks.

To learn more about any NumPy feature, check out the official documentation for a detailed description of each feature.

Concluding the Advantages of NumPy:

NumPy has several advantages over using Python’s basic math functions, some of which are listed here:

  • NumPy is extremely fast compared to core Python due to heavy use of C extensions.
  • Many advanced Python libraries like sklearn, SciPy, and Keras make heavy use of the NumPy library. So if you’re planning a career in data science or machine learning, NumPy is a very good tool to pick up.
  • NumPy comes with a number of built-in functions that would require a fair amount of custom code in core Python.

Thank you for reading. I hope you feel confident about NumPy Concepts. I hope you enjoyed the coding part and were able to test your knowledge about NumPy Library used in Data Science and Machine Learning. If you have any suggestions, please share them in the comment box below.

Please feel free to contact me on Linkedin.

email: anurag8200@gmail.com
