NumPy arrays are the basic building blocks for numerical computation in Python. They come from the NumPy library, an essential Python package for scientific computing. A NumPy array, also called an ndarray, is an efficient multi-dimensional (N-dimensional) array object that provides the foundation for numerical operations. This makes NumPy arrays essential to learn for data analysis, machine learning, and scientific computing.
In this detailed article, we will explore what NumPy arrays are, why they are useful, and the core features of NumPy arrays.
Before we start working with NumPy, we need to have NumPy installed on our machine. See the article on installing NumPy on Windows, Linux and Mac.
NumPy Array in Python
NumPy stands for Numerical Python and is a fundamental library for scientific computing in Python. At its core, NumPy introduces arrays, which are powerful data structures for efficient numerical operations.
A NumPy array is an N-dimensional array (ndarray) container that lets us store and manipulate large datasets efficiently. An array is a grid of values, all of the same data type, and can have any number of dimensions (1-D, 2-D, 3-D, etc.). Compared with standard Python lists, NumPy arrays have the following key properties:
- Homogeneous – All elements in a NumPy array (ndarray) are of the same data type, such as all integers or all floats. This property allows for more memory-efficient storage and faster computation.
- Multidimensional – NumPy arrays can be multi-dimensional, meaning they can have one or more axes (dimensions). That is why NumPy arrays are generally referred to as ndarray (N-dimensional array). For example:
  - A 1-D array is like a simple list => [23, 25, 38, 49].
  - A 2-D array can be thought of as a matrix, a table, or a nested list => [[21, 23], [36, 48]].
  - A 3-D array can be a stack of 2-D arrays, like a cube or a set of matrices.
- Fixed Size – The size of a NumPy array (ndarray) cannot be changed once it is created. To get a different size, we have to create a new array.
- Performance Optimization – The core of NumPy is written in C. In general, NumPy arrays are much faster than standard Python lists, and the difference becomes more significant for large numerical tasks. This is because of the optimized memory layout and the operations on that memory, which enable low-level system optimizations.
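The homogeneous and fixed-size properties above can be observed directly. The following is a minimal sketch; the variable names are only illustrative.

```python
import numpy as np

# Homogeneous: mixing integers and a float upcasts every element to float64.
mixed = np.array([1, 2, 3.5])
print(mixed.dtype)  # float64

# Fixed size: np.append does not grow the array in place,
# it returns a new array and leaves the original untouched.
original = np.array([23, 25, 38, 49])
extended = np.append(original, 50)
print(original.size, extended.size)  # 4 5
```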
Why Use NumPy Arrays
Speed and Performance
- Vectorized Operations – Vectorized operations in NumPy apply a calculation to the entire array at once, so we don’t need slow Python loops. This significantly improves execution time.
- Memory Management – NumPy array elements are stored in a contiguous block of memory. This allows all elements to be read from a single block and iterated over by simply advancing a pointer through memory. This makes NumPy arrays more efficient than standard lists, especially for large datasets.
- C Language Based Implementation – Most of the fundamental operations in NumPy are written in C. Because of this, operations run as compiled code directly on the array’s memory instead of being interpreted element by element as Python bytecode. This gives a substantial boost in performance compared to the same operation in pure Python.
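A rough way to see the speed difference is to time the same calculation as a Python loop and as a vectorized operation. This is only a sketch: the array length and repeat count are arbitrary choices, and the measured times depend on the machine, so no expected timings are shown.

```python
import timeit
import numpy as np

n = 100_000
values = list(range(n))   # plain Python list
arr = np.arange(n)        # NumPy array with the same values

def python_loop():
    # Doubles every element with an explicit Python-level loop.
    return [v * 2 for v in values]

def vectorized():
    # Doubles every element in compiled code, without a Python loop.
    return arr * 2

loop_time = timeit.timeit(python_loop, number=20)
vec_time = timeit.timeit(vectorized, number=20)
print(f"Python loop: {loop_time:.4f} s, vectorized: {vec_time:.4f} s")
```

On typical setups the vectorized version is expected to be noticeably faster, but the exact ratio varies.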
Flexibility and Convenience
- Various Operations – NumPy supports a wide variety of mathematical, statistical, and logical operations on arrays, including basic addition, subtraction, multiplication, and division, as well as indexing, slicing, reshaping, etc. This gives developers greater flexibility and convenience.
- Built-in Functions – NumPy has built-in functions like np.arange(), np.linspace(), and np.zeros() for easy creation and manipulation of arrays.
- Interoperability – NumPy arrays are compatible with a wide range of scientific computing and machine learning libraries such as Pandas, SciPy, TensorFlow, and Scikit-learn, which makes them a versatile tool for data analysis and modeling. Many data science workflows are built on NumPy arrays.
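As a quick illustration of the creation functions named in the list above, here is a short sketch with arbitrarily chosen arguments:

```python
import numpy as np

evens = np.arange(0, 10, 2)         # start, stop (exclusive), step
points = np.linspace(0.0, 1.0, 5)   # 5 evenly spaced values, both ends included
zeros = np.zeros((2, 3))            # 2x3 array filled with 0.0

print(evens)        # [0 2 4 6 8]
print(points)       # [0.   0.25 0.5  0.75 1.  ]
print(zeros.shape)  # (2, 3)
```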
Key Features of NumPy Arrays
NumPy arrays (ndarray) are fundamental to numerical computing in Python. They provide an efficient and flexible way to work with large datasets and perform advanced mathematical operations. The core features of NumPy arrays are:
Homogeneous Data Type
All elements in a NumPy array must be of the same data type (e.g., integers, floats, etc.). This homogeneity allows NumPy to optimize memory usage and computation. Unlike Python lists, which can hold different data types in a single list, NumPy arrays store elements efficiently, resulting in faster operations and lower memory overhead.
# Homogeneous Data Type
import numpy as np
integer_array = np.array([14, 22, 35, 45]) # All integers
string_array = np.array(["shbytes.com", "NumPy", "Python", "Power BI"]) # All strings
boolean_array = np.array([False, True, True, True]) # All booleans
print(integer_array) # [14 22 35 45]
print(string_array) # ['shbytes.com' 'NumPy' 'Python' 'Power BI']
print(boolean_array) # [False True True True]
Multidimensional (N-dimensional) Structure
NumPy arrays can have any number of dimensions. A 1-D array is like a simple list, a 2-D array is a matrix, a 3-D array is often called a tensor, and so on. This flexibility allows NumPy to represent complex data structures such as matrices, images, or even multi-dimensional grids efficiently.
# Multidimensional (N-dimensional) Structure
import numpy as np
arr1d = np.array([14, 22, 35, 45]) # 1-D array
arr2d = np.array([[14, 22], [35, 45]]) # 2-D array (matrix)
arr3d = np.array([[[14, 22], [35, 45]], [[54, 64], [74, 68]]]) # 3-D array (tensor)
Shape and Size of Arrays
The shape of an array represents its dimensions, while its size indicates the total number of elements.
- Shape – A tuple representing the array dimensions.
- Size – The total number of elements in the array.
# Shape and Size of Arrays
import numpy as np
arr = np.array([[12, 22, 34], [46, 65, 86]])
print(arr.shape) # Output: (2, 3) (2 rows and 3 columns)
print(arr.size) # Output: 6 (2 * 3)
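The same attributes work for higher-dimensional arrays. The sketch below reuses the 3-D array from the previous section and also prints ndim, the number of axes, to show how shape and size relate.

```python
import numpy as np

arr3d = np.array([[[14, 22], [35, 45]], [[54, 64], [74, 68]]])

print(arr3d.ndim)   # 3 (number of axes)
print(arr3d.shape)  # (2, 2, 2)
print(arr3d.size)   # 8 (2 * 2 * 2)
print(len(arr3d))   # 2 (length of the first axis only)

# size is always the product of the entries in shape
size_matches = arr3d.size == np.prod(arr3d.shape)
print(size_matches)  # True
```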
Array Indexing and Slicing
NumPy arrays support indexing and slicing to access elements or subarrays. Indexing and slicing of 1-D NumPy arrays work much like they do for lists and tuples in Python.
# Array Indexing and Slicing
import numpy as np
arr = np.array([14, 22, 35, 45])
print(arr[2]) # Access the third element (indexing starts at 0) => 35
print(arr[1:3]) # Slice elements from index 1 to 2 (not inclusive of 3) => [22 35]
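Two points go beyond list behaviour and are worth a small sketch: multi-dimensional arrays take one index per axis, and a basic slice is a view on the original data instead of a copy. The array values below are arbitrary.

```python
import numpy as np

grid = np.array([[14, 22, 35], [45, 54, 64]])
print(grid[1, 2])  # row 1, column 2 => 64
print(grid[:, 0])  # first column => [14 45]

arr = np.array([14, 22, 35, 45])
view = arr[1:3]    # a view: shares memory with arr
view[0] = 99
print(arr)         # [14 99 35 45] (the original changed too)

independent = arr[1:3].copy()  # an explicit copy owns its data
independent[0] = 0
print(arr)         # [14 99 35 45] (unchanged this time)
```

Slicing a Python list always creates a new list, so the in-place change through a view is specific to NumPy arrays.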
Element-wise Operations
NumPy supports element-wise operations on arrays, which means that operations are applied to each element of the array individually without explicit loops.
# Element-wise Operations
import numpy as np
arr_1 = np.array([12, 22, 34])
arr_2 = np.array([46, 65, 86])
result = arr_1 + arr_2 # Element-wise addition
print(result) # Output: [ 58  87 120]
Broadcasting
Broadcasting allows NumPy to perform arithmetic operations on arrays of different shapes: where the shapes are compatible, the smaller array is “broadcast” across the larger one so that both behave as if they had the same shape. This eliminates the need for explicit loops and improves performance.
# Broadcasting
import numpy as np
arr_1 = np.array([12, 22, 34]) # 1-D array with shape (3,) (3 elements)
arr_2 = np.array([[46], [65], [86]]) # 2-D array with shape (3, 1) (3 rows, 1 column)
result = arr_1 + arr_2 # Broadcasting: both arrays are stretched to shape (3, 3) before adding
print(result)
# Output:
# [[ 58  68  80]
#  [ 77  87  99]
#  [ 98 108 120]]
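To make the stretching explicit, the sketch below uses np.broadcast_to to expand both operands to the common shape by hand, and then shows what happens when two shapes are not compatible. The variable names are illustrative.

```python
import numpy as np

row = np.array([12, 22, 34])        # shape (3,)
col = np.array([[46], [65], [86]])  # shape (3, 1)

# Expand both operands to the common shape (3, 3) explicitly.
row_stretched = np.broadcast_to(row, (3, 3))
col_stretched = np.broadcast_to(col, (3, 3))
same = np.array_equal(row + col, row_stretched + col_stretched)
print(same)  # True

# Shapes (3,) and (2,) are not compatible, so NumPy raises ValueError.
try:
    np.array([1, 2, 3]) + np.array([1, 2])
    broadcast_failed = False
except ValueError as exc:
    broadcast_failed = True
    print("Cannot broadcast:", exc)
```

Two dimensions are compatible when they are equal or when one of them is 1, compared from the trailing dimension backwards.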
Reshaping Arrays
You can reshape arrays into different shapes, as long as the total number of elements remains the same. This allows you to change the layout of the array to suit different use cases.
# Reshaping Arrays
import numpy as np
arr = np.array([12, 22, 34, 46, 65, 86]) # 1-D array with shape (6,) (6 elements)
reshaped_arr = arr.reshape((2, 3)) # Reshape into 2 rows and 3 columns
print(reshaped_arr)
# Output:
# [[12 22 34]
# [46 65 86]]
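Two related behaviours are sketched below: passing -1 lets NumPy infer one dimension from the total number of elements, and a target shape with a different element count raises an error.

```python
import numpy as np

arr = np.array([12, 22, 34, 46, 65, 86])

# -1 asks NumPy to work out the missing dimension: 6 elements / 3 rows = 2 columns.
auto = arr.reshape(3, -1)
print(auto.shape)  # (3, 2)

# 4 * 2 = 8 elements are needed, but only 6 exist, so this fails.
try:
    arr.reshape(4, 2)
    reshape_failed = False
except ValueError as exc:
    reshape_failed = True
    print("Cannot reshape:", exc)
```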
Universal Functions (ufuncs)
NumPy provides a large set of “universal functions” (ufuncs) that operate element-wise on arrays. These functions perform mathematical operations like square root, sine, logarithm, etc.
# Universal Functions (ufuncs)
import numpy as np
arr = np.array([41, 44, 29, 16])
sqrt_arr = np.sqrt(arr) # Apply square root to each element
print(sqrt_arr) # Output: [6.40312424 6.63324958 5.38516481 4.        ]
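Ufuncs are regular objects of type np.ufunc, and some take two inputs or offer reduction methods. A brief sketch, reusing the array from the example above:

```python
import numpy as np

arr = np.array([41, 44, 29, 16])

print(isinstance(np.sqrt, np.ufunc))  # True

# A two-input ufunc: element-wise maximum of each value and 30.
clipped = np.maximum(arr, 30)
print(clipped)  # [41 44 30 30]

# Binary ufuncs provide reduce(); for np.add this sums the array.
total = np.add.reduce(arr)
print(total)  # 130
```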
Array Concatenation and Splitting
You can concatenate or split NumPy arrays along any axis.
- Concatenation: Joining two or more arrays along an existing axis.
# Concatenation
import numpy as np
arr_1 = np.array([15, 52])
arr_2 = np.array([23, 44])
result = np.concatenate((arr_1, arr_2)) # Concatenate along the 0-axis (1-D arrays)
print(result) # Output: [15 52 23 44]
- Splitting: Dividing an array into multiple sub-arrays.
# Splitting
import numpy as np
arr = np.array([41, 44, 29, 16, 15, 52])
result = np.split(arr, 3) # Split into 3 sub-arrays
print(result) # Output: [array([41, 44]), array([29, 16]), array([15, 52])]
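For 2-D arrays the axis argument decides the direction of the join, and np.array_split can be used when the array does not divide evenly (np.split raises an error in that case). The arrays below are arbitrary examples.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

stacked_rows = np.concatenate((a, b), axis=0)  # b placed below a
stacked_cols = np.concatenate((a, b), axis=1)  # b placed to the right of a
print(stacked_rows.shape)  # (4, 2)
print(stacked_cols.shape)  # (2, 4)

# 7 elements cannot be split into 3 equal parts; array_split allows uneven parts.
parts = np.array_split(np.arange(7), 3)
print([p.tolist() for p in parts])  # [[0, 1, 2], [3, 4], [5, 6]]
```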
Memory Efficiency
NumPy arrays are more memory efficient than Python lists. The elements in a NumPy array are stored in contiguous blocks of memory, which allows for faster operations and more efficient memory usage.
# Memory Efficiency
import numpy as np
arr = np.array([41, 44, 29, 16, 15, 52], dtype=np.int32)
print(arr.nbytes) # Output: 24 (6 integers * 4 bytes per int32)
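For a rough comparison with a list, the sketch below adds up the size of a list object and of the integer objects it refers to. It assumes CPython, where sys.getsizeof reports per-object sizes; the exact list figure depends on the interpreter and platform, so only the comparison is shown.

```python
import sys
import numpy as np

values = list(range(1000))
arr = np.array(values, dtype=np.int32)

# List cost: the list object itself plus every int object it points to.
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

print(arr.itemsize)             # 4 bytes per int32 element
print(arr.nbytes)               # 4000 (1000 elements * 4 bytes)
print(list_bytes > arr.nbytes)  # True on CPython
```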
Vectorization
NumPy supports vectorized operations, which means you can perform operations on entire arrays at once without writing explicit loops. This results in cleaner code and often better performance compared to iterating over arrays element-by-element.
# Vectorization
import numpy as np
arr = np.array([41, 44, 29, 16, 15, 52])
result = arr * 2 # Multiply each element by 2
print(result) # Output: [ 82  88  58  32  30 104]
Random Number Generation
NumPy provides a module (numpy.random) for generating random numbers and random sampling from various distributions (e.g., uniform, normal).
# Random Number Generation
import numpy as np
random_arr = np.random.rand(2, 3) # Generate a 2x3 array of random floats in [0, 1)
print(random_arr)
# Output => Can be different on next run
# [[0.328437 0.31566905 0.36429999]
# [0.03383582 0.72987953 0.33411073]]
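When results need to be repeatable, a seeded generator can be used. The sketch below uses np.random.default_rng, the generator interface recommended in current NumPy documentation; the seed value 42 is an arbitrary choice, and no sample values are shown because they depend on the NumPy version.

```python
import numpy as np

rng = np.random.default_rng(seed=42)
uniform = rng.random((2, 3))                     # floats in [0, 1)
normal = rng.normal(loc=0.0, scale=1.0, size=5)  # standard normal samples

# A second generator with the same seed reproduces the same numbers.
rng_again = np.random.default_rng(seed=42)
reproducible = np.array_equal(uniform, rng_again.random((2, 3)))
print(reproducible)  # True
```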
Linear Algebra Functions
NumPy includes a comprehensive set of linear algebra operations such as matrix multiplication, dot product, matrix inverse, eigenvalues, etc.
# Linear Algebra Functions
import numpy as np
A = np.array([[6, 2], [3, 4]])
B = np.array([[5, 4], [7, 8]])
result = np.dot(A, B) # Matrix multiplication
print(result)
# Output:
# [[44 40]
# [43 44]]
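A few of the other operations mentioned above can be sketched with the same matrices. The @ operator is equivalent to np.dot for 2-D arrays, and the determinant and inverse come from the numpy.linalg module.

```python
import numpy as np

A = np.array([[6, 2], [3, 4]])
B = np.array([[5, 4], [7, 8]])

product = A @ B           # matrix multiplication, same result as np.dot(A, B)
det_A = np.linalg.det(A)  # 6*4 - 2*3 = 18 (up to floating-point rounding)
A_inv = np.linalg.inv(A)  # inverse exists because the determinant is non-zero

# Multiplying a matrix by its inverse gives the identity matrix.
is_identity = np.allclose(A @ A_inv, np.eye(2))
print(is_identity)  # True
```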
Universal Mathematical Functions
NumPy includes a wide variety of mathematical functions for performing operations on arrays, such as sin(), cos(), log(), exp(), etc.
# Universal Mathematical Functions
import numpy as np
arr = np.array([12, 22, 32])
log_arr = np.log(arr) # Natural logarithm of each element
print(log_arr) # Output: [2.48490665 3.09104245 3.4657359 ]
Conclusion
NumPy arrays are one of the most versatile and fundamental building blocks for numerical computing in Python. By supporting multi-dimensional data, using memory efficiently, allowing advanced operations such as broadcasting and vectorization, and providing a wide range of mathematical functions, NumPy arrays are the basis for more complex operations in data science, machine learning, and scientific computing. Mastering these core features will allow you to efficiently work with large datasets and perform high-performance numerical calculations.
Code snippets and programs related to NumPy Array in Python can be accessed from the GitHub repository. This repository also contains programs related to other topics in the NumPy tutorial.