NumPy Arrays vs. PyTorch Tensors

An explanation of the difference between NumPy arrays and PyTorch tensors

Seongbin Park


July 30, 2022

Both NumPy arrays and PyTorch tensors can be viewed as multidimensional tables of data. By using them, we can speed up computations by many thousands of times compared to using pure Python. This post will clarify the difference between the two.

I will be using the fastai library for this post; the code will be partially based on chapter 4 of the fastai book.

required libraries
! [ -e /content ] && pip install -Uqq fastbook
import fastbook

from fastai.vision.all import *
from fastbook import *

matplotlib.rc('image', cmap='Greys')


NumPy Arrays

All items in a NumPy array must be of the same type. Arrays are mutable, which means that we can change the values of each item in the array.

The innermost arrays of a multidimensional array can have varying sizes; this is called a “jagged array.” In recent NumPy versions, such an array has to be created explicitly with dtype=object.
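A minimal sketch of both properties, assuming plain `import numpy as np` rather than the fastai star imports used in this post. The variable names `jagged` and `regular` are only for illustration.

```python
import numpy as np

# Rows of different lengths are stored as Python objects,
# so the dtype is given explicitly as object.
jagged = np.array([[1, 2, 3], [4, 5]], dtype=object)
print(jagged.shape)   # one dimension holding two list objects
print(jagged.dtype)

# A regular array has a full rectangular shape and a numeric dtype.
regular = np.array([[1, 2, 3], [4, 5, 6]])
print(regular.shape)

# Arrays are mutable: assign to an element in place.
regular[0, 0] = 10
print(regular)
```

A jagged array loses most of NumPy's speed advantage, because each row is an ordinary Python list rather than a block of numbers.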

Most functions supported by NumPy arrays are supported by PyTorch tensors.

PyTorch Tensors

All items in a PyTorch tensor must also be of the same type, but it has the additional restriction that the type has to be a single basic numeric type. Like arrays, PyTorch tensors are mutable: we can change the values of individual items in the tensor, either by assignment or with in-place operations.
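A short sketch to check the single-type rule and to see what element assignment does in practice, assuming plain `import torch`. The names `t` and `mixed` are only for illustration.

```python
import torch

# Python ints produce an integer tensor.
t = torch.tensor([[1, 2, 3], [4, 5, 6]])
print(t.dtype)

# Mixing ints and floats promotes everything to one floating-point dtype.
mixed = torch.tensor([1, 2.5])
print(mixed.dtype)

# Assign to a single element, then add 1 to every element in place.
# In PyTorch, a trailing underscore marks an in-place operation.
t[0, 1] = 20
t.add_(1)
print(t)
```

Both the assignment and `add_` modify `t` itself rather than returning a modified copy.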

Unlike arrays, PyTorch tensors cannot be jagged: every row must have the same length. They can also live on the GPU, which is optimized for parallel computation, so operations on large amounts of data typically run much faster there than on a CPU. Additionally, PyTorch can automatically calculate derivatives of the operations performed on tensors.
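The three points above can be sketched as follows, assuming plain `import torch`. The code falls back to the CPU when no GPU is visible, and the function y = sum(x²) is just an arbitrary example for the derivative.

```python
import torch

# Use the GPU when PyTorch can see one, otherwise stay on the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.tensor([1.0, 2.0, 3.0], device=device, requires_grad=True)

# y = sum(x^2), so the derivative with respect to each x is 2x.
y = (x ** 2).sum()
y.backward()
print(x.grad)

# Rows of unequal length are rejected instead of becoming a jagged tensor.
try:
    torch.tensor([[1, 2, 3], [4, 5]])
    jagged_ok = True
except (ValueError, TypeError):
    jagged_ok = False
print(jagged_ok)
```

Setting `requires_grad=True` is what tells PyTorch to track operations on `x` so that `backward()` can fill in `x.grad`.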

Creating Arrays and Tensors

Creating NumPy arrays and PyTorch tensors is very similar:

data = [[1,2,3],[4,5,6]]
arr = array (data)
tns = tensor(data)
array([[1, 2, 3],
       [4, 5, 6]])
tensor([[1, 2, 3],
        [4, 5, 6]])
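The `array` and `tensor` functions above are available because of the fastai star imports. As a sketch of the same step without fastai, assuming only `numpy` and `torch` are installed:

```python
import numpy as np
import torch

data = [[1, 2, 3], [4, 5, 6]]

# np.array and torch.tensor both copy the nested list into a 2x3 structure.
arr = np.array(data)
tns = torch.tensor(data)

# Both expose their dimensions and element type in the same way.
print(arr.shape, arr.dtype)
print(tns.shape, tns.dtype)
```

The exact integer dtype NumPy picks can depend on the platform, so it is worth printing rather than assuming.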

Performing Operations

Operations on arrays and tensors mostly use the same syntax. For example, adding 1 to every element of tns and then of arr gives:

tensor([[2, 3, 4],
        [5, 6, 7]])
array([[2, 3, 4],
       [5, 6, 7]])
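A few more operations that share the same syntax, as a sketch assuming plain `numpy` and `torch` imports. The variable names are only for illustration.

```python
import numpy as np
import torch

data = [[1, 2, 3], [4, 5, 6]]
arr = np.array(data)
tns = torch.tensor(data)

# Elementwise arithmetic applies to every item.
arr_plus = arr + 1
tns_plus = tns + 1

# Slicing: take the second column of each.
arr_col = arr[:, 1]
tns_col = tns[:, 1]

# Reductions: sum over all items.
arr_total = arr.sum()
tns_total = tns.sum()

print(arr_plus.tolist() == tns_plus.tolist())
print(arr_col.tolist(), tns_col.tolist())
print(int(arr_total), tns_total.item())
```

Note that reductions on a tensor return a zero-dimensional tensor, so `.item()` is used to get a plain Python number.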

Refer to the PyTorch documentation for more operations. Happy learning!