import numpy as np
The arrays we have worked with up to this point have all been one-dimensional arrays which consist of a sequence of numbers in a linear order. Numpy provides us with tools for creating and working with higher dimensional arrays. In this lesson, we will work exclusively with 2D arrays, which consist of several values arranged into ordered rows and columns.
You can create a two-dimensional array by applying np.array() to a list of lists, as long as the sublists are all of the same size and contain elements of a single data type.
list_of_lists = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
array1 = np.array(list_of_lists)
print(array1)
To access elements within a 2D array, we use the familiar square bracket notation for indexing. Within the brackets, we need to supply two numbers separated by a comma. The first number indexes a row in the array and the second number indexes a column. As with lists and 1D arrays, indexing starts at 0 and negative indices count from the end.
print(array1[1,2])
print(array1[2,-1])
We can also use slicing with 2D arrays. The code in the following cell slices out the first two rows and first three columns of array1.
print(array1[:2, :3])
If we would like to select all rows or all columns when slicing, then we can use a colon by itself in the relevant location within the brackets.
The code in the following cell prints the full array for reference, followed by rows 1-2 (the second and third rows) and all columns of array1.
print(array1)
print(array1[1:, :])
We can use array slicing to select individual rows of an array.
print(array1[1, :])
We can also select out individual columns of an array.
print(array1[:,3])
Notice that the array that is printed above is not displayed in the form of a column. In fact, when slicing a single row or a column out of a 2D array, the result is returned as a simple 1D array.
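If the 2D structure needs to be preserved, one option is to select the row or column with a slice of length one instead of a single integer. The sketch below rebuilds the same array1 used above; the names col_1d, col_2d, and row_2d are introduced here only for illustration.

```python
import numpy as np

# Same 3x4 array as in the lesson.
array1 = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

col_1d = array1[:, 3]     # integer index: the column axis is dropped
col_2d = array1[:, 3:4]   # length-one slice: the column axis is kept
row_2d = array1[1:2, :]   # same idea for a single row

print(col_1d.shape)  # (3,)
print(col_2d.shape)  # (3, 1)
print(row_2d.shape)  # (1, 4)
```

The values selected are the same in both cases; only the shape of the result differs.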
Every Numpy array comes equipped with a shape attribute that we can use to determine the shape of the array. This can be accessed by appending .shape to the end of the name of a variable containing the array.
print(array1.shape)
One dimensional arrays have only a single number displayed for their shape.
array_1d = np.array([5, 7, 1, 8, 4])
print(array_1d.shape)
We can use the reshape() array method to create a new array that has the same number of elements as an existing array, but with a different number of rows and columns. The order of the elements is preserved, as read row-by-row, from top to bottom.
array2 = array1.reshape(2,6)
print('Shape =', array2.shape)
print('Contents:')
print(array2)
We can replace one of the dimensions within reshape() with a -1, in which case Numpy will infer the correct value from the size of the array.
array3 = array1.reshape(4,-1)
print('Shape =', array3.shape)
print('Contents:')
print(array3)
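The inference performed for -1 is simple division: the total number of elements is divided by the dimension that was given. The following sketch illustrates this, and also shows what happens when the division does not come out evenly. The names inferred and reshape_failed are assumptions made for this example.

```python
import numpy as np

# 12 elements, same values as array1 in the lesson.
array1 = np.arange(1, 13).reshape(3, 4)

# 12 elements / 4 rows = 3 columns, so -1 is resolved to 3.
inferred = array1.reshape(4, -1)
print(inferred.shape)  # (4, 3)

# Reading either array row by row gives the same sequence of values.
print(np.array_equal(array1.ravel(), inferred.ravel()))  # True

# 12 is not divisible by 5, so no valid shape exists and reshape() fails.
reshape_failed = False
try:
    array1.reshape(5, -1)
except ValueError as err:
    reshape_failed = True
    print('ValueError:', err)
```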
In the following example, we reshape a 1D array into a 2D array with a single row, as well as a 2D array with a single column.
array_1d = np.array([3, 1, 4])
row_array = array_1d.reshape(1,3)
col_array = array_1d.reshape(3,1)
print('array_1d:')
print('Shape =', array_1d.shape)
print(array_1d)
print()
print('row_array:')
print('Shape =', row_array.shape)
print(row_array)
print()
print('col_array:')
print('Shape =', col_array.shape)
print(col_array)
Many of the functions that we have encountered previously for creating arrays have a size or shape parameter that we can use to specify the shape of the desired array.
rand_array = np.random.uniform(low=0, high=1, size=(2,5))
rand_array = np.round(rand_array, 2)
print(rand_array)
ones_3x5 = np.ones(shape=(3,5))
print(ones_3x5)
As with 1D arrays, we can perform elementwise arithmetic operations on 2D arrays of the same shape.
x1 = np.array([[7, 0, 8], [5, 6, 2]])
x2 = np.array([[2, 5, 3], [7, 1, 4]])
print('x1.shape =', x1.shape)
print('x2.shape =', x2.shape)
print(x1 + x2)
print(x1 * x2)
print(x1 ** x2)
In certain situations, we can perform arithmetic operations on arrays with different sizes. One such scenario is adding a 2D array with a single row to another 2D array. If the two arrays have the same number of columns, then the row array will be added to each row of the other array. This process can be performed with other arithmetic operations, and is referred to as broadcasting. We can also broadcast arrays with a single column over other arrays.
In the cell below, we create three new arrays to illustrate broadcasting.
y1 = np.array([[3, 1, 0], [5, 2, 7]])
y2 = np.array([[10], [20]])
y3 = np.array([[10, 20, 30]])
print('y1.shape =', y1.shape)
print('y2.shape =', y2.shape)
print('y3.shape =', y3.shape, '\n')
The cells below illustrate the process of broadcasting a column array over another array using addition.
print(y1)
print()
print(y2)
print(y1 + y2)
The cells below illustrate the process of broadcasting a row array over another array using multiplication.
print(y1)
print()
print(y3)
print(y1 * y3)
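One way to think about broadcasting is that the smaller array behaves as if it had been copied until its shape matched the larger one. The sketch below makes those copies explicitly with np.tile() and confirms that the results agree. The names y2_tiled and y3_tiled are introduced only for this illustration, and Numpy does not necessarily build these copies internally.

```python
import numpy as np

y1 = np.array([[3, 1, 0], [5, 2, 7]])   # shape (2, 3)
y2 = np.array([[10], [20]])             # shape (2, 1), a column
y3 = np.array([[10, 20, 30]])           # shape (1, 3), a row

# Repeat the column 3 times side by side so it has shape (2, 3).
y2_tiled = np.tile(y2, (1, 3))
# Repeat the row 2 times top to bottom so it has shape (2, 3).
y3_tiled = np.tile(y3, (2, 1))

print(y2_tiled)
print(y3_tiled)

# Broadcasting gives the same results as the explicit copies.
print(np.array_equal(y1 + y2, y1 + y2_tiled))  # True
print(np.array_equal(y1 * y3, y1 * y3_tiled))  # True
```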
Numpy provides a dot() function that can be used to perform mathematical operations that are commonly encountered when working with arrays or matrices.
When dot() is provided with two 1D arrays, it returns the dot product of these arrays.
z1 = np.array([2, 5, 1])
z2 = np.array([3, 4, 2])
print(np.dot(z1, z2))
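As a check on the definition, the dot product of two 1D arrays is the sum of their elementwise products. The sketch below computes this by hand for the same z1 and z2; the name manual is an assumption made for this example.

```python
import numpy as np

z1 = np.array([2, 5, 1])
z2 = np.array([3, 4, 2])

# Multiply matching elements, then add: 2*3 + 5*4 + 1*2 = 28.
manual = np.sum(z1 * z2)

print(manual)          # 28
print(np.dot(z1, z2))  # 28
```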
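When dot() is provided with two 2D arrays, it returns the matrix product of these arrays, assuming that the dimensions allow for such a product to be calculated. Specifically, the number of columns in the first array must equal the number of rows in the second.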
M1 = np.array([1, 2, 3, 4]).reshape(2,2)
M2 = np.array([5, 6, 7, 8, 9, 0]).reshape(2, 3)
print(M1)
print()
print(M2)
print(np.dot(M1, M2))
If we attempt to use dot() to perform matrix multiplication on two arrays with incompatible dimensions, we will get an error.
print(np.dot(M2, M1))
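The compatibility rule can be checked from the shapes before multiplying. The sketch below uses a hypothetical helper, can_multiply(), that is not part of Numpy, and catches the ValueError so that the remaining code can still run. It also uses the @ operator, which gives the same matrix product as np.dot() for 2D arrays.

```python
import numpy as np

M1 = np.array([1, 2, 3, 4]).reshape(2, 2)
M2 = np.array([5, 6, 7, 8, 9, 0]).reshape(2, 3)

def can_multiply(a, b):
    """Hypothetical helper: True if the 2D arrays a and b can be multiplied as a times b."""
    return a.shape[1] == b.shape[0]

print(can_multiply(M1, M2))  # True:  (2, 2) times (2, 3)
print(can_multiply(M2, M1))  # False: (2, 3) times (2, 2)

# The @ operator performs the same matrix product as np.dot() here.
product = M1 @ M2
print(product.shape)  # (2, 3)

# The incompatible order raises a ValueError, which we catch and report.
dot_failed = False
try:
    np.dot(M2, M1)
except ValueError as err:
    dot_failed = True
    print('ValueError:', err)
```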
Numpy provides two functions for concatenating arrays: hstack(), or horizontal stack, and vstack(), or vertical stack. As the names imply, these functions allow us to create new arrays by horizontally or vertically stacking arrays, as long as the stacked arrays have matching sizes along the other dimension (the same number of rows for hstack() and the same number of columns for vstack()). It is important to note that the arrays to be stacked are not passed as separate arguments. Instead, they are passed together as a single argument, which should be a list of (arbitrarily many) arrays to be stacked.
We will create arrays with shape (3,2), (3,3), and (2,2) to illustrate the use of the stacking functions.
a1 = np.array([11, 12, 13, 14, 15, 16]).reshape(3,2)
a2 = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9]).reshape(3,3)
a3 = np.array([-1, -2, -3, -4]).reshape(2,2)
In the cell below, we will horizontally stack a1 and a2, both of which have 3 rows.
print(np.hstack([a1, a2]))
We can vertically stack a1 and a3 since both of these arrays have 2 columns.
print(np.vstack([a1, a3]))
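The shape requirements for stacking can be seen by comparing the shapes of the inputs and outputs. The sketch below rebuilds a1, a2, and a3, checks the shapes of the stacked results, and shows that a mismatched stack raises a ValueError. It also compares hstack() with np.concatenate() using axis=1, which should behave the same way for 2D arrays. The names h, v_stacked, and stack_failed are introduced only for this example.

```python
import numpy as np

a1 = np.array([11, 12, 13, 14, 15, 16]).reshape(3, 2)
a2 = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9]).reshape(3, 3)
a3 = np.array([-1, -2, -3, -4]).reshape(2, 2)

# Horizontal stacking: rows must match (3 and 3); columns add up (2 + 3).
h = np.hstack([a1, a2])
print(h.shape)  # (3, 5)

# Vertical stacking: columns must match (2 and 2); rows add up (3 + 2).
v_stacked = np.vstack([a1, a3])
print(v_stacked.shape)  # (5, 2)

# For 2D arrays, hstack() matches concatenation along axis 1.
print(np.array_equal(h, np.concatenate([a1, a2], axis=1)))  # True

# a1 has 3 rows and a3 has 2 rows, so they cannot be stacked horizontally.
stack_failed = False
try:
    np.hstack([a1, a3])
except ValueError as err:
    stack_failed = True
    print('ValueError:', err)
```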
We can stack more than two arrays at the same time, as long as they have appropriate shapes.
stacked_array = np.hstack([
    np.zeros(shape=(4,2)),
    np.random.choice([2,6], size=(4,4)),
    np.ones(shape=(4,1))
])
print(stacked_array)
It is possible to apply certain Numpy functions to the individual rows or columns of an array. For example, we can use np.sum() to calculate the row sums or column sums of a 2D Numpy array. Before we demonstrate this, we will first explore the default behavior of the np.sum() function on 2D arrays.
In the cell below, we create an array with shape (2,4).
v = np.array([1, 2, 3, 4, 5, 6, 7, 8]).reshape(2,4)
print(v)
By default, the np.sum() function will sum together all of the elements of a 2D array.
print(np.sum(v))
We can use the optional axis parameter of np.sum() to ask it to perform row or column sums. If we set axis=0, then we will sum each column individually.
print(np.sum(v, axis=0))
If we set axis=1, then we will sum each row individually.
print(np.sum(v, axis=1))
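A helpful way to remember the axis parameter is that it names the dimension that gets collapsed. With axis=0 the rows are added together, leaving one value per column. With axis=1 the columns are added together, leaving one value per row. The sketch below checks this against sums written out by hand; the names col_sums and row_sums are assumptions made for this example.

```python
import numpy as np

v = np.array([1, 2, 3, 4, 5, 6, 7, 8]).reshape(2, 4)

col_sums = np.sum(v, axis=0)   # collapses the 2 rows, leaving 4 values
row_sums = np.sum(v, axis=1)   # collapses the 4 columns, leaving 2 values

print(col_sums.tolist())  # [6, 8, 10, 12]
print(row_sums.tolist())  # [10, 26]

# axis=0 is the same as adding the rows to each other.
print(np.array_equal(col_sums, v[0, :] + v[1, :]))  # True

# axis=1 is the same as adding the columns to each other.
print(np.array_equal(row_sums, v[:, 0] + v[:, 1] + v[:, 2] + v[:, 3]))  # True
```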
Notice that if we use np.sum() to perform row or column sums, the result is a 1D array. If we would like the result to be a 2D array, we can specify this by setting the optional keepdims parameter to True.
print(np.sum(v, axis=1, keepdims=True))
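One situation where keepdims is useful is when the sums are going to be broadcast back over the original array. The sketch below divides each row of v by its own row sum, so that every row adds up to 1. This is only an illustrative use case, and the names row_sums_2d, row_sums_1d, normalized, and division_failed are introduced for this example.

```python
import numpy as np

v = np.array([1, 2, 3, 4, 5, 6, 7, 8]).reshape(2, 4)

# Shape (2, 1): a column array that broadcasts across the columns of v.
row_sums_2d = np.sum(v, axis=1, keepdims=True)
normalized = v / row_sums_2d
print(normalized.shape)               # (2, 4)
print(np.sum(normalized, axis=1))     # each row now sums to 1

# Without keepdims the sums have shape (2,), which cannot be
# broadcast against an array of shape (2, 4).
row_sums_1d = np.sum(v, axis=1)
division_failed = False
try:
    v / row_sums_1d
except ValueError as err:
    division_failed = True
    print('ValueError:', err)
```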
We can also perform row and column products.
print(np.prod(v, axis=0))
print(np.prod(v, axis=1))
We can use np.mean() to calculate row and column means.
print(np.mean(v, axis=0))
print(np.mean(v, axis=1))