Lesson 13 - Random Numbers¶

The following topics are discussed in this notebook:¶

Generating Random Numbers.
Plotting with Matplotlib.
Setting seeds.

NumPy¶

In this lesson, we will learn how to ask Python to generate random numbers. Although base Python has the ability to generate random numbers, this functionality is extended in NumPy.

import numpy as np

The numpy.random.choice() function in NumPy can be use to randomly generate samples. The parameters for this function are as follows:

a: An array or list that the sample will be drawn from.
size: The number of elements to be drawn for the sample. The default value is 1.
replace: A boolean value that determines if elements are returned to the sample after being drawn. The defauly value is True.
p: An array or list that determines the probability of drawing each of the elements. If p is not specified, then all elements will be equally likely.

Let's start with a simple example.

die_roll = np.random.choice(range(1,7), 1)
print(int(die_roll))

4

The range in the code above could have been replaced with a list that explicitly states the value options.

die_roll = np.random.choice([1,2,3,4,5,6],1)
print(die_roll)

[2]

We can draw samples of any size.

EightRolls = np.random.choice(range(1,7), 8)
print(EightRolls)

[5 6 2 3 4 2 1 4]

We can also generate observations drawn from non-numerical lists.

mySample = np.random.choice(['A', 'B', 'C'], 1)
print(mySample)

['A']

� Exercise

Write code that creates a sample of 20 observations randomly drawn from the values 'A', 'B', 'C', 'D', and 'E'. Give the sample a name, and print it.

mySample = np.random.choice(['A', 'B', 'C', 'D', 'E'], 20)
print(mySample)

['D' 'E' 'E' 'D' 'A' 'C' 'B' 'C' 'A' 'B' 'A' 'E' 'B' 'D' 'E' 'D' 'C' 'A'
 'C' 'A']

Plotting the Results¶

Let's say we want to simulate a die being rolled several times, and then want to get an idea as to the distibution of the rolls. In other words, we'd like to know how many times each number was rolled. We could certainly produce numerical counts and then analyze those, but it is sometimes more convenient to represent such information visually.

We will now look at how to create a bar chart that shows the number of times each result was rolled on a die.

To create our chart, we will need to use the matplotlib.pyplot package, which we will import as plt.

import matplotlib.pyplot as plt

We will now generate several rolls, count the number of occurrences of each of the 6 possible results, and will then plot the results in a bar chart.

ManyRolls = np.random.choice(range(1,7), 60)

n1 = sum(ManyRolls == 1)
n2 = sum(ManyRolls == 2)
n3 = sum(ManyRolls == 3)
n4 = sum(ManyRolls == 4)
n5 = sum(ManyRolls == 5)
n6 = sum(ManyRolls == 6)
RollDist = [n1, n2, n3, n4, n5, n6]

plt.bar(range(0,6), RollDist)
labels = ['one', 'two', 'three', 'four','five', 'six']
plt.xticks(range(0,6), labels)
plt.xlabel('Roll')
plt.ylabel('Count')
plt.title('Die Roll Distribution')
plt.show()

print(ManyRolls)
print(ManyRolls == 3)
print(sum(ManyRolls == 3))

[1 1 3 1 2 5 5 2 4 3 1 6 1 5 5 1 3 1 3 2 1 1 5 2 6 2 2 1 1 3 1 3 1 1 1 4 3
 3 4 1 1 2 2 6 5 1 5 4 4 1 2 2 5 1 5 2 1 1 4 6]
[False False  True False False False False False False  True False False
 False False False False  True False  True False False False False False
 False False False False False  True False  True False False False False
  True  True False False False False False False False False False False
 False False False False False False False False False False False False]
8

Coin Flipping¶

We will now consider a few examples in which we simulate a coin being flipped multiple times. To simplify our code slightly, we will create a function that simulates a single flip.

� Exercise

Write a function called coin() that takes no parameters, and returns either 'H' or 'T' with equal probability.

def coin():
    return np.random.choice(['H','T'])

Call your function to simulate a single coin flip.

coin()

'H'

Simulate 100 coin flips. Store the results in a list called flips.

flips = []
for i in range(0,100):
    flips.append(coin())

print(flips)

['T', 'H', 'H', 'H', 'H', 'H', 'T', 'T', 'T', 'H', 'H', 'T', 'T', 'H', 'H', 'T', 'T', 'H', 'H', 'T', 'H', 'T', 'H', 'H', 'T', 'H', 'T', 'T', 'T', 'H', 'H', 'T', 'T', 'H', 'T', 'H', 'T', 'T', 'H', 'T', 'H', 'T', 'T', 'T', 'T', 'H', 'H', 'H', 'H', 'H', 'H', 'T', 'T', 'H', 'H', 'H', 'H', 'T', 'H', 'T', 'H', 'H', 'T', 'H', 'H', 'T', 'T', 'H', 'T', 'T', 'H', 'T', 'T', 'H', 'T', 'H', 'T', 'T', 'H', 'H', 'T', 'H', 'T', 'T', 'T', 'T', 'T', 'T', 'T', 'H', 'H', 'T', 'H', 'T', 'T', 'T', 'T', 'T', 'H', 'T']

� Exercise

Write a loop that performs the following task 100 times:

Repeatedly simulate a coinflip, flipping the coin until a result of Heads is obtain.
Count the number of times the coin was flipped. Store the results in a list called flipCounts.

Print flipCounts.

flipCounts = []

for i in range(0, 100):

    flip = coin()
    count = 1
    
    while flip == 'T':
        flip = coin()
        count += 1
        
    flipCounts.append(count)
    
print(flipCounts)

[3, 3, 2, 1, 1, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 1, 2, 2, 2, 2, 1, 3, 3, 1, 1, 1, 2, 1, 3, 1, 2, 1, 3, 1, 1, 3, 1, 1, 3, 3, 2, 1, 1, 1, 2, 6, 1, 1, 4, 1, 1, 1, 2, 1, 7, 1, 2, 2, 1, 4, 1, 1, 3, 1, 1, 2, 1, 3, 7, 1, 1, 1, 3, 2, 2, 1, 1, 1, 3, 2, 1, 3, 1, 4, 1, 1, 2, 4, 3, 5, 1, 1, 1, 1, 1, 3, 2, 2]

We will now create a list called CountDist. The first element of this list will contain the number of times that 1 appears in flipCounts. The second element will contain the number of times that 2 appears in flipCounts, and so on.

flipCounts = np.array(flipCounts)
CountDist = []
HighestCount = max(flipCounts)
for i in range(1, HighestCount + 1):
    n = sum(flipCounts == i)
    CountDist.append(n)
    
print(CountDist)

[53, 23, 16, 4, 1, 1, 2]

We will now use Matplotlib to print the distribution of numbers appearing in flipCounts.

plt.bar(range(1, HighestCount+1), CountDist)
plt.xlabel('Flips')
plt.ylabel('Count')
plt.title('Number of Flips Before First Heads')
plt.show()

Combine the code from the last three cells, and then run it several times.

Random Walks¶

A random walk is a path that consists of several consecutive steps taken in a random direction. The size of the steps could be random, or it could be preset. Random walks have many applications in fields such as computer science, physics, biology, chemistry, and economics.

We will consider perhaps the most basic example of a random walk. In this example, the position is recorded as a single integer, and the starting position is 0. At each step, we either move by an amount of either +1 or -1, with equal probability.

� Exercise

Create a function called bWalk() that takes two parameter:

pos indicates the starting position of the walk, and should have a default of 0.
n indicates the number of steps in the walk, and should have a default value of 1000.

The function should generate a random walk as described above, consisting of n steps. The output of the walk should be a list that tracks every integer that the walk visited, in order.

def bWalk(pos = 0, n = 1000):
    
    #pos = 0
    path = [pos]
    
    for i in range(0, n):
        pos = pos + np.random.choice([-1,1])
        path.append(pos)
    
    return path

The code below will plot a random walk consisting of 1000 steps. Run it several times.

plt.plot(bWalk(100, 10000))
plt.show()

The code below will plot five random walks simultaneously.

fig=plt.figure(figsize=(8, 4), dpi= 80)

plt.plot(bWalk())
plt.plot(bWalk())
plt.plot(bWalk())
plt.plot(bWalk())
plt.plot(bWalk())
plt.show()

Setting Seeds¶

np.random.seed(1)
plt.plot(bWalk())
plt.show()

Simulating Stock Prices¶

def simStock(price = 100, n = 100):
    
    d = [-0.005, -0.004, -0.003, -0.002, -0.001, 0, 0.001, 0.002, 0.003, 0.004, 0.005]
    prices = [price]
    
    for i in range(0,n):
        r = np.random.choice(d)
        price = price * (1 + r)
        prices.append(price)
        
    return prices

print(simStock(100, 10))

[100, 100.29999999999998, 100.40029999999997, 100.29989969999997, 100.70109929879997, 100.19759380230597, 100.69858177131749, 100.69858177131749, 101.20207468017406, 101.20207468017406, 101.30327675485422]

#np.random.seed(1)
plt.plot(simStock(100, 1000))
plt.show()

Assuming this model, estimate the probabiliy that the price of the stick is greater than 110 after one year.

finalPrices = []

for i in range(0,10000):
    sim = simStock(100, 365)
    finalPrices.append(sim[-1])
    
finalPrices = np.array(finalPrices)

prob = sum(finalPrices >= 110)/10000

print(prob)

0.0571