NumPy Examples

Example: Employee Data

The cell below defines several arrays that hold records for the salespeople working at a company.

The arrays EID, fname, and lname store the employee ID, first name, and last name of each salesperson. The array salary contains each employee's monthly salary. These four arrays are parallel: position i in each one describes the same employee.

The arrays sales_month, sales_EID, sales_rev, and sales_exp represent the columns of a table of sales data. Each index in these arrays refers to one record: a month, an employee, and the total revenue and expenses for that employee during that month.

import numpy as np

EID = np.array([103, 106, 107, 111, 115])
fname = np.array(['Anna', 'Brad', 'Cory', 'Brad', 'Emma'])
lname = np.array(['Jones', 'Green', 'Brown', 'Davis', 'Green'])
salary = np.array([5620, 6250, 5480, 4350, 4640])

sales_month = np.array(['Jan']*5 + ['Feb']*5 + ['Mar']*5 + ['Apr']*5 + ['May']*5 + ['Jun']*5 +
                       ['Jul']*5 + ['Aug']*5 + ['Sep']*5 + ['Oct']*5 + ['Nov']*5 + ['Dec']*5)

sales_EID = np.array([103, 106, 107, 111, 115]*12)

sales_rev = np.array([16887, 36296, 10219, 22377, 43366, 20087, 25643, 29853, 19925, 45259, 30953, 27038,
                      23563, 14986, 32105, 29042, 26106, 47848, 30160, 21224, 37301, 38803, 16794, 36425,
                      29011, 19220, 41406, 24551, 29567, 29522, 20435, 16094, 17346, 21056, 25443, 29500,
                      25748, 16914, 10973, 23193, 19599, 27395, 31450, 21705, 22856, 39687,  8435, 14932,
                      24479, 32870, 32042, 30693, 12245, 15057, 13041, 43451, 27246, 21278, 36200, 15107])

sales_exp = np.array([ 724, 2138, 2978, 1351, 1542, 1667, 1954, 1192, 1454, 1741, 2019, 1882,
                      1681, 1894, 1068, 1442, 1382, 3075,  665,  990, 1426, 1654, 1649, 1325,
                       958, 4082, 1713, 2127, 3086, 1481, 3309, 1313, 1823, 1635, 1632, 2378,
                      2551, 1715, 2290, 1782, 2126, 1356, 2630, 2316, 1644, 2003,  769, 2402,
                      1628, 1155, 2274, 2538, 2090, 2652, 2909, 2281, 2238, 2844,  789, 1528])
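To make the layout of these parallel arrays concrete, the sketch below pulls out one sales record and looks up the matching employee. It uses only the January and February entries of the data above so that it stays short. The names k, record, emp_pos, and mask are introduced here for illustration and are not part of the original cell.

```python
import numpy as np

# Employee table: position i in each array describes the same employee.
EID = np.array([103, 106, 107, 111, 115])
fname = np.array(['Anna', 'Brad', 'Cory', 'Brad', 'Emma'])
lname = np.array(['Jones', 'Green', 'Brown', 'Davis', 'Green'])

# Excerpt of the sales table: January and February records only.
sales_month = np.array(['Jan']*5 + ['Feb']*5)
sales_EID = np.array([103, 106, 107, 111, 115]*2)
sales_rev = np.array([16887, 36296, 10219, 22377, 43366,
                      20087, 25643, 29853, 19925, 45259])
sales_exp = np.array([724, 2138, 2978, 1351, 1542,
                      1667, 1954, 1192, 1454, 1741])

# Index k selects one row of the sales table across all four arrays.
k = 6
record = (sales_month[k], sales_EID[k], sales_rev[k], sales_exp[k])
print('Record:', record)

# Find the position of that employee in the employee table.
emp_pos = np.where(EID == sales_EID[k])[0][0]
print('Employee:', fname[emp_pos], lname[emp_pos])

# A boolean mask selects every sales record for a single employee.
mask = sales_EID == 106
print('Revenue records for 106:', sales_rev[mask])
```

The same masking idea is what the later cells rely on when they total revenue and expenses per employee.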

In the cell below, we use NumPy operations to calculate the total annual revenue, expenses, and profit generated by each employee.

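# One slot per employee, in the same order as EID
rev_by_employee = np.zeros(len(EID), dtype=int)
exp_by_employee = np.zeros(len(EID), dtype=int)

# Use a boolean mask to sum the sales records that belong to each employee
for i in range(len(EID)):
    rev_by_employee[i] = np.sum(sales_rev[sales_EID == EID[i]])
    exp_by_employee[i] = np.sum(sales_exp[sales_EID == EID[i]])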

# salary holds monthly figures, so multiply by 12 to get the annual salary cost
profit_by_employee = rev_by_employee - exp_by_employee - 12 * salary

print('Revenue: ', rev_by_employee)
print('Expenses:', exp_by_employee)
print('Profit:  ', profit_by_employee)
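The loop above can also be written without an explicit Python loop. The sketch below is one possible alternative, not the approach used in this example: it groups the records with np.unique(..., return_inverse=True) and sums each group with np.bincount. It again uses only the January and February records, and the names ids, inverse, rev_totals, and exp_totals are illustrative.

```python
import numpy as np

# Excerpt of the sales table: January and February records only.
sales_EID = np.array([103, 106, 107, 111, 115]*2)
sales_rev = np.array([16887, 36296, 10219, 22377, 43366,
                      20087, 25643, 29853, 19925, 45259])
sales_exp = np.array([724, 2138, 2978, 1351, 1542,
                      1667, 1954, 1192, 1454, 1741])

# ids holds the sorted unique employee IDs; inverse[k] is the position
# of sales_EID[k] within ids, so it acts as a group label for record k.
ids, inverse = np.unique(sales_EID, return_inverse=True)

# bincount with weights adds up the weights that share a group label.
# The result is floating point, so we convert back to integers.
rev_totals = np.bincount(inverse, weights=sales_rev).astype(int)
exp_totals = np.bincount(inverse, weights=sales_exp).astype(int)

print('IDs:     ', ids)
print('Revenue: ', rev_totals)
print('Expenses:', exp_totals)
```

Because ids is sorted, the totals line up with EID only when EID is itself sorted, as it is in this example.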

We will now identify the employee who generated the greatest total annual revenue.

n = np.argmax(rev_by_employee)

print('EID:    ', EID[n])
print('Name:   ', fname[n], lname[n])
print('Revenue:', rev_by_employee[n])

Now let’s identify the employee who generated the greatest total annual profit.

n = np.argmax(profit_by_employee)

print('EID:   ', EID[n])
print('Name:  ', fname[n], lname[n])
print('Profit:', profit_by_employee[n])
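np.argmax returns the position of the largest value (the first one, if there are ties), and that position can index any of the parallel employee arrays. If a full ranking is wanted rather than just the top entry, np.argsort can be used in the same way. The sketch below shows this on the January and February revenue only. It assumes the records are stored month by month with the five employees in the same order each time, which is how the arrays in this example are laid out. The names rev_two_months and order are illustrative.

```python
import numpy as np

EID = np.array([103, 106, 107, 111, 115])
fname = np.array(['Anna', 'Brad', 'Cory', 'Brad', 'Emma'])
lname = np.array(['Jones', 'Green', 'Brown', 'Davis', 'Green'])

# January and February revenue, five employees per month.
sales_rev = np.array([16887, 36296, 10219, 22377, 43366,
                      20087, 25643, 29853, 19925, 45259])

# Assumption: rows are months and columns are employees in EID order,
# so summing down the rows gives one total per employee.
rev_two_months = sales_rev.reshape(2, 5).sum(axis=0)

# argmax gives the position of the single largest total.
top = np.argmax(rev_two_months)
print('Top earner:', EID[top], fname[top], lname[top])

# argsort sorts ascending, so reverse it to rank from highest to lowest.
order = np.argsort(rev_two_months)[::-1]
for pos in order:
    print(EID[pos], fname[pos], lname[pos], rev_two_months[pos])
```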

Finally, let’s determine the month during which the company generated the most revenue.

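# np.unique returns the month names sorted alphabetically, not in calendar order
months = np.unique(sales_month)
total_monthly_sales = np.zeros(len(months), dtype=int)

# Sum the revenue of all records that fall in each month
for i in range(len(months)):
    total_monthly_sales[i] = np.sum(sales_rev[sales_month == months[i]])

n = np.argmax(total_monthly_sales)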

print('Month:  ', months[n])
print('Revenue:', total_monthly_sales[n])
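One detail worth noting is that np.unique returns its values in sorted order, so the month names come back alphabetically rather than in calendar order. This does not affect the result above, because the same array supplies both the masks and the printed name. If calendar order matters, for example when printing a month-by-month table, one option is to reorder the unique values by where they first appear. The sketch below illustrates this on the first three months of the data. The names months_sorted, first_idx, calendar_months, and monthly_rev are illustrative.

```python
import numpy as np

# January through March records only, five employees per month.
sales_month = np.array(['Jan']*5 + ['Feb']*5 + ['Mar']*5)
sales_rev = np.array([16887, 36296, 10219, 22377, 43366,
                      20087, 25643, 29853, 19925, 45259,
                      30953, 27038, 23563, 14986, 32105])

# return_index gives the position of the first occurrence of each value.
months_sorted, first_idx = np.unique(sales_month, return_index=True)
print('Sorted order:  ', months_sorted)

# Sorting by first occurrence restores the order used in the data.
calendar_months = months_sorted[np.argsort(first_idx)]
print('Calendar order:', calendar_months)

# Total revenue per month, in calendar order.
monthly_rev = np.array([sales_rev[sales_month == m].sum()
                        for m in calendar_months])

best = np.argmax(monthly_rev)
print('Best month:', calendar_months[best], monthly_rev[best])
```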