Goal

This analysis compares the US stock market, represented by the Dow Jones Industrial Average, with the course of the COVID-19 pandemic. We convert cumulative COVID-19 case counts into new cases per day, then plot both series on a shared time axis to see where economic performance and the public health situation move together and where they diverge.

Import the Data

Run the following code to download two time series data sets:

  • Historical data on the Dow Jones Industrial Average (a composite performance measure of the US stock market), retrieved from Yahoo Finance.
  • Cumulative COVID-19 cases in the US over time, from the New York Times.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import datetime

url = "https://query1.finance.yahoo.com/v7/finance/download/%5EDJI?period1=1580750232&period2=1612372632&interval=1d&events=history&includeAdjustedClose=true"
DJI = pd.read_csv(url)
DJI['date'] = pd.to_datetime(DJI['Date'])
DJI = DJI.drop(["Date"], axis = 1)
DJI.head()
           Open          High           Low         Close     Adj Close     Volume        date
0  28319.650391  28630.390625  28319.650391  28399.810547  28399.810547  307910000  2020-02-03
1  28696.740234  28904.880859  28696.740234  28807.630859  28807.630859  332750000  2020-02-04
2  29048.730469  29308.890625  29000.849609  29290.849609  29290.849609  357540000  2020-02-05
3  29388.580078  29408.050781  29246.929688  29379.769531  29379.769531  263700000  2020-02-06
4  29286.919922  29286.919922  29056.980469  29102.509766  29102.509766  252860000  2020-02-07
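
The period1 and period2 values in the Yahoo Finance URL appear to be Unix timestamps in seconds; this is an inference from their magnitude and from the first row above being dated 2020-02-03, not something the source states. A minimal sketch for decoding them with the standard library:

```python
import datetime

# Values copied from the period1 and period2 parameters of the URL above.
period1 = 1580750232
period2 = 1612372632

# Interpret both as seconds since the Unix epoch, in UTC (an assumption).
start = datetime.datetime.fromtimestamp(period1, tz=datetime.timezone.utc)
end = datetime.datetime.fromtimestamp(period2, tz=datetime.timezone.utc)

print(start.date(), end.date())  # 2020-02-03 2021-02-03
print((end - start).days)        # 366
```

Under that reading, the request covers one year of daily data starting on 2020-02-03.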

url = "https://raw.githubusercontent.com/nytimes/covid-19-data/master/us.csv"
COVID = pd.read_csv(url)
COVID['date'] = pd.to_datetime(COVID['date'])
COVID.head()
        date  cases  deaths
0 2020-01-21      1       0
1 2020-01-22      1       0
2 2020-01-23      1       0
3 2020-01-24      2       0
4 2020-01-25      3       0

Reshape the Data

COVID['cases'] is a pandas Series, backed by a NumPy array, that holds the cumulative case count for each date. We would like to see the number of new cases per day instead. Construct a new array, called per_day, giving the number of new cases per day. Then make a new array called per_day_date that gives the appropriate date for each case count. Because the first date has no previous day to compare against, per_day has one fewer entry than COVID['cases'], so per_day_date must drop the first date as well. In particular, per_day and per_day_date should have the same shape.

per_day = np.diff(COVID["cases"]) # difference between consecutive cumulative counts = new cases per day
per_day_date = COVID["date"].loc[1:] # all dates except the first (index label 0), which has no previous day
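
To see why one date has to be dropped, here is a small sketch that applies the same two steps to the first five rows of the COVID table shown earlier. The names toy, new_cases, and new_cases_date are illustrative only and are not part of the original notebook.

```python
import numpy as np
import pandas as pd

# First five rows of the COVID table shown earlier (cumulative counts).
toy = pd.DataFrame({
    "date": pd.to_datetime(["2020-01-21", "2020-01-22", "2020-01-23",
                            "2020-01-24", "2020-01-25"]),
    "cases": [1, 1, 1, 2, 3],
})

# np.diff returns an array with one fewer element than its input.
new_cases = np.diff(toy["cases"])

# Drop the first date so each difference lines up with the day it was reported.
new_cases_date = toy["date"].iloc[1:]

print(new_cases)                               # [0 0 1 1]
print(new_cases.shape, new_cases_date.shape)   # (4,) (4,)

# Sanity check: the first count plus all daily increases gives the last count.
print(toy["cases"].iloc[0] + new_cases.sum() == toy["cases"].iloc[-1])  # True
```

This sketch uses .iloc[1:], which selects by position and so does not depend on the index labels starting at 0. With the default index produced by pd.read_csv, .loc[1:] selects the same rows.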

Plot the Dow Jones Industrial Average over time, and plot per_day against per_day_date to visualize the progress of the COVID-19 pandemic over time.

# create two stacked subplots that share the same horizontal (date) axis
fig, ax = plt.subplots(2, 1, sharex = True)
ax[0].plot(DJI["date"], DJI["Close"]) # upper subplot: DJI daily closing value
ax[1].plot(per_day_date, per_day, color = "red") # lower subplot: new COVID-19 cases per day

ax[0].set(ylabel = "DJI") # y-label of the upper subplot
ax[1].set(xlabel = "Date", ylabel = "COVID new cases per day") # x-label and y-label of the lower subplot

# DJI starts to decrease
ax[0].axvspan(datetime.datetime(2020,3,1),
              datetime.datetime(2020,3,20), 
              alpha = .3, 
              color = "gray")
# DJI during Black Friday
ax[0].axvspan(datetime.datetime(2020,11,20),
              datetime.datetime(2020,11,30), 
              alpha = .3, 
              color = "gray")
# COVID starts in the U.S.
ax[1].axvspan(datetime.datetime(2020,2,15),
              datetime.datetime(2020,4,1), 
              alpha = .3, 
              color = "gray")

ax[0].text(datetime.datetime(2020,2,28), 
           28000, 
           "DJI falls\nsharply")
ax[0].text(datetime.datetime(2020,11,1), 
           23000, 
           "DJI during\nBlack Friday")
ax[1].text(datetime.datetime(2020,2,1), 
           70000, 
           "COVID starts\nin the U.S.")

ax[0].set(title = "DJI & COVID-19 over time")
plt.tight_layout() # avoid squished plots
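
The shaded regions and their labels above repeat the same axvspan and text pattern three times. One possible way to reduce the repetition is a small helper; shade_period below is a hypothetical function written for this sketch, not part of matplotlib or the original notebook, and the plotted values are placeholders rather than real data.

```python
import datetime

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display; omit in a notebook
import matplotlib.pyplot as plt


def shade_period(ax, start, end, label, y):
    """Hypothetical helper: shade the dates from start to end and label the region at height y."""
    span = ax.axvspan(start, end, alpha=0.3, color="gray")
    text = ax.text(start, y, label)
    return span, text


# Placeholder data for illustration only.
dates = [datetime.datetime(2020, 3, d) for d in range(1, 31)]
values = list(range(30))

fig, ax = plt.subplots(2, 1, sharex=True)
ax[0].plot(dates, values)
ax[1].plot(dates, values, color="red")

shade_period(ax[0], datetime.datetime(2020, 3, 5),
             datetime.datetime(2020, 3, 10), "event A", 20)
shade_period(ax[1], datetime.datetime(2020, 3, 15),
             datetime.datetime(2020, 3, 20), "event B", 5)
plt.tight_layout()
```

With a helper like this, each annotated period becomes a single call, which keeps the span dates and the label position next to each other.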
