COVID-19 Government Policy Effectiveness
As COVID-19 continues to spread throughout the United States, I wanted to create a project that would shed some light on the relationship between government policy and the spread of COVID-19, as well as the complex system that influences the pandemic. I wanted to understand how effective our current government policies have been in stopping the spread of COVID-19 and from there conclude how we could do better moving forward.
My project focuses on two main sets of data: the government policies set in place by each state and the R-value (effective reproductive rate) of COVID-19 for every state. The data used in the graphs cover March 3rd to July 21st.
My hypothesis: as a state's government increases the number of policies or the strictness of its policies, the R-Value for COVID-19 in that state will decrease.
What is R-value?
The R-value used in this project is the effective reproduction number, which estimates how many secondary infections will occur from one case in a population at a given point in time. It is closely related to "R naught," the basic reproduction number, which is the same estimate for a population in which everyone is susceptible.
The R-value of a disease within a population tells you whether the disease will continue spreading and how fast. An R-value above 1 means the number of cases is expected to increase, while a value under 1 suggests that the spread of the disease has been contained and is slowing down.
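To make the threshold at 1 concrete, here is a minimal sketch that assumes a constant R-value and counts expected new cases per generation of infection. The function name and the starting figure of 100 cases are illustrative assumptions, not part of my program.

```python
def expected_cases(r_value, generations, initial_cases=100.0):
    """Expected new cases in each generation if every case infects r_value others."""
    return [initial_cases * r_value ** g for g in range(generations + 1)]

# Illustrative R-values only, not measured data.
growing = expected_cases(1.5, 4)    # R above 1: every generation is larger
shrinking = expected_cases(0.5, 4)  # R below 1: every generation is smaller

print(growing)
print(shrinking)
```

Real R-values change from day to day, so this only shows the direction of the trend, not a forecast.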
Though a calculated R-value will never be completely accurate, it can give useful insight into the effectiveness of government policy in containing the epidemic and how close we are to successfully stopping it.
These are a few of the histograms (with 20 bins) that I created to show the frequency of R-Values in each state from March 3rd to July 21st where the x-axis is the R-Value and the y-axis is the number of occurrences:
The histogram for Alabama is skewed to the right (most values are bunched on the left, with a thin tail of larger values on the right), which means that, in general, the state has lower R-values.
The histogram for Alaska is skewed to the left (most values are bunched on the right, with a thin tail of smaller values on the left), which means that, in general, the state has higher R-values.
The histogram for Arizona is skewed to the right (most values are bunched on the left, with a thin tail of larger values on the right), which means that, in general, the state has lower R-values.
The histogram for Arkansas is skewed to the right (most values are bunched on the left, with a thin tail of larger values on the right), which means that, in general, the state has lower R-values.
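The binning and the direction of skew described above can be checked numerically. The sketch below is not taken from my program: the helper names and the sample R-values are hypothetical, and it assumes the values are not all identical.

```python
from statistics import mean, pstdev

def histogram_counts(values, bins=20):
    """Count how many values fall into each of `bins` equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # Clamp so that the maximum value lands in the last bin.
        index = min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    return counts

def skewness(values):
    """Population skewness: positive = tail to the right, negative = tail to the left."""
    mu, sigma = mean(values), pstdev(values)
    return sum(((v - mu) / sigma) ** 3 for v in values) / len(values)

# Hypothetical daily R-values, not real state data.
mostly_low = [0.9, 0.95, 1.0, 1.0, 1.05, 1.1, 1.1, 1.2, 1.6, 2.0]
mostly_high = [2.9 - v for v in mostly_low]  # mirror image of the list above

print(histogram_counts(mostly_low))
print(skewness(mostly_low) > 0)   # right-skewed: generally lower values
print(skewness(mostly_high) < 0)  # left-skewed: generally higher values
```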
These graphs from https://rt.live/ show the continuous progression of R-Values in those same 4 states from March to August 11th. They give more information about how the R-Values have changed over time. Because the R-values have continued to level out or decrease since July 21st, histograms covering this longer period would be skewed slightly more to the right. But you can still see how the histograms correspond to the continuous graphs: if a state has lots of fluctuations, the range of values in the histogram will be greater, and if a state has more red than blue, most of the R-values in the histogram will be in the bins greater than 1.0.
Modeling the System
This Loopy model gives an overview of the main nodes with edges showing the relationships between them in my model of government policy effectiveness. This model is an outline of the major components within my program and the following graphs.
To understand the model, I wrote a Python program, importing pandas and matplotlib.pyplot, to organize, analyze, and graph the data.
I first aligned the two datasets, the R-Values per state and the State Social Distancing Actions. For the State Social Distancing Actions dataset, I quantified the policy actions with the reduction percentages from http://epidemicforecasting.org/containment-calculator. Each percentage became an integer on the strictness scale; for example, -9% for masks became a value of 9. To match the policies in the State Social Distancing Actions, I used the following values: stay-at-home order = 18, symptomatic testing = 10, gatherings limited to 100 people = 5, gatherings limited to 10 people = 24, some businesses suspended = 34, schools and universities closed = 33, and over 60% of the population wears masks = 9. The integers for the policies in effect were added up per data point to produce the final strictness index, with a minimum of 0 and a maximum of 133. This calculation assumes that a stricter set of policies makes that state's policies more effective in lowering the R-Value of COVID-19.
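The strictness index calculation can be sketched in a few lines. The weights are the ones listed above; the dictionary keys and the function name are hypothetical labels chosen for this illustration and may not match the names in my program.

```python
# Weights taken from the containment calculator percentages listed above.
# The key names are hypothetical labels for this sketch.
POLICY_WEIGHTS = {
    "stay_at_home_order": 18,
    "symptomatic_testing": 10,
    "gatherings_limited_to_100": 5,
    "gatherings_limited_to_10": 24,
    "some_businesses_suspended": 34,
    "schools_and_universities_closed": 33,
    "masks_over_60_percent": 9,
}

def strictness_index(active_policies):
    """Sum the weights of the policies in effect for one state on one date."""
    return sum(POLICY_WEIGHTS[policy] for policy in active_policies)

print(strictness_index([]))                       # no policies in effect
print(strictness_index(["stay_at_home_order",
                        "masks_over_60_percent"]))
print(strictness_index(POLICY_WEIGHTS))           # every policy in effect
```

With no policies the index is 0, and with all seven it reaches the maximum of 133.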
You can access and download my program with this link:
The percentage values were turned into integers that represented the strictness of each policy.
The integer values of each policy taken into account in my code.
This is one of the datasets listing the different policy actions taken by each state on a specific date.
This scattergram shows the correlation between each state's (represented by a different color) mean R-Value for a specific day and the strictness index from that same day.
Each dot in the scattergram represents one day of data.
Because there are only so many combinations of policy strictness based on the method of quantification, the data points ended up showing up in lines on those values.
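To see why the points line up, the sketch below lists every strictness value the index can take. It assumes, for illustration only, that any combination of the seven policies could be in effect at once; the real data contains only the combinations states actually used.

```python
from itertools import combinations

# The seven policy weights used for the strictness index.
WEIGHTS = [18, 10, 5, 24, 34, 33, 9]

# Collect the sum of every possible subset of policies.
possible_values = set()
for size in range(len(WEIGHTS) + 1):
    for combo in combinations(WEIGHTS, size):
        possible_values.add(sum(combo))

print(sorted(possible_values))
print(len(possible_values), "distinct values at most")
```

Because the index can only land on this fixed set of values, every day of data falls on one of a limited number of lines in the scattergram.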
From the scattergrams we can see that a higher strictness index does not always mean a lower R-Value. For the USA as a whole, the data points are fairly random, but when we look closer at each state, we can see specific trends. In general, there are more data points at lower strictness index values, which means that many states have policies surrounding COVID-19 that lean toward the looser side.
Also, as the strictness index increases, fewer R-Values fall at the extremes. The wider spread at a low strictness index could mean either that policies had not yet been set in place to slow the spread of COVID-19, or that the R-Value had already been low, so the state eased its policies.
When we break the scattergrams by state, we can see that the trend of the data points varies:
In the graph for Alabama, the data points are more random, suggesting there is not a strong correlation between the strictness of the state's policies and how the R-Value for COVID-19 in the state changes. This could be because the government is not enforcing its policies or because the general public is not following the policies even with enforcement.
For Alaska, there is a cluster of data points in the top left where the R-Value is high and the policy strictness index is low, suggesting that there was a long period of time where the state had not taken into account the high R-Value and did not enact stricter policies. In general, Alaska has a positive trend, which could seem counter-intuitive. But the delay between enacting a policy and seeing its outcome, due to COVID-19's incubation period of up to two weeks, could come into play: when the government added stricter policies, the R-Value did not drop immediately but took time to do so.
In Arizona and Washington, there is a general negative trend between the R-Values and strictness index, which follows my hypothesis that as a state increases the strictness of its policies, the corresponding R-Values will decrease as the spread of COVID-19 is slowed.
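The positive and negative trends described for each state can be expressed as a correlation coefficient, and the delay effect discussed for Alaska can be explored by pairing each day's strictness with the R-value some days later. This is a minimal sketch with hypothetical function names and made-up numbers, not the calculation from my program.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation: -1 is a perfect negative trend, +1 a perfect positive one."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def lagged_correlation(strictness, r_values, lag_days=0):
    """Correlate each day's strictness with the R-value lag_days later."""
    if lag_days:
        strictness, r_values = strictness[:-lag_days], r_values[lag_days:]
    return pearson(strictness, r_values)

# Hypothetical series: policy tightens on day 4, R drops two days after that.
strictness = [0, 0, 0, 50, 50, 50, 50, 50]
r_values = [1.4, 1.4, 1.4, 1.4, 1.4, 1.0, 1.0, 1.0]

same_day = lagged_correlation(strictness, r_values)
two_days_later = lagged_correlation(strictness, r_values, lag_days=2)
print(same_day, two_days_later)
```

In this made-up series the negative trend is stronger once the delay is accounted for, which is one way a real effect could be hidden in a same-day comparison.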
The correlation between the R-Values and each state's policy strictness varies, which implies that increasing the number of policies, or how strict and enforced the policies are, may not result in an actual decrease in the R-Value for COVID-19. Stricter policies can increase effectiveness, but only if the general population is willing to follow those policies and work as a whole to slow the spread of the COVID-19 epidemic. We know that the baseline R-Value for COVID-19 is around 3.6, so having R-Values around 1 in the United States means that the policies are working to some extent. But to get the R-Value under 1 and stop the epidemic, more work has to be done to understand the most effective ways to reduce the spread of the virus, whether that be stricter policies or funding the development of a vaccine.
More data on state policies would help produce a more accurate relationship between the R-Value and the policy strictness index. More fine-grained data, such as policy at the county or city level, would result in better insights. Also, the current policy data is not daily, which means the R-Value was averaged over the span of days between the policy data points.
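The averaging step mentioned above can be sketched as follows. The exact windowing is an assumption made for illustration (each policy date is matched with the days from that date up to, but not including, the next policy date), and the function name and sample values are hypothetical.

```python
from datetime import date

def average_r_between_snapshots(daily_r, snapshot_dates):
    """Mean R-value for each policy date, over the days before the next policy date.

    daily_r: dict mapping a date to that day's R-value.
    snapshot_dates: sorted list of dates that have policy data.
    """
    averages = {}
    for start, end in zip(snapshot_dates, snapshot_dates[1:]):
        window = [r for day, r in daily_r.items() if start <= day < end]
        if window:
            averages[start] = sum(window) / len(window)
    return averages

# Hypothetical daily R-values and policy dates, not real data.
daily_r = {
    date(2020, 3, 3): 1.2, date(2020, 3, 4): 1.4, date(2020, 3, 5): 1.6,
    date(2020, 3, 6): 1.0, date(2020, 3, 7): 1.0, date(2020, 3, 8): 1.3,
}
snapshots = [date(2020, 3, 3), date(2020, 3, 6), date(2020, 3, 9)]

averages = average_r_between_snapshots(daily_r, snapshots)
print(averages)
```

Daily policy data would remove the need for this step, since each R-value could be paired with the policies in effect on the same day.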
In the current project, I only used simple mathematical correlations. It would be interesting to extend the project by analyzing the data with statistical models, such as linear regression or models that account for time, which would allow us to make better future predictions.
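As a starting point for that extension, an ordinary least-squares line through strictness index and R-value pairs could look like the sketch below. It is not part of my program, the function name is hypothetical, and the numbers are made up to show the mechanics.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical strictness index and mean R-value pairs, not real data.
strictness = [0, 20, 40, 60]
r_values = [1.4, 1.3, 1.2, 1.1]

slope, intercept = fit_line(strictness, r_values)
predicted_r = slope * 80 + intercept  # extrapolated R-value at an index of 80
print(slope, intercept, predicted_r)
```

A negative slope would support the hypothesis for a given state, though a straight line ignores the delay between a policy and its effect.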
Theoretical reduction of R-value
My Python code
Thank you to ISB, specifically Claudia Ludwig and Rachel Calder for running the Computational Modeling Workgroup and guiding me on my project and Brooke Ury for completing the external review on my project.