Because of some of the recent discussion I ran a couple of models. Unfortunately the page I use for vaccine rates went dormant. It still only has vaccine rates through 8/31/2022. But I used the vaccine rates as of then as an indicator of tendency to follow public health recommendations.
Here is the end of the story: There is no support for the idea that States with more rural area are worse off because of that. I can say that because one of the factors I looked at is population density. And to the extent that there is an effect of population density, greater population density is associated with a higher death rate. The single most important factor isn't something like percent of population 65 or older. It's poverty. And the second most important factor is boosted rate as of 8/31/2022 (which I am viewing as an indicator of the extent to which people in a State follow public health recommendations).
I looked at factors that might influence State by State COVID-19 death rates. BTW let me digress here by saying that, when you do the associations and see how things work out as expected, you can see how ridiculous it is to think that the death rates data are way off. They clearly are not.
OK. So anyway, a big part of my job is analyzing data on the impact of various factors on environmental pollution indicator levels. I just used the approach I use for that. I looked at the impact of various factors on State by State death rates as of 7 pm CST today. Here is a correlation matrix:
You can already see the end of the story. The yellow highlighted coefficients are significant at 95% confidence. You can see that the coefficients with the two highest absolute values for association with Death Rate are Poverty Rate and Boosted Rate. The positive correlation with Poverty Rate says Death Rate tends to be higher when Poverty Rate is higher. The negative correlation with Boosted Rate says Death Rate tends to be higher when Boosted Rate is lower.
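For anyone who wants to check a correlation like this themselves, here is a minimal sketch of how a Pearson coefficient and its 95% significance cutoff can be computed. The numbers in it are made up for illustration and are not the State data, and the count of 50 States (48 degrees of freedom) is an assumption, since the matrix may or may not include DC or territories.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def critical_r(n, t_crit):
    """Smallest |r| that is significant, given the two-sided critical t for n - 2 df."""
    df = n - 2
    return t_crit / math.sqrt(t_crit ** 2 + df)

# Assumption: 50 States, so df = 48; the two-sided 95% critical t is about 2.011.
R_CUTOFF_50 = critical_r(50, 2.011)

# Made-up illustrative numbers, NOT the real State data.
poverty = [9.0, 11.5, 13.0, 15.5, 18.0]
deaths = [250.0, 300.0, 310.0, 380.0, 400.0]
r = pearson_r(poverty, deaths)

print(f"cutoff |r| for n=50: {R_CUTOFF_50:.3f}")
print(f"illustrative r: {r:.3f}")
```

With about 50 observations, any coefficient whose absolute value is above roughly 0.28 clears the 95% bar, which is why a fairly modest-looking correlation can still be highlighted.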
I did some multiple regression models. First I did one leaving Boosted Rate out. I used the same process I use to eliminate variables in my job. And, in the end, the only variables remaining are Population Density and Poverty Rate. Both associations are positive. Higher Population Density tends to mean higher Death Rate. Again: That contradicts the idea that rural is worse. Higher Poverty Rate also means higher Death Rate.
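The exact variable elimination process isn't spelled out here, so the following is only a sketch of one common approach, backward elimination: fit the full model, drop the predictor with the weakest t-statistic, and refit until everything left clears a cutoff. The cutoff of 2.0, the variable names, and the synthetic data are all assumptions for illustration.

```python
import numpy as np

def ols_t_stats(X, y):
    """Fit y = b0 + X b by least squares; return coefficients and t-statistics."""
    n = X.shape[0]
    A = np.column_stack([np.ones(n), X])           # add an intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    dof = n - A.shape[1]                           # residual degrees of freedom
    sigma2 = resid @ resid / dof                   # residual variance estimate
    cov = sigma2 * np.linalg.inv(A.T @ A)          # coefficient covariance
    return beta, beta / np.sqrt(np.diag(cov))

def backward_eliminate(X, y, names, t_cutoff=2.0):
    """Drop the predictor with the smallest |t| until all remaining reach t_cutoff."""
    keep = list(range(X.shape[1]))
    while keep:
        _, t = ols_t_stats(X[:, keep], y)
        abs_t = np.abs(t[1:])                      # skip the intercept
        weakest = int(np.argmin(abs_t))
        if abs_t[weakest] >= t_cutoff:
            break
        keep.pop(weakest)
    return [names[i] for i in keep]

# Synthetic data, NOT the State data: two real effects plus one pure-noise column.
rng = np.random.default_rng(0)
n = 50
X = rng.normal(size=(n, 3))
y = 2.0 * X[:, 0] + 1.5 * X[:, 1] + rng.normal(scale=0.5, size=n)

selected = backward_eliminate(X, y, ["poverty", "density", "noise"])
print(selected)
```

A p-value threshold would be the more usual stopping rule; a fixed |t| cutoff is used here only to keep the sketch free of extra dependencies.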
But when I left Boosted Rate out, the diagnostic site I used said that the assumption of equal variances of residuals is violated. And it said that one possible reason for that is that I left out an important variable.
So I added Boosted Rate at the start of the process. After doing that, the final model included Percent 65 and Older, Population Density, Poverty Rate, and Boosted Rate. Poverty Rate appears to be the most important factor and Boosted Rate appears to be the second most. And once Boosted Rate was in the model, the same diagnostic site that had suggested an important variable was missing said that the assumption of equal variances of residuals is good to go.
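Which test the diagnostic site runs isn't stated, so as an assumption, here is a sketch of one standard check for equal residual variances, the Breusch-Pagan test in its n times R-squared form: regress the squared residuals on the predictors and compare the statistic to a chi-square critical value. The data are synthetic, built so the noise grows with the predictor.

```python
import numpy as np

def fit_residuals(A, y):
    """Residuals from a least-squares fit of y on the columns of A."""
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return y - A @ beta

def breusch_pagan_lm(X, y):
    """LM statistic: n * R^2 from regressing squared residuals on the predictors."""
    n = X.shape[0]
    A = np.column_stack([np.ones(n), X])           # intercept plus predictors
    sq_resid = fit_residuals(A, y) ** 2
    aux_resid = fit_residuals(A, sq_resid)         # auxiliary regression
    ss_res = aux_resid @ aux_resid
    ss_tot = np.sum((sq_resid - sq_resid.mean()) ** 2)
    return n * (1.0 - ss_res / ss_tot)

# Chi-square critical value at 95% confidence with 1 degree of freedom
# (one predictor in the auxiliary regression).
CHI2_95_DF1 = 3.841

# Synthetic data, NOT the State data: the noise scale is proportional to x,
# so the equal-variance assumption is violated by construction.
rng = np.random.default_rng(1)
n = 200
x = rng.uniform(1.0, 10.0, size=n)
y_unequal = x + x * rng.normal(size=n)

lm_unequal = breusch_pagan_lm(x.reshape(-1, 1), y_unequal)
print(f"LM = {lm_unequal:.1f}, violated = {lm_unequal > CHI2_95_DF1}")
```

A statistic above the critical value flags unequal variances. A missing predictor is one of the things that can produce that pattern, because its effect ends up in the residuals.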
I swear, people who express some of the views I see on this site just have no clue. It is obvious that following public health recommendations is the smart thing to do. It is also obvious that the deaths data are reliable. I mean really, really obvious.