Is COVID-19 sensitive to weather?
Study 1: data by country
Countries are not affected by COVID-19 to the same degree
Some countries have declared 0 deaths; in others there are tens of thousands of deaths
Why such differences between countries?
For each country, we have collected the following data:
- number of inhabitants
- population density
- wealth per inhabitant
- average life span (from 52 years in Angola to 84 years in Japan)
- average age (from 16 years in Chad to 43 years in Italy or Japan)
- quality of health system
- freedom of the press (from 0 in North Korea to 77 in Sweden)
- average temperature in April 2020, in the country's largest city
- number of deaths (declared!) from COVID-19, per million inhabitants, in April and May
Let's analyze the correlation matrix
(A correlation matrix is a table showing correlation coefficients between variables. Each cell in the table shows the correlation between two variables. A correlation matrix is used to summarize data, as an input to a more advanced analysis, and as a diagnostic for advanced analyses.
The correlation coefficient ranges from -1 to +1.
+1 describes a perfect positive correlation
-1 describes a perfect negative correlation
0 means no linear correlation)
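As a rough illustration of how such a matrix is computed, here is a minimal Python sketch using only the standard library. The column names and the toy values are made up for the example and are not the data of this study; the helper names `pearson` and `correlation_matrix` are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def correlation_matrix(columns):
    """Map every pair of column names to its correlation coefficient."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Made-up toy values, for illustration only (not the study's data).
toy = {
    "average_age": [18.0, 25.0, 31.0, 38.0, 43.0],
    "temperature": [30.0, 27.0, 20.0, 14.0, 12.0],
    "deaths_per_million": [1.0, 4.0, 30.0, 120.0, 300.0],
}

matrix = correlation_matrix(toy)
for name, row in matrix.items():
    # One row of the matrix per variable, coefficients printed with their sign.
    print(name.ljust(20), "  ".join(f"{row[other]:+.2f}" for other in matrix))
```

Each variable is perfectly correlated with itself, so the diagonal of the matrix is always +1, and the matrix is symmetric.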
Four features are highly correlated with one another: wealth per inhabitant, average life span, average age and quality of the health system.
Indeed, in a prosperous country, people live a long time, the average age is high and the quality of the health system is better
The number of declared deaths per million inhabitants is strongly correlated with these four features.
That may sound contradictory, but the better the health care system, the more deaths from COVID-19. It's a side effect.
A good health care system leads to a long life span and a high percentage of older people, which in turn leads to an increased death rate from COVID-19
Freedom of the press and the number of deaths from COVID-19 are positively correlated: the less freedom of the press, the lower the declared death rate
Temperature and wealth per inhabitant are negatively correlated: the poorer a country, the hotter its weather
Temperature is also negatively correlated with the number of deaths
But beware: an effect is correlated with its cause, but two effects of the same cause are also correlated with each other
1) Does an increase in temperature reduce the number of deaths?
2) Or are there fewer deaths in hot countries because they are poor countries with a young population and a short life span?
We cannot answer that question by analysing our correlation matrix alone.
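The trap described above can be reproduced on synthetic data. In the sketch below, which is an illustration and not part of the study, the average age drives both the temperature and the number of deaths, while temperature has no effect at all on deaths. All numbers are invented. The raw correlation between temperature and deaths still comes out clearly negative. A partial correlation (a standard statistical adjustment, swapped in here for the example and not the method used in this study) removes the linear effect of age and brings the coefficient back close to zero. On real data, with several intertwined factors and non-linear effects, such a simple adjustment is not enough.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y once the linear effect of z is removed."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz**2) * (1 - r_yz**2))

rng = random.Random(0)  # fixed seed so the run is repeatable
ages, temperatures, deaths = [], [], []
for _ in range(2000):  # 2000 synthetic "countries", not real data
    age = rng.gauss(30, 8)
    ages.append(age)
    # Assumption of the toy model: older countries tend to be colder.
    temperatures.append(40 - 0.7 * age + rng.gauss(0, 4))
    # Assumption of the toy model: deaths depend on age only, never on temperature.
    deaths.append(5 * age + rng.gauss(0, 30))

r_temp_deaths = pearson(temperatures, deaths)
r_temp_age = pearson(temperatures, ages)
r_deaths_age = pearson(deaths, ages)
r_partial = partial_correlation(r_temp_deaths, r_temp_age, r_deaths_age)

print(f"raw correlation temperature/deaths:      {r_temp_deaths:+.2f}")
print(f"same correlation, controlling for age:   {r_partial:+.2f}")
```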
We need to go deeper: we have to train a machine learning algorithm
Gradient boosting is widely regarded as one of the most reliable machine learning algorithms for tabular data like ours
We have trained the algorithm with our dataset
The algorithm has given us the features it considered important to explain the differences in the number of COVID-19 deaths between countries.
The most important of our variables is the average age of the population.
The temperature has virtually zero impact, and is not retained by the algorithm.
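To give an idea of how a gradient boosting model ranks features, here is a deliberately tiny from-scratch sketch in Python. It is not the implementation used for this study (which is not specified here): it boosts one-split trees ("stumps") on squared error and measures importance as each feature's share of the total error reduction. The data is synthetic, and the function names `best_stump` and `fit_boosting` are hypothetical. In this toy world deaths depend on age only, and temperature is merely correlated with age.

```python
import random

def best_stump(rows, residuals):
    """Find the single-feature threshold split that most reduces squared error."""
    n = len(rows)
    total = sum(residuals)
    best = None  # (gain, feature, threshold, left_mean, right_mean)
    for f in range(len(rows[0])):
        order = sorted(range(n), key=lambda i: rows[i][f])
        left_sum = 0.0
        for k in range(1, n):
            left_sum += residuals[order[k - 1]]
            lo, hi = rows[order[k - 1]][f], rows[order[k]][f]
            if lo == hi:
                continue  # cannot split between two equal values
            right_sum = total - left_sum
            # Reduction in squared error compared with predicting one overall mean.
            gain = left_sum**2 / k + right_sum**2 / (n - k) - total**2 / n
            if best is None or gain > best[0]:
                best = (gain, f, (lo + hi) / 2, left_sum / k, right_sum / (n - k))
    return best

def fit_boosting(rows, targets, n_rounds=50, learning_rate=0.1):
    """Boost stumps on squared loss; return each feature's share of total gain."""
    n = len(rows)
    predictions = [sum(targets) / n] * n
    gains = [0.0] * len(rows[0])
    for _ in range(n_rounds):
        # For squared loss, the negative gradient is simply the residual.
        residuals = [t - p for t, p in zip(targets, predictions)]
        stump = best_stump(rows, residuals)
        if stump is None:
            break
        gain, f, threshold, left_mean, right_mean = stump
        gains[f] += gain
        for i, row in enumerate(rows):
            step = left_mean if row[f] <= threshold else right_mean
            predictions[i] += learning_rate * step
    total_gain = sum(gains) or 1.0
    return [g / total_gain for g in gains]

rng = random.Random(0)  # fixed seed so the run is repeatable
rows, targets = [], []
for _ in range(300):  # 300 synthetic "countries", not real data
    age = rng.uniform(16, 43)
    temperature = 40 - 0.7 * age + rng.gauss(0, 5)  # correlated with age
    rows.append([age, temperature])
    targets.append(5 * age + rng.gauss(0, 10))  # depends on age only

feature_importance = fit_boosting(rows, targets)
for name, share in zip(["average_age", "temperature"], feature_importance):
    print(f"{name}: {share:.3f}")
```

Even though temperature is correlated with deaths in this synthetic data, the model prefers to split on age, so almost all of the importance goes to age. Production libraries use deeper trees and more refined importance measures, but the principle is the same.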
Conclusion: the temperature has no significant impact on the evolution of the pandemic. What a pity!
Our algorithm can be improved by adding other features to our dataset: we plan to do that in the next few days
Our data sources:
data about COVID-19: https://www.jhu.edu
freedom of the press: https://rsf.org/fr/classement
quality of health system: https://fr.april-international.com/fr/sante-des-expatries/quels-sont-les-pays-avec-les-meilleurs-systemes-de-sante
average life span: https://fr.wikipedia.org/wiki/Liste_des_pays_par_esp%C3%A9rance_de_vie
average age, by country: https://fr.wikipedia.org/wiki/Liste_des_pays_par_%C3%A2ge_m%C3%A9dian
wealth by inhabitant, by country: https://fr.wikipedia.org/wiki/Liste_des_pays_par_PIB_(PPA)_par_habitant
population density: https://fr.wikipedia.org/wiki/Liste_des_pays_par_densit%C3%A9_de_population
number of inhabitants: https://fr.wikipedia.org/wiki/Liste_des_pays_par_population
temperature in April 2020: https://fr.tutiempo.net/
Study 2: data by US county
For each of the 3242 counties in the US, we have collected data on:
- number of inhabitants
- density of population
- distribution by age group
- percentage of graduates
- containment index based on Google Mobility data
- mean temperature in March 2020
- mean humidity (in percent) in March 2020
- evolution of the epidemic: number of cases and number of deaths in March/April 2020
We have trained our algorithms on our data sets.
The objective is to identify the features which are important and which have an impact on the evolution of the epidemic.
We have been careful to avoid biases.
Our data sources:
Bad news: there is no correlation between temperature and the speed at which the COVID-19 epidemic evolves.
The United States is a very large country. In March 2020, a vast range of temperatures could be found there: between 20 °F and 90 °F
The same goes for humidity and atmospheric pressure: there is no correlation
We have built a correlation matrix heat map: the correlation coefficients between the weather and the evolution of the epidemic are insignificant