Prolific Bloggers on COVID-19 and the Pace of Posts

Rees Morrison
Aug 5, 2020

This article continues my analysis of blog posts that use R to explore COVID-19 data. The five previous articles in the series address the timing of posts, the roles of the posters and their countries; the R packages employed; the data sources and math tools used; the topics written about; and the R and COVID-19 terms used.

For this sixth article, which is based on the 215 posts collected so far, let’s look at the timing of prolific posters. During January and February 2020, news was seeping out about the novel coronavirus, but quantitative data about positive cases and deaths were sparse. The earliest English-language blog post I have found that uses R to investigate what has become known as COVID-19 was published by Holger Von Jouanne-Diedrich in the fifth week of the year (February 4). The second post, by Patrick Tung, appeared about a week later, followed by a post by William Briggs on Feb. 13 and one by Tim Churches on Feb. 17.
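Week numbers depend on the counting convention, and the article does not say which one was used. The base R sketch below is only an illustration of two common conventions applied to the date of the earliest post; the variable names are hypothetical.

```r
# Date of the earliest post mentioned above
first_post <- as.Date("2020-02-04")

# Convention 1 (an assumption): consecutive 7-day blocks counted from January 1.
# POSIXlt stores the day of the year starting at 0, so no offset is needed.
week_blocks <- as.POSIXlt(first_post)$yday %/% 7 + 1

# Convention 2: ISO 8601 weeks, which start on a Monday
week_iso <- as.integer(format(first_post, "%V"))

c(blocks = week_blocks, iso = week_iso)
```

Under the seven-day-block convention February 4 falls in week 5, which agrees with the text, whereas the ISO convention places it in week 6.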

Among the 16 bloggers who have so far published at least three posts, the plot below shows a spurt of activity from mid-March through April. Those weeks, numbered 12 through 15, are speckled with entries by this group. As an example of how to read the plot, Arthur Charpentier, at the bottom of the reverse alphabetical y-axis, has published a total of four posts, three of them in March and one in May, but has not subsequently returned to the topic. Note that the text boxes with the months are only approximately positioned on the plot.

Adding nuance, the color of the points corresponds to the work role of the blogger as explained in the legend at the bottom. It is immediately apparent that professors (blue) and academic researchers (red) predominate in this group of bloggers. If you include the postgraduate students, universities writ large account for nearly all of the prolific bloggers.
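For readers who want to build a similar chart, here is a minimal sketch with ggplot2. It is not the code behind the plot above: the data frame, blogger names, weeks, and roles are all invented for illustration, and only the threshold of three posts comes from the text.

```r
library(ggplot2)

# Hypothetical data: one row per post (names, weeks, and roles are invented)
posts <- data.frame(
  blogger = c("Blogger A", "Blogger A", "Blogger A", "Blogger A",
              "Blogger B", "Blogger B", "Blogger B",
              "Blogger C", "Blogger C"),
  week = c(11, 12, 13, 20, 12, 14, 15, 13, 16),
  role = c(rep("Professor", 4), rep("Researcher", 3), rep("Consultant", 2)),
  stringsAsFactors = FALSE
)

# Keep only bloggers with at least three posts
post_counts <- table(posts$blogger)
prolific_names <- names(post_counts[post_counts >= 3])
prolific <- posts[posts$blogger %in% prolific_names, ]

# ggplot2 draws the first factor level at the bottom of the y-axis,
# so alphabetical levels read in reverse alphabetical order from the top
prolific$blogger <- factor(prolific$blogger,
                           levels = sort(unique(prolific$blogger)))

# One point per post, colored by role, with the legend at the bottom
p <- ggplot(prolific, aes(x = week, y = blogger, colour = role)) +
  geom_point(size = 3) +
  labs(x = "Week of 2020", y = NULL, colour = "Role") +
  theme(legend.position = "bottom")
p
```

In this toy data, Blogger C has only two posts and therefore drops out of the plot.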

The paired plots below compare the explosion of COVID-19 cases around the world (on the left, based on data from Johns Hopkins University) to the cumulative number of blog posts in my current data set. Obviously, the scales are wildly different, but the general shapes match. It does seem, however, that during the most recent weeks the pace of blogging has not kept up with the pace of positive cases. By the way, I created this side-by-side comparison with the patchwork package because dual-axis plots are frowned upon by the R community.
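A side-by-side layout of this kind can be sketched with patchwork as follows. The two series are synthetic placeholders, not the Johns Hopkins counts or my blog post data, and the object names are hypothetical.

```r
library(ggplot2)
library(patchwork)

# Synthetic placeholder series (NOT real case counts or real post counts)
weeks <- 5:31
cases <- data.frame(
  week = weeks,
  cumulative = cumsum(seq_along(weeks)^3 * 100)
)
blog_posts <- data.frame(
  week = weeks,
  cumulative = cumsum(rep(c(2, 12, 6), each = 9))
)

p_cases <- ggplot(cases, aes(x = week, y = cumulative)) +
  geom_line() +
  labs(title = "Cumulative cases (synthetic)", x = "Week", y = NULL)

p_posts <- ggplot(blog_posts, aes(x = week, y = cumulative)) +
  geom_line() +
  labs(title = "Cumulative blog posts (synthetic)", x = "Week", y = NULL)

# patchwork's `|` operator places the two plots next to each other,
# and each panel keeps its own y-axis scale
combined <- p_cases | p_posts
combined
```

Because each panel has its own y-axis, the shapes of the two curves can be compared without forcing wildly different scales onto a single dual-axis plot.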

Will the pace of blog posts keep up with the pace of positive cases? Quite possibly, because more real-time data sources will be mined and more plentiful data will generally allow more nuanced investigations. Also, as with all the articles in this series, qualifying blog posts written in English are likely still out there that I have not located. If any readers know of such blog posts, or if anyone would like the data set, please leave a comment or email me at rees (at) reesmorrison (dot) com.
