epiforecasts / covid
Temporal variation in transmission during the COVID-19 outbreak
Home Page: https://epiforecasts.io/covid/
License: MIT License
I note that for a day or two the projections for New Zealand have appeared to be very high. The predicted range for the reproduction number extends up to 75, and the recorded cases are dwarfed by the predicted range.
I've attached an image to illustrate.
(Niger and Palestine may be similar.)
The doubling time plot also looks odd.
I have not personally checked the number of data points and how that might impact the computations.
Put the date on which the report was generated somewhere at the top of each page, as well as "using data up to 2020-xx-xx" - the Rt estimates "as of..." can give the impression that it's outdated.
The website was updated every ~2 days for a while, until about a week ago, nothing since. I suggest adding a clear indication on the website with the update frequency/schedule.
You have mixed up total cases with daily case load. The R0 plots for some states are clearly wrong: Hawaii, Montana, Alaska.
To do: global/nowcast to be based on regional_rt_pipeline rather than the current custom set up; EpiNow (broadly the same).
Ref: https://epiforecasts.io/covid/posts/national/united-states/ Figure 3
First, the rate of growth should be properly defined. If the serial interval is assumed to be constant, then the rate of growth and the effective reproduction number are equivalent, since each determines the other. So the rate of growth needs a proper time frame. IIRC it is the daily growth rate: in the beginning we had ~25% more infections each day.
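To make the time frame explicit, here is a minimal sketch of the arithmetic. It assumes a fixed serial interval, in which case R = exp(r * T) for a per-day exponential growth rate r; the 4-day serial interval is a hypothetical value for illustration only, not one taken from the site.

```python
import math


def growth_rate_from_daily_increase(daily_increase):
    """Per-day exponential growth rate r from a fractional daily increase (0.25 = 25%)."""
    return math.log(1.0 + daily_increase)


def doubling_time_days(r):
    """Doubling time in days for a positive per-day growth rate r."""
    return math.log(2.0) / r


def reproduction_number_fixed_interval(r, serial_interval_days):
    """R under the simplifying assumption of a constant serial interval."""
    return math.exp(r * serial_interval_days)


r = growth_rate_from_daily_increase(0.25)  # ~25% more infections each day
doubling = doubling_time_days(r)
# Hypothetical 4-day serial interval, for illustration only.
r_number = reproduction_number_fixed_interval(r, 4.0)
```

With a different assumed serial interval the same growth rate maps to a different reproduction number, which is why the time frame needs stating.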
I have not fully understood how the results of Fig. 3C on the coefficient of determination were calculated. Does it show an evaluation on A or B? The caption says "with values closer to 1 indicating a better fit", so I would doubt the whole calculation if R squared goes negative.
Overall I think this is great. Comments below are meant as helpful suggestions, not eviscerating the work that's been done.
Agree with @kathsherratt in #7 on colour. Greys are too similar, and I'm not sure that blue adequately represents that there are increasing cases. I don't think red-yellow-green is the right palette to use but colorbrewer's 3-class RdYlBu scheme is broadly interpretable as "bad", "not so bad", "good". The five-class scheme can be used for increasing, likely increasing, unsure, likely decreasing, decreasing, with grey reserved for "No data".
Equirectangular map projection is better than Mercator but you can probably ditch Antarctica and consider a separate plot for each of the World Bank's continent definitions, optionally splitting the Americas into South and "North and Central". There's too big a difference in variation in country area (both true and distorted) to just show all nations on the one map.
Figures 2 and 3 are missing, or relabelling has broken.
I assume that "likely increasing" has a median R0 > 1 but that 1 is outside the 50% and inside the 90% interval. This isn't confirmed on the page, though, and makes it a little difficult to interpret figures 1 and 4.
Using a different colour for the R0 estimates and the number of cases by date of infection would help make it clear to the reader that figures 5 (and 7) and 6 (and 8) show different things. The consistency of style is great, but the colouring can help identify that they're different. Please don't reuse the colours from the likely increasing/decreasing when doing this, though.
Caption for table 2 should go above, and might be worth putting in the note that when the doubling time estimate is "cases decreasing" that this corresponds to cases no longer doubling and hence the doubling time is effectively infinite. One way we got around this with the outbreak delay paper was to say "at least 4.5 days" rather than "4.5 - outbreak delayed".
Really great work.
Hawaii is missing. You might want dedicated pages for each state so that people can see bigger charts.
I have been working on interactive maps and data visualisations consulting with @seabbs. The vis are interactive svgs written with d3.js, and packaged as html widgets for inclusion in the .Rmd document.
There is a sample vis here.
Can I request 2 files to make this more straightforward and reliable?
rt.csv is working well to generate the r0 plot for each country. Could we have a similar file for the nowcasts and a summary file for country classifications?
For nowcasts:
Proposed file columns:
country: country name - using the same country names as rt.csv
date: date
median: nowcast median
lower_90: nowcast lower 90% CI - currently named 'bottom'
upper_90: nowcast upper 90% CI - currently named 'top'
lower_50: nowcast lower 50% CI - currently named 'lower'
upper_50: nowcast upper 50% CI - currently named 'upper'
cases: number of cases on each date - all NA values set to 0
(assuming that the original columns are 50% and 90% CIs)
This would make the format of the new nowcast csv file the same as rt.csv.
For individual country visualisations, these datasets could just be subset for each country and we could output static svgs with the same styles.
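The renaming proposed above could look roughly like this. It is only a sketch: the input column names are the ones quoted in this issue ('bottom', 'top', 'lower', 'upper'), while the sample row values and the function names are made up for illustration.

```python
import csv
import io

# Current nowcast column names (as quoted in this issue) -> proposed names.
RENAME = {"bottom": "lower_90", "top": "upper_90", "lower": "lower_50", "upper": "upper_50"}
COLUMNS = ["country", "date", "median", "lower_90", "upper_90", "lower_50", "upper_50", "cases"]


def to_proposed_format(rows):
    """Rename the interval columns and set missing case counts to 0."""
    converted = []
    for row in rows:
        renamed = {RENAME.get(key, key): value for key, value in row.items()}
        if renamed.get("cases") in (None, "", "NA"):
            renamed["cases"] = "0"
        converted.append({column: renamed.get(column, "") for column in COLUMNS})
    return converted


def to_csv_text(rows):
    """Serialise rows with the proposed column order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample row, for illustration only.
sample = [{"country": "New Zealand", "date": "2020-04-01", "median": "12",
           "bottom": "3", "top": "40", "lower": "8", "upper": "20", "cases": "NA"}]
converted = to_proposed_format(sample)
csv_text = to_csv_text(converted)
```

Subsetting for a single country is then a plain filter on the country column.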
For summary map:
Could we output a summary csv file of the classification of each country.
Proposed file columns:
country: country name - using the same country names as rt.csv and the above file.
trajectory: with 6 coded values - decreasing, likely_decreasing, unsure, increasing, likely_increasing, no_data
This would allow for a quick join of the map data to each day's classification and then some styles.
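As a rough sketch of that join, assuming the summary file has exactly the two proposed columns: the function name and sample rows below are hypothetical, and countries on the map that are absent from the summary fall back to no_data.

```python
TRAJECTORIES = {
    "decreasing", "likely_decreasing", "unsure",
    "increasing", "likely_increasing", "no_data",
}


def join_trajectories(map_countries, summary_rows):
    """Attach a trajectory code to every country drawn on the map."""
    lookup = {}
    for row in summary_rows:
        if row["trajectory"] not in TRAJECTORIES:
            raise ValueError(f"unknown trajectory code: {row['trajectory']!r}")
        lookup[row["country"]] = row["trajectory"]
    # Countries without a row in the summary are treated as having no data.
    return {country: lookup.get(country, "no_data") for country in map_countries}


# Made-up rows, for illustration only.
summary = [
    {"country": "Italy", "trajectory": "likely_decreasing"},
    {"country": "Brazil", "trajectory": "increasing"},
]
joined = join_trajectories(["Italy", "Brazil", "Kosovo"], summary)
```

This only works cleanly if the country names match across rt.csv, the nowcast file and the map data, which is the reason for asking for consistent names.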
Thanks, let me know what you think.
Make it very clear on all pages that estimates are impacted by changes in testing and reporting. Running out of tests is a particular issue until a new reporting equilibrium is arrived at.
For color / labelling scheme:
Some minor and non-critical comments:
Global summary page:
Methods page:
Do you have information about the COVID-19 R0 for Venezuela?
https://epiforecasts.io/covid/posts/national/united-kingdom/ shows regional halving times all -29 or higher, but national halving time is shown as -72 (-460 – -39)
I haven't checked the calculations in the code, but that appears to suggest the two are calculated in different ways?
The page containing data for Brazil doesn't make it clear which estimates use data up to 2020-04-14 and which use data up to 2020-04-24.
Multiple reviewers have flagged colour palette as an issue for the map and summary plot. In this meta-thread please battle out your colour palette choices. (I will then choose the survivor)
From @samclifford: Agree with @kathsherratt in #7 on colour. Greys are too similar, and I'm not sure that blue adequately represents that there are increasing cases. I don't think red-yellow-green is the right palette to use but colorbrewer's 3-class RdYlBu scheme is broadly interpretable as "bad", "not so bad", "good". The five-class scheme can be used for increasing, likely increasing, unsure, likely decreasing, decreasing, with grey reserved for "No data".
From @kathsherratt: Figure 1: “Likely Increasing” colour grey > looks a bit too close to NA to me - could change to e.g. light blue - or to a sequential scale so that all three values are in colour order (Increasing > Likely increasing > Unsure)
From @jhellewell14 : Pick better colours to correspond to Increasing, Likely increasing etc. and sync them with all other plots (begun this in a branch)
From @pearsonca: there's no apparent gradient currently from increasing -> likely increasing -> unsure, but there is conceptually; worth having a gradient in the color scheme?
the NAs need to be more distinct from other colors
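For reference, one way to write down the five-class RdYlBu suggestion as a lookup. The hex codes are the ColorBrewer 5-class RdYlBu values; which end maps to "Increasing" and the particular grey for "No data" are assumptions for illustration, not a decision.

```python
# ColorBrewer 5-class RdYlBu, ordered from red to blue.
RDYLBU_5 = ["#d7191c", "#fdae61", "#ffffbf", "#abd9e9", "#2c7bb6"]

# Assumed mapping: red end = increasing, blue end = decreasing.
CATEGORIES = ["Increasing", "Likely increasing", "Unsure",
              "Likely decreasing", "Decreasing"]
PALETTE = dict(zip(CATEGORIES, RDYLBU_5))

# Hypothetical grey reserved for missing data, kept outside the gradient.
NO_DATA_COLOUR = "#bdbdbd"


def colour_for(category):
    """Return the fill colour for a category, falling back to the no-data grey."""
    return PALETTE.get(category, NO_DATA_COLOUR)
```

Keeping the mapping in one place would also make it easier to sync the colours across all plots.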
It looks like the ECDC and Johns Hopkins data have different case counts, leading to different numbers at the national and state scales in the USA.
Highlight the impact of changes in testing, testing saturation and general step changes in testing on all pages.
“new infections” - is possibly a misleading label. It’s the number of new infections that ultimately get confirmed (which means something different in every country). Perhaps call it “New cases by infection date” and in the figure labels like Fig. 6 on “Global” slightly rephrase to “Cases by date of report and their estimated date of infection”.
big reproduction number plot (global) - would this look bad if they were all on the same time scale (probably, because of China)?
Latest estimates table: I’d remove “new infections” for the reasons given above, unless we can come up with a better term
It would be nice if the tables had sortable headings, e.g. using https://rstudio.github.io/DT/
plot_summary from EpiNow is showing a small y on the axis when it should not. Remove for next update.
Two of Italy's regions are missing both in the map and the table:
At the time of writing, Table 1 on https://epiforecasts.io/covid/posts/global/ has some bizarre entries, such as Australia's doubling time of -71 (7 – -6), or Austria's of 440 (15 – -16). Belgium has -14 (-22 – -10), which indicates the issue might be the way doubling/halving times are reported back to the table when the interval contains 0.
Cameroon is 110 (15 – -21) and the estimate is not contained within the interval at all. Same with Cote d'Ivoire, 51 (9.7 – -16). Croatia is -10 (15 – -3.8). Bahrain is 200 (17 – -20). These are reflected in the national summaries, e.g. https://epiforecasts.io/covid/posts/national/cameroon/ and it's clear that there's something wrong when converting from growth rates to doubling/halving times.
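To illustrate why the conversion misbehaves, here is a minimal sketch. It assumes the doubling time is computed as ln(2) / r from a per-day growth rate r; the function names are hypothetical and this is not the code used on the site.

```python
import math


def doubling_time(r):
    """Doubling time in days; negative values are halving times, zero growth gives infinity."""
    if r == 0:
        return math.inf
    return math.log(2.0) / r


def describe(r):
    """Human-readable summary that never prints a negative 'doubling time'."""
    if r > 0:
        return f"doubling every {math.log(2.0) / r:.1f} days"
    if r < 0:
        return f"halving every {math.log(2.0) / -r:.1f} days"
    return "no growth (infinite doubling time)"


def interval_spans_zero(r_lower, r_upper):
    """True when the growth-rate interval includes both growth and decline."""
    return r_lower < 0 < r_upper


# Illustrative growth rates only, not estimates from the site.
summary = (describe(-0.05), describe(0.0), describe(0.05))
```

Because ln(2) / r jumps from minus infinity to plus infinity as r crosses zero, a growth-rate interval that spans zero does not map to a single connected interval of doubling times, and a small positive point estimate maps to a very long doubling time that sits outside the converted bounds. Reporting the interval on the growth-rate scale, or in words as above, avoids that.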
It is missing with the 'estimates as of the 2020-03-29' / 'Using data available up to the: 2020-04-08' run.
But was available with the 'up to 2020-04-03(?)' run.
Question: why is Hawaii excluded from all your research on COVID?
Some regions/countries take much longer to be simulated than others. It might make sense to run all regions sequentially with parallelisation within each region. For example, the USA regional breakdown has one state that takes 3 times longer to run than any other; during this time, all cores except one are idle.
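A back-of-the-envelope comparison of the two scheduling options, as a sketch only. It assumes each region is either pinned to a single core (assigned longest first to the least-loaded core) or split perfectly across all cores; the runtimes and core count are made-up numbers, and real within-region scaling will be worse than the ideal.

```python
def makespan_across_regions(runtimes, cores):
    """Wall time when each region runs on one core, longest regions assigned first."""
    loads = [0.0] * cores
    for runtime in sorted(runtimes, reverse=True):
        least_loaded = loads.index(min(loads))
        loads[least_loaded] += runtime
    return max(loads)


def makespan_within_regions(runtimes, cores):
    """Wall time when regions run one after another, each split evenly over all cores."""
    return sum(runtime / cores for runtime in runtimes)


# Made-up example: one region takes 3 time units, seven others take 1 each.
runtimes = [3.0] + [1.0] * 7
across = makespan_across_regions(runtimes, cores=4)
within = makespan_within_regions(runtimes, cores=4)
```

In this toy case the single slow region sets the wall time when parallelising across regions, while idealised within-region parallelism finishes sooner. The gap grows as the slow region gets slower relative to the rest.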
Changes upstream mean that infinite doubling times are not being properly caught. Review and fix.
It looks like page builds may sporadically be failing (see #31). Track down, isolate and fix.
The map in Figure 1 of https://epiforecasts.io/covid/posts/national/germany/ does not match the classifications given in figure 4 / table 2.
In Covid-19: Global summary, figures are misnumbered. There are two "Figure 1" figures.
On:
https://epiforecasts.io/covid/posts/national/united-states/
Alaska and Hawaii do not appear in your list of states at the bottom of the page.
Hosted in the EpiNow repo: epiforecasts/EpiNow#91
Bringing to your attention that Kosovo is not in Africa, but rather in Europe. Kindly rectify accordingly.
This looks great.
Comments: The ribbons showing cases by date of infection are beautiful but to me are a little tricky to interpret. I hate to say this, but would this maybe be clearer as a geom_pointrange (with a thicker line for 50% CrI?)
The title "Summary of latest reproduction number and case count estimates by date of infection" to me was confusing as the plot is not by date.
Can there be more detail on the difference between wide CrI and the measure of uncertainty that gets reflected as translucency?
Can you show distributions that go into estimating e.g. time between infection and reporting?
Can I ask about the doubling times which often have an upper bound of infinity, and for which the point estimate sometimes is not within the uncertainty range (e.g. -100 (14 – Inf) for Italy)?
Since the average ICU stay appears to be 23 days, it would be nice if we could see whether the 50%/90% confidence interval was above this 'safe' threshold, meaning that people have left the ICU before new ones arrive.
Great work, thank you for providing this!
Could you please clarify exactly which data source you are using for Germany? The linked source https://github.com/jgehrcke/covid-19-germany-gae provides official data from RKI as well as data "curated" by two large German newspapers. The newspaper source is always more recent but the quality of the curation is debatable. The data plotted in your tool today looks unreliable (it seems to have a gap for the last few days and looks generally inconsistent with the official data). Maybe it would be worth considering switching to the official RKI data? It might even be an option to take the RKI data and augment it with another source for the last three days, where RKI is lagging behind. Please see the screenshots attached.
Thank you
"Your Data" (gap, weird jumps):
"Official Data" (dense, looking very consistent in general except for the weekend effect):
The global page is missing the reporting rate limitation statement ("These results are impacted by changes in testing effort, increases and decreases in testing effort will increase and decrease reproduction number estimates respectively (see Methods for further explanation)."). Add for the next update.
From: https://twitter.com/DogOfPoasts/status/1246869025855963136
Looks like old data is causing duplicates to show. Remove old version.
Report: https://twitter.com/ATabarrok/status/1249677197058678785
Hi,
I have been working in a Brazilian task force for COVID-19.
Would you be interested in adding Brazilian data at the subnational level (adm 1)? I could help point out where the data is and help with translation if needed.
Cheers,
Leo
I was slightly confused when trying to find the licence of this repo: the top-right link "View License" on https://github.com/epiforecasts/covid points to the file LICENSE, which appears to say that it is normal proprietary software:
YEAR: 2020
COPYRIGHT HOLDER: Epiforecasts
However, I then found LICENSE.md, which states it's open source software under the MIT license. Which is great 🙂 The redundant and incomplete / confusing file LICENSE can probably be simply deleted. GitHub will then automatically pick up LICENSE.md for the licensing information.
It looks like German regions are now not mapping correctly. This may be a change in region name processing or maybe a problem higher up the tool chain (NCoVUtils).