covid19canada's Issues

BC and Alberta weekend new case numbers

Neither the BC nor the Alberta new case numbers show weekend data: Saturday and Sunday numbers are merged with Monday, leading to 0s on weekends and inflated Monday counts. Both provinces do, however, report weekend breakdowns, and the data are available via the BC CDC and the Alberta government. Is there any appetite to correct the BC and Alberta weekend numbers?

Add ESRI maps

Add ESRI maps (old SK borders and new SK borders).

BC cases for reporting gaps

The file cases_timeseries_prov.csv could be improved for BC (and perhaps other provinces). The issue is that the province does not issue a report every day: there are gaps. The first report after a gap provides the case counts for each of the days during the gap, as well as the current day.

The problem is that the cases for the gap days are recorded as 0 in the database, and all the cases that occurred during the gap are lumped together into one number on the day of the next report.

Example (from today): If the case counts on (Saturday, Sunday, Monday) are (10, 6, 16) but there was no report issued on Saturday and Sunday, the database values are recorded as (0, 0, 32). It would be better if the cases were properly attributed to the correct days, to be consistent with the official reports.
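A minimal sketch of that correction, assuming the per-day breakdown from the first report after the gap is available as a mapping. The function name `apply_backfill` and the day labels are hypothetical and are not part of the repository:

```python
def apply_backfill(stored, backfill):
    """Return a copy of `stored` with the lumped days replaced by `backfill`.

    stored:   dict of day -> count as recorded in the time series
    backfill: dict of day -> count taken from the first report after the gap
    """
    # The backfilled days must add up to what was lumped onto the report day.
    lumped_total = sum(stored[day] for day in backfill)
    if lumped_total != sum(backfill.values()):
        raise ValueError("backfilled counts do not match the lumped total")
    corrected = dict(stored)
    corrected.update(backfill)
    return corrected


# Values from the example above.
stored = {"Saturday": 0, "Sunday": 0, "Monday": 32}
reported = {"Saturday": 10, "Sunday": 6, "Monday": 16}
print(apply_backfill(stored, reported))
# {'Saturday': 10, 'Sunday': 6, 'Monday': 16}
```

The total check guards against applying a breakdown that disagrees with the number already recorded for the report day.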

If the current team does not have the ability to implement that procedure, I would be willing to make those updates, since I am tracking these numbers for my own purposes.

JSON API

Hi there,

I would like to ask the maintainers if it would be ok if we forked this into an (open, of course) JSON API?

data entry error recovered_cumulative.csv

Dear team,

The cumulative recovered counts for SK for May 31, 2020 (most recent update) should be 582 instead of 682.

see line number 992 on data page, https://github.com/ishaberry/Covid19Canada/blob/master/recovered_cumulative.csv#L992

31-05-2020 | Saskatchewan | 682 <- should be 582

https://www.saskatchewan.ca/government/health-care-administration-and-provider-resources/treatment-procedures-and-guidelines/emerging-public-health-issues/2019-novel-coronavirus/cases-and-risk-of-covid-19-in-saskatchewan

Many thanks,
Kuan

Contributing to Spreadsheet

This is an awesome project. Thanks for doing this.

I've noticed that the data is outdated for my area of Windsor-Essex. I'd love to help contribute to the spreadsheet. I'm a student from the University of Windsor. Would it be possible to get access and contribute to your data set?

Add Canada-wide time series

Add an additional set of time series aggregated to the level of the entire country. Should mirror what is available for the provincial time series:

  • active cases
  • cases
  • mortality
  • recovered
  • testing
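One way the aggregation could look, as a rough sketch. The column names (`date`, `province`, `cases`) and the function name `aggregate_to_canada` are assumptions for illustration and may not match the actual files:

```python
def aggregate_to_canada(prov_rows, value_column):
    """Sum a provincial time series by date into Canada-wide rows.

    prov_rows:    iterable of dicts with 'date' and `value_column` keys
    value_column: name of the numeric column to sum (assumed name)
    """
    totals = {}
    for row in prov_rows:
        # dict keeps first-seen order, so dates stay in input order.
        totals[row["date"]] = totals.get(row["date"], 0) + int(row[value_column])
    return [
        {"province": "Canada", "date": day, value_column: total}
        for day, total in totals.items()
    ]


# Illustrative rows only, not real counts.
rows = [
    {"date": "01-04-2020", "province": "BC", "cases": 5},
    {"date": "01-04-2020", "province": "Alberta", "cases": 7},
    {"date": "02-04-2020", "province": "BC", "cases": 3},
]
print(aggregate_to_canada(rows, "cases"))
```

The same function would apply to each statistic in the list by changing `value_column`.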

Could hospitalization data be added?

At least one province (BC) has started publishing hospitalization numbers. This seems like an important statistic to follow. Could the available data be added to the spreadsheet?

For BC the daily reports are listed on http://www.bccdc.ca/health-info/diseases-conditions/covid-19/case-counts-press-statements. An example report is http://www.bccdc.ca/Health-Info-Site/Documents/BC_Surveillance_Summary_March_27_final.pdf. I've extracted some of the data here https://docs.google.com/spreadsheets/d/1uz-hq7ncFff92iSh63_G-oFew2O1F_2eHOfr1_zdREg/edit?usp=sharing

Ontario testing

Cumulative test numbers in Ontario for Mar-29 or Mar-30 appear incorrect. The number of tests on Mar-29 is higher than on Mar-30, which should not be possible for a cumulative count.
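A check along these lines could flag such cases automatically. This is only a sketch: `find_cumulative_drops` is a hypothetical helper, and the values below are placeholders rather than the actual Ontario figures:

```python
def find_cumulative_drops(rows):
    """Return (previous, current) pairs where a cumulative value decreases.

    rows: list of (label, cumulative_value) tuples in chronological order
    """
    drops = []
    for previous, current in zip(rows, rows[1:]):
        if current[1] < previous[1]:
            drops.append((previous, current))
    return drops


# Placeholder values for illustration only.
series = [("Mar-28", 100), ("Mar-29", 130), ("Mar-30", 120)]
print(find_cumulative_drops(series))
# [(('Mar-29', 130), ('Mar-30', 120))]
```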

Add official datasets adapted for compatibility with CCODWG datasets

Several provinces offer datasets (e.g., CSV files) that we do not use as direct inputs into our dataset. For example, the Ontario and BC datasets use different date schemes than our dataset, rendering them incompatible with our universal date scheme, which is public reporting date.

This addition will solve many previously discussed issues, such as #44 by allowing official datasets to serve as drop-in replacements for portions of our dataset.

The following datasets will be adapted for compatibility with the CCODWG datasets (additional datasets may be added later):

Reported cases/deaths are not using proper health regions

Looking at the health region data, it looks like the names of the regions are incomplete. The full list of health regions is available at https://www150.statcan.gc.ca/n1/pub/82-402-x/2015002/app-ann/ap-an1-eng.htm

Among the values in the data I see "Fraser" but this is ambiguous and can't be mapped to the actual health regions which are:

  • 5921 Fraser East Health Service Delivery Area
  • 5922 Fraser North Health Service Delivery Area
  • 5923 Fraser South Health Service Delivery Area

Hoping this data is available somewhere upstream...

Some more ideas for graphing the data

I created the following from the Johns Hopkins data:

https://rigsomelight.com/canada_covid

It contains some graphing ideas that I got from the Financial Times and talking to friends.

Mainly providing a number of days since first case as an X value and looking at percentage of population as a Y value. You may be interested in putting these on your dashboard.

I'm available if you need someone to work on it.

Ontario Significant Discrepancies in cases.csv

Thank you very much for the work you do. Multiple provinces show minor variances from the official Canadian numbers due to timing, and this is fine. When it comes to Ontario, however, your dataset shows almost 2,000 more cases than Ontario/Canada has officially reported to date. Please see the attached Excel file with a cross-validation of COVID19 cases. The gap is definitely not due to timing, as it started a few months ago and keeps widening. I am trying to understand the nature of the difference, which is unique to Ontario. If you think the data need to be fixed, I can help. I am also happy to jump on a call to discuss.

Variances in COVID19 Daily New Cases Reporting by Province.xlsx

Recovery numbers for Ontario are lagging as case numbers are from PHUs

Thanks for curating this dataset and all your efforts!

I noticed that the number of recoveries is equal to that reported by Ontario (which is lagging) while case numbers are from individual PHUs (which are more recent). This makes it seem like Ontario has a larger number of active cases than it actually does.

For example, as of July 28th, 21:00:
According to this data, the number of active cases is 3,400, with 34,567 recovered.
If you sum up recoveries from the individual PHUs, they come to around 36,200, meaning that in reality the number of active cases is almost half of that.

Would it be possible for you to source recoveries from individual PHUs as well?
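The arithmetic behind that estimate can be sketched as follows, assuming active cases are computed as cumulative cases minus recoveries minus deaths, and that cases and deaths stay the same when the recovery source changes. The function name is hypothetical:

```python
def adjusted_active(active, recovered_used, recovered_alternative):
    """Estimate active cases if a different recovery total were used.

    Assumes active = cases - recovered - deaths, with cases and deaths fixed,
    so every additional recovery removes one active case.
    """
    extra_recoveries = recovered_alternative - recovered_used
    return active - extra_recoveries


# Figures quoted in the issue (the PHU total is approximate).
print(adjusted_active(3400, 34567, 36200))
# 1767
```

With the quoted figures this gives roughly 1,770 active cases, consistent with "almost half" of 3,400.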

I can't see where you're getting the numbers for Recovery, for 5 jurisdictions.

Format of numeric columns

Thanks so much for the repo and maintaining the spreadsheet. In the spreadsheet, would it be possible to keep columns with numerical values, purely numeric? The cumulative_testing column on the Testing tab has asterisks on some numbers, which can cause (small) headaches when importing the data. Could the asterisk be in the column beside instead?
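Until the column is cleaned up, an importer could separate the number from the marker itself. A small sketch, assuming the marker is always one or more trailing asterisks; `split_flagged_number` is a hypothetical helper:

```python
def split_flagged_number(cell):
    """Split a cell such as '12345*' into (12345, '*').

    Returns (None, flag) when the cell holds no digits at all.
    """
    text = cell.strip()
    digits = text.rstrip("*")
    flag = text[len(digits):]
    # Drop thousands separators, if any, before converting.
    digits = digits.replace(",", "").strip()
    if not digits:
        return None, flag
    return int(digits), flag


print(split_flagged_number("12345*"))   # (12345, '*')
print(split_flagged_number("12,345"))   # (12345, '')
```

Returning the flag alongside the value mirrors the suggestion of keeping the asterisk in a separate column.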

Cheers!
Jon

Redesign meta-data / README.md

The meta-data (primarily stored within README.md) is in need of an overhaul. Tasks will be added below as they arise:

  • Add instructions for downloading from GitHub for non-technical users
  • Explain differences between various datasets (e.g., recovered_cumulative and recovered_timeseries_prov.csv)

Tests-performed or people-tested?

What exactly do the testing numbers mean?

Are you reporting on people-tested, or tests-performed? I don't see that specified anywhere.

When I look at the government source data, it seems to be a mixed bag.
Some jurisdictions report on both tests-performed and people-tested, but most don't.

Examples:
Tests-performed: BC, MB
People-tested: QC, NL

Source of recovered / deaths / tested

First of all, curating this data is nothing less than phenomenal, and the dashboard you have created is awesome. Thank you for this public service. I'm curious, though, about your source for the number recovered, deaths and number tested, as this information doesn't seem to be in the Google spreadsheet you linked from the dashboard. Can you point me in the right direction? Thanks!

Detailed Ontario data available in CSV format

Hello;

Detailed Province of Ontario data is available in CSV format at

https://data.ontario.ca/dataset/f4f86e54-872d-43f8-8a86-3892fd3cb5e6/resource/ed270bb8-340b-41f9-a7c6-e8ef587e6d11/download/covidtesting.csv

It is updated daily at approximately 10:15 AM Eastern. Lots of interesting stuff. Some examples are...

  • Column E is currently active cases, noticeably lower than your count

  • Column G is cumulative deaths. Subtract the previous day from the current day to get deaths occurring on the most recent day.

  • Column H is the cumulative number of positive tests. Subtract the previous day from the current day to get positives occurring on the most recent day. I believe Ontario currently uses number of tests, not number of people, in the count.

  • Column J number of tests run in past 24 hours.
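The subtraction described for the cumulative columns can be sketched as below. The function name is hypothetical and the numbers are placeholders, not values from the Ontario file:

```python
def daily_from_cumulative(cumulative):
    """Convert a chronological list of cumulative totals into daily changes.

    The result has one fewer element than the input, because the first day
    has no previous day to subtract.
    """
    return [current - previous for previous, current in zip(cumulative, cumulative[1:])]


# Placeholder cumulative totals; a downward revision yields a negative day.
print(daily_from_cumulative([100, 104, 103, 110]))
# [4, -1, 7]
```

Negative values are left as they are, since a revision in the source legitimately produces them.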

And I'm sure there is other stuff you'll be interested in. The province also has detailed daily PDF reports at https://covid-19.ontario.ca/covid-19-epidemiologic-summaries-public-health-ontario#daily

The PDF files have breakdowns by Public Health Unit going back to June 11 (June 9th and 10th data). I've been scraping the daily PDFs for daily case counts for each health unit. I can upload the daily PHU new cases data if you're interested. Note that with "adjustments", you'll see occasional days with negative case counts.

Ontario data

I have switched the provincial/territorial charts on my site fully to your datasets, and I am noticing an oddity with the Ontario data. At first I thought it was a problem with the Canada data as a whole, as the numbers here differ greatly from the Johns Hopkins data (which reports about 2,000 fewer cumulative cases), but on drilling down, all the provinces seem fine except Ontario.

As of this morning (2020/08/15, 1100h Pacific), the Ontario Government is reporting 40,565 cases (source: Ontario.ca), while the data here reports 42,288 cases. Comparing to COVID-19 Case Data (Ontario.ca), I see the data starts diverging on April 01, 2020. Is there a known reason for this difference?

Date formatting for Ontario cases

Ontario cases from 12 to 59 have MM-DD-YYYY date formatting. The rest of the Ontario cases are DD-MM-YYYY.

If at all possible, it would be good to have YYYY-MM-DD for all.
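A possible conversion, sketched under the assumption that "cases 12 to 59" refers to a per-province case number available to the script. The parameter `case_id` and both function names are hypothetical:

```python
from datetime import datetime


def to_iso(date_text, source_format):
    """Parse `date_text` with an explicit format and return YYYY-MM-DD."""
    return datetime.strptime(date_text, source_format).date().isoformat()


def normalize_ontario_date(case_id, date_text):
    """Apply the format described in the issue for each range of cases."""
    # Cases 12 to 59 are MM-DD-YYYY; all other cases are DD-MM-YYYY.
    source_format = "%m-%d-%Y" if 12 <= case_id <= 59 else "%d-%m-%Y"
    return to_iso(date_text, source_format)


# The same text means different days depending on the case range.
print(normalize_ontario_date(12, "03-05-2020"))  # 2020-03-05
print(normalize_ontario_date(60, "03-05-2020"))  # 2020-05-03
```

The format has to be chosen explicitly because a string such as 03-05-2020 is valid in both schemes and cannot be disambiguated from the text alone.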

Add "Repatriated" to testing time series

Repatriated testing numbers are available from the PHAC CSV file.

Repatriated cases have already been added to the recovered file based on assumptions regarding recovery time (and lack of news reports indicating serious disease in identified repatriated travelers).

Add HR_UID to health region time series

Although HR_UID linkage is available in other/hr_map.csv, it would be much more convenient if it were present directly in the health region time series CSV files themselves.
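In the meantime, the linkage can be applied on the consumer side. A rough sketch, assuming both files share `province` and `health_region` columns and that the map carries an `HR_UID` column; these column names, the region names, and the identifiers below are placeholders:

```python
def add_hr_uid(timeseries_rows, hr_map_rows):
    """Attach HR_UID to each time series row via (province, health_region).

    Rows without a match in the map get HR_UID set to None.
    """
    lookup = {
        (entry["province"], entry["health_region"]): entry["HR_UID"]
        for entry in hr_map_rows
    }
    return [
        {**row, "HR_UID": lookup.get((row["province"], row["health_region"]))}
        for row in timeseries_rows
    ]


# Placeholder region names and identifiers, for illustration only.
hr_map = [{"province": "BC", "health_region": "Region A", "HR_UID": "9991"}]
series = [
    {"province": "BC", "health_region": "Region A", "cases": 4},
    {"province": "BC", "health_region": "Region B", "cases": 2},
]
print(add_hr_uid(series, hr_map))
```

Keying on the province as well as the region name avoids collisions if two provinces use the same region name.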

Postal Codes

Hi all,

Love the effort so far. I am interested in creating a project that displays this data visually on a map to neighbourhood detail. This won’t be possible without postal codes.

How challenging would it be to get postal codes? I realised just the first three alphanumerics are good enough to show a neighbourhood close up (also mitigating any privacy concerns).
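The first three characters of a Canadian postal code form the forward sortation area (FSA). Truncating to that level could look like the sketch below, where `forward_sortation_area` is a hypothetical helper and the postal code is just an example value:

```python
def forward_sortation_area(postal_code):
    """Return the first three characters (the FSA) of a Canadian postal code."""
    cleaned = postal_code.replace(" ", "").upper()
    if len(cleaned) < 3:
        raise ValueError("postal code is too short to contain an FSA")
    return cleaned[:3]


print(forward_sortation_area("k1a 0b1"))  # K1A
```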

Ratnesh

Feature request: Hospitalizations

Quebec officials have suggested hospitalizations as a useful measure of the impact of the virus in different provinces. Would including this data be achievable?

Tests - ON not including pending tests

Although I don't see it stated anywhere, you seem to be reporting on the total number of tests performed, including pending tests.

If that's the case, your reporting for Ontario seems to be off. For example, yesterday, on April 22, ON reported these numbers:

  • Total tests completed 184,531
  • Currently under investigation 6,845 (Samples with testing in progress.)

Your reported testing number for ON for that day is 184,531. If you are indeed reporting completed+pending, shouldn't you be reporting the sum of the above two numbers (191,376)?

Add "Repatriated" to health region time series

Add to both hr_map.csv and actual timeseries_hr files for each stat.

This will allow the same range of data to be included in each of the timeseries (canada, prov, hr).

It will also simplify the function to construct the actual timeseries files from the raw files using update_data.R.

code for Health regions

Hi,
Impressive work! I just wonder whether it is possible to add the HR_UID to each health region?
It seems that the health region names from different sources (e.g., shapefiles) can be very different.

thanks,

Guowen

Geocoding for each health region

Hi, I'm using Leaflet to geolocate the health regions, but there are some health regions I cannot locate. Do you guys have any thoughts on that? I'm using Mapbox geocoding.

Make .xls more parsable

I'm writing a parser for the .xls for https://github.com/neherlab/covid19_scenarios, and ran into some minor issues. In particular

  • The data rows do not start on the same row across worksheets (4, 3, 3 for cases, deaths, recovered)
  • 'province' is used as the label for two columns on the recovered sheet
  • there is no 'health_region' column on the recovered sheet
  • date columns are labelled 'date_report', 'date_death_report', or 'date_recovered'. It would be easier if they were all called 'date_report', if possible

It would be great if these could be unified to make parsing easier.
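On the parser side, a possible stopgap for the differing date labels is to map them to one name before reading the rows. This is only a sketch of that idea: `DATE_COLUMN_ALIASES` and `normalize_header` are hypothetical names, not part of either project:

```python
# Date labels named in the list above, mapped to the preferred common label.
DATE_COLUMN_ALIASES = {
    "date_death_report": "date_report",
    "date_recovered": "date_report",
}


def normalize_header(header):
    """Return the header row with date column labels unified."""
    return [DATE_COLUMN_ALIASES.get(name, name) for name in header]


print(normalize_header(["province", "date_recovered"]))
# ['province', 'date_report']
```

This does not address the duplicated 'province' label or the differing start rows, which still need per-sheet handling.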

/timeseries_hr/recoveries file available?

Firstly, thank you for maintaining the spreadsheet and this repo.

I was looking at recoveries data in Covid19Canada/timeseries_hr/ in particular. Would that be available as well?

I am creating a web-app providing info for each province & regions and I am missing # of recoveries for each region ATM. Thanks again!

BC Testing Numbers Discrepancy

There seems to be an issue with the testing numbers coming from the API: no tests are reported for 2-3 days, and then all of the missing days are reported on a single day.

The data from the BC CDC website (which seems to be your source) has all of the testing numbers by day, which matches up with what they present on their ArcGIS dashboard. Any chance we can get that information updated in the API?
