
cdc-flusight-ensemble's People

Contributors

brookslogan, craigjmcgowan, d-osthus, dependabot[bot], elray1, hbiegel, katiehouse3, khoale1096, lcbrooks, lepisma, lpmi-13, nickreich, nutchaw, sjfox, srinivvenkat, tkcy, tomcm39, xinyuexiong


cdc-flusight-ensemble's Issues

modify make-cv-ensemble-forecast-files script

  • needs to be able to take in multiple weight files
  • change over to using standard model_ids
  • change to use all weeks, not just the common subset, since we will only be using complete models

investigate 2016/17 ensemble outputs

Need to compare the 2016/17 ensemble outputs to all submitted models to see how they would have compared. We could also compare to an unweighted average of just the models submitted by teams that are participating in the ensemble project (i.e. the Delphi, CU, KoT, and LANL models).

make leave-one-season-out cross-validated weights file

Each ensemble specification will have its own CSV file with weights in it. Each file should have the following three columns:

  • season: the left-out season for which the weights apply
  • component_model_id: the folder name that has the forecasts in it
  • weight: the weight

The file may also have the following columns, depending on how finely the ensemble weights are stratified:

  • target: these should be in the same format as expected in an entry file
  • location: again, the same format as an entry file

If the file doesn't have one or both of these columns, then we assume the weights are the same across all targets or all locations.

We can impose the check that, for each fixed target $t$ and location $l$ (if specified), $\sum_{i=1}^{M} w_{i, t, l} = 1$, where $M$ is the number of component models.

Currently, the script that turns weights into CV ensemble entries needs the above format for the weights file. See, as an example, this file that has a functioning set of example weights.
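The sum-to-one check could be sketched like this (a plain-Python sketch using hypothetical row dicts keyed by the column names above; the actual script may operate on data frames instead):

```python
from collections import defaultdict

def check_weight_sums(rows, tol=1e-8):
    """Check that weights sum to 1 within each stratum defined by the
    stratification columns present: season, plus target/location when given.
    Rows are dicts with the columns described in this issue."""
    sums = defaultdict(float)
    for row in rows:
        # Missing target/location columns collapse to a single stratum
        key = (row["season"], row.get("target"), row.get("location"))
        sums[key] += row["weight"]
    return all(abs(total - 1.0) < tol for total in sums.values())

# Example: equal weights for two component models in one left-out season
rows = [
    {"season": "2016/2017", "component_model_id": "model_a", "weight": 0.5},
    {"season": "2016/2017", "component_model_id": "model_b", "weight": 0.5},
]
```

The model ids here are placeholders; real files would use the standard component_model_id folder names.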

reorganize file and folder structure

Need to create three sets of folders for forecasts

  • component model forecasts (to be included in visualization, scores)
  • CV ensemble model forecasts (to be included in scores)
  • Real-time ensemble model forecasts (to be included in visualization)

Also, this will require changing file-paths in other scripts (visualizations, score calculations, etc...) that are dependent on these files.

minor discrepancy in spot-checked distributions

I'm getting very small differences in my spot-checks of ensemble distributions, particularly in 1-4 week ahead forecasts with TRW, TW, TTW (see example image below).

Very possibly a problem on my end, which I'm checking, but also wanted to ask whether any rounding happens in creating the new distribution?

[image: example spot-check]

Needs more written description of how to interpret

What is "Weighted ILI (%)"?
What does a probability of 0.3 mean? Any individual has a 30% chance of getting the flu? A 0.3% chance?
What does it mean that the mean log score for 3 wk is -7.93?
Perhaps a blurb or some hover-over text with overall information would help. I clicked through a few different GitHub pages and figured out that it's some sort of competition hosted by the CDC. Is this one team's effort? A visualization of all of them put together into an ensemble model?

minor fixes to new scores table

  • remove disclaimer about "final data" at bottom
  • add explicit "NA" for missing scores
  • @brookslogan why are scores for Delphi Uniform? related to #24 ?
  • add sorting feature on table
  • For this table, why does only week 1 have a bold "first place"?
  • Once I am at the scores tab, I am unable to re-navigate back to another tab.


Potential small bug in Travis scoring

I've finished checking the scores generated in Travis against scores calculated using the FluSight R package. We're down to 110 discrepancies greater than 10^-12, all related to peak week in the 2014/15 season. The errors occur in Regions 2, 3, 5, and 7, as well as US National, all of which have week 52 as the peak week.

I looked at one particular error in detail: the target-based model forecasts for Epiweek 53. The specific target is HHS Region 3 peak week. The correct score should be -0.627, summing the probabilities of weeks 51, 52, and 53. Travis is assigning -1.11, apparently from summing the probabilities of weeks 51, 52, and 1.
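For illustration, the wrap-around logic likely at fault can be sketched as follows (a hypothetical helper, not the actual Travis script):

```python
def peak_week_bins(observed_week, weeks_in_year):
    """Bins that count toward a peak-week multi-bin log score: the observed
    week plus one week on either side, wrapping past the end of the year.
    Hypothetical helper for illustration only."""
    bins = []
    for offset in (-1, 0, 1):
        wk = observed_week + offset
        if wk > weeks_in_year:   # wrap forward past the final week
            wk -= weeks_in_year
        elif wk < 1:             # wrap backward before week 1
            wk += weeks_in_year
        bins.append(wk)
    return bins

# 2014/15 had 53 MMWR weeks, so the window around week 52 is 51, 52, 53;
# hard-coding a 52-week year instead reproduces the erroneous 51, 52, 1.
```

If the scoring script assumes every year has 52 weeks, exactly the observed pattern of errors (peak week 52 in a 53-week season) would follow.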


CUBMA model missing files

Some files from the CUBMA model are missing. When running the validate_predictions file, I got an error because the CUBMA/EW51-2010-CUBMA.csv file doesn't exist. There may be others that don't exist as well; this was just the first one it ran across.
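A quick way to enumerate all missing files ahead of validation might look like this (a hypothetical helper; the file-naming pattern is taken from the error above, and the week/year ranges would need to match the actual submission schedule):

```python
import os

def missing_forecast_files(model_dir, model_id, epiweeks, years):
    """Return the expected EW{week}-{year}-{model_id}.csv filenames
    that are absent from model_dir. Hypothetical helper for illustration."""
    missing = []
    for year in years:
        for ew in epiweeks:
            name = f"EW{ew:02d}-{year}-{model_id}.csv"
            if not os.path.exists(os.path.join(model_dir, name)):
                missing.append(name)
    return missing
```

Running this over every component model folder would surface all gaps at once instead of failing on the first missing file.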

link to README file on main visualization page

Could we have the "FluSight Network" and "CDC FluSight Network" text in the top left of the visualization homepage link directly to the README file, so folks who find this page could understand the context?

Fix visualization of week targets ranges

[screenshot of the week-target visualization]

Point predictions are outside the range. Maybe this is because the visualizer prefers the point prediction written in the Point row, which does not match the distributions in the CSVs.

fix broken links

  • links to model description files (e.g. from scores tables) still point to old folder locations
  • link to "source" at bottom of page points to the app, not the github repo

update import code to handle new metadata format

I updated the metadata files to have these new fields:

  • team_name (max 10 chars)
  • model_name (max 50 chars?)
  • model_abbr (max 15 chars)

These fields should be used to populate the visualization legend.

Funny point forecasts for peak week

The point forecast for US National peak week doesn't match up at all with the underlying distribution. The point forecast is for week 17, but the probabilistic values put it in the week 51-7 range. Could the code to generate the point forecast not be dealing with the New Year's transition correctly?
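One plausible fix is to order weeks by their position within the season rather than by raw week number. A sketch, with assumed season boundaries (not the actual point-forecast code):

```python
def season_order(epiweek, season_start=40, weeks_in_year=52):
    """Map an epiweek to its position within a flu season starting at
    season_start, so that e.g. week 52 sorts before week 1 of the next
    calendar year. Hypothetical sketch; the real code may differ."""
    if epiweek >= season_start:
        return epiweek - season_start              # 40 -> 0, ..., 52 -> 12
    return epiweek + weeks_in_year - season_start  # 1 -> 13, ..., 20 -> 32

# Sorting raw week numbers puts week 1 before week 51 and can drag a
# modal/mean point forecast toward mid-season labels like week 17;
# sorting by season_order keeps the weeks in chronological order.
weeks = [51, 52, 1, 2]
assert sorted(weeks, key=season_order) == [51, 52, 1, 2]
```

Any point forecast computed on the raw week labels (a mean, in particular) will be distorted whenever the probability mass straddles the New Year.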

decide which EW files to include in scoring and weighting calculations

Previously, in #14, we had decided that files from EW40 of year k through EW20 of year k+1 would be submitted. However, @tkcy brought up the point that the challenge does not run for those weeks this year, so we are training on weeks that are not in the competition. A question for @craigjmcgowan: what is the "EW" label for the first and last files that will be submitted for the 2017/2018 season?

CU Week 53

The bin for CU week 53 was not empty, as it was expected to be. This probably means the probabilities assigned to weeks 1 through 20 need to be bumped up by 1 bin. Will check in with Sasi about a fix.

zero log scores

Log scores of 0 in the summary statistics table are probably wrong, e.g. for CU-BMA and ReichLab-SARIMA1 in HHS Region 3 in 2012/2013.
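Since a log score of exactly 0 means the forecast assigned probability 1.0 to the observed bin (log(1.0) == 0), which almost never happens for a real forecast, these entries are easy to flag mechanically (a hypothetical helper):

```python
def suspicious_log_scores(scores, tol=1e-12):
    """Return indices of log scores that are exactly (or numerically) 0.
    A 0 log score implies probability 1.0 on the observed bin, which is
    implausible and likely indicates a scoring or data-handling bug."""
    return [i for i, s in enumerate(scores) if abs(s) < tol]
```

Running this over the summary table would list every model/region/season cell worth a manual look.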
