jarnorfb / epysurv
Epidemiological surveillance in Python
License: MIT License
I've found that with daily case count data I can use FarringtonFlexible without problems, but with weekly or multi-week data I get a runtime error. I've tried all sorts of workarounds, including passing glmWarnings as False, since this looks more like a warning() than an actual failure.
I will keep trying to debug this on my own, but I'm not a great R programmer, and it's hard to tell whether this is an issue in the surveillance package, in how I am constructing the data, or something else.
While I keep troubleshooting, I have a few questions in case anyone has thoughts:
Is there a way, via rpy2 or any other means, to ignore an issue like this, since it looks more like a warning() than a fatal Exception?
RRuntimeError (a fuller callstack is below):
E rpy2.rinterface.RRuntimeError: Error in algo.farrington.data.glm(dayToConsider = dayToConsider, b = control$b, :
E Some reference values did not exist (index<1).
Code to reproduce (as a pytest)
import pandas as pd
import pytest
from epysurv.models.timepoint import FarringtonFlexible

def test_farrington_weekly_example():
    model = FarringtonFlexible()
    total_periods = 100
    test_size = 20
    case_count = 10
    # set up some weekly data
    dates = pd.date_range('2017-07-09', periods=total_periods, freq='7D')
    # just make this constant (but this also fails with random or real case count values)
    case_counts = [case_count] * total_periods
    df = pd.DataFrame({'n_cases': case_counts}, index=dates)
    train_data = df[:-1 * test_size]
    test_data = df[-1 * test_size:]
    # make sure we can fit and predict
    model.fit(train_data)
    _ = model.predict(test_data)
Fuller callstack:
    def test_farrington_weekly_example():
        model = FarringtonFlexible()
        total_periods = 100
        test_size = 20
        case_count = 10
        # set up some weekly data
        dates = pd.date_range('2017-07-09', periods=total_periods, freq='7D')
        # just make this constant (but this also fails with random or real case count values)
        case_counts = [case_count] * total_periods
        df = pd.DataFrame({'n_cases': case_counts}, index = dates)
        train_data = df[:-1 * test_size]
        test_data = df[-1 * test_size:]
        # make sure we can fit and predict
        model.fit(train_data)
>       _ = model.predict(test_data)
test_farrington_specific_example.py:26:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
..\epysurv\models\timepoint\_base.py:131: in predict
surveillance_result = self._call_surveillance_algo(r_instance, detection_range)
..\epysurv\models\timepoint\farrington.py:166: in _call_surveillance_algo
surv = surveillance.farringtonFlexible(sts, control=control)
C:\anaconda3\envs\epysurv-dev\lib\site-packages\rpy2\robjects\functions.py:178: in __call__
return super(SignatureTranslatedFunction, self).__call__(*args, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = R object with classes: ('function',) mapped to:
<DocumentedSTFunction - Python:0x000001C714FD1608 / R:0x000001C70F26E4A8>
args = (R object with classes: ('sts',) mapped to:
<RS4 - Python:0x000001C7151F5548 / R:0x000001C70FF061F0>,)
kwargs = {'control': R object with classes: ('list',) mapped to:
<ListVector - Python:0x000001C7152E92C8 / R:0x000001C71400440...ect with classes: ('character',) mapped to:
<StrVector - Python:0x000001C715468C88 / R:0x000001C7149A7E00>
['delta']}
new_args = [R object with classes: ('sts',) mapped to:
<RS4 - Python:0x000001C7151F5548 / R:0x000001C70FF061F0>]
new_kwargs = {'control': R object with classes: ('list',) mapped to:
<ListVector - Python:0x000001C7152E92C8 / R:0x000001C71400440...ect with classes: ('character',) mapped to:
<StrVector - Python:0x000001C715471C88 / R:0x000001C7149A7E00>
['delta']}
k = 'control'
v = R object with classes: ('list',) mapped to:
<ListVector - Python:0x000001C7152E92C8 / R:0x000001C714004408>
[IntVect...ject with classes: ('character',) mapped to:
<StrVector - Python:0x000001C715478F48 / R:0x000001C7149A7E00>
['delta']
    def __call__(self, *args, **kwargs):
        new_args = [conversion.py2ri(a) for a in args]
        new_kwargs = {}
        for k, v in kwargs.items():
            new_kwargs[k] = conversion.py2ri(v)
>       res = super(Function, self).__call__(*new_args, **new_kwargs)
E rpy2.rinterface.RRuntimeError: Error in algo.farrington.data.glm(dayToConsider = dayToConsider, b = control$b, :
E Some reference values did not exist (index<1).
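The "index<1" wording in the error above hints that some reference time point may fall before the first observation. The following is only a sketch of that arithmetic, under the loudly stated assumption that the earliest reference index is roughly the first detection index minus the look-back (years times periods per year) minus the half-window. The function name, the parameter names, and the example values (3 years back, half-window of 3) are hypothetical; they are not taken from epysurv or from the surveillance source.

```python
def has_enough_history(first_detection_index, periods_per_year, years_back, half_window):
    """Return True if the earliest reference index is at least 1 (1-based indexing)."""
    # ASSUMPTION: reference values reach back `years_back` full years plus a
    # half-window of `half_window` periods from the first detection index.
    earliest_reference_index = (
        first_detection_index - years_back * periods_per_year - half_window
    )
    return earliest_reference_index >= 1


# Sizes from the reproduction: 100 weekly points, the last 20 held out.
# ASSUMPTION: train and test data are treated as one continuous series.
total_periods, test_size = 100, 20
first_detection_index = total_periods - test_size + 1

# Hypothetical settings: 3 years back versus 1 year back, 52 weeks per year.
three_years_ok = has_enough_history(first_detection_index, 52, years_back=3, half_window=3)
one_year_ok = has_enough_history(first_detection_index, 52, years_back=1, half_window=3)
print(three_years_ok, one_year_ok)
```

Under this assumption, a three-year weekly look-back cannot fit into 80 training weeks, while a one-year look-back can. Whether this is the actual cause of the failure would need to be checked against the surveillance source.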
After running
conda install -c conda-forge epysurv
I tried to use the package, but my environment failed on this import:
from rpy2.rinterface import RRuntimeError
I checked which version of rpy2 had been installed, and it was 3.4.5.
In rpy2 3.x the import path has changed; the following did work for me:
from rpy2.rinterface_lib.embedded import RRuntimeError
I see a couple of ways to fix this:
I'm not sure if the second option above would work since I wonder if there are dependencies which truly do expect 2.x of rpy2.
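One possible way to tolerate both import paths mentioned above is a fallback lookup. This is a minimal sketch, not the fix adopted by epysurv; the helper name find_rruntime_error is hypothetical, and the sketch assumes only the two module paths quoted in this report.

```python
import importlib

# Module paths quoted in the report above, newest rpy2 layout first.
CANDIDATE_MODULES = ("rpy2.rinterface_lib.embedded", "rpy2.rinterface")


def find_rruntime_error():
    """Return RRuntimeError from the first importable candidate module, else None."""
    for module_name in CANDIDATE_MODULES:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            # This layout (or rpy2 itself) is not available; try the next one.
            continue
        error_class = getattr(module, "RRuntimeError", None)
        if error_class is not None:
            return error_class
    return None


RRuntimeError = find_rruntime_error()
```

If neither path is importable, RRuntimeError ends up as None, so calling code would need to handle that case explicitly.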
Feedstock (https://dev.azure.com/conda-forge/feedstock-builds/_build/results?buildId=350396&view=logs&jobId=d0d954b5-f111-5dc4-4d76-03b6c9d0cf7e&j=d0d954b5-f111-5dc4-4d76-03b6c9d0cf7e&t=841356e0-85bb-57d8-dbbc-852e683d1642) breaks on pickle.load:
Line 46 in d031511
As per https://stackoverflow.com/a/68342039/6256888, the recommended workaround is to replace pickle.load with pandas.read_pickle for pandas v1.3.0.
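The swap described above could look roughly like the following. This is a sketch only: the helper name load_pickled_frame, the file name, and the sample data are made up for illustration and do not reflect the actual code at the referenced line.

```python
import os
import tempfile

import pandas as pd


def load_pickled_frame(path):
    """Load a pickled pandas object via pandas.read_pickle rather than pickle.load."""
    return pd.read_pickle(path)


# Round trip through a throwaway file to show the call in use.
frame = pd.DataFrame({"n_cases": [10, 12, 9]})
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "cases.pickle")
    frame.to_pickle(path)
    restored = load_pickled_frame(path)
```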
I just installed epysurv with conda. However, the new simulation algorithms are not included.
While using the package, I noticed that columns like 'alarm' are exposed but other helpful columns like 'upperbound' are not. These would be helpful in visualizations and for understanding why there was or was not an alarm.
I have a pull request ready for this. I'll link it in a moment.
Version conflict for Windows Python packages on the conda-forge PR (https://ci.appveyor.com/project/conda-forge/staged-recipes/builds/24230568):
conda.exceptions.UnsatisfiableError: The following specifications were found to be in conflict:
- python=3.7 -> vc[version='>=14,<15.0a0']
- vc=9
Use "conda search <package> --info" to see the dependencies for each package.
During handling of the above exception, another exception occurred:
`conda_build.exceptions.DependencyNeedsBuildingError: Unsatisfiable dependencies for platform win-64: {"vc[version='>=14,<15.0a0']", 'vc=9'}`
Since I will be working more with simulations in the coming weeks, I thought about including the algorithms here. For example, I am currently copying the simulation used in "An improved algorithm for outbreak detection in multiple surveillance systems". Should I try writing more simulations in the same manner as the other algorithms and push them to epysurv?