mindthegrow / cafelytics
A basic simulation of a coffee operation.
Home Page: https://cafelytics.rtfd.io
License: MIT License
This class encodes species information. It can be used to generate impacts for things such as "harvest", "pruning", or "fertilizer", and to inform another multiplier.
I think something like... a config can span an event, one we know will be relevant, so we don't need a second lookup perhaps (for configs and events). Have to think through where the inefficiencies are.
Maybe config isn't even required? How is it used right now?
We have a step where we find the relevant configs.
For example... Config stores info about a species, then can be used to generate harvest functions like the ones in simulate.py. Should events be able to take configs as arguments? So, rather than span four guate-functions, encode info in config, and span on the fly? Seems like it'd be an inefficiency for runtime but an efficiency for storage. Easier to store a config and a template callable than a bunch of generated functions from that template.
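A minimal sketch of the "config plus template callable" idea, not the project's actual code: `SpeciesConfig`, `harvest_template`, and `make_harvest_function` are hypothetical names, the linear ramp is an illustrative choice, and the numbers are placeholders rather than real growth data.

```python
from dataclasses import dataclass
from functools import partial


@dataclass(frozen=True)
class SpeciesConfig:
    """Hypothetical container for species-level growth assumptions."""
    species: str
    first_harvest_year: int  # age at which the plot first yields
    full_harvest_year: int   # age at which the plot reaches full yield
    death_year: int          # age after which the plot stops yielding


def harvest_template(age: float, config: SpeciesConfig) -> float:
    """Return the proportion of a full harvest for a plot of the given age."""
    if age < config.first_harvest_year or age > config.death_year:
        return 0.0
    if age >= config.full_harvest_year:
        return 1.0
    # Linear ramp between first and full harvest (an illustrative assumption).
    ramp_years = config.full_harvest_year - config.first_harvest_year
    return (age - config.first_harvest_year) / ramp_years


def make_harvest_function(config: SpeciesConfig):
    """Build a species-specific harvest function on the fly from the template."""
    return partial(harvest_template, config=config)


# Placeholder numbers, for illustration only.
e14_config = SpeciesConfig("e14", first_harvest_year=4, full_harvest_year=6, death_year=30)
e14_harvest = make_harvest_function(e14_config)
```

Only the config needs to be stored; the callable is rebuilt whenever an event needs it, which trades a little runtime for simpler storage.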
once we get the packaging initialized, we need to write a really basic readme that updates the status of the project. branch name: feature/readme; merge into develop for this.
determine if harvest is active based on when a plot "joined" the cooperative.
in other words, can have someone with farms dating back to the 90's but joined in 2002, don't want their yields showing up as part of the total unless we explicitly toggle them to.
is_active considers time as in age of the plot, but we can encode 'join_date' as a feature, and if present, make sure the present time exceeds it.
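One way the join-date check could look, as a sketch only: the `Plot` class, its `planted` attribute, and the `include_pre_join` toggle are assumptions made for illustration and are not taken from the repository.

```python
import datetime
from dataclasses import dataclass
from typing import Optional


@dataclass
class Plot:
    """Hypothetical stand-in for a plot; not the project's actual class."""
    planted: datetime.datetime
    join_date: Optional[datetime.datetime] = None

    def is_active(self, current_time: datetime.datetime,
                  include_pre_join: bool = False) -> bool:
        # A plot cannot contribute before it was planted.
        if current_time < self.planted:
            return False
        # No join date recorded, or the caller explicitly opted in.
        if self.join_date is None or include_pre_join:
            return True
        # Otherwise only count the plot once the farmer has joined.
        return current_time >= self.join_date


plot = Plot(planted=datetime.datetime(1995, 1, 1),
            join_date=datetime.datetime(2002, 1, 1))
```

With this shape, a farm planted in the 90's but joined in 2002 stays out of the totals for earlier years unless `include_pre_join=True` is passed.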
Trying to understand the numbers being used for the early pruning simulation. Is this implying that giving the trees early pruning gives them an extra 20 years of life? I also don't see production being affected, only years of production. This very well could be my lack of experience with MATLAB.
If you had ten minutes, something that would be extremely helpful is a brief summary of how production numbers (e.g. proportion of full harvest; years of production) change as a result of (1) pruning and (2) intercropping.
if we want this to be a CLI usage, then we should properly define it as a console script that can be run from anywhere
discover:
in Python, is it acceptable to call member functions in the initializer to help abstract the bulk of the code away from the init function? In C++ this behavior is deprecated, however C++ is much different in regards to scope & user accessibility so it may not apply here.
example, with member function calls:
    class Farmarelli:
        def __init__(self, eventRows: pd.DataFrame, initialYear: int):
            self.assignParameters()
where self.assignParameters() assigns member variables? or should these simply be assigned directly in the __init__ function?
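For what it's worth, calling a helper method from `__init__` is legal Python and fairly common. A small sketch under stated assumptions: plain dicts stand in for the DataFrame rows, and `_assign_parameters`, `event_years`, and `final_year` are made-up names for illustration.

```python
from typing import Dict, List


class Farmarelli:
    def __init__(self, event_rows: List[Dict[str, int]], initial_year: int):
        self.event_rows = event_rows
        self.initial_year = initial_year
        # Delegate the bulk of attribute setup to a private helper.
        self._assign_parameters()

    def _assign_parameters(self) -> None:
        # Attributes assigned here behave exactly like ones set in __init__.
        self.event_years = sorted(row["year"] for row in self.event_rows)
        self.final_year = max(self.event_years, default=self.initial_year)


farm = Farmarelli([{"year": 2005}, {"year": 2001}], initial_year=2000)
```

One caveat worth keeping in mind: if a subclass overrides the helper, the override runs during the parent's `__init__`, possibly before the subclass has set its own attributes. A leading underscore on the helper signals that it is not meant to be called from outside.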
make some sort of interactive front-end
continuous deployment (heroku?)
even if we only use it once, I think that methods like this should be defined by a decorator in order to shorten the class definition:
    def years(self, current_time=datetime.datetime.today()) -> int:
        return round(self.age(current_time).days / 365.25)

    def days(self, current_time=datetime.datetime.today()) -> int:
        return self.age(current_time).days

    def mins(self, current_time=datetime.datetime.today()) -> int:
        return round(self.age(current_time).seconds / 60)
how it would work:
@add_age_attrs as a decorator
adds these methods if cls.age is defined. does nothing otherwise (or raises a usage error of some sort).
testing
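A rough sketch of what an `add_age_attrs` class decorator might look like. Several things here are assumptions rather than the project's code: the example `Plot` class, raising `TypeError` as the "usage error", resolving the default time at call time (a `datetime.datetime.today()` default is evaluated once, when the function is defined), and using `total_seconds()` so that `mins` covers the whole elapsed period rather than only the sub-day remainder that `.seconds` holds.

```python
import datetime


def add_age_attrs(cls):
    """Attach years/days/mins helpers to a class that defines age()."""
    if not callable(getattr(cls, "age", None)):
        raise TypeError(f"{cls.__name__} must define an age() method")

    def years(self, current_time=None) -> int:
        return round(self.age(current_time).days / 365.25)

    def days(self, current_time=None) -> int:
        return self.age(current_time).days

    def mins(self, current_time=None) -> int:
        return round(self.age(current_time).total_seconds() / 60)

    cls.years, cls.days, cls.mins = years, days, mins
    return cls


@add_age_attrs
class Plot:
    """Hypothetical class used only to demonstrate the decorator."""

    def __init__(self, planted: datetime.datetime):
        self.planted = planted

    def age(self, current_time=None) -> datetime.timedelta:
        # Resolve "now" at call time rather than at definition time.
        if current_time is None:
            current_time = datetime.datetime.today()
        return current_time - self.planted


plot = Plot(datetime.datetime(2000, 1, 1))
```

Calling `plot.years(some_datetime)` then works on any class decorated this way, and decorating a class without `age()` fails immediately at definition time.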
placeholders for docker
placeholders for pypi
after the simulation is in good working-order, set up a function that will log data from each year iteration in the simulation (this will be helpful to come back to later to compare and contrast the effectiveness of strategies).
add more context into error response that's run when fakedata hasn't been generated and/or the filepath argument doesn't exist
assure yaml file import is functioning, then remove the data that is stored within the farm.Farm class
Binder points to coffecode, not cafelytics.
Low priority: GitHub seems to resolve /coffecode to cafelytics just fine (maybe we changed the name?)
maybe we keep the original matlab code anyway? I don't know, perhaps put it on a separate branch with some mention of the archival process on the README. We can actually just refer people to the last commit number before the merge of #15 and provide some backstory in a blog post.
That said I eventually do want auto-generated documentation and it would be nice if some markdown file from this repo just got incorporated into the docs, however that happens.
Part of this process should involve re-writing the README, but I'll raise another issue for that.
these are really outdated. now that we don't depend on octave, we can use the default binder settings to just build the dependency set from requirements.txt
test we can simulate pruning (and what comes with it, such as:)
test we can simulate intercropping of the same crop
test we can simulate intercropping of a different crop (this totally changes the dynamic, though)
test we can simulate a wildfire event which wipes out a proportion of crops and requires replanting the following year
test we can simulate the impacts of pests on plot health (i.e. proportion of plants that stay alive, lifespan of plants) and crop yield
test we can simulate drought and how it impacts plot health and crop yield
test we can simulate a large storm and how it impacts plot health and crop yield
test we can simulate adding/removing an irrigation system and how it impacts plot health and crop yield
start_year = datetime.datetime(2020, 1, 1)
end_year = datetime.datetime(2021, 1, 1)
events.append(Event("catastrophic overfertilization", impact=0.001, scope={"type": "species", "def": "e14"}, start=start_year, end=end_year))
update pyscaffold version, update workflows for new packaging procedures.
try fully migrating to pyproject.toml (but ... I do like development installations)
now that the CLI is successfully returning plots (somewhat) resembling the original:
until
too much stuff is included in the package...
cafelytics/src/Worksheet.m is like a precursor to a notebook.
I want the user to interact with this as follows:
python simulate_growth.py --farm ./data/farm.xlsx --growth ./data/growth.yml --strategy ./interventions/strategy1.yml --years 60
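A sketch of how that command line could be parsed with the standard library's `argparse`. The flag names come from the command above; which flags are required, the default of 60 years, and the `build_parser`/`main` names are assumptions, and the simulation itself is left as a placeholder.

```python
import argparse
from pathlib import Path
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        prog="simulate_growth.py",
        description="Simulate growth of a coffee operation.",
    )
    parser.add_argument("--farm", type=Path, required=True,
                        help="spreadsheet describing the farm plots")
    parser.add_argument("--growth", type=Path, required=True,
                        help="YAML file with growth assumptions")
    parser.add_argument("--strategy", type=Path, default=None,
                        help="optional YAML file describing interventions")
    parser.add_argument("--years", type=int, default=60,
                        help="number of years to simulate (assumed default)")
    return parser


def main(argv: Optional[List[str]] = None) -> argparse.Namespace:
    args = build_parser().parse_args(argv)
    # Placeholder: load the files and run the simulation here.
    return args


args = main(["--farm", "./data/farm.xlsx", "--growth", "./data/growth.yml",
             "--strategy", "./interventions/strategy1.yml", "--years", "60"])
```

Keeping the parsing inside a `main(argv)` function means the same function can later be registered as a console-script entry point in the packaging config, and it can be tested without touching `sys.argv`.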
before merging, grab the stuff from index.ipynb
Originally posted by @mathematicalmichael in #22 (comment)
use the str library or something similar to standardize stripping/formatting instead of trying to catch this
Originally posted by @mathematicalmichael in #39 (comment)
Based on data (and industry trends), develop an example/demo dataset to model & test the simulation. Then, construct a class of co-op/plantation and utilize the dataset to build & fill the class.
Also: what are the most common units of measurement? What units can be constructed ambiguously (i.e. units/time; units/space) to extrapolate beyond coffee?
GitHub Secrets (PyPi token)
GitHub Actions
need to adjust simulation data creation so that any farmers listed have at least one cuerda of trees
farm class is now updated to have a dictionary passed to it with tree attribute information (as opposed to it being defined in the class itself). either: (1) something needs to be built so that the class can rely on a default dictionary for testing or (2) the class should instead be passed a filepath (however this would break the current workflow pattern)
Test how long the example simulation takes, benchmark it.
See if caching with lru_cache has any impact whatsoever.
Get a sense of what time forward simulations will require.
Figure out how to collect input parameters in a way that is collapsible to a matrix of samples.
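The benchmarking and caching items above could start from something like the following, using `timeit` and `functools.lru_cache` from the standard library. `harvest_proportion` and `simulate` are toy stand-ins invented for this sketch, not the project's functions, so the timings say nothing about the real simulation.

```python
import timeit
from functools import lru_cache


def harvest_proportion(age: int) -> float:
    """Toy stand-in for an expensive per-plot calculation (hypothetical)."""
    return sum(1.0 / (k + 1) for k in range(age * 100)) / 10.0


# Same function, wrapped in an unbounded cache keyed on the age argument.
cached_harvest_proportion = lru_cache(maxsize=None)(harvest_proportion)


def simulate(func, years: int = 30, plots: int = 50) -> float:
    """Sum the toy yield over every plot and every year."""
    return sum(func(year) for year in range(years) for _ in range(plots))


uncached_seconds = timeit.timeit(lambda: simulate(harvest_proportion), number=1)
cached_seconds = timeit.timeit(lambda: simulate(cached_harvest_proportion), number=1)
print(f"uncached: {uncached_seconds:.4f}s, cached: {cached_seconds:.4f}s")
print(cached_harvest_proportion.cache_info())
```

Caching only helps when the same arguments recur (here, many plots sharing an age), and `lru_cache` requires hashable arguments, so whether it has any impact on the real code depends on how the harvest functions are called.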
It still talks of this code as if it's written in Octave!
be able to take in a spreadsheet (to start), csv/tsv, parquet, something that you imagine a user can drag/drop into streamlit eventually.
right now, some settings are just grabbed from hardcoded locations in a xlsx file
that's not required. you can make settings available in a human-readable format such as yaml
assumptions about growth of trees should be decoupled from farmer information
def test_farm_instantiation_defaults():
    test_farm = cf.Farm()
I added some default arguments to be able to instantiate this without arguments, and came across an immediate error during testing:
https://github.com/mathematicalmichael/cafelytics/pull/20/checks?check_run_id=962191696
> elif ((age >= self.firstHarvest['year']) and (age <= self.death['year'])):
E AttributeError: 'Farm' object has no attribute 'firstHarvest'
for now I'm going to take this test out and simply "pass" everything just to simulate the basics of the testing workflow. See PR #20 for the last instance of an "X" in the test status next to each commit message before the checkmarks started
if type is unknown, raise AttributeError
Originally posted by @mathematicalmichael in #20
create some sort of global functions (or elect them from re/string class/etc). string matching is a large component of the flow of control of these simulations and there should be as many safety nets built in as possible (also see #41 )
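A possible shape for such a global helper, as a sketch: `normalize_label` and `match_label` are hypothetical names, stripping accents is an assumption about what "standardized" should mean here, and the example labels are illustrative.

```python
import unicodedata
from typing import Dict


def normalize_label(text: str) -> str:
    """Strip, lowercase, drop accents, and collapse internal whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return " ".join(without_accents.casefold().split())


def match_label(text: str, known: Dict[str, str]) -> str:
    """Look up a user-supplied label among normalized known labels."""
    key = normalize_label(text)
    if key not in known:
        options = ", ".join(sorted(known))
        raise ValueError(f"unknown label {text!r}; expected one of: {options}")
    return known[key]


# Illustrative lookup table: normalized label -> canonical identifier.
known_species = {"e14": "e14", "borbon": "borbon"}
```

Routing every comparison through one function like this keeps the safety net in a single place: `match_label("  Borbón ", known_species)` and `match_label("E14", known_species)` both resolve, and anything unrecognized fails loudly instead of silently falling through a branch.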
for best practice: while translating, all significant numbers should be assigned to variables with descriptive names and/or the lines in which they are used should be commented. Many already have this, but, for example,
    if n ==18:
        data(1,13)=20;
should be something like:
    pruneThreshold = 18  # description of what 'pruneThreshold' means (fake var name)
    if n == pruneThreshold:  # if the year is pruneThreshold, coffee plants explode
        data[1][13] = 20
syntax: pytest from project root directory should return a bunch of passed tests.
code coverage above 60%
pint (python units) to make sure we have proper agreement ("pounds/year") etc
won't necessarily be named main, simulate_cooperate(dataFrame, time, etc)
It should not be doing data loading.
never done this before but the idea is that when you push/submit a PR, your linting errors are either highlighted or just automatically fixed.
codecov config file
runs in github actions
EDIT: using coveralls
pick your library and make it look pretty and readable, first step is just recreate the plots we have now (growth over time), we can work on making them more informative later
should be its own python function (what does it take in?)
ref: cafelytics/src/comparativeline.m https://github.com/mathematicalmichael/cafelytics/blob/master/src/comparativeline.m
Shouldn't be in the docs; not planning on using it.
Moving forward, I will be closing issues with keywords in commits when the commit solves the specific issue. I was not aware of this function until now.
dockerfile to build project as lightly as possible
test that docker build works in github actions (do the symlinking thing so that we run additional tests in docker python)
push to mindthegrow/cafelytics