rsginc / bca4abm

Benefit Cost Analysis for Travel Demand Models
Home Page: http://rsginc.github.io/bca4abm/
License: Other
All the model step settings should be in the model step yaml files. For example, move aggregate_zone_file_names from settings.yaml to aggregate_zone.yaml.
I think we should remove the docs and tutorials folder. The tutorial could move to @toliwaga's account, a separate branch, devtools, or the doc folder.
like ActivitySim - https://activitysim.github.io/activitysim/
We should update the ABM aggregate_trips_processor (and therefore aggregate_data_manifest.csv) to allow for a more flexible set of inputs like the four_step aggregate_od processor.
The activitysim convention of prefixing expressions with an at sign (@) to indicate that they should be evaluated with a python eval rather than a pandas eval is a little inconvenient if you maintain your csv expression files in Excel, since those statements are interpreted as formulas. We might want a different way of tagging these expressions that plays better with Excel.
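For context, a minimal sketch of the dispatch convention described above (the function name and signature here are illustrative, not the actual activitysim API): expressions starting with `@` go through python's `eval`, everything else through `DataFrame.eval`.

```python
import pandas as pd

def eval_expression(expr, df, locals_d=None):
    # Hypothetical sketch of the @-prefix convention: expressions starting
    # with '@' are evaluated with python eval (so they can call arbitrary
    # methods); everything else goes through the faster pandas eval.
    if expr.startswith('@'):
        return eval(expr[1:], globals(), dict(locals_d or {}, df=df))
    return df.eval(expr)

df = pd.DataFrame({'time': [10, 20], 'cost': [1.0, 2.0]})
pandas_result = eval_expression('time * cost', df)          # pandas eval
python_result = eval_expression('@df.time.clip(0, 15)', df)  # python eval
```

An Excel-friendly alternative tag would only need to change the `startswith` test, e.g. a `py:` prefix instead of `@`.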
In example_4step/configs/settings.yaml, locals_OD_aggregate: should be locals_aggregate_od:
Update default crash cost to a more reasonable value to avoid confusion
assignment_expressions is now a DataFrame, not a Series
Daysim writes tsv by default. Please expose the pandas read_csv sep argument in the settings.yaml file:
http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html
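A minimal sketch of what exposing this might look like (the `csv_sep` settings key is a hypothetical name, not an existing bca4abm setting):

```python
import io
import pandas as pd

# Hypothetical settings dict as parsed from settings.yaml; a csv_sep key
# would let users read Daysim's tab-separated output without pre-conversion.
settings = {'persons_file_name': 'persons.tsv', 'csv_sep': '\t'}

# stand-in for the actual file on disk
tsv_data = io.StringIO("hhno\tpno\tpdpurp\n1\t1\t1\n1\t2\t2\n")
persons = pd.read_csv(tsv_data, sep=settings.get('csv_sep', ','))
print(persons.columns.tolist())  # → ['hhno', 'pno', 'pdpurp']
```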
Often many build scenarios are run against a single base scenario. Should we copy the base folder input data each time, or just add support for multiple build folders? Or maybe just a base and build input and outputs folder location in the settings file?
What is the copyright/ownership status of the existing sandag bca tool - can I quote/excerpt from the MSSQL stored procedures in the bca4abm source or documentation?
For the aggregate_zone processor, we will allow the analyst to specify a list of csv files that should be combined into a single table.
I am wondering whether to include the 1024 column cval table, and whether that needs to be special-cased. All the other zone files will have two versions, one for the build and one for the base scenario. I was thinking we could handle that by automatically prepending base_ or build_ to the column names, but we probably don’t want to create two versions of the 1024 cval columns (unless they can be different in build and base? In which case we also need two versions in the aggregate demographics processor?)
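The prefixing idea above can be sketched in a few lines with pandas (the column and file names here are made up for illustration):

```python
import pandas as pd

# Sketch of the proposed convention: the same zone file is read from the
# base and build folders, and columns are prefixed so both versions can
# live side by side in a single combined zone table.
base = pd.DataFrame({'zone': [1, 2], 'emp': [100, 200]}).set_index('zone')
build = pd.DataFrame({'zone': [1, 2], 'emp': [110, 190]}).set_index('zone')

zones = base.add_prefix('base_').join(build.add_prefix('build_'))
print(zones.columns.tolist())  # → ['base_emp', 'build_emp']
```

A scenario-invariant table like cval could simply be joined in once, unprefixed.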
I thought there was supposed to be a different vot for commute and non-commute tours?
tests fail with orca 1.4.0
Which of the various daysim purpose category codes can appear as tour purposes (pdpurp)?
0 'home'
1 'work'
2 'school'
3 'escort'
4 'pers.bus'
5 'shop'
6 'meal'
7 'social/recreation'
10 'change mode (park and ride)'
The sandag bca tool carries a very large number of columns into the multiyear processor and final report. This greatly increases its complexity, and reduces its flexibility.
The more parsimonious we are in this area, the easier it will be for the tool to be adapted to different abm data sources. Also it will make it more flexible in terms of experimental modifications to the bca calculations.
My intent is to start by implementing the bare minimum and we can debate the tradeoffs inherent in adding more detailed reporting as we move forward.
Scream if you have a problem with this approach.
it would be good to speed up the slow processors - most likely the od processor. We can test with the full Oregon Metro example data set and set of expressions.
Need to make sure bca4abm works for Python 3 (and update all related materials as well). Updating ActivitySim to work for both 2 and 3 wasn't a big deal, so updating bca4abm should be relatively straightforward.
@toliwaga and I decided on the following terminology and model setup.
The BCA tool will expect the user to provide a base folder location and a build folder location. The two folders will contain the same input files.
The four sets of alternatives are named as follows: within the code and in the exposed expressions, alt and altlos are used.
For the four step example, aggregate_results does not output anything. The bug appears to be in add_aggregate_results().
cval is Metro specific language, so we should change this to something more generic like hhs. See cval_file_name: mf.cval.csv in https://github.com/RSGInc/bca4abm/blob/master/example_4step/configs/tables.yaml for example.
Hardwired travis to use toolz=0.8.0 because 0.8.2 is not in the cache. We should check back and remove this pin when it is fixed.
The sandag bca tool takes scenario runs for multiple years and does some sort of interpolation/extrapolation between the years. Do we need to support anything like that? And if so, what is the spec for which columns are to be handled in what ways?
The sandag bca tool weights ovt differently for vot calculations but ovt isn't broken out for transit trips in bcatest6 sample db
Are we just going to ignore this for the initial version?
We need to be able to read multiple link files and sum them up to daily totals.
Sometimes we need to read a zone vector into the od processors as well. For example, parking cost at the destination. We should add the ability to also read in the zone data and specify if the zone vector is replicated to a full matrix by row or by column. For now you can pre-process the zone vector to create a matrix.
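The row/column replication described above is a one-liner with numpy broadcasting; a sketch, with a hypothetical per-zone parking cost vector:

```python
import numpy as np

# Replicate a per-zone vector (e.g. parking cost at the destination) to a
# full zone-by-zone matrix, either constant down each column (varies by
# destination) or constant across each row (varies by origin).
parking_cost = np.array([1.0, 2.5, 0.0])  # one value per zone (illustrative)
n = len(parking_cost)

by_destination = np.broadcast_to(parking_cost[np.newaxis, :], (n, n))
by_origin = np.broadcast_to(parking_cost[:, np.newaxis], (n, n))
```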
at least there isn't any in the trip file variables in bcatest6 spreadsheet.
Up until now we have only run this against the trivially small sample database.
We have no idea what the performance will be with a big dataset. It would be nice to find out.
Who is tasked with assembling a full scale dataset for testing?
Why bother with orig and dest taz or microzone ids in the trips file if we already have travel time and cost columns?
Sandag BCA tool maps taxi to auto. I'm looking at daysim TMODEDETP and MODE category codes and wondering how taxi trips are categorized in daysim.
Also wondering whether toll and fare are aggregated into travcost - and whether this is handled the same in CT-RAMP and daysim?
vehicle ownership is by households, but some coc determinations (age) are person-based and some are hh based (income)
to allocate auto ownership by coc, do we use the age of the oldest hh member?
Suppose there is a household with two members aged 35 and 85 and two cars: do we try to be really clever and allocate one car to each hh member? And what if there are three adult hh members and two cars?
The aggregate_od.csv output file label is "link" but should be something like "od". This is the last line in
https://github.com/RSGInc/bca4abm/blob/mce/bca4abm/processors/four_step/aggregate_od.py
The specification that @jfdman provided for SANDAG includes special logic for dealing with the toll matrix since some of the OD pairs have a toll + 10000 to identify it as a special toll. @mabcal suggested using a SCALE column for calculating the auto operating cost since some agencies only use a distance matrix times a scalar. To support these types of customizations in the matrix specification, I think we should add an EXPRESSION column for each matrix read that supports a Python expression. This expression could be applied when the matrix is first read, or could be applied on demand. Exposing this flexibility in the form of a Python expression is the spirit of ActivitySim and would go a long way toward a more generic tool. @toliwaga and I decided to wait and implement this after we get the minimal operating solution up and running.
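A minimal sketch of the proposed EXPRESSION column, not the implemented behavior: the raw matrix is bound to a name (here `m`, an assumption) and the expression is evaluated when the matrix is read. This covers both the SANDAG toll+10000 flag and the distance-times-scalar auto operating cost.

```python
import numpy as np

def read_matrix(raw, expression=None):
    # Hypothetical EXPRESSION column support: a python expression applied
    # to the matrix when it is first read, with the matrix bound to 'm'.
    m = np.asarray(raw, dtype=float)
    if expression:
        m = eval(expression, {'np': np, 'm': m})
    return m

# strip the +10000 special-toll flag from flagged OD pairs
toll = read_matrix([[0, 10002], [3, 0]], 'np.where(m >= 10000, m - 10000, m)')

# auto operating cost as distance times an agency scalar (illustrative value)
auto_cost = read_matrix([[0, 5], [5, 0]], 'm * 0.185')
```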
a link results output table is missing
specifically they could go in transposed summary_results.csv and coc_silos.csv
Link level benefits are currently computed as results = results['base'] - results['build'] but should be results['build'] - results['base']. Line 93 of link.py.
It might be worthwhile to have a load_data_processor that reads the csv data files into an hdf5 store and then the individual processors could read their input from the store (a la activitysim.)
This might be a lot faster, as the load process wold only need to be run if the input data changed, which might be convenient while the model is being initially built, tweaked, and re-run with the same data, but revised specs and settings.
Also, it would make it more flexible because different versions of the load_data_processor could read data from different sources, including reading the data from, say, a MSSQL database.
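A rough sketch of that load step, assuming the optional pytables dependency is available (the function name and table keys are illustrative):

```python
import io
import os
import tempfile
import pandas as pd

def load_data(csv_sources, store_path):
    # Hypothetical load_data_processor: parse each csv input once and write
    # it to an hdf5 store; downstream processors then read from the store
    # instead of re-parsing csv on every run.
    with pd.HDFStore(store_path, mode='w') as store:
        for table_name, source in csv_sources.items():
            store[table_name] = pd.read_csv(source)

path = os.path.join(tempfile.mkdtemp(), 'bca_inputs.h5')
load_data({'trips': io.StringIO("otaz,dtaz\n1,2\n2,1\n")}, path)
trips = pd.read_hdf(path, 'trips')
```

Swapping in a MSSQL-backed loader would only mean replacing the `pd.read_csv` call with `pd.read_sql`.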
We currently write out trace files for 4step as follows:
What we want to do instead for the od processor is write a sum for each origin district to destination district pair. We will code the districts at the TAZ level, probably in the COCs definition file. We will then trace out a FROM DISTRICT, TO DISTRICT, aggregated calculation result for expression 1, aggregated calculation result for expression 2, etc. The output file will look something like this:
| FROM DISTRICT | TO DISTRICT | TRIPS | TIMES | ... |
|---|---|---|---|---|
| zone-group-1 | zone-group-1 | 5678 | 78 | ... |
| zone-group-1 | zone-group-2 | 456 | 34 | ... |
| zone-group-2 | zone-group-1 | 1234 | 234 | ... |
| zone-group-2 | zone-group-2 | 8786 | 222 | ... |
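The district-level aggregation above is a straightforward groupby; a sketch, with a made-up taz-to-district mapping standing in for the COCs definition file:

```python
import pandas as pd

# Map each taz to a district (hypothetically defined in the COCs file),
# then sum the per-OD expression results within each district pair.
taz_district = {1: 'zone-group-1', 2: 'zone-group-1', 3: 'zone-group-2'}
od = pd.DataFrame({'otaz': [1, 1, 3], 'dtaz': [2, 3, 1],
                   'trips': [100, 50, 75], 'times': [10, 20, 15]})

od['from_district'] = od['otaz'].map(taz_district)
od['to_district'] = od['dtaz'].map(taz_district)
summary = od.groupby(['from_district', 'to_district'])[['trips', 'times']].sum()
```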
If the travcost trip file variable is for an exploded person-trip, has the trip travel cost been pro-rated or otherwise allocated to the appropriate persons travelling on that trip?
it will get long for big datasets...
@VinceBernardin said "I think the logic to determine which nodes are intersections would be pretty basic. If the node is connected to a freeway link, it is not an intersection, and if the node is a centroid it is not an intersection. I think that would probably be good enough. Then we would just need to determine the number of legs and the max and min volume approaches which takes a little work, requiring a join between the links and nodes but isn’t really too difficult."
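A rough sketch of that rule, assuming made-up link columns (`a_node`, `b_node`, `freeway`, `volume`) and a centroid set; each link contributes an approach at both of its end nodes:

```python
import pandas as pd

# Illustrative link table; column names are hypothetical, not a real schema.
links = pd.DataFrame({'a_node': [1, 2, 2, 3], 'b_node': [2, 3, 4, 4],
                      'freeway': [False, False, False, True],
                      'volume': [500, 300, 200, 900]})
centroids = {4}

# join links to nodes: stack both link ends so each row is one approach
approaches = pd.concat([links.rename(columns={'a_node': 'node'}),
                        links.rename(columns={'b_node': 'node'})])[
                            ['node', 'freeway', 'volume']]

by_node = approaches.groupby('node').agg(legs=('volume', 'size'),
                                         max_vol=('volume', 'max'),
                                         min_vol=('volume', 'min'),
                                         touches_freeway=('freeway', 'any'))

# a node is an intersection unless it touches a freeway link or is a centroid
by_node['is_intersection'] = (~by_node['touches_freeway']
                              & ~by_node.index.isin(list(centroids)))
```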
In order to illustrate the outputs, I've created an outputs table in the wiki for the example. We need to populate it with descriptions, units, and anything else important.