
policyengine-api's Introduction

PolicyEngine

This repository contains the core infrastructure for policyengine.org. Namely:

  • policyengine, a Python package which contains the server-side implementations, and
  • policyengine-client, a React library containing high-level components to build the client-side interface.

Development

NOTE: requires Python 3.7

First, ensure you have pnpm installed: https://pnpm.io/installation.

Then install with make install. To debug the client, run make debug-client; to debug the server, run make debug-server.

If your changes involve the server, change useLocalServer = false; to useLocalServer = true; in policyengine-client/src/countries/country.jsx. Otherwise, change usePolicyEngineOrgServer = false; to usePolicyEngineOrgServer = true; in policyengine-client/src/countries/country.jsx.

If you don't have access to the UK Family Resources Survey, you can still run the UK population-wide calculator on an anonymised version. To do that, run UK_SYNTHETIC=1 make debug-server instead of make debug-server.

policyengine-api's People

Contributors

anth-volk, maxghenis, nikhilwoodruff, pavelmakarchuk, sgarciahelguera

policyengine-api's Issues

get_economic_impact and get_analysis endpoints return messages mid-computation

This isn't a bug so much as a suggestion for changing current behavior. At present, a GET request to /{country_id}/economy/{policy_id}/over/{baseline_policy_id} can return three different 200 OK JSON responses:

  • The first, with a status of computing, is returned when a job is first logged; it appears that every user receives this on their first request
  • If the request is resent, the user often receives a message indicating their position in the job queue, along with an average processing time
  • Once the job has completed, the user receives the computed results

From an API design standpoint, I think it may be best to alter this flow in one of two ways:

  • Option 1: Return the first two responses with a 202 Accepted instead of a 200 OK, so that users can test for errors more easily and determine programmatically whether a job has completed
  • Option 2: Incorporate more async functionality, so that the user simply awaits a successful result or an error, and note in future docs that this process takes some time. This could even involve returning a 202 Accepted with a merged version of the first two responses, then adding a post-processing handler at a different endpoint that returns either a 200 OK with the completed data or an error message
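
Option 1 could look roughly like the sketch below. It is not the current implementation: the job record layout, the "queued" state, and the queue_position and average_time field names are assumptions made for illustration; only the computing status comes from the behavior described above.

```python
from typing import Dict, Tuple

# "computing" is the status described in this issue; "queued" is an assumed name
# for the second response (position in the job queue).
PENDING_STATES = {"computing", "queued"}


def economy_response(job: Dict) -> Tuple[Dict, int]:
    """Map a (hypothetical) job record to a (JSON body, HTTP status code) pair."""
    status = job["status"]
    if status in PENDING_STATES:
        # Job accepted but not finished: 202 tells clients to poll again later.
        body = {
            "status": status,
            "queue_position": job.get("queue_position"),  # assumed field
            "average_time": job.get("average_time"),  # assumed field
        }
        return body, 202
    if status == "ok":
        # Job finished: only now is a 200 returned, together with the results.
        return {"status": "ok", "result": job["result"]}, 200
    # Anything else is treated as a failed job.
    return {"status": "error", "message": job.get("message")}, 500
```

With this mapping, a client can loop on the status code alone: retry while it is 202, and read the result once it is 200.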

Deployment failing

Most recent PR deployment

Beginning deployment of service [default]...
╔════════════════════════════════════════════════════════════╗
╠═ Uploading 7 files to Google Cloud Storage ═╣
╚════════════════════════════════════════════════════════════╝
File upload done.
ERROR: (gcloud.app.deploy) INVALID_ARGUMENT: The following quotas were exceeded: CPUS (quota: 24, used: 18 + needed: 8).
make: *** [Makefile:17: deploy] Error 1
Error: Process completed with exit code 2.

Investigate `sqlalchemy` forced pin

To get tests passing, I had to downgrade sqlalchemy to >=1.4,<2 (the major version bump seems to break the interface). Not a high priority, just filing to make a note.

Compute top-level net income impacts (for decile chart) independently of deciles

Currently these are averaged across deciles, so for some reforms the relative and absolute impacts point in opposite directions, e.g. this.

Let's instead compute these as total aggregates, i.e.

  • Absolute = total change to net income in £/$ divided by number of households
  • Relative = absolute / total baseline net income

Then we should change the client to read:

  • [Reform] would {increase, decrease} total net income by $x per household
  • [Reform] would {increase, decrease} total net income by y%
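
A minimal sketch of the aggregate calculation, assuming weighted household-level net incomes are available as plain lists. The function and argument names are hypothetical, and "Relative" is read here as total change divided by total baseline net income, which is one interpretation of the formula above.

```python
from typing import Sequence, Tuple


def net_income_impact(
    baseline: Sequence[float],
    reform: Sequence[float],
    weights: Sequence[float],
) -> Tuple[float, float]:
    """Return (absolute change per household, relative change) from totals.

    baseline and reform hold each household's net income under the two
    policies; weights holds the number of real households each row represents.
    """
    total_baseline = sum(b * w for b, w in zip(baseline, weights))
    total_change = sum((r - b) * w for b, r, w in zip(baseline, reform, weights))
    households = sum(weights)
    absolute = total_change / households  # currency units per household
    relative = total_change / total_baseline  # share of total baseline net income
    return absolute, relative
```

Because both figures share the same numerator (the total change), they always have the same sign, which avoids the directional mismatch that averaging per-decile figures can produce.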

500 server error when computing UK reform impact

https://beta.policyengine.org/uk/policy?reform=23&focus=policyOutput.netIncome&region=uk&timePeriod=2022&baseline=1

We ran into an issue when trying to simulate your policy. Please try again later. The full message is "Error computing reform impact: <!doctype html>\n\n<title>500 Internal Server Error</title>\nInternal Server Error\nThe server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.\n"

This reform changes the blind TV license discount.

API allows submission of garbage data to /{country_id}/household POST endpoint

While doing some manual testing in order to build out the OpenAPI specs, I sent the following body to the /uk/household POST endpoint via Postman:

{ "label": "testValue", "data": { "dataPoint1": "testValue" } }

This data was successfully added to the database as household_id 30573. I'm not sure if this is even really an issue, but it may be beneficial to define required fields for the data object and validate them on the server before writing to the database.
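
Server-side validation might be sketched as follows. The allowed keys are placeholders rather than the real household schema, and validate_household_payload is a hypothetical helper, not an existing function in the API.

```python
from typing import Any, List

# Hypothetical top-level keys for the "data" object; the real schema may differ.
ALLOWED_ENTITY_KEYS = {"people", "households"}


def validate_household_payload(payload: Any) -> List[str]:
    """Return a list of problems with the payload; an empty list means valid."""
    if not isinstance(payload, dict):
        return ["payload must be a JSON object"]
    data = payload.get("data")
    if not isinstance(data, dict):
        return ["'data' must be a JSON object"]
    errors = []
    unknown = sorted(set(data) - ALLOWED_ENTITY_KEYS)
    if unknown:
        errors.append("unknown keys in 'data': " + ", ".join(unknown))
    if "people" not in data:
        errors.append("'data' must contain 'people'")
    return errors
```

The controller could call this before the database insert and return a 400 Bad Request with the error list whenever it is non-empty; the payload from this issue would be rejected because dataPoint1 is not a recognised key.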

Deploying API exceeds CPU quota

I tried retrying but it didn't work. This is blocking deployments.

https://github.com/PolicyEngine/policyengine-api/actions/runs/4154248138/jobs/7186563968#step:7:43

ERROR: (gcloud.app.deploy) INVALID_ARGUMENT: The following quotas were exceeded: CPUS (quota: 24, used: 24 + needed: 4), IN_USE_ADDRESSES (quota: 8, used: 8 + needed: 1).

Some suggestions at https://stackoverflow.com/questions/43656886/google-app-engine-error-gcloud-app-deploy-invalid-argument-the-following-quo

get_search tests for malformed country_id by raising error instead of returning 404

As with the get_analysis controller, get_search raises a ValueError if the country_id supplied in the URL is malformed, instead of returning the standard 404 Not Found response used elsewhere in the application. As a result, a user who inputs an incorrect country_id actually receives a 500 Internal Server Error.

To alter this, I'd propose one of two actions:

  • More likely: Alter the code so that it returns the standard 404 message
  • Less likely: if this function is utilized elsewhere on the back-end, it may be appropriate to build a separate controller wrapper that returns the standard 404 Not Found
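
The first option could be sketched like this. The country list, the message wording, and the simplified get_search signature are assumptions for illustration; the verification function actually used elsewhere in the app may look different.

```python
from typing import Dict, Optional, Tuple

# Assumed set of supported countries, for illustration only.
COUNTRY_IDS = ("uk", "us")


def validate_country(country_id: str) -> Optional[Tuple[Dict, int]]:
    """Return a (body, 404) pair for an unknown country, or None if it is valid."""
    if country_id in COUNTRY_IDS:
        return None
    body = {
        "status": "error",
        "message": "Country {} not found. Available countries are: {}".format(
            country_id, ", ".join(COUNTRY_IDS)
        ),
    }
    return body, 404


def get_search(country_id: str, query: str) -> Tuple[Dict, int]:
    """Simplified stand-in for the controller: check the country before searching."""
    invalid = validate_country(country_id)
    if invalid is not None:
        return invalid  # 404 response instead of an uncaught ValueError
    return {"status": "ok", "result": []}, 200  # real search logic omitted
```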

get_analysis does not error test for malformed country_id

Unlike most other controllers, the get_analysis controller, defined in policyengine_api/endpoints/analysis.py, does not check whether the country_id is valid. If an incorrect country_id is used, instead of returning the standard 404 Not Found used elsewhere across the app, it returns a different 404 Not Found indicating that the URL was not found.

The fix is likely simple: import the verification function used elsewhere and invoke it at the top of the controller.

Unexpected behavior when no policies are found via /{country_id}/policies

While manually testing the API to build out the OpenAPI specs, I accessed the following URL via Postman:

api.policyengine.org/ng/policies?query=tax

Instead of returning the expected JSON output as defined in get_policy_search, lines 151-155, I received the following with a 200 OK status code:

{ "message": null, "result": [], "status": "ok" }

Additionally, get_policy_search as currently defined does not appear to explicitly return a 404 status code when no policies are found.

Auto-retry failed API deployments

There have been a few cases recently where GCP has run into an error, but a simple redeploy worked. We could probably automate this with an extra action step.
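
One way to sketch the retry logic, independent of how the action step is eventually written. The function name and defaults are hypothetical; the run and sleep parameters exist only so the helper can be exercised without actually deploying.

```python
import subprocess
import time
from typing import Callable, Sequence


def run_with_retries(
    command: Sequence[str],
    attempts: int = 3,
    delay_seconds: float = 30.0,
    run: Callable = subprocess.run,
    sleep: Callable = time.sleep,
) -> int:
    """Run command until it exits with code 0; return the attempt that succeeded."""
    for attempt in range(1, attempts + 1):
        result = run(command)
        if result.returncode == 0:
            return attempt
        if attempt < attempts:
            sleep(delay_seconds)  # give a transient error time to clear
    raise RuntimeError(
        "{} failed after {} attempts".format(" ".join(command), attempts)
    )
```

A deploy step could then call run_with_retries(["make", "deploy"]). Retrying only helps with transient failures; a persistent problem such as an exceeded quota would still fail after the last attempt.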

File name too long after make install on windows

(base) PS C:\Users\Kevin\Documents\policyengine-api> make install
pip install -e .
Obtaining file:///C:/Users/Kevin/Documents/policyengine-api
Preparing metadata (setup.py) ... done
Collecting policyengine_uk@ git+https://github.com/policyengine/policyengine-uk@policyengine-dev
Cloning https://github.com/policyengine/policyengine-uk (to revision policyengine-dev) to c:\users\kevin\appdata\local\temp\pip-install-n3zx7ogc\policyengine-uk_17169ec21bfe49048596570357a65f04
Running command git clone --filter=blob:none --quiet https://github.com/policyengine/policyengine-uk 'C:\Users\Kevin\AppData\Local\Temp\pip-install-n3zx7ogc\policyengine-uk_17169ec21bfe49048596570357a65f04'
error: unable to create file policyengine_uk/tests/policy/baseline/gov/dwp/pension_credit/guarantee_credit/minimum_guarantee/additional/severe_disability_minimum_guarantee_addition.yaml: Filename too long
fatal: unable to checkout working tree
warning: Clone succeeded, but checkout failed.
You can inspect what was checked out with 'git status'
and retry with 'git restore --source=HEAD :/'

error: subprocess-exited-with-error

× git clone --filter=blob:none --quiet https://github.com/policyengine/policyengine-uk 'C:\Users\Kevin\AppData\Local\Temp\pip-install-n3zx7ogc\policyengine-uk_17169ec21bfe49048596570357a65f04' did not run successfully.
│ exit code: 128
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× git clone --filter=blob:none --quiet https://github.com/policyengine/policyengine-uk 'C:\Users\Kevin\AppData\Local\Temp\pip-install-n3zx7ogc\policyengine-uk_17169ec21bfe49048596570357a65f04' did not run successfully.
│ exit code: 128
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
make: *** [Makefile:2: install] Error 1
(base) PS C:\Users\Kevin\Documents\policyengine-api>

Automatically update API

Could we have the API check the GitHub repos for updates every X minutes, reinstalling and updating the version numbers if needed? Would probably be much faster than redeploying.

Make auto-country-update PRs update a single branch

Each country update currently opens a new PR. Could we use --force to ensure they only edit a single branch, say Update country models, that updates all country models to their latest version if different? This would save deployment time.

Only save non-default input variables to household

For example, in #200 I linked this household, which I defined by only setting the state to NY; all other variables were default. But it now breaks because it referred to co_tanf_countable_income, which is no longer in the system. Can we tie household IDs only to non-default variables, so we only break households with variables that changed?
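
The idea can be sketched as a filter applied before saving. This assumes a flat mapping from variable name to value, whereas real household data is likely nested; strip_defaults and the state_code name are hypothetical.

```python
from typing import Any, Dict


def strip_defaults(inputs: Dict[str, Any], defaults: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only the variables whose value differs from the model default."""
    return {
        name: value
        for name, value in inputs.items()
        # Variables without a known default are kept, to be safe.
        if name not in defaults or value != defaults[name]
    }


# Illustrative values: only the state was changed by the user.
defaults = {"state_code": "CA", "co_tanf_countable_income": 0}
inputs = {"state_code": "NY", "co_tanf_countable_income": 0}
saved = strip_defaults(inputs, defaults)
```

Here saved contains only state_code, so removing co_tanf_countable_income from the model later would no longer break the stored household.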

get_search doesn't properly handle malformed type query parameter

Similar to the issue with malformed country_id parameters, if a user accesses the get_search endpoint with a malformed type query parameter, the function raises a ValueError, so the endpoint returns a 500 Internal Server Error rather than a client error response (perhaps 400 Bad Request?).
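
A possible shape for the fix, assuming the ValueError is caught in the controller and converted into a 400. The allowed type values and the simplified get_search signature are placeholders, not the real ones.

```python
from typing import Dict, Optional, Tuple

# Placeholder values for the `type` query parameter; the real set may differ.
ALLOWED_TYPES = ("parameter", "variable")


def parse_search_type(raw: Optional[str]) -> Optional[str]:
    """Return the validated type, or None if no filter was requested."""
    if raw is None:
        return None
    if raw not in ALLOWED_TYPES:
        raise ValueError(
            "type must be one of: {}; got {!r}".format(", ".join(ALLOWED_TYPES), raw)
        )
    return raw


def get_search(query: str, type_param: Optional[str] = None) -> Tuple[Dict, int]:
    """Simplified stand-in for the controller: bad input becomes a 400, not a 500."""
    try:
        search_type = parse_search_type(type_param)
    except ValueError as error:
        return {"status": "error", "message": str(error)}, 400
    # Real search logic omitted; echo the filter that would be applied.
    return {"status": "ok", "result": [], "type": search_type}, 200
```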

Update APIs concurrently

Currently, each API update leaves the population impacts broken for 20 minutes because the compute API is updated sequentially after the main API. We could update them both concurrently.
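
As a rough sketch of running both updates at once, assuming each deployment can be wrapped in a zero-argument callable. deploy_all and the deployer names are hypothetical, and the real deployments would more likely be parallel CI jobs than Python threads.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict


def deploy_all(deployers: Dict[str, Callable[[], Any]]) -> Dict[str, Any]:
    """Run every deployer concurrently and return each one's result by name."""
    with ThreadPoolExecutor(max_workers=max(1, len(deployers))) as pool:
        # Submit everything first so the deployments overlap in time.
        futures = {name: pool.submit(deploy) for name, deploy in deployers.items()}
        # result() re-raises any exception from the corresponding deployer.
        return {name: future.result() for name, future in futures.items()}
```

For example, deploy_all({"main": deploy_main, "compute": deploy_compute}) would start both at once, where deploy_main and deploy_compute stand in for whatever currently runs each deployment.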
