sdg_11.2.1's People

Contributors

abhsheksingh, ainsliesimons, antonio-john, dependabot[bot], james-westwood, jwestw, nkshaw23, paigeh1

sdg_11.2.1's Issues

Import the correct NaPTAN data

  • Get the data from the NaPTAN zip.
  • Can this be opened from the remote source (because it's pretty big)?
  • Consider processing with Dask if it's really large (find out the size first).

Filter the stops data just to Birmingham (or any place)

  • Birmingham_stops = gpd.sjoin(just_birmingham_geom, stops_geo_df, how = 'left', op = 'intersects'), where just_birmingham_geom is the geometry of just Birmingham (from the Birmingham_LSOAs shapefile). Newer GeoPandas releases name the op argument predicate.
  • Check that both are on the same projection: just_birmingham_geom.crs == stops_geo_df.crs should be True. Both should be EPSG:27700.
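A minimal sketch of the filter, assuming a GeoPandas version that accepts `predicate` (0.10 or later). The function name `filter_stops_to_area` and the toy data are made up for illustration. Unlike the call in the bullet above, this version puts the stops on the left and uses an inner join, so the result contains only stop rows that fall inside the area. That is a design choice of this sketch, not necessarily what the project does.

```python
import geopandas as gpd
from shapely.geometry import Point, box


def filter_stops_to_area(stops_gdf, area_gdf):
    """Keep only the stops whose geometry intersects the area polygons."""
    # Refuse to join if the two layers are on different projections.
    if stops_gdf.crs != area_gdf.crs:
        raise ValueError(f"CRS mismatch: {stops_gdf.crs} vs {area_gdf.crs}")
    joined = gpd.sjoin(
        stops_gdf, area_gdf[["geometry"]], how="inner", predicate="intersects"
    )
    # A stop on a shared boundary can match two polygons; keep one row per stop.
    joined = joined[~joined.index.duplicated(keep="first")]
    return joined.drop(columns="index_right")


# Toy data in British National Grid (EPSG:27700); coordinates are not real.
stops = gpd.GeoDataFrame(
    {"stop_id": ["A", "B", "C"]},
    geometry=[Point(1, 1), Point(5, 5), Point(20, 20)],
    crs="EPSG:27700",
)
area = gpd.GeoDataFrame({"name": ["toy"]}, geometry=[box(0, 0, 10, 10)], crs="EPSG:27700")

stops_in_area = filter_stops_to_area(stops, area)
```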

Create SQL functions

  • Connecting to SQL Server (probably MySQL, eventually BigQuery)
  • Creating a New Database
  • Connecting to the Database
  • Creating Tables
  • Populating the Tables
  • Reading data and formatting the output into a pandas DataFrame
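The steps above could look roughly like this. The sketch uses the standard-library `sqlite3` as a stand-in, because no database server has been chosen yet; with MySQL or BigQuery the connection call and SQL dialect would differ, and the database would need an explicit creation step (SQLite creates it on connect). The table layout and function names are hypothetical.

```python
import sqlite3

import pandas as pd


def create_connection(db_path=":memory:"):
    """Connect to the database; SQLite creates it if it does not exist."""
    return sqlite3.connect(db_path)


def create_stops_table(conn):
    """Create a hypothetical stops table if it is not already there."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS stops ("
        "atco_code TEXT PRIMARY KEY, common_name TEXT, "
        "easting INTEGER, northing INTEGER)"
    )
    conn.commit()


def populate_stops(conn, rows):
    """Insert (atco_code, common_name, easting, northing) tuples."""
    conn.executemany("INSERT OR REPLACE INTO stops VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def read_query(conn, query):
    """Run a SELECT and return the result as a pandas DataFrame."""
    return pd.read_sql_query(query, conn)


conn = create_connection()
create_stops_table(conn)
populate_stops(conn, [("S1", "Stop A", 100, 200), ("S2", "Stop B", 300, 400)])
stops_df = read_query(conn, "SELECT * FROM stops ORDER BY atco_code")
```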

Put ages into buckets

Check if functions exist to bucketize values.
If no function exists, write a function to bucketize the age values.
Looking at other SDG indicators, buckets should be:

From Indicator 3.6.1

  • 4 and under
  • 5 to 7
  • 8 to 11
  • 12 to 15
  • 16 to 19
  • 20 to 29
  • 30 to 39
  • 40 to 49
  • 50 to 59
  • 60 to 69
  • 70 to 79
  • 80 and over

And from Indicator 3.4.2

  • 10 to 14
  • 15 to 19
  • 20 to 24
  • 25 to 29
  • 30 to 34
  • 35 to 39
  • 40 to 44
  • 45 to 49
  • 50 to 54
  • 55 to 59
  • 60 to 64
  • 65 to 69
  • 70 to 74
  • 75 to 79
  • 80 to 84
  • 85 to 89
  • 90 and over

These do not agree with each other. Need to ask.
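On the "does a function exist" question: `pandas.cut` does this kind of binning. Below is a minimal sketch using the Indicator 3.6.1 buckets as the default. Which scheme is correct is still unresolved, so the edges and labels are parameters; the constant and function names are made up for illustration.

```python
import pandas as pd

# Lower edge of each 3.6.1 bucket, plus an open-ended top edge.
EDGES_3_6_1 = [0, 5, 8, 12, 16, 20, 30, 40, 50, 60, 70, 80, float("inf")]
LABELS_3_6_1 = [
    "4 and under", "5 to 7", "8 to 11", "12 to 15", "16 to 19", "20 to 29",
    "30 to 39", "40 to 49", "50 to 59", "60 to 69", "70 to 79", "80 and over",
]


def bucketize_ages(ages, edges=EDGES_3_6_1, labels=LABELS_3_6_1):
    """Assign each age to a labelled bucket; intervals are [lower, upper)."""
    return pd.cut(pd.Series(ages), bins=edges, labels=labels, right=False)


buckets = bucketize_ages([0, 4, 5, 11, 12, 19, 20, 79, 80, 101])
```

Switching to the 3.4.2 scheme would only mean passing a different pair of edges and labels, although that scheme has no bucket below age 10, so younger ages would come back as missing values.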

Speak to the data team to exactly understand their data output requirements

Hi Data team. @TomosJH @EmmaWoodONS @denzil86

I need your help. In order to get 11.2.1 to a stage where it has working output for the platform, I need to understand the data team's requirements for this stage of the project in terms of:

  • data format
  • layout including exact column titles
  • number of decimal places, where applicable
  • any other requirements

Whatever your requirements are, please can I get a mock-up of the data so I have a visual reference for what I am aiming for?

I stress "this stage" of the project because I am working towards version 1.0 of the output, which is working code that correctly calculates the numbers for the whole country. See the projects page for other stages.

Get the population-weighted centroids into a dataframe

  • import shapefiles from Output_Areas__December_2011__Population_Weighted_Centroids-shp.zip into a dataframe
  • left join on OA code into the filtered population data
  • if it's too complicated to filter/join then use whole country
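A sketch of the left join, assuming the centroids are already loaded into a dataframe (for example with `geopandas.read_file`) and that both tables share an OA code column. The column name `OA11CD`, the function name and the toy values are assumptions. The join also reports which OA codes found no centroid, which helps decide whether the filtering is working.

```python
import pandas as pd


def attach_centroids(population_df, centroids_df, oa_col="OA11CD"):
    """Left-join centroids onto population rows; also list unmatched OA codes."""
    merged = population_df.merge(
        centroids_df,
        on=oa_col,
        how="left",
        validate="many_to_one",  # each OA code should have a single centroid
        indicator=True,
    )
    missing = merged.loc[merged["_merge"] == "left_only", oa_col].tolist()
    return merged.drop(columns="_merge"), missing


# Toy tables; codes and coordinates are invented.
population = pd.DataFrame({"OA11CD": ["E001", "E002", "E003"], "pop": [100, 200, 300]})
centroids = pd.DataFrame({"OA11CD": ["E001", "E002"], "x": [1.0, 2.0], "y": [3.0, 4.0]})

pop_with_centroids, unmatched = attach_centroids(population, centroids)
```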

Clean up requirements.txt

Due to the problem with Conda environments not activating properly, there are tonnes of superfluous requirements listed.

Speed up data import with feathers

Running the script is getting slower, which is a pain for debugging, so I will use feather and geofeather files to store dataframes for quick retrieval.

Create script to fetch the timetable information from the FTP server

Before we start this, we should estimate how long it would take and how much effort it would need. An alternative is to go through a manual process using FileZilla and a zip utility. The manual process, with some instructions in the user docs, is the MVP.

The problem: we need to access the TNDS data at ftp.tnds.basemap.co.uk. I don't know how often the data is updated, but I suspect it will need regular re-downloading, so an auto-download feature would be a nice-to-have.

We have an FTP module already, so can it be adapted to fetch data from the Traveline (TNDS) FTP server without too much extra work?

  • I have the username and password already.
  • The files are all zipped, so if we were to implement an auto-download feature we would need to adapt the _import_extract_delete_zip function.
  • The file format is XML.
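To help with the estimate, here is roughly what an auto-download could look like using the standard-library `ftplib`. It is a sketch, not an adaptation of the existing FTP module: the function names and default destination folder are hypothetical, it assumes the zip files sit in the login directory, and it has not been run against the real server. Extraction would still go through something like _import_extract_delete_zip.

```python
from ftplib import FTP
from pathlib import Path

TNDS_HOST = "ftp.tnds.basemap.co.uk"  # host named in this issue


def download_zips(ftp, dest_dir):
    """Download every .zip in the current FTP directory into dest_dir."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    for name in ftp.nlst():
        if not name.lower().endswith(".zip"):
            continue  # skip anything that is not a zipped timetable file
        target = dest / Path(name).name
        with open(target, "wb") as fh:
            ftp.retrbinary(f"RETR {name}", fh.write)
        saved.append(target)
    return saved


def fetch_tnds(user, password, dest_dir="data/tnds"):
    """Log in to the TNDS server and download the zip files (untested)."""
    with FTP(TNDS_HOST) as ftp:
        ftp.login(user=user, passwd=password)
        return download_zips(ftp, dest_dir)
```

Keeping `download_zips` separate from the login step means it can be exercised with a fake FTP object, without touching the network or the credentials.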

Create output for QGIS and visualise

@MusaChirikeni I have a few questions about this ticket:

  • What are the most universal geo file formats to output to (e.g. shp, kml or geojson)?
  • Do those formats differ in size? Which one is most efficient?
  • Would it be helpful to output example maps as jpeg?
  • Should I output a file for the whole country (which would be a lot of data) or a smaller regional one for example?

Get QGIS installed

  • it can read the geospatial files and give visual reference
  • overlay layers to visualise
