geolocation's Introduction

Summary

Tools to create a geolocation API similar to that offered by Google

High Level Logic

  1. Takes list of place names from GeoNames
  2. Takes list of languages and prevalences by country from Wikipedia
  3. Parses and imports into MySQL DB
  4. Sets up API with Flask
  5. Responds to queries of text location strings with coordinates and country name
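As a rough illustration of step 3, the sketch below parses one tab-separated row of a GeoNames dump into a record shaped like a row of the places table. The function name `parse_geonames_line` is hypothetical, and deriving `clean_name` by lowercasing the name is an assumption based on the sample row in the DB Schema section; the repository's own parser may differ.

```python
# Column order of a GeoNames dump row (tab-separated, 19 fields).
GEONAMES_COLUMNS = [
    "geonameid", "name", "asciiname", "alternatenames", "latitude",
    "longitude", "feature_class", "feature_code", "country_code", "cc2",
    "admin1_code", "admin2_code", "admin3_code", "admin4_code",
    "population", "elevation", "dem", "timezone", "modification_date",
]


def parse_geonames_line(line):
    """Turn one GeoNames row into a dict shaped like a `places` record."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(GEONAMES_COLUMNS):
        raise ValueError(
            "expected %d fields, got %d" % (len(GEONAMES_COLUMNS), len(fields))
        )
    raw = dict(zip(GEONAMES_COLUMNS, fields))
    return {
        "name": raw["name"],
        # Assumption: clean_name is simply the lowercased name.
        "clean_name": raw["name"].lower(),
        "lat": float(raw["latitude"]),
        "lon": float(raw["longitude"]),
        "country": raw["country_code"],
        # Empty population/elevation fields are treated as 0 / missing.
        "population": int(raw["population"] or 0),
        "elevation": int(raw["elevation"]) if raw["elevation"] else None,
        "admin_name": raw["admin1_code"],
        "feature": raw["feature_code"],
    }
```

Each returned dict could then be handed to a parameterised INSERT statement for the places table.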

DB Schema

There are four tables: iso, places, language and admin1

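  1. places lists all place names and maps them to coordinates and other info

| name | clean_name | lat | lon | country | population | elevation | admin_name | feature |
|------|------------|-----|-----|---------|------------|-----------|------------|---------|
| Suhūl az̧ Z̧afrah | suhūl az̧ z̧afrah | 22.75 | 53.1667 | AE | 0 | 119 | 00 | 00 |
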
  2. admin1 lists all first-level administrative divisions, e.g. in the US these are states such as New York or Arizona

| code | name | ascii_name | pop | country | admin_code |
|------|------|------------|-----|---------|------------|
| AD.06 | Sant Julià de Loria | Sant Julia de Loria | 3039162 | AD | |

  3. iso lists all countries and their ISO codes

| name | iso2 | iso3 |
|------|------|------|
| Afghanistan | AF | AFG |

  4. language lists countries and their languages along with ISO code

| language | country_name | iso2 | status | lang_iso | level |
|----------|--------------|------|--------|----------|-------|
| Brunei Malay | brunei | BN | regional | NULL | 2 |

level indicates the importance of a language in that country, e.g. 'Official' is level 1 while 'Significant minority' is level 2
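To make the schema concrete, here is a minimal sketch that builds the four tables in an in-memory SQLite database and loads the sample rows shown above. SQLite is only a dependency-free stand-in for the MySQL DB, the column types are guesses, and the helper names `build_demo_db` and `languages_for` are hypothetical.

```python
import sqlite3

# Column names follow the tables above; the types are assumptions.
SCHEMA = """
CREATE TABLE places (name TEXT, clean_name TEXT, lat REAL, lon REAL,
                     country TEXT, population INTEGER, elevation INTEGER,
                     admin_name TEXT, feature TEXT);
CREATE TABLE admin1 (code TEXT, name TEXT, ascii_name TEXT, pop INTEGER,
                     country TEXT, admin_code TEXT);
CREATE TABLE iso (name TEXT, iso2 TEXT, iso3 TEXT);
CREATE TABLE language (language TEXT, country_name TEXT, iso2 TEXT,
                       status TEXT, lang_iso TEXT, level INTEGER);
"""


def build_demo_db():
    """Create the tables and insert sample rows from the README."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute(
        "INSERT INTO places VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
        ("Suhūl az̧ Z̧afrah", "suhūl az̧ z̧afrah", 22.75, 53.1667,
         "AE", 0, 119, "00", "00"),
    )
    conn.execute("INSERT INTO iso VALUES (?, ?, ?)",
                 ("Afghanistan", "AF", "AFG"))
    conn.execute(
        "INSERT INTO language VALUES (?, ?, ?, ?, ?, ?)",
        ("Brunei Malay", "brunei", "BN", "regional", None, 2),
    )
    conn.commit()
    return conn


def languages_for(conn, iso2, max_level=2):
    """List (language, level) pairs for a country, most important first."""
    cur = conn.execute(
        "SELECT language, level FROM language "
        "WHERE iso2 = ? AND level <= ? ORDER BY level",
        (iso2, max_level),
    )
    return cur.fetchall()
```

Filtering on `level` like this is one way a language hint could be weighted, with level 1 languages considered before level 2.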

Feature Types

Each feature has an associated type, covering populated places, geographical features, etc. The (partial) counts of the most common feature types are:

| Code | Count | Description |
|------|-------|-------------|
| PPLA3 | 90397 | Seat of a 3rd order division |
| PPLX | 91773 | Section of a populated place |
| HMSD | 99105 | Homestead |
| ADM3 | 108767 | 3rd level admin division |
| RSTN | 116788 | Railroad station |
| LCTY | 131307 | Locality (a minor area or place of unspecified or mixed character and indefinite boundaries) |
| PPLA4 | 131855 | Seat of a 4th order division |
| HTL | 133210 | Hotel |
| LK | 161605 | Lake |
| HLL | 173397 | Hill |
| STMI | 194574 | Intermittent stream |
| ADM4 | 206125 | 4th level admin division |
| FRM | 218814 | Farm |
| ISL | 220766 | Island |
| MT | 503068 | Mountain |
| STM | 593570 | Stream |
| PPL | 5812629 | Populated place |

Optimisations

  • Make name the primary key in the places table; this speeds up queries that filter on it in WHERE clauses
  • Eliminate all feature types except PPL, and drop any features with zero population
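The two optimisations above might look roughly like the sketch below, again using in-memory SQLite as a stand-in for MySQL. Note the swap: it creates a plain index on `clean_name` rather than a primary key, on the assumption that the same place name can occur more than once and so could not serve as a unique key. The function name `prune_places` and the index name are hypothetical.

```python
import sqlite3


def prune_places(conn):
    """Index clean_name, then keep only populated PPL rows.

    Returns the number of rows deleted.
    """
    # A non-unique index still speeds up WHERE clean_name = ? lookups.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_places_clean_name "
        "ON places (clean_name)"
    )
    # Drop every row that is not a populated place or has zero population.
    cur = conn.execute(
        "DELETE FROM places WHERE feature != 'PPL' OR population = 0"
    )
    conn.commit()
    return cur.rowcount
```

This trades coverage for speed: a feature such as a mountain with zero population would no longer be found after pruning.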

API Call

Start the API with

python app.py

which serves at http://127.0.0.1:5000/

Query DB for location with http://127.0.0.1:5000/loc=`location`

e.g. http://127.0.0.1:5000/loc=Mount%20Kpa

[{"name":"Mount Kpa","clean_name":"mount kpa","lat":6.58333,"lon":-9.35,"country":"LR","pop":0,"elevation":322,"admin_name":"11","feature":"MT"}]

Query DB for location plus a country hint with http://127.0.0.1:5000/loc=`location`&country=`country`

Query DB for location plus a language hint with http://127.0.0.1:5000/loc=`location`&langs=`lang1,lang2...`
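A small client-side sketch for building these URLs is shown below. The helper `build_query_url` is hypothetical, and whether the country and language hints may be combined in one request is not stated here, so passing both is an untested assumption. The format of the language values is also not specified; the ones in the test of this sketch are placeholders.

```python
from urllib.parse import quote

BASE_URL = "http://127.0.0.1:5000"


def build_query_url(location, country=None, langs=None, base_url=BASE_URL):
    """Build a query URL in the loc=...&country=...&langs=... form."""
    # Percent-encode the location so spaces become %20, as in the example.
    url = "%s/loc=%s" % (base_url, quote(location, safe=""))
    if country:
        url += "&country=%s" % quote(country, safe="")
    if langs:
        # Language hints are joined with literal commas.
        url += "&langs=%s" % ",".join(quote(lang, safe="") for lang in langs)
    return url
```

With the server running, the resulting URL can be fetched with the Requests library, e.g. `requests.get(build_query_url("Mount Kpa")).json()`, which should yield a list of records like the one shown above.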

Query a large messy string, e.g. an entire document, with http://127.0.0.1:5000/raw/loc=`rawString`. Narrow the search to a single country with http://127.0.0.1:5000/raw/loc=`rawString`&country=`XX`

  • Uses NLTK stopwords
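How the raw endpoint tokenises its input is not described here. One plausible approach, sketched below under that caveat, is to split the text into words, discard stopwords and try the remaining word sequences as candidate place names. The small `STOPWORDS` set is a stand-in for the NLTK stopword list so that the sketch has no dependencies, and `candidate_names` is a hypothetical helper.

```python
import re

# Stand-in for the NLTK English stopword list (a handful of common words).
STOPWORDS = {"a", "an", "and", "the", "in", "of", "to", "was", "from", "at", "on"}


def candidate_names(raw, max_words=3):
    """Return lowercase word n-grams, longest first, containing no stopwords."""
    words = re.findall(r"\w+", raw.lower())
    seen = set()
    candidates = []
    for n in range(max_words, 0, -1):
        for i in range(len(words) - n + 1):
            gram = words[i:i + n]
            if any(word in STOPWORDS for word in gram):
                continue
            phrase = " ".join(gram)
            if phrase not in seen:
                seen.add(phrase)
                candidates.append(phrase)
    return candidates
```

Each candidate could then be looked up against `clean_name` in the places table, with longer matches preferred over single words.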

Error codes follow W3C guidelines; they still need to be updated to the Heroku spec

Gotchas

The following values sometimes appear in the admin level 1 column:

  • 00 or 0 = the entire country
  • Values that do not appear in the admin1 table mark places that are not a regular part of the country, e.g. the Tunb islands of UAE: feature code is ISL and admin code is 11
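These rules can be captured in a small classifier, sketched below. The function name `describe_admin_code` and its return labels are hypothetical, and it assumes admin1 codes are stored in the `CC.code` form seen in the admin1 table (e.g. AD.06).

```python
def describe_admin_code(country, admin_code, admin1_codes):
    """Classify an admin level 1 value.

    country: ISO2 country code of the place, e.g. "AE".
    admin_code: value from the admin level 1 column, e.g. "00" or "11".
    admin1_codes: set of known codes in "CC.code" form, e.g. {"AD.06"}.
    """
    if admin_code in ("00", "0"):
        return "whole_country"
    if "%s.%s" % (country, admin_code) in admin1_codes:
        return "regular_division"
    # Not listed in admin1: not a regular part of the country.
    return "irregular"
```

A caller might use the "irregular" label to skip the admin1 lookup instead of treating the missing row as an error.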

Dependencies

Non-core Dependencies

  1. Flask
  2. MySQLdb
  3. Requests

Todo

  1. Add in country names explicitly!
  2. Add in clues e.g. likely country, region, timezone or language
  3. Add in fuzzy matching e.g. Al Raqqah/Al Raqah
  4. Automatically query Google API and update DB
  5. Add in admin level 2 as well as level 1
  6. Add in Google reverse geocoding for placing lat/long coords
  7. Update error codes to Heroku spec
  8. Add sparse/verbose return option e.g. name and lat/lon

Contributors

alexrutherford
