miner's People

Contributors

alexanderjfink, waffle-iron

Forkers

ericpp jimthedev

miner's Issues

add geolocation data to maps

Need the ability to tag each data set with the geographic area it covers. This will allow future features such as downloading different data depending on location.
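
One way to picture this is a small sketch, assuming coverage is stored as a latitude/longitude bounding box. The `Coverage` class, the `maps_for_location` helper, the map names, and the box coordinates are all hypothetical and only roughly approximate the two states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Coverage:
    """Bounding box in decimal degrees (hypothetical schema)."""
    south: float
    west: float
    north: float
    east: float

    def contains(self, lat: float, lon: float) -> bool:
        # Simple containment test; does not handle boxes that cross
        # the antimeridian (180 degrees longitude).
        return self.south <= lat <= self.north and self.west <= lon <= self.east


def maps_for_location(maps, lat, lon):
    """Return the names of maps whose coverage includes the point."""
    return [name for name, cov in maps.items() if cov.contains(lat, lon)]


# Illustrative entries only; the boxes are rough approximations.
MAPS = {
    "us_census_minnesota": Coverage(43.5, -97.3, 49.4, -89.5),
    "us_census_texas": Coverage(25.8, -106.7, 36.5, -93.5),
}

nearby = maps_for_location(MAPS, 44.98, -93.27)
```

A bounding box is coarse, but it is cheap to store next to each map and good enough to decide which data sets to offer for a location.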

Fix SQL inserts to be speedier

Couldn't get "LOAD DATA LOCAL INFILE" to work in a cross-compatible and reliable way, so I moved to individual SQL inserts, which are a pain and very slow by comparison. I'd like a faster way to do this.
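
A possible middle ground is batching rows with `executemany` inside a single transaction. The sketch below uses the standard-library `sqlite3` module as a stand-in, since the database driver is not named here. The `hospitals` table, its columns, and the `bulk_insert` helper are hypothetical.

```python
import sqlite3
from itertools import islice


def chunked(rows, size):
    """Yield lists of at most `size` rows from any iterable."""
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def bulk_insert(conn, rows, batch_size=1000):
    """Insert rows in batches inside one transaction; return the row count."""
    total = 0
    with conn:  # commits once on success, rolls back on error
        for batch in chunked(rows, batch_size):
            conn.executemany(
                "INSERT INTO hospitals (name, state) VALUES (?, ?)", batch
            )
            total += len(batch)
    return total


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hospitals (name TEXT, state TEXT)")
rows = ((f"Hospital {i}", "MN") for i in range(2500))
inserted = bulk_insert(conn, rows)
```

Other DB-API drivers also provide `executemany`, though the placeholder style differs (MySQL drivers typically use `%s` rather than `?`), so the SQL string would need adjusting.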

Maps need an abstracted map model that does some of the behind the scenes work

This is relatively complicated.

Take these example data sets:

  • Enron Emails (all separate email files in folders)
  • Official Hospital data from Medicare (multiple CSVs, all in a single folder)
  • Form 990 data from CitizenAudit (CSV manifest and a separate set of files from 2012)
  • US Census data (separate files for each state; the SQL sets up the database and needs to be repaired before inserts can work)
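
A rough sketch of what such a model could look like, assuming the shared pipeline is the download, unpack, install, and clean up sequence mentioned in the plugin issue. `BaseMap`, `CensusMap`, and the logged messages are hypothetical names for illustration, not existing code.

```python
from abc import ABC, abstractmethod


class BaseMap(ABC):
    """Shared pipeline; each data set overrides only the steps it needs."""

    def __init__(self):
        self.log = []

    def run(self):
        """Run the steps in order; cleanup runs even if a step fails."""
        try:
            self.download()
            self.unpack()
            self.install()
        finally:
            self.cleanup()
        return self.log

    @abstractmethod
    def download(self):
        """Fetch the raw files for this data set."""

    def unpack(self):
        # Default: nothing to unpack (e.g. plain CSVs in one folder).
        self.log.append("unpack: nothing to do")

    @abstractmethod
    def install(self):
        """Load the data into the database."""

    def cleanup(self):
        self.log.append("cleanup: remove temporary files")


class CensusMap(BaseMap):
    """Hypothetical map for the US Census example."""

    def download(self):
        self.log.append("download: one file per state")

    def install(self):
        self.log.append("install: repair schema SQL, then insert")


steps = CensusMap().run()
```

The base class owns the order of operations, so a data set with an unusual layout only has to describe what is different about it.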

Build OS X installer script

Should install on different systems and pull in the necessary dependencies (mdbtools and so forth). Should also include a basic setup process for the database used.

Build a basic plugin architecture

A basic plugin architecture should cover at least the map processing tools: download, unpack, install, and clean up. A plugin could, for example, do something extra to every downloaded file, or replace how the files are downloaded. It would also need to be able to add a CLI option for itself. Maybe even the choice of database driver could be managed by plugins?

The first potential plugin is in Issue #14. I'd like to download data through the Tor network so that transfers are anonymous and encrypted. It would be an optional CLI flag that would vastly slow down downloading (it's going to take a while to download big data sets through Tor) but would make it secure and anonymous. For people downloading data sets that are in some way political, it might be best that they remain anonymous. Since this isn't part of the core mission of the project, I think it would be better to have it plug in but stay separate.

I may try to use this basic plugin architecture:
http://martyalchin.com/2008/jan/10/simple-plugin-framework/

Or this one: http://yapsy.sourceforge.net/

I found more information on how to build these (since I've never built a plugin architecture before) here: http://stackoverflow.com/questions/932069/building-a-minimal-plugin-architecture-in-python
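
For a sense of scale, here is a minimal sketch of a metaclass-based mount point, a common pattern for small plugin systems, written for Python 3. `DownloadPlugin`, `TorDownload`, the `--tor` flag, and `build_parser` are hypothetical, and the Tor plugin only returns a placeholder string rather than making a network request.

```python
import argparse


class PluginMount(type):
    """Metaclass that records every subclass of a mount point."""

    def __init__(cls, name, bases, attrs):
        super().__init__(name, bases, attrs)
        if not hasattr(cls, "plugins"):
            cls.plugins = []  # first class created: the mount point itself
        else:
            cls.plugins.append(cls)  # later classes: concrete plugins


class DownloadPlugin(metaclass=PluginMount):
    """Mount point for plugins that replace the download step."""
    flag = None
    help_text = ""

    def fetch(self, url):
        raise NotImplementedError


class TorDownload(DownloadPlugin):
    flag = "--tor"
    help_text = "download through Tor (slower, anonymous)"

    def fetch(self, url):
        # Placeholder only: a real plugin would route the request
        # through a Tor proxy instead of returning a string.
        return f"tor:{url}"


def build_parser():
    """Let every registered plugin contribute its own CLI flag."""
    parser = argparse.ArgumentParser(prog="miner")
    for plugin in DownloadPlugin.plugins:
        parser.add_argument(plugin.flag, action="store_true", help=plugin.help_text)
    return parser


args = build_parser().parse_args(["--tor"])
```

Defining a subclass is all it takes to register a plugin, so the core code never needs to import plugins by name; it only iterates over `DownloadPlugin.plugins`.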
