contratospr's Introduction

contratospr

Central repo for the ContratosPR projects.

Why two projects

We thought it was important to keep the project concerns separate.

Contratospr-api is focused on scraping, indexing, and exposing government data from the Puerto Rico comptroller's site. It was important for us not to tie changes in this project to changes that might occur in Contratospr-web, our frontend project.

Contratospr-web is focused on making our scraped data discoverable, searchable, and consumable by everybody, not just those who have specialized knowledge in tech or government.

It is our view that this will make both projects approachable to a wider set of people. It also simplifies maintenance and the deployment of new features or fixes.

You can see our projects here:

Code of conduct

All ContratosPR projects follow the Code for Puerto Rico Code of Conduct, which is an extension of the Code for America Code of Conduct. Please read these as they help us keep an inclusive and approachable community.

For any code of conduct issues, please send an email to [email protected]. If it helps, you can use one of our email templates: English / Español.

Contributing

If you are interested in contributing to these projects, please visit their repos and take a look at their CONTRIBUTING.md. These files will help you understand how the team works and how to submit your contributions.

Don't know what to do

If you have an issue, feedback, or question and you do not know in which repo to place it, feel free to add it here.

Financials

You can see all our financial data in the form of GitHub Issues.

We will publish these issues once a month at the end of the month to reflect correct consumption and financial data.

Talks by the Team

contratospr's Issues

Add financial data for February 2021

Financials for February 2021

Totals

Grand Total Paid out by Project Paid out by Sponsors
$ (77.79) $ (77.79) $ 0.00

Sponsorship Break Down

Sponsor Amount

Details

Date Description Amount Balance
February 2021 ContratosPR Web Heroku hosting $ (7.00) $ (7.00)
February 2021 ContratosPR API Heroku hosting $ (14.00) $ (21.00)
February 2021 ContratosPR Database Heroku hosting $ (50.00) $ (71.00)
February 2021 ContratosPR Web Preview App Hosting $ (2.39) $ (73.39)
February 2021 Tax $ (4.40) $ (77.79)
February 2021 Froi payment $ 77.79 $ (0.00)
$ 0.00
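The Balance column in the table above is a running sum of the Amount column. A minimal sketch of that arithmetic follows, using the February 2021 line items as input. The names `ITEMS`, `running_balance`, and `ROWS` are made up for this illustration and are not part of any ContratosPR code.

```python
from decimal import Decimal

# Line items copied from the February 2021 table (charges are negative).
ITEMS = [
    ("ContratosPR Web Heroku hosting", Decimal("-7.00")),
    ("ContratosPR API Heroku hosting", Decimal("-14.00")),
    ("ContratosPR Database Heroku hosting", Decimal("-50.00")),
    ("ContratosPR Web Preview App Hosting", Decimal("-2.39")),
    ("Tax", Decimal("-4.40")),
    ("Froi payment", Decimal("77.79")),
]


def running_balance(items):
    """Return (description, amount, balance) rows with a cumulative balance."""
    balance = Decimal("0.00")
    rows = []
    for description, amount in items:
        balance += amount
        rows.append((description, amount, balance))
    return rows


ROWS = running_balance(ITEMS)

if __name__ == "__main__":
    for description, amount, balance in ROWS:
        print(f"{description:<40} {amount:>8} {balance:>8}")
```

Decimal is used instead of float so that cent amounts add up exactly.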

Invoice

contratospr-heroku-bill-feb2021.pdf

Add financial data for April 2020

Financials for April 2020

Totals

Grand Total Paid out by Project Paid out by Sponsors
$ (75.26) $ (75.26) $ 0.00

Sponsorship Break Down

Sponsor Amount

Details

Date Description Amount Balance
April 2020 ContratosPR Web Heroku hosting $ (7.00) $ (7.00)
April 2020 ContratosPR API Heroku hosting $ (14.00) $ (21.00)
April 2020 ContratosPR Database Heroku hosting $ (50.00) $ (71.00)
April 2020 Tax $ (4.26) $ (75.26)
April 2020 Froi payment $ 75.26 $ (0.00)
$ 0.00

Add financial data for September 2020

Financials for September 2020

Totals

Grand Total Paid out by Project Paid out by Sponsors
$ (75.26) $ (75.26) $ 0.00

Sponsorship Break Down

Sponsor Amount

Details

Date Description Amount Balance
September 2020 ContratosPR Web Heroku hosting $ (7.00) $ (7.00)
September 2020 ContratosPR API Heroku hosting $ (14.00) $ (21.00)
September 2020 ContratosPR Database Heroku hosting $ (50.00) $ (71.00)
September 2020 Tax $ (4.26) $ (75.26)
September 2020 Froi payment $ 75.26 $ (0.00)
$ 0.00

Invoice

TBA

Add financial data for May 2020

Financials for May 2020

Totals

Grand Total Paid out by Project Paid out by Sponsors
$ (75.26) $ (75.26) $ 0.00

Sponsorship Break Down

Sponsor Amount

Details

Date Description Amount Balance
May 2020 ContratosPR Web Heroku hosting $ (7.00) $ (7.00)
May 2020 ContratosPR API Heroku hosting $ (14.00) $ (21.00)
May 2020 ContratosPR Database Heroku hosting $ (50.00) $ (71.00)
May 2020 Tax $ (4.26) $ (75.26)
May 2020 Froi payment $ 75.26 $ (0.00)
$ 0.00

Invoice

contratospr-heroku-invoice-may-2020.pdf

Add financial data for December 2020

Financials for December 2020

Totals

Grand Total Paid out by Project Paid out by Sponsors
$ (85.91) $ (85.91) $ 0.00

Sponsorship Break Down

Sponsor Amount

Details

Date Description Amount Balance
December 2020 ContratosPR Web Heroku hosting $ (7.00) $ (7.00)
December 2020 ContratosPR API Heroku hosting $ (14.00) $ (21.00)
December 2020 ContratosPR Database Heroku hosting $ (50.00) $ (71.00)
December 2020 ContratosPR API Data collection $ (0.23) $ (71.23)
December 2020 ContratosPR API Data collection $ (2.82) $ (74.05)
December 2020 ContratosPR Web Preview App Hosting $ (7.00) $ (81.05)
December 2020 Tax $ (4.86) $ (85.91)
December 2020 Froi payment $ 85.91 $ (0.00)
$ 0.00

Invoice

TBA

Add financial data for October 2020

Financials for October 2020

Totals

Grand Total Paid out by Project Paid out by Sponsors
$ (75.26) $ (75.26) $ 0.00

Sponsorship Break Down

Sponsor Amount

Details

Date Description Amount Balance
October 2020 ContratosPR Web Heroku hosting $ (7.00) $ (7.00)
October 2020 ContratosPR API Heroku hosting $ (14.00) $ (21.00)
October 2020 ContratosPR Database Heroku hosting $ (50.00) $ (71.00)
October 2020 Tax $ (4.26) $ (75.26)
October 2020 Froi payment $ 75.26 $ (0.00)
$ 0.00

Invoice

TBA

Add financial data for August 2020

Financials for August 2020

Totals

Grand Total Paid out by Project Paid out by Sponsors
$ (75.26) $ (75.26) $ 0.00

Sponsorship Break Down

Sponsor Amount

Details

Date Description Amount Balance
August 2020 ContratosPR Web Heroku hosting $ (7.00) $ (7.00)
August 2020 ContratosPR API Heroku hosting $ (14.00) $ (21.00)
August 2020 ContratosPR Database Heroku hosting $ (50.00) $ (71.00)
August 2020 Tax $ (4.26) $ (75.26)
August 2020 Froi payment $ 75.26 $ (0.00)
$ 0.00

Invoice

TBA

Project Status 2020-01-30

This is our first project status update. We hope to make more of these during the year.

A new home

We've joined the Code 4 Puerto Rico community! A very important part of our effort is community. What better community to be a part of than the one we are trying to help and impact?

We are happy to join this awesome community and hope we can help it grow.

Sunset of ContratosPR on Spectrum

At some point we will be closing out the space on Spectrum.chat in favor of our Slack channel in the Code 4 Puerto Rico Slack workspace. You can join here.

A new repo

We've added this repo, yes this one 😄, to serve as the central repository for ContratosPR.

Our goal is to create a place where issues, ideas, and community conversation can happen without being tied to either of our existing projects. We will also be posting our transparency efforts in this repo.

With this addition ContratosPR is now:

New Issue Templates

We've added some new issue templates to better engage with all of you. We hope you all find them helpful.

Talks

José and I (froi) are going to be speaking at PyCascades on Feb. 08, 2020 in Portland, OR!

Data

  • We've downloaded and extracted data from 200k+ contract documents (PDFs)
  • We couldn't extract data from about 3k of those documents because their content was stored as images.
    • We are working to get these processed using Tesseract

Tooling

We've added new tools to our data harvesting process: Poppler and Tesseract.

Poppler

  • pdftotext for text extraction
  • pdfinfo to get file metadata such as the number of pages
  • pdftoppm to convert PDF pages to images
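The three Poppler tools listed above can be driven from Python with `subprocess`. This is only a sketch of how that might look, not our actual harvesting code. The helper names, the 300 DPI default, and the PNG output format are assumptions made for illustration. The `run` helper needs Poppler installed; the other functions only build command lines or parse text.

```python
import re
import subprocess


def pdftotext_cmd(pdf_path):
    # "-" as the output file sends the extracted text to stdout.
    return ["pdftotext", pdf_path, "-"]


def pdfinfo_cmd(pdf_path):
    # pdfinfo prints "Key: value" metadata lines, including "Pages:".
    return ["pdfinfo", pdf_path]


def pdftoppm_cmd(pdf_path, output_prefix, dpi=300):
    # Writes one PNG per page, with file names starting with output_prefix.
    return ["pdftoppm", "-png", "-r", str(dpi), pdf_path, output_prefix]


def parse_page_count(pdfinfo_output):
    """Return the page count from pdfinfo's "Pages:" line, or None if absent."""
    match = re.search(r"^Pages:\s+(\d+)\s*$", pdfinfo_output, re.MULTILINE)
    return int(match.group(1)) if match else None


def run(cmd):
    """Run a command and return its stdout as text (requires Poppler)."""
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout
```

For example, `parse_page_count(run(pdfinfo_cmd("contract.pdf")))` would give the number of pages of a hypothetical `contract.pdf`.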

Tesseract

Tesseract is an open source OCR engine whose development was long sponsored by Google. We use it to process PDF pages that we convert into images, as well as pages whose content is already stored as images.
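One way to decide when a page needs OCR is to check whether pdftotext returned any meaningful text and, if not, hand the page image to Tesseract. The sketch below shows that idea only. The 20-character threshold, the function names, and the choice of the Spanish (`spa`) language pack are assumptions, not a description of how our pipeline is configured.

```python
def needs_ocr(extracted_text, min_chars=20):
    """Treat a page as image-only when text extraction returns almost nothing."""
    return len(extracted_text.strip()) < min_chars


def tesseract_cmd(image_path, language="spa"):
    # Passing "stdout" as the output base makes tesseract print the text
    # instead of writing a .txt file. "-l" selects the language pack.
    return ["tesseract", image_path, "stdout", "-l", language]
```

The resulting command list can be passed to `subprocess.run` on a machine where Tesseract and the chosen language data are installed.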

Data

We've harvested everything published since our last harvest, up to what was available in January 2020. A Django management command was developed to help us do atomic data harvests.

Our data grew a bit out of control for a moment and we had to move back to a Heroku hosted Postgres instance.

We've also started to play with our data a bit. We're thinking about what to do next with it and will definitely be asking for help in figuring out what we can do and what is needed.

Ideas up to now

  • Contractor grouping / normalization: since this data is supplied by a number of government entities, there are instances of the same contractor's name being spelled differently, and of distinct contractors with very similar names. We've been thinking about how to group these contractors to improve search and our UX.

  • We've gotten feedback that this data should be a part of a bigger knowledge graph. We agree! Although it's not in our near future, we have started to think about what this means and how to get to that goal.

  • Data experiments: we want to see what we can build to further demonstrate the uses for this data. We'd love to see what suggestions you all have.
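To make the contractor grouping idea more concrete, here is one possible starting point: build a comparison key for each name by removing accents, case, punctuation, and legal-form suffixes, then group names that share a key. This is a sketch under stated assumptions and not something we have implemented. The suffix list and the sample names in the usage note are made up, and a key-based approach like this will not catch misspellings.

```python
import re
import unicodedata
from collections import defaultdict

# Hypothetical list of legal-form suffixes to ignore when comparing names.
SUFFIXES = {"inc", "corp", "llc", "csp", "psc"}


def normalize_name(name):
    """Build a comparison key: no accents, case, punctuation, or suffixes."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    # Drop periods first so "C.S.P." becomes "csp" rather than "c s p".
    cleaned = without_accents.casefold().replace(".", "")
    words = re.sub(r"[^a-z0-9]+", " ", cleaned).split()
    return " ".join(word for word in words if word not in SUFFIXES)


def group_contractors(names):
    """Map each normalized key to the original spellings that produced it."""
    groups = defaultdict(list)
    for name in names:
        groups[normalize_name(name)].append(name)
    return dict(groups)
```

With the made-up inputs "Servicios Técnicos, Inc." and "SERVICIOS TECNICOS INC", both names end up under the key "servicios tecnicos".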


Well, I think this is long enough. Thanks for reading and hopefully we will have more updates soon. ❤️

Add financial data for November 2020

Financials for November 2020

Totals

Grand Total Paid out by Project Paid out by Sponsors
$ (75.26) $ (75.26) $ 0.00

Sponsorship Break Down

Sponsor Amount

Details

Date Description Amount Balance
November 2020 ContratosPR Web Heroku hosting $ (7.00) $ (7.00)
November 2020 ContratosPR API Heroku hosting $ (14.00) $ (21.00)
November 2020 ContratosPR Database Heroku hosting $ (50.00) $ (71.00)
November 2020 Tax $ (4.26) $ (75.26)
November 2020 Froi payment $ 75.26 $ (0.00)
$ 0.00

Invoice

TBA

Wanted: community projects

Here's the challenge / call to action. We would love for our community to play with the data we have collected and post the insights that come from these experiments.

The tools to use are up to you all. Please use whatever makes you productive.

We would love to showcase what our community has done. How are we going to showcase this? We're planning on doing a series of blog posts showing off the different submissions.

How can you submit your work? AKA what are the rules?

  1. All projects must use our API at https://api.contratospr.com
  2. All projects must be Open Source. Where to host it and what license to use is completely up to you.
  3. All projects must be objective and non-partisan.
  4. You must submit your idea in this repo by making a new issue with your submission.

What do we have

We now have some interesting data to play with!!!

These are our totals:

  • 1,246,989 contracts with their amendments
    • 976,833 contracts
    • 270,156 amendments
  • 201,003 contract documents

In January 2020, we downloaded:

  • 18,215 contracts
  • 3,468 amendments
  • 5,421 contract documents

Add financial data for July 2020

Financials for July 2020

Totals

Grand Total Paid out by Project Paid out by Sponsors
$ (75.26) $ (75.26) $ 0.00

Sponsorship Break Down

Sponsor Amount

Details

Date Description Amount Balance
July 2020 ContratosPR Web Heroku hosting $ (7.00) $ (7.00)
July 2020 ContratosPR API Heroku hosting $ (14.00) $ (21.00)
July 2020 ContratosPR Database Heroku hosting $ (50.00) $ (71.00)
July 2020 Tax $ (4.26) $ (75.26)
July 2020 Froi payment $ 75.26 $ (0.00)
$ 0.00

Invoice

TBA

Add financial data for January 2021

Financials for January 2021

Totals

Grand Total Paid out by Project Paid out by Sponsors
$ (82.68) $ (82.68) $ 0.00

Sponsorship Break Down

Sponsor Amount

Details

Date Description Amount Balance
January 2021 ContratosPR Web Heroku hosting $ (7.00) $ (7.00)
January 2021 ContratosPR API Heroku hosting $ (14.00) $ (21.00)
January 2021 ContratosPR Database Heroku hosting $ (50.00) $ (71.00)
January 2021 ContratosPR Web Preview App Hosting $ (7.00) $ (78.00)
January 2021 Tax $ (4.68) $ (82.68)
January 2021 Froi payment $ 82.68 $ (0.00)
$ 0.00

Invoice

TBA

PyCascades 2020

We need to update some of the slides for our talk on Feb. 8.

  • Update deck to mention that we are now a part of Code4PR
  • Update slide 94 with a collage of events that have happened since we started this project.
  • Update our deck to reflect what we have implemented from our What's Next slide.
    • Update our deck to reflect that we have posted our financial data as a GitHub Issue
    • Update our deck to reflect that we implemented a GitHub Action to automate the creation of the finance issue.

Create GitHub financial action

We need a GitHub Action that will create the month's financial status issue at the end of the month.

What should this do?

  • Create a new issue with the title ContratosPR financial state - [month], [year]
  • Apply labels financials, community
  • Assign jpadilla and froi to the issue.
  • Have the action run as a cron process at the end of the month.

@jpadilla am I missing something?
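The requirements above could be sketched in Python roughly as follows. This is an illustration, not the action we ended up building. Standard cron syntax has no portable way to say "last day of the month", so one common workaround is to schedule the job on days 28 to 31 and let the script decide whether to act. The payload keys (`title`, `labels`, `assignees`) are the ones GitHub's "create an issue" REST endpoint accepts; the function names are made up.

```python
import calendar
import datetime


def is_last_day_of_month(day):
    """True when `day` is the final calendar day of its month."""
    return day.day == calendar.monthrange(day.year, day.month)[1]


def build_issue_payload(day):
    """Request body for creating the month's financial status issue."""
    month_name = calendar.month_name[day.month]
    return {
        "title": f"ContratosPR financial state - {month_name}, {day.year}",
        "labels": ["financials", "community"],
        "assignees": ["jpadilla", "froi"],
    }


if __name__ == "__main__":
    today = datetime.date.today()
    if is_last_day_of_month(today):
        # A real action would POST this body to the repository's issues API.
        print(build_issue_payload(today))
```

Keeping the date check and the payload in plain functions makes them easy to test without calling the GitHub API.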

Add financial data for June 2020

Financials for June 2020

Totals

Grand Total Paid out by Project Paid out by Sponsors
$ (75.26) $ (75.26) $ 0.00

Sponsorship Break Down

Sponsor Amount

Details

Date Description Amount Balance
June 2020 ContratosPR Web Heroku hosting $ (7.00) $ (7.00)
June 2020 ContratosPR API Heroku hosting $ (14.00) $ (21.00)
June 2020 ContratosPR Database Heroku hosting $ (50.00) $ (71.00)
June 2020 Tax $ (4.26) $ (75.26)
June 2020 Froi payment $ 75.26 $ (0.00)
$ 0.00

Invoice

heroku-invoice-contratospr-june-2020.pdf
