MuckRock

MuckRock · Squarelet · DocumentCloud · DocumentCloud-Frontend

MuckRock is a non-profit collaborative news site that gives you the tools to keep our government transparent and accountable.

Prerequisites

MuckRock depends on Squarelet for user authentication. Because the services need to communicate directly, the development environment for MuckRock depends on the development environment for Squarelet: the MuckRock docker containers will join Squarelet's docker network. Please install Squarelet and set up its development environment first.

Install

Software required

  1. docker
  2. python
  3. invoke
  4. git

Installation Steps

  1. Check out the git repository - git clone --recurse-submodules git@github.com:MuckRock/muckrock.git
  2. Enter the directory - cd muckrock
  3. Run the dotenv initialization script - python initialize_dotenvs.py This will create files with the environment variables needed to run the development environment.
  4. Set an environment variable that directs docker-compose to use the local.yml file - export COMPOSE_FILE=local.yml
  5. Set up the JavaScript - run inv npm "install" and then inv npm "run build"
  6. Start the docker images - inv up This will build and start all of the docker images using docker-compose.
  7. Set dev.muckrock.com to point to localhost - echo "127.0.0.1 dev.muckrock.com" | sudo tee -a /etc/hosts
  8. Enter dev.muckrock.com into your browser - you should see the MuckRock home page.
  9. In .envs/.local/.django set the following environment variables:
     • SQUARELET_KEY to the value of Client ID from the Squarelet Client
     • SQUARELET_SECRET to the value of Client SECRET from the Squarelet Client
  10. You must restart the Docker Compose session (via the command docker-compose down followed by docker-compose up) each time you change a .django file for it to take effect.

You should now be able to log in to MuckRock using your Squarelet account.
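If dev.muckrock.com does not load in step 8, one quick thing to check is whether the hosts entry from step 7 is actually present. The following is a minimal sketch, not part of the MuckRock repository; the function name has_hosts_entry is hypothetical and it assumes the usual hosts file layout of an address followed by one or more names.

```python
def has_hosts_entry(hosts_text, hostname, ip="127.0.0.1"):
    """Return True if hosts_text maps hostname to ip (hypothetical helper)."""
    for line in hosts_text.splitlines():
        # Drop any trailing comment, then split into whitespace-separated fields.
        fields = line.split("#", 1)[0].split()
        # A hosts line is an address followed by one or more host names.
        if len(fields) >= 2 and fields[0] == ip and hostname in fields[1:]:
            return True
    return False


# Example with an inline string standing in for the contents of /etc/hosts.
sample_hosts = "127.0.0.1 localhost\n127.0.0.1 dev.muckrock.com\n"
print(has_hosts_entry(sample_hosts, "dev.muckrock.com"))
```

To check the real file, pass it the text read from /etc/hosts instead of the sample string.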

Docker info

The development environment is managed via docker and docker compose. Please read up on them if you are unfamiliar with them. The docker compose file is local.yml. If you would like to run docker compose commands directly, please run export COMPOSE_FILE=local.yml so you don't need to specify it in every command.

The containers which are run include the following:

  • Django: This is the Django application.

  • PostgreSQL: PostgreSQL is the relational database used to store the data for the Django application.

  • Redis: Redis is an in-memory datastore, used as a message broker for Celery as well as a cache backend for Django.

  • Celery Worker: Celery is a distributed task queue for Python, used to run background tasks from Django. The worker is responsible for running the tasks.

  • Celery Beat: The celery beat image is responsible for queueing up periodic celery tasks.

All systems can be brought up using inv up. You can rebuild all images using inv build. There are various other invoke commands for common tasks interacting with docker, which you can view in the tasks.py file.
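To illustrate how a wrapper command can spare you from passing the compose file every time, here is a minimal sketch of running docker-compose from Python with COMPOSE_FILE set. It is an assumption about the general approach, not the contents of the project's tasks.py; the helper names compose_env, compose_command and run_compose are hypothetical.

```python
import os
import subprocess


def compose_env(compose_file="local.yml", base_env=None):
    """Copy the environment, defaulting COMPOSE_FILE to the local compose file."""
    env = dict(os.environ if base_env is None else base_env)
    # Respect a COMPOSE_FILE the developer has already exported.
    env.setdefault("COMPOSE_FILE", compose_file)
    return env


def compose_command(*args):
    """Build the docker-compose argument list, e.g. compose_command("up", "-d")."""
    return ["docker-compose", *args]


def run_compose(*args):
    """Run docker-compose with the prepared environment (requires docker-compose)."""
    return subprocess.run(compose_command(*args), env=compose_env(), check=True)
```

Calling run_compose("up", "-d") would then start the containers in the background without an explicit -f local.yml argument.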

Networking Setup

The MuckRock development environment will join the docker network of Squarelet's environment, so that the services can communicate with each other. Please see the README file from Squarelet for more information.

Environment Variables

The application is configured with environment variables in order to make it easy to customize behavior in different environments (dev, testing, staging, production, etc). Some of these environment variables may contain sensitive information, such as passwords or API tokens for various services. For this reason, they are not to be checked in to version control. To assist with the setup of a new development environment, a script called initialize_dotenvs.py is provided, which will create the files in the expected places with the variables included. Those which require external accounts will generally be left blank, and you may sign up for an account to use for development and add your own credentials. You may also add extra configuration here as necessary for your setup.
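Since some variables are left blank by the initialization script, it can help to list which ones still need a value. The sketch below is illustrative only: the functions parse_dotenv and blank_variables are hypothetical, and it assumes the dotenv files contain simple NAME=value lines (real dotenv loaders also handle quoting and other syntax).

```python
def parse_dotenv(text):
    """Parse simple NAME=value lines into a dict, skipping comments and blanks."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may themselves contain "=".
        name, _, value = line.partition("=")
        values[name.strip()] = value.strip()
    return values


def blank_variables(values):
    """Return the names of variables that have no value yet, sorted by name."""
    return sorted(name for name, value in values.items() if not value)


# Example with an inline string standing in for .envs/.local/.django.
sample = "# Squarelet credentials\nSQUARELET_KEY=abc123\nSQUARELET_SECRET=\n"
print(blank_variables(parse_dotenv(sample)))
```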

Invoke info

Invoke is a task execution library. It is used to allow easy access to common commands used during development. You may look through the tasks.py file to see the commands being run. Some of the more important ones are described here.

Release

inv prod will merge your dev branch into master, and push to GitHub, which will trigger CodeShip to release it to Heroku, as long as all code checks pass. The production site is currently hosted at https://www.muckrock.com/. inv staging will push the staging branch to GitHub, which will trigger CodeShip to release it to Heroku, as long as all code checks pass. The staging site is currently hosted at https://muckrock-staging.herokuapp.com/.

Test

inv test will run the test suite. To reuse the database, pass it the -r=1 option. inv coverage will run the test suite and generate a coverage report at htmlcov/index.html.

The test suite will be run on CodeShip prior to releasing new code. Please ensure your code passes all tests before trying to release it. Also please add new tests if you develop new code.

Code Quality

inv pylint will run pylint. It is possible to silence checks, but should only be done in instances where pylint is misinterpreting the code. inv format will format the code using the yapf code formatter.

Both linting and formatting are checked on CodeShip. Please ensure your code is linted and formatted correctly before attempting to release changes.

Run

  • inv up will start all containers in the background.
  • inv runserver will run the Django server in the foreground. Be careful not to have multiple Django servers running at once. Running the server in the foreground is mainly useful when you would like to use an interactive debugger within your application code.
  • inv shell will run an interactive Python shell within the Django environment.
  • inv sh will run a bash shell within the Django docker container.
  • inv dbshell will run a PostgreSQL shell.
  • inv manage will allow you to easily run Django manage.py commands, for example inv manage migrate to migrate the database.
  • inv npm will allow you to run NPM commands. inv npm "run build" should be run to rebuild assets if any JavaScript or CSS is changed. If you will be editing a lot of JavaScript or CSS, you can run inv npm "run watch".
  • inv heroku will open a Python shell on Heroku.

Pip Tools

Python dependencies are managed via pip-tools. This allows us to keep all of the Python dependencies (including underlying dependencies) pinned, to allow for consistent execution across development and production environments.

The corresponding files are kept in the pip folder. There are requirements and dev-requirements files. requirements will be installed in all environments, while dev-requirements will only be installed for local development environments. dev-requirements can be used for packages only needed during development, such as testing tools. For each environment there is an .in file and a .txt file. The .in file is the input file - you list your direct dependencies here. You may specify version constraints here, but do not have to.

Running inv pip-compile will compile the .in files to the corresponding .txt files. This will pin all of the dependencies, and their dependencies, to the latest versions that meet any constraints that have been put on them. You should run this command if you add any new dependencies to an .in file. Please keep the .in files sorted. After running inv pip-compile, you will need to run inv build to rebuild the docker images with the new dependencies included.
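As a small aid for keeping the .in files sorted, the following sketch checks whether the dependency lines of an .in file are in order. It is not a tool shipped with the repository; the function names are hypothetical, and it assumes that "sorted" means a case-insensitive ordering of the non-comment lines.

```python
def requirement_lines(text):
    """Return the non-blank, non-comment lines of a requirements .in file."""
    lines = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line and not line.startswith("#"):
            lines.append(line)
    return lines


def is_sorted(text):
    """Return True if the dependency lines are in case-insensitive order."""
    names = [line.lower() for line in requirement_lines(text)]
    return names == sorted(names)


# Example with an inline string standing in for an .in file.
sample = "celery\nDjango<3\n# HTTP client\nrequests\n"
print(is_sorted(sample))
```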

FOIAMachine

FOIAMachine is our free FOIA filing tool that allows you to track your requests while requiring you to manually handle all of the message sending and receiving. It is run off of the same code base as MuckRock. To access it, set dev.foiamachine.org to point to localhost - echo "127.0.0.1 dev.foiamachine.org" | sudo tee -a /etc/hosts. Then pointing your browser to dev.foiamachine.org will take you to FOIAMachine - the correct page is shown depending on the domain host.
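Serving two sites from one code base generally means choosing what to show based on the Host header of the request. The sketch below illustrates that idea only; it is not MuckRock's actual routing code, and the function site_for_host and its return values are hypothetical.

```python
def site_for_host(host):
    """Pick a site name from a Host header value (hypothetical helper)."""
    # Drop an optional port (e.g. "dev.muckrock.com:80") and normalize case.
    hostname = host.split(":", 1)[0].lower()
    if hostname == "foiamachine.org" or hostname.endswith(".foiamachine.org"):
        return "foiamachine"
    # Every other host falls back to the main MuckRock site.
    return "muckrock"


print(site_for_host("dev.foiamachine.org"))
print(site_for_host("dev.muckrock.com"))
```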

Update search index

MuckRock uses watson for search. The index should stay updated. If a new model is registered with watson, then build the index (fab manage:buildwatson). This command should be run on any staging or production servers when pushing code that updates the registration.
