orcli (💎+🤖)

Bash script to control OpenRefine via its HTTP API.

Features

  • works with latest OpenRefine version (currently 3.7)
  • run batch processes (import, transform, export)
    • orcli takes care of starting and stopping OpenRefine with temporary workspaces
    • allows execution of arbitrary bash scripts
    • interactive mode for playing around and debugging
    • your existing OpenRefine data will not be touched
  • import CSV, TSV, JSON, JSONL, line-based TXT, fixed-width TXT or XML
    • supports stdin, multiple files and URLs
  • transform data by providing an undo/redo JSON file
    • orcli calls specific endpoints for each operation to provide improved error handling and logging
    • supports stdin, multiple files and URLs
  • export to CSV, TSV, JSONL, HTML, XLS, XLSX, ODS
  • templating export to additional formats like JSON or XML

Requirements

Install

  1. Navigate to the OpenRefine program directory

  2. Download the Bash script there and make it executable

wget https://github.com/opencultureconsulting/orcli/raw/main/orcli
chmod +x orcli

Optional:

  • Create a symlink in a directory on your $PATH (e.g. ~/.local/bin)

    ln -s "${PWD}/orcli" ~/.local/bin/
  • Install Bash tab completion

    • temporarily (current shell only)

      source <(orcli completions)
    • permanently

      mkdir -p ~/.bashrc.d
      orcli completions > ~/.bashrc.d/orcli
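The permanent variant only takes effect if your shell actually sources the files in ~/.bashrc.d. Some distributions do this by default, others do not. The following is a minimal sketch for the second case, meant to be appended to ~/.bashrc; the helper name `load_bashrc_d` is hypothetical and not part of orcli.

```shell
# Hypothetical helper (not part of orcli): source every readable regular file
# in a directory, defaulting to ~/.bashrc.d.
load_bashrc_d() {
  local dir="${1:-${HOME}/.bashrc.d}" rc
  # Nothing to do if the directory does not exist
  [ -d "${dir}" ] || return 0
  for rc in "${dir}"/*; do
    if [ -f "${rc}" ] && [ -r "${rc}" ]; then
      . "${rc}"
    fi
  done
}

# Load ~/.bashrc.d (including the orcli completion file) in every new shell
load_bashrc_d
```

If your ~/.bashrc already contains a similar loop, this snippet is not needed.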

Getting Started

  1. Launch an interactive playground
./orcli run --interactive
  2. Create OpenRefine project duplicates from comma-separated-values (CSV) file
orcli import csv "https://git.io/fj5hF" --projectName "duplicates"
  3. Remove duplicates by applying an undo/redo JSON file
orcli transform "duplicates" "https://git.io/fj5ju"
  4. Export data from OpenRefine project to tab-separated-values (TSV) file duplicates.tsv
orcli export tsv "duplicates" --output "duplicates.tsv"
  5. Write out your session history to file example.sh (and delete the last line to remove the history command)
history -a "example.sh"
sed -i '$ d' example.sh
  6. Exit playground
exit
  7. Run whole process again
./orcli run example.sh
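For orientation, the batch file can also be written by hand instead of being captured from the session history. The sketch below assumes the playground session contained only the import, transform and export commands shown above, so example.sh ends up with exactly those three lines.

```shell
# Write example.sh by hand; the quoted 'EOF' keeps the lines literal.
# Assumption: this matches what the history export leaves behind when
# no other commands were typed in the playground.
cat > example.sh <<'EOF'
orcli import csv "https://git.io/fj5hF" --projectName "duplicates"
orcli transform "duplicates" "https://git.io/fj5ju"
orcli export tsv "duplicates" --output "duplicates.tsv"
EOF
```

The resulting file can then be passed to `./orcli run example.sh` as in the last step.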

Usage

  • Use the help screens to see the available options and examples for each command.

    orcli --help
  • If OpenRefine is running on a different host or port, set the environment variable OPENREFINE_URL.

    OPENREFINE_URL="http://localhost:3333" orcli list
  • If OpenRefine does not have enough memory to process the data, it becomes slow and may even crash. Check the message after the run command finishes to see how much memory was used and adjust the memory allocated to OpenRefine accordingly with the --memory flag (default: 2048M).
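As a rough illustration of the memory advice, a batch run with a larger allocation might look like the sketch below. The value 4096M is only an example (double the stated default), and the position of the flag before the script name is an assumption; check `orcli run --help` for the exact syntax.

```shell
# Illustrative value (assumption): double the 2048M default
memory="4096M"

# Re-run the batch script with the larger allocation,
# provided orcli was installed into the current directory
if [ -x ./orcli ]; then
  ./orcli run --memory "${memory}" example.sh
fi
```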

Development

orcli uses bashly to generate the single-file script from the files in the src directory.

  1. Install bashly (requires ruby)
gem install bashly
  2. Edit code in src directory

  3. Generate script

bashly generate --upgrade
  4. Run tests
./orcli test
  5. Generate help files
./help.sh
