SAPIENTA for Python

This package contains the utilities and tools needed to run a SAPIENTA annotation server in Python. It currently relies on the Perl version of SAPIENTA; a Python version of the annotation system is intended to follow shortly.

Requirements

This package runs on Python 2.7 and requires the following Python libraries:

*  Flask (version 0.9 or newer)
*  pycurl (version 7.19 or newer)
*  text-sentence (version 0.14 or newer)

These packages and their dependencies can be automatically installed as part of the installation process illustrated below.

Installation

Please read the INSTALL file for notes on how to compile and install SAPIENTA for your system.

Configuration

SAPIENTA can be used as a web server, annotation worker or just as a commandline tool for annotating papers. Regardless of how you intend to use it, you need to set a few values in the configuration file.

By default, SAPIENTA looks for a file named sapienta.cfg in the current working directory, then in the user config directory (~/.config/sapienta.cfg) and finally at the system-wide location (/etc/sapienta.cfg). It uses the first of these files that it finds and stops looking.
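The lookup order described above can be sketched in a few lines of Python. This is only an illustration of the search order, not SAPIENTA's actual loader: the function names candidate_config_paths and find_config are hypothetical and do not come from the package.

```python
import os


def candidate_config_paths(cwd=None, home=None):
    """Return the config locations in the order described above.

    Hypothetical helper for illustration; not part of SAPIENTA.
    """
    cwd = cwd if cwd is not None else os.getcwd()
    home = home if home is not None else os.path.expanduser("~")
    return [
        os.path.join(cwd, "sapienta.cfg"),              # current working directory
        os.path.join(home, ".config", "sapienta.cfg"),  # user config directory
        "/etc/sapienta.cfg",                            # system-wide file
    ]


def find_config(paths):
    """Return the first path that exists as a file, or None if none do."""
    for path in paths:
        if os.path.isfile(path):
            return path
    return None
```

Calling find_config(candidate_config_paths()) shows which file would win on a given machine under this assumed lookup, which can help when a setting appears to be ignored because a file earlier in the search order shadows it.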

You should use the sample config file for reference when setting up your SAPIENTA installation:

~/SAPIENTA $ cp sapienta.cfg.sample ~/.config/sapienta.cfg
$ [vim or gedit or kate] ~/.config/sapienta.cfg

For a more in-depth look at configuring SAPIENTA, read the configuration guide or the server configuration guide.

Usage

pdfxconv - commandline paper annotation

The most common use of SAPIENTA is expected to be commandline processing of papers. Once you've installed SAPIENTA, you should have access to the commandline application pdfxconv. This is a 'swiss-army-knife' program that provides PDF conversion, sentence splitting and annotation of papers. It also supports batch processing. Depending on your configuration, pdfxconv can either act as a thin client that offloads work to a remote SAPIENTA instance or do the processing locally.

Typical usage of pdfxconv to convert a PDF to an annotated PubMed DTD paper might look like the following:

$ pdfxconv -a myarticle.pdf

You can also batch process a set of files:

$ pdfxconv -a *.pdf *.xml
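If you want to drive batch runs from a script rather than the shell, a minimal sketch along these lines may help. It assumes only what is shown above, namely that pdfxconv is on your PATH and accepts -a followed by a list of files. The helper names build_pdfxconv_command and annotate are hypothetical and not part of SAPIENTA.

```python
import glob
import subprocess


def build_pdfxconv_command(patterns):
    """Expand glob patterns and build the pdfxconv argument list.

    Hypothetical helper; only the '-a' flag is taken from the usage above.
    """
    files = []
    for pattern in patterns:
        # Sort so the processing order is predictable between runs
        files.extend(sorted(glob.glob(pattern)))
    return ["pdfxconv", "-a"] + files


def annotate(patterns):
    """Run pdfxconv on every matching file and return its exit code.

    Returns None without running anything if no files matched.
    """
    cmd = build_pdfxconv_command(patterns)
    if len(cmd) == 2:
        return None
    return subprocess.call(cmd)
```

For example, annotate(["*.pdf", "*.xml"]) mirrors the shell command above. Expanding the patterns in Python avoids relying on shell globbing, which matters if the script is launched from a context that does not expand wildcards.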

Web server and computation node

You can read about how to use SAPIENTA as a web service here

Contributors

ravenscroftj, xuperx, amandaclare