Sisyphe

Sisyphe is a simple Node.js application that recursively analyses folders. The project is a git monorepo managed with lerna.

It can provide some information about the files it analyses; check here for more information.

Requirements

Tested with [email protected], [email protected]

Works on Linux/OSX/Windows

Example of running a quick local Redis instance (thanks to Docker):

docker run --name sisyphe-redis -p 6379:6379 redis:3.2.6

Install it

  1. Download the latest Sisyphe version
  2. Run npm install (this will also execute an npm postinstall script)
  3. ... that's it.

Test

npm run test will test sisyphe & its workers

Help

./app.js --help will output the help

Options

-V, --version               output the version number
-n, --corpusname <name>     Corpus name (session name)
-s, --select <name>         Choose modules for the analysis
-c, --config-dir <path>     Configuration folder path
-t, --thread <number>       The number of processes sisyphe will use
-b, --bundle <number>       Group jobs into bundles of jobs
-r, --remove-module <name>  Remove module name from the workflow
-q, --quiet                 Silence output
-l, --list                  List all available workers
-h, --help                  output usage information
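
When Sisyphe is launched from another script, the options above can be assembled into an argument list. The following is only a minimal sketch: `buildSisypheArgs` and the keys of its options object (`corpusname`, `configDir`, `thread`, `bundle`, `quiet`) are hypothetical names chosen for illustration and are not part of Sisyphe itself. Only flags from the help output above are used.

```javascript
// Hypothetical helper (not part of Sisyphe): turns an options object
// into an argument list for "node app ...", using only documented flags.
function buildSisypheArgs(options, targetDir) {
  const args = ['app'];
  if (options.corpusname) args.push('-n', String(options.corpusname));
  if (options.configDir) args.push('-c', String(options.configDir));
  if (options.thread) args.push('-t', String(options.thread));
  if (options.bundle) args.push('-b', String(options.bundle));
  if (options.quiet) args.push('-q');
  // The folder to analyse always comes last.
  args.push(targetDir);
  return args;
}

const exampleArgs = buildSisypheArgs(
  { corpusname: 'sessionName', thread: 4, quiet: true },
  '/path/to/corpus'
);
console.log(['node', ...exampleArgs].join(' '));
// prints "node app -n sessionName -t 4 -q /path/to/corpus"
```

The resulting array could then be handed to Node's `child_process.spawn('node', exampleArgs)` from within the sisyphe folder.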

How does it work?

Just start Sisyphe on a folder with any files in it.

node app -n sessionName ~/Documents/customfolder/corpus

node app -n sessionName -c ~/Documents/customfolder/folderResources ~/Documents/customfolder/session

Sisyphe is now working in the background using all your computer's threads. Just grab a coffee and wait; it will notify you when it's done :)

The results are written to sisyphe/out/{timestamp}-corpusname/ (errors, info, duration...)
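
To locate the most recent results for a given corpus, the folder names under sisyphe/out/ can be compared. This is a rough sketch that assumes the {timestamp} part is purely numeric (for example epoch milliseconds); the exact timestamp format is not stated here, so check it against your own output folder. `latestSessionDir` is a hypothetical helper name.

```javascript
// Hypothetical helper: picks the most recent output folder for a corpus.
// Assumption: folder names look like "<numeric timestamp>-<corpusname>".
function latestSessionDir(dirNames, corpusName) {
  const suffix = '-' + corpusName;
  let best = null;
  let bestStamp = -Infinity;
  for (const name of dirNames) {
    if (!name.endsWith(suffix)) continue;
    const stampText = name.slice(0, name.length - suffix.length);
    // Skip folders that do not match the assumed numeric pattern.
    if (!/^\d+$/.test(stampText)) continue;
    const stamp = Number(stampText);
    if (stamp > bestStamp) {
      bestStamp = stamp;
      best = name;
    }
  }
  return best; // null when no folder matches
}

const exampleNames = [
  '1500000000000-sessionName',
  '1500000100000-sessionName',
  '1500000200000-other',
];
console.log(latestSessionDir(exampleNames, 'sessionName'));
// prints "1500000100000-sessionName"
```

In practice the list of names could come from `fs.readdirSync('sisyphe/out')`.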

Interface

For a control panel and a fully integrated app, go to Sisyphe-monitor. Sisyphe also has a server that lets you control it and obtain more information about its execution. Simply run the server with npm run server to access these features.

Modules

These are the default modules (focused on xml & pdf).

  • FILETYPE Detects the mimetype, the extension, corrupted files...
  • PDF Gets info from PDF files (version, author, meta...)
  • XML Checks whether files are well-formed and valid against their DTDs, and gets elements from tags...
  • LANG Detects the language of files (xml/text files...)
  • XPATH Generates a complete list of xpaths from the submitted folder
  • OUT Exports data to a json file and an ElasticSearch database
  • NB Tries to assign categories to an XML document by using its abstract
  • MULTICAT Tries to assign categories to an XML document by using its identifiers
  • TEEFT Tries to extract keywords from a fulltext
  • SKEEFT Tries to extract keywords from a structured fulltext by using the teeft algorithm and the text structure

Development on workers

When you work on a worker, just:

  • Commit your changes as usual
  • Run npm run updated (to check which workers have changed)
  • Run npm run publish (it will ask you to change the version of the worker module and publish it to GitHub)

Module information

Some bugs may occur with certain files when using the 'skeeft' module on Windows; please deactivate it until we fix this.
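
One way to apply this workaround automatically is to add the -r option only when running on Windows. The sketch below assumes that the module name accepted by -r is 'skeeft', spelled as in the note above; `platformArgs` is a hypothetical helper name, not something Sisyphe provides.

```javascript
// Hypothetical helper: extra CLI arguments depending on the platform.
// Assumption: "-r skeeft" removes the skeeft module from the workflow.
function platformArgs(platform) {
  // Node reports Windows as 'win32', including on 64-bit systems.
  return platform === 'win32' ? ['-r', 'skeeft'] : [];
}

console.log(platformArgs(process.platform));
```

The returned array can be appended to the other arguments before the target folder, for example node app -n sessionName -r skeeft followed by the corpus path.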

Contributors

clabroche, kerphi, matthd, rmeja
