CATE

CATE helps you discover, organize, and publish taxonomic information

Features

Installation

CATE is distributed as an rpm, hosted in a yum repository on bintray.

$ wget https://bintray.com/benjaminrclark/rpm/rpm -O /etc/yum.repos.d/bintray-benjaminrclark-rpm.repo

$ yum install cate

Starting CATE

Once CATE is installed it can be started without any further configuration, running by default in 'embedded' mode.

This means that it will use an in-memory database, a local Solr server, and the local filesystem. To configure CATE to use other services and/or persist data to a specific location, see Configuration, below.

As a service

$ chkconfig --level 345 cate on

$ service cate start

As a java process

$ java -jar /var/lib/cate/cate.jar

Configuration

CATE requires Java Development Kit 1.7 (either Oracle JDK or OpenJDK). CATE uses FFMPEG and ImageMagick to process multimedia files, and these packages must be installed locally on the server. They are declared as dependencies of the package and are verified / installed automatically if you install the rpm.

CATE as a system depends upon a number of other services. The location and configuration of these services is relatively flexible. By default, CATE will run in embedded mode, meaning that no other external services are required.

CATE follows the approach used by Spring Boot in externalizing its configuration parameters. Many of the configuration properties are generic properties defined by Spring Boot. Not all of them are listed below; the rest can be found in the Spring Boot documentation. Others are specific to CATE.

In addition to specifying configuration properties, you can also enable optional services through the use of spring profiles. To run CATE as a single application server, the default profile is sufficient. To run multiple CATE application servers behind a load balancer (e.g. nginx / apache / elb), include the 'cluster' profile e.g.

  • spring.profiles.active=default,cluster
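
As a rough sketch of how a profile might be supplied at start-up: Spring Boot treats `--name=value` command-line arguments as configuration properties, so the profile list can be appended to the java command. This assumes CATE leaves that Spring Boot behaviour enabled; the jar path is the one given under "As a java process", and the `CATE_JAR`, `CATE_PROFILES` and `CATE_CMD` variable names are made up for illustration.

```shell
#!/bin/sh
# Assumption: jar location as shown in the "As a java process" section.
CATE_JAR=/var/lib/cate/cate.jar
# Profiles to activate; 'cluster' is added for a load-balanced deployment.
CATE_PROFILES=default,cluster

# Spring Boot reads --name=value arguments as configuration properties.
CATE_CMD="java -jar $CATE_JAR --spring.profiles.active=$CATE_PROFILES"

# Print the command for review rather than launching the server here.
echo "$CATE_CMD"
```

Once the printed command looks right, it can be run directly or via `sh -c "$CATE_CMD"`.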

To run CATE on [Amazon Web Services](http://aws.amazon.com), the 'aws' profile should be enabled.

  • spring.profiles.active=aws,cluster

The aws profile uses the standard properties for the database, solr and redis, but is able to make use of Amazon Relational Database Service (RDS) and Amazon ElastiCache in place of non-Amazon services. In addition, the aws profile uses Amazon Simple Storage Service (S3) in place of a shared filesystem, and Amazon Simple Notification Service (SNS) / Simple Queue Service (SQS) in place of activemq. The parameters required to configure these components are listed under the heading AWS.

A sample Amazon Cloudformation template for a CATE cluster can be found in src/main/resources/cfn/cate.cnf.

Database

CATE uses a relational database as the canonical data store. Currently it is able to make use of H2 or MySQL. The properties used to configure it are standard spring-boot properties.

  • spring.datasource.url=jdbc:mysql://localhost:3306/cate
  • spring.datasource.driver-class-name=com.mysql.jdbc.Driver
  • spring.datasource.username=root
  • spring.datasource.password=
  • liquibase.contexts=mysql
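
One way these properties might be supplied is through an `application.properties` file, which Spring Boot looks for in the working directory by default. The sketch below writes the values listed above into such a file. It assumes CATE keeps Spring Boot's default configuration file locations; the `CATE_CONF_DIR` variable and the temporary directory are placeholders, not paths that CATE prescribes.

```shell
#!/bin/sh
set -e
# Hypothetical config directory; a throwaway temp dir keeps the sketch side-effect free.
CATE_CONF_DIR=$(mktemp -d)

# The quoted 'EOF' stops the shell from expanding anything inside the file body.
cat > "$CATE_CONF_DIR/application.properties" <<'EOF'
spring.datasource.url=jdbc:mysql://localhost:3306/cate
spring.datasource.driver-class-name=com.mysql.jdbc.Driver
spring.datasource.username=root
spring.datasource.password=
liquibase.contexts=mysql
EOF

# The file holds database credentials, so restrict it to its owner.
chmod 600 "$CATE_CONF_DIR/application.properties"
echo "Wrote $CATE_CONF_DIR/application.properties"
# To try it: cd "$CATE_CONF_DIR" && java -jar /var/lib/cate/cate.jar
```

A real deployment would use a dedicated database user and a non-empty password rather than the root account shown in the listing.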

Solr

CATE uses solr to power the free-text search and faceting.

  • solr.server.url=http://localhost:8983/solr
  • solr.server.class=org.apache.solr.client.solrj.impl.HttpSolrServer
  • solr.connection.timeout=100
  • solr.so.timeout=3000

Redis

CATE uses a redis database to store http session data when running in clustered mode. N.B. When running on AWS, CATE will attempt to discover the port and address of an ElastiCache cluster running in the same account.

  • spring.redis.database=0
  • spring.redis.host=localhost
  • spring.redis.port=6379
  • spring.redis.password=
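
Where a properties file is inconvenient (for example on a container or a templated VM image), the same settings can usually be passed as environment variables, since Spring Boot maps upper-case, underscore-separated names onto dotted property names. The sketch below assumes CATE inherits that binding unchanged; the host name `redis.internal.example` is purely illustrative.

```shell
#!/bin/sh
# Assumption: CATE inherits Spring Boot's environment-variable binding,
# where SPRING_REDIS_HOST stands in for spring.redis.host.
export SPRING_PROFILES_ACTIVE=default,cluster
export SPRING_REDIS_DATABASE=0
# Hypothetical host name for a redis server shared by all CATE nodes.
export SPRING_REDIS_HOST=redis.internal.example

# List what will be visible to the java process started from this shell.
env | grep '^SPRING_' | sort
# Then start CATE from the same shell: java -jar /var/lib/cate/cate.jar
```

Every node in the cluster needs to point at the same redis instance, otherwise a session created on one application server will not be visible on the others.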

Filesystem / Object Store

CATE stores files in a shared filesystem mounted on the application server, or in S3 when running on AWS.

Local Filesystem / Network-Attached Shared Filesystem

  • upload.file.directory=/mnt/cate/upload
  • static.file.directory=/mnt/cate/static

AWS S3

  • cloudformation.uploadBucketArn

Messaging

CATE uses messaging to distribute tenant events across nodes in the cluster and to queue batch jobs. It uses activemq, or SNS / SQS when running on AWS.

ActiveMQ

  • spring.activemq.broker-url=tcp://localhost:61616
  • spring.activemq.in-memory=false
  • spring.activemq.user=
  • spring.activemq.password=

AWS SNS & SQS

  • cloudformation.topicArn
  • cloudformation.queueArn

Email

Design

CATE is a web application which is designed to work at scale, deployed on virtual servers, and supporting many users and tenant projects.

In terms of scalability, it's worth noting that the CATE application itself, and the application server it runs on, are not stateful. State is held in the following services:

  • Data: Data is held in the relational database, with a denormalized copy held in solr.
  • Media: Media files are held in the object store (either NAS or S3) and are fetched to the application server as required. They are served to clients directly from the store
  • Session: CATE stores session state in a redis key-value store.

Events (job requests and tenant events) are distributed using a message broker. Tenant events are distributed to all instances using a topic. Job requests are distributed across application servers using a single queue which is polled by all servers.
