datasetapi-back's Introduction

Dataset API Server

About

  • Receives and sends datasets as files and json
  • Uses MongoDB to store datasets
  • .csv, .json and .arff file formats are supported. Once a file's contents have been saved, the dataset can be downloaded in any of these formats.
  • The 'testDatasets' folder contains one file for each supported file format. These files can be used for testing.
  • Some features (such as dataset uploading) require an authorization token that can be received by logging in with an account.
  • An account can be created by sending a POST-request with credentials to the '/users' endpoint.
  • Max upload size at a time: 100 KB
  • Unit tests are available for the '/datasets' endpoint

Setup

  1. Create a MongoDB database
  2. Clone this repository
  3. Run npm install inside the project folder
  4. Set environment variables (More on this in the Configuration section)

To run this project in production mode, run npm start. For development, run npm run watch. To run the tests, run npm test.

Configuration

To use this server, set the following environment variables:

  • PORT: Port on which this server runs. The matching frontend expects this to be set to 8000.
  • MONGODB_URI: MongoDB address
  • SECRET: String that is used for user authentication
  • (TEST_MONGODB_URI: MongoDB address that is used in the test environment)

One way to set up the environment variables is to use the dotenv module (included in package.json).
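
As a rough illustration, a .env file read by dotenv might look like the sketch below. Only the variable names and the port value 8000 come from this document. The MongoDB addresses, database names and the secret are placeholders and should be replaced with your own values.

```
# Placeholder values, not the project's real settings
PORT=8000
MONGODB_URI=mongodb://localhost:27017/datasets
SECRET=replace-with-a-long-random-string
TEST_MONGODB_URI=mongodb://localhost:27017/datasets-test
```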

Supported HTTP-Requests

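| Method | /users | /users/:id | /login | /datasets | /datasets/:id | /datasets/:id/:format |
|--------|--------|------------|--------|-----------|---------------|-----------------------|
| GET    | NO     | NO         | NO     | YES       | YES           | YES                   |
| POST   | YES    | NO         | YES    | YES       | NO            | NO                    |
| DELETE | NO     | YES        | NO     | NO        | YES           | NO                    |
| PUT    | NO     | YES        | NO     | NO        | NO            | NO                    |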

Requests that require an authentication token

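| Method | /users | /users/:id | /login | /datasets | /datasets/:id | /datasets/:id/:format |
|--------|--------|------------|--------|-----------|---------------|-----------------------|
| GET    | -      | -          | -      | NO        | NO            | NO                    |
| POST   | NO     | -          | NO     | YES       | -             | -                     |
| DELETE | -      | YES        | -      | -         | YES           | -                     |
| PUT    | -      | YES        | -      | -         | -             | -                     |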

The authentication token needs to be sent as the 'Authorization' header value with the prefix 'bearer '. Example: Header key: 'Authorization', value: 'bearer someValidToken'.
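
A client can build this header with a small helper. The sketch below is written in Python for illustration only; the function name auth_header is hypothetical and not part of this project.

```python
def auth_header(token):
    """Return the header this server expects on authenticated requests."""
    if not token:
        raise ValueError("token must be a non-empty string")
    # The README specifies the prefix 'bearer ' (lowercase, trailing space) before the token.
    return {"Authorization": f"bearer {token}"}


print(auth_header("someValidToken"))
```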

HTTP-Request Descriptions

/users

POST

Creates a new account with the given credentials. The credentials should be passed in the request body. Account creation fails if the username is too short (3 letters) or if it is not unique. Successful requests receive a response that contains the username and id.

Example body:

{
 "username": "unique username",
 "password": "not an easy to guess pw"
}
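
A client-side sketch of this request, using only the Python standard library. The base URL (a local server on port 8000), the Content-Type header and the function name are assumptions made for illustration; the request is built but not sent.

```python
import json
import urllib.request

# Assumption: the server runs locally on the port the matching frontend expects.
BASE_URL = "http://localhost:8000"


def build_create_user_request(username, password, base_url=BASE_URL):
    """Build (but do not send) a POST /users request with the credentials as JSON."""
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/users",
        data=body,
        # Assumption: the server parses the body based on a JSON content type.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_create_user_request("unique username", "not an easy to guess pw")
# To send it against a running server:
#     with urllib.request.urlopen(request) as response:
#         created = json.load(response)  # per the README: contains the username and id
```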

/users/:id

DELETE

Deletes the account whose id matches the one specified in the URL parameter (:id). All datasets that the account has posted are deleted as well. Upon a successful request, the server responds with an empty 204 response.

PUT

Updates the account whose id matches the one specified in the URL parameter (:id). Only the password can be changed. Upon a successful request, the server responds with an empty 201 response.

Example body:

{
  "password": "new password"
}
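
Because this endpoint requires a token, the request carries both the JSON body and the 'Authorization' header. A minimal Python sketch, assuming a local server on port 8000; the function name and the user id in the usage line are hypothetical.

```python
import json
import urllib.request


def build_change_password_request(user_id, new_password, token,
                                  base_url="http://localhost:8000"):
    """Build (but do not send) an authenticated PUT /users/:id request."""
    body = json.dumps({"password": new_password}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/users/{user_id}",
        data=body,
        headers={
            "Content-Type": "application/json",  # assumption, see lead-in
            "Authorization": f"bearer {token}",  # prefix required by this server
        },
        method="PUT",
    )


request = build_change_password_request("someValidUserId", "new password", "someValidToken")
```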

/login

POST

Checks the database to see if an account with the given credentials exists. If such an account is found, the response will contain an authentication token along with the username.

Example body:

{
 "username": "unique username",
 "password": "not an easy to guess pw"
}

/datasets

GET

Returns all of the uploaded datasets as JSON. This endpoint supports query strings with the following options: 'name', 'username', 'fields' and 'limit_instances'. The first two options tell the server to return only datasets that match the corresponding values (same name or username). The 'fields' option lets the requester specify which of the dataset fields should be returned. These fields are:

  • id
  • user
  • name
  • relation (only appears in datasets uploaded as .arff files)
  • headers
  • instances

Finally, the 'limit_instances' option tells the server to return only the specified number of data instances per dataset. The request fails (status code 400) if the value is not a positive number.
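
The query options can be combined into a URL as in the Python sketch below. The README does not say how several 'fields' values are encoded, so joining them with commas is an assumption, as are the function name and the base URL. The helper also rejects non-positive 'limit_instances' values before the server would answer with 400.

```python
from urllib.parse import urlencode


def build_datasets_url(base_url, name=None, username=None, fields=None,
                       limit_instances=None):
    """Build a GET /datasets URL from the supported query options."""
    params = {}
    if name is not None:
        params["name"] = name
    if username is not None:
        params["username"] = username
    if fields:
        # Assumption: several fields are sent as one comma-separated value.
        params["fields"] = ",".join(fields)
    if limit_instances is not None:
        # Fail early instead of waiting for the server's 400 response.
        if not isinstance(limit_instances, int) or limit_instances <= 0:
            raise ValueError("limit_instances must be a positive integer")
        params["limit_instances"] = limit_instances
    query = urlencode(params)
    return f"{base_url}/datasets?{query}" if query else f"{base_url}/datasets"


print(build_datasets_url("http://localhost:8000", name="patient_stats", limit_instances=5))
```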

POST

Saves the received dataset JSON to the database. The dataset contents should be placed inside the request body. Upon a successful request, the server responds with a 201 response that contains the saved dataset object.

Example body:

{
  "name": "patient_stats",
  "headers": [
    "height",
    "weight",
    "blood-type"
  ],
  "instances": [
    [
      "160cm",
      "87kg",
      "A"
    ],...
  ],
  "user": "someValidUserId"
}
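
A hedged Python sketch of an upload request follows. It checks the encoded body against the 100 KB upload limit mentioned in the About section before building the request. Treating 100 KB as 100 * 1024 bytes, applying the limit to JSON bodies, the base URL and the function name are all assumptions.

```python
import json
import urllib.request

# The README lists a 100 KB upload limit; 1 KB = 1024 bytes is an assumption.
MAX_UPLOAD_BYTES = 100 * 1024


def build_upload_request(dataset, token, base_url="http://localhost:8000"):
    """Build (but do not send) an authenticated POST /datasets request."""
    body = json.dumps(dataset).encode("utf-8")
    if len(body) > MAX_UPLOAD_BYTES:
        raise ValueError(
            f"dataset is {len(body)} bytes, above the {MAX_UPLOAD_BYTES}-byte limit"
        )
    return urllib.request.Request(
        url=f"{base_url}/datasets",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"bearer {token}",
        },
        method="POST",
    )


dataset = {
    "name": "patient_stats",
    "headers": ["height", "weight", "blood-type"],
    "instances": [["160cm", "87kg", "A"]],
    "user": "someValidUserId",
}
request = build_upload_request(dataset, "someValidToken")
```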

/datasets/:id

GET

Returns the dataset with an id that matches the specified URL parameter (:id) as JSON if one exists. This endpoint supports the same query string options as the previous endpoint ('/datasets').

DELETE

Deletes the dataset with an id that matches the specified URL parameter (:id) if one exists. Users can only delete datasets that they have posted themselves (same account). Upon a successful request, the server responds with an empty 204 response.

/datasets/:id/:format

GET

Attaches the dataset whose id matches the URL parameter (:id) to the response as a file. The file format is specified in the ':format' URL parameter. Format options are csv, arff and json. Sent files are removed from the server's disk as soon as the response has been sent.
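
The download URL can be assembled as in the Python sketch below, which rejects formats other than the three listed above. The function name, the base URL and the dataset id in the usage line are hypothetical.

```python
SUPPORTED_FORMATS = ("csv", "arff", "json")


def build_download_url(dataset_id, file_format, base_url="http://localhost:8000"):
    """Build the GET /datasets/:id/:format URL for a supported file format."""
    if file_format not in SUPPORTED_FORMATS:
        raise ValueError(
            f"unsupported format {file_format!r}; expected one of {SUPPORTED_FORMATS}"
        )
    return f"{base_url}/datasets/{dataset_id}/{file_format}"


url = build_download_url("someDatasetId", "csv")
# To save the attachment from a running server:
#     import shutil, urllib.request
#     with urllib.request.urlopen(url) as response, open("dataset.csv", "wb") as out:
#         shutil.copyfileobj(response, out)
```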

OAuth2 Authorization Flow (Resource Owner Password Credentials)

During user login, the client sends the user's credentials to this server (POST request to '/login'). The credentials are then validated and an access token is sent in the response. The client can then send the received token with further requests to gain access to services, such as file uploads, that would otherwise be unavailable.
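
The sequence can be sketched from the client's point of view in Python. To keep the example runnable without a server, the transport is passed in as a function and replaced here by a stand-in. The response field name "token", the function names, the dataset id and the stand-in's behaviour are assumptions, not taken from this project's code.

```python
def login_then_upload(send, username, password, dataset):
    """Log in, then reuse the returned token for an authenticated upload.

    `send(method, path, body, headers)` is any callable that performs the
    HTTP request and returns the parsed JSON response as a dict.
    """
    login = send("POST", "/login", {"username": username, "password": password}, {})
    token = login["token"]  # Assumption: the token is returned under "token".
    headers = {"Authorization": f"bearer {token}"}
    return send("POST", "/datasets", dataset, headers)


def fake_send(method, path, body, headers):
    """Stand-in for the server so the sketch runs without a network connection."""
    if path == "/login":
        return {"token": "someValidToken", "username": body["username"]}
    if headers.get("Authorization") != "bearer someValidToken":
        raise PermissionError("missing or invalid token")
    return {"id": "hypotheticalDatasetId", **body}


saved = login_then_upload(
    fake_send, "unique username", "not an easy to guess pw", {"name": "patient_stats"}
)
```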

Note that this project does not separate the authorization server from the resource server. In a stricter environment, the validation process and token creation should be handled by a dedicated authorization server.

For more information on OAuth2, check this link

TODO
