cryptonomic / nautilus-cloud

Cloud interface for Nautilus infrastructure

Scala 98.44% HTML 1.30% Shell 0.25%

nautilus-cloud's People

developer0623

nautilus-cloud's Issues

Dev Ops Tooling

The following is a list of tooling / runbooks required to improve the quality of life (QoL) of those engaged in devops.

Tool to validate that generated keys actually work.
Tool to back up / restore databases.
Tool to check logins.

Add support for ToS reacceptance

The ToS / PP may change over time, so we need functionality that lets users re-accept the terms.

POC: Ensure that API keys are not reused

When an API key is generated, it should be validated against the DB to ensure that a previously generated key is not accidentally reused.
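
The check described above can be sketched as a retry loop. This is only an illustration: `UniqueKeys`, `candidate`, `generate` and the `exists` callback (standing in for a lookup against the keys table) are hypothetical names, not taken from the project's code.

```scala
import java.security.SecureRandom
import scala.annotation.tailrec

object UniqueKeys {
  private val alphabet = "abcdefghijklmnopqrstuvwxyz0123456789"
  private val random = new SecureRandom()

  // Produces a random 32-character candidate key.
  def candidate(): String =
    List.fill(32)(alphabet.charAt(random.nextInt(alphabet.length))).mkString

  // Keeps drawing candidates until one is not already known, or gives up.
  // `exists` is an assumed stand-in for a database lookup on the keys table.
  @tailrec
  def generate(
      exists: String => Boolean,
      next: () => String = () => candidate(),
      attemptsLeft: Int = 10
  ): Option[String] =
    if (attemptsLeft <= 0) None
    else {
      val key = next()
      if (exists(key)) generate(exists, next, attemptsLeft - 1) else Some(key)
    }
}
```

A lookup-then-insert check like this can still race between two concurrent requests, so a unique constraint on the key column would likely be wanted as well.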

Deprecate cookie support on the backend

In favor of timed tokens.

Gitlab federated login support

keys table structure

Column	Data Type	Description
keyid	numeric, primary key	sequence auto-increment pk
key	text, not null	32-char API key
resourceid	numeric	FK reference to resources table
userid	numeric	FK reference to users table
tierid	numeric	FK reference to tiers table
dateissued	timestamp	date from which the key is active
datesuspended	timestamp	a means of terminating a key
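
As a rough sketch, a row of this table could be modelled as below. The class and field names are assumptions derived from the column list, and `datesuspended` is assumed to be nullable since the table does not mark it `not null`.

```scala
import java.time.Instant

// Hypothetical row model mirroring the columns listed above.
final case class ApiKeyRow(
    keyId: Long,
    key: String,
    resourceId: Long,
    userId: Long,
    tierId: Long,
    dateIssued: Instant,
    dateSuspended: Option[Instant] // assumed nullable: None means not suspended
) {
  // A key is active from dateIssued until dateSuspended, if one is set.
  def isActiveAt(now: Instant): Boolean =
    !now.isBefore(dateIssued) && dateSuspended.forall(now.isBefore)
}
```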

Metering functionality (#11) will need to be instantaneous. To accomplish this we should keep the necessary data in memory. This can likely be done with Postgres configuration that forces the contents of the related tables to reside in memory.

Refreshing API key method should return new API key

Bootstrapping an administrator

This issue covers the discussion on how to bootstrap administrators on the system given that none exist at the time of deployment.

Discuss the NC LDM

Areas to cover:

User representation
Low Metrics
Aggregate Metrics
Account / Audit Logs
Billing History
Api Keys

Speed up tests

Embedded PostgreSQL needs ~20 seconds to start. We have 4 tests using that database, which means that we spend ~80 seconds waiting for the database to start while the tests themselves run in ~10 seconds. We can optimize this by starting the embedded database only once and reusing it for further tests, which should cut the total time of running all tests roughly threefold (from ~90 to ~30 seconds).
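
One way to start the database once and share it is a lazily initialised singleton. The sketch below uses a `FakeDatabase` stand-in rather than the real embedded PostgreSQL handle, and all names in it are hypothetical.

```scala
// Stand-in for an embedded PostgreSQL handle; constructing it is the expensive part.
final class FakeDatabase {
  FakeDatabase.starts += 1
  // In a real setup this would reset tables between tests.
  def truncateAll(): Unit = ()
}

object FakeDatabase {
  // Counts how many times a database was started, for illustration only.
  var starts: Int = 0
}

object SharedDatabase {
  // A lazy val is initialised once, on first access, and then reused.
  lazy val instance: FakeDatabase = new FakeDatabase
}

// Each test suite mixes this in instead of starting its own database.
trait UsesSharedDatabase {
  def database: FakeDatabase = SharedDatabase.instance
}
```

Sharing one instance means suites need to clean up the data they write, which is what the `truncateAll` placeholder hints at.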

Update route authorization logic

A few changes are required as to how roles are applied to routes. These are:

User routes currently require the role of admin. A user should also be able to view/update their information.
GetAllApiKeys should be restricted to the administrator role only.

In addition, given the current implementation, the create user route should be disabled as we have no immediate use case to allow users to register outside of an authentication provider mechanism.
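
The two rules above could be expressed as small predicates. This is a minimal sketch; `Session`, `canAccessUser` and `canListAllApiKeys` are illustrative names and not the project's actual types.

```scala
object UserRouteAccess {
  sealed trait Role
  case object User extends Role
  case object Administrator extends Role

  // Hypothetical view of the logged-in caller.
  final case class Session(userId: Long, role: Role)

  // A user may view or update their own record; administrators may access any record.
  def canAccessUser(session: Session, targetUserId: Long): Boolean =
    session.role == Administrator || session.userId == targetUserId

  // Listing every API key is reserved for administrators.
  def canListAllApiKeys(session: Session): Boolean =
    session.role == Administrator
}
```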

Disable old demo site

We can leave the code in the repo as it may make local testing easier; in that case, however, some mechanism should be available for disabling it.

users table structure

Column	Data Type	Description
userid	numeric, primary key	sequence auto-increment pk
username	text, not null	login name
useremail	text, not null	email
userrole	text, not null, default('user')	role, probably enum, other values might be acctadmin (corp customers may want control over their sub-accounts), infraadmin (require 2FA)
registrationdate	timestamp, not null	validity period start
accountsource	text, nullable	might be enum, values could be web, campaign id, "internal", "manual" (when we ourselves might make accounts for people)
accountdescription	text, nullable	to go with account source, something we may enter

Create basic backend API and tables

Start the project by initializing a new Scala SBT project with a basic API and database tables.

If there are no objections, we can start with the same stack as Conseil, i.e. Akka HTTP with Slick and Postgres. We are, however, completely open to any new choices presented here.

This is the initial API spec:

Route	Method	Description
apiKeys	GET	Gets all API keys
apiKeys/{apiKey}	GET	Validates given API key
users	PUT	Add new user
users/{user}	GET	Fetches user info
users/{user}/apiKeys	GET	Get all API keys for given user
users/{user}/usage	GET	Gets the number of queries used by the given user

The tables should be created accordingly.

Docker file needs to copy correct jar

The docker build process currently copies all jar files available in the build directory.

This is undesirable and the process should only take the assembled uber jar for NC.

Additionally, an update to the documentation is required to cover the handling of code changes and local docker images.

Changes to an auth endpoint

Replace the GET /github-callback path with POST /users/github-init.
The code parameter can be passed in the request body.
The endpoint should return the same as /users/me (without the current redirect).

Add Open ID and GitHub for authentication

Create authentication logic for the API such that it can plug in to multiple providers. For now, integrate with OpenID and GitHub.

Create a simple HTML test page with which to test authentication.

Authorization can be part of a subsequent ticket.

Travis integration

POC: Add functionality to allow a user to refresh his keys.

Add the necessary API routes / endpoints to allow a user to refresh his keys for a given environment, i.e. dev or prod. Note that keys are never deleted from the database.

Additionally, routes returning key information for a user should also output resource information with the key (which service, platform, network, description, etc.) as opposed to just the resourceId.

Finally, add a simple HTML page demonstrating this functionality. The mock ups may be referred to for this exercise.

tiers table definition

Column	Data Type	Description
tierid	numeric, primary key	sequence auto-increment pk
tier	text, not null	tier name
description	text, not null	long-form service description
monthlyhits	numeric, not null	cumulative number of requests per calendar month - static window
dailyhits	numeric, not null	24-hour window request allowance - sliding window
effectivedate	timestamp, not null	validity period start
enddate	timestamp, not null	validity period end

Update user registration flow to handle TOS recording along with other attributes

The back end user registration flow should work as such:

1. FE provides token.
2. BE exchanges token for the user's information.
3. BE checks if the user exists in the database.
   - If the user exists, the standard payload containing the user's information is returned to the FE and the flow exits here.
   - If not, the BE constructs the standard payload with the necessary new user defaults and returns it to the FE. It also adds this payload to a data store.
4. FE processes the payload, notices that ToS has not been accepted and presents the user with options. Once the user completes that action, FE submits the results back to the BE.
5. BE then removes the user from the cache, updates the necessary attributes and commits to DB. It then returns standard payload back to the FE.
6. Should the user not complete the action in step 4, a timer thread should remove the user from the data store.

The following attributes are required and will necessitate a schema change:

ToS Accepted - bool
Registration I.P address - string
Newsletter Accepted - bool
Newsletter Accepted date - timestamp

Other requirements

The timer thread removing users from the cache should emit a log event whenever it executes, indicating how many entries remain, how many were removed and the time window it is operating on. A log event should also be written in case this thread crashes for any reason.
The user update endpoint should now have an option for setting the newsletter preference. The attribute userRole should be removed from this endpoint.
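
The pending-user store and its timed cleanup might look roughly like the sketch below. All names (`PendingRegistrations`, `EvictionReport`, and so on) are hypothetical, the store is keyed by email purely for illustration, and scheduling and logging are left to the caller.

```scala
import java.time.{Duration, Instant}
import scala.collection.concurrent.TrieMap

final case class PendingUser(email: String, createdAt: Instant)

// Carries what the log event asks for: entries removed, entries remaining, time window.
final case class EvictionReport(removed: Int, remaining: Int, cutoff: Instant)

final class PendingRegistrations(ttl: Duration) {
  private val store = TrieMap.empty[String, PendingUser]

  // Called when a new user is seen for the first time.
  def add(user: PendingUser): Unit = store.update(user.email, user)

  // Called when the FE submits the ToS decision; returns the entry if still pending.
  def complete(email: String): Option[PendingUser] = store.remove(email)

  // Intended to be run periodically by a timer; drops entries older than the TTL.
  def evictExpired(now: Instant): EvictionReport = {
    val cutoff = now.minus(ttl)
    val expired = store.values.filter(_.createdAt.isBefore(cutoff)).toList
    expired.foreach(u => store.remove(u.email))
    EvictionReport(expired.size, store.size, cutoff)
  }
}
```

Returning a report from `evictExpired` keeps the eviction logic testable without a logger; the timer thread would log the report and wrap the call so that a crash is also logged.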

Reference: Discussion at Cryptonomic/Nautilus-Cloud-Ui#5

Add authorization logic

This is a follow up to #2.

Add support for roles, starting simply with 'user' and 'administrator' roles. Roles should determine access to route / HTTP method pairs. The 'user' role can be the default for now.
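
A table keyed by route / HTTP method pair is one simple way to express this. The entries below are illustrative assumptions only and do not describe the project's actual route permissions.

```scala
object RoutePermissions {
  // Role names come from the issue text; the route table itself is illustrative.
  val allowed: Map[(String, String), Set[String]] = Map(
    ("GET", "/apiKeys") -> Set("administrator"),
    ("GET", "/users/me") -> Set("user", "administrator"),
    ("PUT", "/users") -> Set("administrator")
  )

  // The 'user' role is the default, as suggested above.
  val defaultRole: String = "user"

  // Unknown route / method pairs are denied.
  def isAllowed(method: String, path: String, role: String = defaultRole): Boolean =
    allowed.get((method, path)).exists(_.contains(role))
}
```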

Add additional routes for role management.

Add audit / history feature for accounts

Users must be able to see a list of changes that have occurred to their account. The following fields should be captured.

What happened.
When did it happen.
Who did it.

Build an nginx module for metering

The module should be able to do the following:

Read the api key header from incoming requests.
Pipe that key via unix socket to another process to check if the key is valid or not.
Return appropriate response to the client.

Request metering

Incoming requests need to be gated: confirm that service is available for the provided key, given the service type and tier it is assigned, and compare usage against the total monthly and windowed utilization numbers.
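
A minimal sketch of such a gate is shown below, assuming the monthly and daily limits come from the key's tier and that current usage counts are supplied by the metering system. The types and the order of the checks are assumptions made for illustration.

```scala
object Metering {
  // Limits per tier: a calendar-month total and a 24-hour sliding window.
  final case class Tier(monthlyHits: Long, dailyHits: Long)

  // Current usage for one key, as reported by the metering system (assumed shape).
  final case class Usage(hitsThisMonth: Long, hitsLast24h: Long)

  sealed trait Decision
  case object Allow extends Decision
  case object UnknownOrSuspendedKey extends Decision
  case object MonthlyLimitReached extends Decision
  case object DailyLimitReached extends Decision

  // Checks key validity first, then the monthly total, then the sliding window.
  def gate(keyActive: Boolean, tier: Tier, usage: Usage): Decision =
    if (!keyActive) UnknownOrSuspendedKey
    else if (usage.hitsThisMonth >= tier.monthlyHits) MonthlyLimitReached
    else if (usage.hitsLast24h >= tier.dailyHits) DailyLimitReached
    else Allow
}
```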

userhierarchy table schema

Retail users will be sub-accounts of some house userid. PK on this table would be userid+managerid.

Column	Data Type	Description
userid	numeric	FK into users table
managerid	numeric	FK into users table (userid)
effectivedate	timestamp	record date

POC: Add a route to permit users to delete their account.

Add a route to permit users to delete their account. Upon successful delete operation, they should be logged out immediately and the back end should ensure that all associated keys are suspended.

For this specific issue, the delete operation will be limited in scope to removing the specific row from the Users table. This has been decided in the presence of @vishakh and @anonymoussprocket.

Support for administration endpoints

Introduce new / modify existing endpoints to support the following functionality required for administration:

An endpoint for fetching the complete list of users. The data returned for now can contain the userid, registration date and email id at the minimum.
Additionally, a search functionality should be present which takes in complete or partial userId or email and returns the same information as mentioned above. This can be a separate endpoint or baked into the one above.
An endpoint that permits accounts to be deleted. The semantics of such an operation are the same as those present in the code at this point in time.

All of the above actions must be limited to users with admin roles. Pagination support is desirable.
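
The search and pagination behaviour could be sketched over an in-memory list as below. In practice this would be a database query; `UserSummary`, `search` and the offset / limit parameters are hypothetical.

```scala
import java.time.Instant

object UserSearch {
  // The minimum data the issue asks for: user id, registration date and email.
  final case class UserSummary(userId: Long, email: String, registrationDate: Instant)

  // Matches a complete or partial user id or email; an empty query matches everyone.
  // Results are ordered by user id so that pages are stable.
  def search(
      users: Seq[UserSummary],
      query: String,
      offset: Int = 0,
      limit: Int = 20
  ): Seq[UserSummary] = {
    val q = query.trim.toLowerCase
    users
      .filter(u => u.userId.toString.contains(q) || u.email.toLowerCase.contains(q))
      .sortBy(_.userId)
      .slice(offset, offset + limit)
  }
}
```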

resources table schema

Column	Data Type	Description
resourceid	numeric, primary key	sequence auto-increment pk
resourcename	text, not null
description	text, not null
platform	text, not null	'Conseil', 'Tezos', etc
network	text, not null	'prod', 'mainnet', etc

POC: User keys must be created on first login.

Two keys must be created on first login for the user.

For the Development environment, one combined key for Conseil and Tezos.
For the Production environment, one combined key for Conseil and Tezos.

For this, 4 static resource entries may be added to the resources table, one for each service (conseil/tezos) and its environment/platform (prod/dev).

Keys may be generated by taking random chars and seeds as input and using a fixed-length hash function to transform them, or by any other suitable mechanism that will ensure no collision of keys between users.
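
As an illustration of the hash-based option, the sketch below feeds a seed and random bytes through SHA-256 and keeps 32 hex characters. The choice of SHA-256, the truncation and all names are assumptions, not the project's actual scheme.

```scala
import java.security.{MessageDigest, SecureRandom}

object KeyDerivation {
  private val random = new SecureRandom()

  // Hashes a seed together with random bytes and keeps the first 32 hex characters.
  def deriveKey(seed: String, randomBytes: Array[Byte]): String = {
    val digest = MessageDigest.getInstance("SHA-256")
    digest.update(seed.getBytes("UTF-8"))
    digest.update(randomBytes)
    digest.digest().map(b => f"${b & 0xff}%02x").mkString.take(32)
  }

  // Draws fresh random bytes for each new key.
  def newKey(seed: String): String = {
    val bytes = new Array[Byte](32)
    random.nextBytes(bytes)
    deriveKey(seed, bytes)
  }
}
```

Hashing makes collisions very unlikely rather than impossible, so a uniqueness check against previously issued keys would still be needed to guarantee that no two users share a key.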

java.lang.OutOfMemoryError: Metaspace

relates to #23

While running tests we can experience OutOfMemoryError: Metaspace. It looks like we have a memory leak, because Metaspace usage seems to grow on each test run inside the sbt console. Running them with sbt test is fine because the JVM is shut down after the job is done, so it works for now, but it will stop working once we have more tests.

Possible actions:

check why it happens and try to fix it (maybe it's an actor system which is not being shut down),
if it's because of embedded-postgres, we should consider replacing it with https://www.testcontainers.org/modules/databases/postgres/ which is a nice thing to do anyway because embedded-postgres is not maintained anymore,
we can increase the memory available for Metaspace, which is a temporary solution

Gitlab login UX changes

Introduce new endpoint to search users by ApiKeys

Introduce a new administrator only endpoint to search for users given a partial or full apikey and an optionally defined environment parameter.

Collect and record high level statistics from the metering systems

We would like to record high level usage statistics within the Nautilus Cloud database for non real time uses such as billing etc. This data should be stored in a separate table.

A proposed schema for this table is:

id (int or varchar PK)
user id (int or varchar FK)
service (varchar e.g Tezos, Conseil etc)
hits (long)
period_start (timestamp, UTC)
period_end (timestamp, UTC)

The above schema may be enhanced as required.

A proposed algorithm for collecting this data is as follows:

Given:

Time period A, the time interval after which the algorithm runs.
Time period B, the time interval within which we are attempting to gather statistics.

Then:

Every A wake up and gather data.
For each user in the system, query the last recorded interval (period_start, period_end).
For all metering APIs listed in config, fetch all data for all API keys valid within the given periods: a) period_start to period_end and b) period_end to period_end + B.
Aggregate the two data sets collected from the previous step. Update and insert into the database respectively.

A few notes :

The time periods A and B may not necessarily align, i.e. it's possible that A >> B.
The above scheme should cope with an extended outage of the NC server. It is important that any code ensures that stats are neither lost nor double counted.
The process of collecting statistics should be kicked off immediately upon successful server start.
Any failure should be recorded to log.
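
The catch-up behaviour after an outage can be sketched as a function that lists the windows still to be collected. This simplifies the algorithm above: it only computes complete windows of length B following the last recorded period_end and leaves out the re-fetch of the last recorded period. All names are hypothetical.

```scala
import java.time.{Duration, Instant}

object StatsWindows {
  final case class Window(start: Instant, end: Instant)

  // Lists the complete windows of length `b` between the last recorded period_end
  // and `now`. Windows are contiguous and non-overlapping, so hits are neither
  // skipped nor counted twice, even if the server was down for a long time.
  def pending(lastPeriodEnd: Instant, now: Instant, b: Duration): List[Window] = {
    require(!b.isZero && !b.isNegative, "window length must be positive")
    Iterator
      .iterate(lastPeriodEnd)(_.plus(b))
      .map(start => Window(start, start.plus(b)))
      .takeWhile(w => !w.end.isAfter(now))
      .toList
  }
}
```

For example, with a last recorded period_end of 00:00, B of one hour and a wake-up at 03:30, this yields the windows 00:00 to 01:00, 01:00 to 02:00 and 02:00 to 03:00; the partial window starting at 03:00 is left for a later run.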

usercredentials table schema

Column	Data Type	Description
userid	numeric, primary key	FK into users table
oauthprovider	text, not null	OAuth service (OpenID, Github)
oauthtoken	text, not null	OAuth token
effectivedate	datetime, not null
duration	numeric, not null	seconds til oauth token expiration

Add additional config parameters to Metering API config

The protocol should be a configurable field in the metering api configuration (MeteringApiConfig.scala)
Add a config for defining a pre shared api key similar to what NC and conseil use to exchange the key list.

Multiple usage tiers

Users should be assignable to multiple tiers according to their needs. The API should therefore support creating and modifying tiers and assigning / reassigning users to these tiers.

We can start with the following tiers and sub-tiers:

Shared / Free
Shared / Economy
Shared / Business
Dedicated / Self-managed
Dedicated / Managed
Dedicated / Bespoke

Please keep in mind that in the future switching tiers might require payment by users.

Payment integration

Look at ForgingBlock and propose a design for integration with NC.

Add an endpoint to display usage data to clients

This ticket is the first of many towards tier-based services support in NC.

The first step is to establish plumbing from InfluxDB to NC. The following flow is envisioned to achieve this goal:

1. Given a user, NC constructs a list of all valid API keys for that user that have been active in the past 30 days.
2. NC queries InfluxDB servers, restricting the list of keys to the ones returned in step 1.
3. NC displays the returned results to the front end in a format that is compatible with a charting library (TBD).

Notes:

NC may cache the results from Step 2 for the appropriate time window for the measurement.
It is assumed that we only have 1 InfluxDB instance to pull data from.
Apply abstraction where necessary, as we may choose to opt for another database apart from InfluxDB.
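
The flow and the abstraction note could be sketched as follows, with the metrics database hidden behind a small trait. `UsageStore`, `InMemoryUsageStore` and the other names are hypothetical, and the in-memory store merely stands in for an InfluxDB client.

```scala
import java.time.{Duration, Instant}

object UsageLookup {
  final case class KeyActivity(key: String, lastActive: Instant)
  final case class UsagePoint(key: String, hits: Long)

  // Abstraction over the metrics database so that InfluxDB can be swapped out later.
  trait UsageStore {
    def hitsFor(keys: Set[String]): Seq[UsagePoint]
  }

  // In-memory stand-in used only for illustration.
  final class InMemoryUsageStore(data: Map[String, Long]) extends UsageStore {
    def hitsFor(keys: Set[String]): Seq[UsagePoint] =
      keys.toSeq.sorted.flatMap(k => data.get(k).map(UsagePoint(k, _)))
  }

  // Selects the keys active within the past 30 days, then queries the store
  // for those keys only.
  def usageForUser(keys: Seq[KeyActivity], store: UsageStore, now: Instant): Seq[UsagePoint] = {
    val cutoff = now.minus(Duration.ofDays(30))
    val recent = keys.filter(k => !k.lastActive.isBefore(cutoff)).map(_.key).toSet
    store.hitsFor(recent)
  }
}
```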