
s3-db


Document DB API for AWS S3

AWS S3 is incredibly cheap, offers 99.99% availability and eleven 9s of durability, triggers via AWS Lambda, supports cross-region replication and versioning, and has decent performance. It's a compelling database solution for a lot of scenarios. A few other people agree; see Pete Warden's blog and this interesting solution.

What's New

This library has changed drastically from the 1.x.x versions, as it was rewritten to be TypeScript first. Also:

  • It makes use of async/await patterns.
  • TypeScript decorators significantly improve and reduce configuration.
  • Model classes are no longer decorated with convenience functions (just metadata).
  • A new collection instance is no longer async/promisified.
  • No need to create a 'Database' object.

Next?

  • JavaScript Examples
  • eTag verification on save, for collision detection.
  • Carry over 'copy'
  • Iterator pattern on, or instead of, ResourceList.
  • Performance docs on Lambda at 128MB, 512MB, 1024MB, 2048MB, and 3008MB.
  • Bug: MD5 does not appear to be persisting or returning when doing .head() check on a document.

Usage

Installation

Run npm install s3-db --save in the project where S3-DB will be used. If you don't already have the aws-sdk node module, run npm install aws-sdk as well.

Configure S3DB

There are very reasonable out-of-the-box configurations, but if you would like to change any of them you can do so using the S3DB class. The only values you need to worry about at the global level are the following.

Here is a basic example overriding all of the default values.

S3DB.update({
    baseName: 'myapp',
    stage: 'quality',
    region: 'us-east-2',
    bucketPattern: '{{stage}}-{{region}}-{{baseName}}-{{bucketName}}'
});

Note: each of these values is explained in detail in the configuration section below.

Decorate Model/Types

Create your model class and decorate it with @collection() and @id() so that when an instance of that class type is passed to the appropriate Collection instance, it knows how it should be configured.

If you do not specify an argument for @collection(), the class name will be lower-cased and used as the collection name, and all collection configuration defaults will be used.

Here is a very basic example where the name generated by s3-db is fine.

@collection()
export class User {

    @id()
    private id?: string;
    private name?: string;
    private age?: number;
    private address?: Address;
}

Once you have your model decorated, you can create an instance of a Collection and begin creating, updating, and deleting objects in a bucket. Every function on Collection is async/promisified, so you can use either pattern.

Async Example.

const collection: Collection<User> = new Collection(User);
async function doStuff() {
    const user: User = await collection.save({ name: 'Testing', age: 21 });
    const checkedUser: User = await collection.load(user.id);
    await collection.delete(user.id);
}

Promise Example.

const collection: Collection<User> = new Collection(User);

/* Creates a user and generates an ID for the user record. */
collection.save({name:'Testing',age:21})
    .then( (user: User) => collection.load(user.id) )
    .then( (user: User) => collection.delete(user.id) );

Configuration

A complete list of all the configuration points and the values you can use.

S3DB

Configurations that are applied across all collections.

  • baseName (default: s3db): Used in the bucketPattern and in logging as a namespace.
  • stage (default: dev): The logical environment. Used in the bucketPattern.
  • region (default: us-west-2): Used in the AWS configuration to target a specific region. Also used in the bucketPattern.
  • bucketPattern (default: {{stage}}-{{region}}-{{baseName}}-{{bucketName}}): The pattern used to look up the bucket name for a collection. Must produce valid S3 bucket name characters. The replacement tokens {{stage}}, {{region}}, {{baseName}} and {{bucketName}} are case sensitive. You can omit any of them.
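To make the pattern concrete, here is a minimal sketch of how a {{token}} pattern like the one above could be resolved into a bucket name. resolveBucketName is an illustrative helper, not the library's API.

```typescript
interface S3DBConfig {
  baseName: string;
  stage: string;
  region: string;
  bucketPattern: string;
}

// Replace each {{token}} case-sensitively; unknown tokens are left alone.
function resolveBucketName(config: S3DBConfig, bucketName: string): string {
  const tokens: Record<string, string> = {
    stage: config.stage,
    region: config.region,
    baseName: config.baseName,
    bucketName,
  };
  return config.bucketPattern.replace(
    /\{\{(\w+)\}\}/g,
    (match: string, token: string) => tokens[token] ?? match
  );
}

const defaults: S3DBConfig = {
  baseName: 's3db',
  stage: 'dev',
  region: 'us-west-2',
  bucketPattern: '{{stage}}-{{region}}-{{baseName}}-{{bucketName}}',
};

// 'user' collection -> 'dev-us-west-2-s3db-user'
console.log(resolveBucketName(defaults, 'user'));
```

With the S3DB.update() values from the earlier example, the same pattern would produce names like quality-us-east-2-myapp-users.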

Collection

Configurations specific to a collection.

  • pageSize (default: 100): How many documents to return for a .find() operation. Maximum of 1000.
  • serversideEncryption (default: true): Whether S3 server-side encryption (encryption at rest) is enabled.
  • checkIsModified (default: true): If enabled, save() operations check whether the provided object has been modified before saving. If it is not modified, save() returns without attempting to write to S3.
  • isModified (default: MD5IsModified): A function used to check whether an object is modified. If you override it, implement the IsModified interface.
  • serialization (default: JSONSerialization): How objects are serialized to a string before they are persisted to S3.
  • defaultIdGenerator (default: defaultIDGenerator): Generates a UUID v4 by default. Called when no generator is provided on the @id() annotation.
  • validator (default: undefined): A function that can be used to check whether the object being saved is valid.
  • noExceptionOnNotFound (default: false): Changes the behavior to return undefined rather than throw an exception when no document is found.
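As a sketch of what an MD5-based modification check could look like (assuming JSON serialization as in the default JSONSerialization; md5Of and isModified are hypothetical names, not the library's exports):

```typescript
import { createHash } from 'crypto';

// Hash the serialized form of a document.
function md5Of(document: object): string {
  return createHash('md5').update(JSON.stringify(document)).digest('hex');
}

// A document with no previously known hash is treated as new, hence modified.
function isModified(document: object, lastKnownMd5?: string): boolean {
  if (!lastKnownMd5) return true;
  return md5Of(document) !== lastKnownMd5;
}

const user = { id: '123', name: 'Testing', age: 21 };
const hash = md5Of(user);
console.log(isModified(user, hash));                 // false: nothing changed
console.log(isModified({ ...user, age: 22 }, hash)); // true: age changed
```

When checkIsModified is enabled, a check along these lines lets save() skip the S3 write entirely for unchanged documents.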

APIs

The available objects, decorators and functions.

@collection(string? | CollectionConfiguration?)

Annotation indicates that a specific class corresponds to an S3 Bucket.

@id(generator?)

Annotation indicates which attribute or field on a class will be the key for the persisted object. If this annotation is not used, then 'id' is used, or added.

S3DB

Singleton containing the 'global' configurations for the S3DB instance.

S3DB.update({ baseName?: string; stage?: string; bucketPattern?: string; region?: string })

Updates the default configuration with the values provided.

S3DB.setLogLevel(level: LogLevel): void

Updates the logging level of the S3DB logger and will change the default level that each collection instance defines.

Collection

Used to do CRUD operations. You need to create an instance to use it.

const collection: Collection<SomeClass> = new Collection(SomeClass);

Creates a new instance that will use the SomeClass definition (which should contain the @collection and @id decorators) to determine its configuration.

collection.head(id): Promise

Returns the metadata for the corresponding object identified by the id.

collection.exists(id): Promise

Returns true if the object exists within the corresponding S3 bucket.

collection.save(toSave): Promise

Creates or updates an object in S3 that corresponds to the id of the toSave object passed in. If there is no id on the object, then one is generated.

collection.delete(id): Promise

Removes an object from an S3 bucket for the corresponding id.

collection.find(prefix, pageSize, continuationToken): Promise

Returns a list of S3Metadata objects for all objects in the corresponding S3 bucket that start with the prefix value provided. If continuationToken is passed in, then the list will be a 'continuation' of a previous find operation.

collection.subCollection(prefix: string, typeOf: SomeClass): Collection

Creates a new collection where all operations will execute with the prefix in front of the ids used. So if the prefix is /users/, then when .load('1234') is called the request will result in an ID lookup for /users/1234. Similarly, all objects saved will have the prefix applied when the ID is generated by the save operation, or when an ID is provided and it does not startsWith() the configured prefix.
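The prefixing rule described above can be sketched as a small pure function (applyPrefix is an illustrative name, not the library's API):

```typescript
// Apply the sub-collection prefix only when the id does not already carry it.
function applyPrefix(prefix: string, id: string): string {
  return id.startsWith(prefix) ? id : `${prefix}${id}`;
}

console.log(applyPrefix('/users/', '1234'));        // '/users/1234'
console.log(applyPrefix('/users/', '/users/1234')); // '/users/1234' (unchanged)
```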

collection.setLogLevel(level: LogLevel): void

Lets you change the logging level for this specific collection instance. At creation of the collection, the logging level is taken from the S3DB logger, as a child logger is created from it.

s3-db's People

Contributors

dependabot[bot], matt-filion, palmerabollo, relvao, syrok, teebu


s3-db's Issues

Different region returns InvalidBucketName

Tried to change AWS_DEFAULT_REGION and got err: {status: InvalidBucketName}

This is what I tried:

"use strict";

process.env.AWS_DEFAULT_REGION = 'eu-west-1';

const DB = require('s3-db');
const db = new DB();
// console.log('db:', db);

db.createCollection('del_man_orders_s3_db')
  .then(() => console.log('success'))
  .catch(err => console.log('err:', err));

Added a console.log() to print the defaults variable from src/index.js

Defaults: { db: 
   { name: 's3-db',
     environment: 'dev',
     namePattern: '${db.name}.${db.environment}-${name}' },
  provider: { name: 'aws-s3', region: 'eu-west-1' },
  collections: { default: {} },
  serializer: {} }
err: { status: 'InvalidBucketName' }

Add in the ability to reject overwrites using S3 policy.

I think it's possible to have a policy apply when a request parameter is supplied. This is completely untested, but it should be possible to disable deleteObject and deleteObjectVersion if a header or value is provided in the request. If that's true, this policy could be applied conditionally to essentially allow creation only for specific calls, and we could have create and overwrite options in the API.

https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ExamplePolicies_EC2.html

https://docs.aws.amazon.com/AmazonS3/latest/dev/example-bucket-policies.html

Cannot find module 's3-db'

Hi there,

when I install this module I'm not able to import it in my code.

I noticed that the main property in your package.json says /dist/index, but after running npm i I don't have the dist folder.

Why?

"error":{"message":"The Content-MD5 you specified did not match what we received.","co de"

var test_file_1 = JSON.parse(fs.readFileSync(`./test/${file}`, 'utf8'));
database.getCollection('mydb')
  .then( collection => collection.saveDocument(test_file_1) )
  .catch( error => console.error(error) )
{ BadDigest: The Content-MD5 you specified did not match what we received.
    at Request.extractError (C:\Users\hello\Documents\Learning\Lambdas\s3db\node_modules\aws-sdk\lib\services\s3.js:568:35)
    at Request.callListeners (C:\Users\hello\Documents\Learning\Lambdas\s3db\node_modules\aws-sdk\lib\sequential_executor.js:105:20)
    at Request.emit (C:\Users\hello\Documents\Learning\Lambdas\s3db\node_modules\aws-sdk\lib\sequential_executor.js:77:10)
    at Request.emit (C:\Users\hello\Documents\Learning\Lambdas\s3db\node_modules\aws-sdk\lib\request.js:682:14)
    at Request.transition (C:\Users\hello\Documents\Learning\Lambdas\s3db\node_modules\aws-sdk\lib\request.js:22:10)
    at AcceptorStateMachine.runTo (C:\Users\hello\Documents\Learning\Lambdas\s3db\node_modules\aws-sdk\lib\state_machine.js:14:12)
    at C:\Users\hello\Documents\Learning\Lambdas\s3db\node_modules\aws-sdk\lib\state_machine.js:26:10
    at Request.<anonymous> (C:\Users\hello\Documents\Learning\Lambdas\s3db\node_modules\aws-sdk\lib\request.js:38:9)
    at Request.<anonymous> (C:\Users\hello\Documents\Learning\Lambdas\s3db\node_modules\aws-sdk\lib\request.js:684:12)
    at Request.callListeners (C:\Users\hello\Documents\Learning\Lambdas\s3db\node_modules\aws-sdk\lib\sequential_executor.js:115:18)
  message: 'The Content-MD5 you specified did not match what we received.',
  code: 'BadDigest',
  region: null,
  time: 2017-06-17T18:07:56.232Z,
  requestId: '1D886C49334CF1FB',
  extendedRequestId: 'ql+U0UE7ZjXqyGhW/9gDK/cGWifNvOBqrTP6DglYjhXc9TmYs3RhF600+t9KktZquRlR+KceauE=',
  cfId: undefined,
  statusCode: 400,
  retryable: false,
  retryDelay: 74.86007695144244 }

Process finished with exit code 0

Triggers or Event Hooks

Ability to execute code during a collection event. Ideally, it's able to be configured to call a Lambda function directly. This way a primitive form of webhook can be implemented fairly easily.

Writable, Readable and Duplex streaming on collections.

const users: Collection<User> = new Collection(User);
const newUsers: Collection<NewUser> = new Collection(NewUser);
const transformer: DuplexStream = ....;

//This should read from users, transform using the transformer and then write to newUsers.
users
   .pipe(transformer)
   .pipe(newUsers);

  • .pipe() should produce a ReadStream that makes use of .find with no prefix.
  • Collection itself should implement WritableStream so it can be used as a value into .pipe() as a write destination.
  • ReferenceList should have a .pipe() method to produce a ReadStream which will produce documents starting with the prefix defined in .find() provided to get the ReferenceList.

Version/Revision Check before update

Is that already possible with validator? If not, it would be a good feature.

When updating a document check for version/rev to keep data consistent, on a successful update increment the version/rev number.

ex: 2 updates on the same document; one goes through, the other fails because of a version mismatch.

The version/rev data would have to be stored in the metadata, and checked kind of like you do in the collection.
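A minimal sketch of the proposed check, assuming a version number carried on each document (checkAndIncrement is a hypothetical name; this is not part of s3-db today):

```typescript
// Documents carry a monotonically increasing version number.
interface Versioned {
  version: number;
  [key: string]: unknown;
}

// Reject the update if the caller's version no longer matches the stored one;
// otherwise return the update with the version incremented for persistence.
function checkAndIncrement(stored: Versioned, update: Versioned): Versioned {
  if (update.version !== stored.version) {
    throw new Error('Version mismatch: document was modified concurrently.');
  }
  return { ...update, version: update.version + 1 };
}
```

With two concurrent updates starting from the same version, the first to persist wins and the second fails the check, which is exactly the behavior the issue asks for.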

Documentation Request

  1. An example where the saveDocument method is used would be helpful.

  2. Maybe a quick explanation of, or a link about, ES6 arrow functions and promises.

Needs a better "getting started" guide

const Database = require('s3-db');
const database = new Database();
const users    = database.getCollection('users');
users.get('my-user')
  .then( user => {user.age = 32; return user} )
  .then( user => user.save() );

This doesn't work because getCollection returns a promise.

Typo in collection section: database.getColletion('x',{id:{propertyName:'name'}})

I'd like to see how to create a collection, create a new document, then get document and update document.

database.createCollection('test')
  .catch(console.error);
database.getCollection('test')
  .then( collection => {
    collection.saveDocument({ "id":"321", "name":"test2", "value":123 })
      .catch(console.error)
  });

How to (upsert) create a document if one doesn't exist, or update a document.

Some config examples would be nice.

const Database = require('s3-db');
const configuration = {
  db: {
    name: 's3-db',
    environment: 'dev',
    pageSize : 10,
    allowDrop: true
  },
  provider: {
    name: 'azure-drive',
    region: 'us-east-1'
  },
  collections: {
    default: {
      onlyUpdateOnMD5Change: false,
      collideOnMissmatch: false,
      pageSize: 1000,
      encryption: false,
      id: {
        propertyName: 'id',
        generator: collectionName => `${configuration.db}-${collectionName}-${new Date().getTime()}`
      }
    }
  }
};

The above creates a file named %5Bobject%20Object%5D-%5Bobject%20Object%5D-1497313127381 for some reason.

I did find some nice documentation in your repo, after I wrote this.

This example is giving me an error: TypeError: users.save is not a function. The first call should be users.saveDocument because it's a collection.

const Database = require('s3-db');
const database = new Database();
const user     = {name : 'Richard Cranium'};

database.getCollection('users')
  .then( users => users.save(user) )
  .then( user => {
    user.size = 1234;
    user.sex  = 'male';
    return user;
  })
  .then( user => user.save() )
  .then( user => {
    user.size = 122345;
    user.sex  = 'female';
    return user;
  })
  .then( user => user.refresh() )
  .then( user => user.delete() )
  .catch( error => console.error(error.stack) )

Not picking up correct stage

If I invoke locally

sls invoke local -s produccion -f Some

I get

{ db: 
   { name: 's3-db',
     environment: '$LATEST',
     namePattern: '${db.name}.${db.environment}-${name}' },
  provider: { name: 'aws-s3', region: 'us-west-2' },
  collections: { default: {} },
  serializer: {} }

Which later gives me an InvalidBucketName exception.

My workaround was setting STAGE=mystage on the command line, but I'm curious how it will behave on deploy.

.find() for the subCollection

Calling .find() on the subcollection returns the objects of the parent collection:

const companyTable = await database.getCollection('companies');
const table = companyTable.subCollection(`${company.id}-${TABLE_NAME}`);
const documentReferences = await table.find();

documentReferences would contain the files from 'companies'

Create 'change' operation on collection for a specific object.

  • Applies only to an existing object; fails if none is found.
  • Takes a function as an argument to manipulate the loaded object.
  • eTag matching is used to reject the change if the object is modified 'underneath' the change.
  • If rejected, the operation is re-attempted 3 times.
  • If it fails 3 times, it fails completely.
  • The object is re-loaded for each attempt with the current eTag.
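The reload-mutate-retry loop described above could be sketched as follows. All names are hypothetical; load and saveIfMatch stand in for an S3 read and a conditional write guarded by the eTag.

```typescript
interface Stored<T> {
  document: T;
  eTag: string;
}

// Reload the document, apply the mutation, and attempt a conditional save.
// If the eTag no longer matches (saveIfMatch returns false), retry with a
// fresh copy, up to the given number of attempts.
async function change<T>(
  load: () => Promise<Stored<T>>,
  saveIfMatch: (doc: T, eTag: string) => Promise<boolean>,
  mutate: (doc: T) => T,
  attempts: number = 3
): Promise<T> {
  for (let i = 0; i < attempts; i++) {
    const { document, eTag } = await load(); // fresh copy + current eTag
    const changed = mutate(document);
    if (await saveIfMatch(changed, eTag)) return changed; // write accepted
  }
  throw new Error(`change failed after ${attempts} attempts`);
}
```

Against real S3 the conditional write could be approximated with a head() check of the eTag immediately before the put, accepting the small race window that remains.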

Fix readme 😁

Hi @matt-filion

👋

You missed a letter for database: getColletion instead of database.getCollection (the missing "c"), 3 times at the end of readme.md.

Have a nice day, thx for s3-db 😃

Compare Documents

It would be helpful to have a method that compares documents and returns a true or false.

Also helpful, a method that compares key,value pairs of documents and returns a true or false.

Use case: check if a phone number already exists

S3 Key duplication limitation

Hello, how does s3-db deal with S3's key uniqueness limitations? For example, say you want to save a User document where the username needs to be unique. As far as I know, S3 does not have a facility to prevent duplication, and its eventual-consistency nature makes it hard for any application to prevent it.

For example, if two clients create a document with the same username (but different passwords) at about the same time, which client will actually make it?

new Database() throws Unexpected token =

I'm not exactly sure what is going on here, but when I call new Database() I get

SyntaxError: Unexpected token =
 at exports.runInThisContext (vm.js:53:16
    at Module._compile (module.js:414:25)
    at Module._extensions..js (module.js:442:10)
    at Object.require.extensions.(anonymous function) [as .js] (S:\\repos\\api\\api-media\\lambda\\node_modules\\babel-register\\lib\\node.js:152:7)
    at Module.load (module.js:356:32)
    at Function.Module._load (module.js:311:12)
    at Module.require (module.js:366:17)
    at require (module.js:385:17)
    at new module.exports (S:\\repos\\api\\api-media\\lambda\\node_modules\\s3-db\\src\\index.js:38:27
   at Object.create (S:/repos/api/api-media/lambda/src/db/s3/index.js:4:3)
    at S:/repos/api/api-media/lambda/src/sdk/asset/index.js:36:7
    at process._tickDomainCallback (node.js:411:9)
'use strict';
var Database = require('s3-db');

//breaks right away if i comment below
//const Database = new Database();

var breaks = (accountId,type) => {
		return new Database();
}

module.exports = {
        breaks 
}

Question: Multi-user scenarios, atomicity, locks, etc

This is a really cool project. I'm trying to determine the limit of what it could/should be used for in situations where multiple users/sessions could be trying to write to the DB at the same time. I see the checkIsModified option, which seems to leave only a small window (between checking if it's been modified and the actual write) where another user/session could have written to the DB undetected. Would you say that one would need to implement their own lockfile solution to close that window completely?

For atomic transactions involving multiple models, I'm thinking a custom solution would also be needed. Do you agree?

Thanks again for the cool library.

Rename a document

http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#copyObject-property

  • Rename: Make sure that if the collideOnMissmatch flag is set, the CopySourceIfMatch attribute is used with the eTag of the source. Add configuration that causes a rename of the source instead of a delete.
  • Move: a convenience method for moving a document to another collection, deleting the source document once the copy is confirmed successful.
  • Add a duplicate method as well, which falls back on configuration to determine the name of the new object. Duplicates the object into the same collection.

Multiple Buckets under YML Resource

When listing multiple buckets to deal with under the resource, they don't end up in the policy, so I get permission denied. How can multiple buckets be used like below?

  iamRoleStatements:
    -  Effect: "Allow"
       Action:
         - "s3:*"
       Resource:
         Fn::Join:
           - ""
           - - "arn:aws:s3:::bucket1*"
           - - "arn:aws:s3:::bucket2*"
           - - "arn:aws:s3:::bucket2*"

Choose storage mechanism by file size.

Files under 4k could be stored under only the file name (the threshold might be lower; need to double-check AWS S3 limits). The lookup/find mechanism would need to change to locate files that are only a file name. In these cases updates would be fast, since they would bypass write operations and only modify metadata.
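As a sketch of the routing decision: S3 object keys are limited to 1,024 bytes (and user-defined metadata to 2 KB), so the 4k figure above would indeed need to come down. chooseStorage is an illustrative helper, not part of the library.

```typescript
// S3's documented maximum object key length, in bytes.
const MAX_KEY_BYTES = 1024;

// Route a serialized document to key-encoded storage when the prefix plus
// payload fits within the key limit; otherwise store it as a normal object.
function chooseStorage(keyPrefix: string, serialized: string): 'key' | 'object' {
  const total =
    Buffer.byteLength(keyPrefix, 'utf8') + Buffer.byteLength(serialized, 'utf8');
  return total <= MAX_KEY_BYTES ? 'key' : 'object';
}

console.log(chooseStorage('/users/', '{"id":"1"}')); // 'key'
```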
