
mongo-seeding's Introduction

Hi! 👋

I'm Paweł Kosiec, Full-stack Cloud Developer.

I'm an open-source and cloud-native enthusiast. I write back-end services in Go and modern front-ends in JavaScript (React.js). I work in a cloud-native environment, extending Kubernetes and building microservices.

To learn more about me, check out my personal website and blog - kosiec.dev.

mongo-seeding's People

Contributors

benedictrasmussen, dalthonmh, eltongarcia, jjpatel361, michallytek, pkosiec


mongo-seeding's Issues

Merge all mongo-seeding repositories

Maintaining separate mongo-seeding repositories is rather difficult, both for users and for maintainers. Wouldn't it be better to create a monorepo and start using Lerna?

JSON Schema support

Just stumbled upon this nice project. How about adding JSON Schema support?

Schemas could simply have the same name as the corresponding module: if you have a module called foo.ts, a foo.json schema would be looked up in the same folder.
I could do a PR.

Support Extended JSON format

Hi @pkosiec,

Awesome job on the library, it's very useful.

I'm running into an issue. I'm inserting data from a JSON file, and the file has an "_id" field which should become a MongoDB ObjectId.

The issue is that it gets inserted as plain text.

This syntax should lead to the creation of ObjectIds:

{
  "_id": { "$oid": "5a68fdc3615eda645bc6bdec" }
}

But it throws an error locally: { MongoSeedingError: Error: key $oid must not start with '$'

I found that syntax here: https://stackoverflow.com/questions/51439955/mongoimport-id-is-not-an-objectid

Is there a way to end up with ObjectIds instead of plain text ids?
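Until Extended JSON is supported natively, a workaround is a recursive pre-processing step over the parsed JSON. The sketch below is hypothetical: `makeObjectId` stands in for `hex => new ObjectId(hex)` from the `mongodb` driver, and wrapper detection is limited to the `$oid` type.

```javascript
// Hypothetical pre-processing step: recursively replace Extended JSON
// { "$oid": "..." } wrappers before insertion. `makeObjectId` stands in
// for `hex => new ObjectId(hex)` from the `mongodb` driver.
function convertExtendedJson(value, makeObjectId) {
  if (Array.isArray(value)) {
    return value.map((item) => convertExtendedJson(item, makeObjectId));
  }
  if (value !== null && typeof value === 'object') {
    const keys = Object.keys(value);
    // A single-key { $oid } object is an Extended JSON ObjectId wrapper.
    if (keys.length === 1 && keys[0] === '$oid') {
      return makeObjectId(value.$oid);
    }
    const result = {};
    for (const key of keys) {
      result[key] = convertExtendedJson(value[key], makeObjectId);
    }
    return result;
  }
  return value;
}
```

Run over each document before the insert, this turns the `$oid` wrapper into a real ObjectId instead of a plain nested object.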

Thank you

Refactor Docker Image

Right now, type checking is too hard to use with the Docker image, and aliases and helper methods are just confusing. The user should be able to:

  1. mount a complete project or part of it
  2. point to where the import data is located
  3. import all files (which may contain relative imports to helper methods or types)

To do

  • Use CLI in Mongo Seeding Docker image
  • Drop support for aliases for Docker image
  • Update the readme section on import data preparation

Provide a way to pass insertMany options

We have run into situations where we are seeding a collection that we would prefer not to drop, but it's not possible to seed data through mongo-seeding because of a duplicate key that already exists in said collection.

It would be beneficial to be able to pass the option to submit an "unordered" list of documents so that if a duplicate key error occurs, the rest of the documents will still be written, instead of the seeder stopping altogether.

Here is the documentation for the insertMany method which allows an options object to be passed in:
https://mongodb.github.io/node-mongodb-native/3.3/api/Collection.html#insertMany

This is where the options could be passed in:

return this.db.collection(collectionName).insertMany(documentsCopy);

Example if we are able to pass an insertManyOptions Object:
return this.db.collection(collectionName).insertMany(documentsCopy, insertManyOptions);
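A minimal sketch of how caller-supplied options could be merged over library defaults before reaching the driver (`resolveInsertManyOptions` is a hypothetical helper; `ordered: true` is the driver's documented default):

```javascript
// Hypothetical helper: merge user-supplied insertMany options over the
// defaults. `ordered: true` is the MongoDB driver's default, so passing
// { ordered: false } keeps writing past duplicate-key errors.
function resolveInsertManyOptions(userOptions = {}) {
  const defaults = { ordered: true };
  return { ...defaults, ...userOptions };
}
```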

Timestamps feature + tsconfig question

Hi!

First of all, a big thank you for your work on this package. We have been using it for a few months at work, and we love that we can use TypeScript, not only JSON, to seed our databases.

It would be great to have these features:

  • add a transformer to set timestamps (createdAt, updatedAt)
  • add CLI option to set the timestamps (based on the previous transformer)

I can work on a PR if you are OK with that.
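A sketch of what such a transformer could look like, assuming the same `(collection) => collection` shape as the existing Seeder transformers (the function name is hypothetical):

```javascript
// Hypothetical transformer: returns a copy of the collection in which every
// document gets createdAt/updatedAt. Document fields win over the defaults
// in the spread, so values already present are not overwritten.
const setTimestamps = (collection) => {
  const now = new Date();
  return {
    ...collection,
    documents: collection.documents.map((doc) => ({
      createdAt: now,
      updatedAt: now,
      ...doc,
    })),
  };
};
```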

Also, maybe I missed something, but I would like to use my TypeScript models inside the TypeScript data files.

I did that and it works well, except when some models use imports with path mappings. Is there a way to specify which tsconfig file to use when using the CLI?
Thanks

Deprecation warning for using connection URI

DeprecationWarning: current URL string parser is deprecated, and will be removed in a future version. To use the new parser, pass option { useNewUrlParser: true } to MongoClient.connect.

Add Support for .CJS?

When running Node 14 (or 13) and utilizing ECMAScript Modules with a field "type": "module" in the top-level package.json, this error is thrown:

/node_modules/mongo-seeding/dist/index.js:111
    const error = new Error(`${err.name}: ${err.message}`);
                  ^

Error [MongoSeedingError]: Error: Must use import to load ES Module: /data/users/default.js
require() of ES modules is not supported.
require() of /data/users/default.js from /node_modules/mongo-seeding/dist/populator/filesystem.js is an ES module file as it is a .js file whose nearest parent package.json contains "type": "module" which defines all .js files in that package scope as ES modules.
Instead rename default.js to end in .cjs, change the requiring code to use import(), or remove "type": "module" from package.json.

    at wrapError (/node_modules/mongo-seeding/dist/index.js:111:19)
    at Seeder.readCollectionsFromPath (/node_modules/mongo-seeding/dist/index.js:53:23)
    at file:///seed.js:20:28
    at ModuleJob.run (internal/modules/esm/module_job.js:110:37)
    at async Loader.import (internal/modules/esm/loader.js:164:24) {
  name: 'MongoSeedingError'
}

It isn't possible for me to remove "type": "module" as that breaks the ES6 import/export functionality. Ideally, support for import/export in the core of the library would be awesome but would be a time-consuming rewrite of the importer and other elements. In the interim, is it possible that .CJS file support could be added?

I think it should be a single line change in the /core/src/config.ts file, adding .cjs to the extensions array, but I haven't tried it myself, yet.
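If so, the change could be as small as this (the array contents below are illustrative, not copied from the actual core/src/config.ts):

```javascript
// Assumed shape of the change: add 'cjs' to the list of file extensions
// the Seeder picks up, so CommonJS data files work inside a
// "type": "module" package.
const DEFAULT_EXTENSIONS = ['ts', 'js', 'cjs', 'json'];
```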

Happy to submit a PR if this support is deemed acceptable to add.

Thanks!

Preserve stacktrace of internal error

Hello, I'm wondering what the use case is for this:

const wrapError = (err: Error) => {

The reason I ask is because, to me, it seems unnecessary and even detrimental to end-users like myself since it essentially discards the original error's stacktrace. I have some experience in dealing with errors in Node.js, perhaps I can help solve the original problem this is trying to solve?
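One possible direction, assuming Node.js >= 16.9: keep the wrapped message but attach the original error via the standard `cause` option, so the inner stack trace survives for callers and loggers:

```javascript
// Possible alternative to the current wrapError: keep the combined message,
// but attach the original error as `cause` (standard since Node.js 16.9),
// so the inner stack trace is still reachable.
const wrapError = (err) => {
  const wrapped = new Error(`${err.name}: ${err.message}`, { cause: err });
  wrapped.name = 'MongoSeedingError';
  return wrapped;
};
```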

Thank you for your open source work on this project!

Add a way to add data into multiple databases (specifically admin & app db)

Let me explain the use for this case.

Right now, I have Docker spinning up a MongoDB database. Assuming that the database is totally empty, I want to seed it using the "pkosiec/mongo-seeding" Docker image.

There's just one catch, I want to also seed mongodb users. These reside in the "admin" database, while my data is in an "app" database.

I worked around it by using a shell script to seed the db users when the mongodb docker spins up, then running the seeding docker.

It's really not ideal, though. I'd like to use this Docker image to seed both the DB users in the admin database and my application data.

Any ideas on this one?

Leaking credentials in logs

Just noticed that in the docker logs it will show this:

mongo_seed_1  | 2019-06-18T03:48:24.527Z mongo-seeding Connecting to mongodb://user:pass@mongodb:27017/db...

I'm using the Dockerfile. I recommend adding an environment variable, e.g. HIDE_CREDS=true.

When set, it would mask the user:pass like this:
mongodb://HIDDEN_USER:HIDDEN_PASS@mongodb:27017/db...

That way, anybody viewing the logs doesn't automatically have access to the user/pass.
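A sketch of the masking itself (a hypothetical helper; the regex assumes the standard mongodb:// or mongodb+srv:// URI shape):

```javascript
// Hypothetical helper: mask the credentials section of a connection URI
// before logging. URIs without credentials pass through untouched.
function maskUri(uri) {
  return uri.replace(/^(mongodb(?:\+srv)?:\/\/)[^@/]+@/, '$1***:***@');
}
```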

Error with custom import order

The import order is not correct when you start numbering at zero and eventually reach 10: "10" sorts right after "1" in lexicographic string order.
For now, I decided to start counting from 100.
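The behavior can be seen with a plain lexicographic sort; zero-padding the numeric prefixes restores the intended order:

```javascript
// Plain Array.prototype.sort compares strings character by character,
// so "10-..." sorts between "1-..." and "2-...":
const plain = ['1-users', '2-posts', '10-comments'].sort();
// Zero-padding the numeric prefix restores the intended order:
const padded = ['01-users', '02-posts', '10-comments'].sort();
```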

Docker image is too large

Because of the new way of building the Docker image, its size has doubled. From what I can see, the development dependencies are included in the image.

  • Do not include development dependencies in the image
  • Try to reduce image layers

Prepare integration test for Docker image

  • Write a separate app that tests whether the data import ended successfully (just the happy path)
  • Add Docker Compose configuration for Docker Image, MongoDB and testing app - not needed
  • Run it on CI

Description of running examples is confusing

Looks like a great library; however, I couldn't get either method of importing the example data to work. Am I doing something incorrectly?

/tmp $ git clone https://github.com/pkosiec/mongo-seeding
Cloning into 'mongo-seeding'...
remote: Counting objects: 838, done.
remote: Compressing objects: 100% (92/92), done.
remote: Total 838 (delta 63), reused 73 (delta 34), pack-reused 708
Receiving objects: 100% (838/838), 556.96 KiB | 87.00 KiB/s, done.
Resolving deltas: 100% (485/485), done.
/tmp $ cd mongo-seeding-test
/tmp/mongo-seeding-test $ cat package.json index.js
{
  "name": "mongo-seeding-test",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "author": "",
  "license": "ISC",
  "dependencies": {
    "mongo-seeding": "^2.2.0"
  }
}
const { seedDatabase } = require('mongo-seeding');

const path = require('path');

const config = {
  database: {
    host: '127.0.0.1',
    port: 27017,
    name: 'mydatabase',
  },
  inputPath: path.resolve(__dirname, '../mongo-seeding/samples/example/data'),
  dropDatabase: true,
};

(async () => {
  try {
    await seedDatabase(config);
  } catch (err) {
    // Handle errors
    console.error('there was an error:');
    console.error(err);
  }
  // Do whatever you want after successful import
})()
/tmp/mongo-seeding-test $ node index.js
there was an error:
{ MongoSeedingError: Error: Cannot find module 'mongodb'
    at wrapError (/private/tmp/mongo-seeding-test/node_modules/mongo-seeding/dist/index.js:52:19)
    at Object.<anonymous> (/private/tmp/mongo-seeding-test/node_modules/mongo-seeding/dist/index.js:44:15)
    at Generator.next (<anonymous>)
    at /private/tmp/mongo-seeding-test/node_modules/mongo-seeding/dist/index.js:7:71
    at new Promise (<anonymous>)
    at __awaiter (/private/tmp/mongo-seeding-test/node_modules/mongo-seeding/dist/index.js:3:12)
    at exports.seedDatabase (/private/tmp/mongo-seeding-test/node_modules/mongo-seeding/dist/index.js:14:43)
    at __dirname (/private/tmp/mongo-seeding-test/index.js:17:11)
    at Object.<anonymous> (/private/tmp/mongo-seeding-test/index.js:24:3)
    at Module._compile (module.js:660:30) name: 'MongoSeedingError' }
/tmp/mongo-seeding-test $ cd ..
/tmp $ cd mongo-seeding
/tmp/mongo-seeding $ cd samples/example/data/
/tmp/mongo-seeding/samples/example/data $ seed -u 'mongodb://127.0.0.1:27017/mydb' -d .
  mongo-seeding Starting... +0ms
  mongo-seeding Closing connection... +5ms
Error MongoSeedingError: Error: Cannot find module 'mongodb'

Drop Collection option?

Is there a way to drop the collections to be seeded before a new set of seed data goes in? Dropping the entire database isn't always the desired option, so a compromise in the middle would be nice.

TS example - ENOENT ERROR

I am not completely sure if this is an actual issue or just that I'm missing something, but the npm script is not working; I keep getting:

MongoSeedingError: Error: ENOENT: no such file or directory, scandir '/Users/irvingarmenta/Documents/dashb-mongo-ts/data-seed/data-seed/data.ts'
    at wrapError (/Users/irvingarmenta/Documents/dashb-mongo-ts/data-seed/node_modules/mongo-seeding/src/index.ts:125:17)
    at Seeder.readCollectionsFromPath (/Users/irvingarmenta/Documents/dashb-mongo-ts/data-seed/node_modules/mongo-seeding/src/index.ts:58:13)
    at Object.<anonymous> (/Users/irvingarmenta/Documents/dashb-mongo-ts/data-seed/index.js:13:28)
    at Module._compile (internal/modules/cjs/loader.js:777:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:788:10)
    at Module.load (internal/modules/cjs/loader.js:643:32)
    at Function.Module._load (internal/modules/cjs/loader.js:556:12)
    at Function.Module.runMain (internal/modules/cjs/loader.js:840:10)
    at internal/main/run_main_module.js:17:11

I have tried multiple file names in the index.js file, but it keeps saying that it cannot find the data file.

This is the index.js file:

require('ts-node').register();

const path = require('path');
const { Seeder } = require('mongo-seeding');

const config = {
  database: 'mongodb://localhost/dashb',
  dropDatabase: true,
};

const seeder = new Seeder(config);

const collections = seeder.readCollectionsFromPath(
  path.resolve('./data-seed/data.ts'),
  {
    extensions: ['js', 'json', 'ts'],
    transformers: [Seeder.Transformers.replaceDocumentIdWithUnderscoreId],
  },
);

seeder
  .import(collections)
  .then(() => {
    console.log('Success');
  })
  .catch(err => {
    console.log('Error', err);
  });

and this is the data.ts file that is in the same directory as index.js

import { getObjectId } from './helpers';
import { seedUser } from '../src/api/methods';

const names = ["Hanamichi", "Rukawa", "Haruko", "Akagi", "Mitsuki"];

const users: seedUser[] = names.map((name, i) => {
  return {
    id: getObjectId(name),
    name,
    email: `${name}@email.com`,
    password: `password${i}`,
    role: 'user'
  }
});

export = users;

I tried following the TS example. Any ideas why I keep getting ENOENT?

my directory tree is:

project folder > 
         src 
         data-seed 
         // other stuff

data-seed has its own package.json, and the main project folder has its own package.json too.

Rework core API

In the near future, I would like to rewrite the Mongo Seeding core. This may introduce some breaking changes, as I want to support additional features without making the API too complex.

While rethinking API, I want to consider the following abilities:

  • seed multiple databases with different users (#73)
  • use own MongoDB client
  • minimize custom code as much as possible (for example, by migrating to the built-in reconnect feature in the Mongo client - #97)
  • handle .mjs files (#105)

Silence logger

Hello!

We are using this library at our company, mostly for seeding the database before each test. It works great, but the logging is a real problem for us. It may be useful for one-time seeding, but since every test generates such logs, our test output is bloated.

Could you please either implement a way to silence logging, or point out how it should properly be done so I can prepare a PR? I looked into the code and have at least a few ideas for how to make the logger configurable (since it's just a sample function now), so perhaps we could discuss that.
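One possible direction, sketched with a hypothetical factory rather than the real Seeder API: accept a log function in the config, defaulting to the current behavior, so test suites can inject a no-op:

```javascript
// Sketch only (not the real Seeder API): take a log function in the config,
// defaulting to console.log, so callers can silence it per instance.
function createSeeder({ log = console.log } = {}) {
  return {
    import(collections) {
      log(`Importing ${collections.length} collections...`);
      // ...actual import work would go here...
      return collections.length;
    },
  };
}

// Silent seeder for tests: inject a no-op logger.
const silentSeeder = createSeeder({ log: () => {} });
```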

Prevent duplicates while seeding

Like other ORM-based seeders, it would be awesome if seeding runs were recorded in a collection to prevent duplicates within the seeded collections. For example, Sequelize (a MySQL ORM) stores timestamps and file names in a separate table for that purpose.

Keep up the good work!

Update DB URI Parsing from CLI Params to Allow mongoDB+srv protocol

I am deploying a Mongo instance using MongoDB Atlas. In one of the newer driver versions, a new protocol was added to allow connecting to a Mongo cluster without having to specify all hosts in the cluster; details may be found in the MongoDB 3.6 docs. The mongo-seeding-cli works fine with the new protocol if the user specifies the connection address as a DB URI string. However, the user is unable to specify individual parameters instead, because the core mongo-seeding library automatically adds a port to the connection string, which is not allowed when connecting using the mongodb+srv protocol.

mongo-seeding Connecting to mongodb+srv://<myAdmin>:<myPassword>@runewordfinder-0ltsc.mongodb.net:27017/runewordfinder... +9ms
(node:73572) DeprecationWarning: current URL string parser is deprecated, and will be removed in a future version. To use the new parser, pass option { useNewUrlParser: true } to MongoClient.connect.
  mongo-seeding Ports not accepted with `mongodb+srv` URIs

I would like to be able to specify the individual parameters so I may hide my password as an environment variable. I propose that core/src/database/database-connector.ts#getDbConnectionUri(...) is updated to check the provided protocol and create an appropriate string that:

  • Includes the port if protocol === 'mongodb'
  • Does not include the port if protocol === 'mongodb+srv'
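The check above could be sketched like this (a simplified, hypothetical `getDbConnectionUri` that ignores credentials and options):

```javascript
// Simplified, hypothetical getDbConnectionUri: credentials and options are
// omitted; the point is that the port is skipped for mongodb+srv, which
// rejects ports in its URIs.
function getDbConnectionUri({ protocol, host, port, name }) {
  const portPart = protocol === 'mongodb+srv' ? '' : `:${port}`;
  return `${protocol}://${host}${portPart}/${name}`;
}
```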

Generic type 'InsertWriteOpResult<TSchema>' requires 1 type argument(s).

I am using mongo-seeding version 3.3. When I try to build the project, it gives me the error below:

-[email protected] serve /app
npm run build && tsc && node lib/server.js
[email protected] build /app
babel src --out-dir lib --extensions ".ts,.tsx"
Successfully compiled 85 files with Babel.
node_modules/mongo-seeding/dist/database/database.d.ts(19,94): error TS2314: Generic type 'InsertWriteOpResult' requires 1 type argument(s).

A day ago everything was working fine; without modifying any code, I started to get this error.
I think it is saying we have to pass a default type for TSchema.

Include helpers in mongo-seeding package

It would be nice to include the helpers https://github.com/pkosiec/mongo-seeding/blob/master/examples/import-data/example/helpers/index.js in the package itself.

import { getObjetById } from "@mongo-seeding/helpers";

so instead of writing

const { getObjectId, getObjectIds } = require('../../helpers/index');

const posts = [
  {
    id: getObjectId('post1'),
    categoriesIds: getObjectIds(['Cats', 'Dogs']),
    title: 'Lorem ipsum',
    description: 'Sample Post 1 description',
   ...
  },
   ...
];

one can write

const { getObjectId, getObjectIds } = require('mongo-seeding');

const posts = [
  {
    id: getObjectId('post1'),
    categoriesIds: getObjectIds(['Cats', 'Dogs']),
    title: 'Lorem ipsum',
    description: 'Sample Post 1 description',
   ...
  },
   ...
];

Change API of Mongo Seeding JS library

Releasing v3 is a great opportunity to make another breaking change. Currently, the library forces users to load collections from files (using fs, which restricts it to back-end Node.js apps). Here's the current API:

const path = require('path');

const config = {
  database: {
    protocol: 'mongodb',
    host: '127.0.0.1',
    port: 27017,
    name: 'database',
    username: undefined,
    password: undefined,
  },
  databaseConnectionUri: undefined,
  inputPath: resolve(__dirname, '../../data'), // input directory with import data structure
  dropDatabase: false, // drops entire database before import
  dropCollection: false, // drops collection before importing it
  replaceIdWithUnderscoreId: false, // rewrites `id` property to `_id` for every document
  supportedExtensions: ['json', 'js'], // files that should be imported
  reconnectTimeoutInSeconds: 10, // maximum time of waiting for successful MongoDB connection
};

await seedDatabase(config);

Basically, the idea would be to import the populator module only when it's needed. I think it would be a big improvement to change the API to something like this:

    import { Seeder } from "mongo-seeding";

    const seeder = new Seeder({
      database: {
        protocol: 'mongodb',
        host: '127.0.0.1',
        port: 27017,
        name: 'database',
        username: undefined,
        password: undefined,
      },
      databaseConnectionUri: undefined,
      reconnectTimeoutInSeconds: 10,
    });
    const collections = seeder.populateCollectionsFromPath("/some/path/", {
      extensions: ['json', 'js'],
      transformers: [
        Seeder.Transformers.replaceIdWithUnderscoreId,
        (collection) => (console.log("collection", collection)),
      ]
    });

    await seeder.import(collections, {
      dropDatabase: false,
      dropEveryCollection: false,
    });

Of course, the naming in the example above is provisional and needs to be polished.

Mongo Connection Timeouts because of default DEFAULT_CLIENT_OPTIONS

Version: 3.4.0

Hi, great library!
I'm getting a connection timeout error. Inspecting a little more, I found out that it is because the current DEFAULT_CLIENT_OPTIONS for the DatabaseConnector are as follows:

DatabaseConnector.DEFAULT_CLIENT_OPTIONS = {
    ignoreUndefined: true,
    useNewUrlParser: true,
    useUnifiedTopology: true,
    connectTimeoutMS: 1,
};

where connectTimeoutMS is 1 millisecond, therefore causing a timeout every time.

I've tried setting the mongoClientOptions in the Seeder Config but it is still using the default options when constructing the DatabaseConnector.

Is there any workaround for this in the current version?
Thanks!

Incorrect links in NPM readmes

About the npm readme (screenshot from 2018-10-11, 14:36):

The link to the tutorial is wrong
https://github.com/pkosiec/mongo-seeding/blob/docs/import-data-definition.md -> 404
correct link:
https://github.com/pkosiec/mongo-seeding/blob/master/docs/import-data-definition.md

Handle `export default` in import data files

I just ran into this issue where all the added data would show up under a single document (with a single ID) in MongoDB. After about 30 minutes of tinkering, trying stuff, and digging through the source code, I noticed that I had exported my data using export default instead of export = [].

Why does export default result in this module treating the whole array as a single document? export = [] doesn't really seem like a pattern many libraries use or require.

‼️ Wrong way

export default [
  { name: "My First Item" },
  { name: "My Second Item" },
  { name: "My Third Item" },
  { name: "My Last Item" }
];

✅ Right way

export = [
  { name: "My First Item" },
  { name: "My Second Item" },
  { name: "My Third Item" },
  { name: "My Last Item" }
];

Suggestion

Maybe it's safe to assume that when the user exports an array, they want it to be imported as separate documents in their database?
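A sketch of the suggested unwrapping (hypothetical helper; TypeScript compiles `export default` to an object with a `default` property, typically alongside `__esModule: true`, while `export =` yields the array itself):

```javascript
// Hypothetical unwrapping step for loaded data files: a compiled
// `export default [...]` arrives as { default: [...] }, while
// `export = [...]` arrives as the array itself. Unwrapping makes
// both export styles behave the same.
function unwrapDefaultExport(moduleExports) {
  if (
    moduleExports !== null &&
    typeof moduleExports === 'object' &&
    !Array.isArray(moduleExports) &&
    'default' in moduleExports
  ) {
    return moduleExports.default;
  }
  return moduleExports;
}
```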

Overall, a very useful tool that saves me a bunch of boilerplate code. 😄

Connection string interpreted literally

First of all, great library! This is such a common use case that I can't believe there aren't go-to solutions out there. Hopefully this library becomes one.

I noticed that when using a connection string, e.g. /test?retryWrites=true, it should reference an existing database named test in mongo, but instead results in a new database being created with the name /test?retryWrites=true. Seems like the params from the ? on are being interpreted literally. Don't know if it's an issue with this library or a dependency.
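For what it's worth, Node's WHATWG `URL` parser accepts mongodb:// URIs, so the database name could be split from the query string instead of being taken literally (a hypothetical helper, not the library's actual parsing code):

```javascript
// Node's WHATWG URL parser handles mongodb:// URIs, so the database name
// can be read from the path while query parameters stay separate.
function getDatabaseName(uri) {
  return new URL(uri).pathname.replace(/^\//, '');
}
```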

Improve examples

This is the outcome of the discussion from #19. Currently, there is a sample data directory without basic instructions on how to import the data (listing just a few commands is not enough).

To do

  • Rename samples directory to examples
  • Write detailed instructions on how to run the samples (with all details, like cloning the repository, etc.)
  • Add example with basic app utilizing mongo-seeding library.

Refactor core integration tests

Currently during integration tests, sample data files are created in temporary directories. When tests fail, sometimes they are not removed. Create static sample data and do not create/remove any files during tests.

Add DB URI Options Parameter and env variable

Add another parameter that enables the user to provide options for the DB URI without needing to use a custom DB_URI.

For example, the following environment variable should append ?replicaSet=test&connectTimeoutMS=300000 to the end of the constructed DB URI:
DB_OPTIONS='replicaSet=test;connectTimeoutMS=300000'
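A sketch of the proposed conversion (hypothetical helper name; the semicolon-separated format is the one proposed above):

```javascript
// Hypothetical conversion of the proposed DB_OPTIONS format
// ('key=value;key=value') into a query string appended to the DB URI.
function buildOptionsQueryString(dbOptions) {
  if (!dbOptions) {
    return '';
  }
  return '?' + dbOptions.split(';').join('&');
}
```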

See #18 for discussion regarding this feature.

Code coverage reporting is broken for PRs

For some time now, Codecov hasn't reported code coverage for PRs.

Detect the CI environment correctly for PRs and branches, and make sure all reports on master are reported correctly.

Allow passing custom MongoDB client options

Currently, the MongoClient options used while connecting to the DB cannot be changed:

  static CLIENT_OPTIONS: MongoClientOptions = {
    ignoreUndefined: true,
    useNewUrlParser: true,
    useUnifiedTopology: true,
  };

Similarly to #99, provide a way to pass these options.
