
Comments (6)

serpro69 commented on May 24, 2024

Hi @mboisnard ,

Thank you for the suggestion.
This is something that's been in the back of my mind for a while now :) Along with generating csv, json, ...

If you'd like to work on this, please let me know. Otherwise I'll try to prioritize it myself in the near future so it's available in the next release.

UPD:
Oh wait, now that I looked at the links you've provided, I think I misunderstood what you meant by a "Database Provider" :D
I was thinking of an interface that can be used to automatically populate a db table, for example.

As for the Database Provider that you meant: I can see that it could be useful, definitely. But what they have in fakerjs seems quite specific and narrow, which I don't think is good enough for a more "generic use-case".
For example, if we take database collation, which db implementation are we talking about? The implementation in fakerjs doesn't seem to take that into account (e.g. https://github.com/faker-js/faker/blob/c1caa900ceb12737a3aa45b7e4dd75797a11a889/src/locales/base/database/collation.ts )
Column data types also vary from one db to another.
Postgres doesn't have a storage engine like mysql, for example.
And so on.

If you'd like to provide a "Database Provider" yml file that contains such information for various database implementations - please do so and I'll be happy to include this :)

For example, this is what ChatGPT gave me:

postgresql:
  column:
    - id
    - name
    - email
    - created_at
    - updated_at
  type:
    - INTEGER
    - BIGINT
    - DECIMAL
    - NUMERIC
    - REAL
    - DOUBLE PRECISION
    - SERIAL
    - BIGSERIAL
    - CHAR
    - VARCHAR
    - TEXT
    - DATE
    - TIMESTAMP
    - TIMESTAMP WITH TIME ZONE
    - BOOLEAN
    - JSON
    - JSONB
    - BYTEA
    - ARRAY
    - UUID
    - ENUM
  engine: []
  collation:
    - "en_US.UTF-8"
    - "en_GB.UTF-8"
    - "de_DE.UTF-8"
    - "fr_FR.UTF-8"

mysql:
  column:
    - id
    - name
    - email
    - created_at
    - updated_at
  type:
    - INT
    - BIGINT
    - DECIMAL
    - FLOAT
    - DOUBLE
    - CHAR
    - VARCHAR
    - TEXT
    - DATE
    - DATETIME
    - TIMESTAMP
    - TIME
    - YEAR
    - BOOLEAN
    - JSON
    - BINARY
    - VARBINARY
    - BLOB
    - ENUM
    - SET
  engine:
    - InnoDB
    - MyISAM
    - MEMORY
    - CSV
    - ARCHIVE
    - BLACKHOLE
    - MERGE
    - FEDERATED
  collation:
    - "utf8mb4_general_ci"
    - "utf8mb4_unicode_ci"
    - "latin1_swedish_ci"
    - "latin1_general_ci"

mariadb:
  column:
    - id
    - name
    - email
    - created_at
    - updated_at
  type:
    - INT
    - BIGINT
    - DECIMAL
    - FLOAT
    - DOUBLE
    - CHAR
    - VARCHAR
    - TEXT
    - DATE
    - DATETIME
    - TIMESTAMP
    - TIME
    - YEAR
    - BOOLEAN
    - JSON
    - BINARY
    - VARBINARY
    - BLOB
    - ENUM
    - SET
  engine:
    - InnoDB
    - MyISAM
    - Aria
    - MEMORY
    - CSV
    - ARCHIVE
    - BLACKHOLE
    - MERGE
    - FEDERATED
    - TokuDB
    - Spider
  collation:
    - "utf8mb4_general_ci"
    - "utf8mb4_unicode_ci"
    - "latin1_swedish_ci"
    - "latin1_general_ci"

Is it comprehensive and accurate enough? I'm really not sure :D It's a start, though.

Additionally, just in case you have a very specific use-case, I'd recommend taking a look at the "creating your own data providers" docs. This functionality has been available since version 2.0.0-rc.1 and allows you to extend the faker implementation and create your own data providers ;)

I'll still keep this issue open in case you or anyone else wants to work on this. Seems like a good "first issue" :)

from kotlin-faker.

mboisnard commented on May 24, 2024

Hello @serpro69 , thanks for your answer.

Yes, I was actually talking about the same behavior as faker-js, and I completely agree with you that the current implementation is generic and can be improved to match the possible data for each database.

I will take a look at your documentation, and try to contribute to the project :)


serpro69 commented on May 24, 2024

Contributions are always welcome :) Thanks!

I think this https://github.com/serpro69/kotlin-faker/blob/master/CONTRIBUTING.adoc#adding-new-functionality should help with the implementation of this issue. But also feel free to ask if you need any help.

As I mentioned, the bigger part of the task here would be to gather the data itself. After that you should be able to follow the above documentation to add a new data provider implementation; if something is unclear there, please let me know. I'd also like to improve the contributing guidelines if they're not good enough.

Just a few suggestions also:

  • For collation it could be impractical to include all possible values in the .yml file. What we could do instead is take the locale value from the faker's configuration and use it to "construct" possible collation values. E.g. for postgres we'd probably only need to append .UTF-8 to the locale string. For mysql/mariadb some "conversion logic" from locale to collation would probably be needed. For the other db types I don't know yet; we'd need to check what the possible values are and how to return them in a nice way.
  • For columns I'm not entirely sure what's a good "list of common column names" or what the use-case even is here. Feel free to submit some proposals from your end :) Also it doesn't need to be a separate property for each db type, since the values will be the same, I guess.
  • For type and engine (where applicable), they can be added to the .yml directly. I think this would be the easiest approach for these two properties.
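
The collation idea from the first bullet could be sketched roughly like this. The function names and the small lookup table are hypothetical and not part of kotlin-faker, and the mysql mapping only illustrates what the "conversion logic" might look like; the real values would need to be checked against each database's documentation.

```kotlin
// Hypothetical helpers sketching "construct the collation from the locale".
// Nothing here is an existing kotlin-faker API.

/** Postgres: normalise the locale ("en-US" -> "en_US") and append ".UTF-8". */
fun postgresCollation(locale: String): String =
    locale.replace('-', '_') + ".UTF-8"

// Illustrative and deliberately incomplete mapping from a language code
// to a mysql/mariadb collation name.
val mysqlCollationByLanguage: Map<String, String> = mapOf(
    "sv" to "utf8mb4_swedish_ci",
    "de" to "utf8mb4_german2_ci",
)

/** mysql/mariadb: look up the language part of the locale, with a generic fallback. */
fun mysqlCollation(locale: String): String {
    // "sv-SE" and "sv_SE" both reduce to the language code "sv"
    val language = locale.substringBefore('-').substringBefore('_').lowercase()
    return mysqlCollationByLanguage[language] ?: "utf8mb4_general_ci"
}
```

With this approach the .yml file would only need to carry type and engine, while collation would follow whatever locale the faker instance is configured with.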


mboisnard commented on May 24, 2024

Thx for your suggestions, I created a branch to implement the databases behavior and I have several questions for you :)

For the MongoDB provider I would like to create a generateObjectId method based on a random date and inspired by the logic I found in JS here (https://steveridout.com/mongo-object-time/)

  • MongoDB Provider is not based on a yaml file, so I would like to implement the AbstractFakeDataProvider class just like the StringProvider for example in the databases gradle module I just created. The AbstractFakeDataProvider class is marked as internal, is it intentional or have you not yet had the need?

  • To be able to generate an objectId I would like to add a new method in the RandomService to generate an OffsetDateTime that can be used by anyone and by the MongoDB Provider. Can we access the RandomService from a provider? (I just removed the internal protection in FakerService for this field to make it work on my branch)
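
To make the generateObjectId idea concrete, here is a minimal sketch, assuming the usual ObjectId layout of a 4-byte timestamp in seconds followed by 8 further bytes (a 5-byte random value and a 3-byte counter), all hex-encoded to 24 characters. The function signature is an assumption for illustration only; it uses kotlin.random.Random rather than the RandomService, and it simply fills the trailing 8 bytes with random data.

```kotlin
import java.time.OffsetDateTime
import kotlin.random.Random

// Hypothetical sketch, not the branch's actual implementation.
fun generateObjectId(date: OffsetDateTime, random: Random = Random.Default): String {
    val seconds = date.toEpochSecond()
    // The timestamp part of an ObjectId is an unsigned 32-bit number of seconds.
    require(seconds in 0..0xFFFFFFFFL) { "date is outside the ObjectId timestamp range" }

    // First 4 bytes: timestamp as 8 zero-padded hex characters.
    val timestampHex = seconds.toString(16).padStart(8, '0')

    // Remaining 8 bytes: random here (a real driver uses 5 random bytes + a 3-byte counter).
    val tailHex = random.nextBytes(ByteArray(8))
        .joinToString("") { byte -> (byte.toInt() and 0xFF).toString(16).padStart(2, '0') }

    return timestampHex + tailHex
}
```

Passing the random date in as a parameter keeps the function independent of how that date is produced, so it would work the same whether the OffsetDateTime comes from a new RandomService method or from somewhere else.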


serpro69 commented on May 24, 2024

Hey @mboisnard ,

Let me give you some existing code examples to make things easier to understand.

  • Creating a new data provider that is not yaml-based outside of "core faker" is not supported. I'm not sure it makes much sense either to expose those things. Seems like a very specific use-case.

    • What I could suggest instead is having one DatabaseProvider implementation, which contains both common functionality, as well as specifics for the various <DatabaseType>Providers accessible via additional property (take a look at for example)
    • This, however, doesn't solve the part that "MongoDbProvider" isn't going to implement YamlFakeDataProvider . If that is intentional, and you only want to have this one function generateObjectId for the mongo-db provider, I can think of two ways:
      • Place the function in the DatabaseProvider instead and name it mongoDbObjectId, for example. This way you will have DatabaseProvider based on yaml, but you can also have functions inside it that don't use data from yaml files.
      • Kind of hacky, but you can just omit this part in the MongoDbProvider and still inherit from YamlFakeDataProvider.
      • I think the latter would work (haven't tried it myself though), but I'd go with the former as it's cleaner and it's perfectly fine to have functions that read from yml and that don't in the same provider implementation (see e.g. Internet#iPv4Address -
        fun iPv4Address() = List(4) { fakerService.randomService.nextInt(0, 255) }
            .joinToString(".")
        which is a custom function not based on yml-data, but is inside a YamlFakeDataProvider implementation class)
  • To get access to RandomService from a data provider implementation, you can use this as an example:
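
Separately from the examples referred to above, the layout suggested in the first bullet (one provider with common, data-backed functions plus a db-specific sub-provider reachable through a property) could look roughly like the sketch below. Every class and function name in it is a hypothetical stand-in, not a kotlin-faker class, and a plain map takes the place of the yml data.

```kotlin
import kotlin.random.Random

// Hypothetical stand-ins only; none of these are kotlin-faker classes.

/** Sub-provider for MongoDB specifics; needs no yml data at all. */
class MongoDb(private val random: Random) {
    // Simplified: 24 random hex characters, without a real timestamp prefix.
    fun objectId(): String =
        List(24) { "0123456789abcdef".random(random) }.joinToString("")
}

/** Main provider: data-backed functions and a sub-provider in one place. */
class Database(
    private val types: Map<String, List<String>>, // stands in for the yml content
    private val random: Random = Random.Default,
) {
    // db-specific functionality exposed via an additional property
    val mongoDb = MongoDb(random)

    // Function backed by the (yml-like) data
    fun type(db: String): String = types.getValue(db).random(random)

    // Function that does not touch the data at all, living in the same provider
    fun port(): Int = random.nextInt(1024, 65536)
}
```

A caller would then write something like `database.type("mysql")` for the data-backed values and `database.mongoDb.objectId()` for the MongoDB specifics.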


serpro69 commented on May 24, 2024

Don't know if the above made much sense 😁 Feel free to ask if you want me to clarify something further :)

