Comments (6)
Hi @mboisnard ,
Thank you for the suggestion.
This is something that's been on the back of my mind for a while now :) Along with generating csv, json, ...
If you'd like to work on this, please let me know. Otherwise I'll try to prioritize it myself in the near future to make it available for the next release.
UPD:
Oh wait, now that I looked at the links you've provided, I think I misunderstood what you meant by a "Database Provider" :D
I was thinking of an interface that can be used to automatically populate a db table, for example.
As for the Database Provider that you meant: I can see that it could be useful, definitely. But what they have in fakerjs seems quite specific and narrow, which I don't think is good enough for a more "generic use-case".
For example, if we take database collation, which db implementation are we talking about? The implementation in fakerjs doesn't seem to take that into account (e.g. https://github.com/faker-js/faker/blob/c1caa900ceb12737a3aa45b7e4dd75797a11a889/src/locales/base/database/collation.ts )
Column data types also vary from one db to another.
Postgres doesn't have a storage engine like mysql, for example.
And so on.
If you'd like to provide a "Database Provider" yml file that contains such information for various database implementations - please do so and I'll be happy to include this :)
For example, this is what chatgpt gave me:
postgresql:
  column:
    - id
    - name
    - email
    - created_at
    - updated_at
  type:
    - INTEGER
    - BIGINT
    - DECIMAL
    - NUMERIC
    - REAL
    - DOUBLE PRECISION
    - SERIAL
    - BIGSERIAL
    - CHAR
    - VARCHAR
    - TEXT
    - DATE
    - TIMESTAMP
    - TIMESTAMP WITH TIME ZONE
    - BOOLEAN
    - JSON
    - JSONB
    - BYTEA
    - ARRAY
    - UUID
    - ENUM
  engine: []
  collation:
    - "en_US.UTF-8"
    - "en_GB.UTF-8"
    - "de_DE.UTF-8"
    - "fr_FR.UTF-8"
mysql:
  column:
    - id
    - name
    - email
    - created_at
    - updated_at
  type:
    - INT
    - BIGINT
    - DECIMAL
    - FLOAT
    - DOUBLE
    - CHAR
    - VARCHAR
    - TEXT
    - DATE
    - DATETIME
    - TIMESTAMP
    - TIME
    - YEAR
    - BOOLEAN
    - JSON
    - BINARY
    - VARBINARY
    - BLOB
    - ENUM
    - SET
  engine:
    - InnoDB
    - MyISAM
    - MEMORY
    - CSV
    - ARCHIVE
    - BLACKHOLE
    - MERGE
    - FEDERATED
  collation:
    - "utf8mb4_general_ci"
    - "utf8mb4_unicode_ci"
    - "latin1_swedish_ci"
    - "latin1_general_ci"
mariadb:
  column:
    - id
    - name
    - email
    - created_at
    - updated_at
  type:
    - INT
    - BIGINT
    - DECIMAL
    - FLOAT
    - DOUBLE
    - CHAR
    - VARCHAR
    - TEXT
    - DATE
    - DATETIME
    - TIMESTAMP
    - TIME
    - YEAR
    - BOOLEAN
    - JSON
    - BINARY
    - VARBINARY
    - BLOB
    - ENUM
    - SET
  engine:
    - InnoDB
    - MyISAM
    - Aria
    - MEMORY
    - CSV
    - ARCHIVE
    - BLACKHOLE
    - MERGE
    - FEDERATED
    - TokuDB
    - Spider
  collation:
    - "utf8mb4_general_ci"
    - "utf8mb4_unicode_ci"
    - "latin1_swedish_ci"
    - "latin1_general_ci"
Is it comprehensive and accurate enough? I'm really not sure :D It's a start, though I don't know if it's good enough, so to speak.
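To make the shape of that data a bit more concrete, here is a minimal Kotlin sketch of how per-database values like the ones above might be looked up at random. Everything in it is an assumption for illustration: DatabaseDictionary, dictionaries and DatabaseLookup are hypothetical names, the values are a small hand-copied subset of the yml, and none of this is kotlin-faker's actual API.

```kotlin
import kotlin.random.Random

// Hypothetical in-memory mirror of one per-database section of the yml.
data class DatabaseDictionary(
    val columns: List<String>,
    val types: List<String>,
    val engines: List<String>, // empty for databases without storage engines
    val collations: List<String>,
)

// Small hand-copied subset of the yml values, for illustration only.
val dictionaries: Map<String, DatabaseDictionary> = mapOf(
    "postgresql" to DatabaseDictionary(
        columns = listOf("id", "name", "email", "created_at", "updated_at"),
        types = listOf("INTEGER", "BIGINT", "TEXT", "JSONB", "UUID"),
        engines = emptyList(),
        collations = listOf("en_US.UTF-8", "de_DE.UTF-8"),
    ),
    "mysql" to DatabaseDictionary(
        columns = listOf("id", "name", "email", "created_at", "updated_at"),
        types = listOf("INT", "BIGINT", "VARCHAR", "DATETIME", "JSON"),
        engines = listOf("InnoDB", "MyISAM", "MEMORY"),
        collations = listOf("utf8mb4_general_ci", "utf8mb4_unicode_ci"),
    ),
)

// Hypothetical lookup helper: picks a random value for a given database key.
class DatabaseLookup(private val random: Random = Random.Default) {
    fun column(db: String): String = dictionary(db).columns.random(random)
    fun type(db: String): String = dictionary(db).types.random(random)
    fun collation(db: String): String = dictionary(db).collations.random(random)

    // Returns null when the database has no storage engines (e.g. postgresql).
    fun engine(db: String): String? = dictionary(db).engines.randomOrNull(random)

    private fun dictionary(db: String): DatabaseDictionary =
        dictionaries[db] ?: throw IllegalArgumentException("Unknown database: $db")
}
```

Returning null for the engine of a database that has none is just one option; throwing, or not exposing the function for that database at all, would work as well.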
Additionally, just in case you have a very specific use-case, I'd recommend taking a look at the "creating your own data providers" docs. This functionality has been available since version 2.0.0-rc.1 and allows you to extend the faker implementation and create your own data providers ;)
I'll still keep this issue open in case you or anyone else wants to work on this. Seems like a good "first issue" :)
from kotlin-faker.
Hello @serpro69 , thanks for your answer.
Yes, actually I was talking about the same behavior as faker-js, and I completely agree with you that the current implementation is generic and can be improved to match the possible data for each database.
I will take a look at your documentation, and try to contribute to the project :)
Contributions are always welcome :) Thanks!
I think this https://github.com/serpro69/kotlin-faker/blob/master/CONTRIBUTING.adoc#adding-new-functionality should help with the implementation of this issue. But also feel free to ask if you need any help.
As I mentioned, the bigger part of the task here would be to gather the data itself. After that you should be able to follow the above documentation to add a new data provider implementation; but if something is unclear there - please let me know. I'd like to improve the contributing guidelines also if they're not good enough.
Just a few suggestions also:
- For collation it could be impractical to include all possible values in the .yml file. What we could do instead is use the locale value from the faker's configuration, and using that "construct" possible collation values. E.g. for postgres we'd probably only need to append .UTF-8 to the locale string. For mysql/mariadb some "conversion logic" from locale to collation would probably be needed. The other db types IDK, would need to check what are the possible values there and how to return them in a nice way.
- For columns I'm not entirely sure what's a good "list of common column names" or what is even the use-case here. Feel free to submit some proposals from your end :) Also it doesn't need to be a separate property for each db type, since the values will be the same I guess.
- For type and engine (where applicable), they can be added to the .yml directly. I think this would be the easiest approach for these two properties.
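The collation idea above could be sketched roughly like this in Kotlin. It is only an illustration under assumptions: normalizeLocale, postgresCollation, mysqlCollation and the lookup table are hypothetical names, the locale is assumed to arrive as a string such as "en-US" or "de_DE", and the MySQL/MariaDB mapping is a tiny, incomplete example of the "conversion logic" rather than a vetted list.

```kotlin
// Hypothetical helpers for illustration; not part of kotlin-faker.

/** Converts a locale string such as "en-US" or "en_us" into the "en_US" form. */
fun normalizeLocale(locale: String): String {
    val parts = locale.split('-', '_')
    val language = parts[0].lowercase()
    val country = parts.getOrNull(1)?.uppercase()
    return if (country == null) language else "${language}_$country"
}

/**
 * Postgres-style collation: the normalized locale with ".UTF-8" appended.
 * A locale without a country part yields e.g. "en.UTF-8", which may not
 * exist on a real server.
 */
fun postgresCollation(locale: String): String = normalizeLocale(locale) + ".UTF-8"

// Illustrative, incomplete mapping from language code to a utf8mb4 collation.
val mysqlCollationByLanguage: Map<String, String> = mapOf(
    "de" to "utf8mb4_german2_ci",
    "sv" to "utf8mb4_swedish_ci",
    "es" to "utf8mb4_spanish_ci",
)

/** MySQL/MariaDB-style collation, falling back to a general-purpose one. */
fun mysqlCollation(locale: String): String {
    val language = normalizeLocale(locale).substringBefore('_')
    return mysqlCollationByLanguage[language] ?: "utf8mb4_unicode_ci"
}
```

The fallback to utf8mb4_unicode_ci is a design choice of this sketch; a real implementation might prefer to fail for locales it has no mapping for.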
Thx for your suggestions, I created a branch to implement the databases behavior and I have several questions for you :)
For the MongoDB provider I would like to create a generateObjectId method based on a random date and inspired by the logic I found in JS here (https://steveridout.com/mongo-object-time/)
- MongoDB Provider is not based on a yaml file, so I would like to implement the AbstractFakeDataProvider class, just like the StringProvider for example, in the databases gradle module I just created. The AbstractFakeDataProvider class is marked as internal; is it intentional or have you not yet had the need?
- To be able to generate an objectId I would like to add a new method in the RandomService to generate an OffsetDateTime that can be used by anyone and by the MongoDB Provider. Can we access the RandomService from a provider? (I just removed the internal protection in FakerService for this field to make it work on my branch)
Hey @mboisnard ,
Let me give you some existing code examples to make things easier to understand.
- Creating a new data provider that is not yaml-based outside of "core faker" is not supported. I'm not sure it makes much sense to expose those things either. Seems like a very specific use-case.
  - What I could suggest instead is having one DatabaseProvider implementation, which contains both common functionality, as well as specifics for the various <DatabaseType>Providers accessible via an additional property (take a look at for example)
  - This, however, doesn't solve the part that "MongoDbProvider" isn't going to implement YamlFakeDataProvider. If that is intentional, and you only want to have this one function generateObjectId for the mongo-db provider, I can think of two ways:
    - Place the function in the DatabaseProvider instead and name it mongoDbObjectId, for example. This way you will have a DatabaseProvider based on yaml, but you can also have functions inside it that don't use data from yaml files.
    - Kind of hacky, but you can just omit this part in the MongoDbProvider YamlFakeDataProvider.
    - I think the latter would work (haven't tried it myself though), but I'd go with the former as it's cleaner, and it's perfectly fine to have functions that read from yml and functions that don't in the same provider implementation (see e.g. Internet#iPv4Address - YamlFakeDataProvider implementation class)
- To get access to RandomService from a data provider implementation, you can use this as an example:
  - First add it as a constructor parameter -
  - Then in the faker, you can use the randomService property that is available from the AbstractFaker -
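To illustrate the shape suggested above (one provider with shared data, database-specific parts behind a property, a function that doesn't read from yml, and a random service passed in through the constructor), here is a simplified, self-contained Kotlin sketch. All classes in it (RandomSource, MySqlValues, DatabaseValues) are hypothetical stand-ins and deliberately do not use kotlin-faker's real YamlFakeDataProvider, RandomService or AbstractFaker types, whose exact signatures are not shown in this thread.

```kotlin
import kotlin.random.Random

// Hypothetical stand-ins for illustration; not kotlin-faker's actual classes.

/** Stand-in for a random service that is shared by providers. */
class RandomSource(private val random: Random) {
    fun <T> pick(values: List<T>): T = values.random(random)

    fun hex(length: Int): String =
        (1..length).map { "0123456789abcdef".random(random) }.joinToString("")
}

/** Sub-provider with database-specific values (these would come from yml). */
class MySqlValues(private val randomSource: RandomSource) {
    private val engines = listOf("InnoDB", "MyISAM", "MEMORY")

    fun engine(): String = randomSource.pick(engines)
}

/** One provider: shared data, a sub-provider property, and a non-yml function. */
class DatabaseValues(private val randomSource: RandomSource) {
    private val columns = listOf("id", "name", "email", "created_at", "updated_at")

    /** Database-specific functions are reachable through a property. */
    val mysql = MySqlValues(randomSource)

    /** Backed by list data, standing in for values read from yml. */
    fun column(): String = randomSource.pick(columns)

    /** Not backed by yml data: built only from the injected random source. */
    fun mongoDbObjectId(): String = randomSource.hex(24)
}
```

Usage would then look like DatabaseValues(RandomSource(Random(1))).mysql.engine(), with mongoDbObjectId() living next to the data-backed functions on the same provider.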
Don't know if the above made much sense 😁 Feel free to ask if you want me to clarify something further :)