muriloxyz / dataset-split
Splits your dataset folder into train/validation/test subfolders. Useful for non-csv data.
License: MIT License
This functionality will allow the user to undo the split, rebuilding the original dataset with all of its original items.
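A minimal sketch of what such an undo could look like, assuming the split only moved items into subfolders named train, validation and test directly under the dataset folder. The function name `undo_split` and its parameters are hypothetical and are not taken from the repository.

```python
import shutil
from pathlib import Path


def undo_split(dataset_dir, subsets=("train", "validation", "test")):
    """Move every item out of the split subfolders back into dataset_dir.

    Hypothetical helper: the subset names are an assumption about how
    the split was laid out on disk.
    """
    root = Path(dataset_dir)
    restored = []
    for subset in subsets:
        subset_dir = root / subset
        if not subset_dir.is_dir():
            continue  # this subset was never created, nothing to undo
        for item in sorted(subset_dir.iterdir()):
            target = root / item.name
            if target.exists():
                # Refuse to overwrite anything already in the dataset root.
                raise FileExistsError(f"{target} already exists")
            shutil.move(str(item), str(target))
            restored.append(item.name)
        subset_dir.rmdir()  # only succeeds because the folder is now empty
    return restored
```

Note that this only makes sense when the items were moved. If they were copied, the originals are still in place and the name check above would raise.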
The code flow should be changed. Because of the '--copy' implementation, the code needs to be rethought: it is currently ugly and not optimized.
The "copy" argument taken by the app isn't working at all. Even when it is present, the program doesn't copy any of the files; it just moves them and leaves the original dataset empty.
Decide and code what to do when the program crashes while executing.
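The intended difference between the two modes can be sketched as follows. This is an illustration only, not the repository's code: the helper name `transfer` and its `copy` parameter are hypothetical, and it handles single files rather than directories.

```python
import shutil
from pathlib import Path


def transfer(src, dst_dir, copy=False):
    """Place the file src into dst_dir, copying or moving it.

    Hypothetical helper: with copy=True the original file is left in
    place; with copy=False it is moved and disappears from its source.
    """
    src = Path(src)
    dst_dir = Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    dst = dst_dir / src.name
    if copy:
        shutil.copy2(src, dst)  # copies file content and metadata
    else:
        shutil.move(str(src), str(dst))
    return dst
```

Keeping the copy/move decision in a single place like this means the flag cannot be silently ignored by one code path while being honored by another.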
Cases to think about:
The '--noshuffle' option prevents the app from shuffling the items. The split is then made into (train, test, validation) in the same order in which 'ls' lists the items.
This will improve usability when the module is imported into another Python script, and it allows the user to choose a directory anywhere on the system.
The reshuffle option will allow the user to reshuffle an already split dataset.
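As a rough sketch of the behaviour described for '--noshuffle', assuming the items arrive as a list of names already in listing order. The function `split_items`, its ratio defaults and its `seed` parameter are hypothetical and only serve the illustration.

```python
import random


def split_items(items, train_ratio=0.8, test_ratio=0.1, shuffle=True, seed=None):
    """Partition items into (train, test, validation) lists.

    Hypothetical helper: with shuffle=False the incoming order is kept,
    so the first items go to train, the next to test, the rest to
    validation.
    """
    items = list(items)  # work on a copy so the caller's list is untouched
    if shuffle:
        random.Random(seed).shuffle(items)
    n_train = int(len(items) * train_ratio)  # int() truncates toward zero
    n_test = int(len(items) * test_ratio)
    train = items[:n_train]
    test = items[n_train:n_train + n_test]
    validation = items[n_train + n_test:]  # whatever remains
    return train, test, validation
```

With `shuffle=False`, the result depends only on the order of the input, which makes the split reproducible without needing a seed.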
I realized this would be quite useful during a model evaluation process
Well, I need to understand some crucial stuff before building this app.
What I need/must understand in order to build this app: