Comments (5)
Hi @apatra9, all the functionality implemented in the package targets binary/multiclass classification problems and operates on feature vectors (data) and the corresponding class labels (target). Applying it directly to raw image data is not recommended and is not a proper use case.
If the problem at hand is image classification, and one of the classes has significantly fewer samples than the other, you can use this package in the following way.
First, you need to represent the images as relatively low-dimensional feature vectors (on the order of 10 to 1000 dimensions) by extracting various image descriptors or by using autoencoders. Once you have the feature vector representation of the images and the corresponding class labels, you can feed them to the oversampling techniques implemented in the package to get a balanced dataset. That balanced dataset can then be used to train a classifier, which can be expected to perform better than a classifier trained on the imbalanced dataset.
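The workflow above could be sketched roughly as follows. This is only an illustration under several assumptions: the images are random toy data, a plain PCA projection stands in for real image descriptors or an autoencoder, and `smote_like` is a hypothetical, simplified SMOTE-style interpolation written in NumPy for a binary problem, not the package's own implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for an image dataset: 60 majority and 12 minority 16x16 images.
images = rng.random((72, 16, 16))
labels = np.array([0] * 60 + [1] * 12)
images[labels == 1] += 0.5  # make the minority class look different


def pca_features(images, n_components=10):
    """Flatten the images and project them onto the top principal components.

    Assumption: PCA replaces the descriptor/autoencoder step for brevity.
    """
    flat = images.reshape(len(images), -1)
    centered = flat - flat.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def smote_like(X, y, minority_label, k=5, rng=rng):
    """Hypothetical SMOTE-style oversampling for a binary problem.

    New minority samples are drawn on the segments between a minority sample
    and one of its k nearest minority neighbours until the classes are balanced.
    """
    X_min = X[y == minority_label]
    n_new = int((y != minority_label).sum() - len(X_min))
    # Pairwise distances within the minority class.
    dist = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbour
    neighbours = np.argsort(dist, axis=1)[:, :k]
    base = rng.integers(0, len(X_min), size=n_new)
    chosen = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))
    synthetic = X_min[base] + gap * (X_min[chosen] - X_min[base])
    X_bal = np.vstack([X, synthetic])
    y_bal = np.concatenate([y, np.full(n_new, minority_label)])
    return X_bal, y_bal


X_feat = pca_features(images)              # images -> low-dimensional vectors
X_bal, y_bal = smote_like(X_feat, labels, minority_label=1)
print(X_bal.shape, np.bincount(y_bal))
```

With the package itself, the hand-written `smote_like` step would be replaced by one of its oversamplers applied to `X_feat` and `labels`; the balanced `X_bal`, `y_bal` then go to whatever classifier you train.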
from smote_variants.
Hi @apatra9, can we close this issue?
@gykovacs I have used LLE_SMOTE on image data. LLE (locally linear embedding) maps high-dimensional data into a lower-dimensional space, so LLE_SMOTE embeds the high-dimensional image data in a lower dimension, applies SMOTE there, and then maps the oversampled data back to the original dimension. I think this is useful. Let me know if I can create a PR to add that example.
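To make the "map back" step concrete, here is a rough sketch of one common way to do it with LLE-style reconstruction weights. It is not taken from the package source, so treat it as an assumption about the general idea: a 2-D PCA projection stands in for the real LLE embedding, and `reconstruction_weights` and `map_back` are hypothetical helper names.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy minority class: 20 flattened "images" with 64 pixels each.
X = rng.random((20, 64))

# Stand-in embedding (assumption): a 2-D PCA projection replaces the LLE step.
centered = X - X.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
Z = centered @ vt[:2].T


def reconstruction_weights(point, neighbours, reg=1e-3):
    """Weights summing to 1 that approximately reconstruct `point` from `neighbours`."""
    diff = neighbours - point
    gram = diff @ diff.T  # local Gram matrix, shape (k, k)
    # Regularise: with more neighbours than dimensions the Gram matrix is singular.
    gram = gram + np.eye(len(neighbours)) * reg * np.trace(gram)
    w = np.linalg.solve(gram, np.ones(len(neighbours)))
    return w / w.sum()


def map_back(z_new, Z, X, k=5):
    """Rebuild a high-dimensional sample from the k nearest embedded neighbours."""
    idx = np.argsort(np.linalg.norm(Z - z_new, axis=1))[:k]
    w = reconstruction_weights(z_new, Z[idx])
    return w @ X[idx], w


# SMOTE-style step in the embedded space: halfway between a sample and its
# nearest neighbour.
i = 0
j = np.argsort(np.linalg.norm(Z - Z[i], axis=1))[1]
z_new = Z[i] + 0.5 * (Z[j] - Z[i])

# Carry the synthetic point back to pixel space.
x_new, w = map_back(z_new, Z, X)
print(x_new.shape, round(float(w.sum()), 6))
```

The same weights that describe the synthetic point in the embedded space are reused on the original high-dimensional neighbours, which is why the result has the original image dimension.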
@sakethbachu Sounds very interesting, sure, please go ahead, and add the example!
@sakethbachu Sounds very interesting, sure, please go ahead, and add the example!
Will create a PR by this weekend, thank you :)