Comments (3)
You can try successive numbers of clusters and measure their performance (basically an error curve). Why not create a pull request implementing this?
from ai4r.
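The error-curve idea can be sketched in plain Ruby, independent of any clustering library. This is only an illustration: `error_curve`, `pick_k`, and the `min_gain` threshold are hypothetical names, not part of ai4r, and the error values are stubbed rather than taken from a real clustering run.

```ruby
# Builds an error curve by calling the given block once per cluster count.
# The block is expected to return the total within-cluster error for k clusters.
def error_curve(max_k)
  (1..max_k).map { |k| yield(k) }
end

# Returns the smallest k after which adding one more cluster reduces the
# error by less than min_gain (a fraction of the current error).
# Falls back to the largest k tried if every step still helps.
def pick_k(errors, min_gain = 0.05)
  errors.each_cons(2).with_index do |(current, following), index|
    gain = current.zero? ? 0.0 : (current - following) / current.to_f
    return index + 1 if gain < min_gain
  end
  errors.size
end

# Stub error values standing in for a real clustering run (assumed data).
stub_errors = { 1 => 100.0, 2 => 40.0, 3 => 12.0, 4 => 11.8, 5 => 11.7 }
curve = error_curve(5) { |k| stub_errors[k] }
best_k = pick_k(curve)
```

With ai4r, the block would presumably build a `KMeans` clusterer for `k` and sum each item's distance to its assigned centroid.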
@agarie What I did a while ago was run an ANOVA for successive numbers of clusters to see how well the clusters explain the variance.
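The commenter's code is not shown, so the following is only a rough sketch of the underlying quantity: the fraction of total variance explained by a grouping, which an ANOVA builds on. It is not a full F-test, and `mean_vector`, `squared_distance`, and `explained_variance` are hypothetical helper names.

```ruby
# Mean vector of a list of equal-length numeric vectors.
def mean_vector(points)
  points.transpose.map { |column| column.sum / column.size.to_f }
end

# Squared Euclidean distance between two vectors.
def squared_distance(a, b)
  a.zip(b).sum { |x, y| (x - y)**2 }
end

# Fraction of total variance explained by the grouping:
# 1 - (within-cluster sum of squares / total sum of squares).
# clusters is an array of clusters, each an array of numeric vectors.
def explained_variance(clusters)
  all_points = clusters.flatten(1)
  grand_mean = mean_vector(all_points)
  total_ss = all_points.sum { |point| squared_distance(point, grand_mean) }
  return 0.0 if total_ss.zero?

  within_ss = clusters.sum do |points|
    centroid = mean_vector(points)
    points.sum { |point| squared_distance(point, centroid) }
  end
  1.0 - within_ss / total_ss
end

# The same four points, grouped as one cluster and as two clusters (assumed data).
one_cluster = [[[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]]
two_clusters = [[[0.0, 0.0], [0.0, 2.0]], [[10.0, 0.0], [10.0, 2.0]]]
ratio_one = explained_variance(one_cluster)
ratio_two = explained_variance(two_clusters)
```

Computing this ratio for each successive number of clusters shows where an extra cluster stops explaining noticeably more variance.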
Hi all,
I had the same issue today and, as I did not find a solution anywhere, here is what I came up with so far. I guess it's not perfect, but it's better than nothing :)
Assumptions: the input data is already z-normalized, nobody wants more than 10 clusters, and nobody wants an additional cluster that didn't improve the fitness by more than 5%.
require 'ai4r'

def self.create_best_cluster(data)
  cluster = Ai4r::Clusterers::KMeans.new
  dataset = Ai4r::Data::DataSet.new
  dataset.set_data_items(data)
  # cluster_fitness[t] holds the summed item-to-centroid distance for t + 1 clusters.
  cluster_fitness = []
  10.times do |t|
    cluster.build(dataset, t + 1)
    cluster_fitness[t] = 0
    dataset.data_items.each do |item|
      cluster_fitness[t] += cluster.distance(cluster.centroids[cluster.eval(item)], item)
    end
  end
  # Change in fitness between consecutive cluster counts. Note that 0.05 is an
  # absolute change in summed distance here, not a relative 5%.
  changes = cluster_fitness.each_cons(2).map { |x, y| (y - x).abs }
  # Last step that still changed the fitness by at least 0.05. The matching cluster
  # count is the array index + 2, as the array starts at 0 but clustering starts at 1.
  last_useful = changes.rindex { |change| change >= 0.05 }
  # Fall back to a single cluster if no step reached the threshold.
  cluster.build(dataset, last_useful ? last_useful + 2 : 1)
end
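The snippet assumes its input is already z-normalized. A minimal sketch of that preprocessing step follows, assuming the data is an array of equal-length numeric rows and using the population standard deviation; `z_normalize` is a hypothetical helper, not an ai4r method.

```ruby
# Z-normalizes each column (feature) of a list of numeric rows: subtract the
# column mean and divide by the column's population standard deviation.
# Constant columns are mapped to 0.0 to avoid dividing by zero.
def z_normalize(rows)
  columns = rows.transpose.map do |column|
    mean = column.sum / column.size.to_f
    std = Math.sqrt(column.sum { |value| (value - mean)**2 } / column.size)
    column.map { |value| std.zero? ? 0.0 : (value - mean) / std }
  end
  columns.transpose
end

# Three rows with two features; the second feature is constant (assumed data).
normalized = z_normalize([[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]])
```

The result could then be passed as `data` to `create_best_cluster`.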