Java Data Clustering

Data clustering in Java using the Strategy design pattern.

A pure Java library for clustering data, built around design patterns. My goals were to practice design patterns and to write a collection of data mining tools. The first clustering algorithms cover partitional data clustering, specifically several implementations of the k-means algorithm. I'm also adding code for numeric and text distance measures and for statistical normalization. The sample data this project was based on can be found at the UCI Machine Learning Repository.
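As a rough illustration of the numeric distance and normalization helpers mentioned above, here is a minimal sketch. The class name `NumericSketch` and both method names are hypothetical and are not taken from this project; Euclidean distance and z-score normalization are assumed as typical choices, and the project may use different measures.

```java
import java.util.Arrays;

/** Hypothetical helper class for illustration; not part of this project's API. */
final class NumericSketch {
    private NumericSketch() {}

    /** Euclidean distance between two points of equal dimension. */
    static double euclideanDistance(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    /** Z-score normalization of one feature column: (x - mean) / population standard deviation. */
    static double[] zScore(double[] column) {
        double mean = Arrays.stream(column).average().orElse(0.0);
        double variance = Arrays.stream(column)
                .map(x -> (x - mean) * (x - mean))
                .average()
                .orElse(0.0);
        double std = Math.sqrt(variance);
        double[] out = new double[column.length];
        for (int i = 0; i < column.length; i++) {
            // A constant column has zero spread; map it to 0 instead of dividing by zero.
            out[i] = std == 0.0 ? 0.0 : (column[i] - mean) / std;
        }
        return out;
    }
}
```

Normalizing each feature before clustering keeps a feature with a large numeric range from dominating the distance calculation.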

Example usage can be found in Main.java:

// Demonstrates use of the naive k-means strategy
ClusterContext cContext = new ClusterContext();
cContext.setClusterStrategy(new KmeansNaiveStrategy());
List<double[]> centroids = cContext.findCentroids(rawData, NUM_CLUSTERS);

// Print output from naive k-means clustering
System.out.format("%n%nMain.java: Cluster Centroids...%n-------------------------------%n");
centroids.forEach(centroid -> System.out.format("Centroid: %s%n", Arrays.toString(centroid)));
System.out.format("%n%n");
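The usage above suggests a context object that delegates to an interchangeable strategy. The following is only a sketch of how such wiring could look, not the project's actual source: `ClusterContext`, `setClusterStrategy`, and `findCentroids` mirror the names in the usage snippet, while the `ClusterStrategy` interface, the `List<double[]>` input type, and the `SimpleKmeansStrategy` class are assumptions made for illustration. The sketch seeds centroids with the first k points so that results are repeatable, which may differ from how `KmeansNaiveStrategy` initializes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Assumed strategy interface; the project's real interface may differ. */
interface ClusterStrategy {
    List<double[]> findCentroids(List<double[]> data, int k);
}

/** Context that delegates to whichever strategy is currently set. */
class ClusterContext {
    private ClusterStrategy strategy;

    void setClusterStrategy(ClusterStrategy strategy) {
        this.strategy = strategy;
    }

    List<double[]> findCentroids(List<double[]> data, int k) {
        if (strategy == null) {
            throw new IllegalStateException("no clustering strategy set");
        }
        return strategy.findCentroids(data, k);
    }
}

/** Illustrative Lloyd-style k-means, seeded with the first k points. */
class SimpleKmeansStrategy implements ClusterStrategy {
    private static final int MAX_ITERATIONS = 100;

    @Override
    public List<double[]> findCentroids(List<double[]> data, int k) {
        if (k <= 0 || k > data.size()) {
            throw new IllegalArgumentException("k must be between 1 and the number of points");
        }
        int dim = data.get(0).length;
        List<double[]> centroids = new ArrayList<>();
        for (int c = 0; c < k; c++) {
            centroids.add(data.get(c).clone());
        }
        int[] assignment = new int[data.size()];
        Arrays.fill(assignment, -1);

        for (int iter = 0; iter < MAX_ITERATIONS; iter++) {
            // Assignment step: attach each point to its nearest centroid.
            boolean changed = false;
            for (int i = 0; i < data.size(); i++) {
                int best = nearest(data.get(i), centroids);
                if (best != assignment[i]) {
                    assignment[i] = best;
                    changed = true;
                }
            }
            if (!changed) {
                break; // converged: no point switched clusters
            }
            // Update step: move each centroid to the mean of its members.
            double[][] sums = new double[k][dim];
            int[] counts = new int[k];
            for (int i = 0; i < data.size(); i++) {
                counts[assignment[i]]++;
                for (int d = 0; d < dim; d++) {
                    sums[assignment[i]][d] += data.get(i)[d];
                }
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) {
                    continue; // keep the old centroid for an empty cluster
                }
                for (int d = 0; d < dim; d++) {
                    sums[c][d] /= counts[c];
                }
                centroids.set(c, sums[c]);
            }
        }
        return centroids;
    }

    /** Index of the centroid with the smallest squared Euclidean distance to p. */
    private static int nearest(double[] p, List<double[]> centroids) {
        int best = 0;
        double bestDist = Double.MAX_VALUE;
        for (int c = 0; c < centroids.size(); c++) {
            double dist = 0.0;
            for (int d = 0; d < p.length; d++) {
                double diff = p[d] - centroids.get(c)[d];
                dist += diff * diff;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = c;
            }
        }
        return best;
    }
}
```

Because the context depends only on the interface, swapping in another k-means variant is a one-line change at the `setClusterStrategy` call.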

Result:

Beginning k-means clustering...
Number of clusters: 3, number of data points: 150...

New Iteration...
Cluster SSE: 4.966278
Cluster SSE: 288.093598
Cluster SSE: 121.940705

Total SSE: 415.000581
----------------------

New Iteration...
Cluster SSE: 30.026815
Cluster SSE: 98.679857
Cluster SSE: 14.006869

Total SSE: 142.713541
----------------------

New Iteration...
Cluster SSE: 24.085262
Cluster SSE: 59.490720
Cluster SSE: 18.379421

Total SSE: 101.955403
----------------------

New Iteration...
Cluster SSE: 24.085262
Cluster SSE: 53.029480
Cluster SSE: 21.902186

Total SSE: 99.016928
----------------------

New Iteration...
Cluster SSE: 24.085262
Cluster SSE: 48.858549
Cluster SSE: 25.212626

Total SSE: 98.156436
----------------------

New Iteration...
Cluster SSE: 24.085262
Cluster SSE: 43.712415
Cluster SSE: 30.148729

Total SSE: 97.946406
----------------------

New Iteration...
Cluster SSE: 24.085262
Cluster SSE: 39.848696
Cluster SSE: 33.579068

Total SSE: 97.513026
----------------------

New Iteration...
Cluster SSE: 24.085262
Cluster SSE: 35.155764
Cluster SSE: 38.088031

Total SSE: 97.329057
----------------------

New Iteration...
Cluster SSE: 24.085262
Cluster SSE: 32.391083
Cluster SSE: 40.729503

Total SSE: 97.205847
----------------------

New Iteration...
Cluster SSE: 24.085262
Cluster SSE: 29.534177
Cluster SSE: 43.505612

Total SSE: 97.125051
----------------------

Cluster centroids:
------------------
Centroid: [5.005999999999999, 3.428000000000001, 1.4620000000000002, 0.2459999999999999]
Centroid: [6.827499999999999, 3.0699999999999994, 5.699999999999998, 2.062499999999999]
Centroid: [5.885000000000001, 2.74, 4.376666666666667, 1.4183333333333332]


Main.java: Cluster Centroids...
-------------------------------
Centroid: [5.005999999999999, 3.428000000000001, 1.4620000000000002, 0.2459999999999999]
Centroid: [6.827499999999999, 3.0699999999999994, 5.699999999999998, 2.062499999999999]
Centroid: [5.885000000000001, 2.74, 4.376666666666667, 1.4183333333333332]
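
In the log above, each "Total SSE" line equals the sum of the three "Cluster SSE" lines before it, and the total falls from 415.000581 to 97.125051 over ten iterations. A minimal sketch of how such values could be computed follows, assuming SSE means the sum of squared Euclidean distances from each point to its cluster centroid. The class `SseSketch` and its methods are hypothetical and are not the project's code.

```java
import java.util.List;

/** Hypothetical helper for illustration; assumes SSE uses squared Euclidean distance. */
final class SseSketch {
    private SseSketch() {}

    /** Sum of squared distances from each member point to the cluster centroid. */
    static double clusterSse(List<double[]> members, double[] centroid) {
        double sse = 0.0;
        for (double[] point : members) {
            for (int d = 0; d < centroid.length; d++) {
                double diff = point[d] - centroid[d];
                sse += diff * diff;
            }
        }
        return sse;
    }

    /** Total SSE: the sum of the per-cluster SSE values. */
    static double totalSse(List<List<double[]>> clusters, List<double[]> centroids) {
        double total = 0.0;
        for (int c = 0; c < clusters.size(); c++) {
            total += clusterSse(clusters.get(c), centroids.get(c));
        }
        return total;
    }
}
```

A lower total SSE means points sit closer to their assigned centroids, so the shrinking totals in the log indicate the clustering is tightening with each iteration.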

cluster-strategy-main's Issues

Mark deprecated

Mark this project as deprecated in the README. Create new Gradle repo under MIT license.
