
spaska's People

Contributors

gerganamd, nikolavp, vtasheva


spaska's Issues

Naive Bayes algorithm

The Naive Bayes algorithm is one of the best-performing algorithms for classification. It's strange that it wasn't implemented in the first iteration.
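
As a rough illustration of what this issue asks for, here is a minimal sketch of Naive Bayes for nominal attributes with add-one (Laplace) smoothing. It is written in Python for brevity and is not tied to the project's code. The class name `CategoricalNaiveBayes` and its `fit`/`predict` methods are assumptions made for this example, not an existing spaska API.

```python
import math
from collections import Counter, defaultdict


class CategoricalNaiveBayes:
    """Naive Bayes for nominal attributes with Laplace (add-one) smoothing.

    Hypothetical example class, not part of the project.
    """

    def fit(self, rows, labels):
        self.class_counts = Counter(labels)
        self.total = len(labels)
        # value_counts[(attribute index, class)][value] -> number of occurrences
        self.value_counts = defaultdict(Counter)
        # distinct values seen per attribute, needed for the smoothing term
        self.values = defaultdict(set)
        for row, label in zip(rows, labels):
            for i, value in enumerate(row):
                self.value_counts[(i, label)][value] += 1
                self.values[i].add(value)
        return self

    def predict(self, row):
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            # Sum log probabilities to avoid underflow on many attributes.
            score = math.log(count / self.total)
            for i, value in enumerate(row):
                seen = self.value_counts[(i, label)][value]
                score += math.log((seen + 1) / (count + len(self.values[i])))
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

Numeric attributes would need a different likelihood (for example a Gaussian per class), which this sketch leaves out.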

Refactoring of existing code

Some of the algorithms need refactoring.

Note: This should be done after the task about unit tests is finished so we don't break anything.

New clustering algorithms

Currently k-means is our only clustering algorithm, so we can't compare which algorithm works better for our data. We need to implement more clustering algorithms. We can look into the algorithms implemented in Mahout; check the Perform Clustering subsection.

Run static analysis tools on the code to ensure code quality

The first idea was to use Sonar, but since we currently don't have any hosting solution and most of us work on laptops, we should ditch that idea.

We will have to use the individual tools and run them through Maven on every push to the repository.
Someone should read the git hooks documentation and do some tweaking so this happens automatically on push. We should also decide on a baseline for the rules, i.e. below what percentage of rule compliance we should get a warning.

Make an evaluation API for the clustering algorithms

Currently there is no way to evaluate the clustering algorithm heuristic. There should be some method similar to the one provided by WEKA. The idea is that the clusterer should group instances the same way they are grouped by the class attribute.
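
One way to picture such an evaluation is the simplified sketch below. It labels each cluster with its majority class and reports the fraction of instances that agree with that label. This is an assumption about how the idea could be realised, not a description of WEKA's exact procedure, and the function name `classes_to_clusters_accuracy` is made up for this example.

```python
from collections import Counter, defaultdict


def classes_to_clusters_accuracy(cluster_ids, class_labels):
    """Label each cluster with its majority class and return the fraction
    of instances whose class matches the label of their cluster."""
    if len(cluster_ids) != len(class_labels):
        raise ValueError("expected one cluster id per class label")
    if not cluster_ids:
        raise ValueError("need at least one instance")
    # per_cluster[cluster id][class label] -> number of instances
    per_cluster = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, class_labels):
        per_cluster[cluster][label] += 1
    # Instances belonging to the majority class of their cluster count as correct.
    correct = sum(counts.most_common(1)[0][1] for counts in per_cluster.values())
    return correct / len(class_labels)
```

Note that this variant lets several clusters share the same class label, so the score tends to rise as the number of clusters grows. A stricter version would force a one-to-one mapping between clusters and classes.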

Import data from a database

Currently we only support ARFF files, which is inconvenient for most systems that want to build something with ML. We should support importing data from a database.
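
To make the request concrete, here is a minimal sketch of reading instances from a database table, using Python's standard `sqlite3` module as a stand-in for whatever database is eventually targeted. The function name `load_instances`, its parameters and the returned tuple layout are all assumptions for this example.

```python
import sqlite3


def load_instances(connection, table, class_column):
    """Read every row of `table` and split it into attributes and class value.

    Returns (attribute names, attribute rows, class labels).
    """
    # Table names cannot be bound as SQL parameters, so accept only plain identifiers.
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")
    cursor = connection.execute(f"SELECT * FROM {table}")
    columns = [description[0] for description in cursor.description]
    class_index = columns.index(class_column)
    attribute_names = [name for name in columns if name != class_column]
    rows, labels = [], []
    for record in cursor:
        labels.append(record[class_index])
        rows.append(tuple(v for i, v in enumerate(record) if i != class_index))
    return attribute_names, rows, labels
```

With an open connection `conn`, a call such as `load_instances(conn, "weather", "play")` would treat the `play` column as the class attribute and every other column as an ordinary attribute; the table and column names here are only illustrative.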

Add tests to the codebase

Add unit tests to the whole codebase so we can start the refactoring properly.

  1. Add unit tests for functions/utility objects
  2. Add integration tests that verify (or rather validate) that the refactoring does not lower precision or recall.
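
The second item could be sketched as follows: compute precision and recall for one class, then compare them against values recorded before the refactoring. The helper names `precision_recall` and `assert_no_regression` and the default tolerance of 0.01 are assumptions for this example, not agreed project conventions.

```python
def precision_recall(actual, predicted, positive):
    """Return (precision, recall) for the class `positive`."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def assert_no_regression(baseline, current, tolerance=0.01):
    """Fail if precision or recall dropped by more than `tolerance`.

    `baseline` and `current` are (precision, recall) pairs.
    """
    for name, before, after in zip(("precision", "recall"), baseline, current):
        if after < before - tolerance:
            raise AssertionError(f"{name} dropped from {before:.3f} to {after:.3f}")
```

The baseline pair would be measured once on a fixed dataset before refactoring starts and stored alongside the tests.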

Move the documentation from the doc

The current documentation is in a doc file outside of the code. It's pretty hard to follow the code and look up the documentation at the same time.

Also, the documentation is currently in Bulgarian, which blocks international development on the project.
