The goal of this project is to create a fuzzy clustering algorithm based on the artificial bee colony (ABC) algorithm. An important requirement is robustness to the problems of the fuzzy c-means algorithm, in particular when handling high-dimensional data.
To compile the C++ code as a Python extension, see the official guide for Unix-like systems and Windows.
The VECTOR_DIM preprocessor macro defines the dimensionality of the vectors to be clustered. Passing vectors of a different dimensionality will result in a runtime error. To cluster vectors of a different size than the default of 2, override the macro using compiler parameters or manually change it in abc_plusplus.cpp.
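As a rough sketch of overriding the macro through compiler parameters, the options below could be passed to a setuptools Extension. This assumes the extension is built with setuptools and that abc_plusplus.cpp only defines VECTOR_DIM when it is not already defined; the helper name extension_options is hypothetical and not part of the project.

```python
def extension_options(vector_dim: int) -> dict:
    """Hypothetical helper: build keyword arguments for setuptools.Extension.

    The define_macros entry has the same effect as passing -DVECTOR_DIM=<n>
    to GCC or Clang on the command line.
    """
    if not isinstance(vector_dim, int) or vector_dim < 1:
        raise ValueError("vector_dim must be a positive integer")
    return {
        "name": "abc_plusplus",
        "sources": ["abc_plusplus.cpp"],
        # Macro values are passed as strings.
        "define_macros": [("VECTOR_DIM", str(vector_dim))],
        "language": "c++",
    }


# Example: options for clustering 3-dimensional vectors.
options = extension_options(3)

# In a setup.py these options would be passed on, for example:
#   from setuptools import Extension, setup
#   setup(name="abc_plusplus", ext_modules=[Extension(**options)])
```

The module has to be rebuilt whenever the dimensionality changes, since the value is fixed at compile time.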
The abc_plusplus module contains 4 classes:
- ArtificialBeeColony: unmodified ABC
- ModArtificialBeeColony: ABC with DE-inspired mixing strategy
- TournamentArtificialBeeColony: ABC with tournament selection strategy
- TournamentModArtificialBeeColony: ABC with both modifications
The constructors for ArtificialBeeColony and TournamentArtificialBeeColony accept 4 positional parameters:
- the data to be clustered, represented as an n by m sequence of floats, where n is the number of vectors and m is the dimensionality
- the number of clusters (a positive integer)
- the size of the population (a positive integer)
- the maximum number of iterations for which a solution is retained without any improvement (a positive integer)
The constructors for ModArtificialBeeColony and TournamentModArtificialBeeColony accept 2 additional positional parameters, 6 in total:
- the scale factor (a float between 0 and 1)
- the modification rate (a float between 0 and 1)
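The parameter lists above can be illustrated with a minimal sketch. The helper check_constructor_args, its parameter names, and the sample values are assumptions made for illustration and are not part of the module. The import is guarded so the sketch also runs where the extension has not been compiled.

```python
from typing import Optional, Sequence


def check_constructor_args(
    data: Sequence[Sequence[float]],
    n_clusters: int,
    population_size: int,
    limit: int,
    scale_factor: Optional[float] = None,
    modification_rate: Optional[float] = None,
) -> tuple:
    """Hypothetical helper: validate arguments and return them in positional order."""
    if not data:
        raise ValueError("data must contain at least one vector")
    dim = len(data[0])
    if any(len(row) != dim for row in data):
        raise ValueError("all vectors must have the same dimensionality")
    for name, value in (
        ("n_clusters", n_clusters),
        ("population_size", population_size),
        ("limit", limit),
    ):
        if not isinstance(value, int) or value < 1:
            raise ValueError(f"{name} must be a positive integer")
    args = [[[float(x) for x in row] for row in data], n_clusters, population_size, limit]
    extras = (scale_factor, modification_rate)
    if any(e is not None for e in extras):
        # The Mod* classes take both extra parameters; inclusive bounds are an assumption.
        if any(e is None or not 0.0 <= e <= 1.0 for e in extras):
            raise ValueError("scale factor and modification rate must both lie in [0, 1]")
        args.extend(float(e) for e in extras)
    return tuple(args)


# Four 2-dimensional vectors, matching the default VECTOR_DIM of 2.
data = [[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [5.2, 4.9]]
plain_args = check_constructor_args(data, 2, 20, 50)
mod_args = check_constructor_args(data, 2, 20, 50, 0.5, 0.3)

try:
    import abc_plusplus  # only available once the extension has been built
except ImportError:
    abc_plusplus = None

if abc_plusplus is not None:
    colony = abc_plusplus.ArtificialBeeColony(*plain_args)
    mod_colony = abc_plusplus.ModArtificialBeeColony(*mod_args)
```

Checking the arguments in Python first gives a readable error message before anything reaches the compiled code.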
All classes define 3 methods:
- optimize: takes one parameter, the number of iterations. Runs the algorithm for the specified number of iterations and returns the best solution found (as a list of lists of floats). The population is retained between calls.
- fit: similar to optimize, but without a return value.
- score: returns the fitness score (as a float) of the best currently known solution.
Basic algorithm based on:
Karaboga, D., Ozturk, C.: Fuzzy clustering with artificial bee colony algorithm. Scientific Research and Essays 5 (07 2010)
Modifications based on:
Kumar, A., Kumar, D., Jarial, S.: A hybrid clustering method based on improved artificial bee colony and fuzzy c-means algorithm. International Journal of Artificial Intelligence 15, 40–60 (01 2017)
and:
Ouadfel, S., Meshoul, S.: Handling fuzzy image clustering with a modified ABC algorithm. International Journal of Intelligent Systems and Applications 4, 65–74 (11 2012).
The DIM-set dataset is from:
P. Fränti, O. Virmajoki and V. Hautamäki, "Fast agglomerative clustering using a k-nearest neighbor graph", IEEE Trans. on Pattern Analysis and Machine Intelligence, 28 (11), 1875-1881, November 2006.
The Worms dataset is from:
S. Sieranoja and P. Fränti, "Fast and general density peaks clustering", Pattern Recognition Letters, 128, 551-558, December 2019.