lmurtinho / ratiogreedyclustering
C implementation of the Ratio-Greedy algorithm from Cicalese, Laber, and Murtinho (2019).
I'm reproducing the divisive clustering algorithm and found this repository really helpful. However, I have a quick question about the data generation files, make_ng20_data.py and make_rcv1_data.py. From the code, it looks like the resulting .csv files contain the joint distribution p(word, class). But the ICML branch, as well as the comments in the code, say that it is the conditional probability. Did I miss something here? Thank you so much for your help!
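To make the distinction in this question concrete, here is a minimal sketch of how the two normalizations differ on the same count table. The toy counts, the function names, and the reading of "conditional" as p(class | word) are all assumptions for illustration; the sketch uses plain Python lists and is not taken from make_ng20_data.py or make_rcv1_data.py.

```python
# Hypothetical toy counts (not from the repository):
# rows are words, columns are classes.
counts = [
    [2, 0],
    [1, 1],
    [0, 4],
]

def joint_distribution(counts):
    """p(word, class): every cell divided by the grand total."""
    total = sum(sum(row) for row in counts)
    return [[cell / total for cell in row] for row in counts]

def conditional_given_word(counts):
    """p(class | word): every row divided by its own row sum.

    Assumes the conditional of interest is over classes given a word.
    """
    result = []
    for row in counts:
        row_total = sum(row)
        # A word that never occurs has no defined conditional; keep zeros.
        result.append([cell / row_total if row_total else 0.0 for cell in row])
    return result

joint = joint_distribution(counts)
cond = conditional_given_word(counts)
```

In the joint table all cells together sum to 1, while in the conditional table each row sums to 1. Checking which of these holds for the generated .csv files is one way to tell which quantity they store.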
Hi,
I was reproducing this paper. For the ng20 dataset, should nrows be 51840 instead of 51480? I can run the code with 51840 rows but not with 51480.
Many thanks!
I was using the make_rcv1_data.py script to prepare the RCV-1 dataset. The script raises an error on line 92:
nums[w, c] += d[0,w]
IndexError: index 102 is out of bounds for axis 1 with size 102
However, if I change that line to nums[w, c-1] += d[0,w], the script works fine. My understanding is that there are 102 classes, so the index for the last class is 101, as the array starts from 0. Let me know if that's correct. Many thanks!
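The indexing argument in this report can be sketched as follows. This is an illustration only: it assumes the class labels run from 1 to 102, which the report suggests but does not confirm, and it uses plain nested lists and a hypothetical helper add_count rather than the script's actual arrays.

```python
n_words, n_classes = 3, 102  # 102 classes, as stated in the report

# nums[w][c] accumulates the weight of word w in class column c (0-based).
nums = [[0.0] * n_classes for _ in range(n_words)]

def add_count(nums, w, label, value, one_based=True):
    """Add value to word w's cell for a class label (hypothetical helper).

    With one_based=True the label is shifted by one, mirroring the
    c-1 change proposed in the report.
    """
    c = label - 1 if one_based else label
    # Explicit bounds check: a negative index would otherwise wrap around
    # silently in Python instead of raising an error.
    if not 0 <= c < len(nums[w]):
        raise IndexError(f"class label {label} maps to column {c}, out of range")
    nums[w][c] += value

add_count(nums, 0, 102, 2.0)  # last class lands in column 101
add_count(nums, 0, 1, 1.0)    # first class lands in column 0
```

The explicit bounds check matters if any label turns out to be 0: shifting it by one gives index -1, which Python (and NumPy) would accept and quietly add to the last column.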
I was looking for an implementation of the Inderjit et al. (2002) feature clustering algorithm and came upon this repository.
I had two questions about it:
Thank you!