
Comments (7)

GoogleCodeExporter commented on July 26, 2024
I guess this is a problem close to the one I have at the moment. Maybe your
values end up with a very low variance that is almost 0. Due to rounding
inaccuracy (because the number is so low: 0.00000...1) it is interpreted as
zero and throws the exception.

In my case I'm trying to remove the if-clause that throws the exception. I
can't tell for sure whether that works for your case (I can't even tell for
mine yet... I still have to check the results for correctness). But I couldn't
find any other way, since the collected data (vectors, in my case) doesn't give
information about the resulting variances directly. In my case I would never
know at the beginning how my data is combined and in which matrices it ends up.
The covariance shouldn't be <= 0 anyway in my case... (remember: I'm working
with vectors of reals!)

Maybe you can give it a try with your usage and tell us your results.

kind regards,
Ben

Original comment by [email protected] on 27 May 2010 at 10:19

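Ben's diagnosis above (a computed variance that underflows to nearly zero and then trips the positivity check) can be sketched as follows. This is a hypothetical illustration, not jahmm's actual code; the `MIN_VARIANCE` floor and its value are assumptions to be tuned per data scale:

```java
// Hypothetical sketch (not jahmm's code): floor a computed variance so
// that rounding toward zero no longer triggers the
// "Variance must be positive" check in the Gaussian constructor.
public class VarianceFloor {
    static final double MIN_VARIANCE = 1e-9; // assumed floor, tune per data scale

    static double safeVariance(double[] values) {
        double mean = 0.0;
        for (double v : values) mean += v;
        mean /= values.length;

        double var = 0.0;
        for (double v : values) var += (v - mean) * (v - mean);
        var /= values.length;

        // If all values are (nearly) identical, var underflows toward 0;
        // clamp it instead of letting the constructor throw.
        return Math.max(var, MIN_VARIANCE);
    }

    public static void main(String[] args) {
        double[] identical = {0.5, 0.5, 0.5};
        System.out.println(safeVariance(identical)); // prints the floor, 1.0E-9
    }
}
```

Flooring keeps learning alive, but note (as discussed later in this thread) that an artificially tiny variance still makes the density of out-of-cluster observations underflow.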

GoogleCodeExporter commented on July 26, 2024
I have the same problem with "Variance must be positive" too. In the case of
the k-means clustering algorithm I found the problem arises when the clustering
produces a cluster with a single element (obviously the variance of a single
element is zero). A solution may be to prevent the clustering algorithm from
producing such a cluster; I'm trying to adjust the source code accordingly.
But I found the same problem with the Baum-Welch algorithm too, as the
following exception trace shows:

Exception in thread "main" java.lang.IllegalArgumentException: Variance must be positive
    at be.ac.ulg.montefiore.run.distributions.GaussianDistribution.<init>(GaussianDistribution.java:59)
    at be.ac.ulg.montefiore.run.jahmm.OpdfGaussian.fit(OpdfGaussian.java:139)
    at be.ac.ulg.montefiore.run.jahmm.learn.BaumWelchLearner.iterate(BaumWelchLearner.java:139)
    at be.ac.ulg.montefiore.run.jahmm.learn.BaumWelchLearner.learn(BaumWelchLearner.java:172)

I haven't yet investigated how this Baum-Welch problem arises, but I'm going
to, because I need this library for some experiments. Maybe I'll post a reply
if I find a solution.

Have you solved the problem in some manner in the meantime?

Kind regards,
Simone.

Original comment by [email protected] on 21 Sep 2010 at 3:31


GoogleCodeExporter commented on July 26, 2024
I solved the problem for k-means clustering. I modified the clustering
algorithm so that it cannot create a cluster with a single element. I also had
to modify the initial check on the number of elements versus the number of
clusters: there must now be at least 2*k elements (so that each cluster can
contain at least 2 elements). For now it's an additional check after the normal
clustering: if the check finds a cluster with only one element, it
redistributes by taking an element from a nearby cluster. If anyone is
interested in this modification, they can contact me.

Simone.

Original comment by [email protected] on 22 Sep 2010 at 6:05

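Simone's post-clustering fix can be sketched roughly as follows. This is a hypothetical reconstruction (his actual patch to jahmm's KMeansCalculator is not shown in the thread); the flat list-of-lists cluster representation and the "take the closest point from a cluster that can spare one" rule are assumptions:

```java
import java.util.List;

// Hypothetical sketch of the post-clustering repair step described above:
// after normal k-means, move a point from a large neighbouring cluster
// into any singleton cluster, so every cluster ends with >= 2 elements.
public class SingletonFix {
    static void fixSingletons(List<List<Double>> clusters) {
        for (List<Double> c : clusters) {
            if (c.size() != 1) continue;      // only repair singleton clusters
            double lone = c.get(0);
            List<Double> bestDonor = null;
            double bestDist = Double.POSITIVE_INFINITY;
            int bestIdx = -1;
            // Find the closest point in any cluster that can spare an element
            // (donors must keep at least 2 elements after giving one away).
            for (List<Double> d : clusters) {
                if (d == c || d.size() <= 2) continue;
                for (int i = 0; i < d.size(); i++) {
                    double dist = Math.abs(d.get(i) - lone);
                    if (dist < bestDist) {
                        bestDist = dist;
                        bestDonor = d;
                        bestIdx = i;
                    }
                }
            }
            if (bestDonor != null) c.add(bestDonor.remove(bestIdx));
        }
    }
}
```

The 2*k-elements precondition Simone mentions is what guarantees a donor with more than two elements exists whenever a singleton does.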

GoogleCodeExporter commented on July 26, 2024
I think your way of avoiding this problem is partly pretty good. The thing is:
depending on what you observe, it can more or less likely happen that a cluster
is filled with only identical elements (maybe because one observation appears
again and again). Then you still have no variance, even when there is more than
one element in the cluster. Even if it's unlikely in your problem that a
cluster is filled with identical elements, it can still happen at some point
and the algorithm crashes.

Another problem: say you learn clusters with a VERY low variance. A bit later
you find a new observation and want to learn it into your existing HMM. Because
of the low variance in your clusters (and also due to rounding on your PC, when
a low density for an observation near a cluster drifts to zero), the algorithm
figures out that your new observation doesn't fit any cluster. Now you end up
with the next problem: learning doesn't work and aborts. (I think it ends up in
NaN because of a division by zero.)

So there are several problems that follow from, or are related to, this one. It
is a kind of numerical problem which the k-means algorithm simply can't deal
with. I think there are algorithms that could deal with it, but they are not
implemented.

These problems appear more often when similar observations occur regularly,
which can happen in most systems.

Original comment by [email protected] on 1 Oct 2010 at 5:25

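The failure mode Ben describes (tiny variance, so the density of any observation outside the cluster underflows to zero, and a later normalisation divides zero by zero) can be reproduced directly. A minimal sketch, using a plain Gaussian density rather than jahmm's OpdfGaussian:

```java
// Illustration of the NaN failure described above: with a very small
// variance, the Gaussian density of an observation even slightly outside
// the cluster underflows to 0.0, and any normalisation step that divides
// by a sum of such densities produces NaN (0.0 / 0.0).
public class UnderflowDemo {
    static double gaussianPdf(double x, double mean, double variance) {
        double diff = x - mean;
        return Math.exp(-diff * diff / (2 * variance))
                / Math.sqrt(2 * Math.PI * variance);
    }

    public static void main(String[] args) {
        // Observation 1.0 is far outside a cluster at 0.0 with variance 1e-12.
        double p = gaussianPdf(1.0, 0.0, 1e-12);
        System.out.println(p);           // 0.0 (exp underflows)
        System.out.println(p / (p + p)); // NaN: 0.0 / 0.0
    }
}
```

Working with log-densities instead of raw densities is the usual way around this kind of underflow.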

GoogleCodeExporter commented on July 26, 2024
Yes, my solution is not valid in all cases.
I adjusted it for my purposes: my observations were real numbers from "real
world" measurements, so I knew it was very unlikely to obtain the same value 2
or more times (in fact I never found such observations, while I often found
many single-observation clusters).
For the "no fit" problem, I'm not sure I have understood you: the clustering
algorithm takes all the data and tries to fit it into clusters and then, based
on the clustering, produces an HMM. The clustering algorithm searches for the
nearest cluster when it processes a new point, so variance shouldn't matter.
I apologize for my English.

Simone.

Original comment by [email protected] on 1 Oct 2010 at 6:04


GoogleCodeExporter commented on July 26, 2024
No, the clustering algorithm doesn't. I was looking a bit ahead: if you try to
learn a new sequence with the Baum-Welch algorithm into your existing
k-means-learnt HMM, and your states have very little variance, at that point
the new observation cannot be learnt: it ends up in a NaN error because the
observation's density rounds to zero (because of the small variance...). So it
looks as if the new observation absolutely doesn't fit your HMM.

Is that coherent?

Original comment by [email protected] on 1 Oct 2010 at 9:35


GoogleCodeExporter commented on July 26, 2024
If anyone is interested in my solution to the "Variance must be positive"
problem when it comes from the k-means clustering algorithm (when it creates
single-element clusters), I attach my modified KMeansCalculator class here.

Simone.

Original comment by [email protected] on 16 Jan 2011 at 10:04

