
Comments (7)

GoogleCodeExporter commented on July 26, 2024
I guess this is a problem close to the one I have at the moment. Maybe your
values end up with a very low variance that is almost 0. Due to rounding
inaccuracy (because the number is so low: 0.00000...1) it is interpreted as
zero and throws the exception.

In my case I'm trying to remove the if-clause that throws the exception. I
can't tell for sure whether that works for your case (I can't even tell for
mine yet... I still have to check the results for correctness). But I couldn't
find any other way, since the collected data (vectors, in my case) doesn't give
information about the resulting variances directly. In my case I would never
know at the beginning how my data is combined and in which matrices it ends up.
The covariance shouldn't be <= 0 anyway in my case... (remember: I'm working
with vectors of reals!)

Maybe you can give it a try with your usage and tell us your results.

kind regards,
Ben

Original comment by [email protected] on 27 May 2010 at 10:19

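Ben's diagnosis above (a computed variance that underflows to nearly zero and then trips the positivity check) can be sketched as follows. This is a hypothetical illustration, not jahmm's actual code; the `MIN_VARIANCE` floor and its value are assumptions to be tuned per data scale:

```java
// Hypothetical sketch (not jahmm's code): floor a computed variance so
// that rounding toward zero no longer triggers the
// "Variance must be positive" check in the Gaussian constructor.
public class VarianceFloor {
    static final double MIN_VARIANCE = 1e-9; // assumed floor, tune per data scale

    static double safeVariance(double[] values) {
        double mean = 0.0;
        for (double v : values) mean += v;
        mean /= values.length;

        double var = 0.0;
        for (double v : values) var += (v - mean) * (v - mean);
        var /= values.length;

        // If all values are (nearly) identical, var underflows toward 0;
        // clamp it instead of letting the constructor throw.
        return Math.max(var, MIN_VARIANCE);
    }

    public static void main(String[] args) {
        double[] identical = {0.5, 0.5, 0.5};
        System.out.println(safeVariance(identical)); // prints the floor, 1.0E-9
    }
}
```

Flooring keeps learning alive, but note (as discussed later in this thread) that an artificially tiny variance still makes the density of out-of-cluster observations underflow.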

GoogleCodeExporter commented on July 26, 2024
I have the same problem with "Variance must be positive" too. In the case of
the k-means clustering algorithm I found the problem arises when the clustering
produces a cluster with a single element (obviously the variance of a single
element is zero). A solution may be to prevent the clustering algorithm from
producing such a cluster; I'm trying to adjust the source code accordingly.
But I found the same problem with the Baum-Welch algorithm too, as the
following exception trace shows:

Exception in thread "main" java.lang.IllegalArgumentException: Variance must be positive
    at be.ac.ulg.montefiore.run.distributions.GaussianDistribution.<init>(GaussianDistribution.java:59)
    at be.ac.ulg.montefiore.run.jahmm.OpdfGaussian.fit(OpdfGaussian.java:139)
    at be.ac.ulg.montefiore.run.jahmm.learn.BaumWelchLearner.iterate(BaumWelchLearner.java:139)
    at be.ac.ulg.montefiore.run.jahmm.learn.BaumWelchLearner.learn(BaumWelchLearner.java:172)

I haven't yet investigated how this Baum-Welch problem arises, but I'm going
to, because I need this library for some experiments. Maybe I'll post a reply
if I find a solution.

Have you solved the problem in some manner in the meantime?

Kind regards,
Simone.

Original comment by [email protected] on 21 Sep 2010 at 3:31


GoogleCodeExporter commented on July 26, 2024
I solved the problem for k-means clustering. I modified the clustering
algorithm so that it cannot create a cluster with a single element. I also had
to modify the initial check on the number of elements versus the number of
clusters: there must now be at least 2*k elements (so that each cluster can
contain at least 2 elements). For now it's an additional check after the normal
clustering: if the check finds a cluster with only one element, it
redistributes by taking an element from a nearby cluster. If anyone is
interested in this modification, they can contact me.

Simone.

Original comment by [email protected] on 22 Sep 2010 at 6:05

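Simone's post-clustering fix can be sketched roughly as follows. This is a hypothetical reconstruction (his actual patch to jahmm's KMeansCalculator is not shown in the thread); the flat list-of-lists cluster representation and the "take the closest point from a cluster that can spare one" rule are assumptions:

```java
import java.util.List;

// Hypothetical sketch of the post-clustering repair step described above:
// after normal k-means, move a point from a large neighbouring cluster
// into any singleton cluster, so every cluster ends with >= 2 elements.
public class SingletonFix {
    static void fixSingletons(List<List<Double>> clusters) {
        for (List<Double> c : clusters) {
            if (c.size() != 1) continue;      // only repair singleton clusters
            double lone = c.get(0);
            List<Double> bestDonor = null;
            double bestDist = Double.POSITIVE_INFINITY;
            int bestIdx = -1;
            // Find the closest point in any cluster that can spare an element
            // (donors must keep at least 2 elements after giving one away).
            for (List<Double> d : clusters) {
                if (d == c || d.size() <= 2) continue;
                for (int i = 0; i < d.size(); i++) {
                    double dist = Math.abs(d.get(i) - lone);
                    if (dist < bestDist) {
                        bestDist = dist;
                        bestDonor = d;
                        bestIdx = i;
                    }
                }
            }
            if (bestDonor != null) c.add(bestDonor.remove(bestIdx));
        }
    }
}
```

The 2*k-elements precondition Simone mentions is what guarantees a donor with more than two elements exists whenever a singleton does.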

GoogleCodeExporter commented on July 26, 2024
I think your way of avoiding this problem is partly pretty good. The thing is:
depending on what you observe, it can more or less likely happen that a cluster
is filled with only identical elements (maybe because one observation appears
again and again). Then you still have no variance, even when there is more than
one element in the cluster. Even if it's unlikely in your problem that a
cluster is filled with identical elements, it can still happen at some point
and the algorithm crashes.

Another problem: say you learn clusters with a VERY low variance. A bit later
you find a new observation and want to learn it into your existing HMM. Because
of the low variance in your clusters (and also due to rounding on your PC, when
a low density for an observation near a cluster drifts to zero), the algorithm
figures out that your new observation doesn't fit any cluster. Now you end up
with the next problem: learning doesn't work and aborts. (I think it ends up in
NaN because of a division by zero.)

So there are several problems that follow from, or are related to, this one. It
is a kind of numerical problem which the k-means algorithm simply can't deal
with. I think there are algorithms that could deal with it, but they are not
implemented.

These problems appear more often when similar observations occur regularly,
which can happen in most systems.

Original comment by [email protected] on 1 Oct 2010 at 5:25

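The failure mode Ben describes (tiny variance, so the density of any observation outside the cluster underflows to zero, and a later normalisation divides zero by zero) can be reproduced directly. A minimal sketch, using a plain Gaussian density rather than jahmm's OpdfGaussian:

```java
// Illustration of the NaN failure described above: with a very small
// variance, the Gaussian density of an observation even slightly outside
// the cluster underflows to 0.0, and any normalisation step that divides
// by a sum of such densities produces NaN (0.0 / 0.0).
public class UnderflowDemo {
    static double gaussianPdf(double x, double mean, double variance) {
        double diff = x - mean;
        return Math.exp(-diff * diff / (2 * variance))
                / Math.sqrt(2 * Math.PI * variance);
    }

    public static void main(String[] args) {
        // Observation 1.0 is far outside a cluster at 0.0 with variance 1e-12.
        double p = gaussianPdf(1.0, 0.0, 1e-12);
        System.out.println(p);           // 0.0 (exp underflows)
        System.out.println(p / (p + p)); // NaN: 0.0 / 0.0
    }
}
```

Working with log-densities instead of raw densities is the usual way around this kind of underflow.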

GoogleCodeExporter commented on July 26, 2024
Yes, my solution is not valid in all cases.
I adjusted it for my purposes: my observations were real numbers from "real
world" measurements, so I knew it was very unlikely to obtain the same value 2
or more times (in fact I never found such observations, while I often found
many single-observation clusters).
For the "no fit" problem, I'm not sure I have understood you: the clustering
algorithm takes all the data and tries to fit it into clusters and then, based
on the clustering, produces an HMM. The clustering algorithm searches for the
nearest cluster when it processes a new point, so variance shouldn't matter.
I apologize for my English.

Simone.

Original comment by [email protected] on 1 Oct 2010 at 6:04


GoogleCodeExporter commented on July 26, 2024
No, the clustering algorithm doesn't. I was looking a bit ahead: if you try to
learn a new sequence with the Baum-Welch algorithm into your existing
k-means-learnt HMM, and your states have very little variance, at that point
the new observation cannot be learnt: it ends up in a NaN error because the
observation's density rounds to zero (because of the small variance...). So it
looks as if the new observation absolutely doesn't fit your HMM.

Is that coherent?

Original comment by [email protected] on 1 Oct 2010 at 9:35


GoogleCodeExporter commented on July 26, 2024
If anyone is interested in my solution to the "Variance must be positive"
problem when it comes from the k-means clustering algorithm (when it creates
single-element clusters), I attach my modified KMeansCalculator class here.

Simone.

Original comment by [email protected] on 16 Jan 2011 at 10:04

