taulanti / randomforest-matlab
Automatically exported from code.google.com/p/randomforest-matlab
Hi all,
I'm using the RF toolbox for supervised classification with
active learning (AL). A feature of this method is that a classifier, in this
case an RF classifier, gets retrained iteratively. After roughly 20,000
retrains, Matlab crashes and displays the attached image.
I tested on Windows 7 running Matlab 2008b and 2011a, with similar
results.
Here's the Error Message:
MATLAB crash file:C:\Users\Hyper!\AppData\Local\Temp\matlab_crash_dump.4280
------------------------------------------------------------------------
Segmentation violation detected at Wed Feb 01 01:20:23 2012
------------------------------------------------------------------------
Configuration:
MATLAB Version: 7.7.0.471 (R2008b)
Window System: Version 6.1 (Build 7600)
Processor ID: x86 Family 6 Model 15 Stepping 13, GenuineIntel
Virtual Machine: Java 1.6.0_21 with Sun Microsystems Inc. Java HotSpot(TM) 64-Bit Server VM mixed mode
Default Encoding: windows-1252
Fault Count: 1
Register State:
rax = 0000000000000022 rbx = 0000000017b44700
rcx = 000000ffffffffff rdx = 0000000031e270c0
rbp = 0000000000000001 rsi = 0000000031e20000
rdi = 0000000031e270d0 rsp = 000000000102a650
r8 = 0000000000000000 r9 = 0000031f270c0001
r10 = 0000000000000010 r11 = 000000000000fa12
r12 = 0000000000000000 r13 = 0000000100000001
r14 = ffffffff00007fff r15 = 00000000ffff0000
rip = 0000000077891612 flg = 0000000000010202
Stack Trace:
[ 0] 0000000077891612 ntdll.dll+333330 (RtlFreeHeap+000306)
[ 1] 0000000077742A8A kernel32.dll+141962 (HeapFree+000010)
[ 2] 00000000715BC7BC MSVCR90.dll+313276 (free+000028)
This error was detected while a MEX-file was running. If the MEX-file
is not an official MathWorks function, please examine its source code
for errors. Please consult the External Interfaces Guide for information
on debugging MEX-files.
If it is an official MathWorks function, please
follow these steps to report this problem to The MathWorks so we
have the best chance of correcting it:
The next time MATLAB is launched under typical usage, a dialog box will
open to help you send the error log to The MathWorks. Alternatively, you
can send an e-mail to [email protected] with the following file attached:
AppData\Local\Temp\matlab_crash_dump.4280
If the problem is reproducible, please submit a Service Request via:
http://www.mathworks.com/support/contact_us/ts/help_request_1.html
A technical support engineer might contact you with further information.
Thank you for your help. MATLAB may attempt to recover, but even if recovery
appears successful,
we recommend that you save your workspace and restart MATLAB as soon as
possible.
I appreciate any comments.
Regards
Original issue reported on code.google.com by [email protected]
on 1 Feb 2012 at 4:37
Attachments:
What version of the product are you using? On what operating system?
Windows-Precompiled-RF_MexStandalone-v0.02-\RF_MexStandalone-v0.02-precompiled
If I want to use Stratified Sampling for splitting data into testing and
training sets in this randomforest package, can you please suggest anything?
Original issue reported on code.google.com by [email protected]
on 1 Nov 2012 at 7:07
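On the stratified sampling question above: the split can be done before the data ever reaches the classifier. Below is a minimal Python sketch of the idea, not code from this toolbox. The function name `stratified_split` and its parameters are made up for illustration; the resulting index lists would then be used to pick the rows of the training and test matrices.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_fraction=0.3, seed=0):
    """Return (train_idx, test_idx) so that each class keeps roughly
    the same share in the training and the test set."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)

    train_idx, test_idx = [], []
    for y in sorted(by_class):
        idx = by_class[y]
        rng.shuffle(idx)                        # random order within the class
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])           # first part goes to the test set
        train_idx.extend(idx[n_test:])          # the rest goes to the training set
    return sorted(train_idx), sorted(test_idx)

labels = [1] * 10 + [2] * 20
train_idx, test_idx = stratified_split(labels, test_fraction=0.3)
print(len(train_idx), len(test_idx))  # 21 9
```

Each class is shuffled and cut separately, so a rare class cannot end up entirely in one of the two sets by chance.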
I want to use random forest for biological sequence classification. Is it
possible to use this code for sequence classification? I have some positions as
features. Is it possible to use those features in this classifier?
Original issue reported on code.google.com by [email protected]
on 1 Feb 2012 at 8:08
Hi Abhishek,
I am using the random forest package for my thesis. It's great and simple.
I have a few questions regarding the initial settings. It would be a great help
if you could answer them.
I am trying to settle on the number of trees to use. My professor wants
me to use as few trees as possible without compromising performance. Is there
any standard way to accomplish this?
My second question: what is the role of the seed value in the algorithm? Is it
used to draw the bootstrap sample for growing each tree? Can I change the seed value?
Original issue reported on code.google.com by [email protected]
on 31 Jan 2012 at 3:40
I plan to cite your algorithm. Is there any particular way I should do this?
Original issue reported on code.google.com by [email protected]
on 17 Jun 2010 at 11:55
What steps will reproduce the problem?
1.MatLab + Ubuntu
2.Run compile_linux.m
What is the expected output? What do you see instead?
A wonderful mex file
What version of the product are you using? On what operating system?
Ubuntu, gcc 4.4, matlab 2011
Two problems in my case:
1. mex calls the LaTeX mex instead of the Matlab one => change the makefile to
the one attached, which calls the Matlab mex in the Matlab bin directory (the
path will be different on other platforms)
2. Matlab complains about the gcc version 4.4 so you need to follow the
instructions [http://ubuntuforums.org/showthread.php?t=1413330 here]:
Code:
> sudo mv /usr/bin/gcc /usr/bin/gcc_mybackup
> sudo ln -s /usr/bin/gcc-'_what ever compliant version of gcc_' /usr/bin/gcc
*Compile your mex here*
and restore the gcc
> sudo mv /usr/bin/gcc_mybackup /usr/bin/gcc
Original issue reported on code.google.com by [email protected]
on 5 Mar 2012 at 4:49
Attachments:
Hi,
I want to run this code from C#.
I have no knowledge of Matlab; I am using .NET.
I don't have Matlab installed, but I can install it if necessary (I'm a
student). Can you please tell me what steps I need to take to make
it work?
Thanks
Original issue reported on code.google.com by [email protected]
on 12 Apr 2011 at 4:43
I am using MATLAB 7.5 on Windows 7
I am trying to use the code with 2500 random forests for my training set, and
it runs out of memory.
So my question is: does the code retain ALL the trees during training, or
should it (or does it) retain only the best one so far? The second option
would not cause memory issues.
This relates to the file classRF.cpp, specifically the line:
for(jb = 0; jb < Ntree; jb++) {
I would be grateful for a reply.
Original issue reported on code.google.com by [email protected]
on 7 Mar 2011 at 11:41
Hi, everyone.
When I run the random forest with, say, ntree set to 500, model.treemap
is a matrix of size 501 x 1000,
and most of its elements are zeros. What does this treemap mean?
Thank you.
Original issue reported on code.google.com by [email protected]
on 26 Sep 2012 at 2:34
What steps will reproduce the problem?
1. When I run "tutorial_ClassRF.m" in Matlab 2009a, I get these errors:
Random
Forest\nversion\Windows-Precompiled-RF_MexStandalone-v0.02-
\RF_MexStandalone-v0.02-precompiled\randomforest-
matlab\RF_Class_C\mexClassRF_train.mexw32':
The application failed to start because its configuration is incorrect.
Reinstalling the application may fix this problem.
Error in ==> classRF_train at 347
[nrnodes,ntree,xbestsplit,classwt,cutoff,treemap,nodestatus,nodeclass,bestvar,ndbigtree,mtry ...
Error in ==> tutorial_ClassRF at 39
model = classRF_train(X_trn,Y_trn);
How can I fix this?
Original issue reported on code.google.com by [email protected]
on 25 Feb 2010 at 2:27
How did you calculate the variable importance? My data set is in binary form. I
want to know the importance score of each variable.
Original issue reported on code.google.com by [email protected]
on 29 Apr 2012 at 7:38
Hello Abhishek,
First of all, thank you very much for making your code publicly available for
research. I am using a hierarchical object recognition model that I have
created, which reduces to vectors learned by a classifier. I use an SVM, but I
am experimenting with other classifiers.
I tried therefore to apply Random forests, but in both regression and
classification as soon as the code hits your MEX files (which I compiled
successfully) it crashes to desktop.
I am running Windows 7 and the Matlab version is 2011a at 64bit.
Many Thanks
Aris
Original issue reported on code.google.com by [email protected]
on 22 Nov 2012 at 4:51
A. What steps will reproduce the problem?
1. Use RF to perform a classification task. We run the training program.
2. For training set, the labeling is [0,2]
3. We do not specify testing data when perform training.
B. What is the expected output? What do you see instead?
Expected: successful training.
However, the current version contains a bug that causes a crash.
C. What version of the product are you using? On what operating system?
R54. On Windows 8.
D. Possible cause is because of the following relabeling in classRF_train.m:
if exist('Xtst','var') && exist('Ytst','var')
    if(size(Xtst,1)~=length(Ytst))
        error('Size of Xtst and Ytst dont match');
    end
    fprintf('Test data available\n');
    tst_available=1;
    tst_size = length(Ytst);
else
    Xtst=1;
    Ytst=1;
    tst_available=0;
    tst_size=0;
end
TRUE=1;
FALSE=0;
orig_labels = sort(unique([Y; Ytst]));
Y_new = Y;
Y_new_tst = Ytst;
new_labels = 1:length(orig_labels);
for i=1:length(orig_labels)
    Y_new(find(Y==orig_labels(i)))=Inf;
    Y_new(isinf(Y_new))=new_labels(i);
    Y_new_tst(find(Ytst==orig_labels(i)))=Inf;
    Y_new_tst(isinf(Y_new_tst))=new_labels(i);
end
Y = Y_new;
Ytst = Y_new_tst;
When running the code with the above input:
orig_labels=[0 1 2] and unique(Y_new)=[1 3];
However, after relabeling, unique(Y_new) should be [1 2].
E. Correction is:
Change the line:
Xtst=1;
Ytst=1;
into:
Xtst=X(1,:);
Ytst=Y(1);
Again, thanks for the wonderful software!
Original issue reported on code.google.com by [email protected]
on 18 Sep 2012 at 8:06
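The relabeling problem in the report above can be reproduced outside Matlab. The following Python sketch mirrors the logic of the quoted classRF_train.m snippet (map the union of training and test labels to 1..K); it is an illustration only, and the helper name `relabel` is made up. With training labels {0, 2} and a placeholder test label of 1, the union has three values, so the training labels become {1, 3}; taking the placeholder from the training labels keeps them at {1, 2}.

```python
def relabel(y, y_test):
    """Map original labels to 1..K using the union of training and test labels."""
    orig_labels = sorted(set(y) | set(y_test))
    mapping = {label: k for k, label in enumerate(orig_labels, start=1)}
    return [mapping[value] for value in y]

y = [0, 2, 0, 2]

buggy = relabel(y, [1])      # placeholder test label 1, as in "Ytst=1"
fixed = relabel(y, [y[0]])   # placeholder taken from the training labels, as in "Ytst=Y(1)"

print(sorted(set(buggy)))  # [1, 3]
print(sorted(set(fixed)))  # [1, 2]
```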
Hi Abhishek,
I am trying to extract the exact bootstrap sample used in each tree.
The return value of inbag tells me which samples were in bag for a certain
tree. However, it does not tell me how often each sample was selected.
Is there a way to find this out?
I would need this to be able to reproduce the gini impurity value of each node
of a tree.
Thank you for your answer.
Johannes
Original issue reported on code.google.com by [email protected]
on 11 May 2012 at 10:57
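To illustrate why the question above matters: in-bag membership alone is not enough to recompute a node's Gini impurity, because a sample drawn several times counts several times. Here is a small Python sketch, assuming the per-sample multiplicities were available (the `counts` argument is hypothetical; the report above says the toolbox does not return them).

```python
from collections import Counter

def weighted_gini(labels, counts):
    """Gini impurity where sample i appears counts[i] times in the bootstrap sample."""
    weight = Counter()
    for y, c in zip(labels, counts):
        weight[y] += c
    total = sum(weight.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((w / total) ** 2 for w in weight.values())

labels = [1, 1, 2, 2]
print(weighted_gini(labels, [1, 1, 1, 1]))  # 0.5   (every sample drawn once)
print(weighted_gini(labels, [3, 0, 1, 0]))  # 0.375 (first sample drawn three times)
```

The same four samples give different impurities depending on how often each one was drawn.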
What steps will reproduce the problem?
1. With large datasets, I get an out of memory error, is there any fix for
this in Matlab?
Original issue reported on code.google.com by [email protected]
on 8 Apr 2012 at 5:10
In the function classRF_train(X,Y,ntree,mtry, extra_options), what are X and Y?
According to the readme file, they are X: data matrix, Y: target values. Could
you please explain their individual roles more clearly?
As far as I understand, the features are taken as input for xtrain and xtest,
but what about ytrain and ytest? What should the input be there? Is
it some kind of index? Please correct me if I am wrong.
Also, please tell me when to use RF_Class_C and when to use RF_Reg_C, with some examples.
Thank you.
Original issue reported on code.google.com by [email protected]
on 7 Mar 2012 at 3:54
Hello, I am now studying random forest and bagging. These two methods are
similar, so I would like to know the differences between them. Thanks.
Original issue reported on code.google.com by [email protected]
on 3 Sep 2011 at 12:59
What steps will reproduce the problem?
1. attempt to train a RF with a high-dimensional dataset (34
1300-dimensional vectors), using 101 trees and mtry=200 features:
myRF = classRF_train(foo_vecs(2:35,1:1300),foo_classLabels(2:35,:),101,200);
foo_vecs is a 36 x 4005 matrix of doubles
foo_classLabels is a 36 x 1 vector of doubles (-1,+1)
(see attached file)
What is the expected output? What do you see instead?
Expected: a trained RF.
Instead: a segmentation violation, with stack trace:
[0] mexClassRF_train.mexmaci64:makeA(double*, int, int, int*, int*,
int*)~ + 151 bytes
[1] mexClassRF_train.mexmaci64:classRF(double*, int*, int*, int*, int*,
int*, int*, int*, int*, int*, int*, int*, double*, double*, int*, int*,
int*, double*, double*, double*, double*, int*, int*, int*, int*, int*,
int*, double*, double*, int*, double*, int*, int*, double*, int*, int,
double*, double*, int*)~ + 2673 bytes
[2] mexClassRF_train.mexmaci64:mexFunction~ + 3192 bytes
... more stuff
What version of the product are you using? On what operating system?
Version svn-v8? (0.02), MacOSX 10.6.2, Matlab 7.9.0.529 (R2009b) 64 bits,
Intel Core 2 Duo (x86 Family 6 Model 7 Stepping 10). Mex file compiled from
source.
Note: the mex file works fine until the training set reaches about 34 x
1200, thereafter crashes. Could it be a memory allocation issue?
Original issue reported on code.google.com by [email protected]
on 26 Feb 2010 at 4:22
Attachments:
Hi,
first of all thank you for enabling to use RF in Matlab.
This is nothing big, but initially I was a little bit confused by the size of
the returned model.ndbigtree, which is [nrnodes x ntrees].
I found a description by Andy Liaw stating that it should be a vector of
size ntree, containing the number of nodes for each tree.
https://stat.ethz.ch/pipermail/r-help/2003-April/032256.html
I suppose changing the mex_ClassificationRF_train.cpp line 114 might solve that
problem:
plhs[9] = mxCreateNumericMatrix(1, nt, mxINT32_CLASS, mxREAL);
You also might want to consider adding the above-mentioned descriptions of the
output variables to your .m file. I had a hard time finding out what
these variables hold.
Regards,
Johannes
Original issue reported on code.google.com by [email protected]
on 24 May 2011 at 2:39
What steps will reproduce the problem?
I am using the randomforest on 64bit Linux machine with Matlab version
7.9.0.529 (R2009b) through SSH Secure Shell Version 3.2.9.
Everything is fine, but after randomforest finishes its job, Matlab just
hangs: it stops responding.
I looked at the process, but it seems Matlab is just stuck but the process
is still alive, but I cannot do anything else, so I have to disconnect the
terminal and log in again.
Once I disconnected the terminal then Matlab process is killed and there is
no dump file.
Original issue reported on code.google.com by [email protected]
on 21 Apr 2010 at 9:09
Hi
Could you please let me know how nodesize affects the classification
(regression) result of RF?
A higher nodesize does not necessarily mean a more accurate result,
correct? How can we determine the nodesize?
You mentioned some hints for the number of trees and the choice of mtry, so I
would also like to know how to choose a reasonable mtry for classification (regression).
Thanks,
Saleh
Original issue reported on code.google.com by [email protected]
on 15 May 2012 at 11:02
Hi all,
I am doing some project work where I need to use a random forest for voting.
That is, every tree would vote for the desired feature, and
the best feature is taken into account at the end.
How can I use this random forest for that purpose? Would the given code help
me do so? If not, how should I approach it?
Kindly reply to guide me.
Thank you
Original issue reported on code.google.com by [email protected]
on 1 Feb 2012 at 7:04
Hello,
first of all, thank you very much for this code, I've been using it more and
more for various research projects and will hopefully soon be able to cite it
in a paper!
I was wondering if there was a way to add additional training examples to a
previously trained RF classifier (using the same set of features, of course). I
am interested in creating an interactive classification tool and being able to
add additional examples without having to re-train the whole classifier would
be _very_ useful!
I haven't investigated the fundamental aspect of random forests yet so it might
be obvious that it is impossible but I thought it would be easier to ask before
trying to figure it out by myself.
Thanks again for this code!
Regards,
Nicolas
Original issue reported on code.google.com by [email protected]
on 7 Sep 2012 at 3:59
Hi abhirana,
Thanks for your nice code.
I am not sure how you treat categorical features. If my dataset contains some
categorical features, how can I convert them into numerical
ones that your package can use?
Kindly guide me, please.
Original issue reported on code.google.com by [email protected]
on 16 Nov 2012 at 6:27
When I run the tutorial 'tutorial_ClassRF.m', I get the error:
??? Undefined function or method 'mexClassRF_train' for input arguments of
type 'int32'.
Error in ==> classRF_train at 347
[nrnodes,ntree,xbestsplit,classwt,cutoff,treemap,nodestatus,nodeclass,bestvar,ndbigtree,mtry
...
I am running the student version of Matlab 7.4.0.287 (R2007a) on a MacBook
Pro with an Intel Core 2 Duo (64 bits). I downloaded
RF_MexStandalone-v0.02.zip and also MacOS_precompiled-WITHOUT_SOURCE-v0.02.
As directed I copied the files from the '2009b 64-bit' folder from
MacOS_precompiled-WITHOUT_SOURCE-v0.02 into the 'RF_Class_C' and the
'RF_Reg_C' folders produced from RF_MexStandalone-v0.02.zip. I added all
the folders to my path. Then when I run the tutorial file I get the error
above.
Based on the other email I saw concerning this same issue, I guess this is
some sort of compiler issue. Do I need to recompile everything, i.e.
not use the precompiled files?
Thank you for your help!
Corinne
Original issue reported on code.google.com by [email protected]
on 17 May 2010 at 12:12
I downloaded MacOS_precompile_WITHOUT_SOURCE_v0.02.tar and
tried it. I copied mexClassRF_predict.mexmaci64 and
mexClassRF_train.mexmaci64 to the right location and ran it. However, it
produced the following message:
Too less/many parameters: You supplied 15
??? One or more output arguments not assigned during call to "mexClassRF_train".
Error in ==> classRF_train at 353
[nrnodes,ntree,xbestsplit,classwt,cutoff,treemap,nodestatus,nodeclass,bestvar,ndbigtree,mtry
...
I have not changed anything in this package. Could anyone help me fix
this issue?
What version of the product are you using? On what operating system?
My MATLAB is R2009b(64-bit), MACi64 and my system is MAC OS X 10.6.2.
Thanks a lot
Original issue reported on code.google.com by [email protected]
on 16 Feb 2010 at 1:48
Great work on the random forest implementation. Coming from the R version,
this was an easy adjustment. I have been using it effectively on matrices of
continuous data, but how does it handle categorical data? I can't pass in an
array of strings, nor can I assign integer values to categories because I don't
want them to be treated as continuous. Any suggestions?
Original issue reported on code.google.com by [email protected]
on 23 Aug 2012 at 7:06
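On the categorical-data questions in this tracker: if the classifier only ever sees a numeric matrix, one common workaround is to expand each categorical column into 0/1 indicator columns (one-hot encoding), so that no artificial ordering is imposed on the categories. The Python sketch below shows the encoding itself; it does not describe how this toolbox treats categories internally, and the function name `one_hot` is made up for illustration.

```python
def one_hot(column):
    """Expand a list of category values into 0/1 indicator rows.

    Returns the sorted category list (one indicator column per category)
    and one row of indicators per input value."""
    categories = sorted(set(column))
    rows = [[1 if value == c else 0 for c in categories] for value in column]
    return categories, rows

categories, rows = one_hot(["red", "green", "red", "blue"])
print(categories)  # ['blue', 'green', 'red']
print(rows)        # [[0, 0, 1], [0, 1, 0], [0, 0, 1], [1, 0, 0]]
```

The indicator columns are appended to the continuous features. The cost is one extra column per category, which can matter for features with many levels.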
Hello,
Thanks for submitting this implementation which is great.
I'm wondering in this implementation how do you determine the best
cutting-point at each node given a randomly selected attribute?
Does it search through all possible cutting points and use the one with best
score in certain metric?
Thanks
Original issue reported on code.google.com by [email protected]
on 30 Mar 2012 at 11:34
This is a great MATLAB port of Random Forest. Thank you very much.
However, I found that the memory allocated to store the RF (for classification)
model is used very inefficiently. Specifically, many unnecessary
elements are stored in treemap, nodestatus, nodeclass, bestvar, xbestsplit, and
ndbigtree.
I will take twonorm.mat as an example for illustration, where only one
tree is trained for simplicity.
>> load ./data/twonorm.mat
>> model = classRF_train(inputs', outputs, 1);
The dimensions of the variables in the model are listed below (on my computer
with 64-bit Windows and MATLAB 2011b):
Name         Value
nrnodes      601
treemap      <601*2 int32>
nodestatus   <601*1 int32>
nodeclass    <601*1 int32>
bestvar      <601*2 int32>
xbestsplit   <601*2 int32>
ndbigtree    <601*2 int32>
Actually, ndbigtree denotes the number of nodes in each tree, in which only the
first #model.ntree (here only 1) elements are useful and the rest are all zero.
In my output, ndbigtree(1) is
63. I checked on the predictClassTree() function in classTree.cpp to see how
the prediction is made based on the tree hierarchy. I found that only the first
#model.ntree (here only 1) elements in nodestatus, nodeclass, bestvar and
xbestsplit are useful. The index of the left and right child of the kth node is
treemap[2*k]-1 and treemap[2*k+1]-1, respectively.
I am not sure why there are only 63 nodes in the tree, but we have to store as
many as 601 (#nrnodes) elements in, say, nodestatus, and 2*63 elements in
treemap. Storing an RF with many trees trained on a large number of samples
will cost a great deal of extra memory. Is it possible to improve the memory
allocation?
Original issue reported on code.google.com by [email protected]
on 7 Apr 2012 at 10:42
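The flat-array layout described in the report above can be sketched in a few lines. This Python example follows the indexing quoted there (children of node k at treemap[2*k]-1 and treemap[2*k+1]-1) and treats a node status of -1 as terminal, as another report in this tracker does. The split rule (go left when the feature value is <= the split value) and the 1-based feature index in bestvar are assumptions made for illustration, not confirmed from classTree.cpp.

```python
TERMINAL = -1  # node status of a leaf; padded, unused slots hold 0

def predict_one(x, treemap, nodestatus, bestvar, xbestsplit, nodeclass):
    """Walk one tree stored in flat arrays and return the leaf's class."""
    k = 0
    while nodestatus[k] != TERMINAL:
        feature = bestvar[k] - 1           # assumption: 1-based feature index
        if x[feature] <= xbestsplit[k]:    # assumption: "<=" goes to the left child
            k = treemap[2 * k] - 1
        else:
            k = treemap[2 * k + 1] - 1
    return nodeclass[k]

# A 3-node tree stored in arrays padded to 6 node slots.
treemap    = [2, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
nodestatus = [1, -1, -1, 0, 0, 0]
bestvar    = [1, 0, 0, 0, 0, 0]
xbestsplit = [0.5, 0, 0, 0, 0, 0]
nodeclass  = [0, 1, 2, 0, 0, 0]

print(predict_one([0.2], treemap, nodestatus, bestvar, xbestsplit, nodeclass))  # 1
print(predict_one([0.9], treemap, nodestatus, bestvar, xbestsplit, nodeclass))  # 2
```

Only the first three slots are ever visited; the remaining zero-filled slots are the padding the report asks about.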
Initialize random number generator with srand() in classRF.cpp, otherwise
seedMT is seeded deterministically.
#include <time.h>
srand ( time(NULL) );
prior to:
seedMT(2*rand()+1);
Original issue reported on code.google.com by [email protected]
on 5 Jan 2010 at 7:54
What steps will reproduce the problem?
1. Just run command 'make', and then it will show that 'mex cannot be found'.
2. If I modify the Makefile by replacing 'mex' with 'mkoctfile --mex' and run
'make' again, there are compile errors in the source code.
What is the expected output? What do you see instead?
I cannot install it.
What version of the product are you using? On what operating system?
The latest version, on Ubuntu 12.04
Original issue reported on code.google.com by [email protected]
on 19 Oct 2012 at 3:52
What steps will reproduce the problem?
1. I want to avoid the "out of memory" error in Matlab, so I want to call this
code from C++ or C#. What would you advise?
Original issue reported on code.google.com by [email protected]
on 17 Mar 2010 at 1:59
What steps will reproduce the problem?
When I'm trying to compile the code I get the following error:
Compiling rfsub.f (fortran subroutines)
gfortran -O2 -fpic -march=native -c src/rfsub.f -o rfsub.o
src/rfsub.f:0: error: bad value (native) for -march= switch
src/rfsub.f:0: error: bad value (native) for -mtune= switch
Have you ever come across the same issue?
Any help is appreciated.
MJ
Original issue reported on code.google.com by [email protected]
on 5 Jun 2012 at 11:07
classRF_train([1 0; 1 0], [1 2]', 10, 2)
hangs my machine with probability 1.
Windows 7, Matlab 7.12.0.
It would be better to add some error handling in this case.
Thanks!
Original issue reported on code.google.com by [email protected]
on 24 Oct 2011 at 4:59
I need to know the basic steps required to use random forest in MATLAB.
Complete documentation of the programs given here (in the standalone package)
would be much appreciated, so that any novice could understand the capability
of random forest and how much it can help in MATLAB.
Original issue reported on code.google.com by [email protected]
on 21 Dec 2011 at 7:46
Hi
I have a question regarding the output of RF classification. You use majority
vote to assign a class label to a query. I was wondering is it possible to have
the probability that the query belongs to a class rather than the class label?
Let me give you an example to better convey my meaning. Assume that we have
three classes A, B, and C I'd like to see the probability that a query x
belongs to class A, the probability that x belongs to B and the probability
that x belongs to C. If each tree produces the class probabilities for query x,
we can average the class probabilities to have the total class probabilities.
If the class label that your code assigns to the x is A it is reasonable to see
higher probability for x belonging to A than other two classes.
Thanks,
Saleh
Original issue reported on code.google.com by [email protected]
on 11 May 2012 at 2:34
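One simple reading of the probability request above is the fraction of trees voting for each class. The Python sketch below computes that from a list of per-tree predictions for a single query. How to obtain per-tree votes from this toolbox is not shown here and is an assumption; the function name `vote_fractions` is made up. Note that this is a vote fraction, not an average of per-tree class probabilities, which would need the class distribution in each leaf.

```python
def vote_fractions(tree_predictions, classes):
    """tree_predictions[t] is the label predicted by tree t for one query."""
    n = len(tree_predictions)
    return {c: sum(1 for p in tree_predictions if p == c) / n for c in classes}

votes = ["A", "A", "B", "A", "C", "A", "B", "A", "A", "A"]
probs = vote_fractions(votes, ["A", "B", "C"])
print(probs)  # {'A': 0.7, 'B': 0.2, 'C': 0.1}
```

The majority-vote label is the class with the largest fraction, so the two outputs always agree. The same fraction can serve as a rough per-prediction confidence.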
What steps will reproduce the problem?
1. After I train a random forest, how can I save the model so that I can
predict on new data next time?
Original issue reported on code.google.com by [email protected]
on 27 Feb 2010 at 2:29
I call RF training more than 10000 times, consecutively or in parallel. At
some random iteration around 10000 it always fails with a segfault.
Try to execute
parfor i = 1:10000, classRF_train(features, cols, 20, 3); end
to reproduce. I'm not sure if specific input is important, but it failed for
various inputs (which were all quite large). One of them is attached.
The program leaks a bit, so it may look as if no memory is available, but that
is not the case (there are still 5 GB of free memory when it fails). Possibly
a 64-bit issue?
Win7, Matlab 7.12, 12 Gb of RAM
The dump file is attached.
Original issue reported on code.google.com by [email protected]
on 10 Nov 2011 at 5:09
Attachments:
Hello there,
First, thanks for the wonderful code for random forest. I would like to know about calculating a confidence measure for each prediction. Can you tell me a way to measure the confidence of each prediction made by the random forest? Does your code include this functionality? It would be great if you could let us know a way to do it.
Thanks
Original issue reported on code.google.com by gayumahalingam
on 7 Oct 2010 at 2:17
Hi,
First of all many thanks for this wonderful code. I have some questions.
1) Is it possible to set the depth of each tree in random forest?
2) Each node in the tree should have one of two node status values (1, -1):
non-terminal and terminal. But when I run tutorial_ClassRF.m I find a lot of
nodes with a nodestatus of zero. What does this zero mean?
Kindly guide me.
Original issue reported on code.google.com by [email protected]
on 23 Oct 2012 at 11:19
Hi
I'm trying to use RF to do pixel classification for images of size 101*101
pixels.
There are 18 features for each pixel, and the number of classes is
3. Also, my dataset contains 70 images.
Reading Leo Breiman and Adele Cutler website:
http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm
They said: "In random forests, there is no need for cross-validation or a
separate test set to get an unbiased estimate of the test set error. It is
estimated internally, during the run"
I was going to use k-fold cross-validation (with k = 5, building my model on 65
images and testing it on the 5 left-over images) to validate the RF classification
result, which is time-consuming, but it seems I don't need to do that.
However, I noticed that in tutorial_ClassRF.m you split your dataset into
training and test sets and, after building the model, run the model on
the test set. Could you please clarify this? How can I use this property of
random forests via your code?
Best,
Saleh
Original issue reported on code.google.com by [email protected]
on 11 May 2012 at 2:20
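The internal estimate quoted from Breiman and Cutler above is the out-of-bag (OOB) error: each sample is scored only by the trees whose bootstrap sample did not contain it. The Python sketch below shows that bookkeeping under the assumption that per-tree predictions and an in-bag indicator are available; both inputs are hypothetical here, and the code is not how this toolbox computes its OOB figure.

```python
from collections import Counter

def oob_error(y_true, tree_preds, inbag):
    """tree_preds[t][i]: prediction of tree t for sample i.
    inbag[t][i]: True if sample i was in tree t's bootstrap sample."""
    wrong = counted = 0
    for i, truth in enumerate(y_true):
        votes = Counter(
            tree_preds[t][i] for t in range(len(tree_preds)) if not inbag[t][i]
        )
        if not votes:
            continue                      # sample was in bag for every tree
        counted += 1
        if votes.most_common(1)[0][0] != truth:
            wrong += 1
    return wrong / counted if counted else float("nan")

y_true = [1, 2, 1]
tree_preds = [[1, 2, 2], [1, 2, 2], [1, 2, 1]]
inbag = [[True, False, False], [False, True, False], [False, False, True]]
err = oob_error(y_true, tree_preds, inbag)
print(err)  # one of three samples is misclassified by its out-of-bag trees
```

A separate test set, as in the tutorial, is still a reasonable sanity check, for example when samples are not independent (pixels from the same image end up both in bag and out of bag).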
Hi Abhishek Jaiantilal
Thanks for a great code. Well described!
My dataset is a 50x10000 matrix, and I am using classification trees to determine the
variables with the greatest importance. However, RF is known to work as a
black box, so I was wondering:
is there any way to view each tree and include it in a report? Or is the
closest thing your "extra_options.do_trace" option, which outputs:
tree OOB 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48
1: 56.13% 0.00% 0.00% 14.29% -1.#J% -1.#J%100.00% 0.00% 27.27% 72.73% 61.54% 54.55% 71.43% 92.31% 77.78% 90.00% 87.50% 71.43% 70.59% 70.27% 72.41% 71.43% 70.21% 96.30% 96.88% 91.67% 94.12% 94.44% 86.96% 88.24% 90.48% 92.86% 81.82% 84.85% 84.85% 72.62% 51.44% 50.80% 55.32% 56.55% 72.97% 67.01% 50.00% 50.96% 48.45% 27.18% 28.13% 20.45% 27.27%
(as two rows).
The reason I'm asking is that I found this video showing each tree and its
importance, at 4:18:
http://www.youtube.com/watch?v=RE7VO_AB7PI&feature=player_embedded
Thanks again for your program, and writing your citation in another topic.
Regards Thomas
Original issue reported on code.google.com by [email protected]
on 9 Jul 2011 at 2:00
Hi,
I am running Matlab version 7.1.0.246 (R14) Service Pack 3 on a 64-bit machine;
however, Matlab is installed in C:\Program Files (x86), which indicates that
it is a 32-bit installation. I have downloaded the precompiled files, and when
I run tutorial_RegRF.m I get the following error:
Setting to defaults 500 trees and mtry=3
??? Invalid MEX-file
'C:\Users\Igor\Projects\Windows-Precompiled-RF_MexStandalone-v0.02-\RF_MexStandalone-v0.02-precompiled\randomforest-matlab\RF_Reg_C\mexRF_train.mexw32':
The specified procedure could not be found.
.
Error in ==> regRF_train at 283
[ldau,rdau,nodestatus,nrnodes,upper,avnode,...
I have checked that my machine has Microsoft visual C++ 2005 redistributable
installed in the Control Panel.
My version of Matlab is quite old.
Do you think compiling the files myself will solve the issue?
Thank you in advance.
Original issue reported on code.google.com by [email protected]
on 17 Apr 2012 at 4:05
First thank you very much for this wonderful software!
I notice that, for the same number of samples and features, if the only difference is
the type of labels, so that one problem is classification and the other is
regression, constructing the regression forest takes
considerably longer than constructing the classification forest (using the default
parameters for msplit and keeping ntrees the same; we also estimate variable
importance along the way). Is there any reason behind this?
Thanks a lot!
Original issue reported on code.google.com by [email protected]
on 27 Sep 2012 at 8:15
When compiling using the .m script files provided with MS Visual Studio 2010's
cl on a PCWIN64 machine or with g++ on a Unix64 machine, I get several errors
like the following:
src\mex_ClassificationRF_train.cpp(179) : error C2664:
'mxCreateNumericMatrix_730' : cannot convert parameter 4 from 'int' to
'mxComplexity'. Conversion to enumeration type requires an explicit cast
(static_cast, C-style cast or function-style cast)
My understanding is that C++ does not support implicit casting from int to enum
types. I also tried using OPTIMFLAGS="$OPTIMFLAGS /Tc" in mex but the code does
not seem to be C compatible either.
Original issue reported on code.google.com by [email protected]
on 31 Jan 2011 at 12:41
Hi,
I'd like to make an accuracy assessment for each class I'm using.
In many papers I've read that it's possible to compute a confusion matrix, from
which I could calculate my per-class accuracy.
Unfortunately I don't know how to implement the confusion matrix, even though it's
mentioned in the readme.
I'm using v0.02 from RF_MexStandalone-v0.02.zip
It would be awesome if you could help me out, because I badly need it for my BA thesis ;)
greetings
Original issue reported on code.google.com by [email protected]
on 25 May 2011 at 11:39
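For the per-class accuracy question above, a confusion matrix can be built from the true and predicted labels alone, independently of the classifier. The following Python sketch is generic, not code from this toolbox; the function names are made up for illustration. Rows are true classes, columns are predicted classes, and per-class accuracy is the diagonal entry divided by the row sum.

```python
def confusion_matrix(y_true, y_pred, classes):
    """m[i][j] = number of samples of true class i predicted as class j."""
    index = {c: i for i, c in enumerate(classes)}
    m = [[0] * len(classes) for _ in classes]
    for t, p in zip(y_true, y_pred):
        m[index[t]][index[p]] += 1
    return m

def per_class_accuracy(m):
    """Fraction of each true class that was predicted correctly."""
    return [row[i] / sum(row) if sum(row) else float("nan")
            for i, row in enumerate(m)]

y_true = [1, 1, 1, 2, 2, 3]
y_pred = [1, 1, 2, 2, 2, 1]
m = confusion_matrix(y_true, y_pred, classes=[1, 2, 3])
acc = per_class_accuracy(m)
print(m)  # [[2, 1, 0], [0, 2, 0], [1, 0, 0]]
```

Here class 1 is recovered in 2 of 3 cases, class 2 in all cases, and class 3 never.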
What steps will reproduce the problem?
1. After training, the regression model does not include xbestsplit
2. ClassRF_predict gives the error: ??? Reference to non-existent field
'xbestsplit'.
version: Windows-Precompiled-RF_MexStandalone-v0.02-.zip 445
Original issue reported on code.google.com by [email protected]
on 8 Apr 2012 at 4:22
I am trying to run an image database in RF-MATLAB so that it can classify from
a given database, but it gives an error saying it needs two classes for
classification. I want it to run on my database as it runs on the
twonorm database. I have attached the file with the database: the file is
vatsnewrf.m and the database is yale_database_B.mat. Please help me.
Original issue reported on code.google.com by [email protected]
on 29 Apr 2012 at 8:00
Attachments: