keedi / rf-ace
Automatically exported from code.google.com/p/rf-ace
What steps will reproduce the problem?
1. svn update
2. compile
3. run - see below
What is the expected output? What do you see instead?
billwhite@isaac~/src/rf-ace$ bin/rf_ace -I test_5by10_numeric_matrix.arff -i 5
-O foo
-------------------------------------------------------
| RF-ACE version: 0.9.7, December 29th, 2011 |
| Project page: http://code.google.com/p/rf-ace |
| Report bugs: [email protected] |
-------------------------------------------------------
Reading file 'test_5by10_numeric_matrix.arff', please wait... Segmentation fault
billwhite@isaac~/src/rf-ace$
What version of the product are you using? On what operating system?
Mac OS X 10.6
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 10 Jan 2012 at 7:03
compiling with g++:
[rkreisbe@breve ~/rf-ace]$ g++ -v
Using built-in specs.
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man
--infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla
--enable-bootstrap --enable-shared --enable-threads=posix
--enable-checking=release --with-system-zlib --enable-__cxa_atexit
--disable-libunwind-exceptions --enable-gnu-unique-object
--enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk
--disable-dssi --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre
--enable-libgcj-multifile --enable-java-maintainer-mode
--with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libjava-multilib
--with-ppl --with-cloog --with-tune=generic --with-arch_32=i686
--build=x86_64-redhat-linux
Thread model: posix
gcc version 4.4.6 20120305 (Red Hat 4.4.6-4) (GCC)
compile fails with:
g++ -O3 -std=c++0x -Wall -Wextra -pedantic -Isrc/ -lz src/rf_ace.cpp
src/murmurhash3.cpp src/datadefs.cpp src/progress.cpp src/statistics.cpp
src/math.cpp src/stochasticforest.cpp src/rootnode.cpp src/node.cpp
src/treedata.cpp src/utils.cpp src/distributions.cpp src/reader.cpp
src/feature.cpp -pthread -o bin/rf-ace
In file included from src/feature.hpp:10,
from src/feature.cpp:1:
src/datadefs.hpp: In function ‘bool datadefs::isNAN(const T&) [with T =
std::unordered_set<unsigned int, std::hash<unsigned int>,
std::equal_to<unsigned int>, std::allocator<unsigned int> >]’:
src/feature.cpp:144: instantiated from here
src/datadefs.hpp:159: error: no match for ‘operator!=’ in ‘value !=
value’
make: *** [rf-ace] Error 1
Original issue reported on code.google.com by [email protected]
on 21 Mar 2013 at 8:33
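Regarding the compile error above: the log shows isNAN being instantiated with a std::unordered_set, for which the compiler finds no operator!=. One possible way around this, sketched below, is to make the self-comparison apply to floating-point types only. The function here is a hypothetical stand-in, not the actual datadefs::isNAN, and the issue does not show what src/feature.cpp line 144 does with the result.

```cpp
#include <cmath>
#include <type_traits>
#include <unordered_set>

// Hypothetical stand-in for datadefs::isNAN (not the project's code).
// Floating-point values: NaN is the only value that differs from itself.
template <typename T>
typename std::enable_if<std::is_floating_point<T>::value, bool>::type
isNAN(const T& value) {
  return value != value;
}

// All other types (integers, containers such as std::unordered_set, ...)
// have no NaN representation, so this overload returns false and never
// requires operator!= to exist for T.
template <typename T>
typename std::enable_if<!std::is_floating_point<T>::value, bool>::type
isNAN(const T&) {
  return false;
}
```

With this split, a call such as isNAN(someUnorderedSet) compiles and returns false. Whether "not NaN" is the right answer for a set of category codes is a design decision for the maintainers.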
For categorical data predictions already contain confidence intervals, but for
numerical data this feature is missing.
Original issue reported on code.google.com by [email protected]
on 25 Mar 2012 at 4:51
What version of the product are you using? On what operating system?
rf_ace_v1.0.3_*
all operating systems
Please provide any additional information below.
The seed parameter for Random Forests should be exposed so that results can be
reproduced when the trained model is no longer available. This is often the
case when a huge number of models are trained (I'm talking tens of thousands)
and disk space is saved by keeping only the results.
Furthermore, it would be helpful for comparing the performance of RF-ACE to
other Random Forest implementations like the ones in Mahout or Weka.
best regards,
Berni
Original issue reported on code.google.com by [email protected]
on 22 Mar 2012 at 10:51
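To illustrate the seed request above, here is a minimal sketch of how an explicit seed makes the random part of forest training repeatable. The helper name bootstrapIndices is hypothetical and the code is not taken from RF-ACE. It uses std::mt19937, whose output sequence for a given seed is fixed by the C++ standard.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper: draws a bootstrap sample (row indices, with
// replacement) from nSamples rows using an explicitly seeded generator.
std::vector<std::size_t> bootstrapIndices(std::size_t nSamples,
                                          std::uint32_t seed) {
  std::mt19937 rng(seed);
  std::vector<std::size_t> indices(nSamples);
  for (std::size_t i = 0; i < nSamples; ++i) {
    // Modulo is slightly biased but keeps the sketch portable across
    // standard libraries; the loop never runs when nSamples is 0.
    indices[i] = rng() % nSamples;
  }
  return indices;
}
```

Calling bootstrapIndices(100, 42) twice yields the same indices, so only the seed has to be stored alongside the results in order to regrow the same forest later.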
line 40 of treedata.cpp causes segmentation fault
sampleHeaders_.resize(rawMatrix[0].size(),"NO_SAMPLE_ID");
......................^^^^^^^^^^^^^^^^^^^
Original issue reported on code.google.com by [email protected]
on 2 Jan 2012 at 3:10
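The resize call quoted above indexes rawMatrix[0] unconditionally, which is undefined behaviour if rawMatrix has no rows. The report does not say whether that is the actual cause here, so the following is only a sketch under that assumption. The function name and the element type of rawMatrix are hypothetical.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical free function mirroring the quoted resize call, with a
// guard so that an empty matrix produces an error instead of a crash.
std::vector<std::string> defaultSampleHeaders(
    const std::vector<std::vector<std::string> >& rawMatrix) {
  if (rawMatrix.empty()) {
    throw std::runtime_error("input matrix has no rows; was the file parsed?");
  }
  return std::vector<std::string>(rawMatrix[0].size(), "NO_SAMPLE_ID");
}
```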
This would allow the program to become easier to manage on the tree level.
Major implication: parallelizing building of trees across many CPUs/machines
will become easier.
Original issue reported on code.google.com by [email protected]
on 30 Mar 2012 at 3:27
Currently, if one specifies a nonexistent directory in the output string, the
program crashes with a segmentation fault. I'll implement platform-independent
support for creating directories.
Original issue reported on code.google.com by [email protected]
on 1 Jul 2011 at 11:36
Calling
rf-ace --filter -I vector -i 0 -T vector.sub.arff -n 1000 -m 10 -o predictions
on the file attached gives
terminate called after throwing an instance of 'int'
Does rf-ace support the sparse arff format?
What version of the product are you using? On what operating system?
| RF-ACE version: 1.0.6, Aug 17 2012 |
| Compile date: Aug 23 2012, 17:04:14 |
uname -a : Linux lucid-vostro 2.6.35-32-generic-pae #68-Ubuntu SMP Tue Mar 27
18:04:42 UTC 2012 i686 GNU/Linux
Original issue reported on code.google.com by digitalpebble
on 30 Aug 2012 at 2:25
Attachments:
It is not clear whether the predictions on the training data are out of bag
(OOB) predictions, or not.
Original issue reported on code.google.com by [email protected]
on 15 Jan 2012 at 7:37
Currently RF-ACE outputs a GBT predictor when the parameter -F / --forest is
used. However, the format lacks some crucial information and is not very
machine-readable. I think there should be only one format, which would be easy
to read and interpret by both computer and human.
Original issue reported on code.google.com by [email protected]
on 7 Jan 2012 at 11:50
Based on the model built on the attached training data, the test predictions
are all wrong. There must be a problem.
Original issue reported on code.google.com by [email protected]
on 15 Jan 2012 at 7:17
Attachments:
Currently, prediction caching is done per request, after growing the trees, but
it could be done in a single pass during tree growing. StochasticForest will be
responsible for storing the cached predictions and returning them upon request.
This will speed things up, since importance score calculations rely heavily on
train data predictions.
Original issue reported on code.google.com by [email protected]
on 17 Aug 2012 at 10:12
What version of the product are you using? On what operating system?
rf-ace-predict-*.exe
Every OS
Please provide any additional information below.
The seed parameter should only play a role for building a classifier, but not
for prediction. Hence, to avoid confusion it should be removed from both the
interface as well as the command line feedback.
best regards,
Berni
Original issue reported on code.google.com by [email protected]
on 30 Mar 2012 at 7:28
The interface and implementation of datadefs::mode(...) incorrectly handle
multiple values that occur with the same top frequency. In this case, the
underlying implementation selects the first element in the natural key ordering
of an STL std::map, effectively selecting the lowest of the tied values.
This interface should be updated to return a set of values, and its underlying
implementation refactored so that it does not rely on the idiosyncrasies of
max_element and related functions.
Original issue reported on code.google.com by [email protected]
on 17 Aug 2011 at 8:50
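A minimal sketch of the set-returning interface proposed above. The name modes and its signature are hypothetical, not the project's actual API. It counts frequencies in a std::map and then collects every key that reaches the maximum count, so ties are reported explicitly.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <vector>

// Hypothetical replacement for a single-valued mode(): returns every value
// that attains the highest frequency, so ties are reported, not hidden.
template <typename T>
std::set<T> modes(const std::vector<T>& data) {
  std::map<T, std::size_t> freq;
  std::size_t best = 0;
  for (const T& x : data) {
    const std::size_t count = ++freq[x];
    if (count > best) best = count;
  }
  std::set<T> result;
  for (const auto& kv : freq) {
    if (kv.second == best) result.insert(kv.first);
  }
  return result;  // empty for empty input
}
```

For the input {1, 2, 2, 3, 3} this returns the set {2, 3}. How the caller breaks the tie (lowest value, random pick, and so on) then becomes an explicit choice.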
At the moment the logic is spread between the options namespace and the main
program, which adds confusion. Thus, all logic will be lifted over to the main
program eventually.
Original issue reported on code.google.com by [email protected]
on 30 May 2012 at 10:45
It has been reported that feature selection problems with a gigantic number of
features, of which only a tiny fraction are relevant, may prove problematic for
RFs. This can be remedied by biasing the sampling of features towards the more
informative ones, which makes the base learners more accurate while retaining
the diversity of learners in the ensemble. See e.g.
http://bioinformatics.oxfordjournals.org/content/24/18/2010.abstract
http://clopinet.com/fextract-book/
for further information.
Original issue reported on code.google.com by [email protected]
on 7 Jan 2012 at 11:37
What steps will reproduce the problem?
1. Executing bin/rf-ace --filter with -B option
What is the expected output?
A successful execution
What do you see instead?
The program exits (139) with Segmentation fault.
What version of the product are you using? 1.0.7
On what operating system?
CentOS release 6.3 (Final)
Linux 2.6.32-220.7.1.el6.x86_64 x86_64
Please provide any additional information below.
The blacklist text files tested were:
1. A list of feature names (one per row)
2. A tab-delimited line of feature names
3. Same as 1 and 2, but with integer indices of the features.
All tests resulted in the same error.
The exact execution without the -B option completed successfully and
produced the requested outputs.
Original issue reported on code.google.com by [email protected]
on 4 Oct 2012 at 6:33
What steps will reproduce the problem?
1. run the following command for the attached training file
rf_ace_win64 --traindata train.arff --target clas -O yaz.txt
What version of the product are you using? On what operating system?
Version 0.9.8, 64 bit version, on Windows 7 Home Premium (64 bit)
Original issue reported on code.google.com by [email protected]
on 14 Jan 2012 at 12:55
Attachments:
In order to make Randomforest and GBT lighter, a new dynamic construction
process of trees will be introduced. This will also include the introduction of
RootNode that has control over the child Nodes.
Original issue reported on code.google.com by [email protected]
on 28 Jun 2011 at 10:01
The assumption of equal population variances may be one reason why p-values in
some cases are behaving oddly.
Original issue reported on code.google.com by [email protected]
on 30 Mar 2012 at 3:29
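The issue above does not say which test produces the p-values. If it is a two-sample t-test with pooled variance (an assumption), the usual alternative that drops the equal-variance assumption is Welch's t-test. The sketch below computes its statistic and the Welch-Satterthwaite degrees of freedom. All names are hypothetical, and turning t and df into a p-value still requires a Student t distribution function, which is not shown.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct WelchResult {
  double t;   // test statistic
  double df;  // Welch-Satterthwaite degrees of freedom
};

// Mean of a non-empty sample.
double sampleMean(const std::vector<double>& x) {
  double sum = 0.0;
  for (double v : x) sum += v;
  return sum / x.size();
}

// Unbiased sample variance; requires at least two values.
double sampleVariance(const std::vector<double>& x, double mean) {
  double ss = 0.0;
  for (double v : x) ss += (v - mean) * (v - mean);
  return ss / (x.size() - 1);
}

// Welch's unequal-variance t statistic. Both samples need n >= 2 and
// at least one of them must have nonzero variance.
WelchResult welchT(const std::vector<double>& x,
                   const std::vector<double>& y) {
  const double mx = sampleMean(x), my = sampleMean(y);
  const double a = sampleVariance(x, mx) / x.size();  // s_x^2 / n_x
  const double b = sampleVariance(y, my) / y.size();  // s_y^2 / n_y
  WelchResult r;
  r.t = (mx - my) / std::sqrt(a + b);
  r.df = (a + b) * (a + b) /
         (a * a / (x.size() - 1) + b * b / (y.size() - 1));
  return r;
}
```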
This is to make the class responsibilities clearer, and to make Treedata
lighter, as it is now responsible not just for storing data but also for
splitting it. Splitting will be the Node's responsibility in the future.
Original issue reported on code.google.com by [email protected]
on 28 Jun 2011 at 9:58
Hi Timo,
I tried running the biovis feature on the new rf-ace release and got
"No features match the specified target identifier '0'"
while an older release, r169, ran okay.
I will try it with an older TCGA dataset too.
Thanks,
Jake
What steps will reproduce the problem?
1. feature matrix
/proj/ilyalab/Patrick/bioviscontest_dataset_2011_v2/data/rf.input.tsv 7577x500
run rf-ace_r227 (latest as of 07/12)
/proj/ilyalab/TCGA/rf-ace_r227/bin/rf_ace -I
/proj/ilyalab/Patrick/bioviscontest_dataset_2011_v2/data/rf.input.tsv -i 0 -n
500 -m 1000 -p 20 -t 1 -O associations_0.out
---------------------------------------------------------------
| RF-ACE -- efficient feature selection with heterogeneous data |
| |
| Version: RF-ACE v0.5.8, July 8th, 2011 |
| Project page: http://code.google.com/p/rf-ace |
| Contact: [email protected] |
| [email protected] |
| |
| DEVELOPMENT VERSION, BUGS EXIST! |
---------------------------------------------------------------
Reading file
'/proj/ilyalab/Patrick/bioviscontest_dataset_2011_v2/data/rf.input.tsv'
File type is unknown -- defaulting to Annotated Feature Matrix (AFM)
AFM orientation: features as rows
No features match the specified target identifier '0'
run older rf-ace version:
rf-ace_r1169 (symlink)
/proj/ilyalab/TCGA/rf-ace/bin/rf_ace -I
/proj/ilyalab/Patrick/bioviscontest_dataset_2011_v2/data/rf.input.tsv -i 0 -n
500 -m 1000 -p 20 -t 1 -O associations_0.out
---------------------------------------------------------------
| RF-ACE -- efficient feature selection with heterogeneous data |
| |
| Version: RF-ACE v0.3.1, June 24th, 2011 |
| Project page: http://code.google.com/p/rf-ace |
| Contact: [email protected] |
| |
| DEVELOPMENT VERSION, BUGS EXIST! |
---------------------------------------------------------------
Reading file
'/proj/ilyalab/Patrick/bioviscontest_dataset_2011_v2/data/rf.input.tsv'
File type is unknown -- defaulting to Annotated Feature Matrix (AFM)
AFM orientation: features as rows
RF-ACE parameter configuration:
--input = /proj/ilyalab/Patrick/bioviscontest_dataset_2011_v2/data/rf.input.tsv
--nsamples = 500 / 500 (0% missing)
--nfeatures = 7576
--targetidx = 0, header 'C:GENO:chr16:67319257:chr16:67319257:67319257::'
--ntrees = 500
--mtry = 1000
--nodesize = 25
--nperms = 20
--pthresold = 1
--output = associations_0.out
Growing 20 Random Forests (RFs), please wait...
RF 1: 500 nodes (avg. 1 nodes / tree)
RF 2: 500 nodes (avg. 1 nodes / tree)
RF 3: 500 nodes (avg. 1 nodes / tree)
RF 4: 500 nodes (avg. 1 nodes / tree)
RF 5: 500 nodes (avg. 1 nodes / tree)
RF 6: 500 nodes (avg. 1 nodes / tree)
RF 7: 500 nodes (avg. 1 nodes / tree)
RF 8: 500 nodes (avg. 1 nodes / tree)
RF 9: 500 nodes (avg. 1 nodes / tree)
RF 10: 500 nodes (avg. 1 nodes / tree)
RF 11: 500 nodes (avg. 1 nodes / tree)
RF 12: 500 nodes (avg. 1 nodes / tree)
RF 13: 500 nodes (avg. 1 nodes / tree)
RF 14: 500 nodes (avg. 1 nodes / tree)
RF 15: 500 nodes (avg. 1 nodes / tree)
RF 16: 500 nodes (avg. 1 nodes / tree)
RF 17: 500 nodes (avg. 1 nodes / tree)
RF 18: 500 nodes (avg. 1 nodes / tree)
RF 19: 500 nodes (avg. 1 nodes / tree)
RF 20: 500 nodes (avg. 1 nodes / tree)
20 RFs, 10000 trees, and 10000 nodes generated in 94.17 seconds (106.191 nodes
per second)
Association file created. Format:
TARGET PREDICTOR P-VALUE IMPORTANCE CORRELATION
Done.
Original issue reported on code.google.com by [email protected]
on 12 Jul 2011 at 4:37
What steps will reproduce the problem?
1. Point to a weka arff file as input.
2. Running rf_ace will fail.
What is the expected output? What do you see instead?
===> Uncovering associations... 0%rf_ace: src/partitionsequence.cpp:8:
PartitionSequence::PartitionSequence(size_t): Assertion `nMaxLength <
sizeof(graycode_t)' failed.
What version of the product are you using? On what operating system?
Latest trunk version. Reverting to last released version works.
Please provide any additional information below.
Original issue reported on code.google.com by sshivaji
on 19 Oct 2011 at 1:50
What steps will reproduce the problem?
1. Unpack the package (tar.gz)
2. run make, then make test
What is the expected output? What do you see instead?
- Expect some test to run and verify that the build is working
- Instead seeing multiple failures (see output below)
- Makefile seems to depend on stuff outside the package, like
-I/home/erkkila2/include
What version of the product are you using? On what operating system?
Latest downloaded package: rf_ace_v1.0.4_src.tar.gz
Please provide any additional information below.
hostname 623 ~/src/rf-ace> make test
rm -f bin/test; g++ -L/home/erkkila2/lib -lcppunit -ldl -pedantic
-I/home/erkkila2/include -I/usr/lib64/glib-2.12/include
-I/usr/include/glib-2.12 -I/usr/ -Isrc/ test/run_tests.cpp src/progress.cpp
src/statistics.cpp src/math.cpp src/gamma.cpp src/stochasticforest.cpp
src/rootnode.cpp src/node.cpp src/splitter.cpp src/treedata.cpp src/mtrand.cpp
src/datadefs.cpp src/utils.cpp -o bin/test -ggdb; ./bin/test
In file included from test/run_tests.cpp:6:0:
test/argparse_test.hpp:6:45: fatal error: cppunit/extensions/HelperMacros.h: No
such file or directory
compilation terminated.
/bin/sh: ./bin/test: not found
make: *** [test] Error 127
Thanks.
Original issue reported on code.google.com by [email protected]
on 10 Apr 2012 at 9:26
A two-dimensional sparse array representation would be useful in some areas of
RF-ACE, and elsewhere too.
Original issue reported on code.google.com by [email protected]
on 22 Aug 2012 at 9:29
What is the expected output? What do you see instead?
Currently, rf-ace-predict-win64.exe provides only
TARGET SAMPLE_ID PREDICTION CONFIDENCE
in the output file.
To better understand the outcome of the test run, additional information would
be useful, such as accuracy, f-measure, ...
This summary could either be added to the console output or maybe even written
to a separate file. Please find attached the output of WEKA for inspiration.
What version of the product are you using? On what operating system?
rf_ace_v1.0.4_win7_x64
WIN7
Original issue reported on code.google.com by [email protected]
on 27 Mar 2012 at 7:28
Attachments:
Transform p-values to corrected p-values / FDRs.
Original issue reported on code.google.com by [email protected]
on 19 Jun 2012 at 1:08
Currently RootNode is a somewhat poor abstraction layer, considering that what
it really does is grow a tree.
Original issue reported on code.google.com by [email protected]
on 30 Mar 2012 at 3:24
When using the prune_features option, if all features are removed, RF-ACE
appears to return an error. I think it would be better not to call it an
error, but simply normal behavior. Maybe a warning could be written to the log
file -- otherwise it's hard to be sure, when a few jobs out of 10s of 1000s
return with an error whether it's a problem that needs to be tracked down or
not.
Original issue reported on code.google.com by [email protected]
on 22 Mar 2012 at 4:10
One of the crucial features RF-ACE should have implemented ASAP. I've made
considerable effort to implement this in an efficient manner, and I think we're
only missing one piece from the puzzle.
Implementation proved more challenging than initially thought, mostly because I
want to make prediction both fast and generic.
Original issue reported on code.google.com by [email protected]
on 7 Jan 2012 at 11:33
What steps will reproduce the problem?
1. add a non-existent feature to the black list
2. start rf-ace
What is the expected output? What do you see instead?
expected: an error message with the non-existent feature name
instead: Segmentation fault
What version of the product are you using? On what operating system?
1.0.7, Aug 28 2012, windows 7
Please provide any additional information below.
here is the command line and output:
pollux(src/2012_09_11_output)%
/titan/cancerregulome9/workspaces/rf-ace/bin/rf-ace --filter --nThreads 1 -I
2012_09_11_1704_preterm_cons.fm -i N:CLIN:TermCategory:NB:::: -O
../2012_09_11_analysis/2012_09_11_1704_preterm_cons_22_bl_554_100_256.rf-ace.out
-B bl.txt -S 22 -n 554 -m 100 -p 256
-----------------------------------------------------------
| RF-ACE version: 1.0.7, Aug 28 2012 |
| Compile date: Aug 28 2012, 00:14:10 |
| Report issues: code.google.com/p/rf-ace/issues/list |
-----------------------------------------------------------
===> Reading file '2012_09_11_1704_preterm_cons.fm', please wait... DONE
===> Reading blacklist 'bl.txt', please wait... DONE
===> Applying blacklist, keeping 557 / 603 features, please wait...
Segmentation fault
Original issue reported on code.google.com by [email protected]
on 13 Sep 2012 at 6:22
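The expected behaviour described above (an error message naming the non-existent feature instead of a crash) could look roughly like the following sketch. applyBlacklist is a hypothetical helper, not RF-ACE's implementation. It validates every blacklisted name against the known features before filtering.

```cpp
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: returns the features that survive the blacklist and
// throws, naming the offender, if a blacklisted feature is not in the data.
std::vector<std::string> applyBlacklist(
    const std::vector<std::string>& features,
    const std::set<std::string>& blacklist) {
  const std::set<std::string> known(features.begin(), features.end());
  for (const std::string& name : blacklist) {
    if (known.count(name) == 0) {
      throw std::invalid_argument("blacklisted feature '" + name +
                                  "' does not exist in the input data");
    }
  }
  std::vector<std::string> kept;
  for (const std::string& name : features) {
    if (blacklist.count(name) == 0) kept.push_back(name);
  }
  return kept;
}
```

Throwing is only one option. Skipping unknown names with a warning would also avoid the crash and may suit batch runs better.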
With this feature it would be easy to extend the StochasticForest class to grow
CARTs with arbitrary feature set restrictions. The GBT implementation would
also become simpler.
Original issue reported on code.google.com by [email protected]
on 25 Mar 2012 at 10:30
What steps will reproduce the problem?
rf-ace-build-predictor-win64.exe -I oe1.train.arff -i class -O all.test.model -R
What is the expected output? What do you see instead?
Reading file 'oe1.train.arff', please wait... datadefs::str2num: ERROR: paramete
1513' could not be read properly. Quitting...
Assertion failed: false, file src\datadefs.cpp, line 168
What version of the product are you using? On what operating system?
WIN7, v1.0.3_win7_x64
Please provide any additional information below.
I attached a quite similar file (oe1.test.arff) which works fine. I already
reduced both files to make it easier to track down the problem. The only
difference I'm aware of is that the file that fails to load was generated
through appending operations via WEKA.
greetings,
Berni
Original issue reported on code.google.com by [email protected]
on 22 Mar 2012 at 6:52
Attachments:
What steps will reproduce the problem?
1. tar xzf rf_ace_v1.0.4_src.tar.gz
2. make
What is the expected output? What do you see instead?
- Expecting no warning during make
- Seeing:
src/node.cpp: In member function 'bool Node::regularSplitterSeek(Treedata*,
size_t, const std::vector<long unsigned int>&, const std::vector<long unsigned
int>&, const Node::GrowInstructions&, size_t&, std::vector<long unsigned int>&,
std::vector<long unsigned int>&, datadefs::num_t&)':
src/node.cpp:377:92: warning: 'splitValue' may be used uninitialized in this
function [-Wuninitialized]
In each step where node.cpp is compiled.
What version of the product are you using? On what operating system?
- v1.0.4 on Ubuntu Linux 11.10
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 9 Apr 2012 at 10:18
What steps will reproduce the problem?
1. Use a large trainset (48 features, 270000 instances)
2. Build forest with ntrees=300, mtry=12 features
3. Fails to be built, stacktrace is attached, but will not always be dumped
What version of the product are you using? On what operating system?
version r513
rf-ace-build-predictor.exe (built with gcc and make)
rf-ace-build-predictor-win64.exe (built with Visual Studio Express 10)
Please provide any additional information below.
I tried also 100 and 200 trees, which worked just fine, but with 300 there
seems to be an issue. I don't think it's RAM related, since there were about
6Gig free RAM left.
Original issue reported on code.google.com by [email protected]
on 6 Apr 2012 at 7:10
Attachments:
Currently there are no built-in metrics based on which to assess the fitness of
the generated models. OOB prediction error would give valuable information
about how the forest fits the training data.
Original issue reported on code.google.com by [email protected]
on 25 Feb 2012 at 8:15
Depending on whether RF, GBT, or CART is selected, some parameters become
irrelevant. Say, with CART selected only one tree is grown and all features are
tested with each split (so nTrees == 1 and mTry is irrelevant). Etc.
Original issue reported on code.google.com by [email protected]
on 29 May 2012 at 8:04
What steps will reproduce the problem?
1. $ make
What is the expected output? What do you see instead?
The software is expected to build correctly, but gcc 4.7.1 raises many errors.
What version of the product are you using? On what operating system?
Arch Linux (current packages) with rf-ace 1.0.7.
Please provide any additional information below.
The errors are attached.
Original issue reported on code.google.com by [email protected]
on 11 Sep 2012 at 6:14
Attachments:
At the moment there may be some NaN-issues in the way GBT splits the nodes.
This must be investigated.
Original issue reported on code.google.com by [email protected]
on 4 Jul 2011 at 1:42
In most RF implementations, "ordinal" is not a supported feature type, yet in
many cases it is the most natural one. The good news is that ordinal features
can be accounted for with very few modifications:
1. If an ordinal feature is used as the splitter, it is treated as a numerical
feature
- IF encoded in a certain way
2. If an ordinal feature is being split, it is treated as a categorical feature
- no need to pay attention to internal formatting
Thus, an ordinal feature has the dual property of being both numerical and
categorical at the same time. The proposed annotation for ordinal feature is
naturally "O", e.g.
O:ordinal_feature
as per AFM notation. One problem arises: should the ARFF standard be extended
to account for ordinal features?
Original issue reported on code.google.com by [email protected]
on 27 Aug 2012 at 2:10
Sometimes identifying the proper index in the input file for a desired target
is cumbersome, but if one knows the name, that could be used instead.
Original issue reported on code.google.com by [email protected]
on 5 Jul 2011 at 2:17
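One way the name-or-index lookup requested above might work is sketched here. resolveTarget is a hypothetical function, and the rule "all digits means index" is an assumption made for the example.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical resolver: an all-digit identifier is read as a 0-based
// index; anything else is matched against the headers by exact name.
std::size_t resolveTarget(const std::string& id,
                          const std::vector<std::string>& headers) {
  bool allDigits = !id.empty();
  for (char c : id) {
    if (!std::isdigit(static_cast<unsigned char>(c))) allDigits = false;
  }
  if (allDigits) {
    const unsigned long idx = std::strtoul(id.c_str(), nullptr, 10);
    if (idx >= headers.size()) {
      throw std::out_of_range("target index out of range: " + id);
    }
    return idx;
  }
  for (std::size_t i = 0; i < headers.size(); ++i) {
    if (headers[i] == id) return i;
  }
  throw std::invalid_argument("no feature matches target identifier '" +
                              id + "'");
}
```

With this rule a feature literally named "0" can no longer be selected by name, so any real implementation has to document which interpretation takes precedence.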
It will be beneficial both for development and end-use to be able to assess the
frequency at which a particular feature is showing up in the trees. Also,
information about the show-up frequencies of contrast features, in comparison
to real features, should assess quality of data.
Original issue reported on code.google.com by [email protected]
on 3 Jul 2011 at 11:59
What steps will reproduce the problem?
1. Just call rf-ace-build-predictor-win64.exe without any parameter
What is the expected output? What do you see instead?
...
-s / --nodesize Minimum number of train samples per node, affects
tree depth
...
What version of the product are you using? On what operating system?
rf_ace_v1.0.4_*
Every OS
Please provide any additional information below.
The comment is a little bit confusing, since it could be interpreted as how
many samples are at least used to determine the best split for the node.
Original issue reported on code.google.com by [email protected]
on 31 Mar 2012 at 8:50
The mtry parameter defaults to a value based on the full set of features, even
when only a few features are provided with a whitelist.
It should instead be based on the number of features in the subset provided in
the whitelist.
Original issue reported on code.google.com by [email protected]
on 4 Apr 2012 at 5:43
What steps will reproduce the problem?
Running the following command after compiling the source code from SVN
bin/rf-ace -F test_5by10_numeric_matrix.arff -i 4 -n 100 -m 5 -A
associations.tsv
What is the expected output? What do you see instead?
The expected output is a run of the feature selection process. Instead, it is
reported that the target is missing from all samples.
Verbatim:
-----------------------------------------------------------
| RF-ACE version: 1.1.0, Dec 5th 2012 |
| Compile date: Feb 18 2013, 01:07:51 |
| Report issues: code.google.com/p/rf-ace/issues/list |
-----------------------------------------------------------
Random Forest (RF) configuration:
-n / --nTrees = 100
-m / --mTry = 5
-s / --nodeSize = 3
-a / --nMaxLeaves = 2147483646
-q / --quantiles = NOT SET
-N / --noNABranching = NOT SET
Filter options:
-p / --nPerms = 20
-t / --pValueTh = 0.05
-Reading file 'test_5by10_numeric_matrix.arff' for filtering
Feature 'y' chosen as target with 10 / 0 samples ( -inf % missing ) among 5
features
Not enough samples (0) to perform a single split
What version of the product are you using? On what operating system?
RF-ACE version as in verbatim output above.
Operating system is Ubuntu precise (12.04.2 LTS)
Please provide any additional information below.
Same behaviour is observed with all ARFF files.
Original issue reported on code.google.com by [email protected]
on 17 Feb 2013 at 7:46
What steps will reproduce the problem?
1. tar tzvf rf_ace_v1.0.4_src.tar.gz | grep '~$' | wc -l
What is the expected output? What do you see instead?
- Expect to see 0 (zero)
- Seeing 47 (number of files ending with '~' in the tarball)
What version of the product are you using? On what operating system?
- rf-ace v1.0.4
Please provide any additional information below.
May want to add *~ to the 'clean' target in the Makefile too. Thanks.
Original issue reported on code.google.com by [email protected]
on 9 Apr 2012 at 10:46
As documented in the source code, certain cases appear to cause ArgParse to
fail after its rewrite to rely upon GNU C's getopt_long:
* When long arguments are specified in form "--longoption value"
* When short arguments are packed together, such as "-abcd valueForD"
Wrapping this code in a standard try-catch fails since the code throws across
linking barriers. Other attempts to catch the error are equally ineffectual.
Given the problems inherent to use of getopt_long as a drop-in rewrite of the
previous iteration of ArgParse, while maintaining its inefficient time
complexity, it's advised that this construct be rewritten to use a hashmap with
a very limited, well-defined number of input cases. Such a framework is trivial
once all of the supported input cases are defined.
Original issue reported on code.google.com by [email protected]
on 15 Aug 2011 at 10:20
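The hashmap-based approach suggested above could be sketched as follows. This is a hypothetical, simplified parser, not the ArgParse class itself. It covers the two forms listed in the issue, "--longoption value" and packed short flags "-abcd valueForD", and stores everything in a std::unordered_map for constant-time lookup on average.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical parser: maps option names (without dashes) to values.
// Supported forms: "--name value", "-n value", and packed short flags
// "-abcd value", where only the last flag (d) receives the value and the
// earlier ones are stored as "true". Options without a value get "true".
std::unordered_map<std::string, std::string> parseArgs(
    const std::vector<std::string>& args) {
  std::unordered_map<std::string, std::string> out;
  for (std::size_t i = 0; i < args.size(); ++i) {
    const std::string& a = args[i];
    const bool isLong = a.size() > 2 && a.compare(0, 2, "--") == 0;
    const bool isShort =
        !isLong && a.size() > 1 && a[0] == '-' && a[1] != '-';
    if (!isLong && !isShort) continue;  // stray positional token: ignored
    // A following token that does not start with '-' is taken as the value.
    const bool hasValue = i + 1 < args.size() && !args[i + 1].empty() &&
                          args[i + 1][0] != '-';
    const std::string value = hasValue ? args[i + 1] : "true";
    if (isLong) {
      out[a.substr(2)] = value;
    } else {
      for (std::size_t k = 1; k + 1 < a.size(); ++k) {
        out[std::string(1, a[k])] = "true";
      }
      out[std::string(1, a[a.size() - 1])] = value;
    }
    if (hasValue) ++i;  // skip the consumed value
  }
  return out;
}
```

A known limitation of this sketch is that a value beginning with '-' (for example a negative number) is read as the next option. A complete implementation would need a table of which options expect values.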
What steps will reproduce the problem?
1. valgrind --track-origins=yes $RF/rf-ace-build-predictor -I
$DATA/adult.test.arff -O tree -i class 2>err.log >std.log
What version of the product are you using? On what operating system?
0.9.9, February 2nd, 2012
64 bit Linux: Linux 3.0.0-13-generic #22-Ubuntu SMP Wed Nov 2 13:27:26 UTC 2011
x86_64 x86_64 x86_64 GNU/Linux
Original issue reported on code.google.com by [email protected]
on 9 Feb 2012 at 11:40
Attachments:
What version of the product are you using? On what operating system?
rf_ace_v1.0.3_*
all operating systems
Please provide any additional information below.
The parameter mTry should not be relative to the total number of features, but
should instead accept a positive integer giving the absolute number of features
to be selected.
As a default value there are two suggestions which performed quite well for me
in the past:
1 - mTry = root(M)...where M is the total number of features, as suggested by
Breiman
2 - mTry = log2(M)+1...as implemented in WEKA
best regards,
Berni
Original issue reported on code.google.com by [email protected]
on 22 Mar 2012 at 10:42
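The two defaults suggested above can be written down as small helpers. The function names are hypothetical. Reading root(M) as the square root and rounding both results down to an integer are assumptions, since the issue does not specify either.

```cpp
#include <cmath>
#include <cstddef>

// Square-root default: floor(sqrt(M)), but never less than 1.
std::size_t mTrySqrt(std::size_t nFeatures) {
  const std::size_t m = static_cast<std::size_t>(
      std::floor(std::sqrt(static_cast<double>(nFeatures))));
  return m < 1 ? 1 : m;
}

// Logarithmic default as described in the issue: floor(log2(M)) + 1.
std::size_t mTryLog2(std::size_t nFeatures) {
  if (nFeatures == 0) return 1;  // avoid log2(0)
  return static_cast<std::size_t>(
             std::floor(std::log2(static_cast<double>(nFeatures)))) + 1;
}
```

For M = 48 features both rules happen to give mTry = 6. For M = 100 they give 10 and 7 respectively.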
What steps will reproduce the problem?
1. Train a predictor with default nmaxleaves and nodesize parameter:
rf-ace-build-predictor-win64.exe -I trainset_all.arff -i class -R -n 24 -m 12
-O trainset_all.arff.model -S 1
What is the expected output? What do you see instead?
The default parameters aren't performing well in terms of OOB error.
What version of the product are you using? On what operating system?
rf_ace_v1.0.4_*
Every OS
Please provide any additional information below.
Breiman suggests building unpruned trees for a Random Forest,
so I would like to propose setting the default values of nmaxleaves and
nodesize such that unpruned trees are generated.
best regards,
Berni
Original issue reported on code.google.com by [email protected]
on 31 Mar 2012 at 8:21