Comments (15)
That's good!
Don't worry 😃
Thanks to the discussion here, we found some points where rgf_python can be improved.
On Windows, paths that include spaces sometimes exist, so rgf_python can be made more useful and stable by handling them.
Please feel free to ask questions or send a PR.
from rgf.
Thank you for your report!
Unfortunately, rgf is a directory name as well as a module name, so we should use `rgf.rgf` instead of `rgf`.
But in the old version, test.py was placed in the same directory as rgf.py, so test.py could import RGFClassifier with `from rgf import RGFClassifier`.
I moved test.py to another directory and fixed the import statement.
Could you try `from rgf.rgf import RGFClassifier`, as in the current version of test.py?
So I am getting some unusual behaviour now. In a Jupyter notebook, `from rgf.rgf import RGFClassifier` seems to import the RGFClassifier class correctly. The test script ran until I got this error:
ERROR: /run/user/1000/jupyter/kernel-74918c1f-c829-4d4d-9d5b-0251eaa36408 (unittest.loader._FailedTest)
----------------------------------------------------------------------
AttributeError: module '__main__' has no attribute '/run/user/1000/jupyter/kernel-74918c1f-c829-4d4d-9d5b-0251eaa36408'
----------------------------------------------------------------------
Ran 1 test in 0.001s
FAILED (errors=1)
An exception has occurred, use %tb to see the full traceback.
SystemExit: True
---------------------------------------------------------------------------
SystemExit Traceback (most recent call last)
/usr/local/lib/python3.5/dist-packages/IPython/core/interactiveshell.py in run_code(self, code_obj, result)
2868 #rprint('Running code', repr(code_obj)) # dbg
-> 2869 exec(code_obj, self.user_global_ns, self.user_ns)
2870 finally:
<ipython-input-3-35dda55e8cd1> in <module>()
79 if __name__ == '__main__':
---> 80 unittest.main()
Maybe this is a separate issue to be raised. This looked like a Jupyter notebook error, so back to importing.
I tried to import `rgf.rgf` with Python from the terminal using the following:
josh@josh-HP-ZBook-17-G2:~/rgf_python/rgf$ python3 test.py
Traceback (most recent call last):
File "test.py", line 11, in <module>
from rgf.rgf import RGFClassifier, RGFRegressor
ImportError: No module named 'rgf.rgf'; 'rgf' is not a package
I then tried to run this from a Python shell and got pretty much the same issue:
>>> from rgf.rgf import RGFClassifier
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: No module named 'rgf.rgf'; 'rgf' is not a package
I am very confused, but please ask for any extra information that might be of use.
Could you try copying test.py to another directory and executing it?
In the terminal, if test.py and rgf.py are located in the same directory, test.py tries to import rgf.py directly: the module file, not the package directory.
This is caused by Python's import priority rules.
In the newest rgf_python, I moved test.py to test/test.py.
Sorry for the inconvenience.
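The priority rule above can be sketched in a few lines. The file and directory names here are hypothetical stand-ins for the real layout: a plain rgf.py found earlier on sys.path shadows the installed rgf package, which is exactly why `from rgf.rgf import ...` fails with "'rgf' is not a package".

```python
# Demonstrates how a plain rgf.py file that appears earlier on sys.path
# shadows an installed "rgf" package: the import machinery finds the module
# file first, and a plain module cannot have submodules.
import importlib
import os
import sys
import tempfile

tmp = tempfile.mkdtemp()
with open(os.path.join(tmp, "rgf.py"), "w") as f:
    f.write("VALUE = 'plain module, not a package'\n")

sys.modules.pop("rgf", None)   # make sure nothing cached wins the lookup
sys.path.insert(0, tmp)        # the script's own directory is searched first
rgf = importlib.import_module("rgf")

# Packages carry a __path__ attribute; shadowing module files do not.
print(hasattr(rgf, "__path__"))  # False
```

Moving test.py out of the directory that contains rgf.py removes the shadowing module from the front of the search path, so the installed package is found instead.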
In a Jupyter notebook, unittest doesn't work the same way it does in a terminal.
I think this is not an rgf_python problem.
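For what it's worth, the notebook failure above has a well-known workaround: inside Jupyter, sys.argv contains the kernel's connection file, and unittest.main() tries to interpret it as a test name (hence the AttributeError mentioning the kernel path). A sketch, with a hypothetical minimal test case standing in for the real rgf tests:

```python
import unittest

class SmokeTest(unittest.TestCase):
    # Hypothetical stand-in for the real test suite.
    def test_truth(self):
        self.assertTrue(True)

# Supplying argv explicitly keeps unittest from reading the kernel's
# connection file; exit=False suppresses the SystemExit that IPython reports.
program = unittest.main(argv=["ignored"], exit=False)
print(program.result.wasSuccessful())
```

With both parameters set, the suite runs in the notebook without the bogus test lookup or the SystemExit traceback.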
I have copied and pasted it into a different directory. I ran it and the package imported as expected. Thank you for your help.
However, on running the test another issue has arisen. There are 4 errors in test.py. They are all the same as this one in that they raise `IndexError: list index out of range`, just for the 4 different model types: classifier, bin, regression and softmax. Here is regression:
ERROR: test_regressor (__main__.TestRGFClassfier)
----------------------------------------------------------------------
Traceback (most recent call last):
File "test.py", line 71, in test_regressor
y_pred = reg.predict(X_test)
File "/usr/local/lib/python3.5/dist-packages/rgf_sklearn-0.0.0-py3.5.egg/rgf/rgf.py", line 463, in predict
latest_model_loc = sorted(glob(model_glob), reverse=True)[0]
IndexError: list index out of range
Many thanks for your help and for creating this Python wrapper!!
Yes, I think the Jupyter notebook issue is not an rgf_python issue.
You're welcome! 😃
Your error is caused by a failure to load the model training result.
(And I may have to make the error message more informative.)
Could you list all the file names under `loc_temp`? I think they don't contain a model training result file.
If they don't, the reason is one of:
1. Failing to call the rgf executable.
2. Failing to save the model training result under `loc_temp` (e.g. it points to a directory writable only by the superuser).
3. Some other error related to the rgf executable.
From my experience, if we can get past this error, rgf becomes available.
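The failing line from the traceback makes the cause visible: glob() on a pattern that matches nothing returns an empty list, and indexing [0] on it raises exactly the IndexError seen above. A minimal sketch (the pattern here is hypothetical):

```python
from glob import glob

# No model files were ever written, so the pattern matches nothing.
matches = sorted(glob("loc_temp_that_does_not_exist/model_c0*"), reverse=True)
print(matches)  # []

try:
    latest_model_loc = matches[0]  # the indexing that fails inside rgf.py
except IndexError as exc:
    print("IndexError:", exc)      # IndexError: list index out of range
```

So the IndexError is a downstream symptom: the real question is why training never produced a model file under `loc_temp`.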
No, there is no model file. Here is what I have in there:
~/Documents/Python Scripts/temp$ dir
test.data.x train.data.x train.data.y
What is the best way to check for each of those possible errors?
Many thanks.
Thanks!
Good, then you succeeded in writing the data files, but not the model training result. So 2. did not occur.
Now I suspect that 1. occurred.
Could you run the following example?
https://github.com/fukatani/rgf_python/blob/master/example/cross_validation_for_iris.py
And please paste the console output; it is informative.
CAUTION! I updated the example today, so please use the newest version.
You only have to update your local repository, or copy & paste the example script to any directory.
The beginning of the output in my environment is here:
"train":
algorithm=RGF_Sib
train_x_fn=temp/train.data.x
train_y_fn=temp/train.data.y
Log:ON
model_fn_prefix=temp/model_c0
--------------------
Sat Oct 15 23:19:14 2016: Reading training data ...
Sat Oct 15 23:19:14 2016: Start ... #train=99
--------------------
Forest-level:
loss=Log
max_leaf_forest=400
max_tree=200
opt_interval=100
test_interval=100
num_tree_search=1
Verbose:ON
memory_policy=Generous
Turning on Force_to_refresh_all
-------------
Training data: 4x99, nonzero_ratio=1; managed as dense data.
I copied the new cross_validation_for_iris.py and saved it as new_iris.py in my Documents directory. I ran it from there and got this:
/usr/local/lib/python3.5/dist-packages/sklearn/cross_validation.py:44: DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the refactored classes and functions are moved. Also note that the interface of the new CV iterators are different from that of this module. This module will be removed in 0.20.
"This module will be removed in 0.20.", DeprecationWarning)
b'
Usage: /home/josh/rgf1.2/bin/rgf train parameters
"train" Train and save models to files.
parameters: keyword-value pairs (e.g., "algorithm=RGF") and options
(e.g., "NormalizeTarget") delimited by "," described below.
Example parameters:
algorithm=RGF,train_x_fn=data.x,train_y_fn=data.y,reg_L2=0.1,...
Below, "*" indicates the required parameters that cannot be omitted.
[ Parameters for "train" ]
"algorithm=" RGF|RGF_Opt|RGF_Sib (Default:RGF)
* "train_x_fn=" Path to the feature file of training data.
* "train_y_fn=" Path to the target file of training data.
* "model_fn_prefix=" To save models, path names are generated by attaching
"-01", "-02",... to this value.
To optionally specify the weights of individual data points:
"train_w_fn=" Path to the file of user-defined weights assigned to
training data points.
To optionally do warm-start with an existing model:
"model_fn_for_warmstart="
Path to the input model file from which training
should do warm-start.
[ Parameters for RGF (default algorithm) ]
* "reg_L2=" lambda. Regularization coefficient.
"reg_sL2=" For node search, override lambda with this value.
"loss=" Loss function (Default:LS)
LS|Log|Expo
LS: Square loss (p-y)^2/2
Log: Log loss log(1+exp(-py)) for y=1,-1
Expo: Exponential loss exp(-py) for y=1,-1
"max_leaf_forest=" Stop training when the number of leaf nodes in the
forest reaches this number. (Default:10000)
"opt_interval=" Weight optimization interval in terms of #leaf.
(Default:100)
"test_interval=" Approximate test interval in terms of #leaf. Must be
multiple or divisor of the optimization interval for
efficiency; otherwise, it may be changed by the system
automatically. (Default:500)
"num_tree_search=" Number of trees to be searched for the nodes to split.
The most recently-grown trees are searched first.
(Default:1)
"reg_depth=" gamma>=1. A larger value penalizes deeper nodes more
severely. Used with lambda as in lambda*gamma^depth.
(Default:1)
"NormalizeTarget" For training, normalize training targets so that the
average becomes zero. Intended for regression.
"num_iteration_opt=" Used in the iterative optimization of weights.
Maximum number of iterations. (Default:10 for square
loss; 5 for exponential loss and the likes)
"opt_stepsize=" Used in the iterative optimization of weights. Step
size of Newton updates. (Default:0.5)
"min_pop=" Minimum number of training data points in each leaf
node. (Default:10)
"Time" Measure elapsed time for node search and weight
optimization.
"Verbose" Print information during training.
"memory_policy=" Conservative|Generous. (Default:Generous)
---------------------------------------------------------
To display parameters for other algorithms, enter:
/home/josh/rgf1.2/bin/rgf train algorithm_name
Example: /home/josh/rgf1.2/bin/rgf train RGF_Sib
---------------------------------------------------------
List of algorithm names:
"RGF" Regularized greedy forest
"RGF_Opt" RGF w/min-penalty regularization
"RGF_Sib" RGF w/min-penalty regularization w/sum-to-zero sibling
constraints
'
None
(The same usage message, followed by `None`, was printed twice more.)
Traceback (most recent call last):
File "new_iris.py", line 34, in <module>
rgf_score += rgf.score(xs_test, y_test)
File "/usr/local/lib/python3.5/dist-packages/sklearn/base.py", line 349, in score
return accuracy_score(y, self.predict(X), sample_weight=sample_weight)
File "/usr/local/lib/python3.5/dist-packages/rgf_sklearn-0.0.0-py3.5.egg/rgf/rgf.py", line 231, in predict
proba = self.predict_proba(X)
File "/usr/local/lib/python3.5/dist-packages/rgf_sklearn-0.0.0-py3.5.egg/rgf/rgf.py", line 201, in predict_proba
class_proba = clf.predict_proba(X)
File "/usr/local/lib/python3.5/dist-packages/rgf_sklearn-0.0.0-py3.5.egg/rgf/rgf.py", line 330, in predict_proba
latest_model_loc = sorted(glob(model_glob), reverse=True)[0]
IndexError: list index out of range
Please note: the output above was not formatted 100% correctly, so I replaced all `\n` with actual newlines for readability.
If you need any more information, please ask :)
Thanks a lot!
rgf_python succeeded in calling the executable, but the arguments seem to be invalid.
This is the reason the usage text was displayed in your environment.
Let's confirm the arguments.
To confirm them, we have to debug by embedding print statements or using an IDE.
If you want to use the former method, please rewrite platform_specific_Popen and insert print calls as follows:
def platform_specific_Popen(cmd, **kwargs):
    print("Output command arguments... {0}".format(cmd))
    print("Output kwargs... {0}".format(kwargs))
    print("Output sysname... {0}".format(sys_name))
    if sys_name == WINDOWS:
        return subprocess.Popen(cmd.split(), **kwargs)
    elif sys_name == LINUX:
        return subprocess.Popen(cmd, **kwargs)
Here is the result in my environment:
Output command arguments... C:\Users\rf\Documents\python\rgf1.2\bin\rgf.exe train Verbose,train_x_fn=temp/train.data.x,train_y_fn=temp/train.data.y,algorithm=RGF_Sib,loss=Log,max_leaf_forest=400,test_interval=100,reg_L2=0.1,reg_sL2=0.1,reg_depth=1,model_fn_prefix=temp/model_c2 2>&1
Output kwargs... {'shell': True, 'stdout': -1}
Output sysname... Windows
From the print/debug function I get
Output command arguments... /home/josh/rgf1.2/bin/rgf train Verbose,train_x_fn=/home/josh/Documents/Python Scripts/temp/train.data.x,train_y_fn=/home/josh/Documents/Python Scripts/temp/train.data.y,algorithm=RGF_Sib,loss=Log,max_leaf_forest=400,test_interval=100,reg_L2=0.1,reg_sL2=0.1,reg_depth=1,model_fn_prefix=/home/josh/Documents/Python Scripts/temp/model_c2 2>&1
Output kwargs... {'stdout': -1, 'shell': True}
Output sysname... Linux
OK!
Could you change loc_temp to another directory whose name does not include a space?
A space is recognized as a delimiter character.
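The delimiter problem can be seen directly. The command string is split on whitespace before the executable sees it (by cmd.split() on Windows, and by the shell when shell=True on Linux), so a path containing "Python Scripts" becomes two separate arguments. A sketch using the path from the log above; shlex.quote is one standard way a caller could guard against this:

```python
import shlex

# The space inside "Python Scripts" splits the single train_x_fn argument in two.
cmd = ("rgf train Verbose,"
       "train_x_fn=/home/josh/Documents/Python Scripts/temp/train.data.x")
print(cmd.split())
# ['rgf', 'train', 'Verbose,train_x_fn=/home/josh/Documents/Python',
#  'Scripts/temp/train.data.x']

# Quoting the path keeps it a single shell token.
print(shlex.quote("/home/josh/Documents/Python Scripts/temp"))
# '/home/josh/Documents/Python Scripts/temp'
```

Passing an argument list to subprocess.Popen instead of a single shell string would avoid the splitting entirely, but moving loc_temp to a space-free directory sidesteps the issue with no code change.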
I have changed it. It all seems to be working now, as the iris script ran and gave the 0.95.. answer. Thank you greatly for your help, and sorry that it was such a silly error.
This is also working with my Jupyter notebook now! :)