dmlc / treelite

Universal model exchange and serialization format for decision tree forests

Home Page: https://treelite.readthedocs.io/en/latest/

License: Apache License 2.0

C++ 63.71% CMake 2.65% Python 30.50% C 0.54% Makefile 0.04% Shell 2.27% Batchfile 0.30%

treelite's Introduction

Treelite

Treelite is a universal model exchange and serialization format for decision tree forests. Treelite aims to be a small library that enables other C++ applications to exchange and store decision trees on the disk as well as the network.

treelite's Issues

Precision of threshold/value in generated c files

Currently, the values in the model are outputted with default precision (i.e. 6 significant digits in general).
For example if in the saved model the threshold is
0.51548230441328713, then the generated c file will have data[1].fvalue <= 0.515482.
This can have a very significant impact on the model accuracy, where predictions in original model and predictions from generated c code vary drastically.

Generated files should use the maximum available (or a configurable) precision, something like
oss << std::setprecision(std::numeric_limits::digits10 + 2);
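
The effect is easy to reproduce outside Treelite. The sketch below is illustrative Python, not the code generator's C++; it formats the threshold from the example with 6 and with 17 significant digits and checks which text parses back to the same double. The variable names are made up for this illustration.

```python
# Illustrative only: 6 significant digits (the default stream precision)
# versus 17 significant digits, which is enough to round-trip an IEEE-754 double.
threshold = 0.51548230441328713

short_text = "%.6g" % threshold    # what 6-digit precision emits
full_text = "%.17g" % threshold    # digits10 + 2 is 17 for a double

print(short_text)                       # 0.515482
print(float(short_text) == threshold)   # False: the threshold has changed
print(float(full_text) == threshold)    # True: exact round trip
```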

Error loading protobuf RF model with single tree and single leaf node

I created a simple protobuf representation of a Random Forest model with a single tree containing a single leaf node. However when I try to load it I get:

Check failed: node.leaf_vector_size() > 0 (0 vs. 0)

I am using Treelite version 0.32 on Mac OS-X. After creating the protobuf file I loaded it in Python 3.6 using the code below.

Attaching the protobuf file as a .gz file . (It is only a few bytes)
and the code to generate it.
model1.protobuf.gz
test-protobuf-rf.tar.gz

import treelite
model = treelite.Model.load('model1.protobuf')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/treelite/frontend.py", line 350, in load
ctypes.byref(handle)))
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/treelite/core.py", line 50, in _check_call
raise TreeliteError(_LIB.TreeliteGetLastError())
treelite.common.util.TreeliteError: b'[19:10:44] /Users/travis/build/hcho3/treelite-wheels/treelite/src/frontend/protobuf.cc:51: Check failed: node.leaf_vector_size() > 0 (0 vs. 0) \n\nStack trace returned 1 entries:\n[bt] (0) 0 libtreelite.dylib 0x0000000110d20b7c dmlc::StackTraceabi:cxx11 + 364\n\n'

Question about calling "int TreelitePredictorPredictInst(...)" from C++ codebase

Hi,

First off, much kudos to you for providing this valuable tool! I have a perhaps naive question. I am using Treelite to speed up XGBoost's inference, where I need to predict on data samples one row at a time in a C++ codebase. I see that you provide an interface for TreelitePredictorPredictInst() in treelite/runtime/native/include/treelite/c_api_runtime.h, but I do not see this function in the compiled shared library libtreelite.so. Could you provide a c_api_runtime example showing how to call it?

Best regards,
Ismail

Option for 64bit input data

Issue #55 updated the model translation to use double precision for the thresholds and leaf values, however, the input data is still using 32bit numpy arrays / floats. This leads to incorrect predictions in models that use double precision data.
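
A small sketch of how a 32-bit cast can send a row down a different branch. It uses only the standard library to round a double to float32; the threshold and feature values are invented for illustration and do not come from a real model.

```python
import struct

def to_float32(x):
    """Round a Python float (a double) to the nearest float32 and back."""
    return struct.unpack("f", struct.pack("f", x))[0]

threshold = 0.1   # hypothetical split threshold, stored as a double
feature = 0.1     # the same value in the user's float64 data

goes_left_double = feature <= threshold                # compared as doubles
goes_left_float32 = to_float32(feature) <= threshold   # feature cast to 32 bits first

print(goes_left_double)    # True
print(goes_left_float32)   # False: float32(0.1) is slightly larger than 0.1
```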

Prediction Results Do Not Match Original XGBoost

I trained a model with a single iteration:

param = {
    'nthread': 10,
    'objective': 'multi:softprob',
    'eval_metric': 'mlogloss',
    'num_class': 3,
    'silent': 1,
    'max_depth': 5,
    'min_child_weight': 5,
    'eta': 0.5,  # learning rate
    'subsample': 1,
    'colsample_bytree': 1,
    'gamma': 0, 
    'alpha': 0,
    'lambda': 1, 
}
watchlist  = [(dtest, 'eval')]
bst = xgb.train(param, dtrain, 1, watchlist)

sample will be a numpy array with a single row of features. The output of:

bst.predict(xgb.DMatrix(sample))

differs noticeably from the output of:

bst_lite = treelite.Model.from_xgboost(bst)
bst_lite.export_lib(toolchain='gcc', libpath=model_path, verbose=True)
batch = Batch.from_npy2d(sample)
predictor = Predictor(model_path, verbose=True)
predictor.predict(batch)

The problem is compounded when more trees are added.

Any ideas of what could be going on here?

Thanks for your work on this @hcho3 .

In our experiment, the original LightGBM prediction result does not equal the Treelite prediction result

Dependencies:
lightgbm==2.2.2
treelite==0.32

LightGBM was trained with the following parameters:
boosting_type = gbdt
objective = binary
metric = binary_logloss,auc
metric_freq = 1
is_training_metric = true
max_bin = 255
categorical_feature=1,2,3,5,6,8,299
scale_pos_weight = 2.0
num_trees = 500
num_leaves = 255
max_depth = 20
learning_rate = 0.1
min_gain_to_split = 0.00000001
feature_fraction = 0.7
bagging_freq = 5
bagging_fraction = 0.8
min_data_in_leaf = 10
min_sum_hessian_in_leaf = 1e-3
is_enable_sparse = true
use_two_round_loading = true

Save the model and predict some cases.
The results are as follows:
0.077086254685271596
0.044846837005837775
0.66469752969639961

compile the model and predict the samples:
model = treelite.Model.load('./a.mdl', model_format='lightgbm')
model.export_lib(toolchain='gcc', libpath='./a.so', verbose=True)

batch = treelite.runtime.Batch.from_npy2d(matrix)
predictor = treelite.runtime.Predictor('./a.so', verbose=True)
out_pred = predictor.predict(batch, pred_margin=True)
The result differs from the original model's prediction:
0.10160005
0.05897592
0.6094563

CoreML support

Are there any plans for CoreML support?
Thanks

Catboost support

load model from file and predict

I'm interested in using Treelite as a fast and lightweight C/C++ prediction module. It seems the preferred workflow is to load a model in a supported format, compile it into an optimized C file or library, and then integrate the generated C code into an application for prediction. Is it possible to use Treelite to load a model from a file at runtime for prediction? (If this isn't possible through the current API, would you accept this feature?) This would provide some flexibility and code size reductions compared to using XGBoost or LightGBM directly.

Segmentation fault on predictor load with C runtime API

Treelite version: 0.31
Compiler: gcc-7 (Homebrew GCC 7.3.0_1) 7.3.0
Cmake version: 3.10.3
Environment: OS X 10.13.3 high sierra

Using Treelite through C API throws segmentation fault on TreelitePredictorLoad. Worked fine on 0.3.

GPU inference?

What do you think about inference on a GPU? It could potentially be faster.

predict

predict() returns inappropriately shaped result

Awesome library--thanks for developing it!

When I'm doing multiclass classification and I call predict on a (1, #features) shaped dataset, I get a (#classes) shaped result, as opposed to a (1, #classes) shaped result. This forces me to write code to handle this case, as predictions on all other sizes return 2D results (e.g. (100, #features) returns a (100, #classes) shaped result).

BUG: Unable to handle kernel paging request

I am loading around 1000 xgboost models to perform treelite prediction. The treelite version is 0.32. However, the following error will appear occasionally, causing the restart of the Linux server.

Any advice on this situation?

How to use this library on the Android or iOS platform?

I want to use Treelite for inference on Android and iOS.
1. Is this library compatible with XGBoost model files? I plan to train the model with XGBoost and run inference with Treelite.
2. Does it depend on any Linux-only libraries? I tried to compile XGBoost with the Android NDK, but found many dependencies on Linux-only headers, for example execinfo.h, which I can't find in the Android NDK.

When I compile Treelite with Java 8, I get an error that says "cannot find javolution.io.Union"

Error when setting absolute path in model.compile(dirpath=treelite_java_path, compiler='ast_java', params=params)

Only relative paths are supported for dirpath.
Please consider supporting absolute paths as well.
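
Until absolute paths are handled, one possible workaround is to convert the path to a relative one before calling compile. This is a sketch under the assumption, taken from the report above, that a relative dirpath works; the helper name and the example directory are hypothetical.

```python
import os

def to_relative_dirpath(dirpath, start=None):
    """Express dirpath relative to start (default: the current working directory)."""
    start = os.getcwd() if start is None else start
    # Note: on Windows, os.path.relpath raises ValueError across different drives.
    return os.path.relpath(os.path.abspath(dirpath), start)

# Hypothetical absolute output directory below the working directory.
abs_dir = os.path.join(os.getcwd(), "build", "treelite_java")
rel_dir = to_relative_dirpath(abs_dir)
print(rel_dir)

# The relative path would then be passed on, as in the call from the issue title:
# model.compile(dirpath=rel_dir, compiler='ast_java', params=params)
```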

how to get the leaf index

@hcho3
Hello,
How can I get the leaf index for every tree in a LightGBM model, not only the final prediction result?

Export CMakeLists.txt instead of Makefile

Is there any way to export with a CMakeLists.txt file instead of a Makefile? This would make it much easier to integrate the model into code that already uses CMake, and remove the need to set the toolchain when exporting the model.
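
As a stopgap, a CMakeLists.txt can be generated by hand from the directory that model.compile() writes. The sketch below is not part of Treelite; it assumes the directory contains the generated .c sources (such as main.c and the tu*.c files seen in the logs on this page) next to header.h, and the function name and default target name are made up.

```python
import glob
import os

def write_cmakelists(dirpath, target="mymodel"):
    """Write a minimal CMakeLists.txt that builds every .c file in dirpath."""
    sources = sorted(os.path.basename(p)
                     for p in glob.glob(os.path.join(dirpath, "*.c")))
    if not sources:
        raise ValueError("no .c files found in {}".format(dirpath))
    lines = [
        "cmake_minimum_required(VERSION 3.5)",
        "project({} C)".format(target),
        "set(CMAKE_C_STANDARD 99)",
        "add_library({} SHARED {})".format(target, " ".join(sources)),
        # Assumption: the generated code may call libm functions on Unix.
        "if(UNIX)",
        "  target_link_libraries({} m)".format(target),
        "endif()",
    ]
    path = os.path.join(dirpath, "CMakeLists.txt")
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")
    return path

# Usage, after model.compile(dirpath='./mymodel', ...) has written the sources:
# write_cmakelists('./mymodel')
```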

does not contain valid get_num_output_group()

After having problems getting my own model to work, I decided to go through "First Tutorial" step by step. Here is the code I executed:

X, y = load_boston(return_X_y=True)
dtrain = xgboost.DMatrix(X, label=y)
bst = xgboost.train({}, dtrain, 20, [(dtrain, 'train')])
model = treelite.Model.from_xgboost(bst)
model.export_lib(toolchain="msvc", libpath="./mymodel.dll")
predictor = treelite.runtime.Predictor("./mymodel.dll")

It works fine up until the last line where I try to load the model into runtime. I then get this error:

TreeliteErrorTraceback (most recent call last)
<ipython-input-14-e0e2860102c2> in <module>()
----> 1 predictor = treelite.runtime.Predictor("./mymodel.dll")

C:\Users\HannahWalsh\Anaconda2\lib\site-packages\treelite\runtime\predictor.pyc in __init__(self, libpath, nthread, verbose, include_master_thread)
    260         ctypes.c_int(nthread if nthread is not None else -1),
    261         ctypes.c_int(1 if include_master_thread else 0),
--> 262         ctypes.byref(self.handle)))
    263     # save # of output groups
    264     num_output_group = ctypes.c_size_t()

C:\Users\HannahWalsh\Anaconda2\lib\site-packages\treelite\runtime\predictor.pyc in _check_call(ret)
     44   """
     45   if ret != 0:
---> 46     raise TreeliteError(_LIB.TreeliteGetLastError())
     47 
     48 class Batch(object):

TreeliteError: [13:52:40] c:\projects\treelite-wheels\treelite\src\predictor.cc:220: Check failed: query_func != nullptr Dynamic shared library `C:\Users\HannahWalsh\Desktop\treelite_trial\mymodel.dll' does not contain valid get_num_output_group() function

Notably, this is the same issue I was getting when I tried my own model.
I compiled it manually (as opposed to export_lib) and can see the get_num_output_group in main.c and it is this:

size_t get_num_output_group(void) {
  return 1;
}

which I know is correct from examples posted here.
I am using Windows 10, Python 2.7 in Atom with the interactive Hydrogen package.
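
A quick way to narrow this down is to ask the operating system's loader whether the DLL exports the symbol at all. On Windows, a function that is defined in a source file is not exported from a DLL unless it is marked for export (for example with __declspec(dllexport)) or listed in a .def file; whether that is the cause here is an assumption. The helper below is a hypothetical diagnostic built on the standard ctypes module.

```python
import ctypes

def exports_symbol(lib, name):
    """Return True if the loaded library exposes a symbol called name."""
    try:
        getattr(lib, name)
        return True
    except AttributeError:
        # ctypes raises AttributeError when the symbol lookup fails
        return False

# Usage against the DLL from the report (path taken from the issue):
# lib = ctypes.CDLL("./mymodel.dll")
# print(exports_symbol(lib, "get_num_output_group"))
```

If this prints False for a manually compiled DLL, the function exists in main.c but was not exported at link time.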

model = treelite.Model.load('test.model', model_format='xgboost')

File "/root/.local/lib/python3.5/site-packages/treelite/frontend.py", line 347, in load
ctypes.byref(handle)))
File "/root/.local/lib/python3.5/site-packages/treelite/core.py", line 50, in _check_call
raise TreeliteError(LIB.TreeliteGetLastError())
treelite.common.util.TreeliteError: b'[00:52:37] /io/treelite/src/frontend/xgboost.cc:374: Check failed: fp->Read(&name_obj[0], len) == len (6030078 vs. 876439406) Ill-formed XGBoost model file: corrupted header\n\nStack trace returned 10 entries:\n[bt] (0) /root/.local/lib/python3.5/site-packages/treelite/libtreelite.so(dmlc::StackTrace()+0x19c) [0x7fb96cb7bcdc]\n[bt] (1) /root/.local/lib/python3.5/site-packages/treelite/libtreelite.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x18) [0x7fb96cb7ce48]\n[bt] (2) /root/.local/lib/python3.5/site-packages/treelite/libtreelite.so(+0x26a90f) [0x7fb96cbb690f]\n[bt] (3) /root/.local/lib/python3.5/site-packages/treelite/libtreelite.so(treelite::frontend::LoadXGBoostModel(char const*)+0x28) [0x7fb96cbb9038]\n[bt] (4) /root/.local/lib/python3.5/site-packages/treelite/libtreelite.so(TreeliteLoadXGBoostModel+0x23) [0x7fb96cb7f213]\n[bt] (5) /usr/lib/python3.5/lib-dynload/_ctypes.cpython-35m-x86_64-linux-gnu.so(ffi_call_unix64+0x4c) [0x7fb9744d3e20]\n[bt] (6) /usr/lib/python3.5/lib-dynload/_ctypes.cpython-35m-x86_64-linux-gnu.so(ffi_call+0x2eb) [0x7fb9744d388b]\n[bt] (7) /usr/lib/python3.5/lib-dynload/_ctypes.cpython-35m-x86_64-linux-gnu.so(_ctypes_callproc+0x49a) [0x7fb9744ce01a]\n[bt] (8) /usr/lib/python3.5/lib-dynload/_ctypes.cpython-35m-x86_64-linux-gnu.so(+0x9fcb) [0x7fb9744c1fcb]\n[bt] (9) python3(PyObject_Call+0x47) [0x5c1797]\n\n'
root@ubuntu:/home/hkulkarni/treelite/tests/python# python3 test.py

It doesn't look like the model file is corrupt, but the error above says so.

To generate the model file I used XGBoost->saveModel(),

which I assume produces the "xgboost" format.

Compilation error on Windows - missing directory

I am trying to read and compile a LightGBM model, but I am having issues with temporary files. My code is as follows:

import treelite
model = treelite.Model.load('lgb_model.txt', model_format='lightgbm')
model.export_lib(toolchain='msvc', libpath='./lgb_model.dll', verbose=True)

And I get the following output:

[00:41:46] c:\projects\treelite-wheels\treelite\src\compiler\ast_native.cc:22: Using ASTNativeCompiler
[00:41:47] c:\projects\treelite-wheels\treelite\src\compiler\ast\split.cc:10: Parallel compilation disabled; all member trees will be dumped to a single source file. This may increase compilation time and memory usage.
[00:41:48] c:\projects\treelite-wheels\treelite\src\c_api\c_api.cc:297: Code generation finished. Writing code to files...
[00:41:48] c:\projects\treelite-wheels\treelite\src\c_api\c_api.cc:314: Writing file header.h...
[00:41:48] c:\projects\treelite-wheels\treelite\src\c_api\c_api.cc:314: Writing file main.c...
[00:41:49] c:\projects\treelite-wheels\treelite\src\c_api\c_api.cc:314: Writing file recipe.json...
[00:41:49] C:\Users\myusername\AppData\Local\Continuum\anaconda3\envs\tf2\lib\site-packages\treelite\contrib\__init__.py:208: WARNING: some of the source files are long. Expect long compilation time. You may want to adjust the parameter parallel_comp.

[00:41:49] C:\Users\myusername\AppData\Local\Continuum\anaconda3\envs\tf2\lib\site-packages\treelite\contrib\util.py:98: Compiling sources files in directory C:\Users\myusername\AppData\Local\Temp\tmp2fs01b49 into object files (*.obj)...

But the compilation fails because it can't write to a temp file. The temp folder never gets created, and the retcode_cpu0.txt file gets created in the same location as the LightGBM model and the Jupyter notebook (I am running this in Jupyter Lab).

FileNotFoundError                         Traceback (most recent call last)
<ipython-input-4-e9892c80d16e> in <module>
----> 1 model.export_lib(toolchain='msvc', libpath='./lgb_model.dll', verbose=True)

~\AppData\Local\Continuum\anaconda3\envs\tf2\lib\site-packages\treelite\frontend.py in export_lib(self, toolchain, libpath, params, compiler, verbose, nthread, options)
     97       self.compile(temp_dir, params, compiler, verbose)
     98       temp_libpath = create_shared(toolchain, temp_dir, nthread,
---> 99                                    verbose, options)
    100       shutil.move(temp_libpath, libpath)
    101 

~\AppData\Local\Continuum\anaconda3\envs\tf2\lib\site-packages\treelite\contrib\__init__.py in create_shared(toolchain, dirpath, nthread, verbose, options)
    219     from .gcc import _create_shared, _openmp_supported
    220   libpath = \
--> 221     _create_shared(dirpath, toolchain, recipe, nthread, options, verbose)
    222   if verbose:
    223     log_info(__file__, lineno(),

~\AppData\Local\Continuum\anaconda3\envs\tf2\lib\site-packages\treelite\contrib\msvc.py in _create_shared(dirpath, toolchain, recipe, nthread, options, verbose)
     86                           .format(_varsall_bat_path(),
     87                                   'amd64' if _is_64bit_windows() else 'x86')
---> 88   return _create_shared_base(dirpath, recipe, nthread, verbose)
     89 
     90 def _check_ext(dllpath):

~\AppData\Local\Continuum\anaconda3\envs\tf2\lib\site-packages\treelite\contrib\util.py in _create_shared_base(dirpath, recipe, nthread, verbose)
    114   result = []
    115   for tid in range(ncpu):
--> 116     result.append(_wait(proc[tid], workqueue[tid]))
    117 
    118   for tid in range(ncpu):

~\AppData\Local\Continuum\anaconda3\envs\tf2\lib\site-packages\treelite\contrib\util.py in _wait(proc, args)
     81   dirpath = args['dirpath']
     82   stdout, _ = proc.communicate()
---> 83   with open(os.path.join(dirpath, 'retcode_cpu{}.txt'.format(tid)), 'r') as f:
     84     retcode = [int(line) for line in f]
     85   return {'stdout':_str_decode(stdout), 'retcode':retcode}

FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\myusername\\AppData\\Local\\Temp\\tmp2fs01b49\\retcode_cpu0.txt'

predictions results do not match original lightgbm

Dependencies:
lightgbm==2.2.2
treelite==0.32
mac, python3

It seems it is somehow related to this issue as well.

LightGBM was trained with the following parameters:

params = {
            'learning_rate': 0.1,
            'n_estimators': 1000,
            'max_depth': 4,
            'random_state': 1
        }

Save the model and load predictor:

joblib.dump(model, model_path)
treelite_model = treelite.Model.load(model_path, model_format='lightgbm')
treelite_model.export_lib(toolchain='gcc', libpath=str(directory) + '/model.so', verbose=False, params={'parallel_comp':10})
predictor = Predictor(str(directory / 'model.so'), verbose=True)

Predict with original model and treelite:

print(model.predict(sample_data))
print(predictor.predict_instance(sample_data.values[0]))

[2.786864]
2.7558372

The difference looks small, but once it is exponentiated it becomes large. I tested with different numbers of trees: the fewer trees I use, the smaller the difference between the predictions.
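
To put a number on that, the gap between the two reported predictions can be turned into a relative error on the exponentiated scale. This is plain arithmetic on the two values printed above and assumes the final quantity is exp(prediction), as the report suggests.

```python
import math

original = 2.786864    # model.predict(sample_data)
compiled = 2.7558372   # predictor.predict_instance(sample_data.values[0])

log_gap = original - compiled          # difference in log space
rel_gap = math.exp(log_gap) - 1.0      # relative difference after exp()

print("log-space gap: {:.7f}".format(log_gap))
print("relative gap after exp(): {:.2%}".format(rel_gap))   # roughly 3%
```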

My LightGBM model can't get the correct prediction value

treelite version:0.32
lightgbm version:2.2.4
big_value_model.txt
I transform the big_value_model.txt to big_value_model.so by

model = treelite.Model.load('model/big_value_model.txt', model_format='lightgbm')
model.export_lib(toolchain='gcc', libpath='./treelite/big_value_model.so', params={'parallel_comp':32}, verbose=True)

Then I use both the .txt and the .so model to predict. The code looks like this:

import lightgbm as lgb
import numpy as np
import treelite.runtime
test_list =[1512972804, 1496750400, '6227', '20009', 0, 0, 1275, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 125, 78, 2, 65, 85, 0, 2, 0, 1, 0, 0, 0, 0]
value_model = lgb.Booster(model_file='big_value_model.txt')
predict_one = value_model.predict([test_list])
batch = treelite.runtime.Batch.from_npy2d(np.array([test_list]))
predictor = treelite.runtime.Predictor('./treelite/big_value_model.so',verbose=True)
predict_two = [predictor.predict(batch)]
print(predict_one,predict_two)
"""
Output:[9301868.184555175],[3049.897216796875]
"""

Can somebody help me find out what's wrong with it?
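
One thing worth checking, offered as a guess rather than a diagnosis: if the runtime stores input features as 32-bit floats (as the "Option for 64bit input data" issue on this page describes), large integers such as the timestamps in test_list cannot be represented exactly. The sketch below uses only the standard library to show the rounding for the first feature; the helper name is made up.

```python
import struct

def to_float32(x):
    """Round a number to the nearest float32 and return it as a Python float."""
    return struct.unpack("f", struct.pack("f", float(x)))[0]

timestamp = 1512972804               # first feature in test_list
as_float32 = to_float32(timestamp)

print(int(as_float32))               # 1512972800
print(timestamp - int(as_float32))   # 4: the low digits are lost
```

Note also that test_list mixes strings ('6227', '20009') with numbers, so np.array([test_list]) produces a string array rather than a numeric one; converting every entry to a number first removes one more variable.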

Loading dll produces "predictor.dll does not contain valid get_num_output_group()"

I'm running on Windows 10, with Visual Studio 2017 compiler tools. I successfully built "predictor.dll".
When I tried to use the dll with:

predictor = treelite.runtime.Predictor('./mymodel/predictor.dll', verbose=True)
I get:


---------------------------------------------------------------------------
TreeliteError                             Traceback (most recent call last)
<ipython-input-10-9b893b29877a> in <module>()
----> 1 predictor = treelite.runtime.Predictor('./mymodel/predictor.dll', verbose=True)

C:\Users\Charles\Anaconda2\lib\site-packages\treelite\runtime\predictor.pyc in __init__(self, libpath, nthread, verbose, include_master_thread)
    260         ctypes.c_int(nthread if nthread is not None else -1),
    261         ctypes.c_int(1 if include_master_thread else 0),
--> 262         ctypes.byref(self.handle)))
    263     # save # of output groups
    264     num_output_group = ctypes.c_size_t()

C:\Users\Charles\Anaconda2\lib\site-packages\treelite\runtime\predictor.pyc in _check_call(ret)
     44   """
     45   if ret != 0:
---> 46     raise TreeliteError(_LIB.TreeliteGetLastError())
     47 
     48 class Batch(object):

TreeliteError: [12:45:56] c:\projects\treelite-wheels\treelite\src\predictor.cc:220: 
Check failed: query_func != nullptr Dynamic shared library
`H:\HedgeTools\Notebooks\Frameworks\SKLearn\XGBClassifier\Deployment_v2\mymodel\
predictor.dll' does not contain valid get_num_output_group() function

Any suggestions will be greatly appreciated,
Charles

leaf index

@hcho3 If I want to get the leaf index of every tree instead of just the final result, do I have to completely rewrite the compiler?

ValueError: invalid literal for int() with base 10 when trying to dump model

I'm encountering the following error when trying to dump a model using Python 3.6.5 running on Mac OS X.

Error:

python dump_treelite.py
[main] treelite
[LightGBM] [Info] Finished loading 250 models
[11:49:15] /Users/travis/build/hcho3/treelite-wheels/treelite/src/frontend/lightgbm.cc:104: Warning: input file was not terminated with end-of-line character.
[11:49:15] /Users/travis/build/hcho3/treelite-wheels/treelite/src/compiler/ast_native.cc:22: Using ASTNativeCompiler
[11:49:15] /Users/travis/build/hcho3/treelite-wheels/treelite/src/compiler/ast/split.cc:15: Parallel compilation enabled; member trees will be divided into 4 translation units.
[11:49:15] /Users/travis/build/hcho3/treelite-wheels/treelite/src/c_api/c_api.cc:297: Code generation finished. Writing code to files...
[11:49:15] /Users/travis/build/hcho3/treelite-wheels/treelite/src/c_api/c_api.cc:314: Writing file recipe.json...
[11:49:15] /Users/travis/build/hcho3/treelite-wheels/treelite/src/c_api/c_api.cc:314: Writing file main.c...
[11:49:15] /Users/travis/build/hcho3/treelite-wheels/treelite/src/c_api/c_api.cc:314: Writing file tu1.c...
[11:49:15] /Users/travis/build/hcho3/treelite-wheels/treelite/src/c_api/c_api.cc:314: Writing file tu3.c...
[11:49:15] /Users/travis/build/hcho3/treelite-wheels/treelite/src/c_api/c_api.cc:314: Writing file header.h...
[11:49:15] /Users/travis/build/hcho3/treelite-wheels/treelite/src/c_api/c_api.cc:314: Writing file tu0.c...
[11:49:15] /Users/travis/build/hcho3/treelite-wheels/treelite/src/c_api/c_api.cc:314: Writing file tu2.c...
[11:49:15] /Users/tbrady/.pyenv/versions/3.6.5/envs/venv365/lib/python3.6/site-packages/treelite/contrib/__init__.py:208: WARNING: some of the source files are long. Expect long compilation time. You may want to adjust the parameter parallel_comp.

[11:49:15] /Users/tbrady/.pyenv/versions/3.6.5/envs/venv365/lib/python3.6/site-packages/treelite/contrib/util.py:96: Compiling sources files in directory /var/folders/n1/fsscqfrd5fg8mybxkv2hl5t0m5v75k/T/tmp4n9b4c9t into object files (*.o)...
Traceback (most recent call last):
  File "dump_treelite.py", line 21, in <module>
    main()
  File "dump_treelite.py", line 16, in main
    params={'parallel_comp':4}, nthread=8, verbose=True)
  File "/Users/tbrady/.pyenv/versions/3.6.5/envs/venv365/lib/python3.6/site-packages/treelite/frontend.py", line 99, in export_lib
    verbose, options)
  File "/Users/tbrady/.pyenv/versions/3.6.5/envs/venv365/lib/python3.6/site-packages/treelite/contrib/__init__.py", line 221, in create_shared
    _create_shared(dirpath, toolchain, recipe, nthread, options, verbose)
  File "/Users/tbrady/.pyenv/versions/3.6.5/envs/venv365/lib/python3.6/site-packages/treelite/contrib/gcc.py", line 61, in _create_shared
    return _create_shared_base(dirpath, recipe, nthread, verbose)
  File "/Users/tbrady/.pyenv/versions/3.6.5/envs/venv365/lib/python3.6/site-packages/treelite/contrib/util.py", line 115, in _create_shared_base
    result.append(_wait(proc[tid], workqueue[tid]))
  File "/Users/tbrady/.pyenv/versions/3.6.5/envs/venv365/lib/python3.6/site-packages/treelite/contrib/util.py", line 82, in _wait
    retcode = [int(line) for line in f]
  File "/Users/tbrady/.pyenv/versions/3.6.5/envs/venv365/lib/python3.6/site-packages/treelite/contrib/util.py", line 82, in <listcomp>
    retcode = [int(line) for line in f]
ValueError: invalid literal for int() with base 10: 'clang -c -O3 -o tu2.o tu2.c -fPIC -std=c99 \n'

Script to dump model:

from sklearn.externals import joblib
import treelite

def main():
    out_fn = 'models/brr_v1.1.0a6_tl.dylib'
    lgb_txt_fn = 'models/brr_v1.1.0a6.lgb.txt'
    print('[main] treelite')
    ranker = joblib.load('models/brr_v1.1.0a6.pkl.gz')
    ranker.booster_.save_model(lgb_txt_fn)
    tl = treelite.Model.load(lgb_txt_fn, model_format='lightgbm')
    tl.export_lib('clang', libpath=out_fn,
                             params={'parallel_comp':4}, nthread=8, verbose=True)
    print('wrote {}'.format(out_fn))


if __name__ == '__main__':
    main()

OS Details:

$ uname -a
Darwin HA002727 16.7.0 Darwin Kernel Version 16.7.0: Thu Jun 15 17:36:27 PDT 2017; root:xnu-3789.70.16~2/RELEASE_X86_64 x86_64

Python details:

$ python
Python 3.6.5 (default, Apr 12 2018, 10:53:09)
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.38)] on darwin

Compiler:

$ clang --version
Apple LLVM version 8.0.0 (clang-800.0.38)
Target: x86_64-apple-darwin16.7.0
Thread model: posix
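
The ValueError in the traceback comes from calling int() on every line of the retcode file, which fails as soon as one line holds an echoed compiler command instead of a return code. A more tolerant parser could look like the sketch below; it is a hypothetical replacement for illustration, not the fix that Treelite adopted, and the sample lines after the first one are invented.

```python
def parse_retcodes(lines):
    """Collect integer return codes, skipping lines that are not integers."""
    retcodes = []
    for line in lines:
        text = line.strip()
        try:
            retcodes.append(int(text))
        except ValueError:
            # e.g. an echoed compiler command that ended up in the file
            continue
    return retcodes

sample = ["clang -c -O3 -o tu2.o tu2.c -fPIC -std=c99 \n", "0\n", "0\n"]
print(parse_retcodes(sample))   # [0, 0]
```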

Gain and Cover

@hcho3 How can I add the gain and cover parameters to the nodes of a tree?

The set_numerical_test_node() method of class Node doesn't take these parameters.

prediction of leaf ids

Awesome project! Thanks!

Could you also add prediction of the leaf ids in each tree?

For instance, if I have 10 trees in the model, then for each event I would get a vector of length 10 with one leaf id per tree.

This function is needed if one wants to get just the partition info for each event.
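
The requested output can be illustrated with a toy forest in plain Python. This is not Treelite's API or data layout; the node encoding, the strict less-than split rule, and the function names are all assumptions chosen for the example.

```python
# Toy forest: each tree maps node_id -> node.
# Internal node: (feature_index, threshold, left_id, right_id); leaf: None.

def leaf_id(tree, row):
    """Walk one tree from the root (node 0) and return the id of the leaf reached."""
    node_id = 0
    while tree[node_id] is not None:
        feature, threshold, left, right = tree[node_id]
        node_id = left if row[feature] < threshold else right
    return node_id

def predict_leaf_ids(forest, row):
    """Return one leaf id per tree, in tree order."""
    return [leaf_id(tree, row) for tree in forest]

forest = [
    {0: (0, 0.5, 1, 2), 1: None, 2: None},
    {0: (1, 2.0, 1, 2), 1: None, 2: (0, 0.9, 3, 4), 3: None, 4: None},
]
print(predict_leaf_ids(forest, [0.7, 3.0]))   # [2, 3]
```

With 10 trees the result would be a list of length 10, which matches the per-event vector described above.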

TreeliteError: Error occured in worker #0: (win 10)

I am trying to compile a LightGBM model in Jupyter Lab. My code is as follows:

import treelite
model = treelite.Model.load(r'D:\1_Student\ChengJQ\data used by code\Result\model\lightgbm\201908\20190819_AX_Cla_total.txt',
                            'lightgbm')
toolchain = 'gcc'   # change this value as necessary
libpath = 'D:/1_Student/ChengJQ/data used by code/Result/model/lightgbm/201908/C++model/20190819_AX_Cla_compaired.dll'
model.export_lib(toolchain=toolchain, params={'parallel_comp': 256},
                 libpath=libpath, verbose=True)

But the compilation fails. And I get the following error:

[22:09:51] c:\projects\treelite-wheels\treelite\src\c_api\c_api.cc:314: Writing file tu254.c...
[22:09:51] C:\Users\TPM\AppData\Roaming\Python\Python36\site-packages\treelite\contrib\__init__.py:208: WARNING: some of the source files are long. Expect long compilation time. You may want to adjust the parameter parallel_comp.

[22:09:52] C:\Users\TPM\AppData\Roaming\Python\Python36\site-packages\treelite\contrib\util.py:98: Compiling sources files in directory C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6 into object files (*.o)...

TreeliteError                             Traceback (most recent call last)
<ipython-input-7-a6aa38f77224> in <module>
      2 libpath = 'D:/1_Student/ChengJQ/data used by code/Result/model/lightgbm/201908/C++model/20190819_AX_Cla_compaired.dll'
      3 model.export_lib(toolchain=toolchain, params={'parallel_comp': 256},
----> 4                  libpath=libpath, verbose=True)

~\AppData\Roaming\Python\Python36\site-packages\treelite\frontend.py in export_lib(self, toolchain, libpath, params, compiler, verbose, nthread, options)
     97       self.compile(temp_dir, params, compiler, verbose)
     98       temp_libpath = create_shared(toolchain, temp_dir, nthread,
---> 99                                    verbose, options)
    100       shutil.move(temp_libpath, libpath)
    101 

~\AppData\Roaming\Python\Python36\site-packages\treelite\contrib\__init__.py in create_shared(toolchain, dirpath, nthread, verbose, options)
    219     from .gcc import _create_shared, _openmp_supported
    220   libpath = \
--> 221     _create_shared(dirpath, toolchain, recipe, nthread, options, verbose)
    222   if verbose:
    223     log_info(__file__, lineno(),

~\AppData\Roaming\Python\Python36\site-packages\treelite\contrib\gcc.py in _create_shared(dirpath, toolchain, recipe, nthread, options, verbose)
     58   recipe['create_library_cmd'] = lib_cmd
     59   recipe['initial_cmd'] = ''
---> 60   return _create_shared_base(dirpath, recipe, nthread, verbose)
     61 
     62 def _check_ext(dllpath):

~\AppData\Roaming\Python\Python36\site-packages\treelite\contrib\util.py in _create_shared_base(dirpath, recipe, nthread, verbose)
    121         f.write(result[tid]['stdout'] + '\n')
    122       raise TreeliteError('Error occured in worker #{}: '.format(tid) +\
--> 123                           '{}'.format(result[tid]['stdout']))
    124 
    125   # 2. Package objects into a dynamic shared library

TreeliteError: Error occured in worker #0: Microsoft Windows [Version 10.0.18362.295]
(c) 2019 Microsoft Corporation. All rights reserved.

C:\Users\TPM\CHJQ_CODE>cd C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>type NUL > retcode_cpu0.txt

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>gcc -c -O3 -o tu221.o tu221.c -fPIC -std=c99 

cc1.exe: out of memory allocating 65536 bytes

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>echo %errorlevel% >> retcode_cpu0.txt

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>gcc -c -O3 -o tu42.o tu42.c -fPIC -std=c99 

cc1.exe: out of memory allocating 65536 bytes

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>echo %errorlevel% >> retcode_cpu0.txt

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>gcc -c -O3 -o tu31.o tu31.c -fPIC -std=c99 

cc1.exe: out of memory allocating 65536 bytes

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>echo %errorlevel% >> retcode_cpu0.txt

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>gcc -c -O3 -o tu146.o tu146.c -fPIC -std=c99 

cc1.exe: out of memory allocating 65536 bytes

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>echo %errorlevel% >> retcode_cpu0.txt

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>gcc -c -O3 -o tu35.o tu35.c -fPIC -std=c99 

cc1.exe: out of memory allocating 65536 bytes

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>echo %errorlevel% >> retcode_cpu0.txt

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>gcc -c -O3 -o tu142.o tu142.c -fPIC -std=c99 

cc1.exe: out of memory allocating 65536 bytes

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>echo %errorlevel% >> retcode_cpu0.txt

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>gcc -c -O3 -o tu49.o tu49.c -fPIC -std=c99 

cc1.exe: out of memory allocating 65536 bytes

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>echo %errorlevel% >> retcode_cpu0.txt

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>gcc -c -O3 -o tu135.o tu135.c -fPIC -std=c99 

cc1.exe: out of memory allocating 65536 bytes

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>echo %errorlevel% >> retcode_cpu0.txt

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>gcc -c -O3 -o tu131.o tu131.c -fPIC -std=c99 

cc1.exe: out of memory allocating 65536 bytes

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>echo %errorlevel% >> retcode_cpu0.txt

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>gcc -c -O3 -o tu22.o tu22.c -fPIC -std=c99 

cc1.exe: out of memory allocating 65536 bytes

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>echo %errorlevel% >> retcode_cpu0.txt

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>gcc -c -O3 -o tu50.o tu50.c -fPIC -std=c99 

cc1.exe: out of memory allocating 65536 bytes

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>echo %errorlevel% >> retcode_cpu0.txt

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>gcc -c -O3 -o tu179.o tu179.c -fPIC -std=c99 

cc1.exe: out of memory allocating 65536 bytes

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>echo %errorlevel% >> retcode_cpu0.txt

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>gcc -c -O3 -o tu176.o tu176.c -fPIC -std=c99 

cc1.exe: out of memory allocating 65536 bytes

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>echo %errorlevel% >> retcode_cpu0.txt

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>gcc -c -O3 -o tu66.o tu66.c -fPIC -std=c99 

cc1.exe: out of memory allocating 65536 bytes

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>echo %errorlevel% >> retcode_cpu0.txt

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>gcc -c -O3 -o tu74.o tu74.c -fPIC -std=c99 

cc1.exe: out of memory allocating 65536 bytes

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>echo %errorlevel% >> retcode_cpu0.txt

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>gcc -c -O3 -o tu82.o tu82.c -fPIC -std=c99 

cc1.exe: out of memory allocating 65536 bytes

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>echo %errorlevel% >> retcode_cpu0.txt

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>gcc -c -O3 -o tu90.o tu90.c -fPIC -std=c99 

cc1.exe: out of memory allocating 65536 bytes

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>echo %errorlevel% >> retcode_cpu0.txt

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>gcc -c -O3 -o tu99.o tu99.c -fPIC -std=c99 

cc1.exe: out of memory allocating 65536 bytes

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>echo %errorlevel% >> retcode_cpu0.txt

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>gcc -c -O3 -o tu107.o tu107.c -fPIC -std=c99 

cc1.exe: out of memory allocating 65536 bytes

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>echo %errorlevel% >> retcode_cpu0.txt

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>gcc -c -O3 -o tu117.o tu117.c -fPIC -std=c99 

cc1.exe: out of memory allocating 65536 bytes

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>echo %errorlevel% >> retcode_cpu0.txt

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>gcc -c -O3 -o tu123.o tu123.c -fPIC -std=c99 

cc1.exe: out of memory allocating 65536 bytes

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>echo %errorlevel% >> retcode_cpu0.txt

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>gcc -c -O3 -o tu127.o tu127.c -fPIC -std=c99 

cc1.exe: out of memory allocating 65536 bytes

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>echo %errorlevel% >> retcode_cpu0.txt

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>gcc -c -O3 -o tu148.o tu148.c -fPIC -std=c99 

cc1.exe: out of memory allocating 65536 bytes

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>echo %errorlevel% >> retcode_cpu0.txt

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>gcc -c -O3 -o tu152.o tu152.c -fPIC -std=c99 

cc1.exe: out of memory allocating 65536 bytes

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>echo %errorlevel% >> retcode_cpu0.txt

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>gcc -c -O3 -o tu156.o tu156.c -fPIC -std=c99 

cc1.exe: out of memory allocating 65536 bytes

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>echo %errorlevel% >> retcode_cpu0.txt

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>gcc -c -O3 -o tu161.o tu161.c -fPIC -std=c99 

cc1.exe: out of memory allocating 65536 bytes

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>echo %errorlevel% >> retcode_cpu0.txt

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>gcc -c -O3 -o tu165.o tu165.c -fPIC -std=c99 

cc1.exe: out of memory allocating 65536 bytes

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>echo %errorlevel% >> retcode_cpu0.txt

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>gcc -c -O3 -o tu169.o tu169.c -fPIC -std=c99 

cc1.exe: out of memory allocating 65536 bytes

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>echo %errorlevel% >> retcode_cpu0.txt

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>gcc -c -O3 -o tu173.o tu173.c -fPIC -std=c99 

cc1.exe: out of memory allocating 65536 bytes

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>echo %errorlevel% >> retcode_cpu0.txt

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>gcc -c -O3 -o tu183.o tu183.c -fPIC -std=c99 

cc1.exe: out of memory allocating 65536 bytes

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>echo %errorlevel% >> retcode_cpu0.txt

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>gcc -c -O3 -o tu191.o tu191.c -fPIC -std=c99 

cc1.exe: out of memory allocating 65536 bytes

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>echo %errorlevel% >> retcode_cpu0.txt

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>gcc -c -O3 -o tu199.o tu199.c -fPIC -std=c99 

cc1.exe: out of memory allocating 65536 bytes

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>echo %errorlevel% >> retcode_cpu0.txt

C:\Users\TPM\AppData\Local\Temp\tmpk_p4h0f6>

Any advice on this?

Tree ensemble transformations

@hcho3 I am interested in doing research on transformations such as combining multiple trees together, different pruning methods etc.

Would we be able to support these transformations as a part of the API?

Something like:

model.transform(pruning_transformation)
model.compile()
...

Having an intermediate representation of a tree structure, independent of the algorithm that generated it, is a very useful thing for this kind of work.
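To make the request concrete, here is a minimal sketch of what a pruning transformation over an algorithm-independent tree representation could look like. The `Node` class, `prune_redundant_splits`, and `predict` are hypothetical names made up for illustration; none of them is part of the treelite API.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    # Hypothetical intermediate representation: a node is a leaf when it has
    # no children, otherwise it tests "x[feature] < threshold".
    value: float = 0.0
    feature: int = -1
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def prune_redundant_splits(node: Node) -> Node:
    """Collapse every split whose two children are leaves with equal values."""
    if node.is_leaf():
        return node
    left = prune_redundant_splits(node.left)
    right = prune_redundant_splits(node.right)
    if left.is_leaf() and right.is_leaf() and left.value == right.value:
        return Node(value=left.value)
    return Node(feature=node.feature, threshold=node.threshold,
                left=left, right=right)


def predict(node: Node, x: Sequence[float]) -> float:
    """Walk the tree for one input row and return the leaf value."""
    while not node.is_leaf():
        node = node.left if x[node.feature] < node.threshold else node.right
    return node.value


# A tree whose left subtree splits on feature 1 but yields 3.0 either way.
tree = Node(feature=0, threshold=1.0,
            left=Node(feature=1, threshold=2.0,
                      left=Node(value=3.0), right=Node(value=3.0)),
            right=Node(value=5.0))
pruned = prune_redundant_splits(tree)
```

The transformation returns a new tree rather than mutating its input, so several transformations could be chained before compiling.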

Please add version attribute

print(treelite.version)
AttributeError: module 'treelite' has no attribute 'version'
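Until such an attribute exists, one possible workaround is to read the version from the installed package metadata. This sketch assumes Python 3.8 or newer (for `importlib.metadata`) and that the package was installed under the distribution name `treelite`; `installed_version` is a helper name chosen here for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string if treelite is installed, otherwise None.
print(installed_version("treelite"))
```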

Errors when getting dump from the xgboost model exported from treelite

import json
import xgboost

# `model` is the treelite model and `local_model` the output path.
model.export_as_xgboost(local_model, name_obj="reg:linear")
bst = xgboost.Booster()
bst.load_model(local_model)
trees = bst.get_dump(with_stats=True, dump_format='json')
for tree in trees:
    print(tree)
    json.loads(tree)

JSONDecodeError Traceback (most recent call last)
in ()
2 for tree in trees:
3 print(tree)
----> 4 json.loads(tree)

~/anaconda/envs/g/lib/python3.6/json/__init__.py in loads(s, encoding, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
352 parse_int is None and parse_float is None and
353 parse_constant is None and object_pairs_hook is None and not kw):
--> 354 return _default_decoder.decode(s)
355 if cls is None:
356 cls = JSONDecoder

~/anaconda/envs/g/lib/python3.6/json/decoder.py in decode(self, s, _w)
337
338 """
--> 339 obj, end = self.raw_decode(s, idx=_w(s, 0).end())
340 end = _w(s, end).end()
341 if end != len(s):

~/anaconda/envs/g/lib/python3.6/json/decoder.py in raw_decode(self, s, idx)
355 obj, end = self.scan_once(s, idx)
356 except StopIteration as err:
--> 357 raise JSONDecodeError("Expecting value", s, err.value) from None
358 return obj, end

JSONDecodeError: Expecting value: line 1 column 113 (char 112)

Missing predictor.h

The version 0.32 has no predictor.h?

support for other operators in export_as_xgboost

Hello,
I'm trying to convert RankLib ensembles to xgboost using treelite. I can programmatically build an ensemble with treelite from a RankLib .xml file, commit it, and then call export_as_xgboost. Unfortunately, RankLib's split operator is "<=". While treelite seems to support this operator, when I try to export to xgboost the library complains that only the "<" operator is allowed. One way to bypass this issue is to convert every inequality into a "<" one, but this requires changing the sign of the thresholds and of the feature values, and flipping the tree. Not very practical.

Is there any particular reason for this limitation? Can I help in some way to fix this?
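A different workaround that avoids flipping the tree, offered only as a sketch: for non-NaN floating-point values of a fixed precision, `x <= t` is equivalent to `x < t'`, where `t'` is the next representable value above `t`. The code below assumes thresholds and feature values are stored as 32-bit floats (an assumption about the target format, not something confirmed in this thread); `next_float32_up` is a hypothetical helper, not a treelite or xgboost function.

```python
import math
import struct


def next_float32_up(t: float) -> float:
    """Return the smallest float32 value strictly greater than t.

    Assumes t is exactly representable as a float32. NaN and +inf are
    returned unchanged because they have no successor.
    """
    if math.isnan(t) or t == math.inf:
        return t
    if t == 0.0:
        bits = 1  # smallest positive subnormal; also covers -0.0
    else:
        (bits,) = struct.unpack("<I", struct.pack("<f", t))
        # Positive values grow with their bit pattern, negative ones shrink.
        bits = bits + 1 if t > 0 else bits - 1
    (result,) = struct.unpack("<f", struct.pack("<I", bits))
    return result


# Rewrite the split "x <= 0.5" as the equivalent "x < strict_threshold".
threshold = 0.5
strict_threshold = next_float32_up(threshold)
```

With this rewrite the tree structure and the sign of the features stay untouched; only each threshold is nudged up by one unit in the last place.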

Conda

Hi,

Could you add treelite to conda? Thanks!

Error while using on OSX

After I import treelite using python 3.6, I get the following error message.

OSError: dlopen(/usr/local/lib/python3.6/site-packages/treelite/libtreelite.dylib, 6): Library not loaded: /usr/local/opt/gcc/lib/gcc/7/libgomp.1.dylib
  Referenced from: /usr/local/lib/python3.6/site-packages/treelite/libtreelite.dylib
  Reason: image not found

I tried both installing using pip and compiling manually. Is there something I'm missing?

Build fail on 32bit platforms like Rasp3b

It seems size_t is the same type as uint32_t on the RPi, which causes the build to fail.

treelite/include/treelite/common.h:372:15: error: redefinition of ‘T treelite::common::TextToNumber(const string&) [with T = unsigned int; std::__cxx11::string = std::__cxx11::basic_string<char>]’
 inline size_t TextToNumber(const std::string& str) {
               ^~~~~~~~~~~~
/home/pi/workplace/neo-ai/neo-ai-dlr/3rdparty/treelite/include/treelite/common.h:357:17: note: ‘T treelite::common::TextToNumber(const string&) [with T = unsigned int; std::__cxx11::string = std::__cxx11::basic_string<char>]’ previously declared here
 inline uint32_t TextToNumber(const std::string& str) {
                 ^~~~~~~~~~~~

List of TODOs before first release

Items with strikethroughs are finished

To-do's by first release

~~Implement shared library generation scripts in Python: need at least gcc, MSVC, clang~~
~~Separate out runtime library (libtreelite_runtime.so) from the rest of tree-lite (libtreelite.so)~~
~~Finalize Python API and subpackage organization~~
~~Add a library of prediction transform functions that transform margins into probabilities. Each library will have predict_margin and pred_transform functions~~
Write polished IPython notebooks that go through the steps of importing scikit-learn models.
- will need to elaborate on idiosyncrasies, e.g. how to encode random forest classifiers with multiple classes.
~~Submit to PyPI~~

Postponed items

Offer R binding
Add integration and unit tests
Hook up Continuous Integration service
Add ranking facilities to DMatrix class
Support "base predictor": a set of base margins before the first trees
Support weighted instances (e.g. for Adaboost)

pip3 install --user treelite not working

predictions results do not match original lightgbm

#94 I installed treelite by compiling the latest code, and the prediction results differ significantly between the treelite model and the original model. I have been confused by this for a few days.
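When reporting a mismatch like this, it helps to state how large it is. A minimal sketch for measuring the gap between two sets of predictions follows; the numbers are illustrative placeholders, not output from any real model, and `max_abs_diff` is a helper name chosen here.

```python
def max_abs_diff(expected, actual):
    """Return the largest absolute difference between two prediction lists."""
    if len(expected) != len(actual):
        raise ValueError("prediction lists must have the same length")
    return max((abs(e - a) for e, a in zip(expected, actual)), default=0.0)


# Placeholder values standing in for original-model and compiled-model output.
original_preds = [0.1, 0.5, 0.9]
compiled_preds = [0.1, 0.75, 0.9]
print(max_abs_diff(original_preds, compiled_preds))
```

A difference near floating-point rounding error usually points to precision, while a large one suggests a structural mismatch in how the model was converted.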

static lib and BUILD_SHARED_LIBS

Great project! This really simplifies deployment.

Would you consider omitting the forced SHARED library build and deferring to CMake's BUILD_SHARED_LIBS so that the user can choose? I'm happy to send a PR.

treelite/CMakeLists.txt

Lines 118 to 119 in 5e057bf

add_library(treelite SHARED $<TARGET_OBJECTS:objtreelite> $<TARGET_OBJECTS:objtreelite_common>)
add_library(treelite_runtime SHARED $<TARGET_OBJECTS:objtreelite_runtime> $<TARGET_OBJECTS:objtreelite_common>)

Thread safety for prediction C API TreelitePredictorPredictBatch

Deploy with uwsgi

I'm using treelite to deploy my xgboost model. To get a higher QPS I need to parallelize my program, so I use uwsgi to do that.

However, it does not seem to work: "Time per request" grows roughly linearly with the number of concurrent requests. Could you give me some advice?

Below is my "apache ab" report:

request with 1 concurrent (ab -c 1 -n 1000 -p postfile.txt http://localhost:6181/api)

Concurrency Level:      1
Time taken for tests:   5.163 seconds
Complete requests:      1000
Failed requests:        0
Total transferred:      92000 bytes
Total body sent:        232000
HTML transferred:       21000 bytes
Requests per second:    193.68 [#/sec] (mean)
Time per request:       5.163 [ms] (mean)
Time per request:       5.163 [ms] (mean, across all concurrent requests)
Transfer rate:          17.40 [Kbytes/sec] received
                        43.88 kb/s sent
                        61.28 kb/s total

request with 10 concurrent (ab -c 10 -n 1000 -p postfile.txt http://localhost:6181/api)

Concurrency Level:      10
Time taken for tests:   3.453 seconds
Complete requests:      1000
Failed requests:        0
Total transferred:      92000 bytes
Total body sent:        232000
HTML transferred:       21000 bytes
Requests per second:    289.59 [#/sec] (mean)
Time per request:       34.532 [ms] (mean)
Time per request:       3.453 [ms] (mean, across all concurrent requests)
Transfer rate:          26.02 [Kbytes/sec] received
                        65.61 kb/s sent
                        91.63 kb/s total
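The two reports can be cross-checked with a little arithmetic. The sketch below assumes that ab derives its mean "Time per request" as concurrency × total time / completed requests, which matches the figures above; the function names are chosen here for illustration.

```python
def time_per_request_ms(concurrency: int, total_seconds: float, completed: int) -> float:
    """Mean latency seen by one client, in milliseconds."""
    return concurrency * total_seconds * 1000.0 / completed


def requests_per_second(total_seconds: float, completed: int) -> float:
    """Overall throughput of the server."""
    return completed / total_seconds


# Figures copied from the two reports above.
latency_c1 = time_per_request_ms(1, 5.163, 1000)    # about 5.16 ms
latency_c10 = time_per_request_ms(10, 3.453, 1000)  # about 34.53 ms
speedup = requests_per_second(3.453, 1000) / requests_per_second(5.163, 1000)
print(latency_c1, latency_c10, speedup)
```

Read this way, ten concurrent clients raise throughput by only about 1.5x while per-request latency rises almost 7x, which is the pattern expected when requests are mostly being served one at a time.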

New release?

Hello everyone,
in our software we are using the C API of treelite (thanks a lot!), pinned to a commit hash instead of a release tag. We were wondering whether a new release is coming, or whether the package is safe to use at head.

error: ‘inf’ undeclared with `export_lib`

Hi, treelite looks excellent, but so far I can't seem to get my lightgbm model to export correctly.
Any pointers?

In [10]: tl.export_lib(toolchain='gcc', libpath='tree_test.so', verbose=False)
[19:31:42] /io/treelite/src/compiler/recursive.cc:188: Parallel compilation disabled; all member trees will be dump to a single source file. This may increase compilation time and memory usage.
[19:31:42] /home/ubuntu/anaconda3/envs/lgb207/lib/python3.6/site-packages/treelite/contrib/__init__.py:99: WARNING: some of the source files are long. Expect long compilation time. You may want to adjust the parameter parallel_comp.

---------------------------------------------------------------------------
TreeliteError                             Traceback (most recent call last)
<ipython-input-10-7ed5ccaceb4c> in <module>()
----> 1 tl.export_lib(toolchain='gcc', libpath='tree_test.so', verbose=False)

~/anaconda3/envs/lgb207/lib/python3.6/site-packages/treelite/frontend.py in export_lib(self, toolchain, libpath, params, compiler, verbose, nthread, options)
     90       self.compile(temp_dir, params, compiler, verbose)
     91       temp_libpath = create_shared(toolchain, temp_dir, nthread,
---> 92                                    verbose, options)
     93       shutil.move(temp_libpath, libpath)
     94

~/anaconda3/envs/lgb207/lib/python3.6/site-packages/treelite/contrib/__init__.py in create_shared(toolchain, dirpath, nthread, verbose, options)
    112   else:
    113     raise ValueError('toolchain {} not supported'.format(toolchain))
--> 114   libpath = _create_shared(dirpath, recipe, nthread, options, verbose)
    115   if verbose:
    116     log_info(__file__, lineno(),

~/anaconda3/envs/lgb207/lib/python3.6/site-packages/treelite/contrib/gcc.py in _create_shared(dirpath, recipe, nthread, options, verbose)
     90         f.write(result[id]['stdout'] + '\n')
     91       raise TreeliteError('Error occured in worker #{}: '.format(id) +\
---> 92                           '{}'.format(result[id]['stdout']))
     93
     94   # 2. Package objects into a dynamic shared library (.so)

TreeliteError: Error occured in worker #0: tmpl86ac6yh.c: In function ‘predict_margin’:
tmpl86ac6yh.c:10072:61: error: ‘inf’ undeclared (first use in this function)
           if ( (data[1].missing != -1) && data[1].fvalue <= inf) {
                                                             ^
tmpl86ac6yh.c:10072:61: note: each undeclared identifier is reported only once for each function it appears in
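The error suggests that an infinite threshold was written into the C source as the bare word `inf`, which is not a C identifier. As a sketch of how a code generator could emit floating-point constants safely, the hypothetical helper below maps non-finite values to the `INFINITY` and `NAN` macros from `<math.h>` and prints finite values with enough digits to round-trip. This is not treelite's actual code generator.

```python
import math


def c_double_literal(x: float) -> str:
    """Format a Python float as text usable as a C double constant.

    Non-finite values map to the <math.h> macros; finite values use repr(),
    which yields the shortest string that round-trips to the same double.
    """
    if math.isnan(x):
        return "NAN"
    if math.isinf(x):
        return "INFINITY" if x > 0 else "-INFINITY"
    return repr(x)


print(c_double_literal(float("inf")))
print(c_double_literal(0.51548230441328713))
```

The generated file would then need `#include <math.h>` for the macros to resolve.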

Using export_lib on a windows machine

I am just starting to learn how to use treelite. I am following the "First Tutorial" using one of my models.

import os
import treelite as trl
model = trl.Model.load(os.path.join("TestingModels", mf), 'xgboost')
model.export_lib(toolchain="msvc", libpath="./mymodel.dll", verbose=False)

The model loads perfectly, creates header.h, main.c, and recipe.json just fine, but then says it cannot find the specified file. I thought "mymodel.dll" was the name of the file being written? I am a bit confused so help would be appreciated.
I am using Windows 10, Python 2.7, and Atom Editor.

sklearn binary classifier: compiled and Python model discrepancies?

Dear treelite community,

What could cause discrepancies between the predictions of a sklearn random forest binary classifier in the original Python code and in the compiled code?

I used the gallery interface for compilation and treelite.runtime for deployment, in case that matters.

PS: thanks for treelite, I really enjoy the 10-15x speedup in my compiled regression models (they work just fine).

support for QuickScorer and RapidScorer

Are there any plans to support state-of-the-art research such as
QuickScorer (https://www.cse.cuhk.edu.hk/irwin.king/_media/presentations/sigir15bestpaper.pdf)
and RapidScorer (https://www.kdd.org/kdd2018/accepted-papers/view/rapidscorer-fast-tree-ensemble-evaluation-by-maximizing-compactness-in-data) ?

TODO: release 1.0, with better wheel packaging

It is about time to make another release.
