Comments (15)

annaveronika avatar annaveronika commented on May 17, 2024 2

Categorical feature indices must be correct. If features 0 and 2 are categorical and the others are numeric, then the value should be cat_index=[0,2].
This error means that some categorical feature was not included in this list, so the algorithm tried to parse it as numeric.
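
A minimal sketch of deriving the index list from column names instead of counting positions by hand. The column names and the helper `cat_feature_indices` are hypothetical and not part of CatBoost; only the resulting list is what would be passed as the categorical-feature indices.

```python
def cat_feature_indices(columns, categorical_names):
    """Return the 0-based positions of the categorical columns, in column order."""
    wanted = set(categorical_names)
    unknown = wanted - set(columns)
    if unknown:
        raise ValueError(f"columns not found: {sorted(unknown)}")
    return [i for i, name in enumerate(columns) if name in wanted]


# Hypothetical dataset layout: features 0 and 2 are categorical.
columns = ["city", "age", "device", "income"]
cat_index = cat_feature_indices(columns, ["city", "device"])
print(cat_index)  # [0, 2]
```

The resulting list can then be given as cat_features=cat_index when fitting the model or building a Pool.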

from catboost.

annaveronika avatar annaveronika commented on May 17, 2024 2

No. If you provide the date and declare it a categorical feature, it will be treated as a string with that value.
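
To illustrate the difference: a date declared as categorical stays a string, so each distinct date is simply one category. If the ordering of dates matters, the conversion to a number would have to be done by the user before training. A sketch with made-up dates, using only the standard library:

```python
from datetime import date

raw_dates = ["2018-01-15", "2018-03-02", "2018-01-15"]  # made-up values

# Option 1: keep the values as strings and list the column as categorical.
as_category = [str(d) for d in raw_dates]

# Option 2: convert to a number yourself and treat the column as numeric.
# Here the number is the day count from date.toordinal().
as_number = [float(date.fromisoformat(d).toordinal()) for d in raw_dates]

print(as_category)
print(as_number[1] - as_number[0])  # days between the first two dates
```

Which option is better depends on the data; this only shows that the numeric conversion is an explicit step, not something done automatically for categorical columns.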

sergmiller avatar sergmiller commented on May 17, 2024 2

@shivamsaboo17, for now, if you use the Python package you should convert the labels to categories, or try the CatBoost command-line version with the --class-names option. The ability to use class names in the Python version will be added soon.

Automatic setting of class_names will also be added to both versions.
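
One way to convert string labels to integer categories before training, sketched in plain Python. The helper `encode_labels` and the label values are made up for illustration; a library encoder would do the same job.

```python
def encode_labels(labels):
    """Map each distinct label to an integer code; return (codes, class_names)."""
    class_names = sorted(set(labels))
    code_of = {name: code for code, name in enumerate(class_names)}
    return [code_of[label] for label in labels], class_names


labels = ["spam", "ham", "spam", "other"]  # made-up target values
codes, class_names = encode_labels(labels)
print(codes)        # [2, 0, 2, 1]
print(class_names)  # ['ham', 'other', 'spam']

# Map integer predictions back to the original strings.
decoded = [class_names[c] for c in codes]
```

Keeping class_names around makes it possible to translate predicted codes back to the original label strings.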

amresh1495 avatar amresh1495 commented on May 17, 2024 1

Does it convert the date into a numerical value?

Donskov7 avatar Donskov7 commented on May 17, 2024

@abaspinar, could you provide your data?
Also, check that all the features you pass in cat_features have the right indices (indices start from 0).

abaspinar avatar abaspinar commented on May 17, 2024

Thanks! @Donskov7
It was because of wrong indices.

parthparashar1647 avatar parthparashar1647 commented on May 17, 2024

I didn't understand, @abaspinar. How did you correct the wrong indices?

shivamsaboo17 avatar shivamsaboo17 commented on May 17, 2024

I am getting the same error, but with the target labels, which are in string format. Should I convert the labels to categories, since string labels are not supported?

goshulina avatar goshulina commented on May 17, 2024

Same issue.
What should I fix in this case?
I trained CatBoost and defined the categorical features like this:
clf.fit(train, labels, cat_dims)
When I look at the indices of the categorical and float features in the trained model, they are the same as in the cat_dims variable.
I get this error when I load the test set and try to predict classes:
_catboost.CatboostError: Bad value for num_feature[0,3]="1.281.501.0": Cannot convert 'b'1.281.501.0'' to float

My cat indexes in trained model:
[0, 1, 2, 9, 10, 11, 12, 13, 15, 16, 19, 21, 24, 28, 29, 31, 32, 34, 35, 36, 38, 39]
So does the "Bad value for num_feature[0,3]" error mean that my third feature was float (which is true; it is not in the list of categorical indices above), that it was fine during training, and that now, when I try to predict on the test sample, it cannot be converted to float because the third feature in the test sample is categorical (judging by the value "1.281.501.0")? And how can I find out the name of this feature? "The third feature" does not tell me anything. What should I do to load the test samples and make a prediction successfully?

P.S. I open the test sample in pandas with the same data types as the train set, and the number of features is the same.
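
A diagnostic along these lines can help locate the offending column by name. This is only a sketch: the helper `find_bad_numeric_values`, the column names, and the rows are made up, and it simply reports every value in a non-categorical column that Python's float() cannot parse.

```python
def find_bad_numeric_values(rows, columns, cat_indices):
    """Yield (row, feature index, column name, value) for each value in a
    non-categorical column that cannot be parsed as a float."""
    cat = set(cat_indices)
    for r, row in enumerate(rows):
        for j, value in enumerate(row):
            if j in cat:
                continue
            try:
                float(value)
            except (TypeError, ValueError):
                yield r, j, columns[j], value


# Made-up test data: columns 0 and 2 are declared categorical.
columns = ["region", "price", "brand", "volume"]
cat_indices = [0, 2]
test_rows = [
    ["north", "12.5", "acme", "1.281.501.0"],
    ["south", "7.0", "zeta", "300"],
]
problems = list(find_bad_numeric_values(test_rows, columns, cat_indices))
for problem in problems:
    print(problem)  # (0, 3, 'volume', '1.281.501.0')
```

A value such as "1.281.501.0" may be a number written with separators rather than a category, in which case cleaning the test file is the fix, not changing the categorical indices.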

annaveronika avatar annaveronika commented on May 17, 2024

@goshulina Opened #633

rousso1 avatar rousso1 commented on May 17, 2024

For the sake of future visitors: this error is most likely thrown when you have a categorical feature that was not cast to string prior to fitting.

jendefig avatar jendefig commented on May 17, 2024

I'm having the same problem on a new dataset. In the Coursera example all the column dtypes are int64 and it works fine; nothing was converted. My new dataset throws the error with a float64 column that I converted to string (object)... @annaveronika what am I missing?

bharath7896 avatar bharath7896 commented on May 17, 2024

Do we need to label-encode categorical features that hold strings (dtype object) to numeric before the data can be put into a Pool? I am also getting the same error:

TypeError: Cannot convert 'b'W'' to float

annaveronika avatar annaveronika commented on May 17, 2024

> I'm having the same problem on a new dataset. In the coursera example all the column dtypes are int64 and it works fine. Nothing was converted. My new dataset is throwing the error with a float64 that I converted to string (object) and am having the same problem... @annaveronika what am I missing?

Floating-point columns are not allowed for categorical features. The best option is to use the Categorical type; integer or object types also work.
Here is the explanation of why you cannot use float numbers for categorical features:
https://catboost.ai/docs/concepts/faq.html#specify-weights-or-baseline-for-eval-set
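
If a categorical column happens to be stored as float64 (for example integer codes read with missing values), one option is to turn each value into a string category before training. A sketch with a hypothetical helper `float_to_category`; the "missing" placeholder for NaN is an assumption of this example, not CatBoost behavior.

```python
def float_to_category(value):
    """Turn a float-coded category such as 3.0 into the string '3'."""
    if value != value:  # NaN is the only value not equal to itself
        return "missing"  # placeholder category chosen for this example
    if float(value).is_integer():
        return str(int(value))
    raise ValueError(f"{value!r} is not an integer-valued code")


store_id = [3.0, 17.0, float("nan"), 3.0]  # made-up float64 column
as_category = [float_to_category(v) for v in store_id]
print(as_category)  # ['3', '17', 'missing', '3']
```

Raising on non-integer values is deliberate: a column with values like 2.5 is probably numeric, not categorical.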

annaveronika avatar annaveronika commented on May 17, 2024

> do we need to label encode the string containing categorical features which contain dtypes as object to numeric ? for data to be pooled ? because I am also getting same error
>
> TypeError: Cannot convert 'b'W'' to float

This error means that you have not listed your categorical feature in the cat_features parameter. By default, all features are considered numeric. You must explicitly say that the feature is categorical.
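
As a rough aid for building that list, the sketch below flags every column that contains at least one string value. The helper `infer_cat_features` and the rows are made up; integer-coded categories would not be detected this way and would have to be added by hand.

```python
def infer_cat_features(rows):
    """Return indices of columns that contain at least one str or bytes value."""
    n_features = len(rows[0])
    return [
        j for j in range(n_features)
        if any(isinstance(row[j], (str, bytes)) for row in rows)
    ]


# Made-up rows: columns 1 and 3 hold strings such as 'W'.
rows = [
    [1.5, "W", 10, "red"],
    [2.0, "M", 12, "blue"],
]
cat_features = infer_cat_features(rows)
print(cat_features)  # [1, 3]
```

The list would then be passed explicitly as cat_features, since nothing is treated as categorical unless it is listed.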