Comments (3)
The tokenizer will tokenize the string in the following way:
words | tokens |
---|---|
This, That, And the Other | this , that , and the other |
It's not splitting text into tokens using a comma delimiter.
If you want the behavior to instead be three tokens, "This", "That", and "And The Other", I suggest preprocessing those columns and passing text that has already been feature engineered.
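One possible way to do that preprocessing outside of ModelFox is sketched below. This is only an illustration, not something ModelFox provides: the helper names `split_tokens` and `expand_column`, the `words=<token>` column naming, and the choice of 0/1 indicator columns are all assumptions made for the example.

```python
def split_tokens(text):
    """Split a comma-delimited string into trimmed, non-empty tokens."""
    return [part.strip() for part in text.split(",") if part.strip()]


def expand_column(rows, column):
    """Replace `column` with one "0"/"1" indicator column per distinct token.

    `rows` is a list of dicts, for example as produced by csv.DictReader.
    The output column names ("<column>=<token>") are a hypothetical convention.
    """
    vocabulary = sorted(
        {token for row in rows for token in split_tokens(row[column])}
    )
    expanded = []
    for row in rows:
        present = set(split_tokens(row[column]))
        # Copy every column except the one being expanded.
        new_row = {key: value for key, value in row.items() if key != column}
        for token in vocabulary:
            new_row[f"{column}={token}"] = "1" if token in present else "0"
        expanded.append(new_row)
    return expanded


rows = [{"words": "This, That, And the Other"}, {"words": "This"}]
expanded = expand_column(rows, "words")
```

The expanded rows could then be written back out with `csv.DictWriter` and used as the training file, so that the comma-separated phrases are never passed through the text tokenizer.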
from modelfox.
Do you have an example of how that would work?
How can I pass text in any other way in the column?
from modelfox.
You would need to pre-process your CSV using another tool. Alternatively, you can treat the column as an enum column by using a custom config file, as described here: https://www.modelfox.dev/docs/guides/train_with_custom_configuration.
In the example linked above, the "chest_pain" column is specified as type "enum" with four variants.
{
  "dataset": {
    "columns": [
      {
        "name": "chest_pain",
        "type": "enum",
        "variants": [
          "asymptomatic",
          "atypical angina",
          "non-angina pain",
          "typical angina"
        ]
      },
      ...
    ]
  }
}
For your dataset, you would specify that the words column is an enum with 3 variants: "This", "That", "And The Other".
Then, use the config file by passing --config path/to/config.json on the CLI.
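As a rough sketch, a config file of that shape could be generated with a few lines of Python. The structure mirrors the "chest_pain" example above; the helper name `build_enum_config` is hypothetical, and the variant strings are assumed to match the values in your CSV exactly.

```python
import json


def build_enum_config(column_name, variants):
    """Build a config dict that declares one column as an enum.

    The layout follows the "chest_pain" example from the linked guide;
    `build_enum_config` itself is a hypothetical helper, not a ModelFox API.
    """
    return {
        "dataset": {
            "columns": [
                {
                    "name": column_name,
                    "type": "enum",
                    "variants": list(variants),
                }
            ]
        }
    }


config = build_enum_config("words", ["This", "That", "And The Other"])
config_json = json.dumps(config, indent=2)
print(config_json)
```

Saving the printed JSON as config.json gives you the file to pass with --config.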
from modelfox.