
Comments (4)

dumitrescustefan commented on September 2, 2024

I think we should switch from optparse to argparse in the near future. Also, for the tokenization and (optional) sentence-splitting tasks, we need separate parameters for the train/dev/test raw-text files (e.g. --raw-train-file) so that we can evaluate tok and ss at train time. I'll look into it.
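A minimal sketch of what the raw-text parameters could look like with argparse. Only `--raw-train-file` is named above; `--raw-dev-file`, `--raw-test-file`, and the `build_parser` helper are assumed names for illustration, not existing nlp-cube options.

```python
import argparse


def build_parser():
    # Hypothetical parser: only --raw-train-file comes from the comment above,
    # the dev/test flag names are assumed by analogy.
    parser = argparse.ArgumentParser(
        description="Train a tokenizer / sentence splitter (sketch)")
    parser.add_argument("--raw-train-file", dest="raw_train_file",
                        help="raw text used to train tok/ss")
    parser.add_argument("--raw-dev-file", dest="raw_dev_file",
                        help="raw text used to evaluate tok/ss at train time")
    parser.add_argument("--raw-test-file", dest="raw_test_file", default=None,
                        help="optional raw text for test-set results")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is testable.
args = build_parser().parse_args(
    ["--raw-train-file", "train.txt", "--raw-dev-file", "dev.txt"])
```

Leaving `--raw-test-file` with a default of `None` keeps the test split optional.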

from nlp-cube.

tiberiu44 commented on September 2, 2024

So the argparse suggestion is pretty clear. But I have to ask: should we create separate entry points for train, test (which I would actually rename to "run" or "process"), and runserver?


dumitrescustefan commented on September 2, 2024

I think we should think like users, not like developers. If I were a user, I'd like an entry point with something like:
-train
-task = tok / lemma / etc. (which task we should create a model for)
-train & dev conllu files (a test file is fine if we also want to test, but it is optional)
-dev raw-text file (so we can train the tok and ss), test raw-text file (optional, if we want results on test)
-other params like config (which should have ALL config sections for ALL tasks), base folder path, batch size, etc.

-run (run is better than test)
-a config file OR a parameter that specifies the task chain, like: ss,tok,tag,lemma,parse, so we can run only the required components
-raw text file (if ss/tok are specified in the processing chain) OR a conllu file if not
-output conllu file
-other params like a config file (having ALL config sections in a single file), base folder to load the model from, batch size, etc.
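The train/run split above could be sketched with argparse subcommands roughly as follows. All flag names, the `cube` program name, and the helpers `parse_chain` and `needs_raw_text` are assumptions made for illustration; only the task names in the chain (ss, tok, tag, lemma, parse) come from the list above.

```python
import argparse

# Task names taken from the chain example in the comment above.
KNOWN_TASKS = ("ss", "tok", "tag", "lemma", "parse")


def parse_chain(value):
    """Turn 'ss,tok,tag' into ['ss', 'tok', 'tag'], rejecting unknown tasks."""
    tasks = [t.strip() for t in value.split(",") if t.strip()]
    unknown = [t for t in tasks if t not in KNOWN_TASKS]
    if unknown:
        raise argparse.ArgumentTypeError(
            "unknown task(s): " + ", ".join(unknown))
    return tasks


def build_parser():
    # Hypothetical CLI layout; none of these flags exist in nlp-cube as written.
    parser = argparse.ArgumentParser(prog="cube")
    sub = parser.add_subparsers(dest="command")

    train = sub.add_parser("train", help="train a model for one task")
    train.add_argument("--task", choices=KNOWN_TASKS, required=True)
    train.add_argument("--train-file", required=True, help="train conllu file")
    train.add_argument("--dev-file", required=True, help="dev conllu file")
    train.add_argument("--test-file", help="optional test conllu file")
    train.add_argument("--raw-dev-file", help="dev raw text (tok/ss only)")
    train.add_argument("--raw-test-file", help="optional test raw text")
    train.add_argument("--config", help="single config file for all tasks")
    train.add_argument("--batch-size", type=int)

    run = sub.add_parser("run", help="process text with trained models")
    run.add_argument("--chain", type=parse_chain, default=[],
                     help="comma-separated task chain, e.g. ss,tok,tag")
    run.add_argument("--input-file", required=True)
    run.add_argument("--output-file", required=True, help="output conllu file")
    run.add_argument("--config", help="single config file for all tasks")
    return parser


def needs_raw_text(chain):
    # Raw text is expected only when ss or tok is part of the chain;
    # otherwise the input is assumed to be a conllu file.
    return "ss" in chain or "tok" in chain


args = build_parser().parse_args(
    ["run", "--chain", "ss,tok,tag",
     "--input-file", "in.txt", "--output-file", "out.conllu"])
```

With this layout, `needs_raw_text(args.chain)` is one way to decide whether the input should be read as raw text or as conllu.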

We should also have a whitespace tokenizer (a dumb tokenizer) for people whose input is already tokenized; this should be the default if no tokenizer is specified in the task pipeline.
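A minimal sketch of such a fallback, assuming the task chain is a list of task names and that a trained tokenizer is any callable taking a line of text. `whitespace_tokenize` and `pick_tokenizer` are hypothetical names, not part of nlp-cube.

```python
def whitespace_tokenize(line):
    """Split an already-tokenized line on runs of whitespace."""
    return line.split()


def pick_tokenizer(chain, trained_tokenizer=None):
    # Use the trained tokenizer only when 'tok' is requested and one is
    # available; otherwise fall back to the whitespace tokenizer.
    if "tok" in chain and trained_tokenizer is not None:
        return trained_tokenizer
    return whitespace_tokenize


tokens = pick_tokenizer(["tag", "lemma"])("Hello ,  world !")
```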


dumitrescustefan commented on September 2, 2024

We could have two entry points, which would let us drop the train/run parameters.
Also, argparse works for Python >= 2.7.

