
deep-eos's People

Contributors

sa-j, stefan-it

deep-eos's Issues

Couldn't download the data

~/deep-eos/data ~/deep-eos
Downloading corpus for de.sentences.tar.xz
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 185 100 185 0 0 293 0 --:--:-- --:--:-- --:--:-- 293
0 0 0 0 0 0 0 0 --:--:-- 0:00:01 --:--:-- 0
curl: (60) SSL certificate problem: certificate has expired
More details here: https://curl.haxx.se/docs/sslcerts.html

curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the web page mentioned above.

Question about experiment

I don't understand the purpose of this experiment.
There are too many errors in the gold set.
For example, in europarl-v7.de-en.en.sentences.test.gold:
line 73: I am happy to try and answer, Mr Wijsenbeek. As you will certainly know,……. Here "I am happy to try and answer, Mr Wijsenbeek." is obviously a single sentence, but the gold set doesn't mark it as one.
Similar cases:
lines 130, 175, ... there are too many to list.
So I don't understand what "sentence boundary detection" means in this dataset.

Error in Related Work section of paper

Since Google Scholar alerted me to the citation and I was curious, I checked this preprint of the Deep-EOS paper. It says there:

Further high-performers such as Elephant (Evang et al., 2013) or Cutter (Graën et al., 2018) follow a sequence labeling approach. However, they require a prior language-dependent tokenization of the input text.

At least I interpret this as saying that the input to Elephant is already tokenized text, on which sentence boundary detection is then performed. That is not true. Elephant performs tokenization and sentence boundary detection jointly. It is true that this scenario requires tokenized training data. However, Elephant could also be trained and used on data that is not tokenized and only sentence-segmented.
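
To make "jointly" concrete, here is a minimal sketch of character-level sequence labeling in which one label per character encodes both token and sentence boundaries. The four-label scheme (S = first character of a sentence, T = first character of a token, I = inside a token, O = outside any token) is an assumption based on the description in the Elephant paper; the `decode` helper and the hand-written label string are hypothetical and only stand in for a trained model's output.

```python
def decode(text, labels):
    """Turn one label per character into sentences, each a list of tokens.

    Assumed label scheme (illustrative, not taken from any tool's code):
      S = first character of a sentence (also starts a token)
      T = first character of a token inside a sentence
      I = continuation of the current token
      O = outside any token (e.g. whitespace)
    """
    if len(text) != len(labels):
        raise ValueError("need exactly one label per character")
    sentences = []
    for ch, label in zip(text, labels):
        if label == "S":
            sentences.append([ch])       # new sentence, new token
        elif label == "T":
            sentences[-1].append(ch)     # new token in the current sentence
        elif label == "I":
            sentences[-1][-1] += ch      # extend the current token
        # "O": character is skipped
    return sentences


text = "Hi there. Bye."
labels = "SIOTIIIITOSIIT"  # hand-written here; a model would predict these
print(decode(text, labels))
```

Because the input is raw, untokenized text and the labels carry both kinds of boundary, no separate tokenization pass runs before sentence segmentation in this setup.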
