
Comments (7)

hughperkins commented on July 22, 2024

@kushalarora thank you for the information about how to fix the URLs. It works perfectly now :) For anyone else, if you get this error message:

Downloading and extracting CoLA...
        Completed!
Downloading and extracting SST...
        Completed!
Processing MRPC...
Traceback (most recent call last):
  File "download_glue_data.py", line 144, in <module>
    sys.exit(main(sys.argv[1:]))
  File "download_glue_data.py", line 136, in main
    format_mrpc(args.data_dir, args.path_to_mrpc)
  File "download_glue_data.py", line 68, in format_mrpc
    URLLIB.urlretrieve(MRPC_TRAIN, mrpc_train_file)
  File "/persist/conda/lib/python3.6/urllib/request.py", line 248, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "/persist/conda/lib/python3.6/urllib/request.py", line 223, in urlopen
    return opener.open(url, data, timeout)
  File "/persist/conda/lib/python3.6/urllib/request.py", line 532, in open
    response = meth(req, response)
  File "/persist/conda/lib/python3.6/urllib/request.py", line 642, in http_response
    'http', request, response, code, msg, hdrs)
  File "/persist/conda/lib/python3.6/urllib/request.py", line 570, in error
    return self._call_chain(*args)
  File "/persist/conda/lib/python3.6/urllib/request.py", line 504, in _call_chain
    result = func(*args)
  File "/persist/conda/lib/python3.6/urllib/request.py", line 650, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found

... then open up download_glue_data.py, meander down to lines 45 and 46, and update them with the URLs from @kushalarora's first post.

Then you will see a healthier output:

Downloading and extracting CoLA...
        Completed!
Downloading and extracting SST...
        Completed!
Processing MRPC...
        Completed!
Downloading and extracting QQP...
        Completed!
Downloading and extracting STS...
        Completed!
Downloading and extracting MNLI...
        Completed!
Downloading and extracting SNLI...
        Completed!
Downloading and extracting QNLI...
        Completed!
Downloading and extracting RTE...
        Completed!
Downloading and extracting WNLI...
        Completed!
Downloading and extracting diagnostic...
        Completed!

:)
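
For anyone patching the script, here is a minimal sketch of one way to turn the raw traceback above into a readable hint. It wraps the same `urlretrieve` call that appears in the traceback and catches `HTTPError`. The names `try_download` and `retrieve` are hypothetical and are not part of download_glue_data.py; the URLs you pass in would be the corrected ones from @kushalarora's post.

```python
import urllib.error
import urllib.request


def try_download(url, dest, retrieve=urllib.request.urlretrieve):
    """Download url to dest; return True on success, False on an HTTP error.

    `retrieve` is injectable (hypothetical parameter) so the function can be
    exercised without network access.
    """
    try:
        retrieve(url, dest)
        return True
    except urllib.error.HTTPError as err:
        # A 404 here usually means the URL hard-coded in the script is stale.
        print(f"Could not fetch {url}: HTTP {err.code} {err.reason}")
        return False
```

A caller could check the return value and print a pointer to the updated URLs instead of exiting with a stack trace.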

from glue-baselines.

sleepinyourhat commented on July 22, 2024

Thanks for letting us know about all of this! (Though the opening strikes me as needlessly aggressive.)

Could you let us know what kind of experiments/models you're planning to run?

If you're just trying to evaluate a system on GLUE, you should just use the jiant codebase (as we say in big letters in the readme). That's where our ongoing, supported work on this project lives. This codebase only exists as an archive to allow people to reproduce our exact baseline numbers if they need to (it's basically an old internal draft of jiant). We will try to fix the clashes and broken links, though.


kushalarora commented on July 22, 2024

Hello Prof. Bowman,

I apologize if my comment came across as aggressive; it was more a deep sigh of resignation at my attempt to reproduce the baseline than an accusation. I understand how it could read as aggressive, though, and apologize on behalf of the sleep-deprived me who wrote it at 5 in the morning.

I am just planning to reproduce the baseline experiments. My experiments involve evaluating some word embeddings, and the diagnostic test proposed by GLUE looked well suited to this task. I read the note about using the jiant repo, but all I was trying to do was swap out GloVe for something else, and I thought that running this repo might be simpler than running code from the jiant repo.

I also understand that it is difficult to maintain a repo, especially in an academic setting where we don't have an army of engineers to support the effort, but my comments stand. I would request that you kindly edit the GLUE benchmark site to point to the jiant repo, both for running the baselines and as the main repo for running the benchmark, since the support effort is directed there. Otherwise, at least a few people will give up on using this extremely useful benchmark because they cannot reproduce the experiments.

Also, I would suggest adding a deprecation warning to the repo, like the one at https://github.com/knowitall/openie, so that it is clear that we ought to use the jiant repo directly. The current README indicates that if you don't plan to substantially change the code or models in the baseline, this repo should suffice. That is not the experience I had with the repo.

Finally, here are a couple of issues I missed in my last comment.

  1. The package requires me to clone the CoVe package, which ideally should not be necessary if I don't want to run CoVe. We can do conditional imports in the code using the load_module command in Python for such cases.
  2. The diagnostic test set was downloaded into a separate directory, but the preprocessing code expected it in the MNLI directory.
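
The conditional import suggested in item 1 could look roughly like the sketch below. It uses the standard-library `importlib.import_module` rather than `load_module`, which is a substitution on my part. The module name `cove` and the helpers `optional_import` and `require_cove` are assumptions for illustration, not names taken from the repo.

```python
import importlib


def optional_import(module_name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


# "cove" is an assumed module name, used here only for illustration.
cove = optional_import("cove")


def require_cove():
    """Fail with a clear message only when CoVe is actually requested."""
    if cove is None:
        raise RuntimeError(
            "CoVe was requested but the cove package is not installed."
        )
    return cove
```

With this pattern the import failure is deferred until a CoVe model is requested, so users who only want GloVe or other embeddings would not need the extra clone.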

Once again, apologies for my comment coming across as aggressive. If you like, I can help fix some or most of these issues via a pull request after the ACL deadline.

Regards,
Kushal


sleepinyourhat commented on July 22, 2024

Thanks for the note! If you're far enough along that you're definitely going to try to use this code, PRs are welcome. We may beat you to it, but with the ACL deadline, it's not that likely.

Otherwise, though, jiant is still a work in progress, but it supports all the use cases that this repo does, and it's better documented and maintained. (CoVe is a conditional import there, IIRC, and the download script should be up to date.)


Sleepingbug commented on July 22, 2024

@kushalarora Thank you!


mingbocui commented on July 22, 2024

@kushalarora thanks for sharing, this helps a lot


YangQun1 commented on July 22, 2024

@kushalarora thanks for sharing, this helps a lot

