
masakhane-community's People

Contributors

alpoktem, cdleong, dadelani, hackmd-deploy, ignatiusezeani, juliakreutzer, kpu, oldladypants, poppingtonic, ruohoruotsi, tosingithub


masakhane-community's Issues

Masakhane Speaker Topics

Sometimes it's nice to invite someone to give a talk on a topic that is relevant to the community. If there is something you're specifically interested in learning more about, please post the topic and a brief motivation for why you think it would be important to the community.

As the community, please 👍 the topics that are most important to you right now (try not to upvote everything :P That won't help us prioritize which talks to organise first).

Custom Data Notebook: Spaces in file paths can cause issues with bash commands

For example, /content/drive/My Drive/masakhane/$src-$tgt-$tag can cause issues, but also the following situation caused an error for me:

source_file = f"/content/drive/My Drive/Research/Hani Machine Translation/hni_story_corpus/v2/hani_story_corpus_train.{source_language}"
target_file = f"/content/drive/My Drive/Research/Hani Machine Translation/hni_story_corpus/v2/hani_story_corpus_train.{target_language}"

# They should both have the same length.
! wc -l $source_file
! wc -l $target_file
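
As a rough illustration of why the unquoted commands above fail, the sketch below uses Python's `shlex.split`, which approximates POSIX shell word splitting (it is not the exact mechanism the notebook uses). The path here is a shortened, hypothetical stand-in for the real ones.

```python
import shlex

# Hypothetical path containing a space, similar to the ones above.
source_file = "/content/drive/My Drive/Research/train.en"

# Unquoted: the space splits the path into two separate arguments,
# so wc would look for two files that do not exist.
unquoted = shlex.split(f"wc -l {source_file}")

# Quoted: the path survives as a single argument.
quoted = shlex.split(f'wc -l "{source_file}"')

print(unquoted)
print(quoted)
```

The unquoted command ends up with four words instead of three, which matches the "file not found" style of error seen when `wc` is handed half a path.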

Mitigations we could do:

"MyDrive" instead of "My Drive" helps

Actually, it seems you can just switch from My Drive to MyDrive paths. That helps a lot, as long as there are no spaces elsewhere in the path. In my case, Hani Machine Translation was in the path to train.eng and train.hni, so this alone was not enough.

Add quotes around bash variables

For example,
! wc -l "$source_file" instead of ! wc -l $source_file

and

! head "$source_file"* instead of ! head $source_file*

but this doesn't completely solve it, and can get complicated when we've got some of the more complex cases later in the notebook, like

!cp -r joeynmt/models/${src}${tgt}_transformer/* "$gdrive_path/models/${src}${tgt}_transformer/"

or within the yaml file:

#load_model: "{gdrive_path}/models/{name}_transformer/1.ckpt" # if uncommented, load a pre-trained model from this checkpoint

Warn the user about whitespace

Add a section that checks all the paths for whitespace and warns the user. Maybe it would be easier if they just removed the spaces?
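
A minimal sketch of what such a check could look like, assuming the notebook collects its paths into a list first. The function name `find_paths_with_whitespace` and the example paths are hypothetical, not something the notebook currently defines.

```python
def find_paths_with_whitespace(paths):
    """Return the paths that contain at least one whitespace character."""
    return [str(p) for p in paths if any(ch.isspace() for ch in str(p))]


# Hypothetical example paths; in the notebook these would be the
# source/target files and the gdrive_path.
paths_to_check = [
    "/content/drive/My Drive/masakhane/en-hni-baseline",
    "/content/drive/MyDrive/masakhane/en-hni-baseline",
]

for bad in find_paths_with_whitespace(paths_to_check):
    print(f"Warning: path contains whitespace and may break bash commands: {bad!r}")
```

Running this near the top of the notebook would flag the first path only, so the user can rename the folder before any bash cell runs.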

Do all our file manipulations with Python

We could rewrite a lot of these commands to use pathlib, which passes paths around as objects and so never goes through shell word splitting.
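
As a sketch of that idea, the two helpers below stand in for the `wc -l` and `cp -r` calls using only the standard library. The helper names are hypothetical, the behaviour is only roughly equivalent to the shell commands, and `dirs_exist_ok` assumes Python 3.8 or newer.

```python
import shutil
from pathlib import Path


def count_lines(path):
    """Count lines in a file, roughly what `wc -l` reports."""
    with Path(path).open("rb") as f:
        return sum(1 for _ in f)


def copy_model_dir(src_dir, dst_dir):
    """Copy a directory tree into dst_dir, roughly what `cp -r src/* dst/` does."""
    # dirs_exist_ok=True lets the destination already exist (Python 3.8+).
    shutil.copytree(Path(src_dir), Path(dst_dir), dirs_exist_ok=True)


# Example usage with paths like the ones in this issue (spaces are fine here):
# n_src = count_lines(source_file)
# n_tgt = count_lines(target_file)
# assert n_src == n_tgt, "source and target should have the same length"
```

Because the paths never pass through bash, spaces in folder names such as My Drive need no quoting at all.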

See also pjreddie/darknet#1672 and https://stackoverflow.com/questions/56640534/cannot-open-train-txt-with-white-space-my-drivehe
