glosario-r's Introduction

glosario

glosario allows users to create and retrieve multilingual glossaries. By default, glosario provides access to a community-curated glossary hosted by The Carpentries. This repository also documents the structure expected for the glossaries that can be managed by glosario.

There is also a Python interface.

Installation

glosario is still in development and is only available from GitHub. Install it with:

# install.packages("remotes")
remotes::install_github("carpentries/glosario-r")

Example

library(glosario)

define("data frame")
#> data frame:
#> See also: tidy_data
#> 

To get a definition in another language, pass the language code with the lang argument:

define("plus_one", lang = 'fr')
#> +1: Un vote en faveur de quelque chose.
#> 

To use a custom glossary file, load it with get_glossary() and pass the result to define():

custom_url <- "https://raw.githubusercontent.com/carpentries/glosario/master/glossary.yml"

g <- get_glossary(url = custom_url)

define("plus_one", lang = 'fr', glossary = g)
#> +1: Un vote en faveur de quelque chose.
#> 

To add links to definitions, use the gdef() function in inline R code:

This is a `r gdef('data_frame', 'Data Frame')`, they are used for storing data.

Which would look like this:

This is a Data Frame, they are used for storing data.

glosario-r's People

Contributors

fmichonneau, ian-flores, larnsce

glosario-r's Issues

`define()` doesn't implement fuzzy matching

We need to implement fuzzy matching or string distance to search for the nearest word similar to the slug if there isn't an exact match.

Right now if we run:

g <- get_glossary()
define("data frame", glossary = g)

We get:

#> Warning: Some key are not found: 'data frame'. They are being excluded.

This happens because define() expects the slug data_frame, not data frame. But define() should be able to see that this is a very near match, and so it should present this definition. I do this using cosine similarity in the Python version.
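One possible way to approach this in R is sketched below. It is only an illustration, not the package's implementation: nearest_slug() and its max_dist threshold are hypothetical names, and it swaps in edit distance (base R's utils::adist()) for the cosine similarity mentioned above, because that needs no extra dependencies.

```r
# Hypothetical helper (not part of glosario): find the slug closest to a query.
nearest_slug <- function(query, slugs, max_dist = 2) {
  # Normalise the query the way slugs are written: lower case, underscores.
  normalised <- gsub("[[:space:]-]+", "_", tolower(trimws(query)))
  if (normalised %in% slugs) {
    return(normalised)
  }
  # Generalised Levenshtein distance between the query and every slug.
  distances <- utils::adist(normalised, slugs)[1, ]
  best <- which.min(distances)
  # Accept the nearest slug only if it is close enough (assumed threshold).
  if (distances[best] <= max_dist) slugs[best] else NA_character_
}

# Example slugs taken from this page.
slugs <- c("data_frame", "tidy_data", "plus_one")
nearest_slug("data frame", slugs)
nearest_slug("tidy-dta", slugs)
```

Normalising spaces to underscores already resolves the "data frame" case; the distance step only matters for misspellings.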

Extract and format subset of glossary for inclusion in book

Additional use case: suppose you have written a book with bookdown, and have included glossary references. You would like to include those terms (and only those terms) as a glossary in the book, so it should be possible to gather them while knitting the book (or in a separate pass).

Generate different text for inline glossary reference depending on output format

When an R Markdown document is being compiled into HTML, we want something like the following for a glossary entry:

<a href="https://glosario.carpentries.org/en/#some_key">some term</a>

But when the same document is being compiled into LaTeX/PDF, we want:

\glossref{some term}{some\_key}

(Or something like that---the user may want some other LaTeX command for glossary references.) We cannot generate the former and hope that Pandoc will translate it into the latter. We also cannot generate LaTeX and hope that Pandoc will translate it into HTML, since Pandoc doesn't handle user-defined commands.

I have used a Lua filter to solve this problem in a book project (merely-useful/py-rse#467), but we should think about how to configure the function that inserts definitions into text so that it handles this.
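As a rough sketch of what a format-aware function could look like, the example below switches on an explicit format argument. The name glossary_ref() is hypothetical, and \glossref is just the placeholder command from the example above. In a real document the format could probably be detected with knitr::is_latex_output() rather than passed by hand.

```r
# Hypothetical helper (not part of glosario): build an inline glossary
# reference for a given output format.
glossary_ref <- function(slug, text, format = c("html", "latex"),
                         base_url = "https://glosario.carpentries.org/en/") {
  format <- match.arg(format)
  if (format == "latex") {
    # Escape underscores so the slug is valid in LaTeX text.
    escaped <- gsub("_", "\\\\_", slug)
    sprintf("\\glossref{%s}{%s}", text, escaped)
  } else {
    sprintf('<a href="%s#%s">%s</a>', base_url, slug, text)
  }
}

# Inside a knitted document the format might be chosen like this:
# fmt <- if (knitr::is_latex_output()) "latex" else "html"
cat(glossary_ref("some_key", "some term", "html"), "\n")
cat(glossary_ref("some_key", "some term", "latex"), "\n")
```

Letting the user supply their own template string instead of the fixed \glossref command would cover the case where a different LaTeX command is wanted.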

Submit the package to r-universe

CRAN submission is harder to achieve, but I think we can publish the package to r-universe so that it is available from somewhere other than GitHub.

Need CONTRIBUTING file

This repository doesn't have a CONTRIBUTING file and therefore defaults to the version found in the .github repository, which applies to lessons but not to this repository. We should create one to avoid possible confusion.

Use Case - Checking a lesson [Consistency Support]

  1. Checking a lesson.
    1. Beatriz has made some changes to a lesson she inherited from Amari,
      and wants to check that it is still consistent.
    2. She runs a command-line script that:
      1. Reads the R Markdown file.
      2. Extracts the terms under the glossary/defines key.
      3. Searches the body of the document for calls to gdef(...).
      4. Checks that every term listed in glossary/defines is referenced in the document body,
        and that every term referenced in the document body is mentioned in glossary/defines.
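The check in this use case can be sketched as follows. The function names are hypothetical, and the sketch takes the lesson body and the glossary/defines terms as plain character vectors. A real script would presumably read them from the file, for example with readLines() and rmarkdown::yaml_front_matter().

```r
# Hypothetical helper: pull the slug out of every gdef("...") call in the text.
extract_gdef_slugs <- function(lines) {
  pattern <- "gdef\\(\\s*['\"]([^'\"]+)['\"]"
  calls <- unlist(regmatches(lines, gregexpr(pattern, lines, perl = TRUE)))
  unique(sub(pattern, "\\1", calls, perl = TRUE))
}

# Hypothetical helper: compare the declared terms with the referenced ones.
check_lesson <- function(defined, body_lines) {
  referenced <- extract_gdef_slugs(body_lines)
  list(
    defined_but_unused = setdiff(defined, referenced),
    used_but_undefined = setdiff(referenced, defined)
  )
}

# Made-up lesson body for illustration.
body <- c(
  "A `r gdef('data_frame', 'data frame')` stores tabular data.",
  "See also `r gdef(\"tidy_data\", \"tidy data\")`."
)
check_lesson(defined = c("data_frame", "plus_one"), body_lines = body)
```

The lesson is consistent when both elements of the returned list are empty.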

`As Implemented In`

By J.A.

"You could have an 'as implemented in' field in the same way that you have a 'see also' section. So filtering has pointers to dplyr::select() along with base::subset() and pandas.whatever."

Use Case - Linking to a definition [RMarkdown Support]

Comment from @fmichonneau about handling links in R Markdown:

"We may want to provide more flexibility in the way the links are generated. For instance, we may need to provide options to switch between the HTML and Markdown versions of the links. For the HTML version, using the htmltools package might make it easier to provide additional attributes to the anchors."

Use Case - Summarizing a lesson [RMarkdown Support]

  1. Amari has written a lesson in R Markdown that includes YAML metadata
    stating that it defines correlation and causation.
  2. She adds a code chunk to the end of her lesson that includes a call to
    glosario::summarize_terms().
  3. When she knits the document to HTML, this code chunk inserts a definition list (dl) at that point. Its entries are the definitions of all of the terms listed under the glossary/defines key in the page's YAML header, in alphabetical order by term according to the rules for glossary/language.
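A minimal sketch of the output step is shown below. definition_list() is a hypothetical stand-in for what summarize_terms() might emit, the definitions are placeholders, and the sorting uses R's default order() rather than any language-specific rules. No HTML escaping is done.

```r
# Hypothetical sketch: turn a named character vector of definitions into
# an HTML definition list, sorted alphabetically by term.
definition_list <- function(definitions) {
  definitions <- definitions[order(names(definitions))]
  items <- sprintf(
    "  <dt>%s</dt>\n  <dd>%s</dd>",
    names(definitions), unname(definitions)
  )
  paste(c("<dl>", items, "</dl>"), collapse = "\n")
}

# Placeholder definitions; real text would come from the glossary.
defs <- c(
  correlation = "Placeholder definition of correlation.",
  causation = "Placeholder definition of causation."
)
cat(definition_list(defs), "\n")
```

In a knitted document, a chunk producing this output would need the chunk option results = "asis" so that the HTML is passed through unchanged.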

Use Case - Finding lessons [Requirements Support]

  1. Amari writes a lesson in R Markdown. She adds the glossary key to its YAML metadata and indicates that the lesson requires the term correlation and defines the term regression.
  2. Beatriz is writing a lesson on linear models. She adds YAML metadata indicating that the lesson requires the term regression.
  3. To find prerequisite lessons she can recommend to her students, Beatriz runs a command-line script that:
    1. Uses rmarkdown::yaml_front_matter(filename) to read metadata from all of the lessons she has archived.
    2. Lists all of the lessons that state they define the term regression.
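The lookup in step 3 can be sketched as follows. The file names and the requires/defines layout under the glossary key are assumptions made for illustration. In a real script each list element would come from rmarkdown::yaml_front_matter(filename).

```r
# Hypothetical metadata, shaped like the lists that
# rmarkdown::yaml_front_matter() would return for each lesson.
lessons <- list(
  "amari-lesson.Rmd" = list(
    glossary = list(requires = "correlation", defines = "regression")
  ),
  "beatriz-linear-models.Rmd" = list(
    glossary = list(requires = "regression")
  )
)

# Hypothetical helper: names of the lessons that state they define a term.
lessons_defining <- function(metadata, term) {
  defines_term <- vapply(
    metadata,
    function(meta) term %in% meta$glossary$defines,
    logical(1)
  )
  names(metadata)[defines_term]
}

lessons_defining(lessons, "regression")
```

Lessons without a defines entry are simply skipped, because `%in%` against NULL returns FALSE.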

Need to install again to be able to use new translated terms?

Hi!
I made a page for a workshop and added some Glosario terms there. Some translations were not merged yet, so I left those terms in English.
After the PR was merged, I tried to switch these terms to Portuguese (with lang = 'pt'), and the output was this message:

Some languages requested are not availble for this entry.NULL

I tried on two different days (not that it makes much difference) and still got this message. So I installed the package again, and now the term is shown in Portuguese :)

My question is: every time I want to use terms that were recently translated in glosario, will I have to install the package again? Is there any way to avoid the reinstallation? Or am I missing something that I should do?

Thanks!

Receiving an error "some references are slugs that are not defined" for all functions in glosario R package

Use case

I am trying to use glosario in a Quarto book project, and I had thought to use the define functions in combination with an idea documented on the quarto-cli issue tracker at quarto-dev/quarto-cli#1697 (comment).

My idea would be to do the following:

[data frame]{#data_frame}

:   `r glosario::define("data_frame")`

Error

The following reproducible example shows the error I receive:

library(glosario)

packageVersion("glosario")
#> [1] '0.2'

define("tidy_data")
#> Error: Some references are slugs that are not found:
#> ref: 'geometry_shader' from slug:  'fragment_shader'
#> ref: 'tidy data' from slug:  'openrefine'
#> ref: 'geometry_shader' from slug:  'shader'
#> ref: 'data analysis' from slug:  'spectral analysis'
#> ref: 'bayesian inference' from slug:  'spectral analysis'
#> ref: 'bayesian statistics' from slug:  'spectral analysis'
#> ref: 'geometry_shader' from slug:  'tessellation_shader'
#> ref: 'geometry_shader' from slug:  'vertex_shader'
define('data_frame')
#> Error: Some references are slugs that are not found:
#> ref: 'geometry_shader' from slug:  'fragment_shader'
#> ref: 'tidy data' from slug:  'openrefine'
#> ref: 'geometry_shader' from slug:  'shader'
#> ref: 'data analysis' from slug:  'spectral analysis'
#> ref: 'bayesian inference' from slug:  'spectral analysis'
#> ref: 'bayesian statistics' from slug:  'spectral analysis'
#> ref: 'geometry_shader' from slug:  'tessellation_shader'
#> ref: 'geometry_shader' from slug:  'vertex_shader'
glosario::get_glossary()
#> Error: Some references are slugs that are not found:
#> ref: 'geometry_shader' from slug:  'fragment_shader'
#> ref: 'tidy data' from slug:  'openrefine'
#> ref: 'geometry_shader' from slug:  'shader'
#> ref: 'data analysis' from slug:  'spectral analysis'
#> ref: 'bayesian inference' from slug:  'spectral analysis'
#> ref: 'bayesian statistics' from slug:  'spectral analysis'
#> ref: 'geometry_shader' from slug:  'tessellation_shader'
#> ref: 'geometry_shader' from slug:  'vertex_shader'

Created on 2023-06-13 with reprex v2.0.2
