book-package-rcl's Introduction

book-package-rcl

https://unfoldingWord.github.io/book-package-rcl/

Identifying resources needed to enable single piece workflow.

Purpose

Given a book of the Bible and, optionally, a chapter, return all the resources needed for that book and chapter. At this time, only English is supported.

Current constraints

These resources will eventually exist in all gateway languages, but for now only the English set is complete. This component only considers English-language resources.

The resources shown include:

  • Lexicon references as Strong's numbers, with counts and links. Note: since each word in the original text has a Strong's number, the number of words in the original text equals the number of Strong's entries.
  • Translation Words (tW) with counts and links to articles
  • Translation Notes (tN) with counts
  • Translation Academy (tA) with counts and links to tA articles
  • Translation Questions (tQ) with counts. Note: A count of the number of level one headings is used to count the number of questions.
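
The tQ counting rule above can be sketched as follows. This is a minimal illustration, not the component's actual code: the function name countLevelOneHeadings and the sample text are hypothetical, and it assumes each question file is Markdown in which a question is a line starting with a single "#".

```javascript
// Count level-one Markdown headings ("# ...") as a proxy for the number of questions.
// Lines starting with "##" or deeper are deliberately not counted.
function countLevelOneHeadings(markdown) {
  return markdown
    .split(/\r?\n/)
    .filter((line) => /^#\s+\S/.test(line))
    .length;
}

// Made-up sample content, for illustration only.
const sample = [
  '# First sample question?',
  '',
  'First sample answer.',
  '',
  '# Second sample question?',
  '',
  'Second sample answer.',
  '## A level-two heading that is not counted',
].join('\n');

console.log(countLevelOneHeadings(sample)); // 2
```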

Setup Notes

These notes are adapted from https://unfoldingword-box3.github.io/hello-world-react-component-library/ for convenience.

  1. Ensure node.js and yarn are installed
  2. Clone the repo and change directory to the cloned folder.
  3. Install the npm dependencies with yarn: just run yarn in the project folder. It can take a while to run!
  4. Run and develop with yarn start; view at localhost:6060.
    • if dependencies are missing it will not compile and will report what is missing
    • to fix, add dependencies to package.json and rerun yarn to install them
  5. See debug console.log() output in the browser console. In Chrome, press Ctrl+Shift+J to open it.

After changes tested:

  1. Update the package version (must be valid semver, such as 1.5.0, with all three parts)
  2. Use git to commit/push
  3. Use yarn publish

Chromebook Linux Beta Notes

Must use hostname -I to get the host address. Neither localhost nor 127.0.0.1 will work.

$ hostname -I
100.115.92.202 
$

book-package-rcl's People

Contributors

abelpz, dependabot[bot], mandolyte

book-package-rcl's Issues

Recursion for aligned texts

Learned yesterday that aligned texts have no limit on how deeply the children attributes are nested in the USFM-JS format.

At present a depth of 5 is hard-coded, which may not be sufficient. Any words nested deeper than that would be missed, and the book would be undercounted.

Convert both aligned ULT and UST to use a recursive algorithm instead.
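
A recursive traversal along these lines might look like the sketch below. The object shape is an assumption made for illustration (words as { type: 'word' } and alignment wrappers carrying a children array); the function name countAlignedWords is hypothetical and is not taken from this package.

```javascript
// Recursively count word objects at any nesting depth.
// Assumed shape (not verified against the actual data):
//   { type: 'word', text: '...' } for words
//   { type: 'milestone', children: [...] } for alignment wrappers
function countAlignedWords(verseObjects) {
  let count = 0;
  for (const obj of verseObjects || []) {
    if (obj.type === 'word') count += 1;
    // Descend into children regardless of depth; no hard-coded limit.
    if (Array.isArray(obj.children)) count += countAlignedWords(obj.children);
  }
  return count;
}

// Illustrative input: one word sits six levels deep, past a fixed depth of 5.
const nested = [
  { type: 'milestone', children: [
    { type: 'milestone', children: [
      { type: 'milestone', children: [
        { type: 'milestone', children: [
          { type: 'milestone', children: [
            { type: 'milestone', children: [
              { type: 'word', text: 'deep' },
            ] },
          ] },
        ] },
      ] },
    ] },
    { type: 'word', text: 'shallow' },
  ] },
  { type: 'text', text: ' ' },
  { type: 'word', text: 'top' },
];

console.log(countAlignedWords(nested)); // 3
```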

The tQ component has no details

The tQ component only has word counts. There are no details or links to the individual chapters (folders in Github) or verses (markdown files).

Should there be?

Allow Multiple Books

Similar to how a range of chapters can be provided, it would be excellent if we could provide a range of books.

For example, if I want to see what is included in Titus, Luke, and Acts, it would be great to be able to specify:

<BookPackageRollup bookId='tit,luk,act' chapter='' />

and then get the combined numbers for all of those.

TA Typo

The header for TA says "Translation Articles" but it should be "Translation Academy"

Discount HTML/XML comments from word counts

Do not include XML comments in word counts. Example from Strong's article H0835.md (use raw mode to view):

<!-- Status: S2="NeedsEdits" -->
<!-- Lexica used for edits:   -->
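
One way to do this is to strip the comments before counting, as in the sketch below. The whitespace-based tokenizer is an assumption made for illustration and may differ from the package's real word-count rules; both function names are hypothetical.

```javascript
// Remove HTML/XML comments, including ones that span multiple lines.
function stripComments(markdown) {
  return markdown.replace(/<!--[\s\S]*?-->/g, ' ');
}

// Count whitespace-separated tokens after comments are removed.
function countWordsIgnoringComments(markdown) {
  const words = stripComments(markdown).match(/\S+/g);
  return words ? words.length : 0;
}

// The two comment lines are from the issue; the last line is made up.
const article = [
  '<!-- Status: S2="NeedsEdits" -->',
  '<!-- Lexica used for edits:   -->',
  'a sample definition line',
].join('\n');

console.log(countWordsIgnoringComments(article)); // 4
```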

tN word count accuracy

For Psalm 1, the number of notes is 565 but the word count is only 28.

Translation Notes for "PSA" and Chapters 1
Total number of notes: 565
Distinct number of words in notes: 23
Total number of words in the notes: 28

TA Should be its Own Component

Currently the TA information is embedded in the TN card, but it needs to be represented distinctly as its own component.

Simplify Package Header

When this RCL is used by the Book Package App, the header title looks like this:
Package Rollup for "1JN, 2JN, 3JN" and Chapters (ALL).

While the RCL can restrict the results to specified chapters, this is not possible in the app.

So change the RCL to omit the chapters bit when no restriction is specified. Thus the title for the app will look like this: Package Rollup for "1JN, 2JN, 3JN"
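
A sketch of the title logic, assuming book IDs and chapters arrive as arrays. The function name packageTitle and the exact wording used when chapters are restricted are guesses for illustration only.

```javascript
// Build the header title, omitting the chapters part when no restriction is given.
function packageTitle(bookIds, chapters) {
  const books = bookIds.map((id) => id.toUpperCase()).join(', ');
  const base = `Package Rollup for "${books}"`;
  if (!chapters || chapters.length === 0) return base;
  // The format of the restricted case is an assumption.
  return `${base} and Chapters ${chapters.join(', ')}`;
}

console.log(packageTitle(['1jn', '2jn', '3jn'], []));
// Package Rollup for "1JN, 2JN, 3JN"
```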

Cosmetic Issues

Here is a short list of cosmetic changes to make on the "cards":

  • Refer to word counts as "Total word count:"
  • Refer to unique words as "Unique words:"
  • Place "Total word count" first

UTQ and 404 error avoidance

The Gitea API has this:
https://git.door43.org/api/swagger#/repository/repoGetContents

Given a book of the Bible, say "tit", this URL returns all the chapter folders:
https://git.door43.org/api/v1/repos/unfoldingword/en_tq/contents/tit

Then for each chapter folder, "03", this URL returns all the question verse files:
https://git.door43.org/api/v1/repos/unfoldingword/en_tq/contents/tit/03

Then for each verse file found, here "01.md", this URL returns the file's metadata and its actual content (base64-encoded):
https://git.door43.org/api/v1/repos/unfoldingword/en_tq/contents/tit/03/01.md

This will avoid "guessing" and eliminate the 404 errors.
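
The walk described above could be sketched as follows. Several things here are assumptions: fetchJson is a hypothetical helper supplied by the caller (for example, (url) => fetch(url).then((r) => r.json())), and the response fields name, type, and content are what the contents endpoint is expected to return and should be checked against the Swagger page. Buffer is Node-specific; a browser build would need a different base64 decoder.

```javascript
const API_BASE = 'https://git.door43.org/api/v1/repos/unfoldingword/en_tq/contents';

// Build a contents URL from path parts, e.g. ('tit', '03', '01.md').
function contentsUrl(...parts) {
  return [API_BASE, ...parts].join('/');
}

// Decode the base64 "content" field of a file entry into UTF-8 text.
function decodeContent(entry) {
  return Buffer.from(entry.content, 'base64').toString('utf8');
}

// List chapters, then verse files, then fetch only files known to exist.
// fetchJson is a caller-supplied (hypothetical) function: url -> Promise of parsed JSON.
async function fetchBookQuestions(bookId, fetchJson) {
  const questions = {};
  const chapters = await fetchJson(contentsUrl(bookId));
  for (const chapter of chapters.filter((e) => e.type === 'dir')) {
    const files = await fetchJson(contentsUrl(bookId, chapter.name));
    for (const file of files.filter((e) => e.name.endsWith('.md'))) {
      const entry = await fetchJson(contentsUrl(bookId, chapter.name, file.name));
      questions[`${chapter.name}/${file.name}`] = decodeContent(entry);
    }
  }
  return questions;
}
```

Because every request targets a path taken from a directory listing, no request is made for a verse file that does not exist.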

Inconsistencies in naming, etc.

There are a number of inconsistencies due to learning things during development on box3. Examples:

  • Names of the components
  • How data is passed around (sometimes in database, sometimes in returned objects, sometimes both)
  • Probably some other things I haven't noticed yet...

Deduped counts of UTW and UTA documents across multiple books

For example, if a team chooses to do Titus first and then Luke after that, how many articles from Translation Words and how many modules from Translation Academy would be a part of the Luke package, given that they've already translated a handful in the Titus package?
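
As a rough sketch, the deduplication is a set difference between the articles a new package references and those already translated. The function name newArticles and the article identifiers below are placeholders, not real tW or tA paths.

```javascript
// Return the distinct articles in a package that have not already been translated.
function newArticles(alreadyTranslated, packageArticles) {
  const done = new Set(alreadyTranslated);
  return [...new Set(packageArticles)].filter((article) => !done.has(article));
}

// Placeholder identifiers for illustration only.
const titusArticles = ['article-a', 'article-b', 'article-c'];
const lukeArticles = ['article-a', 'article-b', 'article-d', 'article-e', 'article-d'];

const remaining = newArticles(titusArticles, lukeArticles);
console.log(remaining);        // [ 'article-d', 'article-e' ]
console.log(remaining.length); // 2
```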

Books of Bible List

We need one more top-level component that lists each book of the Bible. This would probably be the meta element used as a visual method of selecting a book.

Each book name should be clickable, revealing underneath it the Book Package contents.

Switch to new books JSON

The new books JSON has two new attributes:

  • Title: the name of the book, such as "Titus"
  • USFM: the base name of the book in USFM format, such as "57-TIT"

Preserve data needed for optimization algorithm

For book-oriented data, we only need to retain total word counts for the entire book. For example, for Titus, we need to retain the counts for ULT, UST, UTQ, and UTN.

For reference oriented data (to wit, UTA and UTW), we need to preserve:

  • the article type, i.e., UTA or UTW
  • the article name or URI
  • the counts for the article
  • the books which reference the article

All this is known to the RCL currently, but is not retained. Retention will enable optimization to be done.
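
One possible shape for the retained data is sketched below. All names (createStore, recordBookCount, recordArticle) and all counts are hypothetical; this only illustrates the fields listed above and is not an existing API of the RCL.

```javascript
// A small in-memory store: per-book totals plus per-article records.
function createStore() {
  return { books: {}, articles: new Map() };
}

// Book-oriented data: one total word count per resource (ULT, UST, UTQ, UTN).
function recordBookCount(store, bookId, resource, wordCount) {
  store.books[bookId] = store.books[bookId] || {};
  store.books[bookId][resource] = wordCount;
}

// Reference-oriented data: type (UTA or UTW), URI, count, and referencing books.
function recordArticle(store, type, uri, wordCount, bookId) {
  const key = `${type}:${uri}`;
  if (!store.articles.has(key)) {
    store.articles.set(key, { type, uri, wordCount, books: new Set() });
  }
  store.articles.get(key).books.add(bookId);
}

// Usage with made-up numbers and placeholder URIs.
const store = createStore();
recordBookCount(store, 'tit', 'ULT', 100);
recordArticle(store, 'UTW', 'placeholder/article-a', 250, 'tit');
recordArticle(store, 'UTW', 'placeholder/article-a', 250, 'luk');

const record = store.articles.get('UTW:placeholder/article-a');
console.log([...record.books]); // [ 'tit', 'luk' ]
```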

@jag3773 Other thoughts?

Performance and Load Impact of 404 errors

As an example, consider translation questions. I do not know in advance whether any questions exist for a given verse, so I have to guess and attempt to fetch them. When they don't exist, I get one or two errors for each failed attempt. I suspect it might be faster to request the list of all files in a repo and use it as a whitelist to avoid failed attempts.

Failed to load resource: the server responded with a status of 404 (Not Found)
:6060/#/BookPackage?id=bookpackagetn:1 

Access to XMLHttpRequest at 'https://git.door43.org/unfoldingWord/en_uhal/raw/branch/master/content/H0834a.md' from origin 'http://localhost:6060' has been blocked by CORS policy: No 'Access-Control-Allow-Origin' header is present on the requested resource.

Material Table enhancement

Switch to NPM Material Table component.

Question: do we need the word order table that shows all words in the lexical order found? If not, just show the word frequency table.

Another option is to put both tables into their own collapsible section.

Book Package Details

BP Information

Re-flow the BP information using the following as an example, using 1 and 2 John. I'm thinking of something like Tree View https://material-ui.com/components/tree-view/. Note the order of the resources below too.


> Book Packages Total Word Count: 72,932

When expanded, it would reveal these indented:

⌄ Book Packages Subtotals (<- expanded by default)

  • ULT Word Count: 3,245
  • UST Word Count: 4,821
  • UTA Word Count: 10,031
  • UTW Word Count: 2,426
  • UTN Word Count: 33,783
  • UTQ Word Count: 19,088

> 1 John Book Package Word Count: 61,135

> 2 John Book Package Word Count: 20,668

When those are expanded, they would reveal each component (indented), which would also be a dropdown. I've expanded them in the example below so I don't have to repeat them twice.

⌄ ULT Word Count: 319

[Word Frequency Table]

⌄ UST Word Count: 498

[Word Frequency Table]

⌄ Translation Academy Word Count: 7,373

Linked modules: 7 unique, 14 total links
[Table of links]

⌄ Translation Words Word Count: 11,254

Linked articles: 37 unique, 69 total links
[Table of links]

⌄ Translation Notes Word Count: 936

Number of Notes: 33
[Word Frequency Table]

⌄ Translation Questions Word Count: 11,254

Number of Questions: 11
[Word Frequency Table]

Persist errors in file retrieval

Currently, filenames in the repos may not be named per their embedded references in other files. For example, a UTA article may begin with a capital letter when it should be lowercase.

These mistakes result in lower word counts and should be captured so they can be corrected.

Incorporate uw-word-count

At present this package has an older version of word count code. Replace with NPM component uw-word-count
