ogallagher / quizcard-generator

Given a source document, generate quiz/flash cards

Home Page: https://wordsearch.dreamhosters.com/quizcard-generator

License: MIT License

HTML 1.86% CSS 1.30% TypeScript 56.46% Nunjucks 0.54% JavaScript 39.83%

anki-cards educational language-learning

quizcard-generator's Issues

Configure special treatment of prefixes and suffixes

I'm not sure what to do about them, but in cases where grammatical prefixes (ex. pre-, re-, un-) and suffixes (ex. -들, -에, -가, -는) occur frequently I've noticed some unusual results.

Because the choices for a given word test are currently selected for similarity, they often end up being the same word with different conjugations or particles attached.
Different forms of the same word can also take up a disproportionate share of the generated tests.

Anki note generator optional limit

Include reference to source in Anki note

Confirm and document Anki integration with notes and card templates

Once the full control flow has been confirmed, document the steps to follow in readme.

Fix higher unicode point character support

Configure excluded line number ranges

Allow exclusion of words by their line number. It would also then make sense to optionally show line numbers in the web ui source editor.
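
A minimal sketch of how excluded line ranges could be parsed and checked. The spec format (`"1-5,12"`) and the names `parse_line_ranges` and `is_line_excluded` are assumptions for illustration, not part of the existing CLI.

```typescript
// Hypothetical helper: inclusive range of 1-based source line numbers.
type LineRange = { start: number; end: number };

// Parse a spec such as "1-5, 12, 40-42" into inclusive ranges.
function parse_line_ranges(spec: string): LineRange[] {
  return spec
    .split(",")
    .map((part) => part.trim())
    .filter((part) => part.length > 0)
    .map((part) => {
      const [a, b] = part.split("-").map((n) => parseInt(n, 10));
      const start = a;
      // A single number like "12" is treated as the range 12-12.
      const end = b === undefined ? a : b;
      if (Number.isNaN(start) || Number.isNaN(end) || end < start) {
        throw new Error(`invalid line range: ${part}`);
      }
      return { start, end };
    });
}

// True if the given line falls inside any excluded range.
function is_line_excluded(line: number, ranges: LineRange[]): boolean {
  return ranges.some((r) => line >= r.start && line <= r.end);
}
```

Words whose location line number satisfies `is_line_excluded` would then be skipped when selecting testable words.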

In AnkiNote.export escape double quotes

Any text that populates AnkiNote.text should have contained double quotes escaped in export.
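
One possible shape for the escape step, assuming the export follows the usual CSV convention of wrapping a field in double quotes and doubling any quotes inside it. The function name `escape_anki_field` is hypothetical and is not the existing `AnkiNote.export` code.

```typescript
// Wrap a field in double quotes and double any quotes it contains
// (CSV-style escaping; assumed to match what the notes file expects).
function escape_anki_field(text: string): string {
  return '"' + text.replace(/"/g, '""') + '"';
}
```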

More initial unit tests

Use istanbul/nyc to integrate code coverage into the tests.

Randomize choice order on card render

In Anki fill-blanks card front template, update the script to shuffle the order of the active choice list elements.

Controlled with render control tag #32
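
The shuffle itself could be a standard Fisher-Yates pass. This sketch works on a plain array rather than on DOM list elements, and `shuffle_in_place` is a hypothetical name; the template script would still need to reorder the actual choice elements afterwards.

```typescript
// Fisher-Yates shuffle. The random source is injectable so that
// the function can be exercised deterministically.
function shuffle_in_place<T>(items: T[], random: () => number = Math.random): T[] {
  for (let i = items.length - 1; i > 0; i--) {
    // Pick an index in [0, i] and swap it into position i.
    const j = Math.floor(random() * (i + 1));
    [items[i], items[j]] = [items[j], items[i]];
  }
  return items;
}
```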

Publish as package at npmjs.com

fix cli arg aliases, or abandon aliases for now
.npmignore files
move temp logger import to quizcard_cli.ts
#23
include .d.ts type declaration files in npm package
Update install instructions to reference npm package

Configure choice count and variation per cloze

Choice count is the number of choices from which to choose the correct word.

configure choice count in cli
configure choice count in web ui

Choice variation is edit distances between the choices.

update Word.get_closest_words to accept variation/randomness
configure choice variation in cli
configure choice variation in web ui

Add live demo with web UI to readme

Near the top of the documentation, include a video demonstrating wordsearch...tld/quizcard-generator, with an explanation that it's a hosted web UI implemented on top of the CLI tool.

Custom tags for Anki note export

It would be quite easy to bulk edit tags within Anki after export as well, but also easy to provide an input for them at the quizcard generator step.

tag cli opt declaration
tag cli opt implementation

Export CLI opt help/describe strings

I'm using these options in the wordsearch generator web driver, so information about them should be exposed for external use.

Fix exclude-word in notes export opt comment

I'm seeing

--exclude-word=this,some,--exclude-word=this,some

instead of

--exclude-word=this --exclude-word=some
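
A sketch of the intended formatting for array-valued options, where each value gets its own flag. `format_array_opt` is a hypothetical helper, not the existing export code.

```typescript
// Render one flag per value, separated by spaces.
function format_array_opt(name: string, values: string[]): string {
  return values.map((value) => `--${name}=${value}`).join(" ");
}
```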

i18n of webpage

Frontend language selector button, locale cookie.
Frontend script fetches translation file for selected locale.
Localization of element title attributes.
Localization of element innerText attributes.
Handle Spanish (spa)
Handle English (eng)
Ensure no translatable strings are used as data/control values

Exclude non word characters from Anki clozes

If the token (coughing?) is parsed as a word with key_string='coughing' and raw_string='(coughing?)', then the corresponding Anki note cloze should be generated as ({{c1::coughing}}?), with ( and ?) outside of the cloze text.
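
As a rough sketch, the cloze could wrap only the part of the raw token that matches the key string. This assumes the key string appears in the raw token when compared case-insensitively; `format_cloze` is a hypothetical name.

```typescript
// Wrap only the word characters in the cloze, leaving surrounding
// punctuation from the raw token outside of it.
function format_cloze(key_string: string, raw_string: string, cloze_index: number): string {
  const start = raw_string.toLowerCase().indexOf(key_string.toLowerCase());
  if (start < 0) {
    // Fallback when the key is not found: wrap the whole raw token.
    return `{{c${cloze_index}::${raw_string}}}`;
  }
  const end = start + key_string.length;
  return (
    raw_string.slice(0, start) +
    `{{c${cloze_index}::${raw_string.slice(start, end)}}}` +
    raw_string.slice(end)
  );
}
```

The original casing of the raw token is preserved inside the cloze, so a raw token `Hello?` with key `hello` becomes `{{c1::Hello}}?`.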

Use Korean NLP library for filtering testable words by part of speech

I plan to use the konlpy Python package, with a driver script that quizgen can call to fetch part of speech (POS) tags for a given sentence. #26 (comment)

python konlpy cli driver that accepts a string and returns the token POS tags
quizgen accepts source text language opt
if source text language is Korean
- when building each sentence, pass the sentence text to the konlpy driver
- parse the POS tags and assign them to Word instances
- In Word, a new member root_string has the subset of key_string that excludes unimportant parts of speech (particles/ornaments). Another member stores the ornaments.
What to do with word root_string and ornaments?

I don't think words should use root_string as the unique identifier, because a word test should include the particles as part of the test; they are sometimes what makes an answer invalid or valid, depending on the correct overall part of speech.

I could use root_string to count occurrences of a word (across multiple instances of Word with different key_string), which could be used to limit the number of tests of the same word. Likewise, it could be used to enable tests of words otherwise considered to occur too infrequently.

I could use root_string for edit distance, so that words with different particles and the same root would be stored as edit distance zero. This could then be used to exclude words with the same root from choices for a test (ex. 가방은, 가방이). But again, sometimes testing different parts of speech for the same root is desirable for testing grammar instead of vocabulary.
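
To illustrate the occurrence-counting idea, this sketch caps the number of tests per root while keeping key_string as the identity of each word. `WordLike` and `limit_tests_per_root` are hypothetical names; only `key_string` and `root_string` come from the issue text.

```typescript
// Minimal stand-in for the Word class; only the two members discussed here.
interface WordLike {
  key_string: string;
  root_string: string;
}

// Keep at most max_per_root words sharing the same root, in input order.
function limit_tests_per_root(words: WordLike[], max_per_root: number): WordLike[] {
  const counts = new Map<string, number>();
  const kept: WordLike[] = [];
  for (const word of words) {
    const seen = counts.get(word.root_string) ?? 0;
    if (seen < max_per_root) {
      kept.push(word);
      counts.set(word.root_string, seen + 1);
    }
  }
  return kept;
}
```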

CLI prompt for input file path or content when missing

Configure max tests of the same word

There is certainly benefit in testing a word in multiple contexts, but there should be a way to configure the maximum number of tests of any same word.

use word test limit in anki generator
Configure with cli opt
Configure with web ui

Configure note prologues and epilogues

Prologues and epilogues are additional text from neighboring notes' sentences, which will not be rendered with any testable words.

add prologue and epilogue fields to note type
store sentence prologues and epilogues in quizgen
configure lengths of prologue and epilogue from cli
document new fields in readme
update notes export column list comment for 2 new columns
configure lengths from web UI
show prologue and epilogue in web card preview
update card template to display prologue and epilogue fields around the question text.
Eventually, showing and hiding these will be done via control tags #32.

Use template library for Anki cards

I like the sound of nunjucks, by Mozilla and inspired by Jinja2 (which I have used a lot).

This will allow me to develop the card templates more fully with components in separate files.

enable anki card templating
update usage documentation for compiling templates before using in Anki

Customize number of sentences per note

Currently, every generated Anki note represents one sentence, but allowing more sentences per note will give more context for each card.

opt for sentence tokens max
web UI input for sentence tokens max
~~Web input for tokens max seems not to work~~
web UI input for sentence words min
set web input default value for words min to 3 (same as placeholder)
#33

Ordinal frequency filters accept percentages

For example, instead of "keep most common 100 testable words", you can "keep most common 10% of testable words".
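
A sketch of how one option value could accept either an absolute count or a percentage. The name `resolve_ordinal_limit` and the rounding choice are assumptions.

```typescript
// Resolve "100" to an absolute count, or "10%" to a share of total.
function resolve_ordinal_limit(value: string, total: number): number {
  const trimmed = value.trim();
  if (trimmed.endsWith("%")) {
    const percent = parseFloat(trimmed.slice(0, -1));
    if (Number.isNaN(percent) || percent < 0 || percent > 100) {
      throw new Error(`invalid percentage: ${value}`);
    }
    // Rounding to the nearest whole word is an arbitrary choice here.
    return Math.round((percent / 100) * total);
  }
  const count = parseInt(trimmed, 10);
  if (Number.isNaN(count) || count < 0) {
    throw new Error(`invalid count: ${value}`);
  }
  // Never ask for more words than exist.
  return Math.min(count, total);
}
```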

Fix web server null note error quizcard_webserver.js:34

Example error logs

101.INFO: parsed 458 sentences, 2026 words
323.INFO: max choices = 5
147.INFO: calculations complete for 택시-운전사_대본_1.txt
215.INFO: generate null anki notes
219.DEBUG: word_frequency_ordinal_max or min = undefined || undefined
222.DEBUG: choice variation = 0.7
223.DEBUG: prologue=10 epilogue=8
/home/dh_yx2fag/wordsearch.dreamhosters.com/quizcard_webserver.js:34
                    (note) => note.toString(
                                   ^
TypeError: Cannot read properties of null (reading 'toString')
    at /home/dh_yx2fag/wordsearch.dreamhosters.com/quizcard_webserver.js:34:36
    at Array.map (<anonymous>)
    at /home/dh_yx2fag/wordsearch.dreamhosters.com/quizcard_webserver.js:33:40
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
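
A defensive sketch for the mapping step that fails in the trace above. It only guards against null entries and does not explain why the note list contains nulls, which still needs investigation. `notes_to_strings` is a hypothetical name.

```typescript
// Skip null or undefined notes before calling toString on each one.
function notes_to_strings<T extends { toString(): string }>(
  notes: Array<T | null | undefined>
): string[] {
  return notes
    .filter((note): note is T => note !== null && note !== undefined)
    .map((note) => note.toString());
}
```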

Define exclude word patterns (ex. speaker names in script)

specify word exclude literals in cli option
specify in word excludes file

Truncate input-file-content in export notes opts comment

I do not want to include the entire value of --input-file-content in the notes opts comment.
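
A sketch of the truncation, counting by code point so that higher Unicode characters are not split. The default limit of 40 and the name `truncate_opt_value` are assumptions.

```typescript
// Shorten long option values for the notes opts comment.
function truncate_opt_value(value: string, max_length: number = 40): string {
  // Array.from splits by code point, so surrogate pairs stay intact.
  const chars = Array.from(value);
  if (chars.length <= max_length) {
    return value;
  }
  return chars.slice(0, max_length).join("") + "...";
}
```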

Configure min words, max tokens per sentence

There is already a QuizCardGenerator.sentence_word_count_min property (not configurable), so there can be another like sentence_word_count_max if we're counting testable words, or sentence_token_count_max if counting tokens.

Include sentence/text translation in Anki note

I added the empty field, so now we can populate it with a translation service.

Export all package exposed contents through main

Main being package.json#main, quizcard_generator.js. I'm specifically thinking of the CLI option keys.

Use Anki tags to customize card render

If the current note's tags are available within the DOM on render, then I can support variations of fill-blanks, as well as other layouts of the same information, without needing to create duplicate notes for associating to different card templates.

Prefix all quizgen generated tags with qg- except those specified by the user, and those for identifying the source text.
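
The prefixing rule could look roughly like this, with user-specified and source-identifying tags passed through unchanged. `prefix_generated_tags` is a hypothetical helper.

```typescript
// Add the qg- prefix to generated tags; leave exempt tags as they are.
function prefix_generated_tags(generated: string[], exempt: string[]): string[] {
  const prefixed = generated.map((tag) => (tag.startsWith("qg-") ? tag : "qg-" + tag));
  return [...prefixed, ...exempt];
}
```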

Initial render control tags:

If you add the show-logging tag, then render logs can be shown in the card body.
If you remove the show-choices tag, then the multiple choices for the cloze are not shown.
If you remove the show-source-file and show-source-line tags, that section will be hidden.
If you remove the show-prologue tag, the preceding sentence will be hidden, and similarly for show-epilogue.
show-randomized shuffles the choices
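
A sketch of how the card script might read these tags, assuming they reach the template as one space-separated string and may or may not carry the qg- prefix. `parse_render_flags` is a hypothetical name.

```typescript
// Collect the show-* render control tags from a space-separated tag string.
function parse_render_flags(tags_text: string): Set<string> {
  const flags = new Set<string>();
  for (const tag of tags_text.split(/\s+/)) {
    // Accept both "qg-show-choices" and "show-choices" (assumption).
    const name = tag.startsWith("qg-") ? tag.slice(3) : tag;
    if (name.startsWith("show-")) {
      flags.add(name);
    }
  }
  return flags;
}
```

The template script could then check, for example, `flags.has("show-choices")` before rendering the choice list.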

Document

Explain render control tags in readme

Words should have different raw text per location

Low priority

Currently, whatever raw text was present at a word's first occurrence is reused at every location of that word on export.

To fix this (ex. ... hello. "Hello?"), move raw_string to be a child of the Word.locations collection. When formatting the sentence for export, the location corresponding to the current sentence and token number will determine which value is chosen.
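
One way the per-location raw text could be modeled. The interface and function names are hypothetical; only `raw_string`, `key_string`, and the locations collection come from the issue text.

```typescript
// Each location keeps the raw text that appeared at that position.
interface WordLocation {
  sentence_index: number;
  token_index: number;
  raw_string: string;
}

interface WordWithLocations {
  key_string: string;
  locations: WordLocation[];
}

// Look up the raw text for one location, falling back to key_string.
function raw_string_at(word: WordWithLocations, sentence_index: number, token_index: number): string {
  const found = word.locations.find(
    (loc) => loc.sentence_index === sentence_index && loc.token_index === token_index
  );
  return found ? found.raw_string : word.key_string;
}
```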

Fix custom log level cli opt

i18n of cli opt descriptions

cli driver determines env locale
lang opt to override locale
cli driver updates option descriptions before loading them for the help message.
cli driver mandatory arg prompts are also localized

Document likely pitfalls and tips in web UI

Notes are uniquely identified by source file name and source file line number.

When the updated notes are imported into Anki, they are uniquely identified with the euid column, which will not change as long as the quizcard-generator input/source file name and line number to which the note belongs don't change.

If the input file is too large, the server will throw a cryptic error (likely because of exceeded memory limit). In this scenario, one has to split the file into segments. Note each segment should have a different source file name to prevent overwriting notes from the previous segment.
Notes that are generated without any testable words are assigned the not-testable tag, but are still included in the notes file. This allows you to then decide case by case, after import to Anki, whether to edit them or delete them.

Options for min word frequency, word frequency highest/lowest, min word length

define cli options
implement cli options

Fix card choice shuffle

The structure of the active choice list does not guarantee the list element (<ul/>) to be the root.

i18n of readme

Host quizgen readme in webpage using md to HTML compiler.
Frontend language select button
Frontend language cookie
i18n frontend script that detects browser language, fetches corresponding localization file, and translates page strings.
For this I plan to use roddeh-i18n since it's referenced from an MDN extensions API article, though it doesn't seem like the most ubiquitous library choice.
fix i18n.create().extend() in fork

Include all generator option keys and values in notes export

This embeds the options used to generate the file within the file header comments section, making it easier to reproduce and adjust later.

opts in notes from cli
opts in notes from web ui

Include unique id column in Anki note

This is what Anki uses to determine whether to insert or update the note when comparing with existing (ex. previously imported) notes.

Include tags in Anki notes export file

Word edit distance reduce runtime and memory usage

Make edit distances a single shared structure, rather than owned individually by each word.
This might not actually reduce memory usage at all, because I still need to store the distance between every pair of words a and b, bidirectionally. If the data structure is singular, it will still have n^2 keys (n = word count).
Ignore edit distances beyond a configurable threshold.
For the threshold, I can use the same formula as the configurable choice variance.
Begin calculating edit distances while parsing the source text.
Move word edit distances structure into storage as temp files? Maybe this is overkill, for a future enhancement. It may be better to move other structures into storage first, like the lists of sentences and words.
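
A sketch of the shared structure with a threshold, under the assumption that edit distance here is symmetric Levenshtein distance, so one entry per unordered pair is enough. All names are hypothetical, and this builds the table in one pass rather than incrementally during parsing.

```typescript
// Levenshtein distance over code points, using two rolling rows.
function levenshtein(a: string, b: string): number {
  const s = Array.from(a);
  const t = Array.from(b);
  let prev = Array.from({ length: t.length + 1 }, (_, j) => j);
  for (let i = 1; i <= s.length; i++) {
    const curr = [i];
    for (let j = 1; j <= t.length; j++) {
      const cost = s[i - 1] === t[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[t.length];
}

// Order-independent key, so (a, b) and (b, a) share one entry.
function pair_key(a: string, b: string): string {
  return a < b ? `${a}\u0000${b}` : `${b}\u0000${a}`;
}

// Single shared table; pairs beyond max_distance are not stored.
function build_distance_table(words: string[], max_distance: number): Map<string, number> {
  const table = new Map<string, number>();
  for (let i = 0; i < words.length; i++) {
    for (let j = i + 1; j < words.length; j++) {
      const distance = levenshtein(words[i], words[j]);
      if (distance <= max_distance) {
        table.set(pair_key(words[i], words[j]), distance);
      }
    }
  }
  return table;
}
```

With the threshold applied, the table only grows with the number of near pairs. The pairwise computation itself is still quadratic in the word count in this sketch.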

Configure combination of multiple clozes with same index

There should be more variety between cards and less ambiguity between choices, if multiple words can be tested simultaneously in the same card, each choice being a combination of the tested word candidates.
