vuizur / add-stress-to-epub
A program that sets the stress and the letter ё of Russian text and ebooks using Wiktionary data and grammar analysis.
License: GNU Affero General Public License v3.0
Hello,
I found this project a few minutes ago and I'm very impressed.
I've noticed that the display of accented characters varies a lot depending on which fonts are embedded in the book. It would be nice if users could add a font known to render them well with calibre (maybe explained on a wiki page).
(Edit: this looks like a bug in KOReader, which falls back to a "fallback font" for the accented characters.)
For example удалось. There should be a new row in the database for this to save the other version
Like (по)умнее
Hello
This software seems to add the accents directly to the words themselves. This requires the user to have special dictionary programs or files that can handle words in their accented forms. I propose a possible way to add accents without needing special dictionaries.
We use the CSS ::before pseudo-element to provide the accent character in a zero-width inline-block span. The text generated by CSS is not selectable, at least in all the browsers I tested. It also works in Calibre and Foliate, and partially in KOReader (it does not select the whole word at once, but you can just drag a bit). It might not work with all reader software, but it still seems useful.
Example html:
<html>
<head>
<style>
[data-content]::before {
content: attr(data-content);
}
</style>
</head>
<body>
нали<span data-content='́'></span>чный
</body>
</html>
This should display нали́чный, but if you try to select it, it will select наличный (without the accent).
It would be great if this were an option for this tool!
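A minimal sketch of how such markup could be produced from already-stressed text, assuming stress is marked with the combining acute accent (U+0301) and the text is already HTML-safe. The function name `wrap_stress_marks` is hypothetical and not part of this tool:

```python
COMBINING_ACUTE = "\u0301"  # combining acute accent used as the stress mark


def wrap_stress_marks(text: str) -> str:
    """Move each stress mark out of the text and into a data-content attribute.

    Hypothetical helper: it assumes the input is already HTML-safe and that
    the stylesheet contains the [data-content]::before rule shown above.
    """
    span = f"<span data-content='{COMBINING_ACUTE}'></span>"
    return text.replace(COMBINING_ACUTE, span)


stressed = "нали\u0301чный"
print(wrap_stress_marks(stressed))
```

The text nodes of the result contain only the unstressed word, so a selection or dictionary lookup would see наличный, while the accent is drawn by CSS.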
Hello!
I am the maintainer of VocabSieve. It would be great if you could publish this to PyPI for programmatic use. The existing packages (like russtress) do not consider context and make mistakes quite often.
On another note, is it really necessary to have spacy for this? It is a rather large dependency and in my experience works somewhat slowly. Have you tried pymorphy2? It seems to be able to tag words too.
Hi there!
Cool concept, and exactly what I've been looking for to add stress marks to sentences from the SMARTool database in order to make Anki decks.
I installed it on my Linux Mint 20 machine using:
pip3 install git+https://github.com/Vuizur/add-stress-to-epub
When I run your example in my Python script, I get the following error at the import line:
  File "/home/user_name/.local/lib/python3.8/site-packages/russian_text_stresser/russian_dictionary.py", line 45, in <module>
    class RussianDictionary:
  File "/home/user_name/.local/lib/python3.8/site-packages/russian_text_stresser/russian_dictionary.py", line 46, in RussianDictionary
    def __init__(self, db_file: str, simple_cases_file: str | None) -> None:
TypeError: unsupported operand type(s) for |: 'type' and 'NoneType'
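The traceback is consistent with the `str | None` union syntax (PEP 604, available from Python 3.10) being evaluated on Python 3.8, where `type` objects do not support the `|` operator. Below is a small sketch of an annotation that also works on 3.8, using `typing.Optional`. The class name `RussianDictionarySketch` is hypothetical and only mirrors the signature from the traceback; it is not the project's actual code or fix:

```python
from typing import Optional


class RussianDictionarySketch:
    """Hypothetical stand-in that mirrors the signature in the traceback."""

    # Optional[str] is equivalent to "str | None" but is also valid on
    # Python versions before 3.10.
    def __init__(self, db_file: str, simple_cases_file: Optional[str]) -> None:
        self.db_file = db_file
        self.simple_cases_file = simple_cases_file


d = RussianDictionarySketch("words.db", None)
print(d.simple_cases_file)
```

Alternatively, running the package under Python 3.10 or newer should avoid this particular error without any code change.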
I'm running your tool on Windows 11. I get a bunch of "Apparently wrong POS detected" messages, and then the tool fails with KeyError: 'AUX', all within about 30 seconds.
Here is the file on which I'm trying to add stress marks. Any feedback is appreciated!
This, combined with https://github.com/Vuizur/Wiktionary-Dictionaries, improved my reading experience on my Kindle tenfold. Right now I am cruising through Гарри Поттер и Философский камень. Thank you!
It is in the OpenRussian data, so it should theoretically be there. I should be able to fix this by parsing the newer OpenRussian CSVs.