
vidyut's People

Contributors

akprasad, ma08, mssrprad, shreevatsa, vbasky, vipranarayan14, yashkhasbage25

vidyut's Issues

[prakriya] lit forms of वयँ (01.0547) are wrong

In the लिट् forms of वयँ (01.0547) dhatu, वय् is shown as getting samprasarana through the sutra, ग्रहिज्यावयिव्यधिवष्टिविचतिवृश्चतिपृच्छतिभृज्जतीनां ङिति च 6/1/16. Due to this the generated forms are wrong.

The sutra only applies to वय् adesha of वेञ् dhatu but not to the वय् dhatu (See Kashika and Kaumudi).

So, as shown in Kaumudi and Madhaviyadhatuvritti (ववये - वादित्वान्नैत्त्वाभ्यासलोपौ) the forms of वयँ (वय्) dhatu should be:

| eka | dvi | bahu |
| --- | --- | --- |
| ववये | ववयाते | ववयिरे |
| ववयिषे | ववयाथे | ववयिध्वे/ववयिढ्वे |
| ववये | ववयिवहे | ववयिमहे |

But the current forms are:

image

(AFAIK, ऊये, ऊयाते, etc. cannot be optional forms for the वयँ (वय्) dhatu either. But please do check.)

[prakriya] ढत्व in ध्वम् and षीध्वम् is not possible if the anga is not ending with इण्

Due to the sutra विभाषेटः 8.3.79, the ध् in ध्वम् (लिट्/लुङ्) and षीध्वम् (आशीर्लिङ्) is optionally replaced by ढ्, if the anga ends with इण् and is followed by इट् (aagama).

If the anga does not end with इण्, then the sutra cannot apply.

But I noticed that vidyut applies this sutra for many dhatus whose anga is not इण्-अन्त. Some examples:

  • स्पर्धँ: पस्पर्धिढ्वे , पस्पर्धिध्वे / स्पर्धिषीढ्वम् , स्पर्धिषीध्वम् / अस्पर्धिढ्वम् , अस्पर्धिध्वम्
  • राजृँ: रेजिढ्वे , रेजिध्वे , रराजिढ्वे , रराजिध्वे / राजिषीढ्वम् , राजिषीध्वम् / अराजिढ्वम् , अराजिध्वम्
  • भाषँ: बभाषिढ्वे , बभाषिध्वे / भाषिषीढ्वम् , भाषिषीध्वम् / अभाषिढ्वम् , अभाषिध्वम्

AFAIK this is not correct. Please correct me if I am wrong.
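
The condition described above could be expressed as a small guard. This is only a minimal sketch, not vidyut's actual code: the function names and the `has_it_agama` flag are hypothetical, the anga is assumed to be given in SLP1, and इण् is taken as the pratyahara that runs up to the second ण् (including the long vowels).

```rust
/// Sounds of the pratyahara इण् in SLP1 (assumption: taken up to the second ण्).
const IN_SOUNDS: &str = "iIuUfFxXeEoOhyvrl";

/// Returns true if `anga` (in SLP1) ends with an इण् sound.
fn ends_with_in(anga: &str) -> bool {
    anga.chars().last().map_or(false, |c| IN_SOUNDS.contains(c))
}

/// Hypothetical guard for विभाषेटः 8.3.79: the optional ढ् is allowed only
/// if the anga ends with इण् and the इट् augment follows.
fn may_apply_vibhasha_itah(anga: &str, has_it_agama: bool) -> bool {
    has_it_agama && ends_with_in(anga)
}
```

With this guard, angas such as `pasparD`, `rarAj`, and `baBAz` are rejected because they end in ध्, ज्, and ष्, while an anga such as `luluv` is accepted.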

[prakriya] Some forms of जनँ जनने (03.0025) are wrong

Forms of जनँ जनने (03.0025) are shown as getting ई आदेश in अपित् places through the sutra, ई हल्यघोः 6/4/113.

But this is not possible due to the more specific sutra, जनसनखनां सञ्झलोः 6/4/42. Due to this, the dhatu's forms in अपित् places are wrong: जजेतः, जजेताम्, अजजेताम्, जजेयाताम्.

The correct forms are shown in Siddhanta Kaumudi:

एषामाकारोऽन्तादेशः स्याज्झलादौ सनि झलादौ क्ङिति च । जजातः । जज्ञति । जजंसि । जजान । जजन्यात् । जजायात् । जन्यात् । जायात् ।

[prakriya] Improve quality for यङन्तs

We have basic tests in basic_tinantas, but these need more work.

The difficulty here is somewhere between #16 and #17. In particular, I'm not sure how to derive the अजादिs (e.g. I can understand अशश्यते but not अशाश्यते, so I think I'm missing a rule). But the main difficulty will be implementing the right order for various changes: dhatu substitutions, dvitva, samprasarana. (Our current order works for basic verbs in kartari prayoga but is failing here.)

[prakriya] Optimize the `tripadi` module

Profiling indicates that the tripadi module is slow.

Many of the rules in the tripadi need to iterate over every character in the string so that they can apply various sandhi changes. Currently, we create a new CompactString for each of these rules. My rough guess is that we create a dozen such strings for each word we derive, even if none of the rules have scope to apply. CompactString shouldn't heap allocate in most cases, but the copy work required here is still slow.

Once we confirm that this is a problem with profiling, we should avoid the extra copies here. Two approaches that come to mind:

  1. Instead of creating a new string, iterate over the Term strings and manage indices carefully.

  2. Store one copy of the string and rebuild it only if a rule applies. The code would follow the basic pattern of ItPrakriya, e.g., by extending the Prakriya struct with new data and helper methods.

I think (2) is generally cleaner, and it has the side effect of improving our APIs.
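
A rough sketch of approach (2), assuming a much-simplified `Prakriya`. The struct fields, method names, and the toy rule below are all hypothetical and only illustrate the caching pattern; they are not the real vidyut-prakriya API.

```rust
/// Hypothetical, simplified stand-in for the real `Prakriya` struct.
struct Prakriya {
    /// Text of each term, in order.
    terms: Vec<String>,
    /// Cached concatenation of `terms`; `None` means the cache is stale.
    cached_text: Option<String>,
    /// Number of times the cache was rebuilt (kept only for illustration).
    rebuilds: usize,
}

impl Prakriya {
    fn new(terms: &[&str]) -> Self {
        Prakriya {
            terms: terms.iter().map(|t| t.to_string()).collect(),
            cached_text: None,
            rebuilds: 0,
        }
    }

    /// Returns the full text, rebuilding it only if a rule changed a term.
    fn text(&mut self) -> &str {
        if self.cached_text.is_none() {
            self.cached_text = Some(self.terms.concat());
            self.rebuilds += 1;
        }
        self.cached_text.as_deref().unwrap()
    }

    /// Replaces the text of one term and marks the cache as stale.
    fn set_term(&mut self, index: usize, text: &str) {
        self.terms[index] = text.to_string();
        self.cached_text = None;
    }
}

/// Toy rule (not a faithful tripadi rule): turns a word-final "s" into "H".
/// Returns true if the rule applied.
fn try_final_s_to_visarga(p: &mut Prakriya) -> bool {
    if !p.text().ends_with('s') {
        return false; // No scope to apply: no new string is created.
    }
    // The text is non-empty here, so some term must be non-empty.
    let i = p.terms.iter().rposition(|t| !t.is_empty()).expect("non-empty term");
    let mut updated = p.terms[i].clone();
    updated.pop();
    updated.push('H');
    p.set_term(i, &updated);
    true
}
```

In this sketch, rules that do not apply only read the cached string, so the per-rule copy disappears; the string is rebuilt once after each rule that actually changes a term.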

[vidyullekha] 'प्रयोग' and 'सनादि' options should reset when switching between dhatus

Current behaviour

As shown in the GIF below, when switching from one dhatu to another the 'प्रयोग' and 'सनादि' options are reset in the UI but the forms are displayed with the same options that were previously selected.

vidyullekha ui bug animation 1

Currently, they can be reset only by selecting another option and then selecting the original option.

vidyullekha ui bug animation 2

Expected behaviour

The options should reset to default ("कर्तरि" and "none") when switching between dhatus.

lakAras in selected transliteration

Currently, the lakAras are shown in a fixed transliteration.
Please display them in the selected transliteration scheme instead. In the given example, that would be Devanagari.
Screenshot_2023-04-16_19-09-30

[Feature Request] Support for non-Devanagari transliteration

Hi,

Great project!! Really interesting and inspiring.

Are you interested in adding support for non-Devanagari transliteration, maybe one or all of IAST, ITRANS, SLP1? This would be useful for people who have difficulty reading Devanagari.

I would love to contribute to developing this feature. Maybe for a start, we could implement a small transient popup that shows the transliteration(s) near the pointer on extended mouse hover?

prakriya demo: Set up instructions

I tried to work on the demo web app, but things are broken:

➜  ~/w/ambuda/vidyut/vidyut-prakriya git:(main) ✗ make debugger 
wasm-pack build --target web
[INFO]: 🎯  Checking for the Wasm target...
[INFO]: 🌀  Compiling to Wasm...
    Finished release [optimized + debuginfo] target(s) in 0.05s
[INFO]: License key is set in Cargo.toml but no LICENSE file(s) were found; Please add the LICENSE file(s) to your project directory
[INFO]: ⬇️  Installing wasm-bindgen...
[INFO]: Optimizing wasm binaries with `wasm-opt`...
[INFO]: ✨   Done in 0.96s
[INFO]: 📦   Your wasm pkg is ready to publish at /Users/shreevatsa/w/ambuda/vidyut/vidyut-prakriya/pkg.
cp pkg/* www/static/wasm
cp: www/static/wasm is not a directory
make: *** [debugger] Error 1

After creating that directory:

➜  ~/w/ambuda/vidyut/vidyut-prakriya git:(main) ✗ mkdir -p www/static/wasm
➜  ~/w/ambuda/vidyut/vidyut-prakriya git:(main) ✗ make debugger           
wasm-pack build --target web
[INFO]: 🎯  Checking for the Wasm target...
[INFO]: 🌀  Compiling to Wasm...
    Finished release [optimized + debuginfo] target(s) in 0.05s
[INFO]: License key is set in Cargo.toml but no LICENSE file(s) were found; Please add the LICENSE file(s) to your project directory
[INFO]: ⬇️  Installing wasm-bindgen...
[INFO]: Optimizing wasm binaries with `wasm-opt`...
[INFO]: ✨   Done in 0.97s
[INFO]: 📦   Your wasm pkg is ready to publish at /Users/shreevatsa/w/ambuda/vidyut/vidyut-prakriya/pkg.
cp pkg/* www/static/wasm
cp data/* www/static/data
cp: www/static/data is not a directory
make: *** [debugger] Error 1

Ok, creating that one too:

➜  ~/w/ambuda/vidyut/vidyut-prakriya git:(main) ✗ mkdir -p www/static/data
➜  ~/w/ambuda/vidyut/vidyut-prakriya git:(main) ✗ make debugger           
wasm-pack build --target web
[INFO]: 🎯  Checking for the Wasm target...
[INFO]: 🌀  Compiling to Wasm...
    Finished release [optimized + debuginfo] target(s) in 0.05s
[INFO]: License key is set in Cargo.toml but no LICENSE file(s) were found; Please add the LICENSE file(s) to your project directory
[INFO]: ⬇️  Installing wasm-bindgen...
[INFO]: Optimizing wasm binaries with `wasm-opt`...
[INFO]: ✨   Done in 0.96s
[INFO]: 📦   Your wasm pkg is ready to publish at /Users/shreevatsa/w/ambuda/vidyut/vidyut-prakriya/pkg.
cp pkg/* www/static/wasm
cp data/* www/static/data
cd www && source env/bin/activate && python app.py
/bin/sh: env/bin/activate: No such file or directory
make: *** [debugger] Error 1

Indeed there's no env/bin/activate inside www so maybe it needs to be done in the other order?

➜  ~/w/ambuda/vidyut/vidyut-prakriya git:(main) ✗ source env/bin/activate
(env) ➜  ~/w/ambuda/vidyut/vidyut-prakriya git:(main) ✗ cd www && python app.py
Traceback (most recent call last):
  File "/Users/shreevatsa/w/ambuda/vidyut/vidyut-prakriya/www/app.py", line 1, in <module>
    from flask import Flask, render_template
ModuleNotFoundError: No module named 'flask'

Maybe there needs to be a requirements.txt and all that, as this app is no longer just a standalone index.html file :-)

[prakriya] Prakriya doesn't show all rules involved in the process?

In the derivation of बुध्यात् (बुधँ अवगमने 01.0994):

Without at least क्ङिति च being shown, it is difficult to understand why there is no guNa in the form.

image

Dr Dhaval's Prakriyāpradarśinī shows क्ङिति च:

image

If this is intended, I understand.

[prakriya] Handling rule conflicts

We currently have reasonable support for karmani prayoga, and I'll also add support for sanAdi pratyayas by the end of the year. We have experimental support for various krdantas and basic support for subantas.

Currently, how are rule conflicts handled in prakriyA simulation? The regular interpretation of विप्रतिषेधे परं कार्यम्, augmented by a web of paribhAShA-s?

Would it be simple to implement an option to resolve such rule conflicts by means of the simpler framework described in Rishi Rajpopat's thesis, which recently entered the news and fascinated or surprised many? This would be enormously valuable in validating the claims made therein, and will likely lead to advances in our understanding of what pANini intended, along with any drawbacks therein.

[kosha] Refactor `Kosha` logic into a new `MultiFst` class

MultiFst should handle duplicate keys and do most of the heavy lifting. Kosha should then be a thin wrapper over MultiFst.

Use cases:

  • can use MultiFst for other kinds of linguistic data
  • can create Python bindings directly for MultiFst which can be handy for some applications (e.g. creating lightweight FST structs for word lists.)

Technical blockers

  • to avoid double allocation, get_all should return an iterator instead of a Vec.
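
A minimal sketch of what an iterator-returning `get_all` might look like. Everything here is an assumption made for illustration: a `BTreeMap` stands in for the underlying FST, duplicates are stored by appending a one-byte index to the key, and the method names are hypothetical rather than the existing Kosha API.

```rust
use std::collections::BTreeMap;

/// Hypothetical sketch of `MultiFst`; a `BTreeMap` stands in for the real FST.
struct MultiFst {
    /// Maps (key bytes + one-byte duplicate index) to a packed value.
    map: BTreeMap<Vec<u8>, u64>,
}

impl MultiFst {
    fn new() -> Self {
        MultiFst { map: BTreeMap::new() }
    }

    /// Inserts `value` under `key`, allowing up to 256 duplicates per key.
    fn insert(&mut self, key: &str, value: u64) {
        let n = self.get_all(key).count();
        assert!(n < 256, "too many duplicates for this sketch");
        let mut k = key.as_bytes().to_vec();
        k.push(n as u8);
        self.map.insert(k, value);
    }

    /// Lazily yields every value stored under `key`, without building a `Vec`.
    fn get_all<'a>(&'a self, key: &str) -> impl Iterator<Item = u64> + 'a {
        // One reusable buffer: key bytes plus a slot for the duplicate index.
        let mut k = key.as_bytes().to_vec();
        k.push(0);
        (0..=u8::MAX).map_while(move |i| {
            *k.last_mut().unwrap() = i;
            self.map.get(&k).copied()
        })
    }
}
```

Callers that still want a `Vec` can call `.collect()` on the result, while callers that only need the first match or a count avoid the intermediate allocation.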

[prakriya] Add a better test suite for krdantas

Online data for tinantas and subantas is quite reasonable. But as far as I'm aware, there are no high-quality lists of krdantas. I have started a basic test suite in basic_krdantas, but we should add more cases here.

This task requires some knowledge of व्याकरण or else a willingness to go through various grammar books, etc. to determine which forms we should expect.

description typo

should change from "Sanksrit software" -> "Sanskrit software"

prakriya: side-by-side view for different forms for the same inputs

When multiple forms are generated (e.g. "भवतात् , भवताद् , भवतु"), may be nice to see all their prakriya-s side by side. (Basically make the entire table cell a link/action target, rather than an individual generated form inside a table cell.)

(nice to have but a little trickier on the frontend side)

(Above quoted from #19 — may fit well with #23 if all the inputs (lakāra, puruṣa, vacana…) are the "state".)

[prakriya] Improve quality for णिजन्तs

We have basic tests in basic_tinantas, but these need more work.

The major issue is that we need to correctly implement पुगन्तलघूपधस्य च to account for पुगन्त. Otherwise there are minor issues, e.g. for जागृ guna.

prakriya: add search box for dhatus

Similar to sites like ashtadhyayi.com. A list of 2000 dhatus is too much to reasonably scroll through, especially since they are in their aupadeshika forms.

[kosha] Explore different bitfield orderings

In packing, I chose a bitfield ordering more or less on a hunch, but I don't think our current ordering works very well because our modular_bitfield crate uses an endianness that's different from what I expected.

I think a better ordering or approach here could potentially decrease the size of the FST. My guess is that we might save up to 10% on size, which means more of the FST can be kept in the processor cache.

A good PR here should quantify the size decrease when using a different bitfield ordering.
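
As a starting point for such an experiment, here is a small sketch that packs the same fields in two different orders using plain shifts and masks. The field names and bit widths are made up for illustration and do not reflect the actual vidyut-kosha layout or the `modular_bitfield` API.

```rust
/// Hypothetical semantic fields; names and widths are illustrative only.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Fields {
    pos: u32,      // 2 bits
    linga: u32,    // 2 bits
    vibhakti: u32, // 4 bits
}

/// Ordering A: `pos` occupies the lowest bits.
fn pack_pos_low(f: Fields) -> u32 {
    (f.pos & 0b11) | ((f.linga & 0b11) << 2) | ((f.vibhakti & 0b1111) << 4)
}

/// Ordering B: `pos` occupies the highest bits.
fn pack_pos_high(f: Fields) -> u32 {
    (f.vibhakti & 0b1111) | ((f.linga & 0b11) << 4) | ((f.pos & 0b11) << 6)
}

/// Inverse of `pack_pos_low`.
fn unpack_pos_low(x: u32) -> Fields {
    Fields { pos: x & 0b11, linga: (x >> 2) & 0b11, vibhakti: (x >> 4) & 0b1111 }
}

/// Inverse of `pack_pos_high`.
fn unpack_pos_high(x: u32) -> Fields {
    Fields { vibhakti: x & 0b1111, linga: (x >> 4) & 0b11, pos: (x >> 6) & 0b11 }
}
```

Both orderings carry the same information but produce different integers for the same entry. A PR could build the FST once per ordering and compare the resulting file sizes.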

[prakriya] Improve quality for सन्नन्तs

We have basic tests in basic_tinantas, but these need much more work.

As compared to our णिजन्तs, our सन्नन्तs have more quality issues and are not quite as reliable. The major issue is that a lot of small सन् rules haven't been implemented yet, but I'm not sure which ones.

[prakriya] Some lit forms of अक्षूँ (01.0742) and few other ऊदित् dhatus are not generated

Expected

अक्षूँ (01.0742) is an ऊदित् dhatu. So, due to स्वरतिसूतिसूयतिधूञूदितो वा 7/2/44, it optionally gets इडागम when followed by a वलादि आर्धधातुक suffix. In the lit lakara, it should have the following forms:

| eka | dvi | bahu |
| --- | --- | --- |
| आनक्ष | आनक्षतुः | आनक्षुः |
| आनष्ठ , आनक्षिथ | आनक्षथुः | आनक्ष |
| आनक्ष | आनक्षिव , आनक्ष्व | आनक्षिम , आनक्ष्म |

Similarly, तक्षूँ (01.0743) and त्वक्षूँ (01.0744).

Current

Additional Info

Acc. to Ashtadhyayi Sahajabodha - Part II of Pushpa Dikshit (p.304), all the dhatus given below should also behave similarly. But I have not checked them yet.

image

Here is a related issue in Dhaval Patel's SanskritVerb project.

Explore using `CompactString` in segmenter

CompactString is a memory-efficient string that can store up to 24 bytes on the stack before making a heap allocation. It's mostly a drop-in replacement for String, and you can learn more about it here.

We've had success improving performance by using CompactString in vidyut-prakriya, and it might also improve performance in vidyut-cheda.

A PR here would experiment with CompactString and quantify the performance change.
