Giter Club home page Giter Club logo

Comments (7)

avinashvarna avatar avinashvarna commented on August 15, 2024

It's currently using recursion. Would it make sense to use Depth first search instead, since the splits are essentially a tree? We can see if it would speed it up. At a first glance, should at least help with memory usage?

from sanskrit_parser.

drdhaval2785 avatar drdhaval2785 commented on August 15, 2024

It is the recursion which is to be blamed.
unutbu's answer here uses a non recursive method.
https://stackoverflow.com/questions/2158395/flatten-an-irregular-list-of-lists

Please see if it improves time performance.

from sanskrit_parser.

kmadathil avatar kmadathil commented on August 15, 2024

@avinashvarna
This is actually an directed acyclic graph, but not a tree. Remember that Python strings (edit: removed incorrect reference to lists) are immutable, and note the memoization in the split function. So what we actually generate happens to be a non-canonical (as opposed to an adjacency list dict) graph representation (that is slightly easier to read once printed). The "flatten" method is the classic recursive find_all_paths to the common end (None) node, adapted to this structure (but without memoization, so I was slightly stupid there).
So, to make things clearer, I think I'll make the graph structure explicit by wrapping a class around it, and give methods to output in various fashions. Memoizing the find_all_paths should get us the speed we need.
Incidentally, you can see I've in effect tried to avoid cycles by disallowing the enhanced right string from becoming equal to the original string in split.

from sanskrit_parser.

kmadathil avatar kmadathil commented on August 15, 2024

@drdhaval2785
Interesting. Let me look through this - the "flatten" name is a bit of a misnomer as I mentioned above. I should've called it find_all_paths. I used a non-canonical graph representation, and a non-canoical name for find_all_paths. In this representation, each node is represented by a ordered pair of [Node, list of paths to the end node]. Therefore, find_all_paths becomes "sort-of" a flatten, but with prefix additions. However, some of the ideas you pointed to may be applicable here too.

from sanskrit_parser.

avinashvarna avatar avinashvarna commented on August 15, 2024

@kmadathil
I wrote my comment around midnight my time, and then as I was tossing and turning in bed, it came to me that this is not strictly a tree :). You are right of course about this being a DAG.

My turn to nitpick now :). Python strings are immutable, but lists are mutable - https://docs.python.org/2/reference/datamodel.html?highlight=sequences

from sanskrit_parser.

kmadathil avatar kmadathil commented on August 15, 2024

Ah yes, I trip on that again. You are of course, perfectly right. Lists are mutable. However, I think the graph implementation is still a valid though non-canonical and ponderous one. All that is moot, as I'm leaning towards doing a canonical implementation and getting it right.

from sanskrit_parser.

kmadathil avatar kmadathil commented on August 15, 2024

dag-nx gives reasonable speed for the order of 1000 paths which should be fine for now.

from sanskrit_parser.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.