Comments (9)

jlengrand avatar jlengrand commented on May 31, 2024

Yeah, it seems to be running fine AFAIK; it's just getting slower as time passes. Let's see.

[screen recording attached: Screen.Recording.2021-01-19.at.3.46.25.PM.mov]

Can confirm that it is getting slower as time passes. Past 9,000 files, I process roughly one file per second now.

jlengrand avatar jlengrand commented on May 31, 2024

Interestingly, the bottleneck is apparently neither CPU nor memory.

[screenshot attached]

jlengrand avatar jlengrand commented on May 31, 2024

Beyond the improvements that can be made, this makes me think we should probably allow storing maps of links, to avoid starting from scratch every time. You don't want your CI to take 3x more time because of dead-link checking.
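
A minimal sketch of what such a stored map could look like, assuming a JSON cache file that is either committed or restored by CI; the file name, shape, and 24-hour freshness window are illustrative, not an existing check-html-links feature:

```js
// Hypothetical cache: a map of checked links persisted between runs.
// File name, shape, and freshness window are assumptions, not a real option.
import fs from 'fs';

const CACHE_FILE = '.link-check-cache.json';
const ONE_DAY = 24 * 60 * 60 * 1000;

// cache shape: { [url]: { ok: boolean, checkedAt: number } }
function loadCache() {
  try {
    return JSON.parse(fs.readFileSync(CACHE_FILE, 'utf8'));
  } catch {
    return {}; // no cache yet: start from scratch once
  }
}

function saveCache(cache) {
  fs.writeFileSync(CACHE_FILE, JSON.stringify(cache, null, 2));
}

// Only re-check links that are missing from the cache or older than a day.
function needsCheck(cache, url) {
  const entry = cache[url];
  return !entry || Date.now() - entry.checkedAt > ONE_DAY;
}
```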

daKmoR avatar daKmoR commented on May 31, 2024

35,000 files is certainly a different scale 😅

why do they have sooo many? 🤔

anyways there are still a few performance improvements we can make... (some of which only make sense at this scale, so I didn't bother yet... also, ~2s for the other use cases I tested seemed fast enough 😬)

adding some sort of permanent cache is imho tricky, as then we would need to look at cache invalidation - also you would need to commit the cache file, or the CI would need to persist it
potentially this would require a full link tree... yeah, I don't think it's worth the trouble
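
One hedged way to handle the invalidation part would be to key cached results by a content hash of each source file, so only changed files are re-parsed; the shape below is illustrative only, and it still misses the full-link-tree case mentioned above:

```js
// Illustrative invalidation: key cached results by a per-file content hash.
import crypto from 'crypto';
import fs from 'fs';

function fileHash(filePath) {
  return crypto.createHash('sha1').update(fs.readFileSync(filePath)).digest('hex');
}

// cache shape: { [filePath]: { hash: string, brokenLinks: string[] } }
function isStale(cache, filePath) {
  const entry = cache[filePath];
  return !entry || entry.hash !== fileHash(filePath);
}

// Caveat (the "full link tree" point above): a link inside an unchanged file can
// still break when its *target* file is renamed or deleted, so hashing the
// source file alone is not enough for full correctness.
```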

jlengrand avatar jlengrand commented on May 31, 2024

Good point.

I was thinking along those lines as well. I'll try some more sites and see where we're going with it :).

Oh, and now that you mention it, the number of links could be because they're bundling autogenerated Javadoc. That'd explain it. I'll look into possibly adding ignore folders then (rough sketch below).

Thanks for thinking along
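
A rough pre-filter sketch for skipping autogenerated folders before the checker ever sees those files; the folder names and helper are hypothetical, not an existing check-html-links option:

```js
// Hypothetical pre-filter: drop files under autogenerated folders before
// handing the list to the link checker. Folder names are examples only.
const IGNORED_FOLDERS = ['javadoc', 'apidocs'];

function filterIgnored(files) {
  return files.filter(
    (file) => !IGNORED_FOLDERS.some((dir) => file.split('/').includes(dir))
  );
}

// filterIgnored(['docs/index.html', 'docs/javadoc/Foo.html'])
//   -> ['docs/index.html']
```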

jlengrand avatar jlengrand commented on May 31, 2024

It worked BTW, just took a while. Congrats

[screenshot attached]

daKmoR avatar daKmoR commented on May 31, 2024

I did a quick hack running 2 parsers in parallel... on the 11ty-website (with around 500 documents) it improved performance by about 30%... once I added a 3rd parser, it was slower than 1 parser alone.

Seems a rule of thumb could be to have 1 parser for around 300-400 documents... which would mean ~100 parsers for 35,000 files 😅 (batching sketch at the end of this comment)

that would still take a ton of time 😅

but yeah I guess checking 35,000 documents is a little out of scope
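
A rough sketch of just the batching math behind that rule of thumb, with an assumed cap at the CPU core count rather than actually spawning ~100 parsers; how each batch is parsed (e.g. via worker_threads) is left out:

```js
// Batching math only: roughly one parser per ~350 documents, capped at the
// CPU core count (an assumption) instead of ~100 workers for 35,000 files.
import os from 'os';

const DOCS_PER_PARSER = 350;

function planBatches(files) {
  const ideal = Math.ceil(files.length / DOCS_PER_PARSER);
  const workers = Math.max(1, Math.min(ideal, os.cpus().length));
  const chunkSize = Math.ceil(files.length / workers);
  const batches = [];
  for (let i = 0; i < files.length; i += chunkSize) {
    batches.push(files.slice(i, i + chunkSize));
  }
  return batches; // each batch would go to its own parser/worker
}

// ~500 documents  -> 2 batches (the 2-parser / ~30% faster case above)
// 35,000 documents -> capped at the core count rather than ~100 parsers
```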

jlengrand avatar jlengrand commented on May 31, 2024

Nice test though! I mean, the complexity of searching for links through pages and resolving references grows quickly with the number of pages, so yeah, we should probably skip auto-generated docs, which are correct by design anyway.

daKmoR avatar daKmoR commented on May 31, 2024

I'm going to close this as it works... it's just not fast enough if applied to 30,000+ pages 😅

Working with max ~5,000 documents sounds like a valid limitation for now.
There are some more tricks that could be applied, but I don't see myself working on them any time soon (especially as they would introduce way more complexity).

Feel free to reopen or create another issue if this needs to be tackled.
It should probably come with a good use case or a nice project that needs it, or it should be sponsored.
