Comments (13)

jawz101 commented on May 25, 2024
  --hierarchical        Switch the value of the hierarchical sorting of tested
                        file. Configured value: True

Is it really defaulted to True? I'm not seeing them sorted.

from pyfunceble.

funilrys commented on May 25, 2024

Feature introduced with 5fb252e.

P.S: Please DO NOT close this issue manually.
This issue is going to be closed automatically once 5fb252e is merged to the master branch.

funilrys commented on May 25, 2024

Hi @jawz101 and thanks for opening this issue.

Indeed, it may be a good idea to introduce such a feature.

funilrys commented on May 25, 2024

Note: Do not take it personally if I'm not starting to code this feature yet.

I'm currently writing all inline comments but will think about this implementation once I'm done with all inline comments. 👍

jawz101 commented on May 25, 2024

No problemo. I just like writing my thoughts out & if someone thinks it's interesting and it doesn't require much implementation/maintenance, that's cool. If not, that's cool too :)

Another example of some domain sorting might be found in Privacy Badger around here. I don't know if they sort it the same way but I'd started sorting their lists for them once and I guess they liked the idea and ran with it.

jawz101 commented on May 25, 2024

Very cool! BTW, I don't need any credit. You did the work for it. I mean, thank you either way.

funilrys commented on May 25, 2024

Hi @jawz101,

I consider that if someone gives an idea and that idea ends up in the code base, that person has indirectly contributed to the evolution of the code base. So those people should be in the list of contributors. The fact is that your idea made, or makes, the code base a better one.

So I'm not going to change that today or in the future.
I do think that this is what makes open source code stronger than proprietary source code. Everybody has their place here, no need to code!

funilrys commented on May 25, 2024

Hi @jawz101, I was thinking about setting it as the default... But I'll have to rethink that as I'll release 1.0.0 very soon...

Anyway, you found a bug in the CLI parser 👍

jawz101 commented on May 25, 2024

It doesn't look like the sort does what I imagine

hosts.txt

Example: content.ad should be down with other domains that begin with a 'C'. Maybe this is a little awkward of a request

funilrys commented on May 25, 2024

Hi @jawz101,

Let me explain what I did.

Let's say we have aaa.bbb.ccc.com

What I basically do (internally) is remove all punctuation and reorder the elements of the given domain/IP so that the result can then be sorted alphabetically. Once that's done, the hierarchy is kept: internally we read in the generated order, but we test the original domains.

It can be strange or weird but it works 😄

So I first reorder by TLD, which means that if com is the TLD, I sort in the following order.

  • com
  • aaa
  • bbb
  • ccc

That gives us 2 domains: the one we are currently testing (aaa.bbb.ccc.com) and the one which gives us the testing order when we sort hierarchically (comaaabbbccc).

But if ccc.com is registered in the PSL, which we parse into public-suffix.json, we sort in the following order.

  • ccc.com
  • aaa
  • bbb

That gives us 2 domains: the one we are currently testing (aaa.bbb.ccc.com) and the one which gives us the testing order when we sort hierarchically (ccccomaaabbb).

Likewise, if bbb.ccc.com is registered, we sort in the following order.

  • bbb.ccc.com
  • aaa

Which gives us aaa.bbb.ccc.com to test and bbbccccomaaa as the order giver.

That way it stays hierarchically sorted.
Applied to a big list it gives us what you got.
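
The reordering described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not PyFunceble's actual code: the names KNOWN_SUFFIXES, split_suffix and hierarchical_key are made up for this example, and the small suffix set stands in for whatever is parsed into public-suffix.json.

```python
# Hypothetical stand-in for the suffixes parsed from the PSL.
KNOWN_SUFFIXES = {"com"}


def split_suffix(domain, suffixes=KNOWN_SUFFIXES):
    """Return (labels left of the suffix, longest matching suffix)."""
    labels = domain.lower().split(".")
    # Try the longest candidate suffix first, but never the whole domain.
    for start in range(1, len(labels)):
        candidate = ".".join(labels[start:])
        if candidate in suffixes:
            return labels[:start], candidate
    # No known suffix: fall back to the last label.
    return labels[:-1], labels[-1]


def hierarchical_key(domain, suffixes=KNOWN_SUFFIXES):
    """Build the 'order giver' string: suffix first, then the other labels."""
    rest, suffix = split_suffix(domain, suffixes)
    # Dots are dropped so the key is one plain alphabetical string.
    return suffix.replace(".", "") + "".join(rest)


print(hierarchical_key("aaa.bbb.ccc.com"))
# comaaabbbccc
print(hierarchical_key("aaa.bbb.ccc.com", {"com", "ccc.com"}))
# ccccomaaabbb
print(hierarchical_key("aaa.bbb.ccc.com", {"com", "ccc.com", "bbb.ccc.com"}))
# bbbccccomaaa
```

The original domains are left untouched; something like sorted(domains, key=hierarchical_key) would only decide the order in which they are read.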

If I misunderstood something and you need another kind of sorting, please let me know, I'll be glad to implement it 😸

Cheers,
Nissar

jawz101 commented on May 25, 2024

Ah... so it looks like it sorts by the public suffix list 1st and then the remainder.

This is really just a puzzle more than anything so I'm glad you think it's interesting too.

My method didn't really account for the entire PSL, but what I do might be something like

Part 1 - Grab the PSL
Part 2 - Grab the next domain part to the left of it (googly domains)
Part 3 - Take the rest of the stuff to the left and throw it here.

Then sort by Part 2, Part 3, Part 1 (or Part 2, Part 1, Part 3... either way)

So, for example, Google's would sort of clump together even if they had a different PSL:

googleadservices.co.uk
adservice.google.com.tr
adservice.google.de
pagead2.googleadservices.com
googleadservices.com
googleadservices.nl
googleadservices.ru
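
As a rough sketch of the three-part idea (assuming Python, with a hand-picked SUFFIXES set instead of the real PSL, and a hypothetical split_parts helper), the list above could be sorted by Part 2, Part 3, Part 1 like this:

```python
# Hypothetical subset of the PSL, just large enough for the sample list.
SUFFIXES = {"co.uk", "com.tr", "com", "de", "nl", "ru"}


def split_parts(domain, suffixes=SUFFIXES):
    """Split a domain into (part1, part2, part3).

    part1: the public suffix
    part2: the label directly to the left of the suffix
    part3: everything further to the left
    """
    labels = domain.lower().split(".")
    # Longest matching suffix wins; fall back to the last label.
    cut = len(labels) - 1
    for start in range(1, len(labels)):
        if ".".join(labels[start:]) in suffixes:
            cut = start
            break
    part1 = ".".join(labels[cut:])
    part2 = labels[cut - 1] if cut > 0 else ""
    part3 = ".".join(labels[:cut - 1]) if cut > 1 else ""
    return part1, part2, part3


def company_key(domain):
    """Sort key: Part 2, then Part 3, then Part 1."""
    part1, part2, part3 = split_parts(domain)
    return (part2, part3, part1)


domains = [
    "googleadservices.co.uk",
    "adservice.google.com.tr",
    "adservice.google.de",
    "pagead2.googleadservices.com",
    "googleadservices.com",
    "googleadservices.nl",
    "googleadservices.ru",
]

for domain in sorted(domains, key=company_key):
    print(domain)
```

Returning (part2, part1, part3) from company_key instead would give the other ordering mentioned above.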

In a spreadsheet I actually paste all of it, go through and move all of the cells to align along the right, and then sort on the 2nd-to-last column, then the last column, then the one to the left of the 2nd-to-last, the one to the left of that, and so on until the leftmost column. It's a little onerous but gets more-or-less the same effect.
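
The spreadsheet routine can be approximated without any PSL data. The sketch below is only an assumption about how that column order maps to code; spreadsheet_key is a made-up name, and empty cells are modelled as shorter tuples, which may not match how a given spreadsheet orders blanks.

```python
def spreadsheet_key(domain):
    """Mimic right-aligned columns sorted 2nd-to-last, last, then leftwards."""
    # cols[0] is the last column, cols[1] the 2nd-to-last, and so on.
    cols = domain.lower().split(".")[::-1]
    second_to_last = cols[1] if len(cols) > 1 else ""
    return (second_to_last, cols[0], *cols[2:])


sample = [
    "pagead2.googleadservices.com",
    "googleadservices.nl",
    "googleadservices.com",
]

for domain in sorted(sample, key=spreadsheet_key):
    print(domain)
# googleadservices.com
# pagead2.googleadservices.com
# googleadservices.nl
```

Because it only looks at label positions, a domain such as googleadservices.co.uk gets "co" in the 2nd-to-last column, which is the case where this method stops accounting for the PSL.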

But then you kinda see how, say, google or amazon buys their domains across suffixes- which is common so people don't try to domain park or phish their customers.

funilrys commented on May 25, 2024

Hi @jawz101, I just implemented your method with the following order:

  • Part 2
  • Part 1
  • Part 3

P.S: Please DO NOT close this issue manually.
This issue is going to be closed automatically once 7a273bb is merged to master.

jawz101 commented on May 25, 2024

Awesome! I hope it made sense. That should be a nice way to see a company's domain presence on the web.
