Comments (5)

dpasch01 commented on June 12, 2024

The issues with this solution are:

  1. It is going to be extremely slow;
  2. The vast majority of Twitter profiles have never been captured by the Wayback Machine (WBM), so we would not be able to get any data for them.

However, it is a possible solution to the situation.

from snscrape.

upintheairsheep commented on June 12, 2024

> The issues with this solution are:
>
>   1. It is going to be extremely slow;
>   2. The vast majority of Twitter profiles have never been captured by the Wayback Machine (WBM), so we would not be able to get any data for them.
>
> However, it is a possible solution to the situation.

1: Better than nothing.
2: Yes, I know this is the case; however, it can still scrape whichever Twitter profile is requested. This should not be a replacement for the regular Twitter scraper, but rather a way to get posts from deleted or banned accounts, as well as deleted tweets.

JustAnotherArchivist commented on June 12, 2024

The additional complexity of supporting every past version of Twitter's web layout (rather than just the single current one) is not something I consider an adequate use of developer time, especially given the spotty coverage.

upintheairsheep commented on June 12, 2024

> The additional complexity of supporting every past version of Twitter's web layout (rather than just the single current one) is not something I consider an adequate use of developer time, especially given the spotty coverage.

I'd say just support the two or three most recent versions. The desire to archive Twitter only really gained momentum after Elon took over, and luckily for us, Twitter's web layout remained largely unchanged from about 2016 to 2022. Some captures show the mobile layout instead, which has not changed either. See http://web.archive.org/web/2/https://www.twitter.com/jack/status/20 as an example.
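
For readers unfamiliar with the URL above: it follows the Wayback Machine's playback pattern `http://web.archive.org/web/<timestamp>/<original URL>`. As far as I know, a partial timestamp such as `2` is redirected to the closest capture available, which in practice tends to be a recent one. Below is a minimal sketch of building such URLs, plus a query URL for the Wayback CDX server to list captures of a profile. The helper names `wayback_url` and `cdx_query_url` are hypothetical and not part of snscrape; the sketch only constructs URLs and makes no network requests.

```python
from urllib.parse import urlencode

# Public Wayback Machine host, as used in the example URL above.
WAYBACK_BASE = "http://web.archive.org"


def wayback_url(original_url: str, timestamp: str = "2") -> str:
    """Build a Wayback Machine playback URL (hypothetical helper).

    `timestamp` is a full or partial YYYYMMDDhhmmss string. The default
    "2" mirrors the example URL in this thread.
    """
    return f"{WAYBACK_BASE}/web/{timestamp}/{original_url}"


def cdx_query_url(original_url: str, limit: int = 5) -> str:
    """Build a CDX server query URL listing captures (hypothetical helper).

    Only the `url`, `output` and `limit` query parameters are used here.
    """
    params = urlencode({"url": original_url, "output": "json", "limit": limit})
    return f"{WAYBACK_BASE}/cdx/search/cdx?{params}"


if __name__ == "__main__":
    print(wayback_url("https://www.twitter.com/jack/status/20"))
    print(cdx_query_url("twitter.com/jack", limit=3))
```

Fetching the CDX URL first would show whether a profile has any captures at all, which is the coverage concern raised earlier in the thread. Parsing the archived HTML is a separate problem and is not attempted here.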

JustAnotherArchivist commented on June 12, 2024

'The site looks the same' doesn't mean there were no changes relevant for a scraper's code. The WBM also contains snapshots using at least four completely different Twitter website designs in just the last few years (the old design, the old simple/mobile design, the current simple design, and the current usual site which generally doesn't work in the WBM).

And you misunderstood me: I don't think supporting even a single additional version is worth the effort. I certainly won't be doing it. I might consider a well-written PR. Otherwise, this should be done outside of snscrape.
