Read hashtags from replies about whistleblower HOT 14 OPEN

okfn-brasil commented on June 8, 2024 1

Read hashtags from replies

from whistleblower.

Comments (14)

cuducos commented on June 8, 2024 1

Adding the proper credits for the idea: thanks @arcanj0 ; )

arcanj0 commented on June 8, 2024

Very nice!

It would be very helpful for the tool itself, like you said.
The downside is that you could be exposed to bots using the hashtag in an "evil way", but I guess that's not that hard to identify!

The way it is today doesn't give followers any certainty that their replies count as help.

I would love to see it working!

Thanks @cuducos !

vmesel commented on June 8, 2024

@cuducos, you are saying that you would like to create a dataset with the data:

recipe_id | count_positive | count_negative | real_classification

If you do, there is no API to check out the replies for tweets, but there is a way to scrape a user profile and look for replies whose in-reply-to ID matches the original @RosieDaSerenata tweet ID. The approach is described here: https://stackoverflow.com/questions/2693553/replies-to-a-particular-tweet-twitter-api

cuducos commented on June 8, 2024

First of all:

If you do, there is no API to check out the replies for tweets

Ow, c'mon @vmesel — you're better than that ; )

Do you prefer to trust a 2010 Stack Overflow reply or the official Twitter API documentation?

BTW there's a Twitter API Python wrapper I've contributed to that offers a GetReplies method to list replies to a certain user.

The main question is: given a list of replies to @RosieDaSerenata, are we able to identify which tweet each reply actually refers to using this wrapper? Twitter API offers in_reply_to_… properties.

That said, we have to choose between using an API wrapper and dealing directly with the REST API.

create a dataset with the data:
recipe_id | count_positive | count_negative | real_classification

I'd prefer to have the counts as something like a query result, so later we can filter (eg discard congressperson replies). So I'd go with something like this:

document_id | tweet_id | reply_id | reply_user | suspicion_confirmed

Where:

document_id: reimbursement document_id
tweet_id: original Rosie's tweet ID
reply_id: reply tweet ID
reply_user: user who replied to Rosie
suspicion_confirmed: true or false
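
To make the "count as a query result" idea concrete, here is a minimal sketch using SQLite. The table name `replies`, the helper `count_votes`, the column types, and the sample rows are all assumptions for illustration; only the column names come from the layout proposed above.

```python
import sqlite3

# Hypothetical table: one row per reply, columns as proposed in the comment above.
SCHEMA = """
CREATE TABLE replies (
    document_id INTEGER NOT NULL,
    tweet_id INTEGER NOT NULL,
    reply_id INTEGER PRIMARY KEY,
    reply_user TEXT NOT NULL,
    suspicion_confirmed INTEGER NOT NULL  -- 1 for true, 0 for false
)
"""


def count_votes(conn, document_id, ignored_users=()):
    """Return (confirmed, denied) counts for a document, skipping ignored users."""
    query = (
        "SELECT COALESCE(SUM(suspicion_confirmed), 0), "
        "COALESCE(SUM(1 - suspicion_confirmed), 0) "
        "FROM replies WHERE document_id = ?"
    )
    if ignored_users:
        placeholders = ",".join("?" for _ in ignored_users)
        query += f" AND reply_user NOT IN ({placeholders})"
    row = conn.execute(query, (document_id, *ignored_users)).fetchone()
    return tuple(row)


# Made-up sample data, only to exercise the query.
conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
conn.executemany(
    "INSERT INTO replies VALUES (?, ?, ?, ?, ?)",
    [
        (42, 1000, 1001, "citizen_a", 1),
        (42, 1000, 1002, "citizen_b", 0),
        (42, 1000, 1003, "congressperson_x", 0),
    ],
)
conn.commit()
```

Because every reply is kept as its own row, a filter such as `ignored_users=("congressperson_x",)` can be applied at query time without touching the stored data.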

cuducos commented on June 8, 2024

Complementing my previous message, a roadmap for a possible implementation would be:

check all new replies @RosieDaSerenata has got
check if in the in_reply_to_… field we have a @RosieDaSerenata tweet with a valid Jarbas URL
fetch the original tweet and extract the document_id from its URL
scan the reply for a certain hashtag (#falsoPositivo or #RosieAcertaOutraVez for example)
persist that data in a dataset or database
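
The hashtag scanning step of this roadmap could look roughly like the sketch below. The function name `classify_reply` is hypothetical, the two hashtags are just the examples mentioned above, and treating a reply with both (or neither) hashtag as "no vote" is an assumption, not something decided in this thread.

```python
import re

# Example hashtags from the roadmap; the final choice is still open.
CONFIRM_TAG = "rosieacertaoutravez"
DENY_TAG = "falsopositivo"

HASHTAG = re.compile(r"#(\w+)")


def classify_reply(text):
    """Return True (confirmed), False (denied) or None (no or conflicting vote)."""
    tags = {tag.lower() for tag in HASHTAG.findall(text)}
    confirmed = CONFIRM_TAG in tags
    denied = DENY_TAG in tags
    if confirmed == denied:
        # Either no known hashtag or both at once: do not count the reply.
        return None
    return confirmed
```

Matching is case-insensitive, so `#falsopositivo` and `#falsoPositivo` are treated the same.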

vmesel commented on June 8, 2024

@cuducos I was saying that there is no replies endpoint, and I do know that there is a mentions endpoint. The example I linked (which you said is old, but it's still gold) uses the same logic you described.

Stack Overflow example:

From status/show you can find the user's id. Then statuses/mentions_timeline will return a list of statuses for a user. Just parse that return looking for an in_reply_to_status_id matching the original tweet's id.
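
The matching step described in that quote can be sketched as a simple filter. This assumes the mentions have already been fetched and are available as plain dicts carrying an `in_reply_to_status_id` key; the function name `replies_to` and the sample IDs are made up for illustration.

```python
def replies_to(mentions, original_tweet_id):
    """Keep only the mentions that directly reply to the given tweet."""
    return [
        tweet
        for tweet in mentions
        if tweet.get("in_reply_to_status_id") == original_tweet_id
    ]


# Made-up payloads: two replies to tweet 1000, plus one plain mention.
sample_mentions = [
    {"id": 1001, "in_reply_to_status_id": 1000, "text": "#falsoPositivo"},
    {"id": 1002, "in_reply_to_status_id": None, "text": "oi @RosieDaSerenata"},
    {"id": 1003, "in_reply_to_status_id": 1000, "text": "#RosieAcertaOutraVez"},
]
```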

Why should we have individual records for replies if we just want to classify the receipt? This consumes lots of unnecessary space in the database.

We could instead update the row that references this tweet/document ID.

Another thing about this request: how can we recover the Jarbas document ID without making a web request? That would cost time and money (network costs on DO or AWS).

cuducos commented on June 8, 2024

Why should we have individual records for replies if we just want to classify the receipt?

As I've said:

so later we can filter (eg discard congressperson replies)

cuducos commented on June 8, 2024

Oops; sent before finishing… so here we go again.

Why should we have individual records for replies if we just want to classify the receipt?

As I've said:

so later we can filter (eg discard congressperson replies)

I cannot predict which filters will be useful (bots could be created to launch massive attacks against poor Rosie, for example), so I think storing them all is the best choice.

how can we recover the Jarbas document ID without making a web request

Is the text returned by the tweet API the full URL or Twitter's shortened version? If it's shortened, there's no way out. If it's not, we can parse the URL (it's always https://jarbas.serenatadeamor.org/#/documentId/<document_id>).
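
If the full URL is available, parsing it could be as small as the sketch below. The function name `document_id_from_url` is hypothetical, and the pattern assumes the document_id is purely numeric, which is not confirmed in this thread.

```python
import re

# URL pattern from the comment above; assumes document_id is all digits.
JARBAS_URL = re.compile(
    r"^https://jarbas\.serenatadeamor\.org/#/documentId/(\d+)/?$"
)


def document_id_from_url(url):
    """Return the document_id as an int, or None if the URL does not match."""
    match = JARBAS_URL.match(url)
    return int(match.group(1)) if match else None
```

A shortened t.co link simply returns `None`, so the caller can tell when the URL still needs expanding.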

Irio commented on June 8, 2024

@cuducos The URLs are shortened. You get the full ones by making a request to https://t.co/.

vmesel commented on June 8, 2024

@Irio so we have a great bandwidth problem here! Depending on the quantity and how we are going to organize the requests to http://t.co/, it might think it's a DDoS attack and block our IPs.

Irio commented on June 8, 2024

@vmesel This is not a problem, since the document_ids of all posts are stored in a database. Check the link I posted again.

cuducos commented on June 8, 2024

we have a great bandwidth problem here!

Do we? Share the code and we can help you.

cuducos commented on June 8, 2024

Depending on the quantity and how we are going to organize the requests to http://t.co/, it might think it's a DDoS attack and block our IPs.

This is not a bandwidth problem: this is a software problem. If this is a problem we just pause between requests, for example, as we do here.
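
Pausing between requests can be sketched generically like this. It is not the code linked above; the helper name `throttled`, the `fetch` callable, and the one-second default are assumptions for illustration.

```python
import time


def throttled(items, fetch, pause=1.0, sleep=time.sleep):
    """Call fetch(item) for each item, sleeping `pause` seconds between calls."""
    results = []
    for index, item in enumerate(items):
        if index:
            # No pause before the first request, only between requests.
            sleep(pause)
        results.append(fetch(item))
    return results
```

The `sleep` parameter exists only so the pause can be replaced in tests; in normal use the default `time.sleep` is enough.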

cuducos commented on June 8, 2024

how we are going to organize the requests to http://t.co/

@vmesel we don't need to make any requests to t.co: python-twitter fetches the expanded URL directly for us. I'm using it on Jarbas.
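
For reference, tweet payloads from the Twitter REST API list each link under `entities.urls` with both the shortened `url` and the `expanded_url`. The sketch below reads that field from a plain dict; the function name `expanded_urls` and the sample payload are made up, and it works on raw JSON rather than on python-twitter objects.

```python
def expanded_urls(tweet):
    """Return the expanded URLs listed in a tweet payload's entities."""
    urls = tweet.get("entities", {}).get("urls", [])
    return [url["expanded_url"] for url in urls if url.get("expanded_url")]


# Made-up payload shaped like a tweet from the REST API.
sample_tweet = {
    "id": 1000,
    "text": "Gasto suspeito https://t.co/abc123",
    "entities": {
        "urls": [
            {
                "url": "https://t.co/abc123",
                "expanded_url": "https://jarbas.serenatadeamor.org/#/documentId/5635048",
            }
        ]
    },
}
```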
