subito-it-searcher's Introduction

Improved version of subito-it-searcher

BeautifulSoup scraper that runs queries on a popular Italian ad website. This searcher is compatible with Python 3.x versions.

Features (thanks to Marco Perronet)

  • Infinite refresh with adjustable delay
  • Multiplatform support: also runs on Windows
  • Windows 10 notifications
  • Easier Telegram setup
  • Handle connection errors
  • Fix flooding on Telegram

Setup

Install dependencies

pip3 install -r requirements.txt

NB: Windows 10 users should also install win10toast.

Telegram configuration

To have the bot send you updates on Telegram, follow these steps:

  1. Create a bot by writing to the BotFather on Telegram
  2. BotFather will give you an API key: save this API key for later
  3. Create a public channel and add the newly created bot as administrator
  4. Save the name of the channel including the "@", for example: @subito_bot

To configure Telegram, invoke the script with the proper parameters as follows:

python3 subito-searcher.py --addtoken [YOUR_API_TOKEN] --addchatid [YOUR_CHANNEL_NAME]

Usage

Run python3 subito-searcher.py --help to see all the command line arguments. Keep in mind that the script always needs at least one argument in order to start.

Here is a cheatsheet of the most common usages:

  • Add a new query named "Auto": python3 subito-searcher.py --add Auto --url https://www.subito.it/annunci-italia/vendita/usato/?q=auto --minPrice 50 --maxPrice 100 (always use --add and --url together; the min and max prices are optional)

  • Remove the query "Auto": python3 subito-searcher.py --delete Auto

  • See a list of all your added queries: python3 subito-searcher.py --short_list

  • Start the bot, it will search for new announcements every 2 minutes: python3 subito-searcher.py --daemon

  • Start the bot with a custom delay (example, 30 seconds): python3 subito-searcher.py --daemon --delay 30

  • Start the bot, but disable Windows notifications: python3 subito-searcher.py --notifyoff

  • Start the bot, but disable Telegram messages: python3 subito-searcher.py --tgoff

Example setup

Here is the list of commands I typed to set up the bot on my computer:

python3 subito-searcher.py --addtoken "6168613223:oij9JDXXlipj92jDj0j90JFWO292" --addchatid "@subito_it_test"
python3 subito-searcher.py --add Auto --url https://www.subito.it/annunci-italia/vendita/usato/\?q\=auto
python3 subito-searcher.py --add Iphone --url https://www.subito.it/annunci-italia/vendita/usato/\?q\=iphone
python3 subito-searcher.py --add ScarpeMaxMin --url https://www.subito.it/annunci-italia/vendita/usato/\?q\=auto --minPrice 10 --maxPrice 150
python3 subito-searcher.py --daemon --delay 10

(Of course the token I showed here is not the real one)

"Auto", "Iphone", and "Scarpe" are very common queries, so hopefully you should see some notifications on Telegram!

If you want to check if your bot is able to receive messages, you can use this link to send a test message: https://api.telegram.org/bot[bot_token_code]/sendMessage?chat_id=[chat_id_code]&text=prova (please use your token and chat id in the link).

For example, I used: https://api.telegram.org/bot6168613223:oij9JDXXlipj92jDj0j90JFWO292/sendMessage?chat_id=@subito_it_test&text=Ciao
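
The same check can be scripted. The following is a minimal sketch and not part of subito-searcher.py: the helper names build_send_message_url and send_test_message are hypothetical, and the only thing assumed about Telegram is the sendMessage link format shown above.

```python
import json
import urllib.parse
import urllib.request


def build_send_message_url(token: str, chat_id: str, text: str) -> str:
    """Build the sendMessage URL; note the literal 'bot' prefix before the token."""
    query = urllib.parse.urlencode({"chat_id": chat_id, "text": text})
    return f"https://api.telegram.org/bot{token}/sendMessage?{query}"


def send_test_message(token: str, chat_id: str, text: str = "prova") -> bool:
    """Send one message and report whether Telegram answered with ok=true."""
    url = build_send_message_url(token, chat_id, text)
    with urllib.request.urlopen(url, timeout=10) as response:
        return bool(json.load(response).get("ok"))
```

If the token or the chat id is wrong, Telegram answers with an HTTP error status, so urlopen raises urllib.error.HTTPError instead of returning False.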

Troubleshooting

  • Did you add the bot to the channel and set it as admin?
  • Did you use the correct chat id? Don't forget the "@" at the beginning (e.g. @subito_it_test)
  • Be patient! It may take a few minutes to receive notifications. Did you use a common query where people post announcements, like "Auto"? For testing, also try setting a low delay (e.g. python3 subito-searcher.py --daemon --delay 10)

subito-it-searcher's People

Contributors

alenada99, atreb92, corradopetrelli, daqh, duemme, giuseppericco, gstru, leo-tasso, lorenzofavaro, lucasforza, morrolinux, ottomano, perronet, super99master, valex1

subito-it-searcher's Issues

Source code changed

I think the page's source code has changed; I get this error:
Traceback (most recent call last):
  File "subito-searcher.py", line 135, in <module>
    run_query(args.url, args.name)
  File "subito-searcher.py", line 94, in run_query
    location = product.find('div', class_='AdElements__ItemDateLocation--container-L2hvbWUv').find('span').contents[0]
AttributeError: 'NoneType' object has no attribute 'find'
"from" is not recognized as an internal or external command, operable program or batch file.

I can't configure the bot for Telegram notifications

Good morning everyone, I followed the instructions step by step:
1 Created a bot with @Botfather
2 Created a channel and made the new bot an administrator
3 Ran the command with the API TOKEN and CHANNEL NAME (I tried both with and without quotes)
I obtained the channel name both numerically, using the IDBOT bot, and in the @channel_name form described in the guide

I can receive the test message but not the notifications; is there a command to enable them?

Can you help me? Thank you

feature request: don't show sold items

Since subito.it keeps showing sold items for a few days, the bot still treats them as available. Furthermore, when an item is marked as sold, it seems to be reported as a new item.

Feature request: Remove sold items

Description

When an item is sold, remove it from the list. If a newly found item is already sold, do not add it to the list at all.

Advice

Use the class item-sold-badge for this purpose.

HTML structure (simplified)

<div class="...upper-data-group">
    <div class="... item-key-data ...">
        <h2 class="... size-normal ...">
            Blablabla
        </h2>
        <div class="feature-row ...">
            <p class="...price....">
                290&nbsp;€
                <span class="... item-sold-badge ...">
                    Venduto
                </span>
            </p>
        </div>
    </div>
</div>
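
A minimal sketch of the check suggested above, assuming each ad card is available as an HTML string and that the sold marker always carries a class containing item-sold-badge. The script itself uses BeautifulSoup; this example sticks to the standard library so it runs without dependencies, and the names SoldBadgeDetector and is_sold are hypothetical.

```python
from html.parser import HTMLParser


class SoldBadgeDetector(HTMLParser):
    """Sets .sold to True when any tag has a class containing 'item-sold-badge'."""

    def __init__(self) -> None:
        super().__init__()
        self.sold = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare attributes.
        classes = (dict(attrs).get("class") or "").split()
        if any("item-sold-badge" in css_class for css_class in classes):
            self.sold = True


def is_sold(card_html: str) -> bool:
    """Return True if the ad card markup contains the sold badge."""
    detector = SoldBadgeDetector()
    detector.feed(card_html)
    return detector.sold
```

A caller could skip any card for which is_sold returns True before adding it to the tracked results.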

The bot worked perfectly, but now I get this error

C:\Users\IoSonoLeggenda\Desktop\subito>python subito-searcher.py --refresh
Traceback (most recent call last):
  File "C:\Users\IoSonoLeggenda\Desktop\subito\subito-searcher.py", line 294, in <module>
    load_queries()
  File "C:\Users\IoSonoLeggenda\Desktop\subito\subito-searcher.py", line 61, in load_queries
    queries = json.load(file)
  File "C:\Users\IoSonoLeggenda\AppData\Local\Programs\Python\Python39\lib\json\__init__.py", line 293, in load
    return loads(fp.read(),
  File "C:\Users\IoSonoLeggenda\AppData\Local\Programs\Python\Python39\lib\json\__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "C:\Users\IoSonoLeggenda\AppData\Local\Programs\Python\Python39\lib\json\decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Users\IoSonoLeggenda\AppData\Local\Programs\Python\Python39\lib\json\decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
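
"Expecting value: line 1 column 1 (char 0)" usually means the file handed to json.load is empty or does not contain JSON at all, for example after an interrupted write. A minimal sketch of a more tolerant loader follows; load_queries_safely is a hypothetical name, and falling back to an empty dictionary is an assumed recovery policy, not what the script currently does.

```python
import json
import os


def load_queries_safely(path: str) -> dict:
    """Return the saved queries, or {} if the file is missing, empty or corrupt."""
    if not os.path.isfile(path) or os.path.getsize(path) == 0:
        return {}
    try:
        with open(path, encoding="utf-8") as file:
            return json.load(file)
    except json.JSONDecodeError:
        # Assumed policy: start from scratch rather than crash on a damaged file.
        return {}
```

Deleting or repairing the damaged data file by hand has the same effect for a one-off recovery.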

Operating time window

Hi,
I have prepared a version that adds an operating time window (for example, to skip the night hours, which cuts down on pointless requests at times when hardly anyone posts).
I don't know Python very well, so I may have written "terrible" code, but it seems to work.

subito-searcher.py.txt
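
The attached file is not reproduced here. As an illustration only, a time-window check of this kind could look like the sketch below; the function name is_within_active_hours and its parameters are hypothetical and are not taken from the attachment.

```python
from datetime import time


def is_within_active_hours(now: time, start: time, end: time) -> bool:
    """Return True if 'now' falls in the window [start, end)."""
    if start <= end:
        return start <= now < end
    # The window wraps past midnight, e.g. 22:00 -> 06:00.
    return now >= start or now < end
```

A polling loop could call this with datetime.now().time() and skip the refresh when it returns False.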

Error on some queries with complex parameters

Hello,
I used the bot for about a month without any problems, but for about 2 weeks now I have been getting this error whenever I try to launch it:
Adding result: Memoria corsair ddr4 16gb 2x8gb pc3600 vengeance r - 50 € - Ivrea (TO)
Traceback (most recent call last):
  File "\\DC.PRCONSULTING.local\UserProfiles$\riccardo.guerriero\Desktop\subito-it-searcher-master\subito-searcher.py", line 193, in <module>
    run_query(args.url, args.name, False)
  File "\\DC.PRCONSULTING.local\UserProfiles$\riccardo.guerriero\Desktop\subito-it-searcher-master\subito-searcher.py", line 132, in run_query
    location = product.find('span',re.compile(r'town')).string + product.find('span',re.compile(r'city')).string
AttributeError: 'NoneType' object has no attribute 'string'

The same thing happens even if I try to add a new query.

When a new product is found "TypeError: can only concatenate str (not "NoneType") to str"

The script cannot concatenate the price.
This is the error:

New search added: RobotTagliaerbaLombardia
Adding result: Robot tagliaerba MI322 Viking - 100 € - Merone (CO)
Traceback (most recent call last):
  File "subito-searcher.py", line 188, in <module>
    run_query(args.url, args.name, False)
  File "subito-searcher.py", line 136, in run_query
    tmp = "New element found for "+name+": "+title+" @ "+price+" - "+location+" --> "+link+'\n'
TypeError: can only concatenate str (not "NoneType") to str

If in line 136 I edit the script deleting the price it works.

                tmp = "New element found for "+name+": "+title+"- "+location+" --> "+link+'\n'

So the problem is the price. I don't know how to fix it.

More info:
This is the product.
Maybe the problem is that 100€ is written as 100&nbsp;, but I'm not sure.

Thanks for your help
----EDIT----
The file searches.tracked contains:
"price": null
So I tried:

        try:
            price=product.find('p',class_=re.compile(r'price')).string #at the moment (18.3.2021) the price is under the 'p' tag
            if price == "null" : price = "PrezzoNullo"

But without success :-(
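
Two details explain why the attempt above cannot work. A JSON null is loaded in Python as None, not as the string "null", so the comparison never matches. In addition, BeautifulSoup's .string returns None when a tag has more than one child, which may be the case for the price element here. A minimal sketch of a guard on the message-building side follows; format_result is a hypothetical helper, and the placeholder strings are assumptions.

```python
from typing import Optional


def format_result(name: str, title: str, price: Optional[str],
                  location: Optional[str], link: str) -> str:
    """Build the notification line, substituting placeholders for missing fields."""
    price_text = price if price is not None else "Unknown price"
    location_text = location if location is not None else "Unknown location"
    return (f"New element found for {name}: {title} @ {price_text}"
            f" - {location_text} --> {link}\n")
```

The output format mirrors the concatenation on line 136 of the traceback, so existing messages would look the same whenever the price is present.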

Bug report: if I do a search without including the minimum price, but only the maximum price, the results are distorted

For example if I ran this search:
python3 subito-searcher.py --add "orologi" --url "https://www.subito.it/annunci-italia/vendita/usato/?q=orologi" --maxPrice "700"
I'll get results like this:
Rolex oyster perpetual : 4.500 € --> Alassio (SV)
but 4.500 is greater than 700.
I think the problem is that strings are used to represent the prices, in fact you can run into many bugs like this, but it should be easy enough to fix, just convert the strings to integers.
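
A minimal sketch of the suggested conversion, assuming prices are scraped as text such as "4.500 €", with a dot as the thousands separator and an optional comma for cents. The names parse_price and within_range are hypothetical, and keeping ads without a parseable price is an assumed policy.

```python
import re
from typing import Optional


def parse_price(text: Optional[str]) -> Optional[int]:
    """Convert scraped text such as '4.500 €' to whole euros, or None if absent."""
    if not text:
        return None
    # Drop the cents after a decimal comma, then keep only the digits.
    digits = re.sub(r"[^\d]", "", text.split(",")[0])
    return int(digits) if digits else None


def within_range(price: Optional[int], min_price: Optional[int] = None,
                 max_price: Optional[int] = None) -> bool:
    """Compare numerically; ads without a parseable price are kept (assumption)."""
    if price is None:
        return True
    if min_price is not None and price < min_price:
        return False
    if max_price is not None and price > max_price:
        return False
    return True
```

With numeric values, the Rolex listing above is rejected by --maxPrice 700, whereas a string comparison such as "4.500 €" < "700" evaluates to True because strings are compared character by character.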

The bot doesn't work

I tried several times to set up the bot, but the bot never responds. Also, in the description, in the part that explains how to add the bot, the command says pyhton instead of python.

I can't connect the bot to Telegram

I followed all the steps and entered the token and the chat id. I also checked telegram_api_credentials and all the data is correct, but nothing shows up on Telegram.

TypeError: can only concatenate str (not "Tag") to str

When the price doesn't exist, the script should save the price as unknown in the JSON file.
Instead, it scrapes the span tag and fails with a concatenation error.

A quick solution could be to check the variable.

from bs4 import Tag

# price holds a Tag (the scraped span) instead of text when no price is listed
if isinstance(price, Tag):
    price = 'unknown'

Help

I don't understand: where should the command pyhton3 subito-searcher.py --addtoken [YOUR_API_TOKEN] --addchatid [YOUR_CHANNEL_NAME] be entered? Thanks

Error when using price options

When using the --minPrice and --maxPrice options, I get this error:

running query ("Switch" - https://www.subito.it/annunci-italia/vendita/usato/?shp=true&q=nintendo%20switch&from=mysearches&order=datedesc)...
Traceback (most recent call last):
  File "/home/alessandro/GoogleDrive/Progetti/subito-it-searcher/subito-searcher.py", line 243, in <module>
    refresh(notify)
  File "/home/alessandro/GoogleDrive/Progetti/subito-it-searcher/subito-searcher.py", line 111, in refresh
    run_query(url[0], search[0], notify, minP[0], maxP[0])
  File "/home/alessandro/GoogleDrive/Progetti/subito-it-searcher/subito-searcher.py", line 166, in run_query
    if not queries.get(name).get(url).get(minPrice).get(maxPrice).get(link): # found a new element
AttributeError: 'NoneType' object has no attribute 'get'

I think the problem may be a conversion error between strings and integers in the run_query function. I'm currently trying to solve the issue. If successful, I will make a pull request.
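
One plausible cause, consistent with the hypothesis above: JSON object keys are always strings, so a price stored under the integer key 10 comes back as "10" after the file is reloaded, and a lookup with the integer then returns None in the middle of the .get chain. Whatever the cause, the chain can be made robust with a small helper. This is a minimal sketch; nested_get is a hypothetical name, not a function from the script.

```python
from typing import Any, Hashable


def nested_get(data: dict, *keys: Hashable, default: Any = None) -> Any:
    """Walk nested dictionaries, returning 'default' as soon as a key is missing."""
    current: Any = data
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current
```

The failing condition could then be written as: if not nested_get(queries, name, url, minPrice, maxPrice, link). Converting minPrice and maxPrice to str before the lookup would address the key-type mismatch itself.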

subito[dot]it web page source code changed.

Hi morro,

I checked out your source code. Within one month, the subito[dot]it developers have changed the site's source code. I think they now use CSS to create dynamic web pages (I am not a web developer, so sorry if I'm writing rubbish).

However, I modified your source code to match the new web page structure using regular expressions. If you are interested, contact me.

Best regards,
margiodo
