scrappy's Introduction

scrap.py

Simple script that monitors websites (fetched with wget or Chromium) for changes (detected by hash, XPath, or regex) and communicates them via an external tool.

setup and usage

Just put both files somewhere in PATH. Dependencies I needed to make it run:

pip install lxml
apt-get install chromium-driver

Inspect the examples closely so you don't feed the script wrong parameters. Be aware that this uses eval() and that I validate nothing, so only pass it input you trust. Create your own services (Python functions) using the functions request() and filter().

request(): The first argument is 'chrome' (default if empty) or 'wget'. The second is the target URL.

filter(): The first argument is the result of a request. The second is a method: 'hash' (default if empty), 'xpath', or 'regex'. The third is the pattern to match. The hash method takes an empty pattern for now; maybe I'll add CRC32/SHA1 selection later.
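To make the three methods concrete, here is a minimal sketch of what a filter()-like function could look like. It is not the code from scrap.py: the name filter_sketch, its return values, and the choice of SHA-256 for the hash method are all assumptions made for illustration.

```python
import hashlib
import re


def filter_sketch(content, method="", pattern=""):
    """Hypothetical stand-in for filter(); not the code from scrap.py."""
    method = method or "hash"  # an empty method falls back to 'hash'
    if method == "hash":
        # Assumption: SHA-256. The README does not say which hash is used.
        return hashlib.sha256(content.encode("utf-8")).hexdigest()
    if method == "regex":
        # Return every match of the pattern as a list of strings.
        return re.findall(pattern, content)
    if method == "xpath":
        # lxml is imported lazily so the other methods work without it.
        from lxml import html
        return html.fromstring(content).xpath(pattern)
    raise ValueError(f"unknown method: {method!r}")


page = "<html><body><span class='price'>42 EUR</span></body></html>"
digest = filter_sketch(page)                         # hash of the whole page
prices = filter_sketch(page, "regex", r"(\d+) EUR")  # only the captured number
```

A hash reacts to any change on the page, while a regex or XPath pattern narrows the comparison to the part you care about.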

Execute with e.g. python3 scrap.py wgxp to start the function _wgxp(). I run it from cron.

Make sure to customize scraphead.py so that the filepath variable points to an existing log directory (maybe just mkdir /var/log/scrappy) and the communicate() function works. Mine uses Telegram, but I put Hangouts here for the sake of a demo.
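How filepath and communicate() might work together can be sketched as follows. This is only a guess at the general idea, not the contents of scraphead.py: report_if_changed, the ".last" state file, and the message format are hypothetical, and communicate() here just prints.

```python
import os
import tempfile


def communicate(message):
    # Placeholder: a real version would call Telegram, Hangouts, etc.
    print(message)


def report_if_changed(service, value, filepath, notify=communicate):
    """Compare value with the last one stored for this service; notify on change."""
    state_file = os.path.join(filepath, service + ".last")
    previous = None
    if os.path.exists(state_file):
        with open(state_file, encoding="utf-8") as handle:
            previous = handle.read()
    if value == previous:
        return False  # nothing new, stay quiet
    with open(state_file, "w", encoding="utf-8") as handle:
        handle.write(value)
    notify(f"{service}: changed to {value}")
    return True


# Demo in a temporary directory instead of /var/log/scrappy.
with tempfile.TemporaryDirectory() as log_dir:
    first = report_if_changed("wgxp", "42 EUR", log_dir)   # no stored value yet
    second = report_if_changed("wgxp", "42 EUR", log_dir)  # unchanged
    third = report_if_changed("wgxp", "43 EUR", log_dir)   # changed
```

In this sketch the very first run also counts as a change, because there is nothing stored to compare against.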

hangouts.py

The communicate() function calls hangouts.py by default. To create that file, you'll need Hangups, and then follow these steps:

  1. Take send_message_example.py

  2. At the beginning, add

import sys
CONVERSATION_ID = sys.argv[1]
MESSAGE = ' '.join(sys.argv[2:])

  3. Modify REFRESH_TOKEN_PATH to work with your username

  4. If Python >=3.7, then replace async( with ensure_future( (async is a reserved keyword from 3.7 on)
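The async( to ensure_future( replacement can be illustrated in isolation. The snippet below is not taken from send_message_example.py; the send() coroutine is a hypothetical placeholder that only shows asyncio.ensure_future() scheduling a coroutine as a task.

```python
import asyncio


async def send(message):
    # Hypothetical placeholder for the coroutine that sends the message.
    await asyncio.sleep(0)
    return f"sent: {message}"


async def main():
    # Older code: asyncio.async(send("hello")), a syntax error on Python >= 3.7.
    task = asyncio.ensure_future(send("hello"))
    return await task


result = asyncio.run(main())
```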

You can get the target conversation ID by grepping Hangups logs: grep -C 1 'conversation_id' /home/USERNAME/.cache/hangups/log/hangups.log
