
glw_weather's Introduction

glw_weather

Python 3: Scrape Weather Data from HTML: urllib3, BeautifulSoup, terminaltables

Two versions of a script that scrapes tabulated weather data from a Davis® Weatherlink Web page. Because the scripts depend on that page's HTML, they can easily break if the markup changes. The scripts were created for testing and personal use. The following modules are imported into one or both scripts:

  • urllib3 is the HTTP client; its PoolManager manages connection pools for multiple hosts.
  • BeautifulSoup is a Python wrapper around an HTML parser for traversing, searching and changing the parsed tree.
  • AsciiTable (from terminaltables) uses - for horizontal lines, | for vertical lines and + for intersections to construct tabular data grids.
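To make the grid description above concrete, here is a minimal pure-Python sketch of that kind of table. It is not terminaltables' implementation, only an illustration of how -, | and + fit together; render_grid and the sample readings are hypothetical and made up for this example.

```python
def render_grid(rows):
    """Draw rows (a list of equal-length lists) as a grid of -, | and +."""
    cells = [[str(cell) for cell in row] for row in rows]
    # Each column is as wide as its longest cell.
    widths = [max(len(row[i]) for row in cells) for i in range(len(cells[0]))]
    # Horizontal rule: dashes per column, joined by + at the intersections.
    rule = "+" + "+".join("-" * (w + 2) for w in widths) + "+"
    lines = [rule]
    for index, row in enumerate(cells):
        padded = (cell.ljust(w) for cell, w in zip(row, widths))
        lines.append("| " + " | ".join(padded) + " |")
        # Draw a rule after the heading row and after the last row.
        if index == 0 or index == len(cells) - 1:
            lines.append(rule)
    return "\n".join(lines)


# Made-up sample values, for illustration only.
sample = [["Reading", "Value"], ["Temp", "71.2"], ["Humidity", "40"]]
print(render_grid(sample))
```

The first row is treated as a heading, which is why a rule is drawn directly beneath it.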

Variables

  • http = urllib3.PoolManager(): urllib3's PoolManager() handles requests to arbitrary hosts and keeps track of the connection pools they need.
  • req = http.request("GET", "http://www.weatherlink.com/user/gooselakewx/index.php?view=summary&headers=0"): the GET request.
if req.status == 200:
    blob = req.data
else:
    print("Check req.status")
  • the conditional statement above checks for an OK connection status (HTTP 200). When the status is anything else, blob is never assigned.
  • soup = BeautifulSoup(blob, "html.parser"): parses the HTML so that text can be extracted from its elements. In this repo, most of those are <td> elements holding tabular data.
  • f'{soup.select("td:nth-of-type(14)")[0].string}': in this f-string, the select() method uses a CSS selector to target a specific <td> element; [0] takes the first match and .string returns its text.
table = AsciiTable(main_data)
r_table = AsciiTable(rainfall)
s_table = AsciiTable(soil)

print(table.table)
print(r_table.table)
print(s_table.table)

In the code above, the AsciiTable class takes the data arrays and formats them into the three tables shown below:

glw2


glw_weather's People

Contributors

nick3499


glw_weather's Issues

Incorrect value

Noticed an incorrect temperature value. The heat index, wind chill and dew point values were also incorrect, and there may be others.

NameError: name 'blob' is not defined

Since the request failed, blob = req.data never ran, so the later use of blob raised a NameError, but such is the life of a Web scraper. Both glw_weather_1.py and glw_weather_2.py are affected.
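One possible way to avoid the NameError is to return the page body from a small function, so that a failed request yields an explicit None instead of an unassigned name. This is only a sketch: fetch_html is a hypothetical helper, not part of either script, and it assumes the http argument behaves like urllib3.PoolManager().

```python
def fetch_html(http, url):
    """Return the response body as bytes, or None if the status is not 200.

    `http` is assumed to behave like urllib3.PoolManager(): it needs a
    request(method, url) method whose result has .status and .data.
    """
    req = http.request("GET", url)
    if req.status == 200:
        return req.data
    print(f"Check req.status: got {req.status}")
    return None


# Hypothetical usage in either script:
#   import urllib3
#   blob = fetch_html(urllib3.PoolManager(), url)
#   if blob is None:
#       raise SystemExit(1)  # stop before BeautifulSoup(blob, ...) runs
```

With this shape, the caller decides what to do when no page came back, rather than failing later at the first use of blob.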

"IndexError: list index out of range"

Traceback (most recent call last):
  File "/usr/local/sbin/glw2", line 79, in <module>
    "\n\033[32m     Temp 1:\033[0m", soup.select("td:nth-of-type(155)")[0].string, "\033[31m|\033[0m",
IndexError: list index out of range

The 155th <td> container held the first soil temperature reading. The rain data displayed correctly, so the soil temp data was possibly withheld. When I checked again later this evening, the soil temps displayed correctly.

Might add a try clause to handle IndexError, but not sure if missing soil temp data was actually the cause.
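A try clause along those lines might look like the sketch below. cell_text and its "n/a" placeholder are hypothetical, not taken from either script, and the sketch only keeps the script running when a cell is absent; it does not settle whether missing soil data was the cause.

```python
def cell_text(soup, n, default="n/a"):
    """Return the text of the first match for td:nth-of-type(n), or default.

    `soup` is assumed to be a BeautifulSoup object (anything with a
    compatible select() method works).
    """
    try:
        text = soup.select(f"td:nth-of-type({n})")[0].string
    except IndexError:
        # select() returned an empty list: no such <td> in this response.
        return default
    # .string is None when the cell is empty or holds more than one child node.
    return default if text is None else str(text)


# Hypothetical usage, replacing a direct lookup in the print call:
#   "\n\033[32m     Temp 1:\033[0m", cell_text(soup, 155), "\033[31m|\033[0m",
```

Printing a placeholder keeps the rest of the table readable and makes the missing cell visible in the output.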
