Comments (12)

GillesMoyse commented on August 15, 2024

Two things to fix in the notebook:

  • Use the new url_template: url_template = "http://climate.weather.gc.ca/climate_data/bulk_data_e.html?stationID=5415&Year={year}&Month={month}&format=csv&timeframe=1&submit=%20Download+Data"
  • Remove header=True from the read_csv call, leaving: weather_mar2012 = pd.read_csv("data/eng-hourly-03012012-03312012.csv", skiprows=15, index_col='Date/Time', parse_dates=True, encoding='latin1')
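
As a small illustration of how the template gets filled in, the sketch below formats it for March 2012 and parses the query string back out. Nothing is downloaded. It uses only the standard library, and the names url and params are just for illustration.

```python
from urllib.parse import urlsplit, parse_qs

url_template = (
    "http://climate.weather.gc.ca/climate_data/bulk_data_e.html"
    "?stationID=5415&Year={year}&Month={month}"
    "&format=csv&timeframe=1&submit=%20Download+Data"
)

# Fill in the placeholders for March 2012 (no request is made here).
url = url_template.format(year=2012, month=3)

# Parse the query string to check which parameters would be sent.
params = parse_qs(urlsplit(url).query)
print(params["Year"], params["Month"], params["format"])
```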

Sent a PR.

from pandas-cookbook.

andreas-h commented on August 15, 2024

Also, encoding='latin1' should be removed (at least on Python 3).

hsuanie commented on August 15, 2024

Hello. I tried the updated code, but I got the following error:
File b'data/eng-hourly-03012012-03312012.csv' does not exist

Please help. Thanks!

Enkerli commented on August 15, 2024

As of July 2018, the following works in Python 3:
In[]: url_template = "http://climate.weather.gc.ca/climate_data/bulk_data_e.html?stationID=5415&Year={year}&Month={month}&format=csv&timeframe=1&submit=%20Download+Data"

and:
In[]: url = url_template.format(month=3, year=2012)
weather_mar2012 = pd.read_csv(url, skiprows=15, index_col='Date/Time', parse_dates=True, encoding='utf-8', header=0)

An important change, apart from the URL itself, is that header accepts an integer (row number) instead of a boolean.
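
To illustrate the header change without hitting the network, here is a minimal sketch that reads a small in-memory CSV. The sample text and its two metadata lines are made up for the example (the real file has 15 such lines, hence skiprows=15 above). Only the read_csv arguments mirror the call above.

```python
import io
import pandas as pd

# Made-up sample: two metadata lines, then a header row and data rows.
sample = (
    "Station Name,EXAMPLE\n"
    "Province,EXAMPLE\n"
    "Date/Time,Temp (C)\n"
    "2012-03-01 00:00,-5.5\n"
    "2012-03-01 01:00,-5.7\n"
)

# skiprows drops the metadata; header=0 says the first remaining row
# holds the column names (an integer row number, not a boolean).
df = pd.read_csv(
    io.StringIO(sample),
    skiprows=2,
    header=0,
    index_col="Date/Time",
    parse_dates=True,
)
print(df)
```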

Because of the encoding change, we need to change this, as well:
In[]: weather_mar2012[u"Temp (°C)"].plot(figsize=(15, 5))

Also, the “Data Quality” column has disappeared, so the column handling needs adjusting:

In[]: weather_mar2012.columns = [
    u'Year', u'Month', u'Day', u'Time', u'Temp (C)',
    u'Temp Flag', u'Dew Point Temp (C)', u'Dew Point Temp Flag',
    u'Rel Hum (%)', u'Rel Hum Flag', u'Wind Dir (10s deg)', u'Wind Dir Flag',
    u'Wind Spd (km/h)', u'Wind Spd Flag', u'Visibility (km)', u'Visibility Flag',
    u'Stn Press (kPa)', u'Stn Press Flag', u'Hmdx', u'Hmdx Flag', u'Wind Chill',
    u'Wind Chill Flag', u'Weather']
In[]: weather_mar2012 = weather_mar2012.drop(['Year', 'Month', 'Day', 'Time'], axis=1)

In[]:

def download_weather_month(year, month):
    if month == 1:
        year += 1
    url = url_template.format(year=year, month=month)
    weather_data = pd.read_csv(url, skiprows=15, index_col='Date/Time', parse_dates=True, header=0)
    weather_data = weather_data.dropna(axis=1)
    weather_data.columns = [col.replace('\xb0', '') for col in weather_data.columns]
    weather_data = weather_data.drop(['Year', 'Day', 'Month', 'Time'], axis=1)
    return weather_data

mvresh commented on August 15, 2024

When I use the URL template and the weather data to compare temperatures with the bike data, the code does not seem to work. I updated the URL template and made the required changes in the later parts, and everything runs without errors. But when I try to output the first three rows of the data, nothing is shown.

mvresh commented on August 15, 2024

Here's the code:

```
# getting weather data to look at temps

def get_weather_data(year):
    url_template = "http://climate.weather.gc.ca/climate_data/bulk_data_e.html?stationID=5415&Year={year}&Month={month}&format=csv&timeframe=1&submit=%20Download+Data"

    # airport station is 5415, hence that was used

    data_by_month = []

    for month in range(1, 13):
        url = url_template.format(year=year, month=month)
        weather_data = pd.read_csv(url, skiprows=15, index_col='Date/Time', parse_dates=True, encoding='utf-8', header=0)
        weather_data.columns = map(lambda x: x.replace('\xb0', ''), weather_data.columns)

        # \xb0 is the degree symbol

        weather_data = weather_data.drop(['Year', 'Day', 'Month', 'Time'], axis=1)
        data_by_month.append(weather_data.dropna())

    return pd.concat(data_by_month).dropna(axis=1, how='all').dropna()

weather_data = get_weather_data(2012)

weather_data[:5]
```
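
One plausible cause of the empty output (an assumption, not verified against the live data) is the bare dropna() inside the loop. It drops every row that has a NaN in any column, and columns such as the flag columns are often entirely empty. The sketch below reproduces the effect on a made-up two-column frame.

```python
import numpy as np
import pandas as pd

# Made-up data: temperatures are present, the flag column is all NaN.
df = pd.DataFrame(
    {"Temp (C)": [-5.5, -5.7, -5.4], "Temp Flag": [np.nan, np.nan, np.nan]}
)

# Row-wise dropna (the default) removes every row that contains a NaN,
# so an all-NaN column empties the whole frame.
rows_dropped = df.dropna()

# Dropping all-NaN columns first keeps the rows that have real data.
cols_then_rows = df.dropna(axis=1, how="all").dropna()

print(len(rows_dropped), len(cols_then_rows))
```

If this is what is happening, one thing to try is to drop the all-NaN columns before any row-wise dropna, or to skip the row-wise dropna inside the loop altogether.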

kbridge commented on August 15, 2024

url_template = "http://climate.weather.gc.ca/climate_data/bulk_data_e.html?stationID=5415&Year={year}&Month={month}&format=csv&timeframe=1&submit=%20Download+Data"
# url_template = 'https://raw.githubusercontent.com/kbridge/weather-data/main/weather_data_{year}_{month}.csv'
url = url_template.format(month=3, year=2012)
weather_mar2012 = pd.read_csv(url, index_col='Date/Time (LST)', parse_dates=True, encoding='utf-8-sig')

Summary:

  • url_template is the same as the one @GillesMoyse posted. That URL is slow to load, so you can switch to my mirror instead (the commented-out url_template above).
  • header=True is removed.
  • skiprows=15 is removed because there is no longer any metadata before the CSV data.
  • index_col is changed from 'Date/Time' to 'Date/Time (LST)'.
  • encoding is changed from 'latin1' to 'utf-8-sig'. We need the -sig variant to skip the UTF-8 BOM; otherwise, the name of the first column will start with stray characters.
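
The effect of the BOM can be seen without pandas. The sketch below decodes a made-up CSV snippet that starts with a UTF-8 BOM using both codecs. The bytes are invented for the example, and first_header is a hypothetical helper.

```python
import csv
import io

# Made-up CSV bytes that begin with the UTF-8 byte order mark (EF BB BF).
raw = b'\xef\xbb\xbf"Date/Time (LST)","Temp (C)"\n"2012-03-01 00:00","-5.5"\n'

def first_header(encoding):
    # Decode the bytes, then let the csv module parse the header row.
    text = raw.decode(encoding)
    return next(csv.reader(io.StringIO(text)))[0]

plain = first_header("utf-8")      # BOM survives as '\ufeff' in the text
sig = first_header("utf-8-sig")    # BOM is stripped during decoding

print(repr(plain), repr(sig))
```

With plain 'utf-8' the first column name no longer matches 'Date/Time (LST)', which is why index_col lookups on that name can fail.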

kbridge commented on August 15, 2024

Before renaming the columns to eliminate ° characters, drop some unexpected new columns first:

weather_mar2012 = weather_mar2012.drop(['Longitude (x)', 'Latitude (y)', 'Station Name', 'Climate ID', 'Precip. Amount (mm)', 'Precip. Amount Flag'], axis=1)

And the renaming code becomes

weather_mar2012.columns = [
    u'Year', u'Month', u'Day', u'Time', u'Temp (C)', 
    u'Temp Flag', u'Dew Point Temp (C)', u'Dew Point Temp Flag', 
    u'Rel Hum (%)', u'Rel Hum Flag', u'Wind Dir (10s deg)', u'Wind Dir Flag', 
    u'Wind Spd (km/h)', u'Wind Spd Flag', u'Visibility (km)', u'Visibility Flag',
    u'Stn Press (kPa)', u'Stn Press Flag', u'Hmdx', u'Hmdx Flag', u'Wind Chill', 
    u'Wind Chill Flag', u'Weather']

The Data Quality column is removed from the list because the new data no longer contains it.

This also renames the column Time (LST) to Time.
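
Assigning a full list to .columns depends on the column order and count staying fixed. As an alternative sketch (not the approach used above), the names can be cleaned by rule with DataFrame.rename. The tiny frame is made up for illustration.

```python
import pandas as pd

# Made-up frame using a few of the column names from the new CSV format.
df = pd.DataFrame(
    {
        "Time (LST)": ["00:00"],
        "Temp (\xb0C)": [-5.5],
        "Dew Point Temp (\xb0C)": [-9.7],
    }
)

# Strip the degree sign (U+00B0) from every column name ...
df = df.rename(columns=lambda name: name.replace("\xb0", ""))
# ... and rename 'Time (LST)' back to the name the notebook expects.
df = df.rename(columns={"Time (LST)": "Time"})

print(list(df.columns))
```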

kbridge commented on August 15, 2024

No need to drop the column Data Quality anymore:

-weather_mar2012 = weather_mar2012.drop(['Year', 'Month', 'Day', 'Time', 'Data Quality'], axis=1)
+weather_mar2012 = weather_mar2012.drop(['Year', 'Month', 'Day', 'Time'], axis=1)
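
If the same notebook has to handle both old files (with Data Quality) and new ones (without it), one option is DataFrame.drop with errors='ignore', which skips labels that are not present. This is a sketch on a made-up frame, not something the notebook itself does.

```python
import pandas as pd

# Made-up frame in the new layout: there is no 'Data Quality' column.
df = pd.DataFrame(
    {"Year": [2012], "Month": [3], "Day": [1], "Time": ["00:00"], "Temp (C)": [-5.5]}
)

# errors='ignore' drops the labels that exist and silently skips the rest,
# so the same line works whether or not 'Data Quality' is present.
to_drop = ["Year", "Month", "Day", "Time", "Data Quality"]
cleaned = df.drop(to_drop, axis=1, errors="ignore")

print(list(cleaned.columns))
```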

kbridge commented on August 15, 2024

temperatures.head is a method, so it needs to be called:

-print(temperatures.head)
+print(temperatures.head())
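
The difference is easy to see on any Series. In this sketch, temperatures is a made-up stand-in for the notebook's variable.

```python
import pandas as pd

# Made-up stand-in for the notebook's temperatures Series.
temperatures = pd.Series([-5.5, -5.7, -5.4, -4.7, -5.4, -5.3], name="Temperature")

# Without parentheses this is the bound method object itself, not data.
method = temperatures.head

# Calling it returns the first five rows by default.
first_rows = temperatures.head()

print(callable(method), len(first_rows))
```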

kbridge commented on August 15, 2024

Change download_weather_month to this:

# mirror
# url_template = 'https://raw.githubusercontent.com/kbridge/weather-data/main/weather_data_{year}_{month}.csv'

def download_weather_month(year, month):
    url = url_template.format(year=year, month=month)
    weather_data = pd.read_csv(url, index_col='Date/Time (LST)', parse_dates=True, encoding='utf-8-sig')
    weather_data = weather_data.dropna(axis=1)
    weather_data.columns = [col.replace('\xb0', '') for col in weather_data.columns]
    weather_data = weather_data.drop([
        'Year',
        'Day',
        'Month',
        'Time (LST)',
        'Longitude (x)',
        'Latitude (y)',
        'Station Name',
        'Climate ID',
    ], axis=1)
    return weather_data

which was

def download_weather_month(year, month):
    if month == 1:
        year += 1
    url = url_template.format(year=year, month=month)
    weather_data = pd.read_csv(url, skiprows=15, index_col='Date/Time', parse_dates=True, header=True)
    weather_data = weather_data.dropna(axis=1)
    weather_data.columns = [col.replace('\xb0', '') for col in weather_data.columns]
    weather_data = weather_data.drop(['Year', 'Day', 'Month', 'Time', 'Data Quality'], axis=1)
    return weather_data

kbridge commented on August 15, 2024

Sorry for using this issue as if it were my own memo. I will be glad if my comments help you.
