purple_air_api's Introduction

The API this library uses has been deprecated!

Please see issue #69 for more information.


PurpleAir API

A Python 3.x module to turn data from the PurpleAir/ThingSpeak API into a Pandas DataFrame safely, with many utility methods and clear errors.

Global Sensor Map with Celsius Scale

Installation

  • To use
  • To hack
    • Clone this repo
    • cd to the folder
    • Create a virtual environment
      • python -m venv venv
    • Activate the virtual environment
      • source venv/bin/activate
    • Install as dependency in the virtual environment
      • python setup.py develop
    • Install third party dependencies
      • Required to run: pip install -r requirements/common.txt
      • Install development requirements with pip install -r requirements/dev.txt
      • Install example file requirements with pip install -r requirements/examples.txt

Frequently Asked Questions

Before opening a new ticket, please refer to the FAQ document.

Example code

For detailed documentation, see the docs file.

Listing all useful sensors

from purpleair.network import SensorList
p = SensorList()  # Initialized 23,145 sensors!
useful = [s for s in p.all_sensors if s.is_useful()]  # List of sensors with no defects
print(len(useful))  # 17,426

Get location for a single sensor

from purpleair.sensor import Sensor
s = Sensor(2890, parse_location=True)
print(s)  # Sensor 2891 at 10834, Canyon Road, Omaha, Douglas County, Nebraska, 68112, USA

Make a DataFrame from all current sensor data

from purpleair.network import SensorList
p = SensorList()  # Initialized 11,220 sensors!
# Other sensor filters include 'outside', 'useful', and 'family'
df = p.to_dataframe(sensor_filter='all',
                    channel='parent')

Result:

             lat         lon                          name location_type  pm_2.5  temp_f     temp_c  ...  downgraded age 10min_avg 30min_avg  1hour_avg  6hour_avg  1week_avg
id                                                                                                   ...
14633  37.275561 -121.964134             Hazelwood canary        outside    7.15    92.0  33.333333  ...       False   1      6.50      5.13       4.11      12.44      42.94
25999  30.053808  -95.494643   Villages of Bridgestone AQI       outside   10.16   103.0  39.444444  ...       False   1      9.96     10.63      12.51      18.40      14.55
14091  37.883620 -122.070087                   WC Hillside       outside   11.36    89.0  31.666667  ...       False   1     10.31      8.74       7.21      20.03      63.44
42073  47.185173 -122.176855                            #1       outside   99.46    73.0  22.777778  ...       False   0    100.06    100.31     101.36     106.93      68.40
53069  47.190197 -122.177992                            #2       outside  109.82    79.0  26.111111  ...       False   0    109.52    108.72     109.33     116.64      74.52

Make a DataFrame from all current sensors that have a 10 minute average pm2.5 value

from purpleair.network import SensorList
p = SensorList()  # Initialized 11,220 sensors!
# If `sensor_filter` is set to 'column' then we must also provide a value for `column`
df = p.to_dataframe(sensor_filter='column',
                    channel='parent',
                    column='m10avg')  # See Channel docs for all column options
print(len(df))  # 10,723

Get historical data for parent sensor secondary channel

from purpleair.sensor import Sensor
se = Sensor(2890)
df = se.parent.get_historical(weeks_to_get=1,
                              thingspeak_field='secondary')
print(df.head())

Result:

                        created_at  0.3um/dl  0.5um/dl  1.0um/dl  2.5um/dl  5.0um/dl  10.0um/dl  PM1.0_CF_1_ug/m3  PM10.0_CF_1_ug/m3
entry_id
1005219  2020-09-09 00:01:06+00:00    194.84     61.16      5.53      0.00      0.00       0.00              0.45               0.60
1005220  2020-09-09 00:03:06+00:00    224.95     69.07      4.19      0.00      0.00       0.00              0.63               0.86
1005221  2020-09-09 00:05:06+00:00    238.37     71.58      5.42      0.02      0.02       0.02              0.51               0.88
1005222  2020-09-09 00:07:06+00:00    259.61     79.00      8.11      0.96      0.43       0.43              0.71               1.48
1005223  2020-09-09 00:09:06+00:00    254.69     76.66      6.47      0.81      0.67       0.00              0.95               1.50

Get historical data for child sensor primary channel

from purpleair.sensor import Sensor
se = Sensor(2890)
df = se.child.get_historical(weeks_to_get=1,
                             thingspeak_field='primary')
print(df.head())

Result:

                        created_at  PM1.0 (CF=1) ug/m3  PM2.5 (CF=1) ug/m3  PM10.0 (CF=1) ug/m3  UptimeMinutes  RSSI_dbm  Pressure_hpa  Blank  PM2.5 (CF=ATM) ug/m3
entry_id
1002561  2020-09-09 00:01:09+00:00                1.03                1.41                 1.41        18136.0      0.01        982.25    NaN              1.41
1002562  2020-09-09 00:03:09+00:00                1.07                1.60                 1.60        18136.0      0.01        982.18    NaN              1.60
1002563  2020-09-09 00:05:09+00:00                1.28                1.59                 1.76        18136.0      0.01        982.10    NaN              1.59
1002564  2020-09-09 00:07:09+00:00                1.33                1.71                 1.71        18136.0      0.01        982.21    NaN              1.71
1002565  2020-09-09 00:09:09+00:00                1.25                1.86                 1.86        18136.0      0.01        982.18    NaN              1.86

See examples in /scripts for more detail.

purple_air_api's People

Contributors

dependabot[bot], jamesbeldock, pjrobertson, reagentx

purple_air_api's Issues

Sensor a and sensor b

Hi Christopher,

Thanks for a useful package!

I've noticed that the information that the package pulls when you specify the B sensor in Sensor.get_historical() is actually the secondary dataset for the A sensor.

Something like this in Sensor.get_data() would get the b sensor.

    response = requests.get(f'{API_ROOT}?show={self.identifier}')
    data = json.loads(response.content)
    a_channel = data['results'][0]
    b_channel = data['results'][1]

If you check the 'Label' key for a_channel and b_channel, you'll see that the B sensor's label has "B" appended to it. The data returned from the keys in the B sensor in this way are also consistent with the column headers. The thingspeak csv requested using data['results'][1]['THINGSPEAK_PRIMARY_ID'] and data['results'][1]['THINGSPEAK_PRIMARY_ID_READ_KEY'] lines up nicely with the primary B sensor header from https://www2.purpleair.com/community/faq#!hc-primary-and-secondary-data-header, but the "unused" column for Sensor.get_historical(sensor_channel='b', weeks_to_get=1) contains data at the moment, because the function is pulling data from the wrong source.
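As a rough illustration of the suggestion above, the two entries in 'results' could be separated by checking for a 'ParentID' key rather than relying on list order. This is only a sketch: split_channels is a hypothetical helper, the sample records are made up, and it assumes the B entry is the one that carries 'ParentID'.

```python
from typing import Dict, List, Optional, Tuple

Channel = Dict[str, object]


def split_channels(results: List[Channel]) -> Tuple[Channel, Optional[Channel]]:
    """Return (a_channel, b_channel) from the 'results' list of one sensor.

    Assumption: the B entry is the only one that has a 'ParentID' key.
    """
    parents = [c for c in results if "ParentID" not in c]
    children = [c for c in results if "ParentID" in c]
    if not parents:
        raise ValueError("no A (parent) channel in response")
    return parents[0], (children[0] if children else None)


# Made-up records shaped like the JSON described above.
sample_results = [
    {"ID": 1, "Label": "Example sensor", "THINGSPEAK_PRIMARY_ID": "111"},
    {"ID": 2, "ParentID": 1, "Label": "Example sensor B", "THINGSPEAK_PRIMARY_ID": "222"},
]
a_channel, b_channel = split_channels(sample_results)
print(a_channel["Label"], "|", b_channel["Label"])
```

With the channels separated this way, the ThingSpeak IDs and read keys for the B sensor would be read from b_channel instead of from the A sensor's secondary fields.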

Thanks again. I'm happy to collaborate on this if you'd like.

Charlie

Custom Time Range for get_historical

Motivation

Currently, get_historical collects some number of weeks of data relative to "today". There could be special events which would motivate collecting data from specific sensors on/around specific dates. The following would support this use case:

  • Channel.get_historical should support either a relative time frame, or a custom start and end period.
  • The behavior on custom periods should mimic whatever is needed to respect API limits on ThingSpeak. One approach might be to partition the date range into week-long chunks.
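The week-long partitioning idea could look roughly like the sketch below. week_chunks is a hypothetical helper, not part of the library, and it only produces the date boundaries; each pair would still have to be passed to whatever request the library makes.

```python
from datetime import datetime, timedelta
from typing import Iterator, Tuple


def week_chunks(start: datetime, end: datetime) -> Iterator[Tuple[datetime, datetime]]:
    """Yield consecutive (chunk_start, chunk_end) pairs covering start..end.

    Every chunk is at most one week long; the last one may be shorter.
    """
    if end <= start:
        raise ValueError("end must be after start")
    step = timedelta(weeks=1)
    chunk_start = start
    while chunk_start < end:
        chunk_end = min(chunk_start + step, end)
        yield chunk_start, chunk_end
        chunk_start = chunk_end


chunks = list(week_chunks(datetime(2020, 9, 1), datetime(2020, 9, 20)))
for chunk_start, chunk_end in chunks:
    print(chunk_start.date(), "to", chunk_end.date())
```

Using half-open chunks (each one ends where the next begins) avoids requesting the same timestamp twice.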

Thank you for all your work on this wrapper, it has saved a lot of time.

RuntimeError: Polyfit sanity test emitted a warning, most likely due to using a buggy Accelerate backend.

Installing this package in an environment that uses Python 3.9.0 will result in the following error:

ImportError: Failed to import test module: purpleair
Traceback (most recent call last):
  File "/Users/chris/.pyenv/versions/3.9.0/lib/python3.9/unittest/loader.py", line 436, in _find_test_path
    module = self._get_module_from_name(name)
  File "/Users/chris/.pyenv/versions/3.9.0/lib/python3.9/unittest/loader.py", line 377, in _get_module_from_name
    __import__(name)
  File "/Users/chris/Documents/Code/Python/purple_air_api/tests/test_sensor.py", line 3, in <module>
    from purpleair import sensor
  File "/Users/chris/Documents/Code/Python/purple_air_api/purpleair/sensor.py", line 15, in <module>
    from .channel import Channel
  File "/Users/chris/Documents/Code/Python/purple_air_api/purpleair/channel.py", line 9, in <module>
    import pandas as pd
  File "/Users/chris/Documents/Code/Python/purple_air_api/venv/lib/python3.9/site-packages/pandas/__init__.py", line 11, in <module>
    __import__(dependency)
  File "/Users/chris/Documents/Code/Python/purple_air_api/venv/lib/python3.9/site-packages/numpy/__init__.py", line 286, in <module>
    raise RuntimeError(msg)
RuntimeError: Polyfit sanity test emitted a warning, most likely due to using a buggy Accelerate backend. If you compiled yourself, see site.cfg.example for information. Otherwise report this to the vendor that provided NumPy.
RankWarning: Polyfit may be poorly conditioned

This is because no NumPy wheels have been built for Python 3.9 yet. Since this package depends on NumPy (it is a dependency of pandas), it will not work until NumPy works on Python 3.9.0.

Limit to Historical Data

I am trying to pull data for the first six months of 2020 for all of the Bay Area, and it appears that I am not able to query that far back. I checked with PurpleAir and the data should be accessible for that time period. Is there a restriction applied through this API that could be creating this issue?

Thanks for the help!

Write contributing guidelines

  • Release branch is for releases only, all PRs should be to develop
  • No pull request shall be behind develop
  • First come, first served
  • If anything breaks, the pull request will be queued again when the issue is resolved
  • Pull request comments will be resolved by the person who created them

Error in retrieving all sensor list

Hello,
I am using version 1.2.1 and trying to run the demo code that imports all the available sensors:

from purpleair.network import SensorList
p = SensorList()  # Initialized 11,220 sensors!
# Other sensor filters include 'outside', 'useful', 'family', and 'no_child'
df = p.to_dataframe(sensor_filter='all',
                    channel='parent')

And it returns:

ValueError: No sensor data returned from PurpleAir: An empty querystring is not permitted. Please contact PurpleAir at [email protected] for assistance.

Is there another way to do this?

Thanks!

Missing Parent Problem

I've run into the "Child 5843 lists parent 5842, but parent does not exist!" error and I read through the FAQ and found that deleting the cache.sqlite file might help, but unfortunately it did not. I also see that this is an issue on purpleair's side, not this program, but it entirely prevents SensorList from working while the issue persists. Would it be viable for SensorList to accept a flag that would tell it to ignore these kinds of mismatches when constructing the sensor network?
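One way such a flag might behave is sketched below. Both pair_channels and ignore_missing_parents are hypothetical names, and the records are simplified stand-ins that assume child entries carry a 'ParentID' key.

```python
import logging
from typing import Dict, List

logger = logging.getLogger(__name__)


def pair_channels(records: List[dict], ignore_missing_parents: bool = False) -> Dict[int, dict]:
    """Group raw records into {parent_id: {'parent': record, 'child': record or None}}."""
    network = {
        r["ID"]: {"parent": r, "child": None}
        for r in records
        if "ParentID" not in r
    }
    for record in records:
        parent_id = record.get("ParentID")
        if parent_id is None:
            continue  # parents were handled above
        if parent_id not in network:
            message = f"Child {record['ID']} lists parent {parent_id}, but parent does not exist!"
            if ignore_missing_parents:
                logger.warning(message)  # skip the orphan instead of failing
                continue
            raise ValueError(message)
        network[parent_id]["child"] = record
    return network


# Simplified records: one complete pair and one orphaned child.
records = [{"ID": 1}, {"ID": 2, "ParentID": 1}, {"ID": 5843, "ParentID": 5842}]
network = pair_channels(records, ignore_missing_parents=True)
print(sorted(network))
```

Keeping the default at False preserves the current strict behavior, so only callers who opt in would silently lose orphaned children.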

Field names disagreement with the PurpleAir documentation

The column names of the DataFrame disagree with the PurpleAir documentation. The fields actually correspond to the column names used until 20 October 2019. For example, PM1.0_CF_ATM_ug/m3 is actually PM1.0_CF1_ug/m3 according to the documentation, and field5 of PARENT_PRIMARY_COLS is RSSI instead of ADC.

PARENT_PRIMARY_COLS = {
    'created_at': 'created_at',
    'entry_id': 'entry_id',
    'field1': 'PM1.0_CF_ATM_ug/m3',
    'field2': 'PM2.5_CF_ATM_ug/m3',
    'field3': 'PM10.0_CF_ATM_ug/m3',
    'field4': 'UptimeMinutes',
    'field5': 'ADC',
    'field6': 'Temperature_F',
    'field7': 'Humidity_%',
    'field8': 'PM2.5_CF_1_ug/m3',
}
PARENT_SECONDARY_COLS = {
    'created_at': 'created_at',
    'entry_id': 'entry_id',
    'field1': '0.3um/dl',
    'field2': '0.5um/dl',
    'field3': '1.0um/dl',
    'field4': '2.5um/dl',
    'field5': '5.0um/dl',
    'field6': '10.0um/dl',
    'field7': 'PM1.0_CF_1_ug/m3',
    'field8': 'PM10.0_CF_1_ug/m3',
}
CHILD_PRIMARY_COLS = {
    'created_at': 'created_at',
    'entry_id': 'entry_id',
    'field1': 'PM1.0_CF_ATM_ug/m3',
    'field2': 'PM2.5_CF_ATM_ug/m3',
    'field3': 'PM10.0_CF_ATM_ug/m3',
    'field4': 'UptimeMinutes',
    'field5': 'RSSI_dbm',
    'field6': 'Pressure_hpa',
    'field7': 'Blank',
    'field8': 'PM2.5_CF_1_ug/m3',
}
CHILD_SECONDARY_COLS = {
    'created_at': 'created_at',
    'entry_id': 'entry_id',
    'field1': '0.3um/dl',
    'field2': '0.5um/dl',
    'field3': '1.0um/dl',
    'field4': '2.5um/dl',
    'field5': '5.0um/dl',
    'field6': '10.0um/dl',
    'field7': 'PM1.0_CF_1_ug/m3',
    'field8': 'PM10.0_CF_1_ug/m3'
}

Sensor ID 60333 has incorrect key name `THINGSPEAY_ID_READ_KEY`

{
    "ID": 60333,
    "Label": "Burbank",
    "DEVICE_LOCATIONTYPE": "outside",
    "THINGSPEAK_PRIMARY_ID": "1108161",
    "THINGSPEAY_ID_READ_KEY": "Y9HNPG5JGYXR5Q3A",
    "THINGSPEAK_SECONDARY_ID": "1108162",
    "THINGSPEAK_SECONDARY_ID_READ_KEY": "NJZ5F2M4NH9CCD6Q",
    "Lat": 37.769549,
    "Lon": -122.271873,
    "PM2_5Value": "4.07",
    "LastSeen": 1600291467,
    "Type": "PMS5003+PMS5003+BME280",
    "Hidden": "false",
    "isOwner": 0,
    "humidity": "35",
    "temp_f": "91",
    "pressure": "1017.05",
    "AGE": 1,
    "Stats": "{\"v\":4.07,\"v1\":4.25,\"v2\":4.29,\"v3\":3.95,\"v4\":11.08,\"v5\":58.25,\"v6\":39.72,\"pm\":4.07,\"lastModified\":1600291467183,\"timeSinceModified\":119995}"
}
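A tolerant lookup is one possible way to cope with a record like the one above. The sketch below is an assumption about how a client could handle it, not how the library does; primary_read_key is a hypothetical helper and the key value is a placeholder.

```python
from typing import Mapping, Optional

# The expected key name first, then the misspelled variant seen in the record above.
PRIMARY_READ_KEY_NAMES = ("THINGSPEAK_PRIMARY_ID_READ_KEY", "THINGSPEAY_ID_READ_KEY")


def primary_read_key(record: Mapping[str, object]) -> Optional[str]:
    """Return the primary ThingSpeak read key under any known spelling, or None."""
    for name in PRIMARY_READ_KEY_NAMES:
        value = record.get(name)
        if value:
            return str(value)
    return None


# Placeholder record that only has the misspelled key.
record = {"ID": 60333, "Label": "Burbank", "THINGSPEAY_ID_READ_KEY": "EXAMPLEKEY"}
print(primary_read_key(record))
```

Listing the accepted spellings in one tuple keeps the workaround in a single place, so it can be dropped once the upstream data is corrected.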

ValueError: Invalid JSON data returned from network!

Hitting a ValueError when the JSON response is parsed.

Reproduced with:

from purpleair.network import SensorList
p = SensorList()
json.decoder.JSONDecodeError: Expecting ',' delimiter: line 1 column 5999909 (char 5999908)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/bmosley/code/py_aqi/env/lib/python3.8/site-packages/purpleair/network.py", line 23, in __init__
    self.get_all_data()
  File "/Users/bmosley/code/py_aqi/env/lib/python3.8/site-packages/purpleair/network.py", line 38, in get_all_data
    raise ValueError(
ValueError: Invalid JSON data returned from network!
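The traceback shows a ValueError chained from a JSONDecodeError. A minimal sketch of that pattern, extended to include the text around the failing position, is shown below. parse_network_response is a hypothetical function and is not taken from network.py.

```python
import json


def parse_network_response(text: str) -> object:
    """Parse a JSON response body, raising ValueError with context on failure."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as error:
        # Show a short window around the failing character to ease debugging.
        snippet = text[max(error.pos - 40, 0):error.pos + 40]
        raise ValueError(
            f"Invalid JSON data returned from network! Near: {snippet!r}"
        ) from error


try:
    parse_network_response('{"a": 1 "b": 2}')  # missing comma on purpose
except ValueError as error:
    print(error)
    print(type(error.__cause__).__name__)
```

Including the nearby text helps tell a truncated download apart from a malformed record in the middle of the payload.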

Help me in downloading the data by subsetting spatially and temporally

Hi,

Using this package, I can only download data for a specific time period by specifying individual sensors. However, I want to look at all the sensors in a particular geographic area and use their data for specific time periods. In short, I need to subset PurpleAir data both spatially and temporally.
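The spatial half of this request could be approached with a simple bounding-box filter, sketched below. in_bounding_box is a hypothetical helper; it assumes each sensor is available as a mapping with 'lat' and 'lon' keys, matching the column names in the README output. The coordinates are two rows from that output.

```python
from typing import Iterable, List, Mapping


def in_bounding_box(rows: Iterable[Mapping[str, object]],
                    south: float, west: float,
                    north: float, east: float) -> List[Mapping[str, object]]:
    """Keep rows whose 'lat'/'lon' fall inside the given box (inclusive)."""
    selected = []
    for row in rows:
        lat, lon = row.get("lat"), row.get("lon")
        if lat is None or lon is None:
            continue  # sensors without a location cannot be placed
        if south <= lat <= north and west <= lon <= east:
            selected.append(row)
    return selected


rows = [
    {"id": 14633, "lat": 37.275561, "lon": -121.964134},
    {"id": 42073, "lat": 47.185173, "lon": -122.176855},
]
# A rough box around the San Francisco Bay Area, chosen only for illustration.
nearby = in_bounding_box(rows, south=37.0, west=-123.0, north=38.5, east=-121.5)
print([row["id"] for row in nearby])
```

The temporal half would then be a matter of requesting historical data for each remaining sensor ID over the period of interest.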

Thanks for your help,
Praful

Parent sensors constructed from Child IDs have incorrect identifiers

>>> cse = Sensor(6643)
Child sensor requested, acquiring parent instead.
>>> cse.identifier
6643

The identifier should be the parent’s ID (6642), because cse is filled with the data from 6642:

if channel_data and len(channel_data) == 1:
    print('Child sensor requested, acquiring parent instead.')
    try:
        parent_id = channel_data[0]["ParentID"]

However, the library naively assigns the identifier from the Sensor.__init__() construction call, regardless of what data it gets filled with.

self.identifier = identifier
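A possible fix is to derive the identifier from the data that was actually loaded. The sketch below uses a hypothetical helper, resolve_identifier, and assumes the same channel_data shape as the snippet above: a single entry with a 'ParentID' key means a child was requested.

```python
from typing import List, Mapping


def resolve_identifier(requested_id: int, channel_data: List[Mapping[str, object]]) -> int:
    """Return the parent's ID when the requested ID turned out to be a child."""
    if channel_data and len(channel_data) == 1:
        parent_id = channel_data[0].get("ParentID")
        if parent_id is not None:
            return int(parent_id)
    return requested_id


# Requesting child 6643 should resolve to its parent, 6642.
print(resolve_identifier(6643, [{"ID": 6643, "ParentID": 6642}]))
```

The constructor could then assign the resolved value instead of the raw argument, so that the identifier always matches the data the object holds.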

Unable to open database file

Thank you for putting together a great Python tool that works with pandas DataFrames. When I load the library with "from purpleair.network import Sensor" or "from purpleair.network import SensorList", I get the following error at the end of a long traceback: sqlite3.OperationalError: unable to open database file. Do I need to use an API key or something to gain access? Thank you.

Better name for PurpleAir class

Using

from purpleair.purpleair import PurpleAir

feels repetitive. Perhaps we should call it network, although

from purpleair.network import Network

does not seem much better.

Host as PyPI package

This will remove the requirement to manually install this package through git.

Example code is out of date/wrong

Hi,

this code

from purpleair.network import SensorList
p = SensorList()  # Initialized 11,220 sensors!
print(len(p.useful_sensors))  # 10047, List of sensors with no defects

should be

from purpleair.network import SensorList
p = SensorList()  # Initialized 11,220 sensors!
print(len(p.all_sensors))

Write better documentation

Currently, all sample code is just in the readme, but it is incomplete and does not actually document anything.

Discontinuation of the /json and /data.json urls

I think they discontinued the "/json" and "/data.json" URLs a few days ago, and that's why your "Purpleair" library is not working.

Link: https://community.purpleair.com/t/discontinuation-of-the-json-and-data-json-urls/713
"After a few years of grace period, we are now redirecting these two URLs (www.purpleair.com/json and www.purpleair.com/data.json ) to a server that will not respond.
Please contact us if you have any questions or need any help getting going on our new API, at https://api.purpleair.com ."

Feature: network filter based on available columns

Right now, the parameter sensor_filter of to_dataframe() only filters on the options {'useful', 'outside', 'all'}. We could include more filters here:

  • Column name to only include sensors that do not have a None in the given column name
  • Sensors with/without a child
  • Sensors with/without Stats
  • etc
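The first proposed filter could be sketched as follows. filter_by_column is a hypothetical helper that works on plain dictionaries rather than the library's sensor objects, and the rows are made up, using the '10min_avg' column name from the README output.

```python
from typing import Iterable, List, Mapping


def filter_by_column(rows: Iterable[Mapping[str, object]], column: str) -> List[Mapping[str, object]]:
    """Keep only rows that have a non-None value in the given column."""
    return [row for row in rows if row.get(column) is not None]


# Made-up rows; the second one has no 10 minute average.
rows = [
    {"id": 1, "10min_avg": 6.5},
    {"id": 2, "10min_avg": None},
    {"id": 3},
]
kept = filter_by_column(rows, "10min_avg")
print([row["id"] for row in kept])
```

Treating a missing key the same as None keeps the filter usable even when some records lack the column entirely.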
