ozon3org / ozon3 Goto Github PK

An open-source Python package to easily obtain real-time, historical, or forecasted air quality data for anywhere in the world. Reliable, accurate and simple.

License: GNU General Public License v3.0

Python 100.00%

api-wrapper python air-quality api open-source python3 python-package python3-package air-quality-index air-quality-data

ozon3's People

Contributors

Stargazers

Watchers

ozon3's Issues

Remove 'Support or Contact' heading from the API documentation website

GitHub Pages automatically adds this in, but it's not required and shouldn't be there.

use the gh-pages branch for this
Do not directly commit to it but instead make a PR from your own fork

We require a proper API documentation website, for the primary docs of Ozone. The README file cannot explain the methods in enough detail.

Any suggestions for what site/service to use?

MyPy flags a type incompatibility in line 436 of Ozone.py. This is in the get_specific_parameter method

The problem

Here's the warning that's displayed:

Argument of type "Series[Unknown]" cannot be assigned to parameter "x" of type "SupportsFloat | SupportsIndex | str | bytes |

This is the line of code that causes this:

result = float(row[air_param])

Possible explanation

Here, row is a Pandas dataframe, and air_param is a column in it

This warning, if I understand correctly, is basically indicating that MyPy has no guarantee that the value stored in that cell is a number. Due to this, it's not comfortable with us type casting it as a float while assigning it to result.

Possible solution

Better implementation of get_specific_parameter method. This method currently creates an entire dataframe by calling get_city_air, when all it requires is a single number. A more lightweight method of implementing it would be to make an API call, and then get the required parameter directly from the JSON response, and return it from there.

Add IDE setting files to .gitignore file

A lot of contributors might be using different integrated development environments (IDE) when committing their changes and editing their code, and might accidentally include the IDE’s specific settings file as part of their committed changes when making their pull request.
To prevent this from occurring, a line should be added to the .gitignore file that matches these files , effectively preventing them from being Considered (ignored) when staging and committing changes.

Add a link to the versioning model in the README and the GitHub branching model in the CONTRIBUTING file

The versioning model explains how Ozone uses semantic versioning to increment it's release version number.
The GitHub branching model explains how Ozone branches are created, and what purpose each one serves.

Both are currently covered in Ozone's discussions forum (versioning and branching-model)

Add a section in the README near the bottom about 'Semantic Versioning System' and refer to the link in the discussion.
Add a section in the contributing file near the bottom called 'Github Branching Model' and refer to the link in the discussion.

Maybe add small descriptions of both.

Unhelpful error msg when attempting a nonexistent city

Windows 7, Python 3.8.5, dev branch

>>> import ozone as ooo
>>> o3 = ooo.Ozone(WAQI_TOKEN)
>>> o3.get_city_air('nonexistent_city')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "e:\git-line\milind-ozone\ozone\src\ozone\ozone.py", line 318, in get_city_air
    row = self._parse_data(data_obj, city, params)
  File "e:\git-line\milind-ozone\ozone\src\ozone\ozone.py", line 162, in _parse_data
    row["latitude"] = data_obj["city"]["geo"][0]
TypeError: string indices must be integers
>>>

Might've been what is discussed in #58. In any case, there should be a more helpful error message.

Class docstrings required for the Ozone and URLs class

Currently the Ozone and URLs class do not have any class docstrings. These should be added to make the purpose and methods of each class clearer to developers who use the package. The URLs class, for example, should state that it cannot/should not be instantiated.

Create a virtual env setup guide for development of Ozone in the CONTRIBUTING file

Using this exact environment is not compulsory, but does help to avoid issues with development of the package's features.

Instruct contributors to setup a virtual env with either conda environments or python virtual environments
As of now, Python 3.9 should be used ideally (it's what I've used for the majority of the package, and python 3.10 is not yet supported by conda environments)
the dependencies should be installed from requirements.txt with pip install -r requirements.txt

Create a method to get air quality data for all stations a pair of latitude-longitude bounds

This should allow users to enter in two pairs of lat-long coordinates. The WAQI API has an endpoint available to get all air quality measuring stations between two pairs of lat-long pairs. This list of stations could be used with the get_multiple_coordinate_air method in Ozone, to get the air quality for all of them.

This would be really useful, as users would essentially be able to get data for an entire area of the world, without necessarily knowing the name/location of each station in that area (Ozone would take care of that part).

Ideally create a private method that takes in a two pairs of lat-long coordinates, and returns a list of several lat-long pairs, one for each station in that range.
Then create a separate method that uses get_multiple_coordinate_air under the hood and returns the data in the same way all other data methods do.

Add restriction on reads/min and reads/ second to prevent token being blocked from API

The WAQI API allows a certain number of requests every minute and every second. This can be looked up on their API website. If this limit is exceeded then the users API token is blocked.

An intrinsic check should be added to the code to prevent exceeding this limit, perhaps by taking short breaks between successive requests if the limit is about to be exceeded.

Creation of logo for Ozone package

Ozone currently lacks a logo. A logo goes a long way in making the package feel more professional, and makes the README and documentation site look prettier too!

Guidelines

The logo should be related to these keywords: ozone, air, clouds, weather, etc
Should be in .svg format ideally
Gotta be cool!

Be creative, have fun!

Code Of Conduct not present in documentation.

Considering that this is an open-source package, it's a good idea to have a good of conduct present in the root directory to help people see what behaviour is encouraged and what's not encouraged.

A short, clear CODE_OF_CONDUCT.md should be all that's required for a project as simple as this one. You could take a look at other projects and see what they include.

In general: I want this package to be easy for newbies to contribute.

Encourage constructive discourse and discourage berating new contributors.
Encourage new ideas about features and enhancements as long as they don't go against the goal of Ozone to be the simplest, most effective package for fetching air quality data.
Have a fun time!

Bug: No way to set custom name for file, if saving data as CSV, Excel or JSON file. Also inconsistency in naming.

Current behaviour

All saved files of the same type have the exact same name. This means that if I try to save multiple CSV files, they just overwrite the previous ones. For example:

CSV files are named 'air_quality.csv'
Excel files are named 'air_quality_data.csv'
Excel files are named 'air_quality_data.json'

Desired behaviour

Users should have the option to custom name their files should they choose to save their data as a file.
If they choose to save their data as a file instead of a dataframe, a warning should tell users that they should give the file a unique name to avoid previous files from being overwritten.

Add example usage and gif of `get_historical_data` method to README

We recently added a feature for retrieving historical air quality data. To make this clear for users to use, It's a good idea for us to add an example of it being used to the README in the form of a gif.

Show briefly how the method can be used in the 'Getting Started' section. Add this in the same format that the other methods are shown.

Here are the steps you can follow to add the gif:

Use the method with every argument used. This might look like this:

o3.get_historical_data(city='Houston', data_format='df')

Make sure the output data is visible too.
Take a video of the entire process and convert it into a ~ 10 second gif.
Add the gif to the media folder
Add the gif to the README by referring to it in the same way as the other two gifs have. For example:

![Gif of ozone.get_city_air()](/src/media/ozone_get_city_air.gif)

Creation of proper API documentation with GitHub Pages

Complete API documentation for Ozone must be created.

When following development environment setup guide in CONTRIBUTING file, pre-commit install doesn't work right away

While following the development environment setup guide from the CONTRIBUTING file, the line that instructs us to do :
pre-commit install

doesn't work right away. I had to pip install pre-commit and then re run the pre-commit install for it to work.

I'm using a conda virtual environment with python 3.9
Using Zsh

Shouldn't the 'pre-commit' package be added to our dev dependencies in order to have it pre installed before this step?

release-1.1.0 bug: get air quality data methods currently can return two types of data: Dataframe or None.

As of now when a user gets air quality data, and decides that they want the data in a Pandas dataframe, then the get_city_air and get_multiple_city_air methods return a dataframe. (type is Pandas Dataframe)
BUT
when a user says they want the data in CSV, excel or JSON form, it returns a None type, and saves the data as a file to the working directory.

At first glance this seems like what we want, but actually this behaviour is bad for two reasons:

It gives the returned dataframe a type of Dataframe | None. This means that if you try and access info from one column with indexing then Pylance and MyPy highlight this behaviour as a warning, because as far as the linters are concerned the var you're trying to index from could be None and not a dataframe. This is not a good user experience as they might not trust the method if it's getting error highlights.
Also from a code maintainability standpoint, it's a much better idea for a method to return a consistent type in all situations.

No mention of the WAQI API docs in the README at all

This package is built as a wrapper around the WAQI API, however no mention is made to their website and documentation. This should be provided for the sake of completeness and for contributors to refer to.

Simple to add it.

Addition of tests to check build status

The package currently has no test coverage at all. There is thus no method of checking build status with CI/CD.
The core functionality at least should have test coverage.

Get air quality methods should have option to return different data formats like CSVs or Excel files, and not just pandas dataframes

The data fetched from the API can be saved directly to these file formats.

README.md: Markdown Code Blocks under "Getting Started" heading are labeled as being shell language - They should be labeled as python

Under the "Getting Started" section of the README.md, there are two code blocks. In the markdown, these code blocks are labelled as being in the Shell language. However, these code blocks are supposed to be Python. Please label them to be Python

Current State:

```sh

```sh

Should be:

```python

```python

Add example photos of resulting data from get_multiple_city_air data in README

Add a couple of photos of the data retrieved by using the get_multiple_city_air method with 2-3 cities.

I think a screenshot of the Excel sheet, and of the Pandas dataframe would probably be enough. Ideally use data from different cities for each screenshot, just to demonstrate the range of Ozone as a package.

You can add it as a small subsection under the example code GIFs.

When historical data output is saved as CSV file, the dates column is missing entirely, making the data hard to use

Here's what the saved file looks like right now:

And the raw CSV file:

As you can see, there's no dates column, which would make this data confusing to work with

Add feature to get historical data for a particular location

Currently only live data can be fetched given a location. Most users would find more utility in large datasets of historical data.

I don't think that the WAQI api is capable of providing historical data, but the WAQI website does have a resource for downloading CSV's and Excel sheets of historical data - Perhaps it could be possible to download and read the CSV from there, programmatically.

Contributors section in README to give credit to all the people who's code helps improve Ozone

A section in the README, to give attribution to all the contributors that add code to Ozone.

Here's a guide to doing this, that I found online:

https://levelup.gitconnected.com/displaying-contributors-image-on-readme-files-with-no-pain-f1cbaca963e8

Add method to show air quality forecast information. The API actually returns this info, but we currently ignore it.

The WAQI API documentation mentions how the JSON object that is retuned contains 'forecast' data as well. This is basically the 7 day forecast for 4 parameters: pm25, pm10, o3, and uvi.

Create a method that takes in a city and fetches the forecast data for it. Allow users to enter a list of the specific parameters they want, and get all by default.

You can refer to the get_city_air method while making this. The implementation would be quite similar.

Add gif of Ozone being used to README. Would help first time users understand how to use the code.

Make a gif of a screen recording of yourself using the package, and add it to the README. Make sure to try both the currently available methods in Ozone to showcase it's capabilities properly.

Addition of FILE_STRUCTURE.md file to give new contributors clarity on layout of files.

Explain the basic file structure and layout of the package in the file.

Which directory has the main source code? -> src/ozone/
Which files have the main classes? ozone.py and urls.py
What does each of the two files contain? Ozone class, and URLs class
What are the different files to be edited for dependency management? -> setup.py, requirements.txt, and setup.cfg
What does src/media contain? -> gifs for README code examples

List of parameters should be added to docstrings of methods, to allow users to understand what options they have to choose from

[ "aqi", "pm25", "pm10", "o3", "co", "no2", "so2", "dew", "h", "p", "t", "w", "wg" ]

These are all the separate parameters that the API can get. For each data fetching method, in their docstrings, add the above line and indicate that these are the different options that users can choose from.

Refactor repeated section of code responsible for getting data row dict into separate private method

As @lahdjirayhan mentioned in #79 , this part of Ozone's logic is being duplicated:

Building a URL.

Making request (handled by Ozone via _make_api_request)

Checking response status/error code (handled by Ozone via _check_status_code)

Checking if the JSON returned is truly OK (I added this check in this commit to handle cities > without AQI data. It will also be needed in get_specific_parameter)
5 .Parsing the data into more ready-to-use format (handled by Ozone via _parse_data)

And finally, take the intended parameter.

This should be refactored into its own private method for cleaner use.

Returned dataframe from `get_historical_data` doesn't have a label for the dates column

All the other columns except for the dates column have labels. While it is pretty obvious that this column contains dates, for the sake of completeness it should be labelled.

The screenshot below shows what I mean.

Add the units for the retrieved data in the documentation. As of now, no information about the units of the data is provided.

The units will have to be looked up in the WAQI API documentation and added from there. Any units that are missing from there could probably be assumed to be standard units for the parameter being measured. They should be put in the README (later we can add it to the official documentation when we make it)

The label of one column of the returned dataframe for all methods should be pm2.5, not pm25

The dataframe returned by the methods contains data on particulate matter 2.5 microns (PM2.5). This data is slightly mislabelled, and might confuse users.

Current behaviour

The column with this data is labelled 'pm25'

Desired behaviour

The column should be labelled 'pm2.5'

Screenshots

Add method to get an individual parameter as a number and not a dataframe

This might help if someone is planning to actually use this package for something other than data analysis. The current methods all cater to data analysis more by providing a Pandas data frame as the return value. This might be better if someone is using this to update their backend, or require a single parameter only.

To add:
(something along these lines)
Methods that return a single numerical value, and provide a Null value with a warning if that parameter is not measured by the station.

Create shell script to run to automatically bump the version number in setup.cfg and setup.py

This script would be run by the maintainer when a new release is being created, to bump the version number before the final commit for the release.

setup.cfg mentions the version number once, on line 3
setup.py mentions the version number twice, on line 12, and in the download url in line 13

Add a method to get air quality data for a particular longitude and latitude

Currently users can only get AQI data for a city, or many cities.

I read into this a little, and the WAQI API offers a URL endpoint to get air quality data for a certain latitude and longitude. It should be fairly simple to use it and create a method to allow users to get data using geographical coordinates.

Also add a method to get air quality data for an entire list of coordinate pairs (similar to the get_multiple_city_air, but for coordinate pairs instead of cities.

Here's the API page with the endpoints: https://aqicn.org/json-api/doc/#api-_

Need better checking for the JSON response

Problem

It is possible to get a response status code 200 (OK), yet a JSON response like this:

{"status": "error", "message": 404"} # clearly does not look like an all-is-good response to me

That is, when you call get_city_air("") (search for city with empty string as its name). Currently, doing this raises a KeyError.

This is what we have now:

https://github.com/Milind220/Ozone/blob/bef2e410315ac67541448f22296be56c56e90e58/src/ozone/ozone.py#L318-L330

Because the response status code is 200, it passes the first if check.

Possible solutions

The immediate fix for this one problem is to check for response['status'] != 'ok' before trying to access anything else.

However

I think we need more useful response checking. I don't think we need to add more checks into _check_status_code, because it already does its job and has perfect scope & purpose. We need to improve the check on JSON response.

If there's a documentation of all possible JSON responses by WAQI, it should help us build a better check.

Add a method to show the meaning of the Air Quality Data

For example, an aqi of 0 - 50 is Good and that means that _air quality is considered satisfactory, and air pollution poses little or no risk.

Add link to the Ozone PyPI package in the README

We currently do not have a link to the PyPI page for Ozone (on PyPI it's called ozon3). This is generally a good idea for the sake of completeness.

This is the page that should be linked.

Add the link in the section of the README that instructs users on how to download the package using pip.

Need to add a list of links of the World Air Quality Index's API website/documentation to the CONTRIBUTING file

Ozone uses the WAQI's API under the hood to fetch its data. Therefore, it's important that contributors can easily find all the information they need about the API in order to make good quality contributions.

Add a section in the CONTRIBUTING.md file with a list of links to any page that has information that might be important for contributors to know. Also in a single short phrase, describe the page that the link leads to (would help people know which one they need).

Here's a couple of links to get you started:

WAQI website: https://waqi.info
API overview: https://aqicn.org/api/

get_multiple_cities_air() method returns an error

Error message:

TypeError: cannot concatenate object of type '<class 'list'>'; only Series and DataFrame objs are valid

Prevent users from modifying the urls stored in the URLs class

As of now, users can technically import and instantiate the URLs class, which might cause all sorts of havoc if the URLs stored were changed.

This might be fixed by converting it to a static class/annotating it as abstract/ or something along those lines.

Spruce up and beautify the README.

Emoji's should be added liberally. Always makes the README a little easier to read.

Test out historical-data feature to ensure that it's production ready

@Sam-damn

I tried this out today, and it worked great for the one city that I tried it with. I'm going to try it a few other places tomorrow.

Add documentation for `get_historical_data`, `get_city_station_options`, and `get_city_forecast` to the gh-pages documentation

The methods page of the official API documentation explains every public method in Ozone and shows how it can be used.

Three new methods have been added, for which there is no complete documentation right now:

get_historical_data
get_city_station_options
get_city_forecast

Add Option for User of ozone to specify a path to output directory for any files (csv, json, xlsl) that might be created

@Milind220 while writing the tests i noticed that running the _format_output() generates files in the current directory of the code , i think it would be better if the user himself can choose the location of the output of such files like csv or json or any future files that ozone might create.

the best way this can be done in my opinion is by adding a property output_dir to the class ozone itself , and this is a better approach than adding it as a parameter to each public function, since the user can choose his output directory when instantiating the ozone class itself and it would be used by all functions that need it.

Create a GitHub action to assign issues to contributors if they comment on an issue saying 'I will do this'

This would help ensure that issues are quickly assigned to contributors who express an interest in doing them.
Some steps to help you get started if you're not sure how to go about this:

article that covers exactly this. Keep in mind that, for this repo, a user should comment 'I will do this' to get the issue assigned to them

Broken link in CONTRIBUTING.md

In the dev branch CONTRIBUTING file's Table of Contents, the hyperlink for 'The world air quality index's API' doesn't work.

It is supposed to take you down to the section of the file that it's referring to, but fails to do this. This is probably because of the ' in the phrase - The world air quality index's API

Pretty easy fix, a simple google search should give you the answer.

(Also whoever takes this, please remember to follow the git commit messages style guide from the CONTRIBUTING file when making commits. For this issue, for example, the commit should have the prefix [docs]

The mypy linter, black, and flake8 should be added as development dependencies for the package

I've noticed that some contributors make commits that churn up mypy linting warnings. These warnings do not prevent execution of the code, but generally point out a flaw in the design of the program, often to do with type inconsistency.

I think as a part of the development environment, contributors should ideally have mypy installed. This also ties in with issue #54 , which has to do with the creation of a consistent development environment for Ozone.

Brief explanation of each parameter that the API can fetch should be added to README

Ozone can currently fetch the following parameters (taken from the Ozone class):

[ "aqi", "pm25", "pm10", "o3", "co", "no2", "so2", "dew", "h", "p", "t", "w", "wg", ]

Some of these parameters, such as 'aqi' are pretty obvious in meaning, but others, such as 'h' , 'w', 'wg', 't' are not easily understood.

It would be excellent to have a short section explaining each parameter briefly in the README, perhaps under the code examples.

You can check the WAQI API website and see if they have the meanings there, or otherwise a google search should prove useful.

ozon3org / ozon3 Goto Github PK

ozon3's People

Contributors

Stargazers

Watchers

Forkers

ozon3's Issues

The problem

Here's the warning that's displayed:

This is the line of code that causes this:

Possible explanation

Possible solution

Guidelines

Be creative, have fun!

Current behaviour

Desired behaviour

Current behaviour

Desired behaviour

Screenshots

Problem

Possible solutions

Recommend Projects

Recommend Topics

Recommend Org