Giter Club home page Giter Club logo

ozon3's People

Contributors

alishahpurwala avatar amitesh100 avatar corey-new avatar igotadigbick avatar kowshik-noor avatar kristinamancini avatar lahdjirayhan avatar milind220 avatar miller2014 avatar samxx97 avatar shootgan avatar tariksouabny avatar tomiiwa avatar ubongab avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar

ozon3's Issues

MyPy flags a type incompatibility in line 436 of Ozone.py. This is in the get_specific_parameter method

The problem

Here's the warning that's displayed:

Argument of type "Series[Unknown]" cannot be assigned to parameter "x" of type "SupportsFloat | SupportsIndex | str | bytes |

This is the line of code that causes this:

result = float(row[air_param])

Possible explanation

Here, row is a Pandas dataframe, and air_param is a column in it

This warning, if I understand correctly, is basically indicating that MyPy has no guarantee that the value stored in that cell is a number. Due to this, it's not comfortable with us type casting it as a float while assigning it to result.

Possible solution

Better implementation of get_specific_parameter method. This method currently creates an entire dataframe by calling get_city_air, when all it requires is a single number. A more lightweight method of implementing it would be to make an API call, and then get the required parameter directly from the JSON response, and return it from there.

Add IDE setting files to .gitignore file

A lot of contributors might be using different integrated development environments (IDE) when committing their changes and editing their code, and might accidentally include the IDE’s specific settings file as part of their committed changes when making their pull request.
To prevent this from occurring, a line should be added to the .gitignore file that matches these files , effectively preventing them from being Considered (ignored) when staging and committing changes.

Add a link to the versioning model in the README and the GitHub branching model in the CONTRIBUTING file

The versioning model explains how Ozone uses semantic versioning to increment it's release version number.
The GitHub branching model explains how Ozone branches are created, and what purpose each one serves.

Both are currently covered in Ozone's discussions forum (versioning and branching-model)

  • Add a section in the README near the bottom about 'Semantic Versioning System' and refer to the link in the discussion.
  • Add a section in the contributing file near the bottom called 'Github Branching Model' and refer to the link in the discussion.

Maybe add small descriptions of both.

Unhelpful error msg when attempting a nonexistent city

Windows 7, Python 3.8.5, dev branch


>>> import ozone as ooo
>>> o3 = ooo.Ozone(WAQI_TOKEN)
>>> o3.get_city_air('nonexistent_city')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "e:\git-line\milind-ozone\ozone\src\ozone\ozone.py", line 318, in get_city_air
    row = self._parse_data(data_obj, city, params)
  File "e:\git-line\milind-ozone\ozone\src\ozone\ozone.py", line 162, in _parse_data
    row["latitude"] = data_obj["city"]["geo"][0]
TypeError: string indices must be integers
>>>

Might've been what is discussed in #58. In any case, there should be a more helpful error message.

Class docstrings required for the Ozone and URLs class

Currently the Ozone and URLs class do not have any class docstrings. These should be added to make the purpose and methods of each class clearer to developers who use the package. The URLs class, for example, should state that it cannot/should not be instantiated.

Create a virtual env setup guide for development of Ozone in the CONTRIBUTING file

Using this exact environment is not compulsory, but does help to avoid issues with development of the package's features.

  1. Instruct contributors to setup a virtual env with either conda environments or python virtual environments
  2. As of now, Python 3.9 should be used ideally (it's what I've used for the majority of the package, and python 3.10 is not yet supported by conda environments)
  3. the dependencies should be installed from requirements.txt with pip install -r requirements.txt

Create a method to get air quality data for all stations a pair of latitude-longitude bounds

This should allow users to enter in two pairs of lat-long coordinates. The WAQI API has an endpoint available to get all air quality measuring stations between two pairs of lat-long pairs. This list of stations could be used with the get_multiple_coordinate_air method in Ozone, to get the air quality for all of them.

This would be really useful, as users would essentially be able to get data for an entire area of the world, without necessarily knowing the name/location of each station in that area (Ozone would take care of that part).

Ideally create a private method that takes in a two pairs of lat-long coordinates, and returns a list of several lat-long pairs, one for each station in that range.
Then create a separate method that uses get_multiple_coordinate_air under the hood and returns the data in the same way all other data methods do.

Add restriction on reads/min and reads/ second to prevent token being blocked from API

The WAQI API allows a certain number of requests every minute and every second. This can be looked up on their API website. If this limit is exceeded then the users API token is blocked.

An intrinsic check should be added to the code to prevent exceeding this limit, perhaps by taking short breaks between successive requests if the limit is about to be exceeded.

Creation of logo for Ozone package

Ozone currently lacks a logo. A logo goes a long way in making the package feel more professional, and makes the README and documentation site look prettier too!

Guidelines

  • The logo should be related to these keywords: ozone, air, clouds, weather, etc

  • Should be in .svg format ideally

  • Gotta be cool!

Be creative, have fun!

Code Of Conduct not present in documentation.

Considering that this is an open-source package, it's a good idea to have a good of conduct present in the root directory to help people see what behaviour is encouraged and what's not encouraged.

A short, clear CODE_OF_CONDUCT.md should be all that's required for a project as simple as this one. You could take a look at other projects and see what they include.

In general: I want this package to be easy for newbies to contribute.

  1. Encourage constructive discourse and discourage berating new contributors.
  2. Encourage new ideas about features and enhancements as long as they don't go against the goal of Ozone to be the simplest, most effective package for fetching air quality data.
  3. Have a fun time!

Bug: No way to set custom name for file, if saving data as CSV, Excel or JSON file. Also inconsistency in naming.

Current behaviour

All saved files of the same type have the exact same name. This means that if I try to save multiple CSV files, they just overwrite the previous ones. For example:

  • CSV files are named 'air_quality.csv'
  • Excel files are named 'air_quality_data.csv'
  • Excel files are named 'air_quality_data.json'

Desired behaviour

  • Users should have the option to custom name their files should they choose to save their data as a file.
  • If they choose to save their data as a file instead of a dataframe, a warning should tell users that they should give the file a unique name to avoid previous files from being overwritten.

Add example usage and gif of `get_historical_data` method to README

We recently added a feature for retrieving historical air quality data. To make this clear for users to use, It's a good idea for us to add an example of it being used to the README in the form of a gif.

Show briefly how the method can be used in the 'Getting Started' section. Add this in the same format that the other methods are shown.

Here are the steps you can follow to add the gif:

  1. Use the method with every argument used. This might look like this:
o3.get_historical_data(city='Houston', data_format='df')
  1. Make sure the output data is visible too.
  2. Take a video of the entire process and convert it into a ~ 10 second gif.
  3. Add the gif to the media folder
  4. Add the gif to the README by referring to it in the same way as the other two gifs have. For example:
![Gif of ozone.get_city_air()](/src/media/ozone_get_city_air.gif)

When following development environment setup guide in CONTRIBUTING file, pre-commit install doesn't work right away

While following the development environment setup guide from the CONTRIBUTING file, the line that instructs us to do :
pre-commit install

doesn't work right away. I had to pip install pre-commit and then re run the pre-commit install for it to work.

  • I'm using a conda virtual environment with python 3.9
  • Using Zsh

Shouldn't the 'pre-commit' package be added to our dev dependencies in order to have it pre installed before this step?

release-1.1.0 bug: get air quality data methods currently can return two types of data: Dataframe or None.

As of now when a user gets air quality data, and decides that they want the data in a Pandas dataframe, then the get_city_air and get_multiple_city_air methods return a dataframe. (type is Pandas Dataframe)
BUT
when a user says they want the data in CSV, excel or JSON form, it returns a None type, and saves the data as a file to the working directory.

At first glance this seems like what we want, but actually this behaviour is bad for two reasons:

  1. It gives the returned dataframe a type of Dataframe | None. This means that if you try and access info from one column with indexing then Pylance and MyPy highlight this behaviour as a warning, because as far as the linters are concerned the var you're trying to index from could be None and not a dataframe. This is not a good user experience as they might not trust the method if it's getting error highlights.

  2. Also from a code maintainability standpoint, it's a much better idea for a method to return a consistent type in all situations.

No mention of the WAQI API docs in the README at all

This package is built as a wrapper around the WAQI API, however no mention is made to their website and documentation. This should be provided for the sake of completeness and for contributors to refer to.

Simple to add it.

Addition of tests to check build status

The package currently has no test coverage at all. There is thus no method of checking build status with CI/CD.
The core functionality at least should have test coverage.

Add example photos of resulting data from get_multiple_city_air data in README

Add a couple of photos of the data retrieved by using the get_multiple_city_air method with 2-3 cities.

I think a screenshot of the Excel sheet, and of the Pandas dataframe would probably be enough. Ideally use data from different cities for each screenshot, just to demonstrate the range of Ozone as a package.

You can add it as a small subsection under the example code GIFs.

Add feature to get historical data for a particular location

Currently only live data can be fetched given a location. Most users would find more utility in large datasets of historical data.

I don't think that the WAQI api is capable of providing historical data, but the WAQI website does have a resource for downloading CSV's and Excel sheets of historical data - Perhaps it could be possible to download and read the CSV from there, programmatically.

Add method to show air quality forecast information. The API actually returns this info, but we currently ignore it.

The WAQI API documentation mentions how the JSON object that is retuned contains 'forecast' data as well. This is basically the 7 day forecast for 4 parameters: pm25, pm10, o3, and uvi.

Create a method that takes in a city and fetches the forecast data for it. Allow users to enter a list of the specific parameters they want, and get all by default.

You can refer to the get_city_air method while making this. The implementation would be quite similar.

Addition of FILE_STRUCTURE.md file to give new contributors clarity on layout of files.

Explain the basic file structure and layout of the package in the file.

  • Which directory has the main source code? -> src/ozone/
  • Which files have the main classes? ozone.py and urls.py
  • What does each of the two files contain? Ozone class, and URLs class
  • What are the different files to be edited for dependency management? -> setup.py, requirements.txt, and setup.cfg
  • What does src/media contain? -> gifs for README code examples

Refactor repeated section of code responsible for getting data row dict into separate private method

As @lahdjirayhan mentioned in #79 , this part of Ozone's logic is being duplicated:

  1. Building a URL.
  2. Making request (handled by Ozone via _make_api_request)
  3. Checking response status/error code (handled by Ozone via _check_status_code)
  4. Checking if the JSON returned is truly OK (I added this check in this commit to handle cities > without AQI data. It will also be needed in get_specific_parameter)
    5 .Parsing the data into more ready-to-use format (handled by Ozone via _parse_data)
  5. And finally, take the intended parameter.

This should be refactored into its own private method for cleaner use.

Add method to get an individual parameter as a number and not a dataframe

This might help if someone is planning to actually use this package for something other than data analysis. The current methods all cater to data analysis more by providing a Pandas data frame as the return value. This might be better if someone is using this to update their backend, or require a single parameter only.

To add:
(something along these lines)
Methods that return a single numerical value, and provide a Null value with a warning if that parameter is not measured by the station.

Add a method to get air quality data for a particular longitude and latitude

Currently users can only get AQI data for a city, or many cities.

I read into this a little, and the WAQI API offers a URL endpoint to get air quality data for a certain latitude and longitude. It should be fairly simple to use it and create a method to allow users to get data using geographical coordinates.

Also add a method to get air quality data for an entire list of coordinate pairs (similar to the get_multiple_city_air, but for coordinate pairs instead of cities.

Here's the API page with the endpoints: https://aqicn.org/json-api/doc/#api-_

Need better checking for the JSON response

Problem

It is possible to get a response status code 200 (OK), yet a JSON response like this:

{"status": "error", "message": 404"} # clearly does not look like an all-is-good response to me

That is, when you call get_city_air("") (search for city with empty string as its name). Currently, doing this raises a KeyError.

This is what we have now:

https://github.com/Milind220/Ozone/blob/bef2e410315ac67541448f22296be56c56e90e58/src/ozone/ozone.py#L318-L330

Because the response status code is 200, it passes the first if check.


Possible solutions

The immediate fix for this one problem is to check for response['status'] != 'ok' before trying to access anything else.

However

I think we need more useful response checking. I don't think we need to add more checks into _check_status_code, because it already does its job and has perfect scope & purpose. We need to improve the check on JSON response.

If there's a documentation of all possible JSON responses by WAQI, it should help us build a better check.

Add link to the Ozone PyPI package in the README

We currently do not have a link to the PyPI page for Ozone (on PyPI it's called ozon3). This is generally a good idea for the sake of completeness.

This is the page that should be linked.

Add the link in the section of the README that instructs users on how to download the package using pip.

Need to add a list of links of the World Air Quality Index's API website/documentation to the CONTRIBUTING file

Ozone uses the WAQI's API under the hood to fetch its data. Therefore, it's important that contributors can easily find all the information they need about the API in order to make good quality contributions.

Add a section in the CONTRIBUTING.md file with a list of links to any page that has information that might be important for contributors to know. Also in a single short phrase, describe the page that the link leads to (would help people know which one they need).

Here's a couple of links to get you started:

WAQI website: https://waqi.info
API overview: https://aqicn.org/api/

Prevent users from modifying the urls stored in the URLs class

As of now, users can technically import and instantiate the URLs class, which might cause all sorts of havoc if the URLs stored were changed.

This might be fixed by converting it to a static class/annotating it as abstract/ or something along those lines.

Add Option for User of ozone to specify a path to output directory for any files (csv, json, xlsl) that might be created

@Milind220 while writing the tests i noticed that running the _format_output() generates files in the current directory of the code , i think it would be better if the user himself can choose the location of the output of such files like csv or json or any future files that ozone might create.

the best way this can be done in my opinion is by adding a property output_dir to the class ozone itself , and this is a better approach than adding it as a parameter to each public function, since the user can choose his output directory when instantiating the ozone class itself and it would be used by all functions that need it.

Broken link in CONTRIBUTING.md

In the dev branch CONTRIBUTING file's Table of Contents, the hyperlink for 'The world air quality index's API' doesn't work.

It is supposed to take you down to the section of the file that it's referring to, but fails to do this. This is probably because of the ' in the phrase - The world air quality index's API

Pretty easy fix, a simple google search should give you the answer.

(Also whoever takes this, please remember to follow the git commit messages style guide from the CONTRIBUTING file when making commits. For this issue, for example, the commit should have the prefix [docs]

The mypy linter, black, and flake8 should be added as development dependencies for the package

I've noticed that some contributors make commits that churn up mypy linting warnings. These warnings do not prevent execution of the code, but generally point out a flaw in the design of the program, often to do with type inconsistency.

I think as a part of the development environment, contributors should ideally have mypy installed. This also ties in with issue #54 , which has to do with the creation of a consistent development environment for Ozone.

Brief explanation of each parameter that the API can fetch should be added to README

Ozone can currently fetch the following parameters (taken from the Ozone class):

[ "aqi", "pm25", "pm10", "o3", "co", "no2", "so2", "dew", "h", "p", "t", "w", "wg", ]

Some of these parameters, such as 'aqi' are pretty obvious in meaning, but others, such as 'h' , 'w', 'wg', 't' are not easily understood.

It would be excellent to have a short section explaining each parameter briefly in the README, perhaps under the code examples.

You can check the WAQI API website and see if they have the meanings there, or otherwise a google search should prove useful.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.