Mecoda-Orange

This repository provides Orange Data Mining widgets to access data from Minka, OdourCollect, CanAIRio, Ictio, Natusfera, Smart Citizen and Aire Ciudadano.

MECODA is part of Cos4Cloud, a European Horizon 2020 project to boost citizen science technologies.

To use the MECODA package, first install the Orange Data Mining platform from https://orangedatamining.com/download

Once Orange is installed, open the Options menu, choose "Add-ons", click "Add more" and search for "mecoda-orange". The latest version of the package will be installed into Orange.

An "Installation Guide" and an "Example of Use" are also available.

Minka widget

This widget collects observations from the Minka API and lets you filter them by:

| Argument | Description | Example |
|---|---|---|
| Search by words | Word or phrase found in the data of an observation | query="quercus quercus" |
| Project name | Name of a project | project_name="urbamar" |
| User name | Name of the user who uploaded the observations | user="zolople" |
| Place | One of the places created on the Minka website | place="246: BioPrat" |
| Taxon | One of the main taxonomies | taxon="fungi" |
| Year | Year of the observations | year=2019 |
| Id of observation | Identification number of a specific observation | id_obs=425 |
| Max. number of results | The maximum should be under 20,000 (API limit) | num_max=800 |
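The filters in the table can be thought of as optional keyword arguments. The sketch below is only an illustration, not the widget's actual code: `build_filters` is a hypothetical helper that keeps the arguments that were set and rejects a `num_max` at or above the 20,000 limit mentioned in the table.

```python
from typing import Any, Dict, Optional

API_LIMIT = 20_000  # upper bound stated in the table above


def build_filters(
    query: Optional[str] = None,
    project_name: Optional[str] = None,
    user: Optional[str] = None,
    place: Optional[str] = None,
    taxon: Optional[str] = None,
    year: Optional[int] = None,
    id_obs: Optional[int] = None,
    num_max: Optional[int] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: return only the filters that were actually set."""
    if num_max is not None and not 0 < num_max < API_LIMIT:
        raise ValueError(f"num_max must be between 1 and {API_LIMIT - 1}")
    candidates = {
        "query": query,
        "project_name": project_name,
        "user": user,
        "place": place,
        "taxon": taxon,
        "year": year,
        "id_obs": id_obs,
        "num_max": num_max,
    }
    # Drop every filter the caller left unset.
    return {name: value for name, value in candidates.items() if value is not None}


filters = build_filters(project_name="urbamar", taxon="fungi", year=2019, num_max=800)
print(filters)
```

Leaving a filter out simply means it does not constrain the query, which mirrors leaving a field empty in the widget.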

[Screenshots: Minka widget]

The Minka widget integrates the Python library mecoda-minka into a visual interface. You can run any query and download two outputs: a dataframe with one observation per row and a dataframe with one photo per row. A single observation can have more than one photo.
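The relation between the two outputs can be sketched in plain Python. This is an assumption-laden illustration: the `photos` key, the `photo_url` column and the example URLs are made up here and are not taken from mecoda-minka.

```python
from typing import Dict, List, Tuple

# Hypothetical records: "photos" and the URLs are illustrative only.
observations = [
    {"id": 425, "taxon_name": "Quercus ilex",
     "photos": ["https://example.org/a.jpg", "https://example.org/b.jpg"]},
    {"id": 426, "taxon_name": "Amanita muscaria",
     "photos": ["https://example.org/c.jpg"]},
    {"id": 427, "taxon_name": "Pinus pinea", "photos": []},
]


def split_outputs(records: List[Dict]) -> Tuple[List[Dict], List[Dict]]:
    """Return (one row per observation, one row per photo)."""
    obs_rows = [{k: v for k, v in r.items() if k != "photos"} for r in records]
    photo_rows = [
        {"id": r["id"], "photo_url": url}  # the observation id links both tables
        for r in records
        for url in r["photos"]
    ]
    return obs_rows, photo_rows


obs_rows, photo_rows = split_outputs(observations)
print(len(obs_rows), "observations,", len(photo_rows), "photos")
```

An observation with two photos contributes two rows to the photo table, and an observation without photos contributes none.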

The observations output is a Table with the following fields:

  • id: observation id
  • captive: True or False
  • created_at: date field.
  • updated_at: date field.
  • observed_on: date field.
  • description: open text field.
  • latitude / longitude: geo location fields.
  • quality_grade: needs_id / research.
  • user_login: user login name.
  • num_identification_agreements / num_identification_disagreements
  • identifications_count: number of identifications for an observation.
  • iconic_taxon: one of the big taxonomic groups available in Minka.
  • taxon_id: species taxon id.
  • taxon_name: species name of observation.
  • taxon fields:
    • kingdom
    • class
    • order
    • superfamily
    • family
    • genus

The observations table supports statistical analysis. The photos table supports image analysis.
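As a small example of the kind of statistics the observations table supports, the sketch below counts observations per `quality_grade` and per `iconic_taxon`. The rows are invented sample data that reuse the field names listed above. In Orange the same result would normally come from connecting the table to standard widgets.

```python
from collections import Counter

# Invented sample rows using the field names of the observations table.
rows = [
    {"id": 1, "quality_grade": "research", "iconic_taxon": "Plantae"},
    {"id": 2, "quality_grade": "needs_id", "iconic_taxon": "Fungi"},
    {"id": 3, "quality_grade": "research", "iconic_taxon": "Plantae"},
]

by_grade = Counter(row["quality_grade"] for row in rows)
by_taxon = Counter(row["iconic_taxon"] for row in rows)
research_share = by_grade["research"] / len(rows)

print(by_grade)
print(by_taxon)
print(f"research-grade share: {research_share:.0%}")
```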

The widget is complemented by other widgets that take input from it or directly from the Minka API:

get_images

This widget takes a Table of observations (with a column of Minka ids) and gets the photos for all of them. It works with data from the Minka widget.

The output is a Table with an image-type feature that can be viewed with Image Viewer.

Minka Taxon Filter

This widget lets the user filter Minka observations by taxonomic level (from kingdom to species). Only levels with registered observations are shown.

[Screenshot: Minka Taxon Filter widget]
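The idea of offering only the levels and values that actually occur in the data can be sketched as follows. The helper names are hypothetical and the rows are sample data. The level names come from the taxon fields of the observations table.

```python
from typing import Dict, List

LEVELS = ["kingdom", "class", "order", "superfamily", "family", "genus"]

# Sample rows; superfamily is left empty where no value is registered.
rows = [
    {"taxon_name": "Quercus ilex", "kingdom": "Plantae", "class": "Magnoliopsida",
     "order": "Fagales", "superfamily": None, "family": "Fagaceae", "genus": "Quercus"},
    {"taxon_name": "Quercus suber", "kingdom": "Plantae", "class": "Magnoliopsida",
     "order": "Fagales", "superfamily": None, "family": "Fagaceae", "genus": "Quercus"},
    {"taxon_name": "Amanita muscaria", "kingdom": "Fungi", "class": "Agaricomycetes",
     "order": "Agaricales", "superfamily": None, "family": "Amanitaceae", "genus": "Amanita"},
]


def available_values(records: List[Dict], level: str) -> List[str]:
    """Distinct non-empty values registered for one taxonomic level."""
    return sorted({r[level] for r in records if r.get(level)})


def filter_by_level(records: List[Dict], level: str, value: str) -> List[Dict]:
    """Keep only the observations whose level matches the chosen value."""
    return [r for r in records if r.get(level) == value]


# Levels with no registered value would not be offered to the user.
offered = [level for level in LEVELS if available_values(rows, level)]
print(offered)
print([r["taxon_name"] for r in filter_by_level(rows, "genus", "Quercus")])
```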

Minka Taxon Search

This widget allows the user to filter Minka observations by scientific or common name.

[Screenshot: Minka Taxon Search widget]

Marine and Terrestrial Filter

The widget splits a Table of observations into two dataframes: one for marine species and another for terrestrial ones. It keeps only research-grade observations.

[Screenshot: Marine and Terrestrial Filter widget]
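A minimal sketch of this split is shown below. How the widget decides whether a species is marine is not described here, so the example takes that decision as a caller-supplied function and uses a made-up `marine` flag in the sample rows. Only the research-grade restriction comes from the description above.

```python
from typing import Callable, Dict, List, Tuple


def split_marine_terrestrial(
    rows: List[Dict], is_marine: Callable[[Dict], bool]
) -> Tuple[List[Dict], List[Dict]]:
    """Keep research-grade rows and split them into (marine, terrestrial)."""
    research = [r for r in rows if r["quality_grade"] == "research"]
    marine = [r for r in research if is_marine(r)]
    terrestrial = [r for r in research if not is_marine(r)]
    return marine, terrestrial


# Sample rows: the "marine" flag is an assumption made for this illustration.
rows = [
    {"id": 1, "quality_grade": "research", "marine": True},
    {"id": 2, "quality_grade": "research", "marine": False},
    {"id": 3, "quality_grade": "needs_id", "marine": True},
    {"id": 4, "quality_grade": "research", "marine": True},
]

marine, terrestrial = split_marine_terrestrial(rows, lambda r: r["marine"])
print([r["id"] for r in marine], [r["id"] for r in terrestrial])
```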

OdourCollect widget

The OdourCollect widget lets the user get observations from the OdourCollect API.

[Screenshot: OdourCollect widget]

The widget has several search fields: date, annoyance level, intensity level, category and type. In addition, observations can be complemented with their distance from a Point of Interest, if one is set.

The output is a Table of observations, with this information:

| Field | Description |
|---|---|
| user | OdourCollect user ID of the citizen who registered the observation. |
| date | Observation date in yyyy-mm-dd format. |
| time | Observation time in HH:mm (24h) format, UTC timezone. |
| week_day | Observation day of the week. This field is extra data calculated by PyOdourCollect to help the analyst find patterns. Bear in mind that this calculation is based on UTC, not local time, so it could be misleading in some edge cases. |
| category | First tier of odour classification. In the OdourCollect webapp, this is called "type". It provides complementary classification nuances that can be safely ignored for basic analysis. See the full classification table in the PyOdourCollect documentation. |
| type | Second tier of odour classification. In the OdourCollect webapp, this is called "subtype". It provides the richest odour classification criteria. See the full classification table in the PyOdourCollect documentation. |
| hedonic_tone_n | Hedonic tone of the odour observation (numeric representation). Hedonic tone is the subjective measurement of how annoying an odour is, from -4 (Extremely unpleasant) to +4 (Extremely pleasant). Zero is used to report neither annoyance nor pleasure. This scale is based on the VDI 3940:2006 standard for odour impact assessment. |
| hedonic_tone_t | Text description of the former metric. |
| intensity_n | Intensity of the odour observation (numeric representation). Intensity is the measurement of how intense and noticeable an odour is, from 1 (Very weak) to 6 (Extremely strong). Zero (Not perceptible) is also used, but only to report absence of odour in observations. This scale is based on the VDI 3940:2006 standard for odour impact assessment. |
| intensity_t | Text description of the former metric. |
| duration | How long the reporter perceived the odour. Categorical text data with the following self-explanatory options: (No odour), Punctual, Continuous in the last hour, and Continuous throughout the day. |
| latitude | GPS coordinates of the observation. Latitude. |
| longitude | GPS coordinates of the observation. Longitude. |
| distance | Distance in km (with an accuracy of 0.01 km) between the point of observation and a configurable Point of Interest (POI). This extra data is calculated by PyOdourCollect when the data analyst provides a set of coordinates for a suspicious activity that motivates the analysis. If no POI coordinates are provided, this field is missing. |
| time_hour | Observation time in HH (24h) format, UTC timezone. |
| time_mins | Observation time in mm (0-60') format, UTC timezone. |
| time_secs | Observation time in ss (0-60'') format, UTC timezone. |
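The two derived fields, `distance` and `week_day`, can be approximated with the standard library. This is a sketch under stated assumptions: it uses the haversine formula on a spherical Earth of radius 6371 km, which may differ from the method PyOdourCollect actually applies, and the function names are hypothetical.

```python
import math
from datetime import datetime, timezone

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius
WEEK_DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def distance_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle (haversine) distance, rounded to 0.01 km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return round(2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a)), 2)


def utc_week_day(date: str, time: str) -> str:
    """Day of the week for a yyyy-mm-dd date and HH:mm time, both in UTC."""
    moment = datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M")
    moment = moment.replace(tzinfo=timezone.utc)
    return WEEK_DAYS[moment.weekday()]


# One degree of longitude along the equator.
print(distance_km(0.0, 0.0, 0.0, 1.0))
# 23:30 UTC on a Monday is already Tuesday in timezones ahead of UTC,
# which is the edge case the week_day description warns about.
print(utc_week_day("2021-03-01", "23:30"))
```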

CanAIRio Fixed Stations

The widget gets observations from fixed stations through the CanAIRio API.

[Screenshot: CanAIRio Fixed Stations widget]

The widget filters by measurement and returns a dataframe with all data from fixed stations at the time of the request.

After selecting one of the stations, the data can be passed to another widget (Last Hour Fixed Station) to get the latest data recorded by that station.

[Screenshot: Last Hour Fixed Station widget]

The output of the Last Hour Fixed Station widget is a dataframe with the latest measurements registered by that station.

CanAIRio Mobile Stations

The widget gets observations from all the mobile stations registered in the CanAIRio API.

[Screenshot: CanAIRio Mobile Stations widget]

The output can be placed on a map and coloured by any parameter:

[Screenshot: mobile station observations on a map]

You can select one device and get the complete track of its route using Track - Mobile Station. This is the result placed on a map:

[Screenshot: mobile station track on a map]

The points can be coloured by any measurement.

This example can be loaded as a workflow (.ows format) directly in Orange Canvas:

[Screenshot: example workflow in Orange Canvas]

Ictio widget

The widget analyses data from Ictio, a citizen observatory for fish observations in the Amazon basin. The data from this citizen observatory is not freely available through a public API or public download, but it can be downloaded as a zip file after registering on the web page.

This widget takes an Ictio_Basic zip file from ictio.org and processes it using the IctioPy library, created by Science For Change:

[Screenshot: Ictio widget]

The output of this widget is a Table with this structure:

  • obs_id: Unique observation ID.
  • weight: Total weight in Kg reported for the given taxon.
  • price_local_currency: Price per Kg in the local currency for the taxon.
  • obs_comments: Comments made by the Citizen Scientist at the time of registering the observation.
  • upload_date_yyyymmdd: Date of observation upload. It does not necessarily match observation date. The relevant data for analysis purposes is the observation date.
  • num_photos: Number of photos taken with the observation. The photos are not available in the basic version of the ictio.org database, so this field is only included as a reference.
  • user_id: Anonymous, numeric ID of the user that made the observation.
  • checklist_id: Unique checklist identifier.
  • protocol_name: Name of the observation protocol used. Possible values: During fishing event, After the fishing event, Market Survey and Port Survey.
  • complete_checklist: Indicates if the checklist was completed. A complete checklist is when all taxa caught during the fishing effort are reported. In a market survey it would be all taxa observed at the market. If the observation was made via the app, it is assumed that the user reported a complete checklist.
  • fishing_duration: The duration of the fishing effort in hours.
  • submission_method: How the data was submitted: EFISH_android for the mobile app or EFISH_upload for the upload tool.
  • app_version: Version of the mobile app or upload tool used.
  • taxon_code: Species taxon code.
  • scientific_name: Scientific name of the species observed.
  • num_of_fishers: Number of fishers participating in fishing effort.
  • number_of_fish: Number of individual fish reported for the given taxon.
  • obs_year: Year of observation.
  • obs_month: Month of observation.
  • obs_day: Day of month of observation.
  • port: This is the name of the port where data was collected and is only reported with the Port Protocol. This is not the location where fish were caught.
  • location_type: Ictio has three location types: Watersheds, Ictio Hotspots, and Personal Locations. This field will identify watersheds and Ictio Hotspots. This field will be null for personal locations. A personal location is any new location added with the upload tool or based on raw GPS coordinates.
  • country_code: Country Code, automatically assigned by latitude and longitude. If you assign a checklist to a watershed it will be assigned to the country where the centroid of the watershed is. If the watershed overlaps a boundary, it could be assigned to a different country from where it is being submitted.
  • country_name: Country in which the observation was made.
  • state_province_code: State/Province Code, automatically assigned by latitude and longitude. If you assign a checklist to a watershed it will be assigned to the State/Province where the centroid of the watershed is. If the watershed overlaps a boundary, it could be assigned to a different State/Province.
  • state_province_name: State/Province name, automatically assigned by latitude and longitude. If a checklist is assigned to a watershed, the observation will be assigned to the State/Province where the centroid of the watershed is. If the watershed overlaps a boundary, it could be assigned to a different State/Province.
  • watershed_code: Unique identifier for the watershed. For Ictio hotspots and personal locations, the watershed code and watershed name are inferred from the geographic position of the Citizen Scientist at the time of observation.
  • watershed_name: Name of the watershed in which the observation was made.
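Because the upload date does not necessarily match the observation date, an analysis would typically rebuild the observation date from `obs_year`, `obs_month` and `obs_day`. The sketch below does that for one invented row. It assumes that `upload_date_yyyymmdd` holds an eight-digit value such as 20200602, as its name suggests, and the helper names are hypothetical.

```python
from datetime import date, datetime
from typing import Dict


def observation_date(row: Dict) -> date:
    """Build the observation date from its three separate fields."""
    return date(int(row["obs_year"]), int(row["obs_month"]), int(row["obs_day"]))


def upload_date(row: Dict) -> date:
    """Parse upload_date_yyyymmdd, assumed to be an eight-digit yyyymmdd value."""
    return datetime.strptime(str(row["upload_date_yyyymmdd"]), "%Y%m%d").date()


def upload_delay_days(row: Dict) -> int:
    """Days elapsed between the observation and its upload."""
    return (upload_date(row) - observation_date(row)).days


# Invented row for illustration.
row = {"obs_id": 1, "obs_year": 2020, "obs_month": 5, "obs_day": 28,
       "upload_date_yyyymmdd": "20200602"}

print(observation_date(row).isoformat(), upload_delay_days(row))
```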

Natusfera widget

This widget collects observations from the Natusfera API and lets you filter them by:

| Argument | Description | Example |
|---|---|---|
| Search by words | Word or phrase found in the data of an observation | query="quercus quercus" |
| Project name | Name of a project | project_name="urbamar" |
| User name | Name of the user who uploaded the observations | user="zolople" |
| Place | Name of a place | place_name="Barcelona" |
| Taxon | One of the main taxonomies | taxon="fungi" |
| Year | Year of the observations | year=2019 |
| Id of observation | Identification number of a specific observation | id_obs=425 |
| Max. number of results | The maximum should be under 20,000 (API limit) | num_max=800 |

[Screenshot: Natusfera widget]

The Natusfera widget integrates the Python library mecoda-nat into a visual interface. You can run any query and download two outputs: a dataframe with one observation per row and a dataframe with one photo per row. A single observation can have more than one photo.

SmartCitizen widgets

The first widget (Smart Citizen Search) collects data from the Smart Citizen API. It lets you select a device either by device ID (the number after https://smartcitizen.me/kits/[...]) or by searching the API by city, tags, or device type. The second widget (Smart Citizen Data) takes the output of the first one and collects time-series tabular data from a device, with a defined rollup (i.e. the frequency of the readings), minimum and maximum dates, and resample options.
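If only the kit URL is at hand, the device ID can be read from its path. The helper below is a hypothetical convenience, not part of the widgets, and the ID 1234 is a made-up example. It assumes the ID is the numeric path segment that follows `/kits/`.

```python
from urllib.parse import urlparse


def device_id_from_url(url: str) -> int:
    """Return the number that follows /kits/ in a Smart Citizen kit URL."""
    parts = [part for part in urlparse(url).path.split("/") if part]
    if len(parts) >= 2 and parts[0] == "kits" and parts[1].isdigit():
        return int(parts[1])
    raise ValueError(f"no device ID found in {url!r}")


print(device_id_from_url("https://smartcitizen.me/kits/1234"))
```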

An example workflow is available at https://github.com/fablabbcn/smartcitizen-docs/blob/master/docs/assets/ows/example_sc.ows and documentation will be made available at https://docs.smartcitizen.me/Data/.

Aire Ciudadano widget

The widget gets data from Aire Ciudadano air quality stations, either the latest records or records filtered by time range.

The output is a table with these columns:

| Field | Description |
|---|---|
| station | Code of the station. |
| date | Date of the record in Year-Month-Day format. |
| time | Time of the record in Hour:Minute:Second format. |
| Latitude | Geographical latitude. |
| Longitude | Geographical longitude. |
| CO2 | Concentration of carbon dioxide in ppm (parts per million). |
| Humidity | Relative humidity in %. |
| InOut | Identifies whether the sensor is located outdoors (InOut = 0) or indoors (InOut = 1). |
| NOx | NOx index (nitrogen oxides), ranging from 1 to 500; only applicable to Sensirion's SEN55 sensor. |
| Noise | Value in dBA (A-weighted decibels). |
| NoisePeak | Peak value in dBA reached in the time range (publication time) in which the sensor publishes its data. |
| PM10 | Particulate matter PM10 in µg/m³. |
| PM25 | Particulate matter PM2.5 in µg/m³. |
| PM252 | Particulate matter PM2.5 in µg/m³, measured by an installed secondary sensor (optional). |
| PM25raw | Particulate matter PM2.5 in µg/m³ without adjustment; only applies to Plantower brand sensors for which the "Plantower PMS adjust RECOMMENDED" function has been activated. |
| Temperature | Temperature in °C. |
| VOC | VOC index (volatile organic compounds), ranging from 1 to 500; only applicable to Sensirion SEN55 and SEN54 sensors. |
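Filtering by time range amounts to combining the `date` and `time` columns into one timestamp and comparing it with the bounds. The sketch below does this on invented rows (the station codes and values are made up) and also uses `InOut` to keep outdoor sensors only. The helper names are hypothetical.

```python
from datetime import datetime
from statistics import mean
from typing import Dict, List


def record_moment(row: Dict) -> datetime:
    """Combine the date (Year-Month-Day) and time (Hour:Minute:Second) columns."""
    return datetime.strptime(f"{row['date']} {row['time']}", "%Y-%m-%d %H:%M:%S")


def filter_rows(rows: List[Dict], start: datetime, end: datetime,
                outdoors_only: bool = False) -> List[Dict]:
    """Keep rows inside [start, end]; optionally keep outdoor sensors (InOut = 0)."""
    selected = []
    for row in rows:
        if not start <= record_moment(row) <= end:
            continue
        if outdoors_only and row["InOut"] != 0:
            continue
        selected.append(row)
    return selected


# Invented sample rows.
rows = [
    {"station": "ST01", "date": "2023-05-01", "time": "08:00:00", "InOut": 0, "PM25": 12.0},
    {"station": "ST01", "date": "2023-05-01", "time": "09:00:00", "InOut": 0, "PM25": 18.0},
    {"station": "ST02", "date": "2023-05-01", "time": "08:30:00", "InOut": 1, "PM25": 7.0},
    {"station": "ST01", "date": "2023-05-02", "time": "08:00:00", "InOut": 0, "PM25": 30.0},
]

selected = filter_rows(rows, datetime(2023, 5, 1), datetime(2023, 5, 1, 23, 59, 59),
                       outdoors_only=True)
mean_pm25 = mean(row["PM25"] for row in selected)
print(len(selected), mean_pm25)
```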

Testing

To run tests locally you need Python 3.8, pip, virtualenv and git installed.

  • Clone the repository and go into the directory:
      git clone https://github.com/eosc-cos4cloud/mecoda-orange.git
      cd mecoda-orange
  • Set up the virtualenv for running tests:
      virtualenv -p `which python3.8` env
      source env/bin/activate
  • Install mecoda-orange:
      pip install -e .
  • Install pytest:
      pip install pytest
  • Run tests from the mecoda-orange directory:
      pytest
  • To run only one test, use:
      pytest -k <name-of-the-test>

Next steps

MECODA is intended to remain an open source repository. It will continue to be maintained, at least as part of another existing repository. A version will be kept in the CSIC GitLab.

License

This repository is under the GPLv3 license. See the license for more details.

Contributors

pynomaly, oscgonfer, danielbernalb
