
ngsi-timeseries-api's People

Contributors

c0c0n3, chicco785, daminichopra, dependabot[bot], fisuda, iamarnavgarg, jason-fox, juliozinga, keshavsoni2511, modulartaco, nec-vishal, necravisaketi, ohylli, pooja1pathak, taliaga, wistefan


ngsi-timeseries-api's Issues

Suggestions for refactoring

Looking at the tests, I would expect a main tests folder containing a test folder for each module, rather than the other way around. But that is probably just a matter of taste.

I would also move the module sources into a src folder, so as to better structure the repository.

For the client, we already have a full Python client (https://github.com/smartsdk/ngsi-sdk-python), so there was no need to develop one just for the purpose of the tests: eat your own dog food.

It also looks odd that the folder is named client when it is an Orion client, not the QuantumLeap one.

It is equally odd to have certain files in the root:
conftest.py
run.sh (this runs the tests, right? the name does not suggest that)

I also suggest having a docker folder that includes the Dockerfile, but also a default example composition; more complex ones can live in the recipes repository.
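
For illustration, a possible layout along these lines (folder and file names are just a suggestion, using module names that already exist in the repository):

ngsi-timeseries-api/
├── src/
│   ├── reporter/
│   └── translators/
├── tests/
│   ├── conftest.py
│   ├── reporter/
│   └── translators/
└── docker/
    ├── Dockerfile
    └── docker-compose.yml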

[QUERY] Integration with InfluxDB

Hello!

I have seen that there is the possibility of adding other databases, such as InfluxDB, and we would like to know whether you have implemented the integration with that database or, if not, what the state of that possible implementation is.

Regards.

Implement a name Sanitizer for attribute/column names

Take into account not only Crate's naming restrictions, but also have a look at Orion's naming restrictions. And what if the db changes in the future and the restrictions are different?

'-' is an example of an invalid character in attribute names. For the test case:

{
  "id": "Impeller_Nueva_pieza_1.dmo",
  "type": "MeasurementResult",
  "FA@SP-00": {
    "type": "StructuredValue",
    "value": [0, 0, 44.3366, 0, 0, 1],
    "metadata": {
      "featureType": {"type": "Text", "value": "POINT,CART"}
    }
  }
}

Note the reserved words, which cannot be used for schemas, tables or columns:
https://crate.io/docs/crate/reference/en/latest/sql/general/lexical-structure.html#sql-lexical
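
A minimal sketch of what such a sanitizer could look like (the replacement character, the reserved-word subset and the function name are illustrative assumptions, not the project's actual code):

import re

# Illustrative subset of CrateDB reserved words; the full list is in the
# lexical-structure docs linked above.
RESERVED_WORDS = {"all", "alter", "and", "any", "array", "as", "by", "table"}

# Keep only letters, digits and underscores in identifiers.
INVALID_CHARS = re.compile(r"[^a-z0-9_]")

def sanitize(name):
    """Turn an NGSI attribute name into a valid CrateDB column name."""
    sanitized = INVALID_CHARS.sub("_", name.lower())
    if sanitized[0].isdigit():
        # Identifiers cannot start with a digit.
        sanitized = "_" + sanitized
    if sanitized in RESERVED_WORDS:
        # Avoid clashing with SQL keywords.
        sanitized += "_"
    return sanitized

assert sanitize("FA@SP-00") == "fa_sp_00"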

ql as context broker datasource

The context broker supports "registrations" as a way to register external data sources for entities. This means that, when queried, the context broker will answer by querying the external data source.
Ideally, QL could act as a data source returning the last value of a given entity. This could be particularly interesting in combination with #101.
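
For reference, an NGSIv2 registration pointing Orion at a QL instance might look roughly like this (URLs, entity type and attribute are made up for illustration):

import requests

# Hypothetical registration: Orion would forward queries for Room entities
# to the provider URL instead of answering from its own database.
registration = {
    "description": "QuantumLeap as a data source for Room entities",
    "dataProvided": {
        "entities": [{"idPattern": ".*", "type": "Room"}],
        "attrs": ["temperature"]
    },
    "provider": {
        "http": {"url": "http://quantumleap:8668"}
    }
}

response = requests.post("http://orion:1026/v2/registrations", json=registration)
response.raise_for_status()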

Array parameters

It seems that QuantumLeap struggles with array data. I have noticed that Orion receives the data, but it is never stored as historical data on the QuantumLeap side.

This is an example of the JSON sent:

{
  "id": "smartphone-9845C",
  "type": "Device",
  "category": {
    "value": "smartphone"
  },
  "osVersion": {
    "value": "Android 4.0"
  },
  "softwareVersion": {
    "value": "MA-Test 1.6"
  },
  "hardwareVersion": {
    "value": "GP-P9872"
  },
  "firmwareVersion": {
    "value": "SM-A310F"
  },
  "consistOf": {
    "value": [
      "sensor-9845A",
      "sensor-9845B",
      "sensor-9845C"
    ]
  },
  "refDeviceModel": {
    "value": "myDevice-345"
  },
  "dateCreated": {
    "value": "2016-08-22T10:18:16Z"
  }
}

This is the error I found when checking QuantumLeap's log:
crate.client.exceptions.ProgrammingError: SQLActionException[ColumnValidationException: Validation failed for refdevice: ['device-9845A', 'device-9845B', 'device-9845C'] cannot be cast to type object]
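
The failure suggests the translator infers a plain object type for list values. A rough sketch of how array values could instead be mapped onto CrateDB array column types (function and mapping are illustrative assumptions, not the project's actual translator code):

# Hypothetical sketch of NGSI-to-CrateDB type inference for arrays.
NGSI_TO_CRATE = {
    bool: "boolean",
    int: "long",
    float: "float",
    str: "string",
}

def crate_type_for(value):
    """Infer a CrateDB column type for an NGSI attribute value."""
    if isinstance(value, list):
        # Assume homogeneous arrays; map them to array(<element type>).
        element_type = crate_type_for(value[0]) if value else "string"
        return "array({})".format(element_type)
    if isinstance(value, dict):
        return "object"
    return NGSI_TO_CRATE.get(type(value), "string")

assert crate_type_for(["sensor-9845A", "sensor-9845B"]) == "array(string)"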

Fix doc

  • Remove Subscription comments

Flaky test: test_not_found

test_not_found in reporter/tests/test_1T1E1A.py fails from time to time, usually in a complete suite run. More details in the logs at https://travis-ci.org/smartsdk/ngsi-timeseries-api/builds/409226234?utm_source=github_status&utm_medium=notification

________________________________ test_not_found ________________________________
    def test_not_found():
        query_params = {
            'type': entity_type,
        }
        r = requests.get(query_url(), params=query_params)
>       assert r.status_code == 404, r.text
E       AssertionError: {
E           "detail": "The server encountered an internal error and was unable to complete your request.  Either the server is overloaded or there is an error in the application.",
E           "status": 500,
E           "title": "Internal Server Error",
E           "type": "about:blank"
E         }
E         
E       assert 500 == 404
E        +  where 500 = <Response [500]>.status_code
reporter/tests/test_1T1E1A.py:265: AssertionError
===================== 1 failed, 51 passed in 51.40 seconds =====================

Add status endpoint

Add some form of status endpoint that can be used in liveness/readiness probes and that reports on the status of complementary services (Crate, and possibly Grafana and Redis).
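
A minimal sketch of what such an endpoint could look like with Flask and the Crate client (the route name, Crate endpoint and set of checked services are assumptions, not a settled design):

from crate import client
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/health")
def health():
    """Report whether the backing services are usable."""
    try:
        conn = client.connect("http://crate:4200")  # assumed Crate endpoint
        conn.cursor().execute("SELECT 1")
        return jsonify({"status": "pass"})
    except Exception as e:
        # Crate unreachable: report unhealthy so probes can act on it.
        return jsonify({"status": "fail", "detail": str(e)}), 503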

Attribute names lose capitalization

Crate has some naming restrictions (https://crate.io/docs/crate/reference/en/latest/sql/ddl/basics.html#naming-restrictions) that force table and column names to be lowercase.

We need to persist the original entity and attribute names (the skeleton of the entity) somewhere, so that this information can be used when reconstructing the entities in replies to data queries.
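
One possible approach (only a sketch under assumed names, not a committed design) is a small metadata table keeping the original spellings alongside the lowercased ones:

from crate import client

conn = client.connect("http://crate:4200")  # assumed endpoint
cursor = conn.cursor()

# Side table keeping the original spellings next to the lowercased names,
# e.g. original_names = {"osversion": "osVersion", ...}
cursor.execute("""
    CREATE TABLE IF NOT EXISTS md_entity_metadata (
        table_name STRING PRIMARY KEY,
        original_names OBJECT
    )
""")

def restore_names(table_name, row):
    """Map lowercased column names back to the original attribute names."""
    cursor.execute(
        "SELECT original_names FROM md_entity_metadata WHERE table_name = ?",
        (table_name,))
    result = cursor.fetchone()
    originals = result[0] if result else {}
    return {originals.get(name, name): value for name, value in row.items()}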

JSON size on QL component

This error pops up when I POST a long JSON structure.

ERROR in app: Exception on /notify [POST]
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1982, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1614, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1517, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python3.6/site-packages/flask/_compat.py", line 33, in reraise
    raise value
  File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1612, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1598, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "reporter/reporter.py", line 110, in notify
    trans.insert([payload])
  File "/src/ngsi-timeseries-api/translators/crate.py", line 146, in insert
    self.cursor.executemany(stmt, entries)
  File "/usr/local/lib/python3.6/site-packages/crate/client/cursor.py", line 67, in executemany
    self.execute(sql, bulk_parameters=seq_of_parameters)
  File "/usr/local/lib/python3.6/site-packages/crate/client/cursor.py", line 54, in execute
    bulk_parameters)
  File "/usr/local/lib/python3.6/site-packages/crate/client/http.py", line 304, in sql
    content = self._json_request('POST', self.path, data=data)
  File "/usr/local/lib/python3.6/site-packages/crate/client/http.py", line 416, in _json_request
    _raise_for_status(response)
  File "/usr/local/lib/python3.6/site-packages/crate/client/http.py", line 170, in _raise_for_status
    error_trace=error_trace)
crate.client.exceptions.ProgrammingError: IllegalArgumentException[Document contains at least one immense term in field="value" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped.  Please correct the analyzer to not produce such terms.  The prefix of the first immense term is: '[45, 49, 46, 48, 50, 57, 54, 44, 52, 46, 54, 52, 54, 53, 53, 44, 45, 49, 46, 54, 51, 50, 53, 52, 44, 50, 48, 49, 55, 45]...', original message: bytes can be at most 32766 in length; got 94981]; nested: MaxBytesLengthExceededException[bytes can be at most 32766 in length; got 94981]; 

Please note that the Orion Context Broker does not have any problem handling such long structures. The error was retrieved from the QuantumLeap component.

Would it be possible to extend the size capacity of the component to handle such long payloads? I personally need it to deal with accelerometer data.
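
The error comes from the 32766-byte limit on indexed string terms, so one candidate workaround (an untested assumption, with made-up table and column names) is to create the offending column with indexing disabled:

from crate import client

conn = client.connect("http://crate:4200")  # assumed endpoint

# Store long strings without indexing them, so the 32766-byte limit on
# indexed terms no longer applies.
conn.cursor().execute("""
    CREATE TABLE IF NOT EXISTS et_device (
        entity_id STRING,
        time_index TIMESTAMP,
        "value" STRING INDEX OFF
    )
""")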

geo:json for everyone

As suggested by @chicco785, it would be interesting to have, for entities expressing location in a non-GeoJSON format (say, with country codes and addresses), an under-the-hood transformation to an equivalent GeoJSON structure, so that geo-queries can be made across different entities expressing locations in different ways.

I am opening this issue to keep track of the discussion of this idea.
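
A rough sketch of the idea using geopy's Nominatim geocoder (the library choice and the function are illustrative assumptions):

from geopy.geocoders import Nominatim

geocoder = Nominatim(user_agent="quantumleap-sketch")

def address_to_geojson(address):
    """Turn a street address into an equivalent GeoJSON Point, if possible."""
    location = geocoder.geocode(address)
    if location is None:
        return None
    # GeoJSON uses (longitude, latitude) ordering.
    return {"type": "Point",
            "coordinates": [location.longitude, location.latitude]}

print(address_to_geojson("Bonn, Germany"))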

OSM geocoder may not return expected osm_type

Since 19 Nov 2018, the following geocoding tests fail:

  • test_entity_add_point: OSM used to return an osm_type of node for the address used in this test, but it now returns an osm_type of way, which results in our geocoding module not being able to extract the geo-location coords from the response, since we expect node for an address in the format <street name> <street number>, <city>, <country code>.
  • test_caching: this fails because we use the same address as in test_entity_add_point, so we can't extract the location from the OSM response and hence it won't be in the cache.

It could be that the data in OSM for that location got corrupted, or that a new version of Nominatim (the geocoder) went live and the API behaviour changed. Or it could be something else. But note that the other full-blown address we use in our tests actually works fine, i.e. we get back an osm_type of node in the response, so we manage to extract the location in that case.

On another note, it doesn't look like the OSM project is in good shape!

We have to decide how to fix this...
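
One possible fix, assuming Nominatim keeps returning usable lat/lon (centroid) for way results, is to stop requiring osm_type == node and accept any result that carries coordinates. A sketch:

# Hypothetical fix: accept both node and way results.
ACCEPTED_OSM_TYPES = {"node", "way"}

def extract_coords(osm_result):
    """Pull (lon, lat) out of a Nominatim result dict, if usable."""
    if osm_result.get("osm_type") not in ACCEPTED_OSM_TYPES:
        return None
    if "lat" not in osm_result or "lon" not in osm_result:
        return None
    return float(osm_result["lon"]), float(osm_result["lat"])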

Query api follow-ups

  • Better treatment for missing "type" param in the query
  • Add test to make sure a reasonable response is returned when no entity matches any of the supported queries

Support multi-tenancy using service path concept of context broker

When we deal with different scenarios that share a common infrastructure but should not share access to data, we need a way to isolate the data.
A simple solution, like the one used by the context broker with MongoDB, is to have one database per tenant (identified by the service path in the request header).
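
In CrateDB terms, this could map the tenant header to a schema name, roughly like this (the header handling and the naming scheme are assumptions for illustration):

import re

from flask import request

def tenant_schema(default="doc"):
    """Derive a CrateDB schema name from the Fiware-Service header."""
    service = request.headers.get("Fiware-Service", "")
    if not service:
        return default
    # Schema names must be lowercase and free of special characters.
    return "mt" + re.sub(r"[^a-z0-9_]", "_", service.lower())

# Data for tenant "smartcity" would then live in, e.g., mtsmartcity.et_device.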

Support a default way in the entity data model to track entity instances ownership.

Scenario:

  1. Keep track of who injected a certain entity instance. This should be a "compulsory" attribute in the data model; it could be the dataProvider field defined in the GSMA Commons:
      "type": "object",
      "properties": {
        "id": {
          "$ref": "#/definitions/EntityIdentifierType"
        },
        "dateCreated": {
          "type": "string",
          "format": "date-time"
        },
        "dateModified": {
          "type": "string",
          "format": "date-time"
        },
        "source": {
          "type": "string"
        },
        "name": {
          "type": "string"
        },
        "alternateName": {
           "type": "string" 
        },
        "description": {
           "type": "string" 
        },
        "dataProvider": {
          "type": "string"
        }
      }
    },
  2. If a model is injected that does not have the field, raise a warning (see the sketch after this list).
  3. Based on an external authz mechanism, allow a user to access a given entity instance based on the policy defined by the data provider. This should work both via the NGSI API and the CrateDB API (see #17).
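
A minimal sketch of the warning in item 2 (the check itself and the logger setup are just an illustration; the attribute name follows the schema above):

import logging

logger = logging.getLogger(__name__)

def check_ownership(entity):
    """Warn when an injected entity carries no dataProvider attribute."""
    if "dataProvider" not in entity:
        logger.warning(
            "Entity %s/%s has no dataProvider: ownership cannot be tracked.",
            entity.get("type"), entity.get("id"))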

Food4Thought: Complementing Analytics Service?

In huge queries, CrateDB limits the response to the first 100 entries.
What if we supported a response of 100 entries, but evenly distributed across the specified query range?

  • How to do the regression so that the user can see the overall trend without having to ingest the whole dataset.

  • It'd be nice to "repeat" the zooming query across different attributes.

It might be that the approach is to have a complementary "data analytics" microservice taking care of this.
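
As a sketch of the "evenly distributed" idea, one could bucket the time range and aggregate per bucket instead of taking the first 100 raw rows (table and column names are made-up assumptions):

from crate import client

conn = client.connect("http://crate:4200")  # assumed endpoint
cursor = conn.cursor()

# One averaged point per hourly bucket, capped at 100 buckets.
cursor.execute("""
    SELECT date_trunc('hour', time_index) AS bucket,
           avg(temperature) AS avg_temperature
    FROM et_device
    WHERE time_index BETWEEN ? AND ?
    GROUP BY date_trunc('hour', time_index)
    ORDER BY bucket
    LIMIT 100
""", (1514764800000, 1546300799000))  # epoch millis for the query range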

Authentication / Authorisation Proxy

May not be a "direct" issue related to QuantumLeap, but a side one.
When accessing from Grafana (assuming we use the existing driver), there should be a way to provide some sort of access control on tables and on rows inside a table.

Handle aggregation on invalid columns

sum and avg should not be allowed on attr_names (attributes) of non-numeric types.

Note that min, max and count still work on things like bool or string.
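
A small sketch of the validation this implies (the type names follow CrateDB, but the mapping itself is an assumption):

NUMERIC_TYPES = {"short", "integer", "long", "float", "double"}
NUMERIC_ONLY_METHODS = {"sum", "avg"}

def validate_aggregation(aggr_method, column_type):
    """Raise if the aggregation method cannot apply to the column type."""
    if aggr_method in NUMERIC_ONLY_METHODS and column_type not in NUMERIC_TYPES:
        raise ValueError(
            "{} is not supported on columns of type {}".format(
                aggr_method, column_type))

validate_aggregation("max", "string")  # fine: min/max/count work on strings

try:
    validate_aggregation("avg", "boolean")
except ValueError as e:
    print(e)  # avg is not supported on columns of type boolean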

private word data into structure

The error below pops up when I include the word "data" as part of the attributes. Not sure whether this is a case similar to the previously mentioned issues (i.e., the one about "dateModified"). I mention this just to let you know that it might be an issue in further developments; I have already worked around it by using a different word.

ERROR in app: Exception on /notify [POST]
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1982, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1614, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1517, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python3.6/site-packages/flask/_compat.py", line 33, in reraise
    raise value
  File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1612, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1598, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "reporter/reporter.py", line 89, in notify
    assert len(payload) == 1, 'Multiple data elements in notifications not supported yet'
AssertionError: Multiple data elements in notifications not supported yet

pin pipenv version in docker file

The latest version of pipenv (2018.11.14 on PyPI) doesn't seem to be able to generate a requirements file when the output is redirected to a file, which is the way we have it in our Dockerfile. In fact, running

pipenv lock -r > requirements.txt

with version 2018.11.14 when building our Docker image produces a requirements.txt containing the text " Running.. ". So we should pin the version to the one we tested, i.e. 2018.10.13.
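
The pin itself would presumably be a one-line change in the Dockerfile, along these lines (the surrounding instructions depend on what our Dockerfile actually looks like):

RUN pip install pipenv==2018.10.13
RUN pipenv lock -r > requirements.txt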

Support multiple endpoints for crate

Suppose you have a Crate cluster in which each node exposes a different endpoint; it would be useful to have some sort of round-robin process that connects to the different hosts in case one of them is not reachable.
This may not be an issue in Docker Swarm where, hopefully, using healthchecks, the swarm removes an endpoint from the "discovery" feature if it is not healthy.
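
For what it's worth, the Crate Python client already accepts a list of servers and rotates through them, temporarily skipping ones that fail, so the fix may be as small as passing all the cluster endpoints through (hostnames are illustrative):

from crate import client

# The client round-robins across the configured servers, giving basic
# failover when one node is unreachable.
conn = client.connect(
    ["crate-node-1:4200", "crate-node-2:4200", "crate-node-3:4200"])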

timestamps

When I include this attribute in the JSON structure:

"dateModified": {
    "value": "2017-01-18T20:45:42.697Z-0800"
  }

This error emerges:

crate.client.exceptions.ProgrammingError: SQLActionException[ColumnValidationException: Validation failed for time_index: {"metadata"={}, "type"='Text', "value"='2017-08-26T21:43:33.00Z'} cannot be cast to type timestamp]
207.249.127.152 - - [28/Dec/2017 20:30:54] "POST /notify HTTP/1.1" 500 -
INFO:werkzeug:207.249.127.152 - - [28/Dec/2017 20:30:54] "POST /notify HTTP/1.1" 500 -
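
Two things stand out: the reported value mixes a Z suffix with a numeric offset, which is not valid ISO 8601, and the error shows the whole attribute object (type, value and metadata) being cast to a timestamp rather than just its value. A sketch of extracting and validating the value first (assuming python-dateutil as the parser; not the project's actual code):

from dateutil.parser import parse

def to_time_index(attr):
    """Extract and validate the timestamp inside a dateModified attribute."""
    # The attribute arrives as {"type": ..., "value": ..., "metadata": ...};
    # only the value can be cast to a timestamp, not the whole object.
    return parse(attr["value"])  # raises ValueError on malformed dates

print(to_time_index({"type": "Text", "value": "2017-08-26T21:43:33.00Z",
                     "metadata": {}}))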

Provide basic documentation

  1. How to deploy using docker (provide a basic stack made of context broker, quantum leap, crate, grafana; see the sketch after this list)
  2. How to send notifications from context broker to quantum leap
  3. How to connect grafana to crate
  4. How to run queries in crate / against crate (using the http endpoint)
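
As a rough starting point for item 1, a minimal composition could look like this (image tags, ports and the CRATE_HOST variable are assumptions to check against the actual images' documentation):

version: "3"
services:
  mongo:
    image: mongo:3.6
  orion:
    image: fiware/orion
    command: -dbhost mongo
    ports:
      - "1026:1026"
  crate:
    image: crate
    ports:
      - "4200:4200"
  quantumleap:
    image: smartsdk/quantumleap
    environment:
      - CRATE_HOST=crate
    ports:
      - "8668:8668"
  grafana:
    image: grafana/grafana
    ports:
      - "3000:3000"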

virtual entity support

Ideally, we should be able to create "virtual entities" from existing entities. These virtual entities should ideally be created as "views" over the database.

For example, suppose we have a collection of parking sensors. You would like to have aggregated information by parking lot (assuming all parking sensors carry such information, i.e. the lot to which they belong).

Ideally this could be done by having an endpoint that allows creating "virtual entities" rendered via a database view. Of course, the view would have to be "clever" enough.

In fact, computing "a time series" view (made of different entities with different timestamps) is not that easy.

Ideas?
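
As one starting point for the parking example, a plain database view could look like the sketch below (CrateDB has supported views since 3.0; table, column and view names are made up). The harder time-series alignment problem mentioned above is left open:

from crate import client

conn = client.connect("http://crate:4200")  # assumed endpoint

# Aggregate parking-spot readings by lot and hourly bucket.
conn.cursor().execute("""
    CREATE VIEW parking_lot_occupancy AS
    SELECT lot_id,
           date_trunc('hour', time_index) AS bucket,
           avg(CASE WHEN occupied THEN 1 ELSE 0 END) AS occupancy_rate
    FROM et_parkingspot
    GROUP BY lot_id, date_trunc('hour', time_index)
""")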

Delete all entities should drop table.

Delete all entities does not seem to drop the table, just all the records. Check why, and try to force the table drop, as this will allow "starting from scratch" with entity_type foo.

support keyValue mode

I suppose that, using a parser, it should also be possible to support the keyValues mode.
Orion supports it, so there is no reason why QL should not be able to do so.
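
A sketch of such a parser, turning a keyValues payload back into the normalized form QL already understands (the type-inference rules here are simplistic assumptions):

def infer_type(value):
    if isinstance(value, bool):
        return "Boolean"
    if isinstance(value, (int, float)):
        return "Number"
    if isinstance(value, (dict, list)):
        return "StructuredValue"
    return "Text"

def normalize(entity):
    """Wrap plain keyValues attributes into {"type": ..., "value": ...}."""
    normalized = {"id": entity["id"], "type": entity["type"]}
    for name, value in entity.items():
        if name not in ("id", "type"):
            normalized[name] = {"type": infer_type(value), "value": value}
    return normalized

assert normalize({"id": "Room1", "type": "Room", "temperature": 23.5}) == {
    "id": "Room1", "type": "Room",
    "temperature": {"type": "Number", "value": 23.5}}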

geographical queries

AFAIK, while CrateDB supports geo queries quite well, there is currently no way to issue them through the API.

See geographical queries in the spec:

and CrateDB's support:

The following should be easy to implement:

  • georel=intersects. Denotes that matching entities are those intersecting with the reference geometry.
    • maps to MATCH (column_ident, query_term) using intersects
  • georel=coveredBy. Denotes that matching entities are those that exist entirely within the reference geometry. When resolving a query of this type, the border of the shape must be considered to be part of the shape.
    • maps to MATCH (column_ident, query_term) using within

A bit more complex:

  • georel=near. The near relationship means that matching entities must be located within a certain threshold distance of the reference geometry. It supports the following modifiers: maxDistance, which expresses, in meters, the maximum distance at which matching entities must be located; and minDistance, which expresses, in meters, the minimum distance at which matching entities must be located.

Still, it seems there are quite a few hints here on how to solve that:
https://crate.io/a/geospatial-queries-with-crate-data/
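
A sketch of the easy mappings above (query construction only; parameter parsing and escaping are left out, and names are assumptions):

# Hypothetical translation of NGSI georel values into CrateDB MATCH clauses.
GEOREL_TO_MATCH = {
    "intersects": "intersects",
    "coveredBy": "within",
}

def geo_clause(georel, column, wkt_shape):
    """Build the WHERE fragment for a supported geo relationship."""
    if georel not in GEOREL_TO_MATCH:
        raise ValueError("unsupported georel: {}".format(georel))
    return "MATCH (\"{}\", '{}') USING {}".format(
        column, wkt_shape, GEOREL_TO_MATCH[georel])

print(geo_clause("coveredBy", "location",
                 "POLYGON ((5 5, 10 5, 10 10, 5 10, 5 5))"))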

Support retention policy

How long should we keep the data of a given entity, and at which resolution?
E.g. after 1 year, I may be happy to keep, instead of all the data, only an interpolation of the data at a 1h resolution or so.
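
A sketch of the simplest half of this, dropping raw data older than a cutoff (table name and retention window are made-up assumptions; downsampling the old data before deleting it would be the harder, second half):

import time

from crate import client

ONE_YEAR_MS = 365 * 24 * 3600 * 1000

conn = client.connect("http://crate:4200")  # assumed endpoint
cutoff = int(time.time() * 1000) - ONE_YEAR_MS

# Run periodically, e.g. from a cron job.
conn.cursor().execute(
    "DELETE FROM et_device WHERE time_index < ?", (cutoff,))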

Define and document policy for missing values

Data injectors often send a "null" or empty value for some attributes.

At the moment QL discards received notifications with this kind of "null" input in any of the attributes, but we could implement something more flexible so as not to lose the rest of the incoming valid data.

Define a policy for dealing with these cases.
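
One candidate policy (just a sketch of the "more flexible" option, not a decision): drop the null/empty attributes but keep the rest of the notification.

def strip_missing(entity):
    """Keep id, type and every attribute that carries a usable value."""
    kept = {}
    for name, attr in entity.items():
        if name in ("id", "type"):
            kept[name] = attr
        elif isinstance(attr, dict) and attr.get("value") not in (None, ""):
            kept[name] = attr
    return kept

assert strip_missing({
    "id": "Room1", "type": "Room",
    "temperature": {"type": "Number", "value": 23.5},
    "humidity": {"type": "Number", "value": None},
}) == {"id": "Room1", "type": "Room",
       "temperature": {"type": "Number", "value": 23.5}}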

Problems with aggrPeriod attribute

I'm trying to use the aggrPeriod parameter in a query to /entities/../attrs/...? and I do not know whether it does not work or whether I have the wrong idea about it.

In my opinion, if I add the parameter aggrPeriod=minute together with the method aggrMethod=avg, the query should return the average of the values, minute by minute.

Is this what should happen?
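
For reference, this is the kind of request I mean (entity and attribute names are made up):

GET /v2/entities/Room1/attrs/temperature?aggrMethod=avg&aggrPeriod=minute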
