
tagreader-python


Tagreader is a Python package for reading time series data from the OSIsoft PI and Aspen InfoPlus.21 Information Management Systems (IMS). It is intended to be easy to use and to present interfaces that are as similar as possible across the backend historians.

Installation

You can install tagreader directly into your project from PyPI using pip or another package manager. The only requirement is Python version 3.8 or above.

pip install tagreader

The following are required and will be installed:

  • pandas
  • requests
  • requests-kerberos
  • certifi
  • diskcache

Usage

Tagreader is easy to use both for Equinor-internal IMS services and for external usage. For external usage you simply need to provide the corresponding IMS service URLs and IMSType. See the data source documentation for details.

Usage example

import tagreader
c = tagreader.IMSClient("mysource", "aspenone")  # connect to "mysource" via the AspenTech REST API
print(c.search("tag*"))  # list tags matching a pattern
df = c.read_tags(["tag1", "tag2"], "18.06.2020 08:00:00", "18.06.2020 09:00:00", 60)  # 60 s intervals

Note: you can add a timeout argument to the search method in order to avoid long-running search queries.

Jupyter Notebook Quickstart

Jupyter Notebook examples can be found in /examples. In order to run these examples, you need to install the optional dependencies.

pip install tagreader[notebooks]

The quickstart Jupyter Notebook can be found in the /examples folder.

For more details, see the Tagreader Docs.

Documentation

The full documentation can be found in Tagreader Docs

Contribute

To start contributing, please see Tagreader Docs - Contribute

tagreader-python's People

Contributors

adamzalewski, asgmel03, asmfstatoil, dependabot[bot], einarsi, g-parki, github-actions[bot], johnhansenbouvet, mortendaehli, pyup-bot


tagreader-python's Issues

Missing method for disconnect: how are connections handled?

import getpass
import tagreader
from requests_ntlm import HttpNtlmAuth

user = "mydomain\\" + getpass.getuser()  # double backslash so the quote is not escaped
pwd = getpass.getpass()
auth = HttpNtlmAuth(user, pwd)
c = tagreader.IMSClient(datasource="myplant", url="api.mycompany.com/aspenone", imstype="aspenone", auth=auth, verifySSL=False)
c.connect()

c.disconnect()  # <----- add a method for disconnecting, so lots of open connections are not left behind
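A minimal sketch of what such a disconnect could look like, assuming the client keeps a requests.Session internally (the class and attribute names here are hypothetical, not the actual tagreader API):

```python
import requests


class DisconnectableClient:
    """Hypothetical wrapper illustrating explicit connection cleanup."""

    def __init__(self):
        # tagreader's web handlers use requests under the hood; a real
        # implementation would close the client's own session object.
        self.session = requests.Session()

    def disconnect(self):
        # Closing the session releases pooled keep-alive connections.
        self.session.close()

    # Context-manager support guarantees cleanup even on exceptions.
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.disconnect()
```

With this pattern, `with DisconnectableClient() as c: ...` would leave no open connections behind.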

Support for UNIX (RHEL 7) [enhancement]

Hi,

I would like to use tagreader on UNIX. However, tagreader (2.3.0) accesses the Windows registry. This happens in find_registry_key* and (I think) is used to …. Are these keys also stored on the UNIX machines, so it would be possible to extend tagreader use to UNIX?

All the best, Stefan

Feature request - Tag metadata

Issue added on GitLab by @knudsvik 2018-06-12:

Include pulling of metadata for collected tags. Tag description and units especially could be useful, but maybe range as well?
If made available, these data can be used for column headings, label descriptions, etc.

Failing to install version 4.0.1 on Mac M1 with Python 3.11.2 due to issues with tables

Downloading tables-3.8.0.tar.gz (8.0 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 8.0/8.0 MB 8.8 MB/s eta 0:00:00
Installing build dependencies ... done
Getting requirements to build wheel ... error
error: subprocess-exited-with-error

× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> [10 lines of output]
ld: library not found for -lhdf5
clang: error: linker command failed with exit code 1 (use -v to see invocation)
cpuinfo failed, assuming no CPU features: 'flags'
* Using Python 3.11.2 (main, Mar 8 2023, 10:10:03) [Clang 14.0.0 (clang-1400.0.29.202)]
* Found cython 0.29.35
* USE_PKGCONFIG: True
.. ERROR:: Could not find a local HDF5 installation.
You may need to explicitly state where your local HDF5 headers and
library can be found by setting the HDF5_DIR environment
variable or by using the --hdf5 command-line option.
[end of output]

Handle duplicate tags better

Make sure tags are not repeated in the query, and/or report non-unique tags. Something like: tags = list(dict.fromkeys(tags))
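A sketch of the suggested de-duplication that also reports which tags were repeated (the helper name is made up for illustration):

```python
def dedupe_tags(tags):
    """Return (unique tags in original order, tags that were repeated)."""
    unique = list(dict.fromkeys(tags))  # dict keys preserve insertion order
    seen = set()
    # A tag lands in `repeated` each time it occurs after its first occurrence.
    repeated = [t for t in tags if t in seen or seen.add(t)]
    return unique, repeated
```

This way the query can proceed with unique tags while warning the caller about the duplicates.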

Cache issues when reading missing data

When reading missing values followed by reading valid values, the following error occurs during cache store:
ValueError: invalid combination of [values_axes] on appending data [name->values_block_0,cname->values_block_0,dtype->float64,kind->float,shape->(1, 721)] vs current table [name->values_block_0,cname->values_block_0,dtype->bytes24,kind->string,shape->None]

When reading valid values followed by reading missing values, the following error occurs during cache store:
TypeError: '>' not supported between instances of 'NoneType' and 'int'

When reading missing values twice in a row, the original data are None while the data read from cache are NaN.

It seems as if at least the ODBC based handlers return None instead of NaN, and this is handled in an unwanted way by the cache. Probably better to convert None to np.nan before storing to cache and returning results.
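That conversion could look like the following sketch, assuming results arrive as a DataFrame with a "Value" column (illustrative only, not the actual cache code):

```python
import pandas as pd


def sanitize_for_cache(df: pd.DataFrame) -> pd.DataFrame:
    # pd.to_numeric turns None (and unparsable values) into NaN, giving a
    # stable float64 dtype so the cache schema never flips to string.
    out = df.copy()
    out["Value"] = pd.to_numeric(out["Value"], errors="coerce")
    return out
```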

get_status True and missing tags

In the case where a tag (e.g. "tag_id") does not exist, tagreader returns a dataframe with a column named "tag_id" and numpy NaN values. When get_status=True and a tag does not exist, tagreader returns the "tag_id" column with NaN values, but not a "tag_id::status" column with NaN values.

This could of course be handled by the code calling tagreader, but I think it makes sense for tagreader to return "tag_id::status" with nan values since this column is expected when get_status=True is used and a missing tag is passed to tagreader.

Data format returned from .read function

The .read function (tags, start_time, end_time, time_interval etc.) seems to return the data as a list, one element per tag, with all information. When I want to write the data to a .csv file, I would like to convert it to a format that is easier to work with, e.g. a "time" column with the timestamps and one column per tag name with the tag values. Any tips?
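If each list element is a pandas DataFrame indexed by timestamp with one column per tag, the pieces can be combined and written to CSV like this. This is a sketch under that assumption; the exact return type depends on the tagreader version:

```python
import pandas as pd


def frames_to_csv(frames, path):
    """Combine per-tag frames side by side and write a CSV with a time column."""
    combined = pd.concat(frames, axis=1)  # align frames on the shared time index
    combined.index.name = "time"
    combined.to_csv(path)  # the index becomes the "time" column in the file
    return combined
```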

Reading non-numeric tags returns NaN

When reading tags that are not of numeric type, the value is converted to NaN. This is due to line 489 of web_handlers.py and can be resolved by simply removing the line. The line attempts to convert the value to a numeric type, but it really should first check that the tag is numeric in the first place, or allow the user to specify which tags are numeric.

# Ensure non-numericals like "1.#QNAN" are returned as NaN
df["Value"] = pd.to_numeric(df.Value, errors="coerce")

https://github.com/equinor/tagreader-python/blob/master/tagreader/web_handlers.py
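One less destructive approach, sketched below, is to coerce to numeric but keep the original value wherever coercion would turn a genuine string into NaN (this is a suggestion, not the actual patch):

```python
import pandas as pd


def coerce_numeric_preserving_strings(values: pd.Series) -> pd.Series:
    coerced = pd.to_numeric(values, errors="coerce")
    # Keep the coerced value where conversion succeeded or the original was
    # already missing; otherwise fall back to the original string value.
    return coerced.where(coerced.notna() | values.isna(), values)
```

Non-numericals like "1.#QNAN" are still coerced sensibly only if they are genuinely missing; real string tags survive untouched.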

Crash when using piwebapi

Searching for a tag works fine with imstype='pi'; with imstype='piwebapi' it crashes.
Ref conversation on Teams.

Fix PyPI publish pipeline

The current pipeline has two issues:

  1. It does not trigger automatically, because release-please (bot) can't trigger another pipeline. This is a deliberate GitHub Actions feature to avoid infinite loops of recursively triggered pipelines.
  2. Auth fails on upload to PyPI

SmartCache bombs out for some tags under some circumstances

Ref mail from Juan earlier today:
ValueError: invalid combination of [values_axes] on appending data [name->values_block_0,cname->values_block_0,dtype->float64,kind->float,shape->(1, 10000)] vs current table [name->values_block_0,cname->values_block_0,dtype->bytes24,kind->string,shape->None]

This happens for at least some tags (but not all) using specific query parameters. Other tags using the same parameters work fine. Changing the interval from e.g. 60 seconds to 600 seconds for a tag that fails makes the query work. Seems rather idiosyncratic. Disabling the cache makes the query work.

Data format returned from search function

Hi - it seems that the tag search function returns a list of entities (tag + description). I would like to grab the tags only in an easy way, and then perhaps get units, descriptions etc. later. Would it be possible not to combine tag and description in one list entity, or is there an easy way for users to extract the tag from the return value of the search function?

Searching for "11#35-DENS" on MO-IP21AB causes Requests.JSONDecodeError

Using the following code to search for 11#35-DENS causes Requests.JSONDecodeError.

import os
import datetime
import tagreader


def connect_to_aspen():
    print("Waiting for a response from Aspen")  # translated from Norwegian
    API = "aspenone"
    sources = tagreader.list_sources(API)
    if "MO-IP21AB" not in sources:
        print("Could not connect to MO-IP21AB")  # translated from Norwegian
        os._exit(0)
    else:
        source = "MO-IP21AB"
    client = tagreader.IMSClient(source, API)
    client.connect()
    return client

if __name__ == "__main__":
    pp = "11#35-DENS"
    end = datetime.date.today()
    start = end + datetime.timedelta(days=-30)
    client = connect_to_aspen()
    m = client.search(pp)
    print(m)

Backtrace:

Traceback (most recent call last):
  File "C:\Users\hsysl\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\requests\models.py", line 972, in json
    return complexjson.loads(self.text, **kwargs)
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.3568.0_x64__qbz5n2kfra8p0\lib\json\__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.3568.0_x64__qbz5n2kfra8p0\lib\json\decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.3568.0_x64__qbz5n2kfra8p0\lib\json\decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "f:\tettleiksmeister\main.py", line 24, in <module>
    m = client.search(pp)
  File "C:\Users\hsysl\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\tagreader\clients.py", line 273, in search
    return self.handler.search(tag, desc)
  File "C:\Users\hsysl\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\tagreader\web_handlers.py", line 349, in search
    description = self._get_tag_description(tagname)
  File "C:\Users\hsysl\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\tagreader\web_handlers.py", line 403, in _get_tag_description  
    j = res.json()
  File "C:\Users\hsysl\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\requests\models.py", line 976, in json
    raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)
requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Unify result-length for aggregates among all handlers

Interpolated results contain end-points, so when specifying e.g. start_time 12:00:00 and end_time 13:00:00 with sampling interval 60s, the results are of length 61. This is fine and expected. However, aggregated result lengths are sometimes 60, other times 61, depending on handler. Behavior must be the same, regardless of handler.
PI ODBC: 61
IP.21 ODBC: 60
PI WEB API: 60
AspenOne: 61

The half-interval offset added in get_next_timeslice is probably relevant.
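The interpolated count is simple arithmetic; a sketch of the rule all handlers could be tested against (whether aggregates should also include the end point is exactly the behaviour to unify):

```python
from datetime import datetime


def expected_interpolated_points(start: datetime, end: datetime, interval_s: int) -> int:
    # Interpolated reads include both end points: one sample at every
    # interval boundary from start through end, inclusive.
    return int((end - start).total_seconds()) // interval_s + 1
```

For 12:00:00 to 13:00:00 at 60 s this gives 61, matching the interpolated behaviour described above.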

Reading PI data from AspenOne doesn't seem to work

Ref mail from Juan today: Reading the test tag from AspenOne resulted in value alternating between 8 and nothing with completely wrong tagname. Perhaps an issue with AspenOne reporting PI data on a different format? Not sure this needs looking into since PI data should primarily be handled by PI Web API.

Use diskcache as caching backend

In order to avoid extra dependencies and NIH, we want to use diskcache as the caching backend instead of HDF5. It is a zero-dependency solution that uses SQLite as an ACID backend. This is much safer and easier, plus it has a lot of nice features such as expiry, caching strategies, statistics, etc.

ImportError: cannot import name 'tables' from 'tagreader'

import sys
sys.path.insert(0, "../")
import tagreader
import getpass
import tables

from tagreader.utils import add_statoil_root_certificate
from tagreader import list_sources


add_statoil_root_certificate()

# print(list_sources("aspen"))

c = tagreader.IMSClient("TRB", "aspenone")
c.connect()
# print(c.search(tag="tagname*", desc="**"))

tags = ["some-tag"]
start_time = "01-JAN-2020 08:00:00"
end_time = "10-JAN-2020 08:00:00"
ts = 60

tagToPrint = c.read(tags, start_time, end_time, ts)

print(tagToPrint)

Traceback (most recent call last):
  File "C:\Users\*\Desktop\tagreader-python.py", line 5, in <module>
    import tables
  File "C:\Users\*\AppData\Roaming\Python\Python38\site-packages\tables\__init__.py", line 82, in <module>
    raise ImportError(
ImportError: Could not load any of ['hdf5.dll', 'hdf5dll.dll', 'pytables_hdf5.dll'], please ensure that it is installed in the package folder.

Snorre A (SNA) tags are not found

Hi,

It seems like tagreader cannot find anything from SNA, but it is working completely fine for Snorre B. Yes, I have access to the data.

Add option to return status

Aspen IP.21 ODBC returns a single status value which is

  • 0: Good
  • 1: Suspect
  • 2: Bad
  • 4: Good/M (history has been modified)
  • 5: Suspect/M (history has been modified)
  • 6: Bad/M (history has been modified)

PI ODBC and Web API can return three booleans that may be relevant:

  • Good (or IsGood)
  • Questionable
  • Substituted
    (Also Annotated, but probably not relevant here)

PI ODBC can also return Status, which is documented here. It is somewhat more detailed, but probably unnecessarily so.

Assuming AspenOne can provide the same as IP.21 ODBC, an initial suggestion is to combine the three booleans from PI into a numerical value corresponding to the Aspen IP.21 ODBC nomenclature. Assuming that Good and Questionable are mutually exclusive: Status = Questionable + 2*(1-Good) + 4*Substituted
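The suggested mapping can be written directly (this encodes the proposal above, including its mutual-exclusivity assumption; it is not implemented in tagreader):

```python
def pi_status_to_ip21(good: bool, questionable: bool, substituted: bool) -> int:
    # 0 = Good, 1 = Suspect, 2 = Bad; +4 marks modified history (the /M variants).
    return int(questionable) + 2 * (1 - int(good)) + 4 * int(substituted)
```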

Issue with "old" data when using AspenOne

When trying to pull "old" data with tagreader, there is an issue when using AspenOne. When connecting to the current server, in this case MEL-IMS, everything works fine, but no tags are found when connecting to a different server (MEL-Y08Y14-IMS).
A work-around is to use the ODBC drivers, but I have some issues with those after upgrading to Python 3.9.

See attached images for examples.


Rewrite handling of read results for PI Web API

The code is currently a bit messy and error-prone because it accommodates measurements as both Value and Value.Value, since the latter is used by digital sets and by summary data. However, error codes are also reported as Value.Value, resulting in erroneous interpretation of error codes as actual data. This probably requires a rewrite of the read_tag() result handling.

Ref mail from Juan today.

Cannot clone/install package from GitHub

Hi,
I'm trying to get the most recent version of this package because I'm having the JSON parsing issue. However, when I try to update directly from GitHub, I get:
Failed to clone 'extratests' a second time, aborting
which makes sense, since that folder is a dead link.

Function connect and property isconnected/isavailable

Make it possible to verify that a provided data source is available for connection prior to, or independently of, actually attempting a query.

Our use case can then fail nicely before reading a lot of config files to get the tags.

Wrong interpretation of time strings

fromtime = "2023-01-31 12:01"
totime = "2023-02-10 12:00"
df = c.read(tags, fromtime, totime, 60)

tagreader 3.0.1 with pandas 1.5.0 reads the above correctly. After an update to tagreader 3.0.2 and pandas 2.0.1 it does not. totime is then interpreted as October 2, not February 10.

utils.py issues the following warning:
UserWarning: Parsing dates in %Y-%m-%d %H:%M format when dayfirst=True was specified. Pass dayfirst=False or specify a format to silence this warning.
date_stamp = pd.to_datetime(date_stamp, dayfirst=True)
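Passing an explicit format removes the ambiguity entirely; a pandas-level sketch of the safer parse (not the actual tagreader fix):

```python
import pandas as pd


def parse_timestamp(stamp: str) -> pd.Timestamp:
    # "%Y-%m-%d %H:%M" strings are year-first, so dayfirst must not be
    # applied; an explicit format makes the interpretation unambiguous.
    return pd.to_datetime(stamp, format="%Y-%m-%d %H:%M")
```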

Issues in search function, piwebapi

Lately I get this error message when I search for tags using method piwebapi (not for aspenone), regardless of plant and tags/descriptions. Any suggestions to what might be wrong?


Raw queries Information/Warning

If a query for raw data (read_type=ReaderType.RAW) also specifies the frequency argument, a warning could be thrown to inform the user that the frequency is not used.

Windows binaries for tables not working with Python 3.9 and newer

Collecting tables
Downloading tables-3.6.1.tar.gz (4.6 MB)
|¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦| 4.6 MB 6.4 MB/s
ERROR: Command errored out with exit status 1:
command: 'C:\Users*\AppData\Local\Microsoft\WindowsApps\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\python.exe' -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\\AppData\Local\Temp\pip-install-s96626t1\tables\setup.py'"'"'; file='"'"'C:\Users\\AppData\Local\Temp\pip-install-s96626t1\tables\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' egg_info --egg-base 'C:\Users*\AppData\Local\Temp\pip-pip-egg-info-4d2o5bx_'
cwd: C:\Users*\AppData\Local\Temp\pip-install-s96626t1\tables
Complete output (17 lines):
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\*\AppData\Local\Temp\pip-install-s96626t1\tables\setup.py", line 634, in <module>
    libdir = compiler.has_function(package.target_function,
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.496.0_x64__qbz5n2kfra8p0\lib\distutils\ccompiler.py", line 792, in has_function
    objects = self.compile([fname], include_dirs=include_dirs)
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.496.0_x64__qbz5n2kfra8p0\lib\distutils\_msvccompiler.py", line 323, in compile
    self.initialize()
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.496.0_x64__qbz5n2kfra8p0\lib\distutils\_msvccompiler.py", line 220, in initialize
    vc_env = _get_vc_env(plat_spec)
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.496.0_x64__qbz5n2kfra8p0\lib\site-packages\setuptools\msvc.py", line 314, in msvc14_get_vc_env
    return _msvc14_get_vc_env(plat_spec)
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.496.0_x64__qbz5n2kfra8p0\lib\site-packages\setuptools\msvc.py", line 268, in _msvc14_get_vc_env
    raise distutils.errors.DistutilsPlatformError(
distutils.errors.DistutilsPlatformError: Microsoft Visual C++ 14.0 is required. Get it with "Build Tools for Visual Studio": https://visualstudio.microsoft.com/downloads/
* Using Python 3.9.1 (tags/v3.9.1:1e5d33e, Dec 7 2020, 17:08:21) [MSC v.1927 64 bit (AMD64)]
* USE_PKGCONFIG: False
----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
