
direct-access-py's Introduction

Hi there 👋

direct-access-py's People

Contributors

adamchainz · magerton · wchatx


direct-access-py's Issues

Add to_dataframe method on V2 class

Many users of this module are also using Pandas to create dataframes from their API queries. This issue covers the requirements for providing a method that generates the dataframe for them.

Using the DDL feature of the V2 endpoints allows creation of precise dtypes for dataframes.
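A minimal sketch of the idea, assuming pandas is installed and that the client exposes the DDL feature as a ddl(dataset, database) helper returning CREATE TABLE text; the DDL parsing and the type map below are illustrative only, not a proposed implementation:

import re

import pandas as pd
from directaccess import DirectAccessV2

d2 = DirectAccessV2(api_key='...', client_id='...', client_secret='...')

# Crude, illustrative parse of the DDL text into a column -> dtype map.
ddl_text = d2.ddl('rigs', database='pg')
type_map = {'integer': 'Int64', 'bigint': 'Int64', 'numeric': 'float64',
            'timestamp': 'datetime64[ns]', 'text': 'object'}
dtypes = {}
for col, sql_type in re.findall(r'^\s*"?(\w+)"?\s+(\w+)', ddl_text, re.M):
    dtypes[col] = type_map.get(sql_type.lower(), 'object')

# Materialize the query generator, then apply the derived dtypes.
df = pd.DataFrame.from_records(d2.query('rigs', deleteddate='null'))
df = df.astype({c: t for c, t in dtypes.items() if c in df.columns},
               errors='ignore')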

Still receiving timeout error

Using the following code:

prod_list = []
print('Begin DI Pull')
for ID in GroupedEntityIds:
    for row in d2.query(
            'producing-entity-details',
            fields='ApiNo,EntityId,Gas,Liq,ProdDate,ProdMonthNo',
            EntityId='in({})'.format(','.join([str(x) for x in ID])),
            DeletedDate='null',
            proddate='gt(2009-12-01)',
            pagesize=10000):
        prod_list.append(row)

Error:

RetryError: HTTPSConnectionPool(host='di-api.drillinginfo.com', port=443): Max retries exceeded with url: /v2/direct-access/producing-entity-details?
...
(Caused by ResponseError('too many 503 error responses'))

Unable to install directaccess package in FME environment

Hi,

FME has a PythonCaller module for Python integration. I tried to install the directaccess package in the FME environment with the following command:

fme.exe python -m pip install directaccess

I get the following error messages:


Collecting directaccess
Using cached https://files.pythonhosted.org/packages/c7/b2/3bb51148af50f4aeda5ced745224317357cbaa9ea11cb3ea0995eea69a69/directaccess-1.4.0-py2.py3-none-any.whl
Collecting unicodecsv==0.14.1 (from directaccess)
Using cached https://files.pythonhosted.org/packages/6f/a4/691ab63b17505a26096608cc309960b5a6bdf39e4ba1a793d5f9b1a53270/unicodecsv-0.14.1.tar.gz
Error [WinError 2] The system cannot find the file specified while executing command python setup.py egg_info
Could not install packages due to an EnvironmentError: [WinError 2] The system cannot find the file specified

It appears to have trouble with the "python setup.py egg_info" command. The FME installation is the 2019 version on Windows 10.

Need some help to resolve this issue.

Thanks in advance.

WARNING Throttled token request. Waiting 60 seconds...

I'm failing to get a token. The keys I am using (not shown below; I've substituted dummies) work in other scripts that do not use the directaccess package.

from directaccess import DirectAccessV2

d2 = DirectAccessV2(
    api_key='555555',
    client_id='5555',
    client_secret='555555555555555',
)

Output: JSONDecodeError: Expecting value: line 1 column 1 (char 0)

I've also tried:
from directaccess import DirectAccessV2

d2 = DirectAccessV2(
    api_key='<555555>',
    client_id='<5555>',
    client_secret='<555555555555555>',
)

Output:
directaccess WARNING Throttled token request. Waiting 60 seconds...

Is there some sort of issue with my syntax? Do I need new keys to use Direct Access?

Requesting Example For Use-Case Scenario | Multi-Processing into SQL Server Database

A use-case scenario I'm interested in would be some form of the multi-processing example where multiple API endpoints can be queried and then loaded into a SQL server database on a regular basis. An added bonus would be to utilize the Enverus Developer API Best Practices to incrementally update a database by leveraging the "UpdatedDate" and "DeletedDate" fields, but this isn't a critical feature at this point.

Currently, I am using the multi-processing example to download .CSV files from two different API endpoints (Rigs and Rig Analytics) on a daily basis, and a separate Python script I wrote uses the pyodbc package to truncate the target SQL Server tables and load each .CSV file into its respective table and columns. It's clunky, but it works. My concern is that as I add more endpoints, the inefficiency of this approach will come back to haunt me.

Also, I want to mention that I am an amateur Python user, so I'm open to any approach that is most sensible and greatly appreciate what is being provided with the direct-access package. Please let me know if I can provide any further information. Thank you very much!
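As a rough starting point, here is a sketch of the incremental pattern using pyodbc. The connection string, table name (dbo.Rigs), column names (RigID, UpdatedDate), and last-load date are all placeholders; the updateddate filter follows the gt() syntax shown elsewhere in this tracker:

import pyodbc
from directaccess import DirectAccessV2

d2 = DirectAccessV2(api_key='...', client_id='...', client_secret='...')
conn = pyodbc.connect(
    'DRIVER={ODBC Driver 17 for SQL Server};'
    'SERVER=myserver;DATABASE=mydb;Trusted_Connection=yes;')
cursor = conn.cursor()

# Pull only rows updated since the last successful load instead of
# truncating and reloading everything; the date is a placeholder.
last_load = '2020-01-01'
for row in d2.query('rigs', updateddate='gt({})'.format(last_load),
                    pagesize=10000):
    # Simple delete-then-insert upsert; dbo.Rigs and its columns are
    # hypothetical and should match your actual target table.
    cursor.execute('DELETE FROM dbo.Rigs WHERE RigID = ?;', row['RigID'])
    cursor.execute('INSERT INTO dbo.Rigs (RigID, UpdatedDate) VALUES (?, ?);',
                   row['RigID'], row['UpdatedDate'])
conn.commit()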

Handle endpoint 404s

When a user provides a dataset endpoint that doesn't exist, we don't handle it ourselves; the user just gets back nginx's 404 page.

Provide a useful exception for this
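One possible shape for that, as a standalone check; DAQueryException is the library's existing exception (it appears in tracebacks elsewhere in this tracker), and the response/dataset arguments stand in for the values available inside the client's query method:

from directaccess import DAQueryException

def check_dataset_response(response, dataset):
    # The bare nginx 404 page means the dataset name doesn't exist;
    # surface that as the library's own query exception instead.
    if response.status_code == 404:
        raise DAQueryException(
            'Invalid dataset endpoint: {!r}. See the Direct Access '
            'documentation for valid dataset names.'.format(dataset))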

directaccess.DAQueryException: Non-200 response: 403 Authentication failed

I am getting a 403 Authentication Error partway through a query download. I am confident my API key is correct, and while I see you have an issue open to implement different handling for throttled requests, it seems I am not being throttled here. Any insight?

Login successful...
Downloading production data...
Fri, 09 Oct 2020 12:34:45 directaccess INFO     Wrote 100000 records to file /var/cache/analytics/enverus/temp/2020-10-09T12:31:09.364976-producing-entities.csv
Fri, 09 Oct 2020 12:38:29 directaccess INFO     Wrote 200000 records to file /var/cache/analytics/enverus/temp/2020-10-09T12:31:09.364976-producing-entities.csv
Fri, 09 Oct 2020 12:42:24 directaccess INFO     Wrote 300000 records to file /var/cache/analytics/enverus/temp/2020-10-09T12:31:09.364976-producing-entities.csv
Fri, 09 Oct 2020 12:46:19 directaccess INFO     Wrote 400000 records to file /var/cache/analytics/enverus/temp/2020-10-09T12:31:09.364976-producing-entities.csv
Fri, 09 Oct 2020 12:50:04 directaccess INFO     Wrote 500000 records to file /var/cache/analytics/enverus/temp/2020-10-09T12:31:09.364976-producing-entities.csv
Fri, 09 Oct 2020 12:53:40 directaccess INFO     Wrote 600000 records to file /var/cache/analytics/enverus/temp/2020-10-09T12:31:09.364976-producing-entities.csv
Traceback (most recent call last):
  File "enverus_prod.py", line 66, in <module>
    old_csv = get_production_data(env, temp_path)
  File "enverus_prod.py", line 24, in get_production_data
    env.to_csv(producing_entities, os.path.join(temp_path, filename))
  File "/usr/local/lib/python3.6/dist-packages/directaccess/__init__.py", line 87, in to_csv
    for i, row in enumerate(query, start=1):
  File "/usr/local/lib/python3.6/dist-packages/directaccess/__init__.py", line 336, in query
    response.status_code, response.text)
directaccess.DAQueryException: Non-200 response: 403 Authentication failed

Match `well-origins` with `well-production-values` results

I used the API to request two sets of results:

d2.query('well-origins') and d2.query('well-production-details')

I'm looking to combine the two results to essentially have a table like the one at the bottom of the web tool:
(screenshot from 2020-05-11: the web tool's combined results table)

However I'm not finding a matching ID. Well Origins has a UID for each well, while Well Production Details has the 14-digit API well number. How could I merge these?

Many thanks again!

directaccess WARNING Throttled token request. Waiting 60 seconds...

I am unable to instantiate the DirectAccessV2 class. I double-checked the API key, client ID, and client secret, and they are correct. I get the following error (with debug enabled):

urllib3.connectionpool DEBUG    https://di-api.drillinginfo.com:443 "POST /v2/direct-access/tokens?grant_type=client_credentials HTTP/1.1" 403 41
Mon, 29 Jun 2020 15:00:38 directaccess WARNING  Throttled token request. Waiting 60 seconds...

I am sure my credentials are correct, but this warning repeats 5 times before a DAAuthException is finally thrown:

Traceback (most recent call last):
  File "enverus_global.py", line 24, in <module>
    d2 = login(api_key, client_id, client_secret)
  File "enverus_global.py", line 9, in login
    env = da(api_key, client_id, client_secret)
  File "/usr/local/lib/python3.6/dist-packages/directaccess/__init__.py", line 197, in __init__
    self.access_token = self.get_access_token()['access_token']
  File "/usr/local/lib/python3.6/dist-packages/directaccess/__init__.py", line 254, in get_access_token
    response.status_code, response.text)
directaccess.DAAuthException: Error getting token. Code: 403 Message: Authentication failed

Any insight?

Way to get a count of query results without downloading the dataset?

I'm looking for a way to get a count of results of a query, say:

DirectAccessV2().query('dataset')

and get back the number of results instead of the results themselves; it'd be nice to know the count before downloading each individual record. I spoke with a customer service rep who mentioned something called 'post' (I could be completely wrong), but he wasn't sure whether it was possible or what the syntax would be in the Python API wrapper. Does the generator object have any length attributes or methods?
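The query generator has no length attribute (generators never do); you would have to exhaust it to count. One sketch for probing the count up front is a HEAD request carrying the same filters, assuming the service reports a count header; the header names below are assumptions, not confirmed API behavior:

import requests

# HEAD request with the same filters as the query; no records are
# transferred. The auth and count header names here are assumptions.
resp = requests.head(
    'https://di-api.drillinginfo.com/v2/direct-access/rigs',
    params={'deleteddate': 'null'},
    headers={'Authorization': 'Bearer <access_token>',
             'X-API-KEY': '<api_key>'})
record_count = resp.headers.get('X-QUERY-RECORD-COUNT')  # assumed header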

Finalize to_dataframe()

The to_dataframe() method has not been released yet. Finalize and release:

  • The option to chunk results for large datasets is untested. Remove it for now.
  • Creating the dataframe from a temporary CSV might not be possible in environments that don't allow write access to temp space. Try building it from records instead (see the sketch below).
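A minimal sketch of that direction, assuming pandas; nothing is written to disk:

import pandas as pd

def to_dataframe(client, dataset, **options):
    # Build the frame in memory from the query generator rather than
    # round-tripping through a temporary CSV.
    return pd.DataFrame.from_records(client.query(dataset, **options))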

List of possible options for `query` method?

Hey there. I'm using the DirectAccess API to query for all wells in North America but I'm having trouble figuring out what the possible options for the query method are. From the documentation:


query(dataset, **options)

  • Parameters: dataset – a valid dataset name. See the Direct Access documentation for valid values
  • options – query parameters as keyword arguments

What are the possible "options" or query parameters? I've seen some in the examples provided but is there a full list somewhere?

Ideally I'd like to specify a rectangular Area of Interest using North, South, East & West bounds. Is this possible? Or can I only do geographical AOIs like county='reeves'?
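For what it's worth: the options are the dataset's own field names, plus request controls like fields and pagesize, passed straight through as query-string parameters; the per-dataset field lists in the Direct Access documentation are the full reference. A sketch:

from directaccess import DirectAccessV2

d2 = DirectAccessV2(api_key='...', client_id='...', client_secret='...')

# Each keyword argument becomes a URL query parameter on the dataset
# endpoint; filter functions are passed as string values.
for row in d2.query('well-origins',
                    county='REEVES',      # equality filter on a field
                    deleteddate='null',   # null filter
                    pagesize=10000):      # paging control
    print(row)

The wrapper itself adds no AOI logic on top of these parameters; if the dataset exposes latitude/longitude fields, bounding them with gt()/lt()-style filters could approximate a rectangular AOI (check the dataset's field list).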

Deprecation warning from urllib3 Retry

I'm seeing this warning in a project using directaccess:

/.../python3.9/site-packages/directaccess/__init__.py:69: DeprecationWarning: Using 'method_whitelist' with Retry is deprecated and will be removed in v2.0. Use 'allowed_methods' instead
  retries = Retry(

The docs describe this as a simple replacement.
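For reference, the rename looks like this; urllib3 1.26+ accepts allowed_methods, and the other Retry arguments here are illustrative rather than the library's actual values:

from urllib3.util.retry import Retry

retries = Retry(
    total=5,
    backoff_factor=1,
    status_forcelist=[500, 502, 503, 504],
    allowed_methods=['GET', 'POST'],  # was: method_whitelist=['GET', 'POST']
)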

Certificate problems behind corporate firewall

I am behind a corporate firewall. Any time a request uses certificate validation it fails, and I have to disable verification. I have been unable to figure out where to add verify=False in the package's __init__.py module; I've tried adding it in several locations. Any assistance would be appreciated.
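One workaround that avoids editing the installed package: point requests at the corporate CA bundle via an environment variable before the client is created. The token request fires inside __init__ (per the tracebacks elsewhere in this tracker), so patching the client afterwards comes too late; the bundle path below is a placeholder:

import os

from directaccess import DirectAccessV2

# Tell requests to trust the corporate CA; this must be set before the
# client is instantiated because __init__ immediately requests a token.
os.environ['REQUESTS_CA_BUNDLE'] = '/path/to/corporate-ca.pem'  # placeholder

d2 = DirectAccessV2(api_key='...', client_id='...', client_secret='...')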

How can I query specific fields of a dataset?

I want to query the producing-entity-details dataset, but I'd like to download only records with UpdatedDate after, or between, certain dates. I can see in the API Explorer that one can make a request with parameters such as btw(date1, date2) or gt(date):

Field: UpdatedDate (required: no). Value should be in the RFC3339 format. Date the record was updated. Examples: updateddate=2017-01-15, updateddate=eq(2017-01-15)

but I'm not sure how this translates to the DirectAccessV2.query() method.
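Those filter expressions translate directly: pass them as string values for the field's keyword argument, the same way proddate='gt(...)' is used elsewhere in this tracker. The dates below are placeholders, and the btw() argument format is as shown in the API Explorer:

from directaccess import DirectAccessV2

d2 = DirectAccessV2(api_key='...', client_id='...', client_secret='...')

# Records updated after a date:
recent = d2.query('producing-entity-details', updateddate='gt(2017-01-15)')

# Records updated between two dates:
window = d2.query('producing-entity-details',
                  updateddate='btw(2017-01-15, 2017-06-30)')

for row in recent:
    print(row)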

Add helper methods for relational endpoints

The Enverus Developer API V2 has two high-level concepts in which many endpoints participate: the well hierarchy and production. This issue covers the requirements for creating helper methods that allow a user to provide a query to one of the top-level endpoints (well-origins or producing-entities) and retrieve records for the other participating endpoints (wellbores, packers, completions, producing-entity-details, etc) that correspond to their parent.
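A rough sketch of what such a helper could look like, built on the existing query method and the in() filter function; the key field names (UID on well-origins, WellUID on wellbores) are assumptions for illustration only:

from directaccess import DirectAccessV2

def query_with_children(client, parent_dataset, child_dataset,
                        parent_key, child_key, **parent_options):
    # Fetch parent records, then fetch children by filtering the child
    # endpoint on the parents' key values with the in() filter function.
    parents = list(client.query(parent_dataset, **parent_options))
    keys = [str(p[parent_key]) for p in parents]
    children = []
    for i in range(0, len(keys), 100):  # chunk to keep the URL short
        batch = ','.join(keys[i:i + 100])
        children.extend(client.query(
            child_dataset, **{child_key: 'in({})'.format(batch)}))
    return parents, children

# Hypothetical usage; the key field names depend on the actual schemas:
d2 = DirectAccessV2(api_key='...', client_id='...', client_secret='...')
wells, bores = query_with_children(d2, 'well-origins', 'wellbores',
                                   'UID', 'WellUID', county='REEVES')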

Remove `links` argument from V2 class init

Allowing a user to provide a persisted links object was intended to ease recovering from a failure but created more confusion than it was worth.

Remove the links argument and note this breaking change in the README

enhancement: Completions Query

I would really like to see how to pull completions matched to when they were filed with the state. I can't seem to figure out how to extract the most recent W-2 filings by the date they were filed (not the date the well was completed), if that makes sense. I'm sure I'm missing a relationship. For example, I've linked to the most recent New Mexico completions table. I can pull the data if I already know the API number, but I'm unsure how to do this without first going to the website to get the well IDs.

https://wwwapps.emnrd.state.nm.us/ocd/ocdpermitting/Reporting/Activity/WeeklyActivity.aspx

thanks!

Differentiate between bad API key and throttled token request

There is currently no difference in handling between a bad API key and a throttled token request.

When a user provides an incorrect API key, ensure an appropriate exception is thrown immediately.

When a user is throttled from requesting tokens too quickly, warn the user and wait 60 seconds before trying again.
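A sketch of how the token handler could split the two cases. The status codes below are assumptions about how the service signals each condition (the log in the issue above shows a 403 during throttling, so the real discriminator may differ):

import time

from directaccess import DAAuthException

def handle_token_response(response, attempt, max_attempts=5):
    # Assumed: 401 means bad credentials -> fail immediately.
    if response.status_code == 401:
        raise DAAuthException('Error getting token. Code: {} Message: {}'
                              .format(response.status_code, response.text))
    # Assumed: 429 means throttled -> warn, wait 60s, retry up to a limit.
    if response.status_code == 429:
        if attempt >= max_attempts:
            raise DAAuthException('Token request still throttled; giving up')
        time.sleep(60)
        return 'retry'
    return 'ok'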
