konosp / adobe-analytics-reports-api-v2.0


Python Package to manage and perform API requests using the Adobe v2 API.

License: Apache License 2.0

Python 100.00%

adobe-analytics adobe-analytics-api v2 python

adobe-analytics-reports-api-v2.0's Introduction

Adobe Analytics Python package

Download Reports data utilising the Adobe.io version 2.0 API.

For more Digital Analytics related reading, check https://analyticsmayhem.com

Authentication methods supported by the package:

JWT
OAuth (tested only through Jupyter Notebook!)

JWT Requirements & Adobe.io access

To run the package, you first need access to a service account from Adobe.io. The method used is JWT authentication. Instructions on how to create the integration are available at: https://www.adobe.io/authentication/auth-methods.html#!AdobeDocs/adobeio-auth/master/JWT/JWT.md. After you have completed the integration, you will need the following information:

Organization ID (issuer): It is in the format of < organisation id >@AdobeOrg
Technical Account ID: < tech account id >@techacct.adobe.com
Client ID: Information is available on the completion of the Service Account integration
Client Secret: Information is available on the completion of the Service Account integration
Account ID: Instructions on how to obtain it at https://youtu.be/lrg1MuVi0Fo?t=96
Report suite: Report suite ID from which you want to download the data.

Make sure that the integration is associated with an Adobe Analytics product profile that is granted access to the necessary metrics and dimensions.

OAuth Requirements

To perform an OAuth authentication you need to create an integration at the Adobe I/O Console as described in the guide by Adobe at https://github.com/AdobeDocs/analytics-2.0-apis/blob/master/create-oauth-client.md. The result of the integration provides the following information:

Client ID (API Key)
Client Secret

Package installation

pip install analytics-mayhem-adobe

Samples

Initial setup - JWT

After you have configured the integration and downloaded the package, the following setup is needed:

from analytics.mayhem.adobe import analytics_client
import os

ADOBE_ORG_ID = os.environ['ADOBE_ORG_ID']
SUBJECT_ACCOUNT = os.environ['SUBJECT_ACCOUNT']
CLIENT_ID = os.environ['CLIENT_ID']
CLIENT_SECRET = os.environ['CLIENT_SECRET']
PRIVATE_KEY_LOCATION = os.environ['PRIVATE_KEY_LOCATION']
GLOBAL_COMPANY_ID = os.environ['GLOBAL_COMPANY_ID']
REPORT_SUITE_ID = os.environ['REPORT_SUITE_ID']

Next initialise the Adobe client:

aa = analytics_client(
        adobe_org_id = ADOBE_ORG_ID, 
        subject_account = SUBJECT_ACCOUNT, 
        client_id = CLIENT_ID, 
        client_secret = CLIENT_SECRET,
        account_id = GLOBAL_COMPANY_ID, 
        private_key_location = PRIVATE_KEY_LOCATION
)

aa.set_report_suite(report_suite_id = REPORT_SUITE_ID)
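
The setup above reads each value straight from os.environ, so a missing variable raises a KeyError for the first missing name only. As a minimal sketch (the helper name load_settings is hypothetical and not part of the package), all seven variables can be checked in one pass before the client is created:

```python
import os

# Variable names taken from the JWT setup snippet above.
REQUIRED_VARS = [
    "ADOBE_ORG_ID", "SUBJECT_ACCOUNT", "CLIENT_ID", "CLIENT_SECRET",
    "PRIVATE_KEY_LOCATION", "GLOBAL_COMPANY_ID", "REPORT_SUITE_ID",
]


def load_settings(environ=os.environ):
    """Return the required settings, or raise listing every missing name.

    Hypothetical helper for illustration; it is not provided by the package.
    """
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_VARS}
```

The returned dictionary can then be used to fill the analytics_client arguments shown above.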

Initial setup - OAuth

Import the package and set the required parameters:

from analytics.mayhem.adobe import analytics_client

client_id = '<client id>'
client_secret = '<client secret>'
global_company_id = '<global company id>'

Initialise the Adobe client:

aa = analytics_client(
        auth_client_id = client_id, 
        client_secret = client_secret,
        account_id = global_company_id
)

Perform the authentication

aa._authenticate()

This opens a new window and asks you to log in to Adobe. After you complete the login process, you will be redirected to the URL you configured as the redirect URI during the Adobe integration creation process. If everything is done correctly, the final URL will have a query string parameter in the format www.adobe.com/?code=eyJ..... Copy the full URL and paste it into the text input. For a demo notebook, please refer to the Jupyter Notebook - OAuth example.
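
To illustrate what the pasted URL carries, here is a small sketch that pulls the code query parameter out of a redirect URL using only the standard library. The function name extract_auth_code is an assumption made for this example; the package handles the pasted URL itself, and its internal parsing may differ.

```python
from urllib.parse import urlparse, parse_qs


def extract_auth_code(redirect_url):
    """Return the value of the 'code' query parameter from a redirect URL.

    Illustrative helper only; not part of the package.
    """
    query = parse_qs(urlparse(redirect_url).query)
    codes = query.get("code")
    if not codes:
        raise ValueError("No 'code' parameter found in the redirect URL")
    return codes[0]
```

If the function raises, the URL was most likely copied before the redirect completed.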

Report Configurations

Set the date range of the report (format: YYYY-MM-DD)

aa.set_date_range(date_start = '2019-12-01', date_end= '2019-12-31')

To configure specific hours for the start and end date:

aa.set_date_range(date_start='2020-12-01', date_end='2020-12-01', hour_start= 4, hour_end= 5 )

If hour_end is set, data for the last day is retrieved only up to that hour instead of for the full day.
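
As a rough sketch of how dates and hours could map onto a single start/end range string, the helper below builds an ISO-style range separated by a slash. The function build_date_range and the exact boundary handling (the end of a full day expressed as midnight of the following day) are assumptions for illustration and are not taken from the package's code.

```python
from datetime import datetime, timedelta


def build_date_range(date_start, date_end, hour_start=0, hour_end=None):
    """Build a 'start/end' range string from YYYY-MM-DD dates and optional hours.

    Assumption: when hour_end is omitted, the whole last day is included by
    ending the range at midnight of the following day.
    """
    start = datetime.strptime(date_start, "%Y-%m-%d") + timedelta(hours=hour_start)
    end_day = datetime.strptime(date_end, "%Y-%m-%d")
    if hour_end is None:
        end = end_day + timedelta(days=1)
    else:
        end = end_day + timedelta(hours=hour_end)
    fmt = "%Y-%m-%dT%H:%M:%S.000"
    return start.strftime(fmt) + "/" + end.strftime(fmt)
```

With the arguments from the hourly example above (hour_start=4, hour_end=5 on 2020-12-01), this sketch covers a single hour of that day.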

Request with 3 metrics and 1 dimension

aa.add_metric(metric_name= 'metrics/visits')
aa.add_metric(metric_name= 'metrics/orders')
aa.add_metric(metric_name= 'metrics/event1')
aa.add_dimension(dimension_name = 'variables/mobiledevicetype')
data = aa.get_report()

Output:

itemId_lvl_1	value_lvl_1	metrics/visits	metrics/orders	metrics/event1
0	Other	5000	3	100
1728229488	Tablet	200	45	30
2163986270	Mobile Phone	49	23	31
...	...	...	...	...

Request with 3 metrics and 2 dimensions:

aa.add_metric(metric_name= 'metrics/visits')
aa.add_metric(metric_name= 'metrics/orders')
aa.add_metric(metric_name= 'metrics/event1')
aa.add_dimension(dimension_name = 'variables/mobiledevicetype')
aa.add_dimension(dimension_name = 'variables/lasttouchchannel')
data = aa.get_report_multiple_breakdowns()

Output: Each item in level 1 (e.g. Tablet) is broken down by the dimension in level 2 (e.g. Last Touch Channel). The package downloads all possible combinations. More dimensions can be added in the same way.

itemId_lvl_1	value_lvl_1	itemId_lvl_2	value_lvl_2	metrics/visits	metrics/orders	metrics/event1
0	Other	1	Paid Search	233	39	10
0	Other	2	Natural Search	424	12	412
0	Other	3	Display	840	41	31
...	...	...	...	...	...	...
1728229488	Tablet	1	Paid Search	80	12	41
1728229488	Tablet	2	Natural Search	50	41	21
...	...	...	...	...	...	...
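
The idea of breaking every level 1 item down by every level 2 item can be sketched as a small recursion. This is not the package's implementation: expand_breakdowns, fake_fetch and the SAMPLE data are hypothetical stand-ins, with fake_fetch taking the place of a real API request. It mainly shows why the number of requests grows quickly as dimensions are added.

```python
def expand_breakdowns(dimensions, fetch_items, parents=()):
    """Yield one tuple of item values per combination of the dimensions.

    fetch_items(dimension, parents) stands in for one API request that lists
    the items of `dimension` within the given parent items.
    """
    if not dimensions:
        yield parents
        return
    current, rest = dimensions[0], dimensions[1:]
    for item in fetch_items(current, parents):
        yield from expand_breakdowns(rest, fetch_items, parents + (item,))


# Hypothetical sample data in place of real API responses.
SAMPLE = {
    "variables/mobiledevicetype": ["Other", "Tablet"],
    "variables/lasttouchchannel": ["Paid Search", "Natural Search"],
}


def fake_fetch(dimension, parents):
    # A real implementation would filter by the parent items; this one ignores them.
    return SAMPLE[dimension]


rows = list(expand_breakdowns(list(SAMPLE), fake_fetch))
for row in rows:
    print(row)
```

Each level 1 item triggers its own lookup at level 2, so a dimension with many items multiplies the total number of requests.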

Global segments

To add a segment, you need the segment ID (currently only this option is supported). To obtain the ID, you need to activate the Adobe Analytics Workspace debugger (https://github.com/AdobeDocs/analytics-2.0-apis/blob/master/reporting-tricks.md). Then inspect the JSON request window and locate the segment ID under the 'globalFilters' object.

To apply the segment:

aa.add_global_segment(segment_id = "s1689_5ea0ca222b1c1747636dc970")
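
If the JSON request has been copied out of the debugger, the segment IDs can be picked out programmatically. The sketch below assumes each entry under 'globalFilters' is an object with a "type" field and, for segments, a "segmentId" field; that shape, the function name find_segment_ids and the example rsid are assumptions for illustration, so check them against your own request.

```python
import json


def find_segment_ids(request_json):
    """Return the segment IDs listed under 'globalFilters' in a request body.

    Assumes segment entries look like {"type": "segment", "segmentId": "..."}.
    """
    request = json.loads(request_json)
    return [
        entry["segmentId"]
        for entry in request.get("globalFilters", [])
        if entry.get("type") == "segment" and "segmentId" in entry
    ]


# Hypothetical request body, shortened to the relevant part.
example = json.dumps({
    "rsid": "examplersid",
    "globalFilters": [
        {"type": "segment", "segmentId": "s1689_5ea0ca222b1c1747636dc970"},
        {"type": "dateRange", "dateRange": "2019-12-01T00:00:00.000/2020-01-01T00:00:00.000"},
    ],
})
print(find_segment_ids(example))
```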

Issues, Bugs and Suggestions:

https://github.com/konosp/adobe-analytics-reports-api-v2.0/issues

Known missing features:

No support for filtering
No support for custom sorting


adobe-analytics-reports-api-v2.0's Issues

Issues in filtering

Hi,
Thanks for creating a new package that supports multiple breakdowns.
While using this package, I ran into some issues with dimension filtering.
I checked that the package works well and the numbers are correct.
However, it takes too much time to pull report data for multiple dimensions (e.g. daterangeday, marketingchannel, searchengines, EntryURL). So I tried to add a filter on a specific dimension (marketingchannel > natural only), but I couldn't figure out how. I looked through all the package function code in the adobe.py file, but I don't think it is supported.
By the way,
I read the comment 'No support for filtering' at the end of the content in the Code section this morning.

Does that mean there's no way to add dimension filtering like in a freeform table?
If it is possible to add dimension filtering, please let me know how to use that function.
I appreciate your efforts on this package.
Thanks.

Segment Support

Support OAuth2 Authentication

Documentation: https://github.com/AdobeDocs/analytics-2.0-apis/blob/master/oauth-curl.md

Expanding date to include time

Hello,

We need to specify a time as well as the date when grabbing a report. But I see that only a date is supported, and it's converted to T00:00:00.000. Would you be able to add support for passing a time along with the date?

Thanks

AttributeError: 'DataFrame' object has no attribute 'data'

Hi konosp,

I think the error is because my itemId has ' ' at the end.

I tried running the code below and got the error:

data = aa.get_report_breakdown(t,dimensions = 'variables/evar1', current_level = None)

TypeError: sort_values() missing 1 required positional argument: 'by'

Please let me know what you recommend as a fix.

Thanks

ValueError: ('Response code error', '')

Hi,
thank you for sharing your service.

My colleague and I noticed that a recurrent issue takes place, namely
ValueError: ('Response code error', '')

With some investigations, we pointed out that it seems there's a naive mistake in the code: in the error message, instead of page.text
raise ValueError('Response code error', page.text)
we would have expected page.status_code
raise ValueError('Response code error', page.status_code)
for a clearer error explanation. Don't you agree?

At the same time, notwithstanding these messages, API calls seem to actually work fine from UI.
If this is the case, it might mean that the package does not wait enough for a response from server.
As a first tweak, better performances seem to be achieved if you consider a timeout in the requests.post within the _get_page function , e.g.
page = requests.post( self.analytics_url, headers=analytics_header, data=json.dumps(report_object), timeout=360 )
In this way it seems to work better, but we still encounter issues when making requests for more segments but have no clue for the moment if there's a way for a more stable call.
In case of a status_code 429, maybe a further sleep could be needed?

Tell us if you can update the package and have hints for more stability in the package.

Thank you
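
The two suggestions in this issue (report the status code, and wait before retrying on a 429) can be sketched together as follows. This is not code from the package: post_with_retry is a hypothetical helper, and send stands for any zero-argument callable that performs the request and returns an object with status_code and text attributes, for example a wrapper around the requests.post call quoted above.

```python
import time


def post_with_retry(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns HTTP 200, backing off on HTTP 429.

    Hypothetical helper: any other status code, or a 429 on the final
    attempt, raises ValueError that includes the status code.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        response = send()
        if response.status_code == 200:
            return response
        if response.status_code != 429 or attempt == max_attempts - 1:
            raise ValueError(
                "Response code error", response.status_code, response.text
            )
        # Exponential backoff: base_delay, then 2x, 4x, ...
        sleep(base_delay * 2 ** attempt)
```

The sleep parameter is injectable so the waiting behaviour can be tested without real delays.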

Issue with adding limit

Hi konosp,

When adding multiple breakdowns, can you help me add a limit to every breakdown, so that we can limit the number of requests we pull?

For example: aa.add_dimension(dimension_name = 'variables/mobiledevicetype', limit=10000)
so that it only works on the top 10000 items available.

Additionally, I am facing a GMR routing error from Adobe's end when I try to pull data. It happens when we add multiple breakdowns.

SSL CERTIFICATE ERROR

Hi,
When I run data = aa.get_report_multiple_breakdowns(), I get this error -
SSLError: HTTPSConnectionPool(host='ims-na1.adobelogin.com', port=443): Max retries exceeded with url: /ims/exchange/jwt (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1123)')))
My analysis: I checked the backend code to see how this method is called. As far as I understand, at a certain point it's trying to authenticate my request, which I don't want because I am handling the JWT and access token part in a separate script. So I already have an access token, and using that (along with the other credentials) all I want is the report, bypassing the authentication part.
If I add verify=False, data = aa.get_report_multiple_breakdowns(verify=False), then I get this error -
TypeError: get_report_multiple_breakdowns() got an unexpected keyword argument 'verify'
Kindly help. TIA

Private key de-serialization issue

I have worked with your package for more than half a year now, and ran the same code over and over again without any issues. Suddenly, I cannot run it now, as it fails with the below error. My pem file is the same as ever, and the key is not expired. I tried updating the cryptography package but it solves nothing. I tried downgrading the cryptography package, same issue. I'm using version 0.0.8 on Windows, Python 3.7.
Would highly appreciate your help.

TypeError Traceback (most recent call last)
in
19 aa.add_dimension(dimension_name = 'variables/daterangeday')
20
---> 21 df0 = aa.get_report()

/local_disk0/pythonVirtualEnvDirs/virtualEnv-c9976e96-acb1-4a62-9656-e4d1a21b6ef1/lib/python3.7/site-packages/analytics/mayhem/adobe.py in get_report(self, custom_report_object)
289 self._set_page_number(0)
290 # Get initial page
--> 291 data = self._get_page(custom_report_object)
292 self.logger(data.text)
293 json_obj = json.loads(data.text)

/local_disk0/pythonVirtualEnvDirs/virtualEnv-c9976e96-acb1-4a62-9656-e4d1a21b6ef1/lib/python3.7/site-packages/analytics/mayhem/adobe.py in _get_page(self, report_object)
255 report_object = self.report_object
256
--> 257 analytics_header = self._get_request_headers()
258
259 status_code = None

/local_disk0/pythonVirtualEnvDirs/virtualEnv-c9976e96-acb1-4a62-9656-e4d1a21b6ef1/lib/python3.7/site-packages/analytics/mayhem/adobe.py in _get_request_headers(self)
150 self.access_token = self._obtain_oauth_access_token()
151 elif self.client_id:
--> 152 self.access_token = self._renew_access_token()
153
154 analytics_header = {

/local_disk0/pythonVirtualEnvDirs/virtualEnv-c9976e96-acb1-4a62-9656-e4d1a21b6ef1/lib/python3.7/site-packages/analytics/mayhem/adobe.py in _renew_access_token(self)
101 jwtPayloadJson = self._get_jwtPayload()
102 # Encode the jwt Token
--> 103 jwttoken = jwt.encode(jwtPayloadJson, private_key, algorithm='RS256')
104
105 accessTokenRequestPayload = {

/local_disk0/pythonVirtualEnvDirs/virtualEnv-c9976e96-acb1-4a62-9656-e4d1a21b6ef1/lib/python3.7/site-packages/jwt/api_jwt.py in encode(self, payload, key, algorithm, headers, json_encoder)
61 ).encode("utf-8")
62
---> 63 return api_jws.encode(json_payload, key, algorithm, headers, json_encoder)
64
65 def decode_complete(

/local_disk0/pythonVirtualEnvDirs/virtualEnv-c9976e96-acb1-4a62-9656-e4d1a21b6ef1/lib/python3.7/site-packages/jwt/api_jws.py in encode(self, payload, key, algorithm, headers, json_encoder)
108 try:
109 alg_obj = self._algorithms[algorithm]
--> 110 key = alg_obj.prepare_key(key)
111 signature = alg_obj.sign(signing_input, key)
112

/local_disk0/pythonVirtualEnvDirs/virtualEnv-c9976e96-acb1-4a62-9656-e4d1a21b6ef1/lib/python3.7/site-packages/jwt/algorithms.py in prepare_key(self, key)
240 key = load_ssh_public_key(key)
241 else:
--> 242 key = load_pem_private_key(key, password=None)
243 except ValueError:
244 key = load_pem_public_key(key)

TypeError: load_pem_private_key() missing 1 required positional argument: 'backend'
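
The traceback shows jwt/algorithms.py calling load_pem_private_key(key, password=None) without a backend argument, while the installed cryptography release still requires one, which suggests a version mismatch between the two libraries. As a starting point for diagnosing this, the sketch below lists the installed versions; the function name installed_versions is hypothetical, and importlib.metadata requires Python 3.8 or later (on Python 3.7, pip show PyJWT cryptography gives the same information).

```python
from importlib import metadata


def installed_versions(packages=("PyJWT", "cryptography")):
    """Return a dict mapping each package name to its installed version.

    Packages that are not installed map to None.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


print(installed_versions())
```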

Multiple Breakdowns

Introduce in the response multiple breakdown dimensions

Issue with code running indefinitely when the data is more.

Hi Mayhem,

Thanks a lot for the Python package analytics mayhem; it has been a game changer for me. I am using it to extract data from Adobe Analytics.
The only problem I am facing is that the code takes a long time to run: it just runs indefinitely and I have to cancel the command. I am sorry to ask this here, but I don't know how to reach you by any other method.
It would be great if you could help me on this as there is nothing like this package on the net.

Regards,
Vinayak

Private Key location

The private key location should support absolute paths. Currently only locations within the home folder are supported.
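
One possible way to support both cases is sketched below with pathlib. It assumes that relative locations should keep being read as relative to the home folder, which is an interpretation of the current behaviour rather than something confirmed by the package's code; resolve_key_path is a hypothetical name.

```python
from pathlib import Path


def resolve_key_path(location):
    """Return an absolute path for a private key location.

    Absolute paths (and '~' paths) are used as given; relative paths are
    assumed to be relative to the user's home folder.
    """
    path = Path(location).expanduser()
    if not path.is_absolute():
        path = Path.home() / path
    return path
```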

Issue in adding multiple breakdowns

Hi,

When we try to add more than 2 breakdowns, the data output is incorrect and we end up getting a lot of junk data.
As per my analysis, the payload that is being generated is not correct.

Please help me with the same.