tap-chargify's People

Contributors

amakalin4, chrishumphries, cosimon, dmosorast, jacobrobertbaca, kallan357, lambtron, luandy64, rgmills, sgandhi1311, zachharris1

tap-chargify's Issues

About session_bookmark

Hi @lambtron
I am working on the streams to make them incremental where possible, and I will open a PR for that. I have a question about the session_bookmarks methods:
https://github.com/singer-io/tap-chargify/blob/master/tap_chargify/streams.py#L51-L61

I don't understand what they are for. I cannot find any reference to them in the tap-chargify module, nor in the Singer code (I am quite new to Singer, so I don't know whether this is an API method).

If this is dead code, do you think I can remove it?

"string indices must be integers" in invoice table

I extracted data using Stitch Data, and it throws the error given in the title.

Here are the error logs:

2021-04-23 12:25:50,074Z tap - yield j["invoice"]
2021-04-23 12:25:50,074Z tap - TypeError: string indices must be integers
2021-04-23 12:25:50,097Z target - INFO Requests complete, stopping loop
2021-04-23 12:25:50,152Z main - INFO Target exited normally with status 0
2021-04-23 12:25:50,734Z main - INFO [smart-services] event successfully sent to kafka: com.stitchdata.extractionJobFinished [18] at offset None
2021-04-23 12:25:50,735Z main - INFO No tunnel subprocess to tear down
2021-04-23 12:25:50,735Z main - INFO Exit status is: Discovery succeeded. Tap failed with code 1 and error message: "string indices must be integers". Target succeeded
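
The TypeError itself is easy to reproduce in Python. A common way it arises (this diagnosis is an assumption, not confirmed against the tap's code) is the API returning an error body as a plain string where the tap expects a dict, so `j["invoice"]` indexes a string with a string key:

```python
# Minimal reproduction of the error seen in the logs above.
# If `j` were a dict such as {"invoice": {...}}, j["invoice"] would work;
# when the API instead returns a plain string (e.g. an error body),
# the same expression raises the TypeError.
j = "Subscription not found"  # hypothetical error body

try:
    j["invoice"]
except TypeError as e:
    print(e)  # e.g. "string indices must be integers"
```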

Thanks in advance.

Stitch Data Error Message: "Parser must be a string or character stream, not NoneType"

Hello,

We have created an integration in Stitch Data. Our source is "ReCharge ecommerce". When we run the extraction, we get the error message "Parser must be a string or character stream, not NoneType". Below is the log for reference.

2022-03-25 11:21:32,010Z tap - File "/code/orchestrator/tap-env/lib/python3.9/site-packages/dateutil/parser/_parser.py", line 69, in __init__
2022-03-25 11:21:32,010Z tap - raise TypeError('Parser must be a string or character stream, not '
2022-03-25 11:21:32,010Z tap - TypeError: Parser must be a string or character stream, not NoneType
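
The traceback points at dateutil: passing None to dateutil.parser.parse raises exactly this TypeError, which usually means a record had a missing (null) date field. A minimal reproduction (which field is None in your data cannot be told from the log):

```python
# Requires the python-dateutil package.
from dateutil import parser

value = None  # stands in for a record's missing/null date field
try:
    parser.parse(value)
except TypeError as e:
    print(e)  # Parser must be a string or character stream, not NoneType
```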

Any help will be highly appreciated.

Thanks,
Mitul

pep8

Greetings,

The code does not conform to PEP 8, particularly regarding line length.
Would you be OK with a patch that formats the code? I generally use black for that.

About Invoices

I see from the git history (and from the forks of this project) that there is some confusion about the Invoices stream.

from streams.py

class Invoices(Stream):
    name = "invoices"
    replication_method = "INCREMENTAL"
    # replication_key = "updated_at"
    replication_key = "due_date"
    # API endpoint filters only on `due_date`.

We can see that the replication key is now due_date and used to be updated_at; there was some back and forth on this line in the commits.
The fact is there are actually two very different Invoice objects in Chargify:

Legacy Invoices have an updated_at property
Relationship Invoices have a due_date property

The two have completely different structures and are unrelated. The most confusing part is that they share the same endpoint in the Chargify API: /invoices/
You will get one or the other depending on the instance you have.

I will open a PR to fix this issue. Unfortunately, I only have instances with Legacy Invoices, so that is all I can test. But I will provide a scaffold for the new Relationship Invoices.
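
The split I have in mind can be sketched like this (class and stream names are hypothetical, not the tap's actual code):

```python
class Stream:
    # Minimal stand-in for the tap's Stream base class.
    name = None
    replication_method = "INCREMENTAL"
    replication_key = None

class LegacyInvoices(Stream):
    # Legacy invoice objects expose `updated_at`.
    name = "invoices"
    replication_key = "updated_at"

class RelationshipInvoices(Stream):
    # Relationship invoice objects expose `due_date`,
    # and the /invoices/ endpoint filters on it.
    name = "relationship_invoices"  # hypothetical stream name
    replication_key = "due_date"
```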

Missing attributes in events stream

The events stream has an event_specific_data property that has a different schema for each type of event.

Currently only the product_id and account_transaction_id attributes are declared. That means the attributes for all other event types will be ignored (see https://reference.chargify.com/v1/events/events-intro#events-fields-and-event-type for the event types).

Chargify does not publish the schema for each event type. However, I am speaking with them to have it and will submit a PR with the full specification.
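
As an interim measure until that specification lands, the schema for event_specific_data could allow undeclared attributes through rather than dropping them (whether a given target preserves them is up to the target; this is a sketch, not the tap's current schema):

```python
# Hypothetical interim JSON schema for the `event_specific_data` property.
event_specific_data_schema = {
    "type": ["null", "object"],
    "properties": {
        # The only two attributes currently declared by the tap:
        "product_id": {"type": ["null", "integer"]},
        "account_transaction_id": {"type": ["null", "integer"]},
    },
    # Let per-event-type attributes pass through instead of being ignored.
    "additionalProperties": True,
}
```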

Incremental replications

I am currently fixing the incremental replications. I think the current state of this part of the code is quite broken.

https://github.com/singer-io/tap-chargify/blob/master/tap_chargify/streams.py#L104-L111

Some streams are currently not ordered (e.g. Subscriptions), yet the bookmark is updated for each record, and before yielding a record we check whether it is newer than the bookmark. Since the stream is not ordered, a record can be older than the bookmark even though it should legitimately be written.

To make this clear, let's take an example.

Say the state for the Subscriptions stream is "updated_at": "2021-04-01T02:06:58.147Z"
The first query to the Chargify API will be:

https://subdomain.chargify.com/subscriptions.json?page=1&per_page=200&start_datetime=2021-04-01T02%3A06%3A58.147Z&date_field=updated_at&direction=asc

It will return unsorted records such as these (all fields removed except id and updated_at):

{"id": 38025480, "updated_at": "2021-04-01T12:21:37.000000Z"}
{"id": 11255022, "updated_at": "2021-04-01T12:21:37.000000Z"}
{"id": 30812776, "updated_at": "2021-04-01T13:04:02.000000Z"}
{"id": 37986658, "updated_at": "2021-04-01T12:21:37.000000Z"}
{"id": 23329748, "updated_at": "2021-04-01T13:04:03.000000Z"}
{"id": 37925200, "updated_at": "2021-04-01T13:06:28.000000Z"}
{"id": 39472331, "updated_at": "2021-04-01T13:06:28.000000Z"}
{"id": 37925203, "updated_at": "2021-04-01T13:06:29.000000Z"}                                                        
{"id": 32061301, "updated_at": "2021-04-01T13:06:28.000000Z"}

The first step is of course to sort the stream, which is easy thanks to the Chargify API.
However, that is not sufficient. As you can see, several records share the same updated_at value (the example above is real data). With the current code, subsequent records with a value lower than the bookmark are discarded. Since we are already filtering the API request with the bookmark value, I am confident that every record returned by the API should be written for the target to consume.
I have a PR that fixes the issue for Subscriptions, and I will open a PR that fixes all the streams to make them incremental where possible.
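
The fix I am describing can be sketched as follows (helper names are mine, not the tap's): sort the page by the replication key, emit every record the bookmark-filtered API returned, and only advance the bookmark afterwards, so ties on updated_at are never dropped:

```python
def sync_page(records, bookmark):
    """Emit every record from a bookmark-filtered API page, then advance the bookmark.

    Assumes `records` came from a request already filtered with
    start_datetime=<bookmark>, so none of them should be discarded,
    including those whose updated_at equals the bookmark exactly.
    """
    emitted = sorted(records, key=lambda r: r["updated_at"])
    # ISO-8601 strings in the same format compare correctly lexicographically.
    new_bookmark = max([bookmark] + [r["updated_at"] for r in emitted])
    return emitted, new_bookmark

# Two records share the bookmark's timestamp; both are still emitted.
page = [
    {"id": 30812776, "updated_at": "2021-04-01T13:04:02.000000Z"},
    {"id": 38025480, "updated_at": "2021-04-01T12:21:37.000000Z"},
    {"id": 11255022, "updated_at": "2021-04-01T12:21:37.000000Z"},
]
rows, bm = sync_page(page, "2021-04-01T12:21:37.000000Z")
```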

On a side note, I see that Chargify is now in beta on Stitch. If that beta uses this version of the code, I think it is really not ready for production, nor even for beta.
