tap-stackexchange

Singer tap for the StackExchange API.

Built with the Meltano SDK for Singer Taps and Targets.

Capabilities

  • sync
  • catalog
  • state
  • discover

Settings

Setting     Required  Default            Description
key         False     None               Pass this to receive a higher request quota
site        False     stackoverflow.com  StackExchange site
tags        False     None               Question tags
start_date  False     None               The earliest record date to sync

A full list of supported settings and capabilities is available by running: tap-stackexchange --about
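
As a rough illustration of how these settings fit together, the sketch below builds a configuration in Python and renders it as JSON that could be saved as config.json and passed with --config. The tag and start date values are made up for the example, and the ISO 8601 date format is an assumption; check the output of tap-stackexchange --about for the authoritative list.

```python
import json

# Keys mirror the settings table above; the values are illustrative only.
config = {
    "site": "stackoverflow.com",            # the documented default site
    "tags": ["python"],                     # hypothetical tag filter (a list of tags)
    "start_date": "2023-01-01T00:00:00Z",   # hypothetical date; ISO 8601 is assumed
    # "key": "<your Stack Apps key>",       # optional; raises the request quota
}


def render_config(settings: dict) -> str:
    """Serialize the settings to JSON text suitable for a config file."""
    return json.dumps(settings, indent=2)


print(render_config(config))
```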

Custom filter

The StackExchange API supports a number of custom filters that can be used to include or exclude certain fields from the response objects. This application has a baked-in filter with the following parameters:

  • include

    Parameter               Description
    question.comment_count  The number of comments on the question
    tag.last_activity_date  The date of the last activity on the tag
  • unsafe=false

Update the baked-in filter

To update the baked-in filter, edit the FILTER_ID constant in tap_stackexchange/tap.py.

To generate a new filter, use the Try It button on the StackExchange API documentation, using the baked-in filter as the base parameter.
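
For readers who prefer a script to the Try It button, the following is a minimal sketch that only builds the request URL; it does not call the API. It assumes the StackExchange API's /filters/create endpoint with its include, base and unsafe parameters, and API version 2.3. BASE_FILTER_ID is a placeholder for the current baked-in filter ID, and filter_create_url is a hypothetical helper that is not part of this tap.

```python
from urllib.parse import urlencode

API_ROOT = "https://api.stackexchange.com/2.3"  # assumed API version


def filter_create_url(include, base, unsafe=False):
    """Build a /filters/create URL (hypothetical helper, not part of the tap)."""
    params = {
        "include": ";".join(include),    # the API expects a semicolon-delimited list
        "base": base,                    # existing filter to extend
        "unsafe": str(unsafe).lower(),   # "false" keeps the filter safe
    }
    return f"{API_ROOT}/filters/create?{urlencode(params)}"


url = filter_create_url(
    include=["question.comment_count", "tag.last_activity_date"],
    base="BASE_FILTER_ID",  # placeholder: substitute the baked-in filter ID
)
print(url)
```

The filter ID returned by the API would then replace the FILTER_ID constant.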

Use a custom filter

To use a custom filter, set the filter setting to the filter ID. If you use a custom filter, you will also need a custom catalog that includes the fields you want to sync. That is, you will need to:

  1. Write the default catalog to a file: tap-stackexchange --discover > catalog.json
  2. Edit the catalog file to include the fields not included by the default API filter
  3. Run the tap with the custom catalog: tap-stackexchange --config config.json --catalog catalog.json
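
Step 2 can be done by hand or scripted. The sketch below shows one way to add a field to a stream in a Singer catalog and mark it as selected. The stream name questions, the field body_markdown and the helper select_field are hypothetical and used only for illustration; in practice the catalog dictionary would be loaded from catalog.json.

```python
import copy


def select_field(catalog, stream_id, field, schema):
    """Return a copy of the catalog with `field` added to `stream_id` and selected."""
    updated = copy.deepcopy(catalog)
    for stream in updated["streams"]:
        if stream.get("tap_stream_id") != stream_id:
            continue
        # Declare the field in the stream's JSON schema.
        stream["schema"].setdefault("properties", {})[field] = schema
        # Add a Singer metadata entry that marks the field as selected.
        stream.setdefault("metadata", []).append(
            {"breadcrumb": ["properties", field], "metadata": {"selected": True}}
        )
    return updated


# Tiny stand-in for the contents of catalog.json (hypothetical stream).
catalog = {
    "streams": [
        {"tap_stream_id": "questions", "schema": {"properties": {}}, "metadata": []}
    ]
}
custom = select_field(
    catalog, "questions", "body_markdown", {"type": ["string", "null"]}
)
print(custom["streams"][0]["schema"]["properties"])
```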

Installation

pipx install git+https://github.com/edgarrmondragon/tap-stackexchange.git

Source Authentication and Authorization

Register a new application on Stack Apps and copy the generated key.

Usage

You can easily run tap-stackexchange by itself or in a pipeline using Meltano.

Executing the Tap Directly

tap-stackexchange --version
tap-stackexchange --help
tap-stackexchange --config CONFIG --discover > ./catalog.json

Initialize your Development Environment

pipx install poetry
poetry install

Testing with Meltano

Note: This tap will work in any Singer environment and does not require Meltano. Examples here are for convenience and to streamline end-to-end orchestration scenarios.

Your project comes with a custom meltano.yml project file already created. Open the meltano.yml and follow any "TODO" items listed in the file.

Next, install Meltano (if you haven't already) and any needed plugins:

# Install meltano
pipx install meltano
# Initialize meltano within this directory
cd tap-stackexchange
meltano install

Now you can test and orchestrate using Meltano:

# Test invocation:
meltano invoke tap-stackexchange --version

# OR run a test `elt` pipeline:
meltano elt tap-stackexchange target-sqlite --job_id=stackexchange-sqlite

# Runtime configuration
TAP_STACKEXCHANGE__LOAD_SCHEMA=dragon_ball_gt \
TAP_STACKEXCHANGE_SITE=anime \
TAP_STACKEXCHANGE_TAGS='["dragon-ball-gt"]' \
meltano elt tap-stackexchange target-sqlite

SDK Dev Guide

See the dev guide for more instructions on how to use the SDK to develop your own taps and targets.

tap-stackexchange's Issues

KeyError when optional key config not provided

The README says that key is not required, but without it I got a KeyError. We could either make key required (updating the docs and marking the field as required in the tap schema) or leave it optional and use self.config.get("key") instead of self.config["key"].

line 55, in get_url_params
    "key": self.config["key"],
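
A minimal sketch of the second option, assuming the parameters are built from a plain config dictionary. build_url_params is a hypothetical standalone function written for illustration, not the tap's actual get_url_params method.

```python
def build_url_params(config: dict) -> dict:
    """Build query parameters, sending `key` only when it is configured."""
    params = {"site": config.get("site", "stackoverflow.com")}
    key = config.get("key")  # returns None instead of raising KeyError
    if key:
        params["key"] = key
    return params


print(build_url_params({}))              # no key configured
print(build_url_params({"key": "abc"}))  # key configured
```

Leaving the parameter out entirely when no key is set avoids sending an empty value to the API.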

Error Handling Surrounding Flattening

When flattening_enabled is set to true, running meltano select tap-stackexchange --list --all fails with an error message that does not say what needs to change.

Example of the error:

Cannot list the selected attributes: Catalog discovery failed: command ['/meltano/.meltano/extractors/tap-stackexchange/venv/bin/tap-stackexchange', '--config', '/meltano/.meltano/run/tap-stackexchange/tap.07702730-c1d8-433b-ab2c-92b6271df905.config.json', '--discover'] returned 1

I found a breadcrumb by going back and running meltano invoke tap-stackexchange --about, which returned this error:

Traceback (most recent call last):
  File "/meltano/.meltano/extractors/tap-stackexchange/venv/bin/tap-stackexchange", line 8, in <module>
    sys.exit(TapStackExchange.cli())
  File "/meltano/.meltano/extractors/tap-stackexchange/venv/lib/python3.8/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/meltano/.meltano/extractors/tap-stackexchange/venv/lib/python3.8/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/meltano/.meltano/extractors/tap-stackexchange/venv/lib/python3.8/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/meltano/.meltano/extractors/tap-stackexchange/venv/lib/python3.8/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/meltano/.meltano/extractors/tap-stackexchange/venv/lib/python3.8/site-packages/singer_sdk/tap_base.py", line 467, in cli
    tap = cls(  # type: ignore  # Ignore 'type not callable'
  File "/meltano/.meltano/extractors/tap-stackexchange/venv/lib/python3.8/site-packages/singer_sdk/tap_base.py", line 87, in __init__
    self.mapper = PluginMapper(
  File "/meltano/.meltano/extractors/tap-stackexchange/venv/lib/python3.8/site-packages/singer_sdk/mapper.py", line 540, in __init__
    self.flattening_options = get_flattening_options(plugin_config)
  File "/meltano/.meltano/extractors/tap-stackexchange/venv/lib/python3.8/site-packages/singer_sdk/helpers/_flattening.py", line 34, in get_flattening_options
    return FlatteningOptions(max_level=int(plugin_config["flattening_max_depth"]))
KeyError: 'flattening_max_depth'

I took a guess at setting flattening_max_depth config and it fixed the issue, but there is nothing that I can find stating flattening_max_depth is required when setting flattening_enabled.

A possible solution, instead of updating the error handling, would be to remove flattening_enabled altogether as a config and have its state set by flattening_max_depth. For instance, if flattening_max_depth is set then flattening is enabled to that depth; otherwise flattening is off.
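
The proposal above could look roughly like the sketch below. resolve_flattening_depth is a hypothetical function written to illustrate the idea; it is not the SDK's get_flattening_options and does not reflect how the SDK behaves today.

```python
from typing import Optional


def resolve_flattening_depth(plugin_config: dict) -> Optional[int]:
    """Return the flattening depth, or None when flattening is off.

    Only `flattening_max_depth` is consulted; `flattening_enabled` is ignored.
    """
    depth = plugin_config.get("flattening_max_depth")
    if depth is None:
        return None  # depth not set, so flattening is disabled
    return int(depth)


print(resolve_flattening_depth({"flattening_enabled": True}))
print(resolve_flattening_depth({"flattening_max_depth": "2"}))
```

With this approach a missing flattening_max_depth simply disables flattening instead of raising a KeyError.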
