LDES-Action

LDES-Action is a GitHub Action that replicates a Linked Data Event Stream or tree:Collection and republishes it on GitHub Pages.

Usage

Create a .github/workflows/data.yaml file in the repository where you want to fetch data. An example:

# data.yaml

# allow only one run of this workflow per ref at a time
concurrency: ci-${{ github.ref }}

# trigger workflow:
on:
  # - on push to branch 'main'
  push:
    branches:
      - main
  # - on schedule, every 30 minutes
  schedule:
    - cron: '*/30 * * * *'
  # - manually 
  workflow_dispatch:

jobs:
  scheduled:
    runs-on: ubuntu-latest
    steps:
      # Check out the repository so it can read the files inside of it and do other operations
      - name: Check out repo
        uses: actions/checkout@v2
      # Fetch the dataset, write the data to the storage directory, push it to the repo and set up GitHub Pages
      - name: Fetch and write data
        uses: TREEcg/LDES-Action@v2
        with:
          # url you want to fetch
          url: 'https://smartdata.dev-vlaanderen.be/base/gemeente'
          # output directory name 
          storage: 'output'

The TREEcg/LDES-Action action performs the following operations:

  1. fetch data from the provided URL
  2. split the fetched data and store it as Turtle files in the storage directory
  3. commit and push all of the data to your repo
  4. deploy the data to GitHub Pages on branch main

Inputs

url

URL of the LDES or tree:Collection dataset from which you want to fetch data.

storage

Name of the output directory where the fetched data will be stored.

gh_pages_url (optional)

URL where GitHub Pages will be deployed.
Default: http(s)://<username>.github.io/<repository> or http(s)://<organization>.github.io/<repository>

fragmentation_strategy (optional)

Fragmentation strategy to apply.
Default: basic
Possible values:

fragmentation_page_size (optional)

Number of RDF objects on a single page.
Default: '50'
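
As a minimal sketch, the two fragmentation inputs could be set together in the workflow step like this. The page size of 100 is an illustrative assumption, and basic is simply the documented default written out explicitly:

```yaml
# Fragment of the `steps:` list in .github/workflows/data.yaml
- name: Fetch and write data
  uses: TREEcg/LDES-Action@v2
  with:
    url: 'https://smartdata.dev-vlaanderen.be/base/gemeente'
    storage: 'output'
    # documented default, written out explicitly
    fragmentation_strategy: 'basic'
    # illustrative value: 100 RDF objects per page instead of the default 50
    fragmentation_page_size: '100'
```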

datasource_strategy (optional)

Datasource strategy to use.
Default: ldes-client (the only one implemented at this point)

property_path (optional)

Property path to be used by bucketizers.

stream_data (optional)

Boolean that controls whether to stream the LDES members or load them in memory.
Default: false

timeout (optional)

Time in milliseconds to wait for the datasource to fetch data in a single run, after which the datasource (LDES Client) is paused. Keep in mind that a single job execution run is limited to 6 hours. As a safety margin, it is currently recommended to keep timeout under 5 hours.
Default: 3600000 (1 hour)
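
For illustration, a two-hour timeout works out to 2 * 60 * 60 * 1000 = 7200000 milliseconds, which stays below the recommended 5-hour ceiling (18000000 milliseconds). The sketch below sets that value and turns on streaming. Both values are assumptions chosen for the example, not recommendations:

```yaml
# Fragment of the `steps:` list in .github/workflows/data.yaml
- name: Fetch and write data
  uses: TREEcg/LDES-Action@v2
  with:
    url: 'https://smartdata.dev-vlaanderen.be/base/gemeente'
    storage: 'output'
    # stream the LDES members instead of loading them in memory
    stream_data: 'true'
    # 2 hours in milliseconds (2 * 60 * 60 * 1000), under the 5-hour recommendation
    timeout: '7200000'
```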

Outputs

delta_bytes

A signed number describing the number of bytes that changed in this run.
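
A later step can read this output through the standard GitHub Actions `steps.<id>.outputs` context once the action's step has an `id`. In the sketch below, the step id `ldes` and the reporting step are hypothetical, and the condition assumes that a run without changes reports 0:

```yaml
# Fragment of the `steps:` list in .github/workflows/data.yaml
- name: Fetch and write data
  id: ldes # hypothetical step id, used to reference the output below
  uses: TREEcg/LDES-Action@v2
  with:
    url: 'https://smartdata.dev-vlaanderen.be/base/gemeente'
    storage: 'output'
# Hypothetical follow-up step; assumes an unchanged run reports 0
- name: Report change size
  if: steps.ldes.outputs.delta_bytes != '0'
  run: echo "Bytes changed in this run: ${{ steps.ldes.outputs.delta_bytes }}"
```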

Development

Test

Create a private .env file with the following structure, setting the environment variables you need:

INPUT_URL="https://smartdata.dev-vlaanderen.be/base/gemeente"
INPUT_STORAGE="output"
INPUT_GIT_USERNAME="<YOUR_GIT_USERNAME>"
INPUT_GIT_EMAIL="<YOUR_GIT_EMAIL>"
INPUT_FRAGMENTATION_STRATEGY="alphabetical"
INPUT_FRAGMENTATION_PAGE_SIZE="100"
INPUT_DATASOURCE_STRATEGY="ldes-client"
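
The INPUT_ prefix follows the GitHub Actions convention of exposing each `with` input to the action as an environment variable named INPUT_ plus the upper-cased input name. As a rough sketch, the documented inputs in the .env file above would correspond to a `with` block like the one below. The Git username and email variables are left out because they are not described in the Inputs section:

```yaml
# Fragment of a workflow step; mirrors the documented inputs from the .env file
with:
  url: 'https://smartdata.dev-vlaanderen.be/base/gemeente' # INPUT_URL
  storage: 'output' # INPUT_STORAGE
  fragmentation_strategy: 'alphabetical' # INPUT_FRAGMENTATION_STRATEGY
  fragmentation_page_size: '100' # INPUT_FRAGMENTATION_PAGE_SIZE
  datasource_strategy: 'ldes-client' # INPUT_DATASOURCE_STRATEGY
```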

Run the code to test it and check the output folder.

npm run test

Compile

Compile this Node.js project into a single file (see ncc). This is needed if you want to use it as a GitHub Action:

npm run dist
