Europeana Sitemap

Generates and publishes a record and entity sitemap for www.europeana.eu

The record sitemap is generated by connecting to a Mongo server and listing all records that meet a minimum content tier and metadata tier. The entity sitemap uses the search functionality of Entity-API to retrieve all entities used on the Europeana website.

For both, the generated sitemap consists of:

  • multiple sitemap files containing record or entity URLs (45,000 per file for records, 20,000 per file for entities)
  • a sitemap index file listing all the sitemap files
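As an illustration of this layout, here is a minimal sketch that splits a list of URLs into sitemap files and builds the index that lists them. It follows the public sitemaps.org XML format; the class and method names are hypothetical and are not taken from the project's source.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not the project's actual implementation.
class SitemapSketch {

    /** Splits the URLs into chunks of at most maxPerFile entries, one chunk per sitemap file. */
    static List<List<String>> partition(List<String> urls, int maxPerFile) {
        if (maxPerFile <= 0) {
            throw new IllegalArgumentException("maxPerFile must be positive");
        }
        List<List<String>> chunks = new ArrayList<>();
        for (int i = 0; i < urls.size(); i += maxPerFile) {
            chunks.add(urls.subList(i, Math.min(i + maxPerFile, urls.size())));
        }
        return chunks;
    }

    /** Renders one sitemap file containing the given page URLs. */
    static String sitemapFile(List<String> urls) {
        StringBuilder sb = new StringBuilder("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        sb.append("<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">\n");
        for (String url : urls) {
            sb.append("  <url><loc>").append(escape(url)).append("</loc></url>\n");
        }
        return sb.append("</urlset>\n").toString();
    }

    /** Renders the sitemap index that points at every generated sitemap file. */
    static String sitemapIndex(List<String> sitemapFileUrls) {
        StringBuilder sb = new StringBuilder("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        sb.append("<sitemapindex xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">\n");
        for (String url : sitemapFileUrls) {
            sb.append("  <sitemap><loc>").append(escape(url)).append("</loc></sitemap>\n");
        }
        return sb.append("</sitemapindex>\n").toString();
    }

    /** Escapes the characters that are not allowed verbatim in XML text. */
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&apos;");
    }
}
```

With this sketch, `SitemapSketch.partition(recordUrls, 45_000)` would yield one list per record sitemap file. The sitemap protocol allows at most 50,000 URLs per file, so both batch sizes mentioned above stay under that limit.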

To make sure there is always a sitemap available, we use blue/green versions of the sitemap files and keep track of which one is 'active'. At the start of the update process, all files of the inactive version are deleted. Then the new sitemap files are created and the active version is switched from blue to green or vice versa.
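The blue/green switch can be pictured with the following sketch. It is a simplified model that uses an in-memory map in place of real file storage; the class name, the file naming scheme and the starting version are assumptions made for illustration and do not come from the project's code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of the blue/green update cycle.
class BlueGreenSketch {

    enum Version {
        BLUE, GREEN;

        Version other() {
            return this == BLUE ? GREEN : BLUE;
        }
    }

    // In-memory stand-in for the file storage: file name -> file contents.
    private final Map<String, String> storage = new TreeMap<>();
    private Version active = Version.BLUE; // assumed starting point

    Version active() {
        return active;
    }

    List<String> fileNames() {
        return new ArrayList<>(storage.keySet());
    }

    /** Replaces the inactive version's files, then makes that version active. */
    void update(List<String> newSitemapContents) {
        Version inactive = active.other();
        String prefix = inactive.name().toLowerCase(Locale.ROOT) + "-";

        // 1. Delete all files of the inactive version.
        storage.keySet().removeIf(name -> name.startsWith(prefix));

        // 2. Create the new sitemap files under the inactive version.
        for (int i = 0; i < newSitemapContents.size(); i++) {
            storage.put(prefix + "sitemap-" + (i + 1) + ".xml", newSitemapContents.get(i));
        }

        // 3. Switch only after all files are written, so readers never see a partial set.
        active = inactive;
    }
}
```

Because the active version is untouched until the final step, the currently published files stay available for the whole duration of the update.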

For more information about sitemaps in general see also https://support.google.com/webmasters/answer/183668?hl=en

Build

mvn clean install (add -DskipTests to skip the unit tests during the build)

Run locally

You can run the application directly in your IDE (select 'Run' on the SitemapApplication class). For debugging purposes you can use the following URLs:

  • /files shows a list of stored files

  • /file?name=x shows the contents of the stored file with the name x

  • /record/index.xml and /entity/index.xml show the contents of the sitemap index files

Note that you can only run /record/update or /entity/update manually if you configure and provide an administrator API key, e.g. /record/update?wskey=<enter_adminkey_here>
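For scripting a manual update, the request URL could be assembled as in the sketch below. The class name, the base URL in the usage note and the key value are placeholders; only the /record/update and /entity/update paths and the wskey parameter come from the description above.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for building the manual update URL.
class UpdateUrlSketch {

    /** Builds the update URL for the given sitemap type ("record" or "entity"). */
    static URI updateUri(String baseUrl, String type, String adminKey) {
        if (!type.equals("record") && !type.equals("entity")) {
            throw new IllegalArgumentException("type must be 'record' or 'entity'");
        }
        // Encode the key so that special characters cannot break the query string.
        String encodedKey = URLEncoder.encode(adminKey, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/" + type + "/update?wskey=" + encodedKey);
    }
}
```

For example, `UpdateUrlSketch.updateUri("http://localhost:8080", "record", adminKey)` gives a URI that can be opened in a browser or passed to any HTTP client, assuming the application listens on that address.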

Deployment

  1. Generate a Docker image using the project's Dockerfile

  2. Configure the application by generating a sitemap.user.properties file and placing this in the k8s folder. After deployment this file will override the settings specified in the sitemap.properties file located in the src/main/resources folder. The .gitignore file makes sure the .user.properties file is never committed.

  3. Configure the deployment by setting the proper environment variables specified in the configuration template files in the k8s folder

  4. Deploy to Kubernetes infrastructure

  5. To run a sitemap update, deploy the same image but add either record or entity on the command line. This can also be deployed as a Kubernetes cron job.
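The override behaviour described in step 2 can be pictured with plain java.util.Properties: load the defaults first, then load the user file on top, so that any key present in both takes the user value. This is only a sketch of the general idea and not necessarily how the application loads its configuration; the class name and the property keys in the usage note are made up.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical illustration of "user properties override default properties".
class PropertiesOverrideSketch {

    /** Merges two properties sources; entries from userOverrides win over defaults. */
    static Properties merge(String defaults, String userOverrides) {
        Properties merged = new Properties();
        try {
            merged.load(new StringReader(defaults));
            // Loading a second source overwrites any keys that were already set.
            merged.load(new StringReader(userOverrides));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return merged;
    }
}
```

For instance, merging `example.timeout=10` and `example.host=a` with a user file containing `example.host=b` keeps the timeout from the defaults and takes the host from the user file.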

