Ticket: DM-9818
Background
The purpose of DocHub is to make LSST's information artifacts, which are currently spread across many platforms, available and searchable from a single website. I did some research on DocHub in https://sqr-013.lsst.io last November, and that technote will provide useful background on what DocHub will (hopefully) become. But what you'll be building here is an initial prototype for DocHub. Rather than a sophisticated API+React app with JSON-LD metadata modeling, what we're looking for here is:
- A static website published with LSST the Docs to the www product so that its URL will be www.lsst.io (we can alias lsst.io to www.lsst.io too).
- There's no need for persistence yet in building the initial static site; all data can be obtained at build time from the keeper.lsst.codes API and from metadata.yaml files in the GitHub repositories of projects.
- The ltd-dasher project is similar to what you'll build here (Jinja2 templates, with data populated from APIs), except that there's no need to make dochub-prototype a server application (at least at this stage). LTD Keeper doesn't need to trigger a DocHub rebuild every time a new LSST the Docs build is pushed; I think hourly builds will be sufficient. The reason I'm cautious about making this a server app is that the build will take a significant amount of time, so any client would time out unless we build a background task queue. But if we design the entire thing to run as an asynchronous job that can be triggered by cron or launched as a Kubernetes Job, then we get that task queue feature for free (a rough sketch of such a job follows this list).
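To make the asynchronous-job idea concrete, here's a rough sketch of a run-to-completion driver. The /products/ response shape is my reading of the LTD Keeper API (verify against ltd-keeper.lsst.io), and render_site/upload_site are hypothetical helpers, not existing functions:

```python
#!/usr/bin/env python
"""Rough sketch of the hourly build job (cron or Kubernetes Job).

Assumes the LTD Keeper /products/ endpoint returns a JSON list of product
resource URLs; check ltd-keeper.lsst.io for the actual schema.
"""
import requests

KEEPER_URL = 'https://keeper.lsst.codes'


def get_products():
    """Fetch every product resource registered with LTD Keeper."""
    r = requests.get(KEEPER_URL + '/products/')
    r.raise_for_status()
    product_urls = r.json()['products']  # assumed key; verify against the API docs
    return [requests.get(url).json() for url in product_urls]


def main():
    """Entry point for a single run-to-completion build."""
    products = get_products()
    # render_site() and upload_site() are placeholder names for the rendering
    # and ltd-conveyor upload steps discussed below.
    # html = render_site(products)
    # upload_site(html)


if __name__ == '__main__':
    main()
```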
Python package
I think the core implementation can just be a standard Python package, dochubproto (it can even be deployed to PyPI). Inside the package will be a templates directory with the Jinja2 templates, plus Python modules that handle website rendering (getting data from the APIs and actually rendering the templates).
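A minimal sketch of the rendering side, assuming a templates/ directory shipped inside dochubproto (the template name and context variable are placeholders):

```python
from jinja2 import Environment, PackageLoader, select_autoescape

# Load templates from the dochubproto/templates/ directory inside the package.
env = Environment(
    loader=PackageLoader('dochubproto', 'templates'),
    autoescape=select_autoescape(['html']),
)


def render_index(technotes):
    """Render the homepage from a list of technote records
    (see the field list later in this ticket)."""
    template = env.get_template('index.html')
    return template.render(technotes=technotes)
```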
There can be a dochub-render.py executable for triggering a render.
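Whether that's a literal dochub-render.py script or a console_scripts entry point is up to you; for illustration only, the entry-point flavour could look like this (dochubproto.cli:main is a made-up module path):

```python
from setuptools import setup, find_packages

setup(
    name='dochubproto',
    packages=find_packages(),
    include_package_data=True,  # so the templates/ directory ships with the package
    entry_points={
        'console_scripts': [
            # installs a `dochub-render` command that calls dochubproto.cli:main
            'dochub-render = dochubproto.cli:main',
        ],
    },
)
```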
Like ltd-dasher, you can use ltd-conveyor to upload the built HTML/CSS/whatever to LSST the Docs with all the appropriate caching headers.
Dockerizing
If you want, you can Dockerize and deploy dochub-prototype with Kubernetes. I was thinking of doing this as a Kubernetes Job resource; once CronJob is available we can switch to that. The nice thing about this is that we could then build a lightweight api.lsst.codes microservice that triggers a DocHub rebuild simply by deploying the DocHub manifest. Again, this helps us avoid building our own task queue with Celery.
If you can set up a Jenkins or Travis job to run this every hour, that's great. But I think we can still close the epic without nailing the operational infrastructure 100%.
The index.html information content and API sources
The MVP for the sqre-s17-doceng epic is to list all technotes on www.lsst.io. We could also list LDMs and user guides (pipelines.lsst.io, firefly.lsst.io, developer.lsst.io, ltd-keeper.lsst.io, ltd-mason.lsst.io, ltd-conveyor.lsst.io), but I think that shipping just a list of DMTNs, SQRs, and SMTNs would be sufficient and also useful.
Without getting into front-end design, you can treat the DMTN, SQR, and SMTN sections (either all on the homepage, or as separate HTML pages) as ul lists of technote template partials.
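As a rough illustration only (not a design proposal), one of those ul partials could be shaped like this; it's written as an inline Jinja2 template so the snippet stays self-contained, and the field names anticipate the list in the next section:

```python
from jinja2 import Environment

env = Environment(autoescape=True)

# A ul of technote list items; in the package this would live in templates/.
TECHNOTE_LIST = env.from_string("""
<ul class="technote-list">
{% for t in technotes %}
  <li>
    <a href="{{ t.url }}">{{ t.handle }}: {{ t.title }}</a>
    {% if t.description %}<p>{{ t.description }}</p>{% endif %}
  </li>
{% endfor %}
</ul>
""")

# Example render with a single (illustrative) technote record.
print(TECHNOTE_LIST.render(technotes=[
    {'handle': 'SQR-013', 'title': 'DocHub prototyping notes',
     'url': 'https://sqr-013.lsst.io', 'description': None},
]))
```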
The template partials should provide the following information for each technote (a sketch of the merged record follows the list):
- Title (without the handle) (either from keeper.lsst.codes or metadata.yaml)
- The document handle (either from keeper.lsst.codes or metadata.yaml)
- The URL (from keeper.lsst.codes)
- The GitHub repo URL (from keeper.lsst.codes)
- Link to the edition dashboard (computed as https://product.lsst.io/v). For bonus points, use the GitHub API to state whether there are open PRs.
- Date last updated (from keeper.lsst.codes)
- The author list (from metadata.yaml)
- The description (from metadata.yaml, if available)
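Here's a sketch of the merged record each partial would consume, built from a keeper product resource plus the repo's metadata.yaml. The keeper field names (slug, published_url, doc_repo, date_rebuilt) and the metadata.yaml keys (doc_title, authors, description) are my guesses at the two schemas, so check ltd-keeper.lsst.io and the technote template before relying on them:

```python
def build_technote_record(product, metadata):
    """Combine an LTD Keeper product resource with the repo's metadata.yaml
    into the dict a template partial renders."""
    handle = product['slug'].upper()  # e.g. 'sqr-013' -> 'SQR-013'
    return {
        'handle': handle,                                            # keeper or metadata.yaml
        'title': metadata.get('doc_title', product.get('title')),    # either source works
        'url': product['published_url'],                             # keeper
        'repo_url': product['doc_repo'],                             # keeper
        'dashboard_url': product['published_url'].rstrip('/') + '/v',  # computed
        'updated': product.get('date_rebuilt'),                      # keeper; field name is a guess
        'authors': metadata.get('authors', []),                      # metadata.yaml
        'description': metadata.get('description'),                  # metadata.yaml, optional
    }
```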
Getting data from keeper.lsst.codes is straightforward, as you know. You can use the GitHub API to obtain the metadata.yaml file from technote repositories.
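For the metadata.yaml fetch, the GitHub contents API returns the file base64-encoded, so something like the sketch below works. The doc_repo URL parsing is deliberately naive, and unauthenticated GitHub API calls are rate-limited to 60 per hour, so you'll probably want to send a token for hourly builds:

```python
import base64

import requests
import yaml


def fetch_metadata(doc_repo_url):
    """Return the parsed metadata.yaml for a technote's GitHub repository."""
    # e.g. 'https://github.com/lsst-sqre/sqr-013.git' -> 'lsst-sqre/sqr-013'
    repo = doc_repo_url.rstrip('/').replace('https://github.com/', '').replace('.git', '')
    url = 'https://api.github.com/repos/{0}/contents/metadata.yaml'.format(repo)
    r = requests.get(url)
    r.raise_for_status()
    content = base64.b64decode(r.json()['content'])  # contents API returns base64
    return yaml.safe_load(content)
```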
One trick is that not all technotes are on LSST the Docs. Some of the originals are on Read the Docs but still have metadata.yaml files. You can either work around that, or (probably better) just list the technotes that are on LSST the Docs, and I'll get around to porting the old technotes over.