api-entrypoint's People

Contributors

csjx, gothub

Forkers

csjx

api-entrypoint's Issues

Expose the 'virtuoso' service

The slinky stack contains the 'virtuoso' service that should be exposed to users and administrators. The three parts that are of interest are...

1. Conductor Webapp

The Virtuoso Conductor web application is worth exposing because without it, accounts and the graph store settings have to be modified using SQL commands.

2. SPARQL Endpoint

The SPARQL endpoint can be found at /sparql. Users will need their OAUTH token to perform queries here.

3. OAUTH Endpoint

Users should be able to generate an OAUTH token for use with the sparql endpoint; we should definitely be exposing /oauth to allow this.

Endpoint Name

https://api.dataone.org/slinky might not be that descriptive for anyone unfamiliar with the service. I think that
https://api.dataone.org/graph, https://api.dataone.org/virtuoso, or https://api.dataone.org/knowledge could work.

I think a different subdomain might make the most sense since it's not completely an API, but more of a standalone service with an API endpoint. This would give something like graph.dataone.org/sparql for the SPARQL endpoint, graph.dataone.org/oauth for OAUTH, and graph.dataone.org for the Conductor webapp.

Add restricted users to k8s

As new services are added to k8s, they can be administered by appropriate Linux usernames. For example, currently on the dev k8s cluster, the bookkeeper service is started, stopped and upgraded from the Linux 'bookkeeper' username.

This username can be restricted to one k8s namespace, so that it can create and view k8s resources (pods, services) in that namespace and no other.

To enable this, for each username needed:

  • create a k8s service account for the username
  • create the appropriate namespace
  • create the k8s role and rolebinding YAML files that restrict the username
  • create the k8s config file that enables permissions (e.g. ~/.kube/config)

Detailed instructions with template YAML and config files will be added to this repo.
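
Until those templates exist, the first three steps might be sketched with kubectl as shown here. This is only an illustration under stated assumptions: the 'bookkeeper' name from the example above is reused for both the namespace and the service account, and the role name suffix '-admin' and the verb list are hypothetical choices. Commands are printed rather than executed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: restrict a service account to a single namespace.
# The role name and the verb/resource lists are illustrative assumptions.
set -eu

# Print each command; execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

restrict_user() {
  name="$1"   # used for both the namespace and the service account
  run kubectl create namespace "$name"
  run kubectl create serviceaccount "$name" --namespace "$name"
  # Role: permissions that apply only inside this namespace.
  run kubectl create role "${name}-admin" --namespace "$name" \
    --verb=get,list,watch,create,update,patch,delete \
    --resource=pods,services
  # RoleBinding: grant the role to the service account.
  run kubectl create rolebinding "${name}-admin" --namespace "$name" \
    --role="${name}-admin" --serviceaccount="${name}:${name}"
}

restrict_user "${1:-bookkeeper}"
```

The fourth step (the ~/.kube/config file) is left out of the sketch because the way a service account token is obtained depends on the k8s version in use.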

Repository directory structure

@csjx Possible contents of this repo are:

  • files needed to deploy the k8s NGINX Ingress Controller and Virtual Server resources
  • files needed to deploy a HAProxy or other load balancer / reverse proxy

Here is a proposed directory layout for the repo:

- deployments
  - haproxy?
  - nginx-ingress-controller
      - virtual-server
        - helm-chart
        - manifests
          - common
          - deployment
          - rbac
          - service
- docs

How does this look to you?
Is there anything else that might be in this repo?

Document NGINX Ingress Controller / VirtualServer config/start/stop

The initial checkin of files for the NGINX Ingress Controller / VirtualServer contains the files needed to deploy the NGINX Ingress Controller / VirtualServer that runs on the NCEAS k8s cluster. This configuration provides the routing from the main k8s URL https://docker-ucsb-4.dataone.org:30443 (soon to be https://api.dataone.org:443) to the services that will be hosted on k8s, including bookkeeper and the MetaDIG quality service.

A description of how the Ingress Controller and VirtualServer work, as well as operational instructions will be included in the ./docs directory.
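
As a rough illustration of the kind of routing involved, the sketch below prints a minimal VirtualServer manifest that sends one path to one service. It is not the configuration checked into this repo: the service name, port 8080 and the /bookkeeper path in the example call are assumptions, and TLS settings are omitted.

```shell
#!/bin/sh
# Sketch: print a minimal VirtualServer manifest routing one path to one service.
# The values in the example call at the bottom are assumptions, not the
# deployed configuration.
set -eu

render_virtualserver() {
  host="$1"; service="$2"; port="$3"; path="$4"
  cat <<EOF
apiVersion: k8s.nginx.org/v1
kind: VirtualServer
metadata:
  name: ${service}
spec:
  host: ${host}
  upstreams:
    - name: ${service}
      service: ${service}
      port: ${port}
  routes:
    - path: ${path}
      action:
        pass: ${service}
EOF
}

# Review the output, then pipe it to: kubectl apply -n <namespace> -f -
render_virtualserver api.dataone.org bookkeeper 8080 /bookkeeper
```

Because the controller accepts one VirtualServer per host, a real configuration for a shared host would list all upstreams and routes in a single resource rather than generating one per service.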

Renew k8s API certificate

The NCEAS k8s installation uses 'kubeadm' to manage installations and updates to the k8s software. k8s uses self-signed certificates internally to authenticate/authorize operations within the cluster.
These certificates are valid for a year, and are renewed via the command kubeadm certs renew as detailed here.

The current certificates will expire on Feb 19 18:16:40 2022 GMT.

Note that kubeadm has to be upgraded to the current version to support the command mentioned above.
Also, certificates are automatically updated when k8s is upgraded with kubeadm.
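
A manual renewal might look like the sketch below, run on a control-plane node. It assumes a kubeadm release that provides the 'certs' subcommand (older releases exposed the same subcommands under 'kubeadm alpha certs'). Commands are printed rather than executed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: check and renew the kubeadm-managed certificates.
# Commands are printed only; set DRY_RUN=0 to execute them (requires root).
set -eu

# Print each command; execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

renew_certs() {
  run kubeadm certs check-expiration   # show the current expiry dates
  run kubeadm certs renew all          # renew every kubeadm-managed certificate
  run kubeadm certs check-expiration   # confirm the new expiry dates
}

renew_certs
```

The kubeadm documentation notes that the control plane pods need to be restarted after a renewal so that they pick up the new certificates.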

Upgrade k8s on dev, then production

Upgrade k8s on the dev k8s cluster, then on the production k8s cluster, to the current version, which is v1.19.0 (https://kubernetes.io/docs/setup/release/notes/)

Currently the k8s version running on production is:

metadig@docker-ucsb-4:~$ kubectl version
Client Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.3", GitCommit:"06ad960bfd03b39c8310aaf92d1e7c12ce618213", GitTreeState:"clean", BuildDate:"2020-02-11T18:14:22Z", GoVersion:"go1.13.6", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.3", GitCommit:"06ad960bfd03b39c8310aaf92d1e7c12ce618213", GitTreeState:"clean", BuildDate:"2020-02-11T18:07:13Z", GoVersion:"go1.13.6", Compiler:"gc", Platform:"linux/amd64"}

and the version running on dev is:

metadig@docker-dev-ucsb-1:~$ kubectl version
Client Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.3", GitCommit:"06ad960bfd03b39c8310aaf92d1e7c12ce618213", GitTreeState:"clean", BuildDate:"2020-02-11T18:14:22Z", GoVersion:"go1.13.6", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.2", GitCommit:"59603c6e503c87169aea6106f57b9f242f64df89", GitTreeState:"clean", BuildDate:"2020-01-18T23:22:30Z", GoVersion:"go1.13.5", Compiler:"gc", Platform:"linux/amd64"}

Note that it may be necessary to upgrade first to 1.18.x, then to 1.19.0. This process will be fully tested on the dev k8s before it is run on production, so that downtime is minimized. In the past, these upgrades have taken 10-15 minutes unless complications were encountered.
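
Since kubeadm does not support skipping a minor version during an upgrade, the intermediate steps can be listed mechanically. The sketch below is a hypothetical helper, not part of this repo; it only prints the sequence of minor versions and the kubeadm commands to consider at each step, and leaves the exact patch release as 'x' to be filled in at upgrade time.

```shell
#!/bin/sh
# Sketch: list the minor versions to step through, one minor version at a time.
set -eu

upgrade_path() {
  from="$1"; to="$2"      # minor numbers, e.g. 17 and 19
  minor=$((from + 1))
  while [ "$minor" -le "$to" ]; do
    echo "1.${minor}"
    minor=$((minor + 1))
  done
}

# Print the commands for each step; nothing is executed.
for v in $(upgrade_path 17 19); do
  echo "kubeadm upgrade plan           # confirm a ${v}.x target is offered"
  echo "kubeadm upgrade apply v${v}.x  # substitute the exact patch release"
done
```

The kubeadm package itself has to be upgraded to the target minor version before each 'upgrade apply' step.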

Reverse proxy / load balancer options

Please view the related post: #1

If a decision to use a reverse proxy (RP) / load balancer is made, here are a couple of options, with some highlights of each option listed.

  1. HAproxy
    • http://www.haproxy.org/
    • TCP, TCP-SSL, HTTP and HTTPS load balancing
    • flexible health checks and failover conditions
    • Basic caching (v1.8 - 2017)
    • Customizable log format, to import access logs to kibana/splunk/graylog
    • Detailed status page, to see active requests and servers status
    • Exportable metrics, to integrate with monitoring solutions (graphite/prometheus/datadog)
    • More high-performance oriented. Better indicated to handle 100k connections or 40 GbE interfaces.
  2. Apache http
  3. NGINX
    • https://docs.nginx.com/nginx/admin-guide/web-server/reverse-proxy/
    • HTTP and HTTPS load balancing (TCP - UDP in paid edition)
    • More flexibility on caching
    • Customizable log format, to import access logs to kibana/splunk/graylog
    • No status page (paid edition only)
    • No exportable metrics (paid edition only)
    • Can serve local files
    • Can serve FastCGI applications (not CGI)
    • Nginx is open core and many features are only available in the paid edition

Make 'external' services available to k8s

@csjx relevant to the recent conversation regarding api.dataone.org and k8s:

Services running on k8s may need to access services running outside the k8s cluster.

For example, the quality service needs to access the NSF Awards database (i.e. https://api.nsf.gov/services/v1). Currently this is accessed by hardcoding the URL in the quality server source code.

The k8s infrastructure supports a 'standard' way to make external services available to internal k8s services. The k8s service definition supports the service type 'ExternalName', which essentially causes a CNAME redirect to a DNS name defined in the service. For example, an NSF awards service definition could be defined:

apiVersion: v1
kind: Service
metadata:
  name: nsf
  namespace: metadig
spec:
  type: ExternalName
  # externalName must be a bare DNS name, not a URL with a scheme
  externalName: api.nsf.gov

Internal services could then access this service using a domain name that is dynamically resolved using the internal k8s DNS server: http://nsf.metadig.svc.cluster.local/services/v1.

The main reason for using this type of service is so that if the external service URL changes, only the k8s service definition needs to change, and not source code that has the URL hard-coded.
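
The same service could also be created and checked from the command line. The sketch below assumes the default 'cluster.local' cluster domain used in the URL above; the 'dns-check' pod name and the busybox image are arbitrary choices. Commands are printed rather than executed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: create an ExternalName service and confirm it resolves inside the cluster.
set -eu

# Print each command; execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# Cluster-internal DNS name of a service (assumes the default cluster.local domain).
service_fqdn() {
  echo "$1.$2.svc.cluster.local"
}

expose_external() {
  name="$1"; namespace="$2"; target="$3"   # target is a bare DNS name
  run kubectl create service externalname "$name" \
    --external-name "$target" --namespace "$namespace"
  # Resolve the internal name from a throwaway pod to confirm the CNAME.
  run kubectl run dns-check --namespace "$namespace" --rm -i --restart=Never \
    --image=busybox -- nslookup "$(service_fqdn "$name" "$namespace")"
}

expose_external nsf metadig api.nsf.gov
```

One caveat worth testing: the redirect happens at the DNS level only, so an HTTPS client that connects using the internal name may see a certificate hostname mismatch or send a Host header the external server does not expect.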

The only problem with this mechanism is that the NGINX Inc. Ingress Controller only supports 'ExternalName' services with the NGINX Plus version (nginxinc/kubernetes-ingress#485).

So, if this is something that should be added to the NCEAS k8s, then the community version of the NGINX Ingress Controller must be used (i.e. https://kubernetes.github.io/ingress-nginx/).

Consider load balancer/reverse proxy

A reverse proxy (RP) could be used to route traffic to services running on the NCEAS k8s cluster, and
to other DataONE services, as required.

benefits of using a reverse proxy / load balancer

costs of using a reverse proxy
