nexus-operator's People

Contributors

bdurrow, kaitou786, kevin-mok, lcaparelli, nwalens, radtriste, ricardozanini, sowmiyamuthuraman

nexus-operator's Issues

Ingress comparison is not working as expected

Ingress comparison is not working as expected due to 2 issues:

  1. the type registered to the comparator is k8s.io/api/extensions/v1beta1.Ingress instead of k8s.io/api/networking/v1beta1.Ingress
  2. the comparison function compares the deployed resource with itself, instead of comparing it with the required resource
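
The second problem can be illustrated with a minimal sketch. The Ingress and IngressSpec types below are simplified stand-ins for the Kubernetes types, and ingressEqual is a hypothetical name rather than the operator's actual function; the only point is that the comparison must use both arguments.

```go
package main

import (
	"fmt"
	"reflect"
)

// IngressSpec and Ingress are simplified stand-ins for the Kubernetes types.
type IngressSpec struct {
	Host        string
	ServiceName string
	ServicePort int
}

type Ingress struct {
	Name string
	Spec IngressSpec
}

// ingressEqual reports whether the deployed Ingress matches the requested one.
// Comparing deployed with itself would always return true and hide any drift.
func ingressEqual(deployed, requested Ingress) bool {
	return reflect.DeepEqual(deployed.Spec, requested.Spec)
}

func main() {
	deployed := Ingress{Name: "nexus3", Spec: IngressSpec{Host: "old.example.com", ServiceName: "nexus3", ServicePort: 8081}}
	requested := Ingress{Name: "nexus3", Spec: IngressSpec{Host: "new.example.com", ServiceName: "nexus3", ServicePort: 8081}}

	fmt.Println(ingressEqual(deployed, deployed))  // the buggy self-comparison
	fmt.Println(ingressEqual(deployed, requested)) // the intended comparison
}
```

With the self-comparison the Host change goes unnoticed, while the intended comparison detects it and would trigger an update.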

Rewrite comparator logic

Currently the comparator is not using custom comparison functions, so changes to the Ingress (which doesn't have a default comparator) are left unaddressed. We could wrap the comparator with a builder that points to our own comparison functions, as is done in the Kogito Operator.

This also increases our control over this part of the reconcile loop, which can prove to be useful in the long run.

Generate OLM files to publish at OperatorHub

OperatorHub needs CSV (ClusterServiceVersion) files and other metadata files, which can be generated via the operator-sdk CLI. Those files should be present in the project so that we can send a PR to the OLM repo and publish the operator in the OperatorHub.io catalog.

Add Nexus3 health check using its own API endpoint

Is your feature request related to a problem? Please describe.
Today the health check just makes an HTTP request to the default port. However, the Nexus Server offers a standard health check endpoint at /service/rest/v1/status.

Describe the solution you'd like
The Nexus deployment probes should check the /service/rest/v1/status endpoint instead of only the default port.

Describe alternatives you've considered
None

Additional context
See the documentation: https://help.sonatype.com/repomanager3/rest-and-integration-api/status-api#StatusAPI-Status

Would you be able to assist in testing this feature if implemented?
Yes
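
As a rough illustration of the check requested above, the sketch below issues a GET against /service/rest/v1/status and treats only HTTP 200 as healthy. The isHealthy and newFakeNexus names are hypothetical, and the fake server merely stands in for a real Nexus instance; in the operator itself this would more likely be expressed as an HTTP probe on the Deployment.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// statusPath is the Nexus 3 status endpoint mentioned in the issue.
const statusPath = "/service/rest/v1/status"

// isHealthy returns true when the status endpoint answers with HTTP 200.
// Any other status code is treated as "not ready" (an assumption of this sketch).
func isHealthy(client *http.Client, baseURL string) (bool, error) {
	resp, err := client.Get(baseURL + statusPath)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	return resp.StatusCode == http.StatusOK, nil
}

// newFakeNexus starts a local test server that stands in for Nexus.
func newFakeNexus(ready bool) *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc(statusPath, func(w http.ResponseWriter, r *http.Request) {
		if ready {
			w.WriteHeader(http.StatusOK)
			return
		}
		w.WriteHeader(http.StatusServiceUnavailable)
	})
	return httptest.NewServer(mux)
}

func main() {
	srv := newFakeNexus(true)
	defer srv.Close()

	ok, err := isHealthy(srv.Client(), srv.URL)
	fmt.Println(ok, err)
}
```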

Level up to "Seamless Upgrades" capability level

Issue to track our capability level. We should be able to do minor upgrades of the installed Nexus Server. This will probably impact the way we handle the application image, since we are using the latest tag by default, which is not a good practice. Today, if the user provides an image for the Nexus server, the operator will stick with it unless the user manually changes it later in the CR.

Ideally, users should be able to define the major Nexus version and the operator should upgrade minor versions automatically. We will reach another maturity level with this approach. See: https://sdk.operatorframework.io/docs/operator-capabilities/#level-2---seamless-upgrades

Refactor resource management to simplify its logic

We could adopt the same pattern used in #52 to simplify the resource life cycle management. We could introduce an interface along the lines of:

type ResourceManager interface {
    GetRequiredResources() (map[reflect.Type][]resource.KubernetesResource, error)
    GetDeployedResources() (map[reflect.Type][]resource.KubernetesResource, error)
    GetComparator(t reflect.Type) func(deployed resource.KubernetesResource, requested resource.KubernetesResource)
}

Objects that implement this interface would be responsible for managing the life cycle of one domain of resources.

It could also simplify the logic contained in pkg/controller/nexus/resource/resources.go by storing a set of ResourceManager objects which could be iterated over. In pseudo-code, for example:

GetDeployedResources() {
    for manager in ResourceManagers {
        deployedResources += manager.GetDeployedResources()
    }
    return deployedResources
}

Pods can't be started on OCP 3.x clusters with default configuration

When attempting to deploy to an OCP 3.x cluster the following error pops up:

Error creating: pods "nexus3-695b67564f-" is forbidden: unable to validate against any security context constraint: [fsGroup: Invalid value: []int64{200}: 200 is not an allowed group spec.containers[0].securityContext.securityContext.runAsUser: Invalid value: 200: must be in the ranges: [1000160000, 1000169999]]

As Nexus must run using this UID, the cluster administrator needs to create an SCC to work around this. It would be nice if we could supply this SCC and document it.

Improve install scripts or change README file

Is your feature request related to a problem? Please describe.
Today, when the user clones the project and just runs make install, the nexus deployment fails because the image tagged 0.3.0 doesn't exist on Quay. The user can edit the deployment and change the image tag to get the pod running.

Describe the solution you'd like
The user should not have to edit anything; just running make install should get the deployment running.

Describe alternatives you've considered

  1. Change the tag in the deployment file
  2. Instruct the user to download the repo from the releases page
  3. Modify the install.sh script to accept the image tag/version and check out the repository at that tag; the tag can default to the latest release (currently 0.2.1)

Additional context
None

Would you be able to assist in testing this feature if implemented?
Yes

Refactor resource managers to return their struct, not the interface

When creating a new resource manager today we return its interface:

func NewManager(nexus v1alpha1.Nexus, client client.Client) infra.Manager {

While this is fine at the moment, it goes against the "Accept Interfaces Return Struct" rule of thumb. Returning an interface brings no additional benefit, but it can reduce the code's flexibility by removing all of the object's behavior that's not explicitly defined in the interface.

For example, consider the following interface:

type Person interface {
    Name() string
    Age() int
}

One implementation of that interface could be:

type Gamer struct {
	name  string
	age   int
	games []Game
}

func (g Gamer) Name() string {
	return g.name
}

func (g Gamer) Age() int {
	return g.age
}

func (g Gamer) Games() []Game {
	return g.games
}

If the function that creates a new gamer returned Person, we could never access this additional behavior defined in Games(). If it returned *Gamer, we'd still be able to use functions which receive a Person as parameter, as Gamer implements that interface AND we'd still be able to make use of this type's specific behavior.
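
A minimal, runnable version of this argument is sketched below. The Game type, the NewGamer constructor and the describe function are hypothetical additions for illustration; the original snippet defines none of them.

```go
package main

import "fmt"

type Person interface {
	Name() string
	Age() int
}

// Game is a hypothetical placeholder type.
type Game string

type Gamer struct {
	name  string
	age   int
	games []Game
}

func (g Gamer) Name() string  { return g.name }
func (g Gamer) Age() int      { return g.age }
func (g Gamer) Games() []Game { return g.games }

// NewGamer returns the concrete type, so callers keep access to Games().
func NewGamer(name string, age int, games ...Game) *Gamer {
	return &Gamer{name: name, age: age, games: games}
}

// describe accepts the interface, so it works with any Person.
func describe(p Person) string {
	return fmt.Sprintf("%s (%d)", p.Name(), p.Age())
}

func main() {
	g := NewGamer("Ana", 30, "Chess", "Go")
	fmt.Println(describe(g))    // *Gamer satisfies Person
	fmt.Println(len(g.Games())) // behavior outside the interface is still reachable
}
```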

The Operator fails to start on Openshift 3.x

When starting the Operator on Openshift 3.x the following error occurs:

2020-05-02T19:57:51.816-0300	ERROR	cmd	Manager exited non-zero	{"error": "no matches for kind \"Ingress\" in version \"networking.k8s.io/v1beta1\""}
github.com/go-logr/zapr.(*zapLogger).Error
	/home/lcaparel/go/pkg/mod/github.com/go-logr/[email protected]/zapr.go:128
main.main
	/home/lcaparel/gitrepos/nexus-operator/cmd/manager/main.go:155
runtime.main
	/usr/local/go/src/runtime/proc.go:203
FATA[0012] Failed to run operator locally: failed to run operator locally: failed to exec []string{"build/_output/bin/nexus-operator-local"}: exit status 1 

This is due to the fact that we only compare group names when assessing which resources are available in pkg/framework/controller_watcher.go:

for _, object := range watchedObjects {
	// core resources
	if object.AddToScheme == nil {
		desiredObjects = append(desiredObjects, object)
	} else {
		found := false
		for _, serverGroup := range serverGroups.Groups {
			if strings.Contains(serverGroup.Name, object.GroupVersion.Group) {
				addToScheme = append(addToScheme, object.AddToScheme)
				desiredObjects = append(desiredObjects, object)
				found = true
				delete(c.groupsNotWatched, object.GroupVersion.Group)
				break
			}
		}

The Ingress is part of networking.k8s.io/v1beta1. Openshift 3.x supports networking.k8s.io/v1, but not v1beta1. As we compare the group name but not the version, found is set to true, which eventually leads to the panic.
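
One possible direction for a fix is to check the version as well as the group. The sketch below uses simplified stand-in types instead of the real discovery types, and isServed is a hypothetical helper name. Note that it also uses an exact group match rather than strings.Contains, which is a deliberate change from the snippet above.

```go
package main

import "fmt"

// apiGroup is a simplified stand-in for a discovered server API group.
type apiGroup struct {
	Name     string
	Versions []string
}

// groupVersion is a simplified stand-in for a schema GroupVersion.
type groupVersion struct {
	Group   string
	Version string
}

// isServed reports whether the server offers exactly this group and version.
func isServed(serverGroups []apiGroup, gv groupVersion) bool {
	for _, g := range serverGroups {
		if g.Name != gv.Group {
			continue
		}
		for _, v := range g.Versions {
			if v == gv.Version {
				return true
			}
		}
	}
	return false
}

func main() {
	// Mirrors the situation described above: the group exists, but only at v1.
	ocp3 := []apiGroup{{Name: "networking.k8s.io", Versions: []string{"v1"}}}

	fmt.Println(isServed(ocp3, groupVersion{Group: "networking.k8s.io", Version: "v1beta1"}))
	fmt.Println(isServed(ocp3, groupVersion{Group: "networking.k8s.io", Version: "v1"}))
}
```

With a check like this, the Ingress would not be added to the watched objects on a cluster that only serves networking.k8s.io/v1.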

Nexus fails when deployed without persistence

When I deploy Nexus without persistence then the Nexus pod fails with "Permission denied" issues.

Example CR:

apiVersion: apps.m88i.io/v1alpha1
kind: Nexus
metadata:
  name: nexus3
spec:
  replicas: 1
  useRedHatImage: false
  resources:
    limits:
      cpu: "2"
      memory: "2Gi"
    requests:
      cpu: "1"
      memory: "2Gi"
  persistence:
    persistent: false
  networking:
    expose: true

In the log I can see:

id: cannot find name for user ID 1000650000
Warning:  Cannot open log file: ../sonatype-work/nexus3/log/jvm.log
Warning:  Forcing option -XX:LogFile=/tmp/jvm.log
OpenJDK 64-Bit Server VM warning: Cannot open file ../sonatype-work/nexus3/log/jvm.log due to Permission denied

java.io.FileNotFoundException: ../sonatype-work/nexus3/tmp/i4j_tA0O_LqRVFNhWb_IlDQiAGNa5vA=.lock (Permission denied)
	at java.io.RandomAccessFile.open0(Native Method)
	at java.io.RandomAccessFile.open(RandomAccessFile.java:316)
	at java.io.RandomAccessFile.<init>(RandomAccessFile.java:243)
	at com.install4j.runtime.launcher.util.SingleInstance.check(SingleInstance.java:72)
	at com.install4j.runtime.launcher.util.SingleInstance.checkForCurrentLauncher(SingleInstance.java:31)
	at com.install4j.runtime.launcher.UnixLauncher.checkSingleInstance(UnixLauncher.java:88)
	at com.install4j.runtime.launcher.UnixLauncher.main(UnixLauncher.java:67)
...

Allow the user to select the Pull Policy

Is your feature request related to a problem? Please describe.
At the moment the Pull Policy is hardcoded as "Always", meaning the user has no choice whether or not to pull an image.

Describe the solution you'd like
It would be nice if the user could select the policy as they see fit.

Describe alternatives you've considered
There could be a new field in the Nexus CRD to control this.

Additional context
N/A

Would you be able to assist in testing this feature if implemented?
Yes 😁

Add pre hook script to write admin temp password

When it’s first created, Nexus 3 writes a file in /nexus-data directory containing the temporary password for admin.

A pre hook script could write a predefined password to this file, so users could easily grab it in the Operator Status screen for the first login.

Nexus Operator fails to create a ServiceAccount for the deployed CR

See: #63

This is a regression from #41 fix.

In version 0.2.0 users will see the following message in the logs:

E0515 16:51:07.295411       1 reflector.go:153] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:105: Failed to list *v1.ServiceAccount: serviceaccounts is forbidden: User "system:serviceaccount:nexus:nexus-operator" cannot list resource "serviceaccounts" in API group "" in the namespace "nexus"

To work around this issue, the nexus-operator service account must have permissions on ServiceAccount objects. We control this kind of permission in the roles.yaml file. See #41.

There are some other things to polish before fixing this bug, which is why we opened this issue.

Implement backup persistent storage

Nexus has a built-in backup capability. It would be interesting to have this feature supported by the operator as well, by providing persistent storage for it. See: https://help.sonatype.com/repomanager3/backup-and-restore/configure-and-run-the-backup-task

Proposal

The Nexus CRD interface would have a switch to turn backup on/off. If "on", the operator would create a PVC for it, and call the internal Nexus API to create this task for the admin, setting the backup path to the volume mount.

Also, the "notification" e-mail should be added to the interface: an attribute describing the adminEmail and a backup structure with the notificationEmail in it. If notificationEmail is empty, we would fall back to the adminEmail.

Structure suggestion:

apiVersion: apps.m88i.io/v1alpha1
kind: Nexus
metadata:
  name: nexus3
spec:
  (...)
  adminEmail: [email protected]
  backup:
    enabled: true
    notificationEmail: [email protected]
    # ideally greater than the one set for the service
    volumeSize: 10Gi  
   (...)
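
The e-mail fallback in this proposal could be sketched as follows. The Go field names simply mirror the YAML keys above and, like the backupNotificationEmail helper, are assumptions rather than existing operator code.

```go
package main

import "fmt"

// BackupSpec mirrors the proposed "backup" structure (hypothetical fields).
type BackupSpec struct {
	Enabled           bool
	NotificationEmail string
	VolumeSize        string
}

// NexusSpec holds only the fields relevant to this sketch.
type NexusSpec struct {
	AdminEmail string
	Backup     BackupSpec
}

// backupNotificationEmail returns the backup notification address,
// falling back to the admin e-mail when none is set.
func backupNotificationEmail(spec NexusSpec) string {
	if spec.Backup.NotificationEmail != "" {
		return spec.Backup.NotificationEmail
	}
	return spec.AdminEmail
}

func main() {
	spec := NexusSpec{
		AdminEmail: "admin@example.com",
		Backup:     BackupSpec{Enabled: true, VolumeSize: "10Gi"},
	}
	fmt.Println(backupNotificationEmail(spec))
}
```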

Deploy a new Nexus instance with a given image and track its deployment status

This is the basic functionality for the Nexus Operator. Users will have the option to create a new Nexus instance based on a given image (version 3.x).

Once deployed, users could see the deployment status based on the Deployment resource, for example:

oc describe nexus

If no image is provided, the latest tag will be deployed.

Move all Nexus CR validations to the resource Managers

Is your feature request related to a problem? Please describe.
At this moment validations are spread out in the reconcile loop and in resource-generating functions. This makes the code harder to maintain as we need to worry about validation in more than one place. Additionally, projects importing this are not able to send a Nexus CR to the managers without making additional checks themselves.

Describe the solution you'd like
Make the validation take place in a single place behind an exposed API, improving code maintainability and usability.

Describe alternatives you've considered
Shift the validation responsibility to the Resource Managers. The reconcile loop would use them as would any users importing us as a library. The resource Managers would then be responsible for:

  • resource creation;
  • resource fetching;
  • resource comparison;
  • Nexus CR validation;

If the Manager finds an error it can't recover from when validating the Nexus CR it should return an error and die.

Additional context
N/A

Would you be able to assist in testing this feature if implemented?
Absolutely. 😁

Fix path for `go.mod` in CONTRIBUTING.md

Is your feature request related to a problem? Please describe.
In CONTRIBUTING.md, clicking on the go.mod file link leads to a page that is not available.

Describe the solution you'd like
Clicking the link should lead to the go.mod file in the repo.

Describe alternatives you've considered
Changing the path in the link should fix this issue.

Additional context
N/A

Would you be able to assist in testing this feature if implemented?
Yes

On OpenShift, create ImageStreams to handle Nexus Images

When deploying on OpenShift, it would be better to have support for Image Streams. This way we could create a new stream in the openshift namespace to fetch Red Hat certified images from the Red Hat Catalog. Otherwise, users would have to set their own tokens within the Nexus namespace to use Red Hat certified images.

Rebuild Nexus Operator 0.2.1 image to fix CVE RHSA-2020:2637 and RHSA-2020:1998

At this moment we have two security issues in our latest image, impacting our users. We should rebuild to take a fresh ubi8 base image that has these fixes.

Also, we will create a tag 0.2 pointing to the latest micro version. This way administrators can choose whether or not to deploy the operator with the latest micro versions.

More information about the security issues:

https://access.redhat.com/errata/RHSA-2020:2637
https://access.redhat.com/errata/RHSA-2020:1998

Thanks @Kaitou786 for reporting it.

Enhancement in Nexus Spec

In current master, the Nexus spec object supports the following parameters:

type NexusSpec struct {
	Replicas int32 `json:"replicas"`
	Image string `json:"image,omitempty"`
	Resources corev1.ResourceRequirements `json:"resources,omitempty"`
	Persistence NexusPersistence `json:"persistence"`
	UseRedHatImage bool `json:"useRedHatImage"`
	Networking NexusNetworking `json:"networking,omitempty"`
}

I would like to suggest an enhancement to this spec, adding the usual deployment parameters such as taints, tolerations, liveness probes, affinity, and pod disruption budgets.
I'll raise a PR for this soon.

It may be possible to add this to Projects as an enhancement ticket.

Write permissions problem on minikube

The container hangs during pod initialization because of a lack of permissions on the /nexus-data directory:

mkdir: cannot create directory '../sonatype-work/nexus3/log': Permission denied
mkdir: cannot create directory '../sonatype-work/nexus3/tmp': Permission denied
OpenJDK 64-Bit Server VM warning: Cannot open file ../sonatype-work/nexus3/log/jvm.log due to No such file or directory

Warning:  Cannot open log file: ../sonatype-work/nexus3/log/jvm.log
Warning:  Forcing option -XX:LogFile=/tmp/jvm.log
java.io.FileNotFoundException: ../sonatype-work/nexus3/tmp/i4j_ZTDnGON8hezynsMX2ZCYAVDtQog=.lock (No such file or directory)
	at java.io.RandomAccessFile.open0(Native Method)
	at java.io.RandomAccessFile.open(RandomAccessFile.java:316)
	at java.io.RandomAccessFile.<init>(RandomAccessFile.java:243)
	at com.install4j.runtime.launcher.util.SingleInstance.check(SingleInstance.java:72)
	at com.install4j.runtime.launcher.util.SingleInstance.checkForCurrentLauncher(SingleInstance.java:31)
	at com.install4j.runtime.launcher.UnixLauncher.checkSingleInstance(UnixLauncher.java:88)
	at com.install4j.runtime.launcher.UnixLauncher.main(UnixLauncher.java:67)
java.io.FileNotFoundException: /nexus-data/karaf.pid (Permission denied)
	at java.io.FileOutputStream.open0(Native Method)
	at java.io.FileOutputStream.open(FileOutputStream.java:270)
	at java.io.FileOutputStream.<init>(FileOutputStream.java:213)

	at java.io.FileOutputStream.<init>(FileOutputStream.java:101)
	at org.apache.karaf.main.InstanceHelper.writePid(InstanceHelper.java:126)
	at org.apache.karaf.main.Main.launch(Main.java:243)
	at org.sonatype.nexus.karaf.NexusMain.launch(NexusMain.java:113)
	at org.sonatype.nexus.karaf.NexusMain.main(NexusMain.java:52)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at com.exe4j.runtime.LauncherEngine.launch(LauncherEngine.java:85)
	at com.install4j.runtime.launcher.UnixLauncher.main(UnixLauncher.java:69)
java.lang.RuntimeException: /nexus-data/log/karaf.log (No such file or directory)
	at org.apache.karaf.main.util.BootstrapLogManager.getDefaultHandlerInternal(BootstrapLogManager.java:102)
	at org.apache.karaf.main.util.BootstrapLogManager.getDefaultHandlersInternal(BootstrapLogManager.java:137)
	at org.apache.karaf.main.util.BootstrapLogManager.getDefaultHandlers(BootstrapLogManager.java:70)
	at org.apache.karaf.main.util.BootstrapLogManager.configureLogger(BootstrapLogManager.java:75)
	at org.apache.karaf.main.Main.launch(Main.java:244)
	at org.sonatype.nexus.karaf.NexusMain.launch(NexusMain.java:113)
	at org.sonatype.nexus.karaf.NexusMain.main(NexusMain.java:52)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at com.exe4j.runtime.LauncherEngine.launch(LauncherEngine.java:85)
	at com.install4j.runtime.launcher.UnixLauncher.main(UnixLauncher.java:69)
Caused by: java.io.FileNotFoundException: /nexus-data/log/karaf.log (No such file or directory)
	at java.io.FileOutputStream.open0(Native Method)
	at java.io.FileOutputStream.open(FileOutputStream.java:270)
	at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
	at org.apache.karaf.main.util.BootstrapLogManager$SimpleFileHandler.open(BootstrapLogManager.java:193)
	at org.apache.karaf.main.util.BootstrapLogManager$SimpleFileHandler.<init>(BootstrapLogManager.java:182)
	at org.apache.karaf.main.util.BootstrapLogManager.getDefaultHandlerInternal(BootstrapLogManager.java:100)
	... 12 more
Error creating bundle cache.
Unable to update instance pid: Unable to create directory /nexus-data/instances

Spin another pod if image CR got updated

Users can update the Spec.Image field. In this case, a new pod should be created and deployed to the cluster, preserving its volumes, secrets, configMaps and so on.

Data incompatibilities won't be treated for now.

Add SCC to the Service Account in OCP

Having the operator's Service Account using a restrictive SCC would improve the operator's security.

I have an initial implementation of this that is failing to build due to some dependency issues. The libraries we're using seem to be somewhat incompatible as they are now, let's keep a close watch to continue this as soon as possible.

At the moment the cluster admin must add an SCC to the Service Account in order to start pods correctly in OCP 3.x (#41). If this were implemented, that step wouldn't be necessary.

Add TLS support

On OpenShift, rely on Routes. For Kubernetes, add fields to the spec for the crt and key to be injected into the Ingress.

Allow the fake client to mock server API error responses

Today we have the following implementation for the fake client used in testing, taken from pkg/test/client.go:

// NewFakeClient will create a new fake client with all needed schemas
func NewFakeClient(initObjs ...runtime.Object) client.Client {
	return fake.NewFakeClientWithScheme(GetSchema(), initObjs...)
}

// GetSchema gets the needed schema for fake tests
func GetSchema() *runtime.Scheme {
	s := scheme.Scheme
	s.AddKnownTypes(v1alpha1.SchemeGroupVersion, &v1alpha1.Nexus{})
	s.AddKnownTypes(routev1.GroupVersion, &routev1.Route{}, &routev1.RouteList{})
	return s
}

While this is nice and really useful, it doesn't allow us to mock specific responses to certain events/actions. That feature would come in pretty handy when testing the resource managers' ability to handle/report errors when the server responds, for example, with a 500 status code.

Digging through some docs, I found that k8s.io/client-go/testing (which is already a dependency of the project) has a nice fake client implementation that lets us insert "interceptors" (called Reactor here) to define how the fake client should respond to certain actions.

Check out its godoc for all the good stuff and this issue for an example of working usage.

Also noteworthy is that our fake discovery client uses the k8s.io/client-go/discovery/fake package, which is built on top of the same Fake implementation.
