opensearch-project / opensearch-k8s-operator
OpenSearch Kubernetes Operator
License: Apache License 2.0
Exposing the cluster with selected vendors (ingress, haproxy, etc.).
It seems to take a really long time to create transport certs. I think we need to create them asynchronously in a goroutine and let the rest of the reconciliation continue.
OpenSearch should automatically load the certs when they appear.
Make opensearch-k8s-operator able to adjust the OpenSearch StatefulSet resources (CPU/memory requests and limits).
The user should be able to pass CPU and memory resources to the cluster from cluster.yaml.
https://github.com/Opster/opensearch-k8s-operator/blob/main/docs/userguide/main.md
Example:
apiVersion: opensearch.opster.io/v1
kind: OpenSearchCluster
metadata:
  name: my-first-cluster
  namespace: default
spec:
  general:
    serviceName: my-first-cluster
  dashboards:
    enable: true
  nodePools:
    - component: masters
      replicas: 3
      diskSize: 30
      NodeSelector:
      cpu: "0.3"
      memory: "2Gi"
      roles:
        - "data"
        - "master"
The values should be passed as follows; the user should be able to adjust CPU with "m" units and memory with "Mi" or "Gi":
cpu: "0.3"
memory: "2Gi"
This came up during discussions in the biweekly meeting. It seems that today users can set up the Kubernetes operator with weak (hardcoded) demo certificates and admin:admin
default credentials. These weak configurations are fine for PoC building but bad if carried forward to production. Going forward, we should fix this to provide strong out-of-the-box defaults with autogenerated self-signed certificates and stronger default passwords.
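A rough sketch of what a safer default could look like, using the TLS generate flags that already appear in the CRD examples further down this page; the idea is that the operator autogenerates self-signed certificates instead of shipping the demo ones:
security:
  tls:
    transport:
      generate: true   # operator autogenerates self-signed transport certs
      perNode: true
    http:
      generate: true   # operator autogenerates the HTTP cert as well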
Currently, the Operator is tested locally and on AWS. We would like to make sure before the Operator is GA that it is fully compatible with the major clouds - AWS, GCP, and Azure.
For example, if I want to replace internal_users.yml
with custom values, I should be able to do that without having to include every securityconfig file.
This is related to #89
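For illustration, a Secret carrying only the one overridden file might look like the sketch below; whether the operator merges this with defaults for the remaining securityconfig files is exactly the behavior being requested (the secret name and merge behavior are assumptions):
apiVersion: v1
kind: Secret
metadata:
  name: securityconfig-secret
type: Opaque
stringData:
  internal_users.yml: |
    # only this file is overridden; the other securityconfig files
    # would ideally fall back to the operator defaults (requested behavior)
    admin:
      hash: "<bcrypt hash>"
      reserved: true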
All components are wired using secure connections:
Basic auth
SSL
Disk encryption options
Creating an OpenSearchCluster resource with a partial spec:
apiVersion: opensearch.opster.io/v1
kind: OpenSearchCluster
metadata:
  name: test-opensearch
  namespace: default
spec:
  security:
    tls:
      http:
        secret:
          name: test-opensearch-http
...
Results in:
1.6522032290237098e+09 INFO controller.opensearchcluster Reconciling OpenSearchCluster {"reconciler group": "opensearch.opster.io", "reconciler kind": "OpenSearchCluster", "name": "test-opensearch", "namespace": "default", "cluster": "default/test-opensearch"}
1.6522032290357742e+09 ERROR controller.opensearchcluster Not all secrets for http provided {"reconciler group": "opensearch.opster.io", "reconciler kind": "OpenSearchCluster", "name": "test-opensearch", "namespace": "default", "error": "missing secret in spec"}
opensearch.opster.io/pkg/reconcilers.(*TLSReconciler).Reconcile
/workspace/pkg/reconcilers/tls.go:70
opensearch.opster.io/controllers.(*OpenSearchClusterReconciler).reconcilePhaseRunning
/workspace/controllers/opensearchController.go:326
opensearch.opster.io/controllers.(*OpenSearchClusterReconciler).Reconcile
/workspace/controllers/opensearchController.go:141
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:114
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:311
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:227
If I pass in a nearly full spec (with expected defaults), it errors the same way.
apiVersion: opensearch.opster.io/v1
kind: OpenSearchCluster
metadata:
  name: test-opensearch
  namespace: default
spec:
  security:
    tls:
      http:
        generate: false
        caSecret:
          name: test-opensearch-http
        secret:
          name: test-opensearch-http
      transport:
        generate: true
        perNode: true
...
Expected behavior would be that the remaining defaults populate on their own; the CRD documentation lists most of the properties as optional and does not specify this level of cross-dependency (all-or-none behavior). Alternatively, the logged error should be clearer, since the "secret" is not actually missing.
Hey, I just created this issue with some thoughts on how we can achieve this disk reconciler task, i.e. increasing the disk size.
Solution [1]:
Solution [2]:
This service is responsible for updating configuration that requires a rolling restart.
Create a standalone service called scaler which runs in a K8s pod.
The service should be able (via APIs) to scale nodes up/down according to node role.
Detailed explanation:
As part of the architecture, the operator will be responsible for loading service workers (like this scaler).
The scaler service will be the one that handles what is needed in order to scale the cluster up or down.
The service will receive an API call to increase/decrease the number of nodes for the desired node role.
If the service finds that more nodes should be added, it will update the CRD with the desired number of nodes.
If the service finds that nodes should be removed from the cluster, it will trigger a drain process which, on successful completion, should update the CRD via the API.
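For example, if the scaler decides the data tier needs one more node, the change it applies to the CRD is simply a bump of the matching node pool's replicas (sketch only, using the existing nodePools fields):
nodePools:
  - component: nodes
    replicas: 4   # scaler raises this from 3 after deciding to scale up the data role
    roles:
      - data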
After looking at the CRD and having worked with the opensearch Helm chart, I have some suggestions on how to rework the CRD to make it more congruent with Kubernetes and OpenSearch terminology.
Note: This builds on the changes I proposed in #24.
To start a discussion, this is my suggestion for a reworked example definition (all changes commented):
apiVersion: opensearch.opster.io/v1
kind: OpenSearchCluster
metadata:
  name: opster-opensearch
  namespace: operator-os
spec:
  general:
    clusterName: os-from-operator # Could be optional, if not set the operator can just use metadata.name
    httpPort: 9200
    transportPort: 9300 # Introduced to make both ports configurable
    #vendor: opensearch # Do we need that? Managing both opensearch and elasticsearch would likely become very hard due to the different implementations. I'd say remove the field until we really have a use for it
    version: latest
    serviceName: es-svc # Could be optional, if not set the operator can just use clusterName
  dashboards:
    enabled: true
  nodePools:
    - name: masters # Renamed from component, use it as just a name, the component will be defined by the roles
      replicas: 3
      storage: # renamed and introduced substructure
        size: "30Gi" # Switch to using kubernetes resource units
        storageClassName: "default" # Optional to e.g. use local disks or fast SSDs
      nodeSelector: # start with lowercase letter to be consistent
      resources: # Switch to resource definition structure and units as is used for pods/containers in kubernetes
        requests:
          cpu: "4"
          memory: "16Gi"
        limits:
          cpu: "4"
          memory: "16Gi"
      roles: # Removed ingest parameter and made it more generic
        - master
    - name: data
      replicas: 3
      storage:
        size: "100Gi"
      nodeSelector:
      resources:
        requests:
          cpu: "4"
          memory: "16Gi"
        limits:
          cpu: "4"
          memory: "16Gi"
      roles:
        - ingest
        - data
    - name: coordinators
      replicas: 3
      storage:
        size: "100Gi"
      nodeSelector:
      resources:
        requests:
          cpu: "4"
          memory: "16Gi"
        limits:
          cpu: "4"
          memory: "16Gi"
      roles: [] # No roles means the node is a coordinator node
Looking forward to your ideas.
When creating a single-master cluster, I get an error that the master is not discovered or elected yet.
Cluster.yaml
apiVersion: opensearch.opster.io/v1
kind: OpenSearchCluster
metadata:
  name: os-logs
  namespace: os
spec:
  security:
    tls:
      http:
        generate: true
      transport:
        generate: true
        perNode: true
  general:
    httpPort: 9200
    vendor: opensearch
    version: 1.2.3
    serviceName: os-svc
    setVMMaxMapCount: true
  confMgmt:
    autoScaler: false
    monitoring: false
  dashboards:
    enable: true
    version: 1.2.0
    replicas: 1
  nodePools:
    - component: master
      replicas: 1
      diskSize: "100Gi"
      resources:
        requests:
          cpu: 500m
          memory: 1Gi
        limits:
          memory: 1Gi
      roles:
        - master
    - component: nodes
      replicas: 1
      diskSize: 1000Gi
      resources:
        requests:
          cpu: 500m
          memory: 2Gi
        limits:
          memory: 2Gi
      jvm: "-Xmx1G -Xms1G"
      roles:
        - data
    - component: client
      replicas: 1
      diskSize: 100Gi
      resources:
        requests:
          cpu: 500m
          memory: 2Gi
        limits:
          memory: 2Gi
      jvm: "-Xmx1G -Xms1G"
      roles:
        - data
Log:
[2022-04-04T12:05:00,713][WARN ][o.o.c.NodeConnectionsService] [os-logs-master-0] failed to connect to {os-logs-bootstrap-0}{zs52XaaoT0mHtvYMg3N_Aw}{0NSI6IQXSEqq-0WJv0bICQ}{os-logs-bootstrap-0}{192.168.4.104:9300}{m}{shard_indexing_pressure_enabled=true} (tried [25] times)
org.opensearch.transport.ConnectTransportException: [os-logs-bootstrap-0][192.168.4.104:9300] connect_exception
at org.opensearch.transport.TcpTransport$ChannelsConnectedListener.onFailure(TcpTransport.java:1064) ~[opensearch-1.2.3.jar:1.2.3]
at org.opensearch.action.ActionListener.lambda$toBiConsumer$2(ActionListener.java:213) ~[opensearch-1.2.3.jar:1.2.3]
at org.opensearch.common.concurrent.CompletableContext.lambda$addListener$0(CompletableContext.java:55) ~[opensearch-core-1.2.3.jar:1.2.3]
at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859) ~[?:?]
at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837) ~[?:?]
at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) ~[?:?]
at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2152) ~[?:?]
at org.opensearch.common.concurrent.CompletableContext.completeExceptionally(CompletableContext.java:70) ~[opensearch-core-1.2.3.jar:1.2.3]
at org.opensearch.transport.netty4.Netty4TcpChannel.lambda$addListener$0(Netty4TcpChannel.java:81) ~[?:?]
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:578) ~[?:?]
at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:571) ~[?:?]
at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:550) ~[?:?]
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:491) ~[?:?]
at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:616) ~[?:?]
at io.netty.util.concurrent.DefaultPromise.setFailure0(DefaultPromise.java:609) ~[?:?]
at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:117) ~[?:?]
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.fulfillConnectPromise(AbstractNioChannel.java:321) ~[?:?]
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:337) ~[?:?]
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:707) ~[?:?]
at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:620) ~[?:?]
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:583) ~[?:?]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493) ~[?:?]
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986) ~[?:?]
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[?:?]
at java.lang.Thread.run(Thread.java:832) [?:?]
Caused by: io.netty.channel.AbstractChannel$AnnotatedNoRouteToHostException: No route to host: os-logs-bootstrap-0/192.168.4.104:9300
Caused by: java.net.NoRouteToHostException: No route to host
at sun.nio.ch.Net.pollConnect(Native Method) ~[?:?]
at sun.nio.ch.Net.pollConnectNow(Net.java:660) ~[?:?]
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:875) ~[?:?]
at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:330) ~[?:?]
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334) ~[?:?]
... 7 more
master not discovered or elected yet, an election requires a node with id [zs52XaaoT0mHtvYMg3N_Aw], have discovered [{os-logs-master-0}{T08PHfy_TgOcFmrxp8t-Xg}{G4DypmTBQn63VlsqqgI7fw}{os-logs-master-0}{192.168.20.231:9300}{m}{shard_indexing_pressure_enabled=true}] which is not a quorum; discovery will continue using [192.168.39.238:9300, 192.168.50.75:9300] from hosts providers and [{os-logs-bootstrap-0}{zs52XaaoT0mHtvYMg3N_Aw}{0NSI6IQXSEqq-0WJv0bICQ}{os-logs-bootstrap-0}{192.168.4.104:9300}{m}{shard_indexing_pressure_enabled=true}, {os-logs-master-0}{T08PHfy_TgOcFmrxp8t-Xg}{G4DypmTBQn63VlsqqgI7fw}{os-logs-master-0}{192.168.20.231:9300}{m}{shard_indexing_pressure_enabled=true}] from last-known cluster state; node term 1, last-accepted version 29 in term 1
Cause:
It's because the operator spins up a cluster with effectively two master nodes, then removes the initial master, which causes quorum problems.
Possible solution to fix:
Before removing it, the operator has to make sure that the bootstrap node is not part of the voting configuration.
Increase/decrease disk size (PVC only)
There are a few TBD sections that we need to elaborate on.
We need to describe those sections, and maybe add a short intro, for simple onboarding of new users.
Originally posted by NoorKumar May 10, 2022
Hi Team,
I was trying to create a cluster using the operator, and I do not see a way to add any additional labels or extra environment variables to any nodes (master nodes or data nodes). When I submit the cluster config file to the operator to create the cluster, I would like to add some labels and env variables.
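A sketch of what such a spec could look like; the labels and env fields on a node pool do not exist today and are shown purely as hypothetical names for the requested feature:
nodePools:
  - component: masters
    replicas: 3
    diskSize: 30
    labels:                    # hypothetical field: extra labels for the pods
      team: logging
    env:                       # hypothetical field: extra environment variables
      - name: EXTRA_SETTING
        value: "some-value"
    roles:
      - "master"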
Currently the kind for the operator CRD is Os. That abbreviation is also used in other places in the CRD.
Using the Os abbreviation is not optimal: OS is already widely known and used for Operating System, so using it here differently can confuse people. As such I suggest renaming the kind of the CRD and also any mention of Os in the CRD fields.
Note: os can still be configured as a shortName for the CRD, so a user can still do a quick kubectl get os.
This is my suggestion of how the example definition could look with the new names:
apiVersion: opensearch.opster.io/v1 # Renamed
kind: OpenSearchCluster # Renamed
metadata:
  name: opster-opensearch
  namespace: operator-opensearch
spec:
  general:
    clusterName: my-opensearch
    httpPort: 9200 # renamed from osPort
    transportPort: 9300 # Introduced to make both ports configurable
    vendor: opensearch
    version: latest
    serviceName: es-svc
  dashboards: # Renamed from osConfMgmt and moved to make kibana/dashboards a top-level member, later we can add other dashboards-related config options here
    enabled: true
  nodePools: # renamed from osNodes
    - component: masters
      replicas: 3
      diskSize: 30
      NodeSelector:
      cpu: 4
      memory: 16
      ingest: "false"
    - component: nodes
      replicas: 3
      diskSize: 100
      NodeSelector:
      cpu: 4
      memory: 16
      ingest: "true"
    - component: coordinators
      replicas: 3
      diskSize: 100
      NodeSelector:
      cpu: 4
      memory: 16
      ingest: "false"
The tests should run on push, as part of GitHub Actions.
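A minimal sketch of such a workflow, assuming the repo's existing Makefile has a test target (the target name and Go version here are assumptions):
# .github/workflows/test.yaml (sketch)
name: tests
on:
  push:
  pull_request:
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-go@v3
        with:
          go-version: "1.17"
      - run: make test   # assumed Makefile target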
Firstly, thanks so much for this initiative! I attempted to install the operator on a c6g.4xlarge Graviton (ARM)-based instance and received the below error:
standard_init_linux.go:228: exec user process caused: exec format error
Has this come up before?
Add tolerations and taints support for each node group.
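A sketch of what per-node-pool tolerations could look like, reusing the standard pod-spec toleration structure (the tolerations field on a node pool is the requested addition, not an existing one):
nodePools:
  - component: masters
    replicas: 3
    tolerations:               # requested field, mirroring the pod-spec toleration structure
      - key: "dedicated"
        operator: "Equal"
        value: "opensearch"
        effect: "NoSchedule"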
Need to build the operator for ARM, as part of the build process.
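One possible approach, sketched as GitHub Actions steps using Docker Buildx for a multi-arch image (the image name and surrounding workflow wiring are assumptions):
# sketch of multi-arch build steps
- uses: docker/setup-qemu-action@v2
- uses: docker/setup-buildx-action@v2
- uses: docker/build-push-action@v3
  with:
    context: .
    platforms: linux/amd64,linux/arm64
    push: true
    tags: example.org/opensearch-operator:latest   # placeholder image name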
When adding the TLS settings to the operator as follows:
tls:
  transport:
    generate: true
  http:
    generate: true
there is a dependency on:
security:
  config:
    securityConfigSecret:
      ## Pre-create this secret with the required roles and security configs
      name: <secret_name>
If only TLS is added, the following error occurs:
ACTIONGROUPS, CONFIG, ROLES, ROLESMAPPING, TENANTS, NODESDN, WHITELIST, AUDIT] (index=.opendistro_security)
[2022-03-30T17:47:02,622][ERROR][o.o.s.c.ConfigurationLoaderSecurity7] [my-first-cluster-masters-2] Failure no such index [.opendistro_security] retrieving configuration for [INTERNALUSERS, ACTIONGROUPS, CONFIG, ROLES, ROLESMAPPING, TENANTS, NODESDN, WHITELIST, AUDIT] (index=.opendistro_security)
[2022-03-30T17:47:02,622][ERROR][o.o.s.c.ConfigurationLoaderSecurity7] [my-first-cluster-masters-2] Failure no such index [.opendistro_security] retrieving configuration for [INTERNALUSERS, ACTIONGROUPS, CONFIG, ROLES, ROLESMAPPING, TENANTS, NODESDN, WHITELIST, AUDIT] (index=.opendistro_security)
[2022-03-30T17:47:02,622][ERROR][o.o.s.c.ConfigurationLoaderSecurity7] [my-first-cluster-masters-2] Failure no such index [.opendistro_security] retrieving configuration for [INTERNALUSERS, ACTIONGROUPS, CONFIG, ROLES, ROLESMAPPING, TENANTS, NODESDN, WHITELIST, AUDIT] (index=.opendistro_security)
[2022-03-30T17:47:02,622][ERROR][o.o.s.c.ConfigurationLoaderSecurity7] [my-first-cluster-masters-2] Failure no such index [.opendistro_security] retrieving configuration for [INTERNALUSERS, ACTIONGROUPS, CONFIG, ROLES, ROLESMAPPING, TENANTS, NODESDN, WHITELIST, AUDIT] (index=.opendistro_security)
[2022-03-30T17:47:02,622][ERROR][o.o.s.c.ConfigurationLoaderSecurity7] [my-first-cluster-masters-2] Failure no such index [.opendistro_security] retrieving configuration for [INTERNALUSERS, ACTIONGROUPS, CONFIG, ROLES, ROLESMAPPING, TENANTS, NODESDN, WHITELIST, AUDIT] (index=.opendistro_security)
[2022-03-30T17:47:02,622][ERROR][o.o.s.c.ConfigurationLoaderSecurity7] [my-first-cluster-masters-2] Failure no such index [.opendistro_security] retrieving configuration for [INTERNALUSERS, ACTIONGROUPS, CONFIG, ROLES, ROLESMAPPING, TENANTS, NODESDN, WHITELIST, AUDIT] (index=.opendistro_security)
[2022-03-30T17:47:02,622][ERROR][o.o.s.c.ConfigurationLoaderSecurity7] [my-first-cluster-masters-2] Failure no such index [.opendistro_security] retrieving configuration for [INTERNALUSERS, ACTIONGROUPS, CONFIG, ROLES, ROLESMAPPING, TENANTS, NODESDN, WHITELIST, AUDIT] (index=.opendistro_security)
[2022-03-30T17:47:02,622][ERROR][o.o.s.c.ConfigurationLoaderSecurity7] [my-first-cluster-masters-2] Failure no such index [.opendistro_security] retrieving configuration for [INTERNALUSERS, ACTIONGROUPS, CONFIG, ROLES, ROLESMAPPING, TENANTS, NODESDN, WHITELIST, AUDIT] (index=.opendistro_security)
[2022-03-30T17:47:03,001][ERROR][o.o.s.a.BackendRegistry ] [my-first-cluster-masters-2] Not yet initialized (you may need to run securityadmin)
[2022-03-30T17:47:03,004][ERROR][o.o.s.a.BackendRegistry ] [my-first-cluster-masters-2] Not yet initialized (you may need to run securityadmin)
[2022-03-30T17:47:05,500][ERROR][o.o.s.a.BackendRegistry ] [my-first-cluster-masters-2] Not yet initialized (you may need to run securityadmin)
[2022-03-30T17:47:05,503][ERROR][o.o.s.a.BackendRegistry ] [my-first-cluster-masters-2] Not yet initialized (you may need to run securityadmin)
[2022-03-30T17:47:08,001][ERROR][o.o.s.a.BackendRegistry ] [my-first-cluster-masters-2] Not yet initialized (you may need to run securityadmin)
[2022-03-30T17:47:08,004][ERROR][o.o.s.a.BackendRegistry ] [my-first-cluster-masters-2] Not yet initialized (you may need to run securityadmin)
Background:
In OpenSearch, once TLS is added for node transport and the HTTP REST API, the embedded security plugin creates the .opendistro_security index to enable security settings. For this, securityadmin.sh has to run to load the new settings; otherwise the demo install_demo_configuration.sh script runs by default when no TLS setting is added (if you do not configure anything, OpenSearch will use the included demo TLS certificates, which are not suited for real deployments).
curl -k https://localhost:9200/_cat/indices -u admin:admin
green open security-auditlog-2022.03.29 SHZ_xtRBTGub4NFhbtugSw 1 1 7 0 116.4kb 96.8kb
green open .kibana_1 UOntE6z9Soa73BSdk3JI5Q 1 1 0 0 416b 208b
green open .opendistro_security RYmlNkB5RgWAKMZU3_S05Q 1 2 9 0 178.1kb 59.3kb
With the current setup from the PR https://github.com/Opster/opensearch-k8s-operator/pull/61/files#diff-190387233823a104ed9004f0cba248cf0aa504090c923cad3be1a901bd01e99f,
securityadmin.sh will be called by a Kubernetes batch job.
securityadmin.sh needs to run when we add TLS or custom secrets, and it should also run when we add new config files.
Just adding the TLS setting does not run the batch job; the following is seen in the logs. Once TLS is added via the operator, opensearch.yml is already modified with security settings, so the demo installer quits:
OpenSearch Security Demo Installer
** Warning: Do not use on production or public reachable systems **
Basedir: /usr/share/opensearch
OpenSearch install type: rpm/deb on NAME="Amazon Linux"
OpenSearch config dir: /usr/share/opensearch/config
OpenSearch config file: /usr/share/opensearch/config/opensearch.yml
OpenSearch bin dir: /usr/share/opensearch/bin
OpenSearch plugins dir: /usr/share/opensearch/plugins
OpenSearch lib dir: /usr/share/opensearch/lib
Detected OpenSearch Version: x-content-1.2.3
Detected OpenSearch Security Version: 1.2.3.0
/usr/share/opensearch/config/opensearch.yml seems to be already configured for Security. Quit.
sed: cannot rename /usr/share/opensearch/config/seddRF6sR: Device or resource busy
Enabling OpenSearch Security Plugin
To move forward, we need to add a securityConfigSecret for the security plugin to pick up TLS and the passed-in roles, for example as in https://github.com/opensearch-project/security/tree/main/securityconfig.
A README doc on configuring this setup would be helpful.
Once added:
security:
  config:
    securityConfigSecret:
      ## Pre-create this secret with the required roles and security configs
      name: securityconfig-secret
  tls:
    transport:
      generate: true
    http:
      generate: true
Library for communicating with the OpenSearch cluster:
This functionality should support different cluster versions.
Service responsibility:
Add a CLI for interacting with the operator. The CLI should expose the operator's APIs and capabilities as a high-level DSL.
Please consider adding a changelog which follows the Keep a Changelog standard. OpenSearch is moving in the same direction, see opensearch-project/OpenSearch#1868 (discussion) & opensearch-project/security#1821 (POC showing how it would look).
This would be very helpful for users, and you can then re-use it 1:1 for the release notes.
No namespace name should be set in the values.yaml. The operator should be installed into whatever namespace Helm is used in (by default the selected namespace of kubectl, or the one specified via helm install -n). Helm offers {{ .Release.Namespace }} for that.
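In practice that means the chart templates reference the release namespace instead of hardcoding one, roughly like this sketch (resource name is illustrative):
# values.yaml: no namespace field at all
# templates/deployment.yaml (sketch)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}-controller-manager
  namespace: {{ .Release.Namespace }}   # follows the namespace given to helm install -n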
Make opensearch-k8s-operator able to adjust the OpenSearch Dashboards Deployment replicas to the desired user input. Dashboards can run in HA, so the user should be able to pass the number of replicas for Dashboards; currently the default is hardcoded to 1.
Example configuration:
dashboards:
  enable: true
  replicas: 2
It should be possible to override the BusyBox image, either at the operator level as an argument when starting, or as part of the CRD.
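Either option could be sketched roughly as follows; both the CRD field and the flag name are hypothetical and only illustrate the request:
# as a hypothetical CRD field
spec:
  general:
    initHelperImage: registry.example.com/busybox:1.35   # hypothetical field name
# or as a hypothetical operator start-up argument:
# --init-helper-image=registry.example.com/busybox:1.35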
The OS cluster resource is currently namespaced, although it creates a separate namespace to deploy the cluster.
I think for more convenience it should either be namespaced and deploy the cluster in the same namespace where the CR has been created, or it should be a cluster-scoped resource and deploy the cluster in a namespace specified in the spec.
In my opinion, namespace management via the operator makes sense when the CRD requires multiple namespaces; otherwise we will have some pain managing the full lifecycle in terms of PVC management.
There are a number of operators for orchestrating databases that I used to work with, and all of them create clusters in the same namespace where the CR is created (Cassandra, VictoriaMetrics, etcd, etc.).
OperatorHub.io is a new home for the Kubernetes community to share Operators.
It would be great to see OpenSearch Operator make it to the OperatorHub.io so the OpenSearch Kubernetes user-base could discover it easily.
Is it possible to enable multitenancy in OpenSearch Dashboards using the operator?
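For reference, multitenancy is a Dashboards-side setting (opensearch_security.multitenancy.enabled in opensearch_dashboards.yml). A hypothetical way to pass it through the operator could look like the sketch below, where additionalConfig is an assumed pass-through field, not a confirmed one:
dashboards:
  enable: true
  additionalConfig:                                   # assumed pass-through field
    opensearch_security.multitenancy.enabled: "true"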
Hey, I'm following this guide to install the operator, but I don't see the operator creating pods.
make build manifests
(connecting to the cluster)
make install
But I don't see any pods brought up by the controller:
kubectl get pods
No resources found in default namespace.
Used the following cluster.yaml file:
apiVersion: opensearch.opster.io/v1
kind: OpenSearchCluster
metadata:
  name: my-first-cluster
  namespace: default
spec:
  general:
    serviceName: my-first-cluster
  dashboards:
    enable: true
  nodePools:
    - component: masters
      replicas: 3
      diskSize: 30
      NodeSelector:
      cpu: 1
      memory: 1
      roles:
        - "master"
        - "data"
The following is the output describing the cluster
Name: my-cluster
Namespace: default
Labels: <none>
Annotations: <none>
API Version: opensearch.opster.io/v1
Kind: OpenSearchCluster
Metadata:
Creation Timestamp: 2022-03-17T20:50:17Z
Generation: 1
Managed Fields:
API Version: opensearch.opster.io/v1
Fields Type: FieldsV1
fieldsV1:
f:metadata:
f:annotations:
.:
f:kubectl.kubernetes.io/last-applied-configuration:
f:spec:
.:
f:confMgmt:
.:
f:smartScaler:
f:dashboards:
.:
f:enable:
f:general:
.:
f:httpPort:
f:serviceName:
f:vendor:
f:version:
f:nodePools:
f:security:
.:
f:tls:
.:
f:http:
.:
f:generate:
f:transport:
.:
f:generate:
Manager: kubectl-client-side-apply
Operation: Update
Time: 2022-03-17T20:50:17Z
Resource Version: 264753
UID: 8f6f66ce-04f8-413d-9c50-6b1cb1985c50
Spec:
Conf Mgmt:
Smart Scaler: true
Dashboards:
Enable: true
General:
Http Port: 9200
Service Name: my-cluster
Vendor: opensearch
Version: latest
Node Pools:
Component: masters
Cpu: 1
Disk Size: 30
Memory: 1
Replicas: 3
Roles:
master
data
Component: nodes
Cpu: 1
Disk Size: 100
Memory: 1
Replicas: 3
Roles:
data
Component: coordinators
Cpu: 1
Disk Size: 100
Replicas: 3
Roles:
ingest
Security:
Tls:
Http:
Generate: true
Transport:
Generate: true
Events: <none>
Create a config for the operator to use for creating workers.
The operator will use this config to load specific workers to handle the desired operator operations, such as the scaler, configuration services, etc.
Support gp2 and local storage types.
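A sketch of how a storage class such as gp2 could be selected per node pool, following the storage substructure proposed in the CRD rework above (exact field names are still under discussion):
nodePools:
  - name: data
    replicas: 3
    storage:
      size: "100Gi"
      storageClassName: "gp2"   # e.g. EBS gp2; a local-storage class could be referenced the same way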
I have tried to create an OpenSearch cluster in a namespace that requires an imagePullSecret.
Cluster creation fails with image pull errors on the Istio sidecar.
I attempted to apply configuration that includes the appropriate stanza in the general portion of the spec:
spec:
  general:
    imagePullSecrets:
      - name: XXX
This still does not succeed. When first creating the opensearchcluster resource, the imagePullSecrets is omitted. If I apply the identical resource a second time, the imagePullSecrets configuration appears in the output of kubectl describe. Unfortunately, even after applying the configuration twice so that imagePullSecrets is set, the containers managed by the operator do not have the imagePullSecrets in their specification. I would expect that imagePullSecrets doesn't require applying the configuration twice.
Example:
$ cat ha-poc-third-try.yml
---
apiVersion: opensearch.opster.io/v1
kind: OpenSearchCluster
metadata:
  name: ha-poc-third-try
spec:
  general:
    serviceName: os-ha-poc-third-try
    version: 1.3.1
    imagePullSecrets:
      - name: XXX
  dashboards:
    enable: true
    version: 1.3.1
    replicas: 1
    resources:
      requests:
        memory: "512Mi"
        cpu: "200m"
      limits:
        memory: "512Mi"
        cpu: "200m"
  nodePools:
    - component: masters
      replicas: 3
      diskSize: "5Gi"
      NodeSelector:
      resources:
        requests:
          memory: "2Gi"
          cpu: "500m"
        limits:
          memory: "2Gi"
          cpu: "500m"
      roles:
        - "data"
        - "master"
[1080] GalensReltioMBP:~/Documents/gdrive/opensearch-k8s-operator% kubectl describe os -n ha-poc
No resources found in ha-poc namespace.
[1081] GalensReltioMBP:~/Documents/gdrive/opensearch-k8s-operator% kubectl apply -n ha-poc -f ha-poc-third-try.yml
opensearchcluster.opensearch.opster.io/ha-poc-third-try created
[1082] GalensReltioMBP:~/Documents/gdrive/opensearch-k8s-operator% kubectl describe os -n ha-poc
Name: ha-poc-third-try
Namespace: ha-poc
Labels: <none>
Annotations: <none>
API Version: opensearch.opster.io/v1
Kind: OpenSearchCluster
Metadata:
Creation Timestamp: 2022-04-27T22:33:20Z
Finalizers:
Opster
Generation: 2
Managed Fields:
API Version: opensearch.opster.io/v1
Fields Type: FieldsV1
fieldsV1:
f:metadata:
f:annotations:
.:
f:kubectl.kubernetes.io/last-applied-configuration:
f:spec:
.:
f:dashboards:
.:
f:enable:
f:replicas:
f:resources:
.:
f:limits:
.:
f:cpu:
f:memory:
f:requests:
.:
f:cpu:
f:memory:
f:version:
f:general:
.:
f:httpPort:
f:serviceName:
f:version:
f:nodePools:
Manager: kubectl-client-side-apply
Operation: Update
Time: 2022-04-27T22:33:20Z
API Version: opensearch.opster.io/v1
Fields Type: FieldsV1
fieldsV1:
f:metadata:
f:finalizers:
f:spec:
f:confMgmt:
f:dashboards:
f:opensearchCredentialsSecret:
f:status:
.:
f:componentsStatus:
f:phase:
f:version:
Manager: manager
Operation: Update
Time: 2022-04-27T22:33:21Z
Resource Version: 13740903
Self Link: /apis/opensearch.opster.io/v1/namespaces/ha-poc/opensearchclusters/ha-poc-third-try
UID: fc1ff7e1-19bc-4101-92b9-a6401794aeaf
Spec:
Conf Mgmt:
Dashboards:
Enable: true
Opensearch Credentials Secret:
Replicas: 1
Resources:
Limits:
Cpu: 200m
Memory: 512Mi
Requests:
Cpu: 200m
Memory: 512Mi
Version: 1.3.1
General:
Http Port: 9200
Service Name: os-ha-poc-third-try
Version: 1.3.1
Node Pools:
Component: masters
Disk Size: 5Gi
Replicas: 3
Resources:
Limits:
Cpu: 500m
Memory: 2Gi
Requests:
Cpu: 500m
Memory: 2Gi
Roles:
data
master
Status:
Components Status:
Phase: RUNNING
Version: 1.3.1
Events: <none>
[1083] GalensReltioMBP:~/Documents/gdrive/opensearch-k8s-operator% kubectl apply -n ha-poc -f ha-poc-third-try.yml
opensearchcluster.opensearch.opster.io/ha-poc-third-try configured
[1084] GalensReltioMBP:~/Documents/gdrive/opensearch-k8s-operator% kubectl describe os -n ha-poc
Name: ha-poc-third-try
Namespace: ha-poc
Labels: <none>
Annotations: <none>
API Version: opensearch.opster.io/v1
Kind: OpenSearchCluster
Metadata:
Creation Timestamp: 2022-04-27T22:33:20Z
Finalizers:
Opster
Generation: 3
Managed Fields:
API Version: opensearch.opster.io/v1
Fields Type: FieldsV1
fieldsV1:
f:metadata:
f:finalizers:
f:spec:
f:confMgmt:
f:dashboards:
f:opensearchCredentialsSecret:
f:status:
.:
f:componentsStatus:
f:phase:
f:version:
Manager: manager
Operation: Update
Time: 2022-04-27T22:33:21Z
API Version: opensearch.opster.io/v1
Fields Type: FieldsV1
fieldsV1:
f:metadata:
f:annotations:
.:
f:kubectl.kubernetes.io/last-applied-configuration:
f:spec:
.:
f:dashboards:
.:
f:enable:
f:replicas:
f:resources:
.:
f:limits:
.:
f:cpu:
f:memory:
f:requests:
.:
f:cpu:
f:memory:
f:version:
f:general:
.:
f:httpPort:
f:serviceName:
f:version:
f:nodePools:
Manager: kubectl-client-side-apply
Operation: Update
Time: 2022-04-27T22:34:10Z
Resource Version: 13741241
Self Link: /apis/opensearch.opster.io/v1/namespaces/ha-poc/opensearchclusters/ha-poc-third-try
UID: fc1ff7e1-19bc-4101-92b9-a6401794aeaf
Spec:
Conf Mgmt:
Dashboards:
Enable: true
Opensearch Credentials Secret:
Replicas: 1
Resources:
Limits:
Cpu: 200m
Memory: 512Mi
Requests:
Cpu: 200m
Memory: 512Mi
Version: 1.3.1
General:
Http Port: 9200
Image Pull Secrets:
Name: XXX
Service Name: os-ha-poc-third-try
Version: 1.3.1
Node Pools:
Component: masters
Disk Size: 5Gi
Replicas: 3
Resources:
Limits:
Cpu: 500m
Memory: 2Gi
Requests:
Cpu: 500m
Memory: 2Gi
Roles:
data
master
Status:
Components Status:
Phase: RUNNING
Version: 1.3.1
Events: <none>
[1085] GalensReltioMBP:~/Documents/gdrive/opensearch-k8s-operator% kubectl describe pod -n ha-poc ha-poc-third-try-bootstrap-0
Name: ha-poc-third-try-bootstrap-0
Namespace: ha-poc
Priority: 0
Node: ip-10-10-144-255.ec2.internal/10.10.144.255
Start Time: Wed, 27 Apr 2022 15:33:21 -0700
Labels: opster.io/opensearch-cluster=ha-poc-third-try
security.istio.io/tlsMode=istio
Annotations: banzaicloud.com/last-applied:
UEsDBBQACAAIAAAAAAAAAAAAAAAAAAAAAAAIAAAAb3JpZ2luYWzMVF+P4jYQ/yponpOQcHs9kTe0e1VV6QqC6+mkE0KOM2xcHNsa2+wilO9e2aEhbHe3L304IZGxPX9+85s/Z2jRsZ...
kubernetes.io/psp: eks.privileged
sidecar.istio.io/status:
{"version":"023ae377d1a8981380141286422d04a98c39883f0647804d1ec0a7b5683da18d","initContainers":["istio-init"],"containers":["istio-proxy"]...
Status: Pending
IP: X.X.X.X
IPs:
IP: X.X.X.X
Controlled By: OpenSearchCluster/ha-poc-third-try
Init Containers:
init:
Container ID: docker://26438a038ba34b5a81bada0d60751206e61c5942f06b3f6f5699c461fb3eaa7b
Image: busybox
Image ID: docker-pullable://busybox@sha256:d2b53584f580310186df7a2055ce3ff83cc0df6caacf1e3489bff8cf5d0af5d8
Port: <none>
Host Port: <none>
Command:
sh
-c
Args:
chown -R 1000:1000 /usr/share/opensearch/data
State: Terminated
Reason: Completed
Exit Code: 0
Started: Wed, 27 Apr 2022 15:33:22 -0700
Finished: Wed, 27 Apr 2022 15:33:22 -0700
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/usr/share/opensearch/data from data (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-smv66 (ro)
istio-init:
Container ID:
Image: gcr.io/XXXXX/istio/proxyv2:1.4.4
Image ID:
Port: <none>
Host Port: <none>
Command:
istio-iptables
-p
15001
-z
15006
-u
1337
-m
REDIRECT
-i
10.10.0.0/16
-x
-b
*
-d
9160,9042,15020
State: Waiting
Reason: ImagePullBackOff
Ready: False
Restart Count: 0
Limits:
cpu: 100m
memory: 50Mi
Requests:
cpu: 10m
memory: 10Mi
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-smv66 (ro)
Containers:
opensearch:
Container ID:
Image: docker.io/opensearchproject/opensearch:1.3.1
Image ID:
Ports: 9200/TCP, 9300/TCP
Host Ports: 0/TCP, 0/TCP
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Liveness: tcp-socket :9200 delay=10s timeout=5s period=20s #success=1 #failure=10
Startup: tcp-socket :9200 delay=10s timeout=5s period=20s #success=1 #failure=10
Environment:
cluster.initial_master_nodes: ha-poc-third-try-bootstrap-0
discovery.seed_hosts: ha-poc-third-try-discovery
cluster.name: ha-poc-third-try
network.bind_host: 0.0.0.0
network.publish_host: ha-poc-third-try-bootstrap-0 (v1:metadata.name)
OPENSEARCH_JAVA_OPTS: -Xmx512M -Xms512M
node.roles: master
Mounts:
/usr/share/opensearch/data from data (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-smv66 (ro)
istio-proxy:
Container ID:
Image: gcr.io/customer-facing/istio/proxyv2:1.4.4
Image ID:
Port: 15090/TCP
Host Port: 0/TCP
Args:
proxy
sidecar
--domain
$(POD_NAMESPACE).svc.cluster.local
--configPath
/etc/istio/proxy
--binaryPath
/usr/local/bin/envoy
--serviceCluster
ha-poc-third-try-bootstrap-0.ha-poc
--drainDuration
45s
--parentShutdownDuration
1m0s
--discoveryAddress
istio-pilot.helm-istio-system:15010
--zipkinAddress
zipkin.helm-istio-system:9411
--proxyLogLevel=warning
--dnsRefreshRate
5s
--connectTimeout
10s
--proxyAdminPort
15000
--concurrency
2
--controlPlaneAuthPolicy
NONE
--statusPort
15020
--applicationPorts
9200,9300
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Limits:
cpu: 2
memory: 1Gi
Requests:
cpu: 100m
memory: 128Mi
Readiness: http-get http://:15020/healthz/ready delay=1s timeout=1s period=2s #success=1 #failure=30
Environment:
POD_NAME: ha-poc-third-try-bootstrap-0 (v1:metadata.name)
ISTIO_META_POD_PORTS: [
{"name":"http","containerPort":9200,"protocol":"TCP"}
,{"name":"transport","containerPort":9300,"protocol":"TCP"}
]
ISTIO_META_CLUSTER_ID: Kubernetes
POD_NAMESPACE: ha-poc (v1:metadata.namespace)
INSTANCE_IP: (v1:status.podIP)
SERVICE_ACCOUNT: (v1:spec.serviceAccountName)
ISTIO_AUTO_MTLS_ENABLED: true
ISTIO_META_POD_NAME: ha-poc-third-try-bootstrap-0 (v1:metadata.name)
ISTIO_META_CONFIG_NAMESPACE: ha-poc (v1:metadata.namespace)
SDS_ENABLED: false
ISTIO_META_INTERCEPTION_MODE: REDIRECT
ISTIO_META_INCLUDE_INBOUND_PORTS: 9200,9300
ISTIO_METAJSON_ANNOTATIONS: {"banzaicloud.com/last-applied":"UEsDBBQACAAIAAAAAAAAAAAAAAAAAAAAAAAIAAAAb3JpZ2luYWzMVF+P4jYQ/yponpOQcHs9kTe0e1VV6QqC6+mkE0KOM2xcHNsa2+w
ilO9e2aEhbHe3L304IZGxPX9+85s/Z2jRsZo5BuUZJKtQ2iBpYx1SJvRUG1QWGfEm5dKHWyihYanRPHWNoDp1dIIuAcVafOUprbR21hEzaQ69ljWMX1UhAf2kkNa4R0LF0UL54wzMiG9IVmgFJVxBZFdkxwISqKTmh2Wwf0CJLqo78pgA
18qRljIA7m8OQtVQwtKg2kRn95eE3gQPCXgRbPa82O8/YZEW84qnd0VepPNZNU/ZL3d58Wl+x5Dtodt2CViDPDAYwjOhkPp0UB3j9xLoQmUmlHCCyV3LwnGndI0WEjgy6f+TzC4Z3NXCcn1EOmUWsd412rp33Qz6Yyf/YIrHt43HJgrdk
6ZDVgnVRx3Z5Vn8vaZufCWFbW4sfiXdBt72AmW9xn2Qb3oglju+rphroBw6twfcdaNAy9XnPzafF+v733a/L74tdsvV180IWvq9ff5YzL5M0u+tDcINSF1jRlreFKKvD3TbBETLHiPnmh9ejogh/RdyN7opi+xDFpBLcUSF1q5IVxgTZU
J6wq8NoW20rKEs8gQuDfGAkp02yLWqbf9gkISuh6tZnoD1nKO1Yw8JOG42AZkLMYwmB+V8luddAk60qL0bXHy8Du0VLyTRqG/aoYdXg5/rqDhnIm0vlT6MlBwxZSOIwByh1Z7igJ/DpDhGzpufi5Cjlr7FL9qrCwdtEC8dN/WWprZhhKM
KT+P2HFKOp24bO0UJd3+7Bhg9BgF4o5/UJF1PijzPy/A3ecf5NmyztmVhff0AG4qUchj1YuXtqdLPVxQhNPybceSehDsFUPgcCSGvFvZPG5Zk/j+n3/u6rL/WuNODoB7HC93YCs5HkN3fAQAA//9QSwcIbVtlHqkCAACSBgAAUEsBAhQA
FAAIAAgAAAAAAG1bZR6pAgAAkgYAAAgAAAAAAAAAAAAAAAAAAAAAAG9yaWdpbmFsUEsFBgAAAAABAAEANgAAAN8CAAAAAA==","kubernetes.io/psp":"eks.privileged"}
ISTIO_METAJSON_LABELS: {"opster.io/opensearch-cluster":"ha-poc-third-try"}
ISTIO_META_WORKLOAD_NAME: ha-poc-third-try-bootstrap-0
ISTIO_META_OWNER: kubernetes://apis/v1/namespaces/ha-poc/pods/ha-poc-third-try-bootstrap-0
ISTIO_KUBE_APP_PROBERS: {}
Mounts:
/etc/certs/ from istio-certs (ro)
/etc/istio/proxy from istio-envoy (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-smv66 (ro)
Conditions:
Type Status
Initialized False
Ready False
ContainersReady False
PodScheduled True
Volumes:
data:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
default-token-smv66:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-smv66
Optional: false
istio-envoy:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium: Memory
SizeLimit: <unset>
istio-certs:
Type: Secret (a volume populated by a Secret)
SecretName: istio.default
Optional: true
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 11m default-scheduler Successfully assigned ha-poc/ha-poc-third-try-bootstrap-0 to ip-10-10-144-255.ec2.internal
Normal Pulling 11m kubelet Pulling image "busybox"
Normal Pulled 11m kubelet Successfully pulled image "busybox"
Normal Created 11m kubelet Created container init
Normal Started 11m kubelet Started container init
Warning Failed 10m (x3 over 11m) kubelet Error: ErrImagePull
Normal Pulling 9m33s (x4 over 11m) kubelet Pulling image "gcr.io/XXX/istio/proxyv2:1.4.4"
Warning Failed 9m33s (x4 over 11m) kubelet Failed to pull image "gcr.io/XXX/istio/proxyv2:1.4.4": rpc error: code = Unknown desc = Error response
from daemon: unauthorized: You don't have the needed permissions to perform this operation, and you may have invalid credentials. To authenticate your request, follow the steps
in: https://cloud.google.com/container-registry/docs/advanced-authentication
Warning Failed 5m54s (x20 over 11m) kubelet Error: ImagePullBackOff
Normal BackOff 62s (x42 over 11m) kubelet Back-off pulling image "gcr.io/XXX/istio/proxyv2:1.4.4"
Service worker to manage the cluster resources:
Ability to set DiskSize as a string rather than a fixed int value, which the backend currently parses to 'Gi'; the user should be able to pass 'G' or 'Gi' and so on.
Background, from the docs:
Limits and requests for memory are measured in bytes. We can express memory as a plain integer or as a fixed-point integer using one of these suffixes: E, P, T, G, M, K. We can also use the power-of-two equivalents: Ei, Pi, Ti, Gi, Mi, Ki.
The kibibyte was designed to replace the kilobyte in those computer science contexts in which the term kilobyte is used to mean 1024 bytes. The interpretation of kilobyte to denote 1024 bytes, conflicting with the SI definition of the prefix kilo (1000), used to be common.
So, as we can see, 5G means 5 Gigabytes while 5Gi means 5 Gibibytes. They amount to:
5 G = 5000000 KB / 5000 MB
5 Gi = 5368709.12 KB / 5368.70 MB
Therefore, in terms of size, they are not the same.
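Concretely, the request is that both of these spellings be accepted and passed through to the PVC unchanged (sketch):
nodePools:
  - component: masters
    diskSize: "30Gi"   # power-of-two gibibytes
  - component: nodes
    diskSize: "100G"   # decimal gigabytes; currently everything is forced to Gi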
The chart now outputs a namespace definition with no name, and aside from dumping the chart as a raw template and manipulating it, you can no longer install the operator using the Helm repo at https://opster.github.io/opensearch-k8s-operator-chart/. These changes all seem to have been merged in the last 24 hours. PR #130 even references this issue, yet that PR has not been merged. It seems the changes were made to main and the gh-pages publication process may have 'jumped the gun'.
Currently neither the README nor the release notes (on GitHub) list the versions of OpenSearch and OpenSearch Dashboards which are supported by this version of the operator.
It'd be great if this compatibility information were documented.
Add support for ephemeral disks.
The bootstrap pod does not declare any required resources and is also not configurable via the CRD.
When launched on a cluster with resource defaults defined (to ensure that all pods are configured with limits), the bootstrap pod may fail to complete bootstrapping if the configured defaults are too small to accommodate the hard-coded Java heap opts.
Both the pod resources and the bootstrap Java opts should be exposed via the CRD as configurable parameters, or else a default resource limit should be configured to match the Java opts.
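A sketch of how this could be exposed; the bootstrap section and its fields are hypothetical names for the requested parameters, with jvm mirroring the existing node pool field:
spec:
  bootstrap:                      # hypothetical section for the bootstrap pod
    jvm: "-Xmx512M -Xms512M"      # keep the heap within the limit below
    resources:
      requests:
        cpu: "500m"
        memory: "1Gi"
      limits:
        memory: "1Gi"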