selectdb / doris-operator
Doris kubernetes operator
License: Apache License 2.0
Disable rebalancing before restarting.
Find the master by the column name IsMaster
to avoid relying on the column order of the show frontends; output.
Version: 1.4.1 (current latest)
After setting AdminUser, the administrator password is not set. Is this feature not implemented yet?
The operator reports an error: 2024-04-10T12:04:12Z ERROR Reconciler error {"controller": "doriscluster", "controllerGroup": "doris.selectdb.com", "controllerKind": "DorisCluster", "DorisCluster": {"name":"doriscluster-zelos","namespace":"doris"}, "namespace": "doris", "name": "doriscluster-zelos", "reconcileID": "cb2fa73c-c75e-4f07-9807-388357fb09fb", "error": "Service \"doriscluster-zelos-fe-service\" is invalid: spec.clusterIPs[0]: Invalid value: []string(nil): primary clusterIP can not be unset"}
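One common cause of "primary clusterIP can not be unset" is an update that sends a Service object without the cluster IP fields the API server already allocated. Whether that is what happens inside the operator is an assumption, not something the log confirms. A minimal sketch of the usual remedy, using plain dicts in place of real Service objects (the helper name `carry_over_cluster_ip` and the sample IP are hypothetical):

```python
import copy


def carry_over_cluster_ip(existing: dict, desired: dict) -> dict:
    """Return a copy of `desired` that keeps the cluster IPs already allocated to `existing`."""
    merged = copy.deepcopy(desired)
    existing_spec = existing.get("spec", {})
    merged_spec = merged.setdefault("spec", {})
    for field in ("clusterIP", "clusterIPs"):
        # Only fill the field when the desired object leaves it empty.
        if field in existing_spec and not merged_spec.get(field):
            merged_spec[field] = copy.deepcopy(existing_spec[field])
    return merged


# Hypothetical objects: the live Service and the one rebuilt from the CR.
existing = {"spec": {"type": "ClusterIP", "clusterIP": "10.96.12.34", "clusterIPs": ["10.96.12.34"]}}
desired = {"spec": {"type": "ClusterIP", "ports": [{"name": "query-port", "port": 9030}]}}
merged = carry_over_cluster_ip(existing, desired)
```

The merged object can then be sent as the update, so the immutable IP fields match what the API server already stores.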
Helm version: v3.7.1
Installing with the following command, the specified cluster name and namespace do not take effect:
helm install -n default doris-test ./ -f values.yaml
Output (looks correct):
NAME: doris-test
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
Thank you for installing doris-1.4.1
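However, the cluster is still deployed into the doris namespace, with the cluster name doriscluster-helm.
The templates defined in _helpers.tpl appear to be the problem.
User environments may not be able to reach public image registries, and the initial step depends on the
selectdb/alpha image, so users need to be able to define this image themselves.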
selectdb/doris.broker-ubuntu has no 2.0.3 image on Docker Hub, so the image cannot be pulled.
Find the master by the column name IsMaster
to avoid relying on the column order of the show frontends; output.
Which framework is used to build the readme documentation in the doc directory? I would like to deploy it in an offline environment.
The namespace should be checked first; if it does not exist, it should be created.
BE supports graceful exit. When stopping BE, the grace parameter should be passed to stop_be.sh for a graceful stop.
In a private environment, Kubernetes may be customized, with capabilities enabled through annotations.
Can the FE service support user-defined annotations?
For example, Alibaba Cloud uses annotations for the NLB load balancer: https://help.aliyun.com/zh/ack/ack-managed-and-ack-dedicated/user-guide/configure-nlb-instances-by-using-annotations?spm=a2c4g.11186623.0.0.66822c7dq6iDV5
Tencent Cloud uses them for the CLB load balancer: https://cloud.tencent.com/document/product/457/45491
operator.yaml grants persistentvolumeclaims permissions, but they are missing from the cluster role in the operator Helm chart.
For a robust deployment, Pods of the same component should be spread across different hosts using preferred scheduling.
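As a rough illustration of the preferred scheduling request, the sketch below builds the `affinity` fragment of a pod spec as a plain dict. The field names are the standard Kubernetes pod anti-affinity fields; the helper name and the label used to select a component are assumptions, not the labels the operator actually sets.

```python
def preferred_anti_affinity(component_labels: dict, weight: int = 100) -> dict:
    """Build a soft (preferred) anti-affinity rule that spreads matching pods across hosts."""
    if not 1 <= weight <= 100:
        raise ValueError("weight must be between 1 and 100")
    return {
        "podAntiAffinity": {
            "preferredDuringSchedulingIgnoredDuringExecution": [
                {
                    "weight": weight,
                    "podAffinityTerm": {
                        # Pods carrying these labels repel each other.
                        "labelSelector": {"matchLabels": dict(component_labels)},
                        # One pod per node is preferred, but not required.
                        "topologyKey": "kubernetes.io/hostname",
                    },
                }
            ]
        }
    }


# Hypothetical label for FE pods.
affinity = preferred_anti_affinity({"app.kubernetes.io/component": "fe"})
```

Because the rule is preferred rather than required, pods still schedule when there are fewer nodes than replicas.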
I provisioned the cluster with the following spec and it works.
feSpec:
  replicas: 2
  image: selectdb/doris.fe-ubuntu:2.1.2
  service:
    type: LoadBalancer
  persistentVolumes:
  - mountPath: /opt/apache-doris/fe/doris-meta
    name: fetest
    persistentVolumeClaimSpec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 20Gi
  - mountPath: /opt/apache-doris/fe/log
    name: felog
    persistentVolumeClaimSpec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 10Gi
beSpec:
  replicas: 3
  image: selectdb/doris.be-ubuntu:2.1.2
  service:
    type: LoadBalancer
  persistentVolumes:
  - mountPath: /opt/apache-doris/be/storage
    name: betest
    persistentVolumeClaimSpec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 40Gi
  - mountPath: /opt/apache-doris/be/log
    name: belog
    persistentVolumeClaimSpec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 20Gi
I changed the storage size for all volumes, and the DorisCluster CR spec was updated. But the change is not applied to underlying resources such as the StatefulSet and PVCs. It looks like there is no reconcile logic for that.
There are no error logs in the Doris operator. Here are the logs.
I0607 09:52:22.652461 1 doriscluster_controller.go:91] DorisClusterReconciler reconcile the update crd name test-c1 namespace doris
I0607 09:52:22.652602 1 client.go:35] CreateOrUpdateService service Name, Ports, Selector, ServiceType, Labels have not change namespace doris name test-c1-fe-internal
I0607 09:52:22.652670 1 client.go:35] CreateOrUpdateService service Name, Ports, Selector, ServiceType, Labels have not change namespace doris name test-c1-fe-service
I0607 09:52:22.653179 1 statefulset.go:96] the statefulset name test-c1-fe new hash value 1386567562 old have value 1386567562
I0607 09:52:22.653196 1 client.go:54] ApplyStatefulSet Sync exist statefulset name=test-c1-fe, namespace=doris, equals to new statefulset.
I0607 09:52:22.653247 1 client.go:35] CreateOrUpdateService service Name, Ports, Selector, ServiceType, Labels have not change namespace doris name test-c1-be-internal
I0607 09:52:22.653365 1 client.go:35] CreateOrUpdateService service Name, Ports, Selector, ServiceType, Labels have not change namespace doris name test-c1-be-service
I0607 09:52:22.653812 1 statefulset.go:96] the statefulset name test-c1-be new hash value 3597274972 old have value 3597274972
I0607 09:52:22.653826 1 client.go:54] ApplyStatefulSet Sync exist statefulset name=test-c1-be, namespace=doris, equals to new statefulset.
I0607 09:52:22.653837 1 controller.go:201] Doris cluster is not have cn
I0607 09:52:22.653845 1 controller.go:201] Doris cluster is not have cn
Also, the DorisCluster status is available.
NAME      FESTATUS    BESTATUS    CNSTATUS   BROKERSTATUS
test-c1   available   available
The volume size change in the DorisCluster spec should be applied to the underlying resources.
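To make the missing reconcile step concrete, here is a small sketch of the comparison a controller could run: parse the requested sizes and report which volumes ask for more storage than the existing PVC has. The function names are hypothetical and the quantity parser only handles a few common suffixes. Note also that a StatefulSet's volumeClaimTemplates cannot be edited in place, and that growing a PVC only works when its StorageClass sets allowVolumeExpansion: true.

```python
# Multipliers for a subset of Kubernetes quantity suffixes (binary and decimal).
_UNITS = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def parse_quantity(quantity: str) -> int:
    """Convert a storage quantity such as '20Gi' into bytes."""
    text = quantity.strip()
    # Check two-letter suffixes first so 'Gi' is not read as 'G'.
    for suffix in sorted(_UNITS, key=len, reverse=True):
        if text.endswith(suffix):
            return int(float(text[: -len(suffix)]) * _UNITS[suffix])
    return int(text)


def volumes_needing_expansion(desired: dict, current: dict) -> list:
    """Return the names of volumes whose desired size exceeds the current PVC size."""
    return sorted(
        name
        for name, size in desired.items()
        if name in current and parse_quantity(size) > parse_quantity(current[name])
    )


# Sizes taken from the spec above; the doubled values are an illustrative change.
current = {"fetest": "20Gi", "felog": "10Gi", "betest": "40Gi", "belog": "20Gi"}
desired = {"fetest": "40Gi", "felog": "10Gi", "betest": "80Gi", "belog": "20Gi"}
to_expand = volumes_needing_expansion(desired, current)
```

Shrinking is deliberately ignored here, since PVCs cannot be reduced in size.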
selectdb/doris.k8s-operator:1.6.0
apiVersion: doris.selectdb.com/v1
kind: DorisCluster
metadata:
  labels:
    app.kubernetes.io/name: doriscluster
    app.kubernetes.io/instance: doriscluster-sample
    app.kubernetes.io/part-of: doris-operator
  name: doriscluster-sample
spec:
  feSpec:
    configMapInfo:
      configMapName: doriscluster-sample-conf
      resolveKey: fe.conf
    securityContext: # added section
      runAsUser: 0 # allow the container to run as root
    replicas: 3
    service:
      type: NodePort
    limits:
      cpu: 2
      memory: 4Gi
    requests:
      cpu: 1
      memory: 2Gi
    image: selectdb/doris.fe-ubuntu:2.0.11
    envVars:
    - name: TZ
      value: "Asia/Shanghai"
    persistentVolumes:
    - mountPath: /opt/apache-doris/fe/doris-meta
      name: fe-doris-meta
      persistentVolumeClaimSpec:
        storageClassName: local-path
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi
    - mountPath: /opt/apache-doris/fe/log
      name: fe-log
      persistentVolumeClaimSpec:
        storageClassName: local-path
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi
  beSpec:
    configMapInfo:
      configMapName: doriscluster-sample-conf
      resolveKey: be.conf
    replicas: 3
    securityContext: # added section
      runAsUser: 0 # allow the container to run as root
    service:
      type: NodePort
    limits:
      cpu: 2
      memory: 4Gi
    requests:
      cpu: 1
      memory: 2Gi
    image: selectdb/doris.be-ubuntu:2.0.11
    envVars:
    - name: TZ
      value: "Asia/Shanghai"
    persistentVolumes:
    - mountPath: /opt/apache-doris/be/storage
      name: be-storage
      persistentVolumeClaimSpec:
        storageClassName: local-path
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi
    - mountPath: /opt/apache-doris/be/log
      name: be-log
      persistentVolumeClaimSpec:
        storageClassName: local-path
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi
Error log:
Normal Created 7m16s (x4 over 8m8s) kubelet Created container fe
Warning Failed 7m16s (x4 over 8m8s) kubelet Error: failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: exec: "/opt/apache-doris/fe_entrypoint.sh": permission denied: unknown
Warning BackOff 3m (x34 over 8m7s) kubelet Back-off restarting failed container fe in pod doriscluster-sample-fe-0_doris(15869920-5ff4-4bf4-9da9-3502a47a4cf4)
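When FE enters a split-brain state, or goes from running normally to having no master, there should be a way to use --metadata_failure_recovery
to recover the cluster.
Currently, defining JVM settings in the charts means writing the fe.config settings statically in values.yaml. That requires configuring many items, which is tedious when maintaining configuration for a large number of clusters.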
xxSpec:
configMap:
xxx |
xxx
Could the relevant configuration items be moved into configmap.yaml, with the JMX setting referencing the value of .Values.xxSpec.heapMemory?
Clients work best with session affinity to FE, and the web UI (one of the Doris services) should keep session affinity to FE.
Currently, annotations in baseSpec are not added to the pod annotations. Allow setting annotations to adapt to private environments, for example using an annotation for a static IP.
Currently the DorisCluster CR provides one ConfigMap per Doris component. The ConfigMap is mounted as files in a read-only volume at "/etc/doris/" for the application to read.
The files in "/etc/doris" are used to start Doris components. However, the plugins provided by Doris also need configuration to start. Doris Operator should provide the ability to add a config file for a plugin.
Currently, FE and BE nodes are registered by shell scripts. Could this be done by the operator instead of the startup scripts? The scale-in logic could then also be decided and carried out in the operator.
Cloud platforms such as OBS allow annotations on PVCs to provide additional capabilities.
As shown in the title, can this feature be added in a future version?
If FE uses a ConfigMap that does not set query_port, then when the FE pod starts in debug mode the service in the pod does not listen on query_port, because the code does not configure a default listen value.
The same applies when BE does not configure heartbeat_service_port.
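A sketch of the fallback being asked for: read the mounted config and use the Doris default when a port key is absent. The defaults shown (9030 for the FE `query_port`, 9050 for the BE heartbeat port, `heartbeat_service_port` in be.conf) are the stock Doris values; the helper names are hypothetical and this is not the operator's actual code.

```python
# Stock Doris defaults for the two ports discussed above.
DEFAULT_PORTS = {"query_port": 9030, "heartbeat_service_port": 9050}


def parse_conf(text: str) -> dict:
    """Parse 'key = value' lines, skipping blanks and '#' comments."""
    conf = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        conf[key.strip()] = value.strip()
    return conf


def resolve_port(conf: dict, key: str) -> int:
    """Return the configured port, or the default when the key is missing or empty."""
    value = conf.get(key)
    return int(value) if value else DEFAULT_PORTS[key]


# A fe.conf that leaves query_port out, as in the report.
fe_conf = parse_conf("# fe.conf without query_port\nhttp_port = 8030\n")
query_port = resolve_port(fe_conf, "query_port")
```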
The entrypoint scripts register Doris components in the cluster.
The scripts rely on the output of the show frontends; command,
but they use an index to fetch the value, which will be wrong if the column order of the output changes.
FE entrypoint: use the column name IsMaster to avoid depending on the column order of the query result.
BE entrypoint: use the column name IsMaster to avoid depending on the column order of the query result.
BE supports a graceful shutdown timeout (https://doris.apache.org/zh-CN/docs/dev/admin-manual/config/be-config#grace_shutdown_wait_seconds),
but this timeout is not kept in sync with pod.spec.terminationGracePeriodSeconds.
Doris operator should set pod.spec.terminationGracePeriodSeconds from grace_shutdown_wait_seconds
in be.conf.
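One way to keep the two timeouts aligned is to derive the pod value from be.conf. The sketch below is only an illustration: the extra `buffer_seconds` margin is an assumption added here so the kubelet does not kill the container at the exact moment BE finishes waiting, and 30 is the Kubernetes default grace period used when the key is absent. The function names are hypothetical.

```python
from typing import Optional


def read_conf_value(conf_text: str, key: str) -> Optional[str]:
    """Return the value of `key` from 'key = value' text, or None when it is not set."""
    for raw in conf_text.splitlines():
        line = raw.strip()
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if name.strip() == key:
            return value.strip()
    return None


def termination_grace_period(be_conf_text: str, buffer_seconds: int = 10, fallback: int = 30) -> int:
    """Derive terminationGracePeriodSeconds from grace_shutdown_wait_seconds in be.conf."""
    value = read_conf_value(be_conf_text, "grace_shutdown_wait_seconds")
    if value is None:
        # Key not configured: keep the Kubernetes default.
        return fallback
    # Assumed safety margin on top of the BE wait time.
    return int(value) + buffer_seconds


grace = termination_grace_period("be_port = 9060\ngrace_shutdown_wait_seconds = 120\n")
```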
The change to the returned result affects cluster initialization and master node acquisition, so adaptation is required.
mysql> show frontends;
+-----------------------------------------+----------------------------------------------------------------------------------+-------------+----------+-----------+---------+--------------------+----------+----------+-----------+------+-------+-------------------+---------------------+---------------------+----------+--------+------------------------------+------------------+
| Name | Host | EditLogPort | HttpPort | QueryPort | RpcPort | ArrowFlightSqlPort | Role | IsMaster | ClusterId | Join | Alive | ReplayedJournalId | LastStartTime | LastHeartbeat | IsHelper | ErrMsg | Version | CurrentConnected |
+-----------------------------------------+----------------------------------------------------------------------------------+-------------+----------+-----------+---------+--------------------+----------+----------+-----------+------+-------+-------------------+---------------------+---------------------+----------+--------+------------------------------+------------------+
| fe_0dba4cb9_55a4_44ac_b82c_a81d8137532f | doriscluster-sample-fe-0.doriscluster-sample-fe-internal.doris.svc.cluster.local | 9010 | 8030 | 9030 | 9020 | -1 | FOLLOWER | true | 847488715 | true | true | 39 | 2024-01-16 05:26:10 | 2024-01-16 05:28:41 | true | | doris-0.0.0-trunk-9ef4e49307 | Yes |
+-----------------------------------------+----------------------------------------------------------------------------------+-------------+----------+-----------+---------+--------------------+----------+----------+-----------+------+-------+-------------------+---------------------+---------------------+----------+--------+------------------------------+------------------+
1 row in set (0.00 sec)
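The lookup by column name can be sketched as follows. This is an illustration in Python rather than the shell used by the entrypoint scripts, and the sample table is a shortened, hypothetical version of the output above; only the header names `Host` and `IsMaster` are taken from the real output.

```python
from typing import Optional


def parse_mysql_table(output: str) -> list:
    """Turn mysql CLI table output into a list of dicts keyed by column name."""
    header = None
    rows = []
    for raw in output.splitlines():
        line = raw.strip()
        # Skip borders ('+---+'), blank lines and the 'N rows in set' footer.
        if not (line.startswith("|") and line.endswith("|")):
            continue
        cells = [cell.strip() for cell in line[1:-1].split("|")]
        if header is None:
            header = cells
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def find_master_host(output: str) -> Optional[str]:
    """Return the Host of the row whose IsMaster column is true, regardless of column position."""
    for row in parse_mysql_table(output):
        if row.get("IsMaster", "").lower() == "true":
            return row["Host"]
    return None


# Hypothetical, trimmed 'show frontends;' output with two frontends.
SAMPLE = """
+------+--------------+----------+----------+-------+
| Name | Host         | Role     | IsMaster | Alive |
+------+--------------+----------+----------+-------+
| fe_a | fe-0.example | FOLLOWER | false    | true  |
| fe_b | fe-1.example | FOLLOWER | true     | true  |
+------+--------------+----------+----------+-------+
2 rows in set (0.00 sec)
"""
master = find_master_host(SAMPLE)
```

Adding a column such as ArrowFlightSqlPort changes the position of IsMaster but not its name, so this lookup keeps working.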
Distinguish the liveness and readiness probes.
Reference documentation: https://doris.apache.org/zh-CN/docs/install/k8s-deploy
kubectl apply -f https://raw.githubusercontent.com/selectdb/doris-operator/master/config/crd/bases/doris.selectdb.com_dorisclusters.yaml
kubectl apply -f https://raw.githubusercontent.com/selectdb/doris-operator/master/config/operator/operator.yaml
kubectl apply -f https://raw.githubusercontent.com/selectdb/doris-operator/master/doc/examples/doriscluster-sample.yaml
I only changed the name from doriscluster-sample to doriscluster.
Error message:
ERROR (stateListener|97) [Env.checkCurrentNodeExist():1601] current node doriscluster-fe-1.doriscluster-fe-internal.doris.svc.cluster.local:9010 is not added to the cluster, will exit. Your FE IP maybe changed, please set 'priority_networks' config in fe.conf properly.
The message says priority_networks needs to be set. I added the following parameters to fe.conf
and mounted it into the pod, but it had no effect:
priority_networks = 10.42.0.0/16
enable_fqdn_mode = true
Could you help me find out what is wrong?
Description:
Apache Doris v2.1 supports the Arrow Flight SQL channel out of the box. To enable Arrow Flight SQL, arrow_flight_sql_port needs to be configured in fe/conf/fe.conf and be/conf/be.conf [1]. Adding these properties in feSpec.configMap.fe.conf and beSpec.configMap.be.conf enables the Arrow Flight SQL port on BE and FE.
However, this configuration does not expose the port on the pod and the service when deploying Doris using the DorisCluster CRD. Any attempt to manually expose it by editing the StatefulSet and Service results in the following error (with PVC as well):
2024-06-03 16:28:33,158 INFO (UNKNOWN fe_a9d8b97b_7b57_4961_ac21_0ffa0bbcc532(-1)|1) [Env.waitForReady():1067] wait catalog to be ready. feType:UNKNOWN isReady:false, counter:101 reason:
This issue persists even when metadata_failure_recovery=true is set in fe.conf [2].
Request:
It would be highly beneficial if the doris-operator could support the configuration for the Arrow Flight SQL port on BE and FE. This enhancement will facilitate seamless deployment and usage of the Arrow Flight SQL feature in Doris.
References:
[1] Arrow Flight SQL in Apache Doris
[2] apache/doris#4322
FE followers participate in elections, so their number should not be scaled freely. Doris operator should restrict the operation and emit an event on the DorisCluster CR.
When a Doris pod crashes and does not restart successfully, we should be able to exec into the pod for manual debugging.
When a pod crashes, we can add the annotation selectdb.com.doris/runmode=debug, and the pod should run in debug mode at its next start, re-entering the Running status on that restart. When you have finished debugging the service, delete the pod or remove the annotation, and the pod runs in normal mode at the next start.
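The decision itself is small. A sketch of how a start-up hook could read it, assuming the annotations are available as a dict (for example via the Downward API); only the annotation key and the value `debug` come from the proposal, the function name is hypothetical:

```python
DEBUG_ANNOTATION = "selectdb.com.doris/runmode"


def run_mode(annotations: dict) -> str:
    """Return 'debug' when the debug annotation is set, otherwise 'normal'."""
    value = (annotations or {}).get(DEBUG_ANNOTATION, "")
    return "debug" if value.strip().lower() == "debug" else "normal"


mode = run_mode({DEBUG_ANNOTATION: "debug"})
```

Removing the annotation, or deleting the pod so it is recreated without it, makes the same check return "normal".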
Support using a Secret to specify the username and password used to manage Doris nodes.
hi
I followed the sample doriscluster-sample-storageclass to create a Doris cluster, but the BE host name cannot be resolved:
curl --location-trusted -u root:000000 -T cc.csv -H "column_separator:," http://10.0.162.252:31352/api/demo/example_tbl/_stream_load
curl: (6) Could not resolve host: doriscluster-be-0.doriscluster-be-internal.doris.svc.cluster.local
Since the default initImage changed from alpine to selectdb/alpine, it no longer supports arm64.
It may be necessary to push multi-platform images.
Currently, hosts running Doris must satisfy some restrictive limits, for example:
vm.max_map_count,
ulimit -n
swapoff -a
In user environments these limits are often not initialized, so users have to configure them themselves, which is a burden to manage. Doris-Operator should pre-configure these limits.
The CRD sets a field DisableDefaultPreInitial; construct an initContainer for the initial configuration of the host.
When Doris is deployed with an account and password configured, the operator should create the account and set its password.
Hi team:
Referring to this link: https://doris.apache.org/zh-CN/docs/install/cluster-deployment/k8s-deploy/root-user-use/ I'm trying to set the root password.
Step 1: Apply the DorisCluster without adminUser.name and adminUser.password. The default empty password works for me, and both BE and FE start well.
Step 2: Log in to Doris with the mysql CLI, run SHOW ALL GRANTS; and set password for 'root' = password('pwd'), then verify login with the mysql CLI using the new password.
Step 3: Update the DorisCluster to add adminUser.name and adminUser.password, and apply it with kubectl. Newly started BE pods keep logging the error message.
The BE pod logs the following error:
[Thu Jun 13 03:42:23 UTC 2024] [info] use root no password show frontends result ERROR 1045 (28000): Access denied for user '[email protected]' (using password: NO) .
ERROR 1045 (28000): Access denied for user '[email protected]' (using password: YES)
Images:
Also check the issue: #131