polardb / polardbx-operator
polardbx-operator is a Kubernetes extension that aims to create and manage PolarDB-X clusters on Kubernetes.
License: Apache License 2.0
OS: CentOS Stream 9
Docker version: 24
Python version: 3.9
pxd version: 0.5.3
The error is as follows:
latest: Pulling from polardbx/xstore-tools
Digest: sha256:745a19864eb6ef05a07b8ba97bc428aa9ed9084784f93a48e3c8a3c4e16a1f28
Status: Image is up to date for polardbx/xstore-tools:latest
Pull image: polardbx/polardbx-cdc:latest at 127.0.0.1
latest: Pulling from polardbx/polardbx-cdc
Digest: sha256:d372291c367309016b3a273d2feaa619c592712c3f54478d8c47425bae83efcf
Status: Image is up to date for polardbx/polardbx-cdc:latest
Processing [###########-------------------------] 30% create gms node
Processing [#############-----------------------] 38% create gms db and tables
Processing [################--------------------] 46% create PolarDB-X root account
Processing [###################-----------------] 53% create dn
Traceback (most recent call last):
File "/home/polardbx/.local/lib/python3.9/site-packages/docker/api/client.py", line 268, in _raise_for_status
response.raise_for_status()
File "/home/polardbx/.local/lib/python3.9/site-packages/requests/models.py", line 953, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 409 Client Error: Conflict for url: http+docker://localhost/v1.43/containers/41ce7473fd94b6d9e7ef1247bd8ce4c36d5d6bdbaf762d7260c7f1dbcf472d95/exec
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/polardbx/.local/bin/pxd", line 8, in <module>
sys.exit(main())
File "/home/polardbx/.local/lib/python3.9/site-packages/click/core.py", line 1137, in __call__
return self.main(*args, **kwargs)
File "/home/polardbx/.local/lib/python3.9/site-packages/click/core.py", line 1062, in main
rv = self.invoke(ctx)
File "/home/polardbx/.local/lib/python3.9/site-packages/click/core.py", line 1668, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/polardbx/.local/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/polardbx/.local/lib/python3.9/site-packages/click/core.py", line 763, in invoke
return __callback(*args, **kwargs)
File "/home/polardbx/.local/lib/python3.9/site-packages/deployer/pxd.py", line 53, in tryout
create_tryout_pxc(name, cn_replica, cn_version, dn_replica, dn_version, cdc_replica, cdc_version, leader_only)
File "/home/polardbx/.local/lib/python3.9/site-packages/deployer/pxc/polardbx_manager.py", line 36, in create_tryout_pxc
pxc.create()
File "/home/polardbx/.local/lib/python3.9/site-packages/deployer/pxc/polardbx_cluster.py", line 144, in create
result = create_task()
File "/home/polardbx/.local/lib/python3.9/site-packages/deployer/decorator/decorators.py", line 40, in wrapper
raise ex
File "/home/polardbx/.local/lib/python3.9/site-packages/deployer/decorator/decorators.py", line 33, in wrapper
ret = func(*args, **kw)
File "/home/polardbx/.local/lib/python3.9/site-packages/deployer/pxc/polardbx_cluster.py", line 353, in _create_dn
logger.info(f.result())
File "/usr/lib64/python3.9/concurrent/futures/_base.py", line 439, in result
return self.__get_result()
File "/usr/lib64/python3.9/concurrent/futures/_base.py", line 391, in __get_result
raise self._exception
File "/usr/lib64/python3.9/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
File "/home/polardbx/.local/lib/python3.9/site-packages/deployer/pxc/polardbx_cluster.py", line 342, in _create_dn
xdb.create()
File "/home/polardbx/.local/lib/python3.9/site-packages/deployer/xdb/xdb.py", line 100, in create
result = create_task()
File "/home/polardbx/.local/lib/python3.9/site-packages/deployer/decorator/decorators.py", line 61, in wrapper
raise ex
File "/home/polardbx/.local/lib/python3.9/site-packages/deployer/decorator/decorators.py", line 54, in wrapper
ret = func(*args, **kw)
File "/home/polardbx/.local/lib/python3.9/site-packages/deployer/xdb/xdb.py", line 397, in _create_admin_account
leader_node = self._wait_leader_elected()
File "/home/polardbx/.local/lib/python3.9/site-packages/retrying.py", line 49, in wrapped_f
return Retrying(*dargs, **dkw).call(f, *args, **kw)
File "/home/polardbx/.local/lib/python3.9/site-packages/retrying.py", line 212, in call
raise attempt.get()
File "/home/polardbx/.local/lib/python3.9/site-packages/retrying.py", line 247, in get
six.reraise(self.value[0], self.value[1], self.value[2])
File "/home/polardbx/.local/lib/python3.9/site-packages/six.py", line 719, in reraise
raise value
File "/home/polardbx/.local/lib/python3.9/site-packages/retrying.py", line 200, in call
attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
File "/home/polardbx/.local/lib/python3.9/site-packages/deployer/xdb/xdb.py", line 358, in _wait_leader_elected
(exit_code, output) = container.exec_run(cmd=["/tools/xstore/current/venv/bin/python3",
File "/home/polardbx/.local/lib/python3.9/site-packages/docker/models/containers.py", line 193, in exec_run
resp = self.client.api.exec_create(
File "/home/polardbx/.local/lib/python3.9/site-packages/docker/utils/decorators.py", line 19, in wrapped
return f(self, resource_id, *args, **kwargs)
File "/home/polardbx/.local/lib/python3.9/site-packages/docker/api/exec_api.py", line 80, in exec_create
return self._result(res, True)
File "/home/polardbx/.local/lib/python3.9/site-packages/docker/api/client.py", line 274, in _result
self._raise_for_status(response)
File "/home/polardbx/.local/lib/python3.9/site-packages/docker/api/client.py", line 270, in _raise_for_status
raise create_api_error_from_http_exception(e)
File "/home/polardbx/.local/lib/python3.9/site-packages/docker/errors.py", line 31, in create_api_error_from_http_exception
raise cls(e, response=response, explanation=explanation)
docker.errors.APIError: 409 Client Error for http+docker://localhost/v1.43/containers/41ce7473fd94b6d9e7ef1247bd8ce4c36d5d6bdbaf762d7260c7f1dbcf472d95/exec: Conflict ("Container 41ce7473fd94b6d9e7ef1247bd8ce4c36d5d6bdbaf762d7260c7f1dbcf472d95 is not running")
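The 409 Conflict above is raised because `exec_run` is issued against a DN container that is no longer running. A defensive pattern, sketched here with docker-py's real `reload()`/`status`/`exec_run` API (the wrapper function itself is an assumption, not part of pxd), is to re-check the container state before each exec and retry briefly:

```python
# Sketch: guard exec_run against the 409 "Container ... is not running" error
# by refreshing container state first. exec_when_running is a hypothetical
# helper; reload(), status, and exec_run are real docker-py container members.
import time

def exec_when_running(container, cmd, retries=5, delay=2.0):
    """Run cmd in the container only once it is in the running state."""
    for _ in range(retries):
        container.reload()                 # refresh .status from the daemon
        if container.status == "running":
            return container.exec_run(cmd=cmd)
        time.sleep(delay)                  # container not up yet; wait and retry
    raise RuntimeError(f"container {container.id} never reached running state")
```

If the container keeps exiting, the real fix is of course to inspect its logs; this guard only turns a confusing 409 into a clear timeout error.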
How does the admin account obtain Grant_priv permissions? Or how can the root password be obtained?
- What is missing?
A PrometheusRule for PolarDB-X is missing in the polardbx-monitor helm chart.
- Why do we need it?
Currently, a PolarDB-X cluster deployed by polardbx-operator is monitored by Prometheus and Grafana, but only Grafana dashboards are provided to visualize the metrics and health of PolarDB-X.
If some metric exceeds a threshold, such as high CPU usage, an alert should be generated and delivered to the user as soon as possible. So a default PrometheusRule for the PolarDB-X cluster is needed. With it, Prometheus can forward alerts to other alerting systems (such as Alertmanager).
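To make the request concrete, a minimal sketch of what such a default rule could look like. The resource name, group name, metric, and threshold below are illustrative assumptions, not content of the existing chart:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: polardbx-alert-rules          # hypothetical name
  namespace: polardbx-monitor
spec:
  groups:
  - name: polardbx.rules              # assumed group name
    rules:
    - alert: PolarDBXCNHighCPU        # hypothetical alert on CN CPU usage
      expr: rate(process_cpu_seconds_total{app="polardbx-cn"}[5m]) > 0.9
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "CN pod {{ $labels.pod }} CPU above 90% for 5 minutes"
```

With such a rule installed, the Prometheus instance deployed by polardbx-monitor would fire the alert, and Alertmanager routing takes over from there.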
Support using a StorageClass to provide storage and enable pods to mount PVC volumes.
Parameters do not take effect. For example, after setting expire_logs_days, querying expire_logs_days inside MySQL still returns 0.
apiVersion: polardbx.aliyun.com/v1
kind: PolarDBXParameterTemplate
metadata:
  labels:
    app.kubernetes.io/managed-by: Helm
  name: product-8030
spec:
  nodeType:
    cn:
      name: cnTemplate
      paramList:
      ...
      - defaultValue: "7"
        divisibilityFactor: 1
        mode: readwrite
        name: expire_logs_days
        optional: .*
        restart: true
        unit: INT
      ...
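One way to check whether the template value actually reached the engine is to query the variable directly on the data node. Host, port, and credentials below are placeholders for this environment:

```shell
# Placeholder connection details; point this at the DN's MySQL endpoint.
mysql -h127.0.0.1 -P4886 -uadmin -p \
  -e "SHOW GLOBAL VARIABLES LIKE 'expire_logs_days';"
```

If this still shows 0 after the template is applied, the parameter was likely never pushed to the DN (e.g. it is a DN-scope variable listed under the cn node type).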
Request: support super-type users in the xstore API privileges. Thanks.
Command: helm install --namespace polardbx-monitor polardbx-monitor polardbx/polardbx-monitor
Result: Error: rendered manifests contain a resource that already exists. Unable to continue with install: ClusterRole "resource-metrics-server-resources" in namespace "" exists and cannot be imported into the current release: invalid ownership metadata; label validation error: missing key "app.kubernetes.io/managed-by": must be set to "Helm"; annotation validation error: missing key "meta.helm.sh/release-name": must be set to "polardbx-monitor"; annotation validation error: missing key "meta.helm.sh/release-namespace": must be set to "polardbx-monitor"
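This error usually means a ClusterRole left over from an earlier monitoring install (e.g. kube-prometheus) already exists. One possible workaround, assuming you want Helm to adopt the existing object rather than delete it, is to add the ownership metadata Helm checks for and retry the install:

```shell
# Hedged workaround: attach Helm ownership metadata to the existing ClusterRole
# so the release can adopt it, then re-run the install.
kubectl label clusterrole resource-metrics-server-resources \
  app.kubernetes.io/managed-by=Helm --overwrite
kubectl annotate clusterrole resource-metrics-server-resources \
  meta.helm.sh/release-name=polardbx-monitor \
  meta.helm.sh/release-namespace=polardbx-monitor --overwrite
helm install --namespace polardbx-monitor polardbx-monitor polardbx/polardbx-monitor
```

Alternatively, deleting the stale ClusterRole works if no other release still depends on it.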
Version: polardbx-operator-1.4.1.tar.gz
polardbx-operator/values.yaml:
nodeSelector: { }
affinity: { }
tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "dba"
    effect: "NoSchedule"
Error log:
[root@localhost polardb-x]# helm template polardbx-operator --output-dir ./result --debug>aa
WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /root/.kube/config
WARNING: Kubernetes configuration file is world-readable. This is insecure. Location: /root/.kube/config
install.go:178: [debug] Original chart version: ""
install.go:195: [debug] CHART PATH: /root/polardb-x/polardbx-operator
Error: YAML parse error on polardbx-operator/templates/controller-manager-deployment.yaml: error converting YAML to JSON: yaml: line 38: did not find expected key
helm.go:84: [debug] error converting YAML to JSON: yaml: line 38: did not find expected key
YAML parse error on polardbx-operator/templates/controller-manager-deployment.yaml
helm.sh/helm/v3/pkg/releaseutil.(*manifestFile).sort
helm.sh/helm/v3/pkg/releaseutil/manifest_sorter.go:146
helm.sh/helm/v3/pkg/releaseutil.SortManifests
helm.sh/helm/v3/pkg/releaseutil/manifest_sorter.go:106
helm.sh/helm/v3/pkg/action.(*Configuration).renderResources
helm.sh/helm/v3/pkg/action/action.go:165
helm.sh/helm/v3/pkg/action.(*Install).RunWithContext
helm.sh/helm/v3/pkg/action/install.go:259
main.runInstall
helm.sh/helm/v3/cmd/helm/install.go:264
main.newTemplateCmd.func2
helm.sh/helm/v3/cmd/helm/template.go:82
github.com/spf13/cobra.(*Command).execute
github.com/spf13/[email protected]/command.go:856
github.com/spf13/cobra.(*Command).ExecuteC
github.com/spf13/[email protected]/command.go:974
github.com/spf13/cobra.(*Command).Execute
github.com/spf13/[email protected]/command.go:902
main.main
helm.sh/helm/v3/cmd/helm/helm.go:83
runtime.main
runtime/proc.go:255
runtime.goexit
runtime/asm_amd64.s:1581
The incorrectly generated YAML (from --debug):
fieldPath: metadata.annotations['pause']
tolerations:
- effect: NoSchedule
key: dedicated
operator: Equal
value: dba
containers:
It looks like the cause is that the toYaml output keeps the leading whitespace of the template line. Changing the tolerations section in controller-manager-deployment.yaml from
{{- with .Values.controllerManager.tolerations }}
tolerations:
{{ toYaml . | indent 8 }}
{{- end }}
to
{{- with .Values.controllerManager.tolerations }}
tolerations:
{{ toYaml . | indent 6 }}
{{- end }}
fixes it. The nodeSelector and affinity sections (which also use toYaml) may have the same problem.
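An alternative sketch that sidesteps the leading-whitespace problem entirely is Helm's nindent helper: the `{{-` trims the template line's own whitespace and nindent re-indents every line of the output. The indent width of 8 here is an assumption that must match the surrounding manifest:

```yaml
      {{- with .Values.controllerManager.tolerations }}
      tolerations:
        {{- toYaml . | nindent 8 }}
      {{- end }}
```

This renders the same list regardless of how the template line itself is indented, which is why charts commonly prefer nindent over indent for nested blocks.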
apiVersion: polardbx.aliyun.com/v1
kind: XStore
metadata:
  name: polardb-readonly
spec:
  config:
    controller:
      RPCProtocolVersion: 1
    engine: {}
  readonly: true
  primaryXStore: polardb-master # name of the primary instance
  topology:
    nodeSets:
    - name: readonly
      replicas: 1
      role: Learner
    template:
      spec:
        image: image
        resources:
          limits:
            cpu: "2"
            memory: 8Gi
          requests:
            cpu: "2"
            memory: 4Gi
        affinity:
          podAntiAffinity:
            preferredDuringSchedulingIgnoredDuringExecution:
            - podAffinityTerm:
                labelSelector:
                  matchLabels:
                    xstore/node-role: candidate
                topologyKey: kubernetes.io/hostname
              weight: 1
          nodeAffinity:
            preferredDuringSchedulingIgnoredDuringExecution:
            - preference:
                matchExpressions:
                - key: mks.components
                  operator: In
                  values:
                  - "enable"
              weight: 1
        hostNetwork: false
  parameterTemplate:
    name: product-8032
  serviceType: ClusterIP
Environment: installed on a single-node Rancher set up with Docker on a single machine.
[root@test]# kubectl -n polardbx get pods
NAME READY STATUS RESTARTS AGE
polardbx-c00-gt96-dn-0-cand-0 0/3 ContainerCreating 0 24m
polardbx-c00-gt96-dn-0-cand-1 0/3 ContainerCreating 0 24m
polardbx-c00-gt96-dn-0-log-0 0/3 ContainerCreating 0 24m
polardbx-c00-gt96-gms-cand-0 0/3 ContainerCreating 0 24m
polardbx-c00-gt96-gms-cand-1 0/3 ContainerCreating 0 24m
polardbx-c00-gt96-gms-log-0 0/3 ContainerCreating 0 24m
polardbx-controller-manager-7b4f47dfcf-5txkz 1/1 Running 0 38m
polardbx-hpfs-2fq8x 1/1 Running 0 38m
polardbx-tools-updater-jd829 1/1 Running 0 38m
kubectl -n polardbx describe pod polardbx-c00-gt96-dn-0-cand-0
Events:
Type Reason Age From Message
Normal Scheduled 17s default-scheduler Successfully assigned polardbx/polardbx-c00-gt96-dn-0-cand-0 to dz-fanjie
Warning FailedMount 2s (x6 over 17s) kubelet MountVolume.SetUp failed for volume "etclocaltime" : hostPath type check failed: /etc/localtime is not a file
Warning FailedMount 2s (x6 over 17s) kubelet MountVolume.SetUp failed for volume "zoneinfo" : hostPath type check failed: /usr/share/zoneinfo is not a directory
The above errors occur even though both the host machine and the Docker container contain those two paths.
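Since the cluster here is Rancher running inside Docker, the kubelet's hostPath type checks are evaluated against the Rancher container's filesystem, not the physical host. A quick way to confirm what the kubelet actually sees (the container name rancher is an assumption for this setup):

```shell
# The hostPath type check runs against what the kubelet treats as "the host".
# /etc/localtime must resolve to a regular file and /usr/share/zoneinfo to a
# directory inside the Rancher container for the mounts to succeed.
docker exec rancher ls -ld /etc/localtime /usr/share/zoneinfo
```

If either path is missing or of the wrong type inside that container, mounting the tzdata package into it (or bind-mounting the host paths when starting Rancher) should resolve the FailedMount events.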
Our scenario: we added some dedicated worker nodes to the development environment for PolarDB-X to use exclusively. To prevent other applications from consuming their resources, the nodes must be tainted, but then PolarDB-X itself is also excluded by the scheduler unless it can tolerate the taints.
We hope tolerations or a RuntimeClass can be made configurable.
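For illustration, the taint in question and the toleration the operator-created pods would need might look like this (node name, taint key, and values are examples, not fields the operator currently exposes):

```yaml
# Taint applied to the dedicated worker:
#   kubectl taint nodes worker-1 dedicated=dba:NoSchedule
# Toleration the PolarDB-X pods would need in their pod spec:
tolerations:
- key: "dedicated"
  operator: "Equal"
  value: "dba"
  effect: "NoSchedule"
```

Combined with a nodeSelector or node affinity, this would both keep other workloads off the nodes and allow PolarDB-X pods onto them.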
Hi, we use the https://orbstack.dev/ Kubernetes environment for development.
We installed version v1.5.0 with: helm install --namespace polardbx-operator-system polardbx-operator polardbx/polardbx-operator
However, polardbx-hpfs failed to start with the following error message:
2023-11-17T14:43:20.414Z INFO getting file lock...
2023-11-17T14:43:20.414Z INFO locked, continue
2023-11-17T14:43:20.414Z INFO starting grpc service...
2023-11-17T14:43:20.415Z INFO Start filestream server
Loading block devices...
failed to create local file service: stat /dev/vda: no such file or directory
Client: 10.1.8.242:57568
GET /liveness HTTP/1.1
Host: 10.1.8.242:23234
user-agent: kube-probe/1.21
accept: */*
probe-extra: galaxy
probe-port: 17997
probe-target: xstore
connection: close
2022/08/12 21:20:44 Succeeds!
2022/08/12 21:20:44 Failed!
$ kubectl get polardbxcluster -w
NAME GMS CN DN CDC PHASE DISK AGE
liyz 0/1 0/1 0/1 0/1 Creating 42m
It is always stuck waiting in the Creating phase.
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
liyz-9gk8-dn-0-single-0 2/3 Running 0 43m
liyz-9gk8-gms-single-0 2/3 Running 0 43m
$ kubectl logs --tail=30 liyz-9gk8-dn-0-single-0 --all-containers=true
INFO:root:Begin to initialize...
2022-04-07 08:31:23,861 - GalaxyEngine - INFO - (cwd=/data/mysql/data) execute command: /opt/galaxy_engine/bin/mysqld --defaults-file=/data/mysql/conf/my.cnf --loose-pod-name=liyz-9gk8-dn-0-single-0 --loose-cluster-info=liyz-9gk8-dn-0-single-0:25047@1 --initialize-insecure
INFO:root:Initialized!
INFO:root:Bootstrapping engine galaxy ...
2022-04-07 08:31:27,665 - GalaxyEngine - INFO - starting process...
2022-04-07 08:31:27,665 - GalaxyEngine - INFO - () start command: /opt/galaxy_engine/bin/mysqld_safe --defaults-file=/data/mysql/conf/my.cnf --loose-pod-name=liyz-9gk8-dn-0-single-0
2022-04-07T08:31:28.433699Z mysqld_safe Logging to '/data/mysql/log/alert.log'.
2022-04-07T08:31:28.447407Z mysqld_safe Starting mysqld daemon with databases from /data/mysql/data
level=info ts=2022-04-07T08:32:02.795Z caller=mysqld_exporter.go:277 msg="Starting mysqld_exporter" version="(version=0.13.0, branch=master, revision=4e8e2b98de1c2bcfed7bf0a5c08b3a885b1cbf46)"
level=info ts=2022-04-07T08:32:02.795Z caller=mysqld_exporter.go:278 msg="Build context" (go=go1.16.9, user=root@eafbaf7294c5, date=20211023-07:52:42)=(MISSING)
level=info ts=2022-04-07T08:32:02.795Z caller=mysqld_exporter.go:293 msg="Scraper enabled" scraper=global_status
level=info ts=2022-04-07T08:32:02.795Z caller=mysqld_exporter.go:293 msg="Scraper enabled" scraper=global_variables
level=info ts=2022-04-07T08:32:02.795Z caller=mysqld_exporter.go:293 msg="Scraper enabled" scraper=slave_status
level=info ts=2022-04-07T08:32:02.795Z caller=mysqld_exporter.go:293 msg="Scraper enabled" scraper=info_schema.processlist
level=info ts=2022-04-07T08:32:02.795Z caller=mysqld_exporter.go:293 msg="Scraper enabled" scraper=info_schema.innodb_tablespaces
level=info ts=2022-04-07T08:32:02.795Z caller=mysqld_exporter.go:293 msg="Scraper enabled" scraper=info_schema.innodb_metrics
level=info ts=2022-04-07T08:32:02.795Z caller=mysqld_exporter.go:293 msg="Scraper enabled" scraper=info_schema.innodb_cmp
level=info ts=2022-04-07T08:32:02.795Z caller=mysqld_exporter.go:293 msg="Scraper enabled" scraper=info_schema.innodb_cmpmem
level=info ts=2022-04-07T08:32:02.795Z caller=mysqld_exporter.go:293 msg="Scraper enabled" scraper=info_schema.query_response_time
level=info ts=2022-04-07T08:32:02.795Z caller=mysqld_exporter.go:293 msg="Scraper enabled" scraper=engine_innodb_status
level=info ts=2022-04-07T08:32:02.795Z caller=mysqld_exporter.go:303 msg="Listening on address" address=:21897
level=info ts=2022-04-07T08:32:02.795Z caller=tls_config.go:191 msg="TLS is disabled." http2=false
Client: 192.168.49.2:47238
GET /liveness HTTP/1.1
Host: 192.168.49.2:22096
probe-port: 17047
probe-target: xstore
connection: close
user-agent: kube-probe/1.23
accept: */*
probe-extra: galaxy
Client: 192.168.49.2:47236
GET /readiness HTTP/1.1
Host: 192.168.49.2:22096
probe-extra: galaxy
probe-port: 17047
probe-target: xstore
connection: close
user-agent: kube-probe/1.23
accept: */*
2022/04/07 09:12:00 Succeeds!
2022/04/07 09:12:00 Failed!
Client: 192.168.49.2:47254
GET /readiness HTTP/1.1
Host: 192.168.49.2:22096
probe-target: xstore
connection: close
user-agent: kube-probe/1.23
accept: */*
probe-extra: galaxy
probe-port: 17047
2022/04/07 09:12:01 Failed!