
polardbx-operator's People

Contributors

arkbriar, co63oc, dingfeng, fengshunli, free6om, testwill, vettalwu, wenki-96, wenki96


polardbx-operator's Issues

pxd tryout installation fails

OS: CentOS Stream 9
Docker version: 24
Python version: 3.9
pxd version: 0.5.3

The error output is as follows:
latest: Pulling from polardbx/xstore-tools
Digest: sha256:745a19864eb6ef05a07b8ba97bc428aa9ed9084784f93a48e3c8a3c4e16a1f28
Status: Image is up to date for polardbx/xstore-tools:latest
Pull image: polardbx/polardbx-cdc:latest at 127.0.0.1

latest: Pulling from polardbx/polardbx-cdc
Digest: sha256:d372291c367309016b3a273d2feaa619c592712c3f54478d8c47425bae83efcf
Status: Image is up to date for polardbx/polardbx-cdc:latest
Processing [###########-------------------------] 30% create gms node
Processing [#############-----------------------] 38% create gms db and tables
Processing [################--------------------] 46% create PolarDB-X root account
Processing [###################-----------------] 53% create dn

Traceback (most recent call last):
File "/home/polardbx/.local/lib/python3.9/site-packages/docker/api/client.py", line 268, in _raise_for_status
response.raise_for_status()
File "/home/polardbx/.local/lib/python3.9/site-packages/requests/models.py", line 953, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 409 Client Error: Conflict for url: http+docker://localhost/v1.43/containers/41ce7473fd94b6d9e7ef1247bd8ce4c36d5d6bdbaf762d7260c7f1dbcf472d95/exec

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/polardbx/.local/bin/pxd", line 8, in <module>
sys.exit(main())
File "/home/polardbx/.local/lib/python3.9/site-packages/click/core.py", line 1137, in __call__
return self.main(*args, **kwargs)
File "/home/polardbx/.local/lib/python3.9/site-packages/click/core.py", line 1062, in main
rv = self.invoke(ctx)
File "/home/polardbx/.local/lib/python3.9/site-packages/click/core.py", line 1668, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/polardbx/.local/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/polardbx/.local/lib/python3.9/site-packages/click/core.py", line 763, in invoke
return __callback(*args, **kwargs)
File "/home/polardbx/.local/lib/python3.9/site-packages/deployer/pxd.py", line 53, in tryout
create_tryout_pxc(name, cn_replica, cn_version, dn_replica, dn_version, cdc_replica, cdc_version, leader_only)
File "/home/polardbx/.local/lib/python3.9/site-packages/deployer/pxc/polardbx_manager.py", line 36, in create_tryout_pxc
pxc.create()
File "/home/polardbx/.local/lib/python3.9/site-packages/deployer/pxc/polardbx_cluster.py", line 144, in create
result = create_task()
File "/home/polardbx/.local/lib/python3.9/site-packages/deployer/decorator/decorators.py", line 40, in wrapper
raise ex
File "/home/polardbx/.local/lib/python3.9/site-packages/deployer/decorator/decorators.py", line 33, in wrapper
ret = func(*args, **kw)
File "/home/polardbx/.local/lib/python3.9/site-packages/deployer/pxc/polardbx_cluster.py", line 353, in _create_dn
logger.info(f.result())
File "/usr/lib64/python3.9/concurrent/futures/_base.py", line 439, in result
return self.__get_result()
File "/usr/lib64/python3.9/concurrent/futures/_base.py", line 391, in __get_result
raise self._exception
File "/usr/lib64/python3.9/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
File "/home/polardbx/.local/lib/python3.9/site-packages/deployer/pxc/polardbx_cluster.py", line 342, in _create_dn
xdb.create()
File "/home/polardbx/.local/lib/python3.9/site-packages/deployer/xdb/xdb.py", line 100, in create
result = create_task()
File "/home/polardbx/.local/lib/python3.9/site-packages/deployer/decorator/decorators.py", line 61, in wrapper
raise ex
File "/home/polardbx/.local/lib/python3.9/site-packages/deployer/decorator/decorators.py", line 54, in wrapper
ret = func(*args, **kw)
File "/home/polardbx/.local/lib/python3.9/site-packages/deployer/xdb/xdb.py", line 397, in _create_admin_account
leader_node = self._wait_leader_elected()
File "/home/polardbx/.local/lib/python3.9/site-packages/retrying.py", line 49, in wrapped_f
return Retrying(*dargs, **dkw).call(f, *args, **kw)
File "/home/polardbx/.local/lib/python3.9/site-packages/retrying.py", line 212, in call
raise attempt.get()
File "/home/polardbx/.local/lib/python3.9/site-packages/retrying.py", line 247, in get
six.reraise(self.value[0], self.value[1], self.value[2])
File "/home/polardbx/.local/lib/python3.9/site-packages/six.py", line 719, in reraise
raise value
File "/home/polardbx/.local/lib/python3.9/site-packages/retrying.py", line 200, in call
attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
File "/home/polardbx/.local/lib/python3.9/site-packages/deployer/xdb/xdb.py", line 358, in _wait_leader_elected
(exit_code, output) = container.exec_run(cmd=["/tools/xstore/current/venv/bin/python3",
File "/home/polardbx/.local/lib/python3.9/site-packages/docker/models/containers.py", line 193, in exec_run
resp = self.client.api.exec_create(
File "/home/polardbx/.local/lib/python3.9/site-packages/docker/utils/decorators.py", line 19, in wrapped
return f(self, resource_id, *args, **kwargs)
File "/home/polardbx/.local/lib/python3.9/site-packages/docker/api/exec_api.py", line 80, in exec_create
return self._result(res, True)
File "/home/polardbx/.local/lib/python3.9/site-packages/docker/api/client.py", line 274, in _result
self._raise_for_status(response)
File "/home/polardbx/.local/lib/python3.9/site-packages/docker/api/client.py", line 270, in _raise_for_status
raise create_api_error_from_http_exception(e)
File "/home/polardbx/.local/lib/python3.9/site-packages/docker/errors.py", line 31, in create_api_error_from_http_exception
raise cls(e, response=response, explanation=explanation)
docker.errors.APIError: 409 Client Error for http+docker://localhost/v1.43/containers/41ce7473fd94b6d9e7ef1247bd8ce4c36d5d6bdbaf762d7260c7f1dbcf472d95/exec: Conflict ("Container 41ce7473fd94b6d9e7ef1247bd8ce4c36d5d6bdbaf762d7260c7f1dbcf472d95 is not running")

Support default PrometheusRule for polardbx cluster

- What is missing?

A PrometheusRule for PolarDB-X is missing from the polardbx-monitor helm chart.

- Why do we need it?

Currently, a PolarDB-X cluster deployed by polardbx-operator is monitored by Prometheus and Grafana, but only Grafana dashboards are provided to visualize the metrics and health of PolarDB-X.
When a metric exceeds a threshold (high CPU, for example), an alert should be generated and brought to the user's attention as soon as possible, so a default PrometheusRule for the PolarDB-X cluster is needed. Prometheus can then forward the alerts to other alerting systems (such as Alertmanager).
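For illustration, a minimal sketch of such a rule is shown below, assuming the Prometheus Operator CRDs are installed; the metric name, threshold, and labels are placeholders, not fields taken from the actual polardbx-monitor chart:

```yaml
# Hypothetical default rule; the metric name and threshold are assumptions.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: polardbx-default-rules
  namespace: polardbx-monitor
spec:
  groups:
  - name: polardbx.resources
    rules:
    - alert: PolarDBXHighCPU
      expr: polardbx_cpu_usage_ratio > 0.9   # assumed metric name
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "PolarDB-X CPU usage has stayed above 90% for 5 minutes"
```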

Support StorageClass

Support using a StorageClass to provision storage so that pods can mount PVC volumes. A sketch of the Kubernetes objects involved follows.
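This is a minimal sketch with plain Kubernetes objects, assuming a CSI provisioner is available; the operator-side field that would reference the StorageClass is exactly what this issue requests:

```yaml
# Standard Kubernetes StorageClass plus a PVC that uses it (illustrative names).
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: polardbx-ssd
provisioner: example.csi.driver   # replace with your CSI driver
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-data
spec:
  storageClassName: polardbx-ssd
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 100Gi
```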

Parameter is not valid

Parameters do not take effect. For example, expire_logs_days is set in the template below, but querying expire_logs_days inside MySQL returns 0.

apiVersion: polardbx.aliyun.com/v1
kind: PolarDBXParameterTemplate
metadata:
  labels:
    app.kubernetes.io/managed-by: Helm
  name: product-8030
spec:
  nodeType:
    cn:
      name: cnTemplate
      paramList:
...
      - defaultValue: "7"
        divisibilityFactor: 1
        mode: readwrite
        name: expire_logs_days
        optional: .*
        restart: true
        unit: INT
...

polardbx-monitor install failed with helm

Command: helm install --namespace polardbx-monitor polardbx-monitor polardbx/polardbx-monitor

Result: Error: rendered manifests contain a resource that already exists. Unable to continue with install: ClusterRole "resource-metrics-server-resources" in namespace "" exists and cannot be imported into the current release: invalid ownership metadata; label validation error: missing key "app.kubernetes.io/managed-by": must be set to "Helm"; annotation validation error: missing key "meta.helm.sh/release-name": must be set to "polardbx-monitor"; annotation validation error: missing key "meta.helm.sh/release-namespace": must be set to "polardbx-monitor"
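A common way out, assuming the ClusterRole was left behind by an earlier monitoring install, is to either delete the conflicting ClusterRole or patch it with exactly the ownership metadata the error message lists, after which Helm can adopt it:

```yaml
# Metadata Helm expects on the existing ClusterRole before it can be adopted
# into the polardbx-monitor release (taken verbatim from the error above).
metadata:
  labels:
    app.kubernetes.io/managed-by: Helm
  annotations:
    meta.helm.sh/release-name: polardbx-monitor
    meta.helm.sh/release-namespace: polardbx-monitor
```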

Template rendering fails after modifying the tolerations object in values.yaml

Version: polardbx-operator-1.4.1.tar.gz
polardbx-operator/values.yaml:

  nodeSelector: { }
  affinity: { }
  tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "dba"
    effect: "NoSchedule"

Error log:

[root@localhost polardb-x]# helm template polardbx-operator --output-dir ./result --debug>aa
WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /root/.kube/config
WARNING: Kubernetes configuration file is world-readable. This is insecure. Location: /root/.kube/config
install.go:178: [debug] Original chart version: ""
install.go:195: [debug] CHART PATH: /root/polardb-x/polardbx-operator

Error: YAML parse error on polardbx-operator/templates/controller-manager-deployment.yaml: error converting YAML to JSON: yaml: line 38: did not find expected key
helm.go:84: [debug] error converting YAML to JSON: yaml: line 38: did not find expected key
YAML parse error on polardbx-operator/templates/controller-manager-deployment.yaml
helm.sh/helm/v3/pkg/releaseutil.(*manifestFile).sort
        helm.sh/helm/v3/pkg/releaseutil/manifest_sorter.go:146
helm.sh/helm/v3/pkg/releaseutil.SortManifests
        helm.sh/helm/v3/pkg/releaseutil/manifest_sorter.go:106
helm.sh/helm/v3/pkg/action.(*Configuration).renderResources
        helm.sh/helm/v3/pkg/action/action.go:165
helm.sh/helm/v3/pkg/action.(*Install).RunWithContext
        helm.sh/helm/v3/pkg/action/install.go:259
main.runInstall
        helm.sh/helm/v3/cmd/helm/install.go:264
main.newTemplateCmd.func2
        helm.sh/helm/v3/cmd/helm/template.go:82
github.com/spf13/cobra.(*Command).execute
        github.com/spf13/[email protected]/command.go:856
github.com/spf13/cobra.(*Command).ExecuteC
        github.com/spf13/[email protected]/command.go:974
github.com/spf13/cobra.(*Command).Execute
        github.com/spf13/[email protected]/command.go:902
main.main
        helm.sh/helm/v3/cmd/helm/helm.go:83
runtime.main
        runtime/proc.go:255
runtime.goexit
        runtime/asm_amd64.s:1581

The incorrectly rendered YAML (from the debug output):

              fieldPath: metadata.annotations['pause']
      tolerations:
              - effect: NoSchedule
          key: dedicated
          operator: Equal
          value: dba
      containers:

It looks like the toYaml output is prefixed with the template line's own leading whitespace. Changing the tolerations section of controller-manager-deployment.yaml from

      {{- with .Values.controllerManager.tolerations }}
      tolerations:
      {{ toYaml . | indent 8 }}
      {{- end }}

to

      {{- with .Values.controllerManager.tolerations }}
      tolerations:
{{ toYaml . | indent 6 }}
      {{- end }}

fixes it. The nodeSelector and affinity sections (and anything else rendered via toYaml) probably have the same problem.
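An equivalent fix, assuming Helm 3's sprig functions, is nindent, which emits its own leading newline so the template line's indentation no longer leaks into the rendered output:

```yaml
      {{- with .Values.controllerManager.tolerations }}
      tolerations:
      {{- toYaml . | nindent 8 }}
      {{- end }}
```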

readonly db not working

```yaml
apiVersion: polardbx.aliyun.com/v1
kind: XStore
metadata:
  name: polardb-readonly
spec:
  config:
    controller:
      RPCProtocolVersion: 1
    engine: {}
  readonly: true
  primaryXStore: polardb-master # name of the primary instance
  topology:
    nodeSets:
    - name: readonly
      replicas: 1
      role: Learner
    template:
      spec:
        image: image
        resources:
          limits:
            cpu: "2"
            memory: 8Gi
          requests:
            cpu: "2"
            memory: 4Gi
        affinity:
          podAntiAffinity:
            preferredDuringSchedulingIgnoredDuringExecution:
            - podAffinityTerm:
                labelSelector:
                  matchLabels:
                    xstore/node-role: candidate
                topologyKey: kubernetes.io/hostname
              weight: 1
          nodeAffinity:
            preferredDuringSchedulingIgnoredDuringExecution:
            - preference:
                matchExpressions:
                - key: mks.components
                  operator: In
                  values:
                  - "enable"
              weight: 1
        hostNetwork: false
  parameterTemplate:
    name: product-8032
  serviceType: ClusterIP
```

polardbx-operator install MountVolume.SetUp failed for volume "etclocaltime" : hostPath type check failed: /etc/localtime is not a file

Environment: installing on a single-node Rancher cluster that itself runs in Docker on a single machine.
[root@test]# kubectl -n polardbx get pods
NAME READY STATUS RESTARTS AGE
polardbx-c00-gt96-dn-0-cand-0 0/3 ContainerCreating 0 24m
polardbx-c00-gt96-dn-0-cand-1 0/3 ContainerCreating 0 24m
polardbx-c00-gt96-dn-0-log-0 0/3 ContainerCreating 0 24m
polardbx-c00-gt96-gms-cand-0 0/3 ContainerCreating 0 24m
polardbx-c00-gt96-gms-cand-1 0/3 ContainerCreating 0 24m
polardbx-c00-gt96-gms-log-0 0/3 ContainerCreating 0 24m
polardbx-controller-manager-7b4f47dfcf-5txkz 1/1 Running 0 38m
polardbx-hpfs-2fq8x 1/1 Running 0 38m
polardbx-tools-updater-jd829 1/1 Running 0 38m

kubectl -n polardbx describe pod polardbx-c00-gt96-dn-0-cand-0
Events:
Type Reason Age From Message


Normal Scheduled 17s default-scheduler Successfully assigned polardbx/polardbx-c00-gt96-dn-0-cand-0 to dz-fanjie
Warning FailedMount 2s (x6 over 17s) kubelet MountVolume.SetUp failed for volume "etclocaltime" : hostPath type check failed: /etc/localtime is not a file
Warning FailedMount 2s (x6 over 17s) kubelet MountVolume.SetUp failed for volume "zoneinfo" : hostPath type check failed: /usr/share/zoneinfo is not a directory

The errors above occur even though both paths exist on the host machine and inside the Docker container.
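The check that fails is the type field on the generated hostPath volumes; a sketch of what they presumably look like is below (the volume names come from the events above, the rest is assumed). kubelet evaluates these paths on the node that runs the kubelet, so on a containerized Rancher node /etc/localtime may be a symlink or missing even though it exists on the physical host:

```yaml
# Presumed shape of the volumes the operator mounts; with type: File and
# type: Directory, kubelet rejects paths that do not resolve to exactly
# a regular file / a directory on the node.
volumes:
- name: etclocaltime
  hostPath:
    path: /etc/localtime
    type: File
- name: zoneinfo
  hostPath:
    path: /usr/share/zoneinfo
    type: Directory
```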

Request: support taint tolerations

Our scenario: in the development environment we added some dedicated worker nodes that PolarDB-X should have to itself. To keep other applications from consuming those resources, the nodes must be tainted, but then PolarDB-X itself is also excluded by the scheduler unless it tolerates the taints.
We would like to be able to configure tolerations or a RuntimeClass (see the sketch below).
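For reference, this is the standard Kubernetes shape being requested, assuming a hypothetical dedicated=polardbx:NoSchedule taint on the reserved nodes:

```yaml
# Illustrative taint/toleration pair; key and value are placeholders.
# Node side: kubectl taint nodes <node-name> dedicated=polardbx:NoSchedule
tolerations:
- key: "dedicated"
  operator: "Equal"
  value: "polardbx"
  effect: "NoSchedule"
```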

Failed to install polardbx-operator in OrbStack

Hi, we use the https://orbstack.dev/ Kubernetes environment for development.

We install version v1.5.0 with helm install --namespace polardbx-operator-system polardbx-operator polardbx/polardbx-operator,
but polardbx-hpfs fails to start with the following error message:

2023-11-17T14:43:20.414Z	INFO	getting file lock...
2023-11-17T14:43:20.414Z	INFO	locked, continue
2023-11-17T14:43:20.414Z	INFO	starting grpc service...
2023-11-17T14:43:20.415Z	INFO	Start filestream server
Loading block devices...
failed to create local file service: stat /dev/vda: no such file or directory

Health checks never pass

Client: 10.1.8.242:57568
GET /liveness HTTP/1.1
Host: 10.1.8.242:23234
user-agent: kube-probe/1.21
accept: */*
probe-extra: galaxy
probe-port: 17997
probe-target: xstore
connection: close
2022/08/12 21:20:44 Succeeds!
2022/08/12 21:20:44 Failed!

cannot create pod

$ kubectl get polardbxcluster -w
NAME GMS CN DN CDC PHASE DISK AGE
liyz 0/1 0/1 0/1 0/1 Creating 42m
It is always stuck waiting.
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
liyz-9gk8-dn-0-single-0 2/3 Running 0 43m
liyz-9gk8-gms-single-0 2/3 Running 0 43m
$ kubectl logs --tail=30 liyz-9gk8-dn-0-single-0 --all-containers=true
INFO:root:Begin to initialize...
2022-04-07 08:31:23,861 - GalaxyEngine - INFO - (cwd=/data/mysql/data) execute command: /opt/galaxy_engine/bin/mysqld --defaults-file=/data/mysql/conf/my.cnf --loose-pod-name=liyz-9gk8-dn-0-single-0 --loose-cluster-info=liyz-9gk8-dn-0-single-0:25047@1 --initialize-insecure
INFO:root:Initialized!
INFO:root:Bootstrapping engine galaxy ...
2022-04-07 08:31:27,665 - GalaxyEngine - INFO - starting process...
2022-04-07 08:31:27,665 - GalaxyEngine - INFO - () start command: /opt/galaxy_engine/bin/mysqld_safe --defaults-file=/data/mysql/conf/my.cnf --loose-pod-name=liyz-9gk8-dn-0-single-0
2022-04-07T08:31:28.433699Z mysqld_safe Logging to '/data/mysql/log/alert.log'.
2022-04-07T08:31:28.447407Z mysqld_safe Starting mysqld daemon with databases from /data/mysql/data
level=info ts=2022-04-07T08:32:02.795Z caller=mysqld_exporter.go:277 msg="Starting mysqld_exporter" version="(version=0.13.0, branch=master, revision=4e8e2b98de1c2bcfed7bf0a5c08b3a885b1cbf46)"
level=info ts=2022-04-07T08:32:02.795Z caller=mysqld_exporter.go:278 msg="Build context" (go=go1.16.9, user=root@eafbaf7294c5, date=20211023-07:52:42)=(MISSING)
level=info ts=2022-04-07T08:32:02.795Z caller=mysqld_exporter.go:293 msg="Scraper enabled" scraper=global_status
level=info ts=2022-04-07T08:32:02.795Z caller=mysqld_exporter.go:293 msg="Scraper enabled" scraper=global_variables
level=info ts=2022-04-07T08:32:02.795Z caller=mysqld_exporter.go:293 msg="Scraper enabled" scraper=slave_status
level=info ts=2022-04-07T08:32:02.795Z caller=mysqld_exporter.go:293 msg="Scraper enabled" scraper=info_schema.processlist
level=info ts=2022-04-07T08:32:02.795Z caller=mysqld_exporter.go:293 msg="Scraper enabled" scraper=info_schema.innodb_tablespaces
level=info ts=2022-04-07T08:32:02.795Z caller=mysqld_exporter.go:293 msg="Scraper enabled" scraper=info_schema.innodb_metrics
level=info ts=2022-04-07T08:32:02.795Z caller=mysqld_exporter.go:293 msg="Scraper enabled" scraper=info_schema.innodb_cmp
level=info ts=2022-04-07T08:32:02.795Z caller=mysqld_exporter.go:293 msg="Scraper enabled" scraper=info_schema.innodb_cmpmem
level=info ts=2022-04-07T08:32:02.795Z caller=mysqld_exporter.go:293 msg="Scraper enabled" scraper=info_schema.query_response_time
level=info ts=2022-04-07T08:32:02.795Z caller=mysqld_exporter.go:293 msg="Scraper enabled" scraper=engine_innodb_status
level=info ts=2022-04-07T08:32:02.795Z caller=mysqld_exporter.go:303 msg="Listening on address" address=:21897
level=info ts=2022-04-07T08:32:02.795Z caller=tls_config.go:191 msg="TLS is disabled." http2=false
Client: 192.168.49.2:47238
GET /liveness HTTP/1.1
Host: 192.168.49.2:22096
probe-port: 17047
probe-target: xstore
connection: close
user-agent: kube-probe/1.23
accept: */*
probe-extra: galaxy
Client: 192.168.49.2:47236
GET /readiness HTTP/1.1
Host: 192.168.49.2:22096
probe-extra: galaxy
probe-port: 17047
probe-target: xstore
connection: close
user-agent: kube-probe/1.23
accept: */*
2022/04/07 09:12:00 Succeeds!
2022/04/07 09:12:00 Failed!
Client: 192.168.49.2:47254
GET /readiness HTTP/1.1
Host: 192.168.49.2:22096
probe-target: xstore
connection: close
user-agent: kube-probe/1.23
accept: */*
probe-extra: galaxy
probe-port: 17047
2022/04/07 09:12:01 Failed!
