percona / pmm

Percona Monitoring and Management: an open source database monitoring, observability and management tool

Home Page: https://www.percona.com/software/database-tools/percona-monitoring-and-management

License: GNU Affero General Public License v3.0

Go 96.47% Makefile 0.77% HTML 0.26% Shell 1.33% PLpgSQL 0.14% Python 0.30% Dockerfile 0.12% CSS 0.01% JavaScript 0.03% TypeScript 0.57%
pmm monitoring database database-management monitoring-server observability hacktoberfest

pmm's Introduction

Percona Monitoring and Management


A single pane of glass to easily view and monitor the performance of your MySQL, MongoDB, PostgreSQL, and MariaDB databases.

Percona Monitoring and Management (PMM) is a best-of-breed open source database monitoring solution. It helps you reduce complexity, optimize performance, and improve the security of your business-critical database environments, no matter where they are located or deployed. PMM helps users to:

  • Reduce Complexity
  • Optimize Database Performance
  • Improve Data Security

See the PMM Documentation for more information.

Use Cases

  • Monitor your database performance with customizable dashboards and real-time alerting.
  • Spot critical performance issues faster, understand the root cause of incidents better and troubleshoot them more efficiently.
  • Zoom in and drill down into database performance, from the node level down to individual queries. Perform in-depth troubleshooting and performance optimization.
  • Built-in Advisors run regular checks of the databases connected to PMM. The checks identify and alert you to potential security threats, performance degradation, data loss, and data corruption.
  • DBaaS: Create and configure database clusters no matter where the infrastructure is deployed.
  • Backup and restore databases up to a specific moment with Point-in-Time-Recovery.

Architecture

Please check our Documentation for the current architecture.

Overall Architecture

PMM Server

PMM Client

Installation

There are a number of installation methods; please check our Setting Up documentation page.

But in a nutshell:

  1. Download the PMM Server Docker image:
$ docker pull percona/pmm-server:2
  2. Create the data volume container:
$ docker volume create pmm-data
  3. Run the PMM Server container:
$ docker run --detach --restart always \
--publish 443:443 \
--volume pmm-data:/srv \
--name pmm-server \
percona/pmm-server:2
  4. Open a web browser and enter the server name or IP address of the PMM Server host in the address bar.

Enter the username and password. The defaults are username: admin and password: admin

How to get involved

We encourage contributions and are always looking for new members that are as dedicated to serving the community as we are.

If you’re looking for information about how you can contribute, we have contribution guidelines across all our repositories in CONTRIBUTING.md files. Some of them may simply link to the main project repository's contribution guidelines.

We're looking forward to your contributions and hope to hear from you soon on our Forums.

Submitting Bug Reports

If you find a bug in Percona Monitoring and Management or one of the related projects, you should submit a report to that project's JIRA issue tracker. Some related projects also have GitHub Issues enabled, so you could submit your report there instead.

Your first step should be to search the existing set of open tickets for a similar report. If you find that someone else has already reported your problem, then you can upvote that report to increase its visibility.

If there is no existing report, submit a report following these steps:

  1. Sign in to Percona JIRA. You will need to create an account if you do not have one.
  2. Go to the Create Issue screen and select the relevant project.
  3. Fill in the Summary, Description, Steps To Reproduce, and Affects Version fields to the best of your ability. If the bug corresponds to a crash, attach the stack trace from the logs.

An excellent resource is Elika Etemad's article on filing good bug reports.

As a general rule of thumb, please try to create bug reports that are:

  • Reproducible. Include steps to reproduce the problem.
  • Specific. Include as much detail as possible: which version, what environment, etc.
  • Unique. Do not duplicate existing tickets.

Licensing

Percona is dedicated to keeping open source open. Wherever possible, we strive to include permissive licensing for both our software and documentation. For this project, we are using the GNU AGPLv3 license.

pmm's People

Contributors

ademidoff, adivinho, aleksi, artemgavrilov, arvenil, askomorokhov, bupychuk, dasio, delgod, dependabot[bot], dliakhov, fiowro, gen1us2k, idexter, idoqo, jirictvrtka, michaelcoburn, michal-kralik, nikita-b, oter, palash25, pavelkhripkov, percona-csalguero, qwest812, ritbl, rnovikovp, roman-vynar, talhabinrizwan, tshcherban, yaroslavpodorvanov


pmm's Issues

Add more unit tests

Description

We need to increase test coverage, so new tests are welcome!

Suggested solution

No response

Additional context

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct

PMM Clickhouse Datasource Exception

Description

We’re currently running PMM on a VM, with a Docker container running pmm-server. To embed the dashboards, we have set up a proxy VM which basically adds an auth header to all requests passed on to the PMM endpoint (percona.domain.in).

Embedding the dashboards works fine for dashboards that use the Prometheus data source. However, we’ve been facing a weird issue when querying the ClickHouse-based datasource dashboard through the proxy VM.

Expected Results

The Clickhouse Query should be executed successfully.

Actual Results

Here’s what’s happening:

  • While opening a dashboard on percona.domain.in, a query is sent to the endpoint percona.domain.in/api/datasources/proxy/3?query='Clickhouse Query'
  • While embedding a dashboard on the proxy VM (proxy.domain.in), a query is sent to the endpoint proxy.domain.in/api/datasources/proxy/3?query='Clickhouse Query'

However, when the query is sent to the API, ClickHouse throws the following error:

Poco::Exception. Code: 1000, e.code() = 0, e.displayText() = No authentication information found: Basic authentication expected (version 21.3.14.1 (official build))

To debug the above error, we enabled basic auth in the ClickHouse datasource. Initially we tried this datasource config on percona.domain.in, and it worked perfectly fine. However, the same config (basic auth enabled) threw the same error when used through the proxy.

Any suggestions on handling this exception?

Version

PMM Version: 2.20.0

Steps to reproduce

Proxy the request sent to PMM for any Clickhouse based Datasource.

Relevant logs

No response

Code of Conduct

  • I agree to follow Percona Community Code of Conduct

License Policy Violation detected in github.com/percona/percona-toolkit-v3.2.1+incompatible

License Policy Violation detected in github.com/percona/percona-toolkit-v3.2.1+incompatible

Library - github.com/percona/percona-toolkit-v3.2.1+incompatible

Library home page: https://proxy.golang.org/github.com/percona/percona-toolkit/@v/v3.2.1+incompatible.zip

Dependency Hierarchy:

  • github.com/percona/percona-toolkit-v3.2.1+incompatible (Library containing License Policy Violation)

Found in base branch: main

📃 License Details

GPL 2.0

    ⛔ License Policy Violation - No GPL

License Policy Violation detected in github.com/percona/go-mysql-v0.0.0-20210427141028-73d29c6da78c - autoclosed

License Policy Violation detected in github.com/percona/go-mysql-v0.0.0-20210427141028-73d29c6da78c

Library - github.com/percona/go-mysql-v0.0.0-20210427141028-73d29c6da78c

Go packages for MySQL

Library home page: https://proxy.golang.org/github.com/percona/go-mysql/@v/v0.0.0-20210427141028-73d29c6da78c.zip

Dependency Hierarchy:

  • github.com/percona/go-mysql-v0.0.0-20210427141028-73d29c6da78c (Library containing License Policy Violation)

Found in base branch: main

📃 License Details

AGPL 3.0

BSD 3

    ⛔ License Policy Violation - No GPL

Helm deploy not connecting agent GRPC error

Description


I deployed PMM Server in Kubernetes with Helm and cannot connect pmm-agent. But when I deploy with Docker on a different machine, it connects, and I don't understand why. In Kubernetes I have an Ingress with a TLS domain name, and behind the Ingress is the PMM Server deployment.

Expected Results

How to solve this problem?

Actual Results

Problem with ingress!

Version

PMM-server:2.37.0 Client 2.37

Steps to reproduce

No response

Relevant logs

No response

Code of Conduct

  • I agree to follow Percona Community Code of Conduct

Replace deprecated Azure SDK lib

Description

We have this linter warning:

SA1019: "github.com/Azure/azure-sdk-for-go/services/resourcegraph/mgmt/2019-04-01/resourcegraph" is deprecated: Please note, this package has been deprecated. A replacement package is available [github.com/Azure/azure-sdk-for-go/sdk/resourcemanager/resourcegraph/armresourcegraph](https://pkg.go.dev/github.com/Azure/azure-sdk-for-go/sdk/resourcemanager/resourcegraph/armresourcegraph). We strongly encourage you to upgrade to continue receiving updates. See [Migration Guide](https://aka.ms/azsdk/golang/t2/migration) for guidance on upgrading. Refer to our [deprecation policy](https://azure.github.io/azure-sdk/policies_support.html) for more details. (staticcheck)

So we should replace the deprecated package used in our codebase. Here is where we import it:

"github.com/Azure/azure-sdk-for-go/services/resourcegraph/mgmt/2019-04-01/resourcegraph"

Suggested solution

Follow the migration guide referenced in the linter warning.

Additional context

https://aka.ms/azsdk/golang/t2/migration

Code of Conduct

  • I agree to follow this project's Code of Conduct

Help for Remove Service incorrectly identifies optional arguments

Description

Impact on the user
Unable to correctly identify the necessary arguments ahead of execution

Workaround
Trial and error at the command line

Details
The help information for removing a service is inaccurate and does not help the user successfully execute the action the first time without additional knowledge.

Expected Results

The help information is sufficient to allow the user to successfully identify the arguments required to execute the action.

Actual Results

Help information identifies both the flags and the service-name as optional arguments. In addition, the help information does not state anything useful alongside the service-id option to clarify matters. Without extra knowledge, the removal will fail because either the service-name or the service-id is missing.

Version

client 2.28

Steps to reproduce

  1. Register a new PMM agent using pmm-agent setup, or similar
  2. Add at least two services of the same type (mysql, mongodb, postgresql, proxysql, haproxy, external) using pmm-admin.
  3. Use pmm-admin remove --help to identify the necessary arguments to remove the service
  4. Remove one of the services with pmm-admin remove <service-type> command

Relevant logs

No response

Code of Conduct

  • I agree to follow Percona Community Code of Conduct

License Policy Violation detected in github.com/AZURE/azure-sdk-for-go-v66.0.0+incompatible

License Policy Violation detected in github.com/AZURE/azure-sdk-for-go-v66.0.0+incompatible

Library - github.com/AZURE/azure-sdk-for-go-v66.0.0+incompatible

Library home page: https://proxy.golang.org/github.com/azure/azure-sdk-for-go/@v/v66.0.0+incompatible.zip

Dependency Hierarchy:

  • github.com/AZURE/azure-sdk-for-go-v66.0.0+incompatible (Library containing License Policy Violation)

Found in base branch: main

📃 License Details

LGPL

MIT

    ⛔ License Policy Violation - No GPL

Question about integration with Grafana OSS

Description

Hello,

We are working with OSS PMM; we implemented it and things are amazing, especially QAN. But we are asking ourselves whether we can use an external Grafana and VictoriaMetrics. The reason I'm asking is that our company's corporate Grafana has great features for alerting and displaying metrics, and we'd like to simply plug PMM into this Grafana OSS.

Suggested solution

Integrate with Grafana OSS

Additional context

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct

Enable temporarily disabled linters

Description

In the golangci configuration file, we have a bunch of temporarily disabled linters.

Suggested solution

We want to review the disabled linters one by one. Our options are:

  1. enable the rule and fix any resulting warnings, or
  2. keep the rule disabled and leave a comment explaining why.

It's also acceptable to leave a couple of rules disabled if we realize we don't need them.

Additional context

List of currently disabled linters:

  • bodyclose
  • containedctx
  • contextcheck
  • deadcode
  • dupl
  • errcheck
  • errorlint
  • execinquery
  • forbidigo
  • forcetypeassert
  • gocognit
  • gocritic
  • godot
  • godox
  • golint
  • gosimple
  • interfacebloat
  • ireturn
  • maintidx
  • nakedret
  • nestif
  • nilnil
  • nilerr
  • nonamedreturns
  • noctx
  • nosnakecase
  • nosprintfhostport
  • paralleltest
  • prealloc
  • predeclared
  • revive
  • tagliatelle
  • thelper
  • tparallel
  • usestdlibvars
  • whitespace

Code of Conduct

  • I agree to follow this project's Code of Conduct

-remoteWrite.rateLimit for vmagent

Description

When a lot of buffered data from many clients is sent all at once, after pmm-server was down or unreachable for a while, the server or network can be overwhelmed.
This could be counteracted by setting a rate limit for vmagent.

Suggested solution

Allow configuration of the vmagent launch arguments somewhere

Additional context

https://docs.victoriametrics.com/vmagent.html#advanced-usage:~:text=%2DremoteWrite.rateLimit%20array,the%20remote%20storage
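
As a sketch of what such a configuration could look like (this is not a currently supported PMM option): vmagent's -remoteWrite.rateLimit flag takes a limit in bytes per second for the corresponding -remoteWrite.url, so a hypothetical launch with a roughly 1 MB/s cap might be:

```
# Hypothetical vmagent invocation; the URL and the 1 MB/s value are
# illustrative, but the -remoteWrite.rateLimit flag itself is real.
/usr/sbin/vmagent \
  -remoteWrite.url=http://pmm-server/victoriametrics/api/v1/write \
  -remoteWrite.rateLimit=1048576
```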

Code of Conduct

  • I agree to follow this project's Code of Conduct

DATA_RETENTION env variable can only be expressed in hours

Description

I am running pmm-server on Docker, and trying to set the data retention to 90 days.

When setting the DATA_RETENTION variable value to 90 following the documentation here, I get an error stating that 90 is an invalid value.

I was able to track the error log to these sections in the code:
Environment variable parsing
Value checking

The issue is that the time.ParseDuration function in the second link accepts a specific duration format, not just a bare number.

If I set the value to 2160h (90 * 24), the container starts with no issues.

Please fix the docs with the right values for the variable.

Expected Results

The container starts and the retention is set to the configured values.

Actual Results

The container doesn't start with an error log stating the value is not correct.

Version

PMM Server v2.32.0 (Docker)

Steps to reproduce

Set the environment variable DATA_RETENTION to a bare numeric value such as 90.

Relevant logs

ERRO[2023-01-24T07:49:28.358+00:00] Configuration error: environment variable "DATA_RETENTION=90" has invalid duration 90.

Code of Conduct

  • I agree to follow Percona Community Code of Conduct

Fix migration test

Description

We have a test that was written the wrong way and later updated in such a way that it now tests nothing. The test is https://github.com/percona/pmm/blob/main/managed/models/database_test.go#L417.
It's supposed to test the migration from 58 to 59, to verify that the migration didn't break anything for mongodb_exporters created with the stats_collector field as an array.
Suggested implementation:
Change the migration value from 68 to 58.
Rewrite the methods used in this test so the migration works from 58 to the latest one. Instead of using reform, let's use an SQL query to insert agents, as we do in the test above.

Expected Results

Test should test migration from 58 to 59.

Actual Results

It's testing a different migration.

Version

PMM Server 2.31

Steps to reproduce

Change the migration number from 68 to 58 in https://github.com/percona/pmm/blob/main/managed/models/database_test.go#L417 and run the test.

Relevant logs

/root/go/src/github.com/percona/pmm/managed/models/database_test.go:433
            	Error:      	Received unexpected error:
            	            	pq: column agents.process_exec_path does not exist
            	            	github.com/percona/pmm/managed/models.checkUniqueAgentID
            	            		/root/go/src/github.com/percona/pmm/managed/models/agent_helpers.go:151
            	            	github.com/percona/pmm/managed/models.createPMMAgentWithID
            	            		/root/go/src/github.com/percona/pmm/managed/models/agent_helpers.go:533
            	            	github.com/percona/pmm/managed/models.CreatePMMAgent
            	            		/root/go/src/github.com/percona/pmm/managed/models/agent_helpers.go:563
            	            	github.com/percona/pmm/managed/models_test.TestDatabaseMigrations.func4
            	            		/root/go/src/github.com/percona/pmm/managed/models/database_test.go:432
            	            	testing.tRunner
            	            		/usr/local/go/src/testing/testing.go:1439
            	            	runtime.goexit
            	            		/usr/local/go/src/runtime/asm_amd64.s:1571
            	Test:       	TestDatabaseMigrations/stats_collections_field_migration:_string_array_to_string_array

Code of Conduct

  • I agree to follow Percona Community Code of Conduct

Cannot get security credentials if run on AWS FarGate

Description

Credentials cannot be obtained from the AWS auth chain when running on AWS Fargate rather than AWS EC2, because the correct value of AWS_CONTAINER_CREDENTIALS_RELATIVE_URI is missing.

Suggested solution

rds.txt

Additional context

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct

Localhost URLs in alert messages

Description

Alert messages sent by PMM installed with the Helm chart contain broken links, with localhost in the URLs instead of the configured domain name. Related config file inside the pmm container:

[root@pmm-0 opt]# cat /etc/supervisord.d/alertmanager.ini 
; Managed by pmm-managed. DO NOT EDIT.

[program:alertmanager]
priority = 8
command =
        /usr/sbin/alertmanager
                --config.file=/etc/alertmanager.yml
                --storage.path=/srv/alertmanager/data
                --data.retention=336h
                --web.external-url=http://localhost:9093/alertmanager/
                --web.listen-address=127.0.0.1:9093
                --cluster.listen-address=""
...

(--web.external-url= seems to contain the wrong domain)
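
For reference, a correctly generated config would presumably render the flag with the externally reachable address instead of localhost; pmm.example.com below is a placeholder, not the actual fix:

```
--web.external-url=https://pmm.example.com/alertmanager/
```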

Expected Results

Links that lead to the alert page

Actual Results

Broken links

Version

PMM Helm Chart 1.2.0
PMM Server 2.35.0

Steps to reproduce

  • install PMM on kubernetes / docker
  • add an alert contact point
  • send a test alert

Relevant logs

No response

Code of Conduct

  • I agree to follow Percona Community Code of Conduct

Replace percona-platform/setup-go, percona-platform/checkout and percona-platform/cache GH Actions with their upstream variants

Description

We are using forked GH Actions. Originally this was done for security reasons, but we figured out that it takes too much effort to support them. So instead, let's switch to the upstream actions pinned to specific versions.

When the PR is ready, we should ask the repo admin (@atymchuk) to allow these particular versions in the repo settings.

P.S. If you find other percona-platform/* actions in our CI, you can replace them as well.

Suggested solution

No response

Additional context

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct

Enable new Access Role feature to be assigned to teams

Description

Access roles feature is great for limiting user metric access.

As far as I can see, roles need to be assigned per user. If there are many users, who will manage all these access roles for each of them?

Suggested solution

A solution could be to assign access roles to teams: if a user is a member of a team, they would automatically obtain all the access roles assigned to that team. This would make management easier, since there are likely to be fewer teams than users.
Of course, it should remain possible to assign other access roles explicitly, besides those obtained from team roles.

Additional context

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct

Add support for github.com/daixiang0/gci 0.9.0

Description

Checks with the upgraded github.com/daixiang0/gci are failing because it changed formatting, so we should update our code or the gci config to make the checks pass.
Failing PR: #1439

Expected Results

Checks pass with new gci

Actual Results

Checks are failing

Version

#1439

Steps to reproduce

No response

Relevant logs

HEAD detached at pull/1439/merge
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
	modified:   managed/utils/tests/fuzz.go

no changes added to commit (use "git add" and/or "git commit -a")
diff --git a/managed/utils/tests/fuzz.go b/managed/utils/tests/fuzz.go
index 91dfc26..6c5dc12 100644
--- a/managed/utils/tests/fuzz.go
+++ b/managed/utils/tests/fuzz.go
@@ -15,8 +15,7 @@
 
 package tests
 
-import (
-	// go-fuzz uses SHA1 for non-cryptographic hashing
+import ( // go-fuzz uses SHA1 for non-cryptographic hashing
 	"crypto/sha1" //nolint:gosec
 	"fmt"
 	"os"

Code of Conduct

  • I agree to follow Percona Community Code of Conduct

Pass VM config through API instead of filesystem

Description

What should be done: pmm-managed should pass the VictoriaMetrics config using an API instead of the filesystem

Benefit: it will remove one more dependency on the file system

Suggested solution

Suggested implementation: the -promscrape.config flag of VictoriaMetrics supports HTTP endpoints, so we should create a new endpoint in pmm-managed and point VictoriaMetrics to it.
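
As a rough illustration of the suggested approach (the handler path and the YAML payload below are hypothetical; only the fact that -promscrape.config accepts HTTP endpoints comes from this issue), pmm-managed could serve the generated config over HTTP roughly like this:

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// buildScrapeConfig stands in for pmm-managed's real config generation;
// the YAML here is a placeholder, not PMM's actual configuration.
func buildScrapeConfig() string {
	return "scrape_configs:\n  - job_name: example\n    static_configs:\n      - targets: ['127.0.0.1:9100']\n"
}

func main() {
	mux := http.NewServeMux()
	// Hypothetical endpoint; VictoriaMetrics would then be launched with
	// something like -promscrape.config=http://pmm-managed/scrape-config
	// (the flag is real, the URL is illustrative).
	mux.HandleFunc("/scrape-config", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "text/yaml")
		io.WriteString(w, buildScrapeConfig())
	})

	// Serve and fetch the config once to show the round trip.
	srv := httptest.NewServer(mux)
	defer srv.Close()

	resp, err := http.Get(srv.URL + "/scrape-config")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	body, _ := io.ReadAll(resp.Body)
	fmt.Println(string(body))
}
```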

Additional context

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct

Test Taras

Description

Description

Expected Results

ER

Actual Results

AR

Version

2.31

Steps to reproduce

No response

Relevant logs

No response

Code of Conduct

  • I agree to follow Percona Community Code of Conduct

Please return the full environment overview dashboard

Description

In the latest version (2.32.0) you have changed the home page from the very convenient table-format environment overview dashboard to a consolidated, summed-up version.

Suggested solution

Please include it in its original form in all its glory in the next PMM version 🙏🏼

Additional context

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct

PXC Galera replication latency dashboard

Description

PXC/Galera replication latency graphs show no data, as they use a metric called mysql_global_status_wsrep_evs_repl_latency.
This metric is not available as an option in the metric browser; instead there is a metric called mysql_galera_evs_repl_latency_(avg_seconds|min_seconds|max_seconds|stdev).

Can you please update the dashboards with the new metric name:

  • PXC/Galera Cluster Summary (Panels: Average/Maximum Galera Replication Latency)
  • PXC/Galera Node Summary (Panel: Galera Replication Latency)
  • PXC/Galera Nodes Compare(Panel: Galera Replication Latency)

Replace these queries:
Query Maximum Latency:
Old:
avg by (service_name,aggregator) (max_over_time(mysql_global_status_wsrep_evs_repl_latency{service_name=~"$service_name", aggregator="Maximum"}[$interval]) or max_over_time(mysql_global_status_wsrep_evs_repl_latency{service_name=~"$service_name", aggregator="Maximum"}[5m]))

New:
avg by (service_name) (max_over_time(mysql_galera_evs_repl_latency_max_seconds{service_name=~"$service_name"}[$interval]) or max_over_time(mysql_galera_evs_repl_latency_max_seconds{service_name=~"$service_name"}[5m]))

Query Stdev Latency:
Old:
avg by (service_name,aggregator) (avg_over_time(mysql_global_status_wsrep_evs_repl_latency{service_name=~"$service_name", aggregator="Standard Deviation"}[$interval]) or avg_over_time(mysql_global_status_wsrep_evs_repl_latency{service_name=~"$service_name", aggregator="Standard Deviation"}[5m]))

New:
avg by (service_name) (avg_over_time(mysql_galera_evs_repl_latency_stdev{service_name=~"$service_name"}[$interval]) or avg_over_time(mysql_galera_evs_repl_latency_stdev{service_name=~"$service_name"}[5m]))

Query Average Latency:
Old:
avg by (service_name,aggregator) (avg_over_time(mysql_global_status_wsrep_evs_repl_latency{service_name=~"$service_name", aggregator="Average"}[$interval]) or avg_over_time(mysql_global_status_wsrep_evs_repl_latency{service_name=~"$service_name", aggregator="Average"}[5m]))

New:
avg by (service_name) (avg_over_time(mysql_galera_evs_repl_latency_avg_seconds{service_name=~"$service_name"}[$interval]) or avg_over_time(mysql_galera_evs_repl_latency_avg_seconds{service_name=~"$service_name"}[5m]))

Query Minimum Latency:
Old:
avg by (service_name,aggregator) (min_over_time(mysql_global_status_wsrep_evs_repl_latency{service_name=~"$service_name", aggregator="Minimum"}[$interval]) or min_over_time(mysql_global_status_wsrep_evs_repl_latency{service_name=~"$service_name", aggregator="Minimum"}[5m]))

New:
avg by (service_name) (min_over_time(mysql_galera_evs_repl_latency_min_seconds{service_name=~"$service_name"}[$interval]) or min_over_time(mysql_galera_evs_repl_latency_min_seconds{service_name=~"$service_name"}[5m]))

Expected Results

Showing data for galera replication latency

Actual Results

No Data

Version

PMM Server 2.34.0 > PMM Client 2.34.0
PMM Server 2.33.0 > PMM Client 2.33.0

Steps to reproduce

No response

Relevant logs

No response

Code of Conduct

  • I agree to follow Percona Community Code of Conduct

2.33.0 Docker image error (both after upgrade, and clean installation)

Description

I have yet to run the container image for 2.33.0 successfully.
I've tested upgrading (using my data container and changing the server image, which has worked flawlessly in the past).
I've also tested a clean installation, and at this point I haven't been able to start the pmm-server containers successfully.
The issue appears to occur at this point of the instantiation: pmm-server | 2023-01-26 06:21:04,252 INFO exited: qan-api2 (exit status 1; not expected), and Grafana goes into a crash-loop cycle.

Expected Results

Service starts correctly and presents login screen

Actual Results

500 Error in Nginx (bad gateway), resulting in unhealthy containers:

# docker ps -a
CONTAINER ID   IMAGE                                          COMMAND                  CREATED         STATUS                     PORTS                                      NAMES
1905e3414872   docker.redacted.net/percona/pmm-server:2.33.0   "/bin/bash -c 'ln -s…"   6 minutes ago   Up 6 minutes (unhealthy)   0.0.0.0:80->80/tcp, 0.0.0.0:443->443/tcp   pmm-server
3145d1d3f79e   docker.redacted.net/percona/pmm-server:2.33.0   "/bin/bash -c 'chown…"   7 minutes ago   Exited (0) 6 minutes ago                                              pmm-data

All of this points to Grafana not loading the pmm-app plugin.

Version

PMM Server 2.33.0

Steps to reproduce

Using this docker-compose.yaml file:

version: '2'

services:
  pmm-data:
    image: docker.redacted.net/percona/pmm-server:2.33.0
    container_name: pmm-data
    volumes:
      - /u01/srv:/srv

    command:
      - /bin/bash
      - -c
      - |
        chown -R root:clickhouse /srv/clickhouse
        chown -R grafana:grafana /srv/grafana
        chown -R pmm:pmm /srv/logs /srv/prometheus /srv/alertmanager
        chown -R postgres /srv/postgres
        chown -R postgres /srv/logs/postgresql.log
        /bin/true

  pmm-server:
    image: docker.redacted.net/percona/pmm-server:2.33.0
    container_name: pmm-server
    ports:
      - '80:80'
      - '443:443'
    restart: always
    environment:
      - DISABLE_UPDATES=true
      - DISABLE_TELEMETRY=true
      - DATA_RETENTION=2160h
    command:
      - /bin/bash
      - -c
      - |
        ln -s /srv/grafana /usr/share/grafana/data; grafana-cli --homepath /usr/share/grafana admin reset-admin-password 'redactedpassword'
        /opt/entrypoint.sh
    volumes_from:
      - pmm-data

Relevant logs

# from grafana.log
logger=sqlstore t=2023-01-26T06:43:09.279464905Z level=info msg="Connecting to DB" dbtype=sqlite3
logger=migrator t=2023-01-26T06:43:09.294504382Z level=info msg="Starting DB migrations"
logger=migrator t=2023-01-26T06:43:09.299234637Z level=info msg="migrations completed" performed=0 skipped=452 duration=546.206µs
logger=plugin.finder t=2023-01-26T06:43:09.350125134Z level=warn msg="Skipping finding plugins as directory does not exist" path=/usr/share/grafana/plugins-bundled
logger=plugin.loader t=2023-01-26T06:43:09.599976665Z level=warn msg="Skipping loading plugin due to problem with signature" pluginID=pmm-pt-summary-datasource status=unsigned
logger=plugin.loader t=2023-01-26T06:43:09.600021765Z level=warn msg="Skipping loading plugin due to problem with signature" pluginID=pmm-update status=unsigned
logger=plugin.loader t=2023-01-26T06:43:09.600045666Z level=warn msg="Skipping loading plugin due to problem with signature" pluginID=pmm-pt-summary-panel status=unsigned
logger=plugin.loader t=2023-01-26T06:43:09.600055366Z level=warn msg="Skipping loading plugin due to problem with signature" pluginID=pmm-qan-app-panel status=unsigned
logger=plugin.loader t=2023-01-26T06:43:09.600064166Z level=warn msg="Skipping loading plugin due to problem with signature" pluginID=pmm-check-panel-home status=unsigned
logger=plugin.loader t=2023-01-26T06:43:09.600071666Z level=warn msg="Skipping loading plugin due to problem with signature" pluginID=vertamedia-clickhouse-datasource status=unsigned
logger=plugin.loader t=2023-01-26T06:43:09.600096866Z level=warn msg="Skipping loading plugin due to problem with signature" pluginID=grafana-polystat-panel status=unsigned
logger=plugin.loader t=2023-01-26T06:43:09.600114366Z level=warn msg="Skipping loading plugin due to problem with signature" pluginID=pmm-app status=unsigned
logger=plugin.loader t=2023-01-26T06:43:09.600164667Z level=info msg="Plugin registered" pluginID=petrslavotinek-carpetplot-panel
logger=plugin.loader t=2023-01-26T06:43:09.600181567Z level=info msg="Plugin registered" pluginID=grafana-worldmap-panel
logger=plugin.loader t=2023-01-26T06:43:09.600189067Z level=info msg="Plugin registered" pluginID=grafana-piechart-panel
logger=plugin.loader t=2023-01-26T06:43:09.600195967Z level=info msg="Plugin registered" pluginID=yesoreyeram-boomtable-panel
logger=plugin.loader t=2023-01-26T06:43:09.600201868Z level=info msg="Plugin registered" pluginID=jdbranham-diagram-panel
logger=plugin.loader t=2023-01-26T06:43:09.600207368Z level=info msg="Plugin registered" pluginID=camptocamp-prometheus-alertmanager-datasource
logger=plugin.loader t=2023-01-26T06:43:09.600214868Z level=info msg="Plugin registered" pluginID=natel-discrete-panel
logger=secrets t=2023-01-26T06:43:09.600474271Z level=info msg="Envelope encryption state" enabled=true currentprovider=secretKey.v1
logger=query_data t=2023-01-26T06:43:09.603518606Z level=info msg="Query Service initialization"
logger=live.push_http t=2023-01-26T06:43:09.609171073Z level=info msg="Live Push Gateway initialization"
logger=infra.usagestats.collector t=2023-01-26T06:43:09.706840418Z level=info msg="registering usage stat providers" usageStatsProvidersLen=2
logger=provisioning t=2023-01-26T06:43:09.825091305Z level=error msg="Failed to provision plugins" error="app provisioning error: plugin not installed: \"pmm-app\""
Failed to start grafana. error: app provisioning error: plugin not installed: "pmm-app"
app provisioning error: plugin not installed: "pmm-app"

Code of Conduct

  • I agree to follow Percona Community Code of Conduct

Check the possibility of using the GitHub Actions version of reviewdog

Description

Currently we're using a local version of reviewdog in our CI. There is a GitHub Actions version: https://github.com/reviewdog/action-golangci-lint. The problem with using the Actions version is that we also use golangci-lint for local tests. If we use two different builds of golangci-lint (a locally built one for local tests and another one for the Actions version of reviewdog), we need to keep the golangci-lint versions in sync to be sure we're running the same linter version. We could consider simply using the latest version of golangci-lint if we can configure our CI that way.
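
A rough sketch of what the Actions-based setup might look like. The input names (`golangci_lint_version`, `reporter`) are assumptions taken from the action's README and should be verified against its current documentation before use:

```yaml
# .github/workflows/lint.yml (sketch, not the actual workflow)
name: lint
on: [pull_request]
jobs:
  golangci-lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: reviewdog/action-golangci-lint@v2
        with:
          # pin to the same version used for local tests to avoid drift
          golangci_lint_version: v1.51.2
          reporter: github-pr-review
```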

Suggested solution

No response

Additional context

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct

PMM Server 2.36.0 cannot restart successfully: postgresql fails

Description

I installed pxc-operator and pmm-server using helm chart 1.12.1. When PMM was first deployed, it started correctly. But after the pod restarted, the pg service kept failing.

2023-04-11 10:18:28,393 INFO exited: postgresql (exit status 1; not expected)
2023-04-11 10:18:28,546 INFO exited: qan-api2 (exit status 1; not expected)
2023-04-11 10:18:29,261 INFO success: pmm-update-perform-init entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-04-11 10:18:29,261 INFO success: clickhouse entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-04-11 10:18:29,261 INFO success: grafana entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-04-11 10:18:29,261 INFO success: nginx entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-04-11 10:18:29,292 INFO success: victoriametrics entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-04-11 10:18:29,308 INFO success: vmalert entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-04-11 10:18:29,308 INFO success: alertmanager entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-04-11 10:18:29,308 INFO success: vmproxy entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-04-11 10:18:29,308 INFO success: pmm-managed entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-04-11 10:18:29,308 INFO success: pmm-agent entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-04-11 10:18:29,422 INFO spawned: 'postgresql' with pid 153
2023-04-11 10:18:29,450 INFO exited: postgresql (exit status 1; not expected)
2023-04-11 10:18:29,568 INFO spawned: 'qan-api2' with pid 155
2023-04-11 10:18:30,561 INFO success: qan-api2 entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-04-11 10:18:31,570 INFO spawned: 'postgresql' with pid 185
2023-04-11 10:18:31,942 INFO exited: postgresql (exit status 1; not expected)
2023-04-11 10:18:35,111 INFO spawned: 'postgresql' with pid 231
2023-04-11 10:18:35,344 INFO exited: postgresql (exit status 1; not expected)
2023-04-11 10:18:39,669 INFO spawned: 'postgresql' with pid 260
2023-04-11 10:18:39,833 INFO exited: postgresql (exit status 1; not expected)
2023-04-11 10:18:44,966 INFO spawned: 'postgresql' with pid 344
2023-04-11 10:18:45,090 INFO exited: postgresql (exit status 1; not expected)
2023-04-11 10:18:46,956 INFO exited: pmm-update-perform-init (exit status 0; expected)
2023-04-11 10:18:52,051 INFO spawned: 'postgresql' with pid 396
2023-04-11 10:18:52,090 INFO exited: postgresql (exit status 1; not expected)
2023-04-11 10:18:59,145 INFO spawned: 'postgresql' with pid 397
2023-04-11 10:18:59,183 INFO exited: postgresql (exit status 1; not expected)
2023-04-11 10:19:07,246 INFO spawned: 'postgresql' with pid 399
2023-04-11 10:19:07,269 INFO exited: postgresql (exit status 1; not expected)
2023-04-11 10:19:16,478 INFO spawned: 'postgresql' with pid 402
2023-04-11 10:19:16,497 INFO exited: postgresql (exit status 1; not expected)
2023-04-11 10:19:26,712 INFO spawned: 'postgresql' with pid 404
2023-04-11 10:19:26,734 INFO exited: postgresql (exit status 1; not expected)
2023-04-11 10:19:27,713 INFO gave up: postgresql entered FATAL state, too many start retries too quickly

I checked the pg logs in /srv/logs and found that the pg directory permissions are not correct.

2023-04-11 10:18:52.087 UTC [396] FATAL:  data directory "/srv/postgres14" has invalid permissions
2023-04-11 10:18:52.087 UTC [396] DETAIL:  Permissions should be u=rwx (0700) or u=rwx,g=rx (0750).
2023-04-11 10:18:59.179 UTC [397] FATAL:  data directory "/srv/postgres14" has invalid permissions
2023-04-11 10:18:59.179 UTC [397] DETAIL:  Permissions should be u=rwx (0700) or u=rwx,g=rx (0750).
2023-04-11 10:19:07.267 UTC [399] FATAL:  data directory "/srv/postgres14" has invalid permissions
2023-04-11 10:19:07.267 UTC [399] DETAIL:  Permissions should be u=rwx (0700) or u=rwx,g=rx (0750).
2023-04-11 10:19:16.495 UTC [402] FATAL:  data directory "/srv/postgres14" has invalid permissions
2023-04-11 10:19:16.495 UTC [402] DETAIL:  Permissions should be u=rwx (0700) or u=rwx,g=rx (0750).
2023-04-11 10:19:26.731 UTC [404] FATAL:  data directory "/srv/postgres14" has invalid permissions
2023-04-11 10:19:26.731 UTC [404] DETAIL:  Permissions should be u=rwx (0700) or u=rwx,g=rx (0750).

I used the following commands to change the pg directory permissions and start pg. Pg started after the first change, but after I restarted the pod, the directory permissions were changed back by some unknown script or program, causing the same failure again.

chmod 700 -R /srv/postgres14
su postgres -c "/usr/pgsql-14/bin/pg_ctl start -D /srv/postgres14"

Expected Results

The directory permissions for postgres should not change; correct permissions are a mandatory requirement for pg startup.

Actual Results

The pg directory permissions are changed on pod restart, so pg fails to start.

Version

pmm-server and client 2.36.
OKD 4.11

Steps to reproduce

No response

Relevant logs

I checked the /srv permissions and found:


drwxrwsr-x. 13 root     pmm   4096 Apr  6 03:30 .
dr-xr-xr-x.  1 root     root    62 Apr 12 08:28 ..
drwxrwsr-x.  3 root     pmm   4096 Apr  6 03:29 alerting
drwxrwsr-x.  4 pmm      pmm   4096 Apr  6 03:29 alertmanager
drwxrwsr-x.  2 root     pmm   4096 Apr  6 03:30 backup
drwxrwsr-x. 13 root     pmm   4096 Apr 12 08:28 clickhouse
drwxrwsr-x.  6 grafana  pmm   4096 Apr 12 08:28 grafana
drwxrwsr-x.  2 pmm      pmm   4096 Apr 12 08:23 logs
drwxrws---.  2 root     pmm  16384 Apr  6 03:29 lost+found
drwxrwsr-x.  2 root     pmm   4096 Apr  6 03:29 nginx
-rw-rw-r--.  1 root     pmm      7 Apr  6 03:29 pmm-distribution
drwxrws---. 20 postgres pmm   4096 Apr 12 00:00 postgres14
drwxrwsr-x.  3 pmm      pmm   4096 Apr  6 03:29 prometheus
drwxrwsr-x.  3 pmm      pmm   4096 Apr  6 03:29 victoriametrics

Code of Conduct

  • I agree to follow Percona Community Code of Conduct

DATA_RETENTION documentation is misleading

Description

Env var DATA_RETENTION is documented here (https://hub.docker.com/r/percona/pmm-server) as
"DATA_RETENTION | How many days to keep time-series data in ClickHouse"

But actually its format is a Go duration, like 24h. Note that days are not supported, so specifying '30d' will cause an error.

Expected Results

Looks like the documentation should be updated. Thanks!

Actual Results

DATA_RETENTION=30

Configuration error: environment variable "DATA_RETENTION=30" has invalid duration 30. 

DATA_RETENTION=30d

Configuration error: environment variable "DATA_RETENTION=30d" has invalid duration 30d. 

Version

2.31

Steps to reproduce

No response

Relevant logs

No response

Code of Conduct

  • I agree to follow Percona Community Code of Conduct

Fix TestRDSService/DiscoverRDS/ListRegions

Description

TestRDSService/DiscoverRDS/ListRegions is failing with the new version of the AWS SDK because a new region was added, so we should add the new region to the test. Here is the link to the PR: #1442.
The new region name is "ap-south-2".
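
The fix is just to add the new entry to the expected slice in rds_test.go. Since the comparison is against a sorted list, an insertion helper like the following (illustrative only, not PMM code) shows where the entry belongs:

```go
package main

import (
	"fmt"
	"sort"
)

// insertRegion inserts a region into an already sorted slice of expected
// region names, keeping the slice sorted so it matches the SDK's output.
func insertRegion(regions []string, r string) []string {
	i := sort.SearchStrings(regions, r)
	regions = append(regions, "")
	copy(regions[i+1:], regions[i:])
	regions[i] = r
	return regions
}

func main() {
	expected := []string{"ap-northeast-3", "ap-south-1", "ap-southeast-1"}
	fmt.Println(insertRegion(expected, "ap-south-2"))
}
```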

Expected Results

TestRDSService/DiscoverRDS/ListRegions shouldn't fail

Actual Results

TestRDSService/DiscoverRDS/ListRegions is failing

Version

PMM Server 2.34.0-main

Steps to reproduce

Run the tests in this PR's branch: #1442

Relevant logs

--- FAIL: TestRDSService/DiscoverRDS (0.99s)
        --- FAIL: TestRDSService/DiscoverRDS/ListRegions (0.00s)
            rds_test.go:103: 
                	Error Trace:	/root/go/src/github.com/percona/pmm/managed/services/management/rds_test.go:103
                	Error:      	Not equal: 
                	            	expected: []string{"af-south-1", "ap-east-1", "ap-northeast-1", "ap-northeast-2", "ap-northeast-3", "ap-south-1", "ap-southeast-1", "ap-southeast-2", "ap-southeast-3", "ca-central-1", "cn-north-1", "cn-northwest-1", "eu-central-1", "eu-central-2", "eu-north-1", "eu-south-1", "eu-south-2", "eu-west-1", "eu-west-2", "eu-west-3", "me-central-1", "me-south-1", "sa-east-1", "us-east-1", "us-east-2", "us-gov-east-1", "us-gov-west-1", "us-iso-east-1", "us-iso-west-1", "us-isob-east-1", "us-west-1", "us-west-2"}
                	            	actual  : []string{"af-south-1", "ap-east-1", "ap-northeast-1", "ap-northeast-2", "ap-northeast-3", "ap-south-1", "ap-south-2", "ap-southeast-1", "ap-southeast-2", "ap-southeast-3", "ca-central-1", "cn-north-1", "cn-northwest-1", "eu-central-1", "eu-central-2", "eu-north-1", "eu-south-1", "eu-south-2", "eu-west-1", "eu-west-2", "eu-west-3", "me-central-1", "me-south-1", "sa-east-1", "us-east-1", "us-east-2", "us-gov-east-1", "us-gov-west-1", "us-iso-east-1", "us-iso-west-1", "us-isob-east-1", "us-west-1", "us-west-2"}
                	            	
                	            	Diff:
                	            	--- Expected
                	            	+++ Actual
                	            	@@ -1,2 +1,2 @@
                	            	-([]string) (len=32) {
                	            	+([]string) (len=33) {
                	            	  (string) (len=10) "af-south-1",
                	            	@@ -7,2 +7,3 @@
                	            	  (string) (len=10) "ap-south-1",
                	            	+ (string) (len=10) "ap-south-2",
                	            	  (string) (len=14) "ap-southeast-1",
                	Test:       	TestRDSService/DiscoverRDS/ListRegions

Code of Conduct

  • I agree to follow Percona Community Code of Conduct

SSH Tunnelling

Description

Hello, I was looking at Percona Server for MySQL 8.0 and Percona Monitoring and Management.

I was wondering if an architecture like this is possible.

SERVER A

  • Contains Podman instance of Percona Server running on localhost

COMPUTER B

  • Connected to server A over SSH, with a local port bound through an SSH tunnel to port 3306 of server A
  • Running Percona Monitoring Server and Client on Podman Desktop

Purpose: connecting from my computer to the web page to monitor the Percona Server on SERVER A.

I was looking at https://docs.percona.com/percona-monitoring-and-management/ but I have not found anything relevant about SSH tunnels.

Suggested solution

Add to https://docs.percona.com/percona-monitoring-and-management/ a section covering SSH tunnelling, or expand the details on remote monitoring.

Additional context

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct

PMM can't restore XtraDB from backup

Description

I installed test xtradb cluster (5.7 version) using this instruction:
https://docs.percona.com/percona-xtradb-cluster/5.7/overview.html

Install PMM server and pmm agent (on each mysql node), register service from each node, using this instruction:
https://docs.percona.com/percona-monitoring-and-management/setting-up/client/mysql.html

Install xtrabackup 2.4 on all mysql nodes using this instruction:
https://docs.percona.com/percona-xtrabackup/2.4/installation/yum_repo.html

Now I have the following installation:
node01 - xtradb 5.7 / xtrabackup 2.4 / pmm-agent2 2.37.1
node02 - xtradb 5.7 / xtrabackup 2.4 / pmm-agent2 2.37.1
node03 - xtradb 5.7 / xtrabackup 2.4 / pmm-agent2 2.37.1
node04 - percona/pmm-server:2 in Docker

After all the preparation I started a backup from node01 using the web interface. The backup completed without problems. After creating the backup I started restoring it to node01 using the web interface and got an error. The web interface shows no information about the error, but in /var/log/messages I see the following errors:

Jul  3 12:55:50 node01 systemd[1]: Stopping Percona XtraDB Cluster...
Jul  3 12:55:50 node01 mysql-systemd[1279656]: SUCCESS! Stopping Percona XtraDB Cluster......
Jul  3 12:56:00 node01 pmm-agent[1172959]: #033[33mWARN#033[0m[2023-07-03T12:56:00.196+00:00] Job terminated 
with error: signal: killed
Jul  3 12:56:00 node01 pmm-agent[1172959]: waiting systemctl stop command failed
Jul  3 12:56:00 node01 pmm-agent[1172959]: github.com/percona/pmm/agent/runner/jobs.stopMySQL
Jul  3 12:56:00 node01 pmm-agent[1172959]: #011/tmp/go/src/github.com/percona/pmm/agent/runner/jobs/mysql_restore_job.go:290
Jul  3 12:56:00 node01 pmm-agent[1172959]: github.com/percona/pmm/agent/runner/jobs.(*MySQLRestoreJob).Run
Jul  3 12:56:00 node01 pmm-agent[1172959]: #011/tmp/go/src/github.com/percona/pmm/agent/runner/jobs/mysql_restore_job.go:121
Jul  3 12:56:00 node01 pmm-agent[1172959]: github.com/percona/pmm/agent/runner.(*Runner).handleJob.func1
Jul  3 12:56:00 node01 pmm-agent[1172959]: #011/tmp/go/src/github.com/percona/pmm/agent/runner/runner.go:185
Jul  3 12:56:00 node01 pmm-agent[1172959]: runtime/pprof.Do
Jul  3 12:56:00 node01 pmm-agent[1172959]: #011/usr/local/go/src/runtime/pprof/runtime.go:40
Jul  3 12:56:00 node01 pmm-agent[1172959]: runtime.goexit
Jul  3 12:56:00 node01 pmm-agent[1172959]: #011/usr/local/go/src/runtime/asm_amd64.s:1594
Jul  3 12:56:00 node01 pmm-agent[1172959]: github.com/percona/pmm/agent/runner/jobs.(*MySQLRestoreJob).Run
Jul  3 12:56:00 node01 pmm-agent[1172959]: #011/tmp/go/src/github.com/percona/pmm/agent/runner/jobs/mysql_restore_job.go:122
Jul  3 12:56:00 node01 pmm-agent[1172959]: github.com/percona/pmm/agent/runner.(*Runner).handleJob.func1
Jul  3 12:56:00 node01 pmm-agent[1172959]: #011/tmp/go/src/github.com/percona/pmm/agent/runner/runner.go:185
Jul  3 12:56:00 node01 pmm-agent[1172959]: runtime/pprof.Do
Jul  3 12:56:00 node01 pmm-agent[1172959]: #011/usr/local/go/src/runtime/pprof/runtime.go:40
Jul  3 12:56:00 node01 pmm-agent[1172959]: runtime.goexit
Jul  3 12:56:00 node01 pmm-agent[1172959]: #011/usr/local/go/src/runtime/asm_amd64.s:1594  #033[33mcomponent#033[0m=runner #033[33mid#033[0m=/job_id/90a32d1b-4e89-4cc4-bd4d-36abe52c0da7 #033[33mtype#033[0m=mysql_restore
Jul  3 12:56:02 node01 pmm-agent[1172959]: [mysql] 2023/07/03 12:56:02 packets.go:122: closing bad idle connection: EOF

I am using AlmaLinux 8.7 on all servers. I also set up the same solution for backing up MongoDB (that works fine).

Can you help me with this problem?

Expected Results

I expect the backup to restore correctly.

Actual Results

The backup restore fails with errors.

Version

node01 - xtradb 5.7 / xtrabackup2.4 / pmm-agent2 2.37.1
node02 - xtradb 5.7 / xtrabackup2.4 / pmm-agent2 2.37.1
node03 - xtradb 5.7 / xtrabackup2.4 / pmm-agent2 2.37.1
node 04 - percona/pmm-server:2 in docker

Steps to reproduce

Install an XtraDB 5.7 cluster with 3 nodes, create a backup using the PMM web interface, then try to restore the backup.

Relevant logs

No response

Code of Conduct

  • I agree to follow Percona Community Code of Conduct

Dependency Dashboard

This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.

This repository currently has no open or pending branches.

Detected dependencies

None detected

MongoDB not showing correct query time: values treated as seconds but appear to be milliseconds

Description

We are trying to run Query Analytics on MongoDB. We are using Locust to run some load scripts. We get min and average response times in the range of a few ms, but PMM shows the min as ~1 minute and the max as ~2.5 hours. I am attaching some snapshots here.
The PMM query time average comes out as 1 min 57 sec here, though it should be around ~120 ms, because on Locust this time is around ~170 ms.

image

image

I checked this from mongo stats as well.
image

It seems that these values are in milliseconds but are being interpreted as seconds. That is why we are seeing such high values in the UI.

Expected Results

Query time values should be interpreted as milliseconds; they are currently treated as seconds.

Actual Results

Query time is considered in seconds.

Version

PMM server 2.30.0

Steps to reproduce

  1. enable DB profiling in mongo db
  2. push stats from mongo db query analytics stats from mongo to pmm
  3. check the UI min, max, average query time. it comes in minutes and way higher than expected

Relevant logs

No response

Code of Conduct

  • I agree to follow Percona Community Code of Conduct

MySQL Table Details dashboard showing No Data

Description

Dashboard showing no data. Hundreds of clients connected to PMM server.
Variable "Service Name" not loading any value.
Query used for service_name variable is
label_values(mysql_info_schema_table_version, service_name)

This raw metric also crashes VictoriaMetrics when used in Explore.
Only queries using "by" work, for example:
label_values(max by(service_name)(mysql_info_schema_table_version), service_name)

Maybe the solution is to use some kind of aggregation in the query, or simply change the metric to "mysql_up", so that it looks like:
label_values(mysql_up, service_name)

Expected Results

Showing data per service name

Actual Results

No data, empty var Service Name

Version

PMM Server 2.34 > PMM client 2.34

Steps to reproduce

No response

Relevant logs

No response

Code of Conduct

  • I agree to follow Percona Community Code of Conduct

Fix unstable TestVersionCache test

Description

The TestVersionCache test is unstable; it fails from time to time, so we need to fix it.

Expected Results

Test always passes

Actual Results

It randomly fails; restarting the CI job typically helps.

Version

Steps to reproduce

I saw this problem only in CI, so I'm not sure if it's reproducible in local environments.

Here is an example of a failed job: https://github.com/percona/pmm/actions/runs/3264069158/jobs/5364126371

Relevant logs

panic: Log in goroutine after TestVersionCache has completed: <<< SELECT "service_software_versions"."service_id", "service_software_versions"."service_type", "service_software_versions"."software_versions", "service_software_versions"."next_check_at", "service_software_versions"."created_at", "service_software_versions"."updated_at" FROM "service_software_versions" WHERE "service_software_versions"."service_id" = $1 LIMIT 1 [`service_id_1`] 18.464168ms

goroutine 37 [running]:
testing.(*common).logDepth(0xc000804ea0, {0xc000026d00, 0x18b}, 0x3)
	/usr/local/go/src/testing/testing.go:894 +0x6d9
testing.(*common).log(...)
	/usr/local/go/src/testing/testing.go:876
testing.(*common).Logf(0xc000804ea0, {0x203677c, 0x6}, {0xc000764230, 0x1, 0x1})
	/usr/local/go/src/testing/testing.go:927 +0xa5
gopkg.in/reform%2ev1.(*PrintfLogger).After(0xc0001f82c0, {0xc0008c2d80, 0x16a}, {0xc0008cc110?, 0x1, 0xc0008cc110?}, 0x1?, {0x0, 0x0})
	/root/go/pkg/mod/gopkg.in/[email protected]/logger.go:95 +0x44c
gopkg.in/reform%2ev1.(*Querier).logAfter(...)
	/root/go/pkg/mod/gopkg.in/[email protected]/querier.go:41
gopkg.in/reform%2ev1.(*Querier).QueryRow(0xc0004ecbe0, {0xc0008c2d80, 0x16a}, {0xc0008cc110, 0x1, 0x1})
	/root/go/pkg/mod/gopkg.in/[email protected]/querier.go:139 +0x211
gopkg.in/reform%2ev1.(*Querier).SelectOneTo(0xc0004ecbe0?, {0x3156f40, 0xc0006c3780}, {0xc0008c82c0, 0x3b}, {0xc0008cc110, 0x1, 0x1})
	/root/go/pkg/mod/gopkg.in/[email protected]/querier_selects.go:56 +0xbd
gopkg.in/reform%2ev1.(*Querier).FindOneTo(0x3ddc100?, {0x3156f40, 0xc0006c3780}, {0x204aece, 0xa}, {0x1d996e0?, 0xc0008cc0e0})
	/root/go/pkg/mod/gopkg.in/[email protected]/querier_selects.go:147 +0x138
gopkg.in/reform%2ev1.(*Querier).FindByPrimaryKeyTo(0xc0006c3780?, {0x3161a80, 0xc0006c3780}, {0x1d996e0, 0xc0008cc0e0})
	/root/go/pkg/mod/gopkg.in/[email protected]/querier_selects.go:198 +0x11f
gopkg.in/reform%2ev1.(*Querier).Reload(0x4aa5ac?, {0x3161a80, 0xc0006c3780})
	/root/go/pkg/mod/gopkg.in/[email protected]/querier_selects.go:216 +0x57
github.com/percona/pmm/managed/models.FindServiceSoftwareVersionsByServiceID(0xc0006b5bf8?, {0xc0006e7fd4, 0xc})
	/root/go/src/github.com/percona/pmm/managed/models/software_version_helpers.go:156 +0x1fc
github.com/percona/pmm/managed/models.UpdateServiceSoftwareVersions(0x1188f4d3?, {0xc0006e7fd4, 0xc}, {0xc0006c6ca8, {0xc0006c3700, 0x4, 0x4}})
	/root/go/src/github.com/percona/pmm/managed/models/software_version_helpers.go:130 +0xc5
github.com/percona/pmm/managed/services/versioncache.(*Service).updateVersionsForNextService(0xc0006ee180)
	/root/go/src/github.com/percona/pmm/managed/services/versioncache/versioncache.go:198 +0xb68
github.com/percona/pmm/managed/services/versioncache.(*Service).Run(0xc0006ee180, {0x3156258, 0xc0006a8840})
	/root/go/src/github.com/percona/pmm/managed/services/versioncache/versioncache.go:238 +0x43f
github.com/percona/pmm/managed/services/versioncache.TestVersionCache.func3()
	/root/go/src/github.com/percona/pmm/managed/services/versioncache/versioncache_test.go:158 +0x4d
created by github.com/percona/pmm/managed/services/versioncache.TestVersionCache
	/root/go/src/github.com/percona/pmm/managed/services/versioncache/versioncache_test.go:157 +0x1cea
FAIL	github.com/percona/pmm/managed/services/versioncache	2.455s

Code of Conduct

  • I agree to follow Percona Community Code of Conduct

PMM Backup support backing up to azure blob storage

Description

Would be extremely helpful if the backup feature of PMM supports azure blob storage like it does for S3 storage.

Suggested solution

No response

Additional context

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct

Address the `gocritic` linter warnings

Description

We silenced a number of gocritic rule warnings in this PR to reduce the scope. However, addressing some of the warnings could simplify and optimize the code (see this conversation).

Suggested solution

No response

Additional context

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct

PMM agent postgresql high cpu usage

Description

I have a PostgreSQL 14.3 server that contains 900+ databases.

With standard server operation, I have a CPU load of no more than 10%.

When running the PMM agent on the same server to monitor Postgres, my server reaches 100% CPU usage within 10 minutes.

After analyzing the situation, I see that the system is loaded by the PMM agent; it is what drives the CPU load to 100%.

I don't see any errors in the event log; everything appears to work fine.

Is the reason for such a high load that the server has a significant number of databases and the PMM agent tries to process them in parallel? Is the only solution to add CPUs to this server?

Expected Results

CPU usage less than 20%

Actual Results

CPU usage 100%

Version

PMM server version: 2.30.0
PMM admin version: 2.30.0
PMM agent version: 2.30.0

Steps to reproduce

No response

Relevant logs

No response

Code of Conduct

  • I agree to follow Percona Community Code of Conduct

rds_exporter doesn't use service account role in AWS GovCloud EKS Cluster

Description

rds_exporter doesn't use the role of the service account when deployed in AWS EKS. It uses the EKSNodeRole instead, and the following error is logged:
ERRO[2023-03-08T20:56:43.534+00:00] ts=2023-03-08T20:56:43.534Z caller=sessions.go:122 level=error component=sessions msg="Failed to get resource IDs." error="AccessDenied: User: arn:aws-us-gov:sts:: *****:assumed-role/******-01-EKSNodeRole-20230217182951101900000001/i-0580a5cf4b2e9b11e is not authorized to perform: rds:DescribeDBInstances on resource: arn:aws-us-gov:rds:us-gov-west-1:*****:db:* because no identity-based policy allows the rds:DescribeDBInstances action\n\tstatus code: 403, request id *****-****-**********" agentID=pmm-server/rds component=agent-process type=rds_exporter

The node uses the following role (eks-rds):

[root@pmm-0 opt]# aws sts get-caller-identity
{
    "Account": "***********",
    "UserId": "AROAR7I67PE2QCQJYKWUC:botocore-session-1678308909",
    "Arn": "arn:aws-us-gov:sts::*****:assumed-role/eks-rds/botocore-session-1678308909"
}

Expected Results

It should use the service account role.

Actual Results

It uses the Node Role

Version

2.35

Steps to reproduce

No response

Relevant logs

No response

Code of Conduct

  • I agree to follow Percona Community Code of Conduct

PMM QAN Doesn't Appear to Work Fully

Description

I am using Docker to run the PMM server. I am able to get everything set up and connected with our RDS MariaDB instance.

Unfortunately, it doesn't seem to allow me to make full use of Query Analytics: the Tables and Explains tabs in the query details don't appear to work.

Expected Results

Expected to be able to use Query Insights/Analytics with data on Tables, Visual Explains, etc...

Actual Results

Screen Shot 2022-09-24 at 11 35 33 AM

Screen Shot 2022-09-24 at 11 35 39 AM

Screen Shot 2022-09-24 at 11 35 43 AM

Version

PMM Docker latest image

Steps to reproduce

No response

Relevant logs

No response

Code of Conduct

  • I agree to follow Percona Community Code of Conduct

Make the `godot` linter rule more restrictive.

Description

Ref: #1541

When enabling the godot rule, we have intentionally chosen to set its period config parameter to false, since otherwise the PR would be too heavy to review.

We want to turn it back to true, i.e. period: true, so that the linter fails when a comment does not end with a period.

This seems to follow Go's conventions and best practices.

Suggested solution

  1. Set period: true in the .golangci.yml file
  2. Fix the resulting linter warnings
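
A sketch of the corresponding .golangci.yml fragment. The key layout is assumed from golangci-lint's documented godot settings and should be checked against the repo's existing config file:

```yaml
linters-settings:
  godot:
    # fail when a comment does not end with a period
    period: true
```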

Additional context

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct

software "mysqld" is not installed: incompatible service

Description

I have done the setup for PMM and a percona-operator MySQL cluster based on the documentation instructions for Kubernetes. I am not sure how to resolve this issue:

software "mysqld" is not installed: incompatible service

2023-01-10_19-43-24.mp4

Expected Results

Backup to an S3-like service should work

Actual Results

[ERROR] software "mysqld" is not installed: incompatible service

Version

pmm-server V2.32.0 , pmm-client v2.32.0

Steps to reproduce

No response

Relevant logs

No response

Code of Conduct

  • I agree to follow Percona Community Code of Conduct

rpm package for arm

Description

Now that there are more DBs running on ARM, it is much needed to create packages for ARM-based operating systems.
Can you please add this to your download page?

Suggested solution

No response

Additional context

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct

Capture Prepared statements data in Query Analytics with datasource PERFORMANCE_SCHEMA

Description

Goal: Create a basic implementation of collecting prepared statements for PS 8.0.

The problem: queries executed using prepared statements are not captured by QAN. As a result, not everything happening on the DB server is visible in QAN, and it is easy to draw false conclusions about a server.

User story: As a PMM user, I need to be able to see queries executed with prepared statements so that I can see the real number of queries executed by the DB server.

Use case: As a developer, I'm using prepared statements to increase the security of my app and DB. I can prepare a statement in my code and execute it later. This type of query generates the majority of the load on the DB server.

Suggested solution

Implement getPreparedStatements similar to https://github.com/percona/pmm-agent/blob/master/agents/mysql/perfschema/summaries.go
Implement a model for prepared_statements_instances as in https://github.com/percona/pmm-agent/blob/master/agents/mysql/perfschema/models.go
Collect metrics of prepared statements: https://github.com/percona/pmm-agent/blob/master/agents/mysql/perfschema/perfschema.go#L126

Additional context

Out of scope: Any other MySQL version/distribution than PS 8.0

Code of Conduct

  • I agree to follow this project's Code of Conduct

Getting “The security token included in the request is invalid” while deploying in EKS in GovCloud

Description

I am getting the error "The security token included in the request is invalid" when attempting to discover AWS RDS, using a pod that has administrator rights assigned to its role.
The error message I am receiving is:
pmm-managed.log:WARN[2023-03-02T18:12:00.044+00:00] RPC /management.RDS/DiscoverRDS done in 1.244182818s with gRPC error: rpc error: code = InvalidArgument desc = The security token included in the request is invalid request=b90e3874-b925-11ed-889c-626f4a78ebbc ,

and I am unsure where to look to find more information about the issue. Can anyone provide guidance on how to troubleshoot this error?

Expected Results

It should discover the databases

Actual Results

Ran into an error

Version

PMM Server 2.35.0

Steps to reproduce

No response

Relevant logs

No response

Code of Conduct

  • I agree to follow Percona Community Code of Conduct

Admin password does not persist after you restart or rebuild the Docker image

Description

While setting up the Docker image, I used a compose file to create the container. The compose file attaches an externally persisted data dir. The reason I use a compose file is so I can easily upgrade the PMM Server version.
When the container comes back up, the newly set admin password no longer works; it reverts to the standard admin password from installation.

Expected Results

I should be able to log back into the PMM server with the credentials I setup when I changed the admin password. The data directory is persisted and should still keep the same login information.

Actual Results

I get "invalid username/password" every time I try to log in with my new admin password.

Version

PMM Server docker v.2.35

Steps to reproduce

Run docker-compose up -d with the compose file below.

Then change the admin password.

docker exec -t pmm-server change-admin-password newpass

Then run docker compose down and then docker compose up.

docker-compose down && docker-compose up -d

Relevant logs

Followed the guide and created the data volume first:

docker volume create pmm-data

Docker compose file:

version: "3.9"
services:
  pmm:
    container_name: pmm-server
    restart: always
    image: percona/pmm-server:2.35
    volumes:
      - pmm-data:/var/lib/mysql
    ports:
      - "443:443"
      - "80:80"

volumes:
  pmm-data:
    external: true
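For reference (not confirmed by the reporter as the cause): PMM Server 2.x keeps its state, including Grafana credentials, under /srv rather than /var/lib/mysql, which is the path the Percona installation docs mount the data volume to. A compose file along these lines should persist the admin password across docker-compose down/up:

```yaml
version: "3.9"
services:
  pmm:
    container_name: pmm-server
    restart: always
    image: percona/pmm-server:2.35
    volumes:
      - pmm-data:/srv        # PMM 2 stores Grafana and its databases under /srv
    ports:
      - "443:443"
      - "80:80"

volumes:
  pmm-data:
    external: true
```

With the volume mounted at /var/lib/mysql, the Grafana database holding the changed password lives inside the container layer and is discarded on recreate, which matches the observed reset to the default credentials.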

Code of Conduct

  • I agree to follow Percona Community Code of Conduct

How to limit monitor account number?

Description

Hi,
I checked the documentation, and it says the monitor account's connection number can be limited:

ALTER USER monitor CONNECTION LIMIT 2;

https://docs.percona.com/percona-monitoring-and-management/setting-up/client/postgresql.html#create-a-database-account-for-pmm

I ask because I found that if I have five user databases, each database opens 4 connections, so there end up being 20 idle connections.

If I limit the "monitor" account to 2 connections, only two connections are used, but there is a bunch of FATAL errors in the postgresql log like this:

2022-11-16 01:07:52.590 -04 [493080] monitor@postgres 192.168.XXX.XXX(44096)FATAL: too many connections for role "monitor"

All suggestions are welcome. Thank you!
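A hedged sketch of a workaround, assuming the idle connections come from the exporter scraping each discovered database: rather than capping the role below what the agents actually need (which produces the FATAL errors), measure the real connection count and size the limit to it. Standard PostgreSQL only:

```sql
-- Count how many connections the monitor role currently holds.
SELECT count(*) FROM pg_stat_activity WHERE usename = 'monitor';

-- Set the limit at or slightly above that number instead of 2,
-- avoiding FATAL "too many connections for role" errors.
ALTER USER monitor CONNECTION LIMIT 25;  -- 25 is an example value, not a recommendation
```

Reducing the number of databases PMM monitors on the instance would lower the count itself, but the limit above only controls how many connections PostgreSQL accepts.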

Expected Results

1. No FATAL errors in the postgresql log
2. Fewer monitor account connections

Actual Results

A bunch of FATAL errors in the postgresql log like this:

2022-11-16 01:07:52.590 -04 [493080] monitor@postgres 192.168.XXX.XXX(44096)FATAL: too many connections for role "monitor"

Version

pmm server 2.32
pmm client 2.32

Steps to reproduce

No response

Relevant logs

No response

Code of Conduct

  • I agree to follow Percona Community Code of Conduct
