vonng / pigsty
Battery-Included PostgreSQL Distro as a Free RDS Alternative
Home Page: https://doc.pigsty.cc/
License: GNU Affero General Public License v3.0
Do we have any documentation on how to integrate an existing database?
Make the default pgsql database on the meta node part of the infrastructure.
Use it as a dynamic inventory (CMDB) for cluster management.
For example, ansible could retrieve inventory information from the CMDB via:
#!/bin/bash
psql service=meta -AXtwc 'SELECT text FROM pigsty.inventory;'
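A minimal usage sketch, assuming the query above returns the full inventory as YAML text (the temporary file path is hypothetical):
# dump the inventory from the CMDB and feed it to ansible as an ordinary inventory file
psql service=meta -AXtwc 'SELECT text FROM pigsty.inventory;' > /tmp/pigsty.yml
ansible-playbook -i /tmp/pigsty.yml pgsql.yml -l pg-test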
Currently pigsty relies on consul for three purposes.
While consul is really good, I'd like to provide more options and possibilities.
For example, users who are already familiar with etcd may not like the idea of learning a new kind of database.
Sometimes you may want to ADD the pigsty monitoring system to an existing database cluster instead of provisioning a new one.
In that case, it is much more convenient to install node_exporter & pg_exporter by copying binaries directly from the meta node, instead of setting up infra, downloading and installing RPMs, and so on.
Add an exporter_binary_install option. If set, node_exporter and pg_exporter are copied to /usr/bin/ directly.
But you'll have to put those two binary files in the files directory.
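A minimal config sketch for this option, using the names described above:
exporter_binary_install: true    # copy files/node_exporter and files/pg_exporter from the meta node to /usr/bin/ on target hosts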
QPS and TPS show no data. After checking, I found that the pg:ins:qps and pg:ins:tps metrics are not being collected. Is this feature not available in the open-source version?
pgbadger is a useful tool for log analysis.
We can introduce a semi-realtime log summary with pgbadger.
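A minimal sketch of such a summary job, assuming default log locations (paths are assumptions):
# render an HTML summary report from the postgres log directory
pgbadger /pg/data/log/postgresql-*.log -o /www/pigsty/pgbadger.html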
It is very dangerous to set dcs_exists_action = clean and pg_exists_action = clean, since these safeguard variables exist to avoid accidentally removing any running database or dcs instance.
It is always better to achieve that with an explicit playbook execution (pgsql-rm.yml) rather than by changing a variable.
In some cases you just want to erase that possibility entirely.
Add two safeguard variables pg_disable_purge and dcs_disable_purge. If set to true, the corresponding variables pg_exists_action and dcs_exists_action will be forced to abort to avoid purging.
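A minimal config sketch of the proposed safeguards:
pg_disable_purge: true     # force pg_exists_action to abort, never wipe a running postgres instance
dcs_disable_purge: true    # force dcs_exists_action to abort, never wipe a running dcs instance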
Provide yet another CLI wrapper to seal ansible primitives. For example:
NAME
pigsty -- Pigsty Command-Line Interface v0.8
SYNOPSIS
meta setup meta nodes init|fetch|repo|cache|ansible
node setup database nodes init|tune|dcs|remove|ping|bash|ssh|admin
pgsql setup postgres clusters init|node|dcs|postgres|template|business|monitor|service|monly|remove
infra setup infrastructure init|ca|dns|prometheus|grafana|loki|haproxy|target
clean clean pgsql clusters all|service|monitor|postgres|dcs
config manage pigsty config file init|edit|info|dump|path
serve run pigsty API server init|start|stop|restart|reload|status
demo setup local demo init|up|new|clean|start|dns
log watch system log query|postgres|patroni|pgbouncer|message
pg pg operational tasks user|db|svc|hba|log|psql|deploy|backup|restore|vacuum|repack
EXAMPLES
1. infra summary
pigsty infra
2. pgsql clusters summary
pigsty pgsql
3. pigsty nodes summary
pigsty node
4. create pgsql cluster 'pg-test'
pigsty pgsql init -l pg-test
5. add new instance 10.10.10.13 of cluster 'pg-test'
pigsty pgsql init -l 10.10.10.13
6. remove cluster 'pg-test'
pigsty clean -l pg-test
7. create user dbuser_vonng on cluster 'pg-test'
pigsty pg user dbuser_vonng -l pg-test
8. create database test2 on cluster 'pg-test'
pigsty pg db test2 -l pg-test
Usage:
pigsty [command]
Available Commands:
clean remove pgsql cluster/instance
help Help about any command
infra setup infrastructure
meta setup meta node
node setup database node
pgsql setup pgsql clusters
serve Launch pigsty API server
Flags:
-h, --help help for pigsty
-i, --inventory string inventory file (default "./pigsty.yml")
-l, --limit string limit execution hosts
-t, --tags strings limit execution tasks
Database creation after cluster bootstrap is tricky.
It is not a good practice to use the init playbook pgsql.yml for that purpose.
Add a pgsql-createdb.yml playbook for that purpose.
It may require a CLI arg -e pg_database=<your new database in inventory> to work.
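A hypothetical invocation sketch, using the playbook and argument proposed above:
ansible-playbook pgsql-createdb.yml -l pg-test -e pg_database=meta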
svc and role are tagged on all metrics in pigsty.
Usually this works well: the role of an instance will be auto-adjusted by patroni callbacks and routine tasks.
But it does have two downsides.
To avoid them, I propose to eliminate the svc and role labels from pigsty (along with any old service-level metrics implementation).
Add Table/Index QPS panel
Table QPS can be roughly calculated from pg_table_idx_scan and pg_table_seq_scan (which is actually scans per second).
It's very useful to have a QPS metric for tables and indexes.
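A minimal prometheus recording-rule sketch based on the two metrics above (the record name is hypothetical):
groups:
  - name: pg-table-qps
    rules:
      - record: pg:table:qps
        expr: rate(pg_table_idx_scan[1m]) + rate(pg_table_seq_scan[1m])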
Upgrade database creation interface to support complete features.
pg_databases:                       # create a business database 'meta'
  - name: meta
    schemas: [meta]                 # create extra schema named 'meta'
    extensions: [{name: postgis}]   # create extra extension postgis
    parameters:                     # overwrite database meta's default search_path
      search_path: public, monitor
The new interface includes complete customizable options for a postgres database:
pg_databases:
  - name: meta                      # name is the only required field for a database
    owner: postgres                 # optional, database owner
    template: template1             # optional, template1 by default
    encoding: UTF8                  # optional, UTF8 by default
    locale: C                       # optional, C by default
    allowconn: true                 # optional, true by default, false disables connect entirely
    revokeconn: false               # optional, false by default, true revokes connect from public (only default user and owner have connect privilege on database)
    tablespace: pg_default          # optional, 'pg_default' is the default tablespace
    connlimit: -1                   # optional, connection limit, -1 or none disables limit (default)
    extensions:                     # optional, extension name and where to create
      - {name: postgis, schema: public}
    parameters:                     # optional, extra parameters set with ALTER DATABASE
      enable_partitionwise_join: true
    pgbouncer: true                 # optional, add this database to pgbouncer list? true by default
    comment: pigsty meta database   # optional, comment string for database
# make upload
ssh -t meta "sudo rm -rf /tmp/pkg.tgz"
Connection to 127.0.0.1 closed.
scp -r files/pkg.tgz meta:/tmp/pkg.tgz
pkg.tgz 100% 1029MB 10.9MB/s 01:34
ssh -t meta "sudo mkdir -p /www/pigsty/; sudo rm -rf /www/pigsty/*; sudo tar -xf /tmp/pkg.tgz --strip-component=1 -C /www/pigsty/"
Connection to 127.0.0.1 closed by remote host.
Connection to 127.0.0.1 closed.
make: *** [upload] Error 255
# make status
cd vagrant && vagrant status
Current machine states:
meta aborted (virtualbox)
node-1 running (virtualbox)
node-2 running (virtualbox)
node-3 running (virtualbox)
This environment represents multiple VMs. The VMs are all listed
above with their current state. For more information about a specific
VM, run `vagrant status NAME`.
# make resume
cd vagrant && vagrant resume
==> meta: Waiting for machine to boot. This may take a few minutes...
The guest machine entered an invalid state while waiting for it
to boot. Valid states are 'restoring, running'. The machine is in the
'aborted' state. Please verify everything is configured
properly and try again.
If the provider you're using has a GUI that comes with it,
it is often helpful to open that and watch the machine, since the
GUI often has more helpful error messages than Vagrant can retrieve.
For example, if you're using VirtualBox, run `vagrant up` while the
VirtualBox GUI is open.
The primary issue for this error is that the provider you're using
is not properly configured. This is very rarely a Vagrant issue.
make: *** [resume] Error 1
Implement etcd as service registry and HA DCS.
Add monitor-only mode support for role monitor:
export_repo_url (http address) in case of skipping node_repo_urls setup.
exporter_create_user (boolean) to try creating the monitor user in the target database.
English documentation: http://pigsty.cc/en/docs/
Needs update, working on it.
The pigsty official demo is hosted on qcloud (Tencent).
It would be good to have a tutorial for setting up pigsty using Tencent Cloud VPC.
When using haproxy, the client IP will be shadowed, so it will always show up as 127.0.0.1 in PostgreSQL logs, which could be annoying.
Setting the application name to pass it to postgres, or the toa module, could solve this issue.
Update default software packages to PG 13.2
PG 13.2 will be added in the next minor release (v0.6).
Working on it.
HAProxy provides a handy admin web UI, and we already have an nginx server on the meta node.
So why don't we build an index page for all haproxy instances and make nginx proxy them?
Use h.pigsty for the haproxy index page, with a new nginx_upstream entry:
- { name: haproxy, host: h.pigsty, url: "127.0.0.1:9091" }
# haproxy instance definition
upstream pg-meta-1 {
    server 10.10.10.10:9101 max_fails=0;
}
upstream pg-test-1 {
    server 10.10.10.11:9101 max_fails=0;
}
upstream pg-test-2 {
    server 10.10.10.12:9101 max_fails=0;
}
upstream pg-test-3 {
    server 10.10.10.13:9101 max_fails=0;
}

# nginx proxy for haproxy admin
server {
    listen 80;
    server_name h.pigsty;
    location ^~/pg-meta-1/ {
        proxy_pass http://pg-meta-1;
        proxy_connect_timeout 2;
    }
    location ^~/pg-test-1/ {
        proxy_pass http://pg-test-1;
        proxy_connect_timeout 2;
    }
    location ^~/pg-test-2/ {
        proxy_pass http://pg-test-2;
        proxy_connect_timeout 2;
    }
    location ^~/pg-test-3/ {
        proxy_pass http://pg-test-3;
        proxy_connect_timeout 2;
    }
    location / {
        root /www/;
        index haproxy.html;
    }
}
haproxy_admin_auth_enabled
Make Loki and promtail part of the default deployment procedure.
A missing space in pg-init-business.sql.j2 prevents extensions from being created in the correct schema.
{% if 'extensions' in database %}{% for extension in database.extensions %}
CREATE EXTENSION IF NOT EXISTS "{{ extension.name }}"{% if 'schema' in extension %}WITH SCHEMA "{{ extension.schema }}"{% endif %};
{% endfor %}{% endif %}
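A minimal fix sketch, assuming the missing space is the one before WITH SCHEMA:
CREATE EXTENSION IF NOT EXISTS "{{ extension.name }}"{% if 'schema' in extension %} WITH SCHEMA "{{ extension.schema }}"{% endif %};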
User creation after cluster bootstrap is tricky.
It is not a good practice to use the init playbook pgsql.yml for that purpose.
Add a pgsql-createuser.yml playbook for that purpose.
It may require a CLI arg -e pg_user=<your new user in inventory> to work.
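A hypothetical invocation sketch, mirroring the createdb example above:
ansible-playbook pgsql-createuser.yml -l pg-test -e pg_user=dbuser_meta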
If all replica instances are down, the current setup will NOT route that traffic to the only remaining primary, which means the read-only service is not available.
It would be nice to have an additional option to configure this in case of all-replica failure:
Add an option haproxy_weight_fallback which controls this behavior.
In the normal situation, the primary will take only a very small portion of read-only traffic. If all replicas are down, it becomes the only available backend in the replica service and takes over all read-only traffic.
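A minimal config sketch of this idea (variable values are assumptions):
haproxy_weight: 100          # default weight of an instance in haproxy backends
haproxy_weight_fallback: 1   # weight of the primary inside the replica service: negligible normally, sole backend when all replicas are down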
After a patroni restart, pg_hba rules will be overwritten by patroni.
This means any modification to hba-related variables should be applied to patroni.yml to take permanent effect.
This issue could be solved by templating the patroni.yml hba section (working on it).
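A minimal sketch of what the templated section could look like (the rules shown are placeholders):
postgresql:
  pg_hba:                           # hba rules managed by patroni, rendered from inventory variables
    - local all postgres ident
    - host all all 10.0.0.0/8 md5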
Adjust haproxy service backend weight from the inventory.
Add a new variable haproxy_weight (default 100) for that, and render it into haproxy.cfg.
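A minimal inventory sketch with per-host weights (the host-vars layout is an assumption):
pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary, haproxy_weight: 100 }
    10.10.10.12: { pg_seq: 2, pg_role: replica, haproxy_weight: 200 }   # receives twice the read-only traffic
    10.10.10.13: { pg_seq: 3, pg_role: replica, haproxy_weight: 100 }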
RPM packages have 3 major advantages.
A delayed instance is a special kind of offline instance configured with recovery_min_apply_delay.
In real-world production environments, it is a common requirement to run ETL processes / interactive queries / SAGA on OLTP databases.
It is highly recommended to run such workloads on a dedicated instance (which may be named offline) to avoid undesired side effects on the entire cluster.
Thus, we need a mechanism to MARK a specific instance for that purpose, and a new default role to MARK users with that sort of need.
We may introduce a new default role dbrole_offline, which has the same privileges as dbrole_readonly but is not a member of dbrole_readonly. It will be granted to ETL/SAGA/interactive users.
Members of dbrole_offline will only be allowed by hba on instances with pg_role = offline or variable pg_offline_query = true.
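A minimal pg_hba sketch of that restriction, rendered only on offline-eligible instances (CIDR and auth method are assumptions):
host all +dbrole_offline 10.0.0.0/8 md5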
Upgrade user creation interface to support complete features.
pg_users:
  - username: test                    # example production user with read-write access
    password: test                    # example user's password
    options: LOGIN                    # extra options
    groups: [ dbrole_readwrite ]      # dbrole_admin|dbrole_readwrite|dbrole_readonly
    comment: default test user for production usage
    pgbouncer: true                   # add to pgbouncer
The new interface includes complete customizable options for a postgres user/role:
pg_users:
  # complete example of user/role definition for a production user
  - name: dbuser_meta               # example production user with read-write access
    password: DBUser.Meta           # example user's password, can be encrypted
    login: true                     # can login, true by default (should be false for a role)
    superuser: false                # is superuser? false by default
    createdb: false                 # can create database? false by default
    createrole: false               # can create role? false by default
    inherit: true                   # can this role use inherited privileges?
    replication: false              # can this role do replication? false by default
    bypassrls: false                # can this role bypass row level security? false by default
    connlimit: -1                   # connection limit, -1 disables limit
    expire_at: '2030-12-31'         # 'timestamp' when this role expires
    expire_in: 365                  # now + n days when this role expires (OVERWRITES expire_at)
    roles: [dbrole_readwrite]       # dbrole_admin|dbrole_readwrite|dbrole_readonly
    pgbouncer: true                 # add this user to pgbouncer? false by default (true for production users)
    parameters:                     # user's default search path
      search_path: public
    comment: test user
Since pigsty 0.6 is released, the corresponding documentation should be updated in time.
Working on it.
Pigsty could be used as a pure monitoring system without the provisioning part.
So it could monitor Postgres instances created by other provisioning solutions, such as Aliyun MyBase.
I've received a request for supporting MyBase PostgreSQL.
It will be tested and added in the next release.
Currently, if you are using static prometheus target discovery, the target file is a monolith that lies in /etc/prometheus/targets/all.yml.
It would be good to split that monolithic file into small files, for example one yml file per cluster.
Besides, for people who are using a per-cluster style inventory, this could work too.
The target files would be:
/etc/prometheus/targets/pg-test.yml
/etc/prometheus/targets/pg-meta.yml
/etc/prometheus/targets/pg-other-cluster.yml
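A minimal sketch of one per-cluster file in prometheus file_sd format (labels and exporter port are assumptions):
# /etc/prometheus/targets/pg-test.yml
- labels: { cls: pg-test }
  targets: [ '10.10.10.11:9630', '10.10.10.12:9630', '10.10.10.13:9630' ]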
Bug: this does not work as expected:
systemctl reload pgbouncer
Currently, service is achieved by haproxy node ports.
To integrate with other load balancer providers, we need to extract the service abstraction from it.
For example:
pg_services:
  # default service will route {ip|name}:5436 to primary postgres (5436->5432) (example service)
  - name: default                   # service's actual name is {{ pg_cluster }}-{{ service.name }}
    src_ip: "*"                     # service bind ip address, * for all, vip for cluster virtual ip address
    src_port: 5436                  # bind port, mandatory
    dst_port: postgres              # target port: postgres|pgbouncer|port_number, pgbouncer(6432) by default
    check_method: http              # health check method: only http is available for now
    check_port: patroni             # health check port: patroni|pg_exporter|port_number, patroni by default
    check_url: /primary             # health check url path, / as default
    check_code: 200                 # health check http code, 200 as default
    selector: "[]"                  # instance selector
    haproxy:                        # haproxy specific fields
      maxconn: 3000                 # default front-end connection
      balance: roundrobin           # load balance algorithm (roundrobin by default)
      default_server_options: 'inter 3s fastinter 1s downinter 5s rise 3 fall 3 on-marked-down shutdown-sessions slowstart 30s maxconn 3000 maxqueue 128 weight 100'
Support monitoring multiple databases simultaneously.
Low priority.
The standard deployment used to require 4 nodes, which may be a barrier for users.
Switching to a single-node deployment by default will simplify the installation procedure and enable other possibilities, such as using pigsty as a data-analysis environment.
The default configuration entry pg_default_privileges for pigsty has a misconfiguration:
pg_default_privilegs: # objects created by dbsu and admin will have their privileges properly set
  - GRANT USAGE ON SCHEMAS TO dbrole_readonly
  - GRANT SELECT ON TABLES TO dbrole_readonly
  - GRANT SELECT ON SEQUENCES TO dbrole_readonly
  - GRANT EXECUTE ON FUNCTIONS TO dbrole_readonly
  - GRANT USAGE ON SCHEMAS TO dbrole_offline
  - GRANT SELECT ON TABLES TO dbrole_offline
  - GRANT SELECT ON SEQUENCES TO dbrole_offline
  - GRANT EXECUTE ON FUNCTIONS TO dbrole_readonly
  - GRANT INSERT, UPDATE, DELETE ON TABLES TO dbrole_readwrite
  - GRANT USAGE, UPDATE ON SEQUENCES TO dbrole_readwrite
  - GRANT TRUNCATE, REFERENCES, TRIGGER ON TABLES TO dbrole_admin
  - GRANT CREATE ON SCHEMAS TO dbrole_admin
  - GRANT USAGE ON TYPES TO dbrole_admin
The line:
- GRANT EXECUTE ON FUNCTIONS TO dbrole_readonly
should be
- GRANT EXECUTE ON FUNCTIONS TO dbrole_offline
Update pigsty.yml configuration file.
Currently, postgres instance identity is managed by Consul service registration, which is implemented as a local consul service configuration file in /etc/consul.d/svc-***.json.
It is a distributed approach that allows instances themselves to 'report' their identity.
It does have some potential security issues, such as a wrong identity caused by malicious or accidental misconfiguration.
Some may prefer a centralized approach, e.g. managing all identities on the meta node, since we already have that information in the inventory.
Take prometheus for example.
We can provide an alternative option to use a static target identity file rather than service discovery, which may reduce the hazard of operating and reloading consul.
We can generate the prometheus configuration from the inventory and label all instances properly.
It also makes it much easier to integrate with other provisioning systems.
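A minimal scrape-config sketch for this alternative, using file-based discovery (job name and path are assumptions):
scrape_configs:
  - job_name: pg
    file_sd_configs:
      - files: [ '/etc/prometheus/targets/*.yml' ]   # static identity files generated from the inventory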
PostgreSQL 14 alpha is released, and the first beta will be released on 5/20.
Add support for the new metrics of PostgreSQL 14:
pg_stat_wal
pg_stat_activity.command
pg_stat_replication_slots
pg_stat_statements
pg_stat_database
Add support for monitor-only deployment, which only initializes infra on the meta node and monitoring components on common nodes.
There are two major reasons for the UI overhaul.
The original option named pg_default_privilegs has a typo in it: privilegs should be privileges.
Fixing it actually slightly changes the API.