vonng / pigsty
Battery-Included PostgreSQL Distro as a Free RDS Alternative
Home Page: https://doc.pigsty.cc/
License: GNU Affero General Public License v3.0
Do we have any documentation on how to integrate an existing database?
Make the default pgsql database on the meta node part of the infrastructure.
Use it as a dynamic inventory (CMDB) for cluster management.
For example, ansible could retrieve inventory information from the CMDB via:
#!/bin/bash
psql service=meta -AXtwc 'SELECT text FROM pigsty.inventory;'
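A minimal usage sketch, assuming the query above returns the full inventory as YAML text (the temporary file path is hypothetical):
# dump the inventory from the CMDB and feed it to ansible as an ordinary inventory file
psql service=meta -AXtwc 'SELECT text FROM pigsty.inventory;' > /tmp/pigsty.yml
ansible-playbook -i /tmp/pigsty.yml pgsql.yml -l pg-test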
Currently pigsty relies on consul for three purposes.
While consul is really good, I'd like to provide more options and possibilities.
For example, users who are already familiar with etcd may not like the idea of learning a new kind of database.
Sometimes you may want to ADD the pigsty monitoring system to an existing database cluster instead of provisioning a new one.
In that case, it is much more convenient to install node_exporter & pg_exporter by copying binaries directly from the meta node, instead of setting up infra, downloading and installing RPMs, and so on.
Add an exporter_binary_install option. If set, node_exporter and pg_exporter are copied to /usr/bin/ directly.
But you'll have to put those two binary files in the files directory.
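A minimal config sketch for this option, using the names described above:
exporter_binary_install: true    # copy files/node_exporter and files/pg_exporter from the meta node to /usr/bin/ on target hosts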
QPS and TPS show no data. After checking, I found that the pg:ins:qps and pg:ins:tps metrics are not being collected. Is this feature not available in the open-source version?
pgbadger is a useful tool for log analysis.
We can introduce a semi-realtime log summary with pgbadger.
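A minimal sketch of such a summary job, assuming default log locations (paths are assumptions):
# render an HTML summary report from the postgres log directory
pgbadger /pg/data/log/postgresql-*.log -o /www/pigsty/pgbadger.html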
It is very dangerous to set dcs_exists_action = clean and pg_exists_action = clean, since these safeguard variables exist to avoid accidentally removing any running database or dcs instance.
It is always better to achieve that with an explicit playbook execution (pgsql-rm.yml) rather than by changing a variable.
In some cases you just want to erase that possibility entirely.
Add two safeguard variables pg_disable_purge and dcs_disable_purge. If set to true, the corresponding variables pg_exists_action and dcs_exists_action will be forced to abort to avoid purging.
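A minimal config sketch of the proposed safeguards:
pg_disable_purge: true     # force pg_exists_action to abort, never wipe a running postgres instance
dcs_disable_purge: true    # force dcs_exists_action to abort, never wipe a running dcs instance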
Provide yet another CLI wrapper to seal ansible primitives. For example:
NAME
pigsty -- Pigsty Command-Line Interface v0.8
SYNOPSIS
meta setup meta nodes init|fetch|repo|cache|ansible
node setup database nodes init|tune|dcs|remove|ping|bash|ssh|admin
pgsql setup postgres clusters init|node|dcs|postgres|template|business|monitor|service|monly|remove
infra setup infrastructure init|ca|dns|prometheus|grafana|loki|haproxy|target
clean clean pgsql clusters all|service|monitor|postgres|dcs
config manage pigsty config file init|edit|info|dump|path
serve run pigsty API server init|start|stop|restart|reload|status
demo setup local demo init|up|new|clean|start|dns
log watch system log query|postgres|patroni|pgbouncer|message
pg pg operational tasks user|db|svc|hba|log|psql|deploy|backup|restore|vacuum|repack
EXAMPLES
1. infra summary
pigsty infra
2. pgsql clusters summary
pigsty pgsql
3. pigsty nodes summary
pigsty node
4. create pgsql cluster 'pg-test'
pigsty pgsql init -l pg-test
5. add new instance 10.10.10.13 of cluster 'pg-test'
pigsty pgsql init -l 10.10.10.13
6. remove cluster 'pg-test'
pigsty clean -l pg-test
7. create user dbuser_vonng on cluster 'pg-test'
pigsty pg user dbuser_vonng -l pg-test
8. create database test2 on cluster 'pg-test'
pigsty pg db test2 -l pg-test
Usage:
pigsty [command]
Available Commands:
clean remove pgsql cluster/instance
help Help about any command
infra setup infrastructure
meta setup meta node
node setup database node
pgsql setup pgsql clusters
serve Launch pigsty API server
Flags:
-h, --help help for pigsty
-i, --inventory string inventory file (default "./pigsty.yml")
-l, --limit string limit execution hosts
-t, --tags strings limit execution tasks
Database creation after cluster bootstrap is tricky.
It is not a good practice to use the init playbook pgsql.yml for that purpose.
Add a pgsql-createdb.yml playbook for that purpose.
It may require a CLI arg -e pg_database=<your new database in inventory> to work.
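A hypothetical invocation sketch, using the playbook and argument proposed above:
ansible-playbook pgsql-createdb.yml -l pg-test -e pg_database=meta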
svc and role are tagged on all metrics in pigsty.
Usually this works well: the role of an instance will be auto-adjusted by patroni callbacks and routine tasks.
But it does have two downsides.
To avoid them, I propose to eliminate the svc and role labels from pigsty (along with any old service-level metrics implementation).
Add Table/Index QPS panel
Table QPS can be roughly calculated from pg_table_idx_scan and pg_table_seq_scan (which is actually scans per second).
It's very useful to have a QPS metric for tables and indexes.
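A minimal prometheus recording-rule sketch based on the two metrics above (the record name is hypothetical):
groups:
  - name: pg-table-qps
    rules:
      - record: pg:table:qps
        expr: rate(pg_table_idx_scan[1m]) + rate(pg_table_seq_scan[1m])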
Upgrade database creation interface to support complete features.
pg_databases:                       # create a business database 'meta'
  - name: meta
    schemas: [meta]                 # create extra schema named 'meta'
    extensions: [{name: postgis}]   # create extra extension postgis
    parameters:                     # overwrite database meta's default search_path
      search_path: public, monitor
The new interface includes complete customizable options for a postgres database:
pg_databases:
  - name: meta                      # name is the only required field for a database
    owner: postgres                 # optional, database owner
    template: template1             # optional, template1 by default
    encoding: UTF8                  # optional, UTF8 by default
    locale: C                       # optional, C by default
    allowconn: true                 # optional, true by default, false disables connect entirely
    revokeconn: false               # optional, false by default, true revokes connect from public (only default user and owner have connect privilege on database)
    tablespace: pg_default          # optional, 'pg_default' is the default tablespace
    connlimit: -1                   # optional, connection limit, -1 or none disables limit (default)
    extensions:                     # optional, extension name and where to create
      - {name: postgis, schema: public}
    parameters:                     # optional, extra parameters set with ALTER DATABASE
      enable_partitionwise_join: true
    pgbouncer: true                 # optional, add this database to pgbouncer list? true by default
    comment: pigsty meta database   # optional, comment string for database
# make upload
ssh -t meta "sudo rm -rf /tmp/pkg.tgz"
Connection to 127.0.0.1 closed.
scp -r files/pkg.tgz meta:/tmp/pkg.tgz
pkg.tgz 100% 1029MB 10.9MB/s 01:34
ssh -t meta "sudo mkdir -p /www/pigsty/; sudo rm -rf /www/pigsty/*; sudo tar -xf /tmp/pkg.tgz --strip-component=1 -C /www/pigsty/"
Connection to 127.0.0.1 closed by remote host.
Connection to 127.0.0.1 closed.
make: *** [upload] Error 255
# make status
cd vagrant && vagrant status
Current machine states:
meta aborted (virtualbox)
node-1 running (virtualbox)
node-2 running (virtualbox)
node-3 running (virtualbox)
This environment represents multiple VMs. The VMs are all listed
above with their current state. For more information about a specific
VM, run `vagrant status NAME`.
# make resume
cd vagrant && vagrant resume
==> meta: Waiting for machine to boot. This may take a few minutes...
The guest machine entered an invalid state while waiting for it
to boot. Valid states are 'restoring, running'. The machine is in the
'aborted' state. Please verify everything is configured
properly and try again.
If the provider you're using has a GUI that comes with it,
it is often helpful to open that and watch the machine, since the
GUI often has more helpful error messages than Vagrant can retrieve.
For example, if you're using VirtualBox, run `vagrant up` while the
VirtualBox GUI is open.
The primary issue for this error is that the provider you're using
is not properly configured. This is very rarely a Vagrant issue.
make: *** [resume] Error 1
Implement etcd as service registry and HA DCS.
Add monitor-only mode support for role monitor:
export_repo_url (http address) in case of skipping node_repo_urls setup.
exporter_create_user (boolean) to try creating the monitor user in the target database.
English documentation: http://pigsty.cc/en/docs/
Needs update, working on it.
The pigsty official demo is hosted on qcloud (Tencent).
It would be good to have a tutorial for setting up pigsty using Tencent Cloud VPC.
When using haproxy, the client IP will be shadowed, so it will always show up as 127.0.0.1 in PostgreSQL logs, which could be annoying.
Setting the application name to pass it to postgres, or the toa module, could solve this issue.
Update default software packages to PG 13.2
PG 13.2 will be added in the next minor release (v0.6).
Working on it.
HAProxy provides a handy admin web UI, and we already have an nginx server on the meta node.
So why don't we build an index page for all haproxy instances and make nginx proxy them?
Use h.pigsty for the haproxy index page, with a new nginx_upstream entry:
- { name: haproxy, host: h.pigsty, url: "127.0.0.1:9091" }
# haproxy instance definition
upstream pg-meta-1 {
    server 10.10.10.10:9101 max_fails=0;
}
upstream pg-test-1 {
    server 10.10.10.11:9101 max_fails=0;
}
upstream pg-test-2 {
    server 10.10.10.12:9101 max_fails=0;
}
upstream pg-test-3 {
    server 10.10.10.13:9101 max_fails=0;
}

# nginx proxy for haproxy admin
server {
    listen 80;
    server_name h.pigsty;
    location ^~/pg-meta-1/ {
        proxy_pass http://pg-meta-1;
        proxy_connect_timeout 2;
    }
    location ^~/pg-test-1/ {
        proxy_pass http://pg-test-1;
        proxy_connect_timeout 2;
    }
    location ^~/pg-test-2/ {
        proxy_pass http://pg-test-2;
        proxy_connect_timeout 2;
    }
    location ^~/pg-test-3/ {
        proxy_pass http://pg-test-3;
        proxy_connect_timeout 2;
    }
    location / {
        root /www/;
        index haproxy.html;
    }
}
haproxy_admin_auth_enabled
Make Loki and promtail part of the default deployment procedure.
A missing space in pg-init-business.sql.j2 prevents extensions from being created in the correct schema.
{% if 'extensions' in database %}{% for extension in database.extensions %}
CREATE EXTENSION IF NOT EXISTS "{{ extension.name }}"{% if 'schema' in extension %}WITH SCHEMA "{{ extension.schema }}"{% endif %};
{% endfor %}{% endif %}
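A minimal fix sketch, assuming the missing space is the one before WITH SCHEMA:
CREATE EXTENSION IF NOT EXISTS "{{ extension.name }}"{% if 'schema' in extension %} WITH SCHEMA "{{ extension.schema }}"{% endif %};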
User creation after cluster bootstrap is tricky.
It is not a good practice to use the init playbook pgsql.yml for that purpose.
Add a pgsql-createuser.yml playbook for that purpose.
It may require a CLI arg -e pg_user=<your new user in inventory> to work.
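A hypothetical invocation sketch, mirroring the createdb example above:
ansible-playbook pgsql-createuser.yml -l pg-test -e pg_user=dbuser_meta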
If all replica instances are down, the current setup will NOT route that traffic to the only remaining primary, which means the read-only service is not available.
It would be nice to have an additional option to configure this in case of all-replica failure:
Add an option haproxy_weight_fallback which controls this behavior.
In the normal situation, the primary will take only a very small portion of read-only traffic. If all replicas are down, it becomes the only available backend in the replica service and takes over all read-only traffic.
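A minimal config sketch of this idea (variable values are assumptions):
haproxy_weight: 100          # default weight of an instance in haproxy backends
haproxy_weight_fallback: 1   # weight of the primary inside the replica service: negligible normally, sole backend when all replicas are down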
After a patroni restart, pg_hba rules will be overwritten by patroni.
This means any modification to hba-related variables should be applied to patroni.yml to take permanent effect.
This issue could be solved by templating the patroni.yml hba section (working on it).
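A minimal sketch of what the templated section could look like (the rules shown are placeholders):
postgresql:
  pg_hba:                           # hba rules managed by patroni, rendered from inventory variables
    - local all postgres ident
    - host all all 10.0.0.0/8 md5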
Adjust haproxy service backend weight from the inventory.
Add a new variable haproxy_weight (default 100) for that, and render it into haproxy.cfg.
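A minimal inventory sketch with per-host weights (the host-vars layout is an assumption):
pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary, haproxy_weight: 100 }
    10.10.10.12: { pg_seq: 2, pg_role: replica, haproxy_weight: 200 }   # receives twice the read-only traffic
    10.10.10.13: { pg_seq: 3, pg_role: replica, haproxy_weight: 100 }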
RPM packages have 3 major advantages.
A delayed instance is a special kind of offline instance configured with recovery_min_apply_delay.
In real-world production environments, it is a common requirement to run ETL processes / interactive queries / SAGA on OLTP databases.
It is highly recommended to run such workloads on a dedicated instance (which may be named offline) to avoid undesired side effects on the entire cluster.
Thus, we need a mechanism to MARK a specific instance for that purpose, and a new default role to MARK users with that sort of need.
We may introduce a new default role dbrole_offline, which has the same privileges as dbrole_readonly but is not a member of dbrole_readonly. It will be granted to ETL/SAGA/interactive users.
Members of dbrole_offline will only be allowed by hba on instances with pg_role = offline or variable pg_offline_query = true.
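A minimal pg_hba sketch of that restriction, rendered only on offline-eligible instances (CIDR and auth method are assumptions):
host all +dbrole_offline 10.0.0.0/8 md5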
Upgrade user creation interface to support complete features.
pg_users:
  - username: test                    # example production user with read-write access
    password: test                    # example user's password
    options: LOGIN                    # extra options
    groups: [ dbrole_readwrite ]      # dbrole_admin|dbrole_readwrite|dbrole_readonly
    comment: default test user for production usage
    pgbouncer: true                   # add to pgbouncer
The new interface includes complete customizable options for a postgres user/role:
pg_users:
  # complete example of user/role definition for a production user
  - name: dbuser_meta               # example production user with read-write access
    password: DBUser.Meta           # example user's password, can be encrypted
    login: true                     # can login, true by default (should be false for a role)
    superuser: false                # is superuser? false by default
    createdb: false                 # can create database? false by default
    createrole: false               # can create role? false by default
    inherit: true                   # can this role use inherited privileges?
    replication: false              # can this role do replication? false by default
    bypassrls: false                # can this role bypass row level security? false by default
    connlimit: -1                   # connection limit, -1 disables limit
    expire_at: '2030-12-31'         # 'timestamp' when this role expires
    expire_in: 365                  # now + n days when this role expires (OVERWRITES expire_at)
    roles: [dbrole_readwrite]       # dbrole_admin|dbrole_readwrite|dbrole_readonly
    pgbouncer: true                 # add this user to pgbouncer? false by default (true for production users)
    parameters:                     # user's default search path
      search_path: public
    comment: test user
Since pigsty 0.6 is released, the corresponding documentation should be updated in time.
Working on it.
Pigsty could be used as a pure monitoring system without the provisioning part.
So it could monitor Postgres instances created by other provisioning solutions, such as Aliyun MyBase.
I've received a request for supporting MyBase PostgreSQL.
It will be tested and added in the next release.
Currently, if you are using static prometheus target discovery, the target file is a monolith that lies in /etc/prometheus/targets/all.yml.
It would be good to split that monolithic file into small files, for example one yml file per cluster.
Besides, for people who are using a per-cluster style inventory, this could work too.
The target files would be:
/etc/prometheus/targets/pg-test.yml
/etc/prometheus/targets/pg-meta.yml
/etc/prometheus/targets/pg-other-cluster.yml
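A minimal sketch of one per-cluster file in prometheus file_sd format (labels and exporter port are assumptions):
# /etc/prometheus/targets/pg-test.yml
- labels: { cls: pg-test }
  targets: [ '10.10.10.11:9630', '10.10.10.12:9630', '10.10.10.13:9630' ]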
Bug: this does not work as expected:
systemctl reload pgbouncer
Currently, service is achieved by haproxy node ports.
To integrate with other load balancer providers, we need to extract the service abstraction from it.
For example:
pg_services:
  # default service will route {ip|name}:5436 to primary postgres (5436->5432) (example service)
  - name: default                   # service's actual name is {{ pg_cluster }}-{{ service.name }}
    src_ip: "*"                     # service bind ip address, * for all, vip for cluster virtual ip address
    src_port: 5436                  # bind port, mandatory
    dst_port: postgres              # target port: postgres|pgbouncer|port_number, pgbouncer(6432) by default
    check_method: http              # health check method: only http is available for now
    check_port: patroni             # health check port: patroni|pg_exporter|port_number, patroni by default
    check_url: /primary             # health check url path, / as default
    check_code: 200                 # health check http code, 200 as default
    selector: "[]"                  # instance selector
    haproxy:                        # haproxy specific fields
      maxconn: 3000                 # default front-end connection
      balance: roundrobin           # load balance algorithm (roundrobin by default)
      default_server_options: 'inter 3s fastinter 1s downinter 5s rise 3 fall 3 on-marked-down shutdown-sessions slowstart 30s maxconn 3000 maxqueue 128 weight 100'
Support monitoring multiple databases simultaneously.
Low priority.
The standard deployment used to require 4 nodes, which may be a barrier for users.
Switching to a single-node deployment by default will simplify the installation procedure and enable other possibilities, such as using pigsty as a data-analysis environment.
The default configuration entry pg_default_privileges for pigsty has a misconfiguration:
pg_default_privilegs: # objects created by dbsu and admin will have their privileges properly set
  - GRANT USAGE ON SCHEMAS TO dbrole_readonly
  - GRANT SELECT ON TABLES TO dbrole_readonly
  - GRANT SELECT ON SEQUENCES TO dbrole_readonly
  - GRANT EXECUTE ON FUNCTIONS TO dbrole_readonly
  - GRANT USAGE ON SCHEMAS TO dbrole_offline
  - GRANT SELECT ON TABLES TO dbrole_offline
  - GRANT SELECT ON SEQUENCES TO dbrole_offline
  - GRANT EXECUTE ON FUNCTIONS TO dbrole_readonly
  - GRANT INSERT, UPDATE, DELETE ON TABLES TO dbrole_readwrite
  - GRANT USAGE, UPDATE ON SEQUENCES TO dbrole_readwrite
  - GRANT TRUNCATE, REFERENCES, TRIGGER ON TABLES TO dbrole_admin
  - GRANT CREATE ON SCHEMAS TO dbrole_admin
  - GRANT USAGE ON TYPES TO dbrole_admin
The line:
- GRANT EXECUTE ON FUNCTIONS TO dbrole_readonly
should be
- GRANT EXECUTE ON FUNCTIONS TO dbrole_offline
Update pigsty.yml configuration file.
Currently, postgres instance identity is managed by Consul service registration, which is implemented as a local consul service configuration file in /etc/consul.d/svc-***.json.
It is a distributed approach that allows instances themselves to 'report' their identity.
It does have some potential security issues, such as a wrong identity caused by malicious or accidental misconfiguration.
Some may prefer a centralized approach, e.g. managing all identities on the meta node, since we already have that information in the inventory.
Take prometheus for example.
We can provide an alternative option to use a static target identity file rather than service discovery, which may reduce the hazard of operating and reloading consul.
We can generate the prometheus configuration from the inventory and label all instances properly.
It also makes it much easier to integrate with other provisioning systems.
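A minimal scrape-config sketch for this alternative, using file-based discovery (job name and path are assumptions):
scrape_configs:
  - job_name: pg
    file_sd_configs:
      - files: [ '/etc/prometheus/targets/*.yml' ]   # static identity files generated from the inventory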
PostgreSQL 14 alpha is released, and the first beta will be released on 5/20.
Add support for the new metrics of PostgreSQL 14:
pg_stat_wal
pg_stat_activity.command
pg_stat_replication_slots
pg_stat_statements
pg_stat_database
Add support for monitor-only deployment, which only initializes infra on the meta node and monitoring components on common nodes.
There are two major reasons for the UI overhaul.
The original option named pg_default_privilegs has a typo in it: privilegs should be privileges.
Fixing it actually slightly changes the API.