Comments (12)
Agreed. Even before I created dbtoyaml, I preferred the "single object" model for storing, e.g., table definitions, in a VCS. You can do that, manually, by using "dbtoyaml -t" repeatedly, but
- it is cumbersome when you have many tables, etc., (even if you script it, you could inadvertantly omit some), and
- the individual tablename.yaml files can't be simply concatenated for processing by yamltodb without pre-processing to remove redundant "schema xxx" lines.
In addition, I believe programmers and DBAs would prefer to store the textual definition of views and function source as separate objects (non-redundantly) in a nicely indented format rather the output from pyyaml. And yes, the serialization enhacement should be optional.
Thinking out loud, perhaps this could be prototyped with a script that serialized the "all in one" format to separate files, and a converse tool to merge them back for input to yamltodb.
from pyrseas.
In addition, I believe programmers and DBAs would prefer to store the textual definition of views and function source as separate objects (non-redundantly) in a nicely indented format rather the output from pyyaml. And yes, the serialization enhacement should be optional.
Exactly! Same holds for tables — storing them in plain DDL would be nice too (this is subject for another ticket, but anyway... that would require additional effort for building SQL/DDL parser and maintaining its compatibility with original PostgreSQL's parser — maybe it would be nice to try to include Pyrseas as PostgresSQL's contrib module, what do you think?).
from pyrseas.
I'm not keen on storing the SQL DDL for tables and particularly on parsing SQL, which is an inferior language IMHO (I prefer D, aka Tutorial D). Apgdiff already provides the capability of storing pg_dump output but it's hard to provide full coverage of PG features that way. As to contrib, Pyrseas is already in PGXN. I had started an attempt to embed Pyrseas (dbobject) code as a PL/Pythonu module(s) but didn't get too far.
from pyrseas.
I've done some experimental work on this, see commit 40698292cc5bc6793a545322b7268bf83ec2b43e:
View.definition
andFunction.source
are now wrapped inpyrseas.yaml.MultiLineStr
to be serialized using|
style. That would make serialized definitions more readable.dbtoyaml
andyamltodb
commands can now serialize/deserialize schema to/from repository — directory withschema.yaml
and separate YAML file for each view with filename in form ofschemaname.viewname.yaml
.
I found it a bit difficult to store views as plain SQL files because COMMENT
statement should be stored separately of CREATE VIEW...
statement — that would require minimal parsing (actually just split two statements by ;
), but if we do the same with functions — that would not be so easy. I think we could stick to some conventions like — "COMMENT
statement should be the last one in file" or something like this, though I'm not a big fan of those non-intuitive implementation-dependent things.
Regarding functions — I also would like to store them in separate files, but besides all points above there is another difficulty — how to store functions with the same name but different arguments' types — a. store them in single file or b. store them separately but how we should encode filenames then?
Given all these questions I still found that storing database schema in multiple files feels more natural.
from pyrseas.
Very interesting. It'll take me a while to digest it and see how it all works out. With regard to functions, if separate files are used, the names would have to follow the keylist attribute of class Proc, i.e., schema (name), (function) name, and arguments. For example, given public functions f(int, int) and f(int, text), the filenames could be public.f.integer_integer.yaml
and public.f.integer_text.yaml
. There are issues of course: f() stored as f..yaml
? how to deal with data types with embedded spaces?
from pyrseas.
Just after thinking a bit more about functions — I think functions with the same name are closely related to each other so it might be a good idea to store them in single file.
from pyrseas.
I cloned your repo to try things out. I'm not sure I understand what you did in pyrseas/yaml.py
: importing yaml in a module named 'yaml' looks like looking for trouble. In any case, when I try to run dbtoyaml, I see:
$ export PYTHONPATH=$PWD
$ pyrseas/dbtoyaml.py --help
Traceback (most recent call last):
File "pyrseas/dbtoyaml.py", line 12, in <module>
from pyrseas import yaml
File "/home/jma/src/andreypopp/pyrseas/yaml.py", line 5, in <module>
import yaml
File "/home/jma/src/andreypopp/pyrseas/yaml.py", line 14, in <module>
yaml.add_representer(MultiLineStr, MultiLineStr_presenter)
AttributeError: 'module' object has no attribute 'add_representer'
If I print(dir(yaml)) before the line in error, I see:
['MultiLineStr', 'MultiLineStr_presenter', '__all__', '__builtins__', '__doc__', '__file__', '__name__', '__package__', 'absolute_import', 'yaml']
My test environment is Python 2.7.3rc2 on Linux (Debian), but also saw the same behavior with Python 3.2.3rc2.
from pyrseas.
This is an interesting thread about splitting db objects in pg_dump output, in particular, various observations on directory structures for the output and making unique names for functions:
http://archives.postgresql.org/pgsql-hackers/2010-12/threads.php#02285
http://archives.postgresql.org/pgsql-hackers/2011-01/thrd16.php#00047
Also, in trying to keep dbtoyaml options similar/parallel to pg_dump options, I think the --repository
option could be specified as -Fd/--format=directory
since the default format will be a single file (somewhat equivalent to -Fp/--format=plain
). I realize that the 'directory' format of pg_dump doesn't output the same structure as we're suggesting here, so perhaps --format=tree
or --format=split
may be more appropriate.
from pyrseas.
Andrey, I finally got around to looking a bit more at this (because Josep Martínez Vilà is also interested in this feature). Since I couldn't figure out how it was apparently working for you and not for me, I went with a simpler solution, pyrseas/yamlutil.py
, as follows:
"""Pyrseas YAML utilities"""
from yaml import add_representer
class MultiLineStr(str):
""" Marker for multiline strings"""
def MultiLineStr_presenter(dumper, data):
return dumper.represent_scalar('tag:yaml.org,2002:str', data, style='|')
add_representer(MultiLineStr, MultiLineStr_presenter)
With this and the changes to the __init__
methods of the Function
and View
classes, I was able to produce the function source and view definitions as multi-line strings, e.g.,
schema public:
description: standard public schema
function f1(integer, integer):
language: sql
returns: integer
source: |-
SELECT GREATEST($1, $2)
...
view v1:
definition: |2-
SELECT now()::date AS today;
view v2:
definition: |2-
SELECT t1.c1, t1.c3 * 2
FROM t1;
However, this did not work properly with some actual views or functions (from production/staging databases) that had large text definitions. This will need some research and further testing.
I'll look at the repository changes next.
from pyrseas.
Change f13980d, based on Andrey's experimental work above has been committed. Now view definitions, function source text and object descriptions (COMMENT text) are formatted in YAML block style.
from pyrseas.
Change 94574eb is initial submission to address this new feature.
from pyrseas.
Change 6a6565d essentially completes the work for this enhacement (although some refinements are still needed).
from pyrseas.
Related Issues (20)
- ALTER SEQUENCE OWNER comes before CREATE TABLE HOT 3
- SSL Connection to remote database HOT 3
- datacopy CSV with null trailing columns not importing with -u HOT 7
- CREATE EXTENSION statements in the change script HOT 5
- [Contribution] Docker support HOT 2
- Sequences processing doesn't skip schemas defined with '-N' option. HOT 3
- dbtoyaml fails when dealing with extension-owned triggers/functions HOT 7
- Multiline Strings (source: |2) Output Unstable HOT 5
- Enable Github Discussions HOT 2
- Fails to Install: Error: pg_config executable not found HOT 5
- Error Using dbtoyaml With Supabase Schema HOT 67
- dbtoyaml Fails When PG Password Has Spaces (Improper Escaping) HOT 1
- Convert .travis.yml to GH Actions
- Fold pgdbconn back into Pyrseas HOT 1
- Key error when converting aggregate to YAML HOT 4
- plv8 to extension and fix reserved words in the primary key HOT 5
- Add REFERENCING tables in trigger HOT 3
- Support columns GENERATED AS (x) STORED HOT 2
- Support --revert for tables HOT 4
- the yamltodb tool is not placing the names of schemas, tables and columns in double quotes HOT 2
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from pyrseas.