Giter Club home page Giter Club logo

Comments (12)

jmafc avatar jmafc commented on September 17, 2024

Agreed. Even before I created dbtoyaml, I preferred the "single object" model for storing, e.g., table definitions, in a VCS. You can do that, manually, by using "dbtoyaml -t" repeatedly, but

  • it is cumbersome when you have many tables, etc., (even if you script it, you could inadvertantly omit some), and
  • the individual tablename.yaml files can't be simply concatenated for processing by yamltodb without pre-processing to remove redundant "schema xxx" lines.

In addition, I believe programmers and DBAs would prefer to store the textual definition of views and function source as separate objects (non-redundantly) in a nicely indented format rather the output from pyyaml. And yes, the serialization enhacement should be optional.

Thinking out loud, perhaps this could be prototyped with a script that serialized the "all in one" format to separate files, and a converse tool to merge them back for input to yamltodb.

from pyrseas.

andreypopp avatar andreypopp commented on September 17, 2024

In addition, I believe programmers and DBAs would prefer to store the textual definition of views and function source as separate objects (non-redundantly) in a nicely indented format rather the output from pyyaml. And yes, the serialization enhacement should be optional.

Exactly! Same holds for tables — storing them in plain DDL would be nice too (this is subject for another ticket, but anyway... that would require additional effort for building SQL/DDL parser and maintaining its compatibility with original PostgreSQL's parser — maybe it would be nice to try to include Pyrseas as PostgresSQL's contrib module, what do you think?).

from pyrseas.

jmafc avatar jmafc commented on September 17, 2024

I'm not keen on storing the SQL DDL for tables and particularly on parsing SQL, which is an inferior language IMHO (I prefer D, aka Tutorial D). Apgdiff already provides the capability of storing pg_dump output but it's hard to provide full coverage of PG features that way. As to contrib, Pyrseas is already in PGXN. I had started an attempt to embed Pyrseas (dbobject) code as a PL/Pythonu module(s) but didn't get too far.

from pyrseas.

andreypopp avatar andreypopp commented on September 17, 2024

I've done some experimental work on this, see commit 40698292cc5bc6793a545322b7268bf83ec2b43e:

  • View.definition and Function.source are now wrapped in pyrseas.yaml.MultiLineStr to be serialized using | style. That would make serialized definitions more readable.
  • dbtoyaml and yamltodb commands can now serialize/deserialize schema to/from repository — directory with schema.yaml and separate YAML file for each view with filename in form of schemaname.viewname.yaml.

I found it a bit difficult to store views as plain SQL files because COMMENT statement should be stored separately of CREATE VIEW... statement — that would require minimal parsing (actually just split two statements by ;), but if we do the same with functions — that would not be so easy. I think we could stick to some conventions like — "COMMENT statement should be the last one in file" or something like this, though I'm not a big fan of those non-intuitive implementation-dependent things.

Regarding functions — I also would like to store them in separate files, but besides all points above there is another difficulty — how to store functions with the same name but different arguments' types — a. store them in single file or b. store them separately but how we should encode filenames then?

Given all these questions I still found that storing database schema in multiple files feels more natural.

from pyrseas.

jmafc avatar jmafc commented on September 17, 2024

Very interesting. It'll take me a while to digest it and see how it all works out. With regard to functions, if separate files are used, the names would have to follow the keylist attribute of class Proc, i.e., schema (name), (function) name, and arguments. For example, given public functions f(int, int) and f(int, text), the filenames could be public.f.integer_integer.yaml and public.f.integer_text.yaml. There are issues of course: f() stored as f..yaml? how to deal with data types with embedded spaces?

from pyrseas.

andreypopp avatar andreypopp commented on September 17, 2024

Just after thinking a bit more about functions — I think functions with the same name are closely related to each other so it might be a good idea to store them in single file.

from pyrseas.

jmafc avatar jmafc commented on September 17, 2024

I cloned your repo to try things out. I'm not sure I understand what you did in pyrseas/yaml.py: importing yaml in a module named 'yaml' looks like looking for trouble. In any case, when I try to run dbtoyaml, I see:

$ export PYTHONPATH=$PWD
$ pyrseas/dbtoyaml.py --help
Traceback (most recent call last):
  File "pyrseas/dbtoyaml.py", line 12, in <module>
    from pyrseas import yaml
  File "/home/jma/src/andreypopp/pyrseas/yaml.py", line 5, in <module>
    import yaml
  File "/home/jma/src/andreypopp/pyrseas/yaml.py", line 14, in <module>
    yaml.add_representer(MultiLineStr, MultiLineStr_presenter)
AttributeError: 'module' object has no attribute 'add_representer'

If I print(dir(yaml)) before the line in error, I see:

['MultiLineStr', 'MultiLineStr_presenter', '__all__', '__builtins__', '__doc__', '__file__', '__name__', '__package__', 'absolute_import', 'yaml']

My test environment is Python 2.7.3rc2 on Linux (Debian), but also saw the same behavior with Python 3.2.3rc2.

from pyrseas.

jmafc avatar jmafc commented on September 17, 2024

This is an interesting thread about splitting db objects in pg_dump output, in particular, various observations on directory structures for the output and making unique names for functions:

http://archives.postgresql.org/pgsql-hackers/2010-12/threads.php#02285
http://archives.postgresql.org/pgsql-hackers/2011-01/thrd16.php#00047

Also, in trying to keep dbtoyaml options similar/parallel to pg_dump options, I think the --repository option could be specified as -Fd/--format=directory since the default format will be a single file (somewhat equivalent to -Fp/--format=plain). I realize that the 'directory' format of pg_dump doesn't output the same structure as we're suggesting here, so perhaps --format=tree or --format=split may be more appropriate.

from pyrseas.

jmafc avatar jmafc commented on September 17, 2024

Andrey, I finally got around to looking a bit more at this (because Josep Martínez Vilà is also interested in this feature). Since I couldn't figure out how it was apparently working for you and not for me, I went with a simpler solution, pyrseas/yamlutil.py, as follows:

"""Pyrseas YAML utilities"""

from yaml import add_representer


class MultiLineStr(str):
    """ Marker for multiline strings"""


def MultiLineStr_presenter(dumper, data):
    return dumper.represent_scalar('tag:yaml.org,2002:str', data, style='|')
add_representer(MultiLineStr, MultiLineStr_presenter)

With this and the changes to the __init__ methods of the Function and View classes, I was able to produce the function source and view definitions as multi-line strings, e.g.,

schema public:
  description: standard public schema
  function f1(integer, integer):
    language: sql
    returns: integer
    source: |-
      SELECT GREATEST($1, $2)
  ...
  view v1:
    definition: |2-
       SELECT now()::date AS today;
  view v2:
    definition: |2-
       SELECT t1.c1, t1.c3 * 2
         FROM t1;

However, this did not work properly with some actual views or functions (from production/staging databases) that had large text definitions. This will need some research and further testing.

I'll look at the repository changes next.

from pyrseas.

jmafc avatar jmafc commented on September 17, 2024

Change f13980d, based on Andrey's experimental work above has been committed. Now view definitions, function source text and object descriptions (COMMENT text) are formatted in YAML block style.

from pyrseas.

jmafc avatar jmafc commented on September 17, 2024

Change 94574eb is initial submission to address this new feature.

from pyrseas.

jmafc avatar jmafc commented on September 17, 2024

Change 6a6565d essentially completes the work for this enhacement (although some refinements are still needed).

from pyrseas.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.