docs's People

Contributors

andrewxhill, avichalp, brunocalza, carsonfarmer, cyrilbois, danielboye, datadanne, dependabot[bot], dimitrov-d, dtbuchholz, ethanol48, gitbook-bot, joewagner, jsign, omahs, sanderpick, tabascoatw

docs's Issues

[DOC-49] Add / update SQL playbook 101 docs

  • Finish the rest of supported JOIN docs
  • Add guide for aliasing
  • Add brief about relationships (aka basics about primary keys and 1:1, 1:many, etc)
  • Include guides for all supported compound select statements
  • Include examples for all supported expressions
    • Operators & parse-affecting attributes
    • Parameters
    • CASE
    • BETWEEN
    • CAST
    • EXISTS
  • Examples for all supported aggregate functions
  • A list of all (un)supported scalar functions (note: only examples “needed” are the existing docs for JSON functions). WIP list here.
  • Open questions:
    • Is WINDOW and/or OVER supported?
    • Is WITH supported?
    • RAISE?
    • RECURSIVE?

From SyncLinear.com | DOC-49

Docs: create a build script that removes new line characters

Summary

Paragraphs within the docs inject new line characters mid-sentence, which makes it difficult to export markdown and repurpose it elsewhere (e.g., copy/paste into a Notion page). Ideally, there's a build script that fixes this such that the markdown doesn't have the new lines, and exporting specs into other tools is straightforward.

Example

See line 57 -- it should be a continuous sentence instead of starting a new line with line 58's "semicolon-separated...":

The core Tableland SQL parser accepts an SQL statement list which is a
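A minimal sketch of what such a build step might look like, assuming the docs are plain markdown files. The function name `unwrap_paragraphs` and the sample text are hypothetical and not part of the repo. It joins hard-wrapped paragraph lines and leaves fenced code blocks, headings, list items, block quotes, and table rows untouched:

```python
import re

# Opening or closing line of a fenced code block.
FENCE = re.compile(r"^\s*(```|~~~)")
# Lines that start a new block instead of continuing the paragraph above.
BLOCK_START = re.compile(r"^\s*(#{1,6}\s|[-*+]\s|\d+[.)]\s|>|\|)")


def unwrap_paragraphs(markdown: str) -> str:
    """Join hard-wrapped paragraph lines; leave every other block as-is."""
    out: list[str] = []
    in_fence = False
    can_join = False  # True while the previous output line is an open paragraph
    for line in markdown.splitlines():
        if FENCE.match(line):
            in_fence = not in_fence
            out.append(line)
            can_join = False
        elif in_fence or not line.strip() or BLOCK_START.match(line):
            out.append(line)
            can_join = False
        elif can_join:
            # Continuation of the current paragraph: merge with a single space.
            out[-1] = out[-1].rstrip() + " " + line.strip()
        else:
            out.append(line)
            can_join = True
    return "\n".join(out)


sample = "A paragraph that was\nwrapped mid-sentence.\n\n- a list item\n- another item"
print(unwrap_paragraphs(sample))
```

This is deliberately conservative: wrapped continuation lines inside list items are not merged into the item, and intentional hard line breaks (two trailing spaces) would be lost, so a real script would need to decide how to treat those.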

Are floating point values in JSON text columns problematic?

The below is taken from a Discord thread:

@asutula: Did we evaluate the consequences of storing floating point values in text columns of json data? I think it would be ok since the value is handled as and stored as a string on the way in, and could maybe be a problem for read queries where you use a function like json_extract() where the docs explain:

If only a single path P1 is provided, then the SQL datatype of the result is NULL for a JSON null, INTEGER or REAL for a JSON numeric value...

So if we really must stick to our rule that read query results must be deterministic, does this present a problem?

@brunocalza: interesting. gotta investigate that a bit more. but tbh it's getting hard to be very strict with that rule. there are already some scenarios where we cannot guarantee a deterministic result. for example, ORDER BY is one of them. when you sort by a column that is not able to break ties you can have problems,

so let's see how that goes

@jsign: mmm i think that JSON situation might be a problem for the new INSERT with SELECT statements. in that case, feels we should block -> in those SELECTS maybe.
or even in simple UPDATEs, if an UPDATE has UPDATE .... WHERE foo->imafloat :/
so maybe -> should be blocked in all write-queries.
just sharing first thoughts here. agree on thinking of it
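To make the concern concrete, here is a small sketch using Python's built-in `sqlite3` module (assuming the bundled SQLite has the JSON functions available, which is the case for most recent builds). The table `t` and its contents are made up for illustration. The value is stored as text, but `json_extract()` hands back a REAL for the floating point field:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (data TEXT)")
# The JSON document goes in as a plain string.
conn.execute("INSERT INTO t VALUES (?)", ('{"price": 0.1, "count": 3}',))

# The column itself holds text...
(stored_type,) = conn.execute("SELECT typeof(data) FROM t").fetchone()

# ...but extracting a numeric member yields a SQL numeric type.
price_type, count_type = conn.execute(
    "SELECT typeof(json_extract(data, '$.price')),"
    "       typeof(json_extract(data, '$.count')) FROM t"
).fetchone()

print(stored_type, price_type, count_type)  # text real integer
```

So the float only becomes a REAL at the point where a query extracts it, which is why the thread focuses on where extraction should be allowed.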

docs: improve the explanation about what is allowed in the prefix of a table's name

Currently, we are mixing a regex and a text statement to explain to our users what is allowed as a table's name prefix:

Where prefix is optional, and may include any characters from the regular expression ([A-Za-z0-9\_]+), but cannot start with a number.

We should follow @carsonfarmer's suggestion:

but now reading that, we should not mention regex in the spec doc like that at all, it doesn't read very nicely. it should just describe in prose what we want, and then we have a regex that implements that description in the code

I'd say use this regex: ^([A-Za-z]+[A-Za-z0-9_]*) and then describe it in the docs as: "Where prefix is optional. When a prefix is included it must start with a letter, and be followed by any combination of (zero or more) letters, numbers, and/or underscores."
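As a quick check of the proposed rule, here is a sketch in Python. The helper name `is_valid_prefix` is hypothetical, and using `fullmatch` (so that the whole prefix must match, not just its start) is an assumption on top of the regex quoted above, which has no end anchor:

```python
import re

# Proposed pattern: one or more letters, then any mix of letters, digits, underscores.
PREFIX_RE = re.compile(r"[A-Za-z]+[A-Za-z0-9_]*")


def is_valid_prefix(prefix: str) -> bool:
    """Return True if `prefix` is empty (it is optional) or matches the rule."""
    return prefix == "" or PREFIX_RE.fullmatch(prefix) is not None


for candidate in ["healthbot", "my_table_2", "", "2fast", "_private", "has-dash"]:
    print(repr(candidate), is_valid_prefix(candidate))
```

The first three candidates are accepted and the last three are rejected, which matches the prose description: start with a letter, then letters, numbers, and/or underscores.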

Spec: we need to solve non-determinism for rowids

Today we have a problem with letting users manipulate the rowid value of a row, which can result in non-determinism because (quoting SQLite spec):

If the largest ROWID is equal to the largest possible integer (9223372036854775807) then the database engine starts picking positive candidate ROWIDs at random until it finds one that is not previously used.

We've been chatting with @brunocalza about this problem, and it's quite nuanced.

Some notes about this problem:

  • If the user defines an INTEGER PRIMARY KEY (with or without AUTOINCREMENT), it becomes an alias for rowid, which is the crux of the problem.
  • If we allow the user to assign 9223372036854775807 to rowid, _rowid_, or oid, then the next insert would be non-deterministic.
  • The above bullet also applies to a column that the user defined as INTEGER PRIMARY KEY (with or without AUTOINCREMENT).
  • PRIMARY KEYs that aren't INTEGER won't create an alias, which can help narrow down the spec change.
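The notes above can be reproduced with a short sketch using Python's built-in `sqlite3` module. The table `t` and column names are made up for illustration, and this only covers the case without AUTOINCREMENT:

```python
import sqlite3

MAX_ROWID = 9223372036854775807  # largest possible 64-bit signed integer

conn = sqlite3.connect(":memory:")
# `id` is an INTEGER PRIMARY KEY, so it becomes an alias for rowid.
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")

# The user explicitly assigns the largest possible rowid.
conn.execute("INSERT INTO t (id, v) VALUES (?, 'a')", (MAX_ROWID,))
# The next insert lets SQLite choose the rowid, which it now does at random.
conn.execute("INSERT INTO t (v) VALUES ('b')")

rows = conn.execute("SELECT rowid, id, v FROM t ORDER BY v").fetchall()
for rowid, id_, v in rows:
    print(v, rowid, id_)
```

Running this twice will most likely print different ids for row `b`, which is the non-determinism in question. Note that SQLite documents a different behavior when AUTOINCREMENT is declared: once the largest rowid has been used, further inserts fail with an SQLITE_FULL error instead of picking a random value.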

Tentative/fuzzy idea skeleton:

  • If a user wants to define an INTEGER PRIMARY KEY (with or without AUTOINCREMENT), it should have a fixed name defined by the spec (e.g., __id). The idea is to allow the parser to detect invalid queries (see point below).
  • The spec should only allow direct assignments or updates to the columns rowid, _rowid_, oid, or __id with constant values lower than some number X (where X << 9223372036854775807).
  • User-defined PRIMARY KEYs that aren't INTEGER look safe to have arbitrary assignments.

These are some quick notes; we need to jump into a convo about this to dive deeper.
We have to remember to discuss what this means for existing history.

[DOC-52] Update specs structure

The specs folder in root is not used. Instead, it was duplicated/slightly altered into the docs folder so that it could be successfully rendered—i.e., ./docs/specs is what gets shown when visiting the SQL specification page.

  • Delete the root specs folder.
    • Note: the reason the ./specs markdown files cannot be used is that LaTeX did not work for any maths unless the file is placed in the docs folder. Perhaps there's a way to fix this, which would make it possible to keep specs top-level instead of nested in docs; a nice to have.
  • Restructure the folder in docs into decomposed sidebar pages (aka not a monolithic single page of all SQL specs but easier to navigate).

Alternatively, it may make sense to use deployment build scripts to enable this structure. Namely, the README outlines how the specs top-level folder creates a README using files in the specs/sql folder. Maybe that build script should keep specs/sql and copy what's needed into the docs/specs folder. Just an idea.
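A rough sketch of the copy step, assuming the spec sources are flat `.md` files; the function name `sync_specs` is hypothetical and this is not the existing build script:

```python
import shutil
from pathlib import Path


def sync_specs(src: Path, dest: Path) -> list[Path]:
    """Copy every markdown file in `src` into `dest`, creating `dest` if needed."""
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for md in sorted(src.glob("*.md")):
        target = dest / md.name
        shutil.copy2(md, target)  # copy2 also preserves file timestamps
        copied.append(target)
    return copied


if __name__ == "__main__":
    # Paths taken from the issue text; adjust to the real repo layout.
    for path in sync_specs(Path("specs/sql"), Path("docs/specs")):
        print("copied", path)
```

With something like this, specs/sql would stay the single source of truth and docs/specs would become a generated folder.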

From SyncLinear.com | DOC-52

Spec: We need to decide on and document a canonical string encoding for our AST / parsed SQL statements

It seems like so far we are assuming lowercase string output for all AST components, which seems fine. Since we aren't exposing the AST to the outside world (yet), a canonical encoding (JSON?) of the tree state itself isn't really required at this time... but it would be nice to spec and document the string encoding and finalize that sooner rather than later.

I would propose we stick with lowercase string outputs. It would be nice to have a canonical JSON representation as well, at least to help with future-proofing ourselves to a degree. In that case, I would propose we use lower-case, snake-case field names for all AST nodes and elements.

If we are happy with this, then we can update our spec with this information.
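For discussion, a minimal sketch of what a canonical JSON encoding could look like in Python. The node shape (`node_type`, `result_columns`, `from_table`) is entirely hypothetical, and sorting keys with compact separators is an assumption about what "canonical" should mean here, not something the spec defines today:

```python
import json


def canonical_json(node: dict) -> str:
    """Serialize an AST node so that equal trees always yield identical strings."""
    # Sorted keys and fixed separators remove ordering and whitespace differences.
    return json.dumps(node, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


# Hypothetical AST node using lowercase, snake_case field names.
select_node = {
    "node_type": "select",
    "result_columns": [{"node_type": "column", "column_name": "id"}],
    "from_table": "healthbot",
}

print(canonical_json(select_node))
```

Two trees with the same content then serialize to the same string regardless of the order in which their fields were built.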

cc @sanderpick @brunocalza @jsign
