Giter Club home page Giter Club logo

Comments (3)

matriv avatar matriv commented on July 25, 2024

In the first version of filter push down: #15653 we only enabled this for safe operators, which is the standard comparison and logical operators, and if anything else is included we don't push down the filter at all.

I see 2 improvements here:

  • Maybe enable more functions for which we have 100% compatibility with PostgreSQL
  • When expressions are combined with AND split the query and do partial push-down.

from crate.

mfussenegger avatar mfussenegger commented on July 25, 2024

🆗 some filters are pushed down

EXPLAIN SELECT * FROM doc.foreign_t01 WHERE c1 = 'Hello' or c1 LIKE 'Test%';
-- Filter[(c1 LIKE 'Test%')] (rows=unknown)
-- └ ForeignCollect[doc.foreign_t01 | [c1, c2] | (c1 = 'Hello')] (rows=unknown)

Splitting on OR and pushing down one side won't work.
The given example would exclude cases where c1 = 'Hello' is false but c1 LIKE 'Test%' is true.

+--------------+-----------------+-------+
| c1 = 'Hello' | c1 LIKE 'Test%' |  OR   |
+--------------+-----------------+-------+
| true         | true            | true  |
| true         | false           | true  |
| true         | null            | true  |

^ only case above would be visible for Filter c1 LIKE 'Test%'

                                         |
| false        | true            | true  | < would be missing
| false        | false           | false |
| false        | null            | null  |
                                         |
| null         | true            | true  | < would be missing
| null         | false           | null  |
| null         | null            | null  |
+--------------+-----------------+-------+

Maybe enable more functions for which we have 100% compatibility with PostgreSQL

It is not just about PostgreSQL. JDBC fdw should work for any foreign database that has a JDBC driver.
That said - we could consider using DatabaseMetaData.getFunctions() in some way to figure out if functions are eligible. Or specialize the PostgreSQL case more.

from crate.

proddata avatar proddata commented on July 25, 2024

Splitting on OR and pushing down one side won't work.
The given example would exclude cases where c1 = 'Hello' is false but c1 LIKE 'Test%' is true.

Makes sense. The example has been chosen badly in this case. I updated it to better reflect the original issue.


We also tried to explicitly split the filters into a cte like ...

cr> EXPLAIN WITH foreign_select AS (SELECT * FROM doc.foreign_t01 WHERE c1 = 'Hello')
    SELECT * FROM foreign_select WHERE c2['module']  = 'Test%';
+------------------------------------------------------------------------+
| QUERY PLAN                                                             |
+------------------------------------------------------------------------+
| Rename[c1, c2] AS foreign_select (rows=0)                              |
|   └ Filter[(('Test%' = c2['module']) AND (c1 = 'Hello'))] (rows=0)     |
|     └ ForeignCollect[doc.foreign_t01 | [c1, c2] | true] (rows=unknown) |
+------------------------------------------------------------------------+

... which (un)fortunately gets optimised.

With SET optimizer_move_filter_beneath_rename = false; this leads to the desired output

cr> EXPLAIN  WITH foreign_select AS (SELECT * FROM doc.foreign_t01 WHERE c1 = 'Hello')
    SELECT * FROM foreign_select WHERE c2['module']  = 'Test%';
+----------------------------------------------------------------------------------+
| QUERY PLAN                                                                       |
+----------------------------------------------------------------------------------+
| Filter[('Test%' = c2['module'])] (rows=0)                                        |
|   └ Rename[c1, c2] AS foreign_select (rows=unknown)                              |
|     └ ForeignCollect[doc.foreign_t01 | [c1, c2] | (c1 = 'Hello')] (rows=unknown) |
+----------------------------------------------------------------------------------+

Deactivating this by default however seems problematic.


Within PostgreSQL this most likely could be prevented with the MATERIALIZED keyword in the CTE, forcing a separate evaluation of the CTE. https://www.postgresql.org/docs/16/queries-with.html#QUERIES-WITH-CTE-MATERIALIZATION

from crate.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.