Giter Club home page Giter Club logo

Comments (9)

Smjert avatar Smjert commented on June 18, 2024

@Samuel-Roach Thanks for the writing and for catching this.
I have only given a cursory look; is it your understanding that sqlite is at fault and/or should we do something osquery side?

In the sqlite commit it seems this is an optimization done for a LEFT JOIN; is this also broken in other situations?

from osquery.

Samuel-Roach avatar Samuel-Roach commented on June 18, 2024

@Smjert My understanding is that SQLite is not at fault, as the optimisation they have made is probably fair, albeit could be further scrutinised for validation of this.

In my opinion, something should be done on the osquery side. I'm not 100% on the interraction between SQLite and Osquery code; however, the concept of required columns for a table is part of osquery, and is not a core part of SQL. Therefore, osquery should make a change in order to accommodate for this optimization made in SQLite.

I'm under the impression from the release notes (https://www.sqlite.org/releaselog/3_43_0.html) that this could affect all types of JOIN, as the optimization this is in relation to is generalised to RIGHT and FULL joins. To verify this I have run the following changes to c table in the query above:

c (
    path,
    subject_name
) AS (
    SELECT
        b.path,
        ac.subject_name
    FROM b
    FULL JOIN authenticode AS ac ON
        ac.path = b.path
)

and

c (
    path,
    subject_name
) AS (
    SELECT
        b.path,
        ac.subject_name
    FROM b
    RIGHT JOIN authenticode AS ac ON
        ac.path = b.path
)

In both cases, the end result is as follows, although it takes a little longer to get there:

W0305 17:13:34.864069 12656 virtual_table.cpp:975] Table authenticode was queried without a required column in the WHERE clause
W0305 17:13:34.866068 12656 virtual_table.cpp:986] Please see the table documentation: https://osquery.io/schema/#authenticode
Error: constraint failed

The query in the initial bug was the minimal reproducible example I could create, and it seems very temperamental when changing any aspect of the query. E.g. Removing the DISTINCT in the main query will stop this from happening.

from osquery.

directionless avatar directionless commented on June 18, 2024

However, as this is implying the same as NOT IN, they should be interchangeable without error.

I wonder how #8263 intersects this.

from osquery.

Samuel-Roach avatar Samuel-Roach commented on June 18, 2024

I think this issue might be present in earlier versions, but in a different form. I was experimenting further with using !=, and have found a use case where a call can be made to the authenticode table leading to the same error message in Osquery 5.11.0.

WITH a (
    path
) AS (
    SELECT
        path
    FROM services
),

b (
    path,
    subject_name
) AS (
    SELECT
        a.path,
        authenticode.subject_name
    FROM a
    INNER JOIN authenticode ON
        authenticode.path = a.path
)

SELECT DISTINCT
    a.path,
    b.subject_name
FROM a
LEFT JOIN b ON
    a.path = b.path
WHERE
    b.subject_name != 'Microsoft Windows';

Running this against Osquery 5.11.0 I see the following output:

W0311 11:27:26.173632 22456 services.cpp:124] 000000000340DF40: failed to query service description
W0311 11:27:26.231290 22456 services.cpp:124] 000000000340C58E: failed to query service description
W0311 11:27:26.254547 22456 services.cpp:124] 000000000340BC64: failed to query service description
W0311 11:27:26.270889 22456 services.cpp:124] 000000000340B4B8: failed to query service description
W0311 11:27:26.290596 22456 services.cpp:124] 000000000340A89C: failed to query service description
W0311 11:27:26.314883 22456 services.cpp:124] 0000000003409DFE: failed to query service description
W0311 11:27:26.381752 22456 virtual_table.cpp:975] Table authenticode was queried without a required column in the WHERE clause
W0311 11:27:26.385138 22456 virtual_table.cpp:986] Please see the table documentation: https://osquery.io/schema/#authenticode
Error: constraint failed

It would be worth noting that changing this to be

WHERE
    b.subject_name NOT IN ('Microsoft Windows');

also fails, however making the JOIN to authenticode in table b a LEFT JOIN results in the query working...

from osquery.

Smjert avatar Smjert commented on June 18, 2024

During the office hours we said that we wanted to resolve this, or at least avoid the regression, by first trying to patch the sqlite source code, removing that optimization.
The problem though is that the code that the commit touches has been changed since then (since it's also from almost 1 year ago), and I'm not sure if it's ideal to revert.

Furthermore there's something that doesn't checks with me.
Interestingly enough the very first query in the issue doesn't cause any error in my tests, while the FULL JOIN and RIGHT JOIN do. But in those cases is to be expected.
EDIT: The first query does fail, made an error in the test, the reasoning on the other kind of joins stands though.

A RIGHT JOIN or a FULL JOIN requires for all the right table rows to be present in the result of the join. The only way one can obtain all results from authenticode is querying it without constraints; but this is not possible because path is required.
So the FULL and RIGHT JOIN example queries do not seem incorrect to me.

I can reproduce with your very last example though, which is interesting. But I wonder if there isn't something else we are missing.

from osquery.

Smjert avatar Smjert commented on June 18, 2024

Something I just discovered is that it's possible to return SQLITE_CONSTRAINT from xBestIndex to say that the constraint is unusable.
This is documented in https://www.sqlite.org/vtab.html, in the "Return value" section. And even more clearly in "Enforcing Required Parameters On Table-Valued Functions", where it shows as an example a plan that's not acceptable for the function.

I tried it and it works; I think though the problem with this is when we are actually doing a query with no required constraint.
The problem is that the query returns with "Error: no query solution", which is not too helpful; but we have access to zErrMsg, that is a variable that can be set to bring an error back to the caller.

I haven't investigated further but we should also be able to distinguish the case of a plan not be valid due to a required constraint missing among valid plans, to avoid outputting the content of zErrMsg, and the case where the query actually failed due to that.

EDIT: Also for additional notes on this, since we talked about this specific thing during the office hours, when the xBestIndex plan does not contain a required column we set the cost to a high value we call kMaxIndexCost. The problem though is that this is not really the maximum cost possible, and this is why it still gets selected.
The documentation mentions about returning SQLITE_CONSTRAINT or setting the cost to infinity; the double infinity value, since estimatedCost is a double.

from osquery.

zwass avatar zwass commented on June 18, 2024

@Smjert great find! Looks like there is some attempt made to use this functionality but at a glance I can't tell if it's correct:

return SQLITE_CONSTRAINT;

from osquery.

Smjert avatar Smjert commented on June 18, 2024

@Smjert great find! Looks like there is some attempt made to use this functionality but at a glance I can't tell if it's correct:

return SQLITE_CONSTRAINT;

Yeah the difference it's that it's in the xFilter, which simply makes the whole query fail, and it's after all the plan evaluation.
We need to filter plans in xBestIndex

from osquery.

zwass avatar zwass commented on June 18, 2024

Ah yes I had just noticed that. It makes sense and seems like a path forward!

from osquery.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.