Ability to define metrics in dbt models (metriql/metriql #17, closed)

metriql commented on May 30, 2024

Ability to define metrics in dbt models

from metriql.

Comments (11)

dbrtly commented on May 30, 2024

FWIW, I think this is a bit too clever to be relevant at this phase of metriql. The proposal isn't in harmony with the Python principle of “prefer explicit over implicit”.

buremba commented on May 30, 2024

@dbrtly thanks for the input. The rationale behind this feature is to make the switch to metriql seamless for existing dbt projects. Power users usually have a convention for the metrics inside dbt models (here is an example: https://emilyriederer.netlify.app/post/convo-dbt/) and I'm trying to provide a way for them to use metriql by changing only a few lines of their existing files.

I agree with you on preferring explicit over implicit, but reducing friction is also important in this context. We're unlikely to implement such a solution without reaching a consensus anyway. :)

dbrtly commented on May 30, 2024

I hadn’t seen this post before. I get dbt errors if I leave out the yml files.
edit: I thought I would get errors. Turns out dbt doesn't validate like that. :(

Is it feasible to go further than a naming convention and get better reach? Users don't necessarily have a naming convention in common, but ANSI SQL is common. Could metriql automatically add metriql config to a yml file after scanning the code for the full pattern in the second-to-last CTE:

select … sum(…) as column_name

and automatically classify all of those columns as measures?
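
As a rough illustration of the scan described here, the following Python sketch looks for `agg(...) as column_name` with a regular expression and lists the matches as measure candidates. The function name, the output shape, and the set of aggregate functions are assumptions made for this example, not metriql behaviour, and the pattern is deliberately naive: it does not understand nested parentheses, CTE boundaries, or dialect-specific syntax.

```python
import re

# Matches "<agg>(<args>) as <alias>" for a few common aggregate functions.
# Deliberately naive: the arguments may not contain nested parentheses.
AGG_PATTERN = re.compile(
    r"\b(sum|count|avg|min|max)\s*\(\s*([^()]*?)\s*\)\s+as\s+(\w+)",
    re.IGNORECASE,
)


def find_measure_candidates(sql: str) -> list[dict]:
    """Return one dict per aggregate column found in the SQL text."""
    return [
        {"name": alias, "aggregation": func.lower(), "expression": args}
        for func, args, alias in AGG_PATTERN.findall(sql)
    ]


sql = """
select
    country,
    sum(amount) as total_amount,
    count(distinct user_id) as count_of_unique_users
from orders
group by 1
"""
candidates = find_measure_candidates(sql)

# A nested expression is silently missed by this pattern.
nested = find_measure_candidates("select sum(if(flag, v, null)) as t from x")
```

Here `candidates` holds two entries (`total_amount` and `count_of_unique_users`), while `nested` is empty, which shows how quickly a pattern-based scan loses coverage on real queries.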

buremba commented on May 30, 2024

I could not find a reliable way to do that yet and I don't think that I can. Here are the limitations:

The first option is parsing the SQL queries, but every database has a different dialect, so it's not as easy as it seems. The parser needs to understand the full syntax of the database, and I don't know of any fault-tolerant parser that skips the unknown parts of a query and looks only at the column definitions of the actual query.
The other solution would be running metadata queries in the target database to find out the aggregation function of each metric. If your materialization is view, a few databases provide a way to inspect the view, but if it's table or incremental, all the metrics become regular columns and even the database doesn't know whether they're metrics or not.

muskirac commented on May 30, 2024

Hey @buremba,

To me, the metrics should be part of the metadata and YML files, so that analysts who don't want to break the data flows can easily inject and manage their metrics. In this respect I like how cube.js defines complex metrics within "sql" meta (https://cube.dev/docs/direction-of-joins#transitive-join-pitfall). Maybe metriql can introduce a new dbt macro that generates/propagates metric SQL defined in YAML into model SQL files, for example in the select section of a model, so that all auto-generated metrics are strictly managed in YAML files:

select
  .... some source to propagated expression,
  {{ metriql_add_metrics(model_name, metric_group, materialize='metriql_only', put_comma_before=False, ... ) }}
  ...
from ... some_source ...

Moreover, analysts probably wouldn't want to define metrics only on the sources but also on the model outputs. This will require the introduction of model YAML definitions. Maybe metriql should not place metrics in source YAML files, but introduce its own YAML directory to add metrics to sources or models, and developers can integrate such metrics into the data flow by calling metriql macros within their transformation queries.

dbrtly commented on May 30, 2024

I would like to put forward naming conventions for metrics and have a “generate_metriql_yaml” macro (like the dbt package codegen). For example, column names that start with:

- count_of_
- count_of_unique
- approx_count_of_unique
- total_
- average_
- maximum_
- minimum_

That would be a start.
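
A minimal Python sketch of how such a prefix convention could be turned into measure definitions. The aggregation names on the right-hand side and the `classify_columns` helper are placeholders invented for this example, not confirmed metriql aggregation types; a trailing underscore is also assumed on every prefix.

```python
# Hypothetical prefix -> aggregation mapping based on the proposed prefixes.
# Longer prefixes come first so "count_of_unique_" wins over "count_of_".
PREFIX_TO_AGGREGATION = [
    ("approx_count_of_unique_", "approximate_unique"),
    ("count_of_unique_", "count_unique"),
    ("count_of_", "count"),
    ("total_", "sum"),
    ("average_", "average"),
    ("maximum_", "maximum"),
    ("minimum_", "minimum"),
]


def classify_columns(columns):
    """Map each column that follows the convention to an aggregation name."""
    measures = {}
    for column in columns:
        for prefix, aggregation in PREFIX_TO_AGGREGATION:
            if column.startswith(prefix):
                measures[column] = aggregation
                break  # first (longest) matching prefix wins
    return measures


measures = classify_columns(
    [
        "country",
        "count_of_unique_users",
        "total_amount",
        "approx_count_of_unique_sessions",
    ]
)
```

Columns that match no prefix, such as `country`, are left out of the result, so they would stay ordinary dimensions. A generator macro in the style of codegen could then write the resulting mapping into a YAML file for review rather than applying it implicitly.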

buremba commented on May 30, 2024

Hey @muskirac, thanks for the insight. Indeed, metriql lets you create datasets from all your dbt resources including models, sources, and seeds. We also support SQL relations and generate JOIN relations if you query a field using the dot notation, just like Cube.js.

metriql generates all the SQL queries at runtime and all the consumers connect to their data warehouse through our JDBC, REST API, etc., so I'm not sure if generating dbt models from metriql has a potential use case with the metriql_add_metrics macro. How do you plan to use the generated dbt model?

@dbrtly, clever idea! I agree with both of you: this approach seems to be a "black box" for the end users and could be confusing, since there will be multiple ways to define metrics. If people define the metrics using column names, it will be hard for them to switch to YML later on, as the logic will be separated into two different parts of the codebase.

For now, I'm in favor of generating a macro just like dbt's codegen package as it will be a native dbt approach.

dbrtly commented on May 30, 2024

A less verbose syntax, e.g. a macro sum_when_boolean:

select
blah_blah,
{{ metrics.sum_when_boolean('my_value', 'bool_col') }}

from my_table
group by 1

which would compile to:

select
blah_blah,
sum(if(bool_col, my_value, null))
as total_my_value_when_my_boolean

from my_table
group by 1
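
To make the expansion concrete, here is a small Python stand-in for the string such a macro might produce. It is not a dbt macro and not part of metriql; the alias scheme (`total_<value>_when_<flag>`) is an assumption for this example, and `if(...)` is dialect-specific, so other warehouses would need a `case when` expression instead.

```python
def sum_when_boolean(value_col: str, bool_col: str) -> str:
    """Build the SQL expression a sum_when_boolean macro might expand to."""
    return (
        f"sum(if({bool_col}, {value_col}, null)) "
        f"as total_{value_col}_when_{bool_col}"
    )


expression = sum_when_boolean("my_value", "bool_col")
```

Because the alias is derived from the arguments, the naming convention and the aggregation logic stay in one place, which is the main appeal of the shorter syntax.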

dbrtly commented on May 30, 2024

Thinking further, is the dbt docs feature extensible? Can you do your own implementation of exposures and make it dance with the documentation website? Maybe:

metriql docs generate
metriql docs serve

buremba commented on May 30, 2024

@dbrtly, metriql CLI comes with a dashboard that lets you connect to BI tools and audit all the queries at the moment. We're planning to add a new page that visualizes all the data marts and lets you run queries in a simple interface for testing purposes. I believe that rather than introducing separate static documentation, it can be included in the dashboard. What do you think?

buremba commented on May 30, 2024

We support dbt's native metric definitions at the moment, so we don't need to introduce an alternative for serving docs, running tests, or defining the metrics themselves.
