Problem statement
The Row trait allows serializing a Rust struct to and from ClickHouse, e.g.
#[derive(klickhouse::Row)]
struct User {
    age: u8,
    name: String,
}
let users: Vec<User> = ch.query_collect("SELECT age, name FROM users").await?;
Suppose we now want to retrieve both the user details and the account balance in the same query. We would like to write something like:
let users: Vec<(u32, User)> = ch.query_collect("SELECT credits, (age, name) FROM ...").await?;
// or
#[derive(klickhouse::Row)]
struct Row {
    #[klickhouse(flatten)]
    user: User,
    credits: u32,
}
let users: Vec<Row> = ch.query_collect("SELECT age, name, credits FROM ...").await?;
i.e. composing with the existing implementation of User: Row.
Neither approach is currently supported:
- Row is not automatically implemented for tuples of types implementing Row. Such an implementation would allow using (UnitValue<u32>, User) as the query return type.
- The derive macro supports many serde attributes (default, rename, skip, ...), but not flatten.
Rather, one has to manually implement Row (subject to some issues described below).
Implementation possibilities
Row on composed types
For the first approach, one would implement Row on tuples of types implementing Row.
For (T1, ..., Tn), one would either:
- Flatten the columns (assuming they have distinct names), or
- Expect the ClickHouse output to be a single tuple with types compatible with T1, ..., Tn.
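To make the column-flattening option concrete, here is a minimal sketch for the two-element case. It does not use klickhouse itself: Value, the Row trait with its column_names helper, and Credits (a single-column row playing the role that UnitValue<u32> has in the example above) are simplified, hypothetical stand-ins, and the real crate's types and signatures differ.

```rust
// Hypothetical, simplified stand-ins; NOT the real klickhouse types.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    UInt8(u8),
    UInt32(u32),
    Text(String),
}

pub trait Row: Sized {
    /// Names of the columns this row type consumes (assumed helper).
    fn column_names() -> Vec<&'static str>;
    fn deserialize_row(map: Vec<(String, Value)>) -> Result<Self, String>;
}

#[derive(Debug, PartialEq)]
pub struct User {
    pub age: u8,
    pub name: String,
}

impl Row for User {
    fn column_names() -> Vec<&'static str> {
        vec!["age", "name"]
    }

    fn deserialize_row(map: Vec<(String, Value)>) -> Result<Self, String> {
        let (mut age, mut name) = (None, None);
        for (column, value) in map {
            match (column.as_str(), value) {
                ("age", Value::UInt8(v)) => age = Some(v),
                ("name", Value::Text(v)) => name = Some(v),
                (other, _) => return Err(format!("unexpected column or type: {}", other)),
            }
        }
        Ok(User {
            age: age.ok_or("missing column: age")?,
            name: name.ok_or("missing column: name")?,
        })
    }
}

/// Hypothetical single-column row holding the account balance.
#[derive(Debug, PartialEq)]
pub struct Credits(pub u32);

impl Row for Credits {
    fn column_names() -> Vec<&'static str> {
        vec!["credits"]
    }

    fn deserialize_row(map: Vec<(String, Value)>) -> Result<Self, String> {
        match map.as_slice() {
            [(name, Value::UInt32(v))] if name == "credits" => Ok(Credits(*v)),
            _ => Err("expected exactly one UInt32 column named credits".to_string()),
        }
    }
}

// Flattening: each tuple element receives the columns it declares.
// This relies on the two elements having distinct column names.
impl<A: Row, B: Row> Row for (A, B) {
    fn column_names() -> Vec<&'static str> {
        let mut names = A::column_names();
        names.extend(B::column_names());
        names
    }

    fn deserialize_row(map: Vec<(String, Value)>) -> Result<Self, String> {
        let left_names = A::column_names();
        let (left, right): (Vec<_>, Vec<_>) = map
            .into_iter()
            .partition(|(name, _)| left_names.iter().any(|n| *n == name.as_str()));
        Ok((A::deserialize_row(left)?, B::deserialize_row(right)?))
    }
}
```

Larger tuples would presumably be generated by a macro; the second option (a single ClickHouse tuple column) would instead unpack one nested value and is not shown here.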
flatten support
For serialization and deserialization, one would:
- Proceed as usual for regular fields.
- For flattened fields, call the {serialize, deserialize}_row of the subfield, setting (resp. retrieving) the respective fields.
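The two steps above can be sketched as a hand-written equivalent of what the derive macro might generate for a flattened field. This is only an illustration under simplifying assumptions: Value, the Row trait (including column_names and the argument-free serialize_row) and AccountRow are hypothetical stand-ins, not the crate's actual API.

```rust
// Hypothetical, simplified stand-ins; NOT the real klickhouse types.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    UInt8(u8),
    UInt32(u32),
    Text(String),
}

pub trait Row: Sized {
    fn column_names() -> Vec<&'static str>;
    fn serialize_row(self) -> Vec<(&'static str, Value)>;
    fn deserialize_row(map: Vec<(String, Value)>) -> Result<Self, String>;
}

#[derive(Debug, PartialEq)]
pub struct User {
    pub age: u8,
    pub name: String,
}

impl Row for User {
    fn column_names() -> Vec<&'static str> {
        vec!["age", "name"]
    }

    fn serialize_row(self) -> Vec<(&'static str, Value)> {
        vec![("age", Value::UInt8(self.age)), ("name", Value::Text(self.name))]
    }

    fn deserialize_row(map: Vec<(String, Value)>) -> Result<Self, String> {
        let (mut age, mut name) = (None, None);
        for (column, value) in map {
            match (column.as_str(), value) {
                ("age", Value::UInt8(v)) => age = Some(v),
                ("name", Value::Text(v)) => name = Some(v),
                (other, _) => return Err(format!("unexpected column or type: {}", other)),
            }
        }
        Ok(User {
            age: age.ok_or("missing column: age")?,
            name: name.ok_or("missing column: name")?,
        })
    }
}

/// Equivalent of a struct with `#[klickhouse(flatten)] user: User`.
#[derive(Debug, PartialEq)]
pub struct AccountRow {
    pub user: User,
    pub credits: u32,
}

impl Row for AccountRow {
    fn column_names() -> Vec<&'static str> {
        let mut names = User::column_names();
        names.push("credits");
        names
    }

    fn serialize_row(self) -> Vec<(&'static str, Value)> {
        // Flattened field: splice the subfield's columns into our own.
        let mut out = self.user.serialize_row();
        // Regular field: serialized as usual.
        out.push(("credits", Value::UInt32(self.credits)));
        out
    }

    fn deserialize_row(map: Vec<(String, Value)>) -> Result<Self, String> {
        let nested_names = User::column_names();
        let mut nested = Vec::new();
        let mut credits = None;
        for (column, value) in map {
            if nested_names.iter().any(|n| *n == column.as_str()) {
                // Column belongs to the flattened subfield; forward it.
                nested.push((column, value));
            } else {
                match (column.as_str(), value) {
                    ("credits", Value::UInt32(v)) => credits = Some(v),
                    (other, _) => return Err(format!("unexpected column or type: {}", other)),
                }
            }
        }
        Ok(AccountRow {
            user: User::deserialize_row(nested)?,
            credits: credits.ok_or("missing column: credits")?,
        })
    }
}
```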
One minor annoyance is the signature
fn deserialize_row(map: Vec<(&str, &Type, Value)>) -> Result<Self>;
which does not allow efficiently retrieving the fields to pass to the recursive deserialize_row calls (i.e. one needs to linearly search for them and swap out the Value). This problem would disappear if the signature were
fn deserialize_row(map: IndexMap<String, (Type, Value)>) -> Result<Self>;
which is actually a pretty simple change in the code, but a breaking API change. The inefficiency probably does not matter with a small number of fields.
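The difference between the two signatures can be illustrated with a small sketch. To stay free of external crates, it uses std's HashMap as a stand-in for IndexMap (IndexMap additionally preserves column order), and Value, take_from_vec and take_from_map are hypothetical names introduced only for this example.

```rust
use std::collections::HashMap;

// Hypothetical, simplified stand-in for the crate's value type.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    UInt8(u8),
    UInt32(u32),
    Text(String),
}

/// Vec-based lookup: a linear scan to find the column, then the entry
/// is swapped with the last element and removed (order is not kept).
pub fn take_from_vec(map: &mut Vec<(&str, Value)>, name: &str) -> Option<Value> {
    let index = map.iter().position(|(n, _)| *n == name)?;
    Some(map.swap_remove(index).1)
}

/// Map-based lookup: a single keyed removal, no scan over the columns.
pub fn take_from_map(map: &mut HashMap<String, Value>, name: &str) -> Option<Value> {
    map.remove(name)
}
```

With only a handful of columns per row, the linear scan is likely negligible, which matches the remark above that the inefficiency probably does not matter in practice.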
Subfield addressing
Another approach is to allow deriving
#[derive(klickhouse::Row)]
struct Row {
    user: User,
    credits: u32,
}
where the User fields are serialized and deserialized with a user prefix, e.g.
SELECT name AS "user.name", age AS "user.age", credits FROM ...
This could happen either implicitly or via another attribute on the user: User field.
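As a rough sketch of the deserialization side of this idea, the columns carrying the field prefix could be separated out and stripped of the prefix before being handed to the subfield's deserialize_row. Value and split_prefixed are hypothetical names for illustration only; nothing here is existing klickhouse API.

```rust
// Hypothetical, simplified stand-in for the crate's value type.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    UInt8(u8),
    UInt32(u32),
    Text(String),
}

/// Splits a row into the columns addressed to `field` (with the
/// "field." prefix stripped) and the remaining columns.
pub fn split_prefixed(
    map: Vec<(String, Value)>,
    field: &str,
) -> (Vec<(String, Value)>, Vec<(String, Value)>) {
    let prefix = format!("{}.", field);
    let mut nested = Vec::new();
    let mut rest = Vec::new();
    for (column, value) in map {
        match column.strip_prefix(prefix.as_str()) {
            // "user.name" becomes "name" for the nested row.
            Some(stripped) => nested.push((stripped.to_string(), value)),
            // Columns without the prefix stay with the outer row.
            None => rest.push((column, value)),
        }
    }
    (nested, rest)
}
```

Serialization would do the reverse and prepend the prefix to each column name produced by the subfield.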
Comments?
I have a working draft for the flatten approach, and I might have a go at the tuple approach too.
Before cleaning this up and submitting a PR, I wanted to ask whether you have any comments or thoughts on this.
I only recently started using ClickHouse and this (very nice) crate, so I might be missing some subtleties.