pinterest / querybook
Querybook is a Big Data Querying UI, combining collocated table metadata and a simple notebook interface.
Home Page: https://www.querybook.org
License: Apache License 2.0
Add the ability to
Add a table warning system in DataHub where users can attach their own warning messages to a table. The linter will show the warning message while the user is writing a query.
Boost tables based on boost score
After favoriting a DataDoc, it does not show up on refresh
This is useful when there is a large amount of series
Expand the dropdown so that it can transfer ownership in addition to granting edit/read permission. The previous owner should keep write permission afterwards.
When a DataDoc contains xxx.yyy, searching for yyy in DataDoc search returns nothing, and users have to search for the full xxx.yyy to find the result. The strategy is to provide multiple analyzers so that code and rich text are analyzed differently.
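To illustrate what a code-aware analyzer could emit for the case above, here is a minimal sketch. The function name `codeTokens` is hypothetical, and this is plain JavaScript rather than an Elasticsearch analyzer definition; it only shows the intended token output, where a dotted name is indexed both whole and by its parts.

```javascript
// Hypothetical helper (not part of Querybook): tokenize query text so that
// a dotted name such as xxx.yyy yields "xxx.yyy", "xxx" and "yyy".
function codeTokens(text) {
  const tokens = [];
  // Split on anything that is not a word character or a dot.
  for (const word of text.toLowerCase().split(/[^\w.]+/)) {
    // Trim stray leading/trailing dots, e.g. from "table." at end of a sentence.
    const name = word.replace(/^\.+|\.+$/g, '');
    if (!name) continue;
    tokens.push(name);
    if (name.includes('.')) {
      // Also emit each dot-separated part so a search for "yyy" can match.
      tokens.push(...name.split('.').filter(Boolean));
    }
  }
  return tokens;
}
```

With tokens like these, a search for yyy would match a cell that only mentions xxx.yyy, while rich text could keep using a standard natural-language analyzer.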
Make sidebar search and table search use similar UI to control what parameters can be filtered such as:
The default schema name is currently assumed to be 'default', which does not hold in all cases, since it can be overridden in the connection string. The default also differs between query engines; for example, sqlite's default is 'main' rather than 'default'.
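One way to picture an engine-aware default is a small lookup with an explicit override. This is only a sketch: the names `ENGINE_DEFAULT_SCHEMA` and `defaultSchemaFor` are hypothetical, and the only engine-specific value taken from the text above is sqlite's 'main'.

```javascript
// Hypothetical lookup: per-engine default schema names.
// Only the sqlite entry comes from the issue text; others would be added as needed.
const ENGINE_DEFAULT_SCHEMA = new Map([['sqlite', 'main']]);

// configuredSchema models a schema set explicitly (for example via the
// connection string); when present it wins over any engine default.
function defaultSchemaFor(engineLanguage, configuredSchema) {
  if (configuredSchema) return configuredSchema;
  return ENGINE_DEFAULT_SCHEMA.get(engineLanguage) || 'default';
}
```

For example, `defaultSchemaFor('sqlite')` gives 'main', an engine without an entry falls back to 'default', and an explicitly configured schema is returned unchanged.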
Acceptance
Add field selection to row samples. By default, all columns are selected.
Users can export the raw query
Users can copy the result to clipboard as tsv
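The TSV copy could be sketched as below. The function names are hypothetical, the result is assumed to be an array of rows (each an array of cells), and replacing tabs and newlines inside a cell with a space is an assumption made so that each row stays on one line.

```javascript
// Hypothetical helper: render one cell for TSV output.
function cellToTsv(value) {
  if (value === null || value === undefined) return '';
  // Assumption: embedded tabs/newlines are collapsed to a single space.
  return String(value).replace(/[\t\r\n]+/g, ' ');
}

// Hypothetical helper: rows is an array of arrays, header row included.
function resultToTsv(rows) {
  return rows.map((row) => row.map(cellToTsv).join('\t')).join('\n');
}
```

In a browser, the resulting string could then be handed to `navigator.clipboard.writeText`.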
This is caused by the url change without considering the current state
Currently, all private DataDocs are not indexed on Elasticsearch for simplification of logic. Since most of the DataDocs will be private by default with FGAC, it is essential to make them searchable from Elasticsearch. The new Elasticsearch table for DataDocs should include 2 more fields: public and readable_user_ids. The second field readable_user_ids should include every user who can access this private DataDoc.
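A rough sketch of how the two new fields might be populated and used in a search filter follows. Only `public` and `readable_user_ids` come from the description above; the function names, the `ownerId` property, and the other document fields are assumptions for illustration.

```javascript
// Hypothetical: build the document that would be indexed for one DataDoc.
// readerIds is assumed to list every user granted access to a private doc.
function toSearchDocument(doc, readerIds) {
  return {
    id: doc.id,
    title: doc.title,
    public: doc.public,
    // Public docs need no per-user list; private docs list owner plus readers.
    readable_user_ids: doc.public
      ? []
      : [...new Set([doc.ownerId, ...readerIds])],
  };
}

// Hypothetical: Elasticsearch bool filter matching docs that are public
// or explicitly readable by the given user.
function visibleToUserFilter(userId) {
  return {
    bool: {
      should: [
        { term: { public: true } },
        { term: { readable_user_ids: userId } },
      ],
      minimum_should_match: 1,
    },
  };
}
```

The filter would be combined with the user's text query, so private DataDocs only appear for users listed in readable_user_ids.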
Story
As a user I want to export multiple query results externally
Acceptance
Story
As a user, I want to view only the query examples for a certain query engine
Assumption
Acceptance
Currently picking a table chart does not let the user choose a title. It makes it hard to differentiate between a table chart and a query execution
Create a Notifier plugin model that lets different orgs add new notification services such as MS Teams. The Notifier will handle sending query completion messages as well as doc permission change messages to DataHub users.
Story
As an admin, I want to add the same query engine to different environments without worrying about duplicating the config.
As an admin, I want to be able to order the query engines in the dropdown so that I can present them to users in a different order.
Assumption
Acceptance
Problem:
The sql-lexer assumes that anything that is a VARIABLE type following a FROM statement is a table and breaks the suggestions.
Root cause:
Presto allows a FROM clause in front of things other than table names
The types supported by the extract function vary depending on the field to be extracted. Most fields support all date and time types.
extract(field FROM x) → bigint
Returns field from x.
Code where this fails:
while (!stream.eol()) {
    // here the match fails, and because nothing gets consumed it goes off in an infinite loop if the match is handled
    // Maybe the right thing to do is, if there's no match, break out of the stream matching?
    const match = stream.match(/^([_\w\d]+|`.*`)\.?/, true);
    // this fails and kicks you out of the loop, but then the suggestions stop working
    if (match[1]) {
        let part = match[1];
        if (part.charAt(0) === '`') {
            // remove first and last char
            part = part.slice(1, -1);
        }
        parts.push(part);
    }
}
short snippet of what caused this:
SELECT *
FROM table_2
JOIN table_1
ON table_1.field_1 = table_2.field_2
AND extract(YEAR FROM field_1_date) = table_2.field_year
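The guard suggested in the code comment above (stop when nothing matches) might look like the sketch below. `MiniStream` is a hypothetical stand-in that models only the `eol()` and `match()` calls used by the loop, and `readTableParts` is an illustrative name. This only addresses loop termination and the null match; it does not by itself stop the lexer from treating the argument after FROM inside extract() as a table.

```javascript
// Hypothetical stand-in for the editor's string stream; only eol() and
// match(pattern, consume) are modeled.
class MiniStream {
  constructor(text) {
    this.text = text;
    this.pos = 0;
  }
  eol() {
    return this.pos >= this.text.length;
  }
  match(pattern, consume) {
    const m = this.text.slice(this.pos).match(pattern);
    if (m && consume) this.pos += m[0].length;
    return m; // null when the pattern does not match at the current position
  }
}

// Collect dot-separated name parts, stopping as soon as nothing matches.
function readTableParts(stream) {
  const parts = [];
  while (!stream.eol()) {
    const match = stream.match(/^([_\w\d]+|`.*`)\.?/, true);
    if (!match) break; // nothing was consumed, so looping again would never end
    let part = match[1];
    if (part.charAt(0) === '`') {
      part = part.slice(1, -1); // remove the surrounding backticks
    }
    parts.push(part);
  }
  return parts;
}
```

Callers would still need to decide what to suggest when the returned list is empty, so that suggestions keep working after an early exit.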
Story
As a user, it would be confusing when I go on DataHub and it does not tell me why I cannot see any environments.
As an admin, I want to give pointers to new users when they first visit DataHub.
Acceptance
This change will apply to the following views:
Fields such as partition, hive metastore information, and query users should all be hidden if there is no information to show
expected formatting:
DELETE JAR s3://test-bucket/hadoopusrs/prod/test-0.5-SNAPSHOT/test-0.5-SNAPSHOT.jar;
ADD JAR s3://test-bucket/hadoopusrs/bob/test-0.5-SNAPSHOT/test-0.5-SNAPSHOT.jar;
-> same
actual formatting:
DELETE JAR s3://test-bucket/hadoopusrs/prod/test-0.5-SNAPSHOT/test-0.5-SNAPSHOT.jar;
ADD JAR s3://test-bucket/hadoopusrs/bob/test-0.5-SNAPSHOT/test-0.5-SNAPSHOT.jar;
->
DELETE JAR s3: / / test - bucket / hadoopusrs / prod / test -0.5 - SNAPSHOT / test -0.5 - SNAPSHOT.jar;
ADD JAR s3://test-bucket/hadoopusrs/bob/test-0.5-SNAPSHOT/test-0.5-SNAPSHOT.jar;
When sorting the impression count in the impression table in DataDoc/DataTable view, it does not sort from largest to smallest or vice versa.
Change it from a table format to a row format, similar to the announcement admin UI
As a developer, I want to set up data source unit tests quickly with some example data in the database
Acceptance:
This would be useful if the user opens DataHub in multiple browser windows
Allow users to set which fields the cmd+k search covers
Here are some of the potential fields:
There is a user setting for editor text size. It would be nice to have a similar setting for the text size of the query results.
We can also reuse this setting for the query results size
As a user, I want to see who the frequent users of a table are so I can ask them questions.
Assumption:
Use query samples to obtain info about the common query runners
Acceptance:
Together with #202, they should help with the experience of exporting
Story
As a user, I want to export my entire DataHub query results without worrying about the preview size
Acceptance
We can support Snowflake easily through the snowflake-sqlalchemy integration
https://docs.snowflake.com/en/user-guide/sqlalchemy.html
https://pypi.org/project/snowflake-sqlalchemy/
The request does not return when you add a filter for start date or end date.
Things to check:
Story
As a user, I want to use VS Code to develop DataHub with minimal effort
Acceptance
To repro:
Currently most columns are using varchar or mediumtext