Comments (5)
@kgorman Assuming your topic is in JSON format, try this:
ksql> REGISTER TOPIC t1 WITH (value_format = 'json', kafka_topic='defaultsink');
ksql> PRINT t1 SAMPLE 10;
This will print one row out of every 10 rows. Of course you can set the value to any integer you like; if the message rate is high, you can use 1000 or even more!
Note that the docs currently omit the above statements!
Let me know if you run into any issues.
from ksql.
I mean like, this isn't particularly elegant:
ksql> create stream kg (test varchar) with (value_format='delimited', kafka_topic='defaultsink');
Message
----------------
Stream created
ksql> select * from kg;
Exception in deserializing the delimited row: {"flight": "", "timestamp_verbose": "2017-08-29 18:33:28.406898", "msg_type": "8", "track": "", "timestamp": 1504049608, "altitude": "", "counter": 1057, "lon": "", "icao": "A365B7", "vr": "", "lat": "", "speed": ""}
Query terminated
;-)
from ksql.
@kgorman: From what I understand, your actual problem is figuring out how to format/parse the data in a STREAM's or TABLE's underlying Kafka topic, correct? That is, which properties would need to be set (and to which values) in order for e.g. CREATE STREAM to work properly?
In other words, it's not about being able to sample a topic -- it's about knowing how to write the properties part of e.g. CREATE STREAM correctly? Asking because sampling might be a direct use case (e.g. in situations where you want to work on a lower volume variant of the actual input data).
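For the record shown in the error output above, the topic appears to hold JSON rather than delimited text, so a declaration along these lines might be closer to what is needed. This is only a sketch: the stream name kg_json, the choice of columns, and their types are assumptions inferred from that single sample message, not something confirmed for the whole topic.

```sql
-- Sketch only: kg_json is a hypothetical stream name, and the column
-- types are guessed from the one sample record in the error message.
CREATE STREAM kg_json (icao varchar, msg_type varchar, counter bigint)
  WITH (value_format='json', kafka_topic='defaultsink');

-- Read back the declared fields to check that rows now deserialize.
SELECT icao, msg_type, counter FROM kg_json;
```

The idea is that value_format has to match how the messages were actually written to the topic; declaring 'delimited' for JSON payloads is what seems to trigger the deserialization exception.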
from ksql.
Good point(s) @miguno. Yes, correct on all counts. Because of the interactive nature of the shell, it's natural to experiment and make streams and tables to explore and visualize. I think @hjafarpour's solution is spot on! It works for this purpose. It would be good to update the docs/examples to include this type of information. Perhaps I am just missing it. If so, apologies.
from ksql.
Thanks for clarifying @kgorman! We'll take a look at how we could update the docs with this information.
from ksql.