dbohdan / sqawk Goto Github PK
View Code? Open in Web Editor NEWLike awk but with SQL and table joins
License: MIT License
Like awk but with SQL and table joins
License: MIT License
Read it line-by-line.
Line 32 in 6ffc59e
Hi,
I was testing the "summing up" example and noticed that sqawk
gives I/O error for more than 10K lines. I was wondering if this is related to hardware limitations or software limitations? Is it possible to override this limit?
Also, I was surprised that it gives I/O error since I'm not reading any data from a file or writing results to a file.
Here are the codes to reproduce:
10K lines
$ awk -v seed=$RANDOM 'BEGIN{srand(seed); split("ABCDEFGHIJKLMNOPQRSTUVWXYZ",chars,""); for (n=1;n<10000;n++){loc=int(rand()*26); num=int(rand()*10000); print chars[loc+1]" "num }}' | sqawk 'select a1, sum(a2) from a group by a1 order by a1'
A 1908044
B 2134362
C 1979623
D 1913669
E 1989040
F 1947681
G 1853303
...
100K lines
$ awk -v seed=$RANDOM 'BEGIN{srand(seed); split("ABCDEFGHIJKLMNOPQRSTUVWXYZ",chars,""); for (n=1;n<100000;n++){loc=int(rand()*26); num=int(rand()*10000); print chars[loc+1]" "num }}' | sqawk 'select a1, sum(a2) from a group by a1 order by a1'
error: disk I/O error
I have a table of data where one of the column has a special character #
# Done Priority Get Size Name
0: 100% Normal Yes 709.3 MB FILE_NAME
When I pipe it to sqawk
it threw unrecognized token: "#"
at me
sqawk -output json,indent=1 \
'select #,Done,Priority,Get,Size,Name from a' \
trim=left \
fields=1,2,3,4,5-end \
header=1
If I replace #
to ID
, then it is working perfectly.
I have tried to escape #
with the following but none works
'select \#,Done,Priority,Get,Size,Name from a'
"select \#,Done,Priority,Get,Size,Name from a"
Hello,
What do you expect with this ls -l | sqawk -output table 'select a1 from a' trim=left
?
I would expect this to only select the first column. Instead, all rows are printed:
┌──────────┬──┬────┬────┬────┬───┬──┬─────┬─────────────────────┐
│ total │23│ │ │ │ │ │ │ │
├──────────┼──┼────┼────┼────┼───┼──┼─────┼─────────────────────┤
│drwxr-xr-x│ 7│sean│sean│ 7 │Aug│31│01:59│ bin │
├──────────┼──┼────┼────┼────┼───┼──┼─────┼─────────────────────┤
│-rw-r--r--│ 1│sean│sean│ 419│Aug│31│03:45│exp_1441043133748.csv│
├──────────┼──┼────┼────┼────┼───┼──┼─────┼─────────────────────┤
│drwxr-xr-x│ 5│root│sean│ 5 │Aug│21│03:04│ perl5 │
├──────────┼──┼────┼────┼────┼───┼──┼─────┼─────────────────────┤
│-rw-r--r--│ 1│sean│sean│3892│Aug│28│03:14│ script.sh │
├──────────┼──┼────┼────┼────┼───┼──┼─────┼─────────────────────┤
│-rw-r--r--│ 1│sean│sean│ 60 │Aug│20│07:54│ tmux-client-4279.log│
├──────────┼──┼────┼────┼────┼───┼──┼─────┼─────────────────────┤
│-rw-r--r--│ 1│sean│sean│9570│Aug│20│07:54│ tmux-server-4281.log│
└──────────┴──┴────┴────┴────┴───┴──┴─────┴─────────────────────┘
From you example, 'ps | sqawk -output table 'select a1 from a' trim=left' this also prints full data.
Hello,
FreeBSD calls tclsh tclsh8.6
, so when attempting to install your program, it can 't find tclsh. What file do I need to amend to have it find tclsh8.6?
Thanks!
One Chinese character occupy two English character space, example:
**
1234
so the tabulate has problem with Chinese character
Hello,
% cat test.csv
numbers,letters
001,a
002,b
003,c
004,d
5,e
6,f
7,g
8,i
9,j
10,j
When doing cat test.csv| sqawk -output table 'select a1,a2 from a' trim=left format=csv2
, 001-004 turn into just 1,2,3,4. Actually all output options drop the 0 for 001-004.
If a trim option and format are not specified, then 001-004 are printed but its on one table.
Is there some work around to support numbers with 0 in front of numbers? If I add a decimal point in front of 001 and 002, sqawk will support this:
% cat test.csv| sqawk -output json 'select a1,a2 from a' format=csvalt trim=none
[{"a1":"numbers","a2":"letters"},{"a1":"0.001","a2":"a"},{"a1":"0.002","a2":"b"},{"a1":"3","a2":"c"},{"a1":"4","a2":"d"},{"a1":"5","a2":"e"},{"a1":"6","a2":"f"},{"a1":"7","a2":"g"},{"a1":"8","a2":"i"},{"a1":"9","a2":"j"},{"a1":"10","a2":"j"}]
I did json above because it's far easier to format that in the comment.
The ps | jimsh ./tabulate.tcl example on the tcler's wiki is kind of a red herring--in real life the 4th column is multiple words and the result is gibberish. I'm not sure how to make a patch so I've included the updated ::tabulate::main procedure to handle that new switch. This makes ps | tclsh ./tabulate.tcl -numcolumns 4 look good.
# Read the input, process the command line options and output the result.
proc ::tabulate::main {argv0 argv} {
options::process $argv \
store -FS in FS default {} \
store -style in style default default \
store -alignments or -align in alignments default {} \
store -margins in margins default 0 \
store -numcolumns in numcolumns default 0
set data [split [string trim [read stdin]] \n]
# Input field separator. If none is given treat each line of input as a Tcl
# list.
if {$FS ne {}} {
set updateData {}
foreach line $data {
lappend updateData [split $line $FS]
}
set data $updateData
}
if {$numcolumns > 0} {
if {$FS eq {}} { set FS " "}
set updateData {}
foreach line $data {
set last [join [lrange $line $numcolumns-1 end] $FS]
lappend updateData [lreplace $line $numcolumns-1 end $last]
}
set data $updateData
}
# Accept style names rather than style *values* that ::tabulate::tabulate
# normally takes.
set styleName [::tabulate::dict-get-default $argv default -style]
puts [tabulate -data $data \
-style [::tabulate::style::by-name $style] \
-alignments $alignments \
-margins $margins]
}
Note that Sqawk is meant to be used as a command line program. The ::sqawk
Tcl API will probably never be stable.
Hello,
Updated sqawk this morning, but the tests are failing--all of them:
https://gist.github.com/jungle-boogie/ac7b7d410375466120b8
I have a command:
❯ ./table.sh
foo bar
1 a
2 b
I pipe it to swawk and get the output that I want if I name the fields explicitly, like so:
❯ ./table.sh | sqawk -output json 'select foo, bar from a where bar = "b"' header=1
[{"foo":"2","bar":"b"}]
But if I select *
I get many other fields too:
❯ ./table.sh | sqawk -output json 'select * from a where bar = "b"' header=1
[{"anr":"2","anf":"2","a0":"2\tb","foo":"2","bar":"b","a3":"","a4":"","a5":"","a6":"","a7":"","a8":"","a9":"","a10":""}]
As a workaround I can delete those fields explicitly:
❯ ./table.sh | sqawk -output json 'select * from a where bar = "b"' header=1 | jq 'map(del(.a0, .a1, .a2, .a3, .a4, .a5, .a6, .a7, .a8, .a9, .a10, .anr, .anf))'
[
{
"foo": "2",
"bar": "b"
}
]
But I think it would be cleaner if I could tell sqawk to not include them in the first place. Is there such a way? If not, can there be?
while executing
"package require cmdline"
(file "/usr/local/bin/sqawk" line 6)
-bash-4.2# sqawk
can't find package cmdline
while executing
"package require cmdline"
(file "/usr/local/bin/sqawk" line 6)
This would reduce memory usage.
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.