jhspetersson / fselect
Find files with SQL-like queries
Home Page: https://fselect.rocks
License: Apache License 2.0
Relates to Homebrew/homebrew-core#53845.
Homebrew has a brew audit check against the GitHub release tags, and it now complains that the latest release tag is marked as a prerelease.
I wonder if you can update the tag to the latest release to unblock the brew audit check.
Thanks!
For example,
>fselect path,name FROM . where name="*.py\-e" | sed -e 's/\(.*\)/|\1|/'
|./test/__init__.py-e __init__.py-e |
|./test/ldapsync.py-e ldapsync.py-e |
|./test/00-import.py-e 00-import.py-e |
|./test/tools.py-e tools.py-e |
I understand that tabs are the default format, but it makes little sense to print the trailing tab.
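A minimal sketch of one way to avoid the trailing separator, assuming each result row is available as a list of column strings. The helper name format_row is hypothetical and is not taken from fselect's source; it only illustrates joining columns so that tabs appear between values and never after the last one.

```rust
// Hypothetical helper, not fselect's actual code: formats one result row
// so that a tab is written only between columns, never after the last one.
fn format_row(columns: &[String]) -> String {
    columns.join("\t")
}

fn main() {
    let row = vec![
        "./test/tools.py-e".to_string(),
        "tools.py-e".to_string(),
    ];
    // The surrounding bars make a stray trailing tab easy to spot.
    println!("|{}|", format_row(&row));
}
```

With this approach the sed check from the report would show the closing bar directly after the last column.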
Neither does fselect --help or fselect help.
Please add a license to your package so that I can figure out the rules of usage and be able to build a package.
Hej! I have some special character issues when using PowerShell on Windows 10:
PS C:\Users\Sebastian> fselect
FSelect utility �[33m0.6.1�[0m
Find files with SQL-like queries.
�[4;36mhttps://github.com/jhspetersson/fselect�[0m
This happens all over the place, in help messages, error messages, results, ...
For dotfiles on unix, and perhaps hidden files on NTFS/Windows?
I often need to fselect paths by age. Could fselect support some duration value, like 1 hours ago?
find can do this, for example:
find . -mmin 60 #exactly 60 minutes old
I am doing a quick test of fselect on Windows 10 on the top-level of an Angular project.
I started with the simple request adapted from readme:
size, path from . where name = *.ts or name = *.css or name = *.html
but it started to collect all files in node_modules.
So I changed the request to:
size, path from . gitignore where name = *.ts or name = *.css or name = *.html
but I get the same.
1087 .\node_modules\ignore\index.d.ts
2861 .\node_modules\ipaddr.js\lib\ipaddr.js.d.ts
110 .\node_modules\is-plain-object\index.d.ts
97 .\node_modules\isobject\index.d.ts
and so on.
In .gitignore, we have the classical paths:
/build/
/bin/
/node_modules
My suspicion is that you check against slashes but get backslashes out of the filesystem API calls.
If I change the request to:
size, path from ./src where name = *.ts or name = *.css or name = *.html
I get a listing of:
30 ./src\shared\file\index.ts
2750 ./src\shared\file\item.model.ts
60 ./src\shared\main\index.ts
1297 ./src\shared\main\modal.service.ts
66 ./src\shared\sharing\index.ts
with mixed slashes and backslashes, which isn't very nice... :-)
Since forward slashes are well understood overall in Windows, I suggest to normalize on them.
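The normalization suggested above could look roughly like the following sketch. The function normalize_separators is a hypothetical name, not part of fselect; it simply rewrites backslashes to forward slashes so that paths print consistently and can be compared against slash-based .gitignore patterns.

```rust
// Hypothetical helper, not fselect's actual code: rewrites Windows-style
// separators to forward slashes before printing or pattern matching.
fn normalize_separators(path: &str) -> String {
    path.replace('\\', "/")
}

fn main() {
    println!("{}", normalize_separators("./src\\shared\\file\\index.ts"));
    println!("{}", normalize_separators(".\\node_modules\\ignore\\index.d.ts"));
}
```

Since a backslash is a legal file name character on Unix, a rewrite like this would presumably need to be limited to Windows builds.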
Thanks for this very interesting and flexible utility.
Windows 10:
I just tried to select some files from my temp folder which in my case is a RAM drive mounted as drive T:. fselect always responds with the error:
T:: could not canonicalize path
This is my command line:
fselect abspath from T: where name = 'hb*'
I can add a backslash to "T:", but the error message is the same. The temp drive is created by the tool ImDisk-Toolkit, but it should not matter what type of drive is being searched, right?
version: 0.3.2
query: fselect from ~/src where name like "%rust%"
result: newline spam
expected result: no output or warning about not having any selectors specified
Would be nice to have built-in support for counting the resultset.
fselect 'count(*) from node_modules where name = package.json'
gets the total amount of installed packages in a JS project, for example.
Just a random thought: in true Unix tradition, it would be nice to be able to do things like
$ find <whatever> | fselect max(size)
This way fselect would be more composable with other tools. E.g., you could do things like fd | fselect | fzf (with fd being lightning-fast when going through massive numbers of files, so you could narrow it down before filtering it with more detail). In theory, you could also pipe fselect into fselect, why not? :)
Syntax-wise, maybe the whole from clause could then be omitted, and if there's anything being piped in, it would filter those results instead of searching in the current folder.
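A rough sketch of the piping idea, assuming one path per input line. Nothing here is fselect's actual behavior: read_paths is a hypothetical helper, and the fallback message stands in for the normal from-clause search. The terminal check uses the standard library's IsTerminal trait (Rust 1.70 or later).

```rust
use std::io::{self, BufRead, IsTerminal};
use std::path::PathBuf;

// Hypothetical helper, not fselect's actual code: collects one path per
// non-empty input line. Reading stops at the first line that fails to decode.
fn read_paths<R: BufRead>(reader: R) -> Vec<PathBuf> {
    reader
        .lines()
        .map_while(Result::ok)
        .filter(|line| !line.is_empty())
        .map(PathBuf::from)
        .collect()
}

fn main() {
    let stdin = io::stdin();
    if stdin.is_terminal() {
        // Nothing is piped in: a real tool would fall back to the from clause.
        println!("no piped input; would search the current folder");
    } else {
        // Something like `fd | this-program` ends up here.
        for path in read_paths(stdin.lock()) {
            println!("{}", path.display());
        }
    }
}
```

The collected paths could then be fed to the same filtering logic that a directory walk would use.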
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 63, kind: Other, message: "File name too long" }', libcore/result.rs:945:5
note: Run with `RUST_BACKTRACE=1` for a backtrace.
I think that such a good tool needs a good manpage (or info). I will also try to write it, if I can recall the syntax.
Just like some other file-searching related tools do (fd, ripgrep, exa - to mention some), fselect could use colors too. Since fselect doesn't support command line options, there might be a simple rule: colorize the output if it prints to a terminal, but disable colors if the output is redirected.
Personally, I'd like to see paths colorized according to the LS_COLORS variable, that would help to visually distinguish the files by types.
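The "colorize only on a terminal" rule can be sketched with the standard library's IsTerminal trait (Rust 1.70 or later). The paint helper and the hard-coded color code 36 are illustrative assumptions; this sketch does not read LS_COLORS and is not how fselect is implemented.

```rust
use std::io::{self, IsTerminal};

// Hypothetical helper, not fselect's actual code: wraps text in an ANSI
// color sequence only when the caller asks for color.
fn paint(text: &str, ansi_code: &str, use_color: bool) -> String {
    if use_color {
        format!("\x1b[{}m{}\x1b[0m", ansi_code, text)
    } else {
        text.to_string()
    }
}

fn main() {
    // Color is enabled only when stdout is attached to a terminal,
    // so redirected or piped output stays free of escape sequences.
    let use_color = io::stdout().is_terminal();
    println!("{}", paint("./src/main.rs", "36", use_color));
}
```

Piping the program into another command or a file should then produce plain text.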
Thank you for a great tool! 🔧
fselect panics when opening stdout when a newer terminfo/ncurses is present. It's an issue with the term crate, due to a change in ncurses. The issue in term is now fixed, but fselect will need to be updated to use the new version to fix the issue.
See the term issue for more info: Stebalien/term#81 (comment)
Thanks again for building a neat tool in Rust.
The SQL like syntax makes me think that we could implement fupdate or fdelete?
Hiya! I've been using this on my mac and it's simply brilliant! Any chance someone has built this for Ubuntu as well?
If fselect has a way to distinguish textual files from binary ones, it can be interesting to have an is_text search field (based on content, not on file extension).
And in this case, I would like to have something like:
where line_count gt 2000 or where line_length gt 160
which, among other things, can be useful for source code files...
Of course, fselect has to be agnostic for line end markers, even accepting a mix of CRLF, CR and LF...
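As an illustration of the line-ending requirement, here is a small sketch that counts lines and tracks the longest line while treating CRLF, CR and LF as equivalent terminators. The function line_stats is hypothetical, measures length in bytes rather than characters, and is not part of fselect.

```rust
// Hypothetical helper, not fselect's actual code: returns
// (line_count, longest_line_length_in_bytes) for a buffer that may mix
// CRLF, CR and LF line endings.
fn line_stats(content: &[u8]) -> (usize, usize) {
    let mut count: usize = 0;
    let mut longest: usize = 0;
    let mut current: usize = 0;
    let mut i: usize = 0;
    while i < content.len() {
        match content[i] {
            b'\r' | b'\n' => {
                // Treat CRLF as a single terminator by skipping the LF.
                if content[i] == b'\r' && content.get(i + 1) == Some(&b'\n') {
                    i += 1;
                }
                count += 1;
                longest = longest.max(current);
                current = 0;
            }
            _ => current += 1,
        }
        i += 1;
    }
    // A final line without a terminator still counts as a line.
    if current > 0 {
        count += 1;
        longest = longest.max(current);
    }
    (count, longest)
}

fn main() {
    let (lines, longest) = line_stats(b"one\r\ntwo\rthree\nfour");
    println!("lines = {}, longest = {}", lines, longest);
}
```

A query such as where line_count gt 2000 would then compare against the first value, and where line_length gt 160 against the second.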
I see / guess that some things are hard-coded, like the list of extensions for is_xxx search fields.
Perhaps you could support a .fselectrc file.
On Windows, I suggest the same place as .gitconfig and .npmrc (among others), at the root of the user profile (usually at C:\Users\<user name>).
Among other things, this would allow us to add our favorite language extensions to is_source, a specific archive extension to is_archive, and so on.
I don't know if it is easy to add (or even possible without bloating) but supporting a duration for audio and video files would be very useful, I think.
Maybe you can add some explanations for some obscure column names, particularly hsize and fsize (they look the same to me on Windows), (s)uid and (s)gid (Unix / Mac only?), and so on (I understand is_block or is_socket and some others are probably not supported on Windows).
Of course, one can assume that if they don't understand the name, they don't need it... 😄
How do you feel about having an attribute called is_source? It will match any files which can be classified as source files, e.g. .rs. I've already implemented it, but I didn't want to waste your time with a PR if you didn't think it was useful.
Hey thanks for making such a neat tool! While using it today I discovered that the SIGPIPE signal being passed back from CLI tools that only consume a certain amount of stdout results in a panic. Here's a minimal repro of the bug:
$ RUST_BACKTRACE=1 fselect path, size from /usr | head
/usr/sbin 12288
/usr/sbin/cupsaddsmb 14328
/usr/sbin/pppoe-discovery 22528
/usr/sbin/exim_lock 18424
/usr/sbin/update-passwd 31136
/usr/sbin/genccode 10608
/usr/sbin/netscsid 118688
/usr/sbin/split-logfile 2415
/usr/sbin/cache_dump 11
/usr/sbin/cpgr 4
thread 'main' panicked at 'failed printing to stdout: Broken pipe (os error 32)', src/libstd/io/stdio.rs:690:9
stack backtrace:
0: std::sys::unix::backtrace::tracing::imp::unwind_backtrace
at src/libstd/sys/unix/backtrace/tracing/gcc_s.rs:39
1: std::sys_common::backtrace::_print
at src/libstd/sys_common/backtrace.rs:70
2: std::panicking::default_hook::{{closure}}
at src/libstd/sys_common/backtrace.rs:58
at src/libstd/panicking.rs:200
3: std::panicking::default_hook
at src/libstd/panicking.rs:215
4: std::panicking::rust_panic_with_hook
at src/libstd/panicking.rs:478
5: std::panicking::continue_panic_fmt
at src/libstd/panicking.rs:385
6: std::panicking::begin_panic_fmt
at src/libstd/panicking.rs:340
7: std::io::stdio::_print
at src/libstd/io/stdio.rs:690
at src/libstd/io/stdio.rs:699
8: fselect::searcher::Searcher::check_file
9: fselect::searcher::Searcher::visit_dir
10: fselect::searcher::Searcher::visit_dir
11: fselect::searcher::Searcher::list_search_results
12: fselect::main
13: std::rt::lang_start::{{closure}}
14: std::panicking::try::do_call
at src/libstd/rt.rs:49
at src/libstd/panicking.rs:297
15: __rust_maybe_catch_panic
at src/libpanic_unwind/lib.rs:92
16: std::rt::lang_start_internal
at src/libstd/panicking.rs:276
at src/libstd/panic.rs:388
at src/libstd/rt.rs:48
17: main
18: __libc_start_main
19: _start
Of course, using fselect path, size from /usr limit 10 is a valid workaround for this use-case. I'm just using head for a quick example. Any similar command that uses SIGPIPE has produced equivalent results. Here's an example that just uses sleep:
$ RUST_BACKTRACE=1 fselect path | sleep 1
thread 'main' panicked at 'failed printing to stdout: Broken pipe (os error 32)', src/libstd/io/stdio.rs:690:9
stack backtrace:
0: std::sys::unix::backtrace::tracing::imp::unwind_backtrace
at src/libstd/sys/unix/backtrace/tracing/gcc_s.rs:39
1: std::sys_common::backtrace::_print
at src/libstd/sys_common/backtrace.rs:70
2: std::panicking::default_hook::{{closure}}
at src/libstd/sys_common/backtrace.rs:58
at src/libstd/panicking.rs:200
3: std::panicking::default_hook
at src/libstd/panicking.rs:215
4: std::panicking::rust_panic_with_hook
at src/libstd/panicking.rs:478
5: std::panicking::continue_panic_fmt
at src/libstd/panicking.rs:385
6: std::panicking::begin_panic_fmt
at src/libstd/panicking.rs:340
7: std::io::stdio::_print
at src/libstd/io/stdio.rs:690
at src/libstd/io/stdio.rs:699
8: fselect::searcher::Searcher::check_file
9: fselect::searcher::Searcher::visit_dir
10: fselect::searcher::Searcher::visit_dir
11: fselect::searcher::Searcher::visit_dir
12: fselect::searcher::Searcher::visit_dir
13: fselect::searcher::Searcher::visit_dir
14: fselect::searcher::Searcher::visit_dir
15: fselect::searcher::Searcher::list_search_results
16: fselect::main
17: std::rt::lang_start::{{closure}}
18: std::panicking::try::do_call
at src/libstd/rt.rs:49
at src/libstd/panicking.rs:297
19: __rust_maybe_catch_panic
at src/libpanic_unwind/lib.rs:92
20: std::rt::lang_start_internal
at src/libstd/panicking.rs:276
at src/libstd/panic.rs:388
at src/libstd/rt.rs:48
21: main
22: __libc_start_main
23: _start
I'm not a Rust programmer, but from skimming the documentation about std::io::stdio::_print and the source of the check_file function, I think it's not set up to handle SIGPIPE and terminate early. I'm running fselect version 0.6.3 installed via source using cargo.
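For context, println! panics when a write to stdout fails, whereas writeln! returns the error so the caller can decide what to do. The sketch below shows one possible way to stop quietly on a broken pipe; write_rows is a hypothetical helper and not the fix that was applied in fselect.

```rust
use std::io::{self, ErrorKind, Write};

// Hypothetical helper, not fselect's actual code: writes rows one by one
// and stops quietly once the reader has closed the pipe.
// Returns the number of rows that were written successfully.
fn write_rows<W: Write>(out: &mut W, rows: &[&str]) -> io::Result<usize> {
    let mut written = 0;
    for row in rows {
        match writeln!(out, "{}", row) {
            Ok(()) => written += 1,
            // The consumer (head, sleep, ...) went away: end without a panic.
            Err(e) if e.kind() == ErrorKind::BrokenPipe => return Ok(written),
            // Any other I/O error is still reported to the caller.
            Err(e) => return Err(e),
        }
    }
    Ok(written)
}

fn main() -> io::Result<()> {
    let stdout = io::stdout();
    let mut handle = stdout.lock();
    write_rows(&mut handle, &["/usr/sbin\t12288", "/usr/sbin/cpgr\t4"])?;
    Ok(())
}
```

A search loop built this way could also stop walking directories as soon as the broken pipe is detected, which matches the "terminate early" behavior described above.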
First time with fselect:
Actual
When I run fselect size, path from . gitignore, fselect does not display the files matched by .gitignore, but it does display the files in the .git directory.
Expected
When I run fselect size, path from . gitignore, fselect does not display the files matched by .gitignore, and it does not display the files in the .git directory either.
Adding .git to .gitignore is not usual.
Thanks (fselect is great)
Ami44
fselect name,modified from . where modified le '2019-04-18 18:50:00'
It lists a file whose modified time is 2019-04-18 18:51:06.
First of all, I like what it can do and was about to install it when I noticed that there doesn't seem to be a single test.
Something that came to my mind is to test it entirely on the CLI level to emulate how people would use it. Here is an example of journey level tests, which looks like this when run (it's all just bash).
Do you consider this as much of a maintenance problem as I do?
Please don't feel pressured or criticized, I really am interested to learn about the way this tool is/was developed.
How can I select the absolute path, when selecting from a relative path?
If it's not possible, could this be implemented?
e.g. fselect abspath from ../foo where ..
Does JSON support work? If there's no match, it prints [], but if there is a match (either using like or =) I get an error.
The same queries work fine with csv or lines.
Environment
10.13.6
0.4.3
LANG="en_GB.UTF-8"
RUST_BACKTRACE=1 fselect path from . where name like %.yaml and is_file = true into json
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Error("key must be a string", line: 0, column: 0)', libcore/result.rs:945:5
stack backtrace:
0: std::sys::unix::backtrace::tracing::imp::unwind_backtrace
1: std::sys_common::backtrace::print
2: std::panicking::default_hook::{{closure}}
3: std::panicking::default_hook
4: std::panicking::rust_panic_with_hook
5: std::panicking::continue_panic_fmt
6: rust_begin_unwind
7: core::panicking::panic_fmt
8: core::result::unwrap_failed
9: fselect::searcher::Searcher::check_file
10: fselect::searcher::Searcher::visit_dirs
11: fselect::searcher::Searcher::list_search_results
12: fselect::main
13: std::rt::lang_start::{{closure}}
14: std::panicking::try::do_call
15: __rust_maybe_catch_panic
16: std::rt::lang_start_internal
17: main
[%
Hello!
I've created an AUR package for fselect, here it is: https://aur.archlinux.org/packages/fselect/
Maybe it would make sense to add information about it to the readme.
Hej,
This may not be a bug but it caused me some confusion. When fselecting from a folder and using a where clause, the trailing backslash of the folder causes the where clause to not return any results.
It happens both when using a quoted string after fselect and when not using quotes.
This is not the case for forward slashes:
PS C:\Users\Sebastian> fselect "name from d:\e-books\ where name = *.mobi"
PS C:\Users\Sebastian> fselect "name from d:\e-books where name = *.mobi"
Kling, Marc-Uwe - 03 Die Kaenguru-Offenbarung.mobi
The inmates are running the asylum - Alan Cooper.mobi
The Martian - Andy Weir.mobi
If Hemingway Wrote Javascript - Angus Croll.mobi
[...]
PS C:\Users\Sebastian> fselect "name from d:/e-books/ where name = *.mobi"
Kling, Marc-Uwe - 03 Die Kaenguru-Offenbarung.mobi
The inmates are running the asylum - Alan Cooper.mobi
The Martian - Andy Weir.mobi
If Hemingway Wrote Javascript - Angus Croll.mobi
[...]
(newlines for readability)
It's especially annoying because tab-complete in PowerShell automatically adds the final backslash.
In the process of making changes to formulae in the Homebrew package manager, I noticed that fselect was one of a handful of Rust binary projects without a Cargo.lock file in version control. The Cargo book recommends the following (source):
If you’re building an end product, which are executable like command-line tool or an application, or a system library with crate-type of staticlib or cdylib, check Cargo.lock into git.
More information about the reasoning can be found in the "Why do binaries have Cargo.lock in version control, but not libraries?" section of the Cargo FAQ.
The Cargo.lock file helps package managers to keep builds reproducible, since cargo install simply uses the latest dependency versions unless the --locked flag is added to the command, in which case it will use the versions outlined in Cargo.lock. Without a Cargo.lock file, there's a chance that a dependency update will break the build sometime in the future, which is something I've already encountered with other Rust binary projects.
Would you please consider checking Cargo.lock into version control?
Hey guys,
I've been using fselect for a while now. It is a great tool and it helps me a lot. The only thing I miss is a "contains" statement which can be used to match files which contain certain text string. It would be great if you could add this function.
It's pretty short, so I'll leave the trace here:
% RUST_BACKTRACE=1 fselect "select path from . where path regexp \(NOT '^./code'\) and name regexp '\.conf$'
thread 'main' panicked at 'Incorrect regex expression', src/searcher.rs:1246:45
stack backtrace:
0: <unknown>
1: <unknown>
2: <unknown>
3: <unknown>
4: <unknown>
5: <unknown>
6: <unknown>
7: <unknown>
8: <unknown>
9: <unknown>
10: <unknown>
11: <unknown>
12: <unknown>
13: <unknown>
14: <unknown>
15: <unknown>
16: __libc_start_main
17: <unknown>
If there are no files in a directory, the AVG function fails with an error:
thread 'main' panicked at 'attempt to divide by zero', src/function.rs:382:20
note: Run with `RUST_BACKTRACE=1` environment variable to display a backtrace.
Also, the MIN function returns -1 in such a case, which looks confusing.
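One way to make the empty case explicit is to let the aggregates return an Option, so that "no rows" is distinct from any numeric result. This is only a sketch under that assumption; avg_size and min_size are hypothetical names and do not reflect how fselect's function.rs is written.

```rust
// Hypothetical aggregates, not fselect's actual code: both return None
// for an empty input instead of dividing by zero or using a sentinel like -1.
fn avg_size(sizes: &[u64]) -> Option<f64> {
    if sizes.is_empty() {
        return None;
    }
    let total: u64 = sizes.iter().sum();
    Some(total as f64 / sizes.len() as f64)
}

fn min_size(sizes: &[u64]) -> Option<u64> {
    // Iterator::min already yields None for an empty iterator.
    sizes.iter().copied().min()
}

fn main() {
    let empty: [u64; 0] = [];
    println!("{:?} {:?}", avg_size(&empty), min_size(&empty));
    println!("{:?} {:?}", avg_size(&[10, 20]), min_size(&[10, 20]));
}
```

The output layer could then decide how to render None, for example as an empty cell.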
In the from clause, add the possibility to ignore a folder or subfolder.
When using the gitignore option, it only reads the .gitignore file from the current directory. It is pretty common to have multiple tiers of gitignore files affecting different parts of a repo.
It would be great if fselect could use the repo's true gitignore list. You can get the list of ignored files and folders with git status --ignored --porcelain (git v1.7.6 and later). There are also some other options.
Looking at gitignore.rs, it looks like you are manually parsing the ignore file and basically implementing the logic, so shelling out to git to get the files would be a bit of a change. There is also a Rust libgit2 binding that might be an option, although that is probably even more work and a new dependency.
I've just installed static (with musl) FSelect 0.6.4 and started it in some random directory via
RUST_BACKTRACE=1 ~/fselect name
to get:
thread 'main' panicked at 'byte index 42 is out of bounds of `abandonedpsychiatricinsaneasylumroom.jpg`', src/libcore/str/mod.rs:2017:9
stack backtrace:
0: std::sys::unix::backtrace::tracing::imp::unwind_backtrace
at src/libstd/sys/unix/backtrace/tracing/gcc_s.rs:39
1: std::sys_common::backtrace::_print
at src/libstd/sys_common/backtrace.rs:71
2: std::panicking::default_hook::{{closure}}
at src/libstd/sys_common/backtrace.rs:59
at src/libstd/panicking.rs:197
3: std::panicking::default_hook
at src/libstd/panicking.rs:211
4: std::panicking::rust_panic_with_hook
at src/libstd/panicking.rs:474
5: std::panicking::continue_panic_fmt
at src/libstd/panicking.rs:381
6: rust_begin_unwind
at src/libstd/panicking.rs:308
7: core::panicking::panic_fmt
at src/libcore/panicking.rs:85
8: core::str::slice_error_fail
at src/libcore/str/mod.rs:0
9: core::str::traits::<impl core::slice::SliceIndex<str> for core::ops::range::RangeTo<usize>>::index::{{closure}}
10: fselect::util::parse_filesize
11: fselect::searcher::Searcher::get_field_value
12: fselect::searcher::Searcher::check_file
13: fselect::searcher::Searcher::visit_dir
14: fselect::searcher::Searcher::visit_dir
15: fselect::searcher::Searcher::visit_dir
16: fselect::searcher::Searcher::list_search_results
17: fselect::main
18: std::rt::lang_start::{{closure}}
19: std::panicking::try::do_call
at src/libstd/rt.rs:49
at src/libstd/panicking.rs:293
20: __rust_maybe_catch_panic
at src/libpanic_unwind/lib.rs:85
21: std::rt::lang_start_internal
at src/libstd/panicking.rs:272
at src/libstd/panic.rs:394
at src/libstd/rt.rs:48
22: main
fselect 'path,size from . where size=0'
query: Error parsing condition, no operator found
Writing the condition with spaces, where size = 0, works.
It is not possible to query if the directory in the FROM clause has a dash in it, e.g.,
> fselect path from './foo-bar'
./foo: entity not found
even though this is a valid path (part) name.
In an attempt to pick square images, I tried to run the request:
fselect path where width = height
But it didn't work (I suppose, it's not supported). Maybe that would be a good feature to have. Or even more advanced:
fselect path where width gte (height / 2)
Newer iPhones default to capturing photos in *.heic files using the HEIF format. These photos still use EXIF metadata, and it looks like the libheif-rs crate can parse it out of such files. Could the EXIF support in fselect be reasonably extended to cover HEIF files as well? libheif-rs depends on the libheif C++ library and its dependencies, so this probably would have an impact on build complexity unless it was a compile-time option.
I'm trying to find anything that can sort a folder of mixed HEIF and JPEG photos by date taken, and not finding much smaller than full-blown photo management databases.
Hi,
I wanted to try out fselect on my raspberry pi 3B but it gives me the error mentioned in the title.
rust was installed just fine, this is the version:
rustc --version --verbose
rustc 1.38.0 (625451e37 2019-09-23)
binary: rustc
commit-hash: 625451e376bb2e5283fc4741caa0a3e8a2ca4d54
commit-date: 2019-09-23
host: armv7-unknown-linux-gnueabihf
release: 1.38.0
LLVM version: 9.0
and I am on "rustup default stable".
I changed it to nightly, as some answers on the internet suggested, but that did not work out for me: same issue.
Attached is the whole output of cargo install fselect
When I select multiple columns, they get printed in separate lines:
$ fselect name size from /usr depth 1
lib32
36864
etc
4096
share
20480
lib64
270336
include
69632
man
4096
bin
139264
lib
270336
local
4096
sbin
139264
src
4096
I would have expected a tabular format:
lib32 36864
etc 4096
share 20480
lib64 270336
include 69632
man 4096
bin 139264
lib 270336
local 4096
sbin 139264
src 4096
Or, at least, CSV (or even tab separated):
lib32,36864
etc,4096
share,20480
lib64,270336
include,69632
man,4096
bin,139264
lib,270336
local,4096
sbin,139264
src,4096
Example (I wish to find the 10 largest files in ~/Downloads):
fselect hsize, path from ~/Downloads order by size desc limit 10
Hi fselect-developers,
thanks for this great tool !
I'm a big fan of using SQL.
But I've just tried to find huge mbox files:
fselect fsize, path from /home/user/Mail
where size gt 10mb
and path like '%tr%'
order by 1 asc
But the order by does not sort correctly.
It's not ordered by fsize as a number value.
I expected at worst that I get a list in lexicographical order.
But it's a randomly ordered list.
Could you explain this or, better, fix it?
Thanks a lot in advance
UD
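To illustrate the ordering expected in the report above, here is a sketch that sorts by the raw byte count and formats the size only for display. The Entry struct, the sample paths and the simplified format_size helper are all hypothetical; the source does not say how fselect orders fsize internally, so this is not a diagnosis of the bug.

```rust
// Hypothetical record, not fselect's actual data model.
struct Entry {
    path: String,
    size: u64, // size in bytes
}

// Simplified human-readable formatter, used for display only.
fn format_size(bytes: u64) -> String {
    const UNITS: [&str; 4] = ["B", "KiB", "MiB", "GiB"];
    let mut value = bytes as f64;
    let mut unit = 0;
    while value >= 1024.0 && unit < UNITS.len() - 1 {
        value /= 1024.0;
        unit += 1;
    }
    format!("{:.2}{}", value, UNITS[unit])
}

// Orders entries by the numeric size, never by the formatted string.
fn sort_by_size(entries: &mut [Entry]) {
    entries.sort_by_key(|e| e.size);
}

fn main() {
    let mut entries = vec![
        Entry { path: "Mail/trash".to_string(), size: 150 * 1024 * 1024 },
        Entry { path: "Mail/travel".to_string(), size: 11 * 1024 * 1024 },
        Entry { path: "Mail/tracker".to_string(), size: 20 * 1024 * 1024 },
    ];
    sort_by_size(&mut entries);
    for e in &entries {
        println!("{}\t{}", format_size(e.size), e.path);
    }
}
```

Sorting the formatted strings instead would compare text such as "150.00MiB" and "20.00MiB" character by character, which is why the numeric key is kept separate from the display value here.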
Seems that regular expressions are case-sensitive by default (which is fine) but I cannot find any way to make them case-insensitive (using regex with operators like /.../i does not work).
Examples:
fselect name where name =~ '.*substring.*' -> works (case sensitive)
fselect name where name =~ '/.*substring.*/i' -> does not work (should be a case-insensitive search)
Is it a limitation or did I miss something?
There seems to be a bug with regex matching. To reproduce:
$ mkdir fselect-test
$ cd fselect-test
$ touch some-file
$ fselect name where name =~ '^\.'
some-file
I would think that the regex ^\. would match a string whose first character is a literal dot. Yet fselect thinks the name some-file matches, though it doesn’t start with a dot.
By comparison, fselect name where name =~ '\.', without a starting ^, correctly outputs no rows. fselect name where name =~ '\.$', with the end anchor $, also correctly outputs no rows.
I see that ^ is supported, as I expected, by the regular expressions library you use.
I am on macOS 10.14, using the Fish shell.
I see that the code for doing regex matches against fields is here:
Lines 1472 to 1491 in bf0ba6d
I don’t see a problem with that part of the code, though I’m not that familiar with Rust.
There probably should be a mime column.
Bonus point: detect MIME types by the content.
fselect size,modified,path from . where modified gt 2018-08-01 and name='*.txt' order by modified
This seems pretty straightforward, but it looks like everything in the where clause is ignored and the sorting is not working either. It just dumps the same output as if I did
fselect size,modified,path from .
I am using zsh and I installed from source on Ubuntu 18.10.