
linesieve's People

Contributors

lemon24


linesieve's Issues

"parse" command

For each line, output the named groups of the first matching pattern (of many) as a list of key-value pairs (as returned by groupdict()).

TODO: A better name than "extract".

Output format should be at least:

  • tab-separated values, like split/cut (TBD how to delimit key from value)
  • JSON or JSON Lines (whichever jq can read)
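A minimal sketch of the idea so far, assuming Python's re module. The patterns, function names, and the "=" key/value delimiter are hypothetical (the delimiter is still TBD above); this is not linesieve's actual implementation.

```python
import json
import re

# Hypothetical patterns; the first one that matches a line wins.
PATTERNS = [
    re.compile(r"(?P<level>ERROR|WARN) (?P<message>.*)"),
    re.compile(r"(?P<user>\w+) logged in from (?P<ip>[\d.]+)"),
]


def parse_line(line, patterns=PATTERNS):
    """Return the named groups of the first matching pattern, or None."""
    for pattern in patterns:
        match = pattern.search(line)
        if match:
            return match.groupdict()
    return None


def to_json_line(groups):
    """Serialize as one JSON object per line (JSON Lines)."""
    return json.dumps(groups)


def to_tsv(groups, key_sep="="):
    """Tab-separated key/value pairs; key_sep is an assumed delimiter."""
    return "\t".join(f"{key}{key_sep}{value}" for key, value in groups.items())
```

Since jq accepts a stream of JSON values, one object per line should be readable by it as-is.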

Nice to have:

  • specify preferred ordering of keys (maybe not for JSON)
  • handle lines that don't match anything in a sane way
  • a mode where you give a pattern with "key" and "value" groups and it finds every occurrence in the line (as with re.finditer())
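The key/value mode from the last item could look roughly like this. The pattern and the extract_pairs name are made up for illustration.

```python
import re

# Hypothetical pattern with "key" and "value" named groups.
PAIR = re.compile(r"(?P<key>\w+)=(?P<value>\S+)")


def extract_pairs(line, pattern=PAIR):
    """Collect every key/value occurrence found in the line, in order."""
    return [
        (match.group("key"), match.group("value"))
        for match in pattern.finditer(line)
    ]
```

For a line like `GET /index status=200 time=12ms`, this yields the pairs for status and time, and an empty list for lines with no pairs.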

"sort" command

An in-memory sort is fine, initially.

Sort key regex (named, first group, whole match; like --section marker). Maybe multiple groups too?

Would be nice to have natural/human sort (see Unix sort flags).
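One common way to get natural ordering is to split each line into text and digit runs and compare the digit runs as integers. A sketch, with natural_key as an assumed name:

```python
import re


def natural_key(line):
    """Sort key that compares runs of digits as integers."""
    # With a capturing group, re.split() alternates text and digit runs:
    # even indexes are text (possibly empty), odd indexes are digits.
    parts = re.split(r"(\d+)", line)
    return [int(part) if index % 2 else part for index, part in enumerate(parts)]
```

With this key, `sorted(lines, key=natural_key)` places file2 before file10, where a plain string sort would not.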


Update:

Relevant Unix sort flags:

  • -n, --numeric-sort
  • -h, --human-numeric-sort (can be added later)
  • -r, --reverse
  • -k, --key=POS1[,POS2], "POS is F[.C][OPTS], where F is the field number and C the character position in the field"
  • -t, --field-separator=SEP (arguably, both this and --key should work like linesieve split)
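As a rough sketch of how a subset of these flags might map onto Python's sorted(): field_key is a hypothetical helper that covers only a single field number (no character positions or per-key OPTS), and treating non-numeric values as 0 is an assumption.

```python
def field_key(field, sep=None, numeric=False):
    """Build a sort key from the 1-based field number, like -k F with -t SEP."""

    def key(line):
        fields = line.split(sep)
        # A missing field sorts as the empty string.
        value = fields[field - 1] if field <= len(fields) else ""
        if not numeric:
            return value
        try:
            return float(value)
        except ValueError:
            # Assumption: values that are not numbers sort as 0.
            return 0.0

    return key
```

Something like `sorted(lines, key=field_key(2, sep="\t", numeric=True), reverse=True)` would then correspond to combining -k, -t, -n and -r.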

It should also be possible to pass a regex key:

  • no groups: full match
  • numeric groups: ???
  • named groups: custom order, with flags like --key OPTS (a la linesieve parse name__value feature)
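A sketch of a regex-based key under a few assumptions: regex_key and its convert parameter are hypothetical, named groups are compared in pattern order, a pattern without named groups falls back to the full match, and lines that do not match sort last. Numbered groups are not handled specially, since that case is still open above.

```python
import re


def regex_key(pattern, convert=str):
    """Build a sort key from a regex: named groups if any, else the full match."""
    compiled = re.compile(pattern)

    def key(line):
        match = compiled.search(line)
        if match is None:
            # Assumption: lines that do not match sort after all others.
            return (1, ())
        named = match.groupdict()
        values = named.values() if named else [match.group(0)]
        return (0, tuple(convert(value) for value in values))

    return key
```

Passing `convert=int` gives a numeric comparison of the captured values, similar in spirit to -n.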

Multi-line support

It should be possible to treat the following as 3 "lines":

start abcd
start ef
gh
start third

This is useful for log files that start with e.g. a timestamp.
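A minimal sketch of grouping by a start pattern, with group_records as an assumed name. It handles only a start separator; lines that appear before the first start line form a record of their own here, which is just one possible answer for what happens outside of separators.

```python
import re


def group_records(lines, start):
    """Join physical lines into records, each beginning at a line matching start."""
    start_re = re.compile(start)
    record = []
    for line in lines:
        # A new start line closes the record collected so far.
        if start_re.match(line) and record:
            yield "\n".join(record)
            record = []
        record.append(line)
    if record:
        yield "\n".join(record)
```

Applied to the four lines above with the pattern `start`, this produces three records, the second one spanning two physical lines.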

It should likely be possible to specify a start separator, an end separator, or both; also see how awk does it. It is worth considering what happens "outside" of separators.

If multi-line is enabled, it should probably imply re.DOTALL. (If so, it could simply be enabled everywhere, since it makes no difference for single-line input.)
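To illustrate the re.DOTALL point with a hypothetical pattern: without the flag, "." stops at the newline inside a multi-line record, while for a record with no newline the flag changes nothing.

```python
import re

pattern = r"start (?P<body>.*)"

multi = "start ef\ngh"
# Without DOTALL, "." does not match the newline, so the body stops at "ef".
body_default = re.search(pattern, multi).group("body")
# With DOTALL, "." also matches the newline, so the body covers both lines.
body_dotall = re.search(pattern, multi, re.DOTALL).group("body")

single = "start abcd"
# For a line without newlines, both variants give the same result.
single_default = re.search(pattern, single).group("body")
single_dotall = re.search(pattern, single, re.DOTALL).group("body")
```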
