Comments (11)

Jeffail commented on July 20, 2024

Hey @rucciva, I've been thinking about this. An alternative would be to use the process_map processor, which lets you extract fields, apply a date processor, and then place the result at a path. That way the date processor can use function interpolation for multiple args and a value.

It would look something like this:

type: process_map
process_map:
  premap:
    duration: path.to.duration
    value: path.to.value
  processors:
  - type: date
    date:
      operator: add
      arg: ${!json_field:duration}
      value: ${!json_field:value}
  postmap:
    path.of.result: .

from connect.

Jeffail commented on July 20, 2024

Hey @rucciva, I've considered this before. To an extent some of this can be done with the grok processor, but it doesn't allow you to do conversions. What specific processing steps are you after?

I'd consider either function interpolations or a dedicated processor; I'm not sure yet which I would prefer.

rucciva commented on July 20, 2024

In my case, I would like to:

  1. Convert a Unix timestamp (in seconds, milliseconds, or nanoseconds) to an Elasticsearch date string.
  2. From the same date, construct a time-based index name, either daily, monthly, or yearly, depending on the approximate number of events.

Jeffail commented on July 20, 2024

Thinking about it, I could possibly add this as a text processor operation. It might look something like this:

type: text
text:
  operator: date_to_unix
  arg: "Mon Jan 2 15:04:05 -0700 MST 2006" # Format to convert from

The arg field would follow this rule: https://golang.org/pkg/time/#Time.Format. Then I could add date_to_unix_nano, unix_to_date, unix_nano_to_date, etc.

type: text
text:
  operator: unix_to_date
  arg: "Mon Jan 2 15:04:05 -0700 MST 2006" # Format to convert to
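
For reference, the layout string in these examples is Go's reference time written out in full. Below is a minimal sketch of what the two proposed operators might do, using only the standard `time` package. `unixToDate` and `dateToUnix` are hypothetical helpers named after the proposed operators, not Benthos functions, and formatting in UTC is an assumption.

```go
package main

import (
	"fmt"
	"time"
)

// layout is the format string from the examples above.
const layout = "Mon Jan 2 15:04:05 -0700 MST 2006"

// unixToDate formats a Unix timestamp (seconds) using the given layout, in UTC.
func unixToDate(sec int64, layout string) string {
	return time.Unix(sec, 0).UTC().Format(layout)
}

// dateToUnix parses a date string in the given layout and returns Unix seconds.
func dateToUnix(s, layout string) (int64, error) {
	t, err := time.Parse(layout, s)
	if err != nil {
		return 0, err
	}
	return t.Unix(), nil
}

func main() {
	s := unixToDate(1540352793, layout)
	fmt.Println(s) // Wed Oct 24 03:46:33 +0000 UTC 2018

	sec, err := dateToUnix(s, layout)
	if err != nil {
		panic(err)
	}
	fmt.Println(sec) // 1540352793
}
```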

rucciva commented on July 20, 2024

Hmm, why not create a separate processor instead of using the text processor?

IMHO, adding this operation to the text processor is not consistent with the fact that decode, encode, and hash each get their own processor. I think the text processor is best for general string processing.

I also think that having a dedicated processor for date processing is good in case future specific date operations are needed.

Jeffail commented on July 20, 2024

Yeah you might be right, I'll sit on this one for a little while if that's okay.

rucciva commented on July 20, 2024

Yeah, it's completely okay. No need to rush.

rucciva commented on July 20, 2024

Hi @Jeffail, any news on this?

IMHO, a separate processor is better than combining it with the text processor. I think the config might be different, and I have two options in mind.

First:

type: date
date:
    parts: []
    operator: some operator
    args:
        - interpolatable argument 1
        - interpolatable argument 2
        - ...
    value: interpolatable value that will replace the message part as the base value that is manipulated by the operator

I think a message containing only a date value is rare, so the date value will usually be a part of, or a field in, the message. This also implies that replacing the whole message with only a date value is a rare occasion. In this case process_field could come in handy to select only the date field to be processed. But in the process we would also lose the reference to other parts of the message that might be needed as arguments.

For that case, I'm thinking we could create a new field that contains both the value to be processed and the other values needed as arguments. Then we use process_field on the newly created field and select the value and argument fields using interpolation. For example, given a message like {"request_start_time" : 1540352793, "duration":5, ... }, when we need to compute request_finish_time, we first create a temporary field containing both request_start_time and duration, like this: {"request_start_time" : 1540352793, "duration":5, "request_finish_time": {"request_start_time" : 1540352793, "duration":5}, ... }. Then we apply process_field to 'request_finish_time' and run the date processing:

- type: process_field
  process_field: 
    parts: []
    path: "request_finish_time"
    processors: 
      - type: date
        date: 
          operator: add
          args:
            - ${!json_field:duration}
          value: ${!json_field:request_start_time}

The second option would use target_path and source_path in the configuration, like so:

type: date
date:
  parts: []
  source_path: some json path like that of process_field's path
  target_path: some json path like that of process_field's path
  operator: some operator
  args:
   - interpolatable argument 1
   - interpolatable argument 2
   - ...

Or we could just remove source_path and treat everything as arguments to the operator.

What do you think?

rucciva commented on July 20, 2024

I like that process_map approach. It makes the config look simpler, and it can be reused with other sub-processors.

Jeffail commented on July 20, 2024

Hey @rucciva, just to keep you up to date: I'm thinking of solving this instead via the new AWK processor, by adding date-related functions.

This example gives a brief look at how it works: https://github.com/Jeffail/benthos/blob/master/docs/processors/README.md#json

rucciva commented on July 20, 2024

Hi @Jeffail, this is very nice too. Thanks.

I think the issue can be closed.
