
Comments (8)

ildargafarov commented on August 25, 2024

@lspgn Hi! Will there be any answer here? Do you have any questions for me?

from goflow2.

lspgn commented on August 25, 2024

Hello,
I haven't had much time to update GoFlow2 recently.
I do not understand which field you want to make exportable. Without a template, the structure cannot be exported, which is equivalent to keeping the raw payload of the UDP message.
While this is an interesting idea, storing raw data and handling template expiration introduce a non-negligible amount of complexity.

I believe #29 should be able to solve your issue by allowing cold start.
Unless I am misunderstanding your use-case.

ildargafarov commented on August 25, 2024

Hi! Thanks for the answer!
Sorry for the wording of my question; I'm not a native English speaker and I'm new to Golang and NetFlow.

I meant making the fields of the ErrorTemplateNotFound structure exportable. I think the word "exportable" is more appropriate in Golang than "public" :-) I didn't mean making the payloads exportable. What I want exactly is:

type ErrorTemplateNotFound struct {
	Version      uint16
	OBSDomainId  uint32
	TemplateId   uint16
	TypeTemplate string
}

from here
These are the only changes I want in the PR.
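To illustrate why exported fields help a caller, here is a minimal sketch. It assumes the error is returned as a pointer and defines its own local stand-in for ErrorTemplateNotFound rather than importing goflow2; decode and missingTemplateID are hypothetical helpers invented for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrorTemplateNotFound is a local stand-in that mirrors the struct proposed
// above; it is not imported from goflow2.
type ErrorTemplateNotFound struct {
	Version      uint16
	OBSDomainId  uint32
	TemplateId   uint16
	TypeTemplate string
}

func (e *ErrorTemplateNotFound) Error() string {
	return fmt.Sprintf("no %s template %d found for version %d and domain %d",
		e.TypeTemplate, e.TemplateId, e.Version, e.OBSDomainId)
}

// decode is a hypothetical decoder that always reports a missing template.
func decode() error {
	notFound := &ErrorTemplateNotFound{Version: 9, OBSDomainId: 1, TemplateId: 256, TypeTemplate: "data"}
	return fmt.Errorf("decoding failed: %w", notFound)
}

// missingTemplateID reports the template ID carried by err, if err wraps an
// ErrorTemplateNotFound. This only works because the fields are exported.
func missingTemplateID(err error) (uint16, bool) {
	var notFound *ErrorTemplateNotFound
	if errors.As(err, &notFound) {
		return notFound.TemplateId, true
	}
	return 0, false
}

func main() {
	if id, ok := missingTemplateID(decode()); ok {
		fmt.Println("buffer payload until template", id, "arrives")
	}
}
```

With unexported fields the caller could still detect the error type, but could not tell which template is missing.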

Yes, I want to save raw payloads until a template appears. This is pseudocode that shows how I want to implement it:

type NetflowParser struct {
	// one template system per exporter, keyed by source IP
	templates map[string]*TemplateSystem
	// raw payloads keyed by version, observation domain ID and template ID
	unresolvedData     map[uint16]map[uint32]map[uint16][][]byte
	unresolvedDataLock *sync.RWMutex
}

// addUnresolvedData saves a packet that has no template yet, to process it later
func (p *NetflowParser) addUnresolvedData(data []byte, version uint16, obsDomainId uint32, templateId uint16) {
	p.unresolvedDataLock.Lock()
	defer p.unresolvedDataLock.Unlock()
	if _, exists := p.unresolvedData[version]; !exists {
		p.unresolvedData[version] = make(map[uint32]map[uint16][][]byte)
	}
	if _, exists := p.unresolvedData[version][obsDomainId]; !exists {
		p.unresolvedData[version][obsDomainId] = make(map[uint16][][]byte)
	}
	p.unresolvedData[version][obsDomainId][templateId] = append(p.unresolvedData[version][obsDomainId][templateId], data)
}

// popUnresolvedData removes and returns the unprocessed packets for a template identifier
func (p *NetflowParser) popUnresolvedData(version uint16, obsDomainId uint32, templateId uint16) [][]byte {
	p.unresolvedDataLock.Lock()
	defer p.unresolvedDataLock.Unlock()
	unresolvedData := p.unresolvedData[version][obsDomainId][templateId]
	delete(p.unresolvedData[version][obsDomainId], templateId)
	return unresolvedData
}

func (p *NetflowParser) Parse(data []byte, metadata map[string]string) error {
	// get or create templates holder for each NAT server
	templates, ok := p.templates[metadata["packetIPSrc"]]
	if !ok {
		templates = &TemplateSystem{
			templates: netflow.CreateTemplateSystem(),
			key:       metadata["packetIPSrc"],
		}
		p.templates[metadata["packetIPSrc"]] = templates
	}

	buf := bytes.NewBuffer(data)
	decMsg, err := netflow.DecodeMessage(buf, templates)
	if err != nil {
		switch err := err.(type) {
		case *netflow.ErrorTemplateNotFound:
			p.addUnresolvedData(data, err.Version, err.OBSDomainId, err.TemplateId)
		default:
			log.Println(err)
		}
		return err
	}

	msg, ok := decMsg.(netflow.NFv9Packet)
	if !ok {
		return nil
	}

	for _, flowSet := range msg.FlowSets {
		switch flowSet := flowSet.(type) {
		case netflow.TemplateFlowSet:
			for _, record := range flowSet.Records {
				payloads := p.popUnresolvedData(msg.Version, msg.SourceId, record.TemplateId)
				for _, payload := range payloads {
					go p.Parse(payload, metadata)
				}
			}
		case netflow.DataFlowSet:
			// process data records
		default:
			log.Printf("unknown flow set type: %T\n", flowSet)
		}
	}

	return nil
}
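The buffering part of the pseudocode can also be sketched on its own, without goflow2. The variant below is only an illustration: it assumes a single struct key instead of nested maps, and the names templateKey, pendingBuffer, add and pop are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// templateKey identifies the template a buffered payload is waiting for.
// The type and its field names are hypothetical.
type templateKey struct {
	version     uint16
	obsDomainID uint32
	templateID  uint16
}

// pendingBuffer holds raw payloads until their template is known.
type pendingBuffer struct {
	mu   sync.Mutex
	data map[templateKey][][]byte
}

func newPendingBuffer() *pendingBuffer {
	return &pendingBuffer{data: make(map[templateKey][][]byte)}
}

// add stores a copy of the payload, so the caller may reuse its read buffer.
func (b *pendingBuffer) add(key templateKey, payload []byte) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.data[key] = append(b.data[key], append([]byte(nil), payload...))
}

// pop removes and returns every payload waiting for key, oldest first.
func (b *pendingBuffer) pop(key templateKey) [][]byte {
	b.mu.Lock()
	defer b.mu.Unlock()
	payloads := b.data[key]
	delete(b.data, key)
	return payloads
}

func main() {
	buf := newPendingBuffer()
	key := templateKey{version: 9, obsDomainID: 1, templateID: 256}
	buf.add(key, []byte{0x01, 0x02})
	buf.add(key, []byte{0x03})
	fmt.Println("payloads waiting:", len(buf.pop(key)))
}
```

A comparable struct works as a map key in Go, which avoids creating the inner maps by hand. Copying the payload matters if the UDP read buffer is reused between packets.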

#29 does not exactly cover my case. Every FlowRecord is important in my app. I think saving templates to files is a more complex solution than just saving raw payloads in memory, at least in my case.

I'll prepare a PR on the weekend. I just wanted to ask you before writing any code.

lspgn commented on August 25, 2024

Hello,
Thank you for the details.
I will think about it, but this creates follow-on problems:

  • where is the data stored, and for how long (if the template never appears, does the data expire?)
  • what happens if the template changed and suddenly wrong fields are decoded
  • ordering may matter in some pipelines (especially the ones that do windowing aggregates)

"Every FlowRecord is important in my app. I think saving templates to files is a more complex solution than just saving raw payloads in memory, at least in my case"

This is more complex than warming a template cache.
My suggestion in that case: if you are dealing with network packets, have you looked at sFlow? They are stateless

ildargafarov commented on August 25, 2024

Hi! Thanks for your suggestions.

I have tried to find some answers to your questions in RFC 3954.

"where is the data stored, for how long (if the template never appears, is it expired)"

As RFC 3954 says:

"If the Flow has been inactive for a certain period of time. This inactivity timeout SHOULD be configurable at the Exporter, with a minimum value of 0 for an immediate expiration."

I think the collector should also have a configurable timeout that determines how long the records are kept.
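A collector-side timeout like this could be sketched as follows. This is an illustration only: pendingPayload, dropExpired and the ttl parameter are hypothetical names, and nothing here is part of goflow2.

```go
package main

import (
	"fmt"
	"time"
)

// pendingPayload is a raw packet together with the time it was buffered.
type pendingPayload struct {
	data     []byte
	received time.Time
}

// dropExpired returns the payloads that are still younger than ttl at time now.
// ttl stands in for the hypothetical collector-side timeout setting; a ttl of
// 0 expires everything immediately.
func dropExpired(pending []pendingPayload, now time.Time, ttl time.Duration) []pendingPayload {
	kept := make([]pendingPayload, 0, len(pending))
	for _, p := range pending {
		if now.Sub(p.received) < ttl {
			kept = append(kept, p)
		}
	}
	return kept
}

func main() {
	now := time.Now()
	pending := []pendingPayload{
		{data: []byte{0x01}, received: now.Add(-10 * time.Minute)},
		{data: []byte{0x02}, received: now.Add(-1 * time.Minute)},
	}
	kept := dropExpired(pending, now, 5*time.Minute)
	fmt.Println("payloads kept:", len(kept))
}
```

Running dropExpired periodically, or whenever a new payload is buffered, keeps memory bounded even if a template never arrives.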

"what happens if the template changed and suddenly wrong fields are decoded"

As the Template Management section of RFC 3954 says:

"If the template configuration is changed, the current Template ID is abandoned and SHOULD NOT be reused until the NetFlow process or Exporter restarts. If a Collector should receive a new definition for an already existing Template ID, it MUST discard the previous template definition and use the new one."

The collector may keep track of the sysUpTime header and discard all buffered records if the header shows that the exporter has restarted.
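One way to read this idea is sketched below. It assumes that a sysUpTime value lower than the previous one from the same exporter means a restart; restartTracker and observe are hypothetical names, and the exporter key is just an example address.

```go
package main

import "fmt"

// restartTracker remembers the last sysUpTime seen per exporter.
// The type and the string key are hypothetical.
type restartTracker struct {
	lastUptime map[string]uint32
}

func newRestartTracker() *restartTracker {
	return &restartTracker{lastUptime: make(map[string]uint32)}
}

// observe records sysUptime (milliseconds since the exporter booted) and
// reports whether it went backwards, which this sketch reads as a restart.
func (t *restartTracker) observe(exporter string, sysUptime uint32) bool {
	last, seen := t.lastUptime[exporter]
	t.lastUptime[exporter] = sysUptime
	return seen && sysUptime < last
}

func main() {
	tracker := newRestartTracker()
	for _, uptime := range []uint32{120000, 180000, 4000} {
		if tracker.observe("192.0.2.1", uptime) {
			fmt.Println("exporter restarted: drop buffered payloads and old templates")
		}
	}
}
```

This check is deliberately naive: reordered UDP packets and the wraparound of the 32-bit millisecond counter (after roughly 49.7 days of uptime) would also make the value go backwards, so a real collector would need a tolerance.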

"ordering may matter in some pipelines (especially the ones that do windowing aggregates)"

The collector can sort records by the Sequence Number header.
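Sorting buffered packets this way might look like the sketch below. bufferedPacket and sortBySequence are hypothetical names, and the comparison assumes the sequence counter does not wrap around within the buffered window.

```go
package main

import (
	"fmt"
	"sort"
)

// bufferedPacket pairs a raw payload with the sequence number from its header.
type bufferedPacket struct {
	sequence uint32
	data     []byte
}

// sortBySequence orders packets in place by ascending sequence number.
// It ignores 32-bit wraparound, which is an assumption of this sketch.
func sortBySequence(packets []bufferedPacket) {
	sort.SliceStable(packets, func(i, j int) bool {
		return packets[i].sequence < packets[j].sequence
	})
}

func main() {
	packets := []bufferedPacket{{sequence: 12}, {sequence: 10}, {sequence: 11}}
	sortBySequence(packets)
	for _, p := range packets {
		fmt.Println(p.sequence)
	}
}
```

Note that this only restores order among the buffered packets themselves; records that were already decoded and forwarded downstream stay ahead of them.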

"This is more complex than warming a template cache."

Definitely, template warming is the more proper solution. I described my case, where template warming can't solve all my problems.

"My suggestion in that case: if you are dealing with network packets, have you looked at sFlow? They are stateless"

I can't use sFlow because I don't have access to the exporter, so I cannot choose the protocol type or version.
I attached a PR here that makes it possible to build the solution on my side. I am not asking you to add the record-saving feature to the library. Thanks for your support.

lspgn commented on August 25, 2024

Understood! Thank you
Currently, any new template with the same IDs will overwrite the previous template.
Regarding the PR, this looks OK, but as I am refactoring templates (#49), I may remove the custom errors. I will keep the ability to identify missing templates and will also include expiration.
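The overwrite behavior can be pictured with a small map-based store. This is a sketch, not GoFlow2's actual template system: templateKey, template, templateStore and put are hypothetical, and the field type numbers are arbitrary.

```go
package main

import "fmt"

// templateKey and template are hypothetical stand-ins for this sketch.
type templateKey struct {
	obsDomainID uint32
	templateID  uint16
}

type template struct {
	fieldTypes []uint16
}

type templateStore map[templateKey]template

// put stores tmpl under key and reports whether it replaced an older definition.
func (s templateStore) put(key templateKey, tmpl template) bool {
	_, replaced := s[key]
	s[key] = tmpl
	return replaced
}

func main() {
	store := templateStore{}
	key := templateKey{obsDomainID: 1, templateID: 256}
	store.put(key, template{fieldTypes: []uint16{8, 12}})
	if store.put(key, template{fieldTypes: []uint16{8, 12, 7}}) {
		fmt.Println("template redefined, field count is now", len(store[key].fieldTypes))
	}
}
```

A caller that buffers raw payloads could use the returned flag to decide whether payloads waiting under the old definition are still safe to decode.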

ildargafarov commented on August 25, 2024

Great, thanks a lot. I'll wait for #49. I think the issue may be closed.

lspgn commented on August 25, 2024

I believe this is solved in v1.2.1
Feel free to re-open otherwise
