acln0 / perf

tentative golang.org/x/sys/unix/linux/perf package

Go 99.61% Assembly 0.39%

perf's People

andrewkroh adriansr pwaller hongli-my zyedidia dbaynak-cl

perf's Issues

The way forward for the perf package

Hi all,

I've reached out to @acln0 privately to ask about the future of this package. Unfortunately, he won't be returning to OSS anytime soon, so we'll need to find an alternative working solution in the meantime. He agrees we should try to focus the community's efforts around a single repository, since there are currently multiple forks of this repo with different patch sets.

There's also a 'competing' implementation at https://github.com/hodgesds/perf-utils (@hodgesds), though I'm not sure how much its features overlap with this package.

My current goal is using this library to complement https://github.com/cilium/ebpf for easy tracepoint and kprobe support, and moving out most of the cilium/ebpf/perf package. (cc @lmb)

Given my (very) limited familiarity with the kernel's perf API, I don't feel positioned to take up maintainership long-term, but I can already start gathering patches from forks for inclusion. (Also, my availability over the summer will be limited.)

I'd also like to gather the community's thoughts around this lib in particular.

Do we stick with this API? Is it too high or low level?
What are the biggest pain points (barring unmerged PRs)?
Any particular reason to push for inclusion into the stdlib?

Also, where should the package live in the meantime?

The existing github.com/elastic/go-perf? (@adriansr)
github.com/cilium/perf is available
Any other active organizations that have an interest in hosting this?

@pwaller @andrewkroh @florianl @tklauser @mdlayher @pengfei-su @markpash @fbegyn

cc, feel free to ignore/unsub if not interested, sorry for spamming!

Thanks, take care!

Use Errorf instead of Fatalf in test failures where possible

As noted in #18, the code is full of these.

Revisit testComm waiting on sawcommevfd

In #25, I accidentally introduced an additional comment on evwait(sawcommevfd) in the testComm test. This was unintentional, and doesn't otherwise relate to the fixes introduced in that PR.

This issue is here as a reminder to revisit this.

perf/record_test.go, lines 528 to 544 in 4d8e4e5:

	// TODO(acln): investigate the legitimacy of the following crutch.
	//
	// Wait for the parent to see that we changed our name, then exit.
	//
	// If we do not wait here, there is a terrible race condition waiting
	// to happen: If we PR_SET_NAME in the child, then immediately exit,
	// the other side may not see POLLIN on the comm record: it may see
	// POLLHUP directly, even though a comm record was actually written
	// to the ring in the meantime. Why we get POLLHUP directly, and not
	// POLLIN before it, is unclear. The machinery to deal with this
	// eventuality in the poller does not exist yet, and at the time
	// when this comment was written, I have found no good solutions to
	// this conundrum.
	//
	// So we live with it, but still try to make our test pass.
	// evwait(sawcommevfd)
	_ = sawcommevfd

Currently, if I apply the patch from #26 onto master at 4d8e4e5, then sudo ./perf.test -test.count=1000 -test.run=Record/Comm passes with no failures.

However, if I uncomment the evwait above, then the tests fail. In that case, ReadRecord doesn't return, even if the context deadline is set many seconds into the future. What I see is that the child ends up waiting for the signal sawcommevfd after changing its COMM, and never exits.

So the mystery is why ReadRecord doesn't return when this wait is present. It feels as though the kernel isn't respecting our wakeup events = 1. When the wait is commented out (erroneously by me), then the process exits, and we receive the event.

EBADF in testComm

While looking into #23, I spied the skipped testComm test, and suspected it might be related to some of the issues I've been seeing.

I noticed that there were some ignored errors:

perf/record_test.go, line 1234 in 6861f4b:

	unix.Write(fd, buf)

perf/record_test.go, line 1240 in 6861f4b:

	unix.Read(fd, buf)

If I remove the skip and run go test -count=100 -run TestReadRecord/Comm, then every ~100 tests or so I see EBADF returning from those, and sometimes I see hangs while one side or the other is waiting for an eventfd which will never get messaged.

The skip:

perf/record_test.go, line 548 in 6861f4b:

	t.Skip("flaky. TODO(acln): investigate")

Is CallingThread correctly named?

https://github.com/acln0/perf/blob/master/perf.go#L24-L25

I assume the above is correct but wanted to double check. The perf docs say:

       pid == 0 and cpu == -1
              This measures the calling process/thread on any CPU.

I note that it uses the word process/thread for pid==0 and I'm not really sure what this means.

The other part to my question is, how does this interact with runtime.LockOSThread()? Can we ensure that Measure(func() {...}) doesn't measure other things going on within the process?

I note that golang/go#20458 was fixed and makes LockOSThread and UnlockOSThread safe to call in pairs, even if you don't know the outer state with which you are being called.
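On the LockOSThread question, a sketch of the pattern being discussed. As far as I understand, the pid passed to perf_event_open identifies a single task (a thread), so pid == 0 follows only the calling thread, and the Go scheduler may move a goroutine to another thread unless it is locked. `onLockedThread` is a hypothetical helper, not part of this package, and the sketch does not open any perf event.

```go
package main

import (
	"fmt"
	"runtime"
)

// onLockedThread runs f with the calling goroutine wired to its current
// OS thread, so that a counter opened for the calling thread and the
// code in f stay on the same thread for the duration of the call.
func onLockedThread(f func()) {
	runtime.LockOSThread()
	// Lock and unlock calls nest, so this is safe even if the caller
	// has already locked the thread.
	defer runtime.UnlockOSThread()
	f()
}

func main() {
	sum := 0
	onLockedThread(func() {
		// In real use, the event would be opened and measured here.
		for i := 1; i <= 100; i++ {
			sum += i
		}
	})
	fmt.Println(sum)
}
```

This would not by itself cover goroutines started from inside f, since those run on other threads unless the event is configured to inherit.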

(*perf.Group).Options are not respected after g.Add()

In #4 (comment) I commented that I was getting perf: failed to open event leader: perf_event_open: permission denied with kernel.perf_event_paranoid=2.

Stepping through perf_event_open with a kernel debugger I isolated the cause. The ExcludeKernel option isn't being passed through.

Here's the code I wrote where I expected the option to be set:

	var g perf.Group
	g.CountFormat = perf.CountFormat{
		Running: true,
		ID:      true,
	}

	g.Options.ExcludeKernel = true
	g.Options.ExcludeHypervisor = true
	
	g.Add(perf.Instructions, perf.CPUCycles)

Indeed, if I look at the leader attributes being used:

perf/group.go, lines 73 to 75 in e7587bd:

	leaderattr := g.attrs[0]
	leaderattr.CountFormat.Group = true
	leader, err := Open(leaderattr, pid, cpu, nil)

... I find that g.Options aren't being respected.
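One possible shape for a fix, sketched with stand-in types: apply the group's options when the attributes are prepared for opening, so that it no longer matters whether they were set before or after Add. `Options`, `Attr`, `Group`, and `openAttrs` below are simplified, hypothetical versions and not the package's real definitions.

```go
package main

import "fmt"

// Options is a reduced stand-in for the package's option set.
type Options struct {
	ExcludeKernel     bool
	ExcludeHypervisor bool
}

// Attr is a reduced stand-in for an event attribute.
type Attr struct {
	Label   string
	Options Options
}

// Group is a reduced stand-in for an event group.
type Group struct {
	Options Options
	attrs   []*Attr
}

// Add registers events by label (the real package takes configurators).
func (g *Group) Add(labels ...string) {
	for _, l := range labels {
		g.attrs = append(g.attrs, &Attr{Label: l})
	}
}

// openAttrs returns copies of the group's attrs with g.Options applied
// at open time, regardless of when the options were set.
func (g *Group) openAttrs() []Attr {
	out := make([]Attr, 0, len(g.attrs))
	for _, a := range g.attrs {
		c := *a
		c.Options = g.Options
		out = append(out, c)
	}
	return out
}

func main() {
	var g Group
	g.Add("instructions", "cpu-cycles")
	g.Options.ExcludeKernel = true // set after Add, still honored
	for _, a := range g.openAttrs() {
		fmt.Printf("%s exclude_kernel=%v\n", a.Label, a.Options.ExcludeKernel)
	}
}
```

Overwriting the options wholesale like this would also discard anything an individual event configured for itself, so a real fix would need to decide which side wins.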

Number of samples decreases dramatically if preciseIP is non-zero

perf/perf.go, line 1142 in 6861f4b:

	false, false, // 2 bits for skid constraint

In sampling mode, I found that the number of samples is dramatically reduced if preciseIP is 1, 2, or 3 instead of 0. Did I miss something important?

Tracepoint Configurator could be more clever

From @pwaller:

Calling g.Add(perf.Instructions, perf.CPUCycles, perf.ContextSwitches, tracepoint) on a Group with ExcludeKernel = true causes the tracepoint to not register any values.

Perhaps the Configurator returned by Tracepoint could be more clever, and set ExcludeKernel = false.

Or perhaps a different API is in order, such as IncludeKernel(Configurator) Configurator, or more generally, WithSomeOption(Configurator) Configurator.
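A rough sketch of the wrapper idea, assuming a Configurator is an interface with a single method that fills in an attribute. The `Attr`, `Configurator`, and `ConfiguratorFunc` types here are simplified stand-ins defined for the example, and `fakeTracepoint` is a hypothetical placeholder for what Tracepoint returns.

```go
package main

import "fmt"

// Attr is a reduced stand-in for the package's event attribute.
type Attr struct {
	Label         string
	ExcludeKernel bool
}

// Configurator mirrors the idea of a value that configures an Attr.
type Configurator interface {
	Configure(*Attr) error
}

// ConfiguratorFunc adapts a plain function to Configurator.
type ConfiguratorFunc func(*Attr) error

func (f ConfiguratorFunc) Configure(a *Attr) error { return f(a) }

// IncludeKernel wraps c so that, after c has configured the attribute,
// kernel events are included again.
func IncludeKernel(c Configurator) Configurator {
	return ConfiguratorFunc(func(a *Attr) error {
		if err := c.Configure(a); err != nil {
			return err
		}
		a.ExcludeKernel = false
		return nil
	})
}

// fakeTracepoint is a hypothetical placeholder for a tracepoint configurator.
func fakeTracepoint() Configurator {
	return ConfiguratorFunc(func(a *Attr) error {
		a.Label = "syscalls:sys_enter_write"
		return nil
	})
}

func main() {
	a := Attr{ExcludeKernel: true} // as inherited from group options
	if err := IncludeKernel(fakeTracepoint()).Configure(&a); err != nil {
		fmt.Println("configure failed:", err)
		return
	}
	fmt.Printf("%s exclude_kernel=%v\n", a.Label, a.ExcludeKernel)
}
```

Whether this helps in a Group depends on ordering: if the group's options are applied after the configurator runs, they would overwrite the wrapper's setting.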

Thinking.

No way to specify PERF_FLAG_FD_OUTPUT (?)

As far as I can tell, there is no way to pass flags into perf.Open.

Here it's passed as 0:

perf/perf.go, line 121 in 6861f4b:

	return open(a, pid, cpu, group, 0)

I want to pass PERF_FLAG_FD_OUTPUT so that I can coalesce tracepoint events into a single ring buffer, where it is useful to preserve their ordering.

Confusing code

This is only a minor suggestion, but I found this confusing:

perf/record_test.go, lines 219 to 225 in 239c48f:

	const errDisabledTestEnv = "PERF_TEST_ERR_DISABLED"

	func init() {
		// In child process of testErrDisabledProcessExist.
		if os.Getenv(errDisabledTestEnv) != "1" {
			return
		}

Variables/consts beginning with the name err are usually errors.
It refers to testErrDisabledProcessExist which doesn't exist.
It's unclear from the comment which side of the branch is "in" something. Is it the returning side, or the non-returning side? This might be a rare case where "early return" is a minor hindrance, and perhaps it would be better to phrase the condition positively and move the body of code out, to make it clear when the special condition matches:
```
func init() {
  if specialCondition {
    doSpecialCondition()
  }
}
```

Tests fail as user with paranoid=1

--- FAIL: TestCount (0.08s)
    --- FAIL: TestCount/Software (0.02s)
        --- FAIL: TestCount/Software/PageFaults (0.02s)
            count_test.go:99: failed to drop VM cache: open /proc/sys/vm/drop_caches: permission denied
    --- FAIL: TestCount/IoctlAndCountIDsMatch (0.00s)
        count_test.go:366: failed to drop VM cache: open /proc/sys/vm/drop_caches: permission denied
--- FAIL: TestReadRecord (0.01s)
    --- FAIL: TestReadRecord/CPUWideSwitch (0.00s)
        record_test.go:846: perf_event_open: permission denied

Please note CPUWideSwitch in particular - I have a suggestion/another discussion for the TestCounts.

Idea: perf.Command

One use case I have is to measure another process I want to start from Go. A slight challenge I immediately came up against was how to start the measurement before the process starts doing useful work.

Assumptions:

process may be short lived.
this is not a solved problem (because I can't find a solution yet).

Illustration of the issue

cmd := exec.Command()
cmd.Start() // required to get the Pid of the subprocess
... := perf.Open(..., cmd.Process.Pid, ...) // by the time we get here, the process may have already done some of the work we want to measure.

Idea

So my idea is to have some sort of user interface which takes the process to run. Either the *exec.Cmd, or arguments that are passed to exec.Command (though I guess that would be less flexible).

First there is a challenge to solve: How to create a process in Go such that it is in the stopped state, so that you can start measurement before it does work you want to measure? Searching around I'm surprised there is not an immediate solution to this. Ptrace on SysProcAttr might do it - I haven't checked yet.

The next best thing I can think of is to rig up a shell process along the lines of sh -c 'kill -STOP $$; exec "$@"' -- ....

Whatever the solution is, it could be wrapped up in perf.Command or alike.

Other things from `perf list`, e.g. hw_interrupts.received

I'd like to record hw_interrupts.received:

perf stat -v -e hw_interrupts.received bash -c 'for i in {1..10000}; do echo hi; done > /dev/null'
Using CPUID GenuineIntel-6-9E
hw_interrupts.received -> cpu/umask=0x1,period=100003,event=0xcb/
hw_interrupts.received: 3 13527404 13527404

 Performance counter stats for 'bash -c for i in {1..10000}; do echo hi; done > /dev/null':

                 3      hw_interrupts.received

I can't yet figure out how to do it, but I'll update the thread if I do figure out how to.

I'd be interested in seeing that the perf API does eventually cover this and other events given by perf list in an intuitive way.

Failing tests with -test.count=10

Tested on master just now @ 46e3c14.

$ go test -c
$ sudo ./perf.test -test.count=10
--- FAIL: TestCount (0.15s)
    --- FAIL: TestCount/Software (0.00s)
        --- FAIL: TestCount/Software/PageFaults (0.00s)
            count_test.go:104: PERF_EVENT_IOC_DISABLE: bad file descriptor
--- FAIL: TestCount (0.12s)
    --- FAIL: TestCount/Software (0.00s)
        --- FAIL: TestCount/Software/PageFaults (0.00s)
            count_test.go:104: PERF_EVENT_IOC_DISABLE: bad file descriptor
--- FAIL: TestReadRecord (0.24s)
    --- FAIL: TestReadRecord/Comm (0.05s)
        record_test.go:635: got context deadline exceeded, want valid record
--- FAIL: TestCount (0.11s)
    --- FAIL: TestCount/Software (0.00s)
        --- FAIL: TestCount/Software/PageFaults (0.00s)
            count_test.go:104: PERF_EVENT_IOC_DISABLE: bad file descriptor
    --- FAIL: TestCount/IoctlAndCountIDsMatch (0.02s)
        count_test.go:368: didn't see a page fault
--- FAIL: TestCount (0.09s)
    --- FAIL: TestCount/Software (0.00s)
        --- FAIL: TestCount/Software/PageFaults (0.00s)
            count_test.go:104: PERF_EVENT_IOC_DISABLE: bad file descriptor
    --- FAIL: TestCount/IoctlAndCountIDsMatch (0.02s)
        count_test.go:368: didn't see a page fault
--- FAIL: TestReadRecord (0.24s)
    --- FAIL: TestReadRecord/Comm (0.05s)
        record_test.go:635: got context deadline exceeded, want valid record
--- FAIL: TestCount (0.10s)
    --- FAIL: TestCount/Software (0.01s)
        --- FAIL: TestCount/Software/PageFaults (0.01s)
            count_test.go:104: PERF_EVENT_IOC_DISABLE: bad file descriptor
    --- FAIL: TestCount/IoctlAndCountIDsMatch (0.02s)
        count_test.go:368: didn't see a page fault
--- FAIL: TestReadRecord (0.25s)
    --- FAIL: TestReadRecord/Comm (0.05s)
        record_test.go:635: got context deadline exceeded, want valid record
--- FAIL: TestCount (0.09s)
    --- FAIL: TestCount/Software (0.00s)
        --- FAIL: TestCount/Software/PageFaults (0.00s)
            count_test.go:104: PERF_EVENT_IOC_DISABLE: bad file descriptor
    --- FAIL: TestCount/IoctlAndCountIDsMatch (0.01s)
        count_test.go:368: didn't see a page fault
--- FAIL: TestReadRecord (0.27s)
    --- FAIL: TestReadRecord/Comm (0.05s)
        record_test.go:635: got context deadline exceeded, want valid record
--- FAIL: TestCount (0.09s)
    --- FAIL: TestCount/Software (0.01s)
        --- FAIL: TestCount/Software/PageFaults (0.01s)
            count_test.go:104: PERF_EVENT_IOC_DISABLE: bad file descriptor
    --- FAIL: TestCount/IoctlAndCountIDsMatch (0.02s)
        count_test.go:368: didn't see a page fault
--- FAIL: TestReadRecord (0.30s)
    --- FAIL: TestReadRecord/Comm (0.05s)
        record_test.go:635: got context deadline exceeded, want valid record
--- FAIL: TestCount (0.09s)
    --- FAIL: TestCount/Software (0.01s)
        --- FAIL: TestCount/Software/PageFaults (0.01s)
            count_test.go:104: PERF_EVENT_IOC_DISABLE: bad file descriptor
    --- FAIL: TestCount/IoctlAndCountIDsMatch (0.02s)
        count_test.go:368: didn't see a page fault
--- FAIL: TestReadRecord (0.26s)
    --- FAIL: TestReadRecord/Comm (0.06s)
        record_test.go:635: got context deadline exceeded, want valid record
--- FAIL: TestCount (0.10s)
    --- FAIL: TestCount/Software (0.00s)
        --- FAIL: TestCount/Software/PageFaults (0.00s)
            count_test.go:104: PERF_EVENT_IOC_DISABLE: bad file descriptor
    --- FAIL: TestCount/IoctlAndCountIDsMatch (0.02s)
        count_test.go:368: didn't see a page fault
--- FAIL: TestReadRecord (0.28s)
    --- FAIL: TestReadRecord/Comm (0.05s)
        record_test.go:635: got context deadline exceeded, want valid record
FAIL

Should perf.Tracepoint work as part of a group?

I had a group filled with hardware counters, then I added a Tracepoint for sys_enter_write to the group. I always got zero for my tracepoint counter, even though I was not expecting this.

Subsequently, I made a group only containing tracepoint counters, but then I get:

perf: empty event group

Here's a reproducer:

`main.go`:

package main

import (
	"fmt"
	"log"
	"os"

	"acln.ro/perf"
)

func main() {
	var g perf.Group
	g.CountFormat = perf.CountFormat{}
	g.Options.ExcludeKernel = true
	g.Options.ExcludeHypervisor = true

	tp := perf.Tracepoint("syscalls", "sys_enter_write")
	g.Add(tp)

	counts, err := g.Open(perf.CallingThread, perf.AnyCPU)
	if err != nil {
		log.Fatal(err)
	}

	c, err := counts.MeasureGroup(func() {
		os.Stdout.WriteString("Hi\n")
	})
	if err != nil {
		log.Fatal(err)
	}
	for _, v := range c.Values {
		fmt.Println(v)
	}
}

Missing events due to early return on perfhup

I've been messing around with perf, using inherit to trace a process tree.

What I've found is that when the process tree exits, we get HUP on the poll, and whatever events are in the kernel buffer at that time are lost.

I'm not sure I fully understand what's going on, but I've found anecdotally that here:

perf/record.go, lines 130 to 133 in 6861f4b:

	if resp.perfhup {
		// Saw POLLHUP on ev.perffd. See also the
		// documentation for ErrDisabled.
		return ErrDisabled

If I insert:

if ev.readRawRecordNonblock(raw) {
	return nil
}

Before return ErrDisabled, then it appears the events are not lost.
