likwid's Issues

Change the static group configuration to a dynamic variant

Groups can only be set statically for all architectures. This
is inflexible because some architectures have more counters or other events.

Solution:

Implement a dynamic group approach.
Add a routine to ask for the supported groups.
The API can be implemented with function pointers.
These are initialized according to the architecture.
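
A minimal sketch of how such a function pointer table might look. The architecture IDs, group tables and all function names are hypothetical and only illustrate the idea; they are not taken from the likwid sources.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical architecture IDs and group tables, for illustration only. */
typedef enum { ARCH_CORE2, ARCH_NEHALEM } Arch;

static const char *const core2_groups[]   = { "FLOPS_DP", "L2" };
static const char *const nehalem_groups[] = { "FLOPS_DP", "L2", "MEM" };

/* Operations that differ per architecture, reached through function pointers. */
typedef struct {
    size_t      (*num_groups)(void);
    const char *(*group_name)(size_t index);
} GroupOps;

static size_t core2_num_groups(void)
{
    return sizeof core2_groups / sizeof core2_groups[0];
}

static const char *core2_group_name(size_t i)
{
    return i < core2_num_groups() ? core2_groups[i] : NULL;
}

static size_t nehalem_num_groups(void)
{
    return sizeof nehalem_groups / sizeof nehalem_groups[0];
}

static const char *nehalem_group_name(size_t i)
{
    return i < nehalem_num_groups() ? nehalem_groups[i] : NULL;
}

/* Initialize the function pointers according to the architecture. */
static GroupOps group_ops_init(Arch arch)
{
    GroupOps ops = { core2_num_groups, core2_group_name };
    if (arch == ARCH_NEHALEM) {
        ops.num_groups = nehalem_num_groups;
        ops.group_name = nehalem_group_name;
    }
    return ops;
}

/* Routine to ask whether a group is supported; returns 1 or 0. */
static int group_supported(const GroupOps *ops, const char *name)
{
    size_t i;
    assert(ops != NULL && name != NULL);
    for (i = 0; i < ops->num_groups(); i++) {
        if (strcmp(ops->group_name(i), name) == 0)
            return 1;
    }
    return 0;
}
```

The caller only ever touches GroupOps, so adding an architecture means adding one table and two small functions.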

Original issue reported on code.google.com by [email protected] on 9 Oct 2009 at 6:18

Rework the Marker API

The marker API could be simpler to use.

Analyse the requirements, then specify and implement a revised API.

Original issue reported on code.google.com by [email protected] on 3 Dec 2010 at 9:49

Allow setting CMASK and INVERT on Intel processors

For many events, counting cycles requires initializing the CMASK and INVERT
fields in addition to the event ID and umask.

Plan:

Analyze the implications on the code. 
Implement for Core 2 and Nehalem.
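
For reference, a sketch of how the two fields fit into the event select register value. The bit positions follow the architectural IA32_PERFEVTSELx layout (event select in bits 0-7, umask in bits 8-15, USR bit 16, EN bit 22, INV bit 23, CMASK in bits 24-31). The struct and function names are hypothetical. With CMASK = n the counter increments in every cycle in which at least n events occur; INVERT flips the comparison to fewer than n.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event description; not the likwid data structure. */
typedef struct {
    uint8_t event_id;
    uint8_t umask;
    uint8_t cmask;   /* counter mask threshold, 0 disables the comparison */
    int     invert;  /* non-zero sets the INV bit */
} EventConfig;

/* Build the lower 32 bits of an IA32_PERFEVTSELx value for user-mode counting. */
static uint32_t evtsel_encode(const EventConfig *cfg)
{
    uint32_t value = 0;
    assert(cfg != NULL);
    value |= (uint32_t)cfg->event_id;        /* bits 0-7:   event select */
    value |= (uint32_t)cfg->umask << 8;      /* bits 8-15:  unit mask */
    value |= UINT32_C(1) << 16;              /* bit 16:     USR, count in user mode */
    value |= UINT32_C(1) << 22;              /* bit 22:     EN, enable the counter */
    if (cfg->invert)
        value |= UINT32_C(1) << 23;          /* bit 23:     INV */
    value |= (uint32_t)cfg->cmask << 24;     /* bits 24-31: CMASK */
    return value;
}
```

The event ID and umask passed in are up to the caller; the sketch does not check whether a given event actually supports CMASK or INVERT.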

Original issue reported on code.google.com by [email protected] on 16 Jun 2010 at 7:50

TODO release 2.0

Group ticket for release 2.0.

Open points:
* Formal review
* Testing
* Final benchmark set for likwid-bench


Original issue reported on code.google.com by [email protected] on 20 Aug 2010 at 8:08

Suppress clock measure in perfctr

The clock measurement at the beginning of every start takes some time.
On automated runs it should be possible to turn it off.

1. Evaluate how far the duration can be reduced without losing accuracy.
2. Add an option to turn it off.


Original issue reported on code.google.com by [email protected] on 28 Apr 2011 at 1:57

likwid-pin -c 0,1 complains about .so not found

What steps will reproduce the problem?
1. likwid-pin -c 0,1 ls

What is the expected output? What do you see instead?

Expected output: Message from likwid-pin about pinning, then output from ls.

Actual output:

[likwid-pin] Main PID -> core 0 - OK
ERROR: ld.so: object '/home.local/jan/bin/lib/liblikwidpin.so' from
LD_PRELOAD cannot be preloaded: ignored.
... and then the output from ls.


Original issue reported on code.google.com by [email protected] on 5 Feb 2010 at 11:45

Socket lock in Marker API does not work for all architectures

There is a socket lock for turning the Nehalem Uncore events on and off.
Still, with the Marker API different cores might accumulate the results
across multiple runs.

Solution:

Either force one core to turn the counters on and off, or sum up all core
results later in the result presentation.
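
The second option could look roughly like the sketch below: whichever core happened to accumulate the Uncore counts, the per-socket total is the sum over all cores of that socket. The function name and the flat arrays are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Sum the counts of all cores that belong to one socket.
 * core_counts[i] is the count accumulated by core i,
 * core_socket[i] is the socket that core i belongs to.
 */
static uint64_t socket_total(const uint64_t *core_counts,
                             const int *core_socket,
                             size_t num_cores,
                             int socket)
{
    uint64_t total = 0;
    size_t i;
    assert(core_counts != NULL && core_socket != NULL);
    for (i = 0; i < num_cores; i++) {
        if (core_socket[i] == socket)
            total += core_counts[i];
    }
    return total;
}
```

This only gives the right answer if each Uncore event is counted by exactly one core at a time, which is what the socket lock is meant to guarantee.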

Original issue reported on code.google.com by [email protected] on 30 May 2010 at 7:40

make install target

Add a make install target.
Put all configurable make settings in a separate file.

Original issue reported on code.google.com by [email protected] on 4 Nov 2009 at 6:00

likwid-pin always pinning to core 0

$ export OMP_NUM_THREADS=4  # dunnington1
$ echo $KMP_AFFINITY 
disabled
$ likwid-pin -t intel_omp -c 0,3,6,9  ./stream_omp_NT.exe
[likwid-pin] Main PID -> core 0 - OK
[...]
[pthread wrapper] [pthread wrapper] PIN_MASK: 0->9  
[pthread wrapper] SKIP MASK: 0x0
[pthread wrapper 0] Notice: Using libpthread.so.0 
        threadid 1073809728 -> core 9 - OK
[pthread wrapper 1] Notice: Using libpthread.so.0 
        threadid 1078008128 -> core 0 - OK
[pthread wrapper 2] Notice: Using libpthread.so.0 
        threadid 1082206528 -> core 0 - OK
[pthread wrapper 3] Notice: Using libpthread.so.0 
        threadid 1086404928 -> core 0 - OK
---------------------

In fact, all 4 application threads are running on core 0.
The only one on core 9 is the shepherd thread.

Original issue reported on code.google.com by [email protected] on 5 Feb 2010 at 2:15

NUMA topology wrong on Magny Cours

Up to now the sysfs file cpumap was used. I did not find any documentation
about these files. This will be changed to use the cpulist file, which is a
plain integer list.


Original issue reported on code.google.com by [email protected] on 19 Sep 2010 at 2:30

Reduce processor specific code

At the moment there is much redundant processor specific code.

Solution:

Reduce the processor specific parts to data configurations which are
processed by generic routines.


Original issue reported on code.google.com by [email protected] on 25 Mar 2010 at 8:24

Label for SMT is too wide

Label for enabled SMT is too wide for box.

Solution:

1. Make the label width variable, or
2. Increase the box width

Original issue reported on code.google.com by [email protected] on 28 Sep 2009 at 8:19

Extend skip mask to varying length

At the moment the skip mask only supports up to 64 threads due to the usage of 
64 bit integers.

Solution:

* Provide a solution where everything above 64 threads is truncated.
* Use a bitset of configurable size
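
A sketch of the second option, a bitset whose size is chosen at run time. The type and function names are made up for this example. Threads beyond the configured size are reported as not skipped, which is an assumption about the desired behavior.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SKIPMASK_WORD_BITS 64u

/* Hypothetical skip mask with one bit per thread. */
typedef struct {
    size_t    nbits;
    uint64_t *words;
} SkipMask;

/* Allocate a zeroed mask for nbits threads; returns 0 on success, -1 on failure. */
static int skipmask_init(SkipMask *mask, size_t nbits)
{
    size_t nwords;
    assert(mask != NULL && nbits > 0);
    nwords = (nbits + SKIPMASK_WORD_BITS - 1) / SKIPMASK_WORD_BITS;
    mask->nbits = nbits;
    mask->words = calloc(nwords, sizeof *mask->words);
    return mask->words != NULL ? 0 : -1;
}

/* Mark a thread as skipped; returns -1 if the thread index is out of range. */
static int skipmask_set(SkipMask *mask, size_t thread)
{
    if (thread >= mask->nbits)
        return -1;
    mask->words[thread / SKIPMASK_WORD_BITS] |=
        UINT64_C(1) << (thread % SKIPMASK_WORD_BITS);
    return 0;
}

/* Returns 1 if the thread is marked as skipped, 0 otherwise. */
static int skipmask_test(const SkipMask *mask, size_t thread)
{
    if (thread >= mask->nbits)
        return 0;
    return (int)((mask->words[thread / SKIPMASK_WORD_BITS]
                  >> (thread % SKIPMASK_WORD_BITS)) & 1u);
}

static void skipmask_free(SkipMask *mask)
{
    free(mask->words);
    mask->words = NULL;
    mask->nbits = 0;
}
```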


Original issue reported on code.google.com by [email protected] on 26 Nov 2010 at 1:20

Refine and validate Groups in perfCtr

The performance groups in perfCtr are not validated.
This is especially true for Intel Nehalem.

Solution:

Validate the existing groups with the implemented events.
Think about new metrics for performance groups.


Original issue reported on code.google.com by [email protected] on 28 Sep 2009 at 8:24

Support continuous measurement

likwid-perfCtr should support a mode where the counters run during execution
of the application and are read out and printed at configured time steps.
This makes it possible to generate graphs of events or derived metrics over
an application's execution time.

Tasks:
1. Adapt the multiplex module to allow continuous measurements.
2. Extend the architectures code by a routine to just read out results.
3. Provide output and computations of raw events and derived metrics.
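
For task 3, the per-interval computation might look like the sketch below: the counters keep running, so each time step reports the difference to the previous reading. The function names are hypothetical, and the counter width is passed in as a parameter because it depends on the architecture.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Difference between two readings of a free-running counter that is
 * width_bits wide. Masking the difference handles a single wrap-around
 * between the two readings.
 */
static uint64_t counter_delta(uint64_t prev, uint64_t cur, unsigned width_bits)
{
    uint64_t mask;
    assert(width_bits >= 1 && width_bits <= 64);
    mask = (width_bits == 64) ? UINT64_MAX
                              : ((UINT64_C(1) << width_bits) - 1);
    return (cur - prev) & mask;
}

/* Derived metric for one time step: events per second. */
static double event_rate(uint64_t delta, double interval_seconds)
{
    assert(interval_seconds > 0.0);
    return (double)delta / interval_seconds;
}
```

Printing one line per time step with the raw delta and the rate would already be enough input for a plotting tool.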


Original issue reported on code.google.com by [email protected] on 20 Aug 2010 at 7:39

main memory bw measurement -- don't use UNC_L3_LINES_IN_ANY

Just found likwid (i.e. minutes ago) and I am very happy to see it.  I need to 
measure main memory traffic on a dual socket Nehalem system (i.e. across both 
sockets).  After investigating all the performance counters, it appears that 
some combination of the following is what I want:
UNC_QHL_REQUESTS.LOCAL_(READS|WRITES)
UNC_IMC_NORMAL_READS.ANY
UNC_IMS_WRITES_FULL.ANY

On to my question then -- in reading your wiki pages for Nehalem, I see that 
the performance group "MEM" uses UNC_L3_LINES_IN_ANY as part of its bandwidth 
measurement.  I believe that this counter will count the allocation of a cache 
line in any state, i.e., including modified or exclusive.  This means that if I 
assign a value without first reading it, you would incorrectly count this as 
part of the memory traffic.

This is relevant, in (e.g.), my particular case which is sparse matrix vector 
multiply (SpMV).  The inner loop of SpMV accumulates the dot product of one row 
in the matrix with the vector operand.  The accumulation is held in a processor 
register (ideally).  Once finished, the value is simply written into the 
destination vector, i.e., so as to avoid a write-miss.  Thus, L3 cache should 
allocate a line in the modified state without performing any DRAM access, thus 
defeating the use of UNC_L3_LINES_IN_ANY as a useful counter for memory traffic.

Finally, I must admit that I am at the moment simply searching for a good 
solution.  If the above is wrong, then please simply let me know :)

As a related question:  can likwid access all of the uncore performance 
counters (if yes, then you can add this to your brag sheet as perf cannot do 
this AFAIK).

Thank you,
Pete Stevenson

Original issue reported on code.google.com by [email protected] on 5 Jun 2011 at 11:28

Master ticket for Release V1

This ticket keeps track of the TODOs for the V1 release.
There is already a named branch for this release.

Current open issues:
1. Port P6 and Pentium M to new perfmon module
2. Write test procedure document for release
3. Update and Review Documentation (man pages and WIKI)
4. Provide examples in WIKI for marker use case
5. Review, extend and validate the groups for Core 2 and Nehalem


Original issue reported on code.google.com by [email protected] on 29 Apr 2010 at 12:42

Allow entering the core IDs as a range

Problem:

At the moment the core IDs can only be given as a comma-separated list.
For high core counts it is convenient to enter ranges, e.g. 0-7.

Fix:

Implement ranges in the command line handling of the perfCtr main routine.
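
A possible shape for the parser, as a sketch only. It accepts single IDs and ranges mixed in one comma-separated list, for example 0,2,4-7. The function name and the error convention (return -1) are assumptions for this example.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Expand a list such as "0,2,4-7" into ids[].
 * Returns the number of IDs written, or -1 on a syntax error,
 * a descending range, or when more than max_ids would be needed.
 */
static int parse_core_list(const char *spec, int *ids, int max_ids)
{
    const char *p = spec;
    int count = 0;

    assert(spec != NULL && ids != NULL);
    while (*p != '\0') {
        char *end;
        long first, last, id;

        if (*p < '0' || *p > '9')
            return -1;
        first = strtol(p, &end, 10);
        last = first;
        p = end;

        if (*p == '-') {                 /* range: first-last */
            p++;
            if (*p < '0' || *p > '9')
                return -1;
            last = strtol(p, &end, 10);
            p = end;
        }
        if (last < first || first >= INT_MAX || last >= INT_MAX)
            return -1;

        for (id = first; id <= last; id++) {
            if (count >= max_ids)
                return -1;
            ids[count++] = (int)id;
        }

        if (*p == ',') {
            p++;
            if (*p == '\0')              /* trailing comma */
                return -1;
        } else if (*p != '\0') {
            return -1;
        }
    }
    return count;
}
```

With this, 0-7 expands to the eight IDs 0 through 7, and the existing comma-separated form keeps working unchanged.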


Original issue reported on code.google.com by [email protected] on 9 Oct 2009 at 6:29

Add support for multiple instances of likwid on one node

At the moment, only one instance of likwid can run on a node
when the marker API is used.

It is useful to be able to start several instances of likwid on a node.

Solution:

Include the pid of either the wrapper process or the application in
the filename of the marker results.
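
A sketch of building such a file name. The directory argument and the name pattern likwid_marker_<pid>.txt are invented for this example and are not the names likwid actually uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write "<dir>/likwid_marker_<pid>.txt" into buf.
 * Returns 0 on success, -1 if the name does not fit into size bytes.
 */
static int marker_file_name(char *buf, size_t size, const char *dir, pid_t pid)
{
    int n;
    assert(buf != NULL && dir != NULL);
    n = snprintf(buf, size, "%s/likwid_marker_%ld.txt", dir, (long)pid);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

The wrapper process would call this with getpid() and hand the resulting name to the application, for instance through an environment variable, so that both sides agree on the file.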


Original issue reported on code.google.com by [email protected] on 22 Mar 2010 at 2:43

Switch to bstring library

Proper string handling is crucial for the security of likwid.
bstring provides a secure, fast and feature-rich interface.


Original issue reported on code.google.com by [email protected] on 14 Jan 2010 at 12:26

marker API for likwid-perfCtr not working

What steps will reproduce the problem?
1. Instrument the code as described in:
http://code.google.com/p/likwid/wiki/Introduction

2. Run the command
./likwid-perfCtr -m  -c 0  -g L2  ./likwid-pin -c 0 <executable>


What is the expected output? What do you see instead?

Expected output: the hardware counter information
Actual output: The message: "WARNING: Number of threads in marker file unequal 
to number of threads in likwid-perfCtr!"

No other output is shown.

What version of the product are you using? On what operating system?
Using 1.0.
On linux


Original issue reported on code.google.com by [email protected] on 13 Jul 2010 at 2:51

Write test procedure for releases

Because likwid consists of multiple applications and supports many
architectures, it is necessary to follow systematic work instructions to test
likwid before releases.

Task:
Write a document with work instructions and checklists to test and review
likwid before releases.


Original issue reported on code.google.com by [email protected] on 15 Jan 2010 at 7:45

Test and fix Nehalem Uncore events

The Uncore events on the Intel Nehalem processors are implemented but
do not yet work reliably.

Solution:

Test and fix the Uncore events.
Add appropriate events to the performance groups.

Original issue reported on code.google.com by [email protected] on 28 Sep 2009 at 8:25

Installation fails for install paths ending with a "/"

What steps will reproduce the problem?
1. Set install path with a slash at the end
2. make && make install
3. likwid-pin will result in an error

What is the expected output? What do you see instead?
Usual likwid-pin behavior.
Instead, no pinning and error:
~/helper/likwid-2.2> likwid-pin -c0-1 uname
[likwid-pin] Main PID -> core 0 - OK
ERROR: ld.so: object '~/helper/likwid-2.2-inst' from LD_PRELOAD cannot be 
preloaded: ignored.   

What version of the product are you using? On what operating system?
2.2, Linux


Original issue reported on code.google.com by [email protected] on 28 Jun 2011 at 6:47

Add check for msr module

A major problem in using the tools occurs when the msr module is not loaded
or the permissions on the device files are not sufficient.

Solution:

Add a check on startup and issue an informative warning if
problems are detected.
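
Such a check could be as small as the sketch below. On Linux the msr driver exposes one device file per CPU, such as /dev/cpu/0/msr; the path is passed in as a parameter here. The enum, the function names and the message texts are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

typedef enum {
    MSR_OK,
    MSR_MISSING,        /* device file does not exist, module probably not loaded */
    MSR_NO_PERMISSION,  /* device file exists but cannot be read and written */
    MSR_OTHER_ERROR
} MsrStatus;

/* Check whether the given msr device file is readable and writable. */
static MsrStatus msr_check(const char *device)
{
    assert(device != NULL);
    if (access(device, R_OK | W_OK) == 0)
        return MSR_OK;
    switch (errno) {
    case ENOENT: return MSR_MISSING;
    case EACCES: return MSR_NO_PERMISSION;
    default:     return MSR_OTHER_ERROR;
    }
}

/* Text for the warning that is printed on startup. */
static const char *msr_status_message(MsrStatus status)
{
    switch (status) {
    case MSR_OK:            return "msr device is accessible";
    case MSR_MISSING:       return "msr device not found; is the msr kernel module loaded?";
    case MSR_NO_PERMISSION: return "insufficient permissions on the msr device file";
    default:                return "msr device could not be checked";
    }
}
```

Note that access() checks against the real user ID, so the result can differ from what a setuid binary would actually be allowed to do.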

Original issue reported on code.google.com by [email protected] on 23 Oct 2009 at 2:17
