joaomlneto / unstickymem
Library for Dynamic Placement in NUMA Nodes
Currently we only have a `README.md` in the root directory.
Not only is this kind of limiting, but right now it's also completely outdated.
Before making things public (after the publication), we should document it properly.
Sections:
- (… `LD_PRELOAD` or linking in the source code)

The decision-making algorithm does the following:
- (… `numactl --interleave=all` for now)
- … `N` measurements of the memory stall rate, `T` microseconds apart.
- … `k` measurements (the `k/2` smallest and `k/2` greatest).

`unstickymem/src/unstickymem/unstickymem.cpp`, lines 113 to 122 in 2d6cdef
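The measurement steps listed above (`N` samples, with the `k/2` smallest and `k/2` greatest set aside) can be sketched as a trimmed mean. This is only a sketch under assumptions, not the library's code: the name `trimmed_mean` is hypothetical, the samples are passed in instead of being read from performance counters, and averaging the remaining samples is an assumption, since the text only says which samples are singled out.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <stdexcept>
#include <vector>

// Hypothetical helper: drops the k/2 smallest and k/2 greatest samples,
// then averages what is left (the averaging step is an assumption).
double trimmed_mean(std::vector<double> samples, std::size_t k) {
  const std::size_t half = k / 2;
  if (samples.size() <= 2 * half) {
    throw std::invalid_argument("too few samples for the requested trim");
  }
  std::sort(samples.begin(), samples.end());
  const auto first = samples.begin() + static_cast<std::ptrdiff_t>(half);
  const auto last = samples.end() - static_cast<std::ptrdiff_t>(half);
  return std::accumulate(first, last, 0.0) / static_cast<double>(last - first);
}
```

With samples `{1, 9, 2, 4, 3}` and `k = 2`, the values 1 and 9 are dropped and the result is the mean of 2, 3 and 4.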
Calling `place_pages` with the same arguments results in extra pages being moved into the local node.
We wanted to improve it anyway -- rewrite!!
Currently we assume there is 1 `local`/worker node.
The function takes one parameter (`%local`) that places `local%` of the segment in the local node; the remainder is interleaved between the three other nodes.
Provide another version that can have different weights for each node.
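A per-node weighted version could start from a helper that turns weights into page counts. The sketch below is an assumption about how that might look, not existing code: `split_by_weight` is a hypothetical name, weights are plain unsigned integers, and pages lost to rounding down are handed out one per node in index order.

```cpp
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <stdexcept>
#include <vector>

// Hypothetical helper: splits `total_pages` across nodes in proportion to
// `weights` (one integer weight per node). The counts always sum to
// `total_pages`.
std::vector<std::size_t> split_by_weight(std::size_t total_pages,
                                         const std::vector<unsigned>& weights) {
  const std::uint64_t sum =
      std::accumulate(weights.begin(), weights.end(), std::uint64_t{0});
  if (weights.empty() || sum == 0) {
    throw std::invalid_argument("need at least one non-zero weight");
  }
  std::vector<std::size_t> pages(weights.size());
  std::size_t assigned = 0;
  for (std::size_t i = 0; i < weights.size(); ++i) {
    pages[i] = static_cast<std::size_t>(total_pages * weights[i] / sum);
    assigned += pages[i];
  }
  // Hand out the pages lost to integer division, one per node in order.
  for (std::size_t i = 0; assigned < total_pages; i = (i + 1) % pages.size()) {
    ++pages[i];
    ++assigned;
  }
  return pages;
}
```

The current single-parameter behaviour on four nodes would correspond to weights such as `{40, 20, 20, 20}` for a 40% local ratio.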
The algorithm should treat all contiguous mapped memory as blocks of N pages.
It should place x% of each block in the local node and the remaining (100-x)% interleaved across the other nodes.
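To make the per-block idea concrete, here is a minimal sketch that only computes the target node of each page in one block. Everything in it is an assumption for illustration: `plan_block` is a hypothetical name, the local share is taken from the start of the block, and the remaining pages are assigned round-robin to the other nodes. Actually moving the pages (for example with `mbind` or `move_pages`) is left out.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical planner: returns the target node for each page of one block.
// The first `local_percent`% of the pages go to `local_node`; the rest are
// assigned round-robin to the other nodes.
std::vector<int> plan_block(std::size_t pages_per_block, unsigned local_percent,
                            int local_node, int num_nodes) {
  std::vector<int> target(pages_per_block, local_node);
  if (num_nodes < 2 || local_percent >= 100) return target;
  const std::size_t local_pages = pages_per_block * local_percent / 100;
  int next = 0;
  for (std::size_t i = local_pages; i < pages_per_block; ++i) {
    if (next == local_node) next = (next + 1) % num_nodes;  // skip local node
    target[i] = next;
    next = (next + 1) % num_nodes;
  }
  return target;
}
```

For a block of 8 pages, 50% local, local node 0 and 4 nodes, the plan is `0 0 0 0 1 2 3 1`.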
Measurements at 2018-10-10 on intel14cores-v2
| Local Ratio | 0 | 0-1 | 0-2 | 0-3 | 0-4 | 0-5 | 0-6 |
|---|---|---|---|---|---|---|---|
| 25% | 0.8652997439 | 0.8523260392 | 0.8057950605 | 0.7694768156 | 0.7407080060 | 0.7151234701 | 0.6881931666 |
| 30% | 0.8634031441 | 0.8474194553 | 0.7984094285 | 0.7614674645 | 0.7408262380 | 0.7148938137 | 0.6872798011 |
| 35% | 0.8590710471 | 0.8382672294 | 0.7844607808 | 0.7465582615 | 0.7412492110 | 0.7140231972 | 0.6868939998 |
| 40% | 0.8534129592 | 0.8265852018 | 0.7677881273 | 0.7427290218 | 0.7413086173 | 0.7148290723 | 0.6826531546 |
| 45% | 0.8473018202 | 0.8161373897 | 0.7703117436 | 0.7470264965 | 0.7344344647 | 0.6940479186 | |
| 50% | 0.8417095835 | 0.8201595776 | 0.7372542016 | | | | |
| 55% | 0.8369322683 | | | | | | |
| 60% | 0.8334777535 | | | | | | |
| 65% | 0.8316501186 | | | | | | |
| 70% | 0.8305620997 | | | | | | |
| 75% | 0.8302131814 | | | | | | |
| 80% | 0.8301691608 | | | | | | |
| 85% | 0.8301030693 | | | | | | |
| 90% | 0.8299428306 | | | | | | |
| 95% | 0.8299722645 | | | | | | |
| 100% | 0.8300216876 | | | | | | |
| Minimum Stall Observed | 0.8299428306 | 0.8161373897 | 0.7677881273 | 0.7427290218 | 0.7344344647 | 0.7140231972 | 0.6826531546 |
| Optimal Local Ratio | 90% | 45% | 40% | 40% | 45% | 35% | 40% |
| Decision (local ratio) | 100% | 50% | 45% | 45% | 50% | 40% | 45% |
| Per-Node Throughput | 3954 MB/s | 3827 MB/s | 3526 MB/s | 3424 MB/s | 3364 MB/s | 3174 MB/s | 2960 MB/s |
Currently we only support checking the memory map once and acting upon it.
If the memory mapping changes and we run again, `place_pages` is not idempotent (see #9 for an explanation and example).
@gureya which version of Streamcluster are you using?
I recall comments about multiple versions existing during the last Skype call.
Currently we will just crash if we're running on a machine with a single NUMA node.
We should simply show a warning and do nothing.
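A guard along these lines might be enough. It is a sketch under assumptions rather than the library's code: both function names are hypothetical, and the node count is taken from the `node<N>` entries that Linux exposes under `/sys/devices/system/node` (an alternative would be to ask libnuma).

```cpp
#include <cctype>
#include <cstdio>
#include <cstring>
#include <dirent.h>

// Hypothetical helper: counts "node<N>" entries in the sysfs NUMA directory.
// Returns 0 if the directory cannot be opened.
int count_numa_nodes(const char* sysfs_dir = "/sys/devices/system/node") {
  DIR* dir = opendir(sysfs_dir);
  if (dir == nullptr) return 0;
  int count = 0;
  while (const dirent* entry = readdir(dir)) {
    const char* name = entry->d_name;
    if (std::strncmp(name, "node", 4) == 0 &&
        std::isdigit(static_cast<unsigned char>(name[4]))) {
      ++count;
    }
  }
  closedir(dir);
  return count;
}

// Hypothetical guard: warns and disables placement instead of crashing.
bool placement_enabled(int numa_nodes) {
  if (numa_nodes < 2) {
    std::fprintf(stderr,
                 "warning: fewer than two NUMA nodes detected, "
                 "page placement disabled\n");
    return false;
  }
  return true;
}
```

The library constructor could then return early when `placement_enabled(count_numa_nodes())` is false.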
The algorithm should treat all the regions as one, placing x% in the local node and the remaining (100-x)% interleaved across the other nodes.
Write some basic documentation for users and future contributors
The mechanism to avoid stopping due to noise is simply not working.
`unstickymem/src/unstickymem/mode/WeightedAdaptiveMode.cpp`, lines 94 to 98 in ca4533c
If we get a higher stall rate, we trigger this confirmation mechanism. However, we then stop the adaptive placement regardless of whether the stall rate turns out to be lower.
This needs to be fixed in all adaptive mechanisms (weighted and uniform).
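One reading of the intended behaviour is sketched below: a higher stall rate only triggers a second measurement, and the loop stops only if that second measurement is also higher. The function name and the `measure` callback are hypothetical, and the real code in the adaptive modes may be structured differently.

```cpp
#include <functional>

// Hypothetical stop check for one step of the adaptive loop.
// `measure` is an assumed callback that returns a fresh stall-rate sample.
bool should_stop(double previous_stall, double current_stall,
                 const std::function<double()>& measure) {
  if (current_stall <= previous_stall) return false;  // still improving
  // The stall rate went up: take a confirmation measurement first.
  const double confirmed = measure();
  // Stop only if the confirmation also shows a higher stall rate.
  return confirmed > previous_stall;
}
```

The reported bug corresponds to returning `true` after the confirmation measurement no matter what it shows.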
We need to find a way of checking what data segments exist, in order to pass them to `mbind`.
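One option is to read `/proc/self/maps`, where each line starts with `start-end perms`. The sketch below parses a single such line; the `Segment` struct and `parse_maps_line` are hypothetical names, and only the address range and the write permission are extracted.

```cpp
#include <cstdio>

// Hypothetical record for one mapped segment.
struct Segment {
  unsigned long long start;
  unsigned long long end;
  bool writable;
};

// Parses one line of /proc/self/maps, e.g.
// "7f0000000000-7f0000021000 rw-p 00000000 00:00 0".
// Returns false if the line does not start with "start-end perms".
bool parse_maps_line(const char* line, Segment* out) {
  unsigned long long start = 0;
  unsigned long long end = 0;
  char perms[5] = {0};
  if (std::sscanf(line, "%llx-%llx %4s", &start, &end, perms) != 3) {
    return false;
  }
  out->start = start;
  out->end = end;
  out->writable = (perms[1] == 'w');
  return true;
}
```

Reading the file with `std::ifstream` and `std::getline` and feeding each line to this function would give the list of candidate ranges.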
- … `mmap`/`malloc` change depending on the number of mappings?

We should try and do some functional tests on parts of the library.
Not an easy task, given the nature of the library.
Old bug, but ideally we want to avoid running `setup-counters-$(arch)` all the time.
Probably a good time to learn about packaging and stuff :-)
Currently it will work on Broadwell/Haswell (which is the case of `intel14cores`).
Intel Performance Counters:
AMD Performance Counters:
Changes are being tracked on the perf branch
As memory goes from an `interleaved` state to a `local` state, it would be nice to offer functionality to make it go the opposite way.
Imagine a new user with a brand new NUMA machine and a default install of e.g. Ubuntu -- how should they install/run the application? The documentation should cover package dependencies and a comprehensive list of commands.
If we're dealing with an application that makes many small allocations, the algorithm in `WeightedAdaptiveMode` won't work.
@gureya has a fix ready in his branch :-P
I'm just writing it here not to forget we had a problem, and to (try and) make a unit test later.
The fix is basically:
- … `MPOL_INTERLEAVE` on all nodes in the library constructor
- … `MemoryMap` until the `WeightedAdaptive` mode starts

The side effect is that the segments will be uniformly interleaved until the mode starts, although we don't expect this to be significant.
We want to be able to know the set of all existing segments to apply `mbind` on.
To this end, we'll intercept standard C/Linux functions that deal with memory mappings/allocations:
- `mmap`
- `munmap`
- `sbrk`
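Whatever the interception layer ends up looking like, it needs somewhere to record the segments. A minimal sketch of such bookkeeping follows; `SegmentRegistry` and its methods are hypothetical, the hooks that would call it are not shown, and it ignores partial unmaps, `sbrk`, and thread safety.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical bookkeeping that intercepted mmap/munmap calls could update.
class SegmentRegistry {
 public:
  // Record a new mapping (would be called after a successful mmap).
  void on_mmap(std::uintptr_t addr, std::size_t length) {
    segments_[addr] = length;
  }

  // Forget a mapping (would be called after a successful munmap).
  // Assumes whole segments are unmapped at their start address.
  void on_munmap(std::uintptr_t addr) { segments_.erase(addr); }

  std::size_t size() const { return segments_.size(); }

  std::size_t total_bytes() const {
    std::size_t total = 0;
    for (const auto& entry : segments_) total += entry.second;
    return total;
  }

 private:
  std::map<std::uintptr_t, std::size_t> segments_;  // start address -> length
};
```

Iterating over the map yields the address ranges that would be handed to `mbind`.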
This mechanism has significant advantages over reading `/proc/self/maps` (#11):
- … `/proc/self/maps` file.

Actually, they can stay -- but the application should be able to effortlessly generate them.
There should be a background thread that does things automatically (sleep, measure, decide new weights, apply mbind).
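The body of such a thread is essentially a control loop over the four steps named above. Here is a sketch of that loop only, with the steps injected as callbacks; `Steps` and `run_loop` are hypothetical names, the decision is reduced to a single local percentage, and the iteration count stands in for a real stop condition.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical bundle of the steps the loop performs each period.
struct Steps {
  std::function<double()> measure;   // returns the current stall rate
  std::function<int(double)> decide; // maps a stall rate to a local percentage
  std::function<void(int)> apply;    // applies the placement (e.g. via mbind)
};

// Runs sleep -> measure -> decide -> apply for a fixed number of iterations
// and returns the last decision.
int run_loop(const Steps& steps, int iterations,
             std::chrono::milliseconds period) {
  int local_percent = 0;
  for (int i = 0; i < iterations; ++i) {
    std::this_thread::sleep_for(period);
    const double stall = steps.measure();
    local_percent = steps.decide(stall);
    steps.apply(local_percent);
  }
  return local_percent;
}
```

In the library this function would presumably run on a `std::thread` started from the constructor and loop until a stop flag is set, rather than for a fixed count.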
Change library name to match the one on the paper.
The way we are dealing with algorithms won't scale and quickly becomes a mess.
Instead of spawning a thread that (currently) has a `scan` mode and an `adaptive` mode, refactor this into a Factory/Strategy pattern.
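A rough shape for that refactoring is sketched below. All names (`Mode`, `ScanMode`, `AdaptiveMode`, `make_mode`) are hypothetical and the `run` bodies are left empty; the point is only that each algorithm becomes its own class behind a common interface, selected by name.

```cpp
#include <memory>
#include <stdexcept>
#include <string>

// Strategy interface: one implementation per placement algorithm.
class Mode {
 public:
  virtual ~Mode() = default;
  virtual std::string name() const = 0;
  virtual void run() = 0;
};

class ScanMode : public Mode {
 public:
  std::string name() const override { return "scan"; }
  void run() override { /* the scan algorithm would go here */ }
};

class AdaptiveMode : public Mode {
 public:
  std::string name() const override { return "adaptive"; }
  void run() override { /* the adaptive algorithm would go here */ }
};

// Factory: picks the strategy by name, e.g. from an environment variable.
std::unique_ptr<Mode> make_mode(const std::string& name) {
  if (name == "scan") return std::unique_ptr<Mode>(new ScanMode());
  if (name == "adaptive") return std::unique_ptr<Mode>(new AdaptiveMode());
  throw std::invalid_argument("unknown mode: " + name);
}
```

Adding a new algorithm then means adding one class and one line in the factory, without touching the thread that calls `run()`.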
Evaluation of the Scalar Penta-diagonal solver of NAS Parallel Benchmarks.