unstickymem's Issues

Add Documentation to Wiki

Currently we only have a README.md in the root directory.

Not only is this limiting, but the README is also completely outdated.

Before making the repository public (after the publication), we should document it properly.

Sections:

  • How to install
  • How to compile
  • How to run (LD_PRELOAD or linking in the source code)
  • Library modes
  • Library options (configuration file, environment variables)

Interleave bug

A bug in the interleaving prevents memory from being distributed uniformly among the nodes.

[Screenshot, 2018-10-08 at 12:30:04]

Improve the decision-making algorithm

The decision-making algorithm does the following:

Initialization steps

  1. Interleave all memory objects (must use numactl --interleave=all for now)
  2. Wait a few seconds before starting the tuning process.

Tuning process

  1. Take N measurements of the memory stall rate, T microseconds apart.
  2. Discard k measurements as outliers (the k/2 smallest and the k/2 largest).
  3. Compute the average of the remaining N - k measurements.
  4. If the average is lower than the previous one, keep going (shift a few more pages into the local node). Otherwise, stop.
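
Steps 2 and 3 amount to a trimmed mean. The sketch below shows one way to compute it; `trimmed_mean` is a hypothetical helper written for illustration and is not the library's `get_average_stall_rate`.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <stdexcept>
#include <vector>

// Average of `samples` after discarding the k/2 smallest and the k/2 largest
// values. Hypothetical helper, not taken from the library.
double trimmed_mean(std::vector<double> samples, std::size_t k) {
  const std::size_t drop = k / 2;  // number of values dropped at each end
  if (samples.size() <= 2 * drop) {
    throw std::invalid_argument("not enough samples to discard k outliers");
  }
  std::sort(samples.begin(), samples.end());
  const auto first = samples.begin() + static_cast<std::ptrdiff_t>(drop);
  const auto last = samples.end() - static_cast<std::ptrdiff_t>(drop);
  const double sum = std::accumulate(first, last, 0.0);
  return sum / static_cast<double>(last - first);
}
```

With N = 5 and k = 2, the samples {100, 1, 2, 3, -50} lose -50 and 100, and the remaining three values are averaged.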

    // slowly achieve awesomeness: shift pages into the local node in 1% steps
    // until the measured stall rate stops improving
    while (local_ratio < 1.00) {
      LWARNF("GOING TO CHECK A LOCAL RATIO OF %lf", local_ratio);
      place_all_pages(local_ratio);
      stall_rate = get_average_stall_rate(NUM_POLLS, POLL_SLEEP, NUM_POLL_OUTLIERS);
      // log the stall rate measured at this ratio
      LINFOF("RATIO = %lf STALL RATE = %lf", local_ratio, stall_rate);
      if (stall_rate > prev_stall_rate) break;  // got worse, so stop here
      prev_stall_rate = stall_rate;
      local_ratio += 0.01;
    }

`place_pages` is not idempotent

Calling place_pages repeatedly with the same arguments moves extra pages into the local node each time.

We wanted to improve it anyway, so it should be rewritten.

Extend `place_pages` to have independent weights per node

Currently we assume there is a single local (worker) node.

The function takes one parameter, the local percentage, and places that share of the segment in the local node; the remainder is interleaved across the three other nodes.

Provide another version that accepts a separate weight for each node.
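
As an illustration of what per-node weights could mean in practice, the sketch below turns a weight vector into a page count per node using largest-remainder rounding, so the counts always add up to the segment size. `pages_per_node` is a hypothetical name, weights are assumed to be non-negative, and the rounding scheme is one choice among several.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Splits `num_pages` among nodes in proportion to `weights` (assumed >= 0).
// Hypothetical helper; weights need not sum to 1.
std::vector<std::size_t> pages_per_node(std::size_t num_pages,
                                        const std::vector<double>& weights) {
  std::vector<std::size_t> counts(weights.size(), 0);
  const double total = std::accumulate(weights.begin(), weights.end(), 0.0);
  if (total <= 0.0) return counts;

  std::vector<double> remainder(weights.size(), 0.0);
  std::size_t assigned = 0;
  for (std::size_t n = 0; n < weights.size(); ++n) {
    const double exact = static_cast<double>(num_pages) * weights[n] / total;
    counts[n] = static_cast<std::size_t>(exact);  // round down first
    remainder[n] = exact - static_cast<double>(counts[n]);
    assigned += counts[n];
  }

  // Hand the leftover pages to the nodes with the largest fractional parts.
  std::vector<std::size_t> order(weights.size());
  std::iota(order.begin(), order.end(), std::size_t{0});
  std::stable_sort(order.begin(), order.end(),
                   [&](std::size_t a, std::size_t b) {
                     return remainder[a] > remainder[b];
                   });
  for (std::size_t i = 0; assigned < num_pages && i < order.size(); ++i) {
    ++counts[order[i]];
    ++assigned;
  }
  return counts;
}
```

The current single-parameter behaviour corresponds to a weight vector such as {40, 20, 20, 20} for a 40% local ratio on a four-node machine.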

Evaluate Gureya's Benchmark

Measured stall rate and decisions taken by the algorithm

Measurements at 2018-10-10 on intel14cores-v2

[Screenshot, 2018-10-13 at 17:46:24]

Local Ratio             0             0-1           0-2           0-3           0-4           0-5           0-6
25%                     0,8652997439  0,8523260392  0,8057950605  0,7694768156  0,7407080060  0,7151234701  0,6881931666
30%                     0,8634031441  0,8474194553  0,7984094285  0,7614674645  0,7408262380  0,7148938137  0,6872798011
35%                     0,8590710471  0,8382672294  0,7844607808  0,7465582615  0,7412492110  0,7140231972  0,6868939998
40%                     0,8534129592  0,8265852018  0,7677881273  0,7427290218  0,7413086173  0,7148290723  0,6826531546
45%                     0,8473018202  0,8161373897  0,7703117436  0,7470264965  0,7344344647  -             0,6940479186
50%                     0,8417095835  0,8201595776  -             -             0,7372542016  -             -
55%                     0,8369322683  -             -             -             -             -             -
60%                     0,8334777535  -             -             -             -             -             -
65%                     0,8316501186  -             -             -             -             -             -
70%                     0,8305620997  -             -             -             -             -             -
75%                     0,8302131814  -             -             -             -             -             -
80%                     0,8301691608  -             -             -             -             -             -
85%                     0,8301030693  -             -             -             -             -             -
90%                     0,8299428306  -             -             -             -             -             -
95%                     0,8299722645  -             -             -             -             -             -
100%                    0,8300216876  -             -             -             -             -             -
Minimum Stall Observed  0,8299428306  0,8161373897  0,7677881273  0,7427290218  0,7344344647  0,7140231972  0,6826531546
Optimal Local Ratio     90%           45%           40%           40%           45%           35%           40%
Decision (local ratio)  100%          50%           45%           45%           50%           40%           45%
Per-Node Throughput     3954 MB/s     3827 MB/s     3526 MB/s     3424 MB/s     3364 MB/s     3174 MB/s     2960 MB/s

Dynamic Memory Mapping

Currently we only support checking the memory map once and acting upon it.

We cannot simply re-run place_pages when the memory mapping is updated, because place_pages is not idempotent (see #9 for an explanation and example).

Evaluate Streamcluster

@gureya which version of Streamcluster are you using?

I recall comments about multiple versions existing during the last Skype call.

Support single-node machines

Currently the library simply crashes when run on a machine with a single NUMA node.

We should simply show a warning and do nothing.
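
A minimal sketch of such a guard, assuming a Linux system where `/sys/devices/system/node/online` lists the online NUMA nodes in the usual kernel list format (for example `0` or `0-3`). The function names are hypothetical, and the library may prefer to query libnuma instead.

```cpp
#include <cstdio>
#include <fstream>
#include <sstream>
#include <string>

// Counts the ids in a kernel-style list such as "0", "0-3" or "0-1,4-5".
int count_ids(const std::string& list) {
  int count = 0;
  std::istringstream in(list);
  std::string range;
  while (std::getline(in, range, ',')) {
    int first = 0;
    int last = 0;
    const int fields = std::sscanf(range.c_str(), "%d-%d", &first, &last);
    if (fields == 1) {
      count += 1;  // single id
    } else if (fields == 2 && last >= first) {
      count += last - first + 1;  // inclusive range
    }
  }
  return count;
}

// Number of online NUMA nodes, or 0 if the sysfs file cannot be read.
int online_numa_nodes(const char* path = "/sys/devices/system/node/online") {
  std::ifstream file(path);
  std::string list;
  std::getline(file, list);
  return count_ids(list);
}

// Hypothetical start-up guard: warn and stay passive with fewer than two nodes.
bool should_enable_placement(int num_nodes) {
  if (num_nodes < 2) {
    std::fprintf(stderr, "unstickymem: %d NUMA node(s) detected, doing nothing\n",
                 num_nodes);
    return false;
  }
  return true;
}
```

The library constructor could call `should_enable_placement(online_numa_nodes())` and return early when it yields false.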

Adaptive noise-avoiding stop-condition not working

The mechanism to avoid stopping due to noise is simply not working.

    if (get_average_stall_rate(_num_polls * 2, _poll_sleep,
                               _num_poll_outliers * 2)) {
      LINFO("I guess so!");
      break;
    }

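If we get a higher stall rate, we trigger this confirmation mechanism, which measures again with twice as many polls. However, the condition only tests whether the re-measured stall rate is non-zero, which is practically always true, so we stop the adaptive placement regardless of whether the new measurement is actually higher or lower.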

This needs to be fixed in all adaptive mechanisms (weighted and uniform)
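
A sketch of what the corrected check might look like: the confirmation measurement is compared against the previous stall rate instead of being tested for truthiness. `should_stop` is a hypothetical helper, and `measure(scale)` stands in for a call like `get_average_stall_rate(_num_polls * scale, _poll_sleep, _num_poll_outliers * scale)`.

```cpp
#include <functional>

// Returns true when the adaptive loop should stop.
// `measure(scale)` is a hypothetical stand-in for re-measuring the average
// stall rate with `scale` times the usual number of polls and outliers.
bool should_stop(double stall_rate, double prev_stall_rate,
                 const std::function<double(int)>& measure) {
  if (stall_rate <= prev_stall_rate) return false;  // still improving
  // The rise may be noise: measure again with twice the samples and only
  // stop if the confirmed rate is still worse than the previous one.
  const double confirmed = measure(2);
  return confirmed > prev_stall_rate;
}
```

Keeping the decision in one function would let the weighted and uniform adaptive modes share the same stop condition.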

Some Unit Tests?

We should try to write functional tests for parts of the library.

Not an easy task, given the nature of the library.

Make `rdpmc` portable across different CPUs

Currently it works only on Broadwell/Haswell (which covers intel14cores).

Intel Performance Counters:

  • 1 (countreg 0x40000001) - Haswell - Core Clock Cycles
  • 111 (event 0xa2, mask 0x01) - Haswell - Any Resource Stall

AMD Performance Counters:

  • ??????????

Changes are being tracked on the perf branch
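
For reference, a hedged sketch of how the two Haswell counters above could be read and combined. Setting bit 30 of the RDPMC selector picks a fixed-function counter, which is where `0x40000001` comes from. The `Sample` struct and `stall_rate` are hypothetical, and computing the stall rate as stall cycles divided by core cycles is an assumption about how the library defines it.

```cpp
#include <cstdint>

// RDPMC selector for Intel fixed-function counter `n` (bit 30 set),
// e.g. fixed_counter(1) == 0x40000001 for core clock cycles.
constexpr std::uint32_t fixed_counter(std::uint32_t n) {
  return (1u << 30) | n;
}

#if defined(__x86_64__) || defined(__i386__)
// Reads a performance counter with the RDPMC instruction. This faults unless
// user-space RDPMC is enabled and, for programmable counters, the event
// (e.g. 0xa2 / mask 0x01) has already been configured.
inline std::uint64_t rdpmc(std::uint32_t counter) {
  std::uint32_t lo = 0;
  std::uint32_t hi = 0;
  __asm__ volatile("rdpmc" : "=a"(lo), "=d"(hi) : "c"(counter));
  return (static_cast<std::uint64_t>(hi) << 32) | lo;
}
#endif

// One reading of both counters (hypothetical type).
struct Sample {
  std::uint64_t cycles;  // core clock cycles
  std::uint64_t stalls;  // resource stall cycles
};

// Fraction of cycles spent stalled between two samples (assumed definition).
double stall_rate(const Sample& before, const Sample& after) {
  const std::uint64_t cycles = after.cycles - before.cycles;
  if (cycles == 0) return 0.0;
  return static_cast<double>(after.stalls - before.stalls) /
         static_cast<double>(cycles);
}
```

Hiding the selectors behind a small per-vendor table would leave `stall_rate` unchanged once the AMD counters are identified.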

Basic Tutorial

Scenario: a new user has a brand-new NUMA machine running a default distribution (e.g. Ubuntu). How should they install and run the application? The tutorial should cover package dependencies and give a comprehensive list of commands.

Handling many small allocations

If we're dealing with an application that makes many small allocations, the algorithm in WeightedAdaptiveMode won't work.

@gureya has a fix ready in his branch :-P

I'm just writing it down here so we don't forget we had this problem, and to (try to) make a unit test later.

The fix is basically:

  1. Set the default memory policy to MPOL_INTERLEAVE on all nodes in the library constructor
  2. Ignore any new segments sent from MemoryMap until the WeightedAdaptive mode starts
  3. Apply the initial weights when the WeightedAdaptive mode starts
  4. Proceed normally

The side effect is that segments are uniformly interleaved until the mode starts, although we don't expect this to be significant.

Wrap Mapping Functions

We want to be able to know the set of all existing segments to apply mbind on.

To this end, we'll intercept standard C/Linux functions that deal with memory mappings/allocations:

  • mmap
  • munmap
  • sbrk

This mechanism has significant advantages over reading /proc/self/maps (#11):

  • Deal with user semantics rather than the kernel's -- we won't observe a segment getting fragmented, as happens when reading the /proc/self/maps file.
  • Up-to-date information -- we always have an up-to-date set of segments.
  • Thread-safety -- we won't be plagued by read-free-mbind crashes, as we are able to stop the process from making changes during vulnerable periods.
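
The bookkeeping behind such wrappers could look roughly like the sketch below. `SegmentRegistry` and its methods are hypothetical names, and the interposed `mmap`/`munmap`/`sbrk` wrappers themselves (which would forward to the real functions and then call into the registry) are not shown. Partial unmaps are deliberately ignored to keep the example short.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <mutex>

// Thread-safe set of live segments, keyed by start address (hypothetical).
class SegmentRegistry {
 public:
  // To be called by an interposed mmap wrapper after the real call succeeds.
  void on_mmap(void* addr, std::size_t length) {
    std::lock_guard<std::mutex> lock(mutex_);
    segments_[reinterpret_cast<std::uintptr_t>(addr)] = length;
  }

  // To be called by an interposed munmap wrapper. Simplification: only
  // whole-segment unmaps are handled; partial unmaps are ignored.
  void on_munmap(void* addr, std::size_t length) {
    std::lock_guard<std::mutex> lock(mutex_);
    const auto it = segments_.find(reinterpret_cast<std::uintptr_t>(addr));
    if (it != segments_.end() && it->second == length) segments_.erase(it);
  }

  // Runs fn(start, length) on every segment while holding the lock, so no
  // wrapper can change the set in the middle of, say, an mbind pass.
  template <typename Fn>
  void for_each(Fn fn) const {
    std::lock_guard<std::mutex> lock(mutex_);
    for (const auto& segment : segments_) fn(segment.first, segment.second);
  }

  std::size_t size() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return segments_.size();
  }

 private:
  mutable std::mutex mutex_;
  std::map<std::uintptr_t, std::size_t> segments_;
};
```

Holding the lock inside `for_each` is what gives the thread-safety property: allocation calls from other threads block in the wrapper until the pass finishes.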

Licensing?

  • What license to include in the repository?
  • What copyright notice to include in the source files?

`mbind` done transparently/asynchronously

There should be a background thread that does things automatically (sleep, measure, decide new weights, apply mbind).

  • The thread should be bound to the same core in the same NUMA node.
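
A small sketch of the binding step only, assuming Linux and glibc: the background thread would call this first and then enter its sleep/measure/decide/mbind loop. `pin_current_thread` is a hypothetical helper; choosing which core on which NUMA node to use is left open.

```cpp
#include <sched.h>

// Pins the calling thread to `cpu`. Returns true on success.
// Hypothetical helper; relies on the Linux sched_setaffinity call.
bool pin_current_thread(int cpu) {
  if (cpu < 0 || cpu >= CPU_SETSIZE) return false;
  cpu_set_t set;
  CPU_ZERO(&set);
  CPU_SET(cpu, &set);
  // A pid of 0 means "the calling thread".
  return sched_setaffinity(0, sizeof(set), &set) == 0;
}
```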

Refactor the Algorithms part

The way we are dealing with algorithms won't scale and quickly becomes a mess.

Instead of spawning a thread that (currently) has a scan mode and an adaptive mode, refactor this into a Factory/Strategy pattern.
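
A rough sketch of how the Factory/Strategy split might look. All class and function names here (`Mode`, `ScanMode`, `AdaptiveMode`, `ModeFactory`, `default_factory`) are hypothetical and only illustrate the shape of the refactoring; the `run` bodies are placeholders.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Strategy interface: each mode owns the logic the background thread runs.
class Mode {
 public:
  virtual ~Mode() = default;
  virtual std::string name() const = 0;
  virtual void run() = 0;
};

class ScanMode : public Mode {
 public:
  std::string name() const override { return "scan"; }
  void run() override { /* try each ratio in turn (placeholder) */ }
};

class AdaptiveMode : public Mode {
 public:
  std::string name() const override { return "adaptive"; }
  void run() override { /* measure, decide, place pages (placeholder) */ }
};

// Factory: maps a mode name (e.g. from a config option) to a constructor.
class ModeFactory {
 public:
  using Creator = std::function<std::unique_ptr<Mode>()>;

  void add(const std::string& name, Creator creator) {
    creators_[name] = std::move(creator);
  }

  // Returns nullptr for an unknown mode name.
  std::unique_ptr<Mode> create(const std::string& name) const {
    const auto it = creators_.find(name);
    if (it == creators_.end()) return nullptr;
    return it->second();
  }

 private:
  std::map<std::string, Creator> creators_;
};

ModeFactory default_factory() {
  ModeFactory factory;
  factory.add("scan", [] { return std::make_unique<ScanMode>(); });
  factory.add("adaptive", [] { return std::make_unique<AdaptiveMode>(); });
  return factory;
}
```

Adding a new algorithm then means adding one class and one `add` call, without touching the thread that runs it.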
