openlvc/hperf
HLA Performance Testing Federate for IEEE-1516 (2010)
Flow control is an essential part of ensuring that fast federates (or federates with fast connections) do not overwhelm slow receivers. The problem is this: a slow SENDER that is OK at receiving can get well out of sync with other federates. Although rare, I have encountered this in the throughput (TP) testing federate!

My laptop's USB3 gigabit NIC does not have as good throughput when sending as it does when receiving. As such, the slow federate is able to receive quickly enough, and then grant more credits to a sender, but it isn't able to match the output pace.
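The credit scheme described here can be sketched with a counting semaphore. This is a minimal illustration of the idea, not hperf's actual implementation; all names are invented.

```java
import java.util.concurrent.Semaphore;

// Minimal sketch of credit-based flow control (names invented, not hperf's
// actual classes): the receiver grants credits as it processes messages and
// the sender blocks when it has none left, so a slow receiver naturally
// throttles a fast sender.
public class CreditFlowControl {
    private final Semaphore credits;

    public CreditFlowControl(int initialCredits) {
        this.credits = new Semaphore(initialCredits);
    }

    // Sender side: spend one credit per message, blocking when none remain.
    public void acquireSendCredit() {
        credits.acquireUninterruptibly();
    }

    // Receiver side: hand credits back once messages have been processed.
    public void grantCredits(int processed) {
        credits.release(processed);
    }

    public int availableCredits() {
        return credits.availablePermits();
    }
}
```

Note that credits only protect the receive path; in the failure mode above, the slow node receives (and grants credits) just fine but can't match the pace on its own send path, which is why credits alone don't keep it in sync.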
Adding support for timestepped looping will bring all federates to the level of the lowest common denominator.
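The lockstep effect can be illustrated with plain threads and a barrier; in the real federate this would be done with HLA time management services rather than the invented names below.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

// Illustrative lockstep loop (a stand-in for HLA time management, all names
// invented): no thread may begin step N+1 until every thread has finished
// step N, so the group advances at the pace of its slowest member.
public class TimesteppedLoop {
    // Runs 'federates' threads for 'steps' lockstep iterations; thread 0 is
    // artificially slow. Returns the elapsed wall-clock time in millis.
    static long runLockstep(int federates, int steps, long slowMillis) {
        final CyclicBarrier barrier = new CyclicBarrier(federates);
        long start = System.currentTimeMillis();
        Thread[] threads = new Thread[federates];
        for (int i = 0; i < federates; i++) {
            final int id = i;
            threads[i] = new Thread(() -> {
                try {
                    for (int step = 0; step < steps; step++) {
                        Thread.sleep(id == 0 ? slowMillis : 1); // this step's work
                        barrier.await(); // wait for every federate to catch up
                    }
                } catch (InterruptedException | BrokenBarrierException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }
        try {
            for (Thread t : threads) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return System.currentTimeMillis() - start;
    }
}
```

Even the fast threads take at least `steps * slowMillis` to finish, which is exactly the "lowest common denominator" behaviour described above.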
Once complete:
- Pass the `--timestepped` argument on the command line and the Throughput test federate will go into timestepped mode, advancing time with each step.

As a developer of an open source RTI,
I want my benchmarking suite to run as fast as humanly possible,
So that I can test the limits of the RTI, not the benchmarking application.
The HLA defines a series of asynchronous services. In 1516e, federates can control the way that they receive callbacks. Previously, you had to call `tick()` or `evokeXxxCallbacks()` to signal to the RTI when you were ready to process waiting messages. As of 1516e you can enable an "immediate" callback mode, which essentially delivers callbacks as soon as they are ready (done on a separate thread).
Part of the problem with the `tick()` style services is that you have very little control over how long you tie up the process for. The provided facilities in 1516e are:

- `evokeCallback( double minWaitTime )` - Process a single callback, waiting at least the specified time if there are no messages to process
- `evokeMultipleCallbacks( double mintime, double maxtime )` - Process many callbacks, waiting for at least `mintime` but no longer than `maxtime`
The issue here is that when calling these methods I may block if there is no work to do, and I have to wait at least `mintime` for something to appear. This is time I could be using to do other work. In many circumstances this may not matter, but when you're working on an application that you want to push information out as fast as possible, the use of these services introduces an arbitrary delay. Playing with the `loop-wait` argument that the test federate takes lets you see this in action: the higher the value, the slower the throughput.
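To see where the delay comes from, here is a toy model of the `evokeMultipleCallbacks( min, max )` contract over a plain queue - an illustration of the semantics only, not the RTI's real implementation:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Toy model of the evokeMultipleCallbacks(min, max) contract (not the RTI's
// real implementation): process queued callbacks, blocking for up to the
// minimum wait when the queue is empty, and never running past the maximum.
public class EvokedCallbacks {
    static int evokeMultiple(BlockingQueue<Runnable> queue,
                             long minMillis, long maxMillis) {
        long maxDeadline = System.currentTimeMillis() + maxMillis;
        long minDeadline = System.currentTimeMillis() + minMillis;
        int processed = 0;
        while (System.currentTimeMillis() < maxDeadline) {
            long wait = Math.max(0, minDeadline - System.currentTimeMillis());
            try {
                // This poll is the "arbitrary delay" complained about above:
                // with an empty queue we sit idle for up to minMillis instead
                // of doing useful work.
                Runnable callback = queue.poll(wait, TimeUnit.MILLISECONDS);
                if (callback == null)
                    return processed; // minimum wait elapsed with nothing to do
                callback.run();
                processed++;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return processed;
    }
}
```

With an empty queue, the caller is stalled for the full minimum wait before control comes back - time a throughput-focused sender would rather spend sending.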
Why not just blat everything out and not tick until the end!? No - idiot - you'll fill up your queue with messages from other federates and run out of memory. The queues inside an LRC need tending, and the `tick()`/`evoke()` calls act as the formal handing over of power to the LRC to do that tending. Also, it's not really indicative of "real world" use.
So, what would be better is to use the immediate callback facilities that are available. In Portico - and I suspect every other RTI - this basically means that a background thread runs constantly, tending to the incoming messages and invoking calls on the `FederateAmbassador`. You no longer need to call the tick/evoke methods, nor do you need to worry about handing over time for queue-tending to ensure the queue doesn't grow too big. It's just done for you.
Ramifications? You can get `FederateAmbassador` calls at any time - including when you're right in the middle of doing other work - so be careful about any shared state.
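In the 1516e API the mode is chosen when connecting - `CallbackModel.HLA_IMMEDIATE` versus `CallbackModel.HLA_EVOKED` passed to `connect()`. The sketch below models only the mechanics of immediate delivery (a background thread tending a queue) using invented class names, to show why shared state now needs protection:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative model of immediate callback mode (invented names, not
// Portico's actual code): a background thread constantly tends the incoming
// queue and invokes the callbacks itself, so the federate never calls
// tick()/evoke*(). Because callbacks now arrive on another thread, shared
// state must be thread-safe - hence AtomicLong rather than a plain long.
public class ImmediateCallbacks {
    private final BlockingQueue<Runnable> incoming = new LinkedBlockingQueue<>();
    private final AtomicLong delivered = new AtomicLong();
    private final Thread deliveryThread;

    public ImmediateCallbacks() {
        deliveryThread = new Thread(() -> {
            try {
                while (true) {
                    incoming.take().run();   // deliver as soon as it is ready
                    delivered.incrementAndGet();
                }
            } catch (InterruptedException e) {
                // shutdown requested
            }
        });
        deliveryThread.setDaemon(true);
        deliveryThread.start();
    }

    public void enqueue(Runnable callback) { incoming.add(callback); }
    public long deliveredCount()           { return delivered.get(); }
    public void shutdown()                 { deliveryThread.interrupt(); }
}
```

The queue never grows unbounded because the delivery thread is always draining it - the "it's just done for you" behaviour described above.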
It would be nice as part of this ticket to maintain the ability to use evoked callbacks if a particular command line argument is given. This would let us contrast performance under each style.

Thus ends this completely unnecessary and lengthy commentary on a ticket that could have read "move from evoked to immediate callbacks".
Once complete:
- The federate uses `IMMEDIATE` callback mode, rather than `EVOKED`
- The federate can be put into `EVOKED` mode instead via a command line argument

Remaining Tasks:
- Decide what to do with the tick-wait argument (`loop-wait`)

On Windows, for some reason the throughput test fails to unpack received messages when the payload size is >1K. The messages that fail are attribute updates and interactions (which are stuffed with a payload of a definable size by the test). Unpacking complains that it has received a bad handle.
Suspect this could be related to openlvc/portico#65.
Each of the tests prints status about its progress every 1/10th of the total loops.
The throughput test currently prints a loop summary, with information about what has happened since the last time summary information was printed. Below is an example:
INFO [main] wantest: Finished loop 3000 -- 3832ms, 39720 events received (10365/s), 10.12MB/s
Add similar support to the latency test, which currently just prints "Completed loop X".
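Deriving the summary numbers is straightforward; here is a hypothetical helper (not hperf's actual code) that reproduces the received-rate figure in the sample line above:

```java
// Hypothetical helpers (not hperf's actual code) for the per-loop summary
// numbers: events/second from the elapsed millis, and MB/s from a byte count.
public class LoopSummary {
    // e.g. 39720 events in 3832ms -> 10365 events/sec, as in the sample line
    static long eventsPerSecond(long events, long elapsedMillis) {
        return events * 1000 / elapsedMillis;
    }

    static double megabytesPerSecond(long bytes, long elapsedMillis) {
        return (bytes / (1024.0 * 1024.0)) / (elapsedMillis / 1000.0);
    }

    static String summaryLine(int loop, long elapsedMillis, long events, long bytes) {
        return String.format("Finished loop %d -- %dms, %d events received (%d/s), %.2fMB/s",
                             loop, elapsedMillis, events,
                             eventsPerSecond(events, elapsedMillis),
                             megabytesPerSecond(bytes, elapsedMillis));
    }
}
```

The latency test could emit the same line shape with round-trip counts in place of received events.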
When performing the throughput test on Windows with various payload sizes I noted that after a certain payload size, performance would drop off dramatically. This relates to the sending of information, as I was testing with just a single federate. I have the following notes on the approximate times it took to run the TP test with the provided payload sizes:

- 800b: 5s
- 900b: 5s
- 950b: 5s
- 975b: 5s
- 980b: 5s
- 989b: 5s
- 990b: Exception (happened twice)
- 995b: 73s
- 1000b: 72s

In the long running tests, each of the 10% milestones would run in approximately the same time, so the slowdown was present through the entire run, not just at the end (and thus likely not just a full buffer).
I recently added support for the JVM binding. Unfortunately this no longer appears to work for the latency test. It would seem that all federates have been assigned `RECEIVER` mode, and nobody `SENDER`.

This should be an easy fix. If the latency test is being run, add the `--sender` argument to the command line of the federate that is explicitly named (not the peers) when starting the threads.
The value for the overall throughput in messages/second is a bit coarse at the moment. It has a strange habit of being exactly the same in two separate federates that make up a run. I suspect this is because I'm just dividing the overall message count by the run time in seconds (rounded). To take this little inaccuracy away, the total runtime in millis should be used. That will be slightly different for each federate and thus produce slightly more accurate (and believable) numbers.
INFO [main] wantest: ---------------------------------------------
INFO [main] wantest: All| 400000 | 26.1MB/s | 390.6MB | 28571/s |
INFO [main] wantest: ---------------------------------------------
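The rounding effect is easy to demonstrate. The runtimes below are hypothetical values near the ~14 seconds implied by the sample output (400000 messages at 28571/s):

```java
// Why rounded seconds give identical throughput numbers in two federates:
// any two runtimes that round to the same whole second produce the same
// rate. The millis values are hypothetical, chosen near the 14s mark
// implied by the sample output above.
public class ThroughputRate {
    // Current (coarse) approach: divide by the runtime rounded to seconds.
    static long perSecondCoarse(long messages, long runtimeMillis) {
        long seconds = Math.round(runtimeMillis / 1000.0);
        return messages / seconds;
    }

    // Proposed approach: keep the millis, so each federate's slightly
    // different runtime yields a slightly different (more believable) rate.
    static long perSecondFromMillis(long messages, long runtimeMillis) {
        return messages * 1000 / runtimeMillis;
    }

    public static void main(String[] args) {
        long msgs = 400000;
        long fedA = 13950, fedB = 14020; // hypothetical runtimes in millis
        System.out.println("coarse: " + perSecondCoarse(msgs, fedA)
                           + " vs " + perSecondCoarse(msgs, fedB));     // identical
        System.out.println("millis: " + perSecondFromMillis(msgs, fedA)
                           + " vs " + perSecondFromMillis(msgs, fedB)); // distinct
    }
}
```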
Once complete:
The throughput test prints a loop activity summary each time 10% of the total loops is completed. This currently contains information about received messages, but should potentially also contain information about sent messages (and their rate).
When a TP federate completes its run, it will wait to receive all the updates it expects from a federate before it attempts to synchronize on the "finished" point.
This is done because sync messages in portico are OOB and skip past any unprocessed incoming messages, potentially cutting a federate off before it's finished with the last of the messages it has queued (which means we don't end up counting them).
There is code in there to make the federate only wait a certain amount of time, in case dropped packets meant that the messages were never going to come at all. However, the overall effect is that it waits for a while and then skips past, and the other federates (which may not be finished sending) get all their messages out and processed (the federate is ticking while it is waiting to sync), or close to it.
The "count up" values printed by the wait method are actually pretty useful for seeing the waiting federate's received count tick up live (and thus reinforcing that there is incoming traffic).
So, let's keep that printing going, but only as long as the value is ticking up. If it stays steady for 3-5 checks, the messages probably aren't coming, so give up then. Sounds like a good compromise!
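That compromise can be sketched as follows - the method name comes from the ticket, but the internals are invented:

```java
import java.util.function.LongSupplier;

// Illustrative sketch (method name from the ticket, internals invented):
// keep waiting while the received-message count is still ticking up, and
// give up once it has stayed flat for a fixed number of checks.
public class FinishWaiter {
    // Returns true if the expected count was reached, false if we gave up
    // because the count stopped moving.
    static boolean waitForFinish(LongSupplier receivedCount, long expected,
                                 int maxStaleChecks, long sleepMillis) {
        long last = -1;
        int stale = 0;
        while (stale < maxStaleChecks) {
            long current = receivedCount.getAsLong();
            if (current >= expected)
                return true;                          // everything arrived
            stale = (current == last) ? stale + 1 : 0; // reset on progress
            last = current;
            System.out.println("  ... received " + current + "/" + expected);
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false; // flat for maxStaleChecks checks - probably not coming
    }
}
```

The printout keeps the live "count up" feedback, while the stale-check counter bounds how long a federate will wait on messages that were dropped and are never going to arrive.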
- The `waitForFinish` method continues to wait while the received message count for a federate increases, and stops if the count doesn't move for 3-5 straight checks (depending on tick/sleep time).

Add support for running a JVM-binding based federation. This should be uber fast and is really of little practical value (unless you plan on running a federation on one massive box), but the numbers are fun to know.