Comments (9)
Using JVisualVM sampling (so not profiling!):
1. SoundCardStream.getAmountOfFramesNeeded - I/O to sound card hardware
2. ImmerseMixer.lambda$15 - ???
3. SampleReader.readSamples - Read from disk
4. SoundCardStream.writeToLine - Write to sound card hardware
So the top 4 are all I/O related, as can be expected. writeToLine is already executed in parallel, 1 and 3 are not, and for 2 I'm not sure. So doing the input I/O reading in parallel might be an interesting speedup. Polling getAmountOfFramesNeeded (or the underlying call) in a separate thread and just taking the latest value in the loop might also help.
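A minimal sketch of the "poll in a separate thread, take the latest value in the loop" idea, assuming the slow call can be wrapped as a LongSupplier. LatestValuePoller and its parameters are hypothetical names, not part of Immerse:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Hypothetical helper: polls a slow value on a background thread and caches the latest result. */
class LatestValuePoller implements AutoCloseable {
    private final AtomicLong latest;
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "latest-value-poller");
                t.setDaemon(true); // do not keep the JVM alive just for polling
                return t;
            });

    LatestValuePoller(LongSupplier slowCall, long periodMillis) {
        // One synchronous reading up front, so get() never returns an uninitialized value.
        this.latest = new AtomicLong(slowCall.getAsLong());
        // Note: if slowCall throws, the executor suppresses all further polls.
        this.scheduler.scheduleWithFixedDelay(
                () -> latest.set(slowCall.getAsLong()),
                periodMillis, periodMillis, TimeUnit.MILLISECONDS);
    }

    /** Cheap, non-blocking read for use inside the mixer loop. */
    long get() {
        return latest.get();
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

Usage could look like new LatestValuePoller(soundCardStream::getAmountOfFramesNeeded, 5), assuming that method returns a long. The trade-off is that the loop works with a value that can be up to one polling period old.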
from immerse.
If you need to sample on a non-GUI machine like the Pine64, use jstat.
https://dzone.com/articles/jvm-statistics-with-jstat
from immerse.
Even more detail is available with the JVM flag:
-XX:+PrintGCDetails
from immerse.
Ultimate GC knowledge resource:
https://plumbr.io/java-garbage-collection-handbook
from immerse.
Another profiler, run on a local laptop, gives this result:
Thread: Main Mixer Worker
Method % self time
.mixer.ImmerseMixer.sleep() 98.7
.mixer.ImmerseMixer.lambda$15() 0.4
.soundcard.SoundCardStream.getAmountOfFramesNeeded() 0.4
.util.MemoryUtil.getFreeSpaceInBytes() 0.3
.mixer.ImmerseMixer.run() 0.1
Sleep is not important here, so actual processing time is about:
A. 1/3 lambda$15
B. 1/3 getAmountOfFramesNeeded
C. 1/3 getFreeSpaceInBytes
D. a little bit: mixer run
A. What is this?
B. Fixable by doing this in a separate thread, should be part of #47
C. Maybe also do this in a separate thread and/or not on every step? It would be a shame if this costs more than it gains.
D. nanotime / 1MB buffer?
Notes:
- This is with just 1 sound card, so getFreeSpaceInBytes might be overrepresented in this stat
- It seems the actual calculations are not relevant for performance. It makes sense that I/O is much slower than pure CPU numerical calculations on moderate amounts of data
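The rough shares above can be checked by rescaling the self-time percentages with sleep() left out. A small sketch using the numbers from the table; the class and method names are made up for illustration:

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: rescales self-time percentages after dropping one method (here: sleep). */
class NonSleepShare {
    static Map<String, Double> normalize(Map<String, Double> selfTimePercent, String ignoredMethod) {
        // Sum of everything except the ignored method.
        double total = 0.0;
        for (Map.Entry<String, Double> e : selfTimePercent.entrySet()) {
            if (!e.getKey().equals(ignoredMethod)) {
                total += e.getValue();
            }
        }
        // Each remaining entry as a percentage of that sum.
        Map<String, Double> result = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : selfTimePercent.entrySet()) {
            if (!e.getKey().equals(ignoredMethod)) {
                result.put(e.getKey(), 100.0 * e.getValue() / total);
            }
        }
        return result;
    }

    /** The self-time values from the "Main Mixer Worker" table above. */
    static Map<String, Double> profilerSample() {
        Map<String, Double> sample = new LinkedHashMap<>();
        sample.put(".mixer.ImmerseMixer.sleep()", 98.7);
        sample.put(".mixer.ImmerseMixer.lambda$15()", 0.4);
        sample.put(".soundcard.SoundCardStream.getAmountOfFramesNeeded()", 0.4);
        sample.put(".util.MemoryUtil.getFreeSpaceInBytes()", 0.3);
        sample.put(".mixer.ImmerseMixer.run()", 0.1);
        return sample;
    }
}
```

With 1.2% non-sleep self time in total, this works out to about 33% each for lambda$15 and getAmountOfFramesNeeded, 25% for getFreeSpaceInBytes and 8% for run. With percentages rounded to one decimal, these shares are only indicative.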
from immerse.
lambda$15 -> is actually this code:
EntryStream.of(soundCardBufferData).forKeyValue(
(soundCardStream, bufferData) -> this.executorService.submit(() -> soundCardStream.writeToLine(bufferData)));
And specifically "this.executorService.submit". So the submission logic apparently takes quite some time. The question is how much impact this has on, for instance, the Pine64. We could consider putting even the submission logic in a new thread, but that kind of defeats the purpose of a thread pool.
from immerse.
Taking all packages into account gives a much clearer picture (see below). This confirms what was already suspected above:
For main thread:
- Actual calculations are not a factor at all!
- Bottleneck: com.sun.media.sound.DirectAudioDevice.nGetBytePosition => getting the position in the output line, will be tackled by #47
- Bottleneck: sun.misc.Unsafe.unpark => getting a worker from the thread pool. It may be worth creating a separate issue to find a better solution for this (if possible) -> tested with new Thread(() -> {}).start() instead and that is about 3 times slower, with the samples landing in Thread.start and AccessController (from the Thread constructor)
- Bottleneck: sun.management.MemoryPoolImpl.getUsage0 => querying the current memory usage is also not cheap and should be done less frequently (if at all) -> idea: don't try to be too smart about it, query every once in a while and trigger if less than 10% or so is free. Or maybe not querying at all is actually fine as well... -> it should at least be taken out of the synchronous main thread!
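A possible shape for the "query every once in a while, off the main thread" idea, as a sketch only. EdenSpaceMonitor is a hypothetical class, and matching the pool by the substring "Eden" is an assumption that depends on the garbage collector in use (pool names differ per collector, and some collectors have no eden pool at all):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: checks free eden space periodically on a background thread. */
class EdenSpaceMonitor implements AutoCloseable {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "eden-space-monitor");
                t.setDaemon(true);
                return t;
            });
    private volatile boolean lowOnEden = false;

    EdenSpaceMonitor(long periodMillis, double minFreeFraction) {
        scheduler.scheduleWithFixedDelay(
                () -> lowOnEden = freeEdenFraction() < minFreeFraction,
                0, periodMillis, TimeUnit.MILLISECONDS);
    }

    /** Cheap flag read for the main mixer loop; no JMX call happens here. */
    boolean isLowOnEden() {
        return lowOnEden;
    }

    /** Fraction of eden space that is free, or 1.0 if no eden pool is found. */
    static double freeEdenFraction() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (!pool.getName().contains("Eden")) { // assumption: name contains "Eden"
                continue;
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // pool is no longer valid
            }
            // getMax() is -1 when undefined; fall back to the committed size.
            long capacity = usage.getMax() > 0 ? usage.getMax() : usage.getCommitted();
            if (capacity > 0) {
                return 1.0 - (double) usage.getUsed() / capacity;
            }
        }
        return 1.0;
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

Something like new EdenSpaceMonitor(1000, 0.10) would match the "less than 10% or so" idea with a check once per second; both numbers are placeholders to be tuned.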
For the worker thread (which in this test just writes data to the buffer):
- Bottleneck: sun.misc.Unsafe.park => parking the worker thread for later re-use. Parking/unparking is better than starting new threads all the time, that is proven. But it would still be nice to minimize the effect of this. Improvement: use only 1 separate thread for writing data to all sound card buffers. Since the writing is pretty fast and we deal with buffers (so no real-time behaviour needed), that should easily suffice.
- Bottleneck: com.sun.media.sound.DirectAudioDevice.nWrite() => writing the buffer data to the sound card. This is a needed operation and no problem, because it runs in parallel and fills a buffer.
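The "1 separate thread for all writes" improvement could look roughly like this. It is a sketch with hypothetical names (SingleWriterThread, writeAll); the card type and the write callback stand in for SoundCardStream and writeToLine:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.BiConsumer;

/** Hypothetical helper: writes the buffers of all sound cards from one dedicated thread. */
class SingleWriterThread<C> implements AutoCloseable {
    private final ExecutorService writer = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "sound-card-writer");
        t.setDaemon(true);
        return t;
    });
    private final BiConsumer<C, byte[]> writeToLine;

    SingleWriterThread(BiConsumer<C, byte[]> writeToLine) {
        this.writeToLine = writeToLine;
    }

    /** One submit per mixer step instead of one per sound card. */
    Future<?> writeAll(Map<C, byte[]> bufferDataPerCard) {
        // Copy the map so the caller can reuse or modify it for the next step.
        Map<C, byte[]> snapshot = new LinkedHashMap<>(bufferDataPerCard);
        return writer.submit(() -> snapshot.forEach(writeToLine));
    }

    @Override
    public void close() {
        writer.shutdown();
    }
}
```

The cost of this design is that the cards are now written one after another, so a write that blocks on one card delays all the others. That only works out if the writes really are fast buffer fills, as assumed above.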
General:
- Looking at the thread (un)park stats, it may be interesting to minimize extra thread usage, so look at all places where work is handed to a new thread and minimize where possible, for instance listener notifications.
TODO: run this test on the Pine64 with multiple sound cards. nGetBytePosition and unpark should scale with the number of sound cards, while memory pool usage shouldn't. First perform a baseline test, then try the improvements mentioned above.
Thread: pool-1-thread-2 285817
java.lang.Thread.run() 285817 100.0 0 0.0 1
java.util.concurrent.ThreadPoolExecutor$Worker.run() 285817 100.0 0 0.0 1
java.util.concurrent.ThreadPoolExecutor.runWorker() 285817 100.0 0 0.0 33
java.util.concurrent.ThreadPoolExecutor.getTask() 285016 99.71975074960552 0 0.0 17
java.util.concurrent.SynchronousQueue.poll() 285016 99.71975074960552 0 0.0 17
java.util.concurrent.SynchronousQueue$TransferStack.transfer() 285016 99.71975074960552 0 0.0 17
java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill() 285016 99.71975074960552 0 0.0 17
java.util.concurrent.locks.LockSupport.parkNanos() 285016 99.71975074960552 0 0.0 17
sun.misc.Unsafe.park() 285016 99.71975074960552 285016 99.71975074960552 17
java.util.concurrent.FutureTask.run() 599 0.2095746579104812 0 0.0 12
java.util.concurrent.Executors$RunnableAdapter.call() 599 0.2095746579104812 0 0.0 12
com.programyourhome.immerse.audiostreaming.mixer.ImmerseMixer$$Lambda$64/36569262.run() 599 0.2095746579104812 0 0.0 12
com.programyourhome.immerse.audiostreaming.mixer.ImmerseMixer.lambda$16() 599 0.2095746579104812 0 0.0 12
com.programyourhome.immerse.audiostreaming.soundcard.SoundCardStream.writeToLine() 599 0.2095746579104812 0 0.0 12
com.sun.media.sound.DirectAudioDevice$DirectDL.write() 599 0.2095746579104812 0 0.0 12
com.sun.media.sound.DirectAudioDevice.access$1800() 599 0.2095746579104812 0 0.0 12
com.sun.media.sound.DirectAudioDevice.nWrite() 599 0.2095746579104812 599 0.2095746579104812 12
java.lang.Thread.interrupted() 202 0.070674592484002 0 0.0 4
java.lang.Thread.isInterrupted() 202 0.070674592484002 202 0.070674592484002 4
Thread: Main Mixer Worker 285817
java.lang.Thread.run() 285817 100.0 0 0.0 1
com.programyourhome.immerse.audiostreaming.mixer.ImmerseMixer$$Lambda$14/1066516207.run() 285817 100.0 0 0.0 1
com.programyourhome.immerse.audiostreaming.mixer.ImmerseMixer.run() 285817 100.0 51 0.017843585231109415 111
com.programyourhome.immerse.audiostreaming.mixer.ImmerseMixer.sleep() 283057 99.03434715219879 0 0.0 56
java.lang.Thread.sleep() 283057 99.03434715219879 283057 99.03434715219879 56
com.programyourhome.immerse.audiostreaming.mixer.ImmerseMixer.updateBuffers() 2107 0.7371849819989714 0 0.0 42
com.programyourhome.immerse.audiostreaming.mixer.step.MixerStep.<init>() 1304 0.4562359831640525 0 0.0 26
com.programyourhome.immerse.audiostreaming.mixer.step.MixerStep.calculateAmountOfFramesNeeded() 1304 0.4562359831640525 0 0.0 26
one.util.streamex.AbstractStreamEx.toList() 1304 0.4562359831640525 0 0.0 26
one.util.streamex.AbstractStreamEx.toArray() 1304 0.4562359831640525 0 0.0 26
java.util.stream.ReferencePipeline.toArray() 1304 0.4562359831640525 0 0.0 26
java.util.stream.AbstractPipeline.evaluateToArrayNode() 1304 0.4562359831640525 0 0.0 26
java.util.stream.AbstractPipeline.evaluate() 1304 0.4562359831640525 0 0.0 26
java.util.stream.AbstractPipeline.wrapAndCopyInto() 1304 0.4562359831640525 0 0.0 26
java.util.stream.AbstractPipeline.copyInto() 1304 0.4562359831640525 0 0.0 26
java.util.HashMap$KeySpliterator.forEachRemaining() 1304 0.4562359831640525 0 0.0 26
java.util.stream.ReferencePipeline$3$1.accept() 1304 0.4562359831640525 0 0.0 26
com.programyourhome.immerse.audiostreaming.mixer.step.MixerStep$$Lambda$51/1254217904.apply() 1304 0.4562359831640525 0 0.0 26
com.programyourhome.immerse.audiostreaming.mixer.step.MixerStep.lambda$2() 1304 0.4562359831640525 0 0.0 26
com.programyourhome.immerse.audiostreaming.soundcard.SoundCardStream.getAmountOfFramesNeeded() 1304 0.4562359831640525 0 0.0 26
com.sun.media.sound.DirectAudioDevice$DirectDL.getLongFramePosition() 1304 0.4562359831640525 0 0.0 26
com.sun.media.sound.DirectAudioDevice.access$1700() 1304 0.4562359831640525 0 0.0 26
com.sun.media.sound.DirectAudioDevice.nGetBytePosition() 1304 0.4562359831640525 1304 0.4562359831640525 26
one.util.streamex.EntryStream.forKeyValue() 702 0.24561170259291784 0 0.0 14
one.util.streamex.AbstractStreamEx.forEach() 702 0.24561170259291784 0 0.0 14
java.util.stream.ReferencePipeline$Head.forEach() 702 0.24561170259291784 0 0.0 14
java.util.HashMap$EntrySpliterator.forEachRemaining() 702 0.24561170259291784 0 0.0 14
one.util.streamex.EntryStream$$Lambda$63/649994605.accept() 702 0.24561170259291784 0 0.0 14
one.util.streamex.EntryStream.lambda$toConsumer$0() 702 0.24561170259291784 0 0.0 14
com.programyourhome.immerse.audiostreaming.mixer.ImmerseMixer$$Lambda$62/1540172031.accept() 702 0.24561170259291784 0 0.0 14
com.programyourhome.immerse.audiostreaming.mixer.ImmerseMixer.lambda$15() 702 0.24561170259291784 0 0.0 14
java.util.concurrent.AbstractExecutorService.submit() 702 0.24561170259291784 0 0.0 14
java.util.concurrent.ThreadPoolExecutor.execute() 702 0.24561170259291784 0 0.0 14
java.util.concurrent.SynchronousQueue.offer() 702 0.24561170259291784 0 0.0 14
java.util.concurrent.SynchronousQueue$TransferStack.transfer() 702 0.24561170259291784 0 0.0 14
java.util.concurrent.SynchronousQueue$TransferStack$SNode.tryMatch() 702 0.24561170259291784 0 0.0 14
java.util.concurrent.locks.LockSupport.unpark() 702 0.24561170259291784 0 0.0 14
sun.misc.Unsafe.unpark() 702 0.24561170259291784 702 0.24561170259291784 14
com.programyourhome.immerse.audiostreaming.mixer.step.MixerStep.calculateBufferData() 101 0.035337296242001 0 0.0 2
com.programyourhome.immerse.audiostreaming.mixer.step.MixerStep.createSilence() 101 0.035337296242001 0 0.0 2
com.programyourhome.immerse.toolbox.util.StreamUtil.toMapFixedValue() 101 0.035337296242001 0 0.0 2
one.util.streamex.StreamEx.mapToEntry() 101 0.035337296242001 0 0.0 2
one.util.streamex.BaseStreamEx.stream() 50 0.017493711010891585 0 0.0 1
one.util.streamex.AbstractStreamEx.createStream() 50 0.017493711010891585 0 0.0 1
one.util.streamex.AbstractStreamEx.createStream() 50 0.017493711010891585 0 0.0 1
java.util.stream.StreamSupport.stream() 50 0.017493711010891585 50 0.017493711010891585 1
java.lang.invoke.LambdaForm$MH/883049899.linkToTargetMethod() 51 0.017843585231109415 0 0.0 1
java.lang.invoke.LambdaForm$DMH/1072591677.invokeStatic_L_L() 51 0.017843585231109415 0 0.0 1
one.util.streamex.StreamEx$$Lambda$56/107910188.get$Lambda() 51 0.017843585231109415 51 0.017843585231109415 1
com.programyourhome.immerse.audiostreaming.util.MemoryUtil.getFreeEdenSpaceInKB() 602 0.2106242805711347 0 0.0 12
com.programyourhome.immerse.audiostreaming.util.MemoryUtil.getFreeSpaceInKB() 602 0.2106242805711347 0 0.0 12
com.programyourhome.immerse.audiostreaming.util.MemoryUtil.getFreeSpaceInBytes() 602 0.2106242805711347 0 0.0 12
sun.management.MemoryPoolImpl.getUsage() 602 0.2106242805711347 0 0.0 12
sun.management.MemoryPoolImpl.getUsage0() 602 0.2106242805711347 602 0.2106242805711347 12
Thread: Java Sound Event Dispatcher 285817
java.lang.Thread.run() 285817 100.0 0 0.0 1
com.sun.media.sound.EventDispatcher.run() 285817 100.0 0 0.0 1
com.sun.media.sound.EventDispatcher.dispatchEvents() 285817 100.0 0 0.0 1
java.lang.Object.wait() 285817 100.0 0 0.0 1
java.lang.Object.wait() 285817 100.0 285817 100.0 1
Thread: Finalizer 285817
java.lang.ref.Finalizer$FinalizerThread.run() 285817 100.0 0 0.0 1
java.lang.ref.ReferenceQueue.remove() 285817 100.0 0 0.0 1
java.lang.ref.ReferenceQueue.remove() 285817 100.0 0 0.0 1
Thread: Reference Handler 285817
java.lang.ref.Reference$ReferenceHandler.run() 285817 100.0 0 0.0 1
java.lang.ref.Reference.tryHandlePending() 285817 100.0 0 0.0 1
java.lang.Object.wait() 285817 100.0 0 0.0 1
from immerse.
Arghh, the previous results are for a run without any active scenarios!!!
Rerunning now to get 'real' results.
from immerse.
New results, with 1 sound card and 3 running scenarios, are in:
% of non-sleep time of the main thread:
37% - calculate scenario results
27% - calculate output buffers
17% - get frame position of sound card data line
10% - unparking for sound card writing
7% - eden space calculations
2% - small leftovers and rounding errors
Still very interesting to do the same for the Pine64, with 6 sound cards and 3 scenarios. More sound cards will influence the percentages (and Immerse should be optimized for a lot of sound cards), and the balance between CPU and (blocking) I/O will also be different on different hardware.
Worker threads:
Same results as above. So less unparking might still be interesting. Restarts are peanuts (of course, since very few invocations).
Details of the 37% - calculate scenario results:
67% - AudioInputStream.read (2/3 Disk I/O, 1/3 converters)
11% - FractionalNormalizeAlgorithm
5% - fromJavaAudioFormat
rest % - various stream(ex) and self time calculations
Details of the 27% - calculate output buffers:
35% - inside calculateOutputBuffers stream stuff - mapValues, a little unclear what exactly
27% - inside calculateCombinedOutputSamples stream stuff - ReferencePipeline.accept, a little unclear what exactly
7% - sample writer
rest % - various stream(ex) and self time calculations
How much of the calculation time is StreamEx related?!?
Conclusions:
- Position of data line - still an important improvement, and one that can easily be parallelized
- Eden space calculations - less important, but still easy to parallelize
- Unparking threads - interesting to check an easy tweak (all writes in 1 thread)
- Calculations - unclear how much is stream(ex) overhead and how much is actual data calculation - seems interesting to write a non-stream version with lots of loops just to see the difference.
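To make the stream versus loop comparison concrete, here is a sketch of the same mixing step written both ways. It is not the actual Immerse code: MixComparison and the plain "sum all inputs per frame" mixing are stand-ins for illustration.

```java
import java.util.List;
import java.util.stream.IntStream;

/** Hypothetical comparison: the same per-frame sum, once with streams and once with loops. */
class MixComparison {
    /** Stream style: for every frame index, sum the samples of all inputs. */
    static double[] mixWithStreams(List<double[]> inputs, int frames) {
        return IntStream.range(0, frames)
                .mapToDouble(i -> inputs.stream().mapToDouble(in -> in[i]).sum())
                .toArray();
    }

    /** Loop style: one output array, no intermediate objects or lambdas. */
    static double[] mixWithLoops(List<double[]> inputs, int frames) {
        double[] out = new double[frames];
        for (double[] in : inputs) {
            for (int i = 0; i < frames; i++) {
                out[i] += in[i];
            }
        }
        return out;
    }
}
```

Both methods should produce the same output up to floating point rounding, so the difference can be measured in isolation. For trustworthy numbers a benchmark harness such as JMH is a better fit than ad hoc timing, since JIT warm-up easily distorts short runs.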
from immerse.