Giter Club home page Giter Club logo

Comments (9)

keturn avatar keturn commented on September 17, 2024

While testing MovingBlocks/Terasology#4611 , the test run failed with:

org.gradle.api.internal.tasks.testing.TestSuiteExecutionException: Could not complete execution for Gradle Test Executor 22.
	at org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.stop(SuiteTestClassProcessor.java:63)
        […]
	at org.gradle.api.internal.tasks.testing.worker.TestWorker.stop(TestWorker.java:133)
        […]
	at org.gradle.internal.concurrent.ExecutorPolicy$CatchAndRecordFailures.onExecute(ExecutorPolicy.java:64)
Caused by: java.lang.OutOfMemoryError: Direct buffer memory
	at java.base/java.nio.Bits.reserveMemory(Bits.java:175)
	at java.base/java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:118)
	at java.base/java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:317)
	at org.terasology.engine.rendering.assets.texture.PNGTextureFormat.load(PNGTextureFormat.java:35)
	at org.terasology.engine.rendering.assets.texture.PNGTextureFormat.load(PNGTextureFormat.java:19)
	at org.terasology.assets.module.UnloadedAssetData$AssetSourceResolver.load(UnloadedAssetData.java:292)
	at org.terasology.assets.module.UnloadedAssetData.load(UnloadedAssetData.java:180)
	at org.terasology.assets.module.ModuleAssetDataProducer.getAssetData(ModuleAssetDataProducer.java:228)
	at org.terasology.assets.AssetType.lambda$reload$2(AssetType.java:359)
	at java.base/java.security.AccessController.doPrivileged(Native Method)
	at org.terasology.assets.AssetType.reload(AssetType.java:357)
	at org.terasology.assets.AssetType.getNormalAsset(AssetType.java:386)
	at org.terasology.assets.AssetType.getAsset(AssetType.java:262)
	at org.terasology.assets.management.AssetManager.getAsset(AssetManager.java:223)
	at org.terasology.engine.rendering.assets.font.FontMaterialProducer.getAssetData(FontMaterialProducer.java:61)

from moduletestingenvironment.

jdrueckert avatar jdrueckert commented on September 17, 2024

I wonder if we can formulate this issue less vague - "very slow" is subjective (which is not me saying, it's not very slow 😉 ) and makes it hard to assess when this issue would actually be resolved.
Which durations do we expect? What amount of improvement would be sufficient? A.k.a what's the "Definition of Done" for this issue?

from moduletestingenvironment.

jdrueckert avatar jdrueckert commented on September 17, 2024

Accelerate MTE test suite

from moduletestingenvironment.

keturn avatar keturn commented on September 17, 2024

Probably the real achievement here would be to understand what is taking so much time and why.

But if we want a range to aim for, I'd say 5 seconds per test would make it merely "really slow" instead of the "extraordinarily slow" it is now. That'd be about two minutes for the suite of 22 tests.

from moduletestingenvironment.

keturn avatar keturn commented on September 17, 2024

After saying that, it did run in under two minutes on my branch just now. (though not every test was under 5s)

Does that branch speed it up so much? Or is it faster because I only have iota modules, and it would be slower in an omega workspace? 🤔

from moduletestingenvironment.

Cervator avatar Cervator commented on September 17, 2024

Or is it faster because I only have iota modules

We've definitely seen issues in the past caused/aggravated by the presence of other modules, even if they aren't activated for whatever is going on. But on the other hand it is slow in Jenkins which has the minimum amount of modules.

If the tests in the MTE itself are just more memory hungry, are we sure they'll take advantage of all available memory? I think the current build agents are configured to offer up to 6 GB, but I have no idea if Gradle is allowing that (especially if it launches in different contexts / peripheral ones like module test cases rather than a plain engine build)

from moduletestingenvironment.

skaldarnar avatar skaldarnar commented on September 17, 2024

The devil is in the detail here: The tests are not running out of heap space, but out of native memory space. The eBay team wrote a SRE Case Study on this, and there a couple of other blog posts on troubleshooting it.

Caused by: java.lang.OutOfMemoryError: Direct buffer memory

It does look like we accumulate more and more used memory in the native space each time we start up a new engine for an MTE test, seemingly without cleaning up that memory in between. I can run a couple of tests locally just fine, but once crossing a specific threshold the test executor will crash eventually.


In a bit more detail: it all starts from StateLoading, which tries to push a screen, and then we do a bunch of asset (re-)loading. This happens each time we start an MTE test.

    ...
    at org.terasology.engine.core.modes.StateLoading.init(StateLoading.java:139)
    at org.terasology.engine.core.TerasologyEngine.switchState(TerasologyEngine.java:591)
    at org.terasology.engine.core.TerasologyEngine.processPendingState(TerasologyEngine.java:580)
    at org.terasology.engine.core.TerasologyEngine.tick(TerasologyEngine.java:484)
    at org.terasology.moduletestingenvironment.ModuleTestingEnvironment.createHost(ModuleTestingEnvironment.java:475)
    at org.terasology.moduletestingenvironment.ModuleTestingEnvironment.setup(ModuleTestingEnvironment.java:167)
    at org.terasology.moduletestingenvironment.MTEExtension.beforeAll(MTEExtension.java:93)
    ...

It does look like we have a native memory leak in there, which becomes apparent here because we start and stop the engine a bunch of times... Eventually, the test executor will crash with the above mentioned OOM.

Caused by: java.lang.OutOfMemoryError: Direct buffer memory
    at java.base/java.nio.Bits.reserveMemory(Bits.java:175)
    at java.base/java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:118)
    at java.base/java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:317)
    at org.terasology.engine.rendering.assets.texture.AWTTextureFormat.convertToTextureData(AWTTextureFormat.java:78)
    at org.terasology.engine.rendering.assets.texture.AWTTextureFormat.load(AWTTextureFormat.java:50)
    at org.terasology.engine.rendering.assets.texture.AWTTextureFormat.load(AWTTextureFormat.java:25)
    at org.terasology.gestalt.assets.format.producer.UnloadedAssetData$AssetSourceResolver.load(UnloadedAssetData.java:311)
    at org.terasology.gestalt.assets.format.producer.UnloadedAssetData.load(UnloadedAssetData.java:181)
    at org.terasology.gestalt.assets.format.producer.AssetFileDataProducer.getAssetData(AssetFileDataProducer.java:227)
    at org.terasology.gestalt.assets.AssetType.lambda$reload$2(AssetType.java:360)
    at java.base/java.security.AccessController.doPrivileged(Native Method)
    at org.terasology.gestalt.assets.AssetType.reload(AssetType.java:358)
    ...

from moduletestingenvironment.

keturn avatar keturn commented on September 17, 2024

It seems likely that some of the things causing problems for re-using the same process for multiple MTE tests might also have been causing problems for switching between games in the same Terasology run. There was some work on that in MovingBlocks/Terasology#4829; we should review that and see if MTE is doing the appropriate dispose steps.

from moduletestingenvironment.

keturn avatar keturn commented on September 17, 2024

Follow-ups:

[closing because I can't transfer issues between @Terasology and @MovingBlocks]

from moduletestingenvironment.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.