Looks similar to the issue fixed in <a class="commit-link" data-hovercard-type="commit

I can't reproduce this. (Build from <a class="commit-link" data-hovercard-type="commi

Stephen Sinclair notifications@github.

[example] BulletBouncingBoxDynamic.py segfaults,about siconos/siconos

Comments (21)

radarsat1 commented on June 2, 2024

I can't reproduce this. (Build from 272df40, doesn't crash for me.) Is it only BulletBouncingBoxDynamic.py that fails for you, and not BulletBouncingBox.py? There shouldn't be any difference between these two cases if it is indeed similar to 83ce528.

from siconos.

xhub commented on June 2, 2024

Stephen Sinclair [email protected] [2016-03-14 10:19:19 -0700]:

I can't reproduce this. (Build from 272df40, doesn't crash for me.) Is it only BulletBouncingBoxDynamic.py that fails for you, and not BulletBouncingBox.py? There shouldn't be any difference between these two cases if it is indeed similar to 83ce528.

I got the error by using the ASAN sanitizer with gcc (you have to compile siconos with -DUSE_SANITIZER=asan to activate with gcc or clang).
I get the same error for all examples (cpp and python) in that directory.
Using ASAN is quite handy to catch bugs, but you need siconos to be compiled with it and it slow down thing a bit. Also it does not always play well with python.

Otherwise try ``siconos -P valgrind BulletBouncingBoxDynamic.cpp''

Below is the first error. There are quite a few, so I won't copy them here, but I can mail you the full log.

==2218== Invalid read of size 8
==2218== at 0x5471624: btCollisionWorld::~btCollisionWorld() (in /usr/lib64/libBulletCollision.so.2.83)
==2218== by 0x54716B8: btCollisionWorld::~btCollisionWorld() (in /usr/lib64/libBulletCollision.so.2.83)
==2218== by 0x5F95E5D: void boost::checked_delete(btCollisionWorld_) (checked_delete.hpp:34)
==2218== by 0x5F9BA11: boost::detail::sp_counted_impl_p::dispose() (sp_counted_impl.hpp:78)
==2218== by 0x416EE1: boost::detail::sp_counted_base::release() (sp_counted_base_gcc_x86.hpp:146)
==2218== by 0x416FA4: boost::detail::shared_count::~shared_count() (shared_count.hpp:443)
==2218== by 0x417D79: boost::shared_ptr::~shared_ptr() (shared_ptr.hpp:323)
==2218== by 0x5F9B796: BulletSpaceFilter::~BulletSpaceFilter() (in /install/no-asan/lib64/libsiconos_mechanics.so.3.9.0)
==2218== by 0x5F9B865: BulletSpaceFilter::~BulletSpaceFilter() (BulletSpaceFilter.hpp:26)
==2218== by 0x41C044: void boost::checked_delete(BulletSpaceFilter_) (checked_delete.hpp:34)
==2218== by 0x41CDAD: boost::detail::sp_counted_impl_p::dispose() (sp_counted_impl.hpp:78)
==2218== by 0x416EE1: boost::detail::sp_counted_base::release() (sp_counted_base_gcc_x86.hpp:146)
==2218== Address 0x117d2bf0 is 0 bytes inside a block of size 288 free'd
==2218== at 0x4C2B740: operator delete(void_) (vg_replace_malloc.c:575)
==2218== by 0x5F95DF5: void boost::checked_delete(btDbvtBroadphase_) (checked_delete.hpp:34)
==2218== by 0x5F9BA53: boost::detail::sp_counted_impl_p::dispose() (sp_counted_impl.hpp:78)
==2218== by 0x416EE1: boost::detail::sp_counted_base::release() (sp_counted_base_gcc_x86.hpp:146)
==2218== by 0x416FA4: boost::detail::shared_count::~shared_count() (shared_count.hpp:443)
==2218== by 0x5F8942B: boost::shared_ptr::~shared_ptr() (shared_ptr.hpp:323)
==2218== by 0x5F9B760: BulletSpaceFilter::~BulletSpaceFilter() (in /install/no-asan/lib64/libsiconos_mechanics.so.3.9.0)
==2218== by 0x5F9B865: BulletSpaceFilter::~BulletSpaceFilter() (BulletSpaceFilter.hpp:26)
==2218== by 0x41C044: void boost::checked_delete(BulletSpaceFilter*) (checked_delete.hpp:34)
==2218== by 0x41CDAD: boost::detail::sp_counted_impl_p::dispose() (sp_counted_impl.hpp:78)
==2218== by 0x416EE1: boost::detail::sp_counted_base::release() (sp_counted_base_gcc_x86.hpp:146)
==2218== by 0x416FA4: boost::detail::shared_count::~shared_count() (shared_count.hpp:443)
==2218== Block was alloc'd at
==2218== at 0x4C2A560: operator new(unsigned long) (vg_replace_malloc.c:333)
==2218== by 0x5F86623: BulletSpaceFilter::BulletSpaceFilter(boost::shared_ptr) (BulletSpaceFilter.cpp:106)
==2218== by 0x4151F9: main (BulletBouncingBoxDynamic.cpp:193)

This bug may not manifest itself (maybe more often when called from python), but there is definitevely an issue that may bit us.

Also there is an issue with the Newton loop and the result does not match the reference file as shown here:

http://cdash-bipop.inrialpes.fr/testDetails.php?test=329006&build=18034

from siconos.

radarsat1 commented on June 2, 2024

Ok, Valgrind reports some reachable memory in BulletBouncingBoxDynamics.cpp but not in BulletBouncingBox.cpp, so there may be something dangling there. If I manage to get ASAN running I'll look more into it.

from siconos.

xhub commented on June 2, 2024

I'm not sure that this issue is still present, but the examples are not running on cdash.

@radarsat1 did you look into this?

from siconos.

vacary commented on June 2, 2024

The example works. At least, It does not produce a segfault, but the output is rather different from the reference file.

from siconos.

radarsat1 commented on June 2, 2024

These examples are out of date. The "normal" version produces ok output, but there appears something wrong with the dynamic output. (Which tests adding one DS to the graph at step 100.)

In both cases the reference files need updating, probably due to changes in the integrator/collision detection since a few months ago. But also both examples use BulletSpaceFilter, I'd like to update them to use SiconosBulletCollisionManager instead.

from siconos.

radarsat1 commented on June 2, 2024

I have updated the examples and I'm deciding whether to update the reference file, so I am examining the differences. If I use a SiconosPlane positioned exactly where the original reference file comes to rest (which curiously is at ~1.04, allowing for some penetration which is ok I guess), the global view is more or less fine:

However a zoomed-in view near the time of settling reveals some differences:

In particular something to notice is that the bounce positions of the red line (new behaviour) are closer to the resting position. I propose that this is due to the changes regarding where updateInteractions() is called (i.e. collision detection is performed during the Newton loop, instead of before.) It might be worth testing if this is the cause, but there are sufficient changes to the integrator that this would be difficult, and I think the new behaviour is desired, so I'll just update the reference file to reflect the change if there are no arguments. However, I will also take the opportunity to update the example to put the "ground" such that the bouncing position is at 0, and change it for SiconosPlane instead of a box.

As for the Dynamic version, the behaviour is sufficiently complex that I'm not sure even having a reference file is worth it, but I'll update it to match. It appears to be more or less the same as before in global behaviour, comparable differences only at small scale.

from siconos.

radarsat1 commented on June 2, 2024

Okay something strange. I was just investigating why some of the peaks in the last graph above are missing, and in fact I don't understand. In BulletBouncingBox, there is a series of "if" statements checking that there are exactly 4 contacts, etc, and only collects the lambda(1) value if they are all true. But, weirdly, removing all conditions, I changed the code to,

SP::InteractionsGraph index1 = simulation->indexSet(1);                        
InteractionsGraph::VIterator iur = index1->begin();                            
for (iur = index1->begin(); iur != index1->end(); iur++)                       
    dataPlot(k, 3) += index1->bundle(*iur)->lambda(1)->norm2();

and I still get missing impulses:

But if I change the code to collect p() instead (which is just the sum of all lambda() over the DS, after change of reference frame, right?) then I get:

dataPlot(k, 3) = body->p(1)->norm2();

I get the expected graph (with a different amplitude of course):

Anyways, I will push the code but I guess I will not push new reference files until this is understood.

from siconos.

radarsat1 commented on June 2, 2024

As I thought, it turns out the missing impulses are due to how I changed updateInteractions() to be called during the Newton loop. They can be seen if simulation->setNewtonMaxIterations(1).

So, this brings up the issue of whether this is correct. I thought it should be controversial to be adding and removing interactions during the Newton loop. What happens here is that if there is another iteration of the Newton loop, and the contacts are resolved, then positions are updated and Bullet is called again, which removes the Interactions. So they cannot be found by traversing the indexSet1.

Two possible solutions:

Decide that this is fine (simulation is correct)
Keep existing Interactions until the next timestep, just mark them for deletion. The difficulty here is that the contact points and distance calculation would need to be updated when positions of all DS change, without the help of Bullet. Perhaps it would also be correct to move them out of indexSet1.

from siconos.

vacary commented on June 2, 2024

We should be able to keep the list of interactions within the Newton loop but to be able to update the values of the contact point and local frame at contact when an update of the position is made during the newton loop.

from siconos.

radarsat1 commented on June 2, 2024

Yes okay, it makes sense, it should be more robust, only it will be a bit of work to implement but I'll give it a try. Do you think the contact engine should be allowed to add new contacts during the Newton loop, or only at the beginning? In any case I'll add a parameter to the InteractionManager indicating whether it is being called at the beginning or in the middle of the loop.

from siconos.

radarsat1 commented on June 2, 2024

To be clear, I should mention that the reason I was inspired to add updateInteractions() in the middle of the Newton loop is because previously there was already a call to updateWorldFromDS there, to update the collision engine data structures after MoreauJeanOSI::updatePosition, however the collision engine was not subsequently being called. So I had the choice of either removing the useless updateWorldFromDS, or adding a call to the collision engine. It seemed to me a good idea to update the broadphase whenever updateState/updatePosition was called, but perhaps not. Better to just adjust the contact points according to the new DS positions, without performing new a detection sweep. Not sure how it will impact rolling/sliding/contact behaviour, I will have to test.

from siconos.

vacary commented on June 2, 2024

I think we should be able to parametrize the behavior:

No update at all : the list of interactions is fixed and the contents of the relation is not updated.
- useful for very large granular materials and finite-element in small perturbations theory with small changes in configuration during the time--step.
partial update. the list of interactions is fixed but the contents of the relation is updated.
- useful for large or finite changes within the time step of the contact configuration (large rotations, ...)
full update : the list of interactions is updated and the contents of the relation is updated.
- it seems that is the case implemented right now. this case is for very fine computation in robotic or in control for instance.

It is possible to update the relation calling Bullet without calling the broad page an changing the list of interactions. We also have to work with bullet to stabilize the list of contact points. It is nice but also quite expensive in terms of memory management and convergence of nsgs.
What do you think ?

from siconos.

radarsat1 commented on June 2, 2024

I agree that configurable is the way to go, at least until some empirical or theoretical conclusions can be made. I'm not completely convinced that the "no update" idea is valid, it seems to me that if q() changes, then the contact point locations must change as well -- I assumed this was previously a source of bugs. Nonetheless it is easy to add, so it can be an option.

(Idea: Perhaps we should store contact points in NewtonEulerFrom1DLocalFrameR as relative to q(), instead of absolute coordinates!)

I'll look into finer control with Bullet, otherwise moving the contact points and distance shouldn't be too difficult, I was just avoiding it. In fact I think it might be worth providing at the SiconosCollisionManager level instead of in SiconosBulletCollisionManager, so that the same logic can easily be applied to other collision engines (eventually). The calculations themselves could go in NewtonEulerFrom1DLocalFrameR or NewtonEulerFrom3DLocalFrameR.

from siconos.

vacary commented on June 2, 2024

Great. I trust you for this idea... and the implementation.

from siconos.

radarsat1 commented on June 2, 2024

Ok there is an initial implementation in 01a06f2. Please review. It seems to generate the impulses for BulletBouncingBox as expected and with my testing so far it doesn't introduce errors. Basically I add _relPc1 and _relPc2 to NewtonEulerFrom1DLocalFrame and use these to update Pc1 and Pc2 in a default implementation of computeh(). BulletR also calls this before calculating a new contact distance. Good results so far with slide_roll.py. I will continue to test.

from siconos.

radarsat1 commented on June 2, 2024

I found that cube_directproj.py was broken with is update. It's an interesting example.. it seems the calculations for updating the contact points were correct, but since it uses such a large timestep, only the corner of the cube enters the surface and it needs to add more contact points via the intra-Newton updateInteractions for it to be stable, otherwise it starts to oscillate instead of sitting steady on the surface.

This (a) shows a use of having this behaviour optional, and (b) would be an interesting test case for add-Interactions only behaviour. (i.e, the broadphase is called but we don't remove contact points, only add them -- this could lead to runaway behaviour where too many Interactions are generated, but in combination with Bullet's "breaking threshold" which should block generation of new contact points within a certain distance, i believe it should stabilize as updates get closer together.)

Anyways, fixed in 67ba02b

from siconos.

radarsat1 commented on June 2, 2024

The one reference file that is still not matching very well is NE_3DS_3Knee_1Prism_GMP.cpp, which gives a very close match (7.867185e-05) but far enough to fail the test.

After a git bisect I determined that this started happening in the commit where I fixed the one-DS case for PrismaticJointR. e28c8e2

Looking at the reference file it appears that there is some very small movement in beam3 in one of the constrained axes of the prismatic joint, but it is exactly zero in the reference file. I will investigate before closing this.

from siconos.

radarsat1 commented on June 2, 2024

I am not sure whether to be concerned or not, but I will put a small analysis of what I have understood so far. Basically my changes to PrismaticJointR calculations seem to have introduced a small-amplitude (1e-17) change in the response of one axis in NE_3DS_3Knee_1Prism_GMP. All other axes are identical to before.

The image below shows how axis 15 of the dat/ref files (corresponding to the X position of beam3) compared before and after my changes. Before it was exactly zero, and it seems that my modifications to PrismaticJointR introduced some play in that axis. This changes a bit if you re-orient the axis, e.g. the 4th plot here is the same as the 3rd one, but changed axe1->setValue(0, 1) to axe1->setValue(2, 1) with absoluteRef=true, which should be exactly the same kinematic configuration but gives a completely different pattern.

I can't really explain this except that I think it must come from the interaction of this joint with this particular kinematic configuration. It may be that the good behaviour in the GMP case is the exception and not the rule. As evidence, I include the ref files for MLCP and MLCP_MoreauJeanCombinedProjection cases, which also show some play on that axis.

What I do find concerning is that if this is just a question of precision, it's strange that no amount of increasing the tolerance requirements of the solvers seems to improve it. Additionally, if I decrease the time step, the signal still looks the same, which indicates to me that it is a "real" simulation artifact, i.e., something wrong with the equality, and not just a precision issue.

from siconos.

radarsat1 commented on June 2, 2024

After reviewing more details of this final issue, it seems that I was wrong about the 15th column containing all the differences. In fact, if you calculate the norm of the difference for each column of the ref individually, one gets an error of 1e-5 for the new version. If you change the computation of jachq to closer to what it was before, by setting the elements of jachq(0,3-6) and jachq(1,3-6) to zero (c.f. old PrismaticJointR.cpp, you get an error of 1e-14 for each column.

In other words the change with respect to the reference is a true reflection of the new computations in jachq that take into account the orientation with respect to change in position. I still don't know why there is more "play" in beam3-X than previously, but at an amplitude of 1e-17 it does not contribute significantly to the difference with respect to the ref file, even if it has an interesting structure.

So, I would conclude here that updating the ref is the correct action to take.

from siconos.

vacary commented on June 2, 2024

Fine. I trust you for the updating of the ref file.

Concerning the error of amplitude 1e-17, it may perhaps depends how the constraints is taken into account into the GMP solvers and especially its order.

from siconos.

[example] BulletBouncingBoxDynamic.py segfaults about siconos HOT 21 CLOSED

Comments (21)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent