Bug #103

crash on truck removal

Added by tdev over 2 years ago. Updated about 1 year ago.

Status:Closed Start date:01/18/2010
Priority:Urgent Due date:01/31/2010
Assignee:- % Done:

100%

Category:RoR - Physics
Target version:-
Operating System:All Operating System Bits:

Description

i guess we have a problem somewhere that trucks are deleted by other threads while still being used in other thread loops:

output:
First-chance exception at 0x015ca578 in RoR.exe: 0xC0000005: Access violation reading location 0x00006aac.
Unhandled exception at 0x015ca578 in RoR.exe: 0xC0000005: Access violation reading location 0x00006aac.

callstrack:

RoR.exe!Beam::truckTruckCollisions(float dt=0.00050000002, Beam * * trucks=0x0dfc5a60, int numtrucks=9) Line 7923 + 0x6 bytes C++

RoR.exe!Beam::threadentry(int id=0) Line 5747 C++
RoR.exe!threadstart(void * vid=0x00000000) Line 9800 + 0xc bytes C++

what i see in the debugger:
trucks[t] went 0, even you checked on the loop start before :-/

tested with multiplayer (when a client disconnects) and dual core support enabled.

any ideas for guarding the trucks array with mutexes without slowing it down or deadlocking it? :-\

History

#1 Updated by tdev over 2 years ago

we have a stream locking mechanisms:

LOCKSTREAMS();
UNLOCKSTREAMS();

but these are only for network purposes :-/

#2 Updated by estama over 2 years ago

No need to guard the trucks array with mutexes. We could bring all truck insertions and deletions into the thread that runs the simulation (calcEuler). Or have a small journal, containing truck deletion and insertion actions that will be acted upon at specific times that won't create collisions.

#3 Updated by tdev over 2 years ago

estama wrote:

No need to guard the trucks array with mutexes. We could bring all truck insertions and deletions into the thread that runs the simulation (calcEuler). Or have a small journal, containing truck deletion and insertion actions that will be acted upon at specific times that won't create collisions.

yes, i thought that would be the case already. I have such a journal running in order to decouple the network thread and the beam / ogre thread.

All truck creations/deletion should be done in the ogre thread. Seems we forgot a bit, since it crashed. :-\

#4 Updated by estama over 2 years ago

Thomas, in which parts of the code do trucks get added or deleted? Are these parts of the code in the same thread as trucktruckcollisions?

#5 Updated by estama over 2 years ago

Thomas is the ogre thread the same as the calcEuler thread? If not then we have a major problem, because trucktruckCollisions is in the calcEuler thread.

#6 Updated by tdev over 2 years ago

estama wrote:

Thomas is the ogre thread the same as the calcEuler thread? If not then we have a major problem, because trucktruckCollisions is in the calcEuler thread.

checking now ...

#7 Updated by tdev over 2 years ago

  • % Done changed from 10 to 30

estama wrote:

Thomas is the ogre thread the same as the calcEuler thread? If not then we have a major problem, because trucktruckCollisions is in the calcEuler thread.

just checked. Lets define the three major threads we are having in the client at the moment:
a) the ogre main thread: initial thread. All Ogre interaction needs to be done in this thread since ogre is not thread-safe
b) network thread: sending and listening for network packets. The packets are decoupled from the thread and added to a list that is processed in a)
c) in dual core mode: beam thread. This thread is decoupled from the ogre thread and just calculates the data (collision and euler)

so whats missing is the locking of the array when being processed in c) i guess. This was not always a problem since we never deleted trucks before. They just got moved to infinity and put to sleep. For the new multiplayer code it is required that truck deletion works properly. Look at

void ExampleFrameListener::removeTruck(int truck)

#8 Updated by estama over 2 years ago

Now that i'm looking at the code, it isn't just the trucktruckCollisions that is problematic. The whole of Beam::threadentry has the same synchronization problems.

This code for example (above trucktruckcollisions):

            for (t=0; t<numtrucks; t++)
            {
                if(!trucks[t]) continue;
                                ...........
                                ...........
                trucks[t]->calcForcesEuler(i==0, dtperstep, i, steps, trucks, numtrucks);

Can crash if a truck is deleted between the !truck check and calcforceseuler.

#9 Updated by tdev over 2 years ago

estama wrote:

Now that i'm looking at the code, it isn't just the trucktruckCollisions that is problematic. The whole of Beam::threadentry has the same synchronization problems.

This code for example (above trucktruckcollisions):

[...]

Can crash if a truck is deleted between the !truck check and calcforceseuler.

also realized that now. :-\

so should we create a:

1) create a journal and delete in the beam thread, and lock in the ogre thread?
2) lock the beam thread and leave deleting as it is: in the ogre thread?

#10 Updated by estama over 2 years ago

I prefer the journal way. I suspect that it could even be done without locks, by using circular buffers. The Ogre and network threads will write commands at the buffer's head, and the beam thread will read and execute them starting from the tail.

#11 Updated by tdev over 2 years ago

  • Target version changed from 0.36.3 to 0.37

#12 Updated by tdev about 2 years ago

  • Priority changed from Normal to Urgent
  • % Done changed from 30 to 40

we need to have a go at this for the next version as its a potential crash to desktop problem.

#13 Updated by estama about 2 years ago

The quickest (and safe) way to fix this is:

- Add a new attribute "todelete" to truck's class.
- Instead of deleting a truck, change this attribute to true where right now a truck gets deleted.
- Just before calling trucktruckcollisions, scan all the trucks to see if they have "todelete==true" and delete them.

This would guaranty that a truck won't get deleted while trucktruckcollisions runs. Adding a truck while trucktruckcollisions runs shouldn't present a problem. If it is, then it is fixable in more or less the same way.

#14 Updated by tdev almost 2 years ago

  • Target version deleted (0.37)

#15 Updated by tdev about 1 year ago

  • Status changed from requires improvement to Closed
  • Assignee deleted (estama)
  • % Done changed from 40 to 100

duplicate

Also available in: Atom PDF