Ticket #210 (closed defect: fixed)

Opened 10 months ago

Last modified 10 months ago

Race condition in MTInteractiveShell

Reported by: marc Assigned to: fperez
Priority: high Milestone:
Component: ipython Version:
Severity: major Keywords:
Cc:

Description (Last modified by fperez)

I am using ipython (0.8.2) in threaded mode (for pylab support) with a GTK backend in a (sort-of) embedded shell (by calling make_session). However, occasionally ipython locks up. I traced the problem after a while back to the threading synchronization in MTInteractiveShell:

Runsource:
411             got_lock = self.thread_ready.acquire(False)
412	        self.code_queue.put(code)
413	        if got_lock:
414	            self.thread_ready.wait()  # Wait until processed in timeout interval
415	            self.thread_ready.release()

Runcode:
424             global CODE_RUN
425	
426	        # Exceptions need to be raised differently depending on which thread is
427	        # active
428	        CODE_RUN = True
429	        
430	        # lock thread-protected stuff
431	        got_lock = self.thread_ready.acquire(False)
432	
...
449	        # Flush queue of pending code by calling the run methood of the parent
450	        # class with all items which may be in the queue.
451	        while 1:
452	            try:
453	                code_to_run = self.code_queue.get_nowait()
454	            except Queue.Empty:
455	                break
456	            if got_lock:
457	                self.thread_ready.notify()
458	                InteractiveShell.runcode(self,code_to_run)
459	            else:
460	                break
461	            
462	        # We're done with thread-protected variables
463	        if got_lock:
464	            self.thread_ready.release()
465	
466	        # We're done...
467	        CODE_RUN = False

The problem is that both functions acquire the lock only if available (the got_lock parameter). The race condition that occurs (every 40 commands or so) is that: A. runsource acquires lock, puts code in queue (411-412) B. runcode trys to acquire lock, fails as runsource has the lock (431) C. runsource starts waiting (as it has the lock) (414) D. runcode obtains code, but breaks as it doesn not have the lock. It does not notify the waiting Runsource! (451-460)

(C and D) could also be in different order

Possible solution:

Make the Lock an Rlock (to enable the thread calling runcode to call runsource)
364      self.thread_ready = threading.Condition(threading.RLock())

Runsource
- Make lock acquire blocking
411     got_lock = self.thread_ready.acquire()
- Only perform wait if this is not an reentrant lock (got_lock is True on outer lock, and 1 on inner locks)
413     if(got_lock is True):
414        self.thread_ready.wait()  # Wait until processed in timeout interval
- always release (not based on if(got_lock))
415     self.thread_ready.release()


Runcode
- make locking required
431    self.thread_ready.acquire()
- always run code if available (not dependent on if(got_lock)) (in the current implementation code just disappears)
- move notify out of while loop, only call it if code has been obtained and executed (not essential)
- always release lock (not dependent on if(got_lock))
450    code_to_run=None
451    while 1:
452         try:
453             code_to_run = self.code_queue.get_nowait()
454         except Queue.Empty:
455             break
458         InteractiveShell.runcode(self,code_to_run)
...            
        # We're done with thread-protected variables
461     if(not code_to_run is None):
462        self.thread_ready.notify()
463     self.thread_ready.release()

This seems to solve the deadlocking problem I encountered. Furthermore, using this the code is still reentrant (e.g. you can run ip.IP.runsource('a=1') from the console, or even something like 'ip.IP.runsource('a=1'); ip.IP.runcode()' without deadlocking), so i guess macros are ok too. I did not test it with the other backends (QT,etc.) however.

Attachments

Shell.py (41.0 kB) - added by marc on 01/30/08 09:05:50.

Change History

01/30/08 09:05:50 changed by marc

  • attachment Shell.py added.

01/31/08 01:06:26 changed by fperez

  • status changed from new to closed.
  • resolution set to fixed.
  • description changed.

Closed by r2997, where I applied your changes with small cleanups to the current version of Shell.py.

Many thanks! Your analysis seems correct, which I greatly appreciate. I'm pretty ignorant of threaded code, and I've known for a long time this code had bugs, let's hope your fixes are the last word on this. I'd appreciate it if you could test it further from SVN, in particular regarding the behavior of Ctrl-C, which is always very iffy with threads.