2

Closed

for loop doesn't Dispose() IEnumerator when exiting loop

description

<Thanks>lbaker</Thanks>
<Test>IronPythonTest.ProtocolTest and test_cliclass.py (test_iterator_dispose)</Test>
 
Version: IronPython 2.6b2
 
In C# when executing a foreach statement on an IEnumerable, the Dispose() method is called on the
IEnumerator that is returned when execution leaves the foreach block regardless of whether by normal
control flow or exception.
 
In IronPython's for loop, however, the enumerator's Dispose() method does not seem to be called when
control flow exits the for loop early via an exception raised from the body of the loop. This can cause
resource leaks in the enumerator. eg if the enumerator holds onto a database connection, mutex, file handle, etc.
 
If the list part of the 'for' statement is an IEnumerable then the IEnumerator should probably be Dispose()d on termination of the 'for' loop.
 
Example:
// C# Resource
class Query : IDisposable
{
public Query() { System.Console.WriteLine("Query Created"); }
public void Dispose() { System.Console.WriteLine("Query Disposed"); }
public static IEnumerable<int> GetValues()
{
using (var query = new Query())
{
   yield return 0;
}
}
}
 
// C# Test
public class Program
{
public static void Main()
{
try
{
  foreach (var x in Query.GetValues())
  {
     System.Console.WriteLine("throwing");
     throw new Exception();
  }
}
catch
{
   System.Console.WriteLine("caught");
}
}

}

Output:
Query Created
throwing
Query Disposed
caught
 
// Python Test
try:
for x in Query.GetValues():
print "raising"
raise Exception()
except:

print "caught"

Output:
Query Created
raising

caught

Expected Output:
Query Created
raising
Query Disposed
caught
Closed Feb 15, 2010 at 5:01 PM by dfugate
Dev did in fact add a test; just hard to find.

comments

dinov wrote Sep 2, 2009 at 1:43 AM

This is by design - we need to match the CPython behavior here which does not dispose of the iterator in for loops either.

lbaker wrote Sep 2, 2009 at 5:51 AM

I would perhaps argue that due to CPython's deterministic garbage collection, it does actually dispose of the iterator when the for loop goes out of scope due to the ref-count going to zero.

For example:
class MyIter(object):
def __init__(self, seq):
print "MyIter.__init__"
self.count = count
def __del__(self):
print "MyIter.__del__"
def next(self):
if self.count <= 0:
  raise StopIteration()
self.count -= 1
return self.count
class MySeq(object):
def __init__(self, count):
print "MySeq.__init__"
self.count = count
def __del__(self):
print "MySeq.__del__"
def __iter__(self):
return MyIter(self.count)
print "before"
for i in MySeq(5):
print i

print "after"

In CPython 2.5 I get the following output:
before
MySeq.init
MyIter.init
MySeq.del
4
3
2
1
0
MyIter.del

after

I guess I would expect that when the compiler makes implicit calls to IEnumerable.GetEnumerator() that it would also implicitly take care of cleaning up the created enumerator object when it was finished with it by disposing it. I wouldn't expect it to dispose of the IEnumerator if the user created that enumerator themselves explicitly.

For Example:

Implicit enumerator should be disposed on exit

for x in myseq:
raise Exception()

Explicit enumerator should not be disposed on exit

for x in iter(myseq):
raise Exception()

Alternatively, if you know the enumerator needs to be disposed

this seems like the only robust way of iterating over the sequence.

with iter(myseq) as myiter:
for x in myiter:
raise Exception()
Are there any other alternative ways of guaranteeing that my enumerator gets disposed that I've missed?

dinov wrote Sep 3, 2009 at 3:01 AM

The problem is it's going to zero because of GC (as you correctly point out) and not because of the for statement. If there's other references to the iterator than CPython won't collect it until whenever those go away. There have been discussions on various Python mailing lists about making for work like C# but it generally seems to be rejected. We're also not going to implement ref-counting on top of the GC.

The problem w/ the explicit call to GetEnumerator is that we do that for other things like file and generator objects. And in those cases calling Dispose would be incorrect as multiple for loops should be able to iterate over the same object w/ the 1st breaking out early. Perhaps we could look at making all of these objects IEnumerator's (there's really no perfect match to IEnumerable/IEnumerator in Python anyway) and make any IEnumerable work like C# does. It'll have to be a post-2.6 feature though.

One possible workaround would be to make a wrapper class which defines iter such as:

class disposableiterable(object):
def __init__(self, iterable):
    self.iterable = iterable
def __iter__(self):
    with self.iterable: 
        for x in self.iterable: 
            yield x

lbaker wrote Sep 8, 2009 at 12:26 AM

Looking at the PythonFile class it seems it would be compatible with disposing the IEnumerator returned from GetEnumerator() as it uses a C# generator function which returns a new enumerator each time it is called. The problem with the PythonGenerator class, though, is it is both an IEnumerable and IEnumerator. Perhaps the PythonGenerator.GetEnumerator() method could be similar to the PythonFile class.
ie.
IEnumerator IEnumerable.GetEnumerator()
{
while (MoveNext())
{
yield return Current;
}
}

Would translating the python for loop in roughly this way do what is needed?
for loopVar in listExpr:
body
seq = listExpr;
if isinstance(seq, IEnumerable):
enumerator = seq.GetEnumerator()
dispose = isinstance(enumerator, IDisposable)
else:
enumerator = PythonOps.GetEnumeratorForIteration(seq)
dispose = False
try:
while enumerator.MoveNext():
loopVar = enumerator.Current
body
finally:
if dispose:
enumerator.Dispose()
Another alternative could be to have the GetEnumeratorForIteration() method have an extra shouldDispose out-parameter or, when passed an IEnumerator as input, wrap the returned IEnumerator in a ProxyEnumerator whose Dispose() method does nothing so that callers of GetEnumeratorForIteration() can just always dispose the returned enumerator. This extra indirection would of course incur a slight performance penalty, though.

darb wrote Sep 8, 2009 at 10:01 PM

Change PythonFile and PythonGenerator (maybe others) from IEnumerable to IEnumerator. Breaking change - should push to 3k?

dinov wrote Sep 22, 2009 at 5:41 PM

Fixed for 2.6 RC1

dfugate wrote Feb 13, 2010 at 4:11 PM

Doesn't look like there's a regression for this?