Calling unicode on a unicode string containing non-ASCII characters is supposed to return the same unicode string. This works correctly in both CPython and IronPython when unicode is called directly. However, when the call goes through another name bound to unicode (for example, unicode_reference = unicode), IronPython throws an exception instead. The attached file contains asserts that pass under CPython but raise a UnicodeDecodeError on IronPython.
Example code to replicate behaviour:
IronPython 2.6.2 (2.6.10920.0) on .NET 2.0.50727.4952
Type "help", "copyright", "credits" or "license" for more information.
>>> unicode_reference = unicode
>>> unicode_reference(u'\xe9')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: ('unknown', u'\xe9', 0, 1, '')
Under CPython, unicode and unicode_reference can be used interchangeably.
Workaround: Use unicode_reference = lambda x: unicode(x) instead.
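The difference between the failing alias and the workaround can be sketched as follows. This is only an illustration, not the attached file: the names direct_reference, wrapped_reference and check are hypothetical, and the Python 3 shim at the top is an assumption added so the snippet also runs on interpreters that have no unicode builtin.

```python
import sys

if sys.version_info[0] >= 3:
    # Assumption for illustration: Python 3 has no `unicode` builtin,
    # so alias it to `str`. On Python 2 / IronPython 2.x this branch is
    # skipped and `unicode` below is the real builtin.
    unicode = str

sample = u'\xe9'  # the non-ASCII character from the traceback

# Alias bound directly to the type: the form reported to fail on IronPython.
direct_reference = unicode

# Workaround from the report: call `unicode` by name inside a lambda.
wrapped_reference = lambda x: unicode(x)


def check(convert):
    """Return True if convert() gives back the sample string unchanged."""
    try:
        return convert(sample) == sample
    except UnicodeDecodeError:
        return False
```

Under CPython, check(direct_reference) and check(wrapped_reference) should both be true. On an affected IronPython build, the report implies that only the wrapped form succeeds.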
Note: Using Jinja2 on IronPython with non-ASCII characters in a template will result in this error (at least as of Jinja2 v2.5). To work around this issue, replace to_string = unicode with to_string = lambda x: unicode(x) near the top of runtime.py.