8-bit strings can't contain characters > 0x80
description
In the latest version of IronPython (2.0A6), I noticed some weird behaviour with 8-bit strings:
IronPython console: IronPython 2.0A6 (2.0.11102.00) on .NET 2.0.50727.1378
Copyright (c) Microsoft Corporation. All rights reserved.
>>> str("\x7e")
'~'
>>> str("\x7f")
u'\x7f'
>>> str("\x80")
u'\x80'
>>> str("\x81")
Traceback (most recent call last):
File , line 0, in ##23
File mscorlib, line unknown, in GetString
File mscorlib, line unknown, in GetChars
File mscorlib, line unknown, in Fallback
File mscorlib, line unknown, in Throw
UnicodeDecodeError: Unable to translate bytes [81] at index 0 from specified code page to Unicode.
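This appears to be a bug in IronPython: the CPython interpreter allows an 8-bit string to contain any character value up to 0xFF, so str("\x81") should return the string unchanged instead of raising UnicodeDecodeError. Note also that "\x7f" and "\x80" do not raise, but they come back as unicode strings (u'\x7f', u'\x80') rather than as plain 8-bit strings like '~'.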
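For comparison, here is a minimal sketch of the expected behaviour, written for a current CPython (Python 3), where 8-bit strings are the bytes type. The use of latin-1 is only an assumption about how a runtime that stores strings as Unicode could keep all 256 byte values intact; it is not a description of what IronPython does internally. The helper name decodes_as_ascii is made up for this illustration.

```python
# Every byte value 0x00..0xFF should be storable in an 8-bit string.
all_bytes = bytes(range(256))

# latin-1 maps each byte to the code point with the same number,
# so decoding and re-encoding loses nothing.
as_text = all_bytes.decode("latin-1")
round_tripped = as_text.encode("latin-1")


def decodes_as_ascii(value):
    """Return True if the single byte `value` can be decoded as ASCII."""
    try:
        bytes([value]).decode("ascii")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    print(round_tripped == all_bytes)   # lossless round trip through latin-1
    print(decodes_as_ascii(0x7E))       # within ASCII
    print(decodes_as_ascii(0x81))       # outside ASCII, decode fails
```

A decode step that only accepts a restricted code page fails on high bytes in the same way as the traceback above, whereas a one-to-one mapping such as latin-1 preserves every value up to 0xFF.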