What’s New in Python 3.0
Author: Guido van Rossum
Release: 0.1
This article explains the new features in Python 3.0, compared to 2.6
(or in some cases 2.5, since 2.6 isn’t released yet).
The best estimate for a release date is August 2008.
This article doesn’t attempt to provide a complete specification of
the new features, but instead provides a convenient overview. For
full details, you should refer to the documentation for Python 3.0. If
you want to understand the complete implementation and design
rationale, refer to the PEP for a particular new feature.
Common Stumbling Blocks
This section briefly lists the changes that are more likely to trip
people up, without necessarily raising obvious errors. These are all
explained in more detail below. (I’m not listing syntactic changes
and removed or renamed features here, since those tend to produce hard
and fast errors; it’s the subtle behavioral changes in code that
remains syntactically valid that trip people up. I’m also omitting
changes to rarely used features.)
The print statement has been replaced with a print() function,
with keyword arguments to replace most of the special syntax of the
old print statement (PEP 3105). Examples:
Old: print "The answer is", 2*2
New: print("The answer is", 2*2)
Old: print x, # Trailing comma suppresses newline
New: print(x, end=" ") # Appends a space instead of a newline
Old: print # Prints a newline
New: print() # You must call the function!
Old: print >>sys.stderr, "fatal error"
New: print("fatal error", file=sys.stderr)
Old: print (x, y) # prints repr((x, y))
New: print((x, y)) # Not the same as print(x, y)!
You can also customize the separator between items, e.g.:
print("There are <", 2**32, "> possibilities!", sep="")
which produces:
There are <4294967296> possibilities!
Notes about the print() function:
- The print() function doesn’t support the “softspace” feature of
the old print statement. For example, in Python 2.x,
print "A\n", "B" would write "A\nB\n"; but in Python 3.0,
print("A\n", "B") writes "A\n B\n".
- At first, you’ll find yourself typing the old print x
a lot in interactive mode. Time to retrain your fingers to type
print(x) instead!
- When using the 2to3 source-to-source conversion tool, all
print statements are automatically converted to print()
function calls, so this is mostly a non-issue for larger projects.
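As a minimal sketch of the keyword arguments described above, the following captures print() output in an in-memory buffer so the exact characters written can be inspected. The buffer and the variable names are illustrative only, and the snippet assumes a current Python 3 interpreter rather than the 3.0 alphas.

```python
import io

# Capture output in an in-memory text buffer instead of the console.
buffer = io.StringIO()

# Defaults: items separated by a space, newline appended.
print("The answer is", 2 * 2, file=buffer)

# end=" " plays the role of the old trailing comma.
print("no", "newline", end=" ", file=buffer)

# sep="" joins the items with nothing in between.
print("<", 2 ** 32, ">", sep="", file=buffer)

# A single tuple argument prints the tuple's repr.
print((1, 2), file=buffer)

# No softspace: the separator is written even after a newline.
print("A\n", "B", file=buffer)

captured = buffer.getvalue()
```

Passing file=sys.stderr works the same way as the buffer used here; any object with a write() method that accepts str should do.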
Python 3.0 uses strings and bytes instead of the Unicode strings and
8-bit strings. This means that pretty much all code that uses
Unicode, encodings or binary data in any way has to change. The
change is for the better, as in the 2.x world there were numerous
bugs having to do with mixing encoded and unencoded text.
Text files enforce an encoding; binary files use bytes. This means
that if a file is opened using an incorrect mode or encoding, I/O
will likely fail.
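The split between text and binary data might be sketched as follows. The sample string and the temporary file name are made up for illustration, and the snippet assumes a current Python 3 interpreter.

```python
import os
import tempfile

text = "caf\u00e9"                  # four characters of text (str)
data = text.encode("utf-8")         # str -> bytes (five bytes in UTF-8)
round_trip = data.decode("utf-8")   # bytes -> str

# Mixing the two types is an error rather than an implicit conversion.
try:
    text + data
except TypeError:
    mixing_rejected = True
else:
    mixing_rejected = False

# Text mode applies an encoding; binary mode hands back raw bytes.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")   # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    with open(path, "rb") as f:
        raw = f.read()
```

Naming the encoding explicitly in open() avoids depending on the platform default, which is one of the ways a file can end up being read with the wrong encoding.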
map() and filter() return iterators. A quick fix is e.g.
list(map(...)), but a better fix is often to use a list
comprehension (especially when the original code uses lambda).
Particularly tricky is map() invoked for the side effects of the
function; the correct transformation is to use a for-loop.
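A short sketch of what the iterator behavior means in practice (the list of numbers is illustrative, and a current Python 3 interpreter is assumed):

```python
numbers = [1, 2, 3, 4]

# map() now returns a one-shot iterator, not a list.
squares = map(lambda n: n * n, numbers)
first_pass = list(squares)    # consumes the iterator
second_pass = list(squares)   # already exhausted, so this is empty

# List comprehensions are often the clearer replacement.
squares_list = [n * n for n in numbers]
evens = [n for n in numbers if n % 2 == 0]

# For side effects, use a plain for-loop; a bare map(seen.append, numbers)
# would do nothing because the iterator is never consumed.
seen = []
for n in numbers:
    seen.append(n)
```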
dict methods dict.keys(), dict.items() and
dict.values() return views instead of lists. For example, this no
longer works: k = d.keys(); k.sort(). Use k = sorted(d) instead.
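For instance, with a small illustrative dictionary (assuming a current Python 3 interpreter):

```python
d = {"b": 2, "a": 1, "c": 3}

keys = d.keys()
# A view is not a list, so it has no sort() method.
has_sort = hasattr(keys, "sort")

# sorted() accepts any iterable and returns a new list of the keys.
ordered = sorted(d)
```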
builtin.sorted() and list.sort() no longer accept the cmp
argument providing a comparison function. Use the key argument
instead. N.B. the key and reverse arguments are now “keyword-only”.
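A minimal sketch of sorting with key instead of cmp follows. The word list is illustrative, and the snippet assumes a current Python 3 interpreter.

```python
words = ["pear", "Fig", "banana"]

# Case-insensitive ordering: the key function is applied once per item.
by_lower = sorted(words, key=str.lower)

# Longest first: key and reverse are passed by keyword.
by_length_desc = sorted(words, key=len, reverse=True)

# list.sort() takes the same keyword arguments and sorts in place.
in_place = list(words)
in_place.sort(key=str.lower)

# Passing the key function positionally is rejected.
try:
    sorted(words, str.lower)
except TypeError:
    positional_rejected = True
else:
    positional_rejected = False
```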
1/2 returns a float. Use 1//2 to get the old integer behavior (floor division).
The repr() of a long integer doesn’t include the trailing L
anymore, so code that unconditionally strips that character will
chop off the last digit instead.
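The division and repr() changes can be checked with a few expressions. The values are chosen only for illustration, and a current Python 3 interpreter is assumed.

```python
true_div = 1 / 2          # always a float
floor_div = 1 // 2        # integer result, rounded toward negative infinity
negative_floor = -1 // 2  # floors rather than truncating toward zero

# repr() of a large integer has no trailing L, so code that strips
# the last character unconditionally would lose a digit.
big = 2 ** 64
big_repr = repr(big)
```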
Strings and Bytes
- There is only one string type; its name is str but its behavior and
implementation are like unicode in 2.x.
- The basestring superclass has been removed. The 2to3 tool
replaces every occurrence of basestring with str.
- PEP 3137: There is a new type, bytes, to represent binary data (and
encoded text, which is treated as binary data until you decide to decode it).
The str and bytes types cannot be mixed; you must always
explicitly convert between them, using the str.encode() (str -> bytes)
or bytes.decode() (bytes -> str) methods.
- All backslashes in raw strings are interpreted literally. This means that
Unicode escapes are not treated specially.
- PEP 3112: Bytes literals, e.g. b"abc", create bytes instances.
- PEP 3120: UTF-8 default source encoding.
- PEP 3131: Non-ASCII identifiers. (However, the standard library remains
ASCII-only with the exception of contributor names in comments.)
- PEP 3116: New I/O Implementation. The API is nearly 100% backwards
compatible, but completely reimplemented (currently mostly in Python). Also,
binary files use bytes instead of strings.
- The StringIO and cStringIO modules are gone. Instead, import
io.StringIO or io.BytesIO.
- '\U' and '\u' escapes in raw strings are not treated specially.
- A new system for built-in string formatting operations replaces the %
string formatting operator.
- The dict.iterkeys(), dict.itervalues() and dict.iteritems()
methods have been removed.
- dict.keys(), dict.values() and dict.items() return objects
with set behavior that reference the underlying dict.
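As a rough illustration of the view objects described in the last item above, the sketch below uses a made-up inventory dictionary and assumes a current Python 3 interpreter.

```python
inventory = {"apples": 3, "pears": 5}

keys = inventory.keys()
items = inventory.items()

# Views reference the underlying dict, so later changes show through.
inventory["plums"] = 7
live_keys = sorted(keys)

# Key views support set operations directly.
common = keys & {"pears", "plums", "kiwis"}

# Item views support membership tests on (key, value) pairs.
has_apples = ("apples", 3) in items
```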
PEP 3107: Function Annotations
- A standardized way of annotating a function’s parameters and return values.
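A small sketch of what an annotated function looks like follows. The function and its annotations are hypothetical; the language attaches no meaning to the annotation expressions and simply stores them on the function object.

```python
def scale(value: float, factor: "multiplier" = 2) -> float:
    """Hypothetical function; annotations may be any expression."""
    return value * factor

# Annotations are collected in a dict keyed by parameter name,
# with the return annotation stored under "return".
annotations = scale.__annotations__
result = scale(1.5)
```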
Exception Stuff
- PEP 352: Exceptions must derive from BaseException. This is the root
of the exception hierarchy.
- StandardError was removed (already in 2.6).
- Exception instances no longer support sequence behavior (slicing!)
or the message attribute.
- PEP 3109: Raising exceptions. You must now use raise Exception(args)
instead of raise Exception, args.
- PEP 3110: Catching exceptions. You must now use except SomeException as
identifier: instead of except Exception, identifier:
- PEP 3134: Exception chaining. (The __context__ feature from the PEP
hasn’t been implemented yet in 3.0a2.)
- A few exception messages are improved when Windows fails to load an extension
module. For example, error code 193 is now reported as “%1 is not a valid Win32
application”. Strings now deal with non-English locales.
- Classic classes are gone.
- PEP 3115: New Metaclass Syntax.
- PEP 3119: Abstract Base Classes (ABCs); @abstractmethod and
@abstractproperty decorators; collection ABCs.
- PEP 3129: Class decorators.
- PEP 3141: Numeric ABCs.
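The raise and except syntax from PEP 3109 and PEP 3110, together with explicit chaining from PEP 3134, might look like this in practice. ConfigError and parse_port are hypothetical names used only for this sketch, which assumes a current Python 3 interpreter (the draft notes that parts of PEP 3134 were not yet implemented in 3.0a2).

```python
class ConfigError(Exception):
    """Hypothetical application error; derives from Exception."""


def parse_port(raw):
    try:
        return int(raw)
    except ValueError as exc:            # "as" replaces the old comma form
        # raise Class(args) replaces "raise Class, args";
        # "from exc" records the original error as __cause__.
        raise ConfigError("invalid port: {0!r}".format(raw)) from exc


try:
    parse_port("eighty")
except ConfigError as err:
    # The name bound by "as" is cleared when the block ends,
    # so keep a separate reference if it is needed afterwards.
    caught = err
```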
Other Language Changes
Here are most of the changes that Python 3.0 makes to the core Python
language and built-in functions.
- Removed backticks (use repr() instead).
- Removed <> (use != instead).
- != now returns the opposite of ==, unless == returns
NotImplemented.
- as and with are keywords.
- True, False, and None are keywords.
- PEP 237: long renamed to int. That is, there is only one
built-in integral type, named int; but it behaves like the old
long type, with the exception that the literal suffix L is
neither supported by the parser nor produced by repr() anymore.
sys.maxint was also removed since the int type has no maximum value
anymore.
- PEP 238: int division returns a float.
- The ordering operators behave differently: for example, x < y where x
and y have incompatible types raises TypeError instead of returning
a pseudo-random boolean.
- __getslice__() and friends have been removed. The syntax a[i:j] now translates
to a.__getitem__(slice(i, j)) (or __setitem__() or
__delitem__(), depending on context).
- PEP 3102: Keyword-only arguments. Named parameters occurring after *args
in the parameter list must be specified using keyword syntax in the call.
You can also use a bare * in the parameter list to indicate that you don’t
accept a variable-length argument list, but you do have keyword-only
arguments.
- PEP 3104: nonlocal statement. Using nonlocal x you can now
assign directly to a variable in an outer (but non-global) scope.
- PEP 3111: raw_input() renamed to input(). That is, the new
input() function reads a line from sys.stdin and returns it with
the trailing newline stripped. It raises EOFError if the input is
terminated prematurely. To get the old behavior of input(), use
eval(input()).
- xrange() renamed to range(), so range() no longer produces a list;
it returns a lazy iterable that yields integers when iterated over.
- PEP 3113: Tuple parameter unpacking removed. You can no longer write def
foo(a, (b, c)): .... Use def foo(a, b_c): b, c = b_c instead.
- PEP 3114: .next() renamed to __next__(), new builtin next() to
call the __next__() method on an object.
- PEP 3127: New octal literals; binary literals and bin(). Instead of
0666, you write 0o666. The oct() function is modified
accordingly. Also, 0b1010 equals 10, and bin(10) returns
"0b1010". 0666 is now a SyntaxError.
- PEP 3132: Extended Iterable Unpacking. You can now write things like a, b,
*rest = some_sequence. And even *rest, a = stuff. The rest object
is always a list; the right-hand side may be any iterable.
- PEP 3135: New super(). You can now invoke super() without
arguments and the right class and instance will automatically be chosen. With
arguments, its behavior is unchanged.
- zip(), map() and filter() return iterators.
- string.letters and its friends (string.lowercase and
string.uppercase) are gone. Use string.ascii_letters
etc. instead.
- Removed: apply(), callable(), coerce(), execfile(),
file(), reduce(), reload().
- Removed: dict.has_key() – use the in operator instead.
- exec() is now a function.
- The __oct__() and __hex__() special methods are removed –
oct() and hex() use __index__() now to convert the argument
to an integer.
- Support is removed for __members__ and __methods__.
- Renamed the boolean conversion C-level slot and method: nb_nonzero is now
nb_bool and __nonzero__() is now __bool__().
- Removed sys.maxint. Use sys.maxsize.
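Several of the language changes listed above can be exercised in a few lines. This is only a sketch: the function and class names (connect, make_counter, Base, Child) are hypothetical, and the snippet assumes a current Python 3 interpreter.

```python
# PEP 3102: parameters after a bare * must be passed by keyword.
def connect(host, *, timeout=10):
    return (host, timeout)

try:
    connect("example.org", 5)
except TypeError:
    keyword_required = True
else:
    keyword_required = False


# PEP 3104: nonlocal rebinds a variable in an enclosing function scope.
def make_counter():
    count = 0

    def increment():
        nonlocal count
        count += 1
        return count

    return increment

counter = make_counter()
counts = [counter(), counter(), counter()]

# PEP 3132: the starred name always receives a list.
first, *rest = range(5)
*head, last = "abc"

# PEP 3114: the next() builtin calls __next__() on an iterator.
it = iter([10, 20])
first_item = next(it)

# PEP 3127: new octal and binary literals; oct() and bin() match them.
octal_value = 0o666
binary_text = bin(10)
octal_text = oct(8)

# range() is lazy; wrap it in list() when a real list is needed.
lazy = range(3)
materialized = list(lazy)


# PEP 3135: super() without arguments inside a method.
class Base:
    def greet(self):
        return "base"

class Child(Base):
    def greet(self):
        return "child+" + super().greet()

greeting = Child().greet()
```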
Optimizations
- Detailed changes are listed here.
The net result of the 3.0 generalizations is that Python 3.0 runs the pystone
benchmark around 33% slower than Python 2.5. There’s room for improvement; we
expect to be optimizing string and integer operations significantly before the
final 3.0 release!
New, Improved, and Deprecated Modules
As usual, Python’s standard library received a number of enhancements and bug
fixes. Here’s a partial list of the most notable changes, sorted alphabetically
by module name. Consult the Misc/NEWS file in the source tree for a more
complete list of changes, or look through the Subversion logs for all the
details.
- The cPickle module is gone. Use pickle instead. Eventually
we’ll have a transparent accelerator module.
- The imageop module is gone.
- The audiodev, Bastion, bsddb185, exceptions,
linuxaudiodev, md5, MimeWriter, mimify,
popen2, rexec, sets, sha, stringold,
strop, sunaudiodev, timing, and xmllib modules are
gone.
- The bsddb module is gone. It is being maintained externally
with its own release schedule better mirroring that of BerkeleyDB.
See http://www.jcea.es/programacion/pybsddb.htm.
- The new module is gone.
- The functions os.tmpnam(), os.tempnam() and os.tmpfile()
have been removed in favor of the tempfile module.
- The tokenize module has been changed to work with bytes. The main
entry point is now tokenize.tokenize(), instead of generate_tokens.
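As a sketch of the bytes-based tokenize interface mentioned in the last item, the snippet below feeds tokenize.tokenize() a readline callable that returns bytes. The source line is made up, and the token attributes used here (type, string) are those of current Python 3 releases and may differ from the 3.0 alphas.

```python
import io
import tokenize

source = b"total = price * 2\n"

# tokenize.tokenize() expects a readline callable that returns bytes.
tokens = list(tokenize.tokenize(io.BytesIO(source).readline))

# The first token reports the detected source encoding.
encoding = tokens[0].string

# Collect the identifiers that appear in the source line.
names = [tok.string for tok in tokens if tok.type == tokenize.NAME]
```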
Build and C API Changes
Changes to Python’s build process and to the C API include:
- PEP 3118: New Buffer API.
- PEP 3121: Extension Module Initialization & Finalization.
- PEP 3123: Making PyObject_HEAD conform to standard C.
- No more C API support for restricted execution.
- PyNumber_Coerce, PyNumber_CoerceEx, PyMember_Get,
and PyMember_Set C APIs are removed.
- The new C API PyImport_ImportModuleNoBlock works like
PyImport_ImportModule but won’t block on the import lock (it returns
an error instead).
Port-Specific Changes
Platform-specific changes go here.
Other Changes and Fixes
As usual, there were a bunch of other improvements and bugfixes
scattered throughout the source tree. A search through the change
logs finds there were XXX patches applied and YYY bugs fixed between
Python 2.6 and 3.0. Both figures are likely to be underestimates.
Some of the more notable changes are:
Porting to Python 3.0
This section lists previously described changes that may require
changes to your code:
- Everything is all in the details!
- Developers can include intobject.h after Python.h for
some PyInt_ aliases.
Acknowledgements
The author would like to thank the following people for offering
suggestions, corrections and assistance with various drafts of this
article: Georg Brandl.