I’m probably missing something here, but after searching I couldn’t find an answer.
I’ve explored quite a few Python projects, and one thing I keep noticing is that the majority of them continue to use the % operator for formatting strings rather than the newer, recommended .format() method. Is there a reason for this? It seems like a trivial change, unless I’m missing something entirely.
For example:
# count how many times the % operator technique is used
find . -name "*.py" -exec grep -HE "\"[^\"]+\"\s%\s\w+|'[^']+'\s%\s\w+" {} \; | wc -l
# and the same for format()
find . -name "*.py" -exec grep -HE "\w+\.format\(" {} \; | wc -l
# Results:
#
#            % operator   format()
# iPython           670         63
# Django            977          8
# Tornado            91          0
# requests           25          1
No real reason for this question, just curious.
Cheers guys!
Three extremely strong inclinations come together to produce this effect:
- Comfort: “I know how to do this the old way, learning the new way would be more effort.”
- Effort/effect trade-off: “Is the old method gone? Is it even deprecated? No? Then there is no business case whatsoever for changing this. Create new code instead.”
- Safety: “Are you sure the proposed new way of doing things is exactly equivalent? No? Then better leave the old code as it is – you might introduce defects.”
All three are, in fact, quite sensible (the first one least so, since continual learning is what information workers should thrive on).
(Edit: As pointed out below there is a fourth point: not knowing that the new method even exists! This is less sensible, but might actually be the most common one.)
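To make the safety point concrete, here is a minimal sketch of a few cases where a mechanical rewrite from % to .format() is not exactly equivalent. The values are made up for illustration and are not taken from any of the projects mentioned in the question.

```python
# Illustrative values only (assumptions, not from any real project).
point = (1, 2)

# 1. A tuple on the right of % is treated as the argument list, so a
#    single %s fails; format() simply renders the tuple.
try:
    "%s" % point
    percent_tuple = "no error"
except TypeError as exc:
    percent_tuple = "TypeError: %s" % (exc,)
format_tuple = "{0}".format(point)

# 2. %d quietly truncates a float; the 'd' format spec rejects it.
percent_float = "%d" % 3.7
try:
    "{0:d}".format(3.7)
    format_float = "no error"
except ValueError:
    format_float = "ValueError"

# 3. Literal braces are plain text for %, but must be doubled for format().
percent_braces = "{%s}" % "x"
format_braces = "{{{0}}}".format("x")

print(percent_tuple)
print(format_tuple)
print(percent_float, format_float)
print(percent_braces, format_braces)
```

None of these differences is hard to handle, but each one has to be checked at every call site, which is where the risk of introducing defects comes from.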
In addition to the other reasons, I would add backward compatibility and consistency. Often you need to write scripts that can run on other people’s computers, and those people may be unwilling or unable to upgrade to the latest and greatest Python. So you write to the lowest common denominator. By the time enough of your users have upgraded to a sufficiently recent version of Python, you have probably forgotten which features appeared in which version. Unless they are exceptionally valuable features, it may well not be worth the time to research, consider and decide whether it’s now safe to use each feature. The format method is a nice API to use in new code, but the case for introducing it to a code-base that already uses % extensively is not so strong.
When developing software for your own use, it’s often to your benefit to use newer versions of your libraries and development tools. However, the projects you’re looking at are libraries and development tools meant to be shared with the community. In these cases, you want to be conservative in the features you require in order to get the broadest installed base & get the most people involved in the project.
The format() method wasn’t added until the release of Python 3.0/2.6 in 2008. That seems like a long time ago, but keep in mind that people seldom use the first release of software. Let’s put this in the context of when it became available in two of the more ‘serious’ Linux distros. RHEL 5.x still uses Python 2.4 and is supported until 2017; 6.x has Python 2.6, but that wasn’t released until 2010. In the Debian camp they didn’t have Python 2.6 until the 6.x series, which became stable in 2011.
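As a rough illustration of how version support affects what you can write, the sketch below shows the same message formatted three ways. The variable names and values are hypothetical. On Python 2.4 and 2.5 strings have no format() method at all, and the auto-numbered "{}" fields only arrived in 2.7 and 3.1, so code targeting 2.6 has to spell out the indices.

```python
import sys

# Hypothetical values, for illustration only.
name, count = "spam", 3

# The % operator works on every Python 2.x and 3.x release.
old_style = "%s appears %d times" % (name, count)

# Explicit positional indices work wherever str.format() exists (2.6+, 3.0+).
indexed = "{0} appears {1} times".format(name, count)

# Auto-numbered fields need 2.7+ or 3.1+; on 2.6 and 3.0 they raise ValueError.
if sys.version_info >= (2, 7) and sys.version_info[:2] != (3, 0):
    auto = "{} appears {} times".format(name, count)
else:
    auto = indexed  # fall back to the form that older interpreters accept

print(old_style)
print(indexed)
print(auto)
```

A library that still has to run on a distribution's stock interpreter ends up with the first form simply because it is the only one that works everywhere.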
So, while it might make sense, if you were starting a project like Django (originally released in 2005) today, to standardize on the new method, there’s very little value in going back and changing all 1000 occurrences of the old method without it having been formally deprecated. It makes work, creates the possibility of introducing bugs & needlessly increases the system requirements.
In addition to @KilianFoth’s answer: porting a big project to a newer software version already involves a great deal of work, and changing a usage that isn’t even deprecated has the lowest priority in the upgrade process.
They are probably using the new format in parts that were completely rewritten and leaving the existing usage alone in the rest.