Thursday 7 April 2011

“Should I switch to Python?”

Rich has recently been considering switching to the Python programming language.  Currently, Matlab is the language of choice in his department for rapid development and prototyping of code.  It’s very good at this, but Mathworks (the company that produces Matlab) has been tinkering with the licensing terms, leading to hassles where none should exist.  This is very frustrating and leads to the thought that it might be nice to use a free language where this will no longer be an issue.

But of course things are not quite that straightforward.  Matlab is used for good reason – it’s very good at what it does.  So is it worth the effort to stop using Matlab and instead learn to use Python?  In this article we discuss some of the things that’ll need to be considered.

Why Python?
The first question is why, out of all the programming languages that exist, we should be considering Python.  The bulk of the reasoning is contained in the sections below, but the starting point is that Python has a good reputation for being nice to work with, it’s already used in some areas of science (suggesting it might be a sensible language to consider), and it has a wide community of users (including some big ones such as Google), so there should be good community support.  So, this looks superficially promising.  What about the specifics?

It’s free…
First up, Python is free.  So there are no licence problems and no need to find the money to pay for it.  This does mean that there isn’t a company whose raison d’être is to build new functionality for Python, but there is an active community helping to develop it, so that’s probably not too much of a problem.

What do I need it for?
This is a key question when deciding whether to learn a new language.  If you’re anything like us, you’re attracted to languages because you can do cool things with them, but you should be careful that they are the right cool things for your needs.  In this case, Rich needs a language for building prototype implementations of statistical modelling tools.  So it needs to be fast to code in, object orientation would be desirable, and lots of scientific library support is vital.  Flat-out processing speed is a nice bonus, but is less essential as Rich is happy to recode in C++ (or use a bigger computer) if he needs to.

Library support
For scientific programming, having the right libraries is vital.  We need to generate plots, process data, invert matrices, perform Fast Fourier Transforms and all sorts of specialist things like that.  Libraries for these tasks exist for various programming languages, so it’s sensible to check that your chosen language has them.  Python scores well on this count because of packages such as SciPy, Biopython, NumPy and matplotlib.
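To give a flavour of what this looks like in practice, here is a minimal sketch of two of the tasks mentioned above (inverting a matrix and taking a Fast Fourier Transform) using NumPy.  The helper names invert_matrix and dominant_frequency, the example matrix and the 5 Hz test signal are our own inventions for illustration, not anything from Rich's work.

```python
import numpy as np


def invert_matrix(a):
    """Return the inverse of a square, non-singular matrix."""
    return np.linalg.inv(a)


def dominant_frequency(signal, sample_rate):
    """Return the frequency (in Hz) with the largest FFT magnitude."""
    # rfft is the FFT for real-valued input; it returns only the
    # non-negative frequency components.
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    spectrum[0] = 0.0  # ignore the constant (zero-frequency) term
    return freqs[np.argmax(spectrum)]


# Hypothetical example data: a small 2x2 matrix.
a = np.array([[4.0, 7.0],
              [2.0, 6.0]])
a_inv = invert_matrix(a)

# Hypothetical example signal: one second of a 5 Hz sine wave,
# sampled at 100 Hz.
sample_rate = 100
t = np.arange(0, 1, 1.0 / sample_rate)
signal = np.sin(2 * np.pi * 5 * t)
peak = dominant_frequency(signal, sample_rate)

print("Inverse matrix:")
print(a_inv)
print("Dominant frequency (Hz):", peak)
```

Anyone coming from Matlab should find the array-based style fairly familiar, although details such as zero-based indexing will take some getting used to.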

Usability
This is always tricky to assess without using the language, but the received wisdom on the Web, backed up by the opinions of some of our colleagues, is that Python is extremely user-friendly.  Indeed, this is part of the stated design philosophy of Python.

Speed
For prototyping scientific code, computational speed is a bonus rather than a necessity.  At this stage, user time (for programming) is far more valuable than CPU time, so an interpreted language like Python is acceptable.  Comparative benchmarking between languages is notoriously hard (and task specific), but the impression we’ve got is that Python and Matlab are of roughly the same speed, and a couple of orders of magnitude slower than fully compiled languages like C++.  However, people are working on faster implementations of both Matlab and Python, and we probably won’t be losing out significantly by switching from Matlab to Python.
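Since benchmarks are so task specific, the most useful timing is probably one for your own task.  As a rough sketch, Python's standard timeit module can be used like this.  The function sum_of_squares is just a made-up stand-in for whatever you actually want to measure, and time_task is our own hypothetical helper.

```python
import timeit


def sum_of_squares(n):
    """A made-up stand-in task: a plain interpreted loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def time_task(func, arg, repeats=5):
    """Return the best wall-clock time (in seconds) over several runs."""
    # Taking the minimum reduces the influence of whatever else
    # the computer happens to be doing at the time.
    timings = timeit.repeat(lambda: func(arg), number=1, repeat=repeats)
    return min(timings)


elapsed = time_task(sum_of_squares, 100000)
print("Best of 5 runs: %.6f seconds" % elapsed)
```

The number you get will depend on your machine, so it only means something when compared against the same task timed in the same way in another language.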

What does everyone else use?
It’s very useful if you’re surrounded by experts in the language you’re using.  It’s also useful if your colleagues know the same languages as you, because they can pick up and use the things you write.  In the case of Rich’s department, many people use Matlab but almost no-one uses Python.  This is a downside.  Of course, someone has to be first whenever a change like this is made, but it would mean that Rich would be on his own to a certain degree.

A transferable skill…
It’s always prudent to be developing transferable skills, and experience with Python would certainly count as that, because it’s widely used in industry and the commercial world.  Matlab is also widely used, although perhaps more in science/engineering settings and less in places like the computing industry.  It’s probably true to say that both have their merits in this regard.

What about Octave?
Wouldn’t it be nice if there were just a free version of Matlab?  Well, there is (sort of):  GNU Octave.  This would be another good solution to Rich’s Matlab issues.  We’re discounting it here mainly because of the concern that it’s less well supported than Python, and also because it’s less of a transferable skill.  Neither of these reasons is a killer, however, so we wouldn’t try to dissuade anyone from going down the Octave route.
