def main() in Python considered harmful

I recently graded the first Python programming assignments in the course I'm teaching on Social and Computational Intelligence in the Computing and Software Systems program at University of Washington Bothell. Most of the students are learning Python as a second (or third) language, approaching it from the perspective of C++ and Java programming, the languages we use in nearly all our other courses. Both of those languages require the definition of a main() function in any source file that is intended to be run as an executable program, and so many of the submissions include the definition of a main() function in their Python scripts.

In reviewing some recurring issues from the first programming assignment during class, I highlighted this practice, and suggested it was unPythonistic (a la Code like a Pythonista). I recommended that the students discontinue this practice in future programming assignments because, unlike in C++ and Java, a function named main has no special meaning to the Python interpreter. Several students asked why they should refrain from this practice – i.e., what harm is there in defining a main() function? – and one sent me several examples of web sites with Python code including main() functions as evidence of its widespread use.

In my experience, the greatest benefit to teaching is learning, and the students in my classes regularly offer me opportunities to move out of my comfort zone and into my growth zone (and occasionally into my panic zone). I didn't have a good answer for why def main() in Python was a bad practice during that teachable moment … but after lingering in the growth zone for a while, I think I do now.

The potential problem with this practice is that any function defined at the top level of a Python module becomes part of the namespace for that module, and if the function is imported from that module into the current namespace, it will replace any function previously associated with the function name. This may lead to unanticipated consequences if it is combined with a practice of using wildcards when importing, e.g., from module import * (though it should be noted that wildcard imports are also considered harmful by Pythonistas).

I wrote a couple of simple Python modules – main1.py and main2.py – to illustrate the problem:

# main1.py
import sys

def main():
    print 'Executing main() in main1.py'
    print '__name__: {}; sys.argv[0]: {}\n'.format(__name__, sys.argv[0])
 
if __name__ == '__main__':
    main()

# main2.py
import sys

def main():
    print 'Executing main() in main2.py'
    print '__name__: {}; sys.argv[0]: {}\n'.format(__name__, sys.argv[0])
 
if __name__ == '__main__':
    main()

The two main() functions are identical except that one contains the string 'main1.py' and the other 'main2.py'. If either of these modules is executed from the command line, it executes its own main() function, printing the hard-coded string and the values of __name__ and sys.argv[0] (the latter of which holds the script name only when the module is executed from the command line; in the interactive session shown later it is an empty string).

$ python main1.py
Executing main() in main1.py
__name__: __main__; sys.argv[0]: main1.py

$ python main2.py
Executing main() in main2.py
__name__: __main__; sys.argv[0]: main2.py

When these modules are imported into the Python interpreter using wildcards, the effect of invoking the main() function depends on whichever module was imported most recently, because each wildcard import rebinds the name main.

$ python
Python 2.7.2 (v2.7.2:8527427914a2, Jun 11 2011, 15:22:34)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> locals()
{'__builtins__': <module '__builtin__' (built-in)>, '__name__': '__main__', '__doc__': None, '__package__': None}
>>> from main1 import *
>>> locals()
{'__builtins__': <module '__builtin__' (built-in)>, '__package__': None, 'sys': <module 'sys' (built-in)>, '__name__': '__main__', 'main': <function main at 0x1004aa398>, '__doc__': None}
>>> main()
Executing main() in main1.py
__name__: main1; sys.argv[0]:

>>> from main2 import *
>>> locals()
{'__builtins__': <module '__builtin__' (built-in)>, '__package__': None, 'sys': <module 'sys' (built-in)>, '__name__': '__main__', 'main': <function main at 0x1004aa140>, '__doc__': None}
>>> main()
Executing main() in main2.py
__name__: main2; sys.argv[0]:

>>> exit()
$

Now, this may all be much ado about little, especially given the aforementioned caveat about the potential harm of using wildcards in import statements. I suppose if one were to execute the more Pythonistic selective imports, i.e., from main1 import main and from main2 import main, at least the prospect of the main() function being overridden might be more apparent. But people learning a programming language for the first time – er, or teaching a programming language for the second time – often use shortcuts (such as wildcard imports), and so I offer all this as a plausible rationale for refraining from def main() in Python.
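
The contrast between wildcard imports and module-qualified access can be sketched as follows. This is a minimal illustration under stated assumptions, not a transcript of the session above: it uses Python 3 syntax (the listings above are Python 2.7), and instead of reading main1.py and main2.py from disk it builds in-memory stand-ins with the same names through a hypothetical helper, make_module, so that the block is self-contained.

```python
import sys
import types

# Stand-in source for main1.py / main2.py; main() returns its message
# instead of printing so the result is easy to inspect.
TEMPLATE = (
    "def main():\n"
    "    return 'Executing main() in {name}.py'\n"
)


def make_module(name):
    """Hypothetical helper: build an in-memory module and register it."""
    module = types.ModuleType(name)
    exec(TEMPLATE.format(name=name), module.__dict__)
    sys.modules[name] = module  # lets `import main1` find the stand-in
    return module


make_module("main1")
make_module("main2")

# Wildcard imports: the second import silently rebinds the name `main`.
namespace = {}
exec("from main1 import *", namespace)
first = namespace["main"]()
exec("from main2 import *", namespace)
second = namespace["main"]()

# Module-qualified access: both functions stay reachable and unambiguous.
import main1
import main2

qualified = (main1.main(), main2.main())

print(first)
print(second)
print(qualified)
```

With plain import statements, each main() lives under its own module name, so neither one replaces the other.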

As part of my practice of leaning into discomfort and staying in the growth zone, I welcome any relevant insights or experiences that others may want to share.


Comments

10 responses to “def main() in Python considered harmful”

  1. Dom

    It’s a bit old, but Guido actually uses this pattern in an example of argument parsing from the command line. So, it can’t be that bad, right? 😉
    http://www.artima.com/weblogs/viewpost.jsp?thread=4829

  2. Joe McCarthy

    Hmm, that certainly is a reputable source.
    It seems like the original post, and several followup comments, are focused on parsing arguments, and I wonder whether the potential reduction in code complexity through the use of the argparse module would reduce the need to shuttle so much code off to a main() function … and thereby reduce the need for main().

  3. Yogi

    I’ve never considered main functions to be bad practice, nor have I come across anyone who’s argued it is (I don’t see where Code like a Pythonista says it’s unpythonic).
    In general, wildcard imports shouldn’t be used — your example is spot-on. But when they are, it’s the programmer’s responsibility to ensure they’re not shooting themselves in the foot.
    That said, a simple “fix” for this would be to name the main function _main instead. The single underscore tells Python not to include it in wildcard imports. If, for some reason, _main needs to be imported, it can be pulled in explicitly with a from foo import _main. It’s also convention to prefix a variable with a single underscore when you want to indicate that it’s “private”. Of course, nothing in Python is really private, it’s just a way of telling other coders, “hey, don’t mess with this”.
    There are also double underscore name mangling tricks, as well as __all__ to explicitly identify what’s allowed to be wildcard imported:
    http://docs.python.org/tutorial/classes.html#private-variables-and-class-local-references
    http://docs.python.org/tutorial/modules.html#importing-from-a-package

  4. Joe McCarthy

    Well, it wouldn’t be the first time I’ve unwittingly taken a minority position on an issue.
    I like the suggestion using a single underscore prefix in defining _main, as that would more properly designate it as a function not intended to be called from outside the module. Of course, that further highlights the contrivance of my example, where I explicitly call main().
    I suppose my lingering discomfort with main() – or _main() – has to do with the mapping of required practices in one language into another language where it is not required. One cannot compile a C++ or Java program without a main() function, but a Python program without main() will run just fine.
    Years ago, when I was programming in LISP at UMass, I never defined a main function nor did I ever encounter a LISP program written by anyone else that included such a function. This was before Java was invented, and before C++ had become so prominent, but C – which also requires main() – was in widespread use at the time.
    Peter Norvig has highlighted the many similarities between Python and LISP – noting “Python can be seen as a dialect of Lisp with ‘traditional’ syntax” – and I imagine it’s this LISP bias that may be responsible for my “we don’t need no stinkin’ main()” attitude.
    In any case, I’m grateful for the opportunity to continue my education here.

  5. Seanjensengrey

    Having a main function is not bad per se. I use them myself, although with a signature that makes main callable as a module-level function.

    if __name__ == "__main__":
        main(sys.argv[1:])
    

    It is really nice to be able to run Python code in-process without having to shell out. If main() assumes it takes its parameters from sys.argv, it effectively makes the code non-callable at the module level. Passing the name of the executable shouldn’t be part of the calling convention, hence the slice.
    Now instead of calling the entry point `main`, it could have a more descriptive name …

  6. Joe McCarthy

    I like the idea of being able to access the same capability whether the module is invoked as a script (e.g., from the command line) or from within the interpreter … and like using a more descriptive name (i.e., not “main”) for that capability even better.

  7. Jeff Tratner

    Just wanted to add another example: Google App Engine also uses main() explicitly for caching. I’d argue that the issue you’ve highlighted is less about main() as an unpythonic import from Java/C and more about how you should be careful with wildcard imports in Python, particularly in modules that don’t define __all__ (as the Python docs suggest).

  8. Joe McCarthy

    @Jeff: thanks for the additional data (or, perhaps, code) point

  9. Skwurlgrrl

    Here’s a good reason to wrap your code inside a function, whether it’s main or not: It pollutes the global namespace if it’s not inside a function or class, which (IMHO) is just bad practice, and it can lead to some expected situations where you accidentally reference something external to your class or function, instead of something internal.
    I personally use main due to history with C, but wrapping it in a function or class is still a good idea. I do the same with Perl – I’ve seen too much code with functions, classes, and top-level code mixed together.

  10. Skwurlgrrl

    unexpected
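
Several suggestions from the comments above (a leading underscore, __all__, and an entry point that takes its arguments explicitly) can be combined in one sketch. This is an illustration under stated assumptions rather than any commenter's actual code: it uses Python 3 syntax, and the module name greeter, its greet() function, and the in-memory construction through types.ModuleType are all hypothetical, chosen only to keep the block self-contained.

```python
import sys
import types

# Source of a hypothetical module, "greeter", written the way the
# comments above suggest.
MODULE_SOURCE = '''
import argparse
import sys

__all__ = ["greet"]  # the only name a wildcard import may copy


def greet(name):
    return "Hello, {}".format(name)


def _main(argv):
    """Entry point: takes its arguments explicitly, not from sys.argv."""
    parser = argparse.ArgumentParser(prog="greeter")
    parser.add_argument("name")
    args = parser.parse_args(argv)
    return greet(args.name)


if __name__ == "__main__":
    print(_main(sys.argv[1:]))  # drop the script name, as in comment 5
'''

# Build the module in memory and register it so that it can be imported.
module = types.ModuleType("greeter")
exec(MODULE_SOURCE, module.__dict__)
sys.modules["greeter"] = module

# A wildcard import copies only what __all__ lists: no _main, no sys.
namespace = {}
exec("from greeter import *", namespace)
print(sorted(n for n in namespace if not n.startswith("__")))

# The entry point is still available explicitly and runs in-process.
from greeter import _main

result = _main(["World"])
print(result)
```

Either the leading underscore or __all__ alone would keep the entry point out of a wildcard import; both appear here only to show each mechanism. Defining __all__ also keeps imported helper modules such as sys out of the importing namespace, which the locals() output in the session above shows happening without it.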