I made a presentation at Khan Academy on a few good things to know about Python's import system. Here's a writeup of that presentation (this is written mostly as a reference, sorry for the dryness).

The slides are available here.

Vocab Review: Package vs Module

A Python module is a single file that you can import, a Python package is a collection of modules and packages (SO answer).

For example, if we made a file named foo.py and then executed import foo, we would be importing a module.

Now if we were to make a directory named bar/ and place two files, one named __init__.py and the other named foo.py, in that directory, we could then execute import bar.foo and we would be importing the module foo from the bar package. The __init__.py is important here only because it tells Python that bar is a package and not just some random directory.

Note: Packages can be Zip Archives

Instead of containing a package within a directory, you can also store them in zip archives thanks to the zipimport standard libary shipped with Python since version 2.3. This is very common if you install packages as eggs, which are actually just fancy zip archives.

sys.path

sys.path is a list of locations Python will look for packages in when you use import or from. The list will be scanned front-to-back, and the first module it finds with the name you're looking for will be used (which can be problematic sometimes).

When you modify the PYTHONPATH environmental variable, you're indirectly adding to sys.path (it's hard for you to know ahead of time where in the list your paths will be placed though).

The site-packages Directory

The site-packages/ directory is where your third-party Python packages and modules are likely to live. pip and easy_install installs things into this directory, and virtualenv creates a new site-packages directory as one of its primary methods of seperating the packages in the virtual environment from the packages on the rest of your system.

The location of this directory varies from system to system (and of course is changed when you're in a virtual environment), but it's nearly always called site-packages/ so whenever I'm helping someone with import-related problems I usually just do a find on this name to get my bearings. I might also use the __file__ attribute (info on this attribute) of an imported module to find it as well.

The site standard module takes care of adding your site-packages directory to your path. The module also descends into it and handles any .pth files within.

.pth files

These files are read by the site module and the paths contained within are added to sys.path. If you visit your own site-packages directory on your installation you're sure to see many of these such files.

.pth files are very handy, and you should definitely consider using them instead of modifying your PYTHONPATH if you'd like the change to be permanent.

You can also add arbitrary Python into these files by prefixing an import statement to the line. This feature doesn't seem to be documented but is leveraged by a few libraries such as easy-install.

Extending the Import System

If your needs are complex, you might want to take advantage of the hooks added by PEP 302 to extend the import system. You can also extend or reimplement the site module yourself (I did this while creating Super Zippy). Finally, you can do super crazy things like overriding __import__. If you want to do it you can probably get Python to let you do it.


--

« more posts | follow me via twitter, or RSS