Including External Data in Python Unit Tests

January 13, 2009 | categories: Python, Testing

Normally we prefer unit tests to be completely isolated: wall off the databases, network connections, and even disk I/O. Sometimes, however, packaging sample data with a unit test is the only way to get good coverage, and reading that data normally requires disk access. If you decide this is necessary, ship the data set alongside your tests.

Your package structure might then resemble the following:

/package
  __init__.py
  module.py
  /tests
    __init__.py
    test_module.py
    /data
      data.txt

In your test, you’ll need to build the path to the data file relative to the tests package. To do this, use the package’s special __file__ attribute as a starting point. It holds the location of the package’s __init__ file on disk, so the path resolves correctly regardless of where the package is installed.

# test_module.py
import os
from package.module import ThingaMaBobber
from package.tests import __file__ as test_directory

def data_dir():
    return os.path.join(os.path.dirname(test_directory), 'data')

class TestWhenUsingSampleData(object):
    def setup(self):
        self.sample_data_path = os.path.join(data_dir(), 'data.txt')
        self.thingamabobber = ThingaMaBobber()

    def test_that_thingamabobber_confabulates(self):
        with open(self.sample_data_path) as f:
            assert self.thingamabobber.confabulate(f)
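
If you are on Python 3, a variation on the same idea is to anchor the path on the test module's own __file__ using pathlib, which avoids importing the tests package just to locate it. This is only a sketch: the helper name data_path is hypothetical, and it assumes the data directory sits next to the test module as in the layout above.

```python
from pathlib import Path


def data_path(anchor_file, name):
    """Return the path to `name` in the data/ directory beside anchor_file.

    anchor_file is expected to be a module's __file__ value.
    """
    return Path(anchor_file).resolve().parent / "data" / name


# Inside test_module.py this would be called as:
#     sample_data_path = data_path(__file__, "data.txt")
```

Because the result is a Path object, it can be passed straight to open() just like the string path in the example above.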

Remember, there is a cost to doing business like this. Once your test suite grows beyond a few thousand tests, disk I/O in unit tests can make running the entire suite painfully slow. Limit how often you do this, or move these tests into your integration suite.
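
One way to limit the disk access, sketched here under the assumption that the sample data is small, read-only text, is to read each file once and serve later requests from memory. The function name load_sample is hypothetical, and the temporary file below merely stands in for tests/data/data.txt.

```python
import functools
import os
import tempfile


@functools.lru_cache(maxsize=None)
def load_sample(path):
    """Read a sample file from disk once; later calls return the cached text."""
    with open(path) as f:
        return f.read()


# Demonstration with a throwaway file standing in for the real sample data.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "data.txt")
    with open(sample_path, "w") as f:
        f.write("sample\n")

    first = load_sample(sample_path)   # reads from disk
    os.remove(sample_path)             # the file is gone...
    second = load_sample(sample_path)  # ...but the cached text is still returned
```

If the code under test expects a file-like object rather than a string, the cached text can be wrapped in io.StringIO for each test. The trade-off is that every test shares the same cached contents, so this only suits data the tests never modify.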
