Including External Data in Python Unit Tests
January 13, 2009 | categories: Python, Testing
Normally we prefer unit tests to be completely isolated -- wall off the databases, network connections, and even disk I/O. However, sometimes packaging sample data along with a unit test is the only way to get good coverage, and that data normally requires disk access. If you decide this is necessary, provide the data set alongside your tests.
Your package structure might then resemble the following:
/package
    __init__.py
    module.py
    /tests
        __init__.py
        test_module.py
        /data
            data.txt
In your test, you’ll need to build the path to the data file relative to the test package. To do this, use the special __file__ attribute of the package as a starting point. It tells you where the package lives on disk regardless of where it is installed.
# test_module.py
import os

from package.module import ThingaMaBobber
from package.tests import __file__ as test_directory


def data_dir():
    return os.path.join(os.path.dirname(test_directory), 'data')


class TestWhenUsingSampleData(object):

    def setup(self):
        self.sample_data_path = os.path.join(data_dir(), 'data.txt')
        self.thingamabobber = ThingaMaBobber()

    def test_that_thingamabobber_confabulates(self):
        with open(self.sample_data_path) as f:
            assert self.thingamabobber.confabulate(f)
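If several test modules need sample data, the same idea can be pulled into a small helper. The following is only a sketch: the name data_path and its data_dirname parameter are made up for illustration and are not part of the package above. It assumes the data directory sits next to the module whose __file__ you pass in.

```python
import os


def data_path(module_file, filename, data_dirname="data"):
    """Return the path to `filename` inside a data directory that sits
    next to `module_file` (normally a module's __file__ value).

    `data_path` and `data_dirname` are hypothetical names for this sketch.
    """
    # abspath keeps the result usable even if __file__ was reported as a
    # relative path and the working directory changes later.
    base = os.path.dirname(os.path.abspath(module_file))
    return os.path.join(base, data_dirname, filename)
```

Inside test_module.py you could then write data_path(__file__, 'data.txt'), which avoids importing the tests package just to learn where it lives.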
Remember, there is a cost to doing business like this. When your test suite grows beyond a few thousand tests, disk I/O in unit tests can make running the entire suite painful. Try to limit how often you do this, or make these types of tests part of your integration suite.
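One way to keep disk-backed tests out of the fast suite is to make them opt-in. The sketch below uses the standard library's unittest.skipUnless rather than the plain test class shown earlier, and the RUN_DISK_TESTS environment variable is an invented name for this example. Your test runner may offer its own tagging mechanism that fits better.

```python
import os
import unittest

# Hypothetical opt-in switch; the variable name is an assumption for this sketch.
RUN_DISK_TESTS = os.environ.get("RUN_DISK_TESTS") == "1"


@unittest.skipUnless(RUN_DISK_TESTS,
                     "disk-backed test; set RUN_DISK_TESTS=1 to run")
class TestWhenUsingSampleData(unittest.TestCase):

    def test_sample_data_is_readable(self):
        # Placeholder body: a real test would open the packaged data file here.
        self.assertTrue(True)
```

With this in place, a normal run reports the class as skipped, and the integration run sets RUN_DISK_TESTS=1 to include it.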