Python Multiline Regex

January 13, 2009 | categories: Python

After pulling most of my hair out, I realized I had made a simple mistake in my Python regular expression usage. I had been trying to search a simple HTML document for some text, which happened to be broken across newlines, like this:

  <div>I'm some
               fancy text that needs
               to be found</div>

I tried my very simple pattern match using the re.MULTILINE flag to no avail. The solution ended up being:

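  import re
  # input_text holds the HTML document being searched.
  # re.DOTALL lets '.' match newlines, so '.*?' can span the line break.
  pattern = re.compile(r'some.*?fancy', re.DOTALL)
  match = pattern.search(input_text)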

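re.MULTILINE is not the same as re.DOTALL. MULTILINE only changes ^ and $ so that they match at the start and end of every line; it has no effect on the dot. Use DOTALL if you want your dot to match everything, including newlines. See more at this handy dandy reference site.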
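
To see the difference side by side, here is a small sketch that runs the same dot-based pattern with no flag, with re.MULTILINE, and with re.DOTALL. The sample string and the variable names are made up for illustration; they only imitate the HTML snippet above.

```python
import re

# Hypothetical sample input, modelled on the HTML snippet in the post.
input_text = "<div>I'm some\n     fancy text that needs\n     to be found</div>"

# No flag: '.' stops at the newline, so the pattern cannot span two lines.
no_flag = re.search(r"some.*fancy", input_text)

# re.MULTILINE only lets '^' and '$' match at each line boundary;
# '.' still does not match '\n', so this fails as well.
multiline = re.search(r"some.*fancy", input_text, re.MULTILINE)

# re.DOTALL lets '.' match '\n', so the match can cross the line break.
dotall = re.search(r"some.*fancy", input_text, re.DOTALL)

# What MULTILINE is actually for: anchoring at the start of every line.
line_starts = re.findall(r"^ *(\w+)", input_text, re.MULTILINE)

print(no_flag)          # None
print(multiline)        # None
print(dotall.group(0))  # 'some', a newline, five spaces, then 'fancy'
print(line_starts)      # ['fancy', 'to']
```

Worth knowing: \s matches newlines with or without any flag, so a pattern built only on \s* does not need DOTALL. The flag only matters once a dot has to cross a line break.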

Doh!

4 Responses

  1. Tom L Says:

    Thanks, Mike. I was trying re.MULTILINE over and over wondering what I was missing.

    And google found your post. Bless you.

  2. Steve Says:

    Ditto. Thanks!

  3. Etan Says:

    many thanks guy!

  4. Egon Says:

    Many thanks!
    Good example, good explanation.
