Published on

[LeetCode 10] Regular Expression Matching

Problem

Given an input string s and a pattern p, implement regular expression matching with support for . and * where:

  • . Matches any single character.​​​​
  • * Matches zero or more of the preceding element.

The matching should cover the entire input string (not partial).

Example 1:

Input: s = "aa", p = "a"
Output: false
Explanation: "a" does not match the entire string "aa".

Example 2:

Input: s = "aa", p = "a*"
Output: true
Explanation: '*' means zero or more of the preceding element, 'a'. Therefore, by repeating 'a' once, it becomes "aa".

Example 3:

Input: s = "ab", p = ".*"
Output: true
Explanation: ".*" means "zero or more (*) of any character (.)".

Example 4:

Input: s = "aab", p = "c*a*b"
Output: true
Explanation: c can be repeated 0 times, a can be repeated 1 time. Therefore, it matches "aab".

Example 5:

Input: s = "mississippi", p = "mis*is*p*."
Output: false

Constraints:

  • 1 <= s.length <= 20
  • 1 <= p.length <= 30
  • s contains only lowercase English letters.
  • p contains only lowercase English letters, ., and *.
  • It is guaranteed for each appearance of the character *, there will be a previous valid character to match.

Thoughts

Recursive

If there were no *, we can simply check from left to right if each character of the text matches the pattern.

def match(text, pattern):
    if not pattern: return not text
    first_match = bool(text) and pattern[0] in {text[0], '.'}
    return first_match and match(text[1:], pattern[1:])

If a * is present in the pattern, it will be in the second position pattern[1]. Then, we may ignore this part of the pattern, or delete a matching character. If we have a match on the remaining strings after any of these operations, then the initial inputs matched.

class Solution(object):
    def isMatch(self, text, pattern):
        if not pattern:
            return not text

        first_match = bool(text) and pattern[0] in {text[0], '.'}

        if len(pattern) >= 2 and pattern[1] == '*':
            return (self.isMatch(text, pattern[2:]) or
                    first_match and self.isMatch(text[1:], pattern))
        else:
            return first_match and self.isMatch(text[1:], pattern[1:])

Dynamic Programming

As the problem has optimal substructure, it is natural to cache intermediate results. Because calls will only ever be made to match(test[i:], pattern[j:]), we use dp(i, j) to handle those calls instead, saving us expensive string-building operations and allowing us to cache the intermediate results

Top-Down

class Solution(object):
    def isMatch(self, text, pattern):
        memo = {}
        def dp(i, j):
            if (i, j) not in memo:
                if j == len(pattern):
                    ans = i == len(text)
                else:
                    first_match = i < len(text) and pattern[j] in {text[i], '.'}
                    if j+1 < len(pattern) and pattern[j+1] == '*':
                        ans = dp(i, j+2) or first_match and dp(i+1, j)
                    else:
                        ans = first_match and dp(i+1, j+1)

                memo[i, j] = ans
            return memo[i, j]

        return dp(0, 0)
function isMatch(s: string, p: string): boolean {
    // the memo object should be a global one
    const memo = {};

    // i, j are indexes for s & p
    const dp = (i, j) => {
        // make unique string key here
        const key = i + ',' + j

        // if key is not in memo, update it
        if(!(key in memo)) {
            let res;

            // final result
            // if we are at the pattern end, i should also be at string end for it to match
            if (j === p.length) {
                res = i === s.length
            } else {
                // check valid i & string match on pattern
                const firstMatch = i < s.length && [s[i], '.'].includes(p[j])

                // if we have '*', either ignore previous, or count once
                if(j + 1 < p.length && p[j + 1] === '*') {
                    res = dp(i, j + 2) || (firstMatch && dp(i + 1, j))
                } else {
                    // if we dont have the '*' check firstMatch & remaining
                    res = firstMatch && dp(i + 1, j + 1);
                }
            }
            memo[key] = res;
        }

        // return the value with key
        return memo[key];
    };

    // let's start at the beginning
    return dp(0, 0);
}

Bottom-Up

class Solution(object):
    def isMatch(self, text, pattern):
        dp = [[False] * (len(pattern) + 1) for _ in range(len(text) + 1)]

        dp[-1][-1] = True
        for i in range(len(text), -1, -1):
            for j in range(len(pattern) - 1, -1, -1):
                first_match = i < len(text) and pattern[j] in {text[i], '.'}
                if j+1 < len(pattern) and pattern[j+1] == '*':
                    dp[i][j] = dp[i][j+2] or first_match and dp[i+1][j]
                else:
                    dp[i][j] = first_match and dp[i+1][j+1]

        return dp[0][0]

Reference