2014年10月26日星期日

[Leetcode] Wildcard Matching

Implement wildcard pattern matching with support for '?' and '*'.
'?' Matches any single character.
'*' Matches any sequence of characters (including the empty sequence).

The matching should cover the entire input string (not partial).

The function prototype should be:
bool isMatch(const char *s, const char *p)

Some examples:
isMatch("aa","a") → false
isMatch("aa","aa") → true
isMatch("aaa","aa") → false
isMatch("aa", "*") → true
isMatch("aa", "a*") → true
isMatch("ab", "?*") → true
isMatch("aab", "c*a*b") → false
At first, I used KMP with '?' and greedy method to solve this problem, the code is very hard to write. Later, I found that we can use recursion to write the code, but I think the code is a little bit difficult to understand. At last, I tried use dp to solve this problem, but I got TLE many times on big data. After checking other's solution, I found that adding a pruning could considerably improve the performance. The key point is as follow:

  1. we use dp[i][j] to represent if s[0...i-1] and p[0...j-1] could match or not, here I add an additional row and column in dp to represent null string to ease some case.
  2. if s[i-1] == p[j-1] or p[j-1] == '?', then as long as dp[i-1][j-1] == true, dp[i][j] is true else is false
  3. the most difficult part is when p[j-1] = '*', we could match it with nothing, then the result would be the same with dp[i][j-1], or we could match it with s[i-1], wich the result would be the same with dp[i-1][j-1], or we could match it multi characters, which is given by dp[i-1][j], here dp[i-1][j] have contains all possible multi matching.
We add an O(n) pruning before we start dp, we count the non '*' character in p, if the number of non '*' is greater than the length of s, then we definitely could not get a match!
    public boolean isMatch(String s, String p) {
        int m = s.length();
        int n = p.length();
        
        int cnt = 0;
        for(int i = 0; i < n; i++) 
            cnt = cnt + (p.charAt(i) == '*' ? 0 : 1);
        if(cnt > m) return false;
        
        boolean[][] dp = new boolean[m+1][n+1];
        for(int i = 0; i <= m; i++) {
            for(int j = 0; j <= n; j++) {
                if(i == 0 && j == 0)
                    dp[i][j] = true;
                else if(i == 0 && j != 0) {
                    if(dp[i][j-1] && p.charAt(j-1) == '*')
                        dp[i][j] = true;
                }
                else if(i != 0 && j == 0)
                    dp[i][j] = false;
                else {
                    if(dp[i-1][j-1] && (s.charAt(i-1) == p.charAt(j-1) || p.charAt(j-1) == '?'))
                        dp[i][j] = true;
                    else if(p.charAt(j-1) == '*')
                        dp[i][j] = dp[i-1][j] || dp[i-1][j-1] || dp[i][j-1];
                    else 
                        dp[i][j] = false;
                }
            }
        }
        return dp[m][n];
    }

1 条评论: