balanced delimiters

June 2014

This week's Coding for Interviews newsletter, as usual, included a challenge. This week's was to parse a string, determining whether all the delimiters (parentheses, curly, and square braces) it contains are balanced.

For example, the string Math.min(3, 7) is balanced, while Math.min(3, )) is not.

Since I'm a big fan of Lisp, I'm writing my solution in Arc, a modern Lisp.

As we all know, Lisp has no iteration, just recursion.

Well, not quite. Although it allows for both techniques, recursion is common in the Lisp community. Let's look at the problem of finding balanced delimiters in strings, and compare recursive and iterative solutions. First, the recursive method:

(def is-balanced (str)
     (no (find-unbalanced (coerce str 'cons))))

(def find-unbalanced (char-list (o delims (obj #\( #\)
                                               #\[ #\]
                                               #\{ #\})))
     "Returns a list, in order, of all unbalanced delimiters."
     (when char-list
       (with (cur (car char-list)
              unbalanced-to-right (find-unbalanced (cdr char-list)))
             (if (mem cur (keys delims))
                 (if (is delims.cur
                         (car unbalanced-to-right))
                     (cdr unbalanced-to-right)
                   (cons cur unbalanced-to-right))
               (mem cur (vals delims))
               (cons cur unbalanced-to-right)
               unbalanced-to-right))))

This method takes the first delimiter it finds, and if it's an open one, makes sure it matches the first unmatched one in the rest of the string. We calculate this by recursing further down into the string.

If the delimiter is a close delimiter, then we pass it back to the calling code, so it can be matched further up in the call tree.

At the end, we simply return any unmatched delimiters. If there are none, we know the string is properly balanced.

But how do we know it works? Let's write some unit tests, using my test framework for Arc. The tests are fairly straightforward.

(suite delimiters
       empty (assert-t (is-balanced ""))
       parens (assert-t (is-balanced "()"))
       parens-then-curlies (assert-t (is-balanced "(){}"))
       nested-parens-square-curlies (assert-t (is-balanced "([{}])"))
       mixed-parens-brackets (assert-nil (is-balanced "([)]"))
       open-paren (assert-nil (is-balanced "("))
       close-paren (assert-nil (is-balanced ")"))
       square-dont-match-curly (assert-nil (is-balanced "([})")))

So now the iterative version.

(def is-balanced (str)
     (with (char-list (coerce str 'cons)
            delims (obj #\( #\)
                        #\[ #\]
                        #\{ #\})
            unmatched-openings nil)
           (errsafe (do (each char char-list
                              (if (mem char (keys delims))
                                  (push char unmatched-openings)
                                (mem char (vals delims))
                                (unless (is char
                                            (delims (pop unmatched-openings)))
                                  (err 'found-it))))
                        (no unmatched-openings)))))

We just make a list of opening delimiters we haven't matched yet, and iterate through, removing them when we find a match. If we ever find a close delimiter that doesn't match the most recent open delimiter, we know the string is unbalanced.

There's a little weirdness, because Arc doesn't have a way to return early from a function, so we have to create an error, then catch it with errsafe, which returns nil if an error occurs.

Overall, I prefer the recursive version, but that is largely because of the better control flow -- everything else is a little bit simpler in the iterative version. And even though it's not required by the newsletter prompt, both versions handle strings with characters other than delimiters.

If you want to hear when I publish more, sign up for my mailing list!