Resource Library

You’re invited to check out out all the different learning resources in the guide: problems and projects, former Google interview questions, online courses, education sites, videos, and more. You can search for your favorite topics.

Pit yourself against this advanced problem that requires you to write a program that can identify word sequences defined as word squares, and then return all word square sequences within a given list of words. It's a former Google interview question that will challenge your knowledge of algorithms, arrays, recursion and testing. Ready, set, go!

Take the challenge

Google

The Challenge

A “word square” is an ordered sequence of K different words of length K that, when written one word per line, reads the same horizontally and vertically. For example:

Copied
BALLAREALEADLADY

In this exercise you're going to create a way to find word squares.

First, design a way to return true if a given sequence of words is a word square.

Second, given an arbitrary list of words, return all the possible word squares it contains. Reordering is allowed.

For example, the input list

[AREA, BALL, DEAR, LADY, LEAD, YARD]

should output

[(BALL, AREA, LEAD, LADY), (LADY, AREA, DEAR, YARD)]

Finishing the first task should help you accomplish the second task.

Learning objectives

This problem gives practice with algorithms, recursion, arrays, progressively optimizing from an initial solution, and testing.

Solution explained, commentary

The straightforward solution for all permutations of length K is:

O(K^2 * N! / (N-K)!) = O(K^2 * nCr(N,K) * K!).

Involves either recursion or record-keeping but there are no “tricks”.

The optimal solution(s) can be arrived at through reasoning and progressively; there is no “ah ha” moment. Great practice at starting with a working but non-optimal solution and progressively make improvements.

Here are some great questions to ask as you solve the problem:

  • Is the input list sorted? -> no (but you can sort it, leading to an optimal solution)
  • Do all the words in the input have the same length? -> not necessarily. More importantly, how does thinking about this question help you?
  • Areas where people often struggle:

  • Not checking that all rows are length K.
  • Overflowing bounds: doing v[r][c] == v[c][r] before checking v[c].size().
  • Checking things twice (optimization: no need to check both i,j and j,i)
  • Not thinking about how to handle length 0 or 1 input.
  • Not including a word more than once: THAT, HASH, ASIA, FOOL (THAT could go at end). This is an important edge case to consider -- and solving for it makes the problem slightly more complicated.
  • An alternative solution:

    Running a backtracking algorithm with a trie T and an array P of k pointers.

    At any given point, we have filled in r rows and columns ( per the problem's constraint). And this point, we are left with a (k - r) sized square to fill. We could use the last (k - r) pointers in P to explore the trie.

    Say now that the running time of the algorithm is T(N, k), where N is the number of words given in the input. We could write

    T(N, k) = N * (T(N - 1, k - 1) + k), because for each word that we try out, we will have to adjust the k pointers, and there are N words in total. Once we have put in the first word, we are left with N - 1 words and a square of size (k - 1) to fill.

    Also, T(p, 1) = p * k^2, because if we are left with a square of size 1, and have p potential characters to fill there, we would fill them in one by one, and print out the entire word square. This will be where our recursion terminates.

    Code snippet for solution

    Copied
    #!/usr/bin/env pythonimport collectionsimport sysdef find_word_squares(words):    # Preprocess words: O(#words * word-length) time and space    words_by_letter_position = collections.defaultdict(set)    for word in words:        for index, letter in enumerate(word):            words_by_letter_position[(index,letter)].add(word)    # For each word, see if we can make a square.  O(#words * word-length^2/2)    # for speed, assuming that set intersection is ~O(1) for small sets.    # Worst-case storage is O(#words * word-length) for very very contrived    # 'words'.  Normal English words will result in much smaller storage demand;    # there is a practical maximum of ~170,000 English words.    for word in words:        # Initialize a set of incomplete possible squares; each item is an N-tuple        # of words that are valid but incomplete word squares.  We could also do        # depth-first via recursion/generators, but this approach is a little        # simpler to read top-to-bottom.        possible_squares = set([(word,)])    # As long as we have any incomplete squares:    while possible_squares:        square = possible_squares.pop()        # When matching an incomplete square with N words already present,        # we need to match any prospective words to the tuples formed by        # (N, Nth character in word 0), (N, Nth character in word 1), ...        # Only words which satisfy all of those restrictions can be added.        keys = [(i, square[i][len(square)]) for i in xrange(len(square))]        possible_matches = [words_by_letter_position[key] for key in keys]        for valid_word in set.intersection(*possible_matches):            valid_square = square + (valid_word,)            # Save valid square in 'ret' if it's complete, or save it as            # a work-to-do item if it's not.            if len(valid_square) == len(word):                yield valid_square            else:                possible_squares.add(valid_square)if __name__ == '__main__':    for square in find_word_squares(sys.argv[1:]):        print '
    '.join(square)        print
    Another solution

    https://discuss.leetcode.com/topic/60617/find-if-a-word-square-they-read-the-same-horizontally-and-vertically-example-ball-area-lead-lady-return-true-if-a-given-sequence-is-a-word-square