Skip to content

Implement a Swype Keyboard

Level: L3-L4 Topics: Strings, Subsequences, Dictionary Matching, Keyboard Layout

Problem Statement

Implement a function guessWord(input, dictionary) that powers a swipe-style (glide) keyboard.

The user places their finger on the first letter of a word and drags it across the keyboard to the last letter. The input string is the raw sequence of all letter keys the finger passed over during the swipe gesture. Your function should return all words from the dictionary that could plausibly match this gesture.

A dictionary word w matches the input if:

  • The first character of w equals the first character of input.
  • The last character of w equals the last character of input.
  • The remaining characters of w appear as a subsequence of input (in order, but not necessarily contiguous).

Background & Constraints

  • The dictionary can be large (hundreds of thousands of words).
  • The input string can be long (the finger passes over many keys).
  • Multiple dictionary words may match the same input.
  • Initially, treat the problem as keyboard-layout agnostic -- you only have the raw string of letters.

Examples

Example 1:

input  = "rtyuioiuyt"
dictionary = ["root", "riot", "rut", "rot", "ruby"]

guessWord("rtyuioiuyt", dictionary) → ["riot", "rut", "rot"]

Explanation:

  • "riot" — starts with 'r', ends with 't', and 'i', 'o' appear as a subsequence in the middle of the input (r-t-y-u-i-o-i-u-y-t).
  • "rut" — 'u' appears between 'r' and 't' (r-t-y-u-i-o-i-u-y-t).
  • "rot" — 'o' appears between 'r' and 't' (r-t-y-u-i-o-i-u-y-t).
  • "root" is excluded under strict subsequence rules: it needs two 'o's in the middle, but the input contains only one 'o'.
  • "ruby" is excluded because it ends in 'y', not 't'.

Example 2 — repeated letters (corner case):

input = "rot"
dictionary = ["rot", "root"]

guessWord("rot", dictionary) → ["rot", "root"]   # under the relaxed rule
                            → ["rot"]            # under strict subsequence

"rot" is a direct match. Whether "root" also matches depends on your matching model: a swipe gesture only registers one 'o' even if the finger pauses on the key, so "root" requires two 'o's in the input but only sees one. Discuss this with your interviewer. Two reasonable models:

  • Strict subsequence. Repeated letters in the word need repeated letters in the input. "root" rejected.
  • Relaxed (key-press collapsing). A single key press can map to one or more consecutive identical letters, on the theory that the finger lingered on the key. "root" accepted.

Your code should make this choice explicit (e.g. behind a collapse_repeats flag).

Hints & Common Pitfalls

  1. Start with the subsequence check. For each dictionary word, verify that the word is a subsequence of the input (with matching first and last characters). This is O(len(input)) per word.

  2. Brute force is O(D * L) where D is the dictionary size and L is the input length. This may be acceptable depending on constraints, but discuss optimizations.

  3. Pruning by first/last character immediately eliminates a large portion of the dictionary. Group dictionary words by their (first_char, last_char) pair for O(1) lookup of candidates.

  4. Trie-based optimization: Build a trie from the dictionary. Walk the trie while scanning the input string. At each input character, advance any trie nodes that have a matching child. This processes all dictionary words simultaneously.

  5. Be careful with repeated characters. Decide whether a single key press in the input can match multiple consecutive identical characters in a dictionary word (because the finger hovered over that key).

Follow-Up Questions

  1. How would you use additional information from the keyboard to improve accuracy? Consider: (a) the physical distance between consecutive keys in the input, (b) the time spent on each key, (c) the velocity of the finger.

  2. How would you rank multiple matching words? Think about word frequency, gesture shape similarity, and key proximity.

  3. If the dictionary is very large and queries are frequent, what data structures would you precompute? How does a trie compare to a hash-map-based approach?

  4. How would you handle the case where the user's finger slightly misses a key (e.g., hits 'e' instead of 'r')? What tolerance model would you use?