Problem

Qn - Vowel Spellchecker

Approach

First idea is to do it as naively as possible. Given each query, we will do a linear search on the wordlist. We first check if the query and the element in the wordlist is equals, then we will check if the lowercase of query and wordlist are the same. The last check is if there are vowel errors. To do so, we will extract the vowels check into a new helper function. In that helper function, we will replace all characters to lowercase, and all vowels to a special character like a "-", then we check if they are equals. This approach would bring us to where is the length of query and wordlist.

Second idea, in an attempt to make it in linear time, building on the naive solution of changing vowels to a special character. We can create two sets, one where all the words in the wordlist is all lowercased, and another where all the vowels are replaced with a "-". As we will need the return string to be in the same format as what is in the wordlist, we will need to maintain a mapping from the lowercased, and devoweled version to the actual version. We then iterate through the queries and lowercase the query and check against the lowercased set, then do the same for the devoweled. This would bring us to runtime and space, excluding the space required for the result array.

Well, I realised that that approach will not work, due to several problems. One is that we want to match the first word that is in the wordlist according to the rules that the question set, ie. exact matches first, then insensitive case matches, and lastly devoweled matches. Let’s take the following example.

wordlist = ["KiTe","kite","hare","Hare"]
queries = ["kite","Kite","KiTe","Hare","HARE","Hear","hear","keti","keet","keto"]


["kite","KiTe","KiTe","Hare","hare","","","KiTe","","KiTe"]

From the example above, we see that the first query, will return the second element in the wordlist as it’s an exact match, but our second query has to return the first element in the wordlist, as there is no exact match and we just return the first element in the wordlist that has a case insensitive match. In our previous approach, we only maintain one mapping from lowercased to the actual format in the wordlist. But there can be multiple similar words who has the same lowercase string. To only get the first, we need to somehow only set the value of the dictionary only if it has not been set. Oh, and also, before that we can’t just use a set for the lowered and devoweled and a mapping. We need a mapping for each of the lowered and devoweled strings. To do all that, there is something called a dict.setdefault(key, value) in python, that sets the key to be of that value only if the key has yet to be set. This will help ensure that we only get the first element that fits the devoweled or lowered operation in the wordlist.

Complexity

  • Time:
  • Space: Where n is length of wordlist and m is length of queries

Code

class Solution:
    def spellchecker(self, wordlist: List[str], queries: List[str]) -> List[str]:
        vowels = {'a','e','i','o','u'}
        perfectWords = set(wordlist)
        loweredWords = dict()
        devoweledWords = dict()
        result = []
 
        def devowel(word):
            return "".join("-" if c in vowels else c for c in word)
 
        # construct the dictionaries O(n)
        for word in wordlist:
            word_l = word.lower()
            loweredWords.setdefault(word_l, word)
            devoweledWords.setdefault(devowel(word_l), word)
 
        # find the correct words
        for query in queries:
            query_l = query.lower()
            query_devoweled = devowel(query_l)
            if query in perfectWords: result.append(query)
            elif query_l in loweredWords: result.append(loweredWords[query_l])
            elif query_devoweled in devoweledWords: result.append(devoweledWords[query_devoweled])
            else: result.append("")
 
        return result