Implement strstr() – return the index of the first occurrence of needle in haystack or -1

312 viewsProblem Solving

Implement strstr() – return the index of the first occurrence of needle in haystack or -1

Implement strStr().

Given two strings needle and haystack, return the index of the first occurrence of needle in haystack, or -1 if needle is not part of haystack.

Clarification:

What should we return when needle is an empty string? This is a great question to ask during an interview.

For the purpose of this problem, we will return 0 when needle is an empty string. This is consistent to C’s strstr() and Java’s indexOf().

Example 1:

Input: haystack = "hello", needle = "ll"
Output: 2

Example 2:

Input: haystack = "aaaaa", needle = "bba"
Output: -1

Farjanul Answered question July 27, 2022
0

O(nm) runtime, O(1) space – Brute force:

Difficulty: Easy, Frequency: High

Returns the index of the first occurrence of needle in haystack, or –1 if needle is not part of haystack.

There are known efficient algorithms such as Rabin-Karp algorithm, KMP algorithm, or the Boyer-Moore algorithm. Since these algorithms are usually studied in an advanced algorithms class, it is sufficient to solve it using the most direct method in an interview – The brute force method.

The brute force method is straightforward to implement. We scan the needle with the haystack from its first position and start matching all subsequent letters one by one. If one of the letters does not match, we start over again with the next position in the haystack.

Assume that n = length of haystack and m = length of needle, then the runtime complexity is O(nm).

Have you considered these scenarios?

  1. needle or haystack is empty. If needle is empty, always return 0. If haystack is empty, then there will always be no match (return –1) unless needle is also empty which 0 is returned.
  2. needle’s length is greater than haystack’s length. Should always return –1.
  3. needle is located at the end of haystack. For example, “aaaba” and “ba”. Catch

    possible off-by-one errors.

  4. needle occur multiple times in haystack. For example, “mississippi” and

    “issi”. It should return index 2 as the first match of “issi”.

  5. Imagine two very long strings of equal lengths = n, haystack = “aaa…aa” and needle = “aaa…ab”. You should not do more than n character comparisons, or else your code will get Time Limit Exceeded in OJ.

Below is a clean implementation – no special if statements for all the above scenarios.

Solution in Java

public int strStr(String haystack, String needle) {
   for (int i = 0; ; i++) {
for (int j = 0; ; j++) {
if (j == needle.length()) return i;
if (i + j == haystack.length()) return -1;
if (needle.charAt(j) != haystack.charAt(i + j)) break;
} }
}

Solution in Python

class Solution(object):
    def strStr(self, haystack, needle):
        lsh, lsn = len(haystack), len(needle)
        if lsn == 0:
            return 0
        next = self.makeNext(needle)
        i = j = 0
        while i < lsh:
            if j == -1 or haystack[i] == needle[j]:
                i += 1
                j += 1
                if j == lsn:
                    return i - lsn
            if i < lsh and haystack[i] != needle[j]:
                j = next[j]
        return -1
     def makeNext(self, needle):
        ls = len(needle)
        next = [0] * ls
        next[0], i, j = -1, 0, -1
        while i < ls - 1:
            if j == -1 or needle[i] == needle[j]:
                next[i + 1] = j + 1
                if needle[i + 1] == needle[j + 1]:
                    next[i + 1] = next[j + 1]
                i += 1
                j += 1
            if needle[i] != needle[j]:
                j = next[j]
        return next

Farjanul Answered question July 27, 2022
0