Question

LeetCode Link | 28. Find the Index of the First Occurrence in a String | Easy

Given two strings needle and haystack, return the index of the first occurrence of needle in haystack, or -1 if needle is not part of haystack.

Constrains

1 <= haystack.length, needle.length <= 10^4
haystack and needle consist of only lowercase English characters.

Examples

Example 1

Input: haystack = “sadbutsad”, needle = “sad”
Output: 0
Explanation: “sad” occurs at index 0 and 6.
The first occurrence is at index 0, so we return 0.

Example 2

Input: haystack = “leetcode”, needle = “leeto”
Output: -1
Explanation: “leeto” did not occur in “leetcode”, so we return -1.

Solution Approach

Method: KMP

Constrains

None

This method does not have any limitations.

Concept Explanation

This question can be solve by brute force, which is basically compare each index of pattern with string, and re-compare form the start if failed.

function strStr(haystack: string, needle: string): number {
	// Edge case: needle is empty
	if (needle.length === 0) return 0;

	for (let i = 0; i < haystack.length; i++) {
		// Check if needle is in haystack
		if (haystack[i] === needle[0]) {
			for (let j = 0; j < needle.length; j++) {
				// If the needle is fully matched
				if (j === needle.length - 1) {
					return i;
				}
				// If not match
				if (haystack[i + j] !== needle[j]) {
					break;
				}
			}
		}
	}
	// Not found
	return -1;
}

This cost O(n * m) time since we match from start of needle each time, which could waste lots of time if some pattern in needle is overlaped.

For example, we have haystack aaaaaaaaf and needle aaaaf, it wil always re-match from first character a, but we could save this time since first four character are the same.

Introducing the KMP Algorithm, this is a string matching algorithm designed for searching substrings or patterns within a main text string efficiently. By using KMP, we could only compare pattern after last a in needle, which is much more efficiently. For more detail, check the post The KMP Algorithm.

To use KMP, we have to compute the LPS array first. This array contains the index we need to go back to if we failed to match the string. For more detail on how to create a LPS array, check the post The KMP Algorithm.

Then we can iterate haystack string to find if needle match by comparing character just like brute force. However, if we failed to match character, we can just use LPS array to go to the nearest index that we already validate all the characters before this index.

Code

Time complexity: O(n + m)
Space complexity: O(m), need m space to store LPS

TypeScript

function strStr(haystack: string, needle: string): number {
	// Compute Longest Prefix which is also a Suffix (LPS) array
	function getLps(pattern: string): number[] {
		let lps: number[] = new Array(pattern.length).fill(0);
		let len: number = -1;
		lps[0] = len;
		for (let i = 1; i < pattern.length; i++) {
			// Find last matched pattern if not match
			while (len >= 0 && pattern[i] != pattern[len + 1]) {
				len = lps[len];
			}
			// Update length when this character match
			if (pattern[i] === pattern[len + 1]) {
				len++;
			}
			// Update LPS array
			lps[i] = len;
		}
		return lps;
	}

	// Edge case: needle is empty
	if (needle.length === 0) return 0;

	// Get LPS for needle
	const lps: number[] = getLps(needle);
	let j: number = -1;
	for (let i = 0; i < haystack.length; i++) {
		// Find last matched pattern if not match
		while (j >= 0 && needle[j + 1] != haystack[i]) {
			j = lps[j];
		}
		// Update length when this character match
		if (needle[j + 1] === haystack[i]) {
			// If it fully matches, return the result, no need to j++
			if (j === needle.length - 2) {
				return i - j - 1;
			}
			j++;
		}
	}
	// Not found
	return -1;
};

Conclusion

Using the LPS array, the KMP algorithm significantly reduces the time complexity of the search phase, making it efficient for searching substrings in larger texts. Try to understand the logic of the KMP algorithm instead of memories the code, this is a commonly used algorithm in string comparisons, memories the hard code would not help you during interview:(