
Explain like I'm five
Imagine you're looking for the best path through a maze, but you can only keep track of a few possible paths at a time. Beam Search is like having a 'beam' of flashlight that shows you only the most promising routes, and you follow those, ignoring the rest. It's a way to find a good answer quickly without checking every single option.

Why it matters
Beam Search is crucial in AI for tasks like machine translation and speech recognition, where you need to generate the most likely sequence of words or sounds. It makes these systems fast and accurate enough to use in real-time applications.

Common misconception
People often think Beam Search guarantees finding the absolute best answer, but it doesn't—it finds a good one, not necessarily the best. The 'beam width' (how many paths you keep) is a trade-off: wider beams find better answers but take longer.

Formal definition
Beam Search is a breadth-first search variant that prunes the search tree at each level, keeping only the top-k nodes (where k is the beam width) according to a heuristic evaluation function. It is commonly used in sequence-to-sequence models to decode the most probable output sequence by exploring partial sequences in parallel. The algorithm terminates when all sequences in the beam reach an end state, returning the highest-scoring complete sequence.