The Viterbi algorithm is an efficient and optimal method for decoding linear-chain Markov Models. However, the entire input sequence must be observed before the labels for any time step can be generated, and therefore Viterbi cannot be directly applied to online/ interactive/streaming scenarios without incurring significant (possibly unbounded) latency. A widely used approach is to break the input stream into fixed-size windows, and apply Viterbi to each window. Larger windows lead to higher accuracy, but result in higher latency.

We propose several alternative algorithms that compute a certainty measure on predicted labels and allow us to dynamically trade off latency and accuracy. This principled approach gives a substantial improvement over a fixed window. We show its effectiveness for the task of spotting semistructured information in large documents. When compared to full Viterbi, the approach suffers a 0.1 percent error degradation with an average latency of 2.6 time steps (versus Viterbi’s potentially infinite latency). When compared to a fixed window, we achieve a 40x reduction in error and 6x reduction in latency.