Online prediction methods are typically studied as serial algorithms running on a single processor. In this paper, we present the distributed mini-batch (DMB) framework, a method of converting a serial gradient-based online algorithm into a distributed algorithm, and prove an asymptotically optimal regret bound for smooth convex loss functions and stochastic examples. Our analysis explicitly takes into account communication latencies between computing nodes in a network. We also present robust variants, which are resilient to failures and node heterogeneity in an asynchronous distributed environment. Our method can also be used for distributed stochastic optimization, attaining an asymptotically linear speedup. Finally, we empirically demonstrate the merits of our approach on large-scale online prediction problems.