In this paper, we address the problem of automatically deriving vocabularies of motion modules from human motion data by exploiting the underlying structure of that motion. We approach this problem with a data-driven methodology that modularizes a motion stream (a time series of human motion) into a vocabulary of parameterized actions and a set of high-level behaviors for sequencing those actions. Central to this methodology is the discovery of spatio-temporal structure in the motion stream. We estimate this structure with a spatio-temporal dimension reduction method based on an extension of Isomap. We validate the utility of the derived vocabularies by using them to synthesize new humanoid motion.
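
To make the idea of a spatio-temporal extension of Isomap concrete, the following is a minimal sketch and not the method as specified in this paper. It assumes the motion stream is a `(T, D)` array of per-frame features, builds a standard k-nearest-neighbor Isomap graph, and then adds edges between consecutive frames whose lengths are shortened by a factor `c_temporal`. The function name, the parameter names, and this particular temporal reweighting rule are all illustrative assumptions.

```python
import numpy as np

def spatiotemporal_isomap_sketch(X, n_neighbors=5, n_components=2, c_temporal=0.5):
    """Embed a motion stream X of shape (T, D) into n_components dimensions.

    Hypothetical sketch: standard Isomap plus shortened edges between
    consecutive frames (the temporal rule is an assumption).
    """
    T = X.shape[0]
    # Pairwise Euclidean distances between frames.
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))

    # k-nearest-neighbor graph; np.inf marks a missing edge.
    G = np.full((T, T), np.inf)
    np.fill_diagonal(G, 0.0)
    nn = np.argsort(D, axis=1)[:, 1:n_neighbors + 1]
    for i in range(T):
        G[i, nn[i]] = D[i, nn[i]]

    # Temporal edges (assumption): consecutive frames are always connected,
    # with their distance scaled down by c_temporal.
    for i in range(T - 1):
        G[i, i + 1] = min(G[i, i + 1], c_temporal * D[i, i + 1])
    G = np.minimum(G, G.T)  # make the graph symmetric

    # All-pairs shortest paths (Floyd-Warshall) approximate geodesics.
    for k in range(T):
        G = np.minimum(G, G[:, k:k + 1] + G[k:k + 1, :])

    # Classical MDS on the geodesic distance matrix.
    J = np.eye(T) - np.ones((T, T)) / T
    B = -0.5 * J @ (G ** 2) @ J
    w, V = np.linalg.eigh(B)
    top = np.argsort(w)[::-1][:n_components]
    return V[:, top] * np.sqrt(np.maximum(w[top], 0.0))

# Toy "motion stream": 40 frames tracing a helix in 3D.
t = np.linspace(0.0, 4.0 * np.pi, 40)
stream = np.c_[np.cos(t), np.sin(t), 0.1 * t]
embedding = spatiotemporal_isomap_sketch(stream)
```

Because consecutive frames are always linked, the graph is connected and every geodesic distance is finite. Clusters in such an embedding could then serve as candidates for motion modules, although the clustering step is not shown here.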