TY - RPRT T1 - Efficient Serial and Parallel Algorithms for Querying Large Scale Multidimensional Time Series Data Y1 - 2004 A1 - JaJa, Joseph F. A1 - Kim,J. A1 - Wang,Q. AB - We consider the problem of querying large scale multidimensional time series data to discover events of interest,test and validate hypotheses, or to associate temporal patterns with specific events. Multidimensional time series data is growing at an extremely fast rate due to a number of trends including a recent strong interest in collecting and analyzing time series of business, scientific, demographic, and simulation data. The ability to explore such collections interactively, even at a coarse level, will be critical to the process of extracting some of the information and knowledge embedded in such collections. We develop indexing techniques and search algorithms to efficiently handle temporal range value querying of multidimensional time series data. Our indexing uses linear space data structures that enable the handling of queries very efficiently, invoking in the worst case a logarithmic number of queries to single time steps. We also show that our algorithm is ideally suited for parallel implementation on clusters of processors achieving a linear speedup in the number of available processors. A particularly simple data structure with provably good bounds is also presented for the case when the number of multidimensional objects is relatively small. These techniques improve significantly over previous techniques for either the serial or the parallel case, and are evaluated by extensive experimental results that confirm their superior performance. In particular, we achieve query times in the order of hundreds of milliseconds on a (relatively outdated) cluster of 16 processors for 140GB of data consisting of 160, 000 distinct time series of 16-dimensional points, each time series being of length 10, 000. PB - Instititue for Advanced Computer Studies, Univ of Maryland, College Park VL - UMIACS-TR-2004-50 ER -