Starting Points
Dean, Jeffrey and Sanjay Ghemawat. (2004) MapReduce: Simplified Data Processing on Large Clusters. Proceedings of the 6th Symposium on Operating Systems Design and Implementation (OSDI 2004).
Chang, Fay, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Michael Burrows, Tushar Chandra, Andrew Fikes, and Robert Gruber. (2006) Bigtable: A Distributed Storage System for Structured Data. Proceedings of the 7th Symposium on Operating Systems Design and Implementation (OSDI 2006).
Ghemawat, Sanjay, Howard Gobioff, and Shun-Tak Leung. (2003) The Google File System. Proceedings of the 19th ACM Symposium on Operating Systems Principles.
Pike, Rob, Sean Dorward, Robert Griesemer, and Sean Quinlan. (2005) Interpreting the Data: Parallel Analysis with Sawzall. Scientific Programming Journal, 13(4):277-298.
Of Related Interest
DeCandia, Giuseppe, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swami Sivasubramanian, Peter Vosshall, and Werner Vogels. (2007) Dynamo: Amazon's Highly Available Key-Value Store. Proceedings of the 21st ACM Symposium on Operating Systems Principles.
Isard, Michael, Mihai Budiu, Yuan Yu, Andrew Birrell, and Dennis Fetterly. (2007) Dryad: Distributed Data-Parallel Programs from Sequential Building Blocks. Proceedings of the ACM SIGOPS/EuroSys European Conference on Computer Systems 2007 (EuroSys 2007).
Yang, Hung-chih, Ali Dasdan, Ruey-Lung Hsiao, and D. Stott Parker. (2007) Map-Reduce-Merge: Simplified Relational Data Processing on Large Clusters. Proceedings of the ACM SIGMOD International Conference on Management of Data.