Authors
Jeffrey Dean, Sanjay Ghemawat
Publication date
2008/1/1
Journal
Communications of the ACM
Volume
51
Issue
1
Pages
107-113
Publisher
ACM
Description
MapReduce is a programming model and an associated implementation for processing and generating large datasets that is amenable to a broad variety of real-world tasks. Users specify the computation in terms of a map and a reduce function, and the underlying runtime system automatically parallelizes the computation across large-scale clusters of machines, handles machine failures, and schedules inter-machine communication to make efficient use of the network and disks. Programmers find the system easy to use: more than ten thousand distinct MapReduce programs have been implemented internally at Google over the past four years, and an average of one hundred thousand MapReduce jobs are executed on Google's clusters every day, processing a total of more than twenty petabytes of data per day.
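The programming model described above can be illustrated with a minimal single-process sketch of word counting. This is not the authors' implementation: the names `map_fn`, `reduce_fn`, and `run_mapreduce` are hypothetical, and the sketch leaves out everything the real runtime provides (parallel execution across machines, failure handling, and scheduling of network and disk use). It shows only how a user-supplied map function and reduce function fit together.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def map_fn(doc_name: str, text: str) -> Iterator[Tuple[str, int]]:
    """User-supplied map: emit an intermediate (word, 1) pair per word."""
    for word in text.split():
        yield word.lower(), 1


def reduce_fn(word: str, counts: Iterable[int]) -> int:
    """User-supplied reduce: combine all values emitted for one key."""
    return sum(counts)


def run_mapreduce(
    inputs: Dict[str, str],
    mapper: Callable[[str, str], Iterator[Tuple[str, int]]],
    reducer: Callable[[str, Iterable[int]], int],
) -> Dict[str, int]:
    """Hypothetical sequential driver standing in for the runtime system."""
    # Map phase: apply the mapper to every input record and
    # group the intermediate values by key.
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in inputs.items():
        for out_key, out_value in mapper(key, value):
            groups[out_key].append(out_value)
    # Reduce phase: apply the reducer once per distinct intermediate key.
    return {key: reducer(key, values) for key, values in groups.items()}


# Made-up input documents, for illustration only.
docs = {"a.txt": "the quick brown fox", "b.txt": "the lazy dog the end"}
word_counts = run_mapreduce(docs, map_fn, reduce_fn)
```

In this sketch the user writes only `map_fn` and `reduce_fn`; the grouping step in `run_mapreduce` is the part a distributed runtime would take over.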
Scholar articles
J Dean, S Ghemawat - Communications of the ACM, 2008
J Dean, S Ghemawat - 2004
J Dean, S Ghemawat - OSDI: Sixth Symposium on Operating System Design …, 2004
J Dean, S Ghemawat, S Dill, R Kumar, K McCurley… - S. Dill, R. Kumar, K. McCurley, S. Rajagopalan, D …, 2001
J Dean, S Ghemawat - 2004
J Dean, S Ghemawat - Proceedings of the 6th Symposium on Operating …, 2008
J Dean, S Ghemawat - Google, Inc, 2004
J Dean, S Ghemawat - San Francisco, CA, 2004
J Dean, S Ghemawat - 2008