Big Data 

Representing MapReduce Optimisations in the Nested Relational Calculus

Author(s): Marek Grabowski , Jan Hidders , Jacek Sroka

Publication date (Print): 2013

Publisher: Springer Berlin Heidelberg

Most cited references 14

The Google file system

Sanjay Ghemawat, Howard Gobioff, Shun-Tak Leung (2003)

Cited 86 times

A Model of Computation for MapReduce

Howard Karloff, Siddharth Suri, Sergei Vassilvitskii (2010)

Cited 71 times

Interpreting the Data: Parallel Analysis with Sawzall

Rob N Pike, Sean Dorward, Robert D. Griesemer … (2005)

Very large data sets often have a flat but regular structure and span multiple disks and machines. Examples include telephone call records, network logs, and web document repositories. These large data sets are not amenable to study using traditional database techniques, if only because they can be too large to fit in a single relational database. On the other hand, many of the analyses done on them can be expressed using simple, easily distributed computations: filtering, aggregation, extraction of statistics, and so on. We present a system for automating such analyses. A filtering phase, in which a query is expressed using a new procedural programming language, emits data to an aggregation phase. Both phases are distributed over hundreds or even thousands of computers. The results are then collated and saved to a file. The design – including the separation into two phases, the form of the programming language, and the properties of the aggregators – exploits the parallelism inherent in having data and computation distributed across many machines.

Cited 40 times

Author and book information

Book Chapter

Pages: 175-188

DOI: 10.1007/978-3-642-39467-6_17

SO-VID: 0940ce1f-b171-4678-bf00-58bc594d84c8
