Processing Big Data with MapReduce
by: Jesse Anderson
Published | 2013-07-24 |
---|---|
Internal code | v-jamapr |
Print status | In Print |
Highlight
MapReduce is a programming paradigm that uses multiple machines to process large data sets. Apache Hadoop is the most popular MapReduce framework, and this series takes you from zero MapReduce knowledge all the way to writing and running Hadoop programs.
In these screencasts, Jesse teaches MapReduce with his own novel method that makes it easy to understand. After you learn the basics, Jesse teaches you Hadoop using Java, Ruby, Python, and Perl code. No matter which technology stack you choose, you’ll have the understanding and tools you need to use Hadoop on your next project.
Together we’ll write code in Java, Ruby, Python and Perl.
Source code for the first and third episodes is available at github.com
Source code for the second episode is available at github.com
Free Preview Video:
<ul class="movie-list"> <li> Quicktime </li> <li> iPhone/iPad </li> <li> Theora Ogg </li> </ul>
Description
Every industry is dealing with more data every day. The data comes from more and more devices, and we need to both store and process it efficiently. We’ll see how Apache Hadoop MapReduce works and scales to process these vast quantities of data.
Working with software libraries saves time and effort, and this is especially true with distributed computing systems like Hadoop. However, learning the underlying concepts and API takes time, and that often holds teams back. Jesse’s novel approach to MapReduce uses playing cards to illustrate the workflow in a simple, understandable way. This allows you to move physical objects while learning the concepts behind MapReduce. Then we’ll move from conceptual to practical and write code to do the same thing using Hadoop. We’ll work through several examples in several programming languages to ensure you have the knowledge you need to use MapReduce on your next project.
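To give a feel for the workflow before any Hadoop code, here is a minimal Python sketch of the map, shuffle, and reduce phases applied to a handful of playing cards. It is only an illustration under stated assumptions: the task (summing card ranks per suit), the sample hand, and the function names `map_card`, `shuffle`, `reduce_suit`, and `run_job` are hypothetical and are not taken from the screencasts.

```python
from collections import defaultdict

# Hypothetical sample hand of (suit, rank) pairs. Number cards only,
# an assumption made to keep the illustration small.
CARDS = [
    ("hearts", 2), ("spades", 10), ("hearts", 7),
    ("clubs", 4), ("spades", 3), ("clubs", 9),
]


def map_card(card):
    """Map phase: turn one input record into a (key, value) pair."""
    suit, rank = card
    return (suit, rank)


def shuffle(pairs):
    """Shuffle phase: collect every value that shares a key.

    In a real cluster the framework does this between map and reduce.
    """
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_suit(suit, ranks):
    """Reduce phase: collapse all values for one key into one result."""
    return (suit, sum(ranks))


def run_job(cards):
    """Run the three phases in order on a single machine."""
    mapped = [map_card(card) for card in cards]
    grouped = shuffle(mapped)
    return dict(reduce_suit(suit, ranks) for suit, ranks in sorted(grouped.items()))


if __name__ == "__main__":
    print(run_job(CARDS))
```

Because each call to `map_card` looks at one card and each call to `reduce_suit` looks at one suit, the work can in principle be split across many machines, which is the property the card exercise is meant to make tangible.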
This series of screencasts is a focused look at how MapReduce works and the APIs behind it. Although Hadoop is written in Java, we’ll see how to use it with any language. Jesse teaches using examples in Java, Ruby, Python and Perl.
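As a rough idea of what using Hadoop from a non-Java language looks like, the sketch below follows the Hadoop Streaming convention: the mapper and reducer exchange plain text lines, with key and value separated by a tab, and the reducer receives its input sorted by key. The word-count task and the names `mapper`, `reducer`, and `simulate` are assumptions chosen for illustration, not code from the series. In an actual streaming job the two functions would live in separate scripts that read standard input; here `simulate` stands in for the framework's sort step so the example runs locally.

```python
from itertools import groupby


def mapper(lines):
    """Emit one 'word<TAB>1' line per word, as a streaming mapper would."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reducer(sorted_lines):
    """Sum the counts for each word.

    Assumes the input lines are already sorted by key, which is what
    the framework guarantees before the reduce step.
    """
    pairs = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for word, group in groupby(pairs, key=lambda pair: pair[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


def simulate(lines):
    """Local stand-in for a job: map, sort by key, then reduce."""
    return list(reducer(sorted(mapper(lines))))


if __name__ == "__main__":
    # Hypothetical input lines, used only to exercise the sketch.
    for output_line in simulate(["the cat", "The hat"]):
        print(output_line)
```

The same line-oriented contract is what makes Ruby, Perl, or ordinary command-line programs usable as mappers and reducers.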
Contents and Extracts
Episode 1 Mapping and Reducing
- Introduction and Overview of Big Data and Hadoop
- Using Playing Cards to Demonstrate MapReduce
- How MapReduce Scales
Episode 2 Financial Gain
- Common Uses of MapReduce
- Using MapReduce on Financial Data
- Handling Key Space
- Creating a Custom Writable
Episode 3 No Java
- Using the Streaming API
- Writing MapReduce Programs in Ruby, Python and Perl
- Using Command Line Programs with MapReduce
- Scaling a Hadoop Cluster