So far we have covered the following topics in Big Data:
· Big data: Technology Stack
· Big Data: Hadoop Distributed Filesystem (HDFS)
· Big Data: Map Reduce
· Big Data: Installing Hadoop (Single Node)
· Big Data: Apache Hadoop Multi Node
· Big Data: Troubleshooting, Administering and optimizing Hadoop
· Big Data: Managing HDFS
· Big Data: Map Reduce Development
· Big Data: Introduction to Pig
In this post we will discuss Amazon Elastic MapReduce (EMR).
AMAZON ELASTIC MAPREDUCE (EMR) Overview
What is EMR?
-A web service on AWS that uses EC2 for processing and S3 for storage
-Data is pulled from S3, processed by an auto-configured EC2 cluster, and the results are pushed back to S3
-Crunch your data in the cloud without the hassle of managing your own cluster or infrastructure
What is an EMR Job Flow?
-A data processing wizard that sets up the cluster and the steps to run on it
-Supports Hive, MapReduce, HBase and Pig
The only thing we need to do is configure the EMR Job Flow. Once it's configured, the rest is very easy. Even configuring the EMR Job Flow is very easy on Amazon.
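As a rough illustration of what configuring a job flow involves, here is a minimal Python sketch that builds the settings for one Hadoop streaming step and, optionally, submits them with boto3's `run_job_flow`. The bucket layout, the `mapper.py` script, the release label, the instance type and the helper names `build_job_flow` and `submit_job_flow` are all assumptions made for this example; adjust them to your own account.

```python
def build_job_flow(name, bucket, instance_count=3):
    """Build keyword arguments for boto3's EMR run_job_flow call.

    The S3 paths under `bucket` (logs/, code/, input/, output/) are a
    hypothetical layout chosen for this sketch.
    """
    return {
        "Name": name,
        "LogUri": f"s3://{bucket}/logs/",
        "ReleaseLabel": "emr-6.15.0",  # assumed release; pick one available to you
        "Applications": [{"Name": "Hadoop"}, {"Name": "Hive"}, {"Name": "Pig"}],
        "Instances": {
            "MasterInstanceType": "m5.xlarge",  # assumed instance type
            "SlaveInstanceType": "m5.xlarge",
            "InstanceCount": instance_count,
            # Shut the cluster down once the steps finish.
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        "Steps": [
            {
                "Name": "streaming-step",
                "ActionOnFailure": "TERMINATE_CLUSTER",
                "HadoopJarStep": {
                    "Jar": "command-runner.jar",
                    "Args": [
                        "hadoop-streaming",
                        "-files", f"s3://{bucket}/code/mapper.py",
                        "-mapper", "mapper.py",
                        "-reducer", "aggregate",
                        "-input", f"s3://{bucket}/input/",   # data pulled from S3
                        "-output", f"s3://{bucket}/output/",  # results pushed back to S3
                    ],
                },
            }
        ],
        # Default role names; assumes they already exist in the account.
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }


def submit_job_flow(config, region="us-east-1"):
    """Submit the job flow; needs boto3 and valid AWS credentials."""
    import boto3  # imported here so building the config works without boto3

    emr = boto3.client("emr", region_name=region)
    return emr.run_job_flow(**config)["JobFlowId"]


if __name__ == "__main__":
    config = build_job_flow("demo-job-flow", "my-example-bucket")
    print(config["Steps"][0]["HadoopJarStep"]["Args"])
```

Calling `submit_job_flow(config)` starts a real cluster and incurs charges, so the sketch only prints the step arguments by default.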
That's it.
