Phone: +91-87222 63165 / +1 510-379-9024   Email: contact@syncomint.com

Course Info

Hadoop Developer

Hadoop is an open-source framework for the distributed storage and processing of large data sets. It combines the Hadoop Distributed File System (HDFS) with an implementation of the MapReduce programming model, and it is one of the most widely used platforms for solving problems that involve large, complex data sets. Hadoop stores substantial amounts of data across a cluster and provides the computational capacity to process that data in parallel.

Benefits of the Program

This course provides the hands-on programming skills needed to develop solutions that run on the Hadoop platform and efficiently process many kinds of Big Data. You also learn to test and deploy Big Data solutions on commodity clusters.

Topic List

The course program at Syncomint provides hands-on experience with the Hadoop platform. It also covers Pig, Hive, HBase, and other components of the Hadoop ecosystem. Syncomint offers both Classroom Training and Live Virtual Training.

Course Content

Lesson 1: Introduction to Big Data and Hadoop
Hadoop ecosystem concepts, Hadoop MapReduce concepts and features, Developing MapReduce applications, Pig concepts, Hive concepts, Oozie workflow concepts, HBase concepts, Real Life Use Cases

Lesson 2: Introduction to Big Data and Hadoop
What is Big Data?, What are the challenges for processing big data?, What technologies support big data?, What is Hadoop?, Why Hadoop?, History of Hadoop, Use Cases of Hadoop, Hadoop ecosystem, HDFS, MapReduce, Statistics

Lesson 3: Understanding the Cluster
Typical workflow, Writing files to HDFS, Reading files from HDFS, Rack Awareness, 5 daemons

Lesson 4: Let's talk MapReduce
Before MapReduce, MapReduce Overview, Word Count Problem, Word Count Flow and Solution, MapReduce Flow, Algorithms for simple and complex problems

Lesson 5: Developing the MapReduce Application
Data Types, File Formats, The Driver, Mapper, and Reducer code explained, Configuring the development environment (Eclipse), Writing unit tests, Running locally, Running on a cluster, Hands-on exercises

Lesson 6: How MapReduce Works
Anatomy of a MapReduce job run, Job Submission, Job Initialization, Task Assignment, Job Completion, Job Scheduling, Job Failures, Shuffle and sort, Oozie Workflows, Hands-on Exercises

Lesson 7: MapReduce Types and Formats
MapReduce Types, Input Formats (input splits and records, text input, binary input, multiple inputs, and database input), Output Formats (text output, binary output, multiple outputs, lazy output, and database output), Hands-on Exercises

Lesson 8: MapReduce Features
Counters, Sorting, Joins (Map Side and Reduce Side), Side Data Distribution, MapReduce Combiner, MapReduce Partitioner, MapReduce Distributed Cache, Hands-on Exercises

Lesson 9: Hive and Pig
Fundamentals, When to Use Pig and Hive, Concepts, Hands-on Exercises

Lesson 10: HBase
CAP Theorem, Introduction to NoSQL, HBase Architecture and concepts, Programming and Hands-on Exercises

ClassRoom Schedule

Classroom Training - 10 Days

Day 1

11AM-5PM
Introduction to Big Data and Hadoop
  • Hadoop ecosystem concepts
  • Hadoop MapReduce concepts and features
  • Developing MapReduce applications
  • Pig concepts
  • Hive concepts
  • Oozie workflow concepts
  • HBase concepts
  • Real Life Use Cases

Day 2

11AM-5PM
Introduction to Big Data and Hadoop
  • What is Big Data?
  • What are the challenges for processing big data?
  • What technologies support big data?
  • What is Hadoop?
  • Why Hadoop?
  • History of Hadoop
  • Use Cases of Hadoop
  • Hadoop ecosystem
  • HDFS
  • MapReduce
  • Statistics

Day 3

11AM-5PM
Understanding the Cluster
  • Typical workflow
  • Writing files to HDFS
  • Reading files from HDFS
  • Rack Awareness
  • 5 daemons

Day 4

11AM-5PM
Let's talk MapReduce
  • Before MapReduce
  • MapReduce Overview
  • Word Count Problem
  • Word Count Flow and Solution
  • MapReduce Flow
  • Algorithms for simple and complex problems
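
For readers new to the Word Count problem listed above, the flow can be illustrated without a cluster. The following is a minimal sketch in plain Python, not Hadoop API code: the function names `map_phase`, `shuffle_phase`, `reduce_phase`, and `word_count` are hypothetical and chosen only to mirror the map, shuffle-and-sort, and reduce stages of a MapReduce job.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle and sort: group values by key, then order the groups by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_phase(key, values):
    """Reduce: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run the three stages in sequence over a list of text lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle_phase(mapped))


if __name__ == "__main__":
    print(word_count(["big data on Hadoop", "Hadoop stores big data"]))
```

On a real cluster the map and reduce stages run as parallel tasks on different nodes, and the framework performs the shuffle; here everything runs in one process purely to show the data flow.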

Day 5

11AM-5PM
Developing the MapReduce Application
  • Data Types
  • File Formats
  • The Driver, Mapper, and Reducer code explained
  • Configuring the development environment (Eclipse)
  • Writing unit tests
  • Running locally
  • Running on a cluster
  • Hands-on exercises

Day 6

11AM-5PM
How MapReduce Works
  • Anatomy of a MapReduce job run
  • Job Submission
  • Job Initialization
  • Task Assignment
  • Job Completion
  • Job Scheduling
  • Job Failures
  • Shuffle and sort
  • Oozie Workflows
  • Hands-on Exercises

Day 7

11AM-5PM
MapReduce Types and Formats
  • MapReduce Types
  • Input Formats: input splits and records, text input, binary input, multiple inputs, and database input
  • Output Formats: text output, binary output, multiple outputs, lazy output, and database output
  • Hands-on Exercises

Day 8

11AM-5PM
MapReduce Features
  • Counters
  • Sorting
  • Joins: Map Side and Reduce Side
  • Side Data Distribution
  • MapReduce Combiner
  • MapReduce Partitioner
  • MapReduce Distributed Cache
  • Hands-on Exercises
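
As a rough illustration of two of the features listed above, the sketch below shows what a combiner and a partitioner do conceptually. It is plain Python rather than Hadoop API code, and the names `combine` and `partition` are hypothetical. Hadoop's default partitioner hashes the key modulo the number of reduce tasks; CRC32 is used here only as a stand-in stable hash, since Python's built-in `hash()` for strings varies between processes.

```python
import zlib
from collections import Counter


def combine(pairs):
    """Combiner: pre-aggregate (word, count) pairs on the map side,
    so fewer records have to cross the network during the shuffle."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return sorted(totals.items())


def partition(key, num_reducers):
    """Partitioner: choose a reducer index from a stable hash of the key,
    so every record with the same key reaches the same reducer."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


if __name__ == "__main__":
    map_output = [("hadoop", 1), ("data", 1), ("hadoop", 1)]
    for word, count in combine(map_output):
        print(word, count, "-> reducer", partition(word, 3))
```

The combiner is only safe when the reduce operation is associative and commutative, as summing counts is.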

Day 9

11AM-5PM
Hive and Pig
  • Fundamentals
  • When to Use Pig and Hive
  • Concepts
  • Hands-on Exercises

Day 10

11AM-5PM
HBase
  • CAP Theorem
  • Introduction to NoSQL
  • HBase Architecture and concepts
  • Programming and Hands-on Exercises

Live Virtual Class Schedule

Virtual Training - 5 Days

Day 1

8AM-12PM and 1PM-5PM
Introduction to Big Data and Hadoop
  • Hadoop ecosystem concepts
  • Hadoop MapReduce concepts and features
  • Developing MapReduce applications
  • Pig concepts
  • Hive concepts
  • Oozie workflow concepts
  • HBase concepts
  • Real Life Use Cases
Introduction to Big Data and Hadoop
  • What is Big Data?
  • What are the challenges for processing big data?
  • What technologies support big data?
  • What is Hadoop?
  • Why Hadoop?
  • History of Hadoop
  • Use Cases of Hadoop
  • Hadoop ecosystem
  • HDFS
  • MapReduce
  • Statistics

Day 2

8AM-12PM and 1PM-5PM
Understanding the Cluster
  • Typical workflow
  • Writing files to HDFS
  • Reading files from HDFS
  • Rack Awareness
  • 5 daemons
Let's talk MapReduce
  • Before MapReduce
  • MapReduce Overview
  • Word Count Problem
  • Word Count Flow and Solution
  • MapReduce Flow
  • Algorithms for simple and complex problems

Day 3

8AM-12PM and 1PM-5PM
Developing the MapReduce Application
  • Data Types
  • File Formats
  • The Driver, Mapper, and Reducer code explained
  • Configuring the development environment (Eclipse)
  • Writing unit tests
  • Running locally
  • Running on a cluster
  • Hands-on exercises
How MapReduce Works
  • Anatomy of a MapReduce job run
  • Job Submission
  • Job Initialization
  • Task Assignment
  • Job Completion
  • Job Scheduling
  • Job Failures
  • Shuffle and sort
  • Oozie Workflows
  • Hands-on Exercises

Day 4

8AM-12PM and 1PM-5PM
MapReduce Types and Formats
  • MapReduce Types
  • Input Formats: input splits and records, text input, binary input, multiple inputs, and database input
  • Output Formats: text output, binary output, multiple outputs, lazy output, and database output
  • Hands-on Exercises
MapReduce Features
  • Counters
  • Sorting
  • Joins: Map Side and Reduce Side
  • Side Data Distribution
  • MapReduce Combiner
  • MapReduce Partitioner
  • MapReduce Distributed Cache
  • Hands-on Exercises

Day 5

8AM-12PM and 1PM-5PM
Hive and Pig
  • Fundamentals
  • When to Use Pig and Hive
  • Concepts
  • Hands-on Exercises
HBase
  • CAP Theorem
  • Introduction to NoSQL
  • HBase Architecture and concepts
  • Programming and Hands-on Exercises

Connect With Us

Call: +91-87222 63165 (India)
Call: +1 510-379-9024 (USA)

Mail: contact@syncomint.com