4.28 out of 5
905 reviews on Udemy

Hadoop Developer In Real World

Free Cluster Access * HDFS * MapReduce * YARN * Pig * Hive * Flume * Sqoop * AWS * EMR * Optimization * Troubleshooting
Instructor:
Hadoop In Real World
3,432 students enrolled
English
Understand what Big Data is, the challenges it poses, and how Hadoop proposes a solution to the Big Data problem
Work and navigate Hadoop cluster with ease
Install and configure a Hadoop cluster on cloud services like Amazon Web Services (AWS)
Understand the different phases of MapReduce in detail
Write optimized Pig Latin instructions to perform complex data analysis
Write optimized Hive queries to perform data analysis on simple and nested datasets
Work with file formats like SequenceFile, AVRO, etc.
Understand Hadoop architecture, Single Points of Failure (SPOF), Secondary/Checkpoint/Backup nodes, HA configuration and YARN
Tune and optimize slow-running MapReduce jobs, Pig instructions and Hive queries
Understand how joins work behind the scenes and write optimized join statements
Wherever possible, students will be introduced to difficult questions that are asked in real Hadoop interviews

From the creators of the successful Hadoop Starter Kit course hosted on Udemy comes the Hadoop In Real World course. This course is designed for anyone who aspires to a career as a Hadoop developer. It covers all the concepts that every aspiring Hadoop developer must know to SURVIVE in REAL WORLD Hadoop environments.

The course covers all the must-know topics, such as HDFS, MapReduce, YARN, Apache Pig, and Hive, and explores each concept in depth. We don't stop at the easy concepts; we go a step further and cover important and complex topics like file formats, custom Writables, input/output formats, troubleshooting, and optimizations.

All concepts are backed by interesting hands-on projects, such as analyzing the Million Song Dataset to find less familiar artists with hot songs, ranking pages with page dumps from Wikipedia, and simulating Facebook's mutual friends functionality, to name a few.
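As a rough idea of what the mutual friends project involves, here is a minimal sketch of the map, shuffle, and reduce steps in plain Python. It is not the course's implementation and does not use Hadoop; the function names and the toy `GRAPH` data are hypothetical, and the friendship graph is assumed to be symmetric.

```python
from collections import defaultdict
from itertools import chain


def map_phase(person, friends):
    """Emit one (pair, friend_set) record for each friendship of `person`."""
    for friend in friends:
        # Sort the pair so both directions of a friendship share one key.
        pair = tuple(sorted((person, friend)))
        yield pair, set(friends)


def shuffle(records):
    """Group mapper output by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in records:
        grouped[key].append(value)
    return grouped


def reduce_phase(pair, friend_sets):
    """Intersect the friend sets received for a pair to get mutual friends."""
    return pair, sorted(set.intersection(*friend_sets))


def mutual_friends(graph):
    records = chain.from_iterable(map_phase(p, f) for p, f in graph.items())
    grouped = shuffle(records)
    return dict(reduce_phase(pair, sets) for pair, sets in grouped.items())


# Hypothetical toy data: each person maps to their list of friends.
GRAPH = {
    "A": ["B", "C", "D"],
    "B": ["A", "C", "D"],
    "C": ["A", "B"],
    "D": ["A", "B"],
}

result = mutual_friends(GRAPH)
for pair, common in sorted(result.items()):
    print(pair, common)
```

Each pair of friends receives two friend sets, one from each person, and the reducer keeps only the names that appear in both.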

Thank You and Let's Get Started

1
Course Structure
2
Tools & Setup (Windows)
3
Tools & Setup (Linux)

Introduction To Big Data

1
What is Big Data?
2
Understanding Big Data Problem
3
History of Hadoop
4
Test your understanding of Big Data

HDFS

1
HDFS - Why Another Filesystem?
2
Blocks
3
Working With HDFS
4
HDFS - Read & Write
5
HDFS - Read & Write (Program)
6
Test your understanding of HDFS
7
HDFS Assignment

MapReduce

1
Introduction to MapReduce
2
Dissecting MapReduce Components
3
Dissecting MapReduce Program (Part 1)
4
Dissecting MapReduce Program (Part 2)
5
Combiner
6
Counters
7
Facebook - Mutual Friends
8
New York Times - Time Machine
9
Test your understanding of MapReduce
10
MapReduce Assignment

Apache Pig

1
Introduction to Apache Pig
2
Loading & Projecting Datasets
3
Solving a Problem
4
Complex Types
5
Pig Latin - Joins
6
Million Song Dataset (Part 1)
7
Million Song Dataset (Part 2)
8
Page Ranking (Part 1)
9
Page Ranking (Part 2)
10
Page Ranking (Part 3)
11
Test your understanding of Apache Pig
12
Apache Pig Assignment

Apache Hive

1
Introduction to Apache Hive
2
Dissect a Hive Table
3
Loading Hive Tables
4
Simple Selects
5
Managed Table vs. External Table
6
Order By vs. Sort By vs. Cluster By
7
Partitions
8
Buckets
9
Hive QL - Joins
10
Twitter (Part 1)
11
Twitter (Part 2)
12
Test your understanding of Apache Hive
13
Apache Hive Assignment

Architecture

1
HDFS Architecture
2
Secondary Namenode
3
Highly Available Hadoop
4
MRv1 Architecture
5
YARN
6
Test your understanding of Hadoop Architecture

Cluster Setup

1
Vendors & Hosting
2
Cluster Setup (Part 1)
3
Cluster Setup (Part 2)
4
Cluster Setup (Part 3)
5
Amazon EMR

With Amazon EMR we can start a brand new Hadoop cluster and run MapReduce jobs in a matter of minutes. This lecture walks through, step by step, how to set up a Hadoop cluster and run MapReduce jobs on it.

6
Test your understanding of Cluster Setup

Hadoop Administrator In Real World (Preview)

1
Cloudera Manager - Introduction

In this lecture we will learn about the benefits of Cloudera Manager, the differences between Packages and Parcels, and the lifecycle of Parcels.

2
Cloudera Manager - Installation

In this lecture we will see how to install a 3-node Hadoop cluster on AWS using Cloudera Manager.

File Formats

1
Compression
2
Sequence File
3
AVRO
4
File Formats - Pig
5
File Formats - Hive
6
Introduction to RCFile
7
Working with RCFile
8
Introduction to ORC
9
Working with ORC
10
Parquet - Another Columnar Format
11
Test your understanding of File Formats

Troubleshooting and Optimizations

1
Exploring Logs
2
MRUnit
3
MapReduce Tuning
4
Pig Join Optimizations (Part 1)
5
Pig Join Optimizations (Part 2)
6
Hive Join Optimizations
7
Test your understanding of Troubleshooting & Optimizations

Apache Sqoop

1
Sqoop Imports

This lecture will give an introduction to Apache Sqoop and demonstrate Sqoop imports that bring data from traditional databases like MySQL into HDFS.

2
Sqoop - File Formats

This lecture will cover custom Sqoop imports and how Sqoop can be used to export tables in different file formats.

3
Jobs & Incremental Imports

This lecture will cover Sqoop jobs & incremental imports.

4
Hive - Exports

This lecture will demonstrate how Sqoop can be used to create and populate a Hive table directly, and how to export data from HDFS to a MySQL table.
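To give a feel for the incremental imports mentioned above, the sketch below models the general idea behind an append-style incremental import: remember the highest value seen in a check column, and on the next run fetch only rows above it. It is a plain Python illustration under that assumption, not Sqoop itself; the function, the `id` column, and the sample rows are all hypothetical.

```python
def incremental_import(source_rows, check_column, last_value):
    """Return rows whose check column exceeds last_value, plus the new high-water mark."""
    new_rows = [row for row in source_rows if row[check_column] > last_value]
    # If nothing new arrived, keep the previous high-water mark.
    new_last = max((row[check_column] for row in new_rows), default=last_value)
    return new_rows, new_last


# Hypothetical source table with an auto-incrementing id column.
table = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}, {"id": 3, "name": "c"}]

# First run: start from 0, so every row is imported.
first_batch, last_value = incremental_import(table, "id", 0)

# New rows arrive in the source table between runs.
table += [{"id": 4, "name": "d"}, {"id": 5, "name": "e"}]

# Second run: only rows with id greater than the stored value are imported.
second_batch, last_value = incremental_import(table, "id", last_value)

# Third run with no new data imports nothing and leaves the mark unchanged.
third_batch, last_value = incremental_import(table, "id", last_value)
```

Storing the high-water mark between runs is what lets a repeated job pick up where the previous one stopped instead of re-importing the whole table.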

Apache Flume

1
Introduction to Flume

In this lecture, we will introduce Flume and look in detail at the different Flume components: source, channel, and sink. We will also look at a very simple Flume configuration to ingest log messages into HDFS.

2
Replication

In this lecture we will ingest log messages from a single source and replicate the Flume events into HDFS and the local file system.
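The source, channel, and sink roles described in this section, and the replication of one event stream to two destinations, can be pictured with a toy model in plain Python. This is only a conceptual sketch: it does not use Flume or its configuration format, and the `Channel` class, helper functions, and in-memory "stores" are hypothetical stand-ins for HDFS and the local file system.

```python
from collections import deque


class Channel:
    """A simple in-memory buffer sitting between a source and a sink."""

    def __init__(self, name):
        self.name = name
        self._events = deque()

    def put(self, event):
        self._events.append(event)

    def take_all(self):
        # Hand events to the sink in arrival order, emptying the buffer.
        while self._events:
            yield self._events.popleft()


def replicate(event, channels):
    """Source side: copy the same event onto every attached channel."""
    for channel in channels:
        channel.put(event)


def drain(channel, destination):
    """Sink side: move everything buffered in the channel to a destination."""
    for event in channel.take_all():
        destination.append(event)


hdfs_channel = Channel("hdfs")
local_channel = Channel("local")

# One source, two channels: each log line is delivered to both.
for line in ["log line 1", "log line 2"]:
    replicate(line, [hdfs_channel, local_channel])

hdfs_store, local_store = [], []  # stand-ins for the two destinations
drain(hdfs_channel, hdfs_store)
drain(local_channel, local_store)
```

Because each channel buffers its own copy, one sink can fall behind or fail without affecting what the other destination receives.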

4.3 out of 5
905 Ratings

Detailed Rating

5 stars: 505
4 stars: 304
3 stars: 70
2 stars: 21
1 star: 6
30-Day Money-Back Guarantee

Includes

19 hours on-demand video
4 articles
Full lifetime access
Access on mobile and TV
Certificate of Completion