3.9 out of 5 (88 reviews on Udemy)

Big Data Internship Program – Data Ingestion: Sqoop and Flume

Complete Reference for Apache Sqoop and Flume
Instructor: Big Data Trunk
1,032 students enrolled
After this course, students will:

  • Understand data ingestion
  • Have an excellent understanding of the Apache Sqoop and Flume tools, with hands-on experience
  • Understand how a project works in a real-world scenario

This course is part of the “Big Data Internship Program,” which is aligned to the stages of a typical big data project life cycle:

  • Foundation
  • Ingestion
  • Storage
  • Processing
  • Visualization

This course focuses on the Ingestion stage.

The course is divided into two parts: 1) technical knowledge with examples, and 2) project work.

Technical Knowledge 

  1. Big data ingestion: concepts and methods
  2. Sqoop concepts and features
  3. A good understanding of Sqoop tools and their arguments
  4. Flume concepts and configuration
  5. Flume features: multiplexing, Flume agents, interceptors, etc.
  6. The different file formats supported by Hadoop

Project Part 

  1. Get access to our private GitHub repository
  2. Build the first part of our book recommendation project using Sqoop and Flume

Introduction

1
Introduction to Data Ingestion

In this video, we explain what data ingestion is, how data is processed, the challenges in data ingestion, and the key functions of data ingestion.

2
Recap - Big data Internship Program - Part 1 Foundation

The Part 1 course focuses on the foundations of big data. It covers technical items such as:

Technical Foundation

  • Refresh your knowledge of Unix
  • Java, as used in big data
  • Git/GitHub, which most companies use for source control
  • Hadoop installation

Part 1 is free here:

https://www.udemy.com/big-data-internship-program-part-1-foundation


3
Data Ingestion Tools

In this video, we explain what data ingestion is and the tools available on the market.

4
Some more Data Ingestion Tools

In this video, we explain more data ingestion tools: Kafka, Chukwa, Storm, etc.

Different types of File Formats in Hadoop

1
Introduction to File Formats

This video shows the different file formats supported in Hadoop.

2
Introduction to File Formats
3
Text/CSV file formats

Text/CSV files are quite common and are often used for exchanging data between Hadoop and external systems.
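
As a minimal sketch of that exchange (all paths and file names below are illustrative), a CSV file can be moved in and out of HDFS with the standard shell:

```bash
# Copy a local CSV into HDFS and peek at its contents.
hdfs dfs -mkdir -p /data/exchange
hdfs dfs -put ratings.csv /data/exchange/
hdfs dfs -cat /data/exchange/ratings.csv | head -n 5
```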

4
Text/CSV file formats
5
Binary File Formats - Sequence Files

This video shows that sequence files store data in a binary format with a structure similar to CSV. Like CSV, sequence files do not store metadata with the data, so the only schema evolution option is appending new fields.
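
Because the format is binary, a plain `hdfs dfs -cat` prints raw bytes; here is a quick hedged sketch for inspecting a sequence file (the path is a placeholder):

```bash
# -text decodes the sequence file's keys and values; -cat would print raw bytes.
hdfs dfs -text /data/seq/part-m-00000 | head -n 5
```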

6
Binary File Formats - Sequence Files
7
Binary File Formats - Avro

Avro files are quickly becoming the best multi-purpose storage format within Hadoop. Avro files store metadata with the data but also allow specification of an independent schema for reading the file. This video covers the format in detail.
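
As a small sketch of that self-describing property, assuming the avro-tools jar is available locally (the jar version and file name are placeholders):

```bash
# Dump the schema embedded in an Avro data file, then print a few records as JSON.
java -jar avro-tools-1.8.2.jar getschema books.avro
java -jar avro-tools-1.8.2.jar tojson books.avro | head -n 3
```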

8
Binary File Formats - Avro
9
Columnar Formats - RC and ORC Files

RC Files, or Record Columnar Files, were the first columnar file format adopted in Hadoop. Like columnar databases, the RC file enjoys significant compression and query performance benefits. ORC Files, or Optimized RC Files, were invented to optimize performance in Hive and are primarily backed by Hortonworks. This video covers these two file formats.
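
A hedged one-liner showing how ORC typically enters the picture, assuming Hive is installed and a `books` table already exists (the table names are illustrative):

```bash
# Create an ORC-backed copy of an existing Hive table from the shell.
hive -e "CREATE TABLE books_orc STORED AS ORC AS SELECT * FROM books;"
```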

10
Columnar Formats - RC and ORC Files
11
Columnar Formats - Parquet Files

Parquet Files are yet another columnar file format that originated from Hadoop creator Doug Cutting’s Trevni project. Like RC and ORC, Parquet enjoys compression and query performance benefits, though it is generally slower to write than non-columnar file formats. In this video you can learn more about this file format.
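
A minimal sketch for inspecting Parquet output, assuming the parquet-tools utility is installed (the file name is a placeholder):

```bash
# Print the column schema of a Parquet file.
parquet-tools schema books.parquet
```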

12
Columnar Formats - Parquet Files

Sqoop

1
Introduction to Sqoop

In this video, we explain what Sqoop is, what Flume is, the Sqoop workflow, and the Sqoop architecture.
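
Two quick sanity checks, assuming Sqoop is on the PATH:

```bash
sqoop version   # print the installed Sqoop version
sqoop help      # list the available Sqoop tools (import, export, job, ...)
```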

2
Introduction to Sqoop
3
Sqoop Import

In this video, we explain what the import command is and how a Sqoop import command is executed.
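
A minimal sketch of an import; the JDBC URL, credentials, and table name are placeholders:

```bash
# Sqoop turns this into a MapReduce job that copies the table's rows
# into the target HDFS directory; -P prompts for the password.
sqoop import \
  --connect jdbc:mysql://localhost/bookstore \
  --username dbuser -P \
  --table books \
  --target-dir /data/books \
  -m 1
```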

4
Sqoop Import
5
Import Data from MySQL to HDFS

In this video, we explain how to execute commands in the terminal, how to get the table list, how to get the list of databases, and how to import data into HDFS.
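
A hedged sketch of that terminal session (the host, credentials, and names are placeholders):

```bash
sqoop list-databases --connect jdbc:mysql://localhost --username dbuser -P
sqoop list-tables --connect jdbc:mysql://localhost/bookstore --username dbuser -P
sqoop import --connect jdbc:mysql://localhost/bookstore \
  --username dbuser -P --table books --target-dir /data/books
hdfs dfs -cat /data/books/part-m-00000 | head -n 5   # verify the imported rows
```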

6
Import Data from MySQL to HDFS
7
Other variations of Sqoop Import Command

In this video, we explain how to run Sqoop commands, the structure of Sqoop commands, and the parameters used when executing them.
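
A few common variations, sketched with placeholder names: selecting columns, filtering rows, and a free-form query (which requires the $CONDITIONS token and --split-by):

```bash
# Import only two columns of recent rows.
sqoop import --connect jdbc:mysql://localhost/bookstore --username dbuser -P \
  --table books --columns "id,title" --where "year > 2000" \
  --target-dir /data/books_recent

# Free-form query import; single quotes keep the shell from expanding $CONDITIONS.
sqoop import --connect jdbc:mysql://localhost/bookstore --username dbuser -P \
  --query 'SELECT b.id, b.title FROM books b WHERE $CONDITIONS' \
  --split-by b.id --target-dir /data/books_query
```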

8
Other variations of Sqoop Import Command
9
Running a Sqoop Export Command

In this video, we explain what Sqoop export is and how it is used.
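
A minimal sketch of an export; the names are placeholders, and the target MySQL table must already exist:

```bash
# Rows under --export-dir in HDFS are inserted into the MySQL table.
sqoop export \
  --connect jdbc:mysql://localhost/bookstore \
  --username dbuser -P \
  --table book_ratings \
  --export-dir /data/ratings \
  --input-fields-terminated-by ','
```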

10
Running a Sqoop Export Command
11
Sqoop Jobs

In this video, we explain what Sqoop jobs are, how and when they are used, how to create jobs, and how to list the available Sqoop jobs.
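
A hedged sketch of the job lifecycle, with placeholder names (note the space after `--` before the tool name):

```bash
# Save an import definition under a name, then list, inspect, and run it.
sqoop job --create books_import -- import \
  --connect jdbc:mysql://localhost/bookstore --username dbuser -P \
  --table books --target-dir /data/books

sqoop job --list
sqoop job --show books_import
sqoop job --exec books_import
```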

12
Sqoop Jobs
13
Sqoop incremental import

In this video, we explain what incremental Sqoop import is, how it works, what the incremental import parameters are, etc.
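
A minimal append-mode sketch, assuming `books` has a monotonically increasing `id` column (all names and values are placeholders):

```bash
# Only rows with id > --last-value are imported; Sqoop prints the new
# last-value to use on the next run.
sqoop import --connect jdbc:mysql://localhost/bookstore --username dbuser -P \
  --table books --target-dir /data/books \
  --incremental append --check-column id --last-value 100
```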

14
Sqoop incremental import
15
Lab: Sqoop incremental Import

In this video, we explain how incremental import works and how to append data to the table.
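
Beyond the append mode shown earlier, here is a hedged sketch of the lastmodified mode, assuming the table has an `updated_at` timestamp column (all names and values are placeholders):

```bash
# Rows modified after --last-value are re-imported; --merge-key tells Sqoop
# how to merge updated rows with previously imported ones.
sqoop import --connect jdbc:mysql://localhost/bookstore --username dbuser -P \
  --table books --target-dir /data/books \
  --incremental lastmodified --check-column updated_at \
  --last-value "2019-01-01 00:00:00" --merge-key id
```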

16
Test Your Sqoop Knowledge

Flume

1
What is Flume?

In this video, we explain what Flume is, where it is used, and the difference between Flume and Sqoop.

2
What is Flume?
3
Data Flow Model

In this video, we explain how Flume works, what a Flume agent is, what the components of a Flume agent are, and how data flows between the various components of Flume.

4
Data Flow Model
5
Flume Configuration File

In this video, we explain the components of Flume and how they are configured, i.e., how a Flume agent is configured.
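
A minimal sketch of a single-agent configuration, written to a local file; all names (agent a1, source r1, channel c1, sink k1) are arbitrary:

```bash
cat > netcat-agent.conf <<'EOF'
a1.sources  = r1
a1.channels = c1
a1.sinks    = k1

# source: listen for lines of text on a TCP port
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# channel: buffer events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000

# sink: log events to the console
a1.sinks.k1.type = logger

# wire the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
EOF
```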

6
Flume Configuration File
7
HelloWorld example in Flume

In this video, we explain how to run a Flume agent and get a result.
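
A sketch of running the agent defined in the netcat-agent.conf example above:

```bash
# Start the agent, logging events to the console.
flume-ng agent --conf conf --conf-file netcat-agent.conf --name a1 \
  -Dflume.root.logger=INFO,console

# Then, from a second terminal, send it an event:
# echo "hello flume" | nc localhost 44444
```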

8
Multi Agent flow

In this video, we explain what a multi-agent Flume flow is and what consolidation in Flume is.
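
A hedged sketch of the key pieces of a two-agent flow, with placeholder hosts and ports; these are fragments only, and each agent still needs its source/channel wiring as in the single-agent example above:

```bash
cat > agent1-extra.conf <<'EOF'
# on host1: an Avro sink that forwards events to the second agent
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = host2.example.com
a1.sinks.k1.port = 4545
EOF

cat > agent2-extra.conf <<'EOF'
# on host2: an Avro source that receives the forwarded events
a2.sources.r1.type = avro
a2.sources.r1.bind = 0.0.0.0
a2.sources.r1.port = 4545
EOF
```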

9
Multi Agent flow
10
Multiplexing

In this video, we explain what multiplexing is, its uses, channel selectors, etc.
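
A sketch of a multiplexing channel selector, with placeholder header values; the source must also list both channels (a1.sources.r1.channels = c1 c2):

```bash
cat >> netcat-agent.conf <<'EOF'
# route events whose "type" header is "rating" to c1, everything else to c2
a1.sources.r1.selector.type = multiplexing
a1.sources.r1.selector.header = type
a1.sources.r1.selector.mapping.rating = c1
a1.sources.r1.selector.default = c2
EOF
```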

11
Multiplexing
12
Interceptors in Flume

In this video, we explain what an interceptor is, why it is used, how it is configured and run, and what the types of interceptors are.
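
A minimal sketch chaining two built-in interceptors on a source (the names i1 and i2 are arbitrary):

```bash
cat >> netcat-agent.conf <<'EOF'
# timestamp stamps each event's header; host records the machine it passed through
a1.sources.r1.interceptors = i1 i2
a1.sources.r1.interceptors.i1.type = timestamp
a1.sources.r1.interceptors.i2.type = host
EOF
```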

13
Interceptors in Flume
14
Test Flume Knowledge
15
Book Recommendation Project Overview

In this video, we explain what recommendation is, using the concepts of a book recommendation system.

16
Book Recommendation Project Overview

Project Work

1
Book Recommendation Project - Sqoop Work - Part 1

In this video, we show how to load data into MySQL and then how to import that data into HDFS through Sqoop commands.
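
A hedged sketch of those two steps, with placeholder file, database, and table names throughout:

```bash
# 1) Load a CSV into MySQL (LOCAL INFILE must be enabled on the server).
mysql --local-infile=1 -u dbuser -p books_db \
  -e "LOAD DATA LOCAL INFILE 'books.csv' INTO TABLE books
      FIELDS TERMINATED BY ',';"

# 2) Import the table into HDFS with Sqoop.
sqoop import --connect jdbc:mysql://localhost/books_db \
  --username dbuser -P --table books --target-dir /project/books
```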

2
Book Recommendation Project - Sqoop Work - Part 2

In this video, we explain what a script is and how we can execute our job using a shell script.
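
A minimal sketch of such a wrapper script, assuming a saved Sqoop job with the hypothetical name books_import exists; make it executable with chmod +x and it can be run (or scheduled via cron) with one command:

```bash
#!/bin/bash
# run_import.sh - execute the saved Sqoop job (placeholder name).
set -e
sqoop job --exec books_import
```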

3
Book Recommendation Project - Flume Work

In this video, we show how the book recommendation works and how the ratings are generated in HDFS through Flume.
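
A hedged sketch of an agent landing rating events in HDFS, with placeholder names and paths: a spooling-directory source watches for new rating files, and an HDFS sink writes them under /project/ratings:

```bash
cat > ratings-agent.conf <<'EOF'
a1.sources  = r1
a1.channels = c1
a1.sinks    = k1

# source: pick up files dropped into a local spool directory
a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = /tmp/ratings_incoming

# channel: durable file-backed buffer
a1.channels.c1.type = file

# sink: write plain-text events into HDFS
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = /project/ratings
a1.sinks.k1.hdfs.fileType = DataStream

a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
EOF
```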

4
Bonus Lecture
3.9 out of 5 (88 ratings)

Detailed Rating

  • 5 stars: 34
  • 4 stars: 28
  • 3 stars: 16
  • 2 stars: 3
  • 1 star: 7
30-Day Money-Back Guarantee

Includes

  • 2 hours on-demand video
  • 1 article
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion