4.05 out of 5
155 reviews on Udemy

Talend For Big Data Integration Course : Beginner to Expert

Master guide for using Talend Big Data
Learn Basic concepts of Big Data (Hadoop)
Create cluster Metadata manually, from configuration files and automatically
Create HDFS and Hive metadata
Connect to your cluster to use HDFS, HBase, Hive, Pig, Sqoop and MapReduce
Learn to design Big Data Batch Jobs using Spark as the framework.
Learn about EMR and how to start/stop a cluster from Talend.
Read and Write data to/from HDFS (HDFS, HBase)
Read and Write tables to/from HDFS (Hive, Sqoop)
Processing Tables stored on HDFS with Hive
Processing data stored on HDFS with Pig
Use Talend Open Studio for Big Data for real work as quickly as possible.
Write Talend Big Data v6 Certified Developer Exam
Work on Cloudera Hadoop Distribution
Work on HortonWorks Hadoop Distribution
Over 100 Lectures and Hours of Content !
Over 50 Exercises and Quiz Questions!
Once you finish this course, I guarantee you will pass the certification exam. (Of course, you have to practice whatever I teach in this course :-)).
You will get Source code and Data Used in all 50 + Exercises.
You will get Source code and Data Used in all 100 + Jobs Designed in the Course.
I will respond to all your questions within 24 hours.
50 % Off on my other course (Talend Data Integration Course : Beginner to Expert) - Email me
A practice test which mimics the actual Talend Certification test.

Course Description

Talend Open Studio for Data Integration is an open source ETL tool, which means small companies or businesses can use it to Extract, Transform and Load their data into databases or any file format (Talend supports many file formats and database vendors).

Talend Open Studio for Big Data is an open source tool used to interact with Big Data systems from Talend.

If you want to learn how to use Talend Open Studio for Big Data from SCRATCH, or if you want to IMPROVE your skills in Big Data concepts and designing Talend Jobs, then this course is right for you.

It's got EVERYTHING: it covers almost all the topics in Talend Open Studio for Big Data.

Talks about Real Time USE CASES.

Prepares you for the Certification Exam.

By the end of the course you will have mastered working with Big Data by designing Talend Jobs.

And what more could you ask? All the videos are HD quality.

Email me for coupons.

What Are the System Requirements ?

  • PC or Mac.

  • VirtualBox, which is FREE.

  • Talend software, which is FREE.

  • HDP VM, which is FREE.

  • CDH VM, which is FREE.

What Does the Course Cover ?

1
What Does the Course Cover ?

This video talks about the topics that are covered in this course.

2
How to Download The Data files and Job files ?

This video will show you how to download data files and Talend jobs that are designed as part of this course.

TALEND OVERVIEW

1
Introduction to Talend Open Studio for Big Data

This lecture gives you an overview of what Talend Open Studio for Big Data is and also covers the additional features in the subscription version.

2
Installing Talend Open Studio for Big Data on Windows/Mac/Linux

After watching this lecture, you should be able to download and open Talend Open Studio for Big Data on your Windows OS.

BIG DATA OVERVIEW

1
The Three Vs of Big Data

You should be able to describe the three Vs of Big Data and explain what Big Data is.

2
About Hadoop

Introduction to Hadoop and its advantages over traditional systems.

3
The Hadoop Ecosystem

Walks you through different Hadoop ecosystem Tools.

4
HDFS - Understanding Block Storage, NameNode and DataNode

What an HDFS block is, what a NameNode is and its functions, and what a DataNode is and its functions.
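
As a rough illustration of block storage, the sketch below estimates how a file is split into HDFS blocks. The 128 MB block size and replication factor of 3 are assumptions (they are the usual Hadoop 2 defaults for `dfs.blocksize` and `dfs.replication`, and both are configurable); the function name is hypothetical and not part of the course material.

```python
import math

BLOCK_SIZE_MB = 128   # assumed: usual Hadoop 2 default (dfs.blocksize), configurable
REPLICATION = 3       # assumed: usual default (dfs.replication), configurable

def hdfs_block_layout(file_size_mb):
    """Return (number of blocks, size of last block in MB, raw storage in MB)."""
    if file_size_mb <= 0:
        return 0, 0, 0
    # A file is cut into fixed-size blocks; only the last one may be smaller.
    num_blocks = math.ceil(file_size_mb / BLOCK_SIZE_MB)
    last_block_mb = file_size_mb - (num_blocks - 1) * BLOCK_SIZE_MB
    # Every block is stored REPLICATION times across DataNodes.
    raw_storage_mb = file_size_mb * REPLICATION
    return num_blocks, last_block_mb, raw_storage_mb

if __name__ == "__main__":
    print(hdfs_block_layout(300))
```

The NameNode only tracks which blocks make up a file and where their replicas live; the DataNodes hold the block contents themselves.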

5
HDFS - Architecture

This lecture explains what happens when you read a file from, or write a file to, HDFS.

6
MapReduce - Overview of MapReduce

Gives an overview of MapReduce and its functions.

7
MapReduce - Understanding MapReduce

Explains the Map phase, the Shuffle and Sort phase, and the Reduce phase in a MapReduce job.

8
MapReduce - The Key/Value Pairs of MapReduce

Explains the different types of key/value pairs generated as part of a MapReduce job.
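
To make the three phases and their key/value pairs concrete, here is a minimal single-process sketch of a word count in plain Python. It only imitates the data flow of a MapReduce job; it does not use Hadoop, and the function names are hypothetical.

```python
from collections import defaultdict

def map_phase(lines):
    """Map: emit an intermediate (word, 1) pair for every word."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1

def shuffle_sort(pairs):
    """Shuffle and sort: group values by key, then order the keys."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return sorted(grouped.items())

def reduce_phase(grouped):
    """Reduce: collapse each (word, [1, 1, ...]) into (word, total)."""
    return [(key, sum(values)) for key, values in grouped]

def word_count(lines):
    return reduce_phase(shuffle_sort(map_phase(lines)))

if __name__ == "__main__":
    print(word_count(["big data", "big jobs"]))
```

In this sketch the input pairs are lines of text, the intermediate pairs are (word, 1), and the output pairs are (word, total). On a real cluster the same three stages run distributed across many nodes.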

9
HDFS - HDFS Federation & NameNode High Availability in Hadoop 2
10
YARN - The Components of YARN

Briefly explains YARN and its daemons.

11
YARN - Lifecycle of a YARN Application

Explains what happens when you run an application on YARN.

12
Big Data Overview - Quiz

Getting Started

1
Installing Cloudera CDH VM

After watching this video, you should be able to download the CDH VirtualBox image and have your cluster up and running.

2
Installing HortonWorks Sandbox VM

After watching this video, you should be able to download the HDP VirtualBox image and have your cluster up and running.

3
Opening Talend project

You should be able to create and open a Talend project.

4
Creating Hadoop Cluster Metadata in Talend for HDP

This lecture walks you through the process of creating Hadoop cluster metadata for HDP.

5
Creating Hadoop Cluster Metadata in Talend for CDH

HDFS Components

1
HDFS - Basic Commands Using Unix Shell

This video explains what HDFS is and shows you how to run different HDFS commands on the cluster.

For example, how to create a file or directory, how to copy a file from the local system to HDFS, etc.

2
How to create a reusable connection to the HDFS ?

It will show you how to create an HDFS connection in a way that lets you reuse it in all of your HDFS jobs.

3
How to copy a source file or folder into a target directory on HDFS ?

Explains the scenario where you have to copy a file or directory from your local system to HDFS.

4
How to retrieve a list of files or folders based on a filemask pattern ?
5
How to copy files from HDFS to HDFS ?
6
How to get files from HDFS into local directory ?
7
How to rename the selected files or specified directory on HDFS ?
8
How to check whether a file exists in a specific directory in HDFS ?
9
How to delete a file located on a given HDFS ?
10
How to read a file located on a given HDFS and Assign schema to it?
11
How to count the number of rows in a file in HDFS ?
12
How to present the properties of a file processed in HDFS ?
13
How to transfer data flows into a given HDFS file system ?
14
How to transfer data in the form of a single column into a given HDFS ?
15
How to compare two files on HDFS ?

Explains the scenario where you have to compare two different files that are on HDFS.

HIVE Components

1
What is Hive ?
2
Hive Architecture
3
HiveQL Vs SQL
4
How to Connect to Hive Shell
5
How to Create Hive Managed and External Tables Using Hive Shell
6
How to Load data from HDFS & Local File System to Hive table using Hive Shell
7
How to Load data from one Hive table to another Hive table using Hive Shell
8
How to join two HIVE Tables using Hive Shell
9
How to READ data from a HIVE Table and filter data using Hive Shell
10
How to open a connection to a Hive database using Talend?
11
How to close a connection to a Hive database using Talend?
12
How to create a Hive table using Talend?
13
How to extract data from Hive using Talend?
14
How to write data of different formats into a given Hive table using Talend?
15
How to execute the HiveQL query using Talend?
16
Hive - Quiz

PIG Components

1
What is Pig ?
2
What are the different Datatypes supported by Pig ?
3
How to Assign a schema to input file using Grunt Shell ?
4
What are aliases and relations, and how to load a file into a Pig alias ?
5
Pig - GROUP, GROUP ALL, DUMP, STORE, FILTER, LIMIT Operators
6
Pig - FOREACH, COUNT, MAX Operators
7
Pig - ORDER BY, DISTINCT, JOIN, COGROUP
8
How to load input data to an output stream in one single transaction ?
9
How to filter data from a relation based on conditions ?
10
How to select one or more columns from a relation ?
11
How to store the result of your Pig Job into a defined data storage space ?
12
How to remove duplicate tuples in a relation ?
13
How to perform the Pig COGROUP operation ?
14
How to perform aggregations on input data to create data to be used by Pig ?
15
How to perform join of two files based on join keys ?
16
How to sort a relation based on one or more defined sort keys ?
17
How to duplicate the incoming schema into identical output flows as needed ?
18
How to compute the cross data of two or more relations ?
19
How to transform data from multiple sources to multiple targets ?
20
How to integrate personalized Pig Code into a Talend Job ?
21
Pig - Quiz

SQOOP Components

1
What is Sqoop ?

Download the jobs designed in this section from here.

2
How to transfer data from a RDBMS into the HDFS ? - Part1


3
How to transfer data from a RDBMS into the HDFS ? - Part2
4
How to transfer all of the tables of a RDBMS into the HDFS ?

Coming Soon :-)

5
How to import incremental data ? - Part1


6
How to import incremental data ? - Part2
7
How to import incremental data ? - Part3
8
How to transfer data from the HDFS to a RDBMS ?


9
Sqoop - Quiz

Big Data Batch Jobs (EMR Spark) - Coming Soon

1
Environment for Running Big Data Batch Jobs in this course

A walk-through of the architecture we are using in this course.

You will get an overview of EMR.

You will get an overview of S3.

You will get an overview of Spark.

2
What are Big Data Batch Jobs? How do they work?
  • You will learn what Big Data Batch Jobs (BDBJs) are in Talend.

  • You will learn how to run a BDBJ.

  • You will learn what exactly happens when you run a BDBJ.


3
Setting up EMR

You will learn what access keys and secret keys are and how to set them up on AWS.

You will learn what roles are on AWS and how to create them.

4
Creating/Starting/Stopping an EMR cluster using AWS Console

You will learn how to create/start/stop an EMR cluster using the AWS Console.

5
Creating/Starting/Stopping an EMR cluster using Talend.

You will learn how to create/start/stop an EMR cluster using Talend Studio.

6
Creating a Big Data Batch Job in Talend Studio using Spark as the framework.

You will learn how to create a Big Data Batch Job (BDBJ) using Spark as the framework.

You will learn how to configure a BDBJ.

You will learn how to run a BDBJ.

You will learn about the components that are part of a BDBJ.

HCATALOG Components

1
What is HCatalog ?


2
How to perform Operations on HCatalog managed Hive database/table/partition


4.1 out of 5
155 Ratings

Detailed Rating

Stars 5
43
Stars 4
50
Stars 3
42
Stars 2
12
Stars 1
8
30-Day Money-Back Guarantee

Includes

17 hours on-demand video
11 articles
Full lifetime access
Access on mobile and TV
Certificate of Completion