Course Description
Talend Open Studio for Data Integration is an open-source ETL tool, which means small companies and businesses can use it to extract, transform, and load their data into databases or files (Talend supports many file formats and database vendors).
Talend Open Studio for Big Data is an open-source tool for working with Big Data systems from Talend.
If you want to learn how to use Talend Open Studio for Big Data from SCRATCH, or if you want to IMPROVE your skills in Big Data concepts and in designing Talend Jobs, then this course is right for you.
It's got EVERYTHING: it covers almost all the topics in Talend Open Studio for Big Data.
It talks about real-time USE CASES.
It prepares you for the certification exam.
By the end of the course, you will have mastered working with Big Data by designing Talend Jobs.
And what more could you ask for? All the videos are HD quality.
Email me for coupons.
What Are the System Requirements?
PC or Mac.
VirtualBox, which is FREE.
Talend software, which is FREE.
HDP VM, which is FREE.
CDH VM, which is FREE.
What Does the Course Cover?
This video talks about the topics that are covered in this course.
This video will show you how to download data files and Talend jobs that are designed as part of this course.
TALEND OVERVIEW
This lecture gives you an overview of what Talend Open Studio for Big Data is and also talks about the additional features in the subscription version.
After watching this lecture, you should be able to download and open Talend Open Studio for Big Data on your Windows OS.
BIG DATA OVERVIEW
You should be able to describe what Big Data is and its 3 V's.
Introduction to Hadoop and its advantages over traditional systems.
Walks you through the different Hadoop ecosystem tools.
Explains what an HDFS block is, what a NameNode is and what it does, and what a DataNode is and what it does.
This lecture explains what happens when you read a file from HDFS or write a file to HDFS.
Gives an overview of MapReduce and its functions.
Explains the Map phase, the Shuffle and Sort phase, and the Reduce phase in a MapReduce job.
Explains the different types of key-value pairs generated as part of a MapReduce job.
Briefly explains YARN and its daemons.
Explains what happens when you run an application on YARN.
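To make the MapReduce lectures above more concrete, here is a minimal word-count sketch of the Map, Shuffle and Sort, and Reduce phases and of the key-value pairs each one produces. It runs in a single Python process and is not Hadoop code or anything generated by Talend; the function names and the sample input are assumptions chosen purely for illustration.

```python
from collections import defaultdict


def map_phase(lines):
    """Map: emit an intermediate (word, 1) pair for every word in every line."""
    for line in lines:
        for word in line.lower().split():
            yield (word, 1)


def shuffle_and_sort(pairs):
    """Shuffle and Sort: group all values by key, then order the keys."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return sorted(grouped.items())


def reduce_phase(grouped):
    """Reduce: collapse each key's list of values into one output pair."""
    return [(key, sum(values)) for key, values in grouped]


# Hypothetical input, standing in for the lines of a file stored on HDFS.
lines = ["big data big jobs", "talend jobs"]

intermediate = list(map_phase(lines))     # e.g. ('big', 1), ('data', 1), ...
grouped = shuffle_and_sort(intermediate)  # e.g. ('big', [1, 1]), ...
counts = reduce_phase(grouped)

print(counts)
# [('big', 2), ('data', 1), ('jobs', 2), ('talend', 1)]
```

On a real cluster the map and reduce functions run as many parallel tasks on different nodes, and the shuffle moves data across the network; the sketch only mirrors the shape of the data at each step.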
Getting Started
After watching this video, you should be able to download the CDH VirtualBox image and have your cluster up and running.
After watching this video, you should be able to download the HDP VirtualBox image and have your cluster up and running.
You should be able to create and open a Talend project.
This lecture walks you through the process of creating Hadoop cluster metadata using HDP.
HDFS Components
This video explains what HDFS is and shows you how to run different HDFS commands on the cluster.
For example, how to create a file or directory, how to copy a file from the local system to HDFS, etc.
It will show you how to create an HDFS connection that you can reuse in all of your HDFS jobs.
Explains the scenario where you have to copy a file or directory from your local system to HDFS.
Explains the scenario where you have to compare two different files that are on HDFS.
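As a rough illustration of the kind of HDFS commands mentioned above (creating a directory or file, copying a local file to HDFS), the sketch below builds a few `hdfs dfs` command lines from Python. The local file name, the HDFS directory, and the helper names are hypothetical, and this is not how Talend's HDFS components work internally; it only shows the equivalent command-line calls.

```python
import shutil
import subprocess


def hdfs_dfs(*args):
    """Build the argument list for one `hdfs dfs` command."""
    return ["hdfs", "dfs", *args]


# Hypothetical paths, chosen only for illustration.
LOCAL_FILE = "customers.csv"
HDFS_DIR = "/user/talend/input"

commands = [
    hdfs_dfs("-mkdir", "-p", HDFS_DIR),         # create a directory and any missing parents
    hdfs_dfs("-touchz", f"{HDFS_DIR}/_READY"),  # create an empty (zero-length) file
    hdfs_dfs("-put", LOCAL_FILE, HDFS_DIR),     # copy a local file into HDFS
    hdfs_dfs("-ls", HDFS_DIR),                  # list the directory contents
]


def run_all(cmds):
    """Run each command in order, or just print them if no hdfs client is installed."""
    if shutil.which("hdfs") is None:
        for cmd in cmds:
            print(" ".join(cmd))
        return False
    for cmd in cmds:
        subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return True
```

Calling `run_all(commands)` on a machine with a configured Hadoop client, such as inside the HDP or CDH VM, would execute the four commands in order; elsewhere it only prints them.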
HIVE Components
PIG Components
SQOOP Components
Download the jobs designed in this section from here.
Coming Soon :-)
Big Data Batch Jobs (EMR Spark) - Coming Soon
A walk-through of the architecture we are using in this course.
You will get an overview of EMR.
You will get an overview of S3.
You will get an overview of Spark.
You will learn what Big Data Batch Jobs (BDBJ) are in Talend.
You will learn how to run a BDBJ.
You will learn what exactly happens when you run a BDBJ.
You will learn what access keys and secret keys are and how to set them up on AWS.
You will learn what roles are on AWS and how to create them.
You will learn how to create/start/stop an EMR cluster using the AWS Console.
You will learn how to create/start/stop an EMR cluster using Talend Studio.
You will learn how to create a BDBJ using Spark as the framework.
You will learn how to configure a BDBJ.
You will learn how to run a BDBJ.
You will learn the components that are part of a BDBJ.