3.35 out of 5 (17 reviews on Udemy)

Project-Based Training on Big Data HADOOP: MapReduce, PIG, HIVE

Real-time applications: hands-on work in Hadoop, MapReduce, Pig and Hive, with sample input data and sample code included.

You can develop real-time applications using MapReduce, Pig and Hive, and you will easily understand the concepts of Big Data and Hadoop. One real-time application is included in this course, so you can easily see how Big Data Hadoop tools such as MapReduce, Pig and Hive fit together. Sample data and sample code for that project are also included.

This course is intended for anyone interested in learning Hadoop technologies.

The training provides complete, practical coverage of Big Data Hadoop (MapReduce, Pig and Hive) with real-time applications. You can develop a complete Big Data Hadoop project using MapReduce, Pig and Hive, since sample data and sample code are supplied with the course.

Big Data and HADOOP (Concept)

1. What is Data?
2. What is Big Data?
3. Data Sources of Big Data, Part 1
4. Data Sources of Big Data, Part 2
5. Traditional Analytics vs. Big Data Analytics
6. Big Data Customers: Many Industry Domains
7. Big Data Attributes: Volume
8. Variety of Data
9. Velocity of Data
10. Veracity of Data
11. Hadoop History
12. Hadoop Concepts
13. Hadoop Ecosystem
14. Hadoop Core Components
15. Hadoop Distributions
16. HDFS Blocks and File Splits
17. HDFS Write Operation
18. Hadoop 2.x Architecture, Part 1
19. Hadoop 2.x Architecture, Part 2

Understanding MapReduce and HADOOP Installation (Concept)

1. MapReduce Components
2. Understanding MapReduce Flow
3. Client Communication
4. Need for YARN
5. HDFS Architecture
6. NodeManager
7. Hadoop Cluster Modes
8. Secondary NameNode
9. Hadoop 2.7.3 Installation, Part 1
10. Hadoop 2.7.3 Installation, Part 2
11. Hadoop 2.7.3 Configuration Files
12. Hadoop Basic Commands

Apache PIG (Concept)

1. Introduction
2. Overview
3. Overview, Continued
4. Set Up Cloudera on Windows for PIG
5. Basic Commands in PIG
6. GROUP BY and COGROUP Operators Demo in PIG
7. Load and Store Functions in PIG

Project: Banking and finance domain project using MapReduce, PIG and HIVE

1. Problem Statement

Project: Analyze Loan Dataset

Industry: Banking and Finance

Data: Publicly available dataset which contains complete details of all the loans issued, including the current loan status (Current, Late, Fully Paid, etc.) and latest payment information.

Problem Statement:

1. Calculate the overall average risk
2. Calculate the average risk per location
3. Calculate the average risk per category/loan type
4. Calculate the average risk per location and category
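All four calculations above are variations of one arithmetic pattern: sum the risk values in a group, then divide by the group's size. Below is a minimal plain-Java sketch of that computation; the column layout (risk score in the eighth comma-separated field) and the sample rows are assumptions mirroring the sample MapReduce code later in this course.

```java
import java.util.Arrays;
import java.util.List;

public class AverageRiskSketch
{
    // Returns the mean of the risk column (index 7) over the given CSV rows.
    static double averageRisk(List<String> rows)
    {
        double sum = 0;
        int count = 0;
        for (String row : rows)
        {
            String[] parts = row.split(",");
            sum += Double.parseDouble(parts[7]);
            count++;
        }
        return sum / count;
    }

    public static void main(String[] args)
    {
        // Hypothetical rows: id, name, loan code, ..., risk score, location
        List<String> rows = Arrays.asList(
            "1,a,\"HL101\",x,x,x,x,0.5,NY",
            "2,b,\"PL202\",x,x,x,x,0.7,TX");
        System.out.println(averageRisk(rows));
    }
}
```

The per-location and per-category variants follow the same pattern with one (sum, count) pair per group; in MapReduce, that grouping is exactly what the shuffle between mapper and reducer provides.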

2. Input Data and Example

Input data files are attached here.

Calculate Average Risk using MapReduce

1. Calculate Average Risk using MapReduce

class Mapper
{
    setup() { }

    // input: byte offset as key, one line of the file as value
    map(byteOffset, line)
    {
        // parse the risk value from the line and emit it
        // emit(null, risk)
    }

    cleanup() { }
}

class Reducer
{
    setup() { }

    // input key = null, input values = all risk values
    reduce(key, risks)
    {
        // accumulate sum and count over all risk values
    }

    cleanup()
    {
        avg = sum / count
        // emit the overall average
    }
}

2. MapReduce Coding for Calculating Average Risk
3. Execution of MapReduce Code for Average Risk Calculation

Calculate Average Risk per Location using MapReduce

1. Calculate Average Risk per Location using MapReduce
2. Execution of MapReduce Code for Average Risk per Location

Calculate Average Risk per Category using MapReduce

1. Calculate Average Risk per Category using MapReduce
2. Execution of MapReduce Code for Average Risk per Category
3. Calculate Average Risk per Location and Category using MapReduce

Banking and Finance Domain Analysis using Pig

1. Calculate Overall Average Risk using Pig
2. Other Scenarios using Pig

Environment Setup and Importing Data using Sqoop

1. Set Up Cloudera to Work with Big Data Tools
2. Import Data from an RDBMS to HDFS using Sqoop

Banking and Finance Domain Analysis using HIVE

1. Banking and Finance Domain Analysis using HIVE, Part 1
2. Banking and Finance Domain Analysis using HIVE, Part 2

Sample Input Data and Sample Code

1. MapReduce Sample Code

MyDriver.java

------------------------------------------------------------------------

package BankingAvrageRisk;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MyDriver
{
    public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException
    {
        Configuration conf = new Configuration();

        // Problem 2: pass the location to search for as a job parameter
        // conf.set("LocSearch", args[0]);

        // Job.getInstance() replaces the deprecated Job(conf, name) constructor
        Job job = Job.getInstance(conf, "Calculate Avg Risk per Category/Loan Type");
        job.setJarByClass(MyDriver.class);
        job.setMapperClass(MyMapper.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(DoubleWritable.class);
        job.setReducerClass(MyReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(DoubleWritable.class);
        job.setNumReduceTasks(1);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

--------------------------------------------------------------------

MyMapper.java

-------------------------------------------------------------------

package BankingAvrageRisk;

import java.io.IOException;

import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MyMapper extends Mapper<LongWritable, Text, Text, DoubleWritable>
{
    @Override
    public void map(LongWritable key, Text value, Context con) throws IOException, InterruptedException
    {
        String line = value.toString();
        String[] linePart = line.split(",");

        // Problem 1: overall average risk (risk score is in column 8)
        double risk = Double.parseDouble(linePart[7]);

        // Problem 2: average risk per location
        // String loc = linePart[8];

        // Problem 3: average risk per category; the two-letter loan code
        // starts at index 1 because the field begins with a quote character
        String cat = linePart[2].substring(1, 3);

        if (cat.equals("HL"))
        {
            cat = "Home Loan";
        }
        else if (cat.equals("PL"))
        {
            cat = "Personal Loan";
        }
        else if (cat.equals("VL"))
        {
            cat = "Vehicle Loan";
        }
        else
        {
            cat = "Retailer Loan";
        }

        con.write(new Text(cat), new DoubleWritable(risk));
    }
}

----------------------------------------------------------------------

MyReducer.java

---------------------------------------------------------------------

package BankingAvrageRisk;

import java.io.IOException;

import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class MyReducer extends Reducer<Text, DoubleWritable, Text, DoubleWritable>
{
    @Override
    public void reduce(Text key, Iterable<DoubleWritable> val, Context con) throws IOException, InterruptedException
    {
        // Sum and count are local variables so each key's average is computed
        // independently; instance fields would leak totals across keys.
        double sum = 0;
        int count = 0;

        for (DoubleWritable value : val)
        {
            sum += value.get();
            count++;
        }

        con.write(key, new DoubleWritable(sum / count));
    }

    // For Problem 1 (the overall average, where every record shares a single
    // key), the sum and count can instead be kept in instance fields and the
    // average emitted once from cleanup().
}
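The mapper's category logic depends on `substring(1, 3)`, which assumes the loan-type field carries one extra leading character (such as a quote) before the two-letter code. Here is a standalone sketch of just that mapping, using hypothetical loan codes:

```java
public class LoanCategorySketch
{
    // Maps a raw loan-type field (e.g. "\"HL123\"") to a category name;
    // substring(1, 3) skips the assumed leading quote character.
    static String category(String field)
    {
        String code = field.substring(1, 3);
        if (code.equals("HL")) return "Home Loan";
        if (code.equals("PL")) return "Personal Loan";
        if (code.equals("VL")) return "Vehicle Loan";
        return "Retailer Loan";  // any unrecognized code
    }

    public static void main(String[] args)
    {
        System.out.println(category("\"HL123\""));  // prints "Home Loan"
        System.out.println(category("\"XX999\""));  // prints "Retailer Loan"
    }
}
```

If the field did not have the leading quote, `substring(1, 3)` would misread the code, so it is worth verifying the raw data format before running the job.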




2. Sample Input Data Sets

Sample input data sets are attached here.

3. MapReduce Sample Code in PDF Format

Copy the contents from the PDF and save them as .java files.


Detailed Rating

5 stars: 7
4 stars: 2
3 stars: 4
2 stars: 0
1 star: 4
30-Day Money-Back Guarantee

Includes

8 hours on-demand video
3 articles
Full lifetime access
Access on mobile and TV
Certificate of Completion