Big Data Internship Program – Data Processing – Hive and Pig
This course is part of the “Big Data Internship Program”, which is aligned with the stages of a typical big data project life cycle.
This course focuses on data processing in big data. It is suitable for developers, data analysts, and business analysts. Experience with SQL and scripting languages is recommended but not required.
You will learn
- Hive core concepts and architecture.
- How to create and manipulate tables using Hive.
- Advanced features of Hive.
- Hive Best Practices
- Performing real-time, complex queries on datasets
- Pig’s Architecture
- Reading and Writing Data with Pig
- Pig Best Practices
Project work –
- Provide data in Hive and manipulate it for our Book Recommendation project.
- One add-on project: Data Masking with Hive and Sqoop
Data Processing Introduction in Big Data
In this video, we have explained the structure of the course and how it is useful for both big data experts and beginners.
In this video, we have explained what data processing is, how it is done in a big data environment, what the big data cycle is, and why big data processing is important in different areas.
In this video, we have explained how Hadoop is applicable in the retail market and how Hadoop can play an important role in customer analysis.
In this video, we have given a short introduction to Hive: why Facebook uses Hive, where Hive was developed, what are Hive
In this video, we have explained what the Hive architecture is, how Hive integrates with its other components, and how Hive executes a query.
In this video, we have explained which data types are available in HiveQL: the primitive data types and the collection data types.
In this video, we have explained what an internal table is, what an external table is, how they are used, and what the features of Hive internal and external tables are.
In this video, we have explained what external and internal tables are, and what the benefits of an external table over an internal table are.
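To make the difference concrete, here is a minimal Python sketch of the behaviour that usually motivates external tables: dropping an internal (managed) table removes its data, while dropping an external table removes only the metadata. The `MiniMetastore` class, the table names, and the directory layout are hypothetical stand-ins for illustration, not Hive APIs.

```python
import os
import shutil
import tempfile


class MiniMetastore:
    """Toy stand-in for a metastore: table name -> (data directory, is_external)."""

    def __init__(self):
        self.tables = {}

    def create_table(self, name, location, external):
        os.makedirs(location, exist_ok=True)
        self.tables[name] = (location, external)

    def drop_table(self, name):
        location, external = self.tables.pop(name)
        if not external:
            # Internal (managed) table: dropping it deletes the data as well.
            shutil.rmtree(location)
        # External table: only the metadata entry is removed; files stay put.


def demo():
    root = tempfile.mkdtemp()
    try:
        store = MiniMetastore()
        managed_dir = os.path.join(root, "warehouse", "books_managed")
        external_dir = os.path.join(root, "landing", "books_external")
        store.create_table("books_managed", managed_dir, external=False)
        store.create_table("books_external", external_dir, external=True)

        # Put one (made-up) data file into each table directory.
        for directory in (managed_dir, external_dir):
            with open(os.path.join(directory, "part-0.csv"), "w") as handle:
                handle.write("1,Some Title\n")

        store.drop_table("books_managed")
        store.drop_table("books_external")
        # Report whether each data directory survived the drop.
        return os.path.exists(managed_dir), os.path.exists(external_dir)
    finally:
        shutil.rmtree(root, ignore_errors=True)


managed_survived, external_survived = demo()
print(managed_survived, external_survived)
```

In this sketch the managed table's directory is gone after the drop, while the external table's files remain available to other tools.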
In this video, we have explained partitioning in Hive and what it means to partition a table.
In this video, we have explained dynamic partitioning in Hive: what dynamic partitions are used for, how to create a dynamically partitioned table, etc.
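The idea behind dynamic partitioning can be sketched in a few lines of Python: instead of naming each partition by hand, the partition is derived from a column value in every row. The `dynamic_partition` function and the sample `ratings` rows below are hypothetical and only mimic the `column=value` directory naming that Hive uses. This is not how Hive is implemented.

```python
from collections import defaultdict


def dynamic_partition(rows, partition_column):
    """Route each row to a partition named after its partition column value."""
    partitions = defaultdict(list)
    for row in rows:
        row = dict(row)  # copy so the caller's data is not modified
        value = row.pop(partition_column)
        # Hive-style partition directory name: column=value.
        # The partition value lives in the directory name, not in the row.
        partitions[f"{partition_column}={value}"].append(row)
    return dict(partitions)


# Made-up sample rows for illustration only.
ratings = [
    {"book_id": 1, "rating": 5, "country": "usa"},
    {"book_id": 2, "rating": 3, "country": "india"},
    {"book_id": 3, "rating": 4, "country": "usa"},
]

for name, rows in sorted(dynamic_partition(ratings, "country").items()):
    print(name, rows)
```

A query that filters on the partition column then only needs to read the matching partition, which is the main reason to partition a table at all.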
In this video, we have shown the Pig architecture and how a Pig statement is executed in Pig's Grunt mode.
In this video, we have shown the different data types available in Pig, the kinds of data types, and what relations, bags, tuples, etc. are.
In this video, we have shown the different types of operators available in Pig Latin, such as binary, ternary, and flatten; how to load data using PigStorage(); and the DUMP, STORE, LIMIT, DISTINCT, ORDER BY, and GROUP operators, etc.
In this video, we have shown how to execute a word count task in Pig Latin.
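For readers who want to see the shape of the task before watching, the word count dataflow can be sketched in Python. This is not the Pig Latin script from the video. The comments only indicate which Pig Latin step each line roughly corresponds to, and the `word_count` function and sample lines are made up for illustration.

```python
from collections import Counter


def word_count(lines):
    """Count how often each whitespace-separated word occurs in the lines."""
    # Roughly TOKENIZE + FLATTEN: split every line and emit one flat stream of words.
    words = (word for line in lines for word in line.split())
    # Roughly GROUP BY word + COUNT: tally the occurrences of each word.
    return Counter(words)


counts = word_count(["hive and pig", "pig latin"])
for word, total in sorted(counts.items()):
    print(word, total)
```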
Data Processing in Recommendation Project
In this video, we have explained how to execute our Book Recommendation project using Hive, Sqoop, and MySQL, and how to upload data into the system for processing.
In this video, we have explained some table attributes, how to access them, and how to optimize query execution in Hive. We have also done some hands-on work on the recommendation database to analyze tables and review the results.
Add-on Project Data Masking
In this video, we have explained the data masking project: which components we are going to use, what data masking is used for, what the project requirements are, and what the flow of the project is.
In this video, we have explained the data masking solution: how the project is executed, what the goal of the project is, and which software and tools we are going to use in its execution.
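As a rough illustration of what masking a field means, here is a small Python sketch of two common approaches: hiding all but the last few characters, and replacing the value with a salted hash. The course description does not say which masking technique the project uses, so the function names, the salt, and the sample value below are all assumptions.

```python
import hashlib


def mask_keep_last(value, visible=4, mask_char="*"):
    """Replace every character except the last `visible` ones with mask_char."""
    if len(value) <= visible:
        # Too short to reveal anything safely: mask the whole value.
        return mask_char * len(value)
    return mask_char * (len(value) - visible) + value[-visible:]


def mask_hash(value, salt="demo-salt"):
    """Replace the value with a salted SHA-256 digest (hypothetical salt)."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()


# Made-up sample value for illustration only.
card_number = "4111111111111111"
print(mask_keep_last(card_number))
print(mask_hash(card_number))
```

The first approach keeps the value recognisable to its owner, while the hash keeps equal inputs equal, so masked columns can still be joined or grouped.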
In this video, we have explained the step-by-step flow of the Data Masking project and the different stages of the project.
In this video, we have explained how to create an external table, how to load data into the external table, and how to import data from the external table into a Hive table.