4.4 out of 5
76 reviews on Udemy

Big Data for Managers

A foundation course for big data that covers the big data tools for various stages of a big data project
Instructor: Ganapathi Devappa
446 students enrolled
English
Confidently lead a big data project in your organization
Differentiate big data technology from traditional technology
Talk about big data solution stages and cluster sizing with your development team, architects and CTOs
Select tools required for various stages of your big data project
Build an action plan for your big data analytics project using the 5 Ps model

This course covers the fundamentals of big data technology that you need to confidently lead a big data project in your organization. It covers big data terminology, such as the 3 Vs of big data, and the key characteristics of big data technology, so that you can answer the question ‘How is big data technology different from traditional technology?’ You will be able to identify the big data solution stages, from ingestion to visualization and security, and choose the right tool for each stage. You will see example uses of popular big data tools such as HDFS, MapReduce, Spark and Zeppelin, and a demo of setting up an EMR cluster on Amazon Web Services. You will practice using the 5 Ps methodology of data science projects to manage a big data project, combining theory with practice by applying it to many case studies. You will practice sizing your cluster with a template. You will explore more than 20 big data tools in the course and be able to choose a tool based on the big data problem.

Introduction

1
Introduction to the course

2
Course prerequisites and course structure
3
Big data sizes
4
Case study: Traditional solution vs Big data solution
5
Activity: Calculate the data sizes for big data projects in your organization

Use the provided template to list the big data projects (or any projects) in your organization and estimate the data sizes
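The kind of estimate this activity asks for can be sketched as simple arithmetic. The snippet below is only an illustration, not the course template: the function names and the sample project figures (50 million records per day at 2,000 bytes each) are made-up assumptions.

```python
# Hypothetical helpers for a back-of-the-envelope data size estimate.
# Not taken from the course template; all figures below are invented.

def estimate_yearly_bytes(records_per_day: int, avg_record_bytes: int, days: int = 365) -> int:
    """Raw data volume for one project over the given number of days."""
    return records_per_day * avg_record_bytes * days


def human_readable(num_bytes: float) -> str:
    """Format a byte count using decimal units (1 KB = 1000 B)."""
    units = ["B", "KB", "MB", "GB", "TB", "PB"]
    size = float(num_bytes)
    for unit in units:
        # Stop at the first unit that keeps the number below 1000,
        # or at the largest unit we know about.
        if size < 1000 or unit == units[-1]:
            return f"{size:.1f} {unit}"
        size /= 1000


# Made-up project: 50 million events per day, about 2,000 bytes per event.
yearly = estimate_yearly_bytes(50_000_000, 2_000)
print(human_readable(yearly))  # 36.5 TB
```

Repeating the calculation for each project in the list gives a rough ranking of which projects are large enough to need big data technology.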

Big data characteristics

1
Introduction
2
3 Vs of Big Data
3
Industry examples of big data
4
Big data analysis and visualization
5
Traditional vs big data technology
6
How is big data technology different?
7
Big data solution stages
8
Apache Hadoop and HDFS
9
MapReduce and YARN
10
Pig, Hive and Spark
11
Things to remember
12
Activity: Technology type selection

Download the technology Excel sheet and open it in Microsoft Excel. It lists various technology features; for each one, choose from the drop-down whether it is a big data or a traditional technology feature. The result shows whether your choice is correct or incorrect.

Big data storage

1
Introduction
2
Big data solution stages
3
Big data storage characteristics
4
NoSQL databases
5
HDFS
6
HBase
7
Cassandra
8
MongoDB and Impala
9
Sizing your cluster
10
Things to remember

Download the cluster sizing template to use for your big data projects

11
Activity: Size your big data cluster using the template

Download the Cluster_sizing_template Excel sheet and open it in Microsoft Excel. Use it to size various big data projects in your organization.
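As a rough idea of what cluster sizing involves, here is a minimal sketch of one common style of storage-based estimate. It is not the formula from the course template: the function name, the 25% overhead for temporary and intermediate data, and the 24 TB of usable disk per node are all assumptions chosen for illustration. The replication factor of 3 is the HDFS default.

```python
import math


def nodes_needed(raw_tb: float,
                 replication: int = 3,          # HDFS default replication factor
                 overhead: float = 0.25,        # assumed headroom for temp/intermediate data
                 usable_tb_per_node: float = 24.0) -> int:  # assumed disk per data node
    """Estimate the number of data nodes from storage needs alone."""
    total_tb = raw_tb * replication * (1 + overhead)
    return math.ceil(total_tb / usable_tb_per_node)


# Made-up project with 100 TB of raw data:
# 100 * 3 * 1.25 = 375 TB to store, 375 / 24 = 15.625, rounded up to 16 nodes.
print(nodes_needed(100))  # 16
```

A real sizing exercise would also account for data growth, compression, and compute requirements, which this sketch leaves out.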

12
Activity: Solve these storage exercises

Apply what you have learned in this section to estimate various storage solutions

Big data ingestion

1
Introduction
2
Solution stage Ingestion
3
Sources and types of data for ingestion
4
Big data ingestion tool features
5
Ingestion of batch data: Sqoop and DistCp
6
Streaming data ingestion using Flume
7
Kafka : A messaging system
8
Apache Flink
9
NiFi for data ingestion
10
Scenarios for big data ingestion
11
Data ingestion diagram
12
Things to remember
13
Activity: Ingestion problems

Apply what you have learned so far in the course to solve these big data ingestion problems.

14
Quiz 1

Recap your learning by taking this quiz

Big data analytics

1
Introduction
2
Characteristics of big data analysis
3
Analysis using MapReduce, Pig and Hive
4
Analysis using Spark
5
Analysis using Storm and StreamSets
6
Machine learning and machine learning techniques
7
Turning insights into action
8
Things to remember
9
Activity
10
Activity: Provide solutions to these situations

Big data visualization, security and vendors

1
Introduction
2
Traditional and new types of data visualization for big data
3
Tools for big data visualization: Tableau, QlikView and Zeppelin
4
JavaScript charts for visualization
5
Visualization summary
6
Big data security
7
Kerberos and Apache Knox
8
Apache Ranger and Apache Sentry
9
Best practices for big data security
10
Open source software, support and vendors: Cloudera, Hortonworks and MapR
11
Cloud computing options for big data
12
Things to remember
13
Demo: Set up a big data cluster on EMR and access Spark and S3 using Zeppelin

Amazon Web Services provides a ready-to-use big data cluster service called Elastic MapReduce, or EMR. In this demo, I will show you how to create a big data cluster on EMR with Spark, Hadoop and Zeppelin already set up, access Spark using Zeppelin, load data stored on S3 into Spark, apply map-reduce type processing in Spark, query the results in Zeppelin using SQL, and visualize the results graphically in Zeppelin.
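The map-reduce style of processing mentioned in this demo can be illustrated without a cluster. The sketch below uses plain Python as a stand-in, not Spark and not the code from the demo; the input lines are made up. Each line is first mapped to partial word counts, then the partial results are reduced into one total.

```python
from collections import Counter
from functools import reduce

# Made-up input; in the demo the data would be loaded from S3 into Spark.
lines = ["big data tools", "big data cluster", "spark on emr"]

# Map step: turn every line into its own partial word count.
mapped = [Counter(line.split()) for line in lines]

# Reduce step: merge the partial counts pairwise into a single result.
word_counts = reduce(lambda left, right: left + right, mapped, Counter())

print(word_counts["big"], word_counts["spark"])  # 2 1
```

On a real cluster the map step runs in parallel on the nodes that hold the data, and only the partial results are moved across the network for the reduce step.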

Big data projects

1
Introduction
2
Getting value out of big data
3
5 P's of data science
4
Purpose and People
5
Process and Platforms
6
Programmability
7
Case study 1: Analyze payment risks
8
Case study 2: New product analysis
9
Case study 3: Product recommendation
10
Case study 4: Log file analysis with multiple solutions
11
Things to remember

Conclusion

1
Conclusion and next steps
2
Course end quiz

Please take this quiz to reinforce the learning from the course.

Add-ons

1
Demo and practice activity: Create a bucket on Amazon S3

This was added at the request of one of the students. In the Amazon EMR demo earlier, I used an existing S3 bucket. This demo shows how to create a bucket and upload a file to Amazon S3.

2
Answers to storage exercises

On request, I have added the answers to the storage exercises so that you can verify your solutions.

3
Answers to ingestion exercises
76 Ratings

Detailed Rating

5 stars: 27
4 stars: 26
3 stars: 18
2 stars: 4
1 star: 1
30-Day Money-Back Guarantee

Includes

4 hours on-demand video
9 articles
Full lifetime access
Access on mobile and TV
Certificate of Completion