4.48 out of 5
6181 reviews on Udemy

Spark and Python for Big Data with PySpark

Learn how to use Spark with Python, including Spark Streaming, Machine Learning, Spark 2.0 DataFrames and more!
Instructor:
Jose Portilla
30,270 students enrolled
English [Auto-generated]
Use Python and Spark together to analyze Big Data
Learn how to use the new Spark 2.0 DataFrame Syntax
Work on Consulting Projects that mimic real world situations!
Classify Customer Churn with Logistic Regression
Use Spark with Random Forests for Classification
Learn how to use Spark's Gradient Boosted Trees
Use Spark's MLlib to create Powerful Machine Learning Models
Learn about the Databricks Platform!
Get set up on Amazon Web Services EC2 for Big Data Analysis
Learn how to use AWS Elastic MapReduce Service!
Learn how to leverage the power of Linux with a Spark Environment!
Create a Spam filter using Spark and Natural Language Processing!
Use Spark Streaming to Analyze Tweets in Real Time!

Learn the latest Big Data Technology – Spark! And learn to use it with one of the most popular programming languages, Python!

One of the most valuable technology skills is the ability to analyze huge data sets, and this course is specifically designed to bring you up to speed on one of the best technologies for this task, Apache Spark! The top technology companies like Google, Facebook, Netflix, Airbnb, Amazon, NASA, and more are all using Spark to solve their big data problems!

Spark can perform up to 100x faster than Hadoop MapReduce, which has caused an explosion in demand for this skill! Because the Spark 2.0 DataFrame framework is so new, you now have the ability to quickly become one of the most knowledgeable people in the job market!

This course will teach the basics with a crash course in Python, then move on to using Spark DataFrames with the latest Spark 2.0 syntax! Once we’ve done that we’ll go through how to use the MLlib Machine Learning Library with the DataFrame syntax and Spark. All along the way you’ll have exercises and Mock Consulting Projects that put you right into a real world situation where you need to use your new skills to solve a real problem!

We also cover the latest Spark Technologies, like Spark SQL, Spark Streaming, and advanced models like Gradient Boosted Trees! After you complete this course you will feel comfortable putting Spark and PySpark on your resume! This course also has a full 30-day money-back guarantee and comes with a LinkedIn Certificate of Completion!

If you’re ready to jump into the world of Python, Spark, and Big Data, this is the course for you!

Introduction to Course

1. Introduction
2. Course Overview
3. Frequently Asked Questions
4. What is Spark? Why Python?

Setting up Python with Spark

1. Set-up Overview

Let's explain the set-up for the course!

2. Note on Installation Sections

Local VirtualBox Set-up

1. Local Installation VirtualBox Part 1

Let's walk through the local installation of Ubuntu

2. Local Installation VirtualBox Part 2
3. Setting up PySpark

AWS EC2 PySpark Set-up

1. AWS EC2 Set-up Guide

Let's show you how to use Amazon Web Services' EC2 Instances for Spark!

2. Creating the EC2 Instance
3. SSH with Mac or Linux
4. Installations on EC2

Databricks Setup

1. Databricks Setup

AWS EMR Cluster Setup

1. AWS EMR Setup

Python Crash Course

1. Introduction to Python Crash Course
2. Jupyter Notebook Overview
3. Python Crash Course Part One
4. Python Crash Course Part Two
5. Python Crash Course Part Three
6. Python Crash Course Exercises
7. Python Crash Course Exercise Solutions

Spark DataFrame Basics

1. Introduction to Spark DataFrames
2. Spark DataFrame Basics

Learn the basics of Spark DataFrames!

3. Spark DataFrame Basics Part Two
4. Spark DataFrame Basic Operations

Learn some basic operations with Spark 2.0

5. Groupby and Aggregate Operations
6. Missing Data
7. Dates and Timestamps

Spark DataFrame Project Exercise

1. DataFrame Project Exercise
2. DataFrame Project Exercise Solutions

Introduction to Machine Learning with MLlib

1. Introduction to Machine Learning and ISLR
2. Machine Learning with Spark and Python with MLlib

Linear Regression

1. Linear Regression Theory and Reading
2. Linear Regression Documentation Example
3. Regression Evaluation
4. Linear Regression Example Code Along
5. Linear Regression Consulting Project
6. Linear Regression Consulting Project Solutions

Logistic Regression

1. Logistic Regression Theory and Reading
2. Logistic Regression Example Code Along
3. Logistic Regression Code Along
4. Logistic Regression Consulting Project
5. Logistic Regression Consulting Project Solutions

Decision Trees and Random Forests

1. Tree Methods Theory and Reading
2. Tree Methods Documentation Examples
3. Decision Trees and Random Forest Code Along Examples
4. Random Forest - Classification Consulting Project
5. Random Forest Classification Consulting Project Solutions

K-means Clustering

1. K-means Clustering Theory and Reading
2. K-means Clustering Documentation Example
3. Clustering Example Code Along
4. Clustering Consulting Project
5. Clustering Consulting Project Solutions

Collaborative Filtering for Recommender Systems

1. Introduction to Recommender Systems
2. Recommender System - Code Along Project

Natural Language Processing

1. Introduction to Natural Language Processing
2. NLP Tools Part One
3. NLP Tools Part Two
4. Natural Language Processing Code Along Project

Spark Streaming with Python

1. Introduction to Streaming with Spark!
2. Spark Streaming Documentation Example
3. Spark Streaming Twitter Project - Part One
4. Spark Streaming Twitter Project - Part Two
5. Spark Streaming Twitter Project - Part Three

Bonus

1. Bonus Lecture: Coupons
4.5 out of 5
6181 Ratings

Detailed Rating

5 stars: 3148
4 stars: 2347
3 stars: 563
2 stars: 84
1 star: 44
30-Day Money-Back Guarantee

Includes

11 hours on-demand video
3 articles
Full lifetime access
Access on mobile and TV
Certificate of Completion