1 out of 5
2 reviews on Udemy

Machine Learning with Apache Spark 2: 2-in-1

Learn to implement and evaluate machine learning solutions with Apache Spark 2
Packt Publishing
20 students enrolled
English [Auto-generated]
Perform advanced text processing and build classification models
Use Natural Language Processing (NLP) techniques to create a program that learns the structure of the posts in a forum
Stream applications to provide real-time insights and predictions
Implement Word2Vec in Apache Spark
Delve into graph processing using the GraphX library
Learn the best practices involved in building, evaluating, tuning, and deploying Spark pipelines

Apache Spark lets you apply machine learning techniques to data in real time, giving users immediate insights based on what’s happening right now. It’s used to create distributed machine learning models and programs that, on large datasets, can run much faster than single-machine toolkits in R or Python. If you’re a data professional who is familiar with machine learning and wants to use Apache Spark for developing efficient and fast machine learning systems, then this learning path is for you.

This comprehensive 2-in-1 course teaches you to build machine learning systems, perform analytics, and make predictions with Apache Spark. You’ll learn through practical demonstrations of use cases, clear explanations, and interesting real-world applications. Each section briefly establishes the theoretical basis for the topic under discussion and then cements your understanding with practical use cases.

This training program includes 2 complete courses, carefully chosen to give you the most comprehensive training possible.

The first course, Spark for Machine Learning, starts by explaining how to use Spark MLlib. You will then learn supervised and unsupervised machine learning algorithms. You will also learn to build classification models, extracting suitable features from text with Word2Vec. Next, you will build a logistic regression model with Spark. You will learn to find clusters and correlations in your data using K-Means clustering. Moving ahead, you will learn how to validate models using cross-validation and the area under the ROC curve. You will then build an effective recommendation model using a distributed Spark algorithm. Finally, you will get a brief look at graph processing using the GraphX library.

The second course, Advanced Machine Learning with Spark 2.x, starts with an introduction to the key concepts and data types that are fundamental to understanding distributed data processing and machine learning with Spark. You will then be provided with practical recipes that demonstrate some of the most popular algorithms in Spark, leading to the creation of sophisticated machine learning pipelines and applications. You will then move on to more advanced use cases for machine learning such as streaming, NLP, and deep learning.

By the end of the course, you’ll be able to focus on leveraging Apache Spark to create fast and efficient machine learning systems.

Meet Your Expert(s):

We have the best work of the following esteemed author(s) to ensure that your learning journey is smooth:

  • Tomasz Lelek is a Software Engineer who programs mostly in Java and Scala. He is a fan of microservice architectures and functional programming. He dedicates considerable time and effort to being better every day. Recently, he’s been delving into big data technologies such as Apache Spark and Hadoop. He is passionate about nearly everything associated with software development. He thinks that we should always try to consider different solutions and approaches to solving a problem. Recently, he was a speaker at several conferences in Poland, including Confitura and JDD (Java Developer’s Day), as well as at the Krakow Scala User Group.

Spark for Machine Learning

The Course Overview

This video provides an overview of the entire course.

Analyzing Text Input Data

In this video, you will learn how to analyze text input data.

  • Analyze input data
  • Prepare input data to make it ready for ML models
  • Learn about tokenization and removing stop words
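As a rough illustration of the tokenization and stop-word steps listed above, here is a plain-Python sketch. It is not the code used in the course: the stop-word list and the sample post are made up for the example, and in Spark ML the same steps would typically be handled by the `Tokenizer` and `StopWordsRemover` feature transformers.

```python
import re

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"a", "an", "the", "is", "to", "and", "of", "in"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Keep only the tokens that are not in the stop-word set."""
    return [token for token in tokens if token not in stop_words]


# Made-up forum post used as sample input.
post = "The cluster is slow to start in the morning"
tokens = remove_stop_words(tokenize(post))
print(tokens)  # ['cluster', 'slow', 'start', 'morning']
```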
Extracting Features from Data

In this video, you will see how to extract features from text data using the bag-of-words and skip-gram algorithms.

  • See how to represent text as a vector
  • Transform text into a vector of numbers
  • Learn about the bag-of-words and skip-gram algorithms
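To make the bag-of-words idea concrete, the following plain-Python sketch turns tokenized documents into count vectors over a shared vocabulary. The function names and sample documents are hypothetical and chosen only for illustration; this is not the Spark implementation.

```python
from collections import Counter


def build_vocabulary(documents):
    """Map each distinct token to a fixed column index, in sorted order."""
    vocab = sorted({token for doc in documents for token in doc})
    return {token: index for index, token in enumerate(vocab)}


def bag_of_words(tokens, vocabulary):
    """Return a count vector with one slot per vocabulary entry."""
    counts = Counter(tokens)
    # Dicts preserve insertion order, so slots follow the sorted vocabulary.
    return [counts.get(token, 0) for token in vocabulary]


# Made-up, already tokenized documents.
docs = [
    ["spark", "is", "fast"],
    ["spark", "streams", "data", "fast", "fast"],
]
vocabulary = build_vocabulary(docs)
vectors = [bag_of_words(doc, vocabulary) for doc in docs]
print(list(vocabulary))  # ['data', 'fast', 'is', 'spark', 'streams']
print(vectors)           # [[0, 1, 1, 1, 0], [1, 2, 0, 1, 1]]
```

Word order is discarded here, which is exactly the limitation that skip-gram style embeddings try to address.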
Implementing Word2Vec Using Apache Spark

In this video, you will be using the Word2Vec algorithm.

  • Get to know the Word2Vec API
  • Understand its parameters
  • Use Word2Vec
Logistic Regression Explanation

In this video, you will learn about logistic regression.

  • Know what supervised and unsupervised learning are
  • Know what logistic regression is
  • See an example of logistic regression
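For readers who want a feel for what a logistic regression model computes, here is a minimal plain-Python sketch of the prediction step only. The weights, bias, and feature values are made up; training the weights is left out, and this is not the Spark implementation.

```python
import math


def sigmoid(z):
    """Squash any real number into the (0, 1) interval."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(features, weights, bias):
    """Probability of the positive class for one example."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def predict_label(features, weights, bias, threshold=0.5):
    """Turn the probability into a 0/1 label using a decision threshold."""
    return 1 if predict_probability(features, weights, bias) >= threshold else 0


# Hypothetical model parameters and inputs.
weights = [0.5, -0.25]
bias = 0.0
print(predict_probability([1.0, 2.0], weights, bias))  # 0.5, because z = 0
print(predict_label([4.0, 0.0], weights, bias))        # 1, because z = 2 > 0
```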
Writing Logistic Regression Model per Author

In this video, you will implement the logistic regression model.

  • Get a reminder of what we want to achieve
  • Implement the logistic regression model
  • Tweak the model
Validate Models Using Cross-Validation

In this video, you will learn how to validate your logistic regression model using cross-validation.

  • Learn what cross-validation is
  • Split training and test data in a proper way
  • Learn how to implement cross-validation
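The splitting logic behind k-fold cross-validation can be sketched in a few lines of plain Python. This is an assumption-laden illustration (no shuffling, index-based folds, hypothetical function name), not the mechanism Spark uses internally.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k folds.

    Folds are contiguous and unshuffled; the first n_samples % k folds
    receive one extra sample so that every sample is tested exactly once.
    """
    if k < 2 or k > n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


folds = list(k_fold_indices(5, 2))
for train, test in folds:
    print("train:", train, "test:", test)
```

Each model is fitted on the training indices and scored on the held-out test indices, and the k scores are then averaged.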
Analyzing Time of Post Using Clustering - GMM Explanation

In this video, you will be analyzing the time of post using clustering.

  • Learn about the Gaussian mixture model
  • Learn how to cluster data using post timestamp
  • Learn how to use GMM in a proper way
Implementing GMM in Apache Spark

In this video, you will learn how to implement GMM in Apache Spark.

  • Prepare the input data for clustering
  • Use GMM to cluster posts
  • Learn how to implement logic in Apache Spark
Measuring Accuracy Using Area Under ROC

In this video, you will learn how to measure accuracy using area under ROC.

  • Look at the area under ROC measure
  • Evaluate and interpret ROC
  • Evaluate logistic regression with GMM model
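One way to build intuition for the area under the ROC curve is its pairwise interpretation: the probability that a randomly chosen positive example is scored higher than a randomly chosen negative one. The plain-Python sketch below computes it that way on made-up labels and scores; it is quadratic in the number of examples and meant only as an illustration.

```python
def area_under_roc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Made-up labels and model scores.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.1]
print(area_under_roc(labels, scores))  # 0.75
```

A value of 1.0 means every positive is ranked above every negative, while 0.5 is what random scoring would give on average.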
Dimensionality Reduction

In this video, you’ll be learning about dimensionality reduction using singular value decomposition (SVD).

  • Learn what SVD is
  • Know how we can use it
  • Implement it in Spark using Spark MLlib
Building Recommendation Engine

In this video, we will be building a recommendation engine using collaborative filtering.

  • Look at the movie data source that will be used to train the model
  • Build collaborative filtering in Spark
  • Use the alternating least squares (ALS) algorithm
Using Recommendation Engine to Get Top Recommendations

In this video, we will be using a recommendation engine to get top recommendations.

  • Validate collaborative filtering (CF) using cross-validation
  • Use the root-mean-square error (RMSE) measure
  • Get recommended movies for a user
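The root-mean-square error used to judge a recommender compares predicted ratings with the ratings users actually gave. Here is a short plain-Python sketch with made-up ratings (the function name is hypothetical):

```python
import math


def rmse(actual, predicted):
    """Root-mean-square error between two equally long rating lists."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up true and predicted movie ratings.
actual_ratings = [4.0, 3.0, 5.0]
predicted_ratings = [3.0, 3.0, 7.0]
print(round(rmse(actual_ratings, predicted_ratings), 3))  # 1.291
```

Lower is better, and because errors are squared before averaging, a single badly wrong prediction is penalized more than several small ones.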
What is a Graph?

In this video, we will look into graphs.

  • Know what a graph is
  • Know where it can be used
  • Learn how to present it
GraphX API

In this video, we will delve into the GraphX API.

  • Get familiar with GraphX API
  • Learn how to create a graph and edges
  • Learn how to create vertices
Structural Operations on Graph

In this video, we will learn about structural operations on a graph.

  • Look into subgraphs
  • Look into connected components
Neighborhood Aggregation

In this video, we will learn about neighborhood aggregation.

  • Send messages along edges
  • Perform computation on a graph
  • Calculate an average through aggregation
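The "send a message along each edge, then aggregate at the receiving vertex" pattern can be sketched without any graph library. The plain-Python example below averages the values of each vertex's incoming neighbors; the vertex values, edges, and function name are all made up, and this is not GraphX code.

```python
from collections import defaultdict


def average_in_neighbor_value(vertex_values, edges):
    """For each destination vertex, average the values of its source neighbors."""
    totals = defaultdict(lambda: [0.0, 0])  # destination -> [sum, count]
    for src, dst in edges:
        # "Send" the source vertex value along the edge to the destination.
        totals[dst][0] += vertex_values[src]
        totals[dst][1] += 1
    # Merge step: turn each (sum, count) pair into an average.
    return {vertex: total / count for vertex, (total, count) in totals.items()}


# Hypothetical graph: vertex id -> value, and directed (source, destination) edges.
values = {1: 30, 2: 40, 3: 20}
edges = [(1, 3), (2, 3), (3, 1)]
averages = average_in_neighbor_value(values, edges)
print(averages)  # {3: 35.0, 1: 20.0}
```

Vertices with no incoming edges (vertex 2 here) receive no messages and therefore do not appear in the result.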

Advanced Machine Learning with Spark 2.x

The Course Overview

This video provides an overview of the entire course.

Spark Data Structures — RDD, DataFrames, and Datasets

In this video, we will learn about Spark architecture.

  • Learn about RDD
  • Learn about DataFrames
  • Learn about Datasets
Dense and Sparse Vectors

This video will tell you why we need vectors in machine learning.

  • Create and test dense vectors
  • Create and test sparse vectors
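The difference between dense and sparse vectors comes down to what gets stored. The plain-Python sketch below converts between a full list of values and a (size, indices, values) triple, a layout that mirrors the arguments accepted by Spark's `Vectors.sparse`; the helper names are hypothetical.

```python
def to_sparse(dense):
    """Return (size, indices, values), keeping only the non-zero entries."""
    indices = [i for i, value in enumerate(dense) if value != 0]
    values = [dense[i] for i in indices]
    return len(dense), indices, values


def to_dense(size, indices, values):
    """Rebuild the full vector, filling unspecified positions with 0.0."""
    dense = [0.0] * size
    for i, value in zip(indices, values):
        dense[i] = value
    return dense


dense_vector = [1.0, 0.0, 0.0, 3.0]
sparse_vector = to_sparse(dense_vector)
print(sparse_vector)             # (4, [0, 3], [1.0, 3.0])
print(to_dense(*sparse_vector))  # [1.0, 0.0, 0.0, 3.0]
```

The sparse form pays off when most entries are zero, as is common for text features with a large vocabulary.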
Labeled Points, Matrix, and Other Data Types

This video will tell you what a labeled point is.

  • Learn what a matrix is
  • Construct a labeled point in Spark
  • Construct a matrix in Spark
Key Concepts, Machine Learning Pipelines, and Operations

In this video, we will create a simple Spark ML pipeline.

  • Create a Spark ML stage
  • Compose multiple stages into one ML pipeline
Feature Engineering

This video will teach you what feature engineering is.

  • Learn how to extract features from data
  • Get to know more about feature engineering
Supervised Learning – Classification, Regression

Understand supervised and unsupervised ML.

  • Discuss what logistic regression is
  • Learn logistic regression with a simple example
  • Have a look at an example use case of Logistic Regression
Unsupervised Learning

This video will teach us what clustering is.

  • Learn about the Gaussian mixture model
  • Discuss clustering data using post timestamp
  • Learn how to use GMM in a proper way
Recommendation Engines

In this video, we will look at the movie data source that will be used to train the model.

  • Build collaborative filtering in Apache Spark
  • Use the ALS algorithm
  • Recommend movies for a given user
Deep Dive into Regression Models

In this video, we will create a machine learning program that uses logistic regression.

  • Use logistic regression for prediction
  • Compose stages into pipelines
Deep Dive into Decision Tree Models

In this video, we will learn what a decision tree is.

  • Construct a decision tree
  • Use a decision tree to make decisions
Evaluating and Tuning Our Model

In this video, we will split a data set into a training and a test set.

  • Validate the model
  • Discover the Cross-Validation technique
Saving and Deploying Our Model

This video will teach us how to save the trained model to reuse later.

  • Save the pipeline skeleton
  • Load the trained model and reuse it
Overview of Spark Streaming

This video will talk about the Spark Streaming architecture.

  • Get to know about micro-batches
  • Learn the difference between latency and throughput
  • Understand failure recovery and checkpointing
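The micro-batch idea is that incoming events are grouped by fixed time intervals and each group is processed as a small batch. The plain-Python sketch below shows just the grouping step on made-up timestamped events; it is a simplification, not Spark Streaming code, and unlike a real streaming engine it skips intervals that contain no events.

```python
def micro_batches(events, batch_interval):
    """Group (timestamp, value) events into consecutive fixed-length batches.

    Batch i holds events with i * batch_interval <= timestamp < (i + 1) * batch_interval.
    Intervals without any events are omitted from the result.
    """
    batches = {}
    for timestamp, value in events:
        batch_id = int(timestamp // batch_interval)
        batches.setdefault(batch_id, []).append(value)
    return [batches[batch_id] for batch_id in sorted(batches)]


# Made-up events: (arrival time in seconds, payload).
events = [(0.2, "a"), (0.9, "b"), (1.1, "c"), (2.5, "d")]
print(micro_batches(events, batch_interval=1.0))  # [['a', 'b'], ['c'], ['d']]
```

A longer interval generally raises throughput per batch but also raises latency, since an event may wait up to one full interval before it is processed.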
Your Own Streaming Application with Kafka

In this video, we will learn about creating a DStream provider.

  • Load data from Kafka
  • Create the base for a Spark Streaming job
Your First Streaming Application

In this video, we will learn about sending real-time notifications when a user wants to buy a product.

  • Create abandoned cart logic
  • Implement Streaming logic
Analyzing Sensor Data in a Streaming Way

This video will talk about inserting a stream of sensor data into HBase.

  • Fetch data from HBase to Apache Spark
  • Calculate statistics
  • Store statistics about sensors in HBase
Natural Language Processing Overview

Understand what Natural Language Processing is.

  • Get to know why ML is useful in such solutions
  • Natural Language Processing overview
Feature Generation from Text — CountVectorizer, TFIDF, and LDA

In this video, we will learn about Count Vectorization.

  • Get to know about TF-IDF
  • Get to know about LDA
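TF-IDF weights a term by how often it occurs in a document and down-weights terms that occur in many documents. The plain-Python sketch below uses raw counts for term frequency and the smoothed inverse document frequency log((N + 1) / (df + 1)); the sample documents and function name are made up for illustration.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {token: weight} dict per tokenized document."""
    n_docs = len(documents)
    # Document frequency: the number of documents containing each token.
    doc_freq = Counter(token for doc in documents for token in set(doc))
    idf = {
        token: math.log((n_docs + 1) / (count + 1))
        for token, count in doc_freq.items()
    }
    return [
        {token: tf * idf[token] for token, tf in Counter(doc).items()}
        for doc in documents
    ]


# Made-up, already tokenized documents.
docs = [["spark", "ml"], ["spark", "streaming"]]
weights = tf_idf(docs)
print(weights[0])
```

Here "spark" appears in every document, so its weight is zero, while "ml" and "streaming" each receive a positive weight.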
Feature Generation from Text — Word Embeddings

This video will tell you how to represent text as a vector.

  • Transform text into a vector of numbers
  • Use Word2Vec for the transformation
NLP Document Classification Application

In this video, we will learn about the need for NLP.

  • Get to know about the process of NLP
  • Look at the various applications of NLP
The Spark Versus Deep Learning Use Case

This video will give us an introduction to Deep Learning.

  • Learn when and how to use deep learning
  • Get to know about some use cases
Spark for Parallelizing Deep Learning Evaluation

Add Deeplearning4j to Apache Spark.

  • Configure MultiLayer
  • Set up neural-network parameters
Deep Learning as a Feature Generator for Existing Spark ML Algorithms

Use the MNIST database for deep learning use cases.

  • Create a Spark job that starts the deep learning process
  • Deep Learning as a feature generator for existing Spark ML algorithms
Spark/Deep Learning Made Easy

Start the neural network Spark program.

  • Summary
  • Learn Spark/Deep Learning made easy
30-Day Money-Back Guarantee


4 hours on-demand video
Full lifetime access
Access on mobile and TV
Certificate of Completion