2019 AWS SageMaker and Machine Learning – With Python
*** UPDATE MAY-2019. 1. Model endpoint integration with hands-on labs for Direct Client, Microservice, and API Gateway. 2. Hyperparameter Tuning: learn how to tune hyperparameters automatically ***
*** UPDATE MARCH-12-2019. I learned that new AWS accounts can no longer use the AWS Machine Learning service. AWS is asking new users to use the SageMaker service instead.
I have restructured the course to start with the SageMaker lectures. The Machine Learning service lectures are still available in the later parts of the course. Newly updated sections start with a 2019 prefix.
All source code for the SageMaker course is now available on GitHub.
The new housekeeping lectures cover all the steps for setting up the code from GitHub. ***
*** SageMaker Lectures: DeepAR for time series forecasting; XGBoost, a gradient boosted tree algorithm covered in depth with hands-on labs (XGBoost has won several competitions and is a very popular regression and classification algorithm); Factorization Machine based recommender systems; and PCA for dimensionality reduction ***
There are several courses on Machine Learning and AI. What is special about this course?
Here are the top reasons:
Cloud based machine learning keeps you focused on the current best practices.
In this course, you will learn the most useful algorithms. Don’t waste your time sifting through the mountains of techniques that are out in the wild.
A cloud-based service is very easy to integrate with your application and supports a wide variety of programming languages.
Whether you have small data or big data, the elastic nature of the AWS cloud lets you handle both.
There is also no upfront cost or commitment: pay only for what you need and use.
In this course, you will learn AI and Machine Learning in three different ways:
AWS Machine Learning
AWS Machine Learning Service is designed for complete beginners.
You will learn three popular, easy-to-understand linear algorithms from the ground up.
You will gain hands-on knowledge of the complete lifecycle: model development, measuring quality, tuning, and integration with your application.
The next service is AWS SageMaker.
If you are comfortable coding in Python, the SageMaker service is for you.
You will learn how to deploy your own Jupyter Notebook instance on the AWS Cloud.
You will gain hands-on model development experience with very powerful and popular machine learning algorithms, such as:
XGBoost – a gradient boosted tree algorithm that has won several competitions,
Recurrent Neural Networks for time series forecasting,
Factorization Machines for high dimensional sparse datasets like clickstream data,
Neural Network based image classifiers,
Dimensionality reduction with Principal Component Analysis,
and much more.
In the Application Services section of this course:
You will learn about a set of pre-trained services that you can directly integrate with your application.
You will gain hands-on experience with the ready-to-use Vision service for image and video analysis, Conversation chatbots, and Language Services for text translation, speech recognition, text to speech, and more.
I am looking forward to seeing you in the course.
Introduction and Housekeeping
Introduction to AWS Machine Learning Course, Topics Covered, Course Structure
2019 SageMaker Housekeeping
The following downloadable resources are available in this lecture:
1. Source Code and Data Setup Document
2. Introduction to Machine Learning and Concepts Document
2019 Machine Learning Concepts
2019 SageMaker Service Overview
XGBoost - Gradient Boosted Trees
"XGBoost (eXtreme Gradient Boosting) is a popular and efficient open-source implementation of the gradient boosted trees algorithm. Gradient boosting is a supervised learning algorithm that attempts to accurately predict a target variable by combining the estimates of a set of simpler, weaker models. XGBoost has done remarkably well in machine learning competitions because it robustly handles a variety of data types, relationships, and distributions, and the large number of hyperparameters that can be tweaked and tuned for improved fits. This flexibility makes XGBoost a solid choice for problems in regression, classification (binary and multiclass), and ranking"
For source code setup from GitHub, please refer to:
2019 Demo - Source Code and Data Setup, in the SageMaker Housekeeping section
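To make the idea of "combining the estimates of a set of simpler, weaker models" concrete, here is a minimal sketch of gradient boosting for regression in plain Python. It is not XGBoost and not the course's code: the weak learner is a hypothetical one-split decision stump, the loss is squared error, and the names fit_stump, fit_boosted, and learning_rate are chosen only for illustration.

```python
# Minimal gradient boosting sketch (squared error, one feature).
# Assumption: each weak model is a one-split "stump" fit to the residuals.
# This illustrates the idea only; XGBoost adds regularization, deeper trees,
# second-order gradients, and much more.

def fit_stump(xs, residuals):
    """Find the split threshold that best predicts the residuals."""
    best = None
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        if not left or not right:
            continue  # a split must put points on both sides
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:  # all x values identical: fall back to a constant
        mean = sum(residuals) / len(residuals)
        return (xs[0], mean, mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def fit_boosted(xs, ys, rounds=50, learning_rate=0.3):
    """Start from the mean, then repeatedly fit a stump to what is left over."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * predict_stump(stump, x)
                       for p, x in zip(predictions, xs)]
    return base, stumps, learning_rate


def predict(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


def mean_squared_error(model, xs, ys):
    return sum((predict(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(ys)


# Toy data: y = x^2, which a single stump cannot fit well.
xs = [x / 10 for x in range(40)]
ys = [x * x for x in xs]
weak = fit_boosted(xs, ys, rounds=1)
strong = fit_boosted(xs, ys, rounds=50)
print("MSE after 1 round:  ", mean_squared_error(weak, xs, ys))
print("MSE after 50 rounds:", mean_squared_error(strong, xs, ys))
```

The training error drops as more stumps are added, which is the core of boosting. The number of rounds and the learning rate are examples of the hyperparameters that the description above says can be tuned.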
SageMaker - Principal Component Analysis (PCA)
"PCA is an unsupervised machine learning algorithm that attempts to reduce the dimensionality (number of features) within a dataset while still retaining as much information as possible. This is done by finding a new set of features called components, which are composites of the original features that are uncorrelated with one another. They are also constrained so that the first component accounts for the largest possible variability in the data, the second component the second most variability, and so on."
SageMaker - Factorization Machines
"A factorization machine is a general-purpose supervised learning algorithm that you can use for both classification and regression tasks. It is an extension of a linear model that is designed to capture interactions between features within high dimensional sparse datasets economically. For example, in a click prediction system, the factorization machine model can capture click rate patterns observed when ads from a certain ad-category are placed on pages from a certain page-category. Factorization machines are a good choice for tasks dealing with high dimensional sparse datasets, such as click prediction and item recommendation."
SageMaker - DeepAR Time Series Forecasting
"The Amazon SageMaker DeepAR forecasting algorithm is a supervised learning algorithm for forecasting scalar (one-dimensional) time series using recurrent neural networks (RNN)"
2019 Integration Options - Model Endpoint
2019 SageMaker Hyperparameter Tuning
"Amazon SageMaker automatic model tuning, also known as hyperparameter tuning, finds the best version of a model by running many training jobs on your dataset using the algorithm and ranges of hyperparameters that you specify. It then chooses the hyperparameter values that result in a model that performs the best, as measured by a metric that you choose."
AWS Machine Learning Service
- Set up the Anaconda Python development environment
- Install the Boto3 module needed for AWS
1. Set up the course folder on your local machine
2. Download Project Source Code
3. Download Data files
Introduction to Python Development Environment, Pandas, NumPy, Matplotlib
- Set up a Simple Storage Service (S3) bucket and security policies that allow the machine learning service to access it
- S3 is the storage location where the training, evaluation, and test files will be kept
Summary of Introduction, Development Environment Setup and AWS Configuration