Hands-On Amazon Redshift for Data Warehousing

Leverage cutting-edge techniques to build data warehouses in the cloud
Packt Publishing
Understand data warehousing principles and how Redshift is challenging the traditional way of thinking
See how Redshift integrates with the AWS Cloud ecosystem
Learn how Redshift leverages the latest technology to provide up to 10x the performance of competing technologies
Create a cloud native, fully managed data warehouse and use it to join together disparate data sets
Connect your new data warehouse with disjointed data stored on Amazon S3 with Redshift Spectrum
Visualize your newly connected data sets with Amazon QuickSight
Dive headfirst into building a Redshift data warehouse using a diversified data set
Connect to and optimize your data warehouse and join data sets together

Amazon Redshift is a low-cost cloud data platform that can scale from gigabytes to petabytes on a high-performance, column-oriented SQL engine. Amazon Redshift brings the power of scale-out architecture to the world of traditional data warehousing.

In this course, you will explore this low-cost, cloud-based storage, which can be scaled up or down to meet your true size and performance needs. You will learn to configure a sample data warehouse. Next, you will explore Redshift’s internal workings and architecture, and learn what makes it so fast. You will get hands-on experience connecting, querying, and building BI and data viz products and learn how to secure, maintain, and administer your new platform.

By the end of this course, you will be able to scale from gigabytes to petabytes on this high-performance, column-oriented SQL engine.

About The Author

Colibri Digital is a technology consultancy company founded in 2015 by James and Ingrid Cross. The company works to help its clients navigate the rapidly changing and complex world of emerging technologies, with deep expertise in areas like big data, data science, machine learning, and cloud computing.

Over the past few years they have worked with some of the world’s largest and most prestigious companies, including tier 1 investment banks, a leading management consultancy group, and one of the world’s most popular soft drinks companies, helping each of them to better make sense of its data, and process it in more intelligent ways.

At the frontier of AI, big data and cloud computing, we are Colibri Digital.

James Cross is a big data engineer and certified AWS solutions architect with a passion for data-driven applications. He’s spent the last 3-5 years helping his clients to design and implement huge-scale streaming big data platforms, cloud-based analytics stacks, and serverless architectures.

He started his professional career in investment banking, working with well-established technologies such as Java and SQL Server, before moving into the big data space. Since then, he’s worked with a huge range of big data tools including most of the Hadoop ecosystem, Spark, and many NoSQL technologies such as Cassandra, MongoDB, Redis, and DynamoDB. His recent focus has been on cloud technologies and how they can be applied to data analytics, culminating in his work at Scout Solutions as CTO, and later with McKinsey.

James is an AWS-certified solutions architect with several years’ experience designing and implementing solutions on this cloud platform. As the CTO of Scout Solutions Ltd, he built a fully serverless set of APIs and an analytics stack based around Lambda and Redshift.

Data Warehousing for the Internet Age

The Course Overview

This video gives a glimpse of the entire course.

Do We Still Need a Data Warehouse?

Understanding the use cases for data warehousing in a modern data landscape.

   •  Understand the data landscape today

   •  Understand the case for BI

   •  Look at the BI use cases

Data Technologies Compared: Relational, Data Warehouse, NoSQL, and Big Data

Understanding the modern data landscape and the technologies that make it up.

   •  Understand NoSQL

   •  Understand big data

   •  Understand RDBMS

Providing Business Intelligence on Internet Scale Data

Understanding the BI use case in detail and how to solve that problem on large datasets.

   •  Have a look at the BI use case

   •  Scale up the BI tools

Cloud-Native Data Warehousing

Introducing cloud native BI data warehouse tools like Redshift.

   •  Go through the Cloud Native Data Tools

   •  Introduction to Redshift

   •  Cloud native data warehousing

Getting Started with Redshift

Launching a Redshift Data Warehouse on AWS

Introduction to the AWS console and how to use it to launch a Redshift cluster.

   •  Understand the AWS Console

   •  Launch a Redshift cluster

   •  Connect to Redshift

Launching a Redshift Data Warehouse Using CloudFormation

Introduction to CloudFormation, and how to use it to launch a Redshift cluster.

   •  Get introduced to CloudFormation

   •  Launch a cluster with CloudFormation

   •  Terminate a cluster with CloudFormation

Redshift Technology Deep Dive: Columnar File System

Understanding why technologies like columnar file systems enable data warehouses to scale out.

   •  Differentiate between columnar and tabular storage

   •  Understand why Columnar can be used to accelerate BI queries
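
As a rough illustration of the columnar idea, the sketch below stores the same records row by row and column by column, then computes one aggregate over each. It is a toy model in plain Python, not Redshift's actual storage format, and the sample records and function names are made up for the example. The point it shows is that an aggregate over one column only needs that column's values in the columnar layout.

```python
# Toy comparison of row-oriented and column-oriented layouts.
# Hypothetical sample data; this is not how Redshift stores blocks on disk.
rows = [
    {"title": "A", "year": 1999, "rating": 7.1},
    {"title": "B", "year": 2005, "rating": 6.4},
    {"title": "C", "year": 2005, "rating": 8.0},
]


def to_columns(records):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


def average_rating_row_store(records):
    """Row layout: every whole record is visited to reach one field."""
    return sum(record["rating"] for record in records) / len(records)


def average_rating_column_store(columns):
    """Column layout: only the 'rating' column is read."""
    values = columns["rating"]
    return sum(values) / len(values)


columns = to_columns(rows)
print(average_rating_row_store(rows))
print(average_rating_column_store(columns))
```

Both functions return the same number; the difference is how much unrelated data has to be touched to get there, which matters once a table has many wide columns.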

Redshift Technology Deep Dive: Massively Parallel Processing

Understanding why technologies like MPP enable data warehouses to scale out.

   •  Get introduced to the MPP concept

   •  Scale out queries using MPP

Creating a Redshift Data Warehouse from Disparate Data Sets

Sourcing Appropriate Data Sets

What we need in a source data set and what we're trying to achieve with it.

   •  Learn what we need in a data set

   •  Learn what we will do with the data set

   •  Download the IMDB data set

Ingesting Various Sizes of Data Set into Redshift

Understanding how to load data of various sizes into our new DWH cluster.

   •  Upload data to S3

   •  Copy data to Redshift

   •  Query data

Connecting to and Querying the Data Warehouse

Understanding how to connect to Redshift and run basic join queries against it.

   •  Connect to Redshift

   •  Execute queries

Redshift Technology Deep Dive: Query Caching

Understanding why technologies like query caching enable data warehouses to scale out.

   •  Get introduced to the query caching concept

   •  Query caching at scale with Redshift

Optimizing Redshift for Scale

Ingesting Enormous Volumes of Data by Copying Directly from S3

Learn how to use manifest files to copy vast amounts of data efficiently into Redshift.

   •  Learn how Redshift runs a COPY command

   •  Understand manifest files

   •  Leverage manifests to copy data
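
To make the manifest idea concrete, here is a minimal sketch that builds a manifest document and the matching COPY statement as text. The bucket, file names, table name, and IAM role ARN are placeholders invented for the example, and the helper functions are hypothetical. The sketch only generates the JSON and SQL; uploading the manifest to S3 and running the statement against a cluster are left out, and any data format options (delimiter, compression, and so on) would depend on the files being loaded.

```python
import json


def build_manifest(urls, mandatory=True):
    """Return a manifest dict listing each S3 object to load.

    'mandatory' set to True asks COPY to fail if a listed file is missing.
    """
    return {"entries": [{"url": url, "mandatory": mandatory} for url in urls]}


def build_copy_statement(table, manifest_url, iam_role_arn):
    """Return a COPY statement that reads its file list from a manifest."""
    return (
        f"COPY {table}\n"
        f"FROM '{manifest_url}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "MANIFEST;"
    )


# Placeholder values, assumed for illustration only.
manifest = build_manifest(
    [
        "s3://example-bucket/imdb/part-0000.tsv",
        "s3://example-bucket/imdb/part-0001.tsv",
    ]
)
statement = build_copy_statement(
    table="titles",
    manifest_url="s3://example-bucket/imdb/load.manifest",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleRedshiftRole",
)

print(json.dumps(manifest, indent=2))
print(statement)
```

Listing the files explicitly means the load does not depend on a key prefix accidentally matching extra objects, and splitting the data into several files gives the cluster something to load in parallel.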

Optimizing Redshift Data Types for Query Performance at Scale

Understand the different data types Redshift can use and how they impact query performance.

   •  Explore the compression types

   •  Learn what to use and when

Evenly Distributing Data Across Your Cluster to Improve Filters and Joins

Learn how to make the most of the MPP concept by avoiding data skew.

   •  Learn about MPP and data skew

   •  Ensure even distribution with distribution keys
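
The effect of a distribution key on data skew can be sketched with a small simulation. The code below assigns rows to slices by hashing a key value, then compares a high-cardinality key with a low-cardinality one. It uses MD5 purely as a stand-in hash, so it does not reproduce Redshift's own hashing, and the slice count, column names, and row counts are assumptions chosen for the example.

```python
import hashlib


def slice_for(value, slice_count):
    """Map a key value to a slice number with a stand-in hash (MD5)."""
    digest = hashlib.md5(str(value).encode("utf-8")).hexdigest()
    return int(digest, 16) % slice_count


def rows_per_slice(key_values, slice_count):
    """Count how many rows land on each slice for the given key column."""
    counts = [0] * slice_count
    for value in key_values:
        counts[slice_for(value, slice_count)] += 1
    return counts


def skew_ratio(counts):
    """Largest slice divided by the average slice; 1.0 means perfectly even."""
    average = sum(counts) / len(counts)
    return max(counts) / average


# Hypothetical key columns: a unique id versus a country code dominated by one value.
title_ids = range(10_000)
countries = ["US"] * 9_000 + ["GB"] * 500 + ["FR"] * 500

even = rows_per_slice(title_ids, 4)
skewed = rows_per_slice(countries, 4)

print("unique id key:", even, round(skew_ratio(even), 2))
print("country key:  ", skewed, round(skew_ratio(skewed), 2))
```

With the unique id as the key, each slice holds roughly a quarter of the rows. With the country code, one slice holds at least 90 percent of them, so that slice does most of the work while the others sit idle.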

Connecting Redshift with Disconnected Data Using Redshift Spectrum

Exploratory Analytics for Disconnected Data

The use case for tools like Spectrum.

   •  Understand the issues with BI tools

   •  Use cases for Spectrum

Loading a Disconnected Data Set

How to load a disconnected data set into Redshift.

   •  Load data to S3

Glue Data Catalog - Creating a Schema for the External Data Set

How to create an external database and schema for data sets on S3.

   •  Create an external DB

   •  Create an external schema and table
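
The two steps above might look roughly like the statements generated below. This is a sketch that only builds SQL text; the schema, database, table, column names, S3 location, and IAM role ARN are all placeholders invented for the example, and the tab-delimited text format is an assumption about how the files are stored.

```python
def create_external_schema_sql(schema, catalog_db, iam_role_arn):
    """SQL that maps a Redshift schema onto a data catalog database."""
    return (
        f"CREATE EXTERNAL SCHEMA {schema}\n"
        "FROM DATA CATALOG\n"
        f"DATABASE '{catalog_db}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "CREATE EXTERNAL DATABASE IF NOT EXISTS;"
    )


def create_external_table_sql(schema, table, columns, s3_location):
    """SQL that describes tab-delimited text files on S3 as an external table."""
    column_sql = ",\n".join(f"    {name} {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE EXTERNAL TABLE {schema}.{table} (\n{column_sql}\n)\n"
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t'\n"
        "STORED AS TEXTFILE\n"
        f"LOCATION '{s3_location}';"
    )


# Placeholder values, assumed for illustration only.
schema_sql = create_external_schema_sql(
    schema="spectrum",
    catalog_db="imdb_external",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleSpectrumRole",
)
table_sql = create_external_table_sql(
    schema="spectrum",
    table="ratings",
    columns=[("title_id", "VARCHAR(16)"), ("rating", "REAL"), ("votes", "INTEGER")],
    s3_location="s3://example-bucket/ratings/",
)

print(schema_sql)
print(table_sql)
```

Once both statements have run, the external table can be queried and joined like any other table, while the data itself stays on S3.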

Visualizing Your Results with Amazon QuickSight

The BI Use Case for Data Warehousing

Reviewing the use case for BI.

   •  Understand Business Intelligence

   •  Get intelligence from data

Introducing Amazon QuickSight

What QuickSight is and where it fits in.

   •  Explore the visualization tool options

   •  Get introduced to QuickSight

What Is SPICE and How Can It Be Used to Accelerate Analysis?

The typical problem with BI tools and how to solve it.

   •  Understand the problem with slow BI tools

   •  Accelerate BI queries

   •  SPICE introduction

Loading Data into SPICE

How to load data into SPICE and visualize it.

   •  Visualize data on Redshift

   •  Compare performance to data loaded into SPICE

2 hours on-demand video
Full lifetime access
Access on mobile and TV
Certificate of Completion