**Course Most Recently Updated Nov/2018!**

Thank you all for the huge response to this emerging course! We are delighted to have over 20,000 students in over 160 different countries. I’m genuinely touched by the overwhelmingly positive and thoughtful reviews. It’s such a privilege to share and introduce this important topic with everyday people in a clear and understandable way.

I’m also excited to announce that I have created real closed captions for all course material, so whether you need them due to a hearing impairment or find it easier to follow along (great for ESL students!)… I’ve got you covered.

**Most importantly:**

To make this course “real”, we’ve expanded. In November of 2018, the course went from 41 lectures and 8 sections, to 62 lectures and 15 sections! We hope you enjoy the new content!

**Unlock the secrets of understanding Machine Learning for Data Science!**

In this introductory course, the “Backyard Data Scientist” will guide you through the wilderness of Machine Learning for Data Science. Accessible to everyone, this introductory course explains not only Machine Learning, but also where it fits in the “techno sphere around us”, why it’s important now, and how it will dramatically change our world today and for days to come.

**Our exotic journey will include the core concepts of:**

The train wreck definition of computer science, and one that will actually make sense instead.

An explanation of data that will have you seeing data everywhere that you look!

One of the “greatest lies” ever sold about the future of computer science.

A genuine explanation of Big Data, and how to avoid falling into the marketing hype.

What is Artificial Intelligence? Can a computer actually think? How do computers do things like navigate like a GPS or play games anyway?

What is Machine Learning? And if a computer can think – can it learn?

What is Data Science, and how does it relate to magical unicorns?

How Computer Science, Artificial Intelligence, Machine Learning, Big Data and Data Science interrelate to one another.

**We’ll then explore the past and the future while touching on the importance, impacts and examples of Machine Learning for Data Science:**

How a perfect storm of data, computers and Machine Learning algorithms has combined to make this important right **now**.

We’ll actually make sense of how computer technology has changed over time while covering off a journey from 1956 to 2014. Do you have a supercomputer in your home? You might be surprised to learn the truth.

We’ll discuss the kinds of problems Machine Learning solves, and visually explain regression, clustering and classification in a way that will intuitively make sense.

Most importantly we’ll show how this is changing our lives. Not just the lives of business leaders, but most importantly…you too!

**To make sense of the Machine part of Machine Learning, we’ll explore the Machine Learning process:**

How do you solve problems with Machine Learning and what are five things you must do to be successful?

How to ask the right question, to be solved by Machine Learning.

Identifying, obtaining and preparing the right data … and dealing with dirty data!

How every mess is “unique” but that tidy data is like families!

How to identify and apply Machine Learning algorithms, with exotic names like “Decision Trees”, “Neural Networks”, “k-Nearest Neighbors” and “Naive Bayesian Classifiers”.

And the biggest pitfalls to avoid and how to tune your Machine Learning models to help ensure a successful result for Data Science.

**Our final section of the course will prepare you to begin your future journey into Machine Learning for Data Science after the course is complete. We’ll explore:**

How to start applying Machine Learning without losing your mind.

What equipment Data Scientists use (the answer might surprise you!).

The top five tools used for Data Science, including some surprising ones.

And for each of the top five tools – we’ll explain what they are, and how to get started using them.

And we’ll close off with some cautionary tales, so you can be the most successful you can be in applying Machine Learning to Data Science problems.

**Bonus Course! To make this “really real”, I’ve included a bonus course!**

Most importantly in the bonus course I’ll include information at the end of every section titled “Further Magic to Explore” which will help you to continue your learning experience.

In this bonus course we’ll explore:

Creating a real live Machine Learning Example of Titanic proportions. That’s right – we are going to predict survivability onboard the Titanic!

Using Anaconda, Jupyter and Python 3.x

A crash course in Python – covering all the core concepts of Python you need to make sense of the code examples that follow. See the included free cheat sheet!

Hands-on running Python! (Interactively, with scripts, and with Jupyter)

Basics of how to use Jupyter Notebooks

Reviewing and reinforcing core concepts of Machine Learning (that we’ll soon apply!)

Foundations of essential Machine Learning and Data Science modules:

NumPy – An Array Implementation

Pandas – The Python Data Analysis Library

Matplotlib – A plotting library which produces quality figures in a variety of formats

SciPy – The fundamental package for scientific computing in Python

Scikit-Learn – Simple and efficient tools for data mining, data analysis, and Machine Learning

In the Titanic hands-on example we’ll follow all the steps of the Machine Learning workflow throughout:

1. Asking the right question.

2. Identifying, obtaining, and preparing the right data

3. Identifying and applying a Machine Learning algorithm

4. Evaluating the performance of the model and adjusting

5. Using and presenting the model

We’ll also see real-world examples of problems in Machine Learning, including underfit and overfit.

The bonus course finishes with a conclusion and further resources to continue your Machine Learning journey.

So I invite you to join me, the Backyard Data Scientist, on an exquisite journey into unlocking the secrets of Machine Learning for Data Science… for, you know, everyday people… like you!

Sign up right now, and we’ll see you – on the other side!

### Introduction

Why should you buy this course? Begin **here** to see what we'll cover and what this course will bring to you!

I'm pleased to announce that my course has closed captioning on every lecture, which I have personally proofread, edited and corrected. I hope this helps all my students better enjoy the course material. Please view this lecture for a personal message from me.

My personal thank-you for entrusting me with your time. It's a privilege to share this amazing topic with you.

A taste of what's to come - the course overview outlines what we'll be discussing in each section of this course.

**SECRET SAUCE!:** Top tips on how to get the most out of this course! Don't skip this lecture - it's worth your time!

### Core Concepts

What will we discover with core concepts? Here I'll give you a brief overview of all the exciting lectures contained in this section.

The current definition of computer science is an incomprehensible train wreck! Find out why in this lecture!

In order to better understand what computer science is, it's useful to understand what DATA is. By the end of this lecture you'll be able to see **DATA EVERYWHERE** you look!

There are two different kinds of data - Structured and Unstructured. This is a key concept that we are going to come back to time and time again later on. Important, and delivered in under 3 minutes!

Test your understanding of structured vs. unstructured data in this quick quiz!

Here we revisit the definition of what Computer Science is, with something that's actually comprehensible. Wondering what an algorithm is? We've got that covered too! And while we're at it - we'll even dive into programming.

Finally, we'll touch on what I call "One of the greatest lies - Ever **SOLD**".

So what is Big Data? Learn the three V's of big data, what it is... and what it isn't!

This lecture will educate you so you don't fall for the **"marketing hype"** often associated with Big Data.

A quiz on the ideas of big data.

This is a longer lecture, however within 12 minutes we'll cover off the most fundamental parts of Artificial Intelligence.

Do you know how a computer plans a route in a GPS? Or how it would play a game like Tic-Tac-Toe? The answers might surprise you! This lecture has several animations to help illustrate the concepts and, importantly, the challenges of AI in search.

And! We'll also cover off one of the most interesting questions - "Can a computer really think?"

At last! We are discussing Machine Learning! In this lecture, we'll clearly define Machine Learning. We'll give a simplified overview of the Machine Learning Process, which we'll expand on later in section 4. We'll discuss some applications of Machine Learning, as well as what Machine Learning gives AI.

By the end of this lecture, you'll have an idea of what Machine Learning can be used for.

In this Animated Example, we'll show a simple Machine Learning application. While it's a very simple example, it will show how data can be looked at, examined for patterns, and will discuss the difference between sensitivity and specificity. These are key concepts to Machine Learning and important to understand when applying it.
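
As a rough illustration of those two terms, here is a minimal Python sketch. The labels are made up for the example and are not taken from the lecture. Sensitivity is the share of real positives the model caught; specificity is the share of real negatives it correctly left alone.

```python
def sensitivity_specificity(actual, predicted):
    """Return (sensitivity, specificity) for two equal-length lists of booleans."""
    pairs = list(zip(actual, predicted))
    true_pos = sum(a and p for a, p in pairs)
    false_neg = sum(a and not p for a, p in pairs)
    true_neg = sum(not a and not p for a, p in pairs)
    false_pos = sum(not a and p for a, p in pairs)
    sensitivity = true_pos / (true_pos + false_neg)   # positives we caught
    specificity = true_neg / (true_neg + false_pos)   # negatives we left alone
    return sensitivity, specificity


# Hypothetical example: 4 real positives followed by 6 real negatives.
actual = [True, True, True, True, False, False, False, False, False, False]
predicted = [True, True, True, False, False, False, False, False, True, True]

sens, spec = sensitivity_specificity(actual, predicted)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

A model can score well on one measure and poorly on the other, which is why it helps to look at both.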

What is Data Science? Magical Unicorns? (Yes really!). Battling Venn Diagrams (I'm not kidding!)

In this lecture, we'll define what Data Science is and what a Data Scientist does.

Big Data! AI! Machine Learning! Computer Science! Data Science!

How does this all fit together? Where does one "start" and the other "stop?" In this lecture, we'll use an animated diagram to explain how all these different domains interrelate. Confusion stops here!

### Impacts, Importance and examples

What will we discover with "Impacts, Importance and Examples"? Here I'll give you a brief overview of all the exciting lectures contained in this section.

Why are we talking about this? Why is this important **now**?

In this lecture we'll uncover the convergence of events that have come together in a *perfect storm* of digital change.

Computers exploding?! Everyone always gives lip service to "how much technology has actually changed". But what does it **really** mean? In this longer lecture, we'll take a journey from 1956 to 2014, and really explain how the world has changed.

Do you have a **supercomputer** in your house? You might be surprised to find out the truth!

In this brief lecture, we'll cover the three different problems Machine Learning solves really well.

- Classification
- Clustering
- Regression

Pictures will help make sense of every concept, and they will be the bedrock for later seeing how different problems can be solved by Machine Learning. While watching this lecture, be sure to look at how a problem can be solved in different ways, using different approaches to Machine Learning.
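
To make the three problem types a little more concrete, here is a small plain-Python sketch. It is only an illustration under simple assumptions: the numbers are invented, and the helper names (`fit_line`, `learn_threshold`, `classify`, `cluster_1d`) are hypothetical, not part of the course material.

```python
def fit_line(xs, ys):
    """Regression: least-squares slope and intercept, predicting a number."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


def learn_threshold(examples):
    """Classification: learn a cut-off halfway between two labelled groups."""
    small = [v for v, label in examples if label == "small"]
    large = [v for v, label in examples if label == "large"]
    return (sum(small) / len(small) + sum(large) / len(large)) / 2


def classify(value, threshold):
    """Assign one of the two known labels to a new value."""
    return "large" if value >= threshold else "small"


def cluster_1d(values, centers, rounds=10):
    """Clustering: a tiny 1-D k-means that groups unlabelled values."""
    for _ in range(rounds):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


# Made-up data for illustration only.
slope, intercept = fit_line([1, 2, 3, 4], [2, 4, 6, 8])
labelled = [(1, "small"), (2, "small"), (3, "small"),
            (10, "large"), (11, "large"), (12, "large")]
threshold = learn_threshold(labelled)
centers, groups = cluster_1d([1, 2, 3, 10, 11, 12], centers=[0, 5])

print(slope, intercept)
print(threshold, classify(4, threshold), classify(9, threshold))
print(centers, groups)
```

Notice that classification needed labels to learn from, while clustering found the same two groups without any labels at all.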

We've covered off - what it is. How it works. What it provides....

Now the question is *How is this changing our lives?*

In this lecture we'll talk about what we'll likely see. What happens when Machine Learning goes wrong. And we'll touch on ethics - which is not just a case of banning killer robots, but much more subtle as well.

### The Machine Learning Process

What will we discover with "The Machine Learning Process"? Here I'll give you a brief overview of all the exciting lectures contained in this section.

In this lecture we'll cover off each of the five steps of the Machine Learning Process, sometimes called a "pipeline" or "workflow". Any problem being solved by Machine Learning will have to touch all of these five steps - sometimes more than once.

This key lecture will discuss how the parts of the process work together. **Not to be missed!**

What question are you asking? What are your goals?

What does done look like? How good must our prediction be?

All these things are key parts of step 1 - asking the right question *in the first place...*

In this tell-all lecture:

- Domain expertise reigns supreme!
- Where will you get your data from? Surprising secret sources of data you might not have considered.
- **Dirty data**... dirty, dirty data! Realistically anticipating the largest effort in any Machine Learning project, as well as discussing tidy data.

What are you waiting for? Go to your lecture (room) and clean that (data) up! __All messes are not created equal.__

It's science and it's art. In this lecture we'll discuss how Machine Learning algorithms interact with data to model answers to your problems. We'll discuss and illustrate four common Machine Learning algorithms. For each, we'll cover off how they work, and what workloads work best for them. You'll become a master of the digitally arcane, with powers over:

- Decision Trees
- Naïve Bayesian Classifiers
- Neural Networks
- kNN - k-Nearest Neighbours.
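
Of the four, kNN is the easiest to sketch in a few lines. The version below is a minimal plain-Python illustration rather than the lecture's own material: the points and labels are invented, and `knn_predict` is a hypothetical helper name. A new point simply takes the majority label of its k closest neighbours.

```python
import math
from collections import Counter


def knn_predict(training, query, k=3):
    """Return the majority label among the k training points nearest to query.

    training is a list of ((x, y), label) pairs; query is an (x, y) point.
    """
    by_distance = sorted(training, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Made-up training points: two well-separated groups.
training = [
    ((1, 1), "cat"), ((1, 2), "cat"), ((2, 1), "cat"),
    ((6, 6), "dog"), ((7, 6), "dog"), ((6, 7), "dog"),
]

near_cats = knn_predict(training, (2, 2))
near_dogs = knn_predict(training, (6, 5))
print(near_cats, near_dogs)
```

There is no separate training step here: the "model" is just the stored examples, which is part of why kNN is such a friendly first algorithm.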

How do you evaluate the performance of your Machine Learning algorithm anyway? And if it's not working the way you expected - how do you fix it? In this tell-all lecture, we'll discuss common problems of Machine Learning - and how to address them.
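
One common way to evaluate a model, sketched here under simple assumptions, is to hold back some examples the model never saw and compare accuracy on both sets. The data below is invented, with two deliberately wrong labels in the training set, and the nearest-neighbour model is just a convenient stand-in. With k=1 the model memorises the noise (perfect on training data, poor on new data), which is a small picture of overfitting; k=3 smooths it out.

```python
from collections import Counter


def knn_label(train, x, k):
    """Majority label among the k training values closest to x."""
    nearest = sorted(train, key=lambda item: abs(item[0] - x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def accuracy(train, examples, k):
    """Fraction of examples whose predicted label matches the true label."""
    correct = sum(knn_label(train, x, k) == label for x, label in examples)
    return correct / len(examples)


# Made-up data: values below 5 are "low", the rest "high".
# The labels at 3.0 and 8.0 are deliberately wrong to simulate dirty data.
train = [(1.0, "low"), (2.0, "low"), (3.0, "high"), (4.0, "low"),
         (6.5, "high"), (7.0, "high"), (8.0, "low"), (9.0, "high")]
held_out = [(1.4, "low"), (2.9, "low"), (3.2, "low"),
            (6.4, "high"), (7.8, "high"), (8.8, "high")]

results = {k: (accuracy(train, train, k), accuracy(train, held_out, k))
           for k in (1, 3)}
for k, (train_acc, test_acc) in results.items():
    print(f"k={k}: training accuracy={train_acc:.2f}, held-out accuracy={test_acc:.2f}")
```

A large gap between training accuracy and held-out accuracy is the warning sign to look for, and adjusting a setting such as k is one way to address it.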

Finally! We've reached the end goal! Or have we?

In this brief lecture, we'll cover off four important things to keep in mind to use your Machine Learning Model.

A quiz on the process of Machine Learning

### How to apply Machine Learning for Data Science

How do you get started in your journey to applying Machine Learning for Data Science? In this brief overview, we'll describe the tell-all lectures that will give you a place to start to apply Machine Learning and Data Science.

**HOW NOT TO LOSE YOUR MIND.**

Really. This lecture is an important one, because it will give you guidance on how to get started in your journey without losing your mind along the way.

What do you need to do Machine Learning? Is it expensive? Out of reach?

In this surprising lecture, we'll pull back the curtain on what Data Scientists are actually using. We'll also list the top five tools for Data Science, which we'll dive deep into in the following lectures.

The number one tool for Data Science is "R", a powerhouse for Machine Learning applications. We'll describe the tool, as well as provide links and important tips on using it.

The second most popular tool for Data Science is "Python". Python is a general programming language with incredible power, versatility and flexibility. It's gaining on R year by year, and has powerful Data Science and Machine Learning capabilities.

We'll describe Python, as well as provide links and important tips on using it.

The third most common tool for Data Science is SQL. Pronounced SEA-QUEL, this is a database language. In this lecture we'll describe what SQL is, and why it has shown up in third place among Data Science tools.
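
For a first taste of SQL without installing anything, Python's built-in `sqlite3` module can run queries against an in-memory database. This is a minimal sketch: the `passengers` table, its columns, and its rows are invented for illustration.

```python
import sqlite3

# An in-memory database that disappears when the connection closes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE passengers (name TEXT, ticket_class INTEGER, age REAL)")
conn.executemany(
    "INSERT INTO passengers VALUES (?, ?, ?)",
    [("Ann", 1, 29.0), ("Bob", 3, 41.0), ("Cid", 3, 19.0), ("Dee", 2, 35.0)],
)

# Ask a question of the data: how many passengers per class, and their average age?
rows = conn.execute(
    "SELECT ticket_class, COUNT(*), AVG(age) "
    "FROM passengers GROUP BY ticket_class ORDER BY ticket_class"
).fetchall()
conn.close()

for ticket_class, count, average_age in rows:
    print(ticket_class, count, average_age)
```

The query describes what you want (counts and averages per class) rather than how to loop over the rows, which is a large part of SQL's appeal for data work.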

The fourth most common tool for Data Science is Microsoft Excel? Yes - really! In this lecture we'll describe Microsoft Excel and its value as a Data Science tool.

Finally, we'll give you the "real deal" when it comes to doing Machine Learning in Excel. The answer will surprise you!

The final top five tool for Data Science is RapidMiner. In this lecture we'll discuss using Software as a Service, and some things to think about when using RapidMiner.

You made it! In this final lecture of section 5, we'll talk about things to watch out for when doing Machine Learning. This lecture will give you key information on how to avoid obstacles on your way to success!

### Conclusion

Congratulations on your journey into Machine Learning and Data Science. We sincerely hope you enjoyed it - and we hope to see you again... in our next course!

NOTE: November 2018 - The next course is *IN THIS COURSE*! That's right - check out the next lecture for our included bonus course **"Machine Learning in Python and Jupyter for Beginners"!**

### Section 1 - Bonus course - Machine Learning in Python and Jupyter for Beginners

Introductions! Who am I?

Who are you?

Starting the Anaconda download.

Prerequisite knowledge

Topics for the course

What won't we cover today?

How the course will be delivered.

Titanic survivability project - what we'll be building.

Introducing Kaggle

Where is the Titanic example?

Starting Anaconda Installation.

Platform Selections - Why Python?

Platform Selections - Why Python 3.x?

Platform Selections - Why Anaconda?

### Section 2 - Bonus course - Machine Learning in Python and Jupyter for Beginners

Comments.

Basic Variables and Assignments.

Notes about Data Types.

Data type Summary.

Basic Type Casting.

Advanced Assignments.

Advanced Assignments - Error situations.

Strings - Basic String Assignment.

Strings - Unusual String Assignment.

Strings - Basic String Operations.

Strings - Core Concept - Immutability.

Slices.

Lists.

Lists - Basic List Operations

Lists - Additional Operations.

Lists - Advanced Topics.

Notes about Expressions.

Arithmetic and Bitwise Operators.

Relational, Logical, and Identity Operators.

Identity Operators.

Assignment Operators and Membership Operators

Conditional Logic and "if" statements.

Iterations and Loops - Simple while loop.

Iterations and Loops - Advanced loops and for loops.

Functions and variable Scope.

Dictionaries.

Dictionaries - Error Situations.

Dictionaries - Further Example.

Getting Help!

Further magic to explore - Where to go from here (to continue your learning on Python)

Completing the Anaconda Installation
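
To give a flavour of the crash-course topics above, here is a short sampler in Python 3. It is only a sketch: apart from the lecture count, the values are invented, and `count_long_names` is a hypothetical helper, not code from the lectures.

```python
# Variables and type casting
lectures = "62"                    # a string...
lecture_count = int(lectures)      # ...cast to an integer

# Strings are immutable: changing a character in place raises TypeError
course = "machine learning"
try:
    course[0] = "M"
    string_changed = True
except TypeError:
    string_changed = False
title = course.title()             # builds a new string instead

# Slices use [start:stop:step]
first_word = course[:7]
backwards = first_word[::-1]

# Lists are mutable
tools = ["R", "Python", "SQL"]
tools.append("Excel")

# Dictionaries, including an error situation (a missing key)
lectures_per_topic = {"strings": 4, "lists": 4}
try:
    lectures_per_topic["tuples"]
    missing_key = False
except KeyError:
    missing_key = True
tuple_lectures = lectures_per_topic.get("tuples", 0)   # safe lookup with a default


# A function with a for loop and conditional logic
def count_long_names(names, min_length=3):
    """Count the names that have at least min_length characters."""
    total = 0
    for name in names:
        if len(name) >= min_length:
            total += 1
    return total


long_tool_names = count_long_names(tools)
print(lecture_count, title, backwards, tools, tuple_lectures, long_tool_names)
```

Try pasting it into a Jupyter cell and changing one value at a time to see what happens.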

### Section 3 - Bonus course - Machine Learning in Python and Jupyter for Beginners

Running Python Interactively.

Running Python standalone scripts.

Running Python in Jupyter notebooks.

How to use Jupyter:

Creating notebooks

Using notebooks

Saving notebooks

Types of cells

How the Kernel works (and how to manage it)

Getting help

Help with Jupyter Markdown language.

### Section 4 - Bonus course - Machine Learning in Python and Jupyter for Beginners

What is Data Science?

What are Data Scientists?

Data Science areas.

What kinds of problems does Machine Learning Solve?

Classification

Regression

Clustering

Can a Machine Learn?

What is Machine Learning? - Simplified Overview

What does data look like?

What does the data in our Titanic example look like?

Types of Machine Learning:

Supervised

Unsupervised

Reinforcement Learning

The 5 steps of a Machine Learning Workflow

Example algorithms

Overview - Decision Trees

Overview - Naive Bayesian Classifiers

Overview - Neural Networks

Overview - kNN

Evaluating the performance of a model and adjusting

Overfitting

Underfitting

Further magic to explore - Where to go from here (to continue your learning).

### Section 5 - Bonus course - Machine Learning in Python and Jupyter for Beginners

Overview - Highest to lowest level.

Overview:

SciKit-learn

SciPy

Matplotlib

Pandas

NumPy

Basics of NumPy

Basic Creation and Assignments

Updating Values

Array Builders - Ones

Array Builders - Zeros

Array Builders - Choose your own

Matrices

NumPy: Further Magic to explore - Where to go from here (to continue your learning)

Introducing Pandas - the Python Data Analysis Library

Introducing Matplotlib - plotting library which produces publication quality figures in a variety of formats.

Pandas:

Basic Series Creation and Assignments.

Basic Data Frame Creation and Assignments.

Creating a Data frame from CSV and reviewing it.

Exploring the Data - Data Shapes and Types.

Accessing and Changing the Data - Rows (cases) and Columns (features)

Removing Data

Filtering Data

Determining Unique Values

Simple Analysis

Matplotlib:

Simple analysis and plotting

Pandas:

Simple analysis and plotting

Matplotlib - Further magic to explore - Where to go from here (to continue your learning).

Pandas - Further magic to explore - Where to go from here (to continue your learning).

SciPy

The fundamental package for scientific computing with Python

Sparse matrix (example)

SciPy - Further magic to explore - Where to go from here (to continue your learning).

Scikit-Learn:

Simple and efficient tools for data mining, data analysis, and Machine Learning

### Section 6 - Bonus course - Machine Learning in Python and Jupyter for Beginners

Let's get our start by applying the 5 steps of the Machine Learning Workflow to the Titanic.

Asking the right question.

Identifying, obtaining, and preparing the right data.

Identifying and applying a Machine Learning algorithm.

Evaluating the performance of the model and adjusting

Using and presenting the model.

**Step #1 - Asking the right question**

Creating our Titanic Example file

Reviewing the data, and data dictionary

Importing our modules - Pandas, NumPy, Matplotlib, and Scikit-learn

Loading the dataframe

**Step #2 - Identifying, obtaining, and preparing the right data.**

Reviewing the data, identifying gaps and problems with the data set.

**Step #2 - Identifying, obtaining, and preparing the right data.**

Exploring the data with Pandas and Matplotlib - understanding people in the data set in terms of:

Survival of the disaster

Gender of people onboard

Age of passengers (histogram)

Classes of passengers

Age distribution in the Classes of passengers

Embarkation location

**Note:** The goal of this lecture (and the next lecture) is to identify the right data and features to use in the Machine Learning algorithm.

**Step #2 - Identifying, obtaining, and preparing the right data.**

Exploring the data with Pandas and Matplotlib - understanding people in the data set in terms of:

Survival in relation to age (Scatter plot)

Survival in relation to gender

Survival in relation to passenger class

Survival in relation to passenger class and gender.

**Note:** The goal of this lecture (and the previous lecture) is to identify the right data and features to use in the Machine Learning algorithm.

**Step #2 - Identifying, obtaining, and preparing the right data.**

Preparing the right data - adjusting gender.

Preparing the right data - filling in missing ages.

Applying a basic hypothesis:

**Step #3 - Applying an algorithm (a basic one).**

**Step #4 - Evaluating the performance of the hypothesis and adjusting.**

Applying Linear Regression:

**Step #2 - Preparing the data** (building the training features, and training target)

Applying Linear Regression (continued)

**Step #3 - Applying the algorithm (running fit)**

**Step #4 - Evaluating the performance of Linear Regression (Cross validation)**

Applying a polynomial regression

**Step #3 - Applying the algorithm (running fit)**

**Step #4 - Evaluating the performance of Polynomial Regression (Cross validation)**

Applying Decision Trees:

**Step #3 - Applying the algorithm (running fit)**

**Step #4 - Evaluating the performance of Decision tree (Cross validation)**

What happened??? - Overfit!! (Note: See resources in this lecture for the charts.)

Adjusting the algorithm

**Step #3 - Applying the algorithm (running fit)**

**Step #4 - Evaluating the performance of Decision tree (Cross validation)**

**Step #5 - Using and presenting the model.**

Conclusion of the Decision tree model. What features did it decide are most important?
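
The early steps of this workflow can be sketched in plain Python before reaching for Pandas. Everything below is an assumption made for illustration: the six rows are invented and are NOT the real Kaggle Titanic data, the median is just one way to fill missing ages, and "predict that women survive" is a stand-in for whatever basic hypothesis the lecture uses.

```python
from statistics import median

# Made-up rows for illustration only (not the real Titanic data set).
passengers = [
    {"sex": "female", "age": 29.0, "survived": 1},
    {"sex": "male", "age": None, "survived": 0},
    {"sex": "female", "age": 4.0, "survived": 1},
    {"sex": "male", "age": 40.0, "survived": 0},
    {"sex": "male", "age": 22.0, "survived": 1},
    {"sex": "female", "age": None, "survived": 0},
]

# Step 2 - prepare the data: encode gender as a number, fill in missing ages.
known_ages = [p["age"] for p in passengers if p["age"] is not None]
fill_age = median(known_ages)   # assumption: use the median of the known ages
for p in passengers:
    p["is_female"] = 1 if p["sex"] == "female" else 0
    if p["age"] is None:
        p["age"] = fill_age


# Step 3 - apply a very basic hypothesis (an assumed one for this sketch).
def baseline_predict(passenger):
    """Predict survival (1) for women and non-survival (0) for men."""
    return passenger["is_female"]


# Step 4 - evaluate: what fraction of predictions match what actually happened?
accuracy = sum(baseline_predict(p) == p["survived"] for p in passengers) / len(passengers)
print(f"filled missing ages with {fill_age}, baseline accuracy {accuracy:.2f}")
```

A simple baseline like this gives you a number to beat, so you can tell whether a fancier algorithm is actually earning its keep.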

### Section 7 - Bonus course - Machine Learning in Python and Jupyter for Beginners

In conclusion:

Concept: "The algorithm with the most data selection wins!"

Thoughts on:

Feature engineering

Data selection

Algorithm selection

Further magic to explore - Where to go from here (to continue your learning)

Kaggle

Link to an amazing blog post

Links to several amazing Jupyter notebooks

How to contact me!

Thank you!

### Retired Lectures

Retired lecture - retained here if any students should want to review it.

__Retired lecture - retained here if any students should want to review it.__

__Note: This lecture had some audio gaps, I've fixed that up.__

Here we revisit the definition of what Computer Science is, with something that's actually comprehensible. Wondering what an algorithm is? We've got that covered too! And while we're at it - we'll even dive into programming.

Finally, we'll touch on what I call "One of the greatest lies - Ever **SOLD**".

### Bonus Content

Attached is an article I wrote in early 2017 about one of the most important developments of 2016. I think it's as relevant today as it was back then.

I hope you enjoy it! It's included in HTML format, as well as attached in PDF.