Learn ETL using SSIS
Start from an absolute beginner to writing and deploying production quality packages.
In this course we will learn about the basic and advanced concepts of SQL Server Integration Services or SSIS. We will walk through the different tools provided by SSIS to Extract, transform and Load data into various databases. This course can followed along with me, provided you have Windows OS or a Windows VM. There are no-prerequisites to the course. At the end of this course, you will be comfortable building an ETL package, moving data around systems, Transforming data using SSIS controls like Fuzzy Lookup, Web service tasks, Email Tasks etc. and configuring and deploying production quality packages with tasks like SSIS logging and checkpoint tasks. Hope you enjoy the course.
A little about myself and expectations from this course!
Starting the basics
We take a high level view to understand what is an ETL, with some simple examples of how it might be used in day to day projects.
Here we walk through a high level overview of SQL Server Data Tools, the environment that we will use throughout this course. You can download and install the tool on your windows machine, if you wish to follow along with me. Links and details can be found in the resources section or my website.
Exploring Data flows
In this lecture, we deep dive into a simple example. I will show you how to transfer data from Excel to SQL server. We will look at the control flow tasks in SSIS as well as the Data flow tasks. We create a simple excel workbook, put some data into it, create a simple SQL table and transfer the data from Excel to SQL server Table.
In this lecture, we look at the reverse of Lecture 4 i.e transferring data from SQL server database to Microsoft Excel. You can use this framework to transfer data across any DB technology like SQL, Oracle, MySQL to Excel, CSV, XML etc. Its just a different set of tools.
Introduction to Data Scripting
In this lecture, I show how to use the Script Component to perform some data transformations. This control/Tool is extremely useful to write flexible code to perform transformations that may otherwise not be possible using the out of box controls. This control is extremely powerful and provides great flexibility.
A well-architected ETL package should be able to respond to changes in the environment or to other external factors, ideally without editing the source code. Hence SSIS introduced the concept of package configurations. We will walk through a simple example where we create an XML configuration and run a package.
Control Flow Tasks
In this lecture, we study the use of control flow tasks and walk through some examples, where I show you how to setup a workflow using SSIS control flow task. We look at the Execute SQL task and File Tasks in SSIS. We look at an example, where we configure the tasks to execute one after the other or in Parallel. We also look at some conditional flow, where a task execution will be based on the previous task's success/failure.
In this lecture we look at the SSIS web service Task. We look at an example, where we look at a publicly available webservice and call the webservice from SSIS by passing parameters. We push the data returned by the webservice into a Text File.
In this lecture we look at some looping constructs. We take an example, where we store some data in 5-6 files in a folder. Then we create a package, to loop through these files and transfer the data into SQL Server. We make use of the SSIS For-Each Loop Container
In this lecture, we look at mechanisms to restart failed packages from the point of failure. Integration Services can restart failed packages from the point of failure, instead of rerunning the whole package. If a package is configured to use checkpoints, information about package execution is written to a checkpoint file. When the failed package is rerun, the checkpoint file is used to restart the package from the point of failure. If the package runs successfully, the checkpoint file is deleted, and then re-created the next time the package is run.
SQL Server Integration Services includes log providers that you can use to implement logging in packages, containers, and tasks. With logging, you can capture run-time information about a package, helping you audit and troubleshoot a package every time it is run. For example, a log can capture the name of the operator who ran the package and the time the package began and finished.
We conclude the course with this lecture. Hope you enjoyed the course and the assignments.