3.4 out of 5
382 reviews on Udemy

Mastering data integration (ETL) with Pentaho Kettle PDI

Hands-on, with real case studies, tips and examples: walk through a full project from start to end, based on the MySQL Sakila DB.
Instructor:
Itamar Steinberg (inflow systems)
2,298 students enrolled
English [Auto-generated]
Develop real Pentaho Kettle projects
Become a master of transformation steps and jobs
Know how to set up and deploy a Pentaho Kettle environment
Be familiar with the most-used Pentaho Kettle steps
Know how to secure, validate and handle errors
Check performance and have the tools to solve issues

    Why should I take this course?
    Isn’t it obvious? Don’t you want to be the best ETL and Pentaho Kettle developer?

    General:
    The course is the outcome of my 10 years of experience with IT projects, business intelligence, and data integration with Pentaho Kettle.

    I developed the course because I want to share my knowledge with you.

    The best way to learn technological software and concepts is via an online course
    structured by a real developer with actual experience, who guides you along his (my) path to knowledge.

    I will help you master ETL with Pentaho Kettle.

    What is the course about?
    The course takes you from the very beginning and turns you into a master of Pentaho Kettle.

    The main dish of the course is a hands-on walk-through of a real Pentaho Kettle project, with a case study and tips. It starts with easy steps that become more and more complex, layer by layer, as you go forward. That way you can learn Pentaho Kettle as a beginner and also become an expert as you go along (and practice).

    I also cover:

    • the concepts of data integration
    • why we need it
    • what are the tools used today
    • data warehouse concepts

    Structure of the course
    The course is divided into 4 main sections:

    Section 1: Theory and concepts of data integration in general
    (if you are already an ETL developer, you can skip this)

    Section 2: Setting up the environment

    Install and operate data integration with Pentaho Kettle,
    including database management and profiling the database as a source.
    PDI, Navicat (to manage the database), JDBC drivers, JRE, the Sakila example database, MySQL and more.

    Walk-through

    • Pentaho Kettle environment
    • Navicat (the best database manager, in my opinion)
    • Power Architect


    Section 3: The main dish

    • full data integration project with Pentaho Kettle
    • project overview
    • detailed design
    • step-by-step (divide and conquer)

    through to the successful end of the project, including some 80% of the steps used in Pentaho Kettle, in order to master data integration.
    You can see all the steps in the curriculum (there are too many to list here).
    Just for example:

    a. Connect to various data sources (databases, files…)

    b. Manipulate the data:
    changing strings, dates and calculations, joins, lookups, slowly changing dimensions, and considerations of when and how to use different steps.

    c. Work with variables
    d. Output steps (bulk load, table output, insert/update, file output…)

    Section 4: wrapping up – go to production

    You will learn how to:

    1. Deploy the project
    2. Make it stable by securing the solution: validation, error handling
    3. Logging
    4. Performance

Installations

1
What are we going to install?

The list of software we require in order to run and work with Pentaho ETL

2
Install mysql
3
Install JRE - java runtime

JRE is required by Pentaho in order to run

4
Install pentaho data integration (kettle)

This lecture shows how to install pentaho data integration

5
Install navicat - mysql manager

This lecture shows how to install navicat

6
Install sakila database (and notepad++)
7
Install Power Architect

This lecture shows how to install Power Architect, a profiling tool for databases

8
Install expresso

This lecture shows how to install Expresso, a tool that acts as a wizard for creating regular expressions

Hands on - Pentaho

1
Pentaho PDI getting started
2
kettle variables part 1
3
kettle variables part 2
4
kettle database connection
5
Pentaho repositories
6
schema introduction

Software Walkthroughs

1
Navicat walkthrough
2
power architect walkthrough

This lecture is about profiling a database with Power Architect

The Date Dimension

1
dim date intro
2
generate rows part 1
3
generate rows part 2
4
generate rows part 3
5
the add sequence
6
the select values
7
the mapping / string cut / string concat
8
the table output
9
the string operation
10
dim date summary
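
As a rough illustration of what the date dimension lectures above cover (generate rows, add a sequence, derive fields), here is a minimal sketch in plain Python. It is not Kettle code and not the course's own solution; the column names such as date_key, and the start date used in the example, are assumptions made up for illustration.

```python
from datetime import date, timedelta

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

def build_dim_date(start, days):
    """Return one row (dict) per calendar day, starting at `start`."""
    rows = []
    for offset in range(days):  # plays the role of "generate rows" + "add sequence"
        d = start + timedelta(days=offset)
        rows.append({
            "date_key": d.year * 10000 + d.month * 100 + d.day,  # hypothetical key, e.g. 20050101
            "full_date": d.isoformat(),
            "year": d.year,
            "quarter": (d.month - 1) // 3 + 1,
            "month": d.month,
            "day_of_month": d.day,
            "day_name": DAY_NAMES[d.weekday()],
            "is_weekend": d.weekday() >= 5,
        })
    return rows

# Example: one row per day of 2005 (an arbitrary year chosen for the sketch)
dim_date = build_dim_date(date(2005, 1, 1), 365)
```

In Kettle the same result is assembled from several small steps; the sketch only shows the shape of the rows that end up in the table output.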

dim time

1
dim time intro
2
arrange steps and create hours and minutes
3
the Cartesian step
4
Cartesian customer example
5
the modified java script value
6
the field set / filter rows / dummy steps
7
dim time summary
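
The idea behind the Cartesian step in this section, combining every hour with every minute, can be sketched in plain Python as follows. This is only an illustration under assumed column names (time_key, am_pm, label are hypothetical), not the transformation built in the lectures.

```python
from itertools import product

def build_dim_time():
    """Return one row per minute of the day: 24 hours x 60 minutes."""
    rows = []
    # product() yields every (hour, minute) pair, like a Cartesian join
    for hour, minute in product(range(24), range(60)):
        rows.append({
            "time_key": hour * 100 + minute,   # hypothetical key, e.g. 2359
            "hour": hour,
            "minute": minute,
            "am_pm": "AM" if hour < 12 else "PM",
            "label": f"{hour:02d}:{minute:02d}",
        })
    return rows

dim_time = build_dim_time()
```

Two small inputs (24 rows and 60 rows) are enough to produce the full 1,440-row dimension, which is why a Cartesian product is a natural fit here.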

dim staff

1
dim staff intro
2
the table input
3
the data grid / value mapper
4
consideration 1 - historical data in dimensions
5
consideration 2 - truncate or update table
6
consideration 3 - be like mike - deleted rows on dimension

dim store

1
dim store intro
2
the database lookup
3
the stream lookup
4
the insert /update step
5
the system info

dim customer

1
dim customer intro
2
control "changed data only" input
3
down it goes with the stream
4
slowly changing dimension - concept
5
slowly changing dimension - example
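
To make the slowly changing dimension concept a little more concrete, here is a minimal sketch of a "Type 2" update in plain Python: when a tracked attribute changes, the current row is closed and a new version is added. It assumes a simple list-of-dicts dimension; the column names (customer_id, city, version, valid_from, valid_to) and the far-future end date are illustrative assumptions, not the course's schema or Kettle's internal logic.

```python
from datetime import date

OPEN_END = date(9999, 12, 31)  # assumed marker meaning "this version is current"

def apply_scd2(dimension, incoming, today, key="customer_id", tracked=("city",)):
    """Apply incoming source records to `dimension` in place, keeping history."""
    current = {row[key]: row for row in dimension if row["valid_to"] == OPEN_END}
    for rec in incoming:
        old = current.get(rec[key])
        if old is not None and all(old[c] == rec[c] for c in tracked):
            continue  # no tracked attribute changed: keep the current version
        version = 1
        if old is not None:
            old["valid_to"] = today        # close the previous version
            version = old["version"] + 1
        new_row = {key: rec[key], "version": version,
                   "valid_from": today, "valid_to": OPEN_END}
        new_row.update({c: rec[c] for c in tracked})
        dimension.append(new_row)
        current[rec[key]] = new_row
    return dimension

dim_customer = []
apply_scd2(dim_customer, [{"customer_id": 1, "city": "Haifa"}], date(2020, 1, 1))
apply_scd2(dim_customer, [{"customer_id": 1, "city": "Tel Aviv"}], date(2020, 3, 1))
# dim_customer now holds two versions of customer 1; only the second is current
```

Keeping both versions is what lets a fact row point at the customer as they were at the time of the event.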

dim film

1
dim film intro
2
objectives
3
the number range
4
the merge join / sort rows / value null
5
the denormaliser / split fields to rows

fact rentals

1
fact rental intro
2
the inventory - film and store id
3
slowly changing dimension on fact table
4
counter and date diff calculation
5
key date handling
6
the time dimension check
7
error handling step

Go to production

1
production steps intro
2
the final job
3
kitchen batch file
4
schedule jobs
5
validation - secure the stream part 1
6
validation - secure the stream part 2
7
logging
8
performance
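
For readers wondering what running a job from a script with Kitchen might look like, here is a small hedged sketch in Python that only builds the command line. The job path and the LOAD_DATE parameter are hypothetical; the -file, -level and -param: options are the documented Kitchen options on Linux/macOS (on Windows the launcher is Kitchen.bat), and the course's own batch file may well differ.

```python
def kitchen_command(job_file, level="Basic", params=None, kitchen="kitchen.sh"):
    """Build the argument list for running a Kettle job file with Kitchen."""
    cmd = [kitchen, f"-file={job_file}", f"-level={level}"]
    # Named parameters are passed as -param:NAME=value
    for name, value in sorted((params or {}).items()):
        cmd.append(f"-param:{name}={value}")
    return cmd

# Hypothetical job file and parameter, for illustration only
cmd = kitchen_command("/opt/etl/load_dwh.kjb", params={"LOAD_DATE": "2020-01-01"})
print(" ".join(cmd))
```

The resulting list can be handed to subprocess.run(cmd, check=True) or written into a scheduled script; Kitchen's exit code is 0 when the job succeeds.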

ETL concepts and sources

1
what is ETL
2
the data warehouse concept
3
Analytical structure
4
ETL tools comparison

What's next...

1
need more input
2
this is a goodbye
382 Ratings

Detailed Rating

5 stars: 110
4 stars: 103
3 stars: 102
2 stars: 36
1 star: 31
30-Day Money-Back Guarantee

Includes

9 hours on-demand video
Full lifetime access
Access on mobile and TV
Certificate of Completion