- The concepts of data integration
- Why we need it
- What tools are used today
- Data warehouse concepts
- The Pentaho Kettle environment
- Navicat (the best database manager, in my opinion)
- Power Architect
- A full data integration project with Pentaho Kettle
- Project overview
- Detailed design
- Step by step (divide and conquer)
- Deploying the project
- Making it stable by securing the solution – validation, error handling
- Logging
- Performance
Why should I take this course?
Isn’t it obvious? Don’t you want to be the best ETL and Pentaho Kettle developer?
General:
The course is the outcome of my 10 years of experience with IT projects, business intelligence, and data integration with Pentaho Kettle.
I developed the course because I want to share my knowledge with you.
The best way to learn technological software and concepts is via an online course,
structured by a real developer with actual experience who guides you through his (my) path to knowledge.
I will help you master ETL with Pentaho Kettle.
What is the course about?
The course is about taking you from the very beginning and turning you into a master of Pentaho Kettle.
The main dish of the course is a hands-on walk-through of a real Pentaho Kettle project: a case study with tips, taking you from easy steps to steps that become more and more complex, layer by layer, as you go forward. That way you can learn Pentaho Kettle as a beginner, but also become an expert as you go along (and practice).
I also cover related tools along the way.
Structure of the course
The course is divided into 4 main sections:
Section 1: Theory and concepts of data integration in general
(if you are already an ETL developer, you can skip it)
Section 2: Setting up the environment
Install and operate data integration with Pentaho Kettle,
including database management and profiling the database as a source:
PDI, Navicat (to manage the database), JDBC drivers, the JRE, the Sakila example database, MySQL, and more.
Section 3: The main dish
A walk-through of the project until its successful end, including some 80% of the steps used by Pentaho Kettle, so that you master data integration.
You can see all the steps in the curriculum (there are too many to write here).
Just as an example:
a. Connect to various data sources (databases, files…)
b. Manipulate the data:
changing strings, dates, and calculations; joins; lookups; slowly changing dimensions; considerations of when and how to use different steps.
c. Work with variables
d. Output steps (bulk load, table output, update/insert, file output…)
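As a conceptual preview of item b (plain Python, not Kettle itself — in Kettle you would drag a "Stream lookup" step onto the canvas instead of writing code), a lookup enriches each incoming row with fields from a keyed reference table. The data and names below are invented for illustration:

```python
# Conceptual sketch of what a Kettle "stream lookup" step does:
# enrich each incoming row with fields from a keyed reference table.
# All data here is made up for illustration.

customers = {  # reference data, keyed by customer_id
    1: {"name": "Alice", "country": "US"},
    2: {"name": "Bob", "country": "DE"},
}

orders = [  # the incoming row stream
    {"order_id": 10, "customer_id": 1, "amount": 25.0},
    {"order_id": 11, "customer_id": 2, "amount": 40.0},
    {"order_id": 12, "customer_id": 3, "amount": 15.0},  # no matching customer
]

def lookup(rows, reference, key, defaults):
    """Yield each row merged with its reference record (or defaults on no match)."""
    for row in rows:
        enriched = dict(row)
        enriched.update(reference.get(row[key], defaults))
        yield enriched

result = list(lookup(orders, customers, "customer_id",
                     {"name": None, "country": None}))
```

The unmatched order keeps flowing with default (null) values, which mirrors how a lookup step lets you choose defaults for missing keys instead of dropping rows.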
Section 4: Wrapping up – going to production
You will learn how to make the solution production-ready: validation, error handling, logging, and performance.
Installations
The list of software we require in order to run and work with Pentaho ETL.
The JRE is required by Pentaho in order to run.
This lecture shows how to install Pentaho Data Integration.
This lecture shows how to install Navicat.
This lecture shows how to install Power Architect, a profiling tool for databases.
This lecture shows how to install Expresso, a tool that acts as a wizard for creating regular expressions.
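Whatever wizard you use to build the pattern, the result is a standard regular expression that you can then paste into a Kettle step (such as "Regex evaluation"). A quick sketch in Python, with an invented example pattern:

```python
import re

# Example of the kind of pattern a wizard like Expresso helps you compose:
# match a date in YYYY-MM-DD form and capture its parts.
# The pattern and test string are invented for illustration.
date_pattern = re.compile(r"^(\d{4})-(\d{2})-(\d{2})$")

match = date_pattern.match("2013-07-04")
year, month, day = match.groups()  # captured groups as strings
```

The same pattern text works across tools, since Kettle's regex steps use standard (Java) regular-expression syntax.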
Hands on - Pentaho
Software Walkthroughs
This lecture is about profiling a database with Power Architect.