****Update****
A new section has been added covering the latest features of Informatica 10.x, changes to the Administration Console, and the upgrade planner options.
****Update****
After completing this program you should be able to:
- Perform the day-to-day activities of an Informatica Administrator
- Prepare for the Informatica Administration certification
- Install the Informatica 9.x/10.x architecture platform, and configure, add, and manage Informatica PowerCenter and Data Quality.
- Define and set up platform best practices for users, privileges, roles, and permissions.
- Assign users to groups, privileges to roles, and roles to groups. Assign permissions and manage the domain.
- Manage repositories and repository folders. Back up and migrate the Informatica domain database.
- Use the command line to manage the domain and repository, and to start and control workflows
- Upgrade licenses and versions
- Understand the different processes for deploying code from one environment to another.
Introduction
Informatica is a leading provider of enterprise data integration software and services. With Informatica, organizations can gain greater business value by integrating all their information assets from across the enterprise. Thousands of companies worldwide rely on Informatica to reduce the cost and expedite the time to address data integration needs of any complexity and scale.
PowerCenter Architecture
Informatica 9.6.1 latest features
PowerCenter provides an environment that allows you to load data into a centralized location, such as a data warehouse or operational data store (ODS). You can extract data from multiple sources, transform the data according to business logic you build in the client application, and load the transformed data into file and relational targets.
- PowerCenter - Domain
- PowerCenter - Administration Console
- PowerCenter - Client
- PowerCenter - Repository Service
- PowerCenter - Integration Service
This session covers all the components involved in the PowerCenter architecture.
Service Manager:
The Service Manager is a service that manages all domain operations. It runs within Informatica Services. It runs as a service on Windows and as a daemon on UNIX. When you start Informatica Services, you start the Service Manager. The Service Manager runs on each node. If the Service Manager is not running, the node is not available.
The Service Manager runs on all nodes in the domain to support the application services and the domain:
Application Services: Application services represent PowerCenter server-based functionality. Here are some of the application services in Informatica PowerCenter.
- Integration Service
- Repository Service
- Reporting Service
- Metadata Manager Service
- SAP BW Service
- Web Services Hub
- Reference Table Manager Service
A node is the logical representation of a machine in a domain. One node in the domain acts as a gateway to receive service requests from clients and route them to the appropriate service and node. Services and processes run on nodes in a domain.
The different types of nodes are discussed in this session.
Gateway Node:
A gateway node is any node you configure to serve as a gateway for the domain. One node acts as the gateway at any given time. That node is called the master gateway. A gateway node can run application services, and it can serve as a master gateway node. The master gateway node is the entry point to the domain.
Worker Node:
A worker node is any node not configured to serve as a gateway. A worker node can run application services, but it cannot serve as a gateway.
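The gateway and worker roles described above can be pictured with a small Python model. This is only an illustrative sketch: the `Node` class, the `elect_master_gateway` function, and the rule of picking the first available gateway are assumptions made for the example, not a description of how Informatica actually elects a master gateway.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical stand-in for a domain node (not an Informatica API)."""
    name: str
    is_gateway: bool      # configured to serve as a gateway
    running: bool = True  # the Service Manager is up on this node


def elect_master_gateway(nodes: List[Node]) -> Optional[Node]:
    """Return one available gateway node to act as the master gateway.

    Assumption for illustration: the first running gateway in the list wins.
    Worker nodes are never eligible.
    """
    for node in nodes:
        if node.is_gateway and node.running:
            return node
    return None


domain = [
    Node("node01", is_gateway=True),
    Node("node02", is_gateway=True),
    Node("node03", is_gateway=False),
]

master = elect_master_gateway(domain)
print(master.name if master else "no gateway available")

# Simulate losing the current master gateway: another gateway node takes over.
domain[0].running = False
master_after_failure = elect_master_gateway(domain)
print(master_after_failure.name if master_after_failure else "no gateway available")
```

The point of the model is that only nodes configured as gateways can become the master gateway, and that a worker node is never a candidate, even when every gateway is down.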
This handout covers all the required information for the sessions discussed under Section 1: architecture, basic definitions, nodes, core services, and generic troubleshooting and resolution information.
Installation and Configuration
This section describes what you need to get started, both on your personal PC and in a work environment.
The Product Availability Matrix (PAM) is the right place to start for all pre-installation checks on which versions are compatible with which version of Informatica.
This session shows how to download the free software for Informatica 9.6 and Oracle 11g or 12c from the Oracle eDelivery website.
Which files should I download from the eDelivery site?
Oracle 11g Installation and SQL Developer Configuration
This session shows how to extract the client and server executables from the .zip and .gz files downloaded from the Oracle eDelivery website.
A step-by-step process for installing the Informatica server and setting up the Informatica service, with an explanation of all the available options and the port numbers.
A step-by-step process for completing the client installation for PowerCenter, and the other client options available for Informatica Data Quality and Transformation Studio.
Administration Console
This session provides an overview of the Administration Console page layout and tabs, and the differences between the 8.6 and 9.x web page layouts. It also covers log management and the basic differences between monitoring in the Administration Console and the Informatica Monitor client tool.
This session explains the services available in the Administration Console and their purpose, the order in which the services should be created, and their dependencies. Common issues and fixes are also discussed.
A domain is the fundamental administrative unit for Informatica nodes and services. An Informatica domain is a collection of nodes and services that define the Informatica platform. The PowerCenter domain that you create while installing PowerCenter is called the local domain.
This session covers all the properties and aspects of the domain in the Administration Console.
This is a short continuation of the previous session.
This session covers all the properties under the Node in the Administration Console.
A step-by-step process showing which properties to choose when creating the Repository Service.
This session explains the default properties set when the service is created, which options can be updated and to what values, the implications of those changes, and the scenarios in which they are appropriate.
The Integration Service is created while installing PowerCenter. After it is created, we use the Administration Console to manage it.
What is the Integration Service?
The Integration Service reads workflow information from the Informatica repository. It creates one or more Integration Service processes to manage workflows. When we run a workflow, the Integration Service locks the workflow and runs the workflow tasks and sessions.
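As a rough illustration of asking the Integration Service to run a workflow from the command line, the sketch below assembles a `pmcmd startworkflow` argument list in Python without executing it. The helper name `build_startworkflow_command` and every service, domain, user, folder, and workflow value are placeholders invented for the example, and the option letters should be checked against the command reference for your PowerCenter version.

```python
import shlex
from typing import List


def build_startworkflow_command(service: str, domain: str, user: str,
                                password_env_var: str, folder: str,
                                workflow: str, wait: bool = True) -> List[str]:
    """Assemble a pmcmd startworkflow argument list (nothing is executed)."""
    args = [
        "pmcmd", "startworkflow",
        "-sv", service,
        "-d", domain,
        "-u", user,
        "-pv", password_env_var,  # name of an environment variable holding the password
        "-f", folder,
    ]
    if wait:
        args.append("-wait")  # block until the workflow finishes
    args.append(workflow)
    return args


# All values below are placeholders invented for this example.
command = build_startworkflow_command(
    service="IS_DEV", domain="Domain_Dev", user="admin_user",
    password_env_var="INFA_PASSWD", folder="SALES", workflow="wf_load_sales",
)
print(shlex.join(command))
```

On a machine where `pmcmd` is installed, the same list could be handed to `subprocess.run`. Keeping the password in an environment variable avoids placing it directly on the command line.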
This session covers how to start and stop the various services we have created. It shows how to identify the cause of an issue, fix it, and restart the services, with one example each for PowerCenter services and IDQ services, along with descriptions of basic network connectivity and database connectivity issues.
Informatica Security
- Different types of users used in this course
- Difference between Administrator, Domain Administrator, Default Administrator, and Application Administrator
- LDAP setup process and steps
- User creation
- How to enable and disable access for users?
- How to get the list of users from the Security tab?
How to bind a user or a group to a connection is explained with an onsite and offshore team setup example.
IDQ Services
Datasheet on Informatica Data Quality from Informatica Corporation
This session shows you how to delete and create a new MRS service.
Model Repository Service (MRS): The MRS provides automated persistence of models. It looks at the model to determine how to persist objects. The default persistence scheme can be customized. Repository capabilities are model agnostic. The MRS provides metadata search and import/export. It also lets you add model definitions dynamically, such as adapter metamodels.
Data Integration Service (DIS): DIS is the container for all data integration functionalities. DIS plug-ins provide different data integration functionalities. The different plug-ins are as follows:
- Profiling service plug-in translates profile into mappings
- SQL Service plug-in translates SQL into mappings
- Mapping Service executes data quality plans
The DIS provides common services to its plug-ins, such as request dispatch and thread pooling. It also provides mapping execution using an embedded Data Transformation Manager (DTM).
Informatica Analyst is a web-based application client that analysts can use to analyze, cleanse, standardize, profile, and score data in an enterprise. Business analysts and developers use Informatica Analyst for data-driven collaboration. You can perform column and rule profiling, scorecarding, and bad record and duplicate record management. You can also manage reference data and provide the data to developers in a data quality solution.
The Content Management Service is an application service that manages reference data. It provides reference data information to the Data Integration Service and to the Developer and Analyst tools. A master Content Management Service maintains probabilistic model and classifier model data files across the domain.
Web Services
Web services are business functions that operate over the Web. They describe a collection of operations that are network accessible through standardized XML messaging. The PowerCenter Web Services Provider lets you integrate the PowerCenter metadata and data integration functionalities and expose them as web services. You can write applications that can communicate with Integration Services in any language or platform. You can embed these applications easily in existing components and products.
High Availability
The PowerCenter High Availability Option provides high availability and seamless failover and recovery of all PowerCenter components. This option minimizes service interruption in the event of a hardware or software outage and reduces costs associated with data downtime.
The administration console simplifies setup and management. The SOA framework that is part of the PowerCenter architecture enables this option's functionality.
BENEFITS:
- Guard against platform service outage and ensure data uptime
- Reduce costs and risks associated with data downtime
- Enable mission-critical deployment of PowerCenter for enterprise data integration initiatives
KEY FEATURES:
High Availability:
- Enables the configuration of multiple backup services across the entire platform to provide reliability and redundancy
- Extends to all platform services, making all services “hot,” meaning they can perform work as well as function as backup
Seamless Failover:
- Minimizes service interruptions in the event of a hardware, network, or software outage by automatically rerouting data integration processing to resources unaffected by outage
- Preserves development work by rerouting save operations when the primary repository server is unavailable
Guaranteed Flexible Recovery:
- Provides various recovery strategies, including automatic recovery of partially completed data integration processing due to unexpected service interruptions
- Leverages workflow and session check-pointing to persist intermediate processing states
- Ensures viability of recovery and restart procedures even when source data changes between session failure and recovery
- Guarantees zero data loss during recovery for real-time sources
Connection Resiliency:
- Makes all internal and external connection points resilient to changes in computing or network environments
- Extends resiliency to automatic retry for configurable time periods and graceful shutdown
Full Integration Across Entire Data Integration Platform:
- Automatically applies all high availability features to all data integration processing available on the PowerCenter platform, including data cleansing, data profiling, and unstructured and semistructured data processing
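The "automatic retry for configurable time periods" idea listed under Connection Resiliency can be sketched generically in Python. This is not PowerCenter's implementation: the function `call_with_resilience`, its parameters, and the `flaky_connect` stand-in are all hypothetical and exist only to show the pattern of retrying until a time window runs out.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_resilience(operation: Callable[[], T], timeout_seconds: float,
                         retry_interval_seconds: float = 1.0,
                         sleep: Callable[[float], None] = time.sleep,
                         clock: Callable[[], float] = time.monotonic) -> T:
    """Retry operation on ConnectionError until it succeeds or the window elapses."""
    deadline = clock() + timeout_seconds
    while True:
        try:
            return operation()
        except ConnectionError:
            # Give up if the next retry would start after the deadline.
            if clock() + retry_interval_seconds > deadline:
                raise
            sleep(retry_interval_seconds)


attempts = {"count": 0}


def flaky_connect() -> str:
    """Hypothetical connection that fails twice before succeeding."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("temporarily unreachable")
    return "connected"


result = call_with_resilience(flaky_connect, timeout_seconds=5.0,
                              retry_interval_seconds=0.01)
print(result, "after", attempts["count"], "attempts")
```

A longer timeout tolerates longer outages at the cost of holding the caller for longer, which is the trade-off behind making the period configurable.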
High Availability Option Benefits
Guard Against Platform Service Outage and Ensure Data Uptime:
The High Availability Option eliminates single points of failure in the PowerCenter environment, providing uninterrupted access to critical data and reducing errors in data delivery. PowerCenter's sophisticated distributed architecture enables monitoring between its components and services.
Reduce Costs and Risks Associated with Data Downtime:
With its advanced failover and recovery capabilities, the High Availability Option mitigates the risk of failure and minimizes the consequences associated with processing outages. By eliminating data downtime, IT organizations can be more productive. They can better control the total cost of enterprise data integration initiatives. The centralized, Web-based administration console allows users to rapidly create and configure reliable and redundant services, as well as dynamically reconfigure services, simplifying overall system administration.
Enable Mission-Critical Deployment of PowerCenter for Enterprise Data Integration Initiatives:
Organizations that rely on PowerCenter as the foundation for all mission-critical enterprise data integration initiatives cannot risk processing downtime. Particularly in the context of an Integration Competency Center, the High Availability Option can underwrite stringent service level agreements. Seamless failover and recovery for all PowerCenter components minimize the consequences of service interruptions. Adaptable recovery strategies can be determined for configurable restart of platform services. Session checkpointing provides faster, easier, and more efficient recovery. This option's powerful capabilities extend PowerCenter's readiness for mission-critical deployment.
Grids
In this session we cover the basics of grid computing and the advantages of using a grid in PowerCenter.
Grid computing is the collection of computer resources from multiple locations to reach a common goal. The grid can be thought of as a distributed system with non-interactive workloads that involve a large number of files.
How do you identify whether a grid setup is required for your environment?
What are the different scenarios to be considered?
How do you come up with the data and the numbers to show the business that it is required?
Now that we have determined that the environment benefits from a grid setup, this session covers what is required to start the configuration.
Let's start with the configuration steps.
Which resources need to be configured?
These should be enabled or disabled based on the development team's resource requirements.
Resource configuration for each Integration Service on the node.
This session explains how to disable the connection resources.
Load balancing is a very important feature of grids; we discuss the different options available and the configuration steps required.
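To make the idea of load balancing across grid nodes concrete, here is a toy round-robin dispatcher in Python. It is a simplified model and not the Load Balancer's actual algorithm: the function `dispatch_round_robin` and the node and session names are invented for the example, and the sketch deliberately ignores node load, resource requirements, and service levels.

```python
from itertools import cycle
from typing import Dict, Iterable, List


def dispatch_round_robin(tasks: Iterable[str], nodes: List[str]) -> Dict[str, List[str]]:
    """Assign tasks to nodes in turn, ignoring node load and task resource needs."""
    if not nodes:
        raise ValueError("at least one node is required")
    assignment: Dict[str, List[str]] = {node: [] for node in nodes}
    # cycle() repeats the node list endlessly; zip() stops when tasks run out.
    for task, node in zip(tasks, cycle(nodes)):
        assignment[node].append(task)
    return assignment


# Placeholder node and session names invented for this example.
plan = dispatch_round_robin(
    tasks=["s_load_a", "s_load_b", "s_load_c", "s_load_d", "s_load_e"],
    nodes=["node01", "node02"],
)
for node, assigned in plan.items():
    print(node, assigned)
```

Round-robin spreads the number of tasks evenly but can still overload a node when tasks differ greatly in cost, which is why dispatch options that take node load into account are worth comparing during configuration.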
The different types of service levels available, and how they should be created and configured at the workflow level.
This session describes the activities the development team should consider for better results with the grid setup.
High availability and recovery strategy benefits with a grid, and the best practices to follow for any recovery design implementation.
Different FAQs answered by Informatica Corporation
Additional Resources and KB content from Informatica Corporation
Operating System Profiles
This session describes why OS profiles are needed and how to set them up. This is more relevant to UNIX environments than to Windows.
This is part 2 of the OS profiles section.
This document from the Informatica Knowledge Base covers all aspects of OS profiles and the commands to use on UNIX.
License Management
Product Version
- Platform
- Expiration Date
- PowerCenter Options
- Connectivity
- Metadata Exchange Options
Tasks:
- Options
- Validity
- Remove the App services
- Add App services
- Add an incremental Key
- infacmd isp
Types of License Keys:
- Original Key
- Incremental Key
Client Tools and Sample Workflow Creation
This session covers the steps for folder creation and getting the environment ready for the developers.
This session covers the basic details of the repository configuration and the default file directory structure that is automatically created during installation.
This session can be skipped by the current ETL developers as they would already be aware of it.
This session can be skipped by the existing ETL developers but is a must for the learners who are new to DWH and the ETL/Data Integration world. This session explains all the basic activities a developer can do using the Informatica Power Center Client tools.
- Export and Import
- Deployment Groups
- Purging the versioned objects
- Dependencies
- Queries
- Password Reset
- Remember the 'AutoSave'
This session can be skipped by existing ETL developers but is a must for learners who are new to DWH and the ETL/Data Integration world. It explains the basic transformation types, the available transformations, and their meaning and usage.
This session can be skipped if you are already familiar with mapplets. It is aimed at participants who are new to Informatica and data integration.
What is a Debugger in Informatica and when to use it?
The Debugger is used to troubleshoot errors in an Informatica mapping, either before running a session or after saving the mapping and running the session. To debug a mapping, first configure the Debugger and then run it within the Mapping Designer.
The Debugger makes use of the existing session or creates a debug session of its own to debug the mapping.
Informatica PowerCenter Monitor & Repository Manager for Log Information
You can monitor workflows and tasks in the Workflow Monitor. A workflow is a set of instructions that tells an Integration Service how to run tasks. Integration Services run on nodes or grids. The nodes, grids, and services are all part of a domain.
With the Workflow Monitor, you can view details about a workflow or task in Gantt Chart view or Task view. You can also view details about the Integration Service, nodes, and grids.
The Workflow Monitor displays workflows that have run at least once. You can run, stop, abort, and resume workflows from the Workflow Monitor. The Workflow Monitor continuously receives information from the Integration Service and Repository Service. It also fetches information from the repository to display historic information.
The Informatica server creates a session log file for each session. It writes information about the session to the log file, such as the initialization process, the creation of SQL commands for reader and writer threads, errors encountered, and the load summary. The amount of detail in the session log file depends on the tracing level that you set in the session task.
Versioning
- Why do we need versioning?
- Check in and check out
- How to check dependencies
- Viewing history
- Comparing different versions
In this lecture we talk about purging the unused versions of the objects.
Migration/Deployment Strategies
What is a Deployment Group?
How to create Static and Dynamic Deployment groups?
Usage of Labels.
Moving deployment groups from one repository to another.
How to use queries in the Repository Manager tool?
Questions on how to use them in a multiple-environment setting.
Exporting objects to XML is one of the easiest and most convenient ways to deploy code and share it with others on the team.
This lecture shows the steps to copy an entire folder from one environment to another.
Steps to copy single objects from one environment to another.
Repository Metadata
As an Informatica PowerCenter administrator, you often need to obtain a list of users and associated groups, workflows that have last run, mappings in a folder, default values within a mapping, and so on. This information can be queried in the PowerCenter tools; however, a more efficient way to collect this data is to query the repository metadata tables directly in the database. This method is very helpful when performing a large repository upgrade or decommissioning an environment.
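The kind of query described above, such as "which workflows ran last, and with what result", can be sketched against an in-memory SQLite database. The table `workflow_runs`, its columns, and the sample rows are invented stand-ins for illustration only; the real repository tables and views have different names and columns that vary by version, so look them up in the repository documentation before querying a real database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical stand-in for a repository metadata view (names are assumptions).
conn.execute(
    "CREATE TABLE workflow_runs ("
    " folder_name TEXT, workflow_name TEXT, start_time TEXT, status TEXT)"
)
conn.executemany(
    "INSERT INTO workflow_runs VALUES (?, ?, ?, ?)",
    [
        ("SALES", "wf_load_sales", "2024-01-01 02:00:00", "Succeeded"),
        ("SALES", "wf_load_sales", "2024-01-02 02:00:00", "Failed"),
        ("HR", "wf_load_hr", "2024-01-01 03:00:00", "Succeeded"),
    ],
)

# Latest run per workflow: join each row to its workflow's maximum start_time.
latest_runs = conn.execute(
    """
    SELECT r.folder_name, r.workflow_name, r.start_time, r.status
    FROM workflow_runs AS r
    JOIN (
        SELECT folder_name, workflow_name, MAX(start_time) AS last_start
        FROM workflow_runs
        GROUP BY folder_name, workflow_name
    ) AS m
      ON m.folder_name = r.folder_name
     AND m.workflow_name = r.workflow_name
     AND m.last_start = r.start_time
    ORDER BY r.folder_name, r.workflow_name
    """
).fetchall()

for row in latest_runs:
    print(row)
conn.close()
```

Against a real repository database, such queries should be read-only and run with an account that has only select privileges, since the repository tables are maintained by the Repository Service.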
Keep this handout as a ready reckoner and use it when required.
Repository Maintenance
This lecture discusses what repository maintenance is and why we need it.
This lecture shows the steps to back up the repository.
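A repository backup can also be scripted from the command line. The sketch below only assembles the `pmrep connect` and `pmrep backup` argument lists in Python and prints them; nothing is executed. The helper name `build_backup_commands` and all repository, domain, user, and file names are placeholders invented for the example, and the option letters should be verified against the command reference for your version.

```python
from typing import List


def build_backup_commands(repository: str, domain: str, user: str,
                          password_env_var: str, output_file: str) -> List[List[str]]:
    """Return the pmrep connect and backup argument lists (nothing is executed)."""
    connect = [
        "pmrep", "connect",
        "-r", repository,
        "-d", domain,
        "-n", user,
        "-X", password_env_var,  # environment variable that holds the password
    ]
    backup = ["pmrep", "backup", "-o", output_file]
    return [connect, backup]


# Placeholder names invented for this example.
commands = build_backup_commands(
    repository="REP_DEV", domain="Domain_Dev", user="admin_user",
    password_env_var="INFA_PASSWD", output_file="REP_DEV_backup.rep",
)
for command in commands:
    print(" ".join(command))
```

The two commands are meant to run in order, since the backup acts on the repository that the preceding connect command selected.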