Alteryx – Data processing, Data Manipulation and Analytics
In this course we will make you a champion in Alteryx tool. From data extraction to business analytics this course will give you a good direction to kick start with your aspirations of delivering in Business Intelligence and/or Data Analytics. We will be majorly focusing on Business Intelligence and will add concepts of Analytics. You will learn how to extract data from raw data sources, manipulate data and utilize data for running linear and logistic regression models besides machine learning models like SVM and DT. You will also understand forecasting for a time series driven data.
Trainer & Course Introduction
A detailed discussion on what Alteryx and data analytics topics we are covering in this course. This course will be very useful for ETL developers and/or Data analysts and Data scientists looking for enhancing or starting their career into the world of Business Intelligence and Business Analytics.
A motivating video providing a reason of why you should take up the course. You should take up the course because Alteryx is an easy to understand tool, We have included intuitive slides so that we can make this training interactive, the course is very comprehensive as it is focused not just on data analysts but also on software developers, we are also covering interview questions and 2 elaborated projects towards end of this course.
In this lesson we will discuss about agenda items for this lesson and entire course. The agenda items will be (1) Trainer introduction - Background and experience (2) Alteryx course outline. (3) Why you should take up this course
An education background about trainer and how he evolved his career into development and data analytics industry. A brief about the extensive training experience Steven has on niche technologies like Informatica, Alteryx, Tableau, SAS DI Studio, Spark, Hadoop, Advanced SAS. These technologies are used in Business Intelligence and data analytics space.
Significance of data
Agenda slide on importance of data and history of data spurt. Database concepts and jargons.
Jargons used in databases like What is database, structured data
Jargons used in databases like semi-structured data, unstructured data, ETL, variable, observation, data type.
Understanding of core concepts of data bases like Primary key, foreign key. How are tables maintained in a relational database and what is the relevance.
What is ETL and Analytics?
Agenda slide on Core Concepts of ETL & data processing tools, Core Concepts of Analytics & Analytical tools and Application of Analytics
Understanding the ETL concept i.e Extract, transform and load of data. How does ETL work in creating a enterprise data warehouse. Discussion on various ETL tools available.
Evolution of data from raw data sources to a meaningful insights. Before a decision is made by the stakeholders of company how is data processed and the tedious transformation it goes through. Various analytical tools available.
Certain industry use cases where analytics can be used to take business decisions.
Downloading Alteryx. Getting familiar with Alteryx environment and Alteryx Serve
Agenda discussion on - Downloading Alteryx Designer - 14 day trial version, Alteryx at a Glance, Servers and platforms & A deep dive into Alteryx - Options available in Alteryx Tool.
A video on how to download Alteryx designer a 14 day trial software on your local desktop.
Accessing Alteryx. Look and feel of Alteryx. What does various ribbon and tools do in Alteryx.
Different modules of Alteryx with industry relevance. Modules of Alteryx are Alteryx Designer, Desktop Automation and Alteryx Server
A detailed discussion on some of tools in Alteryx. How to make use of canvas to define workflows.
What is Alteryx? And Why Organizations are using Alteryx?
History of Alteryx, What all can Alteryx do and capabilities of Alteryx.
Why Alteryx? - Alteryx as a ONE STOP SHOP. Provides solutions for ETL, Visualization and Data/Statistical Analytics. Alteryx – An end to end tool.
Data extraction: Raw files and Databases
Importing a data from raw data file that can be csv or excel or flat files or a Database.
Importing a data from raw data file using Input Tool in Alteryx and how to visualize data in Alteryx in table format.
Using Text Input tool to manually create a dataset in Alteryx with Variables and observations.
Various kinds of Character, Numeric and date data types in Alteryx. Refer to below link for various data types:
Saving Datasets and work flow. File and data formats in Alteryx
Basic understanding of how to save a Alteryx dataset and workflow. Various formats of Alteryx files.
Efficient management of data space in tables with Auto field tool
Efficient management of variables and tables size with Auto field tool functionality in Alteryx.
Browse Tool to view all records
Visualizing the large volume of data with Browse tool. Since all records are not visible by default in Alteryx and if you want to visualize data with more than 7,000 records, it can be visualized with Browse tool.
Know your data – Types of variables
An understanding of data variables is imperative for a BI or Data Analyst. Types of variables are classified into Quantitative and Qualitative at a broader level. Quantitative variables are further classified into Continuous and Discrete. Qualitative variables are further classified into nominal and ordinal.
Data Wrangling – Creating a new column with Formula Tool. Using Filter tool
Creating copy of a variable using Formula Tool and changing attribute of new variable from one data type to another.
Conditional statements: Using a "If then else" conditional statement to create a new Qualitative variable based on monthly usage for a telecom company.
Filtering records with filter tool. Also understanding if we want to use "And" & "OR" Operator along with the filter tool.
Data Wrangling – Selecting few variables and Mending variable attributes using S
Making subset of a dataset and changing attribute of a variable. Understanding the relevance of creating a subset of dataset.
Creating a new variable using arithmetic calculations. We can use numeric operators like +,-,* etc.
Creating a variable with numeric attribute. Understanding numeric functions like sum, average, max, min, power, ceiling, floor, absolute.
Significance of character functions. Understanding Text to column Tool to split observation values in a variable on the basis of a delimiter. Difference between error and warning and how critical is error and warning.
When the data is not separated by delimiter necessarily and you would like to separate the observations in a variable on the basis of indexing or position of text string, substring function can be used to perform such operations.
Data cleaning with Upper case, lower case and trim functions to bring observations at par. Nesting one function another inside to avoid multiple steps and create an efficient workflow.
Concatenating first name and last name into one column. Using character function with concatenation.
Replacing a text string to clean junk or unnecessary values in a Character variables, with use of Replace function.
A complicated scenario to extract a text string with use of findstring function and other functions like substring.
Handling dates and other formats
Handling date variables. Conversion of a character variable which as dates, into date variables is important so that Alteryx can do the computation in such variables.
Date functions can be used in Date variables only. Certain date functions to add or substract days from a date variables.
Handling currency formats in Alteryx and playing with a trick to convert currenty variable into numeric variable.
Summarize Tools - Functions in Alteryx to derive insightful data
Understanding group by functionality to summarize data at grouping variable and aggregation variable levels.
Evaluating the cumulative totals for a numeric variable and how to evaluate running totals for a Retail sales data.
Arranging data - Sorting & Cascade sorting
Sorting a variable to evaluate top 50 records with highest sales. Also understanding how to sort character variables.
Sorting variables at 2 levels and the business significance of cascade sorting.
Handling duplicate observations and duplicate variable values with Unique Tool
Removing duplicate records. Records are called as duplicates in case they have exactly same value throughout all variables.
Removing duplicates at a single variable level.
Understanding a business case scenario for a retail chain data and implementing Unique tool.
Sampling and Creating Union/concatenation of data using Sample and Union tools
Creating subset of a data through Sample tool. Select first or last few records. Concatenating records of 2 tables.
Joining tables with a common key
Understanding industry relevance of joins and revising concept of relational database. We will discuss How are tables joined with respect to a key variable and how do we populate variables from different tables to make one consolidated table with all relevant variables required for analysis.
Left join will consist of all the records from left table.
Right join will consist of all records from right table. Inner join will give us common records. Outer join will give all observations across both the tables.
Transposing a data which has higher number of records to lower number of records by converting one of variable into columns.
Transposing a data which has lower number of records to higher number of records by converting few variables into one columns.
Frequency of categorical variables
Creating frequency of a Qualitative variable so that business can draw some insights into volume of business driven through top products.
Creating frequency of a 2 Qualitative variable so that business can understand which product has higher Order priority inclination.
Automating the workflows with Macros and Global Constants
Understanding the need for making macros for replicating workflows periodically i.e weekly or monthly.
Creating a macro workflow and setting up an automated process to be reused.
Calling a Macro workflow post creation. Passing the input dataset values in runtime.
Concept of Global constants - Is a value of a variable which can be accessed anywhere in workflow. Integrating the global constants with workflow to make dynamic usage.
Replacing Missing values
Replacing missing values with overall averages.
Replacing missing values with averages across one variable only.
Replacing missing values with averages across multiple variable and business sense for doing so.
Intro to Data Analytics and decision making
Agenda - (1) What is Analytics? (2) Types of Analytics with a Case study (3) Industry specific examples
Analytics is exploration and interpretation of data to find out meaningful and insightful patterns so that these can be applied towards decision Making
A perfect example of Analytics with a online retail company. How does company identify how much discount is optimal for meeting the sales targets for a business.
4 types of analytics viz. Descriptive, Diagnostic, Predictive and Prescriptive. Understanding the concept of each of them with the online retail company.
A elaborated discussion on how analytics can be used in various industries like retail, banking, Healthcare and telecom.
Linear and Logistic Regression - Running Statistical models with Predictive Tool
Linear regression is used to predict a continuous variable. Understanding a Boston data, each row represents one house and price of it corresponding to variables like crime rate in city, number of rooms etc.
Linear Regression tool to build a model on a continuous dependent variable. Identifying impactful variables and selecting those variables into final model.
Once the model is built next step is to make predictions on the basis of linear equation. We will understand how to use score tool to make predictions.
Understanding the need for logistic regression. Logistic regression is used to predict a binary variable i.e variable which has 2 outcomes like pass or fail, success or unsuccess, default or non default etc.