‘The Cloud’ or ‘Cloud computing’ is one of the hottest buzzwords in technology. It appears in more than 48 million Internet searches every day. Cloud computing and software-as-a-service (SaaS) have been around for quite some time now, and we have been using them through multiple applications, both at work and personally via mobile apps.
But when it comes to data warehousing, the cloud has only recently emerged as an alternative to the conventional on-premises data warehouses and similar solutions we have been working with.
When choosing a DW solution for the first time, the first decision is typically between an on-prem DW and a cloud-based one. Many people new to the Data Warehouse domain go straight to the cloud these days because it is faster and easier to set up, priced on a pay-as-you-go basis, scalable, and quick to turn around. That does not make the cloud a one-stop solution for every Data Warehouse need, and there are still many reasons why an organization might choose an on-prem solution.
Even now, many projects maintain and enhance traditional data warehouses on a daily basis. Many projects and team members are also dealing with a growing variety of sources, increasing data volumes, and a surge of new requirements from business and analytics teams who want to see the real value of unstructured data.
I’m here to help you on your journey to understand the basics of the ‘Cloud’ and the Cloud Data Warehouse. We will take baby steps and go slow and easy, so that by the time you finish this course you understand what a cloud Data Warehouse really is.
We will use examples from everyday applications such as Facebook, Netflix and Google Maps to learn more and understand the concepts better.
Introduction
‘The Cloud’ or ‘Cloud computing’ is one of the hottest buzzwords in technology. In this lecture we will briefly talk about the Cloud and Cloud Computing.
This lecture explains what is and is not covered in this course.
Definitions and Concepts
Introduction to Definitions and Concepts of the cloud and the cloud data warehouse.
A data center (or datacenter) is a facility composed of networked computers and storage that businesses or other organizations use to organize, process, store and disseminate large amounts of data.
The cloud is a global network of remote servers that operates as a single ecosystem, over the Internet.
Cloud computing is the delivery of computing services—servers, storage, databases, networking, software, analytics, intelligence and more—over the Internet (“the cloud”) to offer faster innovation, flexible resources, and economies of scale. You typically pay only for cloud services you use, helping lower your operating costs, run your infrastructure more efficiently, and scale as your business needs change.
Elasticity is the ability to grow or shrink infrastructure resources dynamically as needed to adapt to workload changes in an autonomic manner, maximizing the use of resources.
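As a rough illustration of elasticity, the sketch below recomputes the number of instances needed each hour from the current load. The function name, the load figures and the capacity of 100 requests per second per instance are all hypothetical, chosen only to make the idea concrete; real cloud autoscalers use richer signals and policies.

```python
import math


def desired_instances(current_load: float, capacity_per_instance: float,
                      min_instances: int = 1, max_instances: int = 10) -> int:
    """Return how many instances are needed to serve current_load.

    The result is clamped between min_instances and max_instances,
    mimicking the bounds an autoscaling policy would usually enforce.
    """
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    needed = math.ceil(current_load / capacity_per_instance)
    return max(min_instances, min(max_instances, needed))


# Hypothetical hourly load, in requests per second.
hourly_load = [50, 120, 900, 400, 80]

# The fleet grows for the spike and shrinks again afterwards.
fleet_sizes = [desired_instances(load, capacity_per_instance=100)
               for load in hourly_load]
print(fleet_sizes)
```

The point of the sketch is that capacity follows demand in both directions, so resources are not left idle after a spike has passed.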
Scalability is the capability of a process, network, software or appliance to grow and manage increased demands. This is one of the most valuable and predominant features of cloud computing. Through scalability you can scale your data storage capacity up or down to meet the changing demands of your business.
A cloud service is any service made available to users on demand via the Internet from a cloud computing provider's servers as opposed to being provided from a company's own on-premises servers.
Cloud native is a term used to describe container-based environments. Cloud-native technologies are used to develop applications built with services packaged in containers, deployed as microservices and managed on elastic infrastructure through agile DevOps processes and continuous delivery workflows.
Cloud storage is a cloud computing model that stores data on the Internet through a cloud computing provider who manages and operates data storage as a service. It's delivered on demand with just-in-time capacity and costs, and eliminates buying and managing your own data storage infrastructure.
Capital expenditures (CapEx) refers to the money a company spends towards fixed assets, such as the purchase, maintenance, and improvements of buildings, vehicles, equipment, computers, hardware or land.
Operating expenses (OpEx) are the funds an organization uses to run its day-to-day business.
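To make the CapEx and OpEx distinction concrete, here is a small worked comparison. Every figure in it (the upfront hardware cost, the yearly maintenance and the monthly cloud fee) is an invented assumption for illustration, not a real price.

```python
def cumulative_capex(upfront_cost: float, yearly_maintenance: float, years: int) -> float:
    """Total spend after `years` when hardware is bought upfront (CapEx-heavy)."""
    return upfront_cost + yearly_maintenance * years


def cumulative_opex(monthly_fee: float, years: int) -> float:
    """Total spend after `years` when paying a recurring service fee (OpEx)."""
    return monthly_fee * 12 * years


# Hypothetical figures, for illustration only.
UPFRONT_COST = 100_000
YEARLY_MAINTENANCE = 10_000
MONTHLY_FEE = 3_000

for year in range(1, 6):
    capex = cumulative_capex(UPFRONT_COST, YEARLY_MAINTENANCE, year)
    opex = cumulative_opex(MONTHLY_FEE, year)
    print(f"year {year}: on-prem {capex:>9,.0f}   cloud {opex:>9,.0f}")
```

With these made-up numbers the recurring fee is cheaper for the first three years and overtakes the upfront purchase in year four. Which side wins in practice depends entirely on the actual costs and time horizon.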
Infrastructure as a service (IaaS) is a category of online services that provide high-level APIs which hide low-level details of the underlying infrastructure, such as physical computing resources, location, data partitioning, scaling, security and backup.
Platform as a service (PaaS) is a cloud computing model in which a third-party provider delivers hardware and software tools -- usually those needed for application development -- to users over the internet. A PaaS provider hosts the hardware and software on its own infrastructure.
Software as a service (SaaS) is a software distribution model in which a third-party provider hosts applications and makes them available to customers over the Internet. SaaS is one of three main categories of cloud computing, alongside infrastructure as a service (IaaS) and platform as a service (PaaS).
This lecture covers the different services we have discussed in the previous lectures.
A virtual machine (VM) is an emulation of a computer system. Virtual machines are based on computer architectures and provide the functionality of a physical computer. Their implementations may involve specialized hardware, software, or a combination.
Containers provide a standard way to package your application's code, configurations, and dependencies into a single object.
Serverless computing is a cloud-computing execution model in which the cloud provider runs the server, and dynamically manages the allocation of machine resources. Pricing is based on the actual amount of resources consumed by an application, rather than on pre-purchased units of capacity.
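The pricing difference described above can be sketched as a small calculation. The billing formula (invocations × duration × memory × rate) is a simplified assumption, and the rate and server price are hypothetical values, not any provider's actual tariff.

```python
def serverless_cost(invocations: int, avg_duration_s: float,
                    memory_gb: float, price_per_gb_second: float) -> float:
    """Cost when billed only for resources actually consumed."""
    gb_seconds = invocations * avg_duration_s * memory_gb
    return gb_seconds * price_per_gb_second


def reserved_cost(servers: int, price_per_server_month: float) -> float:
    """Cost when capacity is purchased in advance, whether used or not."""
    return servers * price_per_server_month


# Hypothetical monthly workload and prices, for illustration only.
usage_based = serverless_cost(invocations=1_000_000, avg_duration_s=0.2,
                              memory_gb=0.5, price_per_gb_second=0.00002)
pre_purchased = reserved_cost(servers=1, price_per_server_month=50.0)

print(f"usage-based:   {usage_based:.2f}")
print(f"pre-purchased: {pre_purchased:.2f}")
```

Under this simplified model the usage-based bill scales linearly with the work performed, so doubling the invocations doubles the cost, while the pre-purchased cost stays fixed regardless of load.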
The main difference between public and private clouds is that you aren't responsible for any of the management of a public cloud hosting solution. Your data is stored in the provider's data center and the provider is responsible for the management and maintenance of the data center.
Hybrid cloud is a cloud computing environment that uses a mix of on-premises, private cloud and third-party, public cloud services with orchestration between the two platforms.
Multi cloud is a strategy that leverages two or more cloud computing platforms.
API is the acronym for Application Programming Interface, which is a software intermediary that allows two applications to talk to each other.
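A toy sketch of that idea: one piece of code plays the application that owns the data, another plays the client, and the two communicate only through an agreed JSON request and response format. All names here (`handle_request`, `get_row_count`, the table names and counts) are hypothetical, and a real API would normally be reached over a network rather than by a direct function call.

```python
import json

# Hypothetical "server" application: its internal data is hidden from clients.
_WAREHOUSE_TABLES = {"sales": 1200, "customers": 300}


def handle_request(request_json: str) -> str:
    """The API: accept a JSON request and return a JSON response."""
    request = json.loads(request_json)
    table = request.get("table")
    if table not in _WAREHOUSE_TABLES:
        return json.dumps({"status": "error", "message": f"unknown table: {table}"})
    return json.dumps({"status": "ok", "table": table,
                       "row_count": _WAREHOUSE_TABLES[table]})


# Hypothetical "client" application: it knows only the request/response format.
def get_row_count(table: str) -> int:
    response = json.loads(handle_request(json.dumps({"table": table})))
    if response["status"] != "ok":
        raise LookupError(response["message"])
    return response["row_count"]


print(get_row_count("sales"))
```

The client never touches `_WAREHOUSE_TABLES` directly, so the server side could change how it stores data without the client noticing, as long as the request and response format stays the same.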
This is the final lecture of the Definitions and Concepts section for Cloud Computing.
Cloud Data Warehouse
In this lecture, we talk about the various data structures and how they differ from those in traditional Data Warehouses.
In this lecture, we talk about the need for external data sources in an Enterprise Data Warehouse.
In this lecture, we discuss the current challenges with traditional Data Warehouses.
In this lecture we talk about the different features of cloud technology which can help the traditional Data Warehouses.
Let's discuss the different types of options for a Data Warehouse setup on Cloud.
Let's look at the popular cloud data warehouse providers in the market today.
In this lecture, we will go through the questions to answer before considering a move from on-premises to the Cloud.
In this lecture, we talk about the common Data Warehouse use cases for both traditional and cloud-based setups.
In this lecture, we talk about the cloud providers that support the use cases discussed in the previous lectures.
In this lecture, we will look at the different cloud-based solutions for ETL and Reporting.
In this lecture, we talk about the different migration options at a very high level.
Architecture
This lecture talks about the different patterns in traditional Data Warehouses.
Amazon Redshift achieves efficient storage and optimum query performance through a combination of massively parallel processing, columnar data storage, and very efficient, targeted data compression encoding schemes. This lecture presents an introduction to the Amazon Redshift system architecture.
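To give a feel for why columnar storage and compression work well together, here is a generic sketch. It is not Redshift's implementation: the sample rows are invented, and run-length encoding is used simply as one easy-to-follow example of a compression scheme.

```python
from itertools import groupby

# Hypothetical fact rows: (sale_date, country, amount).
rows = [
    ("2024-01-01", "US", 10),
    ("2024-01-01", "US", 12),
    ("2024-01-01", "DE", 7),
    ("2024-01-02", "DE", 7),
]


def to_columns(rows):
    """Pivot row-oriented tuples into one list per column."""
    return [list(column) for column in zip(*rows)]


def run_length_encode(values):
    """Collapse consecutive repeated values into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


dates, countries, amounts = to_columns(rows)

# Values within one column tend to repeat, so they compress well.
encoded_dates = run_length_encode(dates)
encoded_countries = run_length_encode(countries)

# An aggregate over one column only needs to read that column.
total_amount = sum(amounts)

print(encoded_dates)
print(encoded_countries)
print(total_amount)
```

Two things are visible even at this scale: a query that sums `amount` never touches the date or country values, and a column of similar values collapses into a handful of runs.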
Bonus Section
This section will provide you with the coupons for other courses.