About the Course
This Specialization teaches the essential skills for working with large-scale data using SQL.
Maybe you are new to SQL and you want to learn the basics. Or maybe you already have some experience using SQL to query smaller-scale data with relational databases. Either way, if you are interested in gaining the skills necessary to query big data with modern distributed SQL engines, this Specialization is for you.
Most courses that teach SQL focus on traditional relational databases, but today, more and more of the data that’s being generated is too big to be stored there, and it’s growing too quickly to be efficiently stored in commercial data warehouses. Instead, it’s increasingly stored in distributed clusters and cloud storage. These data stores are cost-efficient and highly scalable.
To query these huge datasets in clusters and cloud storage, you need a newer breed of SQL engine: distributed query engines such as Hive, Impala, Presto, and Drill. These are open-source SQL engines capable of querying enormous datasets. This Specialization focuses on Hive and Impala, the most widely deployed of these query engines.
What will you learn?
- Distinguish operational from analytic databases, and understand how these are applied in big data
- Understand how database and table design provides structures for working with data
- Appreciate how differences in volume and variety of data affect your choice of an appropriate database system
- Recognize the features and benefits of SQL dialects designed to work with big data systems for storage and analysis
Skills you will gain
- Cloud Storage
- Data Analysis
- Big Data
- Database (DBMS)
- Data Warehousing
- Apache Hive
- Apache Impala
- Data Management
- Distributed File Systems
Courses in this Specialization
- Foundations for Big Data Analysis with SQL: In this course, you’ll get a big-picture view of using SQL for big data, starting with an overview of data, database systems, and the common querying language (SQL).
- Analyzing Big Data with SQL: In this course, you’ll get an in-depth look at the SQL SELECT statement and its main clauses.
- Managing Big Data in Clusters and Cloud Storage: In this course, you’ll learn how to manage big datasets, how to load them into clusters and cloud storage, and how to apply structure to the data so that you can run queries on it using distributed SQL engines like Apache Hive and Apache Impala.
To enroll in this Specialization, click the link below.
Note: NoticeBard is associated with Coursera via an affiliate programme.