Type
Virtual
Classroom ILT
Skill Level

Available dates
Learning Path
Virtual
Duration
1 Day

Introduction
In this course, you will learn to build batch data analytics solutions using Amazon EMR, an enterprise-grade Apache Spark and Apache Hadoop managed service. You will learn how Amazon EMR integrates with open-source projects such as Apache Hive, Hue, and HBase, and with AWS services such as AWS Glue and AWS Lake Formation. The course addresses data collection, ingestion, cataloguing, storage, and processing components in the context of Spark and Hadoop. You will learn to use EMR Notebooks to support both analytics and machine learning workloads. You will also learn to apply security, performance, and cost management best practices to the operation of Amazon EMR.
Audience profile
This course is intended primarily for:
- Data platform engineers
- Architects and operators who build and manage data analytics pipelines
Pre-requisites
We recommend that attendees of this course have:
- A minimum of one year of experience managing open-source data frameworks such as Apache Spark or Apache Hadoop
- Taken the AWS Hadoop Fundamentals course, if they need a refresher on Apache Hadoop
- Completed either AWS Technical Essentials or Architecting on AWS
- Completed either Building Data Lakes on AWS or Getting Started with AWS Glue
Course objectives
This course is designed to teach you how to:
- Apply data lake methodologies in planning and designing a data lake
- Articulate the components and services required for building an AWS data lake
- Secure a data lake with appropriate permissions
- Ingest, store, and transform data in a data lake
- Query, analyse, and visualize data within a data lake
- Orchestrate big data workflows using AWS Data Pipeline
Course content
Module 1: Introduction to data lakes
Module 2: Data ingestion, cataloguing, and preparation
Module 3: Data processing and analytics
Module 4: Building a data lake with AWS Lake Formation
Module 5: Additional Lake Formation configurations
Module 6: Architecture and course review
Associated certifications and exam
On successful completion of this course, students will receive a Torque IT attendance certificate.

Amazon Web Services Overview
Amazon Web Services (AWS) is a leader in cloud computing solutions and provides IT infrastructure in the form of online web services.
Torque IT is the first and only Authorized AWS Training Partner in Africa.
Our authorized AWS training solutions and associated certifications develop and validate the technical knowledge and skills relevant to individuals and organizations that are considering, implementing, and maintaining cloud-based solutions.
Our hands-on AWS training courses are for anyone who wants to gain a deeper understanding of cloud computing solutions and AWS. AWS Certifications designate individuals who demonstrate knowledge, skills, and proficiency with AWS services. AWS Certification exams validate the technical knowledge and skills necessary for building and maintaining applications and services on the AWS Cloud.
Regardless of your level of experience with cloud computing and AWS, we have a course for you.