
Apache Airflow TRAINING12532

This 5-day training course covers Apache Airflow and includes modules on the introduction to Airflow, configuration, coding data pipelines with DAGs and operators, advanced concepts like subDAGs and hooks, implementing databases and executors, and data profiling and monitoring in Airflow. Each day consists of two modules separated by a lunch break, with 90% hands-on practice and 10% theory. Python knowledge is required and participants will gain experience developing, running, and monitoring Airflow data pipelines.

Apache Airflow Developer TRAINING

Course Content
Training Duration: 5 Days
*** Two modules are covered each day: Module 1 before the lunch break and Module 2 after it.

*** Every module includes a practical component.

 Prerequisite: Python

*** 90% hands-on, 10% theory

Day 1 ::: Apache Airflow Introduction

 Big Data vs. Normal ETL Pipelines

 Why We Need Airflow?
 First Approach to Airflow
 Introduction to Airflow
 Airflow Architecture
 Working Model of Airflow
 Installing Airflow
 Quick Tour of Airflow UI
 Quick Tour of Airflow CLI
 Setting Environment Variables and Starting the Web Server
 Setting Encryption to Secure Connection Secrets
 Configuration Option: Maximum Active Runs (Airflow Configuration)

Day 2 :::

 Airflow Configuration Overview

 Configuration Option: ORM Configuration
 Configuration Option: Maximum Active Runs Explained
 Configuration Option: Maximum Active Runs Explained (Continued)
 Configuration Options: Additional Configuration Settings
 Coding Your First Data Pipeline with Airflow
 DAG Explanation
 Time to Code Your First DAG (Python)

Day 3 :::

 Operator
 Let's use Operators Practically
 Operator Relationships and Bitshift Composition
 Adding dependencies
 How the Scheduler Works
 A Quick Play With Backfill and Catchup
 Workflow Description

 Developing Data Pipeline

 Hands on: Project Setup
 Hands on: Data Retrieval from File System
 Hands on: Merging DataFrames
 Hands on: Aggregation Using Pandas
 Hands on: Database Connectivity (PostgreSQL / MySQL)
 Hands on: Creating DAGs

Day 4 :::

 Databases and Executors

 Introduction: Sequential Executor with SQLite
 Local Executor with PostgreSQL
 Configure a DAG with Local Executor and PostgreSQL
 Celery Executor with PostgreSQL and RabbitMQ
 [Practice] Configure a DAG with Celery Executor, PostgreSQL and RabbitMQ

 Implementing Advanced Concepts in Airflow

 Introduction
 Minimising Repetitive Patterns With SubDAGs
 Minimising a DAG with SubDAGs
 How to Interact With External Sources Using Hooks
 Getting Results From PostgreSQL Using Hooks
 How to Share Data Between Your Tasks With XComs
 Sharing Your First Messages Using XComs
 How to Execute Tasks According To Criteria Using Branching
 Make Your First Conditional Task Using Branching
 Control Your Tasks With SLAs
 Defining an SLA in a DAG
 Airflow Sensors

Day 5 :::

 Creating Airflow Plugins with Elasticsearch and PostgreSQL

 Adding Functionalities to Apache Airflow
 Creating a Hook to Interact With Elasticsearch
 Creating a Transfer Operator PostgresqlToElasticsearch
 Adding a View to Apache Airflow UI

 Data Profiling in Airflow

 Ad Hoc Queries
 Querying Metadata Tables
 Charts in Airflow

 Executors
 Configure Local Executor
 Configure Celery Executor
 Service Level Agreements (SLAs)
 Security: Authentication, Roles, Encryption
 Write Logs to a Remote Location
 Monitor Airflow with StatsD, Prometheus and Grafana
 Managed Airflow Services
