Welcome to this ready-to-run repository to get you started with the Apache Kafka provider for Apache Airflow! 🚀
This repository assumes you have basic knowledge of Apache Kafka and Apache Airflow. You can find resources on these tools in the Resources section below.
This repository contains 3 DAGs:

- `produce_consume_treats`: This DAG will produce `NUMBER_OF_TREATS` messages to a local Kafka cluster. Run it manually to produce and consume new messages.
- `listen_to_the_stream`: This DAG will continuously listen to a topic in a local Kafka cluster and run the `event_triggered_function` whenever a message causes the `apply_function` to return a value. Unpause this DAG to have it run continuously.
- `walking_your_pet`: This DAG is the downstream DAG the given `event_triggered_function` in the `listen_to_the_stream` DAG will trigger. Unpause this DAG to have it ready to be triggered by a `TriggerDagRunOperator` in the upstream DAG.
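Under the hood, these DAGs use operators from the Apache Kafka Airflow provider (`apache-airflow-providers-apache-kafka`). The sketch below shows the general produce/consume pattern, not the repository's exact code: the connection id `kafka_default`, the topic `my_topic`, and the message contents are assumptions.

```python
from pendulum import datetime

from airflow.decorators import dag
from airflow.providers.apache.kafka.operators.consume import ConsumeFromTopicOperator
from airflow.providers.apache.kafka.operators.produce import ProduceToTopicOperator


def producer_function():
    # The producer function is a generator yielding (key, value) pairs;
    # each pair becomes one message on the topic.
    for i in range(5):
        yield (f"pet_{i}", f"treat number {i}")


def consumer_function(message):
    # Called once per consumed message; `message` is a confluent-kafka Message.
    print(f"Consumed: {message.value()}")


@dag(start_date=datetime(2024, 1, 1), schedule=None, catchup=False)
def produce_consume_sketch():
    produce = ProduceToTopicOperator(
        task_id="produce_treats",
        kafka_config_id="kafka_default",  # Airflow connection to the Kafka cluster
        topic="my_topic",
        producer_function=producer_function,
    )

    consume = ConsumeFromTopicOperator(
        task_id="consume_treats",
        kafka_config_id="kafka_default",
        topics=["my_topic"],
        apply_function=consumer_function,
        max_messages=20,  # stop after reading at most 20 messages
    )

    produce >> consume


produce_consume_sketch()
```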
This repository is designed to spin up both a local Kafka cluster and a local Astro project and connect them automatically. Note that the Kafka cluster sometimes takes a minute longer than the Airflow project to be fully started.
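If tasks fail with broker connection errors right after startup, the cluster is likely still booting. Here is a minimal readiness check you can run from a Python shell, assuming the `confluent-kafka` client is available; the broker address is a guess, so check `docker-compose.override.yml` for the real one.

```python
# Quick broker readiness check (a sketch, not part of the repository).
from confluent_kafka.admin import AdminClient

# The bootstrap address below is an assumption; look up the actual
# address in docker-compose.override.yml.
admin = AdminClient({"bootstrap.servers": "kafka:19092"})

# list_topics raises a KafkaException if no broker answers within the timeout.
metadata = admin.list_topics(timeout=5)
print(f"Broker is up. Topics: {sorted(metadata.topics)}")
```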
Run this Airflow project without installing anything locally.
- Fork this repository.
- Create a new GitHub Codespaces project on your fork. Make sure it uses at least 4 cores!
- After creating the Codespaces project, the Astro CLI will automatically start up all necessary Airflow components as well as the local Kafka cluster, using the instructions in the `docker-compose.override.yml` file. This can take a few minutes.
- Once the Airflow project has started, access the Airflow UI by clicking on the **Ports** tab and opening the forwarded URL for port 8080. You can log in with `admin` as the username and `admin` as the password.
- Unpause all DAGs. Manually run the `produce_consume_treats` DAG to see the pipeline in action. Note that a random function generates parts of the message sent to Kafka, which determines whether the `listen_for_mood` task will trigger the downstream `walking_your_pet` DAG; a sketch of that trigger mechanism follows this list. You might need to run `produce_consume_treats` several times to see the full pipeline in action!
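For reference, here is a hedged sketch of the trigger mechanism mentioned in the last step: the `AwaitMessageTriggerFunctionSensor` evaluates an `apply_function` against every message and, whenever it returns a value, calls the `event_triggered_function`, which can start `walking_your_pet` via a `TriggerDagRunOperator`. The function bodies, module path, and topic name below are illustrative, not the repository's exact code.

```python
from pendulum import datetime

from airflow.decorators import dag
from airflow.operators.trigger_dagrun import TriggerDagRunOperator
from airflow.providers.apache.kafka.sensors.kafka import (
    AwaitMessageTriggerFunctionSensor,
)


def wait_for_mood(message):
    # apply_function: returning a truthy value fires the event. The sensor
    # runs this on the triggerer, so it is referenced as a dotted string
    # path to an importable module rather than passed as a callable.
    value = message.value()
    if value and "happy" in value.decode():
        return value.decode()


def trigger_walk(event, **context):
    # event_triggered_function: receives the value returned by the
    # apply_function plus the task context.
    print(f"Got event: {event}")
    TriggerDagRunOperator(
        task_id="trigger_walking_your_pet",
        trigger_dag_id="walking_your_pet",
    ).execute(context)


# "@continuous" scheduling requires Airflow 2.6+ and max_active_runs=1.
@dag(
    start_date=datetime(2024, 1, 1),
    schedule="@continuous",
    max_active_runs=1,
    catchup=False,
)
def listen_sketch():
    AwaitMessageTriggerFunctionSensor(
        task_id="listen_for_mood",
        kafka_config_id="kafka_default",
        topics=["my_topic"],
        # Assumed module path; point this at wherever wait_for_mood lives.
        apply_function="dags.listen_sketch.wait_for_mood",
        event_triggered_function=trigger_walk,
    )


listen_sketch()
```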
Download the Astro CLI to run Airflow locally in Docker. `astro` is the only package you will need to install.
- Run `git clone https://github.com/astronomer/airflow-quickstart.git` on your computer to create a local clone of this repository.
- Install the Astro CLI by following the steps in the Astro CLI documentation. Docker Desktop/Docker Engine is a prerequisite, but you don't need in-depth Docker knowledge to run Airflow with the Astro CLI.
- Run `astro dev start` in your cloned repository.
- After your Astro project has started, view the Airflow UI at `localhost:8080`. You can log in with `admin` as the username and `admin` as the password.
- Unpause all DAGs. Manually run the `produce_consume_treats` DAG to see the pipeline in action. Note that a random function generates parts of the message sent to Kafka, which determines whether the `listen_for_mood` task will trigger the downstream `walking_your_pet` DAG; a toy version of that producer follows this list. You might need to run `produce_consume_treats` several times to see the full pipeline in action!
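The randomness mentioned in the last step is why multiple runs may be needed: only some randomly generated messages satisfy the listener's `apply_function`. A toy illustration of such a producer (moods, weights, and names are made up, not the repository's values):

```python
import random


def pick_mood():
    # Illustrative moods and weights; only "happy" would satisfy an
    # apply_function like the one sketched above.
    return random.choices(["happy", "sleepy", "hungry"], weights=[1, 2, 2])[0]


def producer_function(pet_name="Piglet"):
    # Yields a single (key, value) message whose content depends on chance,
    # so some DAG runs trigger the downstream DAG and others don't.
    yield (pet_name, f"{pet_name} is feeling {pick_mood()} today!")
```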

