Data Engineering using Kafka and Spark Structured Streaming
Learn how to build streaming pipelines using Kafka and Spark Structured Streaming in this comprehensive course. Set up your own self-supported lab with Hadoop, Hive, Spark, and Kafka on a Linux-based system. Discover how to create Kafka topics, produce and consume messages, and use Kafka Connect to ingest data from web server logs. Dive into Spark Structured Streaming and integrate it with Kafka to process data and write it to different targets. Plus, learn how to handle incremental data processing. With Udemy-based support, any technical challenges you encounter will be resolved within 48 hours.
Course Feature
Cost: Paid
Provider: Udemy
Certificate: Paid Certification
Language: English
Start Date: 2022-12-13
Course Overview
The content presented here is sourced directly from the Udemy platform. For full course details, including enrollment information, use the 'Go to class' link on our website.
Updated on September 5, 2023
Skills and Knowledge Acquisition:
Participants in the "Data Engineering using Kafka and Spark Structured Streaming" course will acquire the following skills and knowledge:
Environment Setup: Learn how to set up a self-supported lab environment with Hadoop, Hive, Spark, and Kafka on a single-node Linux-based system, providing the foundation for data engineering tasks.
Kafka Fundamentals: Gain a deep understanding of Kafka, including creating Kafka topics, producing and consuming messages, and using Kafka Connect for data ingestion from web server logs into Kafka topics.
Data Ingestion: Explore data ingestion in both directions: bringing data from web server logs into Kafka topics, and moving data from Kafka topics into HDFS, which acts as the sink.
Spark Structured Streaming: Understand the key concepts of Spark Structured Streaming, a powerful framework for real-time data processing.
Streaming Pipeline Development: Develop streaming pipelines that consume data from Kafka topics using Spark Structured Streaming, process the data, and write it to different target destinations.
Incremental Data Processing: Learn how to handle incremental data processing efficiently using Spark Structured Streaming.
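As a rough illustration of the kind of pipeline these topics describe, the sketch below reads a Kafka topic with Spark Structured Streaming and appends the messages to Parquet files. It is not taken from the course material. The function names, topic name, broker address, and paths are hypothetical, and it assumes an existing SparkSession started with the Spark Kafka connector package available.

```python
from typing import Dict


def kafka_source_options(bootstrap_servers: str, topic: str) -> Dict[str, str]:
    """Build the options for Spark's built-in Kafka source (hypothetical helper)."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        # Used only on the first run; later runs resume from the checkpoint.
        "startingOffsets": "earliest",
    }


def start_pipeline(spark, bootstrap_servers, topic, target_dir, checkpoint_dir):
    """Stream messages from a Kafka topic into Parquet files.

    `spark` is assumed to be an existing SparkSession with the Kafka
    connector on its classpath. Returns the running StreamingQuery.
    """
    raw = (
        spark.readStream
        .format("kafka")
        .options(**kafka_source_options(bootstrap_servers, topic))
        .load()
    )
    # Kafka delivers the message value as binary, so cast it to a string.
    messages = raw.selectExpr("CAST(value AS STRING) AS value", "timestamp")
    return (
        messages.writeStream
        .format("parquet")
        .option("path", target_dir)
        # The checkpoint records consumed offsets, so a restarted query
        # processes only data that arrived since the previous run.
        .option("checkpointLocation", checkpoint_dir)
        .outputMode("append")
        .start()
    )


# Example call (placeholder values, requires a running Kafka broker):
# query = start_pipeline(spark, "localhost:9092", "retail_logs",
#                        "/tmp/retail_logs_out", "/tmp/retail_logs_chk")
```

The checkpoint location is what makes the load incremental in this sketch: stopping and restarting the query with the same checkpoint directory continues from the last committed Kafka offsets instead of rereading the topic.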
Course Contribution to Professional Growth:
This course offers significant contributions to professional growth:
Data Engineering Proficiency: Participants will build proficiency in constructing streaming data pipelines, a skill in high demand across industries.
Hands-on Experience: The course provides hands-on experience in setting up the environment and working with Kafka and Spark Structured Streaming, enhancing practical skills.
Real-world Application: Learning to build streaming pipelines prepares professionals for real-world data engineering tasks, making them valuable contributors to data-centric projects.
Problem-Solving Skills: Participants will develop problem-solving skills related to data engineering challenges and gain the ability to design and implement efficient data processing solutions.
Suitability for Preparing Further Education:
The "Data Engineering using Kafka and Spark Structured Streaming" course is suitable for individuals preparing for further education or seeking to deepen their knowledge in the field of data engineering:
Graduate Studies: Students pursuing advanced degrees in data engineering, computer science, or related fields can use this course as a foundation for deeper exploration of data engineering technologies.
Certification: Those planning to pursue certifications related to data engineering or real-time data processing can benefit from this course as a preparation resource.
Professional Development: IT professionals looking to expand their knowledge of data engineering, Kafka, and Spark Structured Streaming can use this course to enhance their expertise and prepare for further career advancement.
Course Syllabus
Introduction
Getting Started with Kafka
Data Ingestion using Kafka Connect
Overview of Spark Structured Streaming
Kafka and Spark Structured Streaming Integration
Incremental Loads using Spark Structured Streaming
Setting up Environment using AWS Cloud9
Setting up Environment - Overview of GCP and Provision Ubuntu VM
Setup Single Node Hadoop Cluster
Setup Hive and Spark
Setup Single Node Kafka Cluster