01. Introduction
- Why Apache Kafka?
- To decouple the data streams & systems
- What is Apache Kafka?
- Apache Kafka is a high-throughput distributed messaging system.
- Created at LinkedIn, now an open-source project maintained mainly by Confluent
- Horizontal Scalability:
- Can scale to 100s of brokers
- Can scale to millions of messages per second
- High performance (latency of less than 10 ms) -> real time
- Used by 200+ firms, including 35% of the Fortune 500, e.g., Airbnb, Netflix, LinkedIn.
- Use cases:
- Messaging System
- Activity Tracking
- Gathering metrics from many different locations
- Application log gathering
- Stream processing (with the Kafka Streams API or Spark)
- De-coupling of system dependencies
- Integration with Spark, Flink, Storm, Hadoop, and many other Big Data technologies
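
The decoupling idea above can be illustrated with a toy model. This is a minimal in-memory sketch, not Kafka's API: `ToyLog`, `ToyConsumer`, and the `page_views` topic are hypothetical names invented for illustration. It assumes only the general pattern that producers append to a shared log while each consumer tracks its own read position (offset), so neither side needs to know about the other.

```python
from collections import defaultdict


class ToyLog:
    """Hypothetical in-memory stand-in for a broker: one append-only list per topic."""

    def __init__(self):
        self._topics = defaultdict(list)

    def append(self, topic, message):
        """Producer side: add a message and return its offset in the topic."""
        log = self._topics[topic]
        log.append(message)
        return len(log) - 1

    def read(self, topic, offset):
        """Consumer side: return every message at or after the given offset."""
        return self._topics[topic][offset:]


class ToyConsumer:
    """Hypothetical consumer that keeps its own offset, independent of other consumers."""

    def __init__(self, log, topic):
        self._log = log
        self._topic = topic
        self._offset = 0

    def poll(self):
        """Return unread messages and advance this consumer's offset."""
        messages = self._log.read(self._topic, self._offset)
        self._offset += len(messages)
        return messages


log = ToyLog()
metrics = ToyConsumer(log, "page_views")
audit = ToyConsumer(log, "page_views")

# The producer writes without knowing who will read.
log.append("page_views", "user=1 path=/home")
log.append("page_views", "user=2 path=/cart")

first_batch = metrics.poll()   # metrics reads the two messages written so far
log.append("page_views", "user=1 path=/checkout")
second_batch = metrics.poll()  # metrics sees only the new message
audit_batch = audit.poll()     # audit starts at offset 0 and sees all three
```

Because the two consumers hold separate offsets, a slow or newly added reader does not affect the producer or the other reader, which is the property the use cases above rely on. A real deployment would use a Kafka client library and a running broker instead of this in-process list.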