Source: Dev Guides
Apache Flink - Flink vs Spark vs Hadoop
The following table gives a comprehensive comparison of the three most popular big data frameworks: Apache Flink, Apache Spark, and Apache Hadoop.
| | Apache Hadoop | Apache Spark | Apache Flink |
|---|---|---|---|
| Year of Origin | 2005 | 2009 | 2009 |
| Place of Origin | MapReduce (Google); Hadoop (Yahoo) | University of California, Berkeley | Technical University of Berlin |
| Data Processing Engine | Batch | Batch | Stream |
| Processing Speed | Slower than Spark and Flink | Up to 100x faster than Hadoop MapReduce (in-memory workloads, as claimed by the Spark project) | Faster than Spark |
| Programming Languages | Java, C, C++, Ruby, Groovy, Perl, Python | Java, Scala, Python and R | Java and Scala |
| Programming Model | MapReduce | Resilient Distributed Datasets (RDD) | Cyclic dataflows |
| Data Transfer | Batch | Batch | Pipelined and Batch |
| Memory Management | Disk Based | JVM Managed | Active Managed |
| Latency | High | Medium | Low |
| Throughput | Medium | High | High |
| Optimization | Manual | Manual | Automatic |
| API | Low-level | High-level | High-level |
| Streaming Support | NA | Spark Streaming | Flink Streaming |
| SQL Support | Hive, Impala | Spark SQL | Table API and SQL |
| Graph Support | NA | GraphX | Gelly |
| Machine Learning Support | NA | Spark MLlib | FlinkML |
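
The "Data Transfer" row contrasts batch transfer with pipelined transfer. The sketch below is only a single-process analogy in plain Python, not the API of Hadoop, Spark, or Flink. All function names (`batch_pipeline`, `parse_stream`, `double_stream`, `pipelined_pipeline`) are hypothetical and exist purely for illustration. In the batch version, each stage materializes its full output before the next stage starts. In the pipelined version, each record flows through both stages before the next record is read.

```python
from typing import Iterable, Iterator, List


def batch_pipeline(records: List[str], log: List[str]) -> List[int]:
    """Batch transfer: a stage finishes completely before the next one starts."""
    parsed = []
    for r in records:
        log.append(f"parse {r}")
        parsed.append(int(r))
    # The whole intermediate result is held in `parsed` at this point.
    doubled = []
    for n in parsed:
        log.append(f"double {n}")
        doubled.append(n * 2)
    return doubled


def parse_stream(records: Iterable[str], log: List[str]) -> Iterator[int]:
    """Pipelined stage 1: hand each parsed record downstream immediately."""
    for r in records:
        log.append(f"parse {r}")
        yield int(r)


def double_stream(numbers: Iterable[int], log: List[str]) -> Iterator[int]:
    """Pipelined stage 2: consume records one at a time as they arrive."""
    for n in numbers:
        log.append(f"double {n}")
        yield n * 2


def pipelined_pipeline(records: Iterable[str], log: List[str]) -> List[int]:
    """Chain the generator stages; no full intermediate result is built."""
    return list(double_stream(parse_stream(records, log), log))


if __name__ == "__main__":
    batch_log: List[str] = []
    pipe_log: List[str] = []
    print(batch_pipeline(["1", "2", "3"], batch_log))
    print(batch_log)
    print(pipelined_pipeline(["1", "2", "3"], pipe_log))
    print(pipe_log)
```

Both versions return the same result. Only the order of work differs: the batch log lists all `parse` steps before any `double` step, while the pipelined log interleaves them record by record. That interleaving is why pipelined engines can emit their first results sooner.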