Benchmarks
Graphalytics Benchmarks
This benchmark measures the performance of GraphFrames algorithms, not of Apache Spark itself. All graphs are therefore read from disk and persisted in memory in serialized format before timing starts. The results cover only the running time of the algorithms; the time spent reading the CSV, serializing, and persisting the data is not measured.
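The measurement boundary described above can be illustrated with a minimal, Spark-free sketch in Python. The names `measure`, `slow_setup`, and `fast_algorithm` are hypothetical and are not part of GraphFrames or of this benchmark's harness. The sketch also assumes that the reported time is a mean over repeated runs, which this page does not state. The only point is that the setup step (standing in for reading the CSV, serializing, and persisting) finishes before the timer starts.

```python
import time
from statistics import mean


def measure(setup, algorithm, measurements=3):
    """Return (mean time, individual times) of `algorithm`, excluding `setup`."""
    # Setup stands in for reading the graph and persisting it in memory.
    # It runs once, outside the timed region.
    data = setup()

    timings = []
    for _ in range(measurements):
        start = time.perf_counter()
        algorithm(data)  # only this call is timed
        timings.append(time.perf_counter() - start)
    return mean(timings), timings


def slow_setup():
    # Deliberately slow, to show that setup cost does not leak into the result.
    time.sleep(0.1)
    return list(range(1000))


def fast_algorithm(data):
    return sum(data)


mean_time, timings = measure(slow_setup, fast_algorithm, measurements=3)
print(f"measurements={len(timings)} mean={mean_time:.6f}s")
```

Even though the setup takes about 0.1 s here, the reported mean stays far below that, because the setup runs before the first timestamp is taken.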
Configurations
- Serializer: org.apache.spark.serializer.KryoSerializer
- GraphFrame checkpoints: localCheckpoints
- Spark Version: 4.0.0
- Scala Version: 2.13.16
- VM: standard GitHub Actions runner for open source projects.
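As a rough illustration, the serializer listed above corresponds to Spark's `spark.serializer` property, which can be passed to `spark-submit` through `--conf`. The sketch below only builds the argument list. The helper name `to_conf_args` is hypothetical, and the checkpoint setting is left out because this page does not say which property controls it.

```python
def to_conf_args(properties):
    """Turn a dict of Spark properties into spark-submit `--conf` arguments."""
    args = []
    for key, value in sorted(properties.items()):
        args += ["--conf", f"{key}={value}"]
    return args


spark_properties = {
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
}

conf_args = to_conf_args(spark_properties)
print(" ".join(conf_args))
```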
Graph: wiki-Talk
- Vertices: 2M
- Edges: 5M
- Size Category: XS
- Source files format: CSV-like
Algorithm | Measurements | Time (s) |
---|---|---|
Shortest Paths GraphFrames | 3 | 107.7918 |
Shortest Paths GraphFrames (Local Checkpoints) | 3 | 104.7972 |
Shortest Paths GraphX | 3 | 9.2388 |
Connected Components GraphFrames | 3 | 149.4867 |
Connected Components GraphX | 3 | 8.9026 |
Label Propagation GraphFrames | 3 | 107.0311 |
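To make the gap between the two implementations easier to read, the table rows that have both a GraphFrames and a GraphX entry can be turned into ratios. This is a small arithmetic sketch over the numbers above, not part of the benchmark itself; Label Propagation is omitted because the table has no GraphX row for it.

```python
# Times in seconds, copied from the table above.
times = {
    "Shortest Paths": {"GraphFrames": 107.7918, "GraphX": 9.2388},
    "Connected Components": {"GraphFrames": 149.4867, "GraphX": 8.9026},
}

# Ratio of GraphFrames time to GraphX time for each algorithm.
ratios = {
    algorithm: t["GraphFrames"] / t["GraphX"]
    for algorithm, t in times.items()
}

for algorithm, ratio in ratios.items():
    print(f"{algorithm}: GraphFrames takes {ratio:.1f}x the GraphX time")
```

On wiki-Talk this works out to roughly 11.7x for Shortest Paths and 16.8x for Connected Components.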