Benchmarks

Graphalytics Benchmarks

This benchmark measures the performance of GraphFrames algorithms, not of Apache Spark itself. All graphs are therefore read from Parquet files on disk and persisted in memory in serialized format before timing starts. Only the running time of the algorithms is measured; the time to read and parse the source files and to serialize and persist the data is excluded.
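The split between untimed preparation and timed execution can be sketched as below. This is a minimal illustration in plain Python, not the harness actually used for these numbers: the function name `time_algorithm` and its `setup`, `run`, and `measurements` parameters are hypothetical, and a toy in-memory adjacency map stands in for a persisted graph.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def time_algorithm(
    setup: Callable[[], T],
    run: Callable[[T], object],
    measurements: int = 3,
) -> List[float]:
    """Time `run` several times while keeping `setup` out of the measurement.

    Hypothetical helper for illustration only. In a Spark setting, `setup`
    would load and persist the graph, and `run` would call the algorithm and
    force evaluation, since Spark transformations are lazy.
    """
    # Preparation happens once and is deliberately NOT timed.
    graph = setup()

    timings = []
    for _ in range(measurements):
        start = time.perf_counter()
        run(graph)  # only this call is inside the timed region
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    # Toy stand-in for a loaded graph: vertex id -> list of neighbour ids.
    def load_toy_graph():
        return {1: [2, 3], 2: [3], 3: []}

    # Toy stand-in for an algorithm: count the edges.
    def count_edges(graph):
        return sum(len(neighbours) for neighbours in graph.values())

    for seconds in time_algorithm(load_toy_graph, count_edges):
        print(f"{seconds:.6f} s")
```

Passing the preparation step as a separate callable makes it explicit which work sits outside the timed region, which is the property the description above relies on.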

Configurations

Graph: wiki-Talk

| Algorithm | Measurements | Time (s) |
|---|---|---|
| Shortest Paths GraphFrames | 3 | 50.2069 |
| Shortest Paths GraphX | 3 | 18.2139 |
| Connected Components GraphFrames | 3 | 29.5346 |
| Connected Components GraphX | 3 | 17.6575 |
| Label Propagation GraphFrames | 3 | 68.2840 |
| Label Propagation GraphX | 3 | 110.6229 |