object GraphFrame extends Serializable with Logging
Value Members
-
final
def
!=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
##(): Int
- Definition Classes
- AnyRef → Any
-
final
def
==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
val
DST: String
Column name for destination vertices of edges.
- In GraphFrame.edges, this is a column of vertex IDs.
- In GraphFrame.triplets, this is a column of vertices with schema matching GraphFrame.vertices.
-
val
EDGE: String
Column name for edge in GraphFrame.triplets. In GraphFrame.triplets, this is a column of edges with schema matching GraphFrame.edges.
-
val
ID: String
Column name for vertex IDs in GraphFrame.vertices. Note that GraphFrame assigns a unique long ID to each vertex. If the vertex ID type is byte, int, long, or short, GraphFrame casts the original IDs to long and uses them as the unique long IDs; otherwise, GraphFrame generates the unique long IDs with the Spark function monotonically_increasing_id, which is less performant.
-
val
SRC: String
Column name for source vertices of edges.
- In GraphFrame.edges, this is a column of vertex IDs.
- In GraphFrame.triplets, this is a column of vertices with schema matching GraphFrame.vertices.
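The column-name constants ID, SRC, DST, and EDGE can stand in for string literals when building or querying a graph. The following is a minimal sketch under stated assumptions: the sample data, the "name" and "relationship" columns, the object name, and the local SparkSession are all made up for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col
import org.graphframes.GraphFrame

object ColumnNameConstantsExample {
  // Hypothetical helper: builds a two-vertex graph using the constants as column names.
  def buildGraph(spark: SparkSession): GraphFrame = {
    import spark.implicits._
    val vertices = Seq((1L, "Alice"), (2L, "Bob")).toDF(GraphFrame.ID, "name")
    val edges = Seq((1L, 2L, "follows"))
      .toDF(GraphFrame.SRC, GraphFrame.DST, "relationship")
    GraphFrame(vertices, edges)
  }

  // In triplets, SRC and DST are struct columns shaped like vertices,
  // and EDGE is a struct column shaped like edges, so nested fields can be selected.
  def describeTriplets(g: GraphFrame): DataFrame =
    g.triplets.select(
      col(s"${GraphFrame.SRC}.name").as("src_name"),
      col(s"${GraphFrame.EDGE}.relationship").as("relationship"),
      col(s"${GraphFrame.DST}.name").as("dst_name"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("graphframe-constants")
      .getOrCreate()
    val g = buildGraph(spark)
    // In edges, SRC and DST hold plain vertex IDs.
    g.edges.select(col(GraphFrame.SRC), col(GraphFrame.DST)).show()
    describeTriplets(g).show()
    spark.stop()
  }
}
```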
-
def
apply(vertices: DataFrame, edges: DataFrame): GraphFrame
Create a new GraphFrame from vertex and edge DataFrames.
- vertices
Vertex DataFrame. This must include a column "id" containing unique vertex IDs. All other columns are treated as vertex attributes.
- edges
Edge DataFrame. This must include columns "src" and "dst" containing source and destination vertex IDs. All other columns are treated as edge attributes.
- returns
New GraphFrame instance
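As an illustration of the column requirements described above, the sketch below constructs a GraphFrame from two small DataFrames. The data, the attribute columns "name", "age", and "relationship", the object name, and the local SparkSession are assumptions for this example only.

```scala
import org.apache.spark.sql.SparkSession
import org.graphframes.GraphFrame

object ApplyExample {
  // Hypothetical helper: builds a three-vertex graph from made-up data.
  def buildGraph(spark: SparkSession): GraphFrame = {
    import spark.implicits._
    // The vertex DataFrame needs an "id" column; "name" and "age" become vertex attributes.
    val vertices = Seq(("a", "Alice", 34), ("b", "Bob", 36), ("c", "Carol", 30))
      .toDF("id", "name", "age")
    // The edge DataFrame needs "src" and "dst"; "relationship" becomes an edge attribute.
    val edges = Seq(("a", "b", "friend"), ("b", "c", "follow"))
      .toDF("src", "dst", "relationship")
    GraphFrame(vertices, edges)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("graphframe-apply")
      .getOrCreate()
    val g = buildGraph(spark)
    g.vertices.show()
    g.edges.show()
    spark.stop()
  }
}
```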
-
final
def
asInstanceOf[T0]: T0
- Definition Classes
- Any
-
def
clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native() @HotSpotIntrinsicCandidate()
-
final
def
eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
def
equals(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
def
fromEdges(e: DataFrame): GraphFrame
Create a new GraphFrame from an edge DataFrame. The resulting GraphFrame will have GraphFrame.vertices with a single "id" column.
Note: The GraphFrame.vertices DataFrame will be persisted at level StorageLevel.MEMORY_AND_DISK.
- e
Edge DataFrame. This must include columns "src" and "dst" containing source and destination vertex IDs. All other columns are treated as edge attributes.
- returns
New GraphFrame instance
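When only an edge list is available, fromEdges can derive the vertex set. A minimal sketch follows; the edge data, the "weight" attribute column, the object name, and the local SparkSession are assumptions for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.graphframes.GraphFrame

object FromEdgesExample {
  // Hypothetical helper: builds a graph from a made-up weighted edge list.
  def buildGraph(spark: SparkSession): GraphFrame = {
    import spark.implicits._
    // Only "src" and "dst" are required; "weight" is kept as an edge attribute.
    val edges = Seq((1L, 2L, 0.5), (2L, 3L, 1.5)).toDF("src", "dst", "weight")
    GraphFrame.fromEdges(edges)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("graphframe-from-edges")
      .getOrCreate()
    val g = buildGraph(spark)
    // The derived vertices DataFrame has a single "id" column.
    g.vertices.printSchema()
    g.edges.show()
    spark.stop()
  }
}
```

Because the derived vertices DataFrame is persisted, a long-running application may want to unpersist it once the graph is no longer needed.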
-
def
fromGraphX[V, E](originalGraph: GraphFrame, graph: Graph[V, E], vertexNames: Seq[String] = Nil, edgeNames: Seq[String] = Nil)(implicit arg0: scala.reflect.api.JavaUniverse.TypeTag[V], arg1: scala.reflect.api.JavaUniverse.TypeTag[E]): GraphFrame
Given:
- a GraphFrame originalGraph
- a GraphX graph derived from the GraphFrame using GraphFrame.toGraphX
this method merges attributes from the GraphX graph into the original GraphFrame.
This method is useful for doing computations using the GraphX API and then merging the results with a GraphFrame. For example, given:
- GraphFrame originalGraph
- GraphX Graph[String, Int] graph with a String vertex attribute we want to call "category" and an Int edge attribute we want to call "count"
we can call fromGraphX(originalGraph, graph, Seq("category"), Seq("count")) to produce a new GraphFrame. The new GraphFrame will be an augmented version of originalGraph, with new GraphFrame.vertices column "category" and new GraphFrame.edges column "count" added.
See org.graphframes.examples.BeliefPropagation for example usage.
- V
the type of the vertex data
- E
the type of the edge data
- originalGraph
Original GraphFrame used to compute the GraphX graph.
- graph
GraphX graph. Vertex and edge attributes, if any, will be merged into the original graph as new columns. If the attributes are Product types such as tuples, then each element of the Product will be put in a separate column. If the attributes are other types, then the entire GraphX attribute will become a single new column.
- vertexNames
Column name(s) for vertex attributes in the GraphX graph. If there is no vertex attribute, this should be empty. If there is a singleton attribute, this should have a single column name. If the attribute is a Product type, this should be a list of names matching the order of the attribute elements.
- edgeNames
Column name(s) for edge attributes in the GraphX graph. If there is no edge attribute, this should be empty. If there is a singleton attribute, this should have a single column name. If the attribute is a Product type, this should be a list of names matching the order of the attribute elements.
- returns
Original graph augmented with vertex and edge attribute columns from the GraphX graph
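The "category" and "count" scenario described for this method might look roughly like the sketch below. It is an illustration under assumptions, not the library's reference usage: the even/odd labelling, the constant edge count, the sample data, and the object and helper names are all made up, and the GraphX graph is derived from the GraphFrame with toGraphX as the description requires.

```scala
import org.apache.spark.graphx.{Edge, Graph}
import org.apache.spark.sql.{Row, SparkSession}
import org.graphframes.GraphFrame

object FromGraphXMergeExample {
  // Hypothetical helper: computes new attributes in GraphX and merges them back.
  def augment(originalGraph: GraphFrame): GraphFrame = {
    // toGraphX keeps the original rows as vertex and edge attributes.
    val gx: Graph[Row, Row] = originalGraph.toGraphX
    // Made-up computation: a String per vertex and an Int per edge.
    val graph: Graph[String, Int] = gx
      .mapVertices[String]((id, _) => if (id % 2 == 0) "even" else "odd")
      .mapEdges[Int]((e: Edge[Row]) => 1)
    // One name per singleton attribute: "category" for vertices, "count" for edges.
    GraphFrame.fromGraphX(originalGraph, graph, Seq("category"), Seq("count"))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("graphframe-from-graphx-merge")
      .getOrCreate()
    import spark.implicits._
    val vertices = Seq((1L, "Alice"), (2L, "Bob")).toDF("id", "name")
    val edges = Seq((1L, 2L, "follows")).toDF("src", "dst", "relationship")
    val augmented = augment(GraphFrame(vertices, edges))
    augmented.vertices.show()
    augmented.edges.show()
    spark.stop()
  }
}
```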
-
def
fromGraphX[VD, ED](graph: Graph[VD, ED])(implicit arg0: scala.reflect.api.JavaUniverse.TypeTag[VD], arg1: scala.reflect.api.JavaUniverse.TypeTag[ED]): GraphFrame
Converts a GraphX Graph instance into a GraphFrame.
This converts each org.apache.spark.rdd.RDD in the Graph to a DataFrame using schema inference.
Vertex ID column names will be converted to "id" for the vertex DataFrame, and to "src" and "dst" for the edge DataFrame.
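A conversion from a standalone GraphX graph could be sketched as follows. The RDD contents, the object and helper names, and the local SparkSession are assumptions for illustration; the names of the inferred attribute columns are not stated here, so the sketch only prints the resulting schemas.

```scala
import org.apache.spark.graphx.{Edge, Graph}
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession
import org.graphframes.GraphFrame

object FromGraphXConvertExample {
  // Hypothetical helper: builds a small GraphX graph and converts it.
  def convert(spark: SparkSession): GraphFrame = {
    val sc = spark.sparkContext
    // Made-up data: String vertex attributes and Int edge attributes.
    val vertexRdd: RDD[(Long, String)] =
      sc.parallelize(Seq((1L, "Alice"), (2L, "Bob")))
    val edgeRdd: RDD[Edge[Int]] = sc.parallelize(Seq(Edge(1L, 2L, 7)))
    val graph: Graph[String, Int] = Graph(vertexRdd, edgeRdd)
    GraphFrame.fromGraphX(graph)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("graphframe-from-graphx-convert")
      .getOrCreate()
    val g = convert(spark)
    // Inspect the schemas produced by schema inference.
    g.vertices.printSchema()
    g.edges.printSchema()
    spark.stop()
  }
}
```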
-
final
def
getClass(): Class[_]
- Definition Classes
- AnyRef → Any
- Annotations
- @native() @HotSpotIntrinsicCandidate()
-
def
hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native() @HotSpotIntrinsicCandidate()
-
final
def
isInstanceOf[T0]: Boolean
- Definition Classes
- Any
-
def
logDebug(s: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logInfo(s: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logTrace(s: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logWarn(s: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
final
def
ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
final
def
notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native() @HotSpotIntrinsicCandidate()
-
final
def
notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native() @HotSpotIntrinsicCandidate()
-
final
def
synchronized[T0](arg0: ⇒ T0): T0
- Definition Classes
- AnyRef
-
def
toString(): String
- Definition Classes
- AnyRef → Any
-
final
def
wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
-
final
def
wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )