trait RandomWalkBase extends Serializable with Logging with WithIntermediateStorageLevel
Base trait for implementing random walk algorithms on graph data. Provides common functionality for generating random walks across a graph structure.
Abstract Value Members
- abstract def runIter(graph: GraphFrame, prevIterationDF: Option[DataFrame], iterSeed: Long): DataFrame
Runs a single iteration of the random walk.
- graph
prepared graph
- prevIterationDF
DataFrame from previous iteration (if any)
- iterSeed
seed for this iteration
- returns
DataFrame result of this iteration
- Attributes
- protected
Concrete Value Members
- final def !=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def ##: Int
- Definition Classes
- AnyRef → Any
- def +(other: String): String
- Implicit
- This member is added by an implicit conversion from RandomWalkBase to any2stringadd[RandomWalkBase] performed by method any2stringadd in scala.Predef.
- Definition Classes
- any2stringadd
- def ->[B](y: B): (RandomWalkBase, B)
- Implicit
- This member is added by an implicit conversion from RandomWalkBase to ArrowAssoc[RandomWalkBase] performed by method ArrowAssoc in scala.Predef.
- Definition Classes
- ArrowAssoc
- Annotations
- @inline()
- final def ==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def asInstanceOf[T0]: T0
- Definition Classes
- Any
- val batchSize: Int
Size of each batch in the random walk process.
- Attributes
- protected
- def cleanUp(): Unit
Deletes all temporary files associated with this instance. This method uses the Hadoop FileSystem API to remove the directory containing the batch files for the current run ID. The temporary prefix must be set and must be accessible via the current SparkContext's Hadoop configuration.
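A minimal sketch of how cleanUp() might be paired with run(), using only members listed on this page. The import path org.graphframes.rw.RandomWalkBase, the object and method names, and outputPath are assumptions made for illustration. The sketch also assumes that the returned DataFrame may still read from the temporary batch files, so it writes the result out before deleting them.

```scala
import org.apache.spark.sql.DataFrame
import org.graphframes.rw.RandomWalkBase // assumed package path, not stated on this page

object RandomWalkCleanup {
  // Runs an already configured walker, stores the walks at outputPath
  // (a hypothetical destination), then removes the temporary batch files.
  def runAndCleanUp(walker: RandomWalkBase, outputPath: String): Unit = {
    val walks: DataFrame = walker.run()

    // Assumption: the result may be lazily backed by the temporary files,
    // so it is materialized to durable storage before anything is deleted.
    walks.write.mode("overwrite").parquet(outputPath)

    // Reached only if run() and the write succeeded; after a failure the
    // cached batches stay in place and remain available for a later resume.
    walker.cleanUp()
  }
}
```

Cleaning up only after success is a design choice in this sketch, not documented behaviour; a try/finally would also delete the batches that setRunId could otherwise reuse.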
- def clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.CloneNotSupportedException]) @IntrinsicCandidate() @native()
- def ensuring(cond: (RandomWalkBase) => Boolean, msg: => Any): RandomWalkBase
- Implicit
- This member is added by an implicit conversion from RandomWalkBase to Ensuring[RandomWalkBase] performed by method Ensuring in scala.Predef.
- Definition Classes
- Ensuring
- def ensuring(cond: (RandomWalkBase) => Boolean): RandomWalkBase
- Implicit
- This member is added by an implicit conversion from RandomWalkBase to Ensuring[RandomWalkBase] performed by method Ensuring in scala.Predef.
- Definition Classes
- Ensuring
- def ensuring(cond: Boolean, msg: => Any): RandomWalkBase
- Implicit
- This member is added by an implicit conversion from RandomWalkBase to Ensuring[RandomWalkBase] performed by method Ensuring in scala.Predef.
- Definition Classes
- Ensuring
- def ensuring(cond: Boolean): RandomWalkBase
- Implicit
- This member is added by an implicit conversion from RandomWalkBase to Ensuring[RandomWalkBase] performed by method Ensuring in scala.Predef.
- Definition Classes
- Ensuring
- final def eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- def equals(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef → Any
- final def getClass(): Class[_ <: AnyRef]
- Definition Classes
- AnyRef → Any
- Annotations
- @IntrinsicCandidate() @native()
- def getIntermediateStorageLevel: StorageLevel
Gets storage level for intermediate datasets that require multiple passes.
- Definition Classes
- WithIntermediateStorageLevel
- def getRunId(): String
Returns the current runID, whether it was generated or provided.
- val globalSeed: Long
Global random seed for reproducibility.
- Attributes
- protected
- val graph: GraphFrame
GraphFrame on which random walks are performed.
- Attributes
- protected
- def hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @IntrinsicCandidate() @native()
- val intermediateStorageLevel: StorageLevel
- Attributes
- protected
- Definition Classes
- WithIntermediateStorageLevel
- final def isInstanceOf[T0]: Boolean
- Definition Classes
- Any
- def logDebug(s: => String): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logInfo(s: => String): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logTrace(s: => String): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logWarn(s: => String): Unit
- Attributes
- protected
- Definition Classes
- Logging
- val maxNbrs: Int
Maximum number of neighbors to consider per vertex during random walks.
- Attributes
- protected
- final def ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- final def notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @IntrinsicCandidate() @native()
- final def notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @IntrinsicCandidate() @native()
- val numBatches: Int
Number of batches to run in the random walk process.
- Attributes
- protected
- val numWalksPerNode: Int
Number of random walks to generate per node.
- Attributes
- protected
- def onGraph(graph: GraphFrame): RandomWalkBase.this.type
Sets the graph to perform random walks on.
- graph
the GraphFrame to run random walks on
- returns
this RandomWalkBase instance for chaining
- def prepareGraph(iterationSeed: Long): GraphFrame
Prepares the graph for random walk by limiting neighbors and handling direction.
- returns
prepared GraphFrame
- Attributes
- protected
- def resultIsPersistent(): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def run(): DataFrame
Executes the random walk algorithm on the set graph.
- returns
DataFrame containing the random walks
- val runID: String
Unique identifier for the current random walk run.
- Attributes
- protected
- def setBatchSize(value: Int): RandomWalkBase.this.type
Sets the batch size.
- value
batch size
- returns
this RandomWalkBase instance for chaining
- def setGlobalSeed(value: Long): RandomWalkBase.this.type
Sets the global random seed.
- value
the seed value
- returns
this RandomWalkBase instance for chaining
- def setIntermediateStorageLevel(value: StorageLevel): RandomWalkBase.this.type
Sets storage level for intermediate datasets that require multiple passes (default: MEMORY_AND_DISK).
- Definition Classes
- WithIntermediateStorageLevel
- def setMaxNbrsPerVertex(value: Int): RandomWalkBase.this.type
Sets the maximum number of neighbors per vertex.
- value
the max number of neighbors
- returns
this RandomWalkBase instance for chaining
- def setNumBatches(value: Int): RandomWalkBase.this.type
Sets the number of batches.
- value
number of batches
- returns
this RandomWalkBase instance for chaining
- def setNumWalksPerNode(value: Int): RandomWalkBase.this.type
Sets the number of walks per node.
- value
number of walks
- returns
this RandomWalkBase instance for chaining
- def setRunId(value: String): RandomWalkBase.this.type
Sets the random walk runID. If provided, cached batches from an existing random walk run are reused. The caller must make sure that the temporary prefix points to the right location and that the cached data starting from the set batch index exists.
- def setStartingFromBatch(value: Int): RandomWalkBase.this.type
Sets the starting batch index for continuous mode. See the setRunId comment for details.
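The interplay of setRunId, setStartingFromBatch and setTemporaryPrefix can be sketched as follows. This is an illustration under assumptions: the import path org.graphframes.rw.RandomWalkBase is assumed, the object, method and parameter names are hypothetical, and the page does not state whether batch indices start at 0 or 1, so startBatch is passed through unchanged.

```scala
import org.apache.spark.sql.DataFrame
import org.graphframes.GraphFrame
import org.graphframes.rw.RandomWalkBase // assumed package path, not stated on this page

object RandomWalkResume {
  // Continues an earlier run by reusing its cached batches.
  // previousRunId would typically come from getRunId() on the earlier run.
  def resume(
      walker: RandomWalkBase,
      graph: GraphFrame,
      tmpPrefix: String,     // must be the prefix the earlier run wrote to
      previousRunId: String,
      startBatch: Int        // indexing convention is not documented here
  ): DataFrame =
    walker
      .onGraph(graph)
      .setTemporaryPrefix(tmpPrefix)
      .setRunId(previousRunId)
      .setStartingFromBatch(startBatch)
      .run()
}
```

The sketch assumes that the walker otherwise carries the same settings (seed, batch size, number of batches) as the interrupted run; the page does not say what happens when they differ.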
- def setTemporaryPrefix(value: String): RandomWalkBase.this.type
Sets the temporary prefix for storing intermediate results.
- value
the prefix string
- returns
this RandomWalkBase instance for chaining
- def setUseEdgeDirection(value: Boolean): RandomWalkBase.this.type
Sets whether to use edge direction.
- value
true if the graph is directed
- returns
this RandomWalkBase instance for chaining
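Because the setters return the instance itself, a configuration can be written as a single chain. The following is a minimal sketch that works with any concrete implementation of this trait; the import path org.graphframes.rw.RandomWalkBase, the object and method names, and all numeric values are assumptions chosen for illustration, not recommended settings.

```scala
import org.apache.spark.storage.StorageLevel
import org.graphframes.GraphFrame
import org.graphframes.rw.RandomWalkBase // assumed package path, not stated on this page

object RandomWalkSetup {
  // Applies an illustrative configuration and returns the same walker,
  // typed as the concrete implementation W so callers keep its full API.
  def configure[W <: RandomWalkBase](walker: W, graph: GraphFrame, tmpPrefix: String): W =
    walker
      .onGraph(graph)                 // graph to walk on
      .setTemporaryPrefix(tmpPrefix)  // where intermediate results are stored
      .setNumWalksPerNode(5)          // placeholder value
      .setNumBatches(5)               // placeholder value
      .setBatchSize(10)               // placeholder value
      .setMaxNbrsPerVertex(50)        // placeholder value
      .setUseEdgeDirection(false)     // treat the graph as undirected
      .setGlobalSeed(42L)             // fixed seed for reproducibility
      .setIntermediateStorageLevel(StorageLevel.MEMORY_AND_DISK)
}
```

The chain type-checks because each setter is declared with a `this.type` result, so the final expression still has the singleton type of `walker` and conforms to `W`. Calling `run()` on the returned walker then executes the walk.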
- val startingIteration: Int
Starting batch index for continuous mode.
- Attributes
- protected
- final def synchronized[T0](arg0: => T0): T0
- Definition Classes
- AnyRef
- val temporaryPrefix: Option[String]
Optional prefix for temporary storage during random walks.
- Attributes
- protected
- def toString(): String
- Definition Classes
- AnyRef → Any
- val useEdgeDirection: Boolean
Whether to respect edge direction in the graph (true for directed graphs).
- Attributes
- protected
- final def wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException]) @native()
- final def wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
Deprecated Value Members
- def finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.Throwable]) @Deprecated
- Deprecated
(Since version 9)
- def formatted(fmtstr: String): String
- Implicit
- This member is added by an implicit conversion from RandomWalkBase to StringFormat[RandomWalkBase] performed by method StringFormat in scala.Predef.
- Definition Classes
- StringFormat
- Annotations
- @deprecated @inline()
- Deprecated
(Since version 2.12.16) Use formatString.format(value) instead of value.formatted(formatString), or use the f"" string interpolator. In Java 15 and later, formatted resolves to the new method in String which has reversed parameters.
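The two replacements named in the deprecation note can be sketched with plain Scala, independent of this trait. The object name and the sample value are hypothetical.

```scala
object FormattedMigration {
  val value: Int = 42

  // Replacement 1: call format on the format string instead of
  // calling the deprecated formatted on the value.
  val viaFormat: String = "%05d".format(value)

  // Replacement 2: the f"" interpolator with the same format specifier.
  val viaInterpolator: String = f"$value%05d"

  def main(args: Array[String]): Unit = {
    println(viaFormat)       // 00042
    println(viaInterpolator) // 00042
  }
}
```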
- def →[B](y: B): (RandomWalkBase, B)
- Implicit
- This member is added by an implicit conversion from RandomWalkBase to ArrowAssoc[RandomWalkBase] performed by method ArrowAssoc in scala.Predef.
- Definition Classes
- ArrowAssoc
- Annotations
- @deprecated
- Deprecated
(Since version 2.13.0) Use -> instead. If you still wish to display it as one character, consider using a font with programming ligatures such as Fira Code.