class PageRank extends Arguments
PageRank algorithm implementation. There are two implementations of PageRank.
The first one uses the org.apache.spark.graphx.graph
interface with aggregateMessages
and
runs PageRank for a fixed number of iterations. This can be executed by setting maxIter
.
Conceptually, the algorithm does the following:
var PR = Array.fill(n)( 1.0 ) val oldPR = Array.fill(n)( 1.0 ) for( iter <- 0 until maxIter ) { swap(oldPR, PR) for( i <- 0 until n ) { PR[i] = alpha + (1 - alpha) * inNbrs[i].map(j => oldPR[j] / outDeg[j]).sum } }
The second implementation uses the org.apache.spark.graphx.Pregel
interface and runs PageRank
until convergence and this can be run by setting tol
. Conceptually, the algorithm does the
following:
var PR = Array.fill(n)( 1.0 ) val oldPR = Array.fill(n)( 0.0 ) while( max(abs(PR - oldPr)) > tol ) { swap(oldPR, PR) for( i <- 0 until n if abs(PR[i] - oldPR[i]) > tol ) { PR[i] = alpha + (1 - \alpha) * inNbrs[i].map(j => oldPR[j] / outDeg[j]).sum } }
alpha
is the random reset probability (typically 0.15), inNbrs[i]
is the set of neighbors
which link to i
and outDeg[j]
is the out degree of vertex j
.
Note that this is not the "normalized" PageRank and as a consequence pages that have no inlinks will have a PageRank of alpha. In particular, the pageranks may have some values greater than 1.
The resulting vertices DataFrame contains one additional column:
- pagerank (
DoubleType
): the pagerank of this vertex
The resulting edges DataFrame contains one additional column:
- weight (
DoubleType
): the normalized weight of this edge after running PageRank
- Alphabetic
- By Inheritance
- PageRank
- Arguments
- AnyRef
- Any
- by any2stringadd
- by StringFormat
- by Ensuring
- by ArrowAssoc
- Hide All
- Show All
- Public
- Protected
Value Members
- final def !=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def ##: Int
- Definition Classes
- AnyRef → Any
- def +(other: String): String
- def ->[B](y: B): (PageRank, B)
- final def ==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def asInstanceOf[T0]: T0
- Definition Classes
- Any
- def clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.CloneNotSupportedException]) @IntrinsicCandidate() @native()
- def ensuring(cond: (PageRank) => Boolean, msg: => Any): PageRank
- def ensuring(cond: (PageRank) => Boolean): PageRank
- def ensuring(cond: Boolean, msg: => Any): PageRank
- def ensuring(cond: Boolean): PageRank
- final def eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- def equals(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef → Any
- final def getClass(): Class[_ <: AnyRef]
- Definition Classes
- AnyRef → Any
- Annotations
- @IntrinsicCandidate() @native()
- def hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @IntrinsicCandidate() @native()
- final def isInstanceOf[T0]: Boolean
- Definition Classes
- Any
- def maxIter(value: Int): PageRank.this.type
- final def ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- final def notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @IntrinsicCandidate() @native()
- final def notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @IntrinsicCandidate() @native()
- def resetProbability(value: Double): PageRank.this.type
Reset probability "alpha"
- def run(): GraphFrame
- def sourceId(value: Any): PageRank.this.type
Source vertex for a Personalized Page Rank (optional)
- final def synchronized[T0](arg0: => T0): T0
- Definition Classes
- AnyRef
- def toString(): String
- Definition Classes
- AnyRef → Any
- def tol(value: Double): PageRank.this.type
- final def wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException]) @native()
- final def wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
Deprecated Value Members
- def finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.Throwable]) @Deprecated
- Deprecated
(Since version 9)
- def formatted(fmtstr: String): String
- Implicit
- This member is added by an implicit conversion from PageRank toStringFormat[PageRank] performed by method StringFormat in scala.Predef.
- Definition Classes
- StringFormat
- Annotations
- @deprecated @inline()
- Deprecated
(Since version 2.12.16) Use
formatString.format(value)
instead ofvalue.formatted(formatString)
, or use thef""
string interpolator. In Java 15 and later,formatted
resolves to the new method in String which has reversed parameters.
- def →[B](y: B): (PageRank, B)
- Implicit
- This member is added by an implicit conversion from PageRank toArrowAssoc[PageRank] performed by method ArrowAssoc in scala.Predef.
- Definition Classes
- ArrowAssoc
- Annotations
- @deprecated
- Deprecated
(Since version 2.13.0) Use
->
instead. If you still wish to display it as one character, consider using a font with programming ligatures such as Fira Code.