class StructureAwareLabelPropagation extends Arguments with WithCheckpointInterval with WithMaxIter with WithLocalCheckpoints with WithIntermediateStorageLevel with WithDirection with WithLgNomEntries with Logging
Neighborhood-aware community detection via weighted label propagation.
This algorithm is a Label Propagation variant where each incoming label vote is weighted by a combination of:
- an optional direct-link baseline strength (enabled unless ignoreDirectLinks = true), and
- a neighborhood-overlap strength (structuralSimilarityMultiplier * commonNeighbors).
Intuitively, labels from neighbors that are structurally similar to the destination (many common neighbors) can be amplified, instead of treating all edges equally.
At each iteration, every vertex aggregates weighted incoming votes by label and picks the label with maximum total weight.
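The voting step for a single vertex can be sketched in plain Scala as below. This is only an illustration, not the library's Pregel code: `VoteSketch` and `pickLabel` are hypothetical names, the vote weights are assumed to be already computed, and the handling of ties and of vertices with no incoming votes is an assumption, since this page does not specify either.

```scala
object VoteSketch {
  // Each vote is (label, weight) from one incoming neighbor.
  // Returns the label with the largest summed weight, or None when the
  // vertex received no votes. Tie-breaking here is arbitrary (assumption).
  def pickLabel[L](votes: Seq[(L, Double)]): Option[L] =
    if (votes.isEmpty) None
    else {
      val totals = votes
        .groupBy(_._1)
        .map { case (label, vs) => (label, vs.map(_._2).sum) }
      Some(totals.maxBy(_._2)._1)
    }
}
```

For example, `VoteSketch.pickLabel(Seq(("a", 1.0), ("b", 2.5), ("a", 1.0)))` yields `Some("b")`, because the two votes for `a` sum to 2.0, which is less than 2.5.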
Main hyperparameters:
- maxIter (required): maximum number of propagation rounds.
- ignoreDirectLinks (default false): whether to drop direct-link baseline vote mass.
- structuralSimilarityMultiplier (default 0.5): scales neighborhood-overlap contribution.
Edge-weight regimes:
- ignoreDirectLinks = false:
edgeWeight(src, dst) = 1 + structuralSimilarityMultiplier * commonNeighbors(src, dst)
- ignoreDirectLinks = true:
edgeWeight(src, dst) = structuralSimilarityMultiplier * commonNeighbors(src, dst)
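The two regimes can be written as one small function. The following is a minimal sketch, not the library's implementation: `EdgeWeightSketch` is a hypothetical name, and the common-neighbor count is passed in as a `Double` on the assumption that it comes from an approximate estimate.

```scala
object EdgeWeightSketch {
  // commonNeighbors: (approximate) number of neighbors shared by src and dst.
  // multiplier: structuralSimilarityMultiplier, documented as non-negative.
  def edgeWeight(
      commonNeighbors: Double,
      multiplier: Double = 0.5,
      ignoreDirectLinks: Boolean = false): Double = {
    require(multiplier >= 0.0, "structuralSimilarityMultiplier must be non-negative")
    // The direct-link baseline is 1.0 unless direct links are ignored.
    val baseline = if (ignoreDirectLinks) 0.0 else 1.0
    baseline + multiplier * commonNeighbors
  }
}
```

With the default multiplier of 0.5 and three common neighbors, the weight is 2.5 when direct links are included and 1.5 when they are ignored. Note that with `ignoreDirectLinks = true`, an edge whose endpoints share no neighbors contributes a weight of 0.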
This implementation is inspired by neighborhood-strength-driven label propagation ideas from: Xie, Jierui, and Boleslaw K. Szymanski. "Community detection using a neighborhood strength driven label propagation algorithm." 2011 IEEE Network Science Workshop. IEEE, 2011.
Note: this implementation does not strictly reproduce the paper; it adopts the core idea of modulating label votes with a common-neighbor term within the GraphFrames/Pregel design.
Value Members
- final def !=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def ##: Int
- Definition Classes
- AnyRef → Any
- def +(other: String): String
- Implicit
- This member is added by an implicit conversion from StructureAwareLabelPropagation to any2stringadd[StructureAwareLabelPropagation] performed by method any2stringadd in scala.Predef.
- Definition Classes
- any2stringadd
- def ->[B](y: B): (StructureAwareLabelPropagation, B)
- Implicit
- This member is added by an implicit conversion from StructureAwareLabelPropagation to ArrowAssoc[StructureAwareLabelPropagation] performed by method ArrowAssoc in scala.Predef.
- Definition Classes
- ArrowAssoc
- Annotations
- @inline()
- final def ==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def asInstanceOf[T0]: T0
- Definition Classes
- Any
- val checkpointInterval: Int
- Attributes
- protected
- Definition Classes
- WithCheckpointInterval
- def clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.CloneNotSupportedException]) @IntrinsicCandidate() @native()
- def ensuring(cond: (StructureAwareLabelPropagation) => Boolean, msg: => Any): StructureAwareLabelPropagation
- Implicit
- This member is added by an implicit conversion from StructureAwareLabelPropagation to Ensuring[StructureAwareLabelPropagation] performed by method Ensuring in scala.Predef.
- Definition Classes
- Ensuring
- def ensuring(cond: (StructureAwareLabelPropagation) => Boolean): StructureAwareLabelPropagation
- Implicit
- This member is added by an implicit conversion from StructureAwareLabelPropagation to Ensuring[StructureAwareLabelPropagation] performed by method Ensuring in scala.Predef.
- Definition Classes
- Ensuring
- def ensuring(cond: Boolean, msg: => Any): StructureAwareLabelPropagation
- Implicit
- This member is added by an implicit conversion from StructureAwareLabelPropagation to Ensuring[StructureAwareLabelPropagation] performed by method Ensuring in scala.Predef.
- Definition Classes
- Ensuring
- def ensuring(cond: Boolean): StructureAwareLabelPropagation
- Implicit
- This member is added by an implicit conversion from StructureAwareLabelPropagation to Ensuring[StructureAwareLabelPropagation] performed by method Ensuring in scala.Predef.
- Definition Classes
- Ensuring
- final def eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- def equals(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef → Any
- def getCheckpointInterval: Int
Gets checkpoint interval.
- Definition Classes
- WithCheckpointInterval
- final def getClass(): Class[_ <: AnyRef]
- Definition Classes
- AnyRef → Any
- Annotations
- @IntrinsicCandidate() @native()
- def getIntermediateStorageLevel: StorageLevel
Gets storage level for intermediate datasets that require multiple passes.
- Definition Classes
- WithIntermediateStorageLevel
- def getIsDirected: Boolean
Gets whether the graph should be treated as directed.
- returns
true if directed
- Definition Classes
- WithDirection
- def getLgNomEntries: Int
Gets log2 of nominal entries used by Theta sketch aggregations.
- Definition Classes
- WithLgNomEntries
- def getUseLocalCheckpoints: Boolean
Gets whether local checkpoints are being used instead of regular checkpoints.
- returns
true if local checkpoints are enabled, false otherwise
- Definition Classes
- WithLocalCheckpoints
- def hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @IntrinsicCandidate() @native()
- val intermediateStorageLevel: StorageLevel
- Attributes
- protected
- Definition Classes
- WithIntermediateStorageLevel
- val isDirected: Boolean
- Attributes
- protected
- Definition Classes
- WithDirection
- final def isInstanceOf[T0]: Boolean
- Definition Classes
- Any
- val lgNomEntries: Int
- Attributes
- protected
- Definition Classes
- WithLgNomEntries
- def logDebug(s: => String): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logInfo(s: => String): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logTrace(s: => String): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logWarn(s: => String): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def maxIter(value: Int): StructureAwareLabelPropagation.this.type
Sets the maximum number of iterations the algorithm performs.
- Definition Classes
- WithMaxIter
- val maxIter: Option[Int]
- Attributes
- protected
- Definition Classes
- WithMaxIter
- final def ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- final def notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @IntrinsicCandidate() @native()
- final def notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @IntrinsicCandidate() @native()
- def resultIsPersistent(): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def run(): DataFrame
- def setCheckpointInterval(value: Int): StructureAwareLabelPropagation.this.type
Sets checkpoint interval in terms of number of iterations (default: 2).
Checkpointing regularly helps recover from failures, clean shuffle files, shorten the lineage of the computation graph, and reduce the complexity of plan optimization. As of Spark 2.0, the complexity of plan optimization would grow exponentially without checkpointing. Hence, disabling checkpointing or setting a longer-than-default checkpoint interval is not recommended. Checkpoint data is saved under org.apache.spark.SparkContext.getCheckpointDir with the prefix of the algorithm name. If the checkpoint directory is not set, this throws a java.io.IOException. Set a nonpositive value to disable checkpointing. This parameter is only used when the algorithm is set to "graphframes". Its default value might change in the future.
- Definition Classes
- WithCheckpointInterval
- See also
org.apache.spark.SparkContext.setCheckpointDir in Spark API doc
- def setIgnoreDirectLinks(value: Boolean): StructureAwareLabelPropagation.this.type
Sets whether direct-link baseline vote mass is ignored.
If false (default), each existing edge contributes a baseline of 1.0 before structural overlap is added. If true, only structural overlap contributes vote mass.
- def setInitialLabelCol(col: String): StructureAwareLabelPropagation.this.type
Sets an explicit vertex column to use as initial labels.
By default, each vertex starts with its own id as label. When this setter is used, the algorithm initializes labels from the provided attribute column instead, enabling attribute-guided label propagation (attribute propagation): labels can start from domain values such as categories, types, or seeds and then propagate through the graph structure.
The output label column keeps the data type of the provided column.
- def setIntermediateStorageLevel(value: StorageLevel): StructureAwareLabelPropagation.this.type
Sets storage level for intermediate datasets that require multiple passes (default: MEMORY_AND_DISK).
- Definition Classes
- WithIntermediateStorageLevel
- def setIsDirected(value: Boolean): StructureAwareLabelPropagation.this.type
Sets whether the graph should be treated as directed.
- value
true to treat the graph as directed
- Definition Classes
- WithDirection
- def setLgNomEntries(value: Int): StructureAwareLabelPropagation.this.type
Sets the log2 of nominal entries used by Theta sketch aggregations.
- Definition Classes
- WithLgNomEntries
- def setStructuralSimilarityMultiplier(value: Double): StructureAwareLabelPropagation.this.type
Sets multiplier for the neighborhood-overlap signal (common neighbors).
Edge weighting is:
- when direct links are included:
edgeWeight(src, dst) = 1 + structuralSimilarityMultiplier * commonNeighbors(src, dst)
- when direct links are ignored:
edgeWeight(src, dst) = structuralSimilarityMultiplier * commonNeighbors(src, dst)
commonNeighbors(src, dst) is the (approximate) number of shared out-neighbors between source and destination.
The value must be non-negative.
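The setters documented on this page return `this.type`, so they can be chained before calling `run()`. The fragment below is a hedged configuration sketch: the import path `org.graphframes.lib` and the helper name `detectCommunities` are assumptions, and the instance is taken as a parameter because this page does not show how a `StructureAwareLabelPropagation` is constructed. It needs Spark and GraphFrames on the classpath.

```scala
import org.apache.spark.sql.DataFrame
// Assumed package path; adjust to wherever the class lives in your build.
import org.graphframes.lib.StructureAwareLabelPropagation

object SalpUsageSketch {
  // `salp` is assumed to have been obtained from a graph elsewhere.
  def detectCommunities(salp: StructureAwareLabelPropagation): DataFrame =
    salp
      .maxIter(10)                            // required: number of propagation rounds
      .setStructuralSimilarityMultiplier(0.5) // documented default, must be non-negative
      .setIgnoreDirectLinks(false)            // documented default: keep the 1.0 baseline
      .run()
}
```

The value `10` for `maxIter` is an arbitrary illustration. If regular checkpointing is left enabled, a checkpoint directory is expected to be set on the `SparkContext` beforehand, as described under `setCheckpointInterval`.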
- def setUseLocalCheckpoints(value: Boolean): StructureAwareLabelPropagation.this.type
Sets whether to use local checkpoints instead of regular checkpoints (default: false). Local checkpoints are faster but less reliable as they don't survive node failures.
- value
true to use local checkpoints, false for regular checkpoints
- returns
this instance
- Definition Classes
- WithLocalCheckpoints
- final def synchronized[T0](arg0: => T0): T0
- Definition Classes
- AnyRef
- def toString(): String
- Definition Classes
- AnyRef → Any
- val useLocalCheckpoints: Boolean
- Attributes
- protected
- Definition Classes
- WithLocalCheckpoints
- final def wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException]) @native()
- final def wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
Deprecated Value Members
- def finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.Throwable]) @Deprecated
- Deprecated
(Since version 9)
- def formatted(fmtstr: String): String
- Implicit
- This member is added by an implicit conversion from StructureAwareLabelPropagation to StringFormat[StructureAwareLabelPropagation] performed by method StringFormat in scala.Predef.
- Definition Classes
- StringFormat
- Annotations
- @deprecated @inline()
- Deprecated
(Since version 2.12.16) Use formatString.format(value) instead of value.formatted(formatString), or use the f"" string interpolator. In Java 15 and later, formatted resolves to the new method in String which has reversed parameters.
- def →[B](y: B): (StructureAwareLabelPropagation, B)
- Implicit
- This member is added by an implicit conversion from StructureAwareLabelPropagation to ArrowAssoc[StructureAwareLabelPropagation] performed by method ArrowAssoc in scala.Predef.
- Definition Classes
- ArrowAssoc
- Annotations
- @deprecated
- Deprecated
(Since version 2.13.0) Use -> instead. If you still wish to display it as one character, consider using a font with programming ligatures such as Fira Code.