class SVDPlusPlus extends Arguments with WithMaxIter with Logging
Arguments for the SVD++ algorithm.
This class implements the SVD++ algorithm for Collaborative Filtering, primarily used for Recommender Systems (Link Prediction).
Based on the paper "Factorization Meets the Neighborhood: a Multifaceted Collaborative Filtering Model" by Yehuda Koren (2008), available at https://dl.acm.org/citation.cfm?id=1401944.
Problem Definition
The algorithm predicts unknown ratings in a user-item system. It accounts for:
- Explicit preferences (user ratings).
- Implicit feedback (the history of items a user has interacted with).
- User and Item biases.
The prediction rule for a rating r_ui (user u, item i) is:
r_ui = µ + b_u + b_i + q_i^T * (p_u + |N(u)|^-0.5 * sum(y_j for j in N(u)))
Where N(u) is the set of items user u has interacted with (implicit feedback).
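As a rough illustration of the prediction rule above, the following plain-Scala sketch evaluates it for a single (user, item) pair. The object `SvdppPredict` and its helpers are hypothetical names introduced only for this example; they are not part of the SVDPlusPlus API, and the sketch assumes that an empty N(u) contributes nothing to the user vector.

```scala
// Hypothetical helper, NOT part of the SVDPlusPlus API.
object SvdppPredict {
  // Plain dot product of two equally sized vectors.
  def dot(a: Array[Double], b: Array[Double]): Double =
    a.zip(b).map { case (x, y) => x * y }.sum

  /** Evaluates r_ui = mu + b_u + b_i + q_i^T * (p_u + |N(u)|^-0.5 * sum(y_j)).
    *
    * mu: global mean rating; bu, bi: user and item biases;
    * pu, qi: latent factors; ys: the y_j vectors for every j in N(u).
    */
  def predict(mu: Double, bu: Double, bi: Double,
              pu: Array[Double], qi: Array[Double],
              ys: Seq[Array[Double]]): Double = {
    val rank = pu.length
    // Assumption: a user with no implicit feedback gets a zero implicit term.
    val norm = if (ys.isEmpty) 0.0 else 1.0 / math.sqrt(ys.size.toDouble)
    // Element-wise sum of all y_j vectors.
    val implicitSum = ys.foldLeft(Array.fill(rank)(0.0)) { (acc, y) =>
      acc.zip(y).map { case (a, b) => a + b }
    }
    // p_u shifted by the normalized implicit-feedback term.
    val userVec = pu.zip(implicitSum).map { case (p, s) => p + norm * s }
    mu + bu + bi + dot(qi, userVec)
  }
}
```

With four interacted items the normalization factor is 4^-0.5 = 0.5, so each y_j contributes half of its value to the user vector.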
Input Requirements
!!! IMPORTANT !!! The input graph MUST be a **Directed Bipartite Graph** representing interactions:
- **Vertices**: A mix of Users and Items.
- **Edges**: Directed strictly from **User (src) -> Item (dst)**.
- **Edge Attribute**: A numeric column (default "weight") representing the rating.
DO NOT use this on general/undirected graphs (e.g., social networks), as the algorithm relies on the asymmetry between Users (who provide feedback) and Items (which receive it).
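One way to sanity-check the User -> Item requirement before running the algorithm is to confirm that no vertex id appears both as a source and as a destination. The sketch below does this on an in-memory list of edges; `BipartiteCheck` and `Edge` are hypothetical names used for illustration only, and the same idea would need to be expressed with DataFrame operations for a real graph.

```scala
// Hypothetical pre-flight check, NOT part of the SVDPlusPlus API.
object BipartiteCheck {
  // src is expected to be a user id, dst an item id, weight the rating.
  final case class Edge(src: String, dst: String, weight: Double)

  /** Returns every vertex id used both as a source and as a destination.
    * An empty result means all edges are consistent with User -> Item.
    */
  def violations(edges: Seq[Edge]): Set[String] = {
    val users = edges.map(_.src).toSet
    val items = edges.map(_.dst).toSet
    users.intersect(items)
  }
}
```

A non-empty result suggests the edge list mixes directions or describes a general graph, in which case the input should be fixed before calling run().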
Output Model (Node Embeddings)
The algorithm returns a DataFrame of vertices with the trained model parameters. These parameters function as embeddings:
- column1 (Array[Double]): **Primary Latent Factors (Explicit Embedding)**.
  - For Users: Represents preferences (p_u).
  - For Items: Represents characteristics (q_i).
- column2 (Array[Double]): **Implicit Factors (Implicit Embedding)**.
  - For Items: Represents the influence of the item (y_i) on a user's profile based on viewing history.
  - For Users: Generally unused/zero.
- column3 (Double): **Bias**.
  - For Users: User bias (b_u).
  - For Items: Item bias (b_i).
- column4 (Double): **Implicit Normalization Term**.
  - For Users: Precomputed |N(u)|^-0.5.
  - For Items: Unused.
Parameter Tuning Guide
Constraints:
minValue/maxValue: Hard bounds for predicted ratings. Predictions outside this range are clipped. Set these to your rating scale limits (e.g., 1.0 and 5.0).
Learning Rates (Step sizes for Gradient Descent):
- gamma1: Learning rate for **Biases** (b_u, b_i).
- gamma2: Learning rate for **Embeddings/Factors** (p_u, q_i, y_j).
> Tip: Increase if convergence is too slow. Decrease if the loss explodes (NaN).
Regularization (Preventing Overfitting):
- gamma6: Regularization for **Biases**.
- gamma7: Regularization for **Embeddings/Factors**.
> Tip: Increase these if the model performs well on training data but poorly on test data.
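To make the roles of these parameters concrete, here is a minimal sketch of the textbook stochastic gradient step from Koren's paper, together with the clipping applied by minValue/maxValue. It is an assumption that the library's internal update has this form; `SvdppStep` and its methods are hypothetical names, and `err` stands for the difference between the observed rating and the current prediction.

```scala
// Hypothetical illustration of the tuning parameters, NOT the library's code.
object SvdppStep {
  /** Clips a raw prediction to the rating scale [minValue, maxValue]. */
  def clip(x: Double, minValue: Double, maxValue: Double): Double =
    math.max(minValue, math.min(maxValue, x))

  /** One step for a bias term: b + gamma1 * (err - gamma6 * b).
    * gamma1 scales the step; gamma6 pulls the bias back toward zero.
    */
  def updateBias(b: Double, err: Double, gamma1: Double, gamma6: Double): Double =
    b + gamma1 * (err - gamma6 * b)

  /** One step for user factors: p_u + gamma2 * (err * q_i - gamma7 * p_u).
    * gamma2 scales the step; gamma7 shrinks the factors toward zero.
    */
  def updateUserFactors(pu: Array[Double], qi: Array[Double], err: Double,
                        gamma2: Double, gamma7: Double): Array[Double] =
    pu.zip(qi).map { case (p, q) => p + gamma2 * (err * q - gamma7 * p) }
}
```

In this form a larger gamma1 or gamma2 moves parameters further per observed error, which explains why an overly large value can make the loss diverge, while a larger gamma6 or gamma7 shrinks parameters more strongly on every step.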
Value Members
- final def !=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def ##: Int
- Definition Classes
- AnyRef → Any
- def +(other: String): String
- Implicit
- This member is added by an implicit conversion from SVDPlusPlus to any2stringadd[SVDPlusPlus] performed by method any2stringadd in scala.Predef.
- Definition Classes
- any2stringadd
- def ->[B](y: B): (SVDPlusPlus, B)
- Implicit
- This member is added by an implicit conversion from SVDPlusPlus to ArrowAssoc[SVDPlusPlus] performed by method ArrowAssoc in scala.Predef.
- Definition Classes
- ArrowAssoc
- Annotations
- @inline()
- final def ==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def asInstanceOf[T0]: T0
- Definition Classes
- Any
- def clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.CloneNotSupportedException]) @IntrinsicCandidate() @native()
- def ensuring(cond: (SVDPlusPlus) => Boolean, msg: => Any): SVDPlusPlus
- Implicit
- This member is added by an implicit conversion from SVDPlusPlus to Ensuring[SVDPlusPlus] performed by method Ensuring in scala.Predef.
- Definition Classes
- Ensuring
- def ensuring(cond: (SVDPlusPlus) => Boolean): SVDPlusPlus
- Implicit
- This member is added by an implicit conversion from SVDPlusPlus to Ensuring[SVDPlusPlus] performed by method Ensuring in scala.Predef.
- Definition Classes
- Ensuring
- def ensuring(cond: Boolean, msg: => Any): SVDPlusPlus
- Implicit
- This member is added by an implicit conversion from SVDPlusPlus to Ensuring[SVDPlusPlus] performed by method Ensuring in scala.Predef.
- Definition Classes
- Ensuring
- def ensuring(cond: Boolean): SVDPlusPlus
- Implicit
- This member is added by an implicit conversion from SVDPlusPlus to Ensuring[SVDPlusPlus] performed by method Ensuring in scala.Predef.
- Definition Classes
- Ensuring
- final def eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- def equals(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef → Any
- def gamma1(value: Double): SVDPlusPlus.this.type
- def gamma2(value: Double): SVDPlusPlus.this.type
- def gamma6(value: Double): SVDPlusPlus.this.type
- def gamma7(value: Double): SVDPlusPlus.this.type
- final def getClass(): Class[_ <: AnyRef]
- Definition Classes
- AnyRef → Any
- Annotations
- @IntrinsicCandidate() @native()
- def hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @IntrinsicCandidate() @native()
- final def isInstanceOf[T0]: Boolean
- Definition Classes
- Any
- def logDebug(s: => String): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logInfo(s: => String): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logTrace(s: => String): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logWarn(s: => String): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def loss: Double
- def maxIter(value: Int): SVDPlusPlus.this.type
The maximum number of iterations the algorithm performs.
- Definition Classes
- WithMaxIter
- val maxIter: Option[Int]
- Attributes
- protected
- Definition Classes
- WithMaxIter
- def maxValue(value: Double): SVDPlusPlus.this.type
- def minValue(value: Double): SVDPlusPlus.this.type
- final def ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- final def notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @IntrinsicCandidate() @native()
- final def notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @IntrinsicCandidate() @native()
- def rank(value: Int): SVDPlusPlus.this.type
- def resultIsPersistent(): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def run(): DataFrame
- final def synchronized[T0](arg0: => T0): T0
- Definition Classes
- AnyRef
- def toString(): String
- Definition Classes
- AnyRef → Any
- final def wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException]) @native()
- final def wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
Deprecated Value Members
- def finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.Throwable]) @Deprecated
- Deprecated
(Since version 9)
- def formatted(fmtstr: String): String
- Implicit
- This member is added by an implicit conversion from SVDPlusPlus to StringFormat[SVDPlusPlus] performed by method StringFormat in scala.Predef.
- Definition Classes
- StringFormat
- Annotations
- @deprecated @inline()
- Deprecated
(Since version 2.12.16) Use formatString.format(value) instead of value.formatted(formatString), or use the f"" string interpolator. In Java 15 and later, formatted resolves to the new method in String which has reversed parameters.
- def →[B](y: B): (SVDPlusPlus, B)
- Implicit
- This member is added by an implicit conversion from SVDPlusPlus to ArrowAssoc[SVDPlusPlus] performed by method ArrowAssoc in scala.Predef.
- Definition Classes
- ArrowAssoc
- Annotations
- @deprecated
- Deprecated
(Since version 2.13.0) Use -> instead. If you still wish to display it as one character, consider using a font with programming ligatures such as Fira Code.