public class PowerIterationClustering
extends java.lang.Object
implements scala.Serializable
Power Iteration Clustering (PIC), a scalable graph clustering algorithm developed by
Lin and Cohen. From the abstract: PIC finds a very
low-dimensional embedding of a dataset using truncated power iteration on a normalized pair-wise
similarity matrix of the data.
param: k Number of clusters.
param: maxIterations Maximum number of iterations of the PIC algorithm.
param: initMode Set the initialization mode. This can be either "random" to use a random vector as vertex properties, or "degree" to use normalized sum similarities. Default: random.
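As a rough illustration of the description above, the following plain-Java sketch runs a fixed number of power-iteration steps on a row-normalized similarity matrix, starting from the "degree" initialization. It does not use Spark: the class PicEmbeddingSketch, the method embed, and the dense double[][] input are assumptions made for this example, and the library's actual implementation may differ (for instance in its stopping rule).

```java
import java.util.Arrays;

/** Hypothetical illustration class; not part of Spark. */
class PicEmbeddingSketch {

    /**
     * Computes a one-dimensional embedding by truncated power iteration.
     * Assumes a is a symmetric, nonnegative affinity matrix with a zero
     * diagonal and at least one nonzero entry in every row.
     */
    static double[] embed(double[][] a, int maxIterations) {
        int n = a.length;
        double[] degree = new double[n];
        double total = 0.0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                degree[i] += a[i][j];
            }
            total += degree[i];
        }

        // "degree" initialization: row sums of similarities, scaled to sum to 1.
        double[] v = new double[n];
        for (int i = 0; i < n; i++) {
            v[i] = degree[i] / total;
        }

        // Truncated power iteration: stop after maxIterations steps.
        for (int iter = 0; iter < maxIterations; iter++) {
            double[] next = new double[n];
            double norm = 0.0;
            for (int i = 0; i < n; i++) {
                for (int j = 0; j < n; j++) {
                    // Row-normalized matrix W = D^-1 A, applied on the fly.
                    next[i] += a[i][j] / degree[i] * v[j];
                }
                norm += Math.abs(next[i]);
            }
            // Rescale so the entries keep an L1 norm of 1.
            for (int i = 0; i < n; i++) {
                next[i] /= norm;
            }
            v = next;
        }
        return v;
    }

    public static void main(String[] args) {
        // Two disconnected pairs: {0, 1} with similarity 1.0, {2, 3} with 3.0.
        double[][] a = {
            {0, 1, 0, 0},
            {1, 0, 0, 0},
            {0, 0, 0, 3},
            {0, 0, 3, 0}
        };
        System.out.println(Arrays.toString(embed(a, 10)));
    }
}
```

In this toy input the two pairs end up with clearly different embedding values, which is what makes a later grouping into k clusters straightforward; that grouping step is left out of the sketch.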
Modifier and Type | Class and Description |
---|---|
static class | PowerIterationClustering.Assignment - Cluster assignment. |
static class | PowerIterationClustering.Assignment$ |
Constructor and Description |
---|
PowerIterationClustering() - Constructs a PIC instance with default parameters: {k: 2, maxIterations: 100, initMode: "random"}. |
Modifier and Type | Method and Description |
---|---|
PowerIterationClusteringModel | run(Graph<java.lang.Object,java.lang.Object> graph) - Run the PIC algorithm on Graph. |
PowerIterationClusteringModel | run(JavaRDD<scala.Tuple3<java.lang.Long,java.lang.Long,java.lang.Double>> similarities) - A Java-friendly version of PowerIterationClustering.run. |
PowerIterationClusteringModel | run(RDD<scala.Tuple3<java.lang.Object,java.lang.Object,java.lang.Object>> similarities) - Run the PIC algorithm. |
PowerIterationClustering | setInitializationMode(java.lang.String mode) - Set the initialization mode. |
PowerIterationClustering | setK(int k) - Set the number of clusters. |
PowerIterationClustering | setMaxIterations(int maxIterations) - Set the maximum number of iterations of the power iteration loop. |
public PowerIterationClustering()
public PowerIterationClustering setK(int k)
k - (undocumented)
public PowerIterationClustering setMaxIterations(int maxIterations)
maxIterations - (undocumented)
public PowerIterationClustering setInitializationMode(java.lang.String mode)
mode - (undocumented)
public PowerIterationClusteringModel run(Graph<java.lang.Object,java.lang.Object> graph)
graph - an affinity matrix represented as a graph, which is the matrix A in the PIC paper.
The similarity s_ij, represented as the edge between vertices (i, j), must
be nonnegative. This is a symmetric matrix and hence s_ij = s_ji. For
any (i, j) with nonzero similarity, there should be either (i, j, s_ij)
or (j, i, s_ji) in the input. Tuples with i = j are ignored, because we
assume s_ij = 0.0 when i = j.
Returns: a PowerIterationClusteringModel that contains the clustering result
public PowerIterationClusteringModel run(RDD<scala.Tuple3<java.lang.Object,java.lang.Object,java.lang.Object>> similarities)
similarities - an RDD of (i, j, s_ij) tuples representing the affinity matrix, which is
the matrix A in the PIC paper. The similarity s_ij must be nonnegative.
This is a symmetric matrix and hence s_ij = s_ji. For any (i, j) with
nonzero similarity, there should be either (i, j, s_ij) or
(j, i, s_ji) in the input. Tuples with i = j are ignored, because we
assume s_ij = 0.0 when i = j.
Returns: a PowerIterationClusteringModel that contains the clustering result
public PowerIterationClusteringModel run(JavaRDD<scala.Tuple3<java.lang.Long,java.lang.Long,java.lang.Double>> similarities)
A Java-friendly version of PowerIterationClustering.run.
similarities - (undocumented)
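The input rules stated for the similarities parameter (nonnegative values, one tuple per unordered pair, tuples with i = j ignored) can be sketched in plain Java as follows. This is only an illustration of those rules, not Spark code: the class AffinityInputSketch, the nested Similarity type (a stand-in for scala.Tuple3), and the method toSymmetricMatrix are hypothetical names, and a dense matrix is assumed for simplicity.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration class; not part of Spark. */
class AffinityInputSketch {

    /** One (i, j, s_ij) tuple; a stand-in for scala.Tuple3 in this sketch. */
    static final class Similarity {
        final long i;
        final long j;
        final double s;

        Similarity(long i, long j, double s) {
            this.i = i;
            this.j = j;
            this.s = s;
        }
    }

    /** Expands one-directional tuples into a dense symmetric n x n matrix. */
    static double[][] toSymmetricMatrix(List<Similarity> tuples, int n) {
        double[][] a = new double[n][n];
        for (Similarity t : tuples) {
            if (t.s < 0.0) {
                throw new IllegalArgumentException("similarity must be nonnegative");
            }
            if (t.i == t.j) {
                continue; // self-similarities are ignored
            }
            // Only one of (i, j) or (j, i) is supplied, so mirror the value.
            a[(int) t.i][(int) t.j] = t.s;
            a[(int) t.j][(int) t.i] = t.s;
        }
        return a;
    }

    public static void main(String[] args) {
        List<Similarity> tuples = Arrays.asList(
            new Similarity(0, 1, 0.9),
            new Similarity(1, 2, 0.4),
            new Similarity(2, 2, 5.0)); // ignored because i = j
        System.out.println(Arrays.deepToString(toSymmetricMatrix(tuples, 3)));
    }
}
```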