public abstract class VertexRDD<VD> extends RDD<scala.Tuple2<java.lang.Object,VD>>
RDD[(VertexId, VD)]
by ensuring that there is only one entry for each vertex and by
pre-indexing the entries for fast, efficient joins. Two VertexRDDs with the same index can be
joined efficiently. All operations except reindex
preserve the index. To construct a
VertexRDD
, use the VertexRDD object
.
Additionally, stores routing information to enable joining the vertex attributes with an
EdgeRDD
.
Constructor and Description |
---|
VertexRDD(SparkContext sc,
scala.collection.Seq<Dependency<?>> deps) |
Modifier and Type | Method and Description |
---|---|
static RDD<T> |
$plus$plus(RDD<T> other) |
static <U> U |
aggregate(U zeroValue,
scala.Function2<U,T,U> seqOp,
scala.Function2<U,U,U> combOp,
scala.reflect.ClassTag<U> evidence$29) |
abstract <VD2> VertexRDD<VD2> |
aggregateUsingIndex(RDD<scala.Tuple2<java.lang.Object,VD2>> messages,
scala.Function2<VD2,VD2,VD2> reduceFunc,
scala.reflect.ClassTag<VD2> evidence$12)
Aggregates vertices in
messages that have the same ids using reduceFunc , returning a
VertexRDD co-indexed with this . |
static <VD> VertexRDD<VD> |
apply(RDD<scala.Tuple2<java.lang.Object,VD>> vertices,
scala.reflect.ClassTag<VD> evidence$14)
Constructs a standalone
VertexRDD (one that is not set up for efficient joins with an
EdgeRDD ) from an RDD of vertex-attribute pairs. |
static <VD> VertexRDD<VD> |
apply(RDD<scala.Tuple2<java.lang.Object,VD>> vertices,
EdgeRDD<?> edges,
VD defaultVal,
scala.reflect.ClassTag<VD> evidence$15)
Constructs a
VertexRDD from an RDD of vertex-attribute pairs. |
static <VD> VertexRDD<VD> |
apply(RDD<scala.Tuple2<java.lang.Object,VD>> vertices,
EdgeRDD<?> edges,
VD defaultVal,
scala.Function2<VD,VD,VD> mergeFunc,
scala.reflect.ClassTag<VD> evidence$16)
Constructs a
VertexRDD from an RDD of vertex-attribute pairs. |
static RDD<T> |
cache() |
static <U> RDD<scala.Tuple2<T,U>> |
cartesian(RDD<U> other,
scala.reflect.ClassTag<U> evidence$5) |
static void |
checkpoint() |
protected static void |
clearDependencies() |
static RDD<T> |
coalesce(int numPartitions,
boolean shuffle,
scala.Option<PartitionCoalescer> partitionCoalescer,
scala.math.Ordering<T> ord) |
static boolean |
coalesce$default$2() |
static scala.Option<PartitionCoalescer> |
coalesce$default$3() |
static scala.math.Ordering<T> |
coalesce$default$4(int numPartitions,
boolean shuffle,
scala.Option<PartitionCoalescer> partitionCoalescer) |
static java.lang.Object |
collect() |
static <U> RDD<U> |
collect(scala.PartialFunction<T,U> f,
scala.reflect.ClassTag<U> evidence$28) |
scala.collection.Iterator<scala.Tuple2<java.lang.Object,VD>> |
compute(Partition part,
TaskContext context)
Provides the
RDD[(VertexId, VD)] equivalent output. |
static SparkContext |
context() |
static long |
count() |
static PartialResult<BoundedDouble> |
countApprox(long timeout,
double confidence) |
static double |
countApprox$default$2() |
static long |
countApproxDistinct(double relativeSD) |
static long |
countApproxDistinct(int p,
int sp) |
static double |
countApproxDistinct$default$1() |
static scala.collection.Map<T,java.lang.Object> |
countByValue(scala.math.Ordering<T> ord) |
static scala.math.Ordering<T> |
countByValue$default$1() |
static PartialResult<scala.collection.Map<T,BoundedDouble>> |
countByValueApprox(long timeout,
double confidence,
scala.math.Ordering<T> ord) |
static double |
countByValueApprox$default$2() |
static scala.math.Ordering<T> |
countByValueApprox$default$3(long timeout,
double confidence) |
static scala.collection.Seq<Dependency<?>> |
dependencies() |
abstract VertexRDD<VD> |
diff(RDD<scala.Tuple2<java.lang.Object,VD>> other)
For each vertex present in both
this and other , diff returns only those vertices with
differing values; for values that are different, keeps the values from other . |
abstract VertexRDD<VD> |
diff(VertexRDD<VD> other)
For each vertex present in both
this and other , diff returns only those vertices with
differing values; for values that are different, keeps the values from other . |
static RDD<T> |
distinct() |
static RDD<T> |
distinct(int numPartitions,
scala.math.Ordering<T> ord) |
static scala.math.Ordering<T> |
distinct$default$2(int numPartitions) |
VertexRDD<VD> |
filter(scala.Function1<scala.Tuple2<java.lang.Object,VD>,java.lang.Object> pred)
Restricts the vertex set to the set of vertices satisfying the given predicate.
|
static T |
first() |
protected static <U> RDD<U> |
firstParent(scala.reflect.ClassTag<U> evidence$31) |
static <U> RDD<U> |
flatMap(scala.Function1<T,scala.collection.TraversableOnce<U>> f,
scala.reflect.ClassTag<U> evidence$4) |
static T |
fold(T zeroValue,
scala.Function2<T,T,T> op) |
static void |
foreach(scala.Function1<T,scala.runtime.BoxedUnit> f) |
static void |
foreachPartition(scala.Function1<scala.collection.Iterator<T>,scala.runtime.BoxedUnit> f) |
static <VD> VertexRDD<VD> |
fromEdges(EdgeRDD<?> edges,
int numPartitions,
VD defaultVal,
scala.reflect.ClassTag<VD> evidence$17)
Constructs a
VertexRDD containing all vertices referred to in edges . |
static scala.Option<java.lang.String> |
getCheckpointFile() |
protected static scala.collection.Seq<Dependency<?>> |
getDependencies() |
static int |
getNumPartitions() |
protected Partition[] |
getPartitions()
Implemented by subclasses to return the set of partitions in this RDD.
|
protected static scala.collection.Seq<java.lang.String> |
getPreferredLocations(Partition split) |
static StorageLevel |
getStorageLevel() |
static RDD<java.lang.Object> |
glom() |
static <K> RDD<scala.Tuple2<K,scala.collection.Iterable<T>>> |
groupBy(scala.Function1<T,K> f,
scala.reflect.ClassTag<K> kt) |
static <K> RDD<scala.Tuple2<K,scala.collection.Iterable<T>>> |
groupBy(scala.Function1<T,K> f,
int numPartitions,
scala.reflect.ClassTag<K> kt) |
static <K> RDD<scala.Tuple2<K,scala.collection.Iterable<T>>> |
groupBy(scala.Function1<T,K> f,
Partitioner p,
scala.reflect.ClassTag<K> kt,
scala.math.Ordering<K> ord) |
static <K> scala.runtime.Null$ |
groupBy$default$4(scala.Function1<T,K> f,
Partitioner p) |
static int |
id() |
protected static void |
initializeLogIfNecessary(boolean isInterpreter) |
abstract <U,VD2> VertexRDD<VD2> |
innerJoin(RDD<scala.Tuple2<java.lang.Object,U>> other,
scala.Function3<java.lang.Object,VD,U,VD2> f,
scala.reflect.ClassTag<U> evidence$10,
scala.reflect.ClassTag<VD2> evidence$11)
Inner joins this VertexRDD with an RDD containing vertex attribute pairs.
|
abstract <U,VD2> VertexRDD<VD2> |
innerZipJoin(VertexRDD<U> other,
scala.Function3<java.lang.Object,VD,U,VD2> f,
scala.reflect.ClassTag<U> evidence$8,
scala.reflect.ClassTag<VD2> evidence$9)
Efficiently inner joins this VertexRDD with another VertexRDD sharing the same index.
|
static RDD<T> |
intersection(RDD<T> other) |
static RDD<T> |
intersection(RDD<T> other,
int numPartitions) |
static RDD<T> |
intersection(RDD<T> other,
Partitioner partitioner,
scala.math.Ordering<T> ord) |
static scala.math.Ordering<T> |
intersection$default$3(RDD<T> other,
Partitioner partitioner) |
static boolean |
isCheckpointed() |
static boolean |
isEmpty() |
protected static boolean |
isTraceEnabled() |
static scala.collection.Iterator<T> |
iterator(Partition split,
TaskContext context) |
static <K> RDD<scala.Tuple2<K,T>> |
keyBy(scala.Function1<T,K> f) |
abstract <VD2,VD3> VertexRDD<VD3> |
leftJoin(RDD<scala.Tuple2<java.lang.Object,VD2>> other,
scala.Function3<java.lang.Object,VD,scala.Option<VD2>,VD3> f,
scala.reflect.ClassTag<VD2> evidence$6,
scala.reflect.ClassTag<VD3> evidence$7)
Left joins this VertexRDD with an RDD containing vertex attribute pairs.
|
abstract <VD2,VD3> VertexRDD<VD3> |
leftZipJoin(VertexRDD<VD2> other,
scala.Function3<java.lang.Object,VD,scala.Option<VD2>,VD3> f,
scala.reflect.ClassTag<VD2> evidence$4,
scala.reflect.ClassTag<VD3> evidence$5)
Left joins this RDD with another VertexRDD with the same index.
|
static RDD<T> |
localCheckpoint() |
protected static org.slf4j.Logger |
log() |
protected static void |
logDebug(scala.Function0<java.lang.String> msg) |
protected static void |
logDebug(scala.Function0<java.lang.String> msg,
java.lang.Throwable throwable) |
protected static void |
logError(scala.Function0<java.lang.String> msg) |
protected static void |
logError(scala.Function0<java.lang.String> msg,
java.lang.Throwable throwable) |
protected static void |
logInfo(scala.Function0<java.lang.String> msg) |
protected static void |
logInfo(scala.Function0<java.lang.String> msg,
java.lang.Throwable throwable) |
protected static java.lang.String |
logName() |
protected static void |
logTrace(scala.Function0<java.lang.String> msg) |
protected static void |
logTrace(scala.Function0<java.lang.String> msg,
java.lang.Throwable throwable) |
protected static void |
logWarning(scala.Function0<java.lang.String> msg) |
protected static void |
logWarning(scala.Function0<java.lang.String> msg,
java.lang.Throwable throwable) |
static <U> RDD<U> |
map(scala.Function1<T,U> f,
scala.reflect.ClassTag<U> evidence$3) |
static <U> RDD<U> |
mapPartitions(scala.Function1<scala.collection.Iterator<T>,scala.collection.Iterator<U>> f,
boolean preservesPartitioning,
scala.reflect.ClassTag<U> evidence$6) |
static <U> boolean |
mapPartitions$default$2() |
static <U> boolean |
mapPartitionsInternal$default$2() |
static <U> RDD<U> |
mapPartitionsWithIndex(scala.Function2<java.lang.Object,scala.collection.Iterator<T>,scala.collection.Iterator<U>> f,
boolean preservesPartitioning,
scala.reflect.ClassTag<U> evidence$8) |
static <U> boolean |
mapPartitionsWithIndex$default$2() |
abstract <VD2> VertexRDD<VD2> |
mapValues(scala.Function1<VD,VD2> f,
scala.reflect.ClassTag<VD2> evidence$2)
Maps each vertex attribute, preserving the index.
|
abstract <VD2> VertexRDD<VD2> |
mapValues(scala.Function2<java.lang.Object,VD,VD2> f,
scala.reflect.ClassTag<VD2> evidence$3)
Maps each vertex attribute, additionally supplying the vertex ID.
|
static T |
max(scala.math.Ordering<T> ord) |
static T |
min(scala.math.Ordering<T> ord) |
abstract VertexRDD<VD> |
minus(RDD<scala.Tuple2<java.lang.Object,VD>> other)
For each VertexId present in both
this and other , minus will act as a set difference
operation returning only those unique VertexId's present in this . |
abstract VertexRDD<VD> |
minus(VertexRDD<VD> other)
For each VertexId present in both
this and other , minus will act as a set difference
operation returning only those unique VertexId's present in this . |
static void |
name_$eq(java.lang.String x$1) |
static java.lang.String |
name() |
protected static <U> RDD<U> |
parent(int j,
scala.reflect.ClassTag<U> evidence$32) |
static scala.Option<Partitioner> |
partitioner() |
static Partition[] |
partitions() |
static RDD<T> |
persist() |
static RDD<T> |
persist(StorageLevel newLevel) |
static RDD<java.lang.String> |
pipe(scala.collection.Seq<java.lang.String> command,
scala.collection.Map<java.lang.String,java.lang.String> env,
scala.Function1<scala.Function1<java.lang.String,scala.runtime.BoxedUnit>,scala.runtime.BoxedUnit> printPipeContext,
scala.Function2<T,scala.Function1<java.lang.String,scala.runtime.BoxedUnit>,scala.runtime.BoxedUnit> printRDDElement,
boolean separateWorkingDir,
int bufferSize) |
static RDD<java.lang.String> |
pipe(java.lang.String command) |
static RDD<java.lang.String> |
pipe(java.lang.String command,
scala.collection.Map<java.lang.String,java.lang.String> env) |
static scala.collection.Map<java.lang.String,java.lang.String> |
pipe$default$2() |
static scala.Function1<scala.Function1<java.lang.String,scala.runtime.BoxedUnit>,scala.runtime.BoxedUnit> |
pipe$default$3() |
static scala.Function2<T,scala.Function1<java.lang.String,scala.runtime.BoxedUnit>,scala.runtime.BoxedUnit> |
pipe$default$4() |
static boolean |
pipe$default$5() |
static int |
pipe$default$6() |
static scala.collection.Seq<java.lang.String> |
preferredLocations(Partition split) |
static RDD<T>[] |
randomSplit(double[] weights,
long seed) |
static long |
randomSplit$default$2() |
static T |
reduce(scala.Function2<T,T,T> f) |
abstract VertexRDD<VD> |
reindex()
Construct a new VertexRDD that is indexed by only the visible vertices.
|
static RDD<T> |
repartition(int numPartitions,
scala.math.Ordering<T> ord) |
static scala.math.Ordering<T> |
repartition$default$2(int numPartitions) |
abstract VertexRDD<VD> |
reverseRoutingTables()
Returns a new
VertexRDD reflecting a reversal of all edge directions in the corresponding
EdgeRDD . |
static RDD<T> |
sample(boolean withReplacement,
double fraction,
long seed) |
static long |
sample$default$3() |
static void |
saveAsObjectFile(java.lang.String path) |
static void |
saveAsTextFile(java.lang.String path) |
static void |
saveAsTextFile(java.lang.String path,
java.lang.Class<? extends org.apache.hadoop.io.compress.CompressionCodec> codec) |
static RDD<T> |
setName(java.lang.String _name) |
static <K> RDD<T> |
sortBy(scala.Function1<T,K> f,
boolean ascending,
int numPartitions,
scala.math.Ordering<K> ord,
scala.reflect.ClassTag<K> ctag) |
static <K> boolean |
sortBy$default$2() |
static <K> int |
sortBy$default$3() |
static SparkContext |
sparkContext() |
static RDD<T> |
subtract(RDD<T> other) |
static RDD<T> |
subtract(RDD<T> other,
int numPartitions) |
static RDD<T> |
subtract(RDD<T> other,
Partitioner p,
scala.math.Ordering<T> ord) |
static scala.math.Ordering<T> |
subtract$default$3(RDD<T> other,
Partitioner p) |
static java.lang.Object |
take(int num) |
static java.lang.Object |
takeOrdered(int num,
scala.math.Ordering<T> ord) |
static java.lang.Object |
takeSample(boolean withReplacement,
int num,
long seed) |
static long |
takeSample$default$3() |
static java.lang.String |
toDebugString() |
static JavaRDD<T> |
toJavaRDD() |
static scala.collection.Iterator<T> |
toLocalIterator() |
static java.lang.Object |
top(int num,
scala.math.Ordering<T> ord) |
static java.lang.String |
toString() |
static <U> U |
treeAggregate(U zeroValue,
scala.Function2<U,T,U> seqOp,
scala.Function2<U,U,U> combOp,
int depth,
scala.reflect.ClassTag<U> evidence$30) |
static <U> int |
treeAggregate$default$4(U zeroValue) |
static T |
treeReduce(scala.Function2<T,T,T> f,
int depth) |
static int |
treeReduce$default$2() |
static RDD<T> |
union(RDD<T> other) |
static RDD<T> |
unpersist(boolean blocking) |
static boolean |
unpersist$default$1() |
protected abstract scala.reflect.ClassTag<VD> |
vdTag() |
abstract VertexRDD<VD> |
withEdges(EdgeRDD<?> edges)
Prepares this VertexRDD for efficient joins with the given EdgeRDD.
|
static <U> RDD<scala.Tuple2<T,U>> |
zip(RDD<U> other,
scala.reflect.ClassTag<U> evidence$9) |
static <B,V> RDD<V> |
zipPartitions(RDD<B> rdd2,
boolean preservesPartitioning,
scala.Function2<scala.collection.Iterator<T>,scala.collection.Iterator<B>,scala.collection.Iterator<V>> f,
scala.reflect.ClassTag<B> evidence$10,
scala.reflect.ClassTag<V> evidence$11) |
static <B,V> RDD<V> |
zipPartitions(RDD<B> rdd2,
scala.Function2<scala.collection.Iterator<T>,scala.collection.Iterator<B>,scala.collection.Iterator<V>> f,
scala.reflect.ClassTag<B> evidence$12,
scala.reflect.ClassTag<V> evidence$13) |
static <B,C,V> RDD<V> |
zipPartitions(RDD<B> rdd2,
RDD<C> rdd3,
boolean preservesPartitioning,
scala.Function3<scala.collection.Iterator<T>,scala.collection.Iterator<B>,scala.collection.Iterator<C>,scala.collection.Iterator<V>> f,
scala.reflect.ClassTag<B> evidence$14,
scala.reflect.ClassTag<C> evidence$15,
scala.reflect.ClassTag<V> evidence$16) |
static <B,C,V> RDD<V> |
zipPartitions(RDD<B> rdd2,
RDD<C> rdd3,
scala.Function3<scala.collection.Iterator<T>,scala.collection.Iterator<B>,scala.collection.Iterator<C>,scala.collection.Iterator<V>> f,
scala.reflect.ClassTag<B> evidence$17,
scala.reflect.ClassTag<C> evidence$18,
scala.reflect.ClassTag<V> evidence$19) |
static <B,C,D,V> RDD<V> |
zipPartitions(RDD<B> rdd2,
RDD<C> rdd3,
RDD<D> rdd4,
boolean preservesPartitioning,
scala.Function4<scala.collection.Iterator<T>,scala.collection.Iterator<B>,scala.collection.Iterator<C>,scala.collection.Iterator<D>,scala.collection.Iterator<V>> f,
scala.reflect.ClassTag<B> evidence$20,
scala.reflect.ClassTag<C> evidence$21,
scala.reflect.ClassTag<D> evidence$22,
scala.reflect.ClassTag<V> evidence$23) |
static <B,C,D,V> RDD<V> |
zipPartitions(RDD<B> rdd2,
RDD<C> rdd3,
RDD<D> rdd4,
scala.Function4<scala.collection.Iterator<T>,scala.collection.Iterator<B>,scala.collection.Iterator<C>,scala.collection.Iterator<D>,scala.collection.Iterator<V>> f,
scala.reflect.ClassTag<B> evidence$24,
scala.reflect.ClassTag<C> evidence$25,
scala.reflect.ClassTag<D> evidence$26,
scala.reflect.ClassTag<V> evidence$27) |
static RDD<scala.Tuple2<T,java.lang.Object>> |
zipWithIndex() |
static RDD<scala.Tuple2<T,java.lang.Object>> |
zipWithUniqueId() |
aggregate, cache, cartesian, checkpoint, clearDependencies, coalesce, collect, collect, context, count, countApprox, countApproxDistinct, countApproxDistinct, countByValue, countByValueApprox, dependencies, distinct, distinct, doubleRDDToDoubleRDDFunctions, first, firstParent, flatMap, fold, foreach, foreachPartition, getCheckpointFile, getDependencies, getNumPartitions, getPreferredLocations, getStorageLevel, glom, groupBy, groupBy, groupBy, id, intersection, intersection, intersection, isCheckpointed, isEmpty, iterator, keyBy, localCheckpoint, map, mapPartitions, mapPartitionsWithIndex, max, min, name, numericRDDToDoubleRDDFunctions, parent, partitioner, partitions, persist, persist, pipe, pipe, pipe, preferredLocations, randomSplit, rddToAsyncRDDActions, rddToOrderedRDDFunctions, rddToPairRDDFunctions, rddToSequenceFileRDDFunctions, reduce, repartition, sample, saveAsObjectFile, saveAsTextFile, saveAsTextFile, setName, sortBy, sparkContext, subtract, subtract, subtract, take, takeOrdered, takeSample, toDebugString, toJavaRDD, toLocalIterator, top, toString, treeAggregate, treeReduce, union, unpersist, zip, zipPartitions, zipPartitions, zipPartitions, zipPartitions, zipPartitions, zipPartitions, zipWithIndex, zipWithUniqueId
public VertexRDD(SparkContext sc, scala.collection.Seq<Dependency<?>> deps)
public static <VD> VertexRDD<VD> apply(RDD<scala.Tuple2<java.lang.Object,VD>> vertices, scala.reflect.ClassTag<VD> evidence$14)
VertexRDD
(one that is not set up for efficient joins with an
EdgeRDD
) from an RDD of vertex-attribute pairs. Duplicate entries are removed arbitrarily.
vertices
- the collection of vertex-attribute pairsevidence$14
- (undocumented)public static <VD> VertexRDD<VD> apply(RDD<scala.Tuple2<java.lang.Object,VD>> vertices, EdgeRDD<?> edges, VD defaultVal, scala.reflect.ClassTag<VD> evidence$15)
VertexRDD
from an RDD of vertex-attribute pairs. Duplicate vertex entries are
removed arbitrarily. The resulting VertexRDD
will be joinable with edges
, and any missing
vertices referred to by edges
will be created with the attribute defaultVal
.
vertices
- the collection of vertex-attribute pairsedges
- the EdgeRDD
that these vertices may be joined withdefaultVal
- the vertex attribute to use when creating missing verticesevidence$15
- (undocumented)public static <VD> VertexRDD<VD> apply(RDD<scala.Tuple2<java.lang.Object,VD>> vertices, EdgeRDD<?> edges, VD defaultVal, scala.Function2<VD,VD,VD> mergeFunc, scala.reflect.ClassTag<VD> evidence$16)
VertexRDD
from an RDD of vertex-attribute pairs. Duplicate vertex entries are
merged using mergeFunc
. The resulting VertexRDD
will be joinable with edges
, and any
missing vertices referred to by edges
will be created with the attribute defaultVal
.
vertices
- the collection of vertex-attribute pairsedges
- the EdgeRDD
that these vertices may be joined withdefaultVal
- the vertex attribute to use when creating missing verticesmergeFunc
- the commutative, associative duplicate vertex attribute merge functionevidence$16
- (undocumented)public static <VD> VertexRDD<VD> fromEdges(EdgeRDD<?> edges, int numPartitions, VD defaultVal, scala.reflect.ClassTag<VD> evidence$17)
VertexRDD
containing all vertices referred to in edges
. The vertices will be
created with the attribute defaultVal
. The resulting VertexRDD
will be joinable with
edges
.
edges
- the EdgeRDD
referring to the vertices to createnumPartitions
- the desired number of partitions for the resulting VertexRDD
defaultVal
- the vertex attribute to use when creating missing verticesevidence$17
- (undocumented)protected static java.lang.String logName()
protected static org.slf4j.Logger log()
protected static void logInfo(scala.Function0<java.lang.String> msg)
protected static void logDebug(scala.Function0<java.lang.String> msg)
protected static void logTrace(scala.Function0<java.lang.String> msg)
protected static void logWarning(scala.Function0<java.lang.String> msg)
protected static void logError(scala.Function0<java.lang.String> msg)
protected static void logInfo(scala.Function0<java.lang.String> msg, java.lang.Throwable throwable)
protected static void logDebug(scala.Function0<java.lang.String> msg, java.lang.Throwable throwable)
protected static void logTrace(scala.Function0<java.lang.String> msg, java.lang.Throwable throwable)
protected static void logWarning(scala.Function0<java.lang.String> msg, java.lang.Throwable throwable)
protected static void logError(scala.Function0<java.lang.String> msg, java.lang.Throwable throwable)
protected static boolean isTraceEnabled()
protected static void initializeLogIfNecessary(boolean isInterpreter)
protected static scala.collection.Seq<Dependency<?>> getDependencies()
protected static scala.collection.Seq<java.lang.String> getPreferredLocations(Partition split)
public static scala.Option<Partitioner> partitioner()
public static SparkContext sparkContext()
public static int id()
public static java.lang.String name()
public static void name_$eq(java.lang.String x$1)
public static RDD<T> setName(java.lang.String _name)
public static RDD<T> persist(StorageLevel newLevel)
public static RDD<T> persist()
public static RDD<T> cache()
public static RDD<T> unpersist(boolean blocking)
public static StorageLevel getStorageLevel()
public static final scala.collection.Seq<Dependency<?>> dependencies()
public static final Partition[] partitions()
public static final int getNumPartitions()
public static final scala.collection.Seq<java.lang.String> preferredLocations(Partition split)
public static final scala.collection.Iterator<T> iterator(Partition split, TaskContext context)
public static <U> RDD<U> map(scala.Function1<T,U> f, scala.reflect.ClassTag<U> evidence$3)
public static <U> RDD<U> flatMap(scala.Function1<T,scala.collection.TraversableOnce<U>> f, scala.reflect.ClassTag<U> evidence$4)
public static RDD<T> distinct(int numPartitions, scala.math.Ordering<T> ord)
public static RDD<T> distinct()
public static RDD<T> repartition(int numPartitions, scala.math.Ordering<T> ord)
public static RDD<T> coalesce(int numPartitions, boolean shuffle, scala.Option<PartitionCoalescer> partitionCoalescer, scala.math.Ordering<T> ord)
public static RDD<T> sample(boolean withReplacement, double fraction, long seed)
public static RDD<T>[] randomSplit(double[] weights, long seed)
public static java.lang.Object takeSample(boolean withReplacement, int num, long seed)
public static <K> RDD<T> sortBy(scala.Function1<T,K> f, boolean ascending, int numPartitions, scala.math.Ordering<K> ord, scala.reflect.ClassTag<K> ctag)
public static RDD<T> intersection(RDD<T> other, Partitioner partitioner, scala.math.Ordering<T> ord)
public static RDD<java.lang.Object> glom()
public static <U> RDD<scala.Tuple2<T,U>> cartesian(RDD<U> other, scala.reflect.ClassTag<U> evidence$5)
public static <K> RDD<scala.Tuple2<K,scala.collection.Iterable<T>>> groupBy(scala.Function1<T,K> f, scala.reflect.ClassTag<K> kt)
public static <K> RDD<scala.Tuple2<K,scala.collection.Iterable<T>>> groupBy(scala.Function1<T,K> f, int numPartitions, scala.reflect.ClassTag<K> kt)
public static <K> RDD<scala.Tuple2<K,scala.collection.Iterable<T>>> groupBy(scala.Function1<T,K> f, Partitioner p, scala.reflect.ClassTag<K> kt, scala.math.Ordering<K> ord)
public static RDD<java.lang.String> pipe(java.lang.String command)
public static RDD<java.lang.String> pipe(java.lang.String command, scala.collection.Map<java.lang.String,java.lang.String> env)
public static RDD<java.lang.String> pipe(scala.collection.Seq<java.lang.String> command, scala.collection.Map<java.lang.String,java.lang.String> env, scala.Function1<scala.Function1<java.lang.String,scala.runtime.BoxedUnit>,scala.runtime.BoxedUnit> printPipeContext, scala.Function2<T,scala.Function1<java.lang.String,scala.runtime.BoxedUnit>,scala.runtime.BoxedUnit> printRDDElement, boolean separateWorkingDir, int bufferSize)
public static <U> RDD<U> mapPartitions(scala.Function1<scala.collection.Iterator<T>,scala.collection.Iterator<U>> f, boolean preservesPartitioning, scala.reflect.ClassTag<U> evidence$6)
public static <U> RDD<U> mapPartitionsWithIndex(scala.Function2<java.lang.Object,scala.collection.Iterator<T>,scala.collection.Iterator<U>> f, boolean preservesPartitioning, scala.reflect.ClassTag<U> evidence$8)
public static <U> RDD<scala.Tuple2<T,U>> zip(RDD<U> other, scala.reflect.ClassTag<U> evidence$9)
public static <B,V> RDD<V> zipPartitions(RDD<B> rdd2, boolean preservesPartitioning, scala.Function2<scala.collection.Iterator<T>,scala.collection.Iterator<B>,scala.collection.Iterator<V>> f, scala.reflect.ClassTag<B> evidence$10, scala.reflect.ClassTag<V> evidence$11)
public static <B,V> RDD<V> zipPartitions(RDD<B> rdd2, scala.Function2<scala.collection.Iterator<T>,scala.collection.Iterator<B>,scala.collection.Iterator<V>> f, scala.reflect.ClassTag<B> evidence$12, scala.reflect.ClassTag<V> evidence$13)
public static <B,C,V> RDD<V> zipPartitions(RDD<B> rdd2, RDD<C> rdd3, boolean preservesPartitioning, scala.Function3<scala.collection.Iterator<T>,scala.collection.Iterator<B>,scala.collection.Iterator<C>,scala.collection.Iterator<V>> f, scala.reflect.ClassTag<B> evidence$14, scala.reflect.ClassTag<C> evidence$15, scala.reflect.ClassTag<V> evidence$16)
public static <B,C,V> RDD<V> zipPartitions(RDD<B> rdd2, RDD<C> rdd3, scala.Function3<scala.collection.Iterator<T>,scala.collection.Iterator<B>,scala.collection.Iterator<C>,scala.collection.Iterator<V>> f, scala.reflect.ClassTag<B> evidence$17, scala.reflect.ClassTag<C> evidence$18, scala.reflect.ClassTag<V> evidence$19)
public static <B,C,D,V> RDD<V> zipPartitions(RDD<B> rdd2, RDD<C> rdd3, RDD<D> rdd4, boolean preservesPartitioning, scala.Function4<scala.collection.Iterator<T>,scala.collection.Iterator<B>,scala.collection.Iterator<C>,scala.collection.Iterator<D>,scala.collection.Iterator<V>> f, scala.reflect.ClassTag<B> evidence$20, scala.reflect.ClassTag<C> evidence$21, scala.reflect.ClassTag<D> evidence$22, scala.reflect.ClassTag<V> evidence$23)
public static <B,C,D,V> RDD<V> zipPartitions(RDD<B> rdd2, RDD<C> rdd3, RDD<D> rdd4, scala.Function4<scala.collection.Iterator<T>,scala.collection.Iterator<B>,scala.collection.Iterator<C>,scala.collection.Iterator<D>,scala.collection.Iterator<V>> f, scala.reflect.ClassTag<B> evidence$24, scala.reflect.ClassTag<C> evidence$25, scala.reflect.ClassTag<D> evidence$26, scala.reflect.ClassTag<V> evidence$27)
public static void foreach(scala.Function1<T,scala.runtime.BoxedUnit> f)
public static void foreachPartition(scala.Function1<scala.collection.Iterator<T>,scala.runtime.BoxedUnit> f)
public static java.lang.Object collect()
public static scala.collection.Iterator<T> toLocalIterator()
public static <U> RDD<U> collect(scala.PartialFunction<T,U> f, scala.reflect.ClassTag<U> evidence$28)
public static RDD<T> subtract(RDD<T> other, Partitioner p, scala.math.Ordering<T> ord)
public static T reduce(scala.Function2<T,T,T> f)
public static T treeReduce(scala.Function2<T,T,T> f, int depth)
public static T fold(T zeroValue, scala.Function2<T,T,T> op)
public static <U> U aggregate(U zeroValue, scala.Function2<U,T,U> seqOp, scala.Function2<U,U,U> combOp, scala.reflect.ClassTag<U> evidence$29)
public static <U> U treeAggregate(U zeroValue, scala.Function2<U,T,U> seqOp, scala.Function2<U,U,U> combOp, int depth, scala.reflect.ClassTag<U> evidence$30)
public static long count()
public static PartialResult<BoundedDouble> countApprox(long timeout, double confidence)
public static scala.collection.Map<T,java.lang.Object> countByValue(scala.math.Ordering<T> ord)
public static PartialResult<scala.collection.Map<T,BoundedDouble>> countByValueApprox(long timeout, double confidence, scala.math.Ordering<T> ord)
public static long countApproxDistinct(int p, int sp)
public static long countApproxDistinct(double relativeSD)
public static RDD<scala.Tuple2<T,java.lang.Object>> zipWithIndex()
public static RDD<scala.Tuple2<T,java.lang.Object>> zipWithUniqueId()
public static java.lang.Object take(int num)
public static T first()
public static java.lang.Object top(int num, scala.math.Ordering<T> ord)
public static java.lang.Object takeOrdered(int num, scala.math.Ordering<T> ord)
public static T max(scala.math.Ordering<T> ord)
public static T min(scala.math.Ordering<T> ord)
public static boolean isEmpty()
public static void saveAsTextFile(java.lang.String path)
public static void saveAsTextFile(java.lang.String path, java.lang.Class<? extends org.apache.hadoop.io.compress.CompressionCodec> codec)
public static void saveAsObjectFile(java.lang.String path)
public static <K> RDD<scala.Tuple2<K,T>> keyBy(scala.Function1<T,K> f)
public static void checkpoint()
public static RDD<T> localCheckpoint()
public static boolean isCheckpointed()
public static scala.Option<java.lang.String> getCheckpointFile()
protected static <U> RDD<U> firstParent(scala.reflect.ClassTag<U> evidence$31)
protected static <U> RDD<U> parent(int j, scala.reflect.ClassTag<U> evidence$32)
public static SparkContext context()
protected static void clearDependencies()
public static java.lang.String toDebugString()
public static java.lang.String toString()
public static JavaRDD<T> toJavaRDD()
public static long sample$default$3()
public static <U> boolean mapPartitionsWithIndex$default$2()
public static boolean unpersist$default$1()
public static scala.math.Ordering<T> distinct$default$2(int numPartitions)
public static boolean coalesce$default$2()
public static scala.Option<PartitionCoalescer> coalesce$default$3()
public static scala.math.Ordering<T> coalesce$default$4(int numPartitions, boolean shuffle, scala.Option<PartitionCoalescer> partitionCoalescer)
public static scala.math.Ordering<T> repartition$default$2(int numPartitions)
public static scala.math.Ordering<T> subtract$default$3(RDD<T> other, Partitioner p)
public static scala.math.Ordering<T> intersection$default$3(RDD<T> other, Partitioner partitioner)
public static long randomSplit$default$2()
public static <K> boolean sortBy$default$2()
public static <K> int sortBy$default$3()
public static <U> boolean mapPartitions$default$2()
public static <K> scala.runtime.Null$ groupBy$default$4(scala.Function1<T,K> f, Partitioner p)
public static scala.collection.Map<java.lang.String,java.lang.String> pipe$default$2()
public static scala.Function1<scala.Function1<java.lang.String,scala.runtime.BoxedUnit>,scala.runtime.BoxedUnit> pipe$default$3()
public static scala.Function2<T,scala.Function1<java.lang.String,scala.runtime.BoxedUnit>,scala.runtime.BoxedUnit> pipe$default$4()
public static boolean pipe$default$5()
public static int pipe$default$6()
public static int treeReduce$default$2()
public static <U> int treeAggregate$default$4(U zeroValue)
public static double countApprox$default$2()
public static scala.math.Ordering<T> countByValue$default$1()
public static double countByValueApprox$default$2()
public static scala.math.Ordering<T> countByValueApprox$default$3(long timeout, double confidence)
public static long takeSample$default$3()
public static double countApproxDistinct$default$1()
public static <U> boolean mapPartitionsInternal$default$2()
protected abstract scala.reflect.ClassTag<VD> vdTag()
protected Partition[] getPartitions()
RDD
The partitions in this array must satisfy the following property:
rdd.partitions.zipWithIndex.forall { case (partition, index) => partition.index == index }
getPartitions
in class RDD<scala.Tuple2<java.lang.Object,VD>>
public scala.collection.Iterator<scala.Tuple2<java.lang.Object,VD>> compute(Partition part, TaskContext context)
RDD[(VertexId, VD)]
equivalent output.public abstract VertexRDD<VD> reindex()
public VertexRDD<VD> filter(scala.Function1<scala.Tuple2<java.lang.Object,VD>,java.lang.Object> pred)
It is declared and defined here to allow refining the return type from RDD[(VertexId, VD)]
to
VertexRDD[VD]
.
public abstract <VD2> VertexRDD<VD2> mapValues(scala.Function1<VD,VD2> f, scala.reflect.ClassTag<VD2> evidence$2)
f
- the function applied to each value in the RDDevidence$2
- (undocumented)f
to each of the entries in the
original VertexRDDpublic abstract <VD2> VertexRDD<VD2> mapValues(scala.Function2<java.lang.Object,VD,VD2> f, scala.reflect.ClassTag<VD2> evidence$3)
f
- the function applied to each ID-value pair in the RDDevidence$3
- (undocumented)f
to each of the entries in the
original VertexRDD. The resulting VertexRDD retains the same index.public abstract VertexRDD<VD> minus(RDD<scala.Tuple2<java.lang.Object,VD>> other)
this
and other
, minus will act as a set difference
operation returning only those unique VertexId's present in this
.
other
- an RDD to run the set operation againstpublic abstract VertexRDD<VD> minus(VertexRDD<VD> other)
this
and other
, minus will act as a set difference
operation returning only those unique VertexId's present in this
.
other
- a VertexRDD to run the set operation againstpublic abstract VertexRDD<VD> diff(RDD<scala.Tuple2<java.lang.Object,VD>> other)
this
and other
, diff
returns only those vertices with
differing values; for values that are different, keeps the values from other
. This is
only guaranteed to work if the VertexRDDs share a common ancestor.
other
- the other RDD[(VertexId, VD)] with which to diff against.public abstract VertexRDD<VD> diff(VertexRDD<VD> other)
this
and other
, diff
returns only those vertices with
differing values; for values that are different, keeps the values from other
. This is
only guaranteed to work if the VertexRDDs share a common ancestor.
other
- the other VertexRDD with which to diff against.public abstract <VD2,VD3> VertexRDD<VD3> leftZipJoin(VertexRDD<VD2> other, scala.Function3<java.lang.Object,VD,scala.Option<VD2>,VD3> f, scala.reflect.ClassTag<VD2> evidence$4, scala.reflect.ClassTag<VD3> evidence$5)
this
.
If other
is missing any vertex in this VertexRDD, f
is passed None
.
other
- the other VertexRDD with which to join.f
- the function mapping a vertex id and its attributes in this and the other vertex set
to a new vertex attribute.evidence$4
- (undocumented)evidence$5
- (undocumented)f
public abstract <VD2,VD3> VertexRDD<VD3> leftJoin(RDD<scala.Tuple2<java.lang.Object,VD2>> other, scala.Function3<java.lang.Object,VD,scala.Option<VD2>,VD3> f, scala.reflect.ClassTag<VD2> evidence$6, scala.reflect.ClassTag<VD3> evidence$7)
leftZipJoin
implementation is
used. The resulting VertexRDD contains an entry for each vertex in this
. If other
is
missing any vertex in this VertexRDD, f
is passed None
. If there are duplicates,
the vertex is picked arbitrarily.
other
- the other VertexRDD with which to joinf
- the function mapping a vertex id and its attributes in this and the other vertex set
to a new vertex attribute.evidence$6
- (undocumented)evidence$7
- (undocumented)f
.public abstract <U,VD2> VertexRDD<VD2> innerZipJoin(VertexRDD<U> other, scala.Function3<java.lang.Object,VD,U,VD2> f, scala.reflect.ClassTag<U> evidence$8, scala.reflect.ClassTag<VD2> evidence$9)
innerJoin
for the behavior of the join.other
- (undocumented)f
- (undocumented)evidence$8
- (undocumented)evidence$9
- (undocumented)public abstract <U,VD2> VertexRDD<VD2> innerJoin(RDD<scala.Tuple2<java.lang.Object,U>> other, scala.Function3<java.lang.Object,VD,U,VD2> f, scala.reflect.ClassTag<U> evidence$10, scala.reflect.ClassTag<VD2> evidence$11)
innerZipJoin
implementation
is used.
other
- an RDD containing vertices to join. If there are multiple entries for the same
vertex, one is picked arbitrarily. Use aggregateUsingIndex
to merge multiple entries.f
- the join function applied to corresponding values of this
and other
evidence$10
- (undocumented)evidence$11
- (undocumented)this
, containing only vertices that appear in both
this
and other
, with values supplied by f
public abstract <VD2> VertexRDD<VD2> aggregateUsingIndex(RDD<scala.Tuple2<java.lang.Object,VD2>> messages, scala.Function2<VD2,VD2,VD2> reduceFunc, scala.reflect.ClassTag<VD2> evidence$12)
messages
that have the same ids using reduceFunc
, returning a
VertexRDD co-indexed with this
.
messages
- an RDD containing messages to aggregate, where each message is a pair of its
target vertex ID and the message datareduceFunc
- the associative aggregation function for merging messages to the same vertexevidence$12
- (undocumented)this
, containing only vertices that received messages.
For those vertices, their values are the result of applying reduceFunc
to all received
messages.public abstract VertexRDD<VD> reverseRoutingTables()
VertexRDD
reflecting a reversal of all edge directions in the corresponding
EdgeRDD
.