Class: Spark::PipelinedRDD
Overview
Pipelined Resilient Distributed Dataset: operations are pipelined and sent to the worker as one stage.
RDD
`-- map
`-- map
`-- map
Code is executed from top to bottom
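The idea can be sketched in plain Ruby: instead of materializing an intermediate collection after every map, the chained functions are composed and applied once per element. TinyPipelinedRDD below is a hypothetical illustration of that principle, not part of the gem:

```ruby
# Minimal sketch of map pipelining (hypothetical class, not ruby-spark's API).
class TinyPipelinedRDD
  def initialize(data, funcs = [])
    @data = data
    @funcs = funcs
  end

  # Each #map only records the function; nothing is computed yet.
  def map(&block)
    TinyPipelinedRDD.new(@data, @funcs + [block])
  end

  # A single pass applies the recorded functions top to bottom.
  def collect
    @data.map { |x| @funcs.reduce(x) { |acc, f| f.call(acc) } }
  end
end

rdd = TinyPipelinedRDD.new([1, 2, 3]).map { |x| x + 1 }.map { |x| x * 10 }
p rdd.collect # => [20, 30, 40]
```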
Instance Attribute Summary collapse
- #command ⇒ Object (readonly): Returns the value of attribute command.
- #prev_jrdd ⇒ Object (readonly): Returns the value of attribute prev_jrdd.
Attributes inherited from RDD
Instance Method Summary collapse
- #initialize(prev, command) ⇒ PipelinedRDD (constructor): A new instance of PipelinedRDD.
- #jrdd ⇒ Object: Serializes the necessary parts and sends them to RubyRDD (the Scala extension).
- #pipelinable? ⇒ Boolean
Methods inherited from RDD
#+, #add_command, #add_library, #aggregate, #aggregate_by_key, #bind, #cache, #cached?, #cartesian, #checkpointed?, #coalesce, #cogroup, #collect, #collect_as_hash, #collect_from_file, #combine_by_key, #compact, #config, #count, #default_reduce_partitions, #distinct, #filter, #first, #flat_map, #flat_map_values, #fold, #fold_by_key, #foreach, #foreach_partition, #glom, #group_by, #group_by_key, #group_with, #histogram, #id, #inspect, #intersection, #key_by, #keys, #map, #map_partitions, #map_partitions_with_index, #map_values, #max, #mean, #min, #name, #new_rdd_from_command, #partition_by, #partitions_size, #persist, #pipe, #reduce, #reduce_by_key, #reserialize, #sample, #sample_stdev, #sample_variance, #set_name, #shuffle, #sort_by, #sort_by_key, #sort_by_value, #stats, #stdev, #subtract, #subtract_by_key, #sum, #take, #take_sample, #to_java, #union, #unpersist, #values, #variance
Methods included from Helper::Statistic
#bisect_right, #compute_fraction, #determine_bounds, #upper_binomial_bound, #upper_poisson_bound
Methods included from Helper::Parser
Methods included from Helper::Logger
Constructor Details
#initialize(prev, command) ⇒ PipelinedRDD
Returns a new instance of PipelinedRDD.
# File 'lib/spark/rdd.rb', line 1338

def initialize(prev, command)
  if prev.is_a?(PipelinedRDD) && prev.pipelinable?
    # Second, ... stages
    @prev_jrdd = prev.prev_jrdd
  else
    # First stage
    @prev_jrdd = prev.jrdd
  end

  @cached = false
  @checkpointed = false
  @context = prev.context
  @command = command
end
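The branch above means a chain of pipelinable stages keeps pointing at the same underlying Java RDD; only the Ruby command grows. A rough illustration with stand-in stubs (these are hypothetical classes, not the real ruby-spark ones):

```ruby
# Stand-in stubs illustrating the constructor's branch: later pipelinable
# stages reuse the first stage's underlying Java RDD.
JavaRDDStub = Struct.new(:name)
BaseRDDStub = Struct.new(:jrdd)

class DemoPipelinedRDD
  attr_reader :prev_jrdd, :command

  def initialize(prev, command)
    if prev.is_a?(DemoPipelinedRDD) && prev.pipelinable?
      @prev_jrdd = prev.prev_jrdd # second and later stages: keep the original source
    else
      @prev_jrdd = prev.jrdd      # first stage: start from the plain RDD
    end
    @cached = false
    @command = command
  end

  def pipelinable?
    !@cached
  end
end

base   = BaseRDDStub.new(JavaRDDStub.new('source'))
first  = DemoPipelinedRDD.new(base, :map_a)
second = DemoPipelinedRDD.new(first, :map_b)
p first.prev_jrdd.equal?(second.prev_jrdd) # => true
```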
Instance Attribute Details
#command ⇒ Object (readonly)
Returns the value of attribute command.
# File 'lib/spark/rdd.rb', line 1336

def command
  @command
end
#prev_jrdd ⇒ Object (readonly)
Returns the value of attribute prev_jrdd.
# File 'lib/spark/rdd.rb', line 1336

def prev_jrdd
  @prev_jrdd
end
Instance Method Details
#jrdd ⇒ Object
Serializes the necessary parts and sends them to RubyRDD (the Scala extension).
# File 'lib/spark/rdd.rb', line 1360

def jrdd
  @jrdd ||= _jrdd
end
#pipelinable? ⇒ Boolean
# File 'lib/spark/rdd.rb', line 1355

def pipelinable?
  !(cached? || checkpointed?)
end
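Why caching blocks pipelining: a stage whose result is cached or checkpointed must be materialized, so a later map cannot be fused into it. A hypothetical stage object demonstrating the same predicate:

```ruby
# Hypothetical stage object (not ruby-spark's class) showing when fusing
# is safe: a cached or checkpointed stage must be materialized.
class StageSketch
  def initialize(cached: false, checkpointed: false)
    @cached = cached
    @checkpointed = checkpointed
  end

  def cached?
    @cached
  end

  def checkpointed?
    @checkpointed
  end

  # Same predicate as PipelinedRDD#pipelinable?
  def pipelinable?
    !(cached? || checkpointed?)
  end
end

p StageSketch.new.pipelinable?               # => true
p StageSketch.new(cached: true).pipelinable? # => false
```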