keystoneml.workflow

Pipeline

class Pipeline[A, B] extends Chainable[A, B]

A Pipeline takes data as input (a single item or an RDD) and outputs a transformation of that data. Internally, a Pipeline contains a GraphExecutor, a specified source, and a specified sink. Applying a pipeline to data produces a PipelineResult, in the form of either a PipelineDataset or a PipelineDatum. These are lazy wrappers around the scheduled execution: the underlying Graph is executed only when their values are accessed.
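
The lazy-wrapper behavior described above can be illustrated with a small toy model in plain Scala. This is only a sketch of the general idea and not KeystoneML's implementation: `LazyResult`, `ToyPipeline`, and `executions` are hypothetical names introduced for illustration, and the real library schedules work through a graph rather than a single function.

```scala
// Toy model only: NOT KeystoneML's implementation. All names here are hypothetical.

// A lazy wrapper: the computation runs on first access to `get` and is then cached.
final class LazyResult[T](compute: () => T) {
  lazy val get: T = compute()
}

// A stand-in for a pipeline over single items.
final class ToyPipeline[A, B](f: A => B) {
  // Counts how many times the wrapped function has actually run.
  var executions: Int = 0

  // Applying the pipeline only records the work to do; nothing is executed here.
  def apply(datum: A): LazyResult[B] =
    new LazyResult[B](() => { executions += 1; f(datum) })
}

object LazyDemo {
  def main(args: Array[String]): Unit = {
    val doubler = new ToyPipeline[Int, Int]((x: Int) => x * 2)
    val result = doubler(21)
    println(doubler.executions) // 0: nothing has run yet
    println(result.get)         // 42: first access triggers execution
    println(result.get)         // 42: served from the cached value
    println(doubler.executions) // 1: the function ran exactly once
  }
}
```

In this sketch, applying the pipeline is cheap and side-effect free; the cost is paid once, at the first access of `get`.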

Warning: Not thread-safe!

A

type of the data this Pipeline expects as input

B

type of the data this Pipeline outputs

Linear Supertypes
Chainable[A, B], AnyRef, Any

Value Members

  1. final def !=(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  2. final def !=(arg0: Any): Boolean

    Definition Classes
    Any
  3. final def ##(): Int

    Definition Classes
    AnyRef → Any
  4. final def ==(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  5. final def ==(arg0: Any): Boolean

    Definition Classes
    Any
  6. final def andThen[C, L](est: LabelEstimator[B, C, L], data: PipelineDataset[A], labels: PipelineDataset[L]): Pipeline[A, C]

    Chains a label estimator onto the end of this pipeline, producing a new pipeline. If this pipeline has already been executed, it will not need to be fit again.

    est

    The estimator to chain onto the end of this pipeline

    data

    The training data to use (the estimator will be fit on the result of passing this data through the current pipeline)

    labels

    The labels to use when fitting the LabelEstimator. Must be zippable with the training data (as with RDD.zip: the same number of partitions and the same number of elements in each partition).

    Definition Classes
    Chainable
  7. final def andThen[C, L](est: LabelEstimator[B, C, L], data: RDD[A], labels: PipelineDataset[L]): Pipeline[A, C]

    Chains a label estimator onto the end of this pipeline, producing a new pipeline. If this pipeline has already been executed, it will not need to be fit again.

    est

    The estimator to chain onto the end of this pipeline

    data

    The training data to use (the estimator will be fit on the result of passing this data through the current pipeline)

    labels

    The labels to use when fitting the LabelEstimator. Must be zippable with the training data (as with RDD.zip: the same number of partitions and the same number of elements in each partition).

    Definition Classes
    Chainable
  8. final def andThen[C, L](est: LabelEstimator[B, C, L], data: PipelineDataset[A], labels: RDD[L]): Pipeline[A, C]

    Chains a label estimator onto the end of this pipeline, producing a new pipeline. If this pipeline has already been executed, it will not need to be fit again.

    est

    The estimator to chain onto the end of this pipeline

    data

    The training data to use (the estimator will be fit on the result of passing this data through the current pipeline)

    labels

    The labels to use when fitting the LabelEstimator. Must be zippable with the training data (as with RDD.zip: the same number of partitions and the same number of elements in each partition).

    Definition Classes
    Chainable
  9. final def andThen[C, L](est: LabelEstimator[B, C, L], data: RDD[A], labels: RDD[L]): Pipeline[A, C]

    Chains a label estimator onto the end of this pipeline, producing a new pipeline. If this pipeline has already been executed, it will not need to be fit again.

    est

    The estimator to chain onto the end of this pipeline

    data

    The training data to use (the estimator will be fit on the result of passing this data through the current pipeline)

    labels

    The labels to use when fitting the LabelEstimator. Must be zippable with the training data (as with RDD.zip: the same number of partitions and the same number of elements in each partition).

    Definition Classes
    Chainable
  10. final def andThen[C](est: Estimator[B, C], data: PipelineDataset[A]): Pipeline[A, C]

    Chains an estimator onto the end of this pipeline, producing a new pipeline. If this pipeline has already been executed, it will not need to be fit again.

    est

    The estimator to chain onto the end of this pipeline

    data

    The training data to use (the estimator will be fit on the result of passing this data through the current pipeline)

    Definition Classes
    Chainable
  11. final def andThen[C](est: Estimator[B, C], data: RDD[A]): Pipeline[A, C]

    Chains an estimator onto the end of this pipeline, producing a new pipeline. If this pipeline has already been executed, it will not need to be fit again.

    est

    The estimator to chain onto the end of this pipeline

    data

    The training data to use (the estimator will be fit on the result of passing this data through the current pipeline)

    Definition Classes
    Chainable
  12. final def andThen[C](next: Chainable[B, C]): Pipeline[A, C]

    Chains a pipeline onto the end of this one, producing a new pipeline. If either this pipeline or the following has already been executed, it will not need to be fit again.

    next

    the pipeline to chain

    Definition Classes
    Chainable
  13. final def apply(datum: PipelineDatum[A]): PipelineDatum[B]

    Lazily apply the pipeline to the lazy output of a different pipeline given an initial datum. If the previous pipeline has already been fit, it will not need to be fit again.

    returns

    A lazy wrapper around the result of passing lazy output from a different pipeline through this pipeline.

  14. final def apply(data: PipelineDataset[A]): PipelineDataset[B]

    Lazily apply the pipeline to the lazy output of a different pipeline given an initial dataset. If the previous pipeline has already been fit, it will not need to be fit again.

    returns

    A lazy wrapper around the result of passing lazy output from a different pipeline through this pipeline.

  15. final def apply(data: RDD[A]): PipelineDataset[B]

    Lazily apply the pipeline to a dataset.

    returns

    A lazy wrapper around the result of passing the dataset through the pipeline.

  16. final def apply(datum: A): PipelineDatum[B]

    Lazily apply the pipeline to a single datum.

    returns

    A lazy wrapper around the result of passing the datum through the pipeline.

  17. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  18. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  19. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  20. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  21. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  22. final def fit(): FittedPipeline[A, B]

    Fit all Estimators in this pipeline to produce a FittedPipeline. The FittedPipeline is logically equivalent to this pipeline, but its underlying graph contains only Transformers. Applying the FittedPipeline to new data does not trigger any new optimization or estimator fitting.

    The FittedPipeline is also serializable and may be written to and read from disk.

    returns

    the fitted version of this Pipeline.

  23. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  24. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  25. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  26. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  27. final def notify(): Unit

    Definition Classes
    AnyRef
  28. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  29. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  30. def toPipeline: Pipeline[A, B]

    A method that converts this object into a Pipeline. Must be implemented by anything that extends Chainable.

    Definition Classes
    Pipeline → Chainable
  31. def toString(): String

    Definition Classes
    AnyRef → Any
  32. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  33. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  34. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
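
The contract of the `andThen(est, data)` and `fit()` members listed above can be sketched with a toy model in plain Scala. This is not KeystoneML's implementation: `ToyEstimator`, `ToyChain`, `MeanCenter`, and `fitCount` are hypothetical names, a plain `Seq` stands in for `RDD` and `PipelineDataset`, and `apply` here is eager rather than returning a lazy wrapper. The sketch shows only two points from the descriptions: the estimator is fit on the training data after that data has passed through the current chain, and fitting happens at most once.

```scala
// Toy model only: NOT KeystoneML's implementation. All names here are hypothetical,
// and plain Seq stands in for RDD / PipelineDataset.

// An estimator learns a function B => C from training data.
trait ToyEstimator[B, C] {
  def fit(data: Seq[B]): B => C
}

final class ToyChain[A, B](build: () => A => B) {
  // Built (and therefore fit) at most once, on first use.
  private lazy val fitted: A => B = build()

  // Chain a plain transformation onto the end of this chain.
  def andThen[C](next: B => C): ToyChain[A, C] =
    new ToyChain[A, C](() => fitted.andThen(next))

  // Chain an estimator: it is fit on the training data after that data
  // has been passed through the current chain.
  def andThen[C](est: ToyEstimator[B, C], data: Seq[A]): ToyChain[A, C] =
    new ToyChain[A, C](() => {
      val model: B => C = est.fit(data.map(fitted))
      fitted.andThen(model)
    })

  // Force fitting and return the resulting plain function.
  def fit(): A => B = fitted

  // Eager in this toy model; forces fitting on first use.
  def apply(datum: A): B = fitted(datum)
}

// Example estimator: learns the mean of its training data and subtracts it.
final class MeanCenter extends ToyEstimator[Double, Double] {
  var fitCount: Int = 0
  def fit(data: Seq[Double]): Double => Double = {
    fitCount += 1
    val mean = data.sum / data.size
    (x: Double) => x - mean
  }
}

object ChainDemo {
  def main(args: Array[String]): Unit = {
    val est = new MeanCenter
    val scale = new ToyChain[Int, Double](() => (i: Int) => i * 2.0)
    val chain = scale.andThen(est, Seq(1, 2, 3)) // nothing is fit yet
    println(est.fitCount) // 0
    println(chain(5))     // 6.0: 5 * 2.0 = 10.0, minus the training mean 4.0
    val fitted = chain.fit()
    println(fitted(1))    // -2.0
    println(est.fitCount) // 1: the estimator was fit exactly once
  }
}
```

The function returned by `fit()` in this sketch no longer references the estimator's training step, which mirrors the description of a FittedPipeline as containing only Transformers.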
