keystoneml.nodes.stats

TermFrequency

case class TermFrequency[T](fun: (Double) ⇒ Double = ...) extends Transformer[Seq[T], Seq[(T, Double)]] with Product with Serializable

Transformer that maps a Seq[T] of objects to a Seq[(T, Double)] of (unique object, fun(tf)) pairs, where tf is the number of times the unique object appeared in the original Seq[T], and fun is the weighting scheme: a function of type Double => Double that defaults to the identity function.

As an example, the following returns a transformer that maps a Seq[T] to each distinct object seen, paired with the natural log of its count, plus 1:

TermFrequency(x => math.log(x) + 1)
fun

the weighting scheme to apply to the frequencies (defaults to identity)
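
The behaviour described above can be sketched without KeystoneML or Spark on the classpath. This is a minimal standalone illustration, not the library's implementation: `TermFrequencySketch` and `termFrequency` are hypothetical names, and the sketch assumes only what the description states (count each distinct object, then apply the weighting function to the count). The order of the returned pairs is left unspecified here, so the example looks values up by key.

```scala
// Standalone sketch: plain Scala collections only, no KeystoneML or Spark.
object TermFrequencySketch {
  // Hypothetical helper mirroring the documented behaviour:
  // count each distinct object, then weight the count with `fun`.
  def termFrequency[T](in: Seq[T], fun: Double => Double = (x: Double) => x): Seq[(T, Double)] =
    in.groupBy(identity) // distinct object -> all of its occurrences
      .map { case (term, occurrences) => (term, fun(occurrences.size.toDouble)) }
      .toSeq

  def main(args: Array[String]): Unit = {
    val tokens = Seq("a", "b", "a", "c", "a")

    // Default (identity) weighting: raw counts.
    val raw = termFrequency(tokens).toMap
    println(raw("a")) // prints 3.0

    // Same weighting as the example above: log(count) + 1.
    val logged = termFrequency(tokens, x => math.log(x) + 1).toMap
    println(logged("b")) // prints 1.0, because log(1) is 0
  }
}
```

With this weighting, an object seen once keeps a weight of 1, while repeated objects grow logarithmically rather than linearly with their count.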

Linear Supertypes
Product, Equals, Transformer[Seq[T], Seq[(T, Double)]], Chainable[Seq[T], Seq[(T, Double)]], TransformerOperator, Serializable, Serializable, Operator, AnyRef, Any

Instance Constructors

  1. new TermFrequency(fun: (Double) ⇒ Double = ...)

    fun

    the weighting scheme to apply to the frequencies (defaults to identity)

Value Members

  1. final def !=(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  2. final def !=(arg0: Any): Boolean

    Definition Classes
    Any
  3. final def ##(): Int

    Definition Classes
    AnyRef → Any
  4. final def ==(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  5. final def ==(arg0: Any): Boolean

    Definition Classes
    Any
  6. final def andThen[C, L](est: LabelEstimator[Seq[(T, Double)], C, L], data: PipelineDataset[Seq[T]], labels: PipelineDataset[L]): Pipeline[Seq[T], C]

    Chains a label estimator onto the end of this pipeline, producing a new pipeline. If this pipeline has already been executed, it will not need to be fit again.

    est

    The estimator to chain onto the end of this pipeline

    data

    The training data to use (the estimator will be fit on the result of passing this data through the current pipeline)

    labels

    The labels to use when fitting the LabelEstimator. Must be zippable with the training data.

    Definition Classes
    Chainable
  7. final def andThen[C, L](est: LabelEstimator[Seq[(T, Double)], C, L], data: RDD[Seq[T]], labels: PipelineDataset[L]): Pipeline[Seq[T], C]

    Chains a label estimator onto the end of this pipeline, producing a new pipeline. If this pipeline has already been executed, it will not need to be fit again.

    est

    The estimator to chain onto the end of this pipeline

    data

    The training data to use (the estimator will be fit on the result of passing this data through the current pipeline)

    labels

    The labels to use when fitting the LabelEstimator. Must be zippable with the training data.

    Definition Classes
    Chainable
  8. final def andThen[C, L](est: LabelEstimator[Seq[(T, Double)], C, L], data: PipelineDataset[Seq[T]], labels: RDD[L]): Pipeline[Seq[T], C]

    Chains a label estimator onto the end of this pipeline, producing a new pipeline. If this pipeline has already been executed, it will not need to be fit again.

    est

    The estimator to chain onto the end of this pipeline

    data

    The training data to use (the estimator will be fit on the result of passing this data through the current pipeline)

    labels

    The labels to use when fitting the LabelEstimator. Must be zippable with the training data.

    Definition Classes
    Chainable
  9. final def andThen[C, L](est: LabelEstimator[Seq[(T, Double)], C, L], data: RDD[Seq[T]], labels: RDD[L]): Pipeline[Seq[T], C]

    Chains a label estimator onto the end of this pipeline, producing a new pipeline. If this pipeline has already been executed, it will not need to be fit again.

    est

    The estimator to chain onto the end of this pipeline

    data

    The training data to use (the estimator will be fit on the result of passing this data through the current pipeline)

    labels

    The labels to use when fitting the LabelEstimator. Must be zippable with the training data.

    Definition Classes
    Chainable
  10. final def andThen[C](est: Estimator[Seq[(T, Double)], C], data: PipelineDataset[Seq[T]]): Pipeline[Seq[T], C]

    Chains an estimator onto the end of this pipeline, producing a new pipeline. If this pipeline has already been executed, it will not need to be fit again.

    est

    The estimator to chain onto the end of this pipeline

    data

    The training data to use (the estimator will be fit on the result of passing this data through the current pipeline)

    Definition Classes
    Chainable
  11. final def andThen[C](est: Estimator[Seq[(T, Double)], C], data: RDD[Seq[T]]): Pipeline[Seq[T], C]

    Chains an estimator onto the end of this pipeline, producing a new pipeline. If this pipeline has already been executed, it will not need to be fit again.

    est

    The estimator to chain onto the end of this pipeline

    data

    The training data to use (the estimator will be fit on the result of passing this data through the current pipeline)

    Definition Classes
    Chainable
  12. final def andThen[C](next: Chainable[Seq[(T, Double)], C]): Pipeline[Seq[T], C]

    Chains a pipeline onto the end of this one, producing a new pipeline. If either this pipeline or the following has already been executed, it will not need to be fit again.

    next

    the pipeline to chain

    Definition Classes
    Chainable
  13. def apply(in: Seq[T]): Seq[(T, Double)]

    The application of this Transformer to a single input item. This method MUST be overridden by ML developers.

    in

    The input item to pass into this transformer

    returns

    The output value

    Definition Classes
    TermFrequency → Transformer
  14. def apply(in: RDD[Seq[T]]): RDD[Seq[(T, Double)]]

    The application of this Transformer to an RDD of input items. This method may optionally be overridden by ML developers.

    in

    The bulk RDD input to pass into this transformer

    returns

    The bulk RDD output for the given input

    Definition Classes
    Transformer
  15. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  16. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  17. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  18. def execute(deps: Seq[Expression]): Expression

    Definition Classes
    TransformerOperator → Operator
  19. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  20. val fun: (Double) ⇒ Double

    the weighting scheme to apply to the frequencies (defaults to identity)

  21. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  22. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  23. def label: String

    Definition Classes
    Operator
  24. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  25. final def notify(): Unit

    Definition Classes
    AnyRef
  26. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  27. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  28. def toPipeline: Pipeline[Seq[T], Seq[(T, Double)]]

    A method that converts this object into a Pipeline. Must be implemented by anything that extends Chainable.

    Definition Classes
    Transformer → Chainable
  29. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  30. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  31. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
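
The `andThen` and `apply` members listed above can be illustrated without KeystoneML or Spark. The sketch below is an analogy only: it uses the standard library's `Function1.andThen` in place of `Chainable.andThen`, and a plain `Seq` in place of an `RDD`. The names `ChainingSketch`, `tokenize`, `termFrequency`, `pipeline` and `applyAll` are hypothetical, and the sketch assumes that the bulk `apply` behaves like mapping the single-item `apply` over each element.

```scala
// Analogy only: plain Scala functions stand in for KeystoneML transformers.
object ChainingSketch {
  // Hypothetical upstream stage: split a document into lowercase tokens.
  val tokenize: String => Seq[String] =
    doc => doc.toLowerCase.split("\\s+").filter(_.nonEmpty).toSeq

  // Hypothetical stand-in for TermFrequency[String] with identity weighting.
  val termFrequency: Seq[String] => Seq[(String, Double)] =
    tokens =>
      tokens.groupBy(identity)
        .map { case (term, hits) => (term, hits.size.toDouble) }
        .toSeq

  // Function1.andThen composes stages left to right; here it plays the role
  // that Chainable.andThen plays when chaining pipeline stages.
  val pipeline: String => Seq[(String, Double)] = tokenize.andThen(termFrequency)

  // Bulk application (assumed element-wise): one output Seq per input document.
  def applyAll(docs: Seq[String]): Seq[Seq[(String, Double)]] = docs.map(pipeline)
}
```

In this sketch, `ChainingSketch.pipeline("the cat the hat")` yields the pairs ("the", 2.0), ("cat", 1.0) and ("hat", 1.0), in unspecified order.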
