NGramsHashingTF

Instance Constructors

new NGramsHashingTF(orders: Seq[Int], numFeatures: Int)

orders
Valid n-gram orders. Must be consecutive positive integers.
numFeatures
The desired size of the feature space that n-grams are hashed into using the hashing trick.
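
The two constructor parameters can be illustrated with a minimal, standard-library-only sketch of hashed n-gram term frequencies. This is not the library's implementation: `hashedNGramCounts` is a hypothetical helper, a plain `Map[Int, Double]` stands in for `SparseVector[Double]`, and the choice of `MurmurHash3.seqHash` to hash each n-gram is an assumption made for illustration.

```scala
import scala.util.hashing.MurmurHash3

// Hypothetical stand-in, NOT the library's code: counts every n-gram of the
// requested orders and folds the counts into `numFeatures` hashed buckets.
def hashedNGramCounts(
    tokens: Seq[String],
    orders: Seq[Int],
    numFeatures: Int): Map[Int, Double] = {
  require(numFeatures > 0, "numFeatures must be positive")
  require(orders.nonEmpty && orders.head > 0, "orders must be positive")
  require(
    orders.zip(orders.tail).forall { case (a, b) => b == a + 1 },
    "orders must be consecutive")

  val indices: Seq[Int] = orders.flatMap { n =>
    tokens
      .sliding(n)
      .filter(_.length == n) // drop the partial window produced by short inputs
      .map { gram =>
        // Assumption: hash the whole n-gram, then map it to a bucket index.
        val raw = MurmurHash3.seqHash(gram) % numFeatures
        if (raw < 0) raw + numFeatures else raw
      }
      .toSeq
  }

  // Term frequency per bucket; colliding n-grams share a bucket.
  indices.groupBy(identity).map { case (i, hits) => i -> hits.size.toDouble }
}

val counts = hashedNGramCounts(Seq("a", "b", "a", "b"), Seq(1, 2), 1024)
```

With four tokens and orders 1 and 2, the sketch counts four unigrams and three bigrams, so the bucket counts always sum to 7 regardless of collisions.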

Value Members

final def !=(arg0: AnyRef): Boolean

Definition Classes
AnyRef
final def !=(arg0: Any): Boolean

Definition Classes
Any
final def ##(): Int

Definition Classes
AnyRef → Any
final def ==(arg0: AnyRef): Boolean

Definition Classes
AnyRef
final def ==(arg0: Any): Boolean

Definition Classes
Any
final def andThen[C, L](est: LabelEstimator[SparseVector[Double], C, L], data: PipelineDataset[Seq[String]], labels: PipelineDataset[L]): Pipeline[Seq[String], C]

Chains a label estimator onto the end of this pipeline, producing a new pipeline. If this pipeline has already been executed, it will not need to be fit again.
est
The estimator to chain onto the end of this pipeline
data
The training data to use (the estimator will be fit on the result of passing this data through the current pipeline)
labels
The labels to use when fitting the LabelEstimator. Must be zippable with the training data.

Definition Classes
Chainable
final def andThen[C, L](est: LabelEstimator[SparseVector[Double], C, L], data: RDD[Seq[String]], labels: PipelineDataset[L]): Pipeline[Seq[String], C]

Chains a label estimator onto the end of this pipeline, producing a new pipeline. If this pipeline has already been executed, it will not need to be fit again.
est
The estimator to chain onto the end of this pipeline
data
The training data to use (the estimator will be fit on the result of passing this data through the current pipeline)
labels
The labels to use when fitting the LabelEstimator. Must be zippable with the training data.

Definition Classes
Chainable
final def andThen[C, L](est: LabelEstimator[SparseVector[Double], C, L], data: PipelineDataset[Seq[String]], labels: RDD[L]): Pipeline[Seq[String], C]

Chains a label estimator onto the end of this pipeline, producing a new pipeline. If this pipeline has already been executed, it will not need to be fit again.
est
The estimator to chain onto the end of this pipeline
data
The training data to use (the estimator will be fit on the result of passing this data through the current pipeline)
labels
The labels to use when fitting the LabelEstimator. Must be zippable with the training data.

Definition Classes
Chainable
final def andThen[C, L](est: LabelEstimator[SparseVector[Double], C, L], data: RDD[Seq[String]], labels: RDD[L]): Pipeline[Seq[String], C]

Chains a label estimator onto the end of this pipeline, producing a new pipeline. If this pipeline has already been executed, it will not need to be fit again.
est
The estimator to chain onto the end of this pipeline
data
The training data to use (the estimator will be fit on the result of passing this data through the current pipeline)
labels
The labels to use when fitting the LabelEstimator. Must be zippable with the training data.

Definition Classes
Chainable
final def andThen[C](est: Estimator[SparseVector[Double], C], data: PipelineDataset[Seq[String]]): Pipeline[Seq[String], C]

Chains an estimator onto the end of this pipeline, producing a new pipeline. If this pipeline has already been executed, it will not need to be fit again.
est
The estimator to chain onto the end of this pipeline
data
The training data to use (the estimator will be fit on the result of passing this data through the current pipeline)

Definition Classes
Chainable
final def andThen[C](est: Estimator[SparseVector[Double], C], data: RDD[Seq[String]]): Pipeline[Seq[String], C]

Chains an estimator onto the end of this pipeline, producing a new pipeline. If this pipeline has already been executed, it will not need to be fit again.
est
The estimator to chain onto the end of this pipeline
data
The training data to use (the estimator will be fit on the result of passing this data through the current pipeline)

Definition Classes
Chainable
final def andThen[C](next: Chainable[SparseVector[Double], C]): Pipeline[Seq[String], C]

Chains a pipeline onto the end of this one, producing a new pipeline. If either this pipeline or the following has already been executed, it will not need to be fit again.
next
The pipeline to chain onto the end of this one

Definition Classes
Chainable
def apply(line: Seq[String]): SparseVector[Double]

The application of this Transformer to a single input item. This method MUST be overridden by ML developers.
returns
The output value

Definition Classes
NGramsHashingTF → Transformer
def apply(in: RDD[Seq[String]]): RDD[SparseVector[Double]]

The application of this Transformer to an RDD of input items. This method may optionally be overridden by ML developers.
in
The bulk RDD input to pass into this transformer
returns
The bulk RDD output for the given input

Definition Classes
Transformer
final def asInstanceOf[T0]: T0

Definition Classes
Any
def clone(): AnyRef

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( ... )
final def eq(arg0: AnyRef): Boolean

Definition Classes
AnyRef
def execute(deps: Seq[Expression]): Expression

Definition Classes
TransformerOperator → Operator
def finalize(): Unit

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( classOf[java.lang.Throwable] )
final def finalizeHash(hash: Int, length: Int): Int

Finalize a hash to incorporate the length and make sure all bits avalanche.
final def getClass(): Class[_]

Definition Classes
AnyRef → Any
final def isInstanceOf[T0]: Boolean

Definition Classes
Any
def label: String

Definition Classes
Operator
final def mix(hash: Int, data: Int): Int

Mixes a block of data into an intermediate hash value.
final def mixLast(hash: Int, data: Int): Int

May optionally be used as the last mixing step. It is slightly faster than mix because it does no further mixing of the resulting hash. That extra mixing is unnecessary for the last element, since the hash is thoroughly mixed during finalization anyway.
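
The three hashing members above share names and signatures with members of the standard library's `scala.util.hashing.MurmurHash3`. Assuming they behave the same way (an assumption, not something this page states), the following sketch shows how `mix`, `mixLast`, and `finalizeHash` are typically combined to hash a token sequence. `hashTokens` is a hypothetical helper written for illustration.

```scala
import scala.util.hashing.MurmurHash3

// Hypothetical helper: hashes a token sequence with the standard MurmurHash3
// building blocks. Assumes the members documented above behave like these.
def hashTokens(tokens: Seq[String]): Int = {
  val n = tokens.length
  var h = MurmurHash3.seqSeed
  var i = 0
  while (i < n) {
    val data = tokens(i).hashCode
    // mixLast skips the extra mixing for the final element, because
    // finalizeHash mixes the result thoroughly anyway.
    h = if (i == n - 1) MurmurHash3.mixLast(h, data) else MurmurHash3.mix(h, data)
    i += 1
  }
  // Incorporate the length and avalanche all bits.
  MurmurHash3.finalizeHash(h, n)
}

val bigramHash = hashTokens(Seq("new", "york"))
```

The result depends only on the token contents and their order, so equal sequences always hash to the same value.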
final def ne(arg0: AnyRef): Boolean

Definition Classes
AnyRef
def nonNegativeMod(x: Int, mod: Int): Int
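
This member carries no description on this page. Judging by its name and signature, it presumably maps a possibly negative hash value to a valid index in `[0, mod)`. The sketch below shows one common way to do that and is an assumption about the intent, not the library's code; the name `nonNegativeModSketch` is hypothetical.

```scala
// Hypothetical sketch: Scala's % keeps the sign of the dividend, so a
// negative hash would otherwise produce a negative (invalid) index.
def nonNegativeModSketch(x: Int, mod: Int): Int = {
  val rawMod = x % mod
  if (rawMod < 0) rawMod + mod else rawMod
}

val wrapped = nonNegativeModSketch(-7, 5) // -7 % 5 is -2, shifted up to 3
```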
final def notify(): Unit

Definition Classes
AnyRef
final def notifyAll(): Unit

Definition Classes
AnyRef
val numFeatures: Int

The desired size of the feature space that n-grams are hashed into using the hashing trick.
val orders: Seq[Int]

Valid n-gram orders. Must be consecutive positive integers.
final val seqSeed: Int
final def synchronized[T0](arg0: ⇒ T0): T0

Definition Classes
AnyRef
def toPipeline: Pipeline[Seq[String], SparseVector[Double]]

A method that converts this object into a Pipeline. Must be implemented by anything that extends Chainable.

Definition Classes
Transformer → Chainable
final def wait(): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long, arg1: Int): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )

case class NGramsHashingTF(orders: Seq[Int], numFeatures: Int) extends Transformer[Seq[String], SparseVector[Double]] with Product with Serializable

Inherited from Product

Inherited from Equals

Inherited from Transformer[Seq[String], SparseVector[Double]]

Inherited from Chainable[Seq[String], SparseVector[Double]]

Inherited from TransformerOperator

Inherited from Serializable

Inherited from Serializable

Inherited from Operator

Inherited from AnyRef

Inherited from Any
