
Benchmarks

We have benchmarked KeystoneML against state-of-the-art performance achieved by other learning systems on a number of end-to-end benchmarks. To make these comparisons, we faithfully recreated the learning pipelines as described by the benchmark authors and ran them on Amazon c2.4xlarge machines. The intent of these benchmarks is to show that end-to-end applications can be written and executed efficiently in KeystoneML.

Here we report the time taken by KeystoneML, the time taken by the competing system, the number of CPU cores (or number of GPUs) allocated to each system, and the total wall-clock speedup over the reported time (reported time divided by KeystoneML time; for ImageNet, 5760 m / 270 m ≈ 21x). By efficiently leveraging cluster resources, KeystoneML is able to run tasks an order of magnitude faster than highly specialized single-node systems. Meanwhile, on the TIMIT task, we're able to match the state-of-the-art accuracy (and nearly match the runtime) of an IBM Blue Gene supercomputer while using a fraction of the resources.

Each of the example pipelines below can be found in the KeystoneML source code.
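For a flavor of what these pipelines look like, here is a condensed sketch of the Amazon Reviews text-classification pipeline in KeystoneML's Scala API, modeled on the example in the KeystoneML repository. Treat the import paths, exact operator signatures, and the trainData/testData inputs as approximations; the complete, runnable version lives in the source tree linked above.

```scala
import nodes.learning.NaiveBayesEstimator
import nodes.nlp.{LowerCase, NGramsFeaturizer, Tokenizer, Trim}
import nodes.stats.TermFrequency
import nodes.util.{CommonSparseFeatures, MaxClassifier}

// Assumed inputs: trainData.data is an RDD[String] of review text,
// trainData.labels an RDD[Int] of class labels (2 classes here).
val predictor = Trim andThen
  LowerCase() andThen
  Tokenizer() andThen
  NGramsFeaturizer(1 to 2) andThen                        // uni- and bigram features
  TermFrequency(x => 1) andThen                           // binary term counts
  (CommonSparseFeatures(100000), trainData.data) andThen  // keep the 100k most common features
  (NaiveBayesEstimator(2), trainData.data, trainData.labels) andThen
  MaxClassifier                                           // emit the highest-scoring class

// Applying the fit pipeline to held-out reviews yields predicted labels.
val predictions = predictor(testData.data)
```

Note how estimators such as CommonSparseFeatures and NaiveBayesEstimator are chained with their training data via andThen, so a single expression describes both fitting and prediction; this whole-pipeline view is what lets KeystoneML optimize execution end to end.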

| Dataset | KeystoneML Accuracy | Reported Accuracy | KeystoneML Time (m) | Reported Time (m) | KeystoneML CPU Cores | Reported CPU Cores | Speedup Over Reported |
|---------|---------------------|-------------------|---------------------|-------------------|----------------------|--------------------|-----------------------|
| Amazon Reviews [1] | 91.6% | N/A | 3.3 | N/A | 256 | N/A | N/A |
| TIMIT [2] | 66.1% | 66.3% | 138 | 120 | 512 | 4096 | 0.87x |
| ImageNet [3] | 67.4% | 66.6% | 270 | 5760 | 800 | 16 | 21x |
| VOC [4] | 57.2% | 59.2% | 7 | 87 | 256 | 16 | 12x |


Additionally, we've tested KeystoneML's scalability on clusters with hundreds of nodes and thousands of cores. Here, we show the speedup of three pipelines (Amazon Reviews, TIMIT, and ImageNet) relative to their runtime on 8 nodes; ideal (linear) speedup is shown with the dotted line. KeystoneML is able to achieve near-linear speedup as we add more nodes because of its use of communication-avoiding algorithms during featurization and model training.

[Figure: KeystoneML scaling with cluster size]
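To make the plot's quantities concrete, the snippet below computes speedup relative to the 8-node baseline and the ideal linear reference line. The runtimes in the map are hypothetical placeholders, not measured results.

```scala
// Hypothetical per-cluster-size runtimes in minutes; the 8-node entry is the baseline.
val runtimes = Map(8 -> 800.0, 16 -> 410.0, 32 -> 215.0, 64 -> 120.0, 128 -> 70.0)

val baseline = runtimes(8)
runtimes.toSeq.sortBy(_._1).foreach { case (nodes, minutes) =>
  val speedup = baseline / minutes // measured speedup over the 8-node run
  val ideal   = nodes / 8.0        // the dotted "ideal" linear-scaling line
  println(f"$nodes%3d nodes: $speedup%5.2fx (ideal $ideal%5.2fx)")
}
```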

1. C. Manning and D. Klein. Optimization, Maxent Models, and Conditional Estimation Without Magic. In HLT-NAACL 2003, Tutorial Volume 5.
2. P.-S. Huang, H. Avron, T. N. Sainath, V. Sindhwani, and B. Ramabhadran. Kernel Methods Match Deep Neural Networks on TIMIT. In ICASSP, pages 205–209. IEEE, 2014.
3. J. Sanchez, F. Perronnin, T. Mensink, and J. Verbeek. Image Classification with the Fisher Vector: Theory and Practice. International Journal of Computer Vision, 105(3):222–245, 2013.
4. K. Chatfield, V. Lempitsky, A. Vedaldi, and A. Zisserman. The Devil is in the Details: An Evaluation of Recent Feature Encoding Methods. In British Machine Vision Conference, 2011.