Data Analysis, Statistics and Probability (physics.data-an)

  • PDF
    We introduce a statistical physics inspired supervised machine learning algorithm for classification and regression problems. The method is based on the invariances or stability of predicted results when known data is represented as expansions in terms of various stochastic functions. The algorithm predicts the classification/regression values of new data by combining (via voting) the outputs of these numerous linear expansions in randomly chosen functions. The few parameters (typically only one parameter is used in all studied examples) that this model has may be automatically optimized. The algorithm has been tested on 10 diverse training data sets of various types and feature space dimensions. It has been shown to consistently exhibit high accuracy and readily allow for optimization of parameters, while simultaneously avoiding pitfalls of existing algorithms such as those associated with class imbalance. We very briefly speculate on whether spatial coordinates in physical theories may be viewed as emergent "features" that enable a robust machine learning type description of data with generic low order smooth functions.

Recent comments

Noon van der Silk Jan 27 2016 03:39 UTC

Great institute name ...

Chris Granade Sep 22 2015 19:15 UTC

Thank you for the kind comments, I'm glad that our paper, source code, and tutorial are useful!

Travis Scholten Sep 21 2015 17:05 UTC

This was a really well-written paper! Am very glad to see this kind of work being done.

In addition, the openness about source code is refreshing. By explicitly relating the work to [QInfer](https://github.com/csferrie/python-qinfer), this paper makes it more easy to check the authors' work. Furthe

...(continued)
Chris Granade Sep 15 2015 02:40 UTC

As a quick addendum, please note that the [supplementary video](https://www.youtube.com/watch?v=22ejRV0Kx2g) for this work is available [on YouTube](https://www.youtube.com/watch?v=22ejRV0Kx2g). Thank you!