Following up on our previous post, Andrew Wilson writes:
I agree we are in a really exciting time for statistics and machine learning. There has been a lot of talk lately comparing machine learning with statistics. I am curious whether you think there are many fundamental differences between the fields, or just superficial differences — different popular approximate inference methods, slightly different popular application areas, etc. Is machine learning a subset of statistics?
In the paper we discuss how we think machine learning is fundamentally about pattern discovery, and ultimately, fully automating the learning and decision making process. In other words, whatever a human does when he or she uses tools to analyze data can be written down algorithmically and automated on a computer. I am not sure if the ambitions are similar in statistics — and I don’t have any conventional statistics background, which makes it harder to tell. I think it’s an interesting discussion.
My reply:
I don’t know enough about machine learning to know what differences there are between the fields. One of my sayings is that theoretical statistics is another name for the theory of applied statistics. That is, statistics is all about modeling what we do, and modeling what we should be doing. As always in the social sciences, normative modeling has a descriptive flavor and descriptive modeling has a normative flavor: to the extent that we’re not doing what we say we should be doing, this suggests potential changes in our theory or in our practice. And much of my work over the years has been to give theoretical foundations for various areas of statistical practice that have typically been treated informally.
Thus, compared to other academic statisticians, I think…