Advice from Hal Varian to Econ Grad Students
From an interesting and challenging article by Hal R. Varian:
In fact, my standard advice to graduate students these days is go to the computer science department and take a class in machine learning.
He gives interesting examples of techniques for analysing big data and discusses their relevance to economics. He explains:
Google has seen 30 trillion URLs, crawls over
20 billion of those a day, and answers 100 billion search queries a month... At Google, for
example, I have found that random samples on the order of 0.1 percent work fine
for analysis of business data. (p. 3)
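To get a feel for why such a small sample can be enough, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the "business data" is a synthetic population of one million normally distributed values that I made up, and the names (population, sample_fraction and so on) are hypothetical, not anything from Varian's article or from Google.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Assumption: synthetic "business data", one million made-up order values.
population = [rng.gauss(100.0, 20.0) for _ in range(1_000_000)]

# 0.1 percent of the rows, the order of magnitude mentioned in the quote.
sample_size = len(population) // 1000
sample = rng.sample(population, sample_size)  # simple random sample, no replacement

full_mean = statistics.fmean(population)
sample_mean = statistics.fmean(sample)

# Estimated standard error of the sample mean: s / sqrt(n).
std_error = statistics.stdev(sample) / sample_size ** 0.5

print(f"rows in full data : {len(population)}")
print(f"rows in sample    : {sample_size}")
print(f"mean (full data)  : {full_mean:.2f}")
print(f"mean (sample)     : {sample_mean:.2f} +/- {std_error:.2f}")
```

The precision of the sample mean depends on the number of sampled rows rather than on the sampling fraction, so 0.1 percent of a very large table can still contain plenty of observations. Whether that holds for a given analysis depends on the data, for example on how rare the events of interest are.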
And
An important insight from machine learning is that averaging over many small models tends to give better out-of-sample prediction than choosing a single model. (p. 24)
An example
In 2006, Netflix offered a million dollar prize to researchers who could provide
the largest improvement to their existing movie recommendation system. The
winning submission involved a “complex blending of no fewer than 800 models,”
though they also point out that “predictions of good quality can usually be obtained
by combining a small number of judiciously chosen methods” (Feuerverger, He,
and Khatri 2012). It also turned out that a blend of the best- and second-best submissions outperformed either of them.
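The averaging idea can be tried out on a toy problem. The sketch below is my own illustration, not the Netflix method or anything from the article: it assumes a made-up data-generating process (a straight line plus noise), fits 50 small least-squares lines on random subsamples of 10 training points each, and compares the out-of-sample error of the averaged prediction with the errors of the individual models. All names and parameter values are hypothetical.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

def make_data(n):
    """Assumed toy process: y = 2x + 1 plus Gaussian noise."""
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0.0, 3.0) for x in xs]
    return xs, ys

def fit_line(xs, ys):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def mse(preds, ys):
    """Mean squared error of predictions against observed values."""
    return statistics.fmean([(p - y) ** 2 for p, y in zip(preds, ys)])

train_x, train_y = make_data(200)
test_x, test_y = make_data(1000)  # held-out data for out-of-sample error

# Fit many small models, each on its own random subsample of the training set.
n_models, subsample = 50, 10
models = []
for _ in range(n_models):
    idx = rng.sample(range(len(train_x)), subsample)
    models.append(fit_line([train_x[i] for i in idx], [train_y[i] for i in idx]))

# Out-of-sample predictions of every model, and of their plain average.
all_preds = [[a + b * x for x in test_x] for a, b in models]
single_mses = [mse(p, test_y) for p in all_preds]
blend_preds = [statistics.fmean(col) for col in zip(*all_preds)]
blend_mse = mse(blend_preds, test_y)

print(f"average single-model MSE: {statistics.fmean(single_mses):.3f}")
print(f"best single-model MSE   : {min(single_mses):.3f}")
print(f"blended (averaged) MSE  : {blend_mse:.3f}")
```

Because squared error is convex, the averaged prediction can never have a higher mean squared error than the average of the individual models' errors. It is not guaranteed to beat the best single model, and picking that best model using the test data would itself be a form of overfitting.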
Good reading suggestions are in the final summary of the article.