Tuesday, January 12, 2021

AI is mainly curve fitting

 


Scientists have been plotting noisy data and drawing lines through the points for hundreds of years. Many of the algorithms at the core of machine learning do just that: they take some data and draw a line through it. Much of the advancement has come from finding ways to break the problem into thousands, millions, or maybe even billions of little problems and then drawing lines through all of them. It’s not magic; it’s just an assembly line for how we’ve been doing science for centuries. People who don’t like AI and find it easy to poke holes in its decisions focus on the fact that there’s often no deep theory or philosophical scaffolding to lend credibility to the answer. It’s just a guesstimate for the slope of some line.
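To make the "drawing a line through noisy points" idea concrete, here is a minimal sketch using ordinary least squares. The data is synthetic and the helper name `fit_line` is made up for illustration; real machine learning systems fit far more parameters, but the core move is the same.

```python
import numpy as np

def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line through the points."""
    slope, intercept = np.polyfit(x, y, deg=1)
    return float(slope), float(intercept)

# Synthetic noisy data scattered around the line y = 2x + 1 (an assumption
# chosen purely for illustration).
rng = np.random.default_rng(seed=0)
x = np.linspace(0.0, 10.0, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=1.0, size=x.size)

slope, intercept = fit_line(x, y)
print(f"fitted slope = {slope:.3f}, fitted intercept = {intercept:.3f}")
```

The fitted slope lands near 2 but not exactly on it, which is the "guesstimate for the slope of some line" in miniature.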

Everyone who starts studying data science soon realizes that there’s not much time for science because finding the data is the real job. AI is a close cousin to data science and it has the same challenges. It’s 0.01% inspiration and 99.99% perspiration over file formats, missing data fields, and character codes.

Some answers are easy to find, but deeper, more complex answers often require more and more data. Sometimes the amount of data needed rises exponentially as the question gets more complex. AI can leave you with an insatiable appetite for more and more bits.
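One common way the appetite grows exponentially, assuming "more complex" means "more input features", is that covering every combination of feature values takes a number of examples that multiplies with each feature added. The sketch below just counts the cells in a grid; the function name `cells_needed` and the choice of 10 bins per feature are hypothetical.

```python
def cells_needed(bins_per_feature, n_features):
    """Count the grid cells when each feature is split into the same number of bins."""
    return bins_per_feature ** n_features

# With 10 bins per feature, each added feature multiplies the cell count by 10.
for n_features in (1, 2, 5, 10):
    print(n_features, "features ->", cells_needed(10, n_features), "cells to cover")
```

Seeing even one example per cell with 10 features would already take 10 billion data points.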

AI researchers have lately been devoting more time to explaining just what the AI is doing. We can dig into the data and discover that the trained model relies heavily on a few parameters that come from a particular corner of the data set. Often, though, the explanations are like those offered by magicians who explain one trick by performing another. Answering the question “why?” is surprisingly hard. You can look at the simplest linear models and stare at the parameters, but often you’ll be left scratching your head. If the model says to multiply the number of miles driven each year by a factor of 0.043255, you might wonder why not 0.043256 or 0.7, or maybe something outrageously different like 411 or 10 billion. Once you’re working on a continuum, any of the numbers along the axis might be right.

It’s like the old model where the Earth was just sitting on a giant turtle. And where does this turtle stand? On the back of another turtle. And where does the next one stand? It’s turtles all the way down.
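The puzzle over 0.043255 versus 0.043256 can be seen by refitting the same linear model on resampled data. This is a rough sketch on synthetic numbers: the miles-driven data, the noise level, and the helper name `bootstrap_slopes` are all assumptions made for illustration, not anything from a real model.

```python
import numpy as np

def bootstrap_slopes(x, y, n_resamples=200, seed=0):
    """Refit a least-squares line on resampled data and collect the slopes."""
    rng = np.random.default_rng(seed)
    n = len(x)
    slopes = np.empty(n_resamples)
    for i in range(n_resamples):
        # Draw n row indices with replacement, then refit on those rows.
        idx = rng.integers(0, n, size=n)
        slopes[i] = np.polyfit(x[idx], y[idx], deg=1)[0]
    return slopes

# Synthetic data: an outcome that depends on miles driven per year with a
# "true" factor of 0.043255, plus noise.
rng = np.random.default_rng(seed=1)
miles = rng.uniform(5_000, 20_000, size=100)
outcome = 0.043255 * miles + rng.normal(scale=50.0, size=miles.size)

slopes = bootstrap_slopes(miles, outcome)
print(f"slopes range from {slopes.min():.6f} to {slopes.max():.6f}")
```

Each resample yields a slightly different factor, and the spread is much wider than the gap between 0.043255 and 0.043256, so the data alone cannot say which of those two is "right".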

