To many people, the current applications of artificial intelligence, like your car being able to detect where a lane ends, seem magical. Although most of the significant advances in AI have been in supervised learning, it is the idea that the computer is making sense of the world on its own — unsupervised — which intrigues people more.
If you’ve read a bit about artificial intelligence, you have probably seen a distinction drawn between supervised and unsupervised machine learning. (There are other categories too, but these are the big two.)
In supervised learning, the machine is taught by humans what is right or wrong — for example, who did or did not default on a loan — and it eventually figures out what characteristics would best predict a default.
Another example is asking the computer to identify whether a picture shows a dog or a cat. In supervised learning, a person labels each picture and then the computer figures out the best way to distinguish between the two — perhaps whether the animal has floppy ears 😉
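To make the supervised part concrete, here is a minimal sketch in Python. The data, the feature (a debt-to-income ratio), and the numbers are all hypothetical; the point is that a human supplies every label, and the machine merely searches for the rule that best fits those labels.

```python
# Hypothetical labeled training data: (debt-to-income ratio, defaulted?)
# The labels come from a person -- that is what makes this "supervised".
labeled = [(0.10, False), (0.15, False), (0.20, False),
           (0.35, True), (0.45, True), (0.60, True)]

def learn_threshold(examples):
    """Pick the cutoff that misclassifies the fewest labeled examples."""
    candidates = sorted(ratio for ratio, _ in examples)
    best_cut, best_errors = None, len(examples) + 1
    for cut in candidates:
        errors = sum((ratio >= cut) != defaulted
                     for ratio, defaulted in examples)
        if errors < best_errors:
            best_cut, best_errors = cut, errors
    return best_cut

cutoff = learn_threshold(labeled)
print(cutoff)          # → 0.35, the learned decision boundary
print(0.50 >= cutoff)  # → True: predicts a 0.50 ratio would default
```

A real system would use many features and a far more sophisticated model, but the division of labor is the same: humans provide the answers, the machine finds the pattern.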
Even when the machine gets quite good at this task, the underlying model — the set of characteristics that actually drive its predictions — is often opaque. Indeed, one of the hot issues in analytics and machine learning these days is how humans can uncover and almost “reverse engineer” the model the machine is using.
In unsupervised learning, the computer has to figure out for itself how to divide a group of pictures or events or whatever into various categories. Then the next step is for the human to figure out what those categories mean. Since the categories are subject to interpretation, there is no single accurate and useful way to describe them, although people try. That’s how we get psychographic categories in marketing and similar labels, like “soccer moms”.
Sometimes the results are easy for humans to figure out, but not exactly earth shattering, like in this cartoon.
When a computer is given a set of pictures of cats and dogs and asked to determine the distinguishing characteristics, we (people) would hope that the computer would figure out that there are dogs and cats. But it might instead classify them by size (small animals versus big animals) or by the colors of the animals.
This all sounds like it is unsupervised. Anything useful that the computer determines is thus part of the magic.
How Unsupervised Is Unsupervised Machine Learning?
Except that in some unsupervised techniques, especially cluster analysis, a person is asked to specify how many clusters or groups there might be. That too limits, and supervises, the machine’s learning. (Think about how much easier it is to be right on Who Wants To Be A Millionaire if the contestant can narrow the choices down to two.)
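Here is a toy illustration of that hidden supervision: a bare-bones one-dimensional k-means clustering sketch in pure Python. The animal weights are invented for illustration, and the seeding is deliberately crude; the thing to notice is the `k=2` parameter, which a person must choose before the “unsupervised” algorithm can run at all.

```python
# Minimal 1-D k-means sketch (illustrative only, not production code).
def kmeans_1d(points, k, iterations=20):
    # Crude seeding: pick k evenly spaced values from the sorted data.
    centers = sorted(points)[::max(1, len(points) // k)][:k]
    for _ in range(iterations):
        # Assign each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# Hypothetical animal weights in kg: the machine may split these by
# size rather than by species -- it has no idea what a "cat" is.
weights = [3.5, 4.2, 4.8, 25.0, 28.0, 31.0]
centers, clusters = kmeans_1d(weights, k=2)  # k is chosen by a person
print(sorted(round(c, 1) for c in centers))  # → [4.2, 28.0]
```

Changing `k` to 3 would force the algorithm to invent a third group whether or not one really exists in the data, which is exactly the kind of quiet human steering the heading above asks about.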
Even more important, the computer can only learn from the data that it is given. It would have problems if pictures of a bunch of elephants or firetrucks were later thrown into the mix. Thus, the human being is at least partially supervising the learning and certainly limiting it. The machine’s model is subject to the limitations and biases of the data that it learned on.
Truly unsupervised learning would occur the way that it does for children. They are let out to observe the world and learn patterns, often without any direct assistance from anyone else. Even with over-scheduling by helicopter parents, children can often freely roam the earth and discover new data and experiences.
Similarly, for machines to learn in a truly unsupervised way, they would have to be able to travel and process the data they encounter.
At the beginning of his book Life 3.0, Max Tegmark weaves a sci-fi tale about a team that built an AI called Prometheus. While the story wasn’t directly focused on unsupervised classification, Prometheus was unsupervised and learned on its own. It eventually learned enough to dominate all mankind. But even in this fantasy world, its unsupervised escape only enabled the AI machine to roam the internet, which is not quite the same thing as real life after all.
It is likely, for a while longer, that a significant portion of human behavior will occur outside of the internet 🙂
(And, as we saw with Microsoft’s chatbot Tay, an AI can also learn some unfortunate and incorrect things on the open internet.)
While not quite letting robots roam free in the real world, researchers at Stanford University’s Vision and Learning Lab “have developed iGibson, a realistic, large-scale, and interactive virtual environment within which a robot model can explore, navigate, and perform tasks.” (More about this at A Simulated Playground for Robots)
A few years ago there was HitchBOT, a robot that hitchhiked its way around with the help of strangers, although I don’t think that it added to its knowledge along the way, and it eventually met up with some nasty humans. (For more see here and here.)
Perhaps self-driving cars or walking robots will eventually be able to see the world freely as we do. Ford Motor Company’s proposed delivery robot roams around, but it is not really equipped for learning. The traveling, learning machine will likely require a lot more computing power and time than we currently use in machine learning.
Of course, there is also work on the computing part of the problem, as this July 21st headline shows: “Machines Can Learn Unsupervised ‘At Speed Of Light’ After AI Breakthrough, Scientists Say.” But that addresses only the computing part of the problem, not the roaming-around-the-world part.
These more recent projects are evidence that AI researchers realize their models are not being built in a truly unsupervised way. Despite the hoped-for progress of these projects, for now data scientists need to be careful about how they train and supervise a machine, even in unsupervised learning mode.
© 2020 Norman Jacknis, All Rights Reserved