THE ROBOTS might not take over the world any time soon but they will probably kick your n00b a**e in Starcraft II, much like Google's DeepMind AlphaStar AI has done.
The AI achieved the level of 'Grandmaster' in Blizzard's real-time strategy game by ranking above 99.8 per cent of the active Starcraft II players on the Battle.net platform.
While the best human player can make hundreds of 'actions per minute' in Starcraft II, the nature of AIs means they could make thousands of actions in the same time frame. Coupled with the potential to process more digital information than a fleshy human brain, one could argue that an AI is always going to have an advantage over a human player.
But DeepMind constrained AlphaStar to a form of 'camera view' whereby it would only see what a human Starcraft II player would see on their screen. Its actions were also capped at 22 every five seconds, thereby bringing the AI more in line with the capabilities of a human.
Furthermore, while DeepMind did announce to Starcraft II players that its AI would be out in the wild for them to play against, the AI wasn't allowed to reveal it wasn't a real boy. This supposedly helped prevent human players from trying to exploit the system to deliberately bork the AI.
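For the curious, the action cap mentioned above can be pictured as a simple rate limiter. What follows is a minimal sketch, not DeepMind's code: the `ActionLimiter` class and its `try_act` method are made-up names, and the sliding-window behaviour is an assumption, since all we know is the figure of 22 actions per five seconds.

```python
from collections import deque


class ActionLimiter:
    """Hypothetical sliding-window cap: at most max_actions per window_seconds."""

    def __init__(self, max_actions=22, window_seconds=5.0):
        self.max_actions = max_actions
        self.window_seconds = window_seconds
        self._times = deque()  # timestamps of actions that were allowed

    def try_act(self, now):
        """Return True and record the action if it fits in the current window."""
        # Forget actions that have aged out of the window.
        while self._times and now - self._times[0] >= self.window_seconds:
            self._times.popleft()
        if len(self._times) < self.max_actions:
            self._times.append(now)
            return True
        return False  # over budget: the agent has to wait
```

An agent that mashes buttons ten times a second would get its first 22 actions through and then be refused until the oldest ones drop out of the five-second window.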
Do it human-style
To stand a chance of giving skilled Starcraft II players a kicking, DeepMind had AlphaStar learn to play like the humans it was designed to challenge.
"We chose to use general-purpose machine learning techniques - including neural networks, self-play via reinforcement learning, multi-agent learning, and imitation learning - to learn directly from game data with general-purpose techniques," explained DeepMind.
"Using the advances described in our Nature paper, AlphaStar was ranked above 99.8 per cent of active players on Battle.net, and achieved a Grandmaster level for all three StarCraft II races: Protoss, Terran, and Zerg. We expect these methods could be applied to many other domains."
Learning how humans play Starcraft II was a means of letting the AI come up with effective tactics beyond those it had learnt playing against itself.
DeepMind explained that when the AI agents that make up AlphaStar played only against themselves in a "self-play" situation, they'd maximise their wins, which helped them improve but caused them to "forget" certain tactics as others became effective.
"For example, in the game rock-paper-scissors, an agent may currently prefer to play rock over other options. As self-play progresses, a new agent will then choose to switch to paper, as it wins against rock," DeepMind explained.
"Later, the agent will switch to scissors, and eventually back to rock, creating a cycle. Fictitious self-play - playing against a mixture of all previous strategies - is one solution to cope with this challenge."
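DeepMind's rock-paper-scissors example is small enough to try at home. The toy sketch below is an illustration only and has nothing to do with AlphaStar's actual training code; the function names are invented for the purpose. It contrasts naive self-play, where each new agent answers only the latest one, with fictitious self-play, where each new agent answers the mixture of everything played so far.

```python
from collections import Counter

MOVES = ["rock", "paper", "scissors"]
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}  # key beats value
COUNTER = {loser: winner for winner, loser in BEATS.items()}  # move that beats key


def payoff(mine, theirs):
    """+1 for a win, -1 for a loss, 0 for a draw."""
    if mine == theirs:
        return 0
    return 1 if BEATS[mine] == theirs else -1


def naive_self_play(start, rounds):
    """Each new agent best-responds to the most recent agent only."""
    history = [start]
    for _ in range(rounds):
        history.append(COUNTER[history[-1]])
    return history


def fictitious_self_play(start, rounds):
    """Each new agent best-responds to the mix of all previous agents."""
    history = [start]
    for _ in range(rounds):
        counts = Counter(history)
        # Total payoff of each move against everything played so far;
        # ties go to whichever move comes first in MOVES.
        best = max(
            MOVES,
            key=lambda m: sum(payoff(m, o) * n for o, n in counts.items()),
        )
        history.append(best)
    return history
```

Calling `naive_self_play("rock", 3)` goes rock, paper, scissors and straight back to rock, which is the cycle DeepMind describes. The fictitious version has to weigh up every past strategy before picking its next one, so old tactics stay in the reckoning rather than being forgotten.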
By introducing "fictitious self-play" into the mix, where agents play against each other not solely to win but to highlight and exploit weaknesses in another agent's strategy, AlphaStar effectively learns as a human player does when competing against other humans and training with friends.
At the same time, the AI also had to learn existing human strategies and bring the whole lot together to take on real people in the Starcraft II league.
"Learning human strategies, and ensuring that the agents keep exploring those strategies throughout self-play, was key to unlocking AlphaStar's performance. To do this, we used imitation learning - combined with advanced neural network architectures and techniques used for language modelling - to create an initial policy which played the game better than 84% of active players," DeepMind said.
"We also used a latent variable which conditions the policy and encodes the distribution of opening moves from human games, which helped to preserve high-level strategies. AlphaStar then used a form of distillation throughout self-play to bias exploration towards human strategies. This approach enabled AlphaStar to represent many strategies within a single neural network (one for each race). During evaluation, the neural network was not conditioned on any specific opening moves."
That's some rather complicated AI stuff, but in essence, AlphaStar learnt to play StarCraft II in a similar fashion to a human and then used those tactics to become a Grandmaster.
DeepMind reckons the AI training techniques developed for AlphaStar could be applied to "other domains". That could mean, say, training a self-driving car to drive like the best human driver and avoid some of the dumbf**kery of idiot motorists. Or perhaps it could create virtual assistants that sound more human than ever.