Why AlphaStar Does Not Solve Gaming’s AI Problems | Design Dive

Hi I’m Tommy Thompson and welcome to Design
Dive here on AI and Games. In this episode let’s talk AlphaStar – DeepMind’s grandmaster
level AI StarCraft 2 player. AlphaStar made headlines throughout 2019 as the competence
of the system grew, first defeating pro StarCraft players TLO and MaNa, followed by playing in
public matchmaking on the Battle.net European servers, allowing it to climb to the top 0.15%
of all players. The big question I hear a lot is how the games industry can capitalise
on this and build its own Deep Learning AI players. But it isn't as straightforward
as that: despite the real innovation and excitement around AlphaStar, this isn't going
to have an immediate impact on the way game AI is developed – or at least not in the way
you think. And in this video I'm going to explain why… Let me stress that this video isn't intended
to speak ill of DeepMind and their work. AlphaStar is an incredible achievement that – even in
academic circles – still felt like it was years away. Rather, I want to… temper
people's enthusiasm a little bit. Media sensationalism around AI often makes the
capability of these systems difficult to grasp. But the bigger issue is that the
way in which AlphaStar has been built does not make it easy to adapt and translate
to a game development pipeline. So let's talk about what this all really means for the video
game industry in the short term, rather than treating this as the next big innovation that
will transform into Skynet and eventually kill us all. **On that note, it is legit both funny and
depressing how everyone and their aunty knows what Skynet is, yet Terminator movies are bombing
at the box office.** I won't be talking about how AlphaStar works
in this video, because I did that already over in episode 48 of the main show. So if
you want to get a grasp of what’s actually happening under the hood of these StarCraft
AI players, then go watch that video first. Plus I do make reference to some of the points
raised in that video, but hopefully it’s all easy for you to follow along with. The first issue is that the games industry
needs to see the benefits of adopting this approach for non-player character AI before
it embraces it. This isn’t the first time machine learning has reared its head offering
to fix problems for the video games industry. In fact there was an initial exploration of
machine learning back in the late '90s and early 2000s – which led to games like the
original Total War and Black & White using neural networks and genetic algorithms – but
to mixed success. One of the big reasons that machine learning died out was the lack of
control or authority designers and programmers have over these systems once they've been
trained to solve the task at hand. Deep Learning means creating complex artificial neural
networks that carry thousands if not millions of connections, each given a numeric weight.
Training those connection weights is what gives the system its intelligence, but when you
read it as a human – it's just numbers. Lots and lots of numbers.
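To give a sense of what that means, here's a minimal sketch in Python – with a toy network size and random values standing in for trained weights – of what a developer actually sees when they open one of these systems up:

```python
# A toy "policy network": maps a 4-value game state to scores for 3 actions.
# The random values below stand in for weights produced by a training run.
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 16))    # input layer -> hidden layer
W2 = rng.normal(size=(16, 3))    # hidden layer -> action scores

def choose_action(state):
    hidden = np.tanh(state @ W1)     # hidden-layer activations
    return np.argmax(hidden @ W2)    # pick the highest-scoring action

print(W1)  # a 4x16 grid of floats; nothing here explains *why*
           # the agent favours one action over another
```

So if you build an AI player using Deep Learning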
and it does something weird, you can’t crack it open and debug it. You need to isolate
what’s wrong in the design of the network, the learning process or the training data
that may have caused this erroneous behaviour and re-train it. Then if you want to create
AI that caters to particular situations or configurations, you'd need to build the training
process to reflect that. This isn't remotely accessible or friendly to game designers who
want control over how an AI character will behave within the game and are working
with the programming team to make that a reality. If you consider episode 47 of AI and Games,
where I looked at Halo Wars 2, that whole system is built in a modular, data-driven fashion
to allow designers a huge amount of control. Right now Deep Learning technologies
do not offer that level of interaction and oversight for a designer to work
with. It's why behaviour trees are so pervasive in game AI: they're arguably the most accessible
for both designers and developers, allowing each team to focus on their specialism without
stepping on the toes of others.
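To illustrate the contrast, here's a minimal behaviour-tree sketch in Python (hypothetical node types and a made-up guard NPC, not any shipping implementation). Unlike a weight matrix, the logic is explicit data a designer can read and reorder:

```python
# Minimal behaviour-tree nodes: composites run their children in order.
class Selector:
    """Succeeds as soon as any child succeeds."""
    def __init__(self, *children): self.children = children
    def tick(self, npc): return any(c.tick(npc) for c in self.children)

class Sequence:
    """Fails as soon as any child fails."""
    def __init__(self, *children): self.children = children
    def tick(self, npc): return all(c.tick(npc) for c in self.children)

class Condition:
    def __init__(self, test): self.test = test
    def tick(self, npc): return self.test(npc)

class Action:
    def __init__(self, do): self.do = do
    def tick(self, npc): self.do(npc); return True

# The behaviour is legible at a glance: attack if the player is visible,
# otherwise patrol. Reordering priorities is a one-line change.
guard = Selector(
    Sequence(Condition(lambda npc: npc["can_see_player"]),
             Action(lambda npc: print("Attack!"))),
    Action(lambda npc: print("Patrol.")),
)

guard.tick({"can_see_player": False})  # prints "Patrol."
```

This isn't to say machine learning isn't going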
to have an impact within the industry itself, but I don't see it being
used pervasively for in-game behaviours. Sure, we've seen the likes of Forza Horizon and
MotoGP adopt it for their opposing racers, but those are very bespoke situations where
the problem space suits the technique quite nicely. The industry is still evolving and
adapting to this resurgence of machine learning, and while big publishers are investing
in their own AI R&D teams, that investment isn't yet reflected even in AAA studios. Over time
we're going to see Deep Learning used more and more in games, but not in the ways you might
think and, I'd argue, rarely for in-game character behaviour. The second issue is that – irrespective of
the technology's capabilities – the requirements for training AlphaStar don't allow for it
to be easily replicated for games in active development. As mentioned in my other video,
AlphaStar’s first phase of learning is achieved by watching and then mimicking behaviours
from match replays of human players. So this is a chicken-and-egg problem:
if you want to train super-intelligent AI in your game, you need to have existing examples
of high-level play that it can replicate through supervised learning. If you want that training
data, then you either need to have expert players playing the game before release or
build a separate AI player to bootstrap the machine learning player by creating examples
for it to learn from – and that kinda defeats the point.
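For what it's worth, the supervised bootstrapping step itself is conceptually simple. Here's a minimal sketch in Python – with toy data shapes and a plain linear policy, where real StarCraft observations and action spaces are vastly more complex – of fitting a policy to mimic recorded human actions:

```python
# Imitation-learning sketch: fit a linear policy to (state, action) pairs
# extracted from replays, by gradient descent on the cross-entropy loss.
import numpy as np

def train_imitation_policy(states, actions, n_actions, lr=0.1, epochs=200):
    n, d = states.shape
    W = np.zeros((d, n_actions))
    targets = np.eye(n_actions)[actions]           # one-hot expert actions
    for _ in range(epochs):
        logits = states @ W
        logits -= logits.max(axis=1, keepdims=True)
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)  # softmax over actions
        W -= lr * states.T @ (probs - targets) / n # gradient step
    return W

# Toy "replay" data: 500 observed states, each with the human's chosen action.
rng = np.random.default_rng(1)
states = rng.normal(size=(500, 8))
actions = rng.integers(0, 4, size=500)
policy = train_imitation_policy(states, actions, n_actions=4)
```

But the data has to come from somewhere, and AlphaStar benefits heavily from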
the ecosystem that StarCraft exists within. The game has been out for nearly a decade
and is relatively bug-free; it's been a popular eSports title for several years; plus Blizzard's
cult of personality helps maintain an active and lively fanbase around their products.
This means lots of data already exists for AlphaStar to work with. Now all that said, AlphaStar is still
quite a fickle system. The two versions of the AI player were built against two specific
versions of StarCraft 2 – with version 1 running on 4.6.2 and version 2 on 4.9.2 of the game.
Now the unspoken problem here is that any changes made to the game's design that influence
the multiplayer meta in any significant way will break AlphaStar. The reinforcement learning
trains the bots against the current meta, which means they can't just adapt to the
changes brought on by a patch – you need to retrain them. Even the human expert play
it's bootstrapped against might not prove applicable anymore in this context. I can't
say with any certainty, but there’s a small chance that already as of version 4.10 of
StarCraft 2, AlphaStar might not be able to play as well as it once did. The third and most critical element that prevents
AlphaStar being adopted en masse is cost. Training the AlphaStar agents is an incredibly
expensive process: you need dedicated processing systems for the training to run
in a large, distributed, heterogeneous fashion. DeepMind utilise Google's own cloud infrastructure
to achieve this, and the training was executed on their Cloud Tensor Processing Units, or
TPUs. These are custom-developed application-specific integrated circuits, or ASICs, designed
to help accelerate machine learning training and inference. The more recent version of AlphaStar from
November 2019 trained on 384 TPU v3 accelerators for a period of 44 days. Now if you consider
Google's public pricing model for using these TPUs, which runs at around $8 an hour for
a single TPU, then even a naive estimate of cost amounts to $3,072 per hour, $73,728
a day and $3,244,032 in total. Though I'm sure DeepMind got a heavy discount.
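If you want to sanity-check those numbers, the back-of-envelope arithmetic is straightforward – here's the sketch, using the figures quoted above and assuming Google's public on-demand rate of roughly $8 per TPU-hour (which DeepMind almost certainly didn't pay):

```python
# Naive cost estimate for the November 2019 AlphaStar training run.
TPU_COUNT = 384          # TPU v3 accelerators
HOURLY_RATE_USD = 8.0    # approximate public price per TPU per hour
TRAINING_DAYS = 44

cost_per_hour = TPU_COUNT * HOURLY_RATE_USD   # $3,072
cost_per_day = cost_per_hour * 24             # $73,728
total_cost = cost_per_day * TRAINING_DAYS     # $3,244,032

print(f"${cost_per_hour:,.0f}/hour, ${cost_per_day:,.0f}/day, ${total_cost:,.0f} total")
```

Now you might think this isn't a big deal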
when some AAA productions have budgets in the tens if not hundreds of millions of dollars,
but roughly $3.2 million to train your AI is a ridiculous amount of money. Sure, publishers like EA,
Take Two, Ubisoft or Activision might have that kind of cash available, but this is just
the cost of running the training, not the staff, the infrastructure, the development
time and all the other critical parts of game development. Bear in mind this is but
one tiny part of a much larger puzzle when building a game of a scale akin to StarCraft.
Plus, as cool as this ridiculous expenditure is, DeepMind are actually haemorrhaging money
right now – posting losses for Alphabet (Google's parent company) exceeding $1 billion
over the last three years. This technology is not yet stable enough for a AAA publisher
to take seriously without further investigation. Perhaps even more critically, these costs exclude
all but the top 2% of games studios and publishers – the few who could even afford it. The training
costs suggested here are bigger than most development budgets for a game. This technology
can't permeate throughout the industry if it costs that much to train it. And of course,
if you need to train it again because your design needs force you to reconsider something – boom
– that's more money being thrown at Google to solve the problem. Alternatively, a company
invests in its own Deep Learning infrastructure or uses another provider. In any case: money,
money, money. I will stress this isn't just an issue of
unabated capitalism: the data and compute resources required to train Deep Learning systems
remain an unsolved problem, and one of the larger ones being addressed not just in research
on AI methodologies but also by hardware companies such as Intel, which are building the next generation
of compute hardware to deliver machine learning training and inference cheaper and
faster than is currently possible. Now while I'm stressing that AlphaStar isn't
going to change gaming just yet, that is not to say that machine learning is not having
an impact within the games industry. As I mentioned earlier, the initial enthusiasm for
machine learning largely petered out by the mid-2000s, but the recent Deep Learning revolution
has seen renewed interest. And this new, more concerted effort is addressing issues beyond
just the creation of traditional AI players. EA's SEED Division
revealed their work in 2018 training Deep Learning agents to play Battlefield as well
as exploring imitation learning from human play samples to bootstrap AI behaviours. Meanwhile
Ubisoft’s La Forge research lab in Montreal is experimenting with machine learning for
testing gameplay systems, AI assistants that support programmers in committing bug-free
code, motion-matching animation frameworks for character behaviours, and lip syncing for
dialogue in different languages. Plus the most obvious applications in data sciences
are long established at this point, as analytics teams use machine learning to learn more about
how people play their games and provide insight into changes that can be made going forward.
I mean let’s look on the bright side, I’m going to have plenty more to talk about on
this channel in the coming years! Thanks for watching this episode of Design
Dive – I figured it was worth giving my 2 cents on why we shouldn't be expecting
Deep Learning to invade all of Game AI just yet. I hope you found it interesting! If you’ve
got questions, comments or just flat out disagree with me then slap that down in the comments
and once I’ve had enough to drink I’ll go take a look! Don’t forget you can help to
support my work by joining the AI and Games Patreon or by becoming a YouTube member – just
like Scott Reynolds, Ricardo Monteiro and Viktor Viktorov have done, plus
all the other lovely folk you see here in the credits. Take care folks, I'll be back.
