Decision Making in Game AI: Utility AI

Introduction

In the world of game AI there are various decision-making structures. One of them is a technique called "Utility AI", also known as a "Utility System".

The concept is quite simple:

When a decision needs to be made, the Utility AI system gathers data on the current situation.
For every possible action, a "utility score" is calculated from that data.
A behavior is then selected either by taking the highest score or by using the scores
as weights for a weighted random selection.

If implemented correctly, this should result in the "best" behavior being selected for the current situation.

Why is it called “Utility” AI?
The term “Utility” is associated with Utilitarian ethics (the same notion of utility also appears in economics and decision theory). In Utilitarian theory, the morally correct action in a given situation is the one that does the most good for the most people.
The “utility” of an action is a measurement of how much good it does.
More info on Utilitarianism.

Simplified Example

Example -> AI for winning Pokemon battles using Utility AI:

Since we want our AI to win, we always want to pick the action that results in the highest probability of winning.

Every turn the AI needs to choose between "FIGHT", "BAG", "POKEMON" or "RUN".
The current HP, type and status data of the Pokemon (and various other aspects) are used to score these actions.
In this example we know from the gathered data that the player has no items or other Pokemon, and running away cannot win the battle.
Therefore, all scores except the one for "FIGHT" should be 0 in this situation, guaranteeing that "FIGHT" gets picked.

In this example the AI chose "FIGHT" and now it needs to pick the optimal move.

The previously discussed data, plus the new data about the available moves, is used for the scoring.
Let's see how we could determine these scores for this situation:

Scoring Scale -> 1-5

1. Tackle
- Enemy low HP, likely does enough damage to defeat enemy => +++
- EnemyType = fire, move = normal, thus no resistance or weakness => +
# Tackle Score = 4

2. Growl
- Lowers enemy attack, AI pokemon takes less damage => ++
- Enemy low HP but move does no damage => -
# Growl Score = 1

3. Leech Seed
- Enemy already seeded, has no effect if already seeded => =0
(use =0 if action should not be an option in given situation)
# Leech Seed Score = 0

4. Vine Whip
- Enemy low HP, likely does enough damage to defeat enemy => +++
- EnemyType = fire, move = grass, thus deals half damage => -
# Vine Whip Score = 2

The AI will use "Tackle" because it scored the highest in this situation.
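
As a rough sketch of highest-score selection, the move scores above can be fed into a small C++ helper. The ScoredAction struct and the pickBest and exampleMoves functions are made up for this illustration and are not part of the project code.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// One candidate action together with its utility score (hypothetical type).
struct ScoredAction {
    std::string name;
    int score;  // 0 means "not an option in this situation"
};

// Returns the index of the highest-scoring action.
// Ties go to the earliest entry (an arbitrary choice for this sketch).
// Assumes the list is not empty.
std::size_t pickBest(const std::vector<ScoredAction>& actions) {
    std::size_t best = 0;
    for (std::size_t i = 1; i < actions.size(); ++i) {
        if (actions[i].score > actions[best].score) {
            best = i;
        }
    }
    return best;
}

// The four move scores from the example above.
std::vector<ScoredAction> exampleMoves() {
    return {{"Tackle", 4}, {"Growl", 1}, {"Leech Seed", 0}, {"Vine Whip", 2}};
}
```

Calling pickBest(exampleMoves()) returns the index of "Tackle", which matches the choice made above.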

And...

GG no re

Implementation

I originally planned to use this decision-making structure as a replacement for a behavior tree in an existing project.
But after attempting to do so, I realised that correctly implementing Utility AI is a lot harder than understanding it.
This is why I decided to go with a simpler text-based implementation instead of possibly ruining my existing project.

Plan -> 2 Players fighting each other using various actions which are determined using Utility AI

How do we define these players and their actions?

Actions:

Attack -> Deal damage, usage: infinite
Heavy Attack -> Deal more damage, usage: finite
Bandage -> Restore a small amount of health, usage: infinite
Heal -> Restore some health, usage: finite

Player:

Name -> Distinguish between different players
HP and MaxHP -> Current HP and Max possible HP of player
MedKits and MaxMK -> Current amount of MK and Max possible MK of player (resource for Heal action)
Heavy Attack Status -> Whether the heavy attack is available or not (bool)

Now we can make considerations using the player data (for scoring the actions later on)

Scale: 0.f -> 1.f

healthWeight -> playerHP / playerMaxHP (for correct Scale)
medKitCountWeight -> playerMK / playerMaxMK (for correct Scale)
medKitWeight -> 1.f if playerMK > 0, else 0.f (1 if player has medkits, 0 if not)
heavyWeight -> 1.f if heavystatus == true, else 0.f (1 if player can use heavy attack, 0 if not)
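
The considerations above could look roughly like this in C++. This is only a sketch: the Player struct and its field names are assumptions made for the example and may not match the real Player.h, and the guards against a zero maximum are an addition of mine.

```cpp
// Hypothetical player data; the field names are illustrative only.
struct Player {
    int hp;
    int maxHp;
    int medKits;
    int maxMedKits;
    bool heavyAvailable;
};

// Current HP as a fraction of max HP, in the range 0.f -> 1.f.
float healthWeight(const Player& p) {
    if (p.maxHp <= 0) return 0.f;  // guard against division by zero (assumption)
    return static_cast<float>(p.hp) / static_cast<float>(p.maxHp);
}

// Current medkits as a fraction of max medkits, in the range 0.f -> 1.f.
float medKitCountWeight(const Player& p) {
    if (p.maxMedKits <= 0) return 0.f;  // guard against division by zero (assumption)
    return static_cast<float>(p.medKits) / static_cast<float>(p.maxMedKits);
}

// 1 if the player has at least one medkit, 0 if not.
float medKitWeight(const Player& p) {
    return p.medKits > 0 ? 1.f : 0.f;
}

// 1 if the player can use the heavy attack, 0 if not.
float heavyWeight(const Player& p) {
    return p.heavyAvailable ? 1.f : 0.f;
}
```

The two binary weights work as switches: multiplying a score by 0.f removes an action that is currently impossible.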

Usually, curves are used to map these considerations to utility

Types of curves used in this project (Scale: 0.f -> 1.f)

Descending Linear Curve -> the lower a value, the higher the utility

Exponential Function -> for rapid increase/decrease of utility

Inverse/Asymptote Function -> 0.2/x - 0.1 (exceeds 1.f for small x, so it does not respect the scale on purpose)
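
A minimal sketch of the three curve types, assuming the input x is a consideration in the range 0.f -> 1.f. Only the inverse formula 0.2/x - 0.1 comes from the text. The function names, the (1 - x)^k form of the exponential curve, the default exponent and the small lower limit on x are all assumptions made for this example.

```cpp
#include <algorithm>
#include <cmath>

// Keeps a value inside the 0.f -> 1.f scale.
float clamp01(float x) {
    return std::max(0.f, std::min(1.f, x));
}

// Descending linear curve: the lower the value, the higher the utility.
float descLinear(float x) {
    return 1.f - clamp01(x);
}

// Descending exponential curve, assumed here to be (1 - x)^k.
// A larger k keeps the utility low for longer and makes it rise faster near x = 0.
float descExponential(float x, float k = 3.f) {
    return std::pow(1.f - clamp01(x), k);
}

// Inverse curve from the text: 0.2 / x - 0.1.
// The result is deliberately not clamped, so it can go above 1.f.
float inverseCurve(float x) {
    const float minX = 0.01f;  // assumed lower limit to avoid dividing by zero
    return 0.2f / std::max(x, minX) - 0.1f;
}
```

With these definitions inverseCurve(1.f) is about 0.1 and inverseCurve(0.1f) is about 1.9, which shows how the inverse curve breaks the scale once x gets small.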

Finally we can design the "Reasoner" (brain of the players)

Here we determine the final scores for each action and select which should be used

heavy_score -> Plan: We always want to use the heavy attack action if possible

heavy_score = heavyWeight(player) -> guarantees highest score if true else lowest score

attack_score -> Plan: the lower the enemy's HP, the more useful an attack is

attack_score = desc_lin(c_healthWeight(enemy)) -> the lower the enemy's HP, the higher the score,
but with a minimum bound, otherwise the score could be 0 and attacking might never happen

heal_score -> Plan: Only heal at very low HP, but score it highly when that is the case

heal_score = medKitWeight(player) * inverse(healthWeight(player)) -> medKitWeight checks
for available MKs (similar to heavy_score), and the inverse curve on the player's health sharply
increases utility the lower the player's HP, using a certain min HP bound (only use medkits when low)

bandage_score -> Plan: Only bandage at quite low HP, but score it quite highly when that is the case

bandage_score = exp(healthWeight(player)) -> a descending exponential curve on the player's health rapidly
increases utility the lower the player's HP, using a certain min HP bound (only when quite low)
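
Put together, the four scores could be computed along these lines. This is a hedged sketch, not the project's actual Reasoner: the Inputs and Scores structs, the scoreActions function, the bound values (0.1f, 0.2f, 0.5f) and the exponent 3 are all invented for the example, since the text only says that such bounds exist.

```cpp
#include <algorithm>
#include <cmath>

// Normalised considerations, assumed to be computed beforehand.
struct Inputs {
    float playerHealth;  // healthWeight(player), 0.f -> 1.f
    float enemyHealth;   // healthWeight(enemy), 0.f -> 1.f
    float medKitWeight;  // 1 if the player has medkits, else 0
    float heavyWeight;   // 1 if the heavy attack is available, else 0
};

struct Scores {
    float heavy;
    float attack;
    float heal;
    float bandage;
};

// Hypothetical tuning values, chosen only for this sketch.
constexpr float kAttackFloor = 0.1f;     // minimum attack score
constexpr float kHealHpBound = 0.2f;     // heal only at or below 20% HP
constexpr float kBandageHpBound = 0.5f;  // bandage only at or below 50% HP

Scores scoreActions(const Inputs& in) {
    Scores s{};
    // Heavy attack: 1 when available, 0 when not.
    s.heavy = in.heavyWeight;
    // Attack: descending linear on enemy health, never below the floor.
    s.attack = std::max(kAttackFloor, 1.f - in.enemyHealth);
    // Heal: inverse curve 0.2/x - 0.1 on player health, switched off
    // without medkits or above the HP bound. May exceed 1 on purpose.
    s.heal = (in.playerHealth <= kHealHpBound)
        ? in.medKitWeight * (0.2f / std::max(in.playerHealth, 0.01f) - 0.1f)
        : 0.f;
    // Bandage: assumed descending exponential (1 - x)^3 below the HP bound.
    s.bandage = (in.playerHealth <= kBandageHpBound)
        ? std::pow(1.f - in.playerHealth, 3.f)
        : 0.f;
    return s;
}
```

The exact curve shapes and bounds decide how the fight plays out, so these numbers would need tuning against real matches.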

With these final scores we should decide how to determine the correct action

-> By picking the action that got the highest score

OR

-> Using the scores as weights for a weighted random selection
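
Both selection strategies can be sketched with the C++ standard library. The function names pickHighest and pickWeighted are hypothetical. Clamping negative scores to zero and falling back to the highest score when every weight is zero are my own assumptions, added because std::discrete_distribution needs non-negative weights with a positive sum.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <random>
#include <vector>

// Highest-score selection: index of the largest score (first one on ties).
// Assumes the list is not empty.
std::size_t pickHighest(const std::vector<float>& scores) {
    return static_cast<std::size_t>(
        std::distance(scores.begin(),
                      std::max_element(scores.begin(), scores.end())));
}

// Weighted random selection: each index is picked with a probability
// proportional to its score.
std::size_t pickWeighted(const std::vector<float>& scores, std::mt19937& rng) {
    std::vector<float> weights;
    weights.reserve(scores.size());
    float total = 0.f;
    for (float s : scores) {
        const float w = std::max(s, 0.f);  // negative scores count as 0
        weights.push_back(w);
        total += w;
    }
    if (total <= 0.f) {
        return pickHighest(scores);  // nothing to sample from
    }
    std::discrete_distribution<int> dist(weights.begin(), weights.end());
    return static_cast<std::size_t>(dist(rng));
}
```

An action with a score of 0 is never picked by pickWeighted as long as at least one other score is positive, so the "=0 means not an option" convention still holds with random selection.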

Implementation Finished!

Result

Added randomness for testing -> extra MKs and heavy attacks + player turns at random

Let's take a look at highest score selection first

At the start the only actions taken are Attack and Heavy Attack.
This makes sense because the other 2 actions have a min HP bound.

At this stage all the different actions are being used.
We can see that the player who is currently losing takes more healing actions,
because their attack score is quite low (the enemy has more HP) and their heal score is high.

Near the end of the fight, the only 2 actions taken are Heavy Attack and Bandage.
This also makes sense: by now the players have usually used all their med kits, because
once a player reaches the low HP bound, the inverse curve gives Heal a utility score
higher than the normal maximum (breaking the scale), which guarantees that it gets used.
The Bandage action always scores higher than the normal Attack at this point.

Let's take a look at the weighted random selection now

As expected, the result is similar but a lot more random.
The normal attack action is even used near the end of the game!
With highest score selection this never happens because the
attack can't score higher than the others, but here the actions are
randomly selected and the attack weight is quite high at this point.

Conclusion

Pros And Cons Of Utility AI

Pros:

All actions are constantly being weighed
Handles unique situations quite nicely
Allows for variation

Cons:

Quite difficult to design, edit and maintain
More fuzzy than binary

Closing Thoughts

Utility AI can be a useful tool that truly shines when used in combination with other decision-making structures,

especially for complex systems in RPG, RTS and simulation projects.

And lastly, some games that make good use of this technique:

The Sims series -> Partly used for the decision making of the sims.
Dragon Age: Inquisition -> Partly used for the companion behavior.

References

AI Architectures: A Culinary Guide
Are Behavior Trees a Thing of the Past?
How Utility AI Helps NPCs Decide What To Do Next | AI 101
AI Made Easy with Utility AI

WarreVannTittelboom/UtilityAI
