Game AI offers various decision-making structures. One of them is a technique called "Utility AI", also known as a "Utility System".
- When a decision needs to be made, the Utility AI system will gather data on the current situation.
- For every possible action a "utility score" will be calculated using the data gathered from this current situation.
- A behavior can then be selected either by taking the highest score or by using the scores to seed the probability distribution for a weighted random selection.
If implemented correctly, this should result in the "best" behavior being selected for the current situation.
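The two selection strategies described above can be sketched as follows. This is a minimal illustration, not the implementation used in this project: `ScoredAction`, `pickHighest` and `pickWeighted` are hypothetical names, and the scores are assumed to be computed already.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <string>
#include <vector>

// Hypothetical type: a candidate action paired with its computed utility score.
struct ScoredAction {
    std::string name;
    float score;
};

// Highest-score selection: index of the best action (first one wins ties).
std::size_t pickHighest(const std::vector<ScoredAction>& actions) {
    assert(!actions.empty());
    auto best = std::max_element(
        actions.begin(), actions.end(),
        [](const ScoredAction& a, const ScoredAction& b) { return a.score < b.score; });
    return static_cast<std::size_t>(best - actions.begin());
}

// Weighted random selection: each score is used as a probability weight,
// so higher-scoring actions are picked more often, but not always.
std::size_t pickWeighted(const std::vector<ScoredAction>& actions, std::mt19937& rng) {
    assert(!actions.empty());
    std::vector<double> weights;
    double total = 0.0;
    for (const ScoredAction& a : actions) {
        double w = std::max(0.0, static_cast<double>(a.score));  // negatives count as 0
        weights.push_back(w);
        total += w;
    }
    if (total <= 0.0) return pickHighest(actions);  // no usable weights: fall back
    std::discrete_distribution<std::size_t> dist(weights.begin(), weights.end());
    return dist(rng);
}
```

Highest-score selection is deterministic and predictable, while the weighted variant trades some optimality for variation in behavior.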
Why is it called “Utility” AI?
The term “Utility” originates from Utilitarian ethics. In this theory, the morally correct action in a given situation is always determined to be the one that does the most good for the most people.
The “utility” of an action is a measurement of how much good it does.
More info on Utilitarianism.
Since we want our AI to win, we always want to pick the action that results in the highest probability of winning.
- Every turn the AI needs to choose between "FIGHT", "BAG", "POKEMON" or "RUN".
- The current HP, type and status data of the Pokemon (and various other aspects) are used to score these actions.
- In this example we know from the gathered data that the player has no items or other Pokemon.
- Therefore, all scores except the one for "FIGHT" should be 0 in this situation, guaranteeing that "FIGHT" gets picked.
- The previously discussed data, plus the new data from the different moves, is used for the scoring.
- Let's see how we could determine these scores for this situation:
Scoring Scale-> 1-5
1.Tackle
-Enemy low HP, likely does enough damage to defeat enemy => +++
-EnemyType = fire, move = normal thus no resistance or weakness => +
# Tackle Score = 4
2.Growl
-Lowers enemy attack, AI pokemon takes less damage => ++
-Enemy low HP but move does no damage => -
# Growl Score = 1
3.Leech Seed
-Enemy already seeded, so the move would have no effect => =0
(use =0 if action should not be an option in given situation)
# Leech Seed Score = 0
4.Vine Whip
-Enemy low HP, likely does enough damage to defeat enemy => +++
-EnemyType = fire, move = grass thus deals half damage => -
# Vine Whip Score = 2
- The AI will use "Tackle" because it scored the highest in this situation.
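The tally above can be reproduced with a small sketch. The representation is an assumption made for illustration: each "+" or "-" is stored as an integer adjustment, and the "=0" rule is modelled with a hypothetical `notAnOption` flag.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical representation of one move's evaluation in the walkthrough above.
struct MoveEvaluation {
    std::string name;
    std::vector<int> adjustments;  // "+++" is stored as +3, "-" as -1
    bool notAnOption;              // the "=0" rule: move is useless right now
};

// Sum the adjustments; a move marked "=0" always scores 0.
int moveScore(const MoveEvaluation& move) {
    if (move.notAnOption) return 0;
    int total = std::accumulate(move.adjustments.begin(), move.adjustments.end(), 0);
    return std::max(total, 0);  // keep scores non-negative
}

// The four evaluations from the example turn.
std::vector<MoveEvaluation> exampleTurn() {
    return {
        {"Tackle",     {3, 1},  false},  // likely KO, neutral type match-up
        {"Growl",      {2, -1}, false},  // lowers attack, but deals no damage
        {"Leech Seed", {},      true},   // enemy is already seeded
        {"Vine Whip",  {3, -1}, false},  // likely KO, but half damage vs fire
    };
}

// Index of the highest-scoring move (first one wins ties).
std::size_t bestMoveIndex(const std::vector<MoveEvaluation>& moves) {
    assert(!moves.empty());
    std::size_t best = 0;
    for (std::size_t i = 1; i < moves.size(); ++i) {
        if (moveScore(moves[i]) > moveScore(moves[best])) best = i;
    }
    return best;
}
```

With these inputs the scores come out as 4, 1, 0 and 2, so `bestMoveIndex(exampleTurn())` points at Tackle.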
I originally planned to use this decision-making structure as a replacement for a behavior tree in an existing project.
But after attempting to do so, I realised that correctly implementing Utility AI is a lot harder than understanding it.
This is why I decided to go with a simpler text-based implementation instead of possibly ruining my existing project.
- Attack -> Deal damage, usage: infinite
- Heavy Attack -> Deal more damage, usage: finite
- Bandage -> Restore a small amount of health, usage: infinite
- Heal -> Restore some health, usage: finite
- Name -> Distinguish between different players
- HP and MaxHP -> Current HP and Max possible HP of player
- MedKits and MaxMK -> Current amount of MK and Max possible MK of player (resource for Heal action)
- Heavy Attack Status -> Whether the heavy attack is available or not (bool)
- healthWeight -> playerHP / playerMaxHP (normalized to the 0-1 scale)
- medKitCountWeight -> playerMK / playerMaxMK (normalized to the 0-1 scale)
- medKitWeight -> 1.f if playerMK > 0, otherwise 0.f (1 if the player has med kits, 0 if not)
- heavyWeight -> 1.f if heavystatus == true, otherwise 0.f (1 if the player can use the heavy attack, 0 if not)
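The player data and the four weights listed above could be expressed like this. The struct layout and field names are assumptions; only the formulas come from the lists above.

```cpp
#include <cassert>
#include <string>

// Hypothetical player record holding the data listed above.
struct Player {
    std::string name;
    float hp;
    float maxHp;
    int medKits;
    int maxMedKits;
    bool heavyAttackAvailable;
};

// Current health as a fraction of maximum health (0 to 1).
float healthWeight(const Player& p) {
    assert(p.maxHp > 0.f);
    return p.hp / p.maxHp;
}

// Remaining med kits as a fraction of the maximum (0 to 1).
float medKitCountWeight(const Player& p) {
    assert(p.maxMedKits > 0);
    return static_cast<float>(p.medKits) / static_cast<float>(p.maxMedKits);
}

// 1 if the player has at least one med kit, 0 otherwise.
float medKitWeight(const Player& p) {
    return p.medKits > 0 ? 1.f : 0.f;
}

// 1 if the heavy attack can be used, 0 otherwise.
float heavyWeight(const Player& p) {
    return p.heavyAttackAvailable ? 1.f : 0.f;
}
```

Keeping every weight in the 0 to 1 range lets the scores of different actions be compared directly.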
- Descending Linear Curve -> the lower a value, the higher the utility
- Exponential Function -> for rapid increase/decrease of utility
- Inverse/Asymptote Function -> 0.2/x - 0.1 (does not respect the scale on purpose)
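A rough sketch of the three curves follows. Only the inverse formula is given above; the exact shape of the exponential curve is not stated, so `(1 - x)^k` is an assumed stand-in, and the division-by-zero guard is an assumption as well.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Descending linear curve: utility rises as the input falls. Expects x in [0, 1].
float descLinear(float x) {
    assert(x >= 0.f && x <= 1.f);
    return 1.f - x;
}

// Descending exponential curve (assumed form): (1 - x)^k.
// With k > 1, utility stays low for large x and rises quickly as x approaches 0.
float descExponential(float x, float k) {
    assert(x >= 0.f && x <= 1.f);
    return std::pow(1.f - x, k);
}

// Inverse curve from the list above: 0.2 / x - 0.1.
// It exceeds 1 for x below 2/11 (about 0.18), deliberately breaking the 0-1 scale.
float inverse(float x) {
    const float kMinInput = 1e-4f;  // assumed guard against division by zero
    return 0.2f / std::max(x, kMinInput) - 0.1f;
}
```

For example, `inverse(0.5f)` is about 0.3, while `inverse(0.1f)` is about 1.9, which is higher than any curve that respects the scale can produce.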
- heavy_score = heavyWeight(player) -> guarantees the highest score if the heavy attack is available, otherwise the lowest score
- attack_score = desc_lin(c_healthWeight(enemy)) -> the lower the enemy's HP, the higher the score,
with a minimum bound, because otherwise the score could be 0 and attacking might never happen
- heal_score = medKitWeight(player) * inverse(healthWeight(player)) -> medKitWeight checks
whether the player has med kits (similar to heavy_score), and the inverse curve sharply
increases utility the lower the player's HP, subject to a minimum HP bound (only use med kits when low)
- bandage_score = exp(healthWeight(player)) -> a descending exponential curve rapidly
increases utility the lower the player's HP, subject to a minimum HP bound (only when quite low)
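Putting the weights and curves together, the four scores might be computed roughly as below. The bounds in `Tuning` are hypothetical placeholders, since the actual values are not stated, and the exponential curve again uses the assumed `(1 - x)^k` form.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical tuning values; the real bounds are not given in the text.
struct Tuning {
    float attackFloor = 0.1f;       // minimum attack score
    float healThreshold = 0.3f;     // Heal only scores at or below this health ratio
    float bandageThreshold = 0.5f;  // Bandage only scores at or below this health ratio
    float exponent = 2.f;           // steepness of the bandage curve
};

struct Scores {
    float heavy, attack, heal, bandage;
};

// playerHealth and enemyHealth are health ratios (current HP / max HP).
Scores scoreActions(float playerHealth, float enemyHealth,
                    bool hasMedKit, bool heavyReady, const Tuning& t = Tuning()) {
    assert(playerHealth >= 0.f && playerHealth <= 1.f);
    assert(enemyHealth >= 0.f && enemyHealth <= 1.f);
    Scores s;
    s.heavy = heavyReady ? 1.f : 0.f;
    // Descending linear curve on enemy health, never below the floor.
    s.attack = std::max(t.attackFloor, 1.f - enemyHealth);
    // Inverse curve, gated on having a med kit and being low on health.
    s.heal = 0.f;
    if (hasMedKit && playerHealth <= t.healThreshold)
        s.heal = 0.2f / std::max(playerHealth, 1e-4f) - 0.1f;
    // Descending exponential curve, gated on a looser health bound.
    s.bandage = 0.f;
    if (playerHealth <= t.bandageThreshold)
        s.bandage = std::pow(1.f - playerHealth, t.exponent);
    return s;
}
```

At full health only the heavy and attack scores are non-zero, while at a health ratio of 0.1 with a med kit in hand the heal score is about 1.9 and outranks everything else.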
- At the start the only actions taken are Attack and Heavy Attack.
- This makes sense because the other 2 actions have a min HP bound.
- At this stage all the different actions are being used.
- We can see that the player who is currently losing takes more healing actions.
- This is because their attack score is quite low (the enemy has more HP) and their heal score is high.
- Near the end of the fight, the only 2 actions taken are heavy attack and bandage.
- This also makes sense: by then the players have usually used all their med kits, because
once a player reaches the low HP bound, the inverse curve gives the heal action a higher
utility score than the scale normally allows (breaking the scale) to guarantee the med kits get used.
- The bandage action always scores higher than the normal attack at this point.
- As expected, the result is similar but a lot more random.
- The normal attack action is even used near the end of the game!
- In the highest score selection this never happens because the
attack can't score higher than the others, but here the actions are
randomly selected and the attack weight is quite high at this point.
- All actions are constantly being weighed
- Handles unique situations quite nicely
- Allows for variation
- Quite difficult to design, edit and maintain
- More fuzzy than binary
Utility AI can be a useful tool that truly shines when used in combination with other decision-making structures.
- The Sims series -> Partly used for the decision making of the sims.
- Dragon Age: Inquisition -> Partly used for the companion behavior.
AI Architectures: A Culinary Guide
Are Behavior Trees a Thing of the Past?
How Utility AI Helps NPCs Decide What To Do Next | AI 101
AI Made Easy with Utility AI