Difference between revisions of "StarCraft AI Benchmarks"

Latest revision as of 17:09, 20 October 2015

The goal of this page is to collect benchmark problems that can be broadly used and referenced. We present a benchmark composed of a series of scenarios, each of them capturing a different aspect of RTS games. Each scenario is defined by a starting situation in StarCraft where the agent needs to either defeat the opponent or survive for as long as possible. More information can be found in the original paper: A Benchmark for StarCraft Intelligent Agents.

These benchmarks can be cited as:

@inproceedings{uriarte15b,
  author    = {Uriarte, Alberto and Onta\~{n}\'{o}n, Santiago},
  title     = {A Benchmark for StarCraft Intelligent Agents},
  booktitle = {AIIDE},
  year      = {2015}
}

We are preparing an automatic service to test your bot against all the scenarios. For now, you can use the launcher provided in the repository (https://bitbucket.org/auriarte/starcraftbenchmarkai) to test your bot locally, and send your bot to us (admin[at]starcraftai.com) if you want it to appear on the leaderboard. Keep in mind that for micromanagement maps the goal for your bot is to reach the opponent's starting position.

Metrics

All the metrics are designed to be normalized either in the interval [0,1] or [-1,1], with higher values representing better agent performance.

Survivor’s life: The sum of the square roots of the remaining hit points of each unit, divided by the time it took to complete the scenario (win/defeat/timeout), measured in frames. The value is normalized between a lower and an upper bound. The lower bound is when player A is defeated in the minimum time without dealing any damage to player B, while the upper bound is the opposite.
Time survived: The time the agent survived normalized by a predefined timeout.
Time needed: We start a timer when a certain event happens (e.g., a building is destroyed) and we stop it after a timeout or after a condition is triggered (e.g., the destroyed building is replaced).
Units lost: The difference in units lost by players A and B. Each player's count is normalized to [0, 1] by dividing the number of units lost by that player's maximum number of units.
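As an illustration of the first two definitions, here is a minimal Python sketch. The function names and arguments are hypothetical and are not taken from the benchmark's launcher. For Survivor's life, only the raw value is computed. The final rescaling between the lower and upper bounds is left out because those bounds depend on the scenario. Capping the time survived at the timeout is an assumption made so the result stays in [0, 1].

```python
import math


def survivors_life_raw(remaining_hit_points, frames_elapsed):
    """Raw (unnormalized) Survivor's life.

    Sum of the square roots of each surviving unit's remaining hit
    points, divided by the scenario duration in frames. The benchmark
    then normalizes this between scenario-specific bounds, which this
    sketch does not attempt.
    """
    if frames_elapsed <= 0:
        raise ValueError("frames_elapsed must be positive")
    return sum(math.sqrt(hp) for hp in remaining_hit_points) / frames_elapsed


def time_survived(frames_survived, timeout_frames):
    """Time survived, normalized by the predefined timeout.

    Assumption: the value is capped at the timeout so it stays in [0, 1].
    """
    if timeout_frames <= 0:
        raise ValueError("timeout_frames must be positive")
    return min(frames_survived, timeout_frames) / timeout_frames


# Hypothetical example: two units left with 16 and 9 hit points
# after 7 frames give (4 + 3) / 7.
example_life = survivors_life_raw([16, 9], 7)
# Hypothetical example: surviving 500 frames of a 1000-frame timeout.
example_time = time_survived(500, 1000)
```

Using the square root of the hit points means that several lightly damaged units count for more than one unit holding the same total hit points.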

Benchmarks Scenarios

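All scenarios can be found in the repository (https://bitbucket.org/auriarte/starcraftbenchmarkai/src/fb4684a39ab3c3a342bfe6ea2378b22f1b5c19f1/Maps/?at=master).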

Scenario	Description	Evaluation
Reactive Control
RC1: Perfect Kiting	The purpose of this scenario is to test whether the intelligent agent is able to reason about the possibility of exploiting its mobility and ranged attack against a stronger but slower unit in order to win. In this scenario, a direct frontal attack will result in losing the combat, but via careful maneuvering, it is possible to win without taking any damage.	Survivor’s life
RC2: Kiting	In this scenario the intelligent agent is at a disadvantage, but using a hit-and-run behavior might suffice to win. The main difference with the previous case is that here, some damage is unavoidable.	Survivor’s life
RC3: Sustained Kiting	In this case there is no chance to win, so the agent should try to stay alive for as long as possible. A typical example of this behavior is scouting the enemy base.	Time survived in frames since a Zealot starts chasing the SCV, normalized by the timeout.
RC4: Symmetric Armies	In equal conditions (symmetric armies), positioning and target selection are key aspects that can determine a player’s success in a battle. This scenario presents a test with several configurations as a baseline to experiment against basic AI opponents.	Survivor’s life
Tactics
T1: Dynamic obstacles	This scenario measures how well an agent can navigate when chokepoints are blocked by dynamic obstacles (e.g., neutral buildings). Notice that we are not aiming to benchmark pathfinding, but high-level navigation.	Time needed
Strategy
S1: Building placement	This scenario simulates a Zealot rush and is designed to test whether the agent will be able to stop it (intuitively, it seems the only option is to build a wall).	Units lost: (Units player B lost / 4) - (units player A lost / 25).
S2: Plan Recovery	An agent should adapt to plan failures. This scenario tests whether the AI is able to recover from the opponent disrupting its build order.	Time spent to replace a building, normalized by the timeout.
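The S1 evaluation formula in the table can be worked through with a short Python sketch. The divisors 4 and 25 come from the S1 row. The function name, the constant names, and the range checks are hypothetical additions for illustration. Reading player A as the benchmarked agent is an assumption based on the metric descriptions.

```python
# Divisors taken from the S1 row: (units B lost / 4) - (units A lost / 25).
S1_MAX_UNITS_B = 4
S1_MAX_UNITS_A = 25


def s1_units_lost_score(units_lost_a, units_lost_b):
    """Units lost score for S1, in [-1, 1]; higher is better for player A."""
    if not 0 <= units_lost_a <= S1_MAX_UNITS_A:
        raise ValueError("units_lost_a out of range")
    if not 0 <= units_lost_b <= S1_MAX_UNITS_B:
        raise ValueError("units_lost_b out of range")
    return units_lost_b / S1_MAX_UNITS_B - units_lost_a / S1_MAX_UNITS_A


# Worst case: player A loses all 25 units and player B loses none.
worst = s1_units_lost_score(units_lost_a=25, units_lost_b=0)
# Best case: player B loses all 4 units and player A loses none.
best = s1_units_lost_score(units_lost_a=0, units_lost_b=4)
# Hypothetical intermediate run: A loses 5 units, B loses 2.
middle = s1_units_lost_score(units_lost_a=5, units_lost_b=2)
```

Under this formula, a single run scores -1 only when player A loses every unit without destroying any of player B's units.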

Research papers using these scenarios

Q-learnings in RTS game's micro-management. Angel Camilo Palacios Garzón. Universitat de Barcelona. 2015

Leaderboard

Bot	RC1	RC2	RC3	RC4	T1	S1	S2
FreScBot	-0.0879	-0.1153	N/A	-0.0022	N/A	N/A	N/A
UAlbertaBot	-0.0933	0.0422	N/A	0.0369	N/A	-1	0.0000
Skynet	-0.1087	0.1696	N/A	0.0706	N/A	N/A	N/A
Nova	0.1111	N/A	0.0335	N/A	0.0000	-0.7420	0.0000

