View on GitHub


BalanceSpan algorithm source code and some datasets on StarCraft II.

Guillaume Bosc1, Philip Tan2, Jean-Francois Boulicaut1, Chedy Raïssi3 and Mehdi Kaytoue1

1Université de Lyon, CNRS, INSA-Lyon, LIRIS, UMR5205, F-69621, France
2MIT Game Lab, Cambridge (MA), USA
3LORIA (CNRS - Inria NGE - Université de Lorraine), Vandoeuvre-lès-Nancy, France

Download this project as a .zip file Download this project as a tar.gz file

BalanceSpan: A Pattern Mining Approach to Study Strategy Balance in RTS Games

What is it about?

Problem statement. We consider an interaction of two players as a sequence of actions made by these players where exactly one player wins, typically a zeso-sum game. The objectives developped in the paper is to extract strategical patterns from traces of players' interaction and to exhibit their characteristics, such as weaknesses or strenghts. Thus, we compute a measure, called the balance, of a strategical pattern to know if it is more likely to win or to lose. Below, an example of a database built from an interaction between two players.

Frequent sequential pattern mining. Given a database of individuals or objects, each one provided with a description (set of items, sequences, graphes, vectors...), frequent pattern mining consists in finding the description abstractions that appear in more than n objects of the database. When we are facing database of sequences, one accordingly looks for subsequences shared by at least some amount of the individuals, as examplified below. The subsequence <(a)(b,c)> appears 3 times out of four: if the minimum support is set to 10%, then this pattern is frequent.

Emmerging patterns. Let us consider that each sequence of the database is associated to a unique class label (a positive label + and a negative label -). The growth-rate characterizes the discriminating power of a pattern for one class w.r.t. the other. The growth-rate of the pattern <(a)(b,c)> related to the positive class label equals 2, that means <(a)(b,c)> is two times more present in class + than in class -.

Formal problem. Considering such an interaction database, the problem we face is not feasible with standard approaches because we only want to consider two classes (positive class label and negative class label for winner and loser) with sequences taht contain actions of both players (to bring out the action-reaction effect) where the both players could perform the same actions.

How does BalanceSpan work?

In a nutshell. The principle of the algorithm is simple: it is based on a pattern growth-approach. The algorithm uses a structure of tree where the nodes correspond to a frequent sequential pattern, and the edges the extension of a frequent sequential pattern. For each node within the structure of tree (each frequent sequential pattern), the algorithm computes the balance.

Computing the balance. The balance is the main lock of this approach. In fact, to compute it, it is required to know the support of the pattern, the support of the dual of its pattern (the same sequence of actions but in one case the player wins and in the other he loses), and the support of their intersection. Once all these parameters are known, the algorithm is able to compute the balance according to the formula given in the paper.

The trick. Since the computing of the balance lies on the support of the pattern and its dual, we put up an new approach during the pattern growth approach of the algorithm that consists in performing a double projected database which contains both the pattern and its dual. Thus, the balance could be computed on the fly, without requiring a huge post-processing. Thanks to this method, the computing of the balance becomes really faster.


StarCraft II in the E-Sport Scene

E-Sport. The E-Sport notion is related to the Electronic Sports, i.e. high level played video games. These video games are becoming the de facto sensations in organized competitions, especially between professionals competitors taking part in international tournaments, with prize-pools in millions of US$, surrounded by managers, teams and sponsors and followed by a large world-wide fan base. These video games has to be balanced to ensure a fair competitive aspect for the players but also for the spectators. E-Sport should attract industrial actors and researchers in the next few years around video game analytics, just like with traditional sports.

StarCraft II. StarCraft II (Blizzard Entertainment, 2010), the most competitive real-time strategy games (RTS), has its own world-wide players ranking system (ELO) and annual world cup competition series (WCS) with a US$1.6 millions prize pool for the year 2013. StarCraft II is a game that involves two players each choosing a faction among Zerg (Z), Protoss (P) and Terran (T): there are 6 different possible match-ups with different strategies of game. During a game, two players are battling on a map (aerial view), controlling buildings and units to gather supply, build an army with the final goal of winning by destroying the opponent's forces. Each faction (Z,P,T) allows different units and buildings with distinctive weaknesses and strengths following a rock-paper-scissors principle. A strategy is hidden in large sequences of actions generated by players and called replays.

Finding (un-)Balanced Strategies in Starcraft II

Quantitative aspects. The algorithm BalanceSpan is always faster in term of runtime than the other approaches we have developed (that handle specific data). An important point to underline is that BalanceSpan is as fast as PrefixSpan that only extracts frequent sequential patterns, without computing the balance. Indeed, PrefixSpan generates two nodes for a frequent pattern (this pattern and the dual) whereas BalanceSpan only generates one node that contains both frequent pattern and its dual (if the both are frequent). Moreover, when we compute the balance as post-processing of PrefixSpan (the PrefixSpanNaive algorithm), we observe that it is not able to deal with litlle minsupp: it is not scalable.

Qualitative aspects. The distribution of the balance related to the support of the patterns shows that the game in its current state is globally balanced: the more frequent a strategical pattern is, the more balanced it is (balance equals to 0.5). For example, among 290 patterns output in the results of the dataset that gathers the games between Zerg versus Zerg, the pattern <(SpawPool+)(SpawnPool-)(SpiCrawler+)(RoachWarren+)(Spire-)> denotes games where one of the player go to air units and the second to ground units, two different known openings, with balance 0.47 and support 68.