Archive for the ‘Games’ Category

Win rate in World of Tanks – a quantitative Analysis

World of Tanks (WoT) is a multiplayer online action game where players compete in 15-vs-15 player random tank battles.

This article has been posted on WoT forums: Winrate for Mathematicians – a quantitative Analysis

This post is about win rate as a reliable and objective measure of a player’s skill and is the first post to my knowledge to include a quantitative analysis. (In the hope it will get “stickied”)

WoT is about winning tank battles and win rate measures exactly that. It does so without any “a priori” modelling, unlike all other so-called performance rating schemes. It is a blind judge that assesses a player’s contribution to winning a battle, independent of what the player has actually done. It already includes assessment of any tactics you can think of (and even those you can’t think of!) out of the box.

So what’s the problem people have with winrate? The problem is that winrate is a statistical measure. Roughly speaking, winrate gets significant (read meaningful) only after a lot of battles fought.

So how much is “a lot of” actually?

We assume that the skill of a player is strongly linked to what we call the “intrinsic winrate” of the player, which is given as the mean probability of winning a random battle when he participates, evened out over all tanks, maps, friendly and enemy team composition and any other random factors that might be involved.

Mathematically speaking, we must distinguish between this “intrinsic winrate” (linked to skill), which is unknown, and the player’s “actual winrate”, which serves as an estimate for the former.
This estimate gets better and better as more battles are fought.

We can then ask the following question:
How many battles do we need to fight so that a given difference in actual winrate between two players cannot be the effect of pure randomness and thus must be linked to a real difference in intrinsic winrate (that is, the skill of winning)?

If we don’t take draws into account (they make up only about 2% of battles), this is similar to asking whether a coin is biased (i.e. intrinsically different from a given “perfect coin”) when all we have is a given number of outcomes of flipping that coin.

That is, we can apply the same reasoning as in http://en.wikipedia.org/wiki/Checking_whether_a_coin_is_fair.

Taking the formula given in http://en.wikipedia….rue_probability,
we obtain for the number n of battles to fight in order to ascertain that a winrate difference is significant

n = Z^2 / (4 * E^2) 

where Z is the Z-value corresponding to the desired level of confidence

and E is the maximum tolerated error.

Taking E=0.01 (in order to “resolve” differences in winrate at the 1%-point level) and Z=2 (a usual scientific grade 95%+ confidence level), we obtain

n = 10000 

which in full words reads:

Given two players A and B with win rates Wa and Wb (in the interval [0..1]) and Wb = Wa + 0.01 (a 1%-point difference), we need around 10k battles to be reasonably certain that B is the better player, i.e. that the difference in their winrates cannot be explained by sheer luck/randomness/MM.

(Note that if we allow an error of E=0.05, we go down to n=400.
So if B has Wb=55% compared to Wa=50%, we can be pretty sure after only 400 battles that B is doing better than A. But we shouldn’t draw any firm conclusion about a player C with Wc=53% compared to A or B.)
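
The arithmetic above is easy to reproduce. Here is a minimal Python sketch of the formula n = Z^2 / (4 * E^2); the function name battles_needed is only an illustration and not part of any library.

```python
def battles_needed(error, z=2.0):
    """Worst-case (p = 0.5) sample size n = Z^2 / (4 * E^2).

    error: tolerated error E as a fraction (0.01 means 1 %-point)
    z:     Z-value of the confidence level (2 is roughly 95%)
    """
    # round() absorbs tiny floating-point deviations such as 399.99999999999994
    return round(z * z / (4.0 * error * error))


for e in (0.01, 0.05):
    print(f"E = {e:.2f}: about {battles_needed(e)} battles")
```

Because n grows with 1/E^2, halving the tolerated error quadruples the number of battles required.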

As a rule of thumb, let’s retain

  • Do not even think about comparing win rates at around 100 battles.
  • Winrate starts to get meaningful at around 1K battles, but you shouldn’t be too confident when comparing 1%-point differences.
  • Around the 10k battles order-of-magnitude, winrate can be used to compare winning-skill with a scientific-grade confidence level down to a 1%-point resolution.

(This is true for overall winrate but also when comparing tank-wise, tier-wise etc.)

And as a corollary for the many stats sites out there…

  • Printing decimal fractions on winrates (like in “53.28%”) is ridiculous, as it would require an astronomical number of battles for this resolution to become significant.
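
To put numbers on these rules of thumb, the same formula can be solved for E, which gives E = Z / (2 * sqrt(n)). The following Python sketch (the helper name margin_of_error is my own choice, Z=2 as before) prints the resolution that a given number of battles supports.

```python
import math


def margin_of_error(battles, z=2.0):
    """Worst-case margin E = Z / (2 * sqrt(n)), i.e. n = Z^2 / (4 * E^2) solved for E."""
    return z / (2.0 * math.sqrt(battles))


for n in (100, 1_000, 10_000, 100_000_000):
    points = 100 * margin_of_error(n)
    print(f"{n:>11} battles: +/- {points:.2f} %-points")
```

With these assumptions, the second decimal place of a winrate like “53.28%” only becomes significant at around a hundred million battles.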

Some common misconceptions about winrate…

Is it possible to boost one’s winrate?

… by driving in platoons or companies?

Short answer: No.

Because for each win of a company team, there is one loss of a company team.
The same is roughly true for platoons. It is irrelevant that platoon members probably cooperate better than random players, because the competitors for win rate are not only the random players but also the platoons on the other team! The match maker explicitly tries to distribute platoons evenly among both teams. This is not always perfect, but given a sufficient number of matches, there is a good chance that you end up with the same number of platoons having played against you as with you (including your own).
So platooning by itself cannot increase winrate, for the same reason that companies can’t.

However, what is almost certain is that playing in a *good* platoon or *good* company will increase your win rate (where *good* means an above-average intrinsic winrate of the participating players). It is then quite likely that the platoon or company will actually score even better than the average win rate of its constituent players, due to synergy effects that arise among good players and the fact that such platoons/companies are more likely to face platoons/companies with a lower winrate than their own.

In other words, imagine that a company is made of 10 players with a 53% intrinsic winrate each. Then it is almost sure that the company will win significantly *more* than 53% of the time, because it will face many companies with less skilled players.
(This claim needs some thorough statistical verification though. It is based on the assumption that players of all winrate segments play companies at an equal frequency, which might not be that obvious)

So apart from having some very skilled and altruistic friends, the only way to increase your winrate in companies and platoons is to already be a good player to start with.

… by seal-clubbing on low tiers?

Yes, there are numerous examples of players farming winrate this way. Still, there is something to be noted here: seal-clubbing works only as long as the number of seal-clubbers is low compared to the number of seals. (I leave working out this obvious fact to the ambitious reader. Hint: What if there’s also a seal-clubber on the enemy team? ;))

… by driving OP tanks?

Again yes, it is possible. OPness is actually *defined* by the fact that players score better winrates in these tanks than in the others they drive.
(See http://ftr.wot-news….nk-performance/)
But again, when those tanks are played a lot, people in those tanks will end up facing these tanks also in the enemy team, thus cancelling the effect out.
This, by the way, is the reason why, for example, the KV-1S (by far the most driven T6 heavy) has a relatively poor overall tank-wise winrate (http://www.vbaddict….=won_lost_ratio), although when comparing the tank’s winrate player-wise, we can see that it is blatantly OP: http://ftr.wot-news….013/12/kvas.png
So the KV-1S is an example of an OP tank that cannot be used to effectively farm winrate for the simple reason that it is too abundant on the battlefield.

As a corollary, we can assert that,
whatever technique you find to increase winrate, its effectiveness will always be limited by the number of players using that technique. In that sense, winrate boosting is self-regulating to some extent.

One final word…

“That can’t be true! [This and that experience i had that other day] clearly shows that …”

No. Please stop it. The thoughts presented above are of a statistical kind. Don’t even try to argue against a statistical argument with personal experience. You are off-topic in that case.
Not that I am immune to criticism, but your arguments should be based on math and at least 10k objective data points. (If you are making a more general point, then 1 million data points is preferable.)
Thank you.


Updates after discussion

2014-02-16
It has been argued that since WoT is way more complex than tossing a coin, its results cannot be predicted by such a simple approximation and thus the reasoning must be flawed.

While the part about the complexity is obvious, the *results* of both games are the same: win (heads) or lose (tails), neglecting draws. Thus the same statistical treatment as for any binomial distribution can apply.
That is the whole strength of the winrate argument, that it does *not* rely on any modelling of the underlying game. We analyse past yes/no results statistically and find ways to forge our confidence for predicting future yes/no results. We don’t need to know the mechanics behind any individual yes/no result at all.

2014-02-16
It has been noted that winrate also depends on using premium/gold advantages and thus winrate measures “performance” rather than “skill”.

This is a valid point. It is certainly true that winrate will reflect this in the long run. Feel free to call “performance” what I named “winning-skill”.
Note that my initial argument is not “winrate equals skill” but rather: given a difference in winrate, how many games must you have played so that the difference cannot be explained by randomness/MM? Anything other than randomness could then be a valid reason.
Though I still believe that “skill” is the most prominent non-random effect.

Update 2014-02-28

I have been given a set of real-life win/loss stats over nearly 2000 battles by SpankyMnky (thanks!). I want to seize the opportunity to get rid of some flawed perceptions people have when considering random effects.
The following graph shows the wins (1) and losses/draws (0) over time (pale blue bars in background).

[Figure: spankymnky-win-lose-draw]

The green line is a moving average over 10 battles, the dark blue one over 100 and the red one over 1K.

The graph nicely shows that although apparent randomness seems to be at full effect at the smaller scales, our player does hold a pretty stable above average winrate between 51 and 53% in the long run.
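
For readers who want to reproduce this kind of graph with their own battle log, here is a minimal Python sketch of the three moving averages. It does not use Spankymnky’s data: the win/loss series is simulated with an assumed intrinsic winrate of 52% and an arbitrary random seed, and the function name moving_average is just an illustration.

```python
import random


def moving_average(results, window):
    """Trailing moving average of a 0/1 series, one value per full window."""
    if window > len(results):
        raise ValueError("window is longer than the series")
    running = sum(results[:window])
    averages = [running / window]
    for i in range(window, len(results)):
        # slide the window by one battle: add the newest, drop the oldest
        running += results[i] - results[i - window]
        averages.append(running / window)
    return averages


random.seed(1)   # arbitrary seed, only to make the run repeatable
P_WIN = 0.52     # assumed intrinsic winrate, not taken from the real data
results = [1 if random.random() < P_WIN else 0 for _ in range(2000)]

for window in (10, 100, 1000):
    avg = moving_average(results, window)
    print(f"window {window:>4}: min {min(avg):.2f}, max {max(avg):.2f}")
```

The short window swings widely between its minimum and maximum, while the 1K window stays close to the assumed winrate, which is the same pattern as in the graph.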

Some things worth pointing out:
The green 10-battles-average line goes down to 0.1 on several occasions and even touches 0 once.
That means 9+ losses-in-a-row streaks for our poor friend Spankymnky!

Still no reason to start an “MM is rigged” thread, as that line also touches or passes the 0.9 line on even more occasions (that’s 9+ wins in a row; the same people would probably *not* cry “rigged MM” in that case, huh?). It should become apparent to anyone that 10 (or 20 or 50) battle results are of virtually no use whatsoever for making any claim about a systematic bias.

However, Spanky’s above-average performance was probably at work in nearly all of those battles, and this shows as a systematic offset of the 1K line.

Again in this practical example, 1K battles comes out as a minimum order of magnitude to give reasonable and significant results. Remember this as the primary argument in any loss-streak whine-threads.

Thanks again to Spankymnky for providing the data!

Update 2014-06-21

On 19 March 2014, ortega456 posted the following comment:

On another note nobody mentioned Binomial-Distribution here which covers most part of WOT Winrate statistics:

If you wanna know how probable your current losing/winning streak is at the number of games currently played, just take the Binomial-Distribution:

  • n: Number of games played
  • k: Number of games won/lost
  • p: Your recent WinRate/LosingRate

Example:

If you wanna know what the probability is to win exactly 5 out of 10 games, if your global winrate is 50% you get => 24.6%

(k=5, p=0.5, n=10)

If you wanna know what the probability is to win between 4 and 6 games at 50% global winrate you just sum over the probabilities:

B(4|10,0.5) + B(5|10,0.5) + B(6|10,0.5) = 65.63%

Another example for a losing streak:

Let’s say a guy who wins 60% of his games loses 8 of 10 battles (20% WR) and starts to rage that the RNG is rigged. He has 10,000 battles played. The given formula gives us B(2|10,0.6) = 1% probability.

You might say now 1% is so improbable that this guy is right. However the relative frequency of occurrence value is 0.01 * 10,000 [games] = 100. This means if he plays every day 10 games he should have had 100 days, where he gets this 20% winrate, which corresponds to the graph of OP where the green line has strong fluctuations.

Let’s see how probable it is to get 40% winrate within 100 games, when your global winrate is 60%:

B(40|100,0.6) = 0.002%. So if he had played 10k games he has to expect this to happen 0.244 times, which means he most likely shouldn’t expect this kind of a losing streak within his 10k games. That corresponds to the blue line in OP’s graph pretty well. It also shows that the more games you play the more probable it gets that you get a winrate according to your global winrate.

So whenever you wanna calculate winrate-probabilities and complain about RNG is rigged give math a try first 😉
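
The probabilities quoted in this comment can be checked with a few lines of Python. This is only a sketch: binom_pmf is a hand-written helper (not a library function) that implements B(k|n,p) = C(n,k) * p^k * (1-p)^(n-k), using math.comb from the standard library (Python 3.8 or later).

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k wins in n battles with win probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# exactly 5 wins out of 10 at a 50% winrate
print(f"B(5|10,0.5)   = {binom_pmf(5, 10, 0.5):.4f}")

# between 4 and 6 wins out of 10 at a 50% winrate
print(f"4 to 6 of 10  = {sum(binom_pmf(k, 10, 0.5) for k in (4, 5, 6)):.4f}")

# exactly 2 wins out of 10 at a 60% winrate
print(f"B(2|10,0.6)   = {binom_pmf(2, 10, 0.6):.4f}")

# at most 2 wins out of 10 at a 60% winrate (streaks at least this bad)
print(f"<= 2 of 10    = {sum(binom_pmf(k, 10, 0.6) for k in range(3)):.4f}")

# exactly 40 wins out of 100 at a 60% winrate
print(f"B(40|100,0.6) = {binom_pmf(40, 100, 0.6):.6f}")
```

Note that B(k|n,p) is the probability of exactly k wins; for “a streak at least this bad” the individual terms have to be summed, as in the fourth line.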


Extending the Supreme Commander/Forged Alliance Experience – Part 2

In my previous post, I described how our group of casual gamers had identified some shortcomings in Supreme Commander – Forged Alliance (Patch 1.5.3599) that we wanted to have fixed.

In short, these were:

  • Air is way too strong
  • Experimentals are too cheap.
  • Economy runs away too fast for less gifted players and becomes pointless once T3 is reached.

In this post, I present three mods we developed and used to improve the situation:

  • Weak Air

This mod simply reduces weapon damage and health points for air units to 85% of their initial value for T1 and T2 and to 60% for T3! Some other adjustments are made for transports and air experimentals. See the lua file for details.

  • Expensive Experimentals

This mod doubles the cost of all experimental units and increases their build time by a factor of 1.5!

  • Limited Eco

This mod simply disables all T3 extractors as well as all metal converters.

I have also made a UI mod that maps some new keys, one of which selects the extractor of lowest upgrade level that is nearest to the mouse cursor position. Another key upgrades the selected extractor. Less experienced players will like this easy way of upgrading just by pressing the two keys in sequence.

The use of these mods has greatly improved our game experience. We saw again what we loved about TA: epic battles for supremacy of land, nice tactical drop operations behind enemy lines, excellent team coordination on sneak attacks etc. Goodbye to boring air domination and experimental rush, welcome back to the fun of a great tactical war game!

Some of the mods underwent several balancing cycles. Overall we consider them stable, but there might still be some room for improvement here and there. We nearly always play with all three mods enabled. But we keep them separate, so that other people can start by trying only one or two.

You are free to try them out yourselves and give me feedback!

All mods are available with their source code here: https://trac.assembla.com/scfa-mods/wiki/WikiStart

 

Future plans…

I have some plans for other mods, that extend on the above ideas.

Why bother with mass extractors and upgrades at all? What we want is a battle for the metal spots, but the rest should be somehow automated.

A first idea is to build the extractors once and then have them automatically provide more and more income the longer they live. It would be a kind of automatic, smooth upgrade. I’m still trying to figure out the best function for production output over time: linearly, (slightly) exponentially or logarithmically increasing? None of these has an upper bound. A saturating exponential with both a lower and an upper bound would probably be OK too, something like: P(t) = Pmax - (Pmax - P0) * exp(-lambda*t). This starts at P(0) = P0 and asymptotically approaches Pmax as t goes to infinity.
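
To get a feeling for the bounded curve, here is a small Python sketch of P(t). It is not code from any of the mods (those are written in Lua); the numbers for P0, Pmax and the half-life are made-up values for illustration only.

```python
import math


def extractor_output(t, p0, p_max, lam):
    """P(t) = Pmax - (Pmax - P0) * exp(-lambda * t)."""
    return p_max - (p_max - p0) * math.exp(-lam * t)


# Hypothetical tuning values, chosen only to show the shape of the curve
P0, P_MAX = 2.0, 18.0          # output right after building / upper bound
HALF_LIFE = 300.0              # seconds until half of the gap to Pmax is closed
LAMBDA = math.log(2) / HALF_LIFE

for seconds in (0, 300, 600, 1200, 2400):
    value = extractor_output(seconds, P0, P_MAX, LAMBDA)
    print(f"t = {seconds:>4} s: {value:5.2f}")
```

Expressing lambda through a half-life makes the parameter easier to balance: every HALF_LIFE seconds the remaining gap to Pmax is cut in half.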

Pushing this idea even further: maybe we can get around building extractors altogether? Why not simply attribute the metal spot to the player that has the most (or strongest? or most engineers?) land units within a given perimeter around the spot. We just need a visual cue showing which player the spot belongs to. Of course, as soon as a spot switches sides, its production output starts over from the minimum value again. This would make it interesting to attack spots far behind the enemy lines, even if they cannot be held for a long time. The difference from the first idea though is that spots could not be destroyed any more (e.g. with an air attack).
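
The “most land units in the perimeter” variant of this rule could look like the following Python sketch. It is purely illustrative and does not use the game’s API: the unit list format and the function name spot_owner are assumptions of mine, as is the decision to leave a spot unowned when the perimeter is empty or when two players are tied.

```python
import math
from collections import Counter


def spot_owner(spot, units, radius):
    """Return the player with the most land units within `radius` of `spot`.

    spot:  (x, y) position of the metal spot
    units: iterable of (player, x, y) tuples, one per land unit (assumed format)
    Returns None if nobody is in range or if the top two players are tied
    (the tie rule is an assumption, the idea above does not specify one).
    """
    counts = Counter(
        player
        for player, x, y in units
        if math.hypot(x - spot[0], y - spot[1]) <= radius
    )
    ranked = counts.most_common(2)
    if not ranked:
        return None
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


# Made-up example: two red units and one blue unit near the spot, one blue unit far away
units = [("red", 1, 1), ("red", 2, 0), ("blue", 0, 2), ("blue", 50, 50)]
print(spot_owner((0, 0), units, radius=5))
```

Whenever the returned owner changes, the production clock of the spot would be reset so that output starts again from the minimum value.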

 

Extending the Supreme Commander/Forged Alliance Experience – Part 1

We have been playing Total Annihilation (TA) for over 6 years, since 2003. “We”, that is a group of friends and colleagues, initially located around Strasbourg, France. And even if the exact number and composition of the group varied over time, there were nearly always about six to seven of us playing once a week, during a long evening. Most of us are a bit older ;-), have a family and a job, so we are what people would call “casual gamers”. Most of the initial founding folks are still in, even if now more geographically separated.

TA (well, that is TA:ÜH, a major mod called Überhack 4.0, to be precise) has been a lot of fun to play and I still consider it one of the best RTS games ever (in particular compared to its competitor StarCraft).

So it was only natural for us to move to Supreme Commander (and the Forged Alliance extension, some time later), once it had settled to a stable and affordable game in 2009. The SupCom experience, although praised as TA’s “spiritual successor” everywhere, proved to be a lot different than TA:ÜH though.

It seemed they had mixed in many of the elements of StarCraft, that made it speedier, rushier and much more economy-dependent than TA used to be. SupCom/FA seemed to be aimed at the fast 1vs1 duel environment, typical for online centric games.

Our environment is different: with a typical number of around six players, the chosen maps are rather large (20km x 20km is common). This means that surface units spend a lot of time moving to the enemy. This wouldn’t have been a big deal, had there not been some serious flaws in the game design for this kind of scenario.

We quickly identified several shortcomings (on the basis of the latest official patch 1.5.3599) that needed to be addressed in order to save the tactical interest of the game:

  • Air is way too strong

Air units not only have a speed advantage, but were also overpowered. Stationary anti-air was nearly ineffective so that nobody really used it.

Nearly all major games ended up with gigantic aircraft clouds moving around the map, destroying everything on their passage, only to be stopped by another even more massive swirl of defensive aircraft. The map had gotten “flat” at this stage of play. The game tactics came down to producing more aircraft in less time than the enemy. This seemed to be pointless.

We wanted the TA feeling back, where you could do serious harm with a well-placed bomber run, but it was almost never possible to crush a suitably well-defended base with aircraft only.

  • Unit life cycle is too fast. Experimentals are too cheap.

Another consequence of large maps is that units that have made their way to the enemy will most often face defenders of higher level than their own, due to the simple fact that the defender had time to upgrade while the lower level units were still inbound.

The result is a systematic shift to a late attack strategy, since early (i.e. low level) attacks are nearly always doomed to be crushed at the “level barrier”. The consequence is a rush to experimentals, with low to zero engagements during the early and mid-game. Boring!

  • Economy runs away too fast for less gifted players and becomes pointless once T3 is reached.

This is of course more a problem of group homogeneity than of the game itself. Nevertheless it can ruin the fun.

Consider that you have a mix of more and less experienced players in your group. (This will almost always be the case somehow.) There is usually no problem with this if teams are mixed in the right way, with some less experienced players teaming up with the more experienced on both sides. The problem is that SupCom/FA is very economy-dependent. The economy in FA runs away exponentially, leaving the less gifted players behind in a spectator role far too early.

Another problem with the economy in SupCom/FA is that once Level 3 extractors and converters are built up, there are so many resources (for the gifted players, that is) that the whole game turns into “how to spend income more quickly than the enemy”. Again all tactical options fall behind a purely mass fabrication strategy. There is so much income available that considerations about land occupation become totally unimportant.

We wanted to have a fight for every acre of exploitable land instead, and that until the late game!

Read how we managed to address each of these points in the following post!