trueskill parameter tuning results


Postby Axle » 07 Feb 2016, 14:46

In case anyone is interested in "optimum" trueskill parameters for the FAF 1v1 ladder, these might qualify:

- mu = 1500
- sigma = 500
- beta = 240
- tau = 18
- draw_probability = 0.045

You can read slightly more about it in the readme.pdf here: https://github.com/Axle1975/pytrueskill
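
For anyone who wants to try these numbers without installing anything, here is a minimal sketch of the standard two-player TrueSkill update for a decisive game, using only the Python standard library. It is the usual closed form for a 1v1 win, not necessarily how pytrueskill or the FAF server implements it, and the function name `rate_1vs1_win` is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Parameters proposed in this post for the FAF 1v1 ladder.
MU, SIGMA, BETA, TAU, DRAW_PROBABILITY = 1500.0, 500.0, 240.0, 18.0, 0.045

_N = NormalDist()  # standard normal distribution


def rate_1vs1_win(winner, loser, beta=BETA, tau=TAU,
                  draw_probability=DRAW_PROBABILITY):
    """Return updated (mu, sigma) pairs for the winner and the loser.

    Hypothetical helper: ratings are plain (mu, sigma) tuples.
    """
    (mu_w, sigma_w), (mu_l, sigma_l) = winner, loser
    # Dynamics step: tau inflates each variance before the game is scored.
    var_w = sigma_w ** 2 + tau ** 2
    var_l = sigma_l ** 2 + tau ** 2
    c = sqrt(2 * beta ** 2 + var_w + var_l)
    # Draw margin for two single-player teams, scaled by c.
    eps = _N.inv_cdf((draw_probability + 1) / 2) * sqrt(2) * beta / c
    t = (mu_w - mu_l) / c
    v = _N.pdf(t - eps) / _N.cdf(t - eps)  # mean correction
    w = v * (v + t - eps)                  # variance correction
    new_winner = (mu_w + var_w / c * v, sqrt(var_w * (1 - var_w / c ** 2 * w)))
    new_loser = (mu_l - var_l / c * v, sqrt(var_l * (1 - var_l / c ** 2 * w)))
    return new_winner, new_loser


# Two brand-new players; the first one wins.
new_winner, new_loser = rate_1vs1_win((MU, SIGMA), (MU, SIGMA))
print(new_winner, new_loser)
```

Draws would need a different pair of correction functions, which this sketch leaves out.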
Axle
Avatar-of-War
 
Posts: 93
Joined: 02 Apr 2013, 10:14
Has liked: 0 time
Been liked: 4 times
FAF User Name: Axle

Postby yorick » 07 Feb 2016, 23:27

This certainly looks interesting. Once everything has calmed down a bit with the new server, this is worth a deeper look (maybe for team games as well later).
The draw probability in FA depends quite a bit on the map (i.e. it is much higher on 5x5 maps than on bigger maps). I vaguely recall some map stats on FAF that included draw probability, but I don't know whether that was used or a global draw probability.
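
If per-game records were available, a per-map draw rate could be estimated along these lines. This is only a sketch: the record layout `(map_name, was_draw)`, the map names, and the sample values are all made up for illustration and are not real FAF data.

```python
from collections import defaultdict


def draw_rate_by_map(games):
    """Return {map_name: fraction of games that ended in a draw}."""
    counts = defaultdict(lambda: [0, 0])  # map_name -> [draws, total games]
    for map_name, was_draw in games:
        counts[map_name][0] += int(was_draw)
        counts[map_name][1] += 1
    return {name: draws / total for name, (draws, total) in counts.items()}


# Made-up sample records, NOT real ladder results.
sample_games = [
    ("small_5x5_map", True),
    ("small_5x5_map", False),
    ("large_20x20_map", False),
    ("large_20x20_map", False),
]
print(draw_rate_by_map(sample_games))
```

With few games per map the estimates will be noisy, so some shrinkage towards the global rate would probably be needed in practice.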

Also, it might be interesting to compare these values to the ones that are currently used. I found some values in the server code, but I don't know if these are the ones actually used (or whether they apply to everything or only to the ladder/global rating): https://github.com/FAForever/server/blob/a9e878ed09eed1cc19dd88518e2ce491d0e860f4/config.py#L26
- mu = 1500
- sigma = 500
- beta = 250
- tau = 5
- draw_probability = 0.10
yorick
Avatar-of-War
 
Posts: 113
Joined: 01 Oct 2014, 02:53
Has liked: 16 times
Been liked: 27 times
FAF User Name: yorick

Postby Axle » 08 Feb 2016, 01:03

Hi Yorick! Thanks for that feedback.

I think you're quite right that draw_probability depends on the map (and on whether your name is Lame or not); that's probably worth exploring. The draw_probability used here is just a global value.

One of the things I originally wanted to explore was the possibility of modelling skill as a function of map and faction matchup too. All the necessary data is available. However, I figured the above results could be applied to FAF immediately, since they require no fiddling with the existing trueskill algorithms.

It's also interesting that the existing tau is so much lower than my tau. I have two comments on that:
- I noticed that there's a rapidly increasing penalty as tau becomes much smaller.
- Some players have mentioned that after playing many games, they've subjectively found trueskill too sluggish to adjust their ratings. This could be a symptom of a too-low tau.
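
To illustrate the second point, here is a rough sketch of how tau sets the floor that sigma settles at after many games. It uses the standard two-player TrueSkill update for decisive games and makes several simplifying assumptions: the draw margin is ignored, beta is held at 250 for both runs, and two equal players simply alternate wins. The helper names are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

_N = NormalDist()  # standard normal distribution


def rate_win(winner, loser, beta, tau):
    """Two-player TrueSkill update for a decisive game (draw margin ignored)."""
    (mu_w, sigma_w), (mu_l, sigma_l) = winner, loser
    var_w = sigma_w ** 2 + tau ** 2  # tau re-inflates uncertainty every game
    var_l = sigma_l ** 2 + tau ** 2
    c = sqrt(2 * beta ** 2 + var_w + var_l)
    t = (mu_w - mu_l) / c
    v = _N.pdf(t) / _N.cdf(t)
    w = v * (v + t)
    return (
        (mu_w + var_w / c * v, sqrt(var_w * (1 - var_w / c ** 2 * w))),
        (mu_l - var_l / c * v, sqrt(var_l * (1 - var_l / c ** 2 * w))),
    )


def settled_sigma(tau, beta=250.0, games=2000):
    """Sigma of one player after many games against an equal opponent."""
    a = b = (1500.0, 500.0)
    for i in range(games):
        if i % 2 == 0:
            a, b = rate_win(a, b, beta, tau)  # a wins even-numbered games
        else:
            b, a = rate_win(b, a, beta, tau)  # b wins odd-numbered games
    return a[1]


for tau in (5.0, 18.0):
    print(f"tau={tau}: sigma settles near {settled_sigma(tau):.1f}")
```

Since the size of each mean update scales with the player's variance, a lower sigma floor means smaller rating steps for veteran players, which would fit the "sluggish" complaint.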

Postby Sheeo » 08 Feb 2016, 02:00

Amazing work Axle. I'd love to get you set up with more data -- Aulex has been working on an API for searching through game results, it's still being worked on though.
Sheeo
Councillor - Administrative
 
Posts: 1038
Joined: 17 Dec 2013, 18:57
Has liked: 109 times
Been liked: 233 times
FAF User Name: Sheeo

Postby Axle » 08 Feb 2016, 02:46

Thanks Sheeo! I'd love to get more comprehensive and up-to-date data. There's no knowing how many missing games there are in my existing dataset.

Postby Aulex » 08 Feb 2016, 04:46

Sheeo wrote: Amazing work Axle. I'd love to get you set up with more data -- Aulex has been working on an API for searching through game results, it's still being worked on though.

Still on hiatus until I find time.
Nice work Axle. In terms of searching in relation to rating, I only set up rating bounds. Did you want something more specific, or will this be sufficient?
Last edited by Aulex on 08 Feb 2016, 07:01, edited 1 time in total.
Aulex
Contributor
 
Posts: 1050
Joined: 17 Nov 2012, 05:29
Has liked: 299 times
Been liked: 225 times
FAF User Name: VoR_Aulex

Postby Axle » 08 Feb 2016, 05:09

Hi Aulex, I wouldn't really want to search by rating. I'd be more interested in all 1vs1 games and their results.

Without knowing the FAF db schema exactly (and pls forgive my rusty SQL), I suspect I'd like something similar to:

-- table and column names are guesses at the schema
select replay.id, replay.map_name, replay.time_start, replay.time_end, replay.duration, replay.game_type,
       player.name,
       player.uniqueID,  -- so as not to be confused by name changes??
       player.score, player.faction, player.rating_mean, player.rating_stdev
from replays as replay
join player on replay.id = player.replayid
where replay.game_type = '1vs1'

But for good measure, I'd like all the custom games too, for later on :D

In fact, maybe it's better if I can just get my hands on a backup of the whole db :D:D

Postby Softly » 08 Feb 2016, 21:28

I too would like some stats on past games to play with
Softly
Supreme Commander
 
Posts: 1009
Joined: 26 Feb 2012, 15:23
Location: United Kingdom
Has liked: 150 times
Been liked: 251 times
FAF User Name: Softles

Postby Axle » 09 Feb 2016, 10:12

Here's some interesting figures:

Plots of rating progression for the 5 most prolific ladder players. In blue we have the current trueskill parameters. In red, my "optimum" parameters. And in green, halfway between.

So I guess the most obvious thing is that the "optimum" progression is a lot more volatile. The other thing to notice is that for tau=10, the log likelihood (L) isn't really that much worse than for tau=18. As is often the case with maximum likelihood optimisations, the absolute optimum isn't necessarily the only acceptable solution; there are many suboptimal solutions that are almost as good. And if we throw into the mix that we don't like the degree of volatility that tau=18 gives us, well, tau=10 is less volatile and is almost as predictive.

[5 images: rating progression plots for the 5 most prolific ladder players]

So what I'm saying is, I think maybe tau=10 is better :D

btw, the difference between pdraw=0.1 and pdraw=0.045 is negligible
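
One way to get a feel for this: in the standard TrueSkill model the draw probability only enters through the draw margin, which can be computed directly. The sketch below assumes the usual formula for two single-player teams and uses beta=240; the function name is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def draw_margin(draw_probability, beta, players=2):
    """Performance gap below which a game counts as a draw."""
    return NormalDist().inv_cdf((draw_probability + 1) / 2) * sqrt(players) * beta


for p in (0.10, 0.045):
    print(f"pdraw={p}: draw margin = {draw_margin(p, beta=240.0):.1f}")
```

Both margins come out well below beta, which may be part of why the choice makes so little difference to the fit.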
Last edited by Axle on 26 Feb 2016, 23:01, edited 1 time in total.

Postby Softly » 23 Feb 2016, 22:04

The most important criterion is whether your tuned version of trueskill predicts results better.

What sort of success rates does it get vs the current version?
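
A minimal sketch of how such a comparison could be scored, assuming each game record holds both players' ratings as they stood before the game. It reports the fraction of games where the favourite won and the mean log likelihood. Draws are ignored, the win probability uses the usual TrueSkill performance model, and the function names and record layout are made up for illustration.

```python
from math import log, sqrt
from statistics import NormalDist

_N = NormalDist()  # standard normal distribution


def win_probability(a, b, beta):
    """P(a beats b) for ratings given as (mu, sigma), ignoring draws."""
    (mu_a, sigma_a), (mu_b, sigma_b) = a, b
    return _N.cdf((mu_a - mu_b) / sqrt(2 * beta ** 2 + sigma_a ** 2 + sigma_b ** 2))


def score_predictions(games, beta):
    """games: iterable of (rating_a, rating_b, a_won), with pre-game ratings.

    Returns (hit rate of the favourite, mean log likelihood per game).
    """
    hits, total_ll, n = 0, 0.0, 0
    for a, b, a_won in games:
        p = win_probability(a, b, beta)
        p = min(max(p, 1e-12), 1 - 1e-12)  # guard against log(0)
        hits += int((p > 0.5) == a_won)
        total_ll += log(p if a_won else 1 - p)
        n += 1
    return hits / n, total_ll / n


# Made-up records, NOT real ladder data: the favourite wins once and loses once.
sample = [
    ((1600.0, 100.0), (1500.0, 100.0), True),
    ((1600.0, 100.0), (1500.0, 100.0), False),
]
print(score_predictions(sample, beta=240.0))
```

Running the same games through both parameter sets (each with its own rating history) and comparing the two scores would answer the question directly.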