Forged Alliance Forever Forged Alliance Forever Forums 2015-03-11T19:58:07+02:00 /feed.php?f=2&t=4191 2015-03-11T19:58:07+02:00 2015-03-11T19:58:07+02:00 /viewtopic.php?t=4191&p=95931#p95931 <![CDATA[Re: Regression on Game data]]>
Sheeo wrote:
Pigeonfighter wrote:
partytime wrote:what did you write the replay parser in?


I translated the source from viewtopic.php?f=41&t=7075 to Matlab code. It's slow for the moment (it takes 10 sec to analyse a long game with a lot of players) but seems to work most of the time (1 crash in 7000 games).


Sounds like a true statistician at work here ;)

Much appreciated! Let me know if you need help with grabbing bulk data.


Nice work! Will be excited to see what you come up with!

Statistics: Posted by sasin — 11 Mar 2015, 19:58


]]>
2015-03-10T07:16:42+02:00 2015-03-10T07:16:42+02:00 /viewtopic.php?t=4191&p=95774#p95774 <![CDATA[Re: Regression on Game data]]>

Statistics: Posted by -_V_- — 10 Mar 2015, 07:16


]]>
2015-03-09T22:08:41+02:00 2015-03-09T22:08:41+02:00 /viewtopic.php?t=4191&p=95751#p95751 <![CDATA[Re: Regression on Game data]]>
-_V_- wrote:
Are you guys implying that the goal of this is to get a 25% win rate on Sera, 25% on Cybran, 25% on Aeon and 25% on UEF, and then decide the game is balanced?

It would never work out that way: there are different skill levels between players, an uneven distribution of factions, different learning curves for each, etc. (Also, technically, given perfect game balance at the highest levels of play, each faction would win 50% of the time, not 25%.) But it's good to have these statistics to potentially point out map imbalances between factions by highlighting outliers, and to judge whether a balance change is having the desired effect. Percentages probably won't get closer to some "ideal" balance percentage for each faction, but given enough time they will noticeably change from the previous average if a change is making a difference to a single faction and not others (assuming the change is large enough to fall outside the margin of error), and we'll be able to see the direction of this change. It's also good for spotting unintended consequences of changes (handy with a bunch of hitbox changes potentially coming up here). The more data you can pull, the more you could potentially get out of it on the impact of any given change, but it will never tell you that the game is perfectly balanced. If made into a tool, this sort of analysis could also be something you use to assess your own play over time.


Trying to use it the way you mention would probably only tell you the difficulty curve of learning each faction at the average rating level on FAF. Applying that logic to pro games only would not work because there wouldn't be a statistically significant sample size (both in terms of player count and players per faction). Even if you had such a sample size, it still would not account for problems like single dominant strategies/tactics, nor be hugely useful without deciding what map/game style you were basing balance on (in our case that is way more than one, and balance will never be perfect on everything).
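
To make the "outside the margin of error" point concrete, here is a minimal sketch of one way to check whether a faction's win rate moved after a balance change: a standard two-proportion z-test. This is only an illustration, not something anyone in the thread ran. The function name and the input counts are made up, and the test assumes games are independent, which real ladder data only approximates.

```python
import math

def win_rate_shift(wins_before, games_before, wins_after, games_after):
    """Two-proportion z-test on a faction's win rate before/after a patch.

    Returns (difference in win rate, z statistic, two-sided p-value).
    All inputs are plain counts; the names are hypothetical.
    """
    p_before = wins_before / games_before
    p_after = wins_after / games_after
    # Pooled win rate under the null hypothesis "the patch changed nothing"
    pooled = (wins_before + wins_after) / (games_before + games_after)
    se = math.sqrt(pooled * (1 - pooled) * (1 / games_before + 1 / games_after))
    z = (p_after - p_before) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_after - p_before, z, p_value

# Made-up counts: 480 wins in 1000 games before, 560 in 1000 after
diff, z, p = win_rate_shift(480, 1000, 560, 1000)
print(f"shift={diff:+.3f}  z={z:.2f}  p={p:.4f}")
```

A small p-value only says the win rate moved by more than sampling noise would explain; as argued above, it says nothing about whether the game is balanced.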

Statistics: Posted by Rogueleader89 — 09 Mar 2015, 22:08


]]>
2015-03-09T13:09:17+02:00 2015-03-09T13:09:17+02:00 /viewtopic.php?t=4191&p=95714#p95714 <![CDATA[Re: Regression on Game data]]> Statistics: Posted by -_V_- — 09 Mar 2015, 13:09


]]>
2015-03-09T12:01:02+02:00 2015-03-09T12:01:02+02:00 /viewtopic.php?t=4191&p=95710#p95710 <![CDATA[Re: Regression on Game data]]>
Pigeonfighter wrote:
partytime wrote:what did you write the replay parser in?


I translated the source from viewtopic.php?f=41&t=7075 to Matlab code. It's slow for the moment (it takes 10 sec to analyse a long game with a lot of players) but seems to work most of the time (1 crash in 7000 games).


Sounds like a true statistician at work here ;)

Much appreciated! Let me know if you need help with grabbing bulk data.

Statistics: Posted by Sheeo — 09 Mar 2015, 12:01


]]>
2015-03-09T05:47:55+02:00 2015-03-09T05:47:55+02:00 /viewtopic.php?t=4191&p=95703#p95703 <![CDATA[Re: Regression on Game data]]>
partytime wrote:
what did you write the replay parser in?


I translated the source from viewtopic.php?f=41&t=7075 to Matlab code. It's slow for the moment (it takes 10 sec to analyse a long game with a lot of players) but seems to work most of the time (1 crash in 7000 games).

Statistics: Posted by Pigeonfighter — 09 Mar 2015, 05:47


]]>
2015-03-09T05:08:37+02:00 2015-03-09T05:08:37+02:00 /viewtopic.php?t=4191&p=95701#p95701 <![CDATA[Re: Regression on Game data]]> Statistics: Posted by nine2 — 09 Mar 2015, 05:08


]]>
2015-03-09T04:37:48+02:00 2015-03-09T04:37:48+02:00 /viewtopic.php?t=4191&p=95700#p95700 <![CDATA[Re: Regression on Game data]]> Statistics: Posted by Ceneraii — 09 Mar 2015, 04:37


]]>
2015-03-09T04:07:46+02:00 2015-03-09T04:07:46+02:00 /viewtopic.php?t=4191&p=95699#p95699 <![CDATA[Re: Regression on Game data]]>
So far I've brainstormed some light ideas. For now I'm mainly focused on "rating approved" 1v1 games. The idea is to begin with longer games on the same map and analyse them from the same start positions. After that, add all maps and start positions and see if patterns still emerge.

I mostly thought about classifying winners/losers or predicting player rating using the features below.

Early game: use counts of the different commands during the first 100 commands or the first 5 min to determine who will win the game or what rating someone has. A clustering idea for the early game is to see whether clear clusters appear that separate ratings.
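
As a rough sketch of the early-game feature, the snippet below counts command types among a player's first commands. It assumes the replay has already been parsed into `(tick, player, command_type)` tuples; that layout, the function name and the command labels are hypothetical and not taken from the actual Matlab importer.

```python
from collections import Counter

def early_command_counts(commands, player, first_n=100, max_tick=None):
    """Count command types among one player's earliest commands.

    commands: iterable of (tick, player, command_type) tuples (assumed layout).
    first_n:  keep only the player's first `first_n` commands.
    max_tick: optionally also drop commands issued after this tick, for a
              "first 5 min" style cut-off (the tick rate is up to the caller).
    """
    own = [c for c in commands if c[1] == player]
    if max_tick is not None:
        own = [c for c in own if c[0] <= max_tick]
    own.sort(key=lambda c: c[0])  # order by tick
    return Counter(cmd_type for _, _, cmd_type in own[:first_n])

# Tiny made-up command log
log = [(0, "a", "build"), (1, "b", "move"), (2, "a", "move"), (3, "a", "build")]
print(early_command_counts(log, "a"))
```

The resulting counts can be used directly as a feature vector for a classifier or for clustering.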

Doing the same as above, i.e. determining winner/rating, but with more features, including:
Aggressiveness: how close the player is to the enemy start position during the game
Actions per minute: this is really biased, since a player with a lot of units to move gets a lot of APM; an "inputs per minute" measure is needed instead, removing commands issued at the same tick
Commands: the feature used for the early game was the number of commands. One can also categorise these commands over the whole game into air/land/sea or build/move and then count those, maybe even normalise them to get the category of gamer.
Turtler: all commands around the start position, a lot of defence structures, low aggro
Endgame: the category the player is in: nukes, strat bombers, Monkeylords etc.
Familiarity: how often the player has played on the map before or used those exact strategies.
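
Two of the features above are easy to sketch. The first collapses commands issued on the same tick into a single input, which is the "inputs per minute" correction described for APM. The second is one possible reading of aggressiveness: the mean distance between a player's command positions and the enemy start position. The function names, the input layout and the `ticks_per_minute` parameter are all assumptions for illustration.

```python
import math

def inputs_per_minute(command_ticks, game_length_ticks, ticks_per_minute):
    """Count at most one input per tick, normalised by game length.

    command_ticks: ticks at which the player issued commands (may repeat
                   when one order is given to many units at once).
    ticks_per_minute: simulation ticks per minute, supplied by the caller.
    """
    distinct_ticks = set(command_ticks)  # same-tick commands count once
    minutes = game_length_ticks / ticks_per_minute
    return len(distinct_ticks) / minutes

def mean_distance_to_enemy(positions, enemy_start):
    """Mean distance from command positions to the enemy start position.

    A smaller value means the player acted closer to the enemy, i.e. was
    more aggressive under this (assumed) definition.
    """
    ex, ey = enemy_start
    return sum(math.hypot(x - ex, y - ey) for x, y in positions) / len(positions)

# Three commands on tick 0 and two on tick 5 count as single inputs
print(inputs_per_minute([0, 0, 0, 5, 5, 10], game_length_ticks=600, ticks_per_minute=600))
print(mean_distance_to_enemy([(0, 0), (3, 4)], enemy_start=(0, 0)))
```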

This is just a quick brainstorm, obviously with a lot of correlated features, and I'd be thankful for input on what to analyse and which features could be used :)

By the way, I'm converting the replay files to data with Matlab and then doing the work in R. The data importer is kind of finished but is a mess. Tell me if you want data or files and I will provide them once I've cleaned things up a bit.

Statistics: Posted by Pigeonfighter — 09 Mar 2015, 04:07


]]>
2015-01-06T17:19:41+02:00 2015-01-06T17:19:41+02:00 /viewtopic.php?t=4191&p=90491#p90491 <![CDATA[Re: Regression on Game data]]>
partytime wrote:
This is a nice idea! Sasin could you also do player vs player analysis? If you could sort out the math someone else could code it


player vs. player analysis should be doable if that data is made available! I understand if Sheeo doesn't want to go that route! I'm gonna check out what's there for now!

Statistics: Posted by sasin — 06 Jan 2015, 17:19


]]>
2015-01-06T17:18:11+02:00 2015-01-06T17:18:11+02:00 /viewtopic.php?t=4191&p=90490#p90490 <![CDATA[Re: Regression on Game data]]>
Sheeo wrote:
sasin wrote:Thanks a ton Sheeo! I really appreciate you being responsive on this. Hopefully we'll be able to use the data to provide SOME evidence to answer whatever questions the community finds interesting! Even columns a-k alone would be splendid. Whenever you can send that over it'd be fantastic.


Hi Sasin, I'm sorry for the long time it's taken to reach this state. I got hit by stress at uni all of a sudden, so I haven't had time to follow up on everything :)

I haven't had time to get this into a format suitable for what you posted, but I did a dump of a bunch of game stats the other day, you can fetch them here: http://content.faforever.com/game_stats_dump.txt. This dump doesn't contain player names for now.

I'll try and see if I can't get more data in a better format (csv?) for you soon. And I'd really like for (almost) arbitrary queries to be run against our development server.


Thanks Sheeo! Sorry I totally missed your post here... I was in Asia on vacation! Feel free to PM me when you need me with stuff like this because I get an e-mail notification :). I'll go check out the data you gave me!

Statistics: Posted by sasin — 06 Jan 2015, 17:18


]]>
2015-01-06T17:03:04+02:00 2015-01-06T17:03:04+02:00 /viewtopic.php?t=4191&p=90480#p90480 <![CDATA[Re: Regression on Game data]]> I know checking for every unit built could be quite hard on performance, but I also had an idea regarding that:
At first, only check for units that could be on the field at a given point in the game.
For example, at the start of the game each player only has their ACU, so you only need to track what buildings the ACU builds. If a factory is built, you start checking for the respective T1 units. When the first factory reaches T2, check for the respective T2 units; at the first T2 engy or T2 ACU (it could be a nice idea to also track ACU upgrades), check for T2 buildings, and so on ...
In a long game we still need to check for a lot of units, but when higher tech is available lower tech is less likely to be built, so if a unit has not been built for some time you can increase the time between checks. You may miss some of the units, but it's not important to have the exact number; an estimate is enough to analyse usage across an abundance of games.
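
A minimal sketch of the two ideas above, with made-up names throughout: a lookup of which unit categories are worth tracking given what has been seen so far, and a check interval that doubles while a unit type goes unbuilt and resets once it shows up again. The category labels, interval values and doubling rule are assumptions; nothing here reflects how the replay or game engine actually exposes units.

```python
# Hypothetical mapping: what seeing X on the field unlocks for tracking
TECH_UNLOCKS = {
    "acu": {"t1_building"},
    "t1_factory": {"t1_unit"},
    "t2_factory": {"t2_unit"},
    "t2_engineer": {"t2_building"},
    "t2_acu": {"t2_building"},
}

def tracked_categories(seen):
    """Return the unit categories worth checking, given what has been seen."""
    tracked = set()
    for thing in seen:
        tracked |= TECH_UNLOCKS.get(thing, set())
    return tracked

def next_check_interval(current, built_since_last_check, base=10, maximum=160):
    """Back off while a unit type is not being built; reset when it is."""
    if built_since_last_check:
        return base
    return min(current * 2, maximum)

print(tracked_categories({"acu", "t1_factory"}))
print(next_check_interval(10, built_since_last_check=False))
```

Capping the interval keeps rarely built units from being skipped entirely, at the cost of the estimate being approximate, which matches the point that exact counts are not needed.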

Statistics: Posted by DeimosEvotec — 06 Jan 2015, 17:03


]]>
2014-12-09T19:01:52+02:00 2014-12-09T19:01:52+02:00 /viewtopic.php?t=4191&p=87624#p87624 <![CDATA[Re: Regression on Game data]]>
sasin wrote:
Thanks a ton Sheeo! I really appreciate you being responsive on this. Hopefully we'll be able to use the data to provide SOME evidence to answer whatever questions the community finds interesting! Even columns a-k alone would be splendid. Whenever you can send that over it'd be fantastic.


Hi Sasin, I'm sorry for the long time it's taken to reach this state. I got hit by stress at uni all of a sudden, so I haven't had time to follow up on everything :)

I haven't had time to get this into a format suitable for what you posted, but I did a dump of a bunch of game stats the other day, you can fetch them here: http://content.faforever.com/game_stats_dump.txt. This dump doesn't contain player names for now.

I'll try and see if I can't get more data in a better format (csv?) for you soon. And I'd really like for (almost) arbitrary queries to be run against our development server.

Statistics: Posted by Sheeo — 09 Dec 2014, 19:01


]]>
2014-11-06T21:10:53+02:00 2014-11-06T21:10:53+02:00 /viewtopic.php?t=4191&p=85332#p85332 <![CDATA[Re: Regression on Game data]]>
Sheeo wrote:
sasin wrote:
Sheeo wrote:We should be able to make this happen yes :)


That would be fantastic, Sheeo, thanks! Whatever data we have that I could look at would be cool to analyze for the community! It could potentially be useful for balance discussions (although of course not the end-all be-all), discussions about maps being balanced or unbalanced, and perhaps even other issues like individual reclaim.

I'm hoping we have data available like this sample, which is what I had in mind, but if it looks different then we could still see what we could do. I've given a link to a Google doc with what I was hoping the data looks like.

Columns B-H are the more important columns, and the rest of the columns are things that would be interesting that I was just thinking of but aren't as vital.

Google sheet:
https://docs.google.com/spreadsheets/d/ ... sp=sharing


We have enough data to answer columns A-K right now. To answer the remaining columns, we need to simulate and track these stats from the games.

I can't promise when I'll have time to gather this for you though, but I definitely want to help out and get this going soon.


Thanks a ton Sheeo! I really appreciate you being responsive on this. Hopefully we'll be able to use the data to provide SOME evidence to answer whatever questions the community finds interesting! Even columns a-k alone would be splendid. Whenever you can send that over it'd be fantastic.

Statistics: Posted by sasin — 06 Nov 2014, 21:10


]]>
2014-10-31T14:40:34+02:00 2014-10-31T14:40:34+02:00 /viewtopic.php?t=4191&p=84828#p84828 <![CDATA[Re: Regression on Game data]]>
sasin wrote:
Sheeo wrote:We should be able to make this happen yes :)


That would be fantastic, Sheeo, thanks! Whatever data we have that I could look at would be cool to analyze for the community! It could potentially be useful for balance discussions (although of course not the end-all be-all), discussions about maps being balanced or unbalanced, and perhaps even other issues like individual reclaim.

I'm hoping we have data available like this sample, which is what I had in mind, but if it looks different then we could still see what we could do. I've given a link to a Google doc with what I was hoping the data looks like.

Columns B-H are the more important columns, and the rest of the columns are things that would be interesting that I was just thinking of but aren't as vital.

Google sheet:
https://docs.google.com/spreadsheets/d/ ... sp=sharing


We have enough data to answer columns A-K right now. To answer the remaining columns, we need to simulate and track these stats from the games.

I can't promise when I'll have time to gather this for you though, but I definitely want to help out and get this going soon.

Statistics: Posted by Sheeo — 31 Oct 2014, 14:40


]]>