rootbeer23 wrote:
uberge3k wrote:Step 3: When all tasklets have finished, loop through the unit list again, applying the cached update results to the live values.
This is an extremely common pattern in multithreading, and quite simple to implement.
it requires a lot of extra memory
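The cache-then-apply pattern quoted above can be sketched roughly as follows. This is a minimal illustration, not code from any real engine; the `Unit` and `UnitUpdate` types and the field layout are hypothetical:

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

struct Unit {
    float pos[3];
    float vel[3];
    int   health;
};

// The cached result of one unit's update, computed without touching live state.
struct UnitUpdate {
    float new_pos[3];
    int   new_health;
};

// Phase 1: each worker reads the (frozen-for-this-frame) live values and
// writes only into its own slice of the cache. No locks needed.
void compute_updates(const std::vector<Unit>& units,
                     std::vector<UnitUpdate>& cache,
                     std::size_t begin, std::size_t end, float dt) {
    for (std::size_t i = begin; i < end; ++i) {
        for (int k = 0; k < 3; ++k)
            cache[i].new_pos[k] = units[i].pos[k] + units[i].vel[k] * dt;
        cache[i].new_health = units[i].health;  // damage resolution would go here
    }
}

// Phase 2: apply the cached results to the live values.
void apply_updates(std::vector<Unit>& units,
                   const std::vector<UnitUpdate>& cache) {
    for (std::size_t i = 0; i < units.size(); ++i) {
        for (int k = 0; k < 3; ++k)
            units[i].pos[k] = cache[i].new_pos[k];
        units[i].health = cache[i].new_health;
    }
}

void simulate_frame(std::vector<Unit>& units, float dt, unsigned num_threads) {
    std::vector<UnitUpdate> cache(units.size());
    std::vector<std::thread> workers;
    std::size_t chunk = (units.size() + num_threads - 1) / num_threads;
    for (unsigned t = 0; t < num_threads; ++t) {
        std::size_t b = t * chunk;
        std::size_t e = std::min(units.size(), b + chunk);
        if (b < e)
            workers.emplace_back(compute_updates, std::cref(units),
                                 std::ref(cache), b, e, dt);
    }
    for (auto& w : workers) w.join();  // "when all tasklets have finished"
    apply_updates(units, cache);       // Step 3: copy cache to live values
}
```

Because each worker reads only the frozen live state and writes only its own slice of the cache, no synchronization beyond the final join is required.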
Memory is a red herring. Let's examine why. A typical unit would need to store its position, velocity, and state data. Since this data would be cached, it would indeed add to the total memory requirements of the simulation. But how much?
Position and velocity would each typically be stored as a set of three 32-bit floating-point values. For planets the size of PA's, you could easily use 16-bit values without any notable loss of precision, but we'll ignore that and use the worst case instead. There are probably also a couple of turrets whose rotations need to be tracked. So that's (4 bytes × 3 components × 2 vectors) + (4 bytes × 2 rotations) = 32 bytes of memory.
Now, how much more (mutable!) state data could a unit need? Keep in mind we're talking about values such as current health, firing state, and weapon timers. Non-changing unit data (e.g., weapon cooldown times, max health) need not be duplicated, for obvious reasons. Let's assume each unit stores 30 such values, and that they're 32-bit integers (in case a unit needs 4 billion hitpoints). So, 4 × 30 = 120 bytes.
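The mutable/immutable split above might look like this in practice. The type names and fields are purely illustrative; the point is that the shared archetype data is stored once per unit *type*, so only the small mutable part needs double-buffering:

```cpp
#include <cstdint>

// Shared, never changes at runtime: one instance per unit type.
struct UnitArchetype {
    std::int32_t max_health;
    std::int32_t weapon_cooldown_ms;
};

// Per-unit mutable state: the only part that needs to be cached each frame.
struct UnitState {
    std::int32_t health;
    std::int32_t weapon_timer_ms;
    std::int32_t firing_state;
    // ... up to ~30 such values in the worst case discussed above
};

struct UnitRef {
    const UnitArchetype* type;  // pointer into a small shared table
    UnitState state;
};
```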
So we're at 152 bytes of data. Per unit.
For 10k units, that's 1.52MB more memory.
For 1 million units, that's 152MB of additional memory.
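The arithmetic checks out at compile time, using the per-unit breakdown above:

```cpp
#include <cstddef>

constexpr std::size_t kVec3Bytes   = 4 * 3;   // three 32-bit floats
constexpr std::size_t kTurretBytes = 4 * 2;   // two turret rotations
constexpr std::size_t kStateBytes  = 4 * 30;  // 30 mutable 32-bit values

// Position + velocity + turrets + state = 152 bytes per unit.
constexpr std::size_t kPerUnit = 2 * kVec3Bytes + kTurretBytes + kStateBytes;

static_assert(kPerUnit == 152, "per-unit cache size");
static_assert(kPerUnit * 10000 == 1520000, "~1.52 MB for 10k units");
static_assert(kPerUnit * 1000000 == 152000000, "~152 MB for 1M units");
```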
The next-gen consoles will have 8GB of memory. Not all of that will be available to each game as some is reserved for the OS and other tasks, but we can safely assume at least 4GB of that will be addressable by a game.
But we're talking about PCs. So let's check the latest Steam hardware survey:
http://store.steampowered.com/hwsurvey
The overwhelming majority of systems have at least 4GB of RAM. The most common value is 8GB, and nearly 10% even have more than 12GB.
So, we can probably safely assume that the extra 152MB of RAM needed to properly cache unit thread updates won't break the RAM budget.
rootbeer23 wrote:extra cpu resources and above all, many more fetches from memory.
Which would certainly still be far less than the CPU time gained by threading the simulation instead of letting those cores sit idle. Memory fetches, while worth considering, are likewise minor: in the absolute worst case, each unit requires 3x as many memory reads and writes (read the live state, write the cache, then copy the cache back), which is trivial on modern hardware.
rootbeer23 wrote:a physics simulation is an inherently lock-free parallelizable thing, because the speed of light is not infinite and the spatial separation makes the partitioning straightforward.
But what about the added memory requirements and the additional reads and writes required to use some form of grid registration, quadtrees, or similar forms of spatial partitioning to make that happen?
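For reference, the per-unit overhead of grid registration is itself small. A minimal uniform-grid sketch (illustrative only, not from PA) stores roughly one index per unit plus the cell array:

```cpp
#include <cstddef>
#include <vector>

// Each cell holds the indices of the units inside it, so neighbour queries
// touch only nearby cells instead of scanning the whole unit list.
struct Grid {
    float cell_size;
    int   width, height;  // cells per axis
    std::vector<std::vector<std::size_t>> cells;

    Grid(float cs, int w, int h)
        : cell_size(cs), width(w), height(h),
          cells(static_cast<std::size_t>(w) * h) {}

    // Map a world position to a cell, clamping to the grid bounds.
    std::size_t cell_index(float x, float y) const {
        int cx = static_cast<int>(x / cell_size);
        int cy = static_cast<int>(y / cell_size);
        if (cx < 0) cx = 0;
        if (cx >= width) cx = width - 1;
        if (cy < 0) cy = 0;
        if (cy >= height) cy = height - 1;
        return static_cast<std::size_t>(cy) * width + cx;
    }

    void insert(std::size_t unit, float x, float y) {
        cells[cell_index(x, y)].push_back(unit);
    }
};
```

Registration then costs one index write (typically 4 to 8 bytes) per unit per rebuild, which is of the same order as the cached-state figures above.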
(We should also clarify that this refers only to physics simulation in the context of an RTS - true physics engines, such as Havok, are considerably more difficult to thread properly.)