Forged Alliance Forever Forged Alliance Forever Forums 2018-10-10T23:30:13+02:00 /feed.php?f=2&t=16714 2018-10-10T23:30:13+02:00 2018-10-10T23:30:13+02:00 /viewtopic.php?t=16714&p=168368#p168368 <![CDATA[Re: Online Petition (Supreme Commander 2020)]]> For master-slave parallel worker code to function properly, speculative execution is out of the question due to the non-deterministic nature of the variables I mentioned.
Then you would need an interpreter to split the sim thread commands across multiple CPUs, then reassemble the results in the correct order.
The entire cycle load would become dependent on the slowest command of the batch, plus the interpreter time, and the CPU cache load would increase a lot. So there would be marginal gains at low CPU utilization, but it would hog the cache at higher CPU use or under cache overload.
The best way to lower cache load, i.e., to avoid memory reads for each unit and projectile in the game at every game cycle, is to have a fully deterministic system that gets handled entirely by the CPU. This could include non-deterministic elements if coupled with a predetermined delay, e.g. the already implemented 500ms connection latency of FA. Memory reads would be limited to a checker thread that would compare the game cycle or "turn" results of all peers and detect any errors, thereby triggering a desync/resync.
The com thread would connect the checker with the other peers' masters and vice versa. Variable-level peer master worker redundancy could be implemented on higher-power CPUs to avoid the need for desyncs in a mix of cloud/P2P execution and make the game friendlier to lower-powered CPUs. That would not need to be in sync with the supersim. The supersim thread would execute, or stop and rewind the game state in case of errors.
Rendering would work on a timer on top of that.
So the supersim would work on a loop at each turn.
master-slave->com->checker->master-slave and so on.
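To make the checker step a bit more concrete, here is a minimal sketch of comparing per-turn results between peers. Everything in it is an assumption for illustration: the unit layout, the function names `state_digest` and `find_desynced_peers`, and the majority rule are hypothetical and not taken from FA's actual code.

```python
import hashlib
from typing import Dict, List, Tuple

Unit = Tuple[int, int, int, int]   # (unit_id, x, y, hp), hypothetical integer-only layout


def state_digest(turn: int, units: List[Unit]) -> str:
    """Hash one turn's sim state so peers can exchange a short digest instead of the state."""
    h = hashlib.sha256()
    h.update(str(turn).encode())
    for unit in sorted(units):      # sort so iteration order cannot cause a false desync
        h.update(repr(unit).encode())
    return h.hexdigest()


def find_desynced_peers(digests: Dict[str, str]) -> List[str]:
    """Return the peers whose digest differs from the most common digest for this turn."""
    counts: Dict[str, int] = {}
    for digest in digests.values():
        counts[digest] = counts.get(digest, 0) + 1
    majority = max(counts, key=counts.get)
    return sorted(peer for peer, digest in digests.items() if digest != majority)
```

The checker would call `find_desynced_peers` once per turn; a non-empty result is what would trigger the desync/resync. With only two peers there is no real majority, so a production version would need a different tie rule.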
Other than that, better get a Threadripper or a r7/r5 x6 :D

Statistics: Posted by ChaosRefractor — 10 Oct 2018, 23:30


]]>
2018-10-10T19:09:12+02:00 2018-10-10T19:09:12+02:00 /viewtopic.php?t=16714&p=168358#p168358 <![CDATA[Re: Online Petition (Supreme Commander 2020)]]>
In this case though, that has been handled, likely in existing lua code.
All we are talking about here is reimplementing a few exe files and some lua to spread the load across more cores.
As long as everyone is running the same code, it should still be synced.
The software world is littered with remains of "should be" though, so caveat programmor.

So in this case, local testing is the place to start.


Janus.

Statistics: Posted by Janus — 10 Oct 2018, 19:09


]]>
2018-10-10T15:22:42+02:00 2018-10-10T15:22:42+02:00 /viewtopic.php?t=16714&p=168349#p168349 <![CDATA[Re: Online Petition (Supreme Commander 2020)]]> P2P, on the other hand, would not be feasible due to the non-deterministic nature of some variables.

Statistics: Posted by ChaosRefractor — 10 Oct 2018, 15:22


]]>
2018-10-10T04:15:35+02:00 2018-10-10T04:15:35+02:00 /viewtopic.php?t=16714&p=168334#p168334 <![CDATA[Re: Online Petition (Supreme Commander 2020)]]> However, precompiling the entire thing upon start would make for a slow first start, then fast starts until you changed something and had to recompile from scratch.

I am not sure how to handle caching the Lua code though.
If it were me, I would precompile each unit and its projectiles on start, running the game from that in-memory code.
Though what to do about graphics is another puzzle.
In Celestia, we can preload/cache the textures, so I would think something like that can/does happen here.
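One possible way to avoid the "recompile from scratch" problem is to key each compiled script on a hash of its source, so only changed units get recompiled. This is just a sketch: `ScriptCache` is a made-up name, and Python's built-in `compile()` stands in for whatever would turn Lua source into bytecode.

```python
import hashlib
from types import CodeType
from typing import Dict, Tuple


class ScriptCache:
    """Recompile a script only when its source text has changed."""

    def __init__(self) -> None:
        self._entries: Dict[str, Tuple[str, CodeType]] = {}
        self.compiles = 0   # counts real compilations, for illustration only

    def get(self, name: str, source: str) -> CodeType:
        digest = hashlib.sha256(source.encode()).hexdigest()
        cached = self._entries.get(name)
        if cached is not None and cached[0] == digest:
            return cached[1]                    # unchanged source: reuse compiled code
        code = compile(source, name, "exec")    # stand-in for compiling Lua to bytecode
        self.compiles += 1
        self._entries[name] = (digest, code)
        return code
```

Calling `get` twice with the same source compiles once; changing one unit's source recompiles only that entry. Persisting the cache between runs is left out here.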


Janus.

Statistics: Posted by Janus — 10 Oct 2018, 04:15


]]>
2018-10-10T04:05:27+02:00 2018-10-10T04:05:27+02:00 /viewtopic.php?t=16714&p=168333#p168333 <![CDATA[Re: Online Petition (Supreme Commander 2020)]]> Caching those results for multicore implementation would further overload cpu cache.
Unfortunately, that's where FA performance bottlenecks.

Statistics: Posted by ChaosRefractor — 10 Oct 2018, 04:05


]]>
2018-10-06T08:29:39+02:00 2018-10-06T08:29:39+02:00 /viewtopic.php?t=16714&p=168195#p168195 <![CDATA[Re: Online Petition (Supreme Commander 2020)]]> I mean, is there a separate array of units for each side, or are they all mixed together?

I am asking because I have been looking over the actual calls made to the system.
The number of calls to DirectX is surprisingly small, taken from the imports table of the exe.

Direct3DCreate9
D3DXCreateVolumeTextureFromFileInMemoryEx
D3DXFloat32To16Array
D3DXCreateTextureFromFileInMemoryEx
D3DXMatrixLookAtRH
D3DXMatrixScaling
D3DXMatrixTranslation
D3DXMatrixRotationQuaternion
D3DXMatrixRotationAxis
D3DXMatrixRotationZ
D3DXMatrixRotationY
D3DXMatrixRotationX
D3DXMatrixInverse
D3DXMatrixMultiply
D3DXGetVertexShaderProfile
D3DXGetPixelShaderProfile
D3DXCreateEffectCompiler
D3DXCreateEffect
D3DXSaveTextureToFileA
D3DXCreateBuffer
D3DXSaveSurfaceToFileInMemory
D3DXSaveSurfaceToFileA
D3DXLoadSurfaceFromSurface
D3DXCreateTexture
D3DXGetImageInfoFromFileInMemory
D3DXCreateCubeTextureFromFileInMemoryEx

Most of them remind me of OpenGL calls, but I am not an OpenGL programmer.
I am simply guessing based on what people have said in the past while I was asking questions, figuring out what wasn't happening instead of what was.
All of the rest of the non-Windows-specific ones in the exe, and the lua or moho dll files, are to Winsock or wxWidgets.
It appears to use a Lua-controlled wxWidgets menu and display system.
wxWidgets does natively support OpenGL in a window, and DirectX from what I can find, so that may be the entire graphics system.
Only the exe directly calls the graphics subsystem, which it does on a clock.

Based on that, if the Lua code can separate the sides into threads, which may take a lot of work,
it should be possible to make a new exe that supports multithreading,
allowing the sides, the ammo in flight, and the graphics subsystem itself to each have their own thread.
This is all guesses and estimates.
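As a rough illustration of the "one thread per side" idea, here is a sketch where each side is updated on its own worker and the results are merged in a fixed order, so thread timing cannot change the outcome. The unit layout and the names `update_side` and `step_parallel` are hypothetical, and it assumes sides do not touch each other's units during the update, which the real game may not guarantee.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple

Unit = Tuple[int, int, int]   # (unit_id, x, velocity), hypothetical layout


def update_side(units: List[Unit]) -> List[Unit]:
    """Advance one side by one turn using only that side's own units."""
    return [(uid, x + v, v) for uid, x, v in units]


def step_parallel(sides: Dict[str, List[Unit]]) -> Dict[str, List[Unit]]:
    """Update every side on its own worker, then merge in a fixed order."""
    names = sorted(sides)   # fixed order, independent of thread timing
    with ThreadPoolExecutor(max_workers=len(names) or 1) as pool:
        # pool.map yields results in input order, not completion order
        results = list(pool.map(update_side, (sides[n] for n in names)))
    return dict(zip(names, results))
```

In CPython the threads here would not actually run the arithmetic in parallel; the point is only the structure: independent per-side work, followed by an order-stable merge.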

It does sound like a fun project though.


Janus.

Statistics: Posted by Janus — 06 Oct 2018, 08:29


]]>
2018-10-04T19:04:52+02:00 2018-10-04T19:04:52+02:00 /viewtopic.php?t=16714&p=168154#p168154 <![CDATA[Re: Online Petition (Supreme Commander 2020)]]>
Please understand, one of the major points of my work is being able to find problems in people's projects, then find a fix that blends into their work as seamlessly as possible.
I have done my job properly when no one can tell what I did from what they did.

I have worked in assembly (8/16/32-bit for various CPU families), BASIC (many variations), Pascal, Ladder (PLC stuff), and more.
I have forked Celestia and Explorer++ for my own use as a learning exercise.

If you are intent on doing this, I would recommend doing it in Code::Blocks or CodeLite for platform diversity.
You will need to make a copy of the game, then extract the scd files in gamedata in place, which will give you a directory tree you can then search for a starting point.
Take a look at ztree as a tool for searching contents, and Voidtools Everything for names.
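If you would rather script the extraction step, something like the following might do. It assumes the .scd files are ordinary zip-format archives (anything that is not gets skipped), and it unpacks into a separate folder rather than in place so the game copy stays untouched. `extract_scd_archives` is a made-up helper name.

```python
import zipfile
from pathlib import Path
from typing import List


def extract_scd_archives(gamedata: Path, dest: Path) -> List[Path]:
    """Unpack every *.scd under gamedata into dest/<archive name>/."""
    extracted = []
    for archive in sorted(gamedata.glob("*.scd")):
        if not zipfile.is_zipfile(archive):   # assumption: .scd is zip-format; skip otherwise
            continue
        target = dest / archive.stem
        target.mkdir(parents=True, exist_ok=True)
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
        extracted.append(target)
    return extracted
```

Point `gamedata` at the gamedata folder of your copy and `dest` at an empty working folder; the returned list tells you which directory trees are ready to search.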


Janus.

Statistics: Posted by Janus — 04 Oct 2018, 19:04


]]>
2018-10-04T18:47:43+02:00 2018-10-04T18:47:43+02:00 /viewtopic.php?t=16714&p=168152#p168152 <![CDATA[Re: Online Petition (Supreme Commander 2020)]]>
Janus wrote:
@tatsu
Not sure how much help I could really be.
I am not a real C/C++ programmer.
Big or Gui projects are not really what I do.
I also work more on hardware than software, and where they overlap.

not to worry this could prove really useful.

plus it's more about willingness to learn than language experience.

this is not a company/paid for project so beggars can't exactly be choosers
but at the same time as choices go you seem to me like the top of the bucket.

personally my expertise is with JavaScript and I know no C++, but I suspect that learning it won't be too hard. Algorithmics is, after all, common ground. I'm starting now. I've done some C before; the thing I had the most trouble with was managing memory, variables and pointers.

Statistics: Posted by tatsu — 04 Oct 2018, 18:47


]]>
2018-10-04T18:43:15+02:00 2018-10-04T18:43:15+02:00 /viewtopic.php?t=16714&p=168151#p168151 <![CDATA[Re: Online Petition (Supreme Commander 2020)]]>
There are 4761 exports in mohoengine.dll to be exact.
I can post the list if anyone cares.

The structure of the names, however, makes it obvious how it works.
Mohoengine.dll is a Lua accelerator extension.
Operational logic is handled by the Lua code, which is highly adaptable and can be cached once compiled.
Implementation logic for memory manipulation, bitwise math, or hardware access is handled by the mohoengine in C/C++, which is orders of magnitude faster.
It is the same basic idea as a hardware accelerator of some sort.
Like a graphics card doing the messy math for textures or ray tracing.
High level is scripted, low level is optimized.
This enables a trade-off that maximizes both.

C/C++ is good at speed, but requires a recompile every time you change anything.
Lua can run differently every time, yet is slower since it requires interpretation.
By combining them, you get 90%+ of the best of both worlds.
This is why Lua shows up so little in the profiler, since all it really does is set up queues, not do anything directly.

For those who want to know, here is a sample of what the mohoengine does.
Code:
Moho::TConVar<bool>::~TConVar<bool>(void)
Moho::TConVar<int>::~TConVar<int>(void)
Moho::TConVar<uchar>::~TConVar<uchar>(void)
Moho::TConVar<float>::~TConVar<float>(void)
Moho::TConVar<std::basic_string<char,std::char_traits<char>,std::allocator<char>>>::~TConVar<std::basic_string<char,std::char_traits<char>,std::allocator<char>>>(void)
Moho::TConVar<bool>::TConVar<bool>(char const *,char const *,bool *)
Moho::TConVar<bool>::Handle(std::vector<std::basic_string<char,std::char_traits<char>,std::allocator<char>>,std::allocator<std::basic_string<char,std::char_traits<char>,std::allocator<char>>>> const &)
Moho::TConVar<int>::TConVar<int>(char const *,char const *,int *)
Moho::TConVar<int>::Handle(std::vector<std::basic_string<char,std::char_traits<char>,std::allocator<char>>,std::allocator<std::basic_string<char,std::char_traits<char>,std::allocator<char>>>> const &)
Moho::TConVar<uchar>::TConVar<uchar>(char const *,char const *,uchar *)
Moho::TConVar<uchar>::Handle(std::vector<std::basic_string<char,std::char_traits<char>,std::allocator<char>>,std::allocator<std::basic_string<char,std::char_traits<char>,std::allocator<char>>>> const &)
Moho::TConVar<float>::TConVar<float>(char const *,char const *,float *)
Moho::TConVar<float>::Handle(std::vector<std::basic_string<char,std::char_traits<char>,std::allocator<char>>,std::allocator<std::basic_string<char,std::char_traits<char>,std::allocator<char>>>> const &)
Moho::TConVar<std::basic_string<char,std::char_traits<char>,std::allocator<char>>>::TConVar<std::basic_string<char,std::char_traits<char>,std::allocator<char>>>(char const *,char const *,std::basic_string<char,std::char_traits<char>,std::allocator<char>> *)
Moho::TConVar<std::basic_string<char,std::char_traits<char>,std::allocator<char>>>::Handle(std::vector<std::basic_string<char,std::char_traits<char>,std::allocator<char>>,std::allocator<std::basic_string<char,std::char_traits<char>,std::allocator<char>>>> const &)


Most of the functions are like these.
The same call, only with different variable types.
This is overloading, i.e. the same function written to take different variable types.
Here it is done with templates, so you program the logic once and the compiler generates it for each type it is instantiated with.

The single biggest change that is needed is getting the Lua threads onto separate cores.
Then it is ensuring that all the sides are done as close to in parallel as possible, which may mean some adaptation to the existing Lua code.
The makers of LOUD can tell you if the sides are done sequentially, or in some type of parallel.
Going full parallel in the Lua is the key.
If the Lua logic can be tweaked, and the Lua DLL made truly multicore compatible, little else should be needed.
The only fly in the ointment I see is whether the renderer expects things in a particular order.

The thing that is most easily missed is that this is not, logic-wise, a real-time game.
It is actually turn based, with the turns controlled by the simulation clock instead of the players.
Each turn is 100ms long, and times out without an error.
The sim state update occurs, then the updated sim state produces a new rendering buffer.
Game slowdown occurs when updating/rendering takes 100ms or more.
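To put numbers on that, here is a small sketch of the arithmetic. It assumes a turn simply stretches to however long the update/render work takes once that exceeds the 100ms budget; the function names are made up for illustration.

```python
from typing import Iterable, List

TURN_MS = 100   # turn length stated above


def turn_durations(work_ms: Iterable[float], turn_ms: float = TURN_MS) -> List[float]:
    """Wall-clock length of each turn: never shorter than turn_ms, longer if work overruns."""
    return [max(turn_ms, w) for w in work_ms]


def sim_speed(work_ms: Iterable[float], turn_ms: float = TURN_MS) -> float:
    """Game speed relative to real time (1.0 means full speed)."""
    durations = turn_durations(work_ms, turn_ms)
    if not durations:
        return 1.0
    return turn_ms * len(durations) / sum(durations)
```

So as long as every turn's work fits in 100ms the game runs at full speed no matter how close to the limit it is, and if every turn needs 200ms the game runs at half speed.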

If the Mohoengine is reimplemented, there is no reason the graphics functions cannot be made into OpenGL or even Vulkan calls, thus opening the door to a native Linux client for supcom, instead of using Wine.

As for the Lua DLL, it has 448 exports, and has been modified to tie into gpgcore, which should be easy to trim out.
Then we get to forgedalliance.exe, which has only three exports.
The names should tell C/C++ programmers exactly what it does.
Here are the three; note that the TlsCallback entries are thread-local storage callbacks, which the loader runs when threads start or exit, not threads themselves, so extra threads would have to come from more branching here or dynamic threading elsewhere.

TlsCallback_0
TlsCallback_1
start[main entry]

Lots of fun, and lots of work.
As the structure of the mohoengine shows, it is mostly just low-level variations, each designed to take a different input type.
Template code always gives complicated looking structures like this, and from this side there is no way to tell how many of those call variations are actually used.


Janus.

Statistics: Posted by Janus — 04 Oct 2018, 18:43


]]>
2018-10-04T10:18:01+02:00 2018-10-04T10:18:01+02:00 /viewtopic.php?t=16714&p=168144#p168144 <![CDATA[Re: Online Petition (Supreme Commander 2020)]]>
There are ~4500 exported functions in MohoEngine.dll and ~450 in the Lua interpreter for supcom. Lua does not seem to show up much in the profiler, which suggests that the game is heavily caching the results of Lua scripts. This makes sense because I presuppose that GPG programmers were competent and knew that scripts are slow.

Statistics: Posted by uzurpator — 04 Oct 2018, 10:18


]]>
2018-10-04T08:48:53+02:00 2018-10-04T08:48:53+02:00 /viewtopic.php?t=16714&p=168141#p168141 <![CDATA[Re: Online Petition (Supreme Commander 2020)]]>
Not sure how much help I could really be.
I am not a real C/C++ programmer.
Big or Gui projects are not really what I do.
When I am involved in either one it is troubleshooting or optimization, not building.
I also work more on hardware than software, and where they overlap.

I tried to outline the approach to build a plan for anyone who wants to try, nothing more.
It is a classic deconstruction of a thing into nominally simple and normally sequential steps.
Or put more simply, troubleshooting a blank design.
Derive the shape of the overall idea.
Build descriptions of the jobs/transforms done by the steps in the sequences.
The result is flow-chart(s)/state-machine(s) of everything that happens, and why it happens as well.

Happy to help if I can though.


Janus.

Statistics: Posted by Janus — 04 Oct 2018, 08:48


]]>
2018-10-04T08:23:14+02:00 2018-10-04T08:23:14+02:00 /viewtopic.php?t=16714&p=168139#p168139 <![CDATA[Re: Online Petition (Supreme Commander 2020)]]>
Janus wrote:
I have not looked at this, but this is how I would start.
Where is the graphics work really done?
How much of the game is done in Lua?

definitely want to help if I can.


Janus.

Hey!

You wanna join us? :D

Statistics: Posted by tatsu — 04 Oct 2018, 08:23


]]>
2018-10-04T08:20:13+02:00 2018-10-04T08:20:13+02:00 /viewtopic.php?t=16714&p=168138#p168138 <![CDATA[Re: Online Petition (Supreme Commander 2020)]]>
Franck83 wrote:
True, here some tests : https://attractivechaos.github.io/plb/

holy shit!!! never thought Lua would be so slow! it's even slower than JavaScript (my main)

Statistics: Posted by tatsu — 04 Oct 2018, 08:20


]]>
2018-10-04T08:24:35+02:00 2018-10-04T08:18:27+02:00 /viewtopic.php?t=16714&p=168137#p168137 <![CDATA[Re: Online Petition (Supreme Commander 2020)]]>
uzurpator wrote:
I don't know if I want to. tatsu is grooming me tho :P
yuck :P

uzurpator wrote:
EDIT: I also tried yesterday to profile FA to get at least coarse view of what is the performance issue with the game. No success yet, but I'll try today as well.

try opening up a Setons replay, those are famous for making the CPU lag hard.

sometimes there are tons of units, and especially tons of interactions and orders given to those units, and FA goes into an exponential lag hole

spockodope quoted himself with this /viewtopic. ... 11#p165011 and I gotta say, looks relevant.

Statistics: Posted by tatsu — 04 Oct 2018, 08:18


]]>
2018-10-04T03:42:37+02:00 2018-10-04T03:42:37+02:00 /viewtopic.php?t=16714&p=168133#p168133 <![CDATA[Re: Online Petition (Supreme Commander 2020)]]>
uzurpator wrote:
EDIT: and to accent it: _game_ not _content_. Also - one person can _write_ it but it is another issue to do QA on a piece of software.


I have some QA knowledge, I'm sure we could put together a QA team and test the game properly. But, I've retired from gaming altogether. For me to come back to test a brand-new game I would need a) lots of free time (impossible with college <1 year away) or b) some solid money. I know the latter is not FAF's strong suit though.

Statistics: Posted by Lieutenant Lich — 04 Oct 2018, 03:42


]]>