Code Optimisation: Difference between revisions
Revision as of 10:37, 30 May 2010
Make it work.
No need to worry about making it work at light speed if it doesn't even do what it is supposed to. Focus on getting a working product first.
Make it fast.
Optimisation is everything when running lots of instances with low delays. However, there is such a thing as premature optimisation. Also, avoid excessive cleverness.
"Excessive cleverness is doing something in a really clever way when actually you could have done it in a much more straightforward but slightly less optimal manner. You've probably seen examples of people who construct amazing chains of macros (in C) or bizarre overloading patterns (in C++) which work fine but which you look at and go "wtf"? EC is a variation of premature-optimisation. It's also an act of hubris - programmers doing things because they want to show how clever they are rather than getting the job done." - sbsmac
Preprocessfilelinenumbers
The preprocessFileLineNumbers command remembers what it has done, so a file only has to be loaded into memory once. If you want a function precompiled but not saved (for example, to avoid using global variables), you can simply use:
call compile preprocessfilelinenumbers "file"
Remember that the only loss of performance is then the compile time of the returned string and the call of the code itself.
Make it pretty.
Documentation, readability, and all that jazz. Clean code is good code.
Written it twice? Put it in a function
Pre-compilation by the game engine can save up to 20x the processing time, even if the initial load time is slightly lengthened. If you've written it twice, or if the same script is being compiled over and over (perhaps a script run by execVM in a loop), make it into a function (FUNCVAR = compile preprocessFileLineNumbers "filename.sqf").
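A minimal sketch of the compile-once pattern described above. The file name fn_greet.sqf and the global name MY_fnc_greet are hypothetical and used only for illustration:

```sqf
// Compile once, for example from init.sqf.
// "fn_greet.sqf" and MY_fnc_greet are hypothetical names.
MY_fnc_greet = compile preprocessFileLineNumbers "fn_greet.sqf";

// Every later use runs the already compiled code.
[player, "hello"] call MY_fnc_greet;
[player, "goodbye"] call MY_fnc_greet;

// By contrast, execVM compiles the file again on every execution
// and also starts a new scheduled script each time:
// [player, "hello"] execVM "fn_greet.sqf";
```

The trade-off is one global variable per function in exchange for skipping the compile step on every call.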
Length
If any script or function is longer than around 200-300 lines, then you may need to rethink the structure of the script itself (this is by no means true in all cases): check whether everything in it is within the scope of the functionality required, and whether you could do something cleaner, faster and better.
If Else If Else If Else ...
If you can't replace the chain with a switch control structure, then try to rethink the functionality, especially if only one option ever needs to match.
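As a rough illustration of replacing an if/else chain with a switch, the sketch below picks a marker colour from a side. The variable names and the choice of colours are assumptions made for the example, not part of the original text:

```sqf
// _side and _colour are hypothetical names used only for illustration.
private ["_side", "_colour"];
_side = side player;

// switch returns the value of the matching block, so one assignment
// replaces a chain of if / else if / else branches.
_colour = switch (_side) do {
	case west: { "ColorBlue" };
	case east: { "ColorRed" };
	case resistance: { "ColorGreen" };
	default { "ColorBlack" };
};
```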
Constants
Using a hard-coded constant more than once? Use preprocessor directives rather than storing it in memory or cluttering your code with numbers. For example:
_a = _x + 1.053;
_b = _y + 1.053;
Becomes:
#define BUFFER 1.053
_a = _x + BUFFER;
_b = _y + BUFFER;
This also makes the value quick to change, since it is defined in one place.
Adding elements to an array
- set is 56x faster than binary addition
_a set [count _a,_v]
Instead of:
_a = _a + [_v]
Removing elements from an array
When removing elements from an array in FIFO order, the set removal method works best, even though the subtraction makes a new copy of the array.
ARRAYX set [0, objnull];
ARRAYX = ARRAYX - [objnull];
Getting position
If you're feeling self-conscious and want an AGL solution (i.e. identical to that of getPos):
private "_pos";
_pos = getposATL player;
if(surfaceIsWater _pos) then {
	_pos = getposASL player;
};
It is still 25% faster than its getPos twin.
CreateVehicle Array
createVehicle does not take 3D positions; it is a 2D position function only, so you cannot specify the Z height of the vehicle being created. It is also (and most importantly) up to 500x slower than its younger brother, createVehicle array. It is therefore highly recommended to standardise on createVehicle array rather than the older (deprecated) version.
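For reference, the two syntaxes look roughly like this. The class name "HMMWV" is only a placeholder, and the empty markers array, zero placement radius and "NONE" special mode are example values chosen for the sketch:

```sqf
// "HMMWV" is a placeholder class name; substitute any vehicle class.
private ["_pos", "_veh"];
_pos = getPosATL player;

// Older syntax: "HMMWV" createVehicle _pos;

// Array syntax: [type, position, markers, placement radius, special]
_veh = createVehicle ["HMMWV", _pos, [], 0, "NONE"];
```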
Loops
These first two loop types are identical in speed (+/- 10%), and are more than 3x as fast as the following two loop types.
- for "_y" from # to # step # do { ... };
- { ... } foreach [ ... ];
Whereas these two loops are much slower and, for maximum performance, should be avoided.
- while { expression } do { code };
- for [{ ... },{ ... },{ ... }] do { ... }
waitUntil can be used when you want something to run only once per frame, which can be handy for limiting scripts that may be resource heavy.
- waituntil {expression};
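A small sketch of using waitUntil as a once-per-frame loop, assuming the code runs in a scheduled script. MY_counter is a hypothetical global used only to stand in for real per-frame work:

```sqf
// MY_counter is a hypothetical global used only for illustration.
MY_counter = 0;

[] spawn {
	waitUntil {
		// Per-frame work goes here.
		MY_counter = MY_counter + 1;

		// The last expression must be a boolean; the loop ends once it is true.
		!alive player
	};
};
```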
As requested, this information was gathered with CBA_fnc_benchmarkFunction, using around 10,000 iterations. It was not tested across different machines, and *may* vary between them (ArmA2 is special, remember :P):
fA = {
	private "_i";
	_i = 0;
	while {_i < 1000} do {
		_i = _i + 1;
		private "_t";
		_t = "0";
	};
};
fB = {
	for "_i" from 0 to 1000 do {
		private "_t";
		_t = "0"
	};
};
This code then performs 10,000 tests and returns the average time taken by the function, measured via diag_tickTime.
[fA,[],10000] call CBA_fnc_benchmarkFunction;
[fB,[],10000] call CBA_fnc_benchmarkFunction;
10,000 Iterations Limit in Loops
This limit doesn't exist in scheduled environments, only in non-scheduled ones (i.e. only where the 0.3ms delay does not exist).
Avoid O(n^2)!!
Commonly you may set up foreach foreach's. 'For' example:
{
	{ ...} foreach [0,0,0]; 
} foreach [0,0,0];
This example is of the order (n^2) (3^2 = 9 iterations). For arrays that are twice as big, you will run 4 times slower, and for arrays that are 3 times as big you will run 9 times slower! Of course, you don't always have a choice, and if one (or both) of the arrays is guaranteed to be small it's not really as big of a deal.
Threads
The game runs in a scheduled environment, and there are two ways you can run your code: scheduled and non-scheduled.
Where the scope originates determines how the code is executed. Scheduled code is subject to delays while the engine works through the other scripts, and execution times can depend on the load on the system at the time.
Some basic examples:
- Triggers are inside what we call the 'non-scheduled' environment.
- All pre-init code executions are without scheduling.
- FSM conditions are without scheduling
- Event handlers (on units and in GUI) are without scheduling
The 0.3ms delay (not 3ms)
The 0.3ms delay was introduced in ArmA2 for scheduled environments to prevent script overload during gameplay (assuredly due to the occurrences in ArmA1). The delay is experienced between statements: upon finishing a statement in one script, the scheduler proceeds down the schedule through the other scripts until it returns to the start and goes through the loop again. As I always seem to be able to explain things in code, the behavior as far as I can explain it is the following:
// Pseudocode of the scheduler's behavior, not runnable SQF
while (true) {
   {
       execute1Statement(_x); // run one statement of this script
       sleep 0.3ms;           // then wait before moving on to the next script
   } foreach SCRIPTS;
}
You can therefore see that the delay between statement executions inside your script depends on the number n of scripts (or threads) running. You should therefore refrain from creating too many threads in your code, to stop the system from scaling to the point where functionality is seriously degraded. This information is subject to change as BIS change things in the latest betas.
When am I creating new threads?
The spawn/execVM/exec commands each create a small thread within the ArmA2 scheduler (verification from a BIS dev on the specifics is needed here). As the scheduler works through each thread individually, the delay before it returns to your script to proceed to the next line of your code can be very high (in high-load situations, delays of up to a minute can be experienced!).
Obviously this is only an issue when your instances last longer than their execution time, i.e. spawned loops with sleeps that never end or that last a long time.
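One way to keep the thread count down, sketched under the assumption that the per-unit work is short and can share a single timer, is to service every unit from one long-lived loop instead of spawning one loop per unit. The one-second interval is an arbitrary example value:

```sqf
// Avoid: one never-ending thread per unit.
// { _x spawn { while {alive _this} do { sleep 1; }; }; } forEach allUnits;

// Prefer: a single thread that services every unit in turn.
[] spawn {
	while {true} do {
		{
			if (alive _x) then {
				// Per-unit work goes here.
			};
		} forEach allUnits;
		sleep 1; // arbitrary interval chosen for the example
	};
};
```

The scheduler then has one extra script to visit rather than one per unit, however many units are present.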
Measuring Velocity Scalar
Sure, we can just use the Pythagorean theorem to calculate the magnitude of a velocity vector, but a command native to the engine runs much faster (over 10x faster) than the math.
- VECTOR distance [0,0,0]
Works for 2D vectors as well.
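Side by side, the two approaches look roughly like this. The variable names are illustrative only, and both results should be the speed in metres per second:

```sqf
// Variable names are hypothetical and used only for illustration.
private ["_v", "_speedMath", "_speedNative"];
_v = velocity player;

// Pythagorean version: square root of the sum of squared components.
_speedMath = sqrt (((_v select 0) ^ 2) + ((_v select 1) ^ 2) + ((_v select 2) ^ 2));

// Engine-native version: distance of the vector from the origin.
_speedNative = _v distance [0,0,0];
```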
How to test and gain this information yourself?
There are a few ways to measure run-time durations inside ArmA2, mostly by taking differences of the time itself. The CBA package includes a function for you to test with; however, if you are remaining addon free or cannot use it, the following code setup is just as effective, and allows different ways to retrieve the information (chat text, rpt file, clipboard):
_fnc_dump = {
	player globalchat str _this;
	diag_log str _this;
	//copytoclipboard str _this;
};
_t1 = diag_tickTime;
// ... code to test
(diag_tickTime - _t1) call _fnc_dump;
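A single run can be too short to measure reliably, so one possible variation is to repeat the code and average the result. This is only a sketch: the iteration count is an arbitrary choice, and the measured time includes the overhead of the loop itself:

```sqf
// _iterations is an arbitrary example value.
private ["_iterations", "_start", "_average"];
_iterations = 1000;

_start = diag_tickTime;
for "_i" from 1 to _iterations do {
	// ... code to test
};

// Average seconds per iteration, written to the rpt file.
_average = (diag_tickTime - _start) / _iterations;
diag_log format ["Average: %1 s", _average];
```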
