raP File Format - OFP
Latest revision as of 01:12, 24 February 2023
Caveat
Although the formats are similar in construct and intent, if you are researching the nitty gritty of Elite or ArmA raP encoded files, you should read those topics instead. This document deals specifically with raP files encountered in OFP / OFP Resistance only.
Introduction
raP encoding applies to any human-readable text file in OFP that contains class statements. Examples of files that are, or should be, raPified are mission.sqm, config.cpp and description.ext.
In fact, any text file that contains class statements contains nothing but class statements. So much so that the entire contents of that file are considered to be a class!
eg
class mission.sqm { ... };
The fact of the matter is, if you do not raPify these files, the engine will do so itself before using them (thus causing unnecessary CPU load).
raP encoding simply means that the data inherent in these types of files has been sanitised (stripped of comments and crud) and massaged into a form of indexed lookup table for the engine to use directly. Once done, it is free of the need to check for syntax errors, among other things. Hence, much faster processing.
These types of files were once known as 'encrypted' or 'binarised' files. They are no such thing. They are simply a cleaner, closer equivalent to what the engine uses internally. For instance, all your savegames are raP encoded (there is no text equivalent).
A raP encoded file is detected by the magic signature '\0raP' in the first four bytes of the file. Because of the leading 0 byte, no text file can inadvertently have this signature.
Importantly, the filename extension is immaterial.
The engine will work with config.cpp as a raP encoded entity, just as it would work with config.bin.
Tools
Various utilities exist which refer to binary <> cpp compression and extraction (or encoding and decoding). Again, these terms are misleading because the file concerned is not executable binary data, just tokenised strings and values.
Basics
There is no need here to elaborately define what a mission.sqm file is. But, it is worth understanding the basics of these (types of) files to understand the very small requirements needed to raPify them.
class files only contain 3 types of construct
ClassNames, TokenNames, Arrays
class classname [:inherit] {...};
[:inherit] is optional and simply refers to another classname.
(For your interest the [] are part of a grammar notation technique called Backus–Naur Form and mean optional. Whatever is within the [...] is optional. The [] do not appear in the text file.)
The class body, the {...}, contains more ClassNames, TokenNames, Arrays, or nothing at all.
TokenNames come in 3 flavours
    aString="A string";
    anInteger=77;
    aFloat=1.855;
For more on this subject, see TokenNameValueTypes
Arrays
anArray[]={......};
an array containing elements including (possibly) more arrays or more TokenNames (but not classes)
thing[]={ 1.0, 7.67, "Elephants", fred[]={......} };
raPifying encodes each of these basic types.
Construct
all raPified data can be expressed as
    class filename
    {
        class FirstEmbeddedClass
        {
            ... tokenames
            class FirstEmbeddedEmbeddedClass
            {
                ...
            };
            ...
        };
        ...
        class LastEmbeddedClass
        {
        };
    };
    [an optional enumerated list]
Header
A raPified file has the first 4 bytes of the file encoded as follows:
"\0raP"
For OFP and RESISTANCE the next three bytes are
"\004\0\0"
see elsewhere for Elite and ArmA.
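As a minimal sketch of the check described above, the signature and the OFP bytes can be tested on a buffer holding the start of the file. The function names `rap_has_signature` and `rap_is_ofp` are made up for this example and are not part of any BIS tool; only the byte values come from this document.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if buf starts with the four byte signature "\0raP", else 0.
   (hypothetical helper name, for illustration only) */
int rap_has_signature(const unsigned char *buf, size_t len)
{
    static const unsigned char magic[4] = { 0x00, 'r', 'a', 'P' };

    assert(buf != NULL);
    return len >= sizeof magic && memcmp(buf, magic, sizeof magic) == 0;
}

/* Returns 1 if the signature is followed by the three bytes "\004\0\0"
   quoted above for OFP and Resistance, else 0. */
int rap_is_ofp(const unsigned char *buf, size_t len)
{
    static const unsigned char version[3] = { 0x04, 0x00, 0x00 };

    return rap_has_signature(buf, len)
        && len >= 4 + sizeof version
        && memcmp(buf + 4, version, sizeof version) == 0;
}
```

Because the test looks only at the leading bytes, it works the same whether the file is called config.cpp or config.bin.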
The rest of the file contains Class Body packets of 3 different construct types, with the 1st byte defining what 'type' it is.
Thus
    struct ClassBody
    {
        byte PacketType; // 0 Classname
                         // 1 TokenName
                         // 2 Array
        ....... depends on packet type
    };
The very first packet encountered is a classname. It is the enclosing class for *everything* else in the file. The name of this class is the name of the file. It is *not* recorded in human-readable text output.
Packets
PacketType0: Classname
class Classname: InheritedClassName { Packets... };
    struct ClassPacket
    {
        byte          PacketType;         // = 0 == class
        IndexedString Classname;
        Asciiz        InheritedClassName; // optional or zero length string
        BIS_Integer   nImbeddedPackets;   // Iterates thru embedded Packet(s) can be zero
    };
Having no embedded packets is quite legal.
The embedded packets, i.e. the body of this class, immediately follow this packet.
Bear in mind that the following data (the body of this class) may indeed have further embedded packets, which may have further embedded packets, which may have.... All of which are contiguous in the datastream (OFP only).
PacketType1: TokenNames
The 2nd byte of this packet defines what type of variable. Thus
    struct VarPacket
    {
        byte          PacketType; // = 1
        byte          VarType;    // 0 String
                                  // 1 Float
                                  // 2 Integer
        IndexedString SomeName;
        .... depends on VarType
    };
VarType0 String
SomeName="SomeOtherName";
    struct VarTypString
    {
        byte          PacketType; // = 1
        byte          VarType;    // = 0
        IndexedString SomeName;
        IndexedString SomeOtherName;
    };
VarType1 Float
SomeName=1.23445;
    struct VarTypFloat
    {
        byte          PacketType; // = 1
        byte          VarType;    // = 1
        IndexedString SomeName;
        float         value;      // 4 bytes
    };
VarType2 Integer
SomeName=123;
    struct VarTypLongInteger
    {
        byte          PacketType; // = 1
        byte          VarType;    // = 2
        IndexedString SomeName;
        int           value;      // 4 bytes
    };
PacketType2: Arrays
Arrays[] contain four possible element types. They are the traditional variables mentioned above with an added tweak of an embedded array type.
thus
SomeName[]={ Element,Element[],"element",....};
    struct ArrayPacket
    {
        byte              PacketType; // = 2
        IndexedString     SomeName;
        CompressedInteger nElements;  // iterate thru ConstTypes, can be 0
        .... a series of ArrayTypes
    };
Similar to classes, the embedded ArrayTypes follow immediately in the data stream (OFP only).
ArrayType0 String
"SomeName",
    struct ArrayString
    {
        byte          VarType; // = 0
        IndexedString SomeName;
    };
ArrayType1 Float
1.234,
    struct ArrayFloat
    {
        byte  VarType; // = 1
        float value;   // 4 bytes
    };
ArrayType2 Integer
123,
    struct ArrayInteger
    {
        byte VarType; // = 2
        int  value;   // 4 bytes
    };
ArrayType3 Embedded_Array
{array(...},....},
    struct EmbeddedArray
    {
        byte              VarType;   // = 3
        CompressedInteger nElements; // iterate thru
        ... series of array elements in this embedded array
    };
With the above construct (embedded array), each embedded array can contain any ArrayType, including another embedded array. The difference, of course, is that these embedded arrays have no individual name associated with them (unlike the packet array).
Added Wrinkles
enums
Optionally, a raPified file can contain an enum table at the end of the filename class definitions.
The location of this list, if present, is known by the fact that
    class mission.sqm
    {
        ....
        int nEmbeddedClasses;
        ...
    };
encloses all data before it, and the number of embedded classes is known.
The next four bytes after the primary class body declare the number of entries in the enumerated list.
Long NumberOfEntries;
This value might not be present (EOF reached) or, equally, its value might be zero.
    Struct EnumTable
    {
        Asciiz String;
        Long   value; // an integer
    }[NumberOfEntries];
In OFP, this enum construct is rarely encountered. Most often #defines are used, which are pre-processed and expanded by whatever tool (and modeller) is creating the raPified file.
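As a rough sketch only, the enum table described above could be walked as follows once the position just past the primary class body is known. The names `RapEnumEntry`, `rap_read_enums` and `read_u32le` are invented for this example. The 4-byte values are assumed to be little-endian (OFP being an x86 PC game); this document does not state the byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reads a 4-byte value at *pos, ASSUMING little-endian byte order.
   Returns 0 on success, -1 if fewer than four bytes remain. */
static int read_u32le(const unsigned char *buf, size_t len,
                      size_t *pos, uint32_t *out)
{
    if (*pos > len || len - *pos < 4)
        return -1;
    *out = (uint32_t)buf[*pos]
         | (uint32_t)buf[*pos + 1] << 8
         | (uint32_t)buf[*pos + 2] << 16
         | (uint32_t)buf[*pos + 3] << 24;
    *pos += 4;
    return 0;
}

typedef struct {
    const char *name;  /* points into buf, zero terminated (Asciiz) */
    int32_t     value;
} RapEnumEntry;

/* Walks the enum table that starts at pos (the first byte after the
   primary class body). Returns the number of entries stored, 0 if the
   table is absent (EOF) or empty, or -1 on truncated data or if the
   table holds more than max_entries entries. */
int rap_read_enums(const unsigned char *buf, size_t len, size_t pos,
                   RapEnumEntry *entries, size_t max_entries)
{
    uint32_t count;

    assert(buf != NULL && entries != NULL);
    if (pos >= len)
        return 0;                          /* EOF reached: no table */
    if (read_u32le(buf, len, &pos, &count) != 0)
        return -1;
    if (count > max_entries)
        return -1;
    for (uint32_t i = 0; i < count; i++) {
        const unsigned char *end =
            pos < len ? memchr(buf + pos, 0, len - pos) : NULL;
        uint32_t raw;

        if (end == NULL)
            return -1;                     /* unterminated Asciiz string */
        entries[i].name = (const char *)(buf + pos);
        pos = (size_t)(end - buf) + 1;     /* skip string and its zero */
        if (read_u32le(buf, len, &pos, &raw) != 0)
            return -1;
        entries[i].value = (int32_t)raw;
    }
    return (int)count;
}
```

The entry names point straight into the file buffer, so the buffer must stay alive for as long as the entries are in use.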
Type definitions
CompressedInteger
An integer to the engine is a four byte value. A long.
To conserve space, a mild form of compression is used, mostly, to scrunch short values in the range 1 to 65k and use 1, or 2 bytes instead of four.
Bit 7 of the byte is the indicator that this is an extension byte, as opposed to a simple value.
When encountered as set, it means the next byte, *also* contributes to making up the real value, and so on.
Up to five bytes _could_ in theory be used to represent the true value. In practice, I have only ever seen a maximum of two.
The following code is a poetic example only of handling a possible two byte pair. In truth, a while loop should be used.
    {
        int val;
        if ((val = GetByte())==EOF) return EOF;
        if (val & 0x80)
        {
            int extra;
            if ((extra = GetByte())==EOF) return EOF;
            val += (extra - 1) * 0x80;
        }
        return val;
    }
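A loop version might look like the sketch below. It reads from a memory buffer rather than through GetByte(), and the name `rap_read_compressed` is made up for this example. It assumes that every extension byte contributes seven further bits in the same way the second byte does in the two byte case; only the one and two byte forms are confirmed by this document.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decodes one compressed integer from buf, starting at *pos.
   On success stores the value in *out, advances *pos past the bytes
   used and returns 0. Returns -1, leaving *pos untouched, if the data
   ends early or more than five bytes are chained together. */
int rap_read_compressed(const unsigned char *buf, size_t len,
                        size_t *pos, uint32_t *out)
{
    uint32_t value = 0;
    size_t p;
    int shift = 0;

    assert(buf != NULL && pos != NULL && out != NULL);
    p = *pos;
    for (int n = 0; n < 5; n++) {
        unsigned char b;

        if (p >= len)
            return -1;                     /* ran out of data */
        b = buf[p++];
        value |= (uint32_t)(b & 0x7F) << shift;
        if ((b & 0x80) == 0) {             /* bit 7 clear: last byte */
            *pos = p;
            *out = value;
            return 0;
        }
        shift += 7;                        /* ASSUMED for bytes 3 to 5 */
    }
    return -1;                             /* more than five bytes */
}
```

For the byte pair 0x85 0x02 this gives 5 + 2 * 0x80 = 261, the same as the two byte code above (0x85 + (2 - 1) * 0x80).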
IndexedString
    struct
    {
        Bis_Short index;
        Asciiz    String;
    };
A table of strings is accumulated, each string being stored under its index number when that specific index number is first encountered.
Although the values appear to be ordinal (0,1,2,3,4,5) you should not assume so.
    0="Peter"
    1="Paul"
    2="Mary"
These are defined index strings and appear, individually, and uniquely, within the mission.sqm as and when they are first encountered. From then on you will only see an index string as
1="";
because 1 has been defined earlier on.
Note that this is unlike a PostScript dictionary in that strings are defined on an ad hoc basis, not at the beginning but only when encountered, thus
    0="peter"
    0=
    0=
    1="mary"
    0=
    1=
    0=
    2="fred"
    2=
    1=
    etc
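The table logic above can be sketched as follows. This is an illustration only: `RapStringTable`, `rap_resolve` and `rap_free_table` are hypothetical names, and the sketch starts from an index and string that have already been read from the stream, since the byte layout of Bis_Short is not covered here. Entries are looked up by index value rather than by position, because the indexes should not be assumed to be ordinal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    unsigned index;
    char    *text;
} RapStringEntry;

typedef struct {
    RapStringEntry *items;
    size_t          count;
} RapStringTable;

/* Resolves one (index, text) pair taken from the stream.
   If the index is already in the table, the stored string is returned.
   Otherwise a non-empty text defines the index and is stored.
   Returns NULL for an empty text with an unknown index, or if memory
   runs out. */
const char *rap_resolve(RapStringTable *t, unsigned index, const char *text)
{
    RapStringEntry *grown;
    char *copy;

    assert(t != NULL && text != NULL);
    for (size_t i = 0; i < t->count; i++)
        if (t->items[i].index == index)
            return t->items[i].text;       /* defined earlier on */
    if (text[0] == '\0')
        return NULL;                       /* reference before definition */

    grown = realloc(t->items, (t->count + 1) * sizeof *grown);
    if (grown == NULL)
        return NULL;
    t->items = grown;
    copy = malloc(strlen(text) + 1);
    if (copy == NULL)
        return NULL;
    strcpy(copy, text);
    grown[t->count].index = index;
    grown[t->count].text = copy;
    t->count++;
    return copy;
}

/* Releases everything held by the table and resets it to empty. */
void rap_free_table(RapStringTable *t)
{
    assert(t != NULL);
    for (size_t i = 0; i < t->count; i++)
        free(t->items[i].text);
    free(t->items);
    t->items = NULL;
    t->count = 0;
}
```

Start with a table initialised to `{ NULL, 0 }`. Feeding it the sequence listed above returns "peter" for every 0, "mary" for every 1 and "fred" for every 2, whether or not the string accompanies the index.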