PBO File Format

== Bibliography ==

* DosTools: {{Link|http://dev-heaven.net/projects/list_files/mikero-pbodll}}
* cpbo: {{Link|http://www.kegetys.net/arma/}}

== Open Source PBO Libraries ==

=== C ===

* JAPM: {{Link|https://github.com/RaJiska/JAPM}}
* armake: {{Link|https://github.com/KoffeinFlummi/armake}}
* libpbo: {{Link|https://github.com/Learath2/libpbo}}

=== C++ ===

* libpbo: {{Link|https://github.com/StidOfficial/libpbo}}

=== C# ===

* SwiftPbo: {{Link|https://github.com/headswe/SwiftPbo}}
* PboSharp: {{Link|https://github.com/Shix/PBOSharp}}

=== Python ===

* yapbol: {{Link|https://github.com/overfl0/yapbol}}
* pbo-fuse: {{Link|https://github.com/Dahlgren/python-pbo-fuse/blob/master/pbo.py}}
* arma3pbo: {{Link|https://github.com/danielgran/arma3pbo}}

=== Rust ===

* armake2: {{Link|https://github.com/KoffeinFlummi/armake2}}

=== Java ===

* ArmaFiles: {{Link|https://github.com/Krzmbrzl/ArmaFiles}}

=== JavaScript ===

* Pbo.js: {{Link|https://github.com/eelislynne/pbo.js}}




[[category:Armed Assault: Addons]]
[[Category:BIS File Formats]]
[[category:Operation Flashpoint: Addons]]
[[Category:Operation Flashpoint Elite: Addons]]
[[Category:Operation Flashpoint: Missions]]
[[category:Operation Flashpoint: Editing]]

Latest revision as of 10:24, 20 April 2024

Disclaimer: This page describes internal undocumented structures of Bohemia Interactive software.

This page contains unofficial information.

Some usage of this information may constitute a violation of the rights of Bohemia Interactive and is in no way endorsed or recommended by Bohemia Interactive.
Bohemia Interactive is not willing to tolerate use of such tools if it contravenes any general licenses granted to end users of this community wiki or BI products.

PBO means "packed bank of files". A .pbo is identical in purpose to a zip or rar. It is a container for folder(s) and file(s).

The engine will internally expand any *.pbo back out to its original folder-tree form.

Pbo .ext types

Although generically called 'pbo', there are in fact different extension names employed, depending on the engine being used. In all cases, the contents are ultimately identical to those of a pbo.

  • ifa: used by a World War 2 mod that is hard-wired to accept only ifa files.
  • ebo: initially used by VBS, then by Arma, to encrypt the contents of a pbo and give it a different extension.
  • xbo: used by VBSLite; encrypted with a different algorithm from the one above.
    • the encryption methods used for VBSLite UK and VBSLite US are different.

Legend

See Generic FileFormat Data Types.

Compression

In addition to simply packaging all files and folders in a tree into a single file, some, all, or none of the files within can be compressed using Apple's method of LZSS compression. Which files are compressed is entirely optional. Files that would not benefit are normally left uncompressed, and some file types in the Arma engines cannot be compressed at all, reportedly because BIS forgot that this code existed. The intent behind compression was for internet use and, in the 'good old days', simply to reduce hard disk storage requirements. In practice, compression is becoming less popular because it places a small load on the engine. Operation Flashpoint: Elite cannot work with compressed .pbo files - see Elite PBOs.

PboTypes

Mission Versus Addon

Across all engines there are two distinct types of pbo.

  • Mission pbos do not contain a config.cpp and live in a mission folder (be it campaign, single player, or multiplayer). They have no use for a pboPrefix, nor for a Properties header.
  • Addon pbos contain a config.cpp/bin which, at the least, identifies the addon name and, with the exception of the initial CWC engine, rely on a pbo properties header. They live in an addons\ folder.

pbo engine differences

  • Cold War Crisis: the very first in the series (disregarding the 'proof of concept' release a year earlier) has no pboProperties header and no checksums. It does, however, form the core of all other engines. Its format has never been altered.
  • Resistance: added a Properties header as the first 'entry' of the pbo.
  • VBS1: same as Resistance (but encrypted).
  • Xbox Elite: added a 0 byte plus a 4-byte checksum to the end of the file.
  • Arma: changed Elite's checksum to a 0 byte plus a 20-byte SHA key at the end of the file.
  • VBS2: same as Arma (but encrypted).
  • DayZ: same as Arma.

Note also that in almost all cases the properties header and the SHA key are both optional.


Main format

The format of a .pbo contains:

  1. a header consisting of contiguous file name structures called 'entries', each at least 21 bytes long (a zero-terminated filename followed by five 4-byte fields). The very first entry may carry additional data (see below).
  2. one contiguous data block.
  3. an (optional) 5-byte checksum (Elite) or a 21-byte signature key (Arma).

With exceptions, each entry defines a file contained in the .pbo: its name, size, date, and whether it is compressed.

Because entries and data are contiguous, there is no need for an offset to the 'next' file. Every file, even a zero-length one, is recorded in the header.

However, note that there is no provision for storing empty folders. Folders are indicated simply by being part of a filename. There are no folder entries, and consequently empty folders cannot be included in a .pbo because there is no filename associated with them. Put another way, an empty folder, if it could be stored (and it can't), would appear to be an empty filename when dePbo'd.

The last header 'entry' is filled with zeroes. The next byte is the beginning of the data block.
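Because the data block is contiguous and follows header order, the position of each file can be derived by summing the DataSize values of the entries before it. The following is a minimal sketch of that idea; the function name `data_offsets` and its parameters are illustrative and not part of any tool.

```python
def data_offsets(data_sizes, data_start):
    """Return the absolute offset of each file's data.

    data_sizes: the DataSize of every file entry, in header order.
    data_start: offset of the first byte after the terminating null entry.
    """
    offsets = []
    position = data_start
    for size in data_sizes:
        offsets.append(position)
        position += size  # the next file starts right where this one ends
    return offsets


# Three files of 10, 0 and 25 bytes, with the data block starting at byte 64.
print(data_offsets([10, 0, 25], data_start=64))
```

Note that a zero-length file shares its offset with the file that follows it, which is harmless because no bytes are read for it.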


PBO Header Entry

A standard .pbo entry is as follows:

struct entry
{
	Asciiz filename;	// a zero-terminated string defining the path and filename,
						// relative to the name of this .pbo or its prefix.
						// Zero-length filenames ('\0') indicate the first (optional) or last (mandatory) entry in the header.
						// Other fields in the last entry are filled with zero bytes.

	char[4] MimeType;	// 0x56657273 'Vers' properties entry (only as the first entry, if at all)
						// 0x43707273 'Cprs' compressed entry
						// 0x456e6372 'Encr' encrypted entry (VBS)
						// 0x00000000 uncompressed entry, or the dummy last header entry
						
	ulong OriginalSize;	// Uncompressed: 0 or same value as the DataSize
						// Compressed: size of the file after unpacking.
						// This value is needed for byte boundary unpacking
						// since unpacking itself can lead to bleeding of up
						// to 7 extra bytes.

	ulong Offset;		// not actually used, always zero (but in VBS it holds encryption data)

	ulong TimeStamp;	// meant to be the Unix file time (seconds since Jan 1 1970), but often 0

	ulong DataSize;		// The size in the data block.
						// This is also the file size when not packed
};
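The entry layout above can be read with a few lines of code. The sketch below assumes a header without a properties entry (as in Cold War Crisis pbos), little-endian fields, and one byte per filename character; the names `read_entry` and `read_header` and the dictionary keys are hypothetical, chosen only for this illustration.

```python
import struct


def read_entry(buf, pos):
    """Read one header entry at pos; return (entry, position after it)."""
    end = buf.index(b"\x00", pos)              # the filename is zero terminated
    name = buf[pos:end].decode("latin-1")      # assumption: one byte per character
    fields = struct.unpack_from("<5I", buf, end + 1)  # five little-endian ulongs
    keys = ("mime_type", "original_size", "offset", "timestamp", "data_size")
    entry = {"filename": name, **dict(zip(keys, fields))}
    return entry, end + 1 + 20


def read_header(buf, pos=0):
    """Read entries until the null entry; return (entries, start of data block)."""
    entries = []
    while True:
        entry, pos = read_entry(buf, pos)
        if entry["filename"] == "":            # null entry: end of header
            return entries, pos
        entries.append(entry)


# A hand-built header: one uncompressed 12-byte file, then the null entry.
header = (
    b"scripts\\init.sqs\x00" + struct.pack("<5I", 0, 0, 0, 0, 12)
    + b"\x00" + struct.pack("<5I", 0, 0, 0, 0, 0)
)
entries, data_start = read_header(header)
print(entries[0]["filename"], entries[0]["data_size"], data_start)
```

A header that begins with a 'Vers' properties entry needs the property strings skipped before this loop is started, since that entry also has an empty filename.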

Null Entries

Entries with no file name indicate:

  1. End of header, content all zeros and ignored regardless.
  2. PboProperties entry as the very first entry of the file (not present for cwc or any mission.pbo)

PboProperties

struct entry // the 'standard' entry, acting here as the properties marker
{
	Asciiz	filename;		// = 0
	char[4]	MimeType;		// 0x56657273 'Vers' properties entry (only as the first entry, if at all)
	ulong	OriginalSize;	// = 0
	ulong	Offset;			// = 0
	ulong	TimeStamp;		// = 0
	ulong	DataSize;		// = 0
} // end of 'standard' entry

struct properties
{
	Asciiz name, value;	// e.g. name=value
}[...];

byte end; // = 0

There can be any number of contiguous string pairs.

The last (or only) string is a zero-length ASCIIZ string (i.e. a single '\0' byte).
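The string pairs that follow the 'Vers' entry can be collected until the zero-length terminator is met. This is a minimal sketch under the assumption that names and values are single-byte strings; `read_properties` is a hypothetical helper name.

```python
def read_properties(buf, pos):
    """Read name/value ASCIIZ pairs until a zero-length string ends the list."""
    props = {}
    while True:
        end = buf.index(b"\x00", pos)
        name = buf[pos:end].decode("latin-1")
        pos = end + 1
        if name == "":                 # zero-length string: end of properties
            return props, pos
        end = buf.index(b"\x00", pos)
        props[name] = buf[pos:end].decode("latin-1")
        pos = end + 1


# The bytes that would follow the 21-byte 'Vers' entry in the header.
blob = b"prefix\x00Addon\\FOLDER\\Name\x00version\x00123\x00\x00"
props, next_pos = read_properties(blob, 0)
print(props)
```

The returned position is where the first ordinary file entry begins.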

Operation Flashpoint: Resistance PBO

struct properties
{
	"product" = "OFP: Resistance"
};

Operation Flashpoint: Elite PBO

struct properties
{
	"prefix" = "Addon\FOLDER\Name"
};

Arma PBO

prefix=Addon\FOLDER\Name
version="123"
engine="arma3"
author="I am famous"
anything=else that takes your fancy


End of File Checksum or Sha

Operation Flashpoint

Not present for Operation Flashpoint pbos

Operation Flashpoint: Elite

{
	byte always0;
	int checksum;
}

Arma

{
	byte always0;
	byte sha[20]; // 20-byte SHA-1 digest (not MD5, which is 16 bytes); only used for MP play
}
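The 21-byte trailer can be split off and compared against a freshly computed digest. This page does not state which bytes are hashed, so the sketch below assumes the digest is a SHA-1 (20 bytes matches that digest size) taken over everything that precedes the trailer; treat that, and the helper names, as assumptions to verify against real files.

```python
import hashlib


def split_arma_trailer(pbo_bytes):
    """Return (body, digest) for a file ending in a 0 byte plus 20 digest bytes."""
    body, trailer = pbo_bytes[:-21], pbo_bytes[-21:]
    if len(trailer) != 21 or trailer[0] != 0:
        raise ValueError("no 0 + 20 byte trailer found")
    return body, trailer[1:]


def trailer_matches(pbo_bytes):
    """Assumption: the digest is the SHA-1 of all bytes before the trailer."""
    body, digest = split_arma_trailer(pbo_bytes)
    return hashlib.sha1(body).digest() == digest


# Build a stand-in file so the check can be exercised without a real pbo.
body = b"stand-in for header and data block"
fake_pbo = body + b"\x00" + hashlib.sha1(body).digest()
print(trailer_matches(fake_pbo))
```

Since the trailer is optional, a failed check may simply mean that the file has no trailer at all.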


Data compression

Data compression is a mild but effective use of Apple's LZSS method (a Lempel-Ziv variant), allowing up to 4k of previous data to repeat itself (the pointer address described below is 12 bits wide).

Compression is indicated when an entry has a signature of 0x43707273 and its OriginalSize and DataSize do not match.
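The rule above amounts to a two-part test on the entry fields. A minimal sketch, with a hypothetical function name and with the field values passed in as plain integers:

```python
CPRS = 0x43707273  # 'Cprs' signature from the header entry


def is_compressed(mime_type, original_size, data_size):
    """True only when the signature is 'Cprs' and the two sizes differ."""
    return mime_type == CPRS and original_size != data_size


print(is_compressed(CPRS, 100, 60))   # signature set, sizes differ
print(is_compressed(CPRS, 60, 60))    # signature set, sizes equal
print(is_compressed(0, 0, 60))        # plain uncompressed entry
```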

The following code also applies to the packing method employed in wrp (OPRW) and pac/paa files, which have no header info, simply a block of known output length that must be decoded. In all cases, the OUTPUT size is known. With .pbo files, the INPUT size is only a boundary marking the start of the next block of data. It is not relevant to decoding, because up to 7 unused flag bits may remain in the last flag byte of the block. As such, only the fixed output size is relevant.

The compressed data block is in contiguous 'packets' of different lengths:

block {packet1}...{packetN} {4 byte checksum}

packet
{
	byte flagbits;
	byte packetdata[...];	// no fixed length
}

The packetdata contains a mixture of raw data, which is passed directly to the output, and 2-byte pointers.

Format: the bit values of the flag byte determine what the packetdata is. It is interpreted lsb first, thus:

BitN = 1	-	append byte directly to file (read single byte)
BitN = 0	-	pointer (read two bytes)

For example:

If the format byte is 0x45, its binary notation is 01000101.

There are three bytes in the block, a little further past the format flag, that will be passed directly to the output when encountered, and there are five pointers.

In this example, the first byte of packetdata is passed to the output, 2 bytes are read to make a pointer, the next byte is passed to the output, and so on.
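The 0x45 example can be reproduced by walking the flag byte from its least significant bit. The helper name `flag_sequence` is made up for this illustration.

```python
def flag_sequence(flag_byte):
    """List what each of the 8 flag bits announces, lsb first."""
    return ["raw" if (flag_byte >> bit) & 1 else "pointer" for bit in range(8)]


sequence = flag_sequence(0x45)
print(sequence)

# Raw items take one byte of packetdata, pointers take two.
packet_length = sum(1 if item == "raw" else 2 for item in sequence)
print(packet_length)
```

For 0x45 this yields three raw bytes and five pointers, so a full packet carries 13 bytes of packetdata after the flag byte.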

For the very last packet in the block, it is almost inevitable that there will be
excess bits. These are ignored (truncated), as the final output length is always
known from the Entry. You cannot rely on the unused bits in the format flag (up to seven
of them) having any particular value (0 or 1); they must be ignored.

A pointer consists of a 12-bit address and a 4-bit run length.

The pointer is a reference to somewhere in the previous 4k (at most) of built output. Given Intel's little-endian word format, the bytes b1 and b2 form a short word value B2B1.

The bit layout of B2B1 is, unfortunately, AAAA LLLL AAAAAAAA, requiring some shift and mask fiddling.

The address refers to the start of some data in the currently rebuilt part of the file. It is a value relative to the current length of the reconstructed part of the file (FL).

The run length of the data to be copied (the 'pattern') has 4 bits and therefore, in theory, 0 to 15 bytes can be duplicated. In practice the values are 3..18 bytes, because copying 0, 1 or 2 bytes makes no sense.

Relative position (rpos) into the currently built output is calculated as

rpos = FL - ((B2B1 & 0x00FF) + ((B2B1 & 0xF000) >> 4))

The length of the data block: rlen

rlen = ((B2B1 & 0x0F00) >> 8) + 3
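The two formulas can be checked with a worked value. In this sketch the bytes 0x34 and 0x12 and the output length of 1000 are arbitrary sample values, and `decode_pointer` is a hypothetical name.

```python
def decode_pointer(b1, b2, fl):
    """Turn the two pointer bytes into (rpos, rlen) for an output of length fl."""
    b2b1 = (b2 << 8) | b1                                  # little-endian word
    distance = (b2b1 & 0x00FF) + ((b2b1 & 0xF000) >> 4)    # 12-bit address
    rpos = fl - distance
    rlen = ((b2b1 & 0x0F00) >> 8) + 3                      # 4-bit length, biased by 3
    return rpos, rlen


# B2B1 = 0x1234: address = 0x134 = 308, length nibble = 2.
print(decode_pointer(0x34, 0x12, fl=1000))
```

With these inputs the copy starts 308 bytes back from the end of the output, at position 692, and runs for 5 bytes.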

With the values of rpos and rlen there are three basic situations possible:

rpos + rlen <= FL // bytes to copy are within the existing reconstructed data
The block is added to the end of the file, giving a new length of FL = FL + rlen.
rpos + rlen > FL // data to copy exceeds what's available
In this situation the available data block has a length of FL - rpos, and it is added repeatedly to the reconstructed file until rlen bytes have been written (FL = FL,Initial + rlen).
rpos + rlen < 0
This is a special case where spaces are added to the decoded file until FL = FL,Initial + rlen
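Putting the flag bits, the pointer arithmetic, and the three copy situations together gives a small decoder. This is a sketch rather than a reference implementation: it copies byte by byte, which handles the overlapping case naturally, and it assumes that any source position before the start of the output yields a space (a generalisation of the third situation). The function name and return shape are illustrative.

```python
def lzss_decompress(data, out_size):
    """Decode one compressed block; return (output, bytes of input consumed)."""
    out = bytearray()
    pos = 0
    while len(out) < out_size:
        flags = data[pos]
        pos += 1
        for bit in range(8):                          # flag bits, lsb first
            if len(out) >= out_size:
                break                                 # leftover flag bits are ignored
            if (flags >> bit) & 1:                    # raw byte
                out.append(data[pos])
                pos += 1
                continue
            b2b1 = data[pos] | (data[pos + 1] << 8)   # little-endian pointer
            pos += 2
            distance = (b2b1 & 0x00FF) + ((b2b1 & 0xF000) >> 4)
            if distance == 0:
                raise ValueError("pointer refers to data not yet written")
            rpos = len(out) - distance
            rlen = ((b2b1 & 0x0F00) >> 8) + 3
            for i in range(rlen):
                if len(out) >= out_size:
                    break                             # truncate at the known size
                src = rpos + i
                # Assumption: positions before the start of output read as spaces.
                out.append(out[src] if src >= 0 else 0x20)
    return bytes(out), pos


# Three raw bytes, then a pointer 3 bytes back with run length 6 (overlapping copy).
packed = bytes([0x07]) + b"abc" + bytes([0x03, 0x03])
text, used = lzss_decompress(packed, 9)
print(text, used)
```

The second return value marks where the block's 4-byte checksum begins in the input.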


The checksum is the last four bytes of any compressed data block. It is an unsigned long (Intel little-endian order), computed simply as a byte-at-a-time, unsigned additive sum of the decompressed data, with any overflow discarded.

Every compressed data block contains its own checksum.
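The checksum described above can be sketched as a plain byte sum. The code assumes that "spillover" means wrapping at 32 bits and that bytes are treated as unsigned; both function names are hypothetical.

```python
import struct


def block_checksum(decompressed):
    """Sum the decompressed bytes as unsigned values, wrapping at 32 bits."""
    return sum(decompressed) & 0xFFFFFFFF


def checksum_matches(decompressed, stored4):
    """Compare against the 4 little-endian bytes that end the compressed block."""
    (stored,) = struct.unpack("<I", stored4)
    return block_checksum(decompressed) == stored


sample = b"abc"                                    # 97 + 98 + 99
stored = struct.pack("<I", block_checksum(sample))
print(block_checksum(sample), checksum_matches(sample, stored))
```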

There is no checksum or other protective device employed on a .pbo overall. The exceptions are Elite and Arma, which have residual data after the end of the contiguous data block that represents a signature for the file.
