Introduce BinaryFormat extension point#249

Draft
nickbabcock wants to merge 1 commit into master from de
@nickbabcock nickbabcock commented May 27, 2026

Contributor

The binary format needs 3 features:

  • Stateful parsing:
    • HOI4 interprets 0x000d as 64 bits since patch 1.17, instead of 32 bits. The patch is contained within the data as an i32 value following an 0x349d id.
    • The default f64 encoding for CK3 changed in patch 1.5 from a decimal fixed point precision of 1000 to 100000 based on the i32 value of the 0x00ee field.
    • How individual numeric fields in CK3 are encoded can differ. For instance, some 0x0167 values are encoded as Q49.15 instead of decimal fixed point: if a gold id field is encountered within the scope of alive_data, it is encoded as Q49.15, unlike other gold instances.
    • This stateful parsing can be used for streaming token use cases (like melting the save file into plaintext) and for deserialization, so that both use cases see the same semantic values.
  • Speculative parsing: Ultimate performance is non-negotiable, as save files can be hundreds of megabytes and 100 million tokens. The binary deserializer should be able to exploit the shape of the target data to drive the underlying parser. This moves the behavior from speculating on what was just parsed to speculating on what will be parsed. For example, given a deserialization hint of an i32, the parser should assume the next two bytes are 0x000c, signalling an I32 value, and only fall back to generic parsing if this assumption is not met. This should give a healthy speedup.
  • Customization: Previously we got away with a binary parser that was a superset of all game binary formats. Thanks to HOI4 1.17, a superset is no longer possible, so it may be time to embrace the individuality of games. For example, only the EU5 binary format cares about string lookup tokens and compact fixed 5 tokens -- so why should other games pay the runtime cost of branches that are never taken?
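
As a rough illustration of the speculative parsing point, here is a minimal sketch of a hint-driven read. `Lexer`, `Token`, `expect_i32`, `next_token`, and `read_with_i32_hint` are hypothetical names made up for this example, not the API in this PR, and the sketch assumes little-endian ids and payloads. The fast path checks the 0x000c id and the 4 payload bytes together and consumes input only when the guess holds; otherwise it falls back to generic dispatch.

```rust
/// Token id that announces an i32 payload (taken from the description above).
const I32_ID: u16 = 0x000c;

/// Hypothetical lexer over a byte slice; not the crate's actual type.
struct Lexer<'a> {
    data: &'a [u8],
}

#[derive(Debug, PartialEq)]
enum Token {
    I32(i32),
    /// Any other id. Payload handling is elided in this sketch.
    Other(u16),
}

impl<'a> Lexer<'a> {
    /// Fast path: the caller expects an i32, so test for the id and the
    /// payload in one bounds check. Consumes input only on success.
    fn expect_i32(&mut self) -> Option<i32> {
        let head = self.data.get(..6)?;
        // ASSUMPTION: ids and payloads are little-endian.
        if u16::from_le_bytes([head[0], head[1]]) != I32_ID {
            return None;
        }
        let value = i32::from_le_bytes([head[2], head[3], head[4], head[5]]);
        self.data = &self.data[6..];
        Some(value)
    }

    /// Slow path: generic dispatch on whatever id comes next.
    fn next_token(&mut self) -> Option<Token> {
        let id_bytes = self.data.get(..2)?;
        let id = u16::from_le_bytes([id_bytes[0], id_bytes[1]]);
        if id == I32_ID {
            return self.expect_i32().map(Token::I32);
        }
        self.data = &self.data[2..];
        Some(Token::Other(id))
    }

    /// Hint-driven read: speculate first, fall back to generic parsing on a miss.
    fn read_with_i32_hint(&mut self) -> Option<Token> {
        match self.expect_i32() {
            Some(value) => Some(Token::I32(value)),
            None => self.next_token(),
        }
    }
}
```

When the hint is right, the generic `match` over every token id is never entered, which is where the speedup would be expected to come from.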

Whereas BinaryFlavor is a narrower extension point, the BinaryFormat
trait allows downstream implementations to be almost entirely decoupled
from the standard format. A good example of this is HOI4, whose 0x000d
token may represent 32 or 64 bits of data. This can't be captured by
BinaryFlavor, whereas BinaryFormat can.
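
To make the HOI4 case concrete, here is a minimal sketch of a format that carries state between tokens. `FormatSketch`, `Hoi4Sketch`, `observe_patch`, `width_000d`, and `read_000d` are hypothetical names for illustration only; the actual `BinaryFormat` trait may look quite different. How the i32 after the 0x349d id maps to patch 1.17 is not spelled out here, so the threshold is left as a caller-supplied assumption, and the payload is returned as raw little-endian bits rather than an interpreted value.

```rust
/// Hypothetical stand-in for the extension point; not the trait in this PR.
trait FormatSketch {
    /// Called once the i32 following the 0x349d id has been read.
    fn observe_patch(&mut self, raw: i32);
    /// Number of payload bytes that follow a 0x000d id.
    fn width_000d(&self) -> usize;
}

struct Hoi4Sketch {
    /// Set once a patch value at or past the threshold has been seen.
    wide: bool,
    /// ASSUMPTION: first raw patch value that uses the 64-bit layout.
    /// The real encoding of "1.17" as an i32 is not given, so the
    /// caller supplies it.
    wide_since: i32,
}

impl FormatSketch for Hoi4Sketch {
    fn observe_patch(&mut self, raw: i32) {
        self.wide = raw >= self.wide_since;
    }

    fn width_000d(&self) -> usize {
        if self.wide { 8 } else { 4 }
    }
}

/// Reads the payload of a 0x000d token as raw bits, zero-extended to u64.
/// Returns the bits and the remaining input, or None on truncated data.
fn read_000d<'a>(fmt: &impl FormatSketch, data: &'a [u8]) -> Option<(u64, &'a [u8])> {
    let width = fmt.width_000d();
    let payload = data.get(..width)?;
    let mut buf = [0u8; 8];
    buf[..width].copy_from_slice(payload);
    // ASSUMPTION: payloads are little-endian.
    Some((u64::from_le_bytes(buf), &data[width..]))
}
```

Because both a streaming tokenizer and a deserializer would ask the same format value for the width, they would agree on how many bytes a 0x000d token consumes.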