• Welcome to SC4 Devotion Forum Archives.

Forays into DBPF tools

Started by JoeST, March 10, 2013, 07:47:05 AM


JoeST

So, as you may know, I'm not very good at this whole SC4 thing. I've not played the game for a while, nor have I ever been any good at making things for it. At least, nothing anyone found any use for.

Hopefully this may change :D I recently found myself helping Wou make SC4Mapper a little more portable (since he published the code on GitHub). This prompted me to go back and refactor my last attempt at a DBPF parser in Python. I thought, 'wouldn't it be cool if we could have some way of doing a "Cleanitol for TGIs"?'

I'm glad to say that I've made a thing that can do this. Or at least, it can scan a single DBPF file for any TGIs. I have no idea how (nor any desire) to program a GUI, but the code is there for anyone who wants to stab me. I want to attempt a library that recurses through a directory, generating a list of paths to all the DBPF files. I'd then use this to scan directories for the TGIs listed in (what I'm calling) the .tgis file.
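A minimal sketch of the directory-recursing step described above, in Python. The function name `find_dbpf_files` is hypothetical and not taken from the actual code; it assumes a file counts as DBPF if its first four bytes are the `DBPF` magic, whatever its extension. The sorted walk is only there for repeatable output and does not claim to match the game's load order.

```python
import os

DBPF_MAGIC = b"DBPF"  # the four-byte signature at the start of a DBPF file


def find_dbpf_files(root):
    """Yield paths of files under root that start with the DBPF magic.

    Hypothetical helper: files are identified by content, not by extension.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a stable order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as f:
                    if f.read(4) == DBPF_MAGIC:
                        yield path
            except OSError:
                continue  # unreadable file: skip it and carry on
```

Usage would be something like `for path in find_dbpf_files("Plugins"): ...`, feeding each path to whatever scans a single file for TGIs.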

Anyone interested?
Copperminds and Cuddleswarms

Swordmaster

Something like that would be great, Joe. A tool where you can input a dat file and it produces a txt list of the subfiles in it with TGI addresses.

Even better, do you think it could list addresses and contents of LTEXT files?


Cheers
Willy

CasperVg

Looking great, Joe! The more DBPF information and algorithms we have, the better. I've been thinking of doing a DBPF library in C, but it probably wouldn't be much different from iLive's C++ implementation  :)



Quote from: Swordmaster on March 10, 2013, 09:11:03 AM
Even better, do you think it could list addresses and contents of LTEXT files?

My LTEXTool already does that, if you use the Export as XML functionality. It will output the LTEXTs as follows, in an XML file, which is basically a formatted TXT file.

<DBPF>NetworkAddonMod_Locale_english.dat</DBPF>
<LTEXT>
  <tgiKey>
    <tid>539399691</tid>
    <gid>708335844</gid>
    <iid>26</iid>
  </tgiKey>
  <compressed>false</compressed>
  <decompressedSize>0</decompressedSize>
  <data>ANT Network</data>
</LTEXT>
<LTEXT>
  <tgiKey>
    <tid>539399691</tid>
    <gid>1780686506</gid>
    <iid>41105</iid>
  </tgiKey>
  <compressed>false</compressed>
  <decompressedSize>0</decompressedSize>
  <data>Avenue Raised Bridge</data>
</LTEXT>
<LTEXT>
  <tgiKey>
    <tid>539399691</tid>
    <gid>173361888</gid>
    <iid>163324659</iid>
  </tgiKey>
  <compressed>true</compressed>
  <decompressedSize>36</decompressedSize>
  <data>El Train Control</data>
</LTEXT>
<LTEXT>
  <tgiKey>
    <tid>539399691</tid>
    <gid>1780686500</gid>
    <iid>200292429</iid>
  </tgiKey>
  <compressed>true</compressed>
  <decompressedSize>932</decompressedSize>
  <data>To me, pavement is a wonderful thing - the hard smooth surface, the extreme heat on a summers day. But it can be costly to create and maintain. If you are looking to save a few simoleons you might consider &lt;a href=&quot;#link_id#game.tool_plop_network(network_tool_types.DIRT_ROAD)&quot;&gt;ANT Network&lt;/a&gt; as an alternative. There&apos;s something special about the bumpy ride and billows of dust that only an unpaved path can provide. And what a boon to our local car wash owners!</data>
</LTEXT>
<LTEXT>
  <tgiKey>
    <tid>539399691</tid>
    <gid>4041568566</gid>
    <iid>308696681</iid>
  </tgiKey>
  <compressed>false</compressed>
  <decompressedSize>0</decompressedSize>
  <data>Boulevard Light</data>
</LTEXT>
<LTEXT>
  <tgiKey>
    <tid>539399691</tid>
    <gid>4062043073</gid>
    <iid>311885708</iid>
  </tgiKey>
  <compressed>false</compressed>
  <decompressedSize>0</decompressedSize>
  <data>Railway Tunnel Entrance</data>
</LTEXT>


Granted, it's a little wonky right now (I need to make it output the TGI as hex strings instead of longs) and it only works on one file at a time, not entire directories.
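For reference, turning the decimal values from the XML above into hex strings is a one-liner in Python. This is only a sketch, not how LTEXTool does or will do it; `tgi_hex` is a hypothetical name, and the zero-padded `0x` style is an assumed convention.

```python
def tgi_hex(tid, gid, iid):
    """Format a TGI triple as zero-padded, upper-case 32-bit hex strings."""
    # Mask to 32 bits so a value read as a signed long still prints correctly.
    return tuple("0x{:08X}".format(v & 0xFFFFFFFF) for v in (tid, gid, iid))


# First LTEXT entry from the export above:
print(tgi_hex(539399691, 708335844, 26))
# → ('0x2026960B', '0x2A3858E4', '0x0000001A')
```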
Follow my SimCity 4 Let's play on YouTube

wouanagaine

A VirtualDat (a term borrowed from iLiveReader) is something very interesting to add to any tool that needs to parse folders in the correct order => http://sc4devotion.com/forums/index.php?topic=10542.0

That is something I use in PIMX and DatPacker. Joe, for Python it might be more interesting to use plain tuples instead of namedtuple or even classes, as I find they take a huge amount of memory. It is, however, more difficult to remember an index than a struct member.

Some memory info =>
PIMX using a class for 'Entry' => can't parse my plugins folder, out of memory
PIMX using a tuple for 'Entry' => uses 500 MB of RAM (way too much compared to a C++ version)
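One way to check this on your own interpreter is to compare the shallow per-object sizes. The sketch below is an illustration only: the `Entry*` types and their field names are hypothetical, `sys.getsizeof` does not count the integers the entry refers to, and the numbers depend on the Python version and platform, so they may not match the figures above.

```python
import sys
from collections import namedtuple

# Hypothetical index entry layouts, all holding the same five fields.
EntryNT = namedtuple("EntryNT", "tid gid iid offset size")


class EntryClass:
    def __init__(self, tid, gid, iid, offset, size):
        self.tid, self.gid, self.iid = tid, gid, iid
        self.offset, self.size = offset, size


class EntrySlots:
    __slots__ = ("tid", "gid", "iid", "offset", "size")

    def __init__(self, tid, gid, iid, offset, size):
        self.tid, self.gid, self.iid = tid, gid, iid
        self.offset, self.size = offset, size


def shallow_sizes():
    """Return the shallow size in bytes of one entry in each representation."""
    values = (0x2026960B, 0x2A3858E4, 0x1A, 96, 36)
    plain = EntryClass(*values)
    return {
        "tuple": sys.getsizeof(values),
        "namedtuple": sys.getsizeof(EntryNT(*values)),
        # A plain instance also carries a per-object attribute dict.
        "class": sys.getsizeof(plain) + sys.getsizeof(plain.__dict__),
        "slots class": sys.getsizeof(EntrySlots(*values)),
    }


for kind, size in shallow_sizes().items():
    print(kind, size)
```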



New Horizons Productions
Berethor ♦ beskhu3epnm ♦ blade2k5 ♦ dmscopio ♦ dedgren ♦ emilin ♦ Ennedi ♦ Heblem ♦ jplumbley
M4346 ♦ moganite ♦ Papab2000 ♦ Shadow Assassin ♦ Tarkus ♦ wouanagaine
Divide wouanagaine by zero and you will in fact get one...one bad-ass that is - Alek King of SC4

JoeST

I resorted to using (in memory) sqlite databases, so I don't know how well that will scale.
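Roughly what an in-memory sqlite approach could look like, as a sketch only: the table layout, the names `build_tgi_db` and `find_conflicts`, and the sample file names are made up for illustration and are not the actual schema.

```python
import sqlite3


def build_tgi_db(entries):
    """Load (path, tid, gid, iid) rows into an in-memory SQLite database."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE entry (path TEXT, tid INTEGER, gid INTEGER, iid INTEGER)"
    )
    db.executemany("INSERT INTO entry VALUES (?, ?, ?, ?)", entries)
    # An index on the TGI columns keeps the grouping query below cheap.
    db.execute("CREATE INDEX entry_tgi ON entry (tid, gid, iid)")
    return db


def find_conflicts(db):
    """Return (tid, gid, iid, count) for every TGI stored more than once."""
    return db.execute(
        "SELECT tid, gid, iid, COUNT(*) FROM entry "
        "GROUP BY tid, gid, iid HAVING COUNT(*) > 1 "
        "ORDER BY tid, gid, iid"
    ).fetchall()


db = build_tgi_db([
    ("a.dat", 0x2026960B, 0x2A3858E4, 0x1A),
    ("b.dat", 0x2026960B, 0x2A3858E4, 0x1A),  # same TGI as in a.dat
    ("b.dat", 0x2026960B, 0x6A231EAA, 0xA091),
])
print(find_conflicts(db))
```

How well this scales to a whole plugins folder is exactly the open question; switching `":memory:"` to a file path is the obvious fallback if RAM becomes the limit.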

cogeo

Let me post my thoughts here.

I was considering making something similar, a tool named "Installation Analyzer" or so, which would scan all files in the SC4 installation and report duplicates, overrides and conflicts. Take a look at my posts here and here if you want. But I just made the application's main window and then abandoned it, discouraged by the fewer and fewer active members.

I think this would be really feasible to make, and quite straightforward actually. And while many players have installations as big as 50GB, some efficient coding could overcome any performance issues. The DBPF file's header contains a pointer (offset) to the Index, which is a list of 20-byte entries, each containing the TGI IDs plus the offset and size of one item in the DBPF. The operation is quite simple: you just need to open the file, read the header, then seek to the Index and read it (in one read operation) into an array or memory block (then compare each item's TGI to the ones in the previously loaded lists, etc.). This is fast, and keeping the index lists of all DBPF files of the installation in memory won't need much space.

I don't think it's worth using a library here, as the implementation is just a few lines of code (I did this in the Model Tweaker, so I know what I'm talking about), and libraries may be inefficient, e.g. they may read (or keep in memory) the whole file and not just the Index, or perform additional proofing or housekeeping operations, whereas by writing a few lines of code you can do exactly what you want.
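The header-then-index read described above could be sketched in Python as follows. This is not the Model Tweaker code. The field offsets (versions at bytes 4 and 8, index version, entry count, index offset and index size starting at byte 32, in a 96-byte little-endian header) are assumptions taken from community DBPF documentation and should be checked against real files; `read_index` is a hypothetical name.

```python
import struct

HEADER_SIZE = 96                  # assumed size of the DBPF 1.0 header
ENTRY = struct.Struct("<5I")      # type, group, instance, offset, size = 20 bytes


def read_index(f):
    """Read only the header and the Index of an open binary DBPF file.

    Returns a list of (tid, gid, iid, offset, size) tuples.
    """
    header = f.read(HEADER_SIZE)
    if len(header) < HEADER_SIZE or header[:4] != b"DBPF":
        raise ValueError("not a DBPF file")
    # Assumed layout: file version at 4..11, index fields at 32..47.
    major, minor = struct.unpack_from("<2I", header, 4)
    index_major, count, index_offset, index_size = struct.unpack_from(
        "<4I", header, 32
    )
    if (major, minor, index_major) != (1, 0, 7):
        raise ValueError("unsupported DBPF or index version")
    if index_size < count * ENTRY.size:
        raise ValueError("index size does not fit the entry count")
    f.seek(index_offset)
    block = f.read(count * ENTRY.size)   # the whole Index in one read
    if len(block) != count * ENTRY.size:
        raise ValueError("truncated index")
    return list(ENTRY.iter_unpack(block))
```

Called as `with open(path, "rb") as f: entries = read_index(f)`, this never touches the item data itself, only the header and the Index.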

wouanagaine

Cogeo, such a tool exists somewhere on my old drive; I'm not sure I can find it. It was never released publicly because the UI sucks.



Tropod

Quote from: cogeo on March 10, 2013, 03:39:28 PM
Let me post my thoughts here.....

Had a read of the linked post. I'm the same sometimes, wondering why I bother. I figure, though, what better way to spend my free time learning programming than to practice doing so on SC.  :blahblah:...mostly do it for fun  ;)

JoeST

Well, I just updated the 'C' branch with two functions:

  • `DBPF_header(FILE*, *head)` parses the header into a `struct dbpf_header` and checks the magic
  • `DBPF_index(FILE*, *head, **index)` reads the index table into an allocated buffer, passes it back through `*index`, and returns the width of each record

cogeo

Well, I did take a look at the sources, and yes, this is what I was talking about!

Some comments:
- For SC4, the file format is DBPF version 1.0, index version 7.0. Your code can simply refuse to read the file if it does not meet this spec. DBPF_index() can therefore return a pointer to (an array of) dbpf_index_v70 structures instead of uint32** (better to use some typedefs here). But you can leave it as is if you want your code to be able to read both V7.0 and V7.1 indices: the programmer will then have to check the index version and assign (type-cast) the index variable to either a dbpf_index_v70 or a dbpf_index_v71 pointer (I don't know how practical this is, though).
- These file functions using FILE* descriptors have become somewhat "obsolete". They can also prove inefficient because they are buffered (the default buffer can be as small as 512 bytes, depending on the C runtime), so better turn buffering off. Maybe consider using the function set that works with file handles instead (_read(), _write(), _lseek() etc.).
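For comparison, here is what the second suggestion looks like in Python, where `os.open`, `os.lseek` and `os.read` work on a file descriptor and bypass the buffered file object. This is a sketch with a hypothetical helper name, `read_at`; whether it is measurably faster than a buffered read for a header plus one index block has not been tested here.

```python
import os


def read_at(path, offset, length):
    """Read up to `length` bytes at `offset` using descriptor-level calls."""
    # O_BINARY only exists on Windows; elsewhere the extra flag is 0.
    fd = os.open(path, os.O_RDONLY | getattr(os, "O_BINARY", 0))
    try:
        os.lseek(fd, offset, os.SEEK_SET)
        chunks = []
        remaining = length
        while remaining > 0:
            chunk = os.read(fd, remaining)  # may return fewer bytes than asked
            if not chunk:
                break                       # end of file
            chunks.append(chunk)
            remaining -= len(chunk)
        return b"".join(chunks)
    finally:
        os.close(fd)
```

A simpler middle ground in Python is `open(path, "rb", buffering=0)`, which keeps the file-object interface but turns the buffer off.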

JoeST

I wrote this on Linux, using GCC to test.