###################################################################
QuickBMS
by Luigi Auriemma
e-mail: me@aluigi.org
web: aluigi.org
home: http://quickbms.com
help: http://zenhax.com
###################################################################
1) Introduction
2) Usage
3) Reimporting the extracted files
4) How to create scripts (for developers only!)
5) Experimental input, output and other features
6) Notes
7) Support
8) Additional credits
###################################################################
===============
1) Introduction
===============
QuickBMS is a multiplatform extractor engine programmed through
simple instructions contained in textual scripts. It is intended
for extracting files and information from the archives and files
of any software, especially games.
The script language used in QuickBMS is an improvement of MexScript
documented here: http://wiki.xentax.com/index.php/BMS
QuickBMS is FULLY compatible with that original syntax and with
all the scripts that were created here:
http://forum.xentax.com/viewtopic.php?t=1086
QuickBMS also supports most of the WCX plugins of Total Commander:
http://www.totalcmd.net/directory/packer.html
http://www.ghisler.com/plugins.htm
The original BMS language has been improved by:
- removing implied fields, like the file number in some commands
- adding new commands, like Encryption
- adding new behaviors and features, like negative GoTo
These improvements allow QuickBMS to work with tons of simple and
complex formats and even to perform tasks like modifying files,
creating new files with headers, converting files and reimporting
the extracted files back into their original archives.
The tool is open source under the GPL 2.0 license and works on
Windows, Linux and MacOSX, on both little and big endian platforms
like Intel (little endian) and PPC (big endian).
You can distribute the original quickbms.exe file as you desire,
but reusing and/or modifying its source code may require the same
or a compatible open source license.
The official homepage of QuickBMS, with all the scripts I have
written from 2009 until now, is available at the following
addresses (they are all links to the same website):
http://quickbms.com
\ http://quickbms.aluigi.org
\ http://aluigi.altervista.org/quickbms.htm
\ http://aluigi.zenhax.com/quickbms.htm (rarely updated)
There is also an official forum that provides support for
QuickBMS and help with file formats. It is also a very good and
friendly free community for reverse engineering game files:
https://zenhax.com
QuickBMS is perfect for those tasks in which you need a quick way
to extract information from files and, at the same time, you would
like to reinject them back without writing a standalone tool to do
both the extraction and rebuilding jobs.
This is particularly useful if you have 100 different types of
archives to analyze (reverse engineer) and parse, and you then want
to share your tools with your community. It's easier to do that
with some lines of text pasted on a forum or pastebin than by
writing 100 different standalone extraction tools plus another 100
standalone rebuilders.
-------------------------------------------------------------------
For Linux and MacOSX users there is a Makefile in the src folder.
The only requirements are openssl, zlib and bzip2, while the
optional components are mcrypt and tomcrypt (uncomment the line
near the end of the Makefile to enable them).
If your distro supports apt-get and you have problems while
running "make", try the following:
apt-get install gcc g++ zlib1g-dev libssl-dev unicode
In case of problems on 64bit versions of Linux, try also appending
":i386" to the previous dependencies, like:
apt-get install libssl-dev:i386
MacOSX users need to read the simple instructions written in the
Makefile: just a few steps are needed to compile QuickBMS without
problems. Try a plain "make" first, though, because from version
0.8.1 it was rewritten to work easily.
Updated static builds for Linux x86 and MacOSX are available on
http://aluigi.altervista.org/quickbms.htm#builds
Feel free to contact me in case of problems or just post on
https://zenhax.com
###################################################################
========
2) Usage
========
Simple and quick:
- double-click on quickbms.exe
- select the script for the type of archive you want to extract,
for example zip.bms if it's a zip file.
- select the input archive or multiple files.
You can also select a whole folder by entering it and then
typing * (or "" on systems before Windows 7) in the "File name:"
field, and then selecting Open.
You can even use * to set wildcards, for example *.txt or
*required_name* or prefix*suffix
- select the output folder where the files will be extracted.
You can specify any filename, it will be ignored because only the
currently selected directory is used
- watch the progress status of the extraction and the final message
That's the simple "GUI" usage, but QuickBMS can do various other
things when launched from the console: it supports many
command-line options for advanced users and for those who write
scripts.
You can view all the available options by simply launching QuickBMS
from the command-line ("cmd.exe" on Windows) without arguments.
The following is the current list of options:
Usage: quickbms.exe [options] <script.BMS> <input_archive/folder> [output_folder]
Options:
-l list the files without extracting them
-f W filter the files to extract using the W wildcards separated by comma or
semicolon, example -f "{}.mp3,{}.txt;{}myname{}"
if the filter starts with ! it's considered an ignore/exclusion filter,
if .txt it's read as text file with multiple filters, * and {} are same
example: quickbms -f "{}.mp3;!{}.ogg" script.bms archive.dat output
example: quickbms -f myfilters_list.txt script.bms archive.dat
use {} instead of * to avoid issues on Windows, multiple -f are ok too
-F W as above but works only with the files in the input folder (if used)
example: quickbms -F "{}.dat" script.bms input_folder output_folder
-o overwrite the output files without confirmation if they already exist
-k keep the current files if already exist without asking (skip all)
-K automatically rename the output files if duplicates already exist
-r experimental reimport option that should work with many archives:
quickbms script.bms archive.pak output_folder
modify the needed files in output_folder and maybe remove the others
quickbms -w -r script.bms archive.pak output_folder
you MUST read section 3 of quickbms.txt before using this feature,
use -r -r for the alternative and better REIMPORT2 mode
use -r -r -r for REIMPORT3 that shrinks/enlarges archive if no offset
-u check if there is a new version of QuickBMS available
-i generate an ISO9660 file instead of extracting every file, the name of
the ISO image will be the name of the input file or folder
-z exactly as above but it creates a ZIP file instead of an ISO image
Advanced options:
-d automatically create an additional output folder with the name of the
input folder and file processed, eg. models/mychar/mychar.arc/*,
-d works also if input and output folders are the same (rename folder)
-D like -d but without the folder with the filename, eg. models/mychar/*
-E automatically reverse the endianess of any input file by simply reading
each field and writing the reversed value, each Get produces a Put
-c quick list of basic BMS commands and some notes about this tool
-S CMD execute the command CMD on each file extracted, you must specify the
#INPUT# placeholder which will be replaced by the name of the file
example: -S "lame.exe -b 192 -t --quiet #INPUT#"
-Y automatically answer yes to any question
-O F redirect the concatenated extracted files to output file F, data is
appended if file F exists, optional F extensions supported: TAR
-s SF add a script file or command before the execution of the input script,
useful if an archive uses a different endianess or encryption and so on
SF can be a script or directly the bms instruction you want to execute
-. don't terminate QuickBMS if there is an error while parsing multiple
files (like wrong compression or small file), just continue with the
other files in the folder; useful also in rare cases in reimport mode
Debug and experimental options:
-v verbose debug script information, useful for verifying possible errors
-V alternative verbose info, useful for programmers and formats debugging
-q quiet, no *log information
-Q very quiet, no information displayed except the Print command
-L F dump the offset, size and name of the extracted files into the file F
-x use the hexadecimal notation in myitoa (debug)
-0 no extraction of files, useful for testing a script without using space
-R needed for programs that act as interface for QuickBMS and in batch
-a S pass arguments to the input script that will take the names
quickbms_arg1, quickbms_arg2, quickbms_arg3 and so on, note they are
handled as arguments so pay attention to spaces and commas, eg:
-a "arg1 \"arg 2\", arg3"
-a arg1 -a "\"arg 2\"" -a arg3
a full backup of the whole -a options is on the var quickbms_arg
-H experimental HTML hex viewer output, use it only with very small files!
-X experimental hex viewer output on the console (support Less-like keys)
-9 toggle XDBG_ALLOC_ACTIVE (enabled)
-8 toggle XDBG_ALLOC_INDEX (enabled)
-7 toggle XDBG_ALLOC_VERBOSE (disabled)
-6 toggle XDBG_HEAPVALIDATE (disabled)
-3 execute an INT3 before each CallDll, compression and encryption
-I toggle variable names case sensitivity (default insensitive)
-M F experimental compare and merge feature that allows to compare the
extracted files with those located in the folder F, currently this
experimental option will create files of 0 bytes if they are not
different, so it's not simple to identify what files were written
-Z input file cleaner, in reimport mode replaces all archived files with
zeroes, no matter if they exist or not in the folder, will be all zeroed
-P CP set the codepage to use (default utf8), it can be a number or string
-T do not delete the TEMPORARY_FILE at the end of the process
-N decimal names for files without a name: 0.dat instead of 00000000.dat
-e ignore the compression errors and dump the (wrong) output data anyway,
in reimport2 it disables the compression of the files (experimental)
-J all the constant strings are considered Java/C escaped strings (cstring)
-B debug option dumping all the non-parsed content of the open files, the
data will be saved in the output folder as QUICKBMS_DEBUG_FILE*
-W P experimental web API (P is the port) and pipe/mailslot IPC interface
-t N experimental tree-view of the extracted/listed files where N is:
0:text1, 1:text2, 2:text3, 3:json1, 4:json2, 5:web, 6:dos, 7:ls
-U [S] list of available compression algorithms, use S for searching names
-# in reimport mode checks if the archived files and those to reimport are
the same (hash), it's useful if you didn't remove the unmodified files
-j force UTF16 output in some functions, for example with SLog
-b C use C (char or hex) as filler in reimporting if the new file is smaller,
by default it's used space in SLog and 0 for Log and CLog
-y F experimental debug output to file F, supported formats on file extension
json, csv, yaml, c/java and so on
Features and security activation options:
-w enable the write mode required to write physical input files with Put*
-C enable the usage of CallDll without asking permission
-n enable the usage of network sockets
-p enable the usage of processes
-A enable the usage of audio device
-g enable the usage of video graphic device
-m enable the usage of Windows messages
-G force the GUI mode on Windows, it's automatically enabled if you
double-click on the QuickBMS executable
Remember that the script and the input archive/folder are always
REQUIRED and they must be specified at the end of the command-line.
The following examples list the files of the input archive that
match the given filters, starting with all the mp3 files:
quickbms -l -f "{}.mp3" zip.bms myfile.zip
quickbms -l -f "{}.mp3;{}.ogg" zip.bms myfile.zip
quickbms -l -f "{}.mp3;{}.ogg,{}filename{}" zip.bms myfile.zip
quickbms -l -f file_containing_the_filters.txt zip.bms myfile.zip
(file_containing_the_filters.txt has one filter per each line)
So -l lists the files without extracting them, and -f filters the
archived files. Regarding the -f and -F options, it's worth noting
that both * and {} are accepted as wildcards because the first
pattern may be interpreted by the Windows console (my suggestion is
to always use {} to avoid problems).
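To make the filter rules above more concrete, here is a small
Python sketch of how such a filter string could be evaluated. It is
NOT the code used by QuickBMS: the function names are hypothetical,
and the matching details (case insensitive comparison, exclusion
filters winning over inclusion ones, fnmatch-style wildcards) are
assumptions made only for this illustration.

```python
from fnmatch import fnmatchcase


def split_filters(spec: str) -> list[str]:
    """Split a -f style argument on commas/semicolons and map {} to *."""
    parts = spec.replace(";", ",").split(",")
    return [p.strip().replace("{}", "*") for p in parts if p.strip()]


def is_selected(name: str, spec: str) -> bool:
    """Return True if name passes the filters (assumed semantics)."""
    filters = [f.lower() for f in split_filters(spec)]
    include = [f for f in filters if not f.startswith("!")]
    exclude = [f[1:] for f in filters if f.startswith("!")]
    lowered = name.lower()
    # Assumption: a filter starting with ! always excludes the file.
    if any(fnmatchcase(lowered, f) for f in exclude):
        return False
    # Assumption: with no inclusion filters, everything else is kept.
    return not include or any(fnmatchcase(lowered, f) for f in include)


print(split_filters("{}.mp3,{}.txt;{}myname{}"))
print(is_selected("music/track01.mp3", "{}.mp3;!{}.ogg"))
print(is_selected("music/track01.ogg", "{}.mp3;!{}.ogg"))
```

The sketch only shows why {} and * are interchangeable and how an
exclusion filter may combine with the others; check the real
behaviour with the -l option before relying on it.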
QuickBMS also supports a folder as input, which means that with a
single command it's possible to unpack all the archives of a whole
game directly using QuickBMS.
Imagine using the zip.bms script with all the zip files located in
the Program Files folder:
quickbms -F "{}.zip" zip.bms "c:\Program Files (x86)" c:\outfolder
Note: as said before, sometimes Windows doesn't like the * char
even if used between quotes, so in case of problems with
"*.zip" you can use {} instead of *, for example "{}.zip"
Except for the -l, -f, -F and maybe -o and -s options, the others
are intended for debugging, or they are special features or
switches to enable/disable some internals, so they should be
ignored by common users.
If output_folder is omitted, the current directory is used.
From version 0.9.1, if output_folder is "", the same directory as
the input file (or as each file in case of an input folder) is used.
If the extraction with a particular script is too slow, or scanning
a folder takes too much memory and time, try using the -9 option
that disables the memory protection.
You can apply these options directly in a shortcut to quickbms.exe
in the Target field of its properties, so you can use the
double-click "GUI" method and all the command-line options you
desire without using the command-line.
In the quickbms.zip package you can also find quickbms_4gb_files.exe
(previously known as quickbms64_test.exe), which is an "experimental"
version that uses 64bit numbers instead of the original 32 bits:
- it supports archives and files bigger than 4 gigabytes
- it may not work correctly with "some" scripts
- it's a native 32bit program so it works on both 32bit and 64bit
Windows
- it's experimental and only partially supported, problems like
crashes and incorrect math operations may happen often in some
scripts
-------------------------------------------------------------------
Advanced users may also find these specific options useful:
-d Automatically creates a folder with the name of the input file
in which all the files are placed. It's useful if you have many
small archives containing the same filenames and need to separate
the extracted files without overwriting or renaming them.
-E If you have a bms script that simply reads a file format, you
can change the endianness of all its numeric fields on the fly by
simply using this option.
For example, if you have a "get SIZE long", a 32bit number will be
read as usual and additionally it will be reversed (0x11223344
to 0x44332211 or vice versa) and written back at the same location.
Remember that you also need to specify the -w option with
physical files. Alternatively, you can load the whole file into a
memory file and then dump it, so that -w is not necessary.
With this option it is really trivial to convert the endianness of
files between different platforms, like Xbox 360 and PC.
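The effect of -E on a single 32bit field can be sketched in Python
as follows. This is only an illustration of the byte reversal
described above, not the code of QuickBMS, and the function name
and the sample data are hypothetical.

```python
import struct


def reverse_u32_fields(data: bytes, offsets: list[int]) -> bytes:
    """Return a copy of data with the 32bit field at each offset reversed."""
    out = bytearray(data)
    for off in offsets:
        (value,) = struct.unpack_from("<I", out, off)  # read as little endian
        struct.pack_into(">I", out, off, value)        # write back as big endian
    return bytes(out)


# A little endian file whose first field holds the value 0x11223344.
original = bytes.fromhex("44332211") + b"DATA"
swapped = reverse_u32_fields(original, [0])

# Read again as little endian, the field is now the reversed number.
print(hex(struct.unpack_from("<I", swapped, 0)[0]))
```

Applying the same function twice restores the original bytes, which
matches the "or vice versa" note: the same operation converts in
both directions.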
###################################################################
==================================
3) Reimporting the extracted files
==================================
QuickBMS is mainly an extraction tool, but it also supports the -r
option that turns the tool into a simple reimporter/reinjector, so
it may be useful for modding or translating a game.
The idea is to be able to reimport ("inject back") the modified
files into the original archives without editing the script, just
reusing the same bms scripts that already exist!
-------------------------------------------------------------------
Using this feature is really trivial and the following is a
step-by-step example:
- Make a backup copy of the original archive!
- Extract the files, or only those you want to modify (-f option),
as you do normally via the GUI (double-click on quickbms.exe) OR
via command-line like the following example:
quickbms script.bms archive.pak output_folder
- Modify the extracted files leaving their size unchanged or
smaller than before.
I suggest deleting the files that have not been modified so that
the reimporting process will be faster and safer. Leave in the
folder only the files you modified.
Remember that their size must be smaller than or equal to the
original!
- Reimport the files into the archive via the GUI by clicking on
the file called "reimport.bat" OR via command-line:
quickbms -w -r script.bms archive.pak output_folder
- Test the game with the modified archive
Remember that you can use the GUI for the reimporting procedure,
just click on "reimport.bat" found in the quickbms package, it
contains the command: quickbms.exe -G -w -r.
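The size rule of the steps above follows from the file being
written back at its original position. The following Python sketch
shows the idea under simplified assumptions: the archive is fully
loaded in memory, the offset and size are already known, and the
unused space is padded with a filler byte (as described for the -b
option). The function name is hypothetical and this is not the
actual implementation of QuickBMS.

```python
def reimport_in_place(archive: bytearray, offset: int, original_size: int,
                      new_data: bytes, filler: int = 0) -> None:
    """Overwrite one archived file in place, padding it if it is smaller."""
    if len(new_data) > original_size:
        # A bigger file would overwrite whatever follows it in the archive.
        raise ValueError("new file is bigger than the original")
    padding = bytes([filler]) * (original_size - len(new_data))
    # Same length on both sides, so the archive size never changes.
    archive[offset:offset + original_size] = new_data + padding


archive = bytearray(b"HEAD" + b"AAAAAAAA" + b"TAIL")
reimport_in_place(archive, 4, 8, b"hello")
print(bytes(archive))
```

The data before and after the replaced file stays untouched, which
is why no other field of the archive needs to be rewritten in this
mode.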
IMPORTANT NOTE ABOUT "REIMPORT2" MODE
From version 0.8.2 QuickBMS started to implement an additional
alternative reimport mode enabled by using -r twice like:
quickbms -w -r -r script.bms archive.pak output_folder
or
reimport2.bat
This mode can be used with many formats and offers the following
advantages:
- no size limits with the imported files, the bigger files will be
inserted (appended) at the end of the archive
- the fields "offset", "size" and "compressed size" are rewritten
by matching the new imported file, that's useful with various
size-dependent compression algorithms like lz4
The reimport2 method doesn't work if:
- the TOC is compressed or located on a MEMORY_FILE
- the TOC/magic is (relatively) located at the end of the archive
- the content is sequential, so there is no offset
- the 3 fields mentioned above are very different from those
originally read from the TOC. In this mode at most one "math"
operation is allowed on the variable, which means that the
following example works:
get OFFSET long ; math OFFSET * 0x800 ; log NAME OFFSET SIZE
while this example produces an incorrect OFFSET field:
get OFFSET long ; math OFFSET * 0x800 ; math OFFSET + BASE_OFF ; log NAME OFFSET SIZE
the same is valid for the size fields too, anyway note that
"offset" is rewritten only if the new file is bigger than before
- the game strictly trusts the original size of the archive and
ignores data appended to it, for example some archives may have a
field in the TOC that specifies the size of the archive
- SLog is implemented but may not work with some archives
- the archive is subject to the other limits described below,
except those removed by the advantages listed before
From version 0.10.0 QuickBMS has an additional mode called
REIMPORT3. It's identical to REIMPORT2, with the only difference
that the archive is shrunk or enlarged if no offset field is used
in the archive and the size of the input file differs from the
original.
This method "may" be useful with some language files and some
archives with sequential data.
-------------------------------------------------------------------
Another example:
- First step, use QuickBMS as usual:
archive.pak -> file1.txt
-> file2.dat
-> file3.jpg
- Second step:
- delete file1.txt and file2.dat
- modify file3.jpg, for example adding a "smile" in it
- save file3.jpg and be sure that its size is SMALLER than or EQUAL
to the original
- Third step, click on the reimport.bat file provided in quickbms
and select the SAME file and output folder you selected in the
first step:
archive.pak <- file1.txt (doesn't exist so it's not reimported)
<- file2.dat (doesn't exist so it's not reimported)
<- file3.jpg (successfully reimported)
-------------------------------------------------------------------
Some important notes about this particular reimporting process:
- you CANNOT increase the size of the files you want to reimport,
so the new files must be smaller than or equal to the original
ones.
- the reimport process of compressed files may be very slow in some
cases, for example with zlib, deflate, lzma and a few others that
are optimized to use as little space as possible at the cost of
time.
zlib/deflate is particularly slow because QuickBMS uses different
solutions to reduce the size as much as possible.
- for the maximum compatibility with the thousands of available
file formats I decided not to use tricks for modifying the
original size and compressed_size values.
For example, imagine those formats that use encrypted information
tables or MEMORY_FILEs for such tables, or that use things like
"math SIZE *= 0x800".
The reimport process must be generic, universal and without
work-arounds.
- the script is just the same for both extraction and reimporting,
it means that many of the scripts written by me and the other
users already work, cool!
- the reimporting of compressed files is perfectly possible because
the tool automatically switches to the corresponding compression
algorithm if available (for example deflate -> deflate_compress).
If an algorithm is not available in recompress mode then the
reimporting will fail
- SLog is a new command that has been recently added to QuickBMS
for dumping strings and texts. It also works in reimport mode but
it's very limited and prone to errors. I suggest checking the
manual for the SLog command (search slog in this text), but a
generic universal rule is:
- keep the length of the edited line of text as the original
? if the original archive uses complex encryptions that require
the usage of MEMORY_FILEs to perform temporary decryption, then
it's NOT supported and the same is valid for chunked content
(like those scripts that use the command Append)
From version 0.6.6, QuickBMS has an experimental mode for
reimporting chunked files, it works very well with files saved
directly to disk and less well with those that use MEMORY_FILEs
(most of my scripts).
In my opinion this feature is great but don't expect much, with
some scripts you can have success but many others may not work.
- FileXor, FileRot, Encryption and Filecrypt should work correctly
- things like CRCs and hashes can't be supported
- it's also possible to reimport the nameless files dumped with
'log "" OFFSET SIZE', the tool will automatically check for files
in the folder with the same number so if the file was saved as
00000014.xml it will be reimported perfectly.
- the reimport mode doesn't work if you renamed the files with the
same name during the extraction (for example using the 'r'
choice). In this case there is no way for the tool to know the
correct file to reimport, so it will reimport only the one with
the same original name.
- the -Z option is a simple way to zero ALL the space of the
archive occupied by the original files, the result will be a sort
of "empty" archive. It "may" be useful for releasing the empty
archive and the files separately and then reinjecting them in
reimport mode, with the option of leaving out some unused files.
Example:
- quickbms script.bms archive.ar output_folder
- quickbms -r -w -Z script.bms archive.ar output_folder
(the content of output_folder is completely ignored in -Z)
- remove videos from output_folder
- compress archive.ar and output_folder, give them to a friend
- quickbms -r -w script.bms archive.ar output_folder
- now archive.ar contains all the files except the videos
The behaviour of this feature may change in future depending on
the feedback of the users, currently there is no real usage.
Please note that games are often able to load the extracted
files directly from their installation folder, sometimes
just by removing the original archive and other times by
launching the game with specific command-line arguments.
The reimport feature of QuickBMS has already made it possible to
slightly mod and translate various games, but it's meant as a quick
or temporary solution until a proper stand-alone rebuilder tool is
written by the community of the target game, because a complete and
specific solution offers better benefits.
But if nobody is going to write a stand-alone rebuilder for a
specific game, then the reimport feature of QuickBMS is a great and
immediately available solution.
###################################################################
===============================================
4) How to create scripts (for developers only!)
===============================================
Originally the tool was created just for myself, to be able to
write quick extractors for simple archives immediately without
writing a new tool, but QuickBMS turned out to be a powerful tool
that I use for many tasks, including the parsing of some protocols
and much more.
So, how to write these scripts?
Taking a look at http://wiki.xentax.com/index.php/BMS is a good
first step to understand at least the basics of this language,
originally written by Mike Zuurman (alias Mr.Mouse of XeNTaX) back
in 1997.
Then it's good to take a look at the various examples provided on
http://quickbms.com and http://zenhax.com
Programming knowledge and background are not required, but they are
very useful for understanding the "logic" of the scripts and some
terms.
What is really necessary is full knowledge of the format to
implement: reverse engineering is always useful for figuring out
the needed fields.
Luckily, in the extraction process you don't need to know all the
fields of an archive, so a field like a CRC doesn't matter, while
the important fields to extract a file are always the following:
- filename
- offset
- size
- optional compressed size if the file is compressed
If you don't have the filename and size, it's not a problem. What's
really necessary is knowing at least the offsets of the files.
If you check my scripts you will notice the name DUMMY assigned to
the fields that are not useful for the extraction.
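To show why only these few fields matter, here is a Python sketch
that reads the table of a completely HYPOTHETICAL archive format
invented for this example: a 32bit number of files followed by
entries made of a 16-byte name, a 32bit offset, a 32bit size and a
32bit field that is not needed (the DUMMY). All the names and the
layout are assumptions and do not describe any real format.

```python
import struct

ENTRY = "<16sIII"  # name, offset, size, dummy (hypothetical layout)


def read_toc(archive: bytes) -> list[tuple[str, int, int]]:
    """Return (name, offset, size) for each entry, skipping the dummy field."""
    (count,) = struct.unpack_from("<I", archive, 0)
    entries = []
    pos = 4
    for _ in range(count):
        raw_name, offset, size, _dummy = struct.unpack_from(ENTRY, archive, pos)
        name = raw_name.split(b"\0", 1)[0].decode("ascii")
        entries.append((name, offset, size))
        pos += struct.calcsize(ENTRY)
    return entries


def extract(archive: bytes, offset: int, size: int) -> bytes:
    """The extraction itself needs nothing more than offset and size."""
    return archive[offset:offset + size]


# Build a tiny sample archive: 4 bytes of count + one 28-byte entry + data.
payload = b"hello world"
archive = (struct.pack("<I", 1)
           + struct.pack(ENTRY, b"readme.txt", 32, len(payload), 0xDEADBEEF)
           + payload)

for name, offset, size in read_toc(archive):
    print(name, offset, size, extract(archive, offset, size))
```

The fourth field is read only to move to the next entry and is then
ignored, which is the same role the DUMMY variables play in the
scripts.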
Note that I will try to keep the following documentation updated as
much as I can, and also in sync with what happens inside QuickBMS
for each command.
The source code of the tool is not easy to understand so I hope
that this documentation may be useful and complete.
The fields between [] are optional fields.
---
A quick and limited list of commands is shown when QuickBMS is
launched with the -c option.
Some important notes about the QuickBMS environment:
- Everything is handled as a variable except if it starts with a
number, in which case it's considered a numeric constant. So when
in this document I talk about VAR, STRING and other types of data
I ALWAYS refer to both variables and constants because they are
EXACTLY the SAME thing inside the tool.
- All the commands and the names of the variables are case
INsensitive, "get OFFSET long" is the same as "GeT oFfSeT lOnG".
- Everything works with signed 32 bit numbers (-2147483648 to
2147483647) so QuickBMS may not work well with files over 2 Gb
but it can seek on files of 4 Gb without problems.
Consider the following limits:
- max 4gb size for archives
- max 2gb size for the archived files
Try quickbms_4gb_files.exe when working with bigger archives.
- The constant strings depend on the context of the command: in
some commands they are handled as strings in C notation
like "\x12\x34\\hello\"bye\0", in this case you must know how
this representation works.
This is a solution for using binary data in the textual script.
The keyword is "C language escape characters" or escape
sequences (or cstring), they are very simple, take a look here:
https://docs.microsoft.com/en-us/cpp/c-language/escape-sequences
From http://www.acm.uiuc.edu/webmonkeys/book/c_guide/1.1.html
Escape  Name / Meaning
\a      Alert
\b      Backspace
\f      Form Feed
\n      New Line
\r      Carriage Return
\t      Horizontal Tab
\v      Vertical Tab
\'      Produces a single quote
\"      Produces a double quote
\?      Produces a question mark
\\      Produces a single backslash
\0      Produces a null character
\ddd    Defines one character by the octal digits (base-8)
\xdd    Defines one character by the hexadecimal digit (base-16)
ONLY some commands support this C string notation for the escape
characters, a quick way to find them is searching the keyword
"(cstring)" without quotes in this document.
Since version 0.8.2 there is the -J option, which treats all the
constant strings as escaped Java and C-like strings, so every
string is a cstring when you use this option.
- Both decimal and hexadecimal numbers are supported, the latter is
used if the number starts with 0x so 1234 and 0x4d2 are the same.
- Any operation made on fields bigger than 8 bits is controlled by
the global endianness: any number and unicode field is read in
little endian by default, otherwise the endianness specified with
the Endian command is used.
- Comments can be used in C (// and /* */) and BMS syntax (#), for
example:
get DUMMY long # this is a comment
/*
this is a comment
*/
- The FILENUM (file number) field in the commands is set as a
constant, it means that it cannot be modified at runtime using a
variable, examples:
get TMP string 0 # ok
get TMP string VAR # wrong
- All the commands use variables for their arguments except those
in which it's specified that a constant number or a string
(STRING) is needed.
For example the commands that use a C string (cstring) use
constant strings and not variables, except some cases like the
dictionary of ComType.
Note that this behaviour may change in the future or may have been
already changed in some commands.
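As an illustration of the C string notation (cstring) described in
the notes above, the following Python sketch decodes the escape
sequences of the table into raw bytes.
It is NOT the code used by QuickBMS: the name decode_cstring is
hypothetical, and reading at most two hexadecimal digits after \x
and at most three octal digits is an assumption based on the \xdd
and \ddd rows of the table.

```python
import string

# Single-character escapes from the table, mapped to their byte values.
SIMPLE_ESCAPES = {
    "a": 0x07, "b": 0x08, "f": 0x0C, "n": 0x0A, "r": 0x0D,
    "t": 0x09, "v": 0x0B, "'": 0x27, '"': 0x22, "?": 0x3F, "\\": 0x5C,
}


def decode_cstring(text):
    """Decode C-style escape sequences in text and return the raw bytes."""
    out = bytearray()
    i = 0
    while i < len(text):
        ch = text[i]
        if ch != "\\" or i + 1 == len(text):
            out += ch.encode("utf-8")      # ordinary character, copied as-is
            i += 1
            continue
        nxt = text[i + 1]
        if nxt in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[nxt])
            i += 2
        elif nxt == "x":
            # assumption: at most two hexadecimal digits (\xdd)
            j = i + 2
            while j < len(text) and j < i + 4 and text[j] in string.hexdigits:
                j += 1
            if j == i + 2:                 # "\x" without digits: keep the 'x'
                out += b"x"
            else:
                out.append(int(text[i + 2:j], 16))
            i = j
        elif nxt in "01234567":
            # assumption: at most three octal digits (\ddd), so \0 is a NUL
            j = i + 1
            while j < len(text) and j < i + 4 and text[j] in "01234567":
                j += 1
            out.append(int(text[i + 1:j], 8) & 0xFF)
            i = j
        else:
            out += nxt.encode("utf-8")     # unknown escape: keep the character
            i += 2
    return bytes(out)
```

With this sketch, decode_cstring(r'\x12\x34\\hello\"bye\0') returns
the 13 bytes 12 34 5c 68 65 6c 6c 6f 22 62 79 65 00, which is the
binary data represented by the example string of the notes.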
File numbers:
Every file opened in QuickBMS has a number assigned to it, if
this number is not specified it will be considered 0, the main
input file.
The first opened file is the input archive, which is assigned
the number 0 (zero); the others must be opened with the Open command.
Negative numbers are considered MEMORY_FILEs, so -1 is
MEMORY_FILE, -2 MEMORY_FILE2 and so on.
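The numbering described above can be summarized with a tiny Python
sketch. It only restates the rules of this paragraph, it's not taken
from the source code of the tool and the name describe_filenum is
hypothetical.

```python
def describe_filenum(filenum):
    """Return a short description of the file a FILENUM value refers to."""
    if filenum == 0:
        return "input archive"
    if filenum > 0:
        return "file %d (opened with the Open command)" % filenum
    index = -filenum                       # -1 -> 1, -2 -> 2 and so on
    return "MEMORY_FILE" if index == 1 else "MEMORY_FILE%d" % index
```

So describe_filenum(-1) gives "MEMORY_FILE" and describe_filenum(-3)
gives "MEMORY_FILE3".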
MEMORY_FILEs:
This is a particular type of temporary file which resides in
memory and works exactly like a normal temporary file.
It's extremely useful for doing many operations and you can use
multiple memory files: MEMORY_FILE, MEMORY_FILE2, MEMORY_FILE3
and so on.
MEMORY_FILE and MEMORY_FILE1 are the same file.
.
If you need to work with chunked parts of a file to concatenate
to the memory file, you need to use the following trick:
.
putvarchr MEMORY_FILE FINAL_SIZE 0 # allocate memory
log MEMORY_FILE 0 0 # create the file
.
The first instruction allocates the memory for containing the
final size of your chunks, and the second one is necessary for
resetting the memory file (current offset and size, not the
allocated size).
If you need to create a MEMORY_FILE of 0x100 bytes set to zero to
use in CallDLL, use the following:
.
log MEMORY_FILE 0 0 # create the file
putvarchr MEMORY_FILE 0x100 0 # write 0x100+1 zeroes
TEMPORARY_FILE:
This additional file called TEMPORARY_FILE resides physically on
the target folder and has that exact name.
Despite its "temporary" name, it's not deleted automatically from
the output folder; QuickBMS will ask to remove it at the end of the
extraction.
The file is created in any condition, even when the -l (list)
option is used for listing the files, so it's perfect in certain
situations like when a chunk-based file system is used.
The difference with the MEMORY_FILE is only related to the amount
of memory available on the system because the memory files use
the RAM while this one uses the disk, so use it if you need
to create a temporary file bigger than 2 gigabytes.
.
For using the temporary file check this example:
.
log TEMPORARY_FILE 0 0 # reset it if it already exists
append # enables the append mode
...
log TEMPORARY_FILE OFFSET SIZE
...
append # disables the append mode
open "." TEMPORARY_FILE 1 # open temporary file as file 1
.
Note that from version 0.6.8, QuickBMS automatically overwrites
this file if it already exists.
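The example above joins several chunks of the input archive into one
file on disk. Just to clarify the idea outside of QuickBMS, the same
operation can be sketched in Python; the function concat_chunks, its
arguments and the list of (offset, size) pairs are hypothetical names
chosen for this example and not something provided by the tool.

```python
def concat_chunks(src_path, chunks, dst_path):
    """Append each (offset, size) chunk of src_path to dst_path.

    Returns the total number of bytes written.
    """
    total = 0
    # "wb" truncates an existing file, like the initial reset of the example
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for offset, size in chunks:
            src.seek(offset)               # go to the beginning of the chunk
            data = src.read(size)
            dst.write(data)                # chunks are written one after the other
            total += len(data)
    return total
```

The chunks are stored in the order in which they are listed, so the
resulting file can then be opened and parsed like any other file.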
The following is the list of types of variables supported, also
known as datatypes or types.
The list is ordered just like in defs.h:
BYTE             8 bit, 0 to 0xff
SIGNED_BYTE      0x99 is read as 0xffffff99
SHORT            16 bit (aka INT), 0 to 0xffff
SIGNED_SHORT     0x9999 is read as 0xffff9999
THREEBYTE        24 bit, 0 to 0xffffff
SIGNED_THREEBYTE
LONG             32 bit, 0 to 0xffffffff
SIGNED_LONG      mainly useful in quickbms_4gb_files:
                 0x99999999 is read as 0xffffffff99999999
LONGLONG         fake 64 bit, so only 0 to 0xffffffff but Get takes 8 bytes
FLOAT            32 bit, 123.345 is read as 123
                 From QuickBMS 0.10.1 floats (and doubles) are partially
                 handled in Get, Put, Math and Print commands.
DOUBLE           64 bit, 123.345 is read as 123
LONGDOUBLE       96 bit, 123.345 is read as 123
                 Note that the size of long double is compiler dependent
STRING           NUL delimited string (one byte for each char)
UNICODE          special type used for unicode utf16 strings, the
                 endianness of the utf16 is the same used globally in the
                 script (watch the Endian command), it's used also for
                 converting a unicode string to an ascii one:
                   Set ASCII_STRING UNICODE UNICODE_STRING
                 unicode conversion is performed via Win32 API (CP_UTF8
                 and CP_ACP in case of 0xfffd chars) while on Linux it
                 uses iconv, fallback on mbtowc and byte=short
UTF32            experimental support for 32bit unicode (unicode32)
BINARY           special type used for binary strings in C notation like
                 "\xff\x00\x12\x34", used mainly as a constant (cstring)
LINE             special type used for carriage return/line feed delimited
                 string (so any string ending with a 0x00, 0x0a or 0x0d),
                 from version 0.6 the tool supports also strings that
                 have no delimiter at the end of file
ASIZE            special type used to return the size of the opened file,
                 used only with the GET command
FILENAME         special type used to return the name of the opened file
                 like "myfile.zip", used only with the GET command
BASENAME         special type used to return the base name of the opened
                 file like "myfile", used only with the GET command
FILEPATH         the folder of the file, like "c:\path\folder" for
                 "c:\path\folder\file.txt"
FULLBASENAME     just like FULLNAME without extension
EXTENSION        special type used to return the extension of the opened
                 file like "zip", used only with the GET command
FULLNAME         full path of the file, in reality at the moment it returns
                 the same path used in the input filename
CURRENT_FOLDER   the path from which QuickBMS has been launched
FILE_FOLDER      the path of the loaded input file
OUTPUT_FOLDER    the extraction folder (the last argument of QuickBMS)
INPUT_FOLDER     same as FILE_FOLDER
BMS_FOLDER       the folder where the bms script is located
EXE_FOLDER       the folder where quickbms.exe is located
ALLOC            a type used only in the Set command for creating a variable
                 with a specific allocated size
COMPRESSED       a special type used for setting big strings and memory
                 files using a small amount of text, for using this type
                 you must take the original text/file, compress it with
                 zlib (you can use my packzip tool), then encode the
                 output file with base64 (you can use my bde64 tool) and
                 place the result like the following:
                   set MEMORY_FILE compressed eNrtwbEJACAMBMBecIfvnMUxPuEJAe0UHN81LLzrbYKwDOjI96IN1cLveRfAGqYu
                 this type is very useful if you want to embed a dll inside
                 a script without wasting much space
                 You can create this variable using the following script:
                 http://aluigi.org/bms/file_compressed_var.bms
VARIABLE         read byte per byte till the byte is negative
VARIABLE2        Unreal engine index numbers
VARIABLE3        used in various software
VARIABLE4        used in Battlefield 3 (Frostbite engine) and Rar
VARIABLE5        used in 7z archives
VARIABLE6        requires a ValueMax variable
VARIABLE7        similar to VARIABLE2
UNKNOWN          use it to ask the user to insert the content of the variable
VARIANT          VB/C++ variant type (http://en.wikipedia.org/wiki/Variant_type)
BITS             read a specific amount of bits, QuickBMS and the
                 language are byte based but the "bits" method works
                 very well
TIME             time_t Unix 32bit time
TIME64           64bit time used as FILETIME on Windows
CLSID            ClassID like 00000000-0000-0001-0000-000000000000
IPV4             7f 00 00 01 = "127.0.0.1"
IPV6             like 2001:0db8:85a3:0000:0000:8a2e:0370:7334
ASM              x86 assembly
ASM64            x86_64 assembly
ASM16            x86 16bit assembly
ASM_???          arm, arm_thumb, arm64, mips, mips64, ppc,
                 ppc64, sparc, sysz, xcore
TCC              a special type that compiles C text
???              the user will be asked to input a string that
                 will be the new value of the variable, prompt
                 from user / pause
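The behaviour of the signed types listed above (0x99 read as
0xffffff99 and so on) and the effect of the endianness can be
reproduced with a few lines of Python. This is only a sketch of the
arithmetic and not the code of QuickBMS: the name read_field and its
parameters are hypothetical, and using width=64 to imitate
quickbms_4gb_files is an assumption.

```python
def read_field(data, size, signed=False, big_endian=False, width=32):
    """Decode the first size bytes of data as a number.

    The result is returned as an unsigned width-bit pattern, so a
    negative value shows up with its upper bits set to 1.
    """
    if len(data) < size:
        raise ValueError("not enough bytes")
    order = "big" if big_endian else "little"   # little endian is the default
    value = int.from_bytes(data[:size], order, signed=signed)
    return value & ((1 << width) - 1)           # negative values wrap around
```

Some examples: read_field(b"\x99", 1, signed=True) is 0xffffff99
while the unsigned read gives 0x99, and the two bytes 34 12 give
0x1234 in little endian and 0x3412 with big_endian=True.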
Just for the record, the original MexScript probably contained some
types of variables that have never been used and for which it's
unknown what they should represent: PURETEXT, PURENUMBER,
TEXTORNUMBER and FILENUMBER.
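Regarding the COMPRESSED type of the list above, the two steps
(zlib compression and base64 encoding) can also be performed with
the standard library of Python instead of the tools named there.
This is a sketch under assumptions: the function names are
hypothetical, and I assume that a standard zlib stream and the
standard base64 alphabet are what the description of the type means.

```python
import base64
import zlib


def make_compressed_value(data):
    """Compress data with zlib and return the result as base64 text."""
    return base64.b64encode(zlib.compress(data, 9)).decode("ascii")


def read_compressed_value(text):
    """Reverse operation, useful for checking the generated text."""
    return zlib.decompress(base64.b64decode(text))
```

The text returned by make_compressed_value is what would be placed
after "set MEMORY_FILE compressed" in the script; checking it with
read_compressed_value before pasting it avoids copy errors.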
QuickBMS also supports "experimental" multidimensional arrays
inside the variables, for example:
for i = 0 < 10
    get VAR[i] long
    for j = 0 < 5
        get VAR2[i][j] long
    next j
next i
But it's possible to access that variable ONLY by specifying the
original name and index, so:
print "%VAR[0]%" # fail!
print "%VAR[j]%" # fail!
math i = 0
print "%VAR[i]%" # ok
QuickBMS also supports embedded text like the following:
Set VAR string "
this is
a text with \"blah\" and 'blah'
and so on.
"
The following is the list of bms commands:
QuickBMSver VERSION
FindLoc VAR TYPE STRING [FILENUM] [ERR_VALUE] [END_OFF]
For [VAR] [OP] [VALUE] [COND] [VAR]
Next [VAR] [OP] [VALUE]
Get VAR TYPE [FILENUM] [OFFSET]
GetDString VAR LENGTH [FILENUM]
GoTo OFFSET [FILENUM] [TYPE]
IDString [FILENUM] STRING
Log NAME OFFSET SIZE [FILENUM] [XSIZE]
Clog NAME OFFSET ZSIZE SIZE [FILENUM] [XSIZE]
Math VAR OP VAR
XMath VAR INSTR
Open FOLDER NAME [FILENUM] [EXISTS]
SavePos VAR [FILENUM]
Set VAR [TYPE] VAR
Do
While VAR COND VAR
String VAR OP VAR
CleanExit
If VAR COND VAR [...]
[Elif VAR COND VAR]
[Else]
EndIf
GetCT VAR TYPE CHAR [FILENUM]
ComType ALGO [DICT] [DICT_SIZE]
ReverseShort VAR [ENDIAN]
ReverseLong VAR [ENDIAN]
ReverseLongLong VAR [ENDIAN]
Endian TYPE [VAR]
FileXOR SEQ [OFFSET] [FILENUM]
FileRot SEQ [OFFSET] [FILENUM]
FileCrypt SEQ [OFFSET] [FILENUM]
Strlen VAR VAR [SIZE]
GetVarChr VAR VAR OFFSET [TYPE]
PutVarChr VAR OFFSET VAR [TYPE]
Debug [MODE]
Padding VAR [FILENUM] [BASE_OFF]
Append [DIRECTION]
Encryption ALGO KEY [IVEC] [MODE] [KEYLEN]
Print MESSAGE
GetArray VAR ARRAY VAR_IDX
PutArray ARRAY VAR_IDX VAR
SortArray ARRAY [ALL]
SearchArray VAR ARRAY VAR
CallFunction NAME [KEEP_VAR] [ARG1] [ARG2] ... [ARGn]
StartFunction NAME
EndFunction
ScanDir PATH NAME SIZE [FILTER]
CallDLL DLLNAME FUNC/OFF CONV RET [ARG1] [ARG2] ... [ARGn]
Put VAR TYPE [FILENUM]
PutDString VAR LENGTH [FILENUM]
PutCT VAR TYPE CHAR [FILENUM]
GetBits VAR BITS [FILENUM]
PutBits VAR BITS [FILENUM]
Include FILENAME
NameCRC VAR CRC [LISTFILE] [TYPE] [POLYNOMIAL] [PARAMETERS]