Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce copying of records in the BP #658

Open
2 tasks
Utumno opened this issue Jan 8, 2023 · 0 comments
Open
2 tasks

Reduce copying of records in the BP #658

Utumno opened this issue Jan 8, 2023 · 0 comments
Assignees
Labels
A-patchers Area: Patchers (Everything in the patcher package) A-perf Area: Performance (runtime performance and/or memory usage) A-records Area: Record Definitions (The brec package, mod_files.py and the game/*/records.py files) C-refactoring Category: Refactoring. A purely internal refactoring, with no user-facing changes
Milestone

Comments

@Utumno
Copy link
Member

Utumno commented Jan 8, 2023

We make way too many copies (deepcopy is really expensive!) - right now we copy every record that a patcher is interested in once for every mod in the load order, but we really only need a copy of the since that's the version of the record we'll let the patchers modify

  • Drop the getTypeCopy calls from the patchers
  • Add a new phase after scanLoadMods in which we make one(!) copy of every record in the BP
  • save: getSize copies the entire grup in a BytesIO - we effectively dump the file twice. Replace this with dump returning the size and then seek back to write the size (O(top-groups) seeks can't be slower than this) - py3+ we would need too many seeks
@Utumno Utumno mentioned this issue Jan 8, 2023
67 tasks
@Utumno Utumno changed the title We make way too many copies (deepcopy is really expensive!) - right now we copy every record that a patcher is interested in once for every mod in the load order, but we really only need a copy of the since that's the version of the record we'll let the patchers modify Reduce copying of records in the BP Jan 8, 2023
@Utumno Utumno added A-patchers Area: Patchers (Everything in the patcher package) A-records Area: Record Definitions (The brec package, mod_files.py and the game/*/records.py files) C-refactoring Category: Refactoring. A purely internal refactoring, with no user-facing changes A-perf Area: Performance (runtime performance and/or memory usage) labels Jan 8, 2023
@Utumno Utumno added this to the 312 milestone Jan 8, 2023
Utumno added a commit that referenced this issue Feb 4, 2023
Now we can track copies - will be much simpler after scanLoadMods common
code is factored out later on this branch.
Mopy/bash/patcher/patchers/mergers.py: seems we were missing some
copies - or even better, can we omit more? Track the do_copy=False
and add more.

Note I return from setRecord for complex groups will be used in the next
commit.

I dropped the used once copy_records - reduce API surface.

Under #658
Utumno added a commit that referenced this issue Feb 4, 2023
And all the complexity of the cell groups is on the surface now. This
commit is kinda broken (has bugs in the patch f.i. replace form ids,
nvidia fog fix and road/cell patcher - aka CELL/WRLD groups) but starts,
and runs and verification passes, and fixups should standout for future
reference, so I posted them in the next merge along with some smoothing
out. This commit evolved through a lot of steps and some of it is wrong
(for instance the sorting of Cell children and the merging of PGRD etc,
plus the patcher errors) but could not be more granular - I did my best
to document some important points below.

I used extensively https://en.uesp.net/wiki/Oblivion_Mod:Mod_File_Format
Note:

- updateMasters should leverage iter_records and be absolutely the same
for all GRUPS.
- endPos needed for WRLD grandchildren - I should probably use the
self._end_pos instead of passing it as a function parameter.
- note I use master_record a lot - the explicit names for "extra_recs"
will disappear soon (see for instance getNumRecords -> __get_cell)

Lots of subtleties here and needing a second pair of eyes and modding
foo:

- XXX Looks like the order for cell ref groups is 8, 10, 9 unlike the
code comments used to suggest and in accordance with the uesp wiki:

@@ -922,3 +922,3 @@ def dump(self,out):
         mget = self._mob_objects.get # for wrapping don't remove inline :P
-        subs = [m for m in (mget(8), *extra_recs, mget(9), mget(10)) if m]
+        subs = [m for m in (mget(8), *extra_recs, mget(10), mget(9)) if m]
         if subs:

this is the the order of the groups as read from Oblivion.esm  -> EDIT:
nope see Add missing sorting of CELL children and:

ef06994#r94025649

- Note I added _validate and the ancient TODO there - should be answered
- isFallout - is only needed for newer games? Is it only needed for
WRLD children groups or also for CELL ones? See extensive discussion in
e7a31dc
- Mopy/bash/patcher/patchers/multitweak_assorted.py: removed WRLD -
this is for interior cells but still !

WIP WRLD:

The peculiar WRLDs dump (turns out doesn't work and will be removed -
see last traceback here)

This also centralizes _load_rec_group

----

And now the clients - in a proud untested alpha, setRecord everywhere:

No more id_worldBlocks/cellBlock - id_records should ideally be
encapsulated but the real issue here is getTypeCopy - before we were
not copying always - right solution seems to add do_copy param alongside
_loading to setRecord - then go through the uses and decide on copy or
not -> new issue: #658.

Then note that we do not copy along the references - why not do
patchCells.setRecord(cellBlock)? Is the game attaching the refs from
an earlier loading plugin for the same cell fid?

Mopy/bash/patcher/patchers/base.py: simplified at last but we should
move this inside the cell and get rid of the old API. Some work on #653
that leaked in here due to conflicts - note that the tweaks were never
meant to iterate over nested records so using curr_top here should be
safe.

CellImporter:

Using a small hack, a new param in getActiveRecords. Note subtle
differences in WRLD and CELL led to set_cell - much less obscure than
setCell. Note also getActiveRecords now returns the rid instead of the
fid.

Inline importCellBlockData() and co:

In the laudable tradition of getting rid of closures :)

Adieu non poolable tweaks - this opens the way for simplifying the
tweaks hierarchy even further.

Use master record rather than the 'world' attribute

empty_mob

Starts coming together, related to _write_header - open question:
- what do we write into the BP for the rest of the header fields? I
here copy the header as I (hopefully) got it - but in case we _create_
the group all the extra header fields are 0. At least this is now
centralized in empty_mob - see also mod_files.py - using it there too.

----

Oblivion.esm records verification passes:

Last one was:

[Window Title]
Verification Failed

[Content]
Oblivion.esm failed to verify using current record definitions. The original traceback is available in the BashBugDump.

mod_links.py  114 Execute: Exception loading Oblivion.esm:
Traceback (most recent call last):
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\basher\mod_links.py", line 105, in Execute
    self._load_mod(dbg_inf, keepAll=False,
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\basher\mod_links.py", line 86, in _load_mod
    modFile.load(True, **kwargs)
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\mod_files.py", line 224, in load
    new_top = topClass(header, self.loadFactory, ins,
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\brec\record_groups.py", line 1373, in __init__
    self.orphansSkipped = _orphans_skipped = 0
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\brec\record_groups.py", line 668, in __init__
    super().__init__(header, loadFactory, ins, do_unpack)
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\brec\record_groups.py", line 206, in __init__
    super().__init__(header, loadFactory, ins, do_unpack)
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\brec\record_groups.py", line 153, in __init__
    super().__init__(loadFactory, ins, self._end_pos)
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\brec\record_groups.py", line 52, in __init__
    self._load_rec_group(ins, endPos)
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\brec\record_groups.py", line 219, in _load_rec_group
    grel = self._group_element(header, ins)
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\brec\record_groups.py", line 693, in _group_element
    return self._top_rec_class(self.loadFactory, ins)
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\brec\record_groups.py", line 409, in __init__
    super().__init__(loadFactory, ins, self._end_pos)
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\brec\record_groups.py", line 52, in __init__
    self._load_rec_group(ins, endPos)
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\brec\record_groups.py", line 605, in _load_rec_group
    super()._load_rec_group(ins, endPos, self.master_record)
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\brec\record_groups.py", line 508, in _load_rec_group
    self._load_mobs(gt, header, ins, master_rec)
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\brec\record_groups.py", line 1348, in _load_mobs
    return
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\brec\record_groups.py", line 522, in _load_mobs
    self._mob_objects[gt] = self._mob_objects_type[gt](header,
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\brec\record_groups.py", line 927, in __init__
    super(_Nested, self).__init__(grup_head, loadFactory, ins, do_unpack,
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\brec\record_groups.py", line 378, in __init__
    super().__init__(header, loadFactory, ins, do_unpack) # FIXME pass master record here
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\brec\record_groups.py", line 206, in __init__
    super().__init__(header, loadFactory, ins, do_unpack)
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\brec\record_groups.py", line 153, in __init__
    super().__init__(loadFactory, ins, self._end_pos)
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\brec\record_groups.py", line 52, in __init__
    self._load_rec_group(ins, endPos)
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\brec\record_groups.py", line 508, in _load_rec_group
    self._load_mobs(gt, header, ins, master_rec)
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\brec\record_groups.py", line 1216, in _load_mobs
    ins.rewind() # to reread first block header
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\brec\record_groups.py", line 1026, in __init__
    self.id_records = {}
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\brec\record_groups.py", line 52, in __init__
    self._load_rec_group(ins, endPos)
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\brec\record_groups.py", line 1087, in _load_rec_group
    if _rsig == b'CELL':
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\brec\record_groups.py", line 1050, in _group_element
    def _group_element(self, header, ins, do_unpack=True) -> _cell_type:
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\brec\record_groups.py", line 409, in __init__
    super().__init__(loadFactory, ins, self._end_pos)
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\brec\record_groups.py", line 52, in __init__
    self._load_rec_group(ins, endPos)
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\brec\record_groups.py", line 605, in _load_rec_group
    super()._load_rec_group(ins, endPos, self.master_record)
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\brec\record_groups.py", line 511, in _load_rec_group
    self._load_err(f'Unexpected {sig_to_str(head_sig)} record in '
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\brec\record_groups.py", line 55, in _load_err
    raise ModError(self.inName, msg)
bash.exception.ModError: Oblivion.esm: Unexpected WRLD record in CELL Record group.

I could solve this by making _top_type a one element set, which would
only have 2 elements for _ExtCell -> {b'CELL', b'WRLD'} - but I
preferred to pass the end_pos, as those ext cell groups are nested in
WrldChildren of which we know the size - after WrldChildren we are
supposed to get another WRLD - what happens in the traceback above on
loading _ExtCell. This might make adding to the marker groups the {0}
(top type) one redundant.

_ExteriorCells needs some more work but thinning _AMob has priority here

setRecord do_copy param and revert copies in Cell patchers:

Copying records turns out was the performance sink.

get_cells:

Traceback (most recent call last):
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\basher\patcher_dialog.py", line 170, in PatchExecute
    patchFile.buildPatch(log,SubProgress(progress,0.8,0.9))#no speeding needed/really possible (less than 1/4 second even with large LO)
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\patcher\patch_files.py", line 322, in buildPatch
    patcher.buildPatch(log, SubProgress(subProgress, i))
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\patcher\patchers\preservers.py", line 528, in buildPatch
    for cell_fid, patch_cell in worldBlock.getActiveRecords(b'CELL'):
AttributeError: 'MobWorld' object has no attribute 'getActiveRecords'

Monkey patch for PGRE be iterated too:

patcher_dialog.py  272 _error: Exception during Bashed Patch building:
Traceback (most recent call last):
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\basher\patcher_dialog.py", line 169, in PatchExecute
    patchFile.scanLoadMods(SubProgress(progress,0.2,0.8)) #try to speed this up!
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\patcher\patch_files.py", line 270, in scanLoadMods
    patcher.scan_mod_file(modFile,nullProgress)
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\patcher\base.py", line 90, in scan_mod_file
    self.scanModFile(modFile, progress)
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\patcher\patchers\base.py", line 277, in scanModFile
    if record.base in self.old_new:
AttributeError: 'MrePgrd' object has no attribute 'base'

@@ -262,3 +262,3 @@ def scanModFile(self, modFile, progress, *, __get_refs=(
                     for record in get_refs(cellBlock).iter_records():
-                        if record.base in self.old_new:
+                        if getattr(record, 'base', None) in self.old_new:
                             if not patch_cell:

use EAFP here?

BP for Oblivion dumps

Last few:

{v._mob_objects[6]._grup_head.label for v in self.id_records.values()} XXX

{_DummyFid((Oblivion.esm, 000000))}

hence passing the kwargs: **{ # the master record
                sig_to_str(master_rec._rec_sig).lower(): master_rec})

----

patcher_dialog.py  268 _error: Exception during Bashed Patch building:
Traceback (most recent call last):
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\basher\patcher_dialog.py", line 182, in PatchExecute
    self._save_pbash(patchFile, patch_name)
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\basher\patcher_dialog.py", line 277, in _save_pbash
    patchFile.safeSave()
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\mod_files.py", line 257, in safeSave
    self.save(filePath.temp)
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\mod_files.py", line 291, in save
    selfTops[rsig].dump(out)
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\brec\record_groups.py", line 1368, in dump
    totalSize = RecordHeader.rec_header_size + sum(
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\brec\record_groups.py", line 1369, in <genexpr>
    x.dump(out) for x in self.id_records.values())
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\brec\record_groups.py", line 1333, in dump
    super().dump(out)
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\brec\record_groups.py", line 529, in dump
    if r: r.dump(out)
  File "C:\Dropbox (Personal)\eclipse_workspaces\python\wrye-bash\Mopy\bash\brec\record_structs.py", line 531, in dump
    raise exception.StateError(f'Data changed: {self.rec_str}')
bash.exception.StateError: Data changed: WRLD

hence bye override of dump (so much the better)

-----

Note - RecordType.sig_to_class is not Initialized (only records in
common records) - adding a deprint:

+deprint(f'{RecordType.sig_to_class=}')
+
 class _AMobBase:

prints:

record_groups.py   42 <module>: RecordType.sig_to_class={b'FLST':
<class 'bash.brec.common_records.AMreFlst'>, b'IMAD': <class
'bash.brec.common_records.AMreImad'>, b'CELL': <class
'bash.brec.common_records.AMreCell'>, b'ASTP': <class
'bash.brec.common_records.MreAstp'>, b'COLL': <class
'bash.brec.common_records.MreColl'>, b'DEBR': <class
'bash.brec.common_records.MreDebr'>, b'DLBR': <class
'bash.brec.common_records.MreDlbr'>, b'DLVW': <class
'bash.brec.common_records.MreDlvw'>, b'DUAL': <class
'bash.brec.common_records.MreDual'>, b'EYES': <class
'bash.brec.common_records.MreEyes'>, b'FSTP': <class
'bash.brec.common_records.MreFstp'>, b'FSTS': <class
'bash.brec.common_records.MreFsts'>, b'GLOB': <class
'bash.brec.common_records.MreGlob'>, b'GMST': <class
'bash.brec.common_records.MreGmst'>}

So we can't use it to populate the _accepted_sigs - see:

"Move updating cell references for cell/wrld groups to game.init"
@Utumno Utumno self-assigned this Mar 22, 2023
Utumno added a commit that referenced this issue Apr 6, 2023
I did not like adding one function call per record in _add_to_patch.
This attempts to make this at least a bit faster. As a bonus reduces
uses of the rather private 'tops' and especially 'id_records'.
Performance gain is little actually:

2cd690c:
763083810 function calls (726903143 primitive calls) in 463.804 seconds
this:
762797187 function calls (726679908 primitive calls) in 461.297 seconds

so few gains (turns out _add_to_patch is a tiny fraction of function
calls) - ~230k calls.

There is a subtler benefit and arguably more important - we used to
create the patchFile tops in those checks. See
ad85967. We break down the logic in
add_to_patch in some commonly used paths:

- a _filter_in_patch class variable to filter the records once for some
of the patchers.
- a _keep_ids property for patchers that would only keep a (usually not
already added) record only if its rid is present in their (varied) data
structures.

Gives some insight for scanModFile and a glimpse of the differences
between the patchers in case we want to merge - for one track the
value for _filter_in_patch helps - for two _keep_ids could point to
common structures. Should be faster also as the checks would be
performed in _add_to_patch anyway - we just avoid the calls - timings
above. I don't like catching the NotImplementedError there but will
do for now.

CoblCatalogsPatcher.scanModFile:

Although as seen we could use super with _add_to_patch that one looked
pretty expensive - an override is cleaner here. I add a copy -
_book_fids are few and this should be handled anyways - see #658.
@Utumno Utumno modified the milestones: 312, 313 Nov 17, 2023
@Infernio Infernio modified the milestones: 313, 314 Dec 16, 2023
@Utumno Utumno modified the milestones: 314, 315 Jul 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-patchers Area: Patchers (Everything in the patcher package) A-perf Area: Performance (runtime performance and/or memory usage) A-records Area: Record Definitions (The brec package, mod_files.py and the game/*/records.py files) C-refactoring Category: Refactoring. A purely internal refactoring, with no user-facing changes
Projects
None yet
Development

No branches or pull requests

2 participants