diff --git a/README b/README
index 4be9be5..33c91ed 100644
--- a/README
+++ b/README
@@ -8,18 +8,18 @@ ShimCacheParser is a proof-of-concept tool for reading the Application Compatibi
 More information about this cache and how it's implemented can be found here:
 https://www.fireeye.com/content/dam/fireeye-www/services/freeware/shimcache-whitepaper.pdf
 
-The script will find these registry paths, automatically determine their format, and return the data in an optional CSV format. During testing it was discovered that on Windows Vista and later, files may be added to this cache if they were browsed to by explorer.exe and never actually executed. When these same files were executed, the 2nd least significant bit in the flags field was set by the CSRSS process while checking SXS information. During testing it was possible to identify if processes were executed based on this flag being set. This flag's true purpose is currently unknown and is still being testing for consistency, so it should not be currently used to definitively conclude that a file may or may not have executed.
+The script will find these registry paths, automatically determine their format, and return the data in an optional CSV format. During testing it was discovered that on Windows Vista and later, files may be added to this cache if they were browsed to by explorer.exe and never actually executed. When these same files were executed, the 2nd least significant bit in the flags field was set by the CSRSS process while checking SXS information. During testing it was possible to identify if processes were executed based on this flag being set. This flag's true purpose is currently unknown and is still being testing for consistency, so it should not be currently used to definitively conclude that a file may or may not have executed.
 
 Usage
 ====================
-ShimCacheParser.py requires python 2.x (2.6 or later) which can be obtained from http://www.python.org/download/. Parsing of exported registry hives requires Willi Ballenthin's python-registry library which is currently included in this project or can be downloaded here: https://github.com/williballenthin/python-registry.
+ShimCacheParser.py requires python 2.x (2.6 or later) or python 3.x (3.5 or later) which can be obtained from http://www.python.org/download/. Parsing of exported registry hives requires Willi Ballenthin's python-registry library which can be installed via 'pip install python-registry' or downloaded here: https://github.com/williballenthin/python-registry.
 
 Several types of inputs are currently supported:
 -Extracted Registry Hives (-i, --hive)
- -Exported .reg registry files (-r, --reg)
+ -Exported .reg registry files (-r, --reg)
 -MIR XML (-m, --mir)
 -Mass MIR registry acquisitions ZIP archives (-z, --zip)
 -The current Windows system (-l, --local)
 -Exported AppComatCache data from binary file (-b, --bin)
-
-The output CSV file is set with the (-o, --output) argument. If no output file is specified, the data will be printed to STDOUT. ShimCacheParser will search each ControlSet and will only return unique entries by default. If you want to display duplicates as well as the full registry path where the data was taken use the verbose (-v, --verbose) option.
+
+The output CSV file is set with the (-o, --output) argument. If no output file is specified, the data will be printed to STDOUT. ShimCacheParser will search each ControlSet and will only return unique entries by default. If you want to display duplicates as well as the full registry path where the data was taken use the verbose (-v, --verbose) option.
diff --git a/Registry/LICENSE b/Registry/LICENSE
deleted file mode 100644
index 7a4a3ea..0000000
--- a/Registry/LICENSE
+++ /dev/null
@@ -1,202 +0,0 @@
-
-                                 Apache License
-                           Version 2.0, January 2004
-                        http://www.apache.org/licenses/
-
-   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
-
-   1. Definitions.
- - "License" shall mean the terms and conditions for use, reproduction, - and distribution as defined by Sections 1 through 9 of this document. - - "Licensor" shall mean the copyright owner or entity authorized by - the copyright owner that is granting the License. - - "Legal Entity" shall mean the union of the acting entity and all - other entities that control, are controlled by, or are under common - control with that entity. For the purposes of this definition, - "control" means (i) the power, direct or indirect, to cause the - direction or management of such entity, whether by contract or - otherwise, or (ii) ownership of fifty percent (50%) or more of the - outstanding shares, or (iii) beneficial ownership of such entity. - - "You" (or "Your") shall mean an individual or Legal Entity - exercising permissions granted by this License. - - "Source" form shall mean the preferred form for making modifications, - including but not limited to software source code, documentation - source, and configuration files. - - "Object" form shall mean any form resulting from mechanical - transformation or translation of a Source form, including but - not limited to compiled object code, generated documentation, - and conversions to other media types. - - "Work" shall mean the work of authorship, whether in Source or - Object form, made available under the License, as indicated by a - copyright notice that is included in or attached to the work - (an example is provided in the Appendix below). - - "Derivative Works" shall mean any work, whether in Source or Object - form, that is based on (or derived from) the Work and for which the - editorial revisions, annotations, elaborations, or other modifications - represent, as a whole, an original work of authorship. For the purposes - of this License, Derivative Works shall not include works that remain - separable from, or merely link (or bind by name) to the interfaces of, - the Work and Derivative Works thereof. 
- - "Contribution" shall mean any work of authorship, including - the original version of the Work and any modifications or additions - to that Work or Derivative Works thereof, that is intentionally - submitted to Licensor for inclusion in the Work by the copyright owner - or by an individual or Legal Entity authorized to submit on behalf of - the copyright owner. For the purposes of this definition, "submitted" - means any form of electronic, verbal, or written communication sent - to the Licensor or its representatives, including but not limited to - communication on electronic mailing lists, source code control systems, - and issue tracking systems that are managed by, or on behalf of, the - Licensor for the purpose of discussing and improving the Work, but - excluding communication that is conspicuously marked or otherwise - designated in writing by the copyright owner as "Not a Contribution." - - "Contributor" shall mean Licensor and any individual or Legal Entity - on behalf of whom a Contribution has been received by Licensor and - subsequently incorporated within the Work. - - 2. Grant of Copyright License. Subject to the terms and conditions of - this License, each Contributor hereby grants to You a perpetual, - worldwide, non-exclusive, no-charge, royalty-free, irrevocable - copyright license to reproduce, prepare Derivative Works of, - publicly display, publicly perform, sublicense, and distribute the - Work and such Derivative Works in Source or Object form. - - 3. Grant of Patent License. 
Subject to the terms and conditions of - this License, each Contributor hereby grants to You a perpetual, - worldwide, non-exclusive, no-charge, royalty-free, irrevocable - (except as stated in this section) patent license to make, have made, - use, offer to sell, sell, import, and otherwise transfer the Work, - where such license applies only to those patent claims licensable - by such Contributor that are necessarily infringed by their - Contribution(s) alone or by combination of their Contribution(s) - with the Work to which such Contribution(s) was submitted. If You - institute patent litigation against any entity (including a - cross-claim or counterclaim in a lawsuit) alleging that the Work - or a Contribution incorporated within the Work constitutes direct - or contributory patent infringement, then any patent licenses - granted to You under this License for that Work shall terminate - as of the date such litigation is filed. - - 4. Redistribution. You may reproduce and distribute copies of the - Work or Derivative Works thereof in any medium, with or without - modifications, and in Source or Object form, provided that You - meet the following conditions: - - (a) You must give any other recipients of the Work or - Derivative Works a copy of this License; and - - (b) You must cause any modified files to carry prominent notices - stating that You changed the files; and - - (c) You must retain, in the Source form of any Derivative Works - that You distribute, all copyright, patent, trademark, and - attribution notices from the Source form of the Work, - excluding those notices that do not pertain to any part of - the Derivative Works; and - - (d) If the Work includes a "NOTICE" text file as part of its - distribution, then any Derivative Works that You distribute must - include a readable copy of the attribution notices contained - within such NOTICE file, excluding those notices that do not - pertain to any part of the Derivative Works, in at least one - of 
the following places: within a NOTICE text file distributed - as part of the Derivative Works; within the Source form or - documentation, if provided along with the Derivative Works; or, - within a display generated by the Derivative Works, if and - wherever such third-party notices normally appear. The contents - of the NOTICE file are for informational purposes only and - do not modify the License. You may add Your own attribution - notices within Derivative Works that You distribute, alongside - or as an addendum to the NOTICE text from the Work, provided - that such additional attribution notices cannot be construed - as modifying the License. - - You may add Your own copyright statement to Your modifications and - may provide additional or different license terms and conditions - for use, reproduction, or distribution of Your modifications, or - for any such Derivative Works as a whole, provided Your use, - reproduction, and distribution of the Work otherwise complies with - the conditions stated in this License. - - 5. Submission of Contributions. Unless You explicitly state otherwise, - any Contribution intentionally submitted for inclusion in the Work - by You to the Licensor shall be under the terms and conditions of - this License, without any additional terms or conditions. - Notwithstanding the above, nothing herein shall supersede or modify - the terms of any separate license agreement you may have executed - with Licensor regarding such Contributions. - - 6. Trademarks. This License does not grant permission to use the trade - names, trademarks, service marks, or product names of the Licensor, - except as required for reasonable and customary use in describing the - origin of the Work and reproducing the content of the NOTICE file. - - 7. Disclaimer of Warranty. 
Unless required by applicable law or - agreed to in writing, Licensor provides the Work (and each - Contributor provides its Contributions) on an "AS IS" BASIS, - WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or - implied, including, without limitation, any warranties or conditions - of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A - PARTICULAR PURPOSE. You are solely responsible for determining the - appropriateness of using or redistributing the Work and assume any - risks associated with Your exercise of permissions under this License. - - 8. Limitation of Liability. In no event and under no legal theory, - whether in tort (including negligence), contract, or otherwise, - unless required by applicable law (such as deliberate and grossly - negligent acts) or agreed to in writing, shall any Contributor be - liable to You for damages, including any direct, indirect, special, - incidental, or consequential damages of any character arising as a - result of this License or out of the use or inability to use the - Work (including but not limited to damages for loss of goodwill, - work stoppage, computer failure or malfunction, or any and all - other commercial damages or losses), even if such Contributor - has been advised of the possibility of such damages. - - 9. Accepting Warranty or Additional Liability. While redistributing - the Work or Derivative Works thereof, You may choose to offer, - and charge a fee for, acceptance of support, warranty, indemnity, - or other liability obligations and/or rights consistent with this - License. However, in accepting such obligations, You may act only - on Your own behalf and on Your sole responsibility, not on behalf - of any other Contributor, and only if You agree to indemnify, - defend, and hold each Contributor harmless for any liability - incurred by, or claims asserted against, such Contributor by reason - of your accepting any such warranty or additional liability. 
- - END OF TERMS AND CONDITIONS - - APPENDIX: How to apply the Apache License to your work. - - To apply the Apache License to your work, attach the following - boilerplate notice, with the fields enclosed by brackets "[]" - replaced with your own identifying information. (Don't include - the brackets!) The text should be enclosed in the appropriate - comment syntax for the file format. We also recommend that a - file or class name and description of purpose be included on the - same "printed page" as the copyright notice for easier - identification within third-party archives. - - Copyright [yyyy] [name of copyright owner] - - Licensed under the Apache License, Version 2.0 (the "License"); - you may not use this file except in compliance with the License. - You may obtain a copy of the License at - - http://www.apache.org/licenses/LICENSE-2.0 - - Unless required by applicable law or agreed to in writing, software - distributed under the License is distributed on an "AS IS" BASIS, - WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - See the License for the specific language governing permissions and - limitations under the License. \ No newline at end of file diff --git a/Registry/Registry.py b/Registry/Registry.py deleted file mode 100644 index 1d91184..0000000 --- a/Registry/Registry.py +++ /dev/null @@ -1,302 +0,0 @@ -#!/bin/python - -# This file is part of python-registry. -# -# Copyright 2011 Will Ballenthin -# while at Mandiant -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
-# See the License for the specific language governing permissions and -# limitations under the License. - -import sys -import RegistryParse - -RegSZ = 0x0001 -RegExpandSZ = 0x0002 -RegBin = 0x0003 -RegDWord = 0x0004 -RegMultiSZ = 0x0007 -RegQWord = 0x000B -RegNone = 0x0000 -RegBigEndian = 0x0005 -RegLink = 0x0006 -RegResourceList = 0x0008 -RegFullResourceDescriptor = 0x0009 -RegResourceRequirementsList = 0x000A - - -class RegistryKeyHasNoParentException(RegistryParse.RegistryStructureDoesNotExist): - """ - """ - def __init__(self, value): - """ - Constructor. - Arguments: - - `value`: A string description. - """ - super(RegistryKeyHasNoParentException, self).__init__(value) - - def __str__(self): - return "Registry key has no parent key: %s" % (self._value) - - -class RegistryKeyNotFoundException(RegistryParse.RegistryStructureDoesNotExist): - """ - """ - def __init__(self, value): - """ - - Arguments: - - `value`: - """ - super(RegistryKeyNotFoundException, self).__init__(value) - - def __str__(self): - return "Registry key not found: %s" % (self._value) - -class RegistryValueNotFoundException(RegistryParse.RegistryStructureDoesNotExist): - """ - """ - def __init__(self, value): - """ - - Arguments: - - `value`: - """ - super(RegistryValueNotFoundException, self).__init__(value) - - def __str__(self): - return "Registry value not found: %s" % (self._value) - -class RegistryValue(object): - """ - This is a high level structure for working with the Windows Registry. - It represents the 3-tuple of (name, type, value) associated with a registry value. - """ - def __init__(self, vkrecord): - self._vkrecord = vkrecord - - def name(self): - """ - Get the name of the value as a string. - The name of the default value is returned as "(default)". - """ - if self._vkrecord.has_name(): - return self._vkrecord.name() - else: - return "(default)" - - def value_type(self): - """ - Get the type of the value as an integer constant. 
- - One of: - - RegSZ = 0x0001 - - RegExpandSZ = 0x0002 - - RegBin = 0x0003 - - RegDWord = 0x0004 - - RegMultiSZ = 0x0007 - - RegQWord = 0x000B - - RegNone = 0x0000 - - RegBigEndian = 0x0005 - - RegLink = 0x0006 - - RegResourceList = 0x0008 - - RegFullResourceDescriptor = 0x0009 - - RegResourceRequirementsList = 0x000A - """ - return self._vkrecord.data_type() - - def value_type_str(self): - """ - Get the type of the value as a string. - - One of: - - RegSZ - - RegExpandSZ - - RegBin - - RegDWord - - RegMultiSZ - - RegQWord - - RegNone - - RegBigEndian - - RegLink - - RegResourceList - - RegFullResourceDescriptor - - RegResourceRequirementsList - """ - return self._vkrecord.data_type_str() - - def value(self): - return self._vkrecord.data() - -class RegistryKey(object): - """ - A high level structure for use in traversing the Windows Registry. - A RegistryKey is a node in a tree-like structure. - A RegistryKey may have a set of values associated with it, as well as a last modified timestamp. - """ - def __init__(self, nkrecord): - """ - - Arguments: - - `NKRecord`: - """ - self._nkrecord = nkrecord - - def __str__(self): - return "Registry Key %s with %d values and %d subkeys" % (self.path(), len(self.values()), len(self.subkeys())) - - def __getitem__(self, key): - return self.value(key) - - def timestamp(self): - """ - Get the last modified timestamp as a Python datetime. - """ - return self._nkrecord.timestamp() - - def name(self): - """ - Get the name of the key as a string. - - For example, "Windows" if the key path were /{hive name}/SOFTWARE/Microsoft/Windows - See RegistryKey.path() to get the complete key name. - """ - return self._nkrecord.name() - - - def path(self): - """ - Get the full path of the RegistryKey as a string. 
- For example, "/{hive name}/SOFTWARE/Microsoft/Windows" - """ - return self._nkrecord.path() - - def parent(self): - """ - Get the parent RegistryKey of this key, or raise - RegistryKeyHasNoParentException if it does not exist (for example, - the root key has no parent). - """ - # there may be a memory inefficiency here, since we create - # a new RegistryKey from the NKRecord parent key, rather - # than using the parent of this instance, if it exists. - try: - return RegistryKey(self._nkrecord.parent_key()) - except RegistryParse.ParseException: - raise RegistryKeyHasNoParentException(self.name()) - - def subkeys(self): - """ - Return a list of all subkeys. Each element in the list is a RegistryKey. - If the key has no subkeys, the empty list is returned. - """ - if self._nkrecord.subkey_number() == 0: - return [] - - l = self._nkrecord.subkey_list() - return [RegistryKey(k) for k in l.keys()] - - def subkey(self, name): - """ - Return the subkey with a given name as a RegistryKey. - Raises RegistryKeyNotFoundException if the subkey with the given name does not exist. - """ - #print name - if self._nkrecord.subkey_number() == 0: - raise RegistryKeyNotFoundException(self.path() + "\\" + name) - - for k in self._nkrecord.subkey_list().keys(): - if k.name() == name: - return RegistryKey(k) - raise RegistryKeyNotFoundException(self.path() + "\\" + name) - - def values(self): - """ - Return a list containing the values associated with this RegistryKey. - Each element of the list will be a RegistryValue. - If there are no values associated with this RegistryKey, then the - empty list is returned. - """ - try: - return [RegistryValue(v) for v in self._nkrecord.values_list().values()] - except RegistryParse.RegistryStructureDoesNotExist: - return [] - - def value(self, name): - """ - Return the value with the given name as a RegistryValue. - Raises RegistryValueNotFoundExceptiono if the value with the given name does not exist. 
- """ - if name == "(default)": - name = "" - for v in self._nkrecord.values_list().values(): - if v.name() == name: - return RegistryValue(v) - raise RegistryValueNotFoundException(self.path() + " : " + name) - - def find_key(self, path): - """ - Perform a search for a RegistryKey with a specific path. - """ - if len(path) == 0: - return self - - (immediate, _, future) = path.partition("\\") - return self.subkey(immediate).find_key(future) - -class Registry(object): - """ - A class for parsing and reading from a Windows Registry file. - """ - def __init__(self, filelikeobject): - """ - Constructor. - Arguments: - - `filelikeobject`: A file-like object with a .read() method. - If a Python string is passed, it is interpreted as a filename, - and the corresponding file is opened. - """ - try: - self._buf = filelikeobject.read() - except AttributeError: - with open(filelikeobject, "rb") as f: - self._buf = f.read() - self._regf = RegistryParse.REGFBlock(self._buf, 0, False) - - def root(self): - """ - Return the first RegistryKey in the hive. - """ - return RegistryKey(self._regf.first_key()) - - def open(self, path): - """ - Return a RegistryKey by full path. - Subkeys are separated by the backslash character ('\'). A trailing backslash may or may - not be present. - The hive name should not be included. - """ - # is the first registry key always the root? are there any other keys at this - # level? is this the name of the hive? - return RegistryKey(self._regf.first_key()).find_key(path) - -def print_all(key): - if len(key.subkeys()) == 0: - print key.path() - else: - for k in key.subkeys(): - print_all(k) - -if __name__ == '__main__': - r = Registry(sys.argv[1]) - print_all(r.root()) diff --git a/Registry/RegistryParse.py b/Registry/RegistryParse.py deleted file mode 100644 index 37c9f69..0000000 --- a/Registry/RegistryParse.py +++ /dev/null @@ -1,1234 +0,0 @@ -#!/bin/python - -# This file is part of python-registry. 
-# -# Copyright 2011 Will Ballenthin -# while at Mandiant -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. - -import struct -from datetime import datetime - -# Constants -RegSZ = 0x0001 -RegExpandSZ = 0x0002 -RegBin = 0x0003 -RegDWord = 0x0004 -RegMultiSZ = 0x0007 -RegQWord = 0x000B -RegNone = 0x0000 -RegBigEndian = 0x0005 -RegLink = 0x0006 -RegResourceList = 0x0008 -RegFullResourceDescriptor = 0x0009 -RegResourceRequirementsList = 0x000A - -_global_warning_messages = [] -def warn(msg): - if msg not in _global_warning_messages: - _global_warning_messages.append(msg) - print "Warning: %s" % (msg) - -def parse_windows_timestamp(qword): - # see http://integriography.wordpress.com/2010/01/16/using-phython-to-parse-and-present-windows-64-bit-timestamps/ - return datetime.utcfromtimestamp(float(qword) * 1e-7 - 11644473600 ) - -class RegistryException(Exception): - """ - Base Exception class for Windows Registry access. - """ - - def __init__(self, value): - """ - Constructor. - Arguments: - - `value`: A string description. - """ - super(RegistryException, self).__init__() - self._value = value - - def __str__(self): - return "Registry Exception: %s" % (self._value) - -class RegistryStructureDoesNotExist(RegistryException): - """ - Exception to be raised when a structure or block is requested which does not exist. - For example, asking for the ValuesList structure of an NKRecord that has no values - (and therefore no ValuesList) should result in this exception. 
- """ - def __init__(self, value): - """ - Constructor. - Arguments: - - `value`: A string description. - """ - super(RegistryStructureDoesNotExist, self).__init__(value) - - def __str__(self): - return "Registry Structure Does Not Exist Exception: %s" % (self._value) - -class ParseException(RegistryException): - """ - An exception to be thrown during Windows Registry parsing, such as - when an invalid header is encountered. - """ - def __init__(self, value): - """ - Constructor. - Arguments: - - `value`: A string description. - """ - super(ParseException, self).__init__(value) - - def __str__(self): - return "Registry Parse Exception(%s)" % (self._value) - -class UnknownTypeException(RegistryException): - """ - An exception to be raised when an unknown data type is encountered. - Supported data types current consist of - - RegSZ - - RegExpandSZ - - RegBin - - RegDWord - - RegMultiSZ - - RegQWord - - RegNone - - RegBigEndian - - RegLink - - RegResourceList - - RegFullResourceDescriptor - - RegResourceRequirementsList - """ - def __init__(self, value): - """ - Constructor. - Arguments: - - `value`: A string description. - """ - super(UnknownTypeException, self).__init__(value) - - def __str__(self): - return "Unknown Type Exception(%s)" % (self._value) - -class RegistryBlock(object): - """ - Base class for structure blocks in the Windows Registry. - A block is associated with a offset into a byte-string. - - All blocks (besides the root) also have a parent member, which refers to - a RegistryBlock that contains a reference to this block, an is found at a - hierarchically superior rank. Note, by following the parent links upwards, - the root block should be accessible (aka. there should not be any loops) - """ - def __init__(self, buf, offset, parent): - """ - Constructor. - Arguments: - - `buf`: Byte string containing Windows Registry file. - - `offset`: The offset into the buffer at which the block starts. - - `parent`: The parent block, which links to this block. 
- """ - self._buf = buf - self._offset = offset - self._parent = parent - - def unpack_word(self, offset): - """ - Returns a little-endian WORD (2 bytes) from the relative offset. - Arguments: - - `offset`: The relative offset from the start of the block. - """ - return struct.unpack_from(" 0 - - def size(self): - """ - Size of this cell, as an unsigned integer. - """ - if self.is_free(): - return self._size - else: - return self._size * -1 - - def next(self): - """ - Returns the next HBINCell, which is located immediately after this. - Note: This will always return an HBINCell starting at the next location - whether or not the buffer is large enough. The calling function should - check the offset of the next HBINCell to ensure it does not overrun the - HBIN buffer. - """ - try: - return HBINCell(self._buf, self._offset + self.size(), self.parent()) - except: - raise RegistryStructureDoesNotExist("HBINCell does not exist at 0x%x" % (self._offset + self.size())) - - def offset(self): - """ - Accessor for absolute offset of this HBINCell. - """ - return self._offset - - def data_offset(self): - """ - Get the absolute offset of the data block of this HBINCell. - """ - return self._offset + 0x4 - - def raw_data(self): - """ - Get the raw data from the buffer contained by this HBINCell. - """ - return self._buf[self.data_offset():self.data_offset() + self.size()] - - def data_id(self): - """ - Get the ID string of the data block of this HBINCell. - """ - return self.unpack_string(0x4, 2) - - def abs_offset_from_hbin_offset(self, offset): - """ - Offsets contained in HBIN cells are relative to the beginning of the first HBIN. - This converts the relative offset into an absolute offset. - """ - h = self.parent() - while h.__class__.__name__ != "HBINBlock": - h = h.parent() - - return h.first_hbin().offset() + offset - - def child(self): - """ - Make a _guess_ as to the contents of this structure and - return an instance of that class, or just a DataRecord - otherwise. 
- """ - if self.is_free(): - raise RegistryStructureDoesNotExist("HBINCell is free at 0x%x" % (self.offset())) - - id_ = self.data_id() - - if id_ == "vk": - return VKRecord(self._buf, self.data_offset(), self) - elif id_ == "nk": - return NKRecord(self._buf, self.data_offset(), self) - elif id_ == "lf": - return LFRecord(self._buf, self.data_offset(), self) - elif id_ == "lh": - return LHRecord(self._buf, self.data_offset(), self) - elif id_ == "li": - return LIRecord(self._buf, self.data_offset(), self) - elif id_ == "ri": - return RIRecord(self._buf, self.data_offset(), self) - elif id_ == "sk": - return SKRecord(self._buf, self.data_offset(), self) - elif id_ == "db": - return DBRecord(self._buf, self.data_offset(), self) - else: - return DataRecord(self._buf, self.data_offset(), self) - -class Record(RegistryBlock): - """ - Abstract class for Records contained by cells in HBINs - """ - def __init__(self, buf, offset, parent): - """ - Constructor. - Arguments: - - `buf`: Byte string containing Windows Registry file. - - `offset`: The offset into the buffer at which the block starts. - - `parent`: The parent block, which links to this block. This SHOULD be an HBINCell. - """ - super(Record, self).__init__(buf, offset, parent) - - def abs_offset_from_hbin_offset(self, offset): - # TODO This violates DRY as this is a redefinition, see HBINCell.abs_offset_from_hbin_offset() - """ - Offsets contained in HBIN cells are relative to the beginning of the first HBIN. - This converts the relative offset into an absolute offset. - """ - h = self.parent() - while h.__class__.__name__ != "HBINBlock": - h = h.parent() - - return h.first_hbin().offset() + offset - -class DataRecord(Record): - """ - A DataRecord is a HBINCell that does not contain any further structural data, but - may contain, for example, the values pointed to by a VKRecord. - """ - def __init__(self, buf, offset, parent): - """ - Constructor. 
- - Arguments: - - `buf`: Byte string containing Windows Registry file. - - `offset`: The offset into the buffer at which the block starts. - - `parent`: The parent block, which links to this block. This should be an HBINCell. - """ - super(DataRecord, self).__init__(buf, offset, parent) - - def __str__(self): - return "Data Record at 0x%x" % (self.offset()) - -class DBIndirectBlock(Record): - """ - The DBIndirect block is a list of offsets to DataRecords with data - size up to 0x3fd8. - """ - def __init__(self, buf, offset, parent): - """ - Constructor. - Arguments: - - `buf`: Byte string containing Windows Registry file. - - `offset`: The offset into the buffer at which the block starts. - - `parent`: The parent block, which links to this block. This should be an HBINCell. - """ - super(DBIndirectBlock, self).__init__(buf, offset, parent) - - def __str__(self): - return "Large Data Block at 0x%x" % (self.offset()) - - def large_data(self, length): - """ - Get the data pointed to by the indirect block. It may be large. - Return a byte array. - """ - b = bytearray() - count = 0 - while length > 0: - off = self.abs_offset_from_hbin_offset(self.unpack_dword(4 * count)) - size = min(0x3fd8, length) - b += HBINCell(self._buf, off, self).raw_data()[0:size] - - count += 1 - length -= size - return b - -class DBRecord(Record): - """ - A DBRecord is a large data block, which is not thoroughly documented. - Its similar to an inode in the Ext file systems. - """ - def __init__(self, buf, offset, parent): - """ - Constructor. - Arguments: - - `buf`: Byte string containing Windows Registry file. - - `offset`: The offset into the buffer at which the block starts. - - `parent`: The parent block, which links to this block. This should be an HBINCell. 
- """ - super(DBRecord, self).__init__(buf, offset, parent) - - _id = self.unpack_string(0x0, 2) - if _id != "db": - raise ParseException("Invalid DB Record ID") - - def __str__(self): - return "Large Data Block at 0x%x" % (self.offset()) - - def large_data(self, length): - """ - Get the data described by the DBRecord. It may be large. - Return a byte array. - """ - off = self.abs_offset_from_hbin_offset(self.unpack_dword(0x4)) - cell = HBINCell(self._buf, off, self) - dbi = DBIndirectBlock(self._buf, cell.data_offset(), cell) - return dbi.large_data(length) - -class VKRecord(Record): - """ - The VKRecord holds one name-value pair. The data may be one many types, - including strings, integers, and binary data. - """ - def __init__(self, buf, offset, parent): - """ - Constructor. - Arguments: - - `buf`: Byte string containing Windows Registry file. - - `offset`: The offset into the buffer at which the block starts. - - `parent`: The parent block, which links to this block. - This should be an HBINCell. 
- """ - super(VKRecord, self).__init__(buf, offset, parent) - - _id = self.unpack_string(0x0, 2) - if _id != "vk": - raise ParseException("Invalid VK Record ID") - - def data_type_str(self): - """ - Get the value data's type as a string - """ - data_type = self.data_type() - if data_type == RegSZ: - return "RegSZ" - elif data_type == RegExpandSZ: - return "RegExpandSZ" - elif data_type == RegBin: - return "RegBin" - elif data_type == RegDWord: - return "RegDWord" - elif data_type == RegMultiSZ: - return "RegMultiSZ" - elif data_type == RegQWord: - return "RegQWord" - elif data_type == RegNone: - return "RegNone" - elif data_type == RegBigEndian: - return "RegBigEndian" - elif data_type == RegLink: - return "RegLink" - elif data_type == RegResourceList: - return "RegResourceList" - elif data_type == RegFullResourceDescriptor: - return "RegFullResourceDescriptor" - elif data_type == RegResourceRequirementsList: - return "RegResourceRequirementsList" - else: - raise UnknownTypeException("Unknown VK Record type 0x%x at 0x%x" % (data_type, self.offset())) - - def __str__(self): - if self.has_name(): - name = self.name() - else: - name = "(default)" - - data = "" - data_type = self.data_type() - if data_type == RegSZ or data_type == RegExpandSZ: - data = self.data()[0:16] + "..." - elif data_type == RegMultiSZ: - data = str(len(self.data())) + " strings" - elif data_type == RegDWord or data_type == RegQWord: - data = str(hex(self.data())) - elif data_type == RegNone: - data = "(none)" - elif data_type == RegBin: - data = "(binary)" - else: - data = "(unsupported)" - - return "VKRecord(Name: %s, Type: %s, Data: %s) at 0x%x" % (name, - self.data_type_str(), - data, - self.offset()) - - def has_name(self): - """ - Has a name? or perhaps we should use '(default)' - """ - return self.unpack_word(0x2) != 0 - - def has_ascii_name(self): - """ - Is the name of this value in the ASCII charset? - Note, this doesnt work, yet... 
TODO - """ - if self.unpack_word(0x10) & 1 == 1: - print "ascii name" - else: - print "not ascii name" - return self.unpack_word(0x10) & 1 == 1 - - def name(self): - """ - Get the name, if it exists. If not, the empty string is returned. - """ - if not self.has_name(): - return "" - else: - name_length = self.unpack_word(0x2) - return self.unpack_string(0x14, name_length) - - def data_type(self): - """ - Get the data type of this value data as an unsigned integer. - """ - return self.unpack_dword(0xC) - - def data_length(self): - """ - Get the length of this value data. - """ - return self.unpack_dword(0x4) - - def data_offset(self): - """ - Get the offset to the raw data associated with this value. - """ - if self.data_length() < 5 or self.data_length() >= 0x80000000: - return self.absolute_offset(0x8) - else: - return self.abs_offset_from_hbin_offset(self.unpack_dword(0x8)) - - def data(self): - """ - Get the data. This method will return various types based on the data type. - - RegSZ: - Return a string containing the data, doing the best we can to convert it - to ASCII or UNICODE. - RegExpandSZ: - Return a string containing the data, doing the best we can to convert it - to ASCII or UNICODE. The special variables are not expanded. - RegMultiSZ: - Return a list of strings. - RegNone: - See RegBin - RegDword: - Return an unsigned integer containing the data. - RegQword: - Return an unsigned integer containing the data. - RegBin: - Return a sequence of bytes containing the binary data. - RegBigEndian: - Not currently supported. TODO. - RegLink: - Not currently supported. TODO. - RegResourceList: - Not currently supported. TODO. - RegFullResourceDescriptor: - Not currently supported. TODO. - RegResourceRequirementsList: - Not currently supported. TODO. 
- """ - data_type = self.data_type() - data_length = self.data_length() - data_offset = self.data_offset() - - if data_type == RegSZ or data_type == RegExpandSZ: - if data_length >= 0x80000000: - # data is contained in the data_offset field - s = struct.unpack_from("<%ds" % (4), self._buf, data_offset)[0] - elif 0x3fd8 < data_length < 0x80000000: - d = HBINCell(self._buf, data_offset, self) - if d.data_id() == "db": - # this should always be the case - # but empirical testing does not confirm this - s = d.child().large_data(data_length) - else: - s = d.raw_data()[:data_length] - else: - d = HBINCell(self._buf, data_offset, self) - s = struct.unpack_from("<%ds" % (data_length), self._buf, d.data_offset())[0] - - try: - s = s.decode("utf16").encode("utf8").decode("utf8") # iron out the kinks by - except UnicodeDecodeError: # converting to and back to a Python str - try: - s = s.decode("utf8").encode("utf8").decode("utf8") - except UnicodeDecodeError: - try: - s = s.decode("utf8", "replace").encode("utf8").decode("utf8") - except: - print "Well at this point you are screwed." 
- raise - s = s.partition('\x00')[0] - return s - elif data_type == RegBin or data_type == RegNone: - if data_length >= 0x80000000: - data_length -= 0x80000000 - return self._buf[data_offset:data_offset + data_length] - elif 0x3fd8 < data_length < 0x80000000: - d = HBINCell(self._buf, data_offset, self) - if d.data_id() == "db": - # this should always be the case - # but empirical testing does not confirm this - return d.child().large_data(data_length) - else: - return d.raw_data()[:data_length] - return self._buf[data_offset + 4:data_offset + 4 + data_length] - elif data_type == RegDWord: - return self.unpack_dword(0x8) - elif data_type == RegMultiSZ: - if data_length >= 0x80000000: - # this means data_length < 5, so it must be 4, and - # be composed of completely \x00, so the strings are empty - return [] - elif 0x3fd8 < data_length < 0x80000000: - d = HBINCell(self._buf, data_offset, self) - if d.data_id() == "db": - s = d.child().large_data(data_length) - else: - s = d.raw_data()[:data_length] - else: - s = self._buf[data_offset + 4:data_offset + 4 + data_length] - s = s.decode("utf16") - return s.split("\x00") - elif data_type == RegQWord: - d = HBINCell(self._buf, data_offset, self) - return struct.unpack_from(" -# while at Mandiant -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
- -__all__ = [ - 'Registry', - 'RegistryParse', -] diff --git a/ShimCacheParser.py b/ShimCacheParser.py index 31539af..69fcf4e 100755 --- a/ShimCacheParser.py +++ b/ShimCacheParser.py @@ -18,6 +18,12 @@ # # Identifies and parses Application Compatibility Shim Cache entries for forensic data. +from __future__ import print_function +try: + from builtins import range +except ImportError: + range = xrange + import sys import struct import zipfile @@ -25,7 +31,7 @@ import binascii import datetime import codecs -import cStringIO as sio +from io import BytesIO import xml.etree.cElementTree as et from os import path from csv import writer @@ -51,15 +57,15 @@ # Values used by Windows 8 WIN8_STATS_SIZE = 0x80 -WIN8_MAGIC = '00ts' +WIN8_MAGIC = b'00ts' # Magic value used by Windows 8.1 -WIN81_MAGIC = '10ts' +WIN81_MAGIC = b'10ts' # Values used by Windows 10 WIN10_STATS_SIZE = 0x30 WIN10_CREATORS_STATS_SIZE = 0x34 -WIN10_MAGIC = '10ts' +WIN10_MAGIC = b'10ts' CACHE_HEADER_SIZE_NT6_4 = 0x30 CACHE_MAGIC_NT6_4 = 0x30 @@ -145,7 +151,7 @@ def convert_filetime(dwLowDateTime, dwHighDateTime): temp_time <<= 32 temp_time |= dwLowDateTime return date + datetime.timedelta(microseconds=temp_time/10) - except OverflowError, err: + except OverflowError as err: return None # Return a unique list while preserving ordering. @@ -157,33 +163,42 @@ def unique_list(li): ret_list.append(entry) return ret_list +def open_csv(filename, encoding): + if sys.version_info >= (3,0,0): + f = open(filename, 'w', newline='', encoding=encoding) + else: + f = open(filename, 'wb') + if encoding == 'utf-8-sig': + f.write(codecs.BOM_UTF8) + return f + # Write the Log. def write_it(rows, outfile=None): try: if not rows: - print "[-] No data to write..." + print("[-] No data to write...") return if not outfile: for row in rows: - print " ".join(["%s"%x for x in row]) + print(" ".join(["%s"%x for x in row])) else: - print "[+] Writing output to %s..."%outfile + print("[+] Writing output to %s..." 
% outfile) try: - f = open(outfile, 'wb') + encoding = 'utf8' if g_usebom: - f.write(codecs.BOM_UTF8) - csv_writer = writer(f, delimiter=',') - csv_writer.writerows(rows) - f.close() - except IOError, err: - print "[-] Error writing output file: %s" % str(err) + encoding = 'utf-8-sig' + with open_csv(outfile, encoding) as f:#, encoding=encoding) as f: + csv_writer = writer(f, delimiter=',') + csv_writer.writerows(rows) + except IOError as err: + print("[-] Error writing output file: %s" % str(err)) return - except UnicodeEncodeError, err: - print "[-] Error writing output file: %s" % str(err) + except UnicodeEncodeError as err: + print("[-] Error writing output file: %s" % str(err)) return # Read the Shim Cache format, return a list of last modified dates/paths. @@ -209,14 +224,14 @@ def read_cache(cachebin, quiet=False): if (test_max_size-test_size == 2 and struct.unpack(" WIN8_STATS_SIZE and cachebin[WIN8_STATS_SIZE:WIN8_STATS_SIZE+4] == WIN8_MAGIC: if not quiet: - print "[+] Found Windows 8/2k12 Apphelp Cache data..." + print("[+] Found Windows 8/2k12 Apphelp Cache data...") return read_win8_entries(cachebin, WIN8_MAGIC) # Windows 8.1 will use a different magic dword, check for it elif len(cachebin) > WIN8_STATS_SIZE and cachebin[WIN8_STATS_SIZE:WIN8_STATS_SIZE+4] == WIN81_MAGIC: if not quiet: - print "[+] Found Windows 8.1 Apphelp Cache data..." + print("[+] Found Windows 8.1 Apphelp Cache data...") return read_win8_entries(cachebin, WIN81_MAGIC) # Windows 10 will use a different magic dword, check for it elif len(cachebin) > WIN10_STATS_SIZE and cachebin[WIN10_STATS_SIZE:WIN10_STATS_SIZE+4] == WIN10_MAGIC: if not quiet: - print "[+] Found Windows 10 Apphelp Cache data..." 
+ print("[+] Found Windows 10 Apphelp Cache data...") return read_win10_entries(cachebin, WIN10_MAGIC) # Windows 10 Creators Update will use a different STATS_SIZE, account for it elif len(cachebin) > WIN10_CREATORS_STATS_SIZE and cachebin[WIN10_CREATORS_STATS_SIZE:WIN10_CREATORS_STATS_SIZE+4] == WIN10_MAGIC: if not quiet: - print "[+] Found Windows 10 Creators Update Apphelp Cache data..." + print("[+] Found Windows 10 Creators Update Apphelp Cache data...") return read_win10_entries(cachebin, WIN10_MAGIC, creators_update=True) else: - print "[-] Got an unrecognized magic value of 0x%x... bailing" % magic + print("[-] Got an unrecognized magic value of 0x%x... bailing" % magic) return None - except (RuntimeError, TypeError, NameError), err: - print "[-] Error reading Shim Cache data: %s" % err + except (RuntimeError, TypeError, NameError) as err: + print("[-] Error reading Shim Cache data: %s" % err) return None # Read Windows 8/2k12/8.1 Apphelp Cache entry formats. @@ -292,25 +307,29 @@ def read_win8_entries(bin_data, ver_magic): # Skip past the stats in the header cache_data = bin_data[WIN8_STATS_SIZE:] - data = sio.StringIO(cache_data) + data = BytesIO(cache_data) while data.tell() < len(cache_data): header = data.read(entry_meta_len) # Read in the entry metadata # Note: the crc32 hash is of the cache entry data magic, crc32_hash, entry_len = struct.unpack('<4sLL', header) + # Abort on empty magic value + if magic == b'\x00\x00\x00\x00': + break + # Check the magic tag if magic != ver_magic: raise Exception("Invalid version magic tag found: 0x%x" % struct.unpack("= (3,0,0): + import winreg as reg + else: + import _winreg as reg except ImportError: - print "[-] \'winreg.py\' not found... Is this a Windows system?" + print("[-] 'winreg' not found... 
Is this a Windows system?") sys.exit(1) hReg = reg.ConnectRegistry(None, reg.HKEY_LOCAL_MACHINE) hSystem = reg.OpenKey(hReg, r'SYSTEM') - for i in xrange(1024): + for i in range(1024): try: control_name = reg.EnumKey(hSystem, i) if 'controlset' in control_name.lower(): hSessionMan = reg.OpenKey(hReg, 'SYSTEM\\%s\\Control\\Session Manager' % control_name) - for i in xrange(1024): + for i in range(1024): try: subkey_name = reg.EnumKey(hSessionMan, i) if ('appcompatibility' in subkey_name.lower() @@ -831,7 +858,7 @@ def read_zip(zip_name): for zip_file in archive.infolist(): zip_contents.append(zip_file.filename) - print "[+] Processing %d registry acquisitions..." % len(zip_contents) + print("[+] Processing %d registry acquisitions..." % len(zip_contents)) for item in zip_contents: try: if '_w32registry.xml' not in item: @@ -848,8 +875,8 @@ def read_zip(zip_name): # Catch possibly corrupt MIR XML data. try: out_list = read_mir(xml_file, quiet=True) - except(struct.error, et.ParseError), err: - print "[-] Error reading XML data from host: %s, data looks corrupt. Continuing..." % hostname + except(struct.error, et.ParseError) as err: + print("[-] Error reading XML data from host: %s, data looks corrupt. Continuing..." % hostname) continue # Add the hostname to the entry list. @@ -861,16 +888,16 @@ def read_zip(zip_name): li.insert(0, hostname) final_list.append(li) - except IOError, err: - print "[-] Error opening file: %s in MIR archive: %s" % (item, err) + except IOError as err: + print("[-] Error opening file: %s in MIR archive: %s" % (item, err)) continue # Add the final header. final_list.insert(0, ("Hostname", "Last Modified", "Last Update", "Path", "File Size", "File Executed", "Key Path")) return final_list - except (IOError, zipfile.BadZipfile, struct.error), err: - print "[-] Error reading zip archive: %s" % zip_name + except (IOError, zipfile.BadZipfile, struct.error) as err: + print("[-] Error reading zip archive: %s" % zip_name) return None # Do the work. 
@@ -912,70 +939,70 @@ def main(argv=[]): # Pull Shim Cache MIR XML. if args.mir: - print "[+] Reading MIR output XML file: %s..." % args.mir + print("[+] Reading MIR output XML file: %s..." % args.mir) try: - with file(args.mir, 'rb') as xml_data: + with open(args.mir, 'rb') as xml_data: entries = read_mir(xml_data) if not entries: - print "[-] No Shim Cache entries found..." + print("[-] No Shim Cache entries found...") return else: write_it(entries, args.out) - except IOError, err: - print "[-] Error opening binary file: %s" % str(err) + except IOError as err: + print("[-] Error opening binary file: %s" % str(err)) return # Process a MIR XML ZIP archive elif args.zip: - print "[+] Reading MIR XML zip archive: %s..." % args.zip + print("[+] Reading MIR XML zip archive: %s..." % args.zip) entries = read_zip(args.zip) if not entries: - print "[-] No Shim Cache entries found..." + print("[-] No Shim Cache entries found...") else: write_it(entries, args.out) # Read the binary file. elif args.bin: - print "[+] Reading binary file: %s..." % args.bin + print("[+] Reading binary file: %s..." % args.bin) try: - with file(args.bin, 'rb') as bin_data: + with open(args.bin, 'rb') as bin_data: bin_data = bin_data.read() - except IOError, err: - print "[-] Error opening binary file: %s" % str(err) + except IOError as err: + print("[-] Error opening binary file: %s" % str(err)) return entries = read_cache(bin_data) if not entries: - print "[-] No Shim Cache entries found..." + print("[-] No Shim Cache entries found...") else: write_it(entries, args.out) # Read the key data from a registry hive. elif args.reg: - print "[+] Reading .reg file: %s..." % args.reg + print("[+] Reading .reg file: %s..." % args.reg) entries = read_from_reg(args.reg) if not entries: - print "[-] No Shim Cache entries found..." + print("[-] No Shim Cache entries found...") else: write_it(entries, args.out) elif args.hive: - print "[+] Reading registry hive: %s..." 
% args.hive + print("[+] Reading registry hive: %s..." % args.hive) try: entries = read_from_hive(args.hive) if not entries: - print "[-] No Shim Cache entries found..." + print("[-] No Shim Cache entries found...") else: write_it(entries, args.out) - except IOError, err: - print "[-] Error opening hive file: %s" % str(err) + except IOError as err: + print("[-] Error opening hive file: %s" % str(err)) return # Read the local Shim Cache data from the current system elif args.local: - print "[+] Dumping Shim Cache data from the current system..." + print("[+] Dumping Shim Cache data from the current system...") entries = get_local_data() if not entries: - print "[-] No Shim Cache entries found..." + print("[-] No Shim Cache entries found...") else: write_it(entries, args.out)