williballenthin / python-registry
Pure Python parser for Windows Registry hives.
License: Apache License 2.0
Some pointers from Erik Kristensen:
Here is a PPA quick start guide — http://blog.launchpad.net/ppa/personal-package-archives-for-everyone
If you install the python-stdeb package, you’ll be able to do this very simply on an Ubuntu system.
While running amcache.py against collected Amcache.hve files, no entries are parsed out. I have only encountered this on Windows 10 version 10.0.16299. I assume that 10.0.16299 also changed something in this file (I'm referring to the AppCompatCache change). The Amcache.hve is readable with a Registry tool and contains valid data. Maybe you can have a look. Side note: other tools also break or return empty output :)
Breaks with:
OS Name: Microsoft Windows 10 Pro
OS Version: 10.0.16299 N/A Build 16299
The output is simply the header and that's it:
for@workstation
$ amcache.py Amcache.hve
path|sha1|size|file_description|source_key_timestamp|created_timestamp|modified_timestamp|modified_timestamp2|linker_timestamp|product|company|pe_sizeofimage|version_number|version|language|header_hash|pe_checksum|id|switchbackcontext
for@workstation
$
Works with:
OS Name: Microsoft Windows 10 Pro
OS Version: 10.0.15063 N/A Build 15063
The BCD hive as described here is identified as "HiveType.UNKNOWN" by the hive_type() method. Since it's a pretty well-known type, I figure it merits inclusion as a distinct type. I can submit a PR if you like.
Traceback (most recent call last):
File "/home/willi/projects/python-registry/samples/shellbags.py", line 914, in
for shellbag in get_all_shellbags(registry):
File "/home/willi/projects/python-registry/samples/shellbags.py", line 883, in get_all_shellbags
new = get_shellbags(shell_key)
File "/home/willi/projects/python-registry/samples/shellbags.py", line 865, in get_shellbags
shellbag_rec(bagmru_key, "", "")
File "/home/willi/projects/python-registry/samples/shellbags.py", line 819, in shellbag_rec
for bag in bags_key.subkey(str(slot)).subkeys():
File "/usr/local/lib/python2.7/dist-packages/Registry/Registry.py", line 217, in subkey
for k in self._nkrecord.subkey_list().keys():
File "/usr/local/lib/python2.7/dist-packages/Registry/RegistryParse.py", line 858, in keys
for _ in range(0, self._keys_len()):
AttributeError: 'RIRecord' object has no attribute '_keys_len'
python-registry does not support large (larger than 16k) Registry values, which are stored as "big-blocks".
For reference, see http://www.msuiche.net/2009/06/07/windows-vista-and-later-registry-secrets/.
via @EricZimmerman...
Not all value names are stored as ASCII; some are Unicode.
The strange thing is that I see what looks like ASCII in some of the Unicode-flagged ones: namepresent == 0, but the bytes are ASCII for 'Letter' rather than some crazy Chinese symbols.
Suggested logic:
if (NameLength == 0)
{
    ValueName = "(default)";
}
else
{
    if (NamePresentFlag > 0)
    {
        ValueName = Encoding.ASCII.GetString(rawBytes, 0x18, NameLength);
    }
    else
    {
        // in very rare cases, the valuename is in ascii even when it should be in Unicode.
        var valString = BitConverter.ToString(rawBytes, 0x18, NameLength);
        bool foundMatch = false;
        try {
            foundMatch = Regex.IsMatch(valString, "-[!^0]{2}");
        } catch (ArgumentException ex) {
            // Syntax error in the regular expression
        }
        if (foundMatch)
        {
            // we found what appears to be unicode
            ValueName = Encoding.Unicode.GetString(rawBytes, 0x18, NameLength);
        }
        else
        {
            ValueName = Encoding.ASCII.GetString(rawBytes, 0x18, NameLength);
        }
    }
}
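For comparison, here is a rough Python rendering of the same decision tree. It is only a sketch: `decode_value_name` and its parameters are hypothetical names, not part of python-registry, and the caller is assumed to have already sliced the name bytes out of the VK record and read the name flag. The regex check is approximated as "a NUL byte appears somewhere after the first byte", which is what a "-00" match on the hex dump amounts to.

```python
def decode_value_name(raw_name: bytes, name_flag_is_ascii: bool) -> str:
    """Decode a VK record name following the suggested logic (hypothetical helper)."""
    if len(raw_name) == 0:
        return "(default)"
    if name_flag_is_ascii:
        return raw_name.decode("ascii", errors="replace")
    # The flag says UTF-16, but in rare cases the bytes are really ASCII.
    # A NUL byte after the first byte suggests genuine UTF-16LE text.
    if b"\x00" in raw_name[1:]:
        return raw_name.decode("utf-16-le", errors="replace")
    return raw_name.decode("ascii", errors="replace")
```

Note that this heuristic would still misread UTF-16 names made up entirely of characters with no zero byte (for example many CJK characters), so it is a fallback rather than a reliable detector.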
$ python findkey.py -i SYSTEM Harddisk
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvTraceback (most recent call la
st):
File "scripts\registry\findkey.py", line 103, in
main()
File "scripts\registry\findkey.py", line 77, in main
rec(reg.root(), 0, needle)
File "scripts\registry\findkey.py", line 70, in rec
rec(subkey, depth + 1, needle)
File "scripts\registry\findkey.py", line 70, in rec
rec(subkey, depth + 1, needle)
File "scripts\registry\findkey.py", line 70, in rec
rec(subkey, depth + 1, needle)
File "scripts\registry\findkey.py", line 70, in rec
rec(subkey, depth + 1, needle)
File "scripts\registry\findkey.py", line 70, in rec
rec(subkey, depth + 1, needle)
File "scripts\registry\findkey.py", line 70, in rec
rec(subkey, depth + 1, needle)
File "scripts\registry\findkey.py", line 70, in rec
rec(subkey, depth + 1, needle)
File "scripts\registry\findkey.py", line 58, in rec
if (args.case_insensitive and needle in str(value.value()).lower()) or needle in str(value.value()):
File "build\bdist.win32\egg\Registry\Registry.py", line 153, in value
File "build\bdist.win32\egg\Registry\RegistryParse.py", line 740, in data
File "build\bdist.win32\egg\Registry\RegistryParse.py", line 553, in decode_utf16le
File "c:\python27\lib\encodings\utf_16.py", line 16, in decode
return codecs.utf_16_decode(input, errors, True)
UnicodeDecodeError: 'utf16' codec can't decode bytes in position 112-113: illegal encoding
Quick fix: in rec(), catch UnicodeDecodeError in addition to UnicodeEncodeError:
    except (UnicodeEncodeError, UnicodeDecodeError):
        pass
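The quick fix can be isolated into a small helper so the search loop stays readable. This is a sketch under assumptions: `safe_text` is a hypothetical name, and the getter stands in for something like `value.value` from the traceback above.

```python
def safe_text(getter):
    """Return str(getter()), or None if the value cannot be decoded or encoded.

    `getter` is any zero-argument callable; the name is illustrative only.
    """
    try:
        return str(getter())
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Skip values whose bytes are not valid text instead of aborting the walk.
        return None
```

A caller would then test `text is not None and needle in text`, so one malformed value no longer ends the whole search.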
Have there been any thoughts on how the checksum should be implemented?
Or, more precisely, how should inconsistencies in the sequence numbers or the checksum be handled and reported?
The code below works, but leaves a few things to be desired with regard to feedback.
diff --git a/Registry/RegistryParse.py b/Registry/RegistryParse.py
index e5e9a56..d2a0b9f 100644
--- a/Registry/RegistryParse.py
+++ b/Registry/RegistryParse.py
@@ -243,9 +243,21 @@ class REGFBlock(RegistryBlock):
if _seq1 != _seq2:
# the registry was not synchronized
+ print("Sequence counters is out of sync %s %s" % (_seq1, _seq2))
pass
- # TODO: compute checksum and check
+ # compute checksum and check
+ _ssum = self.stored_checksum()
+ _csum = self.calculate_checksum()
+ if _ssum != _csum:
+ print("Checksum 0x%x do not match calculated 0x%x" % (_ssum, _csum))
+ pass
def major_version(self):
"""
@@ -273,6 +291,27 @@ class REGFBlock(RegistryBlock):
"""
return self.unpack_dword(0x28)
+ def stored_checksum(self):
+ """
+ Get the stored checksum in file.
+ """
+ return self.unpack_dword(0x1FC)
+
+ def calculate_checksum(self):
+ """
+ checksum is calculated over the first 0x200 bytes
+ XOR of all D-Words from 0x00000000 to 0x000001FB
+ """
+ csum = 0
+ idx = 0x0
+ while idx <= 0x1FB: # 0x1FC includes the checksum value and should result in zero
+ _csum = self.unpack_dword(idx)
+ #print("add 0x%08x to 0x%8x at 0x%8x gives 0x%8x" % (_csum, csum, idx, csum ^ _csum))
+ csum ^= _csum
+ idx += 0x4
+
+ return csum
+
def first_key(self):
first_hbin = next(self.hbins())
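Outside the library, the same calculation can be sketched over a raw header buffer. This assumes what the diff describes: the checksum is the XOR of the little-endian DWORDs in bytes 0x000 to 0x1FB, and the stored value lives at 0x1FC. The function names are illustrative and do not belong to python-registry.

```python
import struct

CHECKSUM_OFFSET = 0x1FC  # stored checksum position, per the diff above


def calculate_regf_checksum(header: bytes) -> int:
    """XOR all little-endian DWORDs in the first 0x1FC bytes of the header."""
    if len(header) < 0x200:
        raise ValueError("need at least 0x200 bytes of the REGF header")
    csum = 0
    for (dword,) in struct.iter_unpack("<I", header[:CHECKSUM_OFFSET]):
        csum ^= dword
    return csum


def stored_regf_checksum(header: bytes) -> int:
    """Read the checksum stored in the header itself."""
    return struct.unpack_from("<I", header, CHECKSUM_OFFSET)[0]
```

Comparing the two values and returning the mismatch to the caller (rather than printing) would leave the reporting decision to the tool that uses the library.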
RegTester does not seem to be updated to work with current code. Is it used to test changes to make sure everything works?
Is there any archive with Hive files and corresponding .reg files that can be used for regression testing?
Would it be appropriate to have small file sets in the repo for testing?
I have some work in progress on an updated version and will prepare a PR after some sleep.
On tool startup, call REGFBlock.validate() on each block, and if any fail, report this to user before continuing. This might also inform which exception recovery handler to use.
Hi willi,
Great code you have here to solve our hve parsing worries. I am relatively new to Python (a few months in) and I was referred here by David Sharpe. I downloaded the whole package, ran setup.py, and it worked fine. Then I ran amcache.py and got the following error:
I use Python 3.5 on a Windows 10 machine. I'd really appreciate it if this issue could be fixed.
Thanks in advance.
The Registry class currently looks for a file-like object with a .read() method and, if that fails, assumes it was given a string path to the file (which it then opens and reads).
It'd be useful to be able to pass the contents of a registry hive directly to the Registry class.
e.g. - within pytsk, one would open a registry hive with something like this:
file_obj = filesystem_obj.open("/Windows/System32/Config/SYSTEM")
file_data = file_obj.read_random(0,file_obj.info.meta.size)
that produces:
HiWilli@hehe:~$ python test.py evidence.E01 | xxd
0000000: 7265 6766 eb02 0000 eb02 0000 59ec 2d6a regf........Y.-j
0000010: 3515 d001 0100 0000 0500 0000 0000 0000 5...............
0000020: 0100 0000 2000 0000 00b0 9d00 0100 0000 .... ...........
0000030: 5300 5900 5300 5400 4500 4d00 0000 0000 S.Y.S.T.E.M.....
...
This obviously isn't a file-like object, so there's no read(), and the fallback open() call raises a TypeError: "file() argument 1 must be encoded string without NULL bytes, not str"
My initial thought/work around is to add another try statement along the lines of:
try:
    self._buf = filelikeobject.read()
except AttributeError:
    try:
        with open(filelikeobject, "rb") as f:
            self._buf = f.read()
    except TypeError:
        self._buf = filelikeobject
self._regf = RegistryParse.REGFBlock(self._buf, 0, False)
Not sure how often this comes up but figured I'd see your thoughts on it. I saw others doing workarounds like here where they write the registry hive to disk then read it, but seems like an unneeded step.
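The three-way dispatch proposed above can also be written with explicit type checks instead of nested exception handlers. This is a sketch only: `read_hive_buffer` is a hypothetical helper and not how the Registry class is actually implemented.

```python
import io


def read_hive_buffer(source):
    """Return the hive contents as bytes from raw bytes, a file-like object, or a path."""
    # Check for raw bytes first: on Python 3, open() would treat bytes as a path.
    if isinstance(source, (bytes, bytearray, memoryview)):
        return bytes(source)
    if hasattr(source, "read"):
        return source.read()
    with open(source, "rb") as f:
        return f.read()
```

Since the class already accepts anything with a .read() method, wrapping the data as `io.BytesIO(file_data)` before passing it in should also avoid writing the hive to disk, without any library change.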
An entry in the global menu might be appropriate for this
Use a mask and flags to check whether value data is resident, rather than weird arithmetic operations modulo 0x80000000.
for example: https://github.com/williballenthin/python-registry/blob/master/Registry/RegistryParse.py#L762
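A mask-based version might look like the sketch below. It assumes that the top bit (0x80000000) of the VK record's data-length field is the "resident" flag and that the remaining 31 bits are the size, which is what the modulo arithmetic implies. The constant and function names are made up for illustration.

```python
VK_DATA_RESIDENT_FLAG = 0x80000000  # assumed: high bit marks inline (resident) data
VK_DATA_SIZE_MASK = 0x7FFFFFFF      # remaining bits hold the data size


def split_data_length(raw_length: int):
    """Split the raw DWORD into (is_resident, size_in_bytes)."""
    is_resident = bool(raw_length & VK_DATA_RESIDENT_FLAG)
    size = raw_length & VK_DATA_SIZE_MASK
    return is_resident, size
```

Naming the flag makes the intent visible at each call site, where `length % 0x80000000` forces the reader to work out what is being tested.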
I have a few servers using Python 2.6. The amcache.py sample contains a dict comprehension line of code that is not compatible with python < 2.7.
Line giving Syntax Error:
return ExecutionEntry(**{e.name:e.getter(key) for e in FIELDS})
Possible fix:
return ExecutionEntry(**dict((e.name, e.getter(key)) for e in FIELDS))
Hi,
When requesting a (sub)key, Registry raises an exception if the key is not found. The same goes for values. Since the easiest way to check whether a key exists is (implicitly) to call key.subkey(), the absence of a subkey is not really an exceptional condition.
Would you consider replacing the exception with a None return value, or adding functions that return None instead of raising an exception?
Like this:
def subkey(self, name):
    """
    Return the subkey with a given name as a RegistryKey.
    Return None if the subkey with the given name does not exist.
    """
    if self._nkrecord.subkey_number() == 0:
        return None
    for k in self._nkrecord.subkey_list().keys():
        if k.name().lower() == name.lower():
            return RegistryKey(k)
    return None
This issue is related to #7.
Windows doesn't support big data records when the minor version of the hive format is equal to or less than 3. For example, if a hive has the minor version set to 3, and there is a large value stored in this hive, and the value begins with the db string, then python-registry will treat such a value like the big data structure, but Windows will treat the value literally.
Example:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "Registry/Registry.py", line 160, in value
return self._vkrecord.data()
File "Registry/RegistryParse.py", line 1024, in data
d = self.raw_data()
File "Registry/RegistryParse.py", line 923, in raw_data
ret = d.child().large_data(data_length)
File "Registry/RegistryParse.py", line 713, in large_data
cell = HBINCell(self._buf, off, self)
File "Registry/RegistryParse.py", line 501, in __init__
self._size = self.unpack_int(0x0)
File "Registry/RegistryParse.py", line 212, in unpack_int
return struct.unpack_from(str("<i"), self._buf, self._offset + offset)[0]
The hive is attached.
test-db.zip
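The rule described in this report can be expressed as a guard that runs before a value is interpreted as a big data record. This is a sketch under assumptions: the function name is hypothetical, and the 16344-byte threshold is assumed to be the largest payload that fits in a single cell (the issue text only says "large").

```python
BIG_DATA_MIN_MINOR_VERSION = 4  # per the report: minor version <= 3 has no big data
BIG_DATA_THRESHOLD = 16344      # assumption: largest value stored in a single cell


def looks_like_big_data(minor_version: int, data_length: int, cell_payload: bytes) -> bool:
    """Decide whether a value's data cell should be parsed as a 'db' record."""
    if minor_version < BIG_DATA_MIN_MINOR_VERSION:
        # Old hive formats store the bytes literally, even if they start with "db".
        return False
    if data_length <= BIG_DATA_THRESHOLD:
        return False
    return cell_payload[:2] == b"db"
```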
Hello.
At present, the data_type() method of the VKRecord class applies the DEVPROP_MASK_TYPE (0x00000FFF) bit mask to a registry value type using the AND operation, and thus clears bits #12-#31 in this value type. I know that the purpose of this operation is to extract the DEVPROP_TYPE_FILETIME (0x00000010) type (you call it RegFileTime), but there is a major issue here:
DEVPROP_TYPE_FILETIME),
DEVPROP_MASK_TYPE bit mask when parsing a value type.
Based on this, the library can't silently clear the bits in a value type. For example, the PnP subsystem uses the DEVPROP_TYPE_STRING_LIST (0x00002012) type to store device location paths in the registry (this type will be converted to the following value: 0xFFFF2012). Reading such a type with python-registry will return 18 (0x12), or DEVPROP_TYPE_STRING, if we resolve this constant in the context of device properties, but this value is wrong, because it doesn't match the value stored by the subsystem.
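The information loss is easy to reproduce with plain integer arithmetic. The constants below are the ones quoted in this report; `masked_type` is an illustrative stand-in for the masking step, not the library's code.

```python
DEVPROP_MASK_TYPE = 0x00000FFF


def masked_type(raw_type: int) -> int:
    """Apply the mask the report describes, clearing bits 12 through 31."""
    return raw_type & DEVPROP_MASK_TYPE


stored_string_list = 0xFFFF2012  # DEVPROP_TYPE_STRING_LIST as stored in the hive
stored_filetime = 0xFFFF0010     # DEVPROP_TYPE_FILETIME as stored in the hive

# After masking, the string list can no longer be told apart from a plain string (0x12).
print(hex(masked_type(stored_string_list)))
print(hex(masked_type(stored_filetime)))
```

One way to keep the FILETIME convenience without this collision would be to compare against the full stored value and leave every other type untouched, though that is only a suggestion.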
Some registry hives can be as large as 2GB. Maybe not a big issue for most people, but also not difficult to fix. I went with the following solution using mmap for myself. It substantially reduces the time to read from a large hive and uses almost no memory.
import mmap
from Registry import RegistryParse
from Registry.Registry import Registry as _Registry

class Registry(_Registry):
    def __init__(self, f):
        # Map the whole file read-only. access= works on both Windows and Unix;
        # prot= is Unix-only and expects PROT_* constants, not ACCESS_READ.
        self._buf = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        self._regf = RegistryParse.REGFBlock(self._buf, 0, False)
Used like this:
with open(path, "rb") as f:
    r = Registry(f)
    # stuff
xp $ python ~/projects/python-registry/samples/findkey.py NTUSER.DAT.copy0 "upx305w"
Traceback (most recent call last):
File "/home/willi/projects/python-registry/samples/findkey.py", line 24, in
rec(reg.root(), 0, needle)
File "/home/willi/projects/python-registry/samples/findkey.py", line 17, in rec
rec(subkey, depth + 1, needle)
File "/home/willi/projects/python-registry/samples/findkey.py", line 17, in rec
rec(subkey, depth + 1, needle)
File "/home/willi/projects/python-registry/samples/findkey.py", line 17, in rec
rec(subkey, depth + 1, needle)
File "/home/willi/projects/python-registry/samples/findkey.py", line 17, in rec
rec(subkey, depth + 1, needle)
File "/home/willi/projects/python-registry/samples/findkey.py", line 17, in rec
rec(subkey, depth + 1, needle)
File "/home/willi/projects/python-registry/samples/findkey.py", line 17, in rec
rec(subkey, depth + 1, needle)
File "/home/willi/projects/python-registry/samples/findkey.py", line 11, in rec
if needle in str(value.value()):
File "/usr/local/lib/python2.7/dist-packages/Registry/Registry.py", line 139, in value
return self._vkrecord.data()
File "/usr/local/lib/python2.7/dist-packages/Registry/RegistryParse.py", line 697, in data
return db.large_data(data_length)
AttributeError: 'DataRecord' object has no attribute 'large_data'
Windows 7 64-bit OS, SYSTEM hive, value ControlSet001\Control\TimeZoneInformation : TimeZoneKeyName should contain something like "Pacific Standard Time", however python-registry returns a Unicode string of length 1: "P".
Rejistry parses the value just fine.
Encountered this when looking at a key with 600+ subkeys
As reported by @woanware.
data type: 4294901776
or 0xFFFF0010
WIN8\SYSTEM\CCS\Enum\USBStor\XXXXXX\Properties\{83da6326-97a6-4088-9453-a1923f573b29}\0064
The values byte data should be "OD936B116FCBCE01"
I know that the actual value is a FILETIME value stored as byte array. The path comes from this posting:
http://www.swiftforensics.com/2013/11/windows-8-new-registry-artifacts-part-1.html
My parser identifies the value at the offset 0x517E94 (5340820) where as the python-registry library identifies the value at 0x517E7C (5340796).
README.MD doesn't explicitly say one way or the other whether this code only supports Windows NT-style registry files, or if it also supports Windows 95-style registry files.
I found that it doesn't support Windows 95-style files as at a002bb3:
File "[...]/python2.7/site-packages/Registry/RegistryParse.py", line 295, in __init__
raise ParseException("Invalid REGF ID")
Registry.RegistryParse.ParseException: Registry Parse Exception (Invalid REGF ID)
This was raised because the signature wasn't 0x66676572, which appears to be a Windows NT-specific signature based on a quick skim through documentation/WinReg.txt.
It would be nice if this was mentioned explicitly in README.MD.
On Windows hosts, I think the registry files need to be read in binary mode (or at least they do here).
So in the Registry.__init__ method (Registry.py, around line 266), I've amended:
with open(filename) as f:
    self._buf = f.read()
to:
with open(filename, 'rb') as f:
    self._buf = f.read()
Thanks for the really useful module!
We see some registries with circular references, resulting in an infinite loop while trying to raise a RegistryKeyNotFoundException (version https://pypi.python.org/packages/source/p/python-registry/python-registry-1.1.0a.tar.gz#md5=589311e55826174c1d50b0b177cd9d55)
key = root.subkey("Select")
File "/opt/python2.7/lib/python2.7/site-packages/Registry/Registry.py", line 249, in subkey
raise RegistryKeyNotFoundException(self.path() + "\\" + name)
File "/opt/python2.7/lib/python2.7/site-packages/Registry/Registry.py", line 209, in path
return self._nkrecord.path()
File "/opt/python2.7/lib/python2.7/site-packages/Registry/RegistryParse.py", line 1232, in path
name = p.name() + "\\" + name
KeyboardInterrupt
If I add a print ('name', name) at line 1232 I see this:
name CMI-CreateHive{3D971F19-49AB-4000-8D39-A6D9C673D809}
name Windows\CMI-CreateHive{3D971F19-49AB-4000-8D39-A6D9C673D809}
name Microsoft\Windows\CMI-CreateHive{3D971F19-49AB-4000-8D39-A6D9C673D809}
name CMI-CreateHive{3D971F19-49AB-4000-8D39-A6D9C673D809}\Microsoft\Windows\CMI-CreateHive{3D971F19-49AB-4000-8D39-A6D9C673D809}
name Windows\CMI-CreateHive{3D971F19-49AB-4000-8D39-A6D9C673D809}\Microsoft\Windows\CMI-CreateHive{3D971F19-49AB-4000-8D39-A6D9C673D809}
name Microsoft\Windows\CMI-CreateHive{3D971F19-49AB-4000-8D39-A6D9C673D809}\Microsoft\Windows\CMI-CreateHive{3D971F19-49AB-4000-8D39-A6D9C673D809}
name CMI-CreateHive{3D971F19-49AB-4000-8D39-A6D9C673D809}\Microsoft\Windows\CMI-CreateHive{3D971F19-49AB-4000-8D39-A6D9C673D809}\Microsoft\Windows\CMI-CreateHive{3D971F19-49AB-4000-8D39-A6D9C673D809}
name Windows\CMI-CreateHive{3D971F19-49AB-4000-8D39-A6D9C673D809}\Microsoft\Windows\CMI-CreateHive{3D971F19-49AB-4000-8D39-A6D9C673D809}\Microsoft\Windows\CMI-CreateHive{3D971F19-49AB-4000-8D39-A6D9C673D809}
name Microsoft\Windows\CMI-CreateHive{3D971F19-49AB-4000-8D39-A6D9C673D809}\Microsoft\Windows\CMI-CreateHive{3D971F19-49AB-4000-8D39-A6D9C673D809}\Microsoft\Windows\CMI-CreateHive{3D971F19-49AB-4000-8D39-A6D9C673D809}
....
worse and worse until I run out of memory
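One way to guard a parent walk like this is to remember which records have already been visited and stop as soon as one repeats. The sketch below uses a stand-in `Node` class because the real NKRecord needs a hive buffer. In the library itself the record's offset would be a more natural identity than `id()`, but that is an assumption about how a fix could look.

```python
class Node:
    """Stand-in for a key record with a name and a parent link (hypothetical)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


def build_path(node):
    """Join names from the root down to `node`, refusing to follow a parent cycle."""
    seen = set()
    parts = []
    while node is not None:
        if id(node) in seen:
            raise ValueError("circular parent reference at %r" % node.name)
        seen.add(id(node))
        parts.append(node.name)
        node = node.parent
    return "\\".join(reversed(parts))
```

With a check like this, a corrupted hive produces an error the caller can handle instead of an ever-growing path string.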
It would be nice to have this tool available for py3. If compatibility with python >= 2.6 and 3.x is good enough, I can contribute.
As mentioned by @williballenthin in #39: developing a second backend to python-registry that operates over .reg files. So it exposes the familiar RegistryKey/RegistryValue interface, but the underlying data comes from a .reg export.
(For reference this is kind of the inverse of #4)
I think it is worth having a separate issue open for this. (no hurry just wanted it documented)
At least it can be used to discuss how this could be implemented to best fit in the project.
The best I can come up with right now is a separate module that Registry imports, and uses depending on the contents of the file?
NKRecord returns the classname as raw bytes, rather than a Python string that was decoded from UTF-16LE.
VKRecord.has_ascii_name() may print a debugging message to stdout when called.
It's the string "ascii name".
received via email...
Good morning,
I'm not a member of github so hopefully this is an acceptable way to
submit a possible bug.
I am getting the following error when trying to use Registry:
Z:\WIP>"c:\Program Files\skeletool\skeletool.exe" -i \WIP\images\GSI
WIP\output
Processing drive: F:
Processing software hive: F:\WINDOWS\system32\config\SOFTWARE
Traceback (most recent call last):
File "", line 169, in
File "", line 75, in process_image
File "skeletool\build\pyi.win32\skeletool\outPYZ1.pyz/dfir_registr
try", line 14, in system_metadata
File "skeletool\build\pyi.win32\skeletool\outPYZ1.pyz/Registry.Reg
285, in open
File "skeletool\build\pyi.win32\skeletool\outPYZ1.pyz/Registry.Reg
252, in find_key
File "skeletool\build\pyi.win32\skeletool\outPYZ1.pyz/Registry.Reg
215, in subkey
File "skeletool\build\pyi.win32\skeletool\outPYZ1.pyz/Registry.Reg
line 1009, in subkey_list
File "skeletool\build\pyi.win32\skeletool\outPYZ1.pyz/Registry.Reg
line 285, in init
File "skeletool\build\pyi.win32\skeletool\outPYZ1.pyz/Registry.Reg
line 161, in unpack_int
struct.error: unpack_from requires a buffer of at least 4 bytes
This doesn't occur all the time, and on one system, it only occurs
after I "compile" the code with PyInstaller.
Python 2.7, Windows XP and Windows 7.
If there is any other information I can provide, please let me know.
Hi,
I am somewhat confused.
I downloaded the NTUSER.DAT file and put it in hives sub-directory.
Now I am trying to do:
f = open("hives/NTUSER.DAT", "rb")
reg = Registry.Registry(f)
key = reg.open("SOFTWARE\Microsoft\Windows\Current Version\Run")
This gives me:
Traceback (most recent call last):
File "C:\Users\c5211757\Documents\Programming\pythonRegistry\getRegValue1.py", line 10, in
key = reg.open("SOFTWARE\Microsoft\Windows\Current Version\Run")
File "C:\Users\c5211757\Documents\Programming\pythonRegistry\Registry\Registry.py", line 290, in open
return RegistryKey(self._regf.first_key()).find_key(path)
File "C:\Users\c5211757\Documents\Programming\pythonRegistry\Registry\Registry.py", line 254, in find_key
return self.subkey(immediate).find_key(future)
File "C:\Users\c5211757\Documents\Programming\pythonRegistry\Registry\Registry.py", line 217, in subkey
for k in self._nkrecord.subkey_list().keys():
File "C:\Users\c5211757\Documents\Programming\pythonRegistry\Registry\RegistryParse.py", line 900, in keys
yield NKRecord(self._buf, d.data_offset(), self)
File "C:\Users\c5211757\Documents\Programming\pythonRegistry\Registry\RegistryParse.py", line 999, in init
raise ParseException("Invalid NK Record ID")
Registry.RegistryParse.ParseException: Registry Parse Exception(Invalid NK Record ID)
What am I doing wrong?
Open this in a new tab or window. Run the action from the global menu.
Came across this small bug recently. Apologies, but I don't have any test data available.
Registry\RegistryParse.py", line 1227, in subkey_list
raise ParseException("Subkey list with type %s encountered, but not yet supported." % (id_))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xff in position 0: ordinal not in range(128)
Hey Willi! I just found a bug in amcache.py. I was running it on an Amcache.hve file extracted from a Windows 7 host (I just found out that MS apparently backported whatever change creates this hive to Win7), and I get an error (only) when running it with the -t option.
$ python amcache.py -t Amcache.hve
Traceback (most recent call last):
File "amcache.py", line 225, in <module>
main(argv=sys.argv)
File "amcache.py", line 202, in main
ts = getattr(e, t)
AttributeError: 'ExecutionEntry' object has no attribute 'first_run'
I'm guessing for some reason certain entries don't have one of the timestamp values set... Then again, the first_run time should be the last modified time on the key, right?... Weird.
The RegistryParse classes currently provide an offset() method, however there is no easy way to access the raw data contained in a structure. It would be useful to have a raw_data() method for (at least) each of the following:
The sample findkey.py should be able to query Registry keys, values, and paths by regular expression, rather than by simple exact text matches. I propose adding an additional command-line flag, -r, that specifies that the query should be interpreted as a regular expression. For instance:
findkey.py -r SYSTEM.hive "[Bb]eep"
This would require adding an additional import for re.
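The matching logic behind such a flag could be factored into a single function that returns a predicate, so the recursive walk does not need to know which mode is active. This is a sketch: `make_matcher` and its parameters are hypothetical and do not come from findkey.py.

```python
import re


def make_matcher(needle, use_regex=False, case_insensitive=False):
    """Return a function text -> bool for plain-substring or regex search."""
    if use_regex:
        flags = re.IGNORECASE if case_insensitive else 0
        pattern = re.compile(needle, flags)
        return lambda text: pattern.search(text) is not None
    if case_insensitive:
        lowered = needle.lower()
        return lambda text: lowered in text.lower()
    return lambda text: needle in text
```

Compiling the pattern once up front also reports an invalid expression before the hive walk starts.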
The .reg format is the format used by Regedit and other tools when Windows Registry keys are exported. You can see an example here: http://www.byalexv.co.uk/RegFormat.html . It would be nice to have a reg_format() method for the RegistryKey class which returns a string containing the corresponding .reg format content.
A compare_to(.reg_format_string) method may also be useful, although maybe not as appropriate in the core class.
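As a starting point for discussion, the value lines of a .reg export could be produced along these lines. This is a sketch that covers only three value kinds, and everything in it is assumed: the function names, the "sz"/"dword"/"binary" kind labels, and the choice to treat "(default)" as the unnamed value.

```python
def reg_escape(text):
    """Escape backslashes and double quotes as .reg string syntax requires."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def reg_format_value(name, kind, data):
    """Format one value as a .reg line; kind is 'sz', 'dword', or 'binary'."""
    lhs = "@" if name in ("", "(default)") else '"%s"' % reg_escape(name)
    if kind == "sz":
        rhs = '"%s"' % reg_escape(data)
    elif kind == "dword":
        rhs = "dword:%08x" % data
    elif kind == "binary":
        rhs = "hex:" + ",".join("%02x" % b for b in data)
    else:
        raise ValueError("unsupported value kind: %r" % kind)
    return lhs + "=" + rhs


def reg_format_key(path, values):
    """Format a key header followed by its values, using CRLF line endings."""
    lines = ["[%s]" % path]
    lines.extend(reg_format_value(*v) for v in values)
    return "\r\n".join(lines) + "\r\n"
```

A full implementation would also need the file header line, the remaining value types, and line wrapping for long hex data.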
REGFBlock returns the hive name as raw bytes, rather than a Python string that was decoded from UTF-16LE.
Hello,
It fails while decoding value.value() ... (Registry hive taken from Windows 7)
b'\x00\x00\xd1w\x03\x00\x00\x00\xec\xa8\xcdw\x089l\x00\x88\xef\xd1\x01\xa9\xdb\xcfw\x089l\x00\x8cEl\x00\x96\x00\x00\x00l\xf2\xd1\x01\x00\x00\x00\x00\x00\x00\x00\x00\x9e\x00\x00\x00\xf0\xf1\xd1\x01{k\xcfw\xd0\xef\xd1\x01\x90\xf2\xd1\x01\x00\x00\x00\x00l\xf2\xd1\x01\x80El\x00\xd7\xa8\xcdw.\x00\x00\x00\x01\x00\x00\x00@\x96l\x00\xcc\xef\xd1\x01p\xe7l\x00\x01\x00\x00\x00\x88\xe8l\x00\xdc\xef\xd1\x01\x00\x00\x00\x00\x98\xf5l\x00\x9e\x00\xa0\x00\x84El\x00\x90\xf2\xd1\x01\xf0\xef\xd1\x01\xad\x14\xf8u\x00\x00f\x00\x00\x00\x00\x00\x90\xe8l\x00\x04\xf0\xd1\x01Z\x12\x19v\x00\x00f\x00\x00\x00\x00\x00\x90\xe8l\x00\xa1\xfb\xcbw\xb0\x0f2v\x84\x02\x00\x00\x00\x00\x00\x00 \xf0\xd1\x01\x8f!\x19v\x84\x02\x00\x00\xbc\xf2\xd1\x01\x16"\x19v\x80\'\x19v\xae|0\xf1(2k\x00(2k\x004\x03\xe8u\x00\x00\x00\x00(\xf0\xd1\x01\x00\x00\x00\x00\x01\x00\x00\x00\xb8\x96l\x00r\xca\xf8\x86\x00\x00\x00\x00\x00\x00\x00\x00\x90\xe8l\x00\x00\x00\x00\x008\xe8l\x00\xc0\xf0\xd1\x01\x8c\x98\xddu\x00\x00\x00\x00\x01\x00\x00\x00\xff\xff\xff\xff\xf0\xdel\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00@\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xe8\xafm\x00\xf0\xdel\x00\x01\x00\x00\x00\xa44ak\x00\x00\x00\x00g\x90gw\x08\xf5\xd1\x01[\xeagw\xf8\xf4\xd1\x01U\x00S\x00B\x00\\\x00V\x00I\x00D\x00_\x008\x000\x008\x007\x00&\x00P\x00I\x00D\x00_\x000\x007\x00D\x00C\x00\\\x005\x00&\x00]\x01\xccw\x00\x00f\x00\xf88l\x00\x00\x00\x00\x00\x07\x00\x00\x07\x00\x00\x00\x00\xa89l\x00\xb81k\x00\xd0\x96m\x008\x00\x00\x00\x00\x00f\x00\x00\x00\x00\x10\xf88l\x00\x14\xf2\xd1\x01\xce8\xcdw8\x01f\x00\xaa8\xcdw\x11\xa2\nv\x00\x00\x00\x00\x00\x00f\x00\x009l\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x1b\x00\x00\x00\x00\x00'
Traceback (most recent call last):
...
self.content = Value.value()
File "/frameworks/virtualenvs/.../Registry.py", line 156, in value
return self._vkrecord.data()
File "/frameworks/virtualenvs/.../lib/python3.6/site-packages/Registry/RegistryParse.py", line 748, in data
s = decode_utf16le(s)
File "/frameworks/virtualenvs/.../lib/python3.6/site-packages/Registry/RegistryParse.py", line 555, in decode_utf16le
s = s.decode("utf16")
UnicodeDecodeError: 'utf-16-le' codec can't decode bytes in position 20-21: illegal UTF-16 surrogate
Is it expected?
Thank you in advance.
P.S. python-registry is awesome!
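Until the library handles such data itself, a caller who only needs a best-effort string can decode the raw bytes leniently. This is a sketch: `decode_utf16le_lenient` is a hypothetical helper, and cutting the text at the first NUL is an assumption that suits NUL-terminated string values but not multi-string data.

```python
def decode_utf16le_lenient(buf: bytes) -> str:
    """Decode UTF-16LE, replacing invalid sequences and stopping at the first NUL."""
    text = buf.decode("utf-16-le", errors="replace")
    # Keep only the part before the terminator, if one is present.
    return text.partition("\x00")[0]
```

Invalid surrogates come out as U+FFFD instead of raising, so a value like the one above yields a mostly unreadable string rather than a traceback. Reading such a value as raw bytes is likely the better choice when it is really binary data.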
I'm a basic Python user searching for a way to find and replace "old" with "new" everywhere in the registry (in names, keys, and values, even in protected keys).
It would be nice to add a basic explanation to your documentation for beginner users.
(Python 3.6, Windows 7).
$ python3.4 samples/reg_export.py BCD BCD System
Traceback (most recent call last):
File "samples/reg_export.py", line 158, in <module>
main(*sys.argv[1:])
File "samples/reg_export.py", line 149, in main
sys.stdout.write(reg_format_header())
TypeError: must be str, not bytes
The reason for using Python 3.4 was an enum import error with Python 2.7:
File "Registry/Registry.py", line 23, in <module>
from enum import Enum
ImportError: No module named enum
Will try to fix the first issue and make a PR.
When processing a Registry key in the SAM hive, python-registry may throw an exception:
"Registry.RegistryParse.UnknownTypeException: Unknown Type Exception(Unknown VK Record type 0x3e9 at 0x36dc)"
This exception will not be handled by python-registry, and should be caught by a user's program. Here is the reasoning:
The key is mentioned on page 4 of the following paper (the author did
some of the best original research in understanding the Registry):
http://sentinelchicken.com/data/TheWindowsNTRegistryFileFormat.pdf
Basically, Microsoft and some third parties hijack the TYPE field of one
of the Registry data structures and store data in specific instances,
instead. One such case is storing the user ID in the SAM. The
python-registry code is failing because it does not account for this
arbitrary data in the TYPE field. Fortunately, you can still access the
ID in the current version.
A RegistryValue is backed by the lower level VKRecord structure, which
you can access as RegistryValue._vkrecord. You can interpret the integer
result of the method VKRecord.data_type() as the data of the Registry
value. You should only do this in the few specific cases where the file
format is broken. I don't know if this is documented anywhere,
unfortunately.
Very explicitly, the user ID key stored in the SAM can be accessed as
follows:
k = r.open(r"\SAM\Domains\Account\Users\Names\Administrator")
userid = k.value("(default)")._vkrecord.data_type()
This code snippet accesses the VKRecord that backs the RegistryValue for
the default value of the Administrator key. userid is an integer
read directly from the binary data, from the field that usually contains
the data type of the Registry value. Instead, the field is overloaded
with the alternate meaning.
When viewing a registry value, you should be able to right click, and save to file...
Value data offset calculation may be incorrect. The condition checks for a value size greater than 0x800000000, however this is larger than the range of a DWORD. The correct condition is probably greater than 0x80000000.
def data_offset(self):
    """
    Get the offset to the raw data associated with this value.
    """
    if self.data_length() < 5 or self.data_length() >= 0x80000000:
        return self.absolute_offset(0x8)
    else:
        return self.abs_offset_from_hbin_offset(self.unpack_dword(0x8))
https://github.com/williballenthin/python-registry/blob/master/Registry/RegistryParse.py#L628
def long_name_size(self):
    if self._off_long_name_size:
        return self._off_long_name_size  <--------------------
    elif self._off_long_name:
        return len(self.long_name()) + 2
    else:
        return 0
Use this issue to track proposed changes to the python-registry API for the 2.x release.
Details to follow.