recuperabit's Introduction

RecuperaBit

A tool that attempts to reconstruct file system structures and recover files. Currently it supports only NTFS.

RecuperaBit attempts reconstruction of the directory structure regardless of:

  • missing partition table
  • unknown partition boundaries
  • partially-overwritten metadata
  • quick format

You can get more information about the reconstruction algorithms and the architecture used in RecuperaBit by reading my MSc thesis or checking out the slides.

Usage

usage: main.py [-h] [-s SAVEFILE] [-w] [-o OUTPUTDIR] path

Reconstruct the directory structure of possibly damaged filesystems.

positional arguments:
  path                  path to the disk image

optional arguments:
  -h, --help            show this help message and exit
  -s SAVEFILE, --savefile SAVEFILE
                        path of the scan save file
  -w, --overwrite       force overwrite of the save file
  -o OUTPUTDIR, --outputdir OUTPUTDIR
                        directory for restored contents and output files

The main argument is the path to a bitstream image of a disk or partition. RecuperaBit automatically determines the sectors from which partitions start.
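
As a rough illustration of how the options in the usage text fit together, the following is a minimal argparse sketch that mirrors it. It is not taken from main.py: the function name build_parser is hypothetical, and the default output directory is the recuperabit_output name mentioned in this README.

```python
import argparse


def build_parser():
    """Build a parser mirroring the usage text shown above (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="main.py",
        description="Reconstruct the directory structure of possibly "
                    "damaged filesystems.",
    )
    parser.add_argument("path", help="path to the disk image")
    parser.add_argument("-s", "--savefile", help="path of the scan save file")
    parser.add_argument("-w", "--overwrite", action="store_true",
                        help="force overwrite of the save file")
    # Assumption: the default matches the directory name given in this README.
    parser.add_argument("-o", "--outputdir", default="recuperabit_output",
                        help="directory for restored contents and output files")
    return parser


args = build_parser().parse_args(["-s", "scan.save", "-o", "restored", "disk.img"])
print(args.path, args.savefile, args.overwrite, args.outputdir)
```

With these arguments, overwrite stays False because -w was not given.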

RecuperaBit does not modify the disk image, but it does read some parts of it multiple times during execution. It should also work on real devices such as /dev/sda, but this is not advised for damaged drives: RecuperaBit might worsen the situation by "stressing" a damaged drive, or it could crash due to an I/O error.

Optionally, a save file can be specified with -s. On the first run, the results are saved to this file once the scanning process completes. On later runs, the file is read so that only the interesting sectors are analyzed, which speeds up the loading phase.

Overwriting the save file can be forced with -w.
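
The save file behavior described above can be sketched roughly as follows. This is a simplified illustration, not the code in main.py: load_or_scan and the scan callback are hypothetical names, and the real tool re-reads the interesting sectors after loading rather than returning the pickled data directly.

```python
import os
import pickle


def load_or_scan(savefile, overwrite, scan):
    """Return scan results, reusing `savefile` unless overwriting is forced.

    `scan` is a hypothetical zero-argument callable standing in for the
    full (slow) scan of the disk image.
    """
    if savefile and os.path.exists(savefile) and not overwrite:
        # Later runs: load the saved results instead of scanning again.
        with open(savefile, "rb") as handle:
            return pickle.load(handle)
    results = scan()
    if savefile:
        # First run (or -w): store the results for next time.
        with open(savefile, "wb") as handle:
            pickle.dump(results, handle)
    return results
```

In other words, the save file is created by the first run that is given -s, and -w makes a later run scan again and replace it.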

RecuperaBit includes a small command line that allows the user to recover files and export the contents of a partition in CSV or body file format. These are exported in the directory specified by -o (or recuperabit_output).

Limitation

Currently RecuperaBit does not work with compressed files on NTFS file systems. If you have deep knowledge of the inner workings of file compression on NTFS, your help would be much appreciated, as the available documentation on the topic is quite sparse.

PyPy

RecuperaBit can be run with the standard CPython implementation, but it runs faster with the PyPy interpreter and its JIT compiler:

pypy3 main.py /path/to/disk.img

Recovery of File Contents

Files can be restored one at a time or recursively, starting from a directory. After the scanning process has completed, you can check the list of partitions that can be recovered by issuing the following command at the prompt:

recoverable

Each line shows information about a partition. Let's consider the following output example:

Partition #0 -> Partition (NTFS, 15.00 MB, 11 files, Recoverable, Offset: 2048, Offset (b): 1048576, Sec/Clus: 8, MFT offset: 2080, MFT mirror offset: 17400)
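
To make the fields in this line easier to read, here is a small parsing sketch. The regular expression and the function name are illustrative and not part of RecuperaBit. The 512-byte sector size is an assumption, chosen because it is consistent with the two offsets in the example (2048 sectors and 1048576 bytes).

```python
import re

SECTOR_SIZE = 512  # assumption: 512-byte sectors, consistent with the example

LINE = ("Partition #0 -> Partition (NTFS, 15.00 MB, 11 files, Recoverable, "
        "Offset: 2048, Offset (b): 1048576, Sec/Clus: 8, MFT offset: 2080, "
        "MFT mirror offset: 17400)")

PATTERN = re.compile(
    r"Partition #(?P<index>\d+) -> Partition \((?P<fs>\w+), .*?"
    r"Offset: (?P<offset>\d+), Offset \(b\): (?P<offset_bytes>\d+), "
    r"Sec/Clus: (?P<sec_per_clus>\d+)"
)


def parse_partition_line(line):
    """Extract the numeric fields of one `recoverable` output line."""
    match = PATTERN.search(line)
    if match is None:
        return None
    info = match.groupdict()
    for key in ("index", "offset", "offset_bytes", "sec_per_clus"):
        info[key] = int(info[key])
    return info


info = parse_partition_line(LINE)
# The byte offset is the sector offset multiplied by the sector size.
print(info["offset"] * SECTOR_SIZE == info["offset_bytes"])
# Cluster size in bytes: sectors per cluster times the sector size.
print(info["sec_per_clus"] * SECTOR_SIZE)
```

So "Offset" is expressed in sectors, "Offset (b)" is the same position in bytes, and "Sec/Clus: 8" corresponds to 4096-byte clusters under the 512-byte assumption.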

If you want to recover files starting from a specific directory, you can either print the tree on screen with the tree command (very verbose for large drives) or you can export a CSV list of files (see help for details).

If you would rather extract all files from the Root and the Lost Files nodes, you need to know the identifier of the root directory, which depends on the file system type. The following are the identifiers for the file systems supported by RecuperaBit:

File System Type    Root Id
NTFS                5

The id for Lost Files is -1 for every file system.

Therefore, to restore Partition #0 in our example, you need to run:

restore 0 5
restore 0 -1

The files will be saved inside the output directory specified by -o.
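
When many partitions need to be restored, typing these commands by hand gets tedious. The following sketch only generates the command text for a list of partition numbers, so that it can be pasted at the prompt. The helper restore_commands is hypothetical and not a RecuperaBit feature.

```python
ROOT_ID = 5          # NTFS root directory id, from the table above
LOST_FILES_ID = -1   # id of the Lost Files node


def restore_commands(partition_ids):
    """Return the restore commands for Root and Lost Files of each partition."""
    commands = []
    for pid in partition_ids:
        commands.append(f"restore {pid} {ROOT_ID}")
        commands.append(f"restore {pid} {LOST_FILES_ID}")
    return commands


print("\n".join(restore_commands([0])))
```

For Partition #0 this prints the same two commands shown above.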

License

This software is released under the GNU GPLv3. See LICENSE for more details.

recuperabit's People

Contributors

artkpv, davispuh, ddovod, gmbnomis, h4r0, hallihalloschatz, lazza, maxqia, nicolascarpi, peter-fayez95, pwde, rober710, slavanap, whard

recuperabit's Issues

Savefile not flushed after final write

I ran into an OOM kill, after which the save file was broken (unexpected EOF), which defeats the purpose of having a save file.

I suggest closing (or flushing) the savefile before continuing to avoid this issue. Diff included.

index 5cea733..e6ed491 100755
--- a/main.py
+++ b/main.py
@@ -333,6 +333,7 @@ def main():
         logging.info('Saving results to %s', args.savefile)
         savefile = open(args.savefile, 'wb')
         pickle.dump(interesting, savefile)
+        savefile.close()
 
     # Ask for partitions
     parts = {}
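
A stricter variant of the same idea, sketched here as an assumption rather than as the project's fix, is to write the pickle to a temporary file, flush and fsync it, and only then rename it over the real save file. This swaps in an atomic-replace technique in place of the plain close() from the diff. The function name dump_atomically is hypothetical.

```python
import os
import pickle
import tempfile


def dump_atomically(obj, path):
    """Pickle `obj` to `path` so that a crash never leaves a truncated file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the target so the rename stays
    # on the same file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as handle:
            pickle.dump(obj, handle)
            handle.flush()
            os.fsync(handle.fileno())  # push the data to disk before renaming
        os.replace(tmp_path, path)     # atomic on POSIX and Windows
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

With this approach the save file either contains the complete previous contents or the complete new contents, even if the process is killed mid-write.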

restore everything possible?

I am trying to recover a severely damaged NTFS disk and created an image with ddrescue. I first made an attempt with photorec/testdisk, which showed me 4 partitions. After that I tried RecuperaBit, and after the initial scan thousands of partitions are shown when typing recoverable. I tried restore 0 5 and restore 0 -1 as shown in the docs, and luckily the correct filenames were reconstructed. Doing that for all those thousands of "partitions" would be rather tedious, so my question is: is it possible to just say "recover anything", like restore * ?

Add support for Ext4

Dewald and Seufert describe an approach that can be used to reconstruct Ext4 file systems even if the metadata is damaged: Andreas Dewald, Sabine Seufert, AFEIC: Advanced forensic Ext4 inode carving, Digital Investigation, Volume 20, Supplement, March 2017, Pages S83-S91, ISSN 1742-2876.

Their approach seems to have some similarities to the NTFS module of RecuperaBit:

[We] develop a novel approach to identify files in an Ext4 file system even in cases where the superblock is corrupted or overwritten, e.g. because of a re-formatting of the volume. Our approach applies heuristic search patterns for utilizing methods of file carving and combines them with metadata analysis.

The paper (full text) should be carefully studied to verify if this approach can be turned into a RecuperaBit plug-in.

UnicodeEncodeError: 'ascii' codec can't encode character u'\xdf' in position 35: ordinal not in range(128)

Hi,

I want to recover files from a 300 GB drive which seems to have been quick-formatted under Windows 10.

Some comments on an excerpt of the attached console output:

INFO:root:13786 partitions found.
I wouldn't expect that number of partitions, but maybe this is how the program works.

Partition #10869 -> Partition (NTFS, 297.26 GB, 288715 files, Recoverable, Offset: 1730560, Offset (b): 886046720, Sec/Clus: 8, MFT offset: 8022016, MFT mirror offset: 1730576)
Among the recoverable partitions, this one sounds promising.

When I try to recover files from that partition (see last 20 lines of attached file), I get the error message mentioned as subject and the program exits.

Is there a chance to get this fixed?
How can I assist further?

Thanks in advance.

Bye.
recuperabit.zip
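
For context, this error message is what Python raises whenever text containing a non-ASCII character (here U+00DF, "ß") is encoded with the ASCII codec. The traceback is not shown in this report, so where this happens inside RecuperaBit is not known. The sketch below only reproduces the general failure and two common ways around it, in Python 3 syntax and with a hypothetical file name.

```python
name = "Stra\u00dfe.txt"  # hypothetical file name containing U+00DF

try:
    name.encode("ascii")
except UnicodeEncodeError as error:
    # Same failure as in the issue title.
    print("ascii failed:", error.reason)

# Encoding explicitly as UTF-8 keeps the character intact.
encoded = name.encode("utf-8")
print(encoded)

# Alternatively, replace unencodable characters instead of crashing.
safe = name.encode("ascii", errors="replace")
print(safe)
```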

What are _recuperabit_content files?

Thanks for the very useful program!

I just have a minor support question related to the recovered files. I'm getting a few files whose recovered path is, for example:

./Bilder/DSC_0016.JPG_recuperabit_content

These files actually are shown in my Mac's Finder as follows:

DSC_0016.JPG/
    DSC_0016.JPG_recuperabit_content

That is, a folder called like the file, and the actual file inside with the content suffix. I can rename the file from DSC_0016.JPG_recuperabit_content to DSC_0016.JPG and move it up one folder, then everything looks fine, and the image is viewable.

Is that something the program cannot auto-detect for some reason?

(It's not an issue for me to go through these files manually; I'm mostly curious!)
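
The manual fix described above (rename the file and move it up one folder) can be scripted. The following is a minimal sketch that assumes exactly the layout shown in this post: a folder named like the file, containing a single entry with the _recuperabit_content suffix. The function name is hypothetical, and folders that contain anything else are left untouched.

```python
from pathlib import Path

SUFFIX = "_recuperabit_content"


def flatten_content_files(root):
    """Replace each `<name>/<name>_recuperabit_content` folder with the file."""
    fixed = []
    # Collect the matches first, because the tree is modified while looping.
    for content in sorted(Path(root).rglob("*" + SUFFIX)):
        folder = content.parent
        if content.name != folder.name + SUFFIX:
            continue  # not the layout described above
        if len(list(folder.iterdir())) != 1:
            continue  # the folder holds other entries, leave it alone
        temp = folder.parent / (folder.name + ".tmp_move")
        if temp.exists():
            continue  # avoid clobbering an unrelated file
        content.rename(temp)   # move the file next to its folder
        folder.rmdir()         # the folder is now empty
        temp.rename(folder)    # give the file the original name
        fixed.append(folder)
    return fixed
```

Running it on a copy of the recovered tree first is a sensible precaution, since the renames are done in place.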

Getting a memory error at the end of the process

Hi, I had a problem similar to the one described here: http://stackoverflow.com/questions/37539950/corrupt-ntfs-folder-not-accessible-from-either-windows-mas-osx-or-linux
The hard drive that had the problem is an SSD, and it had my Windows install on it. I can see all folders when I mount the drive, with the exception of two, "c:\Users" being one of them. I created a partition and installed Linux on another HD.
Since the drive doesn't seem to be damaged, I ran the application directly on it. The first time it ran, it crashed midway; I don't know what happened, the terminal just froze, and no savefile was created. I have run the application 4 times since then. The first of those runs took a long time (around 20 hours) and created a savefile. The other ones ran much faster, in around 2 hours. But they all end the same way, with the message:

INFO:root:Found NTFS boot sector at sector 234229752
INFO:root:First scan completed
INFO:root:Parsing MFT entries
INFO:root:Parsing INDX records
INFO:root:Reading boot sectors
DEBUG:root:Dropping bogus NTFS partition with MFT position 27585690 generated by MFT mirror of partition at offset 27580528
DEBUG:root:Dropping bogus NTFS partition with MFT position 65746898 generated by MFT mirror of partition at offset 65741736
DEBUG:root:Dropping bogus NTFS partition with MFT position 112 generated by MFT mirror of partition at offset 0
INFO:root:Finding partition geometry
INFO:root:Finalizing MFT reconstruction of partition at offset 27580528
INFO:root:Adding extra attributes from $ATTRIBUTE_LIST
INFO:root:Adding ghost entries from $INDEX_ALLOCATION
INFO:root:Finalizing MFT reconstruction of partition at offset 65741736
INFO:root:Adding extra attributes from $ATTRIBUTE_LIST
INFO:root:Adding ghost entries from $INDEX_ALLOCATION
INFO:root:Finalizing MFT reconstruction of partition at offset 0
INFO:root:Adding extra attributes from $ATTRIBUTE_LIST

But then a series of messages appear saying:
WARNING:root:Cannot read sector(s). Filling with 0x00. Offset: 119785142 Size: 2 Bsize: 512

After a bunch of these, the following appears:

Traceback (most recent call last):
  File "main.py", line 328, in <module>
    main()
  File "main.py", line 310, in main
    parts.update(scanner.get_partitions())
  File "/home/leandro/RecuperaBit-master/recuperabit/fs/ntfs.py", line 841, in get_partitions
    self.finalize_reconstruction(part)
  File "/home/leandro/RecuperaBit-master/recuperabit/fs/ntfs.py", line 674, in finalize_reconstruction
    self.add_from_attribute_list(parsed, part, node.offset)
  File "/home/leandro/RecuperaBit-master/recuperabit/fs/ntfs.py", line 629, in add_from_attribute_list
    _integrate_attribute_list(parsed, part, image)
  File "/home/leandro/RecuperaBit-master/recuperabit/fs/ntfs.py", line 221, in _integrate_attribute_list
    entries += attribute_list_parser(dump)
  File "/home/leandro/RecuperaBit-master/recuperabit/fs/ntfs_fmt.py", line 118, in attribute_list_parser
    content.append(decoded)
MemoryError

Now I'm lost on what to do, and what could be causing the problem. And when I try to run the "recoverable" command it just says "command not found", I'm assuming because the program crashed...

Merge functionality gives errors on restore and seems to recover unrelated files

I tried using the new merge functionality on the 5 partitions I referenced in issue #18, but the recovery wasn't successful.
Here is the output:
https://pastebin.com/sns4t84A
Notice first how the root doesn't contain any files. In the individual partitions I had recovered before, some of them had files in the root directory, therefore I would assume that at least those files would end up in the root once I merged these partitions.

As for the LostFiles, it has very few files and none of them were part of the original partition.
In fact from the bodyfile, I noticed that their timestamps were from before the original partition was even created.

Here are the exact commands I executed:

merge 312 451
merge 312 283
merge 312 248
merge 312 1007

Then I tried to restore partition #312.
After the merge, it was listed like this when issuing the recoverable command:

Partition #312 -> Partition (NTFS, ??? b, 576538 files, Recoverable, Offset: 264192, Offset (b): 135266304, Sec/Clus: 8, MFT offset: 4287141256, MFT mirror offset: None)

Video file that was recovered will not open.

Okay, so I'm trying to recover files from a Windows 10 partition on a 1 TB WD Blue hard drive. After running the command "recuperabit /dev/sdb3 -o Recovery -s Windows10.save" it asks me to proceed, so I say yes, and it begins the scanning process. After some time of scanning, though, it all seems to come to a halt and never continues. It doesn't prompt me to enter another command; the scan just stops with no prompt. If I do try to enter another command like "recover", nothing happens. Please help me figure out what to do. I absolutely need to recover some files.

ERROR:root:Cannot handle multiple attribute

Hi,
I was trying to use your application to recover deleted files/directories on a 320 GB hard disk. When I ran the main.py command, it failed with the error below. Is there something I am missing?

DEBUG:root:Found MATCH in positions set([30803328]) with weight 6330 (73.6132108385%)
INFO:root:Finalizing MFT reconstruction of partition at offset 30801920
INFO:root:Adding extra attributes from $ATTRIBUTE_LIST
ERROR:root:Cannot handle multiple attribute $VOLUME_INFORMATION
Traceback (most recent call last):
  File "./RecuperaBit-master/main.py", line 385, in <module>
    main()
  File "./RecuperaBit-master/main.py", line 367, in main
    parts.update(scanner.get_partitions())
  File "/media/ubuntu/d807d211-57d8-47be-8296-72bd6bba7d0d/paula/RecuperaBit-master/recuperabit/fs/ntfs.py", line 847, in get_partitions
    self.finalize_reconstruction(part)
  File "/media/ubuntu/d807d211-57d8-47be-8296-72bd6bba7d0d/paula/RecuperaBit-master/recuperabit/fs/ntfs.py", line 677, in finalize_reconstruction
    self.add_from_attribute_list(parsed, part, node.offset)
  File "/media/ubuntu/d807d211-57d8-47be-8296-72bd6bba7d0d/paula/RecuperaBit-master/recuperabit/fs/ntfs.py", line 632, in add_from_attribute_list
    _integrate_attribute_list(parsed, part, image)
  File "/media/ubuntu/d807d211-57d8-47be-8296-72bd6bba7d0d/paula/RecuperaBit-master/recuperabit/fs/ntfs.py", line 250, in _integrate_attribute_list
    child_parsed = parse_file_record(dump)
  File "/media/ubuntu/d807d211-57d8-47be-8296-72bd6bba7d0d/paula/RecuperaBit-master/recuperabit/fs/ntfs.py", line 150, in parse_file_record
    attributes = _attributes_reader(entry, header['off_first'])
  File "/media/ubuntu/d807d211-57d8-47be-8296-72bd6bba7d0d/paula/RecuperaBit-master/recuperabit/fs/ntfs.py", line 132, in _attributes_reader
    raise NotImplementedError
NotImplementedError

No len()

The following is from rev: f03f5e9

I am using Python 2.7. Do I need an older version?

INFO:root:Found NTFS boot sector at sector 7813967871
INFO:root:First scan completed
INFO:root:Parsing MFT entries
INFO:root:Parsing INDX records
INFO:root:Reading boot sectors
3230662656 -> 
Traceback (most recent call last):
  File "./main.py", line 385, in <module>
    main()
  File "./main.py", line 367, in main
    parts.update(scanner.get_partitions())
  File "/home/swango/dev/RecuperaBit/recuperabit/fs/ntfs.py", line 782, in get_partitions
    print address, '->', len(partitioned_files[address])
TypeError: object of type 'NTFSPartition' has no len()

Partition split in 5 by RecuperaBit? Most recovered files broken.

Hi, I'm trying to recover data from a 3TB WD RED drive (WD30EFRX).
The drive itself is healthy and working properly. What happened is that I accidentally wrote an .iso to it using dd. The iso is 4GB.
This is the original partition table, obtained through fdisk -l before the accident happened and the table was destroyed:

Device          Start        End    Sectors  Size Type
/dev/xvdj1         34     262177     262144  128M BIOS boot
/dev/xvdj2     264192 5848213503 5847949312  2.7T Microsoft basic data
/dev/xvdj3 5848215552 5860532223   12316672  5.9G Linux filesystem

This is after the dd accident:

Device     Boot Start     End Sectors  Size Id Type
/dev/sde1  *        0 8294399 8294400    4G  0 Empty
/dev/sde2         968   63923   62956 30,8M ef EFI (FAT-12/16/32)

so the first 4GB of the disk got overwritten.
The MFT was probably completely overwritten by the 4GB iso.
This is why I chose to try your tool after reading the article you wrote about it.

I am interested in recovering the second partition (2.7 TB). This is not a Windows partition, by the way; it's just file storage. The BIOS boot partition is a leftover from an old Windows installation and has no meaning.

Originally there were 83 GB of free space on this partition, so there are about 2.6 TB of files to recover.

After the analysis these are the recoverable partitions I get:
https://pastebin.com/1kWJGuZL
As you can see there are 5 partitions that start at the same point as the original partition (264192) but they all end long before the original partition, and all at the same spot (Offset (b): 135266304)

Partition #248 -> Partition (NTFS, ??? b, 193571 files, Recoverable, Offset: 264192, Offset (b): 135266304, Sec/Clus: 8, MFT offset: 2698650792, MFT mirror offset: None)
Partition #283 -> Partition (NTFS, ??? b, 114301 files, Recoverable, Offset: 264192, Offset (b): 135266304, Sec/Clus: 8, MFT offset: 3418165840, MFT mirror offset: None)
Partition #312 -> Partition (NTFS, ??? b, 391342 files, Recoverable, Offset: 264192, Offset (b): 135266304, Sec/Clus: 8, MFT offset: 4287141256, MFT mirror offset: None)
Partition #451 -> Partition (NTFS, ??? b, 71413 files, Recoverable, Offset: 264192, Offset (b): 135266304, Sec/Clus: 8, MFT offset: 614360032, MFT mirror offset: None)
Partition #1007 -> Partition (NTFS, ??? b, 207562 files, Recoverable, Offset: 264192, Offset (b): 135266304, Sec/Clus: 8, MFT offset: 2592906944, MFT mirror offset: None)

I tried recovering all recoverable partitions and these 5 are the only ones that contain files from my original 2.7TB partition. The others are unrelated.
The problem is that the total amount of data recovered from these 5 partitions is about 963GB, of which more than 100GB are broken (partial) files, and consider that among these broken files there are many that are 0 bytes, so there is a lot missing.
The data that I could recover is less than 33% overall in GB. Much lower if you only count the % of working files.
Now, I understand that many files are fragmented and that many fragments are probably in the initial 4GB that were overwritten, but to lose almost 70% of the data seems unrealistic.
Also it's weird that the data is divided in 5 partitions starting at the same point.

I did restore (#) 5 and restore (#) -1 for all of these partitions.
Most of the files recovered are in LostFiles folders. Some are in Root but it's a minority.
I meticulously examined each and every Dir_XXXX folder (there were thousands), and I sorted them based on content and where the files were originally and I noticed something strange:
Some Dir_XXXX folders were actually originally sub-folders of folders recovered from the other partitions.
So for example Dir_XXXX in partition #1007 could be the sub-folder of a folder inside Dir_YYYY from partition #248.
From what I understand about your program something is placed inside a Dir_XXXX if the parent can't be found. But in these cases the parent was found, just in another partition. These 5 partitions are actually just one.

Another thing I noticed is that files found in a Dir_XXXX folder on one partition were originally in the same folder as files I found in a Dir_YYYY folder on another partition. So in these cases the two Dir_ folders were originally the same folder.

I have also noticed that there are no duplicate files or folders between these 5 partitions. They are not different versions of the same data, I think they are pieces of one partition, but the program thinks they are separate.

This may also be why many files are broken; Perhaps their other pieces are in another partition among these 5.

I am asking because I'm not sure about how to solve this. Is there a way of telling the program to consider all sectors from 264192 to 5848213503 as one partition instead of dividing it in 5?
I think that recovering a lot more of my files should be possible, as only a very small piece of the disk was overwritten.
Perhaps I should use fdisk to recreate the partition from 264192 to 5848213503 without formatting it, and then run RecuperaBit again pointing it at this partition? Or is there another solution?
Thank you.

PS: Partition #846, which starts at 1180893537, also contains files from my original partition.
It's the only one that contains my files and doesn't start at 264192 like the other 5.
The reason why I didn't mention it before is that all of the files I extracted from it are broken.
I believe this is because it has a wrong cluster setting (Sec/Clus: 1 instead of 8).

README is unclear on savefiles

I've had to completely scan my drive three times today because I didn't specify an output directory at my initial issuance of the command to run the program! This has been an arduous process. The README suggests using parameter -s to load from a savefile but it's unclear on how to create a savefile after a scan in the first place. I'm very glad and thankful that you've put together this program, but I wish the README were more clear on this issue. How can savefiles be created?

Eternal loop reconstructing MFT

I was running RecuperaBit on a 466GB image for a little longer than 100h. I looked at the output and it was repeating the same positions set number.

For example, if you look at the output file, look at the first three positions: 38190544, 37475152, 38357392. The last three positions on the end are the same: 38190544, 37475152, 38357392.

I was able to grab only this part of the output, but when I checked there was more repetition (around 10 or 15 positions). I realized it was repeating the same sequence over and over.

The drive I'm running it on is a very damaged one. I'm not sure whether the image itself is the issue (unrecoverable) and I should let it go.

(some more info: running on Mac OS 10.12.1, was using a whole lot of memory, around 12GB~14GB)

Python 2.7.12 (aff251e543859ce4508159dd9f1a82a2f553de00, Nov 13 2016, 01:57:41)
[PyPy 5.6.0 with GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.42.1)]

Error parsing MFT entries

Hi,
I am trying to recover an NTFS-formatted disk (for your information, there is an underlying RAID 5 with 3 disks + 1 spare) whose partition information is lost (even the MFT and the MFT mirror are lost!). I have tried the most recent RecuperaBit (1.1.1), launching the command:

recuperabit -s /mnt/tera8/atpmicrobusrecovery/savefile -o /mnt/tera8/atpmicrobusrecovery /dev/cciss/c0d0

and obtaining:

[.....omissis.....]
INFO:root:First scan completed
INFO:root:Parsing MFT entries
Traceback (most recent call last):
  File "/usr/local/bin/recuperabit", line 358, in <module>
    main()
  File "/usr/local/bin/recuperabit", line 340, in main
    parts.update(scanner.get_partitions())
  File "/mnt/tera8/rb/RecuperaBit/recuperabit/fs/ntfs.py", line 722, in get_partitions
    part.add_file(NTFSFile(parsed, position, ads=ads_name))
  File "/mnt/tera8/rb/RecuperaBit/recuperabit/fs/ntfs.py", line 270, in __init__
    ads_suffix = ':' + ads if ads != '' else ads
TypeError: cannot concatenate 'str' and 'NoneType' objects

Any idea?
Thanks.
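
Judging only from the last two lines of the traceback, ads appears to be None at that point, so ':' + ads fails. One possible guard is sketched below. It is an illustration under that assumption, not the project's actual fix, and the helper name ads_suffix is hypothetical.

```python
def ads_suffix(ads):
    """Return ':name' for a named data stream, or '' when there is none."""
    if not ads:  # covers both None and the empty string
        return ""
    return ":" + ads


print(repr(ads_suffix(None)))
print(repr(ads_suffix("Zone.Identifier")))
```

For an empty string this returns the same value as the original expression, so only the None case changes.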

PDFs hosted on scribd require registration to download

I'm interested in understanding how this works and reading your thesis, but Scribd only allows downloads when you register for an account.

Have you considered including it in this repo or posting it on arXiv or some other service that does not require registration?

Pause and resume scan for big drives

A way to pause and resume scanning of big drives would be a cool feature.
I've started RecuperaBit on a USB3 8 TB drive (on which I've overwritten the first 1-2 GB) and it's taking a while. So far it's about 2/3 done, but it's still going to be another day or two until it finishes. That would not usually be a problem, but I stupidly forgot to run it inside screen and now I'm stuck waiting for it to finish :/

TypeError: unsupported operand type(s) for *: 'NoneType' and 'int'

I tried recovering files of an image made from a very badly defective hard disk. I got the following error (with a few lines of previous output):

DEBUG:root:Found MATCH in positions set([212032]) with weight 5 (55.5555555556%)
INFO:root:Finalizing MFT reconstruction of partition at offset 206848
INFO:root:Adding extra attributes from $ATTRIBUTE_LIST
INFO:root:Adding ghost entries from $INDEX_ALLOCATION
DEBUG:root:Found MATCH in positions set([32073856]) with weight 3 (42.8571428571%)
INFO:root:Finalizing MFT reconstruction of partition at offset 206848
INFO:root:Adding extra attributes from $ATTRIBUTE_LIST
INFO:root:Adding ghost entries from $INDEX_ALLOCATION
DEBUG:root:Found MATCH in positions set([27858496]) with weight 2 (66.6666666667%)
INFO:root:Finalizing MFT reconstruction of partition at offset 206848
INFO:root:Adding extra attributes from $ATTRIBUTE_LIST
INFO:root:Adding ghost entries from $INDEX_ALLOCATION
INFO:root:Finalizing MFT reconstruction of partition at offset 206848
INFO:root:Adding extra attributes from $ATTRIBUTE_LIST
Traceback (most recent call last):
  File "main.py", line 309, in <module>
    main()
  File "main.py", line 291, in main
    parts.update(scanner.get_partitions())
  File "/home/mint/RecuperaBit/recuperabit/fs/ntfs.py", line 835, in get_partitions
    self.finalize_reconstruction(part)
  File "/home/mint/RecuperaBit/recuperabit/fs/ntfs.py", line 668, in finalize_reconstruction
    self.add_from_attribute_list(parsed, part, node.offset)
  File "/home/mint/RecuperaBit/recuperabit/fs/ntfs.py", line 623, in add_from_attribute_list
    _integrate_attribute_list(parsed, part, image)
  File "/home/mint/RecuperaBit/recuperabit/fs/ntfs.py", line 242, in _integrate_attribute_list
    real_pos = mft_pos + index * FILE_size
TypeError: unsupported operand type(s) for *: 'NoneType' and 'int'

RecuperaBit doesn't find any partitions, while the partition table is still intact.

I have a disk image with a corrupted NTFS filesystem. The partition table is still intact. RecuperaBit prints a lot of messages about finding file entries, and then it prints "0 partitions found".

The allparts command does not return anything.

The partition table of the disk is still intact, so I would expect RecuperaBit to find the partition (and maybe tell me it's not recoverable)

Output follows:

INFO:root:Found NTFS boot sector at sector 63
INFO:root:Found NTFS file record at sector 3993093
INFO:root:Found NTFS file record at sector 83242008
INFO:root:Found NTFS file record at sector 219609705
INFO:root:Found NTFS file record at sector 220144489
INFO:root:Found NTFS file record at sector 221549529
INFO:root:Found NTFS file record at sector 221679915
INFO:root:Found NTFS file record at sector 221960169
INFO:root:Found NTFS file record at sector 221960174
INFO:root:Found NTFS file record at sector 221960215
INFO:root:Found NTFS file record at sector 222270425
INFO:root:Found NTFS file record at sector 224799858
INFO:root:Found NTFS file record at sector 225427456
INFO:root:Found NTFS file record at sector 227972838
INFO:root:Found NTFS file record at sector 228109772
INFO:root:Found NTFS file record at sector 231214086
INFO:root:Found NTFS file record at sector 231284524
INFO:root:Found NTFS file record at sector 232242528
INFO:root:Found NTFS file record at sector 233857004
INFO:root:Found NTFS file record at sector 233868102
INFO:root:Found NTFS file record at sector 234851360
INFO:root:Found NTFS file record at sector 238046962
INFO:root:Found NTFS file record at sector 239385901
INFO:root:Found NTFS file record at sector 239405609
INFO:root:Found NTFS file record at sector 239405614
INFO:root:Found NTFS file record at sector 239405655
INFO:root:Found NTFS file record at sector 239985879
INFO:root:Found NTFS file record at sector 244553198
INFO:root:Found NTFS file record at sector 314520764
INFO:root:Found NTFS file record at sector 314522048
INFO:root:Found NTFS file record at sector 314536382
INFO:root:Found NTFS file record at sector 328078905
INFO:root:Found NTFS file record at sector 328083705
INFO:root:Found NTFS file record at sector 328085991
INFO:root:Found NTFS file record at sector 328288313
INFO:root:Found NTFS file record at sector 328293145
INFO:root:Found NTFS file record at sector 429865600
INFO:root:Found NTFS file record at sector 431426292
INFO:root:Found NTFS file record at sector 432091151
INFO:root:Found NTFS file record at sector 432146680
INFO:root:Found NTFS file record at sector 432166419
INFO:root:Found NTFS file record at sector 432191859
INFO:root:Found NTFS file record at sector 432495584
INFO:root:Found NTFS file record at sector 432876948
INFO:root:Found NTFS file record at sector 433034720
INFO:root:Found NTFS file record at sector 433495444
INFO:root:Found NTFS file record at sector 433536591
INFO:root:Found NTFS file record at sector 433593368
INFO:root:Found NTFS file record at sector 433654771
INFO:root:Found NTFS file record at sector 434802697
INFO:root:Found NTFS file record at sector 434802702
INFO:root:Found NTFS file record at sector 434802743
INFO:root:Found NTFS boot sector at sector 625137344
INFO:root:First scan completed
INFO:root:Parsing MFT entries
INFO:root:Parsing INDX records
INFO:root:Reading boot sectors
INFO:root:Finding partition geometry
INFO:root:0 partitions found.

Feature request: Possibility to exclude items marked as deleted

I have a 3TB drive that I'm currently recovering files from. I can get the list of files nice and easy with the analysis pass so it would be easy to just use restore x 5 and restore x -1 to recover everything. However, it seems that some previously deleted files (marked as deleted in the csv output) are being recovered as well, and this means that I run out of disk space if I use the simple route (I don't have a 4TB drive to recover to).

In my case, I don't need those deleted files as they are either from the Windows Recycle Bin or directly deleted by me or some program. It would be nice if I could use a parameter to skip the deleted files or maybe skip files and/or directories by name using a wildcard.

Error: No such file or directory

Thanks for your previous len() fix, it worked. This software is outstanding when it comes to borked indexes, and I have recovered most of my filesystem. I am running Windows 7 x64 with pypy 32bit btw, and file recovery is quite fast. However, there is another problem. I used the command

recover 7 5

for partition #7, and here is where the output gets interesting:

...
...
INFO:root:Restoring #122688 Root\backup\Quiz\websites\walkthroughs\www.thecomputershow.com\computershow\previews\SOURCE  Activision, Inc. -0-
 10\18\96_\CONTACT_  Andrea Kaimowitz or Cindy Miller, or Press_  Miriam Adler or Erika Brown, of Morgen-Walke Associates, 212-850-5600.html
Traceback (most recent call last):
  File "main.py", line 309, in <module>
    main()
  File "main.py", line 306, in main
    interpret(cmd, arguments, parts, shorthands, args.outputdir)
  File "main.py", line 185, in interpret
    logic.recursive_restore(myfile, part, partition_dir)
  File "D:\Program Files (x86)\RecuperaBit\recuperabit\logic.py", line 255, in recursive_restore
    recursive_restore(child, part, outputdir, make_dirs=False)
  File "D:\Program Files (x86)\RecuperaBit\recuperabit\logic.py", line 255, in recursive_restore
    recursive_restore(child, part, outputdir, make_dirs=False)
  File "D:\Program Files (x86)\RecuperaBit\recuperabit\logic.py", line 255, in recursive_restore
    recursive_restore(child, part, outputdir, make_dirs=False)
  File "D:\Program Files (x86)\RecuperaBit\recuperabit\logic.py", line 255, in recursive_restore
    recursive_restore(child, part, outputdir, make_dirs=False)
  File "D:\Program Files (x86)\RecuperaBit\recuperabit\logic.py", line 255, in recursive_restore
    recursive_restore(child, part, outputdir, make_dirs=False)
  File "D:\Program Files (x86)\RecuperaBit\recuperabit\logic.py", line 255, in recursive_restore
    recursive_restore(child, part, outputdir, make_dirs=False)
  File "D:\Program Files (x86)\RecuperaBit\recuperabit\logic.py", line 255, in recursive_restore
    recursive_restore(child, part, outputdir, make_dirs=False)
  File "D:\Program Files (x86)\RecuperaBit\recuperabit\logic.py", line 255, in recursive_restore
    recursive_restore(child, part, outputdir, make_dirs=False)
  File "D:\Program Files (x86)\RecuperaBit\recuperabit\logic.py", line 241, in recursive_restore
    with codecs.open(restore_path, 'wb') as outfile:
  File "D:\Program Files (x86)\RecuperaBit\lib-python\2.7\codecs.py", line 884, in open
    file = __builtin__.open(filename, mode, buffering)
IOError: [Errno 2] No such file or directory: u'j:\\Partition7\\Root\\backup\\Quiz\\websites\\walkthroughs\\www.thecomputershow.com\\computershow\\previews\\SOURCE  Activision, Inc. -0-     10\\18\\96_\\CONTACT_  Andrea Kaimowitz or Cindy Miller, or Press_  Miriam Adler or Erika Brown, of Morgen-Walke Associates, 212-850-5600.html' 

>

After this, I have to restart and enter file and directory references by hand using the csv file. The next commands would be

recover 7 122689
recover 7 122690
...

until I hit a directory, which moves faster, then repeat as necessary. As you can appreciate, this is quite laborious. This has happened several times with different files. Maybe the output path is too long?

Anyway, is it possible to have RecuperaBit flag this kind of error and skip the file, rather than force me to traverse the tree by hand?
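A rough sketch of the "flag and skip" behaviour asked for above, assuming the restore step can be wrapped in a try/except. The names `restore_all` and `write_file` are hypothetical and are not part of RecuperaBit.

```python
import logging
import os
import tempfile

def restore_all(files, outputdir, write_file):
    """Restore every (relative_path, content) pair, skipping the ones that fail.

    `write_file` is a hypothetical callback that does the actual writing.
    Returns the list of relative paths that could not be restored.
    """
    failed = []
    for rel_path, content in files:
        target = os.path.join(outputdir, rel_path)
        try:
            write_file(target, content)
        except (IOError, OSError) as exc:
            # Log the problem and move on instead of aborting the whole run.
            logging.error('Skipping %s: %s', target, exc)
            failed.append(rel_path)
    return failed

def write_file(target, content):
    with open(target, 'wb') as outfile:
        outfile.write(content)

# The second path points into a directory that does not exist, so it fails.
outdir = tempfile.mkdtemp()
files = [('ok.txt', b'data'), (os.path.join('missing_dir', 'bad.txt'), b'data')]
failed = restore_all(files, outdir, write_file)
print(failed)
```

Collecting the failed paths in a list makes it possible to retry them later, for example with a shorter output path.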

Thanks

urgrueA

Restoring partition content into a different NTFS can break the target file system

I've started with an external 1TB HDD (msdos partition table) with a single NTFS partition that was broken by partially overwriting the partition by a smaller FAT partition (< 1.3 GB, possibly not fully written, the partition table was amended though).

RecuperaBit successfully located the partition and most (if not all) of the files that were lost in the process. As storage for the recovered files I used an additional external 3TB HDD (gpt) with a single NTFS partition, already containing some files. I made a dedicated directory for the restored files and proceeded with the restore.

For the procedure I used a GNU/Linux system with the ntfs-3g implementation of NTFS support. After the procedure, I noticed that the destination directory contained NTFS metafiles that appeared as normal files ($Mft, $Secure, ...). I'm not sure whether the permissions of the recovered files were valid either (I suspect not, because of a different color in the terminal listing), but the system was able to open them, so I didn't really care.

I decided to delete those metafiles because I thought they were irrelevant for the new file system. After unmounting the file system, I'm no longer able to mount it again. Windows freezes or is stuck in a connect – disconnect loop, and GNU/Linux fails with the following output:

ntfs_mst_post_read_fixup_warn: magic: 0xff00ffff  size: 4096   usa_ofs: 65535  usa_count: 65535: Invalid argument
Actual VCN (0xffffffffffffffff) of index buffer is different from expected VCN (0x0).
Failed to open $Secure: No such file or directory

So, IMO, RecuperaBit should not try to write metafiles if the target file system is NTFS and/or should use a valid name for them. The "legit" metafiles should be created by ntfs-3g when restored files are written. I'm assuming RecuperaBit uses standard Python I/O for creating files.

It is also possible that ntfs-3g should refuse to write those files in the target file system.


As a side note, any ideas on how to approach fixing the target NTFS now are welcome 🙂 .

be slightly more verbose

Just a suggestion for a minor improvement: :-)

I'm using RecuperaBit for the first time and right after starting the program I got this prompt:

Type [Enter] to start the analysis or "exit" / "quit" / "q" to quit:

After pressing [Enter] nothing happens on the screen. A little "Scanning the image, this might take quite some time..." message or anything similar (or even a progress bar) would be nice.

In my case I wondered if RecuperaBit was still waiting for input, had locked up or whatever. After a few minutes I had to check the disk I/O statistics to understand that RecuperaBit was actually reading heavily from the disk (since I use RecuperaBit remotely I can't hear the disks working, so I have absolutely no feedback...).

That said, your program looks very promising. Keep up the good work!

crash on german umlauts can't encode character u'\xf6'

I'm trying to recover a drive that was erased with hddscan. The erase was aborted almost instantly, so only the first 1 MB of the disk shows the hddscan overwrite pattern.

Whenever I try to do a tree output or a restore, RecuperaBit crashes with the following message:
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf6' in position 66: ordinal not in range(128)

I'm using git master. The crash occurs a few seconds in, after about 300 MB of files have already been restored.

traceback:

INFO:root:Restoring #273517 Root/Program Files/Windows NT/TableTextService/TableTextService.dll
Traceback (most recent call last):
  File "/usr/lib/pypy/lib-python/2.7/logging/__init__.py", line 897, in emit
    stream.write(fs % msg.encode("UTF-8"))
  File "/usr/lib/pypy/lib-python/2.7/codecs.py", line 369, in write
    data, consumed = self.encode(object, self.errors)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf6' in position 62: ordinal not in range(128)
Logged from file logic.py, line 231
Traceback (most recent call last):
  File "main.py", line 385, in <module>
    main()
  File "main.py", line 382, in main
    interpret(cmd, arguments, parts, shorthands, args.outputdir)
  File "main.py", line 191, in interpret
    logic.recursive_restore(myfile, part, partition_dir)
  File "/recupera/recuperabit/logic.py", line 258, in recursive_restore
    recursive_restore(child, part, outputdir, make_dirs=False)
  File "/recupera/recuperabit/logic.py", line 258, in recursive_restore
    recursive_restore(child, part, outputdir, make_dirs=False)
  File "/recupera/recuperabit/logic.py", line 258, in recursive_restore
    recursive_restore(child, part, outputdir, make_dirs=False)
  File "/recupera/recuperabit/logic.py", line 232, in recursive_restore
    if not makedirs(restore_path):
  File "/recupera/recuperabit/logic.py", line 196, in makedirs
    os.makedirs(path)
  File "/usr/lib/pypy/lib-python/2.7/os.py", line 157, in makedirs
    mkdir(name, mode)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf6' in position 66: ordinal not in range(128)

MemoryError and no partitions found

RecuperaBit consistently drops out of warp with a memory error:

INFO:root:Found NTFS boot sector at sector 1317656575
INFO:root:First scan completed
INFO:root:Parsing MFT entries
Traceback (most recent call last):
  File "main.py", line 385, in <module>
    main()
  File "main.py", line 367, in main
    parts.update(scanner.get_partitions())
  File "/root/RecuperaBit-master/recuperabit/fs/ntfs.py", line 698, in get_partitions
    parsed = parse_file_record(dump)
  File "/root/RecuperaBit-master/recuperabit/fs/ntfs.py", line 149, in parse_file_record
    attributes = _attributes_reader(entry, header['off_first'])
  File "/root/RecuperaBit-master/recuperabit/fs/ntfs.py", line 109, in _attributes_reader
    attr, name = parse_mft_attr(entry[offset:])
  File "/root/RecuperaBit-master/recuperabit/fs/ntfs.py", line 79, in parse_mft_attr
    nonresident = unpack(attr, attr_nonresident_fmt)
  File "/root/RecuperaBit-master/recuperabit/utils.py", line 98, in unpack
    result[label] = formatter(data[low:high+1])
MemoryError

Carefully following the process via top I see that in fact the memory in use rises up to 4G and then the process dies. It looks as if it is really the amount of files (or whatever) it finds that causes this.

I found that 'partitioned_files' grows beyond 100000 so around line 720 in ntfs.py I added some code to stop the process when it reached 50000. This "solves" the MemoryError, but now I end up with "0 partitions found" :-(

I already used photorec on this image file and that resulted in a lot of files, but obviously all with cryptic names, so I hoped RecuperaBit would rush to the rescue... Any help is appreciated (and if not, then perhaps this can help so the tool will exit nicely when it grows erratically).

"KeyError: 'content'" when parsing MFT entries

I'm trying to fix a bad Hard Disk, created the image with ddrescue /dev/sdb backup.img backup.log, then started RecuperaBit via pypy main.py backup.img -o output -s savefile.save. (got the tutorial from here).

When initializing (before hitting enter), it shows this warning message:
(Solved removing an ô character from the directory tree, will open another issue to separate them)

The script read all records from the image, but when parsing MFT entries it crashed with the following message:
(Just to let you know: this was the point where testdisk failed, because it could find neither the MFT nor the MFT Mirror.)

INFO:root:First scan completed
INFO:root:Parsing MFT entries
Traceback (most recent call last):
  File "main.py", line 309, in <module>
    main()
  File "main.py", line 291, in main
    parts.update(scanner.get_partitions())
  File "/home/ubuntu/RecuperaBit/recuperabit/fs/ntfs.py", line 720, in get_partitions
    self.add_from_indx_root(parsed, part)
  File "/home/ubuntu/RecuperaBit/recuperabit/fs/ntfs.py", line 504, in add_from_indx_root
    if (attribute['content'] is None or
KeyError: 'content'

I'm using Ubuntu 16.04.1, python and pypy version below:

Python 2.7.10 (5.1.2+dfsg-1~16.04, Jun 16 2016, 17:37:42)
[PyPy 5.1.2 with GCC 5.3.1 20160413]

Tried running again, it found the .save file, read all relevant information and kept going till the same failure point: parsing MFT entries.

(Don't know if relevant information, but hope it helps: running Ubuntu Live via USB flash drive; the RecuperaBit downloaded directly via git clone on /home/ubuntu; .img, .save and output folder are saved on a mounted external HDD via USB 3.0)

Please allow dumping tree view of files of a partition to a file

Since it is possible to dump csv to a file, why can the tree command only output to stdout? It would be convenient if it was possible to automatically save the output to a file.

I, for example, had quite a large partition with lots of files. It was most convenient for me to copy and paste the output of the tree command to a file, then inspect it and delete the rows that were of no interest to me, leaving only the files that could be useful. I then started recovering files according to this list.

Out of memory when sorting attributes

It appears that if you have a drive with a lot of files, when it goes to recreate the MFT, it runs out of memory.

I have a 4TB drive that had only 1 partition, but (rough guess) had hundreds of thousands of files.

Is there any debugging that I can add to try and figure out how much RAM will be needed? I have access to a box with 192gigs of RAM, but I would really like to figure out how much RAM will be needed for it to complete before I requisition the box.

Recover a single NTFS partition

I generated an image of a single partition containing an NTFS filesystem rather than a whole disk.

Is it possible to modify RecuperaBit's restore command or create another command to assume the data in the image file is a single partition?

I have already run a scan and captured the results in a save file. I also have access to the original drive from where the image was taken, so capturing any values/sizes from the original partition table to be used as parameters to RecuperaBit is possible.

Add command to manually merge partitions

It would be useful, as highlighted by issue #18, to implement a command to manually merge partitions that are recognized as separate by RecuperaBit, but which are known to be part of the same original partition.

Can't use "real drive" as input

Hey there, I came across the program via your recommendation on SE. As I mentioned on there, I've been trying to recover the directory structure from a 2TB drive for months and have used a ton of programs in the process, including GetDataBack and TestDisk, so will be very impressed if RecuperaBit lives up to the level of capability that your Masters thesis outlines it can.

Unfortunately, I can't seem to get it to work with my drive using Cygwin on Windows 7, even though RecuperaBit's documentation states it should work with "real drives". I'm not an expert in bash and/or Python by any means, so it's possible I'm getting the syntax wrong.

hello

Thanks in advance for your help, I really appreciate it.

I don't get it, how do I use this on Windows?

Hi, I'm in desperate need to recover an NTFS volume but I can't figure out how to install your RecuperaBit Python program on Windows, and there are no instructions on your website for doing so. I get an error when I try to run python main.py

Microsoft Windows [Version 6.1.7601]
Copyright (c) 2009 Microsoft Corporation.  All rights reserved.

C:\Users\Admin>cd c:\python34

c:\Python34>python "M:\RecuperaBit-master\main.py"
  File "M:\RecuperaBit-master\main.py", line 82
    print 'Partition #' + str(i), '->', parts[part]
                      ^
SyntaxError: invalid syntax

c:\Python34>

What gives?

Error while scanning an image (.img): "IndexError: list index out of range"

I started scanning an image. The size of the image is 465 GB.
I used the latest version of recuperabit.
This is the output:


INFO:root:Found NTFS index record at sector 975881640
INFO:root:Found NTFS index record at sector 976504280
INFO:root:Found NTFS index record at sector 976668320
INFO:root:Found NTFS boot sector at sector 976769023
INFO:root:First scan completed
INFO:root:Saving results to settInteressanti.save
INFO:root:Parsing MFT entries
INFO:root:Parsing INDX records
INFO:root:Reading boot sectors
DEBUG:root:Dropping bogus NTFS partition with MFT position 2064 generated by MFT mirror of partition at offset 2048
DEBUG:root:Dropping bogus NTFS partition with MFT position 86432682 generated by MFT mirror of partition at offset 86427520
DEBUG:root:Dropping bogus NTFS partition with MFT position 128554554 generated by MFT mirror of partition at offset 128549392
DEBUG:root:Dropping bogus NTFS partition with MFT position 372031174 generated by MFT mirror of partition at offset 372026012
Traceback (most recent call last):
  File "/opt/RecuperaBit/main.py", line 328, in <module>
    main()
  File "/opt/RecuperaBit/main.py", line 310, in main
    parts.update(scanner.get_partitions())
  File "/opt/RecuperaBit/recuperabit/fs/ntfs.py", line 798, in get_partitions
    relative = datas[0]['runlist'][0]['offset']
IndexError: list index out of range
xubuntu@xubuntu-virtual-machine:/media/xubuntu/Maxtor 1.3 GB$ 

Feature request: batch processing options

I'm trying to recover some files, and somehow Recuperabit found over 20k partitions, 1.4k of which are said to be recoverable.

Partition 57 lists a lot (maybe all) of the files, but I cannot restore certain files--said files are very large (>5GB) Outlook PSTs, and had CRC errors on the original disk--I'm running Recuperabit on a disk image created by ddrescue. I want to find out if any of the other 1.4k recoverable partitions contain enough data to recover the files I need, but I find that running csv, locate, or bodyfile to peruse each of the 1.4k partitions is quite tedious.

Which is why I'm wondering if some form of batch processing is possible.

For example, a FOR loop where the input is the list of recoverable partitions, and the output is a csv or bodyfile for each said partition (e.g. partition 1 would output 1.csv or 1.bodyfile, partition 2 would create 2.csv or 2.bodyfile, etc.). Then I can use grep or other text processors to find the files I need.

Alternatively, expand the locate command so that it can:

  1. check all recoverable partitions (with option for other/non-recoverable partitions, which might be useful in certain scenarios)
  2. output the results to a text file for later perusal.
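As an illustration of the grep-style half of this request, here is a minimal sketch that searches a set of per-partition exports. It assumes the CSV files have already been produced, one per partition, with made-up names such as 1.csv and 2.csv; `search_exports` is a hypothetical helper, not a RecuperaBit command.

```python
import csv
import glob
import os
import tempfile

def search_exports(directory, needle):
    """Return (csv_file, row) pairs whose row contains `needle` in any field."""
    needle = needle.lower()
    hits = []
    for path in sorted(glob.glob(os.path.join(directory, '*.csv'))):
        with open(path) as handle:
            for row in csv.reader(handle):
                if any(needle in field.lower() for field in row):
                    hits.append((os.path.basename(path), row))
    return hits

# Tiny made-up exports standing in for the per-partition CSV files.
workdir = tempfile.mkdtemp()
samples = {
    '1.csv': '10,5,"notes.txt"\n',
    '2.csv': '20,5,"mail.pst"\n11,5,"Archive.PST"\n',
}
for name, text in samples.items():
    with open(os.path.join(workdir, name), 'w') as handle:
        handle.write(text)

hits = search_exports(workdir, '.pst')
for filename, row in hits:
    print(filename, row)
```

The match is case-insensitive, so both mail.pst and Archive.PST are reported along with the file they came from.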

Print elapsed time and estimated time to completion

When trying to recover large drives, it helps to know how much time you need to wait.

For example, photorec displays this:

Elapsed time 0h00m04s – Estimated time to completion 0h22m54
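A line like this could be produced with a simple linear extrapolation. The sketch below assumes the scanner knows how many sectors it has processed and how many there are in total; the function names and the numbers are made up for illustration.

```python
def format_hms(seconds):
    """Format a duration in seconds as e.g. 0h22m54s."""
    seconds = int(seconds)
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return '%dh%02dm%02ds' % (hours, minutes, secs)

def estimate_remaining(elapsed, done, total):
    """Linear extrapolation: assumes the scan rate stays constant."""
    if done <= 0:
        return None  # nothing processed yet, no estimate possible
    return elapsed * (total - done) / float(done)

elapsed = 4.0                  # seconds since the scan started
done, total = 1000, 344500     # sectors scanned so far / sectors in the image
remaining = estimate_remaining(elapsed, done, total)
print('Elapsed time %s - Estimated time to completion %s'
      % (format_hms(elapsed), format_hms(remaining)))
```

On a damaged drive the read rate can vary a lot, so an estimate of this kind is only a rough guide.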

P.S. Thank you for this tool, looks promising! Although I'm still waiting for it to scan the whole drive…

Edit: It worked! I recovered the files when both testdisk and photorec didn't help at all. Amazing.

File Metadata - creation date

Hi,

First, thank you, this is a great tool. The ability to restore filenames and directories really helps in tracking down lost data. This may be more of an enhancement request, but I think the ability to restore the original file's creation/modification date would further assist in investigations.

OverflowError ...

Trying to recover files after CHKDSK screwed up my filesystem on boot. I have a 1.82TB partition image and have a save file of size 2017KB. Is my image too big for RecuperaBit?

Here is the offending output:

...
...
INFO:root:Found NTFS boot sector at sector 3906854911
INFO:root:First scan completed
INFO:root:Saving results to h:\savefile.txt
INFO:root:Parsing MFT entries
INFO:root:Parsing INDX records
INFO:root:Reading boot sectors
DEBUG:root:Dropping bogus NTFS partition with MFT position 1953427448 generated
by MFT mirror of partition at offset 0
INFO:root:Finding partition geometry
Traceback (most recent call last):
  File "main.py", line 309, in <module>
    main()
  File "main.py", line 286, in main
    parts.update(scanner.get_partitions())
  File "D:\Program Files (x86)\RecuperaBit\recuperabit\fs\ntfs.py", line 821, in
 get_partitions
    part, address, most_likely
  File "D:\Program Files (x86)\RecuperaBit\recuperabit\fs\ntfs.py", line 527, in
 find_boundary
    width = len(text_list)
OverflowError: long int too large to convert to int

Thanks

tikzplot raises UnicodeEncodeError and terminates process

  • The final while True loop should catch and ignore all exceptions (but print stack trace anyway). I know it's generally a bad idea to swallow exceptions (only if you don't print or log, and the program cannot possibly continue?). But individual commands failing should not corrupt internal state, and losing analysis data built over an hour is a terrible user experience.
  • Maybe you should add an option to save (e.g. with pickle/cPickle) the 7GB of RAM to a file, and reload it later on.
  • The crash below.
    • Notes: NTFS stores UTF-16 (unpaired surrogates allowed) which can be encoded as WTF-8. Python 2 has 8-bit bytes/str and arbitrary-bit unicode. For the minimum changes to your code, you could try latin1 instead of ascii.
> tikzplot 67
Traceback (most recent call last):
  File "main.py", line 385, in <module>
    main()
  File "main.py", line 382, in main
    interpret(cmd, arguments, parts, shorthands, args.outputdir)
  File "main.py", line 171, in interpret
    print utils.tikz_part(part)
  File "/home/jimbo1qaz/Dropbox/encrypted/code/pypy/RecuperaBit/recuperabit/utils.py", line 310, in tikz_part
    lines += [tikz_child(entry, 4)[0] for entry in (part.root, part.lost)]
  File "/home/jimbo1qaz/Dropbox/encrypted/code/pypy/RecuperaBit/recuperabit/utils.py", line 283, in tikz_child
    content, number = tikz_child(entry, padding+4)
  File "/home/jimbo1qaz/Dropbox/encrypted/code/pypy/RecuperaBit/recuperabit/utils.py", line 283, in tikz_child
    content, number = tikz_child(entry, padding+4)
  File "/home/jimbo1qaz/Dropbox/encrypted/code/pypy/RecuperaBit/recuperabit/utils.py", line 283, in tikz_child
    content, number = tikz_child(entry, padding+4)
  File "/home/jimbo1qaz/Dropbox/encrypted/code/pypy/RecuperaBit/recuperabit/utils.py", line 283, in tikz_child
    content, number = tikz_child(entry, padding+4)
  File "/home/jimbo1qaz/Dropbox/encrypted/code/pypy/RecuperaBit/recuperabit/utils.py", line 280, in tikz_child
    lines = [r'%schild {%s' % (pad, _tikz_repr(directory))]
  File "/home/jimbo1qaz/Dropbox/encrypted/code/pypy/RecuperaBit/recuperabit/utils.py", line 273, in _tikz_repr
    _ltx_clean(node.index), _ltx_clean(node.name)
  File "/home/jimbo1qaz/Dropbox/encrypted/code/pypy/RecuperaBit/recuperabit/utils.py", line 263, in _ltx_clean
    clean = str(label).replace('$', r'\$').replace('_', r'\_')
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf1' in position 11: ordinal not in range(128)

Git master 18090ab
Python 2.7.13 (8cdda8b8cdb8ff29d9e620cccd6c5edd2f2a23ec, Apr 16 2019, 18:25:57)
[PyPy 7.1.1 with GCC 8.2.0]
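The first suggestion in this report, a command loop that survives failing commands, could look roughly like the sketch below. It is not RecuperaBit's actual main loop: `run_commands` and the two handlers are hypothetical stand-ins.

```python
import traceback

def run_commands(lines, handlers):
    """Dispatch each command line; a failing command must not end the session."""
    results = []
    for line in lines:
        parts = line.split()
        if not parts:
            continue
        cmd, arguments = parts[0], parts[1:]
        if cmd in ('exit', 'quit', 'q'):
            break
        try:
            results.append(handlers[cmd](*arguments))
        except Exception:
            # Show the stack trace, then carry on with the next command.
            traceback.print_exc()
            results.append(None)
    return results

def tikzplot(part):
    # Stand-in that fails like the report: ASCII cannot represent U+00F1.
    return u'ma\xf1ana'.encode('ascii')

def tree(part):
    return 'tree of partition ' + part

handlers = {'tikzplot': tikzplot, 'tree': tree}
results = run_commands(['tikzplot 67', 'tree 67', 'quit', 'tree 1'], handlers)
print(results)
```

The stack trace is still printed, so the error stays visible, but the analysis data held in memory is not lost.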

use of savefile

First of all thanks for this software.

I'm working on a 4TB drive. I have successfully scanned this disk before and recovered files without using the SAVEFILE option because I thought I wouldn't need to do it again - once the scan finished I was given a prompt to do recovery and restore of files. Last night I needed to scan again and do more recovery so this time I decided to use the SAVEFILE option in case I would need to do it again in the future. I woke up this morning and the scan had finished and the SAVEFILE has been made but I wasn't given the prompt to do 'tree' or 'restore 5' etc. So I decided I'd missed something so ran the scan again thinking it would use the SAVEFILE to speed this process up - which it did - it said it found a SAVEFILE and ran through the analysis quickly. However it still didn't present the prompt for 'tree' and 'recovery'. This is how the scan ends:

INFO:root:Found NTFS file record at sector 7631198930
INFO:root:Found NTFS file record at sector 7637167806
INFO:root:First scan completed
INFO:root:Parsing MFT entries
INFO:root:Parsing INDX records
INFO:root:Reading boot sectors
INFO:root:Finding partition geometry
DEBUG:root:Found MATCH in positions set([96075079]) with weight 4 (0.107700592353%)
DEBUG:root:Found MATCH in positions set([-1633197202, -1633197194, -1633197186, -1633197178, -1633197170, -1633197162, -1633197154, 82384694, 82384702, 82384710, 82384718, 82384726, 82384734, 82384742]) with weight 4 (0.107700592353%)
Killed

Am I doing something wrong? How can I get the prompt for recovery and tree and saving to csv again?

Thanks.

UnboundLocalError

Hi there, I'm trying to recover an external 4TB WD My Book drive on which I accidentally used Startup Disk Creator to create an Ubuntu startup disk, so I think the first 3 or 4 GB may be overwritten. I tried RecuperaBit and it works fine until parsing the INDX records, then the program quits with the following error:

INFO:root:Found NTFS file record at sector 3906986000
INFO:root:Found NTFS boot sector at sector 7813969912
INFO:root:First scan completed
INFO:root:Saving results to uitvoer.save
INFO:root:Parsing MFT entries
INFO:root:Parsing INDX records
Traceback (most recent call last):
  File "/usr/local/bin/recuperabit", line 385, in <module>
    main()
  File "/usr/local/bin/recuperabit", line 367, in main
    parts.update(scanner.get_partitions())
  File "/home/henri/RecuperaBit/recuperabit/fs/ntfs.py", line 733, in get_partitions
    parsed = parse_indx_record(dump)
  File "/home/henri/RecuperaBit/recuperabit/fs/ntfs.py", line 182, in parse_indx_record
    name_ok = file_name['name'] is not None
UnboundLocalError: local variable 'file_name' referenced before assignment

Any advice on what might be wrong and what to do would be appreciated.

I'm using Ubuntu 18.04 LTS
Python 2.7.15rc1 installed

greetings Henri

Feature Request: 'Recover' All for Recoverable

I've emailed before about recovering all partitions and I understand that was not a great idea. Now I'd like to ask whether it is possible to recover the partitions that are listed as recoverable instead of having to individually filter through each one. I am confident that each partition contains important data because I am looking for a variety of file types on this rescue image. If you could lead me to the solution or a better solution, that would also be appreciated. Thank you in advance.

Document image creation process

It is not clear from the README whether a specific disk image format is expected or whether an image file produced by dd is sufficient.
It would be great if the image creation process and example tools were described in the README.

Add a feature: given a file it prints all the parents

I needed to recover one directory (call it "my documents"). I performed a scan and created a list.csv file, but it was very difficult to locate the entry because my 1 GB RAM netbook could hardly open a 30 MB CSV file with LibreOffice. I couldn't use the CTRL + F function because it froze LibreOffice. I scrolled down the records until I found one file that I knew was nested in my documents. I took note of the parent ID, then used grep repeatedly to find the ID of the folder I was looking for. It would be nice to avoid this manual, repeated use of grep by specifying an entry and recursively getting all its parents up to the root folder.
Note that I couldn't directly grep the "my documents" string because the corresponding entry in the CSV file was:
290396,-1,"Dir_290396",None,None,None,0,0.00 B,None,None,1,,1
Andrea says it's because the MFT entry of the file was missing.
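The repeated grep described above amounts to following parent IDs until no further ancestor is known. A minimal sketch, assuming (from the sample row only) that the first CSV column is the entry ID, the second the parent ID and the third the name; the rows below are made up and `load_index` and `parents` are hypothetical helpers.

```python
import csv
import io

def load_index(handle):
    """Map each entry ID to (parent ID, name). The column layout is assumed."""
    index = {}
    for row in csv.reader(handle):
        try:
            index[int(row[0])] = (int(row[1]), row[2])
        except (ValueError, IndexError):
            continue  # header line or malformed row
    return index

def parents(index, entry_id):
    """Return (ID, name) pairs from the entry up to the topmost known ancestor."""
    chain, seen = [], set()
    while entry_id in index and entry_id not in seen:
        seen.add(entry_id)  # guards against cycles in damaged metadata
        parent_id, name = index[entry_id]
        chain.append((entry_id, name))
        entry_id = parent_id
    return chain

# Made-up rows in the assumed layout: ID, parent ID, name.
sample = io.StringIO(
    u'Id,Parent,Name\n'
    u'5,5,"Root"\n'
    u'290396,5,"Dir_290396"\n'
    u'290400,290396,"report.doc"\n'
)
index = load_index(sample)
chain = parents(index, 290400)
print(chain)
```

Reading the CSV row by row keeps memory use low, which matters on a machine that struggles to open the file in a spreadsheet.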

Denial of service

Since memory issues appeared to be common, I did a quick code inspection and spotted an actual problem. I'm not familiar with NTFS data structures, so I can't tell for sure whether there are other problems, for example reading too few sectors when dumping attribute lists.

The problem causes RecuperaBit to go into an infinite loop while building an attribute list, which in my case eventually led to an OOM kill.

When the length field is out of bounds, length is None, and a list a is equal to a[None:]. The dump therefore never shrinks, the loop never terminates, and the same faulty structure is reparsed and appended to the content list on each iteration. Quick fix included below.

index c7b3dd9..160aa8a 100644
--- a/recuperabit/fs/ntfs_fmt.py
+++ b/recuperabit/fs/ntfs_fmt.py
@@ -113,7 +113,7 @@ def attribute_list_parser(dump):
             ('id', ('i', 24, 24))
         ])
         length = decoded['length']
-        if length == 0:
+        if length == None or length == 0:
             break
         content.append(decoded)
         dump = dump[length:]
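The slicing behaviour behind this bug can be reproduced in isolation. The following is a toy parser, not the code from ntfs_fmt.py: the one-byte length format and the `read_length` helper are assumptions made only for the demonstration.

```python
def parse_entries(dump, read_length):
    """Split `dump` into entries; `read_length` returns an int or None."""
    content = []
    while dump:
        length = read_length(dump)
        # dump[None:] is the whole sequence again, so None would never shrink it.
        if length is None or length == 0:
            break
        content.append(dump[:length])
        dump = dump[length:]
    return content

def read_length(dump):
    # Toy format: the first byte is the entry length; 0xFF marks an unreadable field.
    value = bytearray(dump)[0]
    return None if value == 0xFF else value

data = b'\x03ab\x02c\xffgarbage'
assert data[None:] == data      # why the unguarded loop never ends
entries = parse_entries(data, read_length)
print(entries)
```

With the guard in place the parser stops at the first unreadable length and keeps the two entries it has already decoded.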

"OverflowError: long int too large to convert to int" while parsing scan results

Background info:

  1. Disk: Seagate, Capacity 4T, one primary partition, NTFS file system in the partition. Windows Explorer shows no files in the file system, checking the drive capacity in Windows 10 shows that 80% of the drive is in use. The latter is true as I used the drive to store all my backups.
  2. I use Knoppix distribution (V7.6, 64bit), which I boot from a DVD Drive.
  3. Recuperabit was able to scan the complete 4T HDD (9 hours) and then successfully wrote the save file (16MB in size).
  4. Processing went ok but then ran into an error with the following message:
Traceback (most recent call last):
  File "./main.py", line 358, in <module>
    main()
  File "./main.py", line 340, in main
    parts.update(scanner.get_partitions())
  File "/home/knoppix/Downloads/RecuperaBit-1.1.1/recuperabit/fs/ntfs.py", line 833, in get_partitions
    part, address, most_likely
  File "/home/knoppix/Downloads/RecuperaBit-1.1.1/recuperabit/fs/ntfs.py", line 571, in find_boundary
    text_list, pattern_list, mft_address + delta, k=min_support
  File "/home/knoppix/Downloads/RecuperaBit-1.1.1/recuperabit/logic.py", line 152, in approximate_matching
    lookup = preprocess_pattern(pattern)
  File "/home/knoppix/Downloads/RecuperaBit-1.1.1/recuperabit/logic.py", line 128, in preprocess_pattern
    length = len(pattern)
OverflowError: long int too large to convert to int

Is there any way to change the environment to avoid the OverflowError? FYI, I run main.py directly from the shell command line in the Knoppix terminal window, without pypy.

Thanks in advance for your support - would be great if there is a fix to this problem.

Complete log file (after completion of the scan):

INFO:root:First scan completed
INFO:root:Saving results to /media/sda2/RecSave.save
INFO:root:Parsing MFT entries
INFO:root:Parsing INDX records
INFO:root:Reading boot sectors
DEBUG:root:Dropping bogus NTFS partition with MFT position 16 generated by MFT mirror of partition at offset 0
DEBUG:root:Dropping bogus NTFS partition with MFT position 3503002718 generated by MFT mirror of partition at offset 3502997556
DEBUG:root:Dropping bogus NTFS partition with MFT position 586923218 generated by MFT mirror of partition at offset 586918056
DEBUG:root:Dropping bogus NTFS partition with MFT position 2422179350 generated by MFT mirror of partition at offset 2422174188
DEBUG:root:Dropping bogus NTFS partition with MFT position 2962857399 generated by MFT mirror of partition at offset 2952383055
DEBUG:root:Dropping bogus NTFS partition with MFT position 3161728710 generated by MFT mirror of partition at offset 3161723548
DEBUG:root:Dropping bogus NTFS partition with MFT position 35711538 generated by MFT mirror of partition at offset 35706376
DEBUG:root:Dropping bogus NTFS partition with MFT position 2017735986 generated by MFT mirror of partition at offset 2017730824
DEBUG:root:Dropping bogus NTFS partition with MFT position 2828974847 generated by MFT mirror of partition at offset 2818500503
DEBUG:root:Dropping bogus NTFS partition with MFT position 6919306178 generated by MFT mirror of partition at offset 6919301016
INFO:root:Finding partition geometry
INFO:root:Finalizing MFT reconstruction of partition at offset 0
INFO:root:Adding extra attributes from $ATTRIBUTE_LIST
INFO:root:Adding ghost entries from $INDEX_ALLOCATION
DEBUG:root:Found MATCH in positions set([2952567047L, 2818684495L]) with weight 2 (100.0%)
INFO:root:Finalizing MFT reconstruction of partition at offset 3502997556
INFO:root:Adding extra attributes from $ATTRIBUTE_LIST
INFO:root:Adding ghost entries from $INDEX_ALLOCATION
DEBUG:root:Found MATCH in positions set([2962118975L, 2828236423L]) with weight 15 (100.0%)
INFO:root:Finalizing MFT reconstruction of partition at offset 586918056
INFO:root:Adding extra attributes from $ATTRIBUTE_LIST
INFO:root:Adding ghost entries from $INDEX_ALLOCATION
DEBUG:root:Found MATCH in positions set([2810326312L]) with weight 7 (87.5%)
INFO:root:Finalizing MFT reconstruction of partition at offset 2786705352
INFO:root:Adding extra attributes from $ATTRIBUTE_LIST
INFO:root:Adding ghost entries from $INDEX_ALLOCATION
DEBUG:root:Found MATCH in positions set([2810329438L]) with weight 42 (22.4598930481%)
INFO:root:Finalizing MFT reconstruction of partition at offset 2786705358
INFO:root:Adding extra attributes from $ATTRIBUTE_LIST
INFO:root:Adding ghost entries from $INDEX_ALLOCATION
DEBUG:root:Found MATCH in positions set([2793729611L]) with weight 42 (23.8636363636%)
INFO:root:Finalizing MFT reconstruction of partition at offset 2786705379
INFO:root:Adding extra attributes from $ATTRIBUTE_LIST
INFO:root:Adding ghost entries from $INDEX_ALLOCATION
DEBUG:root:Found MATCH in positions set([2810584129L, 2810584815L]) with weight 2 (8.69565217391%)
DEBUG:root:Found MATCH in positions set([2952503119L, 2818620567L]) with weight 44 (93.6170212766%)
DEBUG:root:Found MATCH in positions set([2810415350L]) with weight 3 (12.5%)
INFO:root:Finalizing MFT reconstruction of partition at offset 2786706286
INFO:root:Adding extra attributes from $ATTRIBUTE_LIST
INFO:root:Adding ghost entries from $INDEX_ALLOCATION
DEBUG:root:Found MATCH in positions set([2810826549L, 2810826551L]) with weight 2 (10.5263157895%)
DEBUG:root:Found MATCH in positions set([2797722264L]) with weight 2 (10.5263157895%)
INFO:root:Finalizing MFT reconstruction of partition at offset 2786705576
INFO:root:Adding extra attributes from $ATTRIBUTE_LIST
INFO:root:Adding ghost entries from $INDEX_ALLOCATION
DEBUG:root:Found MATCH in positions set([2810877234L, 2810876533L]) with weight 2 (9.09090909091%)
DEBUG:root:Found MATCH in positions set([2811262072L, 2811261619L, 2811261847L]) with weight 2 (8.69565217391%)
DEBUG:root:Found MATCH in positions set([2809673534L]) with weight 3 (8.57142857143%)
INFO:root:Finalizing MFT reconstruction of partition at offset 2786706430
INFO:root:Adding extra attributes from $ATTRIBUTE_LIST
INFO:root:Adding ghost entries from $INDEX_ALLOCATION
DEBUG:root:Found MATCH in positions set([2791335807L, 2791334710L, 2791335703L]) with weight 2 (1.8018018018%)
DEBUG:root:Found MATCH in positions set([2791668323L]) with weight 5 (9.09090909091%)
INFO:root:Finalizing MFT reconstruction of partition at offset 2786705435
INFO:root:Adding extra attributes from $ATTRIBUTE_LIST
INFO:root:Adding ghost entries from $INDEX_ALLOCATION
DEBUG:root:Found MATCH in positions set([2952501311L, 2818618759L]) with weight 82 (97.619047619%)
DEBUG:root:Found MATCH in positions set([2952476663L, 2818594111L]) with weight 65 (38.4615384615%)
Traceback (most recent call last):
  File "./main.py", line 358, in <module>
    main()
  File "./main.py", line 340, in main
    parts.update(scanner.get_partitions())
  File "/home/knoppix/Downloads/RecuperaBit-1.1.1/recuperabit/fs/ntfs.py", line 833, in get_partitions
    part, address, most_likely
  File "/home/knoppix/Downloads/RecuperaBit-1.1.1/recuperabit/fs/ntfs.py", line 571, in find_boundary
    text_list, pattern_list, mft_address + delta, k=min_support
  File "/home/knoppix/Downloads/RecuperaBit-1.1.1/recuperabit/logic.py", line 152, in approximate_matching
    lookup = preprocess_pattern(pattern)
  File "/home/knoppix/Downloads/RecuperaBit-1.1.1/recuperabit/logic.py", line 128, in preprocess_pattern
    length = len(pattern)
OverflowError: long int too large to convert to int
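
The OverflowError above is raised by len(pattern). One plausible reading, not confirmed by the log, is that pattern is a custom sequence whose __len__ returns a value larger than the interpreter's native index size. Several positions in the log (for example 2952501311) exceed 2^31 - 1, which would matter on a 32-bit Python build. The sketch below reproduces that behaviour with a hypothetical HugeSequence class and a hypothetical safe_len helper; neither is part of RecuperaBit.

```python
import sys


class HugeSequence(object):
    """Hypothetical stand-in for a sequence with a very large logical length."""

    def __init__(self, length):
        self._length = length

    def __len__(self):
        # len() only accepts values that fit in a native index (<= sys.maxsize).
        return self._length


def safe_len(seq):
    """Return len(seq), or None when the length does not fit in a native index."""
    try:
        return len(seq)
    except OverflowError:
        return None


small = HugeSequence(10)
huge = HugeSequence(sys.maxsize + 1)
print(safe_len(small))
print(safe_len(huge))
```

If this is the cause, running on a 64-bit interpreter or tracking the length in a separate attribute instead of calling len() might avoid the crash.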

KeyError: 'content'

Hi,

While running: python2.7 main.py image -o recov -s log.save
I got the following error:

INFO:root:Parsing MFT entries
Traceback (most recent call last):
  File "main.py", line 328, in <module>
    main()
  File "main.py", line 310, in main
    parts.update(scanner.get_partitions())
  File "/mnt/hdd/RecuperaBit-master/recuperabit/fs/ntfs.py", line 719, in get_partitions
    part.add_file(NTFSFile(parsed, position, ads=ads_name))
  File "/mnt/hdd/RecuperaBit-master/recuperabit/fs/ntfs.py", line 303, in __init__
    parent_id = filenames[0]['content']['parent_entry']
KeyError: 'content'

Any idea?
Thanks
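
The failing line indexes filenames[0]['content'] directly, so the KeyError suggests that the first file name attribute was parsed without a 'content' key (for instance because the entry is damaged). A minimal sketch of a more tolerant lookup follows. It assumes, based only on the traceback, that filenames is a list of dicts; the helper name parent_entry_of is hypothetical and this is not the project's actual fix.

```python
def parent_entry_of(filenames):
    """Return the parent entry from the first attribute that has parsed
    content, or None if no attribute provides one."""
    for attr in filenames:
        # Assumption: each attribute is a dict that may lack 'content'.
        content = attr.get('content')
        if content is not None and 'parent_entry' in content:
            return content['parent_entry']
    return None


# Hypothetical sample data: the first attribute is missing its content.
attrs = [{'name': '$FILE_NAME'}, {'content': {'parent_entry': 5}}]
print(parent_entry_of(attrs))
print(parent_entry_of([{'name': '$FILE_NAME'}]))
```

A caller would then need to decide what to do with None, for example treating the file as having an unknown parent.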
