colinfinck / ntfs
An implementation of the NTFS filesystem in a Rust crate, usable from firmware level up to user-mode.
License: Apache License 2.0
I looked at the code but couldn't judge: how safe is it to read directory entry tables directly with this on a live system? I mean, how safe is it to scan a whole disk, say 5-10 TB, while it is being heavily written to, with potentially some deletes happening? Can a folder be deleted while you are reading its contents, or can a file in that folder be deleted?
I'm planning to write some code to do multi-threaded full-disk scans based on the directory entries only. I know it can be done, as this software does it: https://diskanalyzer.com/about. But I'm wary because I don't remember seeing any on-disk write-order guarantees saying that what you are reading is consistent. Maybe read each folder twice, and if both reads match, it's a correct entry?
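The read-twice idea could be sketched generically like this. Everything here is hypothetical, not crate API: `read_consistent` and the closure standing in for the actual directory walk are illustrations of the pattern only, and a matching double read still doesn't rule out the same torn state being read twice.

```rust
use std::collections::BTreeSet;

// Read a directory's entry names twice and accept the snapshot only if both
// reads agree; retry a bounded number of times if the directory is changing.
fn read_consistent<F>(mut read_entries: F, max_tries: u32) -> Option<BTreeSet<String>>
where
    F: FnMut() -> BTreeSet<String>,
{
    for _ in 0..max_tries {
        let first = read_entries();
        let second = read_entries();
        if first == second {
            return Some(first);
        }
    }
    None // the directory kept changing between reads
}

fn main() {
    // A stable directory converges on the first try.
    let stable = || ["a.txt".to_string(), "b.txt".to_string()].into_iter().collect();
    let snapshot = read_consistent(stable, 3).expect("stable directory should converge");
    assert_eq!(snapshot.len(), 2);
    println!("consistent snapshot: {:?}", snapshot);
}
```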
I have an image of an empty 8 GB USB drive formatted with 2 MB clusters (because I'm evil, not because it's useful :D ):
Trying to read that results in the above error thrown by boot_sector.rs#L50, because supposedly the cluster size isn't a power of two.
Not sure how that happens, but the surrounding code assumes that sector_size is always 512 and sectors_per_cluster is a u8, yet sector_size * sectors_per_cluster can reach 2097152. That obviously doesn't work out: sectors_per_cluster would have to be 4096, which doesn't fit in a u8.
When the error is thrown, self looks like this:
self = BiosParameterBlock {
sector_size: 512,
sectors_per_cluster: 244,
zeros_1: [
0,
0,
0,
0,
0,
0,
0,
],
media: 248,
zeros_2: [
0,
0,
],
dummy_sectors_per_track: 63,
dummy_heads: 255,
hidden_sectors: 0,
zeros_3: 0,
physical_drive_number: 128,
flags: 0,
extended_boot_signature: 0,
reserved: 0,
total_sectors: 15728639,
mft_lcn: Lcn(
1536,
),
mft_mirror_lcn: Lcn(
1,
),
file_record_size_info: -10,
zeros_4: [
0,
0,
0,
],
index_record_size_info: -12,
zeros_5: [
0,
0,
0,
],
serial_number: 11734322338205781691,
checksum: 0,
}
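For what it's worth, the raw value 244 decodes cleanly under the commonly documented NTFS convention that sectors_per_cluster bytes above 0x80 encode a negative power-of-two exponent, the same scheme the dump above already uses for file_record_size_info (-10 means 2^10 = 1024 bytes). A sketch, assuming that interpretation:

```rust
fn cluster_size_bytes(sector_size: u64, sectors_per_cluster_raw: u8) -> u64 {
    // Reinterpret the byte as i8: a negative value n means the cluster spans
    // 2^(-n) sectors. 244 as i8 == -12, so 2^12 = 4096 sectors.
    let sectors = if sectors_per_cluster_raw > 0x80 {
        1u64 << (sectors_per_cluster_raw as i8).wrapping_neg() as u32
    } else {
        sectors_per_cluster_raw as u64
    };
    sector_size * sectors
}

fn main() {
    // Values from the BiosParameterBlock dump above:
    assert_eq!(cluster_size_bytes(512, 244), 2 * 1024 * 1024); // 2 MiB clusters
    assert_eq!(cluster_size_bytes(512, 8), 4096); // the everyday 4 KiB case
    println!("{}", cluster_size_bytes(512, 244));
}
```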
For convenience here's an image that triggers it:
Hello, I started fuzzing the crate and found a lot of crashes (panics) and infinite loops. A lot of the panics are caused by unchecked use of the [] slice operator: sometimes the index is out of bounds, sometimes the range slice is invalid, like [start..end] with start = 320 and end = 307. I'm going to try to fix all the panics and write safer Rust code, but I'd like your opinion, and whether you have a preferred way of dealing with it.
Well, as an example, here is the first bug:
Lines 281 to 289 in 9348d72
start is equal to 320 and end to 307, so the range is not valid. I don't understand the deeper problem here, but I can offer a quick fix like this:
pub(crate) fn non_resident_value_data_and_position(&self) -> Result<(&'f [u8], NtfsPosition)> {
    debug_assert!(!self.is_resident());
    let start = self.offset + self.non_resident_value_data_runs_offset() as usize;
    let end = self.offset + self.attribute_length() as usize;
    let position = self.file.position() + start;
    if let Some(data) = self.file.record_data().get(start..end) {
        Ok((data, position))
    } else {
        Err(NtfsError::MissingIndexAllocation { position })
    }
}
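For reference, the difference between panicking indexing and checked slicing can be demonstrated in isolation, with plain std Rust independent of the crate:

```rust
fn main() {
    let data = [0u8; 1024];
    // `&data[320..307]` would panic: "slice index starts at 320 but ends at 307".
    // The checked form returns None for an inverted or out-of-bounds range:
    assert!(data.get(320..307).is_none()); // inverted range
    assert!(data.get(1000..1030).is_none()); // end past the slice
    assert_eq!(data.get(100..104).map(|s| s.len()), Some(4)); // valid range
    println!("checked slicing ok");
}
```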
I also don't know which NtfsError to use; I put MissingIndexAllocation here because that was an easy one and fairly accurate.
I have the PR ready with the changes induced by the Result in the function's return type.
Thanks for your time!
There is a bug which leads to an infinite loop.
In the next implementation of Iterator in NtfsAttributes here:
Lines 508 to 583 in 9348d72
next never returns None to indicate the end. I searched but wasn't able to find the root cause or how to solve the issue. Lines 216 to 231 in 9348d72: iter.next() never returns None, so the loop never ends.
Hello again, I have a new crash with the latest fixes, so here it is.
During Record fixup, I think invalid NTFS data that lies about sizes leads to this; here is the file to reproduce: test.zip
Here is what I observed:
The lines that crash are these:
Lines 59 to 62 in e3671c5
self.data.len() is 1024 and array_position_end is 1025, so the out-of-bounds access panics. Lines 51 to 57 in e3671c5
My guess is that array_end is wrong in the loop condition before that:
Line 47 in e3671c5
array_end is 120827 and array_position is 1025, incrementing by 2 each loop iteration, so it takes a long time before this exits. And array_end is built from this: Lines 41 to 42 in e3671c5
self.update_sequence_size() is 1019, which is quite relevant in this situation, but the offset goes through the roof because update_sequence_offset() is 119808.
Trying to find a given data stream in a file by its name (https://docs.rs/ntfs/latest/ntfs/struct.NtfsFile.html#method.data) requires matching the exact case of the data stream name.
From my tests, ADS names are also treated case-insensitively by Windows: invoking notepad file:stream or notepad FILE:STREAM will open the same data stream.
IMHO this isn't really a bug, given that handling case insensitivity requires building the upcase table and knowing which functions handle it properly. But it is "surprising" behavior, and I got caught by it :).
I think adding a new data method that handles case insensitivity (this is pretty trivial to do: just pass the Ntfs object and call upcase_cmp) and improving the documentation on the data method to specify this behavior would be nice, and would allow implementing the full use case of retrieving the data from a path plus stream name. What do you think?
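A minimal sketch of such a comparison, using Rust's Unicode uppercasing as a stand-in for the volume's real $UpCase table (the crate's upcase_cmp would be the proper tool; stream_name_eq_ci is a hypothetical helper, not crate API):

```rust
// Case-insensitive stream-name comparison. NTFS proper compares through the
// volume's $UpCase table; std's to_uppercase() approximates that here just
// to illustrate the idea.
fn stream_name_eq_ci(a: &str, b: &str) -> bool {
    a.to_uppercase() == b.to_uppercase()
}

fn main() {
    assert!(stream_name_eq_ci("file:stream", "FILE:STREAM"));
    assert!(stream_name_eq_ci("Zone.Identifier", "zone.identifier"));
    assert!(!stream_name_eq_ci("stream", "stream2"));
    println!("case-insensitive stream name match ok");
}
```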
As @msuhanov noted in #12, Arsenal Image Mounter can emulate 4Kn/512e devices. This indeed allows you to create valid images: if you mount an image as writable and format it with Windows, you get an NTFS file system with a 4096-byte sector size.
Here's an example image, which when mounted in Arsenal Image Mounter with 4k sector size and as "removable device" (because I didn't include a partition table) is recognized as a normal NTFS filesystem: empty-4kSectors-8GB.img.gz
Using the same method, I was also able to create images with sector sizes of 1024 and 2048 bytes. 8192 bytes and larger fails: Windows can create the partition, but can't format it. This matches my comment in #12 about the sector size being limited by the page size (so what about Windows on an Alpha or Itanium system with 8K pages? Probably not that relevant, though).
I am trying to open a volume where this crate fails to open every file beyond the well-known ones (i.e. FRN > 16). Digging into the code a little, I found this:
Lines 93 to 100 in cf4c127
It turns out that the MFT in my volume does have an attribute list. And taking a peek at ntfs-3g (1, 2) suggests that Ntfs::file must read the attribute list.
I tried playing with the code here to call find_attribute instead of find_resident_attribute, but a problem is that find_attribute invokes file (infinite loop). When I work around that, I can actually access all the files, but the FRNs are offset by 16 (i.e. looking up FRN 770 returns FRN 786). Given that the first 16 FRNs are in the resident portion of the MFT, this makes sense, though I'm not sure if this might be a bug in NtfsAttributes. The code there mentions concatenating attribute entries inside an attribute list, but the situation I'm observing here suggests that attributes from inside and outside an attribute list should be concatenated as well.
Hi, any idea what it would take to compile this driver for EFI? Has this been considered yet?
Is it possible to publish a new version with the updated deps and recently merged changes?
PS C:\Users\redacted\source\ntfs> .\target\debug\examples\ntfs_shell.exe \\.\C:
**********************************************************************
ntfs-shell - Demonstration of the ntfs Rust crate
by Colin Finck <colin@reactos.org>
**********************************************************************
Opened "\\.\C:" read-only.
ntfs-shell:\> cd Users
ntfs-shell:\Users> cd redacted
ntfs-shell:\Users\redacted> dir
thread 'main' panicked at 'called `Option::unwrap()` on a `None` value', C:\Users\redacted\source\ntfs\src\index_record.rs:68:51
stack backtrace:
0: std::panicking::begin_panic_handler
at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b\/library\std\src\panicking.rs:498
1: core::panicking::panic_fmt
at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b\/library\core\src\panicking.rs:107
2: core::panicking::panic
at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b\/library\core\src\panicking.rs:48
3: enum$<core::option::Option<u64> >::unwrap<u64>
at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b\library\core\src\option.rs:746
4: ntfs::index_record::NtfsIndexRecord::new<std::io::buffered::bufreader::BufReader<ntfs_shell::sector_reader::SectorReader<std::fs::File> > >
at .\src\index_record.rs:68
5: ntfs::structured_values::index_allocation::NtfsIndexAllocation::record_from_vcn<std::io::buffered::bufreader::BufReader<ntfs_shell::sector_reader::SectorReader<std::fs::File> > >
at .\src\structured_values\index_allocation.rs:65
6: ntfs::index::NtfsIndexEntries<ntfs::indexes::file_name::NtfsFileNameIndex>::next<ntfs::indexes::file_name::NtfsFileNameIndex,std::io::buffered::bufreader::BufReader<ntfs_shell::sector_reader::SectorReader<std::fs::File> > >
at .\src\index.rs:173
7: ntfs_shell::dir<std::io::buffered::bufreader::BufReader<ntfs_shell::sector_reader::SectorReader<std::fs::File> > >
at .\examples\ntfs-shell\main.rs:298
8: ntfs_shell::main
at .\examples\ntfs-shell\main.rs:85
9: core::ops::function::FnOnce::call_once<enum$<core::result::Result<tuple$<>,anyhow::Error>, 1, 18446744073709551615, Err> (*)(),tuple$<> >
at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b\library\core\src\ops\function.rs:227
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
I don't know what's causing it or how to narrow it down, but I'm happy to run it with a couple of println statements or something if you can give me a clue where to start.
Reproduction code:
fn main() {
let data = [
235, 82, 144, 78, 84, 70, 83, 32, 32, 32, 32, 0, 2, 1, 0, 0, 0, 0, 0, 0, 0, 248, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 128, 0, 128, 0, 255, 222, 222, 94, 1, 94, 1, 222, 222,
222, 1, 0, 0, 0, 0, 0, 255, 7, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 8, 0, 112, 112, 121, 32, 97,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 10, 0, 0, 0, 0, 128, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 8, 0, 0, 0, 0, 0, 0, 85, 170, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 255, 255, 255, 255, 255, 255, 255, 250,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 10, 0, 0, 0, 0, 128,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 8, 0, 0, 0, 0, 0, 0, 85, 170, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 255, 255, 255, 255, 255, 255, 255, 250, 0, 0, 0, 0, 0, 0, 0, 0, 85,
170, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 255, 255, 255, 255, 255, 255, 255, 250, 0,
0, 0, 0, 0,
];
let mut data_2 = std::io::Cursor::new(data);
let _ = ntfs::Ntfs::new(&mut data_2);
}
Found this with a fuzzer. I wouldn't expect the library to produce valid results on invalid file systems, but it shouldn't crash.
In the example, file_name.name() fails because it actually returns a Result.
let mut ntfs = Ntfs::new(&mut fs).unwrap();
let root_dir = ntfs.root_directory(&mut fs).unwrap();
let index = root_dir.directory_index(&mut fs).unwrap();
let mut iter = index.entries();
while let Some(entry) = iter.next(&mut fs) {
let entry = entry.unwrap();
let file_name = entry.key().unwrap();
println!("{}", file_name.name());
}
PS C:\Users\redacted\source\ntfs> .\target\debug\examples\ntfs_shell.exe \\.\C:
**********************************************************************
ntfs-shell - Demonstration of the ntfs Rust crate
by Colin Finck <colin@reactos.org>
**********************************************************************
Opened "\\.\C:" read-only.
ntfs-shell:\> fsinfo
Cluster Size: 4096
File Record Size: 1024
MFT Byte Position: 0xc0000000
NTFS Version: 3.1
Sector Size: 512
Serial Number: 1651328304054153915
Size: 943060998656
thread 'main' panicked at 'called `Option::unwrap()` on a `None` value', src\structured_values\volume_name.rs:96:46
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
As you can see, I indeed don't have a name set, which is just fine with Windows (Tested on Windows 10 21H1):
On a different drive with a configured name the command succeeds.
fn main() {
let data = [
235, 82, 144, 78, 84, 70, 83, 32, 32, 0, 0, 0, 0, 0, 0, 128, 32, 128, 0, 255, 15, 0, 0, 0,
0, 0, 0, 32, 0, 0, 0, 0, 0, 0, 0, 255, 7, 0, 0, 0, 0, 0, 0, 149, 0, 0, 0, 8, 0, 0, 0, 120,
183, 16, 124, 224, 39, 74, 127, 0, 0, 0, 0, 14, 31, 190, 113, 124, 172, 34, 192, 116, 11,
86, 180, 14, 187, 7, 0, 205, 16, 94, 235, 240, 50, 228, 205, 22, 205, 25, 235, 254, 84,
104, 105, 115, 32, 105, 115, 32, 110, 111, 116, 32, 97, 32, 98, 111, 111, 116, 97, 98, 108,
101, 32, 100, 105, 115, 107, 46, 32, 80, 50, 101, 97, 115, 101, 32, 105, 110, 115, 101,
114, 116, 32, 97, 32, 98, 111, 111, 116, 97, 98, 108, 101, 32, 102, 108, 111, 112, 112,
121, 32, 97, 110, 100, 13, 10, 112, 114, 101, 115, 115, 32, 97, 110, 121, 32, 107, 101,
121, 32, 116, 111, 32, 116, 114, 121, 32, 97, 103, 97, 105, 110, 32, 97, 110, 121, 32, 107,
101, 121, 32, 116, 111, 32, 116, 114, 121, 32, 97, 103, 97, 105, 110, 32, 46, 46, 46, 32,
13, 10, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 255, 255, 255,
255, 255, 255, 2, 183, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 128, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 85, 170,
];
let mut data = std::io::Cursor::new(data);
let mut fs = ntfs::Ntfs::new(&mut data).unwrap();
let _ = fs.read_upcase_table(&mut data);
}
Error: thread 'main' panicked at 'range end index 4 out of range for slice of length 0', /home/jess/.cargo/registry/src/github.com-1ecc6299db9ec823/ntfs-0.1.0/src/record.rs:108:9
Hi Colin,
I have a question that drives me nuts :), and it has nothing to do with the framework at all; rather, it might clear up some concepts in my head. When reading through the Microsoft documentation, I understand that to access the raw physical disk and get a handle to read the MBR, you need to specify the path as \\.\PhysicalDrive0, yet you managed to do it through \\.\C:. Is it possible to do both, or what is the idea behind that?
Thanks in advance, and apologies for writing this as an issue.
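To illustrate the difference: both paths live in the Win32 device namespace. \\.\C: opens the volume mounted at C:, while \\.\PhysicalDrive0 opens the entire disk, partition table and all partitions included, so with the latter you would have to locate the NTFS partition yourself first. A small sketch of the path handling in Rust (the actual open is commented out because it needs Windows and administrator rights):

```rust
fn main() {
    // Raw string literals avoid escaping every backslash in device paths.
    let volume_path = r"\\.\C:"; // one volume, NTFS starts at offset 0
    let disk_path = r"\\.\PhysicalDrive0"; // whole disk, incl. partition table
    assert_eq!(volume_path, "\\\\.\\C:");
    // On Windows, with administrator rights, either path can be opened like a
    // regular file, e.g.:
    // let file = std::fs::File::open(volume_path)?;
    println!("{} / {}", volume_path, disk_path);
}
```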
In Record::update_sequence_array_count, there is a case where update_sequence is 0 and sizeof<u16> (which is 2) is subtracted from it, so it underflows below 0 for a u16.
Lines 107 to 113 in 7dce76c
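The underflow can be reproduced in isolation (the values are hypothetical stand-ins for the fields involved):

```rust
fn main() {
    let update_sequence: u16 = 0; // corrupt on-disk value from the report
    let entry_size = std::mem::size_of::<u16>() as u16; // 2
    // Plain `update_sequence - entry_size` would underflow: it panics in a
    // debug build and wraps around to 65534 in release.
    assert_eq!(update_sequence.checked_sub(entry_size), None);
    assert_eq!(update_sequence.saturating_sub(entry_size), 0);
    println!("underflow handled");
}
```

checked_sub makes the corrupt case explicit (and lets the caller return an error), while saturating_sub silently clamps to 0; which is preferable depends on whether a zero count is acceptable downstream.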
saturating_sub works here, but I'm not sure it is the preferred option.
Hey, me again...
The fix 441ea7b for #24 introduced a new bug: sector_position_end is incremented at each iteration, so it still needs to be verified inside the loop. Here, in line 67, sector_position_end is out of bounds.
Lines 56 to 67 in 7dce76c
We just need to add this condition back to the loop:
if sector_position_end > self.data.len() {
return Err(NtfsError::UpdateSequenceArrayExceedsRecordSize {
position: self.position,
array_count: self.update_sequence_array_count(),
record_size: self.data.len(),
});
}