Comments (36)
I need more detailed information about the setup. Please use a simplified setup that can be replicated.
https://github.com/trapexit/mergerfs#support
What OS? How was it installed? Is the offset for diff always different? What is your exact config?
from mergerfs.
Hi.
I will try to give you all the information:
The problem has existed on my system since 16.10.2023.
All downloads prior to that date are OK.
I download files via JDownloader, wget or curl.
I made a system update on the 16th via "apt update && apt upgrade" and rebooted.
The only change relevant to this case, IMHO, is an update of glibc from version 2.36-9+deb12u1 to 2.36-9+deb12u3.
mergerfs is not running under any virtualization or in a container such as Docker.
- OS:
root@pve:/ext-usb/mergerfs/union/data# cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 12 (bookworm)"
NAME="Debian GNU/Linux"
VERSION_ID="12"
VERSION="12 (bookworm)"
VERSION_CODENAME=bookworm
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"
root@pve:/ext-usb/mergerfs/union/data#
root@pve:/ext-usb/mergerfs/union/data# uname -a
Linux pve 6.2.16-15-pve #1 SMP PREEMPT_DYNAMIC PMX 6.2.16-15 (2023-09-28T13:53Z) x86_64 GNU/Linux
- MergerFS Version:
root@pve:/ext-usb/mergerfs/union/data# mergerfs -V
mergerfs v2.37.1
I downloaded the deb package from GitHub and installed it via dpkg.
The offset for diff is not always the same:
First Download:
root@pve:/ext-usb/mergerfs/union/data# cmp PDF200MB_sdh1.pdf PDF200MB_mergerfs.pdf
PDF200MB_sdh1.pdf PDF200MB_mergerfs.pdf differ: byte 56137241, line 224719
root@pve:/ext-usb/mergerfs/union/data#
Second Download:
root@pve:/ext-usb/sdh1/data# cmp PDF200MB_sdh1.pdf PDF200MB_mergerfs.pdf
PDF200MB_sdh1.pdf PDF200MB_mergerfs.pdf differ: byte 197153817, line 1336303
root@pve:/ext-usb/sdh1/data#
Third Download:
--> Files are the same
Fourth Download:
root@pve:/ext-usb/sdh1/data# cmp PDF200MB_sdh1.pdf PDF200MB_mergerfs.pdf
PDF200MB_sdh1.pdf PDF200MB_mergerfs.pdf differ: byte 25270809, line 105234
root@pve:/ext-usb/sdh1/data#
I mount the mergerfs via fstab:
root@pve:/ext-usb/sdh1/data# cat /etc/fstab
/dev/pve/root / ext4 errors=remount-ro 0 1
UUID=F391-570F /boot/efi vfat defaults 0 1
/dev/pve/swap none swap sw 0 0
proc /proc proc defaults 0 0
UUID=fd756d7d-a57b-4447-951a-1cba87230415 /ext-usb/sdc1 xfs defaults 0 1
UUID=6486f55c-8481-4881-af2e-3f1dd5609c99 /ext-usb/sda1 xfs defaults 0 1
UUID=53318944-360b-4192-a796-04fca3b0125b /ext-usb/sdf1 xfs defaults 0 1
UUID=ace6e4c2-8246-4259-939d-238ee5eeee10 /ext-usb/sdg1 xfs defaults 0 1
UUID=bc7b716e-c420-40b5-af6e-8b2d89b7a9f6 /ext-usb/sdb1 xfs defaults 0 1
UUID=dcde2686-23a9-4cf8-866b-b158b1253042 /ext-usb/sdh1 xfs defaults 0 1
/ext-usb/sda1:/ext-usb/sdb1:/ext-usb/sdc1:/ext-usb/sdf1:/ext-usb/sdg1:/ext-usb/sdh1 /ext-usb/mergerfs/union fuse.mergerfs defaults,nonempty,allow_other,use_ino,ignorepponrename=true,dropcacheonclose=true,category.create=mfs,moveonenospc=true,posix_acl=true,func.getattr=newest,fsname=mergerfs,cache.files=per-process 0 0
root@pve:/ext-usb/sdh1/data#
As a further test I downloaded the file to another disk and copied it to the mergerfs mount:
root@pve:/ext-usb/sdh1/data# cd /root/
root@pve:~# wget https://link.testfile.org/PDF200MB
--2023-10-22 09:14:40-- https://link.testfile.org/PDF200MB
Resolving link.testfile.org (link.testfile.org)... 188.114.96.3, 188.114.97.3, 2a06:98c1:3121::3, ...
Connecting to link.testfile.org (link.testfile.org)|188.114.96.3|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://files.testfile.org/PDF/200MB-TESTFILE.ORG.pdf [following]
--2023-10-22 09:14:40-- https://files.testfile.org/PDF/200MB-TESTFILE.ORG.pdf
Resolving files.testfile.org (files.testfile.org)... 188.114.96.3, 188.114.97.3, 2a06:98c1:3121::3, ...
Connecting to files.testfile.org (files.testfile.org)|188.114.96.3|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 214119654 (204M) [application/pdf]
Saving to: ‘PDF200MB’
PDF200MB 100%[========================================>] 204.20M 115MB/s in 1.8s
2023-10-22 09:14:42 (115 MB/s) - ‘PDF200MB’ saved [214119654/214119654]
root@pve:~# cp PDF200MB /ext-usb/mergerfs/union/data/PDF200MB_mergerfs
root@pve:~# cmp PDF200MB /ext-usb/mergerfs/union/data/PDF200MB_mergerfs
PDF200MB /ext-usb/mergerfs/union/data/PDF200MB_mergerfs differ: byte 11872793, line 53254
root@pve:~# cp PDF200MB /ext-usb/mergerfs/union/data/PDF200MB_mergerfs
root@pve:~# cmp PDF200MB /ext-usb/mergerfs/union/data/PDF200MB_mergerfs
root@pve:~# cp PDF200MB /ext-usb/mergerfs/union/data/PDF200MB_mergerfs
root@pve:~# cmp PDF200MB /ext-usb/mergerfs/union/data/PDF200MB_mergerfs
PDF200MB /ext-usb/mergerfs/union/data/PDF200MB_mergerfs differ: byte 149810201, line 587626
root@pve:~#
So it seems that this is a general problem with file operations on the mergerfs mount on my system.
I made an strace of a cp command that resulted in a corrupt file on the mergerfs mount.
from mergerfs.
The problem exists on my system since the 15.10.2023.
What do you mean by this?
from mergerfs.
The problem exists on my system since the 15.10.2023.
What do you mean by this?
I'm sorry, I meant 16.10.2023, as that is when I made the update I described above.
from mergerfs.
I made a mergerfs trace while copying the file "PDF200MB" to the mergerfs mount, which resulted in a corrupt file.
from mergerfs.
Have you tried any settings relevant to narrow down things? Such as cache.files, threads, etc.?
from mergerfs.
I have not altered the configuration yet, as this system has run for about a year without any problems.
I installed mergerfs 2.37.1 only because this problem occurred all of a sudden.
Prior to this I had 2.33.5, which was bundled with Debian.
As a test I added threads=-1 and also threads=6. Same result.
What I noticed is that timing seems to play a role.
I made a small test:
cp PDF200MB /ext-usb/mergerfs/union/data/PDF200MB_mergerfs && echo "Compare A: " && cmp -l PDF200MB /ext-usb/mergerfs/union/data/PDF200MB_mergerfs && sleep 2 && echo -n "Compare B: " && cmp PDF200MB /ext-usb/mergerfs/union/data/PDF200MB_mergerfs
Output is:
root@pve:~# cp PDF200MB /ext-usb/mergerfs/union/data/PDF200MB_mergerfs && echo "Compare A: " && cmp -l PDF200MB /ext-usb/mergerfs/union/data/PDF200MB_mergerfs && sleep 2 && echo -n "Compare B: " && cmp PDF200MB /ext-usb/mergerfs/union/data/PDF200MB_mergerfs
Compare A:
Compare B: PDF200MB /ext-usb/mergerfs/union/data/PDF200MB_mergerfs differ: byte 171749122, line 841596
So directly after the copy the files are the same. After 2 seconds they differ.
And the offset is different every time.
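For anyone who wants to repeat this timing check, here is a minimal bash sketch of the same copy, compare, wait, compare sequence wrapped in a function. The name copy_and_recheck and its arguments are my own and not part of mergerfs. It uses cmp -s rather than cmp -l, so it only reports whether the files match, not where they differ.

```shell
# copy_and_recheck SRC DST [DELAY]
# Copy SRC to DST, compare right away, wait DELAY seconds (default 2),
# then compare again. Returns 0 only if both comparisons match.
copy_and_recheck() {
    local src=$1 dst=$2 delay=${3:-2} status=0

    cp -- "$src" "$dst" || return 2

    if cmp -s -- "$src" "$dst"; then
        echo "compare A (immediately): identical"
    else
        echo "compare A (immediately): DIFFERENT"
        status=1
    fi

    sleep "$delay"

    if cmp -s -- "$src" "$dst"; then
        echo "compare B (after ${delay}s): identical"
    else
        echo "compare B (after ${delay}s): DIFFERENT"
        status=1
    fi

    return "$status"
}

# Example call with the paths used in this thread:
#   copy_and_recheck PDF200MB /ext-usb/mergerfs/union/data/PDF200MB_mergerfs 2
```

A non-zero return after "compare A: identical" would match the behavior described above, where the copy only looks wrong once some time has passed.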
from mergerfs.
This is proxmox? Not regular debian? I just installed Debian 12 on a system, installed bookworm x86_64 version from releases page, wget'ed the file... works fine. Copied files around through mergerfs. No issues. Same settings.
from mergerfs.
Sorry, I did not mention that. Yes, it is Proxmox, latest stable version. The underlying OS is bookworm.
I run the tests directly on bare metal, not in a virtual machine.
from mergerfs.
I installed mergerfs 2.37.1 only because this problem occurred all of a sudden.
You are saying this happened after an update of the OS across multiple versions of mergerfs? This is important information. Please... I truly mean all info when I say I need all details about what your setup is and what you've tried.
from mergerfs.
No. The problem also occurred on my system with 2.33.5.
I installed 2.37.1 manually ONLY to check whether it would fix it.
from mergerfs.
I should do tests with other file sizes.
I think the error only occurs with files bigger than 100MB.
I will get back to you with my results.
from mergerfs.
I ran the copy-and-compare test with 3 more file sizes (100MB, 50MB, 20MB).
These files also get corrupted.
from mergerfs.
Base Configuration:
First Installation: 12.01.2023 Proxmox 7.3 (Bullseye)
Updated in March to 7.4 (Bullseye)
Updated in August to Proxmox 8.0 (Bookworm)
Updated to 8.0.3 and 8.0.4
Proxmox 8.0.4 (Debian 12) + mergerfs 2.33.5: Everything was fine.
Updated the system on 16.10.2023.
The only relevant change besides the Samba updates was the update of libc-bin to version 2.36-9+deb12u3.
From that point on all downloaded files were corrupt.
I installed mergerfs 2.37.1 yesterday to see if the problem persisted.
from mergerfs.
Downgraded to the previously installed libc version 2.36-9+deb12u1.
Same behavior, so I think libc has nothing to do with it.
from mergerfs.
This sounds like a hardware or OS issue. mergerfs has changed a decent amount between 2.33.5 and 2.37.1 and the fact it happened after an OS update after a year of working fine suggests similar. It could be that something in the OS triggered a bug in mergerfs but it would be unique to Proxmox. They seem to use a non-Debian kernel. Have you updated the kernel lately?
I am unable to install proxmox on a spare machine I have. I'll have to try a VM.
from mergerfs.
And have you tried any other FUSE filesystems? There were a number of changes to FUSE in the 6.2 kernel and I'm pretty sure some bugs were introduced.
from mergerfs.
I can't seem to replicate the issue in a VM with a fresh install of Proxmox.
from mergerfs.
Hi.
Have you tried updating your Proxmox installation to 8.0.4?
I have now configured a unionfs as a workaround. No problems thus far.
It is slow but it gets the job done for now.
As a test I configured mergerfs and unionfs in parallel with the same disks and mount points.
I ran the copy-and-compare test with a 100MB file on both union filesystems.
unionfs works. mergerfs does not.
It must be something with the Proxmox kernel then, or with FUSE.
from mergerfs.
I'm not a Proxmox user. How does one upgrade?
from mergerfs.
In addition: the update I made also upgraded the kernel from 6.2.16-12-pve to 6.2.16-15-pve.
Regarding the update:
https://tteck.github.io/Proxmox/
-> Proxmox VE Tools -> Proxmox VE Post Install
Run the command below in the Proxmox VE Shell.
bash -c "$(wget -qLO - https://github.com/tteck/Proxmox/raw/main/misc/post-pve-install.sh)"
After the reboot you can update/upgrade via apt.
from mergerfs.
I'm having nothing but problems with networking with virtualbox and proxmox. I'll have to try again another time.
from mergerfs.
Just as I posted the above I got it working.... and after the update it still works fine.
from mergerfs.
OK.
Since we have tried everything to reproduce this, it must be an error on my side.
In the meantime unionfs has worked flawlessly for the last 4 hours,
so I will use unionfs until I figure out where the problem with mergerfs lies.
Thank you for your time and support.
from mergerfs.
There shouldn't be "an error on [your] side." There is no valid situation where corruption will happen that wouldn't happen on another filesystem.
unionfs is very old and uses a simple set of FUSE options. This is why it is important to play with different options related to writing to see if any of them change things. cache.files, writeback, moveonenospc, etc.
from mergerfs.
Here in Germany it is now 7:00 AM and I have to go to work. After work I will change my mergerfs options to the ones shown in your image. I will get back to you in a couple of hours.
Thank you.
from mergerfs.
Eureka, I found it.
I had to remove the option "dropcacheonclose=true", although I use cache.files=partial.
The same configuration without that option now works flawlessly. No corruption.
I ran the test again with the option enabled: corruption every time I create a file on mergerfs.
So there must be a problem with the kernel.
Could it be this?
https://lore.kernel.org/all/[email protected]/
from mergerfs.
Curious. It shouldn't be, given that you are running a 6.2 kernel, unless Proxmox backported it from 6.3+. Also... I'm using "DONTNEED". Not "NOREUSE". In part because NOREUSE didn't work as desired.
To be clear: you have cache.files=partial and then dropcacheonclose=true == corruption and dropcacheonclose=false == no corruption? What about cache.files=off? Or auto-full?
from mergerfs.
cache.files=partial + dropcacheonclose=true == corruption
cache.files=off + dropcacheonclose=true == corruption
cache.files=auto-full + dropcacheonclose=true == corruption
cache.files=[partial/off/auto-full] + dropcacheonclose=false == no corruption
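The combinations above can also be enumerated mechanically. The following is only a dry-run sketch in bash: it prints one mergerfs mount command per combination and mounts nothing. The helper name print_test_matrix is made up, and the branch list and mount point in the example call are placeholders, not my real configuration.

```shell
# print_test_matrix BRANCHES MOUNTPOINT
# Print (not run) one mergerfs mount command for every combination of
# cache.files and dropcacheonclose discussed in this thread.
print_test_matrix() {
    local branches=$1 mountpoint=$2 cache drop
    for cache in partial off auto-full; do
        for drop in true false; do
            echo "mergerfs -o cache.files=${cache},dropcacheonclose=${drop} ${branches} ${mountpoint}"
        done
    done
}

# Example call (hypothetical scratch mount point):
#   print_test_matrix /ext-usb/sda1:/ext-usb/sdh1 /mnt/mergerfs-test
```

Each printed line can then be run by hand, followed by a copy-and-compare test and an unmount, so that only one option changes between runs.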
from mergerfs.
Interesting. dropcacheonclose literally is used in 1 place in all the code and just calls fadvise dont need twice. If you have the time... could you try https://github.com/Feh/nocache ? Just copy a file with nocache tool like it shows. And perhaps test with mergerfs pointing to just /tmp?
It could be a xfs bug too.
from mergerfs.
I compiled nocache and did the tests.
Copied a file to my mergerfs mount (dropcacheonclose=true, cache.files=partial) on XFS with nocache and compared it = corruption
Configured a mergerfs mount with the same configuration on /tmp and did the copy/compare = no corruption
I think you are right. XFS is the problem.
from mergerfs.
I meant to use nocache directly with the underlying filesystem. Like /ext-usb/sdc1. Since nocache and mergerfs are effectively doing the same thing.
from mergerfs.
Sorry, this was a misunderstanding.
I ran the test on all 6 hard disks.
The copy/compare test works on sda1, sdb1, sdc1, sdf1 and sdg1.
sdh1, my new HDD (only 3 months old), a Seagate Exos X18 SATA III 18TB, corrupts the file.
smartctl logged no errors in a short test.
Could the SATA cable be causing this?
But why did it work for so long?
from mergerfs.
So all drives... you ran something like "nocache cp /tmp/randomfile /ext-usb/sdX/" and all worked fine except sdh1?
Do you know if you formatted that one with different settings? A different version of mkfs.xfs or whatnot? As I understand xfs has been getting numerous enhancements over the past year or two. Might want to use xfs tooling to check each filesystem's settings and do a xfs_repair or whatnot.
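A per-drive check along these lines could be scripted roughly as below. This is a sketch under assumptions, not what was actually run: it swaps the nocache tool for GNU dd with oflag=nocache, which flushes the file and advises the kernel to drop its cached pages, so the following cmp is more likely to read from the device instead of the page cache. The function name check_branch and the .copytest suffix are made up for illustration.

```shell
# check_branch SRC DIR
# Copy SRC into DIR (an underlying filesystem, not the mergerfs mount),
# flush it, ask the kernel to drop the copy's cached pages, then compare.
# The test copy is left in place for inspection.
check_branch() {
    local src=$1 dir=$2 dst
    dst="$dir/$(basename -- "$src").copytest"

    cp -- "$src" "$dst" || return 2

    # GNU dd only: count=0 copies nothing; fdatasync flushes the file and
    # oflag=nocache requests that its page cache be discarded.
    dd of="$dst" oflag=nocache conv=notrunc,fdatasync count=0 2>/dev/null \
        || echo "note: could not drop cache for $dst" >&2

    if cmp -s -- "$src" "$dst"; then
        echo "$dir: ok"
    else
        echo "$dir: CORRUPT"
        return 1
    fi
}

# Example loop over the branches from the fstab in this thread:
#   for d in /ext-usb/sd{a,b,c,f,g,h}1; do check_branch /tmp/randomfile "$d"; done
```

Dropping the cache is only advisory, so a clean result on one run does not prove a drive is fine; repeating the loop a few times gives a better picture.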
from mergerfs.
Hi.
I owe you an apology for wasting your time.
I found the problem.
The hard disk that had the problem was connected to a PCI-E to SATA adapter.
I think either something changed in the kernel or the adapter became faulty all of a sudden, as it had worked flawlessly until now.
I ordered a new adapter, swapped it in, and there is no more corruption on this disk after writing a file to it.
The adapter is an ASUS U3S6 Rev 1.0. A Marvell SE9123 chip is responsible for both SATA ports.
As I said, sorry for wasting your time.
This can be closed.
from mergerfs.
I appreciate the apology, I did spend several hours looking into this, but... it happens. No worries.
So the chipset of both the non-working and working adapters are the same? No errors in the kernel logs? That is an oddly specific issue.
from mergerfs.