
Comments (12)

serathius commented on August 26, 2024

@benma The etcd project still has maintainers; however, we are stretched very thin with work on etcd. We can review PRs and fix bugs, but there is no active development on bbolt.

from bbolt.

benma commented on August 26, 2024

I am also running into the issue that Check() on a corrupt DB crashes. Check() should definitely return an error instead of panicking.

panics-on-check.db.zip

cc @serathius - I saw you recently committed to the repo - who to ping? Is this repo still maintained?

Edit: the address fault is a segmentation fault, not a panic, so this can't even be recovered with recover(). This seems to require a bugfix in this library, as it can't really be worked around.

tmm1 commented on August 26, 2024

I tried to use tx.Check() but it also blows up. Perhaps because I'm using ReadOnly: true?

unexpected fault address 0xaf41008
fatal error: fault
[signal SIGSEGV: segmentation violation code=0x1 addr=0xaf41008 pc=0x42e6aa7]

goroutine 90 [running]:
runtime.throw(0x4a48764, 0x5)
	/usr/local/Cellar/go/1.10.2/libexec/src/runtime/panic.go:616 +0x81 fp=0xc4205e0be0 sp=0xc4205e0bc0 pc=0x402d2e1
runtime.sigpanic()
	/usr/local/Cellar/go/1.10.2/libexec/src/runtime/signal_unix.go:395 +0x211 fp=0xc4205e0c30 sp=0xc4205e0be0 pc=0x4042b11
github.com/coreos/bbolt.(*freelist).read(0xc4200bf500, 0xaf41000)
	.go/src/github.com/coreos/bbolt/freelist.go:236 +0x37 fp=0xc4205e0ce0 sp=0xc4205e0c30 pc=0x42e6aa7
github.com/coreos/bbolt.(*DB).loadFreelist.func1()
	.go/src/github.com/coreos/bbolt/db.go:290 +0x12b fp=0xc4205e0d30 sp=0xc4205e0ce0 pc=0x42ef22b
sync.(*Once).Do(0xc42032f050, 0xc420055d78)
	/usr/local/Cellar/go/1.10.2/libexec/src/sync/once.go:44 +0xbe fp=0xc4205e0d68 sp=0xc4205e0d30 pc=0x406379e
github.com/coreos/bbolt.(*DB).loadFreelist(0xc42032ef00)
	.go/src/github.com/coreos/bbolt/db.go:283 +0x4e fp=0xc4205e0d98 sp=0xc4205e0d68 pc=0x42e201e
github.com/coreos/bbolt.(*Tx).check(0xc420384380, 0xc42039a600)
	.go/src/github.com/coreos/bbolt/tx.go:399 +0x47 fp=0xc4205e0fd0 sp=0xc4205e0d98 pc=0x42ed2c7
runtime.goexit()
	/usr/local/Cellar/go/1.10.2/libexec/src/runtime/asm_amd64.s:2361 +0x1 fp=0xc4205e0fd8 sp=0xc4205e0fd0 pc=0x405b871
created by github.com/coreos/bbolt.(*Tx).Check
	.go/src/github.com/coreos/bbolt/tx.go:393 +0x67

tmm1 commented on August 26, 2024

Without ReadOnly, Open() crashes right away on a different backup:

unexpected fault address 0x8bf2008
fatal error: fault
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8bf2008 pc=0x42e6aa7]

goroutine 79 [running]:
runtime.throw(0x4a48764, 0x5)
	/usr/local/Cellar/go/1.10.2/libexec/src/runtime/panic.go:616 +0x81 fp=0xc42047f0d8 sp=0xc42047f0b8 pc=0x402d2e1
runtime.sigpanic()
	/usr/local/Cellar/go/1.10.2/libexec/src/runtime/signal_unix.go:395 +0x211 fp=0xc42047f128 sp=0xc42047f0d8 pc=0x4042b11
github.com/coreos/bbolt.(*freelist).read(0xc4205cf320, 0x8bf2000)
	.go/src/github.com/coreos/bbolt/freelist.go:236 +0x37 fp=0xc42047f1d8 sp=0xc42047f128 pc=0x42e6aa7
github.com/coreos/bbolt.(*DB).loadFreelist.func1()
	.go/src/github.com/coreos/bbolt/db.go:290 +0x12b fp=0xc42047f228 sp=0xc42047f1d8 pc=0x42ef22b
sync.(*Once).Do(0xc42038d050, 0xc42047f270)
	/usr/local/Cellar/go/1.10.2/libexec/src/sync/once.go:44 +0xbe fp=0xc42047f260 sp=0xc42047f228 pc=0x406379e
github.com/coreos/bbolt.(*DB).loadFreelist(0xc42038cf00)
	.go/src/github.com/coreos/bbolt/db.go:283 +0x4e fp=0xc42047f290 sp=0xc42047f260 pc=0x42e201e
github.com/coreos/bbolt.Open(0xc4200edc20, 0x41, 0x180, 0xc42047f388, 0xc4206446b8, 0x0, 0x0)
	.go/src/github.com/coreos/bbolt/db.go:260 +0x38e fp=0xc42047f330 sp=0xc42047f290 pc=0x42e1c4e

test2.db.gz

tmm1 commented on August 26, 2024

Similar issue: boltdb/bolt#698

tmm1 commented on August 26, 2024

Here's my repro code:

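import (
	"log"
	"time"

	bolt "github.com/coreos/bbolt"
)

// readBackup opens a backup read-only and logs the number of keys in the
// "groups" bucket.
func readBackup(file string) error {
	db, err := bolt.Open(file, 0600, &bolt.Options{Timeout: 1 * time.Second, ReadOnly: true})
	if err != nil {
		return err
	}
	defer db.Close()

	// Return View's error instead of dropping it.
	return db.View(func(tx *bolt.Tx) error {
		if groups := tx.Bucket([]byte("groups")); groups != nil {
			num := groups.Stats().KeyN
			log.Printf("num: %v", num)
		}
		return nil
	})
}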

Would be really nice if there was some way I could check to see if the backup was consistent before trying to read it. Ideally bbolt would be able to deal with truncated/corrupted files itself and not crash the entire process.

subbu05 commented on August 26, 2024

defer func() {
	if r := recover(); r != nil {
		fmt.Printf("Corrupted or invalid boltDB file\n")
	}
}()

Add a deferred recover() like this.

cenkalti commented on August 26, 2024

With https://pkg.go.dev/runtime/debug#SetPanicOnFault , segmentation faults can be turned into panics.
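
A rough sketch of how that suggestion could be combined with recover(). `guardFaults` is a hypothetical helper, not part of bbolt, and the plain panic in `main` only stands in for a real fault. One caveat: debug.SetPanicOnFault applies only to the current goroutine, and the trace earlier in this thread shows Tx.Check doing its work in a goroutine it creates itself, so this would probably not cover that path. It should apply when the fault happens on the calling goroutine, as in the Open() trace.

```go
package main

import (
	"errors"
	"fmt"
	"runtime/debug"
)

// guardFaults is a hypothetical helper (not a bbolt API). It runs fn on the
// current goroutine with debug.SetPanicOnFault enabled and converts any
// resulting panic into an error.
func guardFaults(fn func() error) (err error) {
	// SetPanicOnFault affects only the current goroutine and returns the
	// previous setting, which is restored when guardFaults returns.
	old := debug.SetPanicOnFault(true)
	defer debug.SetPanicOnFault(old)

	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered while reading database: %v", r)
		}
	}()
	return fn()
}

func main() {
	// A plain panic stands in for the fault; in real use fn would wrap the
	// bolt.Open and db.View calls.
	err := guardFaults(func() error {
		panic(errors.New("simulated fault"))
	})
	fmt.Println(err)
}
```

This only turns a crash into an error for the caller; it does not make the file readable.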

ahrtr commented on August 26, 2024

> Check() should definitely return an error instead of panicking.

Agreed.

Fixing corrupted db files is currently my top priority. The most important thing is to figure out how to reproduce the issue. It would be great if anyone could provide clues on this. Please do not hesitate to ping me if you have any thoughts. Thanks.

FYI: we recently added a bbolt surgery clear-page-elements command as a workaround for fixing a corrupt db file; see #417.

ahrtr commented on August 26, 2024

> I am also running into the issue that Check() on a corrupt DB crashes. Check() should definitely return an error instead of panicking.
>
> panics-on-check.db.zip

The DB (panics-on-check.db) was somehow corrupted during the last transaction. The corrupted db can easily be fixed by reverting the meta page, which in effect rolls back the last transaction.

$ ./bbolt surgery revert-meta-page /tmp/panics-on-check.db --output ./new.db
The meta page is reverted.
$ ./bbolt check ./new.db 
OK
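
To illustrate why reverting the meta page amounts to a rollback, here is a minimal sketch. It assumes that the meta page with the higher transaction ID is treated as the current one (checksum validation is skipped here). The `metaPage` type and `pick` function are hypothetical names for illustration, and the values are copied from the `bbolt page` output for pages 0 and 1 in this comment.

```go
package main

import "fmt"

// metaPage holds the fields that `bbolt page` prints for a meta page.
// It is a hypothetical type for illustration, not bbolt's internal struct.
type metaPage struct {
	PageID, Root, Freelist, HWM, TxID uint64
}

// pick returns the meta page with the higher transaction ID as the active
// one and the other as the previous one. Checksum validation is left out.
func pick(m0, m1 metaPage) (active, previous metaPage) {
	if m1.TxID > m0.TxID {
		return m1, m0
	}
	return m0, m1
}

func main() {
	// Values from the page dumps of panics-on-check.db.
	m0 := metaPage{PageID: 0, Root: 4, Freelist: 5, HWM: 6, TxID: 2}
	m1 := metaPage{PageID: 1, Root: 3, Freelist: 2, HWM: 4, TxID: 1}

	active, previous := pick(m0, m1)
	fmt.Printf("active:   meta %d, root pgid %d, txid %d\n", active.PageID, active.Root, active.TxID)
	fmt.Printf("previous: meta %d, root pgid %d, txid %d\n", previous.PageID, previous.Root, previous.TxID)
}
```

Under that assumption, meta 0 (txid 2, root page 4) is the active one, and its root leaf is the page holding the out-of-range bucket pgid. Falling back to meta 1 (txid 1, root page 3) sidesteps that page entirely.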

I am almost sure that the corruption isn't caused by bbolt. The db file has 6 pages in total, but the bucket's root page is somehow a huge value, 7631988 (0x747474). Most likely it's caused by something else, e.g. a hardware or OS issue.

@benma Do you still remember how the corrupt file was generated? Was there anything unusual (e.g. power off, OS crash, etc.) when the corrupt file was being generated? BTW, what's the bbolt version?

$ ./bbolt  page /tmp/panics-on-check.db 0
Page ID:    0
Page Type:  meta
Total Size: 4096 bytes
Overflow pages: 0
Version:    2
Page Size:  4096 bytes
Flags:      00000000
Root:       <pgid=4>
Freelist:   <pgid=5>
HWM:        <pgid=6>
Txn ID:     2
Checksum:   eef96d7a2c1b336e

$ ./bbolt  page /tmp/panics-on-check.db 1
Page ID:    1
Page Type:  meta
Total Size: 4096 bytes
Overflow pages: 0
Version:    2
Page Size:  4096 bytes
Flags:      00000000
Root:       <pgid=3>
Freelist:   <pgid=2>
HWM:        <pgid=4>
Txn ID:     1
Checksum:   264c351a5179480f

$ ./bbolt  page /tmp/panics-on-check.db 4
Page ID:    4
Page Type:  leaf
Total Size: 4096 bytes
Overflow pages: 0
Item Count: 1

"bucket": <pgid=7631988,seq=0>

ahrtr commented on August 26, 2024

test.db.gz

The corrupted file provided by @tmm1 seems like a potential bbolt bug. What's your bbolt version?

The freelist page (108) was somehow reset (all fields have zero value).

What's confusing is that the two meta pages have exactly the same Root (99), Freelist (108) and HWM (482). Meta 0 has TXN 64921, while meta 1 has TXN 64920, which indicates that the last RW transaction did not change anything. But the freelist should change anyway. (This is a potential improvement point: we shouldn't sync the freelist if the RW TXN changes nothing.)

$ ./bbolt page /tmp/test.db  0
Page ID:    0
Page Type:  meta
Total Size: 4096 bytes
Overflow pages: 0
Version:    2
Page Size:  4096 bytes
Flags:      00000000
Root:       <pgid=99>
Freelist:   <pgid=108>
HWM:        <pgid=482>
Txn ID:     64921
Checksum:   aab8d660770b88f7

$ ./bbolt page /tmp/test.db  1
Page ID:    1
Page Type:  meta
Total Size: 4096 bytes
Overflow pages: 0
Version:    2
Page Size:  4096 bytes
Flags:      00000000
Root:       <pgid=99>
Freelist:   <pgid=108>
HWM:        <pgid=482>
Txn ID:     64920
Checksum:   929bdcc802b6f642

ahrtr commented on August 26, 2024

test.db.gz

There isn't even a way to fix the corrupted db file. The file is only 204800 bytes, so it has 50 pages (204800/4096). The root page ID (99), Freelist (108) and HWM (482) all point beyond the end of the file. I can't even find the root page in the available 50 pages. It seems that the file was somehow truncated, and the root was in the truncated part.

$ ls -lrt test.db
-rw-r--r-- 1 wachao wheel 204800 May 26 15:15 test.db
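
The arithmetic above could be turned into a simple pre-flight check along these lines. This is only a sketch: `checkPageIDs` is a hypothetical function, not a bbolt API, and the caller is assumed to have obtained the meta values some other way (for example from `bbolt page` output).

```go
package main

import "fmt"

// checkPageIDs is a hypothetical pre-flight check (not a bbolt API). It
// verifies that the page IDs recorded in a meta page fit inside a file of
// the given size. It does not handle databases written without a persisted
// freelist, where the freelist ID is a special sentinel value.
func checkPageIDs(fileSize, pageSize int64, root, freelist, hwm uint64) error {
	if pageSize <= 0 || fileSize < 0 {
		return fmt.Errorf("invalid sizes: file=%d page=%d", fileSize, pageSize)
	}
	pages := uint64(fileSize / pageSize) // whole pages present in the file

	if root >= pages {
		return fmt.Errorf("root page %d is beyond the %d pages in the file", root, pages)
	}
	if freelist >= pages {
		return fmt.Errorf("freelist page %d is beyond the %d pages in the file", freelist, pages)
	}
	// The high water mark is one past the last allocated page, so it may
	// equal the page count but not exceed it.
	if hwm > pages {
		return fmt.Errorf("HWM %d exceeds the %d pages in the file", hwm, pages)
	}
	return nil
}

func main() {
	// Values from test.db: 204800 bytes, 4096-byte pages, meta page 0.
	err := checkPageIDs(204800, 4096, 99, 108, 482)
	fmt.Println(err)
}
```

For test.db this reports the root page first, since 99 is already past the 50 pages available.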
