timshannon / badgerhold Goto Github PK

BadgerHold is an embeddable NoSQL store for querying Go types built on Badger

License: MIT License

Go 100.00%

badgerhold's Introduction

BadgerHold

BadgerHold is a simple querying and indexing layer on top of a Badger instance. The goal is to create a simple, higher level interface on top of Badger DB that simplifies dealing with Go Types and finding data, but exposes the underlying Badger DB for customizing as you wish. By default the encoding used is Gob, so feel free to use the GobEncoder/Decoder interface for faster serialization. Or, alternately, you can use any serialization you want by supplying encode / decode funcs to the Options struct on Open.

One Go Type will be prefixed with it's type name, so you can store multiple types in a single Badger database with conflicts.

This project is a rewrite of the BoltHold project on the Badger KV database instead of Bolt. For a performance comparison between bolt and badger, see https://blog.dgraph.io/post/badger-lmdb-boltdb/. I've written up my own comparison of the two focusing on characteristics other than performance here: https://tech.townsourced.com/post/boltdb-vs-badger/.

Indexes

Indexes allow you to skip checking any records that don't meet your index criteria. If you have 1000 records and only 10 of them are of the Division you want to deal with, then you don't need to check to see if the other 990 records match your query criteria if you create an index on the Division field. The downside of an index is added disk reads and writes on every write operation. For read heavy operations datasets, indexes can be very useful.

In every BadgerHold store, there will be a reserved bucket _indexes which will be used to hold indexes that point back to another bucket's Key system. Indexes will be defined by setting the badgerhold:"index" struct tag on a field in a type.

type Person struct {
	Name string
	Division string `badgerhold:"index"`
}

// alternate struct tag if you wish to specify the index name
type Person struct {
	Name string
	Division string `badgerholdIndex:"IdxDivision"`
}

This means that there will be an index created for Division that will contain the set of unique divisions, and the main record keys they refer to.

Optionally, you can implement the Storer interface, to specify your own indexes, rather than using the badgerHoldIndex struct tag.

Queries

Queries are chain-able constructs that filters out any data that doesn't match it's criteria. An index will be used if the .Index() chain is called, otherwise BadgerHold won't use any index.

Queries will look like this:

s.Find(&result, badgerhold.Where("FieldName").Eq(value).And("AnotherField").Lt(AnotherValue).Or(badgerhold.Where("FieldName").Eq(anotherValue)))

Fields must be exported, and thus always need to start with an upper-case letter. Available operators include:

Equal - Where("field").Eq(value)
Not Equal - Where("field").Ne(value)
Greater Than - Where("field").Gt(value)
Less Than - Where("field").Lt(value)
Less than or Equal To - Where("field").Le(value)
Greater Than or Equal To - Where("field").Ge(value)
In - Where("field").In(val1, val2, val3)
IsNil - Where("field").IsNil()
Regular Expression - Where("field").RegExp(regexp.MustCompile("ea"))
Matches Function - Where("field").MatchFunc(func(ra *RecordAccess) (bool, error))
Skip - Where("field").Eq(value).Skip(10)
Limit - Where("field").Eq(value).Limit(10)
SortBy - Where("field").Eq(value).SortBy("field1", "field2")
Reverse - Where("field").Eq(value).SortBy("field").Reverse()
Index - Where("field").Eq(value).Index("indexName")
Contains - Where("field").Contains(val1)
ContainsAll - Where("field").Contains(val1, val2, val3)
ContainsAny - Where("field").Contains(val1, val2, val3)
HasKey - Where("field").HasKey(val1) // to test if a Map value has a key

An empty / zero value query matches against all records, because it has no critiera. You can then use Skip and Limit to page through all records in your dataset:

q := &badgerhold.Query{}

err := store.Find(&result, q.Skip(10).Limit(50))

If you want to run a query's criteria against the Key value, you can use the badgerhold.Key constant:

store.Find(&result, badgerhold.Where(badgerhold.Key).Ne(value))

You can access nested structure fields in queries like this:

type Repo struct {
  Name string
  Contact ContactPerson
}

type ContactPerson struct {
  Name string
}

store.Find(&repo, badgerhold.Where("Contact.Name").Eq("some-name")

Instead of passing in a specific value to compare against in a query, you can compare against another field in the same struct. Consider the following struct:

type Person struct {
	Name string
	Birth time.Time
	Death time.Time
}

If you wanted to find any invalid records where a Person's death was before their birth, you could do the following:

store.Find(&result, badgerhold.Where("Death").Lt(badgerhold.Field("Birth")))

Queries can be used in more than just selecting data. You can delete or update data that matches a query.

Using the example above, if you wanted to remove all of the invalid records where Death < Birth:

// you must pass in a sample type, so BadgerHold knows which bucket to use and what indexes to update
store.DeleteMatching(&Person{}, badgerhold.Where("Death").Lt(badgerhold.Field("Birth")))

Or if you wanted to update all the invalid records to flip/flop the Birth and Death dates:

store.UpdateMatching(&Person{}, badgerhold.Where("Death").Lt(badgerhold.Field("Birth")), func(record interface{}) error {
	update, ok := record.(*Person) // record will always be a pointer
	if !ok {
		return fmt.Errorf("Record isn't the correct type!  Wanted Person, got %T", record)
	}

	update.Birth, update.Death = update.Death, update.Birth

	return nil
})

If you simply want to count the number of records returned by a query use the Count method:

// need to pass in empty datatype so badgerhold knows what type to count
count, err := store.Count(&Person{}, badgerhold.Where("Death").Lt(badgerhold.Field("Birth")))

You can also use FindOne which is a shorthand for Find + Limit(1) which returns a single record instead of a slice of records, and will return an ErrNotFound if no record is found, unlike a normal Find query where an empty slice would be returned with no error.

	result := &ItemTest{}
	err := store.FindOne(result, query)

Keys in Structs

A common scenario is to store the badgerhold Key in the same struct that is stored in the badgerDB value. You can automatically populate a record's Key in a struct by using the badgerhold:"key" struct tag when running Find queries.

Another common scenario is to insert data with an auto-incrementing key assigned by the database. When performing an Insert, if the type of the key matches the type of the badgerhold:"key" tagged field, the data is passed in by reference, and the field's current value is the zero-value for that type, then it is set on the data before insertion.

type Employee struct {
	ID uint64 `badgerhold:"key"`
	FirstName string
	LastName string
	Division string
	Hired time.Time
}

// old struct tag, currenty still supported but may be deprecated in the future
type Employee struct {
	ID uint64 `badgerholdKey`
	FirstName string
	LastName string
	Division string
	Hired time.Time
}

Badgerhold assumes only one of such struct tags exists. If a value already exists in the key field, it will be overwritten.

If you want to insert an auto-incrementing Key you can pass the badgerhold.NextSequence() func as the Key value.

err := store.Insert(badgerhold.NextSequence(), data)

The key value will be a uint64.

If you want to know the value of the auto-incrementing Key that was generated using badgerhold.NextSequence(), then make sure to pass a pointer to your data and that the badgerholdKey tagged field is of type uint64.

err := store.Insert(badgerhold.NextSequence(), &data)

Slices in Structs and Queries

When querying slice fields in structs you can use the Contains, ContainsAll and ContainsAny criterion.

val := struct {
    Set []string
}{
    Set: []string{"1", "2", "3"},
}
bh.Where("Set").Contains("1") // true
bh.Where("Set").ContainsAll("1", "3") // true
bh.Where("Set").ContainsAll("1", "3", "4") // false
bh.Where("Set").ContainsAny("1", "7", "4") // true

The In, ContainsAll and ContainsAny critierion accept a slice of interface{} values. This means you can build your queries by passing in your values as arguments:

where := badgerhold.Where("Id").In("1", "2", "3")

However if you have an existing slice of values to test against, you can't pass in that slice because it is not of type []interface{}.

t := []string{"1", "2", "3", "4"}
where := badgerhold.Where("Id").In(t...) // compile error

Instead you need to copy your slice into another slice of empty interfaces:

t := []string{"1", "2", "3", "4"}
s := make([]interface{}, len(t))
for i, v := range t {
    s[i] = v
}
where := badgerhold.Where("Id").In(s...)

You can use the helper function badgerhold.Slice which does exactly that.

t := []string{"1", "2", "3", "4"}
where := badgerhold.Where("Id").In(badgerhold.Slice(t)...)

Unique Constraints

You can create a unique constraint on a given field by using the badgerhold:"unique" struct tag:

type User struct {
  Name string
  Email string `badgerhold:"unique"` // this field will be indexed with a unique constraint
}

The example above will only allow one record of type User to exist with a given Email field. Any insert, update or upsert that would violate that constraint will fail and return the badgerhold.ErrUniqueExists error.

ForEach

When working with large datasets, you may not want to have to store the entire dataset in memory. It's be much more efficient to work with a single record at a time rather than grab all the records and loop through them, which is what cursors are used for in databases. In BadgerHold you can accomplish the same thing by calling ForEach:

err := store.ForEach(badgerhold.Where("Id").Gt(4), func(record *Item) error {
	// do stuff with record

	// if you return an error, then the query will stop iterating through records

	return nil
})

Aggregate Queries

Aggregate queries are queries that group results by a field. For example, lets say you had a collection of employees:

type Employee struct {
	FirstName string
	LastName string
	Division string
	Hired time.Time
}

And you wanted to find the most senior (first hired) employee in each division:

result, err := store.FindAggregate(&Employee{}, nil, "Division") //nil query matches against all records

This will return a slice of Aggregate Result from which you can extract your groups and find Min, Max, Avg, Count, etc.

for i := range result {
	var division string
	employee := &Employee{}

	result[i].Group(&division)
	result[i].Min("Hired", employee)

	fmt.Printf("The most senior employee in the %s division is %s.\n",
		division, employee.FirstName + " " + employee.LastName)
}

Aggregate queries become especially powerful when combined with the sub-querying capability of MatchFunc.

Many more examples of queries can be found in the find_test.go file in this repository.

Comparing

Just like with Go, types must be the same in order to be compared with each other. You cannot compare an int to a int32. The built-in Go comparable types (ints, floats, strings, etc) will work as expected. Other types from the standard library can also be compared such as time.Time, big.Rat, big.Int, and big.Float. If there are other standard library types that I missed, let me know.

You can compare any custom type either by using the MatchFunc criteria, or by satisfying the Comparer interface with your type by adding the Compare method: Compare(other interface{}) (int, error).

If a type doesn't have a predefined comparer, and doesn't satisfy the Comparer interface, then the types value is converted to a string and compared lexicographically.

Behavior Changes

Since BadgerHold is a higher level interface than Badger DB, there are some added helpers. Instead of Put, you have the options of:

Insert - Fails if key already exists.
Update - Fails if key doesn't exist ErrNotFound.
Upsert - If key doesn't exist, it inserts the data, otherwise it updates the existing record.

When getting data instead of returning nil if a value doesn't exist, BadgerHold returns badgerhold.ErrNotFound, and similarly when deleting data, instead of silently continuing if a value isn't found to delete, BadgerHold returns badgerhold.ErrNotFound. The exception to this is when using query based functions such as Find (returns an empty slice), DeleteMatching and UpdateMatching where no error is returned.

When should I use BadgerHold?

BadgerHold will be useful in the same scenarios where BadgerDB is useful, with the added benefit of being able to retire some of your data filtering code and possibly improved performance.

You can also use it instead of SQLite for many scenarios. BadgerHold's main benefit over SQLite is its simplicity when working with Go Types. There is no need for an ORM layer to translate records to types, simply put types in, and get types out. You also don't have to deal with database initialization. Usually with SQLite you'll need several scripts to create the database, create the tables you expect, and create any indexes. With BadgerHold you simply open a new file and put any type of data you want in it.

options := badgerhold.DefaultOptions
options.Dir = "data"
options.ValueDir = "data"

store, err := badgerhold.Open(options)
defer store.Close()
if err != nil {
	// handle error
	log.Fatal(err)
}


err = store.Insert("key", &Item{
	Name:    "Test Name",
	Created: time.Now(),
})

That's it!

Badgerhold currently has over 80% coverage in unit tests, and it's backed by BadgerDB which is a very solid and well built piece of software, so I encourage you to give it a try.

If you end up using BadgerHold, I'd love to hear about it.

badgerhold's People

Contributors

Stargazers

Watchers

badgerhold's Issues

Count is eating up all the memory

Hi guys

When using Count in a DB with a huge amount of entries the memory is being eaten up.
And it is being freed right after Count.

Cheers,
artvel

Panic on nested Key Values

Same issue from bolthold: timshannon/bolthold#43

Transaction Conflict. Please retry error

Most of the time I get this error, Is this normal? if it is normal how can I try again in code
Error:

panic: Transaction Conflict. Please retry

goroutine 10 [running]:
gitlab.com/nanovy/rent/server/database.(*BadgerDB).Test.func1(0xc0000240e0, 0x2, 0xc00000e640)
        /home/eren/Desktop/Nanovy/Rent/Server/database/connect.go:69 +0xf6
created by gitlab.com/nanovy/rent/server/database.(*BadgerDB).Test
        /home/eren/Desktop/Nanovy/Rent/Server/database/connect.go:62 +0x106
exit status 2

Code:

type Item struct {
	Id        uint64 `badgerhold:"key"`
	Name string
	Cost    int
}

func (bdb *BadgerDB) Test() {
	db := bdb.Store
	db.Badger().DropAll()

	var wg sync.WaitGroup

	for i := 0; i < 100; i++ {
		wg.Add(1)
		i := i
		go func() {
			defer wg.Done()
			item := Item{
				Name: "Glass",
				Cost: i,
			}
			if err := db.Insert(badgerhold.NextSequence(), &item); err != nil {
				panic(err)
			}
		}()
	}

	wg.Wait()

	var items []Item
	if err := db.Find(&items, badgerhold.Where("Cost").Ge(98)); err != nil {
		panic(err)
	}
	spew.Dump(items)
}

Cannot run tests

I'm trying to run the tests and getting the following error:
# github.com/timshannon/badgerhold ./store.go:48:41: cannot call non-function badger.DefaultOptions (type badger.Options) FAIL github.com/timshannon/badgerhold [build failed]

Am I doing something wrong here?

Roaring Bitmaps for indexing

As a previous user of badgerhold I had performance issues with large indices. After much research I have started to use Roaring Bitmaps for indexing.

I would like to suggest that badgerhold use RoaringBitmaps for indexing. I believe this will provide significant performance improvements.

https://github.com/RoaringBitmap/roaring

https://sudonull.com/post/28381-Go-Bitmap-Indexes-Wild-Speed-Search-Badoo-Blog

SortBy ASC and DESC per Field

badgerhold is awesome!

To make the query even better it would be nice to have the possibility to define ASC or DESC per field.

What do you guys think?

Query semantics (limits and empty query construction)

Since both Query.Limit and Query.Skip should not accept negative values, why not simply change the argument type to uint or one of its flavours? That would be one less error to worry about, and one less type conversion for users with unsigned semantics for limit and offset.

Also, when using badgerhold.Find(&result, nil), is the returned order the struct key field order?

I would like to create a query to iterate over all stored entities in key order and retrieve some of them with a specified limit and offset.
For that I have resorted to the following:

query := (&badgerhold.Query{}).Skip(int(idx)).Limit(int(page))
err := badgerhold.Find(&result, query)

Is there something I am missing with regards to empty query construction and default order, or is this functionality not in scope?

Typo Errors and Grammar construct error in Codebase docs and comments

There are a couple of spelling, typo, and grammar construct errors in codebase docs and comments

Is the Badger stream framework supported?

I was wondering if Bagderhold can support stream (https://dgraph.io/docs/badger/get-started/#stream) if not is it a major enhancement or minor and easy to adapt?

Badgerhold queries don't work with both badgerhold.Key and In operator

Running badgerhold Find or DeleteMatching query against badgerhold.Key with In operator causes error: unexpected EOF.
Also it would be nice to include tests that use both badgerhold.Key and In operator.

Code that reproduces bug:

type TestStruct struct {
	Value int
}

func TestBadgerFind(t *testing.T) {
	options := bh.DefaultOptions
	options.InMemory = true
	store, err := bh.Open(options)
	require.NoError(t, err)
	defer func() {
		err = store.Close()
		require.NoError(t, err)
	}()

	for i := 0; i < 5; i++ {
		err = store.Insert(i, TestStruct{i})
		require.NoError(t, err)
	}

	results := make([]TestStruct, 0, 5)
	err = store.Find(
		&results,
		bh.Where(bh.Key).In(0, 1, 2, 3, 4),
	)
	// results in unexpected EOF
	require.NoError(t, err)
	require.Len(t, results, 5)
}

badgerhold version: 74e0215974cf
go version: go1.16.5 darwin/arm64

Enhance documentation for field names

As I read through the documentation, you are using GOB by default and the field names in queries have upper-case first letter which I think means that the fields used in queries must match the de/serializer. So if one would use json then the field names in queries would be most likely all lower-cased. If this asumption is correct, I think it should be mentioned in the documentation that for queries, one should use field names the serializer uses itself and not struct fields, by default.

Load multiple by primary key

Since objects are stored by the key that is provided to the crud methods via argument, they do not need to be declared as indices, except "key" which will provided auto-incrementing uin64 index, if needed. I have string primary keys(uuid) so how do I use the Find() method to load multiple objects by their ids? I do not see any method for the query. Do I need to define the index or how does it work?

Add support for transformation tags

I have some indices but I want them to be case-insensitive. So instead of having to define two fields - one for original value and the other for index value, it would be good if there would be tag support for something like 'lowercase' or 'uppercase' or something like that. This is especially useful for e-mail addresses where the case can vary but I always want to index it as lower-cased value. Same as things like handles/slugs like for twitter or github user names.

Question on String Sort

When using Find and Query, what is badgerhold's definition of string sorting? The reason I ask is that I'm getting a weird order for a string field that I've asked Badgerhold DB to return. The field is a string and I get a result set similar to:

nil
...about 7
nil
Blarg
Richard Smith
Joe Doe

Sorting is apparently doing something because when I do a Reverse, it returns the list to me in reverse order but in this weird, non-alphabetic order.

I'm using github.com/timshannon/badgerhold v0.0.0-20200316131017-7bcffb989f0d

Keys prefixes collision during iteration

Inserting values of a type whose name has a prefix of another type name causes problems when iterating over keys. It tries to include a type with a longer name as well.
One way to resolve it would be to store the key with an additional separator : after the type name in badger.

Code that reproduces issue:

type TestStruct struct {
	Value int
}

type TestStructCollision struct {
	Value int
}

func TestBadgerKeyPrefixCollision(t *testing.T) {
	options := bh.DefaultOptions
	options.InMemory = true
	store, err := bh.Open(options)
	require.NoError(t, err)
	defer func() {
		err = store.Close()
		require.NoError(t, err)
	}()

	for i := 0; i < 5; i++ {
		err = store.Insert(i, TestStruct{Value: i})
		require.NoError(t, err)

		err = store.Insert(i, TestStructCollision{Value: i})
		require.NoError(t, err)
	}

	query := bh.Where(bh.Key).In(0, 1, 2, 3, 4)
	results := make([]TestStruct, 0, 5)
	err = store.Find(
		&results,
		query,
	)
	require.NoError(t, err)
}

Go get issue

go get github.com/timshannon/badgerhold/...

github.com/timshannon/badgerhold

......\go\src\github.com\timshannon\badgerhold\store.go:48:2: cannot use badger.DefaultOptions (type func(string) badger.Options) as type badger.Options in field value

Q: Strategies for data migration?

First:
so glad i found badgerhold, just what i was looking for after unexplained db updates silently not working in sqlite3 when using multiple go routines.

My question:
Say I remove a field in a data structure and want to migrate old data. Anyone suggest a strategy? My thought was to read data into map[string]interface{} and adjust data on application restart, but attempts to serialize into map structure gave decoder errors. I'd rather not have to maintain a "legacy data structure" of every data structure and marshal into that before unmarshalling into "official data structure"

Badgerhold with Badger v2?

Can badgerhold be used with badgerDb v2?

Support for ForEach?

I'm looking to migrate from bolthold to badgerhold... one thing that seems to be missing (unless I am overlooking something) that I use quite a lot is ForEach? Are you planning to add this?

Nested indexes are not supported?

Hi, I was trying to use badgerhold in my project. I got a problem when I was trying to find a record. It said, "Nested indexes are not supported. Only top level structures can be indexed". I not so sure what I did wrong, or maybe I misunderstand the 'index' feature. If, I did anything wrong, please help me out to understand the index feature. Thank you!

Data struct:

type AdminUser struct {
	Username  string `json:"username" badgerhold:"key"`
	NickName  string `json:"nick_name"`
	FirstName string `json:"first_name"`
	LastName  string `json:"last_name"`
	Email     string `json:"email" badgerhold:"unique"`
	AvatarUrl string `json:"avatar_url"`
	Password  string `json:"password"`
	Authorities    string `json:"authorities"`
	CreateTime     int64  `json:"create_time"`
	LastModifyTime int64  `json:"last_modify_time"`
	LastLoginTime  int64  `json:"last_login_time"`
	StorageScope string `badgerhold:"index"`
}

Code for inserting data:

adminUser := &store.AdminUser{
		Username:        "abcd",
		NickName:        "aaa",
		FirstName:       "a",
		LastName:        "b",
		Email:           "[email protected]",
		Password:        "12345",
		Authorities:     "super",
		StorageScope: "admin.user",
}
err := app.Store.Insert("abcd", adminUser)

Code for finding record:

dbResult := &store.AdminUser{}
err := app.Store.FindOne(dbResult, badgerhold.Where("Username").Eq("abcd").Index("admin.user"))

Error message: panic: Nested indexes are not supported. Only top level structures can be indexed

Some additional information:

github.com/dgraph-io/badger/v3 v3.2011.1
github.com/timshannon/badgerhold/v3 v3.0.0-20210415132401-e7c90fb5919f

GOVERSION="go1.16.3"

OS: Manjaro Linux x86_64
Kernel: 5.12.1-2-MANJARO

Is FindOne good-to-go?

Support for FindOne was added in #42 but its not documented in README.md ... is this simply an omission or is it not ready for use yet?

Question: how to select what parts of the record to pull

I may have gotten this wrong, but I'm noticing that if I have a record that is a complex type badgerhold tend to retain the full data structure into memory. Sometimes I'm just want to perform a count or select a single value on the record. Consider this struct:

type Document struct {
	ID         uuid.UUID `badgerhold:"index"`
	Reference  string    `badgerhold:"index"`
	Title      string
	Excerpt    []byte
	Type       string `badgerhold:"index"`
	Status     string
	Properties []Property
	Date       time.Time
	Modified   time.Time
}

If I run this command:

store.Find(&results, badgerhold.Where("Type").Eq("speakers").Index("Type"))

I get 400 or so records but the query takes 93ms to run. When I print the values the WHOLE body is being pulled. At first this scared me but then I realised it pulled far more than I required. Is there I way I can only extract certain properties?

I'd imagine if I restricted the amount of data to extract that performance will increase dramatically. Most of the heft in my record is in the Properties element. It's about 5x the size of the actual struct.

Apologies if this is documented. I'm trying to tip toe into Badger and Badgerhold. :-)

Define actions interface

Hi,
for my *sql drivers I am using the following interface:

type SqlConnection interface {
	Query(query string, args ...interface{}) (*sql.Rows, error)
	QueryRow(query string, args ...interface{}) *sql.Row
	Exec(query string, args ...interface{}) (sql.Result, error)
}

this allows me to switch between the sql database and sql transaction object, since both support these methods. Then, in my (mysql/...)driver's methods I simply call driver.conn.Query() and so on, and not care about the underlying object.

The driver has the following methods:

BeginTransaction(ctx context.Context) (Driver, error)
CommitTransaction() error
RollbackTransaction() error
InTransaction() bool

with the following implementation:

func (r *repo) BeginTransaction(ctx context.Context) (foo.Repository, error) {
	r.mx.Lock()
	defer r.mx.Unlock()

	// if we're already in transaction, keep the same instance
	if _, ok := r.conn.(driver.Tx); ok {
		return r, nil
	}

	// if we're not in transaction, create new instance
	tx, err := r.opts.db.BeginTx(ctx, nil)
	if err != nil {
		return nil, ErrOut(err)
	}

	inTx := &repo{
		conn: tx,
		opts: r.opts,
	}

	return inTx, nil
}

func (r *repo) CommitTransaction() error {
	r.mx.Lock()
	defer r.mx.Unlock()

	if tx, ok := r.conn.(driver.Tx); ok {
		err := tx.Commit()
		r.conn = r.opts.db
		return ErrOut(err)
	}

	return ErrOut(ErrNotInTx)
}

func (r *repo) RollbackTransaction() error {
	r.mx.Lock()
	defer r.mx.Unlock()

	if tx, ok := r.conn.(driver.Tx); ok {
		err := tx.Rollback()
		r.conn = r.opts.db
		return ErrOut(err)
	}

	return ErrOut(ErrNotInTx)
}

I am now trying to use this library as replacement for mysql db because I think it will work better for me due to some graph edges logic I am needing that mysql is not suitable for.

And it would be nice to be able to follow my established logic with this library instead of having to check if I have open transaction or not and switch between the direct method or the transactioned one(ie. Delete() vs TxDelete()....). so I think having an established interface in bh would take care of this nicely and I tihnk many people would find it useful.

Issue with basic example

Hi Tim, great work on this project. I attempted running an example and got an error:

type Item struct {
	Name    string
	Created time.Time
}
store, err := badgerhold.Open("example", 0666, nil)
if err != nil {
	fmt.Println(err)
}
err = store.Insert("key", &Item{
	Name:    "Test Name",
	Created: time.Now(),
})

The resulting error:

../../pkg/mod/github.com/timshannon/[email protected]/store.go:48:41: cannot call non-function badger.DefaultOptions (type badger.Options)

Not sure if I did something incorrectly (doesn't look like it). Any ideas?

README.MD - data by value

I don't get it.
Do I have to pass data by value or data by reference?

If you want to know the value of the auto-incrementing Key that was generated using badgerhold.NextSequence(), then make sure to pass your data by value and that the badgerholdKey tagged field is of type uint64.

err := store.Insert(badgerhold.NextSequence(), &data)

Support for other backends?

It looks like "they" made badger completely unusable on Windows and I am in a need of fast db so bolt won't work for me(it has blocking transactions). So I was wondering if the *hold could be made into a wrapper that could take different backends, like bitcask or pebble?

Key tag issue

As referenced in the README, the key field tag for a struct can be either badgerholdKey or badgerhold:"key", but in the package tests the tag `badgerholdKey:"Key" is used.

Is that an issue of outdated documentation?

When a uint64 struct field is tagged as a struct key, it behaves differently (badgerhold.Store.Get and badgerhold.Store.Find) depending on which tag is used.

What is the correct way to declare a key field?

package badgerhold_test

import (
	"testing"
	"time"

	"github.com/timshannon/badgerhold"
)

type BHTag struct {
	ID uint64 `badgerholdKey`
	Data string
	Created time.Time
}

type BHKTag struct {
	ID uint64 `badgerholdKey:"Key"`
	Data string
	Created time.Time
}

type BKTag struct {
	ID uint64 `badgerhold:"key"`
	Data string
	Created time.Time
}

func TestGetKeyTag1(t *testing.T) {
	testWrap(t, func(store *badgerhold.Store, t *testing.T) {
		data := &BHTag{
			Data: `tag badgerholdKey`,
			Created: time.Now(),
		}

		err := store.Insert(badgerhold.NextSequence(), &BKTag{})
		if err != nil {
			t.Fatalf("Error on empty value consumption from sequence: %s", err)
		}

		err = store.Insert(badgerhold.NextSequence(), data)
		if err != nil {
			t.Fatalf("Error inserting data: %s", err)
		}

		key := data.ID
		if key == 0 {
			t.Fatalf("Error on insert: empty key")
		}

		findResult := make([]BHTag, 0)
		err = store.Find(&findResult, badgerhold.Where(badgerhold.Key).Eq(key))
		if err != nil {
			t.Fatalf("Error on find: %s", err)
		}
		if len(findResult) != 1 {
			t.Fatalf("Error: find returned %d results, expected 1", len(findResult))
		}
		if findResult[0].ID == 0 {
			t.Fatalf("Error: empty value for key from find")
		}

		getResult := &BHTag{}
		err = store.Get(key, getResult)
		if err != nil {
			t.Fatalf("Error on get: %s", err)
		}
		if getResult.ID == 0 {
			t.Fatalf("Error: empty value for key from get")
		}
	})
}

func TestGetKeyTag2(t *testing.T) {
	testWrap(t, func(store *badgerhold.Store, t *testing.T) {
		data := &BHKTag{
			Data: `tag badgerholdKey:"Key"`,
			Created: time.Now(),
		}

		err := store.Insert(badgerhold.NextSequence(), &BHKTag{})
		if err != nil {
			t.Fatalf("Error on empty value consumption from sequence: %s", err)
		}

		err = store.Insert(badgerhold.NextSequence(), data)
		if err != nil {
			t.Fatalf("Error inserting data: %s", err)
		}

		key := data.ID
		if key == 0 {
			t.Fatalf("Error on insert: empty key")
		}

		findResult := make([]BHKTag, 0)
		err = store.Find(&findResult, badgerhold.Where(badgerhold.Key).Eq(key))
		if err != nil {
			t.Fatalf("Error on find: %s", err)
		}
		if len(findResult) != 1 {
			t.Fatalf("Error: find returned %d results, expected 1", len(findResult))
		}
		if findResult[0].ID == 0 {
			t.Fatalf("Error: empty value for key from find")
		}

		getResult := &BHKTag{}
		err = store.Get(key, getResult)
		if err != nil {
			t.Fatalf("Error on get: %s", err)
		}
		if getResult.ID == 0 {
			t.Fatalf("Error: empty value for key from get")
		}
	})
}

func TestGetKeyTag3(t *testing.T) {
	testWrap(t, func(store *badgerhold.Store, t *testing.T) {
		data := &BKTag{
			Data: `tag badgerhold:"key"`,
			Created: time.Now(),
		}

		err := store.Insert(badgerhold.NextSequence(), &BKTag{})
		if err != nil {
			t.Fatalf("Error on empty value consumption from sequence: %s", err)
		}

		err = store.Insert(badgerhold.NextSequence(), data)
		if err != nil {
			t.Fatalf("Error inserting data: %s", err)
		}

		key := data.ID
		if key == 0 {
			t.Fatalf("Error on insert: empty key")
		}

		findResult := make([]BKTag, 0)
		err = store.Find(&findResult, badgerhold.Where(badgerhold.Key).Eq(key))
		if err != nil {
			t.Fatalf("Error on find: %s", err)
		}
		if len(findResult) != 1 {
			t.Fatalf("Error: find returned %d results, expected 1", len(findResult))
		}
		if findResult[0].ID == 0 {
			t.Fatalf("Error: empty value for key from find")
		}

		getResult := &BKTag{}
		err = store.Get(key, getResult)
		if err != nil {
			t.Fatalf("Error on get: %s", err)
		}
		if getResult.ID == 0 {
			t.Fatalf("Error: empty value for key from get")
		}
	})
}

Sorry for the long test, but I am not well-versed with reflection in Go.

Triplestore structure (querying sub-properties)

I was wondering if a structure like this would be possible or if you someone suggest a better structure:

type Property struct {
	Value     interface{}
	Reference uuid.UUID
}
type Triple struct {
	Subject Property
	Predicate string
	Object Property
}
docID := Property{ Reference: uuid.v4() }
err = store.Insert(badgerhold.NextSequence(), &Triple{
	Subject: docID,
	Predicate: "Document",
	Object: Property{Value: "Blog Post"},
})

Basically with this layout I'd have to do the following:

store.Find(&results, badgerhold.Where("Predicate").Eq("Document").And("Object.Reference").Eq(uuid)
for _, doc := range results {
	fmt.Println(doc.O)
}

The reason for this structure is that my Subject and Object properties can actually be references to other records. Perhaps it's best to keep Subject and Object as interfaces{} and use a MatchFunc to check the type is a UUID?

Are you going to migrate the library on BadgerDB v3.2011.* ?

Open is unsafe for concurrent use

I've been digging into a data race caught by go test -race in a downstream project, summarized below:

Write at 0x0000024de218 by goroutine 301:
  github.com/timshannon/badgerhold/v3.Open()
      /home/mvdan/go/pkg/mod/github.com/timshannon/badgerhold/[email protected]/store.go:56 +0x64
  go.vocdoni.io/dvote/vochain/scrutinizer.InitDB()
      /home/mvdan/src/dvote/vochain/scrutinizer/db.go:77 +0xf7
  [...]

Previous read at 0x0000024de218 by goroutine 142:
  github.com/timshannon/badgerhold/v3.(*Store).TxUpsert()
      /home/mvdan/go/pkg/mod/github.com/timshannon/badgerhold/[email protected]/put.go:206 +0x3aa
  github.com/timshannon/badgerhold/v3.(*Store).Upsert.func1()
      /home/mvdan/go/pkg/mod/github.com/timshannon/badgerhold/[email protected]/put.go:168 +0xa4
  [...]

This seems to be because Open sets globals like encode and decode, which are used by methods on Store later on.

This feels wrong - the moment you open a bunch of badgerhold DBs and try to use them, at best you're going to get the wrong behavior (only one wins the write to the global), and at worst you're going to get a panic like the one above.

I think the fix here is simple; the encode and decode variables shouldn't be globals, but rather fields on each Store struct.

I also strongly suggest you to set up a test that creates many DBs in parallel and reads+writes to them, and ensure that go test -race is happy. If you haven't done any concurrency safety testing before, I imagine there will be a few more data races lurking in your code :)

HasSuffix filter does not work for keys

Hi,
I looks like filtering (string) keys by prefix (and suffix) does not work as expected. For example, when I have inserted items with keys prefixed with test I would expect that store.Find(&result, badgerhold.Where(badgerhold.Key).HasPrefix("test")) gives the inserted items but it finds nothing.

I have modified the example._test.go to reproduce the issue.

package main

import (
	"fmt"
	"log"
	"os"
	"time"

	"io/ioutil"

	"github.com/dgraph-io/badger/v3"
	"github.com/timshannon/badgerhold/v4"
)

type Item struct {
	ID       string
	Category string `badgerholdIndex:"Category"`
	Created  time.Time
}

func id(n int) string {
	return "test_" + fmt.Sprint(n)
}

func main() {
	data := []Item{
		{
			ID:       id(0),
			Category: "blue",
			Created:  time.Now().Add(-4 * time.Hour),
		},
		{
			ID:       id(1),
			Category: "red",
			Created:  time.Now().Add(-3 * time.Hour),
		},
		{
			ID:       id(2),
			Category: "blue",
			Created:  time.Now().Add(-2 * time.Hour),
		},
		{
			ID:       id(3),
			Category: "blue",
			Created:  time.Now().Add(-20 * time.Minute),
		},
	}

	dir := tempdir()
	defer os.RemoveAll(dir)

	options := badgerhold.DefaultOptions
	options.Dir = dir
	options.ValueDir = dir
	store, err := badgerhold.Open(options)
	defer store.Close()

	if err != nil {
		// handle error
		log.Fatal(err)
	}

	// insert the data in one transaction

	err = store.Badger().Update(func(tx *badger.Txn) error {
		for i := range data {
			err := store.TxInsert(tx, data[i].ID, data[i])
			if err != nil {
				return err
			}
		}
		return nil
	})

	if err != nil {
		// handle error
		log.Fatal(err)
	}

	// Find all items in the blue category that have been created in the past hour
	var result []Item

	err = store.Find(&result, badgerhold.Where(badgerhold.Key).HasPrefix("test"))

	if err != nil {
		// handle error
		log.Fatal(err)
	}

	fmt.Printf("This should be non-empty: %v", result)
	// Output: 3

}

// tempdir returns a temporary dir path.
func tempdir() string {
	name, err := ioutil.TempDir("", "badgerhold-")
	if err != nil {
		panic(err)
	}
	return name
}

Get/Update don't work when opening existing DB

Hello,

I have noticed something strange that occurs when opening an existing DB:

If I use the Get() method to retrieve an element by key from the storage it isn't found, but if I use the Find() one passing a query filtering by key, the same element is now found.

The same happens also for the Update/UpdateMatching methods, the first returns an ErrNotFound while the second correctly updates the element which is indeed stored in the DB.

This doesn't happen at all the very first time when the DB is created instead of opened.

Performance considerations and optimizing a structure

I've been passively working with BadgerHold and I really like the simplicity of it. I've managed to migrate a MySQL data structure to a simplified version. My structure looks similar to this now:

// Document represents the main document structure
type Document struct {
	ID        uint64 `badgerhold:"key"`
	Type      string `badgerholdIndex:"Type"`
	Title     string
	Content   string
	Excerpt   string
	Reference string
	Status    string
	Created   time.Time
	Updated   time.Time
}
// Property represents meta value for a document
type Property struct {
	Document uint64 `badgerholdIndex:"Document"`
	Key      string `badgerholdIndex:"Key"`
	Value    interface{}
}

I have about 1100 Document types and about 15,000 Property types in Badger. There are no subfields and not much complexity in general. The Badger file size is 273mb. When I run the following query I'm getting what I believe may not be optimal performance:

speakers := []Document{}
store.Find(&speakers, badgerhold.Where(badgerhold.Key).Eq(uint64(1128)))

I can run that query in 31.644858ms. An equivalent run on a similar size RDBS (same exact dataset) would run in 0.0007s (considerable difference). I'm wondering if I'm not using this is the best way possible, perhaps I'm making a mistake or there could be another performance blocker. I would hope that Badgerhold would at least perform as fast or faster than an SQLite implementation? I may be wrong with that expectation.

Mind providing some insight? I'd be willing to share my dataset privately if you'd like to take a peek out of curiosity.

Feature Request: Add TTL support to keys

Have you considered adding support for setting TTL values on keys? Would be nice to have this. Maybe via something like:

func (s *Store) InsertTTL(key, data interface{}, ttl time.Duration) {...}
func (s *Store) UpdateTTL(key, data interface{}, ttl time.Duration) {...}
func (s *Store) UpsertTTL(key, data interface{}, ttl time.Duration) {...}

Ref: https://github.com/dgraph-io/badger#setting-time-to-livettl-and-user-metadata-on-keys

Add LoadBy method

Since there is support for unique indices, I am missing a method to load record by this unique field. The find method uses slice which requires more lines of code to return single result.

migrations

v2 looks awesome.

Really like the DSL and aggregations.

One thing that is blocking is Migrations. Does anyone have any thoughts on a strategy for this ?

Implement Subqueries in Update and Delete transactions

Right now running subqueries via a MatchFunc call isn't supported in Update or Delete queries. This is due to the fact that Badger only allows one iterator on Read/Write transactions.

I have a couple options to allow this:

Start a new transaction for subqueries: I don't like this, because it breaks the isolation level. You'd be reading the previously committed version of records and using that information to update or delete data inside of a transaction. Not really safe IMO.
When a new subquery starts, store the current iterators location, run the subquery, and seek back to that location when the subquery finishes.

1 is obviously easier, but not really an option in my opinion. Currently I'm simple throwing an error if a subquery is attempted in a RW transaction.

Getting slow

Hi,

When I start fresh with a new db.Dir I get:
INFO: Replay took: 67.932µs

But after inserting/deleting 1 million structs even though file stays small, I'm getting
INFO: Replay took: 16.604332ms -- and it keeps getting worse.

Snippet:

	sID := ksuid.New().String()
	err := db.Store.DeleteMatching(&Session{},
		badgerhold.Where("Origin").Eq(origin).And("User.Email").Eq(u.Email))

	if err != nil {
		return err
	}

	err = db.Store.Insert(sID, &Session{
		Origin: origin,
		Since:  time.Now(),
		User:   u,
	})

	if err != nil {
		return err
	}

go version go1.13.4 linux/amd64
badger: 8097259
badgerhold: 723db32

Worth noting here is that I have a mutex that locks at the beginning of func call, so It's not concurrent.

Am I doing something wrong here?

Thanks in advance

StartsWith operator

This is fantastic Tim, let me be the first to congratulate you!
I'd love to make some contributions too if you'll consider them.
The first thing that comes to mind are StartsWith and EndsWith operators for string fields.
I think they would be useful for multi-segmented keys and indexes:

store.Find(badgerhold.Where(badgerhold.Key).StartsWith("en-AU:")

I know we can probably already do this using Regex or MatchFunc but it looks more readable with the above syntax.

FYI: badger future

It appears that the people behind badger (dgraph) are having issues as summarized here… :(

Question on default sorting order

when i do

q := &badgerhold.Query{}
q.SortBy("CreatedAt")

And CreatedAt is unix timestamp, i need to know the order in which badgerhold sorts the data based on that field.

Thanks in advance!

Custom index name error

When a struct has an index defined on a field whose name does not match the field name, the specified index name is not respected:

type Person struct {
	Name string
	Division string `badgerholdIndex:"IdxDivision"`
}

In the above struct, the query

badgerhold.Where("Division").Eq("sales").Index("IdxDivision")

fails with the error The index IdxDivision does not exist, while the query

badgerhold.Where("Division").Eq("sales").Index("Division")

is executed successfully.

This behaviour is observed against tag v1.0.0 (commit hash 8addb9d), but the same should be observed in v3.0.0 (2b6764d) tag as well.

Ne(int) bad; Ne(uint) good

In bolthold, I used the following construction a few times to loop through all the records in the database :

_ = cfg.badger.ForEach(bh.Where(bh.Key).Ne(0), func(got *Whatever) error {
		got.Whatever()
		return nil
	})

Now, when I moved this over to badgerhold, it no longer worked (as in: no errors thrown and no records retrieved). The solution I discovered was that .Ne(0) needed to be replaced with .Ne(uint(0))... so it seems that badgerhold, unlike bolthold, is a tad sensitive to int/uint?

Skip() and Limit() without Where clause

Is there a way to get all records of a type and specifying a Skip and a Limit on the query, without having to specify a Where clause?

Or maybe it is possible to add an All() query which keeps all records elligible for retrieval?

Thanks in advance.

Unnecessary usage of iterator

When using index query with operator eq or in it would be better to use simple badger.Get to get value of that index rather than using iterator which iterate over every index and decode value of KeyList.

	type TestStruct struct {
		Value string
		Index int `badgerhold:"index"`
	}

	for i := 0; i < 10_000; i++ {
		testStruct := TestStruct{
			Value: strconv.Itoa(i),
			Index: i,
		}
		err := s.storage.database.Badger.Insert(i, &testStruct)
		s.NoError(err)
	}

	results := make([]TestStruct, 0, 1)
	err := s.storage.database.Badger.Find(
		&results,
		bh.Where("Index").Eq(5000).Index("Index"),
	)
	s.NoError(err)
	s.Len(results, 1)
	s.Equal(results[0].Index, 5000)

In above test badgerhold.Store iterates over all records and calls Decode function 10k times.
It should be just badger.Get call for badgerhold.KeyList and one get for each value in that struct.

No ID Returned in Get

I'm not sure if I'm missing something, but it seems like .Get doesn't return the ID within the struct that gets returned. I tested the same with BoltHold, which produced exactly what I was expecting, so I suspect that this is a bug with BadgerHold. If you run the linked code example, you'll get the following response:

Testing badger...
{[email protected] 0}
{[email protected] 1}
[{[email protected] 1}]
{[email protected] 0} < Should be 1
Testing bolt...
{[email protected] 1}
{[email protected] 2}
[{[email protected] 2}]
{[email protected] 2}

Badger example:

type User struct {
	Email    string `badgerhold:"index"`
	ID       uint64 `badgerhold:"key"`
}

user := User{
    Email:    "[email protected]",
}

if err := s.Insert(badgerhold.NextSequence(), &user); err != nil {
    log.Fatal(err)
}

fmt.Println(user)
// OUTPUT: &{[email protected] 0} CORRECT

user2 := User{
    Email:    "[email protected]",
}

if err := s.Insert(badgerhold.NextSequence(), &user2); err != nil {
    log.Fatal(err)
}

fmt.Println(user2)
// OUTPUT: &{[email protected] 1} CORRECT

var userResults []User

if err := s.Find(&userResults, badgerhold.Where("Email").Eq("[email protected]").Index("Email")); err != nil {
    log.Fatal(err)
}

fmt.Println(userResults)
// OUTPUT: [{[email protected] 1}] CORRECT

var userReturned User

if err := s.Get(uint64(1), &userReturned); err != nil {
    log.Fatal(err)
}

fmt.Println(userReturned)
// OUTPUT: &{[email protected] 0} INCORRECT
// EXPECTED: &{[email protected] 1}

Ability to ignore struct properties

It would be super helpful to be able to ignore properties with a tag similar to how json/yaml/glob etc do it.

type Example struct {
    Name string
    RuntimeInfo RuntimeInfo `badgerhold:"-"`
}

Unique constraints

@timshannon,

I enjoyed reading your article comparing boltdb-vs-badger. I've been flopping back and forth between the two of them, balancing the pros and cons of each.

That being said, I've been playing around with badgerhold. I wanted to know if you support anything like Storm, unique constraints on a field within a struct?

type User struct {
  ThePrimaryKey string `storm:"id"`// primary key
  Group string `storm:"index"` // this field will be indexed
  Email string `storm:"unique"` // this field will be indexed with a unique constraint
  Name string // this field will not be indexed
}

If not, is this something you'd consider? Doesn't seem like bolthold supports that use case either, but I've only skimmed through it and haven't used it extensively.

Thanks again for all your contributions, code and research.

Support bolthold-like Contains / ContainsAny / ContainsAll ?

I'm moving another bit of code that I wrote to use badgerhold instead of bolthold... and I'm a bit stuck because it looks like badgerhold does not implement the Contains / ContainsAny / ContainsAll queries.... can these be moved over?

Custom Storer interface inconsistent value type

When implementing custom Storer for a type, the IndexFunc receives a value of the provided type with an extra pointer indirection (**Local instead of *Local), which seems to be relevant to issue timshannon/bolthold#115.

This happens when building against 8addb9d (v1.0.0)

package main

import (
	//"fmt"
	"io/ioutil"
	"os"
	"testing"

	"github.com/timshannon/badgerhold"
)

type Local struct {
	Reference string
}

func (l *Local) Type() string {
	return "Local"
}

func (l *Local) Indexes() map[string]badgerhold.Index {
	refIdxFunc := func(name string, value interface{}) ([]byte, error) {
		/*
		if loc, ok := value.(*Local); ok {
			return []byte(loc.Reference), nil
		}
		return nil, fmt.Errorf("IndexFunc: expected *Local, got %T", value)
		*/
		ref := value.(*Local).Reference
		return []byte(ref), nil
	}

	return map[string]badgerhold.Index{
		"Reference": badgerhold.Index{
			IndexFunc: refIdxFunc,
			Unique: true,
		},
	}
}

func TestStorerDelete(t *testing.T) {
	path, _ := ioutil.TempDir(os.TempDir(), "store-*")
	os.Remove(path)

	opts := badgerhold.DefaultOptions
	opts.Dir, opts.ValueDir = path, path

	store, err := badgerhold.Open(opts)
	if err != nil {
		t.Fatalf("Failed to open store: %v", err)
	}
	defer store.Close()

	local := Local{
		Reference: "a unique string",
	}
	err = store.Insert(1, &local)
	if err != nil {
		t.Fatalf("Unable to insert: %v", err)
	}

	err = store.Delete(1, &local)
	if err != nil {
		t.Fatalf("Unable to delete: %v", err)
	}

	err = store.Insert(32, &local)
	if err != nil {
		t.Fatalf("Unable to reinsert: %v", err)
	}
}