The Migration Guide covers this very issue with an example:
LUCENE-2380: FieldCache.GetStrings/Index --> FieldCache.GetDocTerms/Index
- The field values returned when sorting by SortField.STRING are now BytesRef. You can call value.Utf8ToString() to convert back to string, if necessary.
- In FieldCache, GetStrings (returning string[]) has been replaced with GetTerms (returning a BinaryDocValues instance). BinaryDocValues provides a Get method, taking a docID and a BytesRef to fill (which must not be null), and it fills it in with the reference to the bytes for that term.

  If you had code like this before:

      string[] values = FieldCache.DEFAULT.GetStrings(reader, field);
      ...
      string aValue = values[docID];

  you can do this instead:

      BinaryDocValues values = FieldCache.DEFAULT.GetTerms(reader, field);
      ...
      BytesRef term = new BytesRef();
      values.Get(docID, term);
      string aValue = term.Utf8ToString();

  Note however that it can be costly to convert to string, so it's better to work directly with the BytesRef.
- Similarly, in FieldCache, GetStringIndex (returning a StringIndex instance, with direct arrays int[] order and string[] lookup) has been replaced with GetTermsIndex (returning a SortedDocValues instance). SortedDocValues provides the GetOrd(int docID) method to look up the int order for a document, LookupOrd(int ord, BytesRef result) to look up the term from a given order, and the sugar method Get(int docID, BytesRef result), which internally calls GetOrd and then LookupOrd.

  If you had code like this before:

      StringIndex idx = FieldCache.DEFAULT.GetStringIndex(reader, field);
      ...
      int ord = idx.order[docID];
      string aValue = idx.lookup[ord];

  you can do this instead:

      SortedDocValues idx = FieldCache.DEFAULT.GetTermsIndex(reader, field);
      ...
      int ord = idx.GetOrd(docID);
      BytesRef term = new BytesRef();
      idx.LookupOrd(ord, term);
      string aValue = term.Utf8ToString();

  Note however that it can be costly to convert to string, so it's better to work directly with the BytesRef.

  SortedDocValues also has a GetTermsEnum() method, which returns an iterator (TermsEnum) over the term values in the index (i.e., iterates ord = 0..NumOrd-1).
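To make the ord/lookup relationship concrete, here is a minimal, self-contained sketch in plain C#. ToySortedValues is a hypothetical stand-in written for illustration, not a Lucene.NET type. It assumes every document has exactly one value, and it returns a byte[] where the real SortedDocValues fills a BytesRef.

```csharp
using System;
using System.Linq;
using System.Text;

// Hypothetical model of the ord/lookup idea; NOT a Lucene.NET class.
public sealed class ToySortedValues
{
    private readonly byte[][] termsByOrd; // distinct terms in sorted order, stored as UTF-8 bytes
    private readonly int[] ordByDoc;      // one ord per document

    public ToySortedValues(string[] valuePerDoc)
    {
        // Sorted by UTF-16 code unit here for simplicity;
        // Lucene orders terms by their bytes.
        string[] distinct = valuePerDoc
            .Distinct()
            .OrderBy(v => v, StringComparer.Ordinal)
            .ToArray();

        termsByOrd = distinct.Select(v => Encoding.UTF8.GetBytes(v)).ToArray();

        // Each document stores only the position of its term in the sorted list.
        ordByDoc = valuePerDoc
            .Select(v => Array.BinarySearch(distinct, v, StringComparer.Ordinal))
            .ToArray();
    }

    // Number of distinct terms; valid ords are 0..NumOrd-1.
    public int NumOrd => termsByOrd.Length;

    // docID -> ord
    public int GetOrd(int docID) => ordByDoc[docID];

    // ord -> term bytes
    public byte[] LookupOrd(int ord) => termsByOrd[ord];

    // Sugar: GetOrd followed by LookupOrd.
    public byte[] Get(int docID) => LookupOrd(GetOrd(docID));
}
```

In this model, ords are assigned in sorted order, so comparing two ords gives the same result as comparing the two terms. Code that only needs relative order can skip the lookup, and the string conversion, entirely.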
Furthermore, if you drill down into the issue LUCENE-2380, there is an explanation for the change: it was done primarily for performance reasons. The field cache no longer stores a string[]; the underlying data is now a byte[], so an extra step is required to decode it into a string.
Do note that you are meant to reuse the BytesRef instance that you pass in, which gives better performance.
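As a rough illustration of why reuse helps, the sketch below models the idea in plain C# without any Lucene.NET dependency. ToyBytesRef, ToyPackedTerms and ToyScan are hypothetical names invented for this example, and the packed layout is an assumption made for illustration, not a description of the actual field cache internals.

```csharp
using System;
using System.Text;

// Hypothetical stand-in for BytesRef: a view (offset + length) into a shared byte[].
public sealed class ToyBytesRef
{
    public byte[] Bytes = Array.Empty<byte>();
    public int Offset;
    public int Length;

    // The "extra step": decoding allocates a new string on every call.
    public string Utf8ToString() => Encoding.UTF8.GetString(Bytes, Offset, Length);

    // Compares the referenced bytes directly, without decoding.
    public bool BytesEquals(byte[] other)
    {
        ReadOnlySpan<byte> mine = new ReadOnlySpan<byte>(Bytes, Offset, Length);
        ReadOnlySpan<byte> theirs = other;
        return mine.SequenceEqual(theirs);
    }
}

// Hypothetical stand-in for cached field data: all terms packed into one byte[].
public sealed class ToyPackedTerms
{
    private readonly byte[] data;
    private readonly int[] starts; // starts[i]..starts[i + 1] delimits the term of doc i

    public ToyPackedTerms(string[] valuePerDoc)
    {
        starts = new int[valuePerDoc.Length + 1];
        for (int i = 0; i < valuePerDoc.Length; i++)
            starts[i + 1] = starts[i] + Encoding.UTF8.GetByteCount(valuePerDoc[i]);

        data = new byte[starts[valuePerDoc.Length]];
        for (int i = 0; i < valuePerDoc.Length; i++)
            Encoding.UTF8.GetBytes(valuePerDoc[i], 0, valuePerDoc[i].Length, data, starts[i]);
    }

    public int DocCount => starts.Length - 1;

    // Points 'result' at the term's bytes; nothing is copied or allocated.
    public void Get(int docID, ToyBytesRef result)
    {
        result.Bytes = data;
        result.Offset = starts[docID];
        result.Length = starts[docID + 1] - starts[docID];
    }
}

public static class ToyScan
{
    // Counts documents whose term equals 'wanted', reusing one scratch reference.
    public static int CountMatches(ToyPackedTerms values, string wanted)
    {
        byte[] target = Encoding.UTF8.GetBytes(wanted); // encode the query once
        var scratch = new ToyBytesRef();                // allocated once, reused per doc
        int count = 0;
        for (int docID = 0; docID < values.DocCount; docID++)
        {
            values.Get(docID, scratch);
            if (scratch.BytesEquals(target)) count++;
        }
        return count;
    }
}
```

The loop allocates one scratch object and no strings, whereas calling Utf8ToString() for every document would create a new string each time.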