NightOwl888 avatar NightOwl888 commented on June 13, 2024

The Migration Guide covers this very issue with an example:

LUCENE-2380: FieldCache.GetStrings/Index --> FieldCache.GetDocTerms/Index

  • The field values returned when sorting by SortField.STRING are now
    BytesRef. You can call value.Utf8ToString() to convert back to
    string, if necessary.

  • In FieldCache, GetStrings (returning string[]) has been replaced
    with GetTerms (returning a BinaryDocValues instance).
    BinaryDocValues provides a Get method, taking a docID and a BytesRef
    to fill (which must not be null), and it fills it in with the
    reference to the bytes for that term.

    If you had code like this before:

    string[] values = FieldCache.DEFAULT.GetStrings(reader, field);
    ...
    string aValue = values[docID];

    you can do this instead:

    BinaryDocValues values = FieldCache.DEFAULT.GetTerms(reader, field);
    ...
    BytesRef term = new BytesRef();
    values.Get(docID, term);
    string aValue = term.Utf8ToString();

    Note however that it can be costly to convert to String, so it's better to work directly with the BytesRef.

  • Similarly, in FieldCache, GetStringIndex (returning a StringIndex
    instance, with direct arrays int[] order and String[] lookup) has
    been replaced with GetTermsIndex (returning a
    SortedDocValues instance). SortedDocValues provides the
    GetOrd(int docID) method to look up the int order for a document,
    LookupOrd(int ord, BytesRef result) to look up the term from a given
    order, and the sugar method Get(int docID, BytesRef result)
    which internally calls GetOrd and then LookupOrd.

    If you had code like this before:

    StringIndex idx = FieldCache.DEFAULT.GetStringIndex(reader, field);
    ...
    int ord = idx.order[docID];
    String aValue = idx.lookup[ord];

    you can do this instead:

    SortedDocValues idx = FieldCache.DEFAULT.GetTermsIndex(reader, field);
    ...
    int ord = idx.GetOrd(docID);
    BytesRef term = new BytesRef();
    idx.LookupOrd(ord, term);
    string aValue = term.Utf8ToString();

    Note however that it can be costly to convert to String, so it's better to work directly with the BytesRef.

    SortedDocValues also has a GetTermsEnum() method, which returns an iterator (TermsEnum) over the term values in the index (i.e., iterates ord = 0..NumOrd-1).

Furthermore, if you drill down into the issue LUCENE-2380, there is an explanation for the change: it was done primarily for performance reasons. The field cache no longer stores a string[]. The underlying data is now a byte[], so extra steps are required to get a UTF-8 string.

Do note that you are meant to reuse the BytesRef instance that is passed in, rather than allocating a new one for each lookup, to get better performance.
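As a rough illustration of both points (reusing one BytesRef and staying in bytes), here is a minimal sketch. It assumes you already have a BinaryDocValues instance (for example from FieldCache.DEFAULT.GetTerms as in the guide above) and the reader's MaxDoc. The TermCounter class, the CountMatches method and its parameter names are hypothetical, made up for this example. It is not code from the migration guide.

```csharp
using Lucene.Net.Index;
using Lucene.Net.Util;

public static class TermCounter
{
    // Hypothetical helper: counts the documents whose field value equals
    // `wanted`, comparing raw bytes instead of decoding every value to a string.
    public static int CountMatches(BinaryDocValues values, int maxDoc, string wanted)
    {
        BytesRef target = new BytesRef(wanted); // encode the search value to UTF-8 once
        BytesRef scratch = new BytesRef();      // single instance, refilled for each document
        int count = 0;

        for (int docID = 0; docID < maxDoc; docID++)
        {
            values.Get(docID, scratch);         // fills scratch, no new BytesRef allocated
            if (scratch.BytesEquals(target))    // byte-wise comparison, no Utf8ToString()
            {
                count++;
            }
        }

        return count;
    }
}
```

Only call Utf8ToString() at the point where you actually need a string, such as when displaying a value to the user.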

from lucenenet.
