Comments (4)
So, this is something we don't want to do. The reason for this is that the DocumentsWriter in Lucene 4.8.0 writes segments concurrently, not sequentially. However, we are getting test failures (I don't recall which tests) when attempting to do the same in .NET, possibly due to a missing lock or very subtle locking behavior in Java that doesn't work with the same syntax in .NET. 00d3942 is interesting and may help to address the problem, although we almost always strictly follow the way the tests are written in Java unless there is a good reason to change the test (and there may be here).
963e10c is the hack that we put in place to make it run sequentially for the time being, but our intention is to fix the bug rather than change the API like this which would render it unfixable.
That being said, nobody is currently working on trying to get the concurrent document writing to function and it is considered low priority since it can most likely be addressed without any breaking API change after the release. However, you seem to have a knack for this, so you are welcome to attempt to roll back those changes and work on fixing the concurrency bug.
Do note that DocumentsWriter is in an inconsistent state somewhere between Lucene 4.8.0 and 4.8.1, which may be contributing to the issue. So it may require upgrading to 4.8.1 in order to properly patch the bug. I ran a diff some time ago and there are fewer than 100 files that have changes between the two versions (and several of the modules were ported from 4.8.1, so there are fewer changes to deal with than that).
from lucenenet.
Regarding 00d3942: after fixing all the tests, this one failed because the Java version supports volatile long but .NET doesn't. It is important to note that the returned long can be ignored. I am also currently working on the NRT feature in 02ed5d3; kindly take a look.
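To make the volatile long point concrete: in Java, reads and writes of a volatile long field are atomic and visible across threads, and AtomicLong gives the same guarantees plus atomic read-modify-write operations. C# does not allow the volatile modifier on long fields, so a port has to route such a field through some atomic wrapper or interlocked operation instead. The following is only a minimal Java sketch of the two shapes; the class and method names (VolatileLongSketch, VolatileHolder, AtomicHolder, runIncrements) are hypothetical and are not taken from Lucene or from the commit above.

```java
import java.util.concurrent.atomic.AtomicLong;

public class VolatileLongSketch {

    // Hypothetical holder (not a Lucene class): plain volatile long field.
    // Java guarantees atomic, visible reads and writes of this field.
    static final class VolatileHolder {
        private volatile long value;

        long get() { return value; }

        void set(long newValue) { value = newValue; }
    }

    // Hypothetical holder: the AtomicLong shape, which also supports
    // atomic increments and is the easier shape to mirror in a port.
    static final class AtomicHolder {
        private final AtomicLong value = new AtomicLong();

        long get() { return value.get(); }

        void set(long newValue) { value.set(newValue); }

        long increment() { return value.incrementAndGet(); }
    }

    // Starts `threads` workers that each increment a shared counter
    // `perThread` times, then returns the final count.
    static long runIncrements(int threads, int perThread) throws InterruptedException {
        final AtomicHolder holder = new AtomicHolder();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    holder.increment();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        return holder.get();
    }

    public static void main(String[] args) throws InterruptedException {
        VolatileHolder holder = new VolatileHolder();
        holder.set(1L << 40); // a value that needs more than 32 bits
        System.out.println(holder.get());
        System.out.println(runIncrements(4, 10_000));
    }
}
```

The volatile form only covers plain reads and writes; anything like an increment needs the atomic form, which is why the sketch uses AtomicHolder for the multi-threaded count.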
A small rant: Lucene.NET 4.8 is based on a 10-year-old release. I am eagerly looking forward to seeing more improvements in the project. Vector search was recently added to Azure Search, but they never leveraged Lucene.NET.
from lucenenet.
@NightOwl888 Can you please provide more details about the concurrency issue? This will help me understand the problem better and work on finding a solution.
from lucenenet.
@Jeevananthan-23 - 963e10c points to #325 where the original error report is. I followed up with another stack trace on the same failing test.
I have reverted the relevant changes from 963e10c in this branch: https://github.com/NightOwl888/lucenenet/tree/fix/documentswriter-concurrency. I ran the tests 30 times on Azure DevOps and ran the TestMultiThreadedSnapshotting test locally 30,000 times and couldn't get a failure. That is the good news. The bad news is that another test, TestRollingUpdates.TestUpdateSameDoc, fails, but very rarely. I got it to fail locally on both .NET 5.0 and .NET 6.0, but not on .NET 7.0.
So, we cannot merge the patch until we have a fix for the failing test. I am attaching the log from the test failure. I got it to fail on net5.0 on Windows (the original failure was on Linux). I used the [Repeat(1000)] attribute on the test, and it failed after about 3 runs.
I also used the assembly attributes as specified in the test failure. This ensures the same random components are plugged into the test during each run, which may help narrow down which component is faulty. On the other hand, these may have nothing to do with the exception at all - it is hard to determine this when the failure happens so rarely. Do note we have our own random class so these will work consistently across target frameworks and operating systems.
[assembly: Lucene.Net.Util.RandomSeed("0xe6dee1082501680d")]
[assembly: NUnit.Framework.SetCulture("sat-Olck")]
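As an illustration of why pinning the seed helps, here is a minimal Java sketch of seed-driven component selection: the same seed always yields the same sequence of picks, so a rare failure can be replayed against the same configuration. It uses java.util.Random as a stand-in, not the project's own random class, and the component names and the SeededSelectionSketch, parseSeed, and pickComponents names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

public class SeededSelectionSketch {

    // Parses a hex seed such as "0xe6dee1082501680d" into a 64-bit value.
    static long parseSeed(String hex) {
        String digits = hex.startsWith("0x") ? hex.substring(2) : hex;
        return Long.parseUnsignedLong(digits, 16);
    }

    // Picks `count` components from `candidates`, driven only by the seed.
    // java.util.Random is an assumption here; it stands in for whatever
    // random source the real test framework uses.
    static List<String> pickComponents(long seed, List<String> candidates, int count) {
        Random random = new Random(seed);
        List<String> picks = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            picks.add(candidates.get(random.nextInt(candidates.size())));
        }
        return picks;
    }

    public static void main(String[] args) {
        // Hypothetical component names, for illustration only.
        List<String> candidates = Arrays.asList("ComponentA", "ComponentB", "ComponentC");
        long seed = parseSeed("0xe6dee1082501680d");

        List<String> firstRun = pickComponents(seed, candidates, 5);
        List<String> secondRun = pickComponents(seed, candidates, 5);

        // Both runs use the same seed, so the selections match.
        System.out.println(firstRun.equals(secondRun));
    }
}
```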
TestUpdateSameDoc-638362856625826917.zip
If you could pull down the branch to investigate why the test is failing, that would be great.
TestTargetFramework.props is where the target framework for the tests can be specified.
from lucenenet.