Comments (2)
The driver connection is now thread-safe and can now be used in multi-threaded server processing environments like ASP.NET
or Message Queue systems. C# asynchronicity async
/await
keywords are now supported in run()
and run helpers.
Internal Design Notes
Previously, the C# and Java driver followed the same ReadResponse
semantic calling pattern. WriteQuery
followed by a ReadResponse
. The cursor implementation is a bit more complicated, ReadResponse
pass through ConnectionInstance.ReadResponse
. When a cursor sequence is detected, it is added to the cursorCache
; an instance of Dictionary<token,Cursor>
where token
is the query token of type long
.
cursorCache
was changed to be aConcurrentDictionary<token,Cursor>
for thread-safety. So, potentially, many threads can add themselves to thecursorCache
.
Next, starting with SocketWrapper
, WriteQuery
was changed prevent multiple threads writing to the network stream at the same time.
lock( writeLock )
{ // Everyone can write their query as fast as they can; block if needed.
try
{
this.bw.Write(token);
var jsonBytes = Encoding.UTF8.GetBytes(json);
this.bw.Write(jsonBytes.Length);
this.bw.Write(jsonBytes);
}
catch
{
Log.Trace($"Write Query failed for Token {token}.");
}
}
Without the writeLock
, multiple threads could write to the network stream at different times while writing a query to the network stream. writeLock
is an object that's created on the CLR heap. The CLR run-time stores objects in the heap in the following raw memory format: [syncblock][handle][object]
. The syncblock is part of an object header used by synchronization primitives and the CLR to determine whether or not an object is locked by a thread. The syncblock has two formats, thinlock and indexed. Thinlocks are super fast (locking info is already preset in the [syncblock]
object header); whereas indexed require an extra lookup into the CLR syncblock table to determine locking information for an object. There is no way to explicitly define what type of "lock" gets placed with an object that goes on the CLR heap because it is entirely managed by CLR run-time. After experimenting I found the CLR is indeed giving us a thinlock:
Setting a breakpoint inside the lock:
lock( writeLock )
{ // Everyone can write their query as fast as they can; block if needed.
try
{
Debugger.Break();
this.bw.Write(token);
var jsonBytes = Encoding.UTF8.GetBytes(json);
this.bw.Write(jsonBytes.Length);
this.bw.Write(jsonBytes);
}
Run a query, break, then loading up the windows debugger on the process ID. Dump the heap for SocketWrapper
objects:
ntsd -p 1644
0:012> .loadby sos mscorwks
0:012> .load sosex
0:019> !DumpHeap -type RethinkDb.Driver.Net.SocketWrapper
Address MT Size
02a0d21c 0706d744 52
02a12430 0706f61c 16
Statistics:
MT Count TotalSize Class Name
0706f61c 1 16 RethinkDb.Driver.Net.SocketWrapper+<>c__DisplayClass11_0
0706d744 1 52 RethinkDb.Driver.Net.SocketWrapper
Total 2 objects
The object we're intrested in is 52 bytes at address 02a0d21c
:
0:019> !do 02a0d21c
Name: RethinkDb.Driver.Net.SocketWrapper
MethodTable: 0706d744
EEClass: 06f17184
Size: 52(0x34) bytes
Fields:
MT Field Offset Type VT Attr Value Name
0706e664 4000035 4 ...Sockets.TcpClient 0 instance 02a0d740 socketChannel
04e57e8c 4000036 28 System.TimeSpan 1 instance 02a0d244 timeout
04e51d7c 4000037 8 System.String 0 instance 02a0c0a0 hostname
04e53cc4 4000038 24 System.Int32 1 instance 28015 port
0706da24 4000039 c ...ets.NetworkStream 0 instance 02a0f124 ns
04e4c41c 400003a 10 ...m.IO.BinaryWriter 0 instance 02a0f14c bw
04e65770 400003b 14 ...m.IO.BinaryReader 0 instance 02a0f1ec br
04e4966c 400003c 18 ...lationTokenSource 0 instance 02a12408 pump
0706df74 400003d 1c ...ethinkDb.Driver]] 0 instance 02a0d250 awaiters
04e5211c 400003e 20 System.Object 0 instance 02a0d734 writeLock
The SocketWrapper
has reference to the writeLock
object and is on the heap at address 02a0d734
. So, let's look at the hidden object header by going back 0x4
bytes from the object pointer starting address:
0:019> dd 02a0d734-0x4 l1
02a0d730 0000000a
Indeed, we see that the object header is a _thin-lock_. AWESOME. Thin-locks have the upper byte of the object header as 0x00
and an indexed lock would have the upper byte 0x08
... the object header has the value of 0xA
or 10 dec
which is the managed thread ID that is holding the current lock.
0:019> !threads
ThreadCount: 12
UnstartedThread: 0
BackgroundThread: 10
PendingThread: 0
DeadThread: 0
Hosted Runtime: no
ID OSID ThreadOBJ
0 1 2660 0068dce8
2 2 2c80 0069b3d0
7 3 2534 06521b20
8 4 25a4 06523d78
10 5 270c 065247a8
11 6 2db8 06528de8
12 7 728 06566a30
13 8 155c 0656f750
14 9 8a0 065d1540
15 10 2eec 065e9020
17 11 20b8 065fae68
18 12 5ec 065fb688
So, the managed thread ID 10 is actually thread 15 inside the operating system. Switch our thread context and double check we're inside the lock:
0:010> ~15s
0:015> !ClrStack
OS Thread Id: 0x2eec (15)
Child SP IP Call Site
083fde14 0546400f System.Diagnostics.Debugger.Break()
083fde3c 082fa41b RethinkDb.Driver.Net.SocketWrapper.WriteQuery
Indeed, we are the thread with the writeLock
. Awesome.
Sanity check by checking the syncblock table:
0:015> !syncblk
Index SyncBlock MonitorHeld Recursion
-----------------------------
Total 18
CCW 0
RCW 0
ComClassFactory 0
Free 0
Nothing is indexed in the syncblock table, so we're going as fast as we can with this write-lock. 👌 While it might not always be the case that we are using _thin-lock_s because locks are handled by the CLR, we have some good empirical evidence that we're in fact using a _thin-lock_s for this workload.
So, now reading responses
_See addendum design note below for current implementation_. The following block quote was from previous proposal idea:
... just before a thread writes their query to the network, the query token is assigned an awaiter. A dedicated thread from the thread pool is used to pull messages off the network. The dedicated thread loops inside
SocketWrapper.ResponsePump
reads any response, then looks up the awaiter by the response token ID. Then pushes the result to the awaiting thread, and everything continues as normally.I would have preferred to have both of these interactions happen in one
SendQuery
(send & wait on a returned awaiter), but it would have had too much of an impact in high-level construct objects likeCursor
that does things likeMaybeSendContinue
without really awaiting on a returned value from the call. So the current implementation still keeps same semantic calling patterns as the Java driver, but also requires us to make sure that we come back and grab our taskAwaitResponseAsync
(replacingReadResponse
) after we have written the query. In other words, do not write internal driver code to write a query to the network and not come back for the associated awaiting task. If you need toWriteQuery
which allows "fire and forget" types of writes that don't expect any token associated response pass inWriteQuery(assignAwaiter:false)
.
🚀
from rethinkdb.driver.
Addendum to this design note:
I decided to implement SendQuery
in SocketWrapper
and avoid the Write then ReadResponse semantic calling pattern. SendQuery
returns a future awaiter Response
value. Previously, the threading architecture was getting too complicated orchestrating threads to follow a two step process (write then come back later to await/read response). For example, the two step process also included a very obscure GC memory leak in a Cursor
because Cursor sent MaybeSendContinue
and Cursor STOP
, both, which did not process their awaiters. The leak would occur because it was the awaiter responsibility to remove themselves from the awaiter's dictionary. The main ResponsePump
thread would set the awaiter's Response
result but not remove the awaiter from the dictionary of awaiters. Since the zombie awaiter (from a Cursor MaybeSendContinue
or STOP
) did not finish processing the awaited response, the zombie awaiter would be sitting in the awaiters dictionary forever.
Now the responsibility of removal of an awaiter from the awatier's dictionary is on the ResponsePump
thread. Soon as a response is read from the network, the associated awaiter is removed off the awaiter dictionary (avoiding the memory leak). Additionally, SendQuery
always returns an awaiter (Response
future). Callers to SendQuery
can specify awaitResponse=false
for truly fire-and-forget queries. If the caller thread has awaitResponse=false
indicating they are not interested in a response, they still get an instantly completed awaiter of Response
null. If you don't want the awaiter, fine, disregard it; otherwise, the response you're looking for pushed inside the returned awaiter at some future moment in time. This added some additional responsibility to a Cursor
. Cursors now need to keep track of the current CONTINUE
awaiter in progress to read the CONTINUE
ed response. It's not too bad. Note: The outstandingRequests == 0
check in MaybeSendContinue
guards us from sending too many continues (and consequently losing track of subsequent CONTINUE
ed awaiters).
Overall, SendQuery
in SocketWrapper
has much better calling semantics. Makes everything less complicated and simplified a lot of code making it easier to understand.
from rethinkdb.driver.
Related Issues (20)
- Create a method to return result as raw json HOT 4
- Resolution function on OptArg insert conflict HOT 1
- ConnectionPool.Builder.ConnectAsync does not respect InitialTimeout() HOT 4
- Add Error codes to Exceptions HOT 2
- Additional exceptions thrown when trying to cancel Cursor.MoveNextAsync() HOT 6
- Object's primary key and key of RunAtom() is different (Client-side generation). HOT 4
- Support for System.Text.Json.JsonDocument
- Cursor<T> should implement IAsyncEnumerable<T> in .net core 3
- RethinkDB 2.4 Release
- Best options: RunResult, RunAtom or RunCursor HOT 3
- Get list from Table that contains 'A' in the 'message' field HOT 1
- ASL 2.0 HOT 1
- Implement lambda functions as a possible argument in Insert().OptArgs("conflict",...) HOT 1
- System.NullReferenceException HOT 1
- Add EntityFramework support to Linq provider HOT 5
- Polymorphism HOT 1
- Various exceptions when updating a document HOT 1
- System.Configuration.ConfigurationErrorsException: HOT 1
- Wrong OptArg serialisation after OrderBy
- Insert not working with RunNoReply & Can't use runOpts is always null -- solved
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from rethinkdb.driver.