orleanscontrib / orleans.syncwork

This package's intention is to expose an abstract base class to allow https://github.com/dotnet/orleans/ to work with long running CPU bound synchronous work, without becoming overloaded.

Home Page: https://OrleansContrib.github.io/Orleans.SyncWork

License: MIT License

C# 100.00%
orleans orleans-grains orleans-example actor-model csharp dotnet-core hacktoberfest

orleans.syncwork's Introduction

This package's intention is to expose an abstract base class to allow Orleans to work with long running, CPU bound, synchronous work, without becoming overloaded.

Built with an open source license, thanks Jetbrains!

Building

The project was built primarily with .net3 in mind, though the major version releases support .net6, .net7, and .net8, depending on the package version (the package's major version should mirror the .net version).

Requirements

Project Overview

There are several projects within this repository, all with the idea of demonstrating and/or testing the claim that the NuGet package https://www.nuget.org/packages/Orleans.SyncWork/ does what it claims to do.

Note that this project's major revision is kept in-line with the Orleans major version, so the project does not necessarily abide by SemVer, but we try as much as possible to do so. If breaking changes are introduced, descriptions of the breaking change and how to implement against it should be provided in release notes.

The projects in this repository include:

Orleans.SyncWork

The meat and potatoes of the project. This project contains the abstraction of "Long Running, CPU bound, Synchronous work" in the form of an abstract base class SyncWorker, which implements the interface ISyncWorker.

When long running work is identified, you can extend the base class SyncWorker, providing a TRequest and TResult unique to the long running work. This allows you to create as many ISyncWorker<TRequest, TResult> implementations as necessary, for all your long running CPU bound needs! (At least that is the hope.)

Basic "flow" of the SyncWork:

  • Start
  • Poll GetWorkStatus until a Completed or Faulted status is received
  • GetResult or GetException, depending on the status returned by GetWorkStatus
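The flow above can be sketched without any Orleans dependency. The following is a minimal illustration, not the package's implementation: WorkStatus, IWorker, WorkerPolling, and CountdownWorker are hypothetical stand-ins that only mirror the method names mentioned in this document (Start, GetWorkStatus, GetResult, GetException).

```csharp
using System;
using System.Threading.Tasks;

// Stand-in types for illustration only; these are NOT the package's definitions.
public enum WorkStatus { NotStarted, Running, Completed, Faulted }

public interface IWorker<TRequest, TResult>
{
    Task<bool> Start(TRequest request);
    Task<WorkStatus> GetWorkStatus();
    Task<TResult> GetResult();
    Task<Exception> GetException();
}

public static class WorkerPolling
{
    // Start the work, poll the status, then fetch the result or rethrow the failure.
    public static async Task<TResult> StartAndPoll<TRequest, TResult>(
        IWorker<TRequest, TResult> worker, TRequest request, int msDelayPerStatusPoll = 50)
    {
        await worker.Start(request);
        while (true)
        {
            switch (await worker.GetWorkStatus())
            {
                case WorkStatus.Completed:
                    return await worker.GetResult();
                case WorkStatus.Faulted:
                    throw await worker.GetException();
                default:
                    // Still running: wait a little before asking again.
                    await Task.Delay(msDelayPerStatusPoll);
                    break;
            }
        }
    }
}

// Hypothetical in-memory worker: reports Running for two polls, then Completed.
public sealed class CountdownWorker : IWorker<int, int>
{
    private int _pollsRemaining;
    private int _input;

    public Task<bool> Start(int request)
    {
        _input = request;
        _pollsRemaining = 2;
        return Task.FromResult(true);
    }

    public Task<WorkStatus> GetWorkStatus() =>
        Task.FromResult(_pollsRemaining-- > 0 ? WorkStatus.Running : WorkStatus.Completed);

    public Task<int> GetResult() => Task.FromResult(_input * 2);

    public Task<Exception> GetException() =>
        Task.FromResult<Exception>(new InvalidOperationException("The work did not fault."));
}
```

Calling WorkerPolling.StartAndPoll(new CountdownWorker(), 21, 1) would poll three times and then return the doubled input. The delay between polls is a parameter here so that callers can trade latency against messaging overhead.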

This package introduces a few "requirements" against Orleans:

  • In order to not overload Orleans, a LimitedConcurrencyLevelTaskScheduler is introduced. This task scheduler is registered (either manually or through the provided extension method) with a maximum level of concurrency for the silo being set up. This maximum concurrency MUST allow for idle threads, lest the Orleans server be overloaded. In testing, the general rule of thumb was Environment.ProcessorCount - 2 max concurrency. The important part is that the CPU is not fully "tapped out" such that the normal Orleans asynchronous messaging can't make it through due to the blocking sync work - this will make things start timing out.

  • Blocking grains are stateful, and are currently keyed on a Guid. If multiple grains of long running work are needed, each grain should be initialized with its own unique identity.

  • Blocking grains likely CANNOT dispatch further blocking grains. This is not yet tested under the repository, but it stands to reason that with a limited concurrency scheduler, the following scenario would lead to a deadlock:

    • Grain A is long running
    • Grain B is long running
    • Grain A initializes and fires off Grain B
    • Grain A cannot complete its work until it gets the results of Grain B

    In the above scenario, "Grain A" is actively being worked and fires off "Grain B". "Grain A" cannot complete its work until "Grain B" finishes, but "Grain B" cannot start until "Grain A" finishes and frees up a slot in the limited concurrency task scheduler. The scheduler can therefore never finish the work of "Grain A".

    There may be a way to avoid the above scenario, but I have not yet deeply explored it.
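Two of the points above can be illustrated with a small, Orleans-free sketch. It is not the package's scheduler: LimitedConcurrencySketch and its methods are hypothetical names, a SemaphoreSlim stands in for the limited concurrency task scheduler, and clamping the concurrency to a minimum of 1 is an assumption added for machines with very few cores.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class LimitedConcurrencySketch
{
    // Rule of thumb from the text: leave two cores idle for Orleans messaging.
    // Clamping to at least 1 is an assumption added here for small machines.
    public static int SuggestedMaxConcurrency(int processorCount) =>
        Math.Max(1, processorCount - 2);

    // Models "Grain A" holding a slot while waiting for "Grain B" to get one.
    // Returns true if B obtained a slot before the timeout elapsed.
    public static async Task<bool> NestedWorkCanStart(int maxConcurrency, TimeSpan timeout)
    {
        using var slots = new SemaphoreSlim(maxConcurrency);

        await slots.WaitAsync(); // A starts and occupies a slot
        try
        {
            // B needs its own slot while A is still holding one.
            var bStarted = await slots.WaitAsync(timeout);
            if (bStarted)
            {
                slots.Release(); // B finishes
            }
            return bStarted;
        }
        finally
        {
            slots.Release(); // A finishes
        }
    }
}
```

With a single slot, the nested work never gets to start and the call returns false once the timeout elapses. With two slots it returns true. A real deadlock would have no timeout to rescue it, which is why the timeout exists only in this sketch.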

Usage

Create an interface for the grain that extends ISyncWorker<TRequest, TResult>, as well as one of the IGrainWith...Key interfaces. Then create a new class that extends the SyncWorker<TRequest, TResult> abstract class and implements the new interface:

public interface IPasswordVerifierGrain
    : ISyncWorker<PasswordVerifierRequest, PasswordVerifierResult>, IGrainWithGuidKey;

public class PasswordVerifierGrain : SyncWorker<PasswordVerifierRequest, PasswordVerifierResult>, IPasswordVerifierGrain
{
    private readonly IPasswordVerifier _passwordVerifier;

    // The constructor name must match the class name (PasswordVerifierGrain).
    public PasswordVerifierGrain(
        ILogger<PasswordVerifierGrain> logger,
        LimitedConcurrencyLevelTaskScheduler limitedConcurrencyLevelTaskScheduler,
        IPasswordVerifier passwordVerifier) : base(logger, limitedConcurrencyLevelTaskScheduler)
    {
        _passwordVerifier = passwordVerifier;
    }

    protected override async Task<PasswordVerifierResult> PerformWork(
        PasswordVerifierRequest request, GrainCancellationToken grainCancellationToken)
    {
        var verifyResult = await _passwordVerifier.VerifyPassword(request.PasswordHash, request.Password);

        return new PasswordVerifierResult()
        {
            IsValid = verifyResult
        };
    }
}

public class PasswordVerifierRequest
{
    public string Password { get; set; }
    public string PasswordHash { get; set; }
}

public class PasswordVerifierResult
{
    public bool IsValid { get; set; }
}

Run the grain:

var request = new PasswordVerifierRequest()
{
    Password = "my super neat password that's totally secure because it's super long",
    PasswordHash = "$2a$11$vBzJ4Ewx28C127AG5x3kT.QCCS8ai0l4JLX3VOX3MzHRkF4/A5twy"
};
var passwordVerifyGrain = grainFactory.GetGrain<IPasswordVerifierGrain>(Guid.NewGuid());
var result = await passwordVerifyGrain.StartWorkAndPollUntilResult(request);

The above StartWorkAndPollUntilResult is an extension method defined in the package (SyncWorkerExtensions) that calls Start, polls GetWorkStatus, and finally calls GetResult or GetException once the work has completed. There is room for improvement here as it relates to testing unexpected scenarios, configuration based polling, etc.

Orleans.SyncWork.Tests

Unit testing project for the work in Orleans.SyncWork. These tests bring up a "TestCluster" which is used for the full duration of the tests against the grains.

One of the tests in particular throws 10k grains onto the cluster at once, all of which are long running (~200ms each on my machine). This is more than enough to overload the cluster if the limited concurrency task scheduler is not working correctly alongside the SyncWorker base class.

TODO: this project could still use a few more unit tests, if nothing else to document behavior.

Orleans.SyncWork.Demo.Api

This is a demo of the ISyncWorker<TRequest, TResult> in action. This project is used as both an Orleans silo and a client. Generally you would stand up nodes to the cluster separate from the clients against the cluster. Since we have only one node for testing purposes, this project acts as both the silo host and client.

The OrleansDashboard is also brought up with the API. You can see an example of hitting an endpoint in which 10k password verification requests are received here:

[Image: Dashboard showing 10k CPU bound, long running requests]

Swagger UI is also made available to the API for testing out the endpoints for demo purposes.

Orleans.SyncWork.Demo.Api.Benchmark

Utilizing BenchmarkDotNet, a benchmarking class was created both to test that the cluster wasn't falling over and to see what sort of timings we're dealing with.

Following is the benchmark used at the time of writing:

public class Benchy
{
    const int TotalNumberPerBenchmark = 100;
    private readonly IPasswordVerifier _passwordVerifier = new Services.PasswordVerifier();
    private readonly PasswordVerifierRequest _request = new PasswordVerifierRequest()
    {
        Password = PasswordConstants.Password,
        PasswordHash = PasswordConstants.PasswordHash
    };

    [Benchmark]
    public void Serial()
    {
        for (var i = 0; i < TotalNumberPerBenchmark; i++)
        {
            _passwordVerifier.VerifyPassword(PasswordConstants.PasswordHash, PasswordConstants.Password);
        }
    }

    [Benchmark]
    public async Task MultipleTasks()
    {
        var tasks = new List<Task>();
        for (var i = 0; i < TotalNumberPerBenchmark; i++)
        {
            tasks.Add(_passwordVerifier.VerifyPassword(PasswordConstants.PasswordHash, PasswordConstants.Password));
        }

        await Task.WhenAll(tasks);
    }

    [Benchmark]
    public async Task MultipleParallelTasks()
    {
        var tasks = new List<Task>();

        Parallel.For(0, TotalNumberPerBenchmark, i =>
        {
            tasks.Add(_passwordVerifier.VerifyPassword(PasswordConstants.PasswordHash, PasswordConstants.Password));
        });

        await Task.WhenAll(tasks);
    }

    [Benchmark]
    public async Task OrleansTasks()
    {
        var siloHost = await BenchmarkingSIloHost.GetSiloHost();
        var grainFactory = siloHost.Services.GetRequiredService<IGrainFactory>();
        var tasks = new List<Task>();
        for (var i = 0; i < TotalNumberPerBenchmark; i++)
        {
            var grain = grainFactory.GetGrain<IPasswordVerifierGrain>(Guid.NewGuid());
            tasks.Add(grain.StartWorkAndPollUntilResult(_request));
        }

        await Task.WhenAll(tasks);
    }
}

And here are the results:

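| Method                | Mean     | Error    | StdDev   |
|-----------------------|----------|----------|----------|
| Serial                | 12.399 s | 0.0087 s | 0.0077 s |
| MultipleTasks         | 12.289 s | 0.0106 s | 0.0094 s |
| MultipleParallelTasks | 1.749 s  | 0.0347 s | 0.0413 s |
| OrleansTasks          | 2.130 s  | 0.0055 s | 0.0084 s |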

Note that in the above, the Orleans tasks are limited to my local cluster. In a more realistic situation with multiple nodes in the cluster, you could expect better timing, though you'd probably have to deal more with network latency.

Orleans.SyncWork.Demo.Services

This project defines several grains to demonstrate the workings of the Orleans.SyncWork package, through the Web API, benchmark, and tests.

orleans.syncwork's People

Contributors

einari, htxryan, kritner, reubenbond

orleans.syncwork's Issues

Allow GetWorkStatus to report NotStarted

GetWorkStatus() throws an exception when the grain has not been started yet, versus returning NotStarted.

if (_status == SyncWorkStatus.NotStarted)
{
    Logger.LogError("{Method} was in a status of {WorkStatus}", nameof(GetWorkStatus), SyncWorkStatus.NotStarted);
    DeactivateOnIdle();
    throw new InvalidStateException(_status);
}
return Task.FromResult(_status);

In my use case I intended to start a long-running task (in my project a sequenceFactoryGrain eventually creates a sequenceGrain, working as a long-running task) in an idempotent way, like

ISequenceGrain sequenceGrain = await sequenceFactoryGrain.GetWorkStatus() switch
{
    SyncWorkStatus.NotStarted => await sequenceFactoryGrain.StartWorkAndPollUntilResult(sequence),
    SyncWorkStatus.Running => await sequenceFactoryGrain.ContinueWorkAndPollUntilResult(sequence),
    SyncWorkStatus.Completed => (await sequenceFactoryGrain.GetResult())!,
    SyncWorkStatus.Faulted => throw (await sequenceFactoryGrain.GetException())!,
    _ => throw new NotImplementedException()
};

but was surprised by the exception thrown when trying to check the current state.

My intent: If some external non-Orleans API call tries to start the factory to get a grain, but if somehow there is a timeout/cancellation/... and the API retries, I should be able to find out that the sequenceFactoryGrain is already processing and just tag along waiting for the long-running call to complete.
(ContinueWorkAndPollUntilResult is the same code as StartWorkAndPollUntilResult but without the Start call)

Originally posted by @hendrikdevloed in #55 (comment)

I'll make the modification to allow retrieving GetWorkStatus() and set up a pull request.

Can ISyncWorker omit imposing IGrainWithGuidKey?

The definition

public interface ISyncWorker<in TRequest, TResult> : IGrainWithGuidKey

imposes the type of key to a grain. Would it be interesting to omit the grain type from the ISyncWorker, or perhaps basing it on a generic IGrain (or IAddressable, I'm not quite sure what the proper base would be) instead?

public interface ISyncWorker<in TRequest, TResult>: IGrain

Testing the modification in the SyncWork repo breaks all GrainFactory calls such as the following, because the type system has no hint as to which GetGrain overload to choose:
GrainFactory.GetGrain<ISyncWorker<TestDelaySuccessRequest, TestDelaySuccessResult>>(Guid.NewGuid());

I see some options:

  • Have a parallel hierarchy ISyncWorkerWithGuidKey<> etc for each of the Orleans grain types
  • Have the user name the implementation explicitly like
/// <summary>
/// Represents the contract for a SyncWorker grain verifying a password.
/// </summary>
public interface IPasswordVerifierGrain : IPasswordVerifier, ISyncWorker<PasswordVerifierRequest, PasswordVerifierResult>, IGrainWithGuidKey;

What are your thoughts on this?

Kind regards,
Hendrik

CICD not working quite as intended

Using:

I'm not getting quite the behavior I wanted. I'm not sure if it would be better to do "release branch" pushes of the nuget packages, or continue doing so from main. Ideally, the setup would just run from main and tag releases as they go out, but perhaps I should just start with doing pushes to nuget from release/v{version} branches; this will at least make tagging (manually) for new versions of the package easier, albeit more tedious as it's a manual process.

The publish builds as they stand are currently "failing" even though the packages are being pushed (brandedoutcast/publish-nuget#58). It seems like https://github.com/brandedoutcast/publish-nuget may be mostly dead as far as taking in new PRs goes, so I might need to reevaluate the use of this action.

Put on NuGet

Is your feature request related to a problem? Please describe.
I'm frustrated that this code isn't on nuget

Describe the solution you'd like
the code on nuget

Describe alternatives you've considered
the code on nuget and/or other package manager

Add `[FlakyTheory]`

Is your feature request related to a problem? Please describe.
Right now only [FlakyFact] is supported. Sometimes my [Theory] is flaky too!

Describe the solution you'd like
I'd like support for [FlakyTheory] as an accompaniment to the existing [FlakyFact]

Describe alternatives you've considered
Having each data part to a theory, as individual [FlakyFacts] isn't great.

Add cancellation token support

Is your feature request related to a problem? Please describe.

Need some means of cancelling long running work

Describe the solution you'd like

The Start method

Task<bool> Start(TRequest request);
should have a new parameter added to it, taking in a cancellation token, which will be used to dispatch the long running work. Upon receiving a cancel signal, the long running grains should be able to see the task is cancelled, and do any necessary clean up. This cleanup is up to the grain implementation itself to handle.

Additional context

This may also affect tests and usage of the extension methods in

public static class SyncWorkerExtensions

They will both need cancellation token support, and we'll probably need a new SyncWorkStatus representative of a cancelled job.
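As a rough illustration of what cooperative cancellation of CPU bound work could look like, here is a minimal sketch that uses only System.Threading.CancellationToken. It is an assumption about the general shape, not the package's design: CancellableWork and SumUntilCancelled are hypothetical names, and how the token would be threaded through Start is left open.

```csharp
using System;
using System.Threading;

public static class CancellableWork
{
    // Simulates long running synchronous work that checks for cancellation
    // on every iteration, so a cancel signal is observed promptly.
    public static long SumUntilCancelled(int iterations, CancellationToken token)
    {
        long total = 0;
        for (var i = 0; i < iterations; i++)
        {
            // Throws OperationCanceledException once cancellation is requested.
            // Any clean up would go in a catch/finally owned by the implementation.
            token.ThrowIfCancellationRequested();
            total += i;
        }
        return total;
    }
}
```

A caller that catches OperationCanceledException could then record a cancelled status, which is where a new SyncWorkStatus value would come in.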

Introduce "bootstrapping" extension method for `ISiloBuilder`, like the one existing for `ISiloHostBuilder`

Is your feature request related to a problem? Please describe.
The ISiloHostBuilder will seemingly be obsoleted/deprecated at some point as per dotnet/orleans#5685.

Describe the solution you'd like
Introduce an extension method onto ISiloBuilder that will allow a similar registration of the Orleans.SyncWork requirements such as the ApplicationParts and LimitedConcurrencyLevelTaskScheduler.

Describe alternatives you've considered
n/a

Additional context
Both the ISiloBuilder and ISiloHostBuilder extension methods will exist together.

Should closely resemble the current ISiloHostBuilder extension method located here: https://github.com/OrleansContrib/Orleans.SyncWork/blob/v1.4.10/src/Orleans.SyncWork/ExtensionMethods/SiloHostBuilderExtensions.cs

public static ISiloHostBuilder ConfigureSyncWorkAbstraction(this ISiloHostBuilder builder, int maxSyncWorkConcurrency = 4)
{
	builder.ConfigureApplicationParts(parts => parts.AddApplicationPart(typeof(ISyncWorkAbstractionMarker).Assembly).WithReferences());

	builder.ConfigureServices(services =>
	{
		services.AddSingleton(_ => new LimitedConcurrencyLevelTaskScheduler(maxSyncWorkConcurrency));
	});

	return builder;
}

WhenGivenLargeNumberOfRequests_SystemShouldNotBecomeOverloaded timeout

Describe the bug
All unit tests pass, with the exception of WhenGivenLargeNumberOfRequests_SystemShouldNotBecomeOverloaded, which times out.

To Reproduce

Reproduction Repository
Current main branch c19842b

Expected behavior
Test should not time out.

Screenshots

   Source: SyncWorkerTests.cs line 55
   Duration: 1.5 min

  Message: 
System.TimeoutException : Response did not arrive on time in 00:00:30 for message: Request [ sys.client/6fe2e63ff303466c97dfee58f690f240]->[S127.0.0.1:44340:0 passwordverifier/8a45d6b4d480457f938cb739266c894a] Orleans.SyncWork.ISyncWorker<TRequest,TResult>Orleans.SyncWork.ISyncWorker<TRequest,TResult>.GetWorkStatus() #16573. 

  Stack Trace: 
ResponseCompletionSource`1.GetResult(Int16 token) line 230
<.cctor>b__4_0(Object state)
--- End of stack trace from previous location ---
SyncWorkerExtensions.StartWorkAndPollUntilResult[TRequest,TResult](ISyncWorker`2 worker, TRequest request, Int32 msDelayPerStatusPoll) line 52
SyncWorkerTests.WhenGivenLargeNumberOfRequests_SystemShouldNotBecomeOverloaded() line 69
--- End of stack trace from previous location ---

Additional context
I was looking into SyncWork in order to solve the same timeout error in a spike I'm writing to see if Orleans is feasible for us. The 30s timeout error in my own code occurred when an external call into a grain was causing a lot of work setting up other grains, and eventually the initial call timed out.

I expected that this library would avoid precisely this type of error so I was especially surprised to get it again. Perhaps I'm misinterpreting its purpose?

The test passes in 11.7 seconds when scaled down to for (var i = 0; i < 400; i++). 1000 iterations also complete within the deadline, at 27.1 sec.

Upgrade to Orleans 7?

Hello! Is this project still alive? Are there any plans to upgrade to Orleans 7? Or is there an alternative pattern that should now be used?

If it's just a matter of capacity, I could take a first crack at a v7 upgrade PR.

Upgrade to Orleans 8

After migrating as many Orleans-related packages as possible to 8.0.0 in my Orleans-based application, a runtime exception was encountered:
Could not load type 'Orleans.Serialization.Codecs.TagDelimitedFieldCodec' from assembly 'Orleans.Serialization'
with the stack pointing to SyncWork-related code.

This seems similar to OrleansContrib/OrleansDashboard#395, which was allegedly solved by updating the package to match the Orleans runtime.

I removed the SyncWork NuGet reference and replaced it with a locally compiled version of the SyncWork source code, upgraded its runtime and dependencies to Orleans 8, and the error didn't reoccur.

Removal of final "0" in version of version.json

It's been a hot minute since I've done a release of this package. The package in RELEASE/v7.0.0 is being done as a prerelease version rather than release version, and I'm not positive why.

When looking at previous releases, the version has only had "major" and "minor", whereas the current (attempted to be built) package has "major", "minor" and "revision/patch". Not sure if this is actually the cause, but this is a reminder to myself to try to fix it tonight :3

Example of working "release" version from a previous tag:
image

The current non working "release" that is being packed/published as pre-release:
image

image

Don't have permission to repo to update repo secrets

Hey @ReubenBond could you help me out here, or ping someone that would be able to?

My NuGet API key expired and I need to set a new one on the repo, but it doesn't look like I have permission to do so.

This repo's settings page:
image

A repo where I have access to secrets:
image

Start method throws MissingMethodException exception

Describe the bug
An exception is thrown when calling the Start method:

System.MissingMethodException: Method not found: 'System.Threading.Tasks.ValueTask`1<!!0> Orleans.Runtime.GrainReference.InvokeAsync(Orleans.Serialization.Invocation.IInvokable)'.
   at OrleansCodeGen.Orleans.SyncWork.Proxy_ISyncWorker`2.global::Orleans.SyncWork.ISyncWorker<TRequest,TResult>.Start(TRequest arg0)

Symbols not published in package

The intention was to include symbols in the NuGet push, but I don't think my command is quite right. I want to automate this with GH Actions anyway.
