
scientist.net's Introduction

Scientist.NET

A .NET Port of the Scientist library for carefully refactoring critical paths.


To give it a twirl, use NuGet to install: Install-Package Scientist

How do I science?

Let's pretend you're changing the way you handle permissions in a large web app. Tests can help guide your refactoring, but you really want to compare the current and refactored behaviors under load.

using GitHub;

...

public bool CanAccess(IUser user)
{
    return Scientist.Science<bool>("widget-permissions", experiment =>
    {
        experiment.Use(() => IsCollaborator(user)); // old way
        experiment.Try(() => HasAccess(user)); // new way
    }); // returns the control value
}

Wrap a Use block around the code's original behavior, and wrap Try around the new behavior. Invoking Scientist.Science<T> will always return whatever the Use block returns, but it does a bunch of stuff behind the scenes:

  • It decides whether or not to run the Try block,
  • Randomizes the order in which Use and Try blocks are run,
  • Measures the durations of all behaviors,
  • Compares the result of Try to the result of Use,
  • Swallows (but records) any exceptions raised in the Try block, and
  • Publishes all this information.

The Use block is called the control. The Try block is called the candidate.

If you don't declare any Try blocks, none of the Scientist machinery is invoked and the control value is always returned.
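For illustration, a no-candidate experiment (a sketch reusing the earlier example) behaves exactly like calling the control directly:

public bool CanAccess(IUser user)
{
    return Scientist.Science<bool>("widget-permissions", experiment =>
    {
        // No Try blocks declared: nothing is measured or published,
        // and the control value is returned as-is.
        experiment.Use(() => IsCollaborator(user));
    });
}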

Making science useful

Publishing results

What good is science if you can't publish your results?

By default, results are published to an in-memory publisher. To override this behavior, create your own implementation of IResultPublisher:

public class MyResultPublisher : IResultPublisher
{
    public Task Publish<T, TClean>(Result<T, TClean> result)
    {
        Logger.Debug($"Publishing results for experiment '{result.ExperimentName}'");
        Logger.Debug($"Result: {(result.Matched ? "MATCH" : "MISMATCH")}");
        Logger.Debug($"Control value: {result.Control.Value}");
        Logger.Debug($"Control duration: {result.Control.Duration}");
        foreach (var observation in result.Candidates)
        {
            Logger.Debug($"Candidate name: {observation.Name}");
            Logger.Debug($"Candidate value: {observation.Value}");
            Logger.Debug($"Candidate duration: {observation.Duration}");
        }

        if (result.Mismatched)
        {
            // save mismatched experiments to the DB
            DbHelpers.SaveExperimentResults(result);
        }

        return Task.FromResult(0);
    }
}

Then set Scientist to use it before running the experiments:

Scientist.ResultPublisher = new MyResultPublisher();

As of v1.0.2, an IResultPublisher can also be wrapped in a FireAndForgetResultPublisher, which delegates publishing to another thread so that it doesn't delay running experiments:

Scientist.ResultPublisher = new FireAndForgetResultPublisher(new MyResultPublisher(onPublisherException));

Controlling comparison

Scientist compares control and candidate values using ==. To override this behavior, use Compare to define how to compare observed values instead:

public IUser GetCurrentUser(string hash)
{
    return Scientist.Science<IUser>("get-current-user", experiment =>
    {
        experiment.Compare((x, y) => x.Name == y.Name);

        experiment.Use(() => LookupUser(hash));
        experiment.Try(() => RetrieveUser(hash));
    });
}

Adding context

Results aren't very useful without some way to identify them. Use the AddContext method to add to the context for an experiment:

public IUser GetUserByName(string userName)
{
    return Scientist.Science<IUser>("get-user-by-name", experiment =>
    {
        experiment.AddContext("username", userName);

        experiment.Use(() => FindUser(userName));
        experiment.Try(() => GetUser(userName));
    });
}

AddContext takes a string identifier and an object value, and adds them to an internal Dictionary. When you publish the results, you can access the context by using the Contexts property:

public class MyResultPublisher : IResultPublisher
{
    public Task Publish<T, TClean>(Result<T, TClean> result)
    {
        foreach (var kvp in result.Contexts)
        {
            Console.WriteLine($"Key: {kvp.Key}, Value: {kvp.Value}");
        }
        return Task.FromResult(0);
    }
}

Expensive setup

If an experiment requires expensive setup that should only occur when the experiment is going to be run, define it with the BeforeRun method:

public int DoSomethingExpensive()
{
    return Scientist.Science<int>("expensive-but-worthwile", experiment =>
    {
        experiment.BeforeRun(() => ExpensiveSetup());

        experiment.Use(() => TheOldWay());
        experiment.Try(() => TheNewWay());
    });
}

Keeping it clean

Sometimes you don't want to store the full value for later analysis. For example, an experiment may return IUser instances, but when researching a mismatch, all you care about is the logins. You can define how to clean these values in an experiment:

public IUser GetUserByEmail(string emailAddress)
{
    return Scientist.Science<IUser, string>("get-user-by-email", experiment =>
    {
        experiment.Use(() => OldApi.FindUserByEmail(emailAddress));
        experiment.Try(() => NewApi.GetUserByEmail(emailAddress));
        
        experiment.Clean(user => user.Login);
    });
}

And this cleaned value is available in the final published result:

public class MyResultPublisher : IResultPublisher
{
    public Task Publish<T, TClean>(Result<T, TClean> result)
    {
        // result.Control.Value = <IUser object>
        IUser user = (IUser)result.Control.Value;
        Console.WriteLine($"Login from raw object: {user.Login}");
        
        // result.Control.CleanedValue = "user name"
        Console.WriteLine($"Login from cleaned object: {result.Control.CleanedValue}");
        
        return Task.FromResult(0);
    }
}

Ignoring mismatches

During the early stages of an experiment, it's possible that some of your code will always generate a mismatch for reasons you know and understand but haven't yet fixed. Instead of these known cases always showing up as mismatches in your metrics or analysis, you can tell an experiment whether or not to ignore a mismatch using the Ignore method. You may include more than one block if needed:

public bool CanAccess(IUser user)
{
    return Scientist.Science<bool>("widget-permissions", experiment =>
    {
        experiment.Use(() => IsCollaborator(user));
        experiment.Try(() => HasAccess(user));

        // user is staff, always an admin in the new system
        experiment.Ignore((control, candidate) => user.IsStaff);
        // new system doesn't handle unconfirmed users yet
        experiment.Ignore((control, candidate) => control && !candidate && !user.ConfirmedEmail);
    });
}

The ignore blocks are only called if the values don't match. If one observation raises an exception and the other doesn't, it's always considered a mismatch. If both observations raise different exceptions, that is also considered a mismatch.

Enabling/disabling experiments

Sometimes you don't want an experiment to run. Say, disabling a new codepath for anyone who isn't staff. You can disable an experiment by setting a RunIf block. If this returns false, the experiment will merely return the control value. Otherwise, it defers to the global Scientist.Enabled method.

public decimal GetUserStatistic(IUser user)
{
    return Scientist.Science<decimal>("new-statistic-calculation", experiment =>
    {
        experiment.RunIf(() => user.IsTestSubject);

        experiment.Use(() => CalculateStatistic(user));
        experiment.Try(() => NewCalculateStatistic(user));
    });
}

Ramping up experiments

As a scientist, you know it's always important to be able to turn your experiment off, lest it run amok and result in villagers with pitchforks on your doorstep. You can set a global switch to control whether or not experiments are enabled by using the Scientist.Enabled method.

int percentEnabled = 10;
Random rand = new Random();
Scientist.Enabled(() =>
{
    return rand.Next(100) < percentEnabled;
});

This code will be invoked for every experiment, every time it runs, so be sensitive about its performance. For example, you can store an experiment's enabled state in the database but wrap it in various levels of caching, as in the sketch below.
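Here's one way that caching might look (a sketch assuming Microsoft.Extensions.Caching.Memory; LoadPercentEnabledFromDatabase is a hypothetical helper backed by your own store):

var cache = new MemoryCache(new MemoryCacheOptions());
var rand = new Random();

Scientist.Enabled(() =>
{
    int percentEnabled = cache.GetOrCreate("experiment-percent", entry =>
    {
        // Re-read the stored value at most once per minute.
        entry.AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(1);
        return LoadPercentEnabledFromDatabase();
    });

    return rand.Next(100) < percentEnabled;
});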

Running candidates in parallel (asynchronous)

Scientist runs tasks synchronously by default. This can end up doubling (more or less) the time it takes the original method call to complete, depending on how many candidates are added and how long they take to run.

In cases where Scientist is used for production refactoring, for example, this means the calling method returns more slowly than before, which may affect the performance of your original code. However, if the candidates can run at the same time as the control method without affecting each other, they can be run in parallel, and the Scientist call will only take as long as the slowest task (plus a tiny bit of overhead):

await Scientist.ScienceAsync<int>(
    "ExperimentName",
    3, // number of tasks to run concurrently
    experiment =>
    {
        experiment.Use(async () => await StartRunningSomething(myData));
        experiment.Try(async () => await RunAtTheSameTimeAsTheControlMethod(myData));
        experiment.Try(async () => await AlsoRunThisConcurrently(myData));
    });

As always when using async/await, don't forget to call .ConfigureAwait(false) where appropriate.

Testing

When running your test suite, it's helpful to know that the experimental results always match. To help with testing, Scientist has a ThrowOnMismatches property that can be set to true. Only do this in your test suite!

To throw on mismatches:

Scientist.Science<int>("ExperimentN", experiment => 
{
    experiment.ThrowOnMismatches = true;
    // ...
});

Scientist will throw a MismatchException<T, TClean> exception if any observations don't match.
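For example, a test might look like this (a sketch assuming xUnit; OldCalculation and NewCalculation are hypothetical):

[Fact]
public void NewCalculationMatchesOldCalculation()
{
    // Throws MismatchException<int, int> (failing the test) on any mismatch.
    Scientist.Science<int>("test-calculation", experiment =>
    {
        experiment.ThrowOnMismatches = true;
        experiment.Use(() => OldCalculation());
        experiment.Try(() => NewCalculation());
    });
}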

Handling errors

If an exception is thrown in any of Scientist's internal helpers like Compare, Enabled, or Ignore, the default behavior of Scientist is to re-throw that exception. Since this halts the experiment entirely, it's often better to handle the error and continue, so a single failure doesn't cancel the whole experiment:

Scientist.Science<int>("ExperimentCatch", experiment =>
{
    experiment.Thrown((operation, exception) => InternalTracker.Track($"Science failure in ExperimentCatch: {operation}.", exception));
    // ...
});

The operations that may be handled here are:

  • Operation.Compare - an exception is raised in a Compare block
  • Operation.Enabled - an exception is raised in the Enabled block
  • Operation.Ignore - an exception is raised in an Ignore block
  • Operation.Publish - an exception is raised while publishing results
  • Operation.RunIf - an exception is raised in a RunIf block

Designing an experiment

Because Enabled and RunIf determine when a candidate runs, it's impossible to guarantee that it will run every time. For this reason, Scientist is only safe for wrapping methods that aren't changing data.

When using Scientist, we've found it most useful to modify both the existing and new systems simultaneously anywhere writes happen, and verify the results at read time with Science. ThrowOnMismatches has also been useful to ensure that the correct data was written during tests, and reviewing published mismatches has helped us find any situations we overlooked with our production data at runtime. When writing to and reading from two systems, it's also useful to write some data reconciliation scripts to verify and clean up production data alongside any running experiments.

Finishing an experiment

As your candidate behavior converges on the controls, you'll start thinking about removing an experiment and using the new behavior.

  • If there are any ignore blocks, the candidate behavior is guaranteed to be different. If this is unacceptable, you'll need to remove the ignore blocks and resolve any ongoing mismatches in behavior until the observations match perfectly every time.
  • When removing a read-behavior experiment, it's a good idea to keep any write-side duplication between an old and new system in place until well after the new behavior has been in production, in case you need to roll back.

Breaking the rules

Sometimes scientists just gotta do weird stuff. We understand.

Ignoring results entirely

Science is useful even when all you care about is the timing data or even whether or not a new code path blew up. If you have the ability to incrementally control how often an experiment runs via your Enabled method, you can use it to silently and carefully test new code paths and ignore the results altogether. You can do this by setting Ignore((x, y) => true), or for greater efficiency, Compare((x, y) => true).

This will still log mismatches if any exceptions are raised, but will disregard the values entirely.
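A minimal sketch (ExistingCodePath and NewCodePath are hypothetical):

Scientist.Science<int>("timing-only", experiment =>
{
    // Force every comparison to "match" so values are disregarded;
    // durations and exceptions are still recorded and published.
    experiment.Compare((x, y) => true);

    experiment.Use(() => ExistingCodePath());
    experiment.Try(() => NewCodePath());
});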

Trying more than one thing

It's not usually a good idea to try more than one alternative simultaneously. Behavior isn't guaranteed to be isolated and reporting + visualization get quite a bit harder. Still, it's sometimes useful.

To try more than one alternative at once, add names to some Try blocks:

public bool CanAccess(IUser user)
{
    return Scientist.Science<bool>("widget-permissions", experiment =>
    {
        experiment.Use(() => IsCollaborator(user));
        experiment.Try("api", () => HasAccess(user));
        experiment.Try("raw-sql", () => HasAccessSql(user));
    });
}

Alternatives

Here are other implementations of Scientist available in different languages.

scientist.net's People

Contributors

aloisdg, arntj, darcythomas, davezych, haacked, jlandheer, jon-adams, joncloud, josh-hiles, joshhiles, jtreuting, kolektiv, m-zuber, martincostello, paulbreen, pedrolamas, richarddalton, ryangribble, shiftkey, silvenga, zizhong-zhang

scientist.net's Issues

Async ResultPublishers could deadlock sync experiments

Thanks for the great library!

One issue we ran into is when using an async ResultPublisher for synchronous experiments within an MVC app.

The controller below will deadlock on every request, because it awaits Task.Delay() within a synchronous web request.

This article has a more in-depth explanation on why:
http://blog.stephencleary.com/2012/07/dont-block-on-async-code.html

It looks like ConfigureAwait could potentially be used for a fix, or possibly adding a separate ResultPublisher method that gets invoked on synchronous calls. Our current workaround is to use sync implementations of Publish without any awaits (sketched after the repro below).

using System.Threading.Tasks;
using System.Web.Mvc;
using GitHub;

namespace ScientistDeadlockingResultPublisher.Controllers
{
    public class AsyncResultPublisher : IResultPublisher
    {
        public async Task Publish<T, TClean>(Result<T, TClean> result)
        {
            await Task.Delay(5000);
        }
    }

    public class DeadlockController : Controller
    {
        // GET: Deadlock
        public ActionResult Index()
        {
            Scientist.ResultPublisher = new AsyncResultPublisher();
            var result = Scientist.Science<int>("myExperiment", e =>
            {
                e.Use(() => 2);
                e.Try("candidate", () => 1);
            });

            return Content("done");
        }
    }
}
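For reference, the sync workaround mentioned above might look like this (a sketch, assuming logging is cheap enough to do inline):

public class SyncResultPublisher : IResultPublisher
{
    public Task Publish<T, TClean>(Result<T, TClean> result)
    {
        // No awaits anywhere: the experiment's blocking wait can't deadlock
        // on a captured synchronization context.
        Logger.Debug($"{result.ExperimentName}: {(result.Matched ? "MATCH" : "MISMATCH")}");
        return Task.FromResult(0);
    }
}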

Documentation

The readme is pretty helpful, but I'm not sure exactly what the difference is [w/o debugging] between Result<T, TClean>.Candidates and Result<T, TClean>.Observations. Is Observations just Candidates + Control?

Support multiple candidates

The original Ruby library supports multiple candidates to compare to a control. We should do this. In fact, we should refactor the code to be a bit more in line with the Ruby library, and only deviate where it makes sense because we're using C#.

publishing builds to NuGet

@haacked was this a manual process before?

I'd love for it to be driven from tags, but I'm also rusty as hell on all this so I'm happy to defer to those who have a better idea of what's technically feasible these days

Script NuGet package creation

Things to figure out:

  • Use DNU Pack or Nuget.exe? (or does it matter?)
  • Do we want to auto push to nuget, or just create the package?
    • If we auto push, what branches should push?

@haacked your thoughts? Anything else I'm missing?

Change root namespace Github to Scientist

This project has nothing to do with Github itself, so the root namespace Github is awkward. Using Scientist.NET in a project currently requires the following using statement:

using Github;

Change the namespace to Scientist, so using it in .NET requires the following more logical using statement:

using Scientist;

wire up CI providers and add build badge

Appveyor was previously used on the project to test on Windows, so I'm inclined to enable that again unless someone wants to try enabling Azure Pipelines so this can be tested on macOS and Linux.

Appveyor also supports Linux environments, but I've not had a chance to use that yet

add some project documentation

The README has some interesting examples, but maybe we could go further and have a published site hosted somewhere that's more discoverable?

Rewrite for .NET 6

My thoughts are that we use GitHub Actions to deploy a 2.x.x-alpha to NuGet, which we can use for testing etc. with the wider community.

In regards to the main branch, we keep that going and alive for the time being, with the outlook of having a 1.x.x branch where the old code sits for any bug fixes and such. However, I think people should be moving on from old .NET versions!

Is this library still maintained?

Hello there,

I am looking for some tool for testing refactoring of critical paths in my app, and with a little bit of Google magic, I found this one. It looked great to me until I saw the commit history.
I've seen that the last commit was on May 29, 2020, and before that on Apr 4, 2019, so that was immediately a red flag to me.

Could someone tell me if this project/library is still actively being maintained, and if not, are there some other .NET libraries that you recommend that can be used instead?

Use and Try blocks run sequentially?

Hey all,

Love the GitHub Engineering blog post on Scientist and just checking out the .Net port of it... great work!

I've successfully implemented an experiment in a Web API project where I am trying out a new way to do some calculations whilst still providing users with "the old way", but starting to gather data on any discrepancies between old and new.

Things do "work" however it seems that the "behaviours" are run one at a time, which means (unless im doing something wrong) that running the control Use() and one experimental Try() the api call now takes about twice as long as it used to.

What do you think about having an option to stipulate the number of experiments to run concurrently and then having the internals of ExperimentInstance handle running multiple tasks in parallel? In my case I would be happy to run the Control and single Try at the same time, then hopefully the duration/user impact should be roughly what it always has been, but we start to gain the data that lets us ensure consistent results and then perhaps even start to tune the call duration for better performance.

Or have I just missed something and it's meant to be doing this already?

Build script fails to run tests with dotnet cli preview5 (latest)

It looks like the latest version of dotnet cli requires a different set of parameters in order to run the unit tests.

A couple of ideas:

  • Specify -Version 1.0.0-preview2-003121 when downloading dotnet cli
  • Continue developing #71

Another thing that should probably be updated: if the dotnet test command fails, it should fail the build instead of allowing it to pass without running the tests.

Sample output from appveyor:

.\tools\dotnet\dotnet.exe test .\test\Scientist.Test\
Couldn't find a project to run test from. Ensure a project exists in C:\projects\scientist-net.
Or pass the path to the project
Finished Target: RunTests

ScienceAsync does not execute Use and Try in parallel

The documentation indicates that when using Scientist.ScienceAsync the Use and Try functions should run in parallel and should only take as long as the slowest function.

https://github.com/scientistproject/Scientist.net#running-candidates-in-parallel-asynchronous

The following code always takes 10 seconds to run. Additionally, "Done1" and "Done2" seem to complete in a random order each time.

using System;
using System.Threading;
using System.Threading.Tasks;
using GitHub;

namespace Science
{
    class Program
    {
        static void Main(string[] args)
        {
            
            var t = Scientist.ScienceAsync<int>("ExperimentName", 2,
                experiment => {
                    experiment.Use(async () => await Thing1());
                    experiment.Try(async () => await Thing2());
                });

            t.Wait();
            Console.WriteLine("All Done");
            Console.ReadLine();
        }

        static async Task<int> Thing1()
        {
            Thread.Sleep(5000);
            Console.WriteLine("Done1");
            return await Task.FromResult(0);
        }

        static async Task<int> Thing2()
        {
            Thread.Sleep(5000);
            Console.WriteLine("Done2");
            return await Task.FromResult(1);
        }
    }
}

Support different return types

I'm not sure how Scientist handles this (nor how Ruby handles this situation), and I know you want to stay close to the Ruby lib, but it could be useful to support different return types between the control and candidate. When refactoring there is no guarantee that the candidate methods will return the same type as the existing method, which could make it hard to experiment on them. Supporting two different types lets us get around that issue, and we can check the differences between two different complex objects in the comparison.

A potential implementation:

internal class Experiment<TControl, TCandidate>
{
    Func<Task<TControl>> _control;
    Func<Task<TCandidate>> _candidate;
    Func<TControl, TCandidate, bool> _comparison;

    public void Use(Func<Task<TControl>> control) => _control = control;
    public void Try(Func<Task<TCandidate>> candidate) => _candidate = candidate;

    public void Compare(Func<TControl, TCandidate, bool> comparison) => _comparison = comparison;
}

Move to use nUnit (rather than xUnit)

I fear I am opening a can of worms | starting a religious war...

I prefer nUnit over xUnit because:

  • Resharper has better support for running code coverage inside Visual Studio (rather than using the command line) (This is a bug that will be fixed in VS2015.2 apparently)
  • The syntax reads nicer (to me)
    • e.g.,: Assert.That(() => Int32.Parse("abc"), Throws.Exception.TypeOf<FormatException>());

This is a nice to have (for me...) issue.
So totally cool if it is closed etc

Memory cleanup issue

I'm using scientist within a scheduled task that runs every hour to test a new approach of fetching data from a database. What I noticed is ever since I've added scientist.net in the solution, the memory consumption of the app goes up with every run.

Scientist is comparing 2 lists of around 100k items.

After some investigation and debugging I noticed that InMemoryResultProvider always keeps a copy of the last result, which in turn holds a copy of the original array. Clearing that up fixes the memory issue; however, I don't understand why this (and a lot more stuff in the whole library) is static.

SingleContextIncludedWithPublish Unit Test passes inconsistently

Running build.cmd or tests through Visual Studio seems to occasionally fail due to an exception.

Test run on 43e27ee

xUnit.net DNX Runner (32-bit DNX 4.5.1)
  Discovering: Scientist.Test
  Discovered:  Scientist.Test
  Starting:    Scientist.Test
    TheScientistClass+TheScienceMethod.SingleContextIncludedWithPublish [FAIL]
      System.AggregateException : One or more errors occurred.
      ---- System.Exception : Exception of type 'System.Exception' was thrown.
      Stack Trace:
           at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
           at System.Threading.Tasks.Task`1.GetResultCore(Boolean waitCompletionNotification)
           at System.Threading.Tasks.Task`1.get_Result()
        C:\Users\jdber_000\Documents\GitHub\scientist.net\src\Scientist\Scientist.cs(35,0): at GitHub.Scientist.Science[T](String name, Action`1 experiment)
        C:\Users\jdber_000\Documents\GitHub\scientist.net\test\Scientist.Test\ScientistTests.cs(539,0): at TheScientistClass.TheScienceMethod.SingleContextIncludedWithPublish()
        ----- Inner Stack Trace -----
        C:\Users\jdber_000\Documents\GitHub\scientist.net\src\Scientist\Internals\Experiment.cs(13,0): at GitHub.Internals.Experiment`1.<>c.<.cctor>b__35_1(Operation operation, Exception exception)
        C:\Users\jdber_000\Documents\GitHub\scientist.net\src\Scientist\Internals\ExperimentInstance.cs(87,0): at GitHub.Internals.ExperimentInstance`1.<Run>d__12.MoveNext()
  Finished:    Scientist.Test
=== TEST EXECUTION SUMMARY ===
   Scientist.Test  Total: 34, Errors: 0, Failed: 1, Skipped: 0, Time: 0.554s

Support custom comparators

Similar to Ruby's original code.

I'm willing to implement this, but I would like to discuss the preferred implementation. I see a few ways to do this in C#:

1. Use an optional parameter

We could make Science have an optional parameter Func<T, T, bool> that would be called at the end of the experiment if it is not null.
I implemented this in my fork, you can see the changes needed here.

2. Add a property in Scientist class

Very similar to # 1, but instead of passing an optional parameter you would make Scientist generic (Scientist<T>), having a property Func<T, T, bool> Comparer that could be set like an ObservationPublisher.
This would change the way the lib is used right now, as a type parameter would have to be passed in Scientist class but no parameter would be needed in the Science method anymore.

Old way:

var result = Scientist.Science<int>("success", experiment =>
...

New way

var result = Scientist<int>.Science("success", experiment =>
...

From my point of view this would make the code a little less clean, as the type would have to be used anytime the Scientist object is referenced.

3. Change Experiment<T> interface

Ruby's implementation of Comparator is inside the Experiment class; doing that in C# would make implementing the Compare method required, even for straightforward value-type comparisons.
Maybe we could provide an abstract Experiment class that would be easier to implement (a default Compare virtual method using default C# comparers?), allowing the user to override when needed?


What do you think? Does any of these make sense? Is there a simpler more idiomatic way of doing this in C#?

Support different return types for Control/Candidate methods

This implementation depends on decisions made for #15.

Basically the idea is that sometimes the candidate code returns a different, but relatable value. For example:

public int ControlMethod(int foo)
{
    return 1;
}

public bool CandidateMethod(int foo)
{
    return true;
}

The candidate method now returns a bool, and while bool and int are different, you could easily map 1 -> true and 0 -> false.
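With the proposed two-type Compare (hypothetical API, not in the library today), that mapping could be expressed directly:

experiment.Compare((control, candidate) => (control == 1) == candidate);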

This could be implemented creating a new interface Experiment<TControl, TCandidate> that would allow the user to specify different types for each method. In this case, providing a Compare method could even be required for the interface.

What do you think?

Do something about the unwieldy ExperimentInstance constructor.

The constructor for ExperimentInstance has a lot of parameters. Because we want this to effectively be a read-only instance, we need to supply all the parameters via the constructor.

Our options are:

  1. Create a new ExperimentSettings class that has settable properties. Add a constructor overload to ExperimentInstance that takes a settings instance and sets all the properties accordingly.
  2. Create a new ExperimentBuilder class that lets you set properties and returns a new instance of ExperimentInstance with those properties set via some Create method.
  3. Add multiple ctor overloads to ExperimentInstance
  4. Something I hadn't considered...

/cc @davezych who noted this problem

Properties vs Methods in interfaces

While browsing the code, one of the first things you will come across is this file.

Personally I was very confused at the beginning as to how it all works. I did find that the implementation is here, but it is not clear whether methods like public void Use(Func<Task<T>> control) { _control = control; } would ever do more than just set the field.

If it is true that they will only be setting/getting the method to be scienced, then I would like to propose changing the interfaces to declare properties and not methods.
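Sketched out, the property-based shape might look something like this (hypothetical, for discussion):

public interface IExperiment<T>
{
    // Properties make it explicit that these members only store the delegates.
    Func<Task<T>> Control { get; set; }
    Func<Task<T>> Candidate { get; set; }
}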

Provide a fluent interface

Fluent interfaces are fairly standard in .NET APIs now. We set this up in Shience and it worked well. We chained our calls like this:

var userHasAccess = Scientist.Science<bool>("experiment-name")
    .Use(() => IsCollaborator(user))
    .Try(() => HasAccess(user))
    .WithContext(new { userId = user.Id })
    .Execute(); //Or ExecuteAsync()

To keep consistent with the way the Scientist api works now, we could do something like:

return Scientist.Science<bool>("experiment-name", experiment =>
    {
        experiment.Use(() => IsCollaborator(user))
                  .Try(() => HasAccess(user))
                  .WithContext(new { userId = user.Id });
    });

Makes for a cleaner api.

[META] Prepare for 1.0 release

I'd like to open a discussion about where we stand right now and where we need to get for an official 1.0 release.

Things that are in the ruby lib that we don't have:

In order to be an official port, these all need to be implemented.

Documentation

The docs need to be updated. In my mind we should probably steal the ruby docs and modify the sample code with C# examples. @haacked - from what I remember this was your intention as well, right? And I'm assuming it's okay if we "steal" the ruby docs, right?

Testing

I have used Scientist in an app I'm developing and so far haven't run into any issues. This is good. However, are we confident in the product and its stability? I'm not sure how to answer that other than to use it more, and also to wait some time and see if we get any bug reports.

Official logo

We need an official logo for the nuget package. I can't really find a logo for the ruby lib that we could mimic and .net-ify. What do we want here?

Use PcgRandom instead of System.Random

Line 13 of Scientist.cs says:

// TODO: Evaluate the distribution of Random and whether it's good enough.

PcgRandom (NuGet, GitHub) is a .NET port of the PCG RNG algorithm. It's a drop-in replacement for System.Random that provides a high-quality PRNG.

If you're interested, I can submit a PR; just let me know. (I'd also fix the thread-safety bug in Experiment.cs; a static Random instance could be accessed simultaneously by multiple threads, but it's not thread-safe, and doing so can corrupt its internal state.)
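For context, the usual mitigation for that thread-safety issue looks something like this (a sketch, separate from the PcgRandom proposal):

private static readonly Random _random = new Random();
private static readonly object _randomLock = new object();

private static int NextPercent()
{
    // System.Random isn't thread-safe; serialize access to the shared instance.
    lock (_randomLock)
    {
        return _random.Next(100);
    }
}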

[Question] Is there a way to test sideeffects of a method

As an example I have a method that:

  • Receives some input
  • Prepares the data, for example builds a query
  • Calls a dependency, for example executes a query

The call could be considered an indirect output. Is there a way to intercept the indirect output to ensure that a refactored version of the method still makes the same call?

Support older versions of .NET Framework

As far as I can see from nuget package, Scientist.net currently targets net452. Is it possible to add support for older versions of .NET Framework? If so, what would be the lowest version supported?

Documentation

This project needs better documentation. As stated in #51, the current plan is to copy the docs for the Ruby version and make modifications as applies. I currently have some time to spare and can start looking into this as early as tomorrow.

Experiment without Try

The Ruby docs says following about not adding a Try block in an experiment:

If you don't declare any try blocks, none of the Scientist machinery is invoked and the control value is always returned.

From what I can see in the source code the behavior is identical in the .NET version, but this behavior is not clearly stated anywhere. I propose adding a test to verify that 'nothing' happens if you don't include a Try block in an experiment.
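Such a test could be quite small (a sketch assuming xUnit):

[Fact]
public void ControlValueReturnedWhenNoTryDeclared()
{
    var result = Scientist.Science<int>("no-candidate", experiment =>
    {
        experiment.Use(() => 42);
        // Intentionally no experiment.Try(...)
    });

    Assert.Equal(42, result);
}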

Looking for New Maintainers

With the passage of time the previous maintainers have moved on to other things, which is how I find myself again filling these shoes. I'm trying to avoid new responsibilities at the moment, and Scientist.NET is not something I am suited to be taking charge of:

  • I've not contributed to the project in any capacity
  • I'm not aware of this being used on any GitHub projects
  • I'm no longer doing C# development

And so I'm now looking out for others to take up ownership of the project. If I don't hear anything before February 4th, I plan to archive the repository to confirm its dormancy.

This project has been rather quiet for a while in terms of issues and contributions, which is why I favour archiving currently. Here's a cursory glance at the 2018 activity:

  • two issues opened
  • three PRs merged
  • one open PR outstanding

The 2.0 release went out about 8 months ago, and here's the overall NuGet stats:

But if someone (preferably multiple people, to share the load) wants to step up but isn't quite sure, I'm happy to provide guidance and mentoring whenever I have bandwidth. I would love to get to a spot where the people in charge know the project and want to actively work on it.

The rough transition process I have in my head is:

  • identify interested contributors and figure out a proper transition process
  • transfer this repository to a different owner/location on GitHub - it doesn't make sense to have this under the GitHub organization when it's not actively used in GitHub's projects
  • identify other systems they need access to so they can continue publishing releases - CI systems, NuGet, etc

Feel free to ask any question you may have here, or you can email me - handle [at] this website - if you wish to talk privately.
