Giter Club home page Giter Club logo

Comments (8)

DragonSpit avatar DragonSpit commented on June 10, 2024

Hi Rick,
It may be possible to do what you're wanting. Could you provide more details about the needs?

HPCsharp has a function SortRadix with the following interface:
///


/// Sort an array of user defined class based on a separate array of unsigned integer Keys, using Radix Sorting algorithm. Linear time sort algorithm.
///

/// input array of type T
/// input array of keys (unsigned integers) to be sorted on
/// sorted array of keys (unsigned integers)
/// sorted array of user defined class
public static T[] SortRadix(this T[] inputArray, UInt32[] inKeys, ref UInt32[] outSortedKeys)

It's a little bit of an awkward interface, which I've been meaning to improve, but essentially the user provides two arrays: an input array and an array of keys. The sorting algorithm then sorts the array of keys, and also the input array along with it.

Is this what you're wanting, but for Parallel Merge Sort algorithm? Are you sorting based on the double[] array, as this double[] array is the array of keys? HPCsharp also has the ability to sort a double[] using Radix Sort (SortRadixMsd) in linear time.
-Victor

from hpcsharp.

rickpgood avatar rickpgood commented on June 10, 2024

Hi Victor,

The current SortRadix function is actually backwards of what I need.

Consider this example:

double[] val = { 5.1, 1.9, 4.2, 0.3, 3.1, 3.1, 2.5 };
uint[] idx = { 0, 1, 2, 3, 4, 5, 6 };

Array.Sort(val, idx);

Does an in-place sort and returns
val = {0.3, 1.9, 2.5, 3.1, 3.1, 4.2, 5.1}
idx = {3, 1, 6, 4, 5, 2, 0}

SortRadix will sort the [double] val based on the values of [uint] idx.

uint[] idx_out = new uint[7];
Algorithm.SortRadix(val, idx, ref idx_out);

Will then return the original values of val and idx. So yes, I'm looking to sort based on the double[] array keys.

Per your example, I created a user-defined class containing (val, idx) and a comparer:

public class UserDefinedClassComparer : IComparer<UserDefinedClass>
        {
            public int Compare(UserDefinedClass first, UserDefinedClass second)
            {
                if (first.Val > second.Val)
                    return 1;
                if (first.Val < second.Val)
                    return -1;
                return 0;

            }
        }

Unfortunately, that approach was about 4x slower than Sort.Array() on my 500k row case.

Thanks again for your help.

Rick

from hpcsharp.

DragonSpit avatar DragonSpit commented on June 10, 2024

Hi Rick,
Great experiments and creative ideas! Yeah, adding a Compare function call really slows things down. You could change the Compare() implementation to do:
return (int)(first.Val - second.Val);
which should work for double, but I bet it will still be too slow.

Sadly, the interfaces are not quite right for what you need. And, it's a little bit harder than I thought to add new interfaces to support it, but not terrible. I'll add it to the top of my list of ones to implement for the Parallel Merge Sort, to support the same dual-array interfaces as Array.Sort(). The result will be a bit slower since it has to manipulate two arrays, but hopefully faster than Array.Sort(). And, to add support for SortRadix() to sort based on double[] keys, and not just integer keys. I already implement this in SortRadixMsd(), where it can sort double[], but it is a bit slower than SortRadix() which uses an LSD Radix Sort algorithm.

Can't make any time-frame promises though.

Could you try

    public static void SortRadixMsd(this double[] arrayToBeSorted)

to see how fast it sorts double[] on your data set to see if there is hope for it being fast enough?

It's also would have been cool if this was possible Array.AsParallel().Sort(), like it is for Linq.
-Victor

from hpcsharp.

rickpgood avatar rickpgood commented on June 10, 2024

Thanks Victor!

This is what I have so far for timing (mean of 50x).

val.SortMergeInPlacePar() 15ms
val.SortMergeInPlaceStablePar() 30ms
Array.Sort(val) 31ms
Algorithm.SortRadixMsd(val) 59ms
Array.Sort(val, idx) 67ms

for the next two I smoosh'd the var and idx arrays together to make a 2-column jagged array
S = [var, idx]
and used your comparer advice from above.

public class UserDefinedClassComparer : IComparer<double[]>
        {
            public int Compare(double[] first, double[] second)
            {              
                return (int)(first[0] - second[0]);
            }
        }

S.SortMergeInPlacePar(comparer) 150ms (plus ~30ms for array smoosh'n)
Array.Sort(XS, comparer) 490ms (plus ~30ms for array smoosh'n)

So you can see SortMergeInPlacePar is 2-3x as faster than Array.Sort with a (more or less) apples-apples comparison. Cool algo!

Rick

from hpcsharp.

DragonSpit avatar DragonSpit commented on June 10, 2024

Take a look at the new 3.8.0 release, which now has
public static void SortMergeInPlacePar<T1, T2>(this T1[] keys, T2[] items, IComparer comparer = null);
My benchmarks show it running between 2.5 and 5X faster that Array.Sort() for array of keys that are double and an array of items that are ints.

Oops, further testing revealed that keys are sorted, but items are not - working on tracking the issue down.

I also improved interfaces on SortRadix with keys and items, which now returns a tuple of sorted arrays of keys and items, making it more functional-style

from hpcsharp.

DragonSpit avatar DragonSpit commented on June 10, 2024

Try the 3.8.1 version I just pushed, as it fixes the problem.
However, there are a few miscomparisons between Array.Sort(double[], int[]) and the new SortMergeInPlacePar(double[], int[]). I'm working on tracking these down. My theory is that these are several keys which are equal to each other, showing up as "stability differences", as some sorting algorithms are not stable, where equal keys can get swapped in position from input to output.
I'll let you know what I find.

I found that to be the exact issue when comparing the results of the two algorithms:
Diff at index 1349 (-1.79697576570928E+308:1479440) (-1.79697576570928E+308:653777)
Diff at index 1350 (-1.79697576570928E+308:653777) (-1.79697576570928E+308:1479440)

The keys are of the same value, but items are swapped. This is exactly what sorting stability is all about, where sometimes we care about it and other times it's ok if the order gets swapped of items associated with equal keys.

from hpcsharp.

rickpgood avatar rickpgood commented on June 10, 2024

Hi Victor, it works like a charm. For my case I'm seeing a 1.8x ratio. Very nice work.

Array.Sort(val,idx) --> 67ms
val.SortMergeInPlacePar(idx) --> 37ms

Thanks for adding this feature. I think others will find it useful too.

Rick

from hpcsharp.

DragonSpit avatar DragonSpit commented on June 10, 2024

You're welcome! Glad it works for you. I also think that fixing LSD Radix Sort to provide support for double[] keys would run even faster, while using only a single core. I added it to the list of items to work on.

from hpcsharp.

Related Issues (5)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.