scandum / quadsort Goto Github PK
View Code? Open in Web Editor NEWQuadsort is a branchless stable adaptive mergesort faster than quicksort.
License: The Unlicense
Quadsort is a branchless stable adaptive mergesort faster than quicksort.
License: The Unlicense
Is there a Python version of quadsort? I would like to learn more of it in Python
is quadsort multithreaded?
is it using SSE/AVX?
dumb question: why just quad? Why not 8 or 16 or even more?
Otherwise, this is fascinating.
Can this and other searching and sorting related code be relicensed using a Copyfree license?
Line 473 in db6c2f8
Neither graph has a single labeled axis. There are also no units at all! Fix this!!
I've been looking at your code and noticed some improvements that can be made:
Replacing all divisions of factors of 2 by Shift Right operations.
Replacing all modulus operators of factors of 2 by bitmasks.
Remove the swap_two function, using it on the spot, as it only currently appears in a total of 4 instances throughout the code.
Around line 300, if you use pts[0] instead pts[cnt++] and initialize cnt = 5 you save some operations, reducing the computational intensity. Usage of constants whenever possible is always preferable for speedup!
I will fork and submit a version later.
I am adding quadsort as package into xmake-repo repository, but I meet some compilation errors.
https://github.com/xmake-io/xmake-repo/actions/runs/4099518400/jobs/7069506843
https://github.com/xmake-io/xmake-repo/actions/runs/4099518385/jobs/7069506091
checkinfo: ...amdir/core/sandbox/modules/import/core/tool/compiler.lua:84: @programdir/modules/core/tools/gcc.lua:713: /home/runner/.config/.xmake/packages/f/fluxsort/2023.02.05/aa60a67a14f44c54abbe30ae453ecfb9/include/quadsort.h:322:8: error: duplicate case value '8'
case sizeof(long double):
^
/home/runner/.config/.xmake/packages/f/fluxsort/2023.02.05/aa60a67a14f44c54abbe30ae453ecfb9/include/quadsort.h:318:8: note: previous case defined here
case sizeof(long long):
^
In file included from /dev/shm/.xmake1001/230206/_3B3636D6A724[48](https://github.com/xmake-io/xmake-repo/actions/runs/4099518400/jobs/7069506843#step:5:49)F693B42B70C050A25B.c:2:
/home/runner/.config/.xmake/packages/f/fluxsort/2023.02.05/aa60a67a14f44c54abbe30ae453ecfb9/include/fluxsort.h:246:8: error: duplicate case value '8'
case sizeof(long double):
^
/home/runner/.config/.xmake/packages/f/fluxsort/2023.02.05/aa60a67a14f44c54abbe30ae4[53](https://github.com/xmake-io/xmake-repo/actions/runs/4099518400/jobs/7069506843#step:5:54)ecfb9/include/fluxsort.h:243:8: note: previous case defined here
case sizeof(long long):
https://github.com/xmake-io/xmake-repo/actions/runs/4099518378/jobs/7069506679
C:\Users\runneradmin\AppData\Local\.xmake\packages\f\fluxsort\2023.02.05\b8a2a0e1a6f142e7aa83b7a7cf086eb4\include\quadsort.h(37): fatal error C1083:
Cannot open include file: 'stdalign.h': No such file or directory
Genericity in C++(template)is quite useful.
Making it more useful won't hurt.
case sizeof(long long):// line322
return quadsort64(array, nmemb, cmp);
case sizeof(long double):// line325
return quadsort128(array, nmemb, cmp);
Sometimes both long long
and long double
are 64-bit
First of all, great job on this and I am currently working on trying to make a JavaScript port and test out against various sorting algorithms on Node.js.
I don't quite understand these two sections, or the symbols. Can you explain in more detail?
A --> A,B --> A,C
B --> A,B --> A,B,C,D
C --> C,D --> A,B,C,D
D --> C,D --> B,D
A --> A,B --> A? --> C? --> A,C
B --> A,B --> B? --> D? --> A,C
C --> C,D --> C? --> A? --> B,D
D --> C,D --> D? --> B? --> B,D
Side question, why are the variable names so obscure? Variable names go a long way to improve readability, and it's really really hard to follow the flow of this code.
I am getting the following error when trying to compile the bench.c program.
shyamalchandra@iMac quadsort % gcc bench.c
Undefined symbols for architecture x86_64:
"_tail_insert32", referenced from:
_tail_swap32 in bench-d905ef.o
"_tail_insert64", referenced from:
_tail_swap64 in bench-d905ef.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
What is this bench-d905ef.o?
Hi,
I'm trying to understand how some bits works, and I'm slightly puzzled by mysterious else
:
In parity_swap_eight and parity_tail_swap_eight.
What adds to my confusion is the fact that I thought that I understood parity_swap_sixteen where after sorting 4 blocks x 4 items you check if it is a case when they are all "in row" so no need to merge anything.
I thought same thing happens in parity_swap_eight
, but this else
puzzles me.
Regards,
Milosz
From the readme: "Traditionally merge sorts would merge two blocks to swap memory, then copy them back to main memory."
I'm not aware of any library implementation of merge sort that does a copy back. Most implementations of stable sort are some hybrid of insertion sort | bottom up merge sort that change the direction of merge with each sort pass.
Merge sorts have been using ping pong like merging since the days of tape sorts (1960's) . For 3 tape drives, polyphase merge sort can be used, and it's generally quicker than standard merge sort up to 7 tape drives:
Multi-threading - GNU Sort for Unix defaults to using 16 threads during the internal sort phase which sorts an array of pointers to records, then writes the records according to the sorted array to create sorted runs on hard drives. It then defaults to a 16 way merge to merge the sorted runs into a sorted file. For drives with very fast transfer rates, like NVMe SSD with 6+ GB/s transfer rates, multi-threading with 2 or 4 way merging (using if|else not mini-heap) would be faster.
There is TimSort (used by Python) which looks a bit similar in the ideas (based on merge sort), SLOC count, stability and probably performance to quadsort.
Dare to make a comparison? You can take CPython implementation for this (or if you're an adventurous type, implement it based on Julia TimSort as it's well readable unlike other implementations I know of).
Other potential candidate(s) for comparison would be "Block sort" (e.g. https://github.com/BonzaiThePenguin/WikiSort/tree/master or https://github.com/Mrrl/GrailSort which I slightly prefer over WikiSort). A slightly biased (read "not based on measurements") opinion on WikiSort gives the author of Timsort.
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.