Comb Sort versus Quick Sort
#1
I thought johnno had a contender for QSort when I ran 1 Million Numbers on QB64, it beat my Strings Quick Sort test times, BUT! When I compare the exact same String arrays QSort clearly wins every time! Smile

Here is my test code, both take the string array to sort as a parameter and QSort needs a high and low index, because it calls itself recursively:
Code: (Select All)
Option _Explicit
_Title "Comb Sort vrs Quick Sort" ' b+ 2023-05-30
Randomize Timer ' so we have a different array each time we compare

DefLng A-Z
Const nItems = 1000000
Dim sa$(1 To nItems) ' setup a string array sa$() to sort
Dim copy$(1 To nItems) ' make a copy of sa$() to compare another sort to
Dim As Long i, j ' indexes to array  for building and displaying the arrays
Dim As Long r '  a random posw integer = 2 to 6
Dim t##, qtime##, ctime##
Dim b$ ' building string
For i = 1 To nItems ' make a random list to sort
    b$ = ""
    r = (Rnd * 5) \ 1 + 2
    For j = 0 To r
        b$ = b$ + Mid$("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789.?", (Rnd * 64) \ 1 + 1, 1)
    Next
    sa$(i) = b$
    copy$(i) = b$
    Print b$,
Next
Print
Print "Press any to Quick Sort"
Sleep
Cls
t## = Timer(.001)
QuickSort 1, nItems, sa$()
qtime## = Timer(.001) - t##
For i = 1 To 10
    Print sa$(i),
Next
Print: Print
For i = nItems - 9 To nItems
    Print sa$(i),
Next
Print: Print
Print "   Quick Sort time:"; qtime##
Print
Print "   Press any to Comb Sort with array copy, zzz..."
Print
Print
Sleep
t## = Timer(.001)
CombSort copy$()
ctime## = Timer(.001) - t##
For i = 1 To 10
    Print copy$(i),
Next
Print: Print
For i = nItems - 9 To nItems
    Print copy$(i),
Next
Print: Print
Print "   Comb Sort time:"; ctime##
Print
If ctime## < qtime## Then Print "   Comb winds!" Else Print "   QSort wins again!"


Sub QuickSort (start As Long, finish As Long, arr$())
    Dim Hi As Long, Lo As Long, Middle$
    Hi = finish: Lo = start
    Middle$ = arr$((Lo + Hi) / 2) 'find middle of arr$
    Do
        Do While arr$(Lo) < Middle$: Lo = Lo + 1: Loop
        Do While arr$(Hi) > Middle$: Hi = Hi - 1: Loop
        If Lo <= Hi Then
            Swap arr$(Lo), arr$(Hi)
            Lo = Lo + 1: Hi = Hi - 1
        End If
    Loop Until Lo > Hi
    If Hi > start Then Call QuickSort(start, Hi, arr$())
    If Lo < finish Then Call QuickSort(Lo, finish, arr$())
End Sub

' trans from johnno ref: https://rcbasic.freeforums.net/thread/779/sort-algorithms
Sub CombSort (arr$())
    Dim As Long itemCount, start, fini, swaps, gap, i
    start = LBound(arr$)
    itemCount = UBound(arr$) - start + 1
    fini = start + itemCount - 1
    gap = itemCount
    While gap > 1 Or swaps <> 0
        gap = Int(gap / 1.25)
        If gap < 1 Then gap = 1
        swaps = 0
        For i = start To itemCount - gap
            If arr$(i) > arr$(i + gap) Then
                Swap arr$(i), arr$(i + gap)
                swaps = 1
            End If
        Next
    Wend
End Sub
 I think I have Comb Sort generalized enough to be flexible to start with it's lower bound and end with it's upper bound.
b = b + ...
Reply
#2
Hi b - would this also be a fair test of Recursion v's For/Next ? I think some past discussion on this seemed to suggest the speeds were dependent on your CPU and size of data base being processed. I am a strong believer in Qsort and so far the data base I'm working with hasn't really slowed it down.
Reply
#3
A non recursive QSort would be very interesting, can it be done?

The reward would be less calls to a sub to pass parameters, gots to speed it some or allot?

Can it be done?, probably...  

I'd like to see that so much, I will give it a  shot. Smile

PS For... Next Loop is the slowest loop structure, though it might act as a crutch to get the sub up and running, you would want to replace it with Do... Loop or While... Wend maybe even a GoTo.
b = b + ...
Reply
#4
Hi bplus,
here comes a showdown (based on your code):

You can see that sometimes using SHELL with superfast sorter (both in EXE/ELF) is way to go EVEN when the data is passed via a file?!

Sometimes, sorting strings is an extra burden since they can come from a file, we need to load and parse them before sorting, in my tweak to your code, you can see three more approaches:

Code: (Select All)
'Benchmark (sorting 3,000,000 strings, up to 62 chars) on Windows 10 and i5-7200U 3.1GHz Max Turbo:
' Quicksort recursive:               8.16s
' Quicksort iterative:               7.54s
' Quicksort 'Magnetica':             4.22s
' Quicksort multithreaded Magnetica: 1.82s


Attached Files Image(s)
   

.7z   QB64_qsort.7z (Size: 1,014.74 KB / Downloads: 48)
"He learns not to learn and reverts to what the masses pass by."
Reply
#5
Man I forgot all about this thread and my search for non recursive QSort.

Looks like Sanmayce has got it in aces! Checking it out...

Update: well I tried it twice and get an error message The System can not find the file. ??? and my download disappears ???
b = b + ...
Reply
#6
Oh, forgot to add qsm.h, the quick fix is to rename it to Magnetica_v18.h (since qsm.h is just a wrapper to Magnetica_v18.h).
Anyway, the fixed package is attached.

Code: (Select All)
Const nItems = 30000000

Could you try on your machine 3 million along with 30 million strings, sharing the screenshot?


Attached Files
.7z   QB64_qsort_added-missing-header.7z (Size: 1,015.02 KB / Downloads: 40)
"He learns not to learn and reverts to what the masses pass by."
Reply
#7
I don't know what's going on but now I am getting messages from Windows Defender as well as about not finding a header. Plus, the file is disappeared each time.

@Sanmayce are you running from Windows? maybe try a zip? Try a download of your code?
b = b + ...
Reply
#8
I don't see any issues with what I shared.
In this new package, I tarred the package to preserve the executability of ELF, also I changed your DIMs with single REDIM, and to make everything fair, before each sort I reread the string pool from a file.

So, the benchmark is done on my main laptop running both Fedora 36 and Windows 10:


All in all, several practical tecniques are explored in this package.

- Avoiding allocating multiple string arrays, since this choke QB64 and makes it frozen?! Try 30 million strings <63 bytes, which is < 2GB;
- Loading via BINARY mode with LINE INPUT;
- Loading via single GET and parsing the chunk within QB64;
- Invoking C written Quicksort;
- Invoking external code via SHELL.

The limitations of 2GB combined with internal string movements prompts for using non-QB64 code, when 30[+] million strings are to be sorted.


Attached Files Image(s)
       

.zip   QB64_qsort_added-missing-header_3+30_million.zip (Size: 1.67 MB / Downloads: 47)
"He learns not to learn and reverts to what the masses pass by."
Reply
#9
And the 30 million run in Linux:


Attached Files Image(s)
       
"He learns not to learn and reverts to what the masses pass by."
Reply
#10
Well I was comparing algorithms in QB64 not jobbing out to some exotic world of sorting and computing.

So I got tired of waiting for the file to be made.
b = b + ...
Reply




Users browsing this thread: 2 Guest(s)