Comparison QB64 compiled with gcc optimizations and without
#61
(08-31-2022, 09:30 PM)Coolman Wrote:
_Deflate (compression) and _Inflate (decompression) functions.

program compiled with qb64 v0.5.0 -O3 :
Function _Deflate : 10.1x seconds
Function _Inflate :  1.2x seconds

program compiled with qb64 v3.0.0 -O3 :
Function _Deflate : 9.5x seconds
Function _Inflate : 1.2x seconds

program compiled with qb64 v0.5.0 original :
Function _Deflate : 10.1x seconds
Function _Inflate :  1.2x seconds
What a shame this still can't be done with v3.0 on Linux! That makes this test very subjective to me. I think FreeBASIC should be involved in this testing.

Meh, compiler switches: one more thing, besides "_INTEGER64", "_MEM", real audio far better than "BEEP", graphics way better than the "CHR$()" capability of 8-bit computers, and much more on QB64's side, that the BASIC programmer who was a newbie decades ago has to come across in these later, burgeoning times.

There is the trap about the programmer who starts being obsessed with compiler switches "to make it run faster", but doesn't want to go full-on into C/C++. Pretty soon he/she is going to want to do nothing at all.
#62
What can't be done on Linux? Inflate and Deflate files? I'm pretty certain those are cross-compatible routines. The compile options? I think they work on Linux as well.

What can't be done on Linux??
#63
If you don't believe me:

[attachment=813]

QB64PE thinks "_DEFLATE" is an array.

LOL forgot the dollar sign after the function name, but when I included it, this is what I got instead:

[attachment=814]

Contents of ".../qb64pe/internal/temp/compilelog.txt":
Code: (Select All)
g++  -w -std=gnu++11 -DFREEGLUT_STATIC -I./internal/c/libqb/include -DDEPENDENCY_NO_SOCKETS -DDEPENDENCY_NO_PRINTER -DDEPENDENCY_NO_ICON -DDEPENDENCY_NO_SCREENIMAGE -DDEPENDENCY_ZLIB internal/c/libqb.cpp -c -o internal/c/libqb_make_0000000000001.o
g++  -w -std=gnu++11 -DFREEGLUT_STATIC -I./internal/c/libqb/include -DDEPENDENCY_NO_SOCKETS -DDEPENDENCY_NO_PRINTER -DDEPENDENCY_NO_ICON -DDEPENDENCY_NO_SCREENIMAGE -DDEPENDENCY_ZLIB internal/c/qbx.cpp -c -o internal/c/qbx.o
g++  -w -std=gnu++11 -DFREEGLUT_STATIC -I./internal/c/libqb/include -DDEPENDENCY_NO_SOCKETS -DDEPENDENCY_NO_PRINTER -DDEPENDENCY_NO_ICON -DDEPENDENCY_NO_SCREENIMAGE -DDEPENDENCY_ZLIB ./internal/c/libqb_make_0000000000001.o ./internal/c/qbx.o -o 'boredom' ./internal/c/libqb/src/threading.o ./internal/c/libqb/src/buffer.o ./internal/c/libqb/src/threading-posix.o ./internal/c/parts/core/src.a  -lGL -lGLU -lX11 -lpthread -ldl -lrt "-l:libz.a"
/usr/bin/ld: cannot find -l:libz.a
collect2: error: ld returned 1 exit status
make: *** [Makefile:374: boredom] Error 1

This is on Fedora 36 XFCE, which was updated last week. It shouldn't be any different on any other Linux distro, except maybe the ones where the user has to hunt down the dependencies because "setup_lnx.sh" didn't do enough. Sorry for the off-topic.
#64
Short answer: You're not finding the zlib library. Do you have the zlib dependencies installed on that version of Linux?
#65
(09-02-2022, 06:08 PM)SMcNeill Wrote: Short answer:  You're not finding the zlib library.  Do you have the zlib dependencies installed on that version of Linux?
No, I didn't. I first had to run this terminal command:

sudo yum install zlib-static

(This is only for Fedora 36. On other Linux distros the command will be different, e.g. "pacman" on Arch or "apt-get" on Debian/Ubuntu. The package name might also differ, but it should start with "zlib" or something similar. I'm saying this for those who want more detail.)

Thank you for your answer.
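
For anyone hitting the same "cannot find -l:libz.a" error, here is a rough way to check whether the static library is visible to the toolchain before compiling. It is only a sketch: it relies on gcc's -print-file-name option, which prints a full path when the file is found in the compiler's library search path and echoes the bare name back otherwise. The function name find_static_lib and the CC override are made up for illustration and are not part of QB64 or "setup_lnx.sh".

```shell
#!/bin/sh
# Report whether the compiler driver can locate a static library.
# "gcc -print-file-name=NAME" prints a full path when NAME is found in the
# library search path, and prints NAME unchanged when it is not.
# find_static_lib is a hypothetical helper; CC defaults to gcc.
find_static_lib() {
    fsl_lib="$1"
    fsl_cc="${CC:-gcc}"
    # Return 2 if the compiler itself cannot be run.
    fsl_path=$("$fsl_cc" -print-file-name="$fsl_lib" 2>/dev/null) || return 2
    if [ -n "$fsl_path" ] && [ "$fsl_path" != "$fsl_lib" ]; then
        printf '%s\n' "$fsl_path"
        return 0
    fi
    return 1
}

if find_static_lib libz.a; then
    echo "libz.a found; -l:libz.a should be able to link"
else
    echo "libz.a not found; install your distro's static zlib package"
fi
```

If the second message appears, installing the static zlib package (zlib-static on Fedora, as above) and running the check again should show the path.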
#66
Note that I use a Linux distribution based on Ubuntu 20.04, and this code works perfectly:

Code: (Select All)
Dim tab$(19)
tab$(0) = "malevolently malevolous malexecution malfeasance malfeasant malfeasantly malfeasants malfeasor overwon"
tab$(1) = "malfed malformation malformations malformed malfortune malfunction malfunctioned malfunctioning overwood"
tab$(2) = "malgovernment nonemotionalism nonemotionally verwing overwinning overwinter overwintered overwintering"
tab$(3) = "nonemanating nonemancipation nonemancipative nonembarkation nonembellished nonembellishing overwiped"
tab$(4) = "nonembellishment nonembezzlement nonembryonal nonembryonic nonembryonically nonemendable overwithered"
tab$(5) = "nonemendation nonemergence nonemergent nonemigrant nonemigration nonemission nonemotional overwomanize"
tab$(6) = "overwisdom overwise overwisely overwoman overwomanly overwooded overwoody overword overwords overwore"
tab$(7) = "overwork segreant segregable segregant segregate segregated segregatedly segregatedness segregateness"
tab$(8) = "segregates segregating segregation segregational segregationist segregationists segregative segregator"
tab$(9) = "teleostean teleostei teleosteous teleostomate teleostome teleostomi teleostomian teleostomous teleosts"
tab$(10) = "teleotemporal teleotrocha teleozoic teleozoon telepath telepathy telepathic telepathically telepathies"
tab$(11) = "telepathist telepathize nffroze unfibbed unfibbing unfiber unfibered unfibred unfibrous unfibrously"
tab$(12) = "unfickle unfictitious unfictitiously unfictitiousness unfidelity unfidgeting unfiducial unfielded"
tab$(13) = "unfiend unfiendlike unfierce unfiercely unfiery unfight unfightable unfighting unfigurable unfigured"
tab$(14) = "zulus zumatic zumbooruk zuni zunian zunyite zunis zupanate zurich zurlite zutugil zuurveldt zuza"
tab$(15) = "zwanziger zwieback zwiebacks zwieselite zwinglian zwinglianism zwinglianist zwitter zwitterion"
tab$(16) = "zwitterionic cognovits cogon cogonal cogons cogovernment cogovernor cogracious cograil cogrediency"
tab$(17) = "cogredient cogroad cogs cogswellia coguarantor coguardian cogue cogway cogways cogware cogweel"
tab$(18) = "cogweels cogwheel cogwheels xiphistna xiphisura xiphisuran xiphiura xiphius xiphocostal xiphodynia"
tab$(19) = "xiphodon xiphodontidae xiphoid xyphoid xiphoidal xiphoidian xiphoids xiphopagic xiphopagous xiphopagus"
Color 7: Print "Wait..."
Color 2: Print: Print " generation of test string..."
Dim tabch$(15)
For boucle% = 1 To 15
    chaine$ = ""
    For nbr% = 1 To 10000
        chaine$ = chaine$ + tab$(Rnd * 19)
    Next nbr%
    tabch$(boucle%) = chaine$
Next boucle%
Dim chcompress$(15)
Color 3: Print: Print " compress string 100 times in : ";
start = Timer(.001)
For nbr% = 1 To 100
    For boucle% = 1 To 15
        chaine$ = tabch$(boucle%)
        chcompress$(boucle%) = _Deflate$(chaine$)
    Next boucle%
Next nbr%
Print Timer(.001) - start; "seconds"
Color 14: Print
For boucle% = 1 To 15
    Print " len string original : "; Len(tabch$(boucle%)); " -->  Compressed : "; Len(chcompress$(boucle%))
Next boucle%
Color 3: Print: Print " Decompress string 100 times in : ";
start = Timer(.001)
For nbr% = 1 To 100
    For boucle% = 1 To 15
        chaine$ = chcompress$(boucle%)
        tabch$(boucle%) = _Inflate$(chaine$)
    Next boucle%
Next nbr%
Print Timer(.001) - start; "seconds"
Color 7
End
#67
I saw in a post this sentence:

-Os enables all -O2 optimizations except those that often increase code size

Out of curiosity I compiled a program with this option. The executable is about 1.3 MB, while with the -O3 optimization it is about 1.9 MB: surprisingly, 600 KB more. In addition, compiling is much faster. So I decided to run a comparison against the old data; here is the result:

simple code using pset
2.3x seconds : program compiled with qb64 v0.5.0 -O3
2.1x seconds : program compiled with qb64 v3.0.0 -O3
2.6x seconds : program compiled with qb64 v3.1.0 -Os
3.5x seconds : program compiled with qb64 v0.5.0 original

Fractal Tree: code taken from Rosetta Code.
2.9x seconds : program compiled with qb64 v0.5.0 -O3
3.0x seconds : program compiled with qb64 v3.0.0 -O3
3.4x seconds : program compiled with qb64 v3.1.0 -Os
6.1x seconds : program compiled with qb64 v0.5.0 original

Bucketsort: code taken from Rosetta Code.
1.1x seconds : program compiled with qb64 v0.5.0 -O3
1.0x seconds : program compiled with qb64 v3.0.0 -O3
1.0x seconds : program compiled with qb64 v3.1.0 -Os
3.4x seconds : program compiled with qb64 v0.5.0 original

sorting algorithm QuickSort.
3.0x seconds : program compiled with qb64 v0.5.0 -O3
3.0x seconds : program compiled with qb64 v3.0.0 -O3
3.2x seconds : program compiled with qb64 v3.1.0 -Os
4.6x seconds : program compiled with qb64 v0.5.0 original

Alien Trees Reflection - Plasma Mod.
5.3x seconds : program compiled with qb64 v0.5.0 -O3
5.4x seconds : program compiled with qb64 v3.0.0 -O3
6.7x seconds : program compiled with qb64 v3.1.0 -Os
9.2x seconds : program compiled with qb64 v0.5.0 original

Nbody
16.9x seconds : program compiled with qb64 v0.5.0 -O3
16.5x seconds : program compiled with qb64 v3.0.0 -O3
23.8x seconds : program compiled with qb64 v3.1.0 -Os
68.0x seconds : program compiled with qb64 v0.5.0 original

Picture unroller
1.3x seconds : program compiled with qb64 v0.5.0 -O3
1.1x seconds : program compiled with qb64 v3.0.0 -O3
1.8x seconds : program compiled with qb64 v3.1.0 -Os
2.5x seconds : program compiled with qb64 v0.5.0 original

New screen - How ?. Found in the old forum.
1.4x seconds : program compiled with qb64 v0.5.0 -O3
1.1x seconds : program compiled with qb64 v3.0.0 -O3
1.9x seconds : program compiled with qb64 v3.1.0 -Os
2.3x seconds : program compiled with qb64 v0.5.0 original

Opens a text file of about 4 MB and reads it line by line three times in a row, using Left$ and InStr.
6.5x seconds : program compiled with qb64 v0.5.0 -O3
6.3x seconds : program compiled with qb64 v3.0.0 -O3
6.7x seconds : program compiled with qb64 v3.1.0 -Os
7.2x seconds : program compiled with qb64 v0.5.0 original

A very simple program that opens a text file of about 4 MB, reads it line by line, and copies it to a new file.
3.0x seconds : program compiled with qb64 v0.5.0 -O3
2.7x seconds : program compiled with qb64 v3.0.0 -O3
2.9x seconds : program compiled with qb64 v3.1.0 -Os
3.0x seconds : program compiled with qb64 v0.5.0 original

Here is another very simple program that copies a test file of about 4 MB into an array and then copies the data randomly into another array.

program compiled with qb64 v0.5.0 -O3
4.4x seconds : Read file and copy to table
7.1x seconds : Copy data randomly into another table.

program compiled with qb64 v3.0.0 -O3
4.6x seconds : Read file and copy to table
8.3x seconds : Copy data randomly into another table.

program compiled with qb64 v3.1.0 -Os
4.9x seconds : Read file and copy to table
8.3x seconds : Copy data randomly into another table.

program compiled with qb64 v0.5.0 original
5.9x seconds : Read file and copy to table
27.4x seconds : Copy data randomly into another table.

Code found in the old forum.
13.2x seconds : program compiled with qb64 v0.5.0 -O3
10.8x seconds : program compiled with qb64 v3.0.0 -O3
10.1x seconds : program compiled with qb64 v3.1.0 -Os
26.2x seconds : program compiled with qb64 v0.5.0 original

CRC-32: code taken from Rosetta Code.
6.7x seconds : program compiled with qb64 v0.5.0 -O3
6.9x seconds : program compiled with qb64 v3.0.0 -O3
10.1x seconds : program compiled with qb64 v3.1.0 -Os
17.6x seconds : program compiled with qb64 v0.5.0 original

_Deflate (compression) and _Inflate (decompression) functions.

program compiled with qb64 v0.5.0 -O3 :
Function _Deflate : 10.1x seconds
Function _Inflate : 1.2x seconds

program compiled with qb64 v3.0.0 -O3 :
Function _Deflate : 9.5x seconds
Function _Inflate : 1.2x seconds

program compiled with qb64 v3.1.0 -Os :
Function _Deflate : 9.5x seconds
Function _Inflate : 1.2x seconds

program compiled with qb64 v0.5.0 original :
Function _Deflate : 10.1x seconds
Function _Inflate : 1.2x seconds

In conclusion: programs compiled with -O3 are still faster, but in most tests not by much. Compile times are longer and the generated executables are much larger. The -Os option seems to be a good compromise between speed and program size. Compile time is also an important criterion. After these results, I recompiled the whole of QB64 with -Os; the executable is about 5.5 MB instead of 7.5 MB with -O3. In the end I decided to adopt -Os. It was mainly the gains in compile time and in the size of the generated programs that convinced me.
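
The size difference between flags can also be measured on any C or C++ source file outside of QB64. Below is a minimal sketch, assuming gcc (or whatever CC points to) is installed. The helper name size_for_flag is hypothetical, and this is not how QB64 itself passes flags to g++.

```shell
#!/bin/sh
# Compile SRC with one optimization FLAG and print the executable size in bytes.
# size_for_flag is a hypothetical helper for illustration; CC defaults to gcc.
size_for_flag() {
    sff_src="$1"
    sff_flag="$2"
    sff_cc="${CC:-gcc}"
    sff_out=$(mktemp) || return 1
    if "$sff_cc" "$sff_flag" -o "$sff_out" "$sff_src"; then
        # wc pads its output on some systems, so strip the spaces.
        wc -c < "$sff_out" | tr -d ' '
        sff_status=0
    else
        sff_status=1
    fi
    rm -f "$sff_out"
    return "$sff_status"
}

# Usage: sh sizes.sh prog.c
if [ -n "${1:-}" ]; then
    for flag in -O2 -O3 -Os; do
        printf '%s: %s bytes\n' "$flag" "$(size_for_flag "$1" "$flag")"
    done
fi
```

For a C++ source, run it with CC=g++. The numbers will not match a QB64 build, which links in its own runtime, but the relative order of the three flags should give a rough idea.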
#68
Coolman
I am glad that you finally realized that -O3 is rarely beneficial. You may get a small speed increase, but not always; I prefer -O2.
Could you post the Bucket Sort?
#69
@Jack

I think -Os is even better, given the smaller generated executables and the significant gain in compile time.

You can find Bucket Sort in the first post here:

https://staging.qb64phoenix.com/showthre...51#pid1851
#70
Bucket Sort times
-O2 .88 sec
-O3 .7 sec
-Os .72 sec
Keep in mind that timing differences between optimization levels vary from program to program; sometimes -O2 is best and sometimes -Os.
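
To compare levels across programs, it can help to express each timing relative to a baseline instead of eyeballing seconds. A small sketch using awk follows; the helper name rel_time is made up for illustration, and the inputs are the Bucket Sort times above.

```shell
#!/bin/sh
# Print TIME divided by BASELINE, rounded to two decimals.
# A value above 1 means slower than the baseline. Hypothetical helper.
rel_time() {
    awk -v base="$1" -v t="$2" 'BEGIN {
        if (base <= 0) exit 1      # refuse a zero or negative baseline
        printf "%.2f\n", t / base
    }'
}

# Bucket Sort times from this post, with -O3 as the baseline.
echo "-O2 vs -O3: $(rel_time 0.7 0.88)"
echo "-Os vs -O3: $(rel_time 0.7 0.72)"
```

Running the same helper over several programs makes it easier to see where a given level actually pays off.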