You'd think this would be faster, but NO!!!!!! [Resolved]
QB64 Phoenix Edition (https://staging.qb64phoenix.com), Forum: General Discussion, Thread: /showthread.php?tid=968
You'd think this would be faster, but NO!!!!!! [Resolved] - Pete - 10-13-2022

The top code sets the variable "h" to the result of the SCREEN() function, so the screen position is read only once. The variable is then checked in the two places in the code where this info is polled. Now the bottom code does exactly the same thing, but it calls the SCREEN() function THREE times. You'd probably think that's the slower way to do things, but it's actually about 5 times faster!

Code:
    ii = 0

Code:
    FOR i = 0 TO LEN(a.ship) - 1

Pete - Looking forward to an afterlife based on attendance.

RE: You'd think this would be faster, but NO!!!!!! - Kernelpanic - 10-13-2022

Even if I can't describe it exactly right away, but actually it is clear. Compare this in the FOR loop:

Code:
    h = SCREEN(j, k + i)

With this:

Code:
    IF SCREEN(j, k + i) = ASC(g.flagship) OR SCREEN(j, k + i) = g.m_asc THEN

RE: You'd think this would be faster, but NO!!!!!! - mnrvovrfc - 10-14-2022

It has to do with how the C++ compiler optimizes code. I'm not an expert on this by any means. In the past it wasn't recommended to call functions repeatedly because C/C++ compilers were less clever about it. Also, some programmers preferred to write assembly language "by hand" to try to make it faster despite the "expensive" function calls that had to be made. This code might be even faster if it were written in C++ from scratch, even if it employed classes. "Inline" is mentioned many times in the "info gcc" manual, and it looks like no project in C/C++ is acceptable without some option for optimization.

With 64-bit processors being way faster than 16-bit (parallel processing FTW), it doesn't seem to matter how many times you call a certain function previously considered inefficient, unless of course it does take longer than a few seconds to execute something. However, optimization has become more important in 32-bit and 64-bit than for 16-bit.
Finally, the SCREEN() function was considered inefficient in QuickBASIC because it had to read from video memory. QB64PE has to fake it with memory pools, and although it also has to draw a graphics screen, that's not really a problem, especially when a lot of people have desktop systems with hot-rod GPUs. Probably in that case ASC() is the one that is dragging things. It's being called twice while SCREEN() is called three times, but each one does a radically different thing.

EDIT: This gave me an idea: QB64PE doesn't support short-circuit evaluation. I think there should be a compiler option for it. :tu:

RE: You'd think this would be faster, but NO!!!!!! - Pete - 10-14-2022

(10-13-2022, 11:41 PM) Kernelpanic wrote: Even if I can't describe it exactly right away, but actually it is clear.

To my good friend from Germany with uneven tan lines....

Yes, that code I posted is not the best. I hate to use this term, but I was "transitioning" into a SELECT CASE model, but before I got to that part, I noticed the horrible lag that making the SCREEN() function into a variable created. So, what am I really going to use in the program? This...

Code:
    FOR i = 0 TO LEN(a.ship) - 1

Thanks, and as a side note, I'm not big on castles, but if I took my wife to Germany, she'd be more than happy to push my ass down that Alpine Coaster!

Pete

RE: You'd think this would be faster, but NO!!!!!! - JRace - 10-14-2022

@mnrvovrfc beat me to it. It DOES sound like optimization at work. If the compiler decides that the value of SCREEN(j, k + i) doesn't change between calls, then it may be holding that value in a register variable (storing the results of the first call in a CPU register), which would be faster to access than a standard RAM-based variable.
Even better: after the return value of SCREEN(j, k + i) is loaded into a register for first use, if that register remains unchanged between calls to SCREEN and the compiler determines that the return value of SCREEN will not change, then the compiler may not have to do anything with the value; it can just leave that register alone and use the held value as needed instead of calling SCREEN. That would be fast.

RE: You'd think this would be faster, but NO!!!!!! - Pete - 10-14-2022

Thanks for the replies. It sure seems evident this should either be addressed as a compiler upgrade or a wiki notation. Imagine if I posted all 1000 lines of code and asked the community, "Why can't QB64 run these animation sequences faster?" Talk about a needle in a haystack issue.

Pete

RE: You'd think this would be faster, but NO!!!!!! - SMcNeill - 10-14-2022

    IF h = ASC(g.flagship) OR h = g.m_asc THEN
        IF h = ASC(g.flagship) THEN
            ii = 1
            EXIT FOR
        ELSE
            ii = 2
            EXIT FOR
        END IF
    END IF

The above here just seems dang weird to me.

    IF h = 1 OR h = 2 THEN   <-- This says only if h is 1 or 2 then we do inside this block
        IF h = 1 THEN
            do_whatever   <-- so h is 1 here and we do something
        ELSE
            do_junk   <-- h *has* to be 2 here to do junk. (Otherwise we wouldn't have passed the first IF)
        END IF
    END IF

Now my question is: WTH is the point of that outer IF to even begin with???

    IF h = 1 THEN do_whatever
    IF h = 2 THEN do_junk

PRESTO! Done, clean, with several fewer IF checks and SCREEN calls...

What am I missing here? Why do you IF check for 2 conditions and then IF check for each of those conditions independently?

Personally, I'd just SELECT CASE the code and keep it simple.

    SELECT CASE SCREEN(j, k + i)
        CASE IS = ASC(g.flagship): ii = 1: EXIT FOR
        CASE IS = g.m_asc: ii = 2: EXIT FOR
    END SELECT

RE: You'd think this would be faster, but NO!!!!!! - DSMan195276 - 10-14-2022

FWIW I don't really think this is an optimization thing. Unless Pete turned the "Compile program with C++ optimization flag" thing on, the code isn't being optimized. Beyond that, both the SCREEN() and ASC() functions live in a separate .o file from the main code, so typically this means they would never be inlined, since the linker doesn't normally do that (and we don't tell it to do link-time optimization).

The one thing I was wondering is whether h was a SINGLE rather than an integer value; in that case there would be a floating point conversion going on. Still, I would expect calling func_screen() to be slower than the floating point conversion.

The other consideration is whether the event checking around h = SCREEN(j, k + i) adds too much overhead, simply because func_screen() and func_asc() do almost nothing in these cases and thus should be very fast. If you have a chance you might try putting $CHECKING:OFF around it and see if the speed discrepancy is still there.

RE: You'd think this would be faster, but NO!!!!!! - Pete - 10-14-2022

Hi Matt,

I've already modified the code to the SELECT CASE snippet I posted, which solved the slowdown the h = SCREEN() created. I do have a backup copy of the one with the speed-killing situation. No, $CHECKING:OFF does nothing to offset the decreased speed in that backup copy. h, btw, is an integer, defined by a DEFINT H-K at the top of the program.

Pete

RE: You'd think this would be faster, but NO!!!!!! - Pete - 10-14-2022

Steve must have missed my second post.

Pete
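
For anyone who wants to reproduce the comparison outside of Pete's program, a minimal timing sketch along the following lines might help isolate the effect. It is not Pete's code: the names flagship and mAsc are hypothetical stand-ins for g.flagship and g.m_asc, and the five-cell test row and the pass count are arbitrary assumptions. The first loop caches SCREEN() in h, as in the slow version described above; the second uses the SELECT CASE form suggested by SMcNeill.

```basic
' Timing sketch only: cached SCREEN() result vs. SELECT CASE on SCREEN().
' Assumptions (not from the thread): flagship and mAsc are stand-ins for
' g.flagship and g.m_asc; the 5-cell test row and pass count are arbitrary.
DEFINT H-K
DIM flagship AS STRING, mAsc AS INTEGER
DIM passes AS LONG, n AS LONG
DIM t AS DOUBLE

flagship = "A"
mAsc = ASC("M")
passes = 1000000
j = 1: k = 1
LOCATE j, k: PRINT "....M"; ' the match sits in the last scanned cell

' Variant 1: read the screen once per cell into h, then test h.
t = TIMER(.001)
FOR n = 1 TO passes
    ii = 0
    FOR i = 0 TO 4
        h = SCREEN(j, k + i)
        IF h = ASC(flagship) THEN ii = 1: EXIT FOR
        IF h = mAsc THEN ii = 2: EXIT FOR
    NEXT
NEXT
t = TIMER(.001) - t
LOCATE 3, 1: PRINT "h = SCREEN():"; t; "seconds, ii ="; ii

' Variant 2: SELECT CASE directly on the SCREEN() call.
t = TIMER(.001)
FOR n = 1 TO passes
    ii = 0
    FOR i = 0 TO 4
        SELECT CASE SCREEN(j, k + i)
            CASE ASC(flagship): ii = 1: EXIT FOR
            CASE mAsc: ii = 2: EXIT FOR
        END SELECT
    NEXT
NEXT
t = TIMER(.001) - t
LOCATE 4, 1: PRINT "SELECT CASE: "; t; "seconds, ii ="; ii
```

Both loops should finish with the same ii, so any difference in the printed times comes from how the screen value is fetched and tested. Changing DEFINT H-K to DEFSNG H-K, or wrapping the loops in $CHECKING:OFF and $CHECKING:ON, would be one way to try the two possibilities DSMan195276 raised. TIMER resets at midnight, so a run that spans midnight would give a meaningless figure.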