A hash array dictionary step by step
#31
QB64-PE needs a lot of things. I'm not sure it needs a native dictionary type, but it definitely needs better support for making such structures efficient and self-contained. You should be able to make a "Dictionary" or "List" type that you can just Dim a variable as, without needing extra stuff like global arrays.

Additionally, if such types aren't built into the language, then it should at least provide a supported library of common data structures, so that even if they're not directly part of the language you wouldn't need to write your own.
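Something like the sketch below is what you end up writing today (the helper names DictSet and DictGet$ are made up purely for illustration): the storage has to live in Shared arrays sitting next to the routines, instead of inside a self-contained type you could simply Dim.

Code:
Dim Shared DictKeys(1 To 100) As String
Dim Shared DictValues(1 To 100) As String
Dim Shared DictCount As Long

DictSet "language", "QB64-PE"
Print DictGet$("language")

Sub DictSet (k As String, v As String)
    Dim i As Long
    For i = 1 To DictCount ' overwrite the value if the key already exists
        If DictKeys(i) = k Then DictValues(i) = v: Exit Sub
    Next
    DictCount = DictCount + 1 ' otherwise append a new key/value pair
    DictKeys(DictCount) = k
    DictValues(DictCount) = v
End Sub

Function DictGet$ (k As String)
    Dim i As Long
    For i = 1 To DictCount
        If DictKeys(i) = k Then DictGet$ = DictValues(i): Exit Function
    Next
    DictGet$ = "" ' empty string means "key not found"
End Function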
#32
> Welcome ideas, constructive criticisms, other code demonstrations and all you, kind people of QB64, want to share about these arguments.

Associative arrays are important, but not for heavy lifting, IMO. If you want to deal with really SCARY amounts of data at FORMIDABLE speed, then the name of the game, as far as I understand, is "mapping", or rather "fast mapping", or even better "fastest mapping", or best of all "fastest mapping of fat data", yes?
There are lots of built-in mappers (in different languages), yet hashing, as an INTEGRAL part of mapping, is CRUCIAL. When speaking of speed, I am not aware of a faster hasher than mine, called 'Gumbotron'. It serves as a "compressor" of data, similarly to the SHA family; of course the trade-off is strength vs. speed. However, when maintaining dedicated slots for different key/data lengths, this 16-byte hash is quite useful - battle-proven in my heavy compression benchmarks.

[Image: Gumbotron_darker.png]
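To picture what "mapping" means here, a toy sketch: a trivial stand-in hash (NOT Gumbotron; the key and the 2^20-slot table size are made up) turns the key into a number, and that number is the slot where the record lands.

Code:
' Toy illustration of hashing as "mapping": key -> number -> table slot.
Dim key As String, h As _Unsigned _Integer64, i As Long
key = "John Smith"
For i = 1 To Len(key)
    h = (h * 131 + Asc(key, i)) Mod 1048576 ' keep the value inside a 2^20-slot table
Next
Print "Key '"; key; "' maps onto slot"; h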

Imagine a scenario of having 8 billion records (this number is not arbitrary; each and every decent program should handle the current Earth's population), let us say up to 256 chars in size each. Then Gumbotron needs a prefix (the key length) in front of the 16-byte hash value, i.e. 17 bytes, in order to hash these up-to-256-char keys properly (without collisions).
Ideally, the most sought-after feature (yes, perfect hashing) is to have all the ducks in an array/row, where the index/hash "maps onto" the key itself. Indexing these people's e.g. full names needs 33 bits, i.e. an _INTEGER64, which is scary already: 8 billion x 8 bytes = 64 GB; Gumbotron needs 9 more bytes per record, or 8 billion x 17 bytes = 136 GB.
Nothing new really; however, diving into power-limited approaches is no fun and is to be avoided - the worse the better, meaning one has to prepare for real-world tasks.
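To make that arithmetic concrete, here is a quick back-of-the-envelope check (assuming exactly 8 billion records and decimal gigabytes):

Code:
' Back-of-the-envelope check of the table sizes quoted above.
Dim records As _Integer64, perfectTable As _Integer64, gumbotronTable As _Integer64
records = 8000000000&& ' roughly the current Earth's population
perfectTable = records * 8 ' one _INTEGER64 index per record -> 64 GB
gumbotronTable = records * 17 ' 16-byte hash + 1-byte key-length prefix -> 136 GB
Print "Perfect-hash index:"; perfectTable \ 1000000000; "GB"
Print "Gumbotron table:   "; gumbotronTable \ 1000000000; "GB"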

Code:
Declare CustomType Library "./manatarka"
    Sub DoubleDeuceAES_Gumbotron (ByVal buffer As _Offset, ByVal bytesto As _Integer64, ByVal LoPart As _Offset, ByVal HiPart As _Offset) ' Note: this hasher uses the AES extension, the CPU has to support it!
    Sub DoubleDeuceAES_Gumbotron_YMM (ByVal buffer As _Offset, ByVal bytesto As _Integer64, ByVal LoPart As _Offset, ByVal HiPart As _Offset) ' Note: this hasher uses the AES extension, the CPU has to support it!
End Declare

A practical example (source code and binaries) using the above approach is at:
https://qb64forum.alephc.xyz/index.php?t...#msg139414
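For orientation only, a hypothetical call against the Declare block above might look like the sketch below; it assumes (not confirmed here) that LoPart and HiPart receive the two 8-byte halves of the 16-byte hash, and that the manatarka library sits next to the program on an AES-capable CPU.

Code:
' Hypothetical call sketch - the LoPart/HiPart semantics are an assumption.
Dim key As String, lo As _Unsigned _Integer64, hi As _Unsigned _Integer64
Dim m As _MEM
key = "John Smith" ' an up-to-256-char key, per the scenario above
m = _Mem(key) ' address of the string data
DoubleDeuceAES_Gumbotron m.OFFSET, Len(key), _Offset(lo), _Offset(hi)
_MemFree m
Print "128-bit hash: "; Hex$(hi); Hex$(lo)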

Food for thought.
"He learns not to learn and reverts to what the masses pass by."
#33
Hi Sanmayce
I welcome your post, which brings the discussion into another perspective!
Big data & speed is the first point,
followed by Qsm.h to use ASM code directly, as shown at the link you posted with QuickSort_QB64 V9,
and then the Gumbotron hash mapper using 136 GB (17 bytes per record) to avoid collisions in a hash table with keys of up to 256 characters.
Associative arrays are not important (good) for heavy/huge work.

Are Qsm.h and Gumbotron QB64-PE compatible?
Do you have a demo?



