A hash array dictionary step by step
> Welcome ideas, constructive criticisms, other code demonstrations and all you, kind people of QB64, want to share about these arguments.

Associative arrays are important, but IMO not for heavy lifting. If you want to deal with a really SCARY amount of data at FORMIDABLE speed, then the name of the game, as far as I understand, is "mapping", or rather "fast mapping", or better still "the fastest mapping of fat data", yes?
There are lots of built-in mappers in different languages, yet hashing, as an INTEGRAL part of mapping, is CRUCIAL. When it comes to speed, I am not aware of a faster hasher than mine, called 'Gumbotron'. It serves as a "compressor" of data, similarly to the SHA family: it reduces a key of any length to a fixed-size value. Of course the trade-off is strength vs speed; however, when maintaining dedicated slots for different key/data lengths, this 16-byte hash is quite useful, battle-proven in my heavy compression benchmarks.

[Image: Gumbotron_darker.png]

Imagine a scenario with 8 billion records (this number is not arbitrary: every decent program should be able to handle Earth's current population), each, let us say, up to 256 chars in size. Gumbotron then needs a prefix (the key length) in front of the 16B hash value, i.e. 17B per key, in order to hash these up-to-256-char keys properly (without collisions, for practical purposes).
Ideally, the most sought-after feature (yes, Perfect Hashing) is to have all the ducks in a row, i.e. in an array where the index/hash "maps onto" the key itself. Indexing these people's full names, for example, needs 33 bits, so an INTEGER64, which is scary already: 8 billion x 8 bytes = 64GB. Gumbotron needs 9 more bytes per key, or 8 billion x 17 bytes = 136GB.
Nothing new really; however, diving into power-limited approaches is no fun and is to be avoided. The worse the better, meaning one has to prepare for real-world tasks.
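
To make the sizing above concrete, here is a small QB64 sketch that recomputes the index width and the two memory totals, and packs one 17-byte slot key. It is only an illustration under assumptions: the name "Ada Lovelace" is a hypothetical key, the lo/hi values are placeholders standing in for the two 8-byte hash halves (no hasher is called), and storing the prefix as "length minus one" so that 256 fits in a single byte is my guess at a layout, not necessarily what Gumbotron does.

```qb64
' Back-of-the-envelope sizing for the 8-billion-record scenario (a sketch, not a benchmark).
Dim records As _Unsigned _Integer64
Dim bits As Long
Dim keyText As String, slot As String
Dim lo As _Integer64, hi As _Integer64

records = 8000000000&&

' Smallest index width (in bits) that can number every record.
bits = 0
Do While 2 ^ bits < records
    bits = bits + 1
Loop
Print "Index bits needed:"; bits

' Total size of the index alone, in GB (1 GB taken as 10^9 bytes).
Print "8-byte index, GB:"; records * 8 / 1000000000
Print "17-byte slot key, GB:"; records * 17 / 1000000000

' One slot key: 1-byte length prefix + two 8-byte hash halves.
keyText = "Ada Lovelace" ' hypothetical key, 1..256 chars
lo = 123456789 ' placeholder for the low 8 bytes of the hash
hi = 987654321 ' placeholder for the high 8 bytes of the hash
' ASSUMPTION: the prefix stores Len - 1 so that lengths 1..256 fit in one byte.
slot = Chr$(Len(keyText) - 1) + _MK$(_Integer64, lo) + _MK$(_Integer64, hi)
Print "Slot key bytes:"; Len(slot)
```

The point of the prefix is that keys of different lengths land in different slots even if their 16-byte hash values happen to match.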

Code:
Declare CustomType Library "./manatarka"
    'Note: Both hashers use the AES extension, the CPU has to support it!
    Sub DoubleDeuceAES_Gumbotron (ByVal buffer As _Offset, ByVal bytesto As _Integer64, ByVal LoPart As _Offset, ByVal HiPart As _Offset)
    Sub DoubleDeuceAES_Gumbotron_YMM (ByVal buffer As _Offset, ByVal bytesto As _Integer64, ByVal LoPart As _Offset, ByVal HiPart As _Offset)
End Declare

A practical example (source code and binaries) using the above approach is at:
https://qb64forum.alephc.xyz/index.php?t...#msg139414

Food for thought.
"He learns not to learn and reverts to what the masses pass by."


Messages In This Thread
RE: A hash array dictionary step by step - by Sanmayce - 05-08-2023, 12:12 AM


