An hash array dictonary step by step
#23
(04-11-2023, 05:36 PM)TempodiBasic Wrote: I must understand ,thinking about your post, that the very better result of Luke's algorithm is linked to a different way to store data.

Kinda, it's more that Luke's hash table doesn't support multiple values with the same key. I should have included the code for my "fix":

Code: (Select All)
SUB InsertLookup (key$, lngIndex&)
    DIM hIndex~&

    hIndex~& = GetHash~&(key$, UBOUND(m_arrLookup))
    DO
        IF m_arrLookup(hIndex~&) = 0 THEN EXIT DO
        IF m_arrDict(m_arrLookup(hIndex~&)).key = key$ THEN EXIT DO ' I added this line
        hIndex~& = (hIndex~& + 1) MOD (UBOUND(m_arrLookup) + 1)
    LOOP
    m_arrLookup(hIndex~&) = lngIndex&
END SUB ' InsertLookup

The change causes the insert to return an existing entry if the same key is already in the hash table. Otherwise, the loop in that function has to loop over all the existing identical keys before it finds the next free spot, which is very slow and gets slower the more keys you add.

My hash table is the same and also doesn't support multiple entries with the same key (lookup can only return a single entry for a given key), but it doesn't have the performance issue Luke's does because
HashTableAdd&
doesn't have to loop over all the existing entries in a bucket. It still doesn't work though, when you add an entry with the same key the old ones are still in the bucket but can't be accessed via the lookup function.

Now if you _want_ to be able to add multiple entries with the same key my bucket approach is better but neither are really suited for it. What you'd probably want is to create a linked list for each set of entries with the same key. For my approach, that means adding another link next to
nxt
that points to entries with the same key, making each bucket a bit of a "list of lists". You could also do this with Luke's approach but I think it would require a separate array to store the links for the list of entries with the same key. It's definitely doable though.

That said, I'd probably call this the user's problem Big Grin A hash table doesn't really need to support this natively, you can always modify the
Value
Type to hold a list and get the same thing.

Additionally, for a speed comparison, I can add a similar check in my
HashTableAdd&
that blocks adding entries with the same key. Technically in typical usage this actually makes it slower, because now the
Add
function has to loop over all the entries. For your particular test though it actually makes it faster, because it means avoiding some ReDim's on the
HashEntries()
array:

[Image: hashtable2.png]

It's not particularly consistent on my machine though (you can see everything is a bit slower today).
Reply


Messages In This Thread
RE: An hash array dictonary step by step - by DSMan195276 - 04-11-2023, 09:17 PM



Users browsing this thread: 22 Guest(s)