B4A Class [B4X] [Class] Interpolation search

Short (long-winded 😁) premise.

I needed a function that, given a list, returned the index of the element with the closest value to a searched one.
List.IndexOf is great but it only returns the index if it finds exactly the searched value.

I immediately thought of a binary search (I remember the good old days, about 45 years ago :confused:, when I "studied" the various Bubble, Shell and Quick sort) but I wondered if there was a better algorithm.

I discovered the existence of this "Interpolation search".
The biggest advantage is in the case in which the list of values are uniformly distributed. In the worst case, this algorithm will be as fast as a "normal" binary search.

I publish the function in a class (in the attached project) rather than a library because it will be easier to change the data type of the list, if desired. In my case, I want to pass it a list of custom types (it would be nice to be able to pass it the type and the field to search on, but as far as I know it is impossible).

I hope (and think) that my implementation is correct and optimal.

With the following values:
B4X:
    For i = 1 To 20
        SortedList.Add(i * 5) ' 5, 10, 15, ..., 100
    Next

1733295215617.png
 

Attachments

  • InterpolationSearch.zip
    10.9 KB · Views: 71
Last edited:

Claudio Oliveira

Active Member
Licensed User
Longtime User
Well done, Luca!

This thread made me drift back to 1980-something, when I read and "studied" Donald E. Knuth book "The Art Of Computer Programming: Sorting and Searching, Volume 3", which describes and explains all these algorithms mathematically. Fascinating book.
Good old days indeed!

Thanks for sharing.
 

emexes

Expert
Licensed User
Not that it matters, but...
it feels like the two highlighted lines always return identical results regardless of whether IsAscending is True or False,
and thus you could reduce this code size by 80% (as in: remove 4 out of the 5 lines)

B4X:
If IsAscending Then
    Pos = Low + (SearchValue - LowValue) * (High - Low) / (HighValue - LowValue)
Else
    Pos = Low + (LowValue - SearchValue) * (High - Low) / (LowValue - HighValue)
End If
 

LucaMs

Expert
Licensed User
Longtime User
Not that it matters, but...
it feels like the two highlighted lines always return identical results regardless of whether IsAscending is True or False,
and thus you could reduce this code size by 80% (as in: remove 4 out of the 5 lines)

B4X:
If IsAscending Then
    Pos = Low + (SearchValue - LowValue) * (High - Low) / (HighValue - LowValue)
Else
    Pos = Low + (LowValue - SearchValue) * (High - Low) / (LowValue - HighValue)
End If
Correct.

The worst thing is that if the data is not "well distributed", its operation becomes very similar to a sequential, non-binary search.
This makes me doubt whether it is better to use this search or that one. You could add a test to evaluate the distribution and based on this, perform one of the two search algorithms.

But the even more serious thing is that in my sw the critical point, the one that slows down the processing the most, is another 😁 :confused:
 
Top