Android Question The best way to do massive searches

PABLO2013

Well-Known Member
Licensed User
Longtime User
Greetings, I am trying to read from a txt or a csv the location of an airport with respect to distance to my location, what happens is that the number of airports is 350,000 and it takes too long, the information could be cut or converted, for example, to sql or so. ...thank you

B4X:
Sub CheckNearbyAirports(path As String, Filename As String, mylat As Double,mylon As Double)

   
        Dim lst As List = File.ReadList(path, Filename)
       
        For i = 0 To lst.size -1
           
            Dim line As String = lst.Get(i)
            line = line.Trim.Replace("  ", "")
            Dim xyz() As String = Regex.Split(",", line)

            Dim airportName As String = xyz(0)
            Dim airportLat As Double = xyz(5)
            Dim airportLon As Double = xyz(6)

    Log(i)

If Abs(mylat-airportLat) < 0.001 Then

            Dim distance As Double = HaversineDistance(mylat, mylon , airportLat, airportLon)
            If distance < 100 Then
                Log("Near to : " & airportName)
             
            End If
 End If
 
   Next
 

End Sub
 

emexes

Expert
Licensed User
Step 1 would be to do the pre-trig filtering by longitude as well as latitude, at least for say latitudes between +/- 80 degrees, with the 0.001 upped to 0.006 to take into account that degrees longitude are shorter as you move closer to the poles. I expect there are few enough airports close to the poles that it's easy enough to just search them all without worrying about initial filtering.

Now that I think about it, that 0.001 (radians?) seems a bit tight - wouldn't that miss airports that are more than 6.4 km north or south of your location?
 
Upvote 0

emexes

Expert
Licensed User
Step 2 would be to scale the longitudinal distances from angular to linear (eg km), depending on latitude. I was a bit wary of that to begin with, thinking that it would involve a cosine operation for each airport tested, but then I realised: just doing the one cosine of mylat and using it to scale longitudinal distances for all airports will be good enough, because you've already filtered to airports that are close in latitude and the scale isn't going to change that much over that distance.
 
Upvote 0

Alex_197

Well-Known Member
Licensed User
Longtime User
Greetings, I am trying to read from a txt or a csv the location of an airport with respect to distance to my location, what happens is that the number of airports is 350,000 and it takes too long, the information could be cut or converted, for example, to sql or so. ...thank you

B4X:
Sub CheckNearbyAirports(path As String, Filename As String, mylat As Double,mylon As Double)

  
        Dim lst As List = File.ReadList(path, Filename)
      
        For i = 0 To lst.size -1
          
            Dim line As String = lst.Get(i)
            line = line.Trim.Replace("  ", "")
            Dim xyz() As String = Regex.Split(",", line)

            Dim airportName As String = xyz(0)
            Dim airportLat As Double = xyz(5)
            Dim airportLon As Double = xyz(6)

    Log(i)

If Abs(mylat-airportLat) < 0.001 Then

            Dim distance As Double = HaversineDistance(mylat, mylon , airportLat, airportLon)
            If distance < 100 Then
                Log("Near to : " & airportName)
            
            End If
 End If
 
   Next
 

End Sub
Please see the code to calc the distance by lat/long

B4X:
Public Sub CalcDistance(CurrentLat As Double,CurrentLong As Double,HouseLat As Double,HouseLong As Double) As Double
    
    Try
        
        
        Dim LatP As Double=CurrentLat
        Dim LongP As Double=CurrentLong
        Dim LatD As Double=HouseLat
        Dim LongD As Double=HouseLong
        
        Dim Miles As Double
        Dim Yards As Double
        Dim X As Double
        
        Dim Coeff As Double= 1760.0

        X = Sin(LatP * 0.01745) * Sin(LatD * 0.01745) + Cos(LatP * 0.01745)* Cos(LatD * 0.01745) * Cos(( LongD - LongP ) * 0.01745)
        
        Miles = 1.15 * 3963 * ( ATan(-X / Sqrt(-X * X+ 1.0000000000001)) + 2* ATan(1) )
                                    
        Miles = Round2(Miles, 2)
            
        Yards = Miles * Coeff
            
        Return Round( Yards)
        
    Catch
        
        Log("CalcDistance " & LastException.Message)
        
        Return -1
    End Try
    
End Sub
 
Upvote 0

emexes

Expert
Licensed User
Please see the code to calc the distance by lat/long

For locations within a few degrees of each other, converting the angular distances to straight-line distances (eg miles) and using Pythagoras is way simpler and faster than trig operations, and accurate enough for filtering purposes. In fact, for selecting airports within x distance, you don't even need to root it, you can just compare the squares.
 
Upvote 0

emexes

Expert
Licensed User
Upvote 0

emexes

Expert
Licensed User
it takes too long

This is another approach, and an interesting read anyway:

88-86c7c00f57.jpg

Keywords so I can find this post later: programming pearls longitude latitude cartestian closest nearest distance
 
Upvote 0

Hamied Abou Hulaikah

Well-Known Member
Licensed User
Longtime User
Problems:
1- I think for end customer, he doesn't need all 350K results, he never read them.
2- Distance calc for each record will consume a lot of memory.
3- txt, csv or sqlite all will be slow in response for this large records number.
Solutions:
1- Just narrow search result by anyway.
2- If step 1 solved, this automatically solved,
3- Use document database types such as Couchbase Lite.

That's my opinion
 
Upvote 0

William Lancee

Well-Known Member
Licensed User
Longtime User
Divide and conquer.
180 x 360 areas of the globe based on lat/lon degrees.
Each with a list of all it's airports.
Registration is done once. Even at 100x4188 airports, it is <2 seconds
Searches will always take only 16msecs.

Edit: I have replaced the distance computation with the Haversine algorithm to make it more accurate.
Revised code NearX_V4 is attached below.

Edit: Disclaimer: Use at your own risk. Especially if you're flying a plane!

Log of results:
Nearby Airports

YOUNGSTOWN YNG YOUNGSTOWN WARREN RGNL 364km
WILLIAMSPORT IPT WILLIAMSPORT RGNL 319km
BUFFALO BUF BUFFALO NIAGARA INTERNATIONAL 123km
WATERLOO YKF WATERLOO RGNL 187km
HAMILTON YHM HAMILTON 168km
TORONTO YKZ BUTTONVILLE MUNI 98km
TORONTO YTZ CITY CENTRE 106km
TORONTO YYZ LESTER B PEARSON INTERNATIONAL 122km
TORONTO YZD DOWNSVIEW 108km
NIAGARA FALLS IAG NIAGARA FALLS INTERNATIONAL 114km
ROCHESTER ROC GREATER ROCHESTER INTERNATIONAL 102km
WATERTOWN ART WATERTOWN INTERNATIONAL 172km
SYRACUSE SYR SYRACUSE HANCOCK INTERNATIONAL 191km
MUSKOKA YQA MUSKOKA 145km
PETERBOROUGH YPQ PETERBOROUGH 34km
TRENTON YTR TRENTON 54km
KINGSTON YGK KINGSTON 129km
PETAWAWA YWA PETAWAWA 232km

Closest: PETERBOROUGH YPQ PETERBOROUGH 34km

Search time: 16milliseconds

B4X:
Sub Class_Globals
    Type Point(X As Float, Y As Float)        'Move to Sub Class_Global
    Private Root As B4XView        'ignore
    Private xui As XUI
 
    Private Areas(180, 360) As List
End Sub

Public Sub Initialize
End Sub

Private Sub B4XPage_Created (Root1 As B4XView)
    Root = Root1
    Dim airports As List = File.ReadList(File.DirAssets, "GlobalAirportDatabase.txt")
    'AYGA:GKA:GOROKA:GOROKA:PAPUA NEW GUINEA:006:004:054:S:145:023:030:E:01610:-6.082:145.392

    Dim airportCount As Int
    Dim markTime As Long = DateTime.now
    For Each s As String In airports
        Dim v() As String = Regex.Split("\:", s)
        If v(8) <> "U" Then
            Dim pt As Point = LLToDecimal(Array(v(5), v(6), v(7), v(8)), Array(v(9), v(10), v(11), v(12)))
            Dim ptx As Point = CreatePoint(pt.X + 90, pt.Y + 180)
            If Areas(ptx.X, ptx.Y).isInitialized = False Then Areas(ptx.X, ptx.Y).Initialize
            Areas(ptx.X, ptx.Y).Add(Array(v(3), v))
            airportCount = airportCount + 1
        End If
    Next
    Log("Registration time: " & (DateTime.Now - markTime) & "milliseconds" & TAB & "For " & airportCount & " Airports")
    Log(TAB)
   

    Dim cities As Map = CreateMap( _
        "Cobourg": CreatePoint(43.95977, -78.16515), _
        "Melbourne": CreatePoint(-37.814, 144.96332), _
        "Christchurch": CreatePoint(-43.5320, 172.6306), _
        "Greenwich": CreatePoint(51.4934, 0.0098))
   
    Log("Nearby Airports")
   
    Dim example As String = "Cobourg"
    markTime = DateTime.now
    Dim found As List = findNearest(cities.Get(example), 2)
    Dim closest() As Object
    Dim minDist As Float = 1 / 0
    If found.Size > 0 Then
        Log(found.Size)
        For Each ar() As Object In found
            Dim v() As String = ar(1)
            Dim thisExact As Point = LLToDecimal(Array(v(5), v(6), v(7), v(8)), Array(v(9), v(10), v(11), v(12)))
            Dim dist As Float = DistanceKM(cities.Get(example), thisExact)
            If dist < minDist Then
                minDist = dist
                closest = ar
            End If
            Log(ar(0) & TAB & v(1) & TAB & v(2) & TAB & Ceil(dist) & "km")
        Next
    End If
    ar = closest
    v = ar(1)
    Log(TAB)
    Log("Closest: " & ar(0) & TAB & v(1) & TAB & v(2) & TAB & Ceil(minDist) & "km")
    Log(TAB)
    Log("Search time: " & (DateTime.Now - markTime) & "milliseconds")
End Sub

B4X:
Private Sub findNearest(pt As Point, delta As Int) As List
    Dim result As List = CreateEmptyList
    Dim ptx As Point = CreatePoint(pt.X + 90, pt.Y + 180)
    For i = Max(0, ptx.x - delta) To Min(179, ptx.x + delta)
        Dim xi  As Int = i
        For j = ptx.y - delta To ptx.y + delta
            Dim xj As Int = j
            If xj < 0 Then xj = xj + 360
            If xj > 359 Then xj = xj - 360
            If Areas(xi, xj).IsInitialized Then
                For Each ar() As Object In Areas(xi, xj)
                    result.Add(ar)
                Next
            End If
        Next
    Next
    Return result
End Sub

Note see attached zip for complete code, including some helper subs.
 

Attachments

  • NearX_V4.zip
    208.8 KB · Views: 151
Last edited:
Upvote 0

BlueVision

Active Member
Licensed User
Longtime User
Impressive processing speeds!
My approach would be to import the CSV data into an SQL database first. I'm not entirely sure, but the processing speed of an SQL database as a data basis should certainly be many times higher than with a CSV-based basis.

I have not yet been able to look at the example, but if the coordinates of the airports are also already contained in the SQL database, it might be possible to save time-consuming trigonometric calculations. The fastest connections between airports are great circles (a virtual rubber band from the departure airport to the destination airport) on the WGS84 ellipsoid. I would then use the distance calculation of the GPS library for this. Of course, it still depends on sophisticated programming to make the programme fast. It all depends on a trial.
My largest SQL database has almost 100,000 entries. At least the visible representation of the database on a screen has its limits... So if you start with a clever preselection of locations, searching through this amount of data and displaying it only takes a moment.
 
Upvote 0

emexes

Expert
Licensed User
B4X:
    For i = pt.x - delta To pt.x + delta
        Dim xi As Int = i
        If i < 0 Then xi = xi + 180
        If i > 178 Then xi = xi - 180

Why testing i instead of xi?

Why 178?

Maybe even: why not Mod? eg:

B4X:
    For i = pt.x - delta To pt.x + delta
        Dim xi As Int = (i + 180) Mod 180    'lol Mod eliminates one trap but introduces a new one
 
Last edited:
Upvote 0

William Lancee

Well-Known Member
Licensed User
Longtime User
@emexes

178 Just a typo really should be 179

But the mod stuff is better

i vs xi same really here

To those watching, we're talking about taking care of the transition between lat 179 and 180=0 and lon 359 and 360 =0.
 
Upvote 0

William Lancee

Well-Known Member
Licensed User
Longtime User
Essentially, just search the neighbourhood (500km) of where you are!
Canada, like Australia is big, so neighbourhoods are big too :)

Mind you, the 109km from Cobourg to ROCHESTER is across Lake Ontario.

I put in Melbourne in the same algorithm.

____________________

Registration time: 19milliseconds For 4188 Airports

Nearby Airports
AVALON AVV AVALON 61km
MELBOURNE MEB MELBOURNE ESSENDON 12km
MELBOURNE MEL MELBOURNE INTERNATIONAL 21km
POINT COOK N/A POINT COOK 27km
MELBOURNE MBW MELBOURNE MOORABBIN 24km
ALBURY ABX ALBURY 295km

Closest: MELBOURNE MEB MELBOURNE ESSENDON 12km

Search time: 15milliseconds
 
Upvote 0

William Lancee

Well-Known Member
Licensed User
Longtime User
For ChristChurch NZ: of course the closest isn't necessarily the most practical.

Registration time: 21milliseconds For 4188 Airports

Nearby Airports
OAMARU OAM OAMARU 235km
TIMARU TIU TIMARU 179km
CHRISTCHURCH CHC CHRISTCHURCH INTERNATIONAL 12km
WIGRAM N/A WIGRAM 9km

Closest: WIGRAM N/A WIGRAM 9km

Search time: 19milliseconds

For .zip see #11
 
Last edited:
Upvote 0

emexes

Expert
Licensed User
@emexes
Please explain

For mapping an "out of range" angle to [0, 180)

the If comparisons, won't handle anything requiring more than one adjustment eg -359 will map to -179
although While comparisons would

the Mod operator of Java and hence B4J/B4A doesn't handle negative numbers as you'd expect it to
ie you'd think that x Mod 180 would always return a value [0, 180)
so that you could use it to index into arrays like DegreeLongitudeToMetres(180) As Single
if you are confident that Float-to-Int casting is Floor rather than Round
https://torstencurdt.com/tech/posts/modulo-of-negative-numbers/
although to be fair, even the construct I gave relies on the delta being less than 180, which is the same thing that saves your If comparison

I do remember something about B4X Mod working with reals as well as integers, which would be nice.
 
Upvote 0

emexes

Expert
Licensed User
Reducing angle xi to [0, 180)

B4X:
Do While xi < 0
    xi = xi + 180
Loop
Do While xi >= 180
    xi = xi - 180
Loop

or doing it "directly" without comparisons or loops, then either of:

B4X:
xi = ((xi Mod 180) + 180) Mod 180    'to be sure, to be sure

B4X:
xi = xi - Floor(xi / 180) * 180    'what Mod should be (bonus: works for reals too)
 
Last edited:
Upvote 0

William Lancee

Well-Known Member
Licensed User
Longtime User
Thanks @emexes for that explanation. I knew about the negative x in 'x mod n', but I hadn't thought about negative n.
I also didn't know that different languages deal with this differently. In the link you provided, Fortran is not even on the list!

Since my loop goes from (index - 2) to (index + 2) I was only concerned about
mapping -2, -1, to 178, 179 and mapping 180, 181 to 0, 1, so the if statements will suffice.

Edit: 0 to 180 => Latitude -90 (South Pole) to +90 (North Pole), so -2, -1 in my loop should not loop back to the North Pole!
It is enough to limit Latitude to -90 to 90 and the lower limit of the loop to 0 and 179. Corrections made.

But I'm always looking for a better way.

I put Greenwich in the algorithm. there were 92 airports "nearby".
Search time: 21milliseconds

The furthest away was
PARIS CDG CHARLES DE GAULLE 395km

Since Amsterdam Schiphol was not on the list and is closer (at about 347km) than Paris, you can start to see the flaws of this approach.
Beyond a certain distance (280km?) the completeness of the list is unreliable. We could increase the delta or just remove airports beyond 280km from the list.

There are also large bodies of water, deserts, and tundras without airports. If one was in those places, the list would be empty.
 
Last edited:
Upvote 0
Top