Parse this web page/HTML to return data ... did it in VB, but how in B4A?

b4AMarkO

Member
Licensed User
Longtime User
Here is an example of what's contained in the page

B4X:
<table id="ctl00_ctl00_ctl00_ContentPlaceHolderDefault_COTSectionTextpagePlaceHolder_ctl00_COT_LiveCallsInArea_9_grdDivision03" align="Center" cellspacing="0" cellpadding="4" border="0" style="color:#333333;width:470px;border-collapse:collapse;"><tbody><tr style="color:White;background-color:#5D7B9D;font-weight:bold;"><th align="left" abbr="Description" scope="col">
      Description
    </th><th align="left" abbr="Location" scope="col">
      Location
    </th></tr><tr style="color:#333333;background-color:#F7F6F3;"><td align="center">
      Traffic Stop
    </td><td>
      5600 S YALE 
    </td></tr><tr style="color:#284775;background-color:White;"><td align="center">
      Burglar Alarm
    </td><td>
      1300 E 46 ST S 
    </td></tr>

I parsed the above in VB and was able to get the data (e.g. Traffic Stop and 5600 S YALE) into two array lists.

Something like this:

B4X:
If myTable IsNot Nothing Then
    For Each myElement As HtmlElement In myTable.GetElementsByTagName("TR")
        Dim myTDTags As HtmlElementCollection = myElement.GetElementsByTagName("TD")

        If myTDTags.Count = 2 Then
            If myTDTags(0).InnerText = "Description" Or myTDTags(1).InnerText = "Location" Then
                ' Don't add
            Else
                myPubAr.Add(myTDTags(0).InnerText)
                myLocAr.Add(myTDTags(1).InnerText)
            End If
        End If
    Next
End If



But guys, I'm lost ... I have no idea how to even start in B4A.

The website is here: https://www.tulsapolice.org/live-calls-/police-calls-near-you.aspx
Help is much appreciated.
 

b4AMarkO

I decided to get out VB and see if I could write it again ... and I did. Even with the new web page layout they have, I was able to gather the data with no problem at all.

Here is the entire code as an example. My question is: what is the equivalent of this code in B4A?
I wrote it in approx. 30 minutes :( LOL. I hope there is a similar way to do this here.
B4X:
Public Class Form1

    Dim _url As String = "https://www.tulsapolice.org/live-calls-/police-calls-near-you.aspx"

    Public myPubAr As New ArrayList
    Public myLocAr As New ArrayList

    Private Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
        ' Nothing Here yet
    End Sub
    Sub UpdateCalls()

        myPubAr.Clear()
        myLocAr.Clear()

        wb2.Navigate(_url) ' Navigates to web page
        ' Busy-wait until the WebBrowser control has finished loading
        Do Until wb2.ReadyState = WebBrowserReadyState.Complete
            Application.DoEvents()
        Loop
        ' ******************** Now to parse the page ***********

        Dim myTable As HtmlElement = wb2.Document.All("ctl00_ctl00_ctl00_ContentPlaceHolderDefault_COTSectionTextpagePlaceHolder_ctl00_COT_LiveCallsInArea_9_grdDivision03")
        If myTable IsNot Nothing Then
            For Each myElement As HtmlElement In myTable.GetElementsByTagName("TR")
                Dim myTdTags As HtmlElementCollection = myElement.GetElementsByTagName("TD")

                ' Only rows with exactly two TD cells carry a call
                If myTdTags.Count = 2 Then
                    ' Skip the header texts
                    If Not (myTdTags(0).InnerText = "Description" OrElse myTdTags(1).InnerText = "Location") Then
                        myPubAr.Add(myTdTags(0).InnerText)
                        myLocAr.Add(myTdTags(1).InnerText)
                    End If
                End If
            Next
        End If

        ' Create the fonts once instead of on every pass through the loop
        Dim fontBold As New Font(Font.FontFamily, 13, FontStyle.Bold)
        Dim fontRegular As New Font(Font.FontFamily, 12, FontStyle.Regular)

        ' Note: the loop starts at 1, so the entry at index 0 is not displayed
        For i As Integer = 1 To myPubAr.Count - 1
            rtbCalls.SelectionFont = fontBold                                     'Change font size and style before appending
            rtbCalls.AppendText(myPubAr(i))                                       'Append description to rich text box
            rtbCalls.SelectionFont = fontRegular                                  'Return font to normal
            rtbCalls.AppendText(vbNewLine & vbTab & myLocAr(i) & vbNewLine)       'Append location to rich text box
            'lstLoc.Items.Add(myLocAr(i))
        Next
    End Sub

    Private Sub btnUpdate_Click(sender As Object, e As EventArgs) Handles btnUpdate.Click
        Call UpdateCalls()
    End Sub
End Class
 

Erel

B4X founder
Staff member
Licensed User
Longtime User
Here:
B4X:
Sub Process_Globals
   Type Row (Description As String, Location As String)
   Private Rows As List
   Private tempRow As Row
End Sub

Sub Globals
   
End Sub

Sub Activity_Create(FirstTime As Boolean)
   Download
End Sub
Sub Download
   Dim j As HttpJob
   j.Initialize("j", Me)
   j.Download("https://www.tulsapolice.org/live-calls-/police-calls-near-you.aspx")
End Sub

Sub JobDone(job As HttpJob)
   If job.Success Then
      Dim jtidy As Tidy
      jtidy.Initialize
      'convert html to xml
      jtidy.Parse(job.GetInputStream, File.DirInternalCache, "1.html")
      Dim sax As SaxParser
      sax.Initialize
      Dim In As InputStream = File.OpenInput(File.DirInternalCache, "1.html")
      Rows.Initialize
      tempRow.Initialize
      sax.Parse(In, "sax")
      In.Close
      Log(Rows)
      
   Else
      ToastMessageShow("Error downloading page.", True)
      Log(job.ErrorMessage)
   End If
   job.Release
End Sub

Sub Sax_EndElement (Uri As String, Name As String, Text As StringBuilder)
   If Name = "td" Then
      If tempRow.Description = "" Then
         tempRow.Description = Text
      Else
         tempRow.Location = Text
         Rows.Add(tempRow)
         Dim tempRow As Row 'create a new object
         tempRow.Initialize
      End If
   End If
End Sub

You need to download Tidy library and HttpUtils2 modules.
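
The handler above collects every td on the page. If only one table is wanted, a possible variant is to track whether the parser is currently inside that table. This is only a sketch under assumptions: TargetId and InTarget are hypothetical names, the id is copied from the HTML sample in the first post, and it assumes Tidy keeps the id attribute and that the table has no nested tables. In that sample the header cells are th elements, so testing for td already skips "Description" and "Location".

```b4x
'Add to Process_Globals (hypothetical names):
'   Private TargetId As String = "ctl00_ctl00_ctl00_ContentPlaceHolderDefault_COTSectionTextpagePlaceHolder_ctl00_COT_LiveCallsInArea_9_grdDivision03"
'   Private InTarget As Boolean

Sub Sax_StartElement (Uri As String, Name As String, Attributes As Attributes)
   'Assumption: Tidy keeps the id attribute on the table element.
   If Name = "table" Then
      If Attributes.GetValue2("", "id") = TargetId Then InTarget = True
   End If
End Sub

Sub Sax_EndElement (Uri As String, Name As String, Text As StringBuilder)
   If Name = "table" Then
      'Leaving a table: stop collecting cells.
      InTarget = False
   Else If Name = "td" And InTarget Then
      'Trim the line breaks and spaces that surround the cell text.
      Dim cell As String = Text.ToString.Trim
      If tempRow.Description = "" Then
         tempRow.Description = cell
      Else
         tempRow.Location = cell
         Rows.Add(tempRow)
         Dim tempRow As Row 'create a new object
         tempRow.Initialize
      End If
   End If
End Sub
```

InTarget would also need to be set back to False before each call to sax.Parse.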
 

b4AMarkO

Erel, this is amazing, it is doing it ...

It's getting the td cells and loading them into the list ...

Question: my VB code gets the td data as well, only there I can aim it at the specific table of the Riverside Division and skip the words "Description" and "Location".

How would I do that here? I assume Sax_EndElement is where this would happen, right?

About the words Description and Location:

Log(Rows) returns this in the log:
[Description=Auto Theft, Location=600 S Peoria

Couldn't the list simply be
Auto Theft, 600 S Peoria
or am I looking at this wrong?

If Description and Location must be there, then how do I remove them from the list and display the info so it looks like this:

Auto Theft 600 S Peoria
Burglar Alarm 3100 W Easton

B4X:
Sub Sax_EndElement (Uri As String, Name As String, Text As StringBuilder)
   If Name = "td" Then
      If tempRow.Description = "" Then
         tempRow.Description = Text
      Else
         tempRow.Location = Text
         Rows.Add(tempRow)
         Dim tempRow As Row 'create a new object
         tempRow.Initialize
      End If
   End If
End Sub
 

Erel

Log(Rows) just prints the list. It is only for debugging. You should iterate over the items in the list and do whatever you need with the data.
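
As a rough sketch of that iteration: each item in Rows is a Row, so its fields can be read directly instead of working on the text that Log prints. lv1 is an assumption here, standing for a ListView that has already been added to the Activity.

```b4x
'Assumption: lv1 is a ListView that is already part of the layout.
For Each r As Row In Rows
   'Trim removes the whitespace around the cell text.
   lv1.AddSingleLine(r.Description.Trim & " " & r.Location.Trim)
Next
```

With the two rows from the question this should give lines such as "Auto Theft 600 S Peoria", without any "Description=" or "Location=" text to strip.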

About the parsing question. I don't exactly remember the XML structure. As Tidy object returns a valid XML string you can extract any data you need from it. I like to save the output to a file in the external storage and then copy it to the desktop and analyze the XML.
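
One way this could look, assuming it runs in JobDone after jtidy.Parse has written "1.html" to File.DirInternalCache. The target file name calls.xml is only an example, and on recent Android versions writing to the root of the external storage may need a runtime permission or be blocked.

```b4x
'Assumption: jtidy.Parse has already created 1.html in the internal cache.
If File.ExternalWritable Then
   File.Copy(File.DirInternalCache, "1.html", File.DirRootExternal, "calls.xml")
   Log("Saved to " & File.Combine(File.DirRootExternal, "calls.xml"))
Else
   Log("External storage is not writable.")
End If
```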
 

b4AMarkO

Sounds good ... How do I save it to the external storage so I can analyze the XML?

Sounds like the way to go ...
 

b4AMarkO

OK Erel,

I'm trying to iterate through the list and then display it in a ListView.
It's displaying, but it's not replacing the string ... I am probably going about it wrong, but I am really rusty, so help is appreciated.
Edit: I figured it out. I added the assignment myString = myString.Replace(...), etc.
B4X:
For i = 0 To Rows.Size - 1
   Dim myString As String
   myString = Rows.Get(i)

   If myString.StartsWith("[Description=") Then
      myString = myString.Replace("[Description=", "")
      lv1.AddSingleLine(myString)
   End If
Next


I still need to know how to save the XML file to external storage ... Thank you in advance.
 