Android Tutorial XML Parsing with the XmlSax library

It is simpler to parse XML with the Xml2Map class: https://www.b4x.com/android/forum/threads/b4x-xml2map-simple-way-to-parse-xml-documents.74848/

The XmlSax library provides an XML Sax parser.
This parser sequentially reads the stream and raises events at the beginning and end of each element.
The developer is responsible for doing something useful with those events.

There are two events:
B4X:
StartElement (Uri As String, Name As String, Attributes As Attributes)
EndElement (Uri As String, Name As String, Text As StringBuilder)
StartElement is raised when an element begins. This event includes the element's attribute list.
EndElement is raised when an element ends. This event includes the element's text.

In this example we will parse the forum RSS feed. RSS is formatted using XML.
A simplified example of this RSS is:
B4X:
<?xml version="1.0" encoding="ISO-8859-1"?>
<rss version="2.0">
    <channel>
        <title>Basic4ppc  / Basic4android - Android programming</title>
        <link>http://www.b4x.com/forum</link>
        <description>Basic4android - android programming and development</description>
        <ttl>60</ttl>
        <image>
            <url>http://www.b4x.com/forum/images/misc/rss.jpg</url>
            <title>Basic4ppc  / Basic4android - Android programming</title>
            <link>http://www.b4x.com/forum</link>
        </image>
        <item>
            <title>Phone library was updated - V1.10</title>
            <link>http://www.b4x.com/forum/additional-libraries-official-updates/6859-phone-library-updated-v1-10-a.html</link>
            <pubDate>Sun, 12 Dec 2010 09:27:38 GMT</pubDate>
            <guid isPermaLink="true">http://www.b4x.com/forum/additional-libraries-official-updates/6859-phone-library-updated-v1-10-a.html</guid>
        </item>
        ...MORE ITEMS HERE
    </channel>
</rss>
The first line is the XML declaration; it does not raise an element event.
On the second line the StartElement event is raised with Name = "rss", and the attributes include the "version" field.
The EndElement event of the rss element is raised only on the last line: </rss>.
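
To see the order in which the events arrive for a file like the one above, a minimal logging sketch can help. It assumes the events prefix is "Parser" and only uses the Attributes members Size, GetName and GetValue; it is meant for exploration, not as part of the final program.

```b4x
Sub Parser_StartElement (Uri As String, Name As String, Attributes As Attributes)
    Log("Start: " & Name)
    'List every attribute of the current element by index.
    Dim i As Int
    For i = 0 To Attributes.Size - 1
        Log("  " & Attributes.GetName(i) & " = " & Attributes.GetValue(i))
    Next
End Sub

Sub Parser_EndElement (Uri As String, Name As String, Text As StringBuilder)
    'Text holds the characters collected for the element that just ended.
    Log("End: " & Name & ", text: " & Text.ToString.Trim)
End Sub
```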

We will populate a list view with all the items parsed from an offline file. When the user presses an item, we open the browser with the relevant link.
Every item represents a forum thread.

xmlsax_1.png


For each item we are interested in two values: the title and the link.
The SaxParser object includes a handy list (Parents) that holds the names of all the current parent elements.
This is useful as it helps us find the "correct" 'title' and 'link' elements: the ones under the 'item' element (the 'channel' and 'image' elements also contain 'title' and 'link').

The parsing code in this case is pretty simple:
B4X:
Sub Parser_StartElement (Uri As String, Name As String, Attributes As Attributes)

End Sub
Sub Parser_EndElement (Uri As String, Name As String, Text As StringBuilder)
    If parser.Parents.IndexOf("item") > -1 Then
        If Name = "title" Then
            Title = Text.ToString
        Else If Name = "link" Then
            Link = Text.ToString
        End If
    End If
    If Name = "item" Then
        ListView1.AddSingleLine2(Title, Link) 'add the title as the text and the link as the value
    End If
End Sub
Title and Link are global variables.
We are only using EndElement events in this program.
First we check if we are inside an 'item' element. If this is the case we check the actual element name and save it if it is 'title' or 'link'.

If the current element is 'item' it means that we are done parsing an item.
So we add the data collected to the list view.

We are using ListView.AddSingleLine2. This method receives two values: the first is the item text and the second is the value that is returned when the user clicks the item. In this case we store the link as the return value.

Later we will use it to open the browser:
B4X:
Sub ListView1_ItemClick (Position As Int, Value As Object)
    StartActivity(PhoneIntents1.OpenBrowser(Value)) 'open the browser with the link
End Sub
The code that initiates the parsing is:
B4X:
    Dim in As InputStream
    in = File.OpenInput(File.DirAssets, "rss.xml") 'This file was added with the file manager.
    parser.Parse(in, "Parser") '"Parser" is the events subs prefix.
    in.Close
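
The snippets above do not show where the objects are declared and initialized. A minimal sketch of the surrounding activity code, assuming the variable names used in this post (parser, ListView1, Title, Link, PhoneIntents1) and a list view added by code rather than by a layout file, could look like this. Note that the parser must be initialized before Parse is called.

```b4x
Sub Process_Globals
    Dim parser As SaxParser
    Dim PhoneIntents1 As PhoneIntents
End Sub

Sub Globals
    Dim ListView1 As ListView
    Dim Title, Link As String
End Sub

Sub Activity_Create(FirstTime As Boolean)
    'Process globals only need to be initialized once.
    If FirstTime Then parser.Initialize
    'Assumption: the list view is created in code and fills the activity.
    ListView1.Initialize("ListView1")
    Activity.AddView(ListView1, 0, 0, 100%x, 100%y)
    Dim in As InputStream
    in = File.OpenInput(File.DirAssets, "rss.xml")
    parser.Parse(in, "Parser") 'raises Parser_StartElement / Parser_EndElement
    in.Close
End Sub
```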
 

Attachments

  • XmlSax.zip
    10 KB

airblaster

If you are parsing different XML files, can you reuse the parser object? Or is it safer to create a parser object for each XML file?
 

socialnetis

If I have something like this:

B4X:
Sub HandleDownloadPage (Job As HttpJob)
   XmlParser.Initialize
   XmlParser.Parse(Job.GetInputStream, "XmlParser")
    DoSomething()            'Use the parsed data       
End Sub

Will DoSomething() be executed only after ALL the input is parsed, or does the parser run in another thread?
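
A quick way to check this empirically is to count events and log the counter right after Parse returns. The sketch below assumes that Parse is a blocking call that raises its events before returning (this is not confirmed anywhere in this thread); if that holds, the logged count covers every element in the stream. ElementCount and CountElements are hypothetical names.

```b4x
Sub Process_Globals
    Dim XmlParser As SaxParser
    Dim ElementCount As Int
End Sub

'Hypothetical helper: parses the stream and logs how many elements ended.
Sub CountElements (In As InputStream)
    ElementCount = 0
    XmlParser.Initialize
    XmlParser.Parse(In, "XmlParser")
    'If Parse blocks, all EndElement events were already raised at this point.
    Log("Elements seen: " & ElementCount)
End Sub

Sub XmlParser_StartElement (Uri As String, Name As String, Attributes As Attributes)
End Sub

Sub XmlParser_EndElement (Uri As String, Name As String, Text As StringBuilder)
    ElementCount = ElementCount + 1
End Sub
```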
 

netchicken

For the life of me I can't parse this XML. I know it uses attributes, as all the data is within the tags; however, there are two main parts, the <platform> and the <Position>.

Might they have to be processed separately?

Does anyone have any ideas?

I have been trying to get the data out with the line below in Parser_StartElement:
PlatformTag = Attributes.GetValue2("", "PlatformTag")




B4X:
 <Platform PlatformTag="1" Name="City Exchange (A)">
    <Position Lat="-4.353379800000000e+001" Long="1.726375730000000e+002" />
  </Platform>
  <Platform PlatformTag="2" PlatformNo="37323" Name="City Exchange (B)" RoadName="Lichfield St">
    <Position Lat="-4.353411900000000e+001" Long="1.726373400000000e+002" />
  </Platform>
  <Platform PlatformTag="3" PlatformNo="37334" Name="City Exchange (C)" RoadName="Lichfield St">
    <Position Lat="-4.353411900000000e+001" Long="1.726371900000000e+002" />
  </Platform>
  <Platform PlatformTag="4" PlatformNo="39911" Name="City Exchange (D1)" BearingToRoad="2.7112744e+002" RoadName="Colombo St">
    <Position Lat="-4.353377900000000e+001" Long="1.726366870000000e+002" />
  </Platform>
  <Platform PlatformTag="5" PlatformNo="39924" Name="City Exchange (E2)" BearingToRoad="9.1127449e+001" RoadName="Colombo St">
    <Position Lat="-4.353318461000000e+001" Long="1.726365578000000e+002" />
  </Platform>
 

netchicken

OK, fixed it ... I think (I will post the fixed version when it's done).
The big issue is that element and attribute names are case sensitive, which got me.
Otherwise it is all working; maybe someone could use this if they find a similar structure.

Here is what I have so far ....

First lines of the datafile....

B4X:
<?xml version="1.0"?>
<JPPlatforms xmlns="urn:connexionz-co-nz:jp" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:connexionz-co-nz:jp JourneyPlanner.xsd">
  <Content Expires="2011-05-15T03:32:00" />
  <Platform PlatformTag="1" Name="City Exchange (A)">
    <Position Lat="-4.353379800000000e+001" Long="1.726375730000000e+002" />
  </Platform>
  <Platform PlatformTag="2" PlatformNo="37323" Name="City Exchange (B)" RoadName="Lichfield St">
    <Position Lat="-4.353411900000000e+001" Long="1.726373400000000e+002" />
  </Platform>
  <Platform PlatformTag="3" PlatformNo="37334" Name="City Exchange (C)" RoadName="Lichfield St">
    <Position Lat="-4.353411900000000e+001" Long="1.726371900000000e+002" />
  </Platform>
  <Platform PlatformTag="4" PlatformNo="39911" Name="City Exchange (D1)" BearingToRoad="2.7112744e+002" RoadName="Colombo St">
    <Position Lat="-4.353377900000000e+001" Long="1.726366870000000e+002" />
  </Platform>
  <Platform PlatformTag="5" PlatformNo="39924" Name="City Exchange (E2)" BearingToRoad="9.1127449e+001" RoadName="Colombo St">
    <Position Lat="-4.353318461000000e+001" Long="1.726365578000000e+002" />
  </Platform>

Under Globals
B4X:
Type BusStop(PlatformName As String, PlatformTag As Int, PlatformNo As Int, RoadName As String, Lat As Double, Lon As Double) 'Lat / Lon hold decimal values, so Double rather than Long
   
Dim ListOfBusStops As List
Dim ThisBusStop As BusStop



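Under StartElement and EndElement
B4X:
Sub Parser_StartElement (Uri As String, Name As String, Attributes As Attributes)
    If Name = "Platform" Then
        'Create a new BusStop for each Platform; reusing one instance would
        'fill the list with references to the same object.
        Dim bs As BusStop
        bs.Initialize
        'Attribute names are case sensitive ("Name", not "name").
        bs.PlatformTag = Attributes.GetValue2("", "PlatformTag")
        bs.PlatformName = Attributes.GetValue2("", "Name")
        'PlatformNo is missing from some Platform elements, so only assign it when numeric.
        Dim no As String = Attributes.GetValue2("", "PlatformNo")
        If IsNumber(no) Then bs.PlatformNo = no
        bs.RoadName = Attributes.GetValue2("", "RoadName")
        ThisBusStop = bs
    Else If Name = "Position" Then
        'Position is a child of Platform, so it belongs to the current stop.
        ThisBusStop.Lat = Attributes.GetValue2("", "Lat")
        ThisBusStop.Lon = Attributes.GetValue2("", "Long")
    End If
End Sub

Sub Parser_EndElement (Uri As String, Name As String, Text As StringBuilder)
    'Add the stop once, when its Platform element is complete.
    If Name = "Platform" Then
        ListOfBusStops.Add(ThisBusStop)
        Log("name " & ThisBusStop.PlatformName & " Lat " & ThisBusStop.Lat & " number " & ThisBusStop.PlatformNo)
    End If
End Sub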
 

Fabrice La

Hi

I have this xml file.
<total>
<items>
<item id='1' date='2012-01-16' time='21:26:00' duration='19:20 min'>
<itemcomputer mode='exit' deviceid='5ca8f36c' itemid='7fffffff'>
<haut max='19.6 m' mean='9.772 m' />
<sample time='0:20 min' haut='3.4 m' />
<sample time='0:20 min' haut='6.3 m' />
<sample time='0:20 min' haut='6.3 m' />
<sample time='0:20 min' haut='6.3 m' />
<sample time='31:20 min' haut='4.7 m' />
<sample time='31:20 min' haut='4.7 m' />
<sample time='31:20 min' haut='4.7 m' />
</itemcomputer>
</item>
<item id='45' rating='2' visibility='3' date='2013-05-10' time='10:07:00' duration='31:20 min'>
<location gps='43.043842 1.225603'>There</location>
<itemmaster>other</itemmaster>
<allience>History</allience>
<bout size='7.0 l' workpressure='200.0 ess' description='15L 200 ess' />
<weightsystem weight='5.0 kg' description='ceinture' />
<itemtemperature hot='19.0 C'/>
<itemcomputer mode='exit' deviceid='5ca8f36c' itemid='7fffffff'>
<haut max='30.7 m' mean='16.091 m' />
<sample time='0:20 min' haut='6.3 m' />
<sample time='31:20 min' haut='4.7 m' />
<sample time='31:20 min' haut='4.7 m' />
<sample time='31:20 min' haut='4.7 m' />
<sample time='31:20 min' haut='4.7 m' />
<sample time='31:20 min' haut='4.7 m' />
<sample time='31:20 min' haut='4.7 m' />
</itemcomputer>
</item>
<item id='33' rating='2' visibility='3' date='2013-05-10' time='10:07:00' duration='31:20 min'>
<location gps='43.043842 1.225603'>There</location>
<itemmaster>other</itemmaster>
<allience>History</allience>
<bout size='7.0 l' workpressure='200.0 ess' description='15L 200 ess' />
<weightsystem weight='5.0 kg' description='ceinture' />
<itemtemperature hot='19.0 C'/>
<itemcomputer mode='exit' deviceid='5ca8f36c' itemid='7fffffff'>
<haut max='30.7 m' mean='16.091 m' />
<sample time='0:20 min' haut='6.3 m' />
<sample time='31:20 min' haut='4.7 m' />
<sample time='31:20 min' haut='4.7 m' />
<sample time='31:20 min' haut='4.7 m' />
<sample time='31:20 min' haut='4.7 m' />
<sample time='31:20 min' haut='4.7 m' />
<sample time='31:20 min' haut='4.7 m' />
</itemcomputer>
</item>
</items>
</total>

1) I have to extract the id of each item --> OK

2) After choosing an id from the list, how do I extract the sample lines of that item only?

Any help?
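
One possible approach, sketched under assumptions: remember the id of the item currently being parsed in StartElement, and keep a 'sample' element only while that id matches the selected one. SelectedId, CurrentId, Samples, LoadSamples and the file name "data.xml" are hypothetical names introduced for illustration.

```b4x
Sub Process_Globals
    Dim parser As SaxParser
    Dim SelectedId As String 'hypothetical: the id picked in the list
    Dim CurrentId As String  'id of the item currently being parsed
    Dim Samples As List      'collected "time / haut" strings
End Sub

'Hypothetical helper: parses the file again and keeps only the samples of one item.
Sub LoadSamples (Id As String)
    SelectedId = Id
    CurrentId = ""
    Samples.Initialize
    parser.Initialize
    Dim in As InputStream
    in = File.OpenInput(File.DirAssets, "data.xml") 'assumed file name
    parser.Parse(in, "Parser")
    in.Close
End Sub

Sub Parser_StartElement (Uri As String, Name As String, Attributes As Attributes)
    If Name = "item" Then
        CurrentId = Attributes.GetValue2("", "id")
    Else If Name = "sample" And CurrentId = SelectedId Then
        Samples.Add(Attributes.GetValue2("", "time") & " / " & Attributes.GetValue2("", "haut"))
    End If
End Sub

Sub Parser_EndElement (Uri As String, Name As String, Text As StringBuilder)
    'Leaving an item: samples seen after this point belong to another item.
    If Name = "item" Then CurrentId = ""
End Sub
```

After calling LoadSamples("45"), the Samples list should hold one entry per sample element of the item whose id is 45.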
 

mterveen

sax parser in code module

(erel - feel free to delete double post in general forum area on this).

does the xml sax parser work in a code module? if so can someone get the xml example from post 1 to work as a module.

i have no trouble getting my program to work in an activity with the xmlsax parser but moving it to the code module does not seem to trigger the start and end events.
 

shawny

Having some issues getting this to work. I was able to use HttpJob to download my XML file and validated the file with w3schools. Whether I use an InputStream with Parse or a TextReader with Parse2, I get the same issue: a null reference. Can't for the life of me figure this out, need some help.

LogCat connected to: B4A-Bridge: samsung SAMSUNG-SGH-I537-356554050382940
--------- beginning of /dev/log/main
running waiting messages (3)
** Activity (main) Resume **
** Service (service1) Destroy **
** Service (service1) Create **
** Service (service1) Start **
Connected to B4A-Bridge (Wifi)
Installing file.
** Activity (main) Pause, UserClosed = false **
PackageAdded: package:b4a.example
** Activity (main) Create, isFirst = true **
** Activity (main) Resume **
** Activity (main) Pause, UserClosed = false **
** Activity (listpendingshipmentscrmodule) Create, isFirst = true **
** Activity (listpendingshipmentscrmodule) Resume **
Error occurred on line: 52 (listpendingshipmentscrmodule)
java.lang.NullPointerException
at anywheresoftware.b4a.objects.SaxParser.parse(SaxParser.java:78)
at anywheresoftware.b4a.objects.SaxParser.Parse2(SaxParser.java:88)
at java.lang.reflect.Method.invokeNative(Native Method)
at java.lang.reflect.Method.invoke(Method.java:511)
at anywheresoftware.b4a.shell.Shell.runMethod(Shell.java:485)
at anywheresoftware.b4a.shell.Shell.raiseEventImpl(Shell.java:232)
at anywheresoftware.b4a.shell.Shell.raiseEvent(Shell.java:174)
at java.lang.reflect.Method.invokeNative(Native Method)
at java.lang.reflect.Method.invoke(Method.java:511)
at anywheresoftware.b4a.ShellBA.raiseEvent2(ShellBA.java:93)
at anywheresoftware.b4a.BA.raiseEvent2(BA.java:158)
at anywheresoftware.b4a.BA.raiseEvent(BA.java:154)
at anywheresoftware.b4a.objects.ViewWrapper$1.onClick(ViewWrapper.java:64)
at android.view.View.performClick(View.java:4354)
at android.view.View$PerformClick.run(View.java:17961)
at android.os.Handler.handleCallback(Handler.java:725)
at android.os.Handler.dispatchMessage(Handler.java:92)
at android.os.Looper.loop(Looper.java:137)
at android.app.ActivityThread.main(ActivityThread.java:5328)
at java.lang.reflect.Method.invokeNative(Native Method)
at java.lang.reflect.Method.invoke(Method.java:511)
at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:1102)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:869)
at dalvik.system.NativeStart.main(Native Method)
 

shawny

Initializing the object worked. But now it's telling me the same after all items have been looped thru. Thanks NJDude
 

shawny

Got it working now ;) Had to move some declarations around. Gotta say, pretty impressed with the IDE as well as the community on this. Recently purchased it after 2 days of usage, got it integrating with my XML API I've been using for years in PHP. Have been able to bridge the gap now to Android Devices. Thanks b4a.
 

GMan

Cool Lib - works from the scratch :D
 

jcredk

Hi all,
This library is great ... I am using it inside a kind of RSS Reader/Concentrator I am doing.

Although I have read all the pages of this thread, I still have encoding issues, on some feeds only.

I reused the sample from the beginning of the thread so you can see the behavior I get ...
Basically the application downloads the RSS feed of a French newspaper (the text contains French accented letters such as é, è, à ...), and these letters are not handled properly.

The attachment below contains the test application, closely based on the sample in the first post ... The accented letters are replaced with a question mark inside a diamond (cf. screenshot) ...!

I tried many things with an intermediate stream to handle the encoding, but without success ...
... Any help from those who have already solved this issue will be greatly appreciated!

Thanks,
Screenshot_2014-03-18-09-39-01.png
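
A question mark inside a diamond usually means the bytes were decoded with the wrong character set. One thing to try, sketched here under assumptions, is to wrap the stream in a TextReader created with an explicit encoding and pass it to Parse2. ParseWithEncoding is a hypothetical helper, "parser" is assumed to be an already initialized SaxParser as in the first post, and the right encoding name depends on what the feed actually declares.

```b4x
'Assumes "parser" is the initialized SaxParser from the first post.
Sub ParseWithEncoding (In As InputStream, Encoding As String)
    Dim reader As TextReader
    'Decode the bytes with the given character set, e.g. "ISO-8859-1" or "UTF-8".
    reader.Initialize2(In, Encoding)
    parser.Parse2(reader, "Parser")
    reader.Close
End Sub
```

For a feed served as ISO-8859-1, the call could look like ParseWithEncoding(Job.GetInputStream, "ISO-8859-1"), where Job is the finished HttpJob.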
 

Attachments

  • XmlSaxTest.zip
    1.1 KB