B4J Library [B4X] Xml2Map - Simple way to parse XML documents

Status
Not open for further replies.
Nobody likes to parse XML.

Parsing JSON is simple and fun. Parsing XML is tedious and boring.

That is the reason behind the Xml2Map class. It internally parses the XML document and returns a Map with the parsed data. It is similar to parsing JSON.
Tip: You can use this tool to help you with parsing JSON: https://b4x.com:51041/json/index.html

So instead of the code explained in the old tutorial: https://www.b4x.com/android/forum/threads/xml-parsing-with-the-xmlsax-library.6866/#content

We can achieve the same thing with this code:
B4X:
Sub Process_Globals
   Private ParsedData As Map
End Sub

Sub Globals
   Private ListView1 As ListView
End Sub

Sub Activity_Create(FirstTime As Boolean)
   If FirstTime Then
     Dim xm As Xml2Map
     xm.Initialize
     xm.StripNamespaces = True '<--- new in v1.01
     ParsedData = xm.Parse(File.ReadString(File.DirAssets, "rss.xml"))
   End If
   Activity.LoadLayout("1")
   ListView1.SingleLineLayout.ItemHeight = 60dip
   Dim rss As Map = ParsedData.Get("rss")
   Dim channel As Map = rss.Get("channel")
   Dim items As List = channel.Get("item")
   For Each item As Map In items
     Dim title As String = item.Get("title")
     Dim link As String = item.Get("link")
     ListView1.AddSingleLine2(title, link)
   Next
End Sub

Sub ListView1_ItemClick (Position As Int, Value As Object)
   Dim pi As PhoneIntents
   StartActivity(pi.OpenBrowser(Value))
End Sub

You can use the JSON library to convert the Map to a JSON string. This is useful for understanding how to access the data:
B4X:
Dim jg As JSONGenerator
jg.Initialize(ParsedData)
Log(jg.ToPrettyString(4))

The result in this case will look like:
"rss": {
"Attributes": {
"version": "2.0"
},
"channel": {
"title": "Basic4ppc \/ Basic4android - Android programming",
"link": "http:\/\/www.b4x.com\/forum",
"description": "Basic4android - android programming and development",
"language": "en",
"lastBuildDate": "Sun, 12 Dec 2010 10:19:27 GMT",
"generator": "vBulletin",
"ttl": "60",
"image": {
"url": "http:\/\/www.b4x.com\/forum\/images\/misc\/rss.jpg",
"title": "Basic4ppc \/ Basic4android - Android programming",
"link": "http:\/\/www.b4x.com\/forum"
},
"item": [
{
"title": "Phone library was updated - V1.10",
"link": "http:\/\/www.b4x.com\/forum\/additional-libraries-official-updates\/6859-phone-library-updated-v1-10-a.html",
"pubDate": "Sun, 12 Dec 2010 09:27:38 GMT",
"description": "An Intent object was added. This allows creating custom intents for interacting with external applications and services.\n\nInstallation...",
"encoded": "<div>An Intent object was added...",
"category": {
"Attributes": {
"domain": "http:\/\/www.b4x.com\/forum\/additional-libraries-official-updates\/"
},
"Text": "Additional libraries and official updates"
},
"creator": "Erel",
"guid": {
"Attributes": {
"isPermaLink": "true"
},
"Text": "http:\/\/www.b4x.com\/forum\/additional-libraries-official-updates\/6859-phone-library-updated-v1-10-a.html"
}
MORE ITEMS HERE

Note that attributes are added under the Attributes key. In such cases the text will be available under the Text key.
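As a minimal sketch of this rule, assuming the rss.xml from the example above (with several item elements) and a hypothetical helper named LogFirstGuid, the guid element of the first item could be read like this:

```b4x
'Hypothetical helper, not part of Xml2Map. Parsed is the Map returned by xm.Parse.
Sub LogFirstGuid (Parsed As Map)
   Dim rss As Map = Parsed.Get("rss")
   Dim channel As Map = rss.Get("channel")
   'Assumes the feed has more than one item, so "item" holds a List.
   Dim items As List = channel.Get("item")
   Dim firstItem As Map = items.Get(0)
   'guid has an attribute, so it is stored as a Map with Attributes and Text keys.
   Dim guid As Map = firstItem.Get("guid")
   Dim attrs As Map = guid.Get("Attributes")
   Dim isPermaLink As String = attrs.Get("isPermaLink")
   Dim guidUrl As String = guid.Get("Text")
   Log("isPermaLink=" & isPermaLink & ", url=" & guidUrl)
   'title has no attributes, so it is stored directly as a String.
   Dim title As String = firstItem.Get("title")
   Log(title)
End Sub
```

Elements without attributes (such as title) are read directly as strings, while elements with attributes (such as guid and category) need the extra step through the Text key.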

This module is compatible with B4A, B4J and B4i.
It depends on XmlSax library (which is included in the IDE).



Edit (October 2017):

Common pitfall


Consider this xml:
B4X:
<root>
<book>
   <title>Book 1</title>
</book>
<book>
   <title>Book 2</title>
</book>
</root>

There could be any number of book elements.
You can parse it with:
B4X:
Dim root As Map = ParsedData.Get("root")
For Each book As Map In root.Get("book")
Dim title As String = book.Get("title")
Next
However this code will fail in two cases:
1. There is only one book in the xml so root.Get("book") will return a Map instead of a List.
2. There are no books at all so root.Get("book") will return Null.

To solve this issue you can use this sub:
B4X:
Sub GetElements (m As Map, key As String) As List
   Dim res As List
   If m.ContainsKey(key) = False Then
     res.Initialize
     Return res
   Else
     Dim value As Object = m.Get(key)
     If value Is List Then Return value
     res.Initialize
     res.Add(value)
     Return res
   End If
End Sub
It will return a list in all cases.
You can safely use it with:
B4X:
Dim root As Map = ParsedData.Get("root")
For Each book As Map In GetElements(root, "book")
Dim title As String = book.Get("title")
Next


Map2Xml - New class!

Map2Xml converts the map created with Xml2Map to an XML string. It uses the XmlBuilder library and is compatible with B4A, B4i and B4J.
It can be used to modify existing XML documents. You read the document with Xml2Map, make the changes in the returned map and write it back with Map2Xml.

It is an internal library now.

Updates:

- v1.01 - New StripNamespaces property. When set to True, the namespaces are stripped from keys and attributes. It is recommended to set it to True. When namespaces are kept, the behavior differs between B4A, B4J and B4i.
 

Attachments

  • Xml2Map.b4xlib
    2.2 KB · Views: 393

samikinikar

Member
Licensed User
Longtime User
Thank you. Yes, there was an error in the HTML tags: an unknown tag was causing the issue. I replaced it and it's working fine now. I also checked the uploaded format.xml file with xmlvalidation.com and it displays the error, so I'm not sure how you did not encounter any issue.

Anyway thanks once again.
 

madru

Active Member
I like the idea of parsing XML that way :) .... but:

How can I get the "channel id" from this XML example?

B4X:
<?xml version="1.0" encoding="utf-8"?>
<Channels resultCount="335" xmlns="urn:stream:vasaweb:1.0">
  <Channel id="111_BBC_One">
    <ParentChannelCount>0</ParentChannelCount>
    <ChildChannelCount>0</ChildChannelCount>
    <EventCount>123</EventCount>
    <TstvEventCount>0</TstvEventCount>
    <AllEventCount>123</AllEventCount>
    <ProductCount>0</ProductCount>
    <RollingBufferCount>0</RollingBufferCount>
    <IsAdult>false</IsAdult>
    <Is3D>false</Is3D>
    <IsHD>false</IsHD>
    <IsAudioOnly>false</IsAudioOnly>
    <Name>Rai Uno</Name>
    <Aliases>
      <Alias type="IngestedServiceId">0008</Alias>
    </Aliases>
    <CustomProperties>
      <CustomProperty href="urn:stream:metadata:cs:PropertyCS:2010:otherIdentifier">0008</CustomProperty>
    </CustomProperties>
    <IsViewableOnCpe>false</IsViewableOnCpe>
    <PlayInfos />
  </Channel>
</Channels>

THX

M
 

madru

Active Member
B4X:
(MyMap) {Attributes={id=111_BBC_One}, ParentChannelCount=0, ChildChannelCount=0, EventCount=123, TstvEventCount=0, AllEventCount=123, ProductCount=0, RollingBufferCount=0, IsAdult=false, Is3D=false, IsHD=false, IsAudioOnly=false, Name=Rai Uno, Aliases={Alias={Attributes={type=IngestedServiceId}, Text=0008}}, CustomProperties={CustomProperty={Attributes={href=urn:stream:metadata:cs:PropertyCS:2010:otherIdentifier}, Text=0008}}, IsViewableOnCpe=false, PlayInfos=}
= null
 

madru

Active Member
THX

I overlooked the Attributes key, sorry for the stupid question .... ;)
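For other readers, a rough sketch of reading the id through the Attributes key. It assumes the XML posted above, the GetElements sub from the first post, and a hypothetical sub name LogChannelIds:

```b4x
'Hypothetical helper. Parsed is the Map returned by xm.Parse for the Channels document.
Sub LogChannelIds (Parsed As Map)
   Dim channels As Map = Parsed.Get("Channels")
   'GetElements (first post) returns a List even when there is only one Channel element.
   For Each channel As Map In GetElements(channels, "Channel")
      'XML attributes of the element are collected under the Attributes key.
      Dim attrs As Map = channel.Get("Attributes")
      Dim id As String = attrs.Get("id")
      Log(id)
   Next
End Sub
```

Using GetElements here avoids the Map versus List problem when the number of Channel elements changes.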
 

madru

Active Member
Another stupid question .....

I am struggling to get my code working. As a last resort I tried the originally posted example (code and XML), but I get the same error every time:

Does the example really work?


** Service (starter) Create **
** Service (starter) Start **
** Activity (main) Create, isFirst = true **
Error occurred on line: 33 (Main)
java.lang.ClassCastException: anywheresoftware.b4a.objects.collections.Map$MyMap cannot be cast to java.util.List
at b4a.example.main._activity_create(main.java:411)
at java.lang.reflect.Method.invoke(Native Method)
at java.lang.reflect.Method.invoke(Method.java:372)
at anywheresoftware.b4a.shell.Shell.runMethod(Shell.java:710)
at anywheresoftware.b4a.shell.Shell.raiseEventImpl(Shell.java:342)
at anywheresoftware.b4a.shell.Shell.raiseEvent(Shell.java:249)
at java.lang.reflect.Method.invoke(Native Method)
at java.lang.reflect.Method.invoke(Method.java:372)
at anywheresoftware.b4a.ShellBA.raiseEvent2(ShellBA.java:134)
at b4a.example.main.afterFirstLayout(main.java:102)
at b4a.example.main.access$000(main.java:17)
at b4a.example.main$WaitForLayout.run(main.java:80)
at android.os.Handler.handleCallback(Handler.java:739)
at android.os.Handler.dispatchMessage(Handler.java:95)
at android.os.Looper.loop(Looper.java:135)
at android.app.ActivityThread.main(ActivityThread.java:5294)
at java.lang.reflect.Method.invoke(Native Method)
at java.lang.reflect.Method.invoke(Method.java:372)
at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:904)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:699)
** Activity (main) Resume **

B4X:
 Sub Activity_Create(FirstTime As Boolean)
    If FirstTime Then
        Dim xm As Xml2Map
        xm.Initialize
        ParsedData = xm.Parse(File.ReadString(File.DirAssets, "rss.xml"))
    End If
    Activity.LoadLayout("1")
    ListView1.SingleLineLayout.ItemHeight = 60dip
    Dim rss As Map = ParsedData.Get("rss")
    Dim channel As Map = rss.Get("channel")
    Dim items As List = channel.Get("item")
    For Each item As Map In items ' <- this is 33
        Dim title As String = item.Get("title")
        Dim link As String = item.Get("link")
        ListView1.AddSingleLine2(title, link)
    Next
End Sub

B4X:
(MyMap) {title=Phone library was updated - V1.10, link=http://www.b4x.com/forum/additional-libraries-official-updates/6859-phone-library-updated-v1-10-a.html, pubDate=Sun, 12 Dec 2010 09:27:38 GMT, guid={Attributes={isPermaLink=true}, Text=http://www.b4x.com/forum/additional-libraries-official-updates/6859-phone-library-updated-v1-10-a.html}}


help please
 

madru

Active Member
sure, 1:1 what you have in your example

B4X:
<?xml version="1.0" encoding="ISO-8859-1"?>
<rss version="2.0">
    <channel>
        <title>Basic4ppc  / Basic4android - Android programming</title>
        <link>http://www.b4x.com/forum</link>
        <description>Basic4android - android programming and development</description>
        <ttl>60</ttl>
        <image>
            <url>http://www.b4x.com/forum/images/misc/rss.jpg</url>
            <title>Basic4ppc  / Basic4android - Android programming</title>
            <link>http://www.b4x.com/forum</link>
        </image>
        <item>
            <title>Phone library was updated - V1.10</title>
            <link>http://www.b4x.com/forum/additional-libraries-official-updates/6859-phone-library-updated-v1-10-a.html</link>
            <pubDate>Sun, 12 Dec 2010 09:27:38 GMT</pubDate>
            <guid isPermaLink="true">http://www.b4x.com/forum/additional-libraries-official-updates/6859-phone-library-updated-v1-10-a.html</guid>
        </item>
    </channel>
</rss>
 

Erel

B4X founder
Staff member
"does the example really work ?"
Yes.

"sure, 1:1 what you have in your example"
Not exactly the same. The rss from the example (attached to this post) has multiple items, so channel.Get("item") returns a List of maps instead of a single Map.
 

Attachments

  • rss.xml
    22.2 KB · Views: 632

madru

Active Member
OK, makes sense - wrong example :(

But this raises another question: if you get dynamic data that contains a single item in one case and multiple items in another, how do you distinguish between a List and a Map?
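One possible way, sketched here with a hypothetical sub name, is to fetch the value as an Object and test its type with the Is keyword before casting:

```b4x
'Hypothetical helper. channel is the Map stored under the "channel" key.
Sub DescribeItems (channel As Map)
   Dim value As Object = channel.Get("item")
   If value Is List Then
      'Two or more item elements were found.
      Dim items As List = value
      Log("Multiple items: " & items.Size)
   Else If value Is Map Then
      'Exactly one item element was found.
      Dim item As Map = value
      Log("Single item: " & item.Get("title"))
   Else
      'The key is missing, so Get returned Null.
      Log("No item elements")
   End If
End Sub
```

This is the same check that the GetElements sub in the first post wraps, so in most cases calling GetElements and always iterating over a List is simpler.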
 

mcqueccu

Well-Known Member
I need help outputting the lyrics of a song. I get the title and the author and I can navigate to the verse, but when I try to output the lyrics between the <lines> tags, I get these commas without the lyrics.
I attached the XML file and an image showing how the song is presented before I exported it to XML. I wish to display it in the same way.

B4X:
** Service (starter) Create **
** Activity (main) Create, isFirst = true **
{br=[, , ]}
{br=[, , ]}
{br=[, , ]}
{br=[, , ]}
{br=[, , ]}
{br=[, , ]}
{br=[, , , , ]}
** Activity (main) Resume **
** Service (starter) Start **

This is my code:

B4X:
  xm.Initialize
   parseData = xm.Parse(File.ReadString(File.DirAssets,"Hymn1.xml"))

   Dim song As Map = parseData.Get("song")
   Dim properties As Map = song.Get("properties")
   Dim titles As Map = properties.Get("titles")
   Dim authors As Map = properties.Get("authors")

   'Get the title and author information and output them to the user
     Dim songTitle As String = titles.Get("title")
     Dim author As String = authors.Get("author")

     Log("Author: " & author &", Title: " & songTitle)

   Dim lyrics As Map = song.Get("lyrics")
   Dim verseList As List = lyrics.Get("verse")

   For Each verse As Map In verseList
     Log(verse.Get("lines"))
   Next
 

Attachments

  • hymn1.xml
    1.8 KB · Views: 612
  • xml preview.jpg
    54 KB · Views: 644

Erel

B4X founder
Staff member
The unescaped <br/> breaks the parser. You can escape the lines text with CDATA:
B4X:
Dim parseData As Map = xm.Parse(AddCDATAToLines(File.ReadString(File.DirAssets, "hymn1.xml")))

B4X:
Sub AddCDATAToLines(s As String) As String
   Return Regex.Replace2("^\s*<lines>(.*)</lines>\s*$", Regex.MULTILINE, s, "<lines><![CDATA[$1]]></lines>")
End Sub
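Once the text is wrapped in CDATA, the <br/> tags should arrive as plain text inside the lines value. A rough sketch of printing each verse with real line breaks follows. It assumes that every <lines> element sits on a single line of the file (the regex above needs this), that it has no attributes (so the value is a String rather than a Map), and that there is more than one verse. LogVerses is a hypothetical name:

```b4x
'Hypothetical helper. verseList is the List stored under lyrics.Get("verse").
Sub LogVerses (verseList As List)
   For Each verse As Map In verseList
      'With CDATA the value is the raw text, including the literal <br/> tags.
      Dim lines As String = verse.Get("lines")
      'Turn each <br/> into a line break for display.
      Log(lines.Replace("<br/>", CRLF))
   Next
End Sub
```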
 

ivan.tellez

Active Member
Licensed User
Longtime User
It's really easy to use for parsing documents. Thanks.

I just have a little issue: I have to parse XML documents, all with the same content, but generated by many sources.

So in the map, when using:

B4X:
ParsedData.Get("Key")
'Or
ParsedData.ContainsKey("Key")

It will break because the documents use different casing (KEY, Key, key, etc.).

For now I can only think of changing the class to:

B4X:
att.Put(Attributes.GetName(i).ToUpperCase, Attributes.GetValue(i))

What is the best approach to this?

Thanks
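One alternative that leaves the class untouched is a small lookup helper that compares keys while ignoring case. This is only a sketch and GetIgnoreCase is a hypothetical name, not part of Xml2Map:

```b4x
'Hypothetical helper: returns the value of the first key that matches
'the requested key when case is ignored, or Null if there is no such key.
Sub GetIgnoreCase (m As Map, key As String) As Object
   For Each k As String In m.Keys
      If k.EqualsIgnoreCase(key) Then Return m.Get(k)
   Next
   Return Null
End Sub
```

It would be called as GetIgnoreCase(ParsedData, "Key") instead of ParsedData.Get("Key"). The lookup scans all keys, so it is slower than a direct Get, which is unlikely to matter for small documents.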
 