Android Question How to parse "strange" xml

vecino

Well-Known Member
Licensed User
Longtime User
Hi, a SOAP web service returns a strange XML response and I can't extract the information from it.
What can I use to get it?
Thank you very much.

XML:
<?xml version="1.0"?>
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/">
  <SOAP-ENV:Body SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/" xmlns:NS1="urn:wsNewGES2000Intf-IwsNewGES2000">
    <NS1:PDA_DatosArticuloResponse xmlns:NS2="urn:wsNewGES2000Intf">
      <NS2:PDADatosArticulo id="1" xsi:type="NS2:PDADatosArticulo">
        <Codigo xsi:type="xsd:string">B1001</Codigo>
        <Nombre xsi:type="xsd:string">PAT. LISA FAMILIAR 260 GRS X 12 B.</Nombre>
        <Existencias xsi:type="xsd:double">0</Existencias>
        <Costo xsi:type="xsd:double">1.2</Costo>
        <Margen0 xsi:type="xsd:double">3</Margen0>
        <PVP0 xsi:type="xsd:double">1.212</PVP0>
        <Margen1 xsi:type="xsd:double">2</Margen1>
        <PVP1 xsi:type="xsd:double">1.224</PVP1>
        <Margen2 xsi:type="xsd:double">0</Margen2>
        <PVP2 xsi:type="xsd:double">1.236</PVP2>
        <Margen3 xsi:type="xsd:double">4</Margen3>
        <PVP3 xsi:type="xsd:double">1.248</PVP3>
      </NS2:PDADatosArticulo>
      <return href="#1"/>
    </NS1:PDA_DatosArticuloResponse>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
 

Daestrum

Expert
Licensed User
Longtime User
Maybe strip the prefixes before calling Xml2Map:
B4X:
xml = xml.Trim.Replace("NS2:","").Replace("xsi:","").Replace("xsd:","")
 

sirjo66

Well-Known Member
Licensed User
Longtime User
Hi vecino,
you could try doing it with Regex.
 

TILogistic

Expert
Licensed User
Longtime User
Test this.
New: set StripNamespaces = True on Xml2Map.

Then you only need:
sXML = sXML.Trim.Replace("xsi:","").Replace("xsd:","")
B4X:
    Dim sXML As String = $"
    <?xml version="1.0"?>
    <SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/">
      <SOAP-ENV:Body SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/" xmlns:NS1="urn:wsNewGES2000Intf-IwsNewGES2000">
        <NS1:PDA_DatosArticuloResponse xmlns:NS2="urn:wsNewGES2000Intf">
          <NS2:PDADatosArticulo id="1" xsi:type="NS2:PDADatosArticulo">
            <Codigo xsi:type="xsd:string">B1001</Codigo>
            <Nombre xsi:type="xsd:string">PAT. LISA FAMILIAR 260 GRS X 12 B.</Nombre>
            <Existencias xsi:type="xsd:double">0</Existencias>
            <Costo xsi:type="xsd:double">1.2</Costo>
            <Margen0 xsi:type="xsd:double">3</Margen0>
            <PVP0 xsi:type="xsd:double">1.212</PVP0>
            <Margen1 xsi:type="xsd:double">2</Margen1>
            <PVP1 xsi:type="xsd:double">1.224</PVP1>
            <Margen2 xsi:type="xsd:double">0</Margen2>
            <PVP2 xsi:type="xsd:double">1.236</PVP2>
            <Margen3 xsi:type="xsd:double">4</Margen3>
            <PVP3 xsi:type="xsd:double">1.248</PVP3>
          </NS2:PDADatosArticulo>
          <return href="#1"/>
        </NS1:PDA_DatosArticuloResponse>
      </SOAP-ENV:Body>
    </SOAP-ENV:Envelope>
    "$

    sXML = sXML.Trim.Replace("xsi:","").Replace("xsd:","")

    Dim Xml2Map1 As Xml2Map
    Xml2Map1.Initialize
    Xml2Map1.StripNamespaces = True
    Dim mRoot As Map = Xml2Map1.Parse(sXML)
   
    Dim Envelope As Map = mRoot.Get("Envelope")
    Dim Body As Map = Envelope.Get("Body")
    Dim PDA_DatosArticuloResponse As Map = Body.Get("PDA_DatosArticuloResponse")
    Dim PDADatosArticulo As Map = PDA_DatosArticuloResponse.Get("PDADatosArticulo")
   
    Log(PDADatosArticulo.Get("Nombre").As(Map).Get("Text"))
    Log(PDADatosArticulo.Get("Margen0").As(Map).Get("Text"))

TILogistic

Expert
Licensed User
Longtime User
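Note:
xsd: is the prefix bound to the XML Schema namespace (http://www.w3.org/2001/XMLSchema), which defines the data types of the elements (string, double, etc.)
xsi: is the prefix bound to the XML Schema instance namespace (http://www.w3.org/2001/XMLSchema-instance), used here for the xsi:type attributes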
 

emexes

Expert
Licensed User
What can I use to get it?

For unnested (single data record) XML I usually just do it "manually", e.g. with a function like this:
B4X:
Sub GetXMLField(XML As String, FieldName As String) As String
    Dim P1 As Int = XML.IndexOf("<" & FieldName & " ")
    If P1 > -1 Then
        Dim P2 As Int = XML.IndexOf2(">", P1)
        If P2 > -1 Then
            Dim P3 As Int = XML.IndexOf2("</" & FieldName & ">", P2)    'search only after the opening tag
            If P3 > -1 Then
                Return XML.SubString2(P2 + 1, P3)
            End If
        End If
    End If
 
    Return "XML field " & FieldName & " not found"    'or whatever you prefer to signify not found
End Sub

which on your sample data:
B4X:
Dim SampleXML As String = $"<?xml version="1.0"?>
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/">
  <SOAP-ENV:Body SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/" xmlns:NS1="urn:wsNewGES2000Intf-IwsNewGES2000">
    <NS1:PDA_DatosArticuloResponse xmlns:NS2="urn:wsNewGES2000Intf">
      <NS2:PDADatosArticulo id="1" xsi:type="NS2:PDADatosArticulo">
        <Codigo xsi:type="xsd:string">B1001</Codigo>
        <Nombre xsi:type="xsd:string">PAT. LISA FAMILIAR 260 GRS X 12 B.</Nombre>
        <Existencias xsi:type="xsd:double">0</Existencias>
        <Costo xsi:type="xsd:double">1.2</Costo>
        <Margen0 xsi:type="xsd:double">3</Margen0>
        <PVP0 xsi:type="xsd:double">1.212</PVP0>
        <Margen1 xsi:type="xsd:double">2</Margen1>
        <PVP1 xsi:type="xsd:double">1.224</PVP1>
        <Margen2 xsi:type="xsd:double">0</Margen2>
        <PVP2 xsi:type="xsd:double">1.236</PVP2>
        <Margen3 xsi:type="xsd:double">4</Margen3>
        <PVP3 xsi:type="xsd:double">1.248</PVP3>
      </NS2:PDADatosArticulo>
      <return href="#1"/>
    </NS1:PDA_DatosArticuloResponse>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
"$

and tested using:
B4X:
Log("VENDITA!!! --- " & GetXMLField(SampleXML, "Nombre") & " --- VENDITA!!!")

Dim FieldName() As String = Array As String("Codigo", "Nombre", "Existencias", "Costo", "Margen3", "margen3")
For Each S As String In FieldName
    Log(S & " = " & GetXMLField(SampleXML, S))
Next

returns this:
Log output:
Waiting for debugger to connect...
Program started.
VENDITA!!! --- PAT. LISA FAMILIAR 260 GRS X 12 B. --- VENDITA!!!
Codigo = B1001
Nombre = PAT. LISA FAMILIAR 260 GRS X 12 B.
Existencias = 0
Costo = 1.2
Margen3 = 4
margen3 = XML field margen3 not found
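
GetXMLField returns everything as a String, including the not-found message, so numeric fields such as Costo need a check before they are used in calculations. A small sketch, assuming the GetXMLField sub above is in the same module; GetXMLNumber and its DefaultValue parameter are hypothetical names, not part of any library:

```b4x
' Hypothetical helper: returns the field as a Double, or DefaultValue
' when the field is missing or its text is not numeric.
Sub GetXMLNumber(XML As String, FieldName As String, DefaultValue As Double) As Double
    Dim Raw As String = GetXMLField(XML, FieldName)
    If IsNumber(Raw) Then
        Return Raw    'B4X converts the numeric string to a Double
    End If
    Return DefaultValue
End Sub

' Possible usage with the sample data:
' Dim Costo As Double = GetXMLNumber(SampleXML, "Costo", 0)
```

Returning a caller-supplied default keeps the "not found" text out of the arithmetic; a different signal (for example a negative value) may suit your app better.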
 

TILogistic

Expert
Licensed User
Longtime User
Another way to do it, with Regex:
B4X:
    Dim Pattern As String = $"<(?<tagName>\w+)(?<attributes>.*?)>(?<Value>.*)<\/\w+>"$
    Dim XMLMatcher As Matcher = Regex.Matcher(Pattern, SampleXML)
    Do While XMLMatcher.Find
        Log($"${XMLMatcher.Group(1)}: ${XMLMatcher.Group(3)}"$)
    Loop

TILogistic

Expert
Licensed User
Longtime User
Another option (using a Map):
B4X:
    Dim DataXML As Map : DataXML.Initialize
    Dim Pattern As String = $"<(?<tagName>\w+)(?<attributes>.*?)>(?<Value>.*)<\/\w+>"$
    Dim MatcherXML As Matcher = Regex.Matcher(Pattern, SampleXML)
    Do While MatcherXML.Find
        DataXML.Put(MatcherXML.Group(1), MatcherXML.Group(3))
    Loop

    For Each Key As String In DataXML.Keys
        Log($"${Key}: ${DataXML.Get(Key)}"$)
    Next
   
    Log("------------")

    Log("Nombre " & DataXML.Get("Nombre"))
    Log("Existencias " & DataXML.Get("Existencias"))
    Log("PVP0 " & DataXML.Get("PVP0"))


TILogistic

Expert
Licensed User
Longtime User
B4X:
Dim Pattern As String = $"<(?<tagName>\w+)(?<attributes>.*?)>(?<Value>.*)<\/\w+>"$

That pattern is a work of art and possibly magic too, but I still can't work out what the third question mark is for.
There are actually four pieces of information that the regex collects: group 0 is the whole match, group 1 the tag name, group 2 the attributes and group 3 the value.

I took this routine from one of my apps, where I needed to know the type given in the attribute of each field (string, double, etc.).


See:
<(?<tagName>\w+)(?<attributes>.*?)>(?<Value>.*)
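
The type lookup mentioned above could be sketched roughly as follows. This is not the original routine, which is not reproduced in the thread text. LogFieldTypes and its pattern are made up for illustration, and they assume that every field carries an xsi:type="xsd:..." attribute and a plain-text value, as in the sample response:

```b4x
' Hypothetical sketch: logs "name (type): value" for every simple field.
Sub LogFieldTypes(XML As String)
    ' Group 1 = tag name, group 2 = type after "xsd:", group 3 = value.
    ' \1 requires the closing tag to repeat the name captured in group 1.
    Dim Pattern As String = $"<(\w+)\s+xsi:type="xsd:(\w+)">([^<]*)</\1>"$
    Dim M As Matcher = Regex.Matcher(Pattern, XML)
    Do While M.Find
        Log($"${M.Group(1)} (${M.Group(2)}): ${M.Group(3)}"$)
    Loop
End Sub
```

Container elements such as NS2:PDADatosArticulo are skipped because the colon in their name stops \w+ before the whitespace that the pattern expects.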
 

emexes

Expert
Licensed User
<(?<tagName>\w+)(?<attributes>.*?)>(?<Value>.*)<\/\w+>

I still can't work out what the third question mark is for. Not that it matters; just curious from a learn-something-new-every-day perspective.
 

TILogistic

Expert
Licensed User
Longtime User
(?<attributes>.*?) ==>> the attributes are optional, because * also allows zero characters

The ? sign after .* makes it lazy:
.*? matches the previous token between zero and unlimited times, as few times as possible, expanding as needed (lazy)

Without the ?:
.* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)

Note:
this regex is for matching
<name attributes>value</name>
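
The difference between the lazy and greedy forms is easy to see on a single line of the sample. A minimal sketch; the sub name LazyVsGreedy and the shortened patterns are only for illustration:

```b4x
Sub LazyVsGreedy
    Dim Line As String = $"<Costo xsi:type="xsd:double">1.2</Costo>"$

    ' Lazy: .*? stops at the first ">" it can.
    Dim Lazy As Matcher = Regex.Matcher("<\w+(.*?)>", Line)
    If Lazy.Find Then Log("lazy: [" & Lazy.Group(1) & "]")
    ' logs: lazy: [ xsi:type="xsd:double"]

    ' Greedy: .* runs to the end of the line and backs up to the last ">".
    Dim Greedy As Matcher = Regex.Matcher("<\w+(.*)>", Line)
    If Greedy.Find Then Log("greedy: [" & Greedy.Group(1) & "]")
    ' logs: greedy: [ xsi:type="xsd:double">1.2</Costo]
End Sub
```

With the greedy form the "attributes" group swallows the value and most of the closing tag, which is why the lazy form is used for that group.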


vecino

Well-Known Member
Licensed User
Longtime User
Hi folks, I am overwhelmed by so much valuable information and so many possibilities to get that data extracted.
I am now in a dilemma to decide for one option or another.
That's awesome, thank you so much.
:)
 

emexes

Expert
Licensed User
decide for one option or another

Tbh I think a combination would be best for your probable use-case:

B4X:
Sub GetXMLField(XML As String, FieldName As String) As String
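    ' \b stops "PVP" from matching "<PVP0"; the closing tag must carry the same name
    Dim Pattern As String = "<" & FieldName & "\b[^>]*>(.*?)</" & FieldName & ">"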
    Dim XMLMatcher As Matcher = Regex.Matcher(Pattern, XML)
    If XMLMatcher.Find Then
        Return XMLMatcher.Group(1)
    End If
    Return "XML field " & FieldName & " not found"    'or whatever you prefer to signify not found
End Sub

as long as you don't mind having to trust that the regex pattern gobbledygook is real programming and not some arbitrary magical incantation.
 

sirjo66

Well-Known Member
Licensed User
Longtime User
try this:


B4X:
    Dim xml As String = $"
    <?xml version="1.0"?>
    <SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/">
      <SOAP-ENV:Body SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/" xmlns:NS1="urn:wsNewGES2000Intf-IwsNewGES2000">
        <NS1:PDA_DatosArticuloResponse xmlns:NS2="urn:wsNewGES2000Intf">
          <NS2:PDADatosArticulo id="1" xsi:type="NS2:PDADatosArticulo">
            <Codigo xsi:type="xsd:string">B1001</Codigo>
            <Nombre xsi:type="xsd:string">PAT. LISA FAMILIAR 260 GRS X 12 B.</Nombre>
            <Existencias xsi:type="xsd:double">0</Existencias>
            <Costo xsi:type="xsd:double">1.2</Costo>
            <Margen0 xsi:type="xsd:double">3</Margen0>
            <PVP0 xsi:type="xsd:double">1.212</PVP0>
            <Margen1 xsi:type="xsd:double">2</Margen1>
            <PVP1 xsi:type="xsd:double">1.224</PVP1>
            <Margen2 xsi:type="xsd:double">0</Margen2>
            <PVP2 xsi:type="xsd:double">1.236</PVP2>
            <Margen3 xsi:type="xsd:double">4</Margen3>
            <PVP3 xsi:type="xsd:double">1.248</PVP3>
          </NS2:PDADatosArticulo>
          <return href="#1"/>
        </NS1:PDA_DatosArticuloResponse>
      </SOAP-ENV:Body>
    </SOAP-ENV:Envelope>
    "$

    Dim fields() As String = Array As String("Codigo", "Nombre", "Existencias", "Costo", "Margen0", "PVP0", "Margen1", "PVP1", "Margen2", "PVP2", "Margen3", "PVP3")
    Dim valuesMap As Map
    valuesMap.Initialize
    
    For Each field As String In fields
        Dim match As Matcher = Regex.Matcher("<" & field & ".*?>(.*?)<\/" & field & ">", xml)
        If match.Find Then
            Dim value As String = match.Group(1)
            Log(field & " = " & value)
            valuesMap.Put(field, value)
        End If
    Next
    
    ' valuesMap now contains values
 

TILogistic

Expert
Licensed User
Longtime User
All the solutions I've seen are good, but what would happen if the XML contained more than one element with the same name?
E.g.:

XML:
<NS2:PDADatosArticulo id="1" xsi:type="NS2:PDADatosArticulo">
......
</NS2:PDADatosArticulo>

<NS2:PDADatosArticulo id="2" xsi:type="NS2:PDADatosArticulo">
......
</NS2:PDADatosArticulo>
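
One possible way to cope with repeated records is to split the response into record blocks first and then collect the fields of each block into its own Map. This is only a sketch: ParseArticulos is a hypothetical name, and it assumes that each record is wrapped in NS2:PDADatosArticulo (spelled as in the response from the first post) and that the fields hold plain text without nested elements:

```b4x
' Hypothetical sketch: returns a List with one Map (field name -> text) per record.
Sub ParseArticulos(XML As String) As List
    Dim Result As List
    Result.Initialize
    ' (?s) lets "." match line breaks, so one match can span a whole record.
    Dim RecordMatcher As Matcher = Regex.Matcher("(?s)<NS2:PDADatosArticulo\b[^>]*>(.*?)</NS2:PDADatosArticulo>", XML)
    Do While RecordMatcher.Find
        Dim Fields As Map
        Fields.Initialize
        ' Inside one record: <name attributes>value</name>, same name on both tags.
        Dim FieldMatcher As Matcher = Regex.Matcher("<(\w+)[^>]*>([^<]*)</\1>", RecordMatcher.Group(1))
        Do While FieldMatcher.Find
            Fields.Put(FieldMatcher.Group(1), FieldMatcher.Group(2))
        Loop
        Result.Add(Fields)
    Loop
    Return Result
End Sub

' Possible usage:
' For Each Articulo As Map In ParseArticulos(SampleXML)
'     Log(Articulo.Get("Codigo") & ": " & Articulo.Get("Nombre"))
' Next
```

Keeping one Map per record avoids the problem of a single Map, where a later record would overwrite the values of an earlier one under the same key.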
 