My SQLite database contains invalid characters, which I need to remove before I can use XML-Builder to generate nice-looking XML that can be parsed again with SAX.
According to the pages I have read, the characters that are valid in XML are the following (I have been treating these values as UTF-8 bytes):
0x9 | 0xA | 0xD | [0x20-0xD7FF] | [0xE000-0xFFFD] | [0x10000-0x10FFFF]
I found an example (see the link below) and tried to rewrite it in Basic4Android, but my Chinese characters are being stripped and only the normal characters are left behind. It would be nice if it handled the whole UTF-8 range instead of just a-z.
java - How to encode characters from Oracle to XML? - Stack Overflow
By the way, XML-Builder doesn't handle these characters well, which is why I need a function that strips the bad characters.
My code:
B4X:
Sub XmlCharacterWhitelist(in_string As String) As String
    If( in_string == Null ) Then
        Return Null
    End If
    Dim data() As Byte : data = in_string.GetBytes("UTF8")
    Dim sbOutput As StringBuilder : sbOutput.Initialize
    Dim ch As Byte
    For i = 0 To data.Length - 1
        ch = data(i)
        If ((ch >= 0x0020 AND ch <= 0xD7FF ) OR _
            (ch >= 0xE000 AND ch <= 0xFFFD ) OR _
            (ch >= 0x10000 AND ch <= 0x10FFFF) OR _
            ch == 0x0009 OR _
            ch == 0x000A OR _
            ch == 0x000D ) Then
            sbOutput.Append(BytesToString(data, i, 1, "UTF8"))
        End If
    Next
    Return sbOutput.ToString
End Sub
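One direction that might be worth trying is to apply the same whitelist to the characters of the string rather than to its UTF-8 bytes. A signed Byte can never reach values such as 0xD7FF, and a Chinese character takes several bytes in UTF-8, so checking bytes one at a time cannot match the ranges listed above. The following is only a minimal, untested sketch of that idea. The name StripInvalidXmlChars is hypothetical, and it assumes that strings are stored as UTF-16 code units (as in Java), so characters in [0x10000-0x10FFFF] appear as a high surrogate followed by a low surrogate.

```b4x
' Hypothetical helper, not part of XML-Builder or any library.
' Keeps only the characters from the whitelist above, testing
' UTF-16 code units instead of UTF-8 bytes.
Sub StripInvalidXmlChars(in_string As String) As String
    If in_string = Null Then Return Null
    Dim sbOutput As StringBuilder
    sbOutput.Initialize
    Dim i As Int = 0
    Do While i < in_string.Length
        Dim cp As Int = Asc(in_string.CharAt(i))
        If cp >= 0xD800 And cp <= 0xDBFF And i + 1 < in_string.Length Then
            ' High surrogate: keep it only together with a following
            ' low surrogate (the pair encodes 0x10000-0x10FFFF).
            Dim nextCp As Int = Asc(in_string.CharAt(i + 1))
            If nextCp >= 0xDC00 And nextCp <= 0xDFFF Then
                sbOutput.Append(in_string.CharAt(i))
                sbOutput.Append(in_string.CharAt(i + 1))
                i = i + 1 ' skip the low surrogate that was just copied
            End If
        Else If cp = 0x9 Or cp = 0xA Or cp = 0xD Or _
            (cp >= 0x20 And cp <= 0xD7FF) Or _
            (cp >= 0xE000 And cp <= 0xFFFD) Then
            sbOutput.Append(in_string.CharAt(i))
        End If
        ' Anything else (control characters, unpaired surrogates,
        ' 0xFFFE and 0xFFFF) is dropped.
        i = i + 1
    Loop
    Return sbOutput.ToString
End Sub
```

Under these assumptions, Chinese characters fall inside [0x20-0xD7FF] and would be kept, while control characters such as 0x0B would be removed.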
Any help is appreciated as I have been fighting with this for some days now.
Looking forward to any replies.