Android Code Snippet Unescape Unicode sequences for Spanish language

GeoT · Sep 15, 2023

I have created a second way to do the same:

B4X:

Log(DecodeUnicode("canci\u00f3n p\u00fablica"))    'prints: canción pública

Sub DecodeUnicode(strOriginal As String) As String
   
    ' Pattern to find Unicode escape sequences like \uXXXX
    Dim m As Matcher
    m = Regex.Matcher("\\u[0-9a-fA-F]{4}", strOriginal)   'Double slash to escape '\' character in regular expression

    Dim resultBuilder As StringBuilder
    resultBuilder.Initialize
   
    Dim currentIndex As Int   'To track the current position in the text
   
    Do While m.Find
        Dim match As String
        match = m.Match
'        LogColor(match, Colors.Green)
       
        If match <> "" Then
           
            ' Take actions with the matches found
'            Log("Match found in position: " & m.GetStart(0))        'Match Positions
           
            ' Append the unmatched text between the current position and the start of this match
            resultBuilder.Append(strOriginal.SubString2(currentIndex, m.GetStart(0)))
           
            ' Add the substitute character to the StringBuilder
            Dim unicodeValue As Int
            unicodeValue = Bit.ParseInt(match.SubString(2), 16)  'Parse the four hex digits as an integer, omitting the leading '\u'
            Dim charValue As String
            charValue = Chr(unicodeValue)  'Convert Unicode value to normal character
            resultBuilder.Append(charValue)
           
            ' Move the current position to the end of the match
            currentIndex = m.GetEnd(0)
        End If
    Loop
   
    ' Append any text remaining after the last match
    If currentIndex < strOriginal.Length Then
        resultBuilder.Append(strOriginal.SubString(currentIndex))
    End If
   
    ' resultBuilder now holds the full text, with every escape sequence decoded
    Return resultBuilder.ToString
End Sub

GeoT · Sep 15, 2023

I see that it also works for other Romance languages.
But I don't know whether it also works for other kinds of languages.

I would appreciate confirmations or comments.

byz · Nov 25, 2023

GeoT said:
I see that it also works for other Romance languages.
But I don't know whether it also works for other kinds of languages.

I would appreciate confirmations or comments.

Hi, Chinese works fine!
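
A quick check along these lines, as a minimal sketch that assumes the DecodeUnicode sub from the first post is in the same module (the sample string is only an illustration):

```b4x
' \u4e2d and \u6587 are the escape sequences for the two characters of "中文".
' B4X string literals do not process backslash escapes, so the sub receives them as plain text.
Log(DecodeUnicode("\u4e2d\u6587 ok"))    'prints: 中文 ok
```

Each \uXXXX sequence encodes one UTF-16 code unit, so characters in the Basic Multilingual Plane are decoded the same way as the Spanish accented letters.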

GeoT · Nov 25, 2023

Hi byz!
Ok, good!
Thank you for your info.
