Android Question: How to write SQL commands to sort a result that is non-English?

Theera

Well-Known Member
Licensed User
Longtime User
Referring to Alexeder's code below:
B4X:
    'SQL query based on the search text
    Dim DR As ResultSet = sql1.ExecQuery2("SELECT * FROM dt_Country WHERE name LIKE ? ORDER BY name ASC",Array As String("%" & SearchText & "%"))

"ORDER BY name ASC" seems to work only for English. If I need to sort text in my language (Thai), how should I code it?
 

aeric

Expert
Licensed User
Longtime User
I think non-English characters also have numeric values (their Unicode code points), so they will be sorted accordingly. There is no point in sorting them unless you want to group duplicate text.
The better way is to sort by the English translated name.
 

Theera

Well-Known Member
Licensed User
Longtime User
Does that mean we should leave out "ASC" at the end of the statement? Thanks in advance.
 

aeric

Expert
Licensed User
Longtime User
You can sort it, but I guess the result will not be useful.

By the way, "ASC" is optional because ascending is the default order for ORDER BY.

SELECT * FROM users ORDER BY name
is the same as
SELECT * FROM users ORDER BY name ASC
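
If the goal is a language-aware order rather than just ascending versus descending, one thing that may be worth trying in B4A is a COLLATE clause. Android's SQLite build adds two collations beyond the default BINARY one: LOCALIZED, which follows the device locale, and UNICODE. The sketch below reuses sql1, SearchText and dt_Country from the first post. Whether the Thai result matches dictionary order depends on the device, and these collation names are Android-specific, so the same query should not be expected to work on other SQLite builds.

```b4x
'Sketch only: sql1, SearchText and dt_Country are the names from the first post.
'COLLATE LOCALIZED is provided by Android's SQLite, not by standard SQLite.
Dim DR As ResultSet = sql1.ExecQuery2( _
    "SELECT * FROM dt_Country WHERE name LIKE ? ORDER BY name COLLATE LOCALIZED ASC", _
    Array As String("%" & SearchText & "%"))

'Log the names in the order the database returned them
Do While DR.NextRow
    Log(DR.GetString("name"))
Loop
DR.Close
```

Swapping LOCALIZED for UNICODE gives an order that does not change with the device locale, which may be easier to compare between devices.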
 

Sandman

Expert
Licensed User
Longtime User
I sense there might be confusion here. "asc" has nothing to do with "ascii". "asc" is just short for "ascending", and describes how to sort the results. The opposite is "desc" for "descending".
 

aeric

Expert
Licensed User
Longtime User
Yes, I am talking about two different things. Sorry if I caused any confusion. I don't mean that ASC is the same as ASCII.
 

aeric

Expert
Licensed User
Longtime User
About ASCII values: my understanding is that each character is encoded as a number, its decimal value. For the Roman alphabet, e.g. A = 65 and a = 97. For Chinese/Japanese/Korean there are other encodings such as Big5 and Shift JIS, and other languages such as Spanish or Greek have different encodings again. I think the SQL engine compares these numeric character values when sorting. It compares the first character, and if two words have the same first character it checks the second character, and so on. The sort will complete successfully, but the results are not useful.
For example, with the country names in the first post, when we sort A to Z we may get
Canada
Malaysia
Thailand
but when the names are written in another language such as Thai,
we may get a different order. Maybe Thailand comes first in the list, because the first character of its Thai name has a smaller code value than the others.
 

emexes

Expert
Licensed User
This test might make sorting by ASCII codes clearer (actually Unicode code points, although the first 128 are the same as ASCII).

It outputs a list of country names in English and Thai, with their ASCII/Unicode character numbers, before and after sorting with the List.Sort method, which appears to sort them by those "ASCII" codes.

B4X:
Dim Countries() As String = Array As String( _
    "English",   "Thai", _
    "Malaysia",  "มาเลเซีย", _
    "Canada",    "แคนาดา", _
    "Germany",   "เยอรมนี", _
    "Thailand",  "ไทย", _
    "UK",        "สหราชอาณาจักร", _
    "Spain",     "สเปน", _
    "France",    "ฝรั่งเศส", _
    "USA",       "สหรัฐอเมริกา", _
    "Israel",    "อิสราเอล", _
    "Italy",     "อิตาลี", _
    "Australia", "ออสเตรเลีย" _
)

For Language = 0 To 1
    Dim L As List
    L.Initialize
    
    For I = 2 + Language To Countries.Length - 1 Step 2
        L.Add(Countries(I))
    Next
    
    Dim Longest As Int = 0
    For Each S As String In L
        If S.Length > Longest Then Longest = S.Length
    Next
    
    For Sorted = 0 To 1
        If Sorted = 1 Then L.Sort(True)
    
        Log(Chr(160))    'blank line to separate groups
        
        Dim sb As StringBuilder
        sb.Initialize
        sb.Append("===== ")
        sb.Append(Array As String("Unsorted", "Sorted")(Sorted))
        sb.Append(" ")
        sb.Append(Countries(Language))
        sb.Append(" =====")
        Log(sb.ToString)   
        
        For Each S As String In L
            Dim sb As StringBuilder
            sb.Initialize

            sb.Append(S)
            For I = S.Length + 1 To Longest
                sb.Append(" ")               
            Next
            
            For I = 0 To S.Length - 1
                sb.Append(" " & Asc(S.CharAt(I)))
            Next
            
            Log(sb.ToString)
        Next
    Next
Next

Log output:
Waiting for debugger to connect...
Program started.
 
===== Unsorted English =====
Malaysia  77 97 108 97 121 115 105 97
Canada    67 97 110 97 100 97
Germany   71 101 114 109 97 110 121
Thailand  84 104 97 105 108 97 110 100
UK        85 75
Spain     83 112 97 105 110
France    70 114 97 110 99 101
USA       85 83 65
Israel    73 115 114 97 101 108
Italy     73 116 97 108 121
Australia 65 117 115 116 114 97 108 105 97
 
===== Sorted English =====
Australia 65 117 115 116 114 97 108 105 97
Canada    67 97 110 97 100 97
France    70 114 97 110 99 101
Germany   71 101 114 109 97 110 121
Israel    73 115 114 97 101 108
Italy     73 116 97 108 121
Malaysia  77 97 108 97 121 115 105 97
Spain     83 112 97 105 110
Thailand  84 104 97 105 108 97 110 100
UK        85 75
USA       85 83 65
 
===== Unsorted Thai =====
มาเลเซีย      3617 3634 3648 3621 3648 3595 3637 3618
แคนาดา        3649 3588 3609 3634 3604 3634
เยอรมนี       3648 3618 3629 3619 3617 3609 3637
ไทย           3652 3607 3618
สหราชอาณาจักร 3626 3627 3619 3634 3594 3629 3634 3603 3634 3592 3633 3585 3619
สเปน          3626 3648 3611 3609
ฝรั่งเศส      3613 3619 3633 3656 3591 3648 3624 3626
สหรัฐอเมริกา  3626 3627 3619 3633 3600 3629 3648 3617 3619 3636 3585 3634
อิสราเอล      3629 3636 3626 3619 3634 3648 3629 3621
อิตาลี        3629 3636 3605 3634 3621 3637
ออสเตรเลีย    3629 3629 3626 3648 3605 3619 3648 3621 3637 3618
 
===== Sorted Thai =====
ฝรั่งเศส      3613 3619 3633 3656 3591 3648 3624 3626
มาเลเซีย      3617 3634 3648 3621 3648 3595 3637 3618
สหรัฐอเมริกา  3626 3627 3619 3633 3600 3629 3648 3617 3619 3636 3585 3634
สหราชอาณาจักร 3626 3627 3619 3634 3594 3629 3634 3603 3634 3592 3633 3585 3619
สเปน          3626 3648 3611 3609
ออสเตรเลีย    3629 3629 3626 3648 3605 3619 3648 3621 3637 3618
อิตาลี        3629 3636 3605 3634 3621 3637
อิสราเอล      3629 3636 3626 3619 3634 3648 3629 3621
เยอรมนี       3648 3618 3629 3619 3617 3609 3637
แคนาดา        3649 3588 3609 3634 3604 3634
ไทย           3652 3607 3618
 

Theera

Well-Known Member
Licensed User
Longtime User
The correct sorted Thai order is แคนาดา, ไทย, มาเลเซีย, ฝรั่งเศส, เยอรมันนี, สเปน, สหรัฐอเมริกา, สหราชอาณาจักร, ออสเตรเลีย, อิตาลี, อิสราเอล. That is, it should be sorted according to the code values of the Thai characters, but with the consonants taking priority over the vowels.
 

emexes

Expert
Licensed User
The correct sorted Thai is:

แคนาดา
ไทย
มาเลเซีย
ฝรั่งเศส
เยอรมันนี
สเปน
สหรัฐอเมริกา
สหราชอาณาจักร
อิตาลี
อิสราเอล
ออสเตรเลีย


Is there some logic behind that? For example, are these letters in order? :

แ ไ ม ฝ เ ส อิ อ

but these letters are not? :

ฝ ม ส อ อิ เ แ ไ

How many characters are used when writing Thai?

Do they have a specified order for sorting?

Although I'd be surprised if they were not already in that order in Unicode (like letters and digits are in ASCII).
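
A quick way to check that last point is to print the consonant range of the Unicode Thai block. This is only a sketch: it assumes U+0E01 (ก) to U+0E2E (ฮ) is the range of interest, which covers the consonants plus ฤ and ฦ. The leading vowels เ แ โ ใ ไ sit later, at U+0E40 to U+0E44.

```b4x
'Prints every character from U+0E01 to U+0E2E followed by its code point
Dim sb As StringBuilder
sb.Initialize
For Code = 0x0E01 To 0x0E2E
    sb.Append(Chr(Code)).Append("=").Append(Code).Append(" ")
Next
Log(sb.ToString)
```

If the consonants come out in dictionary order, then a plain code-point sort should only go wrong for words that start with one of the leading vowels, since those are written before the consonant but have higher code values than every consonant.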
 

Theera

Well-Known Member
Licensed User
Longtime User
How many characters are used when writing Thai?
Thai has 44 consonant characters, but at present we no longer use ฃ and ฅ.
 

emexes

Expert
Licensed User
Well, the first bit looks easy enough:
(I should probably test it, but... hey, what could possibly go rwong?)

B4X:
'Returns true if character is in the range from SARA E to SARA AI MAIMALAI,
'i.e. if the character is a leading vowel
Sub isLeadingVowel(C As Char) As Boolean

    Dim SARA_E As Int = 0x0E40
    Dim SARA_AI_MAIMALAI As Int = 0x0E44
  
    Return (Asc(C) >= SARA_E And Asc(C) <= SARA_AI_MAIMALAI)

End Sub
B4X:
'Returns true if character is in the range from MAITAIKHU to THANTHAKHAT
'which includes the four tone marks.  I.e. all "above" symbols
Sub isToneMark(C As Char) As Boolean

    Dim MAITAIKHU As Int = 0x0E47
    Dim THANTHAKHAT As Int = 0x0E4C

    Return (Asc(C) >= MAITAIKHU And Asc(C) <= THANTHAKHAT)
  
End Sub
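
Building on those two checks, here is an untested sketch of how the rest might go. ThaiSortKey, SortThaiWords and DemoThaiSort are hypothetical names, not taken from the Java code being ported. The assumption is that swapping each leading vowel with the character after it, and dropping the marks in the MAITAIKHU to THANTHAKHAT range, gives a key that List.Sort can compare by code point. The range tests are written inline so the block stands alone; they use the same code points as the two subs above.

```b4x
'Hypothetical helper: builds a comparison key for one Thai word.
'Leading vowels (U+0E40..U+0E44) are moved behind the next character,
'and the marks U+0E47..U+0E4C are left out of the key.
Sub ThaiSortKey(Word As String) As String
    Dim sb As StringBuilder
    sb.Initialize
    Dim I As Int = 0
    Do While I < Word.Length
        Dim Code As Int = Asc(Word.CharAt(I))
        If Code >= 0x0E47 And Code <= 0x0E4C Then
            'tone mark or similar: not part of the key
            I = I + 1
        Else If Code >= 0x0E40 And Code <= 0x0E44 And I + 1 < Word.Length Then
            'leading vowel: emit the following character first, then the vowel
            sb.Append(Word.SubString2(I + 1, I + 2))
            sb.Append(Word.SubString2(I, I + 1))
            I = I + 2
        Else
            sb.Append(Word.SubString2(I, I + 1))
            I = I + 1
        End If
    Loop
    Return sb.ToString
End Sub

'Hypothetical helper: returns a new list with the words in key order.
'Each entry is stored as key + TAB + word so that List.Sort compares the key first.
Sub SortThaiWords(Words As List) As List
    Dim Decorated As List
    Decorated.Initialize
    For Each W As String In Words
        Decorated.Add(ThaiSortKey(W) & TAB & W)
    Next
    Decorated.Sort(True)

    Dim Result As List
    Result.Initialize
    For Each D As String In Decorated
        'strip the key and keep the original word
        Result.Add(D.SubString(D.IndexOf(TAB) + 1))
    Next
    Return Result
End Sub

'Example call with four of the names from the earlier test
Sub DemoThaiSort
    Dim Words As List = Array As String("มาเลเซีย", "แคนาดา", "ไทย", "ฝรั่งเศส")
    For Each W As String In SortThaiWords(Words)
        Log(W)
    Next
End Sub
```

Working through the keys by hand, the demo should log แคนาดา, ไทย, ฝรั่งเศส, มาเลเซีย. ฝรั่งเศส comes before มาเลเซีย here because ฝ (3613) has a lower code point than ม (3617). Keeping the original word next to its key avoids needing a custom Type, at the cost of one extra pass to strip the key afterwards.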
 

Theera

Well-Known Member
Licensed User
Longtime User
Thank you for your replies. I will try coding it myself. I still don't understand what the 01, 02, 03, 04, 05, 06 in his code refer to.
 

emexes

Expert
Licensed User
I still don't understand what the 01, 02, 03, 04, 05, 06 in his code refer to.

I found it confusing too. That is partly why I left the rest as a treat for you. 🍻

But the middle button shows the intermediate sorting values for his sample data, so at least you've got something to aim for and to check against.
 

Theera

Well-Known Member
Licensed User
Longtime User
Hi again,
I have a problem with how to continue the code (after sorting), so I have attached the zip file.
 

Attachments

  • TestSortThaiWords.zip
    10.5 KB · Views: 5

Theera

Well-Known Member
Licensed User
Longtime User
I have converted his code from Java to B4A and created a B4A project, but I don't know how to manage mylist for sorting.
 

emexes

Expert
Licensed User
how to manage mylist for sorting.

1736072707630.png

1736072911751.png
 