Anagram solving by Regex or SQL

LordZenzo

Well-Known Member
Licensed User
Longtime User
Well, you have discovered my simple solution

Following on from the @LucaMs example, by adding a word length column, you can use a where clause to ignore words of the wrong length. Also index the Length and Sorted Letters column to get a faster response.
as @LucaMs says useless


B4X:
Private Sub SortLetters(Letters As String) As String
    Dim Result As String

    Dim lstLetters As List
    lstLetters.initialise
   
    Dim bytChars() As Byte = Letters.GetBytes("UTF8")
    For i = 0 To bytChars.Length - 1
        lstLetters.Add(bytChars(i))
    Next
    lstLetters.Sort(True)
   
    For i = 0 To lstLetters.Size - 1
        Result = Result & Chr(lstLetters.Get(i))
    Next

    Return result
End Sub
getbyte is no good, there are letters that in UTF8 use 2 bytes and others even 4
better to use regex.split("",word) and then list.sort(true)

I still recommend my library, it also has other useful functions
 

Daestrum

Expert
Licensed User
Longtime User
I just tried it without using a database (used a list) and some logic to check letters - got this result
1716161344143.png
 

Daestrum

Expert
Licensed User
Longtime User
did you use my lib?
no it was just some code I knocked up using a list and some code to determine if the letters were contained in the word from the list.
B4X:
Sub Process_Globals
    Dim wordlist As List
End Sub

Sub AppStart (Args() As String)
    Dim st1 As Long = DateTime.now
    wordlist.Initialize
    Dim textRdr As TextReader
    textRdr.Initialize(File.OpenInput("d:/wordlist","words_alpha.txt"))
    Dim wrd As String = textRdr.ReadLine
    
    Do While wrd <> Null
        wordlist.Add(wrd)
        wrd = textRdr.readline
    Loop
    textRdr.Close
    Log("loaded " & wordlist.size & " words in " & (DateTime.Now - st1) & " ms")

    wordlist.Sort(True)
    Dim st As Long = DateTime.now
    Dim srchword As String = "muinimula"
    Log("Search letters [" & srchword & "]")
    Dim flag As Boolean = True
    Dim savedword As String
    For Each word As String In wordlist
        savedword = word
        flag = True
        For a = 0 To srchword.Length - 1
            If srchword.Length <> word.Length Then
                flag = False
                Exit
            End If
            If word.indexof(srchword.CharAt(a)) > -1 Then
                If word.IndexOf(srchword.CharAt(a))=0 Then
                    word = "*" & word.SubString(word.IndexOf(srchword.CharAt(a))+1)
                Else
                    word = word.SubString2(0,word.IndexOf(srchword.CharAt(a))) & "*" & word.SubString(word.IndexOf(srchword.CharAt(a))+1)
                End If
            Else
                flag = False
                Exit
            End If
        Next
        If flag=True And word.Replace("*","")="" Then Log(savedword)
    Next
    Log((DateTime.Now - st) & " ms")
End Sub
 

LucaMs

Expert
Licensed User
Longtime User
no it was just some code I knocked up using a list and some code to determine if the letters were contained in the word from the list.
B4X:
Sub Process_Globals
    Dim wordlist As List
End Sub

Sub AppStart (Args() As String)
    Dim st1 As Long = DateTime.now
    wordlist.Initialize
    Dim textRdr As TextReader
    textRdr.Initialize(File.OpenInput("d:/wordlist","words_alpha.txt"))
    Dim wrd As String = textRdr.ReadLine
 
    Do While wrd <> Null
        wordlist.Add(wrd)
        wrd = textRdr.readline
    Loop
    textRdr.Close
    Log("loaded " & wordlist.size & " words in " & (DateTime.Now - st1) & " ms")

    wordlist.Sort(True)
    Dim st As Long = DateTime.now
    Dim srchword As String = "muinimula"
    Log("Search letters [" & srchword & "]")
    Dim flag As Boolean = True
    Dim savedword As String
    For Each word As String In wordlist
        savedword = word
        flag = True
        For a = 0 To srchword.Length - 1
            If srchword.Length <> word.Length Then
                flag = False
                Exit
            End If
            If word.indexof(srchword.CharAt(a)) > -1 Then
                If word.IndexOf(srchword.CharAt(a))=0 Then
                    word = "*" & word.SubString(word.IndexOf(srchword.CharAt(a))+1)
                Else
                    word = word.SubString2(0,word.IndexOf(srchword.CharAt(a))) & "*" & word.SubString(word.IndexOf(srchword.CharAt(a))+1)
                End If
            Else
                flag = False
                Exit
            End If
        Next
        If flag=True And word.Replace("*","")="" Then Log(savedword)
    Next
    Log((DateTime.Now - st) & " ms")
End Sub

Slightly improved version.

1 - The test:
B4X:
If Letters.Length <> Word.Length Then
executed before the loop.

2 - Avoided repeating 5 times:
B4X:
Word.indexof(Letters.CharAt(A))
using a variable.

3 - Letters = Letters.ToLowerCase

4 - Changed variable names.

5 - Returns a List of anagrams.

B4X:
Sub Anagrams(Letters As String) As List
    Dim lstAnagrams As List
    lstAnagrams.Initialize
 
    Letters = Letters.ToLowerCase
 
    Dim Flag As Boolean = True
    Dim SavedWord As String
    Dim IndexOfLetter As Int

    For Each Word As String In mlstWords
        If Word.Length = Letters.Length Then
            SavedWord = Word
            Flag = True
            For A = 0 To Letters.Length - 1
                IndexOfLetter = Word.indexof(Letters.CharAt(A))
                If IndexOfLetter > - 1 Then
                    If IndexOfLetter = 0 Then
                        Word = "*" & Word.SubString(IndexOfLetter + 1)
                    Else
                        Word = Word.SubString2(0, IndexOfLetter) & "*" & Word.SubString(IndexOfLetter + 1)
                    End If
                Else
                    Flag = False
                    Exit
                End If
            Next
            If Flag And Word.Replace("*", "") = "" Then
                lstAnagrams.Add(SavedWord)
            End If
        End If
    Next
    Return lstAnagrams
End Sub

You "need" also:
B4X:
Sub Process_Globals
    Private mlstWords As List
End Sub

Public Sub Init
    mlstWords.Initialize
    Dim textRdr As TextReader
    Dim Dir, FileName As String
    Dir = File.DirAssets
    FileName = "YourWordsTextFile.txt"
    textRdr.Initialize(File.OpenInput(Dir, FileName))
    Dim wrd As String = textRdr.ReadLine

    Do While wrd <> Null
        mlstWords.Add(wrd)
        wrd = textRdr.readline
    Loop
    textRdr.Close
    mlstWords.Sort(True)
End Sub
 
Last edited:

Daestrum

Expert
Licensed User
Longtime User
This is how my code ended up after a rethink
B4X:
    For Each word As String In wordlist
        If srchword.Length <> word.Length Then Continue
        savedword = word
        flag = True
      
        For a = 0 To srchword.Length - 1
            If word.indexof(srchword.CharAt(a)) = -1 Then
                flag = False
                Exit
            Else
                If word.IndexOf(srchword.CharAt(a))=0 Then
                    word = word.SubString(1)
                Else
                    word = word.SubString2(0,word.IndexOf(srchword.CharAt(a))) & word.SubString(word.IndexOf(srchword.CharAt(a))+1)
                End If
            End If
        Next
        If flag=True Then Log("Word found <" & savedword & ">")
    Next

Which gave this result ( I used a bigger txt file for words than the original run)

Waiting for debugger to connect...
Program started.
loaded 1516999 words in 134 ms
Search letters [oxidation]
Word found <oxidation>
60 ms
Program terminated (StartMessageLoop was not called).
 

dlfallen

Active Member
Licensed User
Longtime User
The earlier B4A anagram program was a port of an even earlier Basic4PPC program. I loved the PPC days. You could compile the same code for the device and for a PC. I dug out my old .EXE program and it still runs just fine on Win 11. So if you have the old Basic4PPC program you can still write Windows programs with it.
 
Top