B4J Code Snippet Compare text strings

Written last century; resurrected after a related question in the Spanish forum.

It is a very nice algorithm, even if I do say so myself.

B4X:
Dim S1 As String = "Now is the time for all good men to come to the aid of the party"
Dim S2 As String = "It was the time for good women to come to the aid of the men too"

Log("S1 = """ & S1 & """")
Log("S2 = """ & S2 & """")
Log("TO GET FROM S1 TO S2:")
WhatChanged(S1, S2)

Sub WhatChanged(S1 As String, S2 As String)
    If S1.Length = 0 And S2.Length = 0 Then
        'nothing to do
    Else If S1.Length = 0 Then
        Log("add """ & S2 & """")       'S1 is empty: everything in S2 is new
    Else If S2.Length = 0 Then
        Log("delete """ & S1 & """")    'S2 is empty: everything in S1 is gone
    Else
        For L = S1.Length To 5 Step -1    'longest candidates first; common runs shorter than 5 chars = just do replacement
            For P1 = 0 To S1.Length - L
                Dim LookFor As String = S1.SubString2(P1, P1 + L)
                Dim P2 As Int = S2.IndexOf(LookFor)
                If P2 >= 0 Then
                    WhatChanged(S1.SubString2(0, P1), S2.SubString2(0, P2))
                    Log("keep """ & LookFor & """")
                    WhatChanged(S1.SubString(P1 + L), S2.SubString(P2 + L))
                    Return
                End If
            Next
        Next
        Log("replace """ & S1 & """ with """ & S2 & """")
    End If
End Sub
Log output:
Waiting for debugger to connect...
Program started.
S1 = "Now is the time for all good men to come to the aid of the party"
S2 = "It was the time for good women to come to the aid of the men too"
TO GET FROM S1 TO S2:
replace "Now i" with "It wa"
keep "s the time for "
replace "all good " with "good wo"
keep "men to come to the aid of the "
replace "party" with "men too"
 
emexes

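But it is at least O(n²), since the two nested loops try up to about n²/2 substrings and each try costs an IndexOf search on top, so it's probably best to keep string length n < 1000.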

To compare larger texts, break them into lines, compute a hash signature for each line, then apply the algorithm to the two lists of signatures.
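
A minimal sketch of that line-by-line idea, not the original code. The Sub names (CompareLines, Signatures, DiffRange, RunsMatch, LogLines) are made up for illustration. Instead of a real hash, a Map hands out one Int per distinct line, which works as a collision-free signature. It assumes lines are separated by LF or CRLF, and it keeps any matching run of one line or more rather than using the 5-character minimum of the string version.

```b4x
'Sketch only: line-level variant of WhatChanged. All Sub names are hypothetical.
Sub CompareLines(Text1 As String, Text2 As String)
    Dim Ids As Map
    Ids.Initialize
    Dim Lines1() As String = Regex.Split("\r?\n", Text1)
    Dim Lines2() As String = Regex.Split("\r?\n", Text2)
    Dim Sig1() As Int = Signatures(Lines1, Ids)
    Dim Sig2() As Int = Signatures(Lines2, Ids)
    DiffRange(Lines1, Lines2, Sig1, Sig2, 0, Lines1.Length, 0, Lines2.Length)
End Sub

'Gives every distinct line a small Int ID (stand-in for a hash signature).
'Both texts must share the same Ids map so equal lines get equal IDs.
Sub Signatures(Lines() As String, Ids As Map) As Int()
    Dim Sig(Lines.Length) As Int
    For i = 0 To Lines.Length - 1
        If Ids.ContainsKey(Lines(i)) = False Then Ids.Put(Lines(i), Ids.Size)
        Sig(i) = Ids.Get(Lines(i))
    Next
    Return Sig
End Sub

'Compares lines A0..A1-1 of text 1 with lines B0..B1-1 of text 2 (end index exclusive).
Sub DiffRange(L1() As String, L2() As String, Sig1() As Int, Sig2() As Int, A0 As Int, A1 As Int, B0 As Int, B1 As Int)
    If A0 = A1 And B0 = B1 Then Return
    Dim MaxRun As Int = Min(A1 - A0, B1 - B0)
    'Longest common run of lines first, same strategy as the string version.
    For Run = MaxRun To 1 Step -1
        For P1 = A0 To A1 - Run
            For P2 = B0 To B1 - Run
                If RunsMatch(Sig1, P1, Sig2, P2, Run) Then
                    DiffRange(L1, L2, Sig1, Sig2, A0, P1, B0, P2)
                    LogLines("keep", L1, P1, P1 + Run)
                    DiffRange(L1, L2, Sig1, Sig2, P1 + Run, A1, P2 + Run, B1)
                    Return
                End If
            Next
        Next
    Next
    'No common line left in this range: old lines go, new lines come in.
    LogLines("delete", L1, A0, A1)
    LogLines("add", L2, B0, B1)
End Sub

'True if Count signatures starting at P1 and P2 are pairwise equal.
Sub RunsMatch(Sig1() As Int, P1 As Int, Sig2() As Int, P2 As Int, Count As Int) As Boolean
    For k = 0 To Count - 1
        If Sig1(P1 + k) <> Sig2(P2 + k) Then Return False
    Next
    Return True
End Sub

'Logs lines FromIndex..ToIndex-1 with the given action word in front.
Sub LogLines(Action As String, Lines() As String, FromIndex As Int, ToIndex As Int)
    For i = FromIndex To ToIndex - 1
        Log(Action & " """ & Lines(i) & """")
    Next
End Sub
```

For example, CompareLines("a" & CRLF & "b" & CRLF & "c", "a" & CRLF & "x" & CRLF & "c") should log keep "a", delete "b", add "x", keep "c". Comparing Ints instead of whole lines keeps each test cheap, but the search is still brute force, so the size limit now applies to the number of lines rather than the number of characters.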

TBH that's what it was originally used for: comparing different versions of source code files. Which is obviously ho-hum now but, back when I wrote it, all we had was DOS File Compare (FC.EXE) which was close but no cigar.

Lol actually I first did it on an Apple II but, without recursion, it was a total schmozzle. Then we got TurboPascal (1.0 on CP/M) which turned it from spaghetti code into a work of beauty. Did I ever mention how much I loved TurboPascal?
 