I have been getting spotty results decoding quoted-printable HTML email with the forum’s recommended sub. It gives a good result in some cases, but in others the displayed message is malformed or, at worst, will not display at all.
I am attaching two HTML files for an email recently received from B4A. This email will not display when decoded by the first DecodeQuotePrintable sub below.
Example A was decoded by this sub and is unedited; it will not display. I have annotated the lines that cause the problem. All the required code is there, but these lines are fragmented.
Example B has been edited manually. I have rejoined each fragmented line into a single line contained within its tags, and have again annotated these lines. This HTML now displays correctly in a browser or WebView.
Is there something in this sub that could be changed, so that the fragmentation does not occur?
I have attached a third HTML file as Example C. This file was decoded by the second DecodeQuotePrintable sub below, followed by sub Strip. This combination comes very close to decoding my emails completely. Unfortunately, every so often the Strip sub strips a needed character from an inline image, and only the placeholder is displayed. This is not an elegant solution, but it does not produce the line fragmentation created by the first sub. All emails are displayed, with only an occasional missing image. For illustration purposes, and to save space, I have posted only a portion of sub Strip.
The extensions of the HTML files have been renamed to .txt to permit uploading.
B4X:
Sub DecodeQuotePrintable(q As String) As String
    Dim bytes As List
    bytes.Initialize
    Dim i As Int
    Do While i < q.Length
        Dim c As String
        c = q.CharAt(i)
        If c = "_" Then
            bytes.AddAll(" ".GetBytes("utf8"))
        Else If c = "=" And i < q.Length - 1 Then
            Dim hex As String
            hex = q.CharAt(i + 1) & q.CharAt(i + 2)
            i = i + 2
            Try
                bytes.Add(Bit.ParseInt(hex, 16))
            Catch
                bytes.AddAll(hex.GetBytes("utf-8"))
            End Try
        Else
            bytes.AddAll(c.GetBytes("utf-8"))
        End If
        i = i + 1
    Loop
    Dim b(bytes.Size) As Byte
    For i = 0 To bytes.Size - 1
        b(i) = bytes.Get(i)
    Next
    Return BytesToString(b, 0, b.Length, "utf-8")
End Sub
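A possible explanation for the fragmentation, offered as a sketch rather than a confirmed diagnosis: in a quoted-printable body (RFC 2045), an "=" at the very end of a line is a soft line break that the decoder is supposed to remove together with the line break itself. In the sub above, the two characters after such an "=" are not valid hex, so the Catch branch writes them back out, which keeps the line break and splits the tag. The "_" to space rule also belongs to encoded headers (RFC 2047), not to bodies, so it may alter underscores in URLs or image names. The variant below is untested against the attached examples; the name DecodeQuotedPrintableBody is hypothetical, and it assumes the body text is UTF-8.

```b4x
' Hypothetical helper, not part of the original post.
' Assumption: the message body is UTF-8 encoded.
Sub DecodeQuotedPrintableBody(q As String) As String
    ' Remove soft line breaks: "=" followed by CRLF or by a bare LF.
    q = q.Replace("=" & Chr(13) & Chr(10), "")
    q = q.Replace("=" & Chr(10), "")
    Dim bytes As List
    bytes.Initialize
    Dim i As Int = 0
    Do While i < q.Length
        Dim c As String = q.CharAt(i)
        Dim decoded As Boolean = False
        ' Only treat "=" as an escape when two hex digits follow it.
        If c = "=" And i + 2 < q.Length Then
            Dim hex As String = q.SubString2(i + 1, i + 3)
            If Regex.IsMatch("[0-9A-Fa-f]{2}", hex) Then
                bytes.Add(Bit.ParseInt(hex, 16))
                i = i + 2
                decoded = True
            End If
        End If
        ' Anything else, including "_", is copied through unchanged.
        If decoded = False Then bytes.AddAll(c.GetBytes("UTF8"))
        i = i + 1
    Loop
    Dim b(bytes.Size) As Byte
    For i = 0 To bytes.Size - 1
        b(i) = bytes.Get(i)
    Next
    Return BytesToString(b, 0, b.Length, "UTF8")
End Sub
```

Because every escape is decoded in a single pass, a separate chain of Replace calls should not be needed afterwards. If the email declares a different charset in its Content-Type header, that charset would have to be passed in instead of the hard-coded "UTF8".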
B4X:
Sub DecodeQuotePrintable(q As String) As String
    Dim m As Matcher
    m = Regex.Matcher("=\?([^?]*)\?Q\?(.*)\?=$", q)
    If m.Find Then
        Dim charset As String
        Dim data As String
        charset = m.Group(1)
        data = m.Group(2)
        Dim bytes As List
        bytes.Initialize
        Dim i As Int
        Do While i < data.Length
            Dim c As String
            c = data.CharAt(i)
            If c = "_" Then
                bytes.AddAll(" ".GetBytes(charset))
            Else If c = "=" Then
                Dim hex As String
                hex = data.CharAt(i + 1) & data.CharAt(i + 2)
                i = i + 2
                bytes.Add(Bit.ParseInt(hex, 16))
            Else
                bytes.AddAll(c.GetBytes(charset))
            End If
            i = i + 1
        Loop
        Dim b(bytes.Size) As Byte
        For i = 0 To bytes.Size - 1
            b(i) = bytes.Get(i)
        Next
        Return BytesToString(b, 0, b.Length, charset)
    Else
        Return q
    End If
End Sub

Sub Strip(value As String) As String
    value = value.Replace("=20","")
    value = value.Replace("=21","!")
    value = value.Replace("=22",$"""$)
    value = value.Replace("=23","#")
    value = value.Replace("=24","$")
    value = value.Replace("=25","%")
    value = value.Replace("=26","&")
    value = value.Replace("=27","'")
    value = value.Replace("=28","(")
    Return value
End Sub