B4J Question [B4X] How to compress and decompress strings using gzip

byz

Active Member
Licensed User
Base64 + Gzip is a widely adopted method for efficient, text-safe data transmission in web APIs and network protocols. The web API I am calling compresses a large JSON string with gzip and returns it as a Base64-encoded string, so the client has to decode and decompress it. I've done this in VB.NET, but my B4X version always fails.
DecompressString:
Sub DecompressString(base64 As String)
    Dim compress As CompressedStreams
    Dim su As StringUtils
    Dim compressed(), decompressed() As Byte
    compressed = su.DecodeBase64(base64)
    decompressed = compress.DecompressBytes(compressed, "gzip")
    xui.MsgboxAsync(BytesToString(decompressed, 0, decompressed.Length, "UTF8"), "")
End Sub
Error on line 6: Caused by: java.util.zip.ZipException: Not in GZIP format

It's not clear whether my method is wrong or whether this is a compatibility issue.
The test project is attached.
================A complete solution===================
The second attachment (OK_TestGZIP.zip) is a working test project. It contains correct compression and decompression code that supports both the gzip and zlib algorithms.

 

Attachments

  • TestGZIP.zip
    10.6 KB · Views: 21
  • OK_TestGZIP.zip
    10.3 KB · Views: 16
aeric

Expert
Licensed User
Longtime User
I found this:
 

teddybear

Well-Known Member
Licensed User
DecompressString:
Sub DecompressString(base64 As String)
    ...
    compressed = su.DecodeBase64(base64)
    decompressed = compress.DecompressBytes(compressed, "gzip")
    ...
End Sub
Error on line 6: Caused by: java.util.zip.ZipException: Not in GZIP format
B4X:
    compressed = su.DecodeBase64(base64)
    decompressed = compress.DecompressBytes(compressed, "gzip")
This code is wrong: after Base64 decoding, compressed holds plain bytes, not data in GZIP format. You can convert the bytes directly to a string, as below:
B4X:
    Dim bc As ByteConverter
    Log(bc.StringFromBytes(compressed,"utf8"))
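
One way to tell whether Base64-decoded bytes really are gzip data is to look at the first two bytes: a gzip stream always starts with 0x1f 0x8b. The sketch below is plain Java (the platform B4J runs on) rather than B4X, and the names Base64Payload, looksLikeGzip and decode are made up for illustration. It decompresses only when that header is present and otherwise treats the bytes as UTF-8 text, which is the same thing the two B4X lines above do.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPInputStream;

// Hypothetical helper, not part of any B4X library.
final class Base64Payload {

    // A gzip stream always begins with the magic bytes 0x1f 0x8b.
    static boolean looksLikeGzip(byte[] data) {
        return data.length >= 2
                && (data[0] & 0xff) == 0x1f
                && (data[1] & 0xff) == 0x8b;
    }

    // Decode Base64, then gunzip only when the gzip header is present.
    static String decode(String base64) {
        // The MIME decoder tolerates line breaks inside the Base64 text.
        byte[] raw = Base64.getMimeDecoder().decode(base64);
        if (!looksLikeGzip(raw)) {
            return new String(raw, StandardCharsets.UTF_8);
        }
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(raw))) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Base64 of the plain text {"a":1}, with no compression applied.
        String plain = Base64.getEncoder()
                .encodeToString("{\"a\":1}".getBytes(StandardCharsets.UTF_8));
        System.out.println(decode(plain));
    }
}
```

If the header check fails, calling a gzip decompressor would raise the same "Not in GZIP format" error quoted above, so checking first avoids the exception.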
 
Solution

byz

Active Member
Licensed User
Show your compress code. It is reversed of decompress.
The forum's compression and decompression code can be used on its own. But what the server returns is a gzip-compressed string in Base64 format, and B4X cannot decompress it correctly. I have included the test files in the attached project.
 

aeric

Expert
Licensed User
Longtime User
I found that zlib+base64 works, but gzip+base64 does not.
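
The two formats wrap the same deflate data in different containers, so their first bytes differ: gzip (RFC 1952) starts with 0x1f 0x8b, while a zlib stream (RFC 1950) starts with a two-byte header whose low nibble is 8 and whose 16-bit value is a multiple of 31. A rough sketch in plain Java, with the made-up names FormatSniffer and detect, shows how to tell them apart. Treat it as a heuristic only, since short plain text can occasionally match the zlib rule by accident.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper for telling the two container formats apart.
final class FormatSniffer {

    static String detect(byte[] data) {
        if (data.length < 2) {
            return "unknown";
        }
        int b0 = data[0] & 0xff;
        int b1 = data[1] & 0xff;
        if (b0 == 0x1f && b1 == 0x8b) {
            return "gzip"; // RFC 1952 magic number
        }
        // RFC 1950: method nibble 8 (deflate), header value divisible by 31.
        if ((b0 & 0x0f) == 8 && ((b0 << 8) | b1) % 31 == 0) {
            return "zlib";
        }
        return "unknown";
    }

    // Compress text with either the gzip or the zlib container.
    static byte[] compress(String text, boolean useGzip) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (OutputStream out = useGzip
                ? new GZIPOutputStream(buffer)
                : new DeflaterOutputStream(buffer)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(detect(compress("hello", true)));
        System.out.println(detect(compress("hello", false)));
        System.out.println(detect("{\"a\":1}".getBytes(StandardCharsets.UTF_8)));
    }
}
```

Running a check like this on the Base64-decoded bytes shows which decompression method, if any, matches the data.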
 

emexes

Expert
Licensed User
I've done it in vb.net, but b4x always fails.

I missed this bit. Let me have a closer look at it as UTF-8, although the leading zeroes look un-texty and thus un-JSONy too. They could be null terminators for lines but, if so, why are there 7 empty lines up front and then (by eye) no line terminators after that? On the other hand, I've seen JSON scrunched onto a single line, so perhaps that's plausible. I'd expect a line terminator at the end of that single line, though. 🤔
 

emexes

Expert
Licensed User
closer look at it as UTF-8

It doesn't look like UTF-8, because in UTF-8 high-bit bytes never occur alone: they always come in groups of 2 to 4. So 55 91 4d is not UTF-8, because it contains the single high-bit byte 91 by itself, with no adjacent high-bit byte to form a multibyte sequence.

Can you show us the VB code? And confirm that it processes the supplied example base64.txt file correctly, presumably recovering the supplied sample json.txt?
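
The "is this UTF-8?" test above can also be done mechanically with a strict decoder that reports malformed input instead of silently replacing it. This is a minimal sketch in plain Java, not B4X, and the names Utf8Check and isValidUtf8 are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: strict UTF-8 validation.
final class Utf8Check {

    static boolean isValidUtf8(byte[] data) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            // Raised for lone high-bit bytes and truncated sequences.
            return false;
        }
    }

    public static void main(String[] args) {
        // The three bytes discussed above: 0x91 stands alone.
        byte[] suspect = {0x55, (byte) 0x91, 0x4d};
        System.out.println(isValidUtf8(suspect));
        System.out.println(isValidUtf8("h\u00e9llo".getBytes(StandardCharsets.UTF_8)));
    }
}
```

A strict check matters here because the usual string constructors replace bad bytes with a placeholder character rather than failing, which can hide the problem.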
 

byz

Active Member
Licensed User
This is my VB.net code.
vb.net:
Imports System.IO
Imports System.IO.Compression
Imports System.Text

Public Class Gzip
    ''' <summary>
    ''' Compresses a string and converts it to a Base64 string
    ''' </summary>
    ''' <param name="input">The string to compress</param>
    ''' <returns>The compressed Base64 string</returns>
    Public Shared Function CompressString(input As String) As String
        ' Convert the string to a byte array
        Dim inputBytes As Byte() = Encoding.UTF8.GetBytes(input)

        ' Compress with Gzip
        Using memoryStream As New MemoryStream()
            Using gzipStream As New GZipStream(memoryStream, CompressionMode.Compress)
                gzipStream.Write(inputBytes, 0, inputBytes.Length)
            End Using

            ' Convert the compressed byte array to a Base64 string
            Dim compressedBytes As Byte() = memoryStream.ToArray()
            Return Convert.ToBase64String(compressedBytes)
        End Using
    End Function

    ''' <summary>
    ''' Decompresses a Base64 string back to the original string
    ''' </summary>
    ''' <param name="compressedBase64">The compressed Base64 string</param>
    ''' <returns>The decompressed original string</returns>
    Public Shared Function DecompressString(compressedBase64 As String) As String
        ' Convert the Base64 string to a byte array
        Dim compressedBytes As Byte() = Convert.FromBase64String(compressedBase64)

        ' Decompress with Gzip
        Using memoryStream As New MemoryStream(compressedBytes)
            Using gzipStream As New GZipStream(memoryStream, CompressionMode.Decompress)
                Using reader As New StreamReader(gzipStream, Encoding.UTF8)
                    Return reader.ReadToEnd()
                End Using
            End Using
        End Using
    End Function
End Class
 

emexes

Expert
Licensed User
Well, that all looks ok. Can you temporarily modify the VB line:

vb.net:
Dim inputBytes As Byte() = Encoding.UTF8.GetBytes(input)

to:

vb.net:
'''Dim inputBytes As Byte() = Encoding.UTF8.GetBytes(input)
Dim inputBytes As Byte() = Encoding.UTF8.GetBytes("Hello, World!")

and post the resultant Base64 output? (which may well be longer than the original 13 bytes)
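
For comparison, the same "Hello, World!" experiment can be sketched in plain Java, the platform B4J runs on. The names GzipBase64, compress and decompress are made up for illustration. The exact bytes may differ from what .NET's GZipStream produces, because header fields and compression settings vary between implementations, but any gzip output starts with 0x1f 0x8b 0x08, which Base64 encodes as "H4sI".

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical round-trip helper mirroring the VB.NET class above.
final class GzipBase64 {

    // UTF-8 bytes -> gzip -> Base64 text.
    static String compress(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Base64.getEncoder().encodeToString(buffer.toByteArray());
    }

    // Base64 text -> gzip bytes -> UTF-8 string.
    static String decompress(String base64) {
        byte[] compressed = Base64.getDecoder().decode(base64);
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return new String(gzip.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String encoded = compress("Hello, World!");
        System.out.println(encoded);             // begins with H4sI
        System.out.println(decompress(encoded));
    }
}
```

The encoded text is longer than the 13 input bytes because gzip adds a 10-byte header and an 8-byte trailer before Base64 expands the result by about a third. A Base64 string that does not begin with "H4sI" is therefore a strong hint that it was never gzip-compressed.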
 

byz

Active Member
Licensed User
The test data is in the attachment.
 

William Lancee

Well-Known Member
Licensed User
Longtime User
Here is your test data decoded, based on what @emexes found.
@teddybear already had the solution in #4!

B4X:
    Dim su As StringUtils
    Dim s As String=File.ReadString(File.DirAssets,"base64.txt")
    Dim d2() As Byte
    d2=su.DecodeBase64(s)
    Dim bc As ByteConverter
    Log(bc.StringFromBytes(d2 , "UTF8"))
    '{"appName":"WeatherTracker","version":"2.3.5","features":["real-time updates","5-day forecast","severe weather alerts","custom locations"],"settings":{"temperatureUnit":"Celsius","notificationEnabled":true,"refreshInterval":30,"theme":"dark"},"user":{"id":"WTU-7845-2023","premiumMember":false,"lastActive":"2023-11-15T08:23:17Z"},"systemInfo":{"apiCallsToday":147,"uptime":"36:14:22","memoryUsage":"64%"},"messages":[{"id":1,"text":"Welcome to WeatherTracker!","priority":"low"},{"id":2,"text":"New update available","priority":"medium"}]}
 