Android Question Best way to minimize HTML + Javascript

max123

Well-Known Member
Licensed User
Longtime User
Hi all,

Sometimes we use a WebView to load a local (or non-local) HTML file.

To reduce page overhead and speed up page loading, it is good practice to minify the file:
remove everything that is not needed to interpret the code and is only useful during
development, such as unused whitespace, comments, line breaks, tabs, etc.
See https://en.wikipedia.org/wiki/Minification_(programming)

This is different from, but loosely similar to, compiling e.g. Java to bytecode or C++ to a binary file.

With this in mind, I searched the forum to see whether someone had already written code that
takes an HTML string and returns a smaller, working minified string.

It seems no one has written this, or at least not posted it to the forum, so I tried to write it myself.

I wrote the code below. It works and returns the full HTML as one long line, reducing the size by 15-20%.

It is a bit spartan and probably only works for my use case, not for all HTML + CSS + JS strings.

My question is: are there better ways to do this?

Thanks
B4X:
Private Sub MinifyHTMLFile(Original As StringBuilder) As ResumableSub
 
    Dim isStyle As Boolean = False
    Dim isScript As Boolean = False
    Dim isMultilineComment As Boolean = False
 
    Dim Minified, FinalMinified, Styles, Scripts As StringBuilder
 
    Minified.Initialize
    FinalMinified.Initialize
    Styles.Initialize
    Scripts.Initialize
 
    Dim lines() As String = Regex.Split(CRLF, Original.ToString)
    Log("NUMBER OF LINES: " & lines.Length)
 
    For i = 0 To lines.Length-1
        Dim line As String = lines(i).Trim
 
        If line.Length = 0 Then
            Log("Found void string on line " & (i+1))
        Else
            If line.StartsWith("<style>") Then
                isStyle = True
                Minified.Append(line).Append(CRLF)
                Log("LINE [ " & (i+1) & "]: " & line & " [START STYLE]")
            Else If line.StartsWith("</style>") Then
                isStyle = False
                Minified.Append(Styles.ToString).Append(CRLF)
                Minified.Append(line).Append(CRLF)
                Log("LINE [ " & (i+1) & "]: " & line & " [END STYLE]")
                Log("FULL STYLE: [" & Styles.ToString & "]")
            Else If line.StartsWith("<script>") Then
                isScript = True
                Minified.Append(line).Append(CRLF)
                Log("LINE [ " & (i+1) & "]: " & line & " [START SCRIPT]")
            Else If line.StartsWith("</script>") Then
                isScript = False
                Minified.Append(Scripts.ToString).Append(CRLF)
                Minified.Append(line).Append(CRLF)
                Log("LINE [ " & (i+1) & "]: " & line & " [END SCRIPT]")
                Log("FULL Script: [" & Scripts.ToString & "]")
            Else If line.StartsWith("<") And line.EndsWith(">") Then
                Minified.Append(line).Append(CRLF)
                Log("LINE [ " & (i+1) & "]: " & line & " [HTML TAG]")
            Else
                If line.StartsWith("/*") Then isMultilineComment = True  'MULTILINE COMMENT
 
                If isStyle Then              
                    If line.StartsWith("//") = True Then
                        Log("Found Full Line [STYLE] comment on line " & (i+1) & " -> [" & line & "]")
                    Else
                        If line.Contains("//") = False Then
                            If isMultilineComment = False Then
                                Styles.Append(line)
                                Log("LINE [ " & (i+1) & "]: " & line & " [STYLE]")
                            Else
                                Log("LINE [ " & (i+1) & "]: " & line & " [STYLE MULTILINE COMMENT]")
                            End If
                        Else
                            Dim n As Int = line.IndexOf("//")
                            Dim comment As String = line.SubString(n).Trim
                            Dim tmp As String = line.SubString2(0, n).Trim
                            Styles.Append(tmp)
                            Log("Found Partial Line [STYLE] comment on line " & (i+1) & " -> [" & comment & "]    LINE WITHOUT COMMENTS -> [" & tmp & "]")
                        End If
                    End If
                Else If isScript Then
                    If line.StartsWith("//") = True Then
                        Log("Found Full Line [SCRIPT] comment on line " & (i+1) & " -> [" & line & "]")
                    Else
                        If line.Contains("//") = False Or line.Contains("\/") = True Then  ' We add Escape
                            If isMultilineComment = False Then
                                Scripts.Append(line)
                                Log("LINE [ " & (i+1) & "]: " & line & " [SCRIPT]")
                            Else
                                Log("LINE [ " & (i+1) & "]: " & line & " [SCRIPT MULTILINE COMMENT]")
                            End If
                        Else
                            Dim n As Int = line.IndexOf("//")
                            Dim comment As String = line.SubString(n).Trim
                            Dim tmp As String = line.SubString2(0, n).Trim
                            Scripts.Append(tmp)
                            Log("Found Partial Line [SCRIPT] comment on line " & (i+1) & " -> [" & comment & "]    LINE WITHOUT COMMENTS -> [" & tmp & "]")
                        End If
                    End If
                End If
 
                If line.EndsWith("*/") Then isMultilineComment = False
            End If

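            'A sub declared As ResumableSub needs at least one Sleep or Wait For call.
            'Yield to the message queue every 200 lines so long files do not block the UI.
            If i Mod 200 = 0 Then Sleep(0)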
        End If
    Next
 
    LogColor("MINIFIED MULTILINE:", xui.Color_Green)
 
    '''''' Log it line by line
    Dim lines() As String = Regex.Split(CRLF, Minified.ToString)
    For Each line As String In lines
        LogColor(line, xui.Color_Yellow)
    Next
 
    FinalMinified.append(Minified.ToString.Replace(CRLF, ""))  ' Make it one line
 
    LogColor("NO MINIFIED Size: " & Original.Length & " Bytes", xui.Color_Red)
    LogColor("MINIFIED Size: " & FinalMinified.Length & " Bytes", xui.Color_Red)
 
    LogColor("Size decreased by a factor of " & NumberFormat(100 - MapFloat(FinalMinified.Length, 0, Original.Length, 0, 100), 0, 2) & " percent", xui.Color_Green)

    Return FinalMinified
End Sub

'Re-maps a Float number from one range to another.
Sub MapFloat(Value As Float, fromLow As Float, fromHigh As Float, toLow As Float, toHigh As Float) As Float
    Return ( (Value - fromLow) * (toHigh - toLow) / (fromHigh - fromLow) + toLow )
End Sub

I use it this way:
B4X:
    CreateHTMLString
    Wait For (MinifyHTMLFile(mFullHTML)) Complete (Minified As StringBuilder)
    SaveHTMLFile(Minified.ToString)
 

max123

Hi @DonManfred, no, I can assure you that I save a lot of space: 16-24% or more, depending on the HTML file.

I remove all single-line and multiline HTML comments (my HTML does not use them, but it is useful).
I remove all unnecessary spaces.
I process the script and style tags, remove single-line and multiline comments, and put each on a single line.
Finally I remove all line breaks.

The result is one very long line. It is faster to load and takes fewer bytes on DirInternal.

You can try it yourself with the code I posted.
It is not perfect and may have problems with other HTML files.

Attached a screenshot.
 

Attachments

  • Screenshot 2024-04-22 161917.png
    65 KB · Views: 69

OliverA

Expert
Licensed User
Longtime User
max123 said: "My question is: are there better ways to do this?"
It depends. Looking at a Java library that can do HTML/CSS compression (https://github.com/serg472/htmlcompressor), it has a lot of code for something that one might think is easy to do. The question is how much benefit hundreds or thousands of lines of additional code would give you, and whether that would be worth the added complexity of doing proper HTML/CSS compression. As an alternative to rolling your own HTML/CSS compression routines, you could try zipping the files and see whether that satisfies your size and speed needs.
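
The zipping suggestion can be tried from B4X itself. The following is a minimal sketch, assuming the RandomAccessFile library is referenced in the project (it provides the CompressedStreams type). SaveHtmlGzipped and its parameters are hypothetical names used only for illustration, and the sketch has not been run against the HTML from this thread.

```b4x
'Sketch only. Assumes the RandomAccessFile library is referenced (CompressedStreams).
'SaveHtmlGzipped is a hypothetical helper name, not part of any library.
Sub SaveHtmlGzipped(Html As String, FileName As String)
    Dim cs As CompressedStreams
    'Convert the HTML text to bytes, then gzip them in memory.
    Dim raw() As Byte = Html.GetBytes("UTF8")
    Dim packed() As Byte = cs.CompressBytes(raw, "gzip")
    'Store the compressed bytes in the app's internal folder.
    File.WriteBytes(File.DirInternal, FileName, packed)
    Log($"Original: ${raw.Length} bytes, gzipped: ${packed.Length} bytes"$)
End Sub
```

Comparing the two logged sizes with the minifier's result shows whether compression alone is enough. Note that this only shrinks the stored file: the page still has to be decompressed before the WebView can display it.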
 

max123

Yes @OliverA, I looked at the htmlcompressor library. It has a lot of classes, so I will not use it anyway. With some regexes and the right patterns, this could instead be done in B4X with just a couple of lines.
The problem is that using a big library with lots of functions, including compression, plus other dependencies, is not the right direction just to optimize one HTML file at runtime. The code should be short and relatively simple.
I am not looking for a way to, e.g., rename JS variables to shorter names; I just want to remove spaces, comments and all line breaks.
My code does this, but I am not experienced with regex patterns.

As you can see from the screenshot I posted for @DonManfred, in my use case I end up saving 17% of space on the device when saving the HTML file, and the page loads a bit faster. For me this is good; it just may not work well with other HTML strings and may need some hacks.
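
The regex idea mentioned above could look roughly like the following. This is a minimal sketch and not the code from post #1. MinifyWithRegex is a hypothetical name, and the patterns assume a simple page: every JavaScript statement ends with a semicolon, no string or URL contains `/*`, and there is no `<pre>` block or text content that depends on line breaks.

```b4x
'Sketch only: a regex-based minifier for simple, well-formed pages.
'MinifyWithRegex is a hypothetical helper name, not part of any library.
Sub MinifyWithRegex(Html As String) As String
    Dim s As String = Html
    'Remove HTML comments. (?s) lets "." match line breaks as well.
    s = Regex.Replace("(?s)<!--.*?-->", s, "")
    'Remove /* ... */ block comments used in CSS and JavaScript.
    s = Regex.Replace("(?s)/\*.*?\*/", s, "")
    'Remove // comments that occupy a whole line. (?m) makes ^ and $ work per line.
    s = Regex.Replace("(?m)^[ \t]*//.*$", s, "")
    'Strip the whitespace around every line break and drop the line break itself.
    s = Regex.Replace("[ \t]*\r?\n[ \t]*", s, "")
    Return s.Trim
End Sub
```

Trailing `//` comments after code are deliberately left alone here, because a plain pattern cannot tell them apart from `//` inside a string such as "http://...". A line that ends with such a comment would swallow the following code once the line breaks are removed, so those comments need to be removed by hand or handled separately.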
 

OliverA

max123 said: "The problem is, it is not the right direction to use a big library that has lots of functions, even compression and other dependencies, to just optimize one HTML file at runtime."
It depends on a person's interpretation of "best". Some would say best = as small as possible, no matter how much code you need. In your case, best = the best compression ratio for the minimal amount of code, and it looks like you are already there. You can always post your code as a snippet so that others may use it in case they have the same issue, and you can ask other users to post suggestions for improvement in that post.

BTW, in the heyday of coding HTML files by hand, I relied on a tool called tidy to clean up/minify my pages (this way I could write HTML/CSS that was pleasing to the eye, but also create files with all the unnecessary fluff taken out). It looks like the tool has found a bit of a revival here: https://www.html-tidy.org/
 

max123

@JohnC here is my HTML file. I copied it from the B4A log.

I converted it to .txt because uploading it as .html does not work; please rename it to .html.

Thanks
 

Attachments

  • CodeMirror.txt
    11.3 KB · Views: 74

max123

OliverA said: "You can always post your code as a snippet, so that others may use your code in case they have the same issue. You can always request for other users to post suggestions for improvement in that post."
@OliverA it is already posted in post #1. You can try it; it works in my case with my HTML, CSS and JS.
If you want to try it, just take the HTML string I posted for @JohnC, which works for me: after minification I can load it in a WebView without problems. After that you can try other HTML files; if they contain nothing unusual, it should work.
If you change something in the code, please post the new code.

Yes, thanks, good to know. I see Tidy in my PSPad, which is a must for me. If you do not use it, it is a text editor with great functionality; it even has an HTML emulator to test HTML files. But it is just an editor like Notepad++, and none of these tools can be called from code or process dynamic HTML created by code at runtime.
 

TILogistic

Expert
Licensed User
Longtime User
I saw your post and I find it interesting, so I worked on it.

For a WebView or a web server, though, I have used some JavaScript compressors and obfuscators that perform this task.

If you continue with this idea, count on us.

If you look at Banano, it does this too.

Greetings.
 

max123

Many thanks @TILogistic, very good work...
"My current project will be reduced from about 182 kB to about 93 kB within 1.5 seconds with Terser's default settings. I think that's not bad."
This removes a lot of data; it probably does heavy optimizations like replacing variable names with shorter ones and so on.

I know nothing about NodeJS. When I worked with Threejs I had two options: use NPM and NodeJS, or use just HTML and JavaScript. I chose the second.

I have in mind to port my code to C++ for use with the ESP32 and similar microcontrollers. There I probably do not have Regex to help (though maybe I do), so I may just apply the same concept and work with characters or strings. This is worthwhile because these microcontrollers have WiFi and can serve web pages, WebSockets and more, and minimizing the HTML + JS size there is essential to speed things up and to reduce precious RAM usage.

Here I have a question for you. Normally, to serve web pages, an ESP32, ESP8266, Raspberry Pi Pico or similar uses gzipped HTML files to reduce the size, loading them from flash memory of 4, 8, 16 or 32 MB.
Is there an option in B4X to load a gzipped HTML file directly into a WebView? Maybe using JavaObject or inline Java?
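
One possible approach, offered as a sketch rather than a confirmed answer: decompress the file in B4X and pass the resulting string to the WebView. It assumes the RandomAccessFile library is referenced (for CompressedStreams) and that the file was gzipped from UTF-8 text. LoadGzippedHtml and its parameters are hypothetical names. This does not make the WebView read the gzip file directly; the decompression happens in B4X first.

```b4x
'Sketch only. Assumes the RandomAccessFile library is referenced (CompressedStreams).
'LoadGzippedHtml is a hypothetical helper name, not part of any library.
Sub LoadGzippedHtml(wv As WebView, FileName As String)
    Dim cs As CompressedStreams
    'Read the gzipped file from the app's internal folder.
    Dim packed() As Byte = File.ReadBytes(File.DirInternal, FileName)
    'Decompress in memory and turn the bytes back into text.
    Dim raw() As Byte = cs.DecompressBytes(packed, "gzip")
    wv.LoadHtml(BytesToString(raw, 0, raw.Length, "UTF8"))
End Sub
```

This saves storage space but not load time inside the WebView, since the page is handed over at its full, uncompressed size.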

I am currently working on a port of the CodeMirror code editor for B4A and I am trying to optimize it.
I have to write a small JavaScript IDE with B4A; in fact I have already written one. It is simple but fully functional, but it uses just a normal EditText where I write JS code, and shows the threejs scenes in another WebView. Next I want to replace the EditText with a CodeMirror editor, in which I have already implemented some useful things like highlighting, code folding, automatic error recognition and more.

Many thanks for your tips!
 