Android Question [B4X] Extract URLs from string

Alexander Stolte

Hello, is there a way to extract all URLs from a string, like:

B4X:
www.b4x.com 'Add2List
b4x.com     'Add2List
https://www.b4x.com 'Add2List
api.b4x.com 'Add2List

B4X:
Hello this is a test string b4x.com i need another
url www.b4x.com oh i see,
it has an api api.b4x.com thats very cool.
Oh i see it's not ssl encripted https://api.b4x.com
now its better.

Thanks :)
 
Star-Dust

I think this can be done with regex, but I'm not at my PC right now, so I can't try it.
 

JohnC

ChatGPT says:

To extract all URLs from a given string in B4A (Basic4Android), you can use a regular expression (regex) that matches URLs. The following routine demonstrates how to do this. The Regex object is part of B4A's core library, so no additional library reference is needed.

This routine uses a basic pattern to match URLs. It's worth noting that URL matching with regular expressions can get quite complex, especially if you want to capture all possible variations of URLs. The provided pattern should match the examples you've given but might need adjustments for more complex cases.


B4X:
Sub ExtractURLsFromString(TargetString As String) As List
    Dim m As Matcher ' named m so the variable is not confused with the Matcher type
    Dim URLPattern As String
    Dim Result As List
    Result.Initialize

    ' Optional http:// or https:// scheme, optional www., then a dotted host name
    ' ending in a TLD of two or more letters, then an optional path.
    ' Adjust the pattern as needed for more specific use cases.
    URLPattern = "(https?://)?(www\.)?[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,}(/\S*)?"

    m = Regex.Matcher(URLPattern, TargetString)

    ' Each call to Find advances to the next match; Match returns its text.
    Do While m.Find
        Result.Add(m.Match)
    Loop
   
    Return Result
End Sub

To use this subroutine, call it with the string you want to extract URLs from, and it will return a list of matched URLs. Here is how you might call this subroutine and output the result:

B4X:
Sub Activity_Create(FirstTime As Boolean)
    Dim testString As String
    testString = "Your long test string here..."
   
    Dim urls As List
    urls = ExtractURLsFromString(testString)
   
    For Each url As String In urls
        Log(url) ' Output each URL to the logs
    Next
End Sub

Remember to replace "Your long test string here..." with the actual string you're analyzing. This routine will log all found URLs to the B4A logs, which you can view in the IDE.
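
As a quick check against the sample text from the first post, a call might look like the sketch below. The Sub name TestExtractURLs is made up for illustration, and the expected matches listed in the comment come from reading the pattern by hand, not from a run on a device.

B4X:
Sub TestExtractURLs
    ' Sample text from the question, written as a multi-line smart string literal.
    Dim sample As String = $"Hello this is a test string b4x.com i need another
url www.b4x.com oh i see,
it has an api api.b4x.com thats very cool.
Oh i see it's not ssl encripted https://api.b4x.com
now its better."$

    Dim urls As List = ExtractURLsFromString(sample)

    ' Expected matches, in order:
    ' b4x.com, www.b4x.com, api.b4x.com, https://api.b4x.com
    For Each url As String In urls
        Log(url)
    Next
End Sub

Because the pattern accepts any dotted name that ends in two or more letters, text such as a file name (notes.txt, for example) would also be returned, so some filtering of the result may be needed.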
 
Solution

drgottjr

I understand this topic is closed, but I think "ws://", "wss://", "ftp://" and a number of other schemes are also considered URLs.
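
One possible way to cover those, as a sketch only, is to widen the optional scheme group of the pattern from the accepted answer. The Sub name ExtractURLsWithSchemes and the particular list of schemes are illustrative assumptions. With the accepted pattern, a scheme such as ftp:// is not part of the match, so only the host name that follows it would be returned.

B4X:
Sub ExtractURLsWithSchemes(TargetString As String) As List
    Dim Result As List
    Result.Initialize

    ' Same pattern as the accepted answer, except that the optional
    ' scheme group also allows ws, wss and ftp (an illustrative choice).
    Dim URLPattern As String = "((https?|wss?|ftp)://)?(www\.)?[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,}(/\S*)?"

    ' Matcher2 takes an options argument; CASE_INSENSITIVE lets
    ' upper-case schemes such as HTTPS:// match as well.
    Dim m As Matcher = Regex.Matcher2(URLPattern, Regex.CASE_INSENSITIVE, TargetString)

    Do While m.Find
        Result.Add(m.Match)
    Loop

    Return Result
End Sub

Other schemes can be added to the alternation in the same way, separated by the | character.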
 