Hopefully an easy question for someone.
I am reading a large HTML file from a server and scanning it line by line to find links to image files. Each line should contain a thumbnail link and a main image link, and my app loops through the lines picking these up with a regex. This worked fine until a few days ago.
Now, checking the original web page HTML in Notepad++, I see that the hyperlinks have a CRLF in the middle. This has no effect in a web browser, but my app sees only part of the link and therefore finds no files.
I don't really want to write two loops if I can help it, as the web page could change back and be okay again.
Can I import the whole HTML file somehow and "clean" the code so that my app will work?
Example:
old:
B4X:
<a href="thumbs/niketta-1-8007.jpg"><img alt="" src="thumbs/niketta-1.jpg" border="2" height="316" width="200"></a><br>
new:
B4X:
<p> <i> </i> <a href="thumbs/niketta-1-8007.jpg"><img
alt="" src="thumbs/niketta-1.jpg" border="2"
height="316" width="200"></a><br>
Many thanks
Mark
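One possible approach, sketched in Python rather than B4X so it can be run anywhere: read the whole file into a single string, replace every run of CR/LF characters with one space, and run the regex once over the cleaned text instead of line by line. The same code then handles both the old and the new layout. The names `clean_html`, `find_image_links` and `LINK_RE` are made up for this illustration, and the pattern is only an assumption based on the two snippets in the post, not the regex the app actually uses.

```python
import re

# Hypothetical pattern: an <a href="..."> that directly wraps an <img src="...">.
# Group 1 is the href value, group 2 is the img src value.
LINK_RE = re.compile(
    r'<a\s+href="([^"]+)"[^>]*>\s*<img\b[^>]*?\bsrc="([^"]+)"',
    re.IGNORECASE,
)


def clean_html(html: str) -> str:
    """Replace each run of CR and/or LF characters with a single space."""
    return re.sub(r"[\r\n]+", " ", html)


def find_image_links(html: str) -> list[tuple[str, str]]:
    """Return (href, src) pairs found anywhere in the whole document."""
    return LINK_RE.findall(clean_html(html))


# The two layouts from the post; the CRLFs in `new` sit between attributes.
old = (
    '<a href="thumbs/niketta-1-8007.jpg"><img alt="" src="thumbs/niketta-1.jpg" '
    'border="2" height="316" width="200"></a><br>'
)
new = (
    '<p> <i> </i> <a href="thumbs/niketta-1-8007.jpg"><img\r\n'
    'alt="" src="thumbs/niketta-1.jpg" border="2"\r\n'
    'height="316" width="200"></a><br>'
)

for label, html in (("old", old), ("new", new)):
    print(label, find_image_links(html))
```

Replacing the line breaks with a space (rather than deleting them) keeps attributes such as `<img` and `alt` separated, which matches the example above. If a break ever landed inside a URL itself, replacing with an empty string would be the safer choice for that case.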