I need to remove the value <A HREF="UID50B5C7EEF7"> from a given string.
The highlighted characters can be any letters or digits, in any quantity.
Please help me with the regular expression.
I'm not sure I understand what you're trying to remove, but look at these:
B4X:
Dim s As String = $"<A HREF="UID50B5C7EEF7">"$
Dim matcher As Matcher = Regex.Matcher($"<A HREF="(.+?)">"$, s)
If matcher.Find Then
s = Regex.Replace(matcher.Group(1),s,"")
Log("s now: " & s)
End If
Dim s2 As String = $"<A HREF="UID50B5C7EEF7">"$
Dim matcher As Matcher = Regex.Matcher($"(<A HREF=".+?">)"$, s2)
If matcher.Find Then
s2 = Regex.Replace(matcher.Group(1),s2,"")
Log("s2 now: " & s2)
End If
I apologize for poorly expressing the required actions.
The line in which the replacement needs to be made: <A HREF="UID50B5C7EEF789AB44AECFF">Source </A>Task<A HREF="UIDD034051C4B77FA419DC48"> Name</A>
I need to get: Source Task Name
Dim s As String = $"<A HREF="UID50B5C7EEF789AB44AECFF">Source </A>Task<A HREF="UIDD034051C4B77FA419DC48"> Name</A>"$
Dim matcher As Matcher = Regex.Matcher($"<A HREF="(.+?)">Source </A>Task<A HREF="(.+?)"> Name</A>"$,s)
If matcher.Find Then
Log(" source = " & matcher.Group(1))
Log(" name = " & matcher.Group(2))
End If
Sub Process_Globals
Private HtmlParser As MiniHtmlParser
End Sub
B4X:
Dim SB As StringBuilder
SB.Initialize
Dim text As String = $"<A HREF="UID50B5C7EEF789AB44AECFF">Source </A>Task<A HREF="UIDD034051C4B77FA419DC48"> Name</A>"$
HtmlParser.Initialize
Dim root As HtmlNode = HtmlParser.Parse(text)
Dim nodes As Int = root.Children.Size
For i = 0 To nodes - 1
Dim node As HtmlNode = root.Children.Get(i)
Select node.Name
Case "A"
Dim value As String = HtmlParser.GetTextFromNode(node, 0)
Case "text"
Dim value As String = HtmlParser.GetAttributeValue(node, "value", "")
End Select
'Log(value)
SB.Append(value)
Next
Log(SB.ToString)
If these are the requirements, then this regex should do it:
B4X:
Dim s As String = $"<A HREF="UID50B5C7EEF789AB44AECFF">Source </A>Task<A HREF="UIDD034051C4B77FA419DC48"> Name</A>"$
Dim result As String = Regex.Replace("<[^>]+>",s, "")
Log(result)
Alwaysbusy, thank you, that's what I wanted!
Could you also clarify how the pattern should be changed if each of Source, Task and Name can consist of several words separated by spaces?
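A note on the question above: the pattern "<[^>]+>" only matches the tags themselves, so it should not need to change when the text between the tags contains several words. What can go wrong is spacing, because removing a tag outright may glue two neighbouring values together. A minimal sketch, assuming it is acceptable to replace each tag with a space and then collapse runs of whitespace (the sample string is the multi-word one used elsewhere in this thread):

B4X:
```b4x
Dim s As String = $"<A HREF="UID50B5C7EEF789AB44AECFF">Origen 1 2 </A>Tarea 1 2<A HREF="UIDD034051C4B77FA419DC48">Nombre Apellido</A>"$
' Replace every tag with a single space so adjacent values stay separated.
Dim result As String = Regex.Replace("<[^>]+>", s, " ")
' Collapse runs of whitespace into one space and drop leading/trailing spaces.
result = Regex.Replace("\s+", result, " ").Trim
Log(result) ' Origen 1 2 Tarea 1 2 Nombre Apellido
```

If the tags are replaced with an empty string instead, "Tarea 1 2" and "Nombre Apellido" end up joined without a space in this sample, which is why the sketch uses a space as the replacement.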
regex101.com (regular expression tester with syntax highlighting and explanations)
use:
B4X:
Dim html As String = $"<A HREF="UID50B5C7EEF789AB44AECFF">Origen 1 2 </A>Tarea 1 2<A HREF="UIDD034051C4B77FA419DC48">Nombre Apellido</A>"$
Dim Pattern As String = "(<[^>]+>([^<]+)</[^>]+>)|([^<]+)"
Dim Matcher As Matcher = Regex.Matcher(Pattern, html)
Do While Matcher.Find
Dim InsideTag As String = Matcher.Group(2)
Dim OutsideTag As String = Matcher.Group(3)
If Not(InsideTag = Null) Then Log(InsideTag)
If Not(OutsideTag = Null) Then Log(OutsideTag)
Loop
Dim html As String = $"<A HREF="UID50B5C7EEF789AB44AECFF">Origen 1 2 </A>Tarea 1 2<A HREF="UIDD034051C4B77FA419DC48">Nombre Apellido</A>"$
Dim Pattern As String = "(<[^>]+>([^<]+)</[^>]+>)|([^<]+)"
Dim Matcher As Matcher = Regex.Matcher(Pattern, html)
Dim sb As StringBuilder
sb.Initialize
Do While Matcher.Find
Dim InsideTag As String = Matcher.Group(2)
Dim OutsideTag As String = Matcher.Group(3)
If Not(InsideTag = Null) Then
sb.Append(InsideTag).Append(" ")
Log(InsideTag)
End If
If Not(OutsideTag = Null) Then
sb.Append(OutsideTag).Append(" ")
Log(OutsideTag)
End If
Loop
Log(sb.ToString.Trim)
With a generic function to extract the values, you can do whatever you want with them:
B4X:
Dim html As String = $"<A HREF="UID50B5C7EEF789AB44AECFF">Origen 1 2 </A>Tarea 1 2<A HREF="UIDD034051C4B77FA419DC48">Nombre Apellido</A>"$
Dim sb As StringBuilder
sb.Initialize
For Each s As String In ExtractAllTagValues(html)
Log(s)
sb.Append(s).Append(" ")
Next
Log(sb.ToString.Trim)
B4X:
Public Sub ExtractAllTagValues(html As String) As List
Dim Pattern As String = "(<[^>]+>([^<]+)</[^>]+>)|([^<]+)"
Dim Matcher As Matcher = Regex.Matcher(Pattern, html)
Dim Values As List
Values.Initialize
Do While Matcher.Find
If Not(Matcher.Group(2) = Null) Then Values.Add(Matcher.Group(2))
If Not(Matcher.Group(3) = Null) Then Values.Add(Matcher.Group(3))
Loop
Return Values
End Sub
TILogistic, thank you very much! You solved my problem. I have one more task: how can I get the value of the string "Tarea 1 2" first, and then the values of the remaining strings? This string can be at the beginning, in the middle or at the end of the whole string. I hope I am not being too annoying with my questions.
Public Sub Test1
Dim html(6) As String
html(0) = $"Tarea 1 2<A HREF="UIDD0">Nombre Apellido</A><A HREF="UID50">Origen 1 2</A>"$
html(1) = $"Tarea 1 2<A HREF="UID50">Origen 1 2</A><A HREF="UIDD0">Nombre Apellido</A>"$
html(2) = $"<A HREF="UID50">Origen 1 2</A>Tarea 1 2<A HREF="UIDD0">Nombre Apellido</A>"$
html(3) = $"<A HREF="UIDD0">Nombre Apellido</A>Tarea 1 2<A HREF="UID50">Origen 1 2</A>"$
html(4) = $"<A HREF="UID50">Origen 1 2</A><A HREF="UIDD0">Nombre Apellido</A>Tarea 1 2"$
html(5) = $"<A HREF="UIDD0">Nombre Apellido</A><A HREF="UID50">Origen 1 2</A>Tarea 1 2"$
For i = 0 To html.Length - 1
For Each s As String In ExtractAllTagHtmlValues(html(i))
Log(s)
Next
Log("--------------------")
Next
End Sub
B4X:
Public Sub ExtractAllTagHtmlValues(Html As String) As List
Dim Pattern As String = "(<[^>]+>([^<]+)</[^>]+>)|([^<]+)"
Dim Matcher As Matcher = Regex.Matcher(Pattern, Html)
Dim Values As List
Values.Initialize
Do While Matcher.Find
If Not(Matcher.Group(2) = Null) Then Values.Add(Matcher.Group(2))
If Not(Matcher.Group(3) = Null) Then Values.InsertAt(0,Matcher.Group(3))
Loop
Return Values
End Sub
B4X:
Call B4XPages.GetManager.LogEvents = True to enable logging B4XPages events.
Tarea 1 2
Nombre Apellido
Origen 1 2
--------------------
Tarea 1 2
Origen 1 2
Nombre Apellido
--------------------
Tarea 1 2
Origen 1 2
Nombre Apellido
--------------------
Tarea 1 2
Nombre Apellido
Origen 1 2
--------------------
Tarea 1 2
Origen 1 2
Nombre Apellido
--------------------
Tarea 1 2
Nombre Apellido
Origen 1 2
--------------------
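The InsertAt(0, ...) approach above works well when the string contains a single piece of untagged text. If there can be several untagged pieces, each one is inserted in front of the previous one, so they come out in reverse order. A possible variation, only a sketch, is to collect the two kinds of values in separate lists. The sub name SplitTagValues and the map keys "outside" and "inside" are made up for this example; the pattern is the same one used above.

B4X:
```b4x
' Hypothetical helper: returns a Map with two Lists.
' "outside" holds text found outside any tag (e.g. "Tarea 1 2"),
' "inside" holds text found between an opening and a closing tag.
Public Sub SplitTagValues(Html As String) As Map
	Dim Pattern As String = "(<[^>]+>([^<]+)</[^>]+>)|([^<]+)"
	Dim Matcher As Matcher = Regex.Matcher(Pattern, Html)
	Dim Outside As List
	Outside.Initialize
	Dim Inside As List
	Inside.Initialize
	Do While Matcher.Find
		' Group 2 is set when the match is a tagged value, group 3 when it is loose text.
		If Not(Matcher.Group(2) = Null) Then Inside.Add(Matcher.Group(2))
		If Not(Matcher.Group(3) = Null) Then Outside.Add(Matcher.Group(3))
	Loop
	Return CreateMap("outside": Outside, "inside": Inside)
End Sub
```

Usage could look like this: read the "outside" list first, then the "inside" list, each in its original order.

B4X:
```b4x
Dim parts As Map = SplitTagValues(html(2))
Dim Outside As List = parts.Get("outside")
Dim Inside As List = parts.Get("inside")
For Each s As String In Outside
	Log(s)
Next
For Each s As String In Inside
	Log(s)
Next
```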