[Rspamd-Users] Unexpected URLs
Steve Sturges (ststurge)
ststurge at cisco.com
Thu Jul 21 19:45:28 UTC 2022
Hi all—
While testing something with rspamd 3.2, I have a multipart email body in which one of the parts is text/html:
--_000_6be055295eab48a5af7ad4022f33e2d0_
Content-Type: text/html; charset="utf-8"
<html><body>
<a href="http://somewhere.example.net">https://somewhereelse.otherexample.com</a>
</html>
In a Lua plugin I’m building, I run task:get_parts() followed by part:get_urls(). It returns two URLs, and both are the value of the href target; neither reflects the text that would be displayed.
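local rspamd_logger = require "rspamd_logger"

-- Collect the URLs found in every text part of the message.
-- task: the rspamd task being processed
local function url_test(task)
  local all_urls = {}
  local parts = task:get_parts()
  if not parts then
    return nil
  end
  for _, part in ipairs(parts) do
    if part:is_text() then
      -- URLs extracted from this MIME part only
      local urls = part:get_urls()
      rspamd_logger.debugx("task:get_parts -> part:get_urls: %1", urls)
      for _, url in ipairs(urls) do
        table.insert(all_urls, url)
      end
    end
  end
  return all_urls
end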
The output from the above code:
task:get_parts -> part:get_urls: {[1] = http://somewhere.example.net, [2] = http://somewhere.example.net}
I can definitely see a reason to return two URLs, since the link text differs from the target. However, the result is unexpected: I would expect either a single URL from the href, or both the URL from the href and the one that appears as the display text.
Any ideas before I dig into the C++ side of rspamd?
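To make that expectation concrete, here is a minimal standalone sketch in plain Lua. It does not use rspamd's HTML parser or any rspamd API; extract_links is a hypothetical helper name, and the Lua pattern only handles simple anchors with a double-quoted href, so treat it as an illustration of the "href plus display text" result rather than a real HTML parser.

```lua
-- Standalone sketch (plain Lua, no rspamd APIs): for each <a> tag, collect
-- both the href target and the visible link text.
-- ASSUMPTION: extract_links is a hypothetical helper; Lua patterns are not
-- an HTML parser, and only simple double-quoted href attributes are handled.
local function extract_links(html)
  local links = {}
  -- First capture: the href value. Second capture: the text up to </a>.
  for href, text in html:gmatch('<a%s[^>]-href="([^"]*)"[^>]*>(.-)</a>') do
    links[#links + 1] = { href = href, text = text }
  end
  return links
end

-- The HTML part from the test message above.
local sample = [[<html><body>
<a href="http://somewhere.example.net">https://somewhereelse.otherexample.com</a>
</html>]]

local links = extract_links(sample)
for _, link in ipairs(links) do
  print(link.href, link.text)
end
```

For the sample message this yields one link whose href and display text are two different URLs, which is the pair I expected to see instead of the href repeated twice.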
Cheers
-steve