comparison tests/test_format.py @ 33:9e4eb3f2754e

Improve handling of character limits in HTML stripping

The code now keeps closer track of character counts during HTML stripping, so the counts should be exact. When the limit is exceeded, it restarts the stripping without any URLs to prevent incorrect trimming. It also better preserves whitespace from the original post. New tests are added for the Twitter silo to ensure it works as expected.
author Ludovic Chabant <ludovic@chabant.com>
date Wed, 10 May 2023 16:10:12 -0700
parents c898b4df0f29
children 486affad656e
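
The restart-without-URLs behaviour described in the changeset message can be sketched roughly as follows. This is only an illustration, not the project's actual code: `strip_with_limit` and `_TextAndLinks` are hypothetical names, the limit is measured with plain `len()` (an assumption; a real silo may count URLs differently), and `limit` is assumed to be at least 3. Unlike the real tests, the sketch does not drop link text that merely repeats the URL.

```python
from html.parser import HTMLParser


class _TextAndLinks(HTMLParser):
    """Hypothetical helper: collects visible text and every <a href>."""

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.urls.append(href)

    def handle_data(self, data):
        self.text_parts.append(data)


def strip_with_limit(html, limit):
    """Strip tags and list URLs at the bottom; if the result is over
    `limit` characters, start again from the text alone (no URLs)."""
    parser = _TextAndLinks()
    parser.feed(html)
    parser.close()
    text = "".join(parser.text_parts).strip()

    # First pass: the text followed by one URL per line.
    with_urls = "\n".join([text] + parser.urls)
    if len(with_urls) <= limit:
        return with_urls

    # Second pass: drop the URLs so trimming can never cut one in half.
    if len(text) <= limit:
        return text
    return text[:limit - 3].rstrip() + "..."
```

With a generous limit the first test case's HTML yields the text plus its URL on a second line; with a tight limit only the text survives, trimmed if necessary.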
--- tests/test_format.py @ 32:2265920c4688
+++ tests/test_format.py @ 33:9e4eb3f2754e
@@ -43,13 +43,13 @@
     print(expected)
     assert actual == expected
 
 
 @pytest.mark.parametrize("text, expected", [
-    ("<p>Something with <a href=\"http://example.org/blah\">a link</a>",
+    ("<p>Something with <a href=\"http://example.org/blah\">a link</a></p>",
      "Something with a link\nhttp://example.org/blah"),
-    ("<p>Something with a link <a href=\"http://example.org/blah\">http://example.org</a>",  # NOQA
+    ("<p>Something with a link <a href=\"http://example.org/blah\">http://example.org</a></p>",  # NOQA
      "Something with a link\nhttp://example.org/blah"),
     ("<p>Something with <a href=\"http://example.org/first\">one link here</a> and <a href=\"http://example.org/second\">another there</a>...</p>",  # NOQA
      "Something with one link here and another there...\nhttp://example.org/first\nhttp://example.org/second")  # NOQA
 ])
 def test_strip_html_with_bottom_urls(text, expected):