annotate tests/test_commands_populate.py @ 18:a921cc2306bc

Do our own HTML parsing/stripping of micropost contents.
- This lets us properly handle various forms of linking.
- Add tests for processing posts with links.
- Fix configuration in tests.
- Basic error handling for processing posts.
author Ludovic Chabant <ludovic@chabant.com>
date Sun, 16 Sep 2018 21:16:20 -0700
parents a1b7a459326a
children b739ca5feb45
0: a1b7a459326a Initial commit. (Ludovic Chabant <ludovic@chabant.com>)
18: a921cc2306bc Do our own HTML parsing/stripping of micropost contents. (Ludovic Chabant <ludovic@chabant.com>, parents: 0)

rev  line  source
0       1
0       2  feed1 = """
0       3  <html><body>
0       4  <article class="h-entry">
0       5  <h1 class="p-name">A new article</h1>
0       6  <div class="e-content">
0       7  <p>This is the text of the article.</p>
0       8  <p>It has 2 paragraphs.</p>
0       9  </div>
0      10  <a class="u-url" href="https://example.org/a-new-article">permalink</a>
0      11  </article>
0      12  </body></html>"""
0      13
0      14
0      15  def test_populate(cli):
18     16      feed = cli.createTempFeed(feed1)
0      17      cli.appendSiloConfig('test', 'print', items='name')
18     18      cli.setFeedConfig('feed', feed)
18     19      ctx, _ = cli.run('populate', '-s', 'test')
0      20      assert ctx.cache.wasPosted('test', 'https://example.org/a-new-article')
0      21
0      22
0      23  feed2 = """
0      24  <html><body>
0      25  <article class="h-entry">
0      26  <h1 class="p-name">First article</h1>
0      27  <div><time class="dt-published" datetime="2018-01-07T09:30:00-00:00"></time></div>
0      28  <a class="u-url" href="https://example.org/first-article">permalink</a>
0      29  </article>
0      30  <article class="h-entry">
0      31  <h1 class="p-name">Second article</h1>
0      32  <div><time class="dt-published" datetime="2018-01-08T09:30:00-00:00"></time></div>
0      33  <a class="u-url" href="https://example.org/second-article">permalink</a>
0      34  </article>
0      35  <article class="h-entry">
0      36  <h1 class="p-name">Third article</h1>
0      37  <div><time class="dt-published" datetime="2018-01-09T09:30:00-00:00"></time></div>
0      38  <a class="u-url" href="https://example.org/third-article">permalink</a>
0      39  </article>
0      40  </body></html>""" # NOQA
0      41
0      42
0      43  def test_populate_until(cli):
18     44      feed = cli.createTempFeed(feed2)
0      45      cli.appendSiloConfig('test', 'print', items='name')
18     46      cli.setFeedConfig('feed', feed)
18     47      ctx, _ = cli.run('populate', '-s', 'test', '--until', '2018-01-08')
0      48      assert ctx.cache.wasPosted('test', 'https://example.org/first-article')
0      49      assert ctx.cache.wasPosted('test', 'https://example.org/second-article')
0      50      assert not ctx.cache.wasPosted('test', 'https://example.org/third-article')
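
The cutoff behaviour that test_populate_until expects can be illustrated with a small standalone sketch. It is not the project's implementation: the `ENTRIES` list and the `entries_until` helper are hypothetical names, and comparing by calendar date (so an entry published at 09:30 on the cutoff day still counts) is an assumption inferred only from the test's assertions.

```python
from datetime import date, datetime

# Hypothetical stand-in for the parsed h-entries of feed2:
# (u-url, dt-published) pairs copied from the test fixture.
ENTRIES = [
    ("https://example.org/first-article", "2018-01-07T09:30:00-00:00"),
    ("https://example.org/second-article", "2018-01-08T09:30:00-00:00"),
    ("https://example.org/third-article", "2018-01-09T09:30:00-00:00"),
]


def entries_until(entries, until):
    """Return the URLs of entries published on or before the `until` date.

    Assumption: the comparison is made on calendar dates, ignoring the
    time of day, which is one way to satisfy the test's expectations.
    """
    cutoff = date.fromisoformat(until)
    return [
        url
        for url, published in entries
        if datetime.fromisoformat(published).date() <= cutoff
    ]


posted = entries_until(ENTRIES, "2018-01-08")
```

With the cutoff `2018-01-08`, `posted` holds the first and second article URLs and leaves out the third, matching the three assertions at the end of test_populate_until.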