Caching on work index pages doesn't check for include_work_search or exclude_work_search

Description

Steps to Reproduce:

  1. Log in (to avoid nginx's cache, since it may complicate the results).

  2. Visit the following URL for all Marvel works tagged with "Hurt/Comfort":
    https://test.archiveofourown.org/works?include_work_search%5Bfreeform_ids%5D%5B%5D=2379&tag_id=Marvel

  3. Visit the main Marvel listing:
    https://test.archiveofourown.org/tags/Marvel/works

What Should Happen: Visiting the first URL should show the first page of a search with a total of around 3,000 works, and should only display works that are tagged with a synonym or subtag of "Hurt/Comfort." Visiting the second URL should show the first page of a search with a total of around 50,000 works, and should include works that aren't tagged with "Hurt/Comfort."

What Actually Happens: Both pages show exactly the same search results. If the include_work_search version is cached, then the second URL will have the wrong number of results. If the plain version is cached, then the first URL will have the wrong number of results.

(I think in theory it might be possible to get the right results if the second page view is perfectly timed to match the 20 minute expiration for the search results cache, but otherwise both results will be the same – and one of them will be wrong.)

Explanation: Currently, WorksController#index decides whether to cache the search results by checking params[:work_search] to ensure that the user hasn't refined their search in any way. But since the Elasticsearch upgrade, search parameters are split between params[:work_search], params[:include_work_search], and params[:exclude_work_search].

In normal use, any query that sets params[:include_work_search] or params[:exclude_work_search] will also set params[:work_search], so this isn't an issue.

However, if someone tries to edit the URL to eliminate blank parameters, they may end up with a search that sets only params[:include_work_search] and/or params[:exclude_work_search], without also setting params[:work_search]. That kind of search is cached using the same cache key as a blank search. As a result, if one person visits the page with extra search params and one person doesn't, the results that both receive will depend on who visits the page first.

Either WorksController#index should also check both params[:include_work_search] and params[:exclude_work_search], or the work search parameters should be refactored so that they all fall under params[:work_search].

Activity

Show:
Sarken
April 4, 2019, 10:47 AM

The first page showed me 2986 works and the second page showed me 54692, so I think we’re good!

redsummernight
April 4, 2019, 12:37 PM
Edited

I looked at those 2 pages and saw the same counts Sarken did.

Then I looked at https://test.archiveofourown.org/works?exclude_work_search[freeform_ids][]=2379&tag_id=Marvel (Marvel works excluding "Hurt/Comfort") and see 51706 works.

51706 + 2986 = 54692 if my math is correct. Looks good.

Lady Oscar
April 11, 2019, 1:45 AM

Looks good!

DeployedToBeta

Assignee

ticking instant

Reporter

ticking instant

Roadmap

Search

Priority

Medium

Affects versions

Fix versions

Components

BackEnd
Caching

Difficulty

Medium

Milestone

Internal 0.9