Caching

We use caching to store material that is either frequently used or expensive to compute.

We use web2py's caching facilities. We do not deliberately manage browser caching. We use only web2py's lower-level mechanism, cache.ram, and refrain from decorating controllers with @cache or @cache.action, because we have to be selective about which request vars are important for keying the cached items.

All caching is triggered via the Model: the CACHING object. We cache in RAM only, but there is a switch by which we could also cache on disk, if we want to keep the cache between restarts of the server.
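As a rough illustration of this setup, here is a minimal in-RAM model of such a caching object. The class name, method names, and the disk switch are hypothetical stand-ins: in the real application the object wraps web2py's cache.ram (and, via the switch, could wrap cache.disk), which takes a key, a callable, and an expiry time in the same spirit.

```python
import time

class Caching:
    """Hypothetical sketch of a CACHING-style helper.
    Models web2py's cache.ram semantics: look up a key,
    and on a miss compute and store the value with an expiry."""

    def __init__(self, use_disk=False):
        # use_disk models the switch mentioned above; in web2py it
        # would select cache.disk instead of cache.ram. Only the
        # RAM side is modeled here.
        self.use_disk = use_disk
        self._store = {}  # key -> (expire_at, value)

    def get(self, key, compute, time_expire=3600):
        now = time.time()
        hit = self._store.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]  # served from cache
        value = compute()  # cache miss: compute and store
        self._store[key] = (now + time_expire, value)
        return value

calls = []
def expensive():
    calls.append(1)
    return "rendered material"

c = Caching()
a = c.get("passage/Genesis/1", expensive)
b = c.get("passage/Genesis/1", expensive)
# the second call is served from RAM; expensive() ran only once
```

Because the callable is only invoked on a miss, repeated requests for the same material cost one computation per expiry interval.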

Here is a list of what we cache:

We do not cache rendered views, because the views implement tweaks that are dependent on the browser.

Note that what the user sees is the effect of the javascript on the html produced by the rendered view. So the cached data only has to be keyed by those request vars that select content: mr, qw, book, chapter, item id (iid), and (result) page.

I think this strikes a nice balance:

* these chunks of html are equal for all users that visit such a page, regardless of their view settings
* these chunks of html are relatively small, only the material of one page
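The keying principle above can be sketched in a few lines. The helper name is hypothetical; the point is that only the content-selecting request vars enter the key, so view settings never fragment the cache.

```python
def content_key(request_vars):
    """Hypothetical helper: build a cache key from only the request
    vars that select content (mr, qw, book, chapter, iid, page),
    ignoring everything else, such as view settings."""
    fields = ('mr', 'qw', 'book', 'chapter', 'iid', 'page')
    return ':'.join(str(request_vars.get(f, '')) for f in fields)

k1 = content_key({'mr': 'm', 'book': 'Genesis', 'chapter': 1, 'page': 1,
                  'fontsize': 'large'})  # a view setting, deliberately ignored
k2 = content_key({'mr': 'm', 'book': 'Genesis', 'chapter': 1, 'page': 1})
# k1 == k2: two users with different view settings share one cached chunk
```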

It is tempting to cache the SQL queries themselves, but they fetch large amounts of data, of which only a tiny portion shows up, so caching them would use a lot of space. If a user examines a query with 1000 pages of results, it is unlikely that (s)he will visit all of them, so it is not worthwhile to keep the results of that one big query in the cache all the time. On the other hand, many users look at the first page of query results, and by caching individual pages, the number of times that the big query is executed is reduced significantly.

There is one exception: looking up the queries that have results in a given chapter is quite expensive. We alleviate that by making an index of queries by chapter and storing that index in the cache.
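A minimal sketch of building such an index, under the assumption (the function name and data shapes are hypothetical) that we can obtain, per query, the chapters in which it has results; the index is the inversion of that mapping, computed once and then cached:

```python
from collections import defaultdict

def build_chapter_index(query_chapters):
    """Hypothetical sketch: query_chapters maps a query id to the
    chapters in which it has results; invert it into a mapping from
    chapter to the set of query ids, suitable for caching."""
    index = defaultdict(set)
    for qid, chapters in query_chapters.items():
        for chapter in chapters:
            index[chapter].add(qid)
    return index

idx = build_chapter_index({1: ['Genesis 1', 'Genesis 2'],
                           2: ['Genesis 1']})
# looking up all queries with results in 'Genesis 1' is now a
# single dict lookup instead of an expensive database scan
```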

Time consuming and priority

Building this index is time consuming and it has to happen before the website is visited. If pages are served before the index is finished, sidebars may be incomplete, and yet they will be cached, so they remain incomplete.

The update script of SHEBANQ will make a first visit right after the update to counter this.

Updating the index

When a query gets executed, it should be removed from the index and then added again. Therefore we need to know which chapters are affected. For that we also hold an index from queries to chapters in the cache.
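The update step described above can be sketched as follows. Names and data shapes are hypothetical: chapter_index maps a chapter to the set of query ids with results there, and query_index is the reverse index from query id to chapters, used to find which entries to remove.

```python
def reexecute_query(qid, new_chapters, chapter_index, query_index):
    """Hypothetical sketch: on (re)execution of a query, use the
    query -> chapters index to find the affected chapters, remove
    the query there, then insert it under its new chapters."""
    # remove the query from every chapter it was indexed under
    for chapter in query_index.get(qid, ()):
        chapter_index.get(chapter, set()).discard(qid)
    # add it back under the chapters of the fresh results
    for chapter in new_chapters:
        chapter_index.setdefault(chapter, set()).add(qid)
    # keep the reverse index in sync for the next update
    query_index[qid] = set(new_chapters)

chapter_index = {'Genesis 1': {1, 2}, 'Genesis 2': {1}}
query_index = {1: {'Genesis 1', 'Genesis 2'}}
reexecute_query(1, ['Exodus 1'], chapter_index, query_index)
# query 1 is now indexed only under 'Exodus 1'
```

Without the reverse index we would have to scan every chapter entry to find the stale occurrences of the query.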