There is a class of bugs that is worse than crashes. Crashes are loud: they page someone, and they get fixed.

This one is quiet. Your cache is running and your hit rate looks fine, but your database is being hit on every single request for data that has not changed at all. I hit this while implementing cache-aside in a side project.

A Quick Recap of How Cache-Aside Works

Before getting into the bug, let me explain the pattern I was using. Cache-aside is the most common caching strategy:

  1. A request comes in. Check the cache first.
  2. If the value is there, return it. Done.
  3. If it is not there, fetch from the database, store the result in the cache, then return it.

The next request for the same data skips the database entirely.

The way you know whether a key exists in the cache is that cache.get(key) returns None when the key is not there, and returns the actual value when it is. So your logic ends up looking like this:

```
raw = cache.get(key)

if raw is not None:  # None means miss
    return raw       # hit, return it

# Miss. Go to the database.
result = fetch_from_db()
cache.set(key, result)
return result
```

This works great until your database returns None.

When None Has Two Jobs

Here is the scenario. My application tracks LLM traces. Each trace can have evaluation results attached to it. When a trace is brand new, it has no evaluations yet. The database legitimately returns None or an empty list.

Walk through what happens with the code above.

First request:

```
cache.get("eval_results:trace-abc") → None # key doesn't exist, miss

fetch_from_db() → None # DB says "no evals yet", correct answer

cache.set(key, None) # store it
```

Second request, same trace, 200ms later:

```
cache.get("eval_results:trace-abc") → None
```

And here is the problem. Is that None a cache miss, or is it the stored None from the previous request?

You cannot tell. The function sees None and concludes "miss" every single time. It calls the database again, gets None again, stores None again. The entry is never effectively cached; every request loops through the same miss, fetch, store cycle.
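The ambiguity is easy to reproduce with a plain dict standing in for the cache:

```python
cache = {}
cache["k"] = None  # a legitimately empty value, deliberately stored

hit = cache.get("k")         # None — but the key IS there
miss = cache.get("missing")  # None — the key is genuinely absent

# Both lookups return None; the caller cannot tell the cases apart.
assert hit is None and miss is None
```

Any cache whose get returns None for a miss has exactly this blind spot.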

Why This Becomes a Database Problem

On its own, one trace with no evaluations is harmless. The danger is the scale. In my project, after ingesting a batch of traces, none of them have evaluation results yet. That is the normal state right after ingestion. The UI requests eval status for every trace on the page.

With this bug, every page load fires a database query for every trace, every time. The cache offers zero protection for this case, and it is the most common case right after ingestion, exactly when you need it most. With 50 traces on the page, five people refreshing the dashboard at the same time turns 50 queries into 250 queries in seconds. All for data that has not changed. That is the cascade.

The failure is invisible. Just a database working much harder than it should, and an application that will fall over under load that a correctly working cache would have absorbed completely.

The Fix: Give the Stored Nothing Its Own Identity

The problem is that None is doing two jobs at once. It means "key not found in cache" and it means "the actual value is empty." These two things need to be distinguishable.

The fix is a sentinel. Instead of storing literal None, you store a specific string that your real data will never produce.

```
_CACHED_NONE = "__myapp:cached_none__"
```

Now the write path becomes:

```
stored = _CACHED_NONE if result is None else result
cache.set(key, stored)
```

And the read path unwraps it:

```
raw = cache.get(key)

if raw is not None:  # None still means miss
    return None if raw == _CACHED_NONE else raw  # hit: unwrap the sentinel
```

Let's walk through the same scenario:

First request:

```
cache.get(key) → None                      # miss, key doesn't exist
fetch_from_db() → None                     # DB says "no evals yet"
cache.set(key, "__myapp:cached_none__")    # stored as the sentinel
```

Second request:

```
cache.get(key) → "__myapp:cached_none__"   # not None, so it's a HIT
unwrap → return None to caller
```

The database is never touched.

The caller still receives None. The API behaviour is identical, but now the empty answer is cached and the database is protected.
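Putting the write and read paths together, here is a minimal, self-contained sketch of the fixed helper. The InMemoryCache class and the exact signatures are assumptions for illustration (a dict-backed cache with a per-key TTL); the sentinel logic is the part that matters:

```python
import time

_CACHED_NONE = "__myapp:cached_none__"  # sentinel no real value will equal

class InMemoryCache:
    """Hypothetical dict-backed cache with per-key TTL (illustration only)."""
    def __init__(self):
        self._store = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None  # miss
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key, value, ttl_s):
        self._store[key] = (value, time.monotonic() + ttl_s)

def cache_aside(cache, key, compute, ttl_s):
    raw = cache.get(key)
    if raw is not None:  # None still means miss
        return None if raw == _CACHED_NONE else raw  # hit: unwrap the sentinel
    result = compute()  # miss: go to the source of truth
    cache.set(key, _CACHED_NONE if result is None else result, ttl_s)
    return result
```

With this version, a computed None is stored as the sentinel, so the second lookup is a hit and compute never runs again until the TTL expires.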

The Test That Proves It

The thing I appreciated most when writing tests for this was how clearly a single test exposes the bug.

```
def test_none_result_is_cached_not_refetched():
    cache = InMemoryCache()
    call_count = 0

    def compute():
        nonlocal call_count
        call_count += 1
        return None  # DB says "no results"

    cache_aside(cache, "k", compute, ttl_s=60)
    cache_aside(cache, "k", compute, ttl_s=60)

    assert call_count == 1  # compute only ran once — None was cached
```

Remove the sentinel and run this against the original code. call_count is 2. That is the bug, made visible. No ambiguity.

The Broader Pattern

This is not specific to caching. Anywhere in a system where None carries two meanings, you have a source of silent bugs.

Python's own standard library handles this with dict.get(key, default). You pass a custom default so you can distinguish "key not found" from "key found, value is None." Same idea.
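That standard-library trick looks like this; the key and dict names are made up for the example:

```python
_MISSING = object()  # a unique sentinel no real value can be identical to

d = {"present_but_none": None}

v = d.get("present_but_none", _MISSING)
assert v is None        # key exists, and its value happens to be None

v = d.get("absent", _MISSING)
assert v is _MISSING    # key genuinely missing — the default came back
```

Because _MISSING is a fresh object, an identity check (is) can never be fooled by real data.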

The rule I now follow: whenever None means two different things in the same flow, one of them needs a name. A sentinel string, a dedicated object, a wrapper type. Anything that gives absence of value its own distinct identity so the code can tell the difference.
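For an in-process store, the "dedicated object" variant of that rule is enough; no string sentinel needed. A hypothetical sketch (get_or_compute and the names in it are illustrations, not the article's code):

```python
_MISSING = object()  # dedicated object sentinel: fine for in-process stores

def get_or_compute(store, key, compute):
    value = store.get(key, _MISSING)
    if value is _MISSING:    # genuinely absent — compute and remember
        value = compute()
        store[key] = value   # a None result is stored, and found as a hit later
    return value
```

A string sentinel like the one above is the safer choice when values pass through serialization to an external cache, where object identity does not survive the round trip.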

The bug I described will not show up in development. Your test database is small, your traffic is low, and a few extra queries are invisible. It surfaces in production, under load, in the form of a database that is inexplicably struggling with read traffic on a day when nothing obvious changed.

Write the test. Make it fail on the old code. Then fix it.