How I hacked the crap out of Next.js ISR

When I started rewriting this website (again), I decided to go with old faithful Next.js, which I have been rather happy with in the past. Like everything in the JavaScript/TypeScript/Node/npm ecosystem, it changes completely if you make the mistake of looking away for two seconds. Change is usually good, and nothing is permanent except change, but it takes a while to adapt when whole concepts get thrown away. This hit me hard with the deprecation of getInitialProps(), which I had actually learned to like: it has its problems in some use cases, but I never had any issues with it. You could argue that "it is only deprecated, you can still use it", and while this is true, I'd rather learn the new ways as I go than hang on to the past, or I'll deprecate myself as I go.

So what's with Next.js 13 and what's new?

Actually, the recommended way to use Next.js 13.4 is the App Router, and the previous "pages router" is probably going to be a thing of the past in the near future. This change happened after I started my work, so I'm sticking with the pages router until things are stable and I have time to either experiment with the App Router or move over to something completely different (looking at Remix, and maybe a serverless front end using Cloudflare Pages etc.).

So in short: getInitialProps() is deprecated, and with the Next.js 13 pages router you can pick from these ways of doing things:

  • Server Side Rendering (SSR): export a getServerSideProps() function, which is called on the server side on every request
  • Static Site Generation (SSG): export a getStaticProps() function, which is called at BUILD TIME, plus a getStaticPaths() function if you have dynamic routing
  • Incremental Static Regeneration (ISR): same as the previous one, but with a revalidate value returned from getStaticProps() so the content gets updated in the background
  • Client-side Rendering (CSR): just fetch the data on the client side
  • Edge and Node.js Runtimes: in practice, vendor-locking yourself to Vercel or managing the whole infrastructure yourself

The first option is good for SEO, but even with careful caching it will not be an SPA-like experience. The second one expects you to have really static content, which is rarely the real-world situation. The fourth option kills your SEO, and for the fifth I don't have the time now, nor do I want to go the Vercel way.

So I'm left with the middle one, and it has its own problems: ISR is built on top of SSG, meaning that the site content still needs to be built completely at BUILD TIME. That means I cannot simply have this in my ".gitlab-ci.yml" and get on with it:

    lint:
      stage: test
      cache:
        paths:
          - node_modules
      script:
        - npm install
        - npm run lint

    build app:
      stage: build
      variables:
        NEXT_TELEMETRY_DISABLED: "1"
      cache:
        paths:
          - node_modules
      artifacts:
        paths:
          - .next
      script:
        - npm run build

Why? Because the build calls getStaticProps() and getStaticPaths() on every page that has them and writes the pre-rendered results into the .next directory. Depending on how your CI/CD is configured, this may or may not work at all, and it is sub-optimal to say the least if the site is being built from scratch and the endpoints are not online yet. I'd even go as far as calling it an anti-pattern.

Again I'm left with a couple of options:

  • Prevent the static content-generation at build time
  • Nuke the cached content from the Docker image after build and have it created lazily on production

As it turns out, I cannot completely prevent the static content from being generated: npm run build calls the aforementioned functions and that's that. And for the second option I'd have to generate some kind of dummy output from them in the first place. So my hacky solution is a mix of the two:

1. Tell the app about build-time

I have the "build app" script line from ".gitlab-ci.yml" modified like this to tell the app not to call any endpoints:

    script:
      - SKIP_GET_STATIC_PROPS=true npm run build

2. Detect build-time and return "empty" json from getStaticProps() and getStaticPaths()

My getStaticProps() looks something like this:

    export const getStaticProps: GetStaticProps<HomePageProps> = async () => {
      if (process.env.SKIP_GET_STATIC_PROPS) {
        return {
          props: {},
          revalidate: 1,
        }
      }

      // ... actual fetching is done around here and returned:
      return {
        props: {
          articles,
          content,
        },
        revalidate: 60,
      }
    }

The getStaticPaths() on those pages that have dynamic routing look something like this:

    export async function getStaticPaths() {
      if (process.env.SKIP_GET_STATIC_PROPS) {
        return {
          paths: [],
          fallback: 'blocking',
        }
      }

      // ... fetch the dynamic paths and return them in the paths -key
      // plus tell the server to check for new content if it's not found pre-rendered
      return {
        paths,
        fallback: 'blocking',
      }
    }
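One thing to watch with a plain `if (process.env.SKIP_GET_STATIC_PROPS)` check: environment variables are always strings, so SKIP_GET_STATIC_PROPS=false (or 0) is still truthy and would still skip the fetching. A minimal sketch of a stricter check could look like this. Note that shouldSkipStaticFetch() is a hypothetical helper name I made up for the example, not something Next.js provides, and it takes the environment as a parameter just so it is easy to try out:

```typescript
// Hypothetical helper: decides whether the build-time fetching should be skipped.
// Only the strings "true" and "1" (case-insensitive) count as "skip".
type EnvLike = Record<string, string | undefined>;

function shouldSkipStaticFetch(env: EnvLike): boolean {
  const raw = (env["SKIP_GET_STATIC_PROPS"] ?? "").trim().toLowerCase();
  return raw === "true" || raw === "1";
}
```

In getStaticProps() and getStaticPaths() the condition would then become something like `if (shouldSkipStaticFetch(process.env))`.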

One more issue remains: Next.js serves the stale content on the first request (the dummy content in this case) and only then tries to regenerate it in the background, so it is the following requests that receive up-to-date content. This means some poor user might end up with the "loading..." screen I made, which is not exactly optimal.
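To make that behaviour concrete, here is a tiny model of it. This is a simplified sketch of the stale-while-revalidate idea, not how Next.js is implemented internally: createIsrPageSketch() is a made-up name, time is passed in by hand, and the "background" regeneration happens synchronously right after the response has been picked.

```typescript
// Simplified model of ISR: serve what is cached, regenerate afterwards if it is stale.
function createIsrPageSketch<T>(
  buildTimeValue: T,
  revalidateSeconds: number,
  generate: () => T,
): (nowMs: number) => T {
  let cached = buildTimeValue;
  let generatedAtMs = 0; // pretend the build finished at t = 0

  return (nowMs: number): T => {
    const served = cached; // the visitor always gets the cached copy
    if (nowMs - generatedAtMs >= revalidateSeconds * 1000) {
      // Stale: regenerate for the NEXT visitor
      cached = generate();
      generatedAtMs = nowMs;
    }
    return served;
  };
}

// The dummy build output with revalidate: 1, as in the getStaticProps() above
const requestHomePage = createIsrPageSketch("loading...", 1, () => "real content");
const firstVisit = requestHomePage(1500);  // stale, but still gets the dummy page
const secondVisit = requestHomePage(1600); // gets the regenerated page
```

In this model the first visitor after the revalidate window is the one who sees the dummy page, which is exactly the visitor I want to avoid being a real user.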

That's when I decided to do something naughty on the docker-entrypoint.sh.

3. Implement a "magic" -endpoint for on-demand revalidation and call it at start of the app

So you can implement an endpoint for doing on-demand revalidation and here's the code I use in pages/api/revalidate.ts:

    import {NextApiRequest, NextApiResponse} from "next";

    export default async function handler(req: NextApiRequest, res: NextApiResponse): Promise<void> {
        // Refuse to work at all if the server has no token configured
        if (process.env.NEXT_REVALIDATE_TOKEN === undefined || process.env.NEXT_REVALIDATE_TOKEN === '') {
            res.status(500).json({ message: 'Missing token'});
            return
        }

        if (req.query.secret !== process.env.NEXT_REVALIDATE_TOKEN) {
            res.status(401).json({ message: 'Invalid token'});
            // Stop here, otherwise an invalid token would still trigger the revalidation below
            return
        }

        // Query parameters can be arrays, use the first value if so
        const path = Array.isArray(req.query.path) ? req.query.path[0] : req.query.path;
        if (!path) {
            res.status(400).json({ message: 'Missing path'});
            return
        }

        try {
            await res.revalidate(path);
            res.json({revalidated: true});
        } catch (err) {
            res.status(500).send('Error revalidating');
        }
    }

And with all the type- and error-checking the beef boils down to this one simple command: res.revalidate(path).
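For reference, this is roughly what calling that endpoint looks like from the caller's side. It is only a sketch: buildRevalidateUrl() is a hypothetical helper, and the base URL and token in the example are made up. The point is that both query parameters should be URL-encoded, since route names can contain characters like square brackets:

```typescript
// Hypothetical helper: builds the URL for the on-demand revalidation endpoint.
function buildRevalidateUrl(baseUrl: string, path: string, secret: string): string {
  const query = `path=${encodeURIComponent(path)}&secret=${encodeURIComponent(secret)}`;
  return `${baseUrl}/api/revalidate?${query}`;
}

// Example values only, the real token is generated at container start
const exampleRevalidateUrl = buildRevalidateUrl("http://localhost:3000", "/posts/[slug]", "s3cret");
```

Next.js decodes the query string before it ends up in req.query, so the handler sees the original path again.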

A point you should be raising now is that you cannot call anything AFTER the Dockerfile CMD is called, i.e. you cannot call any endpoints once the app has started. Except you can, if you abuse it a bit.

You can always fork a process before you exec the CMD and have a sleep statement take care of the delay. So here's a docker-entrypoint.sh I use:

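    #!/usr/bin/env sh
    set -euo pipefail

    # Random token so that only this container knows how to call the revalidate endpoint
    export NEXT_REVALIDATE_TOKEN=$(cat /dev/urandom | tr -dc '[:alpha:]' | head -c 20 | head -n 1)

    # Fork the cache preloader into the background...
    /usr/local/bin/preload-cache.sh &

    # ...and replace this shell with the actual CMD (quoted, so arguments with spaces survive)
    exec "$@"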

And the actual "magic" happens in the preload-cache.sh which looks like this:

    #!/usr/bin/env sh
    set -euo pipefail

    # Give the app a moment to start before calling it
    sleep 3

    # ${NODE_ENV:-} keeps "set -u" from aborting the script when NODE_ENV is not set
    if [ "${NODE_ENV:-}" != "production" ]; then
        # Technically /app/.next/static/chunks/pages has no matching files in non-production environment, but let's be verbose
        echo "Skip cache revalidate on non-production environment..."
        exit 0
    fi

    function revalidate() {
        echo -n "invalidating chunk: ${1}..."
        # URL-encode every square bracket of the dynamic route segments
        CHUNK=$(echo "${1}" | sed -r 's/\[/%5B/g' | sed -r 's/\]/%5D/g')
        curl -s "http://localhost:3000/api/revalidate?path=${CHUNK}&secret=${NEXT_REVALIDATE_TOKEN}" > /dev/null
        echo "done"
    }

    function warmup() {
        echo -n "warming up endpoint: ${1}..."
        curl -s "http://localhost:3000${1}" > /dev/null
        echo "done"
    }

    cd /app/.next/static/chunks/pages

    # Revalidate every route, skipping the _app/_error style chunks
    find . -regex ".*/[^_]*-.*\.js" | sed -rn 's/^\.(.*)-[0-9a-z]+\.js$/\1/p' | sed -r 's/^\/index$/\//' | while read -r line ; do
        revalidate "${line}"
    done

    # Warm up only the static routes (no "[" in the name)
    find . -regex ".*/[^_\[]*-.*\.js" | sed -rn 's/^\.(.*)-[0-9a-z]+\.js$/\1/p' | sed -r 's/^\/index$/\//' | while read -r line ; do
        warmup "${line}"
    done

So what happens here is that I introduce a variable called NEXT_REVALIDATE_TOKEN with a random-enough value, so that not just anyone can ask for the content to be revalidated. Then I fork a process that first calls the revalidate endpoint for every route I have; I gather the list of routes from .next/static/chunks/pages with the help of a little regular-expression find-and-sed combo. Then, since I can, I also call all the static routes to pre-generate them. I cannot do the same for the dynamic ones, as I don't know their actual routes here, but at least the index page and all the other static routes are ready to be pounded immediately. The dynamic routes have no content, and they will be "forcefully" generated on first request thanks to the fallback: "blocking" from getStaticPaths(), and that's good enough for me for now.
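If the find-and-sed part is hard to read, here is the same idea written out in TypeScript. It is a loose port for illustration only, not something I run anywhere: routeFromChunk() is a made-up name, the file names in the usage are invented, and the underscore handling is a little simpler than in the find regex above.

```typescript
// Turns a chunk file name like "./blog/[slug]-9d8e7f.js" into the route "/blog/[slug]".
// Returns null for files that are not page chunks (no "-<hash>.js" suffix)
// and for the special Next.js chunks such as _app and _error.
function routeFromChunk(file: string): string | null {
  const match = /^\.(\/.*)-[0-9a-z]+\.js$/.exec(file);
  if (match === null) {
    return null;
  }
  const route = match[1];
  const lastSegment = route.slice(route.lastIndexOf("/") + 1);
  if (lastSegment.startsWith("_")) {
    return null;
  }
  // The index page lives at the root
  return route === "/index" ? "/" : route;
}

// Dynamic routes are the ones with a [param] segment; only the others can be warmed up blindly
function isStaticRoute(route: string): boolean {
  return !route.includes("[");
}

const sketchRoutes = ["./index-0a1b2c3d.js", "./_app-5f3c.js", "./blog/[slug]-9d8e7f.js", "./about-12ab.js"]
  .map(routeFromChunk)
  .filter((route): route is string => route !== null);
const sketchWarmable = sketchRoutes.filter(isStaticRoute);
```

Here sketchRoutes would be the list to revalidate and sketchWarmable the list to warm up.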

Final thoughts

Is this a good idea? Probably not. Should you (or I) do this? Also probably not. This was more of a can-I-do-this challenge, and I will probably revisit the whole thing and implement something more feasible and sane later, once I have more time on my hands.