For this pipeline you need two things: a hosted Ghost CMS that Gatsby can pull content from at build time, and a place to host the built static website.
Several options exist for how and where to host your Ghost CMS; you could even run a local install and just push the built static website to your hosting platform. I decided early on that I wanted everything on Amazon Web Services. Ghost runs on an EC2 instance, and my Gatsby project lives on GitHub so that CodeBuild can automatically deploy changes to the S3 bucket where the static site is hosted. For HTTPS/SSL I use CloudFront, which also acts as the CDN (not really important for a niche personal blog).
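For reference, this is roughly how Gatsby pulls the Ghost content at build time. A minimal sketch using the gatsby-source-ghost plugin; the apiUrl and contentApiKey values are placeholders you would get from a custom integration in the Ghost admin:

// gatsby-config.js (sketch; URL and key are placeholders)
module.exports = {
    plugins: [
        {
            resolve: `gatsby-source-ghost`,
            options: {
                // Public URL of the Ghost instance, e.g. the EC2 host
                apiUrl: `https://ghost.example.com`,
                // Content API key from a custom integration in the Ghost admin
                contentApiKey: `YOUR_CONTENT_API_KEY`
            }
        }
    ]
};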
I would recommend sticking to this tutorial for the setup as I have described it above. I just want to add a few points where I had to come up with my own solutions.
When using CloudFront as a CDN, you will most likely want your content to be fresh after you push an update or publish a post. An extra step is required so that you don't serve stale content. I experimented with two solutions, Lambda functions and invalidating the cache after pushing an update, and ultimately decided to go the invalidation route.
When using CloudFront to distribute your content, you can apply Lambda functions that hook into the request/response flow between the user, CloudFront, and your origin. You can read more about this here.
The essential point is that you can add at most one Lambda function at each stage of the request/response pipeline. To prevent a stale cache, we want to alter the headers of the response to tell CloudFront when data needs to be fetched from the origin server again. The following function is taken from the Ximedes tutorial:
'use strict';

exports.handler = (event, context, callback) => {
    const request = event.Records[0].cf.request;
    const response = event.Records[0].cf.response;
    const headers = response.headers;

    if (request.uri.startsWith('/static/')) {
        // Gatsby's /static/ assets are content-hashed, so they can be cached forever
        headers['cache-control'] = [
            {
                key: 'Cache-Control',
                value: 'public, max-age=31536000, immutable'
            }
        ];
    } else {
        // Everything else (HTML, page data) must be revalidated on every request
        headers['cache-control'] = [
            {
                key: 'Cache-Control',
                value: 'public, max-age=0, must-revalidate'
            }
        ];
    }

    headers['vary'] = [
        {
            key: 'Vary',
            value: 'Accept-Encoding'
        }
    ];

    callback(null, response);
};
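To sanity-check the handler without deploying it, you can invoke it locally with a stubbed CloudFront origin-response event. A minimal sketch, assuming the function above is saved as index.js; the file name and the event below are just for illustration:

// test-handler.js (hypothetical file) - quick local check of the handler above
const { handler } = require('./index');

const event = {
    Records: [{
        cf: {
            request: { uri: '/static/js/app.js' },
            response: { headers: {} }
        }
    }]
};

handler(event, null, (err, response) => {
    // For a /static/ URI we expect the long-lived, immutable Cache-Control header
    console.log(JSON.stringify(response.headers, null, 2));
});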
The next step is to add the function to your CloudFront distribution. Keep in mind that Lambda@Edge functions have to be deployed in the us-east-1 region, and that you attach a specific published version of the function (not $LATEST) to the distribution's cache behavior as an origin-response trigger.
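You can do this through the console, or script it. A rough sketch using the AWS SDK for JavaScript; the function name, distribution ID, and Lambda version ARN are placeholders of my own, not part of the original setup:

// attach-edge-function.js (sketch; IDs and ARNs are placeholders)
const AWS = require('aws-sdk');
const cloudfront = new AWS.CloudFront();

async function attachEdgeFunction(distributionId, lambdaVersionArn) {
    // Fetch the current config together with its ETag (required for updates)
    const { DistributionConfig, ETag } = await cloudfront
        .getDistributionConfig({ Id: distributionId })
        .promise();

    // Hook the published Lambda version (ARN must include the version number)
    // into the origin-response stage of the default cache behavior
    DistributionConfig.DefaultCacheBehavior.LambdaFunctionAssociations = {
        Quantity: 1,
        Items: [{
            EventType: 'origin-response',
            LambdaFunctionARN: lambdaVersionArn
        }]
    };

    await cloudfront.updateDistribution({
        Id: distributionId,
        IfMatch: ETag,
        DistributionConfig
    }).promise();
}

attachEdgeFunction('YOURDISTRIBUTIONID', 'arn:aws:lambda:us-east-1:123456789012:function:cache-headers:1')
    .catch(console.error);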
The second variant I am going to introduce is invalidating the CloudFront cache after CodeBuild copies your built website to S3. If you read the documentation on invalidating caches, you will find a section about cost which, summed up, states that every path you invalidate costs money. Although you get 1,000 invalidation paths free per month, I thought it would cost too much to invalidate every file. However, there is also a statement that all files matched by a single wildcard path count as only one invalidation, which means after pushing an update we can simply invalidate everything from the root. My CodeBuild buildspec looks like this:
version: 0.2

phases:
  install:
    runtime-versions:
      nodejs: 10
    commands:
      - npm install --global yarn
      - npm install --global gatsby-cli
  build:
    commands:
      - yarn
      - gatsby build
  post_build:
    commands:
      - aws s3 sync public "s3://YOURS3BUCKET" --acl=public-read --delete
      - aws cloudfront create-invalidation --distribution-id "YOURDISTRIBUTIONID" --paths "/*"

cache:
  paths:
    - node_modules/**/*
    - public/**/*
    - /usr/local/lib/node_modules/**/*
I ultimately decided to use the cache-invalidation method because I tested both, and with the Lambda@Edge functions altering the headers it was still unpredictable when new content would become available. With cache invalidation, I can be sure that I see the fresh content after about 10 to 20 minutes. Additionally, I think the Lambda approach is a little inefficient in my case, because I use so few static resources and my content is only as dynamic as I make it by pushing updates. If you want to read more about how caching in Gatsby works and can be optimized, check out the article in their documentation.