AWS, RDS, TIL

RDS snapshots are lazy loaded

For a while we have known that after creating an RDS database from an existing snapshot, the database is cold: accessing data is slower than we expect. Internally we have been referring to this as a cold start.

It turns out that this is because the snapshot is lazy loaded into the RDS instance's EBS volume. RDS snapshots are stored in S3. When a new RDS instance is started from an existing snapshot, the instance becomes available before the whole snapshot has been copied from S3 to the EBS volume. If a query requests data that is not yet on the EBS volume, it has to wait until that data is read from S3. This results in higher I/O latency.

To warm up a PostgreSQL database, AWS recommends using the pg_prewarm extension to read all data.
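
As a rough sketch of what that can look like, the Python helper below builds one `pg_prewarm` call per relation, which you could then run with psql or any client. The relation names are hypothetical, and the `'read'` mode is an assumption on my part: pg_prewarm also accepts `'prefetch'` and `'buffer'`, and which one suits you depends on whether you also want the data in shared buffers.

```python
def quote_ident(name: str) -> str:
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def prewarm_statements(relations, mode="read"):
    """Build SQL that enables pg_prewarm and reads each relation once.

    relations: iterable of (schema, name) pairs (tables or indexes).
    mode: one of pg_prewarm's modes: 'prefetch', 'read', or 'buffer'.
    """
    if mode not in ("prefetch", "read", "buffer"):
        raise ValueError(f"unknown pg_prewarm mode: {mode!r}")
    statements = ["CREATE EXTENSION IF NOT EXISTS pg_prewarm;"]
    for schema, name in relations:
        qualified = f"{quote_ident(schema)}.{quote_ident(name)}"
        # The relation is passed as a string literal cast to regclass,
        # so single quotes inside it must be doubled.
        literal = qualified.replace("'", "''")
        statements.append(f"SELECT pg_prewarm('{literal}'::regclass, '{mode}');")
    return statements


if __name__ == "__main__":
    # Hypothetical relation names, for illustration only.
    for stmt in prewarm_statements([("public", "orders"), ("public", "orders_pkey")]):
        print(stmt)
```

Indexes are relations too, so they need their own entries in the list if you want them warmed as well.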

Another way to make PostgreSQL read all data is to run the VACUUM ANALYZE command.
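
If you prefer to go table by table rather than running a bare `VACUUM ANALYZE;` against the whole database, something like the following sketch generates the statements. The table names are made up for illustration.

```python
def vacuum_analyze_statements(tables):
    """Build one VACUUM ANALYZE statement per (schema, table) pair."""
    statements = []
    for schema, table in tables:
        # Double any embedded double quotes so the identifiers stay valid.
        quoted_schema = '"' + schema.replace('"', '""') + '"'
        quoted_table = '"' + table.replace('"', '""') + '"'
        statements.append(f"VACUUM ANALYZE {quoted_schema}.{quoted_table};")
    return statements


if __name__ == "__main__":
    # Hypothetical table names, for illustration only.
    for stmt in vacuum_analyze_statements([("public", "orders"), ("public", "customers")]):
        print(stmt)
```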

For more details check out this AWS post.