Understanding Temp Files in Postgres
Databases can feel like black boxes that hum along perfectly until suddenly they don’t. And when bad things happen with databases…they’re usually bad! One example of an error you might run into unexpectedly with your database is something about running out of temporary file space.
This is not intended to be a comprehensive look at temp files and the internals of how they work, but just to be a primer to get you started with a solid understanding of what they are and what kind of statistics you might run into as you are figuring out how to optimize your database to handle complex queries and temp files — I link to several other resources throughout that might help you make a deeper dive into this. Let’s get to it!
What are temp files?
At their core, databases are file systems. Data is persisted to the disk for long term storage, but data is also cached in memory if it’s frequently accessed, or currently being accessed, for quicker access.
When you write complex queries against a Postgres database that includes joins, sorting, DISTINCT
queries, etc., those can require more memory than is available. When that happens, Postgres falls back to what are called temp files - these are actually stored on disk but only live for the length of the query. Once the query returns, the temp files are erased…