get_or_create is an awesome helper to have at your disposal when you need an object matching some specifications and there should be exactly one match: you want to retrieve it if it already exists, and create it if it doesn’t.
However, there’s a scenario where it doesn’t quite do what we expect in the case of race conditions (exactly the thing we’re trying to prevent). There is a nod to what we’re about to talk about in the docs:
This method is atomic assuming correct usage, correct database configuration, and correct behavior of the underlying database. However, if uniqueness is not enforced at the database level for the kwargs used in a get_or_create call (see unique or unique_together), this method is prone to a race-condition which can result in multiple rows with the same parameters being inserted simultaneously.
Let’s talk about this in more detail. Here’s the relevant bit of Django’s implementation of get_or_create:
lookup, params = self._extract_model_params(defaults, **kwargs)
try:
    return self.get(**lookup), False
except self.model.DoesNotExist:
    return self._create_object_from_params(lookup, params)
This does exactly what the name implies! It attempts a lookup based on the filter args that are passed in (explicitly doing a .get(), which fails with DoesNotExist if there is no match in the database), catches that DoesNotExist exception, and creates the object instead.
However, if you go further into _create_object_from_params, you’ll notice that it does a lot more than just make a call to .create(). Here’s what happens there (still in Django source code):
try:
    obj = self.create(**params)
    return obj, True
except IntegrityError:
    exc_info = sys.exc_info()
    return self.get(**lookup), False
This is cool: it’s explicitly accounting for race conditions! It tries to create the object, but if that operation throws an IntegrityError, it does the lookup again and tries to return what it finds.
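To make that fallback concrete outside of Django, here is a minimal sketch using Python’s built-in sqlite3 module. The tag table, its name column, and the create_or_fetch helper are made up for this illustration and are not Django APIs. The sketch assumes a UNIQUE constraint on the lookup column, which is what makes the second insert fail.

```python
import sqlite3

# Hypothetical table for illustration; the UNIQUE constraint is the key assumption.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")


def create_or_fetch(conn, name):
    """Try to insert; if the database rejects a duplicate, fetch the existing row."""
    try:
        cur = conn.execute("INSERT INTO tag (name) VALUES (?)", (name,))
        return cur.lastrowid, True
    except sqlite3.IntegrityError:
        # Another writer got there first, so return the row it created.
        row = conn.execute("SELECT id FROM tag WHERE name = ?", (name,)).fetchone()
        return row[0], False


# Simulate the race: the "other thread" inserts after our lookup came back empty.
other_id = conn.execute("INSERT INTO tag (name) VALUES (?)", ("django",)).lastrowid

tag_id, created = create_or_fetch(conn, "django")
row_count = conn.execute("SELECT COUNT(*) FROM tag").fetchone()[0]
print(tag_id == other_id, created, row_count)
```

The caller gets back the row the other writer created, and the table still holds a single row. The database constraint does the real work here; the Python code only reacts to it.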
The problem is this: suppose the model has no uniqueness constraint on the attributes you’re using for the lookup. If one thread reaches this part of the code (meaning its lookup has already run and returned nothing) and another thread creates a matching object in the meantime, the creation in the first thread will not throw an IntegrityError, and you’ll end up with two! This may be fine, for now. After all, your call to get_or_create returned an instance matching your parameters, and so did the call in the other thread, and both will carry on their merry way.
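The interleaving is easier to see when replayed step by step. The sketch below uses sqlite3 rather than Django, with a hypothetical tag table that has no uniqueness constraint. The lookup and create helpers are invented for this example, and the two “threads” are simulated by ordering the calls by hand instead of using real concurrency.

```python
import sqlite3

# Hypothetical table with NO uniqueness constraint on the lookup column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT)")


def lookup(conn, name):
    """Return the ids of all rows matching the lookup."""
    rows = conn.execute("SELECT id FROM tag WHERE name = ?", (name,)).fetchall()
    return [r[0] for r in rows]


def create(conn, name):
    """Insert a row; nothing in the schema stops a duplicate."""
    return conn.execute("INSERT INTO tag (name) VALUES (?)", (name,)).lastrowid


# Step 1: both "threads" run their lookup before either one has inserted.
found_by_a = lookup(conn, "django")
found_by_b = lookup(conn, "django")

# Step 2: each saw nothing, so each goes on to create the row.
id_a = create(conn, "django") if not found_by_a else found_by_a[0]
id_b = create(conn, "django") if not found_by_b else found_by_b[0]

duplicates = lookup(conn, "django")
print(id_a, id_b, len(duplicates))
```

Neither insert raises, each caller walks away with a different id, and the table now contains two rows for the same lookup.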
The problem arises the next time you try to retrieve the object using the same lookup params with get_or_create. Because you now have two objects in your database, when get_or_create tries its .get() (note, not .filter()), you’ll get a MultipleObjectsReturned exception, and get_or_create only catches a DoesNotExist exception. This means that unless we do additional exception handling on our own (which we shouldn’t have to, in this case!), the user will see an error.
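Here is a small pure-Python sketch of why that exception escapes. The exception classes and the get and get_or_create functions below are simplified stand-ins written for this example, working on a plain list of dicts. They mimic the shape of the Django behavior described above but are not Django’s code.

```python
class DoesNotExist(Exception):
    """Stand-in for Django's Model.DoesNotExist."""


class MultipleObjectsReturned(Exception):
    """Stand-in for Django's Model.MultipleObjectsReturned."""


def get(rows, **lookup):
    """Mimic .get(): exactly one match is required."""
    matches = [r for r in rows if all(r.get(k) == v for k, v in lookup.items())]
    if not matches:
        raise DoesNotExist(lookup)
    if len(matches) > 1:
        raise MultipleObjectsReturned(lookup)
    return matches[0]


def get_or_create(rows, **lookup):
    """Mimic the shape of get_or_create: only DoesNotExist is handled."""
    try:
        return get(rows, **lookup), False
    except DoesNotExist:
        obj = dict(lookup)
        rows.append(obj)
        return obj, True


# Two duplicate rows, as left behind by the race described above.
rows = [{"name": "django"}, {"name": "django"}]

try:
    get_or_create(rows, name="django")
    outcome = "returned normally"
except MultipleObjectsReturned:
    outcome = "MultipleObjectsReturned"

print(outcome)
```

Because the except clause names only DoesNotExist, the second exception type passes straight through to the caller.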
The moral of the story? Don’t use get_or_create on objects that don’t have uniqueness constraints, at the database level, on the attributes you’re doing the lookup based on.
The inverse is also true: if you have a model with a uniqueness constraint, using the built-in get_or_create method is preferable to trying to build your own, since a hand-rolled version likely won’t handle the race condition caused by multiple threads attempting this at the same time.