Combining Disparate QuerySets in Django
While refactoring strongarm.io we ran into a problem: we had different database tables that we needed to query as if they were a single table. We use Django's ORM and needed to stay within it in order to leverage third-party libraries.
Our two models have overlapping fields: in this instance both had a name field to search over and a created field to order on. A naive implementation might serialize all the data to memory and come up with something like:
    # Consider ModelA and ModelB to exist somewhere, both have a name (CharField)
    # and created (DateTimeField).
    from itertools import chain

    # Sorted data. Both querysets are fully loaded into memory and re-sorted.
    sorted_data = sorted(
        chain(
            ModelA.objects.order_by('-created'),
            ModelB.objects.order_by('-created'),
        ),
        key=lambda instance: instance.created,
        reverse=True)

    # Search for a particular result.
    result = None
    try:
        result = ModelA.objects.get(name='foo')
    except ModelA.DoesNotExist:
        try:
            result = ModelB.objects.get(name='foo')
        except ModelB.DoesNotExist:
            pass

    # Look at a subset.
    subset_data = (list(ModelA.objects.filter(name__contains='foo')) +
                   list(ModelB.objects.filter(name__contains='foo')))
This has two issues we had to overcome:
- We had too much data to serialize to memory.
- We needed to pass the result into Django APIs that expected a QuerySet.
Enter Django QuerySetSequence! We built Django QuerySetSequence (based on some previously available code) to provide the following:
- Provide a QuerySet API that operates over multiple QuerySets generated from different models.
- Evaluate each QuerySet lazily (i.e. as late as possible).
- High quality code with tests.
- Guard against calling untested methods.
This allows much more Django-esque code:
    # Consider ModelA and ModelB to exist somewhere, both have a name (CharField)
    # and created (DateTimeField).
    from django.core.exceptions import ObjectDoesNotExist

    from queryset_sequence import QuerySetSequence

    queryset = QuerySetSequence(ModelA.objects.all(), ModelB.objects.all())

    # Sorted data.
    sorted_data = queryset.order_by('-created')

    # Search for a particular result:
    try:
        result = queryset.get(name='foo')
    except ObjectDoesNotExist:
        result = None

    # Look at a subset.
    subset_data = queryset.filter(name__contains='foo')
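For some intuition about how ordering across several sources can stay lazy, here is a minimal sketch in plain Python. It is not taken from the library's source and may not match how Django QuerySetSequence works internally. It assumes each source is already sorted newest-first (as a database would return it for order_by('-created')) and merges them with heapq.merge, which needs Python 3.5 or newer for the key and reverse arguments. The names Row, newest_first, source_a and source_b are hypothetical stand-ins, not Django models or library APIs.

```python
import heapq
from collections import namedtuple
from datetime import datetime
from itertools import islice

# Hypothetical stand-in for a model instance with name and created fields.
Row = namedtuple("Row", ["name", "created"])


def newest_first(*sources):
    """Lazily merge iterables that are each already sorted newest-first.

    heapq.merge holds one pending row per source instead of loading
    every row into a list and sorting it.
    """
    return heapq.merge(*sources, key=lambda row: row.created, reverse=True)


def source_a():
    # Stand-in for iterating ModelA.objects.order_by('-created').
    yield Row("a2", datetime(2016, 3, 1))
    yield Row("a1", datetime(2016, 1, 1))


def source_b():
    # Stand-in for iterating ModelB.objects.order_by('-created').
    yield Row("b2", datetime(2016, 4, 1))
    yield Row("b1", datetime(2016, 2, 1))


# Take only the three newest rows; the merge stops once they are produced.
latest_three = [row.name for row in islice(newest_first(source_a(), source_b()), 3)]
```

Compared with sorted(chain(...)), this keeps memory use proportional to the number of sources rather than the number of rows, which is the property we were after when the data no longer fit in memory.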
Currently Django QuerySetSequence supports both Django 1.8 and Django 1.9 on Python 2.7, 3.4 and 3.5. Check out the full set of features or contribute to the source repository. You can install the django-querysetsequence package from PyPI using pip:
pip install django-querysetsequence
If you’re using Django QuerySetSequence we’d love to hear about it!