Django and MongoDB Connector Design

SQL to MongoDB query mapping.

SQL querypymongo API
SELECTfind(projection=)
WHEREfind(filter=)
AND$and
OR$or
NOT$neq
IN$in
INNER JOINfind(), find(), find()
LEFT JOINaggregate($lookup)
UPDATEupdate_many
DELETEdelete_many
INSERT INTOinsert_many
CREATE DATABASEimplicit
ALTER DATABASEimplicit
CREATE TABLEimplicit
ALTER TABLEimplicit
DROP TABLEdrop_collection
CREATE INDEXcreate_indexes
DROP INDEXdrop_index

Performing relational SQL JOINs in MongoDB

Since MongoDB does not have an intrinsic JOIN command the multi-collection JOIN must be done at the application layer. Due to the intrinsic design of MongoDB, if a parallel thread does an update operation on the same set of documents on which JOIN is taking place, there is a possibility of getting different results than expected. This possibility exists in multi-threaded SQL implementations as well.

Application layer multi-collection JOINS without any interleaved database updates are thread safe and can be done in MongoDB. In case you feel SQL joins confusing there is an excellent explanation on how sql joins work which can be referred to before proceeding further.

Here is an example of a typical SQL query generated by Django admin while modifying a user property.

SELECT "auth_permission"."id", "auth_permission"."name", "auth_permission"."content_type_id", "auth_permission"."codename", "django_content_type"."id", "django_content_type"."app_label", "django_content_type"."model" 
FROM "auth_permission" 
INNER JOIN "django_content_type" 
    ON ("auth_permission"."content_type_id" = "django_content_type"."id") 
ORDER BY 
    "django_content_type"."app_label" ASC, 
    "django_content_type"."model" ASC, 
    "auth_permission"."codename" ASC
copy code

INNER JOIN translates to:

docs = db['auth_permission'].find(
    {},
    {'content_type_id': True, '_id': False}
).to_list()

joined_docs = db['django_content_type'].find(
    {'id': {'$in': [doc['content_type_id'] for doc in docs]}},
    {'id': True, '_id': False}
).to_list()

# Finally, project the SELECTed fields with content_type_id in joined_docs
copy code

So INNER JOIN can basically be done using three application level find operations out of which 2 are for lookups.

Consider a complex multi-join SQL query

SELECT "auth_permission"."id", "auth_permission"."name", "auth_permission"."content_type_id", "auth_permission"."codename" 
FROM "auth_permission" 
INNER JOIN "auth_user_user_permissions" 
    ON ("auth_permission"."id" = "auth_user_user_permissions"."permission_id") 
INNER JOIN "django_content_type" 
    ON ("auth_permission"."content_type_id" = "django_content_type"."id") 
ORDER BY 
    "django_content_type"."app_label" ASC, 
    "django_content_type"."model" ASC, 
    "auth_permission"."codename" ASC
copy code

This can be performed using 4 lookups inside python.

docs = db['auth_permission'].find(
    {},
    {'id': True, '_id': False}
).to_list()

joined_docs = db['auth_user_user_permissions'].find(
    {'permission_id': {'$in': [doc['id'] for doc in docs]}},
    {'permission_id': True, '_id': False}
).to_list()

joined_docs2 = db['auth_permission'].find(
    {'id': {'$in': [doc['permission_id'] for doc in joined_docs]}},
    {'content_type_id': True, '_id': False}
).to_list()

joined_docs3 = db['django_content_type'].find(
    {'id': {'$in': [doc['content_type_id'] for doc in joined_docs2]}},
    {'id': True, '_id': False}
).to_list()

# Finally, project the SELECTed fields with id in joined_docs3
copy code

LEFT JOIN or LEFT OUTER JOIN can be done using similar lookup operations.