Django and MongoDB Connector Design

SQL to MongoDB query mapping.

SQL query pymongo API
SELECT find(projection=)
WHERE find(filter=)
AND $and
OR $or
NOT $neq
IN $in
INNER JOIN find(), find(), find()
LEFT JOIN aggregate($lookup)
UPDATE update_many
DELETE delete_many
INSERT INTO insert_many
CREATE DATABASE implicit
ALTER DATABASE implicit
CREATE TABLE implicit
ALTER TABLE implicit
DROP TABLE drop_collection
CREATE INDEX create_indexes
DROP INDEX drop_index

Performing relational SQL JOINs in MongoDB

Since MongoDB does not have an intrinsic JOIN command the multi-collection JOIN must be done at the application layer. Due to the intrinsic design of MongoDB, if a parallel thread does an update operation on the same set of documents on which JOIN is taking place, there is a possibility of getting different results than expected. This possibility exists in multi-threaded SQL implementations as well.

Application layer multi-collection JOINS without any interleaved database updates are thread safe and can be done in MongoDB. In case you feel SQL joins confusing there is an excellent explanation on how sql joins work which can be referred to before proceeding further.

Here is an example of a typical SQL query generated by Django admin while modifying a user property.

SELECT "auth_permission"."id", "auth_permission"."name", "auth_permission"."content_type_id", "auth_permission"."codename", "django_content_type"."id", "django_content_type"."app_label", "django_content_type"."model" 
FROM "auth_permission" 
INNER JOIN "django_content_type" 
    ON ("auth_permission"."content_type_id" = "django_content_type"."id") 
ORDER BY 
    "django_content_type"."app_label" ASC, 
    "django_content_type"."model" ASC, 
    "auth_permission"."codename" ASC

INNER JOIN translates to:

docs = db['auth_permission'].find(
    {},
    {'content_type_id': True, '_id': False}
).to_list()

joined_docs = db['django_content_type'].find(
    {'id': {'$in': [doc['content_type_id'] for doc in docs]}},
    {'id': True, '_id': False}
).to_list()

# Finally, project the SELECTed fields with content_type_id in joined_docs

So INNER JOIN can basically be done using three application level find operations out of which 2 are for lookups.

Consider a complex multi-join SQL query

SELECT "auth_permission"."id", "auth_permission"."name", "auth_permission"."content_type_id", "auth_permission"."codename" 
FROM "auth_permission" 
INNER JOIN "auth_user_user_permissions" 
    ON ("auth_permission"."id" = "auth_user_user_permissions"."permission_id") 
INNER JOIN "django_content_type" 
    ON ("auth_permission"."content_type_id" = "django_content_type"."id") 
ORDER BY 
    "django_content_type"."app_label" ASC, 
    "django_content_type"."model" ASC, 
    "auth_permission"."codename" ASC

This can be performed using 4 lookups inside python.

docs = db['auth_permission'].find(
    {},
    {'id': True, '_id': False}
).to_list()

joined_docs = db['auth_user_user_permissions'].find(
    {'permission_id': {'$in': [doc['id'] for doc in docs]}},
    {'permission_id': True, '_id': False}
).to_list()

joined_docs2 = db['auth_permission'].find(
    {'id': {'$in': [doc['permission_id'] for doc in joined_docs]}},
    {'content_type_id': True, '_id': False}
).to_list()

joined_docs3 = db['django_content_type'].find(
    {'id': {'$in': [doc['content_type_id'] for doc in joined_docs2]}},
    {'id': True, '_id': False}
).to_list()

# Finally, project the SELECTed fields with id in joined_docs3

LEFT JOIN or LEFT OUTER JOIN can be done using similar lookup operations.