Django and MongoDB Connector Design
SQL to MongoDB query mapping.
SQL query | pymongo API |
---|---|
SELECT | find(projection=) |
WHERE | find(filter=) |
AND | $and |
OR | $or |
NOT | $neq |
IN | $in |
INNER JOIN | find(), find(), find() |
LEFT JOIN | aggregate($lookup) |
UPDATE | update_many |
DELETE | delete_many |
INSERT INTO | insert_many |
CREATE DATABASE | implicit |
ALTER DATABASE | implicit |
CREATE TABLE | implicit |
ALTER TABLE | implicit |
DROP TABLE | drop_collection |
CREATE INDEX | create_indexes |
DROP INDEX | drop_index |
Performing relational SQL JOINs in MongoDB
Since MongoDB does not have an intrinsic JOIN command the multi-collection JOIN must be done at the application layer. Due to the intrinsic design of MongoDB, if a parallel thread does an update operation on the same set of documents on which JOIN is taking place, there is a possibility of getting different results than expected. This possibility exists in multi-threaded SQL implementations as well.
Application layer multi-collection JOINS without any interleaved database updates are thread safe and can be done in MongoDB. In case you feel SQL joins confusing there is an excellent explanation on how sql joins work which can be referred to before proceeding further.
Here is an example of a typical SQL query generated by Django admin while modifying a user property.
SELECT "auth_permission"."id", "auth_permission"."name", "auth_permission"."content_type_id", "auth_permission"."codename", "django_content_type"."id", "django_content_type"."app_label", "django_content_type"."model"
FROM "auth_permission"
INNER JOIN "django_content_type"
ON ("auth_permission"."content_type_id" = "django_content_type"."id")
ORDER BY
"django_content_type"."app_label" ASC,
"django_content_type"."model" ASC,
"auth_permission"."codename" ASC
INNER JOIN translates to:
docs = db['auth_permission'].find(
{},
{'content_type_id': True, '_id': False}
).to_list()
joined_docs = db['django_content_type'].find(
{'id': {'$in': [doc['content_type_id'] for doc in docs]}},
{'id': True, '_id': False}
).to_list()
# Finally, project the SELECTed fields with content_type_id in joined_docs
So INNER JOIN can basically be done using three application level find operations out of which 2 are for lookups.
Consider a complex multi-join SQL query
SELECT "auth_permission"."id", "auth_permission"."name", "auth_permission"."content_type_id", "auth_permission"."codename"
FROM "auth_permission"
INNER JOIN "auth_user_user_permissions"
ON ("auth_permission"."id" = "auth_user_user_permissions"."permission_id")
INNER JOIN "django_content_type"
ON ("auth_permission"."content_type_id" = "django_content_type"."id")
ORDER BY
"django_content_type"."app_label" ASC,
"django_content_type"."model" ASC,
"auth_permission"."codename" ASC
This can be performed using 4 lookups inside python.
docs = db['auth_permission'].find(
{},
{'id': True, '_id': False}
).to_list()
joined_docs = db['auth_user_user_permissions'].find(
{'permission_id': {'$in': [doc['id'] for doc in docs]}},
{'permission_id': True, '_id': False}
).to_list()
joined_docs2 = db['auth_permission'].find(
{'id': {'$in': [doc['permission_id'] for doc in joined_docs]}},
{'content_type_id': True, '_id': False}
).to_list()
joined_docs3 = db['django_content_type'].find(
{'id': {'$in': [doc['content_type_id'] for doc in joined_docs2]}},
{'id': True, '_id': False}
).to_list()
# Finally, project the SELECTed fields with id in joined_docs3
LEFT JOIN or LEFT OUTER JOIN can be done using similar lookup operations.