Wiktor Janiszewski

MongoDB - a rapid and powerful tool





What is MongoDB?


MongoDB is a document-oriented NoSQL database engine used for high-volume data storage. Instead of the tables and rows of traditional relational databases, MongoDB uses collections and documents. Documents consist of key-value pairs (much like JSON files) and are the basic unit of data in MongoDB. Collections contain sets of documents and are the equivalent of relational database tables. MongoDB emerged in the late 2000s; its first public release was in 2009.


What makes MongoDB stand out?


Full cloud-based developer data platform



Cloud Manager Overview

MongoDB is not just a database, it’s a complete developer data platform. MongoDB Atlas, its managed cloud service, offers a range of cloud-oriented features that integrate directly with your database. You will have access to:

  • The Performance Advisor - monitors queries that MongoDB considers slow and suggests new indexes to improve query performance. The slow-query threshold varies with the average operation time on your cluster, so the recommendations stay pertinent to your workload,

  • Atlas Search - integrates the database, search engine, and sync mechanism into a single, fully managed platform, which makes it a fast and easy way to build relevance-based search directly into applications,

  • MongoDB Charts - a native data visualization tool built for MongoDB Atlas, giving you a quick, simple, and powerful way to visualize your data. It covers a breadth of use cases, whether you're running a dedicated cluster or a serverless instance, using Atlas Data Federation to blend Atlas and S3 data, or keeping archived data in Online Archive,

  • Multi-cloud deployment - MongoDB Atlas is a globally distributed, multi-cloud database. You can deploy your data across 95+ regions or create a multi-cloud cluster for applications that use two or more clouds at the same time.


Flexible Document Schemas

Flexible document schema

MongoDB's developers created a very interesting way to store data. Documents are stored as BSON, a specially designed binary format originally based on JSON, and documents in the same collection can have completely different sets of fields. This structure makes data easy to model and manipulate. Moreover, MongoDB doesn't require you to enter your data in one particular shape: you can store different kinds of documents in one collection, so it can be described as a schemaless database. It's worth noting, though, that you can define your own validation schema, which is applied whenever a document is inserted or updated.

In general, the schemaless approach gives you a great deal of flexibility, which is a real advantage when handling real-world data that is constantly changing.
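To make this concrete, here is a minimal sketch of two differently shaped documents and an optional validation schema. The field names, the "leads" collection name and the helper function are made up for illustration; the helper assumes it is given a PyMongo database object.

```python
# Two documents that could live in the same collection even though
# their fields differ (all field names here are illustrative).
lead_a = {"name": "Anna", "phone": "+48 600 000 000"}
lead_b = {"name": "Jan", "email": "jan@example.com", "tags": ["vip"]}

# An optional $jsonSchema validator: every document must have a string
# "name"; any other fields remain free-form.
lead_validator = {
    "$jsonSchema": {
        "bsonType": "object",
        "required": ["name"],
        "properties": {
            "name": {"bsonType": "string", "description": "required string"},
        },
    }
}


def create_validated_collection(db, name="leads"):
    """Create a collection that enforces lead_validator.

    `db` is assumed to be a pymongo Database; create_collection accepts
    the validator as a keyword option.
    """
    return db.create_collection(name, validator=lead_validator)
```

Both sample documents would pass this validator, because only "name" is constrained.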


Widely supported and code native access


Many applications built on relational databases rely on a heavy, time-consuming, and relatively hard-to-learn ORM (like SQLAlchemy in Python) to turn data into objects. MongoDB stores data in a document format, which means you can access and manipulate it in a simple way from any language you can imagine, using that language's native structures (for instance: dictionaries in Python, objects in JavaScript, Maps in Java, etc.).
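As a small illustration in Python, a document is just a dictionary, so nested fields and embedded arrays can be handled with ordinary language features. The document below is invented for this example.

```python
import json

# A document as it might come back from a driver: in Python it is
# simply a dict (field names are made up for this example).
user = {
    "name": "Anna",
    "address": {"city": "Warsaw", "zip": "00-001"},
    "devices": [{"type": "phone"}, {"type": "tablet"}],
}

city = user["address"]["city"]                       # nested field access
device_types = [d["type"] for d in user["devices"]]  # iterate an embedded array
user["address"]["country"] = "PL"                    # add a field in place

as_json = json.dumps(user)                           # serialise without a mapping layer
```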


Powerful querying and analytics


MongoDB is designed to make data in collections easy to access and manipulate. It rarely requires joins or transactions, but when you do need to perform complex queries, MongoDB is there to help.

The MongoDB Query API allows you to query deep into documents, and even build complex analytics pipelines, with just a few lines of declarative code. With an ORM, by contrast, demanding queries and more elaborate tasks often take considerably more code.
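A query that reaches into nested documents could be sketched like this. The field names and both helper functions are hypothetical; `collection` is assumed to be a PyMongo collection, whose `find` method takes a filter and a projection.

```python
def adults_in_city(city, min_age=18):
    """Build a filter that reaches into an embedded "address" document."""
    return {
        "address.city": city,      # dot notation queries a nested field
        "age": {"$gte": min_age},  # comparison operator
        "devices.type": "phone",   # true if any element of "devices" has type "phone"
    }


def find_names(collection, city):
    """Run the filter and return only the "name" field of each match."""
    cursor = collection.find(adults_in_city(city), {"name": 1, "_id": 0})
    return [doc["name"] for doc in cursor]
```

The filter is plain declarative data, so it can be built, logged and tested without touching the database.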


High performance

Performance comparison of MongoDB and its SQL rival, PostgreSQL (note that the linked press release reports PostgreSQL outperforming MongoDB in the benchmarked workloads), source: https://www.prnewswire.com/news-releases/new-benchmarks-show-postgres-dominating-mongodb-in-varied-workloads-300875314.html

Thanks to MongoDB's document-oriented model, all the relevant information can be embedded inside a single document rather than relying on expensive join operations, as in traditional relational databases. This approach can make queries much faster, because all the necessary information is returned in a single call to the database.

When it comes to write performance, MongoDB lets you insert or update many documents in a single operation. This can be a significant write-performance boost compared with issuing each write as a separate database operation.
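Writing many documents in one operation might look like the sketch below. The function names and the "country" and "active" fields are assumptions; `collection` is expected to be a PyMongo collection, which provides `insert_many` and `update_many`.

```python
def insert_leads(collection, leads):
    """Insert all leads with one insert_many call and return how many were stored."""
    if not leads:
        return 0  # insert_many rejects an empty list, so skip the call
    # ordered=False lets the server continue past individual failures
    result = collection.insert_many(leads, ordered=False)
    return len(result.inserted_ids)


def deactivate_by_country(collection, country):
    """Update every matching document with one update_many call."""
    result = collection.update_many(
        {"country": country},
        {"$set": {"active": False}},
    )
    return result.modified_count
```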


Native aggregation

An aggregation pipeline whose output is the count of matched documents.

MongoDB offers the so-called Aggregation Pipeline, which lets you express complex logic as a series of stages that operate on documents in various ways. For instance, you can group documents by some value or reshape the output documents in a single stage.
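In PyMongo a pipeline is simply a list of stage dictionaries. The two sketches below use invented field names ("active", "age", "country"); the stage operators themselves ($match, $group, $sort, $count) are standard.

```python
def count_matches(match_filter):
    """Pipeline that outputs one document holding the number of matches."""
    return [
        {"$match": match_filter},
        {"$count": "matched"},
    ]


def users_per_country(min_age=18):
    """Pipeline that groups active users by country and counts them."""
    return [
        {"$match": {"active": True, "age": {"$gte": min_age}}},
        {"$group": {"_id": "$country", "users": {"$sum": 1}}},
        {"$sort": {"users": -1}},  # largest groups first
    ]
```

Either list could then be passed to a collection's `aggregate` method.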



When should you use MongoDB?


MongoDB is a strong candidate for working with unstructured data, which makes it a good fit for Big Data systems or MapReduce applications. You should consider MongoDB especially when:

  • you’re using cloud computing - cloud-based storage needs to distribute data easily across multiple servers, which suits MongoDB’s nature perfectly. Moreover, MongoDB offers built-in services which can, for instance, help you organise and improve query performance,

  • you need your data fast and easily accessible - MongoDB offers high data availability, with fast, automatic data recovery. If you want a better understanding of your data, you can use the MongoDB Compass GUI, which makes the data more accessible and lets you run more advanced operations and queries with a few clicks,

  • you have lots of unstructured data - MongoDB (like other NoSQL databases) places few restrictions on the kinds of data you can store.

  • you’re using Agile methodologies for development - relational databases are anything but agile, and they can slow you down. A database like MongoDB, on the other hand, doesn’t require the level of up-front preparation that its relational counterparts do.

  • you have an unstable or undefined schema for the documents in your database - this is one of MongoDB's biggest advantages when handling real-world data that is constantly changing.


MongoDB Compass


MongoDB Compass is a powerful GUI for MongoDB. With it, you can easily query, aggregate, and analyze your MongoDB data in a visual environment.

In short, you can perform all kinds of database operations using MongoDB Compass. Thanks to this dedicated application, using MongoDB is much easier and more straightforward.


MongoDB projects in TECHS


Automation of user retrieval in a telecommunications company


The main purpose of the project was to implement queries and aggregation pipelines for exporting and importing tons of leads (users in the telecommunications domain) based on filtering criteria. The two main reasons for using MongoDB in this project were:

  • Undefined schema of documents,

  • Tons of lead documents that were added, updated, and deleted on a daily basis.

The first part of the project was done in MongoDB Compass, because the data and all the other functions (like aggregations) are much more accessible there than in the standard MongoDB Shell.

However, the export part needed MongoDB Shell syntax in one place, and working in MongoDB Compass was demanding for non-IT users, so we decided to integrate FastAPI with MongoDB using PyMongo. This was very easy to do, and the new API was created very quickly.
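The export logic behind such an API could be sketched roughly as follows. This is not the project's actual code: the function, its parameters and the "country" and "updated_at" fields are hypothetical, and the FastAPI route that would call it is left out. `collection` is assumed to be a PyMongo collection.

```python
def export_leads(collection, country=None, updated_since=None, limit=1000):
    """Return leads matching the optional filtering criteria as plain dicts."""
    criteria = {}
    if country is not None:
        criteria["country"] = country
    if updated_since is not None:
        criteria["updated_at"] = {"$gte": updated_since}

    # Exclude "_id" so the result contains only the lead's own fields.
    cursor = collection.find(criteria, {"_id": 0}).limit(limit)
    return list(cursor)
```

Keeping the query in a plain function like this means a web route only has to parse the request parameters and return the list.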



Automation of user statistics retrieval in a company that produces games for children


The company asked us to create queries and logic to retrieve some crucial statistics that would later be used in analytics. Two things had to be retrieved:

  • Basic user statistics - for instance: the number of installations of a particular game, broken down by country on a daily basis within a provided date range,

  • Retention of users - how many users came back to play the game within a provided date range, broken down by day of retention.

The first part was easily done using powerful Aggregation Pipelines, which made it straightforward to manipulate the documents in the database.

User retention, however, was a more complicated part, so we decided to combine the MongoDB Shell with aggregation - mainly because loops can be used in the shell, so the whole process can be automated in a simple way.
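A pipeline for the install statistics might look something like the sketch below. The field names ("game_id", "installed_at", "country"), the sample game id and the dates are assumptions, not the project's real schema.

```python
from datetime import datetime


def installs_by_country_and_day(game_id, start, end):
    """Count installs of one game per country and per day in [start, end)."""
    return [
        {"$match": {
            "game_id": game_id,
            "installed_at": {"$gte": start, "$lt": end},
        }},
        {"$group": {
            "_id": {
                "country": "$country",
                # Truncate the timestamp to a calendar day
                "day": {"$dateToString": {"format": "%Y-%m-%d",
                                          "date": "$installed_at"}},
            },
            "installs": {"$sum": 1},
        }},
        {"$sort": {"_id.day": 1, "_id.country": 1}},
    ]


# Example: installs during January 2023 for a hypothetical game id.
january = installs_by_country_and_day(
    "puzzle-kids", datetime(2023, 1, 1), datetime(2023, 2, 1)
)
```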
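The same loop-plus-aggregation idea can be expressed in Python instead of the shell. The sketch below is only an approximation of the approach: it assumes event documents with "user_id", "install_date" and "event_date" fields, and it relies on the $dateDiff operator, which requires MongoDB 5.0 or later. `collection` is assumed to be a PyMongo collection.

```python
def retention_pipeline(start, end, day):
    """Count distinct users active exactly `day` days after installing."""
    return [
        {"$match": {
            "install_date": {"$gte": start, "$lt": end},
            "$expr": {"$eq": [
                {"$dateDiff": {"startDate": "$install_date",
                               "endDate": "$event_date",
                               "unit": "day"}},
                day,
            ]},
        }},
        {"$group": {"_id": "$user_id"}},   # one document per returning user
        {"$count": "returning_users"},
    ]


def retention_by_day(collection, start, end, max_day=7):
    """Run one pipeline per retention day and collect the counts."""
    results = {}
    for day in range(1, max_day + 1):
        rows = list(collection.aggregate(retention_pipeline(start, end, day)))
        # $count emits no document at all when nothing matched
        results[day] = rows[0]["returning_users"] if rows else 0
    return results
```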

