Basic Q-and-A around MongoDB


Q1: What is GPG?

A: GNU Privacy Guard (GnuPG or GPG) is a free-software replacement for Symantec's PGP cryptographic software suite. It complies with RFC 4880, the IETF standards-track specification of OpenPGP. Modern versions of PGP are interoperable with GnuPG and other OpenPGP-compliant systems.

GnuPG is a hybrid-encryption software program because it uses a combination of conventional symmetric-key cryptography for speed, and public-key cryptography for ease of secure key exchange, typically by using the recipient's public key to encrypt a session key which is used only once. This mode of operation is part of the OpenPGP standard and has been part of PGP from its first version.
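
The hybrid scheme described above can be sketched in a few lines of Python. This is a toy illustration only, not what GnuPG does internally: the RSA key is tiny and unpadded, the symmetric cipher is a home-made SHA-256 keystream, and all names (hybrid_encrypt, hybrid_decrypt, keystream_xor) are made up for this example.

```python
import hashlib
import secrets
from typing import Tuple

# Toy RSA key pair built from two Mersenne primes. Far too small and
# unpadded for real use; it only makes the public-key step visible.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q                              # public modulus
E = 65537                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (needs Python 3.8+)


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR data with a SHA-256 counter keystream."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


def hybrid_encrypt(message: bytes) -> Tuple[int, bytes]:
    """Encrypt the message symmetrically and wrap the one-time session key."""
    session_key = secrets.token_bytes(16)                  # used only once
    wrapped_key = pow(int.from_bytes(session_key, "big"), E, N)
    return wrapped_key, keystream_xor(session_key, message)


def hybrid_decrypt(wrapped_key: int, ciphertext: bytes) -> bytes:
    """Unwrap the session key with the private exponent, then decrypt."""
    session_key = pow(wrapped_key, D, N).to_bytes(16, "big")
    return keystream_xor(session_key, ciphertext)
```

The bulk of the data goes through the fast symmetric step, while the slow public-key step only touches the 16-byte session key.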

The GnuPG 1.x series uses an integrated cryptographic library, while the GnuPG 2.x series replaces this with Libgcrypt.

GnuPG encrypts messages using asymmetric key pairs individually generated by GnuPG users. The resulting public keys may be exchanged with other users in a variety of ways, such as Internet key servers. They must always be exchanged carefully to prevent identity spoofing by corrupting public key ↔ "owner" identity correspondences. It is also possible to add a cryptographic digital signature to a message, so the message integrity and sender can be verified, if a particular correspondence relied upon has not been corrupted. 

Ref: https://en.wikipedia.org/wiki/GNU_Privacy_Guard

Q2: What is PGP?

A: Pretty Good Privacy (PGP) is an encryption program that provides cryptographic privacy and authentication for data communication. PGP is used for signing, encrypting, and decrypting texts, e-mails, files, directories, and whole disk partitions and to increase the security of e-mail communications.

PGP and similar software follow OpenPGP, the open standard for PGP encryption software (RFC 4880), for encrypting and decrypting data.

Q3: Give an example usage of GPG.

A: One GPG use case comes up while installing MongoDB:

Import the public key used by the package management system.

From a terminal, issue the following command to import the MongoDB public GPG Key from https://www.mongodb.org/static/pgp/server-4.2.asc:

wget -qO - https://www.mongodb.org/static/pgp/server-4.2.asc | sudo apt-key add -

The operation should respond with an OK.

However, if you receive an error indicating that gnupg is not installed, you can:

    Install gnupg and its required libraries using the following command:

sudo apt-get install gnupg

Once installed, retry importing the key:

wget -qO - https://www.mongodb.org/static/pgp/server-4.2.asc | sudo apt-key add -

Ref: https://docs.mongodb.com/manual/tutorial/install-mongodb-on-ubuntu/

Q4: Define GridFS.

A: GridFS is a convention for storing large files in a MongoDB database. All of the official MongoDB drivers support this convention, as does the mongofiles program.

In MongoDB, use GridFS for storing files larger than 16 MB.

Ref: https://docs.mongodb.com/manual/reference/glossary/#term-gridfs

...

GridFS is a specification for storing and retrieving files that exceed the BSON-document size limit of 16 MB.
Ref: https://docs.mongodb.com/manual/core/gridfs/
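
GridFS stays under the 16 MB limit by splitting a file into chunks and storing each chunk as its own document in the fs.chunks collection, with the file's metadata in fs.files. The following is a minimal sketch of that splitting in plain Python, assuming the default chunk size of 255 KiB (261120 bytes). The function names are hypothetical, and the dictionaries only loosely resemble real fs.chunks documents, which also carry _id and files_id fields.

```python
import math

DEFAULT_CHUNK_SIZE = 255 * 1024  # 261120 bytes, the GridFS default chunk size


def chunk_count(length: int, chunk_size: int = DEFAULT_CHUNK_SIZE) -> int:
    """Number of chunk documents needed for a file of the given length."""
    return math.ceil(length / chunk_size)


def split_into_chunks(data: bytes, chunk_size: int = DEFAULT_CHUNK_SIZE):
    """Split data into ordered pieces, each tagged with its sequence number n."""
    return [
        {"n": index, "data": data[start:start + chunk_size]}
        for index, start in enumerate(range(0, len(data), chunk_size))
    ]
```

For example, chunk_count(14) is 1, so a 14-byte file fits in a single chunk, while a 16 MiB file needs 65 chunks.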

BSON: A serialization format used to store documents and make remote procedure calls in MongoDB. “BSON” is a portmanteau of the words “binary” and “JSON”. Think of BSON as a binary representation of JSON (JavaScript Object Notation) documents.
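
To make "binary representation of JSON" concrete, here is a minimal hand-rolled encoder that covers only flat documents with string and 32-bit integer values. It is a sketch for illustration (encode_bson is a made-up name); real code would use the bson package that ships with PyMongo.

```python
import struct


def encode_bson(document: dict) -> bytes:
    """Encode a flat dict of str and int32 values as a BSON document."""
    body = b""
    for name, value in document.items():
        key = name.encode("utf-8") + b"\x00"       # field name as a C string
        if isinstance(value, str):
            raw = value.encode("utf-8") + b"\x00"  # string payload, null-terminated
            body += b"\x02" + key + struct.pack("<i", len(raw)) + raw
        elif isinstance(value, int) and not isinstance(value, bool):
            body += b"\x10" + key + struct.pack("<i", value)
        else:
            raise TypeError(f"unsupported value type for field {name!r}")
    # int32 total length (counting itself) + elements + terminating null byte
    return struct.pack("<i", len(body) + 5) + body + b"\x00"
```

Encoding {"hello": "world"} this way yields 22 bytes: a 4-byte length prefix, one typed element, and a trailing null byte.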

Q5: How do you show all the databases present on 'localhost' on a fresh installation of MongoDB?

A:
Enter the 'mongo' shell using the command 'mongo'.

(base) ashish@ashish-vBox:~$ which mongo
/usr/bin/mongo

> db.adminCommand('listDatabases')
{
    "databases" : [
        {
            "name" : "admin",
            "sizeOnDisk" : 40960,
            "empty" : false
        },
        {
            "name" : "config",
            "sizeOnDisk" : 61440,
            "empty" : false
        },
        {
            "name" : "local",
            "sizeOnDisk" : 40960,
            "empty" : false
        }
    ],
    "totalSize" : 143360,
    "ok" : 1
}
> db.getMongo().getDBNames()
[ "admin", "config", "local" ]
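> show dbs
admin   0.000GB
config  0.000GB
local   0.000GB

Note: 'show databases' is equivalent to 'show dbs'. There is also a database 'test' that does not appear in any of the listings above, although the 'mongo' shell reports it as the output of the command 'db'. It is the shell's default database, and MongoDB does not list a database until it holds data.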
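
The three listings above carry the same information. As a small sketch, the listDatabases reply can be read in Python to show that totalSize is the sum of the sizeOnDisk values and why every database shows as 0.000GB. The GB conversion assumes the shell divides by 1024^3 and rounds to three decimals, and summarize is a hypothetical helper name.

```python
# The reply document from db.adminCommand('listDatabases'), copied from above.
response = {
    "databases": [
        {"name": "admin", "sizeOnDisk": 40960, "empty": False},
        {"name": "config", "sizeOnDisk": 61440, "empty": False},
        {"name": "local", "sizeOnDisk": 40960, "empty": False},
    ],
    "totalSize": 143360,
    "ok": 1,
}


def summarize(reply: dict):
    """Return database names, summed size in bytes, and per-database GB strings."""
    names = [db["name"] for db in reply["databases"]]
    total = sum(db["sizeOnDisk"] for db in reply["databases"])
    in_gb = {
        db["name"]: f"{db['sizeOnDisk'] / 1024**3:.3f}GB"
        for db in reply["databases"]
    }
    return names, total, in_gb


names, total, in_gb = summarize(response)
```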

Q6: Show how to put a file in GridFS.
A:
(base) ashish@ashish-vBox:~$ mongofiles -d=local list
2019-10-29T16:55:34.789+0530 connected to: mongodb://localhost/

(base) ashish@ashish-vBox:~/Desktop$ tee grid_fs_test.txt
Hello GridFS!

(base) ashish@ashish-vBox:~/Desktop$ cat grid_fs_test.txt 
Hello GridFS!

(base) ashish@ashish-vBox:~/Desktop$ mongofiles -d=local put grid_fs_test.txt
2019-10-29T16:59:43.790+0530 connected to: mongodb://localhost/
2019-10-29T16:59:43.989+0530 added gridFile: grid_fs_test.txt

(base) ashish@ashish-vBox:~/Desktop$ mongofiles -d=local list
2019-10-29T17:00:01.578+0530 connected to: mongodb://localhost/
grid_fs_test.txt 14

Q7: How to get the Object ID of a file in GridFS?

A:
First, enter 'mongo' shell.

> use local
switched to db local
> db.fs.files
local.fs.files
> db.fs.files.find()
{ "_id" : ObjectId("5db822a712366dd1af476200"), "length" : NumberLong(14), "chunkSize" : 261120, "uploadDate" : ISODate("2019-10-29T11:29:43.988Z"), "filename" : "grid_fs_test.txt", "metadata" : {  } }

...

> db.fs.files.find().pretty()
{
 "_id" : ObjectId("5db96aab4423094e34660cff"),
 "length" : NumberLong(7),
 "chunkSize" : 261120,
 "uploadDate" : ISODate("2019-10-30T10:49:15.541Z"),
 "filename" : "grid_fs.txt",
 "metadata" : {
  
 }
}
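
An ObjectId is more than an opaque identifier: its first four bytes hold the creation time as a Unix timestamp in seconds. The sketch below decodes that timestamp with the standard library only. The helper name objectid_timestamp is made up for this example; with PyMongo installed, the generation_time property of an ObjectId gives the same value.

```python
from datetime import datetime, timezone


def objectid_timestamp(oid_hex: str) -> datetime:
    """Creation time encoded in the first 4 bytes of a 12-byte ObjectId."""
    if len(oid_hex) != 24:
        raise ValueError("an ObjectId is written as 24 hex characters")
    seconds = int(oid_hex[:8], 16)   # big-endian seconds since the Unix epoch
    return datetime.fromtimestamp(seconds, tz=timezone.utc)
```

Applied to the first _id shown above, "5db822a712366dd1af476200", it returns 2019-10-29 11:29:43 UTC, which agrees with that file's uploadDate.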

Q8: How do you set the 'data' directory path for MongoDB?
A:
"mongod" is the command that starts MongoDB. Its --dbpath option sets the data directory:
mongod --dbpath /home/ashish/Desktop/workspace/data/db

MongoDB needs a data directory to store its data. The default path is "/data/db".

(base) ashish@ashish-vBox:~/Desktop/workspace/data/db$ ls -l
total 412
-rw------- 1 ashish ashish 20480 Oct 30 16:15 collection-0-1438267418948105381.wt
...
-rw------- 1 ashish ashish 20480 Oct 30 16:19 collection-8-1438267418948105381.wt
drwx------ 2 ashish ashish  4096 Oct 30 16:49 diagnostic.data
-rw------- 1 ashish ashish 24576 Oct 30 16:19 index-10-1438267418948105381.wt
...
-rw------- 1 ashish ashish 20480 Oct 30 16:19 index-9-1438267418948105381.wt
drwx------ 2 ashish ashish  4096 Oct 30 16:14 journal
-rw------- 1 ashish ashish 36864 Oct 30 16:19 _mdb_catalog.wt
-rw------- 1 ashish ashish     5 Oct 30 16:14 mongod.lock
-rw------- 1 ashish ashish 36864 Oct 30 16:20 sizeStorer.wt
-rw------- 1 ashish ashish   114 Oct 30 16:14 storage.bson
-rw------- 1 ashish ashish    47 Oct 30 16:14 WiredTiger
-rw------- 1 ashish ashish  4096 Oct 30 16:14 WiredTigerLAS.wt
-rw------- 1 ashish ashish    21 Oct 30 16:14 WiredTiger.lock
-rw------- 1 ashish ashish  1190 Oct 30 16:48 WiredTiger.turtle
-rw------- 1 ashish ashish 77824 Oct 30 16:48 WiredTiger.wt

The "mongo" shell logs the following error if MongoDB has not been started:
2019-10-30T16:59:57.112+0530 E  QUERY    [js] Error: couldn't connect to server 127.0.0.1:27017, connection attempt failed: SocketException: Error connecting to 127.0.0.1:27017

Ref: http://zetcode.com/python/pymongo/
Ref: https://api.mongodb.com/python/current/examples/gridfs.html

Q9: How would you empty a GridFS database?
A:
> db.fs.files.remove({})
WriteResult({ "nRemoved" : 4 })

> db.fs.chunks.remove({})
WriteResult({ "nRemoved" : 4 })

Q10: Where is 'mongo.exe' located in a Windows installation?
A:

To begin using MongoDB, connect a mongo.exe shell to the running MongoDB instance. Either:
From Windows Explorer/File Explorer, go to C:\Program Files\MongoDB\Server\4.2\bin\ directory and double-click on mongo.exe.
Or, open a Command Interpreter with Administrative privileges and run:
"C:\Program Files\MongoDB\Server\4.2\bin\mongo.exe"

Ref: https://docs.mongodb.com/manual/tutorial/install-mongodb-on-windows/

Q11: How would you stop MongoDB on Windows?
A:
If MongoDB is installed as a Windows service named MongoDB, run:

net stop MongoDB

If it is not installed as a service (Windows 7 and later), you can run the following, which terminates the process forcibly:

taskkill /f /im mongod.exe

Ref: https://stackoverflow.com/questions/11774887/how-to-stop-mongo-db-in-one-command/11777141

Q12: Write about "ServerSelectionTimeoutError" error.
A:
exception pymongo.errors.ServerSelectionTimeoutError(message='', errors=None)
Thrown when no MongoDB server is available for an operation.

If there is no suitable server for an operation PyMongo tries for serverSelectionTimeoutMS (default 30 seconds) to find one, then throws this exception. For example, it is thrown after attempting an operation when PyMongo cannot connect to any server, or if you attempt an insert into a replica set that has no primary and does not elect one within the timeout window, or if you attempt to query with a Read Preference that the replica set cannot satisfy.

Ref: https://api.mongodb.com/python/current/api/pymongo/errors.html
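
A minimal sketch of handling this exception: lower serverSelectionTimeoutMS so the failure shows up quickly, then run a cheap command to force server selection. The function name server_available and the 2-second default are choices made for this example, and the import guard exists only so the snippet loads where PyMongo is not installed.

```python
try:
    from pymongo import MongoClient
    from pymongo.errors import ServerSelectionTimeoutError
except ImportError:  # lets the sketch load where PyMongo is not installed
    MongoClient = None


def server_available(uri: str, timeout_ms: int = 2000) -> bool:
    """Return True if a MongoDB server answers a ping within timeout_ms.

    Also returns False when PyMongo itself is missing.
    """
    if MongoClient is None:
        return False
    # Creating the client does not connect; the first operation does.
    client = MongoClient(uri, serverSelectionTimeoutMS=timeout_ms)
    try:
        client.admin.command("ping")   # triggers server selection
        return True
    except ServerSelectionTimeoutError:
        return False
    finally:
        client.close()
```

Calling server_available("mongodb://localhost:27017") before heavier work turns the default 30-second wait into a short, explicit check.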

Q13: Write about the "bind_ip" argument passed to "mongod" at start-up.
A (13a):
Starting in MongoDB 3.6, MongoDB binaries, mongod and mongos, bind to localhost by default. If the net.ipv6 configuration file setting or the --ipv6 command line option is set for the binary, the binary additionally binds to the localhost IPv6 address.

Considerations
WARNING

Make sure that your mongod and mongos instances are only accessible on trusted networks. If your system has more than one network interface, bind MongoDB programs to the private or internal network interface.

To override and bind to other ip addresses, you can use the net.bindIp configuration file setting or the --bind_ip command-line option to specify a list of hostnames or ip addresses.

WARNING

Before binding to a non-localhost (e.g. publicly accessible) IP address, ensure you have secured your cluster from unauthorized access. For a complete list of security recommendations, see Security Checklist. At minimum, consider enabling authentication and hardening network infrastructure.

For example, the following mongod instance binds to both the localhost and the hostname My-Example-Associated-Hostname, which is associated with the ip address 198.51.100.1:

mongod --bind_ip localhost,My-Example-Associated-Hostname

In order to connect to this instance, remote clients must specify the hostname or its associated ip address 198.51.100.1:

mongo --host My-Example-Associated-Hostname

mongo --host 198.51.100.1

To bind to all IPv4 addresses, you can specify the bind ip address of 0.0.0.0. To bind to all IPv4 and IPv6 addresses, you can specify the bind ip address of ::,0.0.0.0 or alternatively, use the new net.bindIpAll setting or the new command-line option --bind_ip_all.

Ref: https://docs.mongodb.com/manual/core/security-mongodb-configuration/
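
What "binding" means at the socket level can be sketched with Python's standard socket module. This is a generic TCP illustration, not mongod's own code, and bound_address is a hypothetical helper: a listener bound to 127.0.0.1 is reachable only through the loopback interface, while one bound to 0.0.0.0 accepts connections on every IPv4 interface.

```python
import socket


def bound_address(host: str) -> tuple:
    """Bind a throwaway TCP listener to host on a free port; return its address."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind((host, 0))   # port 0 asks the OS for any free port
        listener.listen()
        return listener.getsockname()


loopback_only = bound_address("127.0.0.1")   # like --bind_ip localhost
all_ipv4 = bound_address("0.0.0.0")          # like --bind_ip 0.0.0.0
```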

A (13b):
This older answer states that mongod can be bound to only one IP, with 0.0.0.0 being the alias for "listen on all available network interfaces". That conflicts with the official documentation quoted in 13a, which accepts a comma-separated list of hostnames or IP addresses.

According to the answer, either use

bind_ip=127.0.0.1
to listen on the loopback interface or

bind_ip=[someIP]
to listen on that IP only or

bind_ip=0.0.0.0
to listen on all available IPs on the system.

The answer adds that if you need to listen on several specific IPs, it is very likely that your system design is flawed.

Ref: https://stackoverflow.com/questions/30884021/mongodb-bind-ip-wont-work-unless-set-to-0-0-0-0

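Q14: How would you install the Python package for GridFS?
A:
The Python package for GridFS is "gridfs". It ships with 'pymongo' and is not published separately on PyPI, so "pip install gridfs" fails, while "pip install pymongo" makes "import gridfs" work.

The following logs show an attempt to install "gridfs" directly, followed by installing "pymongo" instead.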

(base) administrator@master $ pip install gridfs
Collecting gridfs
  ERROR: Could not find a version that satisfies the requirement gridfs (from versions: none)
ERROR: No matching distribution found for gridfs
(base) administrator@master:/usr/local/spark/examples/src/main/python$ python
Python 3.7.3 (default, Mar 27 2019, 22:11:17) 
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pymongo
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'pymongo'
>>> exit()
(base) administrator@master $ pip install pymongo
Collecting pymongo
  Downloading https://files.pythonhosted.org/packages/23/23/7666537adafcd232c88c156aa9382c859791d79bf12094005e009c2b6a3d/pymongo-3.9.0-cp37-cp37m-manylinux1_x86_64.whl (447kB)
     |████████████████████████████████| 450kB 193kB/s 
Installing collected packages: pymongo
Successfully installed pymongo-3.9.0
(base) administrator@master $ python
Python 3.7.3 (default, Mar 27 2019, 22:11:17) 
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import gridfs
>>> exit()
(base) administrator@master $ 
