This page documents the investigation of an issue where a MongoDB Pod failed to lock mongod.lock on OpenShift/OKDv4.
The root cause was that NFSv3 was used for the mounts between the OKDv4 Worker Nodes and the NFS server. MongoDB internally uses flock/fcntl to lock the mongod.lock file. On a network file system like NFSv3, locking support is required from both the server and the client; otherwise, a “No locks available” error is the outcome. NFSv4, on the other hand, has locking built into the protocol itself. Changing the mounts to use only NFSv4 solved the problem, but there was more to it than just that.
On 2021-09-22, a colleague reported that deploying MongoDB into our OKDv4 cluster failed with the error:
...exception in initAndListen: DBPathInUse: Unable to lock the lock file /var/lib/mongodb/data/mongod.lock (No locks available)...
At first sight, we thought that the mongod process did not have permission to access the file. This was quickly dismissed because the process had created the file itself.
To verify this, I read the source code of MongoDB (I had investigated a similar issue in the past) to see exactly what was happening at the point where the error originated.
Searching the MongoDB code base at tag r4.0.5, I could pinpoint exactly where the error was raised, in the module storage_engine_lock_file_posix.cpp.
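The original snippet is not reproduced here. As a stand-in, here is a minimal, self-contained sketch of the same fcntl(F_SETLK) locking pattern; it is not MongoDB's actual code, and std::strerror stands in for MongoDB's errnoWithDescription helper:

```cpp
// Minimal stand-in for the locking pattern in storage_engine_lock_file_posix.cpp:
// open the lock file and try to take an exclusive fcntl() write lock on it.
// On an NFSv3 mount without working NLM locking, fcntl() fails with errno ENOLCK,
// which strerror() renders as "No locks available".
#include <cerrno>
#include <cstdio>
#include <cstring>
#include <fcntl.h>
#include <unistd.h>

int main(int argc, char** argv) {
    // Point this at a file on the NFS-backed volume to reproduce the failure.
    const char* path = argc > 1 ? argv[1] : "mongod.lock";

    int fd = ::open(path, O_RDWR | O_CREAT, 0644);
    if (fd == -1) {
        std::perror("open");
        return 1;
    }

    struct flock fileLock{};
    fileLock.l_type = F_WRLCK;    // exclusive (write) lock
    fileLock.l_whence = SEEK_SET;
    fileLock.l_start = 0;
    fileLock.l_len = 0;           // 0 = lock the whole file

    if (::fcntl(fd, F_SETLK, &fileLock) == -1) {
        // strerror(errno) plays the role of MongoDB's errnoWithDescription():
        // it turns errno into text such as "No locks available" (ENOLCK).
        std::fprintf(stderr, "Unable to lock the lock file %s (%s)\n",
                     path, std::strerror(errno));
        ::close(fd);
        return 1;
    }

    std::puts("lock acquired");
    ::close(fd);
    return 0;
}
```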
At line 171, mongod attempted to lock the mongod.lock file using fcntl. This is a system call to the kernel API fcntl(2), and it failed. Lines 176-182 returned the error. The errnoWithDescription(errorcode) call at line 178 translated the global variable errno (set by the fcntl call) into a human-readable message. Depending on the implementation of fcntl, the text may vary; in our case, it was “No locks available”.
Googling “No locks available” together with fcntl showed that the issue was related to the file system, especially NFSv3. This was a great hint, as we were using NFS for our PersistentVolumes. However, the problem did not exist when we deployed MongoDB on our OKDv3 cluster, which also used NFS. One obvious difference between OKDv3 and OKDv4 was that OKDv4 used Fedora CoreOS, and I did not know how its NFS client was set up.
NOTE: Reading further, I learned that for file locking to work with NFSv3, the separate Network Lock Manager (NLM) locking protocol MUST be supported and running on both the server and the client.
I first checked how it worked on OKDv3 by logging into a node.
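The original output is not preserved here; a check along the following lines (host names and output are illustrative) showed only nfs4 mounts:

```sh
# On an OKDv3 node (output is illustrative, not the captured one):
$ mount | grep nfs
nfs-server:/export/shared_volume on /var/lib/origin/... type nfs4 (rw,relatime,vers=4.1,...)
# every NFS-backed volume showed up as "type nfs4"
```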
OK, so for OKDv3, only NFSv4 was used.
Then, I had to log into the OKDv4 Worker Node to find out how the NFS mounts were done.
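The same check on Fedora CoreOS (again with illustrative output):

```sh
# On an OKDv4 (Fedora CoreOS) worker node:
$ sudo mount | grep nfs
nfs-server:/export/okd4_volume/... on /var/lib/kubelet/pods/... type nfs (rw,relatime,vers=3,...)
# "type nfs" with vers=3: the PersistentVolumes were mounted with NFSv3
```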
OK, so only NFSv3 was used on OKDv4. WHY?
Then I realized the second big difference between OKDv3 and OKDv4: on OKDv4, the NFS PersistentVolumes were provisioned automatically by an Operator called nfs-subdir-external-provisioner. This Operator registers a StorageClass named managed-nfs-storage (configurable). Any PersistentVolumeClaim that uses managed-nfs-storage is automatically bound to a newly created PersistentVolume.
Could it be that this Operator used only NFSv3 by default? Could we force OKDv4 to use a particular NFS version for a PersistentVolume? It turned out to be possible by setting mountOptions on the PersistentVolume.
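On a plain PersistentVolume the field sits under spec and looks roughly like this (the names, server, and exact NFS version string below are illustrative):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-nfs-pv              # illustrative name
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  mountOptions:
    - nfsvers=4.1                   # force the client to mount with NFSv4
  nfs:
    server: nfs-server.example.com  # illustrative server
    path: /export/okd4_volume
```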
But since all PersistentVolumes were created by the Operator, I needed to find out how to inject this mountOptions field. Luckily, the Operator's Helm chart allows it to be specified.
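Assuming the chart exposes an nfs.mountOptions list (the upstream nfs-subdir-external-provisioner chart documents such a value), the install command looks roughly like this; note the curly-brace syntax Helm wants for an Array value:

```sh
# Server name is illustrative; nfs.mountOptions is assumed from the chart's values.
# Assumes: helm repo add nfs-subdir-external-provisioner \
#   https://kubernetes-sigs.github.io/nfs-subdir-external-provisioner/
helm upgrade --install nfs-subdir-external-provisioner \
  nfs-subdir-external-provisioner/nfs-subdir-external-provisioner \
  --set nfs.server=nfs-server.example.com \
  --set nfs.path=/export/okd4_volume \
  --set 'nfs.mountOptions={nfsvers=4.1}'
# Alternatively, put the list in a values file to avoid the --set quoting rules:
#   nfs:
#     mountOptions:
#       - nfsvers=4.1
```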
NOTE: It took a little investigation to figure out how to pass a Helm value that is an Array rather than a Dict.
Once this was deployed, I discovered that the Operator failed to bootstrap because of a mount failure.
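The exact output is not preserved here; roughly, inspecting the provisioner Pod showed a volume mount failure whose reason amounted to the export path not being found (along the lines of “No such file or directory”):

```sh
# Namespace and Pod name are illustrative.
oc -n nfs-provisioner get pods
oc -n nfs-provisioner describe pod nfs-subdir-external-provisioner-xxxxx
# The Events section reported a MountVolume.SetUp failure for the NFS volume,
# with the server refusing the requested path.
```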
That was strange, because I knew for sure that the directory existed and was exported. So I checked the NFS server to see how the directories were exported.
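It was done something like this (reconstructed from the description below; the client spec and option flags are illustrative, the fsid=0 placement is the relevant part):

```
# /etc/exports (before the change)
/export/shared_volume   *(rw,sync,no_subtree_check,fsid=0)
/export/okd4_volume     *(rw,sync,no_subtree_check)
```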
It looked normal to me! However, digging through Google surfaced a valuable piece of information: NFSv4 treats fsid seriously, as explained in the answer to the StackOverflow question “Cannot mount nfs4 share: no such file or directory”.
So in the /etc/exports above, the path /export/shared_volume was exported with fsid=0, which NFSv4 treats as the root of the exported tree. When the path /export/okd4_volume was then mounted using NFSv4, the server rejected it because that path was not reachable under that root (its parent, /export, was not exported). Another great hint was that the client's mount path MUST NOT contain the fsid=0 prefix, in this case /export.
So I made the following change:
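The flags shown are illustrative; the relevant change is that /export becomes the single fsid=0 root:

```
# /etc/exports (after the change)
/export                 *(rw,sync,no_subtree_check,fsid=0)
/export/shared_volume   *(rw,sync,no_subtree_check)
/export/okd4_volume     *(rw,sync,no_subtree_check)
```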
There was really only one change: adding /export and making it fsid=0 (the existing entries were adjusted accordingly). Then the following command worked:
NOTE: Re-apply the configuration using sudo exportfs -a.
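A manual test from a client (server name and mount point are illustrative); note that the path is relative to the fsid=0 root, so there is no /export prefix:

```sh
sudo mount -t nfs4 nfs-server.example.com:/okd4_volume /mnt/test
```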
But hey, how was NFSv3 chosen in the first place? It turned out that the NFS client tries several combinations of options until one succeeds.
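Running the mount by hand with -v makes the negotiation visible. The transcript below is illustrative (addresses and exact messages reconstructed), not the captured log:

```sh
$ sudo mount -v -t nfs nfs-server.example.com:/export/okd4_volume /mnt/test
mount.nfs: trying text-based options 'vers=4.2,addr=192.0.2.10,clientaddr=192.0.2.20'
mount.nfs: mount(2): No such file or directory
mount.nfs: trying text-based options 'addr=192.0.2.10'
mount.nfs: prog 100003, trying vers=3, prot=6
mount.nfs: trying 192.0.2.10 prog 100003 vers 3 prot TCP port 2049
mount.nfs: prog 100005, trying vers=3, prot=17
mount.nfs: trying 192.0.2.10 prog 100005 vers 3 prot UDP port 20048
```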
Without any explicit options, the NFS client first tries vers=4.2, then falls back to vers=3,prot=6 and different ports until it finds a working combination.
With that final finding about the path, I just needed to adjust the Helm command that deploys the nfs-subdir-external-provisioner Operator so that it uses /okd4_volume instead of /export/okd4_volume.
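The resulting command was essentially the same as before, only with the path adjusted (field names still assumed from the chart, server name illustrative):

```sh
helm upgrade --install nfs-subdir-external-provisioner \
  nfs-subdir-external-provisioner/nfs-subdir-external-provisioner \
  --set nfs.server=nfs-server.example.com \
  --set nfs.path=/okd4_volume \
  --set 'nfs.mountOptions={nfsvers=4.1}'
```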
Deploying a new MongoDB Pod showed that mongod could lock the file again successfully. Phew!