CBS Server: SQLite3 DB always locked

Hello,

@switt4 and I have run into an issue using DataLad on the CBS Server that we’ve narrowed down to stemming from SQLite3 not working properly due to a filesystem lock.

A minimal example is the following Python script:

#!/usr/bin/python3

import sqlite3

con = sqlite3.connect("example.db")
cur = con.cursor()
cur.execute("create table foo (name varchar);")
con.commit()
con.close()

Which crashes with the following traceback:

Traceback (most recent call last):
  File "show_sqlite_error.py", line 7, in <module>
    cur.execute("create table foo (name varchar);")
sqlite3.OperationalError: database is locked

I don’t really know enough about sqlite3 or CIFS to know if there’s a straightforward way to debug or fix this – @mklassen I think you’ve looked at the filesystem issues before but let me know if there’s someone better to ask or if you know of a good way for me to track this down.

Thanks!

The general issue is that sqlite3 does not really like network filesystems. The recommendations for solving this tend to be to use the nobrl option when mounting CIFS shares, but that requires root privileges to test. The other recommendation that is sometimes made is to use the database option nolock (https://www.sqlite.org/c3ref/open.html)

sqlite3.connect("file:example.db?nolock=1")

This works, but it means you are responsible to ensure that the database is never concurrently accessed or you risk permanent corruption of the database file. I explicitly do not recommended this workaround, use at your own risk.

FTR

  • The issue happens in the home directory which is nfs4 as well as on cifs shares.
  • It does not occur in /tmp which is ext4.
  • It would be useful to know if it occurs with sqlite3 (not the python package, the binary package), but that is not installed on the system.
1 Like

We should open a ticket with Chris Wedlake (helpdesk@robarts.ca) to help debug this (cc me). I don’t believe a solution to the intermittent file locking issues was ever found, so if this is a reproducible bug then we should work with Chris to try to resolve this again, as the thought was there could be some combination of filesystem mounting options that might resolve it. Part of the test case could involve running this command on db in /localscratch vs NFS mounts vs CIFS mounts.

The workaround for me to avoid file-lock issues has been to use /localscratch for anything with alot of file i/o, but I agree this is quite restrictive…

I did see in https://www.sqlite.org/lockingv3.html that one recommendation is to avoid network file systems to be safe (“Your best defense is to not use SQLite for files on a network filesystem.”, stated in the “6.0 How To Corrupt Your Database Files” section).

Ali

1 Like

Just noticed @mklassen’s message above now – would be good to CC you on this helpdesk ticket as well if you’re ok with that!

Okay, I’ll open a ticket and CC the two of you. Thanks for your help with this!