Skip to main content

Clients’ operations log (.oplog) and operations history (.ophistory)

MooseFS offers a unique way to check, what filesystem operations has been performed on a node. As root (priviliged) user, you can read two files in the root directory (in the mountpoint) of your mounted MooseFS share: .oplog and .ophistory. Both of those files will show you a list of operations that is being performed on this particular client (mount). When you display them, e.g. with a cat command, .ophistory will show up to 15MiB of history of operations and then continue to show new operations, .oplog will just show new operations that have happened since you issued the cat command.

Both .oplog and .ophistory are metafiles, meaning they are not real files in the system and will not show when you list the root directory of your mount, but you can open them and read them like any other regular file. They cannot be modified (opened for writing).

How the .oplog and .ophistory files are built

Let's say a user performed the following commands in shell:

root@test:/mnt/mfs# cd myphotos
root@test:/mnt/mfs/myphotos# ls -al
total 8295
drwxr-xr-x 2 10113 10113 2000237 Aug 23 2024 .
drwxrwxrwx 34 root root 4000169 Mar 27 15:17 ..
-rw-r--r-- 1 10113 10113 509077 Aug 23 2024 photo1.jpg
-rw-r--r-- 1 10113 10113 526421 Aug 23 2024 photo2.jpg
-rw-r--r-- 1 10113 10113 579193 Aug 23 2024 photo3.jpg
-rw-r--r-- 1 10113 10113 523489 Aug 23 2024 photo4.jpg
-rw-r--r-- 1 10113 10113 353660 Aug 23 2024 photo5.jpg
root@test:/mnt/mfs/myphotos# cp photo1.jpg photo1_copy.jpg

And on another console the .oplog file was opened just before the above operations were performed. The output will look similar to this:

04.22 11:29:58.854606: uid:0 gid:0 pid:1509221 cmd:open (2147483633,O_RDONLY) (internal node: OPLOG): OK (1,0)i
04.22 11:29:58.854751: uid:0 gid:0 pid:1509221 cmd:getattr (2147483633) (internal node: OPLOG): OK (3600,[-r--------:0100400,1,0,0,0,0,0,0])
#
04.22 11:30:00.472019: uid:0 gid:0 pid:1509213 cmd:getattr (1) [no handle]: OK (1.0,[drwxrwxrwx:0040777,34,0,0,1745314023,1743085066,1743085066,4000169])
#
04.22 11:30:02.898783: uid:0 gid:0 pid:1509183 cmd:getattr (1) [no handle]: OK (1.0,[drwxrwxrwx:0040777,34,0,0,1745314023,1743085066,1743085066,4000169])
04.22 11:30:02.901284: uid:0 gid:0 pid:1509183 cmd:lookup (1,myphotos): OK (1.0,54260060,1.0,[drwxr-xr-x:0040755,2,10113,10113,1745314052,1724422073,1737719056,2000237])
04.22 11:30:02.901499: uid:0 gid:0 pid:1509183 cmd:access (54260060,0x1): OK
#
04.22 11:30:04.914317: uid:0 gid:0 pid:1509183 cmd:getattr (54260060) [no handle]: OK (1.0,[drwxr-xr-x:0040755,2,10113,10113,1745314052,1724422073,1737719056,2000237])
04.22 11:30:04.918334: uid:0 gid:0 pid:1509222 cmd:opendir (54260060): OK [handle:00000001]
04.22 11:30:04.919133: uid:0 gid:0 pid:1509222 cmd:readdirplus (54260060,4096,340) [handle:00000001]: OK (1160)
04.22 11:30:04.919440: uid:0 gid:0 pid:1509222 cmd:getxattr (54260060,security.selinux,255) (using open dir cache): ENODATA (No data available)
04.22 11:30:04.919534: uid:0 gid:0 pid:1509222 cmd:getxattr (54260060,system.posix_acl_access,0) (using open dir cache): ENODATA (No data available)
04.22 11:30:04.919597: uid:0 gid:0 pid:1509222 cmd:getxattr (54260060,system.posix_acl_default,0) (using open dir cache): ENODATA (No data available)
04.22 11:30:04.922804: uid:0 gid:0 pid:1509222 cmd:getattr (1) [no handle] (using open dir cache): OK (1.0,[drwxrwxrwx:0040777,34,0,0,1745314023,1743085066,1743085066,4000169])
04.22 11:30:04.923067: uid:0 gid:0 pid:1509222 cmd:getxattr (1,security.selinux,255) (using open dir cache): ENODATA (No data available)
04.22 11:30:04.923215: uid:0 gid:0 pid:1509222 cmd:getxattr (1,system.posix_acl_access,0) (using open dir cache): ENODATA (No data available)
04.22 11:30:04.923301: uid:0 gid:0 pid:1509222 cmd:getxattr (1,system.posix_acl_default,0) (using open dir cache): ENODATA (No data available)
04.22 11:30:04.923460: uid:0 gid:0 pid:1509222 cmd:lookup (54260060,photo3.jpg) (using open dir cache): OK (0.0,54260072,1.0,[-rw-r--r--:0100644,1,10113,10113,1724421997,1724417502,1737719056,579193])
04.22 11:30:04.923530: uid:0 gid:0 pid:1509222 cmd:lookup (54260060,photo3.jpg) (using open dir cache): OK (0.0,54260072,1.0,[-rw-r--r--:0100644,1,10113,10113,1724421997,1724417502,1737719056,579193])
04.22 11:30:04.923661: uid:0 gid:0 pid:1509222 cmd:getxattr (54260072,security.selinux,255) (using open dir cache): ENODATA (No data available)
04.22 11:30:04.923700: uid:0 gid:0 pid:1509222 cmd:lookup (54260060,photo3.jpg) (using open dir cache): OK (0.0,54260072,1.0,[-rw-r--r--:0100644,1,10113,10113,1724421997,1724417502,1737719056,579193])
04.22 11:30:04.923745: uid:0 gid:0 pid:1509222 cmd:getxattr (54260072,system.posix_acl_access,0) (using open dir cache): ENODATA (No data available)
04.22 11:30:04.923806: uid:0 gid:0 pid:1509222 cmd:lookup (54260060,photo5.jpg) (using open dir cache): OK (0.0,54260065,1.0,[-rw-r--r--:0100644,1,10113,10113,1724417502,1724417502,1737719056,353660])
04.22 11:30:04.923871: uid:0 gid:0 pid:1509222 cmd:lookup (54260060,photo5.jpg) (using open dir cache): OK (0.0,54260065,1.0,[-rw-r--r--:0100644,1,10113,10113,1724417502,1724417502,1737719056,353660])
04.22 11:30:04.923999: uid:0 gid:0 pid:1509222 cmd:getxattr (54260065,security.selinux,255) (using open dir cache): ENODATA (No data available)
04.22 11:30:04.924052: uid:0 gid:0 pid:1509222 cmd:lookup (54260060,photo5.jpg) (using open dir cache): OK (0.0,54260065,1.0,[-rw-r--r--:0100644,1,10113,10113,1724417502,1724417502,1737719056,353660])
04.22 11:30:04.924100: uid:0 gid:0 pid:1509222 cmd:getxattr (54260065,system.posix_acl_access,0) (using open dir cache): ENODATA (No data available)
04.22 11:30:04.924140: uid:0 gid:0 pid:1509222 cmd:lookup (54260060,photo4.jpg) (using open dir cache): OK (0.0,54260064,1.0,[-rw-r--r--:0100644,1,10113,10113,1724417502,1724417502,1737719056,523489])
04.22 11:30:04.924218: uid:0 gid:0 pid:1509222 cmd:lookup (54260060,photo4.jpg) (using open dir cache): OK (0.0,54260064,1.0,[-rw-r--r--:0100644,1,10113,10113,1724417502,1724417502,1737719056,523489])
04.22 11:30:04.924341: uid:0 gid:0 pid:1509222 cmd:getxattr (54260064,security.selinux,255) (using open dir cache): ENODATA (No data available)
04.22 11:30:04.924379: uid:0 gid:0 pid:1509222 cmd:lookup (54260060,photo4.jpg) (using open dir cache): OK (0.0,54260064,1.0,[-rw-r--r--:0100644,1,10113,10113,1724417502,1724417502,1737719056,523489])
04.22 11:30:04.924436: uid:0 gid:0 pid:1509222 cmd:getxattr (54260064,system.posix_acl_access,0) (using open dir cache): ENODATA (No data available)
04.22 11:30:04.924490: uid:0 gid:0 pid:1509222 cmd:lookup (54260060,photo2.jpg) (using open dir cache): OK (0.0,54260062,1.0,[-rw-r--r--:0100644,1,10113,10113,1724422805,1724417501,1737719056,526421])
04.22 11:30:04.924540: uid:0 gid:0 pid:1509222 cmd:lookup (54260060,photo2.jpg) (using open dir cache): OK (0.0,54260062,1.0,[-rw-r--r--:0100644,1,10113,10113,1724422805,1724417501,1737719056,526421])
04.22 11:30:04.924652: uid:0 gid:0 pid:1509222 cmd:getxattr (54260062,security.selinux,255) (using open dir cache): ENODATA (No data available)
04.22 11:30:04.924694: uid:0 gid:0 pid:1509222 cmd:lookup (54260060,photo2.jpg) (using open dir cache): OK (0.0,54260062,1.0,[-rw-r--r--:0100644,1,10113,10113,1724422805,1724417501,1737719056,526421])
04.22 11:30:04.924736: uid:0 gid:0 pid:1509222 cmd:getxattr (54260062,system.posix_acl_access,0) (using open dir cache): ENODATA (No data available)
04.22 11:30:04.924776: uid:0 gid:0 pid:1509222 cmd:lookup (54260060,photo1.jpg) (using open dir cache): OK (0.0,54260061,1.0,[-rw-r--r--:0100644,1,10113,10113,1724422805,1724418914,1737719056,509077])
04.22 11:30:04.924810: uid:0 gid:0 pid:1509222 cmd:lookup (54260060,photo1.jpg) (using open dir cache): OK (0.0,54260061,1.0,[-rw-r--r--:0100644,1,10113,10113,1724422805,1724418914,1737719056,509077])
04.22 11:30:04.924964: uid:0 gid:0 pid:1509222 cmd:getxattr (54260061,security.selinux,255) (using open dir cache): ENODATA (No data available)
04.22 11:30:04.925008: uid:0 gid:0 pid:1509222 cmd:lookup (54260060,photo1.jpg) (using open dir cache): OK (0.0,54260061,1.0,[-rw-r--r--:0100644,1,10113,10113,1724422805,1724418914,1737719056,509077])
04.22 11:30:04.925052: uid:0 gid:0 pid:1509222 cmd:getxattr (54260061,system.posix_acl_access,0) (using open dir cache): ENODATA (No data available)
04.22 11:30:04.925095: uid:0 gid:0 pid:1509222 cmd:readdir (54260060,4096,340) [handle:00000001]: OK (no data)
04.22 11:30:04.925150: uid:0 gid:0 pid:0 cmd:releasedir (54260060) [handle:00000000]: OK
#
#
04.22 11:30:08.070403: uid:0 gid:0 pid:513151 cmd:getattr (1) [no handle]: OK (1.0,[drwxrwxrwx:0040777,34,0,0,1745314023,1743085066,1743085066,4000169])
04.22 11:30:08.071320: uid:0 gid:0 pid:513151 cmd:getattr (54260060) [no handle]: OK (1.0,[drwxr-xr-x:0040755,2,10113,10113,1745314052,1724422073,1737719056,2000237])
04.22 11:30:08.559740: uid:1001 gid:1001 pid:1421165 cmd:statfs (1): OK (16262295086080,11538406803968,11538406803968,0,0,9719232)
#
#
04.22 11:30:15.815173: uid:0 gid:0 pid:1509183 cmd:getattr (54260060) [no handle]: OK (1.0,[drwxr-xr-x:0040755,2,10113,10113,1745314052,1724422073,1737719056,2000237])
04.22 11:30:15.819454: uid:0 gid:0 pid:1509223 cmd:lookup (54260060,photo1_copy.jpg): ENOENT (No such file or directory)
04.22 11:30:15.821640: uid:0 gid:0 pid:1509223 cmd:lookup (54260060,photo1.jpg): OK (0.0,54260061,1.0,[-rw-r--r--:0100644,1,10113,10113,1724422805,1724418914,1737719056,509077])
04.22 11:30:15.822296: uid:0 gid:0 pid:1509223 cmd:lookup (54260060,photo1.jpg): OK (0.0,54260061,1.0,[-rw-r--r--:0100644,1,10113,10113,1724422805,1724418914,1737719056,509077])
04.22 11:30:15.822513: uid:0 gid:0 pid:1509223 cmd:open (54260061,O_RDONLY) (using cached data from lookup): OK (direct_io:0,keep_cache:0,append_mode:0) [handle:01000001]
04.22 11:30:15.823003: uid:0 gid:0 pid:1509223 cmd:lookup (54260060,photo1_copy.jpg): ENOENT (No such file or directory)
04.22 11:30:15.825095: uid:0 gid:0 pid:1509223 cmd:create (54260060,photo1_copy.jpg,O_WRONLY|O_CREAT|O_EXCL,-rw-r--r--:0100644): OK (0.0,4691,1.0,[-rw-r--r--:0100644,1,0,0,1745314215,1745314215,1745314215,0]) (direct_io:0,keep_cache:0,append_mode:0) [handle:01000002]
04.22 11:30:15.825327: uid:0 gid:0 pid:1509223 cmd:getxattr (4691,security.capability,0) (using cache): ENODATA (No data available)
04.22 11:30:15.840440: uid:0 gid:0 pid:1509223 cmd:read (54260061,131072,0) [handle:01000001]: OK (131072)
04.22 11:30:15.840646: uid:0 gid:0 pid:1509223 cmd:getxattr (4691,security.capability,0) (using cache): ENODATA (No data available)
04.22 11:30:15.840955: uid:0 gid:0 pid:1509223 cmd:write (4691,65536,0) [handle:01000002]: OK (65536)
04.22 11:30:15.841326: uid:0 gid:0 pid:1509223 cmd:getxattr (4691,security.capability,0) (using cache): ENODATA (No data available)
04.22 11:30:15.841592: uid:0 gid:0 pid:1509223 cmd:write (4691,65536,65536) [handle:01000002]: OK (65536)
04.22 11:30:15.850950: uid:0 gid:0 pid:1509223 cmd:read (54260061,118784,393216) [handle:01000001]: OK (115861)
04.22 11:30:15.850955: uid:0 gid:0 pid:1509223 cmd:read (54260061,262144,131072) [handle:01000001]: OK (262144)
04.22 11:30:15.851261: uid:0 gid:0 pid:1509223 cmd:getxattr (4691,security.capability,0) (using cache): ENODATA (No data available)
04.22 11:30:15.851626: uid:0 gid:0 pid:1509223 cmd:write (4691,65536,131072) [handle:01000002]: OK (65536)
04.22 11:30:15.851780: uid:0 gid:0 pid:1509223 cmd:getxattr (4691,security.capability,0) (using cache): ENODATA (No data available)
04.22 11:30:15.852189: uid:0 gid:0 pid:1509223 cmd:write (4691,65536,196608) [handle:01000002]: OK (65536)
04.22 11:30:15.852261: uid:0 gid:0 pid:1509223 cmd:getxattr (4691,security.capability,0) (using cache): ENODATA (No data available)
04.22 11:30:15.852504: uid:0 gid:0 pid:1509223 cmd:write (4691,65536,262144) [handle:01000002]: OK (65536)
04.22 11:30:15.852567: uid:0 gid:0 pid:1509223 cmd:getxattr (4691,security.capability,0) (using cache): ENODATA (No data available)
04.22 11:30:15.852774: uid:0 gid:0 pid:1509223 cmd:write (4691,65536,327680) [handle:01000002]: OK (65536)
04.22 11:30:15.852825: uid:0 gid:0 pid:1509223 cmd:getxattr (4691,security.capability,0) (using cache): ENODATA (No data available)
04.22 11:30:15.853029: uid:0 gid:0 pid:1509223 cmd:write (4691,65536,393216) [handle:01000002]: OK (65536)
04.22 11:30:15.853078: uid:0 gid:0 pid:1509223 cmd:getxattr (4691,security.capability,0) (using cache): ENODATA (No data available)
04.22 11:30:15.853308: uid:0 gid:0 pid:1509223 cmd:write (4691,50325,458752) [handle:01000002]: OK (50325)
04.22 11:30:16.001743: uid:0 gid:0 pid:1509223 cmd:flush (4691) [handle:01000002,uselocks:0,lock_owner:58BFB3EB0215AD8A]: OK
04.22 11:30:16.001868: uid:0 gid:0 pid:0 cmd:release (4691) [handle:01000002,uselocks:0,lock_owner:0000000000000000]: OK
04.22 11:30:16.001920: uid:0 gid:0 pid:1509223 cmd:flush (54260061) [handle:01000001,uselocks:0,lock_owner:58BFB3EB0215AD8A]: OK
04.22 11:30:16.001992: uid:0 gid:0 pid:0 cmd:release (54260061) [handle:01000001,uselocks:0,lock_owner:0000000000000000]: OK
#
04.22 11:30:17.780845: uid:0 gid:0 pid:96581 cmd:getattr (1) [no handle]: OK (1.0,[drwxrwxrwx:0040777,34,0,0,1745314023,1743085066,1743085066,4000169])
04.22 11:30:17.781538: uid:0 gid:0 pid:96581 cmd:getattr (54260060) [no handle]: OK (1.0,[drwxr-xr-x:0040755,2,10113,10113,1745314052,1745314215,1745314215,2000286])

The above output was generated by Debian 12 (bookworm). Another Linux distribution or another OS (FreeBSD, MacOS or other) may generate a different output, for example without the getxattr commands.

Let's analyse what is being displayed:

  1. Each line starts with a very similar part: 04.22 11:30:15.822296: uid:0 gid:0 pid:1509223 cmd:xxx. This shows when (month.day hour:minute:second.microsecond), who (uid, gid, pid) performed what filesystem command (cmd:xxx, where xxx is the command). Note: pid:0 means an operation performed by system kernel.

  2. Next, in the round brackets, there is some variable content than depends on the command performed, e.g. for lookup command there are 2 parametes: in which directory (inode number) what content was looked up (name of an object), for read command there are 3 parametes: from which file (inode), how many bytes and from what offset were read (or attempted to).

  3. If the operation is performed on an open file, a hande of this file may be displayed next in square brackets, if appropriate, information about locks will be displayed too.

  4. If the operation is performed using local cache (no communication with the Master), a note is displayed.

  5. Then, after a colon (:) sign there will be status of the command: OK if the command succeeded, corresponding error if it did not succeed.

  6. Finally, following the status, additional information that is the "result" of the command will be displayed, if appropriate. For an error it will be the "long" version of the error. For the OK status it may be the result of the command performed, e.g. when a new file is created, parameters of that file (permissions, inode, a/m/ctime etc.) will be returned, when a file was read or written, the number of bytes read/written will be returned, etc.

  7. A line with just the hash (#) sign is an "idle" line, they are displayed every 1 second if no operations happened during that second.

In the above example, the first two lines have to do with opening the metafile .oplog, the rest is the result of the commands that were issued. The next 4 lines are the cd myphotos command (ommiting the hash lines). Then, the big block of commands starting with getxattr and opendir, and ending with releasedir is the result of the ls -al command. The rest is the cp command.

Note, that:

  1. A lot of the lines in the ls -al block have the added information "using open dir cache" - that's because the opendir operation sends a lot of information about the opened directory from Master to Client in one big packet. That information is cached in the client and used in subsequent calls, thanks to that the operations are performed really fast. An operation that is made "using cache" is practically never a cause for concern when it comes to measuring efficiency of the system.

  2. Not all errors are errors. In the above listing, are errors are supposed to be there. The ENODATA errors in all the getxattr commands mean simply, that the requested attribute was not defined for that object. The ENOENT error in the lookup command which tries to "find" object named photo1_copy.jpg simply tells the OS that the object with that name does not exist in the specified path (directory), therefore the cp operation can proceed and should create a new object with that name.

  3. MooseFS client never decides by itself what operations should be performed. This is all done by whatever process is currently working with the filesystem, in cooperation with the hosting OS's kernel. So if you see some operations that are "unnecessary", "redundant" - it has nothing to do with MooseFS and everything to do with the processes that work "on" MooseFS.

How can you use the .oplog and .ophistory files?

  1. To debug filesystem operations performed by a process - e.g. you have installed an application that performs some filesystem operations and this application is awfully slow. You can "peek" at what the application is doing and decide for yourself if the problem lies in the application not being well optimised or if it's something else.

  2. To debug errors - a process is throwing some unexplained errors while working on a file? See what operations it performed and if the errors it throws are consistent with the errors the filesystem is reporting.

  3. To see what process is responsible for the heavy usage of this particular MooseFS client - simply save some oplog data to an external file (e.g. in your /tmp/ directory) and use simple tools like grep and cut (or awk if necessary) to look for the culprit (it might be a good idea to filter out all the lines that match the pattern using .*cache first, to find the process that performs the most "active connection" operations). Example:

    root@test:/mnt/mfs# cat .oplog > /tmp/myoplog.txt
    ^C
    root@test:/mnt/mfs# egrep -v "using .*cache" /tmp/myoplog.txt | cut -d " " -f 5 | grep -v '#' | sort | uniq -c | sort -n

Notice, that since the .oplog file is providing data continously, you have to manually stop the cat operation after some time has passed.

Oplog dictionary

Each operation will return an OK status or an error code. Some operations return aditional values with the OK status, some just the status. All operations, their parameters and return values for OK status (if any) are listed below:

commandparamsreturned values
accessinode, mode
createparent inode, name, flags, mode(entry cache timeout, inode, attribute cache timeout, attributes) (direct i/o mode, cache mode, append mode) [handle]
flockFUSE request id, inode, owner, lock mode, [handle]
flushinode [handle, uselocks, lock owner]
fsyncinode, datasync [handle]
getattrinode [handle]attribute cache timeout, attributes
getlkinode, owner, start, end, type [handle]start, end, type, pid
getxattrinode, attr name, attr lengthlength
invalidate cacheinode, offset, length
invalidate entryparent inode, name
linkinode, new parent inode, new nameentry cache timeout, inode, attribute cache timeout, attributes
listxattrinode, lengthlength
lookupparent inode, nameentry cache timeout, inode, attribute cache timeout, attributes
mkdirparent inode, name, modeentry cache timeout, inode, attribute cache timeout, attributes
mknodparent inode, name, mode, rdev(major+minor)entry cache timeout, inode, attribute cache timeout, attributes
openinode, flagsdirect i/o mode, keep cache, append mode [handle]
opendirinode[handle]
readinode, length, offsetread length
readdirinode, length, offsetlength or "no data"
readdirplusinode, length, offset [handle]length or "no data"
readlinkinodepath
releaseinode [handle, uselocks, lockowner]
releasedirinode [handle]
removexattrinode, attr name
renameparent inode, object name, destination inode, new name, flags
rmdirparent inode, name
setattrinode, attrs to set, [handle]attribute cache timeout, attributes
setlkinode, owner, start, end, type(r/w/u) [handle]
setlkwinode, owner, start, end, type(r/w/u) [handle]
setxattrinode, attr name, attr length, flags
special casespecial internal MooseFS call
statfsinode or nonetotalspace,availspace,freespace,trashspace,sustainedspace,inodes
sustained opendirinode
symlinkpath, parent inode, namecache timeout, inode, attribute cache timeout, attributes
unlinkparent inode, name
writeinode, length, offset(correction for append)written length