Inodes...Inodes...the word has lost all meaning!

2engels

Ars Scholae Palatinae
1,242
Seeing a very weird issue.

RHEL 7.9 server, relatively standard installation. It's a virtual system; I'll update with the virtualization technology once the customer responds.

We are seeing a weird error in a software package that we can tie to inode numbers larger than a 32-bit (inode32) inode number can hold. That is not the main issue, though.

What we are trying to figure out is WHY the inode numbers would be so huge: over 6.7 billion at last check. The funny thing is that the system is frequently checked for its total number of files (find / | wc -l) and we see fewer than 10 million files total. The lowest inode number I can find on the system so far (/etc/kderc) is just under 1.7 billion, and the oldest file I can find has one over 7 billion, so I am totally lost.
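
For reference, this is roughly the kind of thing I'm running to look at the numbers (not my exact commands, and the paths are just examples):

    # total file count, staying on the root filesystem
    find / -xdev | wc -l

    # lowest and highest inode numbers on that filesystem (slow on a big tree)
    find / -xdev -printf '%i\n' | sort -n | head -1
    find / -xdev -printf '%i\n' | sort -n | tail -1

    # spot-check a single file
    stat -c '%i %n' /etc/kderc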

I had assumed inode numbers would be assigned in a relatively logical order: older files get lower numbers, newer files get higher ones, with some exceptions due to reuse.

Any idea why the inode numbers would be so high?
 

Burn24

Smack-Fu Master, in training
53
The one time I ran into anything close to this was with multi-TB XFS filesystems (6 TB or larger?) on CentOS 7.x systems. The mount options didn't specify inode64, and I think it was mounting with inode32 by default. We were getting 'out of inode' errors from regular file operations, which were fixed by mounting with inode64.

My guess is that inodes aren't allocated in the simple linear pattern you're expecting. From my local man 5 xfs man page:
inode32|inode64
When inode32 is specified, it indicates that XFS limits inode creation to locations which will not result in inode numbers with more than 32 bits of significance.

When inode64 is specified, it indicates that XFS is allowed to create inodes at any location in the filesystem, including those which will result in inode numbers occupying more than 32 bits of significance.

inode32 is provided for backwards compatibility with older systems and applications, since 64 bits inode numbers might cause problems for some applications that cannot handle large inode numbers. If applications are in use which do not handle inode numbers bigger than 32 bits, the inode32 option should be specified.

For kernel v3.7 and later, inode64 is the default.

My take is that the OS will sometimes use 64-bit inode numbers on mounted filesystems whether you think it should or not, and that the inode32 mount option continues to exist for software like the package you mention in your post, software that can't handle 64-bit inode numbers.

Or, alternatively, per the man page: a filesystem wizard did it, and trying to find out exactly why is probably a boondoggle. Maybe you just need the inode32 mount option instead?
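
If you want to see what you're actually mounted with, something along these lines should do it (the /data mount point is just an example, and whether inode32 can be flipped on a live remount may depend on your kernel):

    # show the options each XFS filesystem is really mounted with
    findmnt -t xfs -o TARGET,FSTYPE,OPTIONS

    # filesystem geometry -- the big ones are the ones that spill past 32 bits
    xfs_info /data

    # ask for 32-bit inode allocation going forward; inodes that already have
    # high numbers keep them until the files are recreated
    mount -o remount,inode32 /data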
 

Jeff S

Ars Tribunus Angusticlavius
8,765
Subscriptor++
Another thought about why the inode numbers might be so high: does this server have a very high amount of I/O? Maybe lots and lots of temporary files are being created and deleted, or maybe some software on the system saves changes by writing them to a new copy of a file, deleting the original once the copy is written, and then renaming the copy (or just tracking the file under the new name)?
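
To make that concrete, here's the kind of save-to-a-new-copy churn I mean (paths and names are made up; each pass leaves the file on a brand-new inode):

    mkdir -p /data/inode-test && cd /data/inode-test   # example directory on the affected filesystem
    echo v0 > report.txt
    stat -c 'start: inode %i' report.txt
    for i in 1 2 3; do
        echo "v$i" > report.txt.tmp          # write the changes to a new copy
        mv report.txt.tmp report.txt         # replace the original with the copy
        stat -c "after save $i: inode %i" report.txt
    done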

If there is a lot of that kind of file "churn", like millions of files per day, you could get inode numbers in the billions after the system has been running awhile. I mean, who says a running system will re-use lower-numbered inodes right away? Maybe it just keeps incrementing a counter in the kernel that tracks the next available (or last used) inode number until it hits the max and wraps around. Why do the work of hunting for unused lower-value inode numbers when you don't have to, and when higher numbers are very unlikely to be in use?

To re-use a lower inode number, you first have to check the filesystem to see whether that inode is currently in use, which is a somewhat expensive operation, and if it IS in use you have to increment and check again. Or you consult a free list, i.e. a list the system keeps of previously used but now free inodes (the "list" might actually be a tree or some other structure; I'm using the term loosely). A lookup in a free list is surely cheaper than checking the filesystem directly, but it still has a cost. Whereas if you just use a value larger than the largest the system has ever handed out, it's guaranteed to be available.
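
And if you're curious whether a given filesystem re-uses freed numbers right away or keeps handing out new ones, you can just watch it (again, the path is only an example):

    mkdir -p /data/inode-test && cd /data/inode-test   # example path
    for i in 1 2 3 4 5; do
        touch probe
        stat -c "pass $i: inode %i" probe
        rm probe
    done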