
We’ve got a bug in our remove code - I didn’t introduce it in my recent set of changes,
we can just say it has always been there. The bug is that removing a file does not seem to
free its space on the DSes, so we eventually fill up the dataset there.
I’m going to want to fix it, mainly because I want to get my head around how we are
going to examine files on the DSes.
By this, I mean that the inode information lives on the MDS. You can browse the directory
structure on it and see what files are there. You can’t do that on the DS. The DS may have the
concept of an inode, but it doesn’t care at all about directory structure, file names, etc.
The first file may have a layout of 4 and the second a layout of 8. And it may be the case
that those files only intersect on this dataset.
To find dangling files, conceptually we need to run scanners on both the MDS and the
DSes. For each file on the MDS, we can look at the odl and check to see if there
is a corresponding file on the DS. On the DS, we have to make sure that the file exists
on the MDS and that the dataset is in the corresponding layout.
Scratch that, we just need a scanner on the DS - I’m starting to describe a parallelized
fsck. So, on the DS, we scan the datasets and for each file we find, we check with the MDS:
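
As a rough sketch of that DS-side check, here is how it might look in Python. Everything named here (MdsFile, find_dangling, the file ids, the dataset names) is made up for illustration, and the MDS lookup is faked with a dict. A real scanner would have to ask the MDS over the wire.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MdsFile:
    """Hypothetical MDS-side record: a file id plus the datasets in its layout."""
    file_id: str
    layout_datasets: frozenset


def find_dangling(ds_dataset, ds_file_ids, mds_index):
    """Return (file_id, reason) pairs for files on one DS dataset that the
    MDS cannot account for.

    A DS file is treated as dangling if the MDS has no such file, or if the
    MDS file's layout does not include this dataset.
    """
    dangling = []
    for file_id in ds_file_ids:
        mds_file = mds_index.get(file_id)  # stand-in for a query to the MDS
        if mds_file is None:
            dangling.append((file_id, "no such file on MDS"))
        elif ds_dataset not in mds_file.layout_datasets:
            dangling.append((file_id, "dataset not in layout"))
    return dangling


# Toy data: f1 is striped over two datasets, f2 lives only on ds2/a,
# and f3 was removed on the MDS but is still sitting on the DS.
mds_index = {
    "f1": MdsFile("f1", frozenset({"ds1/a", "ds2/a"})),
    "f2": MdsFile("f2", frozenset({"ds2/a"})),
}

for file_id, reason in find_dangling("ds1/a", ["f1", "f2", "f3"], mds_index):
    print(file_id, reason)
```

Each DS can run this independently over its own datasets, which is where the parallelized fsck flavor comes from.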
A colleague just pointed out bug 6792701, "Removing large holey file does not free space", which I like as a culprit for a couple of reasons:
But I’m not sure that our files are holey. Our server sets NFL4_UFLG_DENSE, which should
tell the clients not to create holes.
In any event, the bug will give me some clues as to how to triage the issue.
Source: http://blogs.sun.com/tdh/entry/bug_in_remove_code