XFS allocates space for a file using extents: starting location and length. XFS extents also specify the file's logical starting offset for a file. This allows a files extent map to automatically support sparse files (i.e. "holes" in the file). A flag is also used to specify if the extent has been preallocated and not yet been written to (unwritten extent).
A file can have more than one extent if one chunk of contiguous disk space is not available for the file. As a file grows, the XFS space allocator will attempt to keep space contiguous and merge extents. If more than one file is being allocated space in the same AG at the same time, multiple extents for the files will occur as the extents get interleaved. The effect of this can vary depending on the extent allocator used in the XFS driver.
The following two subsections cover the two methods of storing extent information for a file. The first is the fastest and simplest where the inode completely contains an extent array to the file's data. The second is slower and more complex B+tree which can handle thousands to millions of extents efficiently.
Local extents are where the entire extent array is stored within the inode's data fork itself. This is the most optimal in terms of speed and resource consumption. The trade-off is the file can only have a few extents before the inode runs out of space.
The "data fork" of the inode contains an array of extents, the size of the array determined by the inode's di_nextents
value.
The number of extents that can fit in the inode depends on the inode size and di_forkoff
. For a default 256 byte inode with no extended attributes, a file can up to 19 extents with this format. Beyond this, extents have to use the B+tree format.
xfs_db Example:
An 8MB file with one extent:
xfs_db> inode <inode#>
xfs_db> p
core.magic = 0x494e
core.mode = 0100644
core.version = 1
core.format = 2 (extents)
...
core.size = 8294400
core.nblocks = 2025
core.extsize = 0
core.nextents = 1
core.naextents = 0
core.forkoff = 0
...
u.bmx[0] = [startoff,startblock,blockcount,extentflag]
0:[0,25356,2025,0]
A 24MB file with three extents:
xfs_db> inode <inode#>
xfs_db> p
...
core.format = 2 (extents)
...
core.size = 24883200
core.nblocks = 6075
core.nextents = 3
...
u.bmx[0-2] = [startoff,startblock,blockcount,extentflag]
0:[0,27381,2025,0]
1:[2025,31431,2025,0]
2:[4050,35481,2025,0]
Raw disk version of the inode with the third extent highlighted (di_u
always starts at offset 0x64):
We can expand the highlighted section into the following bit array from MSB to LSB with the file offset and the block count highlighted:
A 4MB file with two extents and a hole in the middle, the first extent containing 64KB of data, the second about 4MB in containing 32KB (write
64KB, lseek
~4MB, write
32KB operations):
xfs_db> inode <inode#>
xfs_db> p
...
core.format = 2 (extents)
...
core.size = 4063232
core.nblocks = 24
core.nextents = 2
...
u.bmx[0-1] = [startoff,startblock,blockcount,extentflag]
0:[0,37506,16,0]
1:[984,37522,8,0]