
8.2. Leaf Attributes

When an inode's attribute fork space is used up with shortform attributes and more are added, the attribute format is migrated to "extents".
Extent-based attributes use hash/index pairs to speed up attribute lookups. The first part of the "leaf" contains an array of fixed-size hash/index pairs, with the flags stored alongside them. The remaining part of the leaf block contains the array of name/value pairs, where each element varies in length.
Each leaf is based on the xfs_da_blkinfo_t block header declared in Leaf Directories. The structure encapsulating all other structures is the xfs_attr_leafblock_t.
The structures involved are:
typedef struct xfs_attr_leaf_map {
     __be16                     base;
     __be16                     size;
} xfs_attr_leaf_map_t;

typedef struct xfs_attr_leaf_hdr {
     xfs_da_blkinfo_t           info;
     __be16                     count;
     __be16                     usedbytes;
     __be16                     firstused;
     __u8                       holes;
     __u8                       pad1;
     xfs_attr_leaf_map_t        freemap[3];
} xfs_attr_leaf_hdr_t;

typedef struct xfs_attr_leaf_entry {
     __be32                     hashval;
     __be16                     nameidx;
     __u8                       flags;
     __u8                       pad2;
} xfs_attr_leaf_entry_t;

typedef struct xfs_attr_leaf_name_local {
     __be16                     valuelen;
     __u8                       namelen;
     __u8                       nameval[1];
} xfs_attr_leaf_name_local_t;

typedef struct xfs_attr_leaf_name_remote {
     __be32                     valueblk;
     __be32                     valuelen;
     __u8                       namelen;
     __u8                       name[1];
} xfs_attr_leaf_name_remote_t;

typedef struct xfs_attr_leafblock  {
     xfs_attr_leaf_hdr_t           hdr;
     xfs_attr_leaf_entry_t         entries[1];
     xfs_attr_leaf_name_local_t    namelist;
     xfs_attr_leaf_name_remote_t   valuelist;
} xfs_attr_leafblock_t;
Each leaf header uses the following magic number:
#define XFS_ATTR_LEAF_MAGIC        0xfbee
The hash/index elements in the entries[] array are packed from the top of the block. Name/value entries grow from the bottom but are not packed. The freemap contains run-length-encoded entries for the free bytes after the entries[] array, but only the three largest runs are stored (smaller runs are dropped). When the freemap does not show enough space for an allocation, the name/value area is compacted and the allocation is tried again. If there still isn't enough space, the block is split. The name/value structures (both local and remote versions) must be 32-bit aligned.
For attributes with small values (i.e. the value can be stored within the leaf), the XFS_ATTR_LOCAL flag is set for the attribute. The entry details are stored using the xfs_attr_leaf_name_local_t structure. For large attribute values that cannot be stored within the leaf, separate filesystem blocks are allocated to store the value. These entries use the xfs_attr_leaf_name_remote_t structure.
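As a rough illustration of the two layouts and the 32-bit alignment rule, the sketch below computes how many bytes a local or a remote entry would occupy in the name/value area. The helper names (attr_round_up, attr_local_size, attr_remote_size, attr_size_example) are hypothetical and are not taken from the XFS source; the field widths come from the structures listed above. The results agree with the usedbytes values reported in the xfs_db examples later in this section (20 for one remote entry, 52 for two local entries plus one remote entry).

```c
#include <assert.h>
#include <stdint.h>

/* Name/value structures must start on 32-bit boundaries (from the text). */
#define ATTR_NAME_ALIGN 4u

/* Round n up to the next multiple of ATTR_NAME_ALIGN. */
static uint32_t attr_round_up(uint32_t n)
{
    return (n + ATTR_NAME_ALIGN - 1) & ~(ATTR_NAME_ALIGN - 1);
}

/*
 * Bytes used in the leaf by a local entry:
 * valuelen (2) + namelen (1) + name bytes + value bytes, then padding.
 */
static uint32_t attr_local_size(uint8_t namelen, uint16_t valuelen)
{
    return attr_round_up(2u + 1u + namelen + valuelen);
}

/*
 * Bytes used in the leaf by a remote entry:
 * valueblk (4) + valuelen (4) + namelen (1) + name bytes, then padding.
 * The value itself lives in separate blocks and is not counted here.
 */
static uint32_t attr_remote_size(uint8_t namelen)
{
    return attr_round_up(4u + 4u + 1u + namelen);
}

/* Usage: sizes for the attributes that appear in the xfs_db examples. */
void attr_size_example(void)
{
    assert(attr_local_size(5, 6) == 16);  /* "attr1" = "value1" */
    assert(attr_remote_size(8) == 20);    /* "big_attr", value stored remotely */
}
```

Keeping the remote value out of the leaf means a large attribute costs only a small, fixed amount of leaf space regardless of how big its value is.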
Both local and remote entries can be interleaved as they are only addressed by the hash/index entries. The flag is stored with the hash/index pairs so the appropriate structure can be used.
Since duplicate hash keys are possible, for each hash that matches during a lookup, the actual name string must be compared.
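The lookup rule above can be sketched as follows. This is a simplified illustration, not the kernel's implementation: struct attr_view and attr_find are hypothetical, the entries are assumed to be already decoded into host byte order, the caller is assumed to supply the hash of the name it is looking for, and a plain linear scan is used for clarity. The "other" entry is an invented collision added only to show why the name comparison is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical host-order view of one decoded leaf entry plus its name. */
struct attr_view {
    uint32_t       hashval;  /* hash of the attribute name */
    const uint8_t *name;     /* name bytes (not NUL-terminated on disk) */
    uint8_t        namelen;
};

/*
 * Return the index of the entry whose name matches, or -1 if none does.
 * A hash match alone is not enough, because different names can share
 * a hash value, so the name bytes are compared as well.
 */
static int attr_find(const struct attr_view *ents, size_t count,
                     uint32_t hash, const void *name, uint8_t namelen)
{
    for (size_t i = 0; i < count; i++) {
        if (ents[i].hashval != hash)
            continue;                      /* cheap rejection on the hash */
        if (ents[i].namelen == namelen &&
            memcmp(ents[i].name, name, namelen) == 0)
            return (int)i;                 /* hash and name both match */
    }
    return -1;
}

/* Usage: hash values taken from the xfs_db examples in this section. */
void attr_find_example(void)
{
    static const struct attr_view ents[] = {
        { 0x1e9d3934, (const uint8_t *)"attr2",    5 },
        { 0x1e9d3934, (const uint8_t *)"other",    5 },  /* invented collision */
        { 0x1e9d3937, (const uint8_t *)"attr1",    5 },
        { 0xfcf89d4f, (const uint8_t *)"big_attr", 8 },
    };

    assert(attr_find(ents, 4, 0x1e9d3934, "other", 5) == 1);
    assert(attr_find(ents, 4, 0x1e9d3934, "nope!", 5) == -1);
    assert(attr_find(ents, 4, 0xfcf89d4f, "big_attr", 8) == 3);
}
```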
An “incomplete” bit is also used in the attribute flags. It marks an attribute that is in the middle of being created; if the system crashes while the bit is set, the attribute should not be shown to the user. The bit is cleared once the attribute has been fully set up. This is done because some large attributes cannot be created inside a single transaction.
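A small sketch of how the flags byte might be tested is shown below. The bit values are an assumption: this section does not give them, and the ones used here are believed to match the Linux XFS on-disk headers (local = bit 0, root = bit 1, secure = bit 2, incomplete = bit 7), so they should be checked against the headers before being relied on. The macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit values; not stated in this section. */
#define ATTR_F_LOCAL      0x01u  /* value stored inside the leaf */
#define ATTR_F_ROOT       0x02u  /* "root" namespace */
#define ATTR_F_SECURE     0x04u  /* "secure" namespace */
#define ATTR_F_INCOMPLETE 0x80u  /* creation still in progress */

/* True if the entry uses the local name/value layout. */
static bool attr_is_local(uint8_t flags)
{
    return (flags & ATTR_F_LOCAL) != 0;
}

/* True if the attribute may be shown to the user. */
static bool attr_is_visible(uint8_t flags)
{
    return (flags & ATTR_F_INCOMPLETE) == 0;
}

/* Usage: a finished local attribute versus a half-created remote one. */
void attr_flags_example(void)
{
    assert(attr_is_local(ATTR_F_LOCAL) && attr_is_visible(ATTR_F_LOCAL));
    assert(!attr_is_local(ATTR_F_INCOMPLETE));
    assert(!attr_is_visible(ATTR_F_INCOMPLETE));
}
```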

xfs_db Example:

A single 30KB extended attribute is added to an inode:
xfs_db> inode <inode#>
xfs_db> p
...
core.nblocks = 9
core.nextents = 0
core.naextents = 1
core.forkoff = 15
core.aformat = 2 (extents)
...
a.bmx[0] = [startoff,startblock,blockcount,extentflag]
          0:[0,37535,9,0]
xfs_db> ablock 0
xfs_db> p
hdr.info.forw = 0
hdr.info.back = 0
hdr.info.magic = 0xfbee
hdr.count = 1
hdr.usedbytes = 20
hdr.firstused = 4076
hdr.holes = 0
hdr.freemap[0-2] = [base,size] 0:[40,4036] 1:[0,0] 2:[0,0]
entries[0] = [hashval,nameidx,incomplete,root,secure,local]
          0:[0xfcf89d4f,4076,0,0,0,0]
nvlist[0].valueblk = 0x1
nvlist[0].valuelen = 30692
nvlist[0].namelen = 8
nvlist[0].name = "big_attr"
Attribute blocks 1 to 8 (filesystem blocks 37536 to 37543) contain the raw binary value data for the attribute.
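The numbers in this output can be cross-checked with a little arithmetic. The sketch below assumes a 4096-byte block (consistent with firstused + usedbytes = 4096 here), a 32-byte leaf header, and 8-byte hash/index entries, and it assumes the remote blocks hold only raw value data, as stated above. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define BLOCK_SIZE      4096u  /* assumed filesystem block size */
#define LEAF_HDR_SIZE     32u  /* assumed size of xfs_attr_leaf_hdr_t */
#define LEAF_ENTRY_SIZE    8u  /* assumed size of xfs_attr_leaf_entry_t */

/* Offset of the first byte after the packed entries[] array. */
static uint32_t leaf_entries_end(uint16_t count)
{
    return LEAF_HDR_SIZE + count * LEAF_ENTRY_SIZE;
}

/* Free bytes between the entries[] array and the first name/value entry. */
static uint32_t leaf_free_gap(uint16_t count, uint16_t firstused)
{
    uint32_t end = leaf_entries_end(count);

    assert(firstused >= end);  /* the two regions must not overlap */
    return firstused - end;
}

/* Number of whole blocks needed to hold a remote value. */
static uint32_t remote_value_blocks(uint32_t valuelen)
{
    return (valuelen + BLOCK_SIZE - 1) / BLOCK_SIZE;
}

/* Usage: values from the single-attribute xfs_db output above. */
void leaf_arith_example(void)
{
    assert(leaf_entries_end(1) == 40);          /* freemap[0].base */
    assert(leaf_free_gap(1, 4076) == 4036);     /* freemap[0].size */
    assert(remote_value_blocks(30692) == 8);    /* attribute blocks 1 to 8 */
}
```

With no holes in the block, the single freemap entry simply describes the gap between the two regions, which is why its base and size follow directly from count and firstused.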
Index 4076 (0xfec) is the offset into the block where the name/value information is. As can be seen by the value, it's at the end of the block:
xfs_db> type text
xfs_db> p
000: 00 00 00 00  00 00 00 00 fb ee 00 00 00 01 00 14 ................
010: 0f ec 00 00  00 28 0f c4 00 00 00 00 00 00 00 00 ................
020: fc f8 9d 4f  0f ec 00 00 00 00 00 00 00 00 00 00 ...O............
030: 00 00 00 00  00 00 00 00 00 00 00 00 00 00 00 00 ................
...
fe0: 00 00 00 00  00 00 00 00 00 00 00 00 00 00 00 01 ................
ff0: 00 00 77 e4  08 62 69 67 5f 61 74 74 72 00 00 00 ..w..big.attr...
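To tie the dump back to the structures, the following sketch rebuilds the block from the bytes shown above (the elided rows are assumed to be zero) and decodes the big-endian fields by hand. The byte offsets are read off the dump and the structure definitions rather than taken from any header, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read big-endian fields from a raw block buffer. */
static uint16_t get_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t get_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Byte offsets as they appear in the dump (assumed fixed for this layout). */
enum {
    OFF_MAGIC = 8, OFF_COUNT = 12, OFF_USEDBYTES = 14, OFF_FIRSTUSED = 16,
    OFF_ENTRIES = 32, BLOCK_SIZE = 4096
};

void leaf_decode_example(void)
{
    static uint8_t blk[BLOCK_SIZE];  /* zero-filled, like the elided rows */
    static const uint8_t head[] = {  /* rows 000, 010 and start of 020 */
        0x00,0x00,0x00,0x00, 0x00,0x00,0x00,0x00, 0xfb,0xee,0x00,0x00,
        0x00,0x01,0x00,0x14, 0x0f,0xec,0x00,0x00, 0x00,0x28,0x0f,0xc4,
        0x00,0x00,0x00,0x00, 0x00,0x00,0x00,0x00,
        0xfc,0xf8,0x9d,0x4f, 0x0f,0xec,0x00,0x00
    };
    static const uint8_t tail[] = {  /* bytes 0xfec through 0xfff */
        0x00,0x00,0x00,0x01, 0x00,0x00,0x77,0xe4, 0x08,
        'b','i','g','_','a','t','t','r', 0x00,0x00,0x00
    };

    memcpy(blk, head, sizeof(head));
    memcpy(blk + 0xfec, tail, sizeof(tail));

    /* Leaf header fields. */
    assert(get_be16(blk + OFF_MAGIC) == 0xfbee);
    assert(get_be16(blk + OFF_COUNT) == 1);
    assert(get_be16(blk + OFF_USEDBYTES) == 20);
    assert(get_be16(blk + OFF_FIRSTUSED) == 4076);

    /* entries[0]: hashval (4 bytes), nameidx (2), flags (1), pad (1). */
    const uint8_t *ent = blk + OFF_ENTRIES;
    uint16_t nameidx = get_be16(ent + 4);
    assert(get_be32(ent) == 0xfcf89d4f);
    assert(nameidx == 4076);
    assert(ent[6] == 0);  /* no flags set, so the remote layout applies */

    /* Remote entry: valueblk (4), valuelen (4), namelen (1), name. */
    const uint8_t *nv = blk + nameidx;
    uint8_t namelen = nv[8];
    assert(nameidx + 9u + namelen <= (unsigned)BLOCK_SIZE);
    assert(get_be32(nv) == 1);
    assert(get_be32(nv + 4) == 30692);
    assert(namelen == 8);
    assert(memcmp(nv + 9, "big_attr", 8) == 0);
}
```

Each assertion mirrors one field printed by xfs_db earlier in this example, so the sketch doubles as a check that the offsets were read correctly.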
A 30KB attribute and a couple of small attributes are added to a file:
xfs_db> inode <inode#>
xfs_db> p
...
core.nblocks = 10
core.extsize = 0
core.nextents = 1
core.naextents = 2
core.forkoff = 15
core.aformat = 2 (extents)
...
u.bmx[0] = [startoff,startblock,blockcount,extentflag]
          0:[0,81857,1,0]
a.bmx[0-1] = [startoff,startblock,blockcount,extentflag]
          0:[0,81858,1,0]
          1:[1,182398,8,0]
xfs_db> ablock 0
xfs_db> p
hdr.info.forw = 0
hdr.info.back = 0
hdr.info.magic = 0xfbee
hdr.count = 3
hdr.usedbytes = 52
hdr.firstused = 4044
hdr.holes = 0
hdr.freemap[0-2] = [base,size] 0:[56,3988] 1:[0,0] 2:[0,0]
entries[0-2] = [hashval,nameidx,incomplete,root,secure,local]
          0:[0x1e9d3934,4044,0,0,0,1]
          1:[0x1e9d3937,4060,0,0,0,1]
          2:[0xfcf89d4f,4076,0,0,0,0]
nvlist[0].valuelen = 6
nvlist[0].namelen = 5
nvlist[0].name = "attr2"
nvlist[0].value = "value2"
nvlist[1].valuelen = 6
nvlist[1].namelen = 5
nvlist[1].name = "attr1"
nvlist[1].value = "value1"
nvlist[2].valueblk = 0x1
nvlist[2].valuelen = 30692
nvlist[2].namelen = 8
nvlist[2].name = "big_attr"
As can be seen in the entries array, the two small attributes have the local flag set and the values are printed.
A raw disk dump shows the attributes. The last attribute added is highlighted (offset 4044 or 0xfcc):