Btrees

Btrees Introduction

Btrfs uses a single set of btree manipulation code for all metadata in the filesystem. For performance or organizational reasons, the trees are divided into a few different types, and each type of tree holds a few different types of keys. The super block holds pointers to the roots of the tree of tree roots and the chunk tree.

Tree of Tree roots

This tree is used for indexing and finding the root of most of the other trees in the filesystem. It attaches names to subvolumes and snapshots, and stores the location of the extent allocation tree root. It also stores pointers to all of the subvolumes or snapshots that are being deleted by the transaction code. This allows the deletion to pick up where it left off after a crash.

Chunk Tree

The chunk tree does all of the logical-to-physical block address mapping for the filesystem, and it stores information about all of the devices in the filesystem. In order to bootstrap lookup in the chunk tree, the super block also duplicates the chunk items needed to resolve blocks in the chunk tree. Over time, the chunk tree will be split into multiple roots to allow access to larger storage pools.

There are back references from the chunk items to the extent tree that allocated them. Only a single extent tree can allocate extents out of a given chunk.

Two types of key are stored in the chunk tree:

  • DEV_ITEM (where the offset field is the internal devid), which contains information on all of the underlying block devices in the filesystem.

  • CHUNK_ITEM (where the offset field is the start of the chunk as a virtual address), which maps a section of the virtual address space (a chunk) into physical storage.
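
The mapping role of CHUNK_ITEM can be illustrated with a small model. This is only a sketch under simplifying assumptions: each chunk is treated as a single contiguous stripe on one device, and the `Chunk` fields, `build_index`, and `logical_to_physical` are hypothetical names rather than anything defined by Btrfs. It shows how sorting by the start of the chunk (the key's offset field) lets a lookup find the chunk that contains a given virtual address.

```python
from bisect import bisect_right
from typing import NamedTuple, Optional, Tuple


class Chunk(NamedTuple):
    logical_start: int   # CHUNK_ITEM key offset: start of the chunk as a virtual address
    length: int          # size of the chunk in bytes
    devid: int           # device holding the single stripe (simplifying assumption)
    physical_start: int  # byte offset of that stripe on the device


def build_index(chunks):
    """Sort chunks by logical start, mimicking key order inside the tree."""
    ordered = sorted(chunks, key=lambda c: c.logical_start)
    starts = [c.logical_start for c in ordered]
    return starts, ordered


def logical_to_physical(index, logical: int) -> Optional[Tuple[int, int]]:
    """Return (devid, physical offset) for a logical address, or None if unmapped."""
    starts, ordered = index
    pos = bisect_right(starts, logical) - 1   # last chunk starting at or before `logical`
    if pos < 0:
        return None
    chunk = ordered[pos]
    delta = logical - chunk.logical_start
    if delta >= chunk.length:                 # address lies past the end of that chunk
        return None
    return chunk.devid, chunk.physical_start + delta


index = build_index([
    Chunk(logical_start=0x100000, length=0x800000, devid=1, physical_start=0x100000),
    Chunk(logical_start=0x900000, length=0x800000, devid=2, physical_start=0x500000),
])
```

Calling `logical_to_physical(index, 0x900010)` resolves to device 2 at offset `0x500010` in this model. Striped or mirrored chunks would need more than one stripe per chunk, which the sketch leaves out.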

Device Allocation Tree

The device allocation tree records which parts of each physical device have been allocated into chunks. This is a relatively small tree that is only updated as new chunks are allocated. It stores back references to the chunk tree that allocated each physical extent on the device.

Extent Allocation Tree

The extent allocation tree records byte ranges that are in use, maintains reference counts on each extent and records back references to the tree or file that is using each extent. Logical block groups are created inside the extent allocation tree, and these reference large logical extents from the chunk tree.
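
The bookkeeping described above (byte ranges in use, a reference count per extent, and back references to the users of each extent) can be modelled in a few lines. This is a sketch only: `ExtentRecord`, `ExtentMap`, and the owner labels are hypothetical, and an in-memory dictionary stands in for the tree.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ExtentRecord:
    start: int                                         # first byte of the extent
    length: int                                        # number of bytes in use
    refs: int = 0                                      # reference count
    backrefs: Counter = field(default_factory=Counter)  # owner label -> refs held


class ExtentMap:
    """Toy stand-in for the extent allocation tree (assumption: dict, not a btree)."""

    def __init__(self):
        self.extents = {}                              # start -> ExtentRecord

    def add_ref(self, start: int, length: int, owner: str) -> None:
        rec = self.extents.setdefault(start, ExtentRecord(start, length))
        rec.refs += 1
        rec.backrefs[owner] += 1

    def drop_ref(self, start: int, owner: str) -> bool:
        """Drop one reference; return True if the extent became free."""
        rec = self.extents[start]
        if rec.backrefs[owner] <= 0:
            raise ValueError(f"{owner} holds no reference to extent {start}")
        rec.refs -= 1
        rec.backrefs[owner] -= 1
        if rec.backrefs[owner] == 0:
            del rec.backrefs[owner]
        if rec.refs == 0:
            del self.extents[start]                    # byte range is no longer in use
            return True
        return False
```

In this model an extent referenced by two owners survives the first `drop_ref` and is released by the second, which is the behaviour a reference count is meant to provide.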

Each block group can only store a specific type of extent, for example metadata, mirrored metadata, or striped data blocks.

Currently there is only one extent allocation tree shared by all the other trees. This will change in order to scale better under load.

Keys for the extent tree use the start of the extent as the objectid. A BLOCK_GROUP_ITEM key will be followed by the EXTENT_ITEM keys for extents within that block group.
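
The key ordering described above can be sketched with sorted tuples. Several things here are assumptions rather than statements from this page: keys are compared as (objectid, type, offset), the offset field is taken to hold a length, and the numeric type values are illustrative (only their relative order matters to the sketch). `Key`, `KeyType`, and `extents_in_block_group` are hypothetical names.

```python
from bisect import bisect_left
from enum import IntEnum
from typing import NamedTuple


class KeyType(IntEnum):
    # Assumed numeric values; the sketch depends only on their order.
    EXTENT_ITEM = 168
    BLOCK_GROUP_ITEM = 192


class Key(NamedTuple):
    objectid: int   # start of the extent or block group, in bytes
    type: KeyType
    offset: int     # assumed here to hold the length in bytes


MiB = 1024 * 1024

# Sorting tuples compares objectid first, then type, then offset.
keys = sorted([
    Key(33 * MiB, KeyType.EXTENT_ITEM, 4096),
    Key(32 * MiB, KeyType.BLOCK_GROUP_ITEM, 16 * MiB),
    Key(20 * MiB, KeyType.EXTENT_ITEM, 8192),
    Key(16 * MiB, KeyType.BLOCK_GROUP_ITEM, 16 * MiB),
    Key(17 * MiB, KeyType.EXTENT_ITEM, 4096),
])


def extents_in_block_group(sorted_keys, bg: Key):
    """Return the EXTENT_ITEM keys whose start lies inside block group `bg`."""
    lo = bisect_left(sorted_keys, (bg.objectid, 0, 0))
    hi = bisect_left(sorted_keys, (bg.objectid + bg.offset, 0, 0))
    return [k for k in sorted_keys[lo:hi] if k.type == KeyType.EXTENT_ITEM]
```

After sorting, the block group at 16 MiB is followed by the extents at 17 MiB and 20 MiB, then by the next block group at 32 MiB. Under these assumed type values, an extent that begins exactly at a block group's start would sort just ahead of the BLOCK_GROUP_ITEM key, so the example uses extents that begin later.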

FS Trees

These store files and directories, and all of the normal metadata you would expect to find in a filesystem. There is one root for each subvolume or snapshot, but snapshots will share blocks between roots.

Keys in FS trees always use the inode number of the filesystem object as the objectid.

Each object will have one or more of:

  • Inode.

  • Inode ref, indicating what name this object is known as, and in which directory.

  • For files, a set of extent information, indicating where on the filesystem this file’s data is stored.

  • For directories, two sequences of dir_items, one indexed by a hash of the object name, and one indexed by a unique sequential index number.
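
Because every key in an FS tree uses the inode number as its objectid, all items for one object sit next to each other in key order. The sketch below models that with a sorted list. The numeric type values, the use of the offset field for each item type, the `zlib.crc32` name hash, and the inode numbers are all assumptions made for illustration. `ItemType`, `add_file`, and `items_for` are hypothetical names.

```python
import zlib
from bisect import bisect_left
from enum import IntEnum


class ItemType(IntEnum):
    # Assumed numeric values; the sketch depends only on their order.
    INODE_ITEM = 1
    INODE_REF = 12
    DIR_ITEM = 84      # directory entry indexed by a hash of the name
    DIR_INDEX = 96     # directory entry indexed by a sequential number
    EXTENT_DATA = 108


def name_hash(name: str) -> int:
    # Stand-in hash; not claimed to match the hash used on disk.
    return zlib.crc32(name.encode())


def add_file(tree, dir_ino: int, ino: int, name: str, index: int) -> None:
    """Insert the items for one file plus its two entries in the parent directory."""
    tree.append(((ino, ItemType.INODE_ITEM, 0), "inode"))
    tree.append(((ino, ItemType.INODE_REF, dir_ino), name))
    tree.append(((ino, ItemType.EXTENT_DATA, 0), "extent info"))
    tree.append(((dir_ino, ItemType.DIR_ITEM, name_hash(name)), ino))
    tree.append(((dir_ino, ItemType.DIR_INDEX, index), ino))
    tree.sort(key=lambda item: item[0])   # keep the list in key order


def items_for(tree, ino: int):
    """Return every item whose objectid is `ino`, using a range scan on sorted keys."""
    keys = [k for k, _ in tree]
    lo = bisect_left(keys, (ino, 0, 0))
    hi = bisect_left(keys, (ino + 1, 0, 0))
    return tree[lo:hi]


tree = [((256, ItemType.INODE_ITEM, 0), "inode")]   # directory with example inode 256
add_file(tree, dir_ino=256, ino=257, name="a.txt", index=2)
add_file(tree, dir_ino=256, ino=258, name="b.txt", index=3)
```

Here `items_for(tree, 257)` yields the inode, its inode ref, and its extent information, while `items_for(tree, 256)` yields the directory inode followed by the hash-indexed entries and then the sequentially indexed ones.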

Checksum Tree

The checksum tree stores block checksums. Every 4k block of data stored on disk has a checksum associated with it. The “offset” part of the keys in the checksum tree indicates the start of the checksummed data on disk. The value stored with the key is a sequence of (currently 4-byte) checksums, for the 4k blocks starting at the offset.
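
The arithmetic implied by this layout is short enough to show directly. In the sketch below, `zlib.crc32` is only a stand-in for a 4-byte checksum (the actual algorithm is not stated here), and `build_csum_item` and `lookup_csum` are hypothetical helpers. Given a key offset and the packed value, the checksum for any covered disk address is found by dividing the distance from the offset by the block size.

```python
import zlib

BLOCK_SIZE = 4096   # the "4k block" from the text
CSUM_SIZE = 4       # the "currently 4-byte" checksum from the text


def checksum_block(block: bytes) -> bytes:
    # Stand-in 4-byte checksum; the real algorithm is an open assumption here.
    return zlib.crc32(block).to_bytes(CSUM_SIZE, "little")


def build_csum_item(data: bytes) -> bytes:
    """Pack one checksum per 4k block into a single value, in block order."""
    if len(data) % BLOCK_SIZE != 0:
        raise ValueError("data must be a whole number of blocks")
    return b"".join(
        checksum_block(data[i:i + BLOCK_SIZE])
        for i in range(0, len(data), BLOCK_SIZE)
    )


def lookup_csum(key_offset: int, item: bytes, disk_addr: int) -> bytes:
    """Return the checksum covering `disk_addr` from an item that starts at `key_offset`."""
    block_index = (disk_addr - key_offset) // BLOCK_SIZE
    start = block_index * CSUM_SIZE
    if disk_addr < key_offset or start + CSUM_SIZE > len(item):
        raise KeyError("address not covered by this checksum item")
    return item[start:start + CSUM_SIZE]
```

For three blocks stored starting at offset 1 MiB, the value is 12 bytes long, and an address 100 bytes into the second block maps to bytes 4 through 7 of that value.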

Data Relocation Tree

Log Root Tree