Abseil 源码阅读笔记·容器 [0x02]

FixedArray

先从简单的开始，看一下 FixedArray，学一点工程技巧。

注释里的介绍：

// A `FixedArray<T>` represents a non-resizable array of `T` where the length of
// the array can be determined at run-time. It is a good replacement for
// non-standard and deprecated uses of `alloca()` and variable length arrays
// within the GCC extension. (See
// https://gcc.gnu.org/onlinedocs/gcc/Variable-Length.html).
//
// `FixedArray` allocates small arrays inline, keeping performance fast by
// avoiding heap operations. It also helps reduce the chances of
// accidentally overflowing your stack if large input is passed to
// your function.

一些基本的定义

来看一下代码，起手一个命名空间 absl，紧跟一个宏 ABSL_NAMESPACE_BEGIN，展开后是 inline namespace ABSL_OPTION_INLINE_NAMESPACE_NAME {，一个内联的命名空间，这是用来控制版本的，在外部使用时不需要指定内联命名空间的名字，但是在编译后是有相关的符号的。

1 2	`template <typename T, size_t N = kFixedArrayUseDefault, typename A = std::allocator<T>>`

FixedArray 是一个模板类，有三个模板参数，T 代表类型，N 代表要内联的元素个数，A 是个 allocator。当 N 不指定时，会使用 kFixedArrayUseDefault 作为默认参数，而在上面已经定义了：

1	`constexpr static auto kFixedArrayUseDefault = static_cast<size_t>(-1);`

这实际上是 size_t 的最大值，现实场景不会有人把值写成这个，由此可以判断使用者有没有指定一个 N，如果没有指定那就使用默认值（默认最多内联 256 bytes 的内容），如果指定了再用指定的元素个数 N。注释里也提到了，大多数情况不需要指定 N。

1
2
3

static constexpr size_type inline_elements =
    (N == kFixedArrayUseDefault ? kInlineBytesDefault / sizeof(value_type)
                                : static_cast<size_type>(N));

类的开头是一个断言，确保 T 不是 C 风格数组，或者大小已知。这是为了防止类似 int [] 的东西被传进来，它的大小不确定，不能用于创建固定大小的 FixedArray。

static constexpr size_t kInlineBytesDefault = 256;

using AllocatorTraits = std::allocator_traits<A>;
// std::iterator_traits isn't guaranteed to be SFINAE-friendly until C++17,
// but this seems to be mostly pedantic.
template <typename Iterator>
using EnableIfForwardIterator = absl::enable_if_t<std::is_convertible<
    typename std::iterator_traits<Iterator>::iterator_category,
    std::forward_iterator_tag>::value>;
static constexpr bool NoexceptCopyable() {
    return std::is_nothrow_copy_constructible<StorageElement>::value &&
        absl::allocator_is_nothrow<allocator_type>::value;
}
static constexpr bool NoexceptMovable() {
    return std::is_nothrow_move_constructible<StorageElement>::value &&
        absl::allocator_is_nothrow<allocator_type>::value;
}
static constexpr bool DefaultConstructorIsNonTrivial() {
    return !absl::is_trivially_default_constructible<StorageElement>::value;
}

接下来定义了几个东西，一个是 kInlineBytesDefault 内联的默认最大大小，还有一个 EnableIfForwardIterator，似乎是出于工程上的妥协，就不去管它了，还有三个判断函数，功能也很显然。

后面又有一些 using，具体见代码，就不再赘述了。

这类有一大堆构造函数之类的，也不详细讲了。具体实现方面，也没什么好说的，都是最基本的操作，实现也很显然。这里用 using 的方式来定义类型，增强了可读性。

存储实现

template <typename OuterT, typename InnerT = absl::remove_extent_t<OuterT>,
        size_t InnerN = std::extent<OuterT>::value>
struct StorageElementWrapper {
    InnerT array[InnerN];
};

using StorageElement =
    absl::conditional_t<std::is_array<value_type>::value,
                        StorageElementWrapper<value_type>, value_type>;

static pointer AsValueType(pointer ptr) { return ptr; }
static pointer AsValueType(StorageElementWrapper<value_type>* ptr) {
    return std::addressof(ptr->array);
}

这几个东西比较有意思，当我们使用 data() 获取数据指针的时候，如果是 FixedArray<char> 之类的东西，可能会被编译器当成一个变量的地址，从而产生溢出警告。这里是工程性的技巧，用于消除歧义。

data() 会调用 AsValueType，这会根据数据的类型来决定调用的重载函数。而 StorageElement 的数据类型会在编译时，通过 absl::conditional_t 确定。

1 2	`template <bool B, typename T, typename F> using conditional_t = typename std::conditional<B, T, F>::type;`

可以看到 absl::conditional_t 是一个简写，在内部是用了 std::conditional。MSVC 的 std::conditional 是这样实现的：

_EXPORT_STD template <bool _Test, class _Ty1, class _Ty2>
struct conditional { // Choose _Ty1 if _Test is true, and _Ty2 otherwise
    using type = _Ty1;
};

template <class _Ty1, class _Ty2>
struct conditional<false, _Ty1, _Ty2> {
    using type = _Ty2;
};

如果模板的第一个参数为 true，那就会匹配到上面一个模板，让 type 为 _Ty1，而如果第一个参数为 false，就会由于模板特化，匹配到下面一个，使 type 变成 _Ty2。

template <typename OuterT, typename InnerT = absl::remove_extent_t<OuterT>,
        size_t InnerN = std::extent<OuterT>::value>
struct StorageElementWrapper {
    InnerT array[InnerN];
};

当类型是 C 风格数组的时候，首先用 absl::remove_extent_t 剥掉一层类型，得到数组中元素的类型，用 std::extent<OuterT>::value 得到元素个数。StorageElementWrapper 将 C 风格数组包装成一个 struct。

std::extent 内部也用了递归的模板匹配，详见代码。

对于数组，或许会有对齐等方面的隐患，这里用 static_assert 来确保不出问题：

1 2	`static_assert(sizeof(StorageElement) == sizeof(value_type), ""); static_assert(alignof(StorageElement) == alignof(value_type), "");`

最后是 Storage 部分，根据数据大小，会决定是否内联，由此决定内存的分配与释放方式。

如果是内联的，那就会调用 InlinedStorage::AnnotateConstruct 进行构造，其实这玩意儿啥也没做，因为内存在编译时就决定了，用 static_cast<void>(n); 来 “使用” n 这个变量，消除编译器警告。这里用到了 EBCO 优化，对空的内联存储，继承的是 NonEmptyInlinedStorage 类，这样可以节省一个字节。

如果非内联，那就动态分配内存。不过这样就会浪费为内联分配的内存，这可能是处于实际使用时的考虑，如果要存储的东西比较大，浪费一点点区别也不大。对于这样一个容器，常见的情况就是存储大小较小的情况，此时对于每一个字节都要节省；而对于大的，这只是确保了正确性，内存方面反而有不少损失。这体现的是工程实践时的权衡。

根据不同情况，在构造的时候得到了 StorageElement* data_ 指针。

有一个细节值得注意：

size_type size() const { return size_alloc_.template get<0>(); }
StorageElement* begin() const { return data_; }
StorageElement* end() const { return begin() + size(); }
allocator_type& alloc() { return size_alloc_.template get<1>(); }
const allocator_type& alloc() const {
    return size_alloc_.template get<1>();
}

size_alloc_ 是一个有 EBCO 优化的 Tuple（具体实现细节我看不太懂），上面这段代码里有个奇怪的语法：

1	`size_alloc_.template get<0>()`

其实目的是这样的：

1	`size_alloc_.get<0>()`

但在编译器解析的时候，这个东西会被解析成 (size_alloc_.get < 0) > ()，然后就会编译错误。

Address Sanitizer 支持

在存储的部分有对 Address Sanitizer 的支持，具体是这样实现的：

1
2
3

#ifdef ABSL_HAVE_ADDRESS_SANITIZER
    ABSL_ATTRIBUTE_NOINLINE
# endif  // ABSL_HAVE_ADDRESS_SANITIZER

标记禁用内联，以输出完整调用栈。

1
2
3

ABSL_ADDRESS_SANITIZER_REDZONE(redzone_begin_);
alignas(StorageElement) char buff_[sizeof(StorageElement[inline_elements])];
ABSL_ADDRESS_SANITIZER_REDZONE(redzone_end_);

在数据两边放一字节的 Redzone，用于检测越界。

容器较小，内联存储时，在 “构造” 的时候会这样做：

if (!n) return;
ABSL_ANNOTATE_CONTIGUOUS_CONTAINER(data(), RedzoneEnd(), RedzoneEnd(),
                                    data() + n);
ABSL_ANNOTATE_CONTIGUOUS_CONTAINER(RedzoneBegin(), data(), data(),
                                    RedzoneBegin());

ABSL_ANNOTATE_CONTIGUOUS_CONTAINER 接受四个参数 beg, end, old_mid, new_mid，意思是：容器的内存范围是 [begin, end)，之前 [begin, old_mid) 是有效的，现在 [begin, new_mid) 是有效的。

第一步时，原本 [data(), RedzoneEnd()) 都有效，现在 [data(), data() + n) 有效，也就是去掉了末尾的 Redzone。第二步则是去掉了开头的 Redzone。

析构时的做法也同理。

容器较大时，越界检测就交给了内存分配器来解决。

总结

概括一下，FixedArray 的功能就是提供一个运行时确定大小且大小不可更改的容器，容量较小时直接在栈上存储，容量大时动态分配。

（似乎在 C++20 中可以通过 [[no_unique_address]] 简单地完成 EBCO 优化，无需使用继承的 trick 方法，相比继承要好不少，不过属性可能会被编译器忽略，要取决于具体实现）

InlinedVector

笔记 > 源码阅读 > Abseil

#笔记 #C++ #源码阅读

Abseil 源码阅读笔记·容器 [0x02]

http://xiao-h.com/2025/01/22/Abseil-0x02-容器/

作者

小H

发布于

2025年1月22日

许可协议

[论文复现] Poisson Image Editing 上一篇

Abseil 源码阅读笔记·总览 [0x01] 下一篇