Why does a constexpr context make the compiler fail, while without it the code is optimized perfectly?
Question
I played around with constexpr and noticed some interesting behavior:
- In some situations, adding constexpr in front of a function makes GCC try harder to optimize, to the point of optimizing the function away entirely and just providing the calculated value.
- However, calling such a fully optimized function from a constexpr context results in errors, because it internally uses (compiler built-in) functions/intrinsics that are not marked constexpr (in particular memcpy).
- (Clang fails directly when constexpr is applied to such a function, even without a constexpr context.)
Why is that?
- Shouldn't the compiler (GCC) still be able to optimize, even in a constexpr context?
- C++ proposal P0202 (which made it into C++20) wanted to make functions like memcpy constexpr (see section III.B in the original revision), but that was rejected and changed because compiler built-in versions of such functions would achieve the same (see section III.A in the latest revision, wg21.link/p0202).
- So, are GCC and Clang wrong in not allowing memcpy in constexpr functions/contexts? (Note: memcpy and __builtin_memcpy are equivalent.)
As examples are easier to understand, here is one.
(You can also view it more comfortably, together with its results, in Compiler Explorer.)
Note: I was unable to come up with a simple example in which merely adding constexpr to the function helps the GCC optimizer to optimize fully where it otherwise would not. But believe me, I have such examples; they are more complicated (and sadly closed-source).
#include <array>
#include <cstdint>
#include <cstring>

constexpr std::uint32_t extract(const std::uint8_t* data) noexcept
{
    std::uint32_t num;
    memcpy(&num, data, sizeof(std::uint32_t));
    return num;
}

int main()
{
    constexpr std::array<std::uint8_t, 4> a1 {{ 0xff, 0xff, 0xff, 0xff }};
    /*constexpr*/ auto val = extract(a1.data()); // <--- Using constexpr here makes compiler fail.
    return val;
}
GCC is able to optimize this to:
main:                                   # @main
        mov     eax, -1
        ret
Clang can optimize it too, if the constexpr in front of the function definition is removed.
However, if the constexpr in front of the function call is commented in (thereby calling the function from a constexpr context), the compiler fails with something like this:
GCC:
<source>: In function 'int main()':
<source>:15:33: in 'constexpr' expansion of 'extract(a1.std::array<unsigned char, 4>::data())'
<source>:8:11: error: 'memcpy(((void*)(& num)), ((const void*)(& a1.std::array<unsigned char, 4>::_M_elems)), 4)' is not a constant expression
8 | memcpy(&num, data, sizeof(std::uint32_t));
| ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Compiler returned: 1
Clang:
<source>:5:25: error: constexpr function never produces a constant expression [-Winvalid-constexpr]
constexpr std::uint32_t extract(const std::uint8_t* data) noexcept
^
<source>:8:5: note: non-constexpr function 'memcpy' cannot be used in a constant expression
memcpy(&num, data, sizeof(std::uint32_t));
^
<source>:15:20: error: constexpr variable 'val' must be initialized by a constant expression
constexpr auto val = extract(a1.data()); // <--- Error!
^ ~~~~~~~~~~~~~~~~~~
<source>:8:5: note: non-constexpr function 'memcpy' cannot be used in a constant expression
memcpy(&num, data, sizeof(std::uint32_t));
^
<source>:15:26: note: in call to 'extract(&a1._M_elems[0])'
constexpr auto val = extract(a1.data()); // <--- Error!
^
2 errors generated.
Compiler returned: 1
Recommended answer
According to [dcl.constexpr]:
For a constexpr function or constexpr constructor that is neither defaulted nor a template, if no argument values exist such that an invocation of the function or constructor could be an evaluated subexpression of a core constant expression, or, for a constructor, an evaluated subexpression of the initialization full-expression of some constant-initialized object ([basic.start.static]), the program is ill-formed, no diagnostic required.
As memcpy is not constexpr, your program is ill-formed, no diagnostic required (NDR).
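The wording of the rule can be illustrated with a minimal sketch. The names runtime_only, sometimes_constant and never_constant are hypothetical and do not come from the question; runtime_only merely stands in for a non-constexpr function such as memcpy. The point is that a constexpr function may contain a call to a non-constexpr function as long as some argument values avoid evaluating it:

```cpp
#include <cassert>

// Hypothetical non-constexpr helper, standing in for memcpy.
inline int runtime_only(int v) { return v; }

// Fine: for x >= 0 the non-constexpr call is never evaluated, so argument
// values exist for which an invocation is a constant expression.
constexpr int sometimes_constant(int x)
{
    return x >= 0 ? x : runtime_only(x);
}

// Ill-formed, no diagnostic required: every invocation evaluates
// runtime_only(). A compiler may reject this at the definition, or only
// once it is used in a constant expression, so it is left commented out.
// constexpr int never_constant(int x) { return runtime_only(x); }

// The "good" branch is usable at compile time.
static_assert(sometimes_constant(3) == 3, "constant-evaluable for x >= 0");

// At run time both branches are usable.
inline void check_at_runtime()
{
    assert(sometimes_constant(3) == 3);
    assert(sometimes_constant(-1) == -1);
}
```

The extract function from the question corresponds to the never_constant case: memcpy is called unconditionally, so no argument can make the call a constant expression.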
Using the function in a constexpr context forces constant evaluation, which allows the compiler to issue a diagnostic.
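If the value really has to be computed in a constexpr context, one possible workaround is to avoid memcpy altogether. The following is only a sketch under stated assumptions: extract_le is a hypothetical replacement that is not part of the question, and it always reads the bytes in little-endian order, so it matches the memcpy version only on little-endian targets such as x86-64.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical replacement for extract(): assembles the value from the
// four bytes in little-endian order using shifts only, so no memcpy
// (and no other non-constexpr function) is involved.
constexpr std::uint32_t extract_le(const std::uint8_t* data) noexcept
{
    return  static_cast<std::uint32_t>(data[0])
         | (static_cast<std::uint32_t>(data[1]) << 8)
         | (static_cast<std::uint32_t>(data[2]) << 16)
         | (static_cast<std::uint32_t>(data[3]) << 24);
}

// Constant evaluation now succeeds.
constexpr std::uint8_t bytes[4] = { 0x78, 0x56, 0x34, 0x12 };
constexpr std::uint32_t value = extract_le(bytes);
static_assert(value == 0x12345678u, "evaluated at compile time");

// The same function also works on data only known at run time.
inline void check_extract_at_runtime()
{
    const std::uint8_t buf[4] = { 0xff, 0xff, 0xff, 0xff };
    assert(extract_le(buf) == 0xffffffffu);
}
```

Because the function body is a single return statement, this form is already accepted under the C++11 constexpr rules.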
"In some situations adding constexpr in front of a function enables GCC to try optimizing harder which results in fully optimizing the function away and just providing the calculated value."
It is a good hint (as inline was before).
constexpr functions can be "misused":
constexpr std::size_t factorial(std::size_t n) {/*..*