This document contains the release notes for the LLVM Compiler Infrastructure,
release 15.0.7. Here we describe the status of LLVM, including major improvements
from the previous release, improvements in various subprojects of LLVM, and
some of the current users of the code. All LLVM releases may be downloaded
from the LLVM releases web site.
For more information about LLVM, including information about the latest
release, please check out the main LLVM web site. If you
have questions or comments, the LLVM Developer’s Mailing List is a good place to send
them.
Note that if you are reading this file from a Git checkout or the main
LLVM web page, this document applies to the next release, not the current
one. To see the release notes for a specific release, please see the releases
page.
- LLVM now uses opaque pointers. This means that
different pointer types like i8*, i32* or void()** are now
represented as a single ptr type. See the linked document for migration
instructions.
- Renamed llvm.experimental.vector.extract intrinsic to llvm.vector.extract.
- Renamed llvm.experimental.vector.insert intrinsic to llvm.vector.insert.
- The constant expression variants of the following instructions have been
removed:
- extractvalue
- insertvalue
- udiv
- sdiv
- urem
- srem
- fadd
- fsub
- fmul
- fdiv
- frem
- Added support for fmax and fmin in the atomicrmw instruction. The
comparison is expected to match the behavior of llvm.maxnum.* and
llvm.minnum.* respectively.
- callbr instructions no longer use blockaddress arguments for labels.
Instead, label constraints starting with ! refer directly to entries in
the callbr indirect destination list.
; Old representation
%res = callbr i32 asm "", "=r,r,i"(i32 %x, i8 *blockaddress(@foo, %indirect))
to label %fallthrough [label %indirect]
; New representation
%res = callbr i32 asm "", "=r,r,!i"(i32 %x)
to label %fallthrough [label %indirect]
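As a rough illustration of several of the IR changes listed above, the sketch below shows what updated IR might look like. The function and global names (@load_first, @low_part, @atomic_max, @scaled_addr, @a) are hypothetical and chosen only for this example; consult the LangRef for the authoritative syntax.

```llvm
; Opaque pointers: a single ptr type replaces i32*, i8*, and so on.
; Before: %v = load i32, i32* %p
define i32 @load_first(ptr %p) {
  %v = load i32, ptr %p
  ret i32 %v
}

; Renamed intrinsic: llvm.experimental.vector.extract is now
; llvm.vector.extract.
declare <4 x float> @llvm.vector.extract.v4f32.nxv4f32(<vscale x 4 x float>, i64)

define <4 x float> @low_part(<vscale x 4 x float> %vec) {
  %lo = call <4 x float> @llvm.vector.extract.v4f32.nxv4f32(<vscale x 4 x float> %vec, i64 0)
  ret <4 x float> %lo
}

; atomicrmw fmax: atomically stores the maximum of the old value and %x,
; and returns the old value.
define float @atomic_max(ptr %p, float %x) {
  %old = atomicrmw fmax ptr %p, float %x seq_cst
  ret float %old
}

; Removed constant expressions: udiv is now written as an instruction.
; Before: ret i64 udiv (i64 ptrtoint (ptr @a to i64), i64 4)
@a = global i32 0

define i64 @scaled_addr() {
  %addr = ptrtoint ptr @a to i64
  %q = udiv i64 %addr, 4
  ret i64 %q
}
```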
- Omitting CMAKE_BUILD_TYPE when using a single configuration generator is now
an error. You now have to pass -DCMAKE_BUILD_TYPE=<type> in order to configure
LLVM. This is done to help new users of LLVM select the correct type: since building
LLVM in Debug mode is very resource intensive, we want to make sure that new users
make the choice that lines up with their usage. We have also improved documentation
around this setting that should help new users. You can find this documentation
here.
- Loop interchange legality and cost model improvements
- 8 and 16-bit atomic loads and stores are now supported
- Added support for the Armv9-A, Armv9.1-A and Armv9.2-A architectures.
- Added support for the Armv8.1-M PACBTI-M extension.
- Removed the deprecation of ARMv8-A T32 Complex IT blocks. No deprecation
warnings will be generated and -mrestrict-it is now always off by default.
Previously it was on by default for Armv8 and off for all other architecture
versions.
- Added a pass to work around Cortex-A57 Erratum 1742098 and Cortex-A72
Erratum 1655431. This is enabled by default when targeting either CPU.
- Implemented generation of Windows SEH unwind information.
- Switched the MinGW target to use SEH instead of DWARF for unwind information.
- Added support for the Cortex-M85 CPU.
- Added support for a new -mframe-chain=(none|aapcs|aapcs+leaf) command-line
option, which controls the generation of AAPCS-compliant Frame Records.
- DirectX has been added as an experimental target. Specify
-DLLVM_EXPERIMENTAL_TARGETS_TO_BUILD=DirectX in your CMake configuration
to enable it. The target is not packaged in pre-built binaries.
- The DirectX backend supports the dxil architecture which is based on LLVM
3.6 IR encoded as bitcode and is the format used for DirectX GPU Shader
programs.
Common PowerPC improvements:
* Add a new post instruction selection pass to generate CTR loops.
* Add SSE4 and BMI compatible intrinsics implementation.
* Supported 16-byte lock-free atomics on PowerPC8 and up.
* Supported atomic load/store for pointer types.
* Supported stack size larger than 2G.
* Add __builtin_min/__builtin_max/__abs builtins.
* Code generation improvements for splat load/vector shuffle/mulli, etc.
* Emit VSX instructions for vector loads and stores regardless of alignment.
* -mcpu=future now has its own ISA (FutureISA).
* Added the ppc-set-dscr option to set the Data Stream Control Register (DSCR).
* Bug fixes.
AIX improvements:
* Supported 64-bit XCOFF for the integrated-as path.
* Supported X86-compatible vector intrinsics.
* The default alignment of program code csects is now 32 bytes.
* Supported auxiliary header in the integrated-as path.
* Improved alias symbol handling.
- A RISCVRedundantCopyElimination pass was added to remove unnecessary zero
copies.
- A RISC-V specific CodeGenPrepare pass was added.
- The machine outliner was enabled by default for RISC-V at -Oz.
Additionally, the newly introduced RISCVMakeCompressible pass will
modify instructions prior to emission at -Oz in order to increase
opportunities for compression with the RISC-V C extension.
- Various bug fixes and improvements to code generation for the RISC-V vector
extensions.
- Various improvements were made to RISC-V specific optimisation passes such
as RISCVSExtWRemoval and RISCVMergeBaseOffset.
- llc now computes the target ABI based on the target architecture using the
same logic as Clang if no explicit ABI is given.
- generic is now recognized as a valid CPU name and is mapped to
generic-rv32 or generic-rv64 depending on the target triple.
- Support for the experimental Zvfh extension was added, enabling
half-precision floating point in vectors.
- Support for the Zihintpause (Pause Hint) extension.
- Assembler and disassembler support for the Zfinx and Zdinx (float / double
in integer register) extensions.
- Assembler and disassembler support for the Zicbom, Zicboz, and Zicbop cache
management operation extensions.
- Support for the Zmmul extension (a subextension of the M extension, adding
multiplication instructions only).
- Assembler and disassembler support for the hypervisor extension and for the
Sinval supervisor memory-management extension.
- Support z16 processor name.
- Machine scheduler description for z16.
- Add support for inline assembly address operands (“p”) as well as for SystemZ
specific address operands (“ZQ”, “ZR”, “ZS” and “ZT”).
- Efficient handling of small memcpy/memset operations up to 32 bytes.
- Tuning of the inliner.
- Fixing emission of library calls so that narrow integer arguments are sign or
zero extended per the SystemZ ABI.
- Support added for libunwind.
- Various minor improvements and bugfixes.
- Support half type on SSE2 and above targets following X86 psABI.
- Support rdpru instruction on Zen2 and above targets.
In this release, the half type has an ABI-breaking change that provides
support for the ABI of the _Float16 type on SSE2 and above, following the X86
psABI. (D107082)
The change may affect current uses of half, including (but not limited to):
- Frontends that generate the half type in function arguments and/or return
values.
- Downstream runtimes providing any half conversion builtins assuming the
old ABI.
- Projects built with LLVM 15.0 but using early versions of compiler-rt.
When you find failures with the half type, check the calling convention of
the code and switch it to the new ABI.
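For orientation, the kind of IR affected by this change looks like the minimal sketch below. The function name @scale_half is hypothetical. Under the X86 psABI, _Float16 values are passed and returned in SSE (XMM) registers, so callers and callees built against the old convention may disagree on where such arguments live.

```llvm
; A function that both takes and returns half values. On SSE2 and above
; targets, LLVM 15 lowers these arguments and the return value following
; the psABI rules for _Float16.
define half @scale_half(half %x, half %y) {
  %r = fmul half %x, %y
  ret half %r
}
```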
- Add LLVMGetCastOpcode function to aid users of LLVMBuildCast in
resolving the best cast operation given a source value and destination type.
This function is a direct wrapper of CastInst::getCastOpcode.
- Add LLVMGetAggregateElement function as a wrapper for
Constant::getAggregateElement, which can be used to fetch an element of a
constant struct, array or vector, independently of the underlying
representation. The LLVMGetElementAsConstant function is deprecated in
favor of the new function, which works on all constant aggregates, rather than
only instances of ConstantDataSequential.
- The following functions for creating constant expressions have been removed,
because the underlying constant expressions are no longer supported. Instead,
an instruction should be created using the LLVMBuildXYZ APIs, which will
constant fold the operands if possible and create an instruction otherwise:
- LLVMConstExtractValue
- LLVMConstInsertValue
- LLVMConstUDiv
- LLVMConstExactUDiv
- LLVMConstSDiv
- LLVMConstExactSDiv
- LLVMConstURem
- LLVMConstSRem
- LLVMConstFAdd
- LLVMConstFSub
- LLVMConstFMul
- LLVMConstFDiv
- LLVMConstFRem
- Add LLVMDeleteInstruction function which allows deleting instructions that
are not inserted into a basic block.
- As part of the opaque pointer migration, the following APIs are deprecated and
will be removed in the next release:
- LLVMBuildLoad -> LLVMBuildLoad2
- LLVMBuildCall -> LLVMBuildCall2
- LLVMBuildInvoke -> LLVMBuildInvoke2
- LLVMBuildGEP -> LLVMBuildGEP2
- LLVMBuildInBoundsGEP -> LLVMBuildInBoundsGEP2
- LLVMBuildStructGEP -> LLVMBuildStructGEP2
- LLVMBuildPtrDiff -> LLVMBuildPtrDiff2
- LLVMConstGEP -> LLVMConstGEP2
- LLVMConstInBoundsGEP -> LLVMConstInBoundsGEP2
- LLVMAddAlias -> LLVMAddAlias2
- Refactor compression namespaces across the project, making way for a possible
introduction of alternatives to zlib compression in the llvm toolchain.
Changes are as follows:
- Relocate the llvm::zlib namespace to llvm::compression::zlib.
- Remove crc32 from the zlib compression namespace; users should use llvm::crc32 instead.
- The “memory region” command now has a “--all” option to list all
memory regions (including unmapped ranges). This is the equivalent
of using address 0 then repeating the command until all regions
have been listed.
- Added “--show-tags” option to the “memory find” command. This is off by default.
When enabled, if the target value is found in tagged memory, the tags for that
memory will be shown inline with the memory contents.
- Various memory related parts of LLDB have been updated to handle
non-address bits (such as AArch64 pointer signatures):
- “memory read”, “memory write” and “memory find” can now be used with
addresses with non-address bits.
- All the read and write memory methods on SBProcess and SBTarget can
be used with addresses with non-address bits.
- When printing a pointer expression, LLDB can now dereference the result
even if it has non-address bits.
- The memory cache now ignores non-address bits when looking up memory
locations. This prevents us reading locations multiple times, or not
writing out new values if the addresses have different non-address bits.
- LLDB now supports reading memory tags from AArch64 Linux core files.
- LLDB now supports the gnu debuglink section for reading debug information
from a separate file on Windows.
- LLDB now allows selecting the C++ ABI to use on Windows (between Itanium,
used for MinGW, and MSVC) via the plugin.object-file.pe-coff.abi setting.
In Windows builds of LLDB, this defaults to the style used for LLVM’s default
target.
- The code for the LLVM Visual Studio integration
has been removed. This had been obsolete and abandoned since Visual Studio
started including an integration by default in 2019.
- Added the unwinder, personality, and helper functions for exception handling
on AIX. (D100132)
(D100504)
- PGO on AIX: A new implementation that requires linker support
(__start_SECTION/__stop_SECTION symbols) available on AIX 7.2 TL5 SP4 and
AIX 7.3 TL0 SP2.