author | Dimitry Andric <dim@FreeBSD.org> | 2012-12-22 14:58:30 +0000 |
---|---|---|
committer | Dimitry Andric <dim@FreeBSD.org> | 2012-12-22 14:58:30 +0000 |
commit | 482e7bddf617ae804dc47133cb07eb4aa81e45de (patch) | |
tree | c074bb56c422dea536a85cc2d80fd620bb6af08e | |
parent | 522600a229b950314b5f4af84eba4f3e8a0ffea1 (diff) |
Vendor import of llvm tags/RELEASE_32/final r170710 (effectively, 3.2vendor/llvm/llvm-release_32-r170710
Notes:
svn path=/vendor/llvm/dist/; revision=244590
svn path=/vendor/llvm/llvm-release_32-r170710/; revision=244591; tag=vendor/llvm/llvm-release_32-r170710
24 files changed, 565 insertions, 355 deletions
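The MipsAsmBackend.cpp hunk in this commit routes the new `fixup_Mips_GOT_HI16` and `fixup_Mips_CALL_HI16` kinds through the same adjustment as the existing HI16 fixup: take the second 16 bits and add 1 if bit 15 is set. The following is a minimal standalone sketch of that arithmetic, not code from the commit. The helper names `hi16`, `lo16`, `recombine` and `checkExamples` are hypothetical, and the low-half mask is an assumption based on the 16-bit field width listed in the fixup table.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers; only the hi16 expression is taken from the diff.

// Second 16 bits of the value, plus 1 if bit 15 is set, so that the result
// pairs correctly with a sign-extended low half.
inline uint32_t hi16(uint64_t value) {
  return static_cast<uint32_t>(((value + 0x8000) >> 16) & 0xffff);
}

// Low 16 bits (assumed: the fixup table gives these kinds a 16-bit field).
inline uint32_t lo16(uint64_t value) {
  return static_cast<uint32_t>(value & 0xffff);
}

// Rebuild a 32-bit value from the two halves, sign-extending the low half
// first, as a 16-bit signed immediate would be.
inline uint32_t recombine(uint32_t hi, uint32_t lo) {
  int32_t signedLo = (lo & 0x8000) ? static_cast<int32_t>(lo) - 0x10000
                                   : static_cast<int32_t>(lo);
  return (hi << 16) + static_cast<uint32_t>(signedLo);
}

// Round-trip checks for values with bit 15 set and clear.
inline void checkExamples() {
  assert(hi16(0x12348000u) == 0x1235u);  // bit 15 set: high half rounds up
  assert(recombine(hi16(0x12348000u), lo16(0x12348000u)) == 0x12348000u);
  assert(recombine(hi16(0x00017fffu), lo16(0x00017fffu)) == 0x00017fffu);
}
```

The rounding is what makes the split reversible: without the `+ 0x8000`, any value whose low half has bit 15 set would come back 0x10000 too small once the low half is sign-extended.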
diff --git a/docs/ReleaseNotes.html b/docs/ReleaseNotes.html index a4b1d580b637..a4c5960c1555 100644 --- a/docs/ReleaseNotes.html +++ b/docs/ReleaseNotes.html @@ -29,12 +29,6 @@ <p>Written by the <a href="http://llvm.org/">LLVM Team</a></p> </div> -<h1 style="color:red">These are in-progress notes for the upcoming LLVM 3.2 -release.<br> -You may prefer the -<a href="http://llvm.org/releases/3.1/docs/ReleaseNotes.html">LLVM 3.1 -Release Notes</a>.</h1> - <!-- *********************************************************************** --> <h2> <a name="intro">Introduction</a> @@ -46,7 +40,7 @@ Release Notes</a>.</h1> <p>This document contains the release notes for the LLVM Compiler Infrastructure, release 3.2. Here we describe the status of LLVM, including major improvements from the previous release, improvements in various - subprojects of LLVM, and some of the current users of the code. All LLVM + sub-projects of LLVM, and some of the current users of the code. All LLVM releases may be downloaded from the <a href="http://llvm.org/releases/">LLVM releases web site</a>.</p> @@ -72,11 +66,12 @@ Release Notes</a>.</h1> <div> -<p>The LLVM 3.2 distribution currently consists of code from the core LLVM - repository, which roughly includes the LLVM optimizers, code generators and - supporting tools, and the Clang repository. In addition to this code, the - LLVM Project includes other sub-projects that are in development. Here we - include updates on these subprojects.</p> +<p>The LLVM 3.2 distribution currently consists of production-quality code + from the core LLVM repository, which roughly includes the LLVM optimizers, + code generators and supporting tools, as well as Clang, DragonEgg and + compiler-rt sub-project repositories. In addition to this code, the LLVM + Project includes other sub-projects that are in development. 
Here we + include updates on these sub-projects.</p> <!--=========================================================================--> <h3> @@ -90,18 +85,18 @@ Release Notes</a>.</h1> experience through expressive diagnostics, a high level of conformance to language standards, fast compilation, and low memory use. Like LLVM, Clang provides a modular, library-based architecture that makes it suitable for - creating or integrating with other development tools. Clang is considered a - production-quality compiler for C, Objective-C, C++ and Objective-C++ on x86 - (32- and 64-bit), and for Darwin/ARM targets.</p> + creating or integrating with other development tools.</p> <p>In the LLVM 3.2 time-frame, the Clang team has made many improvements. Highlights include:</p> <ul> - <li>...</li> + <li>Improvements to Clang's diagnostics</li> + <li>Support for tls_model attribute</li> + <li>Type safety attributes</li> </ul> <p>For more details about the changes to Clang since the 3.1 release, see the - <a href="http://clang.llvm.org/docs/ReleaseNotes.html">Clang release + <a href="http://llvm.org/releases/3.2/tools/clang/docs/ReleaseNotes.html">Clang 3.2 release notes.</a></p> <p>If Clang rejects your code but another compiler accepts it, please take a @@ -129,7 +124,10 @@ Release Notes</a>.</h1> <p>The 3.2 release has the following notable changes:</p> <ul> - <li>...</li> + <li>Able to load LLVM plugins such as Polly.</li> + <li>Supports thread-local storage models.</li> + <li>Passes knowledge of variable lifetimes to the LLVM optimizers.</li> + <li>No longer requires GCC to be built with LTO support.</li> </ul> </div> @@ -141,7 +139,8 @@ Release Notes</a>.</h1> <div> -<p>The new LLVM <a href="http://compiler-rt.llvm.org/">compiler-rt project</a> + +<p>The LLVM <a href="http://compiler-rt.llvm.org/">compiler-rt project</a> is a simple library that provides an implementation of the low-level target-specific hooks required by code generation and other runtime components. 
For example, when compiling for a 32-bit target, converting a @@ -153,7 +152,11 @@ Release Notes</a>.</h1> <p>The 3.2 release has the following notable changes:</p> <ul> - <li>...</li> + <li><a href="http://llvm.org/releases/3.2/tools/clang/docs/ThreadSanitizer.html">ThreadSanitizer (TSan)</a> - data race detector run-time library for C/C++ has been added.</li> + <li>Improvements to <a href="http://llvm.org/releases/3.2/tools/clang/docs/AddressSanitizer.html">AddressSanitizer</a> including: better portability + (OSX, Android NDK), support for cmake based builds, enhanced error reporting and lots of bug fixes.</li> + <li>Added support for A6 'Swift' CPU.</li> + <li><code>divsi3</code> function has been enhanced to take advantage of a hardware unsigned divide when it is available.</li> </ul> </div> @@ -174,7 +177,9 @@ Release Notes</a>.</h1> <p>The 3.2 release has the following notable changes:</p> <ul> - <li>...</li> + <li>Linux build fixes for clang (see <a href="http://lldb.llvm.org/build.html">Building LLDB</a>)</li> + <li>Some Linux stability and usability improvements</li> + <li>Switch expression evaluation to use MCJIT (from legacy JIT) on Linux</li> </ul> </div> @@ -193,7 +198,15 @@ Release Notes</a>.</h1> <p>Within the LLVM 3.2 time-frame there were the following highlights:</p> <ul> - <li>...</li> + <li> C++11 shared_ptr atomic access API (20.7.2.5) has been implemented.</li> + <li>Applied noexcept and constexpr throughout library.</li> + <li>Improved C++11 conformance in associative container emplace.</li> + <li>Performance improvements in: std::rotate algorithm and I/O.</li> + <li>Operator new/delete and type_infos for exception types moved from libc++ to libc++abi.</li> + <li>Bug fixes in: <code><atomic></code>; vector<code><bool></code> algorithms, + <code><future></code>,<code><tuple></code>, + <code><type_traits></code>,<code><fstream></code>,<code><istream></code>, + <code><iterator></code>, <code><condition_variable></code>,<code><complex></code> as 
well as visibility fixes. </ul> </div> @@ -212,7 +225,7 @@ Release Notes</a>.</h1> <p>The 3.2 release has the following notable changes:</p> <ul> - <li>...</li> + <li>Bug fixes only, no functional changes.</li> </ul> </div> @@ -227,16 +240,61 @@ Release Notes</a>.</h1> <p><a href="http://polly.llvm.org/">Polly</a> is an <em>experimental</em> optimizer for data locality and parallelism. It currently provides high-level - loop optimizations and automatic parallelisation (using the OpenMP run time). + loop optimizations and automatic parallelization (using the OpenMP run time). Work in the area of automatic SIMD and accelerator code generation was started.</p> <p>Within the LLVM 3.2 time-frame there were the following highlights:</p> <ul> - <li>...</li> + <li>isl, the integer set library used by Polly, was relicensed under the MIT license.</li> + <li>isl based code generation.</li> + <li>MIT licensed replacement for CLooG (LGPLv2).</li> + <li>Fine grained option handling (separation of core and border computations, control overhead vs. 
code size).</li> + <li>Support for FORTRAN and Dragonegg.</li> + <li>OpenMP code generation fixes.</li> +</ul> + +</div> + +<!--=========================================================================--> +<h3> +<a name="StaticAnalyzer">Clang Static Analyzer</a> +</h3> + +<div> + +<p>The <a href="http://clang-analyzer.llvm.org/">Clang Static Analyzer</a> + is an advanced source code analysis tool integrated into Clang that performs + a deep analysis of code to find potential bugs.</p> + +<p>In the LLVM 3.2 release, the static analyzer has made significant improvements + in many areas, with notable highlights such as:</p> + +<ul> + <li>Improved interprocedural analysis within a translation unit (see details below), which greatly amplified the analyzer's ability to find bugs.</li> + <li>New infrastructure to model "well-known" APIs, allowing the analyzer to do a much better job when modeling calls to such functions.</li> + <li>Significant improvements to the APIs to write static analyzer checkers, with a more unified way of representing function/method calls in the checker API. Details can be found in the <a href="http://llvm.org/devmtg/2012-11#talk13">Building a Checker in 24 hours</a> talk. 
+</ul> + +<p>The release specifically includes notable improvements for Objective-C analysis, including:</p> + +<ul> + <li>Interprocedural analysis for Objective-C methods.</li> + <li>Interprocedural analysis of calls to "blocks".</li> + <li>Precise modeling of GCD APIs such as <tt>dispatch_once</tt> and friends.</li> + <li>Improved support for recently added Objective-C constructs such as array and dictionary literals.</li> +</ul> + +<p>The release specifically includes notable improvements for C++ analysis, including:</p> + +<ul> + <li>Interprocedural analysis for C++ methods (within a translation unit).</li> + <li>More precise modeling of C++ initializers and destructors.</li> </ul> +<p>Finally, this release includes many small improvements to <tt>scan-build</tt>, which can be used to drive the analyzer from the command line or a continuous integration system. This includes a directory-traversal issue, which could cause potential security problems in some cases. We would like to acknowledge Tim Brown of Portcullis Computer Security Ltd for reporting this issue.</p> + </div> </div> @@ -265,6 +323,19 @@ Release Notes</a>.</h1> </div> +<h3>EmbToolkit</h3> + +<div> + +<p><a href="http://www.embtoolkit.org/">EmbToolkit</a> provides Linux cross-compiler + toolchain/SDK (GCC/binutils/C library (uclibc,eglibc,musl)), a build system for + package cross-compilation and optionally various root file systems. + It supports ARM and MIPS. There is an ongoing effort to provide a clang+llvm + environment for the 3.2 releases, +</p> + +</div> + <h3>FAUST</h3> <div> @@ -274,7 +345,7 @@ Release Notes</a>.</h1> AUdio STream. Its programming model combines two approaches: functional programming and block diagram composition. 
In addition with the C, C++, Java, JavaScript output formats, the Faust compiler can generate LLVM bitcode, and - works with LLVM 2.7-3.1.</p> + works with LLVM 2.7-3.2.</p> </div> @@ -331,7 +402,11 @@ Release Notes</a>.</h1> <p>OSL was developed by Sony Pictures Imageworks for use in its in-house renderer used for feature film animation and visual effects, and is - distributed as open source software with the "New BSD" license.</p> + distributed as open source software with the "New BSD" license. + It has been used for all the shading on such films as The Amazing Spider-Man, + Men in Black III, Hotel Transylvania, and may other films in-progress, + and also has been incorporated into several commercial and open source + rendering products such as Blender, VRay, and Autodesk Beast.</p> </div> @@ -367,7 +442,7 @@ Release Notes</a>.</h1> C++, Fortran and Faust code in Pure programs if the corresponding LLVM-enabled compilers are installed).</p> -<p>Pure version 0.54 has been tested and is known to work with LLVM 3.1 (and +<p>Pure version 0.56 has been tested and is known to work with LLVM 3.2 (and continues to work with older LLVM releases >= 2.5).</p> </div> @@ -432,7 +507,9 @@ Release Notes</a>.</h1> <p>LLVM 3.2 includes several major changes and big features:</p> <ul> - <li>...</li> + <li>Loop Vectorizer.</li> + <li>New implementation of SROA.</li> + <li>New NVPTX back-end (replacing existing PTX back-end) based on NVIDIA sources.</li> </ul> </div> @@ -451,7 +528,10 @@ Release Notes</a>.</h1> <ul> <li>Thread local variables may have a specified TLS model. See the <a href="LangRef.html#globalvars">Language Reference Manual</a>.</li> - <li>...</li> + <li>'TYPE_CODE_FUNCTION_OLD' type code and autoupgrade code for old function attributes format has been removed.</li> + <li>Internal representation of the Attributes class has been converted into a pointer to an + opaque object that's uniqued by and stored in the LLVMContext object. 
+ The Attributes class then becomes a thin wrapper around this opaque object.</li> </ul> </div> @@ -489,23 +569,33 @@ Release Notes</a>.</h1> <ul> <li>The inner most loops must have a single basic block.</li> <li>The number of iterations are known before the loop starts to execute.</li> - <li>The loop counter needs to be incrimented by one.</li> + <li>The loop counter needs to be incremented by one.</li> <li>The loop trip count <b>can</b> be a variable.</li> <li>Loops do <b>not</b> need to start at zero.</li> <li>The induction variable can be used inside the loop.</li> <li>Loop reductions are supported.</li> <li>Arrays with affine access pattern do <b>not</b> need to be marked as 'noalias' and are checked at runtime.</li> - <li>...</li> </ul> </p> -<p>SROA - We've re-written SROA to be significantly more powerful. -<!-- FIXME: Add more text here... --></p> +<p>SROA - We’ve re-written SROA to be significantly more powerful and generate +code which is much more friendly to the rest of the optimization pipeline. +Previously this pass had scaling problems that required it to only operate on +relatively small aggregates, and at times it would mistakenly replace a large +aggregate with a single very large integer in order to make it a scalar SSA +value. The result was a large number of i1024 and i2048 values representing any +small stack buffer. These in turn slowed down many subsequent optimization +paths.</p> +<p>The new SROA pass uses a different algorithm that allows it to only promote to +scalars the pieces of the aggregate actively in use. Because of this it doesn’t +require any thresholds. It also always deduces the scalar values from the uses +of the aggregate rather than the specific LLVM type of the aggregate. 
These +features combine to both optimize more code with the pass but to improve the +compile time of many functions dramatically.</p> <ul> - <li>Branch weight metadata is preseved through more of the optimizer.</li> - <li>...</li> + <li>Branch weight metadata is preserved through more of the optimizer.</li> </ul> </div> @@ -524,8 +614,19 @@ Release Notes</a>.</h1> <a href="http://blog.llvm.org/2010/04/intro-to-llvm-mc-project.html">Intro to the LLVM MC Project Blog Post</a>.</p> -<ul> - <li>...</li> +<ul> + <li> Added support for following assembler directives: <code>.ifb</code>, <code>.ifnb</code>, <code>.ifc</code>, + <code>.ifnc</code>, <code>.purgem</code>, <code>.rept</code> and <code>.version</code> (ELF) as well as Darwin specific + <code>.pushsection</code>, <code>.popsection</code> and <code>.previous</code> .</li> + <li>Enhanced handling of <code>.lcomm directive</code>.</li> + <li>MS style inline assembler: added implementation of the offset and TYPE operators.</li> + <li>Targets can specify minimum supported NOP size for NOP padding.</li> + <li>ELF improvements: added support for generating ELF objects on Windows.</li> + <li>MachO improvements: symbol-difference variables are marked as N_ABS, added direct-to-object attribute for data-in-code markers.</li> + <li>Added support for annotated disassembly output for x86 and arm targets.</li> + <li>Arm support has been improved by adding support for ARM TARGET2 relocation + and fixing hadling of ARM-style "$d.*" labels.</li> + <li>Implemented local-exec TLS on PowerPC.</li> </ul> </div> @@ -550,10 +651,6 @@ Release Notes</a>.</h1> infrastructure, which allows us to implement more aggressive algorithms and make it run faster:</p> -<ul> - <li>...</li> -</ul> - <p> We added new TableGen infrastructure to support bundling for Very Long Instruction Word (VLIW) architectures. 
TableGen can now automatically generate a deterministic finite automaton from a VLIW @@ -563,6 +660,13 @@ Release Notes</a>.</h1> <p> We have added a new target independent VLIW packetizer based on the DFA infrastructure to group machine instructions into bundles.</p> +<p> We have added new TableGen infrastructure to support relationship maps + between instructions. This feature enables TableGen to automatically + construct a set of relation tables and query functions that can be used + to switch between various forms of instructions. For more information, + please refer to <a href="http://llvm.org/docs/HowToUseInstrMappings.html"> + How To Use Instruction Mappings</a>.</p> + </div> <h4> @@ -588,7 +692,7 @@ Release Notes</a>.</h1> <p>New features and major changes in the X86 target include:</p> <ul> - <li>...</li> + <li>Small codegen optimizations, especially for AVX2.</li> </ul> </div> @@ -603,7 +707,7 @@ Release Notes</a>.</h1> <p>New features of the ARM target include:</p> <ul> - <li>...</li> + <li>Support and performance tuning for the A6 'Swift' CPU.</li> </ul> <!--_________________________________________________________________________--> @@ -620,7 +724,7 @@ Release Notes</a>.</h1> platform specific support for Linux.</p> <p>Full support is included for Thumb1, Thumb2 and ARM modes, along with - subtarget and CPU specific extensions for VFP2, VFP3 and NEON.</p> + sub-target and CPU specific extensions for VFP2, VFP3 and NEON.</p> <p>The assembler is Unified Syntax only (see ARM Architecural Reference Manual for details). While there is some, and growing, support for pre-unfied @@ -640,7 +744,29 @@ Release Notes</a>.</h1> <p>New features and major changes in the MIPS target include:</p> <ul> - <li>...</li> + <li>Integrated assembler support: + MIPS32 works for both PIC and static, known limitation is the PR14456 where + R_MIPS_GPREL16 relocation is generated with the wrong addend. 
+ MIPS64 support is incomplete, for example exception handling is not working.</li> + <li>Support for fast calling convention has been added.</li> + <li>Support for Android MIPS toolchain has been added to clang driver.</li> + <li>Added clang driver support for MIPS N32 ABI through "-mabi=n32" option.</li> + <li>MIPS32 and MIPS64 disassembler has been implemented.</li> + <li>Support for compiling programs with large GOTs (exceeding 64kB in size) has been added + through llc option "-mxgot".</li> + <li>Added experimental support for MIPS32 DSP intrinsics.</li> + <li>Experimental support for MIPS16 with following limitations: only soft float is supported, + C++ exceptions are not supported, large stack frames (> 32000 bytes) are not supported, + direct object code emission is not supported only .s .</li> + <li>Standalone assembler (llvm-mc): implementation is in progress and considered experimental.</li> + <li>All classic JIT and MCJIT tests pass on Little and Big Endian MIPS32 platforms.</li> + <li>Inline asm support: all common constraints and operand modifiers have been implemented.</li> + <li>Added tail call optimization support, use llc option "-enable-mips-tail-calls" + or clang options "-mllvm -enable-mips-tail-calls"to enable it.</li> + <li>Improved register allocation by removing registers $fp, $gp, $ra and $at from the list of reserved registers.</li> + <li>Long branch expansion pass has been implemented, which expands branch + instructions with offsets that do not fit in the 16-bit field.</li> + <li>Cavium Octeon II board is used for testing builds (llvm-mips-linux builder).</li> </ul> </div> @@ -652,7 +778,6 @@ Release Notes</a>.</h1> <div> -<ul> <p>Many fixes and changes across LLVM (and Clang) for better compliance with the 64-bit PowerPC ELF Application Binary Interface, interoperability with GCC, and overall 64-bit PowerPC support. 
Some highlights include:</p> @@ -681,8 +806,28 @@ Release Notes</a>.</h1> <p>There have also been code generation improvements for both 32- and 64-bit code. Instruction scheduling support for the Freescale e500mc and e5500 cores has been added.</p> + +</div> + +<!--=========================================================================--> +<h3> +<a name="NVPTX">PTX/NVPTX Target Improvements</a> +</h3> + +<div> + +<p>The PTX back-end has been replaced by the NVPTX back-end, which is based on + the LLVM back-end used by NVIDIA in their CUDA (nvcc) and OpenCL compiler. + Some highlights include:</p> +<ul> + <li>Compatibility with PTX 3.1 and SM 3.5</li> + <li>Support for NVVM intrinsics as defined in the NVIDIA Compiler SDK</li> + <li>Full compatibility with old PTX back-end, with much greater coverage of + LLVM IR</li> </ul> +<p>Please submit any back-end bugs to the LLVM Bugzilla site.</p> + </div> <!--=========================================================================--> @@ -693,7 +838,7 @@ Release Notes</a>.</h1> <div> <ul> - <li>...</li> + <li>Added support for custom names for library functions in TargetLibraryInfo.</li> </ul> </div> @@ -710,9 +855,11 @@ Release Notes</a>.</h1> from the previous release.</p> <ul> - <li>...</li> -</ul> - +<li>llvm-ld and llvm-stub have been removed, llvm-ld functionality can be partially replaced by + llvm-link | opt | {llc | as, llc -filetype=obj} | ld, or fully replaced by Clang. </li> +<li>MCJIT: added support for inline assembly (requires asm parser), added faux remote target execution to lli option '-remote-mcjit'.</li> +</ul> + </div> <!--=========================================================================--> @@ -733,10 +880,6 @@ Release Notes</a>.</h1> <p> The TargetData structure has been renamed to DataLayout and moved to VMCore to remove a dependency on Target. 
</p> -<ul> - <li>...</li> -</ul> - </div> <!--=========================================================================--> @@ -746,33 +889,22 @@ to remove a dependency on Target. </p> <div> -<p>In addition, some tools have changed in this release. Some of the changes - are:</p> - -<ul> - <li>...</li> -</ul> - -</div> - - -<!--=========================================================================--> -<h3> -<a name="python">Python Bindings</a> -</h3> - -<div> - -<p>Officially supported Python bindings have been added! Feature support is far - from complete. The current bindings support interfaces to:</p> +<p>In addition, some tools have changed in this release. Some of the changes are:</p> <ul> - <li>...</li> +<li>opt: added support for '-mtriple' option.</li> +<li>llvm-mc : - added '-disassemble' support for '-show-inst' and '-show-encoding' options, added '-edis' option to produce annotated + disassembly output for X86 and ARM targets.</li> +<li>libprofile: allows the profile data file name to be specified by the LLVMPROF_OUTPUT environment variable.</li> +<li>llvm-objdump: has been changed to display available targets, '-arch' option accepts x86 and x86-64 as valid arch names.</li> +<li>llc and opt: added FMA formation from pairs of FADD + FMUL or FSUB + FMUL enabled by option '-enable-excess-fp-precision' or option '-enable-unsafe-fp-math', + option '-fp-contract' controls the creation by optimizations of fused FP by selecting Fast, Standard, or Strict mode.</li> +<li>llc: object file output from llc is no longer considered experimental.</li> +<li>gold plugin: handles Position Independent Executables.</li> </ul> </div> -</div> <!-- *********************************************************************** --> <h2> @@ -794,7 +926,7 @@ to remove a dependency on Target. 
</p> <p>Known problem areas include:</p> <ul> - <li>The CellSPU, MSP430, PTX and XCore backends are experimental.</li> + <li>The CellSPU, MSP430, and XCore backends are experimental, and the CellSPU backend will be removed in LLVM 3.3.</li> <li>The integrated assembler, disassembler, and JIT is not supported by several targets. If an integrated assembler is not supported, then a @@ -836,7 +968,7 @@ to remove a dependency on Target. </p> src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a> <a href="http://llvm.org/">LLVM Compiler Infrastructure</a><br> - Last modified: $Date: 2012-11-20 05:22:44 +0100 (Tue, 20 Nov 2012) $ + Last modified: $Date: 2012-12-19 11:50:28 +0100 (Wed, 19 Dec 2012) $ </address> </body> diff --git a/include/llvm/MC/MCExpr.h b/include/llvm/MC/MCExpr.h index 00eef270d6c4..1007aa526493 100644 --- a/include/llvm/MC/MCExpr.h +++ b/include/llvm/MC/MCExpr.h @@ -197,7 +197,11 @@ public: VK_Mips_GOT_PAGE, VK_Mips_GOT_OFST, VK_Mips_HIGHER, - VK_Mips_HIGHEST + VK_Mips_HIGHEST, + VK_Mips_GOT_HI16, + VK_Mips_GOT_LO16, + VK_Mips_CALL_HI16, + VK_Mips_CALL_LO16 }; private: diff --git a/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp b/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp index f6dccb106d9b..a180e36e83f8 100644 --- a/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp +++ b/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp @@ -346,7 +346,7 @@ uint8_t *RuntimeDyldImpl::createStubFunction(uint8_t *Addr) { uint32_t *StubAddr = (uint32_t*)Addr; *StubAddr = 0xe51ff004; // ldr pc,<label> return (uint8_t*)++StubAddr; - } else if (Arch == Triple::mipsel) { + } else if (Arch == Triple::mipsel || Arch == Triple::mips) { uint32_t *StubAddr = (uint32_t*)Addr; // 0: 3c190000 lui t9,%hi(addr). // 4: 27390000 addiu t9,t9,%lo(addr). 
diff --git a/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp b/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp index 1ebcaf7ba822..f7015cdf6b5e 100644 --- a/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp +++ b/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp @@ -676,7 +676,8 @@ void RuntimeDyldELF::processRelocationRef(const ObjRelocationInfo &Rel, RelType, 0); Section.StubOffset += getMaxStubSize(); } - } else if (Arch == Triple::mipsel && RelType == ELF::R_MIPS_26) { + } else if ((Arch == Triple::mipsel || Arch == Triple::mips) && + RelType == ELF::R_MIPS_26) { // This is an Mips branch relocation, need to use a stub function. DEBUG(dbgs() << "\t\tThis is a Mips branch relocation."); SectionEntry &Section = Sections[Rel.SectionID]; diff --git a/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldImpl.h b/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldImpl.h index 829fd6c4c9a9..a292ee1a8479 100644 --- a/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldImpl.h +++ b/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldImpl.h @@ -168,7 +168,7 @@ protected: inline unsigned getMaxStubSize() { if (Arch == Triple::arm || Arch == Triple::thumb) return 8; // 32-bit instruction and 32-bit address - else if (Arch == Triple::mipsel) + else if (Arch == Triple::mipsel || Arch == Triple::mips) return 16; else if (Arch == Triple::ppc64) return 44; diff --git a/lib/MC/MCExpr.cpp b/lib/MC/MCExpr.cpp index e0336342d6d1..de2f375aab91 100644 --- a/lib/MC/MCExpr.cpp +++ b/lib/MC/MCExpr.cpp @@ -229,6 +229,10 @@ StringRef MCSymbolRefExpr::getVariantKindName(VariantKind Kind) { case VK_Mips_GOT_OFST: return "GOT_OFST"; case VK_Mips_HIGHER: return "HIGHER"; case VK_Mips_HIGHEST: return "HIGHEST"; + case VK_Mips_GOT_HI16: return "GOT_HI16"; + case VK_Mips_GOT_LO16: return "GOT_LO16"; + case VK_Mips_CALL_HI16: return "CALL_HI16"; + case VK_Mips_CALL_LO16: return "CALL_LO16"; } llvm_unreachable("Invalid variant kind"); } diff --git a/lib/Target/Mips/InstPrinter/MipsInstPrinter.cpp 
b/lib/Target/Mips/InstPrinter/MipsInstPrinter.cpp index b38463de4bfe..68d3ac5f3bd0 100644 --- a/lib/Target/Mips/InstPrinter/MipsInstPrinter.cpp +++ b/lib/Target/Mips/InstPrinter/MipsInstPrinter.cpp @@ -128,6 +128,10 @@ static void printExpr(const MCExpr *Expr, raw_ostream &OS) { case MCSymbolRefExpr::VK_Mips_GOT_OFST: OS << "%got_ofst("; break; case MCSymbolRefExpr::VK_Mips_HIGHER: OS << "%higher("; break; case MCSymbolRefExpr::VK_Mips_HIGHEST: OS << "%highest("; break; + case MCSymbolRefExpr::VK_Mips_GOT_HI16: OS << "%got_hi("; break; + case MCSymbolRefExpr::VK_Mips_GOT_LO16: OS << "%got_lo("; break; + case MCSymbolRefExpr::VK_Mips_CALL_HI16: OS << "%call_hi("; break; + case MCSymbolRefExpr::VK_Mips_CALL_LO16: OS << "%call_lo("; break; } OS << SRE->getSymbol(); diff --git a/lib/Target/Mips/MCTargetDesc/MipsAsmBackend.cpp b/lib/Target/Mips/MCTargetDesc/MipsAsmBackend.cpp index 9a35bb6bd707..c078794899d2 100644 --- a/lib/Target/Mips/MCTargetDesc/MipsAsmBackend.cpp +++ b/lib/Target/Mips/MCTargetDesc/MipsAsmBackend.cpp @@ -42,6 +42,8 @@ static unsigned adjustFixupValue(unsigned Kind, uint64_t Value) { case Mips::fixup_Mips_GOT_PAGE: case Mips::fixup_Mips_GOT_OFST: case Mips::fixup_Mips_GOT_DISP: + case Mips::fixup_Mips_GOT_LO16: + case Mips::fixup_Mips_CALL_LO16: break; case Mips::fixup_Mips_PC16: // So far we are only using this type for branches. @@ -60,6 +62,8 @@ static unsigned adjustFixupValue(unsigned Kind, uint64_t Value) { break; case Mips::fixup_Mips_HI16: case Mips::fixup_Mips_GOT_Local: + case Mips::fixup_Mips_GOT_HI16: + case Mips::fixup_Mips_CALL_HI16: // Get the 2nd 16-bits. Also add 1 if bit 15 is 1. 
Value = ((Value + 0x8000) >> 16) & 0xffff; break; @@ -179,7 +183,11 @@ public: { "fixup_Mips_GOT_OFST", 0, 16, 0 }, { "fixup_Mips_GOT_DISP", 0, 16, 0 }, { "fixup_Mips_HIGHER", 0, 16, 0 }, - { "fixup_Mips_HIGHEST", 0, 16, 0 } + { "fixup_Mips_HIGHEST", 0, 16, 0 }, + { "fixup_Mips_GOT_HI16", 0, 16, 0 }, + { "fixup_Mips_GOT_LO16", 0, 16, 0 }, + { "fixup_Mips_CALL_HI16", 0, 16, 0 }, + { "fixup_Mips_CALL_LO16", 0, 16, 0 } }; if (Kind < FirstTargetFixupKind) diff --git a/lib/Target/Mips/MCTargetDesc/MipsBaseInfo.h b/lib/Target/Mips/MCTargetDesc/MipsBaseInfo.h index 233214b461f0..94e0d20d8835 100644 --- a/lib/Target/Mips/MCTargetDesc/MipsBaseInfo.h +++ b/lib/Target/Mips/MCTargetDesc/MipsBaseInfo.h @@ -84,7 +84,13 @@ namespace MipsII { /// MO_HIGHER/HIGHEST - Represents the highest or higher half word of a /// 64-bit symbol address. MO_HIGHER, - MO_HIGHEST + MO_HIGHEST, + + /// MO_GOT_HI16/LO16, MO_CALL_HI16/LO16 - Relocations used for large GOTs. + MO_GOT_HI16, + MO_GOT_LO16, + MO_CALL_HI16, + MO_CALL_LO16 }; enum { diff --git a/lib/Target/Mips/MCTargetDesc/MipsELFObjectWriter.cpp b/lib/Target/Mips/MCTargetDesc/MipsELFObjectWriter.cpp index 5d240fe84703..f82e203c23ca 100644 --- a/lib/Target/Mips/MCTargetDesc/MipsELFObjectWriter.cpp +++ b/lib/Target/Mips/MCTargetDesc/MipsELFObjectWriter.cpp @@ -179,6 +179,18 @@ unsigned MipsELFObjectWriter::GetRelocType(const MCValue &Target, case Mips::fixup_Mips_HIGHEST: Type = ELF::R_MIPS_HIGHEST; break; + case Mips::fixup_Mips_GOT_HI16: + Type = ELF::R_MIPS_GOT_HI16; + break; + case Mips::fixup_Mips_GOT_LO16: + Type = ELF::R_MIPS_GOT_LO16; + break; + case Mips::fixup_Mips_CALL_HI16: + Type = ELF::R_MIPS_CALL_HI16; + break; + case Mips::fixup_Mips_CALL_LO16: + Type = ELF::R_MIPS_CALL_LO16; + break; } return Type; } diff --git a/lib/Target/Mips/MCTargetDesc/MipsFixupKinds.h b/lib/Target/Mips/MCTargetDesc/MipsFixupKinds.h index 77faec54fb23..f96390043a3b 100644 --- a/lib/Target/Mips/MCTargetDesc/MipsFixupKinds.h +++ 
b/lib/Target/Mips/MCTargetDesc/MipsFixupKinds.h @@ -116,6 +116,18 @@ namespace Mips { // resulting in - R_MIPS_HIGHEST fixup_Mips_HIGHEST, + // resulting in - R_MIPS_GOT_HI16 + fixup_Mips_GOT_HI16, + + // resulting in - R_MIPS_GOT_LO16 + fixup_Mips_GOT_LO16, + + // resulting in - R_MIPS_CALL_HI16 + fixup_Mips_CALL_HI16, + + // resulting in - R_MIPS_CALL_LO16 + fixup_Mips_CALL_LO16, + // Marker LastTargetFixupKind, NumTargetFixupKinds = LastTargetFixupKind - FirstTargetFixupKind diff --git a/lib/Target/Mips/MCTargetDesc/MipsMCCodeEmitter.cpp b/lib/Target/Mips/MCTargetDesc/MipsMCCodeEmitter.cpp index 7fbdae02f411..da1e4552c9d0 100644 --- a/lib/Target/Mips/MCTargetDesc/MipsMCCodeEmitter.cpp +++ b/lib/Target/Mips/MCTargetDesc/MipsMCCodeEmitter.cpp @@ -287,6 +287,18 @@ getMachineOpValue(const MCInst &MI, const MCOperand &MO, case MCSymbolRefExpr::VK_Mips_HIGHEST: FixupKind = Mips::fixup_Mips_HIGHEST; break; + case MCSymbolRefExpr::VK_Mips_GOT_HI16: + FixupKind = Mips::fixup_Mips_GOT_HI16; + break; + case MCSymbolRefExpr::VK_Mips_GOT_LO16: + FixupKind = Mips::fixup_Mips_GOT_LO16; + break; + case MCSymbolRefExpr::VK_Mips_CALL_HI16: + FixupKind = Mips::fixup_Mips_CALL_HI16; + break; + case MCSymbolRefExpr::VK_Mips_CALL_LO16: + FixupKind = Mips::fixup_Mips_CALL_LO16; + break; } // switch Fixups.push_back(MCFixup::Create(0, MO.getExpr(), MCFixupKind(FixupKind))); diff --git a/lib/Target/Mips/Mips64InstrInfo.td b/lib/Target/Mips/Mips64InstrInfo.td index a6111689c7ed..83322eac8c62 100644 --- a/lib/Target/Mips/Mips64InstrInfo.td +++ b/lib/Target/Mips/Mips64InstrInfo.td @@ -255,6 +255,7 @@ def : MipsPat<(MipsHi tblockaddress:$in), (LUi64 tblockaddress:$in)>; def : MipsPat<(MipsHi tjumptable:$in), (LUi64 tjumptable:$in)>; def : MipsPat<(MipsHi tconstpool:$in), (LUi64 tconstpool:$in)>; def : MipsPat<(MipsHi tglobaltlsaddr:$in), (LUi64 tglobaltlsaddr:$in)>; +def : MipsPat<(MipsHi texternalsym:$in), (LUi64 texternalsym:$in)>; def : MipsPat<(MipsLo tglobaladdr:$in), (DADDiu ZERO_64, 
tglobaladdr:$in)>; def : MipsPat<(MipsLo tblockaddress:$in), (DADDiu ZERO_64, tblockaddress:$in)>; @@ -262,6 +263,7 @@ def : MipsPat<(MipsLo tjumptable:$in), (DADDiu ZERO_64, tjumptable:$in)>; def : MipsPat<(MipsLo tconstpool:$in), (DADDiu ZERO_64, tconstpool:$in)>; def : MipsPat<(MipsLo tglobaltlsaddr:$in), (DADDiu ZERO_64, tglobaltlsaddr:$in)>; +def : MipsPat<(MipsLo texternalsym:$in), (DADDiu ZERO_64, texternalsym:$in)>; def : MipsPat<(add CPU64Regs:$hi, (MipsLo tglobaladdr:$lo)), (DADDiu CPU64Regs:$hi, tglobaladdr:$lo)>; diff --git a/lib/Target/Mips/MipsCodeEmitter.cpp b/lib/Target/Mips/MipsCodeEmitter.cpp index 4bfccd8fdd7d..05090b84dece 100644 --- a/lib/Target/Mips/MipsCodeEmitter.cpp +++ b/lib/Target/Mips/MipsCodeEmitter.cpp @@ -85,7 +85,7 @@ class MipsCodeEmitter : public MachineFunctionPass { private: - void emitWordLE(unsigned Word); + void emitWord(unsigned Word); /// Routines that handle operands which add machine relocations which are /// fixed up by the relocation stage. @@ -112,12 +112,6 @@ class MipsCodeEmitter : public MachineFunctionPass { unsigned getSizeExtEncoding(const MachineInstr &MI, unsigned OpNo) const; unsigned getSizeInsEncoding(const MachineInstr &MI, unsigned OpNo) const; - int emitULW(const MachineInstr &MI); - int emitUSW(const MachineInstr &MI); - int emitULH(const MachineInstr &MI); - int emitULHu(const MachineInstr &MI); - int emitUSH(const MachineInstr &MI); - void emitGlobalAddressUnaligned(const GlobalValue *GV, unsigned Reloc, int Offset) const; }; @@ -133,7 +127,7 @@ bool MipsCodeEmitter::runOnMachineFunction(MachineFunction &MF) { MCPEs = &MF.getConstantPool()->getConstants(); MJTEs = 0; if (MF.getJumpTableInfo()) MJTEs = &MF.getJumpTableInfo()->getJumpTables(); - JTI->Initialize(MF, IsPIC); + JTI->Initialize(MF, IsPIC, Subtarget->isLittle()); MCE.setModuleInfo(&getAnalysis<MachineModuleInfo> ()); do { @@ -271,103 +265,6 @@ void MipsCodeEmitter::emitMachineBasicBlock(MachineBasicBlock *BB, Reloc, BB)); } -int 
MipsCodeEmitter::emitUSW(const MachineInstr &MI) { - unsigned src = getMachineOpValue(MI, MI.getOperand(0)); - unsigned base = getMachineOpValue(MI, MI.getOperand(1)); - unsigned offset = getMachineOpValue(MI, MI.getOperand(2)); - // swr src, offset(base) - // swl src, offset+3(base) - MCE.emitWordLE( - (0x2e << 26) | (base << 21) | (src << 16) | (offset & 0xffff)); - MCE.emitWordLE( - (0x2a << 26) | (base << 21) | (src << 16) | ((offset+3) & 0xffff)); - return 2; -} - -int MipsCodeEmitter::emitULW(const MachineInstr &MI) { - unsigned dst = getMachineOpValue(MI, MI.getOperand(0)); - unsigned base = getMachineOpValue(MI, MI.getOperand(1)); - unsigned offset = getMachineOpValue(MI, MI.getOperand(2)); - unsigned at = 1; - if (dst != base) { - // lwr dst, offset(base) - // lwl dst, offset+3(base) - MCE.emitWordLE( - (0x26 << 26) | (base << 21) | (dst << 16) | (offset & 0xffff)); - MCE.emitWordLE( - (0x22 << 26) | (base << 21) | (dst << 16) | ((offset+3) & 0xffff)); - return 2; - } else { - // lwr at, offset(base) - // lwl at, offset+3(base) - // addu dst, at, $zero - MCE.emitWordLE( - (0x26 << 26) | (base << 21) | (at << 16) | (offset & 0xffff)); - MCE.emitWordLE( - (0x22 << 26) | (base << 21) | (at << 16) | ((offset+3) & 0xffff)); - MCE.emitWordLE( - (0x0 << 26) | (at << 21) | (0x0 << 16) | (dst << 11) | (0x0 << 6) | 0x21); - return 3; - } -} - -int MipsCodeEmitter::emitUSH(const MachineInstr &MI) { - unsigned src = getMachineOpValue(MI, MI.getOperand(0)); - unsigned base = getMachineOpValue(MI, MI.getOperand(1)); - unsigned offset = getMachineOpValue(MI, MI.getOperand(2)); - unsigned at = 1; - // sb src, offset(base) - // srl at,src,8 - // sb at, offset+1(base) - MCE.emitWordLE( - (0x28 << 26) | (base << 21) | (src << 16) | (offset & 0xffff)); - MCE.emitWordLE( - (0x0 << 26) | (0x0 << 21) | (src << 16) | (at << 11) | (0x8 << 6) | 0x2); - MCE.emitWordLE( - (0x28 << 26) | (base << 21) | (at << 16) | ((offset+1) & 0xffff)); - return 3; -} - -int 
MipsCodeEmitter::emitULH(const MachineInstr &MI) { - unsigned dst = getMachineOpValue(MI, MI.getOperand(0)); - unsigned base = getMachineOpValue(MI, MI.getOperand(1)); - unsigned offset = getMachineOpValue(MI, MI.getOperand(2)); - unsigned at = 1; - // lbu at, offset(base) - // lb dst, offset+1(base) - // sll dst,dst,8 - // or dst,dst,at - MCE.emitWordLE( - (0x24 << 26) | (base << 21) | (at << 16) | (offset & 0xffff)); - MCE.emitWordLE( - (0x20 << 26) | (base << 21) | (dst << 16) | ((offset+1) & 0xffff)); - MCE.emitWordLE( - (0x0 << 26) | (0x0 << 21) | (dst << 16) | (dst << 11) | (0x8 << 6) | 0x0); - MCE.emitWordLE( - (0x0 << 26) | (dst << 21) | (at << 16) | (dst << 11) | (0x0 << 6) | 0x25); - return 4; -} - -int MipsCodeEmitter::emitULHu(const MachineInstr &MI) { - unsigned dst = getMachineOpValue(MI, MI.getOperand(0)); - unsigned base = getMachineOpValue(MI, MI.getOperand(1)); - unsigned offset = getMachineOpValue(MI, MI.getOperand(2)); - unsigned at = 1; - // lbu at, offset(base) - // lbu dst, offset+1(base) - // sll dst,dst,8 - // or dst,dst,at - MCE.emitWordLE( - (0x24 << 26) | (base << 21) | (at << 16) | (offset & 0xffff)); - MCE.emitWordLE( - (0x24 << 26) | (base << 21) | (dst << 16) | ((offset+1) & 0xffff)); - MCE.emitWordLE( - (0x0 << 26) | (0x0 << 21) | (dst << 16) | (dst << 11) | (0x8 << 6) | 0x0); - MCE.emitWordLE( - (0x0 << 26) | (dst << 21) | (at << 16) | (dst << 11) | (0x0 << 6) | 0x25); - return 4; -} - void MipsCodeEmitter::emitInstruction(const MachineInstr &MI) { DEBUG(errs() << "JIT: " << (void*)MCE.getCurrentPCValue() << ":\t" << MI); @@ -377,16 +274,19 @@ void MipsCodeEmitter::emitInstruction(const MachineInstr &MI) { if ((MI.getDesc().TSFlags & MipsII::FormMask) == MipsII::Pseudo) return; - emitWordLE(getBinaryCodeForInstr(MI)); + emitWord(getBinaryCodeForInstr(MI)); ++NumEmitted; // Keep track of the # of mi's emitted MCE.processDebugLoc(MI.getDebugLoc(), false); } -void MipsCodeEmitter::emitWordLE(unsigned Word) { +void 
MipsCodeEmitter::emitWord(unsigned Word) { DEBUG(errs() << " 0x"; errs().write_hex(Word) << "\n"); - MCE.emitWordLE(Word); + if (Subtarget->isLittle()) + MCE.emitWordLE(Word); + else + MCE.emitWordBE(Word); } /// createMipsJITCodeEmitterPass - Return a pass that emits the collected Mips diff --git a/lib/Target/Mips/MipsISelLowering.cpp b/lib/Target/Mips/MipsISelLowering.cpp index e225b6c28eb6..b0dd0a766f70 100644 --- a/lib/Target/Mips/MipsISelLowering.cpp +++ b/lib/Target/Mips/MipsISelLowering.cpp @@ -46,6 +46,10 @@ static cl::opt<bool> EnableMipsTailCalls("enable-mips-tail-calls", cl::Hidden, cl::desc("MIPS: Enable tail calls."), cl::init(false)); +static cl::opt<bool> +LargeGOT("mxgot", cl::Hidden, + cl::desc("MIPS: Enable GOT larger than 64k."), cl::init(false)); + static const uint16_t O32IntRegs[4] = { Mips::A0, Mips::A1, Mips::A2, Mips::A3 }; @@ -77,6 +81,71 @@ static SDValue GetGlobalReg(SelectionDAG &DAG, EVT Ty) { return DAG.getRegister(FI->getGlobalBaseReg(), Ty); } +static SDValue getTargetNode(SDValue Op, SelectionDAG &DAG, unsigned Flag) { + EVT Ty = Op.getValueType(); + + if (GlobalAddressSDNode *N = dyn_cast<GlobalAddressSDNode>(Op)) + return DAG.getTargetGlobalAddress(N->getGlobal(), Op.getDebugLoc(), Ty, 0, + Flag); + if (ExternalSymbolSDNode *N = dyn_cast<ExternalSymbolSDNode>(Op)) + return DAG.getTargetExternalSymbol(N->getSymbol(), Ty, Flag); + if (BlockAddressSDNode *N = dyn_cast<BlockAddressSDNode>(Op)) + return DAG.getTargetBlockAddress(N->getBlockAddress(), Ty, 0, Flag); + if (JumpTableSDNode *N = dyn_cast<JumpTableSDNode>(Op)) + return DAG.getTargetJumpTable(N->getIndex(), Ty, Flag); + if (ConstantPoolSDNode *N = dyn_cast<ConstantPoolSDNode>(Op)) + return DAG.getTargetConstantPool(N->getConstVal(), Ty, N->getAlignment(), + N->getOffset(), Flag); + + llvm_unreachable("Unexpected node type."); + return SDValue(); +} + +static SDValue getAddrNonPIC(SDValue Op, SelectionDAG &DAG) { + DebugLoc DL = Op.getDebugLoc(); + EVT Ty = Op.getValueType(); 
+ SDValue Hi = getTargetNode(Op, DAG, MipsII::MO_ABS_HI); + SDValue Lo = getTargetNode(Op, DAG, MipsII::MO_ABS_LO); + return DAG.getNode(ISD::ADD, DL, Ty, + DAG.getNode(MipsISD::Hi, DL, Ty, Hi), + DAG.getNode(MipsISD::Lo, DL, Ty, Lo)); +} + +static SDValue getAddrLocal(SDValue Op, SelectionDAG &DAG, bool HasMips64) { + DebugLoc DL = Op.getDebugLoc(); + EVT Ty = Op.getValueType(); + unsigned GOTFlag = HasMips64 ? MipsII::MO_GOT_PAGE : MipsII::MO_GOT; + SDValue GOT = DAG.getNode(MipsISD::Wrapper, DL, Ty, GetGlobalReg(DAG, Ty), + getTargetNode(Op, DAG, GOTFlag)); + SDValue Load = DAG.getLoad(Ty, DL, DAG.getEntryNode(), GOT, + MachinePointerInfo::getGOT(), false, false, false, + 0); + unsigned LoFlag = HasMips64 ? MipsII::MO_GOT_OFST : MipsII::MO_ABS_LO; + SDValue Lo = DAG.getNode(MipsISD::Lo, DL, Ty, getTargetNode(Op, DAG, LoFlag)); + return DAG.getNode(ISD::ADD, DL, Ty, Load, Lo); +} + +static SDValue getAddrGlobal(SDValue Op, SelectionDAG &DAG, unsigned Flag) { + DebugLoc DL = Op.getDebugLoc(); + EVT Ty = Op.getValueType(); + SDValue Tgt = DAG.getNode(MipsISD::Wrapper, DL, Ty, GetGlobalReg(DAG, Ty), + getTargetNode(Op, DAG, Flag)); + return DAG.getLoad(Ty, DL, DAG.getEntryNode(), Tgt, + MachinePointerInfo::getGOT(), false, false, false, 0); +} + +static SDValue getAddrGlobalLargeGOT(SDValue Op, SelectionDAG &DAG, + unsigned HiFlag, unsigned LoFlag) { + DebugLoc DL = Op.getDebugLoc(); + EVT Ty = Op.getValueType(); + SDValue Hi = DAG.getNode(MipsISD::Hi, DL, Ty, getTargetNode(Op, DAG, HiFlag)); + Hi = DAG.getNode(ISD::ADD, DL, Ty, Hi, GetGlobalReg(DAG, Ty)); + SDValue Wrapper = DAG.getNode(MipsISD::Wrapper, DL, Ty, Hi, + getTargetNode(Op, DAG, LoFlag)); + return DAG.getLoad(Ty, DL, DAG.getEntryNode(), Wrapper, + MachinePointerInfo::getGOT(), false, false, false, 0); +} + const char *MipsTargetLowering::getTargetNodeName(unsigned Opcode) const { switch (Opcode) { case MipsISD::JmpLink: return "MipsISD::JmpLink"; @@ -1743,8 +1812,6 @@ SDValue 
MipsTargetLowering::LowerGlobalAddress(SDValue Op, const GlobalValue *GV = cast<GlobalAddressSDNode>(Op)->getGlobal(); if (getTargetMachine().getRelocationModel() != Reloc::PIC_ && !IsN64) { - SDVTList VTs = DAG.getVTList(MVT::i32); - const MipsTargetObjectFile &TLOF = (const MipsTargetObjectFile&)getObjFileLowering(); @@ -1752,69 +1819,33 @@ SDValue MipsTargetLowering::LowerGlobalAddress(SDValue Op, if (TLOF.IsGlobalInSmallSection(GV, getTargetMachine())) { SDValue GA = DAG.getTargetGlobalAddress(GV, dl, MVT::i32, 0, MipsII::MO_GPREL); - SDValue GPRelNode = DAG.getNode(MipsISD::GPRel, dl, VTs, &GA, 1); + SDValue GPRelNode = DAG.getNode(MipsISD::GPRel, dl, + DAG.getVTList(MVT::i32), &GA, 1); SDValue GPReg = DAG.getRegister(Mips::GP, MVT::i32); return DAG.getNode(ISD::ADD, dl, MVT::i32, GPReg, GPRelNode); } + // %hi/%lo relocation - SDValue GAHi = DAG.getTargetGlobalAddress(GV, dl, MVT::i32, 0, - MipsII::MO_ABS_HI); - SDValue GALo = DAG.getTargetGlobalAddress(GV, dl, MVT::i32, 0, - MipsII::MO_ABS_LO); - SDValue HiPart = DAG.getNode(MipsISD::Hi, dl, VTs, &GAHi, 1); - SDValue Lo = DAG.getNode(MipsISD::Lo, dl, MVT::i32, GALo); - return DAG.getNode(ISD::ADD, dl, MVT::i32, HiPart, Lo); - } - - EVT ValTy = Op.getValueType(); - bool HasGotOfst = (GV->hasInternalLinkage() || - (GV->hasLocalLinkage() && !isa<Function>(GV))); - unsigned GotFlag = HasMips64 ? - (HasGotOfst ? MipsII::MO_GOT_PAGE : MipsII::MO_GOT_DISP) : - (HasGotOfst ? MipsII::MO_GOT : MipsII::MO_GOT16); - SDValue GA = DAG.getTargetGlobalAddress(GV, dl, ValTy, 0, GotFlag); - GA = DAG.getNode(MipsISD::Wrapper, dl, ValTy, GetGlobalReg(DAG, ValTy), GA); - SDValue ResNode = DAG.getLoad(ValTy, dl, DAG.getEntryNode(), GA, - MachinePointerInfo(), false, false, false, 0); - // On functions and global targets not internal linked only - // a load from got/GP is necessary for PIC to work. - if (!HasGotOfst) - return ResNode; - SDValue GALo = DAG.getTargetGlobalAddress(GV, dl, ValTy, 0, - HasMips64 ? 
MipsII::MO_GOT_OFST : - MipsII::MO_ABS_LO); - SDValue Lo = DAG.getNode(MipsISD::Lo, dl, ValTy, GALo); - return DAG.getNode(ISD::ADD, dl, ValTy, ResNode, Lo); + return getAddrNonPIC(Op, DAG); + } + + if (GV->hasInternalLinkage() || (GV->hasLocalLinkage() && !isa<Function>(GV))) + return getAddrLocal(Op, DAG, HasMips64); + + if (LargeGOT) + return getAddrGlobalLargeGOT(Op, DAG, MipsII::MO_GOT_HI16, + MipsII::MO_GOT_LO16); + + return getAddrGlobal(Op, DAG, + HasMips64 ? MipsII::MO_GOT_DISP : MipsII::MO_GOT16); } SDValue MipsTargetLowering::LowerBlockAddress(SDValue Op, SelectionDAG &DAG) const { - const BlockAddress *BA = cast<BlockAddressSDNode>(Op)->getBlockAddress(); - // FIXME there isn't actually debug info here - DebugLoc dl = Op.getDebugLoc(); + if (getTargetMachine().getRelocationModel() != Reloc::PIC_ && !IsN64) + return getAddrNonPIC(Op, DAG); - if (getTargetMachine().getRelocationModel() != Reloc::PIC_ && !IsN64) { - // %hi/%lo relocation - SDValue BAHi = - DAG.getTargetBlockAddress(BA, MVT::i32, 0, MipsII::MO_ABS_HI); - SDValue BALo = - DAG.getTargetBlockAddress(BA, MVT::i32, 0, MipsII::MO_ABS_LO); - SDValue Hi = DAG.getNode(MipsISD::Hi, dl, MVT::i32, BAHi); - SDValue Lo = DAG.getNode(MipsISD::Lo, dl, MVT::i32, BALo); - return DAG.getNode(ISD::ADD, dl, MVT::i32, Hi, Lo); - } - - EVT ValTy = Op.getValueType(); - unsigned GOTFlag = HasMips64 ? MipsII::MO_GOT_PAGE : MipsII::MO_GOT; - unsigned OFSTFlag = HasMips64 ? 
MipsII::MO_GOT_OFST : MipsII::MO_ABS_LO; - SDValue BAGOTOffset = DAG.getTargetBlockAddress(BA, ValTy, 0, GOTFlag); - BAGOTOffset = DAG.getNode(MipsISD::Wrapper, dl, ValTy, - GetGlobalReg(DAG, ValTy), BAGOTOffset); - SDValue BALOOffset = DAG.getTargetBlockAddress(BA, ValTy, 0, OFSTFlag); - SDValue Load = DAG.getLoad(ValTy, dl, DAG.getEntryNode(), BAGOTOffset, - MachinePointerInfo(), false, false, false, 0); - SDValue Lo = DAG.getNode(MipsISD::Lo, dl, ValTy, BALOOffset); - return DAG.getNode(ISD::ADD, dl, ValTy, Load, Lo); + return getAddrLocal(Op, DAG, HasMips64); } SDValue MipsTargetLowering:: @@ -1901,41 +1932,15 @@ LowerGlobalTLSAddress(SDValue Op, SelectionDAG &DAG) const SDValue MipsTargetLowering:: LowerJumpTable(SDValue Op, SelectionDAG &DAG) const { - SDValue HiPart, JTI, JTILo; - // FIXME there isn't actually debug info here - DebugLoc dl = Op.getDebugLoc(); - bool IsPIC = getTargetMachine().getRelocationModel() == Reloc::PIC_; - EVT PtrVT = Op.getValueType(); - JumpTableSDNode *JT = cast<JumpTableSDNode>(Op); - - if (!IsPIC && !IsN64) { - JTI = DAG.getTargetJumpTable(JT->getIndex(), PtrVT, MipsII::MO_ABS_HI); - HiPart = DAG.getNode(MipsISD::Hi, dl, PtrVT, JTI); - JTILo = DAG.getTargetJumpTable(JT->getIndex(), PtrVT, MipsII::MO_ABS_LO); - } else {// Emit Load from Global Pointer - unsigned GOTFlag = HasMips64 ? MipsII::MO_GOT_PAGE : MipsII::MO_GOT; - unsigned OfstFlag = HasMips64 ? 
MipsII::MO_GOT_OFST : MipsII::MO_ABS_LO; - JTI = DAG.getTargetJumpTable(JT->getIndex(), PtrVT, GOTFlag); - JTI = DAG.getNode(MipsISD::Wrapper, dl, PtrVT, GetGlobalReg(DAG, PtrVT), - JTI); - HiPart = DAG.getLoad(PtrVT, dl, DAG.getEntryNode(), JTI, - MachinePointerInfo(), false, false, false, 0); - JTILo = DAG.getTargetJumpTable(JT->getIndex(), PtrVT, OfstFlag); - } + if (getTargetMachine().getRelocationModel() != Reloc::PIC_ && !IsN64) + return getAddrNonPIC(Op, DAG); - SDValue Lo = DAG.getNode(MipsISD::Lo, dl, PtrVT, JTILo); - return DAG.getNode(ISD::ADD, dl, PtrVT, HiPart, Lo); + return getAddrLocal(Op, DAG, HasMips64); } SDValue MipsTargetLowering:: LowerConstantPool(SDValue Op, SelectionDAG &DAG) const { - SDValue ResNode; - ConstantPoolSDNode *N = cast<ConstantPoolSDNode>(Op); - const Constant *C = N->getConstVal(); - // FIXME there isn't actually debug info here - DebugLoc dl = Op.getDebugLoc(); - // gp_rel relocation // FIXME: we should reference the constant pool using small data sections, // but the asm printer currently doesn't support this feature without @@ -1946,31 +1951,10 @@ LowerConstantPool(SDValue Op, SelectionDAG &DAG) const // SDValue GOT = DAG.getGLOBAL_OFFSET_TABLE(MVT::i32); // ResNode = DAG.getNode(ISD::ADD, MVT::i32, GOT, GPRelNode); - if (getTargetMachine().getRelocationModel() != Reloc::PIC_ && !IsN64) { - SDValue CPHi = DAG.getTargetConstantPool(C, MVT::i32, N->getAlignment(), - N->getOffset(), MipsII::MO_ABS_HI); - SDValue CPLo = DAG.getTargetConstantPool(C, MVT::i32, N->getAlignment(), - N->getOffset(), MipsII::MO_ABS_LO); - SDValue HiPart = DAG.getNode(MipsISD::Hi, dl, MVT::i32, CPHi); - SDValue Lo = DAG.getNode(MipsISD::Lo, dl, MVT::i32, CPLo); - ResNode = DAG.getNode(ISD::ADD, dl, MVT::i32, HiPart, Lo); - } else { - EVT ValTy = Op.getValueType(); - unsigned GOTFlag = HasMips64 ? MipsII::MO_GOT_PAGE : MipsII::MO_GOT; - unsigned OFSTFlag = HasMips64 ? 
MipsII::MO_GOT_OFST : MipsII::MO_ABS_LO; - SDValue CP = DAG.getTargetConstantPool(C, ValTy, N->getAlignment(), - N->getOffset(), GOTFlag); - CP = DAG.getNode(MipsISD::Wrapper, dl, ValTy, GetGlobalReg(DAG, ValTy), CP); - SDValue Load = DAG.getLoad(ValTy, dl, DAG.getEntryNode(), CP, - MachinePointerInfo::getConstantPool(), false, - false, false, 0); - SDValue CPLo = DAG.getTargetConstantPool(C, ValTy, N->getAlignment(), - N->getOffset(), OFSTFlag); - SDValue Lo = DAG.getNode(MipsISD::Lo, dl, ValTy, CPLo); - ResNode = DAG.getNode(ISD::ADD, dl, ValTy, Load, Lo); - } + if (getTargetMachine().getRelocationModel() != Reloc::PIC_ && !IsN64) + return getAddrNonPIC(Op, DAG); - return ResNode; + return getAddrLocal(Op, DAG, HasMips64); } SDValue MipsTargetLowering::LowerVASTART(SDValue Op, SelectionDAG &DAG) const { @@ -2862,60 +2846,41 @@ MipsTargetLowering::LowerCall(TargetLowering::CallLoweringInfo &CLI, // If the callee is a GlobalAddress/ExternalSymbol node (quite common, every // direct call is) turn it into a TargetGlobalAddress/TargetExternalSymbol // node so that legalize doesn't hack it. - unsigned char OpFlag; bool IsPICCall = (IsN64 || IsPIC); // true if calls are translated to jalr $25 bool GlobalOrExternal = false; SDValue CalleeLo; if (GlobalAddressSDNode *G = dyn_cast<GlobalAddressSDNode>(Callee)) { - if (IsPICCall && G->getGlobal()->hasInternalLinkage()) { - OpFlag = IsO32 ? MipsII::MO_GOT : MipsII::MO_GOT_PAGE; - unsigned char LoFlag = IsO32 ? 
MipsII::MO_ABS_LO : MipsII::MO_GOT_OFST; + if (IsPICCall) { + if (G->getGlobal()->hasInternalLinkage()) + Callee = getAddrLocal(Callee, DAG, HasMips64); + else if (LargeGOT) + Callee = getAddrGlobalLargeGOT(Callee, DAG, MipsII::MO_CALL_HI16, + MipsII::MO_CALL_LO16); + else + Callee = getAddrGlobal(Callee, DAG, MipsII::MO_GOT_CALL); + } else Callee = DAG.getTargetGlobalAddress(G->getGlobal(), dl, getPointerTy(), 0, - OpFlag); - CalleeLo = DAG.getTargetGlobalAddress(G->getGlobal(), dl, getPointerTy(), - 0, LoFlag); - } else { - OpFlag = IsPICCall ? MipsII::MO_GOT_CALL : MipsII::MO_NO_FLAG; - Callee = DAG.getTargetGlobalAddress(G->getGlobal(), dl, - getPointerTy(), 0, OpFlag); - } - + MipsII::MO_NO_FLAG); GlobalOrExternal = true; } else if (ExternalSymbolSDNode *S = dyn_cast<ExternalSymbolSDNode>(Callee)) { - if (IsN64 || (!IsO32 && IsPIC)) - OpFlag = MipsII::MO_GOT_DISP; - else if (!IsPIC) // !N64 && static - OpFlag = MipsII::MO_NO_FLAG; + if (!IsN64 && !IsPIC) // !N64 && static + Callee = DAG.getTargetExternalSymbol(S->getSymbol(), getPointerTy(), + MipsII::MO_NO_FLAG); + else if (LargeGOT) + Callee = getAddrGlobalLargeGOT(Callee, DAG, MipsII::MO_CALL_HI16, + MipsII::MO_CALL_LO16); + else if (HasMips64) + Callee = getAddrGlobal(Callee, DAG, MipsII::MO_GOT_DISP); else // O32 & PIC - OpFlag = MipsII::MO_GOT_CALL; - Callee = DAG.getTargetExternalSymbol(S->getSymbol(), getPointerTy(), - OpFlag); + Callee = getAddrGlobal(Callee, DAG, MipsII::MO_GOT_CALL); + GlobalOrExternal = true; } SDValue InFlag; - // Create nodes that load address of callee and copy it to T9 - if (IsPICCall) { - if (GlobalOrExternal) { - // Load callee address - Callee = DAG.getNode(MipsISD::Wrapper, dl, getPointerTy(), - GetGlobalReg(DAG, getPointerTy()), Callee); - SDValue LoadValue = DAG.getLoad(getPointerTy(), dl, DAG.getEntryNode(), - Callee, MachinePointerInfo::getGOT(), - false, false, false, 0); - - // Use GOT+LO if callee has internal linkage. 
- if (CalleeLo.getNode()) { - SDValue Lo = DAG.getNode(MipsISD::Lo, dl, getPointerTy(), CalleeLo); - Callee = DAG.getNode(ISD::ADD, dl, getPointerTy(), LoadValue, Lo); - } else - Callee = LoadValue; - } - } - // T9 register operand. SDValue T9; diff --git a/lib/Target/Mips/MipsInstrInfo.td b/lib/Target/Mips/MipsInstrInfo.td index f16b5f9ee7ff..aa8881997285 100644 --- a/lib/Target/Mips/MipsInstrInfo.td +++ b/lib/Target/Mips/MipsInstrInfo.td @@ -1154,12 +1154,14 @@ def : MipsPat<(MipsHi tblockaddress:$in), (LUi tblockaddress:$in)>; def : MipsPat<(MipsHi tjumptable:$in), (LUi tjumptable:$in)>; def : MipsPat<(MipsHi tconstpool:$in), (LUi tconstpool:$in)>; def : MipsPat<(MipsHi tglobaltlsaddr:$in), (LUi tglobaltlsaddr:$in)>; +def : MipsPat<(MipsHi texternalsym:$in), (LUi texternalsym:$in)>; def : MipsPat<(MipsLo tglobaladdr:$in), (ADDiu ZERO, tglobaladdr:$in)>; def : MipsPat<(MipsLo tblockaddress:$in), (ADDiu ZERO, tblockaddress:$in)>; def : MipsPat<(MipsLo tjumptable:$in), (ADDiu ZERO, tjumptable:$in)>; def : MipsPat<(MipsLo tconstpool:$in), (ADDiu ZERO, tconstpool:$in)>; def : MipsPat<(MipsLo tglobaltlsaddr:$in), (ADDiu ZERO, tglobaltlsaddr:$in)>; +def : MipsPat<(MipsLo texternalsym:$in), (ADDiu ZERO, texternalsym:$in)>; def : MipsPat<(add CPURegs:$hi, (MipsLo tglobaladdr:$lo)), (ADDiu CPURegs:$hi, tglobaladdr:$lo)>; diff --git a/lib/Target/Mips/MipsJITInfo.cpp b/lib/Target/Mips/MipsJITInfo.cpp index 052046a8a45d..da1119df8f9f 100644 --- a/lib/Target/Mips/MipsJITInfo.cpp +++ b/lib/Target/Mips/MipsJITInfo.cpp @@ -222,10 +222,17 @@ void *MipsJITInfo::emitFunctionStub(const Function *F, void *Fn, // addiu t9, t9, %lo(EmittedAddr) // jalr t8, t9 // nop - JCE.emitWordLE(0xf << 26 | 25 << 16 | Hi); - JCE.emitWordLE(9 << 26 | 25 << 21 | 25 << 16 | Lo); - JCE.emitWordLE(25 << 21 | 24 << 11 | 9); - JCE.emitWordLE(0); + if (IsLittleEndian) { + JCE.emitWordLE(0xf << 26 | 25 << 16 | Hi); + JCE.emitWordLE(9 << 26 | 25 << 21 | 25 << 16 | Lo); + JCE.emitWordLE(25 << 21 | 24 << 11 | 
9); + JCE.emitWordLE(0); + } else { + JCE.emitWordBE(0xf << 26 | 25 << 16 | Hi); + JCE.emitWordBE(9 << 26 | 25 << 21 | 25 << 16 | Lo); + JCE.emitWordBE(25 << 21 | 24 << 11 | 9); + JCE.emitWordBE(0); + } sys::Memory::InvalidateInstructionCache(Addr, 16); if (!sys::Memory::setRangeExecutable(Addr, 16)) diff --git a/lib/Target/Mips/MipsJITInfo.h b/lib/Target/Mips/MipsJITInfo.h index 637a31866034..ecda3101a003 100644 --- a/lib/Target/Mips/MipsJITInfo.h +++ b/lib/Target/Mips/MipsJITInfo.h @@ -26,10 +26,11 @@ class MipsTargetMachine; class MipsJITInfo : public TargetJITInfo { bool IsPIC; + bool IsLittleEndian; public: explicit MipsJITInfo() : - IsPIC(false) {} + IsPIC(false), IsLittleEndian(true) {} /// replaceMachineCodeForFunction - Make it so that calling the function /// whose machine code is at OLD turns into a call to NEW, perhaps by @@ -58,8 +59,10 @@ class MipsJITInfo : public TargetJITInfo { unsigned NumRelocs, unsigned char *GOTBase); /// Initialize - Initialize internal stage for the function being JITted. 
- void Initialize(const MachineFunction &MF, bool isPIC) { + void Initialize(const MachineFunction &MF, bool isPIC, + bool isLittleEndian) { IsPIC = isPIC; + IsLittleEndian = isLittleEndian; } }; diff --git a/lib/Target/Mips/MipsMCInstLower.cpp b/lib/Target/Mips/MipsMCInstLower.cpp index 5fa633933838..4162f981d1df 100644 --- a/lib/Target/Mips/MipsMCInstLower.cpp +++ b/lib/Target/Mips/MipsMCInstLower.cpp @@ -62,6 +62,10 @@ MCOperand MipsMCInstLower::LowerSymbolOperand(const MachineOperand &MO, case MipsII::MO_GOT_OFST: Kind = MCSymbolRefExpr::VK_Mips_GOT_OFST; break; case MipsII::MO_HIGHER: Kind = MCSymbolRefExpr::VK_Mips_HIGHER; break; case MipsII::MO_HIGHEST: Kind = MCSymbolRefExpr::VK_Mips_HIGHEST; break; + case MipsII::MO_GOT_HI16: Kind = MCSymbolRefExpr::VK_Mips_GOT_HI16; break; + case MipsII::MO_GOT_LO16: Kind = MCSymbolRefExpr::VK_Mips_GOT_LO16; break; + case MipsII::MO_CALL_HI16: Kind = MCSymbolRefExpr::VK_Mips_CALL_HI16; break; + case MipsII::MO_CALL_LO16: Kind = MCSymbolRefExpr::VK_Mips_CALL_LO16; break; } switch (MOTy) { diff --git a/lib/Transforms/Scalar/SROA.cpp b/lib/Transforms/Scalar/SROA.cpp index ccc2f7a77b3c..2d518f735be0 100644 --- a/lib/Transforms/Scalar/SROA.cpp +++ b/lib/Transforms/Scalar/SROA.cpp @@ -2160,6 +2160,9 @@ static bool isIntegerWideningViable(const DataLayout &TD, AllocaPartitioning::const_use_iterator I, AllocaPartitioning::const_use_iterator E) { uint64_t SizeInBits = TD.getTypeSizeInBits(AllocaTy); + // Don't create integer types larger than the maximum bitwidth. + if (SizeInBits > IntegerType::MAX_INT_BITS) + return false; // Don't try to handle allocas with bit-padding. 
if (SizeInBits != TD.getTypeStoreSizeInBits(AllocaTy)) @@ -2198,7 +2201,7 @@ static bool isIntegerWideningViable(const DataLayout &TD, if (RelBegin == 0 && RelEnd == Size) WholeAllocaOp = true; if (IntegerType *ITy = dyn_cast<IntegerType>(LI->getType())) { - if (ITy->getBitWidth() < TD.getTypeStoreSize(ITy)) + if (ITy->getBitWidth() < TD.getTypeStoreSizeInBits(ITy)) return false; continue; } @@ -2214,7 +2217,7 @@ static bool isIntegerWideningViable(const DataLayout &TD, if (RelBegin == 0 && RelEnd == Size) WholeAllocaOp = true; if (IntegerType *ITy = dyn_cast<IntegerType>(ValueTy)) { - if (ITy->getBitWidth() < TD.getTypeStoreSize(ITy)) + if (ITy->getBitWidth() < TD.getTypeStoreSizeInBits(ITy)) return false; continue; } diff --git a/test/CodeGen/Mips/biggot.ll b/test/CodeGen/Mips/biggot.ll new file mode 100644 index 000000000000..c4ad851c8258 --- /dev/null +++ b/test/CodeGen/Mips/biggot.ll @@ -0,0 +1,50 @@ +; RUN: llc -march=mipsel -mxgot < %s | FileCheck %s -check-prefix=O32 +; RUN: llc -march=mips64el -mcpu=mips64r2 -mattr=+n64 -mxgot < %s | \ +; RUN: FileCheck %s -check-prefix=N64 + +@v0 = external global i32 + +define void @foo1() nounwind { +entry: +; O32: lui $[[R0:[0-9]+]], %got_hi(v0) +; O32: addu $[[R1:[0-9]+]], $[[R0]], ${{[a-z0-9]+}} +; O32: lw ${{[0-9]+}}, %got_lo(v0)($[[R1]]) +; O32: lui $[[R2:[0-9]+]], %call_hi(foo0) +; O32: addu $[[R3:[0-9]+]], $[[R2]], ${{[a-z0-9]+}} +; O32: lw ${{[0-9]+}}, %call_lo(foo0)($[[R3]]) + +; N64: lui $[[R0:[0-9]+]], %got_hi(v0) +; N64: daddu $[[R1:[0-9]+]], $[[R0]], ${{[a-z0-9]+}} +; N64: ld ${{[0-9]+}}, %got_lo(v0)($[[R1]]) +; N64: lui $[[R2:[0-9]+]], %call_hi(foo0) +; N64: daddu $[[R3:[0-9]+]], $[[R2]], ${{[a-z0-9]+}} +; N64: ld ${{[0-9]+}}, %call_lo(foo0)($[[R3]]) + + %0 = load i32* @v0, align 4 + tail call void @foo0(i32 %0) nounwind + ret void +} + +declare void @foo0(i32) + +; call to external function. 
+ +define void @foo2(i32* nocapture %d, i32* nocapture %s, i32 %n) nounwind { +entry: +; O32: foo2: +; O32: lui $[[R2:[0-9]+]], %call_hi(memcpy) +; O32: addu $[[R3:[0-9]+]], $[[R2]], ${{[a-z0-9]+}} +; O32: lw ${{[0-9]+}}, %call_lo(memcpy)($[[R3]]) + +; N64: foo2: +; N64: lui $[[R2:[0-9]+]], %call_hi(memcpy) +; N64: daddu $[[R3:[0-9]+]], $[[R2]], ${{[a-z0-9]+}} +; N64: ld ${{[0-9]+}}, %call_lo(memcpy)($[[R3]]) + + %0 = bitcast i32* %d to i8* + %1 = bitcast i32* %s to i8* + tail call void @llvm.memcpy.p0i8.p0i8.i32(i8* %0, i8* %1, i32 %n, i32 4, i1 false) + ret void +} + +declare void @llvm.memcpy.p0i8.p0i8.i32(i8* nocapture, i8* nocapture, i32, i32, i1) nounwind diff --git a/test/MC/Mips/xgot.ll b/test/MC/Mips/xgot.ll new file mode 100644 index 000000000000..bfe9b9ad6604 --- /dev/null +++ b/test/MC/Mips/xgot.ll @@ -0,0 +1,42 @@ +; RUN: llc -filetype=obj -mtriple mipsel-unknown-linux -mxgot %s -o - | elf-dump --dump-section-data | FileCheck %s + +@.str = private unnamed_addr constant [16 x i8] c"ext_1=%d, i=%d\0A\00", align 1 +@ext_1 = external global i32 + +define void @fill() nounwind { +entry: + +; Check that the appropriate relocations were created. +; For the xgot case we want to see R_MIPS_[GOT|CALL]_[HI|LO]16. + +; R_MIPS_HI16 +; CHECK: ('r_type', 0x05) + +; R_MIPS_LO16 +; CHECK: ('r_type', 0x06) + +; R_MIPS_GOT_HI16 +; CHECK: ('r_type', 0x16) + +; R_MIPS_GOT_LO16 +; CHECK: ('r_type', 0x17) + +; R_MIPS_GOT +; CHECK: ('r_type', 0x09) + +; R_MIPS_LO16 +; CHECK: ('r_type', 0x06) + +; R_MIPS_CALL_HI16 +; CHECK: ('r_type', 0x1e) + +; R_MIPS_CALL_LO16 +; CHECK: ('r_type', 0x1f) + + %0 = load i32* @ext_1, align 4 + %call = tail call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([16 x i8]* @.str, i32 0, i32 0), i32 %0) nounwind + ret void +} + +declare i32 @printf(i8* nocapture, ...) 
nounwind + diff --git a/test/Transforms/SROA/basictest.ll b/test/Transforms/SROA/basictest.ll index b363eefb3f9d..9fe926ee2cc1 100644 --- a/test/Transforms/SROA/basictest.ll +++ b/test/Transforms/SROA/basictest.ll @@ -1134,3 +1134,45 @@ entry: ret void ; CHECK: ret } + +define void @PR14465() { +; Ensure that we don't crash when analyzing a alloca larger than the maximum +; integer type width (MAX_INT_BITS) supported by llvm (1048576*32 > (1<<23)-1). +; CHECK: @PR14465 + + %stack = alloca [1048576 x i32], align 16 +; CHECK: alloca [1048576 x i32] + %cast = bitcast [1048576 x i32]* %stack to i8* + call void @llvm.memset.p0i8.i64(i8* %cast, i8 -2, i64 4194304, i32 16, i1 false) + ret void +; CHECK: ret +} + +define void @PR14548(i1 %x) { +; Handle a mixture of i1 and i8 loads and stores to allocas. This particular +; pattern caused crashes and invalid output in the PR, and its nature will +; trigger a mixture in several permutations as we resolve each alloca +; iteratively. +; Note that we don't do a particularly good *job* of handling these mixtures, +; but the hope is that this is very rare. +; CHECK: @PR14548 + +entry: + %a = alloca <{ i1 }>, align 8 + %b = alloca <{ i1 }>, align 8 +; Nothing of interest is simplified here. 
+; CHECK: alloca +; CHECK: alloca + + %b.i1 = bitcast <{ i1 }>* %b to i1* + store i1 %x, i1* %b.i1, align 8 + %b.i8 = bitcast <{ i1 }>* %b to i8* + %foo = load i8* %b.i8, align 1 + + %a.i8 = bitcast <{ i1 }>* %a to i8* + call void @llvm.memcpy.p0i8.p0i8.i32(i8* %a.i8, i8* %b.i8, i32 1, i32 1, i1 false) nounwind + %bar = load i8* %a.i8, align 1 + %a.i1 = getelementptr inbounds <{ i1 }>* %a, i32 0, i32 0 + %baz = load i1* %a.i1, align 1 + ret void +} diff --git a/test/Transforms/SROA/big-endian.ll b/test/Transforms/SROA/big-endian.ll index ce82d1f30b57..1ac6d25d6341 100644 --- a/test/Transforms/SROA/big-endian.ll +++ b/test/Transforms/SROA/big-endian.ll @@ -82,14 +82,9 @@ entry: %a0i16ptr = bitcast i8* %a0ptr to i16* store i16 1, i16* %a0i16ptr -; CHECK: %[[mask0:.*]] = and i16 1, -16 - - %a1i4ptr = bitcast i8* %a1ptr to i4* - store i4 1, i4* %a1i4ptr -; CHECK-NEXT: %[[insert0:.*]] = or i16 %[[mask0]], 1 store i8 1, i8* %a2ptr -; CHECK-NEXT: %[[mask1:.*]] = and i40 undef, 4294967295 +; CHECK: %[[mask1:.*]] = and i40 undef, 4294967295 ; CHECK-NEXT: %[[insert1:.*]] = or i40 %[[mask1]], 4294967296 %a3i24ptr = bitcast i8* %a3ptr to i24* @@ -110,7 +105,7 @@ entry: %ai = load i56* %aiptr %ret = zext i56 %ai to i64 ret i64 %ret -; CHECK-NEXT: %[[ext4:.*]] = zext i16 %[[insert0]] to i56 +; CHECK-NEXT: %[[ext4:.*]] = zext i16 1 to i56 ; CHECK-NEXT: %[[shift4:.*]] = shl i56 %[[ext4]], 40 ; CHECK-NEXT: %[[mask4:.*]] = and i56 %[[insert3]], 1099511627775 ; CHECK-NEXT: %[[insert4:.*]] = or i56 %[[mask4]], %[[shift4]] |
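Reviewer note on the `lib/Transforms/Scalar/SROA.cpp` hunks in this commit: the two changes to `isIntegerWideningViable` are easier to follow with the arithmetic spelled out. The sketch below is not LLVM code. `storeSizeInBytes`, `storeSizeInBits`, `hasPaddingOld`, `hasPaddingNew`, `fitsInIntegerType` and `kMaxIntBits` are hypothetical stand-ins, assuming a type's store size is its bit width rounded up to whole bytes and that the maximum integer width is `(1<<23)-1` bits, as the PR14465 test comment states.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the DataLayout queries used in the patch.
// Assumption: store size is the bit width rounded up to whole bytes.
constexpr uint64_t storeSizeInBytes(uint64_t bitWidth) {
  return (bitWidth + 7) / 8;
}
constexpr uint64_t storeSizeInBits(uint64_t bitWidth) {
  return storeSizeInBytes(bitWidth) * 8;
}

// Pre-patch shape of the check: a width in bits compared against a size in
// bytes. For any width >= 1 this is never true, so padded types slip through.
constexpr bool hasPaddingOld(uint64_t bitWidth) {
  return bitWidth < storeSizeInBytes(bitWidth);
}

// Post-patch shape: both sides are in bits, so i1, i4, etc. are detected.
constexpr bool hasPaddingNew(uint64_t bitWidth) {
  return bitWidth < storeSizeInBits(bitWidth);
}

// Value taken from the PR14465 test comment: (1<<23)-1.
constexpr uint64_t kMaxIntBits = (1u << 23) - 1;

// Mirrors the new early exit: refuse to widen an alloca to one integer
// when the integer would exceed the maximum supported bit width.
constexpr bool fitsInIntegerType(uint64_t sizeInBits) {
  return sizeInBits <= kMaxIntBits;
}
```

Under these assumptions, `hasPaddingOld(1)` is false while `hasPaddingNew(1)` and `hasPaddingNew(4)` are true, which matches the intent of switching to `getTypeStoreSizeInBits`. Likewise `fitsInIntegerType(1048576ULL * 32)` is false, which is the case the new `MAX_INT_BITS` guard rejects for the `[1048576 x i32]` alloca in the PR14465 test.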