src - FreeBSD source tree

author	Dimitry Andric <dim@FreeBSD.org>	2012-12-22 14:58:30 +0000
committer	Dimitry Andric <dim@FreeBSD.org>	2012-12-22 14:58:30 +0000
commit	482e7bddf617ae804dc47133cb07eb4aa81e45de (patch)
tree	c074bb56c422dea536a85cc2d80fd620bb6af08e
parent	522600a229b950314b5f4af84eba4f3e8a0ffea1 (diff)

Vendor import of llvm tags/RELEASE_32/final r170710 (effectively, 3.2
release): http://llvm.org/svn/llvm-project/llvm/tags/RELEASE_32/final@170710

Notes

svn path=/vendor/llvm/dist/; revision=244590
svn path=/vendor/llvm/llvm-release_32-r170710/; revision=244591; tag=vendor/llvm/llvm-release_32-r170710

Diffstat

-rw-r--r--	docs/ReleaseNotes.html	288
-rw-r--r--	include/llvm/MC/MCExpr.h
-rw-r--r--	lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp
-rw-r--r--	lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp
-rw-r--r--	lib/ExecutionEngine/RuntimeDyld/RuntimeDyldImpl.h
-rw-r--r--	lib/MC/MCExpr.cpp
-rw-r--r--	lib/Target/Mips/InstPrinter/MipsInstPrinter.cpp
-rw-r--r--	lib/Target/Mips/MCTargetDesc/MipsAsmBackend.cpp
-rw-r--r--	lib/Target/Mips/MCTargetDesc/MipsBaseInfo.h
-rw-r--r--	lib/Target/Mips/MCTargetDesc/MipsELFObjectWriter.cpp
-rw-r--r--	lib/Target/Mips/MCTargetDesc/MipsFixupKinds.h
-rw-r--r--	lib/Target/Mips/MCTargetDesc/MipsMCCodeEmitter.cpp
-rw-r--r--	lib/Target/Mips/Mips64InstrInfo.td
-rw-r--r--	lib/Target/Mips/MipsCodeEmitter.cpp	116
-rw-r--r--	lib/Target/Mips/MipsISelLowering.cpp	261
-rw-r--r--	lib/Target/Mips/MipsInstrInfo.td
-rw-r--r--	lib/Target/Mips/MipsJITInfo.cpp
-rw-r--r--	lib/Target/Mips/MipsJITInfo.h
-rw-r--r--	lib/Target/Mips/MipsMCInstLower.cpp
-rw-r--r--	lib/Transforms/Scalar/SROA.cpp
-rw-r--r--	test/CodeGen/Mips/biggot.ll
-rw-r--r--	test/MC/Mips/xgot.ll
-rw-r--r--	test/Transforms/SROA/basictest.ll
-rw-r--r--	test/Transforms/SROA/big-endian.ll

24 files changed, 565 insertions, 355 deletions

diff --git a/docs/ReleaseNotes.html b/docs/ReleaseNotes.html
index a4b1d580b637..a4c5960c1555 100644
--- a/docs/ReleaseNotes.html
+++ b/docs/ReleaseNotes.html

@@ -29,12 +29,6 @@

<p>Written by the <a href="http://llvm.org/">LLVM Team</a></p>

</div>

-<h1 style="color:red">These are in-progress notes for the upcoming LLVM 3.2

-release.<br>

-You may prefer the

-<a href="http://llvm.org/releases/3.1/docs/ReleaseNotes.html">LLVM 3.1

-Release Notes</a>.</h1>

<h2>

<a name="intro">Introduction</a>

@@ -46,7 +40,7 @@ Release Notes</a>.</h1>

<p>This document contains the release notes for the LLVM Compiler

Infrastructure, release 3.2. Here we describe the status of LLVM, including

major improvements from the previous release, improvements in various

- subprojects of LLVM, and some of the current users of the code. All LLVM

+ sub-projects of LLVM, and some of the current users of the code. All LLVM

releases may be downloaded from the <a href="http://llvm.org/releases/">LLVM

releases web site</a>.</p>

@@ -72,11 +66,12 @@ Release Notes</a>.</h1>

<div>

-<p>The LLVM 3.2 distribution currently consists of code from the core LLVM

- repository, which roughly includes the LLVM optimizers, code generators and

- supporting tools, and the Clang repository. In addition to this code, the

- LLVM Project includes other sub-projects that are in development. Here we

- include updates on these subprojects.</p>

+<p>The LLVM 3.2 distribution currently consists of production-quality code

+ from the core LLVM repository, which roughly includes the LLVM optimizers,

+ code generators and supporting tools, as well as Clang, DragonEgg and

+ compiler-rt sub-project repositories. In addition to this code, the LLVM

+ Project includes other sub-projects that are in development. Here we

+ include updates on these sub-projects.</p>

<h3>

@@ -90,18 +85,18 @@ Release Notes</a>.</h1>

experience through expressive diagnostics, a high level of conformance to

language standards, fast compilation, and low memory use. Like LLVM, Clang

provides a modular, library-based architecture that makes it suitable for

- creating or integrating with other development tools. Clang is considered a

- production-quality compiler for C, Objective-C, C++ and Objective-C++ on x86

- (32- and 64-bit), and for Darwin/ARM targets.</p>

+ creating or integrating with other development tools.</p>

<p>In the LLVM 3.2 time-frame, the Clang team has made many improvements.

Highlights include:</p>

<ul>

- <li>...</li>

+ <li>Improvements to Clang's diagnostics</li>

+ <li>Support for tls_model attribute</li>

+ <li>Type safety attributes</li>

</ul>

<p>For more details about the changes to Clang since the 3.1 release, see the

- <a href="http://clang.llvm.org/docs/ReleaseNotes.html">Clang release

+ <a href="http://llvm.org/releases/3.2/tools/clang/docs/ReleaseNotes.html">Clang 3.2 release

notes.</a></p>

<p>If Clang rejects your code but another compiler accepts it, please take a

@@ -129,7 +124,10 @@ Release Notes</a>.</h1>

<p>The 3.2 release has the following notable changes:</p>

<ul>

- <li>...</li>

+ <li>Able to load LLVM plugins such as Polly.</li>

+ <li>Supports thread-local storage models.</li>

+ <li>Passes knowledge of variable lifetimes to the LLVM optimizers.</li>

+ <li>No longer requires GCC to be built with LTO support.</li>

</ul>

</div>

@@ -141,7 +139,8 @@ Release Notes</a>.</h1>

<div>

-<p>The new LLVM <a href="http://compiler-rt.llvm.org/">compiler-rt project</a>

+<p>The LLVM <a href="http://compiler-rt.llvm.org/">compiler-rt project</a>

is a simple library that provides an implementation of the low-level

target-specific hooks required by code generation and other runtime

components. For example, when compiling for a 32-bit target, converting a

@@ -153,7 +152,11 @@ Release Notes</a>.</h1>

<p>The 3.2 release has the following notable changes:</p>

<ul>

- <li>...</li>

+ <li><a href="http://llvm.org/releases/3.2/tools/clang/docs/ThreadSanitizer.html">ThreadSanitizer (TSan)</a> - data race detector run-time library for C/C++ has been added.</li>

+ <li>Improvements to <a href="http://llvm.org/releases/3.2/tools/clang/docs/AddressSanitizer.html">AddressSanitizer</a> including: better portability

+ (OSX, Android NDK), support for cmake based builds, enhanced error reporting and lots of bug fixes.</li>

+ <li>Added support for A6 'Swift' CPU.</li>

+ <li><code>divsi3</code> function has been enhanced to take advantage of a hardware unsigned divide when it is available.</li>

</ul>

</div>

@@ -174,7 +177,9 @@ Release Notes</a>.</h1>

<p>The 3.2 release has the following notable changes:</p>

<ul>

- <li>...</li>

+ <li>Linux build fixes for clang (see <a href="http://lldb.llvm.org/build.html">Building LLDB</a>)</li>

+ <li>Some Linux stability and usability improvements</li>

+ <li>Switch expression evaluation to use MCJIT (from legacy JIT) on Linux</li>

</ul>

</div>

@@ -193,7 +198,15 @@ Release Notes</a>.</h1>

<p>Within the LLVM 3.2 time-frame there were the following highlights:</p>

<ul>

- <li>...</li>

+ <li> C++11 shared_ptr atomic access API (20.7.2.5) has been implemented.</li>

+ <li>Applied noexcept and constexpr throughout library.</li>

+ <li>Improved C++11 conformance in associative container emplace.</li>

+ <li>Performance improvements in: std::rotate algorithm and I/O.</li>

+ <li>Operator new/delete and type_infos for exception types moved from libc++ to libc++abi.</li>

+ <li>Bug fixes in: <code><atomic></code>; vector<code><bool></code> algorithms,

+ <code><future></code>,<code><tuple></code>,

+ <code><type_traits></code>,<code><fstream></code>,<code><istream></code>,

+ <code><iterator></code>, <code><condition_variable></code>,<code><complex></code> as well as visibility fixes.

</ul>

</div>

@@ -212,7 +225,7 @@ Release Notes</a>.</h1>

<p>The 3.2 release has the following notable changes:</p>

<ul>

- <li>...</li>

+ <li>Bug fixes only, no functional changes.</li>

</ul>

</div>

@@ -227,16 +240,61 @@ Release Notes</a>.</h1>

<p><a href="http://polly.llvm.org/">Polly</a> is an <em>experimental</em>

optimizer for data locality and parallelism. It currently provides high-level

- loop optimizations and automatic parallelisation (using the OpenMP run time).

+ loop optimizations and automatic parallelization (using the OpenMP run time).

Work in the area of automatic SIMD and accelerator code generation was

started.</p>

<p>Within the LLVM 3.2 time-frame there were the following highlights:</p>

<ul>

- <li>...</li>

+ <li>isl, the integer set library used by Polly, was relicensed under the MIT license.</li>

+ <li>isl based code generation.</li>

+ <li>MIT licensed replacement for CLooG (LGPLv2).</li>

+ <li>Fine grained option handling (separation of core and border computations, control overhead vs. code size).</li>

+ <li>Support for FORTRAN and Dragonegg.</li>

+ <li>OpenMP code generation fixes.</li>

+</ul>

+</div>

+

+<h3>

+<a name="StaticAnalyzer">Clang Static Analyzer</a>

+</h3>

+<div>

+<p>The <a href="http://clang-analyzer.llvm.org/">Clang Static Analyzer</a>

+ is an advanced source code analysis tool integrated into Clang that performs

+ a deep analysis of code to find potential bugs.</p>

+<p>In the LLVM 3.2 release, the static analyzer has made significant improvements

+ in many areas, with notable highlights such as:</p>

+<ul>

+ <li>Improved interprocedural analysis within a translation unit (see details below), which greatly amplified the analyzer's ability to find bugs.</li>

+ <li>New infrastructure to model "well-known" APIs, allowing the analyzer to do a much better job when modeling calls to such functions.</li>

+ <li>Significant improvements to the APIs to write static analyzer checkers, with a more unified way of representing function/method calls in the checker API. Details can be found in the <a href="http://llvm.org/devmtg/2012-11#talk13">Building a Checker in 24 hours</a> talk.

+</ul>

+<p>The release specifically includes notable improvements for Objective-C analysis, including:</p>

+<ul>

+ <li>Interprocedural analysis for Objective-C methods.</li>

+ <li>Interprocedural analysis of calls to "blocks".</li>

+ <li>Precise modeling of GCD APIs such as <tt>dispatch_once</tt> and friends.</li>

+ <li>Improved support for recently added Objective-C constructs such as array and dictionary literals.</li>

+</ul>

+<p>The release specifically includes notable improvements for C++ analysis, including:</p>

+<ul>

+ <li>Interprocedural analysis for C++ methods (within a translation unit).</li>

+ <li>More precise modeling of C++ initializers and destructors.</li>

</ul>

+<p>Finally, this release includes many small improvements to <tt>scan-build</tt>, which can be used to drive the analyzer from the command line or a continuous integration system. This includes a directory-traversal issue, which could cause potential security problems in some cases. We would like to acknowledge Tim Brown of Portcullis Computer Security Ltd for reporting this issue.</p>

</div>

@@ -265,6 +323,19 @@ Release Notes</a>.</h1>

</div>

+<h3>EmbToolkit</h3>

+<div>

+<p><a href="http://www.embtoolkit.org/">EmbToolkit</a> provides Linux cross-compiler

+ toolchain/SDK (GCC/binutils/C library (uclibc,eglibc,musl)), a build system for

+ package cross-compilation and optionally various root file systems.

+ It supports ARM and MIPS. There is an ongoing effort to provide a clang+llvm

+ environment for the 3.2 releases,

+</p>

+</div>

<h3>FAUST</h3>

<div>

@@ -274,7 +345,7 @@ Release Notes</a>.</h1>

AUdio STream. Its programming model combines two approaches: functional

programming and block diagram composition. In addition with the C, C++, Java,

JavaScript output formats, the Faust compiler can generate LLVM bitcode, and

- works with LLVM 2.7-3.1.</p>

+ works with LLVM 2.7-3.2.</p>

</div>

@@ -331,7 +402,11 @@ Release Notes</a>.</h1>

<p>OSL was developed by Sony Pictures Imageworks for use in its in-house

renderer used for feature film animation and visual effects, and is

- distributed as open source software with the "New BSD" license.</p>

+ distributed as open source software with the "New BSD" license.

+ It has been used for all the shading on such films as The Amazing Spider-Man,

+ Men in Black III, Hotel Transylvania, and may other films in-progress,

+ and also has been incorporated into several commercial and open source

+ rendering products such as Blender, VRay, and Autodesk Beast.</p>

</div>

@@ -367,7 +442,7 @@ Release Notes</a>.</h1>

C++, Fortran and Faust code in Pure programs if the corresponding

LLVM-enabled compilers are installed).</p>

-<p>Pure version 0.54 has been tested and is known to work with LLVM 3.1 (and

+<p>Pure version 0.56 has been tested and is known to work with LLVM 3.2 (and

continues to work with older LLVM releases >= 2.5).</p>

</div>

@@ -432,7 +507,9 @@ Release Notes</a>.</h1>

<p>LLVM 3.2 includes several major changes and big features:</p>

<ul>

- <li>...</li>

+ <li>Loop Vectorizer.</li>

+ <li>New implementation of SROA.</li>

+ <li>New NVPTX back-end (replacing existing PTX back-end) based on NVIDIA sources.</li>

</ul>

</div>

@@ -451,7 +528,10 @@ Release Notes</a>.</h1>

<ul>

<li>Thread local variables may have a specified TLS model. See the

<a href="LangRef.html#globalvars">Language Reference Manual</a>.</li>

- <li>...</li>

+ <li>'TYPE_CODE_FUNCTION_OLD' type code and autoupgrade code for old function attributes format has been removed.</li>

+ <li>Internal representation of the Attributes class has been converted into a pointer to an

+ opaque object that's uniqued by and stored in the LLVMContext object.

+ The Attributes class then becomes a thin wrapper around this opaque object.</li>

</ul>

</div>

@@ -489,23 +569,33 @@ Release Notes</a>.</h1>

<ul>

<li>The inner most loops must have a single basic block.</li>

<li>The number of iterations are known before the loop starts to execute.</li>

- <li>The loop counter needs to be incrimented by one.</li>

+ <li>The loop counter needs to be incremented by one.</li>

<li>The loop trip count <b>can</b> be a variable.</li>

<li>Loops do <b>not</b> need to start at zero.</li>

<li>The induction variable can be used inside the loop.</li>

<li>Loop reductions are supported.</li>

<li>Arrays with affine access pattern do <b>not</b> need to be marked as 'noalias' and are checked at runtime.</li>

- <li>...</li>

</ul>

</p>

-<p>SROA - We've re-written SROA to be significantly more powerful.

-</p>

+<p>SROA - We’ve re-written SROA to be significantly more powerful and generate

+code which is much more friendly to the rest of the optimization pipeline.

+Previously this pass had scaling problems that required it to only operate on

+relatively small aggregates, and at times it would mistakenly replace a large

+aggregate with a single very large integer in order to make it a scalar SSA

+value. The result was a large number of i1024 and i2048 values representing any

+small stack buffer. These in turn slowed down many subsequent optimization

+paths.</p>

+<p>The new SROA pass uses a different algorithm that allows it to only promote to

+scalars the pieces of the aggregate actively in use. Because of this it doesn’t

+require any thresholds. It also always deduces the scalar values from the uses

+of the aggregate rather than the specific LLVM type of the aggregate. These

+features combine to both optimize more code with the pass but to improve the

+compile time of many functions dramatically.</p>

<ul>

- <li>Branch weight metadata is preseved through more of the optimizer.</li>

- <li>...</li>

+ <li>Branch weight metadata is preserved through more of the optimizer.</li>

</ul>

</div>

@@ -524,8 +614,19 @@ Release Notes</a>.</h1>

<a href="http://blog.llvm.org/2010/04/intro-to-llvm-mc-project.html">Intro

to the LLVM MC Project Blog Post</a>.</p>

-<ul>

- <li>...</li>

+<ul>

+ <li> Added support for following assembler directives: <code>.ifb</code>, <code>.ifnb</code>, <code>.ifc</code>,

+ <code>.ifnc</code>, <code>.purgem</code>, <code>.rept</code> and <code>.version</code> (ELF) as well as Darwin specific

+ <code>.pushsection</code>, <code>.popsection</code> and <code>.previous</code> .</li>

+ <li>Enhanced handling of <code>.lcomm directive</code>.</li>

+ <li>MS style inline assembler: added implementation of the offset and TYPE operators.</li>

+ <li>Targets can specify minimum supported NOP size for NOP padding.</li>

+ <li>ELF improvements: added support for generating ELF objects on Windows.</li>

+ <li>MachO improvements: symbol-difference variables are marked as N_ABS, added direct-to-object attribute for data-in-code markers.</li>

+ <li>Added support for annotated disassembly output for x86 and arm targets.</li>

+ <li>Arm support has been improved by adding support for ARM TARGET2 relocation

+ and fixing hadling of ARM-style "$d.*" labels.</li>

+ <li>Implemented local-exec TLS on PowerPC.</li>

</ul>

</div>

@@ -550,10 +651,6 @@ Release Notes</a>.</h1>

infrastructure, which allows us to implement more aggressive algorithms and

make it run faster:</p>

-<ul>

- <li>...</li>

-</ul>

<p> We added new TableGen infrastructure to support bundling for

Very Long Instruction Word (VLIW) architectures. TableGen can now

automatically generate a deterministic finite automaton from a VLIW

@@ -563,6 +660,13 @@ Release Notes</a>.</h1>

<p> We have added a new target independent VLIW packetizer based on the

DFA infrastructure to group machine instructions into bundles.</p>

+<p> We have added new TableGen infrastructure to support relationship maps

+ between instructions. This feature enables TableGen to automatically

+ construct a set of relation tables and query functions that can be used

+ to switch between various forms of instructions. For more information,

+ please refer to <a href="http://llvm.org/docs/HowToUseInstrMappings.html">

+ How To Use Instruction Mappings</a>.</p>

</div>

<h4>

@@ -588,7 +692,7 @@ Release Notes</a>.</h1>

<p>New features and major changes in the X86 target include:</p>

<ul>

- <li>...</li>

+ <li>Small codegen optimizations, especially for AVX2.</li>

</ul>

</div>

@@ -603,7 +707,7 @@ Release Notes</a>.</h1>

<p>New features of the ARM target include:</p>

<ul>

- <li>...</li>

+ <li>Support and performance tuning for the A6 'Swift' CPU.</li>

</ul>

@@ -620,7 +724,7 @@ Release Notes</a>.</h1>

platform specific support for Linux.</p>

<p>Full support is included for Thumb1, Thumb2 and ARM modes, along with

- subtarget and CPU specific extensions for VFP2, VFP3 and NEON.</p>

+ sub-target and CPU specific extensions for VFP2, VFP3 and NEON.</p>

<p>The assembler is Unified Syntax only (see ARM Architecural Reference Manual

for details). While there is some, and growing, support for pre-unfied

@@ -640,7 +744,29 @@ Release Notes</a>.</h1>

<p>New features and major changes in the MIPS target include:</p>

<ul>

- <li>...</li>

+ <li>Integrated assembler support:

+ MIPS32 works for both PIC and static, known limitation is the PR14456 where

+ R_MIPS_GPREL16 relocation is generated with the wrong addend.

+ MIPS64 support is incomplete, for example exception handling is not working.</li>

+ <li>Support for fast calling convention has been added.</li>

+ <li>Support for Android MIPS toolchain has been added to clang driver.</li>

+ <li>Added clang driver support for MIPS N32 ABI through "-mabi=n32" option.</li>

+ <li>MIPS32 and MIPS64 disassembler has been implemented.</li>

+ <li>Support for compiling programs with large GOTs (exceeding 64kB in size) has been added

+ through llc option "-mxgot".</li>

+ <li>Added experimental support for MIPS32 DSP intrinsics.</li>

+ <li>Experimental support for MIPS16 with following limitations: only soft float is supported,

+ C++ exceptions are not supported, large stack frames (> 32000 bytes) are not supported,

+ direct object code emission is not supported only .s .</li>

+ <li>Standalone assembler (llvm-mc): implementation is in progress and considered experimental.</li>

+ <li>All classic JIT and MCJIT tests pass on Little and Big Endian MIPS32 platforms.</li>

+ <li>Inline asm support: all common constraints and operand modifiers have been implemented.</li>

+ <li>Added tail call optimization support, use llc option "-enable-mips-tail-calls"

+ or clang options "-mllvm -enable-mips-tail-calls"to enable it.</li>

+ <li>Improved register allocation by removing registers $fp, $gp, $ra and $at from the list of reserved registers.</li>

+ <li>Long branch expansion pass has been implemented, which expands branch

+ instructions with offsets that do not fit in the 16-bit field.</li>

+ <li>Cavium Octeon II board is used for testing builds (llvm-mips-linux builder).</li>

</ul>

</div>

@@ -652,7 +778,6 @@ Release Notes</a>.</h1>

<div>

-<ul>

<p>Many fixes and changes across LLVM (and Clang) for better compliance with

the 64-bit PowerPC ELF Application Binary Interface, interoperability with

GCC, and overall 64-bit PowerPC support. Some highlights include:</p>

@@ -681,8 +806,28 @@ Release Notes</a>.</h1>

<p>There have also been code generation improvements for both 32- and 64-bit

code. Instruction scheduling support for the Freescale e500mc and e5500

cores has been added.</p>

+</div>

+

+<h3>

+<a name="NVPTX">PTX/NVPTX Target Improvements</a>

+</h3>

+<div>

+<p>The PTX back-end has been replaced by the NVPTX back-end, which is based on

+ the LLVM back-end used by NVIDIA in their CUDA (nvcc) and OpenCL compiler.

+ Some highlights include:</p>

+<ul>

+ <li>Compatibility with PTX 3.1 and SM 3.5</li>

+ <li>Support for NVVM intrinsics as defined in the NVIDIA Compiler SDK</li>

+ <li>Full compatibility with old PTX back-end, with much greater coverage of

+ LLVM IR</li>

</ul>

+<p>Please submit any back-end bugs to the LLVM Bugzilla site.</p>

</div>

@@ -693,7 +838,7 @@ Release Notes</a>.</h1>

<div>

<ul>

- <li>...</li>

+ <li>Added support for custom names for library functions in TargetLibraryInfo.</li>

</ul>

</div>

@@ -710,9 +855,11 @@ Release Notes</a>.</h1>

from the previous release.</p>

<ul>

- <li>...</li>

-</ul>

+<li>llvm-ld and llvm-stub have been removed, llvm-ld functionality can be partially replaced by

+ llvm-link | opt | {llc | as, llc -filetype=obj} | ld, or fully replaced by Clang. </li>

+<li>MCJIT: added support for inline assembly (requires asm parser), added faux remote target execution to lli option '-remote-mcjit'.</li>

+</ul>

</div>

@@ -733,10 +880,6 @@ Release Notes</a>.</h1>

<p> The TargetData structure has been renamed to DataLayout and moved to VMCore

to remove a dependency on Target. </p>

-<ul>

- <li>...</li>

-</ul>

</div>

@@ -746,33 +889,22 @@ to remove a dependency on Target. </p>

<div>

-<p>In addition, some tools have changed in this release. Some of the changes

- are:</p>

-<ul>

- <li>...</li>

-</ul>

-</div>

-

-<h3>

-<a name="python">Python Bindings</a>

-</h3>

-<div>

-<p>Officially supported Python bindings have been added! Feature support is far

- from complete. The current bindings support interfaces to:</p>

+<p>In addition, some tools have changed in this release. Some of the changes are:</p>

<ul>

- <li>...</li>

+<li>opt: added support for '-mtriple' option.</li>

+<li>llvm-mc : - added '-disassemble' support for '-show-inst' and '-show-encoding' options, added '-edis' option to produce annotated

+ disassembly output for X86 and ARM targets.</li>

+<li>libprofile: allows the profile data file name to be specified by the LLVMPROF_OUTPUT environment variable.</li>

+<li>llvm-objdump: has been changed to display available targets, '-arch' option accepts x86 and x86-64 as valid arch names.</li>

+<li>llc and opt: added FMA formation from pairs of FADD + FMUL or FSUB + FMUL enabled by option '-enable-excess-fp-precision' or option '-enable-unsafe-fp-math',

+ option '-fp-contract' controls the creation by optimizations of fused FP by selecting Fast, Standard, or Strict mode.</li>

+<li>llc: object file output from llc is no longer considered experimental.</li>

+<li>gold plugin: handles Position Independent Executables.</li>

</ul>

</div>

-</div>

<h2>

@@ -794,7 +926,7 @@ to remove a dependency on Target. </p>

<p>Known problem areas include:</p>

<ul>

- <li>The CellSPU, MSP430, PTX and XCore backends are experimental.</li>

+ <li>The CellSPU, MSP430, and XCore backends are experimental, and the CellSPU backend will be removed in LLVM 3.3.</li>

<li>The integrated assembler, disassembler, and JIT is not supported by

several targets. If an integrated assembler is not supported, then a

@@ -836,7 +968,7 @@ to remove a dependency on Target. </p>

src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a>

<a href="http://llvm.org/">LLVM Compiler Infrastructure</a><br>

- Last modified: $Date: 2012-11-20 05:22:44 +0100 (Tue, 20 Nov 2012) $

+ Last modified: $Date: 2012-12-19 11:50:28 +0100 (Wed, 19 Dec 2012) $

</address>

</body>
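
Reader's note on the ReleaseNotes.html hunk above: the loop vectorizer bullets describe which loop shapes qualify. As a rough illustration, the loop below is written to match those bullets: a single-block innermost loop, a trip count fixed before entry, a counter incremented by one, a non-zero start, the induction variable used in the body, and a sum reduction. The function name and signature are hypothetical. Whether a given compiler actually vectorizes it depends on the version and flags, which this sketch does not assume.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical example loop; not taken from the LLVM sources.
// A and B are not marked noalias, which per the notes is acceptable for
// affine accesses because overlap is checked at run time.
int scaleAndSum(const int *A, int *B, std::size_t Start, std::size_t End) {
  int Sum = 0;                                  // reduction variable
  for (std::size_t I = Start; I < End; ++I) {   // counter steps by one
    B[I] = A[I] * 2 + static_cast<int>(I);      // induction variable used in body
    Sum += B[I];                                // sum reduction
  }
  return Sum;
}

// Small self-check with a variable, non-zero start index.
inline void checkScaleAndSum() {
  int A[4] = {1, 2, 3, 4};
  int B[4] = {0, 0, 0, 0};
  int Sum = scaleAndSum(A, B, 1, 4);
  assert(Sum == B[1] + B[2] + B[3]);
  assert(B[0] == 0);                            // index 0 is outside [Start, End)
}
```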

diff --git a/include/llvm/MC/MCExpr.h b/include/llvm/MC/MCExpr.h
index 00eef270d6c4..1007aa526493 100644
--- a/include/llvm/MC/MCExpr.h
+++ b/include/llvm/MC/MCExpr.h

@@ -197,7 +197,11 @@ public:

VK_Mips_GOT_PAGE,

VK_Mips_GOT_OFST,

VK_Mips_HIGHER,

- VK_Mips_HIGHEST

+ VK_Mips_HIGHEST,

+ VK_Mips_GOT_HI16,

+ VK_Mips_GOT_LO16,

+ VK_Mips_CALL_HI16,

+ VK_Mips_CALL_LO16

};

private:

diff --git a/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp b/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp
index f6dccb106d9b..a180e36e83f8 100644
--- a/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp
+++ b/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp

@@ -346,7 +346,7 @@ uint8_t *RuntimeDyldImpl::createStubFunction(uint8_t *Addr) {

uint32_t *StubAddr = (uint32_t*)Addr;

*StubAddr = 0xe51ff004; // ldr pc,<label>

return (uint8_t*)++StubAddr;

- } else if (Arch == Triple::mipsel) {

+ } else if (Arch == Triple::mipsel || Arch == Triple::mips) {

uint32_t *StubAddr = (uint32_t*)Addr;

// 0: 3c190000 lui t9,%hi(addr).

// 4: 27390000 addiu t9,t9,%lo(addr).

diff --git a/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp b/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp
index 1ebcaf7ba822..f7015cdf6b5e 100644
--- a/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp
+++ b/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp

@@ -676,7 +676,8 @@ void RuntimeDyldELF::processRelocationRef(const ObjRelocationInfo &Rel,

RelType, 0);

Section.StubOffset += getMaxStubSize();

}

- } else if (Arch == Triple::mipsel && RelType == ELF::R_MIPS_26) {

+ } else if ((Arch == Triple::mipsel || Arch == Triple::mips) &&

+ RelType == ELF::R_MIPS_26) {

// This is an Mips branch relocation, need to use a stub function.

DEBUG(dbgs() << "\t\tThis is a Mips branch relocation.");

SectionEntry &Section = Sections[Rel.SectionID];

diff --git a/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldImpl.h b/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldImpl.h
index 829fd6c4c9a9..a292ee1a8479 100644
--- a/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldImpl.h
+++ b/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldImpl.h

@@ -168,7 +168,7 @@ protected:

inline unsigned getMaxStubSize() {

if (Arch == Triple::arm || Arch == Triple::thumb)

return 8; // 32-bit instruction and 32-bit address

- else if (Arch == Triple::mipsel)

+ else if (Arch == Triple::mipsel || Arch == Triple::mips)

return 16;

else if (Arch == Triple::ppc64)

return 44;
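
Reader's note on the three RuntimeDyld hunks above: they extend the existing little-endian MIPS stub path to big-endian MIPS (Triple::mips) and keep the stub size at 16 bytes. The sketch below shows one way a four-word stub of that size could be laid out. The first two instruction words come from the comment in the RuntimeDyld.cpp hunk. The trailing `jr t9` and `nop` words are assumptions based on the MIPS32 encoding, since the hunk is cut off before them, and `makeMipsStub` is a hypothetical helper. The real code may leave the immediates zero and fill them in through relocations rather than patching them directly as done here.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Instruction templates with zero immediates.
constexpr std::uint32_t LuiT9   = 0x3c190000; // lui   t9, %hi(addr)   (from the hunk)
constexpr std::uint32_t AddiuT9 = 0x27390000; // addiu t9, t9, %lo(addr) (from the hunk)
constexpr std::uint32_t JrT9    = 0x03200008; // jr    t9   (assumed)
constexpr std::uint32_t Nop     = 0x00000000; // delay slot (assumed)

// Hypothetical helper: build a stub that jumps to Target.
std::array<std::uint32_t, 4> makeMipsStub(std::uint32_t Target) {
  // addiu sign-extends its immediate, so the high half is rounded up
  // when bit 15 of Target is set.
  std::uint32_t Hi = ((Target + 0x8000u) >> 16) & 0xffffu;
  std::uint32_t Lo = Target & 0xffffu;
  return {{LuiT9 | Hi, AddiuT9 | Lo, JrT9, Nop}};
}

// Four 32-bit words match the 16 bytes returned by getMaxStubSize().
inline void checkStubSize() {
  assert(makeMipsStub(0).size() * sizeof(std::uint32_t) == 16);
}
```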

diff --git a/lib/MC/MCExpr.cpp b/lib/MC/MCExpr.cpp
index e0336342d6d1..de2f375aab91 100644
--- a/lib/MC/MCExpr.cpp
+++ b/lib/MC/MCExpr.cpp

@@ -229,6 +229,10 @@ StringRef MCSymbolRefExpr::getVariantKindName(VariantKind Kind) {

case VK_Mips_GOT_OFST: return "GOT_OFST";

case VK_Mips_HIGHER: return "HIGHER";

case VK_Mips_HIGHEST: return "HIGHEST";

+ case VK_Mips_GOT_HI16: return "GOT_HI16";

+ case VK_Mips_GOT_LO16: return "GOT_LO16";

+ case VK_Mips_CALL_HI16: return "CALL_HI16";

+ case VK_Mips_CALL_LO16: return "CALL_LO16";

}

llvm_unreachable("Invalid variant kind");

}

diff --git a/lib/Target/Mips/InstPrinter/MipsInstPrinter.cpp b/lib/Target/Mips/InstPrinter/MipsInstPrinter.cpp
index b38463de4bfe..68d3ac5f3bd0 100644
--- a/lib/Target/Mips/InstPrinter/MipsInstPrinter.cpp
+++ b/lib/Target/Mips/InstPrinter/MipsInstPrinter.cpp

@@ -128,6 +128,10 @@ static void printExpr(const MCExpr *Expr, raw_ostream &OS) {

case MCSymbolRefExpr::VK_Mips_GOT_OFST: OS << "%got_ofst("; break;

case MCSymbolRefExpr::VK_Mips_HIGHER: OS << "%higher("; break;

case MCSymbolRefExpr::VK_Mips_HIGHEST: OS << "%highest("; break;

+ case MCSymbolRefExpr::VK_Mips_GOT_HI16: OS << "%got_hi("; break;

+ case MCSymbolRefExpr::VK_Mips_GOT_LO16: OS << "%got_lo("; break;

+ case MCSymbolRefExpr::VK_Mips_CALL_HI16: OS << "%call_hi("; break;

+ case MCSymbolRefExpr::VK_Mips_CALL_LO16: OS << "%call_lo("; break;

}

OS << SRE->getSymbol();

diff --git a/lib/Target/Mips/MCTargetDesc/MipsAsmBackend.cpp b/lib/Target/Mips/MCTargetDesc/MipsAsmBackend.cpp
index 9a35bb6bd707..c078794899d2 100644
--- a/lib/Target/Mips/MCTargetDesc/MipsAsmBackend.cpp
+++ b/lib/Target/Mips/MCTargetDesc/MipsAsmBackend.cpp

@@ -42,6 +42,8 @@ static unsigned adjustFixupValue(unsigned Kind, uint64_t Value) {

case Mips::fixup_Mips_GOT_PAGE:

case Mips::fixup_Mips_GOT_OFST:

case Mips::fixup_Mips_GOT_DISP:

+ case Mips::fixup_Mips_GOT_LO16:

+ case Mips::fixup_Mips_CALL_LO16:

break;

case Mips::fixup_Mips_PC16:

// So far we are only using this type for branches.

@@ -60,6 +62,8 @@ static unsigned adjustFixupValue(unsigned Kind, uint64_t Value) {

break;

case Mips::fixup_Mips_HI16:

case Mips::fixup_Mips_GOT_Local:

+ case Mips::fixup_Mips_GOT_HI16:

+ case Mips::fixup_Mips_CALL_HI16:

// Get the 2nd 16-bits. Also add 1 if bit 15 is 1.

Value = ((Value + 0x8000) >> 16) & 0xffff;

break;

@@ -179,7 +183,11 @@ public:

{ "fixup_Mips_GOT_OFST", 0, 16, 0 },

{ "fixup_Mips_GOT_DISP", 0, 16, 0 },

{ "fixup_Mips_HIGHER", 0, 16, 0 },

- { "fixup_Mips_HIGHEST", 0, 16, 0 }

+ { "fixup_Mips_HIGHEST", 0, 16, 0 },

+ { "fixup_Mips_GOT_HI16", 0, 16, 0 },

+ { "fixup_Mips_GOT_LO16", 0, 16, 0 },

+ { "fixup_Mips_CALL_HI16", 0, 16, 0 },

+ { "fixup_Mips_CALL_LO16", 0, 16, 0 }

};

if (Kind < FirstTargetFixupKind)
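
Reader's note on the MipsAsmBackend.cpp hunks above: the new GOT_LO16 and CALL_LO16 fixups join the group that `adjustFixupValue` leaves unmodified, while GOT_HI16 and CALL_HI16 join the group adjusted with `((Value + 0x8000) >> 16) & 0xffff`. The sketch below works through why 0x8000 is added: the low half is later sign-extended, so the high half has to absorb the borrow. The helper names `hi16`, `lo16` and `rebuild` are hypothetical, and the 32-bit mask in `rebuild` is an assumption made to keep the example small.

```cpp
#include <cassert>
#include <cstdint>

// High half, rounded up when bit 15 of V is set (formula from the hunk).
constexpr std::uint64_t hi16(std::uint64_t V) {
  return ((V + 0x8000) >> 16) & 0xffff;
}

// Low half: the plain low 16 bits, standing in for the unmodified LO16 value.
constexpr std::uint64_t lo16(std::uint64_t V) { return V & 0xffff; }

// Recombine as lui + sign-extending add would, truncated to 32 bits.
constexpr std::uint64_t rebuild(std::uint64_t Hi, std::uint64_t Lo) {
  return ((Hi << 16) +
          static_cast<std::uint64_t>(
              static_cast<std::int64_t>(Lo ^ 0x8000) - 0x8000)) &
         0xffffffffu;
}

// With bit 15 set, the high half is one more than the plain upper bits.
static_assert(hi16(0x12348000) == 0x1235, "carry into the high half");
static_assert(rebuild(hi16(0x12348000), lo16(0x12348000)) == 0x12348000,
              "halves recombine to the original value");

inline void checkRoundTrip() {
  assert(rebuild(hi16(0x00017fff), lo16(0x00017fff)) == 0x00017fff);
}
```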

diff --git a/lib/Target/Mips/MCTargetDesc/MipsBaseInfo.h b/lib/Target/Mips/MCTargetDesc/MipsBaseInfo.h
index 233214b461f0..94e0d20d8835 100644
--- a/lib/Target/Mips/MCTargetDesc/MipsBaseInfo.h
+++ b/lib/Target/Mips/MCTargetDesc/MipsBaseInfo.h

@@ -84,7 +84,13 @@ namespace MipsII {

/// MO_HIGHER/HIGHEST - Represents the highest or higher half word of a

/// 64-bit symbol address.

MO_HIGHER,

- MO_HIGHEST

+ MO_HIGHEST,

+ /// MO_GOT_HI16/LO16, MO_CALL_HI16/LO16 - Relocations used for large GOTs.

+ MO_GOT_HI16,

+ MO_GOT_LO16,

+ MO_CALL_HI16,

+ MO_CALL_LO16

};

enum {

diff --git a/lib/Target/Mips/MCTargetDesc/MipsELFObjectWriter.cpp b/lib/Target/Mips/MCTargetDesc/MipsELFObjectWriter.cpp
index 5d240fe84703..f82e203c23ca 100644
--- a/lib/Target/Mips/MCTargetDesc/MipsELFObjectWriter.cpp
+++ b/lib/Target/Mips/MCTargetDesc/MipsELFObjectWriter.cpp

@@ -179,6 +179,18 @@ unsigned MipsELFObjectWriter::GetRelocType(const MCValue &Target,

case Mips::fixup_Mips_HIGHEST:

Type = ELF::R_MIPS_HIGHEST;

break;

+ case Mips::fixup_Mips_GOT_HI16:

+ Type = ELF::R_MIPS_GOT_HI16;

+ break;

+ case Mips::fixup_Mips_GOT_LO16:

+ Type = ELF::R_MIPS_GOT_LO16;

+ break;

+ case Mips::fixup_Mips_CALL_HI16:

+ Type = ELF::R_MIPS_CALL_HI16;

+ break;

+ case Mips::fixup_Mips_CALL_LO16:

+ Type = ELF::R_MIPS_CALL_LO16;

+ break;

}

return Type;

}

diff --git a/lib/Target/Mips/MCTargetDesc/MipsFixupKinds.h b/lib/Target/Mips/MCTargetDesc/MipsFixupKinds.h
index 77faec54fb23..f96390043a3b 100644
--- a/lib/Target/Mips/MCTargetDesc/MipsFixupKinds.h
+++ b/lib/Target/Mips/MCTargetDesc/MipsFixupKinds.h

@@ -116,6 +116,18 @@ namespace Mips {

// resulting in - R_MIPS_HIGHEST

fixup_Mips_HIGHEST,

+ // resulting in - R_MIPS_GOT_HI16

+ fixup_Mips_GOT_HI16,

+ // resulting in - R_MIPS_GOT_LO16

+ fixup_Mips_GOT_LO16,

+ // resulting in - R_MIPS_CALL_HI16

+ fixup_Mips_CALL_HI16,

+ // resulting in - R_MIPS_CALL_LO16

+ fixup_Mips_CALL_LO16,

// Marker

LastTargetFixupKind,

NumTargetFixupKinds = LastTargetFixupKind - FirstTargetFixupKind

diff --git a/lib/Target/Mips/MCTargetDesc/MipsMCCodeEmitter.cpp b/lib/Target/Mips/MCTargetDesc/MipsMCCodeEmitter.cpp
index 7fbdae02f411..da1e4552c9d0 100644
--- a/lib/Target/Mips/MCTargetDesc/MipsMCCodeEmitter.cpp
+++ b/lib/Target/Mips/MCTargetDesc/MipsMCCodeEmitter.cpp

@@ -287,6 +287,18 @@ getMachineOpValue(const MCInst &MI, const MCOperand &MO,

case MCSymbolRefExpr::VK_Mips_HIGHEST:

FixupKind = Mips::fixup_Mips_HIGHEST;

break;

+ case MCSymbolRefExpr::VK_Mips_GOT_HI16:

+ FixupKind = Mips::fixup_Mips_GOT_HI16;

+ break;

+ case MCSymbolRefExpr::VK_Mips_GOT_LO16:

+ FixupKind = Mips::fixup_Mips_GOT_LO16;

+ break;

+ case MCSymbolRefExpr::VK_Mips_CALL_HI16:

+ FixupKind = Mips::fixup_Mips_CALL_HI16;

+ break;

+ case MCSymbolRefExpr::VK_Mips_CALL_LO16:

+ FixupKind = Mips::fixup_Mips_CALL_LO16;

+ break;

} // switch

Fixups.push_back(MCFixup::Create(0, MO.getExpr(), MCFixupKind(FixupKind)));

diff --git a/lib/Target/Mips/Mips64InstrInfo.td b/lib/Target/Mips/Mips64InstrInfo.td
index a6111689c7ed..83322eac8c62 100644
--- a/lib/Target/Mips/Mips64InstrInfo.td
+++ b/lib/Target/Mips/Mips64InstrInfo.td

@@ -255,6 +255,7 @@ def : MipsPat<(MipsHi tblockaddress:$in), (LUi64 tblockaddress:$in)>;

def : MipsPat<(MipsHi tjumptable:$in), (LUi64 tjumptable:$in)>;

def : MipsPat<(MipsHi tconstpool:$in), (LUi64 tconstpool:$in)>;

def : MipsPat<(MipsHi tglobaltlsaddr:$in), (LUi64 tglobaltlsaddr:$in)>;

+def : MipsPat<(MipsHi texternalsym:$in), (LUi64 texternalsym:$in)>;

def : MipsPat<(MipsLo tglobaladdr:$in), (DADDiu ZERO_64, tglobaladdr:$in)>;

def : MipsPat<(MipsLo tblockaddress:$in), (DADDiu ZERO_64, tblockaddress:$in)>;

@@ -262,6 +263,7 @@ def : MipsPat<(MipsLo tjumptable:$in), (DADDiu ZERO_64, tjumptable:$in)>;

def : MipsPat<(MipsLo tconstpool:$in), (DADDiu ZERO_64, tconstpool:$in)>;

def : MipsPat<(MipsLo tglobaltlsaddr:$in),

(DADDiu ZERO_64, tglobaltlsaddr:$in)>;

+def : MipsPat<(MipsLo texternalsym:$in), (DADDiu ZERO_64, texternalsym:$in)>;

def : MipsPat<(add CPU64Regs:$hi, (MipsLo tglobaladdr:$lo)),

(DADDiu CPU64Regs:$hi, tglobaladdr:$lo)>;

diff --git a/lib/Target/Mips/MipsCodeEmitter.cpp b/lib/Target/Mips/MipsCodeEmitter.cpp
index 4bfccd8fdd7d..05090b84dece 100644
--- a/lib/Target/Mips/MipsCodeEmitter.cpp
+++ b/lib/Target/Mips/MipsCodeEmitter.cpp

@@ -85,7 +85,7 @@ class MipsCodeEmitter : public MachineFunctionPass {

private:

- void emitWordLE(unsigned Word);

+ void emitWord(unsigned Word);

/// Routines that handle operands which add machine relocations which are

/// fixed up by the relocation stage.

@@ -112,12 +112,6 @@ class MipsCodeEmitter : public MachineFunctionPass {

unsigned getSizeExtEncoding(const MachineInstr &MI, unsigned OpNo) const;

unsigned getSizeInsEncoding(const MachineInstr &MI, unsigned OpNo) const;

- int emitULW(const MachineInstr &MI);

- int emitUSW(const MachineInstr &MI);

- int emitULH(const MachineInstr &MI);

- int emitULHu(const MachineInstr &MI);

- int emitUSH(const MachineInstr &MI);

void emitGlobalAddressUnaligned(const GlobalValue *GV, unsigned Reloc,

int Offset) const;

};

@@ -133,7 +127,7 @@ bool MipsCodeEmitter::runOnMachineFunction(MachineFunction &MF) {

MCPEs = &MF.getConstantPool()->getConstants();

MJTEs = 0;

if (MF.getJumpTableInfo()) MJTEs = &MF.getJumpTableInfo()->getJumpTables();

- JTI->Initialize(MF, IsPIC);

+ JTI->Initialize(MF, IsPIC, Subtarget->isLittle());

MCE.setModuleInfo(&getAnalysis<MachineModuleInfo> ());

do {

@@ -271,103 +265,6 @@ void MipsCodeEmitter::emitMachineBasicBlock(MachineBasicBlock *BB,

Reloc, BB));

}

-int MipsCodeEmitter::emitUSW(const MachineInstr &MI) {

- unsigned src = getMachineOpValue(MI, MI.getOperand(0));

- unsigned base = getMachineOpValue(MI, MI.getOperand(1));

- unsigned offset = getMachineOpValue(MI, MI.getOperand(2));

- // swr src, offset(base)

- // swl src, offset+3(base)

- MCE.emitWordLE(

- (0x2e << 26) | (base << 21) | (src << 16) | (offset & 0xffff));

- MCE.emitWordLE(

- (0x2a << 26) | (base << 21) | (src << 16) | ((offset+3) & 0xffff));

- return 2;

-}

-int MipsCodeEmitter::emitULW(const MachineInstr &MI) {

- unsigned dst = getMachineOpValue(MI, MI.getOperand(0));

- unsigned base = getMachineOpValue(MI, MI.getOperand(1));

- unsigned offset = getMachineOpValue(MI, MI.getOperand(2));

- unsigned at = 1;

- if (dst != base) {

- // lwr dst, offset(base)

- // lwl dst, offset+3(base)

- MCE.emitWordLE(

- (0x26 << 26) | (base << 21) | (dst << 16) | (offset & 0xffff));

- MCE.emitWordLE(

- (0x22 << 26) | (base << 21) | (dst << 16) | ((offset+3) & 0xffff));

- return 2;

- } else {

- // lwr at, offset(base)

- // lwl at, offset+3(base)

- // addu dst, at, $zero

- MCE.emitWordLE(

- (0x26 << 26) | (base << 21) | (at << 16) | (offset & 0xffff));

- MCE.emitWordLE(

- (0x22 << 26) | (base << 21) | (at << 16) | ((offset+3) & 0xffff));

- MCE.emitWordLE(

- (0x0 << 26) | (at << 21) | (0x0 << 16) | (dst << 11) | (0x0 << 6) | 0x21);

- return 3;

- }

-}

-int MipsCodeEmitter::emitUSH(const MachineInstr &MI) {

- unsigned src = getMachineOpValue(MI, MI.getOperand(0));

- unsigned base = getMachineOpValue(MI, MI.getOperand(1));

- unsigned offset = getMachineOpValue(MI, MI.getOperand(2));

- unsigned at = 1;

- // sb src, offset(base)

- // srl at,src,8

- // sb at, offset+1(base)

- MCE.emitWordLE(

- (0x28 << 26) | (base << 21) | (src << 16) | (offset & 0xffff));

- MCE.emitWordLE(

- (0x0 << 26) | (0x0 << 21) | (src << 16) | (at << 11) | (0x8 << 6) | 0x2);

- MCE.emitWordLE(

- (0x28 << 26) | (base << 21) | (at << 16) | ((offset+1) & 0xffff));

- return 3;

-}

-int MipsCodeEmitter::emitULH(const MachineInstr &MI) {

- unsigned dst = getMachineOpValue(MI, MI.getOperand(0));

- unsigned base = getMachineOpValue(MI, MI.getOperand(1));

- unsigned offset = getMachineOpValue(MI, MI.getOperand(2));

- unsigned at = 1;

- // lbu at, offset(base)

- // lb dst, offset+1(base)

- // sll dst,dst,8

- // or dst,dst,at

- MCE.emitWordLE(

- (0x24 << 26) | (base << 21) | (at << 16) | (offset & 0xffff));

- MCE.emitWordLE(

- (0x20 << 26) | (base << 21) | (dst << 16) | ((offset+1) & 0xffff));

- MCE.emitWordLE(

- (0x0 << 26) | (0x0 << 21) | (dst << 16) | (dst << 11) | (0x8 << 6) | 0x0);

- MCE.emitWordLE(

- (0x0 << 26) | (dst << 21) | (at << 16) | (dst << 11) | (0x0 << 6) | 0x25);

- return 4;

-}

-int MipsCodeEmitter::emitULHu(const MachineInstr &MI) {

- unsigned dst = getMachineOpValue(MI, MI.getOperand(0));

- unsigned base = getMachineOpValue(MI, MI.getOperand(1));

- unsigned offset = getMachineOpValue(MI, MI.getOperand(2));

- unsigned at = 1;

- // lbu at, offset(base)

- // lbu dst, offset+1(base)

- // sll dst,dst,8

- // or dst,dst,at

- MCE.emitWordLE(

- (0x24 << 26) | (base << 21) | (at << 16) | (offset & 0xffff));

- MCE.emitWordLE(

- (0x24 << 26) | (base << 21) | (dst << 16) | ((offset+1) & 0xffff));

- MCE.emitWordLE(

- (0x0 << 26) | (0x0 << 21) | (dst << 16) | (dst << 11) | (0x8 << 6) | 0x0);

- MCE.emitWordLE(

- (0x0 << 26) | (dst << 21) | (at << 16) | (dst << 11) | (0x0 << 6) | 0x25);

- return 4;

-}

void MipsCodeEmitter::emitInstruction(const MachineInstr &MI) {

DEBUG(errs() << "JIT: " << (void*)MCE.getCurrentPCValue() << ":\t" << MI);

@@ -377,16 +274,19 @@ void MipsCodeEmitter::emitInstruction(const MachineInstr &MI) {

if ((MI.getDesc().TSFlags & MipsII::FormMask) == MipsII::Pseudo)

return;

- emitWordLE(getBinaryCodeForInstr(MI));

+ emitWord(getBinaryCodeForInstr(MI));

++NumEmitted; // Keep track of the # of mi's emitted

MCE.processDebugLoc(MI.getDebugLoc(), false);

}

-void MipsCodeEmitter::emitWordLE(unsigned Word) {

+void MipsCodeEmitter::emitWord(unsigned Word) {

DEBUG(errs() << " 0x";

errs().write_hex(Word) << "\n");

- MCE.emitWordLE(Word);

+ if (Subtarget->isLittle())

+ MCE.emitWordLE(Word);

+ else

+ MCE.emitWordBE(Word);

}

/// createMipsJITCodeEmitterPass - Return a pass that emits the collected Mips
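Review note: the emitWord change in this file selects the byte order from the subtarget instead of always writing little-endian words. A minimal standalone sketch of that selection follows. ByteBuffer and the free functions below are hypothetical stand-ins for the JIT code emitter and are not LLVM APIs; only the LE/BE branching mirrors the patch.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the JIT emitter's output buffer (not an LLVM type).
using ByteBuffer = std::vector<std::uint8_t>;

// Append a 32-bit word, least significant byte first.
void emitWordLE(ByteBuffer &Out, std::uint32_t Word) {
  for (int Shift = 0; Shift <= 24; Shift += 8)
    Out.push_back(static_cast<std::uint8_t>(Word >> Shift));
}

// Append a 32-bit word, most significant byte first.
void emitWordBE(ByteBuffer &Out, std::uint32_t Word) {
  for (int Shift = 24; Shift >= 0; Shift -= 8)
    Out.push_back(static_cast<std::uint8_t>(Word >> Shift));
}

// Same shape as the patched emitWord: choose the byte order at run time.
void emitWord(ByteBuffer &Out, std::uint32_t Word, bool IsLittle) {
  if (IsLittle)
    emitWordLE(Out, Word);
  else
    emitWordBE(Out, Word);
}
```

With the old code, a big-endian target would have received the bytes of every instruction in reversed order, which is why the flag is also threaded into JTI->Initialize.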

diff --git a/lib/Target/Mips/MipsISelLowering.cpp b/lib/Target/Mips/MipsISelLowering.cpp
index e225b6c28eb6..b0dd0a766f70 100644
--- a/lib/Target/Mips/MipsISelLowering.cpp
+++ b/lib/Target/Mips/MipsISelLowering.cpp

@@ -46,6 +46,10 @@ static cl::opt<bool>

EnableMipsTailCalls("enable-mips-tail-calls", cl::Hidden,

cl::desc("MIPS: Enable tail calls."), cl::init(false));

+static cl::opt<bool>

+LargeGOT("mxgot", cl::Hidden,

+ cl::desc("MIPS: Enable GOT larger than 64k."), cl::init(false));

static const uint16_t O32IntRegs[4] = {

Mips::A0, Mips::A1, Mips::A2, Mips::A3

};

@@ -77,6 +81,71 @@ static SDValue GetGlobalReg(SelectionDAG &DAG, EVT Ty) {

return DAG.getRegister(FI->getGlobalBaseReg(), Ty);

}

+static SDValue getTargetNode(SDValue Op, SelectionDAG &DAG, unsigned Flag) {

+ EVT Ty = Op.getValueType();

+ if (GlobalAddressSDNode *N = dyn_cast<GlobalAddressSDNode>(Op))

+ return DAG.getTargetGlobalAddress(N->getGlobal(), Op.getDebugLoc(), Ty, 0,

+ Flag);

+ if (ExternalSymbolSDNode *N = dyn_cast<ExternalSymbolSDNode>(Op))

+ return DAG.getTargetExternalSymbol(N->getSymbol(), Ty, Flag);

+ if (BlockAddressSDNode *N = dyn_cast<BlockAddressSDNode>(Op))

+ return DAG.getTargetBlockAddress(N->getBlockAddress(), Ty, 0, Flag);

+ if (JumpTableSDNode *N = dyn_cast<JumpTableSDNode>(Op))

+ return DAG.getTargetJumpTable(N->getIndex(), Ty, Flag);

+ if (ConstantPoolSDNode *N = dyn_cast<ConstantPoolSDNode>(Op))

+ return DAG.getTargetConstantPool(N->getConstVal(), Ty, N->getAlignment(),

+ N->getOffset(), Flag);

+ llvm_unreachable("Unexpected node type.");

+ return SDValue();

+}

+static SDValue getAddrNonPIC(SDValue Op, SelectionDAG &DAG) {

+ DebugLoc DL = Op.getDebugLoc();

+ EVT Ty = Op.getValueType();

+ SDValue Hi = getTargetNode(Op, DAG, MipsII::MO_ABS_HI);

+ SDValue Lo = getTargetNode(Op, DAG, MipsII::MO_ABS_LO);

+ return DAG.getNode(ISD::ADD, DL, Ty,

+ DAG.getNode(MipsISD::Hi, DL, Ty, Hi),

+ DAG.getNode(MipsISD::Lo, DL, Ty, Lo));

+}

+static SDValue getAddrLocal(SDValue Op, SelectionDAG &DAG, bool HasMips64) {

+ DebugLoc DL = Op.getDebugLoc();

+ EVT Ty = Op.getValueType();

+ unsigned GOTFlag = HasMips64 ? MipsII::MO_GOT_PAGE : MipsII::MO_GOT;

+ SDValue GOT = DAG.getNode(MipsISD::Wrapper, DL, Ty, GetGlobalReg(DAG, Ty),

+ getTargetNode(Op, DAG, GOTFlag));

+ SDValue Load = DAG.getLoad(Ty, DL, DAG.getEntryNode(), GOT,

+ MachinePointerInfo::getGOT(), false, false, false,

+ 0);

+ unsigned LoFlag = HasMips64 ? MipsII::MO_GOT_OFST : MipsII::MO_ABS_LO;

+ SDValue Lo = DAG.getNode(MipsISD::Lo, DL, Ty, getTargetNode(Op, DAG, LoFlag));

+ return DAG.getNode(ISD::ADD, DL, Ty, Load, Lo);

+}

+static SDValue getAddrGlobal(SDValue Op, SelectionDAG &DAG, unsigned Flag) {

+ DebugLoc DL = Op.getDebugLoc();

+ EVT Ty = Op.getValueType();

+ SDValue Tgt = DAG.getNode(MipsISD::Wrapper, DL, Ty, GetGlobalReg(DAG, Ty),

+ getTargetNode(Op, DAG, Flag));

+ return DAG.getLoad(Ty, DL, DAG.getEntryNode(), Tgt,

+ MachinePointerInfo::getGOT(), false, false, false, 0);

+}

+static SDValue getAddrGlobalLargeGOT(SDValue Op, SelectionDAG &DAG,

+ unsigned HiFlag, unsigned LoFlag) {

+ DebugLoc DL = Op.getDebugLoc();

+ EVT Ty = Op.getValueType();

+ SDValue Hi = DAG.getNode(MipsISD::Hi, DL, Ty, getTargetNode(Op, DAG, HiFlag));

+ Hi = DAG.getNode(ISD::ADD, DL, Ty, Hi, GetGlobalReg(DAG, Ty));

+ SDValue Wrapper = DAG.getNode(MipsISD::Wrapper, DL, Ty, Hi,

+ getTargetNode(Op, DAG, LoFlag));

+ return DAG.getLoad(Ty, DL, DAG.getEntryNode(), Wrapper,

+ MachinePointerInfo::getGOT(), false, false, false, 0);

+}

const char *MipsTargetLowering::getTargetNodeName(unsigned Opcode) const {

switch (Opcode) {

case MipsISD::JmpLink: return "MipsISD::JmpLink";

@@ -1743,8 +1812,6 @@ SDValue MipsTargetLowering::LowerGlobalAddress(SDValue Op,

const GlobalValue *GV = cast<GlobalAddressSDNode>(Op)->getGlobal();

if (getTargetMachine().getRelocationModel() != Reloc::PIC_ && !IsN64) {

- SDVTList VTs = DAG.getVTList(MVT::i32);

const MipsTargetObjectFile &TLOF =

(const MipsTargetObjectFile&)getObjFileLowering();

@@ -1752,69 +1819,33 @@ SDValue MipsTargetLowering::LowerGlobalAddress(SDValue Op,

if (TLOF.IsGlobalInSmallSection(GV, getTargetMachine())) {

SDValue GA = DAG.getTargetGlobalAddress(GV, dl, MVT::i32, 0,

MipsII::MO_GPREL);

- SDValue GPRelNode = DAG.getNode(MipsISD::GPRel, dl, VTs, &GA, 1);

+ SDValue GPRelNode = DAG.getNode(MipsISD::GPRel, dl,

+ DAG.getVTList(MVT::i32), &GA, 1);

SDValue GPReg = DAG.getRegister(Mips::GP, MVT::i32);

return DAG.getNode(ISD::ADD, dl, MVT::i32, GPReg, GPRelNode);

}

// %hi/%lo relocation

- SDValue GAHi = DAG.getTargetGlobalAddress(GV, dl, MVT::i32, 0,

- MipsII::MO_ABS_HI);

- SDValue GALo = DAG.getTargetGlobalAddress(GV, dl, MVT::i32, 0,

- MipsII::MO_ABS_LO);

- SDValue HiPart = DAG.getNode(MipsISD::Hi, dl, VTs, &GAHi, 1);

- SDValue Lo = DAG.getNode(MipsISD::Lo, dl, MVT::i32, GALo);

- return DAG.getNode(ISD::ADD, dl, MVT::i32, HiPart, Lo);

- }

- EVT ValTy = Op.getValueType();

- bool HasGotOfst = (GV->hasInternalLinkage() ||

- (GV->hasLocalLinkage() && !isa<Function>(GV)));

- unsigned GotFlag = HasMips64 ?

- (HasGotOfst ? MipsII::MO_GOT_PAGE : MipsII::MO_GOT_DISP) :

- (HasGotOfst ? MipsII::MO_GOT : MipsII::MO_GOT16);

- SDValue GA = DAG.getTargetGlobalAddress(GV, dl, ValTy, 0, GotFlag);

- GA = DAG.getNode(MipsISD::Wrapper, dl, ValTy, GetGlobalReg(DAG, ValTy), GA);

- SDValue ResNode = DAG.getLoad(ValTy, dl, DAG.getEntryNode(), GA,

- MachinePointerInfo(), false, false, false, 0);

- // On functions and global targets not internal linked only

- // a load from got/GP is necessary for PIC to work.

- if (!HasGotOfst)

- return ResNode;

- SDValue GALo = DAG.getTargetGlobalAddress(GV, dl, ValTy, 0,

- HasMips64 ? MipsII::MO_GOT_OFST :

- MipsII::MO_ABS_LO);

- SDValue Lo = DAG.getNode(MipsISD::Lo, dl, ValTy, GALo);

- return DAG.getNode(ISD::ADD, dl, ValTy, ResNode, Lo);

+ return getAddrNonPIC(Op, DAG);

+ }

+ if (GV->hasInternalLinkage() || (GV->hasLocalLinkage() && !isa<Function>(GV)))

+ return getAddrLocal(Op, DAG, HasMips64);

+ if (LargeGOT)

+ return getAddrGlobalLargeGOT(Op, DAG, MipsII::MO_GOT_HI16,

+ MipsII::MO_GOT_LO16);

+ return getAddrGlobal(Op, DAG,

+ HasMips64 ? MipsII::MO_GOT_DISP : MipsII::MO_GOT16);

}

SDValue MipsTargetLowering::LowerBlockAddress(SDValue Op,

SelectionDAG &DAG) const {

- const BlockAddress *BA = cast<BlockAddressSDNode>(Op)->getBlockAddress();

- // FIXME there isn't actually debug info here

- DebugLoc dl = Op.getDebugLoc();

+ if (getTargetMachine().getRelocationModel() != Reloc::PIC_ && !IsN64)

+ return getAddrNonPIC(Op, DAG);

- if (getTargetMachine().getRelocationModel() != Reloc::PIC_ && !IsN64) {

- // %hi/%lo relocation

- SDValue BAHi =

- DAG.getTargetBlockAddress(BA, MVT::i32, 0, MipsII::MO_ABS_HI);

- SDValue BALo =

- DAG.getTargetBlockAddress(BA, MVT::i32, 0, MipsII::MO_ABS_LO);

- SDValue Hi = DAG.getNode(MipsISD::Hi, dl, MVT::i32, BAHi);

- SDValue Lo = DAG.getNode(MipsISD::Lo, dl, MVT::i32, BALo);

- return DAG.getNode(ISD::ADD, dl, MVT::i32, Hi, Lo);

- }

- EVT ValTy = Op.getValueType();

- unsigned GOTFlag = HasMips64 ? MipsII::MO_GOT_PAGE : MipsII::MO_GOT;

- unsigned OFSTFlag = HasMips64 ? MipsII::MO_GOT_OFST : MipsII::MO_ABS_LO;

- SDValue BAGOTOffset = DAG.getTargetBlockAddress(BA, ValTy, 0, GOTFlag);

- BAGOTOffset = DAG.getNode(MipsISD::Wrapper, dl, ValTy,

- GetGlobalReg(DAG, ValTy), BAGOTOffset);

- SDValue BALOOffset = DAG.getTargetBlockAddress(BA, ValTy, 0, OFSTFlag);

- SDValue Load = DAG.getLoad(ValTy, dl, DAG.getEntryNode(), BAGOTOffset,

- MachinePointerInfo(), false, false, false, 0);

- SDValue Lo = DAG.getNode(MipsISD::Lo, dl, ValTy, BALOOffset);

- return DAG.getNode(ISD::ADD, dl, ValTy, Load, Lo);

+ return getAddrLocal(Op, DAG, HasMips64);

}

SDValue MipsTargetLowering::

@@ -1901,41 +1932,15 @@ LowerGlobalTLSAddress(SDValue Op, SelectionDAG &DAG) const

SDValue MipsTargetLowering::

LowerJumpTable(SDValue Op, SelectionDAG &DAG) const

{

- SDValue HiPart, JTI, JTILo;

- // FIXME there isn't actually debug info here

- DebugLoc dl = Op.getDebugLoc();

- bool IsPIC = getTargetMachine().getRelocationModel() == Reloc::PIC_;

- EVT PtrVT = Op.getValueType();

- JumpTableSDNode *JT = cast<JumpTableSDNode>(Op);

- if (!IsPIC && !IsN64) {

- JTI = DAG.getTargetJumpTable(JT->getIndex(), PtrVT, MipsII::MO_ABS_HI);

- HiPart = DAG.getNode(MipsISD::Hi, dl, PtrVT, JTI);

- JTILo = DAG.getTargetJumpTable(JT->getIndex(), PtrVT, MipsII::MO_ABS_LO);

- } else {// Emit Load from Global Pointer

- unsigned GOTFlag = HasMips64 ? MipsII::MO_GOT_PAGE : MipsII::MO_GOT;

- unsigned OfstFlag = HasMips64 ? MipsII::MO_GOT_OFST : MipsII::MO_ABS_LO;

- JTI = DAG.getTargetJumpTable(JT->getIndex(), PtrVT, GOTFlag);

- JTI = DAG.getNode(MipsISD::Wrapper, dl, PtrVT, GetGlobalReg(DAG, PtrVT),

- JTI);

- HiPart = DAG.getLoad(PtrVT, dl, DAG.getEntryNode(), JTI,

- MachinePointerInfo(), false, false, false, 0);

- JTILo = DAG.getTargetJumpTable(JT->getIndex(), PtrVT, OfstFlag);

- }

+ if (getTargetMachine().getRelocationModel() != Reloc::PIC_ && !IsN64)

+ return getAddrNonPIC(Op, DAG);

- SDValue Lo = DAG.getNode(MipsISD::Lo, dl, PtrVT, JTILo);

- return DAG.getNode(ISD::ADD, dl, PtrVT, HiPart, Lo);

+ return getAddrLocal(Op, DAG, HasMips64);

}

SDValue MipsTargetLowering::

LowerConstantPool(SDValue Op, SelectionDAG &DAG) const

{

- SDValue ResNode;

- ConstantPoolSDNode *N = cast<ConstantPoolSDNode>(Op);

- const Constant *C = N->getConstVal();

- // FIXME there isn't actually debug info here

- DebugLoc dl = Op.getDebugLoc();

// gp_rel relocation

// FIXME: we should reference the constant pool using small data sections,

// but the asm printer currently doesn't support this feature without

@@ -1946,31 +1951,10 @@ LowerConstantPool(SDValue Op, SelectionDAG &DAG) const

// SDValue GOT = DAG.getGLOBAL_OFFSET_TABLE(MVT::i32);

// ResNode = DAG.getNode(ISD::ADD, MVT::i32, GOT, GPRelNode);

- if (getTargetMachine().getRelocationModel() != Reloc::PIC_ && !IsN64) {

- SDValue CPHi = DAG.getTargetConstantPool(C, MVT::i32, N->getAlignment(),

- N->getOffset(), MipsII::MO_ABS_HI);

- SDValue CPLo = DAG.getTargetConstantPool(C, MVT::i32, N->getAlignment(),

- N->getOffset(), MipsII::MO_ABS_LO);

- SDValue HiPart = DAG.getNode(MipsISD::Hi, dl, MVT::i32, CPHi);

- SDValue Lo = DAG.getNode(MipsISD::Lo, dl, MVT::i32, CPLo);

- ResNode = DAG.getNode(ISD::ADD, dl, MVT::i32, HiPart, Lo);

- } else {

- EVT ValTy = Op.getValueType();

- unsigned GOTFlag = HasMips64 ? MipsII::MO_GOT_PAGE : MipsII::MO_GOT;

- unsigned OFSTFlag = HasMips64 ? MipsII::MO_GOT_OFST : MipsII::MO_ABS_LO;

- SDValue CP = DAG.getTargetConstantPool(C, ValTy, N->getAlignment(),

- N->getOffset(), GOTFlag);

- CP = DAG.getNode(MipsISD::Wrapper, dl, ValTy, GetGlobalReg(DAG, ValTy), CP);

- SDValue Load = DAG.getLoad(ValTy, dl, DAG.getEntryNode(), CP,

- MachinePointerInfo::getConstantPool(), false,

- false, false, 0);

- SDValue CPLo = DAG.getTargetConstantPool(C, ValTy, N->getAlignment(),

- N->getOffset(), OFSTFlag);

- SDValue Lo = DAG.getNode(MipsISD::Lo, dl, ValTy, CPLo);

- ResNode = DAG.getNode(ISD::ADD, dl, ValTy, Load, Lo);

- }

+ if (getTargetMachine().getRelocationModel() != Reloc::PIC_ && !IsN64)

+ return getAddrNonPIC(Op, DAG);

- return ResNode;

+ return getAddrLocal(Op, DAG, HasMips64);

}

SDValue MipsTargetLowering::LowerVASTART(SDValue Op, SelectionDAG &DAG) const {

@@ -2862,60 +2846,41 @@ MipsTargetLowering::LowerCall(TargetLowering::CallLoweringInfo &CLI,

// If the callee is a GlobalAddress/ExternalSymbol node (quite common, every

// direct call is) turn it into a TargetGlobalAddress/TargetExternalSymbol

// node so that legalize doesn't hack it.

- unsigned char OpFlag;

bool IsPICCall = (IsN64 || IsPIC); // true if calls are translated to jalr $25

bool GlobalOrExternal = false;

SDValue CalleeLo;

if (GlobalAddressSDNode *G = dyn_cast<GlobalAddressSDNode>(Callee)) {

- if (IsPICCall && G->getGlobal()->hasInternalLinkage()) {

- OpFlag = IsO32 ? MipsII::MO_GOT : MipsII::MO_GOT_PAGE;

- unsigned char LoFlag = IsO32 ? MipsII::MO_ABS_LO : MipsII::MO_GOT_OFST;

+ if (IsPICCall) {

+ if (G->getGlobal()->hasInternalLinkage())

+ Callee = getAddrLocal(Callee, DAG, HasMips64);

+ else if (LargeGOT)

+ Callee = getAddrGlobalLargeGOT(Callee, DAG, MipsII::MO_CALL_HI16,

+ MipsII::MO_CALL_LO16);

+ else

+ Callee = getAddrGlobal(Callee, DAG, MipsII::MO_GOT_CALL);

+ } else

Callee = DAG.getTargetGlobalAddress(G->getGlobal(), dl, getPointerTy(), 0,

- OpFlag);

- CalleeLo = DAG.getTargetGlobalAddress(G->getGlobal(), dl, getPointerTy(),

- 0, LoFlag);

- } else {

- OpFlag = IsPICCall ? MipsII::MO_GOT_CALL : MipsII::MO_NO_FLAG;

- Callee = DAG.getTargetGlobalAddress(G->getGlobal(), dl,

- getPointerTy(), 0, OpFlag);

- }

+ MipsII::MO_NO_FLAG);

GlobalOrExternal = true;

}

else if (ExternalSymbolSDNode *S = dyn_cast<ExternalSymbolSDNode>(Callee)) {

- if (IsN64 || (!IsO32 && IsPIC))

- OpFlag = MipsII::MO_GOT_DISP;

- else if (!IsPIC) // !N64 && static

- OpFlag = MipsII::MO_NO_FLAG;

+ if (!IsN64 && !IsPIC) // !N64 && static

+ Callee = DAG.getTargetExternalSymbol(S->getSymbol(), getPointerTy(),

+ MipsII::MO_NO_FLAG);

+ else if (LargeGOT)

+ Callee = getAddrGlobalLargeGOT(Callee, DAG, MipsII::MO_CALL_HI16,

+ MipsII::MO_CALL_LO16);

+ else if (HasMips64)

+ Callee = getAddrGlobal(Callee, DAG, MipsII::MO_GOT_DISP);

else // O32 & PIC

- OpFlag = MipsII::MO_GOT_CALL;

- Callee = DAG.getTargetExternalSymbol(S->getSymbol(), getPointerTy(),

- OpFlag);

+ Callee = getAddrGlobal(Callee, DAG, MipsII::MO_GOT_CALL);

GlobalOrExternal = true;

}

SDValue InFlag;

- // Create nodes that load address of callee and copy it to T9

- if (IsPICCall) {

- if (GlobalOrExternal) {

- // Load callee address

- Callee = DAG.getNode(MipsISD::Wrapper, dl, getPointerTy(),

- GetGlobalReg(DAG, getPointerTy()), Callee);

- SDValue LoadValue = DAG.getLoad(getPointerTy(), dl, DAG.getEntryNode(),

- Callee, MachinePointerInfo::getGOT(),

- false, false, false, 0);

- // Use GOT+LO if callee has internal linkage.

- if (CalleeLo.getNode()) {

- SDValue Lo = DAG.getNode(MipsISD::Lo, dl, getPointerTy(), CalleeLo);

- Callee = DAG.getNode(ISD::ADD, dl, getPointerTy(), LoadValue, Lo);

- } else

- Callee = LoadValue;

- }

// T9 register operand.

SDValue T9;
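Review note: getAddrGlobalLargeGOT builds the GOT slot address as Hi + $gp, then loads through a Wrapper carrying the Lo part, which the biggot.ll test expects to come out as lui / addu / lw. The sketch below shows the arithmetic behind such a hi/lo split for a 32-bit GOT offset. It assumes the usual MIPS convention that the low half is a sign-extended 16-bit immediate, so the high half is adjusted to compensate; the struct and function names are hypothetical and do not appear in LLVM.

```cpp
#include <cstdint>

// Hypothetical pair of 16-bit halves, as a linker might fill them in for
// %got_hi / %got_lo (names are illustrative only).
struct HiLo {
  std::uint16_t Hi;
  std::int16_t Lo;
};

// Split a GOT offset so that (Hi << 16) + sign_extend(Lo) == Offset.
HiLo splitGotOffset(std::uint32_t Offset) {
  std::int16_t Lo = static_cast<std::int16_t>(Offset & 0xffffu);
  std::uint32_t SignExtLo = static_cast<std::uint32_t>(static_cast<std::int32_t>(Lo));
  std::uint16_t Hi = static_cast<std::uint16_t>((Offset - SignExtLo) >> 16);
  return HiLo{Hi, Lo};
}

// Recompute the slot address the way the three-instruction sequence does.
std::uint32_t gotEntryAddress(std::uint32_t GP, HiLo Parts) {
  std::uint32_t Upper = static_cast<std::uint32_t>(Parts.Hi) << 16;  // lui
  std::uint32_t Base = Upper + GP;                                   // addu
  std::uint32_t Disp =
      static_cast<std::uint32_t>(static_cast<std::int32_t>(Parts.Lo));
  return Base + Disp;                                                // lw offset
}
```

For an offset of 0x18000 the low half is 0x8000, which sign-extends to a negative displacement, so the high half becomes 2 rather than 1. This is the case a single 16-bit GOT displacement cannot express, and it is what the hidden -mxgot option is meant to cover.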

diff --git a/lib/Target/Mips/MipsInstrInfo.td b/lib/Target/Mips/MipsInstrInfo.td
index f16b5f9ee7ff..aa8881997285 100644
--- a/lib/Target/Mips/MipsInstrInfo.td
+++ b/lib/Target/Mips/MipsInstrInfo.td

@@ -1154,12 +1154,14 @@ def : MipsPat<(MipsHi tblockaddress:$in), (LUi tblockaddress:$in)>;

def : MipsPat<(MipsHi tjumptable:$in), (LUi tjumptable:$in)>;

def : MipsPat<(MipsHi tconstpool:$in), (LUi tconstpool:$in)>;

def : MipsPat<(MipsHi tglobaltlsaddr:$in), (LUi tglobaltlsaddr:$in)>;

+def : MipsPat<(MipsHi texternalsym:$in), (LUi texternalsym:$in)>;

def : MipsPat<(MipsLo tglobaladdr:$in), (ADDiu ZERO, tglobaladdr:$in)>;

def : MipsPat<(MipsLo tblockaddress:$in), (ADDiu ZERO, tblockaddress:$in)>;

def : MipsPat<(MipsLo tjumptable:$in), (ADDiu ZERO, tjumptable:$in)>;

def : MipsPat<(MipsLo tconstpool:$in), (ADDiu ZERO, tconstpool:$in)>;

def : MipsPat<(MipsLo tglobaltlsaddr:$in), (ADDiu ZERO, tglobaltlsaddr:$in)>;

+def : MipsPat<(MipsLo texternalsym:$in), (ADDiu ZERO, texternalsym:$in)>;

def : MipsPat<(add CPURegs:$hi, (MipsLo tglobaladdr:$lo)),

(ADDiu CPURegs:$hi, tglobaladdr:$lo)>;

diff --git a/lib/Target/Mips/MipsJITInfo.cpp b/lib/Target/Mips/MipsJITInfo.cpp
index 052046a8a45d..da1119df8f9f 100644
--- a/lib/Target/Mips/MipsJITInfo.cpp
+++ b/lib/Target/Mips/MipsJITInfo.cpp

@@ -222,10 +222,17 @@ void *MipsJITInfo::emitFunctionStub(const Function *F, void *Fn,

// addiu t9, t9, %lo(EmittedAddr)

// jalr t8, t9

// nop

- JCE.emitWordLE(0xf << 26 | 25 << 16 | Hi);

- JCE.emitWordLE(9 << 26 | 25 << 21 | 25 << 16 | Lo);

- JCE.emitWordLE(25 << 21 | 24 << 11 | 9);

- JCE.emitWordLE(0);

+ if (IsLittleEndian) {

+ JCE.emitWordLE(0xf << 26 | 25 << 16 | Hi);

+ JCE.emitWordLE(9 << 26 | 25 << 21 | 25 << 16 | Lo);

+ JCE.emitWordLE(25 << 21 | 24 << 11 | 9);

+ JCE.emitWordLE(0);

+ } else {

+ JCE.emitWordBE(0xf << 26 | 25 << 16 | Hi);

+ JCE.emitWordBE(9 << 26 | 25 << 21 | 25 << 16 | Lo);

+ JCE.emitWordBE(25 << 21 | 24 << 11 | 9);

+ JCE.emitWordBE(0);

+ }

sys::Memory::InvalidateInstructionCache(Addr, 16);

if (!sys::Memory::setRangeExecutable(Addr, 16))
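Review note: the four stub words are the same in both branches; only the byte order used to write them differs. The sketch below reproduces the encodings from the hunk as a standalone function. How Hi and Lo are derived from the target address is not visible in this hunk, so the rounding shown here (add 0x8000 before taking the upper half, to compensate for addiu sign-extending its immediate) is an assumption, and encodeStub is a hypothetical name.

```cpp
#include <array>
#include <cstdint>

// Encode the stub described in the emitFunctionStub comment:
//   lui   t9, %hi(Addr)
//   addiu t9, t9, %lo(Addr)
//   jalr  t8, t9
//   nop
// Assumption: Hi is rounded up when bit 15 of Addr is set.
std::array<std::uint32_t, 4> encodeStub(std::uint32_t Addr) {
  std::uint32_t Lo = Addr & 0xffffu;
  std::uint32_t Hi = ((Addr + 0x8000u) >> 16) & 0xffffu;
  std::array<std::uint32_t, 4> Words = {{
      0xfu << 26 | 25u << 16 | Hi,            // lui t9, Hi
      9u << 26 | 25u << 21 | 25u << 16 | Lo,  // addiu t9, t9, Lo
      25u << 21 | 24u << 11 | 9u,             // jalr t8, t9
      0u                                      // nop
  }};
  return Words;
}
```

Each returned word would then be written with emitWordLE or emitWordBE depending on IsLittleEndian, as the patched code does.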

diff --git a/lib/Target/Mips/MipsJITInfo.h b/lib/Target/Mips/MipsJITInfo.h
index 637a31866034..ecda3101a003 100644
--- a/lib/Target/Mips/MipsJITInfo.h
+++ b/lib/Target/Mips/MipsJITInfo.h

@@ -26,10 +26,11 @@ class MipsTargetMachine;

class MipsJITInfo : public TargetJITInfo {

bool IsPIC;

+ bool IsLittleEndian;

public:

explicit MipsJITInfo() :

- IsPIC(false) {}

+ IsPIC(false), IsLittleEndian(true) {}

/// replaceMachineCodeForFunction - Make it so that calling the function

/// whose machine code is at OLD turns into a call to NEW, perhaps by

@@ -58,8 +59,10 @@ class MipsJITInfo : public TargetJITInfo {

unsigned NumRelocs, unsigned char *GOTBase);

/// Initialize - Initialize internal stage for the function being JITted.

- void Initialize(const MachineFunction &MF, bool isPIC) {

+ void Initialize(const MachineFunction &MF, bool isPIC,

+ bool isLittleEndian) {

IsPIC = isPIC;

+ IsLittleEndian = isLittleEndian;

}

};

diff --git a/lib/Target/Mips/MipsMCInstLower.cpp b/lib/Target/Mips/MipsMCInstLower.cpp
index 5fa633933838..4162f981d1df 100644
--- a/lib/Target/Mips/MipsMCInstLower.cpp
+++ b/lib/Target/Mips/MipsMCInstLower.cpp

@@ -62,6 +62,10 @@ MCOperand MipsMCInstLower::LowerSymbolOperand(const MachineOperand &MO,

case MipsII::MO_GOT_OFST: Kind = MCSymbolRefExpr::VK_Mips_GOT_OFST; break;

case MipsII::MO_HIGHER: Kind = MCSymbolRefExpr::VK_Mips_HIGHER; break;

case MipsII::MO_HIGHEST: Kind = MCSymbolRefExpr::VK_Mips_HIGHEST; break;

+ case MipsII::MO_GOT_HI16: Kind = MCSymbolRefExpr::VK_Mips_GOT_HI16; break;

+ case MipsII::MO_GOT_LO16: Kind = MCSymbolRefExpr::VK_Mips_GOT_LO16; break;

+ case MipsII::MO_CALL_HI16: Kind = MCSymbolRefExpr::VK_Mips_CALL_HI16; break;

+ case MipsII::MO_CALL_LO16: Kind = MCSymbolRefExpr::VK_Mips_CALL_LO16; break;

}

switch (MOTy) {

diff --git a/lib/Transforms/Scalar/SROA.cpp b/lib/Transforms/Scalar/SROA.cpp
index ccc2f7a77b3c..2d518f735be0 100644
--- a/lib/Transforms/Scalar/SROA.cpp
+++ b/lib/Transforms/Scalar/SROA.cpp

@@ -2160,6 +2160,9 @@ static bool isIntegerWideningViable(const DataLayout &TD,

AllocaPartitioning::const_use_iterator I,

AllocaPartitioning::const_use_iterator E) {

uint64_t SizeInBits = TD.getTypeSizeInBits(AllocaTy);

+ // Don't create integer types larger than the maximum bitwidth.

+ if (SizeInBits > IntegerType::MAX_INT_BITS)

+ return false;

// Don't try to handle allocas with bit-padding.

if (SizeInBits != TD.getTypeStoreSizeInBits(AllocaTy))

@@ -2198,7 +2201,7 @@ static bool isIntegerWideningViable(const DataLayout &TD,

if (RelBegin == 0 && RelEnd == Size)

WholeAllocaOp = true;

if (IntegerType *ITy = dyn_cast<IntegerType>(LI->getType())) {

- if (ITy->getBitWidth() < TD.getTypeStoreSize(ITy))

+ if (ITy->getBitWidth() < TD.getTypeStoreSizeInBits(ITy))

return false;

continue;

}

@@ -2214,7 +2217,7 @@ static bool isIntegerWideningViable(const DataLayout &TD,

if (RelBegin == 0 && RelEnd == Size)

WholeAllocaOp = true;

if (IntegerType *ITy = dyn_cast<IntegerType>(ValueTy)) {

- if (ITy->getBitWidth() < TD.getTypeStoreSize(ITy))

+ if (ITy->getBitWidth() < TD.getTypeStoreSizeInBits(ITy))

return false;

continue;

}
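Review note: the two one-line SROA changes replace getTypeStoreSize (bytes) with getTypeStoreSizeInBits, so the comparison against getBitWidth is now bits against bits. The sketch below models both checks with plain integers to show the difference. The helper names are hypothetical, and the store size is modelled simply as the bit width rounded up to whole bytes.

```cpp
#include <cstdint>

// Hypothetical model of a type's store size: bits rounded up to whole bytes.
std::uint64_t storeSizeInBytes(std::uint64_t BitWidth) {
  return (BitWidth + 7) / 8;
}

std::uint64_t storeSizeInBits(std::uint64_t BitWidth) {
  return storeSizeInBytes(BitWidth) * 8;
}

// Old check: compares a bit count against a byte count, so it cannot fire
// for any integer width of one bit or more.
bool oldCheckRejects(std::uint64_t BitWidth) {
  return BitWidth < storeSizeInBytes(BitWidth);
}

// New check: compares bits against bits and rejects integers whose store
// size includes padding bits (for example i1 or i4).
bool newCheckRejects(std::uint64_t BitWidth) {
  return BitWidth < storeSizeInBits(BitWidth);
}
```

Under this model i1 and i4 are rejected by the new check while i8, i16, i24 and i40 are not, which appears consistent with the i4 store being dropped from big-endian.ll and with the mixed i1/i8 pattern in the PR14548 test.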

diff --git a/test/CodeGen/Mips/biggot.ll b/test/CodeGen/Mips/biggot.ll
new file mode 100644
index 000000000000..c4ad851c8258
--- /dev/null
+++ b/test/CodeGen/Mips/biggot.ll

@@ -0,0 +1,50 @@

+; RUN: llc -march=mipsel -mxgot < %s | FileCheck %s -check-prefix=O32

+; RUN: llc -march=mips64el -mcpu=mips64r2 -mattr=+n64 -mxgot < %s | \

+; RUN: FileCheck %s -check-prefix=N64

+@v0 = external global i32

+define void @foo1() nounwind {

+entry:

+; O32: lui $[[R0:[0-9]+]], %got_hi(v0)

+; O32: addu $[[R1:[0-9]+]], $[[R0]], ${{[a-z0-9]+}}

+; O32: lw ${{[0-9]+}}, %got_lo(v0)($[[R1]])

+; O32: lui $[[R2:[0-9]+]], %call_hi(foo0)

+; O32: addu $[[R3:[0-9]+]], $[[R2]], ${{[a-z0-9]+}}

+; O32: lw ${{[0-9]+}}, %call_lo(foo0)($[[R3]])

+; N64: lui $[[R0:[0-9]+]], %got_hi(v0)

+; N64: daddu $[[R1:[0-9]+]], $[[R0]], ${{[a-z0-9]+}}

+; N64: ld ${{[0-9]+}}, %got_lo(v0)($[[R1]])

+; N64: lui $[[R2:[0-9]+]], %call_hi(foo0)

+; N64: daddu $[[R3:[0-9]+]], $[[R2]], ${{[a-z0-9]+}}

+; N64: ld ${{[0-9]+}}, %call_lo(foo0)($[[R3]])

+ %0 = load i32* @v0, align 4

+ tail call void @foo0(i32 %0) nounwind

+ ret void

+}

+declare void @foo0(i32)

+; call to external function.

+define void @foo2(i32* nocapture %d, i32* nocapture %s, i32 %n) nounwind {

+entry:

+; O32: foo2:

+; O32: lui $[[R2:[0-9]+]], %call_hi(memcpy)

+; O32: addu $[[R3:[0-9]+]], $[[R2]], ${{[a-z0-9]+}}

+; O32: lw ${{[0-9]+}}, %call_lo(memcpy)($[[R3]])

+; N64: foo2:

+; N64: lui $[[R2:[0-9]+]], %call_hi(memcpy)

+; N64: daddu $[[R3:[0-9]+]], $[[R2]], ${{[a-z0-9]+}}

+; N64: ld ${{[0-9]+}}, %call_lo(memcpy)($[[R3]])

+ %0 = bitcast i32* %d to i8*

+ %1 = bitcast i32* %s to i8*

+ tail call void @llvm.memcpy.p0i8.p0i8.i32(i8* %0, i8* %1, i32 %n, i32 4, i1 false)

+ ret void

+}

+declare void @llvm.memcpy.p0i8.p0i8.i32(i8* nocapture, i8* nocapture, i32, i32, i1) nounwind

diff --git a/test/MC/Mips/xgot.ll b/test/MC/Mips/xgot.ll
new file mode 100644
index 000000000000..bfe9b9ad6604
--- /dev/null
+++ b/test/MC/Mips/xgot.ll

@@ -0,0 +1,42 @@

+; RUN: llc -filetype=obj -mtriple mipsel-unknown-linux -mxgot %s -o - | elf-dump --dump-section-data | FileCheck %s

+@.str = private unnamed_addr constant [16 x i8] c"ext_1=%d, i=%d\0A\00", align 1

+@ext_1 = external global i32

+define void @fill() nounwind {

+entry:

+; Check that the appropriate relocations were created.

+; For the xgot case we want to see R_MIPS_[GOT|CALL]_[HI|LO]16.

+; R_MIPS_HI16

+; CHECK: ('r_type', 0x05)

+; R_MIPS_LO16

+; CHECK: ('r_type', 0x06)

+; R_MIPS_GOT_HI16

+; CHECK: ('r_type', 0x16)

+; R_MIPS_GOT_LO16

+; CHECK: ('r_type', 0x17)

+; R_MIPS_GOT

+; CHECK: ('r_type', 0x09)

+; R_MIPS_LO16

+; CHECK: ('r_type', 0x06)

+; R_MIPS_CALL_HI16

+; CHECK: ('r_type', 0x1e)

+; R_MIPS_CALL_LO16

+; CHECK: ('r_type', 0x1f)

+ %0 = load i32* @ext_1, align 4

+ %call = tail call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([16 x i8]* @.str, i32 0, i32 0), i32 %0) nounwind

+ ret void

+}

+declare i32 @printf(i8* nocapture, ...) nounwind
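Review note: the CHECK lines in xgot.ll match raw r_type numbers from elf-dump, with the relocation name given only in the preceding comment. A small lookup like the one below can make such output easier to read while reviewing. The numeric values are taken from the test comments in this diff; relocName is a hypothetical helper, and 0x09 is labelled R_MIPS_GOT16 here on the assumption that this is what the test's "R_MIPS_GOT" comment refers to.

```cpp
#include <string>

// Map the r_type values checked in xgot.ll to relocation names.
// Values come from the test comments; anything else is reported as unknown.
std::string relocName(unsigned RType) {
  switch (RType) {
  case 0x05: return "R_MIPS_HI16";
  case 0x06: return "R_MIPS_LO16";
  case 0x09: return "R_MIPS_GOT16";  // written as R_MIPS_GOT in the test
  case 0x16: return "R_MIPS_GOT_HI16";
  case 0x17: return "R_MIPS_GOT_LO16";
  case 0x1e: return "R_MIPS_CALL_HI16";
  case 0x1f: return "R_MIPS_CALL_LO16";
  default:   return "unknown";
  }
}
```

The four values 0x16, 0x17, 0x1e and 0x1f are the ones that only appear when -mxgot is in effect, matching the new cases added to MipsELFObjectWriter::GetRelocType.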

diff --git a/test/Transforms/SROA/basictest.ll b/test/Transforms/SROA/basictest.ll
index b363eefb3f9d..9fe926ee2cc1 100644
--- a/test/Transforms/SROA/basictest.ll
+++ b/test/Transforms/SROA/basictest.ll

@@ -1134,3 +1134,45 @@ entry:

ret void

; CHECK: ret

}

+define void @PR14465() {

+; Ensure that we don't crash when analyzing a alloca larger than the maximum

+; integer type width (MAX_INT_BITS) supported by llvm (1048576*32 > (1<<23)-1).

+; CHECK: @PR14465

+ %stack = alloca [1048576 x i32], align 16

+; CHECK: alloca [1048576 x i32]

+ %cast = bitcast [1048576 x i32]* %stack to i8*

+ call void @llvm.memset.p0i8.i64(i8* %cast, i8 -2, i64 4194304, i32 16, i1 false)

+ ret void

+; CHECK: ret

+}

+define void @PR14548(i1 %x) {

+; Handle a mixture of i1 and i8 loads and stores to allocas. This particular

+; pattern caused crashes and invalid output in the PR, and its nature will

+; trigger a mixture in several permutations as we resolve each alloca

+; iteratively.

+; Note that we don't do a particularly good *job* of handling these mixtures,

+; but the hope is that this is very rare.

+; CHECK: @PR14548

+entry:

+ %a = alloca <{ i1 }>, align 8

+ %b = alloca <{ i1 }>, align 8

+; Nothing of interest is simplified here.

+; CHECK: alloca

+ %b.i1 = bitcast <{ i1 }>* %b to i1*

+ store i1 %x, i1* %b.i1, align 8

+ %b.i8 = bitcast <{ i1 }>* %b to i8*

+ %foo = load i8* %b.i8, align 1

+ %a.i8 = bitcast <{ i1 }>* %a to i8*

+ call void @llvm.memcpy.p0i8.p0i8.i32(i8* %a.i8, i8* %b.i8, i32 1, i32 1, i1 false) nounwind

+ %bar = load i8* %a.i8, align 1

+ %a.i1 = getelementptr inbounds <{ i1 }>* %a, i32 0, i32 0

+ %baz = load i1* %a.i1, align 1

+ ret void

+}

diff --git a/test/Transforms/SROA/big-endian.ll b/test/Transforms/SROA/big-endian.ll
index ce82d1f30b57..1ac6d25d6341 100644
--- a/test/Transforms/SROA/big-endian.ll
+++ b/test/Transforms/SROA/big-endian.ll

@@ -82,14 +82,9 @@ entry:

%a0i16ptr = bitcast i8* %a0ptr to i16*

store i16 1, i16* %a0i16ptr

-; CHECK: %[[mask0:.*]] = and i16 1, -16

- %a1i4ptr = bitcast i8* %a1ptr to i4*

- store i4 1, i4* %a1i4ptr

-; CHECK-NEXT: %[[insert0:.*]] = or i16 %[[mask0]], 1

store i8 1, i8* %a2ptr

-; CHECK-NEXT: %[[mask1:.*]] = and i40 undef, 4294967295

+; CHECK: %[[mask1:.*]] = and i40 undef, 4294967295

; CHECK-NEXT: %[[insert1:.*]] = or i40 %[[mask1]], 4294967296

%a3i24ptr = bitcast i8* %a3ptr to i24*

@@ -110,7 +105,7 @@ entry:

%ai = load i56* %aiptr

%ret = zext i56 %ai to i64

ret i64 %ret

-; CHECK-NEXT: %[[ext4:.*]] = zext i16 %[[insert0]] to i56

+; CHECK-NEXT: %[[ext4:.*]] = zext i16 1 to i56

; CHECK-NEXT: %[[shift4:.*]] = shl i56 %[[ext4]], 40

; CHECK-NEXT: %[[mask4:.*]] = and i56 %[[insert3]], 1099511627775

; CHECK-NEXT: %[[insert4:.*]] = or i56 %[[mask4]], %[[shift4]]