3.17.30 SH Options

These -m options are defined for the SH implementations:

-m1

Generate code for the SH1.

-m2

Generate code for the SH2.

-m2e

Generate code for the SH2e.

-m3

Generate code for the SH3.

-m3e

Generate code for the SH3e.

-m4-nofpu

Generate code for the SH4 without a floating-point unit.

-m4-single-only

Generate code for the SH4 with a floating-point unit that only supports single-precision arithmetic.

-m4-single

Generate code for the SH4 assuming the floating-point unit is in single-precision mode by default.

-m4

Generate code for the SH4.

-m4a-nofpu

Generate code for the SH4al-dsp, or for an SH4a in such a way that the floating-point unit is not used.

-m4a-single-only

Generate code for the SH4a in such a way that no double-precision floating-point operations are used.

-m4a-single

Generate code for the SH4a assuming the floating-point unit is in single-precision mode by default.

-m4a

Generate code for the SH4a.

-m4al

Same as -m4a-nofpu, except that it implicitly passes -dsp to the assembler. GCC doesn't generate any DSP instructions at the moment.

-mb

Compile code for the processor in big endian mode.

-ml

Compile code for the processor in little endian mode.

-mdalign

Align doubles at 64-bit boundaries. Note that this changes the calling conventions, and thus some functions from the standard C library will not work unless you recompile it first with -mdalign.
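To see why objects built with and without -mdalign cannot be mixed freely, it helps to look at how the alignment of double moves data around. The following C sketch computes the layout of a small structure by hand. The names align_up and layout_int_then_double are illustrative, and the 4-byte baseline alignment used for comparison is an assumption, not something this section states.

```c
#include <stddef.h>

/* Round offset up to the next multiple of alignment (a power of two). */
size_t align_up(size_t offset, size_t alignment)
{
    return (offset + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical layout of struct { int i; double d; } with a 4-byte int
   and an 8-byte double whose required alignment is double_align.
   Returns the total size and stores the offset of d in *d_offset. */
size_t layout_int_then_double(size_t double_align, size_t *d_offset)
{
    size_t off = align_up(4, double_align);   /* d starts after the int */
    *d_offset = off;
    return align_up(off + 8, double_align);   /* pad the tail as well */
}
```

Under these assumptions the member d sits at offset 4 with 4-byte alignment and at offset 8 with 8-byte alignment, so two translation units that disagree on the setting would disagree on where the data lives.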

-mrelax

Shorten some address references at link time, when possible; uses the linker option -relax.

-mbigtable

Use 32-bit offsets in switch tables. The default is to use 16-bit offsets.
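As an illustration of the kind of code this option concerns, a dense switch such as the one below is a typical candidate for a table of branch offsets. The function and its opcode numbering are hypothetical, and whether GCC actually emits a table here depends on the optimization level and the case density.

```c
/* Hypothetical opcode decoder: returns the number of operands for an
   opcode, or -1 for an unknown opcode.  The consecutive case values
   make this a natural candidate for a switch table. */
int opcode_operands(int opcode)
{
    switch (opcode) {
    case 0: return 0;   /* nop */
    case 1: return 2;   /* mov */
    case 2: return 2;   /* add */
    case 3: return 2;   /* sub */
    case 4: return 1;   /* neg */
    case 5: return 1;   /* not */
    case 6: return 1;   /* jmp */
    case 7: return 0;   /* ret */
    default: return -1;
    }
}
```

In a function this small the default 16-bit offsets would presumably be sufficient; -mbigtable matters when the case bodies are far enough apart that an offset no longer fits in 16 bits.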

-mfmovd

Enable the use of the instruction fmovd.

-mhitachi

Comply with the calling conventions defined by Renesas.

-mrenesas

Comply with the calling conventions defined by Renesas.

-mno-renesas

Comply with the calling conventions defined for GCC before the Renesas conventions were available. This option is the default for all targets of the SH toolchain except for sh-symbianelf.

-mnomacsave

Mark the MAC register as call-clobbered, even if -mhitachi is given.

-mieee

Increase IEEE compliance of floating-point code. At the moment, this is equivalent to -fno-finite-math-only. When generating 16-bit SH opcodes, getting IEEE-conforming results for comparisons of NaNs and infinities incurs extra overhead in every floating-point comparison, so the default is -ffinite-math-only.
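The comparisons affected are the ones whose result depends on NaN or infinity operands. The C sketch below shows two such comparisons; the function names are illustrative. Under IEEE rules every ordered comparison with a NaN is false and a NaN compares unequal to itself. With -ffinite-math-only in effect the compiler is allowed to assume that no NaNs or infinities occur, so it may simplify these functions and the results below are then no longer guaranteed.

```c
#include <math.h>

/* Returns 1 when a < b.  Under IEEE rules the result is 0 whenever
   either operand is a NaN. */
int less_than(double a, double b)
{
    return a < b;
}

/* Returns 1 when x is a NaN, relying on the IEEE property that a NaN
   compares unequal to itself. */
int is_nan_by_compare(double x)
{
    return x != x;
}
```

Code that relies on this behaviour, for example to filter out NaN inputs, is the case for which -mieee is worth the extra cost per comparison.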

-minline-ic_invalidate

Inline code to invalidate instruction cache entries after setting up nested function trampolines. This option has no effect if -musermode is in effect and the selected code generation option (e.g. -m4) does not allow the use of the icbi instruction. If the selected code generation option does not allow the use of the icbi instruction, and -musermode is not in effect, the inlined code will manipulate the instruction cache address array directly with an associative write. This not only requires privileged mode, but it will also fail if the cache line had been mapped via the TLB and has become unmapped.

-misize

Dump instruction size and location in the assembly code.

-mpadstruct

This option is deprecated. It pads structures to a multiple of 4 bytes, which is incompatible with the SH ABI.

-mspace

Optimize for space instead of speed. Implied by -Os.

-mprefergot

When generating position-independent code, emit function calls using the Global Offset Table instead of the Procedure Linkage Table.

-musermode

Don't generate privileged mode only code; implies -mno-inline-ic_invalidate if the inlined code would not work in user mode. This is the default when the target is sh-*-linux*.

-multcost=number

Set the cost to assume for a multiply insn.

-mdiv=strategy

Set the division strategy to use for SHmedia code. strategy must be one of: call, call2, fp, inv, inv:minlat, inv20u, inv20l, inv:call, inv:call2, inv:fp.

"fp" performs the operation in floating point. This has a very high latency, but needs only a few instructions, so it might be a good choice if your code has enough easily exploitable ILP to allow the compiler to schedule the floating-point instructions together with other instructions. Division by zero causes a floating-point exception.

"inv" uses integer operations to calculate the inverse of the divisor, and then multiplies the dividend by the inverse. This strategy allows CSE and hoisting of the inverse calculation. Division by zero calculates an unspecified result, but does not trap.

"inv:minlat" is a variant of "inv" where, if no CSE or hoisting opportunities have been found, or if the entire operation has been hoisted to the same place, the last stages of the inverse calculation are intertwined with the final multiply to reduce the overall latency, at the expense of using a few more instructions, and thus offering fewer scheduling opportunities with other code.

"call" calls a library function that usually implements the inv:minlat strategy. This gives high code density for m5-*media-nofpu compilations.

"call2" uses a different entry point of the same library function, where it assumes that a pointer to a lookup table has already been set up, which exposes the pointer load to CSE and code hoisting optimizations.

"inv:call", "inv:call2" and "inv:fp" all use the "inv" algorithm for initial code generation, but if the code stays unoptimized, revert to the "call", "call2", or "fp" strategies, respectively. Note that the potentially-trapping side effect of division by zero is carried by a separate instruction, so it is possible that all the integer instructions are hoisted out, but the marker for the side effect stays where it is. A recombination to fp operations or a call is not possible in that case.

"inv20u" and "inv20l" are variants of the "inv:minlat" strategy. In the case that the inverse calculation was not separated from the multiply, they speed up division where the dividend fits into 20 bits (plus sign where applicable), by inserting a test to skip a number of operations in this case; this test slows down the case of larger dividends. inv20u assumes such a small dividend to be unlikely, and inv20l assumes it to be likely.
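The idea behind the "inv" family can be sketched in portable C: compute the inverse of the divisor once, then turn each division into a multiply. This is only an illustration of the principle, not the instruction sequence GCC emits for SHmedia. All function names are hypothetical, the sketch handles unsigned 32-bit operands only, and it requires a nonzero divisor.

```c
#include <stddef.h>
#include <stdint.h>

/* Fixed-point inverse of d, scaled by 2^32.  Assumes d != 0. */
uint64_t reciprocal_u32(uint32_t d)
{
    return (UINT64_C(1) << 32) / d;
}

/* Divide n by d using the precomputed inverse.  The multiply yields a
   quotient that is at most one too small, so one correction suffices. */
uint32_t div_by_reciprocal(uint32_t n, uint32_t d, uint64_t inv)
{
    uint32_t q = (uint32_t)(((uint64_t)n * inv) >> 32);
    if (n - q * d >= d)
        q++;
    return q;
}

/* Divide every element by the same divisor.  The inverse is computed
   once outside the loop, which is the hoisting opportunity that the
   "inv" strategy exposes. */
void divide_all(uint32_t *v, size_t count, uint32_t d)
{
    uint64_t inv = reciprocal_u32(d);
    for (size_t i = 0; i < count; i++)
        v[i] = div_by_reciprocal(v[i], d, inv);
}

/* The kind of test inv20u and inv20l insert: does the dividend fit
   into 20 bits plus sign?  The exact bound is an assumption. */
int fits_in_20_bits(int32_t x)
{
    return x > -(INT32_C(1) << 20) && x < (INT32_C(1) << 20);
}
```

In divide_all the expensive step runs once per call rather than once per element, which is why a strategy that keeps the inverse calculation separate can pay off in loops with a loop-invariant divisor.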

-mdivsi3_libfunc=name

Set the name of the library function used for 32-bit signed division to name. This only affects the name used in the call and inv:call division strategies, and the compiler will still expect the same sets of input/output/clobbered registers as if this option were not present.

-madjust-unroll

Throttle unrolling to avoid thrashing target registers. This option only has an effect if the GCC code base supports the TARGET_ADJUST_UNROLL_MAX target hook.

-mindexed-addressing

Enable the use of the indexed addressing mode for SHmedia32/SHcompact. This is only safe if the hardware and/or OS implement 32-bit wrap-around semantics for the indexed addressing mode. The architecture allows the implementation of processors with a 64-bit MMU, which the OS could use to get 32-bit addressing. However, no current hardware implementation supports this or any other way to make the indexed addressing mode safe to use in the 32-bit ABI, so the default is -mno-indexed-addressing.
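The difference between wrap-around and full-width address arithmetic can be shown with a small C model. This is a simplified sketch: the function names are illustrative, both operands are treated as zero-extended 32-bit values, and the way real SH hardware extends registers is not modeled.

```c
#include <stdint.h>

/* Effective address with 32-bit wrap-around: the sum is reduced
   modulo 2^32, so it always stays inside the 32-bit address space. */
uint32_t ea_wrap32(uint32_t base, uint32_t index)
{
    return base + index;
}

/* Effective address computed at full 64-bit width: a carry out of
   bit 31 produces an address outside the 32-bit address space. */
uint64_t ea_full64(uint32_t base, uint32_t index)
{
    return (uint64_t)base + (uint64_t)index;
}
```

With base 0xFFFFFFF0 and index 0x20, the first function yields 0x10 while the second yields 0x100000010, an address that a 32-bit program never intended to touch.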

-mgettrcost=number

Set the cost assumed for the gettr instruction to number. The default is 2 if -mpt-fixed is in effect, 100 otherwise.

-mpt-fixed

Assume pt* instructions won't trap. This will generally generate better scheduled code, but is unsafe on current hardware. The current architecture definition says that ptabs and ptrel trap when the target ANDed with 3 is 3. This has the unintentional effect of making it unsafe to schedule ptabs / ptrel before a branch, or hoist it out of a loop.

For example, __do_global_ctors, a part of libgcc that runs constructors at program startup, calls functions in a list which is delimited by −1. With the -mpt-fixed option, the ptabs will be done before testing against −1. That means that all the constructors will be run a bit quicker, but when the loop comes to the end of the list, the program crashes because ptabs loads −1 into a target register.

Since this option is unsafe for any hardware implementing the current architecture specification, the default is -mno-pt-fixed. Unless the user specifies a specific cost with -mgettrcost, -mno-pt-fixed also implies -mgettrcost=100; this deters register allocation using target registers for storing ordinary integers.
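The constructor-list hazard can be sketched in C. The code below is a simplified model, not the libgcc source: the names ctor_fn, CTOR_END, demo_ctors, run_ctors and pt_would_trap are all hypothetical. The loop tests each entry against the −1 delimiter before using it as a call target, which is the ordering that -mno-pt-fixed preserves.

```c
#include <stdint.h>

typedef void (*ctor_fn)(void);

/* List delimiter: the value -1 converted to a function pointer. */
#define CTOR_END ((ctor_fn)(intptr_t)(-1))

static int ctor_calls;                 /* counts constructor runs */
static void ctor_a(void) { ctor_calls++; }
static void ctor_b(void) { ctor_calls++; }

/* Hypothetical constructor list terminated by the delimiter. */
static const ctor_fn demo_ctors[] = { ctor_a, ctor_b, CTOR_END };

/* Model of the trap rule quoted above: ptabs and ptrel trap when the
   target ANDed with 3 is 3. */
int pt_would_trap(uintptr_t target)
{
    return (target & 3) == 3;
}

/* Call every entry up to the delimiter and return the number of calls.
   The delimiter test comes first, so -1 is never used as a target. */
int run_ctors(const ctor_fn *list)
{
    int n = 0;
    for (; *list != CTOR_END; list++) {
        (*list)();
        n++;
    }
    return n;
}
```

Since −1 has its two low bits set, pt_would_trap reports a trap for the delimiter. Preparing the branch target before the comparison, as -mpt-fixed permits, would therefore hit the trap on the final iteration.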

-minvalid-symbols

Assume symbols might be invalid. Ordinary function symbols generated by the compiler will always be valid to load with movi/shori/ptabs or movi/shori/ptrel, but with assembler and/or linker tricks it is possible to generate symbols that will cause ptabs / ptrel to trap. This option is only meaningful when -mno-pt-fixed is in effect. It will then prevent cross-basic-block cse, hoisting and most scheduling of symbol loads. The default is -mno-invalid-symbols.