Here is a table of the instruction names that are meaningful in the RTL generation pass of the compiler. Giving one of these names to an instruction pattern tells the RTL generation pass that it can use the pattern to accomplish a certain task.
movm
Here m stands for a machine mode name in lowercase; for example, movsi moves full-word data.
If operand 0 is a subreg with mode m of a register whose own mode is wider than m, the effect of this instruction is to store the specified value in the part of the register that corresponds to mode m. Bits outside of m, but which are within the same target word as the subreg, are undefined. Bits which are outside the target word are left unchanged.
This class of patterns is special in several ways. First of all, each of these names up to and including full word size must be defined, because there is no other way to copy a datum from one place to another. If there are patterns accepting operands in larger modes, movm must be defined for integer modes of those sizes.
Second, these patterns are not used solely in the RTL generation pass. Even the reload pass can generate move insns to copy values from stack slots into temporary registers. When it does so, one of the operands is a hard register and the other is an operand that may need to be reloaded into a register.
Therefore, when given such a pair of operands, the pattern must generate RTL which needs no reloading and needs no temporary registers--no registers other than the operands. For example, if you support the pattern with a define_expand, then in such a case the define_expand mustn't call force_reg or any other such function which might generate new pseudo registers.
This requirement exists even for subword modes on a RISC machine where fetching those modes from memory normally requires several insns and some temporary registers.
During reload a memory reference with an invalid address may be passed as an operand. Such an address will be replaced with a valid address later in the reload pass. In this case, nothing may be done with the address except to use it as it stands. If it is copied, it will not be replaced with a valid address. No attempt should be made to make such an address into a valid address and no routine (such as change_address) that will do so may be called. Note that general_operand will fail when applied to such an address.
The global variable reload_in_progress (which must be explicitly declared if required) can be used to determine whether such special handling is required.
The variety of operands that have reloads depends on the rest of the machine description, but typically on a RISC machine these can only be pseudo registers that did not get hard registers, while on other machines explicit memory references will get optional reloads.
If a scratch register is required to move an object to or from memory, it can be allocated using gen_reg_rtx prior to life analysis.
If there are cases which need scratch registers during or after reload, you must define SECONDARY_INPUT_RELOAD_CLASS and/or SECONDARY_OUTPUT_RELOAD_CLASS to detect them, and provide patterns reload_inm or reload_outm to handle them. See Register Classes.
The global variable no_new_pseudos can be used to determine if it is unsafe to create new pseudo registers. If this variable is nonzero, then it is unsafe to call gen_reg_rtx to allocate a new pseudo.
The constraints on a movm must permit moving any hard register to any other hard register provided that HARD_REGNO_MODE_OK permits mode m in both registers and REGISTER_MOVE_COST applied to their classes returns a value of 2.
It is obligatory to support floating point movm instructions into and out of any registers that can hold fixed point values, because unions and structures (which have modes SImode or DImode) can be in those registers and they may have floating point members.
There may also be a need to support fixed point movm instructions in and out of floating point registers. Unfortunately, I have forgotten why this was so, and I don't know whether it is still true. If HARD_REGNO_MODE_OK rejects fixed point values in floating point registers, then the constraints of the fixed point movm instructions must be designed to avoid ever trying to reload into a floating point register.
reload_inm
reload_outm
Like movm, but used when a scratch register is required to move between operand 0 and operand 1. Operand 2 describes the scratch register. See the discussion of the SECONDARY_RELOAD_CLASS macro in Register Classes.
There are special restrictions on the form of the match_operands used in these patterns. First, only the predicate for the reload operand is examined, i.e., reload_in examines operand 1, but not the predicates for operand 0 or 2. Second, there may be only one alternative in the constraints. Third, only a single register class letter may be used for the constraint; subsequent constraint letters are ignored. As a special exception, an empty constraint string matches the ALL_REGS register class. This may relieve ports of the burden of defining an ALL_REGS constraint letter just for these patterns.
movstrictm
Like movm except that if operand 0 is a subreg with mode m of a register whose natural mode is wider, the movstrictm instruction is guaranteed not to alter any of the register except the part which belongs to mode m.
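The guarantee described for movstrictm can be pictured with a small model. This is only a sketch: the function name, the 32-bit register width, and the choice of the least significant bits as "the part which belongs to mode m" are assumptions made for illustration, not GCC code.

```python
def store_strict_low_part(reg_value, new_value, part_bits, reg_bits=32):
    """Model a movstrict-style store (hypothetical helper).

    Only the low part_bits bits of the register change; every other
    bit of the wider register keeps its previous value.
    """
    part_mask = (1 << part_bits) - 1
    reg_mask = (1 << reg_bits) - 1
    return ((reg_value & ~part_mask) | (new_value & part_mask)) & reg_mask


# Writing an 8-bit value into a 32-bit register leaves the upper 24 bits alone.
print(hex(store_strict_low_part(0x12345678, 0xAB, 8)))
```

By contrast, the plain movm description leaves the other bits of the same word undefined, so a model of it could return anything in those positions.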
load_multiple
Define this only if the target machine really has such an instruction; do not define this if the most efficient way of loading consecutive registers from memory is to do them one at a time.
On some machines, there are restrictions as to which consecutive registers can be stored into memory, such as particular starting or ending register numbers or only a range of valid counts. For those machines, use a define_expand (see Expander Definitions) and make the pattern fail if the restrictions are not met.
Write the generated insn as a parallel with elements being a set of one register from the appropriate memory location (you may also need use or clobber elements). Use a match_parallel (see RTL Template) to recognize the insn. See rs6000.md for examples of the use of this insn pattern.
store_multiple
Like load_multiple, but store several consecutive registers into consecutive memory locations. Operand 0 is the first of the consecutive memory locations, operand 1 is the first register, and operand 2 is a constant: the number of consecutive registers.
pushm
PUSH_ROUNDING is defined. For historical reasons, this pattern may be missing, in which case a mov expander is used instead, with a MEM expression forming the push operation. The mov expander method is deprecated.
addm3
subm3, mulm3
divm3, udivm3, modm3, umodm3
sminm3, smaxm3, uminm3, umaxm3
andm3, iorm3, xorm3
minm3, maxm3
mulhisi3
Multiply operands 1 and 2, which have mode HImode, and store a SImode product in operand 0.
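The mulhisi3 entry pairs HImode inputs with a SImode product. A minimal sketch of that widening behaviour follows, assuming HImode is 16 bits, SImode is 32 bits, and the inputs are treated as signed (the u-prefixed names listed next would be the unsigned counterparts). The helper names are hypothetical.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def widening_mul_16_to_32(a, b):
    """Multiply two 16-bit signed inputs and return the 32-bit result pattern."""
    product = to_signed(a, 16) * to_signed(b, 16)
    return product & 0xFFFFFFFF


# The full product of two 16-bit values always fits in 32 bits.
print(hex(widening_mul_16_to_32(0x7FFF, 0x7FFF)))
```

The point of the widening form is that no bits of the product are lost, which is not true of a plain 16-bit multiply.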
mulqihi3, mulsidi3
umulqihi3, umulhisi3, umulsidi3
smulm3_highpart
umulm3_highpart
divmodm4
For machines with an instruction that produces both a quotient and a remainder, provide a pattern for divmodm4 but do not provide patterns for divm3 and modm3. This allows optimization in the relatively common case when both the quotient and remainder are computed.
If an instruction that just produces a quotient or just a remainder exists and is more efficient than the instruction that produces both, write the output routine of divmodm4 to call find_reg_note and look for a REG_UNUSED note on the quotient or remainder and generate the appropriate instruction.
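To make the "both results from one operation" idea concrete, here is a rough model of a combined signed divide. It assumes division truncates toward zero, as C integer division does; Python's built-in divmod floors instead, so the sketch does not use it. The function name is hypothetical.

```python
def trunc_divmod(dividend, divisor):
    """Return (quotient, remainder) with the quotient truncated toward zero."""
    quotient = abs(dividend) // abs(divisor)
    if (dividend < 0) != (divisor < 0):
        quotient = -quotient
    remainder = dividend - quotient * divisor
    return quotient, remainder


# One call yields both values, which is the case the combined pattern serves.
q, r = trunc_divmod(-7, 2)
print(q, r)
```

When only one of the two results is live, the text above suggests the output routine may pick a cheaper single-result instruction instead.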
udivmodm4
ashlm3
ashrm3, lshrm3, rotlm3, rotrm3
ashlm3 instructions.
negm2
absm2
sqrtm2
The sqrt built-in function of C always uses the mode which corresponds to the C data type double and the sqrtf built-in function uses the mode which corresponds to the C data type float.
cosm2
The cos built-in function of C always uses the mode which corresponds to the C data type double and the cosf built-in function uses the mode which corresponds to the C data type float.
sinm2
The sin built-in function of C always uses the mode which corresponds to the C data type double and the sinf built-in function uses the mode which corresponds to the C data type float.
expm2
The exp built-in function of C always uses the mode which corresponds to the C data type double and the expf built-in function uses the mode which corresponds to the C data type float.
logm2
The log built-in function of C always uses the mode which corresponds to the C data type double and the logf built-in function uses the mode which corresponds to the C data type float.
floorm2
The floor built-in function of C always uses the mode which corresponds to the C data type double and the floorf built-in function uses the mode which corresponds to the C data type float.
truncm2
The trunc built-in function of C always uses the mode which corresponds to the C data type double and the truncf built-in function uses the mode which corresponds to the C data type float.
roundm2
The round built-in function of C always uses the mode which corresponds to the C data type double and the roundf built-in function uses the mode which corresponds to the C data type float.
ceilm2
The ceil built-in function of C always uses the mode which corresponds to the C data type double and the ceilf built-in function uses the mode which corresponds to the C data type float.
nearbyintm2
The nearbyint built-in function of C always uses the mode which corresponds to the C data type double and the nearbyintf built-in function uses the mode which corresponds to the C data type float.
ffsm2
The ffs built-in function of C always uses the mode which corresponds to the C data type int.
one_cmplm2
cmpm
     (set (cc0) (compare (match_operand:m 0 ...)
                         (match_operand:m 1 ...)))
tstm
     (set (cc0) (match_operand:m 0 ...))
tstm patterns should not be defined for machines that do not use (cc0). Doing so would confuse the optimizer since it would no longer be clear which set operations were comparisons. The cmpm patterns should be used instead.
movstrm
Pmode.
The number of bytes to move is the third operand, in mode m.
Usually, you specify word_mode for m. However, if you can generate better code knowing the range of valid lengths is smaller than those representable in a full word, you should provide a pattern with a mode corresponding to the range of values you can handle efficiently (e.g., QImode for values in the range 0-127; note we avoid numbers that appear negative) and also a pattern with word_mode.
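The 0-127 range quoted for QImode follows from keeping the length non-negative when read as a signed value of that width. The sketch below illustrates the arithmetic; the function names and the candidate widths (8, 16, 32 bits) are assumptions chosen for the example.

```python
def length_fits_mode(length, mode_bits):
    """True if length is representable without appearing negative in mode_bits."""
    return 0 <= length <= (1 << (mode_bits - 1)) - 1


def narrowest_mode_bits(max_length, candidate_bits=(8, 16, 32)):
    """Pick the smallest candidate width that can hold every valid length."""
    for bits in candidate_bits:
        if length_fits_mode(max_length, bits):
            return bits
    return None


# 127 still fits an 8-bit mode; 128 would look negative and needs a wider one.
print(narrowest_mode_bits(127), narrowest_mode_bits(128))
```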
The fourth operand is the known shared alignment of the source and destination, in the form of a const_int rtx. Thus, if the compiler knows that both source and destination are word-aligned, it may provide the value 4 for this operand.
Descriptions of multiple movstrm patterns can only be beneficial if the patterns for smaller modes have fewer restrictions on their first, second and fourth operands. Note that the mode m in movstrm does not impose any restriction on the mode of individually moved data units in the block.
These patterns need not give special consideration to the possibility
that the source and destination strings might overlap.
clrstrm
Pmode. The number of bytes to clear is the second operand, in mode m. See movstrm for a discussion of the choice of mode.
The third operand is the known alignment of the destination, in the form of a const_int rtx. Thus, if the compiler knows that the destination is word-aligned, it may provide the value 4 for this operand.
The use for multiple clrstrm is as for movstrm.
cmpstrm
movstrm. The two memory blocks specified are compared byte by byte in lexicographic order. The effect of the instruction is to store a value in operand 0 whose sign indicates the result of the comparison.
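A byte-by-byte lexicographic comparison whose result carries meaning only in its sign can be sketched as follows. The function name is hypothetical, and treating bytes as unsigned values is an assumption of this model.

```python
def cmpstr_model(block_a, block_b, length):
    """Compare the first `length` bytes; only the sign of the result matters."""
    for byte_a, byte_b in zip(block_a[:length], block_b[:length]):
        if byte_a != byte_b:
            # The first differing byte decides the ordering.
            return byte_a - byte_b
    return 0


result = cmpstr_model(b"abc", b"abd", 3)
print("less" if result < 0 else "equal" if result == 0 else "greater")
```

Callers of such an operation should test the result against zero rather than rely on a particular magnitude.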
strlenm
mem referring to the first character of the string, operand 2 is the character to search for (normally zero), and operand 3 is a constant describing the known alignment of the beginning of the string.
floatmn2
floatunsmn2
fixmn2
fixunsmn2
ftruncm2
fix_truncmn2
Like fixmn2 but works for any floating point value of mode m by converting the value to an integer.
fixuns_truncmn2
Like fixunsmn2 but works for any floating point value of mode m by converting the value to an integer.
truncmn2
extendmn2
zero_extendmn2
extv
word_mode.
Operand 1 may have mode byte_mode or word_mode; often word_mode is allowed only for registers. Operands 2 and 3 must be valid for word_mode.
The RTL generation pass generates this instruction only with constants for operands 2 and 3.
The bit-field value is sign-extended to a full word integer
before it is stored in operand 0.
extzv
Like extv except that the bit-field value is zero-extended.
insv
word_mode) into a bit-field in operand 0, where operand 1 specifies the width in bits and operand 2 the starting bit. Operand 0 may have mode byte_mode or word_mode; often word_mode is allowed only for registers. Operands 1 and 2 must be valid for word_mode.
The RTL generation pass generates this instruction only with constants for operands 1 and 2.
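The three bit-field patterns (extv, extzv, insv) differ only in whether they read or write the field and whether the result is sign- or zero-extended. The following sketch models them on plain integers. It assumes bit positions are counted from the least significant bit and a 32-bit word; real targets may number bits from the other end, and the Python function names merely echo the pattern names.

```python
def extzv(word, width, start):
    """Extract `width` bits beginning at bit `start`, zero-extended."""
    return (word >> start) & ((1 << width) - 1)


def extv(word, width, start):
    """Extract the same field, but sign-extend it."""
    field = extzv(word, width, start)
    sign_bit = 1 << (width - 1)
    return field - (1 << width) if field & sign_bit else field


def insv(word, width, start, value, word_bits=32):
    """Return `word` with the field replaced by the low bits of `value`."""
    mask = ((1 << width) - 1) << start
    word_mask = (1 << word_bits) - 1
    return ((word & ~mask) | ((value << start) & mask)) & word_mask


# The same 4-bit field reads as 11 unsigned and as -5 signed.
print(extzv(0xB0, 4, 4), extv(0xB0, 4, 4))
```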
movmodecc
The mode of the operands being compared need not be the same as the operands being moved. Some machines, sparc64 for example, have instructions that conditionally move an integer value based on the floating point condition codes and vice versa.
If the machine does not have conditional move instructions, do not
define these patterns.
addmodecc
Similar to movmodecc but for conditional addition. Conditionally move operand 2 or (operand 2 + operand 3) into operand 0 according to the comparison in operand 1. If the comparison is false, operand 2 is moved into operand 0, otherwise (operand 2 + operand 3) is moved.
scond
eq, lt or leu.
You specify the mode that the operand must have when you write the match_operand expression. The compiler automatically sees which mode you have used and supplies an operand of that mode.
The value stored for a true condition must have 1 as its low bit, or else must be negative. Otherwise the instruction is not suitable and you should omit it from the machine description. You describe to the compiler exactly which value is stored by defining the macro STORE_FLAG_VALUE (see Misc). If a description cannot be found that can be used for all the scond patterns, you should omit those operations from the machine description.
These operations may fail, but should do so only in relatively uncommon cases; if they would fail for common cases involving integer comparisons, it is best to omit these patterns.
If these operations are omitted, the compiler will usually generate code that copies the constant one to the target and branches around an assignment of zero to the target. If this code is more efficient than the potential instructions used for the scond pattern followed by those required to convert the result into a 1 or a zero in SImode, you should omit the scond operations from the machine description.
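The rule for the value stored on a true condition, and the extra step of converting it to 1 or 0, can be pictured with a small check. This is only an illustrative model; both function names are hypothetical and nothing here is part of GCC.

```python
def is_usable_flag_value(true_value):
    """Check the stated rule: low bit is 1, or the value is negative."""
    return (true_value & 1) == 1 or true_value < 0


def normalize_flag(stored, true_value):
    """Convert what the store-flag insn produced into a plain 0 or 1."""
    return 1 if stored == true_value else 0


# Both 1 and -1 satisfy the rule; 2 satisfies neither half of it.
print(is_usable_flag_value(1), is_usable_flag_value(-1), is_usable_flag_value(2))
```

The cost of a step like normalize_flag is what the paragraph above asks you to weigh against the branch-based fallback.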
bcond
label_ref that refers to the label to jump to. Jump if the condition codes meet condition cond.
Some machines do not follow the model assumed here where a comparison instruction is followed by a conditional branch instruction. In that case, the cmpm (and tstm) patterns should simply store the operands away and generate all the required insns in a define_expand (see Expander Definitions) for the conditional branch operations. All calls to expand bcond patterns are immediately preceded by calls to expand either a cmpm pattern or a tstm pattern.
Machines that use a pseudo register for the condition code value, or where the mode used for the comparison depends on the condition being tested, should also use the above mechanism. See Jump Patterns.
The above discussion also applies to the movmodecc and scond patterns.
jump
label_ref of the label to jump to. This pattern name is mandatory on all machines.
call
const_int; operand 2 is the number of registers used as operands.
On most machines, operand 2 is not actually stored into the RTL pattern. It is supplied for the sake of some RISC machines which need to put this information into the assembler code; they can put it in the RTL instead of operand 1.
Operand 0 should be a mem RTX whose address is the address of the function. Note, however, that this address can be a symbol_ref expression even if it would not be a legitimate memory address on the target machine. If it is also not a valid argument for a call instruction, the pattern for this operation should be a define_expand (see Expander Definitions) that places the address into a register and uses that register in the call instruction.
call_value
call instruction (but with numbers increased by one).
Subroutines that return BLKmode objects use the call insn.
call_pop, call_value_pop
Similar to call and call_value, except used if defined and if RETURN_POPS_ARGS is nonzero. They should emit a parallel that contains both the function call and a set to indicate the adjustment made to the frame pointer.
For machines where RETURN_POPS_ARGS can be nonzero, the use of these patterns increases the number of functions for which the frame pointer can be eliminated, if desired.
untyped_call
parallel expression where each element is a set expression that indicates the saving of a function return value into the result block.
This instruction pattern should be defined to support __builtin_apply on machines where special instructions are needed to call a subroutine with arbitrary arguments or to save the value returned. This instruction pattern is required on machines that have multiple registers that can hold a return value (i.e., FUNCTION_VALUE_REGNO_P is true for more than one register).
return
Like the movm patterns, this pattern is also used after the RTL generation phase. In this case it is to support machines where multiple instructions are usually needed to return from a function, but some class of functions only requires one instruction to implement a return. Normally, the applicable functions are those which do not need to save any registers or allocate stack space.
For such machines, the condition specified in this pattern should only be true when reload_completed is nonzero and the function's epilogue would only be a single instruction. For machines with register windows, the routine leaf_function_p may be used to determine if a register window push is required.
Machines that have conditional return instructions should define patterns such as
     (define_insn ""
       [(set (pc)
             (if_then_else (match_operator
                              0 "comparison_operator"
                              [(cc0) (const_int 0)])
                           (return)
                           (pc)))]
       "condition"
       "...")
where condition would normally be the same condition specified on the named return pattern.
untyped_return
__builtin_return on machines where special instructions are needed to return a value of any type.
Operand 0 is a memory location where the result of calling a function with __builtin_apply is stored; operand 1 is a parallel expression where each element is a set expression that indicates the restoring of a function return value from the result block.
nop
(const_int 0) will do as an RTL pattern.
indirect_jump
casesi
SImode.
(If CASE_DROPS_THROUGH is defined, then an out-of-bounds index drops through to the code following the jump table instead of jumping to this label. In that case, this label is not actually used by the casesi instruction, but it is always provided as an operand.)
The table is an addr_vec or addr_diff_vec inside of a jump_insn. The number of elements in the table is one plus the difference between the upper bound and the lower bound.
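The table-size rule lends itself to a short model of a bounds-checked dispatch. This sketch is not GCC code: the function and parameter names are hypothetical, the range argument is taken to be upper bound minus lower bound, and it models the case where an out-of-range index goes to a default label rather than dropping through.

```python
def casesi_model(index, lower_bound, bound_range, table, default_label):
    """Pick a jump target from `table`, or the default when out of bounds.

    bound_range is (upper bound - lower bound), so the table holds
    bound_range + 1 entries.
    """
    assert len(table) == bound_range + 1
    offset = index - lower_bound
    if 0 <= offset <= bound_range:
        return table[offset]
    return default_label


# Case values 10..12 need a table with three entries.
labels = ["L10", "L11", "L12"]
print(casesi_model(11, 10, 2, labels, "Ldefault"))
```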
tablejump
casesi pattern.
This pattern requires two operands: the address or offset, and a label which should immediately precede the jump table. If the macro CASE_VECTOR_PC_RELATIVE evaluates to a nonzero value then the first operand is an offset which counts from the address of the table; otherwise, it is an absolute address to jump to. In either case, the first operand has mode Pmode.
The tablejump insn is always the last insn before the jump table it uses. Its assembler code normally has no need to use the second operand, but you should incorporate it in the RTL pattern so that the jump optimizer will not delete the table as unreachable code.
decrement_and_branch_until_zero
This optional instruction pattern is only used by the combiner,
typically for loops reversed by the loop optimizer when strength
reduction is enabled.
doloop_end
const_int or const0_rtx if this cannot be determined until run-time; operand 2 is the actual or estimated maximum number of iterations as a const_int; operand 3 is the number of enclosed loops as a const_int (an innermost loop has a value of 1); operand 4 is the label to jump to if the register is nonzero. See Looping Patterns.
This optional instruction pattern should be defined for machines with low-overhead looping instructions as the loop optimizer will try to modify suitable loops to utilize it. If nested low-overhead looping is not supported, use a define_expand (see Expander Definitions) and make the pattern fail if operand 3 is not const1_rtx. Similarly, if the actual or estimated maximum number of iterations is too large for this instruction, make it fail.
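The two failure conditions described for doloop_end amount to a simple acceptance test. The sketch below models that decision only; the function name, the 16-bit hardware counter, and the nested_supported flag are assumptions made for illustration.

```python
def doloop_end_usable(max_iterations, enclosed_loops,
                      counter_bits=16, nested_supported=False):
    """Decide whether a low-overhead loop instruction could be used.

    enclosed_loops mirrors operand 3 (1 means an innermost loop) and
    max_iterations mirrors operand 2.
    """
    if enclosed_loops != 1 and not nested_supported:
        return False  # nested low-overhead loops are not available
    return max_iterations <= (1 << counter_bits) - 1


# An innermost loop with a small trip count qualifies; an outer loop does not.
print(doloop_end_usable(100, 1), doloop_end_usable(100, 2))
```

In a machine description, returning False here corresponds to making the define_expand fail so the compiler keeps the ordinary loop.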
doloop_begin
doloop_end required for machines that need to perform some initialization, such as loading special registers used by a low-overhead looping instruction. If initialization insns do not always need to be emitted, use a define_expand (see Expander Definitions) and make it fail.
canonicalize_funcptr_for_compare
Operand 0 is always a reg and has mode Pmode; operand 1 may be a reg, mem, symbol_ref, const_int, etc. and also has mode Pmode.
Canonicalization of a function pointer usually involves computing the address of the function which would be called if the function pointer were used in an indirect call.
Only define this pattern if function pointers on the target machine
can have different values but still call the same function when
used in an indirect call.
save_stack_block
save_stack_function
save_stack_nonlocal
restore_stack_block
restore_stack_function
restore_stack_nonlocal
Pmode. Do not define these patterns on such machines.
Some machines require special handling for stack pointer saves and restores. On those machines, define the patterns corresponding to the non-standard cases by using a define_expand (see Expander Definitions) that produces the required insns. The three types of saves and restores are:
save_stack_block saves the stack pointer at the start of a block that allocates a variable-sized object, and restore_stack_block restores the stack pointer when the block is exited.
save_stack_function and restore_stack_function do a similar job for the outermost block of a function and are used when the function allocates variable-sized objects or calls alloca. Only the epilogue uses the restored stack pointer, allowing a simpler save or restore sequence on some machines.
save_stack_nonlocal is used in functions that contain labels branched to by nested functions. It saves the stack pointer in such a way that the inner function can use restore_stack_nonlocal to restore the stack pointer. The compiler generates code to restore the frame and argument pointer registers, but some machines require saving and restoring additional data such as register window information or stack backchains. Place insns in these patterns to save and restore any such required data.
When saving the stack pointer, operand 0 is the save area and operand 1 is the stack pointer. The mode used to allocate the save area defaults to Pmode but you can override that choice by defining the STACK_SAVEAREA_MODE macro (see Storage Layout). You must specify an integral mode, or VOIDmode if no save area is needed for a particular type of save (either because no save is needed or because a machine-specific save area can be used). Operand 0 is the stack pointer and operand 1 is the save area for restore operations. If save_stack_block is defined, operand 0 must not be VOIDmode since these saves can be arbitrarily nested.
A save area is a mem that is at a constant offset from virtual_stack_vars_rtx when the stack pointer is saved for use by nonlocal gotos and a reg in the other two cases.
allocate_stack
Subtract (or add if STACK_GROWS_DOWNWARD is undefined) operand 1 from the stack pointer to create space for dynamically allocated data.
Store the resultant pointer to this space into operand 0. If you are allocating space from the main stack, do this by emitting a move insn to copy virtual_stack_dynamic_rtx to operand 0. If you are allocating the space elsewhere, generate code to copy the location of the space to operand 0. In the latter case, you must ensure this space gets freed when the corresponding space on the main stack is freed.
Do not define this pattern if all that must be done is the subtraction.
Some machines require other operations such as stack probes or
maintaining the back chain. Define this pattern to emit those
operations in addition to updating the stack pointer.
probe
If you need to emit instructions before the stack has been adjusted, put them into the allocate_stack pattern. Otherwise, define this pattern to emit the required instructions.
No operands are provided.
check_stack
nonlocal_goto
On most machines you need not define this pattern, since GCC will already generate the correct code, which is to load the frame pointer and static chain, restore the stack (using the restore_stack_nonlocal pattern, if defined), and jump indirectly to the dispatcher. You need only define this pattern if this code will not work on your machine.
nonlocal_goto_receiver
exception_receiver
builtin_setjmp_setup
jmp_buf. You will not normally need to define this pattern.
A typical reason why you might need this pattern is if some value, such as a pointer to a global table, must be restored, though it is preferred that the pointer value be recalculated if possible (given the address of a label, for instance). The single argument is a pointer to the jmp_buf. Note that the buffer is five words long and that the first three are normally used by the generic mechanism.
builtin_setjmp_receiver
builtin_longjmp
builtin_setjmp_setup. The single argument is a pointer to the jmp_buf.
eh_return
__builtin_eh_return, and thence the call frame exception handling library routines, are built. It is intended to handle non-trivial actions needed along the abnormal return path.
The pattern takes two arguments. The first is an offset to be applied to the stack pointer. It will have been copied to some appropriate location (typically EH_RETURN_STACKADJ_RTX) which will survive until after reload to when the normal epilogue is generated.
The second argument is the address of the exception handler to which the function should return. This will normally need to be copied by the pattern to some special register or memory location.
This pattern only needs to be defined if call frame exception handling is to be used, and simple moves involving EH_RETURN_STACKADJ_RTX and EH_RETURN_HANDLER_RTX are not sufficient.
prologue
Using a prologue pattern is generally preferred over defining TARGET_ASM_FUNCTION_PROLOGUE to emit assembly code for the prologue.
The prologue pattern is particularly useful for targets which perform instruction scheduling.
epilogue
Using an epilogue pattern is generally preferred over defining TARGET_ASM_FUNCTION_EPILOGUE to emit assembly code for the epilogue.
The epilogue pattern is particularly useful for targets which perform instruction scheduling or which have delay slots for their return instruction.
sibcall_epilogue
The sibcall_epilogue pattern must not clobber any arguments used for parameter passing or any stack slots for arguments passed to the current function.
trap
conditional_trap
A typical conditional_trap pattern looks like
     (define_insn "conditional_trap"
       [(trap_if (match_operator 0 "trap_operator"
                  [(cc0) (const_int 0)])
                 (match_operand 1 "const_int_operand" "i"))]
       ""
       "...")
prefetch
Targets that do not support write prefetches or locality hints can ignore the values of operands 1 and 2.