Instruction Set Assembly Guide For Armv7 and Earlier Arm Architectures 100076 0200 00 en
Reference Guide
Copyright © 2018, 2019 Arm Limited or its affiliates. All rights reserved.
100076_0200_00_en
Instruction Set Assembly Guide for Armv7 and earlier Arm® architectures
Release Information
Document History
Your access to the information in this document is conditional upon your acceptance that you will not use or permit others to use
the information for the purposes of determining whether implementations infringe any third party patents.
THIS DOCUMENT IS PROVIDED “AS IS”. ARM PROVIDES NO REPRESENTATIONS AND NO WARRANTIES,
EXPRESS, IMPLIED OR STATUTORY, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTABILITY, SATISFACTORY QUALITY, NON-INFRINGEMENT OR FITNESS FOR A PARTICULAR PURPOSE
WITH RESPECT TO THE DOCUMENT. For the avoidance of doubt, Arm makes no representation with respect to, and has
undertaken no analysis to identify or understand the scope and content of, third party patents, copyrights, trade secrets, or other
rights.
TO THE EXTENT NOT PROHIBITED BY LAW, IN NO EVENT WILL ARM BE LIABLE FOR ANY DAMAGES,
INCLUDING WITHOUT LIMITATION ANY DIRECT, INDIRECT, SPECIAL, INCIDENTAL, PUNITIVE, OR
CONSEQUENTIAL DAMAGES, HOWEVER CAUSED AND REGARDLESS OF THE THEORY OF LIABILITY, ARISING
OUT OF ANY USE OF THIS DOCUMENT, EVEN IF ARM HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH
DAMAGES.
This document consists solely of commercial items. You shall be responsible for ensuring that any use, duplication or disclosure of
this document complies fully with any relevant export laws and regulations to assure that this document or any portion thereof is
not exported, directly or indirectly, in violation of such export laws. Use of the word “partner” in reference to Arm’s customers is
not intended to create or refer to any partnership relationship with any other company. Arm may make changes to this document at
any time and without notice.
If any of the provisions contained in these terms conflict with any of the provisions of any click through or signed written
agreement covering this document with Arm, then the click through or signed written agreement prevails over and supersedes the
conflicting provisions of these terms. This document may be translated into other languages for convenience, and you agree that if
there is any conflict between the English version of this document and any translation, the terms of the English version of the
Agreement shall prevail.
The Arm corporate logo and words marked with ® or ™ are registered trademarks or trademarks of Arm Limited (or its
subsidiaries) in the US and/or elsewhere. All rights reserved. Other brands and names mentioned in this document may be the
trademarks of their respective owners. Please follow Arm’s trademark usage guidelines at
http://www.arm.com/company/policies/trademarks.
Copyright © 2018, 2019 Arm Limited (or its affiliates). All rights reserved.
LES-PRE-20349
100076_0200_00_en Copyright © 2018, 2019 Arm Limited or its affiliates. All rights reserved.
Non-Confidential
Confidentiality Status
This document is Non-Confidential. The right to use, copy and disclose this document may be subject to license restrictions in
accordance with the terms of the agreement entered into by Arm and the party that Arm delivered this document to.
Contents
Instruction Set Assembly Guide for Armv7 and
earlier Arm® architectures Reference Guide
Preface
About this book
A1.16 Access to the inline barrel shifter in AArch32 state
C1.9 Condition code suffixes
C1.10 Condition code suffixes and related flags
C1.11 Comparison of condition code meanings in integer and floating-point code
C1.12 Benefits of using conditional execution in A32 and T32 code
C1.13 Example showing the benefits of conditional instructions in A32 and T32 code
C1.14 Optimization for execution speed
C2.43 LDAEX
C2.44 LDC and LDC2
C2.45 LDM
C2.46 LDR (immediate offset)
C2.47 LDR (PC-relative)
C2.48 LDR (register offset)
C2.49 LDR (register-relative)
C2.50 LDR, unprivileged
C2.51 LDREX
C2.52 LSL
C2.53 LSR
C2.54 MCR and MCR2
C2.55 MCRR and MCRR2
C2.56 MLA
C2.57 MLS
C2.58 MOV
C2.59 MOVT
C2.60 MRC and MRC2
C2.61 MRRC and MRRC2
C2.62 MRS (PSR to general-purpose register)
C2.63 MRS (system coprocessor register to general-purpose register)
C2.64 MSR (general-purpose register to system coprocessor register)
C2.65 MSR (general-purpose register to PSR)
C2.66 MUL
C2.67 MVN
C2.68 NOP
C2.69 ORN (T32 only)
C2.70 ORR
C2.71 PKHBT and PKHTB
C2.72 PLD, PLDW, and PLI
C2.73 POP
C2.74 PUSH
C2.75 QADD
C2.76 QADD8
C2.77 QADD16
C2.78 QASX
C2.79 QDADD
C2.80 QDSUB
C2.81 QSAX
C2.82 QSUB
C2.83 QSUB8
C2.84 QSUB16
C2.85 RBIT
C2.86 REV
C2.87 REV16
C2.88 REVSH
C2.89 RFE
C2.90 ROR
C2.91 RRX
C2.92 RSB
C2.93 RSC
C2.94 SADD8
C2.95 SADD16
C2.96 SASX
C2.97 SBC
C2.98 SBFX
C2.99 SDIV
C2.100 SEL
C2.101 SETEND
C2.102 SETPAN
C2.103 SEV
C2.104 SEVL
C2.105 SG
C2.106 SHADD8
C2.107 SHADD16
C2.108 SHASX
C2.109 SHSAX
C2.110 SHSUB8
C2.111 SHSUB16
C2.112 SMC
C2.113 SMLAxy
C2.114 SMLAD
C2.115 SMLAL
C2.116 SMLALD
C2.117 SMLALxy
C2.118 SMLAWy
C2.119 SMLSD
C2.120 SMLSLD
C2.121 SMMLA
C2.122 SMMLS
C2.123 SMMUL
C2.124 SMUAD
C2.125 SMULxy
C2.126 SMULL
C2.127 SMULWy
C2.128 SMUSD
C2.129 SRS
C2.130 SSAT
C2.131 SSAT16
C2.132 SSAX
C2.133 SSUB8
C2.134 SSUB16
C2.135 STC and STC2
C2.136 STL
C2.137 STLEX
C2.138 STM
C2.139 STR (immediate offset)
C2.140 STR (register offset)
C2.141 STR, unprivileged
C2.142 STREX
C2.143 SUB
C2.144 SUBS pc, lr
C2.145 SVC
C2.146 SWP and SWPB
C2.147 SXTAB
C2.148 SXTAB16
C2.149 SXTAH
C2.150 SXTB
C2.151 SXTB16
C2.152 SXTH
C2.153 SYS
C2.154 TBB and TBH
C2.155 TEQ
C2.156 TST
C2.157 TT, TTT, TTA, TTAT
C2.158 UADD8
C2.159 UADD16
C2.160 UASX
C2.161 UBFX
C2.162 UDF
C2.163 UDIV
C2.164 UHADD8
C2.165 UHADD16
C2.166 UHASX
C2.167 UHSAX
C2.168 UHSUB8
C2.169 UHSUB16
C2.170 UMAAL
C2.171 UMLAL
C2.172 UMULL
C2.173 UQADD8
C2.174 UQADD16
C2.175 UQASX
C2.176 UQSAX
C2.177 UQSUB8
C2.178 UQSUB16
C2.179 USAD8
C2.180 USADA8
C2.181 USAT
C2.182 USAT16
C2.183 USAX
C2.184 USUB8
C2.185 USUB16
C2.186 UXTAB
C2.187 UXTAB16
C2.188 UXTAH
C2.189 UXTB
C2.190 UXTB16
C2.191 UXTH
C2.192 WFE
C2.193 WFI
C2.194 YIELD
C3.47 VFMSL (by scalar)
C3.48 VFMSL (vector)
C3.49 VHADD
C3.50 VHSUB
C3.51 VLDn (single n-element structure to one lane)
C3.52 VLDn (single n-element structure to all lanes)
C3.53 VLDn (multiple n-element structures)
C3.54 VLDM
C3.55 VLDR
C3.56 VLDR (post-increment and pre-decrement)
C3.57 VLDR pseudo-instruction
C3.58 VMAX and VMIN
C3.59 VMAXNM, VMINNM
C3.60 VMLA
C3.61 VMLA (by scalar)
C3.62 VMLAL (by scalar)
C3.63 VMLAL
C3.64 VMLS (by scalar)
C3.65 VMLS
C3.66 VMLSL
C3.67 VMLSL (by scalar)
C3.68 VMOV (immediate)
C3.69 VMOV (register)
C3.70 VMOV (between two general-purpose registers and a 64-bit extension register)
C3.71 VMOV (between a general-purpose register and an Advanced SIMD scalar)
C3.72 VMOVL
C3.73 VMOVN
C3.74 VMOV2
C3.75 VMRS
C3.76 VMSR
C3.77 VMUL
C3.78 VMUL (by scalar)
C3.79 VMULL
C3.80 VMULL (by scalar)
C3.81 VMVN (register)
C3.82 VMVN (immediate)
C3.83 VNEG
C3.84 VORN (register)
C3.85 VORN (immediate)
C3.86 VORR (register)
C3.87 VORR (immediate)
C3.88 VPADAL
C3.89 VPADD
C3.90 VPADDL
C3.91 VPMAX and VPMIN
C3.92 VPOP
C3.93 VPUSH
C3.94 VQABS
C3.95 VQADD
C3.96 VQDMLAL and VQDMLSL (by vector or by scalar)
C3.97 VQDMULH (by vector or by scalar)
C3.98 VQDMULL (by vector or by scalar)
C3.99 VQMOVN and VQMOVUN
C3.100 VQNEG
C3.101 VQRDMULH (by vector or by scalar)
C3.102 VQRSHL (by signed variable)
C3.103 VQRSHRN and VQRSHRUN (by immediate)
C3.104 VQSHL (by signed variable)
C3.105 VQSHL and VQSHLU (by immediate)
C3.106 VQSHRN and VQSHRUN (by immediate)
C3.107 VQSUB
C3.108 VRADDHN
C3.109 VRECPE
C3.110 VRECPS
C3.111 VREV16, VREV32, and VREV64
C3.112 VRHADD
C3.113 VRSHL (by signed variable)
C3.114 VRSHR (by immediate)
C3.115 VRSHRN (by immediate)
C3.116 VRINT
C3.117 VRSQRTE
C3.118 VRSQRTS
C3.119 VRSRA (by immediate)
C3.120 VRSUBHN
C3.121 VSDOT (vector)
C3.122 VSDOT (by element)
C3.123 VSHL (by immediate)
C3.124 VSHL (by signed variable)
C3.125 VSHLL (by immediate)
C3.126 VSHR (by immediate)
C3.127 VSHRN (by immediate)
C3.128 VSLI
C3.129 VSRA (by immediate)
C3.130 VSRI
C3.131 VSTM
C3.132 VSTn (multiple n-element structures)
C3.133 VSTn (single n-element structure to one lane)
C3.134 VSTR
C3.135 VSTR (post-increment and pre-decrement)
C3.136 VSUB
C3.137 VSUBHN
C3.138 VSUBL and VSUBW
C3.139 VSWP
C3.140 VTBL and VTBX
C3.141 VTRN
C3.142 VTST
C3.143 VUDOT (vector)
C3.144 VUDOT (by element)
C3.145 VUZP
C3.146 VZIP
List of Figures
Figure A1-1 Organization of general-purpose registers and Program Status Registers
Figure B1-1 Extension register bank for Advanced SIMD in AArch32 state
Figure B2-1 Extension register bank for floating-point in AArch32 state
Figure C2-1 ASR #3
Figure C2-2 LSR #3
Figure C2-3 LSL #3
Figure C2-4 ROR #3
Figure C2-5 RRX
Figure C3-1 De-interleaving an array of 3-element structures
Figure C3-2 Operation of doubleword VEXT for imm = 3
Figure C3-3 Example of operation of VPADAL (in this case for data type S16)
Figure C3-4 Example of operation of VPADD (in this case, for data type I16)
Figure C3-5 Example of operation of doubleword VPADDL (in this case, for data type S16)
Figure C3-6 Operation of quadword VSHL.I64 Qd, Qm, #1
Figure C3-7 Operation of quadword VSLI.64 Qd, Qm, #1
Figure C3-8 Operation of doubleword VSRI.64 Dd, Dm, #2
Figure C3-9 Operation of doubleword VTRN.8
Figure C3-10 Operation of doubleword VTRN.32
List of Tables
Table C2-12 Options and architectures, LDR (register offsets)
Table C2-13 Register-relative offsets
Table C2-14 Offsets and architectures, LDR (User mode)
Table C2-15 Offsets and architectures, STR, word, halfword, and byte
Table C2-16 Options and architectures, STR (register offsets)
Table C2-17 Offsets and architectures, STR (User mode)
Table C3-1 Summary of Advanced SIMD instructions
Table C3-2 Summary of shared Advanced SIMD and floating-point instructions
Table C3-3 Patterns for immediate value in VBIC (immediate)
Table C3-4 Permitted combinations of parameters for VLDn (single n-element structure to one lane)
Table C3-5 Permitted combinations of parameters for VLDn (single n-element structure to all lanes)
Table C3-6 Permitted combinations of parameters for VLDn (multiple n-element structures)
Table C3-7 Available immediate values in VMOV (immediate)
Table C3-8 Available immediate values in VMVN (immediate)
Table C3-9 Patterns for immediate value in VORR (immediate)
Table C3-10 Available immediate ranges in VQRSHRN and VQRSHRUN (by immediate)
Table C3-11 Available immediate ranges in VQSHL and VQSHLU (by immediate)
Table C3-12 Available immediate ranges in VQSHRN and VQSHRUN (by immediate)
Table C3-13 Results for out-of-range inputs in VRECPE
Table C3-14 Results for out-of-range inputs in VRECPS
Table C3-15 Available immediate ranges in VRSHR (by immediate)
Table C3-16 Available immediate ranges in VRSHRN (by immediate)
Table C3-17 Results for out-of-range inputs in VRSQRTE
Table C3-18 Results for out-of-range inputs in VRSQRTS
Table C3-19 Available immediate ranges in VRSRA (by immediate)
Table C3-20 Available immediate ranges in VSHL (by immediate)
Table C3-21 Available immediate ranges in VSHLL (by immediate)
Table C3-22 Available immediate ranges in VSHR (by immediate)
Table C3-23 Available immediate ranges in VSHRN (by immediate)
Table C3-24 Available immediate ranges in VSRA (by immediate)
Table C3-25 Permitted combinations of parameters for VSTn (multiple n-element structures)
Table C3-26 Permitted combinations of parameters for VSTn (single n-element structure to one lane)
Table C3-27 Operation of doubleword VUZP.8
Table C3-28 Operation of quadword VUZP.32
Table C3-29 Operation of doubleword VZIP.8
Table C3-30 Operation of quadword VZIP.32
Table C4-1 Summary of floating-point instructions
Table C5-1 Summary of A32/T32 cryptographic instructions
Preface
This preface introduces the Instruction Set Assembly Guide for Armv7 and earlier Arm® architectures
Reference Guide.
It contains the following:
• About this book on page 20.
Using this book
Glossary
The Arm® Glossary is a list of terms used in Arm documentation, together with definitions for those
terms. The Arm Glossary does not contain terms that are industry standard unless the Arm meaning
differs from the generally accepted meaning.
See the Arm® Glossary for more information.
Typographic conventions
italic
Introduces special terminology, denotes cross-references, and citations.
bold
Highlights interface elements, such as menu names. Denotes signal names. Also used for terms
in descriptive lists, where appropriate.
monospace
Denotes text that you can enter at the keyboard, such as commands, file and program names,
and source code.
monospace (underlined)
Denotes a permitted abbreviation for a command or option. You can enter the underlined text
instead of the full command or option name.
monospace italic
Denotes arguments to monospace text where the argument is to be replaced by a specific value.
monospace bold
Denotes language keywords when used outside example code.
< and >
Encloses replaceable terms for assembler syntax where they appear in code or code fragments.
For example:
MRC p15, 0, <Rd>, <CRn>, <CRm>, <Opcode_2>
SMALL CAPITALS
Used in body text for a few terms that have specific technical meanings, that are defined in the
Arm® Glossary. For example, IMPLEMENTATION DEFINED, IMPLEMENTATION SPECIFIC, UNKNOWN, and
UNPREDICTABLE.
Feedback
Feedback on content
If you have comments on content then send an e-mail to [email protected]. Give:
• The title Instruction Set Assembly Guide for Armv7 and earlier Arm architectures Reference Guide.
• The number 100076_0200_00_en.
• If applicable, the page number(s) to which your comments refer.
• A concise explanation of your comments.
Arm also welcomes general suggestions for additions and improvements.
Note
Arm tests the PDF only in Adobe Acrobat and Acrobat Reader, and cannot guarantee the quality of the
represented document when used with any other PDF reader.
Other information
• Arm® Developer.
• Arm® Information Center.
• Arm® Technical Support Knowledge Articles.
• Technical Support.
• Arm® Glossary.
Part A
Instruction Set Overview
Chapter A1
Overview of AArch32 state
A1.1 Terminology
This document uses the following terms to refer to instruction sets.
Instruction sets for Armv7 and earlier architectures were called the ARM and Thumb instruction sets.
This document describes the instruction sets for Armv7 and earlier architectures, but uses terminology
that is introduced with Armv8:
A32
The A32 instruction set was previously called the ARM instruction set. It is a fixed-length
instruction set that uses 32-bit instruction encodings.
T32
The T32 instruction set was previously called the Thumb instruction set. It is a variable-length
instruction set that uses both 16-bit and 32-bit instruction encodings.
AArch32
The AArch32 Execution state supports the A32 and T32 instruction sets.
The Arm 32-bit Execution state uses 32-bit general-purpose registers, and a 32-bit program counter (PC),
stack pointer (SP), and link register (LR). In implementations of the Arm architecture before Armv8,
execution is always in AArch32 state.
Note
Some examples and descriptions in this document might apply only to the armasm legacy assembler.
A1.2 Changing between A32 and T32 instruction set states
A1.3 Processor modes, and privileged and unprivileged software execution
Note
Armv6‑M, Armv7‑M, Armv8‑M Baseline, and Armv8‑M Mainline do not support the same modes as
other Arm architectures and profiles. Some of the processor modes listed here do not apply to these
architectures.
User 0b10000
FIQ 0b10001
IRQ 0b10010
Supervisor 0b10011
Monitor 0b10110
Abort 0b10111
Hyp 0b11010
Undefined 0b11011
System 0b11111
User mode is an unprivileged mode, and has restricted access to system resources. All other modes have
full access to system resources in the current security state, can change mode freely, and execute
software as privileged.
Applications that require task protection usually execute in User mode. Some embedded applications
might run entirely in any mode other than User mode. An application that requires full access to system
resources usually executes in System mode.
Modes other than User mode are entered to service exceptions, or to access privileged resources.
Code can run in either a Secure state or in a Non-secure state. Hypervisor (Hyp) mode has privileged
execution in Non-secure state.
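The mode numbers and the privilege rule above can be sketched as a small Python model. This is illustrative only: it assumes that the mode number occupies the low five bits of the CPSR, and the names MODES, mode_name, and is_privileged are hypothetical helpers, not part of any Arm tool.

```python
# Mode numbers copied from the list above.
MODES = {
    "User": 0b10000,
    "FIQ": 0b10001,
    "IRQ": 0b10010,
    "Supervisor": 0b10011,
    "Monitor": 0b10110,
    "Abort": 0b10111,
    "Hyp": 0b11010,
    "Undefined": 0b11011,
    "System": 0b11111,
}


def mode_name(cpsr):
    """Return the mode name for a CPSR value, or None if the encoding is not listed.

    Assumption: the mode number is held in the low five bits of the CPSR.
    """
    bits = cpsr & 0x1F
    for name, encoding in MODES.items():
        if encoding == bits:
            return name
    return None


def is_privileged(cpsr):
    """User mode is the only unprivileged mode in the list above."""
    name = mode_name(cpsr)
    if name is None:
        raise ValueError("mode encoding is not in the list")
    return name != "User"
```

For example, a CPSR value of 0x600001D3 has the low five bits 0b10011, so this model reports Supervisor mode and privileged execution.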
Related concepts
A1.4 Processor modes in Armv6‑M, Armv7‑M, and Armv8‑M on page A1-29
Related information
Arm Architecture Reference Manual
A1.4 Processor modes in Armv6-M, Armv7-M, and Armv8-M
A1.5 Registers in AArch32 state
Note
• SP and LR can be used as general-purpose registers, although Arm deprecates using SP other than as
a stack pointer.
Additional registers are available in privileged software execution. Arm processors have a total of 43
registers. The registers are arranged in partially overlapping banks. There is a different register bank for
each processor mode. The banked registers give rapid context switching for dealing with processor
exceptions and privileged operations.
The additional registers in Arm processors are:
• 2 supervisor mode registers for banked SP and LR.
• 2 abort mode registers for banked SP and LR.
• 2 undefined mode registers for banked SP and LR.
• 2 interrupt mode registers for banked SP and LR.
• 7 FIQ mode registers for banked R8-R12, SP and LR.
• 2 monitor mode registers for banked SP and LR.
• 1 Hyp mode register for banked SP.
• 7 Saved Program Status Registers (SPSRs), one for each exception mode.
• 1 Hyp mode register for ELR_Hyp to store the preferred return address from Hyp mode.
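The total of 43 registers can be checked with a short tally. The sketch below is a worked count, not a definition. It assumes that 17 registers are visible in every mode (R0-R12, SP, LR, PC, and the APSR), which this list does not state, and it labels the interrupt mode entry as IRQ. The dictionary names are hypothetical.

```python
# Assumption: registers available in every processor mode (not listed above).
COMMON = {"R0-R12, SP, LR": 15, "PC": 1, "APSR": 1}

# Additional registers, copied from the list above.
ADDITIONAL = {
    "Supervisor SP, LR": 2,
    "Abort SP, LR": 2,
    "Undefined SP, LR": 2,
    "IRQ SP, LR": 2,
    "FIQ R8-R12, SP, LR": 7,
    "Monitor SP, LR": 2,
    "Hyp SP": 1,
    "SPSRs": 7,
    "ELR_Hyp": 1,
}


def total_registers():
    """Sum the common and the additional registers."""
    return sum(COMMON.values()) + sum(ADDITIONAL.values())
```

Under these assumptions the additional registers sum to 26, and adding the 17 common registers gives the stated total of 43.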
Note
In privileged software execution, CPSR is an alias for APSR and gives access to additional bits.
The following figure shows how the registers are banked in the Arm architecture.
Application level view: APSR
System level view: CPSR, SPSR_hyp, SPSR_svc, SPSR_abt, SPSR_und, SPSR_mon, SPSR_irq, SPSR_fiq, ELR_hyp
‡ Exists only in Secure state.
† Exists only in Non-secure state.
Cells with no entry indicate that the User mode register is used.
Figure A1-1 Organization of general-purpose registers and Program Status Registers
A1.6 General-purpose registers in AArch32 state
A1.7 Register accesses in AArch32 state
A1.8 Predeclared core register names in AArch32 state
a1-a4 Argument, result or scratch registers. These are synonyms for R0 to R3.
v1-v8 Variable registers. These are synonyms for R4 to R11.
SB Static base register. This is a synonym for R9.
IP Intra-procedure call scratch register. This is a synonym for R12.
SP Stack pointer. This is a synonym for R13.
LR Link register. This is a synonym for R14.
PC Program counter. This is a synonym for R15.
With the exception of a1-a4 and v1-v8, you can write the register names either in all upper case or all
lower case.
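The synonyms and the case rule above can be expressed as a lookup. The following Python sketch is a model for illustration, not how armasm resolves names. It assumes that the names R0-R15 are also predeclared, and register_number is a hypothetical helper.

```python
import re

# Synonyms copied from the list above. a1-a4 and v1-v8 are lower case only.
LOWER_ONLY = {f"a{i}": i - 1 for i in range(1, 5)}        # a1-a4 -> R0-R3
LOWER_ONLY.update({f"v{i}": i + 3 for i in range(1, 9)})  # v1-v8 -> R4-R11
EITHER_CASE = {"sb": 9, "ip": 12, "sp": 13, "lr": 14, "pc": 15}


def register_number(name):
    """Return the core register number for a predeclared name, or None."""
    if name in LOWER_ONLY:
        return LOWER_ONLY[name]
    # Other names must be all upper case or all lower case, not mixed.
    if name != name.lower() and name != name.upper():
        return None
    key = name.lower()
    if key in EITHER_CASE:
        return EITHER_CASE[key]
    # Assumption: R0-R15 (or r0-r15) are also predeclared names.
    match = re.fullmatch(r"r(\d|1[0-5])", key)
    return int(match.group(1)) if match else None
```

In this model v6, SB, and R9 all resolve to register 9, while A1 and Sp are rejected because of the case rule.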
Related concepts
A1.6 General-purpose registers in AArch32 state on page A1-32
A1.9 Predeclared extension register names in AArch32 state
You can write the register names either in upper case or lower case.
A1.10 Program Counter in AArch32 state
During execution, the PC does not contain the address of the currently executing instruction. The address
of the currently executing instruction is typically PC-8 for A32, or PC-4 for T32.
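The offsets above can be written as simple arithmetic. The Python sketch below models only the typical case that the text describes. The function names are hypothetical, and the 32-bit wraparound is an assumption made so that results stay within the address range.

```python
# Typical offsets from the text above: PC-8 for A32, PC-4 for T32.
PC_READ_OFFSET = {"A32": 8, "T32": 4}


def pc_read_value(instruction_address, instruction_set):
    """Value typically seen when the instruction at instruction_address reads the PC."""
    return (instruction_address + PC_READ_OFFSET[instruction_set]) & 0xFFFFFFFF


def current_instruction_address(pc_value, instruction_set):
    """Address of the executing instruction, recovered from a value read from the PC."""
    return (pc_value - PC_READ_OFFSET[instruction_set]) & 0xFFFFFFFF
```

For an instruction at address 0x8000, this model gives a PC value of 0x8008 in A32 and 0x8004 in T32.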
Note
Arm recommends you use the BX instruction to jump to an address or to return from a function, rather
than writing to the PC directly.
Related references
C2.14 B on page C2-132
C2.21 BX, BXNS on page C2-142
C2.23 CBZ and CBNZ on page C2-145
C2.154 TBB and TBH on page C2-333
A1.11 The Q flag in AArch32 state
The state of the Q flag cannot be tested directly by the condition codes. To read the state of the Q flag,
use an MRS instruction.
MRS r6, APSR
TST r6, #(1<<27)    ; Z is clear if Q flag was set
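The effect of the MRS and TST sequence above can be mirrored in a short Python model, where an integer stands in for the value that MRS copies from the APSR. The helper names are hypothetical, and the model covers only the Z flag result of TST.

```python
Q_BIT = 1 << 27  # same mask as the TST operand above


def tst_sets_z(value, mask):
    """Model the Z flag after TST: Z is set when (value AND mask) is zero."""
    return (value & mask) == 0


def q_flag_was_set(apsr):
    """Z clear after TST with the Q mask means that the Q flag was set."""
    return not tst_sets_z(apsr, Q_BIT)
```

A value of 0x08000000 has only bit 27 set, so the model reports that Q was set. A value of 0xF7FFFFFF has every bit except bit 27 set, so it reports that Q was clear.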
Related references
C2.62 MRS (PSR to general-purpose register) on page C2-204
C2.65 MSR (general-purpose register to PSR) on page C2-208
C2.75 QADD on page C2-223
C2.125 SMULxy on page C2-285
C2.127 SMULWy on page C2-287
A1.12 Application Program Status Register
A1.13 Current Program Status Register in AArch32 state
A1.14 Saved Program Status Registers in AArch32 state
A1.15 A32 and T32 instruction set overview
Data processing
These instructions operate on the general-purpose registers. They can perform operations such as
addition, subtraction, or bitwise logic on the contents of two registers and place the result in a
third register. They can also operate on the value in a single register, or on a value in a register
and an immediate value supplied within the instruction.
Long multiply instructions give a 64-bit result in two registers.
Register load and store
These instructions load or store the value of a single register from or to memory. They can load or
store a 32-bit word, a 16-bit halfword, or an 8-bit unsigned byte. Byte and halfword loads can either
be sign extended or zero extended to fill the 32-bit register.
A few instructions are also defined that can load or store 64-bit doubleword values into two 32-bit
registers.
Multiple register load and store
These instructions load or store any subset of the general-purpose registers from or to memory.
Status register access
These instructions move the contents of a status register to or from a general-purpose register.
A1.16 Access to the inline barrel shifter in AArch32 state
Part B
Advanced SIMD and Floating-point Programming
Chapter B1
Advanced SIMD Programming
B1.1 Architecture support for Advanced SIMD
Related information
Floating-point support
B1.2 Extension register bank mapping for Advanced SIMD in AArch32 state
The Advanced SIMD extension register bank is a collection of registers that can be accessed as either
64-bit or 128-bit registers.
Advanced SIMD and floating-point instructions use the same extension register bank, which is distinct
from the Arm core register bank.
The following figure shows the views of the extension register bank, and the overlap between the
different size registers. For example, the 128-bit register Q0 is an alias for two consecutive 64-bit
registers D0 and D1. The 128-bit register Q8 is an alias for two consecutive 64-bit registers D16 and D17.
Q0     D0, D1
Q1     D2, D3
...    ...
Q7     D14, D15
Q8     D16, D17
...    ...
Q15    D30, D31
Figure B1-1 Extension register bank for Advanced SIMD in AArch32 state
Note
If your processor supports both Advanced SIMD and floating-point, all the Advanced SIMD registers
overlap with the floating-point registers.
The aliased views enable half-precision, single-precision, and double-precision values, and Advanced
SIMD vectors to coexist in different non-overlapped registers at the same time.
You can also use the same overlapped registers to store half-precision, single-precision, and double-
precision values, and Advanced SIMD vectors at different times.
Do not attempt to use overlapped 64-bit and 128-bit registers at the same time, because doing so
produces meaningless results.
The mapping between the registers is as follows:
• D<2n> maps to the least significant half of Q<n>
• D<2n+1> maps to the most significant half of Q<n>.
For example, you can access the least significant half of the elements of a vector in Q6 by referring to
D12, and the most significant half of the elements by referring to D13.
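The mapping between D and Q registers is simple index arithmetic. The Python sketch below restates it, with hypothetical helper names and the register ranges Q0-Q15 and D0-D31 taken from the figure above.

```python
def d_registers_for(q_index):
    """Return the (least significant, most significant) D register numbers of Q<q_index>."""
    if not 0 <= q_index <= 15:
        raise ValueError("Q registers are numbered Q0-Q15")
    return 2 * q_index, 2 * q_index + 1


def q_register_for(d_index):
    """Return (q_index, half) for D<d_index>, where half names the part of Q it maps to."""
    if not 0 <= d_index <= 31:
        raise ValueError("D registers are numbered D0-D31")
    half = "least significant" if d_index % 2 == 0 else "most significant"
    return d_index // 2, half
```

For Q6 this gives D12 and D13, which matches the example above, and D17 maps to the most significant half of Q8.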
Related concepts
B2.3 Views of the floating-point extension register bank in AArch32 state on page B2-69
B1.3 Views of the Advanced SIMD register bank in AArch32 state on page B1-49
B1.3 Views of the Advanced SIMD register bank in AArch32 state
B1.4 Load values to Advanced SIMD registers
B1.5 Conditional execution of A32/T32 Advanced SIMD instructions
B1.6 Floating-point exceptions for Advanced SIMD in A32/T32 instructions
B1.7 Advanced SIMD data types in A32/T32 instructions
The data type of the second (or only) operand is specified in the instruction.
Note
Most instructions have a restricted range of permitted data types. See the instruction descriptions for
details. However, the data type description is flexible:
• If the description specifies I, you can also use the S or U data types.
• If only the data size is specified, you can specify a type (I, S, U, P or F).
• If no data type is specified, you can specify a data type.
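The three rules in this note can be read as a small acceptance check. The Python sketch below models only these rules. It does not check that a type and size combination exists for a given instruction, and accepts is a hypothetical helper rather than assembler behavior.

```python
def accepts(described, written):
    """Check a written data type against the one given in an instruction description.

    described: data type from the description, such as "I16", "16", or None.
    written:   data type used in the source, such as "S16".
    """
    if described is None:
        return True  # no data type specified: any data type can be written
    if not written:
        return False
    if described == written:
        return True
    w_type, w_size = written[0], written[1:]
    if described.isdigit():
        # Only the size is specified: any of I, S, U, P, or F with that size.
        return w_type in "ISUPF" and w_size == described
    if described[0] == "I":
        # The description specifies I: S or U of the same size is also accepted.
        return w_type in "SU" and w_size == described[1:]
    return False
```

Under this model a description of I16 accepts S16 and U16 but not F16, and a description of 16 accepts P16 but not S32.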
Related concepts
B1.7 Advanced SIMD data types in A32/T32 instructions on page B1-53
B1.8 Polynomial arithmetic over {0,1} on page B1-54
B1.8 Polynomial arithmetic over {0,1}
B1.9 Advanced SIMD vectors
B1.10 Normal, long, wide, and narrow Advanced SIMD instructions
You can specify that the operands and result of a normal Advanced SIMD instruction must all be
quadwords by appending a Q to the instruction mnemonic. If you do this, armasm produces an
error if the operands or result are not quadwords.
Long operation
The operands are doubleword vectors and the result is a quadword vector. The elements of the
result are usually twice the width of the elements of the operands, and the same type.
Long operation is specified using an L appended to the instruction mnemonic, for example:
VADDL.S16 Q0, D2, D3
Wide operation
One operand vector is doubleword and the other is quadword. The result vector is quadword.
The elements of the result and the first operand are twice the width of the elements of the second
operand.
Wide operation is specified using a W appended to the instruction mnemonic, for example:
VADDW.S16 Q0, Q1, D4
Narrow operation
The operands are quadword vectors and the result is a doubleword vector. The elements of the
result are half the width of the elements of the operands.
Narrow operation is specified using an N appended to the instruction mnemonic, for example:
VADDHN.I16 D0, Q1, Q2
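As a rough illustration, the three example instructions can be modelled lane by lane in C. This is only a sketch: the function names (model_vaddl_s16, model_vaddw_s16, model_vaddhn_i16) are hypothetical helpers, not Arm intrinsics or armasm syntax, and plain arrays stand in for the D and Q registers.

```c
#include <stdint.h>

/* Long: models VADDL.S16 Qd, Dn, Dm.
   Two doubleword operands (4 x 16-bit lanes) give a quadword result
   (4 x 32-bit lanes), so each sum is held at twice the operand width. */
void model_vaddl_s16(int32_t qd[4], const int16_t dn[4], const int16_t dm[4])
{
    for (int i = 0; i < 4; i++)
        qd[i] = (int32_t)dn[i] + (int32_t)dm[i];
}

/* Wide: models VADDW.S16 Qd, Qn, Dm.
   The first operand and the result have 32-bit lanes; the second operand
   has 16-bit lanes that are sign-extended before the add. The add wraps
   modulo 2^32, which is modelled here with unsigned arithmetic. */
void model_vaddw_s16(int32_t qd[4], const int32_t qn[4], const int16_t dm[4])
{
    for (int i = 0; i < 4; i++)
        qd[i] = (int32_t)((uint32_t)qn[i] + (uint32_t)(int32_t)dm[i]);
}

/* Narrow: models VADDHN.I16 Dd, Qn, Qm.
   Two quadword operands (8 x 16-bit lanes) give a doubleword result
   (8 x 8-bit lanes) holding the most significant half of each sum. */
void model_vaddhn_i16(uint8_t dd[8], const uint16_t qn[8], const uint16_t qm[8])
{
    for (int i = 0; i < 8; i++) {
        uint16_t sum = (uint16_t)(qn[i] + qm[i]);
        dd[i] = (uint8_t)(sum >> 8);
    }
}
```

In each case the lane count follows from the register size divided by the element width, which is why the long and wide forms pair a quadword with doubleword operands.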
Related concepts
B1.9 Advanced SIMD vectors
B1.11 Saturating Advanced SIMD instructions
Saturating Advanced SIMD arithmetic instructions set the QC bit in the floating-point status register
(FPSCR) to indicate that saturation has occurred.
Saturating instructions are specified using a Q prefix, which is inserted between the V and the instruction
mnemonic.
Related references
C2.7 Saturating instructions
B1.12 Advanced SIMD scalars
B1.13 Extended notation extension for Advanced SIMD
B1.14 Advanced SIMD system registers in AArch32 state
Related information
Arm Architecture Reference Manual
B1.15 Flush-to-zero mode in Advanced SIMD
B1.16 When to use flush-to-zero mode in Advanced SIMD
You can change between flush-to-zero and normal mode at any time, if different parts of your code have
different requirements. Numbers already in registers are not affected by changing mode.
Related concepts
B1.15 Flush-to-zero mode in Advanced SIMD
B1.17 The effects of using flush-to-zero mode in Advanced SIMD
B1.17 The effects of using flush-to-zero mode in Advanced SIMD
B1.18 Advanced SIMD operations not affected by flush-to-zero mode
Chapter B2
Floating-point Programming
B2.1 Architecture support for floating-point
B2.2 Extension register bank mapping for floating-point in AArch32 state
S registers    D register
S0, S1         D0
S2, S3         D1
S4, S5         D2
S6, S7         D3
...            ...
S28, S29       D14
S30, S31       D15
-              D16
-              D17
...            ...
-              D30
-              D31
The aliased views enable half-precision, single-precision, and double-precision values to coexist in
different non-overlapped registers at the same time.
You can also use the same overlapped registers to store half-precision, single-precision, and double-
precision values at different times.
Do not attempt to use overlapped 32-bit and 64-bit registers at the same time because it creates
meaningless results.
The mapping between the registers is as follows:
• S<2n> maps to the least significant half of D<n>
• S<2n+1> maps to the most significant half of D<n>
For example, you can access the least significant half of register D6 by referring to S12, and the most
significant half of D6 by referring to S13.
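The mapping can be sketched in C as simple index arithmetic. This is a minimal model under stated assumptions: the RegBank type and the read_s and write_s helpers are hypothetical names introduced for illustration, and the registers are treated as raw bit patterns rather than floating-point values.

```c
#include <stdint.h>

/* Hypothetical model of the extension register bank as 32 doublewords. */
typedef struct { uint64_t d[32]; } RegBank;

/* Read S<s> (s in 0..31) from its aliased half of D<s/2>.
   Even-numbered S registers map to the least significant half,
   odd-numbered S registers to the most significant half. */
uint32_t read_s(const RegBank *rb, unsigned s)
{
    uint64_t dreg = rb->d[s / 2];
    return (s % 2 == 0) ? (uint32_t)dreg : (uint32_t)(dreg >> 32);
}

/* Write S<s> (s in 0..31), leaving the other half of D<s/2> unchanged. */
void write_s(RegBank *rb, unsigned s, uint32_t value)
{
    unsigned shift = (s % 2) * 32;
    uint64_t mask = (uint64_t)0xFFFFFFFFu << shift;
    rb->d[s / 2] = (rb->d[s / 2] & ~mask) | ((uint64_t)value << shift);
}
```

Because a write to S13 changes half of D6, a double-precision value held in D6 no longer means anything after that write, which is the hazard described above.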
Related concepts
B2.3 Views of the floating-point extension register bank in AArch32 state
B2.3 Views of the floating-point extension register bank in AArch32 state
B2.4 Load values to floating-point registers
B2.5 Conditional execution of A32/T32 floating-point instructions
B2.6 Floating-point exceptions for floating-point in A32/T32 instructions
B2.7 Floating-point data types in A32/T32 instructions
32-bit    F32 (or F)
64-bit    F64 (or D)
The data type of the second (or only) operand is specified in the instruction.
Note
• Most instructions have a restricted range of permitted data types. See the instruction descriptions for
details. However, the data type description is flexible:
— If the description specifies I, you can also use the S or U data types.
— If only the data size is specified, you can specify a type (S, U, P or F).
— If no data type is specified, you can specify a data type.
Related concepts
B1.8 Polynomial arithmetic over {0,1}
B2.8 Extended notation extension for floating-point code
B2.9 Floating-point system registers in AArch32 state
B2.10 Flush-to-zero mode in floating-point
B2.11 When to use flush-to-zero mode in floating-point
You can change between flush-to-zero and normal mode at any time, if different parts of your code have
different requirements. Numbers already in registers are not affected by changing mode.
Related concepts
B2.10 Flush-to-zero mode in floating-point
B2.12 The effects of using flush-to-zero mode in floating-point
B2.12 The effects of using flush-to-zero mode in floating-point
B2.13 Floating-point operations not affected by flush-to-zero mode
Part C
A32/T32 Instruction Set Reference
Chapter C1
Condition Codes
Describes condition codes and conditional execution of A32 and T32 code.
It contains the following sections:
• C1.1 Conditional instructions.
• C1.2 Conditional execution in A32 code.
• C1.3 Conditional execution in T32 code.
• C1.4 Condition flags.
• C1.5 Updates to the condition flags in A32/T32 code.
• C1.6 Floating-point instructions that update the condition flags.
• C1.7 Carry flag.
• C1.8 Overflow flag.
• C1.9 Condition code suffixes.
• C1.10 Condition code suffixes and related flags.
• C1.11 Comparison of condition code meanings in integer and floating-point code.
• C1.12 Benefits of using conditional execution in A32 and T32 code.
• C1.13 Example showing the benefits of conditional instructions in A32 and T32 code.
• C1.14 Optimization for execution speed.
C1.1 Conditional instructions
C1.2 Conditional execution in A32 code
Related concepts
C1.3 Conditional execution in T32 code
C1.3 Conditional execution in T32 code
The use of the IT instruction is deprecated when any of the following are true:
• There is more than one instruction in the IT block.
• There is a 32-bit instruction in the IT block.
• The instruction in the IT block references the PC.
Related concepts
C1.2 Conditional execution in A32 code
Related references
C2.41 IT
C2.23 CBZ and CBNZ
C1.4 Condition flags
C1.5 Updates to the condition flags in A32/T32 code
Related concepts
C1.1 Conditional instructions
Related references
C1.4 Condition flags
C1.10 Condition code suffixes and related flags
Chapter C2 A32 and T32 Instructions
C1.6 Floating-point instructions that update the condition flags
Related concepts
C1.7 Carry flag
C1.8 Overflow flag
Related references
C4.4 VCMP, VCMPE
C3.75 VMRS
C4.26 VMRS (floating-point)
Related information
Arm Architecture Reference Manual
C1.7 Carry flag
C1.8 Overflow flag
C1.9 Condition code suffixes
Suffix Meaning
EQ Equal
NE Not equal
CS Carry set (identical to HS)
HS Unsigned higher or same (identical to CS)
CC Carry clear (identical to LO)
LO Unsigned lower (identical to CC)
MI Minus or negative result
PL Positive or zero result
VS Overflow
VC No overflow
HI Unsigned higher
LS Unsigned lower or same
GE Signed greater than or equal
LT Signed less than
GT Signed greater than
LE Signed less than or equal
AL Always (this is the default)
Note
The meaning of some of these condition codes depends on whether the instruction that last updated the
condition flags is a floating-point or integer instruction.
Related references
C1.11 Comparison of condition code meanings in integer and floating-point code
C2.41 IT
C3.75 VMRS
C4.26 VMRS (floating-point)
C1.10 Condition code suffixes and related flags
Suffix Flags Meaning
MI N set Negative
VS V set Overflow
VC V clear No overflow
The optional condition code is shown in syntax descriptions as {cond}. This condition is encoded in A32
instructions. For T32 instructions, the condition is encoded in a preceding IT instruction. An instruction
with a condition code is only executed if the condition flags meet the specified condition.
The following is an example of conditional execution in A32 code:
ADD r0, r1, r2 ; r0 = r1 + r2, don't update flags
ADDS r0, r1, r2 ; r0 = r1 + r2, and update flags
ADDSCS r0, r1, r2 ; If C flag set then r0 = r1 + r2,
; and update flags
CMP r0, r1 ; update flags based on r0-r1.
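The flag updates in this example can be sketched as a small C model. It is an illustration only: the Flags structure and the adds and addscs helpers are hypothetical names, and the model covers just the register-plus-register case shown above.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical model of the N, Z, C and V condition flags. */
typedef struct { bool n, z, c, v; } Flags;

/* Models ADDS Rd, Rn, Rm: returns the sum and updates all four flags. */
uint32_t adds(uint32_t rn, uint32_t rm, Flags *f)
{
    uint32_t rd = rn + rm;
    f->n = (rd >> 31) != 0;                       /* result is negative   */
    f->z = (rd == 0);                             /* result is zero       */
    f->c = (rd < rn);                             /* unsigned carry out   */
    f->v = (((rn ^ rd) & (rm ^ rd)) >> 31) != 0;  /* signed overflow      */
    return rd;
}

/* Models ADDSCS Rd, Rn, Rm: executes only if the C flag is set.
   When the condition fails, Rd and the flags keep their old values. */
uint32_t addscs(uint32_t rd_old, uint32_t rn, uint32_t rm, Flags *f)
{
    if (!f->c)
        return rd_old;
    return adds(rn, rm, f);
}
```

A plain ADD corresponds to computing rn + rm without touching the Flags structure, and CMP corresponds to computing the flags for a subtraction while discarding the result.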
Related concepts
C1.1 Conditional instructions
Related references
C1.4 Condition flags
C1.11 Comparison of condition code meanings in integer and floating-point code
C1.5 Updates to the condition flags in A32/T32 code
Chapter C2 A32 and T32 Instructions
C1.11 Comparison of condition code meanings in integer and floating-point code
Suffix  Meaning after integer data processing instruction  Meaning after floating-point instruction
EQ      Equal                                              Equal
NE      Not equal                                          Not equal, or unordered
CS      Carry set                                          Greater than or equal, or unordered
HS      Unsigned higher or same                            Greater than or equal, or unordered
CC      Carry clear                                        Less than
LO      Unsigned lower                                     Less than
MI      Negative                                           Less than
PL      Positive or zero                                   Greater than or equal, or unordered
VS      Overflow                                           Unordered (at least one NaN operand)
VC      No overflow                                        Not unordered
HI      Unsigned higher                                    Greater than, or unordered
LS      Unsigned lower or same                             Less than or equal
GE      Signed greater than or equal                       Greater than or equal
LT      Signed less than                                   Less than, or unordered
GT      Signed greater than                                Greater than
LE      Signed less than or equal                          Less than or equal, or unordered
AL      Always (normally omitted)                          Always (normally omitted)
Note
The type of the instruction that last updated the condition flags determines the meaning of the condition
codes.
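The floating-point column of the table follows from the flag pattern that a floating-point comparison produces. The sketch below assumes the patterns defined by the Arm architecture for a floating-point compare (equal sets Z and C, less than sets N, greater than sets C, unordered sets C and V); these patterns are not stated in this section. The Flags type, the Cond enumeration, and the fp_compare and cond_passed helpers are hypothetical names used for illustration.

```c
#include <math.h>
#include <stdbool.h>

typedef struct { bool n, z, c, v; } Flags;

typedef enum {
    COND_EQ, COND_NE, COND_CS, COND_CC, COND_MI, COND_PL, COND_VS, COND_VC,
    COND_HI, COND_LS, COND_GE, COND_LT, COND_GT, COND_LE, COND_AL
} Cond;

/* Flags as they appear after a floating-point compare whose result has
   been transferred to the condition flags (assumed patterns, see above). */
Flags fp_compare(double a, double b)
{
    Flags f = { false, false, true, true };   /* unordered: C and V set */
    if (isnan(a) || isnan(b))
        return f;
    f.v = false;
    if (a == b) {
        f.z = true;                            /* equal: Z and C set     */
    } else if (a < b) {
        f.n = true;                            /* less than: N set only  */
        f.c = false;
    }                                          /* greater than: C only   */
    return f;
}

/* Evaluates a condition code suffix against the flags. HS is the same
   test as CS, and LO is the same test as CC. */
bool cond_passed(Cond cond, Flags f)
{
    switch (cond) {
    case COND_EQ: return f.z;
    case COND_NE: return !f.z;
    case COND_CS: return f.c;
    case COND_CC: return !f.c;
    case COND_MI: return f.n;
    case COND_PL: return !f.n;
    case COND_VS: return f.v;
    case COND_VC: return !f.v;
    case COND_HI: return f.c && !f.z;
    case COND_LS: return !f.c || f.z;
    case COND_GE: return f.n == f.v;
    case COND_LT: return f.n != f.v;
    case COND_GT: return !f.z && f.n == f.v;
    case COND_LE: return f.z || f.n != f.v;
    default:      return true;                 /* AL */
    }
}
```

For example, with a NaN operand the model reports LT and LE as true but GE and GT as false, matching the "or unordered" entries in the table.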
Related concepts
C1.1 Conditional instructions
Related references
C1.10 Condition code suffixes and related flags
C1.5 Updates to the condition flags in A32/T32 code
C4.4 VCMP, VCMPE
C3.75 VMRS
C4.26 VMRS (floating-point)
Related information
Arm Architecture Reference Manual
C1.12 Benefits of using conditional execution in A32 and T32 code
C1.13 Example showing the benefits of conditional instructions in A32 and T32 code
Using conditional instructions rather than conditional branches can save both code size and cycles.
This example shows the difference between using branches and using conditional instructions. It uses
Euclid's algorithm for the Greatest Common Divisor (gcd) to show how conditional instructions improve
code size and speed.
In C the gcd algorithm can be expressed as:
int gcd(int a, int b)
{
    while (a != b)
    {
        if (a > b)
            a = a - b;
        else
            b = b - a;
    }
    return a;
}
The following examples show implementations of the gcd algorithm with and without conditional
instructions.
The code is seven instructions long because of the number of branches. Every time a branch is taken, the
processor must refill the pipeline and continue from the new location. The other instructions and non-
executed branches use a single cycle each.
The following table shows the number of cycles this implementation uses on an Arm7™ processor when
R0 equals 1 and R1 equals 2.
R0  R1  Instruction  Cycles
1   2   CMP r0, r1   1
1   2   BLT less     3
1   2   B gcd        3
1   1   CMP r0, r1   1
1   1   BEQ end      3
Total = 13
In addition to improving code size, in most cases this code executes faster than the version that uses only
branches.
The following table shows the number of cycles this implementation uses on an Arm7 processor when
R0 equals 1 and R1 equals 2.
R0  R1  Instruction       Cycles
1   2   CMP r0, r1        1
1   1   SUBLT r1,r1,r0    1
1   1   BNE gcd           3
1   1   CMP r0,r1         1
Total = 10
These instructions assemble equally well to A32 or T32 code. The assembler checks the IT instructions,
but omits them on assembly to A32 code.
It requires one more instruction in T32 code (the IT instruction) than in A32 code, but the overall code
size is 10 bytes in T32 code, compared with 16 bytes in A32 code.
conditional branches and is similar to the A32 code implementation using branches, without conditional
instructions.
The T32 code implementation of the gcd algorithm without conditional instructions requires seven
instructions. The overall code size is 14 bytes. This figure is even less than the A32 implementation that
uses conditional instructions, which uses 16 bytes.
In addition, on a system using 16-bit memory this T32 implementation runs faster than both A32
implementations because only one memory access is required for each 16-bit T32 instruction, whereas
each 32-bit A32 instruction requires two fetches.
Related concepts
C1.12 Benefits of using conditional execution in A32 and T32 code
C1.14 Optimization for execution speed
Related references
C2.41 IT
C1.10 Condition code suffixes and related flags
Related information
Arm Architecture Reference Manual
C1.14 Optimization for execution speed
Chapter C2
A32 and T32 Instructions
C2.1 A32 and T32 instruction summary
BLX, BLXNS Branch with Link, change instruction set, Branch with Link and Exchange (Non-secure)
BX, BXNS Branch, change instruction set, Branch and Exchange (Non-secure)
LDAEX, LDAEXB, LDAEXH, LDAEXD Load-Acquire Register Exclusive Word, Byte, Halfword, Doubleword
LDREX, LDREXB, LDREXH, LDREXD Load Register Exclusive Word, Byte, Halfword, Doubleword
STREX, STREXB, STREXH, STREXD Store Register Exclusive Word, Byte, Halfword, Doubleword
STLEX, STLEXB, STLEXH, STLEXD Store-Release Exclusive Word, Byte, Halfword, Doubleword
WFE, WFI, YIELD Wait For Event, Wait For Interrupt, Yield
C2.2 Instruction width specifiers
C2.3 Flexible second operand (Operand2)
C2.4 Syntax of Operand2 as a constant
Syntax
#constant
Usage
In A32 instructions, constant can have any value that can be produced by rotating an 8-bit value right
by any even number of bits within a 32-bit word.
In T32 instructions, constant can be:
• Any constant that can be produced by shifting an 8-bit value left by any number of bits within a 32-
bit word.
• Any constant of the form 0x00XY00XY.
• Any constant of the form 0xXY00XY00.
• Any constant of the form 0xXYXYXYXY.
Note
In these constants, X and Y are hexadecimal digits.
In addition, in a small number of instructions, constant can take a wider range of values. These are
listed in the individual instruction descriptions.
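The general rules above can be expressed as two small C predicates. This is a sketch of the stated rules only, not of how any particular assembler checks constants: the function names are hypothetical, and the wider ranges that some instructions accept are not modelled.

```c
#include <stdint.h>
#include <stdbool.h>

/* Rotate a 32-bit value left by n bits (n is reduced modulo 32). */
static uint32_t rotl32(uint32_t v, unsigned n)
{
    n &= 31;
    return n ? (v << n) | (v >> (32 - n)) : v;
}

/* A32 rule: an 8-bit value rotated right by an even number of bits.
   Rotating the candidate left by the same amount undoes the rotation. */
bool is_a32_operand2_constant(uint32_t c)
{
    for (unsigned rot = 0; rot < 32; rot += 2)
        if (rotl32(c, rot) <= 0xFFu)
            return true;
    return false;
}

/* T32 rule: an 8-bit value shifted left, or a repeated-byte pattern. */
bool is_t32_operand2_constant(uint32_t c)
{
    uint32_t xy = c & 0xFFu;          /* candidate byte for 0x00XY00XY */
    uint32_t hi = (c >> 8) & 0xFFu;   /* candidate byte for 0xXY00XY00 */

    if (c == (xy | (xy << 16)))        return true;   /* 0x00XY00XY */
    if (c == ((hi << 8) | (hi << 24))) return true;   /* 0xXY00XY00 */
    if (c == xy * 0x01010101u)         return true;   /* 0xXYXYXYXY */

    /* c is non-zero here, because zero matches the first pattern.
       Strip trailing zero bits and check that 8 bits or fewer remain. */
    uint32_t v = c;
    while ((v & 1u) == 0)
        v >>= 1;
    return v <= 0xFFu;
}
```

For example, 0xF000000F passes the A32 test (0xFF rotated right by 4) but fails the T32 test, whereas 0x1FE passes the T32 test (0xFF shifted left by 1) but fails the A32 test because it would need an odd rotation.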
When an Operand2 constant is used with the instructions MOVS, MVNS, ANDS, ORRS, ORNS, EORS, BICS, TEQ
or TST, the carry flag is updated to bit[31] of the constant, if the constant is greater than 255 and can be
produced by shifting an 8-bit value. These instructions do not affect the carry flag if Operand2 is any
other constant.
Instruction substitution
If the value of an Operand2 constant is not available, but its logical inverse or negation is available, then
the assembler produces an equivalent instruction and inverts or negates the constant.
For example, an assembler might assemble the instruction CMP Rd, #0xFFFFFFFE as the equivalent
instruction CMN Rd, #0x2.
Be aware of this when comparing disassembly listings with source code.
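The arithmetic behind these substitutions can be checked with a short C sketch. It models only the 32-bit result values, not the condition flags, and the helper names are hypothetical. The AND and BIC pair is included because the AND description later in this chapter mentions the same kind of substitution.

```c
#include <stdint.h>

/* CMP Rn, #c compares using Rn - c; CMN Rn, #c compares using Rn + c. */
uint32_t cmp_result(uint32_t rn, uint32_t c) { return rn - c; }
uint32_t cmn_result(uint32_t rn, uint32_t c) { return rn + c; }

/* AND Rd, Rn, #c keeps the bits set in c; BIC Rd, Rn, #c clears them. */
uint32_t and_result(uint32_t rn, uint32_t c) { return rn & c; }
uint32_t bic_result(uint32_t rn, uint32_t c) { return rn & ~c; }
```

Because 0xFFFFFFFE is the two's complement negation of 0x2, cmp_result(x, 0xFFFFFFFE) equals cmn_result(x, 0x2) for every x. Similarly, and_result(x, 0xFFFFFF00) equals bic_result(x, 0xFF), because 0xFF is the logical inverse of 0xFFFFFF00.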
Related concepts
C2.6 Shift operations
Related references
C2.3 Flexible second operand (Operand2)
C2.5 Syntax of Operand2 as a register with optional shift
C2.5 Syntax of Operand2 as a register with optional shift
Syntax
Rm {, shift}
where:
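Rm
is the register holding the data for the second operand.
shift
is an optional constant or register-controlled shift to be applied to Rm. In a register-controlled
shift, a register supplies the shift amount, and only the least significant byte of that register is
used.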
Usage
If you omit the shift, or specify LSL #0, the instruction uses the value in Rm.
If you specify a shift, the shift is applied to the value in Rm, and the resulting 32-bit value is used by the
instruction. However, the contents of the register Rm remain unchanged. Specifying a register with shift
also updates the carry flag when used with certain instructions.
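The effect of a constant shift on the value taken from Rm, and the carry out that a flag-setting instruction can pick up, can be sketched in C as follows. The Shifted type and the helper names are hypothetical, and the sketch assumes a shift amount n in the range 1 to 31; other amounts and register-controlled shifts are not modelled.

```c
#include <stdint.h>
#include <stdbool.h>

/* Shifted operand value plus the last bit shifted out. */
typedef struct { uint32_t value; bool carry; } Shifted;

/* LSL #n: zeros enter at the bottom; carry is bit[32-n] of Rm. */
Shifted lsl(uint32_t rm, unsigned n)
{
    Shifted s = { rm << n, ((rm >> (32 - n)) & 1u) != 0 };
    return s;
}

/* LSR #n: zeros enter at the top; carry is bit[n-1] of Rm. */
Shifted lsr(uint32_t rm, unsigned n)
{
    Shifted s = { rm >> n, ((rm >> (n - 1)) & 1u) != 0 };
    return s;
}

/* ASR #n: copies of bit[31] enter at the top; carry is bit[n-1] of Rm. */
Shifted asr(uint32_t rm, unsigned n)
{
    uint32_t fill = (rm >> 31) ? (0xFFFFFFFFu << (32 - n)) : 0u;
    Shifted s = { (rm >> n) | fill, ((rm >> (n - 1)) & 1u) != 0 };
    return s;
}

/* ROR #n: bits leaving the bottom re-enter at the top;
   carry is the new bit[31], which is bit[n-1] of Rm. */
Shifted ror(uint32_t rm, unsigned n)
{
    uint32_t v = (rm >> n) | (rm << (32 - n));
    Shifted s = { v, (v >> 31) != 0 };
    return s;
}

/* RRX: rotate right by one bit through the carry flag. */
Shifted rrx(uint32_t rm, bool carry_in)
{
    Shifted s = { ((uint32_t)carry_in << 31) | (rm >> 1), (rm & 1u) != 0 };
    return s;
}
```

In every case the function returns a new value, which mirrors the statement that the contents of Rm itself remain unchanged.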
Related concepts
C2.6 Shift operations
Related references
C2.3 Flexible second operand (Operand2)
C2.4 Syntax of Operand2 as a constant
C2.6 Shift operations
Figure C2-1 ASR #3
Figure C2-2 LSR #3
Figure C2-3 LSL #3
Figure C2-4 ROR #3
Figure C2-5 RRX
Related references
C2.3 Flexible second operand (Operand2)
C2.4 Syntax of Operand2 as a constant
C2.5 Syntax of Operand2 as a register with optional shift
C2.7 Saturating instructions
Saturating arithmetic
Saturation means that, for some value of 2^n that depends on the instruction:
• For a signed saturating operation, if the full result would be less than -2^n, the result returned is -2^n.
• For an unsigned saturating operation, if the full result would be negative, the result returned is zero.
• If the full result would be greater than 2^n-1, the result returned is 2^n-1.
When any of these occurs, it is called saturation. Some instructions set the Q flag when saturation occurs.
Note
Saturating instructions do not clear the Q flag when saturation does not occur. To clear the Q flag, use an
MSR instruction.
The Q flag can also be set by two other instructions, but these instructions do not saturate.
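The saturation rules and the sticky behavior of the Q flag can be sketched in C. The helper names and the use of a bool pointer for the Q flag are assumptions made for illustration; the full result is passed in as a 64-bit value so that it can exceed the saturation range.

```c
#include <stdint.h>
#include <stdbool.h>

/* Signed saturation to the range -2^n to 2^n-1, for n in 0..31.
   Sets *q on saturation and never clears it. */
int32_t signed_saturate(int64_t full, unsigned n, bool *q)
{
    int64_t max = ((int64_t)1 << n) - 1;
    int64_t min = -((int64_t)1 << n);
    if (full > max) { *q = true; return (int32_t)max; }
    if (full < min) { *q = true; return (int32_t)min; }
    return (int32_t)full;
}

/* Unsigned saturation to the range 0 to 2^n-1, for n in 0..32.
   Sets *q on saturation and never clears it. */
uint32_t unsigned_saturate(int64_t full, unsigned n, bool *q)
{
    int64_t max = ((int64_t)1 << n) - 1;
    if (full > max) { *q = true; return (uint32_t)max; }
    if (full < 0)   { *q = true; return 0u; }
    return (uint32_t)full;
}
```

With n equal to 31, signed_saturate applied to the 64-bit sum of two 32-bit values approximates a saturating 32-bit add. Resetting the bool in the caller plays the role of clearing the Q flag with MSR.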
Related references
C2.75 QADD
C2.82 QSUB
C2.79 QDADD
C2.80 QDSUB
C2.113 SMLAxy
C2.118 SMLAWy
C2.125 SMULxy
C2.127 SMULWy
C2.130 SSAT
C2.181 USAT
C2.65 MSR (general-purpose register to PSR)
C2.8 ADC
Add with Carry.
Syntax
ADC{S}{cond} {Rd}, Rn, Operand2
where:
S
is an optional suffix. If S is specified, the condition flags are updated on the result of the
operation.
cond
is an optional condition code.
Rd
is the destination register.
Rn
is the register holding the first operand.
Operand2
is a flexible second operand.
Usage
The ADC (Add with Carry) instruction adds the values in Rn and Operand2, together with the carry flag.
You can use ADC to synthesize multiword arithmetic.
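As an illustration of multiword arithmetic, the following C sketch models a 64-bit addition built from an ADDS on the low words followed by an ADC on the high words. The U64Pair type and the add64 function are hypothetical names, and the register assignment in the comments is only one possible choice.

```c
#include <stdint.h>

/* A 64-bit value held as two 32-bit words, as in a register pair. */
typedef struct { uint32_t lo, hi; } U64Pair;

/* Models the sequence:
       ADDS r4, r0, r2    ; add the low words, setting the carry flag
       ADC  r5, r1, r3    ; add the high words plus the carry flag      */
U64Pair add64(U64Pair a, U64Pair b)
{
    U64Pair r;
    r.lo = a.lo + b.lo;                  /* ADDS: low words              */
    uint32_t carry = (r.lo < a.lo);      /* C flag: unsigned carry out   */
    r.hi = a.hi + b.hi + carry;          /* ADC: high words plus carry   */
    return r;
}
```

Longer values follow the same pattern, with each further word added by another ADC that sets the flags for the next step.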
In certain circumstances, the assembler can substitute one instruction for another. Be aware of this when
reading disassembly listings.
Condition flags
If S is specified, the ADC instruction updates the N, Z, C and V flags according to the result.
16-bit instructions
The following forms of this instruction are available in T32 code, and are 16-bit instructions:
ADCS Rd, Rd, Rm
Rd and Rm must both be Lo registers. This form can only be used outside an IT block.
ADC{cond} Rd, Rd, Rm
Rd and Rm must both be Lo registers. This form can only be used inside an IT block.
Related references
C2.3 Flexible second operand (Operand2)
C1.9 Condition code suffixes
C2.9 ADD
Add without Carry.
Syntax
ADD{S}{cond} {Rd}, Rn, Operand2
where:
S
is an optional suffix. If S is specified, the condition flags are updated on the result of the
operation.
cond
is an optional condition code.
Rd
is the destination register.
Rn
is the register holding the first operand.
Operand2
is a flexible second operand.
imm12
is any value in the range 0-4095.
Operation
The ADD instruction adds the values in Rn and Operand2 or imm12.
In certain circumstances, the assembler can substitute one instruction for another. Be aware of this when
reading disassembly listings.
Condition flags
If S is specified, these instructions update the N, Z, C and V flags according to the result.
16-bit instructions
The following forms of these instructions are available in T32 code, and are 16-bit instructions:
ADDS Rd, Rn, #imm
imm range 0-7. Rd and Rn must both be Lo registers. This form can only be used outside an IT
block.
ADD{cond} Rd, Rn, #imm
imm range 0-7. Rd and Rn must both be Lo registers. This form can only be used inside an IT
block.
ADDS Rd, Rn, Rm
Rd, Rn and Rm must all be Lo registers. This form can only be used outside an IT block.
Example
ADD r2, r1, r3
Related references
C2.3 Flexible second operand (Operand2)
C1.9 Condition code suffixes
C2.144 SUBS pc, lr
C2.10 ADR (PC-relative)
Syntax
ADR{cond}{.W} Rd,label
where:
cond
is an optional condition code.
.W
is an optional instruction width specifier.
Rd
is the destination register to load.
label
is a PC-relative expression.
label must be within a limited distance of the current instruction.
Usage
ADR produces position-independent code, because the assembler generates an instruction that adds or
subtracts a value to the PC.
label must evaluate to an address in the same assembler area as the ADR instruction.
If you use ADR to generate a target for a BX or BLX instruction, it is your responsibility to set the T32 bit
(bit 0) of the address if the target contains T32 instructions.
ADR in T32
You can use the .W width specifier to force ADR to generate a 32-bit instruction in T32 code. ADR with .W
always generates a 32-bit instruction, even if the address can be generated in a 16-bit instruction.
For forward references, ADR without .W always generates a 16-bit instruction in T32 code, even if that
results in failure for an address that could be generated in a 32-bit T32 ADD instruction.
Restrictions
In T32 code, Rd cannot be PC or SP.
In A32 code, Rd can be PC or SP but use of SP is deprecated.
Related references
C2.4 Syntax of Operand2 as a constant
C1.9 Condition code suffixes
C2.11 ADR (register-relative)
Syntax
ADR{cond}{.W} Rd,label
where:
cond
is an optional condition code.
.W
is an optional instruction width specifier.
Rd
is the destination register to load.
label
is a symbol defined by the FIELD directive. label specifies an offset from the base register
which is defined using the MAP directive.
label must be within a limited distance from the base register.
Usage
ADR generates code to easily access named fields inside a storage map.
Restrictions
In T32 code:
• Rd cannot be PC.
• Rd can be SP only if the base register is SP.
The following table shows the possible offsets between the label and the current instruction:
ADR in T32
You can use the .W width specifier to force ADR to generate a 32-bit instruction in T32 code. ADR with .W
always generates a 32-bit instruction, even if the address can be generated in a 16-bit instruction.
For forward references, ADR without .W, with base register SP, always generates a 16-bit instruction in
T32 code, even if that results in failure for an address that could be generated in a 32-bit T32 ADD
instruction.
Related references
C2.4 Syntax of Operand2 as a constant on page C2-113
c Rd must be in the range R0-R7 or SP. If Rd is SP, the offset range is -508 to 508 and must be a multiple of 4
d Must be a multiple of 4.
C2.12 AND
Logical AND.
Syntax
AND{S}{cond} Rd, Rn, Operand2
where:
S
is an optional suffix. If S is specified, the condition flags are updated on the result of the
operation.
cond
is an optional condition code.
Rd
is the destination register.
Rn
is the register holding the first operand.
Operand2
is a flexible second operand.
Operation
The AND instruction performs bitwise AND operations on the values in Rn and Operand2.
In certain circumstances, the assembler can substitute BIC for AND, or AND for BIC. Be aware of this when
reading disassembly listings.
Condition flags
If S is specified, the AND instruction:
• Updates the N and Z flags according to the result.
• Can update the C flag during the calculation of Operand2.
• Does not affect the V flag.
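The flag behavior listed above can be sketched in C. This is a minimal illustration, not Arm reference code: the names ands_result and ands_model are hypothetical, and the model leaves out the C flag because its value depends on how Operand2 is calculated.

```c
#include <stdint.h>

/* Hypothetical model of the result and the N and Z flags produced by ANDS.
 * The C flag is omitted because it depends on the Operand2 calculation,
 * and the V flag is omitted because ANDS does not affect it. */
typedef struct {
    uint32_t value; /* value written to Rd */
    int n;          /* N flag: bit 31 of the result */
    int z;          /* Z flag: 1 if the result is zero */
} ands_result;

ands_result ands_model(uint32_t rn, uint32_t operand2)
{
    ands_result r;
    r.value = rn & operand2;
    r.n = (int)(r.value >> 31);
    r.z = (r.value == 0);
    return r;
}
```

For example, ands_model(0xF0, 0x0F) produces a zero result, so the modeled Z flag is set.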
16-bit instructions
The following forms of this instruction are available in T32 code, and are 16-bit instructions:
ANDS Rd, Rd, Rm
Rd and Rm must both be Lo registers. This form can only be used outside an IT block.
It does not matter if you specify AND{S} Rd, Rm, Rd. The instruction is the same.
Examples
AND r9,r2,#0xFF00
ANDS r9, r8, #0x19
Related references
C2.3 Flexible second operand (Operand2) on page C2-112
C2.144 SUBS pc, lr on page C2-317
C1.9 Condition code suffixes on page C1-92
C2.13 ASR
Arithmetic Shift Right. This instruction is a preferred synonym for MOV instructions with shifted register
operands.
Syntax
ASR{S}{cond} Rd, Rm, Rs
ASR{S}{cond} Rd, Rm, #sh
where:
S
is an optional suffix. If S is specified, the condition flags are updated on the result of the
operation.
cond
is an optional condition code.
Rd
is the destination register.
Rm
is the register holding the first operand. This operand is shifted right.
Rs
is a register holding a shift value to apply to the value in Rm. Only the least significant byte is
used.
sh
is a constant shift. The range of values permitted is 1-32.
Operation
ASR provides the signed value of the contents of a register divided by a power of two. It copies the sign
bit into vacated bit positions on the left.
Caution
Do not use the S suffix when using PC as Rd in User mode or System mode. The assembler cannot warn
you about this because it has no information about what the processor mode is likely to be at execution
time.
You cannot use PC for Rd or any operand in the ASR instruction if it has a register-controlled shift.
Condition flags
If S is specified, the ASR instruction updates the N and Z flags according to the result.
The C flag is unaffected if the shift value is 0. Otherwise, the C flag is updated to the last bit shifted out.
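The shift and C flag behavior described above can be sketched in C. This is an illustrative model under stated assumptions, not Arm reference code: the name asr_model is hypothetical, and the handling of shift values of 32 or more (result filled with the sign bit, C flag set to bit 31) is taken from the architectural definition of ASR rather than from this section.

```c
#include <stdint.h>

/* Hypothetical model of ASR{S} Rd, Rm, Rs (register-controlled shift).
 * rs is the whole register value; only its least significant byte is used.
 * *carry is the C flag on entry and is updated as ASRS would update it. */
uint32_t asr_model(uint32_t rm, uint32_t rs, int *carry)
{
    uint32_t shift = rs & 0xFFu;
    uint32_t sign = rm >> 31;

    if (shift == 0) {
        return rm;                  /* value and C flag unchanged */
    }
    if (shift >= 32) {
        *carry = (int)sign;         /* every bit shifted in or out is the sign bit */
        return sign ? 0xFFFFFFFFu : 0u;
    }
    *carry = (int)((rm >> (shift - 1)) & 1u);
    /* Logical shift, then copy the sign bit into the vacated positions. */
    uint32_t fill = sign ? ~(0xFFFFFFFFu >> shift) : 0u;
    return (rm >> shift) | fill;
}
```

The same function can model the constant form by passing the value of sh (1-32) as rs.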
16-bit instructions
The following forms of these instructions are available in T32 code, and are 16-bit instructions:
ASRS Rd, Rm, #sh
Rd and Rm must both be Lo registers. This form can only be used outside an IT block.
Architectures
This instruction is available in A32 and T32.
Example
ASR r7, r8, r9
Related references
C2.58 MOV on page C2-199
C1.9 Condition code suffixes on page C1-92
C2.14 B
Branch.
Syntax
B{cond}{.W} label
where:
cond
is an optional condition code.
.W
is an optional instruction width specifier to force the use of a 32-bit B instruction in T32.
label
is a PC-relative expression.
Operation
The B instruction causes a branch to label.
B in T32
You can use the .W width specifier to force B to generate a 32-bit instruction in T32 code.
B.W always generates a 32-bit instruction, even if the target could be reached using a 16-bit instruction.
For forward references, B without .W always generates a 16-bit instruction in T32 code, even if that
results in failure for a target that could be reached using a 32-bit T32 instruction.
Condition flags
The B instruction does not change the flags.
Architectures
See the earlier table for details of availability of the B instruction.
Example
B loopA
Related references
C1.9 Condition code suffixes on page C1-92
C2.15 BFC
Bit Field Clear.
Syntax
BFC{cond} Rd, #lsb, #width
where:
cond
is an optional condition code.
Rd
is the destination register.
lsb
is the least significant bit that is to be cleared.
width
is the number of bits to be cleared. width must not be 0, and (width+lsb) must be less than or
equal to 32.
Operation
Clears adjacent bits in a register. width bits in Rd are cleared, starting at lsb. Other bits in Rd are
unchanged.
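The operation can be sketched in C as a mask computation. This is a minimal illustration, not Arm reference code: the names bitfield_mask and bfc_model are hypothetical, and the sketch assumes the caller respects the stated limits on width and lsb.

```c
#include <stdint.h>

/* Hypothetical helper: mask with `width` set bits starting at bit `lsb`.
 * Assumes width is 1-32 and lsb + width <= 32, as BFC requires. */
uint32_t bitfield_mask(unsigned lsb, unsigned width)
{
    uint32_t ones = (width == 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return ones << lsb;
}

/* Hypothetical model of BFC Rd, #lsb, #width. */
uint32_t bfc_model(uint32_t rd, unsigned lsb, unsigned width)
{
    return rd & ~bitfield_mask(lsb, width);
}
```

For example, bfc_model(0xFFFFFFFF, 4, 8) clears bits 4 to 11 and leaves the other bits unchanged.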
Register restrictions
You cannot use PC for any register.
You can use SP in the BFC A32 instruction but this is deprecated. You cannot use SP in the BFC T32
instruction.
Condition flags
The BFC instruction does not change the flags.
Architectures
This 32-bit instruction is available in A32 and T32.
There is no 16-bit version of this instruction in T32.
Related references
C1.9 Condition code suffixes on page C1-92
C2.16 BFI
Bit Field Insert.
Syntax
BFI{cond} Rd, Rn, #lsb, #width
where:
cond
is an optional condition code.
Rd
is the destination register.
Rn
is the source register.
lsb
is the least significant bit of the field in Rd that is to be replaced.
width
is the number of bits to be copied. width must not be 0, and (width+lsb) must be less than or
equal to 32.
Operation
Inserts adjacent bits from one register into another. width bits in Rd, starting at lsb, are replaced by
width bits from Rn, starting at bit[0]. Other bits in Rd are unchanged.
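The operation can be sketched in C. This is a minimal illustration, not Arm reference code: the name bfi_model is hypothetical, and the sketch assumes the caller respects the stated limits on width and lsb.

```c
#include <stdint.h>

/* Hypothetical model of BFI Rd, Rn, #lsb, #width.
 * Assumes width is 1-32 and lsb + width <= 32, as BFI requires. */
uint32_t bfi_model(uint32_t rd, uint32_t rn, unsigned lsb, unsigned width)
{
    uint32_t ones = (width == 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    uint32_t mask = ones << lsb;

    /* Keep the bits of Rd outside the field, and take the field
     * from Rn, starting at bit[0] of Rn. */
    return (rd & ~mask) | ((rn << lsb) & mask);
}
```

For example, bfi_model(0xFFFFFFFF, 0x5, 8, 4) replaces bits 8 to 11 of the first argument with the low four bits of the second.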
Register restrictions
You cannot use PC for any register.
You can use SP in the BFI A32 instruction but this is deprecated. You cannot use SP in the BFI T32
instruction.
Condition flags
The BFI instruction does not change the flags.
Architectures
This 32-bit instruction is available in A32 and T32.
There is no 16-bit version of this instruction in T32.
Related references
C1.9 Condition code suffixes on page C1-92
C2.17 BIC
Bit Clear.
Syntax
BIC{S}{cond} Rd, Rn, Operand2
where:
S
is an optional suffix. If S is specified, the condition flags are updated on the result of the
operation.
cond
is an optional condition code.
Rd
is the destination register.
Rn
is the register holding the first operand.
Operand2
is a flexible second operand.
Operation
The BIC (Bit Clear) instruction performs an AND operation on the bits in Rn with the complements of the
corresponding bits in the value of Operand2.
In certain circumstances, the assembler can substitute BIC for AND, or AND for BIC. Be aware of this when
reading disassembly listings.
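The relationship between BIC and AND can be sketched in C. This is an illustration only: the names bic_model and and_model are hypothetical. As background that is not stated in this section, one circumstance in which a substitution is possible is when the complement of a constant can be encoded but the constant itself cannot, because AND with a constant gives the same result as BIC with the complement of that constant.

```c
#include <stdint.h>

/* Hypothetical model of BIC Rd, Rn, Operand2:
 * AND with the complement of Operand2. */
uint32_t bic_model(uint32_t rn, uint32_t operand2)
{
    return rn & ~operand2;
}

/* Hypothetical model of AND Rd, Rn, Operand2, shown for comparison. */
uint32_t and_model(uint32_t rn, uint32_t operand2)
{
    return rn & operand2;
}
```

For any value x, and_model(x, 0xFFFFFF00) and bic_model(x, 0xFF) return the same result, so a listing can show either instruction for the same operation.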
Condition flags
If S is specified, the BIC instruction:
• Updates the N and Z flags according to the result.
• Can update the C flag during the calculation of Operand2.
• Does not affect the V flag.
16-bit instructions
The following forms of the BIC instruction are available in T32 code, and are 16-bit instructions:
BICS Rd, Rd, Rm
Rd and Rm must both be Lo registers. This form can only be used outside an IT block.
Example
BIC r0, r1, #0xab
Related references
C2.3 Flexible second operand (Operand2) on page C2-112
C2.144 SUBS pc, lr on page C2-317
C1.9 Condition code suffixes on page C1-92
C2.18 BKPT
Breakpoint.
Syntax
BKPT #imm
where:
imm
is an expression evaluating to an integer in the range:
• 0-65535 (a 16-bit value) in an A32 instruction.
• 0-255 (an 8-bit value) in a 16-bit T32 instruction.
Usage
The BKPT instruction causes the processor to enter Debug state. Debug tools can use this to investigate
system state when the instruction at a particular address is reached.
In both A32 state and T32 state, imm is ignored by the Arm hardware. However, a debugger can use it to
store additional information about the breakpoint.
BKPT is an unconditional instruction. It must not have a condition code in A32 code. In T32 code, the
BKPT instruction does not require a condition code suffix because BKPT always executes irrespective of its
condition code suffix.
Architectures
This instruction is available in A32 and T32.
In T32, it is only available as a 16-bit instruction.
C2.19 BL
Branch with Link.
Syntax
BL{cond}{.W} label
where:
cond
is an optional condition code. cond is not available on all forms of this instruction.
.W
is an optional instruction width specifier to force the use of a 32-bit BL instruction in T32.
label
is a PC-relative expression.
Operation
The BL instruction causes a branch to label, and copies the address of the next instruction into LR (R14,
the link register).
Condition flags
The BL instruction does not change the flags.
Availability
See the preceding table for details of availability of the BL instruction in both instruction sets.
Examples
BLE ng+8
BL subC
BLLT rtX
Related references
C1.9 Condition code suffixes on page C1-92
C2.20 BLX, BLXNS
Syntax
BLX{cond}{q} label
BLX{cond}{q} Rm
Where:
cond
Is an optional condition code. cond is not available on all forms of this instruction.
q
Is an optional instruction width specifier. Must be set to .W when label is used.
label
Is a PC-relative expression.
Rm
Is a register containing an address to branch to.
Operation
The BLX instruction causes a branch to label, or to the address contained in Rm. In addition:
• The BLX instruction copies the address of the next instruction into LR (R14, the link register).
• The BLX instruction can change the instruction set.
BLX label always changes the instruction set. It changes a processor in A32 state to T32 state, or a
processor in T32 state to A32 state.
BLX Rm derives the target instruction set from bit[0] of Rm:
— If bit[0] of Rm is 0, the processor changes to, or remains in, A32 state.
— If bit[0] of Rm is 1, the processor changes to, or remains in, T32 state.
Note
• Armv7‑M and Armv6‑M only support the T32 instruction set. An attempt to change the instruction
execution state causes the processor to take an exception on the instruction at the target address.
The BLXNS instruction calls a subroutine at an address and instruction set specified by a register, and
causes a transition from the Secure to the Non-secure domain. This variant of the instruction must only
be used when additional steps required to make such a transition safe are taken.
Register restrictions
You can use PC for Rm in the A32 BLX instruction, but this is deprecated. You cannot use PC in other A32
instructions.
You can use PC for Rm in the T32 BLX instruction. You cannot use PC in other T32 instructions.
You can use SP for Rm in this A32 instruction but this is deprecated.
You can use SP for Rm in the T32 BLX and BLXNS instructions, but this is deprecated. You cannot use SP
in the other T32 instructions.
Condition flags
These instructions do not change the flags.
Availability
See the preceding table for details of availability of the BLX and BLXNS instructions in both instruction
sets.
Related references
C1.9 Condition code suffixes on page C1-92
C2.2 Instruction width specifiers on page C2-111
C2.21 BX, BXNS
Syntax
BX{cond}{q} Rm
Where:
cond
Is an optional condition code. cond is not available on all forms of this instruction.
q
Is an optional instruction width specifier.
Rm
Is a register containing an address to branch to.
Operation
The BX instruction causes a branch to the address contained in Rm and, if necessary, changes the
instruction set.
BX Rm derives the target instruction set from bit[0] of Rm:
• If bit[0] of Rm is 0, the processor changes to, or remains in, A32 state.
• If bit[0] of Rm is 1, the processor changes to, or remains in, T32 state.
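The way bit[0] of Rm selects the instruction set can be sketched in C. This is a minimal illustration, not Arm reference code: the names instr_state, bx_target, and bx_decode are hypothetical, and clearing bit[0] to form the branch address is taken from the architectural behavior rather than from this section.

```c
#include <stdint.h>

typedef enum { STATE_A32 = 0, STATE_T32 = 1 } instr_state;

typedef struct {
    instr_state state; /* instruction set selected by bit[0] of Rm */
    uint32_t address;  /* branch target with bit[0] cleared */
} bx_target;

/* Hypothetical model of how BX Rm interprets the value in Rm. */
bx_target bx_decode(uint32_t rm)
{
    bx_target t;
    t.state = (rm & 1u) ? STATE_T32 : STATE_A32;
    t.address = rm & ~1u;
    return t;
}
```

For example, a register value of 0x8001 selects T32 state with a target address of 0x8000, and a register value of 0x8000 selects A32 state at the same address.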
Note
• Armv7‑M and Armv6‑M only support the T32 instruction set. An attempt to change the instruction
execution state causes the processor to take an exception on the instruction at the target address.
The BXNS instruction causes a branch to an address and instruction set specified by a register, and causes
a transition from the Secure to the Non-secure domain. This variant of the instruction must only be used
when additional steps required to make such a transition safe are taken.
Register restrictions
You can use PC for Rm in the A32 BX instruction, but this is deprecated. You cannot use PC in other A32
instructions.
You can use PC for Rm in the T32 BX and BXNS instructions. You cannot use PC in other T32 instructions.
You can use SP for Rm in the A32 BX instruction but this is deprecated.
You can use SP for Rm in the T32 BX and BXNS instructions, but this is deprecated.
Condition flags
These instructions do not change the flags.
Availability
See the preceding table for details of availability of the BX and BXNS instructions in both instruction sets.
Related references
C1.9 Condition code suffixes on page C1-92
C2.2 Instruction width specifiers on page C2-111
C2.22 BXJ
Branch and change to Jazelle state.
Syntax
BXJ{cond} Rm
where:
cond
is an optional condition code. cond is not available on all forms of this instruction.
Rm
is a register containing an address to branch to.
Operation
The BXJ instruction causes a branch to the address contained in Rm and changes the instruction set state to
Jazelle.
Note
In Armv8, BXJ behaves as a BX instruction. This means it causes a branch to an address and instruction
set specified by a register.
Register restrictions
You can use SP for Rm in the BXJ A32 instruction but this is deprecated.
You cannot use SP in the BXJ T32 instruction.
Condition flags
The BXJ instruction does not change the flags.
Availability
See the preceding table for details of availability of the BXJ instruction in both instruction sets.
Related references
C1.9 Condition code suffixes on page C1-92
C2.23 CBZ and CBNZ
Syntax
CBZ{q} Rn, label
CBNZ{q} Rn, label
where:
q
Is an optional instruction width specifier.
Rn
Is the register holding the operand.
label
Is the branch destination.
Usage
You can use the CBZ or CBNZ instructions to avoid changing the condition flags and to reduce the number
of instructions.
Except that it does not change the condition flags, CBZ Rn, label is equivalent to:
CMP Rn, #0
BEQ label
Except that it does not change the condition flags, CBNZ Rn, label is equivalent to:
CMP Rn, #0
BNE label
Restrictions
The branch destination must be a multiple of 2 in the range 0 to 126 bytes after the instruction and in the
same execution state.
These instructions must not be used inside an IT block.
Condition flags
These instructions do not change the flags.
Architectures
These 16-bit instructions are available in Armv7‑A T32, Armv8‑A T32, and Armv8‑M only.
There are no A32 encodings of these instructions in Armv7‑A or Armv8‑A, and no 32-bit T32 encodings.
Related references
C2.14 B on page C2-132
C2.27 CMP and CMN on page C2-149
C2.2 Instruction width specifiers on page C2-111
C2.24 CDP and CDP2
Note
CDP and CDP2 are not supported in Armv8.
Syntax
CDP{cond} coproc, #opcode1, CRd, CRn, CRm{, #opcode2}
where:
cond
Usage
The use of these instructions depends on the coprocessor. See the coprocessor documentation for details.
Architectures
These 32-bit instructions are available in A32 and T32.
There are no 16-bit versions of these instructions in T32.
Related references
C1.9 Condition code suffixes on page C1-92
C2.25 CLREX
Clear Exclusive.
Syntax
CLREX{cond}
where:
cond
is an optional condition code.
Note
cond is permitted only in T32 code, using a preceding IT instruction, but this is deprecated in
Armv8. This is an unconditional instruction in A32.
Usage
Use the CLREX instruction to clear the local record of the executing processor that an address has had a
request for an exclusive access.
CLREX returns a closely-coupled exclusive access monitor to its open-access state. This removes the
requirement for a dummy store to memory.
It is implementation defined whether CLREX also clears the global record of the executing processor that
an address has had a request for an exclusive access.
Architectures
This 32-bit instruction is available in A32 and T32.
There is no 16-bit CLREX instruction in T32.
Related references
C1.9 Condition code suffixes on page C1-92
Related information
Arm Architecture Reference Manual
C2.26 CLZ
Count Leading Zeros.
Syntax
CLZ{cond} Rd, Rm
where:
cond
is an optional condition code.
Rd
is the destination register.
Rm
is the operand register.
Operation
The CLZ instruction counts the number of leading zeros in the value in Rm and returns the result in Rd.
The result value is 32 if no bits are set in the source register, and zero if bit 31 is set.
Register restrictions
You cannot use PC for any operand.
You can use SP in the CLZ A32 instruction but this is deprecated.
You cannot use SP in the CLZ T32 instruction.
Condition flags
This instruction does not change the flags.
Architectures
This 32-bit instruction is available in A32 and T32.
There is no 16-bit version of this instruction in T32.
Examples
CLZ r4,r9
CLZNE r2,r3
Use the CLZ T32 instruction followed by a left shift of Rm by the resulting Rd value to normalize the value
of register Rm. Use MOVS, rather than MOV, to flag the case where Rm is zero:
CLZ r5, r9
MOVS r9, r9, LSL r5
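The count and the normalization sequence above can be sketched in C. This is a minimal illustration, not Arm reference code: the names clz_model and normalize_model are hypothetical.

```c
#include <stdint.h>

/* Hypothetical model of CLZ Rd, Rm: count the zero bits above the
 * most significant set bit. */
uint32_t clz_model(uint32_t rm)
{
    uint32_t count = 0;
    uint32_t probe = 0x80000000u;

    while (probe != 0 && (rm & probe) == 0) {
        count++;
        probe >>= 1;
    }
    return count; /* 32 when rm is zero, 0 when bit 31 is set */
}

/* Hypothetical model of the CLZ and MOVS ... LSL normalization sequence.
 * A shift of 32 (the Rm == 0 case) is handled separately because shifting
 * a 32-bit value by 32 is undefined in C. */
uint32_t normalize_model(uint32_t rm)
{
    uint32_t shift = clz_model(rm);
    return (shift >= 32) ? 0u : (rm << shift);
}
```

A zero return value from normalize_model corresponds to the case that MOVS flags by setting Z.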
Related references
C1.9 Condition code suffixes on page C1-92
C2.27 CMP and CMN
Syntax
CMP{cond} Rn, Operand2
CMN{cond} Rn, Operand2
where:
cond
is an optional condition code.
Rn
is the register holding the first operand.
Operand2
is a flexible second operand.
Operation
These instructions compare the value in a register with Operand2. They update the condition flags on the
result, but do not place the result in any register.
The CMP instruction subtracts the value of Operand2 from the value in Rn. This is the same as a SUBS
instruction, except that the result is discarded.
The CMN instruction adds the value of Operand2 to the value in Rn. This is the same as an ADDS
instruction, except that the result is discarded.
In certain circumstances, the assembler can substitute CMN for CMP, or CMP for CMN. Be aware of this when
reading disassembly listings.
Condition flags
These instructions update the N, Z, C and V flags according to the result.
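The flag results can be sketched in C. This is an illustrative model, not Arm reference code: the names nzcv_flags, cmp_model, and cmn_model are hypothetical, and the C and V definitions used here (C set when a subtraction produces no borrow or an addition produces a carry, V set on signed overflow) come from the architectural definition of the flags rather than from this section.

```c
#include <stdint.h>

typedef struct { int n, z, c, v; } nzcv_flags;

/* Hypothetical model of CMP Rn, Operand2:
 * flags of Rn - Operand2, with the result discarded. */
nzcv_flags cmp_model(uint32_t rn, uint32_t operand2)
{
    uint32_t result = rn - operand2;
    nzcv_flags f;
    f.n = (int)(result >> 31);
    f.z = (result == 0);
    f.c = (rn >= operand2);                               /* no borrow */
    f.v = (int)(((rn ^ operand2) & (rn ^ result)) >> 31); /* signed overflow */
    return f;
}

/* Hypothetical model of CMN Rn, Operand2:
 * flags of Rn + Operand2, with the result discarded. */
nzcv_flags cmn_model(uint32_t rn, uint32_t operand2)
{
    uint32_t result = rn + operand2;
    nzcv_flags f;
    f.n = (int)(result >> 31);
    f.z = (result == 0);
    f.c = (result < rn);                                   /* unsigned carry out */
    f.v = (int)((~(rn ^ operand2) & (rn ^ result)) >> 31); /* signed overflow */
    return f;
}
```

For example, cmp_model(5, 5) sets Z and C, which is the state that the EQ and HS condition codes test for.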
16-bit instructions
The following forms of these instructions are available in T32 code, and are 16-bit instructions:
CMP Rn, Rm
Lo register restriction does not apply.
CMN Rn, Rm
Rn and Rm must both be Lo registers.
Correct examples
CMP r2, r9
CMN r0, #6400
CMPGT sp, r7, LSL #2
Incorrect example
CMP r2, pc, ASR r0 ; PC not permitted with register-controlled
; shift.
Related references
C2.3 Flexible second operand (Operand2) on page C2-112
C1.9 Condition code suffixes on page C1-92
C2.28 CPS
Change Processor State.
Syntax
CPSeffect iflags{, #mode}
CPS #mode
where:
effect
is one of:
IE
Interrupt or abort enable.
ID
Interrupt or abort disable.
iflags
Usage
Changes one or more of the mode, A, I, and F bits in the CPSR, without changing the other CPSR bits.
CPS is only permitted in privileged software execution, and has no effect in User mode.
Condition flags
This instruction does not change the condition flags.
16-bit instructions
The following forms of these instructions are available in T32 code, and are 16-bit instructions:
• CPSIE iflags.
• CPSID iflags.
You cannot specify a mode change in a 16-bit T32 instruction.
Architectures
This instruction is available in A32 and T32.
In T32, 16-bit and 32-bit versions of this instruction are available.
Examples
CPSIE if ; Enable IRQ and FIQ interrupts.
CPSID A ; Disable imprecise aborts.
CPSID ai, #17 ; Disable imprecise aborts and interrupts, and enter
; FIQ mode.
CPS #16 ; Enter User mode.
Related concepts
A1.3 Processor modes, and privileged and unprivileged software execution on page A1-28
C2.29 CRC32
CRC32 performs a cyclic redundancy check (CRC) calculation on a value held in a general-purpose
register.
Syntax
CRC32B{q} Rd, Rn, Rm ; A1 Wd = CRC32(Wn, Rm[<7:0>])
Where:
q
Is an optional instruction width specifier. See C2.2 Instruction width specifiers on page C2-111.
A CRC32 instruction must be unconditional.
Rd
Is the general-purpose accumulator output register.
Rn
Is the general-purpose accumulator input register.
Rm
Is the general-purpose data source register.
Architectures supported
Supported in architecture Armv8.1 and later. Optionally supported in the Armv8‑A architecture.
Usage
CRC32 takes an input CRC value in the first source operand, performs a CRC on the input value in the
second source operand, and returns the output CRC value. The second source operand can be 8, 16, or 32
bits. To align with common usage, the bit order of the values is reversed as part of the operation, and the
polynomial 0x04C11DB7 is used for the CRC calculation.
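The byte form of the calculation can be sketched in C. This is an illustrative bit-by-bit model under stated assumptions, not Arm reference code: the names crc32b_model and crc32_buffer are hypothetical, and the constant 0xEDB88320 is the polynomial 0x04C11DB7 with its bit order reversed, which is how the sketch accounts for the bit reversal described above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit-by-bit model of CRC32B Rd, Rn, Rm.
 * acc is the accumulator input (Rn); data is the low byte of Rm.
 * Working least significant bit first with the reversed polynomial
 * avoids reversing the values explicitly. */
uint32_t crc32b_model(uint32_t acc, uint8_t data)
{
    acc ^= data;
    for (int bit = 0; bit < 8; bit++) {
        uint32_t lsb_set = acc & 1u;
        acc >>= 1;
        if (lsb_set) {
            acc ^= 0xEDB88320u;
        }
    }
    return acc;
}

/* Conventional CRC-32 of a buffer built on the model. The initial and
 * final inversions are applied by software, not by the instruction. */
uint32_t crc32_buffer(const uint8_t *buf, size_t len)
{
    uint32_t acc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        acc = crc32b_model(acc, buf[i]);
    }
    return ~acc;
}
```

Applied to the nine ASCII characters "123456789", crc32_buffer returns 0xCBF43926, the usual check value for this CRC.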
Note
ID_ISAR5.CRC32 indicates whether this instruction is supported in the T32 and A32 instruction sets.
Note
For more information about the CONSTRAINED UNPREDICTABLE behavior, see Architectural Constraints on
UNPREDICTABLE behaviors in the Arm® Architecture Reference Manual Arm®v8, for Arm®v8‑A
architecture profile.
Related references
C2.29 CRC32 on page C2-153
C2.1 A32 and T32 instruction summary on page C2-106
C2.30 CRC32C
CRC32C performs a cyclic redundancy check (CRC) calculation on a value held in a general-purpose
register.
Syntax
CRC32CB{q} Rd, Rn, Rm ; A1 Wd = CRC32C(Wn, Rm[<7:0>])
Where:
q
Is an optional instruction width specifier. See C2.2 Instruction width specifiers on page C2-111.
A CRC32C instruction must be unconditional.
Rd
Is the general-purpose accumulator output register.
Rn
Is the general-purpose accumulator input register.
Rm
Is the general-purpose data source register.
Architectures supported
Supported in architecture Armv8.1 and later. Optionally supported in the Armv8‑A architecture.
Usage
CRC32C takes an input CRC value in the first source operand, performs a CRC on the input value in the
second source operand, and returns the output CRC value. The second source operand can be 8, 16, or 32
bits. To align with common usage, the bit order of the values is reversed as part of the operation, and the
polynomial 0x1EDC6F41 is used for the CRC calculation.
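The byte form of the calculation can be sketched in C. This is an illustrative bit-by-bit model under stated assumptions, not Arm reference code: the names crc32cb_model and crc32c_buffer are hypothetical, and the constant 0x82F63B78 is the polynomial 0x1EDC6F41 with its bit order reversed, which is how the sketch accounts for the bit reversal described above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit-by-bit model of CRC32CB Rd, Rn, Rm.
 * acc is the accumulator input (Rn); data is the low byte of Rm.
 * Working least significant bit first with the reversed polynomial
 * avoids reversing the values explicitly. */
uint32_t crc32cb_model(uint32_t acc, uint8_t data)
{
    acc ^= data;
    for (int bit = 0; bit < 8; bit++) {
        uint32_t lsb_set = acc & 1u;
        acc >>= 1;
        if (lsb_set) {
            acc ^= 0x82F63B78u;
        }
    }
    return acc;
}

/* Conventional CRC-32C of a buffer built on the model. The initial and
 * final inversions are applied by software, not by the instruction. */
uint32_t crc32c_buffer(const uint8_t *buf, size_t len)
{
    uint32_t acc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        acc = crc32cb_model(acc, buf[i]);
    }
    return ~acc;
}
```

Applied to the nine ASCII characters "123456789", crc32c_buffer returns 0xE3069283, the usual check value for this CRC.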
Note
ID_ISAR5.CRC32 indicates whether this instruction is supported in the T32 and A32 instruction sets.
Note
For more information about the CONSTRAINED UNPREDICTABLE behavior, see Architectural Constraints on
UNPREDICTABLE behaviors in the Arm® Architecture Reference Manual Arm®v8, for Arm®v8‑A
architecture profile.
Related references
C2.29 CRC32 on page C2-153
C2.1 A32 and T32 instruction summary on page C2-106
C2.31 CSDB
Consumption of Speculative Data Barrier.
Syntax
CSDB{c}{q} ; A32
CSDB{c}.W ; T32
Where:
q
Is an optional instruction width specifier. See C2.2 Instruction width specifiers on page C2-111.
c
Is an optional condition code. See Chapter C1 Condition Codes on page C1-83.
Usage
Consumption of Speculative Data Barrier is a memory barrier that controls Speculative execution and
data value prediction. Arm Compiler supports the mitigation of the Variant 1 mechanism that is described
in the whitepaper at Vulnerability of Speculative Processors to Cache Timing Side-Channel Mechanism.
The CSDB instruction allows Speculative execution of:
• Branch instructions.
• Instructions that write to the PC.
• Instructions that are not a result of data value predictions.
• Instructions that are the result of PSTATE.{N,Z,C,V} predictions from conditional branch
instructions or from conditional instructions that write to the PC.
The CSDB instruction prevents Speculative execution of:
• Non-branch instructions.
• Instructions that do not write to the PC.
• Instructions that are the result of data value predictions.
• Instructions that are the result of PSTATE.{N,Z,C,V} predictions from instructions other than
conditional branch instructions and conditional instructions that write to the PC.
Examples
The following example shows a code sequence that could result in the processor loading data from an
untrusted location that is provided by a user as the result of Speculative execution of instructions:
CMP R0, R1
BGE out_of_range
LDRB R4, [R5, R0] ; load data from list A
; speculative execution of this instruction
; must be prevented
AND R4, R4, #1
LSL R4, R4, #8
ADD R4, R4, #0x200
CMP R4, R6
BGE out_of_range
LDRB R7, [R8, R4] ; load data from list B
In this example:
• There are two list objects A and B.
• A contains a list of values that are used to calculate offsets from which data can be loaded from B.
• R1 is the length of A.
• R5 is the base address of A.
• R6 is the length of B.
• R8 is the base address of B.
• R0 is an untrusted offset that is provided by a user, and is used to load an element from A.
When R0 is greater than or equal to the length of A, it is outside the address range of A. Therefore, the
first branch instruction BGE out_of_range is taken, and instructions LDRB R4, [R5, R0] through LDRB
R7, [R8, R4] are skipped.
Without a CSDB instruction, these skipped instructions can still be speculatively executed, which could
have the following results:
• If R0 is maliciously set to an incorrect value, data can be loaded into R4 from an address outside
the address range of A.
• Data can be loaded into R7 from an address outside the address range of B.
To mitigate against these untrusted accesses, add a pair of MOVGE and CSDB instructions between the BGE
out_of_range and LDRB R4, [R5, R0] instructions as follows:
CMP R0, R1
BGE out_of_range
Related references
C2.1 A32 and T32 instruction summary on page C2-106
C2.58 MOV on page C2-199
Related information
Arm Processor Security Update
Compiler support for mitigations
C2.32 DBG
Debug.
Syntax
DBG{cond} {option}
where:
cond
is an optional condition code.
option
is an optional limitation on the operation of the hint. The range is 0-15.
Usage
DBG is a hint instruction. It is optional whether it is implemented or not. If it is not implemented, it
behaves as a NOP. The assembler produces a diagnostic message if the instruction executes as NOP on the
target.
Debug hint provides a hint to a debugger and related tools. See your debugger and related tools
documentation to determine the use, if any, of this instruction.
Architectures
This 32-bit instruction is available in A32 and T32.
There is no 16-bit version of this instruction in T32.
Related references
C2.68 NOP on page C2-213
C1.9 Condition code suffixes on page C1-92
C2.33 DMB
Data Memory Barrier.
Syntax
DMB{cond} {option}
where:
cond
is an optional condition code.
Note
cond is permitted only in T32 code. This is an unconditional instruction in A32 code.
option
Operation
Data Memory Barrier acts as a memory barrier. It ensures that all explicit memory accesses that appear in
program order before the DMB instruction are observed before any explicit memory accesses that appear
in program order after the DMB instruction. It does not affect the ordering of any other instructions
executing on the processor.
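A minimal sketch of the ordering described above, assuming r2 holds the address of a data buffer and r3 the address of a ready flag (both register assignments are illustrative). With no option specified, DMB acts as a full system barrier.

```asm
    STR     r0, [r2]            ; write the payload
    DMB                         ; the payload store is observed before the flag store
    MOV     r1, #1
    STR     r1, [r3]            ; publish the flag
```

An observer that reads the flag as 1, and orders its own accesses with a barrier, can then expect to see the payload.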
Alias
The following alternative values of option are supported, but Arm recommends that you do not use
them:
• SH is an alias for ISH.
• SHST is an alias for ISHST.
• UN is an alias for NSH.
• UNST is an alias for NSHST.
Architectures
This 32-bit instruction is available in A32 and T32.
There is no 16-bit version of this instruction in T32.
Related references
C1.9 Condition code suffixes on page C1-92
C2.34 DSB
Data Synchronization Barrier.
Syntax
DSB{cond} {option}
where:
cond
is an optional condition code.
Note
cond is permitted only in T32 code. This is an unconditional instruction in A32 code.
option
Operation
Data Synchronization Barrier acts as a special kind of memory barrier. No instruction in program order
after this instruction executes until this instruction completes. This instruction completes when:
• All explicit memory accesses before this instruction complete.
• All Cache, Branch predictor and TLB maintenance operations before this instruction complete.
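A sketch of one common use, assuming privileged A32 code on an Armv7-A processor: a TLB invalidation through CP15, followed by DSB to wait for the maintenance operation to complete. The CP15 encoding shown (TLBIALL) comes from the Armv7-A architecture and is not described in this section.

```asm
    MOV     r0, #0
    MCR     p15, 0, r0, c8, c7, 0   ; TLBIALL: invalidate the entire unified TLB
    DSB                             ; wait until the maintenance operation completes
    ISB                             ; fetch the following instructions in the updated context
```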
Alias
The following alternative values of option are supported for DSB, but Arm recommends that you do not
use them:
• SH is an alias for ISH.
• SHST is an alias for ISHST.
• UN is an alias for NSH.
• UNST is an alias for NSHST.
Architectures
This 32-bit instruction is available in A32 and T32.
There is no 16-bit version of this instruction in T32.
Related references
C1.9 Condition code suffixes on page C1-92
C2.35 EOR
Logical Exclusive OR.
Syntax
EOR{S}{cond} Rd, Rn, Operand2
where:
S
is an optional suffix. If S is specified, the condition flags are updated on the result of the
operation.
cond
is an optional condition code.
Rd
is the destination register.
Rn
is the register holding the first operand.
Operand2
is a flexible second operand.
Operation
The EOR instruction performs bitwise Exclusive OR operations on the values in Rn and Operand2.
Condition flags
If S is specified, the EOR instruction:
• Updates the N and Z flags according to the result.
• Can update the C flag during the calculation of Operand2.
• Does not affect the V flag.
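A short illustration of the flag behavior listed above. The register choices and the label values_equal are hypothetical.

```asm
    EOR     r0, r0, #0x80       ; toggle bit 7 of r0; the flags are not changed
    EORS    r2, r0, r1          ; r2 = r0 XOR r1; Z is set if r0 equals r1, N copies bit 31 of r2
    BEQ     values_equal        ; taken when the two values were identical
```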
16-bit instructions
The following forms of the EOR instruction are available in T32 code, and are 16-bit instructions:
EORS Rd, Rd, Rm
Rd and Rm must both be Lo registers. This form can only be used outside an IT block.
It does not matter if you specify EOR{S} Rd, Rm, Rd. The instruction is the same.
Correct examples
EORS r0,r0,r3,ROR r6
EORS r7, r11, #0x18181818
Incorrect example
EORS r0,pc,r3,ROR r6 ; PC not permitted with register
; controlled shift
Related references
C2.3 Flexible second operand (Operand2) on page C2-112
C2.144 SUBS pc, lr on page C2-317
C1.9 Condition code suffixes on page C1-92
C2.36 ERET
Exception Return.
Syntax
ERET{cond}
where:
cond
is an optional condition code.
Usage
In a processor that implements the Virtualization Extensions, you can use ERET to perform a return from
an exception taken to Hyp mode.
Operation
When executed in Hyp mode, ERET loads the PC from ELR_hyp and loads the CPSR from SPSR_hyp.
When executed in any other mode, apart from User or System, it behaves as:
• MOVS PC, LR in the A32 instruction set.
• SUBS PC, LR, #0 in the T32 instruction set.
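A minimal sketch of an exception handler that returns with ERET, assuming it is entered in Hyp mode. The labels hyp_trap_handler and handle_trap are hypothetical.

```asm
hyp_trap_handler                ; hypothetical handler, entered in Hyp mode
    PUSH    {r0-r3, r12, lr}    ; preserve the registers that the handler uses
    BL      handle_trap         ; hypothetical routine that services the exception
    POP     {r0-r3, r12, lr}
    ERET                        ; PC is loaded from ELR_hyp and CPSR from SPSR_hyp
```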
Notes
You must not use ERET in User or System mode. The assembler cannot warn you about this because it
has no information about what the processor mode is likely to be at execution time.
ERET is the preferred synonym for SUBS PC, LR, #0 in the T32 instruction set.
Architectures
This 32-bit instruction is available in A32 and T32.
There is no 16-bit version of this instruction in T32.
Related concepts
A1.3 Processor modes, and privileged and unprivileged software execution on page A1-28
Related references
C2.58 MOV on page C2-199
C2.144 SUBS pc, lr on page C2-317
C1.9 Condition code suffixes on page C1-92
C2.39 HVC on page C2-167
C2.37 ESB
Error Synchronization Barrier.
Syntax
ESB{c}{q}
ESB{c}.W
Where:
q
Is an optional instruction width specifier. See C2.2 Instruction width specifiers on page C2-111.
c
Is an optional condition code. See Chapter C1 Condition Codes on page C1-83.
Architectures supported
Supported in the Armv8-A and Armv8-R architectures.
Usage
Error Synchronization Barrier.
Related references
C2.1 A32 and T32 instruction summary on page C2-106
C2.38 HLT
Halting breakpoint.
Note
This instruction is supported only in the Armv8 architecture.
Syntax
HLT{Q} #imm
Where:
Q
is an optional suffix. It only has an effect when Halting debug-mode is disabled. In this case, if Q
is specified, the instruction behaves as a NOP. If Q is not specified, the instruction is UNDEFINED.
imm
is an expression evaluating to an integer in the range:
• 0-65535 (a 16-bit value) in an A32 instruction.
• 0-63 (a 6-bit value) in a 16-bit T32 instruction.
Usage
The HLT instruction causes the processor to enter Debug state if Halting debug-mode is enabled.
In both A32 state and T32 state, imm is ignored by the Arm hardware. However, a debugger can use it to
store additional information about the breakpoint.
HLT is an unconditional instruction. It must not have a condition code in A32 code. In T32 code, the HLT
instruction does not require a condition code suffix because it always executes irrespective of its
condition code suffix.
Availability
This instruction is available in A32 and T32.
In T32, it is only available as a 16-bit instruction.
Related references
C2.68 NOP on page C2-213
C2.39 HVC
Hypervisor Call.
Syntax
HVC #imm
where:
imm
is an expression evaluating to an integer in the range 0-65535.
Operation
In a processor that implements the Virtualization Extensions, the HVC instruction causes a Hypervisor
Call exception. This means that the processor enters Hyp mode, the CPSR value is saved to the Hyp
mode SPSR, and execution branches to the HVC vector.
HVC must not be used if the processor is in Secure state, or in User mode in Non-secure state.
imm is ignored by the processor. However, it can be retrieved by the exception handler to determine what
service is being requested.
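A sketch of a hypervisor call made from Non-secure privileged code. The service number 0x10 and the use of r0 to pass an argument are assumptions for illustration; the calling convention is defined by the hypervisor, not by the instruction.

```asm
    MOV     r0, #1              ; hypothetical argument passed to the hypervisor
    HVC     #0x10               ; request hypothetical service 0x10; the processor enters Hyp mode
```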
HVC cannot be conditional, and is not permitted in an IT block.
Notes
The ERET instruction performs an exception return from Hyp mode.
Architectures
This 32-bit instruction is available in A32 and T32. It is available in Armv7 architectures that include the
Virtualization Extensions.
There is no 16-bit version of this instruction in T32.
Related concepts
A1.3 Processor modes, and privileged and unprivileged software execution on page A1-28
Related references
C2.36 ERET on page C2-164
C2.40 ISB
Instruction Synchronization Barrier.
Syntax
ISB{cond} {option}
where:
cond
is an optional condition code.
Note
cond is permitted only in T32 code. This is an unconditional instruction in A32 code.
option
is an optional limitation on the operation of the hint. The permitted value is:
SY
Full system barrier operation. This is the default and can be omitted.
Operation
Instruction Synchronization Barrier flushes the pipeline in the processor, so that all instructions following
the ISB are fetched from cache or memory, after the instruction has been completed. It ensures that the
effects of context altering operations, such as changing the ASID, or completed TLB maintenance
operations, or branch predictor maintenance operations, in addition to all changes to the CP15 registers,
executed before the ISB instruction are visible to the instructions fetched after the ISB.
In addition, the ISB instruction ensures that any branches that appear in program order after it are always
written into the branch prediction logic with the context that is visible after the ISB instruction. This is
required to ensure correct execution of the instruction stream.
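A sketch of the CP15 case described above, assuming privileged A32 code on an Armv7-A processor. The SCTLR encoding and the position of the I bit come from the Armv7-A architecture, not from this section.

```asm
    MRC     p15, 0, r0, c1, c0, 0   ; read SCTLR
    ORR     r0, r0, #0x1000         ; set the I bit (bit 12) to enable the instruction cache
    MCR     p15, 0, r0, c1, c0, 0   ; write SCTLR
    ISB                             ; instructions fetched after this point see the new setting
```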
Note
When the target architecture is Armv7‑M, you cannot use an ISB instruction in an IT block, unless it is
the last instruction in the block.
Architectures
This 32-bit instruction is available in A32 and T32.
There is no 16-bit version of this instruction in T32.
Related references
C1.9 Condition code suffixes on page C1-92
C2.41 IT
The IT (If-Then) instruction makes a single following instruction (the IT block) conditional. The
conditional instruction must be from a restricted set of 16-bit instructions.
Syntax
IT cond
where:
cond
specifies the condition for the following instruction.
Deprecated syntax
IT{x{y{z}}} {cond}
where:
x
specifies the condition switch for the second instruction in the IT block.
y
specifies the condition switch for the third instruction in the IT block.
z
specifies the condition switch for the fourth instruction in the IT block.
cond
specifies the condition for the first instruction in the IT block.
The condition switches for the second, third, and fourth instructions in the IT block can be either:
T
Then. Applies the condition cond to the instruction.
E
Else. Applies the inverse condition of cond to the instruction.
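An illustration of the condition switches in the deprecated syntax, assuming T32 code. Note that an IT block covering more than one instruction is deprecated in Armv8; the registers used are arbitrary.

```asm
    CMP     r0, #0
    ITTE    EQ                  ; first and second instructions use EQ, third uses NE
    MOVEQ   r1, #1              ; first instruction: executed if r0 == 0
    ADDEQ   r2, r2, #1          ; T: also executed if r0 == 0
    MOVNE   r1, #0              ; E: executed if r0 != 0
```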
Usage
With the deprecated syntax, the IT block can contain between two and four conditional instructions. The
conditions can all be the same, or some can be the logical inverse of the others. IT blocks of more than
one instruction are deprecated in Armv8.
The conditional instruction (including branches, but excluding the BKPT instruction) must specify the
condition in the {cond} part of its syntax.
You are not required to write IT instructions in your code, because the assembler generates them for you
automatically according to the conditions specified on the following instructions. However, if you do
write IT instructions, the assembler validates the conditions specified in the IT instructions against the
conditions specified in the following instructions.
Writing the IT instructions ensures that you consider the placing of conditional instructions, and the
choice of conditions, in the design of your code.
When assembling to A32 code, the assembler performs the same checks, but does not generate any IT
instructions.
With the exception of CMP, CMN, and TST, the 16-bit instructions that normally affect the condition flags
do not affect them when used inside an IT block.
A BKPT instruction in an IT block is always executed, so it does not require a condition in the {cond} part
of its syntax. The IT block continues from the next instruction. Using a BKPT or HLT instruction inside an
IT block is deprecated.
Note
You can use an IT block for unconditional instructions by using the AL condition.
Conditional branches inside an IT block have a longer branch range than those outside the IT block.
Restrictions
The following instructions are not permitted in an IT block:
• IT.
• CBZ and CBNZ.
• TBB and TBH.
• CPS, CPSID and CPSIE.
• SETEND.
Note
armasm shows a diagnostic message when any of these instructions are used in an IT block.
Using any instruction not listed in the following table in an IT block is deprecated. Also, any explicit
reference to R15 (the PC) in the IT block is deprecated.
ADD, ADC, RSB, SBC, SUB: deprecated for ADD SP, SP, #imm or SUB SP, SP, #imm, or when Rm, Rdn, or Rdm is the PC.
MUL: no deprecated forms.
Condition flags
This instruction does not change the flags.
Exceptions
Exceptions can occur between an IT instruction and the corresponding IT block, or within an IT block.
Such an exception results in entry to the appropriate exception handler, with suitable return information in
LR and SPSR.
Instructions designed for use as exception returns can be used as normal to return from the exception,
and execution of the IT block resumes correctly. This is the only way that a PC-modifying instruction
can branch to an instruction in an IT block.
Availability
This 16-bit instruction is available in T32 only.
In A32 code, IT is a pseudo-instruction that does not generate any code.
There is no 32-bit version of this instruction.
Correct examples
IT GT
LDRGT r0, [r1,#4]
IT EQ
ADDEQ r0, r1, r2
Incorrect examples
IT NE
ADD r0,r0,r1 ; syntax error: no condition code used in IT block
ITT EQ
MOVEQ r0,r1
ADDEQ r0,r0,#1 ; IT block covering more than one instruction is deprecated
IT GT
LDRGT r0,label ; LDR (PC-relative) is deprecated in an IT block
IT EQ
ADDEQ PC,r0 ; ADD is deprecated when Rdn is the PC
C2.42 LDA
Load-Acquire Register.
Note
This instruction is supported only in Armv8.
Syntax
LDA{cond} Rt, [Rn]
where:
cond
is an optional condition code.
Rt
is the register to load.
Rn
is the register on which the memory address is based.
Operation
LDA loads data from memory. If any loads or stores appear after a load-acquire in program order, then all
observers are guaranteed to observe the load-acquire before observing the loads and stores. Loads and
stores appearing before a load-acquire are unaffected.
If a store-release follows a load-acquire, each observer is guaranteed to observe them in program order.
There is no requirement that a load-acquire be paired with a store-release.
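A minimal sketch of the ordering guarantee, assuming Armv8, with r1 holding the address of a flag and r3 the address of data that another observer publishes before setting the flag. The register assignments and the label wait_flag are illustrative.

```asm
wait_flag
    LDA     r0, [r1]            ; load-acquire the flag
    CMP     r0, #0
    BEQ     wait_flag           ; spin until the flag is nonzero
    LDR     r2, [r3]            ; this load cannot be observed before the LDA above
```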
Restrictions
The address specified must be naturally aligned, or an alignment fault is generated.
The PC must not be used for Rt or Rn.
Availability
This 32-bit instruction is available in A32 and T32.
There is no 16-bit version of this instruction.
Related references
C2.43 LDAEX on page C2-173
C2.136 STL on page C2-301
C2.137 STLEX on page C2-302
C1.9 Condition code suffixes on page C1-92
C2.43 LDAEX
Load-Acquire Register Exclusive.
Note
This instruction is supported only in Armv8.
Syntax
LDAEX{cond} Rt, [Rn]
LDAEXB{cond} Rt, [Rn]
LDAEXH{cond} Rt, [Rn]
LDAEXD{cond} Rt, Rt2, [Rn]
where:
cond
is an optional condition code.
Rt
is the register to load.
Rt2
is the second register for doubleword loads.
Rn
is the register on which the memory address is based.
Operation
LDAEX loads data from memory.
• If the physical address has the Shared TLB attribute, LDAEX tags the physical address as exclusive
access for the current processor, and clears any exclusive access tag for this processor for any other
physical address.
• Otherwise, it tags the fact that the executing processor has an outstanding tagged physical address.
• If any loads or stores appear after LDAEX in program order, then all observers are guaranteed to
observe the LDAEX before observing the loads and stores. Loads and stores appearing before LDAEX
are unaffected.
Restrictions
The PC must not be used for any of Rt, Rt2, or Rn.
For A32 instructions:
• SP can be used, but use of SP for either Rt or Rt2 is deprecated.
• For LDAEXD, Rt must be an even numbered register, and not LR.
• Rt2 must be R(t+1).
Usage
Use LDAEX and STLEX to implement interprocess communication in multiple-processor and shared-
memory systems.
For reasons of performance, keep the number of instructions between corresponding LDAEX and STLEX
instructions to a minimum.
Note
The address used in a STLEX instruction must be the same as the address in the most recently executed
LDAEX instruction.
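A sketch of a lock acquisition built from LDAEX and STLEX, assuming Armv8 and that r2 holds the address of a lock word in which 0 means free. The label acquire and the register assignments are hypothetical.

```asm
    MOV     r1, #1              ; value that marks the lock as taken
acquire
    LDAEX   r0, [r2]            ; load-acquire exclusive of the lock word
    CMP     r0, #0              ; is the lock free?
    STLEXEQ r0, r1, [r2]        ; try to claim it; r0 is 0 if the store succeeded
    CMPEQ   r0, #0              ; did the store succeed?
    BNE     acquire             ; no: the lock was taken or the exclusive store failed, so retry
```

Only a compare and the conditional store sit between the exclusive load and the exclusive store, in line with the advice to keep that window short.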
Availability
These 32-bit instructions are available in A32 and T32.
There are no 16-bit versions of these instructions.
Related references
C2.136 STL on page C2-301
C2.42 LDA on page C2-172
C2.137 STLEX on page C2-302
C1.9 Condition code suffixes on page C1-92
C2.44 LDC and LDC2
Note
LDC2 is not supported in Armv8.
Syntax
op{L}{cond} coproc, CRd, [Rn]
where:
op
is LDC or LDC2.
cond
option
is a coprocessor option in the range 0-255, enclosed in braces.
Usage
The use of these instructions depends on the coprocessor. See the coprocessor documentation for details.
Architectures
These 32-bit instructions are available in A32 and T32.
There are no 16-bit versions of these instructions in T32.
Register restrictions
You cannot use PC for Rn in the pre-index and post-index instructions. These are the forms that write
back to Rn.
Related references
C1.9 Condition code suffixes on page C1-92
C2.45 LDM
Load Multiple registers.
Syntax
LDM{addr_mode}{cond} Rn{!}, reglist{^}
where:
addr_mode
16-bit instructions
16-bit versions of a subset of these instructions are available in T32 code.
The following restrictions apply to the 16-bit instructions:
• All registers in reglist must be Lo registers.
• Rn must be a Lo register.
• addr_mode must be omitted (or IA), meaning increment address after each transfer.
• Writeback must be specified for LDM instructions where Rn is not in the reglist.
In addition, the PUSH and POP instructions are subsets of the STM and LDM instructions and can therefore
be expressed using the STM and LDM instructions. Some forms of PUSH and POP are also 16-bit
instructions.
Loading to the PC
A load to the PC causes a branch to the instruction at the address loaded.
Also, for the address loaded:
• Bits[1:0] must not be 0b10.
• If bit[0] is 1, execution continues in T32 state.
• If bit[0] is 0, execution continues in A32 state.
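A sketch of a function that saves and restores registers with STM and LDM and returns by loading the PC. The label func is hypothetical; the PUSH and POP equivalents are shown in the comments.

```asm
func                            ; hypothetical function
    STMDB   sp!, {r4-r6, lr}    ; same operation as PUSH {r4-r6, lr}
    ; function body goes here
    LDMIA   sp!, {r4-r6, pc}    ; same operation as POP {r4-r6, pc}; loading the PC returns to the caller
```

Because the saved LR value is loaded into the PC, bit[0] of that value selects the instruction set state after the return.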
Correct example
LDM r8,{r0,r2,r9} ; LDMIA is a synonym for LDM
Incorrect example
LDMDA r2, {} ; must be at least one register in list
Related references
C2.73 POP on page C2-221
C1.9 Condition code suffixes on page C1-92
C2.46 LDR (immediate offset)
Syntax
LDR{type}{cond} Rt, [Rn {, #offset}] ; immediate offset
where:
type
Table C2-10 Offsets and architectures, LDR, word, halfword, and byte
A32, signed byte, halfword, or signed halfword -255 to 255 -255 to 255 -255 to 255
T32 32-bit encoding, word, halfword, signed halfword, byte, or signed byte h -255 to 4095 -255 to 255 -255 to 255
T32 32-bit encoding, doubleword -1020 to 1020 i -1020 to 1020 i -1020 to 1020 i
Register restrictions
Rn must be different from Rt in the pre-index and post-index forms.
For doubleword (LDRD) T32 instructions, you must not specify SP or PC for either Rt or Rt2.
For doubleword (LDRD) A32 instructions:
• Rt must be an even-numbered register.
• Rt must not be LR.
• Arm strongly recommends that you do not use R12 for Rt.
• Rt2 must be R(t + 1).
Use of PC
In A32 code you can use PC for Rt in LDR word instructions and PC for Rn in LDR instructions.
Other uses of PC are not permitted in these A32 instructions.
In T32 code you can use PC for Rt in LDR word instructions and PC for Rn in LDR instructions. Other uses
of PC in these T32 instructions are not permitted.
Use of SP
You can use SP for Rn.
In A32 code, you can use SP for Rt in word instructions. You can use SP for Rt in non-word instructions
in A32 code but this is deprecated.
In T32 code, you can use SP for Rt in word instructions only. All other uses of SP for Rt in these
instructions are not permitted in T32 code.
Examples
LDR r8,[r10] ; loads R8 from the address in R10.
LDRNE r2,[r5,#960]! ; (conditionally) loads R2 from a word
; 960 bytes above the address in R5, and
; increments R5 by 960.
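A few further forms for comparison with the examples above, given as a sketch with arbitrary register choices. The post-indexed syntax shown is the standard LDR form in which the offset is applied after the access.

```asm
    LDR     r0, [r1], #4        ; post-index: load from the address in r1, then add 4 to r1
    LDRB    r2, [r3, #1]        ; offset: load the byte at r3 + 1, zero-extended; r3 is unchanged
    LDRD    r4, r5, [r6, #8]    ; doubleword: Rt is even-numbered and Rt2 is R(t+1)
```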
Related references
C1.9 Condition code suffixes on page C1-92
h For word loads, Rt can be the PC. A load to the PC causes a branch to the address loaded. In Armv4, bits[1:0] of the address loaded must be 0b00. In Armv5T and
above, bits[1:0] must not be 0b10, and if bit[0] is 1, execution continues in T32 state, otherwise execution continues in A32 state.
i Must be divisible by 4.
j Rt and Rn must be in the range R0-R7.
k Must be divisible by 2.
l Rt must be in the range R0-R7.
C2.47 LDR (PC-relative)
Syntax
LDR{type}{cond}{.W} Rt, label
where:
type
label
is a PC-relative expression.
label must be within a limited distance of the current instruction.
Note
Equivalent syntaxes are available for the STR instruction in A32 code but they are deprecated.
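A minimal sketch of a PC-relative load. The label scale and the constant it holds are hypothetical; the data word is placed close to the instruction so that it stays within range.

```asm
    LDR     r0, scale           ; PC-relative word load of the value stored at scale
    BX      lr                  ; return, so that execution does not run into the data
scale
    DCD     0x12345678          ; hypothetical constant
```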
m For word loads, Rt can be the PC. A load to the PC causes a branch to the address loaded. In Armv4, bits[1:0] of the address loaded must be 0b00. In Armv5T and
above, bits[1:0] must not be 0b10, and if bit[0] is 1, execution continues in T32 state, otherwise execution continues in A32 state.
Use of SP
In A32 code, you can use SP for Rt in LDR word instructions. You can use SP for Rt in LDR non-word
A32 instructions but this is deprecated.
In T32 code, you can use SP for Rt in LDR word instructions only. All other uses of SP in these
instructions are not permitted in T32 code.
Related references
C1.9 Condition code suffixes on page C1-92
n In Armv7‑M, LDRD (PC-relative) instructions must be on a word-aligned address.
o Must be a multiple of 4.
p Rt must be in the range R0-R7. There are no byte, halfword, or doubleword 16-bit instructions.
C2.48 LDR (register offset)
Syntax
LDR{type}{cond} Rt, [Rn, ±Rm {, shift}] ; register offset
LDRD{cond} Rt, Rt2, [Rn, ±Rm] ; register offset, doubleword ; A32 only
where:
type
T32 32-bit encoding, word, halfword, signed halfword, byte, or signed byte r +Rm LSL #0-3
Register restrictions
In the pre-index and post-index forms, Rn must be different from Rt.
Use of PC
In A32 instructions you can use PC for Rt in LDR word instructions, and you can use PC for Rn in LDR
instructions with register offset syntax (that is, the forms that do not write back to Rn).
Other uses of PC are not permitted in A32 instructions.
In T32 instructions you can use PC for Rt in LDR word instructions. Other uses of PC in these T32
instructions are not permitted.
Use of SP
You can use SP for Rn.
In A32 code, you can use SP for Rt in word instructions. You can use SP for Rt in non-word A32
instructions but this is deprecated.
You can use SP for Rm in A32 instructions but this is deprecated.
In T32 code, you can use SP for Rt in word instructions only. All other uses of SP for Rt in these
instructions are not permitted in T32 code.
Use of SP for Rm is not permitted in T32 state.
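A sketch of typical register-offset loads, such as indexing an array of words. The register roles (base address, index) are assumptions for illustration.

```asm
    LDR     r0, [r1, r2, LSL #2]    ; r0 = word at r1 + (r2 * 4); r1 is the array base, r2 the index
    LDRB    r3, [r4, r5]            ; r3 = byte at r4 + r5, zero-extended
    LDR     r6, [r7, -r8]           ; A32 only: the offset register is subtracted from the base
```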
Related references
C1.9 Condition code suffixes on page C1-92
q Where ±Rm is shown, you can use –Rm, +Rm, or Rm. Where +Rm is shown, you cannot use –Rm.
r For word loads, Rt can be the PC. A load to the PC causes a branch to the address loaded. In Armv4, bits[1:0] of the address loaded must be 0b00. In Armv5T and
above, bits[1:0] must not be 0b10, and if bit[0] is 1, execution continues in T32 state, otherwise execution continues in A32 state.
s Rt, Rn, and Rm must all be in the range R0-R7.
C2.49 LDR (register-relative)
Syntax
LDR{type}{cond}{.W} Rt, label
where:
type
label
is a symbol defined by the FIELD directive. label specifies an offset from the base register
which is defined using the MAP directive.
label must be within a limited distance of the value in the base register.
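A sketch of a register-relative load using the MAP and FIELD directives. The choice of r9 as the base register and the field names count and flags are hypothetical.

```asm
        MAP     0, r9           ; start a storage map based on r9
count   FIELD   4               ; count is at offset 0 from r9
flags   FIELD   4               ; flags is at offset 4 from r9

        LDR     r0, flags       ; equivalent to LDR r0, [r9, #4]
```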
The following table shows the possible offsets between the label and the current instruction:
t For word loads, Rt can be the PC. A load to the PC causes a branch to the address loaded. In Armv4, bits[1:0] of the address loaded must be 0b00. In Armv5T and
above, bits[1:0] must not be 0b10, and if bit[0] is 1, execution continues in T32 state, otherwise execution continues in A32 state.
u Must be a multiple of 4.
Use of PC
You can use PC for Rt in word instructions. Other uses of PC are not permitted in these instructions.
Use of SP
In A32 code, you can use SP for Rt in word instructions. You can use SP for Rt in non-word A32
instructions but this is deprecated.
In T32 code, you can use SP for Rt in word instructions only. All other uses of SP for Rt in these
instructions are not permitted in T32 code.
Related references
C1.9 Condition code suffixes on page C1-92
C2.50 LDR, unprivileged
Syntax
LDR{type}T{cond} Rt, [Rn {, #offset}] ; immediate offset (32-bit T32 encoding only)
where:
type
Operation
When these instructions are executed by privileged software, they access memory with the same
restrictions as they would have if they were executed by unprivileged software.
When executed by unprivileged software these instructions behave in exactly the same way as the
corresponding load instruction, for example LDRSBT behaves in the same way as LDRSB.
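A sketch of the privileged case, assuming A32 code in a privileged handler that reads through a pointer supplied by unprivileged software. The use of r1 and r3 for the pointers is an assumption.

```asm
    LDRT    r0, [r1], #4        ; word load checked as if unprivileged; r1 is then advanced by 4
    LDRBT   r2, [r3]            ; byte load with the same unprivileged access check
```

If unprivileged software is not permitted to read the address, the access faults even though the handler itself is privileged.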
A32, word or byte Not available -4095 to 4095 ±Rm LSL #0-31, LSR #1-32, ASR #1-32, ROR #1-31, RRX
A32, signed byte, halfword, or signed halfword Not available -255 to 255 ±Rm Not available
T32, 32-bit encoding, word, halfword, signed halfword, byte, or signed byte 0 to 255 Not available Not available
Related references
C1.9 Condition code suffixes on page C1-92
C2.51 LDREX
Load Register Exclusive.
Syntax
LDREX{cond} Rt, [Rn {, #offset}]
LDREXB{cond} Rt, [Rn]
LDREXH{cond} Rt, [Rn]
LDREXD{cond} Rt, Rt2, [Rn]
where:
cond
is an optional condition code.
Rt
is the register to load.
Rt2
is the second register for doubleword loads.
Rn
is the register on which the memory address is based.
offset
is an optional offset applied to the value in Rn. offset is permitted only in 32-bit T32
instructions. If offset is omitted, an offset of zero is assumed.
Operation
LDREX loads data from memory.
• If the physical address has the Shared TLB attribute, LDREX tags the physical address as exclusive
access for the current processor, and clears any exclusive access tag for this processor for any other
physical address.
• Otherwise, it tags the fact that the executing processor has an outstanding tagged physical address.
LDREXB and LDREXH zero extend the value loaded.
Restrictions
PC must not be used for any of Rt, Rt2, or Rn.
For A32 instructions:
• SP can be used, but use of SP for either Rt or Rt2 is deprecated.
• For LDREXD, Rt must be an even numbered register, and not LR.
• Rt2 must be R(t+1).
• offset is not permitted.
Usage
Use LDREX and STREX to implement interprocess communication in multiple-processor and shared-
memory systems.
For reasons of performance, keep the number of instructions between corresponding LDREX and STREX
instructions to a minimum.
Note
The address used in a STREX instruction must be the same as the address in the most recently executed
LDREX instruction.
Architectures
These 32-bit instructions are available in A32 and T32.
The LDREXD instruction is not available in the Armv7‑M architecture.
There are no 16-bit versions of these instructions in T32.
Examples
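; LockAddr is assumed to be a register name (for example, an alias defined with RN) that holds the lock address
MOV r1, #0x1 ; load the ‘lock taken’ value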
try
LDREX r0, [LockAddr] ; load the lock value
CMP r0, #0 ; is the lock free?
STREXEQ r0, r1, [LockAddr] ; try and claim the lock
CMPEQ r0, #0 ; did this succeed?
BNE try ; no – try again
.... ; yes – we have the lock
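A sketch of the matching lock release, assuming the same LockAddr register as in the example above and that 0 is the value that marks the lock as free. The barrier is included so that accesses made while holding the lock are observed before the release.

```asm
    DMB                         ; accesses made while holding the lock are observed before the release
    MOV     r0, #0              ; the 'lock free' value
    STR     r0, [LockAddr]      ; release the lock
```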
Related references
C1.9 Condition code suffixes on page C1-92
C2.52 LSL
Logical Shift Left. This instruction is a preferred synonym for MOV instructions with shifted register
operands.
Syntax
LSL{S}{cond} Rd, Rm, Rs
LSL{S}{cond} Rd, Rm, #sh
where:
S
is an optional suffix. If S is specified, the condition flags are updated on the result of the
operation.
Rd
is the destination register.
Rm
is the register holding the first operand. This operand is shifted left.
Rs
is a register holding a shift value to apply to the value in Rm. Only the least significant byte is
used.
sh
is a constant shift. The range of values permitted is 0-31.
Operation
LSL provides the value of a register multiplied by a power of two, inserting zeros into the vacated bit
positions.
Caution
Do not use the S suffix when using PC as Rd in User mode or System mode. The assembler cannot warn
you about this because it has no information about what the processor mode is likely to be at execution
time.
You cannot use PC for Rd or any operand in the LSL instruction if it has a register-controlled shift.
Condition flags
If S is specified, the LSL instruction updates the N and Z flags according to the result.
The C flag is unaffected if the shift value is 0. Otherwise, the C flag is updated to the last bit shifted out.
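The register-controlled form and its flag updates can be sketched as a small C model. This is an illustration only, not the architectural pseudocode: the names shift_result and lsls_reg are hypothetical, and the behavior for shift values of 32 or more (result zero, C taken from bit 0 for exactly 32, C cleared above 32) is assumed from the general A32 shift rules rather than stated in this section.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value written to Rd and the flags after a flag-setting shift. */
typedef struct {
    uint32_t value;
    bool n, z, c;
} shift_result;

/* Models LSLS Rd, Rm, Rs. carry_in is the C flag before the instruction. */
shift_result lsls_reg(uint32_t rm, uint32_t rs, bool carry_in)
{
    uint32_t amount = rs & 0xFFu;           /* only the least significant byte is used */
    shift_result r;

    if (amount == 0) {
        r.value = rm;
        r.c = carry_in;                     /* C is unaffected */
    } else if (amount < 32) {
        r.value = rm << amount;
        r.c = (rm >> (32 - amount)) & 1u;   /* last bit shifted out */
    } else if (amount == 32) {
        r.value = 0;
        r.c = rm & 1u;                      /* bit 0 is the last bit shifted out */
    } else {
        r.value = 0;
        r.c = false;                        /* only zeros are shifted out */
    }
    r.n = (r.value >> 31) & 1u;
    r.z = (r.value == 0);
    return r;
}
```

For example, shifting 0x80000001 left by one position in this model gives 0x00000002 with C set, because bit 31 is the last bit shifted out.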
16-bit instructions
The following forms of these instructions are available in T32 code, and are 16-bit instructions:
LSLS Rd, Rm, #sh
Rd and Rm must both be Lo registers. This form can only be used outside an IT block.
Architectures
The 32-bit instruction is available in A32 and T32.
The 16-bit instruction is available in T32.
Example
LSLS r1, r2, r3
Related references
C2.58 MOV on page C2-199
C1.9 Condition code suffixes on page C1-92
C2.53 LSR
Logical Shift Right. This instruction is a preferred synonym for MOV instructions with shifted register
operands.
Syntax
LSR{S}{cond} Rd, Rm, Rs
LSR{S}{cond} Rd, Rm, #sh
where:
S
is an optional suffix. If S is specified, the condition flags are updated on the result of the
operation.
Rd
is the destination register.
Rm
is the register holding the first operand. This operand is shifted right.
Rs
is a register holding a shift value to apply to the value in Rm. Only the least significant byte is
used.
sh
is a constant shift. The range of values permitted is 1-32.
Operation
LSR provides the unsigned value of a register divided by a variable power of two, inserting zeros into the
vacated bit positions.
Caution
Do not use the S suffix when using PC as Rd in User mode or System mode. The assembler cannot warn
you about this because it has no information about what the processor mode is likely to be at execution
time.
You cannot use PC for Rd or any operand in the LSR instruction if it has a register-controlled shift.
Condition flags
If S is specified, the instruction updates the N and Z flags according to the result.
The C flag is unaffected if the shift value is 0. Otherwise, the C flag is updated to the last bit shifted out.
16-bit instructions
The following forms of these instructions are available in T32 code, and are 16-bit instructions:
LSRS Rd, Rm, #sh
Rd and Rm must both be Lo registers. This form can only be used outside an IT block.
Architectures
The 32-bit instruction is available in A32 and T32.
The 16-bit instruction is available in T32.
Example
LSR r4, r5, r6
Related references
C2.58 MOV on page C2-199
C1.9 Condition code suffixes on page C1-92
C2.54 MCR and MCR2
Note
MCR2 is not supported in Armv8.
Syntax
MCR{cond} coproc, #opcode1, Rt, CRn, CRm{, #opcode2}
where:
cond
Usage
The use of these instructions depends on the coprocessor. See the coprocessor documentation for details.
Architectures
These 32-bit instructions are available in A32 and T32.
There are no 16-bit versions of these instructions in T32.
Related references
C1.9 Condition code suffixes on page C1-92
C2.55 MCRR and MCRR2
Note
MCRR2 is not supported in Armv8.
Syntax
MCRR{cond} coproc, #opcode, Rt, Rt2, CRn
where:
cond
Usage
The use of these instructions depends on the coprocessor. See the coprocessor documentation for details.
Architectures
These 32-bit instructions are available in A32 and T32.
There are no 16-bit versions of these instructions in T32.
Related references
C1.9 Condition code suffixes on page C1-92
C2.56 MLA
Multiply-Accumulate with signed or unsigned 32-bit operands, giving the least significant 32 bits of the
result.
Syntax
MLA{S}{cond} Rd, Rn, Rm, Ra
where:
cond
is an optional condition code.
S
is an optional suffix. If S is specified, the condition flags are updated on the result of the
operation.
Rd
is the destination register.
Rn, Rm
are registers holding the values to be multiplied.
Ra
is a register holding the value to be added.
Operation
The MLA instruction multiplies the values from Rn and Rm, adds the value from Ra, and places the least
significant 32 bits of the result in Rd.
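Because only the least significant 32 bits are kept, the same instruction serves signed and unsigned operands. A minimal C sketch of the arithmetic, where the function name mla is a hypothetical stand-in for the instruction:

```c
#include <stdint.h>

/* Models MLA Rd, Rn, Rm, Ra: Rd = (Rn * Rm + Ra) modulo 2^32. */
uint32_t mla(uint32_t rn, uint32_t rm, uint32_t ra)
{
    /* Compute in 64 bits, then keep the least significant 32 bits. */
    return (uint32_t)((uint64_t)rn * rm + ra);
}
```

Passing the two's complement encoding of -3 for Rn, with Rm = 5 and Ra = 20, returns 5 in this model, the same bit pattern that a signed calculation produces.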
Register restrictions
You cannot use PC for any register.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
If S is specified, the MLA instruction:
• Updates the N and Z flags according to the result.
• Does not affect the C or V flag.
Architectures
This instruction is available in A32 and T32.
There is no 16-bit version of this instruction in T32.
Example
MLA r10, r2, r1, r5
Related references
C1.9 Condition code suffixes on page C1-92
C2.57 MLS
Multiply-Subtract, with signed or unsigned 32-bit operands, giving the least significant 32 bits of the
result.
Syntax
MLS{cond} Rd, Rn, Rm, Ra
where:
cond
is an optional condition code.
Rd
is the destination register.
Rn, Rm
are registers holding the values to be multiplied.
Ra
is a register holding the value to be subtracted from.
Operation
The MLS instruction multiplies the values in Rn and Rm, subtracts the result from the value in Ra, and
places the least significant 32 bits of the final result in Rd.
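One common use of this operation is to derive a remainder from a quotient with a single extra instruction. The C sketch below models that idea; the names mls and remainder_u32 are hypothetical, and the quotient is assumed to come from a separate divide step (for example a UDIV instruction, where the target implements one).

```c
#include <stdint.h>

/* Models MLS Rd, Rn, Rm, Ra: Rd = (Ra - Rn * Rm) modulo 2^32. */
uint32_t mls(uint32_t rn, uint32_t rm, uint32_t ra)
{
    return (uint32_t)((uint64_t)ra - (uint64_t)rn * rm);
}

/* Remainder of an unsigned division, built from a quotient and one MLS.
   The divisor must be nonzero. */
uint32_t remainder_u32(uint32_t dividend, uint32_t divisor)
{
    uint32_t quotient = dividend / divisor;     /* assumed separate divide step */
    return mls(quotient, divisor, dividend);    /* dividend - quotient * divisor */
}
```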
Register restrictions
You cannot use PC for any register.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Architectures
This instruction is available in A32 and T32.
There is no 16-bit version of this instruction in T32.
Example
MLS r4, r5, r6, r7
Related references
C1.9 Condition code suffixes on page C1-92
C2.58 MOV
Move.
Syntax
MOV{S}{cond} Rd, Operand2
MOV{cond} Rd, #imm16
where:
S
is an optional suffix. If S is specified, the condition flags are updated on the result of the
operation.
cond
is an optional condition code.
Rd
is the destination register.
Operand2
is a flexible second operand.
imm16
is any value in the range 0-65535.
Operation
The MOV instruction copies the value of Operand2 into Rd.
In certain circumstances, the assembler can substitute MVN for MOV, or MOV for MVN. Be aware of this when
reading disassembly listings.
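The substitution matters when a constant cannot be encoded directly but its bitwise complement can. The following C sketch shows one way such a choice could be made for A32 constants, assuming the A32 rule that a constant Operand2 is an 8-bit pattern rotated right by an even number of bits. It ignores the MOV Rd, #imm16 form and the different T32 constant rules, and the names is_a32_immediate and choose_mov are hypothetical; it is not a description of how the assembler is implemented.

```c
#include <stdbool.h>
#include <stdint.h>

/* Rotate left by n bits; n is reduced modulo 32. */
uint32_t rol32(uint32_t x, unsigned n)
{
    n &= 31u;
    return n ? (x << n) | (x >> (32u - n)) : x;
}

/* True if value is an 8-bit pattern rotated right by an even number of bits. */
bool is_a32_immediate(uint32_t value)
{
    for (unsigned rot = 0; rot < 32; rot += 2) {
        /* Undo a rotate right by rot and check that only bits 7:0 remain. */
        if ((rol32(value, rot) & 0xFFFFFF00u) == 0)
            return true;
    }
    return false;
}

typedef enum { USE_MOV, USE_MVN, NOT_ENCODABLE } mov_choice;

/* Picks the single instruction, if any, that can load value as a constant. */
mov_choice choose_mov(uint32_t value)
{
    if (is_a32_immediate(value))
        return USE_MOV;             /* MOV Rd, #value */
    if (is_a32_immediate(~value))
        return USE_MVN;             /* MVN Rd, #(NOT value) */
    return NOT_ENCODABLE;
}
```

In this model, 0xFFFFFFFF is not a valid MOV constant, but its complement 0 is, so a MOV of 0xFFFFFFFF would appear in a disassembly listing as MVN with the constant 0.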
You can use SP for Rd or Rm, but this is deprecated except in the following cases:
• MOV SP, Rm when Rm is not PC or SP.
• MOV Rd, SP when Rd is not PC or SP.
Note
• You cannot use PC for Rd in MOV Rd, #imm16 if the #imm16 value is not a permitted Operand2 value.
You can use PC in forms with Operand2 without register-controlled shift.
If you use PC as Rm, the value used is the address of the instruction plus 8.
If you use PC as Rd:
• Execution branches to the address corresponding to the result.
• If you use the S suffix, see the SUBS pc,lr instruction.
Condition flags
If S is specified, the instruction:
• Updates the N and Z flags according to the result.
• Can update the C flag during the calculation of Operand2.
• Does not affect the V flag.
16-bit instructions
The following forms of this instruction are available in T32 code, and are 16-bit instructions:
MOVS Rd, #imm
Rd must be a Lo register. imm range 0-255. This form can only be used outside an IT block.
MOVS Rd, Rm
Rd and Rm must both be Lo registers. This form can only be used outside an IT block.
MOV{cond} Rd, Rm
Rd or Rm can be Lo or Hi registers.
Availability
These instructions are available in A32 and T32.
In T32, 16-bit and 32-bit versions of these instructions are available.
Related references
C2.3 Flexible second operand (Operand2) on page C2-112
C2.144 SUBS pc, lr on page C2-317
C1.9 Condition code suffixes on page C1-92
C2.59 MOVT
Move Top.
Syntax
MOVT{cond} Rd, #imm16
where:
cond
is an optional condition code.
Rd
is the destination register.
imm16
is a 16-bit immediate value.
Usage
MOVT writes imm16 to Rd[31:16], without affecting Rd[15:0].
You can generate any 32-bit immediate with a MOV, MOVT instruction pair.
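The pairing can be modeled in C as follows. The function names are hypothetical, and the model assumes that MOV Rd, #imm16 clears Rd[31:16], which is why the MOV must come first in the pair.

```c
#include <stdint.h>

/* Models MOV Rd, #imm16: writes imm16 to Rd[15:0] and clears Rd[31:16]. */
uint32_t mov_imm16(uint16_t imm16)
{
    return imm16;
}

/* Models MOVT Rd, #imm16: writes imm16 to Rd[31:16], keeps Rd[15:0]. */
uint32_t movt(uint32_t rd, uint16_t imm16)
{
    return (rd & 0x0000FFFFu) | ((uint32_t)imm16 << 16);
}

/* Builds a 32-bit constant with the two-instruction pair. */
uint32_t load_constant(uint32_t value)
{
    uint32_t rd = mov_imm16((uint16_t)(value & 0xFFFFu));   /* MOV  Rd, #low  */
    return movt(rd, (uint16_t)(value >> 16));               /* MOVT Rd, #high */
}
```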
Register restrictions
You cannot use PC in A32 or T32 instructions.
You can use SP for Rd in A32 instructions but this is deprecated.
You cannot use SP in T32 instructions.
Condition flags
This instruction does not change the flags.
Architectures
This 32-bit instruction is available in A32 and T32.
There is no 16-bit version of this instruction in T32.
Related references
C1.9 Condition code suffixes on page C1-92
C2.60 MRC and MRC2
Note
MRC2 is not supported in Armv8.
Syntax
MRC{cond} coproc, #opcode1, Rt, CRn, CRm{, #opcode2}
where:
cond
Usage
The use of these instructions depends on the coprocessor. See the coprocessor documentation for details.
Architectures
These 32-bit instructions are available in A32 and T32.
There are no 16-bit versions of these instructions in T32.
Related references
C1.9 Condition code suffixes on page C1-92
C2.61 MRRC and MRRC2
Note
MRRC2 is not supported in Armv8.
Syntax
MRRC{cond} coproc, #opcode, Rt, Rt2, CRm
where:
cond
Usage
The use of these instructions depends on the coprocessor. See the coprocessor documentation for details.
Architectures
These 32-bit instructions are available in A32 and T32.
There are no 16-bit versions of these instructions in T32.
Related references
C1.9 Condition code suffixes on page C1-92
C2.62 MRS (PSR to general-purpose register)
Syntax
MRS{cond} Rd, psr
where:
cond
is an optional condition code.
Rd
is the destination register.
psr
is one of:
APSR
on any processor, in any mode.
CPSR
deprecated synonym for APSR and for use in Debug state, on any processor except
Armv7‑M and Armv6‑M.
SPSR
on any processor, except Armv6‑M, Armv7‑M, Armv8‑M Baseline, and Armv8‑M
Mainline, in privileged software execution only.
Mpsr
on Armv6‑M, Armv7‑M, Armv8‑M Baseline, and Armv8‑M Mainline processors only.
Mpsr
can be any of: IPSR, EPSR, IEPSR, IAPSR, EAPSR, MSP, PSP, XPSR, PRIMASK, BASEPRI,
BASEPRI_MAX, FAULTMASK, or CONTROL.
Usage
Use MRS in combination with MSR as part of a read-modify-write sequence for updating a PSR, for
example to change processor mode, or to clear the Q flag.
In process swap code, the programmers’ model state of the process being swapped out must be saved,
including relevant PSR contents. Similarly, the state of the process being swapped in must also be
restored. These operations make use of MRS/store and load/MSR instruction sequences.
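The read-modify-write pattern for clearing the Q flag can be sketched in C against a stand-in variable. Everything here is a model under stated assumptions: model_apsr, mrs_apsr, and msr_apsr_nzcvq are hypothetical names, and Q is taken to be bit 27, the lowest bit of the PSR[31:27] flags field.

```c
#include <stdint.h>

#define PSR_Q_BIT (UINT32_C(1) << 27)   /* Q flag, lowest bit of PSR[31:27] */

/* Hypothetical stand-in for the APSR so that the sequence can run anywhere. */
uint32_t model_apsr;

/* Models MRS Rd, APSR. */
uint32_t mrs_apsr(void)
{
    return model_apsr;
}

/* Models MSR APSR_nzcvq, Rm: only PSR[31:27] is written. */
void msr_apsr_nzcvq(uint32_t rm)
{
    model_apsr = (model_apsr & 0x07FFFFFFu) | (rm & 0xF8000000u);
}

/* Read-modify-write that clears Q and leaves N, Z, C, and V unchanged. */
void clear_q_flag(void)
{
    uint32_t psr = mrs_apsr();      /* read   */
    psr &= ~PSR_Q_BIT;              /* modify */
    msr_apsr_nzcvq(psr);            /* write  */
}
```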
SPSR
You must not attempt to access the SPSR when the processor is in User or System mode. This is your
responsibility. The assembler cannot warn you about this, because it has no information about the
processor mode at execution time.
CPSR
Arm deprecates reading the CPSR endianness bit (E) with an MRS instruction.
The CPSR execution state bits, other than the E bit, can only be read when the processor is in Debug
state, halting debug-mode. Otherwise, the execution state bits in the CPSR read as zero.
The condition flags can be read in any mode on any processor. Use APSR if you are only interested in
accessing the condition flags in User mode.
Register restrictions
You cannot use PC for Rd in A32 instructions. You can use SP for Rd in A32 instructions but this is
deprecated.
You cannot use PC or SP for Rd in T32 instructions.
Condition flags
This instruction does not change the flags.
Architectures
This instruction is available in A32 and T32.
There is no 16-bit version of this instruction in T32.
Related concepts
A1.13 Current Program Status Register in AArch32 state on page A1-39
Related references
C2.63 MRS (system coprocessor register to general-purpose register) on page C2-206
C2.64 MSR (general-purpose register to system coprocessor register) on page C2-207
C2.65 MSR (general-purpose register to PSR) on page C2-208
C1.9 Condition code suffixes on page C1-92
C2.63 MRS (system coprocessor register to general-purpose register)
Syntax
MRS{cond} Rn, coproc_register
MRS{cond} APSR_nzcv, special_register
where:
cond
is an optional condition code.
coproc_register
is the name of the coprocessor register.
special_register
is the name of the coprocessor register that can be written to APSR_nzcv. This is only possible
for the coprocessor register DBGDSCRint.
Rn
is the general-purpose register. Rn must not be PC.
Usage
You can use this pseudo-instruction to read CP14 or CP15 coprocessor registers, with the exception of
write-only registers. A complete list of the applicable coprocessor register names is in the Arm®v7-AR
Architecture Reference Manual. For example:
MRS R1, SCTLR ; writes the contents of the CP15 coprocessor
; register SCTLR into R1
Architectures
This pseudo-instruction is available in Armv7‑A and Armv7‑R in A32 and 32-bit T32 code.
There is no 16-bit version of this pseudo-instruction in T32.
Related references
C2.62 MRS (PSR to general-purpose register) on page C2-204
C2.64 MSR (general-purpose register to system coprocessor register) on page C2-207
C2.65 MSR (general-purpose register to PSR) on page C2-208
C1.9 Condition code suffixes on page C1-92
Related information
Arm Architecture Reference Manual
C2.64 MSR (general-purpose register to system coprocessor register)
Syntax
MSR{cond} coproc_register, Rn
where:
cond
is an optional condition code.
coproc_register
is the name of the coprocessor register.
Rn
is the general-purpose register. Rn must not be PC.
Usage
You can use this pseudo-instruction to write to any CP14 or CP15 coprocessor writable register. A
complete list of the applicable coprocessor register names is in the Arm Architecture Reference Manual.
For example:
MSR SCTLR, R1 ; writes the contents of R1 into the CP15
; coprocessor register SCTLR
Availability
This pseudo-instruction is available in Armv7‑A and Armv7‑R in A32 and 32-bit T32 code.
There is no 16-bit version of this pseudo-instruction in T32.
Related references
C2.62 MRS (PSR to general-purpose register) on page C2-204
C2.63 MRS (system coprocessor register to general-purpose register) on page C2-206
C2.65 MSR (general-purpose register to PSR) on page C2-208
C1.9 Condition code suffixes on page C1-92
C2.153 SYS on page C2-332
Related information
Arm Architecture Reference Manual
C2.65 MSR (general-purpose register to PSR)
Syntax
MSR{cond} APSR_flags, Rm
where:
cond
is an optional condition code.
flags
specifies the APSR flags to be moved. flags can be one or more of:
nzcvq
ALU flags field mask, PSR[31:27] (User mode)
g
SIMD GE flags field mask, PSR[19:16] (User mode).
Rm
is the general-purpose register. Rm must not be PC.
Syntax on architectures other than Armv6-M, Armv7-M, Armv8-M Baseline, and Armv8-M
Mainline
MSR{cond} APSR_flags, #constant
MSR{cond} psr_fields, Rm
where:
cond
is an optional condition code.
flags
specifies the APSR flags to be moved. flags can be one or more of:
nzcvq
ALU flags field mask, PSR[31:27] (User mode)
g
SIMD GE flags field mask, PSR[19:16] (User mode).
constant
is an expression evaluating to a numeric value. The value must correspond to an 8-bit pattern
rotated by an even number of bits within a 32-bit word. Not available in T32.
Rm
is the source register. Rm must not be PC.
psr
is one of:
CPSR
for use in Debug state, also deprecated synonym for APSR
SPSR
on any processor, in privileged software execution only.
fields
specifies the SPSR or CPSR fields to be moved. fields can be one or more of:
c
control field mask byte, PSR[7:0] (privileged software execution)
x
extension field mask byte, PSR[15:8] (privileged software execution)
s
status field mask byte, PSR[23:16] (privileged software execution)
f
flags field mask byte, PSR[31:24] (privileged software execution).
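The effect of the fields suffix can be pictured as a byte mask applied to the value in Rm. The C sketch below models only that masking; the function names are hypothetical, and it deliberately ignores the additional rules under which writes to unallocated, privileged, or execution state bits are ignored.

```c
#include <stdint.h>
#include <string.h>

/* Byte mask selected by a fields string made of the letters c, x, s, and f. */
uint32_t psr_field_mask(const char *fields)
{
    uint32_t mask = 0;
    if (strchr(fields, 'c')) mask |= 0x000000FFu;   /* control,   PSR[7:0]   */
    if (strchr(fields, 'x')) mask |= 0x0000FF00u;   /* extension, PSR[15:8]  */
    if (strchr(fields, 's')) mask |= 0x00FF0000u;   /* status,    PSR[23:16] */
    if (strchr(fields, 'f')) mask |= 0xFF000000u;   /* flags,     PSR[31:24] */
    return mask;
}

/* Models MSR psr_fields, Rm: only the selected bytes take the value from Rm. */
uint32_t msr_fields(uint32_t psr, const char *fields, uint32_t rm)
{
    uint32_t mask = psr_field_mask(fields);
    return (psr & ~mask) | (rm & mask);
}
```

For example, msr_fields(psr, "c", rm) in this model corresponds to MSR CPSR_c, Rm and changes only PSR[7:0].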
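Syntax on Armv6-M, Armv7-M, Armv8-M Baseline, and Armv8-M Mainline architectures only
MSR{cond} psr, Rm
where: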
cond
is an optional condition code.
Rm
is the source register. Rm must not be PC.
psr
can be any of: APSR, IPSR, EPSR, IEPSR, IAPSR, EAPSR, XPSR, MSP, PSP, PRIMASK, BASEPRI,
BASEPRI_MAX, FAULTMASK, or CONTROL.
Usage
In User mode:
• Use APSR to access the condition flags, Q, or GE bits.
• Writes to unallocated, privileged or execution state bits in the CPSR are ignored. This ensures that
User mode programs cannot change to privileged software execution.
Arm deprecates using MSR to change the endianness bit (E) of the CPSR, in any mode.
You must not attempt to access the SPSR when the processor is in User or System mode.
Register restrictions
You cannot use PC in A32 instructions. You can use SP for Rm in A32 instructions but this is deprecated.
You cannot use PC or SP in T32 instructions.
Condition flags
This instruction updates the flags explicitly if the APSR_nzcvq or CPSR_f field is specified.
Architectures
This instruction is available in A32 and T32.
There is no 16-bit version of this instruction in T32.
Related references
C2.62 MRS (PSR to general-purpose register) on page C2-204
C2.63 MRS (system coprocessor register to general-purpose register) on page C2-206
C2.64 MSR (general-purpose register to system coprocessor register) on page C2-207
C1.9 Condition code suffixes on page C1-92
C2.66 MUL
Multiply with signed or unsigned 32-bit operands, giving the least significant 32 bits of the result.
Syntax
MUL{S}{cond} {Rd}, Rn, Rm
where:
cond
is an optional condition code.
S
is an optional suffix. If S is specified, the condition flags are updated on the result of the
operation.
Rd
is the destination register.
Rn, Rm
are registers holding the values to be multiplied.
Operation
The MUL instruction multiplies the values from Rn and Rm, and places the least significant 32 bits of the
result in Rd.
Register restrictions
You cannot use PC for any register.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
If S is specified, the MUL instruction:
• Updates the N and Z flags according to the result.
• Does not affect the C or V flag.
16-bit instructions
The following forms of the MUL instruction are available in T32 code, and are 16-bit instructions:
MULS Rd, Rn, Rd
Rd and Rn must both be Lo registers. This form can only be used outside an IT block.
There are no other T32 multiply instructions that can update the condition flags.
Availability
This instruction is available in A32 and T32.
The MULS instruction is available in T32 in a 16-bit encoding.
Examples
MUL r10, r2, r5
MULS r0, r2, r2
MULLT r2, r3, r2
Related references
C1.9 Condition code suffixes on page C1-92
C2.67 MVN
Move Not.
Syntax
MVN{S}{cond} Rd, Operand2
where:
S
is an optional suffix. If S is specified, the condition flags are updated on the result of the
operation.
cond
is an optional condition code.
Rd
is the destination register.
Operand2
is a flexible second operand.
Operation
The MVN instruction takes the value of Operand2, performs a bitwise logical NOT operation on the value,
and places the result into Rd.
In certain circumstances, the assembler can substitute MVN for MOV, or MOV for MVN. Be aware of this when
reading disassembly listings.
Note
• The use of PC and SP in A32 instructions is deprecated.
If you use PC as Rm, the value used is the address of the instruction plus 8.
If you use PC as Rd:
• Execution branches to the address corresponding to the result.
• If you use the S suffix, see the SUBS pc,lr instruction.
Condition flags
If S is specified, the instruction:
• Updates the N and Z flags according to the result.
• Can update the C flag during the calculation of Operand2.
• Does not affect the V flag.
16-bit instructions
The following forms of this instruction are available in T32 code, and are 16-bit instructions:
MVNS Rd, Rm
Rd and Rm must both be Lo registers. This form can only be used outside an IT block.
MVN{cond} Rd, Rm
Rd and Rm must both be Lo registers. This form can only be used inside an IT block.
Architectures
This instruction is available in A32 and T32.
Correct example
MVNNE r11, #0xF000000B ; A32 only. This immediate value is not
; available in T32.
Incorrect example
MVN pc,r3,ASR r0 ; PC not permitted with
; register-controlled shift
Related references
C2.3 Flexible second operand (Operand2) on page C2-112
C2.144 SUBS pc, lr on page C2-317
C1.9 Condition code suffixes on page C1-92
C2.68 NOP
No Operation.
Syntax
NOP{cond}
where:
cond
is an optional condition code.
Usage
NOP does nothing. If NOP is not implemented as a specific instruction on your target architecture, the
assembler treats it as a pseudo-instruction and generates an alternative instruction that does nothing, such
as MOV r0, r0 (A32) or MOV r8, r8 (T32).
NOP is not necessarily a time-consuming NOP. The processor might remove it from the pipeline before it
reaches the execution stage.
You can use NOP for padding, for example to place the following instruction on a 64-bit boundary in A32,
or a 32-bit boundary in T32.
Architectures
This instruction is available in A32 and T32.
Related references
C1.9 Condition code suffixes on page C1-92
C2.69 ORN (T32 only)
Syntax
ORN{S}{cond} Rd, Rn, Operand2
where:
S
is an optional suffix. If S is specified, the condition flags are updated on the result of the
operation.
cond
is an optional condition code.
Rd
is the destination register.
Rn
is the register holding the first operand.
Operand2
is a flexible second operand.
Operation
The ORN T32 instruction performs an OR operation on the bits in Rn with the complements of the
corresponding bits in the value of Operand2.
In certain circumstances, the assembler can substitute ORN for ORR, or ORR for ORN. Be aware of this when
reading disassembly listings.
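The relationship between the two instructions is easiest to see in a short C model. The function names orn and orr are hypothetical stand-ins for the instructions, and the model covers only the result value, not the flags.

```c
#include <stdint.h>

/* Models ORN Rd, Rn, Operand2: Rd = Rn OR (NOT Operand2). */
uint32_t orn(uint32_t rn, uint32_t operand2)
{
    return rn | ~operand2;
}

/* Models ORR Rd, Rn, Operand2, for comparison. */
uint32_t orr(uint32_t rn, uint32_t operand2)
{
    return rn | operand2;
}
```

For any constant k, orn(x, k) equals orr(x, ~k) in this model, which is why one instruction can stand in for the other when only one of k and its complement is an encodable constant.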
Use of PC
You cannot use PC (R15) for Rd or any operand in the ORN instruction.
Condition flags
If S is specified, the ORN instruction:
• Updates the N and Z flags according to the result.
• Can update the C flag during the calculation of Operand2.
• Does not affect the V flag.
Examples
ORN r7, r11, lr, ROR #4
ORNS r7, r11, lr, ASR #32
Architectures
This 32-bit instruction is available in T32.
There is no A32 or 16-bit T32 ORN instruction.
Related references
C2.3 Flexible second operand (Operand2) on page C2-112
C2.144 SUBS pc, lr on page C2-317
C1.9 Condition code suffixes on page C1-92
C2.70 ORR
Logical OR.
Syntax
ORR{S}{cond} Rd, Rn, Operand2
where:
S
is an optional suffix. If S is specified, the condition flags are updated on the result of the
operation.
cond
is an optional condition code.
Rd
is the destination register.
Rn
is the register holding the first operand.
Operand2
is a flexible second operand.
Operation
The ORR instruction performs bitwise OR operations on the values in Rn and Operand2.
In certain circumstances, the assembler can substitute ORN for ORR, or ORR for ORN. Be aware of this when
reading disassembly listings.
Condition flags
If S is specified, the ORR instruction:
• Updates the N and Z flags according to the result.
• Can update the C flag during the calculation of Operand2.
• Does not affect the V flag.
16-bit instructions
The following forms of the ORR instruction are available in T32 code, and are 16-bit instructions:
ORRS Rd, Rd, Rm
Rd and Rm must both be Lo registers. This form can only be used outside an IT block.
It does not matter if you specify ORR{S} Rd, Rm, Rd. The instruction is the same.
Example
ORREQ r2,r0,r5
Related references
C2.3 Flexible second operand (Operand2) on page C2-112
C2.144 SUBS pc, lr on page C2-317
C1.9 Condition code suffixes on page C1-92
C2.71 PKHBT and PKHTB
Syntax
PKHBT{cond} {Rd}, Rn, Rm{, LSL #leftshift}
where:
PKHBT
Register restrictions
You cannot use PC for any register.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
These instructions do not change the flags.
Architectures
These instructions are available in A32.
These 32-bit instructions are available in T32. For the Armv7‑M architecture, they are only available in an
Armv7E-M implementation.
There are no 16-bit versions of these instructions in T32.
Correct examples
PKHBT r0, r3, r5 ; combine the bottom halfword of R3
; with the top halfword of R5
PKHBT r0, r3, r5, LSL #16 ; combine the bottom halfword of R3
; with the bottom halfword of R5
PKHTB r0, r3, r5, ASR #16 ; combine the top halfword of R3
; with the top halfword of R5
You can also scale the second operand by using different values of shift.
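The halfword selection in the correct examples can be checked with a small C model. The function names are hypothetical, and the model assumes LSL amounts of 0 to 31 for PKHBT and ASR amounts of 1 to 32 for PKHTB.

```c
#include <stdint.h>

/* Arithmetic shift right of a 32-bit value, for shift amounts 0 to 32. */
uint32_t asr32(uint32_t x, unsigned n)
{
    uint32_t sign_fill = (x & 0x80000000u) ? 0xFFFFFFFFu : 0u;
    if (n == 0)
        return x;
    if (n >= 32)
        return sign_fill;
    return (x >> n) | (sign_fill << (32u - n));
}

/* Models PKHBT Rd, Rn, Rm, LSL #shift:
   bottom halfword from Rn, top halfword from the shifted Rm. */
uint32_t pkhbt(uint32_t rn, uint32_t rm, unsigned shift)
{
    return (rn & 0x0000FFFFu) | ((rm << shift) & 0xFFFF0000u);
}

/* Models PKHTB Rd, Rn, Rm, ASR #shift:
   top halfword from Rn, bottom halfword from the shifted Rm. */
uint32_t pkhtb(uint32_t rn, uint32_t rm, unsigned shift)
{
    return (rn & 0xFFFF0000u) | (asr32(rm, shift) & 0x0000FFFFu);
}
```

With R3 = 0x11112222 and R5 = 0x33334444, the three correct examples produce 0x33332222, 0x44442222, and 0x11113333 respectively in this model.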
Incorrect example
PKHBTEQ r4, r5, r1, ASR #8 ; ASR not permitted with PKHBT
Related references
C1.9 Condition code suffixes on page C1-92
C2.72 PLD, PLDW, and PLI
Syntax
PLtype{cond} [Rn {, #offset}]
PLtype{cond} [Rn, ±Rm {, shift}]
PLtype{cond} label
where:
type
can be one of:
D
Data address.
DW
Data address with intention to write.
I
Instruction address.
type cannot be DW if the syntax specifies label.
cond
is an optional condition code.
Note
cond is permitted only in T32 code, using a preceding IT instruction, but this is deprecated in
the Armv8 architecture. This is an unconditional instruction in A32 code and you must not use
cond.
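Rn
is the register on which the memory address is based.
Rm
is a register containing a value to be used as the offset.
shift
is an optional shift.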
label
is a PC-relative expression.
Range of offsets
The offset is applied to the value in Rn before the preload takes place. The result is used as the memory
address for the preload. The range of offsets permitted is:
• -4095 to +4095 for A32 instructions.
• -255 to +4095 for T32 instructions, when Rn is not PC.
• -4095 to +4095 for T32 instructions, when Rn is PC.
The assembler calculates the offset from the PC for you. The assembler generates an error if label is out
of range.
Register restrictions
Rm must not be PC. For T32 instructions Rm must also not be SP.
Rn must not be PC for T32 instructions of the syntax PLtype{cond} [Rn, ±Rm{, #shift}].
Architectures
The PLD instruction is available in A32.
The 32-bit encoding of PLD is available in T32.
PLDW is available only in the Armv7 architecture and above that implement the Multiprocessing
Extensions.
PLI is available only in the Armv7 architecture and above.
C2.73 POP
Pop registers off a full descending stack.
Syntax
POP{cond} reglist
where:
cond
is an optional condition code.
reglist
is a non-empty list of registers, enclosed in braces. It can contain register ranges. It must be
comma separated if it contains more than one register or register range.
Operation
POP is a synonym for LDMIA sp!, reglist. POP is the preferred mnemonic.
Note
LDM and LDMFD are synonyms of LDMIA.
Registers are stored on the stack in numerical order, with the lowest numbered register at the lowest
address.
T32 instructions
A subset of this instruction is available in the T32 instruction set.
The following restriction applies to the 16-bit POP instruction:
• reglist can only include the Lo registers and the PC.
Example
POP {r0,r10,pc} ; no 16-bit version available
Related references
C2.45 LDM on page C2-177
C2.74 PUSH on page C2-222
C1.9 Condition code suffixes on page C1-92
C2.74 PUSH
Push registers onto a full descending stack.
Syntax
PUSH{cond} reglist
where:
cond
is an optional condition code.
reglist
is a non-empty list of registers, enclosed in braces. It can contain register ranges. It must be
comma separated if it contains more than one register or register range.
Operation
PUSH is a synonym for STMDB sp!, reglist. PUSH is the preferred mnemonic.
Note
STMFD is a synonym of STMDB.
Registers are stored on the stack in numerical order, with the lowest numbered register at the lowest
address.
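The ordering rule and the full descending behavior can be illustrated with a small C model, shown together with the matching POP. The cpu_model type is hypothetical: it indexes the stack by word rather than by byte address, keeps the stack pointer separate from the register array, and performs no bounds checking.

```c
#include <stdint.h>

#define STACK_WORDS 64

/* Hypothetical machine state: 16 registers and a small word-indexed stack. */
typedef struct {
    uint32_t r[16];
    uint32_t stack[STACK_WORDS];
    unsigned sp;                    /* index of the last pushed word */
} cpu_model;

/* Models PUSH reglist. Bit i of reglist selects register Ri. */
void push(cpu_model *cpu, uint16_t reglist)
{
    unsigned count = 0;
    for (unsigned i = 0; i < 16; i++)
        count += (reglist >> i) & 1u;

    cpu->sp -= count;               /* decrement before, as in STMDB sp! */
    unsigned slot = cpu->sp;
    for (unsigned i = 0; i < 16; i++)
        if ((reglist >> i) & 1u)
            cpu->stack[slot++] = cpu->r[i];     /* lowest register, lowest address */
}

/* Models POP reglist, the inverse operation. */
void pop(cpu_model *cpu, uint16_t reglist)
{
    for (unsigned i = 0; i < 16; i++)
        if ((reglist >> i) & 1u)
            cpu->r[i] = cpu->stack[cpu->sp++];  /* increment after, as in LDMIA sp! */
}
```

Because both operations walk the register list from the lowest number upwards, a PUSH followed by a POP of the same list restores every register regardless of the order in which the registers are written in the source.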
T32 instructions
The following restriction applies to the 16-bit PUSH instruction:
• reglist can only include the Lo registers and the LR.
Examples
PUSH {r0,r4-r7}
PUSH {r2,lr}
Related references
C2.45 LDM on page C2-177
C2.73 POP on page C2-221
C1.9 Condition code suffixes on page C1-92
C2.75 QADD
Signed saturating addition.
Syntax
QADD{cond} {Rd}, Rm, Rn
where:
cond
Operation
The QADD instruction adds the values in Rm and Rn. It saturates the result to the signed range
-2^31 ≤ x ≤ 2^31-1.
Note
All values are treated as two’s complement signed integers by this instruction.
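A minimal C model of the saturating addition is shown below. The function name qadd and the bool pointer standing in for the Q flag are hypothetical. The model assumes that the flag is only ever set by the instruction and never cleared by it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Models QADD Rd, Rm, Rn. Sets *q on saturation, otherwise leaves it alone. */
int32_t qadd(int32_t rm, int32_t rn, bool *q)
{
    int64_t sum = (int64_t)rm + (int64_t)rn;    /* exact, cannot overflow */

    if (sum > INT32_MAX) {
        *q = true;
        return INT32_MAX;
    }
    if (sum < INT32_MIN) {
        *q = true;
        return INT32_MIN;
    }
    return (int32_t)sum;
}
```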
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Q flag
If saturation occurs, this instruction sets the Q flag. To read the state of the Q flag, use an MRS instruction.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Example
QADD r0, r1, r9
Related references
C2.62 MRS (PSR to general-purpose register) on page C2-204
A1.11 The Q flag in AArch32 state on page A1-37
C1.9 Condition code suffixes on page C1-92
C2.76 QADD8
Signed saturating parallel byte-wise addition.
Syntax
QADD8{cond} {Rd}, Rn, Rm
where:
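cond
is an optional condition code.
Rd
is the destination register.
Rn, Rm
are the registers holding the operands.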
Operation
This instruction performs four signed integer additions on the corresponding bytes of the operands and
writes the results into the corresponding bytes of the destination. It saturates the results to the signed
range -2^7 ≤ x ≤ 2^7 - 1. The Q flag is not affected even if this operation saturates.
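A minimal Python sketch of the per-byte behavior follows. The helper names are assumptions for illustration; each of the four byte lanes is added and clamped independently of the others.

```python
def signed8(byte):
    """Interpret an 8-bit value as a two's complement signed integer."""
    return byte - 0x100 if byte & 0x80 else byte


def sat8(value):
    """Clamp to the signed 8-bit range."""
    return max(-0x80, min(0x7F, value))


def qadd8(rn, rm):
    """Model QADD8: four independent signed saturating byte additions."""
    result = 0
    for lane in range(4):
        a = signed8((rn >> (8 * lane)) & 0xFF)
        b = signed8((rm >> (8 * lane)) & 0xFF)
        result |= (sat8(a + b) & 0xFF) << (8 * lane)
    return result
```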
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not affect the N, Z, C, V, Q, or GE flags.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Related references
A1.11 The Q flag in AArch32 state on page A1-37
C1.9 Condition code suffixes on page C1-92
C2.77 QADD16
Signed saturating parallel halfword-wise addition.
Syntax
QADD16{cond} {Rd}, Rn, Rm
where:
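cond
is an optional condition code.
Rd
is the destination register.
Rn, Rm
are the registers holding the operands.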
Operation
This instruction performs two signed integer additions on the corresponding halfwords of the operands
and writes the results into the corresponding halfwords of the destination. It saturates the results to the
signed range -2^15 ≤ x ≤ 2^15 - 1. The Q flag is not affected even if this operation saturates.
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not affect the N, Z, C, V, Q, or GE flags.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Related references
A1.11 The Q flag in AArch32 state on page A1-37
C1.9 Condition code suffixes on page C1-92
C2.78 QASX
Signed saturating parallel add and subtract halfwords with exchange.
Syntax
QASX{cond} {Rd}, Rn, Rm
where:
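cond
is an optional condition code.
Rd
is the destination register.
Rn, Rm
are the registers holding the operands.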
Operation
This instruction exchanges the two halfwords of the second operand, then performs an addition on the
two top halfwords of the operands and a subtraction on the bottom two halfwords. It writes the results
into the corresponding halfwords of the destination. It saturates the results to the signed range
-2^15 ≤ x ≤ 2^15 - 1. The Q flag is not affected even if this operation saturates.
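The exchange step is easy to misread, so the following Python sketch spells out which halfwords are combined. It is a model under the assumption that Rn is the first operand and Rm the second; the helper names are hypothetical.

```python
def signed16(half):
    """Interpret a 16-bit value as a two's complement signed integer."""
    return half - 0x10000 if half & 0x8000 else half


def sat16(value):
    """Clamp to the signed 16-bit range."""
    return max(-0x8000, min(0x7FFF, value))


def qasx(rn, rm):
    """Model QASX: top = Rn.top + Rm.bottom, bottom = Rn.bottom - Rm.top."""
    rn_top, rn_bottom = signed16((rn >> 16) & 0xFFFF), signed16(rn & 0xFFFF)
    rm_top, rm_bottom = signed16((rm >> 16) & 0xFFFF), signed16(rm & 0xFFFF)
    top = sat16(rn_top + rm_bottom)       # addition uses the exchanged bottom halfword
    bottom = sat16(rn_bottom - rm_top)    # subtraction uses the exchanged top halfword
    return ((top & 0xFFFF) << 16) | (bottom & 0xFFFF)
```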
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not affect the N, Z, C, V, Q, or GE flags.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Related references
A1.11 The Q flag in AArch32 state on page A1-37
C1.9 Condition code suffixes on page C1-92
C2.79 QDADD
Signed saturating Double and Add.
Syntax
QDADD{cond} {Rd}, Rm, Rn
where:
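cond
is an optional condition code.
Rd
is the destination register.
Rm, Rn
are the registers holding the operands.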
Operation
QDADD calculates SAT(Rm + SAT(Rn * 2)). It saturates the result to the signed range -2^31 ≤ x ≤ 2^31 - 1.
Saturation can occur on the doubling operation, on the addition, or on both. If saturation occurs on the
doubling but not on the addition, the Q flag is set but the final result is unsaturated.
Note
All values are treated as two’s complement signed integers by this instruction.
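The two saturation points can be illustrated with a short Python model. The names are assumptions for the example, and the returned boolean reports only whether this operation would set the Q flag.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def to_signed32(value):
    """Interpret a 32-bit register value as a two's complement signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def sat32(value):
    """Return (clamped value, True if clamping changed the value)."""
    clamped = max(INT32_MIN, min(INT32_MAX, value))
    return clamped, clamped != value


def qdadd(rm, rn):
    """Model QDADD: SAT(Rm + SAT(Rn * 2)), returning (result, q_set)."""
    doubled, q_double = sat32(2 * to_signed32(rn))
    total, q_add = sat32(to_signed32(rm) + doubled)
    return total & 0xFFFFFFFF, q_double or q_add
```

With Rn = 0x40000000 and Rm = 0xFFFFFFFF, the doubling saturates to 2^31 - 1 but the addition does not, so the model returns 0x7FFFFFFE with the Q indication set.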
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Q flag
If saturation occurs, this instruction sets the Q flag. To read the state of the Q flag, use an MRS instruction.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Related references
A1.11 The Q flag in AArch32 state on page A1-37
C2.62 MRS (PSR to general-purpose register) on page C2-204
C1.9 Condition code suffixes on page C1-92
C2.80 QDSUB
Signed saturating Double and Subtract.
Syntax
QDSUB{cond} {Rd}, Rm, Rn
where:
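cond
is an optional condition code.
Rd
is the destination register.
Rm, Rn
are the registers holding the operands.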
Operation
QDSUB calculates SAT(Rm - SAT(Rn * 2)). It saturates the result to the signed range -2^31 ≤ x ≤ 2^31 - 1.
Saturation can occur on the doubling operation, on the subtraction, or on both. If saturation occurs on the
doubling but not on the subtraction, the Q flag is set but the final result is unsaturated.
Note
All values are treated as two’s complement signed integers by this instruction.
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Q flag
If saturation occurs, this instruction sets the Q flag. To read the state of the Q flag, use an MRS instruction.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Example
QDSUBLT r9, r0, r1
Related references
A1.11 The Q flag in AArch32 state on page A1-37
C2.62 MRS (PSR to general-purpose register) on page C2-204
C1.9 Condition code suffixes on page C1-92
C2.81 QSAX
Signed saturating parallel subtract and add halfwords with exchange.
Syntax
QSAX{cond} {Rd}, Rn, Rm
where:
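cond
is an optional condition code.
Rd
is the destination register.
Rn, Rm
are the registers holding the operands.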
Operation
This instruction exchanges the two halfwords of the second operand, then performs a subtraction on the
two top halfwords of the operands and an addition on the bottom two halfwords. It writes the results into
the corresponding halfwords of the destination. It saturates the results to the signed range
-2^15 ≤ x ≤ 2^15 - 1. The Q flag is not affected even if this operation saturates.
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not affect the N, Z, C, V, Q, or GE flags.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Related references
A1.11 The Q flag in AArch32 state on page A1-37
C1.9 Condition code suffixes on page C1-92
C2.82 QSUB
Signed saturating Subtract.
Syntax
QSUB{cond} {Rd}, Rm, Rn
where:
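cond
is an optional condition code.
Rd
is the destination register.
Rm, Rn
are the registers holding the operands.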
Operation
The QSUB instruction subtracts the value in Rn from the value in Rm. It saturates the result to the signed
range -2^31 ≤ x ≤ 2^31 - 1.
Note
All values are treated as two’s complement signed integers by this instruction.
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Q flag
If saturation occurs, this instruction sets the Q flag. To read the state of the Q flag, use an MRS instruction.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Related references
A1.11 The Q flag in AArch32 state on page A1-37
C2.62 MRS (PSR to general-purpose register) on page C2-204
C1.9 Condition code suffixes on page C1-92
C2.83 QSUB8
Signed saturating parallel byte-wise subtraction.
Syntax
QSUB8{cond} {Rd}, Rn, Rm
where:
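cond
is an optional condition code.
Rd
is the destination register.
Rn, Rm
are the registers holding the operands.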
Operation
This instruction subtracts each byte of the second operand from the corresponding byte of the first
operand and writes the results into the corresponding bytes of the destination. It saturates the results to
the signed range -2^7 ≤ x ≤ 2^7 - 1. The Q flag is not affected even if this operation saturates.
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not affect the N, Z, C, V, Q, or GE flags.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Related references
A1.11 The Q flag in AArch32 state on page A1-37
C1.9 Condition code suffixes on page C1-92
C2.84 QSUB16
Signed saturating parallel halfword-wise subtraction.
Syntax
QSUB16{cond} {Rd}, Rn, Rm
where:
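cond
is an optional condition code.
Rd
is the destination register.
Rn, Rm
are the registers holding the operands.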
Operation
This instruction subtracts each halfword of the second operand from the corresponding halfword of the
first operand and writes the results into the corresponding halfwords of the destination. It saturates the
results to the signed range -2^15 ≤ x ≤ 2^15 - 1. The Q flag is not affected even if this operation saturates.
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not affect the N, Z, C, V, Q, or GE flags.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Related references
A1.11 The Q flag in AArch32 state on page A1-37
C1.9 Condition code suffixes on page C1-92
C2.85 RBIT
Reverse the bit order in a 32-bit word.
Syntax
RBIT{cond} Rd, Rn
where:
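cond
is an optional condition code.
Rd
is the destination register.
Rn
is the register holding the operand.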
Register restrictions
You cannot use PC for any register.
You can use SP in the A32 instruction but this is deprecated. You cannot use SP in the T32 instruction.
Condition flags
This instruction does not change the flags.
Architectures
This instruction is available in A32 and T32.
There is no 16-bit version of this instruction in T32.
Example
RBIT r7, r8
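The effect of the bit reversal can be modelled in Python as follows. The function name is hypothetical; the model simply mirrors the 32-bit binary string.

```python
def rbit(value):
    """Model RBIT: bit 0 swaps with bit 31, bit 1 with bit 30, and so on."""
    bits = format(value & 0xFFFFFFFF, "032b")   # 32-character binary string, MSB first
    return int(bits[::-1], 2)
```

Applying the model twice returns the original value, which is a convenient sanity check.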
Related references
C1.9 Condition code suffixes on page C1-92
C2.86 REV
Reverse the byte order in a word.
Syntax
REV{cond} Rd, Rn
where:
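cond
is an optional condition code.
Rd
is the destination register.
Rn
is the register holding the operand.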
Usage
You can use this instruction to change endianness. REV converts 32-bit big-endian data into little-endian
data or 32-bit little-endian data into big-endian data.
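As a sketch of the conversion, the following Python model reinterprets the four bytes of a word in the opposite byte order. The function name is an assumption for illustration.

```python
def rev(value):
    """Model REV: reverse the order of the four bytes in a 32-bit word."""
    raw = (value & 0xFFFFFFFF).to_bytes(4, "little")   # least significant byte first
    return int.from_bytes(raw, "big")                   # reread with the byte order swapped
```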
Register restrictions
You cannot use PC for any register.
You can use SP in the A32 instruction but this is deprecated. You cannot use SP in the T32 instruction.
Condition flags
This instruction does not change the flags.
16-bit instructions
The following form of this instruction is available in T32 code, and is a 16-bit instruction:
REV Rd, Rm
Architectures
This instruction is available in A32 and T32.
Example
REV r3, r7
Related references
C1.9 Condition code suffixes on page C1-92
C2.87 REV16
Reverse the byte order in each halfword independently.
Syntax
REV16{cond} Rd, Rn
where:
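cond
is an optional condition code.
Rd
is the destination register.
Rn
is the register holding the operand.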
Usage
You can use this instruction to change endianness. REV16 converts 16-bit big-endian data into
little-endian data or 16-bit little-endian data into big-endian data.
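A minimal Python model, with a hypothetical function name, shows that the two halfwords stay in place while the bytes inside each one are swapped:

```python
def rev16(value):
    """Model REV16: swap the two bytes within each halfword independently."""
    value &= 0xFFFFFFFF
    low_bytes_up = (value & 0x00FF00FF) << 8     # move each low byte to the high position
    high_bytes_down = (value & 0xFF00FF00) >> 8  # move each high byte to the low position
    return low_bytes_up | high_bytes_down
```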
Register restrictions
You cannot use PC for any register.
You can use SP in the A32 instruction but this is deprecated. You cannot use SP in the T32 instruction.
Condition flags
This instruction does not change the flags.
16-bit instructions
The following form of this instruction is available in T32 code, and is a 16-bit instruction:
REV16 Rd, Rm
Architectures
This instruction is available in A32 and T32.
Example
REV16 r0, r0
Related references
C1.9 Condition code suffixes on page C1-92
C2.88 REVSH
Reverse the byte order in the bottom halfword, and sign extend to 32 bits.
Syntax
REVSH{cond} Rd, Rn
where:
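cond
is an optional condition code.
Rd
is the destination register.
Rn
is the register holding the operand.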
Usage
You can use this instruction to change endianness. REVSH converts either:
• 16-bit signed big-endian data into 32-bit signed little-endian data.
• 16-bit signed little-endian data into 32-bit signed big-endian data.
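The combination of byte swap and sign extension can be sketched in Python. The function name is hypothetical; only the bottom halfword of the input contributes to the result.

```python
def revsh(value):
    """Model REVSH: byte-swap the bottom halfword, then sign extend to 32 bits."""
    low = value & 0xFFFF
    swapped = ((low & 0xFF) << 8) | (low >> 8)
    if swapped & 0x8000:          # bit 15 of the swapped halfword is the sign bit
        swapped |= 0xFFFF0000
    return swapped
```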
Register restrictions
You cannot use PC for any register.
You can use SP in the A32 instruction but this is deprecated. You cannot use SP in the T32 instruction.
Condition flags
This instruction does not change the flags.
16-bit instructions
The following form of this instruction is available in T32 code, and is a 16-bit instruction:
REVSH Rd, Rm
Architectures
This instruction is available in A32 and T32.
Example
REVSH r0, r5 ; Reverse Signed Halfword
Related references
C1.9 Condition code suffixes on page C1-92
C2.89 RFE
Return From Exception.
Syntax
RFE{addr_mode}{cond} Rn{!}
where:
addr_mode
Rn
specifies the base register. Rn must not be PC.
!
is an optional suffix. If ! is present, the final address is written back into Rn.
Usage
You can use RFE to return from an exception if you previously saved the return state using the SRS
instruction. Rn is usually the SP where the return state information was saved.
Operation
Loads the PC and the CPSR from the address contained in Rn, and the following address. Optionally
updates Rn.
Notes
RFE writes an address to the PC. The alignment of this address must be correct for the instruction set in
use after the exception return:
• For a return to A32, the address written to the PC must be word-aligned.
• For a return to T32, the address written to the PC must be halfword-aligned.
• For a return to Jazelle, there are no alignment restrictions on the address written to the PC.
No special precautions are required in software to follow these rules, if you use the instruction to return
after a valid exception entry mechanism.
Where addresses are not word-aligned, RFE ignores the least significant two bits of Rn.
The time order of the accesses to individual words of memory generated by RFE is not architecturally
defined. Do not use this instruction on memory-mapped I/O locations where access order matters.
Architectures
This instruction is available in A32.
This 32-bit T32 instruction is available, except in the Armv7‑M and Armv8‑M Mainline architectures.
There is no 16-bit version of this instruction.
Example
RFE sp!
Related concepts
A1.3 Processor modes, and privileged and unprivileged software execution on page A1-28
Related references
C2.129 SRS on page C2-289
C1.9 Condition code suffixes on page C1-92
C2.90 ROR
Rotate Right. This instruction is a preferred synonym for MOV instructions with shifted register operands.
Syntax
ROR{S}{cond} Rd, Rm, Rs
where:
S
is an optional suffix. If S is specified, the condition flags are updated on the result of the
operation.
cond
is an optional condition code.
Rd
is the destination register.
Rm
is the register holding the first operand. This operand is shifted right.
Rs
is a register holding a shift value to apply to the value in Rm. Only the least significant byte is
used.
sh
Operation
ROR provides the value of the contents of a register rotated by a value. The bits that are rotated off the
right end are inserted into the vacated bit positions on the left.
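The register-controlled form can be modelled in Python as shown below. This is a sketch with hypothetical names: it uses only the least significant byte of Rs, and it returns the carry that the S form would write to the C flag.

```python
def ror(rm, rs, carry_in=0):
    """Model ROR Rd, Rm, Rs, returning (result, carry_out)."""
    rm &= 0xFFFFFFFF
    amount = rs & 0xFF                    # only the least significant byte of Rs is used
    if amount == 0:
        return rm, carry_in               # a zero shift leaves the value and carry unchanged
    n = amount % 32
    result = ((rm >> n) | (rm << (32 - n))) & 0xFFFFFFFF
    return result, result >> 31           # the last bit rotated out lands in bit[31]
```

For example, rotating 0x12345678 right by 8 gives 0x78123456.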
Caution
Do not use the S suffix when using PC as Rd in User mode or System mode. The assembler cannot warn
you about this because it has no information about what the processor mode is likely to be at execution
time.
You cannot use PC for Rd or any operand in this instruction if it has a register-controlled shift.
Condition flags
If S is specified, the instruction updates the N and Z flags according to the result.
The C flag is unaffected if the shift value is 0. Otherwise, the C flag is updated to the last bit shifted out.
16-bit instructions
The following forms of this instruction are available in T32 code, and are 16-bit instructions:
RORS Rd, Rd, Rs
Rd and Rs must both be Lo registers. This form can only be used outside an IT block.
ROR{cond} Rd, Rd, Rs
Rd and Rs must both be Lo registers. This form can only be used inside an IT block.
Architectures
This instruction is available in A32 and T32.
Example
ROR r4, r5, r6
Related references
C2.58 MOV on page C2-199
C1.9 Condition code suffixes on page C1-92
C2.91 RRX
Rotate Right with Extend. This instruction is a preferred synonym for MOV instructions with shifted
register operands.
Syntax
RRX{S}{cond} Rd, Rm
where:
S
is an optional suffix. If S is specified, the condition flags are updated on the result of the
operation.
cond
is an optional condition code.
Rd
is the destination register.
Rm
is the register holding the first operand. This operand is shifted right.
Operation
RRX provides the value of the contents of a register shifted right one bit. The old carry flag is shifted into
bit[31]. If the S suffix is present, the old bit[0] is placed in the carry flag.
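The 33-bit rotation through the carry flag can be sketched in Python. The function name is an assumption; the second return value is the carry that RRXS would write.

```python
def rrx(rm, carry_in):
    """Model RRX: shift right one bit, inserting the old carry at bit[31]."""
    rm &= 0xFFFFFFFF
    result = (rm >> 1) | ((carry_in & 1) << 31)
    carry_out = rm & 1                    # the old bit[0] becomes the new carry
    return result, carry_out
```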
Caution
Do not use the S suffix when using PC as Rd in User mode or System mode. The assembler cannot warn
you about this because it has no information about what the processor mode is likely to be at execution
time.
You cannot use PC for Rd or any operand in this instruction if it has a register-controlled shift.
Condition flags
If S is specified, the instruction updates the N and Z flags according to the result.
The C flag is unaffected if the shift value is 0. Otherwise, the C flag is updated to the last bit shifted out.
Architectures
The 32-bit instruction is available in A32 and T32.
There is no 16-bit instruction in T32.
Related references
C2.58 MOV on page C2-199
C1.9 Condition code suffixes on page C1-92
C2.92 RSB
Reverse Subtract without carry.
Syntax
RSB{S}{cond} {Rd}, Rn, Operand2
where:
S
is an optional suffix. If S is specified, the condition flags are updated on the result of the
operation.
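cond
is an optional condition code.
Rd
is the destination register.
Rn
is the register holding the first operand.
Operand2
is a flexible second operand.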
Operation
The RSB instruction subtracts the value in Rn from the value of Operand2. This is useful because of the
wide range of options for Operand2.
In certain circumstances, the assembler can substitute one instruction for another. Be aware of this when
reading disassembly listings.
Condition flags
If S is specified, the RSB instruction updates the N, Z, C and V flags according to the result.
16-bit instructions
The following forms of this instruction are available in T32 code, and are 16-bit instructions:
RSBS Rd, Rn, #0
Rd and Rn must both be Lo registers. This form can only be used outside an IT block.
RSB{cond} Rd, Rn, #0
Rd and Rn must both be Lo registers. This form can only be used inside an IT block.
Example
RSB r4, r4, #1280 ; subtracts contents of R4 from 1280
Related references
C2.3 Flexible second operand (Operand2) on page C2-112
C1.9 Condition code suffixes on page C1-92
C2.93 RSC
Reverse Subtract with Carry.
Syntax
RSC{S}{cond} {Rd}, Rn, Operand2
where:
S
is an optional suffix. If S is specified, the condition flags are updated on the result of the
operation.
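cond
is an optional condition code.
Rd
is the destination register.
Rn
is the register holding the first operand.
Operand2
is a flexible second operand.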
Usage
The RSC instruction subtracts the value in Rn from the value of Operand2. If the carry flag is clear, the
result is reduced by one.
You can use RSC to synthesize multiword arithmetic.
In certain circumstances, the assembler can substitute one instruction for another. Be aware of this when
reading disassembly listings.
RSC is not available in T32 code.
Use of PC and SP
Use of PC and SP is deprecated.
You cannot use PC for Rd or any operand in an RSC instruction that has a register-controlled shift.
If you use PC (R15) as Rn or Rm, the value used is the address of the instruction plus 8.
If you use PC as Rd:
• Execution branches to the address corresponding to the result.
• If you use the S suffix, see the SUBS pc,lr instruction.
Condition flags
If S is specified, the RSC instruction updates the N, Z, C and V flags according to the result.
Correct example
RSCSLE r0,r5,r0,LSL r4 ; conditional, flags set
Incorrect example
RSCSLE r0,pc,r0,LSL r4 ; PC not permitted with register
; controlled shift
Related references
C2.3 Flexible second operand (Operand2) on page C2-112
C1.9 Condition code suffixes on page C1-92
C2.94 SADD8
Signed parallel byte-wise addition.
Syntax
SADD8{cond} {Rd}, Rn, Rm
where:
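cond
is an optional condition code.
Rd
is the destination register.
Rn, Rm
are the registers holding the operands.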
Operation
This instruction performs four signed integer additions on the corresponding bytes of the operands and
writes the results into the corresponding bytes of the destination. The results are modulo 2^8. It sets the
APSR GE flags.
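The following Python sketch models the wrapped byte results together with the GE bits. It assumes the architectural rule that a GE bit is set when the signed sum for its byte lane, taken before wrapping, is greater than or equal to zero; the function names are hypothetical.

```python
def signed8(byte):
    """Interpret an 8-bit value as a two's complement signed integer."""
    return byte - 0x100 if byte & 0x80 else byte


def sadd8(rn, rm):
    """Model SADD8, returning (result, ge) where ge is a 4-bit GE[3:0] mask."""
    result = 0
    ge = 0
    for lane in range(4):
        a = signed8((rn >> (8 * lane)) & 0xFF)
        b = signed8((rm >> (8 * lane)) & 0xFF)
        total = a + b
        result |= (total & 0xFF) << (8 * lane)   # result wraps modulo 2^8
        if total >= 0:                           # assumed GE rule: unwrapped sum >= 0
            ge |= 1 << lane
    return result, ge
```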
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
GE flags
This instruction does not affect the N, Z, C, V, or Q flags.
It sets the GE flags in the APSR as follows:
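GE[0]
for bits[7:0] of the result.
GE[1]
for bits[15:8] of the result.
GE[2]
for bits[23:16] of the result.
GE[3]
for bits[31:24] of the result.
It sets a GE flag to 1 to indicate that the corresponding result is greater than or equal to zero.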
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
C2.95 SADD16
Signed parallel halfword-wise addition.
Syntax
SADD16{cond} {Rd}, Rn, Rm
where:
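cond
is an optional condition code.
Rd
is the destination register.
Rn, Rm
are the registers holding the operands.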
Operation
This instruction performs two signed integer additions on the corresponding halfwords of the operands
and writes the results into the corresponding halfwords of the destination. The results are modulo 2^16. It
sets the APSR GE flags.
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
GE flags
This instruction does not affect the N, Z, C, V, or Q flags.
It sets the GE flags in the APSR as follows:
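GE[1:0]
for bits[15:0] of the result.
GE[3:2]
for bits[31:16] of the result.
It sets a pair of GE flags to 1 to indicate that the corresponding result is greater than or equal to zero.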
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Related references
C2.100 SEL on page C2-257
C1.9 Condition code suffixes on page C1-92
C2.96 SASX
Signed parallel add and subtract halfwords with exchange.
Syntax
SASX{cond} {Rd}, Rn, Rm
where:
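cond
is an optional condition code.
Rd
is the destination register.
Rn, Rm
are the registers holding the operands.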
Operation
This instruction exchanges the two halfwords of the second operand, then performs an addition on the
two top halfwords of the operands and a subtraction on the bottom two halfwords. It writes the results
into the corresponding halfwords of the destination. The results are modulo 2^16. It sets the APSR GE
flags.
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
GE flags
This instruction does not affect the N, Z, C, V, or Q flags.
It sets the GE flags in the APSR as follows:
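GE[1:0]
for bits[15:0] of the result.
GE[3:2]
for bits[31:16] of the result.
It sets a pair of GE flags to 1 to indicate that the corresponding result is greater than or equal to zero.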
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Related references
C2.100 SEL on page C2-257
C1.9 Condition code suffixes on page C1-92
C2.97 SBC
Subtract with Carry.
Syntax
SBC{S}{cond} {Rd}, Rn, Operand2
where:
S
is an optional suffix. If S is specified, the condition flags are updated on the result of the
operation.
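cond
is an optional condition code.
Rd
is the destination register.
Rn
is the register holding the first operand.
Operand2
is a flexible second operand.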
Usage
The SBC (Subtract with Carry) instruction subtracts the value of Operand2 from the value in Rn. If the
carry flag is clear, the result is reduced by one.
You can use SBC to synthesize multiword arithmetic.
In certain circumstances, the assembler can substitute one instruction for another. Be aware of this when
reading disassembly listings.
Condition flags
If S is specified, the SBC instruction updates the N, Z, C and V flags according to the result.
16-bit instructions
The following forms of this instruction are available in T32 code, and are 16-bit instructions:
SBCS Rd, Rd, Rm
Rd and Rm must both be Lo registers. This form can only be used outside an IT block.
SBC{cond} Rd, Rd, Rm
Rd and Rm must both be Lo registers. This form can only be used inside an IT block.
For clarity, the above examples use consecutive registers for multiword values. There is no requirement
to do this. The following, for example, is perfectly valid:
SUBS r6, r6, r9
SBCS r9, r2, r1
SBC r2, r8, r11
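The way a SUBS, SBCS, SBC chain passes the borrow from word to word can be sketched in Python. This is an illustrative model with hypothetical names, in which multiword values are lists of 32-bit words with the least significant word first.

```python
MASK32 = 0xFFFFFFFF


def sbcs(rn, operand2, carry):
    """Model one SBCS step: Rn - Operand2 - NOT(carry), returning (result, carry_out)."""
    total = (rn & MASK32) + (~operand2 & MASK32) + (carry & 1)
    return total & MASK32, total >> 32    # carry_out is 1 when no borrow occurred


def sub_multiword(a_words, b_words):
    """Subtract b from a word by word, as a SUBS/SBCS/.../SBC chain would."""
    carry = 1                             # SUBS behaves like SBCS with the carry set
    result = []
    for a, b in zip(a_words, b_words):
        word, carry = sbcs(a, b, carry)
        result.append(word)
    return result
```

Subtracting 1 from 2^64, held in three words, shows the borrow rippling through the two lower words.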
Related references
C2.3 Flexible second operand (Operand2) on page C2-112
C1.9 Condition code suffixes on page C1-92
C2.98 SBFX
Signed Bit Field Extract.
Syntax
SBFX{cond} Rd, Rn, #lsb, #width
where:
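cond
is an optional condition code.
Rd
is the destination register.
Rn
is the source register.
lsb
is the bit number of the least significant bit in the bitfield, in the range 0 to 31.
width
is the width of the bitfield, in the range 1 to (32-lsb).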
Operation
Copies adjacent bits from one register into the least significant bits of a second register, and sign extends
to 32 bits.
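A short Python model of the extraction and sign extension follows. The function name and the range check are assumptions for illustration; the check assumes that lsb plus width must not exceed 32.

```python
def sbfx(rn, lsb, width):
    """Model SBFX: extract width bits starting at lsb and sign extend to 32 bits."""
    if not (0 <= lsb <= 31 and 1 <= width <= 32 - lsb):
        raise ValueError("bitfield must lie within the 32-bit register")
    field = (rn >> lsb) & ((1 << width) - 1)
    if field & (1 << (width - 1)):        # top bit of the field is the sign bit
        field -= 1 << width
    return field & 0xFFFFFFFF
```

Extracting the 4-bit field 0xF at bit 4 therefore yields 0xFFFFFFFF, while the field 0x7 yields 7.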
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not alter any flags.
Architectures
This 32-bit instruction is available in A32 and T32.
There is no 16-bit version of this instruction in T32.
Related references
C1.9 Condition code suffixes on page C1-92
C2.99 SDIV
Signed Divide.
Syntax
SDIV{cond} {Rd}, Rn, Rm
where:
cond
is an optional condition code.
Rd
is the destination register.
Rn
is the register holding the value to be divided.
Rm
is a register holding the divisor.
Register restrictions
PC or SP cannot be used for Rd, Rn, or Rm.
Architectures
This 32-bit T32 instruction is available in Armv7‑R, Armv7‑M, and Armv8‑M Mainline.
This 32-bit A32 instruction is optional in Armv7‑R.
This 32-bit A32 and T32 instruction is available in Armv7‑A if Virtualization Extensions are
implemented, and optional if not.
There is no 16-bit T32 SDIV instruction.
Related references
C1.9 Condition code suffixes on page C1-92
C2.100 SEL
Select bytes from each operand according to the state of the APSR GE flags.
Syntax
SEL{cond} {Rd}, Rn, Rm
where:
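cond
is an optional condition code.
Rd
is the destination register.
Rn
is the register holding the first operand.
Rm
is the register holding the second operand.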
Operation
The SEL instruction selects bytes from Rn or Rm according to the APSR GE flags:
• If GE[0] is set, Rd[7:0] come from Rn[7:0], otherwise from Rm[7:0].
• If GE[1] is set, Rd[15:8] come from Rn[15:8], otherwise from Rm[15:8].
• If GE[2] is set, Rd[23:16] come from Rn[23:16], otherwise from Rm[23:16].
• If GE[3] is set, Rd[31:24] come from Rn[31:24], otherwise from Rm[31:24].
Usage
Use the SEL instruction after one of the signed parallel instructions. You can use this to select maximum
or minimum values in multiple byte or halfword data.
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not change the flags.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Examples
SEL r0, r4, r5
SELLT r4, r0, r4
The following instruction sequence sets each byte in R4 equal to the unsigned minimum of the
corresponding bytes of R1 and R2:
USUB8 r4, r1, r2
SEL r4, r2, r1
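The minimum-selection sequence above can be checked with a small Python model. It is a sketch under the assumption that USUB8 sets a GE bit when the unsigned byte difference does not borrow; the function names are hypothetical.

```python
def usub8(rn, rm):
    """Model USUB8, returning (result, ge) with ge as a 4-bit GE[3:0] mask."""
    result = 0
    ge = 0
    for lane in range(4):
        a = (rn >> (8 * lane)) & 0xFF
        b = (rm >> (8 * lane)) & 0xFF
        diff = a - b
        result |= (diff & 0xFF) << (8 * lane)
        if diff >= 0:                     # assumed GE rule: no borrow in this lane
            ge |= 1 << lane
    return result, ge


def sel(rn, rm, ge):
    """Model SEL: take each byte from Rn if its GE bit is set, otherwise from Rm."""
    result = 0
    for lane in range(4):
        source = rn if ge & (1 << lane) else rm
        result |= source & (0xFF << (8 * lane))
    return result


def byte_min(r1, r2):
    """Model the sequence USUB8 r4, r1, r2 followed by SEL r4, r2, r1."""
    _, ge = usub8(r1, r2)
    return sel(r2, r1, ge)
```

Swapping the two SEL source operands in this model selects the unsigned maximum instead.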
Related concepts
A1.12 Application Program Status Register on page A1-38
Related references
C1.9 Condition code suffixes on page C1-92
C2.101 SETEND
Set the endianness bit in the CPSR, without affecting any other bits in the CPSR.
Note
This instruction is deprecated in Armv8.
Syntax
SETEND specifier
where:
specifier
is one of:
BE
Big-endian.
LE
Little-endian.
Usage
Use SETEND to access data of different endianness, for example, to access several big-endian DMA-
formatted data fields from an otherwise little-endian application.
SETEND cannot be conditional, and is not permitted in an IT block.
Architectures
This instruction is available in A32 and 16-bit T32.
This 16-bit instruction is available in T32, except in the Armv6‑M and Armv7‑M architectures.
There is no 32-bit version of this instruction in T32.
Example
SETEND BE ; Set the CPSR E bit for big-endian accesses
LDR r0, [r2, #header]
LDR r1, [r2, #CRC32]
SETEND le ; Set the CPSR E bit for little-endian accesses
; for the rest of the application
C2.102 SETPAN
Set Privileged Access Never.
Syntax
SETPAN{q} #imm ; A1 general registers (A32)
Where:
q
Is an optional instruction width specifier. See C2.2 Instruction width specifiers on page C2-111.
imm
Is the unsigned immediate 0 or 1.
Architectures supported
Supported in Armv8.1 and later.
Usage
Set Privileged Access Never writes a new value to PSTATE.PAN.
This instruction is available only in privileged mode and it is a NOP when executed in User mode.
Related references
C2.1 A32 and T32 instruction summary on page C2-106
C2.103 SEV
Set Event.
Syntax
SEV{cond}
where:
cond
is an optional condition code.
Operation
This is a hint instruction. It is optional whether it is implemented or not. If it is not implemented, it
executes as a NOP. The assembler produces a diagnostic message if the instruction executes as a NOP on
the target.
SEV causes an event to be signaled to all cores within a multiprocessor system. If SEV is implemented,
WFE must also be implemented.
Availability
This instruction is available in A32 and T32.
Related references
C2.104 SEVL on page C2-262
C2.68 NOP on page C2-213
C1.9 Condition code suffixes on page C1-92
C2.104 SEVL
Set Event Locally.
Note
This instruction is supported only in Armv8.
Syntax
SEVL{cond}
where:
cond
is an optional condition code.
Operation
This is a hint instruction. It is optional whether it is implemented or not. If it is not implemented, it
executes as a NOP. armasm produces a diagnostic message if the instruction executes as a NOP on the
target.
SEVL causes an event to be signaled locally, on the current processor. SEVL is not required to affect other
processors, although it is permitted to do so.
Availability
This instruction is available in A32 and T32.
Related references
C2.103 SEV on page C2-261
C2.68 NOP on page C2-213
C1.9 Condition code suffixes on page C1-92
C2.105 SG
Secure Gateway.
Syntax
SG
Usage
Secure Gateway marks a valid branch target for branches from Non-secure code that wants to call Secure
code.
C2.106 SHADD8
Signed halving parallel byte-wise addition.
Syntax
SHADD8{cond} {Rd}, Rn, Rm
where:
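cond
is an optional condition code.
Rd
is the destination register.
Rn
is the register holding the first operand.
Rm
is the register holding the second operand.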
Operation
This instruction performs four signed integer additions on the corresponding bytes of the operands,
halves the results, and writes the results into the corresponding bytes of the destination. This cannot
cause overflow.
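The byte-wise halving addition can be sketched in Python as follows. This is a minimal model, assuming registers are unsigned 32-bit integers; the names to_signed8 and shadd8 are illustrative only.

```python
def to_signed8(value):
    """Interpret the low 8 bits of value as a two's complement integer."""
    value &= 0xFF
    return value - 0x100 if value & 0x80 else value


def shadd8(rn, rm):
    """Model SHADD8: add corresponding signed bytes, then halve each sum."""
    result = 0
    for lane in range(4):
        shift = 8 * lane
        total = to_signed8(rn >> shift) + to_signed8(rm >> shift)
        # The 9-bit sum is halved before it is narrowed back to 8 bits,
        # which is why the operation cannot overflow.
        result |= ((total >> 1) & 0xFF) << shift
    return result
```

The right shift rounds toward minus infinity, so a lane holding -1 plus 0 yields -1 (0xFF).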
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not affect the N, Z, C, V, Q, or GE flags.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Related references
C1.9 Condition code suffixes on page C1-92
C2.107 SHADD16
Signed halving parallel halfword-wise addition.
Syntax
SHADD16{cond} {Rd}, Rn, Rm
where:
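cond
is an optional condition code.
Rd
is the destination register.
Rn
is the register holding the first operand.
Rm
is the register holding the second operand.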
Operation
This instruction performs two signed integer additions on the corresponding halfwords of the operands,
halves the results, and writes the results into the corresponding halfwords of the destination. This cannot
cause overflow.
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not affect the N, Z, C, V, Q, or GE flags.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Related references
C1.9 Condition code suffixes on page C1-92
C2.108 SHASX
Signed halving parallel add and subtract halfwords with exchange.
Syntax
SHASX{cond} {Rd}, Rn, Rm
where:
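cond
is an optional condition code.
Rd
is the destination register.
Rn
is the register holding the first operand.
Rm
is the register holding the second operand.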
Operation
This instruction exchanges the two halfwords of the second operand, then performs an addition on the
two top halfwords of the operands and a subtraction on the bottom two halfwords. It halves the results
and writes them into the corresponding halfwords of the destination. This cannot cause overflow.
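The exchange, add, and subtract steps can be sketched in Python. This is a minimal model, assuming registers are unsigned 32-bit integers; the names to_signed16 and shasx are illustrative only.

```python
def to_signed16(value):
    """Interpret the low 16 bits of value as a two's complement integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def shasx(rn, rm):
    """Model SHASX: exchange the halfwords of rm, add tops, subtract bottoms, halve."""
    n_lo, n_hi = to_signed16(rn), to_signed16(rn >> 16)
    m_lo, m_hi = to_signed16(rm), to_signed16(rm >> 16)
    # After the exchange, the bottom halfword of rm pairs with the top of rn,
    # and the top halfword of rm pairs with the bottom of rn.
    top = (n_hi + m_lo) >> 1
    bottom = (n_lo - m_hi) >> 1
    return ((top & 0xFFFF) << 16) | (bottom & 0xFFFF)
```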
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not affect the N, Z, C, V, Q, or GE flags.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Related references
C1.9 Condition code suffixes on page C1-92
C2.109 SHSAX
Signed halving parallel subtract and add halfwords with exchange.
Syntax
SHSAX{cond} {Rd}, Rn, Rm
where:
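cond
is an optional condition code.
Rd
is the destination register.
Rn
is the register holding the first operand.
Rm
is the register holding the second operand.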
Operation
This instruction exchanges the two halfwords of the second operand, then performs a subtraction on the
two top halfwords of the operands and an addition on the bottom two halfwords. It halves the results and
writes them into the corresponding halfwords of the destination. This cannot cause overflow.
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not affect the N, Z, C, V, Q, or GE flags.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Related references
C1.9 Condition code suffixes on page C1-92
C2.110 SHSUB8
Signed halving parallel byte-wise subtraction.
Syntax
SHSUB8{cond} {Rd}, Rn, Rm
where:
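cond
is an optional condition code.
Rd
is the destination register.
Rn
is the register holding the first operand.
Rm
is the register holding the second operand.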
Operation
This instruction subtracts each byte of the second operand from the corresponding byte of the first
operand, halves the results, and writes the results into the corresponding bytes of the destination. This
cannot cause overflow.
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not affect the N, Z, C, V, Q, or GE flags.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Related references
C1.9 Condition code suffixes on page C1-92
C2.111 SHSUB16
Signed halving parallel halfword-wise subtraction.
Syntax
SHSUB16{cond} {Rd}, Rn, Rm
where:
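cond
is an optional condition code.
Rd
is the destination register.
Rn
is the register holding the first operand.
Rm
is the register holding the second operand.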
Operation
This instruction subtracts each halfword of the second operand from the corresponding halfword of the
first operand, halves the results, and writes the results into the corresponding halfwords of the
destination. This cannot cause overflow.
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not affect the N, Z, C, V, Q, or GE flags.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Related references
C1.9 Condition code suffixes on page C1-92
C2.112 SMC
Secure Monitor Call.
Syntax
SMC{cond} #imm4
where:
cond
is an optional condition code.
imm4
is a 4-bit immediate value. This is ignored by the Arm processor, but can be used by the SMC
exception handler to determine what service is being requested.
Note
SMC was called SMI in earlier versions of the A32 assembly language. SMI instructions disassemble to
SMC, with a comment to say that this was formerly SMI.
Architectures
This 32-bit instruction is available in A32 and T32, if the Arm architecture has the Security Extensions.
There is no 16-bit version of this instruction in T32.
Related references
C1.9 Condition code suffixes on page C1-92
Related information
Arm Architecture Reference Manual
C2.113 SMLAxy
Signed Multiply Accumulate, with 16-bit operands and a 32-bit result and accumulator.
Syntax
SMLA<x><y>{cond} Rd, Rn, Rm, Ra
where:
<x>
is either B or T. B means use the bottom half (bits [15:0]) of Rn, T means use the top half (bits
[31:16]) of Rn.
<y>
is either B or T. B means use the bottom half (bits [15:0]) of Rm, T means use the top half (bits
[31:16]) of Rm.
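cond
is an optional condition code.
Rd
is the destination register.
Rn, Rm
are the registers holding the values to be multiplied.
Ra
is the register holding the value to be added.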
Operation
SMLAxy multiplies the 16-bit signed integers from the selected halves of Rn and Rm, adds the 32-bit result
to the 32-bit value in Ra, and places the result in Rd.
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not affect the N, Z, C, or V flags.
If overflow occurs in the accumulation, SMLAxy sets the Q flag. To read the state of the Q flag, use an MRS
instruction.
Note
SMLAxy never clears the Q flag. To clear the Q flag, use an MSR instruction.
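The multiply-accumulate and the Q flag condition can be sketched in Python. This is a minimal model, assuming registers are unsigned 32-bit integers; the names to_signed and smlaxy, and the returned sets_q value, are illustrative only.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def smlaxy(rn, rm, ra, x="B", y="B"):
    """Model SMLA<x><y>. Returns (rd, sets_q)."""
    a = to_signed(rn >> 16, 16) if x == "T" else to_signed(rn, 16)
    b = to_signed(rm >> 16, 16) if y == "T" else to_signed(rm, 16)
    total = a * b + to_signed(ra, 32)
    # The 16x16 product always fits in 32 bits; only the accumulation can overflow.
    sets_q = not (-(1 << 31) <= total <= (1 << 31) - 1)
    return total & 0xFFFFFFFF, sets_q
```

Because the instruction only ever sets Q, a caller modeling a sequence of instructions would OR sets_q into a sticky flag rather than overwrite it.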
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Examples
SMLABBNE r0, r2, r1, r10
SMLABT r0, r0, r3, r5
Related references
C2.62 MRS (PSR to general-purpose register) on page C2-204
C2.65 MSR (general-purpose register to PSR) on page C2-208
C1.9 Condition code suffixes on page C1-92
C2.114 SMLAD
Dual 16-bit Signed Multiply with Addition of products and 32-bit accumulation.
Syntax
SMLAD{X}{cond} Rd, Rn, Rm, Ra
where:
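X
is an optional parameter. If X is present, the most and least significant halfwords of the second
operand are exchanged before the multiplications occur.
cond
is an optional condition code.
Rd
is the destination register.
Rn, Rm
are the registers holding the operands.
Ra
is the register holding the accumulate operand.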
Operation
SMLAD multiplies the bottom halfword of Rn with the bottom halfword of Rm, and the top halfword of Rn
with the top halfword of Rm. It then adds both products to the value in Ra and stores the sum to Rd.
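The dual multiply and accumulation can be sketched in Python. This is a minimal model, assuming registers are unsigned 32-bit integers and that the result is truncated to 32 bits; the names to_signed and smlad, and the exchange argument standing in for the X suffix, are illustrative only.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def smlad(rn, rm, ra, exchange=False):
    """Model SMLAD (exchange=False) or SMLADX (exchange=True)."""
    if exchange:
        # Swap the two halfwords of the second operand.
        rm = ((rm >> 16) | (rm << 16)) & 0xFFFFFFFF
    bottom = to_signed(rn, 16) * to_signed(rm, 16)
    top = to_signed(rn >> 16, 16) * to_signed(rm >> 16, 16)
    return (bottom + top + to_signed(ra, 32)) & 0xFFFFFFFF
```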
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not change the flags.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Example
SMLADLT r1, r2, r4, r1
Related references
C1.9 Condition code suffixes on page C1-92
C2.115 SMLAL
Signed Long Multiply, with optional Accumulate, with 32-bit operands, and 64-bit result and
accumulator.
Syntax
SMLAL{S}{cond} RdLo, RdHi, Rn, Rm
where:
S
is an optional suffix available in A32 state only. If S is specified, the condition flags are updated
on the result of the operation.
cond
is an optional condition code.
RdLo, RdHi
are the destination registers. They also hold the accumulating value. RdLo and RdHi must be
different registers.
Rn, Rm
are the general-purpose registers holding the operands.
Operation
The SMLAL instruction interprets the values from Rn and Rm as two’s complement signed integers. It
multiplies these integers, and adds the 64-bit result to the 64-bit signed integer contained in RdHi and
RdLo.
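The 64-bit accumulation across the register pair can be sketched in Python. This is a minimal model, assuming registers are unsigned 32-bit integers; the names to_signed and smlal are illustrative only.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def smlal(rdlo, rdhi, rn, rm):
    """Model SMLAL. Returns the updated (rdlo, rdhi) pair."""
    accumulator = to_signed((rdhi << 32) | rdlo, 64)
    total = (to_signed(rn, 32) * to_signed(rm, 32) + accumulator) & ((1 << 64) - 1)
    return total & 0xFFFFFFFF, total >> 32
```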
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
If S is specified, this instruction:
• Updates the N and Z flags according to the result.
• Does not affect the C or V flags.
Architectures
This instruction is available in A32 and T32.
There is no 16-bit version of this instruction in T32.
Related references
C1.9 Condition code suffixes on page C1-92
C2.116 SMLALD
Dual 16-bit Signed Multiply with Addition of products and 64-bit Accumulation.
Syntax
SMLALD{X}{cond} RdLo, RdHi, Rn, Rm
where:
X
is an optional parameter. If X is present, the most and least significant halfwords of the second
operand are exchanged before the multiplications occur.
cond
is an optional condition code.
RdLo, RdHi
are the destination registers for the 64-bit result. They also hold the 64-bit accumulate operand.
RdHi and RdLo must be different registers.
Rn, Rm
are the general-purpose registers holding the operands.
Operation
SMLALD multiplies the bottom halfword of Rn with the bottom halfword of Rm, and the top halfword of Rn
with the top halfword of Rm. It then adds both products to the value in RdLo, RdHi and stores the sum to
RdLo, RdHi.
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not change the flags.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Example
SMLALD r10, r11, r5, r1
Related references
C1.9 Condition code suffixes on page C1-92
C2.117 SMLALxy
Signed Multiply-Accumulate with 16-bit operands and a 64-bit accumulator.
Syntax
SMLAL<x><y>{cond} RdLo, RdHi, Rn, Rm
where:
<x>
is either B or T. B means use the bottom half (bits [15:0]) of Rn, T means use the top half (bits
[31:16]) of Rn.
<y>
is either B or T. B means use the bottom half (bits [15:0]) of Rm, T means use the top half (bits
[31:16]) of Rm.
cond
is an optional condition code.
RdLo, RdHi
are the destination registers. They also hold the accumulate value. RdHi and RdLo must be
different registers.
Rn, Rm
are the registers holding the values to be multiplied.
Operation
SMLALxy multiplies the signed integer from the selected half of Rm by the signed integer from the selected
half of Rn, and adds the 32-bit result to the 64-bit value in RdHi and RdLo.
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not change the flags.
Note
SMLALxy cannot raise an exception. If overflow occurs on this instruction, the result wraps round without
any warning.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Examples
SMLALTB r2, r3, r7, r1
SMLALBTVS r0, r1, r9, r2
Related references
C1.9 Condition code suffixes on page C1-92
C2.118 SMLAWy
Signed Multiply-Accumulate Wide, with one 32-bit and one 16-bit operand, and a 32-bit accumulate
value, providing the top 32 bits of the result.
Syntax
SMLAW<y>{cond} Rd, Rn, Rm, Ra
where:
<y>
is either B or T. B means use the bottom half (bits [15:0]) of Rm, T means use the top half (bits
[31:16]) of Rm.
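cond
is an optional condition code.
Rd
is the destination register.
Rn, Rm
are the registers holding the values to be multiplied.
Ra
is the register holding the value to be added.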
Operation
SMLAWy multiplies the signed 16-bit integer from the selected half of Rm by the signed 32-bit integer from
Rn, adds the top 32 bits of the 48-bit result to the 32-bit value in Ra, and places the result in Rd.
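The 32x16 multiply and the use of the top 32 bits of the 48-bit product can be sketched in Python. This is a minimal model, assuming registers are unsigned 32-bit integers; the names to_signed and smlawy, and the returned sets_q value, are illustrative only.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def smlawy(rn, rm, ra, y="B"):
    """Model SMLAW<y>. Returns (rd, sets_q)."""
    half = to_signed(rm >> 16, 16) if y == "T" else to_signed(rm, 16)
    product = to_signed(rn, 32) * half        # fits in 48 bits
    total = (product >> 16) + to_signed(ra, 32)
    sets_q = not (-(1 << 31) <= total <= (1 << 31) - 1)
    return total & 0xFFFFFFFF, sets_q
```

Discarding the low 16 bits of the product makes this form convenient for fixed-point code in which one operand is a 16-bit fractional coefficient.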
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not affect the N, Z, C, or V flags.
If overflow occurs in the accumulation, SMLAWy sets the Q flag.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Related references
C2.62 MRS (PSR to general-purpose register) on page C2-204
C1.9 Condition code suffixes on page C1-92
C2.119 SMLSD
Dual 16-bit Signed Multiply with Subtraction of products and 32-bit accumulation.
Syntax
SMLSD{X}{cond} Rd, Rn, Rm, Ra
where:
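X
is an optional parameter. If X is present, the most and least significant halfwords of the second
operand are exchanged before the multiplications occur.
cond
is an optional condition code.
Rd
is the destination register.
Rn, Rm
are the registers holding the operands.
Ra
is the register holding the accumulate operand.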
Operation
SMLSD multiplies the bottom halfword of Rn with the bottom halfword of Rm, and the top halfword of Rn
with the top halfword of Rm. It then subtracts the second product from the first, adds the difference to the
value in Ra, and stores the result to Rd.
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not change the flags.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Examples
SMLSD r1, r2, r0, r7
SMLSDX r11, r10, r2, r3
Related references
C1.9 Condition code suffixes on page C1-92
C2.120 SMLSLD
Dual 16-bit Signed Multiply with Subtraction of products and 64-bit accumulation.
Syntax
SMLSLD{X}{cond} RdLo, RdHi, Rn, Rm
where:
X
is an optional parameter. If X is present, the most and least significant halfwords of the second
operand are exchanged before the multiplications occur.
cond
is an optional condition code.
RdLo, RdHi
are the destination registers for the 64-bit result. They also hold the 64-bit accumulate operand.
RdHi and RdLo must be different registers.
Rn, Rm
are the general-purpose registers holding the operands.
Operation
SMLSLD multiplies the bottom halfword of Rn with the bottom halfword of Rm, and the top halfword of Rn
with the top halfword of Rm. It then subtracts the second product from the first, adds the difference to the
value in RdLo, RdHi, and stores the result to RdLo, RdHi.
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not change the flags.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Example
SMLSLD r3, r0, r5, r1
Related references
C1.9 Condition code suffixes on page C1-92
C2.121 SMMLA
Signed Most significant word Multiply with Accumulation.
Syntax
SMMLA{R}{cond} Rd, Rn, Rm, Ra
where:
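R
is an optional parameter. If R is present, the result is rounded instead of being truncated.
cond
is an optional condition code.
Rd
is the destination register.
Rn, Rm
are the registers holding the operands.
Ra
is the register holding the value to be added.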
Operation
SMMLA multiplies the values from Rn and Rm, adds the value in Ra to the most significant 32 bits of the
product, and stores the result in Rd.
If the optional R parameter is specified, 0x80000000 is added before extracting the most significant 32
bits. This has the effect of rounding the result.
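The effect of the R parameter can be sketched in Python. This is a minimal model, assuming registers are unsigned 32-bit integers; the names to_signed32 and smmla, and the round_result argument standing in for the R suffix, are illustrative only.

```python
def to_signed32(value):
    """Interpret the low 32 bits of value as a two's complement integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def smmla(rn, rm, ra, round_result=False):
    """Model SMMLA (round_result=False) or SMMLAR (round_result=True)."""
    # Placing ra in the upper word is equivalent to adding it to the
    # most significant 32 bits of the product.
    total = (to_signed32(ra) << 32) + to_signed32(rn) * to_signed32(rm)
    if round_result:
        total += 0x80000000
    return (total >> 32) & 0xFFFFFFFF
```

In the second check below, the product 0xC0000000 truncates to 0 in the upper word but rounds up to 1.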
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not change the flags.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Related references
C1.9 Condition code suffixes on page C1-92
C2.122 SMMLS
Signed Most significant word Multiply with Subtraction.
Syntax
SMMLS{R}{cond} Rd, Rn, Rm, Ra
where:
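R
is an optional parameter. If R is present, the result is rounded instead of being truncated.
cond
is an optional condition code.
Rd
is the destination register.
Rn, Rm
are the registers holding the operands.
Ra
is the register holding the value to be subtracted from.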
Operation
SMMLS multiplies the values from Rn and Rm, subtracts the product from the value in Ra shifted left by 32
bits, and stores the most significant 32 bits of the result in Rd.
If the optional R parameter is specified, 0x80000000 is added before extracting the most significant 32
bits. This has the effect of rounding the result.
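The subtraction from the shifted accumulate value can be sketched in Python. This is a minimal model, assuming registers are unsigned 32-bit integers; the names to_signed32 and smmls, and the round_result argument standing in for the R suffix, are illustrative only.

```python
def to_signed32(value):
    """Interpret the low 32 bits of value as a two's complement integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def smmls(rn, rm, ra, round_result=False):
    """Model SMMLS (round_result=False) or SMMLSR (round_result=True)."""
    total = (to_signed32(ra) << 32) - to_signed32(rn) * to_signed32(rm)
    if round_result:
        total += 0x80000000
    return (total >> 32) & 0xFFFFFFFF
```

Because the whole 64-bit product is subtracted before the upper word is extracted, even a product of 1 borrows from the upper word unless rounding is requested.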
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not change the flags.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Related references
C1.9 Condition code suffixes on page C1-92
C2.123 SMMUL
Signed Most significant word Multiply.
Syntax
SMMUL{R}{cond} {Rd}, Rn, Rm
where:
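R
is an optional parameter. If R is present, the result is rounded instead of being truncated.
cond
is an optional condition code.
Rd
is the destination register.
Rn, Rm
are the registers holding the operands.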
Operation
SMMUL multiplies the 32-bit values from Rn and Rm, and stores the most significant 32 bits of the 64-bit
result to Rd.
If the optional R parameter is specified, 0x80000000 is added before extracting the most significant 32
bits. This has the effect of rounding the result.
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not change the flags.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Examples
SMMULGE r6, r4, r3
SMMULR r2, r2, r2
Related references
C1.9 Condition code suffixes on page C1-92
C2.124 SMUAD
Dual 16-bit Signed Multiply with Addition of products, and optional exchange of operand halves.
Syntax
SMUAD{X}{cond} {Rd}, Rn, Rm
where:
X
is an optional parameter. If X is present, the most and least significant halfwords of the second
operand are exchanged before the multiplications occur.
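cond
is an optional condition code.
Rd
is the destination register.
Rn, Rm
are the registers holding the operands.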
Operation
SMUAD multiplies the bottom halfword of Rn with the bottom halfword of Rm, and the top halfword of Rn
with the top halfword of Rm. It then adds the products and stores the sum to Rd.
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Q flag
The SMUAD instruction sets the Q flag if the addition overflows.
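The one input pattern that overflows the addition can be seen in a small Python model. This is a minimal sketch, assuming registers are unsigned 32-bit integers; the names to_signed16 and smuad, and the returned sets_q value, are illustrative only.

```python
def to_signed16(value):
    """Interpret the low 16 bits of value as a two's complement integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def smuad(rn, rm, exchange=False):
    """Model SMUAD (exchange=False) or SMUADX (exchange=True). Returns (rd, sets_q)."""
    if exchange:
        rm = ((rm >> 16) | (rm << 16)) & 0xFFFFFFFF
    bottom = to_signed16(rn) * to_signed16(rm)
    top = to_signed16(rn >> 16) * to_signed16(rm >> 16)
    total = bottom + top
    sets_q = total > (1 << 31) - 1   # the sum of two products cannot go below -2**31
    return total & 0xFFFFFFFF, sets_q
```

The sum only overflows when all four halfwords are 0x8000, because each product is then 2^30 and the sum is 2^31.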
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Examples
SMUAD r2, r3, r2
Related references
C1.9 Condition code suffixes on page C1-92
C2.125 SMULxy
Signed Multiply, with 16-bit operands and a 32-bit result.
Syntax
SMUL<x><y>{cond} {Rd}, Rn, Rm
where:
<x>
is either B or T. B means use the bottom half (bits [15:0]) of Rn, T means use the top half (bits
[31:16]) of Rn.
<y>
is either B or T. B means use the bottom half (bits [15:0]) of Rm, T means use the top half (bits
[31:16]) of Rm.
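cond
is an optional condition code.
Rd
is the destination register.
Rn, Rm
are the registers holding the values to be multiplied.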
Operation
SMULxy multiplies the 16-bit signed integers from the selected halves of Rn and Rm, and places the 32-bit
result in Rd.
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
These instructions do not affect the N, Z, C, or V flags.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Examples
SMULTBEQ r8, r7, r9
Related references
C2.62 MRS (PSR to general-purpose register) on page C2-204
C2.65 MSR (general-purpose register to PSR) on page C2-208
C1.9 Condition code suffixes on page C1-92
C2.126 SMULL
Signed Long Multiply, with 32-bit operands and 64-bit result.
Syntax
SMULL{S}{cond} RdLo, RdHi, Rn, Rm
where:
S
is an optional suffix available in A32 state only. If S is specified, the condition flags are updated
on the result of the operation.
cond
is an optional condition code.
RdLo, RdHi
are the destination registers. RdLo and RdHi must be different registers.
Rn, Rm
are the general-purpose registers holding the operands.
Operation
The SMULL instruction interprets the values from Rn and Rm as two’s complement signed integers. It
multiplies these integers and places the least significant 32 bits of the result in RdLo, and the most
significant 32 bits of the result in RdHi.
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
If S is specified, this instruction:
• Updates the N and Z flags according to the result.
• Does not affect the C or V flags.
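The split of the 64-bit product across RdLo and RdHi, and the N and Z values that the S suffix would produce, can be sketched in Python. This is a minimal model, assuming registers are unsigned 32-bit integers; the names to_signed32 and smull, and the returned n and z values, are illustrative only.

```python
def to_signed32(value):
    """Interpret the low 32 bits of value as a two's complement integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def smull(rn, rm):
    """Model SMULLS. Returns (rdlo, rdhi, n, z)."""
    product = (to_signed32(rn) * to_signed32(rm)) & ((1 << 64) - 1)
    n = bool(product >> 63)      # N reflects bit 63 of the 64-bit result
    z = product == 0             # Z is set only if all 64 bits are zero
    return product & 0xFFFFFFFF, product >> 32, n, z
```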
Architectures
This instruction is available in A32 and T32.
There is no 16-bit version of this instruction in T32.
Related references
C1.9 Condition code suffixes on page C1-92
C2.127 SMULWy
Signed Multiply Wide, with one 32-bit and one 16-bit operand, providing the top 32 bits of the result.
Syntax
SMULW<y>{cond} {Rd}, Rn, Rm
where:
<y>
is either B or T. B means use the bottom half (bits [15:0]) of Rm, T means use the top half (bits
[31:16]) of Rm.
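cond
is an optional condition code.
Rd
is the destination register.
Rn, Rm
are the registers holding the values to be multiplied.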
Operation
SMULWy multiplies the signed 16-bit integer from the selected half of Rm by the signed 32-bit integer from
Rn, and places the upper 32 bits of the 48-bit result in Rd.
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not affect the N, Z, C, or V flags.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Related references
C2.62 MRS (PSR to general-purpose register) on page C2-204
C1.9 Condition code suffixes on page C1-92
C2.128 SMUSD
Dual 16-bit Signed Multiply with Subtraction of products, and optional exchange of operand halves.
Syntax
SMUSD{X}{cond} {Rd}, Rn, Rm
where:
X
is an optional parameter. If X is present, the most and least significant halfwords of the second
operand are exchanged before the multiplications occur.
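cond
is an optional condition code.
Rd
is the destination register.
Rn, Rm
are the registers holding the operands.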
Operation
SMUSD multiplies the bottom halfword of Rn with the bottom halfword of Rm, and the top halfword of Rn
with the top halfword of Rm. It then subtracts the second product from the first, and stores the difference
to Rd.
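The order of the subtraction (bottom product minus top product) can be sketched in Python. This is a minimal model, assuming registers are unsigned 32-bit integers; the names to_signed16 and smusd, and the exchange argument standing in for the X suffix, are illustrative only.

```python
def to_signed16(value):
    """Interpret the low 16 bits of value as a two's complement integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def smusd(rn, rm, exchange=False):
    """Model SMUSD (exchange=False) or SMUSDX (exchange=True)."""
    if exchange:
        rm = ((rm >> 16) | (rm << 16)) & 0xFFFFFFFF
    bottom = to_signed16(rn) * to_signed16(rm)               # first product
    top = to_signed16(rn >> 16) * to_signed16(rm >> 16)      # second product
    return (bottom - top) & 0xFFFFFFFF
```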
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Example
SMUSDXNE r0, r1, r2
Related references
C1.9 Condition code suffixes on page C1-92
C2.129 SRS
Store Return State onto a stack.
Syntax
SRS{addr_mode}{cond} sp{!}, #modenum
where:
addr_mode
is an optional addressing mode suffix, for example FD or DB.
cond
is an optional condition code.
!
is an optional suffix. If ! is present, the final address is written back into the SP of the mode
specified by modenum.
modenum
specifies the number of the mode whose banked SP is used as the base register. You must use
only the defined mode numbers.
Operation
SRS stores the LR and the SPSR of the current mode at the address contained in the SP of the mode
specified by modenum, and at the following word, respectively. It optionally updates the SP of the mode
specified by modenum. This is compatible with the normal use of the STM instruction for stack accesses.
Note
For a full descending stack, you must use SRSFD or SRSDB.
Usage
You can use SRS to store return state for an exception handler on a different stack from the one
automatically selected.
Notes
Where addresses are not word-aligned, SRS ignores the least significant two bits of the specified address.
The time order of the accesses to individual words of memory generated by SRS is not architecturally
defined. Do not use this instruction on memory-mapped I/O locations where access order matters.
Do not use SRS in User and System modes because these modes do not have an SPSR.
SRS is not permitted in a non-secure state if modenum specifies monitor mode.
Availability
This 32-bit instruction is available in A32 and T32.
The 32-bit T32 instruction is not available in the Armv7‑M architecture.
There is no 16-bit version of this instruction in T32.
Example
R13_usr EQU 16
SRSFD sp,#R13_usr
Related concepts
A1.3 Processor modes, and privileged and unprivileged software execution on page A1-28
Related references
C2.45 LDM on page C2-177
C1.9 Condition code suffixes on page C1-92
C2.130 SSAT
Signed Saturate to any bit position, with optional shift before saturating.
Syntax
SSAT{cond} Rd, #sat, Rm{, shift}
where:
cond
is an optional condition code.
Operation
The SSAT instruction applies the specified shift, then saturates a signed value to the signed range
-2^(sat-1) ≤ x ≤ 2^(sat-1) - 1.
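As a rough illustration of the shift-then-saturate behavior, the following Python sketch models SSAT with an LSL shift. The function name ssat, the shift_left parameter, and the returned saturated flag (standing in for the condition that sets Q) are assumptions made for this example. The 1 to 32 range checked for sat is taken from the architecture, not from this section.

```python
def ssat(rm, sat, shift_left=0):
    """Model SSAT Rd, #sat, Rm, LSL #shift_left.

    Returns (result, saturated), where saturated indicates that the
    instruction would set the Q flag.
    """
    assert 1 <= sat <= 32
    # Apply the shift first; LSL discards bits shifted out of 32 bits.
    value = (rm << shift_left) & 0xFFFFFFFF
    if value & 0x80000000:
        value -= 1 << 32          # reinterpret as a signed 32-bit value
    lo = -(1 << (sat - 1))
    hi = (1 << (sat - 1)) - 1
    clamped = max(lo, min(hi, value))
    return clamped & 0xFFFFFFFF, clamped != value
```

With sat = 16 and LSL #4, an input of 0x00001000 becomes 0x10000 after the shift, which exceeds 2^15 - 1, so the model returns 0x7FFF and reports saturation.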
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Q flag
If saturation occurs, this instruction sets the Q flag. To read the state of the Q flag, use an MRS instruction.
Architectures
This instruction is available in A32 and T32.
There is no 16-bit version of this instruction in T32.
Example
SSAT r7, #16, r7, LSL #4
Related references
C2.131 SSAT16 on page C2-292
C2.62 MRS (PSR to general-purpose register) on page C2-204
C1.9 Condition code suffixes on page C1-92
C2.131 SSAT16
Parallel halfword Saturate.
Syntax
SSAT16{cond} Rd, #sat, Rn
where:
cond
is an optional condition code.
Operation
Halfword-wise signed saturation to any bit position.
The SSAT16 instruction saturates each signed halfword to the signed range -2^(sat-1) ≤ x ≤ 2^(sat-1) - 1.
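The halfword-wise saturation can be sketched in Python as follows. The function name ssat16 and the returned saturated flag (standing in for the condition that sets Q) are assumptions made for this example. The 1 to 16 range checked for sat is taken from the architecture, not from this section.

```python
def ssat16(rn, sat):
    """Model SSAT16 Rd, #sat, Rn. Returns (result, saturated)."""
    assert 1 <= sat <= 16
    lo = -(1 << (sat - 1))
    hi = (1 << (sat - 1)) - 1
    result = 0
    saturated = False
    for shift in (0, 16):                 # bottom halfword, then top
        half = (rn >> shift) & 0xFFFF
        if half & 0x8000:
            half -= 0x10000               # reinterpret as signed 16-bit
        clamped = max(lo, min(hi, half))
        saturated = saturated or clamped != half
        result |= (clamped & 0xFFFF) << shift
    return result, saturated
```

With sat = 12, the halfwords 0x7FFF and 0x8000 saturate to 0x07FF and 0xF800 respectively.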
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Q flag
If saturation occurs on either halfword, this instruction sets the Q flag. To read the state of the Q flag, use
an MRS instruction.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Correct example
SSAT16 r7, #12, r7
Incorrect example
SSAT16 r1, #16, r2, LSL #4 ; shifts not permitted with halfword
; saturations
Related references
C2.62 MRS (PSR to general-purpose register) on page C2-204
C1.9 Condition code suffixes on page C1-92
C2.132 SSAX
Signed parallel subtract and add halfwords with exchange.
Syntax
SSAX{cond} {Rd}, Rn, Rm
where:
cond
is an optional condition code.
Operation
This instruction exchanges the two halfwords of the second operand, then performs a subtraction on the
two top halfwords of the operands and an addition on the bottom two halfwords. It writes the results into
the corresponding halfwords of the destination. The results are modulo 2^16. It sets the APSR GE flags.
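A minimal Python sketch of the exchange, subtract, and add steps is shown here. The function name ssax is an assumption made for this example, registers are modeled as unsigned 32-bit integers, and the GE flags are not modeled.

```python
def ssax(rn, rm):
    """Model the value SSAX writes to Rd (GE flags not modeled)."""
    # Exchange the two halfwords of the second operand.
    rm = ((rm >> 16) | (rm << 16)) & 0xFFFFFFFF
    # Subtract the top halfwords, add the bottom halfwords, modulo 2^16.
    top = ((rn >> 16) - (rm >> 16)) & 0xFFFF
    bottom = ((rn & 0xFFFF) + (rm & 0xFFFF)) & 0xFFFF
    return (top << 16) | bottom
```

Because the results are taken modulo 2^16, the same bit pattern is produced whether the halfwords are treated as signed or unsigned.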
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
GE flags
This instruction does not affect the N, Z, C, V, or Q flags.
It sets the GE flags in the APSR as follows:
GE[1:0]
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Related references
C2.100 SEL on page C2-257
C1.9 Condition code suffixes on page C1-92
C2.133 SSUB8
Signed parallel byte-wise subtraction.
Syntax
SSUB8{cond} {Rd}, Rn, Rm
where:
cond
is an optional condition code.
Operation
This instruction subtracts each byte of the second operand from the corresponding byte of the first
operand and writes the results into the corresponding bytes of the destination. The results are modulo 2^8.
It sets the APSR GE flags.
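The byte-wise subtraction can be sketched in Python as follows. The function name ssub8 is an assumption made for this example, registers are modeled as unsigned 32-bit integers, and the GE flags are not modeled.

```python
def ssub8(rn, rm):
    """Model the value SSUB8 writes to Rd (GE flags not modeled)."""
    result = 0
    for shift in (0, 8, 16, 24):
        # Subtract each byte of Rm from the corresponding byte of Rn.
        diff = ((rn >> shift) & 0xFF) - ((rm >> shift) & 0xFF)
        result |= (diff & 0xFF) << shift   # keep the result modulo 2^8
    return result
```

For example, a byte lane holding 0x02 minus 0x03 wraps to 0xFF, without borrowing from the neighboring lane.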
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
GE flags
This instruction does not affect the N, Z, C, V, or Q flags.
It sets the GE flags in the APSR as follows:
GE[0]
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
C2.134 SSUB16
Signed parallel halfword-wise subtraction.
Syntax
SSUB16{cond} {Rd}, Rn, Rm
where:
cond
is an optional condition code.
Operation
This instruction subtracts each halfword of the second operand from the corresponding halfword of the
first operand and writes the results into the corresponding halfwords of the destination. The results are
modulo 2^16. It sets the APSR GE flags.
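The halfword-wise subtraction can be sketched in Python as follows. The function name ssub16 is an assumption made for this example, registers are modeled as unsigned 32-bit integers, and the GE flags are not modeled.

```python
def ssub16(rn, rm):
    """Model the value SSUB16 writes to Rd (GE flags not modeled)."""
    result = 0
    for shift in (0, 16):
        # Subtract each halfword of Rm from the corresponding halfword of Rn.
        diff = ((rn >> shift) & 0xFFFF) - ((rm >> shift) & 0xFFFF)
        result |= (diff & 0xFFFF) << shift   # keep the result modulo 2^16
    return result
```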
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
GE flags
This instruction does not affect the N, Z, C, V, or Q flags.
It sets the GE flags in the APSR as follows:
GE[1:0]
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Related references
C2.100 SEL on page C2-257
C1.9 Condition code suffixes on page C1-92
C2.135 STC and STC2
Note
STC2 is not supported in Armv8.
Syntax
op{L}{cond} coproc, CRd, [Rn]
where:
op
Rn
is the register on which the memory address is based. If PC is specified, the value used is the
address of the current instruction plus eight.
-
is an optional minus sign. If - is present, the offset is subtracted from Rn. Otherwise, the offset is
added to Rn.
offset
!
is an optional suffix. If ! is present, the address including the offset is written back into Rn.
option
Usage
The use of these instructions depends on the coprocessor. See the coprocessor documentation for details.
Architectures
These 32-bit instructions are available in A32 and T32.
There are no 16-bit versions of these instructions in T32.
Register restrictions
You cannot use PC for Rn in the pre-index and post-index instructions. These are the forms that write
back to Rn.
You cannot use PC for Rn in T32 STC and STC2 instructions.
A32 STC and STC2 instructions where Rn is PC, are deprecated.
Related references
C1.9 Condition code suffixes on page C1-92
C2.136 STL
Store-Release Register.
Note
This instruction is supported only in Armv8.
Syntax
STL{cond} Rt, [Rn]
where:
cond
is an optional condition code.
Rt
is the register to store.
Rn
is the register on which the memory address is based.
Operation
STL stores data to memory. If any loads or stores appear before a store-release in program order, then all
observers are guaranteed to observe the loads and stores before observing the store-release. Loads and
stores appearing after a store-release are unaffected.
If a store-release follows a load-acquire, each observer is guaranteed to observe them in program order.
There is no requirement that a store-release be paired with a load-acquire.
All store-release operations are multi-copy atomic, meaning that in a multiprocessing system, if one
observer observes a write to memory because of a store-release operation, then all observers observe it.
Also, all observers observe all such writes to the same location in the same order.
Restrictions
The address specified must be naturally aligned, or an alignment fault is generated.
The PC must not be used for Rt or Rn.
Availability
This 32-bit instruction is available in A32 and T32.
There is no 16-bit version of this instruction.
Related references
C2.43 LDAEX on page C2-173
C2.42 LDA on page C2-172
C2.137 STLEX on page C2-302
C1.9 Condition code suffixes on page C1-92
C2.137 STLEX
Store-Release Register Exclusive.
Note
This instruction is supported only in Armv8.
Syntax
STLEX{cond} Rd, Rt, [Rn]
where:
cond
is an optional condition code.
Rd
is the destination register for the returned status.
Rt
is the register to load or store.
Rt2
is the second register for doubleword loads or stores.
Rn
is the register on which the memory address is based.
Operation
STLEX performs a conditional store to memory. The conditions are as follows:
• If the physical address does not have the Shared TLB attribute, and the executing processor has an
outstanding tagged physical address, the store takes place, the tag is cleared, and the value 0 is
returned in Rd.
• If the physical address does not have the Shared TLB attribute, and the executing processor does not
have an outstanding tagged physical address, the store does not take place, and the value 1 is returned
in Rd.
• If the physical address has the Shared TLB attribute, and the physical address is tagged as exclusive
access for the executing processor, the store takes place, the tag is cleared, and the value 0 is returned
in Rd.
• If the physical address has the Shared TLB attribute, and the physical address is not tagged as
exclusive access for the executing processor, the store does not take place, and the value 1 is returned
in Rd.
If any loads or stores appear before STLEX in program order, then all observers are guaranteed to observe
the loads and stores before observing the store-release. Loads and stores appearing after STLEX are
unaffected.
All store-release operations are multi-copy atomic.
Restrictions
The PC must not be used for any of Rd, Rt, Rt2, or Rn.
For STLEX, Rd must not be the same register as Rt, Rt2, or Rn.
Usage
Use LDAEX and STLEX to implement interprocess communication in multiple-processor and shared-
memory systems.
For reasons of performance, keep the number of instructions between corresponding LDAEX and STLEX
instructions to a minimum.
Note
The address used in a STLEX instruction must be the same as the address in the most recently executed
LDAEX instruction.
Availability
These 32-bit instructions are available in A32 and T32.
There are no 16-bit versions of these instructions.
Related references
C2.43 LDAEX on page C2-173
C2.136 STL on page C2-301
C2.42 LDA on page C2-172
C1.9 Condition code suffixes on page C1-92
C2.138 STM
Store Multiple registers.
Syntax
STM{addr_mode}{cond} Rn{!}, reglist{^}
where:
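addr_mode
is any one of the following:
IA
Increment address After each transfer. This is the default, and can be omitted.
IB
Increment address Before each transfer (A32 only).
DA
Decrement address After each transfer (A32 only).
DB
Decrement address Before each transfer.
cond
is an optional condition code.
Rn
is the base register, the general-purpose register holding the initial address for the transfer. Rn
must not be PC.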
!
is an optional suffix. If ! is present, the final address is written back into Rn.
reglist
is a list of one or more registers to be stored, enclosed in braces. It can contain register ranges. It
must be comma-separated if it contains more than one register or register range. Any
combination of registers R0 to R15 (PC) can be transferred in A32 state, but there are some
restrictions in T32 state.
^
is an optional suffix, available in A32 state only. You must not use it in User mode or System
mode. Data is transferred into or out of the User mode registers instead of the current mode
registers.
16-bit instruction
A 16-bit version of this instruction is available in T32 code.
The following restrictions apply to the 16-bit instruction:
• All registers in reglist must be Lo registers.
• Rn must be a Lo register.
• addr_mode must be omitted (or IA), meaning increment address after each transfer.
• Writeback must be specified for STM instructions.
Note
16-bit T32 STM instructions with writeback that specify Rn as the lowest register in the reglist are
deprecated.
In addition, the PUSH and POP instructions are subsets of the STM and LDM instructions and can therefore
be expressed using the STM and LDM instructions. Some forms of PUSH and POP are also 16-bit
instructions.
Correct example
STMDB r1!,{r3-r6,r11,r12}
Incorrect example
STM r5!,{r5,r4,r9} ; value stored for R5 unknown
Related references
C2.73 POP on page C2-221
C1.9 Condition code suffixes on page C1-92
C2.139 STR (immediate offset)
Syntax
STR{type}{cond} Rt, [Rn {, #offset}] ; immediate offset
where:
type
can be any one of:
B
Byte
H
Halfword
-
omitted, for Word.
Table C2-15 Offsets and architectures, STR, word, halfword, and byte
Instruction                                   Immediate offset   Pre-indexed        Post-indexed
T32 32-bit encoding, word, halfword, or byte  -255 to 4095       -255 to 255        -255 to 255
T32 32-bit encoding, doubleword               -1020 to 1020 z    -1020 to 1020 z    -1020 to 1020 z
Register restrictions
Rn must be different from Rt in the pre-index and post-index forms.
For T32 instructions, you must not specify SP or PC for either Rt or Rt2.
For A32 instructions:
• Rt must be an even-numbered register.
• Rt must not be LR.
• Arm strongly recommends that you do not use R12 for Rt.
• Rt2 must be R(t + 1).
Use of PC
In A32 instructions you can use PC for Rt in STR word instructions and PC for Rn in STR instructions
with immediate offset syntax (that is the forms that do not writeback to the Rn). However, this is
deprecated.
Other uses of PC are not permitted in these A32 instructions.
In T32 code, using PC in STR instructions is not permitted.
Use of SP
You can use SP for Rn.
In A32 code, you can use SP for Rt in word instructions. You can use SP for Rt in non-word instructions
in A32 code but this is deprecated.
In T32 code, you can use SP for Rt in word instructions only. All other use of SP for Rt in this
instruction is not permitted in T32 code.
Example
STR r2,[r9,#consta-struc] ; consta-struc is an expression
; evaluating to a constant in
; the range 0-4095.
Related references
C1.9 Condition code suffixes on page C1-92
z Must be divisible by 4.
aa Rt and Rn must be in the range R0-R7.
ab Rt must be in the range R0-R7.
ac Must be divisible by 2.
C2.140 STR (register offset)
Syntax
STR{type}{cond} Rt, [Rn, ±Rm {, shift}] ; register offset
STRD{cond} Rt, Rt2, [Rn, ±Rm] ; register offset, doubleword ; A32 only
where:
type
ad Where ±Rm is shown, you can use –Rm, +Rm, or Rm. Where +Rm is shown, you cannot use –Rm.
ae Rt, Rn, and Rm must all be in the range R0-R7.
Register restrictions
In the pre-index and post-index forms, Rn must be different from Rt.
Use of PC
In A32 instructions you can use PC for Rt in STR word instructions, and you can use PC for Rn in STR
instructions with register offset syntax (that is, the forms that do not writeback to the Rn). However, this
is deprecated.
Other uses of PC are not permitted in A32 instructions.
Use of PC in STR T32 instructions is not permitted.
Use of SP
You can use SP for Rn.
In A32 code, you can use SP for Rt in word instructions. You can use SP for Rt in non-word A32
instructions but this is deprecated.
You can use SP for Rm in A32 instructions but this is deprecated.
In T32 code, you can use SP for Rt in word instructions only. All other use of SP for Rt in this
instruction is not permitted in T32 code.
Use of SP for Rm is not permitted in T32 state.
Related references
C1.9 Condition code suffixes on page C1-92
C2.141 STR, unprivileged
Syntax
STR{type}T{cond} Rt, [Rn {, #offset}] ; immediate offset (T32, 32-bit encoding only)
where:
type
can be any one of:
B
Byte
H
Halfword
-
omitted, for Word.
shift
is an optional shift.
Operation
When these instructions are executed by privileged software, they access memory with the same
restrictions as they would have if they were executed by unprivileged software.
When executed by unprivileged software, these instructions behave in exactly the same way as the
corresponding store instruction, for example STRBT behaves in the same way as STRB.
Instruction                                   Immediate offset   Post-indexed    +/–Rm           shift
A32, word or byte                             Not available      -4095 to 4095   +/–Rm           LSL #0-31, LSR #1-32, ASR #1-32, ROR #1-31, RRX
T32 32-bit encoding, word, halfword, or byte  0 to 255           Not available   Not available
Related references
C1.9 Condition code suffixes on page C1-92
C2.142 STREX
Store Register Exclusive.
Syntax
STREX{cond} Rd, Rt, [Rn {, #offset}]
where:
cond
is an optional condition code.
offset
is an optional offset applied to the value in Rn. offset is permitted only in T32 instructions. If
offset is omitted, an offset of 0 is assumed.
Operation
STREX performs a conditional store to memory. The conditions are as follows:
• If the physical address does not have the Shared TLB attribute, and the executing processor has an
outstanding tagged physical address, the store takes place, the tag is cleared, and the value 0 is
returned in Rd.
• If the physical address does not have the Shared TLB attribute, and the executing processor does not
have an outstanding tagged physical address, the store does not take place, and the value 1 is returned
in Rd.
• If the physical address has the Shared TLB attribute, and the physical address is tagged as exclusive
access for the executing processor, the store takes place, the tag is cleared, and the value 0 is returned
in Rd.
• If the physical address has the Shared TLB attribute, and the physical address is not tagged as
exclusive access for the executing processor, the store does not take place, and the value 1 is returned
in Rd.
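The conditions for the case without the Shared TLB attribute can be illustrated with a small Python model. This is a deliberately simplified sketch, not a description of any real implementation: the LocalMonitor class, the ldrex, strex, and try_lock helpers, and the use of a dictionary for memory are all assumptions made for this example, and the Shared case, in which the tag is tracked per physical address, is not modeled.

```python
class LocalMonitor:
    """Hypothetical per-processor record of an outstanding tagged address."""

    def __init__(self):
        self.tagged = None


def ldrex(monitor, memory, addr):
    """Load from addr and tag it for exclusive access."""
    monitor.tagged = addr
    return memory[addr]


def strex(monitor, memory, addr, value):
    """Conditional store. Returns the status value written to Rd."""
    if monitor.tagged is None:
        return 1                  # no outstanding tag: store does not take place
    memory[addr] = value          # store takes place
    monitor.tagged = None         # tag is cleared
    return 0


def try_lock(monitor, memory, lock_addr):
    """Single attempt to claim a lock word, where 0 means free."""
    if ldrex(monitor, memory, lock_addr) != 0:
        return False
    return strex(monitor, memory, lock_addr, 1) == 0
```

In this model a store attempted without a preceding load-exclusive returns 1 and leaves memory unchanged, while a load-exclusive followed by a store-exclusive returns 0 and clears the tag.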
Restrictions
PC must not be used for any of Rd, Rt, Rt2, or Rn.
For STREX, Rd must not be the same register as Rt, Rt2, or Rn.
For A32 instructions:
• SP can be used but use of SP for any of Rd, Rt, or Rt2 is deprecated.
• For STREXD, Rt must be an even numbered register, and not LR.
Usage
Use LDREX and STREX to implement interprocess communication in multiple-processor and shared-
memory systems.
For reasons of performance, keep the number of instructions between corresponding LDREX and STREX
instructions to a minimum.
Note
The address used in a STREX instruction must be the same as the address in the most recently executed
LDREX instruction.
Availability
All these 32-bit instructions are available in A32 and T32.
There are no 16-bit versions of these instructions.
Examples
MOV r1, #0x1 ; load the ‘lock taken’ value
try
LDREX r0, [LockAddr] ; load the lock value
CMP r0, #0 ; is the lock free?
STREXEQ r0, r1, [LockAddr] ; try and claim the lock
CMPEQ r0, #0 ; did this succeed?
BNE try ; no – try again
.... ; yes – we have the lock
Related references
C1.9 Condition code suffixes on page C1-92
C2.143 SUB
Subtract without carry.
Syntax
SUB{S}{cond} {Rd}, Rn, Operand2
where:
S
is an optional suffix. If S is specified, the condition flags are updated on the result of the
operation.
cond
is an optional condition code.
Operation
The SUB instruction subtracts the value of Operand2 or imm12 from the value in Rn.
In certain circumstances, the assembler can substitute one instruction for another. Be aware of this when
reading disassembly listings.
Condition flags
If S is specified, the SUB instruction updates the N, Z, C and V flags according to the result.
16-bit instructions
The following forms of this instruction are available in T32 code, and are 16-bit instructions:
SUBS Rd, Rn, Rm
Rd, Rn and Rm must all be Lo registers. This form can only be used outside an IT block.
SUB{cond} Rd, Rn, Rm
Rd, Rn and Rm must all be Lo registers. This form can only be used inside an IT block.
SUBS Rd, Rn, #imm
imm range 0-7. Rd and Rn must both be Lo registers. This form can only be used outside an IT block.
SUB{cond} Rd, Rn, #imm
imm range 0-7. Rd and Rn must both be Lo registers. This form can only be used inside an IT block.
SUBS Rd, Rd, #imm
imm range 0-255. Rd must be a Lo register. This form can only be used outside an IT block.
SUB{cond} Rd, Rd, #imm
imm range 0-255. Rd must be a Lo register. This form can only be used inside an IT block.
Example
SUBS r8, r6, #240 ; sets the flags based on the result
For clarity, the above examples use consecutive registers for multiword values. There is no requirement
to do this. The following, for example, is perfectly valid:
SUBS r6, r6, r9
SBCS r9, r2, r1
SBC r2, r8, r11
Related references
C2.3 Flexible second operand (Operand2) on page C2-112
C2.144 SUBS pc, lr on page C2-317
C1.9 Condition code suffixes on page C1-92
C2.144 SUBS pc, lr
Syntax
SUBS{cond} pc, lr, #imm ; A32 and T32 code
where:
op1
is one of ADC, ADD, AND, BIC, EOR, ORN, ORR, RSB, RSC, SBC, and SUB.
op2
imm
is an immediate value. In T32 code, it is limited to the range 0-255. In A32 code, it is a flexible
second operand.
Rn
is the first general-purpose source register. Arm deprecates the use of any register except LR.
Rm
Usage
SUBS pc, lr, #imm subtracts a value from the link register and loads the PC with the result, then copies
the SPSR to the CPSR.
You can use SUBS pc, lr, #imm to return from an exception if there is no return state on the stack. The
value of #imm depends on the exception to return from.
Notes
SUBS pc, lr, #imm writes an address to the PC. The alignment of this address must be correct for the
instruction set in use after the exception return:
• For a return to A32, the address written to the PC must be word-aligned.
• For a return to T32, the address written to the PC must be halfword-aligned.
• For a return to Jazelle, there are no alignment restrictions on the address written to the PC.
No special precautions are required in software to follow these rules, if you use the instruction to return
after a valid exception entry mechanism.
In T32, only SUBS{cond} pc, lr, #imm is a valid instruction. MOVS pc, lr is a synonym of SUBS pc,
lr, #0. Other instructions are undefined.
In A32, only SUBS{cond} pc, lr, #imm and MOVS{cond} pc, lr are valid instructions. Other
instructions are deprecated.
Caution
Do not use these instructions in User mode or System mode. The assembler cannot warn you about this.
Availability
This 32-bit instruction is available in A32 and T32.
The 32-bit T32 instruction is not available in the Armv7‑M architecture.
There is no 16-bit version of this instruction in T32.
Related references
C2.12 AND on page C2-128
C2.58 MOV on page C2-199
C2.3 Flexible second operand (Operand2) on page C2-112
C2.9 ADD on page C2-121
C1.9 Condition code suffixes on page C1-92
C2.145 SVC
SuperVisor Call.
Syntax
SVC{cond} #imm
where:
cond
is an optional condition code.
Operation
The SVC instruction causes an exception. This means that the processor mode changes to Supervisor, the
CPSR is saved to the Supervisor mode SPSR, and execution branches to the SVC vector.
imm is ignored by the processor. However, it can be retrieved by the exception handler to determine what
service is being requested.
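One common way for a handler to retrieve imm is to read back the SVC instruction and mask off the immediate field. The Python sketch below illustrates only the masking step. The function name svc_number and its parameters are assumptions made for this example, and the field widths (24 bits in the A32 encoding, 8 bits in the 16-bit T32 encoding) come from the architecture's instruction encodings, not from this section.

```python
def svc_number(instruction, thumb):
    """Extract imm from an SVC instruction word read back by a handler."""
    if thumb:
        return instruction & 0xFF          # 16-bit T32 encoding: imm8
    return instruction & 0x00FFFFFF        # A32 encoding: imm24
```

For example, the A32 word 0xEF123456 yields 0x123456, and the T32 halfword 0xDF2A yields 0x2A.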
Note
SVC was called SWI in earlier versions of the A32 assembly language. SWI instructions disassemble to
SVC, with a comment to say that this was formerly SWI.
Condition flags
This instruction does not change the flags.
Availability
This instruction is available in A32 and as a 16-bit instruction in T32, in the Armv7 architectures.
There is no 32-bit version of this instruction in T32.
Related references
C1.9 Condition code suffixes on page C1-92
C2.146 SWP and SWPB
Note
These instructions are not supported in Armv8.
Syntax
SWP{B}{cond} Rt, Rt2, [Rn]
where:
cond
is an optional condition code.
Rt2
is the source register. Rt2 can be the same register as Rt. Rt2 must not be PC.
Rn
contains the address in memory. Rn must be a different register from both Rt and Rt2. Rn must
not be PC.
Usage
You can use SWP and SWPB to implement semaphores:
• Data from memory is loaded into Rt.
• The contents of Rt2 are saved to memory.
• If Rt2 is the same register as Rt, the contents of the register are swapped with the contents of the
memory location.
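The swap described by the steps above can be sketched in Python. The function name swp, the byte parameter (standing in for the B suffix), and the dictionary used as memory are assumptions made for this example. On hardware the load and the store form a single atomic operation, which this sequential model does not need to enforce.

```python
def swp(memory, rt2_value, addr, byte=False):
    """Model SWP{B} Rt, Rt2, [Rn]. Returns the value loaded into Rt."""
    mask = 0xFF if byte else 0xFFFFFFFF
    loaded = memory[addr] & mask       # data from memory is loaded into Rt
    memory[addr] = rt2_value & mask    # contents of Rt2 are saved to memory
    return loaded
```

Used as a simple semaphore, swapping in 1 and getting back 0 means the caller has claimed the semaphore, while getting back 1 means it was already taken.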
Note
The use of SWP and SWPB is deprecated. You can use LDREX and STREX instructions to implement more
sophisticated semaphores.
Availability
These instructions are available in A32.
There are no T32 SWP or SWPB instructions.
Related references
C2.51 LDREX on page C2-189
C1.9 Condition code suffixes on page C1-92
C2.147 SXTAB
Sign extend Byte with Add, to extend an 8-bit value to a 32-bit value.
Syntax
SXTAB{cond} {Rd}, Rn, Rm {,rotation}
where:
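cond
is an optional condition code.
rotation
is one of:
ROR #8
Value from Rm is rotated right 8 bits.
ROR #16
Value from Rm is rotated right 16 bits.
ROR #24
Value from Rm is rotated right 24 bits.
If rotation is omitted, no rotation is performed.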
Operation
This instruction does the following:
1. Rotate the value from Rm right by 0, 8, 16 or 24 bits.
2. Extract bits[7:0] from the value obtained.
3. Sign extend to 32 bits.
4. Add the value from Rn.
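The four steps above can be sketched in Python as follows. The function name sxtab and the integer rotation parameter are assumptions made for this example, and registers are modeled as unsigned 32-bit integers.

```python
def sxtab(rn, rm, rotation=0):
    """Model SXTAB Rd, Rn, Rm, ROR #rotation."""
    assert rotation in (0, 8, 16, 24)
    # 1. Rotate the value from Rm right.
    rotated = ((rm >> rotation) | (rm << (32 - rotation))) & 0xFFFFFFFF
    # 2. Extract bits[7:0].
    byte = rotated & 0xFF
    # 3. Sign extend to 32 bits.
    if byte & 0x80:
        byte -= 0x100
    # 4. Add the value from Rn.
    return (rn + byte) & 0xFFFFFFFF
```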
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not change the flags.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
C2.148 SXTAB16
Sign extend two Bytes with Add, to extend two 8-bit values to two 16-bit values.
Syntax
SXTAB16{cond} {Rd}, Rn, Rm {,rotation}
where:
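cond
is an optional condition code.
rotation
is one of:
ROR #8
Value from Rm is rotated right 8 bits.
ROR #16
Value from Rm is rotated right 16 bits.
ROR #24
Value from Rm is rotated right 24 bits.
If rotation is omitted, no rotation is performed.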
Operation
This instruction does the following:
1. Rotate the value from Rm right by 0, 8, 16 or 24 bits.
2. Extract bits[23:16] and bits[7:0] from the value obtained.
3. Sign extend to 16 bits.
4. Add them to bits[31:16] and bits[15:0] respectively of Rn to form bits[31:16] and bits[15:0] of the
result.
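The steps above can be sketched in Python as follows. The function name sxtab16 and the integer rotation parameter are assumptions made for this example, and registers are modeled as unsigned 32-bit integers.

```python
def sxtab16(rn, rm, rotation=0):
    """Model SXTAB16 Rd, Rn, Rm, ROR #rotation."""
    assert rotation in (0, 8, 16, 24)
    rotated = ((rm >> rotation) | (rm << (32 - rotation))) & 0xFFFFFFFF
    result = 0
    for shift in (0, 16):
        # Extract bits[7:0] (shift 0) or bits[23:16] (shift 16).
        byte = (rotated >> shift) & 0xFF
        if byte & 0x80:
            byte -= 0x100                 # sign extend to 16 bits
        # Add to the corresponding halfword of Rn, modulo 2^16.
        half = (((rn >> shift) & 0xFFFF) + byte) & 0xFFFF
        result |= half << shift
    return result
```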
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not change the flags.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
C2.149 SXTAH
Sign extend Halfword with Add, to extend a 16-bit value to a 32-bit value.
Syntax
SXTAH{cond} {Rd}, Rn, Rm {,rotation}
where:
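cond
is an optional condition code.
rotation
is one of:
ROR #8
Value from Rm is rotated right 8 bits.
ROR #16
Value from Rm is rotated right 16 bits.
ROR #24
Value from Rm is rotated right 24 bits.
If rotation is omitted, no rotation is performed.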
Operation
This instruction does the following:
1. Rotate the value from Rm right by 0, 8, 16 or 24 bits.
2. Extract bits[15:0] from the value obtained.
3. Sign extend to 32 bits.
4. Add the value from Rn.
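The four steps above can be sketched in Python as follows. The function name sxtah and the integer rotation parameter are assumptions made for this example, and registers are modeled as unsigned 32-bit integers.

```python
def sxtah(rn, rm, rotation=0):
    """Model SXTAH Rd, Rn, Rm, ROR #rotation."""
    assert rotation in (0, 8, 16, 24)
    # 1. Rotate the value from Rm right.
    rotated = ((rm >> rotation) | (rm << (32 - rotation))) & 0xFFFFFFFF
    # 2. Extract bits[15:0].
    half = rotated & 0xFFFF
    # 3. Sign extend to 32 bits.
    if half & 0x8000:
        half -= 0x10000
    # 4. Add the value from Rn.
    return (rn + half) & 0xFFFFFFFF
```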
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not change the flags.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
C2.150 SXTB
Sign extend Byte, to extend an 8-bit value to a 32-bit value.
Syntax
SXTB{cond} {Rd}, Rm {,rotation}
where:
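cond
is an optional condition code.
Rd
is the destination register.
Rm
is the register holding the value to extend.
rotation
is one of ROR #8, ROR #16, or ROR #24, which rotates the value from Rm right by 8, 16, or 24 bits
respectively. If rotation is omitted, no rotation is performed.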
Operation
This instruction does the following:
1. Rotates the value from Rm right by 0, 8, 16 or 24 bits.
2. Extracts bits[7:0] from the value obtained.
3. Sign extends to 32 bits.
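A minimal Python sketch of this behaviour, assuming registers are modelled as 32-bit integers; sxtb and ror32 are hypothetical names used only for illustration.

```python
def ror32(value, amount):
    """Rotate a 32-bit value right by amount bits."""
    value &= 0xFFFFFFFF
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & 0xFFFFFFFF


def sxtb(rm, rotation=0):
    """Model of SXTB Rd, Rm, ROR #rotation."""
    if rotation not in (0, 8, 16, 24):
        raise ValueError("rotation must be 0, 8, 16, or 24")
    byte = ror32(rm, rotation) & 0xFF
    # Replicate bit 7 into bits[31:8] when the byte is negative.
    return byte | 0xFFFFFF00 if byte & 0x80 else byte
```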
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not change the flags.
16-bit instructions
The following form of this instruction is available in T32 code, and is a 16-bit instruction:
SXTB Rd, Rm
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
The 16-bit instruction is available in T32.
Related references
C1.9 Condition code suffixes on page C1-92
C2.151 SXTB16
Sign extend two bytes.
Syntax
SXTB16{cond} {Rd}, Rm {,rotation}
where:
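cond
is an optional condition code.
Rd
is the destination register.
Rm
is the register holding the value to extend.
rotation
is one of ROR #8, ROR #16, or ROR #24, which rotates the value from Rm right by 8, 16, or 24 bits
respectively. If rotation is omitted, no rotation is performed.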
Operation
SXTB16 extends two 8-bit values to two 16-bit values. It does this by:
1. Rotating the value from Rm right by 0, 8, 16 or 24 bits.
2. Extracting bits[23:16] and bits[7:0] from the value obtained.
3. Sign extending to 16 bits each.
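The following Python sketch models these steps for illustration. The helper names are hypothetical; the two extended bytes are packed into bits[31:16] and bits[15:0] of the result, matching the order in which they were extracted.

```python
def ror32(value, amount):
    """Rotate a 32-bit value right by amount bits."""
    value &= 0xFFFFFFFF
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & 0xFFFFFFFF


def sext8_to_16(byte):
    """Sign extend an 8-bit value to a 16-bit pattern."""
    byte &= 0xFF
    return byte | 0xFF00 if byte & 0x80 else byte


def sxtb16(rm, rotation=0):
    """Model of SXTB16 Rd, Rm, ROR #rotation."""
    if rotation not in (0, 8, 16, 24):
        raise ValueError("rotation must be 0, 8, 16, or 24")
    rotated = ror32(rm, rotation)
    low = sext8_to_16(rotated)           # from bits[7:0]
    high = sext8_to_16(rotated >> 16)    # from bits[23:16]
    return (high << 16) | low
```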
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not change the flags.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Related references
C1.9 Condition code suffixes on page C1-92
C2.152 SXTH
Sign extend Halfword.
Syntax
SXTH{cond} {Rd}, Rm {,rotation}
where:
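cond
is an optional condition code.
Rd
is the destination register.
Rm
is the register holding the value to extend.
rotation
is one of ROR #8, ROR #16, or ROR #24, which rotates the value from Rm right by 8, 16, or 24 bits
respectively. If rotation is omitted, no rotation is performed.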
Operation
SXTH extends a 16-bit value to a 32-bit value. It does this by:
1. Rotating the value from Rm right by 0, 8, 16 or 24 bits.
2. Extracting bits[15:0] from the value obtained.
3. Sign extending to 32 bits.
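A short Python sketch of these steps, given as an illustration rather than a definitive implementation. The names sxth and ror32 are hypothetical, and the model rejects any rotation other than 0, 8, 16, or 24.

```python
def ror32(value, amount):
    """Rotate a 32-bit value right by amount bits."""
    value &= 0xFFFFFFFF
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & 0xFFFFFFFF


def sxth(rm, rotation=0):
    """Model of SXTH Rd, Rm, ROR #rotation."""
    if rotation not in (0, 8, 16, 24):
        raise ValueError("rotation must be 0, 8, 16, or 24")
    half = ror32(rm, rotation) & 0xFFFF
    # Replicate bit 15 into bits[31:16] when the halfword is negative.
    return half | 0xFFFF0000 if half & 0x8000 else half
```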
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not change the flags.
16-bit instructions
The following form of this instruction is available in T32 code, and is a 16-bit instruction:
SXTH Rd, Rm
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
The 16-bit instruction is available in T32.
Example
SXTH r3, r9
Incorrect example
SXTH r3, r9, ROR #12 ; rotation must be 0, 8, 16, or 24.
Related references
C1.9 Condition code suffixes on page C1-92
C2.153 SYS
Execute system coprocessor instruction.
Syntax
SYS{cond} instruction{, Rn}
where:
cond
is an optional condition code.
instruction
is the coprocessor instruction to execute.
Rn
is an operand to the instruction. For instructions that take an argument, Rn is compulsory. For
instructions that do not take an argument, Rn is optional and if it is not specified, R0 is used. Rn
must not be PC.
Usage
You can use this pseudo-instruction to execute special coprocessor instructions such as cache, branch
predictor, and TLB operations. The instructions operate by writing to special write-only coprocessor
registers. The instruction names are the same as the write-only coprocessor register names and are listed
in the Arm® Architecture Reference Manual. For example:
SYS ICIALLUIS ; invalidates all instruction caches Inner Shareable
; to Point of Unification and also flushes branch
; target cache.
Availability
This 32-bit instruction is available in A32 and T32.
The 32-bit T32 instruction is not available in the Armv7‑M architecture.
There is no 16-bit version of this instruction in T32.
Related references
C1.9 Condition code suffixes on page C1-92
Related information
Arm Architecture Reference Manual
C2.154 TBB and TBH
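Table Branch Byte and Table Branch Halfword.
Syntax
TBB [Rn, Rm]
TBH [Rn, Rm, LSL #1]
where:
Rn
is the base register. This contains the address of the table of branch lengths. Rn must not be SP.
If PC is specified for Rn, the value used is the address of the instruction plus 4.
Rm
is the index register. This contains an index into the table. Rm must not be PC or SP.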
Operation
These instructions cause a PC-relative forward branch using a table of single byte offsets (TBB) or
halfword offsets (TBH). Rn provides a pointer to the table, and Rm supplies an index into the table. The
branch length is twice the value of the byte (TBB) or the halfword (TBH) returned from the table. The
target of the branch table must be in the same execution state.
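The target calculation can be sketched in Python as follows. This is a simplified model under stated assumptions: memory is a bytes object indexed by address, halfword entries are little-endian, pc is the address of the TBB or TBH instruction plus 4, and the TBH index is scaled by 2 as in the TBH [Rn, Rm, LSL #1] form. The function name is hypothetical.

```python
def table_branch_target(memory, rn, rm, pc, halfword=False):
    """Return the branch target for a TBB (halfword=False) or TBH (halfword=True)."""
    if halfword:
        address = rn + (rm << 1)           # halfword table: index scaled by 2
        entry = int.from_bytes(memory[address:address + 2], "little")
    else:
        entry = memory[rn + rm]            # byte table: index used directly
    # The branch length is twice the table entry, so targets are always forward.
    return (pc + 2 * entry) & 0xFFFFFFFF
```

With a byte table of 3, 5, 9 at address 0 and an index of 1, the branch length is 10 bytes beyond pc.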
Architectures
These 32-bit T32 instructions are available.
There are no versions of these instructions in A32 or in 16-bit T32 encodings.
C2.155 TEQ
Test Equivalence.
Syntax
TEQ{cond} Rn, Operand2
where:
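cond
is an optional condition code.
Rn
is the register holding the first operand.
Operand2
is a flexible second operand.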
Usage
This instruction tests the value in a register against Operand2. It updates the condition flags on the result,
but does not place the result in any register.
The TEQ instruction performs a bitwise Exclusive OR operation on the value in Rn and the value of
Operand2. This is the same as an EORS instruction, except that the result is discarded.
Use the TEQ instruction to test if two values are equal, without affecting the V or C flags (as CMP does).
TEQ is also useful for testing the sign of a value. After the comparison, the N flag is the logical Exclusive
OR of the sign bits of the two operands.
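The effect on the N and Z flags can be sketched in Python. This is an illustrative model only: teq_flags is a hypothetical name, operand2 is assumed to be the already-evaluated value of Operand2, and any C flag update from the Operand2 shifter is not modelled.

```python
def teq_flags(rn, operand2):
    """Return the N and Z flags that TEQ Rn, Operand2 would produce."""
    result = (rn ^ operand2) & 0xFFFFFFFF   # the result itself is discarded
    return {"N": result >> 31, "Z": int(result == 0)}
```

Z is 1 only when the two values are equal, and N is 1 only when their sign bits differ.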
Register restrictions
In this T32 instruction, you cannot use SP or PC for Rn or Operand2.
In this A32 instruction, use of SP or PC is deprecated.
For A32 instructions:
• If you use PC (R15) as Rn, the value used is the address of the instruction plus 8.
• You cannot use PC for any operand in any data processing instruction that has a register-controlled
shift.
Condition flags
This instruction:
• Updates the N and Z flags according to the result.
• Can update the C flag during the calculation of Operand2.
• Does not affect the V flag.
Architectures
This instruction is available in A32 and T32.
Correct example
TEQEQ r10, r9
Incorrect example
TEQ pc, r1, ROR r0 ; PC not permitted with register
; controlled shift
Related references
C2.3 Flexible second operand (Operand2) on page C2-112
C1.9 Condition code suffixes on page C1-92
C2.156 TST
Test bits.
Syntax
TST{cond} Rn, Operand2
where:
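cond
is an optional condition code.
Rn
is the register holding the first operand.
Operand2
is a flexible second operand.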
Operation
This instruction tests the value in a register against Operand2. It updates the condition flags on the result,
but does not place the result in any register.
The TST instruction performs a bitwise AND operation on the value in Rn and the value of Operand2.
This is the same as an ANDS instruction, except that the result is discarded.
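As an illustration, the following Python sketch models the N and Z flags after a TST. The name tst_flags is hypothetical, operand2 is assumed to be the already-evaluated value of Operand2, and any C flag update from the Operand2 shifter is not modelled.

```python
def tst_flags(rn, operand2):
    """Return the N and Z flags that TST Rn, Operand2 would produce."""
    result = (rn & operand2) & 0xFFFFFFFF   # the result itself is discarded
    return {"N": result >> 31, "Z": int(result == 0)}
```

A typical use is testing whether any bit of a mask is set: Z is 1 when none of the masked bits are set in Rn.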
Register restrictions
In this T32 instruction, you cannot use SP or PC for Rn or Operand2.
In this A32 instruction, use of SP or PC is deprecated.
For A32 instructions:
• If you use PC (R15) as Rn, the value used is the address of the instruction plus 8.
• You cannot use PC for any operand in any data processing instruction that has a register-controlled
shift.
Condition flags
This instruction:
• Updates the N and Z flags according to the result.
• Can update the C flag during the calculation of Operand2.
• Does not affect the V flag.
16-bit instructions
The following form of the TST instruction is available in T32 code, and is a 16-bit instruction:
TST Rn, Rm
Architectures
This instruction is available in A32 and T32.
Examples
TST r0, #0x3F8
TSTNE r1, r5, ASR r1
Related references
C2.3 Flexible second operand (Operand2) on page C2-112
C2.157 TT, TTT, TTA, TTAT
Syntax
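TT{cond}{q} Rd, Rn ; T1 TT general registers (T32)
TTT{cond}{q} Rd, Rn ; T1 TTT general registers (T32)
TTA{cond}{q} Rd, Rn ; T1 TTA general registers (T32)
TTAT{cond}{q} Rd, Rn ; T1 TTAT general registers (T32)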
Where:
cond
Is an optional condition code. It specifies the condition under which the instruction is executed.
If cond is omitted, it defaults to always (AL). See Chapter C1 Condition Codes on page C1-83.
q
Is an optional instruction width specifier. See C2.2 Instruction width specifiers on page C2-111.
Rd
Is the destination general-purpose register into which the status result of the target test is written.
Rn
Is the general-purpose base register.
Usage
Test Target (TT) queries the security state and access permissions of a memory location.
Test Target Unprivileged (TTT) queries the security state and access permissions of a memory location
for an unprivileged access to that location.
Test Target Alternate Domain (TTA) and Test Target Alternate Domain Unprivileged (TTAT) query the
security state and access permissions of a memory location for a Non-secure access to that location.
TTA and TTAT are only valid when executing in Secure state, and are UNDEFINED if used from
Non-secure state.
These instructions return the security state and access permissions in the destination register, the contents
of which are as follows:
[7:0] MREGION The MPU region that the address maps to. This field is 0 if MRVALID is 0.
[15:8] SREGION The SAU region that the address maps to. This field is only valid if the instruction is executed from Secure
state. This field is 0 if SRVALID is 0.
[16] MRVALID Set to 1 if the MREGION content is valid. Set to 0 if the MREGION content is invalid.
[17] SRVALID Set to 1 if the SREGION content is valid. Set to 0 if the SREGION content is invalid.
[18] R Read accessibility. Set to 1 if the memory location can be read according to the permissions of the selected
MPU when operating in the current mode. For TTT and TTAT, this bit returns the permissions for unprivileged
access, regardless of whether the current mode is privileged or unprivileged.
[19] RW Read/write accessibility. Set to 1 if the memory location can be read and written according to the permissions of
the selected MPU when operating in the current mode. For TTT and TTAT, this bit returns the permissions for
unprivileged access, regardless of whether the current mode is privileged or unprivileged.
[20] NSR Equal to R AND NOT S. Can be used in combination with the LSLS (immediate) instruction to check both the
MPU and SAU/IDAU permissions. This bit is only valid if the instruction is executed from Secure state and the
R field is valid.
[21] NSRW Equal to RW AND NOT S. Can be used in combination with the LSLS (immediate) instruction to check both
the MPU and SAU/IDAU permissions. This bit is only valid if the instruction is executed from Secure state and
the RW field is valid.
[22] S Security. A value of 1 indicates the memory location is Secure, and a value of 0 indicates the memory location
is Non-secure. This bit is only valid if the instruction is executed from Secure state.
[23] IRVALID IREGION valid flag. For a Secure request, indicates the validity of the IREGION field. Set to 1 if the IREGION
content is valid. Set to 0 if the IREGION content is invalid.
This bit is always 0 if the IDAU cannot provide a region number, the address is exempt from security
attribution, or if the requesting TT instruction is executed from the Non-secure state.
[31:24] IREGION IDAU region number. Indicates the IDAU region number containing the target address. This field is 0 if
IRVALID is 0.
The SREGION field is invalid and 0 if any of the following conditions are true:
• SAU_CTRL.ENABLE is set to 0.
• The address did not match any enabled SAU regions.
• The address matched multiple SAU regions.
• The SAU attributes were overridden by the IDAU.
• The instruction is executed from Non-secure state, or is executed on a processor that does not
implement the Armv8‑M Security Extensions.
The R and RW bits are invalid and 0 if any of the following conditions are true:
• The address matched multiple MPU regions.
• TT or TTT is executed from an unprivileged mode.
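The layout of the destination register can be illustrated with a small Python decoder. This is a sketch only: TT_FIELDS and decode_tt_result are hypothetical names, and the bit positions are taken from the field list above.

```python
# Field name -> (least significant bit, width in bits), from the field list above.
TT_FIELDS = {
    "MREGION": (0, 8),
    "SREGION": (8, 8),
    "MRVALID": (16, 1),
    "SRVALID": (17, 1),
    "R": (18, 1),
    "RW": (19, 1),
    "NSR": (20, 1),
    "NSRW": (21, 1),
    "S": (22, 1),
    "IRVALID": (23, 1),
    "IREGION": (24, 8),
}


def decode_tt_result(value):
    """Split a 32-bit TT, TTT, TTA, or TTAT result into its named fields."""
    return {
        name: (value >> lsb) & ((1 << width) - 1)
        for name, (lsb, width) in TT_FIELDS.items()
    }
```

For example, decoding 0x004D0002 gives MREGION 2 with MRVALID, R, RW, and S all set to 1. NSR is 0 in that value, consistent with NSR being R AND NOT S.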
Related references
C1.9 Condition code suffixes on page C1-92
C2.2 Instruction width specifiers on page C2-111
C2.158 UADD8
Unsigned parallel byte-wise addition.
Syntax
UADD8{cond} {Rd}, Rn, Rm
where:
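cond
is an optional condition code.
Rd
is the destination register.
Rn, Rm
are the registers holding the operands.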
Operation
This instruction performs four unsigned integer additions on the corresponding bytes of the operands and
writes the results into the corresponding bytes of the destination. The results are modulo 2^8. It sets the
APSR GE flags.
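A minimal Python sketch of the byte-wise addition, for illustration only. The name uadd8 is hypothetical. The model assumes the architectural behaviour that each GE bit is set to 1 when the addition in its byte lane generates a carry, and returns the GE flags as a 4-bit integer with GE[0] in bit 0.

```python
def uadd8(rn, rm):
    """Return (result, ge) for UADD8 Rd, Rn, Rm."""
    result = 0
    ge = 0
    for lane in range(4):
        shift = 8 * lane
        total = ((rn >> shift) & 0xFF) + ((rm >> shift) & 0xFF)
        result |= (total & 0xFF) << shift      # each byte wraps modulo 2^8
        if total > 0xFF:                       # carry out of this byte lane
            ge |= 1 << lane
    return result, ge
```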
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
GE flags
This instruction does not affect the N, Z, C, V, or Q flags.
It sets the GE flags in the APSR as follows:
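GE[0]
for bits[7:0] of the result.
GE[1]
for bits[15:8] of the result.
GE[2]
for bits[23:16] of the result.
GE[3]
for bits[31:24] of the result.
It sets a GE flag to 1 to indicate that the corresponding result overflowed, generating a carry. This is
equivalent to an ADDS instruction setting the C condition flag to 1.
You can use these flags to control a following SEL instruction.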
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Related references
C2.100 SEL on page C2-257
C1.9 Condition code suffixes on page C1-92
C2.159 UADD16
Unsigned parallel halfword-wise addition.
Syntax
UADD16{cond} {Rd}, Rn, Rm
where:
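cond
is an optional condition code.
Rd
is the destination register.
Rn, Rm
are the registers holding the operands.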
Operation
This instruction performs two unsigned integer additions on the corresponding halfwords of the operands
and writes the results into the corresponding halfwords of the destination. The results are modulo 2^16. It
sets the APSR GE flags.
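The halfword-wise addition can be sketched in Python as follows. The name uadd16 is hypothetical. The model assumes the architectural behaviour that GE[1:0] and GE[3:2] are each set as a pair when the addition in the corresponding halfword generates a carry, and returns the GE flags as a 4-bit integer.

```python
def uadd16(rn, rm):
    """Return (result, ge) for UADD16 Rd, Rn, Rm."""
    result = 0
    ge = 0
    for lane in range(2):
        shift = 16 * lane
        total = ((rn >> shift) & 0xFFFF) + ((rm >> shift) & 0xFFFF)
        result |= (total & 0xFFFF) << shift    # each halfword wraps modulo 2^16
        if total > 0xFFFF:                     # carry out of this halfword
            ge |= 0b11 << (2 * lane)
    return result, ge
```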
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
GE flags
This instruction does not affect the N, Z, C, V, or Q flags.
It sets the GE flags in the APSR as follows:
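GE[1:0]
for bits[15:0] of the result.
GE[3:2]
for bits[31:16] of the result.
It sets a pair of GE flags to 1 to indicate that the corresponding result overflowed, generating a carry.
This is equivalent to an ADDS instruction setting the C condition flag to 1.
You can use these flags to control a following SEL instruction.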
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Related references
C2.100 SEL on page C2-257
C1.9 Condition code suffixes on page C1-92
C2.160 UASX
Unsigned parallel add and subtract halfwords with exchange.
Syntax
UASX{cond} {Rd}, Rn, Rm
where:
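cond
is an optional condition code.
Rd
is the destination register.
Rn, Rm
are the registers holding the first and second operands.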
Operation
This instruction exchanges the two halfwords of the second operand, then performs an addition on the
two top halfwords of the operands and a subtraction on the bottom two halfwords. It writes the results
into the corresponding halfwords of the destination. The results are modulo 2^16. It sets the APSR GE
flags.
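A Python sketch of the exchange, add, and subtract, given as an illustration. The name uasx is hypothetical. The GE handling follows the architectural definition as an assumption of this model: GE[1:0] is set when the subtraction does not borrow, and GE[3:2] is set when the addition generates a carry.

```python
def uasx(rn, rm):
    """Return (result, ge) for UASX Rd, Rn, Rm."""
    # After exchanging the halfwords of Rm, its original top halfword is
    # subtracted from Rn[15:0] and its original bottom halfword is added
    # to Rn[31:16].
    diff = (rn & 0xFFFF) - ((rm >> 16) & 0xFFFF)
    total = ((rn >> 16) & 0xFFFF) + (rm & 0xFFFF)
    ge = (0b0011 if diff >= 0 else 0) | (0b1100 if total > 0xFFFF else 0)
    return ((total & 0xFFFF) << 16) | (diff & 0xFFFF), ge
```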
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
GE flags
This instruction does not affect the N, Z, C, V, or Q flags.
It sets the GE flags in the APSR as follows:
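GE[1:0]
for bits[15:0] of the result.
GE[3:2]
for bits[31:16] of the result.
It sets GE[1:0] to 1 to indicate that the subtraction gave a result greater than or equal to zero, meaning a
borrow did not occur. This is equivalent to a SUBS instruction setting the C condition flag to 1.
It sets GE[3:2] to 1 to indicate that the addition overflowed, generating a carry. This is equivalent to an
ADDS instruction setting the C condition flag to 1.
You can use these flags to control a following SEL instruction.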
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
C2.161 UBFX
C2.161 UBFX
Unsigned Bit Field Extract.
Syntax
UBFX{cond} Rd, Rn, #lsb, #width
where:
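cond
is an optional condition code.
Rd
is the destination register.
Rn
is the source register.
lsb
is the bit number of the least significant bit in the bitfield, in the range 0 to 31.
width
is the width of the bitfield, in the range 1 to 32-lsb.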
Operation
Copies adjacent bits from one register into the least significant bits of a second register, and zero extends
to 32 bits.
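The extraction amounts to a shift and a mask, as the following Python sketch shows. The name ubfx is hypothetical, and the model assumes the architectural constraint that width is in the range 1 to 32-lsb.

```python
def ubfx(rn, lsb, width):
    """Model of UBFX Rd, Rn, #lsb, #width."""
    if not 0 <= lsb <= 31:
        raise ValueError("lsb must be in the range 0 to 31")
    if not 1 <= width <= 32 - lsb:
        raise ValueError("width must be in the range 1 to 32-lsb")
    # Shift the field down to bit 0, then clear every bit above the field.
    return ((rn & 0xFFFFFFFF) >> lsb) & ((1 << width) - 1)
```

For example, ubfx(0xABCD1234, 8, 8) extracts bits[15:8] and gives 0x12.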
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not alter any flags.
Architectures
This instruction is available in A32 and T32.
There is no 16-bit version of this instruction in T32.
Related references
C1.9 Condition code suffixes on page C1-92
C2.162 UDF
Permanently Undefined.
Syntax
UDF{c}{q} {#}imm ; A1 general registers (A32)
UDF{c}{q} {#}imm ; T1 general registers (T32)
UDF{c}.W {#}imm ; T2 general registers (T32)
Where:
imm
The value depends on the instruction variant:
A1 general registers
For A32, a 16-bit unsigned immediate, in the range 0 to 65535.
T1 general registers
For T32, an 8-bit unsigned immediate, in the range 0 to 255.
T2 general registers
For T32, a 16-bit unsigned immediate, in the range 0 to 65535.
Note
The PE ignores the value of this constant.
c
Is an optional condition code. See Chapter C1 Condition Codes on page C1-83. Arm deprecates
using any c value other than AL.
q
Is an optional instruction width specifier. See C2.2 Instruction width specifiers on page C2-111.
Usage
Permanently Undefined generates an Undefined Instruction exception.
The encodings for UDF used in this section are defined as permanently UNDEFINED in the Armv8‑A
architecture. However:
• With the T32 instruction set, Arm deprecates using the UDF instruction in an IT block.
• In the A32 instruction set, UDF is not conditional.
Related references
C2.1 A32 and T32 instruction summary on page C2-106
C2.163 UDIV
Unsigned Divide.
Syntax
UDIV{cond} {Rd}, Rn, Rm
where:
cond
is an optional condition code.
Rd
is the destination register.
Rn
is the register holding the value to be divided.
Rm
is a register holding the divisor.
Register restrictions
PC or SP cannot be used for Rd, Rn, or Rm.
Architectures
This 32-bit T32 instruction is available in Armv7‑R, Armv7‑M and Armv8‑M Mainline.
This 32-bit A32 instruction is optional in Armv7‑R.
This 32-bit A32 and T32 instruction is available in Armv7‑A if Virtualization Extensions are
implemented, and optional if not.
There is no 16-bit T32 UDIV instruction.
Related references
C1.9 Condition code suffixes on page C1-92
C2.164 UHADD8
Unsigned halving parallel byte-wise addition.
Syntax
UHADD8{cond} {Rd}, Rn, Rm
where:
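cond
is an optional condition code.
Rd
is the destination register.
Rn, Rm
are the registers holding the operands.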
Operation
This instruction performs four unsigned integer additions on the corresponding bytes of the operands,
halves the results, and writes the results into the corresponding bytes of the destination. This cannot
cause overflow.
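The reason overflow cannot occur is visible in a short Python sketch: each 9-bit sum is halved before it is written back, so it always fits in 8 bits. The name uhadd8 is hypothetical and the model is illustrative only.

```python
def uhadd8(rn, rm):
    """Model of UHADD8 Rd, Rn, Rm."""
    result = 0
    for lane in range(4):
        shift = 8 * lane
        total = ((rn >> shift) & 0xFF) + ((rm >> shift) & 0xFF)  # at most 0x1FE
        result |= (total >> 1) << shift                          # at most 0xFF
    return result
```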
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not affect the N, Z, C, V, Q, or GE flags.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Related references
C1.9 Condition code suffixes on page C1-92
C2.165 UHADD16
Unsigned halving parallel halfword-wise addition.
Syntax
UHADD16{cond} {Rd}, Rn, Rm
where:
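cond
is an optional condition code.
Rd
is the destination register.
Rn, Rm
are the registers holding the operands.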
Operation
This instruction performs two unsigned integer additions on the corresponding halfwords of the
operands, halves the results, and writes the results into the corresponding halfwords of the destination.
This cannot cause overflow.
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not affect the N, Z, C, V, Q, or GE flags.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Related references
C1.9 Condition code suffixes on page C1-92
C2.166 UHASX
Unsigned halving parallel add and subtract halfwords with exchange.
Syntax
UHASX{cond} {Rd}, Rn, Rm
where:
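cond
is an optional condition code.
Rd
is the destination register.
Rn, Rm
are the registers holding the first and second operands.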
Operation
This instruction exchanges the two halfwords of the second operand, then performs an addition on the
two top halfwords of the operands and a subtraction on the bottom two halfwords. It halves the results
and writes them into the corresponding halfwords of the destination. This cannot cause overflow.
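An illustrative Python sketch of this operation follows. The name uhasx is hypothetical. The model assumes that a negative difference is halved with an arithmetic shift, so that bits[16:1] of the two's complement difference are written to the result.

```python
def uhasx(rn, rm):
    """Model of UHASX Rd, Rn, Rm."""
    # After the exchange, Rm[15:0] is added to Rn[31:16] and Rm[31:16] is
    # subtracted from Rn[15:0].
    total = ((rn >> 16) & 0xFFFF) + (rm & 0xFFFF)
    diff = (rn & 0xFFFF) - ((rm >> 16) & 0xFFFF)
    # Python's >> on a negative int is an arithmetic shift.
    return (((total >> 1) & 0xFFFF) << 16) | ((diff >> 1) & 0xFFFF)
```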
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not affect the N, Z, C, V, Q, or GE flags.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Related references
C1.9 Condition code suffixes on page C1-92
C2.167 UHSAX
Unsigned halving parallel subtract and add halfwords with exchange.
Syntax
UHSAX{cond} {Rd}, Rn, Rm
where:
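cond
is an optional condition code.
Rd
is the destination register.
Rn, Rm
are the registers holding the first and second operands.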
Operation
This instruction exchanges the two halfwords of the second operand, then performs a subtraction on the
two top halfwords of the operands and an addition on the bottom two halfwords. It halves the results and
writes them into the corresponding halfwords of the destination. This cannot cause overflow.
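The following Python sketch illustrates the operation. The name uhsax is hypothetical. The model assumes that a negative difference is halved with an arithmetic shift, so that bits[16:1] of the two's complement difference are written to the result.

```python
def uhsax(rn, rm):
    """Model of UHSAX Rd, Rn, Rm."""
    # After the exchange, Rm[15:0] is subtracted from Rn[31:16] and Rm[31:16]
    # is added to Rn[15:0].
    diff = ((rn >> 16) & 0xFFFF) - (rm & 0xFFFF)
    total = (rn & 0xFFFF) + ((rm >> 16) & 0xFFFF)
    # Python's >> on a negative int is an arithmetic shift.
    return (((diff >> 1) & 0xFFFF) << 16) | ((total >> 1) & 0xFFFF)
```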
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not affect the N, Z, C, V, Q, or GE flags.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Related references
C1.9 Condition code suffixes on page C1-92
C2.168 UHSUB8
Unsigned halving parallel byte-wise subtraction.
Syntax
UHSUB8{cond} {Rd}, Rn, Rm
where:
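cond
is an optional condition code.
Rd
is the destination register.
Rn, Rm
are the registers holding the first and second operands.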
Operation
This instruction subtracts each byte of the second operand from the corresponding byte of the first
operand, halves the results, and writes the results into the corresponding bytes of the destination. This
cannot cause overflow.
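A minimal Python sketch of the byte-wise halving subtraction, for illustration only. The name uhsub8 is hypothetical. The model assumes that a negative difference is halved with an arithmetic shift and then truncated to 8 bits.

```python
def uhsub8(rn, rm):
    """Model of UHSUB8 Rd, Rn, Rm."""
    result = 0
    for lane in range(4):
        shift = 8 * lane
        diff = ((rn >> shift) & 0xFF) - ((rm >> shift) & 0xFF)
        # Python's >> on a negative int is an arithmetic shift.
        result |= ((diff >> 1) & 0xFF) << shift
    return result
```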
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not affect the N, Z, C, V, Q, or GE flags.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Related references
C1.9 Condition code suffixes on page C1-92
C2.169 UHSUB16
Unsigned halving parallel halfword-wise subtraction.
Syntax
UHSUB16{cond} {Rd}, Rn, Rm
where:
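cond
is an optional condition code.
Rd
is the destination register.
Rn, Rm
are the registers holding the first and second operands.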
Operation
This instruction subtracts each halfword of the second operand from the corresponding halfword of the
first operand, halves the results, and writes the results into the corresponding halfwords of the
destination. This cannot cause overflow.
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not affect the N, Z, C, V, Q, or GE flags.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Related references
C1.9 Condition code suffixes on page C1-92
C2.170 UMAAL
Unsigned Multiply Accumulate Accumulate Long.
Syntax
UMAAL{cond} RdLo, RdHi, Rn, Rm
where:
cond
is an optional condition code.
RdLo, RdHi
are the destination registers for the 64-bit result. They also hold the two 32-bit accumulate
operands. RdLo and RdHi must be different registers.
Rn, Rm
are the registers holding the multiply operands.
Operation
The UMAAL instruction multiplies the 32-bit values in Rn and Rm, adds the two 32-bit values in RdHi and
RdLo, and stores the 64-bit result to RdLo, RdHi.
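The arithmetic can be sketched in Python as follows; the name umaal is hypothetical. The largest possible value, (2^32 - 1)^2 + 2 × (2^32 - 1), equals 2^64 - 1, so the result always fits in 64 bits.

```python
MASK32 = 0xFFFFFFFF


def umaal(rdlo, rdhi, rn, rm):
    """Return the new (RdLo, RdHi) for UMAAL RdLo, RdHi, Rn, Rm."""
    result = (rn & MASK32) * (rm & MASK32) + (rdhi & MASK32) + (rdlo & MASK32)
    return result & MASK32, result >> 32
```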
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not change the flags.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Examples
UMAAL r8, r9, r2, r3
UMAALGE r2, r0, r5, r3
Related references
C1.9 Condition code suffixes on page C1-92
C2.171 UMLAL
Unsigned Long Multiply, with optional Accumulate, with 32-bit operands and 64-bit result and
accumulator.
Syntax
UMLAL{S}{cond} RdLo, RdHi, Rn, Rm
where:
S
is an optional suffix available in A32 state only. If S is specified, the condition flags are updated
based on the result of the operation.
cond
is an optional condition code.
RdLo, RdHi
are the destination registers. They also hold the accumulating value. RdLo and RdHi must be
different registers.
Rn, Rm
are the registers holding the operands.
Operation
The UMLAL instruction interprets the values from Rn and Rm as unsigned integers. It multiplies these
integers, and adds the 64-bit result to the 64-bit unsigned integer contained in RdHi and RdLo.
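An illustrative Python sketch of the multiply and accumulate follows. The name umlal is hypothetical, the flag updates for the S suffix are not modelled, and the accumulated value is assumed to wrap modulo 2^64.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def umlal(rdlo, rdhi, rn, rm):
    """Return the new (RdLo, RdHi) for UMLAL RdLo, RdHi, Rn, Rm."""
    accumulator = ((rdhi & MASK32) << 32) | (rdlo & MASK32)
    result = (accumulator + (rn & MASK32) * (rm & MASK32)) & MASK64
    return result & MASK32, result >> 32
```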
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
If S is specified, this instruction:
• Updates the N and Z flags according to the result.
• Does not affect the C or V flags.
Architectures
This 32-bit instruction is available in A32 and T32.
There is no 16-bit version of this instruction in T32.
Example
UMLALS r4, r5, r3, r8
Related references
C1.9 Condition code suffixes on page C1-92
C2.172 UMULL
Unsigned Long Multiply, with 32-bit operands, and 64-bit result.
Syntax
UMULL{S}{cond} RdLo, RdHi, Rn, Rm
where:
S
is an optional suffix available in A32 state only. If S is specified, the condition flags are updated
based on the result of the operation.
cond
is an optional condition code.
RdLo, RdHi
are the destination general-purpose registers. RdLo and RdHi must be different registers.
Rn, Rm
are the general-purpose registers holding the operands.
Operation
The UMULL instruction interprets the values from Rn and Rm as unsigned integers. It multiplies these
integers and places the least significant 32 bits of the result in RdLo, and the most significant 32 bits of
the result in RdHi.
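As a brief illustration, the following Python sketch splits the 64-bit product into the two destination registers. The name umull is hypothetical, and the flag updates for the S suffix are not modelled.

```python
MASK32 = 0xFFFFFFFF


def umull(rn, rm):
    """Return (RdLo, RdHi) for UMULL RdLo, RdHi, Rn, Rm."""
    product = (rn & MASK32) * (rm & MASK32)
    return product & MASK32, product >> 32
```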
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
If S is specified, this instruction:
• Updates the N and Z flags according to the result.
• Does not affect the C or V flags.
Architectures
This 32-bit instruction is available in A32 and T32.
There is no 16-bit version of this instruction in T32.
Example
UMULL r0, r4, r5, r6
Related references
C1.9 Condition code suffixes on page C1-92
C2.173 UQADD8
Unsigned saturating parallel byte-wise addition.
Syntax
UQADD8{cond} {Rd}, Rn, Rm
where:
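cond
is an optional condition code.
Rd
is the destination register.
Rn, Rm
are the registers holding the operands.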
Operation
This instruction performs four unsigned integer additions on the corresponding bytes of the operands and
writes the results into the corresponding bytes of the destination. It saturates the results to the unsigned
range 0 ≤ x ≤ 2^8 - 1. The Q flag is not affected even if this operation saturates.
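The saturation can be sketched in Python as a per-byte clamp. This is an illustrative model only, and the name uqadd8 is hypothetical.

```python
def uqadd8(rn, rm):
    """Model of UQADD8 Rd, Rn, Rm."""
    result = 0
    for lane in range(4):
        shift = 8 * lane
        total = ((rn >> shift) & 0xFF) + ((rm >> shift) & 0xFF)
        result |= min(total, 0xFF) << shift    # clamp instead of wrapping
    return result
```

Compare this with a wrapping addition: 0xF0 + 0x10 gives 0xFF here, not 0x00.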
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not affect the N, Z, C, V, Q, or GE flags.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Related references
C1.9 Condition code suffixes on page C1-92
C2.174 UQADD16
Unsigned saturating parallel halfword-wise addition.
Syntax
UQADD16{cond} {Rd}, Rn, Rm
where:
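cond
is an optional condition code.
Rd
is the destination register.
Rn, Rm
are the registers holding the operands.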
Operation
This instruction performs two unsigned integer additions on the corresponding halfwords of the operands
and writes the results into the corresponding halfwords of the destination. It saturates the results to the
unsigned range 0 ≤ x ≤ 2^16 - 1. The Q flag is not affected even if this operation saturates.
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not affect the N, Z, C, V, Q, or GE flags.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Related references
C1.9 Condition code suffixes on page C1-92
C2.175 UQASX
Unsigned saturating parallel add and subtract halfwords with exchange.
Syntax
UQASX{cond} {Rd}, Rn, Rm
where:
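cond
is an optional condition code.
Rd
is the destination register.
Rn, Rm
are the registers holding the operands.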
Operation
This instruction exchanges the two halfwords of the second operand, then performs an addition on the
two top halfwords of the operands and a subtraction on the bottom two halfwords. It writes the results
into the corresponding halfwords of the destination. It saturates the results to the unsigned range
0 ≤ x ≤ 2^16 - 1. The Q flag is not affected even if this operation saturates.
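The exchange step is easy to misread, so the following Python sketch models it explicitly. It is a hedged illustration rather than Arm reference code; the name uqasx is an assumption for this example.

```python
def uqasx(rn: int, rm: int) -> int:
    """Model UQASX: saturating add (top) and subtract (bottom) with exchange."""
    rn_top, rn_bottom = (rn >> 16) & 0xFFFF, rn & 0xFFFF
    rm_top, rm_bottom = (rm >> 16) & 0xFFFF, rm & 0xFFFF
    # After the exchange, Rm's bottom halfword pairs with Rn's top halfword,
    # and Rm's top halfword pairs with Rn's bottom halfword.
    top = min(rn_top + rm_bottom, 0xFFFF)  # saturate at 2^16 - 1
    bottom = max(rn_bottom - rm_top, 0)    # saturate at 0
    return (top << 16) | bottom


print(hex(uqasx(0x00010009, 0x00040002)))
```

UQSAX follows the same pattern with the addition and subtraction swapped between the halfwords.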
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not affect the N, Z, C, V, Q, or GE flags.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Related references
C1.9 Condition code suffixes on page C1-92
C2.176 UQSAX
Unsigned saturating parallel subtract and add halfwords with exchange.
Syntax
UQSAX{cond} {Rd}, Rn, Rm
where:
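cond
is an optional condition code.
Rd
is the destination register.
Rn, Rm
are the registers holding the operands.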
Operation
This instruction exchanges the two halfwords of the second operand, then performs a subtraction on the
two top halfwords of the operands and an addition on the bottom two halfwords. It writes the results into
the corresponding halfwords of the destination. It saturates the results to the unsigned range
0 ≤ x ≤ 2^16 - 1. The Q flag is not affected even if this operation saturates.
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not affect the N, Z, C, V, Q, or GE flags.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Related references
C1.9 Condition code suffixes on page C1-92
C2.177 UQSUB8
Unsigned saturating parallel byte-wise subtraction.
Syntax
UQSUB8{cond} {Rd}, Rn, Rm
where:
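cond
is an optional condition code.
Rd
is the destination register.
Rn, Rm
are the registers holding the operands.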
Operation
This instruction subtracts each byte of the second operand from the corresponding byte of the first
operand and writes the results into the corresponding bytes of the destination. It saturates the results to
the unsigned range 0 ≤ x ≤ 2^8 - 1. The Q flag is not affected even if this operation saturates.
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not affect the N, Z, C, V, Q, or GE flags.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Related references
C1.9 Condition code suffixes on page C1-92
C2.178 UQSUB16
Unsigned saturating parallel halfword-wise subtraction.
Syntax
UQSUB16{cond} {Rd}, Rn, Rm
where:
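cond
is an optional condition code.
Rd
is the destination register.
Rn, Rm
are the registers holding the operands.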
Operation
This instruction subtracts each halfword of the second operand from the corresponding halfword of the
first operand and writes the results into the corresponding halfwords of the destination. It saturates the
results to the unsigned range 0 ≤ x ≤ 2^16 - 1. The Q flag is not affected even if this operation saturates.
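For unsigned subtraction, saturation only ever happens at the lower bound. The Python sketch below illustrates this; it is not Arm reference code and the name uqsub16 is an assumption for this example.

```python
def uqsub16(rn: int, rm: int) -> int:
    """Model UQSUB16: two unsigned saturating halfword subtractions."""
    result = 0
    for shift in (0, 16):
        a = (rn >> shift) & 0xFFFF
        b = (rm >> shift) & 0xFFFF
        # An unsigned difference can only leave the range below zero,
        # so clamping at 0 is the only saturation case.
        result |= max(a - b, 0) << shift
    return result


print(hex(uqsub16(0x00051000, 0x00070800)))
```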
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not affect the N, Z, C, V, Q, or GE flags.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Related references
C1.9 Condition code suffixes on page C1-92
C2.179 USAD8
Unsigned Sum of Absolute Differences.
Syntax
USAD8{cond} {Rd}, Rn, Rm
where:
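cond
is an optional condition code.
Rd
is the destination register.
Rn, Rm
are the registers holding the operands.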
Operation
The USAD8 instruction finds the four differences between the unsigned values in corresponding bytes of
Rn and Rm. It adds the absolute values of the four differences, and saves the result to Rd.
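As an illustration of the operation, the following Python sketch computes the same sum. It is a model for explanation only; the name usad8 is an assumption for this example.

```python
def usad8(rn: int, rm: int) -> int:
    """Model USAD8: sum of absolute differences of four unsigned byte pairs."""
    total = 0
    for shift in (0, 8, 16, 24):
        a = (rn >> shift) & 0xFF
        b = (rm >> shift) & 0xFF
        total += abs(a - b)
    return total


# |1-4| + |2-3| + |3-2| + |4-1|
print(usad8(0x01020304, 0x04030201))
```

The largest possible result is 4 × 255 = 1020, so the sum always fits in Rd.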
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not alter any flags.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Example
USAD8 r2, r4, r6
Related references
C1.9 Condition code suffixes on page C1-92
C2.180 USADA8
Unsigned Sum of Absolute Differences and Accumulate.
Syntax
USADA8{cond} Rd, Rn, Rm, Ra
where:
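cond
is an optional condition code.
Rd
is the destination register.
Rn, Rm
are the registers holding the operands.
Ra
is the register holding the accumulate operand.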
Operation
The USADA8 instruction finds the four differences between the unsigned values in corresponding bytes of
Rn and Rm. It adds the absolute values of the four differences to the value in Ra, and saves the
result to Rd.
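A minimal Python sketch of the accumulate form follows. The name usada8 is an assumption for this example, and the model assumes that the 32-bit result wraps modulo 2^32, which this section does not state explicitly.

```python
def usada8(rn: int, rm: int, ra: int) -> int:
    """Model USADA8: accumulate the byte-wise sum of absolute differences into Ra."""
    total = 0
    for shift in (0, 8, 16, 24):
        total += abs(((rn >> shift) & 0xFF) - ((rm >> shift) & 0xFF))
    # Assumption: the accumulation wraps to 32 bits like an ordinary ADD.
    return (ra + total) & 0xFFFFFFFF


print(usada8(0x01020304, 0x04030201, 100))
```

Chaining calls with the previous result as ra accumulates a running total over a block of bytes, which is the typical use of the four-register form.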
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not alter any flags.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Correct examples
USADA8 r0, r3, r5, r2
USADA8VS r0, r4, r0, r1
Incorrect examples
USADA8 r2, r4, r6 ; USADA8 requires four registers
USADA16 r0, r4, r0, r1 ; no such instruction
Related references
C1.9 Condition code suffixes on page C1-92
C2.181 USAT
Unsigned Saturate to any bit position, with optional shift before saturating.
Syntax
USAT{cond} Rd, #sat, Rm{, shift}
where:
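cond
is an optional condition code.
Rd
is the destination register.
sat
specifies the bit position to saturate to.
Rm
is the register holding the operand.
shift
is an optional shift that is applied to the operand before saturating.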
Operation
The USAT instruction applies the specified shift to a signed value, then saturates to the unsigned range
0 ≤ x ≤ 2^sat - 1.
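The following Python sketch models the shift-then-saturate sequence and the Q flag update. It is an illustration only: the function name usat and the shift and amount parameters are assumptions for this example, and the model assumes that an LSL result is truncated to 32 bits before it is interpreted as signed.

```python
def usat(rm: int, sat: int, shift: str = "LSL", amount: int = 0):
    """Model USAT: return (result, q), where q is True if saturation occurred."""
    word = rm & 0xFFFFFFFF
    if shift == "LSL":
        # Assumption: the left shift is performed on a 32-bit register value.
        word = (word << amount) & 0xFFFFFFFF
    # Interpret the 32-bit pattern as a signed (two's complement) value.
    signed = word - (1 << 32) if word & 0x80000000 else word
    if shift == "ASR":
        signed >>= amount  # Python's >> on a negative int is arithmetic
    upper = (1 << sat) - 1
    result = min(max(signed, 0), upper)
    return result, result != signed


# With sat = 7, as in "USATNE r0, #7, r5", the range is 0 to 127.
print(usat(200, 7))         # too large, clamps to the upper bound
print(usat(0xFFFFFFF0, 7))  # negative input, clamps to 0
```

Negative inputs always saturate to 0, which is why the operand is described as signed even though the result is unsigned.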
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Q flag
If saturation occurs, this instruction sets the Q flag. To read the state of the Q flag, use an MRS instruction.
Architectures
This instruction is available in A32 and T32.
There is no 16-bit version of this instruction in T32.
Example
USATNE r0, #7, r5
Related references
C2.131 SSAT16 on page C2-292
C2.62 MRS (PSR to general-purpose register) on page C2-204
C1.9 Condition code suffixes on page C1-92
C2.182 USAT16
Parallel halfword Saturate.
Syntax
USAT16{cond} Rd, #sat, Rn
where:
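cond
is an optional condition code.
Rd
is the destination register.
sat
specifies the bit position to saturate to.
Rn
is the register holding the operand.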
Operation
Halfword-wise unsigned saturation to any bit position.
The USAT16 instruction saturates each signed halfword to the unsigned range 0 ≤ x ≤ 2^sat - 1.
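A short Python model of the halfword-wise saturation is sketched below. The name usat16 and the returned (result, q) pair are assumptions for this example, not part of the instruction definition.

```python
def usat16(rn: int, sat: int):
    """Model USAT16: saturate each signed halfword to 0 .. 2^sat - 1."""
    upper = (1 << sat) - 1
    result, saturated = 0, False
    for shift in (0, 16):
        half = (rn >> shift) & 0xFFFF
        # Interpret the halfword as a signed 16-bit value.
        signed = half - 0x10000 if half & 0x8000 else half
        clamped = min(max(signed, 0), upper)
        saturated |= clamped != signed  # Q is set if either halfword saturates
        result |= clamped << shift
    return result, saturated


# Top halfword 0x0100 exceeds 127; bottom halfword 0xFFFF is -1.
print(usat16(0x0100FFFF, 7))
```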
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Q flag
If saturation occurs on either halfword, this instruction sets the Q flag. To read the state of the Q flag, use
an MRS instruction.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Example
USAT16 r0, #7, r5
Related references
C2.62 MRS (PSR to general-purpose register) on page C2-204
C1.9 Condition code suffixes on page C1-92
C2.183 USAX
Unsigned parallel subtract and add halfwords with exchange.
Syntax
USAX{cond} {Rd}, Rn, Rm
where:
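cond
is an optional condition code.
Rd
is the destination register.
Rn, Rm
are the registers holding the operands.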
Operation
This instruction exchanges the two halfwords of the second operand, then performs a subtraction on the
two top halfwords of the operands and an addition on the bottom two halfwords. It writes the results into
the corresponding halfwords of the destination. The results are modulo 2^16. It sets the APSR GE flags.
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
GE flags
This instruction does not affect the N, Z, C, V, or Q flags.
It sets the GE flags in the APSR as follows:
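GE[1:0]
for bits[15:0] of the result.
GE[3:2]
for bits[31:16] of the result.
It sets GE[1:0] to 1 to indicate that the addition overflowed, generating a carry. This is equivalent to
an ADDS instruction setting the C condition flag to 1.
It sets GE[3:2] to 1 to indicate that the subtraction gave a result greater than or equal to zero, meaning a
borrow did not occur. This is equivalent to a SUBS instruction setting the C condition flag to 1.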
You can use these flags to control a following SEL instruction.
Note
GE[1:0] are set or cleared together, and GE[3:2] are set or cleared together.
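The result and the GE flag pairs can be modelled with the Python sketch below. It is an illustration only: the name usax is an assumption for this example, and the model assumes that GE[1:0] reflects the carry out of the bottom-halfword addition, in the same way that ADDS sets the C flag.

```python
def usax(rn: int, rm: int):
    """Model USAX: return (result, ge), where ge is a 4-bit GE[3:0] value."""
    # After the exchange, Rm's bottom halfword pairs with Rn's top halfword.
    diff = ((rn >> 16) & 0xFFFF) - (rm & 0xFFFF)
    total = (rn & 0xFFFF) + ((rm >> 16) & 0xFFFF)
    ge = 0
    if diff >= 0:          # no borrow in the top-halfword subtraction
        ge |= 0b1100
    if total >= 0x10000:   # carry out of the bottom-halfword addition (assumed)
        ge |= 0b0011
    result = ((diff & 0xFFFF) << 16) | (total & 0xFFFF)  # results are modulo 2^16
    return result, ge


print(usax(0x0005FFFF, 0x00020003))
```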
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Related references
C2.100 SEL on page C2-257
C1.9 Condition code suffixes on page C1-92
C2.184 USUB8
Unsigned parallel byte-wise subtraction.
Syntax
USUB8{cond} {Rd}, Rn, Rm
where:
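cond
is an optional condition code.
Rd
is the destination register.
Rn, Rm
are the registers holding the operands.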
Operation
This instruction subtracts each byte of the second operand from the corresponding byte of the first
operand and writes the results into the corresponding bytes of the destination. The results are modulo 2^8.
It sets the APSR GE flags.
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
GE flags
This instruction does not affect the N, Z, C, V, or Q flags.
It sets the GE flags in the APSR as follows:
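GE[0]
for bits[7:0] of the result.
GE[1]
for bits[15:8] of the result.
GE[2]
for bits[23:16] of the result.
GE[3]
for bits[31:24] of the result.
It sets a GE flag to 1 to indicate that the corresponding result is greater than or equal to zero,
meaning a borrow did not occur. This is equivalent to a SUBS instruction setting the C condition flag
to 1.
You can use these flags to control a following SEL instruction.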
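The following Python sketch models USUB8 with one GE bit per byte lane, and shows how a following SEL can use those bits. It is an illustration only. The names usub8, sel, and bytewise_max are assumptions for this example, and the SEL behavior modelled here (take each byte from the first operand when its GE bit is 1, otherwise from the second) is based on the Arm architecture rather than on this section.

```python
def usub8(rn: int, rm: int):
    """Model USUB8: return (result, ge) with one GE bit per byte lane."""
    result, ge = 0, 0
    for lane in range(4):
        shift = 8 * lane
        diff = ((rn >> shift) & 0xFF) - ((rm >> shift) & 0xFF)
        result |= (diff & 0xFF) << shift  # results are modulo 2^8
        if diff >= 0:                     # no borrow in this lane
            ge |= 1 << lane
    return result, ge


def sel(rn: int, rm: int, ge: int) -> int:
    """Model SEL (assumed semantics): pick each byte from rn if its GE bit is set."""
    result = 0
    for lane in range(4):
        source = rn if ge & (1 << lane) else rm
        result |= source & (0xFF << (8 * lane))
    return result


def bytewise_max(a: int, b: int) -> int:
    """Unsigned per-byte maximum built from USUB8 followed by SEL."""
    _, ge = usub8(a, b)
    return sel(a, b, ge)


print(hex(bytewise_max(0x10FF2003, 0x20013004)))
```

The subtraction result itself is discarded in bytewise_max; only the GE bits are used to steer the selection.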
Availability
This 32-bit instruction is available in A32 and T32.
There is no 16-bit version of this instruction in T32.
Related references
C2.100 SEL on page C2-257
C2.185 USUB16
Unsigned parallel halfword-wise subtraction.
Syntax
USUB16{cond} {Rd}, Rn, Rm
where:
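cond
is an optional condition code.
Rd
is the destination register.
Rn, Rm
are the registers holding the operands.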
Operation
This instruction subtracts each halfword of the second operand from the corresponding halfword of the
first operand and writes the results into the corresponding halfwords of the destination. The results are
modulo 2^16. It sets the APSR GE flags.
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
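GE flags
This instruction does not affect the N, Z, C, V, or Q flags.
It sets the GE flags in the APSR as follows:
GE[1:0]
for bits[15:0] of the result.
GE[3:2]
for bits[31:16] of the result.
It sets a pair of GE flags to 1 to indicate that the corresponding result is greater than or equal to
zero, meaning a borrow did not occur. This is equivalent to a SUBS instruction setting the C condition
flag to 1.
You can use these flags to control a following SEL instruction.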
Availability
This 32-bit instruction is available in A32 and T32.
There is no 16-bit version of this instruction in T32.
Related references
C2.100 SEL on page C2-257
C1.9 Condition code suffixes on page C1-92
C2.186 UXTAB
Zero extend Byte and Add.
Syntax
UXTAB{cond} {Rd}, Rn, Rm {,rotation}
where:
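cond
is an optional condition code.
Rd
is the destination register.
Rn
is the register holding the value to add.
Rm
is the register holding the value to extend.
rotation
is one of ROR #8, ROR #16, or ROR #24. If you omit rotation, no rotation is performed.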
Operation
UXTAB extends an 8-bit value to a 32-bit value. It does this by:
1. Rotating the value from Rm right by 0, 8, 16 or 24 bits.
2. Extracting bits[7:0] from the value obtained.
3. Zero extending to 32 bits.
4. Adding the value from Rn.
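The four steps above can be sketched in Python as follows. This is an illustrative model only. The names ror32 and uxtab are assumptions for this example, and the model assumes that the final addition wraps modulo 2^32.

```python
def ror32(value: int, amount: int) -> int:
    """Rotate a 32-bit value right by amount bits (0 <= amount < 32)."""
    value &= 0xFFFFFFFF
    return ((value >> amount) | (value << (32 - amount))) & 0xFFFFFFFF


def uxtab(rn: int, rm: int, rotation: int = 0) -> int:
    """Model UXTAB: rotate Rm, zero extend bits[7:0], then add Rn."""
    if rotation not in (0, 8, 16, 24):
        raise ValueError("rotation must be 0, 8, 16, or 24")
    byte = ror32(rm, rotation) & 0xFF      # steps 1 to 3
    return (rn + byte) & 0xFFFFFFFF        # step 4 (assumed to wrap to 32 bits)


print(hex(uxtab(0x100, 0x12345678, 8)))
```

The rotation selects which byte of Rm is extended, so a single register can supply any of its four bytes without a separate shift instruction.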
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not change the flags.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
C2.187 UXTAB16
Zero extend two Bytes and Add.
Syntax
UXTAB16{cond} {Rd}, Rn, Rm {,rotation}
where:
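cond
is an optional condition code.
Rd
is the destination register.
Rn
is the register holding the values to add.
Rm
is the register holding the values to extend.
rotation
is one of ROR #8, ROR #16, or ROR #24. If you omit rotation, no rotation is performed.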
Operation
UXTAB16 extends two 8-bit values to two 16-bit values. It does this by:
1. Rotating the value from Rm right by 0, 8, 16 or 24 bits.
2. Extracting bits[23:16] and bits[7:0] from the value obtained.
3. Zero extending them to 16 bits.
4. Adding them to bits[31:16] and bits[15:0] respectively of Rn to form bits[31:16] and bits[15:0] of the
result.
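A Python sketch of these steps is given below as an illustration. The names ror32 and uxtab16 are assumptions for this example, and the model assumes that each halfword addition wraps modulo 2^16.

```python
def ror32(value: int, amount: int) -> int:
    """Rotate a 32-bit value right by amount bits (0 <= amount < 32)."""
    value &= 0xFFFFFFFF
    return ((value >> amount) | (value << (32 - amount))) & 0xFFFFFFFF


def uxtab16(rn: int, rm: int, rotation: int = 0) -> int:
    """Model UXTAB16: extend bits[23:16] and bits[7:0] of rotated Rm, add to Rn halfwords."""
    rotated = ror32(rm, rotation)
    low = ((rn & 0xFFFF) + (rotated & 0xFF)) & 0xFFFF
    high = (((rn >> 16) & 0xFFFF) + ((rotated >> 16) & 0xFF)) & 0xFFFF
    return (high << 16) | low


# Models an instruction of the form UXTAB16 Rd, Rn, Rm, ROR #16.
print(hex(uxtab16(0x00010002, 0xAABBCCDD, 16)))
```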
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not change the flags.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
Example
UXTAB16EQ r0, r0, r4, ROR #16
Related references
C1.9 Condition code suffixes on page C1-92
C2.188 UXTAH
Zero extend Halfword and Add.
Syntax
UXTAH{cond} {Rd}, Rn, Rm {,rotation}
where:
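cond
is an optional condition code.
Rd
is the destination register.
Rn
is the register holding the value to add.
Rm
is the register holding the value to extend.
rotation
is one of ROR #8, ROR #16, or ROR #24. If you omit rotation, no rotation is performed.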
Operation
UXTAH extends a 16-bit value to a 32-bit value. It does this by:
1. Rotating the value from Rm right by 0, 8, 16 or 24 bits.
2. Extracting bits[15:0] from the value obtained.
3. Zero extending to 32 bits.
4. Adding the value from Rn.
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not change the flags.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
C2.189 UXTB
Zero extend Byte.
Syntax
UXTB{cond} {Rd}, Rm {,rotation}
where:
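cond
is an optional condition code.
Rd
is the destination register.
Rm
is the register holding the value to extend.
rotation
is one of ROR #8, ROR #16, or ROR #24. If you omit rotation, no rotation is performed.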
Operation
UXTB extends an 8-bit value to a 32-bit value. It does this by:
1. Rotating the value from Rm right by 0, 8, 16, or 24 bits.
2. Extracting bits[7:0] from the value obtained.
3. Zero extending to 32 bits.
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not change the flags.
16-bit instruction
The following form of this instruction is available in T32 code, and is a 16-bit instruction:
UXTB Rd, Rm
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
The 16-bit instruction is available in T32.
Related references
C1.9 Condition code suffixes on page C1-92
C2.190 UXTB16
Zero extend two Bytes.
Syntax
UXTB16{cond} {Rd}, Rm {,rotation}
where:
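cond
is an optional condition code.
Rd
is the destination register.
Rm
is the register holding the values to extend.
rotation
is one of ROR #8, ROR #16, or ROR #24. If you omit rotation, no rotation is performed.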
Operation
UXTB16 extends two 8-bit values to two 16-bit values. It does this by:
1. Rotating the value from Rm right by 0, 8, 16 or 24 bits.
2. Extracting bits[23:16] and bits[7:0] from the value obtained.
3. Zero extending each to 16 bits.
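The steps above reduce to a rotate followed by a mask, as the following Python sketch shows. It is an illustration only; the names ror32 and uxtb16 are assumptions for this example.

```python
def ror32(value: int, amount: int) -> int:
    """Rotate a 32-bit value right by amount bits (0 <= amount < 32)."""
    value &= 0xFFFFFFFF
    return ((value >> amount) | (value << (32 - amount))) & 0xFFFFFFFF


def uxtb16(rm: int, rotation: int = 0) -> int:
    """Model UXTB16: keep bits[23:16] and bits[7:0] of rotated Rm, zero the rest."""
    # Masking with 0x00FF00FF zero extends both bytes to 16 bits in place.
    return ror32(rm, rotation) & 0x00FF00FF


print(hex(uxtb16(0xAABBCCDD, 8)))
```

Using no rotation and then ROR #8 on the same register unpacks all four bytes into two registers of halfwords.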
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not change the flags.
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
There is no 16-bit version of this instruction in T32.
Related references
C1.9 Condition code suffixes on page C1-92
C2.191 UXTH
Zero extend Halfword.
Syntax
UXTH{cond} {Rd}, Rm {,rotation}
where:
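cond
is an optional condition code.
Rd
is the destination register.
Rm
is the register holding the value to extend.
rotation
is one of ROR #8, ROR #16, or ROR #24. If you omit rotation, no rotation is performed.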
Operation
UXTH extends a 16-bit value to a 32-bit value. It does this by:
1. Rotating the value from Rm right by 0, 8, 16, or 24 bits.
2. Extracting bits[15:0] from the value obtained.
3. Zero extending to 32 bits.
Register restrictions
You cannot use PC for any operand.
You can use SP in A32 instructions but this is deprecated. You cannot use SP in T32 instructions.
Condition flags
This instruction does not change the flags.
16-bit instructions
The following form of this instruction is available in T32 code, and is a 16-bit instruction:
UXTH Rd, Rm
Availability
The 32-bit instruction is available in A32 and T32.
For the Armv7‑M architecture, the 32-bit T32 instruction is only available in an Armv7E-M
implementation.
The 16-bit instruction is available in T32.
Related references
C1.9 Condition code suffixes on page C1-92
C2.192 WFE
Wait For Event.
Syntax
WFE{cond}
where:
cond
is an optional condition code.
Operation
This is a hint instruction. It is optional whether this instruction is implemented or not. If this instruction
is not implemented, it executes as a NOP. The assembler produces a diagnostic message if the instruction
executes as a NOP on the target.
If the Event Register is not set, WFE suspends execution until one of the following events occurs:
• An IRQ interrupt, unless masked by the CPSR I-bit.
• An FIQ interrupt, unless masked by the CPSR F-bit.
• An Imprecise Data abort, unless masked by the CPSR A-bit.
• A Debug Entry request, if Debug is enabled.
• An Event signaled by another processor using the SEV instruction, or by the current processor using
the SEVL instruction.
If the Event Register is set, WFE clears it and returns immediately.
If WFE is implemented, SEV must also be implemented.
Availability
This instruction is available in A32 and T32.
Related references
C2.68 NOP on page C2-213
C1.9 Condition code suffixes on page C1-92
C2.103 SEV on page C2-261
C2.104 SEVL on page C2-262
C2.193 WFI on page C2-385
C2.193 WFI
Wait for Interrupt.
Syntax
WFI{cond}
where:
cond
is an optional condition code.
Operation
This is a hint instruction. It is optional whether this instruction is implemented or not. If this instruction
is not implemented, it executes as a NOP. The assembler produces a diagnostic message if the instruction
executes as a NOP on the target.
WFI suspends execution until one of the following events occurs:
• An IRQ interrupt, regardless of the CPSR I-bit.
• An FIQ interrupt, regardless of the CPSR F-bit.
• An Imprecise Data abort, unless masked by the CPSR A-bit.
• A Debug Entry request, regardless of whether Debug is enabled.
Availability
This instruction is available in A32 and T32.
Related references
C2.68 NOP on page C2-213
C1.9 Condition code suffixes on page C1-92
C2.192 WFE on page C2-384
C2.194 YIELD
Yield.
Syntax
YIELD{cond}
where:
cond
is an optional condition code.
Operation
This is a hint instruction. It is optional whether this instruction is implemented or not. If this instruction
is not implemented, it executes as a NOP. The assembler produces a diagnostic message if the instruction
executes as a NOP on the target.
YIELD indicates to the hardware that the current thread is performing a task, for example a spinlock, that
can be swapped out. Hardware can use this hint to suspend and resume threads in a multithreading
system.
Availability
This instruction is available in A32 and T32.
Related references
C2.68 NOP on page C2-213
C1.9 Condition code suffixes on page C1-92
Chapter C3
Advanced SIMD Instructions (32-bit)
C3.1 Summary of Advanced SIMD instructions
Mnemonic              Brief description
VACLE, VACLT          Absolute Compare Less than or Equal, Less Than (pseudo-instructions)
VADD                  Add
VADDHN                Add, select High half
VAND                  Bitwise AND
VAND                  Bitwise AND (pseudo-instruction)
VBIC                  Bitwise Bit Clear (register)
VBIC                  Bitwise Bit Clear (immediate)
VCEQ, VCLE, VCLT      Compare Equal, Less than or Equal, Compare Less Than
VCLE, VCLT            Compare Less than or Equal, Compare Less Than (pseudo-instruction)
VCLS, VCLZ, VCNT      Count Leading Sign bits, Count Leading Zeros, and Count set bits
VSDOT (by element)    Dot Product index form with signed integers
VUDOT (by element)    Dot Product index form with unsigned integers
C3.2 Summary of shared Advanced SIMD and floating-point instructions
Mnemonic    Brief description
VMOV        Transfer from two general-purpose registers to either one double-precision or two single-precision registers
VMOV        Transfer from either one double-precision or two single-precision registers to two general-purpose registers
VMRS        Transfer from a SIMD and floating-point system register to a general-purpose register
VMSR        Transfer from a general-purpose register to a SIMD and floating-point system register
VPOP        Pop floating-point or SIMD registers from full-descending stack
VPUSH       Push floating-point or SIMD registers to full-descending stack
VSTM        Store multiple
VSTR        Store
Related references
C3.54 VLDM on page C3-449
C3.55 VLDR on page C3-450
C3.56 VLDR (post-increment and pre-decrement) on page C3-451
C3.57 VLDR pseudo-instruction on page C3-452
C3.70 VMOV (between two general-purpose registers and a 64-bit extension register) on page C3-465
C3.71 VMOV (between a general-purpose register and an Advanced SIMD scalar) on page C3-466
C3.75 VMRS on page C3-470
C3.76 VMSR on page C3-471
C3.92 VPOP on page C3-487
C3.93 VPUSH on page C3-488
C3.131 VSTM on page C3-526
C3.134 VSTR on page C3-531
C3.135 VSTR (post-increment and pre-decrement) on page C3-532
C3.3 Interleaving provided by load and store element and structure instructions
Many instructions in this group provide interleaving when structures are stored to memory, and
de-interleaving when structures are loaded from memory.
The following figure shows an example of de-interleaving. Interleaving is the inverse process.
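[Figure: De-interleaving an array of 3-element structures. Memory holds A[0].x, A[0].y, A[0].z,
A[1].x, A[1].y, A[1].z, A[2].x, A[2].y, A[2].z, A[3].x, A[3].y, A[3].z in that order. After the load,
D0 holds X3 X2 X1 X0, D1 holds Y3 Y2 Y1 Y0, and D2 holds Z3 Z2 Z1 Z0.]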
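The de-interleaving pattern in the figure can be sketched in Python with list slicing. This is an illustration of the data movement only, not of any particular instruction. The names deinterleave3 and interleave3 are assumptions for this example, and list index 0 corresponds to lane 0 of each register.

```python
def deinterleave3(memory):
    """Split [x0, y0, z0, x1, y1, z1, ...] into three per-member lane lists."""
    return memory[0::3], memory[1::3], memory[2::3]


def interleave3(d0, d1, d2):
    """Inverse operation: merge three lane lists back into structure order."""
    memory = []
    for x, y, z in zip(d0, d1, d2):
        memory.extend((x, y, z))
    return memory


# Four 3-element structures laid out consecutively, as in the figure.
memory = [f"A[{i}].{member}" for i in range(4) for member in "xyz"]
d0, d1, d2 = deinterleave3(memory)
print(d0)  # all x members
print(d1)  # all y members
print(d2)  # all z members
```

After de-interleaving, each register holds one member of every structure, so a single vector operation can process all x values, all y values, or all z values at once.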
C3.4 Alignment restrictions in load and store element and structure instructions
Many of these instructions allow you to specify memory alignment restrictions.
When the alignment is not specified in the instruction, the alignment restriction is controlled by the A bit
(SCTLR bit[1]):
• If the A bit is 0, there are no alignment restrictions (except for strongly-ordered or device memory,
where accesses must be element-aligned).
• If the A bit is 1, accesses must be element-aligned.
If an address is not correctly aligned, an alignment fault occurs.
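The rules for the case where the instruction does not specify an alignment can be summarized in a small Python predicate. This is a sketch under stated assumptions: the function name alignment_fault and its parameters are hypothetical, and it covers only the default case described above, not alignments specified in the instruction.

```python
def alignment_fault(address: int, element_bytes: int, a_bit: int,
                    device_or_strongly_ordered: bool = False) -> bool:
    """Return True if an access with no specified alignment would fault."""
    # Element alignment is required when SCTLR.A is 1, or when the memory
    # is strongly-ordered or device memory regardless of the A bit.
    must_be_aligned = a_bit == 1 or device_or_strongly_ordered
    return must_be_aligned and address % element_bytes != 0


# A 32-bit element at an address that is only halfword-aligned.
print(alignment_fault(0x1002, 4, a_bit=0))
print(alignment_fault(0x1002, 4, a_bit=1))
```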
Related concepts
C3.3 Interleaving provided by load and store element and structure instructions on page C3-395
Related references
C3.51 VLDn (single n-element structure to one lane) on page C3-443
C3.52 VLDn (single n-element structure to all lanes) on page C3-445
C3.53 VLDn (multiple n-element structures) on page C3-447
C3.132 VSTn (multiple n-element structures) on page C3-527
C3.133 VSTn (single n-element structure to one lane) on page C3-529
Related information
Arm Architecture Reference Manual
C3.5 FLDMDBX, FLDMIAX
Syntax
FLDMDBX{c}{q} Rn!, dreglist ; A1 Decrement Before FP/SIMD registers (A32)
Where:
c
Is an optional condition code. See Chapter C1 Condition Codes on page C1-83.
q
Is an optional instruction width specifier. See C2.2 Instruction width specifiers on page C2-111.
Rn
Is the general-purpose base register. If writeback is not specified, the PC can be used.
!
Specifies base register writeback.
dreglist
Is the list of consecutively numbered 64-bit SIMD and FP registers to be transferred. The list
must contain at least one register, all registers must be in the range D0-D15, and must not
contain more than 16 registers.
Usage
FLDMDBX and FLDMIAX load multiple SIMD and FP registers in the Advanced SIMD and floating-point
register file from consecutive memory locations, using an address from a general-purpose register.
Arm deprecates use of FLDMDBX and FLDMIAX, except for disassembly purposes, and reassembly of
disassembled code.
Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to
Hyp mode. For more information see Enabling Advanced SIMD and floating-point support in the Arm®
Architecture Reference Manual Arm®v8, for Arm®v8‑A architecture profile.
Note
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Architectural
Constraints on UNPREDICTABLE behaviors in the Arm® Architecture Reference Manual Arm®v8, for
Arm®v8‑A architecture profile.
Related references
C3.1 Summary of Advanced SIMD instructions on page C3-391
C3.6 FSTMDBX, FSTMIAX
Syntax
FSTMDBX{c}{q} Rn!, dreglist ; A1 Decrement Before FP/SIMD registers (A32)
Where:
c
Is an optional condition code. See Chapter C1 Condition Codes on page C1-83.
q
Is an optional instruction width specifier. See C2.2 Instruction width specifiers on page C2-111.
Rn
Is the general-purpose base register. If writeback is not specified, the PC can be used. However,
Arm deprecates use of the PC.
!
Specifies base register writeback.
dreglist
Is the list of consecutively numbered 64-bit SIMD and FP registers to be transferred. The list
must contain at least one register, all registers must be in the range D0-D15, and must not
contain more than 16 registers.
Usage
FSTMDBX and FSTMIAX store multiple SIMD and FP registers from the Advanced SIMD and floating-point
register file to consecutive memory locations, using an address from a general-purpose register.
Arm deprecates use of FSTMDBX and FSTMIAX, except for disassembly purposes, and reassembly of
disassembled code.
Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and
mode in which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or
trapped to Hyp mode. For more information see Enabling Advanced SIMD and floating-point support in
the Arm® Architecture Reference Manual Arm®v8, for Arm®v8‑A architecture profile.
Note
For more information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Architectural
Constraints on UNPREDICTABLE behaviors in the Arm® Architecture Reference Manual Arm®v8, for
Arm®v8‑A architecture profile.
Related references
C3.1 Summary of Advanced SIMD instructions on page C3-391
C3.7 VABA and VABAL
Syntax
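VABA{cond}.datatype {Qd}, Qn, Qm
VABA{cond}.datatype {Dd}, Dn, Dm
VABAL{cond}.datatype Qd, Dn, Dm
where:
cond
is an optional condition code.
Qd, Qn, Qm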
are the destination vector, the first operand vector, and the second operand vector, for a
quadword operation.
Dd, Dn, Dm
are the destination vector, the first operand vector, and the second operand vector, for a
doubleword operation.
Qd, Dn, Dm
are the destination vector, the first operand vector, and the second operand vector, for a long
operation.
Operation
VABA subtracts the elements of one vector from the corresponding elements of another vector, and
accumulates the absolute values of the results into the elements of the destination vector.
VABAL is the long version of the VABA instruction.
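The accumulate step can be modelled in a few lines of Python. This is an illustrative sketch only, not Arm code: the function names vaba and vabal are hypothetical, lanes are assumed to be unsigned, and the accumulator is assumed to wrap modulo the lane width.

```python
def vaba(acc, n, m, bits=8):
    """Sketch of VABA: acc[i] + |n[i] - m[i]|, wrapped to `bits` (assumed unsigned lanes)."""
    mask = (1 << bits) - 1
    return [(a + abs(x - y)) & mask for a, x, y in zip(acc, n, m)]


def vabal(acc, n, m, bits=8):
    """Sketch of VABAL: sources are `bits` wide, accumulator lanes are 2 * bits wide."""
    mask = (1 << (2 * bits)) - 1
    return [(a + abs(x - y)) & mask for a, x, y in zip(acc, n, m)]


print(vaba([250, 0], [10, 3], [0, 7]))   # 8-bit accumulator lanes can wrap
print(vabal([250, 0], [10, 3], [0, 7]))  # 16-bit accumulator lanes hold the full sum
```

In this model the long form differs only in the width of the accumulator lanes, which is why it can absorb many more accumulations before wrapping.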
Related references
C1.9 Condition code suffixes on page C1-92
C3.8 VABD and VABDL
Syntax
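VABD{cond}.datatype {Qd}, Qn, Qm
VABD{cond}.datatype {Dd}, Dn, Dm
VABDL{cond}.datatype Qd, Dn, Dm
where:
cond
is an optional condition code.
Qd, Qn, Qm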
are the destination vector, the first operand vector, and the second operand vector, for a
quadword operation.
Dd, Dn, Dm
are the destination vector, the first operand vector, and the second operand vector, for a
doubleword operation.
Qd, Dn, Dm
are the destination vector, the first operand vector, and the second operand vector, for a long
operation.
Operation
VABD subtracts the elements of one vector from the corresponding elements of another vector, and places
the absolute values of the results into the elements of the destination vector.
VABDL is the long version of the VABD instruction.
Related references
C1.9 Condition code suffixes on page C1-92
C3.9 VABS
Vector Absolute
Syntax
VABS{cond}.datatype Qd, Qm
VABS{cond}.datatype Dd, Dm
where:
cond
is an optional condition code.
Qd, Qm
are the destination vector and the operand vector, for a quadword operation.
Dd, Dm
are the destination vector and the operand vector, for a doubleword operation.
Operation
VABS takes the absolute value of each element in a vector, and places the results in a second vector. (The
floating-point version only clears the sign bit.)
Related references
C3.94 VQABS on page C3-489
C1.9 Condition code suffixes on page C1-92
C3.10 VACLE, VACLT, VACGE and VACGT
Syntax
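VACop{cond}.F32 {Qd}, Qn, Qm
VACop{cond}.F32 {Dd}, Dn, Dm
where:
op
must be one of GE, GT, LE, or LT.
cond
is an optional condition code.
Qd, Qn, Qm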
specifies the destination register, the first operand register, and the second operand register, for a
quadword operation.
Dd, Dn, Dm
specifies the destination register, the first operand register, and the second operand register, for a
doubleword operation.
The result datatype is I32.
Operation
These instructions take the absolute value of each element in a vector, and compare it with the absolute
value of the corresponding element of a second vector. If the condition is true, the corresponding element
in the destination vector is set to all ones. Otherwise, it is set to all zeros.
Note
On disassembly, the VACLE and VACLT pseudo-instructions are disassembled to the corresponding VACGE
and VACGT instructions, with the operands reversed.
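The relationship between the real instructions and the pseudo-instructions can be sketched in Python as follows. The function names are hypothetical, and the model works on lists of Python floats rather than on F32 lanes.

```python
ALL_ONES = 0xFFFFFFFF  # each result lane is a 32-bit mask (result datatype I32)


def vacge(n, m):
    """Sketch of VACGE: mask is all ones where |n[i]| >= |m[i]|."""
    return [ALL_ONES if abs(x) >= abs(y) else 0 for x, y in zip(n, m)]


def vacgt(n, m):
    """Sketch of VACGT: mask is all ones where |n[i]| > |m[i]|."""
    return [ALL_ONES if abs(x) > abs(y) else 0 for x, y in zip(n, m)]


def vacle(n, m):
    """VACLE as a pseudo-instruction: VACGE with the operands reversed."""
    return vacge(m, n)


def vaclt(n, m):
    """VACLT as a pseudo-instruction: VACGT with the operands reversed."""
    return vacgt(m, n)


print([hex(lane) for lane in vacle([1.0, -3.0], [-2.0, 2.0])])
```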
Related references
C1.9 Condition code suffixes on page C1-92
C3.11 VADD
Vector Add.
Syntax
VADD{cond}.datatype {Qd}, Qn, Qm
VADD{cond}.datatype {Dd}, Dn, Dm
where:
cond
is an optional condition code.
Qd, Qn, Qm
are the destination vector, the first operand vector, and the second operand vector, for a
quadword operation.
Dd, Dn, Dm
are the destination vector, the first operand vector, and the second operand vector, for a
doubleword operation.
Operation
VADD adds corresponding elements in two vectors, and places the results in the destination vector.
Related references
C3.13 VADDL and VADDW on page C3-405
C3.95 VQADD on page C3-490
C1.9 Condition code suffixes on page C1-92
C3.12 VADDHN
Vector Add and Narrow, selecting High half.
Syntax
VADDHN{cond}.datatype Dd, Qn, Qm
where:
cond
is an optional condition code.
Dd, Qn, Qm
are the destination vector, the first operand vector, and the second operand vector.
Operation
VADDHN adds corresponding elements in two vectors, selects the most significant halves of the results, and
places the final results in the destination vector. Results are truncated.
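As a rough illustration of the narrowing step, the following Python sketch adds two lists of lanes and keeps only the upper half of each sum. The function name vaddhn and the bits parameter (the width of each source lane) are hypothetical, and lanes are assumed to be unsigned bit patterns.

```python
def vaddhn(n, m, bits=16):
    """Sketch of VADDHN: add `bits`-wide lanes, keep the most significant half, truncating."""
    mask = (1 << bits) - 1
    half = bits // 2
    return [((x + y) & mask) >> half for x, y in zip(n, m)]


# 0x1234 + 0x00FF = 0x1333, so the high byte is 0x13.
# 0xFFFF + 0x0001 wraps to 0x0000, so the high byte is 0x00.
print([hex(lane) for lane in vaddhn([0x1234, 0xFFFF], [0x00FF, 0x0001])])
```

The low half of each sum is discarded without rounding, which is what "truncated" means here.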
Related references
C3.108 VRADDHN on page C3-503
C1.9 Condition code suffixes on page C1-92
C3.13 VADDL and VADDW
Syntax
VADDL{cond}.datatype Qd, Dn, Dm ; Long operation
VADDW{cond}.datatype {Qd}, Qn, Dm ; Wide operation
where:
cond
is an optional condition code.
Qd, Dn, Dm
are the destination vector, the first operand vector, and the second operand vector, for a long
operation.
Qd, Qn, Dm
are the destination vector, the first operand vector, and the second operand vector, for a wide
operation.
Operation
VADDL adds corresponding elements in two doubleword vectors, and places the results in the destination
quadword vector.
VADDW adds corresponding elements in one quadword and one doubleword vector, and places the results
in the destination quadword vector.
Related references
C3.11 VADD on page C3-403
C1.9 Condition code suffixes on page C1-92
C3.14 VAND (immediate)
Syntax
VAND{cond}.datatype Qd, #imm
where:
cond
is an optional condition code.
Operation
VAND takes each element of the destination vector, performs a bitwise AND with an immediate value, and
returns the result into the destination vector.
Note
On disassembly, this pseudo-instruction is disassembled to a corresponding VBIC instruction, with the
complementary immediate value.
Immediate values
If datatype is I16, the immediate value must have one of the following forms:
• 0xFFXY.
• 0xXYFF.
If datatype is I32, the immediate value must have one of the following forms:
• 0xFFFFFFXY.
• 0xFFFFXYFF.
• 0xFFXYFFFF.
• 0xXYFFFFFF.
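The complementary immediate mentioned in the note above can be computed as in the following Python sketch. The helper name is hypothetical, and the model treats one element at a time.

```python
def vand_imm_to_vbic_imm(imm, bits):
    """Return the complementary immediate that an equivalent VBIC would use."""
    mask = (1 << bits) - 1
    return ~imm & mask


imm = 0xFF12                           # I16 form 0xFFXY with XY = 0x12
bic = vand_imm_to_vbic_imm(imm, 16)    # complementary value has the form 0x00XY
element = 0xABCD

# ANDing with imm gives the same element as clearing the bits set in bic.
assert element & imm == element & ~bic & 0xFFFF
print(hex(bic))
```

Each permitted VAND form has all ones outside a single byte, so its complement has all zeros outside a single byte.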
Related references
C3.16 VBIC (immediate) on page C3-408
C1.9 Condition code suffixes on page C1-92
C3.15 VAND (register)
Syntax
VAND{cond}{.datatype} {Qd}, Qn, Qm
VAND{cond}{.datatype} {Dd}, Dn, Dm
where:
cond
is an optional condition code.
Qd, Qn, Qm
specifies the destination register, the first operand register, and the second operand register, for a
quadword operation.
Dd, Dn, Dm
specifies the destination register, the first operand register, and the second operand register, for a
doubleword operation.
Operation
VAND performs a bitwise logical AND between two registers, and places the result in the destination
register.
Related references
C1.9 Condition code suffixes on page C1-92
C3.16 VBIC (immediate)
Syntax
VBIC{cond}.datatype Qd, #imm
where:
cond
is an optional condition code.
Operation
VBIC takes each element of the destination vector, performs a bitwise AND complement with an
immediate value, and returns the result in the destination vector.
Immediate values
You can either specify imm as a pattern which the assembler repeats to fill the destination register, or you
can directly specify the immediate value (that conforms to the pattern) in full. The pattern for imm
depends on datatype as shown in the following table:
I16       I32
0x00XY    0x000000XY
0xXY00    0x0000XY00
          0x00XY0000
          0xXY000000
If you use the I8 or I64 datatypes, the assembler converts it to either the I16 or I32 instruction to match
the pattern of imm. If the immediate value does not match any of the patterns in the preceding table, the
assembler generates an error.
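A pattern check of this kind might look like the following Python sketch. It is not the assembler's implementation: the function name is hypothetical, and it only tests a single I16 or I32 element against the table.

```python
def matches_vbic_pattern(imm, bits):
    """True if imm fits in `bits` and has at most one non-zero byte (0x00XY, 0xXY00, ...)."""
    if not 0 <= imm < (1 << bits):
        return False
    nonzero_bytes = [i for i in range(bits // 8) if (imm >> (8 * i)) & 0xFF]
    return len(nonzero_bytes) <= 1


print(matches_vbic_pattern(0xAB00, 16))       # one non-zero byte
print(matches_vbic_pattern(0x0AB0, 16))       # straddles two bytes
print(matches_vbic_pattern(0x00AB00CD, 32))   # two non-zero bytes
```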
Related references
C3.14 VAND (immediate) on page C3-406
C1.9 Condition code suffixes on page C1-92
C3.17 VBIC (register)
Syntax
VBIC{cond}{.datatype} {Qd}, Qn, Qm
VBIC{cond}{.datatype} {Dd}, Dn, Dm
where:
cond
is an optional condition code.
Qd, Qn, Qm
specifies the destination register, the first operand register, and the second operand register, for a
quadword operation.
Dd, Dn, Dm
specifies the destination register, the first operand register, and the second operand register, for a
doubleword operation.
Operation
VBIC performs a bitwise logical AND complement between two registers, and places the result in the
destination register.
Related references
C1.9 Condition code suffixes on page C1-92
C3.18 VBIF
Vector Bitwise Insert if False.
Syntax
VBIF{cond}{.datatype} {Qd}, Qn, Qm
VBIF{cond}{.datatype} {Dd}, Dn, Dm
where:
cond
is an optional condition code.
Qd, Qn, Qm
specifies the destination register, the first operand register, and the second operand register, for a
quadword operation.
Dd, Dn, Dm
specifies the destination register, the first operand register, and the second operand register, for a
doubleword operation.
Operation
VBIF inserts each bit from the first operand into the destination if the corresponding bit of the second
operand is 0, otherwise it leaves the destination bit unchanged.
Related references
C1.9 Condition code suffixes on page C1-92
C3.19 VBIT
Vector Bitwise Insert if True.
Syntax
VBIT{cond}{.datatype} {Qd}, Qn, Qm
VBIT{cond}{.datatype} {Dd}, Dn, Dm
where:
cond
is an optional condition code.
Qd, Qn, Qm
specifies the destination register, the first operand register, and the second operand register, for a
quadword operation.
Dd, Dn, Dm
specifies the destination register, the first operand register, and the second operand register, for a
doubleword operation.
Operation
VBIT inserts each bit from the first operand into the destination if the corresponding bit of the second
operand is 1, otherwise it leaves the destination bit unchanged.
Related references
C1.9 Condition code suffixes on page C1-92
C3.20 VBSL
Vector Bitwise Select.
Syntax
VBSL{cond}{.datatype} {Qd}, Qn, Qm
VBSL{cond}{.datatype} {Dd}, Dn, Dm
where:
cond
is an optional condition code.
Qd, Qn, Qm
specifies the destination register, the first operand register, and the second operand register, for a
quadword operation.
Dd, Dn, Dm
specifies the destination register, the first operand register, and the second operand register, for a
doubleword operation.
Operation
VBSL selects each bit for the destination from the first operand if the corresponding bit of the destination
is 1, or from the second operand if the corresponding bit of the destination is 0.
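VBSL, VBIT, and VBIF differ only in which register supplies the selection mask. The following Python sketch models each one on a whole register held as an integer. The function names and the bits parameter are hypothetical, and d is the value of the destination register before the instruction executes.

```python
def vbsl(d, n, m, bits=64):
    """Sketch of VBSL: the old destination d selects between n (bit 1) and m (bit 0)."""
    mask = (1 << bits) - 1
    return ((n & d) | (m & ~d)) & mask


def vbit(d, n, m, bits=64):
    """Sketch of VBIT: insert bits of n into d where m is 1, else keep d."""
    mask = (1 << bits) - 1
    return ((n & m) | (d & ~m)) & mask


def vbif(d, n, m, bits=64):
    """Sketch of VBIF: insert bits of n into d where m is 0, else keep d."""
    mask = (1 << bits) - 1
    return ((n & ~m) | (d & m)) & mask


d, n, m = 0b11110000, 0b10101010, 0b01010101
print(bin(vbsl(d, n, m, bits=8)))
print(bin(vbit(d, n, m, bits=8)))
print(bin(vbif(d, n, m, bits=8)))
```

A common pairing is to produce the mask with one of the vector compare instructions, which write all ones or all zeros to each element.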
Related references
C1.9 Condition code suffixes on page C1-92
C3.21 VCADD
Vector Complex Add.
Syntax
VCADD{q}.dt {Dd,} Dn, Dm, #rotate ; A1 64-bit SIMD vector FP/SIMD registers (A32)
VCADD{q}.dt {Qd,} Qn, Qm, #rotate ; A1 128-bit SIMD vector FP/SIMD registers (A32)
Where:
Dd
Is the 64-bit name of the SIMD and FP destination register.
Dn
Is the 64-bit name of the first SIMD and FP source register.
Dm
Is the 64-bit name of the second SIMD and FP source register.
Qd
Is the 128-bit name of the SIMD and FP destination register.
Qn
Is the 128-bit name of the first SIMD and FP source register.
Qm
Is the 128-bit name of the second SIMD and FP source register.
q
Is an optional instruction width specifier. See C2.2 Instruction width specifiers on page C2-111.
dt
Is the data type for the elements of the vectors, and can be either F16 or F32.
rotate
Is the rotation to be applied to elements in the second SIMD and FP source register, and can be
either 90 or 270.
Architectures supported
Supported in the Armv8.3-A architecture and later.
Usage
Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to
Hyp mode. For more information see Enabling Advanced SIMD and floating-point support in the Arm®
Architecture Reference Manual Arm®v8, for Arm®v8‑A architecture profile.
Related references
C3.1 Summary of Advanced SIMD instructions on page C3-391
C3.22 VCEQ (immediate #0)
Syntax
VCEQ{cond}.datatype {Qd}, Qn, #0
VCEQ{cond}.datatype {Dd}, Dn, #0
where:
cond
is an optional condition code.
Qd, Qn
specifies the destination register and the operand register, for a quadword operation.
Dd, Dn
specifies the destination register and the operand register, for a doubleword operation.
#0
specifies a comparison with zero.
Operation
VCEQ takes the value of each element in a vector, and compares it with zero. If the condition is true, the
corresponding element in the destination vector is set to all ones. Otherwise, it is set to all zeros.
Related references
C1.9 Condition code suffixes on page C1-92
C3.23 VCEQ (register)
Syntax
VCEQ{cond}.datatype {Qd}, Qn, Qm
VCEQ{cond}.datatype {Dd}, Dn, Dm
where:
cond
is an optional condition code.
Qd, Qn, Qm
specifies the destination register, the first operand register, and the second operand register, for a
quadword operation.
Dd, Dn, Dm
specifies the destination register, the first operand register, and the second operand register, for a
doubleword operation.
Operation
VCEQ takes the value of each element in a vector, and compares it with the value of the corresponding
element of a second vector. If the condition is true, the corresponding element in the destination vector is
set to all ones. Otherwise, it is set to all zeros.
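The all-ones or all-zeros result can be illustrated with a short Python sketch. The function name vceq and the bits parameter (the element width) are hypothetical.

```python
def vceq(n, m, bits=8):
    """Sketch of VCEQ: each result lane is all ones if the lanes are equal, else zero."""
    ones = (1 << bits) - 1
    return [ones if x == y else 0 for x, y in zip(n, m)]


print([hex(lane) for lane in vceq([1, 2, 3], [1, 0, 3])])
print([hex(lane) for lane in vceq([7], [7], bits=16)])
```

Because each lane is a full-width mask rather than a single flag bit, the result can be used directly as an operand of the bitwise instructions.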
Related references
C1.9 Condition code suffixes on page C1-92
C3.24 VCGE (immediate #0)
Syntax
VCGE{cond}.datatype {Qd}, Qn, #0
VCGE{cond}.datatype {Dd}, Dn, #0
where:
cond
is an optional condition code.
Qd, Qn
specifies the destination register and the operand register, for a quadword operation.
Dd, Dn
specifies the destination register and the operand register, for a doubleword operation.
#0
specifies a comparison with zero.
Operation
VCGE takes the value of each element in a vector, and compares it with zero. If the condition is true, the
corresponding element in the destination vector is set to all ones. Otherwise, it is set to all zeros.
Related references
C1.9 Condition code suffixes on page C1-92
C3.25 VCGE (register)
Syntax
VCGE{cond}.datatype {Qd}, Qn, Qm
VCGE{cond}.datatype {Dd}, Dn, Dm
where:
cond
is an optional condition code.
Qd, Qn, Qm
specifies the destination register, the first operand register, and the second operand register, for a
quadword operation.
Dd, Dn, Dm
specifies the destination register, the first operand register, and the second operand register, for a
doubleword operation.
Operation
VCGE takes the value of each element in a vector, and compares it with the value of the corresponding
element of a second vector. If the condition is true, the corresponding element in the destination vector is
set to all ones. Otherwise, it is set to all zeros.
Related references
C1.9 Condition code suffixes on page C1-92
C3.26 VCGT (immediate #0)
Syntax
VCGT{cond}.datatype {Qd}, Qn, #0
VCGT{cond}.datatype {Dd}, Dn, #0
where:
cond
is an optional condition code.
Qd, Qn
specifies the destination register and the operand register, for a quadword operation.
Dd, Dn
specifies the destination register and the operand register, for a doubleword operation.
#0
specifies a comparison with zero.
Operation
VCGT takes the value of each element in a vector, and compares it with zero. If the condition is true, the
corresponding element in the destination vector is set to all ones. Otherwise, it is set to all zeros.
Related references
C1.9 Condition code suffixes on page C1-92
C3.27 VCGT (register)
Syntax
VCGT{cond}.datatype {Qd}, Qn, Qm
VCGT{cond}.datatype {Dd}, Dn, Dm
where:
cond
is an optional condition code.
Qd, Qn, Qm
specifies the destination register, the first operand register, and the second operand register, for a
quadword operation.
Dd, Dn, Dm
specifies the destination register, the first operand register, and the second operand register, for a
doubleword operation.
Operation
VCGT takes the value of each element in a vector, and compares it with the value of the corresponding
element of a second vector. If the condition is true, the corresponding element in the destination vector is
set to all ones. Otherwise, it is set to all zeros.
Related references
C1.9 Condition code suffixes on page C1-92
C3.28 VCLE (immediate #0)
Syntax
VCLE{cond}.datatype {Qd}, Qn, #0
VCLE{cond}.datatype {Dd}, Dn, #0
where:
cond
is an optional condition code.
Qd, Qn
specifies the destination register and the operand register, for a quadword operation.
Dd, Dn
specifies the destination register and the operand register, for a doubleword operation.
#0
specifies a comparison with zero.
Operation
VCLE takes the value of each element in a vector, and compares it with zero. If the condition is true, the
corresponding element in the destination vector is set to all ones. Otherwise, it is set to all zeros.
Related references
C1.9 Condition code suffixes on page C1-92
C3.29 VCLS
Vector Count Leading Sign bits.
Syntax
VCLS{cond}.datatype Qd, Qm
VCLS{cond}.datatype Dd, Dm
where:
cond
is an optional condition code.
Qd, Qm
are the destination vector and the operand vector, for a quadword operation.
Dd, Dm
are the destination vector and the operand vector, for a doubleword operation.
Operation
VCLS counts the number of consecutive bits following the topmost bit, that are the same as the topmost
bit, in each element in a vector, and places the results in a second vector.
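For a single element, the count described above can be sketched in Python as follows. The function name vcls and the bits parameter are hypothetical.

```python
def vcls(x, bits=8):
    """Count the bits below the topmost bit that match it, stopping at the first mismatch."""
    top = (x >> (bits - 1)) & 1
    count = 0
    for i in range(bits - 2, -1, -1):
        if (x >> i) & 1 != top:
            break
        count += 1
    return count


for value in (0b00001111, 0b11000000, 0x00, 0xFF):
    print(f"{value:08b} -> {vcls(value)}")
```

The topmost bit itself is not counted, so the largest possible result for an 8-bit element is 7.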
Related references
C1.9 Condition code suffixes on page C1-92
C3.30 VCLE (register)
Syntax
VCLE{cond}.datatype {Qd}, Qn, Qm
VCLE{cond}.datatype {Dd}, Dn, Dm
where:
cond
is an optional condition code.
Qd, Qn, Qm
specifies the destination register, the first operand register, and the second operand register, for a
quadword operation.
Dd, Dn, Dm
specifies the destination register, the first operand register, and the second operand register, for a
doubleword operation.
Operation
VCLE takes the value of each element in a vector, and compares it with the value of the corresponding
element of a second vector. If the condition is true, the corresponding element in the destination vector is
set to all ones. Otherwise, it is set to all zeros.
On disassembly, this pseudo-instruction is disassembled to the corresponding VCGE instruction, with the
operands reversed.
Related references
C1.9 Condition code suffixes on page C1-92
C3.31 VCLT (immediate #0)
Syntax
VCLT{cond}.datatype {Qd}, Qn, #0
VCLT{cond}.datatype {Dd}, Dn, #0
where:
cond
is an optional condition code.
Qd, Qn
specifies the destination register and the operand register, for a quadword operation.
Dd, Dn
specifies the destination register and the operand register, for a doubleword operation.
#0
specifies a comparison with zero.
Operation
VCLT takes the value of each element in a vector, and compares it with zero. If the condition is true, the
corresponding element in the destination vector is set to all ones. Otherwise, it is set to all zeros.
Related references
C1.9 Condition code suffixes on page C1-92
C3.32 VCLT (register)
Syntax
VCLT{cond}.datatype {Qd}, Qn, Qm
VCLT{cond}.datatype {Dd}, Dn, Dm
where:
cond
is an optional condition code.
Qd, Qn, Qm
specifies the destination register, the first operand register, and the second operand register, for a
quadword operation.
Dd, Dn, Dm
specifies the destination register, the first operand register, and the second operand register, for a
doubleword operation.
Operation
VCLT takes the value of each element in a vector, and compares it with the value of the corresponding
element of a second vector. If the condition is true, the corresponding element in the destination vector is
set to all ones. Otherwise, it is set to all zeros.
Note
On disassembly, this pseudo-instruction is disassembled to the corresponding VCGT instruction, with the
operands reversed.
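The operand reversal can be sketched in Python as follows. The function names are hypothetical, and lanes are modelled as signed Python integers.

```python
def vcgt(n, m, bits=8):
    """Sketch of VCGT: each result lane is all ones where n[i] > m[i], else zero."""
    ones = (1 << bits) - 1
    return [ones if x > y else 0 for x, y in zip(n, m)]


def vclt(n, m, bits=8):
    """VCLT as a pseudo-instruction: VCGT with the two operands swapped."""
    return vcgt(m, n, bits)


print([hex(lane) for lane in vclt([1, 5, -3], [2, 5, -4])])
```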
Related references
C1.9 Condition code suffixes on page C1-92
C3.33 VCLZ
Vector Count Leading Zeros.
Syntax
VCLZ{cond}.datatype Qd, Qm
VCLZ{cond}.datatype Dd, Dm
where:
cond
is an optional condition code.
Qd, Qm
are the destination vector and the operand vector, for a quadword operation.
Dd, Dm
are the destination vector and the operand vector, for a doubleword operation.
Operation
VCLZ counts the number of consecutive zeros, starting from the top bit, in each element in a vector, and
places the results in a second vector.
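For a single element, the count can be modelled with Python's int.bit_length method. The function name vclz and the bits parameter are hypothetical.

```python
def vclz(x, bits=8):
    """Count consecutive zero bits starting from the top bit of a `bits`-wide element."""
    x &= (1 << bits) - 1
    return bits - x.bit_length()


for value in (0x00, 0x01, 0x10, 0x80):
    print(f"{value:08b} -> {vclz(value)}")
```

An element of zero gives a count equal to the element width.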
Related references
C1.9 Condition code suffixes on page C1-92
C3.34 VCMLA
Vector Complex Multiply Accumulate.
Syntax
VCMLA{q}.dt {Dd,} Dn, Dm, #rotate ; 64-bit SIMD vector FP/SIMD registers
VCMLA{q}.dt {Qd,} Qn, Qm, #rotate ; 128-bit SIMD vector FP/SIMD registers
Where:
Dd
Is the 64-bit name of the SIMD and FP destination register.
Dn
Is the 64-bit name of the first SIMD and FP source register.
Dm
Is the 64-bit name of the second SIMD and FP source register.
Qd
Is the 128-bit name of the SIMD and FP destination register.
Qn
Is the 128-bit name of the first SIMD and FP source register.
Qm
Is the 128-bit name of the second SIMD and FP source register.
q
Is an optional instruction width specifier. See C2.2 Instruction width specifiers on page C2-111.
dt
Is the data type for the elements of the vectors, and can be either F16 or F32.
rotate
Is the rotation to be applied to elements in the second SIMD and FP source register, and can be
one of 0, 90, 180 or 270.
Architectures supported
Supported in the Armv8.3-A architecture and later.
Usage
Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to
Hyp mode. For more information see Enabling Advanced SIMD and floating-point support in the Arm®
Architecture Reference Manual Arm®v8, for Arm®v8‑A architecture profile.
Related references
C3.1 Summary of Advanced SIMD instructions on page C3-391
C3.35 VCMLA (by element)
Syntax
VCMLA{q}.F16 Dd, Dn, Dm[index], #rotate ; A1 Double,halfprec FP/SIMD registers (A32)
Where:
Dd
Is the 64-bit name of the SIMD and FP destination register.
Dn
Is the 64-bit name of the first SIMD and FP source register.
Dm
Is the 64-bit name of the second SIMD and FP source register.
index
Is the element index in the range 0 to 1.
Qd
Is the 128-bit name of the SIMD and FP destination register.
Qn
Is the 128-bit name of the first SIMD and FP source register.
q
Is an optional instruction width specifier. See C2.2 Instruction width specifiers on page C2-111.
rotate
Is the rotation to be applied to elements in the second SIMD and FP source register, and can be
one of 0, 90, 180 or 270.
Architectures supported
Supported in the Armv8.3-A architecture and later.
Usage
Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in
which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to
Hyp mode. For more information see Enabling Advanced SIMD and floating-point support in the Arm®
Architecture Reference Manual Arm®v8, for Arm®v8‑A architecture profile.
Related references
C3.1 Summary of Advanced SIMD instructions on page C3-391
C3.36 VCNT
Vector Count set bits.
Syntax
VCNT{cond}.datatype Qd, Qm
VCNT{cond}.datatype Dd, Dm
where:
cond
is an optional condition code.
datatype
must be I8.
Qd, Qm
are the destination vector and the operand vector, for a quadword operation.
Dd, Dm
are the destination vector and the operand vector, for a doubleword operation.
Operation
VCNT counts the number of bits that are one in each element in a vector, and places the results in a second
vector.
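Because the datatype is always I8, the operation is a per-byte population count. A minimal Python sketch, with the hypothetical name vcnt:

```python
def vcnt(vec):
    """Sketch of VCNT.I8: number of set bits in each byte of the vector."""
    return [bin(b & 0xFF).count("1") for b in vec]


print(vcnt([0x00, 0xFF, 0x0F, 0x81]))
```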
Related references
C1.9 Condition code suffixes on page C1-92
C3.37 VCVT (between fixed-point or integer, and floating-point)
Syntax
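VCVT{cond}.type Qd, Qm {, #fbits}
VCVT{cond}.type Dd, Dm {, #fbits}
where:
cond
is an optional condition code.
type
specifies the data types for the elements of the vectors. It must be one of:
S32.F32
floating-point to signed integer or fixed-point.
U32.F32
floating-point to unsigned integer or fixed-point.
F32.S32
signed integer or fixed-point to floating-point.
F32.U32
unsigned integer or fixed-point to floating-point.
Qd, Qm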
specifies the destination vector and the operand vector, for a quadword operation.
Dd, Dm
specifies the destination vector and the operand vector, for a doubleword operation.
fbits
if present, specifies the number of fraction bits in the fixed point number. Otherwise, the
conversion is between floating-point and integer. fbits must lie in the range 0-32. If fbits is
omitted, the number of fraction bits is 0.
Operation
VCVT converts each element in a vector in one of the following ways, and places the results in the
destination vector:
• From floating-point to integer.
• From integer to floating-point.
• From floating-point to fixed-point.
• From fixed-point to floating-point.
Rounding
Integer or fixed-point to floating-point conversions use round to nearest.
Floating-point to integer or fixed-point conversions use round towards zero.
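The effect of fbits and of the two rounding rules can be sketched in Python for the signed case. The function names are hypothetical. The sketch assumes that out-of-range results saturate to the S32 limits, which is not stated in this section, and it uses the struct module to round a Python double to single precision.

```python
import math
import struct


def to_f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def f32_to_s32_fixed(x, fbits=0):
    """Floating-point to signed fixed-point: scale by 2**fbits, round towards zero."""
    scaled = math.trunc(x * (1 << fbits))
    # Assumption: values outside the S32 range saturate.
    return max(-(1 << 31), min((1 << 31) - 1, scaled))


def s32_fixed_to_f32(v, fbits=0):
    """Signed fixed-point to floating-point: divide by 2**fbits, round to nearest."""
    return to_f32(v / (1 << fbits))


print(f32_to_s32_fixed(1.75, fbits=8))   # 1.75 with 8 fraction bits
print(f32_to_s32_fixed(-1.75))           # rounds towards zero, not down
print(s32_fixed_to_f32(16777217))        # not exactly representable in F32
```

With fbits omitted, both helpers reduce to a plain integer conversion, matching the statement that the number of fraction bits is then 0.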
Related references
C1.9 Condition code suffixes on page C1-92
C3.38 VCVT (between half-precision and single-precision floating-point)
Syntax
VCVT{cond}.F32.F16 Qd, Dm
VCVT{cond}.F16.F32 Dd, Qm
where:
cond
is an optional condition code.
Qd, Dm
specifies the destination vector for the single-precision results and the half-precision operand
vector.
Dd, Qm
specifies the destination vector for half-precision results and the single-precision operand vector.
Operation
VCVT with the half-precision extension converts each element in a vector in one of the following ways, and
places the results in the destination vector:
• From half-precision floating-point to single-precision floating-point (F32.F16).
• From single-precision floating-point to half-precision floating-point (F16.F32).
Architectures
This instruction is available in Armv8. In earlier architectures, it is only available in NEON systems with
the half-precision extension.
Related references
C1.9 Condition code suffixes on page C1-92
C3.39 VCVT (from floating-point to integer with directed rounding modes)
Note
• This instruction is supported only in Armv8.
• You cannot use VCVT with a directed rounding mode inside an IT block.
Syntax
VCVTmode.type Qd, Qm
VCVTmode.type Dd, Dm
where:
mode
must be one of A, N, P, or M, and selects the directed rounding mode.
type
specifies the data types for the elements of the vectors. It must be one of:
S32.F32
C3.40 VCVTB, VCVTT (between half-precision and double-precision)
Syntax
VCVTB{cond}.F64.F16 Dd, Sm
VCVTB{cond}.F16.F64 Sd, Dm
VCVTT{cond}.F64.F16 Dd, Sm
VCVTT{cond}.F16.F64 Sd, Dm
where:
cond
is an optional condition code.
Dd
is a double-precision register for the result.
Sm
is a single word register holding the operand.
Sd
is a single word register for the result.
Dm
is a double-precision register holding the operand.
Usage
These instructions convert the half-precision value in Sm to double-precision and place the result in Dd, or
the double-precision value in Dm to half-precision and place the result in Sd.
Floating-point exceptions
These instructions can produce Input Denormal, Invalid Operation, Overflow, Underflow, or Inexact
exceptions.
C3.41 VDUP
Vector Duplicate.
Syntax
VDUP{cond}.size Qd, Dm[x]
VDUP{cond}.size Qd, Rm
VDUP{cond}.size Dd, Rm
where:
cond
is an optional condition code.
Operation
VDUP duplicates a scalar into every element of the destination vector. The source can be an Advanced
SIMD scalar or a general-purpose register.
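A minimal Python sketch of the duplication follows. The function name vdup and its parameters are hypothetical, and the sketch assumes that when the source is a general-purpose register only its least significant size bits are used.

```python
def vdup(value, size, reg_bits=128):
    """Sketch of VDUP: copy the low `size` bits of value into every lane of the register."""
    lanes = reg_bits // size
    lane_mask = (1 << size) - 1
    return [value & lane_mask] * lanes


print([hex(lane) for lane in vdup(0x12345678, 8, reg_bits=64)])   # eight byte lanes
print([hex(lane) for lane in vdup(0x12345678, 32)])               # four word lanes
```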
Related references
C1.9 Condition code suffixes on page C1-92
C3.42 VEOR
Vector Bitwise Exclusive OR.
Syntax
VEOR{cond}{.datatype} {Qd}, Qn, Qm
VEOR{cond}{.datatype} {Dd}, Dn, Dm
where:
cond
is an optional condition code.
Qd, Qn, Qm
specifies the destination register, the first operand register, and the second operand register, for a
quadword operation.
Dd, Dn, Dm
specifies the destination register, the first operand register, and the second operand register, for a
doubleword operation.
Operation
VEOR performs a logical exclusive OR between two registers, and places the result in the destination
register.
Related references
C1.9 Condition code suffixes on page C1-92
C3.43 VEXT
Vector Extract.
Syntax
VEXT{cond}.8 {Qd}, Qn, Qm, #imm
VEXT{cond}.8 {Dd}, Dn, Dm, #imm
where:
cond
is an optional condition code.
Qd, Qn, Qm
specifies the destination register, the first operand register, and the second operand register, for a
quadword operation.
Dd, Dn, Dm
specifies the destination register, the first operand register, and the second operand register, for a
doubleword operation.
imm
is the number of 8-bit elements to extract from the bottom of the second operand vector, in the
range 0-7 for doubleword operations, or 0-15 for quadword operations.
Operation
VEXT extracts 8-bit elements from the bottom end of the second operand vector and the top end of the
first, concatenates them, and places the result in the destination vector. See the following figure for an
example:
[Figure: elements 7 to 0 of Vm and of Vn, and the elements extracted from them into Vd]
VEXT pseudo-instruction
You can specify a datatype of 16, 32, or 64 instead of 8. In this case, #imm refers to halfwords, words, or
doublewords instead of referring to bytes, and the permitted ranges are correspondingly reduced.
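The extraction can be sketched in Python with vectors held as lists of bytes, index 0 being the least significant (bottom) byte. The function name vext and the size parameter are hypothetical. For the pseudo-instruction form, the sketch simply scales imm by the element size in bytes.

```python
def vext(n, m, imm, size=8):
    """Sketch of VEXT: top of n (first operand) followed by bottom of m (second operand).

    n and m are equal-length lists of bytes, least significant byte first.
    imm counts elements of `size` bits taken from the bottom of m.
    """
    nbytes = imm * (size // 8)
    return n[nbytes:] + m[:nbytes]


vn = list(range(0, 8))     # bytes 0..7 of the first operand
vm = list(range(8, 16))    # bytes 0..7 of the second operand
print(vext(vn, vm, 3))            # VEXT.8 with #3
print(vext(vn, vm, 1, size=16))   # VEXT.16 with #1
```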
Related references
C1.9 Condition code suffixes on page C1-92
C3.44 VFMA, VFMS
Syntax
Vop{cond}.F32 {Qd}, Qn, Qm
where:
op
must be either FMA or FMS.
Operation
VFMA multiplies corresponding elements in the two operand vectors, and accumulates the results into the
elements of the destination vector. The result of the multiply is not rounded before the accumulation.
VFMS multiplies corresponding elements in the two operand vectors, then subtracts the products from the
corresponding elements of the destination vector, and places the final results in the destination vector.
The result of the multiply is not rounded before the subtraction.
Related references
C3.77 VMUL on page C3-472
C1.9 Condition code suffixes on page C1-92
C3.45 VFMAL (by scalar)
Syntax
VFMAL{q}.F16 Dd, Sn, Sm[index] ; 64-bit SIMD vector
VFMAL{q}.F16 Qd, Dn, Dm[index] ; 128-bit SIMD vector
Where:
Dd
Is the 64-bit name of the SIMD and FP destination register.
Sn
Is the 32-bit name of the first SIMD and FP source register.
Sm
Is the 32-bit name of the second SIMD and FP source register.
index
Depends on the instruction variant:
64
For the 64-bit SIMD vector variant: is the element index in the range 0 to 1.
128
For the 128-bit SIMD vector variant: is the element index in the range 0 to 3.
q
Is an optional instruction width specifier. See C2.2 Instruction width specifiers on page C2-111.
Qd
Is the 128-bit name of the SIMD and FP destination register.
Dn
Is the 64-bit name of the first SIMD and FP source register.
Dm
Is the 64-bit name of the second SIMD and FP source register.
Architectures supported
Supported in Armv8.2 and later.
Usage
Vector Floating-point Multiply-Add Long to accumulator (by scalar). This instruction multiplies the
vector elements in the first source SIMD and FP register by the specified value in the second source
SIMD and FP register, and accumulates the product to the corresponding vector element of the
destination SIMD and FP register. The instruction does not round the result of the multiply before the
accumulation.
Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the Security state and
PE mode in which the instruction is executed, an attempt to execute the instruction might be UNDEFINED,
or trapped to Hyp mode. For more information see Enabling Advanced SIMD and floating-point support
in the Arm® Architecture Reference Manual Arm®v8, for Arm®v8‑A architecture profile.
In Armv8.2 and Armv8.3, this is an OPTIONAL instruction. From Armv8.4 it is mandatory for all
implementations to support it.
Note
ID_ISAR6.FHM indicates whether this instruction is supported.
Related references
C3.1 Summary of Advanced SIMD instructions on page C3-391
C3.46 VFMAL (vector)
Syntax
VFMAL{q}.F16 Dd, Sn, Sm ; 64-bit SIMD vector
VFMAL{q}.F16 Qd, Dn, Dm ; 128-bit SIMD vector
Where:
Dd
Is the 64-bit name of the SIMD and FP destination register.
Sn
Is the 32-bit name of the first SIMD and FP source register.
Sm
Is the 32-bit name of the second SIMD and FP source register.
q
Is an optional instruction width specifier. See C2.2 Instruction width specifiers on page C2-111.
Qd
Is the 128-bit name of the SIMD and FP destination register.
Dn
Is the 64-bit name of the first SIMD and FP source register.
Dm
Is the 64-bit name of the second SIMD and FP source register.
Architectures supported
Supported in Armv8.2 and later.
Usage
Vector Floating-point Multiply-Add Long to accumulator (vector). This instruction multiplies
corresponding values in the vectors in the two source SIMD and FP registers, and accumulates the
product to the corresponding vector element of the destination SIMD and FP register. The instruction
does not round the result of the multiply before the accumulation.
Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the Security state and
PE mode in which the instruction is executed, an attempt to execute the instruction might be UNDEFINED,
or trapped to Hyp mode. For more information see Enabling Advanced SIMD and floating-point support
in the Arm® Architecture Reference Manual Arm®v8, for Arm®v8‑A architecture profile.
In Armv8.2 and Armv8.3, this is an OPTIONAL instruction. From Armv8.4 it is mandatory for all
implementations to support it.
Note
ID_ISAR6.FHM indicates whether this instruction is supported.
Related references
C3.1 Summary of Advanced SIMD instructions on page C3-391
C3.47 VFMSL (by scalar)
Syntax
VFMSL{q}.F16 Dd, Sn, Sm[index] ; 64-bit SIMD vector
VFMSL{q}.F16 Qd, Dn, Dm[index] ; 128-bit SIMD vector
Where:
Dd
Is the 64-bit name of the SIMD and FP destination register.
Sn
Is the 32-bit name of the first SIMD and FP source register.
Sm
Is the 32-bit name of the second SIMD and FP source register.
index
Depends on the instruction variant:
64
For the 64-bit SIMD vector variant: is the element index in the range 0 to 1.
128
For the 128-bit SIMD vector variant: is the element index in the range 0 to 3.
q
Is an optional instruction width specifier. See C2.2 Instruction width specifiers on page C2-111.
Qd
Is the 128-bit name of the SIMD and FP destination register.
Dn
Is the 64-bit name of the first SIMD and FP source register.
Dm
Is the 64-bit name of the second SIMD and FP source register.
Architectures supported
Supported in Armv8.2 and later.
Usage
Vector Floating-point Multiply-Subtract Long from accumulator (by scalar). This instruction multiplies
the negated vector elements in the first source SIMD and FP register by the specified value in the second
source SIMD and FP register, and accumulates the product to the corresponding vector element of the
destination SIMD and FP register. The instruction does not round the result of the multiply before the
accumulation.
Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the Security state and
PE mode in which the instruction is executed, an attempt to execute the instruction might be UNDEFINED,
or trapped to Hyp mode. For more information see Enabling Advanced SIMD and floating-point support
in the Arm® Architecture Reference Manual Arm®v8, for Arm®v8‑A architecture profile.
In Armv8.2 and Armv8.3, this is an OPTIONAL instruction. From Armv8.4 it is mandatory for all
implementations to support it.
Note
ID_ISAR6.FHM indicates whether this instruction is supported.
Related references
C3.1 Summary of Advanced SIMD instructions on page C3-391
C3.48 VFMSL (vector)
Syntax
VFMSL{q}.F16 Dd, Sn, Sm ; 64-bit SIMD vector
VFMSL{q}.F16 Qd, Dn, Dm ; 128-bit SIMD vector
Where:
Dd
Is the 64-bit name of the SIMD and FP destination register.
Sn
Is the 32-bit name of the first SIMD and FP source register.
Sm
Is the 32-bit name of the second SIMD and FP source register.
q
Is an optional instruction width specifier. See C2.2 Instruction width specifiers on page C2-111.
Qd
Is the 128-bit name of the SIMD and FP destination register.
Dn
Is the 64-bit name of the first SIMD and FP source register.
Dm
Is the 64-bit name of the second SIMD and FP source register.
Architectures supported
Supported in Armv8.2 and later.
Usage
Vector Floating-point Multiply-Subtract Long from accumulator (vector). This instruction negates the
values in the vector of one SIMD and FP register, multiplies these with the corresponding values in
another vector, and accumulates the product to the corresponding vector element of the destination SIMD
and FP register. The instruction does not round the result of the multiply before the accumulation.
Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the Security state and
PE mode in which the instruction is executed, an attempt to execute the instruction might be UNDEFINED,
or trapped to Hyp mode. For more information see Enabling Advanced SIMD and floating-point support
in the Arm® Architecture Reference Manual Arm®v8, for Arm®v8‑A architecture profile.
In Armv8.2 and Armv8.3, this is an OPTIONAL instruction. From Armv8.4 it is mandatory for all
implementations to support it.
Note
ID_ISAR6.FHM indicates whether this instruction is supported.
Related references
C3.1 Summary of Advanced SIMD instructions on page C3-391
C3.49 VHADD
Vector Halving Add.
Syntax
VHADD{cond}.datatype {Qd}, Qn, Qm
VHADD{cond}.datatype {Dd}, Dn, Dm
where:
cond
is an optional condition code.
Qd, Qn, Qm
are the destination vector, the first operand vector, and the second operand vector, for a
quadword operation.
Dd, Dn, Dm
are the destination vector, the first operand vector, and the second operand vector, for a
doubleword operation.
Operation
VHADD adds corresponding elements in two vectors, shifts each result right one bit, and places the results
in the destination vector. Results are truncated.
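The halving behavior can be sketched in Python as follows. This is an illustrative model only: the function name vhadd and the use of plain integer lists for lanes are assumptions of the example, and lane width checks are left out.

```python
def vhadd(a, b):
    """Model VHADD lane by lane: add at full precision, then shift right one bit."""
    if len(a) != len(b):
        raise ValueError("operand vectors must have the same number of lanes")
    # Python integers do not overflow, so the intermediate sum keeps its carry bit.
    # The right shift discards the low bit, which is the truncation described above.
    return [(x + y) >> 1 for x, y in zip(a, b)]


unsigned_lanes = vhadd([255, 1, 200], [255, 2, 100])  # values in the U8 range
signed_lanes = vhadd([-3, 127], [2, 127])             # values in the S8 range
```

The first unsigned lane shows why the instruction is useful: 255 + 255 does not fit in 8 bits, but the halved result does.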
Related references
C1.9 Condition code suffixes on page C1-92
C3.50 VHSUB
Vector Halving Subtract.
Syntax
VHSUB{cond}.datatype {Qd}, Qn, Qm
VHSUB{cond}.datatype {Dd}, Dn, Dm
where:
cond
is an optional condition code.
Qd, Qn, Qm
are the destination vector, the first operand vector, and the second operand vector, for a
quadword operation.
Dd, Dn, Dm
are the destination vector, the first operand vector, and the second operand vector, for a
doubleword operation.
Operation
VHSUB subtracts the elements of one vector from the corresponding elements of another vector, shifts
each result right one bit, and places the results in the destination vector. Results are always truncated.
Related references
C1.9 Condition code suffixes on page C1-92
C3.51 VLDn (single n-element structure to one lane)
Syntax
VLDn{cond}.datatype list, [Rn{@align}]{!}
VLDn{cond}.datatype list, [Rn{@align}], Rm
where:
n
must be one of 1, 2, 3, or 4.
cond
is an optional condition code.
list
is the list of Advanced SIMD registers enclosed in braces, { and }. See the following table for
options.
Rn
is the general-purpose register containing the base address.
!
if ! is present, Rn is updated to (Rn + the number of bytes transferred by the instruction). The
update occurs after all the loads have taken place.
Rm
is a general-purpose register containing an offset from the base address. If Rm is present, the
instruction updates Rn to (Rn + Rm) after using the address to access memory. Rm cannot be SP or
PC.
Operation
VLDn loads one n-element structure from memory into one or more Advanced SIMD registers. Elements
of the register that are not loaded are unaltered.
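A minimal Python sketch of the single-lane load is shown here. It treats memory as a list of elements rather than bytes and registers as lists of lanes; these representations and the name vld_one_lane are assumptions of the example. With n set to 2 it loosely mirrors a form such as VLD2.16 {d0[2], d1[2]}, [r0].

```python
def vld_one_lane(n, memory, index, registers, lane):
    """Load one n-element structure from memory[index:] into one lane of n registers."""
    if n not in (1, 2, 3, 4) or len(registers) != n:
        raise ValueError("n must be 1 to 4 and match the register list length")
    updated = [list(reg) for reg in registers]  # copy so that other lanes stay unaltered
    for i in range(n):
        updated[i][lane] = memory[index + i]
    return updated


memory = [10, 20, 30, 40]
d0, d1 = vld_one_lane(2, memory, 1, [[0, 0, 0, 0], [9, 9, 9, 9]], lane=2)
```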
Table C3-4 Permitted combinations of parameters for VLDn (single n-element structure to one lane)
Related concepts
C3.3 Interleaving provided by load and store element and structure instructions on page C3-395
Related references
C1.9 Condition code suffixes on page C1-92
C3.52 VLDn (single n-element structure to all lanes)
Syntax
VLDn{cond}.datatype list, [Rn{@align}]{!}
VLDn{cond}.datatype list, [Rn{@align}], Rm
where:
n
must be one of 1, 2, 3, or 4.
cond
is an optional condition code.
list
is the list of Advanced SIMD registers enclosed in braces, { and }. See the following table for
options.
Rn
is the general-purpose register containing the base address.
!
if ! is present, Rn is updated to (Rn + the number of bytes transferred by the instruction). The
update occurs after all the loads have taken place.
Rm
is a general-purpose register containing an offset from the base address. If Rm is present, the
instruction updates Rn to (Rn + Rm) after using the address to access memory. Rm cannot be SP or
PC.
Operation
VLDn loads multiple copies of one n-element structure from memory into one or more Advanced SIMD
registers.
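The all-lanes form can be pictured with the following Python sketch, in which element i of the structure is copied into every lane of register i. The list-based memory and register model and the name vld_all_lanes are assumptions made for illustration.

```python
def vld_all_lanes(n, memory, index, lanes):
    """Replicate each element of one n-element structure across all lanes of its register."""
    if n not in (1, 2, 3, 4):
        raise ValueError("n must be one of 1, 2, 3, or 4")
    return [[memory[index + i]] * lanes for i in range(n)]


d0, d1 = vld_all_lanes(2, [10, 20, 30], 0, lanes=4)
```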
Table C3-5 Permitted combinations of parameters for VLDn (single n-element structure to all lanes)
Related concepts
C3.3 Interleaving provided by load and store element and structure instructions on page C3-395
Related references
C1.9 Condition code suffixes on page C1-92
C3.53 VLDn (multiple n-element structures)
Syntax
VLDn{cond}.datatype list, [Rn{@align}]{!}
VLDn{cond}.datatype list, [Rn{@align}], Rm
where:
n
must be one of 1, 2, 3, or 4.
cond
is an optional condition code.
list
is the list of Advanced SIMD registers enclosed in braces, { and }. See the following table for
options.
Rn
is the general-purpose register containing the base address.
!
if ! is present, Rn is updated to (Rn + the number of bytes transferred by the instruction). The
update occurs after all the loads have taken place.
Rm
is a general-purpose register containing an offset from the base address. If Rm is present, the
instruction updates Rn to (Rn + Rm) after using the address to access memory. Rm cannot be SP or
PC.
Operation
VLDn loads multiple n-element structures from memory into one or more Advanced SIMD registers, with
de-interleaving (unless n == 1). Every element of each register is loaded.
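The de-interleaving can be sketched in Python as a strided slice over consecutive memory elements. The example simplifies the instruction by assuming one register per structure member, so the multi-register forms of VLD1 are not modeled; the name vld_multiple is likewise an assumption.

```python
def vld_multiple(n, memory, index, lanes):
    """De-interleave `lanes` consecutive n-element structures into n registers."""
    if n not in (1, 2, 3, 4):
        raise ValueError("n must be one of 1, 2, 3, or 4")
    count = n * lanes
    block = memory[index:index + count]
    if len(block) != count:
        raise ValueError("not enough elements in memory")
    # Register i receives member i of every structure; for n == 1 this is a plain copy.
    return [block[i::n] for i in range(n)]


# Two 3-element structures stored back to back: (1, 2, 3) and (4, 5, 6).
first, second, third = vld_multiple(3, [1, 2, 3, 4, 5, 6], 0, lanes=2)
```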
Table C3-6 Permitted combinations of parameters for VLDn (multiple n-element structures)
{Dd, D(d+1), D(d+2), D(d+3)} @64, @128, or @256 8-byte, 16-byte, or 32-byte
4 8, 16, or 32 {Dd, D(d+1), D(d+2), D(d+3)} @64, @128, or @256 8-byte, 16-byte, or 32-byte
{Dd, D(d+2), D(d+4), D(d+6)} @64, @128, or @256 8-byte, 16-byte, or 32-byte
Related concepts
C3.3 Interleaving provided by load and store element and structure instructions on page C3-395
Related references
C1.9 Condition code suffixes on page C1-92
C3.54 VLDM
Extension register load multiple.
Syntax
VLDMmode{cond} Rn{!}, Registers
where:
mode
must be one of:
IA
meaning Increment address After each transfer. IA is the default, and can be omitted.
DB
meaning Decrement address Before each transfer.
EA
meaning Empty Ascending stack operation. This is the same as DB for loads.
FD
meaning Full Descending stack operation. This is the same as IA for loads.
cond
is an optional condition code.
Rn
is the general-purpose register holding the base address for the transfer.
!
is optional. ! specifies that the updated base address must be written back to Rn. If ! is not
specified, mode must be IA.
Registers
is a list of consecutive extension registers enclosed in braces, { and }. The list can be comma-
separated, or in range format. There must be at least one register in the list.
You can specify D or Q registers, but they must not be mixed. The number of registers must not
exceed 16 D registers, or 8 Q registers. If Q registers are specified, on disassembly they are shown
as D registers.
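The effect of the addressing mode on the transfer addresses can be sketched as follows. This Python model assumes a list of D registers (8 bytes each), loaded from ascending addresses in register order; the function name and return shape are assumptions of the example.

```python
def vldm_addresses(mode, base, count, writeback):
    """Return (load addresses, value of Rn afterwards) for a VLDM of `count` D registers."""
    if not 1 <= count <= 16:
        raise ValueError("the list must contain 1 to 16 D registers")
    size = 8 * count
    if mode == "IA":
        start, updated = base, base + size
    elif mode == "DB":
        if not writeback:
            raise ValueError("without ! the mode must be IA")
        start, updated = base - size, base - size
    else:
        raise ValueError("this sketch models only IA and DB")
    addresses = [start + 8 * i for i in range(count)]
    return addresses, (updated if writeback else base)


ia = vldm_addresses("IA", 0x1000, 2, writeback=True)
db = vldm_addresses("DB", 0x1000, 2, writeback=True)
```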
Note
VPOP Registers is equivalent to VLDM sp!, Registers.
You can use either form of this instruction. They both disassemble to VPOP.
Related references
C1.9 Condition code suffixes on page C1-92
C4.14 VLDM (floating-point) on page C4-561
C3.55 VLDR
Extension register load.
Syntax
VLDR{cond}{.64} Dd, [Rn{, #offset}]
VLDR{cond}{.64} Dd, label
where:
cond
is an optional condition code.
Rn
is the general-purpose register holding the base address for the transfer.
offset
is an optional numeric expression. It must evaluate to a numeric value at assembly time. The
value must be a multiple of 4, and lie in the range -1020 to +1020. The value is added to the
base address to form the address used for the transfer.
label
is a PC-relative expression.
label must be aligned on a word boundary within ±1KB of the current instruction.
Operation
The VLDR instruction loads an extension register from memory.
Two words are transferred.
There is also a VLDR pseudo-instruction.
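The offset rules for the register-plus-offset form can be checked with a few lines of Python. The helper name vldr_address is hypothetical; it only restates the constraints given above for offset.

```python
def vldr_address(base, offset=0):
    """Return the transfer address for VLDR Dd, [Rn, #offset], validating the offset."""
    if offset % 4 != 0:
        raise ValueError("offset must be a multiple of 4")
    if not -1020 <= offset <= 1020:
        raise ValueError("offset must lie in the range -1020 to +1020")
    return base + offset


address = vldr_address(0x2000, 16)
```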
Related references
C3.57 VLDR pseudo-instruction on page C3-452
C1.9 Condition code suffixes on page C1-92
C4.15 VLDR (floating-point) on page C4-562
C3.56 VLDR (post-increment and pre-decrement)
Note
There are also VLDR and VSTR instructions without post-increment and pre-decrement.
Syntax
VLDR{cond}{.64} Dd, [Rn], #offset ; post-increment
VLDR{cond}{.64} Dd, [Rn, #-offset]! ; pre-decrement
where:
cond
is an optional condition code.
Rn
is the general-purpose register holding the base address for the transfer.
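offset
is a numeric expression that must evaluate to a numeric value at assembly time. The value must be 8.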
Operation
The post-increment instruction increments the base address in the register by the offset value, after the
transfer. The pre-decrement instruction decrements the base address in the register by the offset value,
and then performs the transfer using the new address in the register. This pseudo-instruction assembles to
a VLDM instruction.
Related references
C3.54 VLDM on page C3-449
C3.55 VLDR on page C3-450
C1.9 Condition code suffixes on page C1-92
C4.16 VLDR (post-increment and pre-decrement, floating-point) on page C4-563
C3.57 VLDR pseudo-instruction
Note
This description is for the VLDR pseudo-instruction only.
Syntax
VLDR{cond}.datatype Dd,=constant
where:
cond
is an optional condition code.
datatype
Usage
If an instruction (for example, VMOV) is available that can generate the constant directly into the register,
the assembler uses it. Otherwise, it generates a doubleword literal pool entry containing the constant and
loads the constant using a VLDR instruction.
Related references
C3.55 VLDR on page C3-450
C1.9 Condition code suffixes on page C1-92
C3.57 VLDR pseudo-instruction on page C3-452
C3.58 VMAX and VMIN
Syntax
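Vop{cond}.datatype Qd, Qn, Qm
Vop{cond}.datatype Dd, Dn, Dm
where:
op
must be either MAX or MIN.
cond
is an optional condition code.
Qd, Qn, Qm
are the destination vector, the first operand vector, and the second operand vector, for a
quadword operation.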
Dd, Dn, Dm
are the destination vector, the first operand vector, and the second operand vector, for a
doubleword operation.
Operation
VMAX compares corresponding elements in two vectors, and copies the larger of each pair into the
corresponding element in the destination vector.
VMIN compares corresponding elements in two vectors, and copies the smaller of each pair into the
corresponding element in the destination vector.
C3.59 VMAXNM, VMINNM
Note
• These instructions are supported only in Armv8.
• You cannot use VMAXNM or VMINNM inside an IT block.
Syntax
Vop.F32 Qd, Qn, Qm
Vop.F32 Dd, Dn, Dm
where:
op
must be either MAXNM or MINNM.
Qd, Qn, Qm
are the destination vector, the first operand vector, and the second operand vector, for a
quadword operation.
Dd, Dn, Dm
are the destination vector, the first operand vector, and the second operand vector, for a
doubleword operation.
Operation
VMAXNM compares corresponding elements in two vectors, and copies the larger of each pair into the
corresponding element in the destination vector.
VMINNM compares corresponding elements in two vectors, and copies the smaller of each pair into the
corresponding element in the destination vector.
If one of the elements in a pair is a number and the other element is NaN, the corresponding result
element is the number. This is consistent with the IEEE 754-2008 standard.
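The NaN rule can be illustrated with a short Python model. It is a sketch under simplifying assumptions: Python floats stand in for F32 elements, signaling NaNs and the sign of zero are not modeled, and the function names are invented for the example.

```python
import math


def _pick(x, y, choose):
    """Return the numeric operand when exactly one input is NaN, else choose(x, y)."""
    if math.isnan(x):
        return y
    if math.isnan(y):
        return x
    return choose(x, y)


def vmaxnm(a, b):
    return [_pick(x, y, max) for x, y in zip(a, b)]


def vminnm(a, b):
    return [_pick(x, y, min) for x, y in zip(a, b)]


nan = float("nan")
largest = vmaxnm([1.0, nan, 3.0], [2.0, 5.0, nan])
smallest = vminnm([1.0, nan, 3.0], [2.0, 5.0, nan])
```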
C3.60 VMLA
Vector Multiply Accumulate.
Syntax
VMLA{cond}.datatype {Qd}, Qn, Qm
VMLA{cond}.datatype {Dd}, Dn, Dm
where:
cond
is an optional condition code.
Qd, Qn, Qm
are the destination vector, the first operand vector, and the second operand vector, for a
quadword operation.
Dd, Dn, Dm
are the destination vector, the first operand vector, and the second operand vector, for a
doubleword operation.
Operation
VMLA multiplies corresponding elements in two vectors, and accumulates the results into the elements of
the destination vector.
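For integer data types, the accumulate step can be sketched in Python as shown here. The example assumes lanes are held as unsigned bit patterns of a fixed width and that results wrap modulo that width; the name vmla and the bits parameter are assumptions, and the floating-point form is not modeled.

```python
def vmla(acc, a, b, bits=16):
    """Model integer VMLA: acc[i] + a[i] * b[i], kept to `bits` bits per lane."""
    if not len(acc) == len(a) == len(b):
        raise ValueError("all vectors must have the same number of lanes")
    mask = (1 << bits) - 1
    return [(d + x * y) & mask for d, x, y in zip(acc, a, b)]


result = vmla([1, 65535], [2, 1], [3, 1])  # second lane wraps around in 16 bits
```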
Related references
C1.9 Condition code suffixes on page C1-92
C3.61 VMLA (by scalar)
Syntax
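VMLA{cond}.datatype {Qd}, Qn, Dm[x]
VMLA{cond}.datatype {Dd}, Dn, Dm[x]
where:
cond
is an optional condition code.
Qd, Qn
are the destination vector and the first operand vector, for a quadword operation.
Dd, Dn
are the destination vector and the first operand vector, for a doubleword operation.
Dm[x]
is the scalar holding the second operand.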
Operation
VMLA multiplies each element in a vector by a scalar, and accumulates the results into the corresponding
elements of the destination vector.
Related references
C1.9 Condition code suffixes on page C1-92
C3.62 VMLAL (by scalar)
Syntax
VMLAL{cond}.datatype Qd, Dn, Dm[x]
where:
cond
is an optional condition code.
Qd, Dn
are the destination vector and the first operand vector, for a long operation.
Dm[x]
is the scalar holding the second operand.
Operation
VMLAL multiplies each element in a vector by a scalar, and accumulates the results into the corresponding
elements of the destination vector.
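Because this is a long operation, the destination elements are twice as wide as the source elements. The following Python sketch models that for unsigned lanes; the name vmlal_scalar, the bits parameter, and the unsigned-only treatment are assumptions of the example.

```python
def vmlal_scalar(acc, a, scalar, bits=16):
    """Model unsigned VMLAL by scalar: acc lanes are 2*bits wide, a and scalar are bits wide."""
    if len(acc) != len(a):
        raise ValueError("accumulator and operand must have the same number of lanes")
    narrow_mask = (1 << bits) - 1
    wide_mask = (1 << (2 * bits)) - 1
    s = scalar & narrow_mask
    return [(d + (x & narrow_mask) * s) & wide_mask for d, x in zip(acc, a)]


result = vmlal_scalar([0, 1], [65535, 2], 65535)
```

The first lane holds 65535 * 65535, which needs the full 32-bit destination element.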
Related references
C1.9 Condition code suffixes on page C1-92
C3.63 VMLAL
Vector Multiply Accumulate Long.
Syntax
VMLAL{cond}.datatype Qd, Dn, Dm
where:
cond
is an optional condition code.
Qd, Dn, Dm
are the destination vector, the first operand vector, and the second operand vector, for a long
operation.
Operation
VMLAL multiplies corresponding elements in two vectors, and accumulates the results into the elements of
the destination vector.
Related concepts
B1.8 Polynomial arithmetic over {0,1} on page B1-54
C3.64 VMLS (by scalar)
Syntax
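VMLS{cond}.datatype {Qd}, Qn, Dm[x]
VMLS{cond}.datatype {Dd}, Dn, Dm[x]
where:
cond
is an optional condition code.
Qd, Qn
are the destination vector and the first operand vector, for a quadword operation.
Dd, Dn
are the destination vector and the first operand vector, for a doubleword operation.
Dm[x]
is the scalar holding the second operand.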
Operation
VMLS multiplies each element in a vector by a scalar, subtracts the results from the corresponding
elements of the destination vector, and places the final results in the destination vector.
Related references
C1.9 Condition code suffixes on page C1-92
C3.65 VMLS
Vector Multiply Subtract.
Syntax
VMLS{cond}.datatype {Qd}, Qn, Qm
VMLS{cond}.datatype {Dd}, Dn, Dm
where:
cond
is an optional condition code.
Qd, Qn, Qm
are the destination vector, the first operand vector, and the second operand vector, for a
quadword operation.
Dd, Dn, Dm
are the destination vector, the first operand vector, and the second operand vector, for a
doubleword operation.
Operation
VMLS multiplies corresponding elements in two vectors, subtracts the results from corresponding
elements of the destination vector, and places the final results in the destination vector.
Related references
C1.9 Condition code suffixes on page C1-92
C3.66 VMLSL
Vector Multiply Subtract Long.
Syntax
VMLSL{cond}.datatype Qd, Dn, Dm
where:
cond
is an optional condition code.
Qd, Dn, Dm
are the destination vector, the first operand vector, and the second operand vector, for a long
operation.
Operation
VMLSL multiplies corresponding elements in two vectors, subtracts the results from corresponding
elements of the destination vector, and places the final results in the destination vector.
Related references
C1.9 Condition code suffixes on page C1-92
C3.67 VMLSL (by scalar)
Syntax
VMLSL{cond}.datatype Qd, Dn, Dm[x]
where:
cond
is an optional condition code.
Qd, Dn
are the destination vector and the first operand vector, for a long operation.
Dm[x]
is the scalar holding the second operand.
Operation
VMLSL multiplies each element in a vector by a scalar, subtracts the results from the corresponding
elements of the destination vector, and places the final results in the destination vector.
Related references
C1.9 Condition code suffixes on page C1-92
C3.68 VMOV (immediate)
Syntax
VMOV{cond}.datatype Qd, #imm
where:
cond
is an optional condition code.
imm
is an immediate value of the type specified by datatype. This is replicated to fill the destination
register.
Operation
VMOV replicates an immediate value in every element of the destination register.
datatype imm
I8 0xXY
0x0000XYFF, 0x00XYFFFF
Related references
C1.9 Condition code suffixes on page C1-92
am Each of 0xGG, 0xHH, 0xJJ, 0xKK, 0xLL, 0xMM, 0xNN, and 0xPP must be either 0x00 or 0xFF.
an Any number that can be expressed as +/-n * 2^-r, where n and r are integers, 16 <= n <= 31, 0 <= r <= 7.
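The second footnote above describes which floating-point values can be used as an immediate. A small Python check of that rule might look like the following; the function name is hypothetical, and the sketch tests only the stated formula, not the instruction encoding.

```python
import math


def is_f32_vmov_immediate(value):
    """Check whether value equals +/-n * 2**-r with 16 <= n <= 31 and 0 <= r <= 7."""
    value = float(value)
    if not math.isfinite(value) or value == 0.0:
        return False
    magnitude = abs(value)
    for r in range(8):
        n = magnitude * 2.0 ** r  # scaling by a power of two is exact here
        if n.is_integer() and 16 <= n <= 31:
            return True
    return False


checks = {v: is_f32_vmov_immediate(v) for v in (1.0, -2.5, 0.125, 31.0, 32.0, 0.1)}
```

Under this rule the representable magnitudes run from 0.125 (16 * 2^-7) to 31.0 (31 * 2^0), and zero is not among them.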
C3.69 VMOV (register)
Syntax
VMOV{cond}{.datatype} Qd, Qm
VMOV{cond}{.datatype} Dd, Dm
where:
cond
is an optional condition code.
Qd, Qm
specifies the destination vector and the source vector, for a quadword operation.
Dd, Dm
specifies the destination vector and the source vector, for a doubleword operation.
Operation
VMOV copies the contents of the source register into the destination register.
Related references
C1.9 Condition code suffixes on page C1-92
C3.70 VMOV (between two general-purpose registers and a 64-bit extension register)
Syntax
VMOV{cond} Dm, Rd, Rn
VMOV{cond} Rd, Rn, Dm
where:
cond
is an optional condition code.
Operation
VMOV Dm, Rd, Rn transfers the contents of Rd into the low half of Dm, and the contents of Rn into the
high half of Dm.
VMOV Rd, Rn, Dm transfers the contents of the low half of Dm into Rd, and the contents of the high half of
Dm into Rn.
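The two directions of the transfer can be modeled in Python with simple shifts and masks. The function names are invented for the example, and registers are represented as plain unsigned integers.

```python
MASK32 = 0xFFFFFFFF


def vmov_to_d(rd, rn):
    """Model VMOV Dm, Rd, Rn: Rd fills the low half of Dm, Rn the high half."""
    return ((rn & MASK32) << 32) | (rd & MASK32)


def vmov_from_d(dm):
    """Model VMOV Rd, Rn, Dm: return (Rd, Rn) as the low and high halves of Dm."""
    return dm & MASK32, (dm >> 32) & MASK32


dm = vmov_to_d(0x11223344, 0xAABBCCDD)
rd, rn = vmov_from_d(dm)
```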
Related references
C1.9 Condition code suffixes on page C1-92
C3.71 VMOV (between a general-purpose register and an Advanced SIMD scalar)
Syntax
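VMOV{cond}{.size} Dn[x], Rd
VMOV{cond}{.datatype} Rd, Dn[x]
where:
cond
is an optional condition code.
datatype
is the data type. Can be U8, S8, U16, S16, or 32. If omitted, datatype is 32.
Dn[x]
is the Advanced SIMD scalar.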
Operation
VMOV Dn[x], Rd transfers the contents of the least significant byte, halfword, or word of Rd into Dn[x].
VMOV Rd, Dn[x] transfers the contents of Dn[x] into the least significant byte, halfword, or word of Rd.
The remaining bits of Rd are either zero or sign extended.
Related references
C1.9 Condition code suffixes on page C1-92
C3.72 VMOVL
Vector Move Long.
Syntax
VMOVL{cond}.datatype Qd, Dm
where:
cond
is an optional condition code.
Operation
VMOVL takes each element in a doubleword vector, sign or zero extends them to twice their original
length, and places the results in a quadword vector.
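The widening can be sketched in Python as follows. Lanes are represented as bit patterns, and the function name vmovl with its bits and signed parameters is an assumption standing in for the datatype suffix.

```python
def vmovl(elements, bits, signed):
    """Extend each `bits`-wide lane to 2*bits, with sign or zero extension."""
    narrow_mask = (1 << bits) - 1
    wide_mask = (1 << (2 * bits)) - 1
    result = []
    for element in elements:
        value = element & narrow_mask
        if signed and value >> (bits - 1):
            value -= 1 << bits  # reinterpret the pattern as a negative number
        result.append(value & wide_mask)
    return result


sign_extended = vmovl([0xFF, 0x7F], bits=8, signed=True)
zero_extended = vmovl([0xFF, 0x7F], bits=8, signed=False)
```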
Related references
C1.9 Condition code suffixes on page C1-92
C3.73 VMOVN
Vector Move and Narrow.
Syntax
VMOVN{cond}.datatype Dd, Qm
where:
cond
is an optional condition code.
Operation
VMOVN copies the least significant half of each element of a quadword vector into the corresponding
elements of a doubleword vector.
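A minimal Python model of the narrowing keeps only the low half of each source lane. The function name vmovn and the bits parameter, which gives the source element width, are assumptions of the example.

```python
def vmovn(elements, bits):
    """Keep the least significant half of each `bits`-wide lane."""
    half_mask = (1 << (bits // 2)) - 1
    return [element & half_mask for element in elements]


narrowed = vmovn([0x12345678, 0xFFFF0001], bits=32)
```

No saturation is applied in this model: the upper half of each source element is simply discarded.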
Related references
C1.9 Condition code suffixes on page C1-92
C3.74 VMOV2
Pseudo-instruction that generates an immediate value and places it in every element of an Advanced
SIMD vector, without loading a value from a literal pool.
Syntax
VMOV2{cond}.datatype Qd, #constant
where:
datatype
must be one of:
• I8, I16, I32, or I64.
• S8, S16, S32, or S64.
• U8, U16, U32, or U64.
• F32.
cond
is an optional condition code.
Operation
VMOV2 can generate any 16-bit immediate value, and a restricted range of 32-bit and 64-bit immediate
values.
VMOV2 is a pseudo-instruction that always assembles to exactly two instructions. It typically assembles to
a VMOV or VMVN instruction, followed by a VBIC or VORR instruction.
Related references
C3.68 VMOV (immediate) on page C3-463
C3.16 VBIC (immediate) on page C3-408
C1.9 Condition code suffixes on page C1-92
C3.75 VMRS
Transfer contents from an Advanced SIMD system register to a general-purpose register.
Syntax
VMRS{cond} Rd, extsysreg
where:
cond
is an optional condition code.
extsysreg
is the Advanced SIMD and floating-point system register, usually FPSCR, FPSID, or FPEXC.
Rd
is the general-purpose register.
Usage
The VMRS instruction transfers the contents of extsysreg into Rd.
Note
The instruction stalls the processor until all current Advanced SIMD or floating-point operations
complete.
Example
VMRS r2, FPSID
VMRS APSR_nzcv, FPSCR ; transfer FP status register to the
; special-purpose APSR
Related references
B1.14 Advanced SIMD system registers in AArch32 state on page B1-60
C1.9 Condition code suffixes on page C1-92
C4.26 VMRS (floating-point) on page C4-573
C3.76 VMSR
Transfer contents of a general-purpose register to an Advanced SIMD system register.
Syntax
VMSR{cond} extsysreg, Rd
where:
cond
is an optional condition code.
extsysreg
is the Advanced SIMD and floating-point system register, usually FPSCR, FPSID, or FPEXC.
Rd
is the general-purpose register.
Usage
The VMSR instruction transfers the contents of Rd into extsysreg.
Note
The instruction stalls the processor until all current Advanced SIMD operations complete.
Example
VMSR FPSCR, r4
Related references
B1.14 Advanced SIMD system registers in AArch32 state on page B1-60
C1.9 Condition code suffixes on page C1-92
C4.27 VMSR (floating-point) on page C4-574
C3.77 VMUL
Vector Multiply.
Syntax
VMUL{cond}.datatype {Qd}, Qn, Qm
VMUL{cond}.datatype {Dd}, Dn, Dm
where:
cond
is an optional condition code.
Qd, Qn, Qm
are the destination vector, the first operand vector, and the second operand vector, for a
quadword operation.
Dd, Dn, Dm
are the destination vector, the first operand vector, and the second operand vector, for a
doubleword operation.
Operation
VMUL multiplies corresponding elements in two vectors, and places the results in the destination vector.
Related references
C1.9 Condition code suffixes on page C1-92
C3.78 VMUL (by scalar)
Syntax
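VMUL{cond}.datatype {Qd}, Qn, Dm[x]
VMUL{cond}.datatype {Dd}, Dn, Dm[x]
where:
cond
is an optional condition code.
Qd, Qn
are the destination vector and the first operand vector, for a quadword operation.
Dd, Dn
are the destination vector and the first operand vector, for a doubleword operation.
Dm[x]
is the scalar holding the second operand.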
Operation
VMUL multiplies each element in a vector by a scalar, and places the results in the destination vector.
Related references
C1.9 Condition code suffixes on page C1-92
C3.79 VMULL
Vector Multiply Long
Syntax
VMULL{cond}.datatype Qd, Dn, Dm
where:
cond
is an optional condition code.
Qd, Dn, Dm
are the destination vector, the first operand vector, and the second operand vector, for a long
operation.
Operation
VMULL multiplies corresponding elements in two vectors, and places the results in the destination vector.
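The widening behaviour of a long multiply can be illustrated with a small scalar model. The following Python sketch is not Arm's pseudocode: the function name vmull_s16 and the use of Python lists as vectors (index 0 is the least significant lane) are assumptions made for illustration. It models the S16 case, in which four 16-bit lanes per operand produce four 32-bit results.

```python
def vmull_s16(dn, dm):
    """Illustrative lane model of VMULL.S16 (hypothetical helper, not an Arm API)."""
    assert len(dn) == len(dm) == 4, "a doubleword holds four 16-bit lanes"
    for lane in dn + dm:
        assert -32768 <= lane <= 32767, "each input lane must fit in S16"
    # The product of two S16 values always fits in a signed 32-bit lane,
    # so no saturation or truncation is needed.
    return [a * b for a, b in zip(dn, dm)]


products = vmull_s16([300, -300, 32767, -32768], [300, 300, 32767, -32768])
```

The first lane shows why the destination must be a quadword register: 300 * 300 = 90000 does not fit in 16 bits.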
Related references
C1.9 Condition code suffixes on page C1-92
C3.80 VMULL (by scalar)
Syntax
VMULL{cond}.datatype Qd, Dn, Dm[x]
where:
cond
is an optional condition code.
Qd, Dn
are the destination vector and the first operand vector, for a long operation.
Dm[x]
is the scalar holding the second operand.
Operation
VMULL multiplies each element in a vector by a scalar, and places the results in the destination vector.
Related references
C1.9 Condition code suffixes on page C1-92
C3.81 VMVN (register)
Syntax
VMVN{cond}{.datatype} Qd, Qm
VMVN{cond}{.datatype} Dd, Dm
where:
cond
is an optional condition code.
Qd, Qm
specifies the destination vector and the source vector, for a quadword operation.
Dd, Dm
specifies the destination vector and the source vector, for a doubleword operation.
Operation
VMVN inverts the value of each bit from the source register and places the results into the destination
register.
Related references
C1.9 Condition code suffixes on page C1-92
C3.82 VMVN (immediate)
Syntax
VMVN{cond}.datatype Qd, #imm
where:
cond
is an optional condition code.
imm
is an immediate value of the type specified by datatype. This is replicated to fill the destination
register.
Operation
VMVN inverts the value of each bit from an immediate value and places the results into each element in the
destination register.
datatype   imm
I8         -
I16        0xFFXY, 0xXYFF
I32        0xFFFFFFXY, 0xFFFFXYFF, 0xFFXYFFFF, 0xXYFFFFFF,
           0xFFFFXY00, 0xFFXY0000
I64        -
F32        -
Related references
C1.9 Condition code suffixes on page C1-92
C3.83 VNEG
Vector Negate.
Syntax
VNEG{cond}.datatype Qd, Qm
VNEG{cond}.datatype Dd, Dm
where:
cond
is an optional condition code.
Qd, Qm
are the destination vector and the operand vector, for a quadword operation.
Dd, Dm
are the destination vector and the operand vector, for a doubleword operation.
Operation
VNEG negates each element in a vector, and places the results in a second vector. (The floating-point
version only inverts the sign bit.)
Related references
C4.29 VNEG (floating-point) on page C4-576
C1.9 Condition code suffixes on page C1-92
C3.84 VORN (register)
Syntax
VORN{cond}{.datatype} {Qd}, Qn, Qm
VORN{cond}{.datatype} {Dd}, Dn, Dm
where:
cond
is an optional condition code.
Qd, Qn, Qm
specifies the destination register, the first operand register, and the second operand register, for a
quadword operation.
Dd, Dn, Dm
specifies the destination register, the first operand register, and the second operand register, for a
doubleword operation.
Operation
VORN performs a bitwise logical OR complement between two registers, and places the results in the
destination register.
Related references
C1.9 Condition code suffixes on page C1-92
C3.85 VORN (immediate)
Syntax
VORN{cond}.datatype Qd, #imm
where:
cond
is an optional condition code.
Operation
VORN takes each element of the destination vector, performs a bitwise OR complement with an immediate
value, and returns the results in the destination vector.
Note
On disassembly, this pseudo-instruction is disassembled to a corresponding VORR instruction, with a
complementary immediate value.
Immediate values
If datatype is I16, the immediate value must have one of the following forms:
• 0xFFXY.
• 0xXYFF.
If datatype is I32, the immediate value must have one of the following forms:
• 0xFFFFFFXY.
• 0xFFFFXYFF.
• 0xFFXYFFFF.
• 0xXYFFFFFF.
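The relationship between a VORN immediate and the VORR immediate that appears on disassembly can be sketched as follows. This Python model is an illustration only: the helper names are hypothetical, and it treats a single element as an unsigned integer of the stated width.

```python
def vorn_to_vorr_immediate(imm, bits):
    """Return the complementary immediate for a bits-wide element (hypothetical helper)."""
    mask = (1 << bits) - 1
    assert 0 <= imm <= mask, "immediate must fit in the element"
    return ~imm & mask


def vorn_element(element, imm, bits):
    """OR one element with the bitwise complement of imm."""
    return element | vorn_to_vorr_immediate(imm, bits)


i16_equiv = vorn_to_vorr_immediate(0xFF12, 16)      # 0xFFXY form
i32_equiv = vorn_to_vorr_immediate(0xFF12FFFF, 32)  # 0xFFXYFFFF form
```

Complementing an immediate of the form 0xFFXY gives a value of the form 0x00XY, which is why the permitted VORN immediates are the bitwise inverses of the permitted VORR immediates.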
Related references
C3.87 VORR (immediate) on page C3-482
C1.9 Condition code suffixes on page C1-92
C3.86 VORR (register)
Syntax
VORR{cond}{.datatype} {Qd}, Qn, Qm
VORR{cond}{.datatype} {Dd}, Dn, Dm
where:
cond
is an optional condition code.
Qd, Qn, Qm
specifies the destination register, the first operand register, and the second operand register, for a
quadword operation.
Dd, Dn, Dm
specifies the destination register, the first operand register, and the second operand register, for a
doubleword operation.
Note
VORR with the same register for both operands is a VMOV instruction. You can use VORR in this way, but
disassembly of the resulting code produces the VMOV syntax.
Operation
VORR performs a bitwise logical OR between two registers, and places the result in the destination
register.
Related references
C3.69 VMOV (register) on page C3-464
C1.9 Condition code suffixes on page C1-92
C3.87 VORR (immediate)
Syntax
VORR{cond}.datatype Qd, #imm
where:
cond
is an optional condition code.
Operation
VORR takes each element of the destination vector, performs a bitwise logical OR with an immediate
value, and places the results in the destination vector.
Immediate values
You can either specify imm as a pattern which the assembler repeats to fill the destination register, or you
can directly specify the immediate value (that conforms to the pattern) in full. The pattern for imm
depends on the datatype, as shown in the following table:
I16 I32
0x00XY 0x000000XY
0xXY00 0x0000XY00
- 0x00XY0000
- 0xXY000000
If you use the I8 or I64 datatypes, the assembler converts it to either the I16 or I32 instruction to match
the pattern of imm. If the immediate value does not match any of the patterns in the preceding table, the
assembler generates an error.
Related references
C1.9 Condition code suffixes on page C1-92
C3.88 VPADAL
Vector Pairwise Add and Accumulate Long.
Syntax
VPADAL{cond}.datatype Qd, Qm
VPADAL{cond}.datatype Dd, Dm
where:
cond
is an optional condition code.
Qd, Qm
are the destination vector and the operand vector, for a quadword instruction.
Dd, Dm
are the destination vector and the operand vector, for a doubleword instruction.
Operation
VPADAL adds adjacent pairs of elements of a vector, and accumulates the results into the elements of
the destination vector. Each destination element is twice the width of the source elements.
Figure C3-3 Example of operation of VPADAL (in this case for data type S16)
Related references
C1.9 Condition code suffixes on page C1-92
C3.89 VPADD
Vector Pairwise Add.
Syntax
VPADD{cond}.datatype {Dd}, Dn, Dm
where:
cond
is an optional condition code.
Dd, Dn, Dm
are the destination vector, the first operand vector, and the second operand vector.
Operation
VPADD adds adjacent pairs of elements of two vectors, and places the results in the destination vector.
Figure C3-4 Example of operation of VPADD (in this case, for data type I16)
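The pairwise pattern can be modelled in a few lines. The following Python sketch is an illustration rather than Arm's pseudocode: the function name is hypothetical, vectors are lists with index 0 as the least significant lane, and lanes are treated as unsigned 16-bit values. It assumes that the sums of the Dn pairs fill the lower half of Dd and the sums of the Dm pairs fill the upper half.

```python
def vpadd_i16(dn, dm):
    """Illustrative lane model of VPADD.I16 (hypothetical helper)."""
    assert len(dn) == len(dm) == 4, "a doubleword holds four 16-bit lanes"
    lanes = dn + dm  # Dn pairs produce the low results, Dm pairs the high results
    # Each sum wraps to 16 bits, because VPADD does not widen its results.
    return [(lanes[i] + lanes[i + 1]) & 0xFFFF for i in range(0, len(lanes), 2)]


sums = vpadd_i16([1, 2, 3, 4], [10, 20, 30, 0xFFFF])
```

The last lane wraps around (30 + 0xFFFF gives 29), which distinguishes VPADD from the widening VPADDL.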
Related references
C1.9 Condition code suffixes on page C1-92
C3.90 VPADDL
Vector Pairwise Add Long.
Syntax
VPADDL{cond}.datatype Qd, Qm
VPADDL{cond}.datatype Dd, Dm
where:
cond
is an optional condition code.
Qd, Qm
are the destination vector and the operand vector, for a quadword instruction.
Dd, Dm
are the destination vector and the operand vector, for a doubleword instruction.
Operation
VPADDL adds adjacent pairs of elements of a vector, sign or zero extends the results to twice their original
width, and places the final results in the destination vector.
Figure C3-5 Example of operation of doubleword VPADDL (in this case, for data type S16)
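A scalar model makes the widening explicit. This Python sketch is illustrative only: the function name is hypothetical, and vectors are lists with index 0 as the least significant lane. It models the S16 case, where each pair of 16-bit lanes produces one 32-bit result.

```python
def vpaddl_s16(dm):
    """Illustrative lane model of VPADDL.S16 (hypothetical helper)."""
    assert len(dm) % 2 == 0, "lanes are consumed in adjacent pairs"
    for lane in dm:
        assert -32768 <= lane <= 32767, "each input lane must fit in S16"
    # The sum of two S16 values always fits in S32, so no wrap-around occurs.
    return [dm[i] + dm[i + 1] for i in range(0, len(dm), 2)]


sums = vpaddl_s16([32767, 32767, -32768, -32768])
```

Both results lie outside the 16-bit range, so they would be lost by a non-widening pairwise add.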
Related references
C1.9 Condition code suffixes on page C1-92
C3.91 VPMAX and VPMIN
Syntax
VPop{cond}.datatype Dd, Dn, Dm
where:
op
must be either MAX or MIN.
cond
is an optional condition code.
Dd, Dn, Dm
are the destination doubleword vector, the first operand doubleword vector, and the second
operand doubleword vector.
Operation
VPMAX compares adjacent pairs of elements in two vectors, and copies the larger of each pair into the
corresponding element in the destination vector. Operands and results must be doubleword vectors.
VPMIN compares adjacent pairs of elements in two vectors, and copies the smaller of each pair into the
corresponding element in the destination vector. Operands and results must be doubleword vectors.
C3.92 VPOP
Pop extension registers from the stack.
Syntax
VPOP{cond} Registers
where:
cond
is an optional condition code.
Registers
is a list of consecutive extension registers enclosed in braces, { and }. The list can be comma-
separated, or in range format. There must be at least one register in the list.
You can specify D or Q registers, but they must not be mixed. The number of registers must not
exceed 16 D registers, or 8 Q registers. If Q registers are specified, on disassembly they are shown
as D registers.
Note
VPOP Registers is equivalent to VLDM sp!, Registers.
You can use either form of this instruction. They both disassemble to VPOP.
Related references
C1.9 Condition code suffixes on page C1-92
C3.93 VPUSH on page C3-488
C4.33 VPOP (floating-point) on page C4-580
C3.93 VPUSH
Push extension registers onto the stack.
Syntax
VPUSH{cond} Registers
where:
cond
is an optional condition code.
Registers
is a list of consecutive extension registers enclosed in braces, { and }. The list can be comma-
separated, or in range format. There must be at least one register in the list.
You can specify D or Q registers, but they must not be mixed. The number of registers must not
exceed 16 D registers, or 8 Q registers. If Q registers are specified, on disassembly they are shown
as D registers.
Note
VPUSH Registers is equivalent to VSTMDB sp!, Registers.
You can use either form of this instruction. They both disassemble to VPUSH.
Related references
C1.9 Condition code suffixes on page C1-92
C3.92 VPOP on page C3-487
C4.34 VPUSH (floating-point) on page C4-581
C3.94 VQABS
Vector Saturating Absolute.
Syntax
VQABS{cond}.datatype Qd, Qm
VQABS{cond}.datatype Dd, Dm
where:
cond
is an optional condition code.
Qd, Qm
are the destination vector and the operand vector, for a quadword operation.
Dd, Dm
are the destination vector and the operand vector, for a doubleword operation.
Operation
VQABS takes the absolute value of each element in a vector, and places the results in a second vector.
C3.95 VQADD
Vector Saturating Add.
Syntax
VQADD{cond}.datatype {Qd}, Qn, Qm
VQADD{cond}.datatype {Dd}, Dn, Dm
where:
cond
is an optional condition code.
datatype
must be one of S8, S16, S32, S64, U8, U16, U32, or U64.
Qd, Qn, Qm
are the destination vector, the first operand vector, and the second operand vector, for a
quadword operation.
Dd, Dn, Dm
are the destination vector, the first operand vector, and the second operand vector, for a
doubleword operation.
Operation
VQADD adds corresponding elements in two vectors, and places the results in the destination vector.
C3.96 VQDMLAL and VQDMLSL (by vector or by scalar)
Syntax
VQDopL{cond}.datatype Qd, Dn, Dm
VQDopL{cond}.datatype Qd, Dn, Dm[x]
where:
op
must be one of:
MLA
Multiply Accumulate.
MLS
Multiply Subtract.
cond
is an optional condition code.
Operation
These instructions multiply their operands and double the results. VQDMLAL adds the results to the values
in the destination register. VQDMLSL subtracts the results from the values in the destination register.
If any of the results overflow, they are saturated. The sticky QC flag (FPSCR bit[27]) is set if saturation
occurs.
Related references
C1.9 Condition code suffixes on page C1-92
C3.97 VQDMULH (by vector or by scalar)
Syntax
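VQDMULH{cond}.datatype {Qd}, Qn, Qm
VQDMULH{cond}.datatype {Dd}, Dn, Dm
where:
cond
is an optional condition code.
Qd, Qn
are the destination vector and the first operand vector, for a quadword operation.
Dd, Dn
are the destination vector and the first operand vector, for a doubleword operation.
Qm or Dm
is the second operand vector, for a quadword or doubleword operation.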
Operation
VQDMULH multiplies corresponding elements in two vectors, doubles the results, and places the most
significant half of the final results in the destination vector.
The second operand can be a scalar instead of a vector.
If any of the results overflow, they are saturated. The sticky QC flag (FPSCR bit[27]) is set if saturation
occurs. Each result is truncated.
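The doubling, high-half selection, and saturation can be modelled for a single data type. The following Python sketch is an illustration under stated assumptions, not Arm's pseudocode: the function name is hypothetical, lanes are signed 16-bit integers held in a list, and the returned flag stands in for the sticky QC bit.

```python
def vqdmulh_s16(qn, qm):
    """Illustrative lane model of VQDMULH.S16 (hypothetical helper)."""
    results, qc = [], False
    for a, b in zip(qn, qm):
        # Double the product, then keep the most significant half.
        # Python's >> floors, which discards the low 16 bits (truncation).
        high = (2 * a * b) >> 16
        if high > 32767:  # only possible when both inputs are -32768
            high, qc = 32767, True
        results.append(high)
    return results, qc


lanes, qc = vqdmulh_s16([16384, -32768, -32768, 1], [16384, -32768, 32767, 1])
```

If the lanes are read as Q15 fixed-point values, the first lane is 0.5 * 0.5 = 0.25 (8192). The second lane is the only input combination that saturates.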
Related references
C1.9 Condition code suffixes on page C1-92
C3.98 VQDMULL (by vector or by scalar)
Syntax
VQDMULL{cond}.datatype Qd, Dn, Dm
where:
cond
is an optional condition code.
Operation
VQDMULL multiplies corresponding elements in two vectors, doubles the results and places the results in
the destination register.
The second operand can be a scalar instead of a vector.
If any of the results overflow, they are saturated. The sticky QC flag (FPSCR bit[27]) is set if saturation
occurs.
Related references
C1.9 Condition code suffixes on page C1-92
C3.99 VQMOVN and VQMOVUN
Syntax
VQMOVN{cond}.datatype Dd, Qm
VQMOVUN{cond}.datatype Dd, Qm
where:
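cond
is an optional condition code.
datatype
must be one of S16, S32, or S64 for VQMOVN or VQMOVUN, or one of U16, U32, or U64
for VQMOVN.
Dd, Qm
specifies the destination vector and the operand vector.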
Operation
VQMOVN copies each element of the operand vector to the corresponding element of the destination vector.
The result element is half the width of the operand element, and values are saturated to the result width.
The results are the same type as the operands.
VQMOVUN copies each element of the operand vector to the corresponding element of the destination
vector. The result element is half the width of the operand element, and values are saturated to the result
width. The elements in the operand are signed and the elements in the result are unsigned.
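The difference between the two saturation ranges can be seen in a short model. This Python sketch is illustrative only: the helper names are hypothetical, the S32 source type is chosen as an example, and the returned flag is a stand-in for a saturation indicator.

```python
def narrow_saturating(lanes, lo, hi):
    """Clamp each lane to [lo, hi] and report whether any lane was clamped."""
    results, saturated = [], False
    for value in lanes:
        clamped = min(max(value, lo), hi)
        saturated = saturated or clamped != value
        results.append(clamped)
    return results, saturated


def vqmovn_s32(qm):
    """Model of VQMOVN.S32: signed 32-bit lanes to signed 16-bit lanes."""
    return narrow_saturating(qm, -32768, 32767)


def vqmovun_s32(qm):
    """Model of VQMOVUN.S32: signed 32-bit lanes to unsigned 16-bit lanes."""
    return narrow_saturating(qm, 0, 65535)


source = [70000, -70000, 123, -1]
```

With the same source lanes, VQMOVN clamps to the signed 16-bit range, whereas VQMOVUN clamps every negative lane to zero.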
Related references
C1.9 Condition code suffixes on page C1-92
C3.100 VQNEG
Vector Saturating Negate.
Syntax
VQNEG{cond}.datatype Qd, Qm
VQNEG{cond}.datatype Dd, Dm
where:
cond
is an optional condition code.
Qd, Qm
are the destination vector and the operand vector, for a quadword operation.
Dd, Dm
are the destination vector and the operand vector, for a doubleword operation.
Operation
VQNEG negates each element in a vector, and places the results in a second vector.
C3.101 VQRDMULH (by vector or by scalar)
Syntax
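VQRDMULH{cond}.datatype {Qd}, Qn, Qm
VQRDMULH{cond}.datatype {Dd}, Dn, Dm
where:
cond
is an optional condition code.
Qd, Qn
are the destination vector and the first operand vector, for a quadword operation.
Dd, Dn
are the destination vector and the first operand vector, for a doubleword operation.
Qm or Dm
is the second operand vector, for a quadword or doubleword operation.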
Operation
VQRDMULH multiplies corresponding elements in two vectors, doubles the results, and places the most
significant half of the final results in the destination vector.
The second operand can be a scalar instead of a vector.
If any of the results overflow, they are saturated. The sticky QC flag (FPSCR bit[27]) is set if saturation
occurs. Each result is rounded.
Related references
C1.9 Condition code suffixes on page C1-92
C3.102 VQRSHL (by signed variable)
Syntax
VQRSHL{cond}.datatype {Qd}, Qm, Qn
VQRSHL{cond}.datatype {Dd}, Dm, Dn
where:
cond
is an optional condition code.
datatype
must be one of S8, S16, S32, S64, U8, U16, U32, or U64.
Qd, Qm, Qn
are the destination vector, the first operand vector, and the second operand vector, for a
quadword operation.
Dd, Dm, Dn
are the destination vector, the first operand vector, and the second operand vector, for a
doubleword operation.
Operation
VQRSHL takes each element in a vector, shifts them by a value from the least significant byte of the
corresponding element of a second vector, and places the results in the destination vector. If the shift
value is positive, the operation is a left shift. Otherwise, it is a rounding right shift.
The sticky QC flag (FPSCR bit[27]) is set if saturation occurs.
Related references
C1.9 Condition code suffixes on page C1-92
C3.103 VQRSHRN and VQRSHRUN (by immediate)
Syntax
VQRSHR{U}N{cond}.datatype Dd, Qm, #imm
where:
U
if present, indicates that the results are unsigned, although the operands are signed. Otherwise,
the results are the same type as the operands.
cond
is an optional condition code.
imm
is the immediate value specifying the size of the shift. The ranges are shown in the following
table:
Table C3-10 Available immediate ranges in VQRSHRN and VQRSHRUN (by immediate)
Operation
VQRSHR{U}N takes each element in a quadword vector of integers, right shifts them by an immediate
value, and places the results in a doubleword vector.
The sticky QC flag (FPSCR bit[27]) is set if saturation occurs.
Results are rounded.
Related references
C1.9 Condition code suffixes on page C1-92
C3.104 VQSHL (by signed variable)
Syntax
VQSHL{cond}.datatype {Qd}, Qm, Qn
VQSHL{cond}.datatype {Dd}, Dm, Dn
where:
cond
is an optional condition code.
datatype
must be one of S8, S16, S32, S64, U8, U16, U32, or U64.
Qd, Qm, Qn
are the destination vector, the first operand vector, and the second operand vector, for a
quadword operation.
Dd, Dm, Dn
are the destination vector, the first operand vector, and the second operand vector, for a
doubleword operation.
Operation
VQSHL takes each element in a vector, shifts them by a value from the least significant byte of the
corresponding element of a second vector, and places the results in the destination vector. If the shift
value is positive, the operation is a left shift. Otherwise, it is a truncating right shift.
The sticky QC flag (FPSCR bit[27]) is set if saturation occurs.
Related references
C1.9 Condition code suffixes on page C1-92
C3.105 VQSHL and VQSHLU (by immediate)
Syntax
VQSHL{U}{cond}.datatype {Qd}, Qm, #imm
where:
U
only permitted if Q is also present. Indicates that the results are unsigned even though the
operands are signed.
cond
is an optional condition code.
datatype
must be one of:
S8, S16, S32, S64
for VQSHL or VQSHLU.
U8, U16, U32, U64
for VQSHL only.
imm
is the immediate value specifying the size of the shift, in the range 0 to (size(datatype) – 1).
The ranges are shown in the following table:
Table C3-11 Available immediate ranges in VQSHL and VQSHLU (by immediate)
Operation
VQSHL and VQSHLU instructions take each element in a vector of integers, left shift them by an immediate
value, and place the results in the destination vector.
The sticky QC flag (FPSCR bit[27]) is set if saturation occurs.
Related references
C1.9 Condition code suffixes on page C1-92
C3.106 VQSHRN and VQSHRUN (by immediate)
Syntax
VQSHR{U}N{cond}.datatype Dd, Qm, #imm
where:
U
if present, indicates that the results are unsigned, although the operands are signed. Otherwise,
the results are the same type as the operands.
cond
is an optional condition code.
imm
is the immediate value specifying the size of the shift. The ranges are shown in the following
table:
Table C3-12 Available immediate ranges in VQSHRN and VQSHRUN (by immediate)
Operation
VQSHR{U}N takes each element in a quadword vector of integers, right shifts them by an immediate value,
and places the results in a doubleword vector.
The sticky QC flag (FPSCR bit[27]) is set if saturation occurs.
Results are truncated.
Related references
C1.9 Condition code suffixes on page C1-92
C3.107 VQSUB
Vector Saturating Subtract.
Syntax
VQSUB{cond}.datatype {Qd}, Qn, Qm
VQSUB{cond}.datatype {Dd}, Dn, Dm
where:
cond
is an optional condition code.
datatype
must be one of S8, S16, S32, S64, U8, U16, U32, or U64.
Qd, Qn, Qm
are the destination vector, the first operand vector, and the second operand vector, for a
quadword operation.
Dd, Dn, Dm
are the destination vector, the first operand vector, and the second operand vector, for a
doubleword operation.
Operation
VQSUB subtracts the elements of one vector from the corresponding elements of another vector, and
places the results in the destination vector.
The sticky QC flag (FPSCR bit[27]) is set if saturation occurs.
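Saturating subtraction and the QC flag can be modelled as follows. This Python sketch is an illustration, not Arm's pseudocode: the function name and its bits and signed parameters are hypothetical, and the returned boolean stands in for the sticky QC flag. It assumes that the elements of the second operand are subtracted from the elements of the first.

```python
def vqsub(qn, qm, bits, signed):
    """Illustrative lane model of VQSUB (hypothetical helper)."""
    if signed:
        lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    else:
        lo, hi = 0, (1 << bits) - 1
    results, qc = [], False
    for a, b in zip(qn, qm):
        exact = a - b
        clamped = min(max(exact, lo), hi)  # saturate instead of wrapping
        qc = qc or clamped != exact        # set once any lane saturates
        results.append(clamped)
    return results, qc


signed_case = vqsub([-128, 100, 5], [1, -100, 3], bits=8, signed=True)
unsigned_case = vqsub([3, 200], [5, 100], bits=8, signed=False)
```

In the unsigned case, 3 - 5 saturates to 0 rather than wrapping to 254.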
Related references
C1.9 Condition code suffixes on page C1-92
C3.108 VRADDHN
Vector Rounding Add and Narrow, selecting High half.
Syntax
VRADDHN{cond}.datatype Dd, Qn, Qm
where:
cond
is an optional condition code.
Dd, Qn, Qm
are the destination vector, the first operand vector, and the second operand vector.
Operation
VRADDHN adds corresponding elements in two quadword vectors, selects the most significant halves of the
results, and places the final results in the destination doubleword vector. Results are rounded.
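The rounding step amounts to adding half of the discarded range before the high half is selected. The following Python sketch illustrates this for 32-bit source lanes. The function name is hypothetical, and lanes are treated as unsigned 32-bit values whose sum wraps modulo 2^32.

```python
def vraddhn_i32(qn, qm):
    """Illustrative lane model of VRADDHN.I32 (hypothetical helper)."""
    results = []
    for a, b in zip(qn, qm):
        # 0x8000 is half of the discarded low 16 bits, so the result is rounded.
        total = (a + b + 0x8000) & 0xFFFFFFFF
        results.append(total >> 16)  # keep the most significant half
    return results


high_halves = vraddhn_i32(
    [0x00017FFF, 0x00018000, 0xFFFFFFFF, 0x12340000],
    [0x00000000, 0x00000000, 0x00000001, 0x00010000],
)
```

The first two lanes differ by one in the low half, and fall either side of the rounding boundary.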
Related references
C1.9 Condition code suffixes on page C1-92
C3.109 VRECPE
Vector Reciprocal Estimate.
Syntax
VRECPE{cond}.datatype Qd, Qm
VRECPE{cond}.datatype Dd, Dm
where:
cond
is an optional condition code.
Qd, Qm
are the destination vector and the operand vector, for a quadword operation.
Dd, Dm
are the destination vector and the operand vector, for a doubleword operation.
Operation
VRECPE finds an approximate reciprocal of each element in a vector, and places the results in a second
vector.
Related references
C1.9 Condition code suffixes on page C1-92
C3.110 VRECPS
Vector Reciprocal Step.
Syntax
VRECPS{cond}.F32 {Qd}, Qn, Qm
VRECPS{cond}.F32 {Dd}, Dn, Dm
where:
cond
is an optional condition code.
Qd, Qn, Qm
are the destination vector, the first operand vector, and the second operand vector, for a
quadword operation.
Dd, Dn, Dm
are the destination vector, the first operand vector, and the second operand vector, for a
doubleword operation.
Operation
VRECPS multiplies the elements of one vector by the corresponding elements of another vector, subtracts
each of the results from 2, and places the final results into the elements of the destination vector.
The Newton-Raphson iteration:
x_{n+1} = x_n (2 - d x_n)
converges to 1/d if x_0 is the result of VRECPE applied to d.
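The way the step instruction fits into the iteration can be sketched in Python. This is a numerical illustration only: the function names are hypothetical, Python floats are double precision rather than F32, and the initial estimate is supplied by hand where real code would take it from VRECPE.

```python
def vrecps(a, b):
    """Scalar model of one VRECPS lane: subtract the product from 2."""
    return 2.0 - a * b


def reciprocal(d, x0, steps):
    """Refine an estimate of 1/d with the iteration x = x * (2 - d * x)."""
    x = x0
    for _ in range(steps):
        x = x * vrecps(d, x)  # one step instruction followed by one multiply
    return x


estimate = reciprocal(3.0, 0.3, steps=4)
```

Each refinement costs one VRECPS and one VMUL, and roughly doubles the number of correct bits, so a small number of steps is normally enough.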
Related references
C1.9 Condition code suffixes on page C1-92
C3.111 VREV16, VREV32, and VREV64
Syntax
VREVn{cond}.size Qd, Qm
VREVn{cond}.size Dd, Dm
where:
n
must be one of 16, 32, or 64.
cond
is an optional condition code.
Qd, Qm
specifies the destination vector and the operand vector, for a quadword operation.
Dd, Dm
specifies the destination vector and the operand vector, for a doubleword operation.
Operation
VREV16 reverses the order of 8-bit elements within each halfword of the vector, and places the result in
the corresponding destination vector.
VREV32 reverses the order of 8-bit or 16-bit elements within each word of the vector, and places the result
in the corresponding destination vector.
VREV64 reverses the order of 8-bit, 16-bit, or 32-bit elements within each doubleword of the vector, and
places the result in the corresponding destination vector.
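The three variants differ only in the size of the region within which elements are reversed. The following Python sketch models this on a byte string. The function name and its parameters are hypothetical, and the vector is represented as bytes in ascending order of significance.

```python
def vrev(region_bits, elem_bits, data):
    """Reverse elem_bits-wide elements within each region_bits-wide region."""
    region, elem = region_bits // 8, elem_bits // 8
    assert region > elem and len(data) % region == 0
    out = bytearray()
    for start in range(0, len(data), region):
        chunk = data[start:start + region]
        elements = [chunk[i:i + elem] for i in range(0, region, elem)]
        for element in reversed(elements):  # element contents are not changed
            out += element
    return bytes(out)


vector = bytes(range(8))  # one doubleword: bytes 0 to 7
```

For example, vrev(32, 8, vector) models VREV32.8 and reverses the byte order of each word, which is a common way to convert the endianness of 32-bit data.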
Related references
C1.9 Condition code suffixes on page C1-92
C3.112 VRHADD
Vector Rounding Halving Add.
Syntax
VRHADD{cond}.datatype {Qd}, Qn, Qm
VRHADD{cond}.datatype {Dd}, Dn, Dm
where:
cond
is an optional condition code.
Qd, Qn, Qm
are the destination vector, the first operand vector, and the second operand vector, for a
quadword operation.
Dd, Dn, Dm
are the destination vector, the first operand vector, and the second operand vector, for a
doubleword operation.
Operation
VRHADD adds corresponding elements in two vectors, shifts each result right one bit, and places the results
in the destination vector. Results are rounded.
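A rounding halving add can be modelled as an average that rounds halves upwards. This Python sketch is illustrative only. The function name is hypothetical, and Python's unbounded integers stand in for the wider intermediate value that prevents the sum from overflowing.

```python
def vrhadd(qn, qm):
    """Illustrative lane model of VRHADD (hypothetical helper)."""
    # Adding 1 before the shift rounds halves towards plus infinity.
    return [(a + b + 1) >> 1 for a, b in zip(qn, qm)]


unsigned_lanes = vrhadd([255, 1, 2], [255, 2, 2])
signed_lanes = vrhadd([-3], [-4])
```

The first lane shows that 255 + 255 does not overflow an 8-bit result, because the halving is applied to the full-width sum.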
Related references
C1.9 Condition code suffixes on page C1-92
C3.113 VRSHL (by signed variable)
Syntax
VRSHL{cond}.datatype {Qd}, Qm, Qn
VRSHL{cond}.datatype {Dd}, Dm, Dn
where:
cond
is an optional condition code.
datatype
must be one of S8, S16, S32, S64, U8, U16, U32, or U64.
Qd, Qm, Qn
are the destination vector, the first operand vector, and the second operand vector, for a
quadword operation.
Dd, Dm, Dn
are the destination vector, the first operand vector, and the second operand vector, for a
doubleword operation.
Operation
VRSHL takes each element in a vector, shifts them by a value from the least significant byte of the
corresponding element of a second vector, and places the results in the destination vector. If the shift
value is positive, the operation is a left shift. Otherwise, it is a rounding right shift.
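The dependence on the sign of the shift value can be sketched for one data type. This Python model is an illustration, not Arm's pseudocode: the helper names are hypothetical, and it covers only S16 lanes. It takes the shift from the least significant byte of each lane of the second vector, interpreted as a signed value.

```python
def to_signed(value, bits):
    """Interpret the low bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >= 1 << (bits - 1) else value


def vrshl_s16(values, shifts):
    """Illustrative lane model of VRSHL.S16 (hypothetical helper)."""
    results = []
    for value, shift_lane in zip(values, shifts):
        shift = to_signed(shift_lane & 0xFF, 8)  # only the low byte is used
        if shift >= 0:
            shifted = value << shift                      # left shift
        else:
            rounding = 1 << (-shift - 1)                  # half of the last bit shifted out
            shifted = (value + rounding) >> -shift        # rounding right shift
        results.append(to_signed(shifted, 16))            # result wraps, no saturation
    return results


lanes = vrshl_s16([1, 5, -5, 7, 0x4000], [3, -1, -1, 0xFE, 2])
```

The fourth lane uses 0xFE, which is -2 as a signed byte, so 7 is shifted right by two places and rounded to 2. The last lane shows that bits shifted out on the left are discarded.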
Related references
C1.9 Condition code suffixes on page C1-92
C3.114 VRSHR (by immediate)
Syntax
VRSHR{cond}.datatype {Qd}, Qm, #imm
VRSHR{cond}.datatype {Dd}, Dm, #imm
where:
cond
is an optional condition code.
datatype
must be one of S8, S16, S32, S64, U8, U16, U32, or U64.
Qd, Qm
are the destination vector and the operand vector, for a quadword operation.
Dd, Dm
are the destination vector and the operand vector, for a doubleword operation.
imm
is the immediate value specifying the size of the shift, in the range 0 to (size(datatype)). The
ranges are shown in the following table:
Operation
VRSHR takes each element in a vector, right shifts them by an immediate value, and places the results in
the destination vector. The results are rounded.
Related references
C3.86 VORR (register) on page C3-481
C1.9 Condition code suffixes on page C1-92
C3.115 VRSHRN (by immediate)
Syntax
VRSHRN{cond}.datatype Dd, Qm, #imm
where:
cond
is an optional condition code.
imm
is the immediate value specifying the size of the shift, in the range 0 to (size(datatype)/2). The
ranges are shown in the following table:
Operation
VRSHRN takes each element in a quadword vector, right shifts them by an immediate value, and places the
results in a doubleword vector. The results are rounded.
Related references
C3.73 VMOVN on page C3-468
C1.9 Condition code suffixes on page C1-92
C3.116 VRINT
VRINT (Vector Round to Integer) rounds each floating-point element in a vector to integer, and places the
results in the destination vector.
The resulting integers are represented in floating-point format.
Note
This instruction is supported only in Armv8.
Syntax
VRINTmode.F32.F32 Qd, Qm
VRINTmode.F32.F32 Dd, Dm
where:
mode
must be one of:
A
meaning round to nearest, ties away from zero. This cannot generate an Inexact
exception, even if the result is not exact.
N
meaning round to nearest, ties to even. This cannot generate an Inexact exception, even
if the result is not exact.
X
meaning round to nearest, ties to even, generating an Inexact exception if the result is
not exact.
P
meaning round towards plus infinity. This cannot generate an Inexact exception, even if
the result is not exact.
M
meaning round towards minus infinity. This cannot generate an Inexact exception, even
if the result is not exact.
Z
meaning round towards zero. This cannot generate an Inexact exception, even if the
result is not exact.
Qd, Qm
specifies the destination vector and the operand vector, for a quadword operation.
Dd, Dm
specifies the destination and operand vectors, for a doubleword operation.
Notes
You cannot use VRINT inside an IT block.
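The rounding modes can be compared with a small scalar model. The following Python sketch is an illustration under stated assumptions: the function name is hypothetical, the letter A is assumed to select round to nearest with ties away from zero, only finite inputs are handled, and the Inexact exception and the sign of a zero result are not modelled.

```python
import math


def vrint(mode, x):
    """Round a finite float to an integral float (hypothetical helper)."""
    if mode == "A":  # nearest, ties away from zero
        return math.copysign(math.floor(abs(x) + 0.5), x)
    if mode in ("N", "X"):  # nearest, ties to even
        return float(round(x))
    if mode == "P":  # towards plus infinity
        return float(math.ceil(x))
    if mode == "M":  # towards minus infinity
        return float(math.floor(x))
    if mode == "Z":  # towards zero
        return float(math.trunc(x))
    raise ValueError("unknown rounding mode: " + mode)


positive_tie = [vrint(m, 2.5) for m in "ANPMZ"]
negative_tie = [vrint(m, -2.5) for m in "ANPMZ"]
```

A tie such as 2.5 separates the modes clearly: A gives 3.0, whereas N gives 2.0.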
C3.117 VRSQRTE
Vector Reciprocal Square Root Estimate.
Syntax
VRSQRTE{cond}.datatype Qd, Qm
VRSQRTE{cond}.datatype Dd, Dm
where:
cond
is an optional condition code.
Qd, Qm
are the destination vector and the operand vector, for a quadword operation.
Dd, Dm
are the destination vector and the operand vector, for a doubleword operation.
Operation
VRSQRTE finds an approximate reciprocal square root of each element in a vector, and places the results in
a second vector.
Negative 0
Related references
C1.9 Condition code suffixes on page C1-92
C3.118 VRSQRTS
Vector Reciprocal Square Root Step.
Syntax
VRSQRTS{cond}.F32 {Qd}, Qn, Qm
VRSQRTS{cond}.F32 {Dd}, Dn, Dm
where:
cond
is an optional condition code.
Qd, Qn, Qm
are the destination vector, the first operand vector, and the second operand vector, for a
quadword operation.
Dd, Dn, Dm
are the destination vector, the first operand vector, and the second operand vector, for a
doubleword operation.
Operation
VRSQRTS multiplies the elements of one vector by the corresponding elements of another vector, subtracts
each of the results from three, divides these results by two, and places the final results into the elements
of the destination vector.
The Newton-Raphson iteration:
$x_{n+1} = x_n (3 - d x_n^2) / 2$
converges to $1/\sqrt{d}$ if $x_0$ is the result of VRSQRTE applied to $d$.
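The refinement loop can be sketched in Python as follows. This is an illustrative model only: the function names vrsqrts and refine_rsqrt are hypothetical, elements are Python floats, and the starting estimate is supplied by the caller instead of coming from VRSQRTE.

```python
def vrsqrts(a, b):
    """Per-element step value (3 - a*b) / 2, as described for VRSQRTS."""
    return [(3.0 - x * y) / 2.0 for x, y in zip(a, b)]

def refine_rsqrt(d, estimate, steps=2):
    """Refine an estimate of 1/sqrt(d) with Newton-Raphson iterations."""
    x = list(estimate)
    for _ in range(steps):
        dx = [di * xi for di, xi in zip(d, x)]     # multiply: d * x
        step = vrsqrts(dx, x)                      # step: (3 - d*x*x) / 2
        x = [xi * si for xi, si in zip(x, step)]   # multiply: x * step
    return x
```

Each iteration in this sketch corresponds to one multiply, one VRSQRTS, and one further multiply. Starting from 0.4 as a rough estimate of 1/sqrt(4), a few steps approach 0.5.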
Related references
C1.9 Condition code suffixes on page C1-92
C3.119 VRSRA (by immediate)
Syntax
VRSRA{cond}.datatype {Qd}, Qm, #imm
VRSRA{cond}.datatype {Dd}, Dm, #imm
where:
cond
is an optional condition code.
datatype
must be one of S8, S16, S32, S64, U8, U16, U32, or U64.
Qd, Qm
are the destination vector and the operand vector, for a quadword operation.
Dd, Dm
are the destination vector and the operand vector, for a doubleword operation.
imm
is the immediate value specifying the size of the shift, in the range 1 to (size(datatype)). The
ranges are shown in the following table:
Operation
VRSRA takes each element in a vector, right shifts them by an immediate value, and accumulates the
results into the destination vector. The results are rounded.
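A minimal Python sketch of the rounding shift and accumulate is shown below. It is illustrative only: the function name vrsra is hypothetical, elements are passed as unsigned bit patterns, and rounding is modelled by adding half of the last discarded bit position before shifting.

```python
def vrsra(acc, src, imm, size=8, signed=False):
    """Rounding shift right by imm, accumulated into acc (modulo 2**size)."""
    if not 1 <= imm <= size:
        raise ValueError("imm must be in the range 1 to size")
    mask = (1 << size) - 1
    out = []
    for a, x in zip(acc, src):
        if signed and x & (1 << (size - 1)):
            x -= 1 << size                         # reinterpret as negative
        shifted = (x + (1 << (imm - 1))) >> imm    # round, then shift
        out.append((a + shifted) & mask)
    return out
```

For example, shifting 5 right by one gives 3 with rounding (2 without), so an accumulator holding 10 becomes 13.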
Related references
C1.9 Condition code suffixes on page C1-92
C3.120 VRSUBHN
Vector Rounding Subtract and Narrow, selecting High half.
Syntax
VRSUBHN{cond}.datatype Dd, Qn, Qm
where:
cond
is an optional condition code.
datatype
must be one of I16, I32, or I64.
Dd, Qn, Qm
are the destination vector, the first operand vector, and the second operand vector.
Operation
VRSUBHN subtracts the elements of one quadword vector from the corresponding elements of another
quadword vector, selects the most significant halves of the results, and places the final results in the
destination doubleword vector. Results are rounded.
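The difference between a rounded and a truncated high half can be seen in a short Python model. This is a sketch under stated assumptions: the function name subhn is hypothetical, elements are unsigned bit patterns, the result is taken as Qn minus Qm, and rounding is modelled by adding half of the discarded low half before narrowing.

```python
def subhn(qn, qm, size=16, rounding=True):
    """Subtract, then keep the most significant half of each result."""
    half = size // 2
    mask = (1 << size) - 1
    out = []
    for a, b in zip(qn, qm):
        diff = (a - b) & mask
        if rounding:
            diff = (diff + (1 << (half - 1))) & mask
        out.append(diff >> half)
    return out
```

With rounding=True the function models VRSUBHN. With rounding=False it models the truncating VSUBHN.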
Related references
C1.9 Condition code suffixes on page C1-92
C3.121 VSDOT (vector)
Syntax
VSDOT{q}.S8 Dd, Dn, Dm ; 64-bit SIMD vector
VSDOT{q}.S8 Qd, Qn, Qm ; 128-bit SIMD vector
Where:
q
Is an optional instruction width specifier. See C2.2 Instruction width specifiers on page C2-111.
Dd
Is the 64-bit name of the SIMD and FP destination register.
Dn
Is the 64-bit name of the first SIMD and FP source register.
Dm
Is the 64-bit name of the second SIMD and FP source register.
Qd
Is the 128-bit name of the SIMD and FP destination register.
Qn
Is the 128-bit name of the first SIMD and FP source register.
Qm
Is the 128-bit name of the second SIMD and FP source register.
Architectures supported
Supported in Armv8.2 and later.
For Armv8.2 and Armv8.3, this is an OPTIONAL instruction.
Usage
Dot Product vector form with signed integers. This instruction performs the dot product of the four 8-bit
elements in each 32-bit element of the first source register with the four 8-bit elements of the
corresponding 32-bit element in the second source register, accumulating the result into the
corresponding 32-bit element of the destination register.
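The grouping of four 8-bit elements per 32-bit lane can be modelled in Python. This is an illustrative sketch: the function names are hypothetical, source registers are given as flat lists of bytes with the lowest-numbered byte first, and the accumulator is assumed to wrap modulo 2^32.

```python
def to_signed8(b):
    """Reinterpret an unsigned byte value as a signed 8-bit integer."""
    return b - 256 if b & 0x80 else b

def vsdot(acc, n_bytes, m_bytes):
    """Signed dot product of each group of four bytes, added to acc."""
    out = []
    for i, a in enumerate(acc):
        total = sum(to_signed8(n_bytes[4 * i + j]) * to_signed8(m_bytes[4 * i + j])
                    for j in range(4))
        out.append((a + total) & 0xFFFFFFFF)
    return out
```

A 64-bit operation has two 32-bit lanes (eight bytes per source), and a 128-bit operation has four.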
Note
ID_ISAR6.DP indicates whether this instruction is supported in the T32 and A32 instruction sets.
Related references
C3.1 Summary of Advanced SIMD instructions on page C3-391
C3.122 VSDOT (by element)
Syntax
VSDOT{q}.S8 Dd, Dn, Dm[index] ; 64-bit SIMD vector
VSDOT{q}.S8 Qd, Qn, Dm[index] ; A1 128-bit SIMD vector FP/SIMD registers (A32)
Where:
q
Is an optional instruction width specifier. See C2.2 Instruction width specifiers on page C2-111.
Dd
Is the 64-bit name of the SIMD and FP destination register.
Dn
Is the 64-bit name of the first SIMD and FP source register.
Dm
Is the 64-bit name of the second SIMD and FP source register.
index
Is the element index in the range 0 to 1.
Qd
Is the 128-bit name of the SIMD and FP destination register.
Qn
Is the 128-bit name of the first SIMD and FP source register.
Architectures supported
Supported in Armv8.2 and later.
For Armv8.2 and Armv8.3, this is an OPTIONAL instruction.
Usage
Dot Product index form with signed integers. This instruction performs the dot product of the four 8-bit
elements in each 32-bit element of the first source register with the four 8-bit elements of an indexed 32-
bit element in the second source register, accumulating the result into the corresponding 32-bit element
of the destination register.
Note
ID_ISAR6.DP indicates whether this instruction is supported in the T32 and A32 instruction sets.
Related references
C3.1 Summary of Advanced SIMD instructions on page C3-391
C3.123 VSHL (by immediate)
Syntax
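VSHL{cond}.datatype {Qd}, Qm, #imm
VSHL{cond}.datatype {Dd}, Dm, #imm
where:
cond
is an optional condition code.
datatype
must be one of I8, I16, I32, or I64.
Qd, Qm
are the destination vector and the operand vector, for a quadword operation.
Dd, Dm
are the destination vector and the operand vector, for a doubleword operation.
imm
is the immediate value specifying the size of the shift. The ranges are shown in the following
table: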
Operation
VSHL takes each element in a vector of integers, left shifts them by an immediate value, and places the
results in the destination vector.
Bits shifted out of the left of each element are lost.
The following figure shows the operation of VSHL with two elements and a shift value of one. The least
significant bit in each element in the destination vector is set to zero.
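[Figure: elements 1 and 0 of Qm, each shifted left by one bit into the corresponding element of Qd, with 0 in the least significant bit.]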
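The left shift can be modelled per element in Python. This is an illustrative sketch: the function name vshl_imm is hypothetical, elements are unsigned bit patterns, and the permitted range of imm is assumed to be 0 to (size - 1).

```python
def vshl_imm(src, imm, size=8):
    """Shift each element left by imm, discarding bits shifted out."""
    if not 0 <= imm < size:
        raise ValueError("imm assumed to be in the range 0 to size - 1")
    mask = (1 << size) - 1
    return [(x << imm) & mask for x in src]
```

With a shift of one, the top bit of each element is lost and the least significant bit becomes zero.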
C3.124 VSHL (by signed variable)
Syntax
VSHL{cond}.datatype {Qd}, Qm, Qn
VSHL{cond}.datatype {Dd}, Dm, Dn
where:
cond
is an optional condition code.
datatype
must be one of S8, S16, S32, S64, U8, U16, U32, or U64.
Qd, Qm, Qn
are the destination vector, the first operand vector, and the second operand vector, for a
quadword operation.
Dd, Dm, Dn
are the destination vector, the first operand vector, and the second operand vector, for a
doubleword operation.
Operation
VSHL takes each element in a vector, shifts them by the value from the least significant byte of the
corresponding element of a second vector, and places the results in the destination vector. If the shift
value is positive, the operation is a left shift. Otherwise, it is a truncating right shift.
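The signed shift count can be modelled in Python as follows. This is a sketch only: the function name vshl_reg is hypothetical, elements are unsigned bit patterns, and the shift count is taken from the least significant byte of each element of the second vector, interpreted as a signed 8-bit value.

```python
def vshl_reg(src, shifts, size=8, signed=False):
    """Shift left for positive counts, truncating right shift otherwise."""
    mask = (1 << size) - 1
    out = []
    for x, s in zip(src, shifts):
        shift = s & 0xFF
        if shift & 0x80:
            shift -= 256                  # signed least significant byte
        if signed and x & (1 << (size - 1)):
            x -= 1 << size                # reinterpret as negative
        if shift >= 0:
            out.append((x << shift) & mask)
        else:
            out.append((x >> -shift) & mask)
    return out
```

A count byte of 0xFE is -2, so it shifts right by two. For signed datatypes the right shift is arithmetic.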
Related references
C1.9 Condition code suffixes on page C1-92
C3.125 VSHLL (by immediate)
Syntax
VSHLL{cond}.datatype Qd, Dm, #imm
where:
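cond
is an optional condition code.
datatype
must be one of S8, S16, S32, U8, U16, or U32.
Qd, Dm
are the destination vector and the operand vector, for a long operation.
imm
is the immediate value specifying the size of the shift. The ranges are shown in the following
table: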
Operation
VSHLL takes each element in a vector of integers, left shifts them by an immediate value, and places the
results in the destination vector. Values are sign or zero extended.
Related references
C1.9 Condition code suffixes on page C1-92
C3.126 VSHR (by immediate)
Syntax
VSHR{cond}.datatype {Qd}, Qm, #imm
VSHR{cond}.datatype {Dd}, Dm, #imm
where:
cond
is an optional condition code.
datatype
must be one of S8, S16, S32, S64, U8, U16, U32, or U64.
Qd, Qm
are the destination vector and the operand vector, for a quadword operation.
Dd, Dm
are the destination vector and the operand vector, for a doubleword operation.
imm
is the immediate value specifying the size of the shift. The ranges are shown in the following
table:
Operation
VSHR takes each element in a vector, right shifts them by an immediate value, and places the results in the
destination vector. The results are truncated.
Related references
C3.86 VORR (register) on page C3-481
C1.9 Condition code suffixes on page C1-92
C3.127 VSHRN (by immediate)
Syntax
VSHRN{cond}.datatype Dd, Qm, #imm
where:
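cond
is an optional condition code.
datatype
must be one of I16, I32, or I64.
Dd, Qm
are the destination vector and the operand vector.
imm
is the immediate value specifying the size of the shift. The ranges are shown in the following
table: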
Operation
VSHRN takes each element in a quadword vector, right shifts them by an immediate value, and places the
results in a doubleword vector. The results are truncated.
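The narrowing step can be modelled in Python. This is an illustrative sketch: the function name vshrn is hypothetical, size is the width of the source elements, and imm is assumed to be in the range 1 to size/2.

```python
def vshrn(src, imm, size=16):
    """Shift each element right by imm and keep the low half of the result."""
    half = size // 2
    if not 1 <= imm <= half:
        raise ValueError("imm assumed to be in the range 1 to size/2")
    return [(x >> imm) & ((1 << half) - 1) for x in src]
```

For example, 0x1234 shifted right by four is 0x123, which narrows to 0x23.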
Related references
C3.73 VMOVN on page C3-468
C1.9 Condition code suffixes on page C1-92
C3.128 VSLI
Vector Shift Left and Insert.
Syntax
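VSLI{cond}.size {Qd}, Qm, #imm
VSLI{cond}.size {Dd}, Dm, #imm
where:
cond
is an optional condition code.
size
must be one of 8, 16, 32, or 64.
Qd, Qm
are the destination vector and the operand vector, for a quadword operation.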
Dd, Dm
are the destination vector and the operand vector, for a doubleword operation.
imm
is the immediate value specifying the size of the shift, in the range 0 to (size – 1).
Operation
VSLI takes each element in a vector, left shifts them by an immediate value, and inserts the results in the
destination vector. Bits shifted out of the left of each element are lost. The following figure shows the
operation of VSLI with two elements and a shift value of one. The least significant bit in each element in
the destination vector is unchanged.
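[Figure: elements 1 and 0 of Qm, each shifted left by one bit and inserted into the corresponding element of Qd; the least significant bit of each Qd element is unchanged.]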
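The insert behaviour can be modelled in Python. This is a sketch only: the function name vsli is hypothetical and elements are unsigned bit patterns. The low imm bits of each destination element are kept, and the remaining bits come from the shifted operand.

```python
def vsli(dst, src, imm, size=8):
    """Shift src left by imm and insert it above the low imm bits of dst."""
    if not 0 <= imm < size:
        raise ValueError("imm must be in the range 0 to size - 1")
    mask = (1 << size) - 1
    keep = (1 << imm) - 1                 # destination bits left unchanged
    return [((x << imm) & mask) | (d & keep) for d, x in zip(dst, src)]
```

With a shift of four, a destination of 0x0A and an operand of 0x05 combine to 0x5A.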
C3.129 VSRA (by immediate)
Syntax
VSRA{cond}.datatype {Qd}, Qm, #imm
VSRA{cond}.datatype {Dd}, Dm, #imm
where:
cond
is an optional condition code.
datatype
must be one of S8, S16, S32, S64, U8, U16, U32, or U64.
Qd, Qm
are the destination vector and the operand vector, for a quadword operation.
Dd, Dm
are the destination vector and the operand vector, for a doubleword operation.
imm
is the immediate value specifying the size of the shift. The ranges are shown in the following
table:
Operation
VSRA takes each element in a vector, right shifts them by an immediate value, and accumulates the results
into the destination vector. The results are truncated.
Related references
C1.9 Condition code suffixes on page C1-92
C3.130 VSRI
Vector Shift Right and Insert.
Syntax
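VSRI{cond}.size {Qd}, Qm, #imm
VSRI{cond}.size {Dd}, Dm, #imm
where:
cond
is an optional condition code.
size
must be one of 8, 16, 32, or 64.
Qd, Qm
are the destination vector and the operand vector, for a quadword operation.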
Dd, Dm
are the destination vector and the operand vector, for a doubleword operation.
imm
is the immediate value specifying the size of the shift, in the range 1 to size.
Operation
VSRI takes each element in a vector, right shifts them by an immediate value, and inserts the results in the
destination vector. Bits shifted out of the right of each element are lost. The following figure shows the
operation of VSRI with a single element and a shift value of two. The two most significant bits in the
destination vector are unchanged.
[Figure: element 0 of Dm shifted right by two bits and inserted into Dd; the two most significant bits of Dd are unchanged.]
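A Python model of the right shift and insert is sketched below. It is illustrative only: the function name vsri is hypothetical and elements are unsigned bit patterns. The top imm bits of each destination element are kept, and the remaining bits come from the shifted operand.

```python
def vsri(dst, src, imm, size=8):
    """Shift src right by imm and insert it below the top imm bits of dst."""
    if not 1 <= imm <= size:
        raise ValueError("imm must be in the range 1 to size")
    mask = (1 << size) - 1
    inserted = mask >> imm                # bit positions written by the operand
    return [(d & ~inserted & mask) | (x >> imm) for d, x in zip(dst, src)]
```

With a shift of two, a destination of 0xFF and an operand of 0x00 give 0xC0.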
C3.131 VSTM
Extension register store multiple.
Syntax
VSTMmode{cond} Rn{!}, Registers
where:
mode
must be one of:
IA
meaning Increment address After each transfer. IA is the default, and can be omitted.
DB
meaning Decrement address Before each transfer.
EA
meaning Empty Ascending stack operation. This is the same as IA for stores.
FD
meaning Full Descending stack operation. This is the same as DB for stores.
cond
is an optional condition code.
Rn
is the general-purpose register holding the base address for the transfer.
!
is optional. ! specifies that the updated base address must be written back to Rn. If ! is not
specified, mode must be IA.
Registers
is a list of consecutive extension registers enclosed in braces, { and }. The list can be comma-
separated, or in range format. There must be at least one register in the list.
You can specify D or Q registers, but they must not be mixed. The number of registers must not
exceed 16 D registers, or 8 Q registers. If Q registers are specified, on disassembly they are shown
as D registers.
Note
VPUSH Registers is equivalent to VSTMDB sp!, Registers.
You can use either form of this instruction. They both disassemble to VPUSH.
Related references
C1.9 Condition code suffixes on page C1-92
C4.38 VSTM (floating-point) on page C4-585
C3.132 VSTn (multiple n-element structures)
Syntax
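VSTn{cond}.datatype list, [Rn{@align}]{!}
VSTn{cond}.datatype list, [Rn{@align}], Rm
where:
n
must be one of 1, 2, 3, or 4.
cond
is an optional condition code.
datatype
see the following table for options.
list
is the list of Advanced SIMD registers enclosed in braces, { and }. See the following table for
options.
Rn
is the general-purpose register containing the base address. Rn cannot be PC.
align
specifies an optional alignment. See the following table for options.
!
if ! is present, Rn is updated to (Rn + the number of bytes transferred by the instruction). The
update occurs after all the stores have taken place.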
Rm
is a general-purpose register containing an offset from the base address. If Rm is present, the
instruction updates Rn to (Rn + Rm) after using the address to access memory. Rm cannot be SP or
PC.
Operation
VSTn stores multiple n-element structures to memory from one or more Advanced SIMD registers, with
interleaving (unless n == 1). Every element of each register is stored.
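The interleaved memory order can be illustrated with a short Python model. This is a sketch under stated assumptions: the function name vstn_memory_order is hypothetical, each register is given as a list of elements with element 0 first, and the register list is assumed to contain exactly n registers.

```python
def vstn_memory_order(registers):
    """Return elements in the order they are written to memory.

    Element i of every register forms one n-element structure, and the
    structures are stored one after another.
    """
    memory = []
    for structure in zip(*registers):
        memory.extend(structure)
    return memory
```

For two registers holding the x and y coordinates of four points, the result is x0, y0, x1, y1, and so on.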
Table C3-25 Permitted combinations of parameters for VSTn (multiple n-element structures)
{Dd, D(d+1), D(d+2), D(d+3)} @64, @128, or @256 8-byte, 16-byte, or 32-byte
4 8, 16, or 32 {Dd, D(d+1), D(d+2), D(d+3)} @64, @128, or @256 8-byte, 16-byte, or 32-byte
{Dd, D(d+2), D(d+4), D(d+6)} @64, @128, or @256 8-byte, 16-byte, or 32-byte
Related concepts
C3.3 Interleaving provided by load and store element and structure instructions on page C3-395
C3.4 Alignment restrictions in load and store element and structure instructions on page C3-396
Related references
C1.9 Condition code suffixes on page C1-92
C3.133 VSTn (single n-element structure to one lane)
Syntax
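VSTn{cond}.datatype list, [Rn{@align}]{!}
VSTn{cond}.datatype list, [Rn{@align}], Rm
where:
n
must be one of 1, 2, 3, or 4.
cond
is an optional condition code.
datatype
see the following table for options.
list
is the list of Advanced SIMD registers enclosed in braces, { and }. See the following table for
options.
Rn
is the general-purpose register containing the base address. Rn cannot be PC.
align
specifies an optional alignment. See the following table for options.
!
if ! is present, Rn is updated to (Rn + the number of bytes transferred by the instruction). The
update occurs after all the stores have taken place.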
Rm
is a general-purpose register containing an offset from the base address. If Rm is present, the
instruction updates Rn to (Rn + Rm) after using the address to access memory. Rm cannot be SP or
PC.
Operation
VSTn stores one n-element structure into memory from one or more Advanced SIMD registers.
Table C3-26 Permitted combinations of parameters for VSTn (single n-element structure to one lane)
Related concepts
C3.3 Interleaving provided by load and store element and structure instructions on page C3-395
C3.4 Alignment restrictions in load and store element and structure instructions on page C3-396
Related references
C1.9 Condition code suffixes on page C1-92
C3.134 VSTR
Extension register store.
Syntax
VSTR{cond}{.64} Dd, [Rn{, #offset}]
where:
cond
is an optional condition code.
Dd
is the extension register to be saved.
Rn
is the general-purpose register holding the base address for the transfer.
offset
is an optional numeric expression. It must evaluate to a numeric value at assembly time. The
value must be a multiple of 4, and lie in the range -1020 to +1020. The value is added to the
base address to form the address used for the transfer.
Operation
The VSTR instruction saves the contents of an extension register to memory.
Two words are transferred.
Related references
C1.9 Condition code suffixes on page C1-92
C4.39 VSTR (floating-point) on page C4-586
C3.135 VSTR (post-increment and pre-decrement)
Note
There are also VLDR and VSTR instructions without post-increment and pre-decrement.
Syntax
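VSTR{cond}{.64} Dd, [Rn], #offset ; post-increment
VSTR{cond}{.64} Dd, [Rn, #-offset]! ; pre-decrement
where:
cond
is an optional condition code.
Dd
is the extension register to be saved.
Rn
is the general-purpose register holding the base address for the transfer.
offset
is a numeric expression that must evaluate to 8 at assembly time.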
Operation
The post-increment instruction increments the base address in the register by the offset value, after the
transfer. The pre-decrement instruction decrements the base address in the register by the offset value,
and then performs the transfer using the new address in the register. This pseudo-instruction assembles to
a VSTM instruction.
Related references
C3.134 VSTR on page C3-531
C3.131 VSTM on page C3-526
C1.9 Condition code suffixes on page C1-92
C4.40 VSTR (post-increment and pre-decrement, floating-point) on page C4-587
C3.136 VSUB
Vector Subtract.
Syntax
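VSUB{cond}.datatype {Qd}, Qn, Qm
VSUB{cond}.datatype {Dd}, Dn, Dm
where:
cond
is an optional condition code.
datatype
must be one of I8, I16, I32, I64, or F32.
Qd, Qn, Qm
are the destination vector, the first operand vector, and the second operand vector, for a
quadword operation.
Dd, Dn, Dm
are the destination vector, the first operand vector, and the second operand vector, for a
doubleword operation.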
Operation
VSUB subtracts the elements of one vector from the corresponding elements of another vector, and places
the results in the destination vector.
Related references
C1.9 Condition code suffixes on page C1-92
C3.137 VSUBHN
Vector Subtract and Narrow, selecting High half.
Syntax
VSUBHN{cond}.datatype Dd, Qn, Qm
where:
cond
is an optional condition code.
datatype
must be one of I16, I32, or I64.
Dd, Qn, Qm
are the destination vector, the first operand vector, and the second operand vector.
Operation
VSUBHN subtracts the elements of one quadword vector from the corresponding elements of another
quadword vector, selects the most significant halves of the results, and places the final results in the
destination doubleword vector. Results are truncated.
Related references
C1.9 Condition code suffixes on page C1-92
C3.138 VSUBL and VSUBW
Syntax
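VSUBL{cond}.datatype Qd, Dn, Dm ; Long operation
VSUBW{cond}.datatype {Qd}, Qn, Dm ; Wide operation
where:
cond
is an optional condition code.
datatype
must be one of S8, S16, S32, U8, U16, or U32.
Qd, Dn, Dm
are the destination vector, the first operand vector, and the second operand vector, for a long
operation.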
Qd, Qn, Dm
are the destination vector, the first operand vector, and the second operand vector, for a wide
operation.
Operation
VSUBL subtracts the elements of one doubleword vector from the corresponding elements of another
doubleword vector, and places the results in the destination quadword vector.
VSUBW subtracts the elements of a doubleword vector from the corresponding elements of a quadword
vector, and places the results in the destination quadword vector.
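The long and wide forms differ only in the width of the first operand, which a short Python model can show. This is an illustrative sketch: the function names are hypothetical, elements are unsigned bit patterns, size is the width of the narrow elements, and results wrap modulo 2^(2 × size).

```python
def _extend(x, size, signed):
    """Sign-extend or zero-extend a size-bit pattern to a Python int."""
    if signed and x & (1 << (size - 1)):
        return x - (1 << size)
    return x

def vsubl(dn, dm, size=8, signed=False):
    """Long: both operands are narrow, results are double width."""
    wide = (1 << (2 * size)) - 1
    return [(_extend(a, size, signed) - _extend(b, size, signed)) & wide
            for a, b in zip(dn, dm)]

def vsubw(qn, dm, size=8, signed=False):
    """Wide: first operand is already double width, second is narrow."""
    wide = (1 << (2 * size)) - 1
    return [(a - _extend(b, size, signed)) & wide for a, b in zip(qn, dm)]
```

The extension applied to each narrow element is what distinguishes the signed and unsigned datatypes.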
Related references
C1.9 Condition code suffixes on page C1-92
C3.139 VSWP
Vector Swap.
Syntax
VSWP{cond}{.datatype} Qd, Qm
VSWP{cond}{.datatype} Dd, Dm
where:
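cond
is an optional condition code.
datatype
is an optional datatype. The assembler ignores datatype.
Qd, Qm
specifies the vectors, for a quadword operation.
Dd, Dm
specifies the vectors, for a doubleword operation.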
Operation
VSWP exchanges the contents of two vectors. The vectors can be either doubleword or quadword. There is
no distinction between data types.
Related references
C1.9 Condition code suffixes on page C1-92
C3.140 VTBL and VTBX
Syntax
Vop{cond}.8 Dd, list, Dm
where:
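op
must be either TBL or TBX.
cond
is an optional condition code.
Dd
specifies the destination vector.
list
specifies the vectors containing the table.
Dm
specifies the index vector.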
Operation
VTBL uses byte indexes in a control vector to look up byte values in a table and generate a new vector.
Indexes out of range return zero.
VTBX works in the same way, except that indexes out of range leave the destination element unchanged.
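The two out-of-range behaviours can be compared in a small Python model. This is an illustrative sketch: the function name vtbl is hypothetical, the table is a flat list of byte values, and passing the current destination bytes as dest selects the VTBX behaviour.

```python
def vtbl(table, indexes, dest=None):
    """Byte table lookup. dest=None models VTBL, otherwise VTBX."""
    out = []
    for lane, idx in enumerate(indexes):
        if idx < len(table):
            out.append(table[idx])        # index in range: look up the byte
        elif dest is None:
            out.append(0)                 # VTBL: out of range returns zero
        else:
            out.append(dest[lane])        # VTBX: destination byte unchanged
    return out
```

With an 8-byte table, index 8 is out of range, so VTBL produces 0 for that lane and VTBX keeps the existing destination byte.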
Related references
C1.9 Condition code suffixes on page C1-92
C3.141 VTRN
Vector Transpose.
Syntax
VTRN{cond}.size Qd, Qm
VTRN{cond}.size Dd, Dm
where:
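cond
is an optional condition code.
size
must be one of 8, 16, or 32.
Qd, Qm
specifies the vectors, for a quadword operation.
Dd, Dm
specifies the vectors, for a doubleword operation.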
Operation
VTRN treats the elements of its operand vectors as elements of 2 x 2 matrices, and transposes the matrices.
The following figures show examples of the operation of VTRN:
[Figure: VTRN operating on doubleword vectors Dd and Dm, with byte lanes numbered 7 to 0.]
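The 2 x 2 transpose can be modelled in Python. This is a sketch only: the function name vtrn is hypothetical, and each vector is a list with element 0 first. Each pair of adjacent elements in the first vector forms the top row of a matrix, and the same pair in the second vector forms the bottom row.

```python
def vtrn(d, m):
    """Transpose each 2 x 2 matrix formed from adjacent element pairs."""
    d, m = list(d), list(m)
    for i in range(0, len(d), 2):
        # Swap the off-diagonal elements of the matrix [[d[i], d[i+1]],
        #                                               [m[i], m[i+1]]].
        d[i + 1], m[i] = m[i], d[i + 1]
    return d, m
```

Only the odd-numbered elements of the first vector and the even-numbered elements of the second vector change places.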
C3.142 VTST
Vector Test bits.
Syntax
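VTST{cond}.size {Qd}, Qn, Qm
VTST{cond}.size {Dd}, Dn, Dm
where:
cond
is an optional condition code.
size
must be one of 8, 16, or 32.
Qd, Qn, Qm
specifies the destination register, the first operand register, and the second operand register, for a
quadword operation.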
Dd, Dn, Dm
specifies the destination register, the first operand register, and the second operand register, for a
doubleword operation.
Operation
VTST takes each element in a vector, and bitwise logical ANDs them with the corresponding element of a
second vector. If the result is not zero, the corresponding element in the destination vector is set to all
ones. Otherwise, it is set to all zeros.
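The per-element test can be modelled in Python. This is an illustrative sketch: the function name vtst is hypothetical and elements are unsigned bit patterns.

```python
def vtst(n, m, size=8):
    """All ones where the two elements share a set bit, otherwise zero."""
    ones = (1 << size) - 1
    return [ones if (a & b) else 0 for a, b in zip(n, m)]
```

The result is a per-element mask, which is why each element is either all ones or all zeros.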
Related references
C1.9 Condition code suffixes on page C1-92
C3.143 VUDOT (vector)
Syntax
VUDOT{q}.U8 Dd, Dn, Dm ; 64-bit SIMD vector
VUDOT{q}.U8 Qd, Qn, Qm ; 128-bit SIMD vector
Where:
q
Is an optional instruction width specifier. See C2.2 Instruction width specifiers on page C2-111.
Dd
Is the 64-bit name of the SIMD and FP destination register.
Dn
Is the 64-bit name of the first SIMD and FP source register.
Dm
Is the 64-bit name of the second SIMD and FP source register.
Qd
Is the 128-bit name of the SIMD and FP destination register.
Qn
Is the 128-bit name of the first SIMD and FP source register.
Qm
Is the 128-bit name of the second SIMD and FP source register.
Architectures supported
Supported in Armv8.2 and later.
For Armv8.2 and Armv8.3, this is an OPTIONAL instruction.
Usage
Dot Product vector form with unsigned integers. This instruction performs the dot product of the four 8-
bit elements in each 32-bit element of the first source register with the four 8-bit elements of the
corresponding 32-bit element in the second source register, accumulating the result into the
corresponding 32-bit element of the destination register.
Note
ID_ISAR6.DP indicates whether this instruction is supported in the T32 and A32 instruction sets.
Related references
C3.1 Summary of Advanced SIMD instructions on page C3-391
C3.144 VUDOT (by element)
Syntax
VUDOT{q}.U8 Dd, Dn, Dm[index] ; 64-bit SIMD vector
VUDOT{q}.U8 Qd, Qn, Dm[index] ; A1 128-bit SIMD vector FP/SIMD registers (A32)
Where:
q
Is an optional instruction width specifier. See C2.2 Instruction width specifiers on page C2-111.
Dd
Is the 64-bit name of the SIMD and FP destination register.
Dn
Is the 64-bit name of the first SIMD and FP source register.
Dm
Is the 64-bit name of the second SIMD and FP source register.
index
Is the element index in the range 0 to 1.
Qd
Is the 128-bit name of the SIMD and FP destination register.
Qn
Is the 128-bit name of the first SIMD and FP source register.
Architectures supported
Supported in Armv8.2 and later.
For Armv8.2 and Armv8.3, this is an OPTIONAL instruction.
Usage
Dot Product index form with unsigned integers. This instruction performs the dot product of the four 8-
bit elements in each 32-bit element of the first source register with the four 8-bit elements of an indexed
32-bit element in the second source register, accumulating the result into the corresponding 32-bit
element of the destination register.
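The indexed form reuses one 32-bit group of the second source for every lane, as the following Python model shows. It is a sketch only: the function name vudot_by_element is hypothetical, registers are flat lists of unsigned bytes with the lowest-numbered byte first, and the accumulator is assumed to wrap modulo 2^32.

```python
def vudot_by_element(acc, n_bytes, dm_bytes, index):
    """Unsigned dot product of each 4-byte group with Dm[index]."""
    if index not in (0, 1):
        raise ValueError("index must be 0 or 1")
    group = dm_bytes[4 * index:4 * index + 4]     # the selected 32-bit element
    out = []
    for i, a in enumerate(acc):
        total = sum(n_bytes[4 * i + j] * group[j] for j in range(4))
        out.append((a + total) & 0xFFFFFFFF)
    return out
```

Every lane of the first source is multiplied against the same four bytes selected by index.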
Note
ID_ISAR6.DP indicates whether this instruction is supported in the T32 and A32 instruction sets.
Related references
C3.1 Summary of Advanced SIMD instructions on page C3-391
C3.145 VUZP
Vector Unzip.
Syntax
VUZP{cond}.size Qd, Qm
VUZP{cond}.size Dd, Dm
where:
cond
is an optional condition code.
size
must be one of 8, 16, or 32.
Qd, Qm
specifies the vectors, for a quadword operation.
Dd, Dm
specifies the vectors, for a doubleword operation.
Note
The following are all the same instruction:
• VZIP.32 Dd, Dm.
• VUZP.32 Dd, Dm.
• VTRN.32 Dd, Dm.
The instruction is disassembled as VTRN.32 Dd, Dm.
Operation
VUZP de-interleaves the elements of two vectors.
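Operation of doubleword VUZP.8:
Register   Before operation          After operation
Dd         A7 A6 A5 A4 A3 A2 A1 A0   B6 B4 B2 B0 A6 A4 A2 A0
Dm         B7 B6 B5 B4 B3 B2 B1 B0   B7 B5 B3 B1 A7 A5 A3 A1
Operation of quadword VUZP.32:
Register   Before operation          After operation
Qd         A3 A2 A1 A0               B2 B0 A2 A0
Qm         B3 B2 B1 B0               B3 B1 A3 A1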
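The de-interleave can be modelled in Python. This is an illustrative sketch: the function name vuzp is hypothetical, and each vector is a list with element 0 first, which is the reverse of the left-to-right order in which the register contents are written above.

```python
def vuzp(d, m):
    """Even-numbered elements go to the first vector, odd to the second."""
    combined = list(d) + list(m)
    return combined[0::2], combined[1::2]
```

For four-element vectors A and B, the first result holds A0, A2, B0, B2 and the second holds A1, A3, B1, B3.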
Related concepts
C3.3 Interleaving provided by load and store element and structure instructions on page C3-395
Related references
C3.141 VTRN on page C3-538
C1.9 Condition code suffixes on page C1-92
C3.146 VZIP
Vector Zip.
Syntax
VZIP{cond}.size Qd, Qm
VZIP{cond}.size Dd, Dm
where:
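cond
is an optional condition code.
size
must be one of 8, 16, or 32.
Qd, Qm
specifies the vectors, for a quadword operation.
Dd, Dm
specifies the vectors, for a doubleword operation.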
Note
The following are all the same instruction:
• VZIP.32 Dd, Dm.
• VUZP.32 Dd, Dm.
• VTRN.32 Dd, Dm.
The instruction is disassembled as VTRN.32 Dd, Dm.
Operation
VZIP interleaves the elements of two vectors.
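Operation of doubleword VZIP.8:
Register   Before operation          After operation
Dd         A7 A6 A5 A4 A3 A2 A1 A0   B3 A3 B2 A2 B1 A1 B0 A0
Dm         B7 B6 B5 B4 B3 B2 B1 B0   B7 A7 B6 A6 B5 A5 B4 A4
Operation of quadword VZIP.32:
Register   Before operation          After operation
Qd         A3 A2 A1 A0               B1 A1 B0 A0
Qm         B3 B2 B1 B0               B3 A3 B2 A2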
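The interleave can be modelled in Python. This is an illustrative sketch: the function name vzip is hypothetical, and each vector is a list with element 0 first, which is the reverse of the left-to-right order in which the register contents are written above.

```python
def vzip(d, m):
    """Interleave two vectors and split the result back into two vectors."""
    interleaved = [x for pair in zip(d, m) for x in pair]
    half = len(d)
    return interleaved[:half], interleaved[half:]
```

For four-element vectors A and B, the first result holds A0, B0, A1, B1 and the second holds A2, B2, A3, B3.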
Related concepts
C3.3 Interleaving provided by load and store element and structure instructions on page C3-395
Related references
C3.141 VTRN on page C3-538
C1.9 Condition code suffixes on page C1-92
Chapter C4
Floating-point Instructions (32-bit)
• C4.23 VMOV (between one general-purpose register and single precision floating-point register)
on page C4-570.
• C4.24 VMOV (between two general-purpose registers and one or two extension registers)
on page C4-571.
• C4.25 VMOV (between a general-purpose register and half a double precision floating-point
register) on page C4-572.
• C4.26 VMRS (floating-point) on page C4-573.
• C4.27 VMSR (floating-point) on page C4-574.
• C4.28 VMUL (floating-point) on page C4-575.
• C4.29 VNEG (floating-point) on page C4-576.
• C4.30 VNMLA (floating-point) on page C4-577.
• C4.31 VNMLS (floating-point) on page C4-578.
• C4.32 VNMUL (floating-point) on page C4-579.
• C4.33 VPOP (floating-point) on page C4-580.
• C4.34 VPUSH (floating-point) on page C4-581.
• C4.35 VRINT (floating-point) on page C4-582.
• C4.36 VSEL on page C4-583.
• C4.37 VSQRT on page C4-584.
• C4.38 VSTM (floating-point) on page C4-585.
• C4.39 VSTR (floating-point) on page C4-586.
• C4.40 VSTR (post-increment and pre-decrement, floating-point) on page C4-587.
• C4.41 VSUB (floating-point) on page C4-588.
C4.1 Summary of floating-point instructions
VFNMA, VFNMS Fused multiply accumulate with negation, Fused multiply subtract with negation
C4.2 VABS (floating-point)
Syntax
VABS{cond}.F32 Sd, Sm
VABS{cond}.F64 Dd, Dm
where:
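cond
is an optional condition code.
Sd, Sm
are the single-precision registers for the result and operand.
Dd, Dm
are the double-precision registers for the result and operand.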
Operation
The VABS instruction takes the contents of Sm or Dm, clears the sign bit, and places the result in Sd or Dd.
This gives the absolute value.
If the operand is a NaN, the sign bit is cleared, but no exception is produced.
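Clearing the sign bit can be demonstrated on the raw encoding of a single-precision value. The following Python sketch is illustrative only: the function name vabs_f32 is hypothetical, and the standard struct module is used to reach the 32-bit pattern.

```python
import struct

def vabs_f32(x):
    """Clear bit 31 of the single-precision encoding of x."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0x7FFFFFFF))[0]
```

Because only the sign bit changes, negative zero becomes positive zero and the other bits of any input are left as they are.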
Floating-point exceptions
VABS instructions do not produce any exceptions.
Related references
C1.9 Condition code suffixes on page C1-92
C4.3 VADD (floating-point)
Syntax
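VADD{cond}.F32 {Sd}, Sn, Sm
VADD{cond}.F64 {Dd}, Dn, Dm
where:
cond
is an optional condition code.
Sd, Sn, Sm
are the single-precision registers for the result and operands.
Dd, Dn, Dm
are the double-precision registers for the result and operands.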
Operation
The VADD instruction adds the values in the operand registers and places the result in the destination
register.
Floating-point exceptions
The VADD instruction can produce Invalid Operation, Overflow, or Inexact exceptions.
Related references
C1.9 Condition code suffixes on page C1-92
C4.4 VCMP, VCMPE
Syntax
VCMP{E}{cond}.F32 Sd, Sm
VCMP{E}{cond}.F32 Sd, #0
VCMP{E}{cond}.F64 Dd, Dm
VCMP{E}{cond}.F64 Dd, #0
where:
E
if present, indicates that the instruction raises an Invalid Operation exception if either operand is
a quiet or signaling NaN. Otherwise, it raises the exception only if either operand is a signaling
NaN.
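cond
is an optional condition code.
Sd, Sm
are the single-precision registers holding the operands.
Dd, Dm
are the double-precision registers holding the operands.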
Operation
The VCMP{E} instruction subtracts the value in the second operand register (or 0 if the second operand is
#0) from the value in the first operand register, and sets the VFP condition flags based on the result.
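The resulting flag settings can be summarised with a small Python model. This is a sketch only: the function name vcmp_flags is hypothetical, the flag values follow the architectural encoding of floating-point comparison results in the FPSCR, and exception behaviour is not modelled.

```python
import math

def vcmp_flags(a, b=0.0):
    """Return the N, Z, C, V flags for comparing a with b."""
    if math.isnan(a) or math.isnan(b):
        return {"N": 0, "Z": 0, "C": 1, "V": 1}   # unordered
    if a == b:
        return {"N": 0, "Z": 1, "C": 1, "V": 0}   # equal
    if a < b:
        return {"N": 1, "Z": 0, "C": 0, "V": 0}   # less than
    return {"N": 0, "Z": 0, "C": 1, "V": 0}       # greater than
```

The flags must be copied to the APSR, for example with VMRS, before a conditional instruction can test them.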
Floating-point exceptions
VCMP{E} instructions can produce Invalid Operation exceptions.
Related references
C1.9 Condition code suffixes on page C1-92
C4.5 VCVT (between single-precision and double-precision)
Syntax
VCVT{cond}.F64.F32 Dd, Sm
VCVT{cond}.F32.F64 Sd, Dm
where:
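cond
is an optional condition code.
Dd
is a double-precision register for the result.
Sm
is a single-precision register holding the operand.
Sd
is a single-precision register for the result.
Dm
is a double-precision register holding the operand.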
Operation
These instructions convert the single-precision value in Sm to double-precision, placing the result in Dd,
or the double-precision value in Dm to single-precision, placing the result in Sd.
Floating-point exceptions
These instructions can produce Invalid Operation, Input Denormal, Overflow, Underflow, or Inexact
exceptions.
Related references
C1.9 Condition code suffixes on page C1-92
C4.6 VCVT (between floating-point and integer)
Syntax
VCVT{R}{cond}.type.F64 Sd, Dm
VCVT{R}{cond}.type.F32 Sd, Sm
VCVT{cond}.F64.type Dd, Sm
VCVT{cond}.F32.type Sd, Sm
where:
R
makes the operation use the rounding mode specified by the FPSCR. Otherwise, the operation
rounds towards zero.
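cond
is an optional condition code.
type
can be either U32 (unsigned 32-bit integer) or S32 (signed 32-bit integer).
Sd
is a single-precision register for the result.
Dd
is a double-precision register for the result.
Sm
is a single-precision register holding the operand.
Dm
is a double-precision register holding the operand.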
Operation
The first two forms of this instruction convert from floating-point to integer.
The third and fourth forms convert from integer to floating-point.
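A minimal model of the floating-point to integer direction without the R option, that is, rounding towards zero. Two behaviours in this sketch are assumptions drawn from the Arm architecture rather than from this section: out-of-range values saturate to the limits of the destination type, and a NaN converts to 0. The name vcvt_to_int is hypothetical.

```python
import math


def vcvt_to_int(x: float, signed: bool = True) -> int:
    """Convert to S32 (signed=True) or U32 (signed=False), rounding towards zero."""
    lo, hi = (-(2**31), 2**31 - 1) if signed else (0, 2**32 - 1)
    if math.isnan(x):
        return 0                      # assumption: NaN converts to zero
    if math.isinf(x):
        return hi if x > 0 else lo    # assumption: infinities saturate
    return max(lo, min(hi, math.trunc(x)))
```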
Floating-point exceptions
These instructions can produce Input Denormal, Invalid Operation, or Inexact exceptions.
Related references
C1.9 Condition code suffixes on page C1-92
C4.7 VCVT (from floating-point to integer with directed rounding modes)
Note
This instruction is supported only in Armv8.
Syntax
VCVTmode.S32.F64 Sd, Dm
VCVTmode.S32.F32 Sd, Sm
VCVTmode.U32.F64 Sd, Dm
VCVTmode.U32.F32 Sd, Sm
where:
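mode
must be one of:
A
meaning round to nearest, ties away from zero.
N
meaning round to nearest, ties to even.
P
meaning round towards plus infinity.
M
meaning round towards minus infinity.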
Notes
You cannot use VCVT with a directed rounding mode inside an IT block.
Floating-point exceptions
These instructions can produce Input Denormal, Invalid Operation, or Inexact exceptions.
C4.8 VCVT (between floating-point and fixed-point)
Syntax
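VCVT{cond}.type.F64 Dd, Dd, #fbits
VCVT{cond}.type.F32 Sd, Sd, #fbits
VCVT{cond}.F64.type Dd, Dd, #fbits
VCVT{cond}.F32.type Sd, Sd, #fbits
where:
cond
is an optional condition code.
type
can be any one of S16, U16, S32, or U32.
fbits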
is the number of fraction bits in the fixed-point number, in the range 0-16 if type is S16 or U16,
or in the range 1-32 if type is S32 or U32.
Operation
The first two forms of this instruction convert from floating-point to fixed-point.
The third and fourth forms convert from fixed-point to floating-point.
In all cases the fixed-point number is contained in the least significant 16 or 32 bits of the register.
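The relationship between a floating-point value and its fixed-point form is a scaling by 2^fbits. The sketch below assumes that the floating-point to fixed-point direction rounds towards zero and saturates to the destination type, and that a NaN converts to 0; these are architectural behaviours not stated in this section. The function names are hypothetical.

```python
import math
from fractions import Fraction


def float_to_fixed(x: float, fbits: int, bits: int = 32, signed: bool = True) -> int:
    """Scale x by 2**fbits, round towards zero, and saturate to the type range."""
    if signed:
        lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    else:
        lo, hi = 0, (1 << bits) - 1
    if math.isnan(x):
        return 0
    if math.isinf(x):
        return hi if x > 0 else lo
    scaled = math.trunc(Fraction(x) * (1 << fbits))  # exact scaling, then truncate
    return max(lo, min(hi, scaled))


def fixed_to_float(raw: int, fbits: int) -> float:
    """Interpret raw as a fixed-point number with fbits fraction bits."""
    return raw / (1 << fbits)
```

With fbits set to 16, the value 1.5 becomes 0x18000, and converting 0x18000 back gives 1.5.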
Floating-point exceptions
These instructions can produce Input Denormal, Invalid Operation, or Inexact exceptions.
Related references
C1.9 Condition code suffixes on page C1-92
C4.9 VCVTB, VCVTT (half-precision extension)
Syntax
VCVTB{cond}.type Sd, Sm
VCVTT{cond}.type Sd, Sm
where:
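cond
is an optional condition code.
type
can be either F32.F16 (convert from half-precision to single-precision) or F16.F32 (convert from single-precision to half-precision).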
Operation
VCVTB uses the bottom half (bits[15:0]) of the single word register to obtain or store the half-precision
value.
VCVTT uses the top half (bits[31:16]) of the single word register to obtain or store the half-precision
value.
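The bottom and top half selection can be sketched on raw 32-bit register contents. This model assumes the IEEE half-precision format (Python's struct format "e") and assumes that a half-precision write leaves the other half of the destination register unchanged. All function names are hypothetical.

```python
import struct


def half_bits_to_float(h: int) -> float:
    """Decode a 16-bit IEEE half-precision pattern."""
    return struct.unpack("<e", struct.pack("<H", h))[0]


def float_to_half_bits(x: float) -> int:
    """Encode a value as a 16-bit IEEE half-precision pattern."""
    return struct.unpack("<H", struct.pack("<e", x))[0]


def vcvt_f32_f16(sm: int, top: bool) -> float:
    """Model VCVTT.F32.F16 (top=True) or VCVTB.F32.F16 (top=False)."""
    half = (sm >> 16) & 0xFFFF if top else sm & 0xFFFF
    return half_bits_to_float(half)


def vcvt_f16_f32(sd: int, value: float, top: bool) -> int:
    """Model VCVTT.F16.F32 or VCVTB.F16.F32; returns the new contents of Sd."""
    half = float_to_half_bits(value)
    if top:
        return (sd & 0x0000FFFF) | (half << 16)  # keep bits[15:0]
    return (sd & 0xFFFF0000) | half              # keep bits[31:16]
```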
Architectures
These instructions are available only in VFPv3 systems with the half-precision extension, and in VFPv4.
Floating-point exceptions
These instructions can produce Input Denormal, Invalid Operation, Overflow, Underflow, or Inexact
exceptions.
Related references
C1.9 Condition code suffixes on page C1-92
C4.10 VCVTB, VCVTT (between half-precision and double-precision)
Syntax
VCVTB{cond}.F64.F16 Dd, Sm
VCVTB{cond}.F16.F64 Sd, Dm
VCVTT{cond}.F64.F16 Dd, Sm
VCVTT{cond}.F16.F64 Sd, Dm
where:
cond
is an optional condition code.
Dd
is a double-precision register for the result.
Sm
is a single word register holding the operand.
Sd
is a single word register for the result.
Dm
is a double-precision register holding the operand.
Usage
These instructions convert the half-precision value in Sm to double-precision and place the result in Dd, or
the double-precision value in Dm to half-precision and place the result in Sd.
Floating-point exceptions
These instructions can produce Input Denormal, Invalid Operation, Overflow, Underflow, or Inexact
exceptions.
C4.11 VDIV
Floating-point divide.
Syntax
VDIV{cond}.F32 {Sd,} Sn, Sm
where:
cond
is an optional condition code.
Operation
The VDIV instruction divides the value in the first operand register by the value in the second operand
register, and places the result in the destination register.
Floating-point exceptions
VDIV operations can produce Division by Zero, Invalid Operation, Overflow, Underflow, or Inexact
exceptions.
Related references
C1.9 Condition code suffixes on page C1-92
C4.12 VFMA, VFMS, VFNMA, VFNMS (floating-point)
Syntax
VF{N}op{cond}.F64 {Dd,} Dn, Dm
where:
op
is one of MA or MS.
N
negates the final result.
cond
is an optional condition code.
Operation
VFMA multiplies the values in the operand registers, adds the value in the destination register, and places
the final result in the destination register. The result of the multiply is not rounded before the
accumulation.
VFMS multiplies the values in the operand registers, subtracts the product from the value in the destination
register, and places the final result in the destination register. The result of the multiply is not rounded
before the subtraction.
In each case, the final result is negated if the N option is used.
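The effect of skipping the intermediate rounding can be shown with exact rational arithmetic. This sketch models the F64 forms for finite inputs only: the product and sum are computed exactly with Fraction and rounded once on conversion back to float. The function names are hypothetical, and unfused_mla stands in for a separately rounded multiply followed by an add.

```python
from fractions import Fraction


def vfma(d: float, n: float, m: float) -> float:
    """Fused d + n*m: the exact result is rounded once."""
    return float(Fraction(d) + Fraction(n) * Fraction(m))


def vfms(d: float, n: float, m: float) -> float:
    """Fused d - n*m: the exact result is rounded once."""
    return float(Fraction(d) - Fraction(n) * Fraction(m))


def vfnma(d: float, n: float, m: float) -> float:
    """VFMA with the N option: the final result is negated."""
    return -vfma(d, n, m)


def vfnms(d: float, n: float, m: float) -> float:
    """VFMS with the N option: the final result is negated."""
    return -vfms(d, n, m)


def unfused_mla(d: float, n: float, m: float) -> float:
    """Non-fused comparison: the product is rounded before the addition."""
    return d + n * m
```

With n = m = 1 + 2^-30 and d = -(1 + 2^-29), the fused form returns 2^-60, whereas the non-fused form returns 0 because the 2^-60 term of the product is lost when the product is rounded.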
Floating-point exceptions
These instructions can produce Input Denormal, Invalid Operation, Overflow, Underflow, or Inexact
exceptions.
Related references
C4.28 VMUL (floating-point) on page C4-575
C1.9 Condition code suffixes on page C1-92
C4.13 VJCVT
JavaScript Convert to signed fixed-point, rounding toward Zero.
Syntax
VJCVT{q}.S32.F64 Sd, Dm ; A1 FP/SIMD registers (A32)
Where:
q
Is an optional instruction width specifier. See C2.2 Instruction width specifiers on page C2-111.
Sd
Is the 32-bit name of the SIMD and FP destination register.
Dm
Is the 64-bit name of the SIMD and FP source register.
Architectures supported
Supported in the Armv8.3-A architecture and later.
Usage
JavaScript Convert to signed fixed-point, rounding toward Zero. This instruction converts the
double-precision floating-point value in the SIMD and FP source register to a 32-bit signed integer using the
Round towards Zero rounding mode, and writes the result to the SIMD and FP destination register. If
the result is too large to be held as a 32-bit signed integer, then the result is the integer modulo 2^32, as
held in a 32-bit signed integer.
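The modulo behaviour is what distinguishes this conversion from the saturating VCVT forms. A minimal sketch, assuming that NaN and infinite inputs produce 0 as in the JavaScript ToInt32 operation (that detail is not stated in this section). The name vjcvt is hypothetical.

```python
import math


def vjcvt(x: float) -> int:
    """Truncate towards zero, wrap modulo 2**32, and reinterpret as signed 32-bit."""
    if math.isnan(x) or math.isinf(x):
        return 0  # assumption: follows JavaScript ToInt32
    wrapped = math.trunc(x) % (1 << 32)
    return wrapped - (1 << 32) if wrapped >= (1 << 31) else wrapped
```

For example, an input of 2^32 + 5 yields 5, and an input of 2^31 yields -2147483648.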
Depending on settings in the CPACR, NSACR, HCPTR, and FPEXC registers, and the security state and
mode in which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or
trapped to Hyp mode. For more information see Enabling Advanced SIMD and floating-point support in
the Arm® Architecture Reference Manual Arm®v8, for Arm®v8‑A architecture profile.
Related references
C4.1 Summary of floating-point instructions on page C4-547
C4.14 VLDM (floating-point)
Syntax
VLDMmode{cond} Rn{!}, Registers
where:
mode
must be one of:
IA
meaning Increment address After each transfer. IA is the default, and can be omitted.
DB
meaning Decrement address Before each transfer.
EA
meaning Empty Ascending stack operation. This is the same as DB for loads.
FD
meaning Full Descending stack operation. This is the same as IA for loads.
cond
is an optional condition code.
Rn
is the general-purpose register holding the base address for the transfer.
!
is optional. ! specifies that the updated base address must be written back to Rn. If ! is not
specified, mode must be IA.
Registers
is a list of consecutive extension registers enclosed in braces, { and }. The list can be comma-
separated, or in range format. There must be at least one register in the list.
You can specify S or D registers, but they must not be mixed. The number of registers must not
exceed 16 D registers.
Note
VPOP Registers is equivalent to VLDM sp!, Registers.
You can use either form of this instruction. They both disassemble to VPOP.
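The addresses touched by a multiple load can be worked out from the base address, the mode, and the register list. This sketch assumes that the lowest-numbered register always uses the lowest address, and that a DB transfer covers the block immediately below the base address. The function name and parameters are hypothetical.

```python
def vldm_addresses(base: int, count: int, reg_bytes: int,
                   mode: str = "IA", writeback: bool = False):
    """Return (addresses, final value of Rn) for a VLDM of count registers.

    reg_bytes is 4 for S registers or 8 for D registers.
    """
    size = count * reg_bytes
    if mode == "IA":
        start, updated = base, base + size
    elif mode == "DB":
        if not writeback:
            raise ValueError("if ! is not specified, mode must be IA")
        start, updated = base - size, base - size
    else:
        raise ValueError("this sketch models only IA and DB")
    addresses = [start + i * reg_bytes for i in range(count)]
    return addresses, (updated if writeback else base)
```

Under these assumptions, VLDMIA r0!, {d0-d1} with r0 = 0x1000 reads 0x1000 and 0x1008 and leaves r0 = 0x1010.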
Related references
C1.9 Condition code suffixes on page C1-92
C4.15 VLDR (floating-point)
Syntax
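VLDR{cond}{.size} Fd, [Rn{, #offset}]
VLDR{cond}{.size} Fd, label
where:
cond
is an optional condition code.
size
is an optional data size specifier. It must be 32 if Fd is an S register, or 64 otherwise.
Fd
is the extension register to be loaded. It can be either a D or an S register.
Rn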
is the general-purpose register holding the base address for the transfer.
offset
is an optional numeric expression. It must evaluate to a numeric value at assembly time. The
value must be a multiple of 4, and lie in the range -1020 to +1020. The value is added to the
base address to form the address used for the transfer.
label
is a PC-relative expression.
label must be aligned on a word boundary within ±1KB of the current instruction.
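The constraints on offset are easy to check mechanically. A minimal sketch, with hypothetical function names, of the validity test and the resulting transfer address:

```python
def is_valid_vldr_offset(offset: int) -> bool:
    """True if offset is a multiple of 4 in the range -1020 to +1020."""
    return offset % 4 == 0 and -1020 <= offset <= 1020


def vldr_address(base: int, offset: int = 0) -> int:
    """Address used for the transfer: the base address plus the offset."""
    if not is_valid_vldr_offset(offset):
        raise ValueError(f"offset {offset} cannot be encoded")
    return base + offset
```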
Operation
The VLDR instruction loads an extension register from memory.
One word is transferred if Fd is an S register. Two words are transferred otherwise.
There is also a VLDR pseudo-instruction.
Related references
C1.9 Condition code suffixes on page C1-92
C4.16 VLDR (post-increment and pre-decrement, floating-point)
Note
There are also VLDR and VSTR instructions without post-increment and pre-decrement.
Syntax
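VLDR{cond}{.size} Fd, [Rn], #offset ; post-increment
VLDR{cond}{.size} Fd, [Rn, #-offset]! ; pre-decrement
where:
cond
is an optional condition code.
size
is an optional data size specifier. It must be 32 if Fd is an S register, or 64 otherwise.
Fd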
is the extension register to load. It can be either a double precision (Dd) or a single precision (Sd)
register.
Rn
is the general-purpose register holding the base address for the transfer.
offset
is a numeric expression that must evaluate to a numeric value at assembly time. The value must
be 4 if Fd is an S register, or 8 if Fd is a D register.
Operation
The post-increment instruction increments the base address in the register by the offset value, after the
transfer. The pre-decrement instruction decrements the base address in the register by the offset value,
and then performs the transfer using the new address in the register. This pseudo-instruction assembles to
a VLDM instruction.
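The two addressing behaviours differ only in when the base register changes relative to the transfer. A minimal sketch, with hypothetical names, that returns the address used for the transfer and the updated base; reg_bytes plays the role of offset and must be 4 for an S register or 8 for a D register.

```python
def vldr_post_increment(base: int, reg_bytes: int):
    """Transfer at the old base, then add the offset to the base."""
    if reg_bytes not in (4, 8):
        raise ValueError("offset must be 4 (S register) or 8 (D register)")
    return base, base + reg_bytes


def vldr_pre_decrement(base: int, reg_bytes: int):
    """Subtract the offset from the base, then transfer at the new base."""
    if reg_bytes not in (4, 8):
        raise ValueError("offset must be 4 (S register) or 8 (D register)")
    updated = base - reg_bytes
    return updated, updated
```

These match a single-register VLDM with writeback: the post-increment case behaves like the IA mode and the pre-decrement case like the DB mode.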
Related references
C4.14 VLDM (floating-point) on page C4-561
C4.15 VLDR (floating-point) on page C4-562
C1.9 Condition code suffixes on page C1-92
C4.17 VLLDM
Floating-point Lazy Load Multiple.
Syntax
VLLDM{c}{q} Rn
Where:
c
Is an optional condition code. See Chapter C1 Condition Codes on page C1-83.
q
Is an optional instruction width specifier. See C2.2 Instruction width specifiers on page C2-111.
Rn
Is the general-purpose base register.
Architectures supported
Supported in Armv8‑M Main extension only.
Usage
Floating-point Lazy Load Multiple restores the contents of the Secure floating-point registers that were
protected by a VLSTM instruction, and marks the floating-point context as active.
If the lazy state preservation set up by a previous VLSTM instruction is active (FPCCR.LSPACT == 1),
this instruction deactivates lazy state preservation and enables access to the Secure floating-point
registers.
If lazy state preservation is inactive (FPCCR.LSPACT == 0), either because lazy state preservation was
not enabled (FPCCR.LSPEN == 0) or because a floating-point instruction caused the Secure floating-
point register contents to be stored to memory, this instruction loads the stored Secure floating-point
register contents back into the floating-point registers.
If Secure floating-point is not in use (CONTROL_S.SFPA == 0), this instruction behaves as a NOP.
This instruction is only available in Secure state, and is UNDEFINED in Non-secure state.
If the Floating-point Extension is not implemented, this instruction is available in Secure state, but
behaves as a NOP.
Related references
C4.1 Summary of floating-point instructions on page C4-547
C4.18 VLSTM
Floating-point Lazy Store Multiple.
Syntax
VLSTM{c}{q} Rn
Where:
c
Is an optional condition code. See Chapter C1 Condition Codes on page C1-83.
q
Is an optional instruction width specifier. See C2.2 Instruction width specifiers on page C2-111.
Rn
Is the general-purpose base register.
Architectures supported
Supported in Armv8‑M Main extension only.
Usage
Floating-point Lazy Store Multiple stores the contents of Secure floating-point registers to a prepared
stack frame, and clears the Secure floating-point registers.
If floating-point lazy preservation is enabled (FPCCR.LSPEN == 1), then the next time a floating-point
instruction other than VLSTM or VLLDM is executed:
• The contents of Secure floating-point registers are stored to memory.
• The Secure floating-point registers are cleared.
If Secure floating-point is not in use (CONTROL_S.SFPA == 0), this instruction behaves as a NOP.
This instruction is only available in Secure state, and is UNDEFINED in Non-secure state.
If the Floating-point extension is not implemented, this instruction is available in Secure state, but
behaves as a NOP.
Related references
C4.1 Summary of floating-point instructions on page C4-547
C4.19 VMAXNM, VMINNM (floating-point)
Note
These instructions are supported only in Armv8.
Syntax
Vop.F32 Sd, Sn, Sm
Vop.F64 Dd, Dn, Dm
where:
op
must be either MAXNM or MINNM.
Sd, Sn, Sm
are the single-precision destination register, first operand register, and second operand register.
Dd, Dn, Dm
are the double-precision destination register, first operand register, and second operand register.
Operation
VMAXNM compares the values in the operand registers, and copies the larger value into the destination
register.
VMINNM compares the values in the operand registers, and copies the smaller value into the destination
register.
If one of the values being compared is a number and the other value is NaN, the number is copied into
the destination register. This is consistent with the IEEE 754-2008 standard.
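The NaN rule is the part that differs from an ordinary maximum or minimum. A minimal sketch with hypothetical function names; it does not model signaling NaNs or the ordering of +0 and -0.

```python
import math


def vmaxnm(a: float, b: float) -> float:
    """Larger operand; a single NaN operand is ignored in favour of the number."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return max(a, b)


def vminnm(a: float, b: float) -> float:
    """Smaller operand; a single NaN operand is ignored in favour of the number."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return min(a, b)
```

If both operands are NaN, the result is still a NaN.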
Notes
You cannot use VMAXNM or VMINNM inside an IT block.
Floating-point exceptions
These instructions can produce Input Denormal, Invalid Operation, Overflow, Underflow, or Inexact
exceptions.
C4.20 VMLA (floating-point)
Syntax
VMLA{cond}.F32 Sd, Sn, Sm
where:
cond
is an optional condition code.
Operation
The VMLA instruction multiplies the values in the operand registers, adds the value in the destination
register, and places the final result in the destination register.
Floating-point exceptions
This instruction can produce Invalid Operation, Overflow, Underflow, Inexact, or Input Denormal
exceptions.
Related references
C1.9 Condition code suffixes on page C1-92
C4.21 VMLS (floating-point)
Syntax
VMLS{cond}.F32 Sd, Sn, Sm
where:
cond
is an optional condition code.
Operation
The VMLS instruction multiplies the values in the operand registers, subtracts the result from the value in
the destination register, and places the final result in the destination register.
Floating-point exceptions
This instruction can produce Invalid Operation, Overflow, Underflow, Inexact, or Input Denormal
exceptions.
Related references
C1.9 Condition code suffixes on page C1-92
C4.22 VMOV (floating-point)
Syntax
VMOV{cond}.F32 Sd, #imm
VMOV{cond}.F32 Sd, Sm
VMOV{cond}.F64 Dd, Dm
where:
cond
is an optional condition code.
Immediate values
Any number that can be expressed as ±n * 2^-r, where n and r are integers, 16 <= n <= 31, and 0 <= r <= 7.
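Whether a constant fits this form can be tested by searching for a suitable n and r. A minimal sketch with a hypothetical function name; it uses exact rational arithmetic so that no rounding affects the check.

```python
import math
from fractions import Fraction


def is_vmov_immediate(value: float) -> bool:
    """True if value equals +/- n * 2**-r with 16 <= n <= 31 and 0 <= r <= 7."""
    if not math.isfinite(value) or value == 0:
        return False
    magnitude = abs(Fraction(value))
    for r in range(8):
        n = magnitude * 2**r          # candidate n for this r
        if n.denominator == 1 and 16 <= n <= 31:
            return True
    return False
```

For example, 1.0 is representable (n = 16, r = 4), as are 0.125 (n = 16, r = 7) and 31.0 (n = 31, r = 0). Neither 0.0 nor 0.1 is representable.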
Architectures
The instructions that copy immediate constants are available in VFPv3 and above.
The instructions that copy from registers are available in all VFP systems.
Related references
C1.9 Condition code suffixes on page C1-92
C4.23 VMOV (between one general-purpose register and single precision floating-point register)
Transfer contents between a single-precision floating-point register and a general-purpose register.
Syntax
VMOV{cond} Rd, Sn
VMOV{cond} Sn, Rd
where:
cond
is an optional condition code.
Operation
VMOV Rd, Sn transfers the contents of Sn into Rd.
VMOV Sn, Rd transfers the contents of Rd into Sn.
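The transfer copies the 32-bit pattern and performs no numeric conversion. The sketch below models that with the struct module; the function names are hypothetical, and because Python floats are doubles, NaN payloads might not survive the round trip exactly as they would in hardware.

```python
import struct


def vmov_r_from_s(s_value: float) -> int:
    """Bit pattern that a general-purpose register receives from an S register."""
    return struct.unpack("<I", struct.pack("<f", s_value))[0]


def vmov_s_from_r(r_value: int) -> float:
    """Single-precision value that an S register holds after the transfer."""
    return struct.unpack("<f", struct.pack("<I", r_value & 0xFFFFFFFF))[0]
```

For example, an S register holding 1.0 transfers as 0x3F800000, not as the integer 1.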
Related references
C1.9 Condition code suffixes on page C1-92
C4.24 VMOV (between two general-purpose registers and one or two extension registers)
Transfer contents between two general-purpose registers and either one 64-bit register or two consecutive
32-bit registers.
Syntax
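VMOV{cond} Dm, Rd, Rn
VMOV{cond} Rd, Rn, Dm
VMOV{cond} Sm, Sm1, Rd, Rn
VMOV{cond} Rd, Rn, Sm, Sm1
where:
cond
is an optional condition code.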
Operation
VMOV Dm, Rd, Rn transfers the contents of Rd into the low half of Dm, and the contents of Rn into the
high half of Dm.
VMOV Rd, Rn, Dm transfers the contents of the low half of Dm into Rd, and the contents of the high half of
Dm into Rn.
VMOV Rd, Rn, Sm, Sm1 transfers the contents of Sm into Rd, and the contents of Sm1 into Rn.
VMOV Sm, Sm1, Rd, Rn transfers the contents of Rd into Sm, and the contents of Rn into Sm1.
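The low and high half placement for the D register forms can be sketched as bit packing. This model assumes that the low half is bits[31:0] and the high half is bits[63:32] of the 64-bit register; the function names are hypothetical.

```python
import struct


def vmov_d_from_rr(rd: int, rn: int) -> float:
    """Model VMOV Dm, Rd, Rn: Rd fills the low half, Rn the high half."""
    bits = ((rn & 0xFFFFFFFF) << 32) | (rd & 0xFFFFFFFF)
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def vmov_rr_from_d(dm: float):
    """Model VMOV Rd, Rn, Dm: returns (Rd, Rn) as the (low, high) halves."""
    bits = struct.unpack("<Q", struct.pack("<d", dm))[0]
    return bits & 0xFFFFFFFF, bits >> 32
```

For example, the double-precision value 1.0 has the bit pattern 0x3FF0000000000000, so it splits into Rd = 0 and Rn = 0x3FF00000.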
Architectures
The instructions are available in VFPv2 and above.
Related references
C1.9 Condition code suffixes on page C1-92
C4.25 VMOV (between a general-purpose register and half a double precision floating-point register)
Syntax
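VMOV{cond}{.size} Dn[x], Rd
VMOV{cond}{.size} Rd, Dn[x]
where:
cond
is an optional condition code.
Operation
VMOV Dn[x], Rd transfers the contents of Rd into Dn[x].
VMOV Rd, Dn[x] transfers the contents of Dn[x] into Rd.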
Related references
C1.9 Condition code suffixes on page C1-92
C4.26 VMRS (floating-point)
Syntax
VMRS{cond} Rd, extsysreg
where:
cond
is an optional condition code.
extsysreg
is the floating-point system register, usually FPSCR, FPSID, or FPEXC.
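Rd
is the general-purpose register. Rd must not be PC. It can be APSR_nzcv if extsysreg is FPSCR, in which
case the floating-point status flags are transferred into the corresponding flags in the APSR.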
Usage
The VMRS instruction transfers the contents of extsysreg into Rd.
Note
The instruction stalls the processor until all current floating-point operations complete.
Examples
VMRS r2, FPSID
VMRS APSR_nzcv, FPSCR ; transfer FP status register to the
; special-purpose APSR
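The APSR_nzcv form copies only the four condition flags. The sketch below extracts them from an FPSCR value, assuming the architectural layout in which N, Z, C, and V occupy bits 31, 30, 29, and 28. The function name is hypothetical.

```python
def fpscr_nzcv(fpscr: int) -> dict:
    """Extract the N, Z, C, V flags from a 32-bit FPSCR value."""
    return {
        "N": (fpscr >> 31) & 1,
        "Z": (fpscr >> 30) & 1,
        "C": (fpscr >> 29) & 1,
        "V": (fpscr >> 28) & 1,
    }
```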
Related references
C1.9 Condition code suffixes on page C1-92
C4.27 VMSR (floating-point)
Syntax
VMSR{cond} extsysreg, Rd
where:
cond
is an optional condition code.
extsysreg
is the floating-point system register, usually FPSCR, FPSID, or FPEXC.
Rd
is the general-purpose register. Rd must not be PC.
Usage
The VMSR instruction transfers the contents of Rd into extsysreg.
Note
The instruction stalls the processor until all current floating-point operations complete.
Example
VMSR FPSCR, r4
Related references
C1.9 Condition code suffixes on page C1-92
C4.28 VMUL (floating-point)
Syntax
VMUL{cond}.F32 {Sd,} Sn, Sm
where:
cond
is an optional condition code.
Operation
The VMUL operation multiplies the values in the operand registers and places the result in the destination
register.
Floating-point exceptions
This instruction can produce Invalid Operation, Overflow, Underflow, Inexact, or Input Denormal
exceptions.
Related references
C1.9 Condition code suffixes on page C1-92
C4.29 VNEG (floating-point)
Syntax
VNEG{cond}.F32 Sd, Sm
VNEG{cond}.F64 Dd, Dm
where:
cond
is an optional condition code.
Operation
The VNEG instruction takes the contents of Sm or Dm, changes the sign bit, and places the result in Sd or Dd.
This gives the negation of the value.
If the operand is a NaN, the sign bit is changed, but no exception is produced.
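Because the operation only changes the sign bit, it can be modelled on the raw register contents, which also makes the NaN case explicit. A minimal sketch with hypothetical function names:

```python
def vneg_f32_bits(bits: int) -> int:
    """Flip bit 31 of a single-precision bit pattern."""
    return (bits ^ 0x80000000) & 0xFFFFFFFF


def vneg_f64_bits(bits: int) -> int:
    """Flip bit 63 of a double-precision bit pattern."""
    return (bits ^ (1 << 63)) & 0xFFFFFFFFFFFFFFFF
```

For example, 0x3F800000 (1.0) becomes 0xBF800000 (-1.0), and the quiet NaN pattern 0x7FC00000 becomes 0xFFC00000 with its payload unchanged.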
Floating-point exceptions
VNEG instructions do not produce any exceptions.
Related references
C1.9 Condition code suffixes on page C1-92
C4.30 VNMLA (floating-point)
Syntax
VNMLA{cond}.F32 Sd, Sn, Sm
where:
cond
is an optional condition code.
Operation
The VNMLA instruction multiplies the values in the operand registers, adds the product to the value in the
destination register, and places the negated final result in the destination register.
Floating-point exceptions
This instruction can produce Invalid Operation, Overflow, Underflow, Inexact, or Input Denormal
exceptions.
Related references
C1.9 Condition code suffixes on page C1-92
C4.31 VNMLS (floating-point)
Syntax
VNMLS{cond}.F32 Sd, Sn, Sm
where:
cond
is an optional condition code.
Operation
The VNMLS instruction multiplies the values in the operand registers, subtracts the result from the value in
the destination register, and places the negated final result in the destination register.
Floating-point exceptions
This instruction can produce Invalid Operation, Overflow, Underflow, Inexact, or Input Denormal
exceptions.
Related references
C1.9 Condition code suffixes on page C1-92
C4.32 VNMUL (floating-point)
Syntax
VNMUL{cond}.F32 {Sd,} Sn, Sm
where:
cond
is an optional condition code.
Operation
The VNMUL instruction multiplies the values in the operand registers and places the negated result in the
destination register.
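The three negated multiply forms (VNMLA, VNMLS, and VNMUL) differ only in how the product is combined with the destination before the sign is changed. A minimal sketch using Python floats, so it corresponds to double-precision operands; d stands for the destination register's original value, and the function names are hypothetical.

```python
def vnmla(d: float, n: float, m: float) -> float:
    """Negated multiply accumulate: -(d + n*m)."""
    return -(d + n * m)


def vnmls(d: float, n: float, m: float) -> float:
    """Negated multiply subtract: -(d - n*m)."""
    return -(d - n * m)


def vnmul(n: float, m: float) -> float:
    """Negated multiply: -(n*m)."""
    return -(n * m)
```

Unlike the fused VFNMA and VFNMS instructions, the product here is rounded before it is combined with d, which is how ordinary Python float arithmetic behaves.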
Floating-point exceptions
This instruction can produce Invalid Operation, Overflow, Underflow, Inexact, or Input Denormal
exceptions.
Related references
C1.9 Condition code suffixes on page C1-92
C4.33 VPOP (floating-point)
Syntax
VPOP{cond} Registers
where:
cond
is an optional condition code.
Registers
is a list of consecutive extension registers enclosed in braces, { and }. The list can be comma-
separated, or in range format. There must be at least one register in the list.
You can specify S or D registers, but they must not be mixed. The number of registers must not
exceed 16 D registers.
Note
VPOP Registers is equivalent to VLDM sp!, Registers.
You can use either form of this instruction. They both disassemble to VPOP.
Related references
C1.9 Condition code suffixes on page C1-92
C4.34 VPUSH (floating-point) on page C4-581
C4.34 VPUSH (floating-point)
Syntax
VPUSH{cond} Registers
where:
cond
is an optional condition code.
Registers
is a list of consecutive extension registers enclosed in braces, { and }. The list can be comma-
separated, or in range format. There must be at least one register in the list.
You can specify S or D registers, but they must not be mixed. The number of registers must not
exceed 16 D registers.
Note
VPUSH Registers is equivalent to VSTMDB sp!, Registers.
You can use either form of this instruction. They both disassemble to VPUSH.
Related references
C1.9 Condition code suffixes on page C1-92
C4.33 VPOP (floating-point) on page C4-580
C4.35 VRINT (floating-point)
Note
This instruction is supported only in Armv8.
Syntax
VRINTmode{cond}.F64.F64 Dd, Dm
VRINTmode{cond}.F32.F32 Sd, Sm
where:
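mode
must be one of:
Z
meaning round towards zero.
R
meaning use the rounding mode specified in the FPSCR.
X
meaning use the rounding mode specified in the FPSCR, generating an Inexact exception if the result is
not numerically equal to the input value.
A
meaning round to nearest, ties away from zero.
N
meaning round to nearest, ties to even.
P
meaning round towards plus infinity.
M
meaning round towards minus infinity.
cond
is an optional condition code.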
Notes
You cannot use VRINT with a rounding mode of A, N, P or M inside an IT block.
Floating-point exceptions
These instructions cannot produce any exceptions, except VRINTX which can generate an Inexact
exception.
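The directed rounding modes can be compared side by side with Python's decimal module. This sketch assumes the usual Arm meanings of the mode letters (A: nearest, ties away from zero; N: nearest, ties to even; P: towards plus infinity; M: towards minus infinity; Z: towards zero). It does not model R or X, which depend on the FPSCR. The name vrint is hypothetical.

```python
import math
from decimal import (Decimal, ROUND_CEILING, ROUND_DOWN, ROUND_FLOOR,
                     ROUND_HALF_EVEN, ROUND_HALF_UP)

_ROUNDING = {
    "A": ROUND_HALF_UP,    # decimal's HALF_UP rounds ties away from zero
    "N": ROUND_HALF_EVEN,
    "P": ROUND_CEILING,
    "M": ROUND_FLOOR,
    "Z": ROUND_DOWN,       # decimal's DOWN rounds towards zero
}


def vrint(x: float, mode: str) -> float:
    """Round x to an integral floating-point value using the given mode."""
    if not math.isfinite(x) or abs(x) >= 2.0**52:
        return x  # NaN, infinity, or already integral
    return float(Decimal(x).quantize(Decimal(1), rounding=_ROUNDING[mode]))
```

The result stays in floating-point format, which is the difference between VRINT and the VCVT conversions to integer.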
Related references
C1.9 Condition code suffixes on page C1-92
C4.36 VSEL
Floating-point select.
Note
This instruction is supported only in Armv8.
Syntax
VSELcond.F32 Sd, Sn, Sm
VSELcond.F64 Dd, Dn, Dm
where:
cond
must be one of GE, GT, EQ, VS.
Sd, Sn, Sm
are the single-precision registers for the result and operands.
Dd, Dn, Dm
are the double-precision registers for the result and operands.
Usage
The VSEL instruction tests the condition flags in the APSR. If the condition is true, it copies the
value in the first operand register into the destination register. Otherwise, it copies the value in
the second operand register.
You cannot use VSEL inside an IT block.
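The selection can be sketched as a lookup on the condition flags. This model assumes that the condition is evaluated against the N, Z, C, and V flags left by an earlier comparison (for example VCMP followed by VMRS APSR_nzcv, FPSCR), using the standard Arm meanings of GE, GT, EQ, and VS. The function name and the dictionary representation of the flags are hypothetical.

```python
def vsel(cond: str, flags: dict, sn: float, sm: float) -> float:
    """Return sn if cond passes against the flags, otherwise sm."""
    n, z, v = flags["N"], flags["Z"], flags["V"]
    passed = {
        "EQ": z == 1,
        "GE": n == v,
        "GT": z == 0 and n == v,
        "VS": v == 1,
    }[cond]
    return sn if passed else sm
```

After a comparison of equal operands (flags N=0, Z=1, C=1, V=0), EQ and GE select the first operand, whereas GT and VS select the second.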
Floating-point exceptions
VSEL instructions cannot produce any exceptions.
Related references
C1.11 Comparison of condition code meanings in integer and floating-point code on page C1-94
C1.9 Condition code suffixes on page C1-92
C4.37 VSQRT
Floating-point square root.
Syntax
VSQRT{cond}.F32 Sd, Sm
VSQRT{cond}.F64 Dd, Dm
where:
cond
is an optional condition code.
Operation
The VSQRT instruction takes the square root of the contents of Sm or Dm, and places the result in Sd or Dd.
Floating-point exceptions
VSQRT instructions can produce Invalid Operation or Inexact exceptions.
Related references
C1.9 Condition code suffixes on page C1-92
C4.38 VSTM (floating-point)
Syntax
VSTMmode{cond} Rn{!}, Registers
where:
mode
must be one of:
IA
meaning Increment address After each transfer. IA is the default, and can be omitted.
DB
meaning Decrement address Before each transfer.
EA
meaning Empty Ascending stack operation. This is the same as IA for stores.
FD
meaning Full Descending stack operation. This is the same as DB for stores.
cond
is an optional condition code.
Rn
is the general-purpose register holding the base address for the transfer.
!
is optional. ! specifies that the updated base address must be written back to Rn. If ! is not
specified, mode must be IA.
Registers
is a list of consecutive extension registers enclosed in braces, { and }. The list can be comma-
separated, or in range format. There must be at least one register in the list.
You can specify S or D registers, but they must not be mixed. The number of registers must not
exceed 16 D registers.
Note
VPUSH Registers is equivalent to VSTMDB sp!, Registers.
You can use either form of this instruction. They both disassemble to VPUSH.
Related references
C1.9 Condition code suffixes on page C1-92
C4.39 VSTR (floating-point)
Syntax
VSTR{cond}{.size} Fd, [Rn{, #offset}]
where:
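cond
is an optional condition code.
size
is an optional data size specifier. It must be 32 if Fd is an S register, or 64 otherwise.
Fd
is the extension register to be saved. It can be either a D or an S register.
Rn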
is the general-purpose register holding the base address for the transfer.
offset
is an optional numeric expression. It must evaluate to a numeric value at assembly time. The
value must be a multiple of 4, and lie in the range -1020 to +1020. The value is added to the
base address to form the address used for the transfer.
Operation
The VSTR instruction saves the contents of an extension register to memory.
One word is transferred if Fd is an S register. Two words are transferred otherwise.
Related references
C1.9 Condition code suffixes on page C1-92
C4.40 VSTR (post-increment and pre-decrement, floating-point)
Note
There are also VLDR and VSTR instructions without post-increment and pre-decrement.
Syntax
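VSTR{cond}{.size} Fd, [Rn], #offset ; post-increment
VSTR{cond}{.size} Fd, [Rn, #-offset]! ; pre-decrement
where:
cond
is an optional condition code.
size
is an optional data size specifier. It must be 32 if Fd is an S register, or 64 otherwise.
Fd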
is the extension register to be saved. It can be either a double precision (Dd) or a single precision
(Sd) register.
Rn
is the general-purpose register holding the base address for the transfer.
offset
is a numeric expression that must evaluate to a numeric value at assembly time. The value must
be 4 if Fd is an S register, or 8 if Fd is a D register.
Operation
The post-increment instruction increments the base address in the register by the offset value, after the
transfer. The pre-decrement instruction decrements the base address in the register by the offset value,
and then performs the transfer using the new address in the register. This pseudo-instruction assembles to
a VSTM instruction.
Related references
C4.39 VSTR (floating-point) on page C4-586
C4.38 VSTM (floating-point) on page C4-585
C1.9 Condition code suffixes on page C1-92
C4.41 VSUB (floating-point)
Syntax
VSUB{cond}.F32 {Sd,} Sn, Sm
where:
cond
is an optional condition code.
Operation
The VSUB instruction subtracts the value in the second operand register from the value in the first operand
register, and places the result in the destination register.
Floating-point exceptions
The VSUB instruction can produce Invalid Operation, Overflow, or Inexact exceptions.
Related references
C1.9 Condition code suffixes on page C1-92
Chapter C5
A32/T32 Cryptographic Algorithms
Lists the cryptographic algorithms that A32 and T32 SIMD instructions support.
C5.1 A32/T32 Cryptographic instructions