64-Bit TX System RISC
TX49/H2 Core Architecture
JAN. 2002
R4000/R4400/R5000 are trademarks of MIPS Technologies, Inc.
The information contained herein is subject to change without notice.
The information contained herein is presented only as a guide for the applications of our
products. No responsibility is assumed by TOSHIBA for any infringements of patents or
other rights of the third parties which may result from its use. No license is granted by
implication or otherwise under any patent or patent rights of TOSHIBA or others.
The products described in this document contain components made in the United States
and subject to export control of the U.S. authorities. Diversion contrary to the U.S. law
is prohibited.
TOSHIBA is continually working to improve the quality and reliability of its products.
Nevertheless, semiconductor devices in general can malfunction or fail due to their
inherent electrical sensitivity and vulnerability to physical stress.
It is the responsibility of the buyer, when utilizing TOSHIBA products, to comply with
the standards of safety in making a safe design for the entire system, and to avoid
situations in which a malfunction or failure of such TOSHIBA products could cause loss
of human life, bodily injury or damage to property.
In developing your designs, please ensure that TOSHIBA products are used within
specified operating ranges as set forth in the most recent TOSHIBA products
specifications.
Also, please keep in mind the precautions and conditions set forth in the “Handling
Guide for Semiconductor Devices,” or “TOSHIBA Semiconductor Reliability
Handbook” etc.
The Toshiba products listed in this document are intended for usage in general
electronics applications (computer, personal equipment, office equipment, measuring
equipment, industrial robotics, domestic appliances, etc.).
These Toshiba products are neither intended nor warranted for usage in equipment that
requires extraordinarily high quality and/or reliability or a malfunction or failure of
which may cause loss of human life or bodily injury (“Unintended Usage”). Unintended
Usage includes atomic energy control instruments, airplane or spaceship instruments,
transportation instruments, traffic signal instruments, combustion control instruments,
medical instruments, all types of safety devices, etc. Unintended Usage of Toshiba
products listed in this document shall be made at the customer’s own risk.
The products described in this document may include products subject to the foreign
exchange and foreign trade laws.
© 2002 TOSHIBA CORPORATION
All Rights Reserved
Preface
Thank you for your new or continued patronage of Toshiba semiconductor products. This is the 1998
edition of the user’s manual for the TX49 Family of 64-bit RISC microprocessors, entitled 64-Bit TX
System RISC TX49/H2 Architecture.
This manual is written so as to be accessible to engineers who may be designing a Toshiba
microprocessor into their products for the first time. No prior knowledge of these devices is assumed.
The manual includes a review of the architecture of the processor family, a description of the TX49
instruction set, and sections dedicated to various other relevant topics, such as the Memory
Management System (MMU) and CPU exceptions.
Toshiba continually updates its technical information. Your comments and suggestions concerning this
and other Toshiba documents are sincerely appreciated and may be used in subsequent editions. For
updates to this document or for additional information about the product, please contact your nearest
Toshiba office or authorized Toshiba dealer.
January 2002
Contents
Handling Precautions
1. Introduction ........................................................................................................................................... 1-1
2. Feature................................................................................................................................................... 2-1
3. TX49 Block Diagram............................................................................................................................. 3-1
4. CPU Registers Overview....................................................................................................................... 4-1
4.1 Introduction ................................................................................................................................... 4-1
4.2 CPU Registers................................................................................................................................ 4-1
4.3 CP0 Registers................................................................................................................................. 4-2
5. CPU Instruction Set Summary ............................................................................................................ 5-1
5.1 Introduction ................................................................................................................................... 5-1
5.2 Instruction Format........................................................................................................................ 5-1
5.3 Instruction Set Overview.............................................................................................................. 5-2
5.3.1 Load and Store Instructions (Table 5-1)..............................................................................5-2
5.3.2 Computational Instructions (Table 5-2)............................................................................... 5-3
5.3.3 Jump and Branch Instructions (Table 5-3).......................................................................... 5-4
5.3.4 Special Instructions (Table 5-4)............................................................................................ 5-5
5.3.5 Exception Instructions (Table 5-5).......................................................................................5-5
5.3.6 Coprocessor Instructions (Table 5-6).................................................................................... 5-6
5.3.7 CP0 Instructions (Table 5-7)................................................................................................. 5-6
5.3.8 Multiply and Divide Instructions (Table 5-8)...................................................................... 5-7
5.3.9 Debug Instructions (Table 5-9).............................................................................................5-7
5.3.10 Other Instructions (Table 5-10)............................................................................................ 5-7
5.4 Instruction Execution Cycles........................................................................................................ 5-7
5.5 Defining Access Types................................................................................................................... 5-8
6. CPU Pipeline ......................................................................................................................................... 6-1
6.1 Introduction ................................................................................................................................... 6-1
6.2 Basic Pipeline Operation............................................................................................................... 6-1
6.3 TX49 Pipeline Activities................................................................................................................ 6-2
6.4 Branch and Load Delay................................................................................................................. 6-3
6.4.1 Delayed load........................................................................................................................... 6-3
6.4.2 Delayed branching................................................................................................................. 6-3
6.5 Non-blocking Load Function......................................................................................................... 6-4
6.6 Interlock and Exception Handling ............................................................................................... 6-4
6.6.1 Overview of Interlock and Exception Handling ..................................................................6-4
6.6.2 Exception Conditions............................................................................................................. 6-6
6.6.3 Stall Conditions..................................................................................................................... 6-6
6.6.4 External Stalls....................................................................................................................... 6-6
6.6.5 Interlock and Exception Timing........................................................................................... 6-6
6.7 Multiply and Multiply/Add Instructions (MULT, MULTU, MADD, MADDU)......................... 6-7
6.8 Divide Instructions (DIV, DIVU).................................................................................................. 6-7
6.9 Streaming....................................................................................................................................... 6-7
7. System Control Coprocessor, CP0........................................................................................................ 7-1
7.1 Introduction ................................................................................................................................... 7-1
7.2 CP0 Registers................................................................................................................................. 7-2
7.2.1 Index register (Reg#0)........................................................................................................... 7-2
7.2.2 Random register (Reg#1)....................................................................................................... 7-3
7.2.3 EntryLo0 register (Reg#2) and EntryLo1 register (Reg#3)................................................. 7-4
7.2.4 Context register (Reg#4) ....................................................................................................... 7-5
7.2.5 PageMask Register (Reg#5).................................................................................................. 7-6
7.2.6 Wired Register (Reg#6) ......................................................................................................... 7-7
7.2.7 BadVAddr Register (Reg#8).................................................................................................. 7-8
7.2.8 Count Register (Reg#9) ......................................................................................................... 7-9
7.2.9 EntryHi Register (Reg#10).................................................................................................. 7-10
7.2.10 Compare Register (Reg#11) ................................................................................................ 7-11
7.2.11 Status Register (Reg#12)..................................................................................................... 7-12
7.2.12 Cause Register (Reg#13) ..................................................................................................... 7-15
7.2.13 EPC Register (Reg#14)........................................................................................................ 7-16
7.2.14 PRId Register (Reg#15)....................................................................................................... 7-17
7.2.15 Config Register (Reg#16)..................................................................................................... 7-18
7.2.16 LLAddr Register (Reg#17) .................................................................................................. 7-20
7.2.17 XContext Register (Reg#20)................................................................................................ 7-21
7.2.18 Debug Register (Reg#23)..................................................................................................... 7-22
7.2.19 DEPC Register (Reg#24)..................................................................................................... 7-24
7.2.20 TagLo Register (Reg#28) and TagHi Register (Reg#29) ................................................... 7-25
7.2.21 ErrorEPC Register (Reg#30)............................................................................................... 7-26
7.2.22 DESAVE Register (Reg#31)................................................................................................ 7-27
7.2.23 The Initialization of CP0 Registers in SoftReset Exception............................................. 7-28
8. Memory Management System.............................................................................................................. 8-1
8.1 Introduction ................................................................................................................................... 8-1
8.2 Address Space Overview............................................................................................................... 8-1
8.2.1 Virtual Address Space........................................................................................................... 8-1
8.2.2 Physical Address Space......................................................................................................... 8-2
8.2.3 Virtual-to-Physical Address Translation............................................................................. 8-2
8.2.4 32-bit Mode Address Translation......................................................................................... 8-3
8.2.5 64-bit Mode Address Translation......................................................................................... 8-4
8.3 Operating Modes ........................................................................................................................... 8-5
8.3.1 User Mode Operations........................................................................................................... 8-5
8.3.2 Supervisor Mode Operations ................................................................................................8-7
8.3.3 Kernel Mode Operations....................................................................................................... 8-9
8.4 Translation Lookaside Buffer..................................................................................................... 8-16
8.4.1 Joint TLB ............................................................................................................................. 8-16
8.4.2 TLB Entry format................................................................................................................ 8-16
8.4.3 Instruction-TLB................................................................................................................... 8-17
8.4.4 Data-TLB ............................................................................................................................. 8-17
8.5 Virtual-to-Physical Address Translation Process .....................................................................8-18
9. Cache Organization............................................................................................................................... 9-1
9.1 Introduction ................................................................................................................................... 9-1
9.2 Instruction Cache (I-Cache).......................................................................................................... 9-1
9.2.1 Instruction Cache Address Field.......................................................................................... 9-1
9.2.2 Instruction Cache Configuration..........................................................................................9-2
9.3 Data Cache..................................................................................................................................... 9-2
9.3.1 Data Cache Address Field..................................................................................................... 9-3
9.3.2 Data Cache Configuration .................................................................................................... 9-3
9.3.3 Data Cache Policies............................................................................................................... 9-4
9.4 FIFO Replacement Algorithm...................................................................................................... 9-5
9.5 Lock function ................................................................................................................................. 9-5
9.5.1 Lock bit setting and clearing ................................................................................................ 9-5
9.5.2 Operation During Lock ......................................................................................................... 9-6
9.5.3 Example of Data Cache Locking........................................................................................... 9-6
9.5.4 Example of Instruction Cache Locking................................................................................ 9-6
9.6 The Primary Cache Accessing ...................................................................................................... 9-7
9.7 Cache States .................................................................................................................................. 9-7
9.8 Cache Line Ownership.................................................................................................................. 9-8
9.9 Cache Multi-Hit Operation........................................................................................................... 9-8
9.10 Cache Test Function...................................................................................................................... 9-8
9.10.1 Cache Disabling..................................................................................................................... 9-8
9.10.2 Cache Flushing...................................................................................................................... 9-9
10. Write Buffer......................................................................................................................................... 10-1
11. CPU Exception..................................................................................................................................... 11-1
11.1 Introduction ................................................................................................................................. 11-1
11.2 Exception Vector Locations......................................................................................................... 11-1
11.3 Priority of Exception ................................................................................................................... 11-2
11.4 ColdReset Exception.................................................................................................................... 11-3
11.4.1 Cause.................................................................................................................................... 11-3
11.4.2 Processing ............................................................................................................................ 11-3
11.4.3 Servicing............................................................................................................................... 11-3
11.5 SoftReset Exception..................................................................................................................... 11-4
11.5.1 Cause.................................................................................................................................... 11-4
11.5.2 Processing ............................................................................................................................ 11-4
11.5.3 Servicing............................................................................................................................... 11-4
11.6 NMI (Non-maskable Interrupt) Exception ................................................................................ 11-5
11.6.1 Cause.................................................................................................................................... 11-5
11.6.2 Processing ............................................................................................................................ 11-5
11.6.3 Servicing............................................................................................................................... 11-5
11.7 Address Error Exception............................................................................................................. 11-6
11.7.1 Cause.................................................................................................................................... 11-6
11.7.2 Processing ............................................................................................................................ 11-6
11.7.3 Servicing............................................................................................................................... 11-6
11.8 TLB Refill Exception................................................................................................................... 11-7
11.8.1 Cause.................................................................................................................................... 11-7
11.8.2 Processing ............................................................................................................................ 11-7
11.8.3 Servicing............................................................................................................................... 11-7
11.9 TLB Invalid Exception................................................................................................................ 11-8
11.9.1 Cause.................................................................................................................................... 11-8
11.9.2 Processing ............................................................................................................................ 11-8
11.9.3 Servicing............................................................................................................................... 11-8
11.10 TLB Modified Exception ............................................................................................................. 11-9
11.10.1 Cause.................................................................................................................................... 11-9
11.10.2 Processing ............................................................................................................................ 11-9
11.10.3 Servicing............................................................................................................................... 11-9
11.11 Bus Error Exception.................................................................................................................. 11-10
11.11.1 Cause.................................................................................................................................. 11-10
11.11.2 Processing .......................................................................................................................... 11-10
11.11.3 Servicing............................................................................................................................. 11-10
11.12 Integer Overflow Exception...................................................................................................... 11-11
11.12.1 Cause.................................................................................................................................. 11-11
11.12.2 Processing .......................................................................................................................... 11-11
11.12.3 Servicing............................................................................................................................. 11-11
11.13 Trap Exception .......................................................................................................................... 11-12
11.13.1 Cause.................................................................................................................................. 11-12
11.13.2 Processing .......................................................................................................................... 11-12
11.13.3 Servicing............................................................................................................................. 11-12
11.14 System Call Exception .............................................................................................................. 11-13
11.14.1 Cause.................................................................................................................................. 11-13
11.14.2 Processing .......................................................................................................................... 11-13
11.14.3 Servicing............................................................................................................................. 11-13
11.15 Breakpoint Exception................................................................................................................ 11-14
11.15.1 Cause.................................................................................................................................. 11-14
11.15.2 Processing .......................................................................................................................... 11-14
11.15.3 Servicing............................................................................................................................. 11-14
11.16 Reserved Instruction Exception ............................................................................................... 11-15
11.16.1 Cause.................................................................................................................................. 11-15
11.16.2 Processing .......................................................................................................................... 11-15
11.16.3 Servicing............................................................................................................................. 11-15
11.17 Coprocessor Unusable Exception ............................................................................................ 11-16
11.17.1 Cause.................................................................................................................................. 11-16
11.17.2 Processing .......................................................................................................................... 11-16
11.17.3 Servicing............................................................................................................................. 11-16
11.18 Floating-Point Exception .......................................................................................................... 11-17
11.18.1 Cause.................................................................................................................................. 11-17
11.18.2 Processing .......................................................................................................................... 11-17
11.18.3 Servicing............................................................................................................................. 11-17
11.19 Interrupt Exception................................................................................................................... 11-18
11.19.1 Cause.................................................................................................................................. 11-18
11.19.2 Processing .......................................................................................................................... 11-18
11.19.3 Servicing............................................................................................................................. 11-18
11.20 Exception Handling and Servicing Flowcharts....................................................................... 11-19
12. Floating-Point Unit, CP1.................................................................................................................... 12-1
12.1 Overview ...................................................................................................................................... 12-1
12.2 Floating Point Register............................................................................................................... 12-1
12.2.1 Floating-Point General Registers (FGRs)..........................................................................12-1
12.2.2 Floating-Point Control Registers........................................................................................ 12-2
12.2.3 Accessing the FP Control and Implementation/Revision Registers................................. 12-5
12.3 Floating-Point Formats............................................................................................................... 12-6
12.4 Binary Fixed-Point Format......................................................................................................... 12-7
12.5 Floating-Point Instruction Set Summary.................................................................................. 12-8
12.5.1 Load, Move and Store Instructions (Table 12-10)............................................................. 12-8
12.5.2 Conversion Instructions (Table 12-11)............................................................................... 12-9
12.5.3 Computational Instructions (Table 12-12)......................................................................... 12-9
12.5.4 Compare and Branch Instructions (Table 12-13)............................................................12-10
13. Floating-Point Exception .................................................................................................................... 13-1
13.1 Introduction ................................................................................................................................. 13-1
13.2 Exception Types...........................................................................................................................13-1
13.3 Exception Trap Processing.......................................................................................................... 13-2
13.4 Flags............................................................................................................................................. 13-2
13.5 FPU Exceptions........................................................................................................................... 13-3
13.6 Saving and Restoring State ........................................................................................................ 13-6
13.7 Trap Handlers for IEEE Standard 754 Exceptions................................................................... 13-6
14. Debug Support Unit............................................................................................................................ 14-1
14.1 Features ....................................................................................................................................... 14-1
14.2 EJTAG interface.......................................................................................................................... 14-1
14.3 JTAG Interface............................................................................................................................ 14-2
14.4 Processor Access Overview......................................................................................................... 14-2
14.5 Instruction ................................................................................................................................... 14-2
14.6 Debug Unit................................................................................................................................... 14-3
14.6.1 Extended Instructions......................................................................................................... 14-3
14.6.2 Extended Debug Registers in CP0 ..................................................................................... 14-3
14.7 Register Map................................................................................................................................ 14-3
14.8 Processor Bus Break Function ................................................................................................... 14-3
14.9 Debug Exception.......................................................................................................................... 14-4
14.9.1 Debug Single Step (DSS)..................................................................................................... 14-4
14.9.2 Debug Breakpoint exception (Dbp) ....................................................................................14-4
14.9.3 JTAG Break Exception........................................................................................................ 14-4
14.9.4 Debug Exception Handling.................................................................................................14-4
14.9.5 Branching to debug handler ............................................................................................... 14-4
14.9.6 Exception handling when in Debug Mode (DM bit is set) ................................................ 14-4
14.10 Real Time PC TRACE Output.................................................................................................... 14-4
15. TX49 MPU Core Signal Descriptions................................................................................................. 15-1
15.1 Signal Descriptions...................................................................................................................... 15-2
15.1.1 Memory Interface Signals................................................................................................... 15-2
15.1.2 DMA Interface Signals........................................................................................................ 15-4
15.1.3 Coprocessor Interface Signals............................................................................................. 15-5
15.1.4 Interrupt Interface Signals................................................................................................. 15-5
15.1.5 Test Interface Signals ......................................................................................................... 15-6
15.1.6 Debug Interface Signals...................................................................................................... 15-6
15.1.7 Clock and System Control Interface Signals..................................................................... 15-7
16. Low Power Consumption Modes......................................................................................................... 16-1
16.1 Halt mode..................................................................................................................................... 16-1
16.2 Doze mode.................................................................................................................................... 16-2
16.3 Status Shifts ................................................................................................................................ 16-3
Appendix A: CPU Instruction Set Details..........................................................................................A-1
A.1 Instruction Classes........................................................................................................................A-1
A.1.1 Instruction Formats ..............................................................................................................A-2
A.1.2 Instruction Notation Conventions........................................................................................A-2
A.1.3 Sign Extension and Zero Extension.....................................................................................A-4
A.1.4 Instruction Notation Examples............................................................................................A-4
A.2 Load and Store Instructions.........................................................................................................A-5
A.3 Jump and Branch Instructions.....................................................................................................A-6
A.4 Coprocessor Instructions...............................................................................................................A-6
A.5 System Control Coprocessor (CP0) Instructions.........................................................................A-6
A.6 CPU Instructions...........................................................................................................................A-7
A.7 Bit Encoding of CPU Instruction OPcodes ..............................................................................A-179
Appendix B: FPU Instruction Set Details..........................................................................................B-1
B.1 Instruction Formats ......................................................................................................................B-1
B.1.1 Floating-Point Loads, Stores, and Moves ............................................................................ B-3
B.1.2 Floating-Point Operations ....................................................................................................B-3
B.2 Instruction Notational Conventions.............................................................................................B-4
B.2.1 Instruction Notation Examples............................................................................................B-4
B.3 Load and Store Instructions.........................................................................................................B-5
B.4 Computational Instructions..........................................................................................................B-7
B.5 Bit Encoding of FPU Instruction OPcodes.................................................................................B-50
Appendix C: Coprocessor 0 Hazards...................................................................................................C-1
C.1 Pipeline Interlock and Hazard in TX49.......................................................................................C-1
C.1.1 Interlock in Load Delay Slot.................................................................................................C-1
C.1.2 Branch Delay Slot..................................................................................................................C-2
C.1.3 Multiply, Multiply/Add and Division Instructions..............................................................C-3
C.1.4 Instructions regarding System Control Co-processor (CP0).............................................C-10
C.1.5 Control Bits Change in CP0 Registers by MTC0 Instruction...........................................C-12
C.2 Pipeline Behavior on Cache Miss...............................................................................................C-16
C.2.1 Instruction Cache Miss .......................................................................................................C-16
C.2.2 Data Cache Miss..................................................................................................................C-17
C.3 Pipeline Behavior in Uncached Area .........................................................................................C-19
C.3.1 Data Read from Uncached Area.........................................................................................C-19
C.3.2 Instruction Fetch from Uncached Area..............................................................................C-19
C.3.3 Data Write to Uncached Area.............................................................................................C-19
C.4 Timings on the Exception Handling...........................................................................................C-20
C.4.1 Basic Pipeline Behavior When Exceptions Occur .............................................................C-20
C.4.2 Exceptions during the Execution of Multi-cycle Instructions ..........................................C-21
C.4.3 Exceptions during the Data Cache Refill Cycle.................................................................C-21
Appendix D: G-Bus Overview............................................................................................................. D-1
D.1 G-Bus Operation........................................................................................................................... D-1
D.2 Types of G-Bus Arbitration.......................................................................................................... D-1
D.2.1 Snoop & Transfer (ST) Concurrency................................................................................... D-1
D.2.2 Execute & Transfer (ET) Concurrency................................................................................ D-2
Appendix E: Differences From TX4955A, TX4300 and TX4600........................................................E-1
Handling Precautions
1. Using Toshiba Semiconductors Safely
TOSHIBA are continually working to improve the quality and the reliability of their products.
Nevertheless, semiconductor devices in general can malfunction or fail due to their inherent
electrical sensitivity and vulnerability to physical stress. It is the responsibility of the buyer, when
utilizing TOSHIBA products, to observe standards of safety, and to avoid situations in which a
malfunction or failure of a TOSHIBA product could cause loss of human life, bodily injury or
damage to property.
In developing your designs, please ensure that TOSHIBA products are used within specified
operating ranges as set forth in the most recent products specifications. Also, please keep in mind
the precautions and conditions set forth in the TOSHIBA Semiconductor Reliability Handbook.
2. Safety Precautions
This section lists important precautions which users of semiconductor devices (and anyone else)
should observe in order to avoid injury and damage to property, and to ensure safe and correct use
of devices.
Please be sure that you understand the meanings of the labels and the graphic symbol described
below before you move on to the detailed descriptions of the precautions.
[Explanation of labels]
Indicates an imminently hazardous situation which will result in death or
serious injury if you do not follow instructions.
Indicates a potentially hazardous situation which could result in death or
serious injury if you do not follow instructions.
Indicates a potentially hazardous situation which, if not avoided, may result
in minor injury or moderate injury.
[Explanation of graphic symbol]
Graphic symbol Meaning
Indicates that caution is required (laser beam is dangerous to eyes).
2.1 General Precautions regarding Semiconductor Devices
Do not use devices under conditions exceeding their absolute maximum ratings (e.g. current, voltage, power dissipation or
temperature).
This may cause the device to break down, degrade its performance, or cause it to catch fire or explode resulting in injury.
Do not insert devices in the wrong orientation.
Make sure that the positive and negative terminals of power supplies are connected correctly. Otherwise the rated maximum
current or power dissipation may be exceeded and the device may break down or undergo performance degradation, causing it to
catch fire or explode and resulting in injury.
When power to a device is on, do not touch the device’s heat sink.
Heat sinks become hot, so you may burn your hand.
Do not touch the tips of device leads.
Because some types of device have leads with pointed tips, you may prick your finger.
When conducting any kind of evaluation, inspection or testing, be sure to connect the testing equipment’s electrodes or probes to
the pins of the device under test before powering it on.
Otherwise, you may receive an electric shock causing injury.
Before grounding an item of measuring equipment or a soldering iron, check that there is no electrical leakage from it.
Electrical leakage may cause the device which you are testing or soldering to break down, or could give you an electric shock.
Always wear protective glasses when cutting the leads of a device with clippers or a similar tool.
If you do not, small bits of metal flying off the cut ends may damage your eyes.
2.2 Precautions Specific to Each Product Group
2.2.1 Optical semiconductor devices
When a visible semiconductor laser is operating, do not look directly into the laser beam or look through the optical system.
This is highly likely to impair vision, and in the worst case may cause blindness.
If it is necessary to examine the laser apparatus, for example to inspect its optical characteristics, always wear the appropriate
type of laser protective glasses as stipulated by IEC standard IEC825-1.
Ensure that the current flowing in an LED device does not exceed the device’s maximum rated current.
This is particularly important for resin-packaged LED devices, as excessive current may cause the package resin to blow up,
scattering resin fragments and causing injury.
When testing the dielectric strength of a photocoupler, use testing equipment which can shut off the supply voltage to the
photocoupler. If you detect a leakage current of more than 100 µA, use the testing equipment to shut off the photocoupler’s
supply voltage; otherwise a large short-circuit current will flow continuously, and the device may break down or burst into flames,
resulting in fire or injury.
When incorporating a visible semiconductor laser into a design, use the device’s internal photodetector or a separate
photodetector to stabilize the laser’s radiant power so as to ensure that laser beams exceeding the laser’s rated radiant power
cannot be emitted.
If this stabilizing mechanism does not work and the rated radiant power is exceeded, the device may break down or the
excessively powerful laser beams may cause injury.
2.2.2 Power devices
Never touch a power device while it is powered on. Also, after turning off a power device, do not touch it until it has thoroughly
discharged all remaining electrical charge.
Touching a power device while it is powered on or still charged could cause a severe electric shock, resulting in death or serious
injury.
When conducting any kind of evaluation, inspection or testing, be sure to connect the testing equipment’s electrodes or probes to
the device under test before powering it on.
When you have finished, discharge any electrical charge remaining in the device.
Connecting the electrodes or probes of testing equipment to a device while it is powered on may result in electric shock, causing
injury.
Do not use devices under conditions which exceed their absolute maximum ratings (current, voltage, power dissipation,
temperature etc.).
This may cause the device to break down, causing a large short-circuit current to flow, which may in turn cause it to catch fire or
explode, resulting in fire or injury.
Use a unit which can detect short-circuit currents and which will shut off the power supply if a short-circuit occurs.
If the power supply is not shut off, a large short-circuit current will flow continuously, which may in turn cause the device to catch
fire or explode, resulting in fire or injury.
When designing a case for enclosing your system, consider how best to protect the user from shrapnel in the event of the device
catching fire or exploding.
Flying shrapnel can cause injury.
When conducting any kind of evaluation, inspection or testing, always use protective safety tools such as a cover for the device.
Otherwise you may sustain injury caused by the device catching fire or exploding.
Make sure that all metal casings in your design are grounded to earth.
Even in modules where a device’s electrodes and metal casing are insulated, capacitance in the module may cause the
electrostatic potential in the casing to rise.
Dielectric breakdown may cause a high voltage to be applied to the casing, causing electric shock and injury to anyone touching it.
When designing the heat radiation and safety features of a system incorporating high-speed rectifiers, remember to take the
device’s forward and reverse losses into account.
The leakage current in these devices is greater than that in ordinary rectifiers; as a result, if a high-speed rectifier is used in an
extreme environment (e.g. at high temperature or high voltage), its reverse loss may increase, causing thermal runaway to occur.
This may in turn cause the device to explode and scatter shrapnel, resulting in injury to the user.
A design should ensure that, except when the main circuit of the device is active, reverse bias is applied to the device gate while
electricity is conducted to control circuits, so that the main circuit will become inactive.
Malfunction of the device may cause serious accidents or injuries.
When conducting any kind of evaluation, inspection or testing, either wear protective gloves or wait until the device has cooled
properly before handling it.
Devices become hot when they are operated. Even after the power has been turned off, the device will retain residual heat which
may cause a burn to anyone touching it.
2.2.3 Bipolar ICs (for use in automobiles)
If your design includes an inductive load such as a motor coil, incorporate diodes or similar devices into the design to prevent
negative current from flowing in.
The load current generated by powering the device on and off may cause it to function erratically or to break down, which could in
turn cause injury.
Ensure that the power supply to any device which incorporates protective functions is stable.
If the power supply is unstable, the device may operate erratically, preventing the protective functions from working correctly. If
protective functions fail, the device may break down causing injury to the user.
3. General Safety Precautions and Usage Considerations
This section is designed to help you gain a better understanding of semiconductor devices, so as to
ensure the safety, quality and reliability of the devices which you incorporate into your designs.
3.1 From Incoming to Shipping
3.1.1 Electrostatic discharge (ESD)
When handling individual devices (which are not yet mounted on a printed
circuit board), be sure that the environment is protected against
static electricity. Operators should wear anti-static clothing, and
containers and other objects which come into direct contact with devices
should be made of anti-static materials and should be grounded to earth via
a 0.5- to 1.0-MΩ protective resistor.
Please follow the precautions described below; this is particularly important
for devices which are marked “Be careful of static.”
(1) Work environment
When humidity in the working environment decreases, the human body and other insulators
can easily become charged with static electricity due to friction. Maintain the recommended
humidity of 40% to 60% in the work environment, while also taking into account the fact that
moisture-proof-packed products may absorb moisture after unpacking.
Be sure that all equipment, jigs and tools in the working area are grounded to earth.
Place a conductive mat over the floor of the work area, or take other appropriate measures, so
that the floor surface is protected against static electricity and is grounded to earth. The surface
resistivity should be 10^4 to 10^8 Ω/sq and the resistance between surface and ground, 7.5 × 10^5 to
10^8 Ω.
Cover the workbench surface also with a conductive mat (with a surface resistivity of 10^4 to
10^8 Ω/sq, for a resistance between surface and ground of 7.5 × 10^5 to 10^8 Ω). The purpose of this
is to disperse static electricity on the surface (through resistive components) and ground it to
earth. Workbench surfaces must not be constructed of low-resistance metallic materials that
allow rapid static discharge when a charged device touches them directly.
Pay attention to the following points when using automatic equipment in your workplace:
(a) When picking up ICs with a vacuum unit, use a conductive rubber fitting on the end of the
pick-up wand to protect against electrostatic charge.
(b) Minimize friction on IC package surfaces. If some rubbing is unavoidable due to the device’s
mechanical structure, minimize the friction plane or use material with a small friction
coefficient and low electrical resistance. Also, consider the use of an ionizer.
(c) In sections which come into contact with device lead terminals, use a material which
dissipates static electricity.
(d) Ensure that no statically charged bodies (such as work clothes or the human body) touch
the devices.
(e) Make sure that sections of the tape carrier which come into contact with installation
devices or other electrical machinery are made of a low-resistance material.
(f) Make sure that jigs and tools used in the assembly process do not touch devices.
(g) In processes in which packages may retain an electrostatic charge, use an ionizer to
neutralize the ions.
Make sure that CRT displays in the working area are protected against static charge, for
example by a VDT filter. As much as possible, avoid turning displays on and off. Doing so can
cause electrostatic induction in devices.
Keep track of charged potential in the working area by taking periodic measurements.
Ensure that work chairs are protected by an anti-static textile cover and are grounded to the
floor surface by a grounding chain. (Suggested resistance between the seat surface and
grounding chain is 7.5 × 10^5 to 10^12 Ω.)
Install anti-static mats on storage shelf surfaces. (Suggested surface resistivity is 10^4 to 10^8
Ω/sq; suggested resistance between surface and ground is 7.5 × 10^5 to 10^8 Ω.)
For transport and temporary storage of devices, use containers (boxes, jigs or bags) that are
made of anti-static materials or materials which dissipate electrostatic charge.
Make sure that cart surfaces which come into contact with device packaging are made of
materials which will conduct static electricity, and verify that they are grounded to the floor
surface via a grounding chain.
In any location where the level of static electricity is to be closely controlled, the ground
resistance level should be Class 3 or above. Use different ground wires for all items of
equipment which may come into physical contact with devices.
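The suggested resistance ranges for the work environment can be kept in a small lookup table and checked against periodic measurements. The following is a minimal sketch in Python, not part of the original handbook: the dictionary keys and the function name `out_of_range` are hypothetical, and the limits assume the ranges above read as 10^4 to 10^8 Ω/sq for surface resistivity, 7.5 × 10^5 to 10^8 Ω for surface-to-ground resistance, and 7.5 × 10^5 to 10^12 Ω for chairs.

```python
# Suggested ranges from the work-environment guidance (ohms, or ohms per
# square for surface resistivity). The key names are hypothetical labels.
SUGGESTED_RANGES = {
    "floor_mat_surface_resistivity": (1e4, 1e8),
    "floor_mat_to_ground": (7.5e5, 1e8),
    "workbench_mat_surface_resistivity": (1e4, 1e8),
    "workbench_mat_to_ground": (7.5e5, 1e8),
    "chair_seat_to_grounding_chain": (7.5e5, 1e12),
    "shelf_mat_surface_resistivity": (1e4, 1e8),
    "shelf_mat_to_ground": (7.5e5, 1e8),
}


def out_of_range(measurements):
    """Return {name: value} for every measurement outside its suggested range."""
    failures = {}
    for name, value in measurements.items():
        low, high = SUGGESTED_RANGES[name]
        if not (low <= value <= high):
            failures[name] = value
    return failures
```

A reading below the lower limit is flagged as well as one above the upper limit, because a surface with too little resistance allows rapid discharge from a charged device.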
(2) Operating environment
Operators must wear anti-static clothing and conductive shoes (or
a leg or heel strap).
Operators must wear a wrist strap grounded to earth via a
resistor of about 1 MΩ.
Soldering irons must be grounded from iron tip to earth, and must be used only at low voltages
(6 V to 24 V).
If the tweezers you use are likely to touch the device terminals, use anti-static tweezers and in
particular avoid metallic tweezers. If a charged device touches a low-resistance tool, rapid
discharge can occur. When using vacuum tweezers, attach a conductive chucking pad to the tip,
and connect it to a dedicated ground used especially for anti-static purposes (suggested
resistance value: 10^4 to 10^8 Ω).
Do not place devices or their containers near sources of strong electrical fields (such as above a
CRT).
When storing printed circuit boards which have devices mounted on them, use a board
container or bag that is protected against static charge. To avoid the occurrence of static charge
or discharge due to friction, keep the boards separate from one another and do not stack them
directly on top of one another.
Ensure, if possible, that any articles (such as clipboards) which are brought to any location
where the level of static electricity must be closely controlled are constructed of anti-static
materials.
In cases where the human body comes into direct contact with a device, be sure to wear
anti-static finger covers or gloves (suggested resistance value: 10^8 Ω or less).
Equipment safety covers installed near devices should have resistance ratings of 10^9 Ω or less.
If a wrist strap cannot be used for some reason, and there is a possibility of imparting friction to
devices, use an ionizer.
The transport film used in TCP products is manufactured from materials in which static
charges tend to build up. When using these products, install an ionizer to prevent the film from
being charged with static electricity. Also, ensure that no static electricity will be applied to the
product’s copper foils by taking measures to prevent static occurring in the peripheral
equipment.
3.1.2 Vibration, impact and stress
Handle devices and packaging materials with care. To avoid damage
to devices, do not toss or drop packages. Ensure that devices are not
subjected to mechanical vibration or shock during transportation.
Ceramic package devices and devices in canister-type packages which
have empty space inside them are subject to damage from vibration
and shock because the bonding wires are secured only at their ends.
Plastic molded devices, on the other hand, have a relatively high level
of resistance to vibration and mechanical shock because their bonding
wires are enveloped and fixed in resin. However, when any device or package type is installed in
target equipment, it is to some extent susceptible to wiring disconnections and other damage from
vibration, shock and stressed solder junctions. Therefore when devices are incorporated into the
design of equipment which will be subject to vibration, the structural design of the equipment
must be thought out carefully.
If a device is subjected to especially strong vibration, mechanical shock or stress, the package or
the chip itself may crack. In products such as CCDs which incorporate window glass, this could
cause surface flaws in the glass or cause the connection between the glass and the ceramic to
separate.
Furthermore, it is known that stress applied to a semiconductor device through the package
changes the resistance characteristics of the chip because of piezoelectric effects. In analog circuit
design attention must be paid to the problem of package stress as well as to the dangers of
vibration and shock as described above.
3.2 Storage
3.2.1 General storage
Avoid storage locations where devices will be exposed to moisture or direct sunlight.
Follow the instructions printed on the device cartons regarding
transportation and storage.
The storage area temperature should be kept within a
temperature range of 5°C to 35°C, and relative humidity should
be maintained at between 45% and 75%.
Do not store devices in the presence of harmful (especially
corrosive) gases, or in dusty conditions.
Use storage areas where there is minimal temperature fluctuation. Rapid temperature changes
can cause moisture to form on stored devices, resulting in lead oxidation or corrosion. As a result,
the solderability of the leads will be degraded.
When repacking devices, use anti-static containers.
Do not allow external forces or loads to be applied to devices while they are in storage.
If devices have been stored for more than two years, their electrical characteristics should be
tested and their leads should be tested for ease of soldering before they are used.
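The general storage limits above (5°C to 35°C, 45% to 75% relative humidity, retest after more than two years) can be expressed as a simple check. This is an illustrative Python sketch only: the function name `storage_issues` and its parameters are hypothetical, and "two years" is approximated as 730 days.

```python
from datetime import date


def storage_issues(temp_c, rh_percent, stored_on, today):
    """List which general-storage guidelines a storage record violates."""
    issues = []
    if not 5 <= temp_c <= 35:
        issues.append("temperature outside 5-35 C")
    if not 45 <= rh_percent <= 75:
        issues.append("relative humidity outside 45-75 %")
    # Assumption: two years is approximated as 2 * 365 days.
    if (today - stored_on).days > 2 * 365:
        issues.append("stored over two years: retest characteristics and solderability")
    return issues
```

For example, `storage_issues(25, 50, date(2020, 1, 1), date(2020, 6, 1))` returns an empty list, since all three guidelines are met.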
3.2.2 Moisture-proof packing
Moisture-proof packing should be handled with care. The handling
procedure specified for each packing type should be followed scrupulously.
If the proper procedures are not followed, the quality and reliability of
devices may be degraded. This section describes general precautions for
handling moisture-proof packing. Since the details may differ from device
to device, refer also to the relevant individual datasheets or databook.
(1) General precautions
Follow the instructions printed on the device cartons regarding transportation and storage.
Do not drop or toss device packing. The laminated aluminum material in it can be rendered
ineffective by rough handling.
The storage area temperature should be kept within a temperature range of 5°C to 30°C, and
relative humidity should be maintained at 90% (max). Use devices within 12 months of the date
marked on the package seal.
If the 12-month storage period has expired, or if the 30% humidity indicator shown in Figure 1
is pink when the packing is opened, it may be advisable, depending on the device and packing
type, to bake the devices at high temperature to remove any moisture. Please refer to the table
below. After the pack has been opened, use the devices in a 5°C to 30°C, 60% RH environment
and within the effective usage period listed on the moisture-proof package. If the effective usage
period has expired, or if the packing has been stored in a high-humidity environment, bake the
devices at high temperature.
Packing   Moisture removal
Tray      If the packing bears the “Heatproof” marking or indicates the maximum temperature which it can
          withstand, bake at 125°C for 20 hours. (Some devices require a different procedure.)
Tube      Transfer devices to trays bearing the “Heatproof” marking or indicating the temperature which they
          can withstand, or to aluminum tubes before baking at 125°C for 20 hours.
Tape      Devices packed on tape cannot be baked and must be used within the effective usage period after
          unpacking, as specified on the packing.
When baking devices, protect the devices from static electricity.
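The moisture-removal guidance above amounts to a short decision procedure. The following Python sketch is one possible reading of it, not a Toshiba-specified algorithm: the function name `moisture_action`, its parameters and the returned strings are hypothetical, and individual datasheets may prescribe a different procedure.

```python
def moisture_action(packing, months_since_seal, indicator_30_is_pink,
                    usage_period_expired=False, stored_in_high_humidity=False):
    """Suggest a handling step for devices taken from moisture-proof packing.

    packing is one of "tray", "tube" or "tape" (the rows of the table above).
    """
    needs_drying = (
        months_since_seal > 12          # 12-month storage period has expired
        or indicator_30_is_pink         # 30% spot on the indicator is pink
        or usage_period_expired         # effective usage period after opening
        or stored_in_high_humidity
    )
    if not needs_drying:
        return "use as is"
    if packing == "tray":
        return "bake at 125 C for 20 hours if the tray is marked heatproof"
    if packing == "tube":
        return ("transfer to heatproof trays or aluminum tubes, "
                "then bake at 125 C for 20 hours")
    if packing == "tape":
        # The source gives no baking procedure for tape; deferring to the
        # datasheet here is an assumption of this sketch.
        return "cannot be baked on tape; refer to the device datasheet"
    raise ValueError("unknown packing type: " + repr(packing))
```

The tape case is kept separate because tape-packed devices cannot be baked, so the only safe course is to use them within the period specified on the packing.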
Moisture indicators can detect the approximate humidity level at a standard temperature of
25°C. 6-point indicators and 3-point indicators are currently in use, but eventually all indicators
will be 3-point indicators.
Figure 1 Humidity indicator: (a) 6-point indicator with spots at 10%, 20%, 30%, 40%, 50% and 60%; (b) 3-point indicator with spots at 20%, 30% and 40%. Both indicators are marked “DANGER IF PINK” and “READ AT LAVENDER BETWEEN PINK & BLUE”; the 6-point indicator is also marked “CHANGE DESICCANT”.
3.3 Design
Care must be exercised in the design of electronic equipment to achieve the desired reliability. It is
important not only to adhere to specifications concerning absolute maximum ratings and
recommended operating conditions, it is also important to consider the overall environment in
which equipment will be used, including factors such as the ambient temperature, transient noise
and voltage and current surges, as well as mounting conditions which affect device reliability. This
section describes some general precauti ons which you should observe when designing circuits and
when mounting devices on printed circuit boards.
For more detailed information about each product family, refer to t he relevant individual technical
datasheets available from Toshiba.
3.3.1 Absolute maximum ratings
Do not use devices under conditions in which their absolute maximum ratings
(e.g. current, voltage, power dissipation or temperature) will be exceeded. A
device may break down or its performance may be degraded, causing it to
catch fire or explode, resulting in injury to the user.
The absolute maximum ratings are rated values which must not be
exceeded during operation, even for an instant. Although absolute
maximum ratings differ from product to product, they essentially
concern the voltage and current at each pin, the allowable power
dissipation, and the junction and storage temperatures.
If the voltage or current on any pin exceeds the absolute maximum
rating, the device’s internal circuitry can become degraded. In the worst
case, heat generated in internal circuitry can fuse wiring or cause the semiconductor chip to break
down.
If storage or operating temperatures exceed rated values, the package seal can deteriorate or the
wires can become disconnected due to the differences between the thermal expansion coefficients
of the materials from which the device is constructed.
3.3.2 Recommended operating conditions
The recommended operating conditions for each device are those necessary to guarantee that the
device will operate as specified in the datasheet.
If greater reliability is required, derate the device’s absolute maximum ratings for voltage, current,
power and temperature before using it.
3.3.3 Derating
When incorporating a device into your design, reduce its rated absolute maximum voltage, current,
power dissipation and operating temperature in order to ensure high reliability.
Since derating differs from application to application, refer to the technical datasheets available
for the various devices used in your design.
3.3.4 Unused pins
If unused pins are left open, some devices can exhibit input instability problems, resulting in
malfunctions such as abrupt increase in current flow. Similarly, if the unused output pins on a
device are connected to the power supply pin, the ground pin or to other output pins, the IC may
malfunction or break down.
Since the details regarding the handling of unused pins differ from device to device and from pin
to pin, please follow the instructions given in the relevant individual datasheets or databook.
CMOS logic IC inputs, for example, have extremely high impedance. If an input pin is left open, it
can easily pick up extraneous noise and become unstable. In this case, if the input voltage level
reaches an intermediate level, it is possible that both the P-channel and N-channel transistors
will be turned on, allowing unwanted supply current to flow. Therefore, ensure that the unused
input pins of a device are connected to the power supply (Vcc) pin or ground (GND) pin of the same
device. For details of what to do with the pins of heat sinks, refer to the relevant technical
datasheet and databook.
3.3.5 Latch-up
Latch-up is an abnormal condition inherent in CMOS devices, in which Vcc gets shorted to ground.
This happens when a parasitic PNPN junction (thyristor structure) internal to the CMOS chip is
turned on, causing a large current of the order of several hundred mA or more to flow between Vcc
and GND, eventually causing the device to break down.
Latch-up occurs when the input or output voltage exceeds the rated value, causing a large current
to flow in the internal chip, or when the voltage on the Vcc (Vdd) pin exceeds its rated value,
forcing the internal chip into a breakdown condition. Once the chip falls into the latch-up state,
even though the excess voltage may have been applied only for an instant, the large current
continues to flow between Vcc (Vdd) and GND (Vss). This causes the device to heat up and, in
extreme cases, to emit gas fumes as well. To avoid this problem, observe the following precautions:
(1) Do not allow voltage levels on the input and output pins either to rise above Vcc (Vdd) or to
fall below GND (Vss). Also, follow any prescribed power-on sequence, so that power is applied
gradually or in steps rather than abruptly.
(2) Do not allow any abnormal noise signals to be applied to the device.
(3) Set the voltage levels of unused input pins to Vcc (Vdd) or GND (Vss).
(4) Do not connect output pins to one another.
3.3.6 Input/Output protection
Wired-AND configurations, in which outputs are connected together, cannot be used, since this
short-circuits the outputs. Outputs should, of course, never be connected to Vcc (Vdd) or GND
(Vss).
Furthermore, ICs with tri-state outputs can undergo performance degradation if a shorted output
current is allowed to flow for an extended period of time. Therefore, when designing circuits, make
sure that tri-state outputs will not be enabled simultaneously.
3.3.7 Load capacitance
Some devices display increased delay times if the load capacitance is large. Also, large charging
and discharging currents will flow in the device, causing noise. Furthermore, since outputs are
shorted for a relatively long time, wiring can become fused.
Consult the technical information for the device being used to determine the recommended load
capacitance.
3.3.8 Thermal design
The failure rate of semiconductor devices is greatly increased as operating temperatures increase.
As shown in Figure 2, the internal thermal stress on a device is the sum of the ambient
temperature and the temperature rise due to power dissipation in the device. Therefore, to
achieve optimum reliability, observe the following precautions concerning thermal design:
(1) Keep the ambient temperature (Ta) as low as possible.
(2) If the device’s dynamic power dissipation is relatively large, select the most appropriate
circuit board material, and consider the use of heat sinks or of forced air cooling. Such
measures will help lower the thermal resistance of the package.
(3) Derate the device’s absolute maximum ratings to minimize thermal stress from power
dissipation.
θja = θjc + θca
θja = (Tj–Ta) / P
θjc = (Tj–Tc) / P
θca = (Tc–Ta) / P
in which θja = thermal resistance between junction and surrounding air (°C/W)
θjc = thermal resistance between junction and package surface, or internal thermal
resistance (°C/W)
θca = thermal resistance between package surface and surrounding air, or external
thermal resistance (°C/W)
Tj = junction temperature or chip temperature (°C)
Tc = package surface temperature or case temperature (°C)
Ta = ambient temperature (°C)
P = power dissipation (W)
Figure 2 Thermal resistance of package (θjc between junction Tj and case Tc; θca between
case Tc and ambient Ta)
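As a worked illustration of the relations above, the following Python sketch rearranges θja = (Tj–Ta) / P to estimate the junction temperature and the allowable power dissipation. The function names and the numeric values are hypothetical and are not taken from any datasheet.

```python
def junction_temperature(ta_c, power_w, theta_jc, theta_ca):
    """Tj = Ta + (theta_jc + theta_ca) * P, from theta_ja = (Tj - Ta) / P."""
    theta_ja = theta_jc + theta_ca  # total junction-to-air resistance (C/W)
    return ta_c + theta_ja * power_w


def max_power(tj_max_c, ta_c, theta_ja):
    """Largest P (W) that keeps Tj at or below tj_max_c for a given Ta."""
    return (tj_max_c - ta_c) / theta_ja


if __name__ == "__main__":
    # Hypothetical values: Ta = 50 C, P = 2 W, theta_jc = 5 C/W, theta_ca = 20 C/W
    print(junction_temperature(50, 2, 5, 20))  # junction temperature in C
    print(max_power(125, 50, 25))              # allowable power in W
```

With these example numbers, lowering θca (for instance with a heat sink or forced air cooling) directly raises the allowable power for the same junction temperature limit.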
3.3.9 Interfacing
When connecting inputs and outputs between devices, make sure input voltage (VIL/VIH) and
output voltage (VOL/VOH) levels are matched. Otherwise, the devices may malfunction. When
connecting devices operating at different supply voltages, such as in a dual-power-supply system,
be aware that erroneous power-on and power-off sequences can result in device breakdown. For
details of how to interface particular devices, consult the relevant technical datasheets and
databooks. If you have any questions or doubts about interfacing, contact your nearest Toshiba
office or distributor.
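One common way to check that levels are matched is to compare the driver's guaranteed output levels with the receiver's required input levels. The sketch below assumes the usual convention (VOH minimum against VIH minimum, VOL maximum against VIL maximum); the function names and the example voltages are hypothetical, so real values must come from the relevant datasheets.

```python
def noise_margins(voh_min, vol_max, vih_min, vil_max):
    """Return (high margin, low margin) in volts; both should be >= 0."""
    return voh_min - vih_min, vil_max - vol_max


def levels_compatible(voh_min, vol_max, vih_min, vil_max):
    """True if the driver's worst-case levels satisfy the receiver's thresholds."""
    high, low = noise_margins(voh_min, vol_max, vih_min, vil_max)
    return high >= 0 and low >= 0


if __name__ == "__main__":
    # Hypothetical driver: VOH(min) = 2.4 V, VOL(max) = 0.4 V
    # Hypothetical receiver A: VIH(min) = 2.0 V, VIL(max) = 0.8 V
    print(levels_compatible(2.4, 0.4, 2.0, 0.8))
    # Hypothetical receiver B: VIH(min) = 3.5 V, VIL(max) = 1.5 V
    print(levels_compatible(2.4, 0.4, 3.5, 1.5))
```

This static check says nothing about power-on and power-off sequencing between different supply voltages, which still has to be verified separately.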
3.3.10 Decoupling
Spike currents generated during switching can cause Vcc (Vdd) and GND (Vss) voltage levels to
fluctuate, causing ringing in the output waveform or a delay in response speed. (The power supply
and GND wiring impedance is normally 50 Ω to 100 Ω.) For this reason, the impedance of power
supply lines with respect to high frequencies must be kept low. This can be accomplished by using
thick and short wiring for the Vcc (Vdd) and GND (Vss) lines and by installing decoupling
capacitors (of approximately 0.01 µF to 1 µF capacitance) as high-frequency filters between Vcc
(Vdd) and GND (Vss) at strategic locations on the printed circuit board.
For low-frequency filtering, it is a good idea to install a 10- to 100-µF capacitor on the printed
circuit board (one capacitor will suffice). If the capacitance is excessively large, however, (e.g.
several thousand µF) latch-up can be a problem. Be sure to choose an appropriate capacitance
value.
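To see why small capacitors serve as high-frequency filters and larger ones as low-frequency filters, it can help to compare their reactance at different frequencies. The sketch below uses the ideal-capacitor formula |Z| = 1 / (2πfC); it deliberately ignores equivalent series resistance and inductance, which matter in real parts, and the chosen frequencies are illustrative assumptions only.

```python
import math


def capacitor_reactance_ohm(freq_hz, cap_f):
    """Magnitude of an ideal capacitor's impedance, 1 / (2 * pi * f * C)."""
    return 1.0 / (2.0 * math.pi * freq_hz * cap_f)


if __name__ == "__main__":
    # Values within the ranges mentioned above; frequencies are hypothetical.
    for label, cap_f, freq_hz in [
        ("0.1 uF at 10 MHz", 0.1e-6, 10e6),
        ("0.1 uF at 1 kHz", 0.1e-6, 1e3),
        ("10 uF at 1 kHz", 10e-6, 1e3),
    ]:
        print(label, round(capacitor_reactance_ohm(freq_hz, cap_f), 3), "ohm")
```

Under this idealized model the 0.1 µF capacitor presents a fraction of an ohm at 10 MHz but more than a kilohm at 1 kHz, which is why a separate larger capacitor is suggested for low-frequency filtering.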
An important point about wiring is that, in the case of high-speed logic ICs, noise is caused mainly
by reflection and crosstalk, or by the power supply impedance. Reflections cause increased signal
delay, ringing, overshoot and undershoot, thereby reducing the device’s safety margins with
respect to noise. To prevent reflections, reduce the wiring length by increasing the device
mounting density so as to lower the inductance (L) and capacitance (C) in the wiring. Extreme
care must be taken, however, when taking this corrective measure, since it tends to cause
crosstalk between the wires. In practice, there must be a trade-off between these two factors.
3.3.11 External noise
Printed circuit boards with long I/O or signal pattern lines are vulnerable to induced noise or
surges from outside sources. Consequently, malfunctions or breakdowns can result from
overcurrent or overvoltage, depending on the types of device used. To protect against noise,
lower the impedance of the pattern line or insert a noise-canceling circuit. Protective measures
must also be taken against surges.
For details of the appropriate protective measures for a particular device, consult the relevant
databook.
3.3.12 Electromagnetic interference
Widespread use of electrical and electronic equipment in recent years has brought with it radio
and TV reception problems due to electromagnetic interference. To use the radio spectrum
effectively and to maintain radio communications quality, each country has formulated
regulations limiting the amount of electromagnetic interference which can be generated by
individual products.
Electromagnetic interference includes conduction noise propagated through power supply and
telephone lines, and noise from direct electromagnetic waves radiated by equipment. Different
measurement methods and corrective measures are used to assess and counteract each specific
type of noise.
Difficulties in controlling electromagnetic interference derive from the fact that there is no
method available which allows designers to calculate, at the design stage, the strength of the
electromagnetic waves which will emanate from each component in a piece of equipment. For this
reason, it is only after the prototype equipment has been completed that the designer can take
measurements using a dedicated instrument to determine the strength of electromagnetic
interference waves. Yet it is possible during system design to incorporate some measures for the
prevention of electromagnetic interference, which can facilitate taking corrective measures once
the design has been completed. These include installing shields and noise filters, and increasing
the thickness of the power supply wiring patterns on the printed circuit board. One effective
method, for example, is to devise several shielding options during design, and then select the most
suitable shielding method based on the results of measurements taken after the prototype has
been completed.
3.3.13 Peripheral circuits
In most cases semiconductor devices are used with peripheral circuits and components. The input
and output signal voltages and currents in these circuits must be chosen to match the
semiconductor device’s specifications. The following factors must be taken into account.
(1) Inappropriate voltages or currents applied to a device’s input pins may cause it to operate
erratically. Some devices contain pull-up or pull-down resistors. When designing your system,
remember to take the effect of this on the voltage and current levels into account.
(2) The output pins on a device have a predetermined external circuit drive capability. If this
drive capability is greater than that required, either incorporate a compensating circuit into
your design or carefully select suitable components for use in external circuits.
3.3.14 Safety standards
Each country has safety standards which must be observed. These safety standards include
requirements for quality assurance systems and design of device insulation. Such requirements
must be fully taken into account to ensure that your design conforms to the applicable safety
standards.
3.3.15 Other precautions
(1) When designing a system, be sure to incorporate fail-safe and other appropriate measures
according to the intended purpose of your system. Also, be sure to debug your system under
actual board-mounted conditions.
(2) If a plastic-package device is placed in a strong electric field, surface leakage may occur due to
the charge-up phenomenon, resulting in device malfunction. In such cases take appropriate
measures to prevent this problem, for example by protecting the package surface with a
conductive shield.
(3) With some microcomputers and MOS memory devices, caution is required when powering on
or resetting the device. To ensure that your design does not violate device specifications,
consult the relevant databook for each constituent device.
(4) Ensure that no conductive material or object (such as a metal pin) can drop onto and short the
leads of a device mounted on a printed circuit board.
3.4 Inspection, Testing and Evaluation
3.4.1 Grounding
Ground all measuring instruments, jigs, tools and soldering irons to earth.
Electrical leakage may cause a device to break down or may result in electric
shock.
3.4.2 Inspection Sequence
① Do not insert devices in the wrong orientation. Make sure that the positive
and negative electrodes of the power supply are correctly connected.
Otherwise, the rated maximum current or maximum power dissipation
may be exceeded and the device may break down or undergo performance
degradation, causing it to catch fire or explode, resulting in injury to the
user.
② When conducting any kind of evaluation, inspection or testing using AC
power with a peak voltage of 42.4 V or DC power exceeding 60 V, be sure to
connect the electrodes or probes of the testing equipment to the device
under test before powering it on. Connecting the electrodes or probes of
testing equipment to a device while it is powered on may result in electric
shock, causing injury.
(1) Apply voltage to the test jig only after inserting the device securely into it. When applying or
removing power, observe the relevant precautions, if any.
(2) Make sure that the voltage applied to the device is off before removing the device from the
test jig. Otherwise, the device may undergo performance degradation or be destroyed.
(3) Make sure that no surge voltages from the measuring equipment are applied to the device.
(4) The chips housed in tape carrier packages (TCPs) are bare chips and are therefore exposed.
During inspection take care not to crack the chip or cause any flaws in it.
Electrical contact may also cause a chip to become faulty. Therefore make sure that nothing
comes into electrical contact with the chip.
3.5 Mounting
There are essentially two main types of semiconductor device package: lead insertion and surface
mount. During mounting on printed circuit boards, devices can become contaminated by flux or
damaged by thermal stress from the soldering process. With surface-mount devices in particular,
the most significant problem is thermal stress from solder reflow, when the entire package is
subjected to heat. This section describes a recommended temperature profile for each mounting
method, as well as general precautions which you should take when mounting devices on printed
circuit boards. Note, however, that even for devices with the same package type, the appropriate
mounting method varies according to the size of the chip and the size and shape of the lead frame.
Therefore, please consult the relevant technical datasheet and databook.
3.5.1 Lead forming
① Always wear protective glasses when cutting the leads of a device with
clippers or a similar tool. If you do not, small bits of metal flying off the cut
ends may damage your eyes.
② Do not touch the tips of device leads. Because some types of device have
leads with pointed tips, you may prick your finger.
Semiconductor devices must undergo a process in which the leads are cut and formed before the
devices can be mounted on a printed circuit board. If undue stress is applied to the interior of a
device during this process, mechanical breakdown or performance degradation can result. This is
attributable primarily to differences between the stress on the device’s external leads and the
stress on the internal leads. If the relative difference is great enough, the device’s internal leads,
adhesive properties or sealant can be damaged. Observe these precautions during the lead-
forming process (this does not apply to surface-mount devices):
(1) Lead insertion hole intervals on the printed circuit board should match the lead pitch of the
device precisely.
(2) If lead insertion hole intervals on the printed circuit board do not precisely match the lead
pitch of the device, do not attempt to forcibly insert devices by pressing on them or by pulling
on their leads.
(3) For the minimum clearance specification between a device and a
printed circuit board, refer to the relevant device’s datasheet and
databook. If necessary, achieve the required clearance by forming
the device’s leads appropriately. Do not use the spacers which are
used to raise devices above the surface of the printed circuit board
during soldering to achieve clearance. These spacers normally
continue to expand due to heat, even after the solder has begun to solidify; this applies severe
stress to the device.
(4) Observe the following precautions when forming the leads of a device prior to mounting.
Use a tool or jig to secure the lead at its base (where the lead meets the device package) while
bending so as to avoid mechanical stress to the device. Also avoid bending or stretching device
leads repeatedly.
Be careful not to damage the lead during lead forming.
Follow any other precautions described in the individual datasheets and databooks for each
device and package type.
3.5.2 Socket mounting
(1) When socket mounting devices on a printed circuit board, use sockets which match the
inserted device’s package.
(2) Use sockets whose contacts have the appropriate contact pressure. If the contact pressure is
insufficient, the socket may not make a perfect contact when the device is repeatedly inserted
and removed; if the pressure is excessively high, the device leads may be bent or damaged
when they are inserted into or removed from the socket.
(3) When soldering sockets to the printed circuit board, use sockets whose construction prevents
flux from penetrating into the contacts or which allows flux to be completely cleaned off.
(4) Make sure the coating agent applied to the printed circuit board for moisture-proofing
purposes does not stick to the socket contacts.
(5) If the device leads are severely bent by a socket as it is inserted or removed and you wish to
repair the leads so as to continue using the device, make sure that this lead correction is only
performed once. Do not use devices whose leads have been corrected more than once.
(6) If the printed circuit board with the devices mounted on it will be subjected to vibration from
external sources, use sockets which have a strong contact pressure so as to prevent the
sockets and devices from vibrating relative to one another.
3.5.3 Soldering temperature profile
The soldering temperature and heating time vary from device to device. Therefore, when
specifying the mounting conditions, refer to the individual datasheets and databooks for the
devices used.
(1) Using a soldering iron
Complete soldering within ten seconds for lead temperatures of up to 260°C, or within three
seconds for lead temperatures of up to 350°C.
(2) Using medium infrared ray reflow
Heating top and bottom with long or medium infrared rays is recommended (see Figure 3).
Figure 3 Heating top and bottom with long or medium infrared rays (long infrared ray heater for
preheating, medium infrared ray heater for reflow, arranged along the product flow)
Complete the infrared ray reflow process within 30 seconds at a package surface temperature of
between 210°C and 240°C.
Refer to Figure 4 for an example of a good temperature profile for infrared or hot air reflow.
Figure 4 Sample temperature profile for infrared or hot air reflow (package surface temperature
in °C versus time in s; axis markings: 140, 160, 210 and 240°C, 60-120 s, and 30 s or less)
(3) Using hot air reflow
Complete hot air reflow within 30 seconds at a package surface temperature of between 210°C
and 240°C.
For an example of a recommended temperature profile, refer to Figure 4 above.
(4) Using solder flow
Apply preheating for 60 to 120 seconds at a temperature of 150°C.
For lead insertion-type packages, complete solder flow within 10 seconds, with the
temperature at the stopper (or, if there is no stopper, at a location more than 1.5 mm from
the body) not exceeding 260°C.
For surface-mount packages, complete soldering within 5 seconds at a temperature of 250°C or
less in order to prevent thermal stress in the device.
Figure 5 shows an example of a recommended temperature profile for surface-mount packages
using solder flow.
Figure 5 Sample temperature profile for solder flow (package surface temperature in °C versus
time in s; axis markings: 140, 160 and 250°C, 60-120 s, and 5 s or less)
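A measured temperature log can be compared against the reflow limits quoted above (30 seconds or less between 210°C and 240°C). The Python sketch below is one possible reading of that limit: it measures the time spent at or above 210°C, using linear interpolation between samples, and checks that the peak stays at or below 240°C. The function names and the sample profile are hypothetical, and the limits for a specific device must be taken from its datasheet.

```python
def time_at_or_above(samples, threshold_c):
    """Seconds spent at or above threshold_c.

    samples is a list of (time_s, temp_c) pairs sorted by time; the
    temperature is assumed to change linearly between samples.
    """
    total = 0.0
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        above0, above1 = c0 >= threshold_c, c1 >= threshold_c
        if above0 and above1:
            total += t1 - t0
        elif above0 != above1:
            # Interpolate the instant at which the threshold is crossed.
            t_cross = t0 + (threshold_c - c0) * (t1 - t0) / (c1 - c0)
            total += (t1 - t_cross) if above1 else (t_cross - t0)
    return total


def reflow_profile_ok(samples, low_c=210.0, peak_c=240.0, max_seconds=30.0):
    """True if the peak is within peak_c and the time at or above low_c is within max_seconds."""
    peak = max(temp for _, temp in samples)
    return peak <= peak_c and time_at_or_above(samples, low_c) <= max_seconds


if __name__ == "__main__":
    # Hypothetical log of package surface temperature: (time in s, temperature in C)
    profile = [(0, 25), (60, 150), (150, 160), (170, 210),
               (185, 235), (195, 210), (220, 100)]
    print(time_at_or_above(profile, 210), reflow_profile_ok(profile))
```

The same helper can be called with other thresholds, for example 250°C and 5 seconds for surface-mount packages in solder flow, by passing different parameter values.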
3.5.4 Flux cleaning and ultrasonic cleaning
(1) When cleaning circuit boards to remove flux, make sure that no residual reactive ions such as
Na or Cl remain. Note that organic solvents react with water to generate hydrogen chloride
and other corrosive gases which can degrade device performance.
(2) Washing devices with water will not cause any problems. However, make sure that no
reactive ions such as sodium and chlorine are left as a residue. Also, be sure to dry devices
sufficiently after washing.
(3) Do not rub device markings with a brush or with your hand during cleaning or while the
devices are still wet from the cleaning agent. Doing so can rub off the markings.
(4) The dip cleaning, shower cleaning and steam cleaning processes all involve the chemical
action of a solvent. Use only recommended solvents for these cleaning methods. When
immersing devices in a solvent or steam bath, make sure that the temperature of the liquid is
50°C or below, and that the circuit board is removed from the bath within one minute.
(5) Ultrasonic cleaning should not be used with hermetically-sealed ceramic packages such as a
leadless chip carrier (LCC), pin grid array (PGA) or charge-coupled device (CCD), because the
bonding wires can become disconnected due to resonance during the cleaning process. Even if
a device package allows ultrasonic cleaning, limit the duration of ultrasonic cleaning to as
short a time as possible, since long hours of ultrasonic cleaning degrade the adhesion between
the mold resin and the frame material. The following ultrasonic cleaning conditions are
recommended:
Frequency: 27 kHz to 29 kHz
Ultrasonic output power: 300 W or less (0.25 W/cm² or less)
Cleaning time: 30 seconds or less
Suspend the circuit board in the solvent bath during ultrasonic cleaning in such a way that
the ultrasonic vibrator does not come into direct contact with the circuit board or the device.
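The recommended ultrasonic cleaning conditions above lend themselves to a simple process-setup check. The following Python sketch is only an illustration; the function and parameter names are hypothetical, and the power density is taken as a direct input because the text does not state the area to which the W/cm² figure refers.

```python
def check_ultrasonic_settings(freq_khz, power_w, power_density_w_cm2, time_s):
    """Return a list of settings that fall outside the recommended conditions."""
    problems = []
    if not 27 <= freq_khz <= 29:
        problems.append("frequency outside 27 kHz to 29 kHz")
    if power_w > 300:
        problems.append("output power above 300 W")
    if power_density_w_cm2 > 0.25:
        problems.append("power density above 0.25 W/cm2")
    if time_s > 30:
        problems.append("cleaning time above 30 seconds")
    return problems


if __name__ == "__main__":
    # Hypothetical settings for two cleaning stations
    print(check_ultrasonic_settings(28, 200, 0.2, 20))
    print(check_ultrasonic_settings(40, 350, 0.3, 60))
```

An empty list means all four settings are within the recommended conditions; it does not indicate that a given package type permits ultrasonic cleaning in the first place.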
3.5.5 No cleaning
If analog devices or high-speed devices are used without being cleaned, flux residues may cause
minute amounts of leakage between pins. Similarly, dew condensation, which occurs in
environments containing residual chlorine when power to the device is on, may cause between-
lead leakage or migration. Therefore, Toshiba recommends that these devices be cleaned.
However, if the flux used contains only a small amount of halogen (0.05 wt% or less), the devices
may be used without cleaning and without any problems.
3.5.6 Mounting tape carrier packages (TCPs)
(1) When tape carrier packages (TCPs) are mounted, measures must be taken to prevent
electrostatic breakdown of the devices.
(2) If devices are being picked up from tape, or outer lead bonding (OLB) mounting is being
carried out, consult the manufacturer of the insertion machine which is being used, in order
to establish the optimum mounting conditions in advance and to avoid any possible hazards.
(3) The base film, which is made of polyimide, is hard and thin. Be careful not to cut or scratch
your hands or any objects while handling the tape.
(4) When punching tape, try not to scatter broken pieces of tape too much.
(5) Treat the extra film, reels and spacers left after punching as industrial waste, taking care not
to destroy or pollute the environment.
(6) Chips housed in tape carrier packages (TCPs) are bare chips and therefore have their reverse
side exposed. To ensure that the chip will not be cracked during mounting, ensure that no
mechanical shock is applied to the reverse side of the chip. Electrical contact may also cause a
chip to fail. Therefore, when mounting devices, make sure that nothing comes into electrical
contact with the reverse side of the chip.
If your design requires connecting the reverse side of the chip to the circuit board, please
consult Toshiba or a Toshiba distributor beforehand.
3.5.7 Mounting chips
Devices delivered in chip form tend to degrade or break under external forces much more easily
than plastic-packaged devices. Therefore, caution is required when handling this type of device.
(1) Mount devices in a properly prepared environment so that chip surfaces will not be exposed to
polluted ambient air or other polluted substances.
(2) When handling chips, be careful not to expose them to static electricity.
In particular, measures must be taken to prevent static damage during the mounting of chips.
With this in mind, Toshiba recommends mounting all peripheral parts first and then mounting
chips last (after all other components have been mounted).
(3) Make sure that PCBs (or any other kind of circuit board) on which chips are being mounted do
not have any chemical residues on them (such as the chemicals which were used for etching
the PCBs).
(4) When mounting chips on a board, use the method of assembly that is most suitable for
maintaining the appropriate electrical, thermal and mechanical properties of the
semiconductor devices used.
* For details of devices in chip form, refer to the relevant device’s individual datasheets.
3.5.8 Circuit board coating
When devices are to be used in equipment requiring a high degree of reliability or in extreme
environments (where moisture, corrosive gas or dust is present), circuit boards may be coated for
protection. However, before doing so, you must carefully consider the possible stress and
contamination effects that may result and then choose the coating resin which results in the
minimum level of stress to the device.
3.5.9 Heat sinks
(1) When attaching a heat sink to a device, be careful not to apply excessive force to the device in
the process.
(2) When attaching a device to a heat sink by fixing it at two or more locations, evenly tighten all
the screws in stages (i.e. do not fully tighten one screw while the rest are still only loosely
tightened). Finally, fully tighten all the screws up to the specified torque.
(3) Drill holes for screws in the heat sink exactly as specified. Smooth the
surface by removing burrs and protrusions or indentations which might
interfere with the installation of any part of the device.
(4) A coating of silicone compound can be applied between the heat sink and
the device to improve heat conductivity. Be sure to apply the coating
thinly and evenly; do not use too much. Also, be sure to use a non-volatile
compound, as volatile compounds can crack after a time, causing the heat
radiation properties of the heat sink to deteriorate.
(5) If the device is housed in a plastic package, use caution when selecting the type of silicone
compound to be applied between the heat sink and the device. With some types, the base oil
separates and penetrates the plastic package, significantly reducing the useful life of the
device.
Two recommended silicone compounds in which base oil separation is not a problem are
YG6260 from Toshiba Silicone.
(6) Heat-sink-equipped devices can become very hot during operation. Do not touch them, or you
may sustain a burn.
3.5.10 Tightening torque
(1) Make sure the screws are tightened with fastening torques not exceeding the torque values
stipulated in individual datasheets and databooks for the devices used.
(2) Do not allow a power screwdriver (electrical or air-driven) to touch devices.
3.5.11 Repeated device mounting and usage
Do not remount or re-use devices which fall into the categories listed below; these devices may
cause significant problems relating to performance and reliability.
(1) Devices which have been removed from the board after soldering
(2) Devices which have been inserted in the wrong orientation or which have had reverse current
applied
(3) Devices which have undergone lead forming more than once
3.6 Protecting Devices in the Field
3.6.1 Temperature
Semiconductor devices are generally more sensitive to temperature than are other electronic
components. The various electrical characteristics of a semiconductor device are dependent on the
ambient temperature at which the device is used. It is therefore necessary to understand the
temperature characteristics of a device and to incorporate device derating into circuit design. Note
also that if a device is used above its maximum temperature rating, device deterioration is more
rapid and it will reach the end of its usable life sooner than expected.
3.6.2 Humidity
Resin-molded devices are sometimes improperly sealed. When these devices are used for an
extended period of time in a high-humidity environment, moisture can penetrate into the device
and cause chip degradation or malfunction. Furthermore, when devices are mounted on a regular
printed circuit board, the impedance between wiring components can decrease under high-
humidity conditions. In systems which require a high signal-source impedance, circuit board
leakage or leakage between device lead pins can cause malfunctions. The application of a
moisture-proof treatment to the device surface should be considered in this case. On the other
hand, operation under low-humidity conditions can damage a device due to the occurrence of
electrostatic discharge. Unless damp-proofing measures have been specifically taken, use devices
only in environments with appropriate ambient moisture levels (i.e. within a relative humidity
range of 40% to 60%).
3.6.3 Corrosive gases
Corrosive gases can cause chemical reactions in devices, degrading device characteristics.
For example, sulphur-bearing corrosive gases emanating from rubber placed near a device
(accompanied by condensation under high-humidity conditions) can corrode a device’s leads. The
resulting chemical reaction between leads forms foreign particles which can cause electrical
leakage.
3.6.4 Radioactive and cosmic rays
Most industrial and consumer semiconductor devices are not designed with protection against
radioactive and cosmic rays. Devices used in aerospace equipment or in radioactive environments
must therefore be shielded.
3.6.5 Strong electrical and magnetic fields
Devices exposed to strong magnetic fields can undergo a polarization phenomenon in their plastic
material, or within the chip, which gives rise to abnormal symptoms such as impedance changes
or increased leakage current. Failures have been reported in LSIs mounted near malfunctioning
deflection yokes in TV sets. In such cases the device’s installation location must be changed or the
device must be shielded against the electrical or magnetic field. Shielding against magnetism is
especially necessary for devices used in an alternating magnetic field because of the electromotive
forces generated in this type of environment.
3.6.6 Interference from light (ultraviolet rays, sunlight, fluorescent lamps and
incandescent lamps)
Light striking a semiconductor device generates electromotive force due to photoelectric effects. In
some cases the device can malfunction. This is especially true for devices in which the internal
chip is exposed. When designing circuits, make sure that devices are protected against incident
light from external sources. This problem is not limited to optical semiconductors and EPROMs.
All types of device can be affected by light.
3.6.7 Dust and oil
Just like corrosive gases, dust and oil can cause chemical reactions in devices, which will
adversely affect a device’s electrical characteristics. To avoid this problem, do not use devices in
dusty or oily environments. This is especially important for optical devices because dust and oil
can affect a device’s optical characteristics as well as its physical integrity and the electrical
performance factors mentioned above.
3.6.8 Fire
Semiconductor devices are combustible; they can emit smoke and catch fire if heated sufficiently.
When this happens, some devices may generate poisonous gases. Devices should therefore never
be used in close proximity to an open flame or a heat-generating body, or near flammable or
combustible materials.
3.7 Disposal of Devices and Packing Materials
When discarding unused devices and packing materials, follow all procedures specified by local
regulations in order to protect the environment against contamination.
4. Precautions and Usage Considerations
This section describes matters specific to each product group which need to be taken into
consideration when using devices. If the same item is described in Sections 3 and 4, the
description in Section 4 takes precedence.
4.1 Microcontrollers
4.1.1 Design
(1) Using resonators which are not specifically recommended for use
Resonators recommended for use with Toshiba products in microcontroller oscillator applications
are listed in Toshiba databooks along with information about oscillation conditions. If you use a
resonator not included in this list, please consult Toshiba or the resonator manufacturer
concerning the suitability of the device for your application.
(2) Undefined functions
In some microcontrollers certain instruction code values do not constitute valid processor
instructions. Also, it is possible that the values of bits in registers will become undefined. Take
care in your applications not to use invalid instructions or to let register bit values become
undefined.
64-Bit TX System RISC
TX49/H2 Core Architecture
TX49/H2 Architecture
I TX49/H2 Processor Core Specification
1. Introduction
The TX49/H2 Processor Core is a high-performance, low-power 64-bit RISC microprocessor
core developed by Toshiba. It is well suited to embedded applications such as networking,
laser printers, set-top boxes (STB) and 3-D graphics.
In this manual, the TX49/H2 is hereinafter called “TX49”.
2. Features
- 64-bit operation
  - 32 x 64-bit integer general-purpose registers
  - 32 x 64-bit floating-point general-purpose registers
  - 64 GB physical address space
- Instruction set
  - Upward compatible with the MIPS I, MIPS II and MIPS III ISA
  - MAC (Multiply and Accumulate) instructions
  - PREF (Prefetch) instruction
- Optimized 5-stage pipeline
- Instruction cache
  - 8 KB/16 KB/32 KB: fixed for each product
  - Four-way set associative
  - Lock function support (Way1-Way3)
- Data cache
  - 8 KB/16 KB/32 KB: fixed for each product
  - Four-way set associative
  - Lock function support (Way1-Way3)
  - Write policies
    - Write-back
    - Write-through-No-Write-Allocate-Snoop
    - Write-through-Write-Allocate-Snoop
- MMU
  - 48-double-entry (even/odd) Joint TLB
  - 2-entry Instruction TLB
  - 4-entry Data TLB
- IEEE 754 compatible single- and double-precision FPU
  - Single- and double-precision FPU in hardware
- Debug support (EJTAG)
  - Debug instructions
  - Real-time debugging is supported by debug module logic
- Power management modes (Halt, Doze)
  - WAIT instruction
3. TX49 Block Diagram
Figure 3-1 shows the block diagram of the TX49 Pure Core, MPU Core and MCU. The TX49 Pure Core
includes an instruction cache and a data cache. The cache configuration can be selected from
among a variety of possible configurations to suit the user system. Cache size is predetermined
for each ASSP product, however.
[Figure 3-1 labels: TX49 Pure Core, TX49 MPU Core, TX49 MCU. Blocks shown:
Instruction Cache (8 KB/16 KB/32 KB, 4-way set associative, lockable);
Data Cache (8 KB/16 KB/32 KB, 4-way set associative, lockable, WB/WT);
Integer Unit (GPR, Data Path, MAC, Pipeline Control);
CP0 (CP0 Register, MMU/TLB, Exception Unit);
FPU (CP1) (FP Register, Data Path);
Debug Support Unit; Write Buffer; GBUS I/F; Peripheral]
Figure 3-1 Block Diagram of the TX49
4. CPU Registers Overview
4.1 Introduction
The TX49 has CPU registers for integer operations and address calculation, and CP0
registers for memory system management and exception handling.
4.2 CPU Registers
The TX49 has the following 64-bit CPU registers.
32 general-purpose registers
A 64-bit program counter
HI/LO registers for storing the results of multiply and divide operations
Figure 4-1 shows the configuration of these registers.
General Purpose Registers (GPR), bits 63-0: r0 (= 0), r1, r2, ..., r29, r30, r31 (= Link Address)
Multiply/Divide Registers, bits 63-0: HI, LO
Program Counter, bits 63-0: PC
Figure 4-1 TX49 CPU registers
The r0 and r31 registers of GPR have special functions as follows.
Register r0 always contains the value 0. It can be a target register of an instruction
whose operation result is not needed. Or, it can be a source register of an instruction
that requires a value of 0.
Register r31 is the link register for the Jump and Link instruction. The address of
the instruction after the delay slot is placed in r31.
The TX49 has the following special registers, which are used or modified implicitly by
certain instructions.
HI - Holds the high-order bits of the result of integer multiply operation or the
remainder of integer divide operation.
LO - Holds the low-order bits of the result of integer multiply operation or the
quotient of integer divide operation.
These two registers are used to store the result of an integer multiplication or division. In
multiplication, the 64 high-order bits of a 128-bit result are stored in the HI, and the 64 low-
order bits are stored in the LO. In division, the resulting quotient is stored in the LO, and the
remainder is stored in the HI.
PC - Program Counter
The register contains the address of the currently executed instruction.
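The HI/LO behaviour described above can be sketched in a few lines of Python. This is only an illustrative model of the unsigned doubleword case, not the hardware implementation; the function names `dmultu` and `ddivu` are hypothetical helpers chosen to echo the instruction mnemonics, and the handling of a zero divisor (raising an error) is an assumption made for the sketch.

```python
MASK64 = (1 << 64) - 1  # all-ones 64-bit mask

def dmultu(rs, rt):
    """Model an unsigned 64 x 64 multiply; returns (HI, LO)."""
    product = (rs & MASK64) * (rt & MASK64)   # up to 128 bits wide
    hi = (product >> 64) & MASK64             # 64 high-order bits go to HI
    lo = product & MASK64                     # 64 low-order bits go to LO
    return hi, lo

def ddivu(rs, rt):
    """Model an unsigned 64-bit divide; returns (HI, LO) = (remainder, quotient)."""
    divisor = rt & MASK64
    if divisor == 0:
        # Assumption for this sketch only: refuse a zero divisor.
        raise ZeroDivisionError("divisor is zero")
    quotient, remainder = divmod(rs & MASK64, divisor)
    return remainder, quotient                # remainder in HI, quotient in LO
```

For example, `ddivu(17, 5)` returns `(2, 3)`: the remainder 2 corresponds to HI and the quotient 3 to LO.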
4.3 CP0 Registers
The TX49 has 32-bit and 64-bit System Control Coprocessor (CP0) registers. These
registers are used for memory system management and exception handling. Table 4-1 lists the
CP0 registers built into the TX49. They are described in more detail in Chapter 7.
Table 4-1 CP0 Registers
Register Name Reg. No. Register Name Reg. No.
Index Reg#0 Config Reg#16
Random Reg#1 LLAddr Reg#17
EntryLo0 Reg#2 (Reserved) (Note 1) Reg#18
EntryLo1 Reg#3 (Reserved) (Note 1) Reg#19
Context Reg#4 XContext Reg#20
PageMask Reg#5 (Reserved) (Note 1) Reg#21
Wired Reg#6 (Reserved) (Note 1) Reg#22
(Reserved) (Note 1) Reg#7 Debug (Note 2) Reg#23
BadVAddr Reg#8 DEPC (Note 2) Reg#24
Count Reg#9 (Reserved) (Note 1) Reg#25
EntryHi Reg#10 (Reserved) (Note 1) Reg#26
Compare Reg#11 (Reserved) (Note 1) Reg#27
Status Reg#12 TagLo Reg#28
Cause Reg#13 TagHi Reg#29
EPC Reg#14 ErrorEPC Reg#30
PRId Reg#15 DESAVE (Note 2) Reg#31
Note 1: These registers are used to test the System Control Coprocessor (CP0) and should not be
accessed by the user.
Note 2: These registers are exclusively used by external in-circuit emulators (ICE).
5. CPU Instruction Set Summary
5.1 Introduction
Each instruction is 32 bits long. These instructions are upward compatible with the MIPS I,
II and III instruction set architecture and the TX39’s instructions.
5.2 Instruction Format
There are three instruction formats: Immediate (I-type), Jump (J-type) and Register (R-
type), as shown in Figure 5-1. Having just three instruction formats simplifies instruction
decoding. If more complex functions or addressing modes are required, they can be produced
with the compiler using combinations of the instructions.
Immediate (I-type)
31 26 25 21 20 16 15 0
op rs rt immediate
Jump (J-type)
31 26 25 0
op target
Register (R-type)
31 26 25 21 20 16 15 11 10 6 5 0
op rs rt rd sa funct
op         Operation code (6 bits)
rs         Source register (5 bits)
rt         Target (source or destination) register, or branch condition (5 bits)
rd         Destination register (5 bits)
immediate  Immediate, branch displacement, address displacement (16 bits)
target     Branch target address (26 bits)
sa         Shift amount (5 bits)
funct      Function (6 bits)
Figure 5-1 Instruction formats and subfield mnemonics
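As an illustration of the field layout in Figure 5-1, the following Python sketch extracts every subfield from a 32-bit instruction word. The function name `decode_fields` is hypothetical, and the function does not decide which format applies; it simply returns all fields so that the caller can pick those relevant to the I-type, J-type or R-type interpretation.

```python
def decode_fields(word):
    """Split a 32-bit instruction word into the subfields of Figure 5-1."""
    word &= 0xFFFFFFFF
    return {
        "op": (word >> 26) & 0x3F,       # bits 31..26, all formats
        "rs": (word >> 21) & 0x1F,       # bits 25..21, I-type and R-type
        "rt": (word >> 16) & 0x1F,       # bits 20..16, I-type and R-type
        "rd": (word >> 11) & 0x1F,       # bits 15..11, R-type
        "sa": (word >> 6) & 0x1F,        # bits 10..6,  R-type
        "funct": word & 0x3F,            # bits 5..0,   R-type
        "immediate": word & 0xFFFF,      # bits 15..0,  I-type
        "target": word & 0x03FFFFFF,     # bits 25..0,  J-type
    }
```

For instance, the word 0x00E54021 decodes to op = 0, rs = 7, rt = 5, rd = 8, sa = 0 and funct = 0x21, which in the standard MIPS encoding is ADDU r8, r7, r5. The opcode and function values come from the MIPS ISA encoding tables rather than from this section.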
5.3 Instruction Set Overview
5.3.1 Load and Store Instructions (Table 5-1)
Load and Store instructions move data between memory and general purpose registers,
and are all I-type instructions. The only directly supported addressing mode is “base
register plus 16-bit signed immediate offset”.
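The "base register plus 16-bit signed immediate offset" addressing mode can be sketched as follows. This is a minimal model that assumes 64-bit wrap-around arithmetic; the helper names `sign_extend16` and `effective_address` are hypothetical, and address translation and alignment checks are deliberately left out.

```python
MASK64 = (1 << 64) - 1

def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def effective_address(base, offset16):
    """Virtual address = base register + sign-extended 16-bit offset (mod 2**64)."""
    return (base + sign_extend16(offset16)) & MASK64
```

With a base of 0x1000, an immediate field of 0xFFFC (that is, -4) gives the address 0x0FFC, while an immediate of 0x0010 gives 0x1010.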
Table 5-1 CPU Instruction Set: Load and Store Instructions
Instruction Description Note
LB Load Byte MIPS I
LBU Load Byte Unsigned MIPS I
LH Load Halfword MIPS I
LHU Load Halfword Unsigned MIPS I
LW Load Word MIPS I
LWL Load Word Left MIPS I
LWR Load Word Right MIPS I
SB Store Byte MIPS I
SH Store Halfword MIPS I
SW Store Word MIPS I
SWL Store Word Left MIPS I
SWR Store Word Right MIPS I
LD Load Doubleword MIPS III
LDL Load Doubleword Left MIPS III
LDR Load Doubleword Right MIPS III
LL Load Linked MIPS II
LLD Load Linked Doubleword MIPS III
LWU Load Word Unsigned MIPS III
SC Store Conditional MIPS II
SCD Store Conditional Doubleword MIPS III
SD Store Doubleword MIPS III
SDL Store Doubleword Left MIPS III
SDR Store Doubleword Right MIPS III
SYNC Sync MIPS II
5.3.2 Computational Instructions (Table 5-2)
Computational instructions perform arithmetic, logical or shift operations on values in
registers. This instruction format can be R-type or I-type. With R-type instructions, the
one/two operands and the result are register values. With I-type instructions, one of the
operands is 16-bit immediate data. Computational instructions can be classified as
follows.
ALU immediate
Three-operand register-type
Shift
Multiply/Divide
Table 5-2 CPU Instruction Set: Computational Instructions
Instruction Description Note
(ALU Immediate)
ADDI Add Immediate MIPS I
ADDIU Add Immediate Unsigned MIPS I
SLTI Set on Less Than Immediate MIPS I
SLTIU Set on Less Than Immediate Unsigned MIPS I
ANDI AND Immediate MIPS I
ORI OR Immediate MIPS I
XORI Exclusive OR Immediate MIPS I
LUI Load Upper Immediate MIPS I
DADDI Doubleword Add Immediate MIPS III
DADDIU Doubleword Add Immediate Unsigned MIPS III
(ALU 3-Operand, register type)
ADD Add MIPS I
ADDU Add Unsigned MIPS I
SUB Subtract MIPS I
SUBU Subtract Unsigned MIPS I
SLT Set on Less Than MIPS I
SLTU Set on Less Than Unsigned MIPS I
AND AND MIPS I
OR OR MIPS I
XOR Exclusive OR MIPS I
NOR NOR MIPS I
DADD Doubleword Add MIPS III
DADDU Doubleword Add Unsigned MIPS III
DSUB Doubleword Subtract MIPS III
DSUBU Doubleword Subtract Unsigned MIPS III
(Shift)
SLL Shift Left Logical MIPS I
SRL Shift Right Logical MIPS I
SRA Shift Right Arithmetic MIPS I
SLLV Shift Left Logical Variable MIPS I
SRLV Shift Right Logical Variable MIPS I
SRAV Shift Right Arithmetic Variable MIPS I
DSLL Doubleword Shift Left Logical MIPS III
DSRL Doubleword Shift Right Logical MIPS III
DSRA Doubleword Shift Right Arithmetic MIPS III
DSLLV Doubleword Shift Left Logical Variable MIPS III
DSRLV Doubleword Shift Right Logical Variable MIPS III
DSRAV Doubleword Shift Right Arithmetic Variable MIPS III
DSLL32 Doubleword Shift Left Logical +32 MIPS III
DSRL32 Doubleword Shift Right Logical +32 MIPS III
DSRA32 Doubleword Shift Right Arithmetic +32 MIPS III
(Multiply and Divide)
MULT Multiply MIPS I
MULTU Multiply Unsigned MIPS I
DIV Divide MIPS I
DIVU Divide Unsigned MIPS I
MFHI Move From HI MIPS I
MTHI Move To HI MIPS I
MFLO Move From LO MIPS I
MTLO Move To LO MIPS I
DMULT Doubleword Multiply MIPS III
DMULTU Doubleword Multiply Unsigned MIPS III
DDIV Doubleword Divide MIPS III
DDIVU Doubleword Divide Unsigned MIPS III
5.3.3 Jump and Branch Instructions (Table 5-3)
Jump and branch instructions change the control flow of a program. All jump and
branch instructions occur with a delay of one instruction: that is, the instruction
immediately following the jump or branch (known as the instruction in the delay
slot) always executes while the target instruction is being fetched from storage.
Branch-likely instructions are used for static branch prediction. With a Branch-likely
instruction, the instruction in the delay slot is executed only when the branch is taken;
it is nullified if the branch is not taken.
Table 5-3 CPU Instruction Set: Jump and Branch Instructions
Instruction Description Note
J Jump MIPS I
JAL Jump And Link MIPS I
JR Jump Register MIPS I
JALR Jump And Link Register MIPS I
BEQ Branch on Equal MIPS I
BNE Branch on Not Equal MIPS I
BLEZ Branch on Less Than or Equal to Zero MIPS I
BGTZ Branch on Greater Than Zero MIPS I
BLTZ Branch on Less Than Zero MIPS I
BGEZ Branch on Greater than or Equal to Zero MIPS I
BLTZAL Branch on Less Than Zero And Link MIPS I
BGEZAL Branch on Greater than or Equal to Zero And Link MIPS I
BEQL Branch on Equal Likely MIPS II
BNEL Branch on Not Equal Likely MIPS II
BLEZL Branch on Less Than or Equal to Zero Likely MIPS II
BGTZL Branch on Greater Than Zero Likely MIPS II
BLTZL Branch on Less Than Zero Likely MIPS II
BGEZL Branch on Greater Than or Equal to Zero Likely MIPS II
BLTZALL Branch on Less Than Zero And Link Likely MIPS II
BGEZALL Branch on Greater Than or Equal to Zero And Link Likely MIPS II
5.3.4 Special Instructions (Table 5-4)
Two special instructions are provided for software traps. Both use the R-type
instruction format.
Table 5-4 CPU Instruction Set: Special Instructions
Instruction Description Note
SYSCALL System Call MIPS I
BREAK Break MIPS I
5.3.5 Exception Instructions (Table 5-5)
These instructions (R-type or I-type) cause a branch to the general exception handling
vector based upon the result of a comparison.
Table 5-5 CPU Instruction Set: Exception Instructions
Instruction Description Note
TGE Trap if Greater Than or Equal MIPS II
TGEU Trap if Greater Than or Equal Unsigned MIPS II
TLT Trap if Less Than MIPS II
TLTU Trap if Less Than Unsigned MIPS II
TEQ Trap if Equal MIPS II
TNE Trap if Not Equal MIPS II
TGEI Trap if Greater Than or Equal Immediate MIPS II
TGEIU Trap if Greater Than or Equal Immediate Unsigned MIPS II
TLTI Trap if Less Than Immediate MIPS II
TLTIU Trap if Less Than Immediate Unsigned MIPS II
TEQI Trap if Equal Immediate MIPS II
TNEI Trap if Not Equal Immediate MIPS II
5.3.6 Coprocessor Instructions (Table 5-6)
Coprocessor instructions invoke coprocessor operations. The format of these
instructions depends on which coprocessor is used.
Table 5-6 CPU Instruction Set: Coprocessor Instructions
Instruction Description Note
LWCz Load Word to Coprocessor z (z = 1,2) MIPS I
SWCz Store Word from Coprocessor z (z = 1,2) MIPS I
MTCz Move To Coprocessor z (z = 1,2) MIPS I
MFCz Move From Coprocessor z (z = 1,2) MIPS I
CTCz Move Control To Coprocessor z (z = 1,2) MIPS I
CFCz Move Control From Coprocessor z (z = 1,2) MIPS I
COPz Coprocessor Operation z (z = 1,2) MIPS I
BCzT Branch on Coprocessor z True (z = 0,1,2) MIPS I
BCzF Branch on Coprocessor z False (z = 0,1,2) MIPS I
BCzTL Branch on Coprocessor z True Likely (z = 0,1,2) MIPS II
BCzFL Branch on Coprocessor z False Likely (z = 0,1,2) MIPS II
LDCz Load Double Coprocessor z (z = 1,2) MIPS III
SDCz Store Double Coprocessor z (z = 1,2) MIPS III
DMTCz Doubleword Move To Coprocessor z (z = 1,2) MIPS III
DMFCz Doubleword Move From Coprocessor z (z = 1,2) MIPS III
5.3.7 CP0 Instructions (Table 5-7)
Coprocessor 0 instructions are used for operations involving the system control
coprocessor (CP0) registers, processor memory management and exception handling.
Table 5-7 Instruction Set: CP0 Instructions
Instruction Description Note
MTC0 Move To CP0 MIPS I
MFC0 Move From CP0 MIPS I
DMTC0 Doubleword Move To CP0 MIPS III
DMFC0 Doubleword Move From CP0 MIPS III
TLBR Read Indexed TLB Entry
TLBWI Write Indexed TLB Entry
TLBWR Write Random TLB Entry
TLBP Probe TLB for Matching Entry
CACHE Cache MIPS III
ERET Exception Return MIPS III
WAIT Enter power management mode
5.3.8 Multiply and Divide Instructions (Table 5-8)
Table 5-8 Extensions to the ISA: Multiply and Divide Instructions
Instruction Description Note
MULT Multiply (3-operand)
MULTU Multiply Unsigned (3-operand)
DMULT Doubleword Multiply (3-operand)
DMULTU Doubleword Multiply Unsigned (3-operand)
MADD Multiply and ADD (3-operand)
MADDU Multiply and ADD Unsigned (3-operand)
5.3.9 Debug Instructions (Table 5-9)
Table 5-9 Extensions to the ISA: Debug Instructions
Instruction Description Note
CTC0 Move Control To Coprocessor 0
CFC0 Move Control From Coprocessor 0
SDBBP Software Debug Breakpoint
DERET Debug Exception Return
5.3.10 Other Instructions (Table 5-10)
Table 5-10 Other Instructions
Instruction Description Note
PREF Prefetch
5.4 Instruction Execution Cycles
Because the TX49 employs a high-speed Multiply and Add Calculator (MAC), multiply
instructions such as MULT, MULTU, DMULT and DMULTU execute faster. The TX49 also
improves the execution of divide instructions.
Instruction Latency (2op/3op) Repeat (2op/3op)
MULT 2/3 operand 4/4 1/3
MADD 2/3 operand 4/4 1/3
DMULT 2/3 operand 7/7 6/6
DIV 37 36
DDIV 69 68
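One way to read the table above is sketched below. The sketch assumes that "Latency" is the number of cycles until a result is available and that "Repeat" is the minimum interval between successive issues of the same instruction; that interpretation, the `TIMING` dictionary and the function name `back_to_back_cycles` are assumptions made for illustration, and only the DMULT, DIV and DDIV figures are taken from the table.

```python
# (latency, repeat) pairs copied from the table above
TIMING = {
    "DMULT": (7, 6),
    "DIV": (37, 36),
    "DDIV": (69, 68),
}

def back_to_back_cycles(name, count):
    """Estimate cycles until the last of `count` consecutive operations finishes.

    Assumes each operation can start `repeat` cycles after the previous one
    and delivers its result `latency` cycles after it starts.
    """
    if count < 1:
        return 0
    latency, repeat = TIMING[name]
    return latency + (count - 1) * repeat
```

Under these assumptions, three consecutive DIV instructions would take 37 + 2 x 36 = 109 cycles.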
5.5 Defining Access Types
Access type indicates the size of a TX49 processor data item to be loaded or stored, set by
the load or store instruction opcode. Access types are defined in Table A-3.
Regardless of access type or byte ordering (endianness), the address given specifies the low-
order byte in the addressed field. For a big-endian configuration, the low-order byte is the
most-significant byte; for a little-endian configuration, the low-order byte is the least-
significant byte.
The access type, together with the three low-order bits of the address, define the bytes
accessed within the addressed doubleword (shown in Figure 5-2). Only the combinations
shown in Figure 5-2 are permissible; other combinations cause address error exceptions. See
Appendix A for individual descriptions of CPU load and store instructions.
                Low-Order
Access Type     Address Bits    Bytes Accessed
Mnemonic                        Big Endian         Little Endian
(Value)         2 1 0           (63----31----0)    (63----31----0)
Doubleword (7)  0 0 0           0 1 2 3 4 5 6 7    7 6 5 4 3 2 1 0
Septibyte (6)   0 0 0           0 1 2 3 4 5 6        6 5 4 3 2 1 0
                0 0 1             1 2 3 4 5 6 7    7 6 5 4 3 2 1
Sextibyte (5)   0 0 0           0 1 2 3 4 5            5 4 3 2 1 0
                0 1 0               2 3 4 5 6 7    7 6 5 4 3 2
Quintibyte (4)  0 0 0           0 1 2 3 4                4 3 2 1 0
                0 1 1                 3 4 5 6 7    7 6 5 4 3
Word (3)        0 0 0           0 1 2 3                    3 2 1 0
                1 0 0                   4 5 6 7    7 6 5 4
Triplebyte (2)  0 0 0           0 1 2                        2 1 0
                0 0 1             1 2 3                    3 2 1
                1 0 0                   4 5 6        6 5 4
                1 0 1                     5 6 7    7 6 5
Halfword (1)    0 0 0           0 1                            1 0
                0 1 0               2 3                    3 2
                1 0 0                   4 5            5 4
                1 1 0                       6 7    7 6
Byte (0)        0 0 0           0                                0
                0 0 1             1                            1
                0 1 0               2                        2
                0 1 1                 3                    3
                1 0 0                   4                4
                1 0 1                     5            5
                1 1 0                       6        6
                1 1 1                         7    7
Figure 5-2 Byte Access within a Doubleword
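The permissible combinations in Figure 5-2 can be captured in a small lookup, sketched here in Python. The `PERMITTED` table and the function name `bytes_accessed` are hypothetical; the sketch reports byte numbers within the addressed doubleword and raises an error where the hardware would signal an address error exception.

```python
# Access type value -> permitted values of the three low-order address bits,
# transcribed from Figure 5-2.
PERMITTED = {
    7: {0},                # Doubleword
    6: {0, 1},             # Septibyte
    5: {0, 2},             # Sextibyte
    4: {0, 3},             # Quintibyte
    3: {0, 4},             # Word
    2: {0, 1, 4, 5},       # Triplebyte
    1: {0, 2, 4, 6},       # Halfword
    0: set(range(8)),      # Byte
}

def bytes_accessed(access_type, address):
    """Return the byte numbers touched within the addressed doubleword."""
    low = address & 0x7                      # three low-order address bits
    if low not in PERMITTED[access_type]:
        # Stands in for the address error exception described in the text.
        raise ValueError("combination not permitted by Figure 5-2")
    # An access of type N covers N + 1 consecutive bytes starting at `low`.
    return list(range(low, low + access_type + 1))
```

The same byte numbers apply in both byte orders; only their position within the 64-bit doubleword differs, as the two right-hand columns of the figure show.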
6. CPU Pipeline
6.1 Introduction
This chapter describes the operation of the TX49 pipeline. It explains the basic operation of
the pipeline and how the TX49 handles delay instructions; these are instructions
that follow a branch or load instruction in the pipeline. A later section explains interruptions
to the pipeline flow caused by interlocks and exceptions.
6.2 Basic Pipeline Operation
The TX49 executes instructions in an optimized 5 stage pipeline. Each pipeline stage is
executed in one clock cycle. When the pipeline is fully utilized, five instructions are executed
at the same time, resulting in an average instruction execution rate of one instruction per
cycle as illustrated in Figure 6-1.
One cycle = two phases (for example F1 F2)
F1 F2 D1 D2 E1 E2 M1 M2 W1 W2
      F1 F2 D1 D2 E1 E2 M1 M2 W1 W2
            F1 F2 D1 D2 E1 E2 M1 M2 W1 W2
                  F1 F2 D1 D2 E1 E2 M1 M2 W1 W2
                        F1 F2 D1 D2 E1 E2 M1 M2 W1 W2
F1 - Instruction Fetch, Phase one
F2 - Instruction Fetch, Phase two
D1 - Instruction Decode, Phase one
D2 - Instruction Decode, Phase two
E1 - Execution, Phase one
E2 - Execution, Phase two
M1 - Memory Access, Phase one
M2 - Memory Access, Phase two
W1 - Write Back, Phase one
W2 - Write Back, Phase two
Figure 6-1 Pipeline stages for executing TX49 instructions
F1, F2 : Instruction Fetch
During the F1 phase the ITLB begins the virtual-to-physical address
translation. During the F2 phase the instruction cache fetch and the
virtual-to-physical address translation are completed.
D1, D2 : Instruction Decode
The instruction is decoded. Contents of the general-purpose registers are read.
If the instruction involves a branch or jump, the target address is generated.
The coprocessor condition signal is latched.
E1, E2 : Execution
Arithmetic, logical and shift operations are performed. The execution of
multiply/divide instructions is begun.
For load and store instructions, the data virtual address is calculated, and
virtual-to-physical address translation is begun.
M1, M2 : Memory Access
The data cache is accessed in the case of load and store instructions.
W1, W2 : Write Back
The result is written to a general register.
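The overlap shown in Figure 6-1 can be modelled with a very small Python sketch. It assumes an ideal pipeline with no stalls, slips or exceptions, treats each stage as one whole cycle rather than two phases, and uses the hypothetical names `STAGES`, `pipeline_schedule` and `total_cycles`.

```python
STAGES = ["F", "D", "E", "M", "W"]   # the five pipeline stages, in order

def pipeline_schedule(count):
    """For each instruction, map stage name -> cycle number (starting at 1)."""
    return [
        {stage: index + position + 1 for position, stage in enumerate(STAGES)}
        for index in range(count)
    ]

def total_cycles(count):
    """Cycles needed to complete `count` instructions in an ideal pipeline."""
    return 0 if count == 0 else count + len(STAGES) - 1
```

In this idealized model, five instructions complete in nine cycles, and once the pipeline is full one instruction completes in every cycle, which is the "one instruction per cycle" rate mentioned in the text.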
6.3 TX49 Pipeline Activities
Stage F1 F2 D1 D2 E1 E2 M1 M2 W1 W2
Fetch ICD ICA RF
& Decode ITLBM ITLBR ITC IDEC
ALU ALU WB
Load/Store DVA DCAD DCAA DCLA
JTLB1 JTLB2
SA DTC WB
DCW
Jump/Branch BCMP
BAC IVA
ICD: Instruction cache address decode
ICA: Instruction cache array access
RF: Register fetch
ITLBM: Instruction address translation match
ITLBR: Instruction address translation read
ITC: Instruction tag match
IDEC: Instruction decode
ALU: ALU operation
WB: Write back to register file
DVA: Data virtual address calculation
DCAD: Data cache address decode
DCAA: Data cache array access
DCLA: Data cache load align
JTLB1: Address translation in JTLB stage1
JTLB2: Address translation in JTLB stage2
SA: Store align
DTC: Data cache tag check
DCW: Data cache write
BCMP: Branch compare
BAC: Branch address calculation
IVA: Generate instruction virtual address
6.4 Branch and Load Delay
Some TX49 instructions are executed with a delay of one instruction cycle. The cycle in
which an instruction is delayed is called a delay slot. A delay occurs with load instructions and
branch/jump instructions.
6.4.1 Delayed load
With load instructions, a one-cycle delay occurs while waiting for the data being loaded
to become available for use by another instruction. The TX49 checks the instruction in
the delay slot (the instruction immediately following the load instruction) to see if that
instruction needs to use the load result; if so, it stalls the pipeline (see Figure 6-2).
LW   r5, 0(r26)     F  D  E  M  W
ADDU r8, r7, r5        F  D  ES E  M  W
ES: Pipeline stall
Figure 6-2 CPU Pipeline Load Delay
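The load-delay check described in 6.4.1 can be sketched as a simple scan over an instruction list. The tuple representation `(mnemonic, destination, sources)`, the `LOADS` set and the function name `count_load_stalls` are all hypothetical; the sketch only counts the one-cycle stall for an instruction that immediately follows a load and reads its result.

```python
# Load mnemonics considered by this sketch (a subset of Table 5-1)
LOADS = {"LB", "LBU", "LH", "LHU", "LW", "LWU", "LD"}

def count_load_stalls(program):
    """Count delay-slot stalls in a list of (mnemonic, dest, sources) tuples.

    A stall is counted when the instruction right after a load uses the
    register that the load writes.
    """
    stalls = 0
    for previous, current in zip(program, program[1:]):
        prev_op, prev_dest, _ = previous
        _, _, cur_sources = current
        if prev_op in LOADS and prev_dest in cur_sources:
            stalls += 1
    return stalls
```

For the sequence in Figure 6-2, `count_load_stalls([("LW", 5, (26,)), ("ADDU", 8, (7, 5))])` returns 1. Placing an unrelated instruction between the two removes the stall in this model.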
6.4.2 Delayed branching
Figure 6-3 shows the pipeline flow for jump/branch instructions. The branch target
address that must be generated for these types of instructions does not become available
until the E stage, too late to be used by the instruction in the branch delay slot. The
branch target instruction is fetched immediately after the branch delay slot cycle.
It is, however, possible to fill the delay slot with an instruction that would normally be
executed prior to the branch instruction.
BEQ r1, r4, L1 F D E M W
Target addr
subu r3, r5, r6 (delay slot) F D E M W
L1:addiu r7, r7, 1 (target) F D E M W
Figure 6-3 CPU Pipeline Branch Delay
You can make effective use of the branch delay slot as follows.
Since the instruction immediately following a branch instruction is executed
before the branch takes effect, you can place an instruction that logically
should be executed just before the branch into the delay slot following the
branch instruction.
The TX49 provides Branch Likely instructions in addition to the normal Branch
instructions. If the branch condition of the Branch Likely instruction is met, the
instruction in the delay slot is executed and the branch is taken. If the branch is
not taken, the instruction in the delay slot is treated as a NOP.
Therefore, Branch-Likely instructions allow the processor to execute the
instruction immediately following the branch while the target instruction is being
fetched.
If no useful instruction can be placed in the delay slot, a NOP is placed just after the
branch instruction.
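The difference between normal and Branch Likely instructions in their treatment of the delay slot can be summarized in a short Python sketch. The function name `executed_sequence` and the string labels are hypothetical and exist only to make the four cases explicit.

```python
def executed_sequence(branch_likely, taken):
    """List what executes, in order, around a branch.

    branch_likely: True for a Branch Likely instruction, False for a normal branch.
    taken: True if the branch condition is met.
    """
    sequence = ["branch"]
    # A normal branch always executes its delay slot; a Branch Likely
    # instruction executes it only when the branch is taken.
    if taken or not branch_likely:
        sequence.append("delay slot")
    sequence.append("target" if taken else "fall-through")
    return sequence
```

For example, `executed_sequence(True, False)` returns `["branch", "fall-through"]`, reflecting that the delay slot instruction is nullified.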
6.5 Non-blocking Load Function
The non-blocking load function prevents the pipeline from stalling when a cache miss occurs
and a refill cycle is required to refill the data cache. Instructions after the load instruction
that do not use registers affected by the load will continue to be executed. An example is
shown in Figure 6-4. Here a cache miss occurs with the first load instruction. The two
instructions that follow are executed before the load completes. The fourth instruction (ADD)
uses a register that will be loaded by the load instruction, so the pipeline is stalled until the
cache data becomes valid.
LW r3, 0(r0) F D E M R R R R W
ADD r6, r4, r2 F D E M W r3
ADD r7, r5, r2 F D E M W
ADD r8, r9, r3 F D ES ES ES E M W
R: Refill cycle, ES: Stall in E stage
Figure 6-4 Non-blocking load function
6.6 Interlock and Exception Handling
6.6.1 Overview of Interlock and Exception Handling
Smooth pipeline flow is interrupted when cache misses or exceptions occur, or when
data dependencies are detected. Interruptions handled using hardware, such as cache
misses, are referred to as interlocks, while those that are handled using software are
called exceptions.
As shown in Figure 6-5, all interlock and exception conditions are collectively referred
to as faults.
Figure 6-5 Interlocks, Exceptions, and Faults
There are two types of interlocks:
stalls, which are resolved by halting the pipeline
slips, which require one part of the pipeline to advance while another part of the
pipeline is held static
In each cycle, every exception and interlock condition corresponds to a particular pipeline
stage, so a condition can be traced to the particular instruction in that stage, as shown
in Figure 6-6. For instance, an Illegal Instruction (II) exception is raised
in the execution (EX) stage.
Table 6-1 and Table 6-2 describe the pipeline interlocks and exceptions listed in Figure
6-6.
[Figure 6-5 labels: Faults divide into Exceptions (software) and Interlocks (hardware);
Interlocks divide into Stalls and Slips]
Pipeline Stage
State FDEMW
ITM ICM DCM
Stall CPE
LDI
MDSt
Slip FCBsy
ITLB IBE RI DBE
Cun NMI
BP Reset
SC OVF
DTLB Trap
DTMod
Exception
Intr
Figure 6-6 Correspondence of pipeline stage to interlock condition
Table 6-1 Pipeline Interlocks
Interlock Description
ITM Instruction TLB Miss
ICM Instruction Cache Miss
CPE Coprocessor Possible Exception
DCM Data Cache Miss
LDI Load Interlock
MDSt Multiply / Divide Start
FCBsy FP Coprocessor Busy
Table 6-2 Pipeline Exceptions
Exception Description
ITLB Instruction Translation or Address Exception
Intr External Interrupt
IBE Instruction Bus Error
RI Reserved Instruction
BP Breakpoint
SC System Call
Cun Coprocessor Unusable
OVF Integer Overflow
FPE FP Interrupt
ExTrap EX Stage Traps
DTLB Data Translation or Address Exception
TLBMod TLB Modified
DBE Data Bus Error
NMI Non-maskable Interrupt (or Soft Reset)
Reset Reset
6.6.2 Exception Conditions
When an exception condition occurs, the relevant instruction and all those that follow it
in the pipeline are cancelled. Accordingly, any stall conditions and any later exception
conditions that may have referenced this instruction are inhibited; there is no benefit in
servicing stalls for a cancelled instruction.
After instruction cancellation, a new instruction stream begins, starting execution at a
predefined exception vector. System Control Coprocessor registers are loaded with
information that identifies the type of exception and auxiliary information such as the
virtual address at which translation exceptions occur.
6.6.3 Stall Conditions
Often, a stall condition is only detected after parts of the pipeline have advanced using
incorrect data; this is called a pipeline overrun. When a stall condition is detected, all five
instructions, each in a different stage of the pipeline, are frozen at once. In this stalled
state, no pipeline stages can advance until the interlock condition is resolved. For
example, when a cache miss occurs, the processor must refill the cache before it restarts
the pipeline.
Once the interlock is removed, the restart sequence begins two cycles before the
pipeline resumes execution. The restart sequence reverses the pipeline overrun by
inserting the correct information into the pipeline.
6.6.4 External Stalls
External stalls are another class of interlock. An external stall originates outside the
processor and is not referenced to a particular pipeline stage. This interlock is not
affected by exceptions.
6.6.5 Interlock and Exception Timing
To prevent interlock and exception handling from adversely affecting the processor
cycle time, the TX49 processor uses both logic and circuit pipeline techniques to reduce
critical timing paths. Interlock and exception handling have the following effects on the
pipeline:
In some cases, the processor pipeline must be backed up (reversed and started
over again from a prior stage) to recover from interlocks.
In some cases, interlocks are serviced for instructions that will be aborted, due to
an exception.
6.7 Multiply and Multiply/Add Instructions (MULT, MULTU, MADD, MADDU)
The TX49 can execute 2-operand 32-bit multiply and multiply/add instructions
back to back, and can use the results in the HI/LO registers in immediately following
instructions without a pipeline stall, as shown in Figure 6-7. When a following instruction
uses a result written to a general-purpose register, three stall cycles are required, as shown
in Figure 6-8.
MULT/MADD r3, r4 F D E1 E2 E3 M W
MULT/MADD r6, r7, r8 F D E1 E2 E3 M W
Figure 6-7 MULT and MADD Instructions without data dependency
(32-bit and 2-operand)
MULT/MADD r3, r4, r5 F D E1 E2 E3 M W
MULT/MADD r6, r3, r8 F D ES ES ES E1 E2 E3 M W
Figure 6-8 MULT and MADD Instructions with data dependency
(32-bit and 3-operand)
6.8 Divide Instructions (DIV, DIVU)
Division starts from the pipeline E stage and takes 36 cycles.
Figure 6-9 shows an example of a divide instruction.
DIV/DIVU F D E M W
Division stages: V1 V2 V3 V4 ... V35 V36
Figure 6-9 DIV and DIVU Instructions
6.9 Streaming
During a cache refill operation, the TX49 can resume execution immediately after arrival of
the necessary data or instruction in the cache, even though the cache refill is not completed.
This is referred to as “streaming”.
7. System Control Coprocessor, CP0
7.1 Introduction
The TX49 has a System Control Co-Processor (CP0). CP0 translates virtual addresses to
physical addresses. CP0 manages exceptions and transitions between kernel, supervisor, and
user states. CP0 also controls the cache sub-system, as well as providing diagnostic control
and error recovery facilities.
7.2 CP0 Registers
This section describes the bit fields of each register. The “Cold Reset” column of each table
shows the value of each bit when the GCOLDRESET* signal is asserted. Reserved bits
must be written with their reset value, and return that same value when read.
7.2.1 Index register (Reg#0)
The Index register is a 32-bit read/write register containing six bits to index an entry in
the TLB. The P bit of the register shows the success/failure of a TLB Probe (TLBP)
instruction.
The Index register also specifies the TLB entry affected by TLB Read (TLBR) or TLB
Write Index (TLBWI) instructions. Figure 7-1 shows the format of the Index register and
Table 7-1 describes the Index register fields.
Bit:    31   30~6   5~0
Field:  P    0      Index
Figure 7-1 Index Register Format
Table 7-1 Index Register Field Descriptions
Bit | Field | Description | Cold Reset | Read/Write
31 | P | Probe failure. Set to 1 when the previous TLB Probe (TLBP) instruction was unsuccessful. | Undefined | Read/Write
30~6 | 0 | Reserved | 0x0 | Read
5~0 | Index | Index to the TLB entry affected by the TLB Read and TLB Write Index instructions. | Undefined | Read/Write
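The field layout in Table 7-1 can be decoded in software roughly as follows. This is a minimal sketch that works on a value already read from the Index register; the names `index_probe_failed` and `index_entry` are hypothetical helpers, not part of any TX49 library.

```c
#include <stdbool.h>
#include <stdint.h>

/* Field positions taken from Table 7-1. */
#define INDEX_P_BIT      (UINT32_C(1) << 31)  /* bit 31: probe failure */
#define INDEX_ENTRY_MASK UINT32_C(0x3F)       /* bits 5~0: TLB entry   */

/* Returns true when the previous TLBP found no matching entry. */
bool index_probe_failed(uint32_t index_reg)
{
    return (index_reg & INDEX_P_BIT) != 0;
}

/* Returns the TLB entry number held in bits 5~0. */
unsigned index_entry(uint32_t index_reg)
{
    return (unsigned)(index_reg & INDEX_ENTRY_MASK);
}
```

A handler would typically test the P bit first and only use the entry number when the probe succeeded.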
7.2.2 Random register (Reg#1)
The Random register is a read-only register containing six bits to index an entry in the
TLB. This register decrements as each instruction executes. Its value stays within the
following bounds:
A lower bound is set by the number of TLB entries reserved for exclusive use by
the operating system (the contents of the Wired register).
An upper bound is set by the total number of TLB entries (47 maximum).
The Random register specifies the TLB entry affected by the TLB Write Random (TLBWR)
instruction. The register does not need to be read for this purpose; however, it is readable
to verify proper operation of the processor.
To simplify testing, the Random register is set to the value of the upper bound upon
system reset. This register is also set to the upper bound when the Wired register is
written.
Figure 7-2 shows the format of the Random register and Table 7-2 describes the
Random register fields.
Bit:    31~6   5~0
Field:  0      Random
Figure 7-2 Random Register Format
Table 7-2 Random Register Field Descriptions
Bit | Field | Description | Cold Reset | Read/Write
31~6 | 0 | Reserved. | 0x0 | Read
5~0 | Random | TLB random index for TLBWR instruction. | Upper bound (47) | Read
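The bounds described above can be illustrated with a small software model of one decrement step. This is only a sketch: the text does not state what happens at the lower bound, so the wrap back to the upper bound is an assumption, and `random_next` is a hypothetical name.

```c
#include <stdint.h>

#define TLB_UPPER_BOUND 47u  /* highest TLB entry index, per the text above */

/* Software model of one decrement of the Random register.
 * Assumption: once the value reaches the Wired lower bound it wraps
 * back to the upper bound. Out-of-range inputs also restart at 47. */
uint32_t random_next(uint32_t random, uint32_t wired)
{
    if (random <= wired || random > TLB_UPPER_BOUND)
        return TLB_UPPER_BOUND;
    return random - 1u;
}
```

With Wired set to 8, the modelled value cycles through 47 down to 8, so entries 0 to 7 are never selected by a TLBWR.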
7.2.3 EntryLo0 register (Reg#2) and EntryLo1 register (Reg#3)
The EntryLo register consists of two registers that have identical formats:
EntryLo0 is used for even virtual pages
EntryLo1 is used for odd virtual pages
The EntryLo0 and EntryLo1 registers are read/write registers. These registers hold the
physical page frame number (PFN) of the TLB entry for even and odd pages, respectively,
when performing TLB read and write operations.
Figure 7-3 shows the format of the EntryLo0/EntryLo1 registers and Table 7-3 describes
the EntryLo0/EntryLo1 register fields.
Bit:    63~32   31~30   29~6   5~3   2   1   0
Field:  0       WCE     PFN    C     D   V   G
Figure 7-3 EntryLo0/EntryLo1 Register Format
Table 7-3 EntryLo0/EntryLo1 Register Field Descriptions
Bit | Field | Description | Cold Reset | Read/Write
63~32 | 0 | Reserved | 0x0 | Read
31~30 | WCE | Usable for Win-CE | 0x0 | Read/Write
29~6 | PFN | Page frame number. | Undefined | Read/Write
5~3 | C | Specifies the TLB page coherency attribute. 0: Cacheable, noncoherent, write-through, no-WA; 1: Cacheable, noncoherent, write-through, WA; 2: Uncached; 3: Cacheable, noncoherent, write-back, WA; 4~7: Reserved | 0x0 | Read/Write
2 | D | Dirty. If this bit is set, the page is marked as dirty and, therefore, writable. This bit is actually a write-protect bit that software can use to prevent alteration of data. | 0 | Read/Write
1 | V | Valid. If this bit is set, it indicates that the TLB entry is valid; otherwise, a TLBL or TLBS miss occurs. | 0 | Read/Write
0 | G | Global. If this bit is set in both EntryLo0 and EntryLo1, then the processor ignores the ASID during TLB lookup. | 0 | Read/Write
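As an illustration of Table 7-3, an EntryLo value can be assembled from its fields as sketched below. The function and enum names are hypothetical, the WCE field is left at zero, and the caller is assumed to supply an already computed page frame number.

```c
#include <stdbool.h>
#include <stdint.h>

/* Coherency attribute codes for the C field (bits 5~3), from Table 7-3. */
enum entrylo_c {
    C_WT_NO_WA = 0,  /* cacheable, write-through, no write-allocate */
    C_WT_WA    = 1,  /* cacheable, write-through, write-allocate    */
    C_UNCACHED = 2,
    C_WB_WA    = 3   /* cacheable, write-back, write-allocate       */
};

/* Packs PFN, C, D, V and G into the EntryLo0/EntryLo1 layout. */
uint64_t entrylo_make(uint32_t pfn, enum entrylo_c c,
                      bool dirty, bool valid, bool global)
{
    uint64_t v = 0;
    v |= ((uint64_t)pfn & 0xFFFFFFu) << 6;  /* PFN: bits 29~6 (24 bits) */
    v |= ((uint64_t)c & 0x7u) << 3;         /* C:   bits 5~3            */
    v |= (uint64_t)dirty << 2;              /* D:   bit 2               */
    v |= (uint64_t)valid << 1;              /* V:   bit 1               */
    v |= (uint64_t)global;                  /* G:   bit 0               */
    return v;
}
```

Because the G bit only takes effect when it is set in both registers, a caller would normally pass the same `global` value when building EntryLo0 and EntryLo1 for one TLB entry.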
7.2.4 Context register (Reg#4)
The Context register is a read/write register containing the pointer to an entry in the
page table entry (PTE) array. This array is an operating system data structure that
stores virtual to physical address translations. When there is a TLB miss, the CPU loads
the TLB with the missing translation from the PTE array. Normally, the operating
system uses the Context register to address the current page map which resides in the
kernel mapped segment, kseg3. Although the contents of this register duplicate some of the
information in the BadVAddr register, they are arranged in a form that is more useful to a
software TLB exception handler.
Figure 7-4 shows the formats of the Context register and Table 7-4 describes the
Context register fields.
Bit:    31~23     22~4      3~0
Field:  PTEBase   BadVPN2   0
(32-bit mode)
Bit:    63~23     22~4      3~0
Field:  PTEBase   BadVPN2   0
(64-bit mode)
Figure 7-4 Context Register Formats
Table 7-4 Context Register Field Descriptions
32-bit mode
Bit | Field | Description | Cold Reset | Read/Write
31~23 | PTEBase | Page table entry base pointer. This field is for use by the operating system. It is normally written with a value that allows the operating system to use the Context register as a pointer into the current PTE array in memory. | Undefined | Read/Write
22~4 | BadVPN2 | Bad virtual address bits 31~13. This field is written by hardware on a miss. It contains the virtual page number (VPN) of the most recent virtual address that did not have a valid translation. | Undefined | Read
3~0 | 0 | Reserved | 0x0 | Read
64-bit mode
Bit | Field | Description | Cold Reset | Read/Write
63~23 | PTEBase | Page table entry base pointer | Undefined | Read/Write
22~4 | BadVPN2 | Bad virtual address bits 31~13 | Undefined | Read
3~0 | 0 | Reserved | 0x0 | Read
The 19-bit BadVPN2 field contains bits 31 to 13 of the virtual address that caused the
TLB miss; bit 12 is excluded because a single TLB entry maps to an even-odd page pair.
For a 4-Kbyte page size, this format can directly address the pair-table of 8-byte PTEs.
For other page sizes and PTE sizes, shifting and masking this value produces the
appropriate address.
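The relationship between a faulting address and the 32-bit mode Context value can be sketched as follows. This models what the text describes rather than any hardware interface: `context_value` is a hypothetical helper, and `pte_base` stands for whatever the operating system wrote to the PTEBase field.

```c
#include <stdint.h>

#define CONTEXT_PTEBASE_MASK UINT32_C(0xFF800000)  /* bits 31~23 */
#define CONTEXT_BADVPN2_MASK UINT32_C(0x007FFFF0)  /* bits 22~4  */

/* Models the 32-bit mode Context register contents after a TLB miss. */
uint32_t context_value(uint32_t pte_base, uint32_t bad_vaddr)
{
    /* Virtual address bits 31~13 land in bits 22~4: a right shift by 9. */
    uint32_t badvpn2 = (bad_vaddr >> 9) & CONTEXT_BADVPN2_MASK;
    return (pte_base & CONTEXT_PTEBASE_MASK) | badvpn2;
}
```

Each step of BadVPN2 advances the value by 16 bytes, which is the size of one even-odd pair of 8-byte PTEs, so both pages of a pair yield the same Context value.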
7.2.5 PageMask Register (Reg#5)
The PageMask register is a read/write register used for reading from/writing to the
TLB. This register holds a comparison mask that sets the variable page size for each TLB
entry.
TLB read and write operations use this register as either a source or a destination.
When virtual addresses are presented for translation into physical addresses, the
corresponding bits in the TLB identify which virtual address bits among bits 24~13 are
used in the comparison. When the Mask field is not one of the values shown in Table 7-5,
the operation of the TLB is undefined.
Figure 7-5 shows the format of the PageMask register and Table 7-5 describes the
PageMask register fields.
Bit:    31~25   24~13   12~0
Field:  0       MASK    0
Figure 7-5 PageMask Register Format
Table 7-5 PageMask Register Field Descriptions
Bit | Field | Description | Cold Reset | Read/Write
31~25 | 0 | Reserved | 0x0 | Read
24~13 | MASK | Page comparison mask. 000000000000: page size = 4 Kbytes; 000000000011: page size = 16 Kbytes; 000000001111: page size = 64 Kbytes; 000000111111: page size = 256 Kbytes; 000011111111: page size = 1 Mbyte; 001111111111: page size = 4 Mbytes; 111111111111: page size = 16 Mbytes | 0x0 | Read/Write
12~0 | 0 | Reserved | 0x0 | Read
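The seven MASK encodings in Table 7-5 follow a simple pattern: the page size is (MASK + 1) x 4 Kbytes. The sketch below uses that pattern and rejects any other encoding, since the TLB operation is undefined for them. The function name `pagemask_to_page_size` is hypothetical.

```c
#include <stdint.h>

/* Returns the page size in bytes for a PageMask register value, or 0
 * when the MASK field is not one of the seven values in Table 7-5. */
uint32_t pagemask_to_page_size(uint32_t pagemask_reg)
{
    uint32_t mask = (pagemask_reg >> 13) & 0xFFFu;  /* MASK: bits 24~13 */

    switch (mask) {
    case 0x000: case 0x003: case 0x00F: case 0x03F:
    case 0x0FF: case 0x3FF: case 0xFFF:
        return (mask + 1u) * 4096u;
    default:
        return 0;  /* not listed in Table 7-5 */
    }
}
```

A caller could use the zero return value as a sanity check before writing a PageMask value into the TLB.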
7.2.6 Wired Register (Reg#6)
The Wired register is a read/write register that specifies the boundary between the wired
and random entries of the TLB as follows. Wired entries are non-replaceable entries,
which cannot be overwritten by a TLB Write Random operation. Random entries can be
overwritten.
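[Figure: TLB entries 0 to 47. The Wired register marks the boundary: the entries below it
form the range of wired entries, and the entries from it up to 47 form the range of random
entries.]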
The Wired register is set to 0 upon system reset. Writing this register also sets the
Random register to the value of its upper bound. Figure 7-6 shows the format of the
Wired register and Table 7-6 describes the Wired register fields.
Bit:    31~6   5~0
Field:  0      Wired
Figure 7-6 Wired Register
Table 7-6 Wired Register Field Descriptions
Bit | Field | Description | Cold Reset | Read/Write
31~6 | 0 | Reserved (must be written as zeroes, and returns zeroes when read). | 0x0 | Read
5~0 | Wired | TLB Wired boundary. | 0x0 | Read/Write
7.2.7 BadVAddr Register (Reg#8)
The Bad Virtual Address (BadVAddr) register is a read-only register that holds the
most recent virtual address that caused one of the following exceptions: Address Error,
TLB Invalid, TLB Modified, or TLB Refill.
The processor does not write to this register when the EXL bit in the Status register is
set to 1. Figure 7-7 shows the formats of the BadVAddr register and Table 7-7 describes
the BadVAddr register fields.
31 0
Bad Virtual Address
(32-bit mode)
63 0
Bad Virtual Address
(64-bit mode)
Figure 7-7 BadVAddr Register Formats
Table 7-7 BadVAddr Register Field Descriptions
32-bit mode
Bit | Field | Description | Cold Reset | Read/Write
31~0 | BadVAddr | Bad virtual address | Undefined | Read
64-bit mode
Bit | Field | Description | Cold Reset | Read/Write
63~0 | BadVAddr | Bad virtual address | Undefined | Read
7.2.8 Count Register (Reg#9)
The Count register is a read/write register. This register acts as a timer, incrementing
at a constant rate (half the CPUCLK rate) whether or not an instruction is executed or
retired, or any forward progress is made through the pipeline.
This register can also be written for diagnostic purposes or system initialization. Figure
7-8 shows the format of the Count register and Table 7-8 describes the Count register
field.
31 0
Count
Figure 7-8 Count Register Format
Table 7-8 Count Register Field Description
Bit | Field | Description | Cold Reset | Read/Write
31~0 | Count | 32-bit timer, incrementing at half the maximum instruction issue rate (CPUCLK). | 0x0 | Read/Write
7.2.9 EntryHi Register (Reg#10)
The EntryHi register is a read/write register that holds the high-order bits of a TLB entry
for TLB read and write operations. This register is accessed by the TLB Probe (TLBP), TLB
Write Random (TLBWR), TLB Write Indexed (TLBWI), and TLB Read Indexed (TLBR)
instructions.
When a TLB refill, TLB invalid, or TLB modified exception occurs, this register is
loaded with the virtual page number (VPN2) and the ASID of the virtual address that did
not have a matching TLB entry. Figure 7-9 shows the formats of the EntryHi register
and Table 7-9 describes the EntryHi register fields.
Bit:    31~13   12~8   7~0
Field:  VPN2    0      ASID
(32-bit mode)
Bit:    63~62   61~40   39~13   12~8   7~0
Field:  R       Fill    VPN2    0      ASID
(64-bit mode)
Figure 7-9 EntryHi Register Formats
Table 7-9 EntryHi Register Field Descriptions
32-bit mode
Bit | Field | Description | Cold Reset | Read/Write
31~13 | VPN2 | Virtual page number divided by two | Undefined | Read/Write
12~8 | 0 | Reserved | 0x0 | Read
7~0 | ASID | Address space ID field. An 8-bit field that lets multiple processes share the TLB; each process has a distinct mapping of otherwise identical virtual page numbers. | Undefined | Read/Write
64-bit mode
Bit | Field | Description | Cold Reset | Read/Write
63~62 | R | Region. Used to match vAddr63 and vAddr62. 00: user, 01: supervisor, 11: kernel | Undefined | Read/Write
61~40 | Fill | Reserved. 0 on read. Ignored on write. | Undefined | Read
39~13 | VPN2 | Virtual page number divided by two | Undefined | Read/Write
12~8 | 0 | Reserved | 0x0 | Read
7~0 | ASID | Address space ID field. | Undefined | Read/Write
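For the 32-bit mode layout in Table 7-9, an EntryHi value for a given virtual address and ASID can be formed as sketched below. `entryhi32_make` is a hypothetical helper name, and the sketch only builds the value; writing it to CP0 is outside its scope.

```c
#include <stdint.h>

/* Builds a 32-bit mode EntryHi value: VPN2 in bits 31~13, ASID in bits 7~0. */
uint32_t entryhi32_make(uint32_t vaddr, uint8_t asid)
{
    /* Keeping address bits 31~13 drops bit 12, so the even and odd
     * pages of a pair share one VPN2. Bits 12~8 stay zero (reserved). */
    return (vaddr & UINT32_C(0xFFFFE000)) | asid;
}
```

The same value serves as the key for a TLBP lookup and as the high half of an entry written with TLBWI or TLBWR.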
7.2.10 Compare Register (Reg#11)
The Compare register acts as a timer. When the value of the Count register equals the
value of the Compare register, interrupt bit IP(7) in the Cause register is set. This
causes an interrupt exception as soon as the interrupt is enabled. Writing a value to this
register, as a side effect, clears the timer interrupt.
For diagnostic purposes, this register is a read/write register. However, in normal
operation this register is write-only. Figure 7-10 shows the format of the Compare
register and Table 7-10 describes the Compare register field.
31 0
Compare
Figure 7-10 Compare Register Format
Table 7-10 Compare Register Field Description
Bit | Field | Description | Cold Reset | Read/Write
31~0 | Compare | Acts as a timer; it maintains a stable value that does not change on its own. | 0x0 | Read/Write
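Because Count advances at half the CPUCLK rate, the Compare value for a timer interrupt a given time in the future can be computed as in this sketch. The helper name `compare_after_us` and its parameters are hypothetical; the CPUCLK frequency must come from the system design.

```c
#include <stdint.h>

/* Returns the Compare value that matches Count roughly delay_us
 * microseconds after count_now, for a given CPUCLK frequency in Hz. */
uint32_t compare_after_us(uint32_t count_now, uint32_t cpuclk_hz,
                          uint32_t delay_us)
{
    /* Count ticks at CPUCLK / 2; use 64-bit math to avoid overflow. */
    uint64_t ticks = ((uint64_t)(cpuclk_hz / 2u) * delay_us) / 1000000u;

    /* Truncation to 32 bits mirrors the wraparound of the counter. */
    return (uint32_t)(count_now + ticks);
}
```

For example, with a 200 MHz CPUCLK a 10 microsecond delay corresponds to 1000 Count ticks. Writing the result to Compare also clears a pending timer interrupt, as noted above.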
7.2.11 Status Register (Reg#12)
The Status register is a read/write register that contains the operating mode, interrupt
enabling, and diagnostic states of the processor. The more important Status register
fields are as follows:
The Interrupt Mask (IM) field of 8 bits controls the enabling of eight interrupt
conditions. Interrupts must be enabled before they can be taken; an interrupt is taken
only when the corresponding bits are set in both the IM field of this register and the
Interrupt Pending field of the Cause register.
The Coprocessor Usability (CU) field of 4 bits controls the usability of four
possible coprocessors. Regardless of the CU0 bit setting, CP0 is always usable in
Kernel mode.
The Diagnostic Status (DS) field of 9 bits is used for self-testing, and checks the
cache and virtual memory system.
The Reverse Endian (RE) bit reverses the endianness. The processor can be
configured as either little-endian or big-endian at reset; the endianness selected at
reset is used in Kernel and Supervisor modes, and in User mode when the RE bit is 0.
Setting the RE bit to 1 inverts the User mode endianness.
Figure 7-11 shows the format of the Status register and Table 7-11 describes the Status
register fields.
Bit:    31~28   27   26   25   24~16   15~8   7    6    5    4~3   2     1     0
Field:  CU      0    FR   RE   DS      IM     KX   SX   UX   KSU   ERL   EXL   IE
DS field (bits 24~16):
Bit:    24~23   22    21   20   19   18   17   16
Field:  0       BEV   0    SR   0    CH   0    0
Figure 7-11 Status Register Format
Table 7-11 Status Register Field Descriptions
Bit | Field | Description | Cold Reset | Read/Write
31~28 | CU (3,2,1,0) | Controls the usability of each of the four coprocessor unit numbers. CP0 is always usable when in Kernel mode, regardless of the setting of the CU0 bit. 0: unusable, 1: usable | 0000 | Read/Write
27 | 0 | Reserved | 0 | Read
26 | FR | Enables additional floating-point registers. 0: 16 registers, 1: 32 registers | 0 | Read/Write
25 | RE | Reverse-Endian bit, valid in User mode. | 0 | Read/Write
24~23 | 0 | Reserved | 0x0 | Read
22 | BEV | Controls the location of TLB refill and general exception vectors. 0: normal, 1: bootstrap | 1 | Read/Write
21 | 0 | Reserved | 0 | Read
20 | SR | 1: Indicates a soft reset or NMI has occurred. | 0 | Read/Write
19 | 0 | Reserved | 0 | Read
18 | CH | “Hit” or “miss” indication for the last CACHE Hit Invalidate, Hit Write Back Invalidate, or Hit Write Back operation for a primary cache. 0: miss, 1: hit. | 0 | Read/Write
17~16 | 0 | Reserved | 0x0 | Read
15~8 | IM | Interrupt Mask. Controls the enabling of each of the external, internal and software interrupts. An interrupt is taken if interrupts are enabled, and the corresponding bits are set in both the IM field of the Status register and the IP field of the Cause register. 0: disabled, 1: enabled | 0x0 | Read/Write
7 | KX | Enables 64-bit addressing in Kernel mode. The extended-addressing TLB refill exception is used for TLB misses on kernel addresses. 0: 32-bit, 1: 64-bit | 0 | Read/Write
6 | SX | Enables 64-bit addressing and operations in Supervisor mode. The extended-addressing TLB refill exception is used for TLB misses on supervisor addresses. 0: 32-bit, 1: 64-bit | 0 | Read/Write
5 | UX | Enables 64-bit addressing and operations in User mode. The extended-addressing TLB refill exception is used for TLB misses on user addresses. 0: 32-bit, 1: 64-bit | 0 | Read/Write
4~3 | KSU | Mode. 10: user, 01: supervisor, 00: kernel. | 0x0 | Read/Write
2 | ERL | Error Level. 0: normal, 1: error. | 1 | Read/Write
1 | EXL | Exception Level. 0: normal, 1: exception. | 0 | Read/Write
0 | IE | Interrupt Enable. 0: disable, 1: enable. | 0 | Read/Write
Status Register Modes and Access States
Fields of the Status register set the modes and access states described in the sections
that follow.
• Interrupt Enable: Interrupts are enabled when all of the following conditions are met:
IE = 1
EXL = 0
ERL = 0
If these conditions are met, the settings of the IM bits enable the interrupt.
• Operation Modes: The following CPU Status register bit settings are required for
User, Kernel and Supervisor modes (see Section 8.3, Operation Modes, for more
information about operating modes).
The processor is in User mode when KSU = 10 (binary), EXL = 0, and ERL = 0.
The processor is in Supervisor mode when KSU = 01 (binary), EXL = 0, and ERL = 0.
The processor is in Kernel mode when KSU = 00 (binary), or EXL = 1, or ERL = 1.
• 32- and 64-bit Modes: The following CPU Status register settings select 32- or 64-bit
operation for User, Kernel, and Supervisor operating modes. Enabling 64-bit
operation permits the execution of 64-bit opcodes and translation of 64-bit addresses.
64-bit operation for User, Kernel and Supervisor modes can be set independently.
64-bit addressing for Kernel mode is enabled when KX = 1. 64-bit operations are
always valid in Kernel mode.
64-bit addressing and operations are enabled for Supervisor mode when SX = 1.
64-bit addressing and operations are enabled for User mode when UX = 1.
• Kernel Address Space Accesses: Access to the kernel address space is allowed when
the processor is in Kernel mode.
• Supervisor Address Space Accesses: Access to the supervisor address space is allowed
when the processor is in Kernel or Supervisor mode, as described above under
Operation Modes.
• User Address Space Accesses: Access to the user address space is allowed in any of the
three operating modes.
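The interrupt-enable and operating-mode conditions listed above can be expressed as a short decoding sketch. The macro, enum and function names are hypothetical. KSU = 11 is not listed in Table 7-11, so the sketch reports it as undefined rather than guessing.

```c
#include <stdbool.h>
#include <stdint.h>

#define ST_IE     (1u << 0)
#define ST_EXL    (1u << 1)
#define ST_ERL    (1u << 2)
#define ST_KSU(s) (((s) >> 3) & 0x3u)  /* bits 4~3 */

enum cpu_mode { MODE_KERNEL, MODE_SUPERVISOR, MODE_USER, MODE_UNDEFINED };

/* Operating mode implied by a Status register value. */
enum cpu_mode status_mode(uint32_t status)
{
    if (status & (ST_EXL | ST_ERL))
        return MODE_KERNEL;             /* EXL or ERL forces Kernel mode */
    switch (ST_KSU(status)) {
    case 0:  return MODE_KERNEL;
    case 1:  return MODE_SUPERVISOR;
    case 2:  return MODE_USER;
    default: return MODE_UNDEFINED;     /* KSU = 11 is not listed */
    }
}

/* True when IE = 1, EXL = 0 and ERL = 0. */
bool status_interrupts_enabled(uint32_t status)
{
    return (status & (ST_IE | ST_EXL | ST_ERL)) == ST_IE;
}
```

Note that a Status value with KSU = 10 still decodes as Kernel mode while EXL or ERL is set, which is the state inside an exception handler entered from User mode.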
Status Register Reset
The contents of the Status register are undefined at reset, except for the following bits
in the Diagnostic Status field:
ERL and BEV = 1
The SR bit distinguishes between the Reset exception and the Soft Reset exception
(caused by Nonmaskable Interrupt [NMI]).
7.2.12 Cause Register (Reg#13)
The Cause register holds the cause of the most recent exception. This register is read-
only, except for the IP[1~0] bits. Figure 7-12 shows the format of the Cause register and
Table 7-12 describes the Cause register field.
Bit:    31   30   29~28   27~16   15~8   7   6~2       1~0
Field:  BD   0    CE      0       IP     0   ExcCode   0
Figure 7-12 Cause Register Format
Table 7-12 Cause Register Field Descriptions
Bit | Field | Description | Cold Reset | Read/Write
31 | BD | Indicates whether or not the last exception was taken while executing in a branch delay slot. 0: normal, 1: delay slot. | 0 | Read
30 | 0 | Reserved | 0 | Read
29~28 | CE | Indicates the coprocessor unit number referenced when a coprocessor unusable exception is taken. 00: coprocessor 0, 01: coprocessor 1, 10: coprocessor 2, 11: coprocessor 3. | 0x0 | Read
27~16 | 0 | Reserved | 0x0 | Read
15~10 | IP[7~2] | Indicates whether an interrupt is pending. 0: not pending, 1: pending. | INT[5:0] | Read
9~8 | IP[1~0] | Software interrupts. 0: reset, 1: set. | 0x0 | Read/Write
7 | 0 | Reserved | 0 | Read
6~2 | ExcCode | Exception Code field. 0: Int: Interrupt; 1: Mod: TLB modification exception; 2: TLBL: TLB exception (load or instruction fetch); 3: TLBS: TLB exception (store); 4: AdEL: Address error exception (load or instruction fetch); 5: AdES: Address error exception (store); 6: IBE: Bus error exception (instruction fetch); 7: DBE: Bus error exception (data reference: load or store); 8: Sys: Syscall exception; 9: Bp: Breakpoint exception; 10: RI: Reserved instruction exception; 11: CpU: Coprocessor Unusable exception; 12: Ov: Arithmetic Overflow exception; 13: Tr: Trap exception; 14: Reserved; 15: FPE: Floating-Point exception; 16-31: Reserved | 0x0 | Read
1~0 | 0 | Reserved | 0x0 | Read
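A general exception handler usually starts by decoding the Cause register. The sketch below follows the field positions in Table 7-12; all function names are hypothetical, and the mnemonic strings are simply the abbreviations from the ExcCode list.

```c
#include <stdbool.h>
#include <stdint.h>

/* ExcCode: bits 6~2. */
unsigned cause_exccode(uint32_t cause) { return (cause >> 2) & 0x1Fu; }

/* BD: bit 31, set when the exception was taken in a branch delay slot. */
bool cause_in_delay_slot(uint32_t cause) { return (cause >> 31) != 0; }

/* CE: bits 29~28, coprocessor number for a CpU exception. */
unsigned cause_cop_number(uint32_t cause) { return (cause >> 28) & 0x3u; }

/* Interrupts that are both pending (Cause IP) and enabled (Status IM). */
unsigned pending_unmasked_irqs(uint32_t status, uint32_t cause)
{
    return ((status >> 8) & (cause >> 8)) & 0xFFu;
}

/* Mnemonic for an ExcCode value, as listed in Table 7-12. */
const char *exccode_name(unsigned code)
{
    static const char *const names[16] = {
        "Int", "Mod", "TLBL", "TLBS", "AdEL", "AdES", "IBE", "DBE",
        "Sys", "Bp", "RI", "CpU", "Ov", "Tr", "Reserved", "FPE"
    };
    return code < 16 ? names[code] : "Reserved";
}
```

The result of `pending_unmasked_irqs` only matters when interrupts are globally enabled (IE = 1, EXL = 0, ERL = 0).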
7.2.13 EPC Register (Reg#14)
The Exception Program Counter (EPC) register is a read/write register. This register
contains the address at which processing resumes after an exception has been serviced.
For synchronous exceptions, this register contains either:
the virtual address of the instruction that was the direct cause of the exception, or
the virtual address of the immediately preceding branch or jump instruction
(when the instruction is in a branch delay slot, and the Branch Delay bit in the
Cause register is set).
The processor does not write to the EPC register when the EXL bit in the Status register is
set to 1. Figure 7-13 shows the formats of the EPC register and Table 7-13 describes the
EPC register field.
31 0
EPC
(32-bit mode)
63 0
EPC
(64-bit mode)
Figure 7-13 EPC Register Formats
Table 7-13 EPC Register Field Description
32-bit mode
Bit | Field | Description | Cold Reset | Read/Write
31~0 | EPC | Exception program counter | Undefined | Read/Write
64-bit mode
Bit | Field | Description | Cold Reset | Read/Write
63~0 | EPC | Exception program counter | Undefined | Read/Write
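When the BD bit is set, EPC points at the branch rather than at the instruction that caused the exception. A handler that needs the address of the offending instruction can recover it as sketched below. The helper name is hypothetical, and the sketch relies on the fixed 4-byte MIPS instruction size.

```c
#include <stdint.h>

/* Address of the instruction that actually caused a synchronous
 * exception, given the EPC and Cause register values. */
uint64_t exception_victim_address(uint64_t epc, uint32_t cause)
{
    /* BD (Cause bit 31) set: EPC holds the branch, and the delay-slot
     * instruction sits 4 bytes after it. */
    return (cause >> 31) ? epc + 4u : epc;
}
```

Execution still has to resume at EPC itself in the delay-slot case, so that the branch is re-executed together with its delay slot.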
7.2.14 PRId Register (Reg#15)
The Processor Revision Identifier (PRId) register is a read-only register. This register
contains information identifying the implementation and revision level of the CPU and
CP0. Figure 7-14 shows the format of the PRId register and Table 7-14 describes the
PRId register fields.
Bit:    31~16   15~8   7~0
Field:  0       Imp    Rev
Figure 7-14 PRId Register Format
Table 7-14 PRId Register Field Descriptions
Bit | Field | Description | Cold Reset | Read/Write
31~16 | 0 | Reserved | 0x0 | Read
15~8 | Imp | Implementation number. 0x2d means “TX49 family”. | 0x2d | Read
7~0 | Rev | Revision number | +.+ | Read
+ Value is shown in the product sheet.
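Software that must run on several processor families can check the Imp field before relying on TX49-specific behavior. A minimal sketch with hypothetical helper names:

```c
#include <stdbool.h>
#include <stdint.h>

#define PRID_IMP_TX49 0x2Du  /* implementation number from Table 7-14 */

/* Imp: bits 15~8. */
unsigned prid_imp(uint32_t prid) { return (prid >> 8) & 0xFFu; }

/* Rev: bits 7~0. The value itself is given in the product sheet. */
unsigned prid_rev(uint32_t prid) { return prid & 0xFFu; }

bool prid_is_tx49(uint32_t prid)
{
    return prid_imp(prid) == PRID_IMP_TX49;
}
```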
7.2.15 Config Register (Reg#16)
The Config register is a read-only register, except for the HALT, ICE#, DCE# and K0 fields.
This register specifies various configuration options selected on the TX49.
The EC, BE, IC, DC, IB and DB fields are set by the hardware during reset and are included
in this register as read-only status bits for the software to access. Figure 7-15 shows the
format of the Config register and Table 7-15 describes the Config register fields.
Bit:    31   30~28   27~24   23~19   18     17     16     15   14~13   12   11~9   8~6   5    4    3   2~0
Field:  0    EC      0       0       HALT   ICE#   DCE#   BE   1       0    IC     DC    IB   DB   0   K0
Figure 7-15 Config Register Format
Table 7-15 Config Register Field Descriptions
Bit | Field | Description | Cold Reset | Read/Write
31 | 0 | Reserved | 0 | Read
30~28 | EC | GBUS clock rate. 0: processor clock frequency divided by 2; 1: processor clock frequency divided by 3; 2: processor clock frequency divided by 4; 7: processor clock frequency divided by 2.5; 3, 4, 5, 6: reserved | pin | Read
27 | 0 | Reserved | pin | Read/Write
26~24 | 0 | Reserved | pin | Read
23~19 | 0 | Reserved | 0 | Read
18 | HALT | Wait mode. 0: Halt, 1: Doze. Indicates the power-down behavior of the TX49 when the WAIT instruction is executed. The TX49 stalls the pipeline in both Halt and Doze modes. Cache snoops are possible during Doze mode but not during Halt mode. Halt mode reduces power consumption to a greater extent than Doze mode. | 0 | Read/Write
17 | ICE# | Instruction Cache Enable. 0: Instruction cache enabled, 1: Instruction cache disabled | 0 | Read/Write
16 | DCE# | Data Cache Enable. 0: Data cache enabled, 1: Data cache disabled | 0 | Read/Write
15 | BE | Big Endian. 0: Little Endian, 1: Big Endian | pin | Read
14~13 | 1 | Reserved | 11 | Read
12 | 0 | Reserved | 0 | Read
11~9 | IC | Instruction cache size. In the TX49, this is set to 8 KB (001), 16 KB (010) or 32 KB (011). | 001, 010 or 011 | Read
8~6 | DC | Data cache size. In the TX49, this is set to 8 KB (001), 16 KB (010) or 32 KB (011). | 001, 010 or 011 | Read
5 | IB | Primary I-cache line size. 1: 32 bytes (8 words) | 1 | Read
4 | DB | Primary D-cache line size. 1: 32 bytes (8 words) | 1 | Read
3 | 0 | Reserved | 0 | Read
2~0 | K0 | kseg0 coherency algorithm. 0: Cacheable, noncoherent, write-through, no-WA; 1: Cacheable, noncoherent, write-through, WA; 2: Uncached; 3: Cacheable, noncoherent, write-back, WA; 4~7: Reserved | 0x0 | Read/Write
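Initialization code often reads the Config register to size its cache loops. The sketch below decodes a few fields using the encodings in Table 7-15. The helper names are hypothetical, and only the three size codes listed for the TX49 are accepted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Maps an IC/DC size code to bytes; 0 means "not listed for the TX49". */
uint32_t cache_size_bytes(unsigned code)
{
    switch (code) {
    case 1:  return 8u * 1024u;
    case 2:  return 16u * 1024u;
    case 3:  return 32u * 1024u;
    default: return 0;
    }
}

/* IC: bits 11~9. */
uint32_t config_icache_bytes(uint32_t config)
{
    return cache_size_bytes((config >> 9) & 0x7u);
}

/* DC: bits 8~6. */
uint32_t config_dcache_bytes(uint32_t config)
{
    return cache_size_bytes((config >> 6) & 0x7u);
}

/* BE: bit 15. */
bool config_big_endian(uint32_t config) { return ((config >> 15) & 1u) != 0; }

/* ICE# (bit 17) is an active-low enable: 0 means the cache is enabled. */
bool config_icache_enabled(uint32_t config)
{
    return ((config >> 17) & 1u) == 0;
}
```

The `#` suffix on ICE# and DCE# marks the inverted sense, which is easy to get wrong when enabling or disabling a cache.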
7.2.16 LLAddr Register (Reg#17)
The Load Linked Address (LLAddr) register is a read/write register, and contains the
physical address read by the most recent Load Linked (LL/LLD) instruction. This register
is for diagnostic purposes only, and serves no function during normal operation. Figure
7-16 shows the format of the LLAddr register and Table 7-16 describes the LLAddr
register field.
31 0
pAddr (35~4)
Figure 7-16 LLAddr Register Format
Table 7-16 LLAddr Register Field Description
Bit | Field | Description | Cold Reset | Read/Write
31~0 | pAddr | Physical address bits 35~4 | 0x0 | Read/Write
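Since the register stores physical address bits 35~4, a diagnostic routine has to shift the value back to obtain an address. A minimal sketch, with a hypothetical helper name:

```c
#include <stdint.h>

/* Reconstructs the physical address recorded by the last LL/LLD.
 * Bits 3~0 are not stored, so the result is 16-byte aligned. */
uint64_t lladdr_to_paddr(uint32_t lladdr_reg)
{
    return (uint64_t)lladdr_reg << 4;
}
```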
7.2.17 XContext Register (Reg#20)
The XContext register is a read/write register, and contains a pointer to an entry in the
page table entry (PTE) array, an operating system data structure that stores virtual to
physical address translations. When there is a TLB miss, the operating system software
loads the TLB with the missing translation from the PTE array. Although the contents of
this register duplicate some of the information in the BadVAddr register, they are arranged
in a form that is more useful to a software TLB exception handler. This register is for use
with the XTLB refill handler, which loads TLB entries for references to a 64-bit address
space, and is included solely for operating system use. The operating system sets the PTE
base field in the register, as needed. Normally, the operating system uses this register to
address the current page map which resides in the Kernel mapped segment, kseg3.
The 27-bit BadVPN2 field holds bits 39~13 of the virtual address that caused the TLB
miss; bit 12 is excluded because a single TLB entry maps to an even-odd page pair. For a
4-Kbyte page size, this format may be used directly to access the pair-table of 8-byte
PTEs. For other page sizes and PTE sizes, shifting and masking this value produces the
appropriate address.
Figure 7-17 shows the format of the XContext register and Table 7-17 describes the
XContext register field.
Bit:    63~33     32~31   30~4      3~0
Field:  PTEBase   R       BadVPN2   0
Figure 7-17 XContext Register Format
Table 7-17 XContext Register Field Description
Bit | Field | Description | Cold Reset | Read/Write
63~33 | PTEBase | Page table entry base pointer. This field is normally written with a value that allows the operating system to use the XContext register as a pointer into the current PTE array in memory. | Undefined | Read/Write
32~31 | R | The Region field contains bits 63 to 62 of the virtual address. 00: user, 01: supervisor, 11: kernel | Undefined | Read/Write
30~4 | BadVPN2 | Bad virtual page number divided by two. This field is written by hardware on a miss. It contains the VPN of the most recent invalidly translated virtual address. | Undefined | Read
3~0 | 0 | Reserved | 0x0 | Read
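The XContext layout can be modelled in the same way as the Context register, with the extra Region field. This sketch only illustrates the bit placement in Table 7-17; `xcontext_value` is a hypothetical helper, and `pte_base` stands for the value the operating system placed in the PTEBase field.

```c
#include <stdint.h>

/* Models the XContext register contents after an XTLB miss. */
uint64_t xcontext_value(uint64_t pte_base, uint64_t bad_vaddr)
{
    /* PTEBase: bits 63~33 */
    uint64_t ptebase = pte_base & ~((UINT64_C(1) << 33) - 1u);
    /* R: virtual address bits 63~62, placed in bits 32~31 */
    uint64_t region = ((bad_vaddr >> 62) & 0x3u) << 31;
    /* BadVPN2: virtual address bits 39~13, placed in bits 30~4 */
    uint64_t badvpn2 = ((bad_vaddr >> 13) & UINT64_C(0x7FFFFFF)) << 4;

    return ptebase | region | badvpn2;
}
```

As with the Context register, consecutive BadVPN2 values are 16 bytes apart, matching a pair-table of two 8-byte PTEs per entry.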
7.2.18 Debug Register (Reg#23)
The Debug register is a read-only register, except for the TLF, BsF, SSt and JtagRst fields.
This register holds information for the debug handler. Figure 7-18 shows the format of the
Debug register and Table 7-18 describes the Debug register fields.
Bit:    31   30  29~15  14   13   12   11   10   9  8    7        6  5     4    3     2     1    0
Field:  DBD  DM  0      NIS  TRS  OES  TLF  BsF  0  SSt  JtagRst  0  DINT  DIB  DDBS  DDBL  DBp  DSS
Figure 7-18 Debug Register Format
Table 7-18 Debug Register Field Descriptions
Bit | Field | Description | Cold Reset | Read/Write
31 | DBD | Debug Branch Delay. Set to 1 when a debug exception occurs while an instruction in the branch delay slot is executing. | 0 | Read
30 | DM | Debug Mode. Indicates that a debug exception has taken place. This bit is set when a debug exception is taken, and is cleared upon return from the exception (DERET). While this bit is set, all interrupts (including NMI), TLB exceptions, bus error exceptions, and debug exceptions are masked, and the cache line locking function is disabled. 0: Debug handler not running. 1: Debug handler running. | 0 | Read
29~15 | 0 | Reserved | 0x0 | Read
14 | NIS | Non-maskable Interrupt Status. When set, this bit indicates that a non-maskable interrupt has occurred at the same time as a debug exception. In this case the Status, Cause, EPC, and BadVAddr registers assume the usual status after occurrence of a non-maskable interrupt, but the address in DEPC is not the non-maskable exception vector address (0xbfc0 0000). Instead, 0xbfc0 0000 is put in DEPC by the debug handler software, after which processing returns directly from the debug exception to the non-maskable interrupt handler. | 0 | Read
13 | TRS | TLB Miss Status. When set, this bit indicates that a debug exception and a TLB/XTLB refill exception have occurred at the same time. In this case the Status, Cause, EPC, and BadVAddr registers assume the usual status after occurrence of a TLB/XTLB refill. The address in DEPC is not the other exception vector address. Instead, the debug exception handler software puts the vector address in DEPC: 0xbfc0 0200 (TLB refill, BEV = 1), 0xbfc0 0280 (XTLB refill, BEV = 1), 0x8000 0000 (TLB refill, BEV = 0), or 0x8000 0080 (XTLB refill, BEV = 0), after which processing returns directly from the debug exception to the other exception handler. | 0 | Read
12 | OES | Other Exception Status. When set, this bit indicates that an exception other than reset, NMI, or TLB/XTLB refill has occurred at the same time as a debug exception. In this case the Status, Cause, EPC, and BadVAddr registers assume the usual status after occurrence of such an exception, but the address in DEPC is not the other exception vector address. Instead, 0xbfc0 0380 (if BEV = 1) or 0x8000 0180 (if BEV = 0) is put in DEPC by the debug exception handler software, after which processing returns directly from the debug exception to the other exception handler. | 0 | Read
11 | TLF | TLB Exception Flag. This bit is set to 1 when a TLB-related exception occurs for the immediately preceding load or store instruction while a debug exception handler is running (DM = 1). A TLB exception sets this bit to 1 even if 0 is being written. It is cleared by writing 0; writing 1 is ignored. | 0 | Read/Write
10 | BsF | Bus Error Exception Flag. This bit is set to 1 when a bus error exception occurs for a load or store instruction while a debug exception handler is running (DM = 1). A bus error exception sets this bit to 1 even if 0 is being written. It is cleared by writing 0; writing 1 is ignored. | 0 | Read/Write
9 | 0 | Reserved | 0 | Read
8 | SSt | Single Step. Indicates whether the single-step debug function is enabled (1) or disabled (0). The function is disabled when the DM bit is set to 1, that is, while the debug exception handler is running. | 0 | Read/Write
7 | JtagRst | JTAG Reset. When this bit is set to 1, the processor resets the JTAG unit. | 0 | Read/Write
6 | 0 | Reserved | 0 | Read
5 | DINT | Debug Interrupt Break Exception Status. Set to 1 when a debug interrupt occurs. | 0 | Read
4 | DIB | Debug Instruction Break Exception Status. Set to 1 on an instruction address break. | 0 | Read
3 | DDBS | Debug Data Break Store Exception Status. Set to 1 on a data address break at a store operation. | 0 | Read
2 | DDBL | Debug Data Break Load Exception Status. Set to 1 on a data address break at a load operation. | 0 | Read
1 | DBp | Debug Breakpoint Exception Status. Set to 1 when an SDBBP instruction is executed. | 0 | Read
0 | DSS | Debug Single Step Exception Status. Set to 1 to indicate a single-step exception. | 0 | Read
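The NIS, TRS and OES descriptions each name the vector address that the debug handler software places in DEPC. The selection can be sketched as below. The function name is hypothetical, the order in which the three bits are checked is an assumption, and the caller is assumed to know whether the refill was a TLB or an XTLB refill, since the table does not say how to tell them apart.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEBUG_NIS (1u << 14)
#define DEBUG_TRS (1u << 13)
#define DEBUG_OES (1u << 12)

/* Returns the vector address to place in DEPC before DERET, or 0 when
 * none of NIS, TRS or OES is set. bev is the Status BEV bit; xtlb
 * selects the XTLB refill vector (supplied by the caller). */
uint32_t debug_depc_fixup(uint32_t debug, bool bev, bool xtlb)
{
    if (debug & DEBUG_NIS)
        return 0xBFC00000u;
    if (debug & DEBUG_TRS) {
        if (bev)
            return xtlb ? 0xBFC00280u : 0xBFC00200u;
        return xtlb ? 0x80000080u : 0x80000000u;
    }
    if (debug & DEBUG_OES)
        return bev ? 0xBFC00380u : 0x80000180u;
    return 0;  /* no simultaneous exception: leave DEPC unchanged */
}
```

In 64-bit mode these addresses would be used in their sign-extended form; the sketch returns the 32-bit values as written in the table.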
7.2.19 DEPC Regist er (Reg#24)
The DEPC register holds the address where processing resumes after the debug
exception routine has finished. The address that has been loaded in the DEPC register is
the virtual address of the instruction that caused the debug exception. If the instruction
is in the branch delay slot, the virtual address of the immediately preceding branch or
jump instruction is placed in this register. Execution of the DERET instruction causes a
jump to the address in the DEPC. If the DEPC is written by both software (MTC0)
and hardware (debug exception), the DEPC is loaded with the value generated by
the hardware.
Figure 7-19 shows the formats of the DEPC register and Table 7-19 describes the DEPC
register field.
31 0
DEPC
(32-bit mode)
63 0
DEPC
(64-bit mode)
Figure 7-19 DEPC Register Formats
Table 7-19 DEPC Register Field Description
32-bit mode
Bit Field Description Cold Reset Read/Write
31~0 DEPC Debug exception program counter. Undefined Read/Write
64-bit mode
Bit Field Description Cold Reset Read/Write
63~0 DEPC Debug exception program counter. Undefined Read/Write
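The rule for the address loaded into DEPC can be sketched as follows. This is an illustration only: the function name is hypothetical, and it assumes fixed 4-byte instructions, so that the branch or jump sits 4 bytes before its delay slot.

```c
#include <stdint.h>
#include <stdbool.h>

/* Address that DEPC would hold for a debug exception at `pc`.
 * Assumption: instructions are 4 bytes, so the branch or jump that owns
 * a delay slot is located at pc - 4. */
static uint64_t depc_value(uint64_t pc, bool in_branch_delay_slot)
{
    return in_branch_delay_slot ? pc - 4 : pc;
}
```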
7.2.20 TagLo Register (Reg#28) and TagHi Register (Reg#29)
The TagLo and TagHi registers are read/write registers. These registers hold the
primary cache tag for the cache lock function or for cache diagnostics. These registers are
written by the CACHE/MTC0 instruction. Figure 7-20 shows the formats of the TagLo
and TagHi registers and Table 7-20 describes the TagLo and TagHi register fields.
31 8 7 6 5 3 2 1 0
PTagLo PState RWNT Lock F0 0
(TagLo)
31 30 29 0
F1 PTagLo1 0
(TagHi)
Figure 7-20 TagLo and TagHi Register Formats
Table 7-20 TagLo and TagHi Register Field Descriptions
TagLo
Bit Field Description Cold Reset Read/Write
31~8 PTagLo Bits 35~12 of the physical address 0x0 Read/Write
7~6 PState Specifies the primary cache state
0: Invalid 1: Reserved
2: Reserved 3: Valid
0x0 Read/Write
5~3 RWNT Read/Write bits required for Windows NT 0x0 Read/Write
2 Lock Lock bit (0: not locked, 1: locked) 0 Read/Write
1 F0 FIFO Replace bit 0 (indicates the set to be replaced) 0 Read/Write
0 0 Reserved 0 Read
TagHi
Bit Field Description Cold Reset Read/Write
31 F1 FIFO Replace bit 1 (indicates the set to be replaced) 0 Read/Write
30 PTagLo1 Bit 11 of the physical address 0 Read/Write
29~0 0 Reserved 0x0 Read
F1 and F0 are concatenated and indicate the set to be replaced.
F1 F0
0 0 : way0
0 1 : way1
1 0 : way2
1 1 : way3
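As an illustration of the field layout in Table 7-20, the sketch below unpacks a TagLo/TagHi pair into its fields. The type and function names are hypothetical; only the bit positions come from the table.

```c
#include <stdint.h>
#include <stdbool.h>

/* Decoded view of a primary cache tag (hypothetical type). */
typedef struct {
    uint64_t phys_tag;     /* physical address bits 35~11, in place */
    unsigned pstate;       /* 0: Invalid, 3: Valid, 1 and 2: Reserved */
    bool     locked;       /* Lock bit */
    unsigned replace_way;  /* F1:F0, way 0..3 to be replaced */
} cache_tag;

static cache_tag decode_cache_tag(uint32_t taglo, uint32_t taghi)
{
    cache_tag t;
    uint64_t pa_35_12 = taglo >> 8;           /* TagLo 31~8 = PA 35~12 */
    uint64_t pa_11    = (taghi >> 30) & 1u;   /* TagHi 30   = PA 11    */

    t.phys_tag    = (pa_35_12 << 12) | (pa_11 << 11);
    t.pstate      = (taglo >> 6) & 3u;
    t.locked      = ((taglo >> 2) & 1u) != 0;
    /* F1 (TagHi bit 31) and F0 (TagLo bit 1) are concatenated. */
    t.replace_way = (((taghi >> 31) & 1u) << 1) | ((taglo >> 1) & 1u);
    return t;
}
```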
7.2.21 ErrorEPC Register (Reg#30)
The ErrorEPC is a read/write register, and is similar to the EPC register. This register
is used to store the program counter (PC) on ColdReset, SoftReset and NMI exceptions.
This register contains the virtual address at which instruction processing can resume
after servicing an error. This address can be either:
the virtual address of the instruction that caused the exception, or
the virtual address of the immediately preceding branch or jump instruction,
when the instruction that caused the exception is in a branch delay slot.
There is no branch delay slot indication for this register. Figure 7-21 shows the formats
of the ErrorEPC register and Table 7-21 describes the ErrorEPC register field.
31 0
ErrorEPC
(32-bit mode)
63 0
ErrorEPC
(64-bit mode)
Figure 7-21 ErrorEPC Register Formats
Table 7-21 ErrorEPC Register Field Descriptions
32-bit mode
Bit Field Description Cold Reset Read/Write
31~0 ErrorEPC Error Exception Program Counter. Undefined Read/Write
64-bit mode
Bit Field Description Cold Reset Read/Write
63~0 ErrorEPC Error Exception Program Counter. Undefined Read/Write
7.2.22 DESAVE Register (Reg#31)
This register is used by the debug exception handler to save one of the GPRs, which is
then used to save the rest of the context to a predetermined memory area, e.g. in the
processor probe. This register allows the safe debugging of exception handlers and other
types of code where the existence of a valid stack for context saving cannot be assumed.
Figure 7-22 shows the format of the DESAVE register and Table 7-22 describes the
DESAVE register field.
Note: This register can be used for the ICE system only.
63 0
DESAVE
Figure 7-22 DESAVE Register Format
Table 7-22 DESAVE register Field Description
32/64-bit mode
Bit Field Description Cold Reset Read/Write
63~0 DESAVE Save one of the GPRs Undefined Read/Write
7.2.23 The Initialization of CP0 Registers in SoftReset Exception
Table 7-23 shows the values of the registers that are initialized by a SoftReset exception.
Table 7-23 The Initial Value by SoftReset Exception
Register         Bit  Field  SoftReset  Description
Status (Reg#12)  22   BEV    1          Same value as ColdReset
Status (Reg#12)  20   SR     1          ColdReset has priority over SoftReset
Status (Reg#12)  2    ERL    1          Same value as ColdReset
8. Memory Management System
8.1 Introduction
The TX49 provides a full-featured memory management unit (MMU) which uses an on-chip
translation look aside buffer (TLB) to translate virtual addresses into physical addresses.
8.2 Address Space Overview
The TX49 physical address space is 64 Gbytes, using a 36-bit address. The virtual
address is either 64 or 32 bits wide, depending on whether the processor is operating in 64-
or 32-bit mode. In 32-bit mode, addresses are 32 bits wide and the maximum user process
size is 2 Gbytes (2**31). In 64-bit mode, addresses are 64 bits wide and the maximum user
process size is 1 Tbyte (2**40). The virtual address is extended with an Address Space
Identifier (ASID) to reduce the frequency of TLB flushing when switching context. The
size of the ASID field is 8 bits. The ASID is contained in the CP0 EntryHi register.
8.2.1 Virtual Address Space
The processor virtual address can be either 32 or 64 bits wide, depending on whether
the processor is operating in 32-bit or 64-bit mode.
In 32-bit mode, addresses are 32 bits wide.
The maximum user process size is 2 gigabytes (2**31).
In 64-bit mode, addresses are 64 bits wide.
The maximum user process size is 1 terabyte (2**40).
Figure 8-1 shows the translation of a virtual address into a physical address.
1. The virtual address (VA), represented by the virtual page number (VPN), is compared
with the tag in the TLB.
2. If there is a match, the page frame number (PFN) representing the upper bits of the
physical address (PA) is output from the TLB.
3. The Offset, which does not pass through the TLB, is then concatenated to the PFN.
Virtual address: G, ASID, VPN, Offset. TLB entry: G, ASID, VPN, PFN. Physical address: PFN, Offset.
Figure 8-1 Overview of a Virtual-to-Physical Address Translation
As shown in Figure 8-2 and Figure 8-3, the virtual address is extended with an 8-bit
address space identifier (ASID), which reduces the frequency of TLB flushing when
switching contexts. This 8-bit ASID is in the CP0 EntryHi register, described later in this
chapter. The Global bit (G) is in the EntryLo0 and EntryLo1 registers, described later in
this chapter.
8.2.2 Physical Address Space
Using a 36-bit address, the processor physical address space encompasses 64 Gbytes.
The following section describes the translation of a virtual address to a physical address.
8.2.3 Virtual-to-Physical Address Translation
Converting a virtual address to a physical address begins by comparing the virtual
address from the processor with the virtual addresses in the TLB; there is a match when
the virtual page number (VPN) of the address is the same as the VPN field of the entry,
and either:
the Global (G) bit of the TLB entry is set, or
the ASID field of the virtual address is the same as the ASID field of the TLB
entry.
This match is referred to as a TLB hit. If there is no match, a TLB Miss exception is
taken by the processor and software is allowed to refill the TLB from a page table of
virtual/physical addresses in memory.
If there is a virtual address match in the TLB, the physical address is output from the
TLB and concatenated with the Offset, which represents an address within the page
frame space. The Offset does not pass through the TLB.
Virtual-to-physical translation is described in greater detail throughout the remainder
of this chapter; Figure 8-8 is a flow diagram of the process shown at the end of this
chapter. The next two sections describe the 32-bit and 64-bit address translations.
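The match rule described in this section (VPN equal, and either the G bit set or the ASID equal) can be sketched in C as shown below. This is a simplified model under stated assumptions, not the hardware implementation: the structure and function names are hypothetical, each entry maps a single page of one fixed size, and a return value of -1 stands for the TLB Miss exception.

```c
#include <stdint.h>
#include <stdbool.h>

/* Simplified TLB entry (hypothetical layout, one page per entry). */
typedef struct {
    uint64_t vpn;     /* virtual page number     */
    uint8_t  asid;    /* address space identifier */
    bool     global;  /* G bit                   */
    uint64_t pfn;     /* page frame number       */
} tlb_entry;

/* Returns the index of the matching entry, or -1 for a TLB miss. */
static int tlb_lookup(const tlb_entry *tlb, int entries,
                      uint64_t vpn, uint8_t asid)
{
    for (int i = 0; i < entries; i++) {
        if (tlb[i].vpn == vpn && (tlb[i].global || tlb[i].asid == asid))
            return i;
    }
    return -1;
}

/* The Offset bypasses the TLB and is concatenated to the PFN. */
static uint64_t physical_address(uint64_t pfn, uint64_t offset,
                                 unsigned offset_bits)
{
    return (pfn << offset_bits) | offset;
}
```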
8.2.4 32-bit Mode Address Translation
Figure 8-2 shows the virtual-to-physical-address translation of a 32-bit mode address.
This figure illustrates two of the possible page sizes: a 4-Kbyte page (12 bits) and a 16-
Mbyte page (24 bits).
The top portion of Figure 8-2 shows a virtual address with a 12-bit, or 4-Kbyte,
page size, labelled Offset. The remaining 20 bits of the address represent the
VPN, and index the 1M-entry page table.
The bottom portion of Figure 8-2 shows a virtual address with a 24-bit, or 16-
Mbyte, page size, labeled Offset. The remaining 8 bits of the address represent
the VPN, and index the 256-entry page table.
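The split of a 32-bit virtual address into VPN and Offset for the two page sizes above can be written as the following sketch. The function names are hypothetical; `offset_bits` is 12 for a 4-Kbyte page and 24 for a 16-Mbyte page.

```c
#include <stdint.h>

/* VPN: the address bits above the Offset. */
static uint32_t va32_vpn(uint32_t va, unsigned offset_bits)
{
    return va >> offset_bits;
}

/* Offset: the low offset_bits bits, passed unchanged to physical memory. */
static uint32_t va32_offset(uint32_t va, unsigned offset_bits)
{
    return va & ((1u << offset_bits) - 1u);
}
```

For example, address 0x1234 5678 has VPN 0x12345 and Offset 0x678 with 4-Kbyte pages, and VPN 0x12 and Offset 0x34 5678 with 16-Mbyte pages.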
Virtual address with 1M (2**20) 4-Kbyte pages:
ASID (8 bits, bits 39~32), VPN (20 bits, bits 31~12; 20 bits = 1M pages), Offset (12 bits, bits 11~0)
Virtual address with 256 (2**8) 16-Mbyte pages:
ASID (8 bits, bits 39~32), VPN (8 bits, bits 31~24; 8 bits = 256 pages), Offset (24 bits, bits 23~0)
Bits 31, 30 and 29 of the virtual address select user, supervisor, or kernel address spaces.
Virtual-to-physical translation of the VPN takes place in the TLB; the Offset is passed
unchanged to physical memory.
36-bit physical address (bits 35~0): PFN, Offset
Figure 8-2 32-bit Mode Virtual Address Translation
8.2.5 64-bit Mode Address Translation
Figure 8-3 shows the virtual-to-physical-address translation of a 64-bit mode address.
This figure illustrates two of the possible page sizes: a 4-Kbyte page (12 bits) and a 16-
Mbyte page (24 bits).
The top portion of Figure 8-3 shows a virtual address with a 12-bit, or 4-Kbyte,
page size, labelled Offset. The remaining 28 bits of the address represent the
VPN, and index the 256M-entry page table.
The bottom portion of Figure 8-3 shows a virtual address with a 24-bit, or 16-Mbyte,
page size, labelled Offset. The remaining 16 bits of the address represent
the VPN, and index the 64K-entry page table.
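A corresponding sketch for 64-bit mode follows. It assumes, as the text above describes, that the VPN and Offset together occupy the low 40 bits of the address and that bits 63~62 select the address space; the function names are hypothetical.

```c
#include <stdint.h>

/* Bits 63~62 select the user, supervisor or kernel address space. */
static unsigned va64_region(uint64_t va)
{
    return (unsigned)(va >> 62);
}

/* VPN: bits 39 down to offset_bits (28 bits for 4-Kbyte pages,
 * 16 bits for 16-Mbyte pages). */
static uint64_t va64_vpn(uint64_t va, unsigned offset_bits)
{
    return (va & ((1ULL << 40) - 1)) >> offset_bits;
}

/* Offset: the low offset_bits bits. */
static uint64_t va64_offset(uint64_t va, unsigned offset_bits)
{
    return va & ((1ULL << offset_bits) - 1);
}
```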
Virtual address with 256M (2**28) 4-Kbyte pages:
ASID (8 bits, bits 71~64), bits 63~40 (24 bits; bits 61~40 are 0 or -1), VPN (28 bits, bits 39~12; 28 bits = 256M pages), Offset (12 bits, bits 11~0)
Virtual address with 64K (2**16) 16-Mbyte pages:
ASID (8 bits, bits 71~64), bits 63~40 (24 bits; bits 61~40 are 0 or -1), VPN (16 bits, bits 39~24; 16 bits = 64K pages), Offset (24 bits, bits 23~0)
Bits 62 and 63 of the virtual address select user, supervisor, or kernel address spaces.
Virtual-to-physical translation of the VPN takes place in the TLB; the Offset is passed
unchanged to physical memory.
36-bit physical address (bits 35~0): PFN, Offset
Figure 8-3 64-bit Mode Virtual Address Translation
8.3 Operating Modes
The TX49 has three operating modes, User mode, Supervisor mode and Kernel mode, for
32- and 64-bit operation. The KSU, EXL and ERL bits in the Status register select User,
Supervisor or Kernel mode. The UX, SX and KX bits in the Status register select 32- or 64-bit
addressing in User, Supervisor and Kernel mode respectively.
KSU  EXL  ERL  UX  SX  KX  Mode
10   0    0    0   -   -   32-bit addressing in user mode
10   0    0    1   -   -   64-bit addressing in user mode
01   0    0    -   0   -   32-bit addressing in supervisor mode
01   0    0    -   1   -   64-bit addressing in supervisor mode
00   -    -    -   -   0   32-bit addressing in kernel mode
-    1    -    -   -   0   32-bit addressing in kernel mode
-    -    1    -   -   0   32-bit addressing in kernel mode
00   -    -    -   -   1   64-bit addressing in kernel mode
-    1    -    -   -   1   64-bit addressing in kernel mode
-    -    1    -   -   1   64-bit addressing in kernel mode
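The table above can be read as a small decision procedure. The sketch below takes the individual Status fields as inputs rather than a raw register value, so it does not depend on their bit positions; all type and function names are hypothetical. KSU = 11 does not appear in the table and is reported as undefined here.

```c
#include <stdbool.h>

typedef enum { MODE_USER, MODE_SUPERVISOR, MODE_KERNEL, MODE_UNDEFINED } cpu_mode;

/* Status register fields used for mode selection (hypothetical struct). */
typedef struct {
    unsigned ksu;            /* 2-bit KSU field: 0 (00), 1 (01), 2 (10) */
    bool exl, erl;           /* exception level, error level */
    bool ux, sx, kx;         /* 64-bit addressing enables */
} status_fields;

static cpu_mode current_mode(status_fields s)
{
    if (s.exl || s.erl || s.ksu == 0)
        return MODE_KERNEL;          /* any one condition is sufficient */
    if (s.ksu == 1)
        return MODE_SUPERVISOR;
    if (s.ksu == 2)
        return MODE_USER;
    return MODE_UNDEFINED;           /* KSU = 11 is not listed in the table */
}

/* True when the current mode uses 64-bit addressing. */
static bool uses_64bit_addressing(status_fields s)
{
    switch (current_mode(s)) {
    case MODE_USER:       return s.ux;
    case MODE_SUPERVISOR: return s.sx;
    case MODE_KERNEL:     return s.kx;
    default:              return false;
    }
}
```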
8.3.1 User Mode Operations
In User mode, a single, uniform virtual address space, labelled User segment, is
available; its size is:
2 Gbytes (2**31 bytes) in 32-bit mode (useg)
1 Tbyte (2**40 bytes) in 64-bit mode (xuseg)
Figure 8-4 shows User mode virtual address space.
32-bit*
0x0000 0000 through 0x7FFF FFFF: useg, 2 GB, Mapped, Cacheable
0x8000 0000 through 0xFFFF FFFF: Address Error
64-bit
0x0000 0000 0000 0000 through 0x0000 00FF FFFF FFFF: xuseg, 1 TB, Mapped, Cacheable
0x0000 0100 0000 0000 through 0xFFFF FFFF FFFF FFFF: Address Error
Figure 8-4 User Mode Virtual Address Space
*Note: In 32-bit mode, bit 31 is sign-extended through bits 63~32. Failure results in an
address error exception.
The User segment starts at address 0 and the current active user process resides in
either useg (in 32-bit mode) or xuseg (in 64-bit mode). The TLB identically maps all
references to useg/xuseg from all modes, and controls cache accessibility.
The processor operates in User mode when the Status register contains the following
bit-values:
KSU bits = 10₂
EXL = 0
ERL = 0
In conjunction with these bits, the UX bit in the Status register selects between 32- or
64-bit User mode addressing as follows:
when UX = 0, 32-bit useg space is selected and TLB misses are handled by the 32-bit
TLB refill exception handler
when UX = 1, 64-bit xuseg space is selected and TLB misses are handled by the
64-bit XTLB refill exception handler
Table 8-1 lists the characteristics of the two user mode segments, useg and xuseg.
Table 8-1 32-bit and 64-bit User Mode Segments
Address Bit Values      Status Register Bit Values  Segment  Address Range                    Segment Size
                        KSU  EXL  ERL  UX           Name
32-bit: A(31) = 0       10₂  0    0    0            useg     0x0000 0000 through              2 Gbytes
                                                             0x7FFF FFFF                      (2**31 bytes)
64-bit: A(63~40) = 0    10₂  0    0    1            xuseg    0x0000 0000 0000 0000 through    1 Tbyte
                                                             0x0000 00FF FFFF FFFF            (2**40 bytes)
32-bit User Mode (useg)
In User mode, when UX = 0 in the Status register, User mode addressing is
compatible with the 32-bit addressing model shown in Figure 8-4, and a 2-Gbyte user
address space is available, labelled useg.
All valid User mode virtual addresses have their most-significant bit cleared to 0;
any attempt to reference an address with the most-significant bit set while in User
mode causes an Address Error exception.
The system maps all references to useg through the TLB, and bit settings within
the TLB entry for the page determine the cacheability of a reference.
64-bit User Mode (xuseg)
In User mode, when UX = 1 in the Status register, User mode addressing is
extended to the 64-bit model shown in Figure 8-4. In 64-bit User mode, the processor
provides a single, uniform address space of 2**40 bytes, labelled xuseg.
All valid User mode virtual addresses have bits 63~40 equal to 0; an attempt to
reference an address with bits 63~40 not equal to 0 causes an Address Error
exception.
The system maps all references to xuseg through the TLB, and bit settings within
the TLB entry for the page determine the cacheability of a reference.
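The two User mode validity rules can be summarized in a short sketch. The function name is hypothetical, and the 32-bit case assumes the address is held in a 64-bit value with the upper bits sign-extended from bit 31, so a valid useg address has bits 63~31 all zero.

```c
#include <stdint.h>
#include <stdbool.h>

/* True if `va` is a valid User mode address; false means an Address
 * Error exception would be raised. `ux` is the UX bit of Status. */
static bool user_address_ok(uint64_t va, bool ux)
{
    if (ux)
        return (va >> 40) == 0;   /* xuseg: bits 63~40 must be 0 */
    return (va >> 31) == 0;       /* useg: bit 31 (and its sign extension) must be 0 */
}
```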
8.3.2 Supervisor Mode Operations
Supervisor mode is designed for layered operating systems in which a true kernel runs
in TX49 Kernel mode, and the rest of the operating system runs in Supervisor mode.
The processor operates in Supervisor mode when the Status register contains the
following bit-values:
KSU = 01₂
EXL = 0
ERL = 0
In conjunction with these bits, the SX bit in the Status register selects between 32- or
64-bit Supervisor mode addressing:
when SX = 0, 32-bit supervisor space is selected and TLB misses are handled by
the 32-bit TLB refill exception handler
when SX = 1, 64-bit supervisor space is selected and TLB misses are handled by
the 64-bit XTLB refill exception handler
The system maps all references through the TLB, and bit settings within the TLB entry
for the page determine the cacheability of a reference.
Figure 8-5 shows Supervisor mode address mapping. Table 8-2 lists the characteristics
of the supervisor mode segments; descriptions of the address spaces follow.
32-bit* (start address: segment, size, attributes)
0x0000 0000: suseg, 2 GB, Mapped, Cacheable
0x8000 0000: Address error
0xA000 0000: Address error
0xC000 0000: sseg, 0.5 GB, Mapped, Cacheable
0xE000 0000 through 0xFFFF FFFF: Address error
64-bit (start address: segment, size, attributes)
0x0000 0000 0000 0000: xsuseg, 1 TB, Mapped, Cacheable
0x0000 0100 0000 0000: Address error
0x4000 0000 0000 0000: xsseg, 1 TB, Mapped, Cacheable
0x4000 0100 0000 0000: Address error
0xFFFF FFFF C000 0000: csseg, 0.5 GB, Mapped, Cacheable
0xFFFF FFFF E000 0000 through 0xFFFF FFFF FFFF FFFF: Address error
Figure 8-5 Supervisor Mode Address Space
*Note: In 32-bit mode, bit31 is sign-extended through bits 63~32. Failure results in an
address error exception.
Table 8-2 32-bit and 64-bit Supervisor Mode Segments
Address Bit Values        Status Register Bit Values  Segment  Address Range                    Segment Size
                          KSU  EXL  ERL  SX           Name
32-bit: A(31) = 0         01₂  0    0    0            suseg    0x0000 0000 through              2 Gbytes
                                                               0x7FFF FFFF                      (2**31 bytes)
32-bit: A(31~29) = 110₂   01₂  0    0    0            sseg     0xC000 0000 through              512 Mbytes
                                                               0xDFFF FFFF                      (2**29 bytes)
64-bit: A(63~62) = 00₂    01₂  0    0    1            xsuseg   0x0000 0000 0000 0000 through    1 Tbyte
                                                               0x0000 00FF FFFF FFFF            (2**40 bytes)
64-bit: A(63~62) = 01₂    01₂  0    0    1            xsseg    0x4000 0000 0000 0000 through    1 Tbyte
                                                               0x4000 00FF FFFF FFFF            (2**40 bytes)
64-bit: A(63~62) = 11₂    01₂  0    0    1            csseg    0xFFFF FFFF C000 0000 through    512 Mbytes
                                                               0xFFFF FFFF DFFF FFFF            (2**29 bytes)
32-bit Supervisor Mode, User Space (suseg)
In Supervisor mode, when SX = 0 in the Status register and the most-significant bit
of the 32-bit virtual address is set to 0, the suseg virtual address space is selected; it
covers the full 2**31 bytes (2 Gbytes) of the current user address space. The virtual
address is extended with the contents of the 8-bit ASID field to form a unique virtual
address. This mapped space starts at virtual address 0x0000 0000 and runs through
0x7FFF FFFF.
32-bit Supervisor Mode, Supervisor Space (sseg)
In Supervisor mode, when SX = 0 in the Status register and the three most-significant
bits of the 32-bit virtual address are 110₂, the sseg virtual address space is
selected; it covers 2**29 bytes (512 Mbytes) of the current supervisor address space. The
virtual address is extended with the contents of the 8-bit ASID field to form a unique
virtual address. This mapped space begins at virtual address 0xC000 0000 and runs
through 0xDFFF FFFF.
64-bit Supervisor Mode, User Space (xsuseg)
In Supervisor mode, when SX = 1 in the Status register and bits 63~62 of the virtual
address are set to 00₂, the xsuseg virtual address space is selected; it covers the full
2**40 bytes (1 Tbyte) of the current user address space. The virtual address is extended
with the contents of the 8-bit ASID field to form a unique virtual address. This
mapped space starts at virtual address 0x0000 0000 0000 0000 and runs through
0x0000 00FF FFFF FFFF.
64-bit Supervisor Mode, Current Supervisor Space (xsseg)
In Supervisor mode, when SX = 1 in the Status register and bits 63~62 of the
virtual address are set to 01₂, the xsseg current supervisor virtual address space is
selected. The virtual address is extended with the contents of the 8-bit ASID field to
form a unique virtual address. This mapped space begins at virtual address 0x4000
0000 0000 0000 and runs through 0x4000 00FF FFFF FFFF.
64-bit Supervisor Mode, Separate Supervisor Space (csseg)
In Supervisor mode, when SX = 1 in the Status register and bits 63~62 of the
virtual address are set to 11₂, the csseg separate supervisor virtual address space is
selected. Addressing of the csseg is compatible with addressing sseg in 32-bit mode.
The virtual address is extended with the contents of the 8-bit ASID field to form a
unique virtual address. This mapped space begins at virtual address 0xFFFF FFFF
C000 0000 and runs through 0xFFFF FFFF DFFF FFFF.
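The Supervisor mode segments described above can be classified with simple range checks, as in the sketch below. The enum and function names are hypothetical. For SX = 0 only the low 32 bits are examined; the required sign extension of bit 31 through bits 63~32 is assumed to hold and is not checked.

```c
#include <stdint.h>
#include <stdbool.h>

typedef enum {
    SEG_SUSEG, SEG_SSEG, SEG_XSUSEG, SEG_XSSEG, SEG_CSSEG, SEG_ADDRESS_ERROR
} sup_segment;

/* Classifies a Supervisor mode virtual address; `sx` is the SX bit. */
static sup_segment supervisor_segment(uint64_t va, bool sx)
{
    if (!sx) {
        uint32_t a = (uint32_t)va;             /* low 32 bits only */
        if ((a >> 31) == 0) return SEG_SUSEG;  /* A(31) = 0        */
        if ((a >> 29) == 6) return SEG_SSEG;   /* A(31~29) = 110   */
        return SEG_ADDRESS_ERROR;
    }
    if (va <= 0x000000FFFFFFFFFFULL)
        return SEG_XSUSEG;
    if (va >= 0x4000000000000000ULL && va <= 0x400000FFFFFFFFFFULL)
        return SEG_XSSEG;
    if (va >= 0xFFFFFFFFC0000000ULL && va <= 0xFFFFFFFFDFFFFFFFULL)
        return SEG_CSSEG;
    return SEG_ADDRESS_ERROR;
}
```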
8.3.3 Kernel Mode Operations
The processor operates in Kernel mode when the Status register contains one or more of
the following values:
KSU = 00₂
EXL = 1
ERL = 1
In conjunction with these bits, the KX bit in the Status register selects between 32- or
64-bit Kernel mode addressing:
when KX = 0, 32-bit kernel space is selected and all TLB misses are handled by
the 32-bit TLB refill exception handler
when KX = 1, 64-bit kernel space is selected and all TLB misses are handled by
the 64-bit XTLB refill exception handler
The processor enters Kernel mode whenever an exception is detected and it remains in
Kernel mode until an Exception Return (ERET) instruction is executed and results in
ERL and/or EXL = 0. The ERET instruction restores the processor to the mode existing
prior to the exception.
Kernel mode virtual address space is divided into regions differentiated by the high-order
bits of the virtual address, as shown in Figure 8-6. Table 8-3 lists the
characteristics of the 32-bit kernel mode segments, and Table 8-4 lists the characteristics
of the 64-bit kernel mode segments.
32-bit* (start address: segment, size, attributes)
0x0000 0000: kuseg, 2 GB, Mapped, Cacheable
0x8000 0000: kseg0, 0.5 GB, Unmapped, Cacheable
0xA000 0000: kseg1, 0.5 GB, Unmapped, Uncached
0xC000 0000: ksseg, 0.5 GB, Mapped, Cacheable
0xE000 0000 through 0xFFFF FFFF: kseg3, 0.5 GB, Mapped, Cacheable
64-bit (start address: segment, size, attributes)
0x0000 0000 0000 0000: xkuseg, 1 TB, Mapped, Cacheable
0x0000 0100 0000 0000: Address error
0x4000 0000 0000 0000: xksseg, 1 TB, Mapped, Cacheable
0x4000 0100 0000 0000: Address error
0x8000 0000 0000 0000: xkphys, Unmapped (for details see Figure 8-7)
0xC000 0000 0000 0000: xkseg, Mapped, Cacheable
0xC000 00FF 8000 0000: Address error
0xFFFF FFFF 8000 0000: ckseg0, 0.5 GB, Unmapped, Cacheable
0xFFFF FFFF A000 0000: ckseg1, 0.5 GB, Unmapped, Uncached
0xFFFF FFFF C000 0000: cksseg, 0.5 GB, Mapped, Cacheable
0xFFFF FFFF E000 0000 through 0xFFFF FFFF FFFF FFFF: ckseg3, 0.5 GB, Mapped, Cacheable
Figure 8-6 Kernel Mode Address Space
*Note 1: In 32-bit mode, bit 31 is sign-extended through bits 63~32. Failure results in an address error
exception.
*Note 2: 0xff00_0000 through 0xff3f_ffff in 32-bit mode and 0xffff_ffff_ff00_0000 through 0xffff_ffff_ff3f_ffff
in 64-bit mode are reserved (unmapped, uncached) for use by registers in the Debug Support
Unit and TX49 MCU peripherals.
0xA000 0000 0000 0000 through 0xBFFF FFFF FFFF FFFF: 4 * 64 GB, Unmapped, Reserved
0x9800 0000 0000 0000 through 0x9FFF FFFF FFFF FFFF: 64 GB, Unmapped, Cacheable, noncoherent, WB
0x9000 0000 0000 0000 through 0x97FF FFFF FFFF FFFF: 64 GB, Unmapped, Uncached
0x8800 0000 0000 0000 through 0x8FFF FFFF FFFF FFFF: 64 GB, Unmapped, Cacheable, noncoherent, WT-WA
0x8000 0000 0000 0000 through 0x87FF FFFF FFFF FFFF: 64 GB, Unmapped, Cacheable, noncoherent, WT-no-WA
Figure 8-7 xkphys Address Space
Table 8-3 32-bit Kernel Mode Segments
For every row, the Status register is one of these values: KSU = 00₂, or EXL = 1, or ERL = 1.
Address Bit Values  KX  Segment Name  Address Range                      Segment Size
A(31) = 0           0   kuseg         0x0000 0000 through 0x7FFF FFFF    2 Gbytes (2**31 bytes)
A(31~29) = 100₂     0   kseg0         0x8000 0000 through 0x9FFF FFFF    512 Mbytes (2**29 bytes)
A(31~29) = 101₂     0   kseg1         0xA000 0000 through 0xBFFF FFFF    512 Mbytes (2**29 bytes)
A(31~29) = 110₂     0   ksseg         0xC000 0000 through 0xDFFF FFFF    512 Mbytes (2**29 bytes)
A(31~29) = 111₂     0   kseg3         0xE000 0000 through 0xFFFF FFFF    512 Mbytes - 4 Mbytes (2**29 bytes)
                    0   (Reserved)    0xFF00 0000 through 0xFF3F FFFF    4 Mbytes
32-bit Kernel Mode, User Space (kuseg)
In Kernel mode, when KX = 0 in the Status register, and the most-significant bit of
the virtual address, A31, is cleared, the 32-bit kuseg virtual address space is selected;
it covers the full 2**31 bytes (2 Gbytes) of the current user address space. The virtual
address is extended with the contents of the 8-bit ASID field to form a unique virtual
address. When ERL = 1 in the Status register, the user address region becomes a 2**31-byte
unmapped (that is, mapped directly to physical addresses) uncached address
space.
32-bit Kernel Mode, Kernel Space 0 (kseg0)
In Kernel mode, when KX = 0 in the Status register and the most-significant three
bits of the virtual address are 100₂, the 32-bit kseg0 virtual address space is selected; it is
the 2**29-byte (512-Mbyte) kernel physical space. References to kseg0 are not mapped
through the TLB; the physical address selected is defined by subtracting 0x8000 0000
from the virtual address. The K0 field of the Config register, described in this
chapter, controls cacheability and coherency.
32-bit Kernel Mode, Kernel Space 1 (kseg1)
In Kernel mode, when KX = 0 in the Status register and the most-significant three
bits of the 32-bit virtual address are 101₂, the 32-bit kseg1 virtual address space is
selected; it is the 2**29-byte (512-Mbyte) kernel physical space. References to kseg1 are
not mapped through the TLB; the physical address selected is defined by subtracting
0xA000 0000 from the virtual address. Caches are disabled for accesses to these
addresses, and physical memory (or memory-mapped I/O device registers) is
accessed directly.
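The fixed translation for the two unmapped 32-bit kernel segments can be sketched as follows. The function name is hypothetical; the function returns false for addresses outside kseg0 and kseg1, which are translated through the TLB instead.

```c
#include <stdint.h>
#include <stdbool.h>

/* Computes the physical address for a kseg0 or kseg1 virtual address.
 * Returns false if `va` is not in an unmapped kernel segment. */
static bool kernel_unmapped_pa(uint32_t va, uint32_t *pa)
{
    switch (va >> 29) {
    case 4:                         /* 100: kseg0, cacheability set by K0 */
        *pa = va - 0x80000000u;
        return true;
    case 5:                         /* 101: kseg1, uncached */
        *pa = va - 0xA0000000u;
        return true;
    default:
        return false;
    }
}
```

For example, the kseg1 address 0xBFC0 0000 corresponds to physical address 0x1FC0 0000.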
32-bit Kernel Mode, Supervisor Space (ksseg)
In Kernel mode, when KX = 0 in the Status register and the most-significant three
bits of the 32-bit virtual address are 110₂, the ksseg virtual address space is selected;
it is the current 2**29-byte (512-Mbyte) supervisor virtual space. The virtual address is
extended with the contents of the 8-bit ASID field to form a unique virtual address.
32-bit Kernel Mode, Kernel Space 3 (kseg3)
In Kernel mode, when KX = 0 in the Status register and the most-significant three
bits of the 32-bit virtual address are 111₂, the kseg3 virtual address space is selected; it
is the current 2**29-byte (512 Mbytes - 4 Mbytes) kernel virtual space. The virtual
address is extended with the contents of the 8-bit ASID field to form a unique virtual
address.
Note: There is a 4-Mbyte Reserved area, which begins at virtual address 0xFF00_0000 and runs
through 0xFF3F_FFFF.
Table 8-4 64-bit Kernel Mode Segments
For every row, the Status register is one of these values: KSU = 00₂, or EXL = 1, or ERL = 1.
Address Bit Values                 KX  Segment Name  Address Range                                          Segment Size
A(63~62) = 00₂                     1   xkuseg        0x0000 0000 0000 0000 through 0x0000 00FF FFFF FFFF    1 Tbyte (2**40 bytes)
A(63~62) = 01₂                     1   xksseg        0x4000 0000 0000 0000 through 0x4000 00FF FFFF FFFF    1 Tbyte (2**40 bytes)
A(63~62) = 10₂                     1   xkphys        0x8000 0000 0000 0000 through 0xBFFF FFFF FFFF FFFF    8*2**32 bytes
A(63~62) = 11₂                     1   xkseg         0xC000 0000 0000 0000 through 0xC000 00FF 7FFF FFFF    2**40 - 2**31 bytes
A(63~62) = 11₂, A(61~31) = -1      1   ckseg0        0xFFFF FFFF 8000 0000 through 0xFFFF FFFF 9FFF FFFF    512 Mbytes (2**29 bytes)
A(63~62) = 11₂, A(61~31) = -1      1   ckseg1        0xFFFF FFFF A000 0000 through 0xFFFF FFFF BFFF FFFF    512 Mbytes (2**29 bytes)
A(63~62) = 11₂, A(61~31) = -1      1   cksseg        0xFFFF FFFF C000 0000 through 0xFFFF FFFF DFFF FFFF    512 Mbytes (2**29 bytes)
A(63~62) = 11₂, A(61~31) = -1      1   ckseg3        0xFFFF FFFF E000 0000 through 0xFFFF FFFF FFFF FFFF    512 Mbytes - 4 Mbytes
                                   1   (Reserved)    0xFFFF FFFF FF00 0000 through 0xFFFF FFFF FF3F FFFF    4 Mbytes
64-bit Kernel Mode, User Space (xkuseg)
In Kernel mode, when KX = 1 in the Status register and bits 63~62 of the 64-bit
virtual address are 00₂, the xkuseg virtual address space is selected; it covers the
current user address space. The virtual address is extended with the contents of the
8-bit ASID field to form a unique virtual address.
When ERL = 1 in the Status register, the user address region becomes a 2**31-byte
unmapped (that is, mapped directly to physical addresses) uncached address space.
64-bit Kernel Mode, Current Supervisor Space (xksseg)
In Kernel mode, when KX = 1 in the Status register and bits 63~62 of the 64-bit
virtual address are 01₂, the xksseg virtual address space is selected; it is the current
supervisor virtual space. The virtual address is extended with the contents of the 8-bit
ASID field to form a unique virtual address.
64-bit Kernel Mode, Physical Spaces (xkphys)
In Kernel mode, when KX = 1 in the Status register and bits 63~62 of the 64-bit
virtual address are 10₂, one of the two unmapped xkphys address spaces is selected,
either cached or uncached. Accesses with address bits 58~36 not equal to 0 cause an
address error.
References to this space are not mapped; the physical address selected is taken
from bits 35~0 of the virtual address. Bits 61~59 of the virtual address specify the
cacheability and coherency attributes, as shown in Table 8-5.
Table 8-5 Cacheability and Coherency Attributes
Value (61~59)  Cacheability and Coherency Attributes                        Starting Address
0              Cacheable, non-coherent, write-through, no write allocate    0x8000 0000 0000 0000
1              Cacheable, non-coherent, write-through, write allocate       0x8800 0000 0000 0000
2              Uncached                                                     0x9000 0000 0000 0000
3              Cacheable, non-coherent, write-back                          0x9800 0000 0000 0000
4-7            Reserved                                                     0xA000 0000 0000 0000
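The xkphys decoding rules can be illustrated with the sketch below. The type and function names are hypothetical; the bit fields follow the text above (bits 63~62 = 10, attribute in bits 61~59, bits 58~36 must be 0, physical address in bits 35~0). Reserved attribute values 4 to 7 are returned as-is rather than being treated as errors, since the text does not state how they are handled.

```c
#include <stdint.h>
#include <stdbool.h>

/* Result of decoding an xkphys virtual address (hypothetical type). */
typedef struct {
    bool     address_error;  /* not xkphys, or bits 58~36 not equal to 0 */
    unsigned cache_attr;     /* bits 61~59, see Table 8-5 */
    uint64_t phys;           /* bits 35~0 */
} xkphys_ref;

static xkphys_ref decode_xkphys(uint64_t va)
{
    xkphys_ref r;
    bool in_xkphys      = (va >> 62) == 2;                 /* bits 63~62 = 10 */
    bool bits_58_36_set = ((va >> 36) & 0x7FFFFFu) != 0;   /* 23 bits */

    r.address_error = !in_xkphys || bits_58_36_set;
    r.cache_attr    = (unsigned)((va >> 59) & 7u);
    r.phys          = va & ((1ULL << 36) - 1);
    return r;
}
```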
64-bit Kernel Mode, Kernel Space (xkseg)
In Kernel mode, when KX = 1 in the Status register and bits 63~62 of the 64-bit
virtual address are 11₂, the address space selected is one of the following:
kernel virtual space, xkseg, the current kernel virtual space; the virtual address
is extended with the contents of the 8-bit ASID field to form a unique virtual
address
one of the four 32-bit kernel compatibility spaces, as described in the next section.
64-bit Kernel Mode, Compatibility Spaces (ckseg1~0, cksseg, ckseg3)
In Kernel mode, when KX = 1 in the Status register, bits 63~62 of the 64-bit virtual
address are 11₂, and bits 61~31 of the virtual address equal -1, bits 30~29 of the
address, as shown in Figure 8-6, select one of the following 512-Mbyte compatibility
spaces.
ckseg0. This 64-bit virtual address space is an unmapped region, compatible with
the 32-bit address model kseg0. The K0 field of the Config register,
described in this chapter, controls cacheability and coherency.
ckseg1. This 64-bit virtual address space is an unmapped and uncached region,
compatible with the 32-bit address model kseg1.
cksseg. This 64-bit virtual address space is the current supervisor virtual space,
compatible with the 32-bit address model ksseg.
ckseg3. This 64-bit virtual address space is kernel virtual space, compatible with
the 32-bit address model kseg3.
8.4 Translation Lookaside Buffer
8.4.1 Joint TLB
The TX49 has a fully associative TLB which maps 48 pairs (odd/even entry) of virtual
pages to their corresponding physical addresses.
8.4.2 TLB Entry format
32-bit addressing
127~121: 0   120~109: MASK   108~96: 0
95~77: VPN2   76: G   75~72: 0   71~64: ASID
63~62: 0   61~38: PFN   37~35: C   34: D   33: V   32: 0
31~30: 0   29~6: PFN   5~3: C   2: D   1: V   0: 0
64-bit addressing
255~217: 0   216~205: MASK   204~192: 0
191~190: R   189~168: 0   167~141: VPN2   140: G   139~136: 0   135~128: ASID
127~94: 0   93~70: PFN   69~67: C   66: D   65: V   64: 0
63~30: 0   29~6: PFN   5~3: C   2: D   1: V   0: 0
MASK : Page comparison mask. This field sets the variable page size for each TLB entry.
VPN2 : Virtual page number divided by two (maps to two pages)
ASID : Address space ID field.
R : Region. (00: user, 01: supervisor, 11: kernel) used to match Vaddr63~62.
PFN : Page frame number; upper bits of the physical address.
C : Specifies the cache algorithm to be used (see the “C” field of the EntryLo0, 1).
D : Dirty. If this bit is set, the page is marked as dirty and is therefore writable. This bit is
actually a write-protect bit that software can use to prevent alteration of data.
V : Valid. If this bit is set, it indicates that the TLB entry is valid. If a cache hit occurs
through a TLB entry when this bit is cleared, a TLB invalid exception occurs.
G : Global. If this bit is set in both Lo0 and Lo1, then ignore the ASID during TLB lookup.
0 : Reserved. Returns zeroes when read.
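As an illustration of the lowest 32-bit word of the entry format above (PFN in bits 29~6, C in bits 5~3, D in bit 2, V in bit 1), the following sketch unpacks those fields. The type and function names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* Decoded PFN/C/D/V word of a TLB entry (hypothetical type). */
typedef struct {
    uint32_t pfn;  /* page frame number, 24 bits */
    unsigned c;    /* cache algorithm, 3 bits    */
    bool     d;    /* dirty (page is writable)   */
    bool     v;    /* valid                      */
} tlb_lo;

static tlb_lo decode_tlb_lo(uint32_t word)
{
    tlb_lo e;
    e.pfn = (word >> 6) & 0xFFFFFFu;   /* bits 29~6 */
    e.c   = (word >> 3) & 7u;          /* bits 5~3  */
    e.d   = ((word >> 2) & 1u) != 0;   /* bit 2     */
    e.v   = ((word >> 1) & 1u) != 0;   /* bit 1     */
    return e;
}
```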
8.4.3 Instruction-TLB
The TX49 has a 2-entry instruction TLB (ITLB). Each ITLB entry is a subset of any
single JTLB entry. The ITLB is completely invisible to software.
8.4.4 Data-TLB
The TX49 has a 4-entry data TLB (DTLB). Each DTLB entry is a subset of any single
JTLB entry. The DTLB is completely invisible to software.
8.5 Virtual-to-Physical Address Translation Process
During virtual-to-physical address translation, the CPU compares the 8-bit ASID (if the
Global bit, G, is not set) of the virtual address to the ASID of the TLB entry to see if there is a
match. One of the following comparisons is also made:
In 32-bit mode, the highest 7 to 19 bits (depending upon the page size) of the virtual
address are compared to the contents of the TLB VPN2 (virtual page number divided
by two).
In 64-bit mode, the highest 15 to 27 bits (depending upon the page size) of the virtual
address are compared to the contents of the TLB VPN2 (virtual page number divided
by two).
If a TLB entry matches, the physical address and access control bits (C, D, and V) are
retrieved from the matching TLB entry. While the V bit of the entry must be set for a valid
translation to take place, it is not involved in the determination of a matching TLB entry.
Figure 8-8 illustrates the TLB address translation process.
[Figure 8-8 is a flowchart. Starting from the virtual address (input), it checks in turn:
User/Supervisor mode and legal address (otherwise an Address Error exception); mapped
address; VPN match; Global bit G = 1 or ASID match (otherwise a TLB Refill exception for a
32-bit address, or an XTLB Refill exception); V = 1 (otherwise a TLB Invalid exception);
for a write, D = 1 (otherwise a TLB Mod exception); and finally whether the page is
uncached (access main memory) or cached (access cache), producing the physical address
(output). For valid address space, see the section describing Operating Modes in this
chapter.]
Figure 8-8 TLB Address Translation
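The matching rules described above can be sketched in C as follows. This is only an illustrative model, not the hardware implementation: the structure layout, the names tlb_entry_t and tlb_check, and the result codes are assumptions made for this example. The page mask is ignored and each entry is treated as mapping a single page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified view of one TLB entry (hypothetical layout for illustration). */
typedef struct {
    uint64_t vpn2;  /* virtual page number divided by two */
    uint8_t  asid;  /* address space ID */
    bool     g;     /* Global: ASID is ignored when set */
    bool     v;     /* Valid */
    bool     d;     /* Dirty (page is writable) */
} tlb_entry_t;

typedef enum { XLATE_OK, XLATE_REFILL, XLATE_INVALID, XLATE_MOD } xlate_result_t;

/* Returns which outcome the translation of one reference would produce. */
xlate_result_t tlb_check(const tlb_entry_t *tlb, int n,
                         uint64_t vpn2, uint8_t asid, bool is_write)
{
    assert(tlb != NULL && n >= 0);
    for (int i = 0; i < n; i++) {
        const tlb_entry_t *e = &tlb[i];
        if (e->vpn2 != vpn2)
            continue;                   /* VPN2 does not match */
        if (!e->g && e->asid != asid)
            continue;                   /* ASID compared only when G is clear */
        /* The entry matches. V is examined only after the match. */
        if (!e->v)
            return XLATE_INVALID;       /* TLB invalid exception */
        if (is_write && !e->d)
            return XLATE_MOD;           /* TLB modification exception */
        return XLATE_OK;
    }
    return XLATE_REFILL;                /* no matching entry: TLB refill */
}
```

Note that the V bit takes no part in deciding whether an entry matches, so a matching entry with V cleared yields XLATE_INVALID rather than XLATE_REFILL.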
TLB Misses
If there is no TLB entry that matches the virtual address, a TLB refill exception occurs.
(TLB refill exceptions are described in Chapter 11.) If the access control bits (D and V)
indicate that the access is not valid, a TLB modification or TLB invalid exception occurs.
If the C bits equal 010 (binary), the physical address that is retrieved accesses main memory,
bypassing the cache.
TLB Instructions
Table 8-6 lists the instructions that the CPU provides for working with the TLB. See
Appendix A for a detailed description of these instructions.
Table 8-6 TLB Instructions
Op Code Description of Instruction
TLBP Translation Lookaside Buffer Probe
TLBR Translation Lookaside Buffer Read
TLBWI Translation Lookaside Buffer Write Index
TLBWR Translation Lookaside Buffer Write Random
9. Cache Organization
9.1 Introduction
This chapter describes the cache memory of TX49. This processor has two on-chip primary
caches for instruction and data. Both caches are configured as either 8 K-byte, 16 K-byte or 32
K-byte in size.
9.2 Instruction Cache (I-Cache)
The TX49 primary I-cache has the following characteristics:
Cache size: 8 KB/ 16 KB/ 32 KB (fixed for each product)
Four-way set associative
FIFO replacement
Indexed with a virtual address
Checked with a physical tag
Block (line) size: 8 words (32 bytes)
Burst refill size: 8 words (32 bytes)
Lockable on a per-line basis (way1, way2 and way3)
All valid bits, lock and FIFO bits are cleared by a Reset exception
9.2.1 Instruction Cache Address Field
Figure 9-1 shows the instruction cache address field. When the 4-KB page size is used with
the 32 KB instruction cache, bit 12 of the physical address and bit 12 of the virtual
address must have the same value.
Cache size  Physical Tag           Cache Tag Index      Word                Byte
8 KB        bits 35-11 (25 bits)   bits 10-5 (6 bits)   bits 4-3 (2 bits)   bits 2-0 (3 bits)
16 KB       bits 35-12 (24 bits)   bits 11-5 (7 bits)   bits 4-3 (2 bits)   bits 2-0 (3 bits)
32 KB       bits 35-12 (24 bits)   bits 12-5 (8 bits)   bits 4-3 (2 bits)   bits 2-0 (3 bits)
Figure 9-1 Instruction Cache Address Field
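As an illustration of the address field, the following C sketch splits a virtual address into the cache index, word and byte fields for each cache size. The type and function names are assumptions made for this example; the field widths follow from a four-way cache with 32-byte lines.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of the cache address (names are hypothetical). */
typedef struct {
    uint32_t index;  /* set index within a way */
    uint32_t word;   /* address bits 4-3 */
    uint32_t byte;   /* address bits 2-0 */
} cache_addr_t;

/* Number of index bits: 6 for 8 KB, 7 for 16 KB, 8 for 32 KB. */
static unsigned index_bits(unsigned cache_kb)
{
    assert(cache_kb == 8 || cache_kb == 16 || cache_kb == 32);
    return cache_kb == 8 ? 6u : cache_kb == 16 ? 7u : 8u;
}

cache_addr_t split_cache_addr(uint64_t vaddr, unsigned cache_kb)
{
    cache_addr_t a;
    /* The index starts at bit 5, just above the 32-byte line offset. */
    a.index = (uint32_t)((vaddr >> 5) & ((1u << index_bits(cache_kb)) - 1u));
    a.word  = (uint32_t)((vaddr >> 3) & 0x3u);
    a.byte  = (uint32_t)(vaddr & 0x7u);
    return a;
}
```

With the 32 KB cache the index reaches bit 12, which lies above the 4-KB page offset. This is why bit 12 of the virtual and physical addresses must agree in that configuration.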
9.2.2 Instruction Cache Configuration
Each line in the 4 ways of the instruction cache shares FIFO replacement bits. Figure
9-2 shows the format of the replacement bits. These bits are shared by way0, way1, way2 and
way3 for the 8 KB/ 16 KB/ 32 KB cache, and indicate the next set to which replacement will be
directed; when a Lock bit is set to 1, they indicate a set that is not locked.
Each line of instruction cache data has an associated 27-bit (8 KB)/26-bit (16 KB/32 KB)
tag that contains a 25-bit (8 KB)/24-bit (16 KB/32 KB) physical address, a single Lock bit
and a single Valid bit, except for the line in way0, which has a 26-bit (8 KB)/25-bit
(16 KB/32 KB) tag that excludes the Lock bit. Figure 9-3 shows the formats of the tag and data
pair.
Bit 1: F1 | Bit 0: F0
F0: FIFO replace bit 0
F1: FIFO replace bit 1
Figure 9-2 Format of Replacement Bits
Format for way0 (8 KB):
  Tag bit 25: V | bits 24-0: PTag | followed by four 64-bit Data fields (bits 63-0 each)
Format for way0 (16 KB/32 KB):
  Tag bit 24: V | bits 23-0: PTag | followed by four 64-bit Data fields (bits 63-0 each)
Format for way1, 2 and 3 (8 KB):
  Tag bit 26: L | bit 25: V | bits 24-0: PTag | followed by four 64-bit Data fields (bits 63-0 each)
Format for way1, 2 and 3 (16 KB/32 KB):
  Tag bit 25: L | bit 24: V | bits 23-0: PTag | followed by four 64-bit Data fields (bits 63-0 each)
L: Lock bit (1: enable, 0: disable)
V: Valid bit (1: valid, 0: invalid)
PTag: Physical tag (bits 35~12 of the physical address)
Data: Instruction cache data
Figure 9-3 Format of Tag and Data Pair for I-cache
9.3 Data Cache
The TX49 primary D-cache has the following characteristics:
Cache size: 8 KB/ 16 KB/ 32 KB (fixed for each product)
Four-way set associative
FIFO replacement
Indexed with a virtual address
Checked with a physical tag
Block (line) size: 8 words (32 bytes)
Burst refill size: 8 words (32 bytes)
Lockable on a per-line basis (way1, way2 and way3)
Store buffer
Selectable write-back and write-through on a per-page basis
All W, CS, FIFO and Lock bits are cleared by a Reset exception
9.3.1 Data Cache Address Field
Figure 9-4 shows the data cache address field. When the 4-KB page size is used with the
32 KB data cache, bit 12 of the physical address and bit 12 of the virtual address must
have the same value.
Cache size  Physical Tag           Cache Tag Index      Word                Byte
8 KB        bits 35-11 (25 bits)   bits 10-5 (6 bits)   bits 4-3 (2 bits)   bits 2-0 (3 bits)
16 KB       bits 35-12 (24 bits)   bits 11-5 (7 bits)   bits 4-3 (2 bits)   bits 2-0 (3 bits)
32 KB       bits 35-12 (24 bits)   bits 12-5 (8 bits)   bits 4-3 (2 bits)   bits 2-0 (3 bits)
Figure 9-4 Data Cache Address Field
9.3.2 Data Cache Configuration
Each line in the 4 ways of the data cache shares the F1, F0 replacement bits. Figure 9-5
shows the format of the replacement bits. These bits are shared by way0, way1, way2 and
way3 for the 8 KB/ 16 KB/ 32 KB cache, and indicate the next set to which replacement will be
directed; when a Lock bit is set to 1, they indicate a set that is not locked.
Each line of data cache data has an associated 29-bit/28-bit tag that contains a
25-bit/24-bit physical address, a single Lock bit, a single write-back bit and a 2-bit cache
state, except for the line in way0, which has a 28-bit/27-bit tag that excludes the Lock bit.
Figure 9-6 shows the formats of the tag and data pair.
Bit 1: F1 | Bit 0: F0
F0: FIFO replace bit 0
F1: FIFO replace bit 1
Figure 9-5 Format of Replacement Bits
Format for way0 (8 KB):
  Tag bit 27: W | bits 26-25: CS | bits 24-0: PTag | followed by four 64-bit Data fields (bits 63-0 each)
Format for way0 (16 KB/ 32 KB):
  Tag bit 26: W | bits 25-24: CS | bits 23-0: PTag | followed by four 64-bit Data fields (bits 63-0 each)
Format for way1, 2 and 3 (8 KB):
  Tag bit 28: L | bit 27: W | bits 26-25: CS | bits 24-0: PTag | followed by four 64-bit Data fields (bits 63-0 each)
Format for way1, 2 and 3 (16 KB/ 32 KB):
  Tag bit 27: L | bit 26: W | bits 25-24: CS | bits 23-0: PTag | followed by four 64-bit Data fields (bits 63-0 each)
L: Lock bit (1: enable, 0: disable)
W: Write-back bit (set if the cache line has been written)
CS: Primary cache state
(0: Invalid, 1: Reserved, 2: Reserved, 3: Valid)
PTag: Physical tag (bits 35~12 of the physical address)
Data: Data cache data
Figure 9-6 Format of Tag and Data Pair for D-cache
In the TX49, the W (write-back) bit, not the cache state, indicates whether the primary
cache contains modified data that must be written back to memory. Only the states Invalid
and Valid are used to describe the cache line. That is, there is no hardware support for
cache coherency.
9.3.3 Data Cache Policies
The TX49 provides three write policy options for the data cache: two write-through
modes and one write-back mode. Selection of a write policy is done by the K0 bit in the
Config register for the kseg0 segment and the C bit within each TLB entry for the other
segments. For a description of the K0 bit, see Table 7-15; for a description of the C bit, see
Table 7-3.
The write policy should not be changed once the cache is initialized; otherwise, the
contents of the data cache are not guaranteed.
a) Write-through modes (write allocate/no write allocate)
In write-through, the data is written to cache and to main memory at the same
time. On a cache store miss, write-through without write-allocate causes the data
to be sent only to main memory, whereas write-through with write-allocate
causes the relevant cache line to be replaced before the data is sent to the data cache
and main memory.
b) Write-back mode
In the write-back policy, a copy of the data is written to cache by the processor,
but not to main memory. The data will be written to main memory only if cache’s
copy is about to be replaced.
9.4 FIFO Replacement Algorithm
The TX49 uses the FIFO (first in, first out) policy when overwriting the blocks of data in its
instruction and data caches.
Typically, data items in way0, way1, way2 and way3 are replaced in this order.
The FIFO[1:0] bits do not point at locked and valid lines.
Invalid lines, if any, are replaced first.
The FIFO replacement bits are altered when external data is written to the cache or
via the CACHE instruction.
Figure 9-7 shows several examples of how the FIFO replacement bits change due to cache
line replacements.
A) Way0: Invalid | Way1: Invalid | Way2: Invalid | Way3: Invalid
B) Way0: Invalid | Way1: Invalid, Lock | Way2: Invalid | Way3: Invalid
C) Way0: Invalid | Way1: Invalid | Way2: Valid | Way3: Invalid
D) Way0: Invalid | Way1: Valid, Lock | Way2: Invalid | Way3: Valid
E) Way0: Invalid | Way1: Valid, Lock | Way2: Valid, Lock | Way3: Valid, Lock
F) Way0: Valid | Way1: Valid | Way2: Valid, Lock | Way3: Valid
Figure 9-7 FIFO Replacement Policy
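The replacement rules listed above can be modeled roughly as follows. This is a sketch under stated assumptions, not a description of the actual circuit: the names way_state_t and pick_victim are hypothetical, and the choice of the lowest-numbered invalid way when several are invalid is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* State of one way within a set (hypothetical type for illustration). */
typedef struct {
    bool valid;
    bool locked;
} way_state_t;

/* Returns the way (0..3) to replace next, or -1 if every way is valid
 * and locked. 'fifo' is the current value of the FIFO[1:0] bits. */
int pick_victim(const way_state_t ways[4], unsigned fifo)
{
    assert(fifo < 4);
    /* Invalid lines, if any, are replaced first. */
    for (int w = 0; w < 4; w++)
        if (!ways[w].valid)
            return w;
    /* Otherwise advance from the FIFO position, skipping lines that are
     * both valid and locked. */
    for (unsigned i = 0; i < 4; i++) {
        unsigned w = (fifo + i) % 4u;
        if (!ways[w].locked)
            return (int)w;
    }
    return -1;
}
```

Because way0 has no Lock bit, the -1 case should not arise on the real cache; it is kept only so that the function is total.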
9.5 Lock function
The lock function can be used to place critical instructions/data in one instruction/data cache
set; they are not replaced while the Lock bit is set.
9.5.1 Lock bit setting and clearing
Setting the Lock bit of a cache line enables the instruction/data cache lock function.
When the lock function is enabled, the instruction/data in the valid line is locked and
is never replaced. The set to be locked is the one pointed to by the FIFO bits. Instructions/data
refilled while the lock function is enabled are locked. When a store miss occurs for the
write-through data cache without write allocate, the store data is not written to the cache and
will therefore not be locked.
The lock function is disabled by clearing the Lock bit in each line.
To clear or set the Lock bit in the cache, the CACHE instructions (Index Store I-cache
/D-cache Tag) can be used, and to load the instructions/data into the cache from
memory, the CACHE instructions (Fill I-cache/D-cache) can be used (refer to the CACHE
instruction).
Clear the lock bit as follows when data written to a locked line should be stored in main
memory.
(1) Read the locked data from cache memory
(2) Clear the lock bit
(3) Store the data that was read
9.5.2 Operation During Lock
After the lock bit is set for a line, the line can be replaced only when its line state is
invalid. A locked valid line can never be replaced. The FIFO bits should point only to the set
of a locked invalid line or an unlocked line.
In Write Back mode, a write access to a locked valid line updates only the cache, not the
memory. In Write Through mode, both the cache and the memory are updated.
9.5.3 Example of Data Cache Locking
While data is being loaded into the locked line of the cache, all interrupts should be
disabled to avoid locking the wrong data.
To lock data cache lines, the following sequence of codes could be used.
....................... /* Disable the interrupt */
mtc0 t0, TagLo /* Load data into TagLo reg */
cache 2 (D), offset (base) /* Invalidate and lock line in desired set using
Index_Store_Tag cache instruction */
cache 7 (D), offset (base) /* Fill the cache line from desired memory location */
....................... /* Enable the interrupt */
9.5.4 Example of Instruction Cache Locking
To lock instruction cache lines, the following sequence of codes could be used:
....................... /* Disable the interrupt */
mtc0 t0, TagLo /* Load data into TagLo reg */
cache 2 (I), offset (base) /* Invalidate and lock line in desired set using
Index_Store_Tag cache instruction */
cache 5 (I), offset (base) /* Fill the cache line from desired memory location */
....................... /* Enable the interrupt */
9.6 The Primary Cache Accessing
Figure 9-8 shows the virtual address (VA) index into the primary cache. Each instruction and
data cache is 8 KB, 16 KB or 32 KB in size. The virtual address bits used to index into the
primary cache are determined by the cache size.
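[Figure 9-8 shows the tag line (W, State and Tag fields) and the 64-bit data line of the
primary cache, both indexed by the virtual address: VA(12:5) for 32 KB, VA(11:5) for 16 KB
and VA(10:5) for 8 KB.]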
Figure 9-8 Primary Cache Data and Tag Organization
9.7 Cache States
This section describes the state of a cache line. A cache line in the TX49 is in one of the
states described in Table 9-1.
The I-Cache line is in one of the following states:
invalid
valid
The D-Cache line is in one of the following states:
invalid
valid
Table 9-1 Cache States
Cache Line State  Description
Invalid           A cache line that does not contain valid information must be marked
                  invalid, and cannot be used. A cache line in any other state than invalid
                  is assumed to contain valid information.
Valid             A Valid cache line contains valid information. The cache line may or may not
                  be consistent with memory and is owned by the processor (see Cache
                  Line Ownership in this chapter).
9.8 Cache Line Ownership
The TX49 becomes the owner of a cache line after it writes to that cache line (that is, when
the line enters the Valid state), and is responsible for providing the contents of that line
on a read request. There can only be one owner for each cache line.
9.9 Cache Multi-Hit Operation
Operation of the TX49 is not guaranteed if a multi-hit occurs in the primary cache.
Thus, when locking a specific program or data in the primary cache, the program or data
must be used only after it has been locked in the cache by the Fill instruction.
9.10 Cache Test Function
9.10.1 Cache Disabling
The Config register bits ICE# (Instruction Cache Enable) and DCE# (Data Cache
Enable) are used to enable and disable the instruction and data cache, respectively.
When a cache is disabled, all cache accesses are misses and there is no refill (nor is
there any burst bus cycle; this is the same as accessing a non-cacheable area). The Valid
bit (V) or Cache State bit (CS) for each entry cannot be modified.
Notes:
When the instruction cache is disabled:
Every instruction fetch causes a cache miss, and external memory accesses are
performed using single-read bus cycles.
The CACHE instruction can still operate on the instruction cache.
Notes:
When the data cache is disabled:
Every load or store instruction causes a cache miss. Data cache refills are
disabled, and external memory accesses occur using single-read or single-write
transactions.
The CACHE instruction can still operate on the data cache.
Notes:
How to disable the instruction cache:
When disabling the instruction cache, instruction streaming should be
discontinued by placing a jump instruction following the MTC0 instruction.
Example: MTC0 Rn, Config (Set the ICE# bit to 1)
J L1 (Jump to L1 and disable instruction streaming)
NOP (Branch delay slot)
L1: CACHE IndexInvalidate, offset (base)
9.10.2 Cache Flushing
Both the instruction and data cache are flushed when a ColdReset/SoftReset exception
is raised (all valid bits are cleared to 0).
The instruction cache is flushed by the CACHE instruction Index_Invalidate
/Hit_Invalidate. The data cache is flushed by the CACHE instruction
IndexWriteBackInvalidate/HitInvalidate/HitWriteBackInvalidate.
The processor writes the cache line back to main memory during the execution of Index
Writeback Invalidate, Hit Writeback Invalidate or Hit Writeback CACHE instruction or
when the modified cache line is replaced. In write-back mode, software is responsible for
ensuring cache coherency.
10. Write Buffer
The TX49 contains a write buffer to improve the performance of writes to the external memory.
Every write to external memory uses this on-chip write buffer. The write buffer holds up to four
64-bit address and data pairs.
For a cache miss write-back, the entire buffer is used for the write-back data and allows the
processor to proceed in parallel with the memory update. For uncached and write-through stores,
the write buffer uncouples the CPU from the write to memory. If the write buffer is full,
additional stores will stall until there is room for them in the write buffer.
The TX49 processor core might issue a read request while the write buffer is performing a
write operation. Multiple read/write operations are serviced in the following order:
If there is only a write request, the data in the write buffer is written to an external
device.
If there is only a read request, a read operation is performed to bring in data from an
external device.
If a read request and a write request occur simultaneously, the read request is
serviced first, except for the following cases:
when the processor issues a read request to the target address of one of the write
buffer entries
when the processor issues an uncacheable read reference while the write buffer
has uncacheable write data
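The ordering rules above can be summarized by the following C sketch. It is an illustrative model only: the function and parameter names are hypothetical, and it assumes that in the two exceptional cases the pending write is serviced before the read.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { SERVICE_NONE, SERVICE_READ, SERVICE_WRITE } service_t;

/* Decides which request is serviced next on the external bus. */
service_t next_bus_op(bool read_req, bool write_req,
                      bool read_hits_wbuf_addr,
                      bool uncached_read, bool wbuf_has_uncached_write)
{
    /* A read can only match a write buffer entry if a write is pending. */
    assert(!read_hits_wbuf_addr || write_req);

    if (!read_req && !write_req)
        return SERVICE_NONE;
    if (!read_req)
        return SERVICE_WRITE;   /* only a write request */
    if (!write_req)
        return SERVICE_READ;    /* only a read request */

    /* Both pending: the read goes first except in the two listed cases. */
    if (read_hits_wbuf_addr)
        return SERVICE_WRITE;   /* read targets a write buffer address */
    if (uncached_read && wbuf_has_uncached_write)
        return SERVICE_WRITE;   /* uncached read behind uncached write */
    return SERVICE_READ;
}
```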
The BC0T and BC0F instructions can be used to determine whether any data is present in the
write buffer:
If there is data in the write buffer, the coprocessor condition signal is false (0).
If there is no data in the write buffer, the coprocessor condition signal is true (1).
Following is the assembly language code to freeze the processor until the write buffer becomes
empty.
SW
NOP
NOP
Loop: BC0F Loop
NOP
The following sequence of instructions also causes the TX49 to perform the same action.
Appended to a store instruction, the SYNC instruction ensures that the store instruction
initiated prior to this instruction is completed before any instruction after this instruction is
allowed to start.
SW
SYNC
11. CPU Exception
11.1 Introduction
This chapter describes CPU exception processing. The chapter concludes
with a description of each exception’s cause, together with the manner in which the CPU
processes and services these exceptions.
11.2 Exception Vector Locations
Exception vector addresses are stored in an area of kseg0 or kseg1 except for Debug
exception vector. The vector address of the ColdReset, SoftReset and NMI exception is always
in a non-cacheable area of kseg1. Vector addresses of the other exceptions depend on the BEV
bit of Status register. When BEV is 0, these exceptions are vectored to a cacheable area of
kseg0. When BEV is 1, all vector addresses are in a non-cacheable area of kseg1.
Table 11-1 shows the list of the exception vector locations.
Table 11-1 Exception Vector Locations
Exception                    TX49 Vector Address (virtual address)
                             (BEV = 0)               (BEV = 1)
ColdReset, SoftReset, NMI    0xffff_ffff_bfc0_0000   0xffff_ffff_bfc0_0000
TLB refill, EXL = 0          0xffff_ffff_8000_0000   0xffff_ffff_bfc0_0200
XTLB refill, EXL = 0
(X = 64 bit TLB)             0xffff_ffff_8000_0080   0xffff_ffff_bfc0_0280
Others (common exception)    0xffff_ffff_8000_0180   0xffff_ffff_bfc0_0380
Exception                    TX49 Vector Address (physical address)
                             (BEV = 0)               (BEV = 1)
ColdReset, SoftReset, NMI    0x0_1fc0_0000           0x0_1fc0_0000
TLB refill, EXL = 0          0x0_0000_0000           0x0_1fc0_0200
XTLB refill, EXL = 0
(X = 64 bit TLB)             0x0_0000_0080           0x0_1fc0_0280
Others (common exception)    0x0_0000_0180           0x0_1fc0_0380
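The table can be expressed as a small lookup, sketched below in C. The names exc_class_t, exception_vector and unmapped_phys are assumptions made for this example. The physical address is obtained by keeping the low 29 bits of the virtual address, which reproduces the physical addresses in Table 11-1 for these unmapped segments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Rows of Table 11-1 (hypothetical enum for illustration). The refill
 * rows apply when EXL = 0. */
typedef enum {
    EXC_RESET_NMI,    /* ColdReset, SoftReset, NMI */
    EXC_TLB_REFILL,
    EXC_XTLB_REFILL,
    EXC_OTHER         /* common exception vector */
} exc_class_t;

/* Returns the 64-bit virtual vector address for the given BEV setting. */
uint64_t exception_vector(exc_class_t cls, bool bev)
{
    const uint64_t base_kseg1 = 0xffffffffbfc00000ULL; /* non-cacheable */
    const uint64_t base_kseg0 = 0xffffffff80000000ULL; /* cacheable */

    switch (cls) {
    case EXC_RESET_NMI:   return base_kseg1;
    case EXC_TLB_REFILL:  return bev ? base_kseg1 + 0x200 : base_kseg0 + 0x000;
    case EXC_XTLB_REFILL: return bev ? base_kseg1 + 0x280 : base_kseg0 + 0x080;
    case EXC_OTHER:       return bev ? base_kseg1 + 0x380 : base_kseg0 + 0x180;
    }
    assert(!"unknown exception class");
    return 0;
}

/* kseg0/kseg1 virtual address to physical address. */
uint64_t unmapped_phys(uint64_t vaddr)
{
    return vaddr & 0x1fffffffULL;
}
```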
The cache error exception does not occur because the TX49 has no parity bits in
the primary cache. The Debug exception requires care because it has a special vector address. (See 14.9.5)
Table 11-2 shows the list of the debug exception vector locations.
Table 11-2 Debug Exception Vector Locations
Exception   TX49 Debug Exception Vector Address (virtual address)
            (ProbEnb = 0)            (ProbEnb = 1)
Debug       0xffff_ffff_bfc0_0400    0xffff_ffff_ff20_0200
Exception   TX49 Debug Exception Vector Address (physical address)
            (ProbEnb = 0)            (ProbEnb = 1)
Debug       0x0_1fc0_0400            0xf_ff20_0200
11.3 Priority of Exception
More than one exception may be raised for the same instruction, in which case only the
exception with the highest priority is reported. The TX49 Processor Core instruction
exception priority is shown in Table 11-3.
Table 11-3 Priority of Exception
Priority  Exception                                          Mnemonic
High      Cold Reset
          Soft Reset
          NMI
          Address error - Inst. Fetch                        AdEL
          TLB refill - Inst. Fetch                           TLBL
          TLB invalid - Inst. Fetch                          TLBL
          Bus error - Inst. Fetch                            IBE
          Integer overflow, Trap, System Call, Breakpoint,   Ov, Tr, Sys,
          Reserved Instruction, Coprocessor Unusable, or     Bp, RI, CpU,
          Floating-Point Exception                           FPE
          Address error - Data access                        AdEL/AdES
          TLB refill - Data access                           TLBL/TLBS
          TLB invalid - Data access                          TLBL/TLBS
          TLB modified - Data write                          Mod
          Bus error - Data access                            DBE
Low       Interrupt                                          Int
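As a rough illustration of how the table is applied, the C sketch below picks the highest-priority exception from a set of pending ones. The enumeration names and the bitmask encoding are assumptions for this example; only the ordering is taken from Table 11-3.

```c
#include <assert.h>

/* Rows of Table 11-3, from highest to lowest priority (names are
 * hypothetical). */
typedef enum {
    PRI_COLD_RESET,
    PRI_SOFT_RESET,
    PRI_NMI,
    PRI_ADEL_FETCH,
    PRI_TLB_REFILL_FETCH,
    PRI_TLB_INVALID_FETCH,
    PRI_IBE,
    PRI_INSTR_GROUP,        /* Ov, Tr, Sys, Bp, RI, CpU, FPE */
    PRI_ADE_DATA,
    PRI_TLB_REFILL_DATA,
    PRI_TLB_INVALID_DATA,
    PRI_TLB_MOD,
    PRI_DBE,
    PRI_INTERRUPT,
    PRI_COUNT
} exc_priority_t;

/* 'pending' has bit n set when the exception with enum value n is
 * pending. Returns the one to report, or PRI_COUNT if none is pending. */
exc_priority_t highest_pending(unsigned pending)
{
    assert((pending >> PRI_COUNT) == 0);
    for (int p = 0; p < PRI_COUNT; p++)
        if (pending & (1u << p))
            return (exc_priority_t)p;
    return PRI_COUNT;
}
```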
General exceptions (i.e., exceptions other than debug exceptions) are prioritized as follows:
1. If more than one exception condition occurs for a single instruction, only the exception
with the highest priority is reported, as shown in Table 11-3 (from highest to lowest
priority).
2. If two instructions cause exception conditions in the M and E stages of the pipeline
simultaneously, the instruction in the M stage causes the processor to take an
exception.
3. When 64-bit instructions are executed in 32-bit mode, the Reserved Instruction (RI)
exception can occur simultaneously with another exception, as shown below. In that case,
the RI exception is given precedence.
RI and CpU
RI and Ov
RI and AdEL/S (data)
RI and TLBL/S (data)
General and debug exceptions are prioritized as follows:
1. If a general exception condition and a debug exception condition occur for a single
instruction, the debug exception is serviced first, and then the general exception is
serviced.
2. If two instructions cause exception conditions in the M and E stages of the pipeline
simultaneously, only the instruction in the M stage generates an exception.
For details on debug exceptions, see Section 14.9.
11.4 ColdReset Exception
11.4.1 Cause
This ColdReset exception occurs when the GCOLDRESET* signal is asserted and then
deasserted. This exception is not maskable.
11.4.2 Processing
A special interrupt vector that resides in an unmapped and uncached area is used. It is
therefore not necessary for hardware to initialize TLB and cache memory in order to
process this exception. The vector location of this exception is:
In 32-bit mode, 0xbfc0_0000 (virtual address), 0x0_1fc0_0000 (physical address)
In 64-bit mode, 0xffff_ffff_bfc0_0000 (virtual address), 0x0_1fc0_0000 (physical
address)
The contents of most registers are cleared when this exception occurs. The values of
these bits are listed in the tables of Section 7.
Valid bits, Lock bits and FIFO replacement bits in the instruction cache are all cleared
to 0. W bits, CS bits, Lock bits and FIFO replacement bits in the data cache are all cleared
to 0.
If a ColdReset exception occurs during bus cycle, the current bus cycle is aborted and an
exception is taken.
11.4.3 Servicing
The ColdReset exception is serviced by;
initializing all registers, coprocessor registers, caches and the memory system
performing diagnostic tests
bootstrapping the operating system
11.5 SoftReset Exception
11.5.1 Cause
This SoftReset exception occurs when the GRESET* signal is asserted and then
deasserted. This exception is not maskable.
11.5.2 Processing
A special interrupt vector that resides in an unmapped and uncached area is used. It is
therefore not necessary for hardware to initialize TLB and cache memory in order to
process this exception. The vector location of this exception is:
In 32-bit mode, 0xbfc0_0000 (virtual address), 0x0_1fc0_0000 (physical address)
In 64-bit mode, 0xffff_ffff_bfc0_0000 (virtual address), 0x0_1fc0_0000 (physical
address)
All register contents are retained except for the following.
ErrorEPC register, which contains the restart PC
ERL, SR and BEV bits of Status register, which are set to “1”
Because Soft-reset exception can abort cache and bus operations, cache and memory
state is undefined when this exception occurs.
11.5.3 Servicing
The SoftReset exception is serviced by saving the current processor state for diagnostic
purposes, and reinitializing for the ColdReset exception.
11.6 NMI (Non-maskable Interrupt) Exception
11.6.1 Cause
The NMI (Non-maskable Interrupt) exception occurs at the falling edge of the GNMI*
signal. This interrupt is not maskable, and occurs regardless of the EXL, ERL and IE bits
of the Status register.
11.6.2 Processing
The same special interrupt vector as for Cold-reset/Soft-reset exception (0xbfc0_0000/
0xffff_ffff_bfc0_0000) is used. This vector is located within an unmapped and uncached area so that
the cache and TLB need not be initialized to process this exception. When this exception
occurs, the SR bit of the Status register is set.
Because the NMI exception can occur in the midst of another exception, it is not normally
possible to continue program execution after servicing the NMI exception.
Unlike the Cold-reset/Soft-reset exception, but like other exceptions, this exception
occurs at an instruction boundary. The state of the primary cache and memory system
is preserved by this exception.
All register contents are retained except for the following.
ErrorEPC register, which contains the restart PC
If the exception-causing instruction is in a branch delay slot, the ErrorEPC
register points at the preceding branch instruction, and the BD bit of the Cause
register is set as an indication.
ERL, SR and BEV bits of the Status register, which are set to 1.
11.6.3 Servicing
The NMI exception is serviced by saving the current processor state for diagnostic
purposes, and reinitializing the system for the ColdReset exception.
11.7 Address Error Exception
11.7.1 Cause
The Address Error exception occurs when an attempt is made to execute one of the
following.
load or store a doubleword that is not aligned on a doubleword boundary
load, fetch or store a word that is not aligned on a word boundary
load or store a halfword that is not aligned on a halfword boundary
reference Kernel mode address while in User or Supervisor mode
reference Supervisor mode address while in User mode
This exception is not maskable.
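The alignment conditions listed above amount to checking the low-order address bits against the access size. The following C sketch shows that check; the function name is hypothetical, and the mode-based conditions (Kernel or Supervisor addresses referenced from a less privileged mode) are not modeled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true if an access of size_bytes (1, 2, 4 or 8) at vaddr is not
 * aligned on its natural boundary. */
bool is_misaligned(uint64_t vaddr, unsigned size_bytes)
{
    assert(size_bytes == 1 || size_bytes == 2 ||
           size_bytes == 4 || size_bytes == 8);
    /* For a power-of-two size, the low bits must all be zero. */
    return (vaddr & (uint64_t)(size_bytes - 1u)) != 0;
}
```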
11.7.2 Processing
The common exception vector is used. ExcCode AdEL or AdES in the Cause register is set
depending on whether the memory access attempt was a load or store. When this
exception is raised, the misaligned virtual address causing the exception, or the protected
virtual address that was illegally referenced, is placed in the BadVAddr register. The
contents of the VPN field of the Context and EntryHi registers are undefined, as are the
contents of the EntryLo register.
The following applies only when the EXL bit of the Status register is 0. The EPC
register points to the address of the instruction causing the exception. If, however, the
affected instruction was in a branch delay slot, the address of the
immediately preceding branch instruction is retained in the EPC register and the BD bit
of the Cause register is set to “1”.
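The EPC and BD update described here can be sketched in C as follows. The structure and function names are assumptions made for this example, and setting EXL on entry to the exception is standard MIPS behavior rather than something stated in this section. The preceding branch is taken to be 4 bytes before the delay slot instruction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal model of the state involved (hypothetical type). */
typedef struct {
    uint64_t epc;  /* EPC register */
    bool     bd;   /* BD bit of the Cause register */
    bool     exl;  /* EXL bit of the Status register */
} exc_state_t;

/* Records the restart address for an exception raised by the instruction
 * at 'pc'. EPC and BD are updated only when EXL is 0. */
void record_exception_pc(exc_state_t *s, uint64_t pc, bool in_delay_slot)
{
    assert(s != NULL);
    if (!s->exl) {
        /* In a branch delay slot, EPC holds the preceding branch. */
        s->epc = in_delay_slot ? pc - 4 : pc;
        s->bd  = in_delay_slot;
    }
    s->exl = true;  /* assumption: EXL is set when the exception is taken */
}
```

A second exception raised while EXL is still set therefore leaves EPC and BD pointing at the first one.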
11.7.3 Servicing
The process executing at the time is handed a segmentation violation signal. This error
is usually fatal to the process incurring the exception.
11.8 TLB Refill Exception
11.8.1 Cause
The TLB refill exception occurs when there is no TLB entry to match a reference to a
mapped address. This exception is not maskable.
11.8.2 Processing
There are two special exception vectors for this exception: one for references to 32-bit
virtual addresses, and one for references to 64-bit virtual addresses. The KX, SX and UX bits
of the Status register determine whether the User, Supervisor or Kernel address referenced
is in 32-bit mode or 64-bit mode. When the EXL bit of the Status register is set to “0”, all
references use these vectors. When this exception occurs, the TLBL or TLBS code is set in the
ExcCode field of the Cause register. This code indicates whether the instruction, as shown by
the EPC register and the BD bit of the Cause register, caused the miss by an instruction reference,
load operation, or store operation.
When this exception occurs:
BadVAddr, Context, XContext and EntryHi registers hold the virtual address
that failed address translation
EntryHi register also contains the ASID from which the translation fault occurred
Random register contains a valid location in which to place the replacement
TLB entry
The contents of EntryLo register are undefined
The following applies only when the EXL bit of the Status register is 0. The EPC
register points to the address of the instruction causing the exception. If, however, the
affected instruction was in a branch delay slot, the address of the
immediately preceding branch instruction is retained in the EPC register and the BD bit
of the Cause register is set to “1”.
11.8.3 Servicing
To service this exception, the contents of the Context or XContext register are used as a
virtual address to fetch memory locations containing the physical page frame and access
control bits for a pair of TLB entries. The two entries are placed into the
EntryLo0/EntryLo1 register; the EntryHi and EntryLo registers are written into the TLB.
It is possible that the virtual address used to obtain the physical address and access
control information is on a page that is not resident in the TLB. This condition is
processed by allowing a TLB refill exception in the TLB refill handler. This second
exception goes to the common exception vector because the EXL bit of the Status register
is set.
11.9 TLB Invalid Exception
11.9.1 Cause
The TLB Invalid exception occurs when a virtual address reference matches a TLB entry
that is marked invalid (TLB valid bit cleared). This exception is not maskable.
11.9.2 Processing
The common exception vector is used for this exception. When this exception occurs,
TLBL or TLBS code is set in the ExcCode field of Cause register. This code indicates
whether the instruction, as shown by EPC register and BD bit of Cause register, caused
the miss by an instruction reference, load operation, or store operation.
When this exception occurs:
BadVAddr, Context, XContext and EntryHi registers hold the virtual address
that failed address translation
EntryHi register also contains the ASID from which the translation fault occurred
Random register contains a valid location in which to place the replacement
TLB entry
The contents of EntryLo register are undefined
The following applies only when the EXL bit of the Status register is 0. The EPC
register points to the address of the instruction causing the exception. If, however, the
affected instruction was in a branch delay slot, the address of the
immediately preceding branch instruction is retained in the EPC register and the BD bit
of the Cause register is set to “1”.
11.9.3 Servicing
A TLB entry is typically marked invalid when one of the following is true:
a virtual address does not exist
the virtual address exists, but is not in main memory (a page fault)
a trap is desired on any reference to the page (for example, to maintain a
reference bit or during debug)
After servicing the cause of a TLB Invalid exception, the TLB entry is located with the TLB
Probe (TLBP) instruction, and replaced by an entry with that entry’s Valid bit set.
11.10 TLB Modified Exception
11.10.1 Cause
The TLB Modified exception occurs when a store operation virtual address reference to
memory matches a TLB entry that is marked valid but is not dirty and therefore is not
writable. This exception is not maskable.
11.10.2 Processing
The common exception vector is used for this exception, and the Mod code is set in the
ExcCode field of the Cause register.
When this exception occurs:
The BadVAddr, Context, XContext and EntryHi registers hold the virtual address that failed address translation.
The EntryHi register also contains the ASID from which the translation fault occurred.
The contents of the EntryLo register are undefined.
The EPC register and the BD bit of the Cause register are updated only if the EXL bit of
the Status register is 0. In that case the EPC register points to the address of the
instruction that caused the exception. If, however, that instruction was in a branch delay
slot, the EPC register holds the address of the immediately preceding branch instruction
and the BD bit of the Cause register is set to 1.
11.10.3 Servicing
The kernel uses the failed virtual address or virtual page number to identify the
corresponding access control information. The page identified may or may not permit
write accesses; if writes are not permitted, a write protection violation occurs.
If write accesses are permitted, the page frame is marked dirty/writable by the kernel
in its own data structures. The TLB Probe (TLBP) instruction places the index of the
TLB entry that must be altered into the Index register. The EntryLo register is loaded
with a word containing the physical page frame and access control bits (with the D bit
set), and the EntryHi and EntryLo registers are written into the TLB.
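The EntryLo update described above can be sketched in C as follows. This is only an illustration: the bit positions (V at bit 1, D at bit 2) follow the conventional R4000-style EntryLo layout and are assumptions here, not taken from this section, and the function names are hypothetical. Reading and writing the CP0 registers (MFC0/MTC0) and the TLB write itself are not shown.

```c
#include <stdint.h>

/* Assumed R4000-style EntryLo flag positions; verify against the
 * EntryLo register description before relying on them. */
#define ENTRYLO_V (UINT64_C(1) << 1)  /* Valid bit */
#define ENTRYLO_D (UINT64_C(1) << 2)  /* Dirty (write-enable) bit */

/* Returns nonzero if a store through this entry would raise the
 * TLB Modified exception: the entry is valid but not dirty. */
int entrylo_store_raises_mod(uint64_t entrylo)
{
    return (entrylo & ENTRYLO_V) != 0 && (entrylo & ENTRYLO_D) == 0;
}

/* Returns the EntryLo value to write back once the kernel has decided
 * that writes are permitted: same frame and attributes, D bit set. */
uint64_t entrylo_mark_dirty(uint64_t entrylo)
{
    return entrylo | ENTRYLO_D;
}
```

A handler would apply entrylo_mark_dirty() to the saved mapping only after its own page tables confirm that the page is writable.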
11.11 Bus Error Exception
11.11.1 Cause
The Bus Error exception occurs when the GBUSERR* signal is asserted during a memory
read bus cycle. This exception is raised by board-level circuitry for events such as bus
time-outs, backplane bus parity errors, and invalid physical memory addresses or access
types. It is taken during execution of the instruction that caused the bus error. The memory
bus cycle ends upon notification of a bus error. When a bus error is raised during a burst
refill, the following refill is not performed. A bus error request made by asserting the
GBUSERR* signal is ignored if the TX49 is executing a cycle other than a bus cycle. It is
therefore not possible to raise a Bus Error exception for a write access that uses the
write buffer; a general interrupt must be used instead. This exception is not maskable.
11.11.2 Processing
The common exception vector is used for a Bus Error exception. The IBE or DBE code
in the ExcCode field of the Cause register is set, signifying whether the instruction (as
indicated by the EPC register and the BD bit in the Cause register) caused the exception by
an instruction reference (IBE) or by a load or store operation (DBE).
The EPC register contains the address of the instruction that caused the exception,
unless it is in a branch delay slot, in which case the EPC register contains the address of
the preceding branch instruction and the BD bit of the Cause register is set.
11.11.3 Servicing
The physical address at which the fault occurred can be computed from information
available in the CP0 registers.
If the IBE code in the Cause register is set (indicating an instruction fetch reference), the virtual address is contained in the EPC register (or 4 + the contents of the EPC register if the BD bit of the Cause register is set).
If the DBE code is set (indicating a load or store reference), the instruction that caused the exception is located at the virtual address contained in the EPC register (or 4 + the contents of the EPC register if the BD bit of the Cause register is set).
The virtual address of the load and store reference can then be obtained by interpreting
the instruction. The physical address can be obtained by using the TLB Probe (TLBP)
instruction and reading the EntryLo register to compute the physical page number.
The process executing at the time of this exception is handed a bus error signal, which
is usually fatal.
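The address arithmetic described above can be sketched in C. The sketch assumes that the BD bit is bit 31 of the Cause register (as the flowcharts later in this chapter show) and that the faulting instruction is an ordinary I-type load or store with the base register in bits 25..21 and a signed 16-bit offset in bits 15..0. The function names and the `gpr` array of saved general registers are hypothetical.

```c
#include <stdint.h>

#define CAUSE_BD (UINT32_C(1) << 31)  /* set when the exception hit a branch delay slot */

/* Virtual address of the instruction that faulted:
 * EPC itself, or EPC + 4 when the BD bit is set. */
uint64_t faulting_instruction_address(uint64_t epc, uint32_t cause)
{
    return (cause & CAUSE_BD) ? epc + 4 : epc;
}

/* Virtual address referenced by an I-type load/store:
 * GPR[base] plus the sign-extended 16-bit offset. */
uint64_t load_store_address(uint32_t instr, const uint64_t gpr[32])
{
    uint32_t base = (instr >> 21) & 0x1F;        /* bits 25..21 */
    int64_t offset = (int64_t)(instr & 0xFFFF);  /* bits 15..0 */
    if (offset & 0x8000)
        offset -= 0x10000;                       /* sign-extend */
    return gpr[base] + (uint64_t)offset;
}
```

The resulting virtual address is what a handler would then translate, for example with TLBP and the EntryLo register as described above.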
11.12 Integer Overflow Exception
11.12.1 Cause
The Integer Overflow exception occurs when ADD, ADDI, SUB, DADD, DADDI or
DSUB instruction results in a 2’s complement overflow. This exception is not maskable.
11.12.2 Processing
The common exception vector is used for this exception, and the Ov code in Cause
register is set.
The EPC register and the BD bit of the Cause register are updated only if the EXL bit of
the Status register is 0. In that case the EPC register points to the address of the
instruction that caused the exception. If, however, that instruction was in a branch delay
slot, the EPC register holds the address of the immediately preceding branch instruction
and the BD bit of the Cause register is set to 1.
11.12.3 Servicing
The process executing at the time of the exception is handed a floating-point
exception/integer overflow signal. This error is usually fatal to the current process.
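As an illustration of the 2's complement overflow condition, the following C sketch tests whether a 32-bit ADD or SUB would overflow. It models only the arithmetic condition, not the exception itself, and the function names are hypothetical.

```c
#include <stdint.h>

/* Returns 1 if a + b overflows 32-bit two's complement. */
int add32_overflows(int32_t a, int32_t b)
{
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    uint32_t sum = ua + ub;  /* unsigned arithmetic wraps without undefined behavior */
    /* Overflow: both operands have the same sign and the sum has the other sign. */
    return (int)(((ua ^ sum) & (ub ^ sum)) >> 31);
}

/* Returns 1 if a - b overflows 32-bit two's complement. */
int sub32_overflows(int32_t a, int32_t b)
{
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    uint32_t diff = ua - ub;
    /* Overflow: operands differ in sign and the result's sign differs from a's. */
    return (int)(((ua ^ ub) & (ua ^ diff)) >> 31);
}
```

The unsigned forms of these instructions (ADDU, SUBU and so on) compute the same wrapped result but never raise the exception.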
11.13 Trap Exception
11.13.1 Cause
The Trap exception occurs when TGE, TGEU, TLT, TLTU, TEQ, TNE, TGEI, TGEIU,
TLTI, TLTIU, TEQI or TNEI instruction results in a TRUE condition. This exception is
not maskable.
11.13.2 Processing
The common exception vector is used for this exception, and the Tr code in Cause
register is set.
The EPC register and the BD bit of the Cause register are updated only if the EXL bit of
the Status register is 0. In that case the EPC register points to the address of the
instruction that caused the exception. If, however, that instruction was in a branch delay
slot, the EPC register holds the address of the immediately preceding branch instruction
and the BD bit of the Cause register is set to 1.
11.13.3 Servicing
The process executing at the time of a Trap exception is handed a floating-point
exception/integer overflow signal. This error is usually fatal.
11.14 System Call Exception
11.14.1 Cause
The System Call exception occurs during an attempt to execute the SYSCALL
instruction. This exception is not maskable.
11.14.2 Processing
The common exception vector is used for this exception, and the Sys code in Cause
register is set.
The EPC register and the BD bit of the Cause register are updated only if the EXL bit of
the Status register is 0. In that case the EPC register points to the address of the
SYSCALL instruction. If, however, the SYSCALL instruction was in a branch delay slot, the
EPC register holds the address of the immediately preceding branch instruction.
If the SYSCALL instruction is in a branch delay slot, the BD bit of the Cause register is
set; otherwise this bit is cleared.
11.14.3 Servicing
When this exception occurs, control is transferred to the applicable system routine.
To resume execution, the EPC register must be altered so that the SYSCALL
instruction does not re-execute; this is accomplished by adding a value of 4 to the EPC
register (EPC register + 4) before returning.
If a SYSCALL instruction is in a branch delay slot, a more complicated algorithm,
beyond the scope of this description, may be required.
11.15 Breakpoint Exception
11.15.1 Cause
The Breakpoint exception occurs when an attempt is made to execute the BREAK
instruction. This exception is not maskable.
11.15.2 Processing
The common exception vector is used for this exception, and the Bp code in Cause
register is set.
The EPC register and the BD bit of the Cause register are updated only if the EXL bit of
the Status register is 0. In that case the EPC register points to the address of the
BREAK instruction. If, however, the BREAK instruction was in a branch delay slot, the
EPC register holds the address of the immediately preceding branch instruction.
If the BREAK instruction is in a branch delay slot, the BD bit of the Cause register is
set; otherwise this bit is cleared.
11.15.3 Servicing
When the Breakpoint exception occurs, control is transferred to the applicable system
routine. Additional distinctions can be made by loading the instruction whose address the
EPC register contains and analyzing the unused bits of the BREAK instruction (bits 25~6).
A value of 4 must be added to the contents of the EPC register (EPC register + 4) to
locate the instruction if it resides in a branch delay slot.
To resume execution, the EPC register must be altered so that the BREAK instruction
does not re-execute; this is accomplished by adding a value of 4 to the EPC register (EPC
register + 4) before returning.
If a BREAK instruction is in a branch delay slot, interpretation of the branch
instruction is required to resume execution.
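A minimal C sketch of the BREAK decoding and the simple (non-delay-slot) resume address follows. It assumes the standard MIPS encoding of BREAK (major opcode SPECIAL = 0, function field 0x0D); the function names are hypothetical.

```c
#include <stdint.h>

/* Returns 1 if instr is a BREAK: SPECIAL opcode (0) with function 0x0D. */
int is_break(uint32_t instr)
{
    return (instr >> 26) == 0 && (instr & 0x3F) == 0x0D;
}

/* Extracts the software-defined code field, bits 25..6 of the BREAK instruction. */
uint32_t break_code(uint32_t instr)
{
    return (instr >> 6) & 0xFFFFF;
}

/* EPC value to return to when the BREAK was not in a branch delay slot. */
uint64_t epc_after_break(uint64_t epc)
{
    return epc + 4;
}
```

When the BD bit is set, adding 4 is not sufficient; the branch must be interpreted to find the resume address, as stated above.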
11.16 Reserved Instruction Exception
11.16.1 Cause
The Reserved Instruction exception occurs when one of the following conditions occurs:
an attempt is made to execute an instruction with an undefined major opcode (bits 31~26)
an attempt is made to execute a SPECIAL instruction with an undefined minor opcode (bits 5~0)
an attempt is made to execute a REGIMM instruction with an undefined minor opcode (bits 20~16)
an attempt is made to execute 64-bit operations in 32-bit mode when in User or Supervisor mode
an attempt is made to execute a COPz rs instruction with an undefined minor opcode (bits 25~21)
an attempt is made to execute a COPz rt instruction with an undefined minor opcode (bits 20~16)
64-bit operations are always valid in Kernel mode regardless of the value of the KX bit
in Status register. This exception is not maskable.
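The instruction fields named in the list above can be extracted as in the following C sketch. The accessor names are hypothetical; the bit ranges are those given in the list.

```c
#include <stdint.h>

/* Major opcode, bits 31..26. */
uint32_t field_opcode(uint32_t instr) { return instr >> 26; }

/* rs field, bits 25..21 (minor opcode of COPz rs instructions). */
uint32_t field_rs(uint32_t instr) { return (instr >> 21) & 0x1F; }

/* rt field, bits 20..16 (minor opcode of REGIMM and COPz rt instructions). */
uint32_t field_rt(uint32_t instr) { return (instr >> 16) & 0x1F; }

/* Function field, bits 5..0 (minor opcode of SPECIAL instructions). */
uint32_t field_funct(uint32_t instr) { return instr & 0x3F; }
```

For example, the word 0x00221820 (ADD r3, r1, r2 in the standard MIPS encoding) has major opcode 0 (SPECIAL), rs = 1, rt = 2 and function 0x20.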
11.16.2 Processing
The common exception vector is used for this exception, and the RI code in Cause
register is set.
The EPC register and the BD bit of the Cause register are updated only if the EXL bit of
the Status register is 0. In that case the EPC register points to the address of the
instruction that caused the exception. If, however, that instruction was in a branch delay
slot, the EPC register holds the address of the immediately preceding branch instruction
and the BD bit of the Cause register is set to 1.
11.16.3 Servicing
No instructions in the MIPS ISA are currently interpreted. The process executing at the
time of this exception is handed an illegal instruction/reserved operand fault signal. This
error is usually fatal.
11.17 Coprocessor Unusable Exception
11.17.1 Cause
The Coprocessor Unusable exception occurs when an attempt is made to execute a
coprocessor instruction in any of the following cases:
a coprocessor CPz instruction is executed when its corresponding CUz bit in the Status register is cleared to 0
a CP0 instruction is executed in User or Supervisor mode when the CU0 bit is cleared to 0 (in Kernel mode, no exception is raised when a CP0 instruction is issued, regardless of the CU0 bit setting)
an FPU instruction is executed on a TX49 without an FPU
11.17.2 Processing
The common exception vector is used for this exception, and the CpU code in Cause
register is set. The coprocessor number referred to at the time of the exception is stored
in Cause register CE (Coprocessor Error) field.
The EPC register and the BD bit of the Cause register are updated only if the EXL bit of
the Status register is 0. In that case the EPC register points to the address of the
instruction that caused the exception. If, however, that instruction was in a branch delay
slot, the EPC register holds the address of the immediately preceding branch instruction
and the BD bit of the Cause register is set to 1.
11.17.3 Servicing
The coprocessor unit to which the attempted reference was made is identified by the
CE (Coprocessor Error) field of the Cause register, which results in one of the following
situations:
If the process is entitled to access the coprocessor, the coprocessor is marked usable and the corresponding user state is restored to the coprocessor.
If the process is entitled to access the coprocessor, but the coprocessor does not exist or has failed, interpretation of the coprocessor instruction is possible.
If the BD bit is set in the Cause register, the branch instruction must be interpreted; then the coprocessor instruction can be emulated and execution resumed with the EPC register advanced past the coprocessor instruction.
If the process is not entitled to access the coprocessor, the process executing at the time is handed an illegal instruction/privileged instruction fault signal. This error is usually fatal.
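The first servicing step, identifying the coprocessor and marking it usable, can be sketched in C as below. The sketch assumes the conventional MIPS III field positions (Cause.CE in bits 29..28, Status.CU0 to CU3 in bits 28..31), which are not restated in this section; the function names are hypothetical.

```c
#include <stdint.h>

/* Coprocessor number recorded in the CE field of the Cause register
 * (assumed to occupy bits 29..28). */
unsigned cause_ce(uint32_t cause)
{
    return (unsigned)((cause >> 28) & 0x3);
}

/* Returns 1 if the CUz bit for coprocessor z (0..3) is set in Status
 * (CU0..CU3 assumed at bits 28..31). */
int coprocessor_usable(uint32_t status, unsigned z)
{
    return (int)((status >> (28 + z)) & 1);
}

/* Returns the Status value with coprocessor z marked usable. */
uint32_t status_mark_usable(uint32_t status, unsigned z)
{
    return status | (UINT32_C(1) << (28 + z));
}
```

A kernel that restores floating-point state lazily would typically call status_mark_usable(status, cause_ce(cause)) for an entitled process and then restore that process's coprocessor registers.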
11.18 Floating-Point Exception
11.18.1 Cause
The Floating-Point exception is used by the floating-point coprocessor. This exception is
not maskable.
11.18.2 Processing
The common exception vector is used for this exception, and the FPE code in Cause
register is set. The contents of the Floating-Point Control/Status register indicate the
cause of this exception.
The EPC register and the BD bit of the Cause register are updated only if the EXL bit of
the Status register is 0. In that case the EPC register points to the address of the
instruction that caused the exception. If, however, that instruction was in a branch delay
slot, the EPC register holds the address of the immediately preceding branch instruction
and the BD bit of the Cause register is set to 1.
11.18.3 Servicing
This exception is cleared by clearing the appropriate bit in the Floating-Point
Control/Status register.
For an unimplemented instruction exception, the kernel should emulate the instruction;
for other exceptions, the kernel should pass the exception to the user program that caused
the exception.
11.19 Interrupt Exception
11.19.1 Cause
The Interrupt exception is raised by any of eight interrupts (two software and six
hardware). A hardware interrupt is raised when GINT* signal goes active. A software
interrupt is raised by setting the IP[1]/IP[0] bit in Cause register. The significance of
these interrupts is dependent upon the specific system implementation.
Each of the eight interrupts can be masked individually by clearing its corresponding
bit in the IM(Interrupt Mask) field of Status register, and all interrupts can be masked at
once by clearing IE bit of Status register to “0”.
If GTINTDIS is low when a Reset exception occurs, GINT[5]* is disabled and the
timer exception is enabled.
11.19.2 Processing
The common exception vector is used as follows:
In 32-bit mode:   0x8000 0180 (BEV = 0)
                  0xbfc0 0380 (BEV = 1)
In 64-bit mode:   0xffff ffff 8000 0180 (BEV = 0)
                  0xffff ffff bfc0 0380 (BEV = 1)
11.19.3 Servicing
If the interrupt is caused by one of the two software-generated exceptions (SW1 or
SW0), the interrupt condition is cleared by setting the corresponding Cause register bit to
0.
If the interrupt is hardware-generated, the interrupt condition is cleared by correcting
the condition causing the interrupt pin to be asserted.
If the interrupt is a timer interrupt, the interrupt condition is cleared by changing the
value of the Compare register or by setting the corresponding Cause register bit (IP[7]) to 0.
Interrupts are not accepted when the Status register has EXL = 1 or ERL = 1.
Note: Because of the write buffer, a store to an external device will not necessarily occur
until after other instructions in the pipeline finish. The user must therefore ensure that
the store occurs before the return from exception instruction (ERET) is executed; otherwise
the interrupt may be serviced again even though no interrupt should be pending.
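The masking rules in this section can be summarized by the following C sketch. It assumes the conventional MIPS III bit positions (Status.IE at bit 0, EXL at bit 1, ERL at bit 2, and both the Status IM field and the Cause IP field at bits 15..8); the function name is hypothetical.

```c
#include <stdint.h>

#define STATUS_IE   (UINT32_C(1) << 0)  /* global interrupt enable */
#define STATUS_EXL  (UINT32_C(1) << 1)  /* exception level */
#define STATUS_ERL  (UINT32_C(1) << 2)  /* error level */
#define INT_FIELD   UINT32_C(0xFF00)    /* Status.IM and Cause.IP, bits 15..8 */

/* Returns a mask in which bit n is set when interrupt IP[n] is pending
 * and unmasked, or 0 when interrupts cannot be accepted at all. */
unsigned serviceable_interrupts(uint32_t status, uint32_t cause)
{
    if (!(status & STATUS_IE) || (status & (STATUS_EXL | STATUS_ERL)))
        return 0;
    return (unsigned)((status & cause & INT_FIELD) >> 8);
}
```

An Interrupt exception is taken only when this mask is nonzero; bits 0 and 1 of the result correspond to the two software interrupts.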
11.20 Exception Handling and Servicing Flowcharts
The remainder of this chapter contains flowcharts for the following exceptions and
guidelines for their handlers:
general exceptions and their exception handler
TLB/XTLB miss exception and their exception handler
Cold Reset, Soft Reset and NMI exceptions, and a guideline to their handler.
Generally speaking, the exceptions are handled by hardware (HW); the exceptions are then
serviced by software (SW).
Exceptions other than Reset, Soft Reset, NMI or first-level miss
Note: Interrupts can be masked by IE or IMs.
[Flowchart; steps and comments as recoverable from the figure]
1. Set the FP Control/Status register. (The FP Control/Status register is only set if the respective exception occurs.)
2. EnHi ← VPN2, ASID; X/Context ← VPN2. (EnHi and X/Context are set only for TLB Invalid, Modified and Refill exceptions.)
3. Set the Cause register (ExcCode, CE).
4. Set BadVA. (BadVA is set only for TLB Invalid, Modified and Refill exceptions. Note: not set if it is a Bus Error.)
5. Test EXL (SR bit 1) to check whether the exception occurred within another exception.
   EXL = 0: Is the instruction in a branch delay slot?
      Yes: Cause bit 31 (BD) ← 1; EPC ← (PC - 4)
      No:  Cause bit 31 (BD) ← 0; EPC ← PC
   EXL = 1: EPC and BD are not updated.
6. EXL ← 1. (Processor forced to Kernel mode and interrupts disabled.)
7. Test BEV.
   BEV = 0 (normal):    PC ← 0xFFFF FFFF 8000 0000 + 0x180 (unmapped, cached)
   BEV = 1 (bootstrap): PC ← 0xFFFF FFFF BFC0 0200 + 0x180 (unmapped, uncached)
8. Continue to the General Exception Servicing Guidelines.
Figure 11-1 General Exception Handler (HW)
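The EPC, BD and EXL updates in Figure 11-1 can be modeled with a few lines of C. This is only a software model of the described behavior, with a hypothetical structure and function name; EXL is taken to be Status bit 1 and BD to be Cause bit 31, as the figure indicates.

```c
#include <stdint.h>

#define STATUS_EXL (UINT32_C(1) << 1)   /* Status bit 1 */
#define CAUSE_BD   (UINT32_C(1) << 31)  /* Cause bit 31 */

/* Hypothetical container for the CP0 state touched by exception entry. */
struct cp0_state {
    uint64_t epc;
    uint32_t status;
    uint32_t cause;
};

/* Models exception entry for the instruction at pc.
 * EPC and BD change only when EXL was 0; EXL is always set afterwards. */
void take_exception(struct cp0_state *s, uint64_t pc, int in_delay_slot)
{
    if (!(s->status & STATUS_EXL)) {
        if (in_delay_slot) {
            s->epc = pc - 4;        /* address of the preceding branch */
            s->cause |= CAUSE_BD;
        } else {
            s->epc = pc;
            s->cause &= ~CAUSE_BD;
        }
    }
    s->status |= STATUS_EXL;        /* Kernel mode, interrupts disabled */
}
```

Calling take_exception() a second time before EXL is cleared leaves EPC and BD untouched, which is why a handler must save EPC before it re-enables exceptions.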
[Flowchart; steps and comments as recoverable from the figure]
1. MFC0: X/Context, EPC, Status, Cause.
   Unmapped vector, so TLBMod, TLBInv and TLB Refill exceptions are not possible.
   EXL = 1, so Interrupt exceptions are disabled.
   OS/System to avoid all other exceptions.
   Only Cold Reset, Soft Reset and NMI exceptions are possible.
2. Test Status bit 21 (TS) (*). (Optional: check only if 2nd-level TLB miss.)
   = 1: Reset the processor.
   = 0: Continue.
3. MTC0: set Status bits KSU ← 00, EXL ← 0 and IE = 1. (Optional: only to enable interrupts while keeping Kernel mode.)
   After EXL = 0, all exceptions are allowed (except interrupts if masked by IE or IM).
4. Check the Cause register and jump to the appropriate service code.
5. Service code.
6. EXL = 1.
7. MTC0: EPC, Status.
8. ERET.
   ERET is not allowed in the branch delay slot of another jump instruction.
   The processor does not execute the instruction in the ERET's branch delay slot.
   PC ← EPC; EXL ← 0
   LLbit ← 0
Other comments in the figure: Save the register file. Save the context (register file and so on).
(*) Reserved for TX49.
Figure 11-2 General Exception Servicing Guidelines (SW)
[Flowchart; steps and comments as recoverable from the figure]
1. EnHi ← VPN2, ASID; X/Context ← VPN2; set the Cause register (ExcCode, CE) and set BadVA.
2. Test EXL (SR bit 1) to check whether the exception occurred within another exception.
   EXL = 0: Is the instruction in a branch delay slot?
      Yes: EPC ← (PC - 4); Cause bit 31 (BD) ← 1
      No:  EPC ← PC; Cause bit 31 (BD) ← 0
      XTLB exception?
         Yes: Vec. Off. = 0x080
         No:  Vec. Off. = 0x000
      (Points to the Refill exception vector.)
   EXL = 1: Vec. Off. = 0x180
      (Points to the General exception vector.)
3. EXL ← 1. (Processor forced to Kernel mode and interrupts disabled.)
4. Test BEV (SR bit 22).
   BEV = 0 (normal):    PC ← 0xFFFF FFFF 8000 0000 + Vec. Off. (unmapped, cached)
   BEV = 1 (bootstrap): PC ← 0xFFFF FFFF BFC0 0200 + Vec. Off. (unmapped, uncached)
5. Continue to the TLB/XTLB Exception Servicing Guidelines.
Figure 11-3 TLB/XTLB Miss Exception Handler (HW)
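The vector selection in Figures 11-1 and 11-3 can be expressed as a small C function. This is a sketch of the selection logic only; the enumeration and function name are hypothetical, and the addresses are the 64-bit forms used in the figures.

```c
#include <stdint.h>

/* Hypothetical classification of the exception being taken. */
enum exc_kind { EXC_GENERAL, EXC_TLB_REFILL, EXC_XTLB_REFILL };

/* Returns the 64-bit vector address.
 * bev: Status.BEV; exl: value of Status.EXL before the exception. */
uint64_t exception_vector(int bev, int exl, enum exc_kind kind)
{
    uint64_t base = bev ? UINT64_C(0xFFFFFFFFBFC00200)   /* bootstrap, uncached */
                        : UINT64_C(0xFFFFFFFF80000000);  /* normal, cached */
    uint64_t offset = 0x180;                             /* general exception */

    /* Refill vectors are used only when no exception was already in progress. */
    if (!exl) {
        if (kind == EXC_TLB_REFILL)
            offset = 0x000;
        else if (kind == EXC_XTLB_REFILL)
            offset = 0x080;
    }
    return base + offset;
}
```

With BEV = 1 and a general exception this yields 0xFFFF FFFF BFC0 0380, matching the address listed for the Interrupt exception in 64-bit mode.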
[Flowchart; steps and comments as recoverable from the figure]
1. MFC0: Context.
   Unmapped vector, so TLBMod, TLBInv and TLB Refill exceptions are not possible.
   EXL = 1, so Interrupt exceptions are disabled.
   OS/System to avoid all other exceptions.
   Only Cold Reset, Soft Reset and NMI exceptions are possible.
2. Service code.
   Load the mapping of the virtual address in the Context register, move it to EnLo and write it into the TLB.
   There could be a TLB miss again during the mapping of the data or instruction address. The processor will jump to the general exception vector since EXL is 1. (Option to complete the first-level refill in the general exception handler, or ERET to the original instruction and take the exception again.)
3. ERET.
   ERET is not allowed in the branch delay slot of another jump instruction.
   The processor does not execute the instruction in the ERET's branch delay slot.
   PC ← EPC; EXL ← 0
   LLbit ← 0
Figure 11-4 TLB/XTLB Exception Servicing Guidelines (SW)
[Flowchart; steps and comments as recoverable from the figure]
Cold Reset, Soft Reset & NMI Exception Handling (HW)
Cold Reset exception:
   Random ← TLBENTRIES - 1
   Wired ← 0
   Status: BEV ← 1, TS ← 0 (*), SR ← 0, ERL ← 1
Soft Reset or NMI exception:
   Status: BEV ← 1, TS ← 0 (*), SR ← 1, ERL ← 1
In both cases:
   ErrorEPC ← PC
   PC ← 0xFFFF FFFF BFC0 0000
Cold Reset, Soft Reset & NMI Servicing Guidelines (SW)
Test Status bit 20 (SR).
   = 0: Cold Reset service code
   = 1: NMI?
      Yes: NMI service code
      No:  Soft Reset service code
      ERET (optional)
Note: There is no indication from the processor to differentiate between NMI and Soft Reset;
there must be a system-level indication.
(*) Reserved for TX49
Figure 11-5 Cold Reset, Soft Reset & NMI Exception Handling (HW) and
Servicing Guidelines (SW)
12. Floating-Point Unit, CP1
This chapter describes the floating-point operations, including the programming model,
instruction set and formats.
The floating-point operations fully conform to the requirements of ANSI/IEEE Standard 754-
1985, IEEE Standard for Binary Floating-Point Arithmetic.
12.1 Overview
All floating-point instructions, as defined in the MIPS ISA for the floating-point coprocessor,
CP1, are processed by a hardware unit separate from the one that executes integer instructions.
The execution of floating-point instructions can be disabled by the coprocessor usability CU
bit defined in the CP0 Status register.
12.2 Floating Point Register
12.2.1 Floating-Point General Registers (FGRs)
CP1 has a set of Floating-Point General Purpose registers (FGRs) that can be accessed
in the following ways:
As 32 general purpose registers (32 FGRs), each of which is 32 bits wide, when the
FR bit in the CPU Status register equals 0; or as 32 general purpose registers (32
FGRs), each of which is 64 bits wide, when FR equals 1. The CPU accesses these
registers through MOVE, LOAD, and STORE instructions.
As 16 floating-point registers (see the next section for a description of FPRs), each
of which is 64 bits wide, when the FR bit in the CPU Status register equals 0.
The FPRs hold values in either single- or double-precision floating-point format.
Each FPR corresponds to adjacently numbered FGRs as shown in Figure 12-1.
As 32 floating-point registers (see the next section for a description of FPRs), each
of which is 64 bits wide, when the FR bit in the CPU Status register equals 1.
The FPRs hold values in either single- or double-precision floating-point format.
Each FPR corresponds to an FGR as shown in Figure 12-1.
Floating-Point Registers (FPR) and Floating-Point General Purpose Registers (FGR)
FR = 0 (each FGR is 32 bits wide, bits 31 to 0):
FPR      FGR (least-significant word)   FGR (most-significant word)
FPR0     FGR0                           FGR1
FPR2     FGR2                           FGR3
...
FPR28    FGR28                          FGR29
FPR30    FGR30                          FGR31
FR = 1 (each FGR is 64 bits wide, bits 63 to 0):
FPR0 = FGR0, FPR1 = FGR1, FPR2 = FGR2, FPR3 = FGR3, ..., FPR28 = FGR28, FPR29 = FGR29, FPR30 = FGR30, FPR31 = FGR31
Floating-Point Control Registers (FCR):
Control/Status register (FCR31), bits 31 to 0
Implementation/Revision register (FCR0), bits 31 to 0
Figure 12-1 FP Registers
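The FR = 0 pairing shown in Figure 12-1 can be illustrated with a short C model. The `fgr` array stands in for the 32-bit FGRs and the function name is hypothetical; the sketch only shows how a 64-bit FPR value is composed from two adjacent FGRs.

```c
#include <stdint.h>

/* With FR = 0, FPR n (n even) is formed from FGR n (least-significant word)
 * and FGR n+1 (most-significant word). */
uint64_t fpr_read_fr0(const uint32_t fgr[32], unsigned n)
{
    n &= 0x1E;  /* only even register numbers name a 64-bit FPR when FR = 0 */
    return ((uint64_t)fgr[n + 1] << 32) | fgr[n];
}
```

With FR = 1 no pairing is needed, since each FPR is a single 64-bit FGR.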
12.2.2 Floating-Point Control Registers
The MIPS RISC architecture defines 32 floating-point control registers (FCRs); the
TX49 processor implements two of these registers: FCR0 and FCR31. These FCRs are
described below:
The Implementation/Revision register (FCR0) holds revision information.
The Control/Status register (FCR31) controls and monitors exceptions, holds the
result of compare operations, and establishes rounding modes.
FCR1 to FCR30 are reserved.
Table 12-1 lists the assignments of the FCRs.
Table 12-1 Floating-Point Control Register Assignments
FCR Number        Use
FCR0              Coprocessor implementation and revision register
FCR1 to FCR30     Reserved
FCR31             Rounding mode, cause, trap enables, and flags
Implementation and Revision Register (FCR0)
The read-only Implementation and Revision register (FCR0) specifies the
implementation and revision number of CP1. This information can determine the
coprocessor revision and performance level, and can also be used by diagnostic software.
Figure 12-2 shows the layout of the register; Table 12-2 describes the Implementation
and Revision register (FCR0) fields.
Implementation/Revision Register (FCR0)
Bits      Width   Field
31:16     16      0
15:8      8       Imp
7:0       8       Rev
Figure 12-2 Implementation/Revision Register
Table 12-2 FCR0 Fields
Field     Description
Imp       Implementation number
Rev       Revision number in the form of y.x
0         Reserved. Returns zeroes when read.
The revision number is a value of the form y.x, where:
y is a major revision number held in bits 7:4.
x is a minor revision number held in bits 3:0.
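Decoding a value read from FCR0 can be sketched in C as follows. The structure and function names are hypothetical, and the CFC1 instruction that actually reads the register is not shown.

```c
#include <stdint.h>

/* Hypothetical container for the decoded FCR0 fields. */
struct fcr0_fields {
    unsigned imp;        /* implementation number, bits 15:8 */
    unsigned rev_major;  /* y, bits 7:4 */
    unsigned rev_minor;  /* x, bits 3:0 */
};

/* Splits a raw FCR0 value into its Imp and Rev (y.x) fields. */
struct fcr0_fields decode_fcr0(uint32_t fcr0)
{
    struct fcr0_fields f;
    f.imp       = (fcr0 >> 8) & 0xFF;
    f.rev_major = (fcr0 >> 4) & 0xF;
    f.rev_minor = fcr0 & 0xF;
    return f;
}
```

For an arbitrary example value of 0x00002D21, the decoded fields would be Imp = 0x2D and revision 2.1; this value is illustrative only and is not the value returned by any particular device.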
Control/Status Register (FCR31)
The Control/Status register (FCR31) contains control and status information that can
be accessed by instructions in either Kernel or User mode. FCR31 also controls the
arithmetic rounding mode and enables User mode traps, as well as identifying any
exceptions that may have occurred in the most recently executed floating-point
instruction, along with any exceptions that may have occurred without being trapped.
Figure 12-3 shows the format of the Control/Status register, and Table 12-3 describes
the Control/Status register fields. Figure 12-4 shows the Control/Status register Cause,
Flag, and Enable fields.
Control/Status Register (FCR31)
Bits      Width   Field
31:25     7       0
24        1       FS
23        1       C
22:18     5       0
17:12     6       Cause (E V Z O U I)
11:7      5       Enables (V Z O U I)
6:2       5       Flags (V Z O U I)
1:0       2       RM
Figure 12-3 FP Control/Status Register Bit Assignments
Table 12-3 Control/Status Register Fields
Field     Description
FS        When set, denormalized results can be flushed instead of causing an unimplemented operation exception.
C         Condition bit. Stores the result of a compare instruction. See the description of the Control/Status register Condition bit.
Cause     Cause bits. These bits identify the exceptions raised by the most recently executed floating-point instruction. See Figure 12-4 and the description of the Control/Status register Cause, Flag, and Enable bits.
Enables   Enable bits. When set, these bits trap any floating-point exceptions to indicate that they have been passed to the CPU. See Figure 12-4 and the description of the Control/Status register Cause, Flag, and Enable bits.
Flags     Flag bits. These bits indicate that an exception was raised. See Figure 12-4 and the description of the Control/Status register Cause, Flag, and Enable bits.
RM        Rounding mode bits. See Table 12-5 and the description of the Control/Status register Rounding Mode Control bits.
Bit positions of the Cause, Enable and Flag bits:
              E     V     Z     O     U     I
Cause bits    17    16    15    14    13    12
Enable bits   -     11    10    9     8     7
Flag bits     -     6     5     4     3     2
E: Unimplemented Operation
V: Invalid Operation
Z: Division by Zero
O: Overflow
U: Underflow
I: Inexact Operation
Figure 12-4 Control/Status Register Cause, Flag, and Enable Fields
Control/Status Register FS Bit
The FS bit enables the flushing of denormalized values. When the FS bit is set and the
Underflow and Inexact Enable bits are not set, denormalized results are flushed instead
of causing an Unimplemented Operation exception. Results are flushed either to 0 or to the
minimum normalized value, depending upon the rounding mode (see Table 12-4 below),
and the Underflow and Inexact Flag and Cause bits are set.
Table 12-4 Flush Values of Denormalized Results
Denormalized     Flushed Result by Rounding Mode
Result           RN       RZ       RP          RM
Positive         +0       +0       +2^(Emin)   +0
Negative         -0       -0       -0          -2^(Emin)
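Table 12-4 can be expressed as a small C lookup for the single-precision case. This is a sketch under the assumption that 2^(Emin) for single precision equals FLT_MIN (2^-126) on the host, which holds for IEEE 754 hosts; the enumeration and function names are hypothetical.

```c
#include <float.h>

/* Rounding modes, numbered as in the RM field of FCR31. */
enum rounding_mode { RN = 0, RZ = 1, RP = 2, RM = 3 };

/* Returns the value substituted for a denormalized single-precision result
 * when FS is set. negative is nonzero for a negative denormalized result. */
float flushed_single(int negative, enum rounding_mode rm)
{
    if (!negative)
        return (rm == RP) ? FLT_MIN : 0.0f;   /* +2^(Emin) only when rounding up */
    return (rm == RM) ? -FLT_MIN : -0.0f;     /* -2^(Emin) only when rounding down */
}
```

The sign of a flushed zero is preserved, so a negative denormalized result flushed under RN, RZ or RP becomes -0 rather than +0.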
Control/Status Register Condition Bit
When a floating-point Compare operation takes place, the result is stored at bit 23, the
Condition bit. The C bit is set to 1 if the condition is true; the bit is cleared to 0 if the
condition is false. Bit 23 is affected only by compare and CTC1 instructions.
The BC1T and BC1F instructions test the C bit to decide whether or not to cause a
branch.
Control/Status Register Cause, Flag, and Enable Fields
Figure 12-4 illustrates the Cause, Flag, and Enable fields of the Control/Status
register. The Cause and Flag fields are updated by all conversion, computational (except
MOV.fmt), CTC1, reserved, and unimplemented instructions. All other instructions have
no effect on these fields.
Cause Bits
Bits 17:12 in the Control/Status register contain Cause bits, as shown in Figure
12-4, which reflect the results of the most recently executed floating-point
instruction. The Cause bits are a logical extension of the CP0 Cause register; they
identify the exceptions raised by the last floating-point operation. If the
corresponding Enable bit is set at the time of the exception, a floating-point exception
and interrupt are raised. If more than one exception occurs on a single instruction,
each appropriate bit is set.
The Cause bits are updated by most floating-point operations. The
Unimplemented Operation (E) bit is set to 1 if software emulation is required;
otherwise it remains 0. The other bits are set to 1 or 0 to indicate the occurrence or
non-occurrence (respectively) of an IEEE 754 exception. Within the set of floating-point
instructions that update the Cause bits, the Cause field indicates the exceptions
raised by the most recently executed instruction.
When a floating-point exception is taken, no results are stored, and the only state
affected is the Cause bit. Therefore, software emulation routines can use the original
values to emulate the exception-causing floating-point operation.
Enable Bits
A floating-point exception is generated any time a Cause bit and the corresponding
Enable bit are set. A floating-point operation that sets an enabled Cause bit forces
an immediate floating-point exception, as does setting both Cause and Enable bits
with CTC1. Software can also emulate the above behavior.
There is no enable for Unimplemented Operation (E). An Unimplemented
exception always generates a floating-point exception.
Before returning from a floating-point exception, software must first clear the
enabled Cause bits with a CTC1 instruction to prevent a repeat of the interrupt.
Thus, User mode programs can never observe enabled Cause bits set; if this
information is required in a User mode handler, it must be passed somewhere other
than the Status register.
For a floating-point operation that sets only unenabled Cause bits, no floating-
point exception occurs and the default result defined by IEEE 754 is stored. In this
case, the exceptions that were caused by the immediately previous floating-point
operation can be determined by reading the Cause field.
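The relationship between the Cause and Enable bits can be sketched in C using the bit positions from Figure 12-3. The macro and function names are hypothetical, and the CFC1/CTC1 instructions that read and write FCR31 are not shown.

```c
#include <stdint.h>

#define FCR31_CAUSE(x)    (((x) >> 12) & 0x3F)  /* bits 17:12, E V Z O U I */
#define FCR31_ENABLES(x)  (((x) >> 7) & 0x1F)   /* bits 11:7,    V Z O U I */
#define FCR31_CAUSE_E     0x20u                 /* E within the 6-bit Cause field */

/* Returns 1 if this FCR31 value generates a floating-point exception:
 * some Cause bit has its Enable bit set, or E (which has no enable) is set. */
int fp_exception_raised(uint32_t fcr31)
{
    uint32_t cause = FCR31_CAUSE(fcr31);
    uint32_t enables = FCR31_ENABLES(fcr31);
    return (cause & (enables | FCR31_CAUSE_E)) != 0;
}

/* Returns the value to write back before returning from the handler:
 * the enabled Cause bits (and E) cleared, everything else unchanged. */
uint32_t fcr31_clear_enabled_cause(uint32_t fcr31)
{
    uint32_t mask = (FCR31_ENABLES(fcr31) | FCR31_CAUSE_E) << 12;
    return fcr31 & ~mask;
}
```

After fcr31_clear_enabled_cause(), fp_exception_raised() returns 0 for the result, so writing it back with CTC1 does not immediately raise the exception again.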
Flag Bits
The Flag bits are cumulative and indicate the exceptions that were raised by the
operations that were executed since the bits were explicitly reset. Flag bits are set to
1 if an IEEE 754 exception is raised, otherwise they remain unchanged. The Flag
bits are never cleared as a side effect of floating-point operations; however, they can
be set or cleared by writing a new value into the Status register, using a CTC1
instruction.
Control/Status Register Rounding Mode Control Bits
Bits 1 and 0 in the Control/Status register constitute the Rounding Mode (RM) field.
As shown in Table 12-5, these bits specify the rounding mode that CP1 uses for all
floating-point operations.
Table 12-5 Rounding Mode Bit Decoding
RM (1:0)   Mnemonic   Description
0          RN         Round result to nearest representable value; round to value with least-significant bit 0 when the two nearest representable values are equally near.
1          RZ         Round toward 0: round to value closest to and not greater in magnitude than the infinitely precise result.
2          RP         Round toward +∞: round to value closest to and not less than the infinitely precise result.
3          RM         Round toward −∞: round to value closest to and not greater than the infinitely precise result.
12.2.3 Accessing the FP Control and Implementation/Revision Registers
The Control/Status and the Implementation/Revision registers are read by a Move
Control From Coprocessor 1 (CFC1) instruction.
The bits in the Control/Status register can be set or cleared by writing to the register
using a Move Control To Coprocessor 1 (CTC1) instruction. The Implementation/Revision
register is a read-only register. There are no pipeline hazards (between any instructions)
associated with floating-point control registers.
12.3 Floating-Point Formats
CP1 performs both 32-bit (single-precision) and 64-bit (double-precision) IEEE standard
floating-point operations. The 32-bit single-precision format has a 24-bit signed-magnitude
fraction field (f
+
s) and an 8-bit exponent (e), as shown in Figure 12-5.
31 30 23 22 0
s
Sign e
Exponent f
Fraction
18 23
Figure 12-5 Single-Precision Floating-Point Format
The 64-bit double-precision format has a 53-bit signed-magnitude fraction field (f + s) and an
11-bit exponent, as shown in Figure 12-6.
Bits    Field          Width (bits)
63      s  Sign        1
62-52   e  Exponent    11
51-0    f  Fraction    52
Figure 12-6 Double-Precision Floating-Point Format
As shown in the above figures, numbers in floating-point format are composed of three
fields:
sign field, s
biased exponent, e = E + bias
fraction, f = .b1b2...bp−1
The range of the unbiased exponent E includes every integer between the two values Emin
and Emax inclusive, together with two other reserved values:
Emin − 1 (to encode ±0 and denormalized numbers)
Emax + 1 (to encode ±∞ and NaNs [Not a Number])
For single- and double-precision formats, each representable nonzero numerical value has
just one encoding. For single- and double-precision formats, the value of a number, v, is
determined by the equations shown in Table 12-6.
Table 12-6 Equations for Calculating Values in Single and Double-Precision Floating-Point Format
No.  Equation
(1)  if E = Emax + 1 and f ≠ 0, then v is NaN, regardless of s
(2)  if E = Emax + 1 and f = 0, then v = (−1)^s × ∞
(3)  if Emin ≤ E ≤ Emax, then v = (−1)^s × 2^E × (1.f)
(4)  if E = Emin − 1 and f ≠ 0, then v = (−1)^s × 2^Emin × (0.f)
(5)  if E = Emin − 1 and f = 0, then v = (−1)^s × 0
For all floating-point formats, if v is NaN, the most-significant bit of f determines whether
the value is a signaling or quiet NaN: v is a signaling NaN if the most-significant bit of f is set,
otherwise, v is a quiet NaN.
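As an illustration of the five equations in Table 12-6, the following sketch decodes a raw bit pattern into a value. The names decode and FORMATS are hypothetical; the field widths and bias values are those listed for the single and double formats in Table 12-7.

```python
import math

# Hypothetical table: format name -> (exponent bits, fraction bits, bias),
# using the values given in Table 12-7.
FORMATS = {"single": (8, 23, 127), "double": (11, 52, 1023)}


def decode(bits, fmt="single"):
    """Apply equations (1)-(5) of Table 12-6 to a raw bit pattern."""
    exp_bits, frac_bits, bias = FORMATS[fmt]
    s = (bits >> (exp_bits + frac_bits)) & 1
    e = (bits >> frac_bits) & ((1 << exp_bits) - 1)
    f = bits & ((1 << frac_bits) - 1)

    big_e = e - bias            # unbiased exponent E
    emax = bias
    emin = 1 - bias
    sign = -1.0 if s else 1.0
    frac = f / (1 << frac_bits)  # numeric value of 0.f

    if big_e == emax + 1:        # equations (1) and (2)
        return math.nan if f else sign * math.inf
    if big_e == emin - 1:        # equations (4) and (5)
        return sign * math.ldexp(frac, emin)
    return sign * math.ldexp(1.0 + frac, big_e)  # equation (3)
```

For example, decode(0x3E200000) evaluates equation (3) with E = −3 and 1.f = 1.25, giving 0.15625.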
Table 12-7 defines the values for the format parameters; minimum and maximum floating-
point values are given in Table 12-8.
Table 12-7 Floating-Point Format Parameter Values
                          Format
Parameter                 Single    Double
Emax                      +127      +1023
Emin                      –126      –1022
Exponent bias             +127      +1023
Exponent width in bits    8         11
Integer bit               hidden    hidden
Fraction width in bits    23†       52†
Format width in bits      32        64
† Excluding the sign bit.
Table 12-8 Minimum and Maximum Floating-Point Values
Type                             Value
Single-precision Minimum         1.40129846e-45
Single-precision Minimum Norm    1.17549435e-38
Single-precision Maximum         3.40282347e+38
Double-precision Minimum         4.9406564584124654e-324
Double-precision Minimum Norm    2.2250738585072014e-308
Double-precision Maximum         1.7976931348623157e+308
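The values in Table 12-8 follow from the parameters in Table 12-7. The sketch below shows one way to derive them; the function name extremes is hypothetical, and the computation relies on the host's double-precision arithmetic.

```python
import math


def extremes(emin, emax, frac_bits):
    """Return (minimum denormal, minimum normalized, maximum finite) values."""
    smallest_denormal = math.ldexp(1.0, emin - frac_bits)  # 2^Emin x 0.00...01
    smallest_normal = math.ldexp(1.0, emin)                # 2^Emin x 1.0
    largest = math.ldexp(2.0 - math.ldexp(1.0, -frac_bits), emax)  # 2^Emax x 1.11...1
    return smallest_denormal, smallest_normal, largest


single = extremes(-126, 127, 23)
double = extremes(-1022, 1023, 52)
```

Formatting the single-precision results to eight fractional digits reproduces the rounded figures shown in the table.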
12.4 Binary Fixed-Point Format
Binary fixed-point values are held in 2's complement format. Unsigned fixed-point values
are not directly provided by the floating-point instruction set. Figure 12-7 illustrates binary
single fixed-point format and Figure 12-8 illustrates binary long fixed-point format; Table 12-9
lists the binary fixed-point format fields.
Bits    Field      Width (bits)
31      Sign       1
30-0    Integer    31
Figure 12-7 Binary Single Fixed-Point Format
Bits    Field      Width (bits)
63      Sign       1
62-0    Integer    63
Figure 12-8 Binary Long Fixed-Point Format
Field assignments of the binary fixed-point format are:
Table 12-9 Binary Fixed-Point Format Fields
Field      Description
sign       sign bit
integer    integer value (2’s complement)
12.5 Floating-Point Instruction Set Summary
Each instruction is 32 bits long and aligned on a word boundary. This section gives an
overview of the floating-point unit instructions. A detailed description of each instruction
is provided in Appendix B.
12.5.1 Load, Move and Store Instructions (Table 12-10)
Load and Store instructions move data between memory and FPU general purpose
registers, and Move instructions move data directly between CPU and FPU general
purpose registers. These instructions do not perform format conversions and therefore
never cause floating-point exceptions. The instruction immediately following a load can
use the contents of the loaded register. In that case, however, the hardware interlocks,
requiring additional real cycles. Load delay slots should therefore be scheduled to
avoid the interlock.
Data Alignment
All processor loads and stores reference the following aligned data items:
For word loads and stores, the access type is always WORD, and the low-order 2 bits
of the address must always be 0.
For doubleword loads and stores, the access type is always DOUBLEWORD, and the
low-order 3 bits of the address must always be 0.
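The two alignment rules above amount to a check on the low-order address bits. A minimal sketch, with the hypothetical names ACCESS_SIZES and is_aligned:

```python
# Hypothetical model of the alignment rules: access type -> size in bytes.
ACCESS_SIZES = {"WORD": 4, "DOUBLEWORD": 8}


def is_aligned(address, access_type):
    """True when the low-order 2 (WORD) or 3 (DOUBLEWORD) address bits are 0."""
    size = ACCESS_SIZES[access_type]
    return address & (size - 1) == 0
```

For instance, address 0x1004 satisfies the WORD rule but not the DOUBLEWORD rule.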
Endian
Regardless of byte-numbering order (endianness) of the data, the address specifies the
byte that has the smallest byte address in the addressed field. For a big-endian system, it
is the leftmost byte; for a little-endian system, it is the rightmost byte.
Table 12-10 FPU Instruction Set (Optional): Load, Move and Store Instruction
Instruction  Description                                   Note
LWC1         Load Word to FPU (coprocessor 1)              MIPS I
SWC1         Store Word from FPU (coprocessor 1)           MIPS I
MTC1         Move Word to FPU (coprocessor 1)              MIPS I
CTC1         Move Control Word to FPU (coprocessor 1)      MIPS I
MFC1         Move Word from FPU (coprocessor 1)            MIPS I
CFC1         Move Control Word from FPU (coprocessor 1)    MIPS I
12.5.2 Conversion Instructions (Table 12-11)
Conversion instructions perform conversion operations between the various data
formats such as single- or double-precision, fixed- or floating-point formats. Table 12-11
lists the conversion instructions.
Table 12-11 FPU Instruction Set (Optional): Conversion Instruction
Instruction    Description                                            Note
CVT.S.fmt      Floating-Point Convert to Single FP Format             MIPS I
CVT.W.fmt      Floating-Point Convert to Single Fixed-Point Format    MIPS I
ROUND.W.fmt    Floating-point Round                                   MIPS II
TRUNC.W.fmt    Floating-point Truncate                                MIPS II
CEIL.W.fmt     Floating-point Ceiling                                 MIPS II
FLOOR.W.fmt    Floating-point Floor                                   MIPS II
12.5.3 Computational Instructions (Table 12-12)
Computational instructions perform arithmetic operations on floating-point values in
the FPU registers. There are two categories of computational instructions:
3-Operand Register-Type instructions, which perform floating-point addition,
subtraction, multiplication, and division operations
2-Operand Register-Type instructions, which perform floating-point absolute
value, move, negate, and square root operations.
Table 12-12 FPU Instruction Set (Optional): Computational Instruction
Instruction  Description                       Note
ADD.fmt      Floating-point Add                MIPS I
SUB.fmt      Floating-point Subtract           MIPS I
MUL.fmt      Floating-point Multiply           MIPS I
DIV.fmt      Floating-point Divide             MIPS I
ABS.fmt      Floating-point Absolute Value     MIPS I
MOV.fmt      Floating-point Move               MIPS I
NEG.fmt      Floating-point Negate             MIPS I
SQRT.fmt     Floating-point Square root        MIPS II
12.5.4 Compare and Branch Instructions (Table 12-13)
Compare instructions perform comparisons of the contents of registers and set a
conditional bit based on the results. Branch on FPU Condition instructions perform a
branch to the specified target if the specified coprocessor condition is met.
Table 12-13 FPU Instruction Set(Optional): Compare and Branch Instruction
Instruction Description Note
C.cond.fmt Floating-point Compare MIPS I
BC1T Branch on FPU True MIPS I
BC1F Branch on FPU False MIPS I
BC1TL Branch on FPU True Likely MIPS II
BC1FL Branch on FPU False Likely MIPS II
The floating-point compare (C.cond.fmt) instructions interpret the contents of two FPU
registers (fs, ft) in the specified format (fmt) and arithmetically compare them. A result is
determined based on the comparison and conditions (cond) specified in the instruction.
Table 12-14 lists the mnemonics for the compare instruction conditions.
Table 12-14 Mnemonics and Definitions of Compare Instruction Conditions
Mnemonic  Definition                              Mnemonic  Definition
F         False                                   T         True
UN        Unordered                               OR        Ordered
EQ        Equal                                   NEQ       Not Equal
UEQ       Unordered or Equal                      OLG       Ordered or Less than or Greater than
OLT       Ordered Less Than                       UGE       Unordered or Greater than or Equal
ULT       Unordered or Less Than                  OGE       Ordered Greater than or Equal
OLE       Ordered Less Than or Equal              UGT       Unordered or Greater Than
ULE       Unordered or Less than or Equal         OGT       Ordered Greater Than
SF        Signaling False                         ST        Signaling True
NGLE      Not Greater than or Less than or Equal  GLE       Greater than, or Less than or Equal
SEQ       Signaling Equal                         SNE       Signaling Not Equal
NGL       Not Greater than or Less than           GL        Greater than or Less Than
LT        Less Than                               NLT       Not Less Than
NGE       Not Greater than or Equal               GE        Greater than or Equal
LE        Less than or Equal                      NLE       Not Less than or Equal
NGT       Not Greater Than                        GT        Greater Than
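The sixteen left-column conditions of Table 12-14 can be modeled as combinations of three relations (unordered, equal, less than) plus a signaling attribute. The sketch below assumes the usual MIPS 4-bit cond encoding, in which the left-column mnemonics are numbered 0 to 15 in table order; that encoding is not stated in this section and should be checked against Appendix B. The names CONDS and compare are hypothetical.

```python
import math

# Assumed encoding: index in this list = 4-bit cond value.
# bit 0: unordered, bit 1: equal, bit 2: less than, bit 3: signal on unordered.
CONDS = ["F", "UN", "EQ", "UEQ", "OLT", "ULT", "OLE", "ULE",
         "SF", "NGLE", "SEQ", "NGL", "LT", "NGE", "LE", "NGT"]


def compare(cond, fs, ft):
    """Return (condition result, invalid-operation signaled) for C.cond.fmt."""
    code = CONDS.index(cond)
    unordered = math.isnan(fs) or math.isnan(ft)
    equal = (not unordered) and fs == ft
    less = (not unordered) and fs < ft
    result = bool((code & 1 and unordered) or
                  (code & 2 and equal) or
                  (code & 4 and less))
    # Signaling NaN operands, which raise Invalid for every condition,
    # are not modeled here.
    invalid = bool(code & 8) and unordered
    return result, invalid
```

Each right-column mnemonic is the logical negation of the condition on its left, so a branch on false (BC1F) after the left-column compare covers it.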
13. Floating-Point Exception
13.1 Introduction
This chapter describes floating-point exceptions, including FPU exception types, exception
trap processing, exception flags, saving and restoring state when handling an exception, and
trap handlers for IEEE Standard 754 exceptions.
13.2 Exception Types
The FP Control/Status register described in Chapter 12 contains an Enable bit for each
exception type; exception Enable bits determine whether an exception will cause the FPU to
initiate a trap or set a status flag.
If a trap is taken, the FPU remains in the state found at the beginning of the
operation and a software exception handling routine executes.
If no trap is taken, an appropriate value is written into the FPU destination register
and execution continues.
The FPU supports the five IEEE Standard 754 exceptions:
Inexact (I)
Underflow (U)
Overflow (O)
Division by Zero (Z)
Invalid Operation (V)
Cause bits, Enables, and Flag bits (status flags) are used.
The FPU adds a sixth exception type, Unimplemented Operation (E). This exception
indicates the use of a software implementation. The Unimplemented Operation exception has
no Enable or Flag bit; whenever this exception occurs, an unimplemented exception trap is
taken.
Figure 13-1 shows the Control/Status register bits that support exceptions.
Bit #           17   16   15   14   13   12
Cause Bits      E    V    Z    O    U    I
Bit #                11   10   9    8    7
Enable Bits          V    Z    O    U    I
Bit #                6    5    4    3    2
Flag Bits            V    Z    O    U    I
E: Unimplemented, V: Invalid, Z: Division by Zero, O: Overflow, U: Underflow, I: Inexact
Figure 13-1 Control/Status Register Exception/Flag/Trap/Enable Bits
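The bit positions in Figure 13-1 can be captured in a small decoder. This is a sketch under the assumption that the Control/Status register value has already been read (for example with CFC1) into an integer; the names CAUSE, ENABLE, FLAG, set_bits and trapping_exceptions are hypothetical.

```python
# Bit positions taken from Figure 13-1.
CAUSE = {"E": 17, "V": 16, "Z": 15, "O": 14, "U": 13, "I": 12}
ENABLE = {"V": 11, "Z": 10, "O": 9, "U": 8, "I": 7}
FLAG = {"V": 6, "Z": 5, "O": 4, "U": 3, "I": 2}


def set_bits(fcsr, field):
    """List the exception letters whose bit is set in the given field."""
    return [name for name, bit in field.items() if (fcsr >> bit) & 1]


def trapping_exceptions(fcsr):
    """Cause bits that lead to a trap: E always, the others only when enabled."""
    enabled = set(set_bits(fcsr, ENABLE))
    return [name for name in set_bits(fcsr, CAUSE)
            if name == "E" or name in enabled]
```

The special handling of E reflects the statement above that the Unimplemented Operation exception has no Enable bit.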
13.3 Exception Trap Processing
When a floating-point exception trap is taken, the Cause register indicates the floating-point
coprocessor is the cause of the exception trap.
The Floating-Point Exception (FPE) code is used, and the Cause bits of the floating-point
Control/Status register indicate the reason for the floating-point exception. These bits are, in
effect, an extension of the system coprocessor Cause register.
13.4 Flags
A Flag bit is provided for each IEEE exception. This Flag bit is set to a 1 on the assertion of
its corresponding exception, with no corresponding exception trap signaled.
When no exception trap is signaled, the floating-point coprocessor takes a default action,
providing a substitute value for the exception-causing result of the floating-point operation.
The particular default action taken depends upon the type of exception. Table 13-1 lists the
default action taken by the FPU for each of the IEEE exceptions.
Table 13-1 Default FPU Exception Actions
Field  Description          Rounding Mode  Default Action
I      Inexact exception    ANY            Supply a rounded result.
U      Underflow exception  ANY            Supply a rounded result.
O      Overflow exception   RN             Modify overflow values to ∞ with the sign of the
                                           intermediate result.
                            RZ             Modify overflow values to the format’s largest finite
                                           number with the sign of the intermediate result.
                            RP             Modify negative overflows to the format’s most negative
                                           finite number; modify positive overflows to +∞.
                            RM             Modify positive overflows to the format’s largest finite
                                           number; modify negative overflows to −∞.
Z      Division by zero     ANY            Supply a properly signed ∞.
V      Invalid operation    ANY            Supply a quiet Not a Number (NaN).
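The overflow rows of Table 13-1 can be expressed as a single selection on rounding mode and sign. A minimal sketch; the function name overflow_default and its parameters are hypothetical, and max_finite stands for the destination format's largest finite number.

```python
import math


def overflow_default(rm, negative, max_finite):
    """Return the substitute result for an untrapped overflow (Table 13-1)."""
    if rm == "RN":
        to_infinity = True            # infinity with the sign of the result
    elif rm == "RZ":
        to_infinity = False           # largest finite number, same sign
    elif rm == "RP":
        to_infinity = not negative    # only positive overflows reach +infinity
    elif rm == "RM":
        to_infinity = negative        # only negative overflows reach -infinity
    else:
        raise ValueError("unknown rounding mode")
    magnitude = math.inf if to_infinity else max_finite
    return -magnitude if negative else magnitude
```

In every mode the substitute value keeps the sign of the intermediate result; only the choice between infinity and the largest finite magnitude varies.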
The FPU detects the eight exception causes internally. When the FPU encounters one of
these unusual situations, it causes either an IEEE exception or an Unimplemented Operation
exception (E).
Table 13-2 lists the exception-causing situations and contrasts the behavior of the FPU with
the requirements of the IEEE Standard 754.
Table 13-2 FPU Exception-Causing Conditions
FPA Internal Result    IEEE Standard 754  Trap Enable  Trap Disable  Notes
Inexact result         I                  I            I             Loss of accuracy
Exponent overflow      O, I*              O, I         O, I          Normalized exponent > Emax
Division by zero       Z                  Z            Z             Zero is (exponent = Emin – 1, mantissa = 0)
Overflow on convert    V                  V            E             Source out of integer range
Signaling NaN source   V                  V            E             Quiet NaN result generated from quiet NaN
                                                                     source
Invalid operation      V                  V            E             0/0, etc.
Exponent underflow     U                  E            E             Normalized exponent < Emin
Denormalized or QNaN   None               E            E             Denormalized is (exponent = Emin – 1 and
                                                                     mantissa < > 0)
*The IEEE Standard 754 specifies an inexact exception on overflow only if the overflow trap is
disabled.
13.5 FPU Exceptions
The following sections describe the conditions that cause the FPU to generate each of its
exceptions, and detail the FPU response to each exception-causing condition.
Inexact Exception (I)
The FPU generates the Inexact exception if one of the following occurs:
the rounded result of an operation is not exact, or
the rounded result of an operation overflows, or
the rounded result of an operation underflows and both the Underflow and Inexact
Enable bits are not set and the FS bit is set.
Trap Enabled Results: If Inexact exception traps are enabled, the result register is not
modified and the source registers are preserved.
Trap Disabled Results: The rounded or overflowed result is delivered to the destination
register if no other software trap occurs.
Invalid Operation Exception (V)
The Invalid Operation exception is signaled if one or both of the operands are invalid for an
implemented operation. When the exception occurs without a trap, the MIPS ISA defines the
result as a quiet Not a Number (qNaN). The invalid operations are:
Addition or subtraction: magnitude subtraction of infinities, such as: (+∞) + (−∞) or
(−∞) − (−∞)
Multiplication: 0 times ∞, with any signs
Division: 0/0, or ∞/∞, with any signs
Comparison of predicates involving ‘<’ or ‘>’ without ‘?’, when the operands are
unordered
Any arithmetic operation, when one or both operands is a signaling NaN. A move
(MOV) operation is not considered to be an arithmetic operation, but absolute value
(ABS) and negate (NEG) are.
Comparison or a Convert From Floating-point Operation on a signaling NaN.
Square root: √x, where x is less than zero.
Software can simulate the Invalid Operation exception for other operations that are invalid
for the given source operands. Examples of these operations include IEEE Standard 754-
specified functions implemented in software, such as Remainder: x REM y, where y is 0 or x
is infinite; conversion of a floating-point number to a decimal format whose value causes an
overflow, is infinity, or is NaN; and transcendental functions, such as ln(−5) or cos⁻¹(3).
Refer to Appendix B for examples or for routines to handle these cases.
Trap Enabled Results: The result register is not modified, and the source registers are
preserved.
Trap Disabled Results: A quiet NaN is delivered to the destination register if no other
software trap occurs.
Divide-by-Zero Exception (Z)
The Division-by-Zero exception is signaled on an implemented divide operation if the
divisor is zero and the dividend is a finite nonzero number. Software can simulate this
exception for other operations that produce a signed infinity, such as ln(0), sec(π/2), csc(0),
or 0^−1.
Trap Enabled Results: The result register is not modified, and the source registers are
preserved.
Trap Disabled Results: The result, when no trap occurs, is a correctly signed infinity.
Overflow Exception (O)
The Overflow exception is signaled when the magnitude of the rounded floating-point
result, with an unbounded exponent range, is larger than the largest finite number of the
destination format. (This exception also signals an Inexact exception.)
Trap Enabled Results: The result register is not modified, and the source registers are
preserved.
Trap Disabled Results: The result, when no trap occurs, is determined by the rounding
mode and the sign of the intermediate result (as listed in Table 13-1).
Underflow Exception (U)
Two related events contribute to the Underflow exception:
creation of a tiny nonzero result between ±2^Emin which can cause some later exception
because it is so tiny
extraordinary loss of accuracy during the approximation of such tiny numbers by
denormalized numbers.
IEEE Standard 754 allows a variety of ways to detect these events, but requires they be
detected the same way for all operations.
Tininess can be detected by one of the following methods:
after rounding (when a nonzero result, computed as though the exponent range were
unbounded, would lie strictly between ±2^Emin)
before rounding (when a nonzero result, computed as though the exponent range and
the precision were unbounded, would lie strictly between ±2^Emin).
The MIPS architecture requires that tininess be detected after rounding.
Loss of accuracy can be detected by one of the following methods:
denormalization loss (when the delivered result differs from what would have been
computed if the exponent range were unbounded)
inexact result (when the delivered result differs from what would have been computed
if the exponent range and precision were both unbounded).
The MIPS architecture requires that loss of accuracy be detected as an inexact result.
Trap Enabled Results: If Underflow or Inexact traps are enabled, or if the FS bit is not
set, then an Unimplemented exception (E) is generated, and the
result register is not modified.
Trap Disabled Results: If Underflow and Inexact traps are not enabled and the FS bit is
set, the result is determined by the rounding mode and the sign of
the intermediate result (as listed in Table 13-1).
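The two Underflow outcomes above depend only on the Underflow Enable bit, the Inexact Enable bit and the FS bit. A minimal sketch of that decision, using the hypothetical function name underflow_outcome and hypothetical return labels:

```python
def underflow_outcome(underflow_enabled, inexact_enabled, fs):
    """Classify how an underflowing result is handled, per the rules above."""
    if underflow_enabled or inexact_enabled or not fs:
        # Unimplemented exception (E); the result register is not modified.
        return "unimplemented-trap"
    # Both traps disabled and FS set: the hardware delivers a default result.
    return "default-result"
```

Only one of the eight input combinations (both enables clear, FS set) avoids the Unimplemented exception.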
Unimplemented Instruction Exception (E)
Any attempt to execute an instruction with an operation code or format code that has been
reserved for future definition sets the Unimplemented bit in the Cause field in the FPU
Control/Status register and traps. The operand and destination registers remain
undisturbed and the instruction is emulated in software. Any of the IEEE Standard 754
exceptions can arise from the emulated operation, and these exceptions in turn are
simulated.
The Unimplemented Instruction exception can also be signaled when unusual operands or
result conditions are detected that the implemented hardware cannot handle properly.
These include:
Denormalized operand, except for Compare instruction
Quiet Not a Number operand, except for Compare instruction
Denormalized result or Underflow, when either Underflow or Inexact Enable bits are
set or the FS bit is not set.
Reserved opcodes
Unimplemented formats
Operations which are invalid for their format (for instance, CVT.S.S)
Note: Denormalized and NaN operands are only trapped if the instruction is a convert or
computational operation. Moves do not trap if their operands are either denormalized or
NaNs.
The use of this exception for such conditions is optional; most of these conditions are newly
developed and are not expected to be widely used in early implementations. Loopholes are
provided in the architecture so that these conditions can be implemented with assistance
provided by software, maintaining full compatibility with the IEEE Standard 754.
Trap Enabled Results: The result register is not modified, and the source registers are
preserved.
Trap Disabled Results: This trap cannot be disabled.
13.6 Saving and Restoring State
Sixteen doubleword coprocessor load or store operations save or restore the coprocessor
floating-point register state in memory. The remainder of control and status information can
be saved or restored through CFC1/CTC1 instructions, and saving and restoring the processor
registers. Normally, the Control/Status register is saved first and restored last.
When state is restored, state information in the Control/Status register indicates the
exceptions that are pending. Writing a zero value to the Cause field of Control/Status register
clears all pending exceptions, permitting normal processing to restart after the floating-point
register state is restored.
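Clearing the Cause field before restoring the Control/Status register can be sketched as a mask operation. The field position (bits 17:12) is taken from Figure 13-1; the names CAUSE_FIELD_MASK and clear_cause are hypothetical, and the register value is modeled as a plain integer.

```python
CAUSE_FIELD_MASK = 0x3F << 12  # Cause bits E, V, Z, O, U, I (bits 17:12)


def clear_cause(fcsr):
    """Return the Control/Status value with all pending Cause bits cleared."""
    return fcsr & ~CAUSE_FIELD_MASK
```

A handler following the order described above would save the Control/Status value first, and write back clear_cause(saved_value) last, after the floating-point registers have been reloaded.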
13.7 Trap Handlers for IEEE Standard 754 Exceptions
The IEEE Standard 754 strongly recommends that users be allowed to specify a trap
handler for any of the five standard exceptions; the trap handler can either
compute or specify a substitute result to be placed in the destination register of the operation.
By retrieving an instruction using the processor Exception Program Counter (EPC) register,
the trap handler determines:
exceptions occurring during the operation
the operation being performed
the destination format
On Overflow or Underflow exceptions (except for conversions), and on Inexact exceptions,
the trap handler gains access to the correctly rounded result by examining source registers
and simulating the operation in software.
On Overflow or Underflow exceptions encountered on floating-point conversions, and on
Invalid Operation and Divide-by-Zero exceptions, the trap handler gains access to the operand
values by examining the source registers of the instruction.
The IEEE Standard 754 recommends that, if enabled, the overflow and underflow traps
take precedence over a separate inexact trap. This prioritization is accomplished in software;
hardware sets the bits for both the Inexact exception and the Overflow or Underflow
exception.
Note: The save and restore sequence in Section 13.6 uses 32 doubleword operations, rather
than sixteen, if the FR bit is set to 1.
14. Debug Support Unit
14.1 Features
1. Utilizes JTAG interface compatible with IEEE Std. 1149.1.
2. Additional Status pins and debug clock in conjunction with JTAG pins provide Real-Time
Trace information.
3. Processor access to an external processor probe, allowing execution from the external trace
memory during debug exceptions and at boot time. This eliminates the need for system memory
for debugging purposes.
4. Supports DMA access through JTAG interface to internal processor bus to access internal
registers, host system peripherals and system memory.
5. Debug functions
Instruction Address Break
Data Bus break
Processor Bus Break
Hardware Debug Interrupt
Reset, NMI, Interrupt Mask
6. Instructions for Debug
SDBBP, DERET, CTC0, CFC0
7. CP0 Registers for Debug
Debug, DEPC, DESAVE
14.2 EJTAG interface
This interface has two modes of operation: a Run Time mode and a Real Time mode.
The Run Time mode provides functions such as processor Run, Stop, Single Step, and access to
internal registers and system memory. The Real Time mode provides additional status pins
used in conjunction with JTAG pins for Real Time Trace information.
Pins           In/Out  Description
GTCK           I       Test Clock Input
GDCLK          O       Debug Clock (1/3 CPU Clock)
GTDI/GDINT     I       Test Data Input (GTDI) at Run Time mode /
                       Debug Interrupt Input (GDINT) at Real Time mode
GTDO/GTPC[0]   O       Test Data Output (GTDO) / PC Output (GTPC)
GTMS           I       Test Mode Select Input
GTRST*         I       Reset
GPCST[8:0]     O       PC Trace Status Information
GTPC[3:1]      O       PC Output
14.3 JTAG Interface
The standard JTAG interface is used for on-chip debugging during Run Time mode. The TX49
Debug Support Unit has the following registers.
Instruction Register
Bypass Register
Boundary-Scan Register
Device Identification Register
Implementation Register
JTAG_Data_register
JTAG_Address_Register
JTAG_Control_Register
14.4 Processor Access Overview
The core processor can access an external processor probe to read and write external
monitor memory, registers and other external resources.
In addition, the processor can execute from the external monitor memory located from
0xf_ff20 0000 to 0xf_ff2f ffff when the ProbEnb bit is set and the processor probe is turned ON.
Access to the monitor locations from 0xf_ff20 0000 to 0xf_ff3f ffff is allowed only when the
processor is in debug mode (DM = 1).
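The two address windows described above can be modeled as simple range checks. This is an illustrative sketch only: the function names are hypothetical, and treating "probe turned ON" as a separate boolean input is an assumption.

```python
# Address windows quoted in Section 14.4 (36-bit physical addresses).
MONITOR_EXEC_START = 0xF_FF20_0000
MONITOR_EXEC_END = 0xF_FF2F_FFFF
DEBUG_ONLY_START = 0xF_FF20_0000
DEBUG_ONLY_END = 0xF_FF3F_FFFF


def can_execute_from_monitor(addr, prob_enb, probe_on):
    """True when addr lies in the monitor memory and the probe is usable."""
    in_window = MONITOR_EXEC_START <= addr <= MONITOR_EXEC_END
    return bool(prob_enb and probe_on and in_window)


def access_permitted(addr, debug_mode):
    """Accesses inside the debug-only window require DM = 1."""
    in_debug_window = DEBUG_ONLY_START <= addr <= DEBUG_ONLY_END
    return bool(debug_mode) or not in_debug_window
```

Note that the debug-only window is twice the size of the executable monitor window; its upper half contains the register map of Section 14.7.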
14.5 Instruction
The instruction is an 8-bit field. Instructions for the TX49 Debug Support Unit are encoded
between 0x80 and 0x9f; other codes are reserved for Toshiba Standard JTAG instructions
(including EXTEST, SAMPLE/PRELOAD, INTEST, IDCODE and HI-Z) and so on.
Instructions are decoded as follows.
Hex Value  Instruction        Description
0x83       EJTAG_ImpCode      Select Implementation Register
0x88       JTAG_ADDRESS_IR    Select JTAG_Address Register
0x89       JTAG_DATA_IR       Select JTAG_Data Register
0x8A       JTAG_CONTROL_IR    Select JTAG_Control Register
0x8B       JTAG_ALL_IR        Select JTAG_All Register
0x90       PCTRACE            PCTRACE Instruction
Any unused instruction code between 0x80 and 0x9f defaults to the BYPASS instruction.
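The decoding rule above, including the BYPASS default, can be sketched as a table lookup. The names DSU_INSTRUCTIONS and decode_ir are hypothetical, and the label returned for codes outside 0x80 to 0x9f is a placeholder, since those codes are defined elsewhere.

```python
# Instruction codes listed in Section 14.5.
DSU_INSTRUCTIONS = {
    0x83: "EJTAG_ImpCode",
    0x88: "JTAG_ADDRESS_IR",
    0x89: "JTAG_DATA_IR",
    0x8A: "JTAG_CONTROL_IR",
    0x8B: "JTAG_ALL_IR",
    0x90: "PCTRACE",
}


def decode_ir(code):
    """Map an 8-bit instruction register value to an instruction name."""
    if not 0 <= code <= 0xFF:
        raise ValueError("instruction register is an 8-bit field")
    if 0x80 <= code <= 0x9F:
        return DSU_INSTRUCTIONS.get(code, "BYPASS")
    return "RESERVED_STANDARD_JTAG"  # placeholder label, not a real mnemonic
```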
14.6 Debug Unit
14.6.1 Extended Instructions
SDBBP
DERET
CTC0
CFC0
14.6.2 Extended Debug Registers in CP0
Debug Register
Debug Exception PC (DEPC)
Debug SAVE
14.7 Register Map
Address         Mnemonic  Description
0xf ff30 0000   DCR       Debug Control Register
0xf ff30 0008   IBS       Instruction Break Status
0xf ff30 0010   DBS       Data Break Status
0xf ff30 0018   PBS       Processor Break Status
0xf ff30 0100   IBA0      Instruction Break Address 0
0xf ff30 0108   IBC0      Instruction Break Control 0
0xf ff30 0110   IBM0      Instruction Break Address Mask 0
0xf ff30 0300   DBA0      Data Break Address 0
0xf ff30 0308   DBC0      Data Break Control 0
0xf ff30 0310   DBM0      Data Break Address Mask 0
0xf ff30 0318   DB0       Data Break Value 0
0xf ff30 0600   PBA0      Processor Bus Break Address 0
0xf ff30 0608   PBD0      Processor Bus Break Data 0
0xf ff30 0610   PBM0      Processor Bus Break Mask 0
0xf ff30 0618   PBC0      Processor Bus Break Control 0
14.8 Processor Bus Break Function
This function monitors the interface to the core and provides a debug interrupt or trace
trigger for a given physical address and data.
14.9 Debug Exception
Three kinds of debug exception are supported.
Debug Single Step (DSS bit)
Debug Breakpoint Exception (SDBBP Instruction)
JTAG Break Exception (Jtagbrk bit in JTAG_Control_Register)
Note: During real time debugging, the first two are disabled.
14.9.1 Debug Single Step (DSS)
When the DSS bit of the Debug register is set, this exception is raised each time an
instruction is executed.
14.9.2 Debug Breakpoint exception (Dbp)
This exception is raised when an SDBBP instruction is executed.
14.9.3 JTAG Break Exception
This exception is raised when the JTAG unit sets the Jtagbrk bit in JTAG_Control_Register.
14.9.4 Debug Exception Handling
Updates DEPC and Debug register.
Registers other than DEPC and Debug register retain their values.
14.9.5 Branching to debug handler
If the ProbEnb bit in JTAG_Control_Register[15] is set, the debug exception vector is
located at PC: 0xffff ffff ff20 0200.
If the ProbEnb bit in JTAG_Control_Register[15] is cleared, the debug exception vector
is located at PC: 0xffff ffff bfc0 0400.
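The vector selection can be summarized in a few lines. This sketch assumes that JTAG_Control_Register[15] denotes bit 15 of the register value; the function and constant names are hypothetical.

```python
PROBENB_BIT = 15  # assumed position of ProbEnb in JTAG_Control_Register

VECTOR_PROBE = 0xFFFF_FFFF_FF20_0200   # handler in external monitor memory
VECTOR_NORMAL = 0xFFFF_FFFF_BFC0_0400  # handler in the boot address space


def debug_vector(jtag_control):
    """Return the debug exception vector selected by the ProbEnb bit."""
    prob_enb = (jtag_control >> PROBENB_BIT) & 1
    return VECTOR_PROBE if prob_enb else VECTOR_NORMAL
```

The probe vector falls inside the monitor memory window of Section 14.4, which is why it is only usable when the probe is enabled.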
14.9.6 Exception handling when in Debug Mode (DM bit is set)
All interrupts including NMI are masked. When an NMI occurs
during Debug mode, it is stored internally and is taken after the debug
handler has finished (DM is cleared).
14.10 Real Time PC TRACE Output
In real time mode, non-sequential Program Counter and trace information are output on
GTPC[3:0] and GPCST[8:0] at 1/3 of the processor clock speed.
15. TX49 MPU Core Signal Descriptions
The TX49 MPU core has a 64-bit bus interface that is upward compatible with the TX39 G-bus
interface.
Figure 15-1 TX49 MPU Core Interface Signals
[Figure: block diagram of the TX49 core and its signal groups: Memory Interface, DMA
Interface, Debug/JTAG Interface, Clock and System Control Interface, Interrupt Interface,
Test Interface, and Coprocessor Interface.]
15.1 Signal Descriptions
15.1.1 Memory Interface Signals
Table 15-1 lists the memory interface signals.
Table 15-1 Memory Interface Signals
Signal Name    I/O  Active State  Description
GAFM[35:0]     O                  Address From Bus (Output)
    GAFM[35:0] is used as a 36-bit output address bus.
GATM[35:5]     I                  Address To Bus (Input)
    GATM[35:5] is a 31-bit address input bus used for data cache snooping.
GBE[7:0]*      O    Low           Byte Enable
    GBE[7:0]* defines the valid data bytes within the 64-bit data bus. The correlation
    between the byte enable signals and data bytes is as follows:
        Byte Enable    Corresponding Data Byte
        GBE[7]*        GDFM[63:56], GDTM[63:56]
        GBE[6]*        GDFM[55:48], GDTM[55:48]
        GBE[5]*        GDFM[47:40], GDTM[47:40]
        GBE[4]*        GDFM[39:32], GDTM[39:32]
        GBE[3]*        GDFM[31:24], GDTM[31:24]
        GBE[2]*        GDFM[23:16], GDTM[23:16]
        GBE[1]*        GDFM[15:8], GDTM[15:8]
        GBE[0]*        GDFM[7:0], GDTM[7:0]
GDFM[63:0]     O                  Data From Master (Output)
    This data bus always acts as a 64-bit output.
GDTM[63:0]     I                  Data to Master (Input)
    This data bus always acts as a 64-bit input.
GRD*           O    Low           Read
    GRD* is an output-only strobe that is asserted during a bus read operation.
GWR*           O    Low           Write
    GWR* is an output-only strobe that is asserted during a bus write operation.
GACK*          I    Low           Read/Write Acknowledge
    GACK* is sampled with the rising edge of GBUSCLK. The TX49 MPU core ends
    single-read and single-write operations in the next cycle after GACK* is recognized
    as asserted. During burst-read and burst-write operations, the TX49 MPU core
    increments the address at the next rising edge of GBUSCLK after GACK* is
    recognized as asserted. If GACK* is sampled as deasserted, a bus wait cycle is
    inserted.
GCACHE*        O                  Cacheable
    GCACHE* is an output signal that indicates whether the bus transfer in progress is
    being performed on a cached or uncached address space.
    H: Uncached space
    L: Cached space
GID            O                  Instruction or Data
    GID is an output signal that indicates the type of bus transfer being performed.
    H: Instruction
    L: Data
GBSTART*       O    Low           Bus Start
    GBSTART* is an output signal that is asserted for one clock cycle to indicate that a
    bus operation has started.
GBUSERR*       I    Low           Bus Error
    When GBUSERR* is asserted during a bus read operation, the TX49 MPU core
    immediately terminates the ongoing transaction and takes a Bus Error exception.
    GBUSERR* is valid only during bus read operations.
GBURST*        O    Low           Burst
    GBURST* is an output-only strobe that is asserted during burst-read and burst-
    write operations.
GLAST*         O    Low           Last
    GLAST* is an output signal that indicates completion of a bus cycle.
    During a single-read or single-write, GLAST* is asserted simultaneously with
    GBSTART*.
    During a burst-read or burst-write, GLAST* is asserted when the TX49 MPU
    core has recognized a GACK* for the second last data read.
GBUSOEN*       O    Low           G-Bus Output Enable
    GBUSOEN* is the output enable control for the bus control signals:
    While the TX49 assumes bus mastership: Low
    While the TX49 has released bus mastership: High
    While GDIS* is asserted: High
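The byte enable correspondence listed for GBE[7:0]* in Table 15-1 can be illustrated with a short helper that reports which data bus lanes are valid. The function name valid_byte_lanes is hypothetical; the input is the raw pin value, so a 0 bit means the active-low enable is asserted.

```python
def valid_byte_lanes(gbe_n):
    """Return (msb, lsb) bit ranges of the data bus enabled by GBE[7:0]*.

    gbe_n is the 8-bit value on the active-low GBE[7:0]* pins; GBE[n]* low
    marks data bits [8n+7:8n] of GDFM/GDTM as valid.
    """
    return [(8 * n + 7, 8 * n) for n in range(8) if not (gbe_n >> n) & 1]
```

For example, a pin value of 0xF0 asserts GBE[3:0]* and therefore selects the low word, bits 31:0, of the 64-bit bus.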
15.1.2 DMA Interface Signals
Table 15-2 lists the DMA interface signals.
Table 15-2 DMA Interface Signals
Signal Name    I/O  Active State  Description
GSNOOP*        I    Low           SNOOP
    The TX49 samples GSNOOP* with the rising edge of GBUSCLK. When GSNOOP*
    is recognized as asserted, the TX49 captures the address on GATM[35:5] and
    compares it to the addresses of all data items held in the on-chip data cache. If the
    snoop address hits in the data cache, the cache entry is invalidated. GSNOOP* is
    valid when either GHPSGNT* or GSGNT* is asserted.
GREQ*          I    Low           Normal Bus Request
    Alternate bus masters assert this signal to request bus mastership as per ET
    concurrency protocols.
GSREQ*         I    Low           Snoop Bus Request
    Alternate bus masters assert this signal to request bus mastership as per ST
    concurrency protocols.
GHPGREQ*       I    Low           High-Priority Normal Bus Request
    In response to GHPGREQ*, the TX49 asserts GHPGGNT* to grant the bus to the
    requesting bus master as per ET concurrency protocols. GHPGREQ* has priority
    over GREQ* if both are asserted simultaneously.
GHPSREQ*       I    Low           High-Priority Snoop Bus Request
    In response to GHPSREQ*, the TX49 asserts GHPSGNT* to grant the bus to the
    requesting bus master as per ST concurrency protocols. GHPSREQ* has priority
    over GSREQ* if both are asserted simultaneously.
GGNT*          O    Low           Normal Bus Grant
    Assertion of GGNT* indicates that the TX49 has relinquished bus mastership in
    response to GREQ*.
GSGNT*         O    Low           Snoop Bus Grant
    Assertion of GSGNT* indicates that the TX49 has relinquished bus mastership in
    response to GSREQ*.
GHPGGNT*       O    Low           High-Priority Normal Bus Grant
    Assertion of GHPGGNT* indicates that the TX49 has relinquished bus mastership
    in response to GHPGREQ*.
GHPSGNT*       O    Low           High-Priority Snoop Bus Grant
    Assertion of GHPSGNT* indicates that the TX49 has relinquished bus mastership
    in response to GHPSREQ*.
GREL*          O    Low           Release Request
    This output signal indicates to an external bus master that the TX49 wants to regain
    bus mastership. The TX49 asserts GREL* 1) when higher-priority GHPGREQ* is
    asserted while lower-priority GSGNT* is asserted and 2) when a bus request is
    generated from the TX49 processor core while GHPGGNT* is asserted.
GHAVEIT*       I    Low           Have IT
    This is a bus grant acknowledge signal used by an external bus master to indicate
    that it has assumed bus mastership. The external bus master can release the bus
    by asserting and deasserting GHAVEIT* while keeping a bus request signal
    asserted. In a single-bus-master system, GHAVEIT* may be tied high.
15.1.3 Coprocessor Interface Signals
Table 15-3 lists the coprocessor interface signals.
Table 15-3 Coprocessor Interface Signals
Signal Name I/O Active
State Description
GCPRD* O Low Coprocessor Read
GCPRD* is an output-only strobe that is asserted during a coprocessor read
operation.
GCPWR* O Low Coprocessor Write
GCPWR* is an output-only strobe that is asserted during a coprocessor write
operation.
GCPRDACK* I Low Coprocessor Read Acknowledge
A coprocessor asserts this signal to indicate to the TX49 processor core that the
coprocessor read request has been acknowledged.
GCPWRACK* I Low Coprocessor Write Acknowledge
A coprocessor asserts this signal to indicate to the TX49 processor core that the
coprocessor write request has been acknowledged.
GCPCOND[3:2] I Coprocessor Condition
Coprocessor branch instructions use the GCPCOND[z] signal as coprocessor
z’s condition signal: GCPCOND[3] is for CP3, and GCPCOND[2] is for CP2.
15.1.4 Interrupt Interface Signals
Table 15-4 lists the interrupt interface signals.
Table 15-4 Interrupt Interface Signals
Signal Name I/O Active Description
GCOLDRESET* I Low Cold Reset
Assertion of this input signal initiates a cold reset and forces the TX49 to enter Cold
Reset exception processing.
GRESET* I Low Reset
Assertion of this input signal initiates a soft reset and forces the TX49 to enter Soft
Reset exception processing.
GNMI* I Low Nonmaskable Interrupt
Assertion of this input signal forces the TX49 to enter Nonmaskable Interrupt
exception processing.
GINT[5:0]* I Low Interrupt
Assertion of any of these interrupt request inputs causes a general Interrupt
exception unless the corresponding bit is masked in the Status register.
GINT[5] can be configured for either a general interrupt input or a timer interrupt
input during Reset exception processing. If the GTINTDIS input is zero during a
reset sequence, GINT[5] is configured for the timer interrupt input.
15.1.5 Test Interface Signals
Table 15-5 lists the test interface signals.
Table 15-5 Test Interface Signals
Signal Name I/O Active
State Description
GTEST[2:0] I Test
The GTEST[2:0] inputs are used to set up the TX49 in test mode. A value of 2’b000
at GTEST[2:0] puts the TX49 in normal operation mode.
GDIS* I Disable Output
This input must be tied high.
15.1.6 Debug Interface Signals
Table 15-6 lists the debug interface signals.
Table 15-6 Debug Interface Signals
Signal Name I/O Active
State Description
GTRST* I Low Test Reset Input
Assertion of this input initializes the on-chip Debug Support Unit (DSU).
GTDI/GDINT* I Test Data Input / Debug Interrupt
Run-Time mode: Functions as a serial data input to the EJTAG instruction
register.
Real-Time mode: Switches the debug unit mode from Real-Time mode to Run-Time
mode.
GTMS I Test Mode Select Input
The GTMS input controls the transitions of the TAP controller in conjunction with
the rising edge of GTCK.
GTCK I Test Clock Input
GTCK is used to shift test data into or out of JTAG logic for EJTAG instructions.
GTCK is independent of CPUCLK.
GTPC[3:1] O Trace PC Output
GTPC[3:1] provide non-sequential program counter output at the GDCLK speed.
GTDO/GTPC[0] O Test Data Output
Run-Time mode: Shifts serial output data from the EJTAG data or instruction
register.
Real-Time mode: Provides a non-sequential program counter.
GPCST[8:0] O PC Trace Status
The GPCST[8:0] outputs provide PC trace status information and serial monitor bus
mode.
GDCLK O Debug Clock
Output clock for EJTAG debug.
15.1.7 Clock and System Control Interface Signals
Table 15-7 lists the clock and system control interface signals.
Table 15-7 Clock and System Control Interface Signals
Signal Name I/O Active
State Description
CPUCLK I CPU Clock Input
The TX49 processor core operates at the same frequency as CPUCLK.
GBUSCLK I GBUS Clock Input
GBUSCLK is the clock input for the G-Bus interface.
A divided-down clock must be applied to GBUSCLK according to the value of
GCRATE[1:0]. Otherwise, correct operation is not guaranteed.
GCRATE[1:0] I GBUS Clock Rate Input from External Pin
GCRATE[1:0] selects the frequency at which the G-Bus interface runs with respect
to the TX49 processor core. The frequency division factor can be one of the
following; it must not be changed while the processor is running.
GCRATE[1:0]
1/2
1/3
1/4
1/2.5
GDOZE O High Doze
GDOZE follows the state programmed into the Doze bit in the Config register.
GDOZE=1 when the TX49 is in Doze mode.
GHALT O High Halt
GHALT follows the state programmed into the Halt bit in the Config register.
GHALT=1 when the TX49 is in Halt mode.
GTINTDIS I Timer Interrupt Disable Input from External Pin
GTINTDIS specifies the pin function of GINT[5] during a reset sequence.
H: Disables the timer interrupt function (i.e., configures the GINT[5] pin as a general
interrupt request pin.)
L: Enables the timer interrupt function (i.e., configures the GINT[5] pin as a timer
interrupt request pin.)
GENDIAN I Endianness Input from External Pin
GENDIAN specifies byte ordering during a reset sequence.
H: Big-endian
L: Little-endian
GBS64* I System Bus Size
GBS64* specifies the G-Bus size during a reset sequence.
H: 32-bit (GDTM[31:0] and GDFM[31:0] are valid.)
L: 64-bit (GDTM[63:0] and GDFM[63:0] are valid.)
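The three reset-time configuration pins in Table 15-7 can be summarized in a small decoding sketch. This is only an illustration of the H/L meanings listed above; the names ResetConfig and decode_reset_pins are hypothetical and do not come from this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResetConfig:
    """Hypothetical container for the options selected during a reset sequence."""
    gint5_is_timer: bool   # True when GINT[5] acts as the timer interrupt request pin
    big_endian: bool       # True for big-endian byte ordering
    bus_width_bits: int    # 32 or 64 (G-Bus size)


def decode_reset_pins(gtintdis: int, gendian: int, gbs64_n: int) -> ResetConfig:
    """Interpret pin levels (1 = H, 0 = L) as described in Table 15-7."""
    return ResetConfig(
        gint5_is_timer=(gtintdis == 0),             # L enables the timer interrupt function
        big_endian=(gendian == 1),                  # H: big-endian, L: little-endian
        bus_width_bits=64 if gbs64_n == 0 else 32,  # GBS64* low selects the 64-bit bus
    )
```

For example, decode_reset_pins(0, 1, 0) describes a big-endian system with a 64-bit G-Bus in which GINT[5] serves as the timer interrupt input.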
16. Low Power Consumption Modes
The TX49 can reduce its power consumption compared to the normal mode by controlling its
internal clocks. The following two operation modes function as low power consumption modes of
the TX49:
Halt mode
Doze mode
16.1 Halt mode
The halt mode reduces power consumption by halting TX49 operation. When software sets the
HALT bit of the Config register to 0 and executes the WAIT instruction, the TX49 shifts from
the normal operation mode to the halt mode.
In the halt mode, the TX49 responds to bus release requests in cases of ET concurrency, such
as the GREQ* signal or the GHPGREQ* signal. However, it does not respond to requests in
cases of ST concurrency, such as the GSREQ* signal or the GHPSREQ* signal. On the other
hand, if the WAIT instruction is executed while the bus is being released, the halt mode starts
in cases of ET concurrency, but in cases of ST concurrency it starts after bus ownership is
granted and the GHALT signal is asserted.
If the WAIT instruction is executed during a bus operation, the GHALT signal is asserted after
the bus operation is completed.
If data remain in the write buffer, the write operation is executed even after shifting to the
halt mode.
The internal halt bit is cleared by the assertion of the GINT[5:0]* signal, the GNMI* signal,
the GRESET* signal or the GCOLDRESET* signal, and the TX49 returns from the halt mode.
If this is caused by the assertion of the GINT[5:0]* signal, the TX49 is released from the halt
mode irrespective of the value in the IntMask field of the Status register. If the TX49 is
brought back from the halt mode by the GCOLDRESET* signal, the GRESET* signal, the
GNMI* signal, or a non-masked GINT[5:0]* signal, the first instruction in the
corresponding exception handler is executed. At this time, the EPC register points to the
instruction following the WAIT instruction. If the TX49 is brought back by a masked
GINT[5:0]* signal, execution resumes from the instruction following the instruction that was
being executed when it shifted to the halt mode.
As shown in Figure 16-1, the TX49 outputs the status of the internal halt bit on the GHALT
signal. The memory interface output signals in the halt mode are maintained in the same
status as when no bus operation was being executed.
Note: If a condition for returning from the low power consumption modes is already satisfied
when the WAIT instruction is executed, the TX49 does not shift to the mode.
Figure 16-1 Halt Mode
(Timing diagram showing GBUSCLK, GHALT, internal CPUCLK, and GRD*/GWR* around the M-stage and
W-stage of WAIT; annotation: HALT bit set to 0 before here.)
16.2 Doze mode
The doze mode also halts TX49 operation in order to lower power consumption. Unlike the
halt mode, however, the TX49 can respond to bus control requests (both ST concurrency and
ET concurrency) from an external bus master. Snooping of the data cache can also be
performed in ST concurrency. When software sets the HALT bit of the Config register to 1
and executes the WAIT instruction, the TX49 shifts from the normal operation mode to the
doze mode. The TX49 Processor Core that is built into the TX49 then halts operation while
retaining the pipeline status.
As mentioned above, bus control requests are responded to while in the doze mode in cases
of ET concurrency such as the GREQ* signal and the GHPGREQ* signal, and in cases of ST
concurrency such as the GSREQ* signal and the GHPSREQ* signal. On the other hand, if
the WAIT instruction is executed while the bus is being released, the doze mode starts in
cases of ET concurrency, but in cases of ST concurrency it starts after bus ownership is
granted and the GDOZE signal is asserted. If the WAIT instruction is executed during a bus
operation, the GDOZE signal is asserted after the bus operation is completed. Snooping by an
external bus master is done by ST concurrency when the TX49 is in the doze mode. For the
bus that is released by the assertion of the GSGNT* signal or the GHPSGNT* signal, snooping
of the data cache can be performed by the GSNOOP* signal and the GA[35:0] signal. When an
external bus master deasserts the GSREQ* signal or the GHPSREQ* signal, the TX49
deasserts the GSGNT* signal or the GHPSGNT* signal.
By asserting the GINT[5:0]* signal, the GNMI* signal, the GRESET* signal or the
GCOLDRESET* signal, the internal doze bit is cleared and the TX49 returns from the doze
mode. If this is caused by the assertion of the GINT[5:0]* signal, the TX49 is released from
the doze mode irrespective of the value in the IntMask field of the Status register. If the TX49
is brought back from the doze mode by the GCOLDRESET* signal, the GNMI* signal, or a
non-masked GINT[5:0]* signal, the first instruction in the corresponding exception handler is
executed. At this time, the EPC register points to the instruction following the WAIT
instruction. If the TX49 is brought back by a masked GINT[5:0]* signal, execution resumes
from the instruction following the instruction that was being executed when it shifted to the
doze mode.
As shown in Figure 16-2, the TX49 outputs the status of the internal doze bit on the GDOZE
signal. The memory interface output signals in the doze mode are maintained in the same
status as when no bus operation was executed.
Note: If a condition for returning from the low power consumption modes is already satisfied
when the WAIT instruction is executed, the TX49 does not shift to the mode.
Figure 16-2 Doze Mode
(Timing diagram showing GBUSCLK, GDOZE, internal CPUCLK (except snoop clock), and GRD*/GWR*
around the M-stage and W-stage of WAIT; annotation: HALT bit set to 1 before here.)
16.3 Status Shifts
Figure 16-3 shows the status shifts in the operation mode of the TX49.
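Figure 16-3 Status Shift Among Normal Operation Mode and Low Power Consumption Modes
(State diagram with three states: Normal Operation Mode, Halt Mode, and Doze Mode.
Normal Operation Mode to Halt Mode: HALT bit = 0 & WAIT inst.
Normal Operation Mode to Doze Mode: HALT bit = 1 & WAIT inst.
Return transitions to Normal Operation Mode: Interrupt or Reset.)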
When the operation status shifts from the normal operation mode to the halt mode, it is
returned to the normal operation mode by an interrupt or a reset. Similarly, when it shifts
from the normal operation mode to the doze mode, it is returned to the normal operation mode
by an interrupt or a reset. After a reset, the TX49 is initialized to the normal operation mode.
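The mode transitions described in this chapter can be modeled as a small state machine. The sketch below is an illustration only: the names Mode, after_wait, after_wakeup, and responds_to_bus_request are hypothetical, and the model ignores timing details such as pending bus operations and the write buffer. It also assumes that the processor answers both kinds of bus request in the normal operation mode.

```python
from enum import Enum


class Mode(Enum):
    NORMAL = "normal operation"
    HALT = "halt"
    DOZE = "doze"


def after_wait(mode: Mode, halt_bit: int) -> Mode:
    """Mode after a WAIT instruction: HALT bit = 0 selects halt, 1 selects doze."""
    if mode is not Mode.NORMAL:
        return mode  # WAIT can only be executed while the core is running
    return Mode.DOZE if halt_bit == 1 else Mode.HALT


def after_wakeup(mode: Mode) -> Mode:
    """An interrupt or a reset always returns the processor to normal operation."""
    return Mode.NORMAL


def responds_to_bus_request(mode: Mode, concurrency: str) -> bool:
    """concurrency is 'ET' (GREQ*/GHPGREQ*) or 'ST' (GSREQ*/GHPSREQ*)."""
    if concurrency not in ("ET", "ST"):
        raise ValueError("concurrency must be 'ET' or 'ST'")
    if mode is Mode.HALT:
        return concurrency == "ET"  # halt mode does not answer ST requests
    return True                     # doze answers both; normal mode assumed to as well
```

The only behavioral difference between the two low power modes in this model is the answer to ST-concurrency requests, which matches the distinction drawn in sections 16.1 and 16.2.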
Appendix A: CPU Instruction Set Details
This appendix provides a detailed description of the operation of each TX49 instruction in both
32- and 64-bit modes. The instructions are listed in alphabetical order.
The exceptions that may occur due to the execution of each instruction are listed after the
description of each instruction. The description of the immediate causes and manner of handling
exceptions is omitted from the instruction descriptions in this chapter.
Figures at the end of this appendix list the bit encoding for the constant fields of each
instruction, and the bit encoding for each individual instruction is included with that instruction.
For a detailed description of the FPU instructions, refer to Appendix B.
A.1 Instruction Classes
The TX49 CPU instructions are grouped into the following classes.
Load and Store
Computational
Jump and Branch
Coprocessor
Special
Exception
Multiply and Divide
Debug
Others
A.1.1 Instruction Formats
Every instruction consists of a single word (32 bits) aligned on a word boundary. The
main instruction formats are shown in Figure A-1.
I-Type (Immediate)
Bits:   31-26   25-21   20-16   15-0
Field:  op      rs      rt      immediate
J-Type (Jump)
Bits:   31-26   25-0
Field:  op      target
R-Type (Register)
Bits:   31-26   25-21   20-16   15-11   10-6    5-0
Field:  op      rs      rt      rd      shamt   funct
where:
op is a 6-bit operation code
rs is a 5-bit source register specifier
rt is a 5-bit target (source/destination) register or branch condition
immediate is a 16-bit immediate, branch displacement or address
displacement
target is a 26-bit jump target address
rd is a 5-bit destination register specifier
shamt is a 5-bit shift amount
funct is a 6-bit function field
Figure A-1 CPU Instruction Formats
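The field boundaries in Figure A-1 translate directly into shifts and masks. The following sketch extracts every field from a 32-bit instruction word; the function name decode_fields and the returned dictionary layout are illustrative choices, not part of this document. Which fields are meaningful depends on the instruction format.

```python
def decode_fields(word: int) -> dict:
    """Split a 32-bit instruction word into the fields of Figure A-1."""
    word &= 0xFFFFFFFF
    return {
        "op": (word >> 26) & 0x3F,     # bits 31-26, all formats
        "rs": (word >> 21) & 0x1F,     # bits 25-21, I- and R-type
        "rt": (word >> 16) & 0x1F,     # bits 20-16, I- and R-type
        "rd": (word >> 11) & 0x1F,     # bits 15-11, R-type
        "shamt": (word >> 6) & 0x1F,   # bits 10-6, R-type
        "funct": word & 0x3F,          # bits 5-0, R-type
        "immediate": word & 0xFFFF,    # bits 15-0, I-type
        "target": word & 0x03FFFFFF,   # bits 25-0, J-type
    }
```

For instance, the word 0x00221820 decodes to op = 0 (SPECIAL), rs = 1, rt = 2, rd = 3, shamt = 0, and funct = 0b100000, which is the ADD encoding given later in this appendix.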
A.1.2 Instruction Notation Conventions
In this appendix, all variable subfields in an instruction format (such as rs, rt,
immediate, etc.) are shown in lowercase names.
For the sake of clarity, we sometimes use an alias for a variable subfield in the formats
of specific instructions. For example, we use rs = base in the format for load and store
instructions. Such an alias is always lower case, since it refers to a variable subfield.
Figures with the actual bit encoding for all the mnemonics are located at the end of this
Appendix, and the bit encoding also accompanies each instruction.
In the instruction descriptions that follow, the Operation section describes the operation
performed by each instruction using a high-level language notation. The TX49 can
operate as either a 32- or 64-bit microprocessor. The operation for both modes is included
with the instruction description. Special symbols used in the notation are described in
Table A-1.
Table A-1 CPU Instruction Operation Notations
Symbol Meaning
← Assignment.
|| Bit string concatenation.
x^y Replication of bit value x into a y-bit string. Note: x is always a single-bit value.
x_y...z Selection of bits y through z of bit string x. Little-endian bit notation is always used.
If y is less than z, this expression is an empty (zero length) bit string.
+ Two’s complement or floating-point addition.
- Two’s complement or floating-point subtraction.
* Two’s complement or floating-point multiplication.
Div Two’s complement integer division.
Mod Two’s complement modulo.
/ Floating-point division.
< Two’s complement less than comparison.
And Bitwise logic AND.
Or Bitwise logic OR.
Xor Bitwise logic XOR.
Nor Bitwise logic NOR.
GPR[x] General-Register x. The content of GPR[0] is always zero. Attempts to alter the content of
GPR[0] have no effect.
CPR[z,x] Coprocessor unit z, general register x.
CCR[z,x] Coprocessor unit z, control register x.
COC[z] Coprocessor unit z condition signal.
BigEndianMem Big-endian mode as configured at reset (0 = Little, 1 = Big). Specifies the endianness of
the memory interface (see LoadMemory and StoreMemory), and the endianness of Kernel
and Supervisor mode execution.
ReverseEndian Signal to reverse the endianness of load and store instructions. This feature is available in
User mode only, and is effected by setting the RE bit of the Status register. Thus,
ReverseEndian may be computed as (SR_25 and User mode).
BigEndianCPU The endianness for load and store instructions (0 = Little, 1 = Big). In User mode, this
endianness may be reversed by setting SR_25. Thus, BigEndianCPU may be computed as
BigEndianMem XOR ReverseEndian.
LLbit Bit of state to specify synchronization instructions. Set by LL, cleared by ERET and
Invalidate, and read by SC.
T + i: Indicates the time steps between operations. Each of the statements within a time step is
defined to be executed in sequential order (as modified by conditional and loop constructs).
Operations which are marked T + i: are executed at instruction cycle i relative to the start of
execution of the instruction. Thus, an instruction which starts at time j executes operations
marked T + i: at time i + j. The interpretation of the order of execution between two
instructions or two operations which execute at the same time should be pessimistic; the
order is not defined.
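The ReverseEndian and BigEndianCPU entries of Table A-1 are simple Boolean expressions. A minimal sketch follows; the function names are hypothetical, and the User-mode flag is passed in as a plain argument rather than derived from the Status register.

```python
def reverse_endian(sr25: int, user_mode: bool) -> bool:
    """ReverseEndian = (SR bit 25 and User mode)."""
    return bool(sr25) and user_mode


def big_endian_cpu(big_endian_mem: bool, sr25: int, user_mode: bool) -> bool:
    """BigEndianCPU = BigEndianMem XOR ReverseEndian."""
    return big_endian_mem != reverse_endian(sr25, user_mode)
```

In this model, setting the RE bit has no effect outside User mode, so Kernel and Supervisor code always see the endianness configured at reset.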
A.1.3 Sign Extension and Zero Extension
With some instructions the bit length may be extended; for example, a 16-bit offset may
be extended to 32 bits. This extension can take the form of either a sign extension or zero
extension.
Sign extension
The extended part is filled with the value of the most significant bit.
(example)
1001100101011100 16 bit
11111111111111111001100101011100 32 bit
Zero extension
The extended part is filled with zeros.
(example)
1001100101011100 16 bit
00000000000000001001100101011100 32 bit
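The two kinds of extension can be checked against the 16-bit example value above (0x995C). The helper names sign_extend and zero_extend below are illustrative only; values are modeled as non-negative Python integers holding the raw bit pattern.

```python
def sign_extend(value: int, from_bits: int, to_bits: int) -> int:
    """Fill the added upper bits with a copy of the original most significant bit."""
    value &= (1 << from_bits) - 1
    if (value >> (from_bits - 1)) & 1:
        # Set every bit from from_bits up to to_bits - 1.
        value |= ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
    return value


def zero_extend(value: int, from_bits: int, to_bits: int) -> int:
    """Fill the added upper bits with zeros (to_bits is kept for symmetry)."""
    return value & ((1 << from_bits) - 1)
```

With these definitions, sign_extend(0x995C, 16, 32) gives 0xFFFF995C and zero_extend(0x995C, 16, 32) gives 0x0000995C, matching the bit strings shown in the examples.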
A.1.4 Instruction Notation Examples
The following examples illustrate the application of some of the instruction notation
conventions:
Example #1:
GPR[rt] ← immediate || 0^16
Sixteen zero bits are concatenated with an immediate value (typically 16 bits), and the
32-bit string (with the lower 16 bits set to zero) is assigned to General-Purpose
Register rt.
Example #2:
(immediate_15)^16 || immediate_15...0
Bit 15 (the sign bit) of an immediate value is extended for 16 bit positions, and the
result is concatenated with bits 15 through 0 of the immediate value to form a 32-bit
sign extended value.
A.2 Load and Store Instructions
In the TX49 implementation, the instruction immediately following a load may use the
contents of the register loaded. In such cases, the hardware interlocks, requiring additional
real cycles, so scheduling load delay slots is still desirable, although not required for
functional code.
Two special instructions are provided in the TX49 implementation of the MIPS ISA, Load
Linked and Store Conditional. These instructions are used in carefully coded sequences to
provide one of several synchronization primitives, including test-and-set, bit-level locks,
semaphores, and sequencers / event counts.
In the load and store operation descriptions, the functions listed in Table A-2 are used to
summarize the handling of virtual addresses and physical memory.
Table A-2 Load and Store Common Functions
Function Meaning
AddressTranslation Uses the TLB to find the physical address given the virtual address. The function fails
and an exception is taken if the required translation is not present in the TLB.
LoadMemory Uses the cache and main memory to find the contents of the word containing the
specified physical address. The low-order two bits of the address and the access type
field indicate which of the four bytes within the data word need to be returned.
If the cache is enabled for this access, the entire word is returned and loaded into the
cache.
StoreMemory Uses the cache, write buffer, and main memory to store the word or part of word
specified as data in the word containing the specified physical address. The low-order
two bits of the address and the access type field indicate which of the four
bytes within the data word should be stored.
The access type field indicates the size of the data item to be loaded or stored as shown in
Table A-3. Regardless of access type or byte-numbering order (endianness), the address
specifies the byte which has the smallest byte address of the bytes in the addressed field. For
a Big-endian machine, this is the leftmost byte and contains the sign for a 2’s-complement
number; for a Little-endian machine, this is the rightmost byte and contains the lowest
precision byte.
Table A-3 Access Type Specifications for Loads/Stores
Access Type Mnemonic Value Meaning
DOUBLEWORD 7 doubleword (64 bits)
SEPTIBYTE 6 seven bytes (56 bits)
SEXTIBYTE 5 six bytes (48 bits)
QUINTIBYTE 4 five bytes (40 bits)
WORD 3 word (32 bits)
TRIPLEBYTE 2 triple-byte (24 bits)
HALFWORD 1 halfword (16 bits)
BYTE 0 byte (8 bits)
The bytes within the addressed doubleword which are used can be determined directly from
the access type and the three low-order bits of the address, as shown in Chapter 2.
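Table A-3 follows a regular pattern: the access type value is the number of bytes minus one. The sketch below encodes the table as an enumeration and derives the size from the value; the names AccessType and access_size_bytes are illustrative and do not appear in this document.

```python
from enum import IntEnum


class AccessType(IntEnum):
    """Access type mnemonics and values from Table A-3."""
    BYTE = 0
    HALFWORD = 1
    TRIPLEBYTE = 2
    WORD = 3
    QUINTIBYTE = 4
    SEXTIBYTE = 5
    SEPTIBYTE = 6
    DOUBLEWORD = 7


def access_size_bytes(access_type: AccessType) -> int:
    """Number of bytes transferred: the encoded value plus one."""
    return int(access_type) + 1
```

The byte lanes actually used within the doubleword also depend on the low-order address bits and the endianness, which this sketch does not model.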
A.3 Jump and Branch Instructions
All jump and branch instructions have an architectural delay of exactly one instruction.
That is, the instruction immediately following a jump or branch (i.e., occupying the delay slot)
is always executed while the target instruction is being fetched from storage. It is not valid
for a delay slot to be occupied itself by a jump or branch instruction; however, this error is not
detected, and the results of such an operation are undefined.
If an exception or interrupt prevents the completion of a legal instruction during a delay
slot, the hardware sets the EPC register to point at the jump or branch instruction which
precedes it. When the code is restarted, both the jump or branch instructions and the
instruction in the delay slot are reexecuted.
Because jump and branch instructions may be restarted after exceptions or interrupts, they
must be restartable. Therefore, when a jump or branch instruction stores a return link value,
register 31 (the register in which the link is stored) may not be used as a source register.
Since instructions must be word-aligned, a Jump Register or Jump and Link Register
instruction must use a register whose two low-order bits are zero. If these low-order bits are
not zero, an address exception will occur when the jump target instruction is subsequently
fetched.
A.4 Coprocessor Instructions
The MIPS architecture provides four coprocessor units, or classes. Coprocessors are
alternate execution units, which have separate register files from the CPU. R-Series
coprocessors have 2 register spaces, each with thirty-two 32-bit registers. The first space,
coprocessor general registers, may be directly loaded from memory and stored into memory,
and their contents may be transferred between the coprocessor and processor. The second,
coprocessor control registers, may only have their contents transferred directly between the
coprocessor and processor. Coprocessor instructions may alter registers in either space.
Normally, by convention, Coprocessor Control Register 0 is interpreted as a Coprocessor
Implementation And Revision register. However, the system control coprocessor (CP0) uses
Coprocessor General Register 15 for the processor / coprocessor revision register. The
register’s low-order byte (bits 7...0) is interpreted as a coprocessor unit revision number. The
second byte (bits 15...8) is interpreted as a coprocessor unit implementation descriptor. The
revision number is a value of the form y.x where y is a major revision number in bits 7...4 and x
is a minor revision number in bits 3...0.
The contents of the high-order halfword of the register are not defined (currently read as 0
and should be 0 when written).
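The layout of the revision register described above can be unpacked with a few shifts and masks. In this sketch the function name decode_revision is hypothetical, and the register value used in the usage note is an arbitrary illustration, not a value read from a real TX49 device.

```python
def decode_revision(reg: int) -> dict:
    """Split a revision register value into the fields described in section A.4."""
    return {
        "implementation": (reg >> 8) & 0xFF,  # bits 15...8
        "major": (reg >> 4) & 0xF,            # bits 7...4 (y in y.x)
        "minor": reg & 0xF,                   # bits 3...0 (x in y.x)
    }
```

For an arbitrary value such as 0x1234, this yields implementation 0x12 and revision 3.4.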
A.5 System Control Coprocessor (CP0) Instructions
There are some special limitations imposed on operations involving CP0 that is incorporated
within the CPU. Although load and store instructions to transfer data to and from
coprocessors and move control to/from coprocessor instructions are generally permitted by the
MIPS architecture, CP0 is given a somewhat protected status since it has responsibility for
exception handling and memory management. Therefore, the move to/from coprocessor
instructions are the only valid mechanism for reading from and writing to the CP0 registers.
Several coprocessor operation instructions are defined for CP0 to directly read, write, and
probe TLB entries and to modify the operating modes in preparation for returning to User
mode or interrupt-enabled states.
A.6 CPU Instructions
This appendix provides a detailed description of the operation of each TX49 instruction in
both 32- and 64-bit modes.
Exceptions that may occur due to the execution of each instruction are listed after the
description of each instruction.
For a detailed description of the exceptions, refer to Chapter 4.
ADD Add
Bits:   31-26    25-21  20-16  15-11  10-6   5-0
Field:  SPECIAL  rs     rt     rd     0      ADD
Value:  000000                        00000  100000
Width:  6        5      5      5      5      6
Format:
ADD rd,rs,rt
Description:
The contents of general register rs and the contents of general register rt are added to form
the result. The result is placed into general register rd. In 64-bit mode, the operands must
be valid sign-extended, 32-bit values.
An overflow exception occurs if the carries out of bits 30 and 31 differ (2’s-complement
overflow). The destination register rd is not modified when an integer overflow exception
occurs.
Operation:
32 T: GPR[rd] ← GPR[rs] + GPR[rt]
64 T: temp ← GPR[rs] + GPR[rt]
GPR[rd] ← (temp_31)^32 || temp_31...0
Exceptions:
Integer overflow exception
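The overflow rule for ADD (carries out of bits 30 and 31 differ) can be modeled directly. The sketch below is a software illustration, not a description of the hardware; the names IntegerOverflow, add32, and sign_extend_32_to_64 are hypothetical. Register values are represented as unsigned bit patterns.

```python
class IntegerOverflow(Exception):
    """Stands in for the integer overflow exception."""


def add32(rs_val: int, rt_val: int) -> int:
    """32-bit ADD: raise on two's-complement overflow, else return the 32-bit sum."""
    a = rs_val & 0xFFFFFFFF
    b = rt_val & 0xFFFFFFFF
    total = a + b
    carry_out_30 = ((a & 0x7FFFFFFF) + (b & 0x7FFFFFFF)) >> 31  # carry into bit 31
    carry_out_31 = total >> 32                                   # carry out of bit 31
    if carry_out_30 != carry_out_31:
        raise IntegerOverflow("rd is left unmodified")
    return total & 0xFFFFFFFF


def sign_extend_32_to_64(value: int) -> int:
    """64-bit mode result: (temp_31)^32 || temp_31...0."""
    value &= 0xFFFFFFFF
    if value & 0x80000000:
        value |= 0xFFFFFFFF00000000
    return value
```

Adding 0xFFFFFFFF (that is, -1) and 1 produces 0 without an exception, because both carries are 1; adding 0x7FFFFFFF and 1 raises IntegerOverflow, because only the carry out of bit 30 is set.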
ADDI Add Immediate
Bits:   31-26   25-21  20-16  15-0
Field:  ADDI    rs     rt     immediate
Value:  001000
Width:  6       5      5      16
Format:
ADDI rt, rs, immediate
Description:
The 16-bit immediate is sign-extended and added to the contents of general register rs to
form the result. The result is placed into general register rt. In 64-bit mode, the operand
must be a valid sign-extended, 32-bit value.
An overflow exception occurs if carries out of bits 30 and 31 differ (2’s-complement
overflow). The destination register rt is not modified when an integer overflow exception
occurs.
Operation:
32 T: GPR[rt] ← GPR[rs] + (immediate_15)^16 || immediate_15...0
64 T: temp ← GPR[rs] + (immediate_15)^48 || immediate_15...0
GPR[rt] ← (temp_31)^32 || temp_31...0
Exceptions:
Integer overflow exception
ADDIU Add Immediate Unsigned
Bits:   31-26   25-21  20-16  15-0
Field:  ADDIU   rs     rt     immediate
Value:  001001
Width:  6       5      5      16
Format:
ADDIU rt, rs, immediate
Description:
The 16-bit immediate is sign-extended and added to the contents of general register rs to
form the result. The result is placed into general register rt. No integer overflow exception
occurs under any circumstances. In 64-bit mode, the operand must be a valid sign-extended,
32-bit value.
The only difference between this instruction and the ADDI instruction is that ADDIU
never causes an overflow exception.
Operation:
32 T: GPR[rt] ← GPR[rs] + (immediate_15)^16 || immediate_15...0
64 T: temp ← GPR[rs] + (immediate_15)^48 || immediate_15...0
GPR[rt] ← (temp_31)^32 || temp_31...0
Exceptions:
None
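Despite the word Unsigned in its name, ADDIU sign-extends its immediate; it differs from ADDI only in never raising an overflow exception. A minimal 32-bit model is sketched below (addiu32 is a hypothetical name, and register values are unsigned bit patterns):

```python
def addiu32(rs_val: int, immediate: int) -> int:
    """32-bit ADDIU: sign-extend the 16-bit immediate, add, and wrap modulo 2**32."""
    imm = immediate & 0xFFFF
    if imm & 0x8000:
        imm |= 0xFFFF0000  # (immediate_15)^16 || immediate_15...0
    return (rs_val + imm) & 0xFFFFFFFF
```

An immediate of 0xFFFF therefore acts as -1, so addiu32(5, 0xFFFF) is 4, while addiu32(0x7FFFFFFF, 1) simply wraps to 0x80000000.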
ADDU Add Unsigned
Bits:   31-26    25-21  20-16  15-11  10-6   5-0
Field:  SPECIAL  rs     rt     rd     0      ADDU
Value:  000000                        00000  100001
Width:  6        5      5      5      5      6
Format:
ADDU rd, rs, rt
Description:
The contents of general register rs and the contents of general register rt are added to form
the result. The result is placed into general register rd. No overflow exception occurs under
any circumstances. In 64-bit mode, the operands must be valid sign-extended, 32-bit values.
The only difference between this instruction and the ADD instruction is that ADDU never
causes an overflow exception.
Operation:
32 T: GPR[rd] ← GPR[rs] + GPR[rt]
64 T: temp ← GPR[rs] + GPR[rt]
GPR[rd] ← (temp_31)^32 || temp_31...0
Exceptions:
None
AND And
Bits:   31-26    25-21  20-16  15-11  10-6   5-0
Field:  SPECIAL  rs     rt     rd     0      AND
Value:  000000                        00000  100100
Width:  6        5      5      5      5      6
Format:
AND rd, rs, rt
Description:
The contents of general register rs are combined with the contents of general register rt in
a bit-wise logical AND operation. The result is placed into general register rd.
Operation:
32 T: GPR[rd] ← GPR[rs] and GPR[rt]
64 T: GPR[rd] ← GPR[rs] and GPR[rt]
Exceptions:
None
ANDI And Immediate
Bits:   31-26   25-21  20-16  15-0
Field:  ANDI    rs     rt     immediate
Value:  001100
Width:  6       5      5      16
Format:
ANDI rt, rs, immediate
Description:
The 16-bit immediate is zero-extended and combined with the contents of general register
rs in a bit-wise logical AND operation. The result is placed into general register rt.
Operation:
32 T: GPR[rt] ← 0^16 || (immediate and GPR[rs]_15...0)
64 T: GPR[rt] ← 0^48 || (immediate and GPR[rs]_15...0)
Exceptions:
None
BCzF Branch On Coprocessor z False
Bits:   31-26    25-21  20-16  15-0
Field:  COPz     BC     BCF    offset
Value:  0100xx*  01000  00000
Width:  6        5      5      16
Format:
BCzF offset
Description:
A branch target address is computed from the sum of the address of the instruction in the
delay slot and the 16-bit offset, shifted left two bits and sign-extended. If coprocessor z’s
condition signal (CpCond), as sampled during the previous instruction, is false, then the
program branches to the target address with a delay of one instruction.
Because the condition line is sampled during the previous instruction, there must be at
least one instruction between this instruction and a coprocessor instruction that changes the
condition line.
Operation:
32 T-1: condition ← not COC[z]
T: target ← (offset_15)^14 || offset || 0^2
T + 1: if condition then
PC ← PC + target
endif
64 T-1: condition ← not COC[z]
T: target ← (offset_15)^46 || offset || 0^2
T + 1: if condition then
PC ← PC + target
endif
*See the table “Opcode Bit Encoding” on next page, or “CPU Instruction Opcode Bit
Encoding” at the end of Appendix A.
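The target computation for BCzF can be traced with a short model. This is a sketch under stated assumptions: bczf_next_pc is a hypothetical name, addresses are 32-bit, the condition value is supplied by the caller (as sampled during the previous instruction), and a branch that is not taken is assumed to continue at the instruction after the delay slot.

```python
def bczf_next_pc(branch_pc: int, offset: int, coc_z: bool) -> int:
    """Address executed after the delay slot of a BCzF located at branch_pc."""
    off = offset & 0xFFFF
    if off & 0x8000:
        off -= 0x10000            # sign-extend the 16-bit offset
    target = off << 2             # shift left two bits (word offset to byte offset)
    delay_slot_pc = branch_pc + 4
    if not coc_z:                 # condition is "not COC[z]"
        return (delay_slot_pc + target) & 0xFFFFFFFF
    return (branch_pc + 8) & 0xFFFFFFFF
```

With the branch at 0x1000 and an offset of 3, a false condition signal gives a target of 0x1010 (0x1004 + 12). An offset of 0xFFFF (that is, -1) branches back to 0x1000.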
Exceptions:
Coprocessor unusable exception
Opcode Bit Encoding (bits 31...16; bits 15...0 hold the offset):
        Opcode  Coprocessor Unit Number  BC sub-opcode  Branch condition
Bit #   31-28   27-26                    25-21          20-16
BC0F    0100    00                       01000          00000
BC1F    0100    01                       01000          00000
BC2F    0100    10                       01000          00000
BC3F    0100    11                       01000          00000
Note: CpCond0 = Write Buffer Empty
(Empty: true (1), Not empty: false (0))
CpCond1 = FPU (See Appendix B)
CpCond2 = External Pin condition (GCPCOND2)
CpCond3 = External Pin condition (GCPCOND3)
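The opcode encoding for the four BCzF variants can be assembled from the field values given for this instruction (COPz = 0100xx, BC = 01000, BCF = 00000). The helper name encode_bczf below is hypothetical.

```python
def encode_bczf(z: int, offset: int) -> int:
    """Build the 32-bit instruction word for BCzF with coprocessor unit number z."""
    if z not in (0, 1, 2, 3):
        raise ValueError("coprocessor unit number must be 0 to 3")
    copz = 0b010000 | z           # bits 31-26: 0100 followed by the unit number
    bc = 0b01000                  # bits 25-21: BC sub-opcode
    bcf = 0b00000                 # bits 20-16: branch condition (false)
    return (copz << 26) | (bc << 21) | (bcf << 16) | (offset & 0xFFFF)
```

With a zero offset this produces 0x41000000 for BC0F, 0x45000000 for BC1F, 0x49000000 for BC2F, and 0x4D000000 for BC3F.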
BCzFL Branch On Coprocessor z False Likely
Bits:   31-26    25-21  20-16  15-0
Field:  COPz     BC     BCFL   offset
Value:  0100xx*  01000  00010
Width:  6        5      5      16
Format:
BCzFL offset
Description:
A branch target address is computed from the sum of the address of the instruction in the
delay slot and the 16-bit offset, shifted left two bits and sign-extended. If coprocessor z’s
condition line, as sampled during the previous instruction, is false, the program branches to
the target address with a delay of one instruction.
If the conditional branch is not taken, the instruction in the branch delay slot is nullified.
Because the condition line is sampled during the previous instruction, there must be at
least one instruction between this instruction and a coprocessor instruction that changes the
condition line.
*See the table “Opcode Bit Encoding” on next page, or “CPU Instruction Opcode Bit
Encoding” at the end of Appendix A.
Operation:
32  T-1:   condition ← not COC[z]
    T:     target ← (offset_15)^14 || offset || 0^2
    T + 1: if condition then
               PC ← PC + target
           else
               NullifyCurrentInstruction
           endif
64  T-1:   condition ← not COC[z]
    T:     target ← (offset_15)^46 || offset || 0^2
    T + 1: if condition then
               PC ← PC + target
           else
               NullifyCurrentInstruction
           endif
Exceptions:
Coprocessor unusable exception
Opcode Bit Encoding:
Mnemonic  Bits 31..28  Bits 27..26     Bits 25..21       Bits 20..16
          (opcode)     (coprocessor    (BC sub-opcode)   (branch condition)
                       unit number)
BC0FL     0100         00              01000             00010
BC1FL     0100         01              01000             00010
BC2FL     0100         10              01000             00010
BC3FL     0100         11              01000             00010
Note: CpCond0 = Write Buffer Empty (Empty: true (1), Not empty: false (0))
      CpCond1 = FPU (see Appendix B)
      CpCond2 = External pin condition (GCPCOND2)
      CpCond3 = External pin condition (GCPCOND3)
BCzT    Branch On Coprocessor z True    BCzT

Bits    31..26    25..21   20..16   15..0
Field   COPz      BC       BCT      offset
Value   0100XX*   01000    00001
Width   6         5        5        16
Format:
BCzT offset
Description:
A branch target address is computed from the sum of the address of the instruction in the
delay slot and the 16-bit offset, shifted left two bits and sign-extended. If coprocessor z’s
condition signal (CpCond) is true, then the program branches to the target address, with a
delay of one instruction.
Because the condition line is sampled during the previous instruction, there must be at
least one instruction between this instruction and a coprocessor instruction that changes the
condition line.
Operation:
32  T-1:   condition ← COC[z]
    T:     target ← (offset_15)^14 || offset || 0^2
    T + 1: if condition then
               PC ← PC + target
           endif
64  T-1:   condition ← COC[z]
    T:     target ← (offset_15)^46 || offset || 0^2
    T + 1: if condition then
               PC ← PC + target
           endif
*See the table “Opcode Bit Encoding” on next page, or “CPU Instruction Opcode Bit
Encoding” at the end of Appendix A.
Exceptions:
Coprocessor unusable exception
Opcode Bit Encoding:
Mnemonic  Bits 31..28  Bits 27..26     Bits 25..21       Bits 20..16
          (opcode)     (coprocessor    (BC sub-opcode)   (branch condition)
                       unit number)
BC0T      0100         00              01000             00001
BC1T      0100         01              01000             00001
BC2T      0100         10              01000             00001
BC3T      0100         11              01000             00001
Note: CpCond0 = Write Buffer Empty (Empty: true (1), Not empty: false (0))
      CpCond1 = FPU (see Appendix B)
      CpCond2 = External pin condition (GCPCOND2)
      CpCond3 = External pin condition (GCPCOND3)
BCzTL    Branch On Coprocessor z True Likely    BCzTL

Bits    31..26    25..21   20..16   15..0
Field   COPz      BC       BCTL     offset
Value   0100XX*   01000    00011
Width   6         5        5        16
Format:
BCzTL offset
Description:
A branch target address is computed from the sum of the address of the instruction in the
delay slot and the 16-bit offset, shifted left two bits and sign-extended. If coprocessor z’s
condition line, as sampled during the previous instruction, is true, then the program
branches to the target address with a delay of one instruction.
If the conditional branch is not taken, the instruction in the branch delay slot is nullified.
Because the condition line is sampled during the previous instruction, there must be at
least one instruction between this instruction and a coprocessor instruction that changes the
condition line.
Operation:
32  T-1:   condition ← COC[z]
    T:     target ← (offset_15)^14 || offset || 0^2
    T + 1: if condition then
               PC ← PC + target
           else
               NullifyCurrentInstruction
           endif
64  T-1:   condition ← COC[z]
    T:     target ← (offset_15)^46 || offset || 0^2
    T + 1: if condition then
               PC ← PC + target
           else
               NullifyCurrentInstruction
           endif
*See the table “Opcode Bit Encoding” on next page, or “CPU Instruction Opcode Bit
Encoding” at the end of Appendix A.
Exceptions:
Coprocessor unusable exception
Opcode Bit Encoding:
Mnemonic  Bits 31..28  Bits 27..26     Bits 25..21       Bits 20..16
          (opcode)     (coprocessor    (BC sub-opcode)   (branch condition)
                       unit number)
BC0TL     0100         00              01000             00011
BC1TL     0100         01              01000             00011
BC2TL     0100         10              01000             00011
BC3TL     0100         11              01000             00011
Note: CpCond0 = Write Buffer Empty (Empty: true (1), Not empty: false (0))
      CpCond1 = FPU (see Appendix B)
      CpCond2 = External pin condition (GCPCOND2)
      CpCond3 = External pin condition (GCPCOND3)
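The four coprocessor branch encodings (BCzF, BCzFL, BCzT, BCzTL) differ only in the coprocessor unit number (bits 27..26) and the branch condition field (bits 20..16). The following is a minimal sketch of how an assembler might pack these fields. The function name encode_bc and the CONDITION table are hypothetical names chosen for illustration; the field values are those given in the format diagrams and encoding tables of this appendix.

```python
# Branch condition field (bits 20..16) for each variant, per the encoding tables.
CONDITION = {"F": 0b00000, "FL": 0b00010, "T": 0b00001, "TL": 0b00011}


def encode_bc(z: int, kind: str, offset16: int) -> int:
    """Pack a BCz{F,FL,T,TL} instruction word (illustrative helper, not a vendor API)."""
    if not 0 <= z <= 3:
        raise ValueError("coprocessor unit number must be 0..3")
    return (
        (0b0100 << 28)            # bits 31..28: COPz opcode prefix
        | (z << 26)               # bits 27..26: coprocessor unit number
        | (0b01000 << 21)         # bits 25..21: BC sub-opcode
        | (CONDITION[kind] << 16) # bits 20..16: branch condition
        | (offset16 & 0xFFFF)     # bits 15..0: 16-bit offset
    )
```

For example, encode_bc(1, "T", 4) packs a BC1T with an offset of 4 words.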
BEQ    Branch On Equal    BEQ

Bits    31..26   25..21   20..16   15..0
Field   BEQ      rs       rt       offset
Value   000100
Width   6        5        5        16
Format:
BEQ rs, rt, offset
Description:
A branch target address is computed from the sum of the address of the instruction in the
delay slot and the 16-bit offset, shifted left two bits and sign-extended. The contents of
general register rs and the contents of general register rt are compared. If the two registers
are equal, then the program branches to the target address, with a delay of one instruction.
Operation:
32  T:     target ← (offset_15)^14 || offset || 0^2
           condition ← (GPR[rs] = GPR[rt])
    T + 1: if condition then
               PC ← PC + target
           endif
64  T:     target ← (offset_15)^46 || offset || 0^2
           condition ← (GPR[rs] = GPR[rt])
    T + 1: if condition then
               PC ← PC + target
           endif
Exceptions:
None
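The target computation shared by all the conditional branches in this appendix can be sketched as follows. This is an illustrative model only: the names sign_extend and branch_target are hypothetical, and the sketch assumes the delay-slot instruction sits 4 bytes after the branch.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def branch_target(branch_pc: int, offset16: int, width: int = 64) -> int:
    """Delay-slot address plus the sign-extended 16-bit offset shifted left 2."""
    delay_slot_pc = branch_pc + 4  # assumption: delay slot directly follows the branch
    target = delay_slot_pc + (sign_extend(offset16, 16) << 2)
    return target & ((1 << width) - 1)  # wrap to the register width
```

With this model, an offset of 0xFFFF (that is, -1) on a branch at 0x1000 yields 0x1000, so the branch targets itself.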
BEQL    Branch On Equal Likely    BEQL

Bits    31..26   25..21   20..16   15..0
Field   BEQL     rs       rt       offset
Value   010100
Width   6        5        5        16
Format:
BEQL rs, rt, offset
Description:
A branch target address is computed from the sum of the address of the instruction in the
delay slot and the 16-bit offset, shifted left two bits and sign-extended. The contents of
general register rs and the contents of general register rt are compared. If the two registers
are equal, the target address is branched to, with a delay of one instruction. If the
conditional branch is not taken, the instruction in the branch delay slot is nullified.
Operation:
32  T:     target ← (offset_15)^14 || offset || 0^2
           condition ← (GPR[rs] = GPR[rt])
    T + 1: if condition then
               PC ← PC + target
           else
               NullifyCurrentInstruction
           endif
64  T:     target ← (offset_15)^46 || offset || 0^2
           condition ← (GPR[rs] = GPR[rt])
    T + 1: if condition then
               PC ← PC + target
           else
               NullifyCurrentInstruction
           endif
Exceptions:
None
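The difference between BEQ and BEQL lies only in what happens to the delay-slot instruction when the branch is not taken. A minimal sketch, using the hypothetical helper name beq_step and modelling only the two outcomes described above:

```python
def beq_step(rs_val: int, rt_val: int, likely: bool) -> tuple[bool, bool]:
    """Return (branch_taken, delay_slot_executes) for BEQ (likely=False) or BEQL (likely=True)."""
    taken = rs_val == rt_val
    # An ordinary branch always executes its delay slot; a "likely" branch
    # nullifies the delay-slot instruction when the branch is not taken.
    delay_slot_executes = taken or not likely
    return taken, delay_slot_executes
```

The same rule applies to every other "Likely" branch in this appendix; only the condition differs.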
BGEZ    Branch On Greater Than Or Equal To Zero    BGEZ

Bits    31..26   25..21   20..16   15..0
Field   REGIMM   rs       BGEZ     offset
Value   000001            00001
Width   6        5        5        16
Format:
BGEZ rs, offset
Description:
A branch target address is computed from the sum of the address of the instruction in the
delay slot and the 16-bit offset, shifted left two bits and sign-extended. If the contents of
general register rs have the sign bit cleared, then the program branches to the target
address, with a delay of one instruction.
Operation:
32  T:     target ← (offset_15)^14 || offset || 0^2
           condition ← (GPR[rs]_31 = 0)
    T + 1: if condition then
               PC ← PC + target
           endif
64  T:     target ← (offset_15)^46 || offset || 0^2
           condition ← (GPR[rs]_63 = 0)
    T + 1: if condition then
               PC ← PC + target
           endif
Exceptions:
None
BGEZAL    Branch On Greater Than Or Equal To Zero And Link    BGEZAL

Bits    31..26   25..21   20..16   15..0
Field   REGIMM   rs       BGEZAL   offset
Value   000001            10001
Width   6        5        5        16
Format:
BGEZAL rs, offset
Description:
A branch target address is computed from the sum of the address of the instruction in the
delay slot and the 16-bit offset, shifted left two bits and sign-extended. Unconditionally, the
address of the instruction after the delay slot is placed in the link register, r31. If the
contents of general register rs have the sign bit cleared, then the program branches to the
target address, with a delay of one instruction.
General register rs may not be general register 31, because such an instruction is not
restartable. An attempt to execute this instruction with register 31 specified as rs is not
trapped, however.
Operation:
32  T:     target ← (offset_15)^14 || offset || 0^2
           condition ← (GPR[rs]_31 = 0)
           GPR[31] ← PC + 8
    T + 1: if condition then
               PC ← PC + target
           endif
64  T:     target ← (offset_15)^46 || offset || 0^2
           condition ← (GPR[rs]_63 = 0)
           GPR[31] ← PC + 8
    T + 1: if condition then
               PC ← PC + target
           endif
Exceptions:
None
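The link behaviour can be illustrated with a small model. This is a sketch under stated assumptions: bgezal is a hypothetical function name, pc is the address of the BGEZAL instruction itself, and the returned next_pc is the address executed after the delay slot.

```python
def bgezal(pc: int, rs_val: int, offset16: int, width: int = 64) -> tuple[int, int]:
    """Return (link_value, next_pc) for a BGEZAL located at address pc."""
    mask = (1 << width) - 1
    link = (pc + 8) & mask  # written to r31 unconditionally (instruction after the delay slot)
    offset = offset16 & 0xFFFF
    if offset & 0x8000:     # sign-extend the 16-bit offset
        offset -= 0x10000
    target = (pc + 4 + (offset << 2)) & mask
    sign_bit = (rs_val >> (width - 1)) & 1
    next_pc = target if sign_bit == 0 else link  # fall through to pc + 8 when not taken
    return link, next_pc
```

Note that the link value is produced whether or not the branch is taken, which is why rs must not be register 31.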
BGEZALL    Branch On Greater Than Or Equal To Zero And Link Likely    BGEZALL

Bits    31..26   25..21   20..16    15..0
Field   REGIMM   rs       BGEZALL   offset
Value   000001            10011
Width   6        5        5         16
Format:
BGEZALL rs, offset
Description:
A branch target address is computed from the sum of the address of the instruction in the
delay slot and the 16-bit offset, shifted left two bits and sign-extended. Unconditionally, the
address of the instruction after the delay slot is placed in the link register, r31. If the
contents of general register rs have the sign bit cleared, then the program branches to the
target address, with a delay of one instruction.
General register rs may not be general register 31, because such an instruction is not
restartable. An attempt to execute this instruction with register 31 specified as rs is not
trapped, however. If the conditional branch is not taken, the instruction in the branch delay
slot is nullified.
Operation:
32  T:     target ← (offset_15)^14 || offset || 0^2
           condition ← (GPR[rs]_31 = 0)
           GPR[31] ← PC + 8
    T + 1: if condition then
               PC ← PC + target
           else
               NullifyCurrentInstruction
           endif
64  T:     target ← (offset_15)^46 || offset || 0^2
           condition ← (GPR[rs]_63 = 0)
           GPR[31] ← PC + 8
    T + 1: if condition then
               PC ← PC + target
           else
               NullifyCurrentInstruction
           endif
Exceptions:
None
BGEZL    Branch On Greater Than Or Equal To Zero Likely    BGEZL

Bits    31..26   25..21   20..16   15..0
Field   REGIMM   rs       BGEZL    offset
Value   000001            00011
Width   6        5        5        16
Format:
BGEZL rs, offset
Description:
A branch target address is computed from the sum of the address of the instruction in the
delay slot and the 16-bit offset, shifted left two bits and sign-extended. If the contents of
general register rs have the sign bit cleared, then the program branches to the target
address, with a delay of one instruction. If the conditional branch is not taken, the
instruction in the branch delay slot is nullified.
Operation:
32  T:     target ← (offset_15)^14 || offset || 0^2
           condition ← (GPR[rs]_31 = 0)
    T + 1: if condition then
               PC ← PC + target
           else
               NullifyCurrentInstruction
           endif
64  T:     target ← (offset_15)^46 || offset || 0^2
           condition ← (GPR[rs]_63 = 0)
    T + 1: if condition then
               PC ← PC + target
           else
               NullifyCurrentInstruction
           endif
Exceptions:
None
BGTZ    Branch On Greater Than Zero    BGTZ

Bits    31..26   25..21   20..16   15..0
Field   BGTZ     rs       0        offset
Value   000111            00000
Width   6        5        5        16
Format:
BGTZ rs, offset
Description:
A branch target address is computed from the sum of the address of the instruction in the
delay slot and the 16-bit offset, shifted left two bits and sign-extended. The contents of
general register rs are compared to zero. If the contents of general register rs have the sign
bit cleared and are not equal to zero, then the program branches to the target address, with a
delay of one instruction.
Operation:
32  T:     target ← (offset_15)^14 || offset || 0^2
           condition ← (GPR[rs]_31 = 0) and (GPR[rs] ≠ 0^32)
    T + 1: if condition then
               PC ← PC + target
           endif
64  T:     target ← (offset_15)^46 || offset || 0^2
           condition ← (GPR[rs]_63 = 0) and (GPR[rs] ≠ 0^64)
    T + 1: if condition then
               PC ← PC + target
           endif
Exceptions:
None
BGTZL    Branch On Greater Than Zero Likely    BGTZL

Bits    31..26   25..21   20..16   15..0
Field   BGTZL    rs       0        offset
Value   010111            00000
Width   6        5        5        16
Format:
BGTZL rs, offset
Description:
A branch target address is computed from the sum of the address of the instruction in the
delay slot and the 16-bit offset, shifted left two bits and sign-extended. The contents of
general register rs are compared to zero. If the contents of general register rs have the sign
bit cleared and are not equal to zero, then the program branches to the target address, with a
delay of one instruction. If the conditional branch is not taken, the instruction in the branch
delay slot is nullified.
Operation:
32  T:     target ← (offset_15)^14 || offset || 0^2
           condition ← (GPR[rs]_31 = 0) and (GPR[rs] ≠ 0^32)
    T + 1: if condition then
               PC ← PC + target
           else
               NullifyCurrentInstruction
           endif
64  T:     target ← (offset_15)^46 || offset || 0^2
           condition ← (GPR[rs]_63 = 0) and (GPR[rs] ≠ 0^64)
    T + 1: if condition then
               PC ← PC + target
           else
               NullifyCurrentInstruction
           endif
Exceptions:
None
BLEZ    Branch On Less Than Or Equal To Zero    BLEZ

Bits    31..26   25..21   20..16   15..0
Field   BLEZ     rs       0        offset
Value   000110            00000
Width   6        5        5        16
Format:
BLEZ rs, offset
Description:
A branch target address is computed from the sum of the address of the instruction in the
delay slot and the 16-bit offset, shifted left two bits and sign-extended. The contents of
general register rs are compared to zero. If the contents of general register rs have the sign
bit set, or are equal to zero, then the program branches to the target address, with a delay of
one instruction.
Operation:
32  T:     target ← (offset_15)^14 || offset || 0^2
           condition ← (GPR[rs]_31 = 1) or (GPR[rs] = 0^32)
    T + 1: if condition then
               PC ← PC + target
           endif
64  T:     target ← (offset_15)^46 || offset || 0^2
           condition ← (GPR[rs]_63 = 1) or (GPR[rs] = 0^64)
    T + 1: if condition then
               PC ← PC + target
           endif
Exceptions:
None
BLEZL    Branch On Less Than Or Equal To Zero Likely    BLEZL

Bits    31..26   25..21   20..16   15..0
Field   BLEZL    rs       0        offset
Value   010110            00000
Width   6        5        5        16
Format:
BLEZL rs, offset
Description:
A branch target address is computed from the sum of the address of the instruction in the
delay slot and the 16-bit offset, shifted left two bits and sign-extended. The contents of
general register rs are compared to zero. If the contents of general register rs have the sign bit
set, or are equal to zero, then the program branches to the target address, with a delay of one
instruction.
If the conditional branch is not taken, the instruction in the branch delay slot is nullified.
Operation:
32  T:     target ← (offset_15)^14 || offset || 0^2
           condition ← (GPR[rs]_31 = 1) or (GPR[rs] = 0^32)
    T + 1: if condition then
               PC ← PC + target
           else
               NullifyCurrentInstruction
           endif
64  T:     target ← (offset_15)^46 || offset || 0^2
           condition ← (GPR[rs]_63 = 1) or (GPR[rs] = 0^64)
    T + 1: if condition then
               PC ← PC + target
           else
               NullifyCurrentInstruction
           endif
Exceptions:
None
BLTZ    Branch On Less Than Zero    BLTZ

Bits    31..26   25..21   20..16   15..0
Field   REGIMM   rs       BLTZ     offset
Value   000001            00000
Width   6        5        5        16
Format:
BLTZ rs, offset
Description:
A branch target address is computed from the sum of the address of the instruction in the
delay slot and the 16-bit offset, shifted left two bits and sign-extended. If the contents of
general register rs have the sign bit set, then the program branches to the target address,
with a delay of one instruction.
Operation:
32  T:     target ← (offset_15)^14 || offset || 0^2
           condition ← (GPR[rs]_31 = 1)
    T + 1: if condition then
               PC ← PC + target
           endif
64  T:     target ← (offset_15)^46 || offset || 0^2
           condition ← (GPR[rs]_63 = 1)
    T + 1: if condition then
               PC ← PC + target
           endif
Exceptions:
None
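The four compare-with-zero conditions (BGEZ, BGTZ, BLEZ, BLTZ) are all expressed in terms of the sign bit of rs and, for BGTZ and BLEZ, a test for zero. A minimal sketch of these predicates follows; the function names are hypothetical and register values are modelled as unsigned integers of the given width.

```python
def sign_bit(value: int, width: int = 64) -> int:
    """Bit width-1 of the register value (bit 31 in 32-bit mode, bit 63 in 64-bit mode)."""
    return (value >> (width - 1)) & 1


def is_zero(value: int, width: int = 64) -> bool:
    return value & ((1 << width) - 1) == 0


def bgez(value: int, width: int = 64) -> bool:
    return sign_bit(value, width) == 0


def bgtz(value: int, width: int = 64) -> bool:
    return sign_bit(value, width) == 0 and not is_zero(value, width)


def blez(value: int, width: int = 64) -> bool:
    return sign_bit(value, width) == 1 or is_zero(value, width)


def bltz(value: int, width: int = 64) -> bool:
    return sign_bit(value, width) == 1
```

Because the test uses only the sign bit, the same 32-bit pattern 0x80000000 is negative when width is 32 but positive when width is 64.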
BLTZAL    Branch On Less Than Zero And Link    BLTZAL

Bits    31..26   25..21   20..16   15..0
Field   REGIMM   rs       BLTZAL   offset
Value   000001            10000
Width   6        5        5        16
Format:
BLTZAL rs, offset
Description:
A branch target address is computed from the sum of the address of the instruction in the
delay slot and the 16-bit offset, shifted left two bits and sign-extended. Unconditionally, the
address of the instruction after the delay slot is placed in the link register, r31. If the
contents of general register rs have the sign bit set, then the program branches to the target
address, with a delay of one instruction.
General register rs may not be general register 31, because such an instruction is not
restartable. An attempt to execute this instruction with register 31 specified as rs is not
trapped, however.
Operation:
32  T:     target ← (offset_15)^14 || offset || 0^2
           condition ← (GPR[rs]_31 = 1)
           GPR[31] ← PC + 8
    T + 1: if condition then
               PC ← PC + target
           endif
64  T:     target ← (offset_15)^46 || offset || 0^2
           condition ← (GPR[rs]_63 = 1)
           GPR[31] ← PC + 8
    T + 1: if condition then
               PC ← PC + target
           endif
Exceptions:
None
BLTZALL    Branch On Less Than Zero And Link Likely    BLTZALL

Bits    31..26   25..21   20..16    15..0
Field   REGIMM   rs       BLTZALL   offset
Value   000001            10010
Width   6        5        5         16
Format:
BLTZALL rs, offset
Description:
A branch target address is computed from the sum of the address of the instruction in the
delay slot and the 16-bit offset, shifted left two bits and sign-extended. Unconditionally, the
address of the instruction after the delay slot is placed in the link register, r31. If the
contents of general register rs have the sign bit set, then the program branches to the target
address, with a delay of one instruction.
General register rs may not be general register 31, because such an instruction is not
restartable. An attempt to execute this instruction with register 31 specified as rs is not
trapped, however. If the conditional branch is not taken, the instruction in the branch delay
slot is nullified.
Operation:
32  T:     target ← (offset_15)^14 || offset || 0^2
           condition ← (GPR[rs]_31 = 1)
           GPR[31] ← PC + 8
    T + 1: if condition then
               PC ← PC + target
           else
               NullifyCurrentInstruction
           endif
64  T:     target ← (offset_15)^46 || offset || 0^2
           condition ← (GPR[rs]_63 = 1)
           GPR[31] ← PC + 8
    T + 1: if condition then
               PC ← PC + target
           else
               NullifyCurrentInstruction
           endif
Exceptions:
None
BLTZL    Branch On Less Than Zero Likely    BLTZL

Bits    31..26   25..21   20..16   15..0
Field   REGIMM   rs       BLTZL    offset
Value   000001            00010
Width   6        5        5        16
Format:
BLTZL rs, offset
Description:
A branch target address is computed from the sum of the address of the instruction in the
delay slot and the 16-bit offset, shifted left two bits and sign-extended. If the contents of
general register rs have the sign bit set, then the program branches to the target address,
with a delay of one instruction. If the conditional branch is not taken, the instruction in the
branch delay slot is nullified.
Operation:
32  T:     target ← (offset_15)^14 || offset || 0^2
           condition ← (GPR[rs]_31 = 1)
    T + 1: if condition then
               PC ← PC + target
           else
               NullifyCurrentInstruction
           endif
64  T:     target ← (offset_15)^46 || offset || 0^2
           condition ← (GPR[rs]_63 = 1)
    T + 1: if condition then
               PC ← PC + target
           else
               NullifyCurrentInstruction
           endif
Exceptions:
None
BNE    Branch On Not Equal    BNE

Bits    31..26   25..21   20..16   15..0
Field   BNE      rs       rt       offset
Value   000101
Width   6        5        5        16
Format:
BNE rs, rt, offset
Description:
A branch target address is computed from the sum of the address of the instruction in the
delay slot and the 16-bit offset, shifted left two bits and sign-extended. The contents of
general register rs and the contents of general register rt are compared. If the two registers
are not equal, then the program branches to the target address, with a delay of one
instruction.
Operation:
32  T:     target ← (offset_15)^14 || offset || 0^2
           condition ← (GPR[rs] ≠ GPR[rt])
    T + 1: if condition then
               PC ← PC + target
           endif
64  T:     target ← (offset_15)^46 || offset || 0^2
           condition ← (GPR[rs] ≠ GPR[rt])
    T + 1: if condition then
               PC ← PC + target
           endif
Exceptions:
None
BNEL    Branch On Not Equal Likely    BNEL

Bits    31..26   25..21   20..16   15..0
Field   BNEL     rs       rt       offset
Value   010101
Width   6        5        5        16
Format:
BNEL rs, rt, offset
Description:
A branch target address is computed from the sum of the address of the instruction in the
delay slot and the 16-bit offset, shifted left two bits and sign-extended. The contents of
general register rs and the contents of general register rt are compared. If the two registers
are not equal, then the program branches to the target address, with a delay of one
instruction.
If the conditional branch is not taken, the instruction in the branch delay slot is nullified.
Operation:
32  T:     target ← (offset_15)^14 || offset || 0^2
           condition ← (GPR[rs] ≠ GPR[rt])
    T + 1: if condition then
               PC ← PC + target
           else
               NullifyCurrentInstruction
           endif
64  T:     target ← (offset_15)^46 || offset || 0^2
           condition ← (GPR[rs] ≠ GPR[rt])
    T + 1: if condition then
               PC ← PC + target
           else
               NullifyCurrentInstruction
           endif
Exceptions:
None
BREAK    Breakpoint    BREAK

Bits    31..26    25..6   5..0
Field   SPECIAL   code    BREAK
Value   000000            001101
Width   6         20      6
Format:
BREAK
Description:
A breakpoint trap occurs, immediately and unconditionally transferring control to the
exception handler.
The code field is available for use as software parameters, but is retrieved by the exception
handler only by loading the contents of the memory word containing the instruction.
Operation:
32, 64  T: BreakpointException
Exceptions:
Breakpoint exception
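Since the code field is only reachable by reading the instruction word back from memory, an exception handler has to mask it out itself. The following is a sketch of that extraction; is_break and break_code are hypothetical helper names, and the instruction word is assumed to have already been loaded as a 32-bit integer.

```python
def is_break(word: int) -> bool:
    """True if the 32-bit word is a BREAK: SPECIAL (000000) in bits 31..26, 001101 in bits 5..0."""
    return (word >> 26) & 0x3F == 0 and word & 0x3F == 0b001101


def break_code(word: int) -> int:
    """Extract the 20-bit code field from bits 25..6 of the instruction word."""
    return (word >> 6) & 0xFFFFF
```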
CACHE    Cache    CACHE

Bits    31..26   25..21   20..16   15..0
Field   CACHE    base     op       offset
Value   101111
Width   6        5        5        16
Format:
CACHE op, offset(base)
Description:
Generates a virtual address by sign-extending the 16-bit offset and adding the result to the
contents of register base. The virtual address is translated to a physical address using the
TLB, and the 5-bit sub-opcode (op) designates the cache operation to be performed at that
address.
If CP0 is unusable (User or Supervisor mode with the CP0 enable bit in the Status register
cleared), a Coprocessor Unusable exception is raised. The behavior of this instruction
for operation and cache combinations other than those listed in the table below, and when
used with an uncached address, is undefined.
The memory address specified in a CACHE instruction must be in a cacheable area. If an
uncacheable area is specified, operation is not guaranteed on the TX49. If the instruction is
issued for the cache line that contains the instruction itself, subsequent operation is not
guaranteed.
Index operations use part of the virtual address to specify a cache block.
The way is selected by the two least-significant bits (bits 1:0) of the virtual address:
Virtual Address bit (1:0) Selected Way
00 Way 0
01 Way 1
10 Way 2
11 Way 3
Hit operations access the specified cache as normal data references do, and perform the
specified operation if the cache block contains valid data with the specified physical address
(a hit). If the cache block is invalid or contains a different address (a miss), no operation is
performed. Write-back from a cache goes to memory. The address to be written is specified
by the cache tag and not the translated physical address. TLB Refill and TLB Invalid
exceptions can occur on any operation. For Index operations (where the physical address is
used to index the cache but need not match the cache tag), unmapped addresses may be used
to avoid TLB exceptions. This operation never causes TLB Modified or Virtual Coherency
exceptions. Bits 17..16 of the instruction specify the cache as follows:
Code  Name  Cache
0     I     Primary instruction
1     D     Primary data
2     -     Reserved
3     -     Reserved
Bits 20..18 of the instruction specify the operation as follows:
Code  Caches  Name                        Operation
0     I       Index Invalidate            Set the cache state of the indexed block to invalid.
0     D       Index WriteBack Invalidate  Examine the cache state and W bit of the primary data
                                          cache block at the index specified by the virtual
                                          address. If the state is not invalid and the W bit is
                                          set, then write back the block to memory. The address
                                          to write is taken from the primary cache tag. Set the
                                          cache state of the primary cache block to invalid.
                                          LSB (bits 1:0) of VA select the way.
1     I / D   Index Load Tag              Read the tag for the cache block at the specified index
                                          and place it into the TagLo and TagHi CP0 registers.
                                          LSB (bits 1:0) of VA select the way.
2     I / D   Index Store Tag             Write the tag for the cache block at the specified
                                          index from the TagLo and TagHi CP0 registers.
                                          LSB (bits 1:0) of VA select the way.
3     I       Undefined                   Undefined
3     D       Create Dirty Exclusive      This operation is used to avoid loading data needlessly
                                          from memory when writing new contents into an entire
                                          cache block. If the cache block does not contain the
                                          specified address, and the block is dirty, write it
                                          back to memory. In all cases, set the cache block tag
                                          to the specified physical address and set the cache
                                          state to Dirty Exclusive.
4     I / D   Hit Invalidate              If the cache block contains the specified address, mark
                                          the cache block invalid. In case of multi-hit, the lock
                                          bits of the specified line become ineffective and all
                                          ways are invalidated.
5     I       Fill                        Fill the primary instruction cache block from memory.
                                          LSB (bits 1:0) of VA select the way.
5     D       Hit WriteBack Invalidate    If the cache block contains the specified address,
                                          write back the data if it is dirty, and mark the cache
                                          block invalid.
6     I       Undefined                   Undefined
6     D       Hit WriteBack               If the cache block contains the specified address, and
                                          the W bit is set, write back the data to memory, and
                                          clear the W bit.
7     I       Undefined                   Undefined
7     D       Fill                        Fill the primary data cache block from memory.
                                          LSB (bits 1:0) of VA select the way.
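The 5-bit op field therefore splits into a 2-bit cache selector (instruction bits 17..16) and a 3-bit operation code (instruction bits 20..18). The following sketch decodes a CACHE instruction word into the names used in the two tables above. The names decode_cache_op, CACHE_NAMES and OPERATIONS are hypothetical and chosen for illustration only; combinations marked Undefined or reserved in the tables are reported as such rather than interpreted.

```python
CACHE_NAMES = {0: "I", 1: "D"}  # codes 2 and 3 are reserved

OPERATIONS = {
    (0, "I"): "Index Invalidate",
    (0, "D"): "Index WriteBack Invalidate",
    (1, "I"): "Index Load Tag",
    (1, "D"): "Index Load Tag",
    (2, "I"): "Index Store Tag",
    (2, "D"): "Index Store Tag",
    (3, "D"): "Create Dirty Exclusive",
    (4, "I"): "Hit Invalidate",
    (4, "D"): "Hit Invalidate",
    (5, "I"): "Fill",
    (5, "D"): "Hit WriteBack Invalidate",
    (6, "D"): "Hit WriteBack",
    (7, "D"): "Fill",
}


def decode_cache_op(word: int):
    """Return (cache, operation_name) for a 32-bit CACHE instruction word."""
    op = (word >> 16) & 0x1F       # 5-bit op field, instruction bits 20..16
    cache_code = op & 0x3          # instruction bits 17..16
    op_code = (op >> 2) & 0x7      # instruction bits 20..18
    cache = CACHE_NAMES.get(cache_code)
    if cache is None:
        return None, "reserved"
    return cache, OPERATIONS.get((op_code, cache), "Undefined")
```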
Operation:
32, 64  T: vAddr ← ((offset_15)^48 || offset_15..0) + GPR[base]
           (pAddr, uncached) ← AddressTranslation(vAddr, DATA)
           CacheOp(op, cAddr, pAddr)
Exceptions:
Coprocessor unusable exception
TLB refill exception
TLB invalid exception
CFC0    Move Control From Coprocessor 0    CFC0

Bits    31..26   25..21   20..16   15..11   10..0
Field   COP0     CF       rt       rd       0
Value   010000   00010                      000 0000 0000
Width   6        5        5        5        11
Format:
CFC0 rt, rd
Description:
For ICE system only.
Loads the contents of Monitor memory into the general-purpose register rt.
Operation:
32  T:     data ← CCR[0,rd]
    T + 1: GPR[rt] ← data
64  T:     data ← (CCR[0,rd]_31)^32 || CCR[0,rd]
    T + 1: GPR[rt] ← data
Exceptions:
Coprocessor Unusable exception
CFCz    Move Control From Coprocessor    CFCz

Bits    31..26    25..21   20..16   15..11   10..0
Field   COPz      CF       rt       rd       0
Value   0100xx*   00010                      000 0000 0000
Width   6         5        5        5        11
Format:
CFCz rt, rd
Description:
The contents of coprocessor control register rd of coprocessor unit z are loaded into general
register rt.
Operation:
32  T:     data ← CCR[z,rd]
    T + 1: GPR[rt] ← data
64  T:     data ← (CCR[z,rd]_31)^32 || CCR[z,rd]
    T + 1: GPR[rt] ← data
Exceptions:
Coprocessor unusable exception
Reserved Instruction exception (CFC3)
Opcode Bit Encoding:
Mnemonic  Bits 31..28  Bits 27..26     Bits 25..21
          (opcode)     (coprocessor    (coprocessor
                       unit number)    suboperation)
CFC1      0100         01              00010
CFC2      0100         10              00010
Note: CFC1 for FPU (see Appendix B)
      CFC2 for Coprocessor 2 (user-defined)
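In 64-bit mode the CFCz operation replicates bit 31 of the 32-bit control register into the upper 32 bits of the destination. A minimal sketch of that sign extension, using the hypothetical helper name cfc_result and modelling register contents as unsigned integers:

```python
def cfc_result(ccr_value: int, width: int = 64) -> int:
    """Value written to GPR[rt] by CFCz for a 32-bit control register value."""
    ccr_value &= 0xFFFFFFFF
    if width == 32:
        return ccr_value
    # 64-bit mode: (CCR[z,rd]_31)^32 || CCR[z,rd]
    upper = 0xFFFFFFFF if (ccr_value >> 31) & 1 else 0
    return (upper << 32) | ccr_value
```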
COPz    Coprocessor z Operation    COPz

Bits    31..26    25   24..0
Field   COPz      CO   cofun
Value   0100xx*   1
Width   6         1    25
Format:
COPz cofun
Description:
A coprocessor operation is performed. The operation may specify and reference internal
coprocessor registers, and may change the state of the coprocessor condition line, but does
not modify state within the processor or the cache / memory system. Details of coprocessor 1
operations are contained in Appendix B.
Operation:
32, 64 T: CoprocessorOperation(z, cofun)
Exceptions:
Coprocessor unusable exception
Coprocessor interrupt or Floating-Point exception (CP1 only)
Reserved Instruction exception (COP3)
Opcode Bit Encoding:
Mnemonic  Bits 31..28  Bits 27..26     Bit 25
          (opcode)     (coprocessor    (CO sub-opcode; see
                       unit number)    end of Appendix A)
COP0      0100         00              1
COP1      0100         01              1
COP2      0100         10              1
COP3      0100         11              1
Note: COP0 for ICE system
      COP1 for FPU (see Appendix B)
      COP2 for Coprocessor 2 (user-defined)
CTC0    Move Control To Coprocessor 0    CTC0

Bits    31..26   25..21   20..16   15..11   10..0
Field   COP0     CT       rt       rd       0
Value   010000   00110                      000 0000 0000
Width   6        5        5        5        11
Format:
CTC0 rt, rd
Description:
For ICE system only.
Loads the contents of general-purpose register rt into the Monitor memory.
Operation:
32, 64  T:     data ← GPR[rt]
        T + 1: CCR[0,rd] ← data
Exceptions:
Coprocessor Unusable exception
CTCz    Move Control To Coprocessor z    CTCz

Bits    31..26    25..21   20..16   15..11   10..0
Field   COPz      CT       rt       rd       0
Value   0100xx*   00110                      000 0000 0000
Width   6         5        5        5        11
Format:
CTCz rt, rd
Description:
The contents of general register rt are loaded into control register rd of coprocessor unit z.
Operation:
32, 64  T:     data ← GPR[rt]
        T + 1: CCR[z,rd] ← data
Exceptions:
Coprocessor unusable exception
Reserved Instruction exception (CTC3)
* Opcode Bit Encoding:
Mnemonic  Bits 31..28  Bits 27..26     Bits 25..21
          (opcode)     (coprocessor    (coprocessor
                       unit number)    suboperation)
CTC1      0100         01              00110
CTC2      0100         10              00110
Note: CTC1 for FPU (see Appendix B)
      CTC2 for Coprocessor 2 (user-defined)
See “CPU Instruction Opcode Bit Encoding” at the end of Appendix A.
DADD    Doubleword Add    DADD

Bits    31..26    25..21   20..16   15..11   10..6   5..0
Field   SPECIAL   rs       rt       rd       0       DADD
Value   000000                               00000   101100
Width   6         5        5        5        5       6
Format:
DADD rd, rs, rt
Description:
The contents of general register rs and the contents of general register rt are added to form
the result. The result is placed into general register rd.
An overflow exception occurs if the carries out of bits 62 and 63 differ (2’s-complement
overflow). The destination register rd is not modified when an integer overflow exception
occurs.
Operation:
64  T: GPR[rd] ← GPR[rs] + GPR[rt]
Exceptions:
Integer overflow exception
Reserved Instruction exception (in 32-bit User or 32-bit Supervisor mode)
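The overflow rule (carries out of bits 62 and 63 differ) can be checked directly on unsigned 64-bit values. The sketch below is illustrative only: dadd and IntegerOverflow are hypothetical names, and raising a Python exception stands in for the Integer overflow exception, in which case no result is produced, mirroring the rule that rd is not modified.

```python
MASK64 = (1 << 64) - 1


class IntegerOverflow(Exception):
    """Stands in for the Integer overflow exception in this model."""


def dadd(rs_val: int, rt_val: int) -> int:
    """64-bit two's-complement add with the carry-based overflow check."""
    a, b = rs_val & MASK64, rt_val & MASK64
    low63 = MASK64 >> 1
    carry_out_62 = ((a & low63) + (b & low63)) >> 63  # carry into bit 63
    carry_out_63 = (a + b) >> 64                      # carry out of bit 63
    if carry_out_62 != carry_out_63:
        raise IntegerOverflow("carries out of bits 62 and 63 differ")
    return (a + b) & MASK64
```

DADDU performs the same addition but skips the carry comparison, so it would simply return (a + b) & MASK64.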
DADDI    Doubleword Add Immediate    DADDI

Bits    31..26   25..21   20..16   15..0
Field   DADDI    rs       rt       immediate
Value   011000
Width   6        5        5        16
Format:
DADDI rt, rs, immediate
Description:
The 16-bit immediate is sign-extended and added to the contents of general register rs to
form the result. The result is placed into general register rt.
An overflow exception occurs if carries out of bits 62 and 63 differ (2’s-complement
overflow). The destination register rt is not modified when an integer overflow exception
occurs.
Operation:
64  T: GPR[rt] ← GPR[rs] + (immediate_15)^48 || immediate_15..0
Note: The operation is the same in 32-bit Kernel mode.
Exceptions:
Integer overflow exception
Reserved Instruction exception (in 32-bit User or 32-bit Supervisor mode)
DADDIU    Doubleword Add Immediate Unsigned    DADDIU

Bits    31..26   25..21   20..16   15..0
Field   DADDIU   rs       rt       immediate
Value   011001
Width   6        5        5        16
Format:
DADDIU rt, rs, immediate
Description:
The 16-bit immediate is sign-extended and added to the contents of general register rs to
form the result. The result is placed into general register rt. No integer overflow exception
occurs under any circumstances.
The only difference between this instruction and the DADDI instruction is that DADDIU
never causes an overflow exception.
Operation:
64  T: GPR[rt] ← GPR[rs] + (immediate_15)^48 || immediate_15..0
Note: The operation is the same in 32-bit Kernel mode.
Exceptions:
Reserved Instruction exception (in 32-bit User or 32-bit Supervisor mode)
DADDU    Doubleword Add Unsigned    DADDU

Bits    31..26    25..21   20..16   15..11   10..6   5..0
Field   SPECIAL   rs       rt       rd       0       DADDU
Value   000000                               00000   101101
Width   6         5        5        5        5       6
Format:
DADDU rd, rs, rt
Description:
The contents of general register rs and the contents of general register rt are added to form
the result. The result is placed into general register rd.
No overflow exception occurs under any circumstances.
The only difference between this instruction and the DADD instruction is that DADDU
never causes an overflow exception.
Operation:
64  T: GPR[rd] ← GPR[rs] + GPR[rt]
Note: The operation is the same in 32-bit Kernel mode.
Exceptions:
Reserved Instruction exception (in 32-bit User or 32-bit Supervisor mode)
DDIV    Doubleword Divide    DDIV

Bits    31..26    25..21   20..16   15..6          5..0
Field   SPECIAL   rs       rt       0              DDIV
Value   000000                      00 0000 0000   011110
Width   6         5        5        10             6
Format:
DDIV rs, rt
Description:
The contents of general register rs are divided by the contents of general register rt,
treating both operands as 2’s-complement values. No overflow exception occurs under any
circumstances, and the result of this operation is undefined when the divisor is zero.
This instruction is typically followed by additional instructions to check for a zero divisor
and for overflow.
When the operation completes, the quotient word of the double result is loaded into special
register LO, and the remainder word of the double result is loaded into special register HI.
If either of the two preceding instructions is MFHI or MFLO, the results of those
instructions are undefined. Correct operation requires separating reads of HI or LO from
writes by two or more instructions.
Operation:
64 T-2: LO ← undefined
        HI ← undefined
   T-1: LO ← undefined
        HI ← undefined
   T:   LO ← GPR[rs] div GPR[rt]
        HI ← GPR[rs] mod GPR[rt]
Note: The operation is the same in 32-bit kernel mode.
Exceptions:
Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)
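Because DDIV raises no exception for a zero divisor or for overflow, software has to perform those checks itself. The C sketch below shows one way to express them. The helper name ddiv_checked and its boolean return convention are hypothetical, and the use of C's truncating division for div and mod is an assumption of this example rather than something stated in this section.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical guarded model of DDIV. Returns false, leaving *lo and *hi
   untouched, in the two cases software is expected to check for:
   a zero divisor and the single overflowing quotient. */
static bool ddiv_checked(int64_t rs, int64_t rt, int64_t *lo, int64_t *hi)
{
    if (rt == 0)
        return false;                 /* hardware result is undefined */
    if (rs == INT64_MIN && rt == -1)
        return false;                 /* quotient does not fit in 64 bits */
    *lo = rs / rt;                    /* quotient  -> LO */
    *hi = rs % rt;                    /* remainder -> HI */
    return true;
}
```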
DDIVU Doubleword Divide Unsigned DDIVU
Bits    31..26    25..21  20..16  15..6          5..0
Field   SPECIAL   rs      rt      0              DDIVU
Value   000000                    00 0000 0000   011111
Width   6         5       5       10             6
Format:
DDIVU rs, rt
Description:
The contents of general register rs are divided by the contents of general register rt,
treating both operands as unsigned values. No integer overflow exception occurs under any
circumstances, and the result of this operation is undefined when the divisor is zero.
This instruction is typically followed by additional instructions to check for a zero divisor.
When the operation completes, the quotient word of the double result is loaded into special
register LO, and the remainder word of the double result is loaded into special register HI.
If either of the two preceding instructions is MFHI or MFLO, the results of those
instructions are undefined. Correct operation requires separating reads of HI or LO from
writes by two or more instructions.
Operation:
64 T-2: LO ← undefined
        HI ← undefined
   T-1: LO ← undefined
        HI ← undefined
   T:   LO ← (0 || GPR[rs]) div (0 || GPR[rt])
        HI ← (0 || GPR[rs]) mod (0 || GPR[rt])
Note: The operation is the same in 32-bit kernel mode.
Exceptions:
Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)
DERET Debug Exception Return DERET
Bits    31..26   25   24..6                     5..0
Field   COP0     CO   0                         DERET
Value   010000   1    000 0000 0000 0000 0000   011111
Width   6        1    19                        6
Format:
DERET
Description:
Executes a return from a self-debug interrupt or exception. Like the branch and jump
instructions, this instruction has a branch delay slot and takes effect with a delay of one
instruction cycle. The DERET instruction itself must not be placed in a delay slot.
The return address stored in the DEPC register is copied to the PC, and processing returns
to the original program.
Note: If a MTC0 instruction was used to set the return address in the DEPC register, a
minimum of two instructions must be executed before executing DERET.
Operation:
32, 64 T:   temp ← DEPC
       T+1: PC ← temp
            Debug_{30} ← 0
Exceptions:
Coprocessor unusable exception
DIV Divide DIV
Bits    31..26    25..21  20..16  15..6          5..0
Field   SPECIAL   rs      rt      0              DIV
Value   000000                    00 0000 0000   011010
Width   6         5       5       10             6
Format:
DIV rs, rt
Description:
The contents of general register rs are divided by the contents of general register rt,
treating both operands as 2’s-complement values. No overflow exception occurs under any
circumstances, and the result of this operation is undefined when the divisor is zero. In 64-
bit mode, the operands must be valid sign-extended, 32-bit values.
This instruction is typically followed by additional instructions to check for a zero divisor
and for overflow.
When the operation completes, the quotient word of the double result is loaded into special
register LO, and the remainder word of the double result is loaded into special register HI.
If either of the two preceding instructions is MFHI or MFLO, the results of those
instructions are undefined. Correct operation requires separating reads of HI or LO from
writes by two or more instructions.
Operation:
32 T-2: LO ← undefined
        HI ← undefined
   T-1: LO ← undefined
        HI ← undefined
   T:   LO ← GPR[rs] div GPR[rt]
        HI ← GPR[rs] mod GPR[rt]
64 T-2: LO ← undefined
        HI ← undefined
   T-1: LO ← undefined
        HI ← undefined
   T:   q ← GPR[rs]_{31..0} div GPR[rt]_{31..0}
        r ← GPR[rs]_{31..0} mod GPR[rt]_{31..0}
        LO ← (q_{31})^{32} || q_{31..0}
        HI ← (r_{31})^{32} || r_{31..0}
Exceptions:
None
DIVU Divide Unsigned DIVU
Bits    31..26    25..21  20..16  15..6          5..0
Field   SPECIAL   rs      rt      0              DIVU
Value   000000                    00 0000 0000   011011
Width   6         5       5       10             6
Format:
DIVU rs, rt
Description:
The contents of general register rs are divided by the contents of general register rt,
treating both operands as unsigned values. No integer overflow exception occurs under any
circumstances, and the result of this operation is undefined when the divisor is zero. In 64-bit
mode, the operands must be valid sign-extended, 32-bit values.
This instruction is typically followed by additional instructions to check for a zero divisor.
When the operation completes, the quotient word of the double result is loaded into special
register LO, and the remainder word of the double result is loaded into special register HI.
If either of the two preceding instructions is MFHI or MFLO, the results of those
instructions are undefined. Correct operation requires separating reads of HI or LO from
writes by two or more instructions.
Operation:
32 T-2: LO ← undefined
        HI ← undefined
   T-1: LO ← undefined
        HI ← undefined
   T:   LO ← (0 || GPR[rs]) div (0 || GPR[rt])
        HI ← (0 || GPR[rs]) mod (0 || GPR[rt])
64 T-2: LO ← undefined
        HI ← undefined
   T-1: LO ← undefined
        HI ← undefined
   T:   q ← (0 || GPR[rs]_{31..0}) div (0 || GPR[rt]_{31..0})
        r ← (0 || GPR[rs]_{31..0}) mod (0 || GPR[rt]_{31..0})
        LO ← (q_{31})^{32} || q_{31..0}
        HI ← (r_{31})^{32} || r_{31..0}
Exceptions:
None
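The 64-bit-mode behaviour above (divide the low words as unsigned values, then sign-extend the 32-bit quotient and remainder into LO and HI) can be sketched in C as follows. The type hilo_t and the function names are hypothetical names chosen for this example, and the caller is assumed to have already excluded a zero divisor.

```c
#include <stdint.h>

/* Hypothetical container for the HI/LO register pair. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} hilo_t;

/* (x_31)^32 || x_31..0 : replicate bit 31 into bits 63..32. */
static uint64_t sign_extend32(uint32_t x)
{
    uint64_t v = x;
    if (x & 0x80000000u)
        v |= UINT64_C(0xFFFFFFFF00000000);
    return v;
}

/* Sketch of DIVU in 64-bit mode. The low word of rt must be non-zero;
   the hardware result for a zero divisor is undefined. */
static hilo_t divu_64bit_mode(uint64_t rs, uint64_t rt)
{
    uint32_t n = (uint32_t)rs;        /* GPR[rs]_31..0 */
    uint32_t d = (uint32_t)rt;        /* GPR[rt]_31..0 */
    hilo_t out;
    out.lo = sign_extend32(n / d);    /* q */
    out.hi = sign_extend32(n % d);    /* r */
    return out;
}
```

Note that a quotient with bit 31 set, such as 0xFFFFFFFF divided by 1, appears in LO with its upper 32 bits set as well.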
DMFC0 Doubleword Move From System Control Coprocessor DMFC0
Bits    31..26   25..21  20..16  15..11  10..0
Field   COP0     DMF     rt      rd      0
Value   010000   00001                   000 0000 0000
Width   6        5       5       5       11
Format:
DMFC0 rt, rd
Description:
The contents of coprocessor register rd of CP0 are loaded into general register rt.
This operation is defined in kernel mode regardless of the setting of the Status.KX bit.
Execution of this instruction in supervisor mode with Status.SX = 0, or in user mode
with Status.UX = 0, causes a reserved instruction exception. All 64 bits of the general register
destination are written from the coprocessor register source. The operation of DMFC0 on a
32-bit coprocessor 0 register is undefined.
Operation:
64 T:   data ← CPR[0,rd]
   T+1: GPR[rt] ← data
Note: The operation is the same in 32-bit kernel mode.
Exceptions:
Coprocessor unusable exception
Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)
DMTC0 Doubleword Move To System Control Coprocessor DMTC0
Bits    31..26   25..21  20..16  15..11  10..0
Field   COP0     DMT     rt      rd      0
Value   010000   00101                   000 0000 0000
Width   6        5       5       5       11
Format:
DMTC0 rt, rd
Description:
The contents of general register rt are loaded into coprocessor register rd of CP0.
This operation is defined for the R4000 operating in 64-bit mode or in 32-bit kernel mode.
Execution of this instruction in 32-bit user or supervisor mode causes a reserved instruction
exception. All 64 bits of the coprocessor 0 register are written from the general register
source. The operation of DMTC0 on a 32-bit coprocessor 0 register is undefined.
Because the state of the virtual address translation system may be altered by this
instruction, the operation of load instructions, store instructions, and TLB operations
immediately prior to and after this instruction is undefined.
Operation:
64 T:   data ← GPR[rt]
   T+1: CPR[0,rd] ← data
Note: The operation is the same in 32-bit kernel mode.
Exceptions:
Coprocessor unusable exception
Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)
DMULT Doubleword Multiply DMULT
Bits    31..26    25..21  20..16  15..6          5..0
Field   SPECIAL   rs      rt      0              DMULT
Value   000000                    00 0000 0000   011100
Width   6         5       5       10             6

Bits    31..26    25..21  20..16  15..11  10..6    5..0
Field   SPECIAL   rs      rt      rd      0        DMULT
Value   000000                            0 0000   011100
Width   6         5       5       5       5        6
Format:
DMULT rs, rt
DMULT rd, rs, rt
Description:
The contents of general registers rs and rt are multiplied, treating both operands as 2’s-
complement values. No integer overflow exception occurs under any circumstances.
When the operation completes, the low-order word of the double result is loaded into
special register LO, and the high-order word of the double result is loaded into special
register HI.
If either of the two preceding instructions is MFHI or MFLO, the results of these
instructions are undefined. Correct operation requires separating reads of HI or LO from
writes by a minimum of two other instructions.
Operation:
64 T-2: LO ← undefined
        HI ← undefined
   T-1: LO ← undefined
        HI ← undefined
   T:   t ← GPR[rs] * GPR[rt]
        LO ← t_{63..0}
        HI ← t_{127..64}
        GPR[rd] ← t_{63..0}
Note: The operation is the same in 32-bit kernel mode.
Exceptions:
Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)
DMULTU Doubleword Multiply Unsigned DMULTU
Bits    31..26    25..21  20..16  15..6          5..0
Field   SPECIAL   rs      rt      0              DMULTU
Value   000000                    00 0000 0000   011101
Width   6         5       5       10             6

Bits    31..26    25..21  20..16  15..11  10..6    5..0
Field   SPECIAL   rs      rt      rd      0        DMULTU
Value   000000                            0 0000   011101
Width   6         5       5       5       5        6
Format:
DMULTU rs, rt
Description:
The contents of general register rs and the contents of general register rt are multiplied,
treating both operands as unsigned values. No overflow exception occurs under any
circumstances.
When the operation completes, the low-order word of the double result is loaded into
special register LO, and the high-order word of the double result is loaded into special
register HI.
If either of the two preceding instructions is MFHI or MFLO, the results of these
instructions are undefined. Correct operation requires separating reads of HI or LO from
writes by a minimum of two instructions.
Operation:
64 T-2: LO ← undefined
        HI ← undefined
   T-1: LO ← undefined
        HI ← undefined
   T:   t ← (0 || GPR[rs]) * (0 || GPR[rt])
        LO ← t_{63..0}
        HI ← t_{127..64}
        GPR[rd] ← t_{63..0}
Note: The operation is the same in 32-bit kernel mode.
Exceptions:
Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)
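The split of the 128-bit product t into LO (bits 63..0) and HI (bits 127..64) can be modelled in portable C without a 128-bit integer type. The sketch below is only an illustration of the arithmetic; the function name dmultu and the pointer-based outputs are assumptions of this example.

```c
#include <stdint.h>

/* Sketch of the DMULTU product: computes the unsigned 128-bit value
   rs * rt from four 32x32 partial products and returns its halves. */
static void dmultu(uint64_t rs, uint64_t rt, uint64_t *hi, uint64_t *lo)
{
    uint64_t a0 = rs & 0xFFFFFFFFu, a1 = rs >> 32;
    uint64_t b0 = rt & 0xFFFFFFFFu, b1 = rt >> 32;

    uint64_t p00 = a0 * b0;           /* weight 2^0  */
    uint64_t p01 = a0 * b1;           /* weight 2^32 */
    uint64_t p10 = a1 * b0;           /* weight 2^32 */
    uint64_t p11 = a1 * b1;           /* weight 2^64 */

    /* Sum of everything that lands in bits 63..32, including carries. */
    uint64_t mid = (p00 >> 32) + (p01 & 0xFFFFFFFFu) + (p10 & 0xFFFFFFFFu);

    *lo = (mid << 32) | (p00 & 0xFFFFFFFFu);              /* t_63..0   */
    *hi = p11 + (p01 >> 32) + (p10 >> 32) + (mid >> 32);  /* t_127..64 */
}
```

In the three-operand form, the value written to rd would be the same as the lo output here.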
DSLL Doubleword Shift Left Logical DSLL
Bits    31..26    25..21  20..16  15..11  10..6   5..0
Field   SPECIAL   0       rt      rd      sa      DSLL
Value   000000    00000                           111000
Width   6         5       5       5       5       6
Format:
DSLL rd, rt, sa
Description:
The contents of general register rt are shifted left by sa bits, inserting zeros into the low-
order bits. The result is placed in register rd.
Operation:
64 T: s ← 0 || sa
      GPR[rd] ← GPR[rt]_{(63-sa)..0} || 0^{s}
Note: The operation is the same in 32-bit kernel mode.
Exceptions:
Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)
DSLLV Doubleword Shift Left Logical Variable DSLLV
Bits    31..26    25..21  20..16  15..11  10..6   5..0
Field   SPECIAL   rs      rt      rd      0       DSLLV
Value   000000                            00000   010100
Width   6         5       5       5       5       6
Format:
DSLLV rd, rt, rs
Description:
The contents of general register rt are shifted left by the number of bits specified by the
low-order six bits of general register rs, inserting zeros into the low-order bits. The result
is placed in register rd.
Operation:
64 T: s ← GPR[rs]_{5..0}
      GPR[rd] ← GPR[rt]_{(63-s)..0} || 0^{s}
Note: The operation is the same in 32-bit kernel mode.
Exceptions:
Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)
DSLL32 Doubleword Shift Left Logical + 32 DSLL32
Bits    31..26    25..21  20..16  15..11  10..6   5..0
Field   SPECIAL   0       rt      rd      sa      DSLL32
Value   000000    00000                           111100
Width   6         5       5       5       5       6
Format:
DSLL32 rd, rt, sa
Description:
The contents of general register rt are shifted left by 32 + sa bits, inserting zeros into the
low-order bits. The result is placed in register rd.
Operation:
64 T: s 1 sa
GPR[rd] GPR[rt](63-s) 0  0s
Note: It is also the same operation in th e 32 bit kernel mode.
Exceptions:
Reserved Instruction exception (in the 32 bit user or 32 bit supervisior mode)
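DSLL and DSLL32 differ only in the bit prepended to the 5-bit sa field: s is 0 || sa (0 to 31) for DSLL and 1 || sa (32 to 63) for DSLL32. A minimal C sketch of that relationship, with function names chosen for this example only:

```c
#include <stdint.h>

/* s = 0 || sa : shift amounts 0..31 */
static uint64_t dsll(uint64_t rt, unsigned sa)
{
    unsigned s = sa & 31u;
    return rt << s;
}

/* s = 1 || sa : shift amounts 32..63 */
static uint64_t dsll32(uint64_t rt, unsigned sa)
{
    unsigned s = 32u | (sa & 31u);
    return rt << s;
}
```

Together the two encodings cover every shift amount from 0 to 63 with a 5-bit field.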
DSRA Doubleword Shift Right Arithmetic DSRA
Bits    31..26    25..21  20..16  15..11  10..6   5..0
Field   SPECIAL   0       rt      rd      sa      DSRA
Value   000000    00000                           111011
Width   6         5       5       5       5       6
Format:
DSRA rd, rt, sa
Description:
The contents of general register rt are shifted right by sa bits, sign-extending the high-
order bits. The result is placed in register rd.
Operation:
64 T: s ← 0 || sa
      GPR[rd] ← (GPR[rt]_{63})^{s} || GPR[rt]_{63..s}
Note: The operation is the same in 32-bit kernel mode.
Exceptions:
Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)
DSRAV Doubleword Shift Right Arithmetic Variable DSRAV
Bits    31..26    25..21  20..16  15..11  10..6   5..0
Field   SPECIAL   rs      rt      rd      0       DSRAV
Value   000000                            00000   010111
Width   6         5       5       5       5       6
Format:
DSRAV rd, rt, rs
Description:
The contents of general register rt are shifted right by the number of bits specified by the
low-order six bits of general register rs, sign-extending the high-order bits. The result is
placed in register rd.
Operation:
64 T: s ← GPR[rs]_{5..0}
      GPR[rd] ← (GPR[rt]_{63})^{s} || GPR[rt]_{63..s}
Note: The operation is the same in 32-bit kernel mode.
Exceptions:
Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)
DSRA32 Doubleword Shift Right Arithmetic + 32 DSRA32
Bits    31..26    25..21  20..16  15..11  10..6   5..0
Field   SPECIAL   0       rt      rd      sa      DSRA32
Value   000000    00000                           111111
Width   6         5       5       5       5       6
Format:
DSRA32 rd, rt, sa
Description:
The contents of general register rt are shifted right by 32 + sa bits, sign-extending the
high-order bits. The result is placed in register rd.
Operation:
64 T: s ← 1 || sa
      GPR[rd] ← (GPR[rt]_{63})^{s} || GPR[rt]_{63..s}
Note: The operation is the same in 32-bit kernel mode.
Exceptions:
Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)
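The arithmetic right shifts (DSRA, DSRAV, DSRA32) all compute (GPR[rt]_63)^s || GPR[rt]_63..s and differ only in where s comes from. Since right-shifting a negative signed value is implementation-defined in C, the sketch below builds the sign fill explicitly on an unsigned value. The function name dsra_model is a hypothetical name for this example.

```c
#include <stdint.h>

/* Sketch of a 64-bit arithmetic right shift by s (0..63):
   the vacated high-order bits are filled with copies of bit 63. */
static uint64_t dsra_model(uint64_t rt, unsigned s)
{
    s &= 63u;
    uint64_t result = rt >> s;                 /* GPR[rt]_63..s  */
    if (s != 0 && (rt >> 63) != 0)
        result |= ~UINT64_C(0) << (64 - s);    /* (GPR[rt]_63)^s */
    return result;
}
```

Under this model, DSRA would call it with s = sa, DSRA32 with s = 32 + sa, and DSRAV with the low six bits of rs.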
DSRL Doubleword Shift Right Logical DSRL
Bits    31..26    25..21  20..16  15..11  10..6   5..0
Field   SPECIAL   0       rt      rd      sa      DSRL
Value   000000    00000                           111010
Width   6         5       5       5       5       6
Format:
DSRL rd, rt, sa
Description:
The contents of general register rt are shifted right by sa bits, inserting zeros into the high-
order bits. The result is placed in register rd.
Operation:
64 T: s ← 0 || sa
      GPR[rd] ← 0^{s} || GPR[rt]_{63..s}
Note: The operation is the same in 32-bit kernel mode.
Exceptions:
Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)
DSRLV Doubleword Shift Right Logical Variable DSRLV
Bits    31..26    25..21  20..16  15..11  10..6   5..0
Field   SPECIAL   rs      rt      rd      0       DSRLV
Value   000000                            00000   010110
Width   6         5       5       5       5       6
Format:
DSRLV rd, rt, rs
Description:
The contents of general register rt are shifted right by the number of bits specified by the
low-order six bits of general register rs, inserting zeros into the high-order bits. The result
is placed in register rd.
Operation:
64 T: s ← GPR[rs]_{5..0}
      GPR[rd] ← 0^{s} || GPR[rt]_{63..s}
Note: The operation is the same in 32-bit kernel mode.
Exceptions:
Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)
DSRL32 Doubleword Shift Right Logical + 32 DSRL32
Bits    31..26    25..21  20..16  15..11  10..6   5..0
Field   SPECIAL   0       rt      rd      sa      DSRL32
Value   000000    00000                           111110
Width   6         5       5       5       5       6
Format:
DSRL32 rd, rt, sa
Description:
The contents of general register rt are shifted right by 32 + sa bits, inserting zeros into the
high-order bits. The result is placed in register rd.
Operation:
64 T: s ← 1 || sa
      GPR[rd] ← 0^{s} || GPR[rt]_{63..s}
Note: The operation is the same in 32-bit kernel mode.
Exceptions:
Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)
DSUB Doubleword Subtract DSUB
Bits    31..26    25..21  20..16  15..11  10..6   5..0
Field   SPECIAL   rs      rt      rd      0       DSUB
Value   000000                            00000   101110
Width   6         5       5       5       5       6
Format:
DSUB rd, rs, rt
Description:
The contents of general register rt are subtracted from the contents of general register rs to
form a result. The result is placed into general register rd.
The only difference between this instruction and the DSUBU instruction is that DSUBU
never traps on overflow.
An integer overflow exception takes place if the carries out of bits 62 and 63 differ (2’s-
complement overflow). The destination register rd is not modified when an integer overflow
exception occurs.
Operation:
64 T: GPR[rd] ← GPR[rs] - GPR[rt]
Note: The operation is the same in 32-bit kernel mode.
Exceptions:
Integer overflow exception
Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)
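The overflow condition for DSUB (carries out of bits 62 and 63 differ) is equivalent to the operands having different signs while the result's sign differs from that of rs. The C sketch below uses that sign-based form; the function name dsub and the convention of returning false in place of raising the exception are assumptions of this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Sketch of DSUB. Returns false (modelling the integer overflow
   exception) and leaves *rd unmodified when 2's-complement overflow
   occurs; otherwise stores rs - rt in *rd and returns true. */
static bool dsub(uint64_t rs, uint64_t rt, uint64_t *rd)
{
    uint64_t diff = rs - rt;                   /* wraps modulo 2^64 */
    /* Overflow: rs and rt differ in sign, and diff differs in sign from rs. */
    bool overflow = (((rs ^ rt) & (rs ^ diff)) >> 63) != 0;
    if (overflow)
        return false;
    *rd = diff;
    return true;
}
```

DSUBU corresponds to the same subtraction with the overflow check removed.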
DSUBU Doubleword Subtract Unsigned DSUBU
Bits    31..26    25..21  20..16  15..11  10..6   5..0
Field   SPECIAL   rs      rt      rd      0       DSUBU
Value   000000                            00000   101111
Width   6         5       5       5       5       6
Format:
DSUBU rd, rs, rt
Description:
The contents of general register rt are subtracted from the contents of general register rs to
form a result. The result is placed into general register rd.
The only difference between this instruction and the DSUB instruction is that DSUBU
never traps on overflow. No integer overflow exception occurs under any circumstances.
Operation:
64 T: GPR[rd] ← GPR[rs] - GPR[rt]
Note: The operation is the same in 32-bit kernel mode.
Exceptions:
Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)
ERET Exception Return ERET
Bits    31..26   25   24..6                     5..0
Field   COP0     CO   0                         ERET
Value   010000   1    000 0000 0000 0000 0000   011000
Width   6        1    19                        6
Format:
ERET
Description:
ERET is the TX49 instruction for returning from an interrupt, exception, or error trap.
Unlike a branch or jump instruction, ERET does not execute the next instruction.
ERET must not itself be placed in a branch delay slot.
If the processor is servicing an error trap (SR_{2} = 1), the PC is loaded from the ErrorEPC
register and the ERL bit of the Status register (SR_{2}) is cleared. Otherwise (SR_{2} = 0), the
PC is loaded from the EPC register and the EXL bit of the Status register (SR_{1}) is cleared.
An ERET executed between an LL and an SC also causes the SC to fail.
If this instruction is placed at the boundary of a memory area, the branch delay slot
position that follows it must be kept within the same memory area.
Operation:
32, 64 T: if SR_{2} = 1 then
             PC ← ErrorEPC
             SR ← SR_{31..3} || 0 || SR_{1..0}
          else
             PC ← EPC
             SR ← SR_{31..2} || 0 || SR_{0}
          endif
          LLbit ← 0
Exceptions:
Coprocessor unusable exception
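The ERET operation above can be sketched as a small state update in C. The cpu_t structure, its field names, and the SR_EXL and SR_ERL macros are hypothetical names introduced for this example; only the bit positions (EXL at bit 1, ERL at bit 2) are taken from the description.

```c
#include <stdbool.h>
#include <stdint.h>

#define SR_EXL (UINT32_C(1) << 1)   /* Status bit 1 */
#define SR_ERL (UINT32_C(1) << 2)   /* Status bit 2 */

/* Hypothetical subset of processor state touched by ERET. */
typedef struct {
    uint64_t pc;
    uint64_t epc;
    uint64_t error_epc;
    uint32_t sr;
    bool     llbit;
} cpu_t;

/* Sketch of ERET: return from an error trap if ERL is set,
   otherwise from an ordinary exception; always clear LLbit. */
static void eret(cpu_t *cpu)
{
    if (cpu->sr & SR_ERL) {
        cpu->pc = cpu->error_epc;
        cpu->sr &= ~SR_ERL;
    } else {
        cpu->pc = cpu->epc;
        cpu->sr &= ~SR_EXL;
    }
    cpu->llbit = false;             /* a pending SC will fail */
}
```

When both ERL and EXL are set, a first ERET clears only ERL, and a second ERET is needed to clear EXL.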
J Jump J
Bits    31..26   25..0
Field   J        target
Value   000010
Width   6        26
Format:
J target
Description:
The 26-bit target address is shifted left two bits and combined with the high-order bits of
the address of the delay slot. The program unconditionally jumps to this calculated address
with a delay of one instruction.
Operation:
32 T:   temp ← target
   T+1: PC ← PC_{31..28} || temp || 0^{2}
64 T:   temp ← target
   T+1: PC ← PC_{63..28} || temp || 0^{2}
Exceptions:
None
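The jump address formation for J (and JAL) in 64-bit mode, PC_63..28 || target || 0^2, can be written as a mask-and-merge in C. The function name j_target64 and its parameters are assumptions of this sketch; the first argument stands for the address of the delay slot instruction.

```c
#include <stdint.h>

/* Sketch of the 64-bit J/JAL target: keep bits 63..28 of the delay-slot
   address and replace bits 27..0 with the 26-bit target shifted left by 2. */
static uint64_t j_target64(uint64_t delay_slot_pc, uint32_t target26)
{
    uint64_t upper = delay_slot_pc & ~UINT64_C(0x0FFFFFFF);
    uint64_t lower = (uint64_t)(target26 & UINT32_C(0x03FFFFFF)) << 2;
    return upper | lower;
}
```

One consequence is that J can only reach addresses inside the 256 MB-aligned region that contains the delay slot.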
JAL Jump And Link JAL
Bits    31..26   25..0
Field   JAL      target
Value   000011
Width   6        26
Format:
JAL target
Description:
The 26-bit target address is shifted left two bits and combined with the high-order bits of
the address of the delay slot. The program unconditionally jumps to this calculated address
with a delay of one instruction. The address of the instruction after the delay slot is placed
in the link register, r31.
Operation:
32 T:   temp ← target
        GPR[31] ← PC + 8
   T+1: PC ← PC_{31..28} || temp || 0^{2}
64 T:   temp ← target
        GPR[31] ← PC + 8
   T+1: PC ← PC_{63..28} || temp || 0^{2}
Exceptions:
None
JALR Jump And Link Register JALR
Bits    31..26    25..21  20..16  15..11  10..6   5..0
Field   SPECIAL   rs      0       rd      0       JALR
Value   000000            00000           00000   001001
Width   6         5       5       5       5       6
Format:
JALR rs
JALR rd, rs
Description:
The program unconditionally jumps to the address contained in general register rs, with a
delay of one instruction. The address of the instruction after the delay slot is placed in
general register rd. The default value of rd, if omitted in the assembly language instruction,
is 31.
Register specifiers rs and rd may not be equal, because such an instruction does not have
the same effect when reexecuted. However, an attempt to execute such an instruction is not
trapped, and the result of executing it is undefined.
Since instructions must be word-aligned, a Jump and Link Register instruction must
specify a target register (rs) whose two low-order bits are zero. If these low-order bits are
not zero, an address exception will occur when the jump target instruction is subsequently
fetched.
Operation:
32, 64 T:   temp ← GPR[rs]
            GPR[rd] ← PC + 8
       T+1: PC ← temp
Exceptions:
None
JR Jump Register JR
Bits    31..26    25..21  20..6                5..0
Field   SPECIAL   rs      0                    JR
Value   000000            000 0000 0000 0000   001000
Width   6         5       15                   6
Format:
JR rs
Description:
The program unconditionally jumps to the address contained in general register rs, with a
delay of one instruction.
Since instructions must be word-aligned, a Jump Register instruction must specify a target
register (rs) whose two low-order bits are zero. If these low-order bits are not zero, an
address exception will occur when the jump target instruction is subsequently fetched.
Operation:
32, 64 T:   temp ← GPR[rs]
       T+1: PC ← temp
Exceptions:
None
LB Load Byte LB
Bits    31..26   25..21  20..16  15..0
Field   LB       base    rt      offset
Value   100000
Width   6        5       5       16
Format:
LB rt, offset (base)
Description:
The 16-bit offset is sign-extended and added to the contents of general register base to form
a virtual address. The contents of the byte at the memory location specified by the effective
address are sign-extended and loaded into general register rt.
Operation:
32 T: vAddr ← ((offset_{15})^{16} || offset_{15..0}) + GPR[base]
      (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
      pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor ReverseEndian^{3})
      mem ← LoadMemory (uncached, BYTE, pAddr, vAddr, DATA)
      byte ← vAddr_{2..0} xor BigEndianCPU^{3}
      GPR[rt] ← (mem_{7+8*byte})^{24} || mem_{7+8*byte..8*byte}
64 T: vAddr ← ((offset_{15})^{48} || offset_{15..0}) + GPR[base]
      (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
      pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor ReverseEndian^{3})
      mem ← LoadMemory (uncached, BYTE, pAddr, vAddr, DATA)
      byte ← vAddr_{2..0} xor BigEndianCPU^{3}
      GPR[rt] ← (mem_{7+8*byte})^{56} || mem_{7+8*byte..8*byte}
Exceptions:
TLB refill exception
TLB invalid exception
Bus error exception
Address error exception
LBU Load Byte Unsigned LBU
Bits    31..26   25..21  20..16  15..0
Field   LBU      base    rt      offset
Value   100100
Width   6        5       5       16
Format:
LBU rt, offset (base)
Description:
The 16-bit offset is sign-extended and added to the contents of general register base to form
a virtual address. The contents of the byte at the memory location specified by the effective
address are zero-extended and loaded into general register rt.
Operation:
32 T: vAddr ← ((offset_{15})^{16} || offset_{15..0}) + GPR[base]
      (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
      pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor ReverseEndian^{3})
      mem ← LoadMemory (uncached, BYTE, pAddr, vAddr, DATA)
      byte ← vAddr_{2..0} xor BigEndianCPU^{3}
      GPR[rt] ← 0^{24} || mem_{7+8*byte..8*byte}
64 T: vAddr ← ((offset_{15})^{48} || offset_{15..0}) + GPR[base]
      (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
      pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor ReverseEndian^{3})
      mem ← LoadMemory (uncached, BYTE, pAddr, vAddr, DATA)
      byte ← vAddr_{2..0} xor BigEndianCPU^{3}
      GPR[rt] ← 0^{56} || mem_{7+8*byte..8*byte}
Exceptions:
TLB refill exception
TLB invalid exception
Bus error exception
Address error exception
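The last two steps of the LB and LBU operations (selecting a byte lane from the 64-bit value returned by LoadMemory, then sign- or zero-extending it) can be sketched in C as below. All function names are hypothetical, and mem is assumed to hold the doubleword with bit 0 as the least-significant bit, as in the pseudocode.

```c
#include <stdbool.h>
#include <stdint.h>

/* byte <- vAddr_2..0 xor BigEndianCPU^3, then mem_(7+8*byte)..(8*byte) */
static uint8_t byte_lane(uint64_t mem, uint64_t vaddr, bool big_endian_cpu)
{
    unsigned byte = (unsigned)(vaddr & 7u) ^ (big_endian_cpu ? 7u : 0u);
    return (uint8_t)(mem >> (8u * byte));
}

/* LB in 64-bit mode: replicate bit 7 into bits 63..8. */
static uint64_t lb_extend(uint8_t b)
{
    uint64_t v = b;
    if (b & 0x80u)
        v |= UINT64_C(0xFFFFFFFFFFFFFF00);
    return v;
}

/* LBU in 64-bit mode: bits 63..8 are zero. */
static uint64_t lbu_extend(uint8_t b)
{
    return b;
}
```

The two instructions therefore give different register contents only for bytes with bit 7 set.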
LD Load Doubleword LD
Bits    31..26   25..21  20..16  15..0
Field   LD       base    rt      offset
Value   110111
Width   6        5       5       16
Format:
LD rt, offset (base)
Description:
The 16-bit offset is sign-extended and added to the contents of general register base to form
a virtual address. The contents of the 64-bit doubleword at the memory location specified by
the effective address are loaded into general register rt.
If any of the three least-significant bits of the effective address are non-zero, an address
error exception occurs.
Operation:
64 T: vAddr ← ((offset_{15})^{48} || offset_{15..0}) + GPR[base]
      (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
      mem ← LoadMemory (uncached, DOUBLEWORD, pAddr, vAddr, DATA)
      GPR[rt] ← mem
Note: The operation is the same in 32-bit kernel mode.
Exceptions:
TLB refill exception
TLB invalid exception
Bus error exception
Address error exception
Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)
LDCz Load Doubleword To Coprocessor z LDCz
Bits    31..26    25..21  20..16  15..0
Field   LDCz      base    rt      offset
Value   1101xx*
Width   6         5       5       16
Format:
LDCz rt, offset (base)
Description:
The 16-bit offset is sign-extended and added to the contents of general register base to form
a virtual address. The processor reads a doubleword from the addressed memory location
and makes the data available to coprocessor unit z. The manner in which each coprocessor
uses the data is defined by the individual coprocessor specifications.
If any of the three least-significant bits of the effective address are non-zero, an address
error exception takes place.
This instruction is not valid for use with CP0.
This instruction is undefined when the least-significant bit of the rt field is non-zero.
*See the table “Opcode Bit Encoding” below, or “CPU Instruction Opcode Bit
Encoding” at the end of Appendix A.
Operation:
32 T: vAddr ← ((offset_{15})^{16} || offset_{15..0}) + GPR[base]
      (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
      mem ← LoadMemory (uncached, DOUBLEWORD, pAddr, vAddr, DATA)
      COPzLD (rt, mem)
64 T: vAddr ← ((offset_{15})^{48} || offset_{15..0}) + GPR[base]
      (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
      mem ← LoadMemory (uncached, DOUBLEWORD, pAddr, vAddr, DATA)
      COPzLD (rt, mem)
Exceptions:
TLB refill exception
TLB invalid exception
Bus error exception
Address error exception
Coprocessor unusable exception
Opcode Bit Encoding:
LDCz    Opcode (bits 31..26)    Bits 25..0
LDC1    1 1 0 1 0 1             base, rt, offset
LDC2    1 1 0 1 1 0             base, rt, offset
The two low-order opcode bits (27..26) are the coprocessor unit number.
LDL Load Doubleword Left LDL
Bits    31..26   25..21  20..16  15..0
Field   LDL      base    rt      offset
Value   011010
Width   6        5       5       16
Format:
LDL rt, offset (base)
Description:
This instruction can be used in combination with the LDR instruction to load a register
with eight consecutive bytes from memory, when the bytes cross a boundary between two
doublewords. LDL loads the left portion of the register from the appropriate part of the
high-order doubleword; LDR loads the right portion of the register from the appropriate part of
the low-order doubleword.
The LDL instruction adds its sign-extended 16-bit offset to the contents of general register
base to form a virtual address which can specify an arbitrary byte. It reads bytes only from
the doubleword in memory which contains the specified starting byte. From one to eight
bytes will be loaded, depending on the starting byte specified.
Conceptually, it starts at the specified byte in memory and loads that byte into the high-
order (left-most) byte of the register; then it proceeds toward the low-order byte of the
doubleword in memory and the low-order byte of the register, loading bytes from memory
into the register until it reaches the low-order byte of the doubleword in memory. The least-
significant (right-most) byte(s) of the register will not be changed.
LDL $24,3($0)

memory (big-endian)
address 0:     0   1   2   3   4   5   6   7
address 8:     8   9  10  11  12  13  14  15

register $24
before:        A   B   C   D   E   F   G   H
after:         3   4   5   6   7   F   G   H
The contents of general register rt are internally bypassed within the processor so that no
NOP is needed between an immediately preceding load instruction which specifies register rt
and a following LDL (or LDR) instruction which also specifies register rt.
No address exceptions due to alignment are possible.
Operation:
64 T: vAddr ← ((offset_{15})^{48} || offset_{15..0}) + GPR[base]
      (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
      pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor ReverseEndian^{3})
      if BigEndianMem = 0 then
         pAddr ← pAddr_{PSIZE-1..3} || 0^{3}
      endif
      byte ← vAddr_{2..0} xor BigEndianCPU^{3}
      mem ← LoadMemory (uncached, byte, pAddr, vAddr, DATA)
      GPR[rt] ← mem_{7+8*byte..0} || GPR[rt]_{55-8*byte..0}
Note: The operation is the same in 32-bit kernel mode.
Given a doubleword in a register and a doubleword in memory, the operation of LDL is as
follows:

LDL
Register:   A B C D E F G H
Memory:     I J K L M N O P

               BigEndianCPU = 0                   BigEndianCPU = 1
                                  Offset                             Offset
vAddr_{2..0}   Destination  Type  LEM  BEM       Destination  Type  LEM  BEM
0              PBCDEFGH     0     0    7         IJKLMNOP     7     0    0
1              OPCDEFGH     1     0    6         JKLMNOPH     6     0    1
2              NOPDEFGH     2     0    5         KLMNOPGH     5     0    2
3              MNOPEFGH     3     0    4         LMNOPFGH     4     0    3
4              LMNOPFGH     4     0    3         MNOPEFGH     3     0    4
5              KLMNOPGH     5     0    2         NOPDEFGH     2     0    5
6              JKLMNOPH     6     0    1         OPCDEFGH     1     0    6
7              IJKLMNOP     7     0    0         PBCDEFGH     0     0    7

LEM     BigEndianMem = 0
BEM     BigEndianMem = 1
Type    AccessType sent to memory
Offset  Addr_{2..0} sent to memory
Exceptions:
TLB refill exception
TLB invalid exception
Bus error exception
Address error exception
Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)
LDR Load Doubleword Right LDR
Bits    31..26   25..21  20..16  15..0
Field   LDR      base    rt      offset
Value   011011
Width   6        5       5       16
Format:
LDR rt, offset (base)
Description:
This instruction can be used in combination with the LDL instruction to load a register
with eight consecutive bytes from memory, when the bytes cross a boundary between two
doublewords. LDR loads the right portion of the register from the appropriate part of the
low-order doubleword; LDL loads the left portion of the register from the appropriate part of
the high-order doubleword.
The LDR instruction adds its sign-extended 16-bit offset to the contents of general register
base to form a virtual address which can specify an arbitrary byte. It reads bytes only from
the doubleword in memory which contains the specified starting byte. From one to eight
bytes will be loaded, depending on the starting byte specified.
Conceptually, it starts at the specified byte in memory and loads that byte into the low-
order (right-most) byte of the register; then it proceeds toward the high-order byte of the
doubleword in memory and the high-order byte of the register, loading bytes from memory
into the register until it reaches the high-order byte of the doubleword in memory. The most-
significant (left-most) byte(s) of the register will not be changed.
LDR $24,4($0)

memory (big-endian)
address 0:     0   1   2   3   4   5   6   7
address 8:     8   9  10  11  12  13  14  15

register $24
before:        A   B   C   D   E   F   G   H
after:         A   B   C   0   1   2   3   4
The contents of general register rt are internally bypassed within the processor so that no
NOP is needed between an immediately preceding load instruction which specifies register rt
and a following LDR (or LDL) instruction which also specifies register rt.
No address exceptions due to alignment are possible.
Operation:
64 T: vAddr ← ((offset_{15})^{48} || offset_{15..0}) + GPR[base]
      (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
      pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor ReverseEndian^{3})
      if BigEndianMem = 1 then
         pAddr ← pAddr_{31..3} || 0^{3}
      endif
      byte ← vAddr_{2..0} xor BigEndianCPU^{3}
      mem ← LoadMemory (uncached, byte, pAddr, vAddr, DATA)
      GPR[rt] ← GPR[rt]_{63..64-8*byte} || mem_{63..8*byte}
Note: The operation is the same in 32-bit kernel mode.
Given a doubleword in a register and a doubleword in memory, the operation of LDR is as
follows:

LDR
Register:   A B C D E F G H
Memory:     I J K L M N O P

               BigEndianCPU = 0                   BigEndianCPU = 1
                                  Offset                             Offset
vAddr_{2..0}   Destination  Type  LEM  BEM       Destination  Type  LEM  BEM
0              IJKLMNOP     7     0    0         ABCDEFGI     0     7    0
1              AIJKLMNO     6     1    0         ABCDEFIJ     1     6    0
2              ABIJKLMN     5     2    0         ABCDEIJK     2     5    0
3              ABCIJKLM     4     3    0         ABCDIJKL     3     4    0
4              ABCDIJKL     3     4    0         ABCIJKLM     4     3    0
5              ABCDEIJK     2     5    0         ABIJKLMN     5     2    0
6              ABCDEFIJ     1     6    0         AIJKLMNO     6     1    0
7              ABCDEFGI     0     7    0         IJKLMNOP     7     0    0

LEM     BigEndianMem = 0
BEM     BigEndianMem = 1
Type    AccessType sent to memory
Offset  Addr_{2..0} sent to memory
Exceptions:
TLB refill exception
TLB invalid exception
Bus error exception
Address error exception
Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)
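The way LDL and LDR merge memory bytes into a register can be modelled in C for the big-endian case (the BigEndianCPU = 1 columns of the tables). This is only a sketch under stated assumptions: memory is a plain byte array, both CPU and memory are big-endian, and the names load_dw_be, ldl_be, and ldr_be are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Read the aligned doubleword at p, byte 0 being most significant. */
static uint64_t load_dw_be(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    return v;
}

/* LDL: bytes k..7 of the containing doubleword go to the left of the
   register; the k right-most register bytes are kept. */
static uint64_t ldl_be(uint64_t reg, const uint8_t *mem, size_t addr)
{
    unsigned k = (unsigned)(addr & 7u);
    uint64_t dw = load_dw_be(mem + (addr - k));
    if (k == 0)
        return dw;
    uint64_t keep = ~UINT64_C(0) >> (8u * (8u - k));
    return (dw << (8u * k)) | (reg & keep);
}

/* LDR: bytes 0..k of the containing doubleword go to the right of the
   register; the 7-k left-most register bytes are kept. */
static uint64_t ldr_be(uint64_t reg, const uint8_t *mem, size_t addr)
{
    unsigned k = (unsigned)(addr & 7u);
    uint64_t dw = load_dw_be(mem + (addr - k));
    if (k == 7)
        return dw;
    uint64_t keep = ~UINT64_C(0) << (8u * (k + 1u));
    return (reg & keep) | (dw >> (8u * (7u - k)));
}
```

With these helpers, an unaligned big-endian doubleword at address a would be assembled as ldr_be(ldl_be(r, mem, a), mem, a + 7), mirroring an LDL at offset a followed by an LDR at offset a + 7.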
LH Load Halfword LH
Bits    31..26   25..21  20..16  15..0
Field   LH       base    rt      offset
Value   100001
Width   6        5       5       16
Format:
LH rt, offset (base)
Description:
The 16-bit offset is sign-extended and added to the contents of general register base to form
a virtual address. The contents of the halfword at the memory location specified by the
effective address are sign-extended and loaded into general register rt.
If the least-significant bit of the effective address is non-zero, an address error exception
occurs.
Operation:
32 T: vAddr ← ((offset_{15})^{16} || offset_{15..0}) + GPR[base]
      (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
      pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor (ReverseEndian^{2} || 0))
      mem ← LoadMemory (uncached, HALFWORD, pAddr, vAddr, DATA)
      byte ← vAddr_{2..0} xor (BigEndianCPU^{2} || 0)
      GPR[rt] ← (mem_{15+8*byte})^{16} || mem_{15+8*byte..8*byte}
64 T: vAddr ← ((offset_{15})^{48} || offset_{15..0}) + GPR[base]
      (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
      pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor (ReverseEndian^{2} || 0))
      mem ← LoadMemory (uncached, HALFWORD, pAddr, vAddr, DATA)
      byte ← vAddr_{2..0} xor (BigEndianCPU^{2} || 0)
      GPR[rt] ← (mem_{15+8*byte})^{48} || mem_{15+8*byte..8*byte}
Exceptions:
TLB refill exception
TLB invalid exception
Bus error exception
Address error exception
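The address formation and sign-extension steps of LH can be sketched in C. This is only an illustrative model, not the processor's implementation: the name lh_model, the flat big-endian byte array standing in for memory, and the status return value are assumptions, and address translation is left out entirely.

```c
#include <stdint.h>

/* Hypothetical model of LH in 64-bit mode over a flat big-endian byte array.
 * Returns 0 on success, or -1 where the processor would raise an address
 * error exception (least-significant address bit non-zero). */
static int lh_model(const uint8_t *mem, uint64_t base, int16_t offset,
                    int64_t *rt)
{
    /* vAddr <- sign-extended 16-bit offset + GPR[base], modulo 2^64 */
    uint64_t vaddr = base + (uint64_t)(int64_t)offset;

    if (vaddr & 1u)
        return -1;

    /* Assemble the halfword, most-significant byte first. */
    uint16_t half = (uint16_t)(((unsigned)mem[vaddr] << 8) | mem[vaddr + 1]);

    /* Replicate bit 15 of the halfword into the upper 48 bits. */
    *rt = (int64_t)(int16_t)half;
    return 0;
}
```

LHU differs only in the last step, where the upper bits are filled with zeros instead of copies of bit 15.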
LHU Load Halfword Unsigned LHU
Bits    31..26       25..21   20..16   15..0
Field   LHU 100101   base     rt       offset
Width   6            5        5        16
Format:
LHU rt, offset (base)
Description:
The 16-bit offset is sign-extended and added to the contents of general register base to form
a virtual address. The contents of the halfword at the memory location specified by the
effective address are zero-extended and loaded into general register rt.
If the least-significant bit of the effective address is non-zero, an address error exception
occurs.
Operation:
32  T: vAddr ← ((offset_15)^16 || offset_15..0) + GPR[base]
       (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
       pAddr ← pAddr_(PSIZE-1)..3 || (pAddr_2..0 xor (ReverseEndian^2 || 0))
       mem ← LoadMemory (uncached, HALFWORD, pAddr, vAddr, DATA)
       byte ← vAddr_2..0 xor (BigEndianCPU^2 || 0)
       GPR[rt] ← 0^16 || mem_(15+8*byte)..(8*byte)
64  T: vAddr ← ((offset_15)^48 || offset_15..0) + GPR[base]
       (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
       pAddr ← pAddr_(PSIZE-1)..3 || (pAddr_2..0 xor (ReverseEndian^2 || 0))
       mem ← LoadMemory (uncached, HALFWORD, pAddr, vAddr, DATA)
       byte ← vAddr_2..0 xor (BigEndianCPU^2 || 0)
       GPR[rt] ← 0^48 || mem_(15+8*byte)..(8*byte)
Exceptions:
TLB refill exception
TLB invalid exception
Bus error exception
Address error exception
LL Load Linked LL
Bits    31..26       25..21   20..16   15..0
Field   LL 110000    base     rt       offset
Width   6            5        5        16
Format:
LL rt, offset (base)
Description:
The 16-bit offset is sign-extended and added to the contents of general register base to form
a virtual address. The contents of the word at the memory location specified by the effective
address are loaded into general register rt. In 64-bit mode, the loaded word is sign-extended.
Exceptions:
TLB refill exception
TLB invalid exception
Bus error exception
Address error exception
LLD Load Linked Doubleword LLD
Bits    31..26       25..21   20..16   15..0
Field   LLD 110100   base     rt       offset
Width   6            5        5        16
Format:
LLD rt, offset (base)
Description:
The 16-bit offset is sign-extended and added to the contents of general register base to form
a virtual address. The contents of the doubleword at the memory location specified by the
effective address are loaded into general register rt.
The processor begins checking the accessed doubleword for modification by other
processors and devices.
Load Linked Doubleword and Store Conditional Doubleword can be used to atomically
update memory locations:
L1:  LLD  T1, (T0)
     ADD  T2, T1, 1
     SCD  T2, (T0)
     BEQ  T2, 0, L1
     NOP
This atomically increments the doubleword addressed by T0. Changing the ADD to an OR
changes this to an atomic bit set.
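For readers working in C, the same retry pattern can be expressed with C11 atomics. The sketch below is a portable analogue rather than a translation of the sequence above: compare-and-exchange is a different primitive from LLD/SCD, the function names are hypothetical, and how a compiler lowers these calls on a given target is outside the scope of this manual.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Portable analogue of the LLD / ADD / SCD / BEQ retry loop. */
static uint64_t atomic_increment_u64(_Atomic uint64_t *p)
{
    uint64_t old = atomic_load(p);          /* plays the role of LLD */

    /* On failure 'old' is refreshed with the current value and the update
     * is retried, much as BEQ branches back to L1 when SCD writes 0. */
    while (!atomic_compare_exchange_weak(p, &old, old + 1)) {
    }
    return old + 1;
}

/* Replacing the add with an OR gives the atomic bit set. */
static uint64_t atomic_bit_set_u64(_Atomic uint64_t *p, uint64_t mask)
{
    uint64_t old = atomic_load(p);

    while (!atomic_compare_exchange_weak(p, &old, old | mask)) {
    }
    return old | mask;
}
```

Both functions return the value that was stored, which keeps the loop body free of any other memory access between the load and the conditional update.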
The operation of LLD is undefined if the addressed location is uncached and, for
synchronization between multiple processors, the operation of LLD is undefined if the
addressed location is noncoherent.
A cache miss that occurs between LLD and SCD may cause SCD to fail, so no load or store
instruction should occur between LLD and SCD. Exceptions also cause SCD to fail, so
persistent exceptions must be avoided.
This instruction is available in User mode, and it is not necessary for CP0 to be enabled.
If any of the three least-significant bits of the effective address are non-zero, an address
error exception takes place.
Operation:
64  T: vAddr ← ((offset_15)^48 || offset_15..0) + GPR[base]
       (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
       mem ← LoadMemory (uncached, DOUBLEWORD, pAddr, vAddr, DATA)
       GPR[rt] ← mem
       LLbit ← 1
Note: The operation is the same in 32-bit kernel mode.
Exceptions:
TLB refill exception
TLB invalid exception
Bus error exception
Address error exception
Reserved Instruction exception (in the 32-bit user or 32-bit supervisor mode)
LUI Load Upper Immediate LUI
Bits    31..26       25..21     20..16   15..0
Field   LUI 001111   0 00000    rt       immediate
Width   6            5          5        16
Format:
LUI rt, immediate
Description:
The 16-bit immediate is shifted left 16 bits and concatenated to 16 bits of zeros. The result
is placed into general register rt. In 64-bit mode, the loaded word is sign-extended.
Operation:
32  T: GPR[rt] ← immediate || 0^16
64  T: GPR[rt] ← (immediate_15)^32 || immediate || 0^16
Exceptions:
None
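The 64-bit result of LUI can be checked with a few lines of C. This is a sketch of the arithmetic only; the function name lui_model is an assumption made for illustration.

```c
#include <stdint.h>

/* Hypothetical model of LUI in 64-bit mode: the immediate lands in
 * bits 31..16, bits 15..0 are zero, and bit 31 is replicated upward. */
static int64_t lui_model(uint16_t immediate)
{
    uint32_t word = (uint32_t)immediate << 16;
    return (int64_t)(int32_t)word;
}
```

An immediate with bit 15 set therefore produces a negative 64-bit value, which matters when LUI is used to build addresses in 64-bit mode.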
LW Load Word LW
Bits    31..26       25..21   20..16   15..0
Field   LW 100011    base     rt       offset
Width   6            5        5        16
Format:
LW rt, offset (base)
Description:
The 16-bit offset is sign-extended and added to the contents of general register base to form
a virtual address. The contents of the word at the memory location specified by the effective
address are loaded into general register rt. In 64-bit mode, the loaded word is sign-extended.
If either of the two least-significant bits of the effective address is non-zero, an address
error exception occurs.
Operation:
32  T: vAddr ← ((offset_15)^16 || offset_15..0) + GPR[base]
       (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
       pAddr ← pAddr_(PSIZE-1)..3 || (pAddr_2..0 xor (ReverseEndian || 0^2))
       mem ← LoadMemory (uncached, WORD, pAddr, vAddr, DATA)
       byte ← vAddr_2..0 xor (BigEndianCPU || 0^2)
       GPR[rt] ← mem_(31+8*byte)..(8*byte)
64  T: vAddr ← ((offset_15)^48 || offset_15..0) + GPR[base]
       (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
       pAddr ← pAddr_(PSIZE-1)..3 || (pAddr_2..0 xor (ReverseEndian || 0^2))
       mem ← LoadMemory (uncached, WORD, pAddr, vAddr, DATA)
       byte ← vAddr_2..0 xor (BigEndianCPU || 0^2)
       GPR[rt] ← (mem_(31+8*byte))^32 || mem_(31+8*byte)..(8*byte)
Exceptions:
TLB refill exception
TLB invalid exception
Bus error exception
Address error exception
LWCz Load Word To Coprocessor z LWCz
Bits    31..26         25..21   20..16   15..0
Field   LWCz 1100xx*   base     rt       offset
Width   6              5        5        16
Format:
LWCz rt, offset (base)
Description:
The 16-bit offset is sign-extended and added to the contents of general register base to form
a virtual address. The processor reads a word from the addressed memory location, and
makes the data available to coprocessor unit z. The manner in which each coprocessor uses
the data is defined by the individual coprocessor specifications.
If either of the two least-significant bits of the effective address is non-zero, an address
error exception occurs.
This instruction is not valid for use with CP0.
*See the table “Opcode Bit Encoding” on next page, or “CPU Instruction Opcode Bit
Encoding” at the end of Appendix A.
Operation:
32  T: vAddr ← ((offset_15)^16 || offset_15..0) + GPR[base]
       (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
       pAddr ← pAddr_(PSIZE-1)..3 || (pAddr_2..0 xor (ReverseEndian || 0^2))
       mem ← LoadMemory (uncached, WORD, pAddr, vAddr, DATA)
       byte ← vAddr_2..0 xor (BigEndianCPU || 0^2)
       COPzLW (byte, rt, mem)
64  T: vAddr ← ((offset_15)^48 || offset_15..0) + GPR[base]
       (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
       pAddr ← pAddr_(PSIZE-1)..3 || (pAddr_2..0 xor (ReverseEndian || 0^2))
       mem ← LoadMemory (uncached, WORD, pAddr, vAddr, DATA)
       byte ← vAddr_2..0 xor (BigEndianCPU || 0^2)
       COPzLW (byte, rt, mem)
Exceptions:
TLB refill exception
TLB invalid exception
Bus error exception
Address error exception
Coprocessor unusable exception
Opcode Bit Encoding:

LWCz    Bit #   31 30 29 28   27 26   25 ... 0
LWC1             1  1  0  0    0  1
LWC2             1  1  0  0    1  0
                 Opcode        Coprocessor Unit Number
LWL Load Word Left LWL
Bits    31..26       25..21   20..16   15..0
Field   LWL 100010   base     rt       offset
Width   6            5        5        16
Format:
LWL rt, offset (base)
Description:
This instruction can be used in combination with the LWR instruction to load a register
with four consecutive bytes from memory, when the bytes cross a boundary between two
words. LWL loads the left portion of the register from the appropriate part of the high-order
word; LWR loads the right portion of the register from the appropriate part of the low-order
word.
The LWL instruction adds its sign-extended 16-bit offset to the contents of general register
base to form a virtual address which can specify an arbitrary byte. It reads bytes only from
the word in memory which contains the specified starting byte. From one to four bytes will
be loaded, depending on the starting byte specified. In 64-bit mode, the loaded word is
sign-extended.
Conceptually, it starts at the specified byte in memory and loads that byte into the high-
order (left-most) byte of the register; then it proceeds toward the low-order byte of the word
in memory and the low-order byte of the register, loading bytes from memory into the
register until it reaches the low-order byte of the word in memory. The least-significant
(right-most) byte(s) of the register will not be changed.
LWL $24,1($0)

                 memory (big-endian)
    address 0    0 1 2 3
    address 4    4 5 6 7

                 register
    before       A B C D    $24
    after        1 2 3 D    $24
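The byte merging in the example above can be reproduced with a small C model. It is a sketch under stated assumptions: big-endian memory and CPU, a 32-bit register (so the 64-bit sign extension is not shown), and a hypothetical function name lwl_be.

```c
#include <stdint.h>

/* Hypothetical big-endian, 32-bit model of LWL.
 * mem is a byte array; reg is the previous contents of rt. */
static uint32_t lwl_be(const uint8_t *mem, uint32_t vaddr, uint32_t reg)
{
    uint32_t byte = vaddr & 3u;                 /* starting byte in the word */
    const uint8_t *w = mem + (vaddr & ~3u);     /* word containing that byte */
    uint32_t word = ((uint32_t)w[0] << 24) | ((uint32_t)w[1] << 16) |
                    ((uint32_t)w[2] << 8)  |  (uint32_t)w[3];

    /* The low-order 'byte' bytes of the register are left unchanged. */
    uint32_t keep = (1u << (8 * byte)) - 1u;

    return (word << (8 * byte)) | (reg & keep);
}
```

With memory bytes 0 to 7 holding the values 0 to 7 and the register holding bytes A, B, C, D, a call with address 1 yields the bytes 1, 2, 3, D, as in the figure.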
The contents of general register rt are internally bypassed within the processor so that no
NOP is needed between an immediately preceding load instruction which specifies register rt
and a following LWL (or LWR) instruction which also specifies register rt.
No address exceptions due to alignment are possible.
Operation:
32  T: vAddr ← ((offset_15)^16 || offset_15..0) + GPR[base]
       (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
       pAddr ← pAddr_(PSIZE-1)..3 || (pAddr_2..0 xor ReverseEndian^3)
       if BigEndianMem = 0 then
           pAddr ← pAddr_(PSIZE-1)..2 || 0^2
       endif
       byte ← vAddr_1..0 xor BigEndianCPU^2
       word ← vAddr_2 xor BigEndianCPU
       mem ← LoadMemory (uncached, 0 || byte, pAddr, vAddr, DATA)
       temp ← mem_(32*word+8*byte+7)..(32*word) || GPR[rt]_(23-8*byte)..0
       GPR[rt] ← temp
64  T: vAddr ← ((offset_15)^48 || offset_15..0) + GPR[base]
       (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
       pAddr ← pAddr_(PSIZE-1)..3 || (pAddr_2..0 xor ReverseEndian^3)
       if BigEndianMem = 0 then
           pAddr ← pAddr_(PSIZE-1)..2 || 0^2
       endif
       byte ← vAddr_1..0 xor BigEndianCPU^2
       word ← vAddr_2 xor BigEndianCPU
       mem ← LoadMemory (uncached, 0 || byte, pAddr, vAddr, DATA)
       temp ← mem_(32*word+8*byte+7)..(32*word) || GPR[rt]_(23-8*byte)..0
       GPR[rt] ← (temp_31)^32 || temp
Given a doubleword in a register and a doubleword in memory, the operation of LWL is as
follows:

LWL
Register   A B C D E F G H
Memory     I J K L M N O P

               BigEndianCPU = 0                    BigEndianCPU = 1
                                 Offset                              Offset
vAddr_2..0  Destination   Type   LEM  BEM       Destination   Type   LEM  BEM
    0       SSSSPFGH       0      0    7        SSSSIJKL       3      4    0
    1       SSSSOPGH       1      0    6        SSSSJKLH       2      4    1
    2       SSSSNOPH       2      0    5        SSSSKLGH       1      4    2
    3       SSSSMNOP       3      0    4        SSSSLFGH       0      4    3
    4       SSSSLFGH       0      4    3        SSSSMNOP       3      0    4
    5       SSSSKLGH       1      4    2        SSSSNOPH       2      0    5
    6       SSSSJKLH       2      4    1        SSSSOPGH       1      0    6
    7       SSSSIJKL       3      4    0        SSSSPFGH       0      0    7

LEM      BigEndianMem = 0
BEM      BigEndianMem = 1
Type     AccessType (see Figure 2-2) sent to memory
Offset   pAddr_2..0 sent to memory
S        sign-extend of destination_31
Exception:
TLB refill exception
TLB invalid exception
Bus error exception
Address error exception
LWR Load Word Right LWR
Bits    31..26       25..21   20..16   15..0
Field   LWR 100110   base     rt       offset
Width   6            5        5        16
Format:
LWR rt, offset (base)
Description:
This instruction can be used in combination with the LWL instruction to load a register
with four consecutive bytes from memory, when the bytes cross a boundary between two
words. LWR loads the right portion of the register from the appropriate part of the low-order
word; LWL loads the left portion of the register from the appropriate part of the high-order
word.
The LWR instruction adds its sign-extended 16-bit offset to the contents of general register
base to form a virtual address which can specify an arbitrary byte. It reads bytes only from
the word in memory which contains the specified starting byte. From one to four bytes will
be loaded, depending on the starting byte specified. In 64-bit mode, if bit 31 of the
destination register is loaded, then the loaded word is sign-extended.
Conceptually, it starts at the specified byte in memory and loads that byte into the low-
order (right-most) byte of the register; then it proceeds toward the high-order byte of the
word in memory and the high-order byte of the register, loading bytes from memory into the
register until it reaches the high-order byte of the word in memory.
The most significant (left-most) byte(s) of the register will not be changed.
LWR $24,4($0)

                 memory (big-endian)
    address 0    0 1 2 3
    address 4    4 5 6 7

                 register
    before       A B C D    $24
    after        A B C 4    $24
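The example above, and the way LWR pairs with LWL to fetch a word from an unaligned address, can be modelled in C. The sketch assumes big-endian memory and CPU and a 32-bit register; all function names are hypothetical, and the LWL counterpart is included only so that the pair can be shown together.

```c
#include <stdint.h>

/* Read the aligned big-endian word that contains vaddr. */
static uint32_t word_at_be(const uint8_t *mem, uint32_t vaddr)
{
    const uint8_t *w = mem + (vaddr & ~3u);
    return ((uint32_t)w[0] << 24) | ((uint32_t)w[1] << 16) |
           ((uint32_t)w[2] << 8)  |  (uint32_t)w[3];
}

/* Hypothetical big-endian, 32-bit model of LWR. */
static uint32_t lwr_be(const uint8_t *mem, uint32_t vaddr, uint32_t reg)
{
    uint32_t shift = 8 * (3u - (vaddr & 3u));
    uint32_t loaded = 0xFFFFFFFFu >> shift;   /* low-order bytes replaced */
    return (reg & ~loaded) | (word_at_be(mem, vaddr) >> shift);
}

/* LWL counterpart: fills the high-order bytes, keeps the low-order ones. */
static uint32_t lwl_be(const uint8_t *mem, uint32_t vaddr, uint32_t reg)
{
    uint32_t byte = vaddr & 3u;
    uint32_t keep = (1u << (8 * byte)) - 1u;
    return (word_at_be(mem, vaddr) << (8 * byte)) | (reg & keep);
}

/* Unaligned word load: LWL at the first byte, LWR at the last byte. */
static uint32_t load_unaligned_be(const uint8_t *mem, uint32_t vaddr)
{
    uint32_t r = lwl_be(mem, vaddr, 0);
    return lwr_be(mem, vaddr + 3, r);
}
```

Loading from address 1 in the memory of the figure combines bytes 1, 2, 3 from the first word with byte 4 from the second.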
The contents of general register rt are internally bypassed within the processor so that no
NOP is needed between an immediately preceding load instruction which specifies register rt
and a following LWR (or LWL) instruction which also specifies register rt.
No address exceptions due to alignment are possible.
Operation:
32  T: vAddr ← ((offset_15)^16 || offset_15..0) + GPR[base]
       (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
       pAddr ← pAddr_(PSIZE-1)..3 || (pAddr_2..0 xor ReverseEndian^3)
       if BigEndianMem = 1 then
           pAddr ← pAddr_(PSIZE-1)..3 || 0^3
       endif
       byte ← vAddr_1..0 xor BigEndianCPU^2
       word ← vAddr_2 xor BigEndianCPU
       mem ← LoadMemory (uncached, 0 || byte, pAddr, vAddr, DATA)
       temp ← GPR[rt]_31..(32-8*byte) || mem_(31+32*word)..(32*word+8*byte)
       GPR[rt] ← temp
64  T: vAddr ← ((offset_15)^48 || offset_15..0) + GPR[base]
       (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
       pAddr ← pAddr_(PSIZE-1)..3 || (pAddr_2..0 xor ReverseEndian^3)
       if BigEndianMem = 1 then
           pAddr ← pAddr_(PSIZE-1)..3 || 0^3
       endif
       byte ← vAddr_1..0 xor BigEndianCPU^2
       word ← vAddr_2 xor BigEndianCPU
       mem ← LoadMemory (uncached, 0 || byte, pAddr, vAddr, DATA)
       temp ← GPR[rt]_31..(32-8*byte) || mem_(31+32*word)..(32*word+8*byte)
       GPR[rt] ← (temp_31)^32 || temp
Given a word in a register and a word in memory, the operation of LWR is as follows:

LWR
Register   A B C D E F G H
Memory     I J K L M N O P

               BigEndianCPU = 0                    BigEndianCPU = 1
                                 Offset                              Offset
vAddr_2..0  Destination   Type   LEM  BEM       Destination   Type   LEM  BEM
    0       SSSSMNOP       0      0    4        XXXXEFGI       0      7    0
    1       XXXXEMNO       1      1    4        XXXXEFIJ       1      6    0
    2       XXXXEFMN       2      2    4        XXXXEIJK       2      5    0
    3       XXXXEFGM       3      3    4        SSSSIJKL       3      4    0
    4       SSSSIJKL       0      4    0        XXXXEFGM       4      3    4
    5       XXXXEIJK       1      5    0        XXXXEFMN       5      2    4
    6       XXXXEFIJ       2      6    0        XXXXEMNO       6      1    4
    7       XXXXEFGI       3      7    0        SSSSMNOP       7      0    4

LEM      BigEndianMem = 0
BEM      BigEndianMem = 1
Type     AccessType (see Figure 2-2) sent to memory
Offset   pAddr_2..0 sent to memory
S        sign-extend of destination_31
Exceptions:
TLB refill exception
TLB invalid exception
Bus error exception
Address error exception
LWU Load Word Unsigned LWU
Bits    31..26       25..21   20..16   15..0
Field   LWU 100111   base     rt       offset
Width   6            5        5        16
Format:
LWU rt, offset (base)
Description:
The 16-bit offset is sign-extended and added to the contents of general register base to form
a virtual address. The contents of the word at the memory location specified by the effective
address are loaded into general register rt. The loaded word is zero-extended.
If either of the two least-significant bits of the effective address is non-zero, an address
error exception occurs.
Operation:
64  T: vAddr ← ((offset_15)^48 || offset_15..0) + GPR[base]
       (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
       pAddr ← pAddr_(PSIZE-1)..3 || (pAddr_2..0 xor (ReverseEndian || 0^2))
       mem ← LoadMemory (uncached, WORD, pAddr, vAddr, DATA)
       byte ← vAddr_2..0 xor (BigEndianCPU || 0^2)
       GPR[rt] ← 0^32 || mem_(31+8*byte)..(8*byte)
Note: The operation is the same in 32-bit kernel mode.
Exceptions:
TLB refill exception
TLB invalid exception
Bus error exception
Address error exception
Reserved Instruction exception (in the 32-bit user or 32-bit supervisor mode)
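The difference between LWU and LW in 64-bit mode is confined to how the upper 32 bits of the destination are filled. The following C sketch shows only that final step; the function names are assumptions, and the word is taken as already read from memory.

```c
#include <stdint.h>

/* Hypothetical model of the final register write of LWU: upper 32 bits zero. */
static uint64_t lwu_extend(uint32_t word)
{
    return (uint64_t)word;
}

/* For comparison, LW in 64-bit mode replicates bit 31 into the upper half. */
static uint64_t lw_extend(uint32_t word)
{
    return (uint64_t)(int64_t)(int32_t)word;
}
```

The two results differ only when bit 31 of the loaded word is set.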
MADD Multiply/Add MADD
MADD rs, rt:
Bits    31..26       25..21   20..16   15..6            5..0
Field   MAC 011100   rs       rt       0 00 0000 0000   MADD 000000
Width   6            5        5        10               6

MADD rd, rs, rt:
Bits    31..26       25..21   20..16   15..11   10..6     5..0
Field   MAC 011100   rs       rt       rd       0 00000   MADD 000000
Width   6            5        5        5        5         6
Format:
MADD rs, rt
MADD rd, rs, rt
Description:
Multiplies the contents of general registers rs and rt, treating both values as two’s
complement, adds the product to the double-word value held in special registers HI and LO,
and puts the double-word result in HI and LO. An overflow exception is never raised. The
low-order word of the result is put in general register rd and in special register LO,
whereas the high-order word of the result is put in special register HI.
If rd is omitted in assembly language, 0 is used as the default value. To guarantee correct
operation even if an interrupt occurs, neither of the two instructions following MADD should
be DIV or DIVU instructions which modify the HI and LO register contents.
Operation:
32, 64  T: t ← (HI || LO) + GPR[rs] * GPR[rt]
           LO ← t_31..0
           HI ← t_63..32
           GPR[rd] ← t_31..0
Exception:
None
MADDU Multiply/Add Unsigned MADDU
MADDU rs, rt:
Bits    31..26       25..21   20..16   15..6            5..0
Field   MAC 011100   rs       rt       0 00 0000 0000   MADDU 000001
Width   6            5        5        10               6

MADDU rd, rs, rt:
Bits    31..26       25..21   20..16   15..11   10..6     5..0
Field   MAC 011100   rs       rt       rd       0 00000   MADDU 000001
Width   6            5        5        5        5         6
Format:
MADDU rs, rt
MADDU rd, rs, rt
Description:
Multiplies the contents of general registers rs and rt, treating both values as unsigned, adds
the product to the double-word value held in special registers HI and LO, and puts the
double-word result in HI and LO. An overflow exception is never raised. The low-order word
of the result is put in general register rd and in special register LO, whereas the
high-order word of the result is put in special register HI.
If rd is omitted in assembly language, 0 is used as the default value. To guarantee correct
operation even if an interrupt occurs, neither of the two instructions following MADDU should
be DIV or DIVU instructions which modify the HI and LO register contents.
Operation:
32, 64  T: t ← (HI || LO) + (0 || GPR[rs]) * (0 || GPR[rt])
           LO ← t_31..0
           HI ← t_63..32
           GPR[rd] ← t_31..0
Exception:
None
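The multiply-accumulate behaviour of MADD and MADDU can be sketched in C as follows. The sketch models HI and LO as 32-bit halves of a 64-bit accumulator and takes 32-bit operands; the type and function names are assumptions, and any 64-bit-mode sign extension of the results is not modelled.

```c
#include <stdint.h>

/* Hypothetical model of the HI/LO register pair. */
typedef struct {
    uint32_t hi;
    uint32_t lo;
} hilo_t;

/* Signed multiply-accumulate; returns the value written to rd. */
static uint32_t madd_model(hilo_t *acc, int32_t rs, int32_t rt)
{
    uint64_t t = ((uint64_t)acc->hi << 32) | acc->lo;

    /* The sum wraps modulo 2^64; no overflow exception is modelled. */
    t += (uint64_t)((int64_t)rs * (int64_t)rt);

    acc->lo = (uint32_t)t;
    acc->hi = (uint32_t)(t >> 32);
    return acc->lo;
}

/* Unsigned variant: operands are zero-extended before multiplying. */
static uint32_t maddu_model(hilo_t *acc, uint32_t rs, uint32_t rt)
{
    uint64_t t = ((uint64_t)acc->hi << 32) | acc->lo;

    t += (uint64_t)rs * (uint64_t)rt;

    acc->lo = (uint32_t)t;
    acc->hi = (uint32_t)(t >> 32);
    return acc->lo;
}
```

Starting from an accumulator of 10, a signed multiply-accumulate of -3 and 4 leaves -2 in HI and LO, whereas the unsigned variant treats 0xFFFFFFFF as a large positive operand.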
MFC0 Move From System Control Coprocessor 0 MFC0
Bits    31..26        25..21     20..16   15..11   10..0
Field   COP0 010000   MF 00000   rt       rd       0 000 0000 0000
Width   6             5          5        5        11
Format:
MFC0 rt, rd
Description:
The contents of coprocessor register rd of the CP0 are loaded into general register rt.
May be used on both 32-bit and 64-bit CP0 registers.
Operation:
32  T:   data ← CPR[0, rd]
    T+1: GPR[rt] ← data
64  T:   data ← CPR[0, rd]
    T+1: GPR[rt] ← (data_31)^32 || data_31..0
Exceptions:
Coprocessor unusable exception
MFCz Move From Coprocessor z MFCz
Bits    31..26          25..21     20..16   15..11   10..0
Field   COPz 0100xx*    MF 00000   rt       rd       0 000 0000 0000
Width   6               5          5        5        11
Format:
MFCz rt, rd
Description:
The contents of coprocessor register rd of coprocessor z are loaded into general register rt.
Execution of the instruction referencing coprocessor 3 causes a reserved instruction
exception, not a coprocessor unusable exception.
Operation:
32  T:   data ← CPR[z, rd]
    T+1: GPR[rt] ← data
64  T:   if rd_0 = 0 then
             data ← CPR[z, rd_4..1 || 0]_31..0
         else
             data ← CPR[z, rd_4..1 || 0]_63..32
         endif
    T+1: GPR[rt] ← (data_31)^32 || data
Exceptions:
Coprocessor unusable exception
Reserved instruction exception (coprocessor 3)
*See the table “Opcode Bit Encoding” on next page, or “CPU Instruction Opcode Bit
Encoding” at the end of Appendix A.
Opcode Bit Encoding:

MFCz    Bit #   31 30 29 28   27 26   25 24 23 22 21   20 ... 0
MFC0             0  1  0  0    0  0    0  0  0  0  0
MFC1             0  1  0  0    0  1    0  0  0  0  0
MFC2             0  1  0  0    1  0    0  0  0  0  0
                 Opcode        Coprocessor    Coprocessor
                               Unit Number    Suboperation
MFHI Move From HI MFHI
Bits    31..26           25..16           15..11   10..6     5..0
Field   SPECIAL 000000   0 00 0000 0000   rd       0 00000   MFHI 010000
Width   6                10               5        5         6
Format:
MFHI rd
Description:
The contents of special register HI are loaded into general register rd.
To ensure proper operation in the event of interruptions, the two instructions which follow
a MFHI instruction may not be any of the instructions which modify the HI register: MULT,
MULTU, DIV, DIVU, MTHI, DMULT, DMULTU, DDIV, DDIVU, MADD, MADDU.
Operation:
32, 64  T: GPR[rd] ← HI
Exceptions:
None
MFLO Move From Lo MFLO
Bits    31..26           25..16           15..11   10..6     5..0
Field   SPECIAL 000000   0 00 0000 0000   rd       0 00000   MFLO 010010
Width   6                10               5        5         6
Format:
MFLO rd
Description:
The contents of special register LO are loaded into general register rd.
To ensure proper operation in the event of interruptions, the two instructions which follow
a MFLO instruction may not be any of the instructions which modify the LO register: MULT,
MULTU, DIV, DIVU, MTLO, DMULT, DMULTU, DDIV, DDIVU, MADD, MADDU.
Operation:
32, 64  T: GPR[rd] ← LO
Exceptions:
None
MTC0 Move To System Control Coprocessor 0 MTC0
Bits    31..26        25..21     20..16   15..11   10..0
Field   COP0 010000   MT 00100   rt       rd       0 000 0000 0000
Width   6             5          5        5        11
Format:
MTC0 rt, rd
Description:
The contents of general register rt are loaded into coprocessor register rd of the CP0.
Because the state of the virtual address translation system may be altered by this
instruction, the operation of load, store instructions and TLB operations immediately prior to
and after this instruction are undefined.
Operation:
32, 64  T:   data ← GPR[rt]
        T+1: CPR[0, rd] ← data
Exceptions:
Coprocessor unusable exception
MTCz Move To Coprocessor z MTCz
Bits    31..26          25..21     20..16   15..11   10..0
Field   COPz 0100xx*    MT 00100   rt       rd       0 000 0000 0000
Width   6               5          5        5        11
Format:
MTCz rt, rd
Description:
The contents of general register rt are loaded into coprocessor register rd of coprocessor z.
Execution of the instruction referencing coprocessor 3 causes a reserved instruction
exception, not a coprocessor unusable exception.
Operation:
32  T:   data ← GPR[rt]
    T+1: CPR[z, rd] ← data
64  T:   data ← GPR[rt]_31..0
    T+1: if rd_0 = 0 then
             CPR[z, rd_4..1 || 0] ← CPR[z, rd_4..1 || 0]_63..32 || data
         else
             CPR[z, rd_4..1 || 0] ← data || CPR[z, rd_4..1 || 0]_31..0
         endif
Exceptions:
Coprocessor unusable exception
Reserved instruction exception (coprocessor 3)
*Opcode Bit Encoding:

MTCz    Bit #   31 30 29 28   27 26   25 24 23 22 21   20 ... 0
MTC0             0  1  0  0    0  0    0  0  1  0  0
MTC1             0  1  0  0    0  1    0  0  1  0  0
MTC2             0  1  0  0    1  0    0  0  1  0  0
                 Opcode        Coprocessor    Coprocessor
                               Unit Number    Suboperation
MTHI Move To HI MTHI
Bits    31..26           25..21   20..6                  5..0
Field   SPECIAL 000000   rs       0 000 0000 0000 0000   MTHI 010001
Width   6                5        15                     6
Format:
MTHI rs
Description:
The contents of general register rs are loaded into special register HI.
If a MTHI operation is executed following a MULT, MULTU, DIV, DIVU, DMULT,
DMULTU, DDIV, DDIVU, MADD, or MADDU instruction, but before any MFLO, MFHI,
MTLO, or MTHI instructions, the contents of special register LO are undefined.
Operation:
32, 64  T-2: HI ← undefined
        T-1: HI ← undefined
        T:   HI ← GPR[rs]
Exceptions:
None
MTLO Move To LO MTLO
Bits    31..26           25..21   20..6                  5..0
Field   SPECIAL 000000   rs       0 000 0000 0000 0000   MTLO 010011
Width   6                5        15                     6
Format:
MTLO rs
Description:
The contents of general register rs are loaded into special register LO.
If a MTLO operation is executed following a MULT, MULTU, DIV, DIVU, DMULT, DMULTU,
DDIV, DDIVU, MADD, or MADDU instruction, but before any MFLO, MFHI, MTLO, or MTHI
instructions, the contents of special register HI are undefined.
Operation:
32, 64  T-2: LO ← undefined
        T-1: LO ← undefined
        T:   LO ← GPR[rs]
Exceptions:
None
MULT Multiply MULT
MULT rs, rt:
Bits    31..26           25..21   20..16   15..6            5..0
Field   SPECIAL 000000   rs       rt       0 00 0000 0000   MULT 011000
Width   6                5        5        10               6

MULT rd, rs, rt:
Bits    31..26           25..21   20..16   15..11   10..6     5..0
Field   SPECIAL 000000   rs       rt       rd       0 00000   MULT 011000
Width   6                5        5        5        5         6
Format:
MULT rs, rt
MULT rd, rs, rt
Description:
The contents of general registers rs and rt are multiplied, treating both operands as 32-bit
2’s-complement values. No integer overflow exception occurs under any circumstances. In
64-bit mode, the operands must be valid 32-bit, sign-extended values.
When the operation completes, the low-order word of the double result is loaded into
special register LO, and the high-order word of the double result is loaded into special
register HI.
If either of the two preceding instructions is MFHI or MFLO, the results of these
instructions are undefined. Correct operation requires separating reads of HI or LO from
writes by a minimum of two other instructions.
Operation:
32  T-2: LO ← undefined
         HI ← undefined
    T-1: LO ← undefined
         HI ← undefined
    T:   t ← GPR[rs] * GPR[rt]
         LO ← t_31..0
         HI ← t_63..32
         GPR[rd] ← t_31..0
64  T-2: LO ← undefined
         HI ← undefined
    T-1: LO ← undefined
         HI ← undefined
    T:   t ← GPR[rs]_31..0 * GPR[rt]_31..0
         LO ← (t_31)^32 || t_31..0
         HI ← (t_63)^32 || t_63..32
         GPR[rd] ← (t_31)^32 || t_31..0
Exceptions:
None
MULTU Multiply Unsigned MULTU
MULTU rs, rt:
Bits    31..26           25..21   20..16   15..6            5..0
Field   SPECIAL 000000   rs       rt       0 00 0000 0000   MULTU 011001
Width   6                5        5        10               6

MULTU rd, rs, rt:
Bits    31..26           25..21   20..16   15..11   10..6     5..0
Field   SPECIAL 000000   rs       rt       rd       0 00000   MULTU 011001
Width   6                5        5        5        5         6
Format:
MULTU rs, rt
MULTU rd, rs, rt
Description:
The contents of general register rs and the contents of general register rt are multiplied,
treating both operands as unsigned values. No overflow exception occurs under any
circumstances. In 64-bit mode, the operands must be valid 32-bit, sign-extended values.
When the operation completes, the low-order word of the double result is loaded into
special register LO, and the high-order word of the double result is loaded into special
register HI.
If either of the two preceding instructions is MFHI or MFLO, the results of these
instructions are undefined. Correct operation requires separating reads of HI or LO from
writes by a minimum of two instructions.
Operation:
32  T-2: LO ← undefined
         HI ← undefined
    T-1: LO ← undefined
         HI ← undefined
    T:   t ← (0 || GPR[rs]) * (0 || GPR[rt])
         LO ← t_31..0
         HI ← t_63..32
         GPR[rd] ← t_31..0
64  T-2: LO ← undefined
         HI ← undefined
    T-1: LO ← undefined
         HI ← undefined
    T:   t ← (0 || GPR[rs]_31..0) * (0 || GPR[rt]_31..0)
         LO ← (t_31)^32 || t_31..0
         HI ← (t_63)^32 || t_63..32
         GPR[rd] ← (t_31)^32 || t_31..0
Exceptions:
None
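In 64-bit mode both MULT and MULTU write sign-extended 32-bit halves of the product to LO and HI, even though MULTU treats its operands as unsigned. The C sketch below models just that step. The type and function names are assumptions, and the rd write and the timing of T-2 and T-1 are not modelled.

```c
#include <stdint.h>

/* Hypothetical model of the 64-bit HI/LO pair after a 32-bit multiply. */
typedef struct {
    int64_t hi;
    int64_t lo;
} hilo64_t;

/* Replicate bit 31 of a word into the upper 32 bits. */
static int64_t sext32(uint32_t w)
{
    return (int64_t)(int32_t)w;
}

/* MULT: the low 32 bits of each register are taken as signed operands. */
static hilo64_t mult_model(uint64_t rs, uint64_t rt)
{
    int64_t t = (int64_t)(int32_t)(uint32_t)rs * (int64_t)(int32_t)(uint32_t)rt;
    hilo64_t r = { sext32((uint32_t)((uint64_t)t >> 32)), sext32((uint32_t)t) };
    return r;
}

/* MULTU: the low 32 bits of each register are taken as unsigned operands. */
static hilo64_t multu_model(uint64_t rs, uint64_t rt)
{
    uint64_t t = (uint64_t)(uint32_t)rs * (uint64_t)(uint32_t)rt;
    hilo64_t r = { sext32((uint32_t)(t >> 32)), sext32((uint32_t)t) };
    return r;
}
```

For example, the unsigned product of 0xFFFFFFFF with itself is 0xFFFFFFFE00000001, so LO receives 1 and HI receives 0xFFFFFFFE sign-extended to 64 bits.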
NOR Nor NOR
Bits    31..26           25..21   20..16   15..11   10..6     5..0
Field   SPECIAL 000000   rs       rt       rd       0 00000   NOR 100111
Width   6                5        5        5        5         6
Format:
NOR rd, rs, rt
Description:
The contents of general register rs are combined with the contents of general register rt in
a bit-wise logical NOR operation. The result is placed into general register rd.
Operation:
32, 64  T: GPR[rd] ← GPR[rs] nor GPR[rt]
Exceptions:
None
OR Or OR
Bits    31..26           25..21   20..16   15..11   10..6     5..0
Field   SPECIAL 000000   rs       rt       rd       0 00000   OR 100101
Width   6                5        5        5        5         6
Format:
OR rd, rs, rt
Description:
The contents of general register rs are combined with the contents of general register rt in
a bit-wise logical OR operation. The result is placed into general register rd.
Operation:
32, 64  T: GPR[rd] ← GPR[rs] or GPR[rt]
Exceptions:
None
ORI Or Immediate ORI
Bits    31..26       25..21   20..16   15..0
Field   ORI 001101   rs       rt       immediate
Width   6            5        5        16
Format:
ORI rt, rs, immediate
Description:
The 16-bit immediate is zero-extended and combined with the contents of general register
rs in a bit-wise logical OR operation. The result is placed into general register rt.
Operation:
32  T: GPR[rt] ← GPR[rs]_31..16 || (immediate or GPR[rs]_15..0)
64  T: GPR[rt] ← GPR[rs]_63..16 || (immediate or GPR[rs]_15..0)
Exceptions:
None
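Because the ORI immediate is zero-extended, only bits 15..0 of the source can change. A common use is to pair ORI with LUI to build a 32-bit constant. The C sketch below models both steps in 64-bit mode; the function names are assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical model of ORI: the 16-bit immediate is zero-extended. */
static uint64_t ori_model(uint64_t rs, uint16_t immediate)
{
    return rs | (uint64_t)immediate;
}

/* Model of LUI in 64-bit mode, needed for the constant-building example. */
static uint64_t lui_model(uint16_t immediate)
{
    return (uint64_t)(int64_t)(int32_t)((uint32_t)immediate << 16);
}

/* Build a 32-bit constant as LUI (upper half) followed by ORI (lower half). */
static uint64_t load_const32(uint32_t value)
{
    uint64_t upper = lui_model((uint16_t)(value >> 16));
    return ori_model(upper, (uint16_t)(value & 0xFFFFu));
}
```

Since ORI never touches bits 63..16, the sign extension produced by LUI is preserved in the final constant.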
PREF Prefetch PREF
Bits    31..26        25..21   20..16   15..0
Field   PREF 110011   base     hint     offset
Width   6             5        5        16
Format :
PREF hint, offset (base)
Description :
PREF adds the 16-bit signed offset to the contents of GPR base to form an effective byte
address. It advises that data at the effective address may be used in the near future.
If the hint field is 00000 (binary), this instruction prefetches a block of data from main memory
into cache.
PREF is an advisory instruction. It may change the performance of the program. For all
hint values and all effective addresses, it neither changes architecturally-visible state nor
alters the meaning of the program.
PREF does not cause addressing-related exceptions. If it raises an exception condition, the
exception condition is ignored. If an addressing-related exception is raised and ignored, no
data will be prefetched. Even if no data is prefetched in such a case, some action that is not
architecturally-visible, such as writeback of a dirty cache line, might take place.
PREF will never generate a memory operation for a location with an uncached memory
access type.
The defined hint values are shown in the table below. The TX49 only supports hint = 0.
The hint table may be extended in future implementations.
hint field:
Value   Name       Data use and desired prefetch action
0       Load       Data is expected to be loaded (not modified).
                   Fetch data as if for a load.
1-31    Reserved   Reserved
Programming Notes:
Prefetch cannot prefetch data from a mapped location unless the translation for that
location is present in the TLB. Locations in memory pages that have not been accessed
recently may not have translations in the TLB, so prefetch may not be effective for such
locations.
Prefetch does not cause addressing exceptions. Prefetching with a pointer value before the
validity of the pointer has been determined will therefore not cause an exception.
Operation :
32, 64  T: vAddr ← GPR[base] + sign_extend (offset)
           (pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
           Prefetch (uncached, pAddr, vAddr, DATA, hint)
Exception :
None
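From C, a prefetch hint is usually requested through a compiler builtin rather than written by hand. The sketch below uses __builtin_prefetch, which GCC and Clang provide; whether a given compiler emits PREF for it depends on the target and options selected, so treat this as an illustration of the advisory nature of prefetching. The function name and the prefetch distance are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical prefetch distance in elements; tune for the workload. */
enum { PREFETCH_AHEAD = 16 };

/* Sums an array while hinting that data further ahead will be read soon.
 * The result is the same with or without the hint. */
static uint64_t sum_with_prefetch(const uint32_t *data, size_t n)
{
    uint64_t sum = 0;

    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_AHEAD < n) {
            /* Second argument 0 requests a prefetch for reading, which
             * corresponds to the "Load" hint; the locality value 3 is an
             * arbitrary choice. */
            __builtin_prefetch(&data[i + PREFETCH_AHEAD], 0, 3);
        }
        sum += data[i];
    }
    return sum;
}
```

Removing the builtin call changes only performance, never the computed sum, which mirrors the statement that PREF does not alter the meaning of the program.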
SB Store Byte SB
Bits    31..26       25..21   20..16   15..0
Field   SB 101000    base     rt       offset
Width   6            5        5        16
Format:
SB rt, offset (base)
Description:
The 16-bit offset is sign-extended and added to the contents of general register base to form
a virtual address. The least-significant byte of register rt is stored at the effective address.
Operation:
32  T: vAddr ← ((offset_15)^16 || offset_15..0) + GPR[base]
       (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
       pAddr ← pAddr_(PSIZE-1)..3 || (pAddr_2..0 xor ReverseEndian^3)
       byte ← vAddr_2..0 xor BigEndianCPU^3
       data ← GPR[rt]_(63-8*byte)..0 || 0^(8*byte)
       StoreMemory (uncached, BYTE, data, pAddr, vAddr, DATA)
64  T: vAddr ← ((offset_15)^48 || offset_15..0) + GPR[base]
       (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
       pAddr ← pAddr_(PSIZE-1)..3 || (pAddr_2..0 xor ReverseEndian^3)
       byte ← vAddr_2..0 xor BigEndianCPU^3
       data ← GPR[rt]_(63-8*byte)..0 || 0^(8*byte)
       StoreMemory (uncached, BYTE, data, pAddr, vAddr, DATA)
Exceptions:
TLB refill exception
TLB invalid exception
TLB modification exception
Bus error exception
Address error exception
SC Store Conditional SC
Bits    31..26       25..21   20..16   15..0
Field   SC 111000    base     rt       offset
Width   6            5        5        16
Format:
SC rt, offset (base)
Description:
The 16-bit offset is sign-extended and added to the contents of general register base to form
a virtual address. The contents of general register rt are conditionally stored at the memory
location specified by the effective address.
If an ERET instruction occurs between the Load Linked instruction and this store
instruction, the store fails and is inhibited from taking place.
The success or failure of the store operation (as defined above) is indicated by the contents
of general register rt after execution of the instruction. A successful store sets the contents of
general register rt to 1; an unsuccessful store sets it to 0.
The operation of Store Conditional is undefined when the address is different from the
address used in the last Load Linked.
This instruction is available in User mode; it is not necessary for CP0 to be enabled.
If either of the two least-significant bits of the effective address is non-zero, an address
error exception takes place.
If this instruction should both fail and take an exception, the exception takes precedence.
Operation:
32  T: vAddr ← ((offset_15)^16 || offset_15..0) + GPR[base]
       (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
       pAddr ← pAddr_(PSIZE-1)..3 || (pAddr_2..0 xor (ReverseEndian || 0^2))
       data ← GPR[rt]_(63-8*byte)..0 || 0^(8*byte)
       if LLbit then
           StoreMemory (uncached, WORD, data, pAddr, vAddr, DATA)
       endif
       GPR[rt] ← 0^31 || LLbit
64  T: vAddr ← ((offset_15)^48 || offset_15..0) + GPR[base]
       (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
       pAddr ← pAddr_(PSIZE-1)..3 || (pAddr_2..0 xor (ReverseEndian || 0^2))
       data ← GPR[rt]_(63-8*byte)..0 || 0^(8*byte)
       if LLbit then
           StoreMemory (uncached, WORD, data, pAddr, vAddr, DATA)
       endif
       GPR[rt] ← 0^63 || LLbit
Exceptions:
TLB refill exception
TLB invalid exception
TLB modification exception
Bus error exception
Address error exception
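The role of LLbit in the operation above can be illustrated with a toy C model. It is deliberately minimal and rests on assumptions: a single processor, a plain variable standing in for memory, hypothetical names throughout, and no modelling of the other events that can make a Store Conditional fail.

```c
#include <stdint.h>

/* Hypothetical processor state: only the link bit is modelled. */
typedef struct {
    int llbit;
} ll_cpu_t;

/* Load Linked: read the word and set LLbit. */
static uint32_t ll_model(ll_cpu_t *cpu, const uint32_t *addr)
{
    cpu->llbit = 1;
    return *addr;
}

/* ERET between LL and SC makes the following SC fail. */
static void eret_model(ll_cpu_t *cpu)
{
    cpu->llbit = 0;
}

/* Store Conditional: store only if LLbit is set; the return value is
 * what rt would hold afterwards (1 on success, 0 on failure). */
static uint32_t sc_model(ll_cpu_t *cpu, uint32_t *addr, uint32_t value)
{
    if (cpu->llbit)
        *addr = value;
    return (uint32_t)cpu->llbit;
}
```

A caller would loop, repeating the load, the computation and the conditional store, until the store reports success.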
SCD Store Conditional Doubleword SCD
Bits    31..26       25..21   20..16   15..0
Field   SCD 111100   base     rt       offset
Width   6            5        5        16
Format:
SCD rt, offset (base)
Description:
The 16-bit offset is sign-extended and added to the contents of general register base to form
a virtual address. The contents of general register rt are conditionally stored at the memory
location specified by the effective address.
If an ERET instruction occurs between the Load Linked Doubleword instruction and this
store instruction, the store fails and is inhibited from taking place.
The success or failure of the store operation (as defined above) is indicated by the contents
of general register rt after execution of the instruction. A successful store sets the contents of
general register rt to 1; an unsuccessful store sets it to 0.
The operation of Store Conditional Doubleword is undefined when the address is different
from the address used in the last Load Linked Doubleword.
This instruction is available in User mode; it is not necessary for CP0 to be enabled.
If either of the three least-significant bits of the effective address is non-zero, an address
error exception takes place.
If this instruction should both fail and take an exception, the exception takes precedence.
Operation:
64  T:  vAddr ← ((offset_15)^48 || offset_{15..0}) + GPR[base]
        (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
        data ← GPR[rt]
        if LLbit then
            StoreMemory (uncached, DOUBLEWORD, data, pAddr, vAddr, DATA)
        endif
        GPR[rt] ← 0^63 || LLbit
Note: The operation is the same in 32-bit kernel mode.
Exceptions:
TLB refill exception
TLB invalid exception
TLB modification exception
Bus error exception
Address error exception
Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)
SD Store Doubleword SD

  Bits     31-26       25-21   20-16   15-0
  Field    SD 111111   base    rt      offset
  Width    6           5       5       16
Format:
SD rt, offset (base)
Description:
The 16-bit offset is sign-extended and added to the contents of general register base to form
a virtual address. The contents of general register rt are stored at the memory location
specified by the effective address.
If any of the three least-significant bits of the effective address is non-zero, an address
error exception occurs.
Operation:
64  T:  vAddr ← ((offset_15)^48 || offset_{15..0}) + GPR[base]
        (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
        data ← GPR[rt]
        StoreMemory (uncached, DOUBLEWORD, data, pAddr, vAddr, DATA)
Note: The operation is the same in 32-bit kernel mode, except that the upper 32 bits
are ignored when the virtual address is formed.
Exceptions:
TLB refill exception
TLB invalid exception
TLB modification exception
Bus error exception
Address error exception
Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)
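The field layout and the address rule for SD can be checked with a few lines of C. This is an illustrative sketch only: `encode_itype`, `effective_address` and `sd_address_error` are hypothetical helper names, and the model ignores address translation entirely.

```c
#include <stdint.h>

/* Pack an I-type instruction word: opcode[31:26] base[25:21] rt[20:16] offset[15:0]. */
static uint32_t encode_itype(uint32_t opcode, uint32_t base, uint32_t rt, int16_t offset)
{
    return ((opcode & 0x3Fu) << 26) |
           ((base   & 0x1Fu) << 21) |
           ((rt     & 0x1Fu) << 16) |
           (uint32_t)(uint16_t)offset;
}

/* vAddr = sign-extended 16-bit offset + GPR[base] (64-bit mode). */
static uint64_t effective_address(uint64_t base_value, int16_t offset)
{
    return base_value + (uint64_t)(int64_t)offset;
}

/* SD raises an address error if any of the three low address bits is set. */
static int sd_address_error(uint64_t vaddr)
{
    return (vaddr & 7u) != 0;
}
```

For example, `encode_itype(0x3F, 29, 2, 8)` packs `SD $2, 8($29)` using the opcode value 111111 given in the encoding above.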
SDBBP Store Debug Breakpoint SDBBP

  Bits     31-26            25-6    5-0
  Field    SPECIAL 000000   code    SDBBP 001110
  Width    6                20      6
Format:
SDBBP code
Description:
Raises a Debug Breakpoint exception, passing control to an exception handler. The code
field can be used for passing information to the exception handler, but the only way for the
exception handler to retrieve the code field is to load the contents of the memory word
containing this instruction, using the DEPC register.
Operation:
32, 64  T:  SoftwareDebugBreakpointException
Exception:
Debug Breakpoint exception
SDCz Store Doubleword From Coprocessor z SDCz

  Bits     31-26          25-21   20-16   15-0
  Field    SDCz 1111xx*   base    rt      offset
  Width    6              5       5       16
Format:
SDCz rt, offset (base)
Description:
The 16-bit offset is sign-extended and added to the contents of general register base to form
a virtual address. Coprocessor unit z sources a doubleword, which the processor writes to the
addressed memory location. The data to be stored is defined by individual coprocessor
specifications.
If any of the three least-significant bits of the effective address is non-zero, an address
error exception takes place.
This instruction is not valid for use with CP0.
This instruction is undefined when the least-significant bit of the rt field is non-zero.
*See the table “Opcode Bit Encoding” on the next page, or “CPU Instruction Opcode Bit
Encoding” at the end of Appendix A.
Operation:
32  T:  vAddr ← ((offset_15)^16 || offset_{15..0}) + GPR[base]
        (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
        data ← COPzSD (rt)
        StoreMemory (uncached, DOUBLEWORD, data, pAddr, vAddr, DATA)
64  T:  vAddr ← ((offset_15)^48 || offset_{15..0}) + GPR[base]
        (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
        data ← COPzSD (rt)
        StoreMemory (uncached, DOUBLEWORD, data, pAddr, vAddr, DATA)
Exceptions:
TLB refill exception
TLB invalid exception
TLB modification exception
Bus error exception
Address error exception
Coprocessor unusable exception
Opcode Bit Encoding:

  SDCz     Bits 31-28     Bits 27-26                    Bits 25-0
           (SD opcode)    (Coprocessor Unit Number)
  SDC1     1111           01
  SDC2     1111           10
SDL Store Doubleword Left SDL

  Bits     31-26        25-21   20-16   15-0
  Field    SDL 101100   base    rt      offset
  Width    6            5       5       16
Format:
SDL rt, offset (base)
Description:
This instruction can be used with the SDR instruction to store the contents of a register
into eight consecutive bytes of memory, when the bytes cross a boundary between two
doublewords. SDL stores the left portion of the register into the appropriate part of the
high-order doubleword of memory; SDR stores the right portion of the register into the
appropriate part of the low-order doubleword.
The SDL instruction adds its sign-extended 16-bit offset to the contents of general register
base to form a virtual address which may specify an arbitrary byte. It alters only the
doubleword in memory which contains that byte. From one to eight bytes will be stored,
depending on the starting byte specified.
Conceptually, it starts at the most-significant byte of the register and copies it to the
specified byte in memory; then it proceeds toward the low-order byte of the register and the
low-order byte of the doubleword in memory, copying bytes from register to memory until it
reaches the low-order byte of the doubleword in memory.
No address exceptions due to alignment are possible.
    SDL $24,1($0)

                          memory (big-endian)           register $24
    before   address 0:   0  1  2  3  4  5  6  7        A B C D E F G H
             address 8:   8  9 10 11 12 13 14 15

    after    address 0:   0  A  B  C  D  E  F  G
             address 8:   8  9 10 11 12 13 14 15
This operation is only defined for the TX4300 operating in 64-bit mode and 32-bit kernel
mode.
Execution of this instruction in 32-bit user or supervisor mode causes a reserved
instruction exception.
Operation:
64  T:  vAddr ← ((offset_15)^48 || offset_{15..0}) + GPR[base]
        (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
        pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor ReverseEndian^3)
        if BigEndianMem = 0 then
            pAddr ← pAddr_{31..3} || 0^3
        endif
        byte ← vAddr_{2..0} xor BigEndianCPU^3
        data ← 0^{56-8*byte} || GPR[rt]_{63..56-8*byte}
        StoreMemory (uncached, byte, data, pAddr, vAddr, DATA)
Note: The operation is the same in 32-bit kernel mode, except that the upper 32 bits
are ignored when the virtual address is formed.
Given a doubleword in a register and a doubleword in memory, the operation of SDL is as
follows:

SDL
  Register   A B C D E F G H
  Memory     I J K L M N O P

                  BigEndianCPU = 0                  BigEndianCPU = 1
                                    offset                            offset
  vAddr_{2..0}  destination  type  LEM  BEM       destination  type  LEM  BEM
       0        IJKLMNOA      0     0    7        ABCDEFGH      7     0    0
       1        IJKLMNAB      1     0    6        IABCDEFG      6     0    1
       2        IJKLMABC      2     0    5        IJABCDEF      5     0    2
       3        IJKLABCD      3     0    4        IJKABCDE      4     0    3
       4        IJKABCDE      4     0    3        IJKLABCD      3     0    4
       5        IJABCDEF      5     0    2        IJKLMABC      2     0    5
       6        IABCDEFG      6     0    1        IJKLMNAB      1     0    6
       7        ABCDEFGH      7     0    0        IJKLMNOA      0     0    7

  LEM     BigEndianMem = 0
  BEM     BigEndianMem = 1
  Type    AccessType (see Figure 2-2) sent to memory
  Offset  pAddr_{2..0} sent to memory
Exceptions:
TLB refill exception
TLB invalid exception
TLB modification exception
Bus error exception
Address error exception
Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)
SDR Store Doubleword Right SDR

  Bits     31-26        25-21   20-16   15-0
  Field    SDR 101101   base    rt      offset
  Width    6            5       5       16
Format:
SDR rt, offset (base)
Description:
This instruction can be used with the SDL instruction to store the contents of a register
into eight consecutive bytes of memory, when the bytes cross a boundary between two
doublewords. SDR stores the right portion of the register into the appropriate part of the
low-order doubleword; SDL stores the left portion of the register into the appropriate part of
the high-order doubleword of memory.
The SDR instruction adds its sign-extended 16-bit offset to the contents of general register
base to form a virtual address which may specify an arbitrary byte. It alters only the
doubleword in memory which contains that byte. From one to eight bytes will be stored,
depending on the starting byte specified.
Conceptually, it starts at the least-significant (rightmost) byte of the register and copies it
to the specified byte in memory; then it proceeds toward the high-order byte of the register
and the high-order byte of the doubleword in memory, copying bytes from register to memory
until it reaches the high-order byte of the doubleword in memory.
No address exceptions due to alignment are possible.
    SDR $24,4($0)

                          memory (big-endian)           register $24
    before   address 0:   0  1  2  3  4  5  6  7        A B C D E F G H
             address 8:   8  9 10 11 12 13 14 15

    after    address 0:   D  E  F  G  H  5  6  7
             address 8:   8  9 10 11 12 13 14 15
This operation is only defined for the TX4300 operating in 64-bit mode and 32-bit kernel
mode.
Execution of this instruction in 32-bit user or supervisor mode causes a reserved
instruction exception.
Operation:
64  T:  vAddr ← ((offset_15)^48 || offset_{15..0}) + GPR[base]
        (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
        pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor ReverseEndian^3)
        if BigEndianMem = 0 then
            pAddr ← pAddr_{PSIZE-1..3} || 0^3
        endif
        byte ← vAddr_{2..0} xor BigEndianCPU^3
        data ← GPR[rt]_{63-8*byte..0} || 0^{8*byte}
        StoreMemory (uncached, DOUBLEWORD-byte, data, pAddr, vAddr, DATA)
Note: The operation is the same in 32-bit kernel mode, except that the upper 32 bits
are ignored when the virtual address is formed.
Given a doubleword in a register and a doubleword in memory, the operation of SDR is as
follows:

SDR
  Register   A B C D E F G H
  Memory     I J K L M N O P

                  BigEndianCPU = 0                  BigEndianCPU = 1
                                    offset                            offset
  vAddr_{2..0}  destination  type  LEM  BEM       destination  type  LEM  BEM
       0        ABCDEFGH      7     0    0        HJKLMNOP      0     7    0
       1        BCDEFGHP      6     1    0        GHKLMNOP      1     6    0
       2        CDEFGHOP      5     2    0        FGHLMNOP      2     5    0
       3        DEFGHNOP      4     3    0        EFGHMNOP      3     4    0
       4        EFGHMNOP      3     4    0        DEFGHNOP      4     3    0
       5        FGHLMNOP      2     5    0        CDEFGHOP      5     2    0
       6        GHKLMNOP      1     6    0        BCDEFGHP      6     1    0
       7        HJKLMNOP      0     7    0        ABCDEFGH      7     0    0

  LEM     BigEndianMem = 0
  BEM     BigEndianMem = 1
  Type    AccessType (see Figure 2-2) sent to memory
  Offset  pAddr_{2..0} sent to memory
Exceptions:
TLB refill exception
TLB invalid exception
TLB modification exception
Bus error exception
Address error exception
Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)
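The byte movement performed by an SDL/SDR pair can be modelled in software. The sketch below assumes a big-endian CPU and memory and treats memory as a plain byte array; `sdl_be`, `sdr_be` and `store_unaligned_be` are hypothetical names, and translation and exceptions are not modelled.

```c
#include <stdint.h>

/* dword points at the aligned doubleword containing the target byte;
 * index 0 is the lowest address (most significant byte when big-endian). */

/* SDL: copy register bytes, starting at the most significant one,
 * into dword[low3] .. dword[7]. */
static void sdl_be(uint8_t dword[8], uint64_t reg, unsigned low3)
{
    for (unsigned k = 0; low3 + k < 8; k++) {
        dword[low3 + k] = (uint8_t)(reg >> (56 - 8 * k));
    }
}

/* SDR: copy register bytes, starting at the least significant one,
 * into dword[low3] down to dword[0]. */
static void sdr_be(uint8_t dword[8], uint64_t reg, unsigned low3)
{
    for (unsigned k = 0; k <= low3; k++) {
        dword[low3 - k] = (uint8_t)(reg >> (8 * k));
    }
}

/* Unaligned 8-byte store: the model equivalent of
 * SDL rt, 0(addr) followed by SDR rt, 7(addr). */
static void store_unaligned_be(uint8_t *mem, unsigned addr, uint64_t reg)
{
    unsigned last = addr + 7;
    sdl_be(mem + (addr & ~7u), reg, addr & 7u);
    sdr_be(mem + (last & ~7u), reg, last & 7u);
}
```

With the register holding the bytes A..H, `sdr_be` with `low3 = 4` writes D E F G H into bytes 0 to 4, which corresponds to the BigEndianCPU = 1 row for vAddr_{2..0} = 4 in the SDR table.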
SH Store Halfword SH

  Bits     31-26       25-21   20-16   15-0
  Field    SH 101001   base    rt      offset
  Width    6           5       5       16
Format:
SH rt, offset (base)
Description:
The 16-bit offset is sign-extended and added to the contents of general register base to form
an unsigned effective address. The least-significant halfword of register rt is stored at the
effective address. If the least-significant bit of the effective address is non-zero, an address
error exception occurs.
Operation:
32  T:  vAddr ← ((offset_15)^16 || offset_{15..0}) + GPR[base]
        (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
        pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor (ReverseEndian^2 || 0))
        byte ← vAddr_{2..0} xor (BigEndianCPU^2 || 0)
        data ← GPR[rt]_{63-8*byte..0} || 0^{8*byte}
        StoreMemory (uncached, HALFWORD, data, pAddr, vAddr, DATA)
64  T:  vAddr ← ((offset_15)^48 || offset_{15..0}) + GPR[base]
        (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
        pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor (ReverseEndian^2 || 0))
        byte ← vAddr_{2..0} xor (BigEndianCPU^2 || 0)
        data ← GPR[rt]_{63-8*byte..0} || 0^{8*byte}
        StoreMemory (uncached, HALFWORD, data, pAddr, vAddr, DATA)
Exceptions:
TLB refill exception
TLB invalid exception
TLB modification exception
Bus error exception
Address error exception
SLL Shift Left Logical SLL

  Bits     31-26            25-21     20-16   15-11   10-6   5-0
  Field    SPECIAL 000000   0 00000   rt      rd      sa     SLL 000000
  Width    6                5         5       5       5      6
Format:
SLL rd, rt, sa
Description:
The contents of general register rt are shifted left by sa bits, inserting zeros into the
low-order bits. The result is placed in register rd. In 64-bit mode, the 32-bit result is
sign-extended when placed in the destination register. It is sign-extended for all shift
amounts, including zero; SLL with a zero shift amount truncates a 64-bit value to 32 bits and
sign-extends this 32-bit value. SLL, unlike nearly all other word operations, does not require
the operand to be a properly sign-extended word value to produce a valid sign-extended word
result.
Note: SLL with a shift amount of zero may be treated as a NOP by some assemblers at
some optimization levels. If using SLL with zero shift to truncate 64-bit values, check the
assembler being used.
Operation:
32  T:  GPR[rd] ← GPR[rt]_{31-sa..0} || 0^sa
64  T:  s ← 0 || sa
        temp ← GPR[rt]_{31-s..0} || 0^s
        GPR[rd] ← (temp_31)^32 || temp
Exceptions:
None
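The 64-bit behaviour of SLL (shift the low word, then sign-extend bit 31 of the result) can be sketched in C as follows. The function names `sign_extend32` and `sll_64bit_mode` are hypothetical, and the model covers only the register result.

```c
#include <stdint.h>

/* Replicate bit 31 of a 32-bit value into bits 63..32. */
static uint64_t sign_extend32(uint32_t w)
{
    uint64_t wide = w;
    if (w & 0x80000000u) {
        wide |= 0xFFFFFFFF00000000ull;
    }
    return wide;
}

/* SLL in 64-bit mode: only the low 32 bits of rt take part in the shift;
 * the upper 32 bits of rt are discarded, even when sa is zero. */
static uint64_t sll_64bit_mode(uint64_t rt, unsigned sa)
{
    uint32_t temp = (uint32_t)rt << (sa & 31u);
    return sign_extend32(temp);
}
```

A zero shift amount therefore acts as "truncate to 32 bits and sign-extend", which is why an assembler that silently turns it into a NOP changes the result.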
SLLV Shift Left Logical Variable SLLV

  Bits     31-26            25-21   20-16   15-11   10-6      5-0
  Field    SPECIAL 000000   rs      rt      rd      0 00000   SLLV 000100
  Width    6                5       5       5       5         6
Format:
SLLV rd, rt, rs
Description:
The contents of general register rt are shifted left by the number of bits specified by the
low-order five bits of general register rs, inserting zeros into the low-order bits. The
result is placed in register rd. In 64-bit mode, the 32-bit result is sign-extended when
placed in the destination register. It is sign-extended for all shift amounts, including
zero; SLLV with a zero shift amount truncates a 64-bit value to 32 bits and sign-extends
this 32-bit value. SLLV, unlike nearly all other word operations, does not require
the operand to be a properly sign-extended word value to produce a valid sign-extended word
result.
Note: SLLV with a shift amount of zero may be treated as a NOP by some assemblers at
some optimization levels. If using SLLV with zero shift to truncate 64-bit values, check the
assembler being used.
Operation:
32  T:  s ← GPR[rs]_{4..0}
        GPR[rd] ← GPR[rt]_{(31-s)..0} || 0^s
64  T:  s ← 0 || GPR[rs]_{4..0}
        temp ← GPR[rt]_{(31-s)..0} || 0^s
        GPR[rd] ← (temp_31)^32 || temp
Exceptions:
None
SLT Set On Less Than SLT

  Bits     31-26            25-21   20-16   15-11   10-6      5-0
  Field    SPECIAL 000000   rs      rt      rd      0 00000   SLT 101010
  Width    6                5       5       5       5         6
Format:
SLT rd, rs, rt
Description:
The contents of general register rt are subtracted from the contents of general register rs.
Considering both quantities as signed integers, if the contents of general register rs are less
than the contents of general register rt, the result is set to one, otherwise the result is set to
zero.
The result is placed into general register rd.
No integer overflow exception occurs under any circumstances. The comparison is valid
even if the subtraction used during the comparison overflows.
Operation:
32  T:  if GPR[rs] < GPR[rt] then
            GPR[rd] ← 0^31 || 1
        else
            GPR[rd] ← 0^32
        endif
64  T:  if GPR[rs] < GPR[rt] then
            GPR[rd] ← 0^63 || 1
        else
            GPR[rd] ← 0^64
        endif
Exceptions:
None
SLTI Set On Less Than Immediate SLTI

  Bits     31-26         25-21   20-16   15-0
  Field    SLTI 001010   rs      rt      immediate
  Width    6             5       5       16
Format:
SLTI rt, rs, immediate
Description:
The 16-bit immediate is sign-extended and subtracted from the contents of general register
rs. Considering both quantities as signed integers, if rs is less than the sign-extended
immediate, the result is set to one, otherwise the result is set to zero. The result is placed
into general register rt.
No integer overflow exception occurs under any circumstances. The comparison is valid
even if the subtraction used during the comparison overflows.
Operation:
32  T:  if GPR[rs] < (immediate_15)^16 || immediate_{15..0} then
            GPR[rt] ← 0^31 || 1
        else
            GPR[rt] ← 0^32
        endif
64  T:  if GPR[rs] < (immediate_15)^48 || immediate_{15..0} then
            GPR[rt] ← 0^63 || 1
        else
            GPR[rt] ← 0^64
        endif
Exceptions:
None
SLTIU Set On Less Than Immediate Unsigned SLTIU

  Bits     31-26          25-21   20-16   15-0
  Field    SLTIU 001011   rs      rt      immediate
  Width    6              5       5       16
Format:
SLTIU rt, rs, immediate
Description:
The 16-bit immediate is sign-extended and subtracted from the contents of general register
rs. Considering both quantities as unsigned integers, if rs is less than the sign-extended
immediate, the result is set to one, otherwise the result is set to zero. The result is placed
into general register rt.
No integer overflow exception occurs under any circumstances. The comparison is valid
even if the subtraction used during the comparison overflows.
Operation:
32 T: if (0 GPR[rs]) < (immediate15)16 immediate150 then
GPR[rt] 0311
else
GPR[rt] 032
endif
64 T: if (0 GPR[rs]) < (immediate15)48 immediate150 then
GPR[rt] 0631
else
GPR[rt] 064
endif
Exceptions:
None
SLTU Set On Less Than Unsigned SLTU

  Bits     31-26            25-21   20-16   15-11   10-6      5-0
  Field    SPECIAL 000000   rs      rt      rd      0 00000   SLTU 101011
  Width    6                5       5       5       5         6
Format:
SLTU rd, rs, rt
Description:
The contents of general register rt are subtracted from the contents of general register rs.
Considering both quantities as unsigned integers, if the contents of general register rs are
less than the contents of general register rt, the result is set to one, otherwise the result is
set to zero.
The result is placed into general register rd.
No integer overflow exception occurs under any circumstances. The comparison is valid
even if the subtraction used during the comparison overflows.
Operation:
32 T: if (0 GPR[rs]) < 0 GPR[rt ] th e n
GPR[rd] 031 1
else
GPR[rd] 032
endif
64 T: if (0 GPR[rs]) < 0 GPR[rt ] th e n
GPR[rd] 063 1
else
GPR[rd] 064
endif
Exceptions:
None
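The four set-on-less-than instructions differ only in how the operands are interpreted. A minimal C sketch of the 64-bit mode results follows; the function names are hypothetical. Note in particular that SLTIU sign-extends its immediate first and only then compares as unsigned.

```c
#include <stdint.h>

/* SLT: signed comparison of two registers. */
static uint64_t slt(int64_t rs, int64_t rt)
{
    return rs < rt;
}

/* SLTU: unsigned comparison of two registers. */
static uint64_t sltu(uint64_t rs, uint64_t rt)
{
    return rs < rt;
}

/* SLTI: the immediate is sign-extended, comparison is signed. */
static uint64_t slti(int64_t rs, int16_t immediate)
{
    return rs < (int64_t)immediate;
}

/* SLTIU: the immediate is still sign-extended, but the comparison is
 * unsigned, so an immediate of -1 becomes the largest unsigned value. */
static uint64_t sltiu(uint64_t rs, int16_t immediate)
{
    return rs < (uint64_t)(int64_t)immediate;
}
```

Because the model compares directly instead of subtracting, it also illustrates why the result stays valid for operand pairs whose difference would overflow.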
SRA Shift Right Arithmetic SRA

  Bits     31-26            25-21     20-16   15-11   10-6   5-0
  Field    SPECIAL 000000   0 00000   rt      rd      sa     SRA 000011
  Width    6                5         5       5       5      6
Format:
SRA rd, rt, sa
Description:
The contents of general register rt are shifted right by sa bits, sign-extending the
high-order bits. The result is placed in register rd. In 64-bit mode, the operand must be a
valid sign-extended, 32-bit value.
Operation:
32  T:  GPR[rd] ← (GPR[rt]_31)^sa || GPR[rt]_{31..sa}
64  T:  s ← 0 || sa
        temp ← (GPR[rt]_31)^s || GPR[rt]_{31..s}
        GPR[rd] ← (temp_31)^32 || temp
Exceptions:
None
SRAV Shift Right Arithmetic Variable SRAV

  Bits     31-26            25-21   20-16   15-11   10-6      5-0
  Field    SPECIAL 000000   rs      rt      rd      0 00000   SRAV 000111
  Width    6                5       5       5       5         6
Format:
SRAV rd, rt, rs
Description:
The contents of general register rt are shifted right by the number of bits specified by the
low-order five bits of general register rs, sign-extending the high-order bits. The result is
placed in register rd. In 64-bit mode, the operand must be a valid sign-extended, 32-bit value.
Operation:
32  T:  s ← GPR[rs]_{4..0}
        GPR[rd] ← (GPR[rt]_31)^s || GPR[rt]_{31..s}
64  T:  s ← GPR[rs]_{4..0}
        temp ← (GPR[rt]_31)^s || GPR[rt]_{31..s}
        GPR[rd] ← (temp_31)^32 || temp
Exceptions:
None
SRL Shift Right Logical SRL

  Bits     31-26            25-21     20-16   15-11   10-6   5-0
  Field    SPECIAL 000000   0 00000   rt      rd      sa     SRL 000010
  Width    6                5         5       5       5      6
Format:
SRL rd, rt, sa
Description:
The contents of general register rt are shifted right by sa bits, inserting zeros into the
high-order bits. The result is placed in register rd. In 64-bit mode, the operand must be a
valid sign-extended, 32-bit value.
Operation:
32  T:  GPR[rd] ← 0^sa || GPR[rt]_{31..sa}
64  T:  s ← 0 || sa
        temp ← 0^s || GPR[rt]_{31..s}
        GPR[rd] ← (temp_31)^32 || temp
Exceptions:
None
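The difference between the arithmetic and logical right shifts in 64-bit mode can be sketched in C. The helper names are hypothetical; both functions work on the low 32 bits of rt and sign-extend the 32-bit result, as in the 64-bit operation descriptions of SRA and SRL.

```c
#include <stdint.h>

/* Replicate bit 31 of a 32-bit value into bits 63..32. */
static uint64_t sign_extend32(uint32_t w)
{
    uint64_t wide = w;
    if (w & 0x80000000u) {
        wide |= 0xFFFFFFFF00000000ull;
    }
    return wide;
}

/* SRA: vacated high-order bits are filled with copies of bit 31. */
static uint64_t sra_64bit_mode(uint64_t rt, unsigned sa)
{
    uint32_t w = (uint32_t)rt;
    uint32_t fill = 0;
    sa &= 31u;
    if ((w & 0x80000000u) && sa != 0) {
        fill = 0xFFFFFFFFu << (32 - sa);
    }
    return sign_extend32((w >> sa) | fill);
}

/* SRL: vacated high-order bits are filled with zeros. */
static uint64_t srl_64bit_mode(uint64_t rt, unsigned sa)
{
    uint32_t w = (uint32_t)rt;
    return sign_extend32(w >> (sa & 31u));
}
```

With a zero shift amount SRL leaves bit 31 in place, so the sign extension of the result can still produce a negative 64-bit value.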
SRLV Shift Right Logical Variable SRLV

  Bits     31-26            25-21   20-16   15-11   10-6      5-0
  Field    SPECIAL 000000   rs      rt      rd      0 00000   SRLV 000110
  Width    6                5       5       5       5         6
Format:
SRLV rd, rt, rs
Description:
The contents of general register rt are shifted right by the number of bits specified by the
low-order five bits of general register rs, inserting zeros into the high-order bits. The result
is placed in register rd. In 64-bit mode, the operand must be a valid sign-extended, 32-bit
value.
Operation:
32  T:  s ← GPR[rs]_{4..0}
        GPR[rd] ← 0^s || GPR[rt]_{31..s}
64  T:  s ← GPR[rs]_{4..0}
        temp ← 0^s || GPR[rt]_{31..s}
        GPR[rd] ← (temp_31)^32 || temp
Exceptions:
None
SUB Subtract SUB

  Bits     31-26            25-21   20-16   15-11   10-6      5-0
  Field    SPECIAL 000000   rs      rt      rd      0 00000   SUB 100010
  Width    6                5       5       5       5         6
Format:
SUB rd, rs, rt
Description:
The contents of general register rt are subtracted from the contents of general register rs to
form a result. The result is placed into general register rd. In 64-bit mode, the operands
must be valid sign-extended, 32-bit values.
The only difference between this instruction and the SUBU instruction is that SUBU never
traps on overflow.
An integer overflow exception takes place if the carries out of bits 30 and 31 differ
(2’s-complement overflow). The destination register rd is not modified when an integer
overflow exception occurs.
Operation:
32  T:  GPR[rd] ← GPR[rs] - GPR[rt]
64  T:  temp ← GPR[rs] - GPR[rt]
        GPR[rd] ← (temp_31)^32 || temp_{31..0}
Exceptions:
Integer overflow exception
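The overflow condition for SUB can be tested in software without relying on signed overflow. The sketch below uses the sign-based form of the test, which is equivalent to "the carries out of bits 30 and 31 differ"; `sub32_checked` is a hypothetical name, and the operands are passed as raw 32-bit patterns.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of 32-bit SUB: returns true if an integer overflow exception
 * would be raised. On overflow *rd is left unmodified, matching the
 * rule that the destination register is not written. */
static bool sub32_checked(uint32_t rs, uint32_t rt, uint32_t *rd)
{
    uint32_t diff = rs - rt;  /* wraps modulo 2^32, like SUBU */

    /* Overflow: operands have different signs and the sign of the
     * result differs from the sign of rs. */
    bool overflow = (((rs ^ rt) & (rs ^ diff)) & 0x80000000u) != 0;

    if (!overflow) {
        *rd = diff;
    }
    return overflow;
}
```

SUBU corresponds to the `diff` computation alone, with the result always written.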
SUBU Subtract Unsigned SUBU

  Bits     31-26            25-21   20-16   15-11   10-6      5-0
  Field    SPECIAL 000000   rs      rt      rd      0 00000   SUBU 100011
  Width    6                5       5       5       5         6
Format:
SUBU rd, rs, rt
Description:
The contents of general register rt are subtracted from the contents of general register rs to
form a result. The result is placed into general register rd. In 64-bit mode, the operands
must be valid sign-extended, 32-bit values.
The only difference between this instruction and the SUB instruction is that SUBU never
traps on overflow. No integer overflow exception occurs under any circumstances.
Operation:
32  T:  GPR[rd] ← GPR[rs] - GPR[rt]
64  T:  temp ← GPR[rs] - GPR[rt]
        GPR[rd] ← (temp_31)^32 || temp_{31..0}
Exceptions:
None
SW Store Word SW

  Bits     31-26       25-21   20-16   15-0
  Field    SW 101011   base    rt      offset
  Width    6           5       5       16
Format:
SW rt, offset (base)
Description:
The 16-bit offset is sign-extended and added to the contents of general register base to form
a virtual address. The contents of general register rt are stored at the memory location
specified by the effective address.
If either of the two least-significant bits of the effective address is non-zero, an address
error exception occurs.
Operation:
32  T:  vAddr ← ((offset_15)^16 || offset_{15..0}) + GPR[base]
        (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
        pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor (ReverseEndian || 0^2))
        byte ← vAddr_{2..0} xor (BigEndianCPU || 0^2)
        data ← GPR[rt]_{63-8*byte..0} || 0^{8*byte}
        StoreMemory (uncached, WORD, data, pAddr, vAddr, DATA)
64  T:  vAddr ← ((offset_15)^48 || offset_{15..0}) + GPR[base]
        (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
        pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor (ReverseEndian || 0^2))
        byte ← vAddr_{2..0} xor (BigEndianCPU || 0^2)
        data ← GPR[rt]_{63-8*byte..0} || 0^{8*byte}
        StoreMemory (uncached, WORD, data, pAddr, vAddr, DATA)
Exceptions:
TLB refill exception
TLB invalid exception
TLB modification exception
Bus error exception
Address error exception
SWCz Store Word From Coprocessor z SWCz

  Bits     31-26          25-21   20-16   15-0
  Field    SWCz 1110xx*   base    rt      offset
  Width    6              5       5       16
Format:
SWCz rt, offset (base)
Description:
The 16-bit offset is sign-extended and added to the contents of general register base to form
a virtual address. Coprocessor unit z sources a word, which the processor writes to the
addressed memory location.
The data to be stored is defined by individual coprocessor specifications. This instruction
is not valid for use with CP0. If either of the two least-significant bits of the effective address
is non-zero, an address error exception occurs.
Execution of the instruction referencing coprocessor 3 causes a reserved instruction
exception, not a coprocessor unusable exception.
Operation:
32  T:  vAddr ← ((offset_15)^16 || offset_{15..0}) + GPR[base]
        (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
        pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor (ReverseEndian || 0^2))
        byte ← vAddr_{2..0} xor (BigEndianCPU || 0^2)
        data ← COPzSW (byte, rt)
        StoreMemory (uncached, WORD, data, pAddr, vAddr, DATA)
64  T:  vAddr ← ((offset_15)^48 || offset_{15..0}) + GPR[base]
        (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
        pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor (ReverseEndian || 0^2))
        byte ← vAddr_{2..0} xor (BigEndianCPU || 0^2)
        data ← COPzSW (byte, rt)
        StoreMemory (uncached, WORD, data, pAddr, vAddr, DATA)
*See the table “Opcode Bit Encoding” on the next page, or “CPU Instruction Opcode Bit
Encoding” at the end of Appendix A.
Exceptions:
TLB refill exception
TLB invalid exception
TLB modification exception
Bus error exception
Address error exception
Coprocessor unusable exception
Opcode Bit Encoding:

  SWCz     Bits 31-28     Bits 27-26                    Bits 25-0
           (SW opcode)    (Coprocessor Unit Number)
  SWC1     1110           01
  SWC2     1110           10
SWL Store Word Left SWL

  Bits     31-26        25-21   20-16   15-0
  Field    SWL 101010   base    rt      offset
  Width    6            5       5       16
Format:
SWL rt, offset (base)
Description:
This instruction can be used with the SWR instruction to store the contents of a register
into four consecutive bytes of memory, when the bytes cross a boundary between two words.
SWL stores the left portion of the register into the appropriate part of the high-order word of
memory; SWR stores the right portion of the register into the appropriate part of the
low-order word.
The SWL instruction adds its sign-extended 16-bit offset to the contents of general register
base to form a virtual address which may specify an arbitrary byte. It alters only the word in
memory which contains that byte. From one to four bytes will be stored, depending on the
starting byte specified.
Conceptually, it starts at the most-significant byte of the register and copies it to the
specified byte in memory; then it proceeds toward the low-order byte of the register and the
low-order byte of the word in memory, copying bytes from register to memory until it reaches
the low-order byte of the word in memory.
No address exceptions due to alignment are possible.
    SWL $24,1($0)

                          memory (big-endian)     register $24
    before   address 0:   0  1  2  3              A B C D
             address 4:   4  5  6  7

    after    address 0:   0  A  B  C
             address 4:   4  5  6  7
Operation:
32  T:  vAddr ← ((offset_15)^16 || offset_{15..0}) + GPR[base]
        (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
        pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor ReverseEndian^3)
        if BigEndianMem = 0 then
            pAddr ← pAddr_{31..2} || 0^2
        endif
        byte ← vAddr_{1..0} xor BigEndianCPU^2
        if (vAddr_2 xor BigEndianCPU) = 0 then
            data ← 0^32 || 0^{24-8*byte} || GPR[rt]_{31..24-8*byte}
        else
            data ← 0^{24-8*byte} || GPR[rt]_{31..24-8*byte} || 0^32
        endif
        StoreMemory (uncached, byte, data, pAddr, vAddr, DATA)
64  T:  vAddr ← ((offset_15)^48 || offset_{15..0}) + GPR[base]
        (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
        pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor ReverseEndian^3)
        if BigEndianMem = 0 then
            pAddr ← pAddr_{31..2} || 0^2
        endif
        byte ← vAddr_{1..0} xor BigEndianCPU^2
        if (vAddr_2 xor BigEndianCPU) = 0 then
            data ← 0^32 || 0^{24-8*byte} || GPR[rt]_{31..24-8*byte}
        else
            data ← 0^{24-8*byte} || GPR[rt]_{31..24-8*byte} || 0^32
        endif
        StoreMemory (uncached, byte, data, pAddr, vAddr, DATA)
Given a doubleword in a register and a doubleword in memory, the operation of SWL is as
follows:

SWL
  Register   A B C D E F G H
  Memory     I J K L M N O P

                  BigEndianCPU = 0                  BigEndianCPU = 1
                                    offset                            offset
  vAddr_{2..0}  destination  type  LEM  BEM       destination  type  LEM  BEM
       0        IJKLMNOE      0     0    7        EFGHMNOP      3     4    0
       1        IJKLMNEF      1     0    6        IEFGMNOP      2     4    1
       2        IJKLMEFG      2     0    5        IJEFMNOP      1     4    2
       3        IJKLEFGH      3     0    4        IJKEMNOP      0     4    3
       4        IJKEMNOP      0     4    3        IJKLEFGH      3     0    4
       5        IJEFMNOP      1     4    2        IJKLMEFG      2     0    5
       6        IEFGMNOP      2     4    1        IJKLMNEF      1     0    6
       7        EFGHMNOP      3     4    0        IJKLMNOE      0     0    7

  LEM     BigEndianMem = 0
  BEM     BigEndianMem = 1
  Type    AccessType (see Figure 2-2) sent to memory
  Offset  pAddr_{2..0} sent to memory
Exceptions:
TLB refill exception
TLB invalid exception
TLB modification exception
Bus error exception
Address error exception
SWR Store Word Right SWR

  Bits     31-26        25-21   20-16   15-0
  Field    SWR 101110   base    rt      offset
  Width    6            5       5       16
Format:
SWR rt, offset (base)
Description:
This instruction can be used with the SWL instruction to store the contents of a register
into four consecutive bytes of memory, when the bytes cross a boundary between two words.
SWR stores the right portion of the register into the appropriate part of the low-order word;
SWL stores the left portion of the register into the appropriate part of the high-order word of
memory.
The SWR instruction adds its sign-extended 16-bit offset to the contents of general register
base to form a virtual address which may specify an arbitrary byte. It alters only the word in
memory which contains that byte. From one to four bytes will be stored, depending on the
starting byte specified.
Conceptually, it starts at the least-significant (rightmost) byte of the register and copies it
to the specified byte in memory; then it proceeds toward the high-order byte of the register
and the high-order byte of the word in memory, copying bytes from register to memory until
it reaches the high-order byte of the word in memory.
No address exceptions due to alignment are possible.
    SWR $24,4($0)

                          memory (big-endian)     register $24
    before   address 0:   0  1  2  3              A B C D
             address 4:   4  5  6  7

    after    address 0:   0  1  2  3
             address 4:   D  5  6  7
Operation:
32  T:  vAddr ← ((offset_15)^16 || offset_{15..0}) + GPR[base]
        (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
        pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor ReverseEndian^3)
        if BigEndianMem = 0 then
            pAddr ← pAddr_{31..2} || 0^2
        endif
        byte ← vAddr_{1..0} xor BigEndianCPU^2
        if (vAddr_2 xor BigEndianCPU) = 0 then
            data ← 0^32 || GPR[rt]_{31-8*byte..0} || 0^{8*byte}
        else
            data ← GPR[rt]_{31-8*byte..0} || 0^{8*byte} || 0^32
        endif
        StoreMemory (uncached, WORD-byte, data, pAddr, vAddr, DATA)
64  T:  vAddr ← ((offset_15)^48 || offset_{15..0}) + GPR[base]
        (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
        pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor ReverseEndian^3)
        if BigEndianMem = 0 then
            pAddr ← pAddr_{31..2} || 0^2
        endif
        byte ← vAddr_{1..0} xor BigEndianCPU^2
        if (vAddr_2 xor BigEndianCPU) = 0 then
            data ← 0^32 || GPR[rt]_{31-8*byte..0} || 0^{8*byte}
        else
            data ← GPR[rt]_{31-8*byte..0} || 0^{8*byte} || 0^32
        endif
        StoreMemory (uncached, WORD-byte, data, pAddr, vAddr, DATA)
Given a doubleword in a register and a doubleword in memory, the operation of SWR is as
follows:

SWR
  Register   A B C D E F G H
  Memory     I J K L M N O P

                  BigEndianCPU = 0                  BigEndianCPU = 1
                                    offset                            offset
  vAddr_{2..0}  destination  type  LEM  BEM       destination  type  LEM  BEM
       0        IJKLEFGH      3     0    4        HJKLMNOP      0     7    0
       1        IJKLFGHP      2     1    4        GHKLMNOP      1     6    0
       2        IJKLGHOP      1     2    4        FGHLMNOP      2     5    0
       3        IJKLHNOP      0     3    4        EFGHMNOP      3     4    0
       4        EFGHMNOP      3     4    0        IJKLHNOP      0     3    4
       5        FGHLMNOP      2     5    0        IJKLGHOP      1     2    4
       6        GHKLMNOP      1     6    0        IJKLFGHP      2     1    4
       7        HJKLMNOP      0     7    0        IJKLEFGH      3     0    4

  LEM     BigEndianMem = 0
  BEM     BigEndianMem = 1
  Type    AccessType (see Figure 2-2) sent to memory
  Offset  pAddr_{2..0} sent to memory
Exceptions:
TLB refill exception
TLB invalid exception
TLB modification exception
Bus error exception
Address error exception
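How SWL and SWR cooperate on an unaligned word can be sketched for the little-endian case. This is a software model only: `swl_le`, `swr_le` and `sw_unaligned_le` are hypothetical names, memory is a plain byte array, and both CPU and memory are assumed little-endian.

```c
#include <stdint.h>

/* word points at the aligned word containing the target byte;
 * index 0 is the lowest address (least significant byte when little-endian). */

/* SWL: copy register bytes, starting at the most significant one,
 * into word[low2] down to word[0]. */
static void swl_le(uint8_t word[4], uint32_t reg, unsigned low2)
{
    for (unsigned k = 0; k <= low2; k++) {
        word[low2 - k] = (uint8_t)(reg >> (24 - 8 * k));
    }
}

/* SWR: copy register bytes, starting at the least significant one,
 * into word[low2] .. word[3]. */
static void swr_le(uint8_t word[4], uint32_t reg, unsigned low2)
{
    for (unsigned k = 0; low2 + k < 4; k++) {
        word[low2 + k] = (uint8_t)(reg >> (8 * k));
    }
}

/* Unaligned 4-byte store: the model equivalent of
 * SWL rt, 3(addr) followed by SWR rt, 0(addr) on a little-endian system. */
static void sw_unaligned_le(uint8_t *mem, unsigned addr, uint32_t reg)
{
    unsigned high = addr + 3;
    swl_le(mem + (high & ~3u), reg, high & 3u);
    swr_le(mem + (addr & ~3u), reg, addr & 3u);
}
```

On a big-endian system the roles of the two offsets are swapped (SWL at offset 0, SWR at offset 3), as the BigEndianCPU = 1 columns of the tables indicate.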
SYNC Synchronize SYNC

  Bits     31-26            25-6                         5-0
  Field    SPECIAL 000000   0 0000 0000 0000 0000 0000   SYNC 001111
  Width    6                20                           6
Format:
SYNC
Description:
The SYNC instruction ensures that any loads and stores fetched prior to the present
instruction are completed before any loads or stores after this instruction are allowed to
start. Use of the SYNC instruction to serialize certain memory references may be required in
a multiprocessor environment for proper synchronization.
For example:

        Processor A                     Processor B
        SW    R1, DATA              1:  LW    R2, FLAG
        LI    R2, 1                     BEQ   R2, R0, 1B
        SYNC                            NOP
        SW    R2, FLAG                  SYNC
                                        LW    R1, DATA

The SYNC in processor A prevents DATA from being written after FLAG, which could cause
processor B to read stale data. The SYNC in processor B prevents DATA from being read
before FLAG, which could likewise result in reading stale data. For processors which only
execute loads and stores in order, with respect to shared memory, this instruction is a NOP.
LL and SC instructions implicitly perform a SYNC.
This instruction is allowed in User mode.
Operation:
32, 64  T:  SyncOperation()
Exceptions:
None
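The DATA/FLAG pattern in the SYNC example has a direct counterpart in portable C11 atomics. The sketch below is an analogy, not a statement about code generation: whether a given compiler lowers `atomic_thread_fence` to SYNC on this core is an assumption, and the names `shared_data`, `flag`, `producer` and `consumer` are hypothetical.

```c
#include <stdatomic.h>

static int shared_data;   /* plays the role of DATA */
static atomic_int flag;   /* plays the role of FLAG; starts at 0 */

/* Processor A: write DATA, fence, then publish FLAG. */
static void producer(int value)
{
    shared_data = value;
    atomic_thread_fence(memory_order_seq_cst);   /* SYNC position on A */
    atomic_store_explicit(&flag, 1, memory_order_relaxed);
}

/* Processor B: spin on FLAG, fence, then read DATA. */
static int consumer(void)
{
    while (atomic_load_explicit(&flag, memory_order_relaxed) == 0) {
        /* busy-wait until the producer publishes */
    }
    atomic_thread_fence(memory_order_seq_cst);   /* SYNC position on B */
    return shared_data;
}
```

Removing either fence reintroduces the reordering hazards described for processor A and processor B above.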
SYSCALL System Call SYSCALL

  Bits     31-26            25-6    5-0
  Field    SPECIAL 000000   Code    SYSCALL 001100
  Width    6                20      6
Format:
SYSCALL
Description:
A system call exception occurs, immediately and unconditionally transferring control to the
exception handler.
The code field is available for use as software parameters, but is retrieved by the exception
handler only by loading the contents of the memory word containing the instruction.
Operation:
32, 64  T:  SystemCallException
Exceptions:
System Call exception
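Once a handler has loaded the instruction word, extracting the code field is a shift and a mask. The following sketch assumes the word has already been fetched (how the handler locates it is outside this example); `is_syscall` and `syscall_code` are hypothetical names.

```c
#include <stdint.h>

/* True if the word is a SYSCALL: SPECIAL (000000) in bits 31..26
 * and function 001100 in bits 5..0. */
static int is_syscall(uint32_t insn)
{
    return (insn >> 26) == 0u && (insn & 0x3Fu) == 0x0Cu;
}

/* The 20-bit code field occupies bits 25..6. */
static uint32_t syscall_code(uint32_t insn)
{
    return (insn >> 6) & 0xFFFFFu;
}
```

The hardware does not interpret the code field, so its meaning is purely a software convention between the caller and the handler.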
TEQ Trap If Equal TEQ

  Bits     31-26            25-21   20-16   15-6    5-0
  Field    SPECIAL 000000   rs      rt      code    TEQ 110100
  Width    6                5       5       10      6
Format:
TEQ rs, rt
Description:
The contents of general register rt are compared to general register rs.
If the contents of general register rs are equal to the contents of general register rt, a trap
exception occurs.
The code field is available for use as software parameters, but is retrieved by the exception
handler only by loading the contents of the memory word containing the instruction.
Operation:
32, 64 T: if GPR[rs] = GPR[rt] then
TrapException
endif
Exceptions:
Trap exception
TEQI    Trap If Equal Immediate    TEQI
 31..26            25..21   20..16         15..0
 REGIMM (000001)   rs       TEQI (01100)   immediate
 6 bits            5 bits   5 bits         16 bits
Format:
TEQI rs, immediate
Description:
The 16-bit immediate is sign-extended and compared to the contents of general register rs.
If the contents of general register rs are equal to the sign-extended immediate, a trap
exception occurs.
Operation:
32 T: if GPR[rs] = (immediate15)^16 || immediate15..0 then
TrapException
endif
64 T: if GPR[rs] = (immediate15)^48 || immediate15..0 then
TrapException
endif
Exceptions:
Trap exception
TGE    Trap If Greater Than Or Equal    TGE
 31..26             25..21   20..16   15..6     5..0
 SPECIAL (000000)   rs       rt       code      TGE (110000)
 6 bits             5 bits   5 bits   10 bits   6 bits
Format:
TGE rs, rt
Description:
The contents of general register rt are compared to the contents of general register rs.
Considering both quantities as signed integers, if the contents of general register rs are
greater than or equal to the contents of general register rt, a trap exception occurs.
The code field is available for use as software parameters, but is retrieved by the exception
handler only by loading the contents of the memory word containing the instruction.
Operation:
32, 64 T: if GPR[rs] ≥ GPR[rt] then
TrapException
endif
Exceptions:
Trap exception
TGEI    Trap If Greater Than Or Equal Immediate    TGEI
 31..26            25..21   20..16         15..0
 REGIMM (000001)   rs       TGEI (01000)   immediate
 6 bits            5 bits   5 bits         16 bits
Format:
TGEI rs, immediate
Description:
The 16-bit immediate is sign-extended and compared to the contents of general register rs.
Considering both quantities as signed integers, if the contents of general register rs are
greater than or equal to the sign-extended immediate, a trap exception occurs.
Operation:
32 T: if GPR[rs] ≥ (immediate15)^16 || immediate15..0 then
TrapException
endif
64 T: if GPR[rs] ≥ (immediate15)^48 || immediate15..0 then
TrapException
endif
Exceptions:
Trap exception
TGEIU    Trap If Greater Than Or Equal Immediate Unsigned    TGEIU
 31..26            25..21   20..16          15..0
 REGIMM (000001)   rs       TGEIU (01001)   immediate
 6 bits            5 bits   5 bits          16 bits
Format:
TGEIU rs, immediate
Description:
The 16-bit immediate is sign-extended and compared to the contents of general register rs.
Considering both quantities as unsigned integers, if the contents of general register rs are
greater than or equal to the sign-extended immediate, a trap exception occurs.
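The combination of sign extension and unsigned comparison is easy to misread, so the following Python sketch models it. It is an illustration under stated assumptions, not the processor's implementation: the function names are hypothetical, and register values are represented as non-negative Python integers of the given width.

```python
def sign_extend16(imm: int, width: int = 64) -> int:
    """Sign-extend a 16-bit immediate and return it as an unsigned width-bit pattern."""
    imm &= 0xFFFF
    if imm & 0x8000:
        # Replicate bit 15 into every higher bit position.
        imm |= ((1 << width) - 1) & ~0xFFFF
    return imm


def tgeiu_traps(rs_value: int, imm: int, width: int = 64) -> bool:
    """True if TGEIU would trap: rs >= sign-extended immediate, both taken as unsigned."""
    mask = (1 << width) - 1
    return (rs_value & mask) >= sign_extend16(imm, width)
```

Because the immediate is sign-extended before the unsigned comparison, an immediate of 0xFFFF stands for the largest unsigned value of the register width rather than 65535, so small register values do not trap against it.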
Operation:
32 T: if (0 GPR[rs]) (0 (immediate15)16 immediate150) then
TrapException
endif
64 T: if (0 GPR[rs]) (0 (immediate15)48 immediate150) then
TrapException
endif
Exceptions:
Trap exception
TGEU    Trap If Greater Than Or Equal Unsigned    TGEU
 31..26             25..21   20..16   15..6     5..0
 SPECIAL (000000)   rs       rt       code      TGEU (110001)
 6 bits             5 bits   5 bits   10 bits   6 bits
Format:
TGEU rs, rt
Description:
The contents of general register rt are compared to the contents of general register rs.
Considering both quantities as unsigned integers, if the contents of general register rs are
greater than or equal to the contents of general register rt, a trap exception occurs.
The code field is available for use as software parameters, but is retrieved by the exception
handler only by loading the contents of the memory word containing the instruction.
Operation:
32, 64 T: if (0 GPR[rs]) (0GPR[rt]) then
TrapException
endif
Exceptions:
Trap exception
TLBP    Probe TLB For Matching Entry    TLBP
 31..26          25       24..6                         5..0
 COP0 (010000)   CO (1)   0 (000 0000 0000 0000 0000)   TLBP (001000)
 6 bits          1 bit    19 bits                       6 bits
Format:
TLBP
Description:
The Index register is loaded with the address of the TLB entry whose contents match the
contents of the EntryHi register. If no TLB entry matches, the high-order bit of the Index
register is set.
The architecture does not specify the operation of memory references associated with the
instruction immediately after a TLBP instruction, nor is the operation specified if more than
one TLB entry matches.
Operation:
32 T: Index 1 025Undeficed6
for i in 0TLBEntries-1
if (TLB[i]9577 = EntryHi3112) and (TLB[i]76 or
(TLB[i]7164 = EntryHi70)) then
Index 026 i50
endif
endfor
64 T: Index 1 025Undeficed6
for i in 0TLBEntries-1
if (TLB[i]167141 and not (015 TLB[i]216205))
= (EntryHi3913 and not (015 TLB[i]216205)) and
(TLB[i]140 or (TLB[i]135128 = EntryHi70)) t hen
Index 026 i50
endif
endfor
Exceptions:
Coprocessor unus able exception
TLBR    Read Indexed TLB Entry    TLBR
 31..26          25       24..6                         5..0
 COP0 (010000)   CO (1)   0 (000 0000 0000 0000 0000)   TLBR (000001)
 6 bits          1 bit    19 bits                       6 bits
Format:
TLBR
Description:
The G bit (controls ASID matching) read from the TLB is written into both EntryLo0 and
EntryLo1.
The EntryHi and EntryLo registers are loaded with the contents of the TLB entry pointed
at by the contents of the TLB Index register. The operation is invalid (and the results are
unspecified) if the contents of the TLB Index register are greater than the number of TLB
entries in the processor.
Operation:
32 T: PageMask ← TLB[Index5..0]127..96
EntryHi ← TLB[Index5..0]95..64 and not TLB[Index5..0]127..96
EntryLo1 ← TLB[Index5..0]63..32
EntryLo0 ← TLB[Index5..0]31..0
64 T: PageMask ← TLB[Index5..0]255..192
EntryHi ← TLB[Index5..0]191..128 and not TLB[Index5..0]255..192
EntryLo1 ← TLB[Index5..0]127..65 || TLB[Index5..0]140
EntryLo0 ← TLB[Index5..0]63..1 || TLB[Index5..0]140
Exceptions:
Coprocessor unusable exception
TLBWI    Write Indexed TLB Entry    TLBWI
 31..26          25       24..6                         5..0
 COP0 (010000)   CO (1)   0 (000 0000 0000 0000 0000)   TLBWI (000010)
 6 bits          1 bit    19 bits                       6 bits
Format:
TLBWI
Description:
The G bit of the TLB is written with the logical AND of the G bits in EntryLo0 and
EntryLo1.
The TLB entry pointed at by the contents of the TLB Index register is loaded with the
contents of the EntryHi and EntryLo registers.
The operation is invalid (and the results are unspecified) if the contents of the TLB Index
register are greater than the number of TLB entries in the processor.
Operation:
32, 64 T: TLB[Index5..0] ←
PageMask || (EntryHi and not PageMask) || EntryLo1 || EntryLo0
Exceptions:
Coprocessor unusable exception
TLBWR    Write Random TLB Entry    TLBWR
 31..26          25       24..6                         5..0
 COP0 (010000)   CO (1)   0 (000 0000 0000 0000 0000)   TLBWR (000110)
 6 bits          1 bit    19 bits                       6 bits
Format:
TLBWR
Description:
The G bit of the TLB is written with the logical AND of the G bits in EntryLo0 and
EntryLo1.
The TLB entry pointed at by the contents of the TLB Random register is loaded with the
contents of the EntryHi and EntryLo registers.
Operation:
32, 64 T: TLB[Random5..0] ←
PageMask || (EntryHi and not PageMask) || EntryLo1 || EntryLo0
Exceptions:
Coprocessor unusable exception
TLT    Trap If Less Than    TLT
 31..26             25..21   20..16   15..6     5..0
 SPECIAL (000000)   rs       rt       code      TLT (110010)
 6 bits             5 bits   5 bits   10 bits   6 bits
Format:
TLT rs, rt
Description:
The contents of general register rt are compared to the contents of general register rs.
Considering both quantities as signed integers, if the contents of general register rs are
less than the contents of general register rt, a trap exception occurs.
The code field is available for use as software parameters, but is retrieved by the exception
handler only by loading the contents of the memory word containing the instruction.
Operation:
32, 64 T: if GPR[rs] < GPR[rt] then
TrapException
endif
Exceptions:
Trap exception
TLTI    Trap If Less Than Immediate    TLTI
 31..26            25..21   20..16         15..0
 REGIMM (000001)   rs       TLTI (01010)   immediate
 6 bits            5 bits   5 bits         16 bits
Format:
TLTI rs, immediate
Description:
The 16-bit immediate is sign-extended and compared to the contents of general register rs.
Considering both quantities as signed integers, if the contents of general register rs are less
than the sign-extended immediate, a trap exception occurs.
Operation:
32 T: if GPR[rs] < (immediate15)^16 || immediate15..0 then
TrapException
endif
64 T: if GPR[rs] < (immediate15)^48 || immediate15..0 then
TrapException
endif
Exceptions:
Trap exception
TLTIU    Trap If Less Than Immediate Unsigned    TLTIU
 31..26            25..21   20..16          15..0
 REGIMM (000001)   rs       TLTIU (01011)   immediate
 6 bits            5 bits   5 bits          16 bits
Format:
TLTIU rs, immediate
Description:
The 16-bit immediate is sign-extended and compared to the contents of general register rs.
Considering both quantities as unsigned integers, if the contents of general register rs are less
than the sign-extended immediate, a trap exception occurs.
Operation:
32 T: if (0 GPR[rs]) < (0 (immediate15)16 immediate150) then
TrapException
endif
64 T: if (0 GPR[rs]) < (0 (immediate15)48 immediate150) then
TrapException
endif
Exceptions:
Trap exception
TLTU    Trap If Less Than Unsigned    TLTU
 31..26             25..21   20..16   15..6     5..0
 SPECIAL (000000)   rs       rt       code      TLTU (110011)
 6 bits             5 bits   5 bits   10 bits   6 bits
Format:
TLTU rs, rt
Description:
The contents of general register rt are compared to general register rs. Considering both
quantities as unsigned integers, if the contents of general register rs are less than the
contents of general register rt, a trap exception occurs.
The code field is available for use as software parameters, but is retrieved by the exception
handler only by loading the contents of the memory word containing the instruction.
Operation:
32, 64 T: if (0 || GPR[rs]) < (0 || GPR[rt]) then
TrapException
endif
Exceptions:
Trap exception
TNE    Trap If Not Equal    TNE
 31..26             25..21   20..16   15..6     5..0
 SPECIAL (000000)   rs       rt       code      TNE (110110)
 6 bits             5 bits   5 bits   10 bits   6 bits
Format:
TNE rs, rt
Description:
The contents of general register rt are compared to the contents of general register rs. If the
contents of general register rs are not equal to the contents of general register rt, a trap
exception occurs.
The code field is available for use as software parameters, but is retrieved by the exception
handler only by loading the contents of the memory word containing the instruction.
Operation:
32, 64 T: if GPR[rs] ≠ GPR[rt] then
TrapException
endif
Exceptions:
Trap exception
TNEI    Trap If Not Equal Immediate    TNEI
 31..26            25..21   20..16         15..0
 REGIMM (000001)   rs       TNEI (01110)   immediate
 6 bits            5 bits   5 bits         16 bits
Format:
TNEI rs, immediate
Description:
The 16-bit immediate is sign-extended and compared to the contents of general register rs.
If the contents of general register rs are not equal to the sign-extended immediate, a trap
exception occurs.
Operation:
32 T: if GPR[rs] ≠ (immediate15)^16 || immediate15..0 then
TrapException
endif
64 T: if GPR[rs] ≠ (immediate15)^48 || immediate15..0 then
TrapException
endif
Exceptions:
Trap exception
WAIT    Wait    WAIT
 31..26          25       24..6                         5..0
 COP0 (010000)   CO (1)   0 (000 0000 0000 0000 0000)   WAIT (100000)
 6 bits          1 bit    19 bits                       6 bits
Format:
WAIT
Description:
The WAIT instruction is used to halt the internal pipeline and thus reduce the power
consumption of the CPU. See Chapter 16.
Operation:
32, 64 T: if G-bus is idle then
StopPipeline
endif
Exceptions:
Coprocessor unusable exception
XOR    Exclusive Or    XOR
 31..26             25..21   20..16   15..11   10..6       5..0
 SPECIAL (000000)   rs       rt       rd       0 (00000)   XOR (100110)
 6 bits             5 bits   5 bits   5 bits   5 bits      6 bits
Format:
XOR rd, rs, rt
Description:
The contents of general register rs are combined with the contents of general register rt in
a bit-wise logical exclusive OR operation. The result is placed into general register rd.
Operation:
32, 64 T: GPR[rd] ← GPR[rs] xor GPR[rt]
Exceptions:
None
XORI    Exclusive OR Immediate    XORI
 31..26          25..21   20..16   15..0
 XORI (001110)   rs       rt       immediate
 6 bits          5 bits   5 bits   16 bits
Format:
XORI rt, rs, immediate
Description:
The 16-bit immediate is zero-extended and combined with the contents of general register
rs in a bit-wise logical exclusive OR operation. The result is placed into general register rt.
Operation:
32 T: GPR[rt] ← GPR[rs] xor (0^16 || immediate)
64 T: GPR[rt] ← GPR[rs] xor (0^48 || immediate)
Exceptions:
None
A.7 Bit Encoding of CPU Instruction OPcodes
Table A-4 shows the bit encoding for all TX49 CPU instructions (ISA and extended ISA).
Table A-4 CPU Operation Code Bit Encoding
OPcode (instruction bits 31..26; rows are bits 31..29, columns are bits 28..26)

      0           1           2          3           4         5          6          7
0     SPECIAL λ   REGIMM λ    J          JAL         BEQ       BNE        BLEZ       BGTZ
1     ADDI        ADDIU       SLTI       SLTIU       ANDI      ORI        XORI       LUI
2     COP0 α      COP1 α      COP2 α     COP3 α θ    BEQL      BNEL       BLEZL      BGTZL
3     DADDI ε     DADDIU ε    LDL ε      LDR ε       MAC λ     *          *          *
4     LB          LH          LWL        LW          LBU       LHU        LWR        LWU ε
5     SB          SH          SWL        SW          SDL ε     SDR ε      SWR        CACHE
6     LL          LWC1 α      LWC2 α     PREF        LLD ε     LDC1 α     LDC2 α     LD ε
7     SC          SWC1 α      SWC2 α     *           SCD ε     SDC1 α     SDC2 α     SD ε

SPECIAL Function (OPcode = SPECIAL; function is bits 5..0; rows are bits 5..3, columns are bits 2..0)

      0           1           2          3           4         5          6          7
0     SLL         *           SRL        SRA         SLLV      *          SRLV       SRAV
1     JR          JALR        *          *           SYSCALL   BREAK      SDBBP      SYNC
2     MFHI        MTHI        MFLO       MTLO        DSLLV ε   *          DSRLV ε    DSRAV ε
3     MULT        MULTU       DIV        DIVU        DMULT ε   DMULTU ε   DDIV ε     DDIVU ε
4     ADD         ADDU        SUB        SUBU        AND       OR         XOR        NOR
5     *           *           SLT        SLTU        DADD ε    DADDU ε    DSUB ε     DSUBU ε
6     TGE         TGEU        TLT        TLTU        TEQ       *          TNE        *
7     DSLL ε      *           DSRL ε     DSRA ε      DSLL32 ε  *          DSRL32 ε   DSRA32 ε
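REGIMM rt (OPcode = REGIMM; rt is bits 20..16; rows are bits 20..19, columns are bits 18..16)

      0        1        2         3         4      5      6      7
0     BLTZ     BGEZ     BLTZL     BGEZL     *      *      *      *
1     TGEI     TGEIU    TLTI      TLTIU     TEQI   *      TNEI   *
2     BLTZAL   BGEZAL   BLTZALL   BGEZALL   *      *      *      *
3     *        *        *         *         *      *      *      *

COPz rs (OPcode = COPz; rs is bits 25..21; rows are bits 25..24, columns are bits 23..21)

      0      1       2      3      4      5       6      7
0     MF     DMF ε   CF     γ      MT     DMT ε   CT     γ
1     BC     γ       γ      γ      γ      γ       γ      γ
2     CO (all columns)
3     CO (all columns)

COPz rt (OPcode = COPz; rt is bits 20..16; rows are bits 20..19, columns are bits 18..16)

      0      1      2      3      4      5      6      7
0     BCF    BCT    BCFL   BCTL   γ      γ      γ      γ
1     γ      γ      γ      γ      γ      γ      γ      γ
2     γ      γ      γ      γ      γ      γ      γ      γ
3     γ      γ      γ      γ      γ      γ      γ      γ

COP0 Function (OPcode = COP0; function is bits 5..0; rows are bits 5..3, columns are bits 2..0)

      0      1      2       3      4      5      6       7
0     φ      TLBR   TLBWI   φ      φ      φ      TLBWR   φ
1     TLBP   φ      φ       φ      φ      φ      φ       φ
2     φ      φ      φ       φ      φ      φ      φ       φ
3     ERET   φ      φ       φ      φ      φ      φ       DERET
4     WAIT   φ      φ       φ      φ      φ      φ       φ
5     φ      φ      φ       φ      φ      φ      φ       φ
6     φ      φ      φ       φ      φ      φ      φ       φ
7     φ      φ      φ       φ      φ      φ      φ       φ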
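MAC Function (OPcode = MAC; function is bits 5..0; rows are bits 5..3, columns are bits 2..0)

      0      1       2      3      4      5      6      7
0     MADD   MADDU   γ      γ      γ      γ      γ      γ
1     γ      γ       γ      γ      γ      γ      γ      γ
2     γ      γ       γ      γ      γ      γ      γ      γ
3     γ      γ       γ      γ      γ      γ      γ      γ
4     γ      γ       γ      γ      γ      γ      γ      γ
5     γ      γ       γ      γ      γ      γ      γ      γ
6     γ      γ       γ      γ      γ      γ      γ      γ
7     γ      γ       γ      γ      γ      γ      γ      γ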
Key:
*: This opcode is reserved for future use. An attempt to execute it causes a Reserved
Instruction exception.
γ: This opcode is reserved for future use. An attempt to execute it causes a Reserved
Instruction exception.
λ: This opcode indicates an instruction class. The instruction word must be further decoded
by examining additional tables that show the values for another instruction field.
α: This opcode is a coprocessor operation, not a CPU operation. If the processor state does
not allow access to the specified coprocessor, the instruction causes a Coprocessor
Unusable exception. It is included in the table because it uses a primary opcode in the
instruction encoding map.
φ: This opcode is reserved for future use, but does not cause a Reserved Instruction exception
in TX49 implementations. It is treated as “NOP”.
θ: This opcode is valid only when BC is selected in COPz rs; otherwise, it causes a
Reserved Instruction exception.
ε: This opcode is valid when the processor is operating either in Kernel mode or in 64-bit
non-Kernel (User or Supervisor) mode; otherwise, it causes a Reserved Instruction
exception.
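The row and column numbering used by the encoding tables can be derived from an instruction word as sketched below in Python. This is only an illustration of how the tables are indexed; the function names are hypothetical.

```python
def primary_opcode_cell(word: int) -> tuple:
    """Return (row, column) in the OPcode table: bits 31..29 and bits 28..26."""
    op = (word >> 26) & 0x3F
    return op >> 3, op & 0x7


def function_cell(word: int) -> tuple:
    """Return (row, column) in a function table: bits 5..3 and bits 2..0."""
    funct = word & 0x3F
    return funct >> 3, funct & 0x7
```

For example, XORI has the primary opcode 001110, which selects row 1, column 6 of the OPcode table, and SYSCALL has the function code 001100, which selects row 1, column 4 of the SPECIAL Function table.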
Appendix B: FPU Instruction Set Details
This appendix provides a detailed description of the operation of each Floating-Point (FPU)
instruction. The instructions are listed alphabetically. The exceptions that may occur due to the
execution of each instruction are listed after the description of each instruction. The description
of the immediate causes and the manner of handling exceptions is omitted from the instruction
descriptions in this chapter. Refer to Chapter 6 for detailed descriptions of floating-point
exceptions and handling.
Table B-5 lists the entire bit encoding for the constant fields of the Floating-Point instruction
set; the bit encoding for each instruction is included with that individual instruction.
B.1 Instruction Formats
There are three basic instruction format types:
    I-Type, or Immediate instructions, which include load and store operations
    M-Type, or Move instructions
    R-Type, or Register instructions, which include the two- and three-register
    Floating-Point operations.
Branch instructions and Move instructions
The instruction description subsections that follow show how the three basic instruction
formats are used by:
Load and store instructions,
Move instructions, and
Floating-Point Computational instructions.
Floating-point instructions are mapped onto the MIPS coprocessor instructions, defining
coprocessor unit number one (CP1) as the floating-point unit.
Each operation is valid only for certain formats. Implementations may support some of
these formats and operations only through emulation, but only need support combinations
that are valid, which are marked with a V in Table B-1 below. Those combinations marked
with a “R” are not currently specified by this architecture, causing an unimplemented
instruction trap, to maintain compatibility with future architecture extensions.
Table B-1 Valid FPU Instruction Formats
              Source Format
Operation     Single   Double   Word   Longword
ADD           V        V        R      R
SUB           V        V        R      R
MUL           V        V        R      R
DIV           V        V        R      R
SQRT          V        V        R      R
ABS           V        V        R      R
MOV           V        V
NEG           V        V        R      R
TRUNC.L       V        V
ROUND.L       V        V
CEIL.L        V        V
FLOOR.L       V        V
TRUNC.W       V        V
ROUND.W       V        V
CEIL.W        V        V
FLOOR.W       V        V
CVT.S                  V        V      V
CVT.D         V                 V      V
CVT.W         V        V
CVT.L         V        V
C             V        V        R      R
The coprocessor branch on condition true/false instructions can be used to logically negate
any predicate. Thus, the 32 possible conditions require only 16 distinct comparisons, as shown
in Table B-2 below.
Table B-2 Logical Negation of Predicates by Condition True/False
Condition                Relations                                       Invalid Operation
Mnemonic                                                                 exception if
True     False    Code   Greater Than   Less Than   Equal   Unordered    unordered
F        T        0      F              F           F       F            No
UN       OR       1      F              F           F       T            No
EQ       NEQ      2      F              F           T       F            No
UEQ      OGL      3      F              F           T       T            No
OLT      UGE      4      F              T           F       F            No
ULT      OGE      5      F              T           F       T            No
OLE      UGT      6      F              T           T       F            No
ULE      OGT      7      F              T           T       T            No
SF       ST       8      F              F           F       F            Yes
NGLE     GLE      9      F              F           F       T            Yes
SEQ      SNE      10     F              F           T       F            Yes
NGL      GL       11     F              F           T       T            Yes
LT       NLT      12     F              T           F       F            Yes
NGE      GE       13     F              T           F       T            Yes
LE       NLE      14     F              T           T       F            Yes
NGT      GT       15     F              T           T       T            Yes
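The pattern in Table B-2 can be summarized as follows: in each row, the Less Than, Equal, and Unordered columns follow bits 2, 1, and 0 of the condition code, and bit 3 selects whether an unordered result raises an Invalid Operation exception. The Python sketch below models that reading of the table; the function names are hypothetical, and the code is an illustration rather than a statement of how the FPU is implemented.

```python
def fp_condition(code: int, less: bool, equal: bool, unordered: bool) -> bool:
    """Evaluate the 'True' predicate for a 4-bit condition code from Table B-2."""
    return bool(((code & 0b100) and less) or
                ((code & 0b010) and equal) or
                ((code & 0b001) and unordered))


def raises_invalid(code: int, unordered: bool) -> bool:
    """Codes 8..15 signal Invalid Operation when the operands are unordered."""
    return bool(code & 0b1000) and unordered


def branch_taken(code: int, less: bool, equal: bool, unordered: bool,
                 on_true: bool) -> bool:
    """Branch on condition true uses the predicate; branch on false uses its negation."""
    result = fp_condition(code, less, equal, unordered)
    return result if on_true else not result
```

Branching on false after a compare with code 4 (OLT) therefore behaves as the UGE predicate, which is how 16 comparisons cover all 32 conditions.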
B.1.1 Floating-Point Loads, Stores, and Moves
All movement of data between the floating-point coprocessor and memory is
accomplished by coprocessor load and store operations, which reference the floating-point
coprocessor’s General-Purpose Registers. These operations are unformatted; no format
conversions are performed and, therefore, no floating-point exceptions occur due to these
operations.
Data may also be directly moved between the floating-point coprocessor and the
processor by move to coprocessor and move from coprocessor instructions. Like the
floating-point load and store operations, move to/from operations perform no format
conversions and never cause floating-point exceptions.
An additional pair of coprocessor registers, called Floating-Point Control registers, is
available. The only data movement operations supported for them are moves to and from
processor General-Purpose Registers.
B.1.2 Floating-Point Operations
The floating-point unit’s operation set includes floating-point add, subtract, multiply,
divide, square root, convert between fixed-point and floating-point format, convert
between floating-point formats, and floating-point compare. These operations satisfy
IEEE Standard 754’s requirements for accuracy. Specifically, these operations obtain a
result which is identical to computing the result with infinite precision and then
rounding to the specified format, using the current rounding mode.
Instructions must specify the format of their operands. Except for conversion
functions, mixed-format operations are not provided.
B.2 Instruction Notational Conventions
In this appendix, all variable sub fields in an instruction format (such as fs, ft, immediate,
and so on) are shown with lower-case names. The instruction name (such as ADD, SUB, and
so on) is shown in upper-case.
For the sake of clarity, an alias is sometimes substituted for a variable subfield in the
formats of specific instructions. For example, we use rs = base in the format for load and store
instructions. Such an alias is always lower case, since it refers to a variable subfield.
In some instructions, however, the two instruction subfields op and function have constant
6-bit values. When reference is made to these instructions, upper-case mnemonics are used.
In the floating-point instruction, for example, we use op = COP1 and function = FADD. In
some cases, a single field has both fixed and variable subfields, so the name contains both
upper and lower case characters. Actual bit encoding for mnemonics is shown in Figure B-5 at
the end of this appendix, and are also included with each individual instruction.
In the instruction description examples that follow, the Operation section describes the
operation performed by each instruction using a high-level language notation.
B.2.1 Instruction Notation Examples
Example #1:
GPR[ft] ← immediate || 0^16
Sixteen zero bits are concatenated with an immediate value (typically 16 bits), and
the 32-bit string (with the lower 16 bits set to zero) is assigned to GPR register ft.
Example #2:
(immediate15)^16 || immediate15..0
Bit 15 (the sign bit) of an immediate value is extended for 16 bit positions, and the
result is concatenated with bits 15 through 0 of the immediate value to form a 32-bit
sign-extended value.
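The two notation examples can be checked with a small Python model of concatenation and bit replication. This is a sketch for illustration only; the helper names `concat` and `replicate` are hypothetical and are not part of the manual's notation.

```python
def replicate(bit: int, count: int) -> int:
    """Return 'count' copies of a single bit as an integer (bit replication)."""
    return ((1 << count) - 1) if (bit & 1) else 0


def concat(*fields) -> int:
    """Concatenate (value, width) pairs, leftmost pair in the most significant bits."""
    result = 0
    for value, width in fields:
        result = (result << width) | (value & ((1 << width) - 1))
    return result


def example1(immediate: int) -> int:
    """Example #1: the immediate followed by sixteen zero bits."""
    return concat((immediate, 16), (0, 16))


def example2(immediate: int) -> int:
    """Example #2: bit 15 replicated 16 times, followed by bits 15..0."""
    sign = (immediate >> 15) & 1
    return concat((replicate(sign, 16), 16), (immediate, 16))
```

With an immediate of 0x8001, Example #1 yields 0x80010000 and Example #2 yields the sign-extended value 0xFFFF8001.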
B.3 Load and Store Instructions
In the MIPS ISA, all load operations have a delay of at least one instruction. That is,
the instruction immediately following a load cannot use the contents of the register that
will be loaded with the data being fetched from storage.
In the TX49, the instruction immediately following a load may use the contents of the
register loaded. In such cases, the hardware will interlock, requiring additional real
cycles, so scheduling load delay slots is still desirable, although not absolutely required for
functional code.
When the FR bit in the Status register equals zero, the Floating-Point General Registers
(FGR) are 32-bits wide. When the FR bit in the Status register equals one, the Floating-
Point General Registers (FGR) are 64-bits wide. The behavior of the load and store
instructions is dependent on the width of the FGRs.
In the load/store operation descriptions, the functions listed in Table B-3 are used to
summarize the handling of virtual addresses and physical memory.
Table B-3 Load/Store Common Functions
Function             Meaning
AddressTranslation   Uses the TLB to find the physical address given the virtual address. The
                     function fails and an exception is taken if the required translation is
                     not present in the TLB.
LoadMemory           Uses the cache and main memory to find the contents of the word containing
                     the specified physical address. The low-order two bits of the address and
                     the access type field indicate which of the four bytes within the data
                     word need to be returned. If the cache is enabled for this access, the
                     entire word is returned and loaded into the cache.
StoreMemory          Uses the cache, write buffer and main memory to store the word or part of
                     word specified as data in the word containing the specified physical
                     address. The low-order two bits of the address and the access type field
                     indicate which of the four bytes within the data word should be stored.
Figure B-1 shows the I-Type instruction format used by load and store operations.
I-Type (Immediate)
 31..26   25..21   20..16   15..0
 op       base     ft       offset
 6 bits   5 bits   5 bits   16 bits
where:
op       is a 6-bit operation code
base     is the 5-bit base register specifier
ft       is a 5-bit source (for stores) or destination (for loads)
         FPA register specifier
offset   is the 16-bit signed immediate offset
Figure B-1 Load and Store Instruction Format
All coprocessor loads and stores reference aligned word data items. Thus, for word loads
and stores, the access type field is always WORD, and the low-order two bits of the address
must always be zero.
For double word loads and stores, the access type field is always DOUBLEWORD, and the
low-order three bits of the address must always be zero.
Regardless of byte-numbering order (endianness), the address specifies that byte which has
the smallest byte-address of all of the bytes in the addressed field. For a Big-endian machine,
this is the leftmost byte; for a Little-endian machine, this is the rightmost byte.
B.4 Computational Instructions
Computational instructions include all of the arithmetic floating-point operations performed
by the FPU. Figure B-2 shows the R-Type instruction format used for computational
operations.
R-Type (Register)
 31..26   25..21   20..16   15..11   10..6    5..0
 COP1     fmt      ft       fs       fd       function
 6 bits   5 bits   5 bits   5 bits   5 bits   6 bits
where:
COP1       is a 6-bit major operation code
fmt        is a 5-bit format specifier
fs         is a 5-bit source1 register
ft         is a 5-bit source2 register
fd         is a 5-bit destination register
function   is a 6-bit function field
Figure B-2 Computational Instruction Format
Each floating-point instruction can be applied to a number of operand formats. The operand
format for an instruction is specified by the 5-bit Format field (fmt); decoding for this field is
shown in Table B-4.
Table B-4 Format Field Decoding
Code    Mnemonic   Size       Format
16      S          single     Binary floating-point
17      D          double     Binary floating-point
18      -          -          Reserved
19      -          -          Reserved
20      W          single     Binary fixed-point
21      L          longword   64-bit binary fixed-point
22-31   -          -          Reserved
The function indicates which floating-point operation is to be performed. Table B-5 lists all
floating-point instructions.
Table B-5 Floating-Point Instructions and Operations
Code (5..0)   Mnemonic   Operation
0             ADD        Add
1             SUB        Subtract
2             MUL        Multiply
3             DIV        Divide
4             SQRT       Square root
5             ABS        Absolute value
6             MOV        Move
7             NEG        Negate
8             ROUND.L    Convert to 64-bit fixed-point, rounded to nearest/even
9             TRUNC.L    Convert to 64-bit fixed-point, rounded toward zero
10            CEIL.L     Convert to 64-bit fixed-point, rounded to +∞
11            FLOOR.L    Convert to 64-bit fixed-point, rounded to −∞
12            ROUND.W    Convert to single fixed-point, rounded to nearest/even
13            TRUNC.W    Convert to single fixed-point, rounded toward zero
14            CEIL.W     Convert to single fixed-point, rounded to +∞
15            FLOOR.W    Convert to single fixed-point, rounded to −∞
16-31         -          Reserved
32            CVT.S      Convert to single floating-point
33            CVT.D      Convert to double floating-point
34            -          Reserved
35            -          Reserved
36            CVT.W      Convert to binary fixed-point
37            CVT.L      Convert to 64-bit binary fixed-point
38-47         -          Reserved
48-63         C          Floating-point compare
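To show how the R-Type fields, Table B-4, and Table B-5 fit together, the following Python sketch splits a COP1 computational instruction word into its fields and looks up the mnemonics. It is an illustrative decoder only: the function name and the dictionaries are hypothetical, and reserved encodings are simply rejected with an error rather than modeled as exceptions.

```python
COP1 = 0b010001

# Format field decoding (Table B-4); reserved codes are omitted.
FORMATS = {16: "S", 17: "D", 20: "W", 21: "L"}

# Function field decoding (Table B-5); reserved codes are omitted.
FUNCTIONS = {
    0: "ADD", 1: "SUB", 2: "MUL", 3: "DIV", 4: "SQRT", 5: "ABS", 6: "MOV",
    7: "NEG", 8: "ROUND.L", 9: "TRUNC.L", 10: "CEIL.L", 11: "FLOOR.L",
    12: "ROUND.W", 13: "TRUNC.W", 14: "CEIL.W", 15: "FLOOR.W",
    32: "CVT.S", 33: "CVT.D", 36: "CVT.W", 37: "CVT.L",
}


def decode_cop1(word: int) -> tuple:
    """Return (operation, format, fd, fs, ft) for an R-Type COP1 instruction word."""
    if ((word >> 26) & 0x3F) != COP1:
        raise ValueError("not a COP1 instruction")
    fmt = (word >> 21) & 0x1F
    ft = (word >> 16) & 0x1F
    fs = (word >> 11) & 0x1F
    fd = (word >> 6) & 0x1F
    function = word & 0x3F
    if fmt not in FORMATS:
        raise ValueError("reserved or non-computational format code")
    if 48 <= function <= 63:
        operation = "C"  # floating-point compare
    elif function in FUNCTIONS:
        operation = FUNCTIONS[function]
    else:
        raise ValueError("reserved function code")
    return operation, FORMATS[fmt], fd, fs, ft
```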
In the following pages, the notation FGR refers to the FPU’s 32 General-Purpose Registers
FGR0 through FGR31, and FPR refers to the FPU’s Floating-Point Registers. When the FR
bit in the Status register (SR26) equals zero, only the even Floating-Point Registers are valid
and the FPU’s 32 General-Purpose Registers are 32-bits wide. When the FR bit in the Status
register (SR26) equals one, both odd and even Floating-Point Registers may be used and the
FPU’s 32 General-Purpose Registers are 64-bits wide.
The following routines are used in the description of the floating-point operations to get the
value of an FPR or to change the value of an FGR:
32 Bit Mode
value ← ValueFPR (fpr, fmt)
    /* undefined for odd fpr */
    case fmt of
        S, W: value ← FGR[fpr + 0]
        D:    /* undefined for fpr not even */
              value ← FGR[fpr + 1] || FGR[fpr + 0]
    end
StoreFPR (fpr, fmt, value):
    /* undefined for odd fpr */
    case fmt of
        S, W: FGR[fpr + 1] ← undefined
              FGR[fpr + 0] ← value
        D:    FGR[fpr + 1] ← value63..32
              FGR[fpr + 0] ← value31..0
    end
64 Bit Mode
value ← ValueFPR (fpr, fmt)
    case fmt of
        S:    value ← FGR[fpr]31..0
        D, L: value ← FGR[fpr]
        W:    value ← FGR[fpr]
    end
StoreFPR (fpr, fmt, value):
    case fmt of
        S, W: FGR[fpr] ← undefined^32 || value
        D, L: FGR[fpr] ← value
    end
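The ValueFPR and StoreFPR routines can be modeled in Python as sketched below. This is a hedged illustration, not the hardware behavior: the class name and method names are hypothetical, cases the routines leave undefined are reported as errors, and bits the routines mark as "undefined" are either left unchanged (32-bit mode) or written as zero (64-bit mode) purely as a modeling choice.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


class FPURegisters:
    """Toy model of the 32 FGRs; fr mirrors the FR bit in the Status register."""

    def __init__(self, fr: int):
        self.fr = fr
        self.fgr = [0] * 32

    def value_fpr(self, fpr: int, fmt: str) -> int:
        if self.fr == 0:
            if fpr & 1:
                raise ValueError("undefined for odd fpr when FR = 0")
            if fmt in ("S", "W"):
                return self.fgr[fpr] & MASK32
            if fmt == "D":
                # Odd register holds the upper word, even register the lower.
                return ((self.fgr[fpr + 1] & MASK32) << 32) | (self.fgr[fpr] & MASK32)
        else:
            if fmt == "S":
                return self.fgr[fpr] & MASK32
            if fmt in ("D", "L", "W"):
                return self.fgr[fpr] & MASK64
        raise ValueError("format not covered by the routine")

    def store_fpr(self, fpr: int, fmt: str, value: int) -> None:
        if self.fr == 0:
            if fpr & 1:
                raise ValueError("undefined for odd fpr when FR = 0")
            if fmt in ("S", "W"):
                self.fgr[fpr] = value & MASK32       # FGR[fpr + 1] becomes undefined
            elif fmt == "D":
                self.fgr[fpr + 1] = (value >> 32) & MASK32
                self.fgr[fpr] = value & MASK32
            else:
                raise ValueError("format not covered by the routine")
        else:
            if fmt in ("S", "W"):
                self.fgr[fpr] = value & MASK32       # upper 32 bits are undefined
            elif fmt in ("D", "L"):
                self.fgr[fpr] = value & MASK64
            else:
                raise ValueError("format not covered by the routine")
```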
ABS.fmt    Floating-Point Absolute Value    ABS.fmt
 31..26          25..21   20..16      15..11   10..6    5..0
 COP1 (010001)   fmt      0 (00000)   fs       fd       ABS (000101)
 6 bits          5 bits   5 bits      5 bits   5 bits   6 bits
Format:
ABS.fmt fd, fs
Description:
The contents of the FPU register specified by fs are interpreted in the specified format and
the arithmetic absolute value is taken. The result is placed in the floating-point register
specified by fd.
The absolute value operation is arithmetic; a NaN operand signals invalid operation.
This instruction is valid only for single- and double-precision floating-point formats. The
operation is not defined if bit 0 of any register specification is set and the FR bit in the Status
register equals zero, since the register numbers specify an even-odd pair of adjacent
coprocessor general registers. When the FR bit in the Status register equals one, both even
and odd register numbers are valid.
Operation:
32, 64 T: StoreFPR (fd, fmt, AbsoluteValue (ValueFPR (fs, fmt)))
Exceptions:
Coprocessor unusable exception
Coprocessor exception trap
Coprocessor Exceptions:
Unimplemented operation exception
Invalid operation exception
ADD.fmt    Floating-Point Add    ADD.fmt
 31..26          25..21   20..16   15..11   10..6    5..0
 COP1 (010001)   fmt      ft       fs       fd       ADD (000000)
 6 bits          5 bits   5 bits   5 bits   5 bits   6 bits
Format:
ADD.fmt fd, fs, ft
Description:
The contents of the FPU registers specified by fs and ft are interpreted in the specified
format and arithmetically added. The result is rounded as if calculated to infinite precision
and then rounded to the specified format (fmt), according to the current rounding mode. The
result is placed in the floating-point register (FPR) specified by fd.
This instruction is valid only for single- and double-precision floating-point formats. The
operation is not defined if bit 0 of any register specification is set and the FR bit in the Status
register equals zero, since the register numbers specify an even-odd pair of adjacent
coprocessor general registers. When the FR bit in the Status register equals one, both even
and odd register numbers are valid.
Operation:
32, 64 T: StoreFPR (fd, fmt, ValueFPR (fs, fmt) + ValueFPR (ft, fmt))
Exceptions:
Coprocessor unusable exception
Floating-Point exception
Coprocessor Exceptions:
Unimplemented operation exception
Invalid operation exception
Inexact exception
Overflow exception
Underflow exception
BC1F    Branch On FPU False (coprocessor 1)    BC1F
 31..26          25..21       20..16        15..0
 COP1 (010001)   BC (01000)   BCF (00000)   offset
 6 bits          5 bits       5 bits        16 bits
Format:
BC1F offset
Description:
A branch target address is computed from the sum of the address of the instruction in the
delay slot and the 16-bit offset, shifted left two bits and sign-extended. If the result of the
last floating-point compare is false (zero), the program branches to the target address, with a
delay of one instruction. There must be at least one instruction between C.cond.fmt and
BC1F.
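The target address computation described above can be sketched in Python as follows. The function name and the `branch_pc` parameter are hypothetical, and the sketch assumes that the delay slot instruction is located 4 bytes after the branch instruction.

```python
def branch_target(branch_pc: int, offset16: int, width: int = 32) -> int:
    """Target = address of the delay-slot instruction + sign-extended (offset << 2)."""
    offset16 &= 0xFFFF
    displacement = offset16 << 2          # 18-bit byte displacement
    if offset16 & 0x8000:
        displacement -= 1 << 18           # apply the sign of offset bit 15
    delay_slot_pc = branch_pc + 4
    return (delay_slot_pc + displacement) & ((1 << width) - 1)
```

An offset of 0xFFFF (that is, -1) therefore branches back to the branch instruction itself, and an offset of 1 branches to the instruction that follows the delay slot.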
Operation:
32 T-1: condition ← not COC[1]
T: target ← (offset15)^14 || offset || 0^2
T+1: if condition then
PC ← PC + target
endif
64 T-1: condition ← not COC[1]
T: target ← (offset15)^46 || offset || 0^2
T+1: if condition then
PC ← PC + target
endif
Exceptions:
Coprocessor unusable exception
BC1FL    Branch On FPU False Likely (coprocessor 1)    BC1FL
 31..26          25..21       20..16         15..0
 COP1 (010001)   BC (01000)   BCFL (00010)   offset
 6 bits          5 bits       5 bits         16 bits
Format:
BC1FL offset
Description:
A branch target address is computed from the sum of the address of the instruction in the
delay slot and the 16-bit offset, shifted left two bits and sign-extended.
If the result of the last floating-point compare is false (zero), the program branches to the
target address, with a delay of one instruction. If the conditional branch is not taken, the
instruction in the branch delay slot is nullified. There must be at least one instruction
between C.cond.fmt and BC1FL.
Operation:
32 T-1: condition ← not COC[1]
T: target ← (offset15)^14 || offset || 0^2
T+1: if condition then
PC ← PC + target
else
NullifyCurrentInstruction
endif
64 T-1: condition ← not COC[1]
T: target ← (offset15)^46 || offset || 0^2
T+1: if condition then
PC ← PC + target
else
NullifyCurrentInstruction
endif
Exceptions:
Coprocessor unusable exception
BC1T    Branch On FPU True (coprocessor 1)    BC1T
 31..26          25..21       20..16        15..0
 COP1 (010001)   BC (01000)   BCT (00001)   offset
 6 bits          5 bits       5 bits        16 bits
Format:
BC1T offset
Description:
A branch target address is computed from the sum of the address of the instruction in the
delay slot and the 16-bit offset, shifted left two bits and sign-extended. If the result of the
last floating-point compare is true (one), the program branches to the target address, with a
delay of one instruction. There must be at least one instruction between C.cond.fmt and
BC1T.
Operation:
32    T - 1:  condition ← COC[1]
      T:      target ← (offset_15)^14 || offset || 0^2
      T + 1:  if condition then
                  PC ← PC + target
              endif
64    T - 1:  condition ← COC[1]
      T:      target ← (offset_15)^46 || offset || 0^2
      T + 1:  if condition then
                  PC ← PC + target
              endif
Exceptions:
Coprocessor unusable exception
BC1TL  Branch On FPU True Likely (coprocessor 1)

Bits:    31-26     25-21    20-16    15-0
Field:   COP1      BC       BCTL     offset
Value:   010001    01000    00011    -
Width:   6         5        5        16
Format:
BC1TL offset
Description:
A branch target address is computed from the sum of the address of the instruction in the
delay slot and the 16-bit offset, shifted left two bits and sign-extended.
If the result of the last floating-point compare is true (one), the program branches to the
target address, with a delay of one instruction. If the conditional branch is not taken, the
instruction in the branch delay slot is nullified. There must be at least one instruction
between C.cond.fmt and BC1TL.
Operation:
32    T - 1:  condition ← COC[1]
      T:      target ← (offset_15)^14 || offset || 0^2
      T + 1:  if condition then
                  PC ← PC + target
              else
                  NullifyCurrentInstruction
              endif
64    T - 1:  condition ← COC[1]
      T:      target ← (offset_15)^46 || offset || 0^2
      T + 1:  if condition then
                  PC ← PC + target
              else
                  NullifyCurrentInstruction
              endif
Exceptions:
Coprocessor unusable exception
C.cond.fmt  Floating-Point Compare

Bits:    31-26     25-21    20-16    15-11    10-6     5-4    3-0
Field:   COP1      fmt      ft       fs       0        FC*    cond*
Value:   010001    -        -        -        00000    -      -
Width:   6         5        5        5        5        2      4
Format:
C.cond.fmt fs, ft
Description:
The contents of the floating-point registers specified by fs and ft are interpreted in the
specified format and arithmetically compared.
A result is determined based on the comparison and the conditions specified in the
instruction. If one of the values is Not a Number (NaN), and the high-order bit of the
condition field is set, an invalid operation exception is taken. After a one-instruction delay,
the condition is available for testing with branch on floating-point coprocessor condition
instructions. There must be at least one instruction between the compare and the branch.
Comparisons are exact and can neither overflow nor underflow. Four mutually exclusive
relations are possible results: less than, equal, greater than, and unordered. The last case
arises when one or both of the operands are NaN; every NaN compares unordered with
everything, including itself. Comparisons ignore the sign of zero, so +0 = −0.
This instruction is valid only for single- and double-precision floating-point formats. The
operation is not defined if bit 0 of any register specification is set and the FR bit in the Status
register equals zero, since the register numbers specify an even-odd pair of adjacent
coprocessor general registers. When the FR bit in the Status register equals one, both even
and odd register numbers are valid.
**See “FPU Instruction Opcode Bit Encoding” at the end of Appendix B.
Operation:
32, 64  T:  if NaN(ValueFPR(fs, fmt)) or NaN(ValueFPR(ft, fmt)) then
                less ← false
                equal ← false
                unordered ← true
                if cond_3 then
                    signal InvalidOperationException
                endif
            else
                less ← ValueFPR(fs, fmt) < ValueFPR(ft, fmt)
                equal ← ValueFPR(fs, fmt) = ValueFPR(ft, fmt)
                unordered ← false
            endif
            condition ← (cond_2 and less) or (cond_1 and equal) or
                        (cond_0 and unordered)
            FCR[31]_23 ← condition
            COC[1] ← condition
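The predicate evaluated by the Operation above can be sketched in C for double-precision operands. This is a minimal model, not the hardware implementation; the function name c_cond_d and the invalid flag are hypothetical, and cond stands for the 4-bit condition field of the instruction.

```c
#include <stdbool.h>

/* Hypothetical model of the C.cond.fmt predicate for double operands.
 * *invalid is set when the Operation would signal the Invalid operation
 * exception (a NaN operand with bit 3 of cond set). */
static bool c_cond_d(unsigned cond, double fs, double ft, bool *invalid)
{
    bool less, equal, unordered;

    *invalid = false;
    if (fs != fs || ft != ft) {      /* a NaN never compares equal to itself */
        less = false;
        equal = false;
        unordered = true;
        if (cond & 8u)               /* cond_3 set: signal on NaN */
            *invalid = true;
    } else {
        less = fs < ft;
        equal = fs == ft;            /* +0 == -0 also holds in C */
        unordered = false;
    }
    return ((cond & 4u) && less) || ((cond & 2u) && equal) ||
           ((cond & 1u) && unordered);
}
```

The returned value corresponds to the bit written to FCR[31] bit 23 and COC[1], which the branch instructions BC1F, BC1FL, BC1T, and BC1TL then test.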
Exceptions:
Coprocessor unusable
Floating-Point exception
Coprocessor Exceptions:
Unimplemented operation exception
Invalid operation exception
CEIL.L.fmt  Floating-Point Ceiling to Long Fixed-Point Format

Bits:    31-26     25-21    20-16    15-11    10-6     5-0
Field:   COP1      fmt      0        fs       fd       CEIL.L
Value:   010001    -        00000    -        -        001010
Width:   6         5        5        5        5        6
Format:
CEIL.L.fmt fd, fs
Description:
The contents of the floating-point register specified by fs are interpreted in the specified
source format, fmt, and arithmetically converted to the long fixed-point format. The result is
placed in the floating-point register specified by fd.
Regardless of the setting of the current rounding mode, the conversion is rounded as if the
current rounding mode is round to +∞ (RM = 2).
This instruction is valid only for conversion from single-, double-, extended- or quad-
precision floating-point formats. If extended- or quad-precision format is specified, the
operation is not defined if bit 0 of the source register specification is set, since the register
number specifies an aligned coprocessor general register. When the FR bit in the Status
register equals one, both even and odd register numbers are valid.
When the source operand is an Infinity or NaN, or the correctly rounded integer result is
outside of -2^63 to 2^63 - 1, the Invalid operation exception is raised. If the Invalid operation
exception is not enabled, then no exception is taken and 2^63 - 1 is returned.
This instruction is not implemented on MIPS I or MIPS II processors, and will cause an
unimplemented operation exception to occur.
Operation:
32, 64 T: StoreFPR (fd, L, ConvertFmt (ValueFPR (fs, fmt), fmt, L))
Exceptions:
Coprocessor unusable exception
Floating-Point exception
Reserved Instruction exception (in the 32 bit user or 32 bit supervisor mode)
Coprocessor Exceptions:
Invalid operation exception
Unimplemented operation exception
Inexact exception
Overflow exception
CEIL.W.fmt  Floating-Point Ceiling to Single Fixed-Point Format

Bits:    31-26     25-21    20-16    15-11    10-6     5-0
Field:   COP1      fmt      0        fs       fd       CEIL.W
Value:   010001    -        00000    -        -        001110
Width:   6         5        5        5        5        6
Format:
CEIL.W.fmt fd, fs
Description:
The contents of the floating-point register specified by fs are interpreted in the specified
source format, fmt, and arithmetically converted to the single fixed-point format. The result
is placed in the floating-point register specified by fd.
Regardless of the setting of the current rounding mode, the conversion is rounded as if the
current rounding mode is round to +∞ (RM = 2).
This instruction is valid only for conversion from single- or double-precision floating-
point formats. The operation is not defined if bit 0 of any register specification is set and the
FR bit in the Status register equals zero, since the register numbers specify an even-odd pair
of adjacent coprocessor general registers. When the FR bit in the Status register equals one,
both even and odd register numbers are valid.
When the source operand is an Infinity or NaN, or the correctly rounded integer result is
outside of -2^31 to 2^31 - 1, the Invalid operation exception is raised. If the Invalid operation
exception is not enabled, then no exception is taken and 2^31 - 1 is returned.
Operation:
32, 64 T: StoreFPR (fd, W, ConvertFmt (ValueFPR (fs, fmt), fmt, W))
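As a rough illustration of the rounding and out-of-range behavior described above, the following C sketch models CEIL.W with a double-precision source and the Invalid operation exception disabled. The name ceil_w_d and the invalid flag are hypothetical, and the Inexact exception is not modeled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of CEIL.W.D: round toward +infinity, returning
 * 2^31 - 1 when the source is a NaN, an infinity, or rounds to a value
 * outside -2^31 .. 2^31 - 1. */
static int32_t ceil_w_d(double x, bool *invalid)
{
    /* The comparisons are false for NaN, so NaN takes the invalid path. */
    if (!(x > -2147483649.0 && x <= 2147483647.0)) {
        *invalid = true;
        return INT32_MAX;
    }
    *invalid = false;
    int32_t t = (int32_t)x;      /* C conversion truncates toward zero */
    if ((double)t < x)
        t += 1;                  /* bump up when a positive fraction was dropped */
    return t;
}
```

The range test is written on the source value so that the integer conversion is never applied to a value C cannot represent in int32_t.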
Exceptions:
Coprocessor unusable exception
Floating-Point exception
Coprocessor Exceptions:
Invalid operation exception
Unimplemented operation exception
Inexact exception
Overflow exception
CFC1  Move Control Word From FPU (coprocessor 1)

Bits:    31-26     25-21    20-16    15-11    10-0
Field:   COP1      CF       rt       fs       0
Value:   010001    00010    -        -        000 0000 0000
Width:   6         5        5        5        11
Format:
CFC1 rt, fs
Description:
The contents of the FPU’s control register fs are loaded into general register rt.
This operation is only defined when fs equals 0 or 31.
The contents of general register rt are undefined for the instruction immediately following
CFC1.
Operation:
32    T:      temp ← FCR[fs]
      T + 1:  GPR[rt] ← temp
64    T:      temp ← FCR[fs]
      T + 1:  GPR[rt] ← (temp_31)^32 || temp
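In 64-bit mode the 32-bit control word is sign-extended into the general register. A minimal C sketch of that step follows; the helper name sign_extend_32 is hypothetical.

```c
#include <stdint.h>

/* 64-bit mode: GPR[rt] <- (temp_31)^32 || temp */
static uint64_t sign_extend_32(uint32_t temp)
{
    return (uint64_t)(int64_t)(int32_t)temp;
}
```

The same extension appears in the 64-bit mode operation of MFC1.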
Exceptions:
Coprocessor unusable exception
CTC1  Move Control Word To FPU (coprocessor 1)

Bits:    31-26     25-21    20-16    15-11    10-0
Field:   COP1      CT       rt       fs       0
Value:   010001    00110    -        -        000 0000 0000
Width:   6         5        5        5        11
Format:
CTC1 rt, fs
Description:
The contents of general register rt are loaded into the FPU’s control register fs. This
operation is only defined when fs equals 0 or 31. Writing to Control Register 31, the
floating-point Control/Status register, causes an interrupt or exception if any cause bit and
its corresponding enable bit are both set. The register is written before the exception
occurs. The contents of floating-point control register fs are undefined for the instruction
immediately following CTC1.
Operation:
32    T:      temp ← GPR[rt]
      T + 1:  FCR[fs] ← temp
              COC[1] ← FCR[31]_23
64    T:      temp ← GPR[rt]_31..0
      T + 1:  FCR[fs] ← temp
              COC[1] ← FCR[31]_23
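The check for a pending exception after a write to Control Register 31 can be sketched as below. The bit positions of the cause field (bits 16..12) and the enable field (bits 11..7) are assumptions taken from the usual MIPS FCR31 layout and are not stated in this section; only bit 23 (the condition bit) appears in the Operation above. The function names are hypothetical, and the unimplemented operation cause bit is not modeled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed FCR31 field positions (not given in this section). */
#define FCR31_C_BIT         23u  /* condition bit, copied to COC[1] */
#define FCR31_CAUSE_SHIFT   12u  /* assumed: five cause bits at 16..12 */
#define FCR31_ENABLE_SHIFT   7u  /* assumed: five enable bits at 11..7 */

/* True if any cause bit and its corresponding enable bit are both set. */
static bool ctc1_raises_exception(uint32_t fcr31)
{
    uint32_t cause   = (fcr31 >> FCR31_CAUSE_SHIFT) & 0x1Fu;
    uint32_t enables = (fcr31 >> FCR31_ENABLE_SHIFT) & 0x1Fu;
    return (cause & enables) != 0;
}

/* COC[1] <- FCR[31]_23 */
static bool ctc1_condition(uint32_t fcr31)
{
    return ((fcr31 >> FCR31_C_BIT) & 1u) != 0;
}
```

Software that restores a saved Control/Status value would typically clear the cause bits first if it does not want the write itself to trap.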
Exceptions:
Coprocessor unusable exception
Floating-Point exception
Coprocessor Exceptions:
Unimplemented operation exception
Invalid operation exception
Division by zero exception
Inexact exception
Overflow exception
Underflow exception
CVT.D.fmt  Floating-Point Convert to Double Floating-Point Format

Bits:    31-26     25-21    20-16    15-11    10-6     5-0
Field:   COP1      fmt      0        fs       fd       CVT.D
Value:   010001    -        00000    -        -        100001
Width:   6         5        5        5        5        6
Format:
CVT.D.fmt fd, fs
Description:
The contents of the floating-point register specified by fs are interpreted in the specified
source format, fmt, and arithmetically converted to the double binary floating-point format.
The result is placed in the floating-point register specified by fd.
This instruction is valid only for conversions from single floating-point format, or from
32-bit or 64-bit fixed-point format.
If the single floating-point or single fixed-point format is specified, the operation is exact.
The operation is not defined if bit 0 of any register specification is set and the FR bit in the
Status register equals zero, since the register numbers specify an even-odd pair of adjacent
coprocessor general registers. When the FR bit in the Status register equals one, both even
and odd register numbers are valid.
Operation:
32, 64 T: StoreFPR (fd, D, ConvertFmt (ValueFPR (fs, fmt), fmt, D))
Exceptions:
Coprocessor unusable exception
Floating-Point exception
Coprocessor Exceptions:
Invalid operation exception
Unimplemented operation exception
Inexact exception
Overflow exception
Underflow exception
CVT.L.fmt  Floating-Point Convert to Long Fixed-Point Format

Bits:    31-26     25-21    20-16    15-11    10-6     5-0
Field:   COP1      fmt      0        fs       fd       CVT.L
Value:   010001    -        00000    -        -        100101
Width:   6         5        5        5        5        6
Format:
CVT.L.fmt fd, fs
Description:
The contents of the floating-point register specified by fs are interpreted in the specified
source format, fmt, and arithmetically converted to the long fixed-point format. The result is
placed in the floating-point register specified by fd.
This instruction is valid only for conversions from single-, double-, extended- or quad-
precision floating-point formats. If extended- or quad-precision format is specified, the
operation is not defined if bit 0 of the source register specification is set, since the register
number specifies an aligned coprocessor general register.
When the source operand is an Infinity or NaN, or the correctly rounded integer result is
outside of -2^63 to 2^63 - 1, the Invalid operation exception is raised. If the Invalid operation
exception is not enabled, then no exception is taken and 2^63 - 1 is returned.
This instruction is not implemented on MIPS I or MIPS II processors, and will cause an
unimplemented operation exception to occur.
The operation is not defined if bit 0 of any register specification is set and the FR bit in the
Status register equals zero.
Operation:
32, 64 T: StoreFPR (fd, L, ConvertFmt (ValueFPR (fs, fmt), fmt, L))
Exceptions:
Coprocessor unusable exception
Floating-Point exception
Reserved Instruction exception (in the 32 bit user or 32 bit supervisor mode)
Coprocessor Exceptions:
Invalid operation exception
Unimplemented operation exception
Inexact exception
Overflow exception
CVT.S.fmt  Floating-Point Convert to Single Floating-Point Format

Bits:    31-26     25-21    20-16    15-11    10-6     5-0
Field:   COP1      fmt      0        fs       fd       CVT.S
Value:   010001    -        00000    -        -        100000
Width:   6         5        5        5        5        6
Format:
CVT.S.fmt fd, fs
Description:
The contents of the floating-point register specified by fs are interpreted in the specified
source format, fmt, and arithmetically converted to the single binary floating-point format.
The result is placed in the floating-point register specified by fd. Rounding occurs according
to the currently specified rounding mode.
This instruction is valid only for conversions from double floating-point format, or from 32-
bit or 64-bit fixed-point format. The operation is not defined if bit 0 of any register
specification is set and the FR bit in the Status register equals zero, since the register
numbers specify an even-odd pair of adjacent coprocessor general registers. When the FR bit
in the Status register equals one, both even and odd register numbers are valid.
Operation:
32, 64 T: StoreFPR (fd, S, ConvertFmt (ValueFPR (fs, fmt), fmt, S))
Exceptions:
Coprocessor unusable exception
Floating-Point exception
Coprocessor Exceptions:
Invalid operation exception
Unimplemented operation exception
Inexact exception
Overflow exception
Underflow exception
CVT.W.fmt  Floating-Point Convert to Fixed-Point Format

Bits:    31-26     25-21    20-16    15-11    10-6     5-0
Field:   COP1      fmt      0        fs       fd       CVT.W
Value:   010001    -        00000    -        -        100100
Width:   6         5        5        5        5        6
Format:
CVT.W.fmt fd, fs
Description:
The contents of the floating-point register specified by fs are interpreted in the specified
source format, fmt, and arithmetically converted to the single fixed-point format. The result
is placed in the floating-point register specified by fd.
This instruction is valid only for conversion from single- or double-precision floating-
point formats. The operation is not defined if bit 0 of any register specification is set and the
FR bit in the Status register equals zero, since the register numbers specify an even-odd pair
of adjacent coprocessor general registers. When the FR bit in the Status register equals one,
both even and odd register numbers are valid.
When the source operand is an Infinity or NaN, or the correctly rounded integer result is
outside of -2^31 to 2^31 - 1, an Invalid operation exception is raised. If the Invalid operation
exception is not enabled, then no exception is taken and 2^31 - 1 is returned.
Operation:
32, 64 T: StoreFPR (fd, W, ConvertFmt (ValueFPR (fs, fmt), fmt, W))
Exceptions:
Coprocessor unusable exception
Floating-Point exception
Coprocessor Exceptions:
Invalid operation exception
Unimplemented operation exception
Inexact exception
Overflow exception
DIV.fmt  Floating-Point Divide

Bits:    31-26     25-21    20-16    15-11    10-6     5-0
Field:   COP1      fmt      ft       fs       fd       DIV
Value:   010001    -        -        -        -        000011
Width:   6         5        5        5        5        6
Format:
DIV.fmt fd, fs, ft
Description:
The contents of the floating-point registers specified by fs and ft are interpreted in the
specified format and the value in fs is divided by the value in ft. The result is rounded as if
calculated to infinite precision and then rounded to the specified format, according to the
current rounding mode. The result is placed in the floating-point register specified by fd.
This instruction is valid only for single- or double-precision floating-point formats.
The operation is not defined if bit 0 of any register specification is set and the FR bit in the
Status register equals zero, since the register numbers specify an even-odd pair of adjacent
coprocessor general registers. When the FR bit in the Status register equals one, both even
and odd register numbers are valid.
Operation:
32, 64 T: StoreFPR (fd, fmt, ValueFPR(fs, fmt)/ValueFPR(ft, fmt))
Exceptions:
Coprocessor unusable exception
Floating-Point exception
Coprocessor Exceptions:
Unimplemented operation exception
Invalid operation exception
Division-by-zero exception
Inexact exception
Overflow exception
Underflow exception
DMFC1  Doubleword Move From Floating-Point Coprocessor

Bits:    31-26     25-21    20-16    15-11    10-0
Field:   COP1      DMF      rt       fs       0
Value:   010001    00001    -        -        000 0000 0000
Width:   6         5        5        5        11
Format:
DMFC1 rt, fs
Description:
The contents of register fs from the floating-point coprocessor are stored into processor
register rt.
The contents of general register rt are undefined for the instruction immediately following
DMFC1.
The FR bit in the Status register specifies whether all 32 registers of the TX49 are
addressable. When FR is clear, this instruction is not defined when the least significant bit
of fs is non-zero. When FR is set, fs may specify either odd or even registers.
Operation:
64    T:      if SR_26 = 1 then        /* 64-bit wide FGRs */
                  data ← FGR[fs]
              elseif fs_0 = 0 then     /* valid specifier, 32-bit wide FGRs */
                  data ← FGR[fs + 1] || FGR[fs]
              else                     /* undefined for odd 32-bit reg #s */
                  data ← undefined^64
              endif
      T + 1:  GPR[rt] ← data
Note: The operation is the same in 32-bit kernel mode.
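The effect of the FR bit on register pairing can be sketched with a small C model of the register file. The structure fpu_regs and the function dmfc1 are hypothetical names used only for illustration; the boolean return value stands in for the "undefined" case of the Operation above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical FPU register file model. With FR = 1 each fgr[] entry holds
 * 64 bits; with FR = 0 only the low 32 bits of each entry are used. */
typedef struct {
    uint64_t fgr[32];
    bool fr;                 /* FR bit, Status register bit 26 */
} fpu_regs;

/* Returns false when the result is undefined (odd fs with FR = 0). */
static bool dmfc1(const fpu_regs *f, unsigned fs, uint64_t *data)
{
    fs &= 31u;               /* the fs field is five bits wide */
    if (f->fr) {
        *data = f->fgr[fs];                              /* 64-bit wide FGRs */
    } else if ((fs & 1u) == 0) {
        *data = ((f->fgr[fs + 1] & 0xFFFFFFFFu) << 32) | /* FGR[fs+1] || FGR[fs] */
                (f->fgr[fs] & 0xFFFFFFFFu);
    } else {
        return false;                                    /* undefined for odd fs */
    }
    return true;
}
```

DMTC1 would be the mirror image: the 64-bit value is either written whole (FR = 1) or split across the even-odd pair (FR = 0).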
Exceptions:
Coprocessor unusable exception
Floating-Point exception
Reserved Instruction exception (in the 32 bit user or 32 bit supervisor mode)
Coprocessor Exceptions:
Unimplemented operation exception
DMTC1  Doubleword Move To Floating-Point Coprocessor

Bits:    31-26     25-21    20-16    15-11    10-0
Field:   COP1      DMT      rt       fs       0
Value:   010001    00101    -        -        000 0000 0000
Width:   6         5        5        5        11
Format:
DMTC1 rt, fs
Description:
The contents of general register rt are loaded into coprocessor register fs of CP1.
The contents of floating-point register fs are undefined for the instruction immediately
following DMTC1.
The FR bit in the Status register specifies whether all 32 registers of the TX49 are
addressable. When FR equals zero, this instruction is not defined when the least significant
bit of fs is non-zero. When FR equals one, fs may specify either odd or even registers.
Operation:
64    T:      data ← GPR[rt]
      T + 1:  if SR_26 = 1 then        /* 64-bit wide FGRs */
                  FGR[fs] ← data
              elseif fs_0 = 0 then     /* valid specifier, 32-bit wide FGRs */
                  FGR[fs + 1] ← data_63..32
                  FGR[fs] ← data_31..0
              else                     /* undefined result for odd 32-bit reg #s */
                  undefined_result
              endif
Exceptions:
Coprocessor unusable exception
Floating-Point exception
Reserved Instruction exception (in the 32 bit user or 32 bit supervisor mode)
Coprocessor Exceptions:
Unimplemented operation exception
FLOOR.L.fmt  Floating-Point Floor to Long Fixed-Point Format

Bits:    31-26     25-21    20-16    15-11    10-6     5-0
Field:   COP1      fmt      0        fs       fd       FLOOR.L
Value:   010001    -        00000    -        -        001011
Width:   6         5        5        5        5        6
Format:
FLOOR.L.fmt fd, fs
Description:
The contents of the floating-point register specified by fs are interpreted in the specified
source format, fmt, and arithmetically converted to the long fixed-point format. The result is
placed in the floating-point register specified by fd.
Regardless of the setting of the current rounding mode, the conversion is rounded as if the
current rounding mode is round to −∞ (RM = 3).
This instruction is valid only for conversion from single-, double-, extended- or quad-
precision floating-point formats. If extended- or quad-precision format is specified, the
operation is not defined if bit 0 of the source register specification is set, since the register
number specifies an aligned coprocessor general register.
When the source operand is an Infinity or NaN, or the correctly rounded integer result is
outside of -2^63 to 2^63 - 1, the Invalid operation exception is raised. If the Invalid operation
exception is not enabled, then no exception is taken and 2^63 - 1 is returned.
This instruction is not implemented on MIPS I or MIPS II processors, and will cause an
unimplemented operation exception to occur.
Operation:
32, 64 T: StoreFPR (fd, L, ConvertFmt (ValueFPR (fs, fmt), fmt, L))
Exceptions:
Coprocessor unusable exception
Floating-Point exception
Reserved Instruction exception (in the 32 bit user or 32 bit supervisor mode)
Coprocessor Exceptions:
Invalid operation exception
Unimplemented operation exception
Inexact exception
Overflow exception
FLOOR.W.fmt  Floating-Point Floor to Single Fixed-Point Format

Bits:    31-26     25-21    20-16    15-11    10-6     5-0
Field:   COP1      fmt      0        fs       fd       FLOOR.W
Value:   010001    -        00000    -        -        001111
Width:   6         5        5        5        5        6
Format:
FLOOR.W.fmt fd, fs
Description:
The contents of the floating-point register specified by fs are interpreted in the specified
source format, fmt, and arithmetically converted to the single fixed-point format. The result
is placed in the floating-point register specified by fd.
Regardless of the setting of the current rounding mode, the conversion is rounded as if the
current rounding mode is round to −∞ (RM = 3).
This instruction is valid only for conversion from single- or double-precision floating-
point formats. The operation is not defined if bit 0 of any register specification is set and the
FR bit in the Status register equals zero, since the register numbers specify an even-odd pair
of adjacent coprocessor general registers. When the FR bit in the Status register equals one,
both even and odd register numbers are valid.
When the source operand is an Infinity or NaN, or the correctly rounded integer result is
outside of -2^31 to 2^31 - 1, an Invalid operation exception is raised. If the Invalid operation
exception is not enabled, then no exception is taken and 2^31 - 1 is returned.
Operation:
32, 64 T: StoreFPR (fd, W, ConvertFmt (ValueFPR (fs, fmt), fmt, W))
Exceptions:
Coprocessor unusable exception
Floating-Point exception
Coprocessor Exceptions:
Invalid operation exception
Unimplemented operation exception
Inexact exception
Overflow exception
LDC1  Load Doubleword to FPU (coprocessor 1)

Bits:    31-26     25-21    20-16    15-0
Field:   LDC1      base     ft       offset
Value:   110101    -        -        -
Width:   6         5        5        16
Format:
LDC1 ft, offset (base)
Description:
The 16-bit offset is sign-extended and added to the contents of general register base to form
an unsigned effective address. In 32-bit mode, the contents of the doubleword at the memory
location specified by the effective address are loaded into registers ft and ft + 1 of the
floating-point coprocessor. This instruction is not valid, and is undefined, when the least
significant bit of ft is non-zero. In 64-bit mode, the contents of the doubleword at the memory
location specified by the effective address are loaded into the 64-bit register ft of the
floating-point coprocessor. The FR bit of the Status register (SR_26) specifies whether all 32
registers of the TX49 are addressable. When FR = 0, this instruction is not defined when the
least significant bit of ft is non-zero. When FR = 1, ft may specify either odd or even registers.
If any of the three least-significant bits of the effective address are non-zero, an address
error exception takes place.
Operation:
32    T:  vAddr ← ((offset_15)^16 || offset_15..0) + GPR[base]
          (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
          data ← LoadMemory (uncached, DOUBLEWORD, pAddr, vAddr, DATA)
          if SR_26 = 1 then        /* 64-bit wide FGRs */
              FGR[ft] ← data
          elseif ft_0 = 0 then     /* valid specifier, 32-bit wide FGRs */
              FGR[ft + 1] ← data_63..32
              FGR[ft] ← data_31..0
          else                     /* undefined result if odd */
              undefined_result
          endif
64    T:  vAddr ← ((offset_15)^48 || offset_15..0) + GPR[base]
          (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
          data ← LoadMemory (uncached, DOUBLEWORD, pAddr, vAddr, DATA)
          if SR_26 = 1 then        /* 64-bit wide FGRs */
              FGR[ft] ← data
          elseif ft_0 = 0 then     /* valid specifier, 32-bit wide FGRs */
              FGR[ft + 1] ← data_63..32
              FGR[ft] ← data_31..0
          else                     /* undefined result if odd */
              undefined_result
          endif
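The address computation and the alignment rule for LDC1 can be sketched in C as follows. This models the 64-bit mode case only; the helper names effective_address and ldc1_address_error are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* vAddr <- ((offset_15)^48 || offset_15..0) + GPR[base] */
static uint64_t effective_address(uint64_t base, uint16_t offset)
{
    return base + (uint64_t)(int64_t)(int16_t)offset;
}

/* LDC1 takes an address error exception if any of the three
 * least-significant bits of the effective address is non-zero. */
static bool ldc1_address_error(uint64_t vaddr)
{
    return (vaddr & 7u) != 0;
}
```

For LWC1 the same address computation applies, but only the two least-significant bits are checked.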
Exceptions:
Coprocessor unusable
TLB refill exception
TLB invalid exception
Bus error exception
Address error exception
LWC1  Load Word to FPU (coprocessor 1)

Bits:    31-26     25-21    20-16    15-0
Field:   LWC1      base     ft       offset
Value:   110001    -        -        -
Width:   6         5        5        16
Format:
LWC1 ft, offset (base)
Description:
The 16-bit offset is sign-extended and added to the contents of general register base to form
an unsigned effective address. The contents of the word at the memory location specified by
the effective address are loaded into register ft of the floating-point coprocessor.
The FR bit of the Status register specifies whether all 64-bit Floating-Point Registers are
addressable. If FR equals zero, LWC1 loads either the high or low half of the 16 even
Floating-Point Registers. If FR equals one, LWC1 loads the low 32 bits of both even and odd
Floating-Point Registers.
If either of the two least-significant bits of the effective address is non-zero, an address
error exception occurs.
Operation:
32    T:  vAddr ← ((offset_15)^16 || offset_15..0) + GPR[base]
          (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
          pAddr ← pAddr_PSIZE-1..3 || (pAddr_2..0 xor (ReverseEndian || 0^2))
          mem ← LoadMemory (uncached, WORD, pAddr, vAddr, DATA)
          byte ← vAddr_2..0 xor (BigEndianCPU || 0^2)
          /* “mem” is aligned 64 bits from memory. Pick out correct bytes. */
          if SR_26 = 1 then        /* 64-bit wide FGRs */
              FGR[ft] ← undefined^32 || mem_31+8*byte..8*byte
          else                     /* 32-bit wide FGRs */
              FGR[ft] ← mem_31+8*byte..8*byte
          endif
64    T:  vAddr ← ((offset_15)^48 || offset_15..0) + GPR[base]
          (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
          pAddr ← pAddr_PSIZE-1..3 || (pAddr_2..0 xor (ReverseEndian || 0^2))
          mem ← LoadMemory (uncached, WORD, pAddr, vAddr, DATA)
          byte ← vAddr_2..0 xor (BigEndianCPU || 0^2)
          /* “mem” is aligned 64 bits from memory. Pick out correct bytes. */
          if SR_26 = 1 then        /* 64-bit wide FGRs */
              FGR[ft] ← undefined^32 || mem_31+8*byte..8*byte
          else                     /* 32-bit wide FGRs */
              FGR[ft] ← mem_31+8*byte..8*byte
          endif
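The byte-lane selection in the Operation above (the byte index formed from the low address bits and BigEndianCPU) can be sketched in C. The function name lwc1_select_word is hypothetical, and mem is assumed to hold the aligned 64-bit doubleword returned by LoadMemory with bit 0 as the least-significant bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* byte <- vAddr_2..0 xor (BigEndianCPU || 0^2)
 * result <- mem_31+8*byte..8*byte */
static uint32_t lwc1_select_word(uint64_t mem, uint64_t vaddr, bool big_endian_cpu)
{
    unsigned byte = (unsigned)(vaddr & 7u) ^ (big_endian_cpu ? 4u : 0u);
    return (uint32_t)(mem >> (8u * byte));
}
```

For a word-aligned address the byte index is either 0 or 4, so the function picks the low or the high half of the doubleword.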
Exceptions:
Coprocessor unusable
TLB-refill exception
TLB invalid exception
Bus error exception
Address error exception
MFC1  Move From FPU (coprocessor 1)

Bits:    31-26     25-21    20-16    15-11    10-0
Field:   COP1      MF       rt       fs       0
Value:   010001    00000    -        -        000 0000 0000
Width:   6         5        5        5        11
Format:
MFC1 rt, fs
Description:
The contents of register fs from the floating-point coprocessor are stored into processor
register rt.
The contents of register rt are undefined for time T of the instruction immediately
following this load instruction.
The FR bit of the Status register specifies whether all 32 registers of the TX49 are
addressable. If FR equals zero, MFC1 stores either the high or low half of the 16 even
Floating-Point Registers. If FR equals one, MFC1 stores the low 32-bits of both even and odd
Floating-Point Registers.
Operation:
32    T:      data ← FGR[fs]_31..0
      T + 1:  GPR[rt] ← data
64    T:      data ← FGR[fs]_31..0
      T + 1:  GPR[rt] ← (data_31)^32 || data
Exceptions:
Coprocessor unusable exception
MOV.fmt  Floating-Point Move

Bits:    31-26     25-21    20-16    15-11    10-6     5-0
Field:   COP1      fmt      0        fs       fd       MOV
Value:   010001    -        00000    -        -        000110
Width:   6         5        5        5        5        6
Format:
MOV.fmt fd, fs
Description:
The contents of the FPU register specified by fs are interpreted in the specified format and
are copied into the FPU register specified by fd. The move operation is non-arithmetic; no
IEEE 754 exceptions occur as a result of the instruction.
This instruction is valid only for single- or double-precision floating-point formats.
The operation is not defined if bit 0 of any register specification is set and the FR bit in the
Status register equals zero, since the register numbers specify an even-odd pair of adjacent
coprocessor general registers. When the FR bit in the Status register equals one, both even
and odd register numbers are valid.
Operation:
32, 64 T: StoreFPR (fd, fmt, ValueFPR (fs, fmt))
Exceptions:
Coprocessor unusable exception
Floating-Point exception
Coprocessor Exceptions:
Unimplemented operation exception
MTC1  Move To FPU (coprocessor 1)

Bits:    31-26     25-21    20-16    15-11    10-0
Field:   COP1      MT       rt       fs       0
Value:   010001    00100    -        -        000 0000 0000
Width:   6         5        5        5        11
Format:
MTC1 rt, fs
Description:
The contents of register rt are loaded into the FPU’s general register at location fs.
The contents of floating-point register fs are undefined for the instruction immediately
following MTC1.
The FR bit of the Status register specifies whether all 32 registers of the TX49 are
addressable. If FR equals zero, MTC1 loads either the high or low half of the 16 even
Floating-Point Registers. If FR equals one, MTC1 loads the low 32-bits of both even and odd
Floating-Point Registers.
Operation:
32, 64  T:      data ← GPR[rt]_31..0
        T + 1:  if SR_26 = 1 then    /* 64-bit wide FGRs */
                    FGR[fs] ← undefined^32 || data
                else                 /* 32-bit wide FGRs */
                    FGR[fs] ← data
                endif
Exceptions:
Coprocessor unusable exception
MUL.fmt  Floating-Point Multiply

Bits:    31-26     25-21    20-16    15-11    10-6     5-0
Field:   COP1      fmt      ft       fs       fd       MUL
Value:   010001    -        -        -        -        000010
Width:   6         5        5        5        5        6
Format:
MUL.fmt fd, fs, ft
Description:
The contents of the floating-point registers specified by fs and ft are interpreted in the
specified format and arithmetically multiplied. The result is rounded as if calculated to
infinite precision and then rounded to the specified format, according to the current rounding
mode. The result is placed in the floating-point register specified by fd.
This instruction is valid only for single- or double-precision floating-point formats.
The operation is not defined if bit 0 of any register specification is set and the FR bit in the
Status register equals zero, since the register numbers specify an even-odd pair of adjacent
coprocessor general registers. When the FR bit in the Status register equals one, both even
and odd register numbers are valid.
Operation:
32, 64 T: StoreFPR (fd, fmt, ValueFPR (fs, fmt) * ValueFPR (ft, fmt))
Exceptions:
Coprocessor unusable exception
Floating-Point exception
Coprocessor Exceptions:
Unimplemented operation exception
Invalid operation exception
Inexact exception
Overflow exception
Underflow exception
NEG.fmt  Floating-Point Negate

Bits:    31-26     25-21    20-16    15-11    10-6     5-0
Field:   COP1      fmt      0        fs       fd       NEG
Value:   010001    -        00000    -        -        000111
Width:   6         5        5        5        5        6
Format:
NEG.fmt fd, fs
Description:
The contents of the FPU register specified by fs are interpreted in the specified format and
the arithmetic negation is taken (the polarity of the sign bit is changed). The result is placed
in the FPU register specified by fd.
The negate operation is arithmetic; a NaN operand signals invalid operation.
This instruction is valid only for single- or double-precision floating-point formats. The
operation is not defined if bit 0 of any register specification is set and the FR bit in the Status
register equals zero, since the register numbers specify an even-odd pair of adjacent
coprocessor general registers. When the FR bit in the Status register equals one, both even
and odd register numbers are valid.
Operation:
32, 64 T: StoreFPR (fd, fmt, Negate (ValueFPR (fs, fmt)))
Exceptions:
Coprocessor unusable exception
Floating-Point exception
Coprocessor Exceptions:
Unimplemented operation exception
Invalid operation exception
ROUND.L.fmt  Floating-Point Round to Long Fixed-Point Format

Bits:    31-26     25-21    20-16    15-11    10-6     5-0
Field:   COP1      fmt      0        fs       fd       ROUND.L
Value:   010001    -        00000    -        -        001000
Width:   6         5        5        5        5        6
Format:
ROUND.L.fmt fd, fs
Description:
The contents of the floating-point register specified by fs are interpreted in the specified
source format, fmt, and arithmetically converted to the long fixed-point format. The result is
placed in the floating-point register specified by fd.
Regardless of the setting of the current rounding mode, the conversion is rounded as if the
current rounding mode is round to nearest/even (RM = 0).
This instruction is valid only for conversion from single-, double-, extended- or quad-
precision floating-point formats. If extended- or quad-precision format is specified, the
operation is not defined if bit 0 of the source register specification is set, since the register
number specifies an aligned coprocessor general register.
When the source operand is an Infinity or NaN, or the correctly rounded integer result is
outside of -2^63 to 2^63 - 1, the Invalid operation exception is raised. If the Invalid operation
exception is not enabled, then no exception is taken and 2^63 - 1 is returned.
This instruction is not implemented on MIPS I or MIPS II processors, and will cause an
unimplemented operation exception to occur.
Operation:
32, 64 T: StoreFPR (fd, L, ConvertFmt (ValueFPR (fs, fmt), fmt, L))
Exceptions:
Coprocessor unusable exception
Floating-Point exception
Reserved Instruction exception (in the 32 bit user or 32 bit supervisor mode)
Coprocessor Exceptions:
Invalid operation exception
Unimplemented operation exception
Inexact exception
Overflow exception
ROUND.W.fmt  Floating-Point Round to Single Fixed-Point Format

Bits:    31-26     25-21    20-16    15-11    10-6     5-0
Field:   COP1      fmt      0        fs       fd       ROUND.W
Value:   010001    -        00000    -        -        001100
Width:   6         5        5        5        5        6
Format:
ROUND.W.fmt fd, fs
Description:
The contents of the floating-point register specified by fs are interpreted in the specified
source format, fmt, and arithmetically converted to the single fixed-point format. The result
is placed in the floating-point register specified by fd.
Regardless of the setting of the current rounding mode, the conversion is rounded as if the
current rounding mode is round to nearest/even (RM = 0).
This instruction is valid only for conversion from single- or double-precision floating-
point formats. The operation is not defined if bit 0 of any register specification is set and the
FR bit in the Status register equals zero, since the register numbers specify an even-odd pair
of adjacent coprocessor general registers. When the FR bit in the Status register equals one,
both even and odd register numbers are valid.
When the source operand is an Infinity or NaN, or the correctly rounded integer result is
outside of -2^31 to 2^31 - 1, an Invalid operation exception is raised. If the Invalid operation
exception is not enabled, then no exception is taken and 2^31 - 1 is returned.
Operation:
32, 64 T: StoreFPR (fd, W, ConvertFmt (ValueFPR (fs, fmt), fmt, W))
Exceptions:
Coprocessor unusable exception
Floating-Point exception
Coprocessor Exceptions:
Invalid operation exception
Unimplemented operation exception
Inexact exception
Overflow exception
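The rounding and range check described for ROUND.W.fmt can be illustrated with a small software model. This is a minimal sketch, not TX49 code: the function name round_w and the (result, invalid) return convention are assumptions made for illustration, and only the case where the Invalid operation exception is not enabled is modeled.

```python
import math

INT32_MIN = -(2 ** 31)
INT32_MAX = 2 ** 31 - 1


def round_w(value):
    """Hypothetical model of ROUND.W.fmt with the Invalid operation trap disabled.

    Returns (result, invalid), where invalid reports whether the Invalid
    operation condition was detected.
    """
    # Infinity and NaN have no fixed-point representation.
    if math.isnan(value) or math.isinf(value):
        return INT32_MAX, True
    # Python's round() on a float rounds halfway cases to the even integer,
    # which corresponds to round to nearest/even (RM = 0).
    rounded = round(value)
    # The range check applies to the correctly rounded result.
    if rounded < INT32_MIN or rounded > INT32_MAX:
        return INT32_MAX, True
    return rounded, False
```

Halfway cases show the effect of the fixed rounding mode: 2.5 converts to 2 and 3.5 converts to 4, whatever rounding mode is currently selected.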
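SDC1     Store Doubleword from FPU (coprocessor 1)     SDC1

 31       26 25     21 20     16 15                                     0
+-----------+---------+---------+---------------------------------------+
|   SDC1    |  base   |   ft    |                offset                 |
|  111101   |         |         |                                       |
+-----------+---------+---------+---------------------------------------+
      6          5         5                       16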
Format:
SDC1 ft, offset (base)
Description:
The 16-bit offset is sign-extended and added to the contents of general register base to form
an unsigned effective address.
In 32-bit mode, the contents of registers ft and ft + 1 from the floating-point coprocessor are
stored at the memory location specified by the effective address. This instruction is not valid,
and is undefined, when the least significant bit of ft is non-zero.
In 64-bit mode, the contents of the 64-bit register ft are stored in the doubleword at the
memory location specified by the effective address. The FR bit of the Status register (SR26)
specifies whether all 32 registers of the TX49 are addressable. When FR = 0, this instruction
is not defined if the least significant bit of ft is non-zero. If FR = 1, ft may specify either odd
or even registers.
If any of the three least-significant bits of the effective address are non-zero, an address
error exception takes place.
Operation:
32  T:  vAddr ← ((offset_15)^16 || offset_15..0) + GPR[base]
        (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
        if SR26 = 1 then                /* 64-bit wide FGRs */
            data ← FGR[ft]
        elseif ft_0 = 0 then            /* valid specifier, 32-bit wide FGRs */
            data ← FGR[ft + 1] || FGR[ft]
        else                            /* undefined for odd 32-bit reg #s */
            data ← undefined^64
        endif
        StoreMemory (uncached, DOUBLEWORD, data, pAddr, vAddr, DATA)
64  T:  vAddr ← ((offset_15)^48 || offset_15..0) + GPR[base]
        (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
        if SR26 = 1 then                /* 64-bit wide FGRs */
            data ← FGR[ft]
        elseif ft_0 = 0 then            /* valid specifier, 32-bit wide FGRs */
            data ← FGR[ft + 1] || FGR[ft]
        else                            /* undefined for odd 32-bit reg #s */
            data ← undefined^64
        endif
        StoreMemory (uncached, DOUBLEWORD, data, pAddr, vAddr, DATA)
Exceptions:
Coprocessor unusable
TLB refill exception
TLB invalid exception
TLB modification exception
Bus error exception
Address error exception
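The SDC1 address calculation and register selection can be traced with a short software model. This is a sketch under stated assumptions, not an implementation of the processor: the names AddressError, sign_extend16, sdc1_effective_address and sdc1_data are hypothetical, address translation and the memory write are omitted, and None stands in for the undefined result.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


class AddressError(Exception):
    """Hypothetical stand-in for the Address error exception."""


def sign_extend16(offset):
    """Sign-extend the 16-bit offset field to a Python integer."""
    offset &= 0xFFFF
    return offset - 0x10000 if offset & 0x8000 else offset


def sdc1_effective_address(base, offset, addr_bits=32):
    """Return vAddr = sign_extend(offset) + GPR[base], wrapped to addr_bits."""
    vaddr = (base + sign_extend16(offset)) & ((1 << addr_bits) - 1)
    # A doubleword store requires the three low address bits to be zero.
    if vaddr & 0x7:
        raise AddressError(hex(vaddr))
    return vaddr


def sdc1_data(fgr, ft, fr):
    """Select the 64-bit value that SDC1 stores.

    fgr is a list of 32 register values and fr is the Status register FR bit.
    """
    if fr == 1:
        # 64-bit wide FGRs: any register number is valid.
        return fgr[ft] & MASK64
    if ft & 1 == 0:
        # 32-bit wide FGRs: even-odd pair, odd register in the upper word.
        return ((fgr[ft + 1] & MASK32) << 32) | (fgr[ft] & MASK32)
    # Odd register number with FR = 0: undefined.
    return None
```

For example, with FR = 0, FGR[2] = 0x11111111 and FGR[3] = 0x22222222, storing ft = 2 yields the doubleword 0x2222222211111111.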
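SQRT.fmt     Floating-Point Square Root     SQRT.fmt

 31       26 25     21 20     16 15     11 10      6 5         0
+-----------+---------+---------+---------+---------+-----------+
|   COP1    |   fmt   |    0    |   fs    |   fd    |   SQRT    |
|  010001   |         |  00000  |         |         |  000100   |
+-----------+---------+---------+---------+---------+-----------+
      6          5         5         5         5          6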
Format:
SQRT.fmt fd, fs
Description:
The contents of the floating-point register specified by fs are interpreted in the specified
format and the positive arithmetic square root is taken. The result is rounded as if
calculated to infinite precision and then rounded to the specified format, according to the
current rounding mode. If the value of fs corresponds to -0, the result will be -0. The result
is placed in the floating-point register specified by fd.
This instruction is valid only for single- or double-precision floating-point formats.
The operation is not defined if bit 0 of any register specification is set and the FR bit in the
Status register equals zero, since the register numbers specify an even-odd pair of adjacent
coprocessor general registers. When the FR bit in the Status register equals one, both even
and odd register numbers are valid.
Operation:
32, 64 T: StoreFPR (fd, fmt, SquareRoot (ValueFPR (fs, fmt)))
Exceptions:
Coprocessor unusable exception
Floating-Point exception
Coprocessor Exceptions:
Unimplemented operation exception
Invalid operation exception
Inexact exception
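SUB.fmt     Floating-Point Subtract     SUB.fmt

 31       26 25     21 20     16 15     11 10      6 5         0
+-----------+---------+---------+---------+---------+-----------+
|   COP1    |   fmt   |   ft    |   fs    |   fd    |    SUB    |
|  010001   |         |         |         |         |  000001   |
+-----------+---------+---------+---------+---------+-----------+
      6          5         5         5         5          6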
Format:
SUB.fmt fd, fs, ft
Description:
The contents of the floating-point registers specified by fs and ft are interpreted in the
specified format and the value in ft is subtracted from the value in fs. The result is rounded
as if calculated to infinite precision and then rounded to the specified format, according to
the current rounding mode. The result is placed in the floating-point register specified by fd.
This instruction is valid only for single- or double-precision floating-point formats.
The operation is not defined if bit 0 of any register specification is set and the FR bit in the
Status register equals zero, since the register numbers specify an even-odd pair of adjacent
coprocessor general registers. When the FR bit in the Status register equals one, both even
and odd register numbers are valid.
Operation:
32, 64 T: StoreFPR (fd, fmt, ValueFPR (fs, fmt) - ValueFPR (ft, fmt))
Exceptions:
Coprocessor unusable exception
Floating-Point exception
Coprocessor Exceptions:
Unimplemented operation exception
Invalid operation exception
Inexact exception
Overflow exception
Underflow exception
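SWC1     Store Word from FPU (coprocessor 1)     SWC1

 31       26 25     21 20     16 15                                     0
+-----------+---------+---------+---------------------------------------+
|   SWC1    |  base   |   ft    |                offset                 |
|  111001   |         |         |                                       |
+-----------+---------+---------+---------------------------------------+
      6          5         5                       16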
Format:
SWC1 ft, offset (base)
Description:
The 16-bit offset is sign-extended and added to the contents of general register base to form
an unsigned effective address. The contents of register ft from the floating-point coprocessor
are stored at the memory location specified by the effective address.
The FR bit of the Status register specifies whether all 64-bit Floating-Point Registers are
addressable. If FR equals zero, SWC1 stores either the high or low half of the 16 even
Floating-Point Registers. If FR equals one, SWC1 stores the low 32-bits of both even and odd
Floating-Point Registers.
If either of the two least-significant bits of the effective address are non-zero, an address
error exception occurs.
Operation:
32  T:  vAddr ← ((offset_15)^16 || offset_15..0) + GPR[base]
        (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
        pAddr ← pAddr_PSIZE-1..3 || (pAddr_2..0 xor (ReverseEndian || 0^2))
        byte ← vAddr_2..0 xor (BigEndianCPU || 0^2)
        /* the bytes of the word are put in the correct byte lanes in
         * "data" for a 64-bit path to memory */
        if SR26 = 1 then                /* 64-bit wide FGRs */
            data ← FGR[ft]_63-8*byte..0 || 0^(8*byte)
        else                            /* 32-bit wide FGRs */
            data ← 0^(32-8*byte) || FGR[ft] || 0^(8*byte)
        endif
        StoreMemory (uncached, WORD, data, pAddr, vAddr, DATA)
64  T:  vAddr ← ((offset_15)^48 || offset_15..0) + GPR[base]
        (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
        pAddr ← pAddr_PSIZE-1..3 || (pAddr_2..0 xor (ReverseEndian || 0^2))
        byte ← vAddr_2..0 xor (BigEndianCPU || 0^2)
        /* the bytes of the word are put in the correct byte lanes in
         * "data" for a 64-bit path to memory */
        if SR26 = 1 then                /* 64-bit wide FGRs */
            data ← FGR[ft]_63-8*byte..0 || 0^(8*byte)
        else                            /* 32-bit wide FGRs */
            data ← 0^(32-8*byte) || FGR[ft] || 0^(8*byte)
        endif
        StoreMemory (uncached, WORD, data, pAddr, vAddr, DATA)
Exceptions:
Coprocessor unusable
TLB refill exception
TLB invalid exception
TLB modification exception
Bus error exception
Address error exception
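The byte-lane placement in the SWC1 operation can be checked with a small model. This is a sketch only: the function name swc1_data and its argument names are assumptions made for illustration, and it reproduces just the computation of "byte" and "data", not address translation or the memory access.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def swc1_data(fgr_ft, vaddr, big_endian_cpu, fr):
    """Hypothetical model of the 64-bit "data" value built by SWC1.

    fgr_ft is the contents of FGR[ft], vaddr the word-aligned effective
    address, big_endian_cpu the BigEndianCPU flag and fr the FR bit (SR26).
    """
    # byte <- vAddr[2..0] xor (BigEndianCPU || 0^2)
    byte = (vaddr & 0x7) ^ ((1 if big_endian_cpu else 0) << 2)
    if fr == 1:
        # FGR[ft][63-8*byte..0] || 0^(8*byte): shift left and keep 64 bits.
        return (fgr_ft << (8 * byte)) & MASK64
    # 0^(32-8*byte) || FGR[ft] || 0^(8*byte): a 32-bit register moved to its lane.
    return (fgr_ft & MASK32) << (8 * byte)
```

On a big-endian CPU, a word stored at an address whose low three bits are zero lands in the upper half of the 64-bit data path; on a little-endian CPU it lands in the lower half.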
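TRUNC.L.fmt     Floating-Point Truncate to Long Fixed-Point Format     TRUNC.L.fmt

 31       26 25     21 20     16 15     11 10      6 5         0
+-----------+---------+---------+---------+---------+-----------+
|   COP1    |   fmt   |    0    |   fs    |   fd    |  TRUNC.L  |
|  010001   |         |  00000  |         |         |  001001   |
+-----------+---------+---------+---------+---------+-----------+
      6          5         5         5         5          6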
Format:
TRUNC.L.fmt fd, fs
Description:
The contents of the floating-point register specified by fs are interpreted in the specified
source format, fmt, and arithmetically converted to the long fixed-point format. The result
is placed in the floating-point register specified by fd.
Regardless of the setting of the current rounding mode, the conversion is rounded as if the
current rounding mode is round toward zero (1).
This instruction is valid only for conversion from single-, double-, extended or
quad-precision floating-point formats. If extended or quad-precision format is specified, the
operation is not defined if bit 0 of the source register specification is set, since the register
number specifies an aligned coprocessor general register.
When the source operand is an Infinity, NaN, or the correctly rounded integer result is
outside of -2^63 to 2^63-1, the Invalid operation exception is raised. If the Invalid operation is
not enabled then no exception is taken and 2^63-1 is returned.
This instruction is not implemented on MIPS I or MIPS II processors, and will cause an
unimplemented operation exception to occur.
Operation:
32, 64 T: StoreFPR (fd, L, ConvertFmt (ValueFPR (fs, fmt), fmt, L))
Note: The operation is the same in 32-bit kernel mode.
Exceptions:
Coprocessor unusable exception
Floating-Point exception
Reserved Instruction exception (in the 32 bit user or 32 bit supervisor mode)
Coprocessor Exceptions:
Invalid operation exception
Unimplemented operation exception
Inexact exception
Overflow exception
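TRUNC.W.fmt     Floating-Point Truncate to Single Fixed-Point Format     TRUNC.W.fmt

 31       26 25     21 20     16 15     11 10      6 5         0
+-----------+---------+---------+---------+---------+-----------+
|   COP1    |   fmt   |    0    |   fs    |   fd    |  TRUNC.W  |
|  010001   |         |  00000  |         |         |  001101   |
+-----------+---------+---------+---------+---------+-----------+
      6          5         5         5         5          6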
Format:
TRUNC.W.fmt fd, fs
Description:
The contents of the FPU register specified by fs are interpreted in the specified source
format fmt and arithmetically converted to the single fixed-point format. The result is
placed in the FPU register specified by fd.
Regardless of the setting of the current rounding mode, the conversion is rounded as if the
current rounding mode is round toward zero (RM = 1).
This instruction is valid only for conversion from single- or double-precision floating-point
formats. The operation is not defined if bit 0 of any register specification is set and the
FR bit in the Status register equals zero, since the register numbers specify an even-odd pair
of adjacent coprocessor general registers. When the FR bit in the Status register equals one,
both even and odd register numbers are valid.
When the source operand is an Infinity or NaN, or the correctly rounded integer result is
outside of -2^31 to 2^31-1, an Invalid operation exception is raised. If Invalid operation is not
enabled, then no exception is taken and -2^31 is returned.
Operation:
32, 64 T: StoreFPR (fd, W, ConvertFmt (ValueFPR (fs, fmt), fmt, W))
Exceptions:
Coprocessor unusable exception
Floating-Point exception
Coprocessor Exceptions:
Invalid operation exception
Unimplemented operation exception
Inexact exception
Overflow exception
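The truncating conversion of TRUNC.W.fmt can be modeled in the same spirit. This is a minimal sketch, assuming the Invalid operation exception is not enabled; the function name trunc_w and the (result, invalid) return convention are hypothetical. The value returned in the invalid case follows the description of this instruction.

```python
import math

INT32_MIN = -(2 ** 31)
INT32_MAX = 2 ** 31 - 1


def trunc_w(value):
    """Hypothetical model of TRUNC.W.fmt with the Invalid operation trap disabled.

    Returns (result, invalid), where invalid reports whether the Invalid
    operation condition was detected.
    """
    # Infinity and NaN have no fixed-point representation.
    if math.isnan(value) or math.isinf(value):
        return INT32_MIN, True
    # math.trunc() discards the fraction, i.e. rounds toward zero (RM = 1).
    truncated = math.trunc(value)
    if truncated < INT32_MIN or truncated > INT32_MAX:
        return INT32_MIN, True
    return truncated, False
```

Rounding toward zero means that 2.9 converts to 2 and -2.9 converts to -2.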
B.5 Bit Encoding of FPU Instruction OPcodes
Table B-6 shows the bit encoding of all TX49 FPU instructions (ISA and extended ISA).
Table B-6 FPU Operation Code Bit Encoding
Opcode (bits 31..26)

         Bits 28..26
31..29   0       1       2       3       4       5         6       7
  0
  1
  2              COP1
  3
  4
  5
  6              LWC1                            LDC1 θ
  7              SWC1                            SDC1 θ
Sub (bits 25..21)

         Bits 23..21
25..24   0       1          2       3       4       5          6       7
  0      MF      DMF η θ    CF      δ       MT      DMT η θ    CT      δ
  1      BC      δ          δ       δ       δ       δ          δ       δ
  2      S       D θ        δ       δ       W       L η θ      δ       δ
  3      δ       δ          δ       δ       δ       δ          δ       δ
Br (bits 20..16)

         Bits 18..16
20..19   0       1       2       3       4       5       6       7
  0      BCF     BCT     BCFL    BCTL    γ       γ       γ       γ
  1      γ       γ       γ       γ       γ       γ       γ       γ
  2      γ       γ       γ       γ       γ       γ       γ       γ
  3      γ       γ       γ       γ       γ       γ       γ       γ
CP1 Function (bits 5..0)

       Bits 2..0
5..3   0             1             2            3             4         5             6        7
  0    ADD           SUB           MUL          DIV           SQRT      ABS           MOV      NEG
  1    ROUND.L η θ   TRUNC.L η θ   CEIL.L η θ   FLOOR.L η θ   ROUND.W   TRUNC.W       CEIL.W   FLOOR.W
  2    δ             δ             δ            δ             δ         δ             δ        δ
  3    δ             δ             δ            δ             δ         δ             δ        δ
  4    CVT.S         CVT.D θ       δ            δ             CVT.W     CVT.L η θ     δ        δ
  5    δ             δ             δ            δ             δ         δ             δ        δ
  6    C.F           C.UN          C.EQ         C.UEQ         C.OLT     C.ULT         C.OLE    C.ULE
  7    C.SF          C.NGLE        C.SEQ        C.NGL         C.LT      C.NGE         C.LE     C.NGT
Key:
γ: This opcode is reserved for future use. An attempt to execute it causes a Reserved
Instruction exception.
δ: This opcode is reserved for future use. An attempt to execute it causes an Unimplemented
operation exception in all current implementations.
η: This opcode is valid only when MIPS III instructions are enabled. An attempt to execute
it without MIPS III instructions enabled will cause an Unimplemented operation exception.
θ: This opcode is valid only when the TX49 has a double precision FPU in hardware. An
attempt to execute it without one will cause an Unimplemented operation exception.
Note:
FPU instructions are valid only when the TX49 includes an FPU (CP1). Without an FPU, an
attempt to execute these instructions causes a Coprocessor Unusable exception, independent
of the value of C0_SR (bit 29).
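The field positions used in Table B-6 can be exercised with a small decoder. This is an illustrative sketch rather than a complete disassembler: the function name decode_cp1 and the dictionary layout are assumptions, only rows 0 and 1 of the CP1 Function table are covered, and the validity conditions marked δ, η and θ are not checked.

```python
COP1 = 0b010001

# Sub field (bits 25..21) values that select an operand format.
FMT_NAMES = {0b10000: "S", 0b10001: "D", 0b10100: "W", 0b10101: "L"}

# Rows 0 and 1 of the CP1 Function table (bits 5..3 equal to 0 and 1).
FUNCTION_NAMES = [
    "ADD", "SUB", "MUL", "DIV", "SQRT", "ABS", "MOV", "NEG",
    "ROUND.L", "TRUNC.L", "CEIL.L", "FLOOR.L",
    "ROUND.W", "TRUNC.W", "CEIL.W", "FLOOR.W",
]


def decode_cp1(word):
    """Split a 32-bit instruction word into the fields of Table B-6.

    Returns None when the word is not a COP1 operation covered by this sketch.
    """
    opcode = (word >> 26) & 0x3F
    sub = (word >> 21) & 0x1F
    function = word & 0x3F
    if opcode != COP1 or sub not in FMT_NAMES or function >= len(FUNCTION_NAMES):
        return None
    return {
        "mnemonic": FUNCTION_NAMES[function] + "." + FMT_NAMES[sub],
        "ft": (word >> 16) & 0x1F,
        "fs": (word >> 11) & 0x1F,
        "fd": (word >> 6) & 0x1F,
    }
```

For example, the word 0x4600110C has opcode 010001, Sub 10000 and function 001100, so the sketch reports ROUND.W.S with fs = 2 and fd = 4.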
Appendix C: Coprocessor 0 Hazards
C.1 Pipeline Interlock and Hazard in TX49
C.1.1 Interlock in Load Delay Slot
The pipeline control logic interlocks the pipeline when it detects a hazard condition, and the
pipeline does not resume until the hazard is resolved.
An example is shown in Figure C-1. In this case, the instruction in the load delay slot tries
to read the destination register of the load instruction, so the pipeline stalls until the
data is read from the cache.
lw $5, 0 ($26) F D E M W
addu $8, $7, $5 F D ES E M W
Cache Read Finish
Figure C-1 Interlock in Load Delay Slot
The pipeline also interlocks when a cache miss occurs or when data is loaded from an
uncached area (Figure C-2).
lw $5, 0 ($26) F D E M FX W
RD RD
addu $8, $7, $5 F D ES ES ES ES E M W
Read Bus Cycle by lw.
Cache Read Finish
Figure C-2 Interlock in Cache Miss or in the Data Load from Non-cached Area
In this example, where there is a register hazard between two consecutive instructions,
ADDU stalls at the E stage until the destination register of LW is written back.
However, if there is no data dependency between LW and ADDU, execution of ADDU
completes without a stall before the destination register of LW is written back. The pipeline
interlock occurs at the first instruction that has a data dependency on the preceding
load instruction (Figure C-3).
lw $5, 0 ($26) F D E M FX W
RD RD RD
addu $8, $7, $6 F D E M W
ori $9, $0, 0x1f F D E M W
addu $9, $8, $5 F D ES ES ES E M W
Figure C-3 Pipeline Interlock by Cache Miss
The pipeline also interlocks on a write-after-write hazard, which is illustrated in Figure C-4.
A write-after-write hazard is detected when one of the instructions following a load has the
same destination register as the load instruction. In this example, the
ADDU instruction stalls at its E stage until the destination register ($1) of the load is
written back.
lw $1, 0 ($26) F D E M FX W
RD RD RD
addu $8, $7, $6 F D E M W
ori $9, $0, 0x1f F D E M W
addu $1, $8, $5 F D ES ES ES E M W
Figure C-4 Write-after-write Hazard by Load Instruction
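The rule illustrated by Figures C-3 and C-4 (the interlock falls on the first later instruction that reads or rewrites the load's destination register) can be sketched as a simple scan. The function name first_load_interlock and the (dest, sources) tuple representation are assumptions made for illustration; whether a stall actually occurs also depends on when the load data arrives, which this sketch does not model.

```python
def first_load_interlock(load_dest, following):
    """Return the index of the first instruction that interlocks on a load.

    load_dest is the load's destination register number. following is a list
    of (dest, sources) tuples for the instructions after the load, in program
    order. Returns None when no instruction depends on the load.
    """
    for index, (dest, sources) in enumerate(following):
        reads_load_dest = load_dest in sources   # read-after-write dependency
        rewrites_load_dest = dest == load_dest   # write-after-write hazard
        if reads_load_dest or rewrites_load_dest:
            return index
    return None
```

Applied to the sequence of Figure C-3 (lw $5 followed by addu $8, $7, $6; ori $9, $0, 0x1f; addu $9, $8, $5), the scan selects the third following instruction, as it does for the sequence of Figure C-4.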
A SYNC instruction may be placed right after a load instruction. This causes a
pipeline stall until the bus cycle issued by the previous load instruction completes (Figure
C-5). If the data is read from the cache, there is no bus cycle pending before the SYNC,
so no pipeline stall occurs.
lw $5, 0 ($26) F D E M FX W
RD RD
sync F D E MS MS MS M W
Read Bus Cycle by lw.
Memory Read Finish
Figure C-5 SYNC Instruction After Load Instruction
C.1.2 Branch Delay Slot
Branch and jump instructions have a branch delay slot (Figure C-6). The DERET
instruction also has a branch delay slot. Note that the result is undefined when a
branch/jump instruction is placed in the branch delay slot (footnote 1).
beq $1, $4, L1 F D E M W
subu $3, $5, $6 (delay slot) F D E M W
L1: addiu $7, $7, 1 (target) F D E M W
Figure C-6 Branch Delay Slot
1 Instructions which cause an exception, such as SYSCALL, BREAK, and SDBBP, may be placed in the
branch delay slot.
C.1.3 Multiply, Multiply/Add and Division Instructions
This subsection explains the pipeline hazards/interlocks caused by combinations of
multiply, multiply/add, division, and MTHI/MTLO/MFHI/MFLO instructions (Figure
C-7). Basically, the pipeline hazards/interlocks caused by these instructions can be summarized
as follows:
• The pipeline interlocks when a data dependency exists.
• The pipeline interlocks when the preceding 32-bit multiply or 32-bit multiply/add
  instruction has an <rd> field.
• The pipeline interlocks when a 32-bit instruction and a 64-bit instruction are executed in
  sequence.
• The HI/LO registers are in an undefined state within two instructions before a division
  instruction, such as DIV/DIVU/DDIV/DDIVU (footnote 2).
                                                  SUCCEEDING INSTRUCTION
PRECEDING                  MULT/    MULT/    MADD/    MADD/    MTHI/   MFHI/   DIV/    DMULT/   DMULT/   DDIV/
INSTRUCTION                MULTU    MULTU    MADDU    MADDU    MTLO    MFLO    DIVU    DMULTU   DMULTU   DDIVU
                           (2-op)   (3-op)   (2-op)   (3-op)                           (2-op)   (3-op)
MULT/MULTU (2-operand)     NS       NS       NS       NS       I       I       I       I        I        I
MULT/MULTU (3-operand)     I        I        I        I        I       I       I       I        I        I
MADD/MADDU (2-operand)     NS       NS       NS       NS       I       I       I       I        I        I
MADD/MADDU (3-operand)     I        I        I        I        I       I       I       I        I        I
MTHI/MTLO                  NS       NS       NS       NS       NS      NS      NS      NS       NS       NS
MFHI/MFLO                  NS       NS       NS       NS       NS      NS      *       NS       NS       *
DIV/DIVU                   I        I        I        I        I       I       I       I        I        I
DMULT/DMULTU (2-operand)   I        I        I        I        I       I       I       I        I        I
DMULT/DMULTU (3-operand)   I        I        I        I        I       I       I       I        I        I
DDIV/DDIVU                 I        I        I        I        I       I       I       I        I        I
I: INTERLOCK   NS: NO STALL
*: HI/LO registers are in undefined state within two instructions before division instruction
Figure C-7 MAC pipeline hazard/interlock
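The matrix of Figure C-7 lends itself to a table lookup, for example in a scheduling or lint tool. The following is a sketch under stated assumptions: the short row and column keys (MULT2, MFHILO and so on) and the function name mac_hazard are hypothetical, while the entries are transcribed from the figure.

```python
# Column order of Figure C-7 (succeeding instruction), using hypothetical keys.
SUCCEEDING = [
    "MULT2", "MULT3", "MADD2", "MADD3", "MTHILO",
    "MFHILO", "DIV", "DMULT2", "DMULT3", "DDIV",
]

I, NS, UNDEF = "INTERLOCK", "NO STALL", "*"

# One row per preceding instruction, transcribed from Figure C-7.
TABLE = {
    "MULT2":  [NS] * 4 + [I] * 6,
    "MULT3":  [I] * 10,
    "MADD2":  [NS] * 4 + [I] * 6,
    "MADD3":  [I] * 10,
    "MTHILO": [NS] * 10,
    "MFHILO": [NS] * 6 + [UNDEF] + [NS] * 2 + [UNDEF],
    "DIV":    [I] * 10,
    "DMULT2": [I] * 10,
    "DMULT3": [I] * 10,
    "DDIV":   [I] * 10,
}


def mac_hazard(preceding, succeeding):
    """Look up the Figure C-7 entry for a pair of adjacent instructions.

    "*" means HI/LO are in an undefined state (see the note under the figure).
    """
    return TABLE[preceding][SUCCEEDING.index(succeeding)]
```

For example, mac_hazard("MULT2", "MADD3") reports NO STALL, while mac_hazard("MFHILO", "DIV") reports the "*" case in which the HI/LO registers are undefined.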
In the following sections, the pipeline hazards/interlocks caused by the possible
combinations of the instructions related to multiply, multiply/add, and division, in both 32-bit
and 64-bit operations, are illustrated in detail. The figures in the following sections
classify the cases as follows:
A The preceding instruction is immediately followed by a 32-bit multiply or multiply/add
instruction
B The preceding instruction is immediately followed by an MFHI or MFLO instruction
C The preceding instruction is immediately followed by an MTHI or MTLO instruction
D The preceding instruction is immediately followed by a 32-bit division instruction
E The preceding instruction is immediately followed by a 64-bit multiply instruction
F The preceding instruction is immediately followed by a 64-bit division instruction
2 In the original R3000, this can be applied to MULT, MULTU, MTHI, and MTLO instructions.
Case 1: Preceding Instruction Is 32-bit Multiply or 32-bit Multiply/Add Instruction
A. 32-bit Multiply and Multiply/Add Instructions
Pipeline interlocks when data dependency or write
back data into <rd> exists.
2-operand instruction is preceding
MULT/MADD $3, $4 F D E1 E2 E3 M W
MULT/MADD $6, $7, $8 F D E1 E2 E3 M W
Multiply Stage 1 Multiply Stage 4
With data dependency
MULT/MADD $3, $4, $5 F D E1 E2 E3 M W
MULT/MADD $6, $3, $8 F D ES ES ES E1 E2 E3 M W
B. MFHI/MFLO Instructions
Pipeline interlocks until the result of the MULT/MADD
instruction is stored into <rd> and HI/LO register.
MULT/MADD $3, $4, $5 F D E1 E2 E3 M W
MFHI/MFLO F D ES ES E M W
HI/LO read
C. MTHI/MTLO Instructions
Pipeline interlocks until the result of the MULT/MADD
instruction is stored into <rd> and HI/LO register.
MULT/MADD $3, $4, $5 F D E1 E2 E3 M W
MTHI/MTLO F D ES ES E M W
Update HI/LO
Update HI/LO
D. 32-bit Division Instruction
The result of 3-operand multiply instruction is stored
in <rd>, and HI/LO registers are eventually updated
by division instruction.
MULT $3, $4, $5
F D E1 E2 E3 M W
DIV $6, $7 F D ES ES E M W
V1 V2 V3 V4 V36
Division stage 1
E. 64-bit Multiply Instructions
Pipeline interlocks when data dependency or write
back data into <rd> exists.
2-operand instruction is preceding
MULT $6, $3 F D E1 E2 E3 M W
DMULT $4, $7 F D ES ES E1 E2 E6 M W
With data dependency
MULT $3, $4, $5
F D E1 E2 E3 M W
DMULT $6, $3, $8 F D ES ES E1 E2 E6 M W
F. 64-bit Division Instruction
The result of 3-operand multiply instruction is stored
in <rd>, and HI/LO registers are eventually updated
by division instruction.
MULT $3, $4, $5
F D E1 E2 E3 M W
DDIV $6, $7 F D ES ES E M W
V1 V2 V3 V4 V68
Division stage 1
Figure C-8 Pipeline Hazard/Interlock by 32-bit Multiply or 32-bit Multiply/Add Instruction
Note that in category A of Figure C-8, the pipeline interlocks for any instruction
immediately after the multiply or multiply/add instruction when that instruction has a data
dependency on the general-purpose registers. Thus, in category D, the DIV
instruction stalls at the E stage for three cycles when the division instruction has a data
dependency on the preceding multiply instruction.
Also note that in category D of Figure C-8, because the division instruction
overwrites the HI/LO registers, the contents of the HI/LO registers produced by the 2-operand
multiply instruction are undefined. The result of the multiply instruction, as in this figure,
is correctly stored in the <rd> register. If the preceding multiply or multiply/add
instruction has an <rd> field, the pipeline interlocks due to the resource conflict.
Case 2: Preceding Instruction Is MFHI/MFLO Instruction
A. 32-bit Multiply and Multiply/Add Instructions
MULT/MADD updates the HI/LO registers at M
stage and the prior MFHI/MFLO can read the HI/LO
registers before the update.
MFHI/MFLO F D E M W
MULT/MADD $6, $7, $8 F D E1 E2 E3 M W
Update HI/LO
Read HI/LO
B. MFHI/MFLO Instructions
No hazard.
MFHI/MFLO F D E M W
MFHI/MFLO F D E M W
C. MTHI/MTLO Instructions
No hazard because MTHI/MTLO updates HI/LO
registers at M stage.
MFHI/MFLO F D E M W
MTHI/MTLO F D E M W
Update HI/LO
Read HI/LO
D. 32-bit Division Instruction
It is necessary to insert at least two instructions
between MFHI/MFLO and DIV.
MFHI/MFLO F D E M W
nop F D E M W
nop F D E M W
DIV F D E M W
V1 V2 V3 V36
Update HI/LO
E. 64-bit Multiply Instructions
DMULT updates the HI/LO registers at M stage and
the prior MFHI/MFLO can read the HI/LO registers
before the update.
MFHI/MFLO F D E M W
DMULT $6, $7, $8 F D E1 E2 E6 M W
Update HI/LO
Read HI/LO
F. 64-bit Division Instruction
It is necessary to insert at least two instructions
between MFHI/MFLO and DDIV.
MFHI/MFLO F D E M W
nop F D E M W
nop F D E M W
DDIV F D E M W
V1 V2 V3 V68
Update HI/LO
Figure C-9 Pipeline Hazard/Interlock by MFHI/MFLO Instructions
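The requirement in categories D and F of Figure C-9, that at least two instructions separate MFHI/MFLO from a following division, can be enforced mechanically. The sketch below pads with nop instructions, which is only one way to satisfy the rule (scheduling useful independent instructions in the gap works equally well). The function name pad_before_divide and the representation of a program as a list of mnemonic strings are assumptions.

```python
HI_LO_READS = {"mfhi", "mflo"}
DIVIDES = {"div", "divu", "ddiv", "ddivu"}


def pad_before_divide(mnemonics):
    """Return a copy of the sequence with nops inserted so that at least two
    instructions separate each MFHI/MFLO from a following divide."""
    padded = []
    for mnemonic in mnemonics:
        if mnemonic in DIVIDES:
            # Inspect the two instructions immediately before the divide and
            # keep padding while either of them reads HI/LO.
            while any(m in HI_LO_READS for m in padded[-2:]):
                padded.append("nop")
        padded.append(mnemonic)
    return padded
```

Given the sequence mflo, div, the function produces mflo, nop, nop, div, which matches the spacing shown in Figure C-9.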
Case 3: Preceding Instruction Is MTHI/MTLO Instruction
A. 32-bit Multiply and Multiply/Add Instructions
MULT/MADD updates the HI/LO registers at M
stage and MADD can use HI/LO registers updated
by the prior MTHI/MTLO.
MTHI/MTLO F D E M W
MULT/MADD $6, $7, $8 F D E1 E2 E3 M W
Update HI/LO
Update HI/LO
B. MFHI/MFLO Instructions
No hazard because MTHI/MTLO updates the HI/LO
registers before MFHI/MFLO reads them.
MTHI/MTLO F D E M W
MFHI/MFLO F D E M W
Read HI/LO
Update HI/LO
C. MTHI/MTLO Instructions
No hazard.
MTHI/MTLO F D E M W
MTHI/MTLO F D E M W
Update HI/LO
Update HI/LO
D. 32-bit Division Instruction
The division instruction starts to update HI/LO
registers at E stage, and the prior MTHI/MTLO has
no meaning.
MTHI/MTLO F D E M W
DIV F D E M W
V1 V2 V3 V36
Update HI/LO
E. 64-bit Multiply Instructions
DMULT updates the HI/LO registers at M stage.
MTHI/MTLO F D E M W
DMULT $6, $7, $8 F D E1 E2 E3 E4 E5 E6 M W
Update HI/LO
Update HI/LO
F. 64-bit Division Instruction
The division instruction starts to update HI/LO
registers at E stage, and the prior MTHI/MTLO has
no meaning.
MTHI/MTLO F D E M W
DDIV F D E M W
V1 V2 V3 V68
Update HI/LO
Figure C-10 Pipeline Hazard/Interlock by MTHI/MTLO Instructions
Case 4: Preceding Instruction Is 32-bit Division Instruction
A. 32-bit Multiply and Multiply/Add Instructions
Pipeline interlocks till the division instruction is
completed.
DIV F D E M W
V1 V2 V3 V36
MULT/MADD $6, $7, $8
F D ES ES ES E1 E3 M W
B. MFHI/MFLO Instructions
Pipeline interlocks because of data dependency.
DIV F D E M W
V1 V2 V3 V36
MFHI/MFLO F D ES ES ES E M W
C. MTHI/MTLO Instructions
Pipeline interlocks till the division instruction is
completed.
DIV F D E M W
V1 V2 V3 V36
MTHI/MTLO F D ES ES ES E M W
D. 32-bit Division Instruction
Pipeline interlocks till the division instruction is
completed.
DIV F D E M W
V1 V2 V3 V36
DIV F D ES ES ES E M W
V1 V2 V3 V36
E. 64-bit Multiply Instructions
Pipeline interlocks till the division instruction is
completed.
DIV F D E M W
V1 V2 V3 V36
DMULT $6, $7, $8
F D ES ES ES E1 E6 M W
F. 64-bit Division Instruction
Pipeline interlocks till the division instruction is
completed.
DIV F D E M W
V1 V2 V3 V36
DDIV F D ES ES ES E M W
V1 V2 V3 V68
Figure C-11 Pipeline Hazard/Interlock by Division Instructions
Case 5: Preceding Instruction Is 64-bit Multiply Instruction
A. 32-bit Multiply and Multiply/Add Instructions
Pipeline interlocks till the multiply instruction is
completed.
DMULT $3, $4
F D E1 E2 E3 E4 E5 E6 M W
MULT/MADD $6, $7, $8
F D ES ES ES ES E1 E2 E3 M W
B. MFHI/MFLO Instructions
Pipeline interlocks because of data dependency.
DMULT F D E1 E2 E3 E4 E5 E6 M W
MFHI/MFLO F D ES ES ES E M W
C. MTHI/MTLO Instructions
Pipeline interlocks till the multiply instruction is
completed.
DMULT F D E1 E2 E3 E4 E5 E6 M W
MTHI/MTLO F D ES ES ES ES E M W
D. 32-bit Division Instruction
Pipeline interlocks till the multiply instruction is
completed.
DMULT $3, $4
F D E1 E2 E3 E4 E5 E6 M W
DIV $6, $7
F D ES ES ES ES E M W
V1 V2 V3 V36
E. 64-bit Multiply Instructions
Pipeline interlocks till the multiply instruction is
completed.
DMULT $3, $4
F D E1 E2 E3 E4 E5 E6 M W
DMULT $6, $7, $8
F D ES ES ES ES E1 E6 M W
F. 64-bit Division Instruction
Pipeline interlocks till the multiply instruction is
completed.
DMULT $3, $4
F D E1 E2 E3 E4 E5 E6 M W
DDIV $6, $7
F D ES ES ES ES E M W
V1 V2 V3 V68
Figure C-12 Pipeline Hazard/Interlock by 64-bit Multiply Instructions
Case 6: Preceding Instruction Is 64-bit Division Instruction
A. 32-bit Multiply and Multiply/Add Instructions
Pipeline interlocks till the division instruction is
completed.
DDIV F D E M W
V1 V2 V3 V68
MULT/MADD $6, $7, $8
F D ES ES ES E1 E3 M W
B. MFHI/MFLO Instructions
Pipeline interlocks because of data dependency.
DDIV F D E M W
V1 V2 V3 V68
MFHI/MFLO F D ES ES ES E M W
C. MTHI/MTLO Instructions
Pipeline interlocks till the division instruction is
completed.
DDIV F D E M W
V1 V2 V3 V68
MTHI/MTLO F D ES ES ES E M W
D. 32-bit Division Instruction
Pipeline interlocks till the division instruction is
completed.
DDIV F D E M W
V1 V2 V3 V68
DIV F D ES ES ES E M W
V1 V2 V3 V36
E. 64-bit Multiply Instructions
Pipeline interlocks till the division instruction is
completed.
DDIV F D E M W
V1 V2 V3 V68
DMULT $6, $7, $8
F D ES ES ES E1 E6 M W
F. 64-bit Division Instruction
Pipeline interlocks till the division instruction is
completed.
DDIV F D E M W
V1 V2 V3 V68
DDIV F D ES ES ES E M W
V1 V2 V3 V68
Figure C-13 Pipeline Hazard/Interlock by Division Instructions
C.1.4 Instructions regarding System Control Co-processor (CP0)
C.1.4.1 MFC0 and MTC0 Instructions
Pipeline interlocks when the MFC0 instruction is followed by the instruction that
reads the destination register of MFC0 instruction (Figure C-14).
mfc0 $5, EPC F D E M W
addu $8, $7, $5 F D ES E M W
EPC Read
Stall
Figure C-14 Pipeline Interlock by MFC0 Instruction
No pipeline hazards occur when the MTC0 instruction is followed by MFC0
instruction because MTC0 writes the destination register in the M stage and MFC0
reads it also in the M stage (Figure C-15).
mtc0 $5, DEPC F D E M W
mfc0 $8, DEPC F D E M W
DEPC Write
DEPC Read
Figure C-15 MTC0 Instruction Followed by MFC0 Instruction
C.1.4.2 ERET Instruction
Unlike a branch or jump instruction, ERET does not execute the next instruction.
The changed EPC becomes effective at the second instruction after the MTC0
instruction (Figure C-16).
mtc0 $5, EPC F D E M W
nop F D E M W
eret F D E M W
nop F D E M W
EPC Update
Figure C-16 MTC0 Instruction Followed by ERET Instruction
C.1.4.3 DERET Instruction
The DERET instruction has a branch delay slot, and the debug exception mode remains in
effect through the delay slot instruction (footnote 3). The instruction in the delay slot of DERET
must be a NOP instruction. The single step exception is disabled until the instruction to
which DERET returns control.
mtc0 $5, DEPC F D E M W
nop F D E M W
deret F D E M W
nop F D E M W
DEPC Update
Figure C-17 MTC0 Instruction Followed by DERET Instruction
3 i.e. DM bit stays one (1) and interrupts and exceptions stay disabled.
C.1.5 Control Bits Change in CP0 Registers by MTC0 Instruction
The following sections describe when a change made to the control bits by the MTC0
instruction becomes effective.
C.1.5.1 Status Register
CU Bits: Because co-processor instructions refer to the CU bit in the D stage, if
either of the two instructions following the MTC0 instruction is a
co-processor instruction, its result is undefined because the CU bit is
undefined (Figure C-18).
mtc0 $5, STATUS F D E M W
nop F D E M W
nop F D E M W
copz F D E M W
CU Bit Update
CU Bit Read
Figure C-18 Hazard regarding the CU Bits
Note that even if the CU bit is changed by the MTC0 instruction during the
co-processor bus cycles of the preceding co-processor instruction, this has no effect on
the co-processor instruction currently being executed.
RE Bit: Because load/store instructions refer to the RE bit in the E stage, the
change becomes effective at the second instruction after the MTC0
instruction. The result of a load/store instruction immediately after
the MTC0 instruction is undefined (Figure C-19).
mtc0 $5, STATUS F D E M W
nop F D E M W
lw F D E M W
RE Bit Update
RE Bit Read
Figure C-19 Hazard regarding the RE Bits
Note that even if the RE bit is changed by the MTC0 instruction during the bus
cycles of the preceding load/store instruction, this has no effect on the load/store
instruction currently being executed.
BEV Bit: For exceptions that occur in the E stage, such as the address error
(AdEL) or TLB miss (TLBL) exceptions raised in the
instruction fetch stage, the exception vector base address designated by
the changed BEV bit becomes effective at the second instruction after the
MTC0 instruction. If these exceptions occur in the instruction
immediately after the MTC0 instruction, the value of the BEV bit that is referenced
is undefined (footnote 4) (Figure C-20).
mtc0 $5, STATUS F D E M W
nop F D E M W
lw F D E XXXX
BEV Bit Update
E Stage Exception Occurs
Figure C-20 Hazard regarding the BEV Bits (1)
For exceptions that occur in the M stage, such as IBE, DBE, NMI, CpU, Ov,
Sys, Bp, RI, AdEL (data), TLBL (data), TLBS, Mod, and Int, the exception vector
base address designated by the changed BEV bit becomes effective at the instruction
immediately after the MTC0 instruction (Figure C-21).
mtc0 $5, STATUS F D E M W
lw F D E M XXXX
BEV Bit Update
M Stage Exception Occurs
Figure C-21 Hazard regarding the BEV Bits (2)
Note that because interrupts and the Bus Error exception occur
asynchronously with instruction execution, the BEV bit value used for them is the
value held in the BEV bit when they occur.
IntMask Bits and IE Bit:
When the MTC0 instruction enables interrupts by changing these bits,
the corresponding interrupts become enabled at the second
instruction after the MTC0 instruction (footnote 5) (Figure C-22).
On the other hand, when the MTC0 instruction disables the interrupts, the
corresponding interrupts become disabled at the instruction immediately after the
MTC0 instruction (Figure C-23).
FR Bit: Because the FR bit is changed in the M stage of the MTC0 instruction,
the new FR bit value becomes effective at the third instruction after the MTC0
instruction (Figure C-24).
4 The new exception vector base address may already be effective because of a pipeline stall.
5 The interrupts may become enabled at the instruction immediately after the MTC0 instruction because of a
pipeline stall.
mtc0 $5, STATUS F D E M W
nop F D E M W
lw (Interrupt Enabled) F D E M W
IntMask/IE Update
(Interrupt Enable)
Interrupt Occurs
Figure C-22 Hazard regarding the IntMask Bits and IE Bit (1)
mtc0 $5, STATUS F D E M W
lw (Interrupt Disabled) F D E M W
IntMask/IE Update
(Interrupt Disable)
Figure C-23 Hazard regarding the IntMask Bits and IE Bit (2)
mtc0 $5, STATUS F D E M W
nop F D E M W
nop F D E M W
dmtc1 F D E M W
FR Bit Update
Reference FR Read
Figure C-24 Hazard regarding the FR Bit
EXL, ERL, KX, SX, UX, KSU Bits:
The modification of these bits becomes effective at the fourth instruction
after the MTC0 instruction. On the other hand, the new addressing mode for
a load/store instruction that accesses an address in
Kernel/Supervisor space or uses 64-bit addressing is effective at
the second instruction after the MTC0 instruction. If either of the two
instructions after the MTC0 instruction is a coprocessor instruction, the result
of that instruction is undefined (Figure C-25).
mtc0 $5, STATUS F D E M W
nop F D E M W
lw F D E M W
cpz F D E M W
sd F D E M W
Update
MIPS-III instruction
64-bit addressing
Kernel or
Supervisor mode
Figure C-25 EXL, ERL, KX, SX, UX, KSU Bit
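The Status register hazards described in this section can be collected into a small lookup table, for instance for use in an assembly-source checker. The following Python sketch is illustrative only: the names `STATUS_EFFECTIVE_AT` and `nops_needed` are hypothetical, and the numbers are the no-stall figures quoted in the text (footnotes 4 and 5 note that a pipeline stall may make a new value effective earlier).

```python
# Position (1 = instruction immediately after MTC0) at which a change to a
# Status register field is stated to be effective in this section.
STATUS_EFFECTIVE_AT = {
    "BEV (E-stage exceptions)": 2,
    "BEV (M-stage exceptions)": 1,
    "IntMask/IE (enable)": 2,
    "IntMask/IE (disable)": 1,
    "FR": 3,
    "EXL/ERL/KX/SX/UX/KSU": 4,
    "EXL/ERL/KX/SX/UX/KSU (load/store addressing)": 2,
}


def nops_needed(field: str) -> int:
    """Number of filler instructions to place between the MTC0 and the
    first instruction that relies on the new value of `field`."""
    return STATUS_EFFECTIVE_AT[field] - 1
```

For example, `nops_needed("FR")` gives 2, which matches the two `nop` instructions between `mtc0` and `dmtc1` in Figure C-24.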
C.1.5.2 Config Register
ICE# Bit: The MTC0 instruction may change the ICE# bit during instruction
cache streaming. In this case, the old ICE# bit value remains effective for the
instructions executed during the streaming (Figure C-26).
mtc0 $5, Config ; update ICE# bit
nop
beq $0, $0, L1 ; stop instruction streaming
nop
L1: lw $2, 0 ($0) ; new ICE# bit is effective
Figure C-26 ICE# Bit update
DCE# Bit: The changed DCE# bit becomes effective at the second instruction after the
MTC0 instruction. The DCE# bit is undefined at the instruction
immediately after the MTC0 instruction. Note that the MTC0 instruction
may change the DCE# bit during a data cache refill. In this case, a
hardware interlock delays the update of the DCE# bit until the data cache refill
finishes.
K0 Bits: The modification of these bits becomes effective at the fourth instruction
after the MTC0 instruction; the result of an instruction in the Kseg0 address
space is undefined if it is executed as the first, second or third instruction
after the MTC0 instruction. On the other hand, for a load/store instruction
accessing the Kseg0 address space, the modification of these bits is effective
at the third instruction after the MTC0 instruction; the new addressing mode for
such an access is undefined if the instruction is executed as the first or
second instruction after the MTC0 instruction.
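One possible reading of the K0 rule can be expressed as a simple check over the instructions that follow the MTC0. This Python sketch is an assumption-laden illustration, not a definitive statement of the hardware behavior: the function name `k0_undefined_positions` and the pair representation of each instruction are hypothetical.

```python
def k0_undefined_positions(following):
    """Return the 1-based positions, counted from the instruction right
    after an MTC0 that changes K0, whose result the text calls undefined.

    `following` is a list of pairs (runs_in_kseg0, load_store_to_kseg0):
      runs_in_kseg0       -- the instruction itself is in the Kseg0 space
      load_store_to_kseg0 -- it is a load/store that accesses Kseg0
    """
    undefined = []
    for pos, (runs_in_kseg0, load_store_to_kseg0) in enumerate(following, start=1):
        if runs_in_kseg0 and pos <= 3:
            # Instructions in Kseg0: new K0 is effective at the fourth.
            undefined.append(pos)
        elif load_store_to_kseg0 and pos <= 2:
            # Loads/stores to Kseg0: new K0 is effective at the third.
            undefined.append(pos)
    return undefined
```

Under this reading, code that changes K0 would run outside Kseg0 and avoid Kseg0 loads and stores in the first two slots after the MTC0.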
C.2 Pipeline Behavior on Cache Miss
This section describes the pipeline behavior on a cache miss.
C.2.1 Instruction Cache Miss
An instruction cache miss is detected in the F stage and is immediately followed by a cache
refill cycle (Figure C-27).
GRD
GDIN[31:0]
addu $5, $26, $7 F D E M W
addu $8, $7, $6 F D E M W
lw $2, 0 ($1) F D E M W
addu $9, $8, $5 F DS DS DS DS DS D E M W
subu $5, $3, $7 F D E M W
addu subu
Inst. Cache Miss
Instruction Cache Refill
Figure C-27 Streaming on Instruction Cache Refill Cycle in 32-bit GBus mode
On a cache miss, the fetched instructions are decoded and executed immediately, before
the refill cycle completes, so that the pipeline resumes execution of the instruction stream
as shown in Figure C-27. This is called streaming (see footnote 6), and the refill cycle is
called a stream cycle.
When a branch or jump instruction is executed during the stream cycle, streaming
is terminated: the refill cycle runs to completion, but the fetched instructions
after the branch delay slot are not executed. The pipeline stalls until the instruction
at the branch or jump target is fetched (Figure C-28).
6 No streaming occurs in 64-bit GBus mode with a 1:1 GBus clock rate. The TX49 executes one instruction per
clock cycle even if two instructions are fetched in one cycle. In this case, the fetched instructions are not
executed until the refill cycle completes.
GRD
GDIN[31:0]
addu $5, $26, $7 F DS DS DS DS DS D E M W
subu $9, $8, $5 F D E M W
jr $25 F D E M W
lw $2, 0 ($1) F D E M W
lw $3, 0 ($5) (target instruction) F D E M W
addu subu
Instruction Cache Refill
Inst. Cache Miss
jr lw
Jump
Figure C-28 Branch/Jump Instruction during Stream Cycle in GBus 32-bit Mode
C.2.2 Data Cache Miss
A data cache miss is detected in the M stage of a load instruction and is immediately
followed by a cache refill cycle. The non-blocking load mechanism implemented in the TX49 data
cache allows the following instruction stream to be executed without waiting for the
completion of the data cache refill if there is no data dependency between the load and the
following instructions.
The pipeline stalls at the E stage of the instruction that uses the refilled data as a
source until the data is loaded (Figure C-29).
lw $5, 0 ($26) F D E M FX W
RD RD RD
addu $8, $7, $6 F D E M W
ori $9, $0, 0x1f F D E M W
addu $9, $8, $5 F D ES ES ES E M W
Figure C-29 Pipeline Interlock by Cache Miss
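The non-blocking load behavior can be approximated with a very small in-order timing model. The sketch below is a simplification under stated assumptions: one instruction issues per cycle, only the E-stage interlock on the load's destination register is modelled, and the cycle at which the refilled data becomes usable (`data_ready_cycle`) is an arbitrary input rather than a TX49 figure. The function name `e_stage_cycles` is hypothetical.

```python
def e_stage_cycles(following, load_dest, data_ready_cycle, load_e_cycle=2):
    """Return the cycle in which each instruction after a missing load
    completes its E stage.

    following        -- one tuple of source registers per instruction
    load_dest        -- register written by the load that missed
    data_ready_cycle -- first cycle in which the refilled data is usable
    load_e_cycle     -- E-stage cycle of the load itself (F=0, D=1, E=2)
    """
    cycles = []
    previous = load_e_cycle
    for sources in following:
        earliest = previous + 1          # in-order, one instruction per cycle
        if load_dest in sources:
            # Interlock: wait in the E stage until the load data arrives.
            earliest = max(earliest, data_ready_cycle)
        cycles.append(earliest)
        previous = earliest
    return cycles


# Instruction sequence of Figure C-29, with an assumed data-ready cycle of 8.
figure_c29 = [("$7", "$6"), ("$0",), ("$8", "$5")]
stalled = e_stage_cycles(figure_c29, "$5", data_ready_cycle=8)
```

In this run the two independent instructions proceed without delay, and only the final `addu`, which reads `$5`, waits for the refill.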
The pipeline also interlocks when a load/store instruction is issued during the data
cache refill cycle because of the resource (i.e. data cache) conflict (Figure C-30).
lw $5, 0 ($26) F D E M FX W
RD RD RD
lw $7, 0 ($25) F D E MS MS MS MS M W
ori $9, $0, 0x1f F D ES ES ES ES E M W
addu $9, $8, $5 F DS DS DS DS D E M W
Reference FR Read
resource conflict
Figure C-30 Load Instruction during the Data Cache Refill Cycle
A conflict at the W stage can occur between a load instruction and one of the
following instructions if the load instruction causes a cache refill cycle. This situation is
shown in Figure C-31.
In this case, the W stage of the load instruction takes precedence, resulting in a one-cycle stall at
the M stage of the addu instruction.
lw $5, 0 ($26) F D E M FX W
RD RD RD
addu $4, $3, $7 F D E M W
ori $9, $0, 0x1f F D E M W
addu $9, $8, $7 F D E M W
addu $7, $6, $8 F D E MS M W
W stage Resource Conflict
Data Cache Miss
Figure C-31 W stage Pipeline Register Conflict
If the instruction fetch cycle is requested during the data cache refill cycle, the data
cache refill completes first followed by the instruction fetch cycle (Figure C-32).
lw $5, 0 ($26) F D E M M W
RD RD RD
addu $7, $6, $8 F D E M W
addu $4, $3, $7 F D E M W
ori $9, $0, 0x1f F D E M W
addu $9, $8, $5 F DS DS DS DS DS DS DS D E M W
addu $7, $6, $5 F D E M W
Inst. Cache Miss
Data Cache Miss
Figure C-32 Instruction Cache Miss during the Data Cache Refill Cycle
C.3 Pipeline Behavior in Uncached Area
The pipeline behavior for a memory access to an uncached area is similar to that of
the refill cycle sequence caused by a cache miss.
C.3.1 Data Read from Uncached Area
lw $5, 0 ($26) F D E M – FX W
RD RD RD RD
addu $8, $7, $6 F D E M W
ori $9, $0, 0x1f F D E M W
addu $9, $8, $5 F D ES ES ES ES E M W
Figure C-33 Data Read from Uncached Area
C.3.2 Instruction Fetch from Uncached Area
addu $5, $3, $3 F DS DS DS DS DS D E M W
lw $2, 0 ($1) F DS D E M W
ori $9, $0, 0x1f F DS D E M W
addu $8, $9, $8 F DS D E M W
Figure C-34 Instruction Fetch from Uncached Area
C.3.3 Data Write to Uncached Area
sw $5, 0 ($26) F D E M W
WR
addu $8, $7, $6 F D E M W
ori $9, $0, 0x1f F D E M W
addu $9, $8, $5 F D E M W
Write to Write Buffer
Figure C-35 Data Write to Uncached Area
C.4 Timings on the Exception Handling
This section describes the detailed pipeline behavior on exceptions. When an exception takes
place, the instruction on which the exception occurs is aborted. All instructions that
follow that instruction in the pipeline are also aborted, and the processor passes control to the exception
handler.
Exceptions normally occur in the M stage, but some exceptions occur in the E
stage. The exceptions that occur in the E stage are:
Debug Single Step (DSS)
Debug Instruction Break (DIB)
Address Error on Instruction Fetch (AdEL)
TLB Refill/Invalid on Instruction Fetch (TLBL)
Note that the Reset and Soft Reset exceptions can occur in any stage.
C.4.1 Basic Pipeline Behavior When Exceptions Occur
The following figure illustrates the pipeline behavior when an exception occurs.
lw $5, 0 ($26) F D E M W
addu $7, $6, $8 F D E M Aborted
addu $4, $3, $7 F D E Aborted
ori $9, $0, 0x1f F D Aborted
addu $9, $8, $5 F Aborted
addu $7, $6, $5 F D E M W
Exception Detected
Exception Handler
(a) Exception Detected in the M Stage
lw $5, 0 ($26) F D E M W
addu $4, $3, $7 F D E Aborted
ori $9, $0, 0x1f F D Aborted
addu $9, $8, $5 F Aborted
addu $7, $6, $5 F D E M W
Exception Detected
Exception Handler
(b) Exception Detected in the E Stage
Figure C-36 Pipeline Behavior in Case of Exception
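The abort pattern in Figure C-36 follows directly from the stage in which the exception is detected. The Python sketch below reproduces it, assuming a stall-free five-stage pipeline with one instruction entering per cycle; the names `STAGES` and `aborted_stages` are hypothetical.

```python
STAGES = ["F", "D", "E", "M", "W"]


def aborted_stages(detect_stage):
    """Map each aborted instruction to the stage it occupies when the
    exception is detected (0 = the instruction that raised the exception,
    1 = the next instruction in program order, and so on)."""
    depth = STAGES.index(detect_stage)
    return {offset: STAGES[depth - offset] for offset in range(depth + 1)}
```

`aborted_stages("M")` yields four aborted instructions (in M, E, D and F), as in part (a) of the figure, and `aborted_stages("E")` yields three (in E, D and F), as in part (b).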
C.4.2 Exceptions during the Execution of Multi-cycle Instructions
As described in the section entitled Multiply, Multiply/Add and Division Instructions,
a multi-cycle instruction that does not have a destination in the register file, such as DIV, is
executed in parallel with the following instructions if there is no data dependency between them.
If an exception takes place at an instruction being executed in parallel with this type of
multi-cycle instruction, the preceding multi-cycle instruction is completed, while the
instructions after the exception are aborted and control is passed to the exception
handler.
div $8, $9 F D E M W
V1 V2 V3 V4 V5 V6 V7 ... V35 V36
addu $7, $6, $5 F D E M Aborted
addu $4, $3, $7 F D E Aborted
ori $9, $0, 0x1f F D Aborted
addu $9, $8, $5 F Aborted
addu $7, $6, $5 F D E M W
Exception Detected
Exception Handler
Figure C-37 Exception during the Execution of Division Instruction
C.4.3 Exceptions during the Data Cache Refill Cycle
When an exception occurs at an instruction that is being executed in parallel
with a data cache refill, the data cache refill cycle is completed, while the instructions after
the exception are aborted and control is passed to the exception handler.
lw $3, 0 ($1) F D E M – FX W
RD RD RD
addu $7, $6, $5 F D E M Aborted
addu $4, $3, $7 F D E Aborted
ori $9, $0, 0x1f F D Aborted
addu $9, $8, $5 F Aborted
addu $7, $6, $5 F D E M W
Exception Detected
Exception Handler
Figure C-38 Exceptions during the Data Cache Refill Cycle (1)
However, when a fatal exception such as Bus Error or Reset occurs, the refill
cycle is also aborted and control is passed to the exception handler.
lw $3, 0 ($1) F D E M Aborted
RD Aborted
addu $7, $6, $5 F D E M Aborted
addu $4, $3, $7 F D E Aborted
ori $9, $0, 0x1f F D Aborted
addu $9, $8, $5 F Aborted
addu $7, $6, $5 F D E M W
Fatal Exception Detected
Exception Handler
Figure C-39 Exception during Data Cache Refill Cycle (2)
Appendix D: G-Bus Overview
D.1 G-Bus Operation
The G-Bus has a 36-bit address bus and a 64-bit data bus. Byte and halfword transfers can
occur in any byte lane, depending on how GBE[7:0]* are driven.
The G-Bus clock can be set to the CPU full speed divided by 2, 2.5, 3 or 4. The
G-Bus speed is selected by the value of GCRATE[1:0] while GCOLDRESET is
asserted. Correct operation is not guaranteed if GCRATE[1:0] changes while the TX49 is
running.
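As a quick illustration of the divider options, the following Python sketch computes the G-Bus clock from a CPU clock. It is only a sketch: the names `GBUS_DIVIDERS` and `gbusclk_hz` are hypothetical, the 300 MHz CPU clock in the example is an arbitrary value chosen for the arithmetic, and the GCRATE[1:0] encoding that selects each divider is not given in this section and is therefore not modelled.

```python
from fractions import Fraction

# Divider values listed in this section (CPU clock : G-Bus clock).
GBUS_DIVIDERS = (Fraction(2), Fraction(5, 2), Fraction(3), Fraction(4))


def gbusclk_hz(cpu_hz, divider):
    """Return the G-Bus clock frequency for a supported divider."""
    divider = Fraction(divider)
    if divider not in GBUS_DIVIDERS:
        raise ValueError(f"unsupported G-Bus divider: {divider}")
    return cpu_hz / divider


# Hypothetical 300 MHz CPU clock with the 2.5 divider.
example_hz = gbusclk_hz(300_000_000, 2.5)
```

Using `Fraction` keeps the 2.5 divider exact, so the result for the example is exactly 120 MHz.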
The TX49 supports four different types of bus transactions: single-read, burst-read,
single-write and burst-write. When a bus transaction starts, GBSTART* is asserted for one
GBUSCLK cycle, regardless of the type of the transaction. Peripheral logic must sample
GBSTART* to recognize the beginning of a bus cycle. It should be noted that when multiple
read or write transactions occur back-to-back, GRD* or GWR* remains asserted until the last
transaction is completed; therefore, GRD* and GWR* cannot be used to detect the beginning
of a bus cycle.
During a read operation, the TX49 samples GACK* with the rising edge of GBUSCLK.
When it is detected as asserted, the TX49 captures the data on GDTM at the next rising edge
of GBUSCLK. If the bus transaction is a burst-read, the TX49 also automatically increments
the address value.
During a write operation, the TX49 samples GACK* with the rising edge of GBUSCLK.
When it is detected as asserted during a single-write, the TX49 terminates the current bus
transaction at the next rising edge of GBUSCLK. If the bus transaction is a burst-write, the
TX49 goes ahead with the next write, automatically incrementing the address value.
GLAST* indicates the completion of a bus cycle. Peripheral logic must sample GLAST* to
terminate a bus transaction.
D.2 Types of G-Bus Arbitration
One important feature of the TX49 is its enhanced bus arbitration flexibility. This section
introduces two types of bus arbitration: Snoop & Transfer (ST) concurrency and Execute &
Transfer (ET) concurrency. ST concurrency causes the TX49 to stall the processor pipeline
while allowing the internal data cache to be snooped during DMA transfers. In contrast, ET
concurrency allows the processor core to continue execution out of the internal cache during
external bus mastership; ET concurrency does not allow data cache snooping.
D.2.1 Snoop & Transfer (ST) Concurrency
In systems in which main memory is accessed by DMA, it must be ensured that the
internal data cache of the TX49 always has the most recent data and is not in possession
of stale data. In other words, if the data in main memory has been changed by DMA, the
matching cache entries in the TX49 must be marked as "modified" (i.e., invalidated). ST
concurrency allows the TX49 to "snoop" DMA accesses to main memory and check for a
matching data cache entry. Figure D-1 illustrates this feature. During an ST concurrency
operation, the TX49 stalls the processor pipeline.
An alternate bus master asserts either GHPSREQ* or GSREQ* to request bus
mastership for an ST concurrency operation. Once GHPSREQ* or GSREQ* is detected,
the TX49 will flush the internal write buffer before granting the bus to the requesting
master; GHPSGNT* or GSGNT* is asserted to indicate that the bus has been granted.
While GHPSGNT* or GSGNT* is asserted, the TX49 continually samples GSNOOP* with
the rising edge of GBUSCLK. When GSNOOP* is recognized as asserted, the TX49
captures the address on GATM[35:5] and compares it to the addresses of all data items
held in the data cache. If the snoop address hits in the data cache, the cache entry is
invalidated. GSNOOP* is valid only when either GHPSGNT* or GSGNT* is asserted.
The internal data cache of the TX49 can employ either the write-through or the write-back
policy. The write-back data cache does not provide support for snooping. When the
write-back option is selected, GHPSREQ* and GSREQ* cannot be used.
Figure D-1 ST Concurrency
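The snoop-and-invalidate step described above can be sketched as a behavioral model. This is a deliberate simplification: the real data cache is set-associative with tags, whereas the model below only tracks which line addresses are valid. The five-bit shift reflects the fact that the snoop address is captured from GATM[35:5]; the class and method names are hypothetical.

```python
LINE_SHIFT = 5  # GATM[35:5] carries the address without its low five bits


class SnoopedDataCache:
    """Tracks valid data cache lines by line address only."""

    def __init__(self):
        self.valid_lines = set()

    def fill(self, address):
        """Record that the line containing `address` is now cached."""
        self.valid_lines.add(address >> LINE_SHIFT)

    def snoop(self, address):
        """Invalidate the matching line, if any; return True on a snoop hit."""
        line = address >> LINE_SHIFT
        hit = line in self.valid_lines
        self.valid_lines.discard(line)
        return hit


cache = SnoopedDataCache()
cache.fill(0x1000)
first = cache.snoop(0x101F)   # same line as 0x1000: hit, line invalidated
second = cache.snoop(0x1000)  # line is already invalid: miss
```

Any snooped address that differs from a cached address only in its low five bits refers to the same line and therefore invalidates it.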
D.2.2 Execute & Transfer (ET) Concurrency
Figure D-2 illustrates ET concurrency. Whereas ST concurrency causes the TX49 to
stall the processor pipeline, ET concurrency allows the processor to continue execution out
of the internal cache during external bus mastership. However, it does stall when there is
a need for a cache refill. Also, if the write buffer is full, additional stores will stall until
there is room for them in the write buffer.
ET concurrency is recommended for the following cases:
when the internal data cache is programmed for write-back mode
when performing DMA transfers to an uncached address space even if the internal
data cache is programmed for write-through mode
An alternate bus master asserts either GHPGREQ* or GREQ* to request bus
mastership for an ET concurrency operation. Once GHPGREQ* or GREQ* is detected and the bus is
free, the TX49 grants the bus to the requesting master. GHPGGNT* or GGNT* is
asserted to indicate that the bus has been granted to the master. If the bus is busy, the
TX49 relinquishes the bus after it completes the current bus cycle. GHPGREQ* and
GREQ* are sampled with the rising edge of GBUSCLK.
[Block diagram; labels: TX49 (TX49 Processor Core; WBU, etc.), G-Bus, Bus Master, External Device Interface, External Bus]
Figure D-2 ET Concurrency
Table D-1 summarizes the differences between ST and ET concurrency.
Table D-1 ST Concurrency vs. ET Concurrency

Item                          ST Concurrency                       ET Concurrency
----------------------------  -----------------------------------  -----------------------------------
Handshake Signals             Bus request signal: GHPSREQ*         Bus request signal: GHPGREQ*
                              Bus grant signal:   GHPSGNT*         Bus grant signal:   GHPGGNT*
                              Bus request signal: GSREQ*           Bus request signal: GREQ*
                              Bus grant signal:   GSGNT*           Bus grant signal:   GGNT*
Data Cache Snooping           Accepted by assertion of GSNOOP*     Not supported
                              (not supported in write-back mode)
Stores to the Write Buffer    Disabled                             Enabled
Usage Example                 When an external bus master          When an external bus master
                              performs store operations to a       transfers data over the G-Bus
                              memory space mapped to the data      without performing a snoop
                              cache (i.e., when data cache         operation.
                              snooping is necessary)               When the data cache employs the
                                                                   write-back policy
Maximum Bus Control           Remaining current bus cycle
(Request-to-Grant)            + write buffer flushing*
Latency                       + data read bus cycle already issued internally
                              + instruction fetch bus cycle already issued internally

* During an ET concurrency operation, the write buffer is flushed only when the write buffer
contains uncached store data which has not yet been written to memory and the TX49 issues
an uncached read request to the target address of one of the write buffer entries.
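The usage guidance in Table D-1 can be condensed into a small decision helper. The Python sketch below is one interpretation of the table and of the recommendations in D.2.2, not a rule stated by the manual; the function name and its two boolean parameters are hypothetical.

```python
def recommended_concurrency(write_back_dcache, dma_writes_cached_memory):
    """Suggest "ST" or "ET" bus arbitration for an external bus master.

    write_back_dcache        -- data cache is programmed for write-back mode
    dma_writes_cached_memory -- the bus master stores to memory that is
                                mapped to the data cache
    """
    if write_back_dcache:
        # Snooping is not supported in write-back mode, so the ST request
        # signals cannot be used.
        return "ET"
    if dma_writes_cached_memory:
        # Snooping is needed so that stale cache lines are invalidated.
        return "ST"
    # Transfers to uncached space need no snoop; the core keeps executing.
    return "ET"
```

Note that the helper only selects the arbitration type; with a write-back cache and DMA to cached memory, snooping is unavailable, and the table does not say how stale data is avoided in that case.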
Appendix E: Differences From TX4955A, TX4300 and TX4600
Item TX4955A TX4300 TX4600
Datapath 64 64 64
ISA MIPS I, II, III MIPS I, II,III MIPS I, II, III
+MADD, +Debug
+PREF
Pipeline 5 5 5
MMU TLB TLB TLB
Joint TLB 48 double 32 double 48 double
I-TLB 2 entry 2 entry 2 entry
D-TLB 4 entry No 4 entry
Page Size 4 K-16 MB 4 K-16 MB 4 K-16 MB
Shutdown No-TS Yes No-TS
V.A. Size 40 40 40
P.A. Size 36 32 36
I-cache
Size 32 KB 16 KB 16 KB
Associate. 4-way Dir.-map 2-way
Lock Yes No No
Snoop No No No
Index V V V
Tag P P P
Line 32 B 32 B 32 B
Parity No No Yes
D-cache
Size 32 KB 8 KB 16 KB
Associate. 4-way Dir.-map 2-way
Lock Yes No No
Write Policy W.-back/-through W.-back W.-back/-through
Snoop No No No
Index V V V
Tag P P P
Line 32 B 16 B 32 B
Parity No No Yes
Item TX4955A TX4300 TX4600
WriteBuffer 4A/D pairs 4A/D pairs 4A/D pairs
FPU FPU Hard Shared w/ IU FPU Hard
(CP1) Shared w/
I-mul/div
Single Single Single
Double Double Double
Debug Support Unit Yes No No
MPU SysAD SysAD SysAD
Bus I/F 32-bit 32-bit 64-bit
A/D multiplexed A/D multiplexed A/D multiplexed
Sys.Clock Ratio:
1:1 No No No
2:1 Yes Yes Yes
2.5:1 Yes No No
3:1 Yes Yes Yes
4:1 Yes No Yes
5:1 No No Yes
6:1 No No Yes
7:1 No No Yes
8:1 No No Yes
JTAG Yes Yes(No func.) No
Power Sup. Internal: 1.5 V
External: 3.3 V 3.3 V 3.3 V
Power down -WAIT Inst. -Status. Reg. -WAIT Inst.
Mode (Halt/Doze) (1/4 PClock) (Stand-by)
Package PQFP-160 PQFP-120 PGA-179
HSQFP-208