TX49/H2 CORE ARCHITECTURE - Toshiba

64-Bit TX System RISC

TX49/H2 Core Architecture

JAN. 2002

R4000/R4400/R5000 are trademarks of MIPS Technologies, Inc.

The information contained herein is subject to change without notice.

The information contained herein is presented only as a guide for the applications of our
products. No responsibility is assumed by TOSHIBA for any infringements of patents or
other rights of the third parties which may result from its use. No license is granted by
implication or otherwise under any patent or patent rights of TOSHIBA or others.

The products described in this document contain components made in the United States
and subject to export control of the U.S. authorities. Diversion contrary to the U.S. law
is prohibited.

TOSHIBA is continually working to improve the quality and reliability of its products.
Nevertheless, semiconductor devices in general can malfunction or fail due to their
inherent electrical sensitivity and vulnerability to physical stress.
It is the responsibility of the buyer, when utilizing TOSHIBA products, to comply with
the standards of safety in making a safe design for the entire system, and to avoid
situations in which a malfunction or failure of such TOSHIBA products could cause loss
of human life, bodily injury or damage to property.
In developing your designs, please ensure that TOSHIBA products are used within
specified operating ranges as set forth in the most recent TOSHIBA products
specifications.
Also, please keep in mind the precautions and conditions set forth in the “Handling
Guide for Semiconductor Devices,” or “TOSHIBA Semiconductor Reliability
Handbook” etc.

The Toshiba products listed in this document are intended for usage in general
electronics applications (computer, personal equipment, office equipment, measuring
equipment, industrial robotics, domestic appliances, etc.).
These Toshiba products are neither intended nor warranted for usage in equipment that
requires extraordinarily high quality and/or reliability or a malfunction or failure of
which may cause loss of human life or bodily injury (“Unintended Usage”). Unintended
Usage includes atomic energy control instruments, airplane or spaceship instruments,
transportation instruments, traffic signal instruments, combustion control instruments,
medical instruments, all types of safety devices, etc. Unintended Usage of Toshiba
products listed in this document shall be made at the customer’s own risk.

The products described in this document may include products subject to the foreign
exchange and foreign trade laws.

Preface

Thank you for your new or continued patronage of Toshiba semiconductor products. This is the 1998
edition of the user’s manual for the TX49 Family of 64-bit RISC microprocessors, entitled 64-Bit TX
System RISC TX49/H2 Architecture.

This manual is written so as to be accessible to engineers who may be designing a Toshiba
microprocessor into their products for the first time. No prior knowledge of these devices is assumed.
The manual includes a review of the architecture of the processor family, a description of the TX49
instruction set, and sections dedicated to various other relevant topics, such as the Memory
Management System (MMU) and CPU exceptions.

Toshiba continually updates its technical information. Your comments and suggestions concerning this
and other Toshiba documents are sincerely appreciated and may be used in subsequent editions. For
updates to this document or for additional information about the product, please contact your nearest
Toshiba office or authorized Toshiba dealer.

January 2002

Contents

Handling Precautions

1. Introduction ........................................................................................................................................... 1-1

2. Feature................................................................................................................................................... 2-1

3. TX49 Block Diagram............................................................................................................................. 3-1

4. CPU Registers Overview....................................................................................................................... 4-1

4.1 Introduction ................................................................................................................................... 4-1

4.2 CPU Registers................................................................................................................................ 4-1

4.3 CP0 Registers................................................................................................................................. 4-2

5. CPU Instruction Set Summary ............................................................................................................ 5-1

5.1 Introduction ................................................................................................................................... 5-1

5.2 Instruction Format........................................................................................................................ 5-1

5.3 Instruction Set Overview.............................................................................................................. 5-2

5.3.1 Load and Store Instructions (Table 5-1)..............................................................................5-2

5.3.2 Computational Instructions (Table 5-2)............................................................................... 5-3

5.3.3 Jump and Branch Instructions (Table 5-3).......................................................................... 5-4

5.3.4 Special Instructions (Table 5-4)............................................................................................ 5-5

5.3.5 Exception Instructions (Table 5-5).......................................................................................5-5

5.3.6 Coprocessor Instructions (Table 5-6).................................................................................... 5-6

5.3.7 CP0 Instructions (Table 5-7)................................................................................................. 5-6

5.3.8 Multiply and Divide Instructions (Table 5-8)...................................................................... 5-7

5.3.9 Debug Instructions (Table 5-9).............................................................................................5-7

5.3.10 Other Instructions (Table 5-10)............................................................................................ 5-7

5.4 Instruction Execution Cycles........................................................................................................ 5-7

5.5 Defining Access Types................................................................................................................... 5-8

6. CPU Pipeline ......................................................................................................................................... 6-1

6.1 Introduction ................................................................................................................................... 6-1

6.2 Basic Pipeline Operation............................................................................................................... 6-1

6.3 TX49 Pipeline Activities................................................................................................................ 6-2

6.4 Branch and Load Delay................................................................................................................. 6-3

6.4.1 Delayed load........................................................................................................................... 6-3

6.4.2 Delayed branching................................................................................................................. 6-3

6.5 Non-blocking Load Function......................................................................................................... 6-4

6.6 Interlock and Exception Handling ............................................................................................... 6-4

6.6.1 Overview of Interlock and Exception Handling ..................................................................6-4

6.6.2 Exception Conditions............................................................................................................. 6-6

6.6.3 Stall Conditions..................................................................................................................... 6-6

6.6.4 External Stalls....................................................................................................................... 6-6

6.6.5 Interlock and Exception Timing........................................................................................... 6-6

6.7 Multiply and Multiply/Add Instructions (MULT, MULTU, MADD, MADDU)......................... 6-7

6.8 Divide Instructions (DIV, DIVU).................................................................................................. 6-7

6.9 Streaming....................................................................................................................................... 6-7

7. System Control Coprocessor, CP0........................................................................................................ 7-1

7.1 Introduction ................................................................................................................................... 7-1

7.2 CP0 Registers................................................................................................................................. 7-2

7.2.1 Index register (Reg#0)........................................................................................................... 7-2

7.2.2 Random register (Reg#1)....................................................................................................... 7-3

7.2.3 EntryLo0 register (Reg#2) and EntryLo1 register (Reg#3)................................................. 7-4

7.2.4 Context register (Reg#4) ....................................................................................................... 7-5

7.2.5 PageMask Register (Reg#5).................................................................................................. 7-6

7.2.6 Wired Register (Reg#6) ......................................................................................................... 7-7

7.2.7 BadVAddr Register (Reg#8).................................................................................................. 7-8

7.2.8 Count Register (Reg#9) ......................................................................................................... 7-9

7.2.9 EntryHi Register (Reg#10).................................................................................................. 7-10

7.2.10 Compare Register (Reg#11) ................................................................................................ 7-11

7.2.11 Status Register (Reg#12)..................................................................................................... 7-12

7.2.12 Cause Register (Reg#13) ..................................................................................................... 7-15

7.2.13 EPC Register (Reg#14)........................................................................................................ 7-16

7.2.14 PRId Register (Reg#15)....................................................................................................... 7-17

7.2.15 Config Register (Reg#16)..................................................................................................... 7-18

7.2.16 LLAddr Register (Reg#17) .................................................................................................. 7-20

7.2.17 XContext Register (Reg#20)................................................................................................ 7-21

7.2.18 Debug Register (Reg#23)..................................................................................................... 7-22

7.2.19 DEPC Register (Reg#24)..................................................................................................... 7-24

7.2.20 TagLo Register (Reg#28) and TagHi Register (Reg#29) ................................................... 7-25

7.2.21 ErrorEPC Register (Reg#30)............................................................................................... 7-26

7.2.22 DESAVE Register (Reg#31)................................................................................................ 7-27

7.2.23 The Initialization of CP0 Registers in SoftReset Exception............................................. 7-28

8. Memory Management System.............................................................................................................. 8-1

8.1 Introduction ................................................................................................................................... 8-1

8.2 Address Space Overview............................................................................................................... 8-1

8.2.1 Virtual Address Space........................................................................................................... 8-1

8.2.2 Physical Address Space......................................................................................................... 8-2

8.2.3 Virtual-to-Physical Address Translation............................................................................. 8-2

8.2.4 32-bit Mode Address Translation......................................................................................... 8-3

8.2.5 64-bit Mode Address Translation......................................................................................... 8-4

8.3 Operating Modes ........................................................................................................................... 8-5

8.3.1 User Mode Operations........................................................................................................... 8-5

8.3.2 Supervisor Mode Operations ................................................................................................8-7

8.3.3 Kernel Mode Operations....................................................................................................... 8-9

8.4 Translation Lookaside Buffer..................................................................................................... 8-16

8.4.1 Joint TLB ............................................................................................................................. 8-16

8.4.2 TLB Entry format................................................................................................................ 8-16

8.4.3 Instruction-TLB................................................................................................................... 8-17

8.4.4 Data-TLB ............................................................................................................................. 8-17

8.5 Virtual-to-Physical Address Translation Process .....................................................................8-18

9. Cache Organization............................................................................................................................... 9-1

9.1 Introduction ................................................................................................................................... 9-1

9.2 Instruction Cache (I-Cache).......................................................................................................... 9-1

9.2.1 Instruction Cache Address Field.......................................................................................... 9-1

9.2.2 Instruction Cache Configuration..........................................................................................9-2

9.3 Data Cache..................................................................................................................................... 9-2

9.3.1 Data Cache Address Field..................................................................................................... 9-3

9.3.2 Data Cache Configuration .................................................................................................... 9-3

9.3.3 Data Cache Policies............................................................................................................... 9-4

9.4 FIFO Replacement Algorithm...................................................................................................... 9-5

9.5 Lock function ................................................................................................................................. 9-5

9.5.1 Lock bit setting and clearing ................................................................................................ 9-5

9.5.2 Operation During Lock ......................................................................................................... 9-6

9.5.3 Example of Data Cache Locking........................................................................................... 9-6

9.5.4 Example of Instruction Cache Locking................................................................................ 9-6

9.6 The Primary Cache Accessing ...................................................................................................... 9-7

9.7 Cache States .................................................................................................................................. 9-7

9.8 Cache Line Ownership.................................................................................................................. 9-8

9.9 Cache Multi-Hit Operation........................................................................................................... 9-8

9.10 Cache Test Function...................................................................................................................... 9-8

9.10.1 Cache Disabling..................................................................................................................... 9-8

9.10.2 Cache Flushing...................................................................................................................... 9-9

10. Write Buffer......................................................................................................................................... 10-1

11. CPU Exception..................................................................................................................................... 11-1

11.1 Introduction ................................................................................................................................. 11-1

11.2 Exception Vector Locations......................................................................................................... 11-1

11.3 Priority of Exception ................................................................................................................... 11-2

11.4 ColdReset Exception.................................................................................................................... 11-3

11.4.1 Cause.................................................................................................................................... 11-3

11.4.2 Processing ............................................................................................................................ 11-3

11.4.3 Servicing............................................................................................................................... 11-3

11.5 SoftReset Exception..................................................................................................................... 11-4

11.5.1 Cause.................................................................................................................................... 11-4

11.5.2 Processing ............................................................................................................................ 11-4

11.5.3 Servicing............................................................................................................................... 11-4

11.6 NMI (Non-maskable Interrupt) Exception ................................................................................ 11-5

11.6.1 Cause.................................................................................................................................... 11-5

11.6.2 Processing ............................................................................................................................ 11-5

11.6.3 Servicing............................................................................................................................... 11-5

11.7 Address Error Exception............................................................................................................. 11-6

11.7.1 Cause.................................................................................................................................... 11-6

11.7.2 Processing ............................................................................................................................ 11-6

11.7.3 Servicing............................................................................................................................... 11-6

11.8 TLB Refill Exception................................................................................................................... 11-7

11.8.1 Cause.................................................................................................................................... 11-7

11.8.2 Processing ............................................................................................................................ 11-7

11.8.3 Servicing............................................................................................................................... 11-7

11.9 TLB Invalid Exception................................................................................................................ 11-8

11.9.1 Cause.................................................................................................................................... 11-8

11.9.2 Processing ............................................................................................................................ 11-8

11.9.3 Servicing............................................................................................................................... 11-8

11.10 TLB Modified Exception ............................................................................................................. 11-9

11.10.1 Cause.................................................................................................................................... 11-9

11.10.2 Processing ............................................................................................................................ 11-9

11.10.3 Servicing............................................................................................................................... 11-9

11.11 Bus Error Exception.................................................................................................................. 11-10

11.11.1 Cause.................................................................................................................................. 11-10

11.11.2 Processing .......................................................................................................................... 11-10

11.11.3 Servicing............................................................................................................................. 11-10

11.12 Integer Overflow Exception...................................................................................................... 11-11

11.12.1 Cause.................................................................................................................................. 11-11

11.12.2 Processing .......................................................................................................................... 11-11

11.12.3 Servicing............................................................................................................................. 11-11

11.13 Trap Exception .......................................................................................................................... 11-12

11.13.1 Cause.................................................................................................................................. 11-12

11.13.2 Processing .......................................................................................................................... 11-12

11.13.3 Servicing............................................................................................................................. 11-12

11.14 System Call Exception .............................................................................................................. 11-13

11.14.1 Cause.................................................................................................................................. 11-13

11.14.2 Processing .......................................................................................................................... 11-13

11.14.3 Servicing............................................................................................................................. 11-13

11.15 Breakpoint Exception................................................................................................................ 11-14

11.15.1 Cause.................................................................................................................................. 11-14

11.15.2 Processing .......................................................................................................................... 11-14

11.15.3 Servicing............................................................................................................................. 11-14

11.16 Reserved Instruction Exception ............................................................................................... 11-15

11.16.1 Cause.................................................................................................................................. 11-15

11.16.2 Processing .......................................................................................................................... 11-15

11.16.3 Servicing............................................................................................................................. 11-15

11.17 Coprocessor Unusable Exception ............................................................................................ 11-16

11.17.1 Cause.................................................................................................................................. 11-16

11.17.2 Processing .......................................................................................................................... 11-16

11.17.3 Servicing............................................................................................................................. 11-16

11.18 Floating-Point Exception .......................................................................................................... 11-17

11.18.1 Cause.................................................................................................................................. 11-17

11.18.2 Processing .......................................................................................................................... 11-17

11.18.3 Servicing............................................................................................................................. 11-17

11.19 Interrupt Exception................................................................................................................... 11-18

11.19.1 Cause.................................................................................................................................. 11-18

11.19.2 Processing .......................................................................................................................... 11-18

11.19.3 Servicing............................................................................................................................. 11-18

11.20 Exception Handling and Servicing Flowcharts....................................................................... 11-19

12. Floating-Point Unit, CP1.................................................................................................................... 12-1

12.1 Overview ...................................................................................................................................... 12-1

12.2 Floating Point Register............................................................................................................... 12-1

12.2.1 Floating-Point General Registers (FGRs)..........................................................................12-1

12.2.2 Floating-Point Control Registers........................................................................................ 12-2

12.2.3 Accessing the FP Control and Implementation/Revision Registers................................. 12-5

12.3 Floating-Point Formats............................................................................................................... 12-6

12.4 Binary Fixed-Point Format......................................................................................................... 12-7

12.5 Floating-Point Instruction Set Summary.................................................................................. 12-8

12.5.1 Load, Move and Store Instructions (Table 12-10)............................................................. 12-8

12.5.2 Conversion Instructions (Table 12-11)............................................................................... 12-9

12.5.3 Computational Instructions (Table 12-12)......................................................................... 12-9

12.5.4 Compare and Branch Instructions (Table 12-13)............................................................12-10

13. Floating-Point Exception .................................................................................................................... 13-1

13.1 Introduction ................................................................................................................................. 13-1

13.2 Exception Types...........................................................................................................................13-1

13.3 Exception Trap Processing.......................................................................................................... 13-2

13.4 Flags............................................................................................................................................. 13-2

13.5 FPU Exceptions........................................................................................................................... 13-3

13.6 Saving and Restoring State ........................................................................................................ 13-6

13.7 Trap Handlers for IEEE Standard 754 Exceptions................................................................... 13-6

14. Debug Support Unit............................................................................................................................ 14-1

14.1 Features ....................................................................................................................................... 14-1

14.2 EJTAG interface.......................................................................................................................... 14-1

14.3 JTAG Interface............................................................................................................................ 14-2

14.4 Processor Access Overview......................................................................................................... 14-2

14.5 Instruction ................................................................................................................................... 14-2

14.6 Debug Unit................................................................................................................................... 14-3

14.6.1 Extended Instructions......................................................................................................... 14-3

14.6.2 Extended Debug Registers in CP0 ..................................................................................... 14-3

14.7 Register Map................................................................................................................................ 14-3

14.8 Processor Bus Break Function ................................................................................................... 14-3

14.9 Debug Exception.......................................................................................................................... 14-4

14.9.1 Debug Single Step (DSS)..................................................................................................... 14-4

14.9.2 Debug Breakpoint exception (Dbp) ....................................................................................14-4

14.9.3 JTAG Break Exception........................................................................................................ 14-4

14.9.4 Debug Exception Handling.................................................................................................14-4

14.9.5 Branching to debug handler ............................................................................................... 14-4

14.9.6 Exception handling when in Debug Mode (DM bit is set) ................................................ 14-4

14.10 Real Time PC TRACE Output.................................................................................................... 14-4

15. TX49 MPU Core Signal Descriptions................................................................................................. 15-1

15.1 Signal Descriptions...................................................................................................................... 15-2

15.1.1 Memory Interface Signals................................................................................................... 15-2

15.1.2 DMA Interface Signals........................................................................................................ 15-4

15.1.3 Coprocessor Interface Signals............................................................................................. 15-5

15.1.4 Interrupt Interface Signals................................................................................................. 15-5

15.1.5 Test Interface Signals ......................................................................................................... 15-6

15.1.6 Debug Interface Signals...................................................................................................... 15-6

15.1.7 Clock and System Control Interface Signals..................................................................... 15-7

16. Low Power Consumption Modes......................................................................................................... 16-1

16.1 Halt mode..................................................................................................................................... 16-1

16.2 Doze mode.................................................................................................................................... 16-2

16.3 Status Shifts ................................................................................................................................ 16-3

Appendix A: CPU Instruction Set Details..........................................................................................A-1

A.1 Instruction Classes........................................................................................................................A-1

A.1.1 Instruction Formats ..............................................................................................................A-2

A.1.2 Instruction Notation Conventions........................................................................................A-2

A.1.3 Sign Extension and Zero Extension.....................................................................................A-4

A.1.4 Instruction Notation Examples............................................................................................A-4

A.2 Load and Store Instructions.........................................................................................................A-5

A.3 Jump and Branch Instructions.....................................................................................................A-6

A.4 Coprocessor Instructions...............................................................................................................A-6

A.5 System Control Coprocessor (CP0) Instructions.........................................................................A-6

A.6 CPU Instructions...........................................................................................................................A-7

A.7 Bit Encoding of CPU Instruction OPcodes ..............................................................................A-179

Appendix B: FPU Instruction Set Details..........................................................................................B-1

B.1 Instruction Formats ......................................................................................................................B-1

B.1.1 Floating-Point Loads, Stores, and Moves ............................................................................ B-3

B.1.2 Floating-Point Operations ....................................................................................................B-3

B.2 Instruction Notational Conventions.............................................................................................B-4

B.2.1 Instruction Notation Examples............................................................................................B-4

B.3 Load and Store Instructions.........................................................................................................B-5

B.4 Computational Instructions..........................................................................................................B-7

B.5 Bit Encoding of FPU Instruction OPcodes.................................................................................B-50

Appendix C: Coprocessor 0 Hazards...................................................................................................C-1

C.1 Pipeline Interlock and Hazard in TX49.......................................................................................C-1

C.1.1 Interlock in Load Delay Slot.................................................................................................C-1

C.1.2 Branch Delay Slot..................................................................................................................C-2

C.1.3 Multiply, Multiply/Add and Division Instructions..............................................................C-3

C.1.4 Instructions regarding System Control Co-processor (CP0).............................................C-10

C.1.5 Control Bits Change in CP0 Registers by MTC0 Instruction...........................................C-12

C.2 Pipeline Behavior on Cache Miss...............................................................................................C-16

C.2.1 Instruction Cache Miss .......................................................................................................C-16

C.2.2 Data Cache Miss..................................................................................................................C-17

C.3 Pipeline Behavior in Uncached Area .........................................................................................C-19

C.3.1 Data Read from Uncached Area.........................................................................................C-19

C.3.2 Instruction Fetch from Uncached Area..............................................................................C-19

C.3.3 Data Write to Uncached Area.............................................................................................C-19

C.4 Timings on the Exception Handling...........................................................................................C-20

C.4.1 Basic Pipeline Behavior When Exceptions Occur .............................................................C-20

C.4.2 Exceptions during the Execution of Multi-cycle Instructions ..........................................C-21

C.4.3 Exceptions during the Data Cache Refill Cycle.................................................................C-21

Appendix D: G-Bus Overview............................................................................................................. D-1

D.1 G-Bus Operation........................................................................................................................... D-1

D.2 Types of G-Bus Arbitration.......................................................................................................... D-1

D.2.1 Snoop & Transfer (ST) Concurrency................................................................................... D-1

D.2.2 Execute & Transfer (ET) Concurrency................................................................................ D-2

Appendix E: Differences From TX4955A,TX4300 and TX4600........................................................E-1

Handling Precautions

1. Using Toshiba Semiconductors Safely

TOSHIBA are continually working to improve the quality and the reliability of their products.

Nevertheless, semiconductor devices in general can malfunction or fail due to their inherent

electrical sensitivity and vulnerability to physical stress. It is the responsibility of the buyer, when

utilizing TOSHIBA products, to observe standards of safety, and to avoid situations in which a

malfunction or failure of a TOSHIBA product could cause loss of human life, bodily injury or

damage to property.

In developing your designs, please ensure that TOSHIBA products are used within specified

operating ranges as set forth in the most recent products specifications. Also, please keep in mind

the precautions and conditions set forth in the TOSHIBA Semiconductor Reliability Handbook.

2. Safety Precautions

This section lists important precautions which users of semiconductor devices (and anyone else)

should observe in order to avoid injury and damage to property, and to ensure safe and correct use

of devices.

Please be sure that you understand the meanings of the labels and the graphic symbol described

below before you move on to the detailed descriptions of the precautions.

[Explanation of labels]

Indicates an imminently hazardous situation which will result in death or

serious injury if you do not follow instructions.

Indicates a potentially hazardous situation which could result in death or

serious injury if you do not follow instructions.

Indicates a potentially hazardous situation which if not avoided, may result

in minor injury or moderate injury.

[Explanation of graphic symbol]

Graphic symbol Meaning

Indicates that caution is required (laser beam is dangerous to eyes).

2.1 General Precautions regarding Semiconductor Devices

Do not use devices under conditions exceeding their absolute maximum ratings (e.g. current, voltage, power dissipation or

temperature).

This may cause the device to break down, degrade its performance, or cause it to catch fire or explode resulting in injury.

Do not insert devices in the wrong orientation.

Make sure that the positive and negative terminals of power supplies are connected correctly. Otherwise the rated maximum

current or power dissipation may be exceeded and the device may break down or undergo performance degradation, causing it to

catch fire or explode and resulting in injury.

When power to a device is on, do not touch the device’s heat sink.

Heat sinks become hot, so you may burn your hand.

Do not touch the tips of device leads.

Because some types of device have leads with pointed tips, you may prick your finger.

When conducting any kind of evaluation, inspection or testing, be sure to connect the testing equipment’s electrodes or probes to

the pins of the device under test before powering it on.

Otherwise, you may receive an electric shock causing injury.

Before grounding an item of measuring equipment or a soldering iron, check that there is no electrical leakage from it.

Electrical leakage may cause the device which you are testing or soldering to break down, or could give you an electric shock.

Always wear protective glasses when cutting the leads of a device with clippers or a similar tool.

If you do not, small bits of metal flying off the cut ends may damage your eyes.

2.2 Precautions Specific to Each Product Group

2.2.1 Optical semiconductor devices

When a visible semiconductor laser is operating, do not look directly into the laser beam or look through the optical system.

This is highly likely to impair vision, and in the worst case may cause blindness.

If it is necessary to examine the laser apparatus, for example to inspect its optical characteristics, always wear the appropriate

type of laser protective glasses as stipulated by IEC standard IEC825-1.

Ensure that the current flowing in an LED device does not exceed the device’s maximum rated current.

This is particularly important for resin-packaged LED devices, as excessive current may cause the package resin to blow up,

scattering resin fragments and causing injury.

When testing the dielectric strength of a photocoupler, use testing equipment which can shut off the supply voltage to the

photocoupler. If you detect a leakage current of more than 100 µA, use the testing equipment to shut off the photocoupler’s

supply voltage; otherwise a large short-circuit current will flow continuously, and the device may break down or burst into flames,

resulting in fire or injury.

When incorporating a visible semiconductor laser into a design, use the device’s internal photodetector or a separate

photodetector to stabilize the laser’s radiant power so as to ensure that laser beams exceeding the laser’s rated radiant power

cannot be emitted.

If this stabilizing mechanism does not work and the rated radiant power is exceeded, the device may break down or the

excessively powerful laser beams may cause injury.

2.2.2 Power devices

Never touch a power device while it is powered on. Also, after turning off a power device, do not touch it until it has thoroughly

discharged all remaining electrical charge.

Touching a power device while it is powered on or still charged could cause a severe electric shock, resulting in death or serious

injury.

When conducting any kind of evaluation, inspection or testing, be sure to connect the testing equipment’s electrodes or probes to

the device under test before powering it on.

When you have finished, discharge any electrical charge remaining in the device.

Connecting the electrodes or probes of testing equipment to a device while it is powered on may result in electric shock, causing

injury.

Do not use devices under conditions which exceed their absolute maximum ratings (current, voltage, power dissipation,

temperature etc.).

This may cause the device to break down, causing a large short-circuit current to flow, which may in turn cause it to catch fire or

explode, resulting in fire or injury.

Use a unit which can detect short-circuit currents and which will shut off the power supply if a short-circuit occurs.

If the power supply is not shut off, a large short-circuit current will flow continuously, which may in turn cause the device to catch

fire or explode, resulting in fire or injury.

When designing a case for enclosing your system, consider how best to protect the user from shrapnel in the event of the device

catching fire or exploding.

Flying shrapnel can cause injury.

When conducting any kind of evaluation, inspection or testing, always use protective safety tools such as a cover for the device.

Otherwise you may sustain injury caused by the device catching fire or exploding.

Make sure that all metal casings in your design are grounded to earth.

Even in modules where a device’s electrodes and metal casing are insulated, capacitance in the module may cause the

electrostatic potential in the casing to rise.

Dielectric breakdown may cause a high voltage to be applied to the casing, causing electric shock and injury to anyone touching it.

When designing the heat radiation and safety features of a system incorporating high-speed rectifiers, remember to take the

device’s forward and reverse losses into account.

The leakage current in these devices is greater than that in ordinary rectifiers; as a result, if a high-speed rectifier is used in an

extreme environment (e.g. at high temperature or high voltage), its reverse loss may increase, causing thermal runaway to occur.

This may in turn cause the device to explode and scatter shrapnel, resulting in injury to the user.

A design should ensure that, except when the main circuit of the device is active, reverse bias is applied to the device gate while

electricity is conducted to control circuits, so that the main circuit will become inactive.

Malfunction of the device may cause serious accidents or injuries.

When conducting any kind of evaluation, inspection or testing, either wear protective gloves or wait until the device has cooled

properly before handling it.

Devices become hot when they are operated. Even after the power has been turned off, the device will retain residual heat which

may cause a burn to anyone touching it.

2.2.3 Bipolar ICs (for use in automobiles)

If your design includes an inductive load such as a motor coil, incorporate diodes or similar devices into the design to prevent

negative current from flowing in.

The load current generated by powering the device on and off may cause it to function erratically or to break down, which could in

turn cause injury.

Ensure that the power supply to any device which incorporates protective functions is stable.

If the power supply is unstable, the device may operate erratically, preventing the protective functions from working correctly. If

protective functions fail, the device may break down causing injury to the user.

3. General Safety Precautions and Usage Considerations

This section is designed to help you gain a better understanding of semiconductor devices, so as to

ensure the safety, quality and reliability of the devices which you incorporate into your designs.

3.1 From Incoming to Shipping

3.1.1 Electrostatic discharge (ESD)

When handling individual devices (which are not yet mounted on a printed

circuit board), be sure that the environment is protected against

electrostatic electricity. Operators should wear anti-static clothing, and

containers and other objects which come into direct contact with devices

should be made of anti-static materials and should be grounded to earth via

a 0.5- to 1.0-MΩ protective resistor.

Please follow the precautions described below; this is particularly important

for devices which are marked “Be careful of static.”

(1) Work environment

• When humidity in the working environment decreases, the human body and other insulators

can easily become charged with static electricity due to friction. Maintain the recommended

humidity of 40% to 60% in the work environment, while also taking into account the fact that

moisture-proof-packed products may absorb moisture after unpacking.

• Be sure that all equipment, jigs and tools in the working area are grounded to earth.

• Place a conductive mat over the floor of the work area, or take other appropriate measures, so

that the floor surface is protected against static electricity and is grounded to earth. The surface

resistivity should be 10⁴ to 10⁸ Ω/sq and the resistance between surface and ground, 7.5 × 10⁵ to

10⁸ Ω.

• Cover the workbench surface also with a conductive mat (with a surface resistivity of 10⁴ to

10⁸ Ω/sq, for a resistance between surface and ground of 7.5 × 10⁵ to 10⁸ Ω). The purpose of this

is to disperse static electricity on the surface (through resistive components) and ground it to

earth. Workbench surfaces must not be constructed of low-resistance metallic materials that

allow rapid static discharge when a charged device touches them directly.

• Pay attention to the following points when using automatic equipment in your workplace:

(a) When picking up ICs with a vacuum unit, use a conductive rubber fitting on the end of the

pick-up wand to protect against electrostatic charge.

(b) Minimize friction on IC package surfaces. If some rubbing is unavoidable due to the device’s

mechanical structure, minimize the friction plane or use material with a small friction

coefficient and low electrical resistance. Also, consider the use of an ionizer.

dissipates static electricity.

(d) Ensure that no statically charged bodies (such as work clothes or the human body) touch

the devices.

(e) Make sure that sections of the tape carrier which come into contact with installation

devices or other electrical machinery are made of a low-resistance material.

(f) Make sure that jigs and tools used in the assembly process do not touch devices.

(g) In processes in which packages may retain an electrostatic charge, use an ionizer to

neutralize the ions.

• Make sure that CRT displays in the working area are protected against static charge, for

example by a VDT filter. As much as possible, avoid turning displays on and off. Doing so can

cause electrostatic induction in devices.

• Keep track of charged potential in the working area by taking periodic measurements.

• Ensure that work chairs are protected by an anti-static textile cover and are grounded to the

floor surface by a grounding chain. (Suggested resistance between the seat surface and

grounding chain is 7.5 × 10⁵ to 10¹² Ω.)

• Install anti-static mats on storage shelf surfaces. (Suggested surface resistivity is 10⁴ to 10⁸

Ω/sq; suggested resistance between surface and ground is 7.5 × 10⁵ to 10⁸ Ω.)

• For transport and temporary storage of devices, use containers (boxes, jigs or bags) that are

made of anti-static materials or materials which dissipate electrostatic charge.

• Make sure that cart surfaces which come into contact with device packaging are made of

materials which will conduct static electricity, and verify that they are grounded to the floor

surface via a grounding chain.

• In any location where the level of static electricity is to be closely controlled, the ground

resistance level should be Class 3 or above. Use different ground wires for all items of

equipment which may come into physical contact with devices.
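The suggested resistance ranges in the list above can be collected into a simple check for periodic measurements. The following is a minimal sketch, not a Toshiba procedure: it assumes the flattened figures in the text are powers of ten (for example, 10^4 to 10^8 Ω/sq and 7.5 × 10^5 to 10^8 Ω), and the dictionary keys and the function name are hypothetical labels chosen for illustration.

```python
# Suggested ranges taken from the work-environment list above.
# Units: ohms (resistance to ground) or ohms/sq (surface resistivity).
# Key names are hypothetical; exponents are an assumed reading of the text.
SUGGESTED_RANGES = {
    "floor_mat_surface_resistivity": (1e4, 1e8),
    "floor_mat_to_ground": (7.5e5, 1e8),
    "workbench_mat_surface_resistivity": (1e4, 1e8),
    "workbench_mat_to_ground": (7.5e5, 1e8),
    "chair_seat_to_grounding_chain": (7.5e5, 1e12),
    "shelf_mat_surface_resistivity": (1e4, 1e8),
    "shelf_mat_to_ground": (7.5e5, 1e8),
}


def out_of_range(measurements):
    """Return (name, value, low, high) for each reading outside its range."""
    failures = []
    for name, value in measurements.items():
        low, high = SUGGESTED_RANGES[name]
        if not (low <= value <= high):
            failures.append((name, value, low, high))
    return failures


if __name__ == "__main__":
    readings = {"floor_mat_to_ground": 2.0e6, "workbench_mat_to_ground": 5.0e4}
    for name, value, low, high in out_of_range(readings):
        print(f"{name}: {value:g} is outside {low:g} to {high:g}")
```

A reading below the lower bound matters as much as one above the upper bound, since the text warns that low-resistance surfaces allow rapid static discharge.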

(2) Operating environment

• Operators must wear anti-static clothing and conductive shoes (or

a leg or heel strap).

• Operators must wear a wrist strap grounded to earth via a

resistor of about 1 MΩ.

• Soldering irons must be grounded from iron tip to earth, and must be used only at low voltages

(6 V to 24 V).

• If the tweezers you use are likely to touch the device terminals, use anti-static tweezers and in

particular avoid metallic tweezers. If a charged device touches a low-resistance tool, rapid

discharge can occur. When using vacuum tweezers, attach a conductive chucking pad to the tip,

and connect it to a dedicated ground used especially for anti-static purposes (suggested

resistance value: 10⁴ to 10⁸ Ω).

• Do not place devices or their containers near sources of strong electrical fields (such as above a

CRT).

• When storing printed circuit boards which have devices mounted on them, use a board

container or bag that is protected against static charge. To avoid the occurrence of static charge

or discharge due to friction, keep the boards separate from one another and do not stack them

directly on top of one another.

• Ensure, if possible, that any articles (such as clipboards) which are brought to any location

where the level of static electricity must be closely controlled are constructed of anti-static

materials.

• In cases where the human body comes into direct contact with a device, be sure to wear

anti-static finger covers or gloves (suggested resistance value: 10⁸ Ω or less).

• Equipment safety covers installed near devices should have resistance ratings of 10⁹ Ω or less.

• If a wrist strap cannot be used for some reason, and there is a possibility of imparting friction to

devices, use an ionizer.

• The transport film used in TCP products is manufactured from materials in which static

charges tend to build up. When using these products, install an ionizer to prevent the film from

being charged with static electricity. Also, ensure that no static electricity will be applied to the

product’s copper foils by taking measures to prevent static occurring in the peripheral

equipment.

3.1.2 Vibration, impact and stress

Handle devices and packaging materials with care. To avoid damage

to devices, do not toss or drop packages. Ensure that devices are not

subjected to mechanical vibration or shock during transportation.

Ceramic package devices and devices in canister-type packages which

have empty space inside them are subject to damage from vibration

and shock because the bonding wires are secured only at their ends.

Plastic molded devices, on the other hand, have a relatively high level

of resistance to vibration and mechanical shock because their bonding

wires are enveloped and fixed in resin. However, when any device or package type is installed in

target equipment, it is to some extent susceptible to wiring disconnections and other damage from

vibration, shock and stressed solder junctions. Therefore when devices are incorporated into the

design of equipment which will be subject to vibration, the structural design of the equipment

must be thought out carefully.

If a device is subjected to especially strong vibration, mechanical shock or stress, the package or

the chip itself may crack. In products such as CCDs which incorporate window glass, this could

cause surface flaws in the glass or cause the connection between the glass and the ceramic to

separate.

Furthermore, it is known that stress applied to a semiconductor device through the package

changes the resistance characteristics of the chip because of piezoelectric effects. In analog circuit

design attention must be paid to the problem of package stress as well as to the dangers of

vibration and shock as described above.

3.2 Storage

3.2.1 General storage

• Avoid storage locations where devices will be exposed to moisture or direct sunlight.

• Follow the instructions printed on the device cartons regarding

transportation and storage.

• The storage area temperature should be kept within a

temperature range of 5°C to 35°C, and relative humidity should

be maintained at between 45% and 75%.

• Do not store devices in the presence of harmful (especially

corrosive) gases, or in dusty conditions.

• Use storage areas where there is minimal temperature fluctuation. Rapid temperature changes

can cause moisture to form on stored devices, resulting in lead oxidation or corrosion. As a result,

the solderability of the leads will be degraded.

• When repacking devices, use anti-static containers.

• Do not allow external forces or loads to be applied to devices while they are in storage.

• If devices have been stored for more than two years, their electrical characteristics should be

tested and their leads should be tested for ease of soldering before they are used.
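The numeric limits in the general storage list above (5°C to 35°C, 45% to 75% relative humidity, retest after more than two years) lend themselves to a simple automated check. The sketch below is illustrative only; the function name and the message strings are hypothetical, and the limits are copied from the list rather than from any device-specific datasheet.

```python
def general_storage_issues(temp_c, rel_humidity_pct, years_in_storage):
    """Return a list of findings against the general storage guidance above."""
    issues = []
    if not 5 <= temp_c <= 35:
        issues.append("temperature outside 5 to 35 C")
    if not 45 <= rel_humidity_pct <= 75:
        issues.append("relative humidity outside 45 to 75 %")
    if years_in_storage > 2:
        # The text asks for retesting, not scrapping, after two years.
        issues.append("retest electrical characteristics and solderability")
    return issues


if __name__ == "__main__":
    print(general_storage_issues(temp_c=38, rel_humidity_pct=50, years_in_storage=2.5))
```

These limits apply to general storage only; moisture-proof packing has its own conditions, described in the next section.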

3.2.2 Moisture-proof packing

Moisture-proof packing should be handled with care. The handling

procedure specified for each packing type should be followed scrupulously.

If the proper procedures are not followed, the quality and reliability of

devices may be degraded. This section describes general precautions for

handling moisture-proof packing. Since the details may differ from device

to device, refer also to the relevant individual datasheets or databook.

(1) General precautions

• Follow the instructions printed on the device cartons regarding transportation and storage.

• Do not drop or toss device packing. The laminated aluminum material in it can be rendered

ineffective by rough handling.

• The storage area temperature should be kept within a temperature range of 5°C to 30°C, and

relative humidity should be maintained at 90% (max). Use devices within 12 months of the date

marked on the package seal.

• If the 12-month storage period has expired, or if the 30% humidity indicator shown in Figure 1

is pink when the packing is opened, it may be advisable, depending on the device and packing

type, to bake the devices at high temperature to remove any moisture. Please refer to the table

below. After the pack has been opened, use the devices in a 5°C to 30°C, 60% RH environment

and within the effective usage period listed on the moisture-proof package. If the effective usage

period has expired, or if the packing has been stored in a high-humidity environment, bake the

devices at high temperature.

Packing   Moisture removal

Tray      If the packing bears the “Heatproof” marking or indicates the maximum temperature which it can
          withstand, bake at 125°C for 20 hours. (Some devices require a different procedure.)

Tube      Transfer devices to trays bearing the “Heatproof” marking or indicating the temperature which they
          can withstand, or to aluminum tubes, before baking at 125°C for 20 hours.

Tape      Devices packed on tape cannot be baked and must be used within the effective usage period after
          unpacking, as specified on the packing.

• When baking devices, protect the devices from static electricity.

• Moisture indicators can detect the approximate humidity level at a standard temperature of

25°C. 6-point indicators and 3-point indicators are currently in use, but eventually all indicators

will be 3-point indicators.
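The moisture-removal rules above (12-month limit, 30% indicator, effective usage period, and the per-packing table) can be summarized as a small decision function. This is a rough sketch under stated assumptions, not a Toshiba procedure: the class, field and function names are hypothetical, the returned strings paraphrase the table, and the text itself notes that baking is only advisable "depending on the device and packing type" and that some devices require a different procedure.

```python
from dataclasses import dataclass


@dataclass
class Pack:
    packing: str                     # "tray", "tube" or "tape" (table rows)
    months_since_seal_date: float    # time since the date on the package seal
    indicator_30_pink: bool          # 30% indicator spot is pink when opened
    usage_period_expired: bool = False
    stored_in_high_humidity: bool = False


def moisture_action(pack):
    """Suggest a handling step based on the general guidance above."""
    needs_drying = (
        pack.months_since_seal_date > 12
        or pack.indicator_30_pink
        or pack.usage_period_expired
        or pack.stored_in_high_humidity
    )
    if not needs_drying:
        return "use within the effective usage period"
    if pack.packing == "tray":
        return "bake at 125 C for 20 h if the tray is marked Heatproof"
    if pack.packing == "tube":
        return "transfer to Heatproof tray or aluminum tube, then bake at 125 C for 20 h"
    if pack.packing == "tape":
        return "cannot be baked; must be used within the effective usage period"
    raise ValueError(f"unknown packing type: {pack.packing!r}")


if __name__ == "__main__":
    print(moisture_action(Pack("tube", months_since_seal_date=14, indicator_30_pink=False)))
```

For taped devices the function can only report the constraint, because the table offers no recovery step once moisture has been absorbed.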

[Figure: humidity indicator cards labeled “HUMIDITY INDICATOR”, with the legends “DANGER IF PINK”, “CHANGE DESICCANT” and “READ AT LAVENDER BETWEEN PINK & BLUE”. (a) 6-point indicator with spots at 10%, 20%, 30%, 40%, 50% and 60%; (b) 3-point indicator.]

Figure 1 Humidity indicator
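
As a rough illustration, the moisture-handling rules above can be written as a small decision function. This is only a sketch under stated assumptions: the names `Pack` and `moisture_action` are hypothetical, the handling of a tray without a “Heatproof” marking is an assumption (the table does not cover that case), and the individual device documentation takes precedence.

```python
from dataclasses import dataclass

# Limits taken from the text above. All names are hypothetical and exist
# only for this illustration.
MAX_STORAGE_MONTHS = 12
BAKE_TEMP_C = 125
BAKE_HOURS = 20


@dataclass
class Pack:
    packing: str             # "tray", "tube" or "tape"
    months_stored: float     # time since the date marked on the package seal
    indicator_30_pink: bool  # True if the 30% humidity spot is pink on opening
    heatproof: bool = False  # packing marked "Heatproof" or with a max temperature


def moisture_action(pack: Pack) -> str:
    """Suggest an action for a freshly opened pack."""
    expired = pack.months_stored > MAX_STORAGE_MONTHS
    if not (expired or pack.indicator_30_pink):
        return "use within the effective usage period"
    bake = f"bake at {BAKE_TEMP_C} C for {BAKE_HOURS} h"
    if pack.packing == "tape":
        # Devices packed on tape cannot be baked.
        return "do not bake: consult the device documentation"
    if pack.packing == "tube":
        return "transfer to heatproof trays or aluminum tubes, then " + bake
    if pack.packing == "tray" and pack.heatproof:
        return bake
    # Assumption: anything else is treated as not bakeable in its packing.
    return "do not bake in this packing: consult the device documentation"


print(moisture_action(Pack("tray", months_stored=13, indicator_30_pink=False, heatproof=True)))
```

Some devices require a different baking procedure, so a function like this is at most a first check before consulting the datasheet.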


3.3 Design

Care must be exercised in the design of electronic equipment to achieve the desired reliability. It is

important not only to adhere to specifications concerning absolute maximum ratings and

recommended operating conditions, but also to consider the overall environment in

which equipment will be used, including factors such as the ambient temperature, transient noise

and voltage and current surges, as well as mounting conditions which affect device reliability. This

section describes some general precautions which you should observe when designing circuits and

when mounting devices on printed circuit boards.

For more detailed information about each product family, refer to the relevant individual technical

datasheets available from Toshiba.

3.3.1 Absolute maximum ratings

Do not use devices under conditions in which their absolute maximum ratings

(e.g. current, voltage, power dissipation or temperature) will be exceeded. A

device may break down or its performance may be degraded, causing it to

catch fire or explode, resulting in injury to the user.

The absolute maximum ratings are rated values which must not be

exceeded during operation, even for an instant. Although absolute

maximum ratings differ from product to product, they essentially

concern the voltage and current at each pin, the allowable power

dissipation, and the junction and storage temperatures.

If the voltage or current on any pin exceeds the absolute maximum

rating, the device’s internal circuitry can become degraded. In the worst

case, heat generated in internal circuitry can fuse wiring or cause the semiconductor chip to break

down.

If storage or operating temperatures exceed rated values, the package seal can deteriorate or the

wires can become disconnected due to the differences between the thermal expansion coefficients

of the materials from which the device is constructed.

3.3.2 Recommended operating conditions

The recommended operating conditions for each device are those necessary to guarantee that the

device will operate as specified in the datasheet.

If greater reliability is required, derate the device’s absolute maximum ratings for voltage, current,

power and temperature before using it.

3.3.3 Derating

When incorporating a device into your design, reduce its rated absolute maximum voltage, current,

power dissipation and operating temperature in order to ensure high reliability.

Since derating differs from application to application, refer to the technical datasheets available

for the various devices used in your design.

3.3.4 Unused pins

If unused pins are left open, some devices can exhibit input instability problems, resulting in

malfunctions such as an abrupt increase in current flow. Similarly, if the unused output pins on a

device are connected to the power supply pin, the ground pin or to other output pins, the IC may

malfunction or break down.


Since the details regarding the handling of unused pins differ from device to device and from pin

to pin, please follow the instructions given in the relevant individual datasheets or databook.

CMOS logic IC inputs, for example, have extremely high impedance. If an input pin is left open, it

can easily pick up extraneous noise and become unstable. In this case, if the input voltage level

reaches an intermediate level, it is possible that both the P-channel and N-channel transistors

will be turned on, allowing unwanted supply current to flow. Therefore, ensure that the unused

input pins of a device are connected to the power supply (Vcc) pin or ground (GND) pin of the same

device. For details of what to do with the pins of heat sinks, refer to the relevant technical

datasheet and databook.

3.3.5 Latch-up

Latch-up is an abnormal condition inherent in CMOS devices, in which Vcc gets shorted to ground.

This happens when a parasitic PN-PN junction (thyristor structure) internal to the CMOS chip is

turned on, causing a large current of the order of several hundred mA or more to flow between Vcc

and GND, eventually causing the device to break down.

Latch-up occurs when the input or output voltage exceeds the rated value, causing a large current

to flow in the internal chip, or when the voltage on the Vcc (Vdd) pin exceeds its rated value,

forcing the internal chip into a breakdown condition. Once the chip falls into the latch-up state,

even though the excess voltage may have been applied only for an instant, the large current

continues to flow between Vcc (Vdd) and GND (Vss). This causes the device to heat up and, in

extreme cases, to emit gas fumes as well. To avoid this problem, observe the following precautions:

(1) Do not allow voltage levels on the input and output pins either to rise above Vcc (Vdd) or to

fall below GND (Vss). Also, follow any prescribed power-on sequence, so that power is applied

gradually or in steps rather than abruptly.

(2) Do not allow any abnormal noise signals to be applied to the device.

(3) Set the voltage levels of unused input pins to Vcc (Vdd) or GND (Vss).

(4) Do not connect output pins to one another.

3.3.6 Input/Output protection

Wired-AND configurations, in which outputs are connected together, cannot be used, since this

short-circuits the outputs. Outputs should, of course, never be connected to Vcc (Vdd) or GND

(Vss).

Furthermore, ICs with tri-state outputs can undergo performance degradation if a shorted output

current is allowed to flow for an extended period of time. Therefore, when designing circuits, make

sure that tri-state outputs will not be enabled simultaneously.

3.3.7 Load capacitance

Some devices display increased delay times if the load capacitance is large. Also, large charging

and discharging currents will flow in the device, causing noise. Furthermore, since outputs are

shorted for a relatively long time, wiring can become fused.

Consult the technical information for the device being used to determine the recommended load

capacitance.


3.3.8 Thermal design

The failure rate of semiconductor devices is greatly increased as operating temperatures increase.

As shown in Figure 2, the internal thermal stress on a device is the sum of the ambient

temperature and the temperature rise due to power dissipation in the device. Therefore, to

achieve optimum reliability, observe the following precautions concerning thermal design:

(1) Keep the ambient temperature (Ta) as low as possible.

(2) If the device’s dynamic power dissipation is relatively large, select the most appropriate

circuit board material, and consider the use of heat sinks or of forced air cooling. Such

measures will help lower the thermal resistance of the package.

(3) Derate the device’s absolute maximum ratings to minimize thermal stress from power

dissipation.

θja = θjc + θca

θja = (Tj–Ta) / P

θjc = (Tj–Tc) / P

θca = (Tc–Ta) / P

in which θja = thermal resistance between junction and surrounding air (°C/W)

θjc = thermal resistance between junction and package surface, or internal thermal

resistance (°C/W)

θca = thermal resistance between package surface and surrounding air, or external

thermal resistance (°C/W)

Tj = junction temperature or chip temperature (°C)

Tc = package surface temperature or case temperature (°C)

Ta = ambient temperature (°C)

P = power dissipation (W)

Figure 2 Thermal resistance of package (θjc: junction to package surface; θca: package surface to surrounding air)
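
The relations above can be rearranged to estimate the junction temperature, Tj = Ta + P × θja, or the power that keeps Tj below a chosen limit. The following is a minimal sketch of that arithmetic. The function names and all numeric values are hypothetical and are not taken from any datasheet.

```python
def junction_temperature(ta_c: float, power_w: float,
                         theta_jc: float, theta_ca: float) -> float:
    """Tj = Ta + P * (theta_jc + theta_ca), from theta_ja = (Tj - Ta) / P."""
    theta_ja = theta_jc + theta_ca
    return ta_c + power_w * theta_ja


def max_power(tj_max_c: float, ta_c: float, theta_ja: float) -> float:
    """Largest power dissipation (W) that keeps Tj at or below tj_max_c."""
    return (tj_max_c - ta_c) / theta_ja


# Illustrative values only: Ta = 50 C, P = 2 W, theta_jc = 5 C/W, theta_ca = 20 C/W.
tj = junction_temperature(ta_c=50.0, power_w=2.0, theta_jc=5.0, theta_ca=20.0)
print(tj)                             # 100.0
print(max_power(125.0, 50.0, 25.0))   # 3.0
```

Lowering θca, for example with a heat sink or forced air cooling as in precaution (2), reduces Tj for the same power dissipation.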

3.3.9 Interfacing

When connecting inputs and outputs between devices, make sure input voltage (VIL/VIH) and

output voltage (VOL/VOH) levels are matched. Otherwise, the devices may malfunction. When

connecting devices operating at different supply voltages, such as in a dual-power-supply system,

be aware that erroneous power-on and power-off sequences can result in device breakdown. For

details of how to interface particular devices, consult the relevant technical datasheets and

databooks. If you have any questions or doubts about interfacing, contact your nearest Toshiba

office or distributor.
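
One common way to check whether levels are matched is to compute the noise margins between the driving output and the receiving input: the high margin is VOH(min) − VIH(min) and the low margin is VIL(max) − VOL(max), and both should be positive. The sketch below shows this check. The function names and the voltage values are hypothetical examples, not specifications of any Toshiba device.

```python
def noise_margins(voh_min: float, vol_max: float,
                  vih_min: float, vil_max: float) -> tuple[float, float]:
    """Return (high margin, low margin) in volts for a driver/receiver pair."""
    nm_high = voh_min - vih_min   # how far a driven high sits above the input threshold
    nm_low = vil_max - vol_max    # how far a driven low sits below the input threshold
    return nm_high, nm_low


def levels_matched(voh_min: float, vol_max: float,
                   vih_min: float, vil_max: float) -> bool:
    """True if both noise margins are positive."""
    nm_high, nm_low = noise_margins(voh_min, vol_max, vih_min, vil_max)
    return nm_high > 0 and nm_low > 0


# Illustrative values only.
print(levels_matched(voh_min=2.4, vol_max=0.4, vih_min=2.0, vil_max=0.8))  # True
print(levels_matched(voh_min=2.4, vol_max=0.4, vih_min=3.5, vil_max=1.5))  # False
```

A pair that fails this check needs a level-shifting arrangement as described in the relevant datasheets. The check says nothing about power-on and power-off sequencing, which must be verified separately.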


3.3.10 Decoupling

Spike currents generated during switching can cause Vcc (Vdd) and GND (Vss) voltage levels to

fluctuate, causing ringing in the output waveform or a delay in response speed. (The power supply

and GND wiring impedance is normally 50 Ω to 100 Ω.) For this reason, the impedance of power

supply lines with respect to high frequencies must be kept low. This can be accomplished by using

thick and short wiring for the Vcc (Vdd) and GND (Vss) lines and by installing decoupling

capacitors (of approximately 0.01 µF to 1 µF capacitance) as high-frequency filters between Vcc

(Vdd) and GND (Vss) at strategic locations on the printed circuit board.

For low-frequency filtering, it is a good idea to install a 10- to 100-µF capacitor on the printed

circuit board (one capacitor will suffice). If the capacitance is excessively large, however, (e.g.

several thousand µF) latch-up can be a problem. Be sure to choose an appropriate capacitance

value.
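
To see why small capacitors serve as high-frequency filters and larger ones as low-frequency filters, it can help to compute the reactance of an ideal capacitor, Xc = 1 / (2πfC). The sketch below does this for values in the ranges mentioned above. It is a simplified model: real capacitors also have series resistance and inductance, so their impedance rises again above the self-resonant frequency. The function name and the chosen frequencies are assumptions for illustration.

```python
import math


def capacitor_reactance(freq_hz: float, cap_f: float) -> float:
    """Magnitude of the reactance of an ideal capacitor, in ohms."""
    return 1.0 / (2.0 * math.pi * freq_hz * cap_f)


# 0.1 uF decoupling capacitor at 100 MHz: roughly 0.016 ohm.
print(capacitor_reactance(100e6, 0.1e-6))

# 10 uF bulk capacitor at 1 kHz: roughly 16 ohm.
print(capacitor_reactance(1e3, 10e-6))
```

Both results are far below the 50 Ω to 100 Ω wiring impedance quoted above at the high-frequency end, which is the effect the decoupling capacitors are meant to achieve.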

An important point about wiring is that, in the case of high-speed logic ICs, noise is caused mainly

by reflection and crosstalk, or by the power supply impedance. Reflections cause increased signal

delay, ringing, overshoot and undershoot, thereby reducing the device’s safety margins with

respect to noise. To prevent reflections, reduce the wiring length by increasing the device

mounting density so as to lower the inductance (L) and capacitance (C) in the wiring. Extreme

care must be taken, however, when taking this corrective measure, since it tends to cause

crosstalk between the wires. In practice, there must be a trade-off between these two factors.

3.3.11 External noise

Printed circuit boards with long I/O or signal pattern lines are

vulnerable to induced noise or surges from outside sources.

Consequently, malfunctions or breakdowns can result from

overcurrent or overvoltage, depending on the types of device

used. To protect against noise, lower the impedance of the

pattern line or insert a noise-canceling circuit. Protective

measures must also be taken against surges.

For details of the appropriate protective measures for a

particular device, consult the relevant databook.

3.3.12 Electromagnetic interference

Widespread use of electrical and electronic equipment in recent years has brought with it radio

and TV reception problems due to electromagnetic interference. To use the radio spectrum

effectively and to maintain radio communications quality, each country has formulated

regulations limiting the amount of electromagnetic interference which can be generated by

individual products.

Electromagnetic interference includes conduction noise propagated through power supply and

telephone lines, and noise from direct electromagnetic waves radiated by equipment. Different

measurement methods and corrective measures are used to assess and counteract each specific

type of noise.

Difficulties in controlling electromagnetic interference derive from the fact that there is no

method available which allows designers to calculate, at the design stage, the strength of the

electromagnetic waves which will emanate from each component in a piece of equipment. For this

reason, it is only after the prototype equipment has been completed that the designer can take

measurements using a dedicated instrument to determine the strength of electromagnetic

interference waves. Yet it is possible during system design to incorporate some measures for the

prevention of electromagnetic interference, which can facilitate taking corrective measures once

the design has been completed. These include installing shields and noise filters, and increasing


the thickness of the power supply wiring patterns on the printed circuit board. One effective

method, for example, is to devise several shielding options during design, and then select the most

suitable shielding method based on the results of measurements taken after the prototype has

been completed.

3.3.13 Peripheral circuits

In most cases semiconductor devices are used with peripheral circuits and components. The input

and output signal voltages and currents in these circuits must be chosen to match the

semiconductor device’s specifications. The following factors must be taken into account.

(1) Inappropriate voltages or currents applied to a device’s input pins may cause it to operate

erratically. Some devices contain pull-up or pull-down resistors. When designing your system,

remember to take the effect of this on the voltage and current levels into account.

(2) The output pins on a device have a predetermined external circuit drive capability. If this

drive capability is greater than that required, either incorporate a compensating circuit into

your design or carefully select suitable components for use in external circuits.

3.3.14 Safety standards

Each country has safety standards which must be observed. These safety standards include

requirements for quality assurance systems and design of device insulation. Such requirements

must be fully taken into account to ensure that your design conforms to the applicable safety

standards.

3.3.15 Other precautions

(1) When designing a system, be sure to incorporate fail-safe and other appropriate measures

according to the intended purpose of your system. Also, be sure to debug your system under

actual board-mounted conditions.

(2) If a plastic-package device is placed in a strong electric field, surface leakage may occur due to

the charge-up phenomenon, resulting in device malfunction. In such cases take appropriate

measures to prevent this problem, for example by protecting the package surface with a

conductive shield.

(3) With some microcomputers and MOS memory devices, caution is required when powering on

or resetting the device. To ensure that your design does not violate device specifications,

consult the relevant databook for each constituent device.

(4) Ensure that no conductive material or object (such as a metal pin) can drop onto and short the

leads of a device mounted on a printed circuit board.

3.4 Inspection, Testing and Evaluation

3.4.1 Grounding

Ground all measuring instruments, jigs, tools and soldering irons to earth.

Electrical leakage may cause a device to break down or may result in electric

shock.


3.4.2 Inspection Sequence

• Do not insert devices in the wrong orientation. Make sure that the positive

and negative electrodes of the power supply are correctly connected.

Otherwise, the rated maximum current or maximum power dissipation

may be exceeded and the device may break down or undergo performance

degradation, causing it to catch fire or explode, resulting in injury to the

user.

• When conducting any kind of evaluation, inspection or testing using AC

power with a peak voltage of 42.4 V or DC power exceeding 60 V, be sure to

connect the electrodes or probes of the testing equipment to the device

under test before powering it on. Connecting the electrodes or probes of

testing equipment to a device while it is powered on may result in electric

shock, causing injury.

(1) Apply voltage to the test jig only after inserting the device securely into it. When applying or

removing power, observe the relevant precautions, if any.

(2) Make sure that the voltage applied to the device is off before removing the device from the

test jig. Otherwise, the device may undergo performance degradation or be destroyed.

(3) Make sure that no surge voltages from the measuring equipment are applied to the device.

(4) The chips housed in tape carrier packages (TCPs) are bare chips and are therefore exposed.

During inspection take care not to crack the chip or cause any flaws in it.

Electrical contact may also cause a chip to become faulty. Therefore make sure that nothing

comes into electrical contact with the chip.

3.5 Mounting

There are essentially two main types of semiconductor device package: lead insertion and surface

mount. During mounting on printed circuit boards, devices can become contaminated by flux or

damaged by thermal stress from the soldering process. With surface-mount devices in particular,

the most significant problem is thermal stress from solder reflow, when the entire package is

subjected to heat. This section describes a recommended temperature profile for each mounting

method, as well as general precautions which you should take when mounting devices on printed

circuit boards. Note, however, that even for devices with the same package type, the appropriate

mounting method varies according to the size of the chip and the size and shape of the lead frame.

Therefore, please consult the relevant technical datasheet and databook.

3.5.1 Lead forming

• Always wear protective glasses when cutting the leads of a device with

clippers or a similar tool. If you do not, small bits of metal flying off the cut

ends may damage your eyes.

• Do not touch the tips of device leads. Because some types of device have

leads with pointed tips, you may prick your finger.

Semiconductor devices must undergo a process in which the leads are cut and formed before the

devices can be mounted on a printed circuit board. If undue stress is applied to the interior of a

device during this process, mechanical breakdown or performance degradation can result. This is

attributable primarily to differences between the stress on the device’s external leads and the

stress on the internal leads. If the relative difference is great enough, the device’s internal leads,

adhesive properties or sealant can be damaged. Observe these precautions during the lead-

forming process (this does not apply to surface-mount devices):


(1) Lead insertion hole intervals on the printed circuit board should match the lead pitch of the

device precisely.

(2) If lead insertion hole intervals on the printed circuit board do not precisely match the lead

pitch of the device, do not attempt to forcibly insert devices by pressing on them or by pulling

on their leads.

(3) For the minimum clearance specification between a device and a

printed circuit board, refer to the relevant device’s datasheet and

databook. If necessary, achieve the required clearance by forming

the device’s leads appropriately. Do not use the spacers which are

used to raise devices above the surface of the printed circuit board

during soldering to achieve clearance. These spacers normally

continue to expand due to heat, even after the solder has begun to solidify; this applies severe

stress to the device.

(4) Observe the following precautions when forming the leads of a device prior to mounting.

• Use a tool or jig to secure the lead at its base (where the lead meets the device package) while

bending so as to avoid mechanical stress to the device. Also avoid bending or stretching device

leads repeatedly.

• Be careful not to damage the lead during lead forming.

• Follow any other precautions described in the individual datasheets and databooks for each

device and package type.

3.5.2 Socket mounting

(1) When socket mounting devices on a printed circuit board, use sockets which match the

inserted device’s package.

(2) Use sockets whose contacts have the appropriate contact pressure. If the contact pressure is

insufficient, the socket may not make a perfect contact when the device is repeatedly inserted

and removed; if the pressure is excessively high, the device leads may be bent or damaged

when they are inserted into or removed from the socket.

(3) When soldering sockets to the printed circuit board, use sockets whose construction prevents

flux from penetrating into the contacts or which allows flux to be completely cleaned off.

(4) Make sure the coating agent applied to the printed circuit board for moisture-proofing

purposes does not stick to the socket contacts.

(5) If the device leads are severely bent by a socket as it is inserted or removed and you wish to

repair the leads so as to continue using the device, make sure that this lead correction is only

performed once. Do not use devices whose leads have been corrected more than once.

(6) If the printed circuit board with the devices mounted on it will be subjected to vibration from

external sources, use sockets which have a strong contact pressure so as to prevent the

sockets and devices from vibrating relative to one another.

3.5.3 Soldering temperature profile

The soldering temperature and heating time vary from device to device. Therefore, when

specifying the mounting conditions, refer to the individual datasheets and databooks for the

devices used.


(1) Using a soldering iron

Complete soldering within ten seconds for lead temperatures of up to 260°C, or within three

seconds for lead temperatures of up to 350°C.

(2) Using medium infrared ray reflow

• Heating top and bottom with long or medium infrared rays is recommended (see Figure 3).

Figure 3 Heating top and bottom with long or medium infrared rays (long infrared ray heater for preheating, medium infrared ray heater for reflow, arranged along the product flow)

• Complete the infrared ray reflow process within 30 seconds at a package surface temperature of

between 210°C and 240°C.

• Refer to Figure 4 for an example of a good temperature profile for infrared or hot air reflow.

Figure 4 Sample temperature profile for infrared or hot air reflow (package surface temperature in °C versus time in s: 140°C to 160°C for 60 to 120 s, then 210°C to 240°C for 30 s or less)

(3) Using hot air reflow

• Complete hot air reflow within 30 seconds at a package surface temperature of between 210°C

and 240°C.

• For an example of a recommended temperature profile, refer to Figure 4 above.

(4) Using solder flow

• Apply preheating for 60 to 120 seconds at a temperature of 150°C.

• For lead insertion-type packages, complete solder flow within 10 seconds, with the

temperature at the stopper (or, if there is no stopper, at a location more than 1.5 mm from

the body) not exceeding 260°C.


• For surface-mount packages, complete soldering within 5 seconds at a temperature of 250°C or

less in order to prevent thermal stress in the device.

• Figure 5 shows an example of a recommended temperature profile for surface-mount packages

using solder flow.

Figure 5 Sample temperature profile for solder flow (package surface temperature in °C versus time in s: 140°C to 160°C for 60 to 120 s, then 250°C for 5 s or less)
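
The general limits in this section lend themselves to a simple automated check of a planned or measured soldering profile. The sketch below encodes the soldering iron and reflow limits quoted above. The function names and the sample profile are hypothetical, the time above 210°C is only approximated from sampled points, and the limits in the individual device datasheets take precedence over these general values.

```python
# General limits quoted in this section.
IRON_LIMITS = [(260, 10), (350, 3)]  # (max lead temperature in C, max seconds)
REFLOW_WINDOW_C = (210, 240)         # package surface temperature window
REFLOW_MAX_SECONDS = 30              # max time inside the window


def iron_max_seconds(lead_temp_c: float):
    """Max soldering-iron contact time, or None if the temperature is too high."""
    for max_temp, max_seconds in IRON_LIMITS:
        if lead_temp_c <= max_temp:
            return max_seconds
    return None


def reflow_ok(profile) -> bool:
    """Check a reflow profile given as time-ordered (time_s, temp_c) samples.

    Time in the window is approximated by summing the sample intervals whose
    two endpoints are both at or above the lower window temperature.
    """
    low, high = REFLOW_WINDOW_C
    peak = max(temp for _, temp in profile)
    hot_time = sum(
        t1 - t0
        for (t0, c0), (t1, c1) in zip(profile, profile[1:])
        if c0 >= low and c1 >= low
    )
    return peak <= high and hot_time <= REFLOW_MAX_SECONDS


# Hypothetical measured profile: preheat, 25 s at or above 210 C, peak 235 C.
sample = [(0, 25), (90, 150), (180, 160), (200, 210),
          (215, 235), (225, 210), (260, 100)]
print(reflow_ok(sample))        # True
print(iron_max_seconds(300))    # 3
```

A finer sampling interval gives a better estimate of the time spent in the 210°C to 240°C window.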

3.5.4 Flux cleaning and ultrasonic cleaning

(1) When cleaning circuit boards to remove flux, make sure that no residual reactive ions such as

Na or Cl remain. Note that organic solvents react with water to generate hydrogen chloride

and other corrosive gases which can degrade device performance.

(2) Washing devices with water will not cause any problems. However, make sure that no

reactive ions such as sodium and chlorine are left as a residue. Also, be sure to dry devices

sufficiently after washing.

(3) Do not rub device markings with a brush or with your hand during cleaning or while the

devices are still wet from the cleaning agent. Doing so can rub off the markings.

(4) The dip cleaning, shower cleaning and steam cleaning processes all involve the chemical

action of a solvent. Use only recommended solvents for these cleaning methods. When

immersing devices in a solvent or steam bath, make sure that the temperature of the liquid is

50°C or below, and that the circuit board is removed from the bath within one minute.

(5) Ultrasonic cleaning should not be used with hermetically-sealed ceramic packages such as a

leadless chip carrier (LCC), pin grid array (PGA) or charge-coupled device (CCD), because the

bonding wires can become disconnected due to resonance during the cleaning process. Even if

a device package allows ultrasonic cleaning, limit the duration of ultrasonic cleaning to as

short a time as possible, since long hours of ultrasonic cleaning degrade the adhesion between

the mold resin and the frame material. The following ultrasonic cleaning conditions are

recommended:

Frequency: 27 kHz to 29 kHz

Ultrasonic output power: 300 W or less (0.25 W/cm² or less)

Cleaning time: 30 seconds or less

Suspend the circuit board in the solvent bath during ultrasonic cleaning in such a way that

the ultrasonic vibrator does not come into direct contact with the circuit board or the device.


3.5.5 No cleaning

If analog devices or high-speed devices are used without being cleaned, flux residues may cause

minute amounts of leakage between pins. Similarly, dew condensation, which occurs in

environments containing residual chlorine when power to the device is on, may cause between-

lead leakage or migration. Therefore, Toshiba recommends that these devices be cleaned.

However, if the flux used contains only a small amount of halogen (0.05 wt% or less), the devices

may be used without cleaning.

3.5.6 Mounting tape carrier packages (TCPs)

(1) When tape carrier packages (TCPs) are mounted, measures must be taken to prevent

electrostatic breakdown of the devices.

(2) If devices are being picked up from tape, or outer lead bonding (OLB) mounting is being

carried out, consult the manufacturer of the insertion machine which is being used, in order

to establish the optimum mounting conditions in advance and to avoid any possible hazards.

(3) The base film, which is made of polyimide, is hard and thin. Be careful not to cut or scratch

your hands or any objects while handling the tape.

(4) When punching tape, try not to scatter broken pieces of tape too much.

(5) Treat the extra film, reels and spacers left after punching as industrial waste, taking care not

to destroy or pollute the environment.

(6) Chips housed in tape carrier packages (TCPs) are bare chips and therefore have their reverse

side exposed. To ensure that the chip will not be cracked during mounting, ensure that no

mechanical shock is applied to the reverse side of the chip. Electrical contact may also cause a

chip to fail. Therefore, when mounting devices, make sure that nothing comes into electrical

contact with the reverse side of the chip.

If your design requires connecting the reverse side of the chip to the circuit board, please

consult Toshiba or a Toshiba distributor beforehand.

3.5.7 Mounting chips

Devices delivered in chip form tend to degrade or break under external forces much more easily

than plastic-packaged devices. Therefore, caution is required when handling this type of device.

(1) Mount devices in a properly prepared environment so that chip surfaces will not be exposed to

polluted ambient air or other polluted substances.

(2) When handling chips, be careful not to expose them to static electricity.

In particular, measures must be taken to prevent static damage during the mounting of chips.

With this in mind, Toshiba recommends mounting all peripheral parts first and then mounting

chips last (after all other components have been mounted).

(3) Make sure that PCBs (or any other kind of circuit board) on which chips are being mounted do

not have any chemical residues on them (such as the chemicals which were used for etching

the PCBs).

(4) When mounting chips on a board, use the method of assembly that is most suitable for

maintaining the appropriate electrical, thermal and mechanical properties of the

semiconductor devices used.

* For details of devices in chip form, refer to the relevant device’s individual datasheets.


3.5.8 Circuit board coating

When devices are to be used in equipment requiring a high degree of reliability or in extreme

environments (where moisture, corrosive gas or dust is present), circuit boards may be coated for

protection. However, before doing so, you must carefully consider the possible stress and

contamination effects that may result and then choose the coating resin which results in the

minimum level of stress to the device.

3.5.9 Heat sinks

(1) When attaching a heat sink to a device, be careful not to apply excessive force to the device in

the process.

(2) When attaching a device to a heat sink by fixing it at two or more locations, evenly tighten all

the screws in stages (i.e. do not fully tighten one screw while the rest are still only loosely

tightened). Finally, fully tighten all the screws up to the specified torque.

(3) Drill holes for screws in the heat sink exactly as specified. Smooth the

surface by removing burrs and protrusions or indentations which might

interfere with the installation of any part of the device.

(4) A coating of silicone compound can be applied between the heat sink and

the device to improve heat conductivity. Be sure to apply the coating

thinly and evenly; do not use too much. Also, be sure to use a non-volatile

compound, as volatile compounds can crack after a time, causing the heat

radiation properties of the heat sink to deteriorate.

(5) If the device is housed in a plastic package, use caution when selecting the type of silicone

compound to be applied between the heat sink and the device. With some types, the base oil

separates and penetrates the plastic package, significantly reducing the useful life of the

device.

Two recommended silicone compounds in which base oil separation is not a problem are

YG6260 from Toshiba Silicone.

(6) Heat-sink-equipped devices can become very hot during operation. Do not touch them, or you

may sustain a burn.

3.5.10 Tightening torque

(1) Make sure the screws are tightened with fastening torques not exceeding the torque values

stipulated in individual datasheets and databooks for the devices used.

(2) Do not allow a power screwdriver (electrical or air-driven) to touch devices.

3.5.11 Repeated device mounting and usage

Do not remount or re-use devices which fall into the categories listed below; these devices may

cause significant problems relating to performance and reliability.

(1) Devices which have been removed from the board after soldering

(2) Devices which have been inserted in the wrong orientation or which have had reverse current

applied

(3) Devices which have undergone lead forming more than once


3.6 Protecting Devices in the Field

3.6.1 Temperature

Semiconductor devices are generally more sensitive to temperature than are other electronic

components. The various electrical characteristics of a semiconductor device are dependent on the

ambient temperature at which the device is used. It is therefore necessary to understand the

temperature characteristics of a device and to incorporate device derating into circuit design. Note

also that if a device is used above its maximum temperature rating, device deterioration is more

rapid and it will reach the end of its usable life sooner than expected.

3.6.2 Humidity

Resin-molded devices are sometimes improperly sealed. When these devices are used for an

extended period of time in a high-humidity environment, moisture can penetrate into the device

and cause chip degradation or malfunction. Furthermore, when devices are mounted on a regular

printed circuit board, the impedance between wiring components can decrease under high-

humidity conditions. In systems which require a high signal-source impedance, circuit board

leakage or leakage between device lead pins can cause malfunctions. The application of a

moisture-proof treatment to the device surface should be considered in this case. On the other

hand, operation under low-humidity conditions can damage a device due to the occurrence of

electrostatic discharge. Unless damp-proofing measures have been specifically taken, use devices

only in environments with appropriate ambient moisture levels (i.e. within a relative humidity

range of 40% to 60%).

3.6.3 Corrosive gases

Corrosive gases can cause chemical reactions in devices, degrading device characteristics.

For example, sulphur-bearing corrosive gases emanating from rubber placed near a device

(accompanied by condensation under high-humidity conditions) can corrode a device’s leads. The

resulting chemical reaction between leads forms foreign particles which can cause electrical

leakage.

3.6.4 Radioactive and cosmic rays

Most industrial and consumer semiconductor devices are not designed with protection against

radioactive and cosmic rays. Devices used in aerospace equipment or in radioactive environments

must therefore be shielded.

3.6.5 Strong electrical and magnetic fields

Devices exposed to strong magnetic fields can undergo a polarization phenomenon in their plastic

material, or within the chip, which gives rise to abnormal symptoms such as impedance changes

or increased leakage current. Failures have been reported in LSIs mounted near malfunctioning

deflection yokes in TV sets. In such cases the device’s installation location must be changed or the

device must be shielded against the electrical or magnetic field. Shielding against magnetism is

especially necessary for devices used in an alternating magnetic field because of the electromotive

forces generated in this type of environment.

3.6.6 Interference from light (ultraviolet rays, sunlight, fluorescent lamps and

incandescent lamps)

Light striking a semiconductor device generates electromotive force due to photoelectric effects. In

some cases the device can malfunction. This is especially true for devices in which the internal

chip is exposed. When designing circuits, make sure that devices are protected against incident

light from external sources. This problem is not limited to optical semiconductors and EPROMs.

All types of device can be affected by light.

3.6.7 Dust and oil

Just like corrosive gases, dust and oil can cause chemical reactions in devices, which will

adversely affect a device’s electrical characteristics. To avoid this problem, do not use devices in

dusty or oily environments. This is especially important for optical devices because dust and oil

can affect a device’s optical characteristics as well as its physical integrity and the electrical

performance factors mentioned above.

3.6.8 Fire

Semiconductor devices are combustible; they can emit smoke and catch fire if heated sufficiently.

When this happens, some devices may generate poisonous gases. Devices should therefore never

be used in close proximity to an open flame or a heat-generating body, or near flammable or

combustible materials.

3.7 Disposal of Devices and Packing Materials

When discarding unused devices and packing materials, follow all procedures specified by local

regulations in order to protect the environment against contamination.

4. Precautions and Usage Considerations

This section describes matters specific to each product group which need to be taken into

consideration when using devices. If the same item is described in Sections 3 and 4, the

description in Section 4 takes precedence.

4.1 Microcontrollers

4.1.1 Design

(1) Using resonators which are not specifically recommended for use

Resonators recommended for use with Toshiba products in microcontroller oscillator applications

are listed in Toshiba databooks along with information about oscillation conditions. If you use a

resonator not included in this list, please consult Toshiba or the resonator manufacturer

concerning the suitability of the device for your application.

(2) Undefined functions

In some microcontrollers certain instruction code values do not constitute valid processor

instructions. Also, it is possible that the values of bits in registers will become undefined. Take

care in your applications not to use invalid instructions or to let register bit values become

undefined.

64-Bit TX System RISC

TX49/H2 Core Architecture

TX49/H2 Architecture

I TX49/H2 Processor Core Specification

1. Introduction

The TX49/H2 Processor Core is a high-performance, low-power 64-bit RISC microprocessor

core developed by Toshiba. It is well suited to embedded applications such as networking,

laser printers, set-top boxes (STB) and 3-D graphics.

In this manual, TX49/H2 is called “TX49” hereinafter.

2. Features

• 64-bit operation

• Thirty-two 64-bit integer general-purpose registers

• Thirty-two 64-bit floating-point general-purpose registers

• 64 GB physical address space

• Instruction set

  - Upward compatible with the MIPS I, MIPS II, and MIPS III ISAs

  - MAC (Multiply and Accumulate) instructions

  - PREF (Prefetch) instruction

• Optimized 5-stage pipeline

• Instruction cache

  - 8 KB / 16 KB / 32 KB: fixed for each product

  - Four-way set associative

  - Lock function support (Way1-Way3)

• Data cache

  - 8 KB / 16 KB / 32 KB: fixed for each product

  - Four-way set associative

  - Lock function support (Way1-Way3)

  - Write policies: Write-back, Write-through-No-Write-Allocate-Snoop,

    Write-through-Write-Allocate-Snoop

• MMU

  - 48-double-entry (even/odd) Joint TLB

  - 2-entry Instruction TLB

  - 4-entry Data TLB

• IEEE754 compatible single and double precision FPU

  - Single and double precision FPU in hardware

• Debug support (EJTAG)

  - Debug instructions

  - Real-time debugging is supported by debug module logic

• Power management modes (Halt, Doze)

  - WAIT instruction

3. TX49 Block Diagram

Figure 3-1 shows the block diagram of the TX49 Pure Core, MPU Core and MCU. The TX49 Pure Core

includes an instruction cache and a data cache. The cache configuration can be selected from

among several options; for each ASSP product, however, the cache size is predetermined.

[Block diagram: the TX49 Pure Core, the TX49 MPU Core and the TX49 MCU, built from the following blocks]

• Instruction Cache: 8 KB/16 KB/32 KB, 4-way set associative, lockable
• Data Cache: 8 KB/16 KB/32 KB, 4-way set associative, lockable, WB/WT
• Integer Unit: GPR, data path, MAC, pipeline control
• CP0: CP0 registers, MMU/TLB, exception unit
• FPU (CP1): FP registers, data path
• Debug Support Unit
• Write Buffer
• GBUS I/F
• Peripheral

Figure 3-1 Block Diagram of the TX49

4. CPU Registers Overview

4.1 Introduction

The TX49 has CPU registers, used for integer operations and address calculation, and CP0

registers, used for memory system control and exception handling.

4.2 CPU Registers

The TX49 has the following 64-bit CPU registers.

• 32 general-purpose registers

• A 64-bit program counter

• HI/LO registers for storing the results of multiply and divide operations

Figure 4-1 shows the configuration of these registers.

General Purpose Registers (GPR)      Multiply/Divide Registers
63                          0        63                      0
r0 = 0                               HI
r1                                   63                      0
r2                                   LO
.                                    Program Counter
r29                                  63                      0
r30                                  PC
r31 = Link Address

Figure 4-1 TX49 CPU registers

The r0 and r31 registers of GPR have special functions as follows.

• Register r0 always contains the value 0. It can be a target register of an instruction

whose operation result is not needed. Or, it can be a source register of an instruction

that requires a value of 0.

• Register r31 is the link register for the Jump and Link instruction. The address of

the instruction after the delay slot is placed in r31.

The TX49 also has the following special registers, which are used or modified implicitly by

certain instructions.

• HI - Holds the high-order bits of the result of an integer multiply operation or the

remainder of an integer divide operation.

• LO - Holds the low-order bits of the result of an integer multiply operation or the

quotient of an integer divide operation.

These two registers store the result of an integer multiplication or division. In

multiplication, the 64 high-order bits of a 128-bit result are stored in HI, and the 64

low-order bits are stored in LO. In division, the resulting quotient is stored in LO, and the

remainder is stored in HI.

• PC - Program Counter

The register contains the address of the currently executed instruction.
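The HI/LO convention described above can be sketched in a few lines of Python. This is only an illustrative model, not Toshiba's implementation: the function names `dmult` and `ddiv` and the helper `to_signed64` are hypothetical, division is assumed to truncate toward zero, and division by zero is not modeled.

```python
MASK64 = (1 << 64) - 1


def to_signed64(x):
    """Interpret a 64-bit pattern as a two's-complement signed integer."""
    x &= MASK64
    return x - (1 << 64) if x >> 63 else x


def dmult(rs, rt):
    """Signed 64 x 64 multiply: returns (HI, LO) as unsigned 64-bit patterns."""
    product = to_signed64(rs) * to_signed64(rt)  # exact 128-bit result
    return (product >> 64) & MASK64, product & MASK64


def ddiv(rs, rt):
    """Signed 64-bit divide: returns (HI, LO) = (remainder, quotient).

    Assumption: the quotient truncates toward zero; rt == 0 is not modeled.
    """
    a, b = to_signed64(rs), to_signed64(rt)
    q = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        q = -q
    r = a - q * b
    return r & MASK64, q & MASK64
```

For example, multiplying 2^32 by 2^32 gives a product of 2^64, so HI holds 1 and LO holds 0.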

4.3 CP0 Registers

The TX49 has 32-bit and 64-bit system control coprocessor (CP0) registers. These

registers are used for memory system control and exception handling. Table 4-1 lists the CP0

registers built into the TX49. They are described in more detail in Chapter 7.

Table 4-1 CP0 Registers

Register              No.       Register              No.
Index                 Reg#0     Config                Reg#16
Random                Reg#1     LLAddr                Reg#17
EntryLo0              Reg#2     (Reserved) (Note 1)   Reg#18
EntryLo1              Reg#3     (Reserved) (Note 1)   Reg#19
Context               Reg#4     XContext              Reg#20
PageMask              Reg#5     (Reserved) (Note 1)   Reg#21
Wired                 Reg#6     (Reserved) (Note 1)   Reg#22
(Reserved) (Note 1)   Reg#7     Debug (Note 2)        Reg#23
BadVAddr              Reg#8     DEPC (Note 2)         Reg#24
Count                 Reg#9     (Reserved) (Note 1)   Reg#25
EntryHi               Reg#10    (Reserved) (Note 1)   Reg#26
Compare               Reg#11    (Reserved) (Note 1)   Reg#27
Status                Reg#12    TagLo                 Reg#28
Cause                 Reg#13    TagHi                 Reg#29
EPC                   Reg#14    ErrorEPC              Reg#30
PRId                  Reg#15    DESAVE (Note 2)       Reg#31

Note 1: These registers are used to test the System Control Coprocessor (CP0) and should not be

accessed by the user.

Note 2: These registers are exclusively used by external in-circuit emulators (ICE).

5. CPU Instruction Set Summary

5.1 Introduction

Each instruction is 32 bits long. These instructions are upward compatible with the MIPS I,

II and III instruction set architecture and the TX39’s instructions.

5.2 Instruction Format

There are three instruction formats: Immediate (I-type), Jump (J-type) and Register (R-

type), as shown in Figure 5-1. Having just three instruction formats simplifies instruction

decoding. If more complex functions or addressing modes are required, they can be produced

with the compiler using combinations of the instructions.

Immediate (I-type)

 31    26 25    21 20    16 15                       0
|   op   |   rs   |   rt   |        immediate         |

Jump (J-type)

 31    26 25                                         0
|   op   |                  target                   |

Register (R-type)

 31    26 25    21 20    16 15    11 10     6 5      0
|   op   |   rs   |   rt   |   rd   |   sa   | funct  |

op         Operation code (6 bits)
rs         Source register (5 bits)
rt         Target (source or destination) register, or branch condition (5 bits)
rd         Destination register (5 bits)
immediate  Immediate, branch displacement, address displacement (16 bits)
target     Branch target address (26 bits)
sa         Shift amount (5 bits)
funct      Function (6 bits)

Figure 5-1 Instruction formats and subfield mnemonics
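As an illustration of the field layout in Figure 5-1, the following Python sketch extracts every subfield from a 32-bit instruction word. The function name `decode_fields` is hypothetical; which fields are meaningful depends on the format (I-, J- or R-type) selected by the opcode, which this sketch does not interpret.

```python
def decode_fields(word):
    """Split a 32-bit instruction word into the subfields of Figure 5-1.

    All fields are returned; the caller picks the ones that apply to the
    instruction's format (I-type, J-type or R-type).
    """
    word &= 0xFFFFFFFF
    return {
        "op": (word >> 26) & 0x3F,        # bits 31..26
        "rs": (word >> 21) & 0x1F,        # bits 25..21
        "rt": (word >> 16) & 0x1F,        # bits 20..16
        "rd": (word >> 11) & 0x1F,        # bits 15..11 (R-type)
        "sa": (word >> 6) & 0x1F,         # bits 10..6  (R-type)
        "funct": word & 0x3F,             # bits 5..0   (R-type)
        "immediate": word & 0xFFFF,       # bits 15..0  (I-type)
        "target": word & 0x03FFFFFF,      # bits 25..0  (J-type)
    }
```

For example, `decode_fields(0x00E54021)` gives op 0, rs 7, rt 5, rd 8, sa 0 and funct 0x21, which is the R-type encoding of ADDU r8, r7, r5.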

5.3 Instruction Set Overview

5.3.1 Load and Store Instructions (Table 5-1)

Load and Store instructions move data between memory and general purpose registers,

and are all I-type instructions. The only directly supported addressing mode is “base

register plus 16-bit signed immediate offset”.

Table 5-1 CPU Instruction Set: Load and Store Instructions

Instruction Description Note

LB Load Byte MIPS I

LBU Load Byte Unsigned MIPS I

LH Load Halfword MIPS I

LHU Load Halfword Unsigned MIPS I

LW Load Word MIPS I

LWL Load Word Left MIPS I

LWR Load Word Right MIPS I

SB Store Byte MIPS I

SH Store Halfword MIPS I

SW Store Word MIPS I

SWL Store Word Left MIPS I

SWR Store Word Right MIPS I

LD Load Doubleword MIPS III

LDL Load Doubleword Left MIPS III

LDR Load Doubleword Right MIPS III

LL Load Linked MIPS II

LLD Load Linked Doubleword MIPS III

LWU Load Word Unsigned MIPS III

SC Store Conditional MIPS II

SCD Store Conditional Doubleword MIPS III

SD Store Doubleword MIPS III

SDL Store Doubleword Left MIPS III

SDR Store Doubleword Right MIPS III

SYNC Sync MIPS II
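A minimal sketch of how a load or store address is formed, assuming the usual MIPS convention that the 16-bit immediate field of the I-type instruction is sign-extended and added to the base register. The names `sign_extend16` and `effective_address` are illustrative only.

```python
MASK64 = (1 << 64) - 1


def sign_extend16(imm):
    """Sign-extend a 16-bit immediate field to a Python integer."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def effective_address(base, offset16):
    """Base register value plus sign-extended 16-bit offset, wrapped to 64 bits."""
    return (base + sign_extend16(offset16)) & MASK64
```

With a base of 0x1000, an offset field of 0x0004 addresses 0x1004, while an offset field of 0xFFFC (that is, -4) addresses 0x0FFC.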

5.3.2 Computational Instructions (Table 5-2)

Computational instructions perform arithmetic, logical or shift operations on values in

registers. The instruction format can be R-type or I-type. With R-type instructions, the

one or two operands and the result are register values. With I-type instructions, one of the

operands is 16-bit immediate data. Computational instructions can be classified as

follows.

• ALU immediate

• Three-operand register-type

• Shift

• Multiply/Divide

Table 5-2 CPU Instruction Set: Computational Instructions

Instruction Description Note

(ALU Immediate)

ADDI Add Immediate MIPS I

ADDIU Add Immediate Unsigned MIPS I

SLTI Set on Less Than Immediate MIPS I

SLTIU Set on Less Than Immediate Unsigned MIPS I

ANDI AND Immediate MIPS I

ORI OR Immediate MIPS I

XORI Exclusive OR Immediate MIPS I

LUI Load Upper Immediate MIPS I

DADDI Doubleword Add Immediate MIPS III

DADDIU Doubleword Add Immediate Unsigned MIPS III

(ALU 3-Operand, register type)

ADD Add MIPS I

ADDU Add Unsigned MIPS I

SUB Subtract MIPS I

SUBU Subtract Unsigned MIPS I

SLT Set on Less Than MIPS I

SLTU Set on Less Than Unsigned MIPS I

AND AND MIPS I

OR OR MIPS I

XOR Exclusive OR MIPS I

NOR NOR MIPS I

DADD Doubleword Add MIPS III

DADDU Doubleword Add Unsigned MIPS III

DSUB Doubleword Subtract MIPS III

DSUBU Doubleword Subtract Unsigned MIPS III

(Shift)

SLL Shift Left Logical MIPS I

SRL Shift Right Logical MIPS I

SRA Shift Right Arithmetic MIPS I

SLLV Shift Left Logical Variable MIPS I

SRLV Shift Right Logical Variable MIPS I

SRAV Shift Right Arithmetic Variable MIPS I

DSLL Doubleword Shift Left Logical MIPS III

DSRL Doubleword Shift Right Logical MIPS III

DSRA Doubleword Shift Right Arithmetic MIPS III

DSLLV Doubleword Shift Left Logical Variable MIPS III

DSRLV Doubleword Shift Right Logical Variable MIPS III

DSRAV Doubleword Shift Right Arithmetic Variable MIPS III

DSLL32 Doubleword Shift Left Logical +32 MIPS III

DSRL32 Doubleword Shift Right Logical +32 MIPS III

DSRA32 Doubleword Shift Right Arithmetic +32 MIPS III

(Multiply and Divide)

MULT Multiply MIPS I

MULTU Multiply Unsigned MIPS I

DIV Divide MIPS I

DIVU Divide Unsigned MIPS I

MFHI Move From HI MIPS I

MTHI Move To HI MIPS I

MFLO Move From LO MIPS I

MTLO Move To LO MIPS I

DMULT Doubleword Multiply MIPS III

DMULTU Doubleword Multiply Unsigned MIPS III

DDIV Doubleword Divide MIPS III

DDIVU Doubleword Divide Unsigned MIPS III

5.3.3 Jump and Branch Instructions (Table 5-3)

Jump and branch instructions change the control flow of a program. All jump and

branch instructions occur with a delay of one instruction: that is, the instruction

immediately following the jump or branch (this is known as the instruction in the delay

slot) always executes while the target instruction is being fetched from storage. Branch-

likely instructions are used for static branch prediction. The instruction in the delay slot

is executed only when the branch is taken; the instruction in the delay slot is nullified if

the branch is not taken.

Table 5-3 CPU Instruction Set: Jump and Branch Instructions

Instruction Description Note

J Jump MIPS I

JAL Jump And Link MIPS I

JR Jump Register MIPS I

JALR Jump And Link Register MIPS I

BEQ Branch on Equal MIPS I

BNE Branch on Not Equal MIPS I

BLEZ Branch on Less Than or Equal to Zero MIPS I

BGTZ Branch on Greater Than Zero MIPS I

BLTZ Branch on Less Than Zero MIPS I

BGEZ Branch on Greater than or Equal to Zero MIPS I

BLTZAL Branch on Less Than Zero And Link MIPS I

BGEZAL Branch on Greater than or Equal to Zero And Link MIPS I

BEQL Branch on Equal Likely MIPS II

BNEL Branch on Not Equal Likely MIPS II

BLEZL Branch on Less Than or Equal to Zero Likely MIPS II

BGTZL Branch on Greater Than Zero Likely MIPS II

BLTZL Branch on Less Than Zero Likely MIPS II

BGEZL Branch on Greater Than or Equal to Zero Likely MIPS II

BLTZALL Branch on Less Than Zero And Link Likely MIPS II

BGEZALL Branch on Greater Than or Equal to Zero And Link Likely MIPS II

5.3.4 Special Instructions (Table 5-4)

Two special instructions are used for software traps. Both use the R-type

instruction format.

Table 5-4 CPU Instruction Set: Special Instructions

Instruction Description Note

SYSCALL System Call MIPS I

BREAK Break MIPS I

5.3.5 Exception Instructions (Table 5-5)

These instructions (R-type or I-type) cause a branch to the general exception handling

vector based upon the result of a comparison.

Table 5-5 CPU Instruction Set: Exception Instructions

Instruction Description Note

TGE Trap if Greater Than or Equal MIPS II

TGEU Trap if Greater Than or Equal Unsigned MIPS II

TLT Trap if Less Than MIPS II

TLTU Trap if Less Than Unsigned MIPS II

TEQ Trap if Equal MIPS II

TNE Trap if Not Equal MIPS II

TGEI Trap if Greater Than or Equal Immediate MIPS II

TGEIU Trap if Greater Than or Equal Immediate Unsigned MIPS II

TLTI Trap if Less Than Immediate MIPS II

TLTIU Trap if Less Than Immediate Unsigned MIPS II

TEQI Trap if Equal Immediate MIPS II

TNEI Trap if Not Equal Immediate MIPS II

5.3.6 Coprocessor Instructions (Table 5-6)

Coprocessor instructions invoke coprocessor operations. The format of these

instructions depends on which coprocessor is used.

Table 5-6 CPU Instruction Set: Coprocessor Instructions

Instruction Description Note

LWCz Load Word to Coprocessor z (z = 1,2) MIPS I

SWCz Store Word from Coprocessor z (z = 1,2) MIPS I

MTCz Move To Coprocessor z (z = 1,2) MIPS I

MFCz Move From Coprocessor z (z = 1,2) MIPS I

CTCz Move Control To Coprocessor z (z = 1,2) MIPS I

CFCz Move Control From Coprocessor z (z = 1,2) MIPS I

COPz Coprocessor Operation z (z = 1,2) MIPS I

BCzT Branch on Coprocessor z True (z = 0,1,2) MIPS I

BCzF Branch on Coprocessor z False (z = 0,1,2) MIPS I

BCzTL Branch on Coprocessor z True Likely (z = 0,1,2) MIPS II

BCzFL Branch on Coprocessor z False Likely (z = 0,1,2) MIPS II

LDCz Load Double Coprocessor z (z = 1,2) MIPS III

SDCz Store Double Coprocessor z (z = 1,2) MIPS III

DMTCz Doubleword Move To Coprocessor z (z = 1,2) MIPS III

DMFCz Doubleword Move From Coprocessor z (z = 1,2) MIPS III

5.3.7 CP0 Instructions (Table 5-7)

Coprocessor 0 instructions are used for operations involving the system control

coprocessor (CP0) registers, processor memory management and exception handling.

Table 5-7 Instruction Set: CP0 Instructions

Instruction Description Note

MTC0 Move To CP0 MIPS I

MFC0 Move From CP0 MIPS I

DMTC0 Doubleword Move To CP0 MIPS III

DMFC0 Doubleword Move From CP0 MIPS III

TLBR Read Indexed TLB Entry

TLBWI Write Indexed TLB Entry

TLBWR Write Random TLB Entry

TLBP Probe TLB for Matching Entry

CACHE Cache MIPS III

ERET Exception Return MIPS III

WAIT Enter power management mode

5.3.8 Multiply and Divide Instructions (Table 5-8)

Table 5-8 Extensions to the ISA: Multiply and Divide Instructions

Instruction Description Note

MULT Multiply (3-operand)

MULTU Multiply Unsigned (3-operand)

DMULT Doubleword Multiply (3-operand)

DMULTU Doubleword Multiply Unsigned (3-operand)

MADD Multiply and ADD (3-operand)

MADDU Multiply and ADD Unsigned (3-operand)

5.3.9 Debug Instructions (Table 5-9)

Table 5-9 Extensions to the ISA: Debug Instructions

Instruction Description Note

CTC0 Move Control To Coprocessor 0

CFC0 Move Control From Coprocessor 0

SDBBP Software Debug Breakpoint

DERET Debug Exception Return

5.3.10 Other Instructions (Table 5-10)

Table 5-10 Other Instructions

Instruction Description Note

PREF Prefetch

5.4 Instruction Execution Cycles

Because the TX49 employs a high-speed Multiply and Add Calculator (MAC), multiply

instructions such as MULT, MULTU, DMULT and DMULTU execute faster. The TX49

also improves the execution of divide instructions.

Instruction            Latency (2op/3op)   Repeat (2op/3op)
MULT  (2/3 operand)    4/4                 1/3
MADD  (2/3 operand)    4/4                 1/3
DMULT (2/3 operand)    7/7                 6/6
DIV                    37                  36
DDIV                   69                  68
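The table above can be used for rough cycle estimates. The sketch below assumes the common reading of these two figures, namely that latency is the number of cycles until a result is available and repeat is the minimum number of cycles between successive issues of the same instruction; the manual does not define the terms at this point, so treat both the interpretation and the helper name `back_to_back_cycles` as assumptions. Only the 2-operand values are included.

```python
# (latency, repeat) for the 2-operand forms, copied from the table above.
TIMING_2OP = {
    "MULT": (4, 1),
    "MADD": (4, 1),
    "DMULT": (7, 6),
    "DIV": (37, 36),
    "DDIV": (69, 68),
}


def back_to_back_cycles(mnemonic, count):
    """Estimated cycles until the last of `count` identical, independent
    instructions delivers its result (assumed latency/repeat model)."""
    if count < 1:
        raise ValueError("count must be at least 1")
    latency, repeat = TIMING_2OP[mnemonic]
    return latency + (count - 1) * repeat
```

Under this model, eight consecutive 2-operand MULT instructions finish in 4 + 7 x 1 = 11 cycles, while three DMULT instructions take 7 + 2 x 6 = 19 cycles.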

5.5 Defining Access Types

Access type indicates the size of a TX49 processor data item to be loaded or stored, set by

the load or store instruction opcode. Access types are defined in Table A-3.

Regardless of access type or byte ordering (endianness), the address given specifies the low-

order byte in the addressed field. For a big-endian configuration, the low-order byte is the

most-significant byte; for a little-endian configuration, the low-order byte is the least-

significant byte.

The access type, together with the three low-order bits of the address, define the bytes

accessed within the addressed doubleword (shown in Figure 5-2). Only the combinations

shown in Figure 5-2 are permissible; other combinations cause address error exceptions. See

Appendix A for individual descriptions of CPU load and store instructions.

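                   Low-Order       Bytes Accessed
Access Type        Address Bits    Big Endian           Little Endian
Mnemonic (Value)   2 1 0           (63-----31------0)   (63-----31------0)

Doubleword (7)     0 0 0           0 1 2 3 4 5 6 7      7 6 5 4 3 2 1 0

Septibyte (6)      0 0 0           0 1 2 3 4 5 6 .      . 6 5 4 3 2 1 0
                   0 0 1           . 1 2 3 4 5 6 7      7 6 5 4 3 2 1 .

Sextibyte (5)      0 0 0           0 1 2 3 4 5 . .      . . 5 4 3 2 1 0
                   0 1 0           . . 2 3 4 5 6 7      7 6 5 4 3 2 . .

Quintibyte (4)     0 0 0           0 1 2 3 4 . . .      . . . 4 3 2 1 0
                   0 1 1           . . . 3 4 5 6 7      7 6 5 4 3 . . .

Word (3)           0 0 0           0 1 2 3 . . . .      . . . . 3 2 1 0
                   1 0 0           . . . . 4 5 6 7      7 6 5 4 . . . .

Triplebyte (2)     0 0 0           0 1 2 . . . . .      . . . . . 2 1 0
                   0 0 1           . 1 2 3 . . . .      . . . . 3 2 1 .
                   1 0 0           . . . . 4 5 6 .      . 6 5 4 . . . .
                   1 0 1           . . . . . 5 6 7      7 6 5 . . . . .

Halfword (1)       0 0 0           0 1 . . . . . .      . . . . . . 1 0
                   0 1 0           . . 2 3 . . . .      . . . . 3 2 . .
                   1 0 0           . . . . 4 5 . .      . . 5 4 . . . .
                   1 1 0           . . . . . . 6 7      7 6 . . . . . .

Byte (0)           0 0 0           0 . . . . . . .      . . . . . . . 0
                   0 0 1           . 1 . . . . . .      . . . . . . 1 .
                   0 1 0           . . 2 . . . . .      . . . . . 2 . .
                   0 1 1           . . . 3 . . . .      . . . . 3 . . .
                   1 0 0           . . . . 4 . . .      . . . 4 . . . .
                   1 0 1           . . . . . 5 . .      . . 5 . . . . .
                   1 1 0           . . . . . . 6 .      . 6 . . . . . .
                   1 1 1           . . . . . . . 7      7 . . . . . . .

(. = byte not accessed)

Figure 5-2 Byte Access within a Doubleword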
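The permissible combinations in Figure 5-2 can be captured in a small lookup, sketched below. The names `PERMITTED_LOW_BITS` and `bytes_accessed` are hypothetical, and raising a Python `ValueError` merely stands in for the address error exception that the hardware raises for other combinations.

```python
# Permitted values of the three low-order address bits for each access
# type value, transcribed from Figure 5-2.
PERMITTED_LOW_BITS = {
    7: (0,),                 # Doubleword
    6: (0, 1),               # Septibyte
    5: (0, 2),               # Sextibyte
    4: (0, 3),               # Quintibyte
    3: (0, 4),               # Word
    2: (0, 1, 4, 5),         # Triplebyte
    1: (0, 2, 4, 6),         # Halfword
    0: tuple(range(8)),      # Byte
}


def bytes_accessed(access_type, low_bits):
    """Return the byte numbers touched within the addressed doubleword."""
    if low_bits not in PERMITTED_LOW_BITS[access_type]:
        # Stand-in for the address error exception.
        raise ValueError("combination not permitted by Figure 5-2")
    # An access of type value v covers v + 1 consecutive bytes.
    return list(range(low_bits, low_bits + access_type + 1))
```

The byte numbers are the same for both byte orders; only their position within the 64-bit doubleword differs, as the two columns of the figure show.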

6. CPU Pipeline

6.1 Introduction

This chapter describes the operation of the TX49 pipeline. It explains the basic operation of

the pipeline and how the TX49 handles delay instructions; these are instructions

that follow a branch or load instruction in the pipeline. A later section explains interruptions

to the pipeline flow caused by interlocks and exceptions.

6.2 Basic Pipeline Operation

The TX49 executes instructions in an optimized 5 stage pipeline. Each pipeline stage is

executed in one clock cycle. When the pipeline is fully utilized, five instructions are executed

at the same time, resulting in an average instruction execution rate of one instruction per

cycle as illustrated in Figure 6-1.

One cycle

F1 F2 D1 D2 E1 E2 M1 M2 W1 W2

F1 - Instruction Fetch, Phase one

F2 - Instruction Fetch, Phase two

D1 - Instruction Decode, Phase one

D2 - Instruction Decode, Phase two

E1 - Execution, Phase one

E2 - Execution, Phase two

M1 - Memory Access, Phase one

M2 - Memory Access, Phase two

W1 - Write Back, Phase one

W2 - Write Back, Phase two

Figure 6-1 Pipeline stages for executing TX49 instructions

F1, F2 : Instruction Fetch

During the F1 phase the ITLB begins the virtual to physical address

translation. And, during the F2 phase the instruction cache fetch and the virtual

to physical address translation are completed.

D1, D2 : Instruction Decode

The instruction is decoded. Contents of the general-purpose registers are read.

If the instruction involves a branch or jump, the target address is generated.

The coprocessor condition signal is latched.

E1, E2 : Execution

Arithmetic, logical and shift operations are performed. The execution of

multiply/divide instructions is begun.

For load and store instructions, the data virtual address is calculated, and

virtual-to-physical address translation is begun.

M1, M2 : Memory Access

The data cache is accessed in the case of load and store instructions.

W1, W2 : Write Back

The result is written to a general register.
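The throughput claim in Section 6.2 can be checked with a short calculation. The sketch below models an ideal five-stage pipeline with no stalls or exceptions; the function names are illustrative, and each stage is treated as one whole cycle without modeling its two phases.

```python
STAGES = ["F", "D", "E", "M", "W"]


def schedule(n_instructions):
    """Cycle number (1-based) in which each instruction occupies each stage,
    assuming one instruction enters the pipeline per cycle and none stall."""
    return [
        {stage: i + k + 1 for k, stage in enumerate(STAGES)}
        for i in range(n_instructions)
    ]


def total_cycles(n_instructions):
    """Cycles needed to complete n instructions in the ideal pipeline."""
    if n_instructions == 0:
        return 0
    return n_instructions + len(STAGES) - 1


def cycles_per_instruction(n_instructions):
    """Average cycles per instruction; approaches 1 as the count grows."""
    return total_cycles(n_instructions) / n_instructions
```

A single instruction takes five cycles, but 1000 instructions take 1004 cycles, so the average approaches one instruction per cycle once the pipeline is full.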

6.3 TX49 Pipeline Activities

Stage F1 F2 D1 D2 E1 E2 M1 M2 W1 W2

Fetch & Decode: ICD, ICA, RF, ITLBM, ITLBR, ITC, IDEC
ALU: ALU, WB
Load/Store: DVA, DCAD, DCAA, DCLA, JTLB1, JTLB2, SA, DTC, WB, DCW
Jump/Branch: BCMP, BAC, IVA

ICD: Instruction cache address decode

ICA: Instruction cache array access

RF: Register fetch

ITLBM: Instruction address translation match

ITLBR: Instruction address translation read

ITC: Instruction tag match

IDEC: Instruction decode

ALU: ALU operation

WB: Write back to register file

DVA: Data virtual address calculation

DCAD: Data cache address decode

DCAA: Data cache array access

DCLA: Data cache load align

JTLB1: Address translation in JTLB stage1

JTLB2: Address translation in JTLB stage2

SA: Store align

DTC: Data cache tag check

DCW: Data cache write

BCMP: Branch compare

BAC: Branch address calculation

IVA: Generate instruction virtual address

6.4 Branch and Load Delay

Some TX49 instructions are executed with a delay of one instruction cycle. The cycle in

which an instruction is delayed is called a delay slot. A delay occurs with load instructions and

branch/jump instructions.

6.4.1 Delayed load

With load instructions, a one-cycle delay occurs while waiting for the data being loaded

to become available for use by another instruction. The TX49 checks the instruction in

the delay slot (the instruction immediately following the load instruction) to see if that

instruction needs to use the load result; if so, it stalls the pipeline (see Figure 6-2).

LW   r5, 0(r26)     F  D  E  M  W
ADDU r8, r7, r5        F  D  ES E  M  W
                             ↑ Pipeline stall

Figure 6-2 CPU Pipeline Load Delay
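The load-delay check described in 6.4.1 amounts to comparing the destination of a load with the sources of the instruction that follows it. The Python sketch below illustrates the idea under simplifying assumptions: every load hits in the cache, each instruction is a hypothetical `(mnemonic, destination, sources)` tuple, and only a subset of the load mnemonics from Table 5-1 is listed.

```python
# Subset of the load mnemonics from Table 5-1, for illustration.
LOADS = {"LB", "LBU", "LH", "LHU", "LW", "LWU", "LD"}


def load_delay_stalls(program):
    """Count one-cycle stalls caused by an instruction that uses the result
    of the load immediately before it.

    program: list of (mnemonic, destination register, tuple of source registers).
    """
    stalls = 0
    for (op, dest, _), (_, _, sources) in zip(program, program[1:]):
        # r0 always reads as zero, so a load into r0 creates no dependency.
        if op in LOADS and dest != "r0" and dest in sources:
            stalls += 1
    return stalls
```

For the sequence in Figure 6-2 (LW r5 followed by ADDU using r5) the function reports one stall; placing an independent instruction between the two removes it.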

6.4.2 Delayed branching

Figure 6-3 shows the pipeline flow for jump/branch instructions. The branch target

address that must be generated for these types of instructions does not become available

until the E stage - too late to be used by the instruction in the branch delay slot. The

branch target instruction is fetched immediately after the branch delay slot cycle.

It is, however, possible to fetch a different instruction that would normally be executed

prior to the branch instruction.

BEQ r1, r4, L1                    F  D  E  M  W
                                        Target addr
subu r3, r5, r6 (delay slot)         F  D  E  M  W
L1: addiu r7, r7, 1 (target)            F  D  E  M  W

Figure 6-3 CPU Pipeline Branch Delay

You can make effective use of the branch delay slot as follows.

• Since the instruction immediately following a branch instruction is executed

before the branch takes effect, you can place an instruction that logically

should be executed just before the branch into the delay slot following the branch

instruction.

• The TX49 provides Branch Likely instructions in addition to the normal Branch

instructions. If the branch condition of the Branch Likely instruction is met, the

instruction in the delay slot is executed and the branch is taken. If the branch is

not taken, the instruction in the delay slot is treated as a NOP.

Therefore, Branch-Likely instructions allow the processor to execute the

instruction immediately following the branch while the target instruction is being

fetched.

• If no instruction is placed in the delay slot, a NOP is placed just after the branch

instruction.
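The difference between normal and Branch Likely instructions can be summarized as a small decision function. This is a behavioral sketch only; the function name and the string labels are illustrative and do not correspond to anything in the hardware.

```python
def executed_after_branch(branch_taken, likely):
    """Return, in order, what executes after a branch instruction.

    A normal branch always executes its delay slot. A Branch Likely
    instruction executes the delay slot only when the branch is taken;
    otherwise the delay slot instruction is nullified (treated as a NOP).
    """
    sequence = []
    if branch_taken or not likely:
        sequence.append("delay slot")
    sequence.append("target" if branch_taken else "fall-through")
    return sequence
```

For a Branch Likely instruction that is not taken, the result is just `["fall-through"]`, reflecting the nullified delay slot.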

6.5 Non-blocking Load Function

The non-blocking load function prevents the pipeline from stalling when a cache miss occurs

and a refill cycle is required to refill the data cache. Instructions after the load instruction

that do not use registers affected by the load will continue to be executed. An example is

shown in Figure 6-4. Here a cache miss occurs with the first load instruction. The two

instructions following it are executed before the load completes. The fourth instruction (ADD) uses

the load result (r3), so it must wait until the cache data becomes valid.

LW  r3, 0(r0)      F  D  E  M  R  R  R  R  W   (r3 written)
ADD r6, r4, r2        F  D  E  M  W
ADD r7, r5, r2           F  D  E  M  W
ADD r8, r9, r3              F  D  ES ES ES E  M  W

R: Refill cycle, ES: Stall in E stage

Figure 6-4 Non-blocking load function
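The rule illustrated in Figure 6-4 can be sketched as a search for the first instruction that touches the register still waiting on the refill. The representation below, a list of hypothetical `(mnemonic, destination, sources)` tuples, and the function name are assumptions made for illustration; "uses" is taken here to mean either reading or writing the pending register.

```python
def first_dependent(pending_register, following):
    """Index of the first instruction after a missed load that uses the
    register being loaded, or None if no instruction depends on it.

    following: list of (mnemonic, destination register, tuple of sources).
    Instructions before the returned index can proceed during the refill.
    """
    for index, (_, dest, sources) in enumerate(following):
        if pending_register == dest or pending_register in sources:
            return index
    return None
```

Applied to the three ADD instructions of Figure 6-4 with r3 pending, the function returns 2: the first two ADDs proceed and the third waits for the refill.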

6.6 Interlock and Exception Handling

6.6.1 Overview of Interlock and Exception Handling

Smooth pipeline flow is interrupted when cache misses or exceptions occur, or when

data dependencies are detected. Interruptions handled using hardware, such as cache

misses, are referred to as interlocks, while those that are handled using software are

called exceptions.

As shown in Figure 6-5, all interlock and exception conditions are collectively referred

to as faults.

Figure 6-5 Interlocks, Exceptions, and Faults

There are two types of interlocks:

• stalls, which are resolved by halting the pipeline

• slips, which require one part of the pipeline to advance while another part of the

pipeline is held static

Because each exception and interlock condition corresponds to a particular pipeline

stage, a condition can be traced to the particular instruction in the exception/interlock

stage, as shown in Figure 6-6. For instance, an Illegal Instruction (II) exception is raised

in the execution (EX) stage.

Table 6-1 and Table 6-2 describe the pipeline interlocks and exceptions listed in Figure

6-6.

Faults
  Exceptions (handled by software)
  Interlocks (handled by hardware)
    Stalls
    Slips

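[Chart: conditions plotted against the pipeline stage (F, D, E, M, W) in which they are detected]

Interlock conditions (stalls and slips): ITM, ICM, DCM, CPE, LDI, MDSt, FCBsy
Exception conditions: ITLB, IBE, RI, DBE, Cun, NMI, BP, Reset, SC, OVF, DTLB, Trap, DTMod, Intr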

Figure 6-6 Correspondence of pipeline stage to interlock condition

Table 6-1 Pipeline Interlocks

Interlock Description

ITM Instruction TLB Miss

ICM Instruction Cache Miss

CPE Coprocessor Possible Exception

DCM Data Cache Miss

LDI Load Interlock

MDSt Multiply / Divide Start

FCBsy FP Coprocessor Busy

Table 6-2 Pipeline Exceptions

Exception Description

ITLB Instruction Translation or Address Exception

Intr External Interrupt

IBE Instruction Bus Error

RI Reserved Instruction

BP Breakpoint

SC System Call

Cun Coprocessor Unusable

OVF Integer Overflow

FPE FP Interrupt

ExTrap EX Stage Traps

DTLB Data Translation or Address Exception

TLBMod TLB Modified

DBE Data Bus Error

NMI Non-maskable Interrupt (or Soft Reset)

Reset Reset

6.6.2 Exception Conditions

When an exception condition occurs, the relevant instruction and all those that follow it

in the pipeline are cancelled. Accordingly, any stall conditions and any later exception

conditions that may have referenced this instruction are inhibited; there is no benefit in

servicing stalls for a cancelled instruction.

After instruction cancellation, a new instruction stream begins, starting execution at a

predefined exception vector. System Control Coprocessor registers are loaded with

information that identifies the type of exception and auxiliary information such as the

virtual address at which translation exceptions occur.

6.6.3 Stall Conditions

Often, a stall condition is only detected after parts of the pipeline have advanced using

incorrect data; this is called a pipeline overrun. When a stall condition is detected, all five

instructions, each in a different stage of the pipeline, are frozen at once. In this stalled

state, no pipeline stages can advance until the interlock condition is resolved. For

example, when a cache miss occurs, the processor must refill the cache before it restarts

the pipeline.

Once the interlock is removed, the restart sequence begins two cycles before the

pipeline resumes execution. The restart sequence reverses the pipeline overrun by

inserting the correct information into the pipeline.

6.6.4 External Stalls

External stalls are another class of interlock. An external stall originates outside the

processor and is not referenced to a particular pipeline stage. This interlock is not

affected by exceptions.

6.6.5 Interlock and Exception Timing

To prevent interlock and exception handling from adversely affecting the processor

cycle time, the TX49 processor uses both logic and circuit pipeline techniques to reduce

critical timing paths. Interlock and exception handling have the following effects on the

pipeline:

• In some cases, the processor pipeline must be backed up (reversed and started

over again from a prior stage) to recover from interlocks.

• In some cases, interlocks are serviced for instructions that will be aborted, due to

an exception.


6.7 Multiply and Multiply/Add Instructions (MULT, MULTU, MADD, MADDU)

The TX49 can issue 32-bit, 2-operand multiply and multiply/add instructions back to back,

and the instructions that immediately follow can use the results in the HI/LO registers

without a pipeline stall, as shown in Figure 6-7. When the result is written to a general-purpose

register, a following instruction must wait three cycles to use it, as shown in Figure 6-8.

MULT/MADD r3, r4 F D E1 E2 E3 M W

MULT/MADD r6, r7, r8 F D E1 E2 E3 M W

Figure 6-7 MULT and MADD Instructions without data dependency

(32-bit and 2-operand)

MULT/MADD r3, r4, r5 F D E1 E2 E3 M W

MULT/MADD r6, r3, r8 F D ES ES ES E1 E2 E3 M W

Figure 6-8 MULT and MADD Instructions with data dependency

(32-bit and 3-operand)

6.8 Divide Instructions (DIV, DIVU)

Division starts from the pipeline E stage and takes 36 cycles.

Figure 6-9 shows an example of a divide instruction.

DIV/DIVU F D E M W

V1 V2 V3 V4 …V35 V36

Division stage1

Figure 6-9 DIV and DIVU Instructions

6.9 Streaming

During a cache refill operation, the TX49 can resume execution immediately after the

necessary data or instruction arrives in the cache, even though the refill is not yet complete. This is

referred to as “streaming”.


7. System Control Coprocessor, CP0

7.1 Introduction

The TX49 has a System Control Co-Processor (CP0). CP0 translates virtual addresses to

physical addresses. CP0 manages exceptions and transitions between kernel, supervisor, and

user states. CP0 also controls the cache sub-system, as well as providing diagnostic control

and error recovery facilities.


7.2 CP0 Registers

This section describes the bit fields of each register. The “Cold Reset” column of each table

shows the value of each bit when the GCOLDRESET* signal is asserted. Reserved bits

must be written with their reset value, and return that value when read.

7.2.1 Index register (Reg#0)

The Index register is a 32-bit read/write register containing six bits to index an entry in

the TLB. The P bit of the register shows the success/failure of a TLB Probe (TLBP)

instruction.

The Index register also specifies the TLB entry affected by TLB Read (TLBR) or TLB

Write Index (TLBWI) instructions. Figure 7-1 shows the format of the Index register and

Table 7-1 describes the Index register fields.

31 30 6 5 0

P 0 Index

Figure 7-1 Index Register Format

Table 7-1 Index Register Field Descriptions

Bit Field Description Cold Reset Read/Write

31 P Probe failure. Set to 1 when the previous TLB Probe (TLBP) instruction was unsuccessful. Undefined Read/Write

30~6 0 Reserved 0x0 Read

5~0 Index Index to the TLB entry affected by the TLB Read and TLB Write Index instructions. Undefined Read/Write
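As an illustration of the Index register layout in Table 7-1, the following C sketch extracts the P bit and the Index field from a value that has already been read from the register. The macro and function names are hypothetical and chosen only for this example; how the value is read from CP0 is outside the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions from Table 7-1; the names are illustrative only. */
#define INDEX_P_SHIFT    31
#define INDEX_ENTRY_MASK 0x3Fu /* Index field, bits 5..0 */

/* Returns true when the previous TLBP found no matching entry (P = 1). */
static bool index_probe_failed(uint32_t index_reg)
{
    return ((index_reg >> INDEX_P_SHIFT) & 1u) != 0u;
}

/* Returns the TLB entry number used by TLBR and TLBWI. */
static unsigned index_entry(uint32_t index_reg)
{
    return (unsigned)(index_reg & INDEX_ENTRY_MASK);
}
```

A handler would typically check `index_probe_failed()` after a TLBP before trusting the value returned by `index_entry()`.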


7.2.2 Random register (Reg#1)

The Random register is a read only register containing six bits to index an entry in the

TLB. This register decrements as each instruction executes. Its value is bounded as follows:

• A lower bound is set by the number of TLB entries reserved for exclusive use by

the operating system (the contents of the Wired register).

• An upper bound is set by the total number of TLB entries (47 maximum).

The Random register specifies the TLB entry affected by TLB Write Random (TLBWR)

instruction. Although the register does not need to be read for this purpose, it is readable

so that proper operation of the processor can be verified.

To simplify testing, the Random register is set to the value of the upper bound upon

system reset. This register is also set to the upper bound when the Wired register is

written.

Figure 7-2 shows the format of the Random register and Table 7-2 describes the

Random register fields.

31 6 5 0

0 Random

Figure 7-2 Random Register Format

Table 7-2 Random Register Field Descriptions

Bit Field Description Cold Reset Read/Write

31~6 0 Reserved. 0x0 Read

5~0 Random TLB random index for TLBWR instruction. Upper bound (47) Read
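The bounds described above can be modeled in a few lines of C. This is a sketch under one stated assumption: that the register wraps back to the upper bound after reaching the lower bound (the Wired value). The text gives only the two bounds and the decrement, so the wrap behavior is an assumption, and the function name is hypothetical.

```c
#include <stdint.h>

#define TLB_RANDOM_UPPER 47u /* upper bound from Table 7-2 */

/*
 * Models the next Random value after one instruction executes.
 * Assumption: on reaching the lower bound (Wired) the value wraps
 * to the upper bound; out-of-range inputs are also forced there.
 */
static unsigned random_next(unsigned random, unsigned wired)
{
    if (random <= wired || random > TLB_RANDOM_UPPER)
        return TLB_RANDOM_UPPER;
    return random - 1u;
}
```

With Wired = 8, the modeled value cycles through 47, 46, ..., 8 and never selects entries 0 to 7.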


7.2.3 EntryLo0 register (Reg#2) and EntryLo1 register (Reg#3)

The EntryLo register consists of two registers that have identical formats:

• EntryLo0 is used for even virtual pages

• EntryLo1 is used for odd virtual pages

The EntryLo0 and EntryLo1 registers are read/write registers. These registers hold the

physical page frame number (PFN) of the TLB entry for even and odd pages, respectively,

when performing TLB read and write operations.

Figure 7-3 shows the format of the EntryLo0/EntryLo1 register and Table 7-3 describes

the EntryLo0/EntryLo1 register fields.

63 32 31 30 29 6 5 3 2 1 0

0 WCE PFN C D V G

Figure 7-3 EntryLo0/EntryLo1 Register Format

Table 7-3 EntryLo0/EntryLo1 Register Field Descriptions

Bit Field Description Cold Reset Read/Write

63~32 0 Reserved 0x0 Read

31~30 WCE Usable for Win-CE 0x0 Read/Write

29~6 PFN Page frame number. Undefined Read/Write

5~3 C Specifies the TLB page coherency attribute.

0: Cacheable, noncoherent, write-through, no-WA

1: Cacheable, noncoherent, write-through, WA

2: Uncached

3: Cacheable, noncoherent, write-back, WA

4∼7: Reserved

0x0 Read/Write

2 D Dirty

If this bit is set, the page is marked as dirty and, therefore,

writable. This bit is actually a write-protect bit that software

can use to prevent alteration of data.

0 Read/Write

1 V Valid

If this bit is set, it indicates that the TLB entry is valid;

otherwise, a TLBL or TLBS miss occurs.

0 Read/Write

0 G Global

If this bit is set in both EntryLo0 and EntryLo1, then the

processor ignores the ASID during TLB lookup.

0 Read/Write
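The field positions in Table 7-3 can be exercised with a small C sketch that packs and unpacks an EntryLo value. The function names are hypothetical, the WCE bits are left at zero, and the caller is assumed to supply an already-computed page frame number.

```c
#include <stdbool.h>
#include <stdint.h>

/* Field positions from Table 7-3; all names are illustrative. */
#define ENTRYLO_PFN_SHIFT 6
#define ENTRYLO_PFN_MASK  0xFFFFFFu /* 24 bits, register bits 29..6 */
#define ENTRYLO_C_SHIFT   3
#define ENTRYLO_C_MASK    0x7u      /* register bits 5..3 */

/* Builds an EntryLo0/EntryLo1 value from its fields (WCE left as 0). */
static uint64_t entrylo_make(uint32_t pfn, unsigned c, bool d, bool v, bool g)
{
    return ((uint64_t)(pfn & ENTRYLO_PFN_MASK) << ENTRYLO_PFN_SHIFT) |
           ((uint64_t)(c & ENTRYLO_C_MASK) << ENTRYLO_C_SHIFT) |
           ((uint64_t)(d ? 1u : 0u) << 2) |
           ((uint64_t)(v ? 1u : 0u) << 1) |
           (uint64_t)(g ? 1u : 0u);
}

/* Extracts the page frame number from an EntryLo value. */
static uint32_t entrylo_pfn(uint64_t entrylo)
{
    return (uint32_t)((entrylo >> ENTRYLO_PFN_SHIFT) & ENTRYLO_PFN_MASK);
}

/* Extracts the 3-bit coherency attribute (C field). */
static unsigned entrylo_c(uint64_t entrylo)
{
    return (unsigned)((entrylo >> ENTRYLO_C_SHIFT) & ENTRYLO_C_MASK);
}
```

For a global mapping, the same G value has to be placed in both EntryLo0 and EntryLo1, as the G bit description notes.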


7.2.4 Context register (Reg#4)

The Context register is a read/write register containing the pointer to an entry in the

page table entry (PTE) array. This array is an operating system data structure that

stores virtual to physical address translations. When there is a TLB miss, the CPU loads

the TLB with the missing translation from the PTE array. Normally, the operating

system uses the Context register to address the current page map which resides in the

kernel mapped segment, kseg3. Although the contents of this register duplicate some of the

information in the BadVAddr register, they are arranged in a form that is more useful to a

software TLB exception handler.

Figure 7-4 shows the formats of the Context register and Table 7-4 describes the

Context register fields.

31 23 22 4 3 0

PTEBase BadVPN2 0

(32-bit mode)

63 23 22 4 3 0

PTEBase BadVPN2 0

(64-bit mode)

Figure 7-4 Context Register Formats

Table 7-4 Context Register Field Descriptions

32-bit mode

Bit Field Description Cold Reset Read/Write

31∼23 PTEBase Page table entry base pointer

This field is for use by the operating system. It is normally

written with a value that allows the operating system to use

the Context register as a pointer into the current PTE array

in memory.

Undefined Read/Write

22∼4 BadVPN2 Bad virtual address bits 31~13

This field is written by hardware on a miss. It contains the

virtual page number (VPN) of the most recent virtual address

that did not have a valid translation.

Undefined Read

3∼0 0 Reserved 0x0 Read

64-bit mode

Bit Field Description Cold Reset Read/Write

63∼23 PTEBase Page table entry base pointer Undefined Read/Write

22∼4 BadVPN2 Bad virtual address bits 31~13 Undefined Read

3∼0 0 Reserved 0x0 Read

The 19-bit BadVPN2 field contains bits 31 to 13 of the virtual address that caused the

TLB miss; bit 12 is excluded because a single TLB entry maps to an even-odd page pair.

For a 4-Kbyte page size, this format can directly address the pair-table of 8-byte PTEs.

For other page sizes and PTE sizes, shifting and masking this value produces the

appropriate address.
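To make the 32-bit mode layout concrete, the following C sketch computes the value the Context register would hold for a given PTEBase and faulting virtual address. It is a software model of the field arrangement in Table 7-4, not a description of the hardware logic, and the function name is hypothetical.

```c
#include <stdint.h>

#define CONTEXT_PTEBASE_MASK 0xFF800000u /* bits 31..23 */
#define CONTEXT_BADVPN2_MASK 0x007FFFF0u /* bits 22..4  */

/*
 * Models the 32-bit mode Context value: PTEBase in bits 31..23 and
 * virtual address bits 31..13 placed in bits 22..4. Bits 3..0 are zero,
 * so each even-odd page pair indexes a 16-byte slot (two 8-byte PTEs).
 */
static uint32_t context_value(uint32_t ptebase, uint32_t bad_vaddr)
{
    uint32_t badvpn2 = bad_vaddr >> 13; /* drop bit 12 and the page offset */
    return (ptebase & CONTEXT_PTEBASE_MASK) |
           ((badvpn2 << 4) & CONTEXT_BADVPN2_MASK);
}
```

For example, a miss at virtual address 0x00402000 with PTEBase 0xC0000000 gives 0xC0002010: page pair number 0x201 multiplied by the 16-byte pair size, added to the table base.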


7.2.5 PageMask Register (Reg#5)

The PageMask register is a read/write register used for reading from/writing to the

TLB. This register holds a comparison mask that sets the variable page size for each TLB

entry.

TLB read and write operations use this register as either a source or a destination.

When virtual addresses are presented for translation into physical addresses, the

corresponding bits in the TLB identify which virtual address bits among bits 24~13 are

used in the comparison. When the Mask field is not one of the values shown in Table 7-5,

the operation of the TLB is undefined.

Figure 7-5 shows the format of the PageMask register and Table 7-5 describes the

PageMask register fields.

31 25 24 13 12 0

0 MASK 0

Figure 7-5 PageMask Register Format

Table 7-5 PageMask Register Field Descriptions

Bit Field Description Cold Reset Read/Write

31∼25 0 Reserved 0x0 Read

24∼13 MASK Page comparison mask

000000000000: page size = 4 Kbytes

000000000011: page size = 16 Kbytes

000000001111: page size = 64 Kbytes

000000111111: page size = 256 Kbytes

000011111111: page size = 1 Mbytes

001111111111: page size = 4 Mbytes

111111111111: page size = 16 Mbytes

0x0 Read/Write

12∼0 0 Reserved 0x0 Read
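The MASK encodings in Table 7-5 follow a regular pattern: the page size equals (MASK + 1) multiplied by 4 Kbytes, and only values where MASK + 1 is a power of four are listed. The C sketch below relies on that observation; the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGEMASK_SHIFT 13
#define PAGEMASK_FIELD 0xFFFu /* MASK field, bits 24..13 */

/* Page size in bytes implied by the MASK field: (MASK + 1) * 4 Kbytes. */
static uint32_t pagemask_page_size(uint32_t pagemask_reg)
{
    uint32_t mask = (pagemask_reg >> PAGEMASK_SHIFT) & PAGEMASK_FIELD;
    return (mask + 1u) << 12;
}

/* True only for the seven encodings listed in Table 7-5. */
static bool pagemask_is_valid(uint32_t pagemask_reg)
{
    uint32_t n = ((pagemask_reg >> PAGEMASK_SHIFT) & PAGEMASK_FIELD) + 1u;
    /* n must be a power of two located at an even bit position 0..12 */
    return (n & (n - 1u)) == 0u && (n & 0x1555u) != 0u;
}
```

Checking a value with `pagemask_is_valid()` before writing it avoids the undefined TLB behavior mentioned above for unlisted masks.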


7.2.6 Wired Register (Reg#6)

The Wired register is a read/write register that specifies the boundary between the wired

and random entries of the TLB as follows. Wired entries are non-replaceable entries,

which cannot be overwritten by a TLB write random operation. Random entries can be

overwritten.

[Figure: the TLB, divided at the Wired register value into the range of wired entries and the range of random entries]

The Wired register is set to 0 upon system reset. Writing this register also sets the

Random register to the value of its upper bound. Figure 7-6 shows the format of the

Wired register and Table 7-6 describes the Wired register fields.

31 6 5 0

0 Wired

Figure 7-6 Wired Register

Table 7-6 Wired Register Field Descriptions

Bit Field Description Cold Reset Read/Write

31∼6 0 Reserved

(Must be written as zeroes, and returns zeroes when read.) 0x0 Read

5∼0 Wired TLB Wired boundary. 0x0 Read/Write


7.2.7 BadVAddr Register (Reg#8)

The Bad Virtual Address (BadVAddr) register is a read-only register that holds the

most recent virtual address that caused one of the following exceptions: Address Error,

TLB Invalid, TLB Modified, or TLB Refill.

The processor does not write to this register when the EXL bit in the Status register is

set to 1. Figure 7-7 shows the formats of the BadVAddr register and Table 7-7 describes

the BadVAddr register fields.

31 0

Bad Virtual Address

(32-bit mode)

63 0

Bad Virtual Address

(64-bit mode)

Figure 7-7 BadVAddr Register Formats

Table 7-7 BadVAddr Register Field Descriptions

32-bit mode

Bit Field Description Cold Reset Read/Write

31∼0 BadVAddr Bad virtual address Undefined Read

64-bit mode

Bit Field Description Cold Reset Read/Write

63∼0 BadVAddr Bad virtual address Undefined Read


7.2.8 Count Register (Reg#9)

The Count register is a read/write register. This register acts as a timer, incrementing

at a constant rate (half the CPUCLK rate) whether or not an instruction is executed or retired,

or any forward progress is made through the pipeline.

This register can also be written for diagnostic purposes or system initialization. Figure

7-8 shows the format of the Count register and Table 7-8 describes the Count register

field.

31 0

Count

Figure 7-8 Count Register Format

Table 7-8 Count Register Field Description

Bit Field Description Cold Reset Read/Write

31∼0 Count 32-bit timer, incrementing at half the maximum instruction

issue rate (CPUCLK). 0x0 Read/Write


7.2.9 EntryHi Register (Reg#10)

The EntryHi register is a read/write register that holds the high-order bits of a TLB entry for

TLB read and write operations. This register is accessed by the TLB Probe (TLBP), TLB

Write Random (TLBWR), TLB Write Indexed (TLBWI), and TLB Read Indexed (TLBR)

instructions.

When either a TLB refill, TLB invalid, or TLB modified exception occurs, this register is

loaded with the virtual page number (VPN2) and the ASID of the virtual address that did

not have a matching TLB entry. Figure 7-9 shows the formats of the EntryHi register

and Table 7-9 describes the EntryHi register fields.

31 13 12 8 7 0

VPN2 0 ASID

(32-bit mode)

63 62 61 40 39 13 12 8 7 0

R FILL VPN2 0 ASID

(64-bit mode)

Figure 7-9 EntryHi Register Formats

Table 7-9 EntryHi Register Field Descriptions

32-bit mode

Bit Field Description Cold Reset Read/Write

31∼13 VPN2 Virtual page number divided by two Undefined Read/Write

12∼8 0 Reserved 0x0 Read

7∼0 ASID Address space ID field

An 8-bit field that lets multiple processes share the TLB;

each process has a distinct mapping of otherwise identical

virtual page numbers.

Undefined Read/Write

64-bit mode

Bit Field Description Cold Reset Read/Write

63∼62 R Region. Used to match vAddr63 and vAddr62.

00: user, 01: supervisor, 11: kernel Undefined Read/Write

61∼40 Fill Reserved. 0 on read. Ignored on write. Undefined Read

39∼13 VPN2 Virtual page number divided by two Undefined Read/Write

12∼8 0 Reserved 0x0 Read

7∼0 ASID Address space ID field. Undefined Read/Write
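As a worked example of the 32-bit mode format in Figure 7-9, the C sketch below builds an EntryHi value from a virtual address and an ASID, and splits one back into its fields. The function names are hypothetical; only the bit positions come from Table 7-9.

```c
#include <stdint.h>

#define ENTRYHI32_VPN2_MASK 0xFFFFE000u /* bits 31..13 */
#define ENTRYHI_ASID_MASK   0x000000FFu /* bits 7..0   */

/* Builds a 32-bit mode EntryHi value; bits 12..8 stay zero (reserved). */
static uint32_t entryhi32_make(uint32_t vaddr, uint8_t asid)
{
    return (vaddr & ENTRYHI32_VPN2_MASK) | (uint32_t)asid;
}

/* VPN2 as a number: the virtual page number divided by two. */
static uint32_t entryhi32_vpn2(uint32_t entryhi)
{
    return entryhi >> 13;
}

/* Address space ID held in the low byte. */
static unsigned entryhi_asid(uint32_t entryhi)
{
    return (unsigned)(entryhi & ENTRYHI_ASID_MASK);
}
```

Dropping bit 12 of the address is what makes one EntryHi value cover both the even and the odd page of a pair.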


7.2.10 Compare Register (Reg#11)

The Compare register acts as a timer. When the value of the Count register equals the

value of the Compare register, interrupt bit IP (7) in the Cause register is set. This

causes an interrupt exception as soon as the interrupt is enabled. Writing a value to this

register, as a side effect, clears the timer interrupt.

For diagnostic purposes, this register is a read/write register. However, in normal

operation this register is write only. Figure 7-10 shows the format of the Compare

register and Table 7-10 describes the Compare register field.

31 0

Compare

Figure 7-10 Compare Register Format

Table 7-10 Compare Register Field Description

Bit Field Description Cold Reset Read/Write

31∼0 Compare Acts as a timer; it maintains a stable value that does not

change on its own. 0x0 Read/Write
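Because Count advances at half the CPUCLK rate and is compared against Compare for equality, a periodic timer reduces to simple arithmetic. The C sketch below shows one way to compute the next Compare value; the function names and the example clock frequency are assumptions made for illustration, and reading or writing the CP0 registers themselves is outside the sketch.

```c
#include <stdint.h>

/* Number of Count ticks in interval_us microseconds (Count runs at CPUCLK / 2). */
static uint32_t timer_ticks(uint32_t cpuclk_hz, uint32_t interval_us)
{
    return (uint32_t)(((uint64_t)(cpuclk_hz / 2u) * interval_us) / 1000000u);
}

/*
 * Value to write to Compare so that it matches Count after `ticks` more
 * increments. Unsigned arithmetic wraps modulo 2^32, the same way the
 * 32-bit Count register does.
 */
static uint32_t next_compare(uint32_t count_now, uint32_t ticks)
{
    return count_now + ticks;
}
```

With a hypothetical 200 MHz CPUCLK, a 1 ms period corresponds to 100000 ticks of Count.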


7.2.11 Status Register (Reg#12)

The Status register is a read/write register that contains the operating mode, interrupt

enabling, and diagnostic states of the processor. The more important Status register

fields are as follows:

• The Interrupt Mask (IM) field of 8 bits controls the enabling of eight interrupt

conditions. An interrupt is taken only when interrupts are enabled and the

corresponding bits are set in both the IM field of this register and the Interrupt

Pending field of the Cause register.

• The Coprocessor Usability (CU) field of 4 bits controls the usability of four

possible coprocessors. Regardless of the CU0 bit setting, CP0 is always usable in

Kernel mode.

• The Diagnostic Status (DS) field of 9 bits is used for self-testing, and checks the

cache and virtual memory system.

• The Reverse Endian (RE) bit reverses the endianness. The processor is

configured as either little-endian or big-endian at reset; this reset-time selection applies in

Kernel and Supervisor modes, and in User mode when the RE bit is 0.

Setting the RE bit to 1 inverts the User mode endianness.

Figure 7-11 shows the format of the Status register and Table 7-11 describes the Status

register fields.

31 28 27 26 25 24 16 15 8 7 6 5 4 3 2 1 0

CU 0 FR RE DS IM KX SX UX KSU ERL EXL IE

24 23 22 21 20 19 18 17 16

0 BEV 0 SR 0 CH 0 0

Figure 7-11 Status Register Format

Table 7-11 Status Register Field Descriptions

Bit Field Description Cold Reset Read/Write

31∼28 CU (3,2,1,0) Controls the usability of each of the four coprocessor unit

numbers. CP0 is always usable when in Kernel mode,

regardless of the setting of the CU0 bit.

0: unusable, 1: usable

0000 Read/Write

27 0 Reserved 0 Read

26 FR Enables additional floating-point registers.

0: 16 registers, 1: 32 registers 0 Read/Write

25 RE Reverse-Endian bit, valid in User mode. 0 Read/Write

24∼23 0 Reserved 0x0 Read

22 BEV Controls the location of TLB refill and general exception

vectors.

0: normal, 1: bootstrap

1 Read/Write


21 0 Reserved 0 Read

20 SR 1: Indicates a soft reset or NMI has occurred. 0 Read/Write

19 0 Reserved 0 Read

18 CH “Hit” or “miss” indication for last CACHE Hit Invalidate, Hit

Write Back Invalidate, Hit Write Back for a primary cache.

0: miss, 1: hit.

0 Read/Write

17∼16 0 Reserved 0x0 Read

15∼8 IM Interrupt Mask

Controls the enabling of each of the external, internal and

software interrupts. An interrupt is taken if interrupts are

enabled, and the corresponding bits are set in both the IM

field of the Status register and the IP field of the Cause

register.

0: disabled, 1: enabled

0x0 Read/Write

7 KX Enables 64-bit addressing in Kernel mode. The extended-

addressing TLB refill exception is used for TLB misses on

kernel addresses.

0: 32-bit, 1: 64-bit

0 Read/Write

6 SX Enables 64-bit addressing and operations in Supervisor

mode. The extended-addressing TLB refill exception is used

for TLB misses on supervisor addresses.

0: 32-bit, 1: 64-bit

0 Read/Write

5 UX Enables 64-bit addressing and operations in User mode. The

extended-addressing TLB refill exception is used for TLB

misses on user addresses.

0: 32-bit, 1: 64-bit

0 Read/Write

4∼3 KSU Mode.

10: user, 01: supervisor, 00: kernel. 0x0 Read/Write

2 ERL Error Level.

0: normal, 1: error. 1 Read/Write

1 EXL Exception Level.

0: normal, 1: exception. 0 Read/Write

0 IE Interrupt Enable.

0: disable, 1: enable. 0 Read/Write


Status Register Modes and Access States

Fields of the Status register set the modes and access states described in the sections

that follow.

! Interrupt Enable: Interrupts are enabled when all of the following conditions are met:

• IE = 1

• EXL = 0

• ERL = 0

If these conditions are met, the settings of the IM bits enable the interrupt.

! Operation Modes: The following CPU Status register bit settings are required for

User, Kernel and Supervisor modes (see Section 8.3, Operation Modes, for more

information about operating modes).

• The processor is in User mode when KSU = 10 (binary), EXL = 0, and ERL = 0.

• The processor is in Supervisor mode when KSU = 01 (binary), EXL = 0 and ERL = 0.

• The processor is in Kernel mode when KSU = 00 (binary), or EXL = 1, or ERL = 1.

! 32- and 64-bit Modes: The following CPU Status register settings select 32- or 64-bit

operation for User, Kernel, and Supervisor operating modes. Enabling 64-bit

operation permits the execution of 64-bit opcodes and translation of 64-bit addresses.

64-bit operation for User, Kernel and Supervisor modes can be set independently.

• 64-bit addressing for Kernel mode is enabled when KX = 1. 64-bit operations are

always valid in Kernel mode.

• 64-bit addressing and operations are enabled for Supervisor mode when SX = 1.

• 64-bit addressing and operations are enabled for User mode when UX = 1.

! Kernel Address Space Accesses: Access to the kernel address space is allowed when

the processor is in Kernel mode.

! Supervisor Address Space Accesses: Access to the supervisor address space is allowed

when the processor is in Kernel or Supervisor mode, as described above under

Operation Modes.

! User Address Space Accesses: Access to the user address space is allowed in any of the

three operating modes.
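The interrupt-enable and operating-mode rules above can be sketched in C as two predicates over a Status register value. The enum, macro, and function names are hypothetical. The KSU encoding 11 does not appear in Table 7-11, so the sketch reports it as reserved; that treatment is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions from Table 7-11; names are illustrative. */
#define STATUS_IE        (1u << 0)
#define STATUS_EXL       (1u << 1)
#define STATUS_ERL       (1u << 2)
#define STATUS_KSU_SHIFT 3
#define STATUS_KSU_MASK  0x3u

enum cpu_mode { MODE_KERNEL, MODE_SUPERVISOR, MODE_USER, MODE_RESERVED };

/* Interrupts are enabled only when IE = 1, EXL = 0 and ERL = 0. */
static bool status_interrupts_enabled(uint32_t status)
{
    return (status & (STATUS_IE | STATUS_EXL | STATUS_ERL)) == STATUS_IE;
}

/* Operating mode: EXL or ERL forces Kernel mode regardless of KSU. */
static enum cpu_mode status_mode(uint32_t status)
{
    unsigned ksu = (status >> STATUS_KSU_SHIFT) & STATUS_KSU_MASK;

    if ((status & (STATUS_EXL | STATUS_ERL)) != 0u || ksu == 0u)
        return MODE_KERNEL;
    if (ksu == 1u)
        return MODE_SUPERVISOR;
    if (ksu == 2u)
        return MODE_USER;
    return MODE_RESERVED; /* KSU = 11 is not listed in Table 7-11 */
}
```

Note that a Status value with KSU = 10 still decodes as Kernel mode while EXL is set, which is the state an exception handler runs in.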

Status Register Reset

The contents of the Status register are undefined at reset, except for the following bits

in the Diagnostic Status field:

• ERL and BEV = 1

The SR bit distinguishes between the Reset exception and the Soft Reset exception

(caused by Nonmaskable Interrupt [NMI]).


7.2.12 Cause Register (Reg#13)

The Cause register holds the cause of the most recent exception. This register is read-

only, except for the IP[1~0] bits. Figure 7-12 shows the format of the Cause register and

Table 7-12 describes the Cause register field.

31 30 29 28 27 16 15 8 7 6 2 1 0

BD 0 CE 0 IP 0 ExcCode 0

Figure 7-12 Cause Register Format

Table 7-12 Cause Register Field Descriptions

Bit Field Description Cold Reset Read/Write

31 BD Indicates whether or not the last exception was taken while

executing in a branch delay slot.

0: normal, 1: delay slot.

0 Read

30 0 Reserved 0 Read

29~28 CE Indicates the coprocessor unit number referenced when a

coprocessor unusable exception is taken.

00: coprocessor 0, 01: coprocessor 1,

10: coprocessor 2, 11: coprocessor 3.

0x0 Read

27~16 0 Reserved 0x0 Read

15~10 IP [7~2] Indicates whether an interrupt is pending.

0: not pending, 1: pending. INT[5:0] Read

9~8 IP [1~0] Software interrupts.

0: reset, 1: set. 0x0 Read/Write

7 0 Reserved 0 Read

6~2 ExcCode Exception Code field.

0: Int: Interrupt.

1: Mod: TLB modification exception.

2: TLBL: TLB exception (load or instruction fetch)

3: TLBS: TLB exception (store)

4: AdEL: Address error exception (load or instruction fetch)

5: AdES: Address error exception (store)

6: IBE: Bus error exception (instruction fetch)

7: DBE: Bus error exception (data reference: load or store)

8: Sys: Syscall exception

9: Bp: Breakpoint exception

10: RI: Reserved instruction exception

11: CpU: Coprocessor Unusable exception

12: Ov: Arithmetic Overflow exception

13: Tr: Trap exception

14: Reserved

15: FPE: Floating-Point exception

16-31: Reserved

0x0 Read

1~0 0 Reserved 0x0 Read
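A general exception handler usually begins by decoding the Cause register. The C sketch below extracts the fields listed in Table 7-12 and maps ExcCode to the mnemonics in the table; it also combines the IP field with the IM field of the Status register, both of which occupy bits 15 to 8. The function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* ExcCode field, bits 6..2. */
static unsigned cause_exccode(uint32_t cause)
{
    return (unsigned)((cause >> 2) & 0x1Fu);
}

/* BD bit: the exception was taken in a branch delay slot. */
static bool cause_in_delay_slot(uint32_t cause)
{
    return ((cause >> 31) & 1u) != 0u;
}

/* CE field: coprocessor number for a Coprocessor Unusable exception. */
static unsigned cause_ce(uint32_t cause)
{
    return (unsigned)((cause >> 28) & 0x3u);
}

/* Mnemonic for an ExcCode value, following Table 7-12. */
static const char *exccode_name(unsigned code)
{
    static const char *const names[16] = {
        "Int", "Mod", "TLBL", "TLBS", "AdEL", "AdES", "IBE", "DBE",
        "Sys", "Bp", "RI", "CpU", "Ov", "Tr", "Reserved", "FPE"
    };
    return (code < 16u) ? names[code] : "Reserved";
}

/* Interrupts that are both pending (Cause IP) and unmasked (Status IM). */
static unsigned pending_enabled(uint32_t cause, uint32_t status)
{
    return (unsigned)(((cause & status) >> 8) & 0xFFu);
}
```

Bit 7 of the `pending_enabled()` result corresponds to IP (7), the bit set when Count equals Compare.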


7.2.13 EPC Register (Reg#14)

The Exception Program Counter (EPC) register is a read/write register. This register

contains the address at which processing resumes after an exception has been serviced.

For synchronous exceptions, this register contains either:

• the virtual address of the instruction that was the direct cause of the exception.

• the virtual address of the immediately preceding branch or jump instruction

(when the instruction is in a branch delay slot, and the Branch Delay bit in the

Cause register is set).

The processor does not write to the EPC register when the EXL bit in the Status register is

set to 1. Figure 7-13 shows the formats of the EPC register and Table 7-13 describes the

EPC register field.

31 0

EPC

(32-bit mode)

63 0

EPC

(64-bit mode)

Figure 7-13 EPC Register Formats

Table 7-13 EPC Register Field Description

32-bit mode

Bit Field Description Cold Reset Read/Write

31~0 EPC Exception program counter Undefined Read/Write

64-bit mode

Bit Field Description Cold Reset Read/Write

63~0 EPC Exception program counter Undefined Read/Write


7.2.14 PRId Register (Reg#15)

The Processor Revision Identifier (PRId) register is a read-only register. This register

contains information identifying the implementation and revision level of the CPU and

CP0. Figure 7-14 shows the format of the PRId register and Table 7-14 describes the

PRId register field.

31 16 15 8 7 0

0 Imp Rev

Figure 7-14 PRId Register Format

Table 7-14 PRId Register Field Descriptions

Bit Field Description Cold Reset Read/Write

31~16 0 Reserved 0x0 Read

15~8 Imp Implementation number. 0x2d means “TX49 family”. 0x2d Read

7~0 Rev Revision number +.+ Read

+ Value is shown in product sheet
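A minimal C sketch of how software might identify the core from a PRId value follows. The function names are hypothetical; the only value taken from Table 7-14 is the implementation number 0x2d.

```c
#include <stdbool.h>
#include <stdint.h>

#define PRID_IMP_TX49 0x2Du /* implementation number from Table 7-14 */

/* Imp field, bits 15..8. */
static unsigned prid_imp(uint32_t prid)
{
    return (unsigned)((prid >> 8) & 0xFFu);
}

/* Rev field, bits 7..0; its meaning is given in the product sheet. */
static unsigned prid_rev(uint32_t prid)
{
    return (unsigned)(prid & 0xFFu);
}

/* True when the implementation number identifies the TX49 family. */
static bool prid_is_tx49(uint32_t prid)
{
    return prid_imp(prid) == PRID_IMP_TX49;
}
```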


7.2.15 Config Register (Reg#16)

The Config register is a read-only register, except for the HALT, ICE#, DCE# and K0 fields.

This register specifies various configuration options selected on the TX49.

The EC, BE, IC, DC, IB and DB fields are set by the hardware during reset and are included

in this register as read-only status bits for the software to access. Figure 7-15 shows the

format of the Config register and Table 7-15 describes the Config register field.

31 30 28 27 24 23 19 18 17 16 15 14 12 11 9 8 6 5 4 3 2 0

0 EC 0 0 HALT ICE# DCE# BE 1 0 IC DC IB DB 0 K0

Figure 7-15 Config Register Format

Table 7-15 Config Register Field Descriptions

Bit Field Description Cold Reset Read/Write

31 0 Reserved 0 Read

30~28 EC GBUS clock rate:

0: processor clock frequency divided by 2

1: processor clock frequency divided by 3

2: processor clock frequency divided by 4

7: processor clock frequency divided by 2.5

3, 4, 5, 6: reserved

pin Read

27 0 Reserved pin Read/Write

26~24 0 Reserved pin Read

23~19 0 Reserved 0 Read

18 HALT Wait mode.

0: Halt

1: Doze

Indicates the power-down behavior of the TX49 when WAIT

instruction is executed. The TX49 stalls the pipeline both in

halt and doze mode. Cache snoops are possible during

Doze mode but not possible during Halt mode. Halt mode

reduces power consumption to a greater extent than Doze

mode.

0 Read/Write

17 ICE# Instruction Cache Enable

0: Instruction cache enable

1: Instruction cache disable

0 Read/Write

16 DCE# Data Cache Enable

0: Data cache enable

1: Data cache disable

0 Read/Write

15 BE Big Endian

0: Little Endian

1: Big Endian

pin Read

14~13 1 Reserved 11 Read

12 0 Reserved 0 Read


11~9 IC Instruction cache size. In the TX49, this is set to 8 KB (001),

16 KB (010) or 32 KB (011). 001, 010 or 011 Read

8~6 DC Data cache size. In the TX49, this is set to 8 KB (001),

16 KB (010) or 32 KB (011). 001, 010 or 011 Read

5 IB Primary I-cache line size

1: 32 bytes (8 words) 1 Read

4 DB Primary D-cache line size

1: 32 bytes (8 words) 1 Read

3 0 Reserved 0 Read

2~0 K0 kseg0 coherency algorithm

0: Cacheable, noncoherent, write-through, no-WA

1: Cacheable, noncoherent, write-through, WA

2: Uncached

3: Cacheable, noncoherent, write-back, WA

4-7: Reserved

0x0 Read/Write
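The read-only status fields of the Config register lend themselves to small decode helpers. The C sketch below covers the cache size and GBUS clock fields only; the function names are hypothetical. Encodings that Table 7-15 does not list decode to 0, and the EC divisor is returned multiplied by two so that the 2.5 ratio stays an integer. Both conventions are choices made for this example.

```c
#include <stdint.h>

/* Cache size in Kbytes for an IC/DC code: 001 = 8, 010 = 16, 011 = 32. */
static unsigned config_cache_kb(unsigned code)
{
    return (code >= 1u && code <= 3u) ? (4u << code) : 0u;
}

/* Instruction cache size from the IC field, bits 11..9. */
static unsigned config_icache_kb(uint32_t config)
{
    return config_cache_kb((unsigned)((config >> 9) & 0x7u));
}

/* Data cache size from the DC field, bits 8..6. */
static unsigned config_dcache_kb(uint32_t config)
{
    return config_cache_kb((unsigned)((config >> 6) & 0x7u));
}

/* GBUS divisor from the EC field (bits 30..28), times two; 0 if reserved. */
static unsigned config_ec_divisor_x2(uint32_t config)
{
    switch ((config >> 28) & 0x7u) {
    case 0u: return 4u; /* divide by 2   */
    case 1u: return 6u; /* divide by 3   */
    case 2u: return 8u; /* divide by 4   */
    case 7u: return 5u; /* divide by 2.5 */
    default: return 0u; /* 3, 4, 5, 6 are reserved */
    }
}
```

The GBUS clock frequency can then be computed as `cpu_hz * 2 / config_ec_divisor_x2(config)` once the result has been checked for a nonzero divisor.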


7.2.16 LLAddr Register (Reg#17)

The Load Linked Address (LLAddr) register is a read/write register that contains the

physical address read by the most recent Load Linked (LL/LLD) instruction. This register

is for diagnostic purposes only, and serves no function during normal operation. Figure

7-16 shows the format of the LLAddr register and Table 7-16 describes the LLAddr

register field.

31 0

pAddr (35~4)

Figure 7-16 LLAddr Register Format

Table 7-16 LLAddr Register Field Description

Bit Field Description Cold Reset Read/Write

31~0 pAddr Physical address bits 35~4 0x0 Read/Write


7.2.17 XContext Register (Reg#20)

The XContext register is a read/write register, and contains a pointer to an entry in the

page table entry (PTE) array, an operating system data structure that stores virtual to

physical address translations. When there is a TLB miss, the operating system software

loads the TLB with the missing translation from the PTE array. Although the contents of

this register duplicate some of the information in the BadVAddr register, they are arranged in a

form that is more useful to a software TLB exception handler. This register is for use

with the XTLB refill handler, which loads TLB entries for references to a 64-bit address

space, and is included solely for operating system use. The operating system sets the PTE

base field in the register, as needed. Normally, the operating system uses this register to

address the current page map which resides in the Kernel mapped segment, kseg3.

The 27-bit BadVPN2 field contains bits 39 to 13 of the virtual address that caused the TLB

miss; bit 12 is excluded because a single TLB entry maps to an even-odd page pair. For a

4-Kbyte page size, this format may be used directly to access the pair-table of 8-byte

PTEs. For other page sizes and PTE sizes, shifting and masking this value produces the

appropriate address.

Figure 7-17 shows the format of the XContext register and Table 7-17 describes the

XContext register field.

63 33 32 31 30 4 3 0

PTEBase R BadVPN2 0

Figure 7-17 XContext Register Format

Table 7-17 XContext Register Field Description

Bit Field Description Cold Reset Read/Write

63~33 PTEBase Page table entry base pointer

This field is normally written with a value that allows the

operating system to use the XContext register as a pointer

into the current PTE array in memory.

Undefined Read/Write

32~31 R The Region field contains bits 63 to 62 of the virtual address.

00: user, 01: supervisor, 11: kernel Undefined Read/Write

30~4 BadVPN2 Bad virtual page number divided by two.

This field is written by hardware on a miss. It contains the

VPN of the most recent invalidly translated virtual address.

Undefined Read

3~0 0 Reserved 0x0 Read
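The XContext layout in Table 7-17 can be modeled the same way as the Context register. The C sketch below assembles the value from a PTEBase and a 64-bit faulting virtual address; it is a software model of the field arrangement only, and the function name is hypothetical.

```c
#include <stdint.h>

#define XCONTEXT_PTEBASE_MASK 0xFFFFFFFE00000000ull /* bits 63..33 */
#define XCONTEXT_R_SHIFT      31                     /* R field, bits 32..31 */
#define XCONTEXT_VPN2_BITS    0x7FFFFFFull           /* 27 bits */

/*
 * Models the XContext value: PTEBase in bits 63..33, virtual address
 * bits 63..62 in R, and virtual address bits 39..13 in BadVPN2
 * (register bits 30..4). Bits 3..0 are zero.
 */
static uint64_t xcontext_value(uint64_t ptebase, uint64_t bad_vaddr)
{
    uint64_t r = (bad_vaddr >> 62) & 0x3ull;
    uint64_t badvpn2 = (bad_vaddr >> 13) & XCONTEXT_VPN2_BITS;

    return (ptebase & XCONTEXT_PTEBASE_MASK) |
           (r << XCONTEXT_R_SHIFT) |
           (badvpn2 << 4);
}
```

As with Context, the four zero low-order bits mean each even-odd page pair indexes a 16-byte slot holding two 8-byte PTEs.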


7.2.18 Debug Register (Reg#23)

The Debug register is read-only, except for the TLF, BsF, SSt and JtagRst fields. Figure 7-18

shows the format of the Debug register and Table 7-18 describes the Debug register fields.

31 30 29 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0

DBD DM 0 NIS TRS OES TLF BsF 0 SSt JtagRst 0 DINT DIB DDBS DDBL DBp DSS

Figure 7-18 Debug Register Format

Table 7-18 Debug Register Field Descriptions

Bit Field Description Cold Reset Read/Write

31 DBD Debug Branch Delay; When a debug exception occurs while

an instruction in the branch delay slot is executing, this bit is

set to 1.

0 Read

30 DM Debug Mode; It indicates that a debug exception has taken

place. This bit is set when a debug exception is taken, and

is cleared upon return from the exception (DERET). While

this bit is set, all interrupts (including NMI), TLB exceptions,

bus error exceptions, and debug exceptions are masked and

the cache line locking function is disabled.

0: Debug handler not running.

1: Debug handler running.

0 Read

29~15 0 Reserved 0x0 Read

14 NIS Non-maskable Interrupt Status; When set, this bit

indicates that a non-maskable interrupt occurred at the

same time as a debug exception. In this case the Status,

Cause, EPC, and BadVAddr registers assume their usual

state after a non-maskable interrupt, but the

address in DEPC is not the non-maskable exception vector

address (0xbfc0 0000). Instead, the debug handler software

puts 0xbfc0 0000 in DEPC, after which processing

returns directly from the debug exception to the

non-maskable interrupt handler.

0 Read

13 TRS TLB Miss Status; When set, this bit indicates that a debug

exception and a TLB/XTLB refill exception occurred at the

same time. In this case the Status, Cause, EPC, and

BadVAddr registers assume their usual state after a

TLB/XTLB refill. The address in DEPC is

not the other exception vector address. Instead, the debug

exception handler software puts the vector address in DEPC:

0xbfc0 0200 (BEV = 1) or 0x8000 0000 (BEV = 0) for a TLB refill

exception, and 0xbfc0 0280 (BEV = 1) or 0x8000 0080 (BEV = 0)

for an XTLB refill exception. Processing then

returns directly from the debug exception to the

other exception handler.

0 Read


12 OES Other Exception Status; When set, this bit indicates that an

exception other than reset, NMI, or TLB/XTLB refill

occurred at the same time as a debug exception. In this

case the Status, Cause, EPC, and BadVAddr registers

assume their usual state after such an

exception, but the address in DEPC is not the other

exception vector address. Instead, the debug exception handler

software puts 0xbfc0 0380 (if BEV = 1) or 0x8000 0180 (if BEV = 0)

in DEPC, after which processing returns

directly from the debug exception to the other exception handler.

0 Read

11 TLF TLB Exception Flag; this bit is set to 1 when a TLB-related
exception occurs for the immediately preceding load or store
instruction while a debug exception handler is running (DM = 1). A
TLB exception sets this bit to 1 even if 0 is being written to it.
The bit is cleared by writing 0; writing 1 is ignored.

0 Read/Write

10 BsF Bus Error Exception Flag; this bit is set to 1 when a bus
error exception occurs for a load or store instruction while a debug
exception handler is running (DM = 1). A bus error exception sets
this bit to 1 even if 0 is being written to it. The bit is cleared by
writing 0; writing 1 is ignored.

0 Read/Write

9 0 Reserved 0 Read

8 SSt Single Step; indicates whether the single step debug function
is enabled (1) or disabled (0). The function is disabled while the DM
bit is set to 1, that is, while the debug exception handler is
running. 0 Read/Write

7 JtagRst JTAG Reset; when this bit is set to 1, the processor resets
the JTAG unit. 0 Read/Write

6 0 Reserved 0 Read

5 DINT Debug Interrupt Break Exception Status; set to 1 when a debug
interrupt occurs. 0 Read

4 DIB Debug Instruction Break Exception Status; set to 1 on an
instruction address break. 0 Read

3 DDBS Debug Data Break Store Exception Status; set to 1 on a data
address break at a store operation. 0 Read

2 DDBL Debug Data Break Load Exception Status; set to 1 on a data
address break at a load operation. 0 Read

1 DBp Debug Breakpoint Exception Status; set to 1 when an SDBBP
instruction is executed. 0 Read

0 DSS Debug Single Step Exception Status; set to 1 on a single step
exception. 0 Read
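The bit assignments in the Debug register table can be captured as masks. The following is a minimal sketch, not a Toshiba-supplied API: the macro names, the enum, and both helper functions are hypothetical, and the order in which cause bits are tested is an arbitrary choice made here. The second helper relies on the table's statement that writing 1 to TLF or BsF is ignored.

```c
#include <stdint.h>

/* Bit masks taken from the Debug register field table (names are hypothetical). */
#define DEBUG_DM   (UINT32_C(1) << 30)
#define DEBUG_NIS  (UINT32_C(1) << 14)
#define DEBUG_TRS  (UINT32_C(1) << 13)
#define DEBUG_OES  (UINT32_C(1) << 12)
#define DEBUG_TLF  (UINT32_C(1) << 11)
#define DEBUG_BSF  (UINT32_C(1) << 10)
#define DEBUG_SST  (UINT32_C(1) << 8)
#define DEBUG_DINT (UINT32_C(1) << 5)
#define DEBUG_DIB  (UINT32_C(1) << 4)
#define DEBUG_DDBS (UINT32_C(1) << 3)
#define DEBUG_DDBL (UINT32_C(1) << 2)
#define DEBUG_DBP  (UINT32_C(1) << 1)
#define DEBUG_DSS  (UINT32_C(1) << 0)

enum debug_cause {
    CAUSE_NONE, CAUSE_SINGLE_STEP, CAUSE_SDBBP, CAUSE_DATA_BREAK_LOAD,
    CAUSE_DATA_BREAK_STORE, CAUSE_INSTRUCTION_BREAK, CAUSE_DEBUG_INTERRUPT
};

/* Return the first cause bit found in a Debug register value.
 * The test order is arbitrary; the manual does not define a priority here. */
static enum debug_cause debug_first_cause(uint32_t debug)
{
    if (debug & DEBUG_DSS)  return CAUSE_SINGLE_STEP;
    if (debug & DEBUG_DBP)  return CAUSE_SDBBP;
    if (debug & DEBUG_DDBL) return CAUSE_DATA_BREAK_LOAD;
    if (debug & DEBUG_DDBS) return CAUSE_DATA_BREAK_STORE;
    if (debug & DEBUG_DIB)  return CAUSE_INSTRUCTION_BREAK;
    if (debug & DEBUG_DINT) return CAUSE_DEBUG_INTERRUPT;
    return CAUSE_NONE;
}

/* Build a value that clears TLF when written back.
 * BsF is written as 1, which the table says is ignored, so BsF keeps its state. */
static uint32_t debug_value_to_clear_tlf(uint32_t debug)
{
    return (debug & ~DEBUG_TLF) | DEBUG_BSF;
}
```

A handler would obtain the register value with MFC0 and write the result back with MTC0; those accesses are omitted because they depend on the toolchain.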


7.2.19 DEPC Register (Reg#24)

The DEPC register holds the address where processing resumes after the debug

exception routine has finished. The address that has been loaded in the DEPC register is

the virtual address of the instruction that caused the debug exception. If the instruction

is in the branch delay slot, the virtual address of the immediately preceding branch or

jump instruction is placed in this register. Execution of the DERET instruction causes a

jump to the address in the DEPC. If the DEPC is written by both software (MTC0) and
hardware (a debug exception) at the same time, the value generated by the hardware is
loaded.

Figure 7-19 shows the formats of the DEPC register and Table 7-19 describes the DEPC
register field.

31 0

DEPC

(32-bit mode)

63 0

DEPC

(64-bit mode)

Figure 7-19 DEPC Register Formats

Table 7-19 DEPC Register Field Description

32-bit mode

Bit Field Description Cold Reset Read/Write

31~0 DEPC Debug exception program counter. Undefined Read/Write

64-bit mode

Bit Field Description Cold Reset Read/Write

63~0 DEPC Debug exception program counter. Undefined Read/Write
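The NIS, TRS, and OES descriptions in the Debug register table state which vector address the debug handler software places in DEPC before returning. The sketch below tabulates those addresses; the enum, the function names, and the idea of passing the pending exception type as a parameter are assumptions made for illustration. The 64-bit helper assumes the 32-bit vector is sign-extended in the same way as other 32-bit addresses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Which exception coincided with the debug exception (hypothetical enum). */
enum pending_exception {
    PENDING_NONE, PENDING_NMI, PENDING_TLB_REFILL,
    PENDING_XTLB_REFILL, PENDING_OTHER
};

/* Store the vector address to be written to DEPC; return false if no fix-up applies. */
static bool depc_fixup_vector(enum pending_exception pending, bool bev,
                              uint32_t *vector)
{
    switch (pending) {
    case PENDING_NMI:                       /* NIS set */
        *vector = UINT32_C(0xBFC00000);
        return true;
    case PENDING_TLB_REFILL:                /* TRS set, TLB refill */
        *vector = bev ? UINT32_C(0xBFC00200) : UINT32_C(0x80000000);
        return true;
    case PENDING_XTLB_REFILL:               /* TRS set, XTLB refill */
        *vector = bev ? UINT32_C(0xBFC00280) : UINT32_C(0x80000080);
        return true;
    case PENDING_OTHER:                     /* OES set */
        *vector = bev ? UINT32_C(0xBFC00380) : UINT32_C(0x80000180);
        return true;
    default:
        return false;
    }
}

/* Sign-extend a 32-bit address to 64 bits (assumed form for a 64-bit DEPC). */
static uint64_t sign_extend32(uint32_t v)
{
    uint64_t upper = (v & UINT32_C(0x80000000)) ? UINT64_C(0xFFFFFFFF00000000) : 0;
    return upper | v;
}
```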


7.2.20 TagLo Register (Reg#28) and TagHi Register (Reg#29)

The TagLo and TagHi registers are read/write registers. They hold the primary cache
tag for the cache lock function or for cache diagnostics, and are written by the CACHE
or MTC0 instruction. Figure 7-20 shows the formats of the TagLo and TagHi registers
and Table 7-20 describes the TagLo and TagHi register fields.

31 8 7 6 5 3 2 1 0

PTagLo PState RWNT Lock F0 0

(TagLo)

31 30 29 0

F1 PTagLo1 0

(TagHi)

Figure 7-20 TagLo and TagHi Register Formats

Table 7-20 TagLo and TagHi Register Field Descriptions

TagLo

Bit Field Description Cold Reset Read/Write

31~8 PTagLo Bits 35~12 of the physical address 0x0 Read/Write

7~6 PState Specifies the primary cache state
0: Invalid 1: Reserved
2: Reserved 3: Valid

0x0 Read/Write

5~3 RWNT Read/Write bits required for Windows NT 0x0 Read/Write

2 Lock Lock bit (0: not locked, 1: locked) 0 Read/Write

1 F0 FIFO Replace bit 0 (indicates the set to be replaced) 0 Read/Write

0 0 Reserved 0 Read

TagHi

Bit Field Description Cold Reset Read/Write

31 F1 FIFO Replace bit 1 (indicates the set to be replaced) 0 Read/Write

30 PTagLo1 Bit 11 of the physical address 0 Read/Write

29~0 0 Reserved 0x0 Read

F1 and F0 are concatenated and indicate the set to be replaced.

F1  F0

0 0 : way0

0 1 : way1

1 0 : way2

1 1 : way3
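As an illustration of the field layout in Table 7-20, the following sketch extracts the replacement way and the physical address bits from a TagLo/TagHi pair. The function names are hypothetical; the bit positions are those given in the table.

```c
#include <stdint.h>

/* Way to be replaced: F1 (TagHi bit 31) concatenated with F0 (TagLo bit 1). */
static unsigned tag_replace_way(uint32_t taglo, uint32_t taghi)
{
    unsigned f0 = (unsigned)((taglo >> 1) & 1u);
    unsigned f1 = (unsigned)((taghi >> 31) & 1u);
    return (f1 << 1) | f0;
}

/* Physical address bits 35~11 reassembled from PTagLo (PA 35~12) and PTagLo1 (PA 11). */
static uint64_t tag_physical_bits(uint32_t taglo, uint32_t taghi)
{
    uint64_t pa_35_12 = (uint64_t)(taglo >> 8);
    uint64_t pa_11 = (uint64_t)((taghi >> 30) & 1u);
    return (pa_35_12 << 12) | (pa_11 << 11);
}

/* PState field, TagLo bits 7~6 (0: invalid, 3: valid). */
static unsigned tag_pstate(uint32_t taglo)
{
    return (unsigned)((taglo >> 6) & 3u);
}

/* Lock bit, TagLo bit 2. */
static unsigned tag_locked(uint32_t taglo)
{
    return (unsigned)((taglo >> 2) & 1u);
}
```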


7.2.21 ErrorEPC Register (Reg#30)

The ErrorEPC register is a read/write register similar to the EPC register. It stores
the program counter (PC) on ColdReset, SoftReset, and NMI exceptions. The register
contains the virtual address at which instruction processing can resume after
servicing an error. This address can be:

• The virtual address of the instruction that caused the exception

• The virtual address of the immediately preceding branch or jump instruction,
when the instruction that caused the exception is in a branch delay slot.

There is no branch delay slot indication for this register. Figure 7-21 shows the formats
of the ErrorEPC register and Table 7-21 describes the ErrorEPC register field.

31 0

ErrorEPC

(32-bit mode)

63 0

ErrorEPC

(64-bit mode)

Figure 7-21 ErrorEPC Register Formats

Table 7-21 ErrorEPC Register Field Descriptions

32-bit mode

Bit Field Description Cold Reset Read/Write

31~0 ErrorEPC Error Exception Program Counter. Undefined Read/Write

64-bit mode

Bit Field Description Cold Reset Read/Write

63~0 ErrorEPC Error Exception Program Counter. Undefined Read/Write


7.2.22 DESAVE Register (Reg#31)

This register is used by the debug exception handler to save one of the GPRs, which is
then used to save the rest of the context to a predetermined memory area, e.g. in the
processor probe. This register allows the safe debugging of exception handlers and other
types of code where the existence of a valid stack for context saving cannot be assumed.
Figure 7-22 shows the format of the DESAVE register and Table 7-22 describes the
DESAVE register field.

Note: This register can be used by an ICE system only.

63 0

DESAVE

Figure 7-22 DESAVE Register Format

Table 7-22 DESAVE register Field Description

32/64-bit mode

Bit Field Description Cold Reset Read/Write

63~0 DESAVE Save one of the GPRs Undefined Read/Write


7.2.23 The Initialization of CP0 Registers in SoftReset Exception

Table 7-23 shows the values of the registers that are initialized by a SoftReset exception.

Table 7-23 The Initial Value by SoftReset Exception

Status (Reg#12) 22 BEV 1 Same value as ColdReset

Status (Reg#12) 20 SR 1 ColdReset has priority over SoftReset

Status (Reg#12) 2 ERL 1 Same value as ColdReset


8. Memory Management System

8.1 Introduction

The TX49 provides a full-featured memory management unit (MMU) which uses an on-chip

translation look aside buffer (TLB) to translate virtual addresses into physical addresses.

8.2 Address Space Overview

The TX49 physical address space is 64 Gbytes, using a 36-bit address. The virtual
address is either 64 or 32 bits wide, depending on whether the processor is operating in
64- or 32-bit mode. In 32-bit mode, addresses are 32 bits wide and the maximum user
process size is 2 Gbytes (2**31). In 64-bit mode, addresses are 64 bits wide and the
maximum user process size is 1 Tbyte (2**40). The virtual address is extended with an
Address Space Identifier (ASID) to reduce the frequency of TLB flushing when switching
contexts. The ASID field is 8 bits wide and is contained in the CP0 EntryHi register.

8.2.1 Virtual Address Space

The processor virtual address can be either 32 or 64 bits wide, depending on whether

the processor is operating in 32-bit or 64-bit mode.

• In 32-bit mode, addresses are 32 bits wide.

The maximum user process size is 2 gigabytes (2**31).

• In 64-bit mode, addresses are 64 bits wide.

The maximum user process size is 1 terabyte (2**40).

Figure 8-1 shows the translation of a virtual address into a physical address.

1. The virtual address (VA), represented by the virtual page number (VPN), is
compared with the tag in the TLB.

2. If there is a match, the page frame number (PFN) representing the upper bits of
the physical address (PA) is output from the TLB.

3. The Offset, which does not pass through the TLB, is then concatenated to the PFN.

Virtual address fields: G, ASID, VPN, Offset

Physical address fields: PFN, Offset

Figure 8-1 Overview of a Virtual-to-Physical Address Translation

As shown in Figure 8-2 and Figure 8-3, the virtual address is extended with an 8-bit
address space identifier (ASID), which reduces the frequency of TLB flushing when
switching contexts. This 8-bit ASID is in the CP0 EntryHi register, described later in
this chapter. The Global bit (G) is in the EntryLo0 and EntryLo1 registers, described
later in this chapter.


8.2.2 Physical Address Space

Using a 36-bit address, the processor physical address space encompasses 64 Gbytes.

The section following describes the translation of a virtual address to a physical address.

8.2.3 Virtual-to-Physical Address Translation

Converting a virtual address to a physical address begins by comparing the virtual

address from the processor with the virtual addresses in the TLB; there is a match when

the virtual page number (VPN) of the address is the same as the VPN field of the entry,

and either:

• the Global (G) bit of the TLB entry is set, or

• the ASID field of the virtual address is the same as the ASID field of the TLB

entry.

This match is referred to as a TLB hit. If there is no match, a TLB Miss exception is
taken by the processor and software is allowed to refill the TLB from a page table of
virtual/physical addresses in memory.

If there is a virtual address match in the TLB, the physical address is output from the

TLB and concatenated with the Offset, which represents an address within the page

frame space. The Offset does not pass through the TLB.
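The match rule above can be sketched in a few lines. This is a simplified model under stated assumptions, not the hardware algorithm: it uses a fixed 4-Kbyte page size, ignores the even/odd page pairing and the valid bit, and the `tlb_entry` structure and `tlb_translate` function are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12u  /* 4-Kbyte pages only, to keep the sketch short */

/* Simplified TLB entry (hypothetical layout, one page per entry). */
struct tlb_entry {
    uint64_t vpn;     /* virtual page number */
    uint8_t  asid;    /* address space identifier */
    bool     global;  /* G bit: match regardless of ASID */
    uint64_t pfn;     /* page frame number */
};

/* Return true on a TLB hit and store the physical address; false models a TLB miss. */
static bool tlb_translate(const struct tlb_entry *tlb, size_t n,
                          uint64_t va, uint8_t asid, uint64_t *pa)
{
    uint64_t vpn = va >> PAGE_SHIFT;
    uint64_t offset = va & ((UINT64_C(1) << PAGE_SHIFT) - 1);

    for (size_t i = 0; i < n; i++) {
        if (tlb[i].vpn == vpn && (tlb[i].global || tlb[i].asid == asid)) {
            /* The Offset bypasses the TLB and is concatenated to the PFN. */
            *pa = (tlb[i].pfn << PAGE_SHIFT) | offset;
            return true;
        }
    }
    return false;
}
```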

Virtual-to-physical translation is described in greater detail throughout the remainder
of this chapter; Figure 8-8, at the end of this chapter, is a flow diagram of the
process. The next two sections describe the 32-bit and 64-bit address translations.


8.2.4 32-bit Mode Address Translation

Figure 8-2 shows the virtual-to-physical-address translation of a 32-bit mode address.

This figure illustrates two of the possible page sizes: a 4-Kbyte page (12 bits) and a 16-

Mbyte page (24 bits).

• The top portion of Figure 8-2 shows a virtual address with a 12-bit, or 4-Kbyte,
page size, labelled Offset. The remaining 20 bits of the address represent the
VPN, and index the 1M-entry page table.

• The bottom portion of Figure 8-2 shows a virtual address with a 24-bit, or 16-

Mbyte, page size, labeled Offset. The remaining 8 bits of the address represent

the VPN, and index the 256-entry page table.
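The split described by the two bullets can be written as a shift and a mask. A minimal sketch, with hypothetical names, assuming the page size is passed as the number of offset bits (12 for 4-Kbyte pages, 24 for 16-Mbyte pages):

```c
#include <stdint.h>

/* VPN and Offset of a 32-bit virtual address (hypothetical helper type). */
struct va_parts {
    uint32_t vpn;
    uint32_t offset;
};

/* page_bits must be less than 32; 12 and 24 are the two cases shown in Figure 8-2. */
static struct va_parts split_va32(uint32_t va, unsigned page_bits)
{
    struct va_parts p;
    p.offset = va & ((UINT32_C(1) << page_bits) - 1u);
    p.vpn = va >> page_bits;
    return p;
}
```

For example, with 4-Kbyte pages the address 0x1234 5678 splits into VPN 0x12345 and Offset 0x678; with 16-Mbyte pages it splits into VPN 0x12 and Offset 0x345678.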

Virtual address with 1M (2**20) 4-Kbyte pages:
bits 39~32 ASID (8 bits), bits 31~12 VPN (20 bits = 1M pages), bits 11~0 Offset (12 bits)

Virtual address with 256 (2**8) 16-Mbyte pages:
bits 39~32 ASID (8 bits), bits 31~24 VPN (8 bits = 256 pages), bits 23~0 Offset (24 bits)

Bits 31, 30 and 29 of the virtual address select user, supervisor, or kernel address
spaces. The VPN undergoes virtual-to-physical translation in the TLB; the Offset is
passed unchanged to physical memory.

36-bit physical address (bits 35~0): PFN, Offset

Figure 8-2 32-bit Mode Virtual Address Translation


8.2.5 64-bit Mode Address Translation

Figure 8-3 shows the virtual-to-physical-address translation of a 64-bit mode address.

This figure illustrates two of the possible page sizes: a 4-Kbyte page (12 bits) and a 16-

Mbyte page (24 bits).

• The top portion of Figure 8-3 shows a virtual address with a 12-bit, or 4-Kbyte,

page size, labelled Offset. The remaining 28 bits of the address represent the

VPN, and index the 256M-entry page table.

• The bottom portion of Figure 8-3 shows a virtual address with a 24-bit, or 16-
Mbyte, page size, labelled Offset. The remaining 16 bits of the address represent
the VPN, and index the 64K-entry page table.
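The page-table sizes quoted above follow from the 40 translated address bits in 64-bit mode: 2**(40 - 12) = 256M entries for 4-Kbyte pages and 2**(40 - 24) = 64K entries for 16-Mbyte pages. A short sketch of that arithmetic, with hypothetical names:

```c
#include <stdint.h>

#define VA_BITS_64BIT_MODE 40u  /* translated virtual address bits in 64-bit mode */

/* Number of page-table entries indexed by the VPN for a given page size. */
static uint64_t page_table_entries(unsigned page_bits)
{
    return UINT64_C(1) << (VA_BITS_64BIT_MODE - page_bits);
}

/* VPN taken from bits 39~page_bits of a 64-bit mode virtual address. */
static uint64_t vpn_of(uint64_t va, unsigned page_bits)
{
    uint64_t low40 = va & ((UINT64_C(1) << VA_BITS_64BIT_MODE) - 1);
    return low40 >> page_bits;
}
```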

Virtual address with 256M (2**28) 4-Kbyte pages:
bits 71~64 ASID (8 bits), bits 63~62 region select, bits 61~40 0 or -1,
bits 39~12 VPN (28 bits = 256M pages), bits 11~0 Offset (12 bits)

Virtual address with 64K (2**16) 16-Mbyte pages:
bits 71~64 ASID (8 bits), bits 63~62 region select, bits 61~40 0 or -1,
bits 39~24 VPN (16 bits = 64K pages), bits 23~0 Offset (24 bits)

Bits 62 and 63 of the virtual address select user, supervisor, or kernel address
spaces. The VPN undergoes virtual-to-physical translation in the TLB; the Offset is
passed unchanged to physical memory.

36-bit physical address (bits 35~0): PFN, Offset

Figure 8-3 64-bit Mode Virtual Address Translation


8.3 Operating Modes

The TX49 has three operating modes, User mode, Supervisor mode and Kernel mode, for
32- and 64-bit operation. The KSU, EXL and ERL bits in the Status register select User,
Supervisor or Kernel mode. The UX, SX and KX bits in the Status register select 32- or
64-bit addressing in User, Supervisor and Kernel mode respectively.

KSU  EXL  ERL  UX  SX  KX  Mode
10   0    0    0   -   -   32-bit addressing in user mode
10   0    0    1   -   -   64-bit addressing in user mode
01   0    0    -   0   -   32-bit addressing in supervisor mode
01   0    0    -   1   -   64-bit addressing in supervisor mode
00   -    -    -   -   0   32-bit addressing in kernel mode
-    1    -    -   -   0   32-bit addressing in kernel mode
-    -    1    -   -   0   32-bit addressing in kernel mode
00   -    -    -   -   1   64-bit addressing in kernel mode
-    1    -    -   -   1   64-bit addressing in kernel mode
-    -    1    -   -   1   64-bit addressing in kernel mode
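The table can be read as a small decision procedure: EXL or ERL forces Kernel mode regardless of KSU, and the addressing width then comes from the UX, SX or KX bit of the resulting mode. A sketch of that reading, using a hypothetical `status_fields` structure that holds the already-extracted field values (bit positions within the Status register are deliberately left out):

```c
enum cpu_mode { MODE_USER, MODE_SUPERVISOR, MODE_KERNEL, MODE_UNDEFINED };

/* Status register fields, already extracted (hypothetical structure). */
struct status_fields {
    unsigned ksu;  /* 2-bit field: 2 = 10 binary, 1 = 01 binary, 0 = 00 binary */
    unsigned exl;
    unsigned erl;
    unsigned ux;
    unsigned sx;
    unsigned kx;
};

static enum cpu_mode decode_mode(const struct status_fields *s)
{
    if (s->exl || s->erl || s->ksu == 0u) return MODE_KERNEL;
    if (s->ksu == 1u) return MODE_SUPERVISOR;
    if (s->ksu == 2u) return MODE_USER;
    return MODE_UNDEFINED;  /* KSU = 11 is not listed in the table */
}

/* Returns 32 or 64, or 0 when the mode is not listed in the table. */
static unsigned addressing_bits(const struct status_fields *s)
{
    switch (decode_mode(s)) {
    case MODE_USER:       return s->ux ? 64u : 32u;
    case MODE_SUPERVISOR: return s->sx ? 64u : 32u;
    case MODE_KERNEL:     return s->kx ? 64u : 32u;
    default:              return 0u;
    }
}
```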

8.3.1 User Mode Operations

In User mode, a single, uniform virtual address space, labelled User segment, is
available; its size is:

• 2 Gbytes (2**31 bytes) in 32-bit mode (useg)

• 1 Tbyte (2**40 bytes) in 64-bit mode (xuseg)

Figure 8-4 shows User mode virtual address space.

32-bit*:
0x0000 0000 through 0x7FFF FFFF: useg, 2 GB, mapped, cacheable
0x8000 0000 through 0xFFFF FFFF: address error

64-bit:
0x0000 0000 0000 0000 through 0x0000 00FF FFFF FFFF: xuseg, 1 TB, mapped, cacheable
0x0000 0100 0000 0000 through 0xFFFF FFFF FFFF FFFF: address error

Figure 8-4 User Mode Virtual Address Space

*Note: In 32-bit mode, bit 31 is sign-extended through bits 63~32. Failure results in an

address error exception.

The User segment starts at address 0 and the current active user process resides in

either useg (in 32-bit mode) or xuseg (in 64-bit mode). The TLB identically maps all

references to useg/xuseg from all modes, and controls cache accessibility.

The processor operates in User mode when the Status register contains the following

bit-values:

• KSU bits = 10₂

• EXL = 0

• ERL = 0


In conjunction with these bits, the UX bit in the Status register selects between 32- or

64-bit User mode addressing as follows:

• when UX = 0, 32-bit useg space is selected and TLB misses are handled by the
32-bit TLB refill exception handler

• when UX = 1, 64-bit xuseg space is selected and TLB misses are handled by the
64-bit XTLB refill exception handler

Table 8-1 lists the characteristics of the two user mode segments, useg and xuseg.

Table 8-1 32-bit and 64-bit User Mode Segments

Address Bit Values      Status Register Bit Values   Segment  Address Range                                           Segment Size
                        KSU   EXL  ERL  UX            Name
32-bit, A (31) = 0      10₂   0    0    0             useg     0x0000 0000 through 0x7FFF FFFF                         2 Gbytes (2**31 bytes)
64-bit, A (63~40) = 0   10₂   0    0    1             xuseg    0x0000 0000 0000 0000 through 0x0000 00FF FFFF FFFF     1 Tbyte (2**40 bytes)

32-bit User Mode (useg)

In User mode, when UX = 0 in the Status register, User mode addressing is
compatible with the 32-bit addressing model shown in Figure 8-4, and a 2-Gbyte user
address space is available, labelled useg.

All valid User mode virtual addresses have their most-significant bit cleared to 0;

any attempt to reference an address with the most-significant bit set while in User

mode causes an Address Error exception.

The system maps all references to useg through the TLB, and bit settings within

the TLB entry for the page determine the cacheability of a reference.

64-bit User Mode (xuseg)

In User mode, when UX = 1 in the Status register, User mode addressing is
extended to the 64-bit model shown in Figure 8-4. In 64-bit User mode, the processor
provides a single, uniform address space of 2**40 bytes, labelled xuseg.

All valid User mode virtual addresses have bits 63~40 equal to 0; an attempt to

reference an address with bits 63~40 not equal to 0 causes an Address Error

exception.

The system maps all references to xuseg through the TLB, and bit settings within

the TLB entry for the page determine the cacheability of a reference.
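The two validity rules for User mode addresses reduce to a range check. A minimal sketch with a hypothetical function name, assuming the address is held as a 64-bit value (in 32-bit mode, a valid useg address has bit 31 clear, so its sign-extended upper bits are also 0):

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true if va is a valid User mode address; false models an Address Error. */
static bool user_address_ok(uint64_t va, bool ux)
{
    if (ux) {
        return (va >> 40) == 0;           /* xuseg: bits 63~40 must be 0 */
    }
    return va <= UINT64_C(0x7FFFFFFF);    /* useg: bit 31 (and above) must be 0 */
}
```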


8.3.2 Supervisor Mode Operations

Supervisor mode is designed for layered operating systems in which a true kernel runs
in TX49 Kernel mode, and the rest of the operating system runs in Supervisor mode.

The processor operates in Supervisor mode when the Status register contains the
following bit-values:

• KSU = 01₂

• EXL = 0

• ERL = 0

In conjunction with these bits, the SX bit in the Status register selects between 32- or
64-bit Supervisor mode addressing:

• when SX = 0, 32-bit supervisor space is selected and TLB misses are handled by

the 32-bit TLB refill exception handler

• when SX = 1, 64-bit supervisor space is selected and TLB misses are handled by

the 64-bit XTLB refill exception handler

The system maps all references through the TLB, and bit settings within the TLB entry

for the page determine the cacheability of a reference.

Figure 8-5 shows Supervisor mode address mapping. Table 8-2 lists the characteristics
of the supervisor mode segments; descriptions of the address spaces follow.

32-bit*:
0x0000 0000 through 0x7FFF FFFF: suseg, 2 GB, mapped, cacheable
0x8000 0000 through 0xBFFF FFFF: address error
0xC000 0000 through 0xDFFF FFFF: sseg, 0.5 GB, mapped, cacheable
0xE000 0000 through 0xFFFF FFFF: address error

64-bit:
0x0000 0000 0000 0000 through 0x0000 00FF FFFF FFFF: xsuseg, 1 TB, mapped, cacheable
0x0000 0100 0000 0000 through 0x3FFF FFFF FFFF FFFF: address error
0x4000 0000 0000 0000 through 0x4000 00FF FFFF FFFF: xsseg, 1 TB, mapped, cacheable
0x4000 0100 0000 0000 through 0xFFFF FFFF BFFF FFFF: address error
0xFFFF FFFF C000 0000 through 0xFFFF FFFF DFFF FFFF: csseg, 0.5 GB, mapped, cacheable
0xFFFF FFFF E000 0000 through 0xFFFF FFFF FFFF FFFF: address error

Figure 8-5 Supervisor Mode Address Space

*Note: In 32-bit mode, bit31 is sign-extended through bits 63~32. Failure results in an

address error exception.


Table 8-2 32-bit and 64-bit Supervisor Mode Segments

Address Bit Values          Status Register Bit Values   Segment  Address Range                                           Segment Size
                            KSU   EXL  ERL  SX            Name
32-bit, A (31) = 0          01₂   0    0    0             suseg    0x0000 0000 through 0x7FFF FFFF                         2 Gbytes (2**31 bytes)
32-bit, A (31~29) = 110₂    01₂   0    0    0             sseg     0xC000 0000 through 0xDFFF FFFF                         512 Mbytes (2**29 bytes)
64-bit, A (63~62) = 00₂     01₂   0    0    1             xsuseg   0x0000 0000 0000 0000 through 0x0000 00FF FFFF FFFF     1 Tbyte (2**40 bytes)
64-bit, A (63~62) = 01₂     01₂   0    0    1             xsseg    0x4000 0000 0000 0000 through 0x4000 00FF FFFF FFFF     1 Tbyte (2**40 bytes)
64-bit, A (63~62) = 11₂     01₂   0    0    1             csseg    0xFFFF FFFF C000 0000 through 0xFFFF FFFF DFFF FFFF     512 Mbytes (2**29 bytes)

32-bit Supervisor Mode, User Space (suseg)

In Supervisor mode, when SX = 0 in the Status register and the most-significant bit
of the 32-bit virtual address is set to 0, the suseg virtual address space is selected; it
covers the full 2**31 bytes (2 Gbytes) of the current user address space. The virtual
address is extended with the contents of the 8-bit ASID field to form a unique virtual
address. This mapped space starts at virtual address 0x0000 0000 and runs through
0x7FFF FFFF.

32-bit Supervisor Mode, Supervisor Space (sseg)

In Supervisor mode, when SX = 0 in the Status register and the three most-
significant bits of the 32-bit virtual address are 110₂, the sseg virtual address space is
selected; it covers 2**29 bytes (512 Mbytes) of the current supervisor address space. The
virtual address is extended with the contents of the 8-bit ASID field to form a unique
virtual address. This mapped space begins at virtual address 0xC000 0000 and runs
through 0xDFFF FFFF.

64-bit Supervisor Mode, User Space (xsuseg)

In Supervisor mode, when SX = 1 in the Status register and bits 63~62 of the virtual
address are set to 00₂, the xsuseg virtual address space is selected; it covers the full
2**40 bytes (1 Tbyte) of the current user address space. The virtual address is extended
with the contents of the 8-bit ASID field to form a unique virtual address. This
mapped space starts at virtual address 0x0000 0000 0000 0000 and runs through
0x0000 00FF FFFF FFFF.

64-bit Supervisor Mode, Current Supervisor Space (xsseg)

In Supervisor mode, when SX = 1 in the Status register and bits 63~62 of the

virtual address are set to 01₂, the xsseg current supervisor virtual address space is

selected. The virtual address is extended with the contents of the 8-bit ASID field to

form a unique virtual address. This mapped space begins at virtual address 0x4000

0000 0000 0000 and runs through 0x4000 00FF FFFF FFFF.


64-bit Supervisor Mode, Separate Supervisor Space (csseg)

In Supervisor mode, when SX = 1 in the Status register and bits 63~62 of the

virtual address are set to 11₂, the csseg separate supervisor virtual address space is

selected. Addressing of the csseg is compatible with addressing sseg in 32-bit mode.

The virtual address is extended with the contents of the 8-bit ASID field to form a

unique virtual address. This mapped space begins at virtual address 0xFFFF FFFF

C000 0000 and runs through 0xFFFF FFFF DFFF FFFF.
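The 64-bit Supervisor mode ranges from Table 8-2 can be checked directly. The sketch below is an illustration only; the enum and function names are hypothetical, and it covers the SX = 1 case, returning an address-error marker for everything outside the three listed segments.

```c
#include <stdint.h>

enum supervisor_segment { SEG_XSUSEG, SEG_XSSEG, SEG_CSSEG, SEG_ADDRESS_ERROR };

/* Classify a 64-bit Supervisor mode virtual address (SX = 1). */
static enum supervisor_segment classify_supervisor64(uint64_t va)
{
    if (va <= UINT64_C(0x000000FFFFFFFFFF)) {
        return SEG_XSUSEG;
    }
    if (va >= UINT64_C(0x4000000000000000) && va <= UINT64_C(0x400000FFFFFFFFFF)) {
        return SEG_XSSEG;
    }
    if (va >= UINT64_C(0xFFFFFFFFC0000000) && va <= UINT64_C(0xFFFFFFFFDFFFFFFF)) {
        return SEG_CSSEG;
    }
    return SEG_ADDRESS_ERROR;
}
```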

8.3.3 Kernel Mode Operations

The processor operates in Kernel mode when the Status register contains one or more of

the following values:

• KSU = 00₂

• EXL = 1

• ERL = 1

In conjunction with these bits, the KX bit in the Status register selects between 32- or

64-bit Kernel mode addressing:

• when KX = 0, 32-bit kernel space is selected and all TLB misses are handled by

the 32-bit TLB refill exception handler

• when KX = 1, 64-bit kernel space is selected and all TLB misses are handled by

the 64-bit XTLB refill exception handler

The processor enters Kernel mode whenever an exception is detected and it remains in

Kernel mode until an Exception Return (ERET) instruction is executed and results in

ERL and/or EXL = 0. The ERET instruction restores the processor to the mode existing

prior to the exception.

Kernel mode virtual address space is divided into regions differentiated by the high-
order bits of the virtual address, as shown in Figure 8-6. Table 8-3 lists the
characteristics of the 32-bit kernel mode segments, and Table 8-4 lists the characteristics
of the 64-bit kernel mode segments.


32-bit*:
0x0000 0000 through 0x7FFF FFFF: kuseg, 2 GB, mapped, cacheable
0x8000 0000 through 0x9FFF FFFF: kseg0, 0.5 GB, unmapped, cacheable
0xA000 0000 through 0xBFFF FFFF: kseg1, 0.5 GB, unmapped, uncached
0xC000 0000 through 0xDFFF FFFF: ksseg, 0.5 GB, mapped, cacheable
0xE000 0000 through 0xFFFF FFFF: kseg3, 0.5 GB, mapped, cacheable

64-bit:
0x0000 0000 0000 0000 through 0x0000 00FF FFFF FFFF: xkuseg, 1 TB, mapped, cacheable
0x0000 0100 0000 0000 through 0x3FFF FFFF FFFF FFFF: address error
0x4000 0000 0000 0000 through 0x4000 00FF FFFF FFFF: xksseg, 1 TB, mapped, cacheable
0x4000 0100 0000 0000 through 0x7FFF FFFF FFFF FFFF: address error
0x8000 0000 0000 0000 through 0xBFFF FFFF FFFF FFFF: xkphys, unmapped (for details see Figure 8-7)
0xC000 0000 0000 0000 through 0xC000 00FF 7FFF FFFF: xkseg, mapped, cacheable
0xC000 00FF 8000 0000 through 0xFFFF FFFF 7FFF FFFF: address error
0xFFFF FFFF 8000 0000 through 0xFFFF FFFF 9FFF FFFF: ckseg0, 0.5 GB, unmapped, cacheable
0xFFFF FFFF A000 0000 through 0xFFFF FFFF BFFF FFFF: ckseg1, 0.5 GB, unmapped, uncached
0xFFFF FFFF C000 0000 through 0xFFFF FFFF DFFF FFFF: cksseg, 0.5 GB, mapped, cacheable
0xFFFF FFFF E000 0000 through 0xFFFF FFFF FFFF FFFF: ckseg3, 0.5 GB, mapped, cacheable

Figure 8-6 Kernel Mode Address Space

*Note 1: In 32-bit mode, bit 31 is sign-extended through bits 63~32. Failure results in an address error
exception.

*Note 2: 0xff00_0000 through 0xff3f_ffff in 32-bit mode and 0xffff_ffff_ff00_0000 through 0xffff_ffff_ff3f_ffff
in 64-bit mode are reserved (unmapped, uncached) for use by registers in the Debug Support
Unit and TX49 MCU peripherals.


0x8000 0000 0000 0000 through 0x87FF FFFF FFFF FFFF: 64 GB, unmapped, cacheable, noncoherent, WT-no-WA
0x8800 0000 0000 0000 through 0x8FFF FFFF FFFF FFFF: 64 GB, unmapped, cacheable, noncoherent, WT-WA
0x9000 0000 0000 0000 through 0x97FF FFFF FFFF FFFF: 64 GB, unmapped, uncached
0x9800 0000 0000 0000 through 0x9FFF FFFF FFFF FFFF: 64 GB, unmapped, cacheable, noncoherent
0xA000 0000 0000 0000 through 0xBFFF FFFF FFFF FFFF: 4 * 64 GB, unmapped, reserved

Figure 8-7 xkphys Address Space


Table 8-3 32-bit Kernel Mode Segments

For every row, the Status register is one of these values: KSU = 00₂, EXL = 1, or ERL = 1.

Address Bit Values   KX  Segment Name  Address Range                      Segment Size
A (31) = 0           0   kuseg         0x0000 0000 through 0x7FFF FFFF    2 Gbytes (2**31 bytes)
A (31~29) = 100₂     0   kseg0         0x8000 0000 through 0x9FFF FFFF    512 Mbytes (2**29 bytes)
A (31~29) = 101₂     0   kseg1         0xA000 0000 through 0xBFFF FFFF    512 Mbytes (2**29 bytes)
A (31~29) = 110₂     0   ksseg         0xC000 0000 through 0xDFFF FFFF    512 Mbytes (2**29 bytes)
A (31~29) = 111₂     0   kseg3         0xE000 0000 through 0xFFFF FFFF    512 Mbytes - 4 Mbytes (2**29 bytes less the reserved area)
                     0   (Reserved)    0xFF00 0000 through 0xFF3F FFFF    4 Mbytes

32-bit Kernel Mode, User Space (kuseg)

In Kernel mode, when KX = 0 in the Status register, and the most-significant bit of
the virtual address, A31, is cleared, the 32-bit kuseg virtual address space is selected;
it covers the full 2**31 bytes (2 Gbytes) of the current user address space. The virtual
address is extended with the contents of the 8-bit ASID field to form a unique virtual
address. When ERL = 1 in the Status register, the user address region becomes a
2**31-byte unmapped (that is, mapped directly to physical addresses) uncached address
space.

32-bit Kernel Mode, Kernel Space 0 (kseg0)

In Kernel mode, when KX = 0 in the Status register and the most-significant three
bits of the virtual address are 100₂, the 32-bit kseg0 virtual address space is selected;
it is the 2**29-byte (512 Mbyte) kernel physical space. References to kseg0 are not
mapped through the TLB; the physical address selected is defined by subtracting
0x8000 0000 from the virtual address. The K0 field of the Config register, described in
this chapter, controls cacheability and coherency.


32-bit Kernel Mode, Kernel Space 1 (kseg1)

In Kernel mode, when KX = 0 in the Status register and the most-significant three
bits of the 32-bit virtual address are 101₂, the 32-bit kseg1 virtual address space is
selected; it is the 2**29-byte (512 Mbyte) kernel physical space. References to kseg1 are
not mapped through the TLB; the physical address selected is defined by subtracting
0xA000 0000 from the virtual address. Caches are disabled for accesses to these
addresses, and physical memory (or memory-mapped I/O device registers) is
accessed directly.
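The kseg0 and kseg1 translations are fixed subtractions, which the following sketch makes concrete. The function name and the `cacheable_per_k0` output are hypothetical; the latter only records that kseg0 cacheability is governed by the Config register K0 field, while kseg1 is always uncached.

```c
#include <stdbool.h>
#include <stdint.h>

/* Translate a 32-bit kseg0/kseg1 virtual address to a physical address.
 * Returns false for addresses outside these two unmapped segments. */
static bool kseg_unmapped_to_phys(uint32_t va, uint32_t *pa, bool *cacheable_per_k0)
{
    if (va >= UINT32_C(0x80000000) && va <= UINT32_C(0x9FFFFFFF)) {
        *pa = va - UINT32_C(0x80000000);   /* kseg0 */
        *cacheable_per_k0 = true;          /* cacheability set by Config.K0 */
        return true;
    }
    if (va >= UINT32_C(0xA0000000) && va <= UINT32_C(0xBFFFFFFF)) {
        *pa = va - UINT32_C(0xA0000000);   /* kseg1 */
        *cacheable_per_k0 = false;         /* always uncached */
        return true;
    }
    return false;
}
```

For example, 0x8000 1000 (kseg0) and 0xA000 1000 (kseg1) both refer to physical address 0x0000 1000.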

32-bit Kernel Mode, Supervisor Space (ksseg)

In Kernel mode, when KX = 0 in the Status register and the most-significant three
bits of the 32-bit virtual address are 110₂, the ksseg virtual address space is selected;
it is the current 2**29-byte (512 Mbyte) supervisor virtual space. The virtual address is
extended with the contents of the 8-bit ASID field to form a unique virtual address.

32-bit Kernel Mode, Kernel Space 3 (kseg3)

In Kernel mode, when KX = 0 in the Status register and the most-significant three
bits of the 32-bit virtual address are 111₂, the kseg3 virtual address space is selected;
it is the current kernel virtual space of 2**29 bytes less the reserved area (512 Mbytes
- 4 Mbytes). The virtual address is extended with the contents of the 8-bit ASID field
to form a unique virtual address.

Note: There is a 4-Mbyte reserved area that begins at virtual address 0xFF00_0000 and runs
through 0xFF3F_FFFF.


Table 8-4 64-bit Kernel Mode Segments

For every row, the Status register is one of these values: KSU = 00₂, EXL = 1, or ERL = 1.

Address Bit Values                    KX  Segment Name  Address Range                                            Segment Size
A (63~62) = 00₂                       1   xkuseg        0x0000 0000 0000 0000 through 0x0000 00FF FFFF FFFF      1 Tbyte (2**40 bytes)
A (63~62) = 01₂                       1   xksseg        0x4000 0000 0000 0000 through 0x4000 00FF FFFF FFFF      1 Tbyte (2**40 bytes)
A (63~62) = 10₂                       1   xkphys        0x8000 0000 0000 0000 through 0xBFFF FFFF FFFF FFFF      8 * 2**32 bytes
A (63~62) = 11₂                       1   xkseg         0xC000 0000 0000 0000 through 0xC000 00FF 7FFF FFFF      2**40 - 2**31 bytes
A (63~62) = 11₂, A (61~31) = -1       1   ckseg0        0xFFFF FFFF 8000 0000 through 0xFFFF FFFF 9FFF FFFF      512 Mbytes (2**29 bytes)
A (63~62) = 11₂, A (61~31) = -1       1   ckseg1        0xFFFF FFFF A000 0000 through 0xFFFF FFFF BFFF FFFF      512 Mbytes (2**29 bytes)
A (63~62) = 11₂, A (61~31) = -1       1   cksseg        0xFFFF FFFF C000 0000 through 0xFFFF FFFF DFFF FFFF      512 Mbytes (2**29 bytes)
A (63~62) = 11₂, A (61~31) = -1       1   ckseg3        0xFFFF FFFF E000 0000 through 0xFFFF FFFF FFFF FFFF      512 Mbytes - 4 Mbytes
                                      1   (Reserved)    0xFFFF FFFF FF00 0000 through 0xFFFF FFFF FF3F FFFF      4 Mbytes

64-bit Kernel Mode, User Space (xkuseg)

In Kernel mode, when KX = 1 in the Status register and bits 63~62 of the 64-bit
virtual address are 00₂, the xkuseg virtual address space is selected; it covers the
current user address space. The virtual address is extended with the contents of the
8-bit ASID field to form a unique virtual address.

When ERL = 1 in the Status register, the user address region becomes a 2**31-byte
unmapped (that is, mapped directly to physical addresses) uncached address space.

64-bit Kernel Mode, Current Supervisor Space (xksseg)

In Kernel mode, when KX = 1 in the Status register and bits 63~62 of the 64-bit

virtual address are 01₂, the xksseg virtual address space is selected; it is the current

supervisor virtual space. The virtual address is extended with the contents of the 8-

bit ASID field to form a unique virtual address.


64-bit Kernel Mode, Physical Spaces (xkphys)

In Kernel mode, when KX = 1 in the Status register and bits 63~62 of the 64-bit
virtual address are 10₂, one of the unmapped xkphys address spaces is selected,
either cached or uncached. Accesses with address bits 58~36 not equal to 0 cause an
address error.

References to this space are not mapped; the physical address selected is taken

from bits 35~0 of the virtual address. Bits 61~59 of the virtual address specify the

cacheability and coherency attributes, as shown in Table 8-5.

Table 8-5 Cacheability and Coherency Attributes

Value (61~59)  Cacheability and Coherency Attributes                           Starting Address
0              Cacheable, non-coherent, write-through, no write allocate       0x8000 0000 0000 0000
1              Cacheable, non-coherent, write-through, write allocate          0x8800 0000 0000 0000
2              Uncached                                                        0x9000 0000 0000 0000
3              Cacheable, non-coherent                                         0x9800 0000 0000 0000
4-7            Reserved                                                        0xA000 0000 0000 0000
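Decoding an xkphys address follows directly from the bit assignments above: bits 63~62 select the space, bits 61~59 give the attribute value, bits 58~36 must be 0, and bits 35~0 are the physical address. A minimal sketch with a hypothetical function name; it reports the raw attribute value and leaves the check for reserved values 4-7 to the caller.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decode an xkphys virtual address.
 * Returns false if va is not in xkphys or if bits 58~36 are non-zero
 * (the latter case models an address error). */
static bool xkphys_decode(uint64_t va, unsigned *attr, uint64_t *pa)
{
    if ((va >> 62) != 2u) {
        return false;                              /* bits 63~62 are not 10 binary */
    }
    if (((va >> 36) & UINT64_C(0x7FFFFF)) != 0) {
        return false;                              /* bits 58~36 must be 0 */
    }
    *attr = (unsigned)((va >> 59) & 7u);           /* bits 61~59 */
    *pa = va & UINT64_C(0xFFFFFFFFF);              /* bits 35~0 */
    return true;
}
```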

64-bit Kernel Mode, Kernel Space (xkseg)

In Kernel mode, when KX = 1 in the Status register and bits 63~62 of the 64-bit

virtual address are 11₂, the address space selected is one of the following:

• kernel virtual space, xkseg, the current kernel virtual space; the virtual address

is extended with the contents of the 8-bit ASID field to form a unique virtual

address

• one of the four 32-bit kernel compatibility spaces, as described in the next section.

64-bit Kernel Mode, Compatibility Spaces (ckseg1~0, cksseg, ckseg3)

In Kernel mode, when KX = 1 in the Status register, bits 63~62 of the 64-bit virtual
address are 11₂, and bits 61~31 of the virtual address equal -1, the lower two bytes of
address, as shown in Figure 8-6, select one of the following 512-Mbyte compatibility
spaces.

• ckseg0. This 64-bit virtual address space is an unmapped region, compatible with
the 32-bit address model kseg0. The K0 field of the Config register,
described in this chapter, controls cacheability and coherency.

• ckseg1. This 64-bit virtual address space is an unmapped and uncached region,

compatible with the 32-bit address model kseg1.

• cksseg. This 64-bit virtual address space is the current supervisor virtual space,

compatible with the 32-bit address model ksseg.

• ckseg3. This 64-bit vir tual add ress space is kernel v irtual space, compati ble with

the 32-bit address model kseg3.
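
As a rough illustration of the selection rule above, the following sketch classifies a 64-bit kernel address. It assumes the usual MIPS layout in which the four 512-Mbyte spaces start at 0xffff_ffff_8000_0000, 0xffff_ffff_a000_0000, 0xffff_ffff_c000_0000 and 0xffff_ffff_e000_0000, so that bits 30~29 pick the space; that bit choice and the function name `classify_kernel_address` are assumptions of this example. The upper limit of xkseg is not checked.

```python
MASK64 = (1 << 64) - 1
COMPAT_SPACES = {0b00: "ckseg0", 0b01: "ckseg1", 0b10: "cksseg", 0b11: "ckseg3"}


def classify_kernel_address(vaddr):
    """Return the space name for an address whose bits 63~62 are 11, else None."""
    vaddr &= MASK64
    if vaddr >> 62 != 0b11:
        return None
    all_ones = (1 << 31) - 1
    if (vaddr >> 31) & all_ones == all_ones:       # bits 61~31 equal -1
        return COMPAT_SPACES[(vaddr >> 29) & 0b11]  # assumed selector bits 30~29
    return "xkseg"                                  # range limits not modelled
```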

8.4 Translation Lookaside Buffer

8.4.1 Joint TLB

The TX49 has a fully associative TLB which maps 48 pairs (odd/even entry) of virtual

pages to their corresponding physical addresses.

8.4.2 TLB Entry format

32-bit addressing

    127~121   120~109   108~96
    0         MASK      0

    95~77   76   75~72   71~64
    VPN2    G    0       ASID

    63~62   61~38   37~35   34   33   32
    0       PFN     C       D    V    0

    31~30   29~6    5~3     2    1    0
    0       PFN     C       D    V    0

64-bit addressing

    255~217   216~205   204~192
    0         MASK      0

    191~190   189~168   167~141   140   139~136   135~128
    R         0         VPN2      G     0         ASID

    127~94   93~70   69~67   66   65   64
    0        PFN     C       D    V    0

    63~30    29~6    5~3     2    1    0
    0        PFN     C       D    V    0

MASK : Page comparison mask. This field sets the variable page size for each TLB entry.

VPN2 : Virtual page number divided by two (maps to two pages).

ASID : Address space ID field.

R : Region (00: user, 01: supervisor, 11: kernel); used to match Vaddr63~62.

PFN : Page frame number; upper bits of the physical address.

C : Specifies the cache algorithm to be used (see the “C” field of the EntryLo0, 1).

D : Dirty. If this bit is set, the page is marked as dirty and therefore writable. This bit is
actually a write-protect bit that software can use to prevent alteration of data.

V : Valid. If this bit is set, it indicates that the TLB entry is valid. If an access hits a
TLB entry whose V bit is cleared, a TLB invalid exception occurs.

G : Global. If this bit is set in both Lo0 and Lo1, the ASID is ignored during TLB lookup.

0 : Reserved. Returns zeroes when read.
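
The low 64 bits of an entry (PFN in bits 29~6, C in bits 5~3, D in bit 2, V in bit 1) can be unpacked as in the sketch below. The function names are hypothetical, and `physical_address` assumes a 4-KB page so that the 24-bit PFN supplies physical address bits 35~12; other page sizes are not modelled.

```python
def unpack_lo(word):
    """Extract the PFN, C, D and V fields from the low 64 bits of a TLB entry."""
    return {
        "PFN": (word >> 6) & 0xFFFFFF,  # bits 29~6
        "C": (word >> 3) & 0b111,       # bits 5~3, cache algorithm
        "D": (word >> 2) & 1,           # bit 2, dirty (write enable)
        "V": (word >> 1) & 1,           # bit 1, valid
    }


def physical_address(pfn, offset, page_shift=12):
    """Combine a PFN with a page offset (4-KB pages assumed by default)."""
    return (pfn << page_shift) | (offset & ((1 << page_shift) - 1))
```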

8.4.3 Instruction-TLB

The TX49 has a 2-entry instruction TLB (ITLB). Each ITLB entry is a subset of any

single JTLB entry. The ITLB is completely invisible to software.

8.4.4 Data-TLB

The TX49 has a 4-entry data TLB (DTLB). Each DTLB entry is a subset of any single
JTLB entry. The DTLB is completely invisible to software.

8.5 Virtual-to-Physical Address Translation Process

During virtual-to-physical address translation, the CPU compares the 8-bit ASID (if the

Global bit, G, is not set) of the virtual address to the ASID of the TLB entry to see if there is a

match. One of the following comparisons is also made:

• In 32-bit mode, the highest 7 to 19 bits (depending upon the page size) of the virtual

address are compared to the contents of the TLB VPN2 (virtual page number divided

by two).

• In 64-bit mode, the highest 15 to 27 bits (depending upon the page size) of the virtual

address are compared to the contents of the TLB VPN2 (virtual page number divided

by two).

If a TLB entry matches, the physical address and access control bits (C, D, and V) are

retrieved from the matching TLB entry. While the V bit of the entry must be set for a valid

translation to take place, it is not involved in the determination of a matching TLB entry.

Figure 8-8 illustrates the TLB address translation process.

[Figure 8-8: flowchart. Input: virtual address. Decision nodes: User mode?, Supervisor
mode?, Legal address? (for each mode; for the valid address spaces, see the section
describing Operating Modes in this chapter), Mapped address?, VPN match?, Global?, ASID
match?, 32-bit address?, Valid?, Write?, Dirty?, Uncached?. Outcomes: Address Error,
TLB Refill, XTLB Refill, TLB Invalid and TLB Mod exceptions; otherwise the physical
address is output and the access goes to the cache or to main memory.]

Figure 8-8 TLB Address Translation

TLB Misses

If there is no TLB entry that matches the virtual address, a TLB refill exception occurs.
(TLB refill exceptions are described in Chapter 11.) If the access control bits (D and V)
indicate that the access is not valid, a TLB modification or TLB invalid exception occurs.
If the C bits equal 010₂, the physical address that is retrieved accesses main memory,
bypassing the cache.
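
The matching rules and miss conditions above can be summarised in a simplified software model. The sketch below is an illustration under several assumptions: a fixed 4-KB page size (so VPN2 is the virtual address shifted right by 13), no R field or address-legality checks, and no distinction between TLB refill and XTLB refill. The types `Page` and `TlbEntry` and the function `translate` are hypothetical names.

```python
from typing import NamedTuple


class Page(NamedTuple):
    pfn: int
    c: int      # cache algorithm bits
    d: bool     # dirty (write enable)
    v: bool     # valid


class TlbEntry(NamedTuple):
    vpn2: int
    asid: int
    g: bool     # global: ignore the ASID when matching
    even: Page
    odd: Page


def translate(tlb, vaddr, asid, is_write):
    """Return (outcome, physical address or None) for one access."""
    vpn2 = vaddr >> 13
    for entry in tlb:
        # V takes no part in matching; only VPN2 and ASID (unless G) do.
        if entry.vpn2 != vpn2 or not (entry.g or entry.asid == asid):
            continue
        page = entry.odd if (vaddr >> 12) & 1 else entry.even
        if not page.v:
            return "TLB Invalid", None
        if is_write and not page.d:
            return "TLB Modified", None
        paddr = (page.pfn << 12) | (vaddr & 0xFFF)
        return ("uncached" if page.c == 0b010 else "cached"), paddr
    return "TLB Refill", None
```

Returning the outcome as a string keeps the model small; a fuller model would raise the corresponding exception and record BadVAddr and EntryHi.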

TLB Instructions

Table 8-6 lists the instructions that the CPU provides for working with the TLB. See

Appendix A for a detailed description of these instructions.

Table 8-6 TLB Instructions

Op Code   Description of Instruction
TLBP      Translation Lookaside Buffer Probe
TLBR      Translation Lookaside Buffer Read
TLBWI     Translation Lookaside Buffer Write Index
TLBWR     Translation Lookaside Buffer Write Random

9. Cache Organization

9.1 Introduction

This chapter describes the cache memory of the TX49. The processor has two on-chip primary
caches, one for instructions and one for data. Each cache is 8 Kbytes, 16 Kbytes or
32 Kbytes in size.

9.2 Instruction Cache (I-Cache)

The TX49 primary I-cache has the following characteristics:

• Cache size: 8 KB/16 KB/32 KB (fixed for each product)

• Four-way set associative

• FIFO replacement

• Indexed with a virtual address

• Checked with a physical tag

• Block (line) size: 8 words (32 bytes)

• Burst refill size: 8 words (32 bytes)

• Lockable on a per-line basis (way1, way2 and way3)

• All Valid, Lock and FIFO bits are cleared by a Reset exception

9.2.1 Instruction Cache Address Field

Figure 9-1 shows the instruction cache address field. When a 4-KB page size is used with
the 32-KB instruction cache, bit 12 of the physical address and bit 12 of the virtual
address must have the same value.

Cache size   Physical Tag      Cache Tag Index                        Word            Byte
8 KB         35~11 (25 bits)   10~5 (6 bits)                          4~3 (2 bits)    2~0 (3 bits)
16 KB        35~12 (24 bits)   11~5 (7 bits)                          4~3 (2 bits)    2~0 (3 bits)
32 KB        35~12 (24 bits)   12~5 (8 bits; VA(12~5) in Figure 9-8)  4~3 (2 bits)    2~0 (3 bits)

Figure 9-1 Instruction Cache Address Field

9.2.2 Instruction Cache Configuration

The lines in the 4 ways of the instruction cache share FIFO replacement bits. Figure
9-2 shows the format of the replacement bits. These bits are shared by way0, way1, way2 and
way3 for the 8 KB/16 KB/32 KB cache and indicate the next set to which replacement will be
directed; when a lock bit is set to 1, they indicate a set that is not locked.

Each line of instruction cache data has an associated 27-bit (8 KB)/26-bit (16 KB/32 KB)
tag that contains a 25-bit (8 KB)/24-bit (16 KB/32 KB) physical address, a single Lock bit
and a single Valid bit, except for the line in way0, which has a 26-bit (8 KB)/25-bit
(16 KB/32 KB) tag that excludes the Lock bit. Figure 9-3 shows the formats of the tag and
data pair.

F1 F0

F0: FIFO replace bit 0

F1: FIFO replace bit 1

Figure 9-2 Format of Replacement Bits

    25   24~0   63~0   63~0   63~0   63~0
    V    PTag   Data   Data   Data   Data

Format for way0 (8 KB)

    24   23~0   63~0   63~0   63~0   63~0
    V    PTag   Data   Data   Data   Data

Format for way0 (16 KB/32 KB)

    26   25   24~0   63~0   63~0   63~0   63~0
    L    V    PTag   Data   Data   Data   Data

Format for way1, 2 and 3 (8 KB)

    25   24   23~0   63~0   63~0   63~0   63~0
    L    V    PTag   Data   Data   Data   Data

Format for way1, 2 and 3 (16 KB/32 KB)

L: Lock bit (1: enable, 0: disable)

V: Valid bit (1: valid, 0: invalid)

PTag: Physical tag (bit 35~12 of the physical address)

Data: Instruction cache data

Figure 9-3 Format of Tag and Data Pair for I-cache

9.3 Data Cache

The TX49 primary D-cache has the following characteristics:

• Cache size: 8 KB/16 KB/32 KB (fixed for each product)

• Four-way set associative

• FIFO replacement

• Indexed with a virtual address

• Checked with a physical tag

• Block (line) size: 8 words (32 bytes)

• Burst refill size: 8 words (32 bytes)

• Lockable on a per-line basis (way1, way2 and way3)

• Store buffer

• Selectable write-back and write-through on a per-page basis

• All W, CS, FIFO and Lock bits are cleared by a Reset exception

9.3.1 Data Cache Address Field

Figure 9-4 shows the data cache address field. When a 4-KB page size is used with the
32-KB data cache, bit 12 of the physical address and bit 12 of the virtual address must
have the same value.

Cache size   Physical Tag      Cache Tag Index                        Word            Byte
8 KB         35~11 (25 bits)   10~5 (6 bits)                          4~3 (2 bits)    2~0 (3 bits)
16 KB        35~12 (24 bits)   11~5 (7 bits)                          4~3 (2 bits)    2~0 (3 bits)
32 KB        35~12 (24 bits)   12~5 (8 bits; VA(12~5) in Figure 9-8)  4~3 (2 bits)    2~0 (3 bits)

Figure 9-4 Data Cache Address Field

9.3.2 Data Cache Configuration

The lines in the 4 ways of the data cache share the F1, F0 replacement bits. Figure 9-5
shows the format of the replacement bits. These bits are shared by way0, way1, way2 and
way3 for the 8 KB/16 KB/32 KB cache and indicate the next set to which replacement will be
directed; when a lock bit is set to 1, they indicate a set that is not locked.

Each line of data cache data has an associated 29-bit/28-bit tag that contains a
25-bit/24-bit physical address, a single Lock bit, a single write-back bit and a 2-bit cache
state, except for the line in way0, which has a 28-bit/27-bit tag that excludes the Lock bit.
Figure 9-6 shows the formats of the tag and data pair.

F1 F0

F0: FIFO replace bit 0

F1: FIFO replace bit 1

Figure 9-5 Format of Replacement Bits

    27   26~25   24~0   63~0   63~0   63~0   63~0
    W    CS      PTag   Data   Data   Data   Data

Format for way0 (8 KB)

    26   25~24   23~0   63~0   63~0   63~0   63~0
    W    CS      PTag   Data   Data   Data   Data

Format for way0 (16 KB/32 KB)

    28   27   26~25   24~0   63~0   63~0   63~0   63~0
    L    W    CS      PTag   Data   Data   Data   Data

Format for way1, 2 and 3 (8 KB)

    27   26   25~24   23~0   63~0   63~0   63~0   63~0
    L    W    CS      PTag   Data   Data   Data   Data

Format for way1, 2 and 3 (16 KB/32 KB)

L: Lock bit (1: enable, 0: disable)

W: Write-back bit (set if the cache line has been written)

CS: Primary cache state
(0: Invalid, 1: Reserved, 2: Reserved, 3: Valid)

PTag: Physical tag (bit 35~12 of the physical address)

Data: Data cache data

Figure 9-6 Format of Tag and Data Pair for D-cache

In the TX49, the W (write-back) bit, not the cache state, indicates whether the primary
cache contains modified data that must be written back to memory. Only the Invalid
and Valid states are used to describe a cache line; that is, there is no hardware support
for cache coherency.

9.3.3 Data Cache Policies

The TX49 provides three write policy options for the data cache: two write-through

modes and one write-back mode. Selection of a write policy is done by the K0 bit in the

Config register for the kseg0 segment and the C bit within each TLB entry for the other

segments. For a description of the K0 bit, see Table 7-15; for a description of the C bit, see

Table 7-3.

The write policy should not be changed once the cache is initialized; otherwise, the

contents of the data cache are not guaranteed.

a) Write-through modes (write allocate/no write allocate)

In write-through mode, data is written to the cache and to main memory at the same
time. On a cache store miss, write-through without write allocate causes the data
to be sent only to main memory, whereas write-through with write allocate
causes the relevant cache line to be replaced before the data is sent to the data
cache and main memory.

b) Write-back mode

In the write-back policy, the processor writes a copy of the data to the cache
but not to main memory. The data is written to main memory only when the cache's
copy is about to be replaced.

9.4 FIFO Replacement Algorithm

The TX49 uses the FIFO (first in, first out) policy when overwriting the blocks of data in its

instruction and data caches.

• Typically, data items in way0, way1, way2 and way3 are replaced in this order.

• The FIFO[1:0] bits do not point at locked and valid lines.

• Invalid lines, if any, are replaced first.

• The FIFO replacement bits are altered when external data is written to the cache or

via the CACHE instruction.
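
The rules listed above can be illustrated with a small victim-selection model. This is a sketch of one reading of those rules, not a description of the actual hardware: it assumes that any invalid way is chosen before the FIFO pointer is consulted, and that the pointer otherwise advances past ways that are both valid and locked. `Way` and `choose_victim` are hypothetical names.

```python
from typing import NamedTuple


class Way(NamedTuple):
    valid: bool
    locked: bool = False


def choose_victim(ways, fifo):
    """Return the index of the way to replace, or None if none is replaceable."""
    # Invalid lines, if any, are replaced first.
    for i, way in enumerate(ways):
        if not way.valid:
            return i
    # Otherwise start at the FIFO pointer and skip locked valid lines.
    for step in range(len(ways)):
        i = (fifo + step) % len(ways)
        if not ways[i].locked:
            return i
    return None
```

Because way0 has no Lock bit, a real cache set always has at least one replaceable way; the `None` return only covers inputs that violate that property.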

Figure 9-7 shows several examples of how the FIFO replacement bits change due to cache

line replacements.

[Figure 9-7: six examples, A) to F), of a four-way set (way0 to way3) with different
combinations of valid, invalid and locked ways.]

Figure 9-7 FIFO Replacement Policy

9.5 Lock function

The lock function can be used to keep critical instructions/data in one instruction/data
cache set; they are not replaced while the lock bit is set.

9.5.1 Lock bit setting and clearing

Setting the Lock bit of a cache line enables the instruction/data cache lock function.
When the lock function is enabled, the instruction/data in a valid line is locked and
is never replaced. The set to be locked is the one pointed to by the FIFO bits.
Instructions/data refilled while the lock function is enabled are locked. When a store
miss occurs for the write-through data cache without write allocate, the store data is
not written to the cache and will therefore not be locked.

The lock function is disabled by clearing the Lock bit in each line.

To clear or set the Lock bit in the cache, the Cache instructions (Index store I-cache
/D-cache Tag) can be used; to load instructions/data from memory into the cache,
other Cache instructions (Fill I-cache/D-cache) can be used (refer to the Cache
instruction).

Clear the lock bit as follows when data written to a locked line should be stored in main

memory.

(1) Read the locked data from cache memory

(2) Clear the lock bit

(3) Store the data that was read

9.5.2 Operation During Lock

After the lock bit is set for a line, the line can be replaced only when its line state is
invalid. A locked valid line can never be replaced. The FIFO bits should point only to the
set of a locked invalid line or an unlocked line.

In Write Back mode, a write access to a locked valid line updates only the cache, not
memory. In Write Through mode, both the cache and memory are updated.

9.5.3 Example of Data Cache Locking

During the load operation to the locked line of the cache, all interrupts should be
disabled to avoid locking the wrong data.

To lock data cache lines, the following code sequence could be used.

    .......................       /* Disable the interrupt */
    mtc0  t0, TagLo               /* Load data into TagLo reg */
    cache 2 (D), offset (base)    /* Invalidate and lock line in desired set using
                                     Index_Store_Tag cache instruction */
    cache 7 (D), offset (base)    /* Fill the cache line from desired memory location */
    .......................       /* Enable the interrupt */

9.5.4 Example of Instruction Cache Locking

To lock instruction cache lines, the following code sequence could be used:

    .......................       /* Disable the interrupt */
    mtc0  t0, TagLo               /* Load data into TagLo reg */
    cache 2 (I), offset (base)    /* Invalidate and lock line in desired set using
                                     Index_Store_Tag cache instruction */
    cache 5 (I), offset (base)    /* Fill the cache line from desired memory location */
    .......................       /* Enable the interrupt */

9.6 The Primary Cache Accessing

Figure 9-8 shows the virtual address (VA) index into the primary cache. Each instruction and
data cache is 8 KB, 16 KB or 32 KB in size. The virtual address bits used to index into the
primary cache are determined by the cache size.

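[Figure 9-8: the tag array and the data array are both indexed by the virtual address:
VA(12~5) for 32 KB, VA(11~5) for 16 KB and VA(10~5) for 8 KB. Each tag line holds the W,
State and Tag fields; each data line holds the data.]

Figure 9-8 Primary Cache Data and Tag Organization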
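
The index widths in Figure 9-8 follow from the cache geometry (four ways, 32-byte lines). The sketch below derives them and also shows why only the 32-KB cache with 4-KB pages requires bit 12 of the virtual and physical addresses to agree. The function names are hypothetical and the code is an illustrative model only.

```python
LINE_BYTES = 32   # 8 words per line
WAYS = 4          # four-way set associative


def index_bits(cache_kb):
    """Number of virtual address bits used as the cache index."""
    if cache_kb not in (8, 16, 32):
        raise ValueError("cache size must be 8, 16 or 32 KB")
    sets = cache_kb * 1024 // (WAYS * LINE_BYTES)
    return sets.bit_length() - 1


def cache_index(vaddr, cache_kb):
    """Index taken from the virtual address, starting at bit 5."""
    return (vaddr >> 5) & ((1 << index_bits(cache_kb)) - 1)


def translated_index_bits(cache_kb, page_bytes=4096):
    """Index bits that lie above the page offset and so depend on translation."""
    page_shift = page_bytes.bit_length() - 1
    top = 5 + index_bits(cache_kb) - 1
    return list(range(page_shift, top + 1))
```

With these definitions, `translated_index_bits(32)` evaluates to `[12]`, while the 8-KB and 16-KB caches give an empty list.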

9.7 Cache States

This section describes the states of a cache line. A cache line in the TX49 is in one of
the states described in Table 9-1.

The I-Cache line is in one of the following states:

• invalid

• valid

The D-Cache line is in one of the following states:

• invalid

• valid

Table 9-1 Cache States

Cache Line State   Description
Invalid            A cache line that does not contain valid information must be marked
                   invalid, and cannot be used. A cache line in any state other than
                   invalid is assumed to contain valid information.
Valid              A valid cache line contains valid information. The cache line may or
                   may not be consistent with memory and is owned by the processor (see
                   Cache Line Ownership in this chapter).

9.8 Cache Line Ownership

The TX49 becomes the owner of a cache line after it writes to that cache line (that is,
when the line enters the Valid state), and is responsible for providing the contents of
that line on a read request. There can only be one owner for each cache line.

9.9 Cache Multi-Hit Operation

The TX49 does not guarantee correct operation when multiple hits occur in the primary cache.

Therefore, when locking a specific program or data in the primary cache, the program or
data must be used only after it has been locked in the cache by the Fill instruction.

9.10 Cache Test Function

9.10.1 Cache Disabling

The Config register bits ICE# (Instruction Cache Enable) and DCE# (Data Cache
Enable) are used to enable and disable the instruction and data cache, respectively.
When a cache is disabled, all cache accesses are misses and there is no refill (nor is
there any burst bus cycle; this is the same as accessing a non-cacheable area). The Valid
bit (V) or Cache State bits (CS) for each entry cannot be modified.

Notes:

When the instruction cache is disabled:

• Every instruction fetch causes a cache miss, and external memory accesses are

performed using single-read bus cycles.

• The CACHE instruction can still operate on the instruction cache.

Notes:

When the data cache is disabled:

• Every load or store instruction causes a cache miss. Data cache refills are

disabled, and external memory accesses occur using single-read or single-write

transactions.

• The CACHE instruction can still operate on the data cache.

Notes:

How to disable the instruction cache:

• When disabling the instruction cache, instruction streaming should be
  discontinued by placing a jump instruction following the MTC0 instruction.

  Example:    MTC0   Rn, Config        (Set the ICE# bit to 1)
              J      L1                (Jump to L1 and disable instruction streaming)
              NOP                      (Branch delay slot)
        L1:   CACHE  IndexInvalidate, offset (base)

9.10.2 Cache Flushing

Both the instruction and data cache are flushed when a ColdReset/SoftReset exception

is raised (all valid bits are cleared to 0).

The instruction cache is flushed by the CACHE instruction Index_Invalidate

/Hit_Invalidate. The data cache is flushed by the CACHE instruction

IndexWriteBackInvalidate/HitInvalidate/HitWriteBackInvalidate.

The processor writes the cache line back to main memory during the execution of Index

Writeback Invalidate, Hit Writeback Invalidate or Hit Writeback CACHE instruction or

when the modified cache line is replaced. In write-back mode, software is responsible for

ensuring cache coherency.

10. Write Buffer

The TX49 contains a write buffer to improve the performance of writes to the external memory.

Every write to external memory uses this on-chip write buffer. The write buffer holds up to four

64-bit address and data pairs.

For a cache miss write-back, the entire buffer is used for the write-back data and allows the

processor to proceed in parallel with the memory update. For uncached and write-through stores,

the write buffer uncouples the CPU from the write to memory. If the write buffer is full,

additional stores will stall until there is room for them in the write buffer.

The TX49 processor core might issue a read request while the write buffer is performing a

write operation. Multiple read/write operations are serviced in the following order:

• If there is only a write request, the data in the write buffer is written to an external

device.

• If there is only a read request, a read operation is performed to bring in data from an

external device.

• If a read request and a write request occur simultaneously, the read request is

serviced first, except for the following cases:

• when the processor issues a read request to the target address of one of the write

buffer entries

• when the processor issues an uncacheable read reference while the write buffer

has uncacheable write data
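
The ordering rules above can be expressed as a small decision function. This is a sketch under assumptions the text does not state: that in the two exception cases the pending write is serviced before the read, and that "target address" means an exact address match. `next_request` is a hypothetical name.

```python
def next_request(read, pending_writes):
    """Decide which request is serviced next.

    read: None, or a tuple (address, uncached) for the pending read.
    pending_writes: list of (address, uncached) tuples in the write buffer.
    Returns "read", "write" or None when nothing is pending.
    """
    if read is None:
        return "write" if pending_writes else None
    if not pending_writes:
        return "read"
    read_addr, read_uncached = read
    for write_addr, write_uncached in pending_writes:
        if write_addr == read_addr:
            return "write"      # read targets an address held in the buffer
        if read_uncached and write_uncached:
            return "write"      # uncached read behind an uncached write
    return "read"               # default: the read is serviced first
```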

The BC0T and BC0F instructions can be used to determine whether any data is present in the

write buffer:

If there is data in the write buffer, the coprocessor condition signal is false (0).

If there is no data in the write buffer, the coprocessor condition signal is true (1).

The following assembly language code stalls the processor until the write buffer becomes
empty.

            NOP
    Loop:   BC0F   Loop
            NOP

The following sequence of instructions also causes the TX49 to perform the same action.

Appended to a store instruction, the SYNC instruction ensures that the store instruction

initiated prior to this instruction is completed before any instruction after this instruction is

allowed to start.

SYNC

11. CPU Exception

11.1 Introduction

This chapter describes CPU exception processing. The chapter concludes
with a description of each exception’s cause, together with the manner in which the CPU
processes and services these exceptions.

11.2 Exception Vector Locations

Exception vectors, except for the Debug exception vector, are located in kseg0 or
kseg1. The vector address of the ColdReset, SoftReset and NMI exceptions is always
in a non-cacheable area of kseg1. Vector addresses of the other exceptions depend on the
BEV bit of the Status register. When BEV is 0, these exceptions are vectored to a
cacheable area of kseg0. When BEV is 1, all vector addresses are in a non-cacheable area
of kseg1.

Table 11-1 lists the exception vector locations.

Table 11-1 Exception Vector Locations

Exception                               TX49 Vector Address (virtual address)
                                        (BEV = 0)                (BEV = 1)
ColdReset, SoftReset, NMI               0xffff_ffff_bfc0_0000    0xffff_ffff_bfc0_0000
TLB refill, EXL = 0                     0xffff_ffff_8000_0000    0xffff_ffff_bfc0_0200
XTLB refill, EXL = 0 (X = 64 bit TLB)   0xffff_ffff_8000_0080    0xffff_ffff_bfc0_0280
Others (common exception)               0xffff_ffff_8000_0180    0xffff_ffff_bfc0_0380

Exception                               TX49 Vector Address (physical address)
                                        (BEV = 0)                (BEV = 1)
ColdReset, SoftReset, NMI               0x0_1fc0_0000            0x0_1fc0_0000
TLB refill, EXL = 0                     0x0_0000_0000            0x0_1fc0_0200
XTLB refill, EXL = 0 (X = 64 bit TLB)   0x0_0000_0080            0x0_1fc0_0280
Others (common exception)               0x0_0000_0180            0x0_1fc0_0380
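
The regular structure of Table 11-1 (a base address chosen by BEV plus a fixed offset for each exception class) can be captured in a few lines. The sketch below is illustrative; the dictionary keys and function names are hypothetical, and the physical address is obtained by keeping the low 29 bits, which holds for the unmapped kseg0/kseg1 vectors in the table.

```python
RESET_VECTOR = 0xFFFF_FFFF_BFC0_0000
VECTOR_OFFSETS = {"tlb_refill": 0x000, "xtlb_refill": 0x080, "common": 0x180}


def exception_vector(kind, bev):
    """Virtual vector address for an exception class, per Table 11-1."""
    if kind in ("cold_reset", "soft_reset", "nmi"):
        return RESET_VECTOR                 # independent of BEV
    base = 0xFFFF_FFFF_BFC0_0200 if bev else 0xFFFF_FFFF_8000_0000
    return base + VECTOR_OFFSETS[kind]


def vector_physical(vaddr):
    """Physical address of a kseg0/kseg1 vector (low 29 bits)."""
    return vaddr & 0x1FFF_FFFF
```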

The cache error exception does not occur because the TX49 primary caches have no parity
bits. The Debug exception requires care because it uses a special vector address (see 14.9.5).

Table 11-2 shows the list of the debug exception vector locations.

Table 11-2 Debug Exception Vector Locations

Exception   TX49 Debug Exception Vector Address (virtual address)
            (ProbEnb = 0)            (ProbEnb = 1)
Debug       0xffff_ffff_bfc0_0400    0xffff_ffff_ff20_0200

Exception   TX49 Debug Exception Vector Address (physical address)
            (ProbEnb = 0)            (ProbEnb = 1)
Debug       0x0_1fc0_0400            0xf_ff20_0200

11.3 Priority of Exception

More than one exception may be raised for the same instruction, in which case only the

exception with the highest priority is reported. The TX49 Processor Core instruction

exception priority is shown in Table 11-3.

Table 11-3 Priority of Exception

Priority   Exception                                          Mnemonic
High       Cold Reset
           Soft Reset
           NMI
           Address error    Inst. Fetch                       AdEL
           TLB refill       Inst. Fetch                       TLBL
           TLB invalid      Inst. Fetch                       TLBL
           Bus error        Inst. Fetch                       IBE
           Integer overflow, Trap, System Call, Breakpoint,   Ov, Tr, Sys, Bp, RI, CpU, FPE
           Reserved Instruction, Coprocessor Unusable, or
           Floating-Point Exception
           Address error    Data access                       AdEL/AdES
           TLB refill       Data access                       TLBL/TLBS
           TLB invalid      Data access                       TLBL/TLBS
           TLB modified     Data write                        Mod
           Bus error        Data access                       DBE
Low        Interrupt                                          Int
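
As an illustration of how Table 11-3 is applied, the sketch below picks the exception that would be reported when several conditions are raised for the same instruction. The row labels and the name `reported_exception` are choices made for this example; the instruction-execution group (Ov, Tr, Sys, Bp, RI, CpU, FPE) is treated as a single priority level, so ordering within that group is not modelled.

```python
PRIORITY = [
    "Cold Reset",
    "Soft Reset",
    "NMI",
    "Address error (inst. fetch)",
    "TLB refill (inst. fetch)",
    "TLB invalid (inst. fetch)",
    "Bus error (inst. fetch)",
    "Instruction execution (Ov, Tr, Sys, Bp, RI, CpU, FPE)",
    "Address error (data access)",
    "TLB refill (data access)",
    "TLB invalid (data access)",
    "TLB modified (data write)",
    "Bus error (data access)",
    "Interrupt",
]


def reported_exception(pending):
    """Return the highest-priority pending exception, or None if none."""
    pending = list(pending)
    if not pending:
        return None
    # A lower position in PRIORITY means a higher priority.
    return min(pending, key=PRIORITY.index)
```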

General exceptions (i.e., exceptions other than debug exceptions) are prioritized as follows:

1. If more than one exception condition occurs for a single instruction, only the exception
with the highest priority is reported, as shown in Table 11-3 (from highest to lowest
priority).

2. If two instructions cause exception conditions in the M and E stages of the pipeline
simultaneously, the instruction in the M stage causes the processor to take an
exception.

3. When 64-bit instructions are executed in 32-bit mode, the Reserved Instruction (RI)
exception can occur simultaneously with another exception, as shown below. In that case,
the RI exception is given precedence.

• RI and CpU

• RI and Ov

• RI and AdEL/S (data)

• RI and TLBL/S (data)

General and debug exceptions are prioritized as follows:

1. If a general exception condition and a debug exception condition occur for a single

instruction, the debug exception is serviced first, and then the general exception is

serviced.

2. If two instructions cause exception conditions in the M and E stages of the pipeline
simultaneously, only the instruction in the M stage generates an exception.

For details on debug exceptions, see Section 14.9.

11.4 ColdReset Exception

11.4.1 Cause

This ColdReset exception occurs when the GCOLDRESET* signal is asserted and then

deasserted. This exception is not maskable.

11.4.2 Processing

A special interrupt vector that resides in an unmapped and uncached area is used. It is
therefore not necessary for hardware to initialize the TLB and cache memory in order to
process this exception. The vector location of this exception is:

• In 32-bit mode, 0xbfc0_0000 (virtual address), 0x0_1fc0_0000 (physical address)

• In 64-bit mode, 0xffff_ffff_bfc0_0000 (virtual address), 0x0_1fc0_0000 (physical
  address)

The contents of most registers are cleared when this exception occurs. The values of
these bits are listed in the tables of Section 7.

Valid bits, Lock bits and FIFO replacement bits in the instruction cache are all cleared
to 0. W bits, CS bits, Lock bits and FIFO replacement bits in the data cache are all cleared
to 0.

If a ColdReset exception occurs during a bus cycle, the current bus cycle is aborted and
the exception is taken.

11.4.3 Servicing

The ColdReset exception is serviced by;

• initializing all registers, coprocessor registers, caches and the memory system

• performing diagnostic tests

• bootstrapping the operating system

11.5 SoftReset Exception

11.5.1 Cause

This SoftReset exception occurs when the GRESET* signal is asserted and then

deasserted. This exception is not maskable.

11.5.2 Processing

A special interrupt vector that resides in an unmapped and uncached area is used. It is
therefore not necessary for hardware to initialize the TLB and cache memory in order to
process this exception. The vector location of this exception is:

• In 32-bit mode, 0xbfc0_0000 (virtual address), 0x0_1fc0_0000 (physical address)

• In 64-bit mode, 0xffff_ffff_bfc0_0000 (virtual address), 0x0_1fc0_0000 (physical
  address)

All register contents are retained except for the following.

• ErrorEPC register, which contains the restart PC

• ERL, SR and BEV bits of the Status register, which are set to “1”

Because the SoftReset exception can abort cache and bus operations, the cache and memory
state is undefined when this exception occurs.

11.5.3 Servicing

The SoftReset exception is serviced by saving the current processor state for diagnostic
purposes, and reinitializing the system as for the ColdReset exception.

11.6 NMI (Non-maskable Interrupt) Exception

11.6.1 Cause

The NMI (Non-maskable Interrupt) exception occurs at the falling edge of the GNMI*
signal. This interrupt is not maskable, and occurs regardless of the EXL, ERL and IE bits
of the Status register.

11.6.2 Processing

The same special interrupt vector as for the ColdReset/SoftReset exceptions (0xbfc0_0000/
0xffff_ffff_bfc0_0000) is used. This vector is located within an unmapped and uncached
area so that the cache and TLB need not be initialized to process this exception. When
this exception occurs, the SR bit of the Status register is set.

Because the NMI exception can occur in the midst of another exception, it is not normally
possible to continue program execution after servicing the NMI exception.

Unlike the ColdReset/SoftReset exceptions, but like other exceptions, this exception
occurs at an instruction boundary. The states of the primary cache and memory system
are preserved by this exception.

All register contents are retained except for the following.

• ErrorEPC register, which contains the restart PC
  If the exception-causing instruction is in a branch delay slot, the ErrorEPC
  register is set as an indication.

• ERL, SR and BEV bits of the Status register, which are set to 1.

11.6.3 Servicing

The NMI exception is serviced by saving the current processor state for diagnostic

purposes, and reinitializing the system for the ColdReset exception.

11.7 Address Error Exception

11.7.1 Cause

The Address Error exception occurs when an attempt is made to execute one of the

following.

• load or store a doubleword that is not aligned on a doubleword boundary

• load, fetch or store a word that is not aligned on a word boundary

• load or store a halfword that is not aligned on a halfword boundary

• reference Kernel mode address while in User or Supervisor mode

• reference Supervisor mode address while in User mode

This exception is not maskable.

11.7.2 Processing

The common exception vector is used. ExcCode AdEL or AdES in the Cause register is set
depending on whether the memory access attempt was a load or store. When this
exception is raised, the misaligned virtual address causing the exception, or the protected
virtual address that was illegally referenced, is placed in the BadVAddr register. The
contents of the VPN field of the Context and EntryHi registers are undefined, as are the
contents of the EntryLo register.

The following operation is performed only if the EXL bit of the Status register is 0. The
EPC register holds the address of the instruction that caused the exception. If the
affected instruction was in a branch delay slot (for execution during a branch), the
address of the immediately preceding branch instruction is retained in the EPC register
and the BD bit of the Cause register is set to “1”.

11.7.3 Servicing

The process executing at the time is handed a segmentation violation signal. This error

is usually fatal to the process incurring the exception.

11.8 TLB Refill Exception

11.8.1 Cause

The TLB refill exception occurs when there is no TLB entry to match a reference to a

mapped address. This exception is not maskable.

11.8.2 Processing

There are two special exception vectors for this exception: one for references to a 32-bit
virtual address, and one for references to a 64-bit virtual address. The KX, SX and UX bits
of the Status register determine whether a Kernel, Supervisor or User address reference
uses 32-bit mode or 64-bit mode. When the EXL bit of the Status register is 0, all
references use these vectors. When this exception occurs, the TLBL or TLBS code is set in
the ExcCode field of the Cause register. This code indicates whether the instruction, as
shown by the EPC register and the BD bit of the Cause register, caused the miss by an
instruction reference, load operation, or store operation.

When this exception occurs:

• The BadVAddr, Context, XContext and EntryHi registers hold the virtual address that
failed address translation

• The EntryHi register also contains the ASID from which the translation fault occurred

• A valid address in which to place the replacement TLB entry is contained in the
Random register

• The contents of the EntryLo register are undefined

Only when the EXL bit of the Status register is 0, the following operation is performed:
the EPC register holds the address of the instruction that caused the exception. If the
affected instruction was in a branch delay slot, the EPC register instead holds the address
of the immediately preceding branch instruction, and the BD bit of the Cause register is
set to 1.

11.8.3 Servicing

To service this exception, the contents of the Context or XContext register are used as a

virtual address to fetch memory locations containing the physical page frame and access

control bits for a pair of TLB entries. The two entries are placed into the

EntryLo0/EntryLo1 register; the EntryHi and EntryLo registers are written into the TLB.

It is possible that the virtual address used to obtain the physical address and access

control information is on a page that is not resident in the TLB. This condition is

processed by allowing a TLB refill exception in the TLB refill handler. This second

exception goes to the common exception vector because the EXL bit of the Status register

is set.
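The lookup described above can be modelled in C as follows. This is only a sketch under stated assumptions: a real refill handler is a few assembly instructions ending in a TLB write, and here the page table is a plain array indexed by the BadVPN2 field rather than a virtual address. The Context layout (BadVPN2 in bits 22:4 in 32-bit mode) follows the standard MIPS III definition and is an assumption, and all struct and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* One page-table slot: the EntryLo0/EntryLo1 images for an even/odd page pair. */
struct pte_pair { uint32_t lo0, lo1; };

/* Hypothetical model of the two registers the handler stages before the TLB write. */
struct tlb_regs { uint32_t entrylo0, entrylo1; };

/* Assumed 32-bit Context layout: PTEBase in bits 31:23, BadVPN2 in bits 22:4. */
static uint32_t context_badvpn2(uint32_t context)
{
    return (context >> 4) & UINT32_C(0x7FFFF);
}

/* Fetch the pair selected by Context and place it in EntryLo0/EntryLo1.
   Returns 0 on success, -1 if the index falls outside the modelled table
   (the situation a real handler would meet as a second, nested TLB miss). */
static int refill_lookup(const struct pte_pair *table, size_t entries,
                         uint32_t context, struct tlb_regs *regs)
{
    uint32_t idx = context_badvpn2(context);
    if (idx >= entries)
        return -1;
    regs->entrylo0 = table[idx].lo0;
    regs->entrylo1 = table[idx].lo1;
    return 0;   /* the handler would now write EntryHi/EntryLo into the TLB */
}
```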

11.9 TLB Invalid Exception

11.9.1 Cause

The TLB Invalid exception occurs when a virtual address reference matches a TLB entry

that is marked invalid (TLB valid bit cleared). This exception is not maskable.

11.9.2 Processing

The common exception vector is used for this exception. When this exception occurs,

TLBL or TLBS code is set in the ExcCode field of Cause register. This code indicates

whether the instruction, as shown by EPC register and BD bit of Cause register, caused

the miss by an instruction reference, load operation, or store operation.

When this exception occurs:

• The BadVAddr, Context, XContext and EntryHi registers hold the virtual address that
failed address translation

• The EntryHi register also contains the ASID from which the translation fault occurred

• A valid address in which to place the replacement TLB entry is contained in the
Random register

• The contents of the EntryLo register are undefined

Only when the EXL bit of the Status register is 0, the following operation is performed:
the EPC register holds the address of the instruction that caused the exception. If the
affected instruction was in a branch delay slot, the EPC register instead holds the address
of the immediately preceding branch instruction, and the BD bit of the Cause register is
set to 1.

11.9.3 Servicing

A TLB entry is typically marked invalid when one of the following is true:

• a virtual address does not exist

• the virtual address exists, but is not in main memory (a page fault)

• a trap is desired on any reference to the page (for example, to maintain a
reference bit or during debug)

After servicing the cause of a TLB Invalid exception, the TLB entry is located with the TLB
Probe (TLBP) instruction, and replaced by an entry with that entry’s Valid bit set.

11.10 TLB Modified Exception

11.10.1 Cause

The TLB Modified exception occurs when a store operation virtual address reference to

memory matches a TLB entry that is marked valid but is not dirty and therefore is not

writable. This exception is not maskable.

11.10.2 Processing

The common exception vector is used for this exception, and Mod code in Cause register

is set.

When this exception occurs:

• The BadVAddr, Context, XContext and EntryHi registers hold the virtual address that
failed address translation

• The EntryHi register also contains the ASID from which the translation fault occurred

• The contents of the EntryLo register are undefined

Only when the EXL bit of the Status register is 0, the following operation is performed:
the EPC register holds the address of the instruction that caused the exception. If the
affected instruction was in a branch delay slot, the EPC register instead holds the address
of the immediately preceding branch instruction, and the BD bit of the Cause register is
set to 1.

11.10.3 Servicing

The kernel uses the failed virtual address or virtual page number to identify the

corresponding access control information. The page identified may or may not permit

write accesses; if writes are not permitted, a write protection violation occurs.

If write accesses are permitted, the page frame is marked dirty/writable by the kernel

in its own data structures. The TLB Probe (TLBP) instruction places the index of the

TLB entry that must be altered into the Index register. The EntryLo register is loaded

with a word containing the physical page frame and access control bits (with the D bit

set), and the EntryHi and EntryLo registers are written into the TLB.
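The decision described above can be sketched in C. This is an illustrative model, not kernel code from the manual: the EntryLo flag positions (V in bit 1, D in bit 2) follow the standard MIPS III layout and are assumptions, and the page_info structure and function name are hypothetical stand-ins for the kernel's own data structures.

```c
#include <stdint.h>

/* Assumed EntryLo flag positions (standard MIPS III layout). */
#define ENTRYLO_V (UINT32_C(1) << 1)  /* valid */
#define ENTRYLO_D (UINT32_C(1) << 2)  /* dirty, i.e. writable */

/* Hypothetical kernel record for one page. */
struct page_info {
    int write_allowed;  /* nonzero if the mapping permits stores */
    int dirty;          /* set once the page has been written */
};

/* Service a TLB Modified exception for one page.
   Returns the EntryLo image to write back into the TLB with the D bit set,
   or 0 when writes are not permitted (a write protection violation).
   A usable entry always has V set, so 0 is unambiguous as the error value. */
static uint32_t service_tlb_modified(struct page_info *pg, uint32_t entrylo)
{
    if (!pg->write_allowed)
        return 0;
    pg->dirty = 1;               /* record dirty state in the kernel's structures */
    return entrylo | ENTRYLO_D;  /* then rewrite the entry located by TLBP */
}
```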

11.11 Bus Error Exception

11.11.1 Cause

The Bus Error exception occurs when the GBUSERR* signal is asserted during a memory
read bus cycle. This exception is raised by board-level circuitry for events such as bus
time-out, backplane bus parity errors, and invalid physical memory addresses or access
types. The exception occurs during execution of the instruction causing the bus error. The
memory bus cycle ends upon notification of a bus error. When a bus error is raised during a
burst refill, the following refill is not performed. A bus error request made by asserting
the GBUSERR* signal is ignored if the TX49 is executing a cycle other than a bus cycle. It
is therefore not possible to raise a Bus Error exception in a write access that uses the
write buffer; a general interrupt must be used instead. This exception is not maskable.

11.11.2 Processing

The common exception vector is used for a Bus Error exception. The IBE or DBE code

in the ExcCode field of the Cause register is set, signifying whether the instruction (as

indicated by the EPC register and BD bit in the Cause register) caused the exception by

an instruction reference, load operation, or store operation.

The EPC register contains the address of the instruction that caused the exception,

unless it is in a branch delay slot, in which case the EPC register contains the address of

the preceding branch instruction and the BD bit of the Cause register is set.

11.11.3 Servicing

The physical address at which the fault occurred can be computed from information

available in the CP0 registers.

• If the IBE code in the Cause register is set (indicating an instruction fetch

reference), the virtual address is contained in the EPC register (or 4+ the

contents of the EPC register if the BD bit of the Cause register is set).

• If the DBE code is set (indicating a load or store reference), the instruction that
caused the exception is located at the virtual address contained in the EPC
register (or 4 + the contents of the EPC register if the BD bit of the Cause register
is set).

The virtual address of the load or store reference can then be obtained by interpreting

the instruction. The physical address can be obtained by using the TLB Probe (TLBP)

instruction and reading the EntryLo register to compute the physical page number.

The process executing at the time of this exception is handed a bus error signal, which

is usually fatal.
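"Interpreting the instruction" for a DBE fault amounts to recomputing the effective address of the load or store. The sketch below assumes the standard MIPS I-type encoding (base register in bits 25:21, signed offset in bits 15:0) and a saved copy of the general registers; the function name and the gpr array are hypothetical.

```c
#include <stdint.h>

/* Effective address of an I-type load/store:
   GPR[base] + sign-extended 16-bit offset.
   insn is the instruction word fetched from EPC (or EPC + 4 when BD is set);
   gpr is the register file saved at exception entry (hypothetical layout). */
static uint64_t load_store_vaddr(uint32_t insn, const uint64_t gpr[32])
{
    unsigned base   = (insn >> 21) & 0x1F;
    int64_t  offset = (int64_t)(insn & 0xFFFF);

    if (offset & 0x8000)        /* sign-extend the 16-bit immediate */
        offset -= 0x10000;

    return gpr[base] + (uint64_t)offset;
}
```

The resulting virtual address is what a handler would then translate to a physical address, as the paragraph above describes.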

11.12 Integer Overflow Exception

11.12.1 Cause

The Integer Overflow exception occurs when ADD, ADDI, SUB, DADD, DADDI or

DSUB instruction results in a 2’s complement overflow. This exception is not maskable.

11.12.2 Processing

The common exception vector is used for this exception, and the Ov code in the Cause
register is set.

Only when the EXL bit of the Status register is 0, the following operation is performed:
the EPC register holds the address of the instruction that caused the exception. If the
affected instruction was in a branch delay slot, the EPC register instead holds the address
of the immediately preceding branch instruction, and the BD bit of the Cause register is
set to 1.

11.12.3 Servicing

The process executing at the time of the exception is handed a floating-point

exception/integer overflow signal. This error is usually fatal to the current process.

11.13 Trap Exception

11.13.1 Cause

The Trap exception occurs when TGE, TGEU, TLT, TLTU, TEQ, TNE, TGEI, TGEIU,

TLTI, TLTIU, TEQI or TNEI instruction results in a TRUE condition. This exception is

not maskable.

11.13.2 Processing

The common exception vector is used for this exception, and the Tr code in the Cause
register is set.

Only when the EXL bit of the Status register is 0, the following operation is performed:
the EPC register holds the address of the instruction that caused the exception. If the
affected instruction was in a branch delay slot, the EPC register instead holds the address
of the immediately preceding branch instruction, and the BD bit of the Cause register is
set to 1.

11.13.3 Servicing

The process executing at the time of a Trap exception is handed a floating-point

exception/integer overflow signal. This error is usually fatal.

11.14 System Call Exception

11.14.1 Cause

The System Call exception occurs during an attempt to execute the SYSCALL

instruction. This exception is not maskable.

11.14.2 Processing

The common exception vector is used for this exception, and the Sys code in the Cause
register is set.

Only when the EXL bit of the Status register is 0, the following operation is performed:
the EPC register holds the address of the SYSCALL instruction. If the SYSCALL instruction
was in a branch delay slot, the EPC register instead holds the address of the immediately
preceding branch instruction.

If the SYSCALL instruction is in a branch delay slot, the BD bit of the Cause register is
set; otherwise this bit is cleared.

11.14.3 Servicing

When this exception occurs, control is transferred to the applicable system routine.

To resume execution, the EPC register must be altered so that the SYSCALL
instruction does not re-execute; this is accomplished by adding a value of 4 to the EPC
register (EPC register + 4) before returning.

If a SYSCALL instruction is in a branch delay slot, a more complicated algorithm,
beyond the scope of this description, may be required.

11.15 Breakpoint Exception

11.15.1 Cause

The Breakpoint exception occurs when an attempt is made to execute the BREAK

instruction. This exception is not maskable.

11.15.2 Processing

The common exception vector is used for this exception, and the Bp code in the Cause
register is set.

Only when the EXL bit of the Status register is 0, the following operation is performed:
the EPC register holds the address of the BREAK instruction. If the BREAK instruction
was in a branch delay slot, the EPC register instead holds the address of the immediately
preceding branch instruction.

If the BREAK instruction is in a branch delay slot, the BD bit of the Cause register is
set; otherwise this bit is cleared.

11.15.3 Servicing

When the Breakpoint exception occurs, control is transferred to the applicable system
routine. Additional distinctions can be made by analyzing the unused bits of the BREAK
instruction (bits 25~6), and loading the contents of the instruction whose address the EPC
register contains.

To resume execution, the EPC register must be altered so that the BREAK instruction
does not re-execute; this is accomplished by adding a value of 4 to the EPC register (EPC
register + 4) before returning.

If a BREAK instruction is in a branch delay slot, interpretation of the branch
instruction is required to resume execution.
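The servicing steps above can be sketched in C. The code field in bits 25~6 comes from the text; the BREAK encoding (SPECIAL major opcode 0 with function code 0x0D) and the BD bit position (Cause bit 31) are assumed from the standard MIPS definition, and the function names are hypothetical.

```c
#include <stdint.h>

#define CAUSE_BD (UINT32_C(1) << 31)  /* assumed position of the BD bit */

/* True if the word is a BREAK: major opcode (bits 31:26) = SPECIAL (0),
   function field (bits 5:0) = 0x0D (assumed standard encoding). */
static int is_break(uint32_t insn)
{
    return (insn >> 26) == 0 && (insn & 0x3F) == 0x0D;
}

/* The otherwise unused 20-bit code field in bits 25~6. */
static uint32_t break_code(uint32_t insn)
{
    return (insn >> 6) & UINT32_C(0xFFFFF);
}

/* Compute the resume address for the simple case.
   Returns 0 and stores EPC + 4 when the instruction was not in a delay slot;
   returns -1 when BD is set, because the branch must be interpreted instead. */
static int resume_epc(uint64_t epc, uint32_t cause, uint64_t *out)
{
    if (cause & CAUSE_BD)
        return -1;
    *out = epc + 4;
    return 0;
}
```

The same resume_epc() logic applies to SYSCALL, which the manual resumes in the same way.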

11.16 Reserved Instruction Exception

11.16.1 Cause

The Reserved Instruction exception occurs when one of the following conditions occurs:

• an attempt is made to execute an instruction with an undefined major opcode
(bits 31~26)

• an attempt is made to execute a SPECIAL instruction with an undefined minor
opcode (bits 5~0)

• an attempt is made to execute a REGIMM instruction with an undefined minor
opcode (bits 20~16)

• an attempt is made to execute 64-bit operations in 32-bit mode when in User or
Supervisor mode

• an attempt is made to execute a COPz rs instruction with an undefined minor
opcode (bits 25~21)

• an attempt is made to execute a COPz rt instruction with an undefined minor
opcode (bits 20~16)

64-bit operations are always valid in Kernel mode regardless of the value of the KX bit

in Status register. This exception is not maskable.

11.16.2 Processing

The common exception vector is used for this exception, and the RI code in the Cause
register is set.

Only when the EXL bit of the Status register is 0, the following operation is performed:
the EPC register holds the address of the instruction that caused the exception. If the
affected instruction was in a branch delay slot, the EPC register instead holds the address
of the immediately preceding branch instruction, and the BD bit of the Cause register is
set to 1.

11.16.3 Servicing

No instructions in the MIPS ISA are currently interpreted. The process executing at the
time of this exception is handed an illegal instruction/reserved operand fault signal. This
error is usually fatal.

11.17 Coprocessor Unusable Exception

11.17.1 Cause

The Coprocessor Unusable exception occurs when an attempt is made to execute a
coprocessor instruction in any of the following cases:

• a coprocessor CPz instruction is executed when its corresponding CUz bit in the
Status register is cleared to 0

• a CP0 instruction is executed in User or Supervisor mode when the CU0 bit is
cleared to 0 (in Kernel mode, no exception is raised when a CP0 instruction is
issued, regardless of the CU0 bit setting)

• an FPU instruction is executed on a TX49 without an FPU

11.17.2 Processing

The common exception vector is used for this exception, and the CpU code in the Cause
register is set. The coprocessor unit that was referenced is indicated in the CE
(Coprocessor Error) field of the Cause register.

Only when the EXL bit of the Status register is 0, the following operation is performed:
the EPC register holds the address of the instruction that caused the exception. If the
affected instruction was in a branch delay slot, the EPC register instead holds the address
of the immediately preceding branch instruction, and the BD bit of the Cause register is
set to 1.

11.17.3 Servicing

The coprocessor unit to which an attempted reference was made is identified by the

Coprocessor Usage Error field, which results in one of the following situations:

• If the process is entitled access to the coprocessor, the coprocessor is marked

usable and the corresponding user state is restored to the coprocessor.

• If the process is entitled access to the coprocessor, but the coprocessor does not

exist or has failed, interpretation of the coprocessor instruction is possible.

• If the BD bit is set in the Cause register, the branch instruction must be

interpreted; then the coprocessor instruction can be emulated and execution

resumed with the EPC register advanced past the coprocessor instruction.

• If the process is not entitled access to the coprocessor, the process executing at
the time is handed an illegal instruction/privileged instruction fault signal. This
error is usually fatal.
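The first servicing case (marking the coprocessor usable) can be sketched in C. The field positions are assumptions taken from the standard MIPS III layout, not from this section: the CE field in Cause bits 29:28 and the CU0 to CU3 bits in Status bits 28 to 31. The function names are hypothetical.

```c
#include <stdint.h>

/* Coprocessor unit number reported in the Cause register
   (assumed CE field position: bits 29:28). */
static unsigned cause_ce(uint32_t cause)
{
    return (unsigned)((cause >> 28) & 0x3);
}

/* Return a Status image with the CU bit for the given unit set
   (assumed CU0..CU3 positions: Status bits 28..31). */
static uint32_t status_enable_cu(uint32_t status, unsigned unit)
{
    return status | (UINT32_C(1) << (28 + (unit & 0x3)));
}
```

A kernel that grants access would write the result of status_enable_cu(status, cause_ce(cause)) back to the Status register and restore the coprocessor state before returning.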

11.18 Floating-Point Exception

11.18.1 Cause

The Floating-Point exception is used by the floating-point coprocessor. This exception is

not maskable.

11.18.2 Processing

The common exception vector is used for this exception, and the FPE code in the Cause
register is set. The Floating-Point Control/Status register indicates the cause of this
exception.

Only when the EXL bit of the Status register is 0, the following operation is performed:
the EPC register holds the address of the instruction that caused the exception. If the
affected instruction was in a branch delay slot, the EPC register instead holds the address
of the immediately preceding branch instruction, and the BD bit of the Cause register is
set to 1.

11.18.3 Servicing

This exception is cleared by clearing the appropriate bit in the Floating-Point
Control/Status register.

For an unimplemented instruction exception, the kernel should emulate the instruction;

for other exceptions, the kernel should pass the exception to the user program that caused

the exception.

11.19 Interrupt Exception

11.19.1 Cause

The Interrupt exception is raised by any of eight interrupts (two software and six

hardware). A hardware interrupt is raised when GINT* signal goes active. A software

interrupt is raised by setting the IP[1]/IP[0] bit in Cause register. The significance of

these interrupts is dependent upon the specific system implementation.

Each of the eight interrupts can be masked individually by clearing its corresponding
bit in the IM (Interrupt Mask) field of the Status register, and all interrupts can be
masked at once by clearing the IE bit of the Status register to 0.

If GTINTDIS is low when a Reset exception occurs, GINT[5]* is disabled and the
timer exception is enabled.
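The masking rules can be expressed as a small C predicate. This is a sketch under assumptions from the standard MIPS III register layout, not a description of TX49 hardware internals: IP in Cause bits 15:8, IM in Status bits 15:8, IE in Status bit 0, EXL in bit 1 and ERL in bit 2. It also assumes that interrupts are blocked while either EXL or ERL is set. The function name is hypothetical.

```c
#include <stdint.h>

#define STATUS_IE  (UINT32_C(1) << 0)  /* global interrupt enable */
#define STATUS_EXL (UINT32_C(1) << 1)  /* exception level */
#define STATUS_ERL (UINT32_C(1) << 2)  /* error level */

/* Returns the interrupts that would be accepted (bit n corresponds to IP[n]),
   or 0 if none is both pending and enabled. */
static unsigned deliverable_interrupts(uint32_t status, uint32_t cause)
{
    unsigned ip, im;

    if (!(status & STATUS_IE) || (status & (STATUS_EXL | STATUS_ERL)))
        return 0;                     /* masked globally or by exception state */

    ip = (unsigned)((cause  >> 8) & 0xFF);  /* pending interrupts */
    im = (unsigned)((status >> 8) & 0xFF);  /* individually enabled interrupts */
    return ip & im;
}
```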

11.19.2 Processing

The common exception vector is used, as follows:

• In 32-bit mode: 0x8000 0180 (BEV = 0) or 0xbfc0 0380 (BEV = 1)

• In 64-bit mode: 0xffff ffff 8000 0180 (BEV = 0) or 0xffff ffff bfc0 0380 (BEV = 1)

11.19.3 Servicing

If the interrupt is caused by one of the two software-generated exceptions (SW1 or
SW0), the interrupt condition is cleared by setting the corresponding Cause register bit to
0.

If the interrupt is hardware-generated, the interrupt condition is cleared by correcting
the condition causing the interrupt pin to be asserted.

If the interrupt is a timer interrupt, the interrupt condition is cleared by changing the
value of the Compare register or by setting the corresponding Cause register bit (IP[7])
to 0.

Interrupts are not accepted when the Status register has EXL = 1 or ERL = 1.

Note: Because of the write buffer, a store to an external device will not necessarily occur
until after other instructions in the pipeline finish. Thus, the user must ensure that the
store occurs before the return from exception instruction (ERET) is executed; otherwise
the interrupt may be serviced again even though no interrupt should be pending.

11.20 Exception Handling and Servicing Flowcharts

The remainder of this chapter contains flowcharts for the following exceptions and

guidelines for their handlers:

• general exceptions and their exception handler

• TLB/XTLB miss exception and their exception handler

• Cold Reset, Soft Reset and NMI exceptions, and a guideline to their handler.

Generally speaking, the exceptions are handled by hardware (HW); the exceptions are then

serviced by software (SW).

This flowchart applies to exceptions other than Reset, Soft Reset, NMI or a first-level
miss. (Note: interrupts can be masked by IE or IM.) The hardware performs the following
steps in order:

• Set the FP Control/Status register (set only if the respective exception occurs).

• EnHi ← VPN2, ASID and X/Context ← VPN2 (set only for TLB Invalid, Modified and Refill
exceptions).

• Set the Cause register (ExcCode, CE).

• Set BadVA (set only for TLB Invalid, Modified and Refill exceptions; not set for a
Bus Error).

• Check EXL (SR bit 1) to see whether the exception occurred within another exception.
If EXL = 0 and the instruction is in a branch delay slot, Cause bit 31 (BD) ← 1 and
EPC ← (PC - 4); if EXL = 0 and it is not, Cause bit 31 (BD) ← 0 and EPC ← PC.
If EXL = 1, this step is skipped.

• EXL ← 1 (the processor is forced to Kernel mode and interrupts are disabled).

• Check BEV. If BEV = 0 (normal), PC ← 0xFFFF FFFF 8000 0000 + 180 (unmapped, cached).
If BEV = 1 (bootstrap), PC ← 0xFFFF FFFF BFC0 0200 + 180 (unmapped, uncached).

• Continue to the General Exception Servicing Guidelines.

Figure 11-1 General Exception Handler (HW)

The software steps shown in the flowchart, with their comments, are:

• On entry: the vector is unmapped, so TLB Modified, TLB Invalid and TLB Refill
exceptions are not possible. EXL = 1, so Interrupt exceptions are disabled. The
OS/system must avoid all other exceptions. Only Cold Reset, Soft Reset and NMI
exceptions are possible.

• MFC0: read X/Context, EPC, Status and Cause. Save the context (register file and so
on).

• Optional (check only on a second-level TLB miss): test Status bit 21 (TS) (*). If
TS = 1, reset the processor; if TS = 0, continue.

• MTC0 (set Status bits): KSU ← 00, EXL ← 0 and IE = 1. This step is optional and serves
only to enable interrupts while keeping Kernel mode. After EXL = 0, all exceptions are
allowed (except interrupts masked by IE or IM).

• Check the Cause register and jump to the appropriate service code.

• Service code.

• EXL = 1, then MTC0: EPC and Status.

• ERET. ERET is not allowed in the branch delay slot of another jump instruction. The
processor does not execute the instruction in the ERET's branch delay slot.
PC ← EPC; EXL ← 0; LLbit ← 0.

(*) Reserved for TX49.

Figure 11-2 General Exception Servicing Guidelines (SW)

The hardware performs the following steps in order:

• EnHi ← VPN2, ASID and X/Context ← VPN2; set the Cause register (ExcCode, CE) and set
BadVA.

• Check EXL (SR bit 1) to see whether the exception occurred within another exception.

• If EXL = 0: if the instruction is in a branch delay slot, EPC ← (PC - 4) and Cause
bit 31 (BD) ← 1; otherwise EPC ← PC and Cause bit 31 (BD) ← 0. Then, if this is an XTLB
exception, Vec. Off. = 0x080; otherwise Vec. Off. = 0x000. Both offsets point to the
Refill exception vectors.

• If EXL = 1: Vec. Off. = 0x180, which points to the general exception vector.

• EXL ← 1 (the processor is forced to Kernel mode and interrupts are disabled).

• Check BEV (SR bit 22). If BEV = 0 (normal), PC ← 0xFFFF FFFF 8000 0000 + Vec. Off.
(unmapped, cached). If BEV = 1 (bootstrap), PC ← 0xFFFF FFFF BFC0 0200 + Vec. Off.
(unmapped, uncached).

• Continue to the TLB/XTLB Exception Servicing Guidelines.

Figure 11-3 TLB/XTLB Miss Exception Handler (HW)
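The vector selection in Figure 11-3 can be sketched as a C function. This is an illustration of the flowchart's decision logic, not a description of the hardware implementation; the function name is hypothetical. The bootstrap base used here is 0xFFFF FFFF BFC0 0200, which matches Figure 11-1 and the interrupt vector addresses listed in section 11.19.2.

```c
#include <stdint.h>

#define STATUS_EXL (UINT32_C(1) << 1)   /* SR bit 1 */
#define STATUS_BEV (UINT32_C(1) << 22)  /* SR bit 22 */

/* Address the PC is loaded with on a TLB/XTLB miss.
   status  - Status register value at the time of the miss (before EXL is set)
   is_xtlb - nonzero for a miss on a 64-bit (XTLB) reference */
static uint64_t tlb_miss_vector(uint32_t status, int is_xtlb)
{
    uint64_t base = (status & STATUS_BEV) ? UINT64_C(0xFFFFFFFFBFC00200)
                                          : UINT64_C(0xFFFFFFFF80000000);
    uint64_t offset;

    if (status & STATUS_EXL)
        offset = 0x180;                   /* nested: general exception vector */
    else
        offset = is_xtlb ? 0x080 : 0x000; /* first-level refill vectors */

    return base + offset;
}
```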

The software steps shown in the flowchart, with their comments, are:

• On entry: the vector is unmapped, so TLB Modified, TLB Invalid and TLB Refill
exceptions are not possible. EXL = 1, so Interrupt exceptions are disabled. The
OS/system must avoid all other exceptions. Only Cold Reset, Soft Reset and NMI
exceptions are possible.

• MFC0: Context.

• Service code. Load the mapping of the virtual address in the Context register, move it
to EntryLo and write it into the TLB. There could be a TLB miss again during the mapping
of the data or instruction address. The processor will jump to the general exception
vector since EXL is 1. (There is an option to complete the first-level refill in the
general exception handler, or to ERET to the original instruction and take the exception
again.)

• ERET. ERET is not allowed in the branch delay slot of another jump instruction. The
processor does not execute the instruction in the ERET's branch delay slot.
PC ← EPC; EXL ← 0; LLbit ← 0.

Figure 11-4 TLB/XTLB Exception Servicing Guidelines (SW)

Cold Reset, Soft Reset & NMI Exception Handling (HW)

• Cold Reset exception: Random ← TLBENTRIES-1; Wired ← 0; Status: BEV ← 1, TS ← 0 (*),
SR ← 0, ERL ← 1.

• Soft Reset or NMI exception: Status: BEV ← 1, TS ← 0 (*), SR ← 1, ERL ← 1.

• The flowchart then shows ErrorEPC ← PC and PC ← 0xFFFF FFFF BFC0 0000.

Cold Reset, Soft Reset & NMI Servicing Guidelines (SW)

• Check Status bit 20 (SR). If SR = 0, run the Cold Reset service code.

• If SR = 1, determine whether the exception is an NMI. If yes, run the NMI service code;
if no, run the Soft Reset service code. ERET afterwards is optional.

Note: There is no indication from the processor to differentiate between NMI and Soft
Reset; there must be a system-level indication.

(*) Reserved for TX49.

Figure 11-5 Cold Reset, Soft Reset & NMI Exception Handling (HW) and
Servicing Guidelines (SW)

12. Floating-Point Unit, CP1

This chapter describes the floating-point operations, including the programming model,

instruction set and formats.

The floating-point operations fully conform to the requirements of ANSI/IEEE Standard 754-

1985, IEEE Standard for Binary Floating-Point Arithmetic.

12.1 Overview

All floating-point instructions, as defined in the MIPS ISA for the floating-point coprocessor,
CP1, are processed by a hardware unit separate from the one that executes integer instructions.

The execution of floating-point instructions can be disabled by the coprocessor usability CU

bit defined in the CP0 Status register.

12.2 Floating Point Register

12.2.1 Floating-Point General Registers (FGRs)

CP1 has a set of Floating-Point General Purpose registers (FGRs) that can be accessed

in the following ways:

• As 32 general purpose registers (32 FGRs), each of which is 32 bits wide when the

FR bit in the CPU Status register equals 0; or as 32 general purpose registers (32

FGRs), each of which is 64-bits wide when FR equals 1. The CPU accesses these

registers through MOVE, LOAD, and STORE instructions.

• As 16 floating-point registers (see the next section for a description of FPRs), each

of which is 64-bits wide, when the FR bit in the CPU Status register equals 0.

The FPRs hold values in either single- or double-precision floating-point format.

Each FPR corresponds to adjacently numbered FGRs as shown in Figure 12-1.

• As 32 floating-point registers (see the next section for a description of FPRs), each

of which is 64-bits wide, when the FR bit in the CPU Status register equals 1.

The FPRs hold values in either single- or double-precision floating-point format.

Each FPR corresponds to an FGR as shown in Figure 12-1.

FR = 0: each Floating-Point Register (FPR) is formed from a pair of 32-bit
Floating-Point General Purpose Registers (FGR, bits 31..0)

FPR       Least-significant word    Most-significant word
FPR0      FGR0                      FGR1
FPR2      FGR2                      FGR3
...
FPR28     FGR28                     FGR29
FPR30     FGR30                     FGR31

FR = 1: each Floating-Point Register (FPR) corresponds to one 64-bit
Floating-Point General Purpose Register (FGR, bits 63..0)

FPR       FGR
FPR0      FGR0
FPR1      FGR1
FPR2      FGR2
FPR3      FGR3
...
FPR28     FGR28
FPR29     FGR29
FPR30     FGR30
FPR31     FGR31

Floating-Point Control Registers (FCR), each 32 bits wide (bits 31..0):
Control/Status register (FCR31) and Implementation/Revision register (FCR0)

Figure 12-1 FP Registers
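The FR = 0 pairing in Figure 12-1 can be illustrated with a short C model, assuming the FGRs are held in a plain array (a hypothetical representation; on real hardware they are accessed with coprocessor move, load and store instructions). The function name is likewise hypothetical.

```c
#include <stdint.h>

/* Read a 64-bit FPR when FR = 0.
   FPRn (n even) is formed from FGRn (least-significant word)
   and FGRn+1 (most-significant word). */
static uint64_t fpr_read_fr0(const uint32_t fgr[32], unsigned n)
{
    n &= 0x1E;  /* force an even register number in the range 0..30 */
    return ((uint64_t)fgr[n + 1] << 32) | fgr[n];
}
```

For example, with FGR2 = 0x00000000 and FGR3 = 0x3FF00000, FPR2 reads as 0x3FF0000000000000, the double-precision encoding of 1.0.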

12.2.2 Floating-Point Control Registers

The MIPS RISC architecture defines 32 floating-point control registers (FCRs); the

TX49 processor implements two of these registers: FCR0 and FCR31. These FCRs are

described below:

• The Implementation/Revision register (FCR0) holds revision information.

• The Control/Status register (FCR31) controls and monitors exceptions, holds the

result of compare operations, and establishes rounding modes.

• FCR1 to FCR30 are reserved.

Table 12-1 lists the assignments of the FCRs.

Table 12-1 Floating-Point Control Register Assignments

FCR Number       Use
FCR0             Coprocessor implementation and revision register
FCR1 to FCR30    Reserved
FCR31            Rounding mode, cause, trap enables, and flags

Implementation and Revision Register, (FCR0)

The read-only Implementation and Revision register (FCR0) specifies the

implementation and revision number of CP1. This information can determine the

coprocessor revision and performance level, and can also be used by diagnostic software.

Figure 12-2 shows the layout of the register; Table 12-2 describes the Implementation

and Revision register (FCR0) fields.

Implementation/Revision Register (FCR0)

Bits     31..16    15..8    7..0
Field    0         Imp      Rev
Width    16        8        8

Figure 12-2 Implementation/Revision Register

Table 12-2 FCR0 Fields

Field    Description
Imp      Implementation number
Rev      Revision number in the form of y.x
0        Reserved. Returns zeroes when read.

The revision number is a value of the form y.x, where:

• y is a major revision number held in bits 7:4.

• x is a minor revision number held in bits 3:0.
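The field layout above translates directly into shifts and masks. The following C sketch decodes a value read from FCR0; the struct and function names are hypothetical, and how the value is obtained (for example with a CFC1 instruction) is outside the sketch.

```c
#include <stdint.h>

/* Decoded contents of the Implementation/Revision register. */
struct fcr0_info {
    unsigned imp;    /* implementation number, bits 15:8 */
    unsigned major;  /* major revision y, bits 7:4 */
    unsigned minor;  /* minor revision x, bits 3:0 */
};

static struct fcr0_info decode_fcr0(uint32_t fcr0)
{
    struct fcr0_info info;
    info.imp   = (unsigned)((fcr0 >> 8) & 0xFF);
    info.major = (unsigned)((fcr0 >> 4) & 0xF);
    info.minor = (unsigned)(fcr0 & 0xF);
    return info;
}
```

With a made-up register value of 0x2D31, the decoder reports implementation 0x2D and revision 3.1.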

Control/Status Register (FCR31)

The Control/Status register (FCR31) contains control and status information that can

be accessed by instructions in either Kernel or User mode. FCR31 also controls the

arithmetic rounding mode and enables User mode traps, as well as identifying any

exceptions that may have occurred in the most recently executed floating-point

instruction, along with any exceptions that may have occurred without being trapped.

Figure 12-3 shows the format of the Control/Status register, and Table 12-3 describes

the Control/Status register fields. Figure 12-4 shows the Control/Status register Cause,

Flag, and Enable fields.

Control/Status Register (FCR31)

Bits     31..25   24   23   22..18   17..12        11..7       6..2        1..0
Field    0        FS   C    0        Cause         Enables     Flags       RM
                                     E V Z O U I   V Z O U I   V Z O U I
Width    7        1    1    5        6             5           5           2

Figure 12-3 FP Control/Status Register Bit Assignments

Table 12-3 Control/Status Register Fields

Field      Description

FS         When set, denormalized results can be flushed instead of causing
           an unimplemented operation exception.

C          Condition bit. Stores the result of a compare instruction. See the
           description of the Control/Status register Condition bit.

Cause      Cause bits. These bits identify the exceptions raised by the most
           recently executed floating-point instruction. See Figure 12-4 and the
           description of the Control/Status register Cause, Flag, and Enable bits.

Enables    Enable bits. When set, these bits trap any floating-point exceptions
           to indicate that they have been passed to the CPU. See Figure 12-4
           and the description of the Control/Status register Cause, Flag, and
           Enable bits.

Flags      Flag bits. These bits indicate that an exception was raised. See
           Figure 12-4 and the description of the Control/Status register Cause,
           Flag, and Enable bits.

RM         Rounding mode bits. See Table 12-5 and the description of the
           Control/Status register Rounding Mode Control bits.

               E     V     Z     O     U     I
Cause bits     17    16    15    14    13    12
Enable bits    -     11    10    9     8     7
Flag bits      -     6     5     4     3     2

E: Unimplemented Operation
V: Invalid Operation
Z: Division by Zero
O: Overflow
U: Underflow
I: Inexact Operation

Figure 12-4 Control/Status Register Cause, Flag, and Enable Fields

Control/Status Register FS Bit

The FS bit enables the flushing of denormalized values. When the FS bit is set and the
Underflow and Inexact Enable bits are not set, denormalized results are flushed instead
of causing an Unimplemented Operation exception. Results are flushed either to 0 or to the
minimum normalized value, depending upon the rounding mode (see Table 12-4 below),
and the Underflow and Inexact Flag and Cause bits are set.

Table 12-4 Flush Values of Denormalized Results

Denormalized     Flushed Result by Rounding Mode
Result           RN      RZ      RP          RM
Positive         +0      +0      +2^Emin     +0
Negative         -0      -0      -0          -2^Emin
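Table 12-4 can be written as a small C function. This is a sketch for double precision only, under the assumption that 2^Emin corresponds to the C constant DBL_MIN (the smallest normalized double, 2^-1022 on IEEE 754 hosts); the enum and function names are hypothetical, with the enum values matching the RM field encoding in Table 12-5.

```c
#include <float.h>

/* Rounding modes, numbered as in the RM field of FCR31. */
enum rmode { RN = 0, RZ = 1, RP = 2, RM = 3 };

/* Flushed result for a denormalized double-precision result.
   negative - nonzero if the denormalized result is negative
   rm       - current rounding mode */
static double flush_denormal(int negative, enum rmode rm)
{
    if (!negative)
        return (rm == RP) ? DBL_MIN : 0.0;   /* +2^Emin only when rounding toward +inf */
    return (rm == RM) ? -DBL_MIN : -0.0;     /* -2^Emin only when rounding toward -inf */
}
```

Rounding toward the sign of the result yields the minimum normalized magnitude; every other mode yields a zero that keeps the sign of the original result.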

Control/Status Register Condition Bit

When a floating-point Compare operation takes place, the result is stored at bit 23, the

Condition bit. The C bit is set to 1 if the condition is true; the bit is cleared to 0 if the

condition is false. Bit 23 is affected only by compare and CTC1 instructions.

The BC1T and BC1F instructions test the C bit to decide whether or not to cause a

branch.

Control/Status Register Cause, Flag, and Enable Fields

Figure 12-4 illustrates the Cause, Flag, and Enable fields of the Control/Status
register. ... MOV.fmt), CTC1, reserved, and unimplemented instructions. All other
instructions have no effect on these fields.

Cause Bits

Bits 17:12 in the Control/Status register contain Cause bits, as shown in Figure
12-4, which reflect the results of the most recently executed floating-point
instruction. The Cause bits are a logical extension of the CP0 Cause register; they
identify the exceptions raised by the last floating-point operation. If the
corresponding Enable bit is set at the time of the exception, a floating-point exception
and interrupt is raised. If more than one exception occurs on a single instruction,
each appropriate bit is set.

The Cause bits are updated by most floating-point operations. The
Unimplemented Operation (E) bit is set to 1 if software emulation is required;
otherwise it remains 0. The other bits are set to 1 or 0 to indicate the occurrence or
non-occurrence (respectively) of an IEEE 754 exception. Within the set of
floating-point instructions that update the Cause bits, the Cause field indicates the
exceptions raised by the most recently executed instruction.

When a floating-point exception is taken, no results are stored, and the only state

affected is the Cause bit. Therefore, software emulation routines can use the original

values to emulate the exception-causing floating-point operation.

Enable Bits

A floating-point exception is generated any time a Cause bit and the corresponding

Enable bit are set. A floating-point operation that sets an enabled Cause bit forces

an immediate floating-point exception, as does setting both Cause and Enable bits

with CTC1. Software can also emulate above.

There is no enable for Unimplemented Operation (E). An Unimplemented Operation
exception always generates a floating-point exception.
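The rule for when a floating-point exception is generated can be sketched in C using the bit positions from Figure 12-4 (Cause in bits 17:12, Enables in bits 11:7). The function names are hypothetical, and clearing the whole Cause field in the second helper is a simplifying assumption; the text only requires the enabled Cause bits to be cleared before returning.

```c
#include <stdint.h>

#define FCR31_CAUSE_SHIFT  12                   /* Cause bits 17:12: E V Z O U I */
#define FCR31_ENABLE_SHIFT 7                    /* Enable bits 11:7:   V Z O U I */
#define FCR31_CAUSE_E      (UINT32_C(1) << 17)  /* Unimplemented Operation */

/* Nonzero if this FCR31 value would generate a floating-point exception:
   either E is set (it has no enable), or some V/Z/O/U/I Cause bit has its
   matching Enable bit set. */
static int fp_exception_pending(uint32_t fcr31)
{
    uint32_t cause5  = (fcr31 >> FCR31_CAUSE_SHIFT)  & 0x1F;  /* V Z O U I */
    uint32_t enable5 = (fcr31 >> FCR31_ENABLE_SHIFT) & 0x1F;  /* V Z O U I */

    return (fcr31 & FCR31_CAUSE_E) != 0 || (cause5 & enable5) != 0;
}

/* Value to write back with CTC1 before returning from the handler:
   the same register image with every Cause bit cleared. */
static uint32_t fcr31_clear_cause(uint32_t fcr31)
{
    return fcr31 & ~(UINT32_C(0x3F) << FCR31_CAUSE_SHIFT);
}
```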

Before returning from a floating-point exception, software must first clear the
enabled Cause bits with a CTC1 instruction to prevent a repeat of the interrupt.
Thus, User mode programs can never observe enabled Cause bits set; if this
information is required in a User mode handler, it must be passed somewhere other
than the Control/Status register.

For a floating-point operation that sets only unenabled Cause bits, no floating-

point exception occurs and the default result defined by IEEE 754 is stored. In this

case, the exceptions that were caused by the immediately previous floating-point

operation can be determined by reading the Cause field.

Flag Bits

The Flag bits are cumulative and indicate the exceptions that were raised by the

operations that were executed since the bits were explicitly reset. Flag bits are set to

1 if an IEEE 754 exception is raised, otherwise they remain unchanged. The Flag

bits are never cleared as a side effect of floating-point operations; however, they can

be set or cleared by writing a new value into the Status register, using a CTC1

instruction.
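The interaction of the Cause, Enable, and Flag fields described above can be sketched as follows. This is only an illustrative model, not the hardware's implementation: the function name `fp_step` and the field-relative bit numbering (I in bit 0 up to E in bit 5, following the E V Z O U I order of the Cause field) are assumptions made for the example.

```python
# Field-relative bit positions, following the E V Z O U I order of the Cause field.
I, U, O, Z, V, E = (1 << n for n in range(6))
IEEE_MASK = I | U | O | Z | V   # E has no Enable or Flag bit


def fp_step(cause, enables, flags):
    """Return (trap_taken, new_flags) for one floating-point operation.

    cause   -- 6-bit Cause field produced by the operation
    enables -- 5-bit Enable field
    flags   -- 5-bit Flag field before the operation
    """
    # E always traps; an IEEE cause bit traps only if its Enable bit is set.
    trap = bool(cause & E) or bool(cause & enables & IEEE_MASK)
    if trap:
        # Assumption: a trapping operation leaves the Flag bits untouched,
        # because the text says only the Cause bits are affected.
        return True, flags
    # Flags are cumulative: bits are ORed in and never cleared by an operation.
    return False, flags | (cause & IEEE_MASK)
```

For example, `fp_step(O | I, O, 0)` reports a trap and leaves the flags at 0, while `fp_step(I, 0, 0)` reports no trap and sets the Inexact flag.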

Control/Status Register Rounding Mode Control Bits

Bits 1 and 0 in the Control/Status register constitute the Rounding Mode (RM) field.

As shown in Table 12-5, these bits specify the rounding mode that CP1 uses for all

floating-point operations.

Table 12-5 Rounding Mode Bit Decoding

RM (1:0)  Mnemonic  Description

0         RN        Round result to nearest representable value; round to value with
                    least-significant bit 0 when the two nearest representable values
                    are equally near.

1         RZ        Round toward 0: round to value closest to and not greater in
                    magnitude than the infinitely precise result.

2         RP        Round toward +∞: round to value closest to and not less than the
                    infinitely precise result.

3         RM        Round toward −∞: round to value closest to and not greater than the
                    infinitely precise result.
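The four encodings can be illustrated with rounding to an integer, which is the easiest case to check by hand. This is a sketch only: `round_to_int` is a hypothetical helper, and it relies on Python's built-in `round()` using round-half-to-even, which corresponds to the RN tie rule (least-significant bit 0).

```python
import math


def round_to_int(x, rm):
    """Round a finite float to an integer using the RM encodings 0..3."""
    if rm == 0:            # RN: nearest, ties go to the even value
        return round(x)
    if rm == 1:            # RZ: toward zero
        return math.trunc(x)
    if rm == 2:            # RP: toward +infinity
        return math.ceil(x)
    if rm == 3:            # RM: toward -infinity
        return math.floor(x)
    raise ValueError("RM is a 2-bit field")
```

With x = 2.5 the four modes give 2, 2, 3, 2; with x = −2.5 they give −2, −2, −2, −3.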

12.2.3 Accessing the FP Control and Implementation/Revision Registers

The Control/Status and the Implementation/Revision registers are read by a Move

Control From Coprocessor 1 (CFC1) instruction.

The bits in the Control/Status register can be set or cleared by writing to the register

using a Move Control To Coprocessor 1 (CTC1) instruction. The Implementation/Revision

associated with floating-point control registers.

12.3 Floating-Point Formats

CP1 performs both 32-bit (single-precision) and 64-bit (double-precision) IEEE standard

floating-point operations. The 32-bit single-precision format has a 24-bit signed-magnitude

fraction field (f + s) and an 8-bit exponent (e), as shown in Figure 12-5.

Bit      31      30-23       22-0
Field    s       e           f
         Sign    Exponent    Fraction
Width    1       8           23

Figure 12-5 Single-Precision Floating-Point Format

The 64-bit double-precision format has a 53-bit signed-magnitude fraction field (f + s) and an

11-bit exponent, as shown in Figure 12-6.

Bit      63      62-52       51-0
Field    s       e           f
         Sign    Exponent    Fraction
Width    1       11          52

Figure 12-6 Double-Precision Floating-Point Format

As shown in the above figures, numbers in floating-point format are composed of three

fields:

• sign field, s

• biased exponent, e = E + bias

• fraction, f = b_1 b_2 ... b_(p−1)

The range of the unbiased exponent E includes every integer between the two values Emin

and Emax inclusive, together with two other reserved values:

• Emin − 1 (to encode 0 and denormalized numbers)

• Emax + 1 (to encode ∞ and NaNs [Not a Number])

For single- and double-precision formats, each representable nonzero numerical value has

just one encoding. For single- and double-precision formats, the value of a number, v, is

determined by the equations shown in Table 12-6.

Table 12-6 Equations for Calculating Values in Single and Double-Precision Floating-Point Format

No.  Equation

(1)  if E = Emax + 1 and f ≠ 0, then v is NaN, regardless of s

(2)  if E = Emax + 1 and f = 0, then v = (−1)^s ∞

(3)  if Emin ≤ E ≤ Emax, then v = (−1)^s 2^E (1.f)

(4)  if E = Emin − 1 and f ≠ 0, then v = (−1)^s 2^Emin (0.f)

(5)  if E = Emin − 1 and f = 0, then v = (−1)^s 0

For all floating-point formats, if v is NaN, the most-significant bit of f determines whether

the value is a signaling or quiet NaN: v is a signaling NaN if the most-significant bit of f is set,

otherwise, v is a quiet NaN.
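Equations (1) through (5) and the NaN rule can be applied to a raw single-precision bit pattern as in the sketch below. The function name `decode_single` and the returned labels are illustrative choices; the field positions and the Emax, Emin, and bias values are the single-precision ones given in this chapter.

```python
E_MAX, E_MIN, BIAS = 127, -126, 127   # single-precision parameters


def decode_single(bits):
    """Classify a 32-bit pattern and return (kind, value) per equations (1)-(5)."""
    s = bits >> 31
    e = (bits >> 23) & 0xFF        # biased exponent, e = E + bias
    f = bits & 0x7FFFFF            # 23-bit fraction
    E = e - BIAS
    sign = -1.0 if s else 1.0
    if E == E_MAX + 1:
        if f:
            # Equation (1). The most-significant fraction bit being set marks
            # a signaling NaN in this format, as stated in the text.
            return ("sNaN" if f & 0x400000 else "qNaN"), float("nan")
        return "infinity", sign * float("inf")                        # (2)
    if E == E_MIN - 1:
        if f:
            return "denormalized", sign * 2.0**E_MIN * (f / 2.0**23)  # (4)
        return "zero", sign * 0.0                                     # (5)
    return "normalized", sign * 2.0**E * (1 + f / 2.0**23)            # (3)
```

For instance, `decode_single(0x3FC00000)` yields `("normalized", 1.5)`, and the pattern 0x00000001 decodes to the smallest denormalized value, 2^−149.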

Table 12-7 defines the values for the format parameters; minimum and maximum floating-

point values are given in Table 12-8.

Table 12-7 Floating-Point Format Parameter Values

Parameter                 Single    Double

Emax                      +127      +1023
Emin                      –126      –1022
Exponent bias             +127      +1023
Exponent width in bits    8         11
Integer bit               hidden    hidden
Fraction width in bits    23†       52†
Format width in bits      32        64

† Excluding the sign bit.

Table 12-8 Minimum and Maximum Floating-Point Values

Type                             Value

Single-precision Minimum         1.40129846e-45
Single-precision Minimum Norm    1.17549435e-38
Single-precision Maximum         3.40282347e+38
Double-precision Minimum         4.9406564584124654e-324
Double-precision Minimum Norm    2.2250738585072014e-308
Double-precision Maximum         1.7976931348623157e+308
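The values in Table 12-8 follow from the parameters in Table 12-7. The sketch below recomputes them; `format_limits` is a hypothetical helper, and the arithmetic is done in Python floats (IEEE double precision), which represent all six results exactly.

```python
def format_limits(e_min, e_max, fraction_bits):
    """Return (minimum, minimum normalized, maximum) for a binary format."""
    minimum = 2.0 ** (e_min - fraction_bits)      # smallest denormalized value
    minimum_norm = 2.0 ** e_min                   # smallest normalized value
    maximum = (2.0 - 2.0 ** -fraction_bits) * 2.0 ** e_max
    return minimum, minimum_norm, maximum


single = format_limits(-126, 127, 23)
double = format_limits(-1022, 1023, 52)
```

Formatting `single` with nine significant digits reproduces the single-precision rows of the table, and `double` matches the double-precision rows.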

12.4 Binary Fixed-Point Format

Binary fixed-point values are held in 2's complement format. Unsigned fixed-point values

are not directly provided by the floating-point instruction set. Figure 12-7 illustrates binary

single fixed-point format and Figure 12-8 illustrates binary long fixed-point format; Table 12-9

lists the binary fixed-point format fields.

Bit      31      30-0
Field    Sign    Integer
Width    1       31

Figure 12-7 Binary Single Fixed-Point Format

Bit      63      62-0
Field    Sign    Integer
Width    1       63

Figure 12-8 Binary Long Fixed-Point Format

Field assignments of the binary fixed-point format are:

Table 12-9 Binary Fixed-Point Format Fields

Field      Description

sign       sign bit
integer    integer value (2's complement)

12.5 Floating-Point Instruction Set Summary

Each instruction is 32 bits long and aligned on a word boundary. This section gives an

overview of the floating-point unit instructions. A detailed description of each instruction

is provided in Appendix B.

12.5.1 Load, Move and Store Instructions (Table 12-10)

Load and Store instructions move data between memory and FPU general purpose

registers, and Move instructions move data directly between CPU and FPU general

purpose registers. These instructions do not perform format conversions and therefore

never cause floating-point exceptions. The instruction immediately following a load can

use the contents of the loaded register. In that case, however, the hardware interlocks,

which costs additional real cycles. Load delay slots must therefore be scheduled to

avoid the interlock.

Data Alignment

All processor loads and stores reference the following aligned data items:

• For word loads and stores, the access type is always WORD, and the low-order 2 bits

of the address must always be 0.

• For doubleword loads and stores, the access type is always DOUBLEWORD, and the

low-order 3 bits of the address must always be 0.

Endian

Regardless of byte-numbering order (endianness) of the data, the address specifies the

byte that has the smallest byte address in the addressed field. For a big-endian system, it

is the leftmost byte; for a little-endian system, it is the rightmost byte.
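The alignment and byte-addressing rules above can be checked with a small sketch. Both helper names are hypothetical, and the standard `struct` module is used only to lay out a 32-bit value in big-endian or little-endian byte order.

```python
import struct


def is_aligned(address, access_type):
    """True if the address is legal for a WORD or DOUBLEWORD access."""
    low_bits = {"WORD": 2, "DOUBLEWORD": 3}[access_type]
    return (address & ((1 << low_bits) - 1)) == 0


def byte_at_field_address(value, big_endian):
    """Return the byte of a 32-bit word that sits at the word's own address."""
    memory = struct.pack(">I" if big_endian else "<I", value)
    return memory[0]   # offset 0 is the smallest byte address in the field
```

For the word 0x11223344, the addressed byte is 0x11 (leftmost) on a big-endian system and 0x44 (rightmost) on a little-endian system.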

Table 12-10 FPU Instruction Set (Optional): Load, Move and Store Instruction

Instruction Description Note

LWC1 Load Word to FPU (coprocessor 1) MIPS I

SWC1 Store Word from FPU (coprocessor 1) MIPS I

MTC1 Move Word to FPU (coprocessor 1) MIPS I

CTC1 Move Control Word to FPU (coprocessor 1) MIPS I

MFC1 Move Word from FPU (coprocessor 1) MIPS I

CFC1 Move Control Word from FPU (coprocessor 1) MIPS I

12.5.2 Conversion Instructions (Table 12-11)

Conversion instructions perform conversion operations between the various data

formats such as single- or double-precision, fixed- or floating-point formats. Table 12-11

lists the conversion instructions.

Table 12-11 FPU Instruction Set(Optional): Conversion Instruction

Instruction    Description                                           Note

CVT.S.fmt      Floating-Point Convert to Single FP Format            MIPS I
CVT.W.fmt      Floating-Point Convert to Single Fixed-Point Format   MIPS I
ROUND.W.fmt    Floating-point Round                                  MIPS II
TRUNC.W.fmt    Floating-point Truncate                               MIPS II
CEIL.W.fmt     Floating-point Ceiling                                MIPS II
FLOOR.W.fmt    Floating-point Floor                                  MIPS II

12.5.3 Computational Instructions (Table 12-12)

Computational instructions perform arithmetic operations on floating-point values in

the FPU registers. There are two categories of computational instructions:

• 3-Operand Register-Type instructions, which perform floating-point addition,

subtraction, multiplication, and division operations

• 2-Operand Register-Type instructions, which perform floating-point absolute

value, move, negate, and square root operations.

Table 12-12 FPU Instruction Set(Optional): Computational Instruction

Instruction    Description                      Note

ADD.fmt        Floating-point Add               MIPS I
SUB.fmt        Floating-point Subtract          MIPS I
MUL.fmt        Floating-point Multiply          MIPS I
DIV.fmt        Floating-point Divide            MIPS I
ABS.fmt        Floating-point Absolute Value    MIPS I
MOV.fmt        Floating-point Move              MIPS I
NEG.fmt        Floating-point Negate            MIPS I
SQRT.fmt       Floating-point Square root       MIPS II

12.5.4 Compare and Branch Instructions (Table 12-13)

Compare instructions perform comparisons of the contents of registers and set a

conditional bit based on the results. Branch on FPU Condition instructions perform a

branch to the specified target if the specified coprocessor condition is met.

Table 12-13 FPU Instruction Set(Optional): Compare and Branch Instruction

Instruction Description Note

C.cond.fmt Floating-point Compare MIPS I

BC1T Branch on FPU True MIPS I

BC1F Branch on FPU False MIPS I

BC1TL Branch on FPU True Likely MIPS II

BC1FL Branch on FPU False Likely MIPS II

The floating-point compare (C.cond.fmt) instructions interpret the contents of two FPU

registers (fs, ft) in the specified format (fmt) and arithmetically compare them. A result is

determined based on the comparison and conditions (cond) specified in the instruction.

Table 12-14 lists the mnemonics for the compare instruction conditions.

Table 12-14 Mnemonics and Definitions of Compare Instruction Conditions

Mnemonic  Definition                               Mnemonic  Definition

F         False                                    T         True
UN        Unordered                                OR        Ordered
EQ        Equal                                    NEQ       Not Equal
UEQ       Unordered or Equal                       OLG       Ordered or Less than or Greater than
OLT       Ordered Less Than                        UGE       Unordered or Greater than or Equal
ULT       Unordered or Less Than                   OGE       Ordered Greater than or Equal
OLE       Ordered Less Than or Equal               UGT       Unordered or Greater Than
ULE       Unordered or Less than or Equal          OGT       Ordered Greater Than
SF        Signaling False                          ST        Signaling True
NGLE      Not Greater than or Less than or Equal   GLE       Greater than, or Less than or Equal
SEQ       Signaling Equal                          SNE       Signaling Not Equal
NGL       Not Greater than or Less than            GL        Greater than or Less Than
LT        Less Than                                NLT       Not Less Than
NGE       Not Greater than or Equal                GE        Greater than or Equal
LE        Less than or Equal                       NLE       Not Less than or Equal
NGT       Not Greater Than                         GT        Greater Than
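The left-hand mnemonics of Table 12-14 can be modeled as a 4-bit condition code. The encoding used below is an assumption taken from the general MIPS ISA convention and is not stated in this section: bit 2 selects "less than", bit 1 "equal", bit 0 "unordered", and bit 3 requests an Invalid Operation signal when the operands are unordered. The function name `compare` is illustrative, and Python floats cannot distinguish signaling from quiet NaNs, so that case is not modeled.

```python
import math

# Index in this list is the assumed 4-bit condition code.
CONDITIONS = ["F", "UN", "EQ", "UEQ", "OLT", "ULT", "OLE", "ULE",
              "SF", "NGLE", "SEQ", "NGL", "LT", "NGE", "LE", "NGT"]


def compare(cond, fs, ft):
    """Return (condition, invalid) for a compare of two Python floats."""
    code = CONDITIONS.index(cond)
    unordered = math.isnan(fs) or math.isnan(ft)
    less = not unordered and fs < ft
    equal = not unordered and fs == ft
    condition = bool((code & 4 and less) or (code & 2 and equal)
                     or (code & 1 and unordered))
    invalid = bool(code & 8) and unordered
    return condition, invalid
```

The right-hand mnemonics are the logical complements of the left-hand ones, so they would be tested with the opposite branch sense (BC1F instead of BC1T).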

13. Floating-Point Exception

13.1 Introduction

This chapter describes floating-point exceptions, including FPU exception types, exception

trap processing, exception flags, saving and restoring state when handling an exception, and

trap handlers for IEEE Standard 754 exceptions.

13.2 Exception Types

The FP Control/Status register described in Chapter 12 contains an Enable bit for each

exception type; exception Enable bits determine whether an exception will cause the FPU to

initiate a trap or set a status flag.

• If a trap is taken, the FPU remains in the state found at the beginning of the

operation and a software exception handling routine executes.

• If no trap is taken, an appropriate value is written into the FPU destination register

and execution continues.

The FPU supports the five IEEE Standard 754 exceptions:

• Inexact (I)

• Underflow (U)

• Overflow (O)

• Division by Zero (Z)

• Invalid Operation (V)

Cause bits, Enables, and Flag bits (status flags) are used.

The FPU adds a sixth exception type, Unimplemented Operation (E). This exception

indicates the use of a software implementation. The Unimplemented Operation exception has

no Enable or Flag bit; whenever this exception occurs, an unimplemented exception trap is

taken.

Figure 13-1 shows the Control/Status register bits that support exceptions.

Bit #    17  16  15  14  13  12
          E   V   Z   O   U   I    Cause Bits

Bit #    11  10   9   8   7
          V   Z   O   U   I        Enable Bits

Bit #     6   5   4   3   2
          V   Z   O   U   I        Flag Bits

E: Unimplemented, V: Invalid, Z: Division by Zero, O: Overflow, U: Underflow, I: Inexact

Figure 13-1 Control/Status Register Exception/Flag/Trap/Enable Bits
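A value read from the Control/Status register with CFC1 could be split into these fields as sketched below. The Cause position (bits 17:12) and the RM position (bits 1:0) come from Chapter 12; the Enable (11:7) and Flag (6:2) positions are assumed from the usual MIPS layout. `decode_fcsr` is a hypothetical helper name.

```python
def decode_fcsr(value):
    """Split a Control/Status value into its exception-related fields."""
    names = "IUOZV"                      # bit order, least-significant first

    def field(shift, letters):
        # Collect the letter of every bit that is set within the field.
        return {n for i, n in enumerate(letters) if (value >> (shift + i)) & 1}

    return {
        "cause": field(12, names + "E"),   # bits 17:12
        "enable": field(7, names),         # bits 11:7 (assumed layout)
        "flag": field(2, names),           # bits 6:2  (assumed layout)
        "rm": value & 3,                   # bits 1:0
    }
```

For example, a value with bits 17, 12, 11, and 2 set and RM = 3 decodes to Cause {E, I}, Enable {V}, Flag {I}, and rounding mode 3.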

13.3 Exception Trap Processing

When a floating-point exception trap is taken, the Cause register indicates the floating-point

coprocessor is the cause of the exception trap.

The Floating-Point Exception (FPE) code is used, and the Cause bits of the floating-point

Control/Status register indicate the reason for the floating-point exception. These bits are, in

effect, an extension of the system coprocessor Cause register.

13.4 Flags

A Flag bit is provided for each IEEE exception. This Flag bit is set to a 1 on the assertion of

its corresponding exception, with no corresponding exception trap signaled.

When no exception trap is signaled, the floating-point coprocessor takes a default action,

providing a substitute value for the exception-causing result of the floating-point operation.

The particular default action taken depends upon the type of exception. Table 13-1 lists the

default action taken by the FPU for each of the IEEE exceptions.

Table 13-1 Default FPU Exception Actions

Field  Description           Rounding Mode  Default Action

I      Inexact exception     ANY            Supply a rounded result.

U      Underflow exception   ANY            Supply a rounded result.

O      Overflow exception    RN             Modify overflow values to ∞ with the sign of the
                                            intermediate result.
                             RZ             Modify overflow values to the format's largest finite
                                            number with the sign of the intermediate result.
                             RP             Modify negative overflows to the format's most negative
                                            finite number; modify positive overflows to +∞
                             RM             Modify positive overflows to the format's largest finite
                                            number; modify negative overflows to –∞

Z      Division by zero      ANY            Supply a properly signed ∞

V      Invalid operation     ANY            Supply a quiet Not a Number (NaN).
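The Overflow rows of Table 13-1 reduce to a choice between infinity and the largest finite number. The sketch below is an illustrative restatement of the table; `overflow_default` and its parameters are hypothetical names.

```python
import math


def overflow_default(rm, negative, largest_finite):
    """Return the default overflow result for rm in {"RN", "RZ", "RP", "RM"}."""
    to_infinity = {
        "RN": True,            # always infinity, signed like the result
        "RZ": False,           # always the largest finite magnitude
        "RP": not negative,    # only positive overflows reach +infinity
        "RM": negative,        # only negative overflows reach -infinity
    }[rm]
    magnitude = math.inf if to_infinity else largest_finite
    return -magnitude if negative else magnitude
```

With the single-precision maximum 3.40282347e+38, a negative overflow under RP returns −3.40282347e+38, while a positive overflow under RP returns +∞.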

The FPU detects the eight exception causes internally. When the FPU encounters one of

these unusual situations, it causes either an IEEE exception or an Unimplemented Operation

exception (E).

Table 13-2 lists the exception-causing situations and contrasts the behavior of the FPU with

the requirements of the IEEE Standard 754.

Table 13-2 FPU Exception-Causing Conditions

FPA Internal Result     IEEE Standard 754  Trap Enable  Trap Disable  Notes

Inexact result          I                  I            I             Loss of accuracy
Exponent overflow       O, I*              O, I         O, I          Normalized exponent > Emax
Division by zero        Z                  Z            Z             Zero is (exponent = Emin – 1, mantissa = 0)
Overflow on convert     V                  V            E             Source out of integer range
Signaling NaN source    V                  V            E             Quiet NaN result generated from quiet NaN source
Invalid operation       V                  V            E             0/0, etc.
Exponent underflow      U                  E            E             Normalized exponent < Emin
Denormalized or QNaN    None               E            E             Denormalized is (exponent = Emin – 1 and mantissa ≠ 0)

*The IEEE Standard 754 specifies an inexact exception on overflow only if the overflow trap is

disabled.

13.5 FPU Exceptions

The following sections describe the conditions that cause the FPU to generate each of its

exceptions, and detail the FPU response to each exception-causing condition.

Inexact Exception (I)

The FPU generates the Inexact exception if one of the following occurs:

• the rounded result of an operation is not exact, or

• the rounded result of an operation overflows, or

• the rounded result of an operation underflows, neither the Underflow nor the Inexact

Enable bit is set, and the FS bit is set.

Trap Enabled Results: If Inexact exception traps are enabled, the result register is not

modified and the source registers are preserved.

Trap Disabled Results: The rounded or overflowed result is delivered to the destination

register.

Invalid Operation Exception (V)

The Invalid Operation exception is signaled if one or both of the operands are invalid for an

implemented operation. When the exception occurs without a trap, the MIPS ISA defines the

result as a quiet Not a Number (qNaN). The invalid operations are:

• Addition or subtraction: magnitude subtraction of infinities, such as: (+∞) + (−∞) or

(−∞) − (−∞)

• Multiplication: 0 times ∞, with any signs

• Division: 0/0, or ∞/∞, with any signs

• Comparison of predicates involving ‘<’ or ‘>’ without ‘?’, when the operands are

unordered

• Any arithmetic operation, when one or both operands is a signaling NaN. A move

(MOV) operation is not considered to be an arithmetic operation, but absolute value

(ABS) and negate (NEG) are.

• Comparison or a Convert From Floating-point Operation on a signaling NaN.

• Square root: √x, where x is less than zero.

Software can simulate the Invalid Operation exception for other operations that are invalid

for the given source operands. Examples of these operations include IEEE Standard 754-

specified functions implemented in software, such as Remainder: x REM y, where y is 0 or x

is infinite; conversion of a floating-point number to a decimal format whose value causes an

overflow, is infinity, or is NaN; and transcendental functions, such as ln (−5) or cos−1 (3).

Refer to Appendix B for examples or for routines to handle these cases.

Trap Enabled Results: The result register is not modified, and the source registers are

preserved.

Trap Disabled Results: A quiet NaN is delivered to the destination register if no other

software trap occurs.

Divide-by-Zero Exception (Z)

The Division-by-Zero exception is signaled on an implemented divide operation if the

divisor is zero and the dividend is a finite nonzero number. Software can simulate this

exception for other operations that produce a signed infinity, such as ln (0), sec (π/2), csc (0),

or 0^−1.

Trap Enabled Results: The result register is not modified, and the source registers are

preserved.

Trap Disabled Results: The result, when no trap occurs, is a correctly signed infinity.

Overflow Exception (O)

The Overflow exception is signaled when the magnitude of the rounded floating-point

result, with an unbounded exponent range, is larger than the largest finite number of the

destination format. (This exception also signals an Inexact exception.)

Trap Enabled Results: The result register is not modified, and the source registers are

preserved.

Trap Disabled Results: The result, when no trap occurs, is determined by the rounding

mode and the sign of the intermediate result (as listed in Table

13-1).

Underflow Exception (U)

Two related events contribute to the Underflow exception:

• creation of a tiny nonzero result between ±2^Emin which can cause some later exception

because it is so tiny

• extraordinary loss of accuracy during the approximation of such tiny numbers by

denormalized numbers.

IEEE Standard 754 allows a variety of ways to detect these events, but requires they be

detected the same way for all operations.

Tininess can be detected by one of the following methods:

• after rounding (when a nonzero result, computed as though the exponent range were

unbounded, would lie strictly between ±2^Emin)

• before rounding (when a nonzero result, computed as though the exponent range and

the precision were unbounded, would lie strictly between ±2^Emin).

The MIPS architecture requires that tininess be detected after rounding.

Loss of accuracy can be detected by one of the following methods:

• denormalization loss (when the delivered result differs from what would have been

computed if the exponent range were unbounded)

• inexact result (when the delivered result differs from what would have been computed

if the exponent range and precision were both unbounded).

The MIPS architecture requires that loss of accuracy be detected as an inexact result.

Trap Enabled Results: If Underflow or Inexact traps are enabled, or if the FS bit is not

set, then an Unimplemented exception (E) is generated, and the

result register is not modified.

Trap Disabled Results: If Underflow and Inexact traps are not enabled and the FS bit is

set, the result is determined by the rounding mode and the sign of

the intermediate result (as listed in Table 13-1).

Unimplemented Instruction Exception (E)

Any attempt to execute an instruction with an operation code or format code that has been

reserved for future definition sets the Unimplemented bit in the Cause field in the FPU

Control/Status register and traps. The operand and destination registers remain

undisturbed and the instruction is emulated in software. Any of the IEEE Standard 754

exceptions can arise from the emulated operation, and these exceptions in turn are

simulated.

The Unimplemented Instruction exception can also be signaled when unusual operands or

result conditions are detected that the implemented hardware cannot handle properly.

These include:

• Denormalized operand, except for Compare instruction

• Quiet Not a Number operand, except for Compare instruction

• Denormalized result or Underflow, when either Underflow or Inexact Enable bits are

set or the FS bit is not set.

• Reserved opcodes

• Unimplemented formats

• Operations which are invalid for their format (for instance, CVT.S.S)

Note: Denormalized and NaN operands are only trapped if the instruction is a convert or

computational operation. Moves do not trap if their operands are either denormalized or

NaNs.

The use of this exception for such conditions is optional; most of these conditions are newly

developed and are not expected to be widely used in early implementations. Loopholes are

provided in the architecture so that these conditions can be implemented with assistance

provided by software, maintaining full compatibility with the IEEE Standard 754.

Trap Enabled Results: The result register is not modified, and the source registers are

preserved.

Trap Disabled Results: This trap cannot be disabled.

13.6 Saving and Restoring State

Sixteen doubleword† coprocessor load or store operations save or restore the coprocessor

floating-point register state in memory. The remainder of control and status information can

be saved or restored through CFC1/CTC1 instructions, and saving and restoring the processor

registers. Normally, the Control/Status register is saved first and restored last.

When state is restored, state information in the Control/Status register indicates the

exceptions that are pending. Writing a zero value to the Cause field of Control/Status register

clears all pending exceptions, permitting normal processing to restart after the floating-point

state is restored.

13.7 Trap Handlers for IEEE Standard 754 Exceptions

The IEEE Standard 754 strongly recommends that users be allowed to specify a trap

handler for any of the five standard exceptions; the trap handler can either

compute or specify a substitute result to be placed in the destination register of the operation.

By retrieving an instruction using the processor Exception Program Counter (EPC) register,

the trap handler determines:

• exceptions occurring during the operation

• the operation being performed

• the destination format

On Overflow or Underflow exceptions (except for conversions), and on Inexact exceptions,

the trap handler gains access to the correctly rounded result by examining source registers

and simulating the operation in software.

On Overflow or Underflow exceptions encountered on floating-point conversions, and on

Invalid Operation and Divide-by-Zero exceptions, the trap handler gains access to the operand

values by examining the source registers of the instruction.

The IEEE Standard 754 recommends that, if enabled, the overflow and underflow traps

take precedence over a separate inexact trap. This prioritization is accomplished in software;

hardware sets the bits for both the Inexact exception and the Overflow or Underflow

exception.

† 32 doublewords if the FR bit is set to 1.

14. Debug Support Unit

14.1 Features

1. Utilizes JTAG interface compatible with IEEE Std. 1149.1.

2. Additional Status pins and debug clock in conjunction with JTAG pins provide Real-Time

Trace information.

3. Processor access to an external processor probe, so that code can execute from external

trace memory during debug exceptions and at boot time. This eliminates the need to use

system memory for debugging purposes.

4. Supports DMA access through JTAG interface to internal processor bus to access internal

registers, host system peripherals and system memory.

5. Debug functions

• Instruction Address Break

• Data Bus break

• Processor Bus Break

• Hardware Debug Interrupt

• Reset, NMI, Interrupt Mask

6. Instructions for Debug

• SDBBP, DERET, CTC0, CFC0

7. CP0 Registers for Debug

• Debug, DEPC, DESAVE

14.2 EJTAG interface

This interface has two modes of operation: a Run Time mode and a Real Time mode.

The Run Time mode provides functions such as processor Run, Stop, Single Step, and access to

internal registers and system memory. The Real Time mode provides additional status pins

used in conjunction with JTAG pins for Real Time Trace information.

Pins            In/Out  Description

GTCK            I       Test Clock Input
GDCLK           O       Debug Clock (1/3 CPU Clock)
GTDI/GDINT      I       Test Data Input (GTDI) at Run Time mode
                        /Debug Interrupt Input (GDINT) at Real Time mode
GTDO/GTPC[0]    O       Test Data Output (GTDO)
                        /PC Output (GTPC)
GTMS            I       Test Mode Select Input
GTRST*          I       Reset
GPCST[8∼0]      O       PC Trace Status Information
GTPC[3∼1]       O       PC Output

14.3 JTAG Interface

The standard JTAG interface is used for on-chip debugging during Run Time mode. The TX49

Debug Support Unit has the following registers.

• Instruction Register

• Bypass Register

• Boundary-Scan Register

• Device Identification Register

• Implementation Register

• JTAG_Data_register

• JTAG_Address_Register

• JTAG_Control_Register

14.4 Processor Access Overview

The core processor can access an external processor probe to read and write external

monitor memory, registers and other external resources.

In addition, the processor can execute from the external monitor memory located from

0xf_ff20 0000 to 0xf_ff2f ffff when the ProbEnb bit is set and the processor probe is turned ON.

Access to the monitor locations from 0xf_ff20 0000 to 0xf_ff3f ffff is allowed only when the

processor is in the debug mode (DM = 1).
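The two address windows described above can be expressed as simple range checks. This is a sketch under stated assumptions: the function names and boolean parameters are hypothetical, and treating the debug-mode restriction as also applying to instruction fetches is an interpretation of "any access".

```python
MONITOR_EXEC = range(0xF_FF20_0000, 0xF_FF2F_FFFF + 1)   # executable monitor memory
MONITOR_ALL = range(0xF_FF20_0000, 0xF_FF3F_FFFF + 1)    # region restricted to DM = 1


def access_allowed(address, debug_mode):
    """Accesses inside the monitor region are legal only when DM = 1."""
    return bool(debug_mode) or address not in MONITOR_ALL


def can_execute_from_monitor(address, prob_enb, probe_on, debug_mode):
    """True if an instruction fetch may be served from external monitor memory."""
    return bool(address in MONITOR_EXEC and prob_enb and probe_on and debug_mode)
```

Note that 0xf_ff30 0000 lies inside the debug-only region but outside the executable window.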

14.5 Instruction

The instruction is an 8-bit field. Instructions for the TX49 Debug Support Unit are encoded

between 0x80 and 0x9f; other codes are reserved for Toshiba standard JTAG instructions

(including EXTEST, SAMPLE/PRELOAD, INTEST, IDCODE and HI-Z) and so on.

Instructions are decoded as follows.

Hex Value  Instruction        Description

0x83       EJTAG_ImpCode      Select Implementation Register
0x88       JTAG_ADDRESS_IR    Select JTAG_Address Register
0x89       JTAG_DATA_IR       Select JTAG_Data Register
0x8A       JTAG_CONTROL_IR    Select JTAG_Control Register
0x8B       JTAG_ALL_IR        Select JTAG_All Register
0x90       PCTRACE            PCTRACE Instruction

Any unused instruction code between 0x80 and 0x9f defaults to the BYPASS instruction.
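The decoding rule can be written as a lookup with a BYPASS default. The codes and names are those listed in the table above; the function name `decode_instruction` and the string returned for codes outside 0x80 to 0x9f are illustrative only.

```python
DSU_INSTRUCTIONS = {
    0x83: "EJTAG_ImpCode",
    0x88: "JTAG_ADDRESS_IR",
    0x89: "JTAG_DATA_IR",
    0x8A: "JTAG_CONTROL_IR",
    0x8B: "JTAG_ALL_IR",
    0x90: "PCTRACE",
}


def decode_instruction(code):
    """Map an 8-bit instruction register value to an instruction name."""
    if not 0 <= code <= 0xFF:
        raise ValueError("the instruction register is 8 bits wide")
    if 0x80 <= code <= 0x9F:
        # Unused codes in the Debug Support Unit range behave as BYPASS.
        return DSU_INSTRUCTIONS.get(code, "BYPASS")
    return "reserved for standard JTAG instructions"
```

For example, 0x8A selects the JTAG_Control register, while the unused code 0x84 decodes to BYPASS.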

14.6 Debug Unit

14.6.1 Extended Instructions

• SDBBP

• DERET

• CTC0

• CFC0

14.6.2 Extended Debug Registers in CP0

• Debug Register

• Debug Exception PC (DEPC)

• Debug SAVE

14.7 Register Map

Address         Mnemonic  Description

0xf ff30 0000   DCR       Debug Control Register
0xf ff30 0008   IBS       Instruction Break Status
0xf ff30 0010   DBS       Data Break Status
0xf ff30 0018   PBS       Processor Break Status
0xf ff30 0100   IBA0      Instruction Break Address 0
0xf ff30 0108   IBC0      Instruction Break Control 0
0xf ff30 0110   IBM0      Instruction Break Address Mask 0
0xf ff30 0300   DBA0      Data Break Address 0
0xf ff30 0308   DBC0      Data Break Control 0
0xf ff30 0310   DBM0      Data Break Address Mask 0
0xf ff30 0318   DB0       Data Break Value 0
0xf ff30 0600   PBA0      Processor Bus Break Address 0
0xf ff30 0608   PBD0      Processor Bus Break Data 0
0xf ff30 0610   PBM0      Processor Bus Break Mask 0
0xf ff30 0618   PBC0      Processor Bus Break Control 0

14.8 Processor Bus Break Function

This function monitors the interface to the core and provides a debug interrupt or trace

trigger for a given physical address and data.

14.9 Debug Exception

Three kinds of debug exception are supported.

• Debug Single Step (DSS bit)

• Debug Breakpoint Exception (SDBBP Instruction)

• JTAG Break Exception (Jtagbrk bit in JTAG_Control_Register)

Note: During real time debugging, the first two functions are disabled.

14.9.1 Debug Single Step (DSS)

When the DSS bit in the Debug register is set, this exception is raised each time one

instruction is executed.

14.9.2 Debug Breakpoint exception (Dbp)

This exception is raised when an SDBBP instruction is executed.

14.9.3 JTAG Break Exception

This exception is raised when the JTAG unit sets the Jtagbrk bit in JTAG_Control_Register.

14.9.4 Debug Exception Handling

A debug exception updates the DEPC and Debug registers.

Registers other than DEPC and Debug register retain their values.

14.9.5 Branching to debug handler

If the ProbEnb bit in JTAG_Control_Register[15] is set, the debug exception vector is

located at PC: 0xffff ffff ff20 0200.

If the ProbEnb bit in JTAG_Control_Register[15] is cleared, the debug exception vector

is located at

PC: 0xffff ffff bfc0 0400.
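The vector selection depends only on bit 15 of JTAG_Control_Register, as the following sketch shows. The constant and function names are hypothetical; the two vector addresses are the ones given above.

```python
PROB_ENB = 1 << 15   # ProbEnb is bit 15 of JTAG_Control_Register


def debug_vector(jtag_control):
    """Return the debug exception vector for a JTAG_Control_Register value."""
    if jtag_control & PROB_ENB:
        return 0xFFFF_FFFF_FF20_0200   # handler in external monitor memory
    return 0xFFFF_FFFF_BFC0_0400       # handler in the boot region
```

With ProbEnb set the handler is fetched from the external monitor memory region; otherwise it is fetched from the boot region address.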

14.9.6 Exception handling when in Debug Mode (DM bit is set)

All interrupts including NMI are masked. When an NMI occurs during Debug mode,

it is stored internally and is taken after the debug handler has finished

(DM is cleared).

14.10 Real Time PC TRACE Output

In Real Time mode, non-sequential program counter and trace information are output on

GTPC[3~0] and GPCST[8~0] at 1/3 of the processor clock speed.

15. TX49 MPU Core Signal Descriptions

The TX49 MPU core has a 64-bit bus interface that is upward compatible with the TX39 G-bus

interface.

Figure 15-1 TX49 MPU Core Interface Signals

[Figure 15-1: block diagram of the TX49 Core and its signal groups: Memory Interface,

DMA Interface, Interrupt Interface, Coprocessor Interface, Debug/JTAG Interface,

Clock and System Control Interface, and Test Interface.

Signals shown: GAFM[35:0], GATM[35:5], GBE[7:0]*, GDFM[63:0], GDTM[63:0], GRD*, GWR*,

GACK*, GBUSERR*, GBURST*, GLAST*, GCACHE*, GID, GBSTART*, GBUSOEN, GSNOOP*, GREQ*,

GSREQ*, GHPGREQ*, GHPSREQ*, GGNT*, GSGNT*, GHPGGNT*, GHPSGNT*, GREL*, GHAVEIT*,

GTRST*, GTDI/GDINT*, GTMS, GTCK, GTPC[3:1], GTDO/GTPC[0], GDCLK, GPCST[8:0], CPUCLK,

GBUSCLK, GCRATE[1:0], GDOZE, GHALT, GTINTDIS, GBS64*, GENDIAN, GCOLDRESET*, GNMI*,

GRESET*, GINT[5:0]*, GTEST[2:0], GDIS*, GCPCOND[3:2], GCPRD*, GCPRDACK*, GCPWR*,

GCPWRACK*.]

15.1 Signal Descriptions

15.1.1 Memory Interface Signals

Table 15-1 lists the memory interface signals.

Table 15-1 Memory Interface Signals

Signal Name   I/O  Active State  Description

GAFM[35:0]    O    –             Address From Bus (Output)
                                 GAFM[35:0] is used as a 36-bit output address bus.

GATM[35:5]    I    –             Address To Bus (Input)
                                 GATM[35:5] is a 31-bit address input bus used for data cache snooping.

GBE[7:0]*     O    Low           Byte Enable
                                 GBE[7:0]* defines the valid data bytes within the 64-bit data bus. The
                                 correlation between the byte enable signals and data bytes is as follows:

                                 Byte Enable   Corresponding Data Byte
                                 GBE[7]*       GDFM[63:56], GDTM[63:56]
                                 GBE[6]*       GDFM[55:48], GDTM[55:48]
                                 GBE[5]*       GDFM[47:40], GDTM[47:40]
                                 GBE[4]*       GDFM[39:32], GDTM[39:32]
                                 GBE[3]*       GDFM[31:24], GDTM[31:24]
                                 GBE[2]*       GDFM[23:16], GDTM[23:16]
                                 GBE[1]*       GDFM[15:8], GDTM[15:8]
                                 GBE[0]*       GDFM[7:0], GDTM[7:0]

GDFM[63:0]    O    –             Data From Master (Output)
                                 This data bus always acts as a 64-bit output.

GDTM[63:0]    I    –             Data to Master (Input)
                                 This data bus always acts as a 64-bit input.

GRD*          O    Low           Read
                                 GRD* is an output-only strobe that is asserted during a bus read operation.

GWR*          O    Low           Write
                                 GWR* is an output-only strobe that is asserted during a bus write operation.

GACK*         I    Low           Read/Write Acknowledge
                                 GACK* is sampled with the rising edge of GBUSCLK. The TX49 MPU core ends
                                 single-read and single-write operations in the next cycle after GACK* is
                                 recognized as asserted. During burst-read and burst-write operations, the
                                 TX49 MPU core increments the address at the next rising edge of GBUSCLK
                                 after GACK* is recognized as asserted. If GACK* is sampled as deasserted,
                                 a bus wait cycle is inserted.

GCACHE*       O                  Cacheable
                                 GCACHE* is an output signal that indicates whether the bus transfer in
                                 progress is being performed on a cached or uncached address space.
                                 H: Uncached space
                                 L: Cached space

GID           O                  Instruction or Data
                                 GID is an output signal that indicates the type of bus transfer being
                                 performed.
                                 H: Instruction
                                 L: Data

GBSTART*      O    Low           Bus Start
                                 GBSTART* is an output signal that is asserted for one clock cycle to
                                 indicate that a bus operation has started.

GBUSERR*      I    Low           Bus Error
                                 When GBUSERR* is asserted during a bus read operation, the TX49 MPU core
                                 immediately terminates the ongoing transaction and takes a Bus Error
                                 exception. GBUSERR* is valid only during bus read operations.

GBURST*        O    Low           Burst
    GBURST* is an output-only strobe that is asserted during burst-read and burst-write operations.

GLAST*         O    Low           Last
    GLAST* is an output signal that indicates completion of a bus cycle.
    • During a single-read or single-write, GLAST* is asserted simultaneously with GBSTART*.
    • During a burst-read or burst-write, GLAST* is asserted when the TX49 MPU core has recognized a GACK* for the second-to-last data read.

GBUSOEN*       O    Low           G-Bus Output Enable
    GBUSOEN* is the output enable control for the bus control signals:
    While the TX49 assumes bus mastership: Low
    While the TX49 has released bus mastership: High
    While GDIS* is asserted: High
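The byte enable correlation in Table 15-1 follows a regular pattern: GBE[n]* qualifies data bits [8n+7:8n]. The following is a minimal sketch of that mapping, assuming the eight GBE pins are sampled into one integer with GBE[0]* in bit 0. The function names `byte_lane_bits` and `active_lanes` are hypothetical and are not part of any Toshiba tool.

```python
def byte_lane_bits(n):
    """Return (msb, lsb) of the GDFM/GDTM bits qualified by GBE[n]*."""
    if not 0 <= n <= 7:
        raise ValueError("GBE index must be in the range 0..7")
    return (8 * n + 7, 8 * n)


def active_lanes(gbe_pins):
    """List the byte lanes whose GBE[n]* pin is low (asserted).

    gbe_pins is an assumed 8-bit sample of the pins, GBE[0]* in bit 0.
    Because the signals are active low, a 0 bit means the lane is valid.
    """
    return [n for n in range(8) if not (gbe_pins >> n) & 1]


if __name__ == "__main__":
    for lane in active_lanes(0xF0):
        msb, lsb = byte_lane_bits(lane)
        print("GBE[%d]* asserted: data bits [%d:%d] valid" % (lane, msb, lsb))
```

With a sample of 0xF0 the four low-order lanes are reported, which corresponds to a transfer on data bits [31:0].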

15.1.2 DMA Interface Signals

Table 15-2 lists the DMA interface signals.

Table 15-2 DMA Interface Signals

Signal Name    I/O  Active State  Description

GSNOOP*        I    Low           Snoop
    The TX49 samples GSNOOP* on the rising edge of GBUSCLK. When GSNOOP* is recognized as asserted, the TX49 captures the address on GATM[35:5] and compares it to the addresses of all data items held in the on-chip data cache. If the snoop address hits in the data cache, the cache entry is invalidated. GSNOOP* is valid when either GHPSGNT* or GSGNT* is asserted.

GREQ*          I    Low           Normal Bus Request
    Alternate bus masters assert this signal to request bus mastership as per ET concurrency protocols.

GSREQ*         I    Low           Snoop Bus Request
    Alternate bus masters assert this signal to request bus mastership as per ST concurrency protocols.

GHPGREQ*       I    Low           High-Priority Normal Bus Request
    In response to GHPGREQ*, the TX49 asserts GHPGGNT* to grant the bus to the requesting bus master as per ET concurrency protocols. GHPGREQ* has priority over GREQ* if both are asserted simultaneously.

GHPSREQ*       I    Low           High-Priority Snoop Bus Request
    In response to GHPSREQ*, the TX49 asserts GHPSGNT* to grant the bus to the requesting bus master as per ST concurrency protocols. GHPSREQ* has priority over GSREQ* if both are asserted simultaneously.

GGNT*          O    Low           Normal Bus Grant
    Assertion of GGNT* indicates that the TX49 has relinquished bus mastership in response to GREQ*.

GSGNT*         O    Low           Snoop Bus Grant
    Assertion of GSGNT* indicates that the TX49 has relinquished bus mastership in response to GSREQ*.

GHPGGNT*       O    Low           High-Priority Normal Bus Grant
    Assertion of GHPGGNT* indicates that the TX49 has relinquished bus mastership in response to GHPGREQ*.

GHPSGNT*       O    Low           High-Priority Snoop Bus Grant
    Assertion of GHPSGNT* indicates that the TX49 has relinquished bus mastership in response to GHPSREQ*.

GREL*          O    Low           Release Request
    This output signal indicates to an external bus master that the TX49 wants to regain bus mastership. The TX49 asserts GREL* 1) when higher-priority GHPGREQ* is asserted while lower-priority GSGNT* is asserted and 2) when a bus request is generated from the TX49 processor core while GHPGGNT* is asserted.

GHAVEIT*       I    Low           Have It
    This is a bus grant acknowledge signal used by an external bus master to indicate that it has assumed bus mastership. The external bus master can release the bus by asserting and deasserting GHAVEIT* while keeping a bus request signal asserted. In a single-bus-master system, GHAVEIT* may be tied high.

15.1.3 Coprocessor Interface Signals

Table 15-3 lists the coprocessor interface signals.

Table 15-3 Coprocessor Interface Signals

Signal Name    I/O  Active State  Description

GCPRD*         O    Low           Coprocessor Read
    GCPRD* is an output-only strobe that is asserted during a coprocessor read operation.

GCPWR*         O    Low           Coprocessor Write
    GCPWR* is an output-only strobe that is asserted during a coprocessor write operation.

GCPRDACK*      I    Low           Coprocessor Read Acknowledge
    A coprocessor asserts this signal to indicate to the TX49 processor core that the coprocessor read request has been acknowledged.

GCPWRACK*      I    Low           Coprocessor Write Acknowledge
    A coprocessor asserts this signal to indicate to the TX49 processor core that the coprocessor write request has been acknowledged.

GCPCOND[3:2]   I    –             Coprocessor Condition
    Coprocessor branch instructions use the GCPCOND[z] signal as the condition signal of coprocessor z: GCPCOND[3] is for CP3, and GCPCOND[2] is for CP2.

15.1.4 Interrupt Interface Signals

Table 15-4 lists the interrupt interface signals.

Table 15-4 Interrupt Interface Signals

Signal Name    I/O  Active State  Description

GCOLDRESET*    I    Low           Cold Reset
    Assertion of this input signal initiates a cold reset and forces the TX49 to enter Cold Reset exception processing.

GRESET*        I    Low           Reset
    Assertion of this input signal initiates a soft reset and forces the TX49 to enter Soft Reset exception processing.

GNMI*          I    Low           Nonmaskable Interrupt
    Assertion of this input signal forces the TX49 to enter Nonmaskable Interrupt exception processing.

GINT[5:0]*     I    Low           Interrupt
    Assertion of any of these interrupt request inputs causes a general Interrupt exception unless the corresponding bit is masked in the Status register.
    GINT[5] can be configured as either a general interrupt input or a timer interrupt input during Reset exception processing. If the GTINTDIS input is zero during a reset sequence, GINT[5] is configured as the timer interrupt input.

15.1.5 Test Interface Signals

Table 15-5 lists the test interface signals.

Table 15-5 Test Interface Signals

Signal Name    I/O  Active State  Description

GTEST[2:0]     I    –             Test
    The GTEST[2:0] inputs are used to set up the TX49 in test mode. A value of 3'b000 at GTEST[2:0] puts the TX49 in normal operation mode.

GDIS*          I    –             Disable Output
    This input must be tied high.

15.1.6 Debug Interface Signals

Table 15-6 lists the debug interface signals.

Table 15-6 Debug Interface Signals

Signal Name    I/O  Active State  Description

GTRST*         I    Low           Test Reset Input
    Assertion of this input initializes the on-chip Debug Support Unit (DSU).

GTDI/GDINT*    I    –             Test Data Input / Debug Interrupt
    Run-Time mode: Functions as a serial data input to the EJTAG instruction
    Real-Time mode: Switches the debug unit mode from Real-Time mode to Run-Time mode.

GTMS           I    –             Test Mode Select Input
    The GTMS input controls the transitions of the TAP controller in conjunction with the rising edge of GTCK.

GTCK           I    –             Test Clock Input
    GTCK is used to shift test data into or out of JTAG logic for EJTAG instructions. GTCK is independent of CPUCLK.

GTPC[3:1]      O    –             Trace PC Output
    GTPC[3:1] provide non-sequential program counter output at the GDCLK speed.

GTDO/GTPC[0]   O    –             Test Data Output
    Run-Time mode: Shifts serial output data from the EJTAG data or instruction
    Real-Time mode: Provides a non-sequential program counter.

GPCST[8:0]     O    –             PC Trace Status
    The GPCST[8:0] outputs provide PC trace status information and serial monitor bus mode.

GDCLK          O    –             Debug Clock
    Output clock for EJTAG debug.

15.1.7 Clock and System Control Interface Signals

Table 15-7 lists the clock and system control interface signals.

Table 15-7 Clock and System Control Interface Signals

Signal Name    I/O  Active State  Description

CPUCLK         I    –             CPU Clock Input
    The TX49 processor core operates at the same frequency as CPUCLK.

GBUSCLK        I    –             GBUS Clock Input
    GBUSCLK is the clock input for the G-Bus interface. A divided-down clock must be applied to GBUSCLK according to the value of GCRATE[1:0]. Otherwise, correct operation is not guaranteed.

GCRATE[1:0]    I    –             GBUS Clock Rate Input from External Pin
    GCRATE[1:0] selects the frequency at which the G-Bus interface runs with respect to the TX49 processor core. The frequency division factor can be one of the following; it must not be changed while the processor is running.
    GCRATE[1:0] division factors: 1/2, 1/3, 1/4, 1/2.5

GDOZE          O    High          Doze
    GDOZE follows the state programmed into the Doze bit in the Config register. GDOZE=1 when the TX49 is in Doze mode.

GHALT          O    High          Halt
    GHALT follows the state programmed into the Halt bit in the Config register. GHALT=1 when the TX49 is in Halt mode.

GTINTDIS       I    –             Timer Interrupt Disable Input from External Pin
    GTINTDIS specifies the pin function of GINT[5] during a reset sequence.
    H: Disables the timer interrupt function (i.e., configures the GINT[5] pin as a general interrupt request pin)
    L: Enables the timer interrupt function (i.e., configures the GINT[5] pin as a timer interrupt request pin)

GENDIAN        I    –             Endianness Input from External Pin
    GENDIAN specifies byte ordering during a reset sequence.
    H: Big-endian
    L: Little-endian

GBS64*         I    –             System Bus Size
    GBS64* specifies the G-Bus size during a reset sequence.
    H: 32-bit (GDTM[31:0] and GDFM[31:0] are valid.)
    L: 64-bit (GDTM[63:0] and GDFM[63:0] are valid.)

16. Low Power Consumption Modes

The TX49 can reduce its power consumption relative to the normal mode by controlling its internal clocks. The TX49 provides the following two low power consumption modes:

• Halt mode

• Doze mode

16.1 Halt mode

The halt mode reduces power consumption by halting TX49 operation. When software sets the HALT bit of the Config register to 0 and then executes a WAIT instruction, the TX49 shifts from the normal operation mode to the halt mode.

In the halt mode, the TX49 responds to bus release requests in cases of ET concurrency (the GREQ* signal or the GHPGREQ* signal). However, it does not respond to requests in cases of ST concurrency (the GSREQ* signal or the GHPSREQ* signal). On the other hand, if a WAIT instruction is executed while the bus is being released, the halt mode starts in cases of ET concurrency, but in cases of ST concurrency it starts only after bus ownership is granted, and the GHALT signal is then asserted.

If a WAIT instruction is executed during a bus operation, the GHALT signal is asserted after the bus operation is completed.

If data remain in the write buffer, the write operation is executed even after shifting to the halt mode.

The internal halt bit is cleared by the assertion of the GINT[5~0]* signal, the GNMI* signal, the GRESET* signal or the GCOLDRESET* signal, and the TX49 returns from the halt mode. If the return is caused by the assertion of the GINT[5~0]* signal, the TX49 is released from the halt mode irrespective of the value in the IntMask field of the Status register. If the TX49 is brought back from the halt mode by the GCOLDRESET* signal, the GRESET* signal, the GNMI* signal, or a non-masked GINT[5~0]* signal, execution begins at the first instruction of the corresponding exception handler. At this time, the EPC register points to the instruction following the WAIT instruction. If the TX49 is brought back by a masked GINT[5~0]* signal, execution resumes from the instruction following the instruction that was being executed when it shifted to the halt mode.

As shown in Figure 16-1, the TX49 outputs the status of the internal halt bit on the GHALT signal. The memory interface output signals in the halt mode are maintained in the same status as when no bus operation is being executed.

Note: If a condition for returning from the low power consumption modes is already satisfied when the WAIT instruction is executed, the TX49 does not shift to the mode.

[Timing diagram showing GBUSCLK, GHALT, the internal CPUCLK, and GRD*/GWR* around the M-stage and W-stage of the WAIT instruction. Annotation: HALT bit set to 0 before here.]

Figure 16-1 Halt Mode

16.2 Doze mode

The doze mode also halts TX49 operation in order to lower power consumption. It differs from the halt mode in that bus control requests from an external bus master (both ST concurrency and ET concurrency) can be responded to. Snooping of the data cache can also be performed in ST concurrency. When software sets the HALT bit of the Config register to 1 and then executes a WAIT instruction, the TX49 shifts from the normal operation mode to the doze mode. The TX49 Processor Core that is built into the TX49 then halts operation while retaining the pipeline status.

As mentioned above, bus control requests are responded to while in the doze mode both in cases of ET concurrency (the GREQ* signal and the GHPGREQ* signal) and in cases of ST concurrency (the GSREQ* signal and the GHPSREQ* signal). On the other hand, if a WAIT instruction is executed while the bus is being released, the doze mode starts in cases of ET concurrency, but in cases of ST concurrency it starts only after bus ownership is granted, and the GDOZE signal is then asserted. If a WAIT instruction is executed during a bus operation, the GDOZE signal is asserted after the bus operation is completed. While the TX49 is in the doze mode, snooping by an external bus master is done by ST concurrency. While the bus is released by the assertion of the GSGNT* signal or the GHPSGNT* signal, snooping of the data cache can be performed by the GSNOOP* signal and the GA[35~0] signal. When an external bus master deasserts the GSREQ* signal or the GHPSREQ* signal, the TX49 deasserts the GSGNT* signal or the GHPSGNT* signal.

The internal doze bit is cleared by the assertion of the GINT[5~0]* signal, the GNMI* signal, the GRESET* signal or the GCOLDRESET* signal, and the TX49 returns from the doze mode. If the return is caused by the assertion of the GINT[5~0]* signal, the TX49 is released from the doze mode irrespective of the value in the IntMask field of the Status register. If the TX49 is brought back from the doze mode by the GCOLDRESET* signal, the GNMI* signal, or a non-masked GINT[5~0]* signal, execution begins at the first instruction of the corresponding exception handler. At this time, the EPC register points to the instruction following the WAIT instruction. If the TX49 is brought back by a masked GINT[5~0]* signal, execution resumes from the instruction following the instruction that was being executed when it shifted to the doze mode.

As shown in Figure 16-2, the TX49 outputs the status of the internal doze bit on the GDOZE signal. The memory interface output signals in the doze mode are maintained in the same status as when no bus operation is being executed.

Note: If a condition for returning from the low power consumption modes is already satisfied when the WAIT instruction is executed, the TX49 does not shift to the mode.

[Timing diagram showing GBUSCLK, GDOZE, the internal CPUCLK (except snoop clock), and GRD*/GWR* around the M-stage and W-stage of the WAIT instruction. Annotation: HALT bit set to 1 before here.]

Figure 16-2 Doze Mode

16.3 Status Shifts

Figure 16-3 shows the status shifts in the operation mode of the TX49.

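[State diagram with three states: Normal Operation Mode, Halt Mode, and Doze Mode.]

Normal Operation Mode → Halt Mode: HALT bit = 0 & WAIT inst.
Normal Operation Mode → Doze Mode: HALT bit = 1 & WAIT inst.
Halt Mode → Normal Operation Mode: Interrupt or Reset
Doze Mode → Normal Operation Mode: Interrupt or Reset

Figure 16-3 Status Shift Among Normal Operation Mode and Low Power Consumption Modes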

When the operation status shifts from the normal operation mode to the halt mode, it is returned to the normal operation mode by an interrupt or a reset. Similarly, when it shifts from the normal operation mode to the doze mode, it is returned to the normal operation mode by an interrupt or a reset. After a reset, the TX49 is initialized to the normal operation mode.
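The status shifts described in this section can be modelled as a small state machine. The following is a minimal sketch in Python; the state names, the event names ("wait", "interrupt", "reset") and the function `next_mode` are illustrative assumptions, and the sketch does not model the case where a return condition is already satisfied when WAIT is executed.

```python
NORMAL, HALT, DOZE = "normal", "halt", "doze"


def next_mode(mode, event, halt_bit=0):
    """Return the operation mode that follows `mode` after `event`.

    event is one of "wait", "interrupt" or "reset" (assumed names).
    halt_bit is the value of the Config register HALT bit at the time
    the WAIT instruction executes: 0 selects halt, 1 selects doze.
    """
    if event == "reset":
        # After a reset the core is initialized to normal operation.
        return NORMAL
    if mode == NORMAL and event == "wait":
        return DOZE if halt_bit == 1 else HALT
    if mode in (HALT, DOZE) and event == "interrupt":
        # Masked and non-masked interrupts both leave the low power mode.
        return NORMAL
    return mode


if __name__ == "__main__":
    mode = next_mode(NORMAL, "wait", halt_bit=1)
    print(mode)
    print(next_mode(mode, "interrupt"))
```

A model of this kind can be useful in a simulator or test bench to check that firmware sets the HALT bit before issuing WAIT.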

Appendix A: CPU Instruction Set Details

This appendix provides a detailed description of the operation of each TX49 instruction in both 32- and 64-bit modes. The instructions are listed in alphabetical order.

The exceptions that may occur due to the execution of each instruction are listed after the description of each instruction. The immediate causes of exceptions and the manner of handling them are omitted from the instruction descriptions in this appendix.

Figures at the end of this appendix list the bit encoding for the constant fields of each instruction, and the bit encoding for each individual instruction is included with that instruction.

For a detailed description of the FPU instructions, refer to Appendix B.

A.1 Instruction Classes

The TX49 has some classes of CPU instructions, as follows.

• Load and Store

• Computational

• Jump and Branch

• Coprocessor

• Special

• Exception

• Multiply and Divide

• Debug

• Others

A.1.1 Instruction Formats

Every instruction consists of a single word (32 bits) aligned on a word boundary. The

main instruction formats are shown in Figure A-1.

I-Type (Immediate)

Bits    31∼26   25∼21   20∼16   15∼0
Field   op      rs      rt      immediate

J-Type (Jump)

Bits    31∼26   25∼0
Field   op      target

R-Type (Register)

Bits    31∼26   25∼21   20∼16   15∼11   10∼6    5∼0
Field   op      rs      rt      rd      shamt   funct

where:

op          is a 6-bit operation code
rs          is a 5-bit source register specifier
rt          is a 5-bit target (source/destination) register or branch condition
immediate   is a 16-bit immediate, branch displacement or address displacement
target      is a 26-bit jump target address
rd          is a 5-bit destination register specifier
shamt       is a 5-bit shift amount
funct       is a 6-bit function field

Figure A-1 CPU Instruction Formats
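The field positions in Figure A-1 translate directly into shift-and-mask operations. The following is a minimal sketch of a field extractor, assuming the instruction is available as a 32-bit integer; the function name `decode_fields` is hypothetical. It extracts every field of all three formats, and the caller picks the ones that apply to the format in question.

```python
def decode_fields(word):
    """Split a 32-bit instruction word into the Figure A-1 fields."""
    word &= 0xFFFFFFFF
    return {
        "op": (word >> 26) & 0x3F,         # bits 31..26, all formats
        "rs": (word >> 21) & 0x1F,         # bits 25..21, I- and R-Type
        "rt": (word >> 16) & 0x1F,         # bits 20..16, I- and R-Type
        "rd": (word >> 11) & 0x1F,         # bits 15..11, R-Type
        "shamt": (word >> 6) & 0x1F,       # bits 10..6,  R-Type
        "funct": word & 0x3F,              # bits 5..0,   R-Type
        "immediate": word & 0xFFFF,        # bits 15..0,  I-Type
        "target": word & 0x03FFFFFF,       # bits 25..0,  J-Type
    }


if __name__ == "__main__":
    # R-Type word with op=0, rs=1, rt=2, rd=3, shamt=0, funct=0b100000.
    fields = decode_fields(0x00221820)
    print(fields["rs"], fields["rt"], fields["rd"], bin(fields["funct"]))
```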

A.1.2 Instruction Notation Conventions

In this appendix, all variable subfields in an instruction format (such as rs, rt, immediate, etc.) are shown in lowercase names.

For the sake of clarity, we sometimes use an alias for a variable subfield in the formats of specific instructions. For example, we use rs = base in the format for load and store instructions. Such an alias is always lowercase, since it refers to a variable subfield.

Figures with the actual bit encoding for all the mnemonics are located at the end of this Appendix, and the bit encoding also accompanies each instruction.

In the instruction descriptions that follow, the Operation section describes the operation performed by each instruction using a high-level language notation. The TX49 can operate as either a 32- or 64-bit microprocessor. The operation for both modes is included with the instruction description. Special symbols used in the notation are described in Table A-1.

Table A-1 CPU Instruction Operation Notations

Symbol          Meaning

←               Assignment.
||              Bit string concatenation.
x^y             Replication of bit value x into a y-bit string. Note: x is always a single-bit value.
x_{y...z}       Selection of bits y through z of bit string x. Little-endian bit notation is always used. If y is less than z, this expression is an empty (zero-length) bit string.
+               Two's complement or floating-point addition.
−               Two's complement or floating-point subtraction.
*               Two's complement or floating-point multiplication.
Div             Two's complement integer division.
Mod             Two's complement modulo.
/               Floating-point division.
<               Two's complement less than comparison.
And             Bitwise logic AND.
Or              Bitwise logic OR.
Xor             Bitwise logic XOR.
Nor             Bitwise logic NOR.
GPR[x]          General-Register x. The content of GPR[0] is always zero. Attempts to alter the content of GPR[0] have no effect.
CPR[z,x]        Coprocessor unit z, general register x.
CCR[z,x]        Coprocessor unit z, control register x.
COC[z]          Coprocessor unit z condition signal.
BigEndianMem    Big-endian mode as configured at reset (0 → Little, 1 → Big). Specifies the endianness of the memory interface (see LoadMemory and StoreMemory), and the endianness of Kernel and Supervisor mode execution.
ReverseEndian   Signal to reverse the endianness of load and store instructions. This feature is available in User mode only, and is effected by setting the RE bit of the Status register. Thus, ReverseEndian may be computed as (SR_25 and User mode).
BigEndianCPU    The endianness for load and store instructions (0 → Little, 1 → Big). In User mode, this endianness may be reversed by setting SR_25. Thus, BigEndianCPU may be computed as BigEndianMem XOR ReverseEndian.
LLbit           Bit of state to specify synchronization instructions. Set by LL, cleared by ERET and Invalidate, and read by SC.
T + i:          Indicates the time steps between operations. Each of the statements within a time step is defined to be executed in sequential order (as modified by conditional and loop constructs). Operations which are marked T + i: are executed at instruction cycle i relative to the start of execution of the instruction. Thus, an instruction which starts at time j executes operations marked T + i: at time i + j. The interpretation of the order of execution between two instructions or two operations which execute at the same time should be pessimistic; the order is not defined.
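The relationship between BigEndianMem, ReverseEndian and BigEndianCPU given in Table A-1 can be written out as two one-line functions. The following is a minimal sketch, assuming the Status register is available as an integer and that the caller knows whether the processor is in User mode; the function names are hypothetical.

```python
def reverse_endian(status_register, user_mode):
    """ReverseEndian = (SR bit 25) and (User mode)."""
    re_bit = (status_register >> 25) & 1
    return re_bit & (1 if user_mode else 0)


def big_endian_cpu(big_endian_mem, status_register, user_mode):
    """BigEndianCPU = BigEndianMem XOR ReverseEndian (0 = Little, 1 = Big)."""
    return (big_endian_mem & 1) ^ reverse_endian(status_register, user_mode)


if __name__ == "__main__":
    sr_with_re = 1 << 25
    # Big-endian memory, RE set, User mode: loads and stores act little-endian.
    print(big_endian_cpu(1, sr_with_re, user_mode=True))
    # Same settings outside User mode: the RE bit has no effect.
    print(big_endian_cpu(1, sr_with_re, user_mode=False))
```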

A.1.3 Sign Extension and Zero Extension

With some instructions the bit length may be extended; for example, a 16-bit offset may be extended to 32 bits. This extension can take the form of either a sign extension or a zero extension.

• Sign extension

The extended part is filled with the value of the most significant bit.

(example)

                1001100101011100    16 bit
11111111111111111001100101011100    32 bit

• Zero extension

The extended part is filled with zeros.

(example)

                1001100101011100    16 bit
00000000000000001001100101011100    32 bit
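The two extensions can be reproduced with integer masks. The following is a minimal sketch that uses the 16-bit example value from this section (0x995C); the function names `sign_extend` and `zero_extend` are illustrative and are not taken from the manual.

```python
def zero_extend(value, from_bits):
    """Keep the low from_bits bits; all higher bits become zero."""
    return value & ((1 << from_bits) - 1)


def sign_extend(value, from_bits, to_bits):
    """Copy bit (from_bits - 1) into bits from_bits .. to_bits - 1."""
    low_mask = (1 << from_bits) - 1
    value &= low_mask
    if value >> (from_bits - 1):
        # The most significant bit is 1: fill the extended part with ones.
        value |= ((1 << to_bits) - 1) & ~low_mask
    return value


if __name__ == "__main__":
    example = 0b1001100101011100
    print(format(sign_extend(example, 16, 32), "032b"))
    print(format(zero_extend(example, 16), "032b"))
```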

A.1.4 Instruction Notation Examples

The following examples illustrate the application of some of the instruction notation conventions:

Example #1:

GPR[rt] ← immediate || 0^16

Sixteen zero bits are concatenated with an immediate value (typically 16 bits), and the 32-bit string (with the lower 16 bits set to zero) is assigned to general-purpose register rt.

Example #2:

(immediate_15)^16 || immediate_{15∼0}

Bit 15 (the sign bit) of an immediate value is extended for 16 bit positions, and the result is concatenated with bits 15 through 0 of the immediate value to form a 32-bit sign-extended value.

A.2 Load and Store Instructions

In the TX49 implementation, the instruction immediately following a load may use the contents of the register loaded. In such cases the hardware interlocks, which costs additional real cycles, so scheduling load delay slots is still desirable, although it is not required for functional code.

Two special instructions are provided in the TX49 implementation of the MIPS ISA, Load

Linked and Store Conditional. These instructions are used in carefully coded sequences to

provide one of several synchronization primitives, including test-and-set, bit-level locks,

semaphores, and sequencers / event counts.

In the load and store operation descriptions, the functions listed in Table A-2 are used to

summarize the handling of virtual addresses and physical memory.

Table A-2 Load and Store Common Functions

Function             Meaning

AddressTranslation   Uses the TLB to find the physical address given the virtual address. The function fails and an exception is taken if the required translation is not present in the TLB.

LoadMemory           Uses the cache and main memory to find the contents of the word containing the specified physical address. The low-order two bits of the address and the access type field indicate which of the four bytes within the data word need to be returned. If the cache is enabled for this access, the entire word is returned and loaded into the cache.

StoreMemory          Uses the cache, write buffer, and main memory to store the word or part of word specified as data in the word containing the specified physical address. The low-order two bits of the address and the access type field indicate which of the four bytes within the data word should be stored.

The access type field indicates the size of the data item to be loaded or stored as shown in Table A-3. Regardless of access type or byte-numbering order (endianness), the address specifies the byte which has the smallest byte address of the bytes in the addressed field. For a Big-endian machine, this is the leftmost byte and contains the sign for a 2's-complement number; for a Little-endian machine, this is the rightmost byte and contains the lowest precision byte.

Table A-3 Access Type Specifications for Loads/Stores

Access Type Mnemonic   Value   Meaning

DOUBLEWORD             7       doubleword (64 bits)
SEPTIBYTE              6       seven bytes (56 bits)
SEXTIBYTE              5       six bytes (48 bits)
QUINTIBYTE             4       five bytes (40 bits)
WORD                   3       word (32 bits)
TRIPLEBYTE             2       triple-byte (24 bits)
HALFWORD               1       halfword (16 bits)
BYTE                   0       byte (8 bits)

The bytes within the addressed doubleword which are used can be determined directly from the access type and the three low-order bits of the address, as shown in Chapter 2.
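In Table A-3 each access type value is one less than the number of bytes transferred. The following is a minimal sketch that captures the table as a dictionary and derives the sizes from the value; the names `ACCESS_TYPES`, `access_size_bytes` and `access_size_bits` are hypothetical, and the byte-lane selection described in Chapter 2 is not modelled.

```python
# Access type mnemonics and values as listed in Table A-3.
ACCESS_TYPES = {
    "BYTE": 0,
    "HALFWORD": 1,
    "TRIPLEBYTE": 2,
    "WORD": 3,
    "QUINTIBYTE": 4,
    "SEXTIBYTE": 5,
    "SEPTIBYTE": 6,
    "DOUBLEWORD": 7,
}


def access_size_bytes(access_type):
    """Number of bytes transferred for an access type value (0..7)."""
    if not 0 <= access_type <= 7:
        raise ValueError("access type must be in the range 0..7")
    return access_type + 1


def access_size_bits(access_type):
    """Number of bits transferred for an access type value (0..7)."""
    return 8 * access_size_bytes(access_type)


if __name__ == "__main__":
    for name, value in sorted(ACCESS_TYPES.items(), key=lambda kv: kv[1]):
        print("%-10s %d  %2d bits" % (name, value, access_size_bits(value)))
```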

A.3 Jump and Branch Instructions

All jump and branch instructions have an architectural delay of exactly one instruction. That is, the instruction immediately following a jump or branch (i.e., occupying the delay slot) is always executed while the target instruction is being fetched from storage. It is not valid for a delay slot to be occupied itself by a jump or branch instruction; however, this error is not detected, and the results of such an operation are undefined.

If an exception or interrupt prevents the completion of a legal instruction during a delay slot, the hardware sets the EPC register to point at the jump or branch instruction which precedes it. When the code is restarted, both the jump or branch instruction and the instruction in the delay slot are reexecuted.

Because jump and branch instructions may be restarted after exceptions or interrupts, they

must be restartable. Therefore, when a jump or branch instruction stores a return link value,

Since instructions must be word-aligned, a Jump Register or Jump and Link Register

instruction must use a register whose two low-order bits are zero. If these low-order bits are

not zero, an address exception will occur when the jump target instruction is subsequently

fetched.

A.4 Coprocessor Instructions

The MIPS architecture provides four coprocessor units, or classes. Coprocessors are

alternate execution units, which have separate register files from the CPU. R-Series

coprocessors have 2 register spaces, each with thirty-two 32-bit registers. The first space,

coprocessor general registers, may be directly loaded from memory and stored into memory,

and their contents may be transferred between the coprocessor and processor. The second,

coprocessor control registers, may only have their contents transferred directly between the

coprocessor and processor. Coprocessor instructions may alter registers in either space.

Normally, by convention, Coprocessor Control Register 0 is interpreted as a Coprocessor

Implementation And Revision register. However, the system control coprocessor (CP0) uses

Coprocessor General Register 15 for the processor / coprocessor revision register. The

second byte (bits 15∼8) is interpreted as a coprocessor unit implementation descriptor. The

revision number is a value of the form y.x where y is a major revision number in bits 7∼4 and x

is a minor revision number in bits 3∼0.

The contents of the high-order halfword of the register are not defined (currently read as 0

and should be 0 when written).
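The field layout of the revision register described above (implementation descriptor in bits 15∼8, major revision in bits 7∼4, minor revision in bits 3∼0) can be unpacked as follows. This is a minimal sketch; the function name `decode_revision` is hypothetical and the value 0x1234 is an arbitrary illustration, not the value read from any actual TX49 device.

```python
def decode_revision(register_value):
    """Split a revision register value into its three documented fields."""
    return {
        "implementation": (register_value >> 8) & 0xFF,  # bits 15..8
        "major": (register_value >> 4) & 0xF,            # bits 7..4 (y)
        "minor": register_value & 0xF,                   # bits 3..0 (x)
    }


def revision_string(register_value):
    """Format the revision number in the y.x form used in the text."""
    fields = decode_revision(register_value)
    return "%d.%d" % (fields["major"], fields["minor"])


if __name__ == "__main__":
    # Arbitrary example value chosen only to exercise the field positions.
    print(decode_revision(0x1234))
    print(revision_string(0x1234))
```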

A.5 System Control Coprocessor (CP0) Instructions

There are some special limitations imposed on operations involving CP0, which is incorporated within the CPU. Although load and store instructions to transfer data to and from coprocessors and move control to/from coprocessor instructions are generally permitted by the MIPS architecture, CP0 is given a somewhat protected status since it has responsibility for exception handling and memory management. Therefore, the move to/from coprocessor instructions are the only valid mechanism for reading from and writing to the CP0 registers.

Several coprocessor operation instructions are defined for CP0 to directly read, write, and probe TLB entries and to modify the operating modes in preparation for returning to User mode or interrupt-enabled states.

A.6 CPU Instructions

This appendix provides a detailed description of the operation of each TX49 instruction in

both 32- and 64-bit modes.

Exceptions that may occur due to the execution of each instruction are listed after the

description of each instruction.

For a detailed description of the exceptions, refer to Chapter 4.

ADD    Add

Bits    31∼26               25∼21   20∼16   15∼11   10∼6    5∼0
Field   SPECIAL (000000)    rs      rt      rd      00000   ADD (100000)
Width   6                   5       5       5       5       6

Format:

ADD rd, rs, rt

Description:

The contents of general register rs and the contents of general register rt are added to form the result. The result is placed into general register rd. In 64-bit mode, the operands must be valid sign-extended, 32-bit values.

An overflow exception occurs if the carries out of bits 30 and 31 differ (2's-complement overflow). The destination register rd is not modified when an integer overflow exception occurs.

Operation:

32  T:  GPR[rd] ← GPR[rs] + GPR[rt]

64  T:  temp ← GPR[rs] + GPR[rt]
        GPR[rd] ← (temp_31)^32 || temp_{31∼0}

Exceptions:

Integer overflow exception
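The overflow condition stated for ADD (the carries out of bits 30 and 31 differ) can be checked explicitly. The following is a minimal sketch for the 32-bit case; the function name `add32` is hypothetical, and instead of raising an exception it returns an overflow flag so that a caller can leave the destination register unmodified, as the description requires.

```python
def add32(rs_value, rt_value):
    """Return (sum, overflow) for a 32-bit two's-complement addition.

    overflow is True when the carry out of bit 30 differs from the
    carry out of bit 31.
    """
    a = rs_value & 0xFFFFFFFF
    b = rt_value & 0xFFFFFFFF
    # Carry out of bit 30: add the low 31 bits and look at bit 31.
    carry_out_30 = (((a & 0x7FFFFFFF) + (b & 0x7FFFFFFF)) >> 31) & 1
    # Carry out of bit 31: add all 32 bits and look at bit 32.
    carry_out_31 = ((a + b) >> 32) & 1
    return ((a + b) & 0xFFFFFFFF, carry_out_30 != carry_out_31)


if __name__ == "__main__":
    print(add32(0x7FFFFFFF, 1))   # largest positive value plus one
    print(add32(0xFFFFFFFF, 1))   # minus one plus one
```

In a simulator, the destination register would be written only when the returned flag is False.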

ADDI    Add Immediate

Bits    31∼26            25∼21   20∼16   15∼0
Field   ADDI (001000)    rs      rt      immediate
Width   6                5       5       16

Format:

ADDI rt, rs, immediate

Description:

The 16-bit immediate is sign-extended and added to the contents of general register rs to form the result. The result is placed into general register rt. In 64-bit mode, the operand must be a valid sign-extended, 32-bit value.

An overflow exception occurs if the carries out of bits 30 and 31 differ (2's-complement overflow). The destination register rt is not modified when an integer overflow exception occurs.

Operation:

32  T:  GPR[rt] ← GPR[rs] + (immediate_15)^16 || immediate_{15∼0}

64  T:  temp ← GPR[rs] + (immediate_15)^48 || immediate_{15∼0}
        GPR[rt] ← (temp_31)^32 || temp_{31∼0}

Exceptions:

Integer overflow exception

ADDIU Add Immediate Unsigned ADDIU

ADDIU

001001 rs immediatert

1516202125

2631 0

55 16

Format:

ADDIU rt, rs, immediate

Description:

The 16-bit immediate is sign-extended and added to the contents of general register rs to

form the result. The result is placed into general register rt. No integer overflow exception

occurs under any circumstances. In 64-bit mode, the operand must be valid sign-extended,

32-bit values.

The only difference between this instruction and the ADDI instruction is that ADDIU

never causes an overflow exception.

Operation:

32    T:    GPR[rt] ← GPR[rs] + ((immediate_15)^16 || immediate_15..0)
64    T:    temp ← GPR[rs] + ((immediate_15)^48 || immediate_15..0)
            GPR[rt] ← (temp_31)^32 || temp_31..0

Exceptions:

None
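
As an illustration of the 64-bit-mode ADDIU operation above, the sketch below sign-extends the immediate, adds, and sign-extends the low 32 bits of the sum without any overflow check. The function names are hypothetical and register values are modelled as unsigned 64-bit integers.

```python
MASK64 = (1 << 64) - 1


def sign_extend(value, bits, width=64):
    """Sign-extend the low `bits` bits of value to an unsigned `width`-bit integer."""
    value &= (1 << bits) - 1
    if value >> (bits - 1):
        value -= 1 << bits
    return value & ((1 << width) - 1)


def addiu(rs_val, immediate):
    """Model of ADDIU: wraps silently instead of raising an overflow exception."""
    temp = (rs_val + sign_extend(immediate, 16)) & MASK64
    return sign_extend(temp, 32)
```

With these definitions, addiu(5, 0xFFFF) adds -1 and yields 4, and addiu(0x7FFFFFFF, 1) wraps to 0xFFFFFFFF80000000 rather than trapping.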


ADDU Add Unsigned ADDU

Bits    31-26             25-21  20-16  15-11  10-6   5-0
Field   SPECIAL (000000)  rs     rt     rd     00000  ADDU (100001)
Width   6                 5      5      5      5      6

Format:

ADDU rd, rs, rt

Description:

The contents of general register rs and the contents of general register rt are added to form
the result. The result is placed into general register rd. No overflow exception occurs under
any circumstances. In 64-bit mode, the operands must be valid sign-extended, 32-bit values.

The only difference between this instruction and the ADD instruction is that ADDU never
causes an overflow exception.

Operation:

32    T:    GPR[rd] ← GPR[rs] + GPR[rt]
64    T:    temp ← GPR[rs] + GPR[rt]
            GPR[rd] ← (temp_31)^32 || temp_31..0

Exceptions:

None


AND And AND

Bits    31-26             25-21  20-16  15-11  10-6   5-0
Field   SPECIAL (000000)  rs     rt     rd     00000  AND (100100)
Width   6                 5      5      5      5      6

Format:

AND rd, rs, rt

Description:

The contents of general register rs are combined with the contents of general register rt in

a bit-wise logical AND operation. The result is placed into general register rd.

Operation:

32    T:    GPR[rd] ← GPR[rs] and GPR[rt]
64    T:    GPR[rd] ← GPR[rs] and GPR[rt]

Exceptions:

None


ANDI And Immediate ANDI

Bits    31-26          25-21  20-16  15-0
Field   ANDI (001100)  rs     rt     immediate
Width   6              5      5      16

Format:

ANDI rt, rs, immediate

Description:

The 16-bit immediate is zero-extended and combined with the contents of general register

rs in a bit-wise logical AND operation. The result is placed into general register rt.

Operation:

32    T:    GPR[rt] ← 0^16 || (immediate and GPR[rs]_15..0)
64    T:    GPR[rt] ← 0^48 || (immediate and GPR[rs]_15..0)

Exceptions:

None
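
A minimal sketch of the ANDI operation, given only to show the effect of zero-extension: because the immediate is zero-extended rather than sign-extended, every result bit above bit 15 is zero. The function name andi is hypothetical.

```python
def andi(rs_val, immediate):
    """Model of ANDI: AND GPR[rs] with the zero-extended 16-bit immediate."""
    return rs_val & (immediate & 0xFFFF)
```

For instance, andi(0xFFFFFFFFFFFF1234, 0xFF0F) gives 0x1204; the upper 48 bits of rs never reach the result.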


BCzF Branch On Coprocessor z False BCzF

Bits    31-26           25-21       20-16        15-0
Field   COPz (0100xx*)  BC (01000)  BCF (00000)  offset
Width   6               5           5            16

Format:

BCzF offset

Description:

A branch target address is computed from the sum of the address of the instruction in the

delay slot and the 16-bit offset, shifted left two bits and sign-extended. If coprocessor z’s

condition signal (CpCond), as sampled during the previous instruction, is false, then the

program branches to the target address with a delay of one instruction.

Because the condition line is sampled during the previous instruction, there must be at

least one instruction between this instruction and a coprocessor instruction that changes the

condition line.

Operation:

32    T-1:  condition ← not COC[z]
      T:    target ← (offset_15)^14 || offset || 0^2
      T+1:  if condition then
                PC ← PC + target
            endif
64    T-1:  condition ← not COC[z]
      T:    target ← (offset_15)^46 || offset || 0^2
      T+1:  if condition then
                PC ← PC + target
            endif

*See the table “Opcode Bit Encoding” for this instruction, or “CPU Instruction Opcode Bit

Encoding” at the end of Appendix A.


Exceptions:

Coprocessor unusable exception

Opcode Bit Encoding:

        Bits 31-26   Bits 25-21        Bits 20-16          Bits 15-0
        (COPz)       (BC sub-opcode)   (branch condition)
BC0F    010000       01000             00000               offset
BC1F    010001       01000             00000               offset
BC2F    010010       01000             00000               offset
BC3F    010011       01000             00000               offset

Bits 27-26 give the coprocessor unit number.

Note: CpCond0 = Write Buffer Empty (Empty → true (1), Not empty → false (0))
      CpCond1 = FPU (See Appendix B)
      CpCond2 = External Pin condition (GCPCOND2)
      CpCond3 = External Pin condition (GCPCOND3)
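
The field layout of the coprocessor branch instructions can be illustrated with a small encoder. This is a sketch only: encode_bc and the constant names are hypothetical, and the field values are taken from the instruction formats in this appendix (COPz opcode 0100xx, BC sub-opcode 01000, branch condition in bits 20-16).

```python
# Branch-condition field values (bits 20..16) for the four BCz variants.
BCF, BCT, BCFL, BCTL = 0b00000, 0b00001, 0b00010, 0b00011


def encode_bc(z, condition_field, offset):
    """Assemble a 32-bit BCz instruction word for coprocessor unit z (0..3)."""
    opcode = 0b010000 | (z & 0x3)          # COPz: 0100xx
    return ((opcode << 26)
            | (0b01000 << 21)              # BC sub-opcode
            | ((condition_field & 0x1F) << 16)
            | (offset & 0xFFFF))
```

For example, encode_bc(1, BCF, 4) produces 0x45000004.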


BCzFL Branch On Coprocessor z False Likely BCzFL

Bits    31-26           25-21       20-16         15-0
Field   COPz (0100xx*)  BC (01000)  BCFL (00010)  offset
Width   6               5           5             16

Format:

BCzFL offset

Description:

A branch target address is computed from the sum of the address of the instruction in the
delay slot and the 16-bit offset, shifted left two bits and sign-extended. If the contents of
coprocessor z’s condition line, as sampled during the previous instruction, is false, the target
address is branched to with a delay of one instruction.

If the conditional branch is not taken, the instruction in the branch delay slot is nullified.

Because the condition line is sampled during the previous instruction, there must be at
least one instruction between this instruction and a coprocessor instruction that changes the
condition line.

*See the table “Opcode Bit Encoding” for this instruction, or “CPU Instruction Opcode Bit

Encoding” at the end of Appendix A.


Operation:

32    T-1:  condition ← not COC[z]
      T:    target ← (offset_15)^14 || offset || 0^2
      T+1:  if condition then
                PC ← PC + target
            else
                NullifyCurrentInstruction
            endif
64    T-1:  condition ← not COC[z]
      T:    target ← (offset_15)^46 || offset || 0^2
      T+1:  if condition then
                PC ← PC + target
            else
                NullifyCurrentInstruction
            endif

Exceptions:

Coprocessor unusable exception

Opcode Bit Encoding:

        Bits 31-26   Bits 25-21        Bits 20-16          Bits 15-0
        (COPz)       (BC sub-opcode)   (branch condition)
BC0FL   010000       01000             00010               offset
BC1FL   010001       01000             00010               offset
BC2FL   010010       01000             00010               offset
BC3FL   010011       01000             00010               offset

Bits 27-26 give the coprocessor unit number.

Note: CpCond0 = Write Buffer Empty (Empty → true (1), Not empty → false (0))
      CpCond1 = FPU (See Appendix B)
      CpCond2 = External Pin condition (GCPCOND2)
      CpCond3 = External Pin condition (GCPCOND3)


BCzT Branch On Coprocessor z True BCzT

Bits    31-26           25-21       20-16        15-0
Field   COPz (0100xx*)  BC (01000)  BCT (00001)  offset
Width   6               5           5            16

Format:

BCzT offset

Description:

A branch target address is computed from the sum of the address of the instruction in the
delay slot and the 16-bit offset, shifted left two bits and sign-extended. If coprocessor z’s
condition signal (CpCond) is true, then the program branches to the target address, with a
delay of one instruction.

Because the condition line is sampled during the previous instruction, there must be at
least one instruction between this instruction and a coprocessor instruction that changes the
condition line.

Operation:

32    T-1:  condition ← COC[z]
      T:    target ← (offset_15)^14 || offset || 0^2
      T+1:  if condition then
                PC ← PC + target
            endif
64    T-1:  condition ← COC[z]
      T:    target ← (offset_15)^46 || offset || 0^2
      T+1:  if condition then
                PC ← PC + target
            endif

*See the table “Opcode Bit Encoding” for this instruction, or “CPU Instruction Opcode Bit

Encoding” at the end of Appendix A.


Exceptions:

Coprocessor unusable exception

Opcode Bit Encoding:

        Bits 31-26   Bits 25-21        Bits 20-16          Bits 15-0
        (COPz)       (BC sub-opcode)   (branch condition)
BC0T    010000       01000             00001               offset
BC1T    010001       01000             00001               offset
BC2T    010010       01000             00001               offset
BC3T    010011       01000             00001               offset

Bits 27-26 give the coprocessor unit number.

Note: CpCond0 = Write Buffer Empty (Empty → true (1), Not empty → false (0))
      CpCond1 = FPU (See Appendix B)
      CpCond2 = External Pin condition (GCPCOND2)
      CpCond3 = External Pin condition (GCPCOND3)


BCzTL Branch On Coprocessor z True Likely BCzTL

Bits    31-26           25-21       20-16         15-0
Field   COPz (0100xx*)  BC (01000)  BCTL (00011)  offset
Width   6               5           5             16

Format:

BCzTL offset

Description:

A branch target address is computed from the sum of the address of the instruction in the

delay slot and the 16-bit offset, shifted left two bits and sign-extended. If the contents of

coprocessor z’s condition line, as sampled during the previous instruction, is true, the target

address is branched to with a delay of one instruction.

If the conditional branch is not taken, the instruction in the branch delay slot is nullified.

Because the condition line is sampled during the previous instruction, there must be at

least one instruction between this instruction and a coprocessor instruction that changes the

condition line.

Operation:

32    T-1:  condition ← COC[z]
      T:    target ← (offset_15)^14 || offset || 0^2
      T+1:  if condition then
                PC ← PC + target
            else
                NullifyCurrentInstruction
            endif
64    T-1:  condition ← COC[z]
      T:    target ← (offset_15)^46 || offset || 0^2
      T+1:  if condition then
                PC ← PC + target
            else
                NullifyCurrentInstruction
            endif

*See the table “Opcode Bit Encoding” for this instruction, or “CPU Instruction Opcode Bit

Encoding” at the end of Appendix A.


Exceptions:

Coprocessor unusable exception

Opcode Bit Encoding:

        Bits 31-26   Bits 25-21        Bits 20-16          Bits 15-0
        (COPz)       (BC sub-opcode)   (branch condition)
BC0TL   010000       01000             00011               offset
BC1TL   010001       01000             00011               offset
BC2TL   010010       01000             00011               offset
BC3TL   010011       01000             00011               offset

Bits 27-26 give the coprocessor unit number.

Note: CpCond0 = Write Buffer Empty (Empty → true (1), Not empty → false (0))
      CpCond1 = FPU (See Appendix B)
      CpCond2 = External Pin condition (GCPCOND2)
      CpCond3 = External Pin condition (GCPCOND3)


BEQ Branch On Equal BEQ

Bits    31-26         25-21  20-16  15-0
Field   BEQ (000100)  rs     rt     offset
Width   6             5      5      16

Format:

BEQ rs, rt, offset

Description:

A branch target address is computed from the sum of the address of the instruction in the

delay slot and the 16-bit offset, shifted left two bits and sign-extended. The contents of

general register rs and the contents of general register rt are compared. If the two registers

are equal, then the program branches to the target address, with a delay of one instruction.

Operation:

32    T:    target ← (offset_15)^14 || offset || 0^2
            condition ← (GPR[rs] = GPR[rt])
      T+1:  if condition then
                PC ← PC + target
            endif
64    T:    target ← (offset_15)^46 || offset || 0^2
            condition ← (GPR[rs] = GPR[rt])
      T+1:  if condition then
                PC ← PC + target
            endif

Exceptions:

None
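
The branch target computation shared by the conditional branches in this appendix can be sketched as follows. The function name branch_target is hypothetical; the sketch assumes 64-bit addresses and takes the address of the branch instruction itself, so the delay slot is at that address plus 4.

```python
MASK64 = (1 << 64) - 1


def branch_target(branch_pc, offset16):
    """Target = delay-slot address + (sign-extended 16-bit offset shifted left 2)."""
    offset = offset16 & 0xFFFF
    if offset & 0x8000:                 # sign-extend the 16-bit field
        offset -= 0x10000
    delay_slot = branch_pc + 4
    return (delay_slot + (offset << 2)) & MASK64
```

With an offset field of 0xFFFF (that is, -1) a branch at 0x1000 targets 0x1000, the branch itself; with an offset of 3 it targets 0x1010.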


BEQL Branch On Equal Likely BEQL

Bits    31-26          25-21  20-16  15-0
Field   BEQL (010100)  rs     rt     offset
Width   6              5      5      16

Format:

BEQL rs, rt, offset

Description:

A branch target address is computed from the sum of the address of the instruction in the

delay slot and the 16-bit offset, shifted left two bits and sign-extended. The contents of

general register rs and the contents of general register rt are compared. If the two registers

are equal, the target address is branched to, with a delay of one instruction. If the

conditional branch is not taken, the instruction in the branch delay slot is nullified.

Operation:

32    T:    target ← (offset_15)^14 || offset || 0^2
            condition ← (GPR[rs] = GPR[rt])
      T+1:  if condition then
                PC ← PC + target
            else
                NullifyCurrentInstruction
            endif
64    T:    target ← (offset_15)^46 || offset || 0^2
            condition ← (GPR[rs] = GPR[rt])
      T+1:  if condition then
                PC ← PC + target
            else
                NullifyCurrentInstruction
            endif

Exceptions:

None
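
The difference between a "likely" branch and its ordinary counterpart is what happens to the delay slot when the branch is not taken. The sketch below models BEQL under the assumption that the caller supplies the address of the branch instruction; the function name beql and its return convention are hypothetical.

```python
def beql(rs_val, rt_val, branch_pc, offset16):
    """Return (next_pc_after_delay_slot, delay_slot_executed) for BEQL."""
    offset = offset16 & 0xFFFF
    if offset & 0x8000:                 # sign-extend the 16-bit field
        offset -= 0x10000
    if rs_val == rt_val:
        # Taken: the delay-slot instruction executes, then control reaches the target.
        return branch_pc + 4 + (offset << 2), True
    # Not taken: the delay-slot instruction is nullified and execution falls through.
    return branch_pc + 8, False
```

For an ordinary BEQ the second element would be True in both cases, since the delay-slot instruction always executes.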


BGEZ Branch On Greater Than Or Equal To Zero BGEZ

Bits    31-26            25-21  20-16         15-0
Field   REGIMM (000001)  rs     BGEZ (00001)  offset
Width   6                5      5             16

Format:

BGEZ rs, offset

Description:

A branch target address is computed from the sum of the address of the instruction in the

delay slot and the 16-bit offset, shifted left two bits and sign-extended. If the contents of

general register rs have the sign bit cleared, then the program branches to the target

address, with a delay of one instruction.

Operation:

32    T:    target ← (offset_15)^14 || offset || 0^2
            condition ← (GPR[rs]_31 = 0)
      T+1:  if condition then
                PC ← PC + target
            endif
64    T:    target ← (offset_15)^46 || offset || 0^2
            condition ← (GPR[rs]_63 = 0)
      T+1:  if condition then
                PC ← PC + target
            endif

Exceptions:

None


BGEZAL Branch On Greater Than Or Equal To Zero And Link BGEZAL

Bits    31-26            25-21  20-16           15-0
Field   REGIMM (000001)  rs     BGEZAL (10001)  offset
Width   6                5      5               16

Format:

BGEZAL rs, offset

Description:

A branch target address is computed from the sum of the address of the instruction in the

delay slot and the 16-bit offset, shifted left two bits and sign-extended. Unconditionally, the
address of the instruction after the delay slot is placed in the link register, r31. If the
contents of general register rs have the sign bit cleared, then the program branches to the
target address, with a delay of one instruction.

General register rs may not be general register 31, because such an instruction is not
restartable. An attempt to execute this instruction with register 31 specified as rs is not
trapped, however.

Operation:

32    T:    target ← (offset_15)^14 || offset || 0^2
            condition ← (GPR[rs]_31 = 0)
            GPR[31] ← PC + 8
      T+1:  if condition then
                PC ← PC + target
            endif
64    T:    target ← (offset_15)^46 || offset || 0^2
            condition ← (GPR[rs]_63 = 0)
            GPR[31] ← PC + 8
      T+1:  if condition then
                PC ← PC + target
            endif

Exceptions:

None
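
The "and link" behaviour can be illustrated with a short model of the 64-bit operation: the link register is written whether or not the branch is taken. The function name bgezal and its return convention are hypothetical, and the caller is assumed to pass the address of the branch instruction.

```python
MASK64 = (1 << 64) - 1


def bgezal(rs_val, branch_pc, offset16):
    """Return (next_pc_after_delay_slot, link_value) for BGEZAL in 64-bit mode."""
    offset = offset16 & 0xFFFF
    if offset & 0x8000:                          # sign-extend the 16-bit field
        offset -= 0x10000
    link = (branch_pc + 8) & MASK64              # written to r31 unconditionally
    taken = ((rs_val >> 63) & 1) == 0            # sign bit cleared
    if taken:
        next_pc = (branch_pc + 4 + (offset << 2)) & MASK64
    else:
        next_pc = link
    return next_pc, link
```

Both bgezal(0, 0x400, 0x10) and bgezal(1 << 63, 0x400, 0x10) return a link value of 0x408; only the first transfers control to 0x444.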


BGEZALL Branch On Greater Than Or Equal To Zero And Link Likely BGEZALL

Bits    31-26            25-21  20-16            15-0
Field   REGIMM (000001)  rs     BGEZALL (10011)  offset
Width   6                5      5                16

Format:

BGEZALL rs, offset

Description:

A branch target address is computed from the sum of the address of the instruction in the
delay slot and the 16-bit offset, shifted left two bits and sign-extended. Unconditionally, the
address of the instruction after the delay slot is placed in the link register, r31. If the
contents of general register rs have the sign bit cleared, then the program branches to the
target address, with a delay of one instruction.

General register rs may not be general register 31, because such an instruction is not
restartable. An attempt to execute this instruction is not trapped, however. If the conditional
branch is not taken, the instruction in the branch delay slot is nullified.

Operation:

32    T:    target ← (offset_15)^14 || offset || 0^2
            condition ← (GPR[rs]_31 = 0)
            GPR[31] ← PC + 8
      T+1:  if condition then
                PC ← PC + target
            else
                NullifyCurrentInstruction
            endif
64    T:    target ← (offset_15)^46 || offset || 0^2
            condition ← (GPR[rs]_63 = 0)
            GPR[31] ← PC + 8
      T+1:  if condition then
                PC ← PC + target
            else
                NullifyCurrentInstruction
            endif

Exceptions:

None


BGEZL Branch On Greater Than Or Equal To Zero Likely BGEZL

Bits    31-26            25-21  20-16          15-0
Field   REGIMM (000001)  rs     BGEZL (00011)  offset
Width   6                5      5              16

Format:

BGEZL rs, offset

Description:

A branch target address is computed from the sum of the address of the instruction in the

delay slot and the 16-bit offset, shifted left two bits and sign-extended. If the contents of

general register rs have the sign bit cleared, then the program branches to the target

address, with a delay of one instruction. If the conditional branch is not taken, the

instruction in the branch delay slot is nullified.

Operation:

32    T:    target ← (offset_15)^14 || offset || 0^2
            condition ← (GPR[rs]_31 = 0)
      T+1:  if condition then
                PC ← PC + target
            else
                NullifyCurrentInstruction
            endif
64    T:    target ← (offset_15)^46 || offset || 0^2
            condition ← (GPR[rs]_63 = 0)
      T+1:  if condition then
                PC ← PC + target
            else
                NullifyCurrentInstruction
            endif

Exceptions:

None


BGTZ Branch On Greater Than Zero BGTZ

Bits    31-26          25-21  20-16  15-0
Field   BGTZ (000111)  rs     00000  offset
Width   6              5      5      16

Format:

BGTZ rs, offset

Description:

A branch target address is computed from the sum of the address of the instruction in the

delay slot and the 16-bit offset, shifted left two bits and sign-extended. The contents of

general register rs are compared to zero. If the contents of general register rs have the sign

bit cleared and are not equal to zero, then the program branches to the target address, with a

delay of one instruction.

Operation:

32    T:    target ← (offset_15)^14 || offset || 0^2
            condition ← (GPR[rs]_31 = 0) and (GPR[rs] ≠ 0^32)
      T+1:  if condition then
                PC ← PC + target
            endif
64    T:    target ← (offset_15)^46 || offset || 0^2
            condition ← (GPR[rs]_63 = 0) and (GPR[rs] ≠ 0^64)
      T+1:  if condition then
                PC ← PC + target
            endif

Exceptions:

None


BGTZL Branch On Greater Than Zero Likely BGTZL

Bits    31-26           25-21  20-16  15-0
Field   BGTZL (010111)  rs     00000  offset
Width   6               5      5      16

Format:

BGTZL rs, offset

Description:

A branch target address is computed from the sum of the address of the instruction in the

delay slot and the 16-bit offset, shifted left two bits and sign-extended. The contents of

general register rs are compared to zero. If the contents of general register rs have the sign

bit cleared and are not equal to zero, then the program branches to the target address, with a

delay of one instruction. If the conditional branch is not taken, the instruction in the branch

delay slot is nullified.

Operation:

32    T:    target ← (offset_15)^14 || offset || 0^2
            condition ← (GPR[rs]_31 = 0) and (GPR[rs] ≠ 0^32)
      T+1:  if condition then
                PC ← PC + target
            else
                NullifyCurrentInstruction
            endif
64    T:    target ← (offset_15)^46 || offset || 0^2
            condition ← (GPR[rs]_63 = 0) and (GPR[rs] ≠ 0^64)
      T+1:  if condition then
                PC ← PC + target
            else
                NullifyCurrentInstruction
            endif

Exceptions:

None


BLEZ Branch On Less Than Or Equal To Zero BLEZ

Bits    31-26          25-21  20-16  15-0
Field   BLEZ (000110)  rs     00000  offset
Width   6              5      5      16

Format:

BLEZ rs, offset

Description:

A branch target address is computed from the sum of the address of the instruction in the

delay slot and the 16-bit offset, shifted left two bits and sign-extended. The contents of

general register rs are compared to zero. If the contents of general register rs have the sign

bit set, or are equal to zero, then the program branches to the target address, with a delay of

one instruction.

Operation:

32    T:    target ← (offset_15)^14 || offset || 0^2
            condition ← (GPR[rs]_31 = 1) or (GPR[rs] = 0^32)
      T+1:  if condition then
                PC ← PC + target
            endif
64    T:    target ← (offset_15)^46 || offset || 0^2
            condition ← (GPR[rs]_63 = 1) or (GPR[rs] = 0^64)
      T+1:  if condition then
                PC ← PC + target
            endif

Exceptions:

None
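
The BGTZ and BLEZ conditions are exact complements, which the following sketch of the 64-bit condition tests makes explicit. The function names are hypothetical and register values are modelled as unsigned 64-bit integers.

```python
MASK64 = (1 << 64) - 1


def sign_bit(value):
    """Bit 63 of a 64-bit register value."""
    return (value >> 63) & 1


def bgtz_condition(rs_val):
    """BGTZ: sign bit cleared and value not equal to zero."""
    rs_val &= MASK64
    return sign_bit(rs_val) == 0 and rs_val != 0


def blez_condition(rs_val):
    """BLEZ: sign bit set, or value equal to zero."""
    rs_val &= MASK64
    return sign_bit(rs_val) == 1 or rs_val == 0
```

For every register value exactly one of the two conditions holds.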


BLEZL Branch On Less Than Or Equal To Zero Likely BLEZL

Bits    31-26           25-21  20-16  15-0
Field   BLEZL (010110)  rs     00000  offset
Width   6               5      5      16

Format:

BLEZL rs, offset

Description:

A branch target address is computed from the sum of the address of the instruction in the

delay slot and the 16-bit offset, shifted left two bits and sign-extended. The contents of

general register rs are compared to zero. If the contents of general register rs have the sign bit

set, or are equal to zero, then the program branches to the target address, with a delay of one

instruction.

If the conditional branch is not taken, the instruction in the branch delay slot is nullified.

Operation:

32    T:    target ← (offset_15)^14 || offset || 0^2
            condition ← (GPR[rs]_31 = 1) or (GPR[rs] = 0^32)
      T+1:  if condition then
                PC ← PC + target
            else
                NullifyCurrentInstruction
            endif
64    T:    target ← (offset_15)^46 || offset || 0^2
            condition ← (GPR[rs]_63 = 1) or (GPR[rs] = 0^64)
      T+1:  if condition then
                PC ← PC + target
            else
                NullifyCurrentInstruction
            endif

Exceptions:

None


BLTZ Branch On Less Than Zero BLTZ

Bits    31-26            25-21  20-16         15-0
Field   REGIMM (000001)  rs     BLTZ (00000)  offset
Width   6                5      5             16

Format:

BLTZ rs, offset

Description:

A branch target address is computed from the sum of the address of the instruction in the

delay slot and the 16-bit offset, shifted left two bits and sign-extended. If the contents of

general register rs have the sign bit set, then the program branches to the target address,

with a delay of one instruction.

Operation:

32    T:    target ← (offset_15)^14 || offset || 0^2
            condition ← (GPR[rs]_31 = 1)
      T+1:  if condition then
                PC ← PC + target
            endif
64    T:    target ← (offset_15)^46 || offset || 0^2
            condition ← (GPR[rs]_63 = 1)
      T+1:  if condition then
                PC ← PC + target
            endif

Exceptions:

None


BLTZAL Branch On Less Than Zero And Link BLTZAL

Bits    31-26            25-21  20-16           15-0
Field   REGIMM (000001)  rs     BLTZAL (10000)  offset
Width   6                5      5               16

Format:

BLTZAL rs, offset

Description:

A branch target address is computed from the sum of the address of the instruction in the

delay slot and the 16-bit offset, shifted left two bits and sign-extended. Unconditionally, the
address of the instruction after the delay slot is placed in the link register, r31. If the

contents of general register rs have the sign bit set, then the program branches to the target

address, with a delay of one instruction.

General register rs may not be general register 31, because such an instruction is not

restartable. An attempt to execute this instruction with register 31 specified as rs is not

trapped, however.

Operation:

32    T:    target ← (offset_15)^14 || offset || 0^2
            condition ← (GPR[rs]_31 = 1)
            GPR[31] ← PC + 8
      T+1:  if condition then
                PC ← PC + target
            endif
64    T:    target ← (offset_15)^46 || offset || 0^2
            condition ← (GPR[rs]_63 = 1)
            GPR[31] ← PC + 8
      T+1:  if condition then
                PC ← PC + target
            endif

Exceptions:

None


BLTZALL Branch On Less Than Zero And Link Likely BLTZALL

Bits    31-26            25-21  20-16            15-0
Field   REGIMM (000001)  rs     BLTZALL (10010)  offset
Width   6                5      5                16

Format:

BLTZALL rs, offset

Description:

A branch target address is computed from the sum of the address of the instruction in the

delay slot and the 16-bit offset, shifted left two bits and sign-extended. Unconditionally, the
address of the instruction after the delay slot is placed in the link register, r31. If the
contents of general register rs have the sign bit set, then the program branches to the target
address, with a delay of one instruction.

General register rs may not be general register 31, because such an instruction is not
restartable. An attempt to execute this instruction with register 31 specified as rs is not
trapped, however. If the conditional branch is not taken, the instruction in the branch delay
slot is nullified.

Operation:

32    T:    target ← (offset_15)^14 || offset || 0^2
            condition ← (GPR[rs]_31 = 1)
            GPR[31] ← PC + 8
      T+1:  if condition then
                PC ← PC + target
            else
                NullifyCurrentInstruction
            endif
64    T:    target ← (offset_15)^46 || offset || 0^2
            condition ← (GPR[rs]_63 = 1)
            GPR[31] ← PC + 8
      T+1:  if condition then
                PC ← PC + target
            else
                NullifyCurrentInstruction
            endif

Exceptions:

None


BLTZL Branch On Less Than Zero Likely BLTZL

Bits    31-26            25-21  20-16          15-0
Field   REGIMM (000001)  rs     BLTZL (00010)  offset
Width   6                5      5              16

Format:

BLTZL rs, offset

Description:

A branch target address is computed from the sum of the address of the instruction in the

delay slot and the 16-bit offset, shifted left two bits and sign-extended. If the contents of

general register rs have the sign bit set, then the program branches to the target address,

with a delay of one instruction. If the conditional branch is not taken, the instruction in the

branch delay slot is nullified.

Operation:

32    T:    target ← (offset_15)^14 || offset || 0^2
            condition ← (GPR[rs]_31 = 1)
      T+1:  if condition then
                PC ← PC + target
            else
                NullifyCurrentInstruction
            endif
64    T:    target ← (offset_15)^46 || offset || 0^2
            condition ← (GPR[rs]_63 = 1)
      T+1:  if condition then
                PC ← PC + target
            else
                NullifyCurrentInstruction
            endif

Exceptions:

None


BNE Branch On Not Equal BNE

Bits    31-26         25-21  20-16  15-0
Field   BNE (000101)  rs     rt     offset
Width   6             5      5      16

Format:

BNE rs, rt, offset

Description:

A branch target address is computed from the sum of the address of the instruction in the

delay slot and the 16-bit offset, shifted left two bits and sign-extended. The contents of

general register rs and the contents of general register rt are compared. If the two registers

are not equal, then the program branches to the target address, with a delay of one

instruction.

Operation:

32    T:    target ← (offset_15)^14 || offset || 0^2
            condition ← (GPR[rs] ≠ GPR[rt])
      T+1:  if condition then
                PC ← PC + target
            endif
64    T:    target ← (offset_15)^46 || offset || 0^2
            condition ← (GPR[rs] ≠ GPR[rt])
      T+1:  if condition then
                PC ← PC + target
            endif

Exceptions:

None


BNEL Branch On Not Equal Likely BNEL

Bits    31-26          25-21  20-16  15-0
Field   BNEL (010101)  rs     rt     offset
Width   6              5      5      16

Format:

BNEL rs, rt, offset

Description:

A branch target address is computed from the sum of the address of the instruction in the

delay slot and the 16-bit offset, shifted left two bits and sign-extended. The contents of

general register rs and the contents of general register rt are compared. If the two registers

are not equal, then the program branches to the target address, with a delay of one

instruction.

If the conditional branch is not taken, the instruction in the branch delay slot is nullified.

Operation:

32    T:    target ← (offset_15)^14 || offset || 0^2
            condition ← (GPR[rs] ≠ GPR[rt])
      T+1:  if condition then
                PC ← PC + target
            else
                NullifyCurrentInstruction
            endif
64    T:    target ← (offset_15)^46 || offset || 0^2
            condition ← (GPR[rs] ≠ GPR[rt])
      T+1:  if condition then
                PC ← PC + target
            else
                NullifyCurrentInstruction
            endif

Exceptions:

None


BREAK Breakpoint BREAK

Bits    31-26             25-6  5-0
Field   SPECIAL (000000)  code  BREAK (001101)
Width   6                 20    6

Format:

BREAK

Description:

A breakpoint trap occurs, immediately and unconditionally transferring control to the

exception handler.

The code field is available for use as software parameters, but is retrieved by the exception

handler only by loading the contents of the memory word containing the instruction.

Operation:

32, 64    T:    BreakpointException

Exceptions:

Breakpoint exception
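
Since the code field can only be recovered from the instruction word itself, an exception handler has to load that word and extract bits 25-6. The sketch below shows the extraction only; decode_break is a hypothetical name, and how the handler obtains the instruction word is outside the scope of this example.

```python
def decode_break(word):
    """Return the 20-bit code field of a BREAK instruction word, or None if not BREAK."""
    is_special = (word >> 26) & 0x3F == 0b000000
    is_break = (word & 0x3F) == 0b001101
    if not (is_special and is_break):
        return None
    return (word >> 6) & 0xFFFFF
```

For example, decode_break(0x0000000D) returns 0 and decode_break((7 << 6) | 0x0D) returns 7.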


CACHE Cache CACHE

Bits    31-26           25-21  20-16  15-0
Field   CACHE (101111)  base   op     offset
Width   6               5      5      16

Format:

CACHE op, offset(base)

Description:

Generates a virtual address by sign-extending the 16-bit offset and adding the result to the
contents of register base. The virtual address is translated to a physical address using the
TLB, and the 5-bit sub-opcode designates the cache operation to be performed at that
address.

If CP0 is not usable (User or Supervisor mode with the CP0 enable bit in the Status register
cleared), a Coprocessor Unusable exception is raised. The behavior of this instruction
for operation and cache combinations other than those listed in the table below, and when
used with an uncached address, is undefined.

Index operations designate a cache block using part of the virtual address.

The memory address specified in a CACHE instruction must be in a cacheable area. If an
uncacheable area is specified, the operation is not guaranteed on the TX49. If the instruction
is issued for the cache line in which the instruction itself resides, the operation that follows
is not guaranteed.

For Index operations, the way is selected by the two least significant bits (bits 1∼0) of the
virtual address.

Virtual Address bits (1:0)   Selected Way
00                           Way 0
01                           Way 1
10                           Way 2
11                           Way 3

The Hit operation accesses the specified cache as normal data references, and performs the

specified operation if the cache block contains valid data with the specified physical address

(a hit). If the cache block is invalid or contains a different address (a miss), no operation is

performed. Write back from a cache goes to memory. The address to be written is specified

by the cache tag and not the translated physical address. TLB Refill and TLB Invalid

exceptions can occur on any operation. For Index operations (where the physical address is

used to index the cache but need not match the cache tag) unmapped addresses may be used

to avoid TLB exceptions. This operation never causes TLB Modified or Virtual Coherency

exceptions. Bits 17∼16 of the instruction specify the cache as follows:

Code  Name  Cache
0     I     Primary instruction
1     D     Primary data
2     -     reserved
3     -     reserved


Bits 20∼18 of the instruction specify the operation as follows:

Code  Caches  Name                        Operation
0     I       Index Invalidate            Set the cache state of the indexed block to invalid.
0     D       Index WriteBack Invalidate  Examine the cache state and W bit of the primary data
                                          cache block at the index specified by the virtual
                                          address. If the state is not invalid and the W bit is
                                          set, then write back the block to memory. The address
                                          to write is taken from the primary cache tag. Set the
                                          cache state of the primary cache block to invalid.
                                          LSB (bits 1∼0) of VA select the way.
1     I / D   Index Load Tag              Read the tag for the cache block at the specified index
                                          and place it into the TagLo and TagHi CP0 registers.
                                          LSB (bits 1∼0) of VA select the way.
2     I / D   Index Store Tag             Write the tag for the cache block at the specified index
                                          from the TagLo and TagHi CP0 registers. LSB (bits 1∼0)
                                          of VA select the way.
3     I       Undefined                   Undefined
3     D       Create Dirty Exclusive      This operation is used to avoid loading data needlessly
                                          from memory when writing new contents into an entire
                                          cache block. If the cache block does not contain the
                                          specified address, and the block is dirty, write it
                                          back to memory. In all cases, set the cache block tag
                                          to the specified physical address and set the cache
                                          state to Dirty Exclusive.
4     I / D   Hit Invalidate              If the cache block contains the specified address, mark
                                          the cache block invalid. In case of a multi-hit, the
                                          lock bits of the specified line become ineffective and
                                          all ways are invalidated.
5     I       Fill                        Fill the primary instruction cache block from memory.
                                          LSB (bits 1∼0) of VA select the way.
5     D       Hit WriteBack Invalidate    If the cache block contains the specified address, write
                                          back the data if it is dirty, and mark the cache block
                                          invalid.
6     I       Undefined                   Undefined
6     D       Hit WriteBack               If the cache block contains the specified address, and
                                          the W bit is set, write back the data to memory, and
                                          clear the W bit.
7     I       Undefined                   Undefined
7     D       Fill                        Fill the primary data cache block from memory. LSB
                                          (bits 1∼0) of VA select the way.


Operation:

32, 64    T:    vAddr ← ((offset_15)^48 || offset_15..0) + GPR[base]
                (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
                CacheOp (op, vAddr, pAddr)

Exceptions:

Coprocessor unusable exception

TLB refill exception

TLB invalid exception
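
The 5-bit op field combines the operation code (bits 20∼18) with the cache selector (bits 17∼16). The sketch below composes the field, assembles a CACHE instruction word, and places a way number in the low two address bits for Index operations. All function and constant names are hypothetical.

```python
CACHE_I, CACHE_D = 0, 1          # cache selector, instruction bits 17..16


def cache_op(operation, cache):
    """Combine a 3-bit operation code (bits 20..18) with a 2-bit cache code (bits 17..16)."""
    return ((operation & 0x7) << 2) | (cache & 0x3)


def encode_cache(base, op, offset):
    """Assemble a CACHE instruction word: opcode 101111, base, op, 16-bit offset."""
    return ((0b101111 << 26) | ((base & 0x1F) << 21)
            | ((op & 0x1F) << 16) | (offset & 0xFFFF))


def index_address(index_addr, way):
    """For Index operations: select the way with virtual address bits 1..0."""
    return (index_addr & ~0x3) | (way & 0x3)
```

As a usage example, cache_op(5, CACHE_D) (Hit WriteBack Invalidate on the data cache) is 0x15, and encode_cache(4, 0x15, 0) yields the word 0xBC950000.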


CFC0 Move Control From Coprocessor 0 CFC0

Bits    31-26          25-21  20-16  15-11  10-0
Field   COP0 (010000)  00010  rt     rd     0 (000 0000 0000)
Width   6              5      5      5      11

Format:

CFC0 rt, rd

Description:

For ICE system only.

Loads the contents of Monitor memory into the general-purpose register rt.

Operation:

32    T:    data ← CCR[0,rd]
      T+1:  GPR[rt] ← data
64    T:    data ← (CCR[0,rd]_31)^32 || CCR[0,rd]
      T+1:  GPR[rt] ← data

Exceptions:

Coprocessor Unusable exception


CFCz Move Control From Coprocessor CFCz

Bits    31-26           25-21  20-16  15-11  10-0
Field   COPz (0100xx*)  00010  rt     rd     0 (000 0000 0000)
Width   6               5      5      5      11

Format:

CFCz rt, rd

Description:

The contents of coprocessor control register rd of coprocessor unit z are loaded into general
register rt.

Operation:

32    T:    data ← CCR[z,rd]
      T+1:  GPR[rt] ← data
64    T:    data ← (CCR[z,rd]_31)^32 || CCR[z,rd]
      T+1:  GPR[rt] ← data

Exceptions:

Coprocessor unusable exception

Reserved Instruction exception (CFC3)

∗Opcode Bit Encoding:

        Bits 31-26   Bits 25-21                   Bits 20-0
        (COPz)       (coprocessor suboperation)
CFC1    010001       00010                        rt, rd, 0
CFC2    010010       00010                        rt, rd, 0

Bits 27-26 give the coprocessor unit number.

Note: CFC1 for FPU (See Appendix B)
      CFC2 for Coprocessor 2 (user defined)


COPz Coprocessor z Operation COPz

Bits    31-26           25      24-0
Field   COPz (0100xx*)  CO (1)  cofun
Width   6               1       25

Format:

COPz cofun

Description:

A coprocessor operation is performed. The operation may specify and reference internal

coprocessor registers, and may change the state of the coprocessor condition line, but does

not modify state within the processor or the cache / memory system. Details of coprocessor 1

operations are contained in Appendix B.

Operation:

32, 64 T: CoprocessorOperation(z, cofun)

Exceptions:

Coprocessor unusable exception

Coprocessor interrupt or Floating-Point Exception (CP1 only)

Reserved Instruction exception (COP3)

∗Opcode Bit Encoding:

                 Opcode   Coprocessor    CO sub-opcode
                          Unit Number    (see end of Appendix A)
COPz   Bit #     31-28    27-26          25
       COP0      0100     00             1
       COP1      0100     01             1
       COP2      0100     10             1
       COP3      0100     11             1

Note: COP0 for ICE system
      COP1 for FPU (see Appendix B)
      COP2 for Coprocessor 2 (user defined)


CTC0 Move Control To Coprocessor 0 CTC0

Bits     31-26    25-21   20-16   15-11   10-0
Field    COP0             rt      rd      0
Value    010000   00110                   000 0000 0000
Width    6        5       5       5       11

Format:

CTC0 rt, rd

Description:

For ICE system only.

Loads the contents of general-purpose register rt into the Monitor memory.

Operation:

32, 64  T:      data ← GPR[rt]
        T + 1:  CCR[0,rd] ← data

Exceptions:

Coprocessor Unusable exception


CTCz Move Control To Coprocessor z CTCz

Bits     31-26     25-21   20-16   15-11   10-0
Field    COPz              rt      rd      0
Value    0100xx∗   00110                   000 0000 0000
Width    6         5       5       5       11

Format:

CTCz rt, rd

Description:

The contents of general register rt are loaded into control register rd of coprocessor unit z.

Operation:

32, 64  T:      data ← GPR[rt]
        T + 1:  CCR[z,rd] ← data

Exceptions:

Coprocessor unusable exception

Reserved Instruction exception (CTC3)

∗Opcode Bit Encoding:

                 Opcode   Coprocessor    Coprocessor
                          Unit Number    Suboperation
CTCz   Bit #     31-28    27-26          25-21
       CTC1      0100     01             00110
       CTC2      0100     10             00110

Note: CTC1 for FPU (see Appendix B)
      CTC2 for Coprocessor 2 (user defined)

∗See “CPU Instruction Opcode Bit Encoding” at the end of Appendix A.


DADD Doubleword Add DADD

Bits     31-26     25-21   20-16   15-11   10-6    5-0
Field    SPECIAL   rs      rt      rd      0       DADD
Value    000000                            00000   101100
Width    6         5       5       5       5       6

Format:

DADD rd, rs, rt

Description:

The contents of general register rs and the contents of general register rt are added to form

the result. The result is placed into general register rd.

An overflow exception occurs if the carries out of bits 62 and 63 differ (2’s-complement

overflow). The destination register rd is not modified when an integer overflow exception

occurs.

Operation:

64 T: GPR[rd] ← GPR[rs] + GPR[rt]

Exceptions:

Integer overflow exception

Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)
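
The overflow rule in the DADD description (carries out of bits 62 and 63 differ) can be modeled in a few lines. The following Python sketch is illustrative only; the function name `dadd` and the returned flag are assumptions of this example, not part of the TX49 definition. Registers are modeled as unsigned 64-bit images.

```python
MASK64 = (1 << 64) - 1  # 64-bit register image mask

def dadd(rs, rt):
    """Return (result, overflow) for a 64-bit two's-complement add."""
    rs &= MASK64
    rt &= MASK64
    total = rs + rt
    carry63 = (total >> 64) & 1                        # carry out of bit 63
    low63 = (rs & (MASK64 >> 1)) + (rt & (MASK64 >> 1))
    carry62 = (low63 >> 63) & 1                        # carry out of bit 62
    # Overflow is signaled when the two carries differ.
    return total & MASK64, carry62 != carry63
```

When the flag is true, the hardware raises the integer overflow exception and leaves rd unmodified; a caller of this sketch would discard the returned result in that case.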


DADDI Doubleword Add Immediate DADDI

Bits     31-26    25-21   20-16   15-0
Field    DADDI    rs      rt      immediate
Value    011000
Width    6        5       5       16

Format:

DADDI rt, rs, immediate

Description:

The 16-bit immediate is sign-extended and added to the contents of general register rs to

form the result. The result is placed into general register rt.

An overflow exception occurs if carries out of bits 62 and 63 differ (2’s-complement

overflow). The destination register rt is not modified when an integer overflow exception

occurs.

Operation:

64  T:  GPR[rt] ← GPR[rs] + (immediate_15)^48 || immediate_{15∼0}

Note: The operation is the same in 32-bit kernel mode.

Exceptions:

Integer overflow exception

Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)


DADDIU Doubleword Add Immediate Unsigned DADDIU

Bits     31-26    25-21   20-16   15-0
Field    DADDIU   rs      rt      immediate
Value    011001
Width    6        5       5       16

Format:

DADDIU rt, rs, immediate

Description:

The 16-bit immediate is sign-extended and added to the contents of general register rs to

form the result. The result is placed into general register rt. No integer overflow exception

occurs under any circumstances.

The only difference between this instruction and the DADDI instruction is that DADDIU

never causes an overflow exception.

Operation:

64  T:  GPR[rt] ← GPR[rs] + (immediate_15)^48 || immediate_{15∼0}

Note: The operation is the same in 32-bit kernel mode.

Exceptions:

Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)
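
Despite the word "Unsigned" in the name, the immediate is still sign-extended; only the overflow trap is absent. A minimal Python sketch of that behavior follows. The helper names `sign_extend16` and `daddiu` are hypothetical, chosen for this example.

```python
MASK64 = (1 << 64) - 1

def sign_extend16(imm):
    """Replicate bit 15 of a 16-bit immediate into bits 63..16."""
    imm &= 0xFFFF
    return (imm | ~0xFFFF) & MASK64 if imm & 0x8000 else imm

def daddiu(rs, imm):
    """64-bit add of a sign-extended immediate; wraps silently, never traps."""
    return (rs + sign_extend16(imm)) & MASK64
```

An immediate of 0xFFFF therefore subtracts one from rs, and adding 1 to the largest positive value wraps to the most negative value without an exception.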


DADDU Doubleword Add Unsigned DADDU

Bits     31-26     25-21   20-16   15-11   10-6    5-0
Field    SPECIAL   rs      rt      rd      0       DADDU
Value    000000                            00000   101101
Width    6         5       5       5       5       6

Format:

DADDU rd, rs, rt

Description:

The contents of general register rs and the contents of general register rt are added to form

the result. The result is placed into general register rd.

No overflow exception occurs under any circumstances.

The only difference between this instruction and the DADD instruction is that DADDU

never causes an overflow exception.

Operation:

64  T:  GPR[rd] ← GPR[rs] + GPR[rt]

Note: The operation is the same in 32-bit kernel mode.

Exceptions:

Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)


DDIV Doubleword Divide DDIV

Bits     31-26     25-21   20-16   15-6           5-0
Field    SPECIAL   rs      rt      0              DDIV
Value    000000                    00 0000 0000   011110
Width    6         5       5       10             6

Format:

DDIV rs, rt

Description:

The contents of general register rs are divided by the contents of general register rt,

treating both operands as 2’s-complement values. No overflow exception occurs under any

circumstances, and the result of this operation is undefined when the divisor is zero.

This instruction is typically followed by additional instructions to check for a zero divisor

and for overflow.

When the operation completes, the quotient word of the double result is loaded into special

register LO, and the remainder word of the double result is loaded into special register HI.

If either of the two preceding instructions is MFHI or MFLO, the results of those

instructions are undefined. Correct operation requires separating reads of HI or LO from

writes by two or more instructions.

Operation:

64  T-2:  LO ← undefined
          HI ← undefined
    T-1:  LO ← undefined
          HI ← undefined
    T:    LO ← GPR[rs] div GPR[rt]
          HI ← GPR[rs] mod GPR[rt]

Note: The operation is the same in 32-bit kernel mode.

Exceptions:

Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)
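
Because DDIV never traps, the checks for a zero divisor and for overflow fall to software. The Python sketch below models the LO and HI results and the two conditions a caller would test. The function names are hypothetical, and rounding toward zero (remainder taking the sign of the dividend) is an assumption of this example; the operation above only writes div and mod.

```python
MASK64 = (1 << 64) - 1
INT64_MIN = -(1 << 63)

def to_signed64(x):
    """Interpret a 64-bit register image as a two's-complement value."""
    x &= MASK64
    return x - (1 << 64) if x >> 63 else x

def ddiv(rs, rt):
    """Return (LO, HI) = (quotient, remainder), or None for a zero divisor."""
    a, b = to_signed64(rs), to_signed64(rt)
    if b == 0:
        return None                     # result is undefined per the manual
    q = abs(a) // abs(b)                # assumed: truncate toward zero
    if (a < 0) != (b < 0):
        q = -q
    r = a - q * b
    return q & MASK64, r & MASK64

def quotient_overflows(rs, rt):
    """True for the one signed case whose quotient does not fit in 64 bits."""
    return to_signed64(rs) == INT64_MIN and to_signed64(rt) == -1
```

A typical instruction sequence would test the divisor for zero and then test for the most negative dividend divided by -1, which is what `quotient_overflows` stands in for here.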


DDIVU Doubleword Divide Unsigned DDIVU

Bits     31-26     25-21   20-16   15-6           5-0
Field    SPECIAL   rs      rt      0              DDIVU
Value    000000                    00 0000 0000   011111
Width    6         5       5       10             6

Format:

DDIVU rs, rt

Description:

The contents of general register rs are divided by the contents of general register rt,

treating both operands as unsigned values. No integer overflow exception occurs under any

circumstances, and the result of this operation is undefined when the divisor is zero.

This instruction is typically followed by additional instructions to check for a zero divisor.

When the operation completes, the quotient word of the double result is loaded into special

register LO, and the remainder word of the double result is loaded into special register HI.

If either of the two preceding instructions is MFHI or MFLO, the results of those

instructions are undefined. Correct operation requires separating reads of HI or LO from

writes by two or more instructions.

Operation:

64  T-2:  LO ← undefined
          HI ← undefined
    T-1:  LO ← undefined
          HI ← undefined
    T:    LO ← (0 || GPR[rs]) div (0 || GPR[rt])
          HI ← (0 || GPR[rs]) mod (0 || GPR[rt])

Note: The operation is the same in 32-bit kernel mode.

Exceptions:

Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)


DERET Debug Exception Return DERET

Bits     31-26    25   24-6                      5-0
Field    COP0     CO   0                         DERET
Value    010000   1    000 0000 0000 0000 0000   011111
Width    6        1    19                        6

Format:

DERET

Description:

Executes a return from a self-debug interrupt or exception. This instruction requires a branch

delay slot like that of the branch or jump instructions, and executes with a delay of one

instruction cycle. The DERET instruction itself cannot be put in the delay slot.

The return address stored in the DEPC register is copied to the PC, and processing returns

to the original program.

Note: If a MTC0 instruction was used to set the return address in the DEPC register, a

minimum of two instructions must be executed before executing DERET.

Operation:

32, 64  T:      temp ← DEPC
        T + 1:  PC ← temp
                Debug_30 ← 0

Exceptions:

Coprocessor unusable exception


DIV Divide DIV

Bits     31-26     25-21   20-16   15-6           5-0
Field    SPECIAL   rs      rt      0              DIV
Value    000000                    00 0000 0000   011010
Width    6         5       5       10             6

Format:

DIV rs, rt

Description:

The contents of general register rs are divided by the contents of general register rt,

treating both operands as 2’s-complement values. No overflow exception occurs under any

circumstances, and the result of this operation is undefined when the divisor is zero. In 64-

bit mode, the operands must be valid sign-extended, 32-bit values.

This instruction is typically followed by additional instructions to check for a zero divisor

and for overflow.

When the operation completes, the quotient word of the double result is loaded into special

register LO, and the remainder word of the double result is loaded into special register HI.

If either of the two preceding instructions is MFHI or MFLO, the results of those

instructions are undefined. Correct operation requires separating reads of HI or LO from

writes by two or more instructions.

Operation:

32  T-2:  LO ← undefined
          HI ← undefined
    T-1:  LO ← undefined
          HI ← undefined
    T:    LO ← GPR[rs] div GPR[rt]
          HI ← GPR[rs] mod GPR[rt]

64  T-2:  LO ← undefined
          HI ← undefined
    T-1:  LO ← undefined
          HI ← undefined
    T:    q ← GPR[rs]_{31∼0} div GPR[rt]_{31∼0}
          r ← GPR[rs]_{31∼0} mod GPR[rt]_{31∼0}
          LO ← (q_31)^32 || q_{31∼0}
          HI ← (r_31)^32 || r_{31∼0}

Exceptions:

None
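
In 64-bit mode, DIV operates on the low 32 bits of each operand and sign-extends the 32-bit quotient and remainder into LO and HI. The Python sketch below illustrates that data flow. The function names are hypothetical, and truncation toward zero is an assumption of this example rather than something the operation text states.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(x):
    """Interpret the low 32 bits of x as a two's-complement value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def sign_extend32(x):
    """Copy bit 31 of a 32-bit value into bits 63..32."""
    x &= MASK32
    return x | (MASK32 << 32) if x & 0x80000000 else x

def div_64bit_mode(rs, rt):
    """Return (LO, HI) as 64-bit images, or None for a zero divisor."""
    a, b = to_signed32(rs), to_signed32(rt)
    if b == 0:
        return None                     # result is undefined per the manual
    q = abs(a) // abs(b)                # assumed: truncate toward zero
    if (a < 0) != (b < 0):
        q = -q
    r = a - q * b
    return sign_extend32(q), sign_extend32(r)
```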


DIVU Divide Unsigned DIVU

Bits     31-26     25-21   20-16   15-6           5-0
Field    SPECIAL   rs      rt      0              DIVU
Value    000000                    00 0000 0000   011011
Width    6         5       5       10             6

Format:

DIVU rs, rt

Description:

The contents of general register rs are divided by the contents of general register rt,

treating both operands as unsigned values. No integer overflow exception occurs under any

circumstances, and the result of this operation is undefined when the divisor is zero. In 64-bit

mode, the operands must be valid sign-extended, 32-bit values.

This instruction is typically followed by additional instructions to check for a zero divisor.

When the operation completes, the quotient word of the double result is loaded into special

register LO, and the remainder word of the double result is loaded into special register HI.

If either of the two preceding instructions is MFHI or MFLO, the results of those

instructions are undefined. Correct operation requires separating reads of HI or LO from

writes by two or more instructions.

Operation:

32  T-2:  LO ← undefined
          HI ← undefined
    T-1:  LO ← undefined
          HI ← undefined
    T:    LO ← (0 || GPR[rs]) div (0 || GPR[rt])
          HI ← (0 || GPR[rs]) mod (0 || GPR[rt])

64  T-2:  LO ← undefined
          HI ← undefined
    T-1:  LO ← undefined
          HI ← undefined
    T:    q ← (0 || GPR[rs]_{31∼0}) div (0 || GPR[rt]_{31∼0})
          r ← (0 || GPR[rs]_{31∼0}) mod (0 || GPR[rt]_{31∼0})
          LO ← (q_31)^32 || q_{31∼0}
          HI ← (r_31)^32 || r_{31∼0}

Exceptions:

None


DMFC0 Doubleword Move From System Control Coprocessor DMFC0

Bits     31-26    25-21   20-16   15-11   10-0
Field    COP0     DMF     rt      rd      0
Value    010000   00001                   000 0000 0000
Width    6        5       5       5       11

Format:

DMFC0 rt, rd

Description:

The contents of coprocessor register rd of the CP0 are loaded into general register rt.

This operation is defined in kernel mode regardless of the setting of the Status.KX bit.

Execution of this instruction in supervisor mode with Status.SX = 0, or in user mode

with UX = 0, causes a reserved instruction exception. All 64 bits of the general register

destination are written from the coprocessor register source. The operation of DMFC0 on a

32-bit coprocessor 0 register is undefined.

Operation:

64  T:      data ← CPR[0,rd]
    T + 1:  GPR[rt] ← data

Note: The operation is the same in 32-bit kernel mode.

Exceptions:

Coprocessor unusable exception

Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)


DMTC0 Doubleword Move To System Control Coprocessor DMTC0

Bits     31-26    25-21   20-16   15-11   10-0
Field    COP0     DMT     rt      rd      0
Value    010000   00101                   000 0000 0000
Width    6        5       5       5       11

Format:

DMTC0 rt, rd

Description:

The contents of general register rt are loaded into coprocessor register rd of the CP0.

This operation is defined for the R4000 operating in 64-bit mode or in 32-bit kernel mode.

Execution of this instruction in 32-bit user or supervisor mode causes a reserved instruction

exception. All 64 bits of the coprocessor 0 register are written from the general register

source. The operation of DMTC0 on a 32-bit coprocessor 0 register is undefined.

Because the state of the virtual address translation system may be altered by this

instruction, the operation of load, store instructions and TLB operations immediately prior to

and after this instruction are undefined.

Operation:

64  T:      data ← GPR[rt]
    T + 1:  CPR[0,rd] ← data

Note: The operation is the same in 32-bit kernel mode.

Exceptions:

Coprocessor unusable exception

Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)


DMULT Doubleword Multiply DMULT

Bits     31-26     25-21   20-16   15-6           5-0
Field    SPECIAL   rs      rt      0              DMULT
Value    000000                    00 0000 0000   011100
Width    6         5       5       10             6

Bits     31-26     25-21   20-16   15-11   10-6      5-0
Field    SPECIAL   rs      rt      rd      0         DMULT
Value    000000                            0 0000    011100
Width    6         5       5       5       5         6

Format:

DMULT rs, rt

DMULT rd, rs, rt

Description:

The contents of general registers rs and rt are multiplied, treating both operands as 2’s-

complement values. No integer overflow exception occurs under any circumstances.

When the operation completes, the low-order word of the double result is loaded into

special register LO, and the high-order word of the double result is loaded into special

register HI.

If either of the two preceding instructions is MFHI or MFLO, the results of these

instructions are undefined. Correct operation requires separating reads of HI or LO from

writes by a minimum of two other instructions.

Operation:

64  T-2:  LO ← undefined
          HI ← undefined
    T-1:  LO ← undefined
          HI ← undefined
    T:    t ← GPR[rs] ∗ GPR[rt]
          LO ← t_{63∼0}
          HI ← t_{127∼64}
          GPR[rd] ← t_{63∼0}

Note: The operation is the same in 32-bit kernel mode.

Exceptions:

Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)
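
The split of the 128-bit signed product into LO and HI can be sketched as follows. This Python model is illustrative; `dmult` and `to_signed64` are names invented for the example, and registers are treated as unsigned 64-bit images.

```python
MASK64 = (1 << 64) - 1
MASK128 = (1 << 128) - 1

def to_signed64(x):
    """Interpret a 64-bit register image as a two's-complement value."""
    x &= MASK64
    return x - (1 << 64) if x >> 63 else x

def dmult(rs, rt):
    """Return (LO, HI): the low and high 64 bits of the signed product."""
    t = (to_signed64(rs) * to_signed64(rt)) & MASK128
    lo = t & MASK64            # bits 63..0 of the product
    hi = (t >> 64) & MASK64    # bits 127..64 of the product
    return lo, hi
```

For a small negative product such as -1 times 2, HI holds all ones, which is the sign extension of the 128-bit result. In the three-operand form, the value returned as LO is also what is written to rd.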


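DMULTU Doubleword Multiply Unsigned DMULTU

Bits     31-26     25-21   20-16   15-6           5-0
Field    SPECIAL   rs      rt      0              DMULTU
Value    000000                    00 0000 0000   011101
Width    6         5       5       10             6

Bits     31-26     25-21   20-16   15-11   10-6      5-0
Field    SPECIAL   rs      rt      rd      0         DMULTU
Value    000000                            0 0000    011101
Width    6         5       5       5       5         6

Format:

DMULTU rs, rt

DMULTU rd, rs, rt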

Description:

The contents of general register rs and the contents of general register rt are multiplied,

treating both operands as unsigned values. No overflow exception occurs under any

circumstances.

When the operation completes, the low-order word of the double result is loaded into

special register LO, and the high-order word of the double result is loaded into special

register HI.

If either of the two preceding instructions is MFHI or MFLO, the results of these

instructions are undefined. Correct operation requires separating reads of HI or LO from

writes by a minimum of two instructions.

Operation:

64  T-2:  LO ← undefined
          HI ← undefined
    T-1:  LO ← undefined
          HI ← undefined
    T:    t ← (0 || GPR[rs]) ∗ (0 || GPR[rt])
          LO ← t_{63∼0}
          HI ← t_{127∼64}
          GPR[rd] ← t_{63∼0}

Note: The operation is the same in 32-bit kernel mode.

Exceptions:

Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)
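
For the unsigned case the operands are zero-extended before multiplication, so the same register images can yield a different HI than a signed multiply would. A short Python sketch, with the hypothetical name `dmultu`:

```python
MASK64 = (1 << 64) - 1

def dmultu(rs, rt):
    """Return (LO, HI): low and high 64 bits of the unsigned 128-bit product."""
    t = (rs & MASK64) * (rt & MASK64)
    return t & MASK64, (t >> 64) & MASK64
```

Multiplying an all-ones register by 2 gives HI = 1 here, whereas treating the same operands as signed would give an all-ones HI.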


DSLL Doubleword Shift Left Logical DSLL

Bits     31-26     25-21   20-16   15-11   10-6   5-0
Field    SPECIAL   0       rt      rd      sa     DSLL
Value    000000    00000                          111000
Width    6         5       5       5       5      6

Format:

DSLL rd, rt, sa

Description:

The contents of general register rt are shifted left by sa bits, inserting zeros into the low-

order bits. The result is placed in register rd.

Operation:

64  T:  s ← 0 || sa
        GPR[rd] ← GPR[rt]_{(63-s)∼0} || 0^s

Note: The operation is the same in 32-bit kernel mode.

Exceptions:

Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)


DSLLV Doubleword Shift Left Logical Variable DSLLV

Bits     31-26     25-21   20-16   15-11   10-6    5-0
Field    SPECIAL   rs      rt      rd      0       DSLLV
Value    000000                            00000   010100
Width    6         5       5       5       5       6

Format:

DSLLV rd, rt, rs

Description:

The contents of general register rt are shifted left by the number of bits specified by the

low-order six bits of general register rs, inserting zeros into the low-

order bits. The result is placed in register rd.

Operation:

64  T:  s ← GPR[rs]_{5∼0}
        GPR[rd] ← GPR[rt]_{(63-s)∼0} || 0^s

Note: The operation is the same in 32-bit kernel mode.

Exceptions:

Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)
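
Only the low-order six bits of rs take part in the shift, so a shift count of 64 behaves like a count of 0. A minimal Python sketch of this masking (the name `dsllv` is used for illustration only):

```python
MASK64 = (1 << 64) - 1

def dsllv(rt, rs):
    """Shift rt left by the low six bits of rs, filling with zeros."""
    s = rs & 0x3F                 # only bits 5..0 of rs are used
    return (rt << s) & MASK64     # bits shifted past bit 63 are discarded
```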


DSLL32 Doubleword Shift Left Logical + 32 DSLL32

Bits     31-26     25-21   20-16   15-11   10-6   5-0
Field    SPECIAL   0       rt      rd      sa     DSLL32
Value    000000    00000                          111100
Width    6         5       5       5       5      6

Format:

DSLL32 rd, rt, sa

Description:

The contents of general register rt are shifted left by 32 + sa bits, inserting zeros into the

low-order bits. The result is placed in register rd.

Operation:

64 T: s ← 1 sa

GPR[rd] ← GPR[rt](63-s) ∼0  0s

Note: It is also the same operation in th e 32 bit kernel mode.

Exceptions:

Reserved Instruction exception (in the 32 bit user or 32 bit supervisior mode)
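
The 5-bit sa field can only encode 0 to 31, so DSLL32 prepends a 1 bit to it to reach shift counts 32 to 63. The Python sketch below shows the two immediate forms side by side; the function names are illustrative, not defined by the manual.

```python
MASK64 = (1 << 64) - 1

def dsll(rt, sa):
    """Shift left by sa (0..31): the 6-bit count is 0 followed by sa."""
    return (rt << (sa & 0x1F)) & MASK64

def dsll32(rt, sa):
    """Shift left by 32 + sa (32..63): the 6-bit count is 1 followed by sa."""
    s = 0x20 | (sa & 0x1F)
    return (rt << s) & MASK64
```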


DSRA Doubleword Shift Right Arithmetic DSRA

Bits     31-26     25-21   20-16   15-11   10-6   5-0
Field    SPECIAL   0       rt      rd      sa     DSRA
Value    000000    00000                          111011
Width    6         5       5       5       5      6

Format:

DSRA rd, rt, sa

Description:

The contents of general register rt are shifted right by sa bits, sign-extending the high-

order bits. The result is placed in register rd.

Operation:

64  T:  s ← 0 || sa
        GPR[rd] ← (GPR[rt]_63)^s || GPR[rt]_{63∼s}

Note: The operation is the same in 32-bit kernel mode.

Exceptions:

Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)
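
The operation builds the result from s copies of the sign bit (bit 63) followed by bits 63 down to s of rt. The Python sketch below follows that construction literally rather than relying on a built-in arithmetic shift. The name `dsra` is used here for illustration.

```python
MASK64 = (1 << 64) - 1

def dsra(rt, sa):
    """Arithmetic right shift of a 64-bit register image by sa (0..31)."""
    rt &= MASK64
    s = sa & 0x1F
    sign = (rt >> 63) & 1
    fill = ((1 << s) - 1) if sign else 0     # s copies of bit 63
    return (fill << (64 - s)) | (rt >> s)    # fill, then bits 63..s of rt
```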


DSRAV Doubleword Shift Right Arithmetic Variable DSRAV

Bits     31-26     25-21   20-16   15-11   10-6    5-0
Field    SPECIAL   rs      rt      rd      0       DSRAV
Value    000000                            00000   010111
Width    6         5       5       5       5       6

Format:

DSRAV rd, rt, rs

Description:

The contents of general register rt are shifted right by the number of bits specified by the

low-order six bits of general register rs, sign-extending the high-order bits. The result is

placed in register rd.

Operation:

64  T:  s ← GPR[rs]_{5∼0}
        GPR[rd] ← (GPR[rt]_63)^s || GPR[rt]_{63∼s}

Note: The operation is the same in 32-bit kernel mode.

Exceptions:

Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)


DSRA32 Doubleword Shift Right Arithmetic + 32 DSRA32

Bits     31-26     25-21   20-16   15-11   10-6   5-0
Field    SPECIAL   0       rt      rd      sa     DSRA32
Value    000000    00000                          111111
Width    6         5       5       5       5      6

Format:

DSRA32 rd, rt, sa

Description:

The contents of general register rt are shifted right by 32 + sa bits, sign-extending the

high-order bits. The result is placed in register rd.

Operation:

64  T:  s ← 1 || sa
        GPR[rd] ← (GPR[rt]_63)^s || GPR[rt]_{63∼s}

Note: The operation is the same in 32-bit kernel mode.

Exceptions:

Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)


DSRL Doubleword Shift Right Logical DSRL

Bits     31-26     25-21   20-16   15-11   10-6   5-0
Field    SPECIAL   0       rt      rd      sa     DSRL
Value    000000    00000                          111010
Width    6         5       5       5       5      6

Format:

DSRL rd, rt, sa

Description:

The contents of general register rt are shifted right by sa bits, inserting zeros into the high-

order bits. The result is placed in register rd.

Operation:

64  T:  s ← 0 || sa
        GPR[rd] ← 0^s || GPR[rt]_{63∼s}

Note: The operation is the same in 32-bit kernel mode.

Exceptions:

Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)


DSRLV Doubleword Shift Right Logical Variable DSRLV

Bits     31-26     25-21   20-16   15-11   10-6    5-0
Field    SPECIAL   rs      rt      rd      0       DSRLV
Value    000000                            00000   010110
Width    6         5       5       5       5       6

Format:

DSRLV rd, rt, rs

Description:

The contents of general register rt are shifted right by the number of bits specified by the

low-order six bits of general register rs, inserting zeros into the high-order bits. The result

is placed in register rd.

Operation:

64  T:  s ← GPR[rs]_{5∼0}
        GPR[rd] ← 0^s || GPR[rt]_{63∼s}

Note: The operation is the same in 32-bit kernel mode.

Exceptions:

Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)


DSRL32 Doubleword Shift Right Logical + 32 DSRL32

Bits     31-26     25-21   20-16   15-11   10-6   5-0
Field    SPECIAL   0       rt      rd      sa     DSRL32
Value    000000    00000                          111110
Width    6         5       5       5       5      6

Format:

DSRL32 rd, rt, sa

Description:

The contents of general register rt are shifted right by 32 + sa bits, inserting zeros into the

high-order bits. The result is placed in register rd.

Operation:

64  T:  s ← 1 || sa
        GPR[rd] ← 0^s || GPR[rt]_{63∼s}

Note: The operation is the same in 32-bit kernel mode.

Exceptions:

Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)
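
A logical right shift by 32 + sa moves the upper word of rt into the lower word when sa is 0, which is a common way to extract the high 32 bits of a doubleword. A minimal Python sketch, with the illustrative name `dsrl32`:

```python
MASK64 = (1 << 64) - 1

def dsrl32(rt, sa):
    """Logical right shift by 32 + sa (32..63), filling with zeros."""
    s = 0x20 | (sa & 0x1F)        # 6-bit count: 1 followed by sa
    return (rt & MASK64) >> s
```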


DSUB Doubleword Subtract DSUB

Bits     31-26     25-21   20-16   15-11   10-6    5-0
Field    SPECIAL   rs      rt      rd      0       DSUB
Value    000000                            00000   101110
Width    6         5       5       5       5       6

Format:

DSUB rd, rs, rt

Description:

The contents of general register rt are subtracted from the contents of general register rs to

form a result. The result is placed into general register rd.

The only difference between this instruction and the DSUBU instruction is that DSUBU

never traps on overflow.

An integer overflow exception takes place if the carries out of bits 62 and 63 differ (2’s-

complement overflow). The destination register rd is not modified when an integer overflow

exception occurs.

Operation:

64  T:  GPR[rd] ← GPR[rs] − GPR[rt]

Note: The operation is the same in 32-bit kernel mode.

Exceptions:

Integer overflow exception

Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)
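
Two's-complement overflow on subtraction is equivalent to the exact signed difference falling outside the 64-bit signed range. The Python sketch below uses that range check in place of the carry comparison described above; it is a swapped-in but equivalent test, and the function names are hypothetical.

```python
MASK64 = (1 << 64) - 1

def to_signed64(x):
    """Interpret a 64-bit register image as a two's-complement value."""
    x &= MASK64
    return x - (1 << 64) if x >> 63 else x

def dsub(rs, rt):
    """Return (result, overflow) for a 64-bit two's-complement subtract."""
    diff = to_signed64(rs) - to_signed64(rt)
    overflow = not (-(1 << 63) <= diff < (1 << 63))
    return diff & MASK64, overflow
```

DSUBU would return the same wrapped result and simply ignore the flag.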


DSUBU Doubleword Subtract Unsigned DSUBU

Bits     31-26     25-21   20-16   15-11   10-6    5-0
Field    SPECIAL   rs      rt      rd      0       DSUBU
Value    000000                            00000   101111
Width    6         5       5       5       5       6

Format:

DSUBU rd, rs, rt

Description:

The contents of general register rt are subtracted from the contents of general register rs to

form a result. The result is placed into general register rd.

The only difference between this instruction and the DSUB instruction is that DSUBU

never traps on overflow. No integer overflow exception occurs under any circumstances.

Operation:

64  T:  GPR[rd] ← GPR[rs] − GPR[rt]

Note: The operation is the same in 32-bit kernel mode.

Exceptions:

Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)


ERET Exception Return ERET

Bits     31-26    25   24-6                      5-0
Field    COP0     CO   0                         ERET
Value    010000   1    000 0000 0000 0000 0000   011000
Width    6        1    19                        6

Format:

ERET

Description:

ERET is the TX49 instruction for returning from an interrupt, exception, or error trap.

Unlike a branch or jump instruction, ERET does not execute the next instruction.

ERET must not itself be placed in a branch delay slot.

If the processor is servicing an error trap (SR_2 = 1), then load the PC from the ErrorEPC

and clear the ERL bit of the Status register (SR_2). Otherwise (SR_2 = 0), load the PC from the

EPC, and clear the EXL bit of the Status register (SR_1).

An ERET executed between a LL and SC also causes the SC to fail.

If this instruction is placed at a memory boundary, the branch delay slot must be kept in the

same memory area.

Operation:

32, 64  T:  if SR_2 = 1 then
                PC ← ErrorEPC
                SR ← SR_{31∼3} || 0 || SR_{1∼0}
            else
                PC ← EPC
                SR ← SR_{31∼2} || 0 || SR_0
            endif
            LLbit ← 0

Exceptions:

Coprocessor unusable exception
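
The choice between ErrorEPC and EPC, and the Status bit that gets cleared, can be summarized in a short model. This Python sketch is an illustration under the bit positions given in the description (ERL is Status bit 2, EXL is Status bit 1); the function name and the tuple it returns are assumptions of the example.

```python
ERL = 1 << 2   # Status bit 2: error level
EXL = 1 << 1   # Status bit 1: exception level

def eret(sr, epc, error_epc):
    """Return (new_pc, new_sr, llbit) after an ERET."""
    if sr & ERL:
        pc = error_epc
        sr &= ~ERL            # clear ERL, keep all other Status bits
    else:
        pc = epc
        sr &= ~EXL            # clear EXL, keep all other Status bits
    llbit = 0                 # a pending SC will fail
    return pc, sr & 0xFFFFFFFF, llbit
```

Note that when both ERL and EXL are set, only ERL is cleared, so a later ERET is still needed to leave exception level.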


J Jump J

Bits     31-26    25-0
Field    J        target
Value    000010
Width    6        26

Format:

J target

Description:

The 26-bit target address is shifted left two bits and combined with the high-order bits of

the address of the delay slot. The program unconditionally jumps to this calculated address

with a delay of one instruction.

Operation:

32  T:      temp ← target
    T + 1:  PC ← PC_{31∼28} || temp || 0^2

64  T:      temp ← target
    T + 1:  PC ← PC_{63∼28} || temp || 0^2

Exceptions:

None


JAL Jump And Link JAL

Bits     31-26    25-0
Field    JAL      target
Value    000011
Width    6        26

Format:

JAL target

Description:

The 26-bit target address is shifted left two bits and combined with the high-order bits of

the address of the delay slot. The program unconditionally jumps to this calculated address

with a delay of one instruction. The address of the instruction after the delay slot is placed

in the link register, r31.

Operation:

32  T:      temp ← target
            GPR[31] ← PC + 8
    T + 1:  PC ← PC_{31∼28} || temp || 0^2

64  T:      temp ← target
            GPR[31] ← PC + 8
    T + 1:  PC ← PC_{63∼28} || temp || 0^2

Exceptions:

None
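
The destination is formed from the upper bits of the delay-slot address, not of the JAL itself, which matters when the jump is the last instruction of a 256 MB region. The Python sketch below illustrates this and the link value; the function name `jal` and its return tuple are assumptions made for the example.

```python
def jal(pc, target):
    """Return (destination, link) for a JAL located at address pc."""
    delay_slot = pc + 4
    # Keep bits 28 and above of the delay-slot address, then append the
    # 26-bit target shifted left by two.
    dest = (delay_slot & ~0x0FFFFFFF) | ((target & 0x03FFFFFF) << 2)
    link = pc + 8             # instruction after the delay slot, placed in r31
    return dest, link
```

The second test case below places the jump at the end of a region, so the destination takes its upper bits from the next region.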


JALR Jump And Link Register JALR

Bits     31-26     25-21   20-16   15-11   10-6    5-0
Field    SPECIAL   rs      0       rd      0       JALR
Value    000000            00000           00000   001001
Width    6         5       5       5       5       6

Format:

JALR rs

JALR rd, rs

Description:

The program unconditionally jumps to the address contained in general register rs, with a

delay of one instruction. The address of the instruction after the delay slot is placed in

general register rd. The default value of rd, if omitted in the assembly language instruction,

is 31.

Register specifiers rs and rd may not be equal, because such an instruction does not have

the same effect when reexecuted. However, an attempt to execute this instruction is not

trapped, and the result of executing such an instruction is undefined.

Since instructions must be word-aligned, a Jump and Link Register instruction must

specify a target register (rs) whose two low-order bits are zero. If these low-order bits are

not zero, an address exception will occur when the jump target instruction is subsequently

fetched.

Operation:

32, 64  T:      temp ← GPR[rs]
                GPR[rd] ← PC + 8
        T + 1:  PC ← temp

Exceptions:

None


JR Jump Register JR

Bits     31-26     25-21   20-6                  5-0
Field    SPECIAL   rs      0                     JR
Value    000000            000 0000 0000 0000    001000
Width    6         5       15                    6

Format:

JR rs

Description:

The program unconditionally jumps to the address contained in general register rs, with a

delay of one instruction.

Since instructions must be word-aligned, a Jump Register instruction must specify a target

register (rs) whose two low-order bits are zero. If these low-order bits are not zero, an

address exception will occur when the jump target instruction is subsequently fetched.

Operation:

32, 64  T:      temp ← GPR[rs]
        T + 1:  PC ← temp

Exceptions:

None


LB Load Byte LB

Bits     31-26    25-21   20-16   15-0
Field    LB       base    rt      offset
Value    100000
Width    6        5       5       16

Format:

LB rt, offset (base)

Description:

The 16-bit offset is sign-extended and added to the contents of general register base to form

a virtual address. The contents of the byte at the memory location specified by the effective

address are sign-extended and loaded into general register rt.

Operation:

32  T:  vAddr ← ((offset_15)^16 || offset_{15∼0}) + GPR[base]
        (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
        pAddr ← pAddr_{PSIZE-1∼3} || (pAddr_{2∼0} xor ReverseEndian^3)
        mem ← LoadMemory (uncached, BYTE, pAddr, vAddr, DATA)
        byte ← vAddr_{2∼0} xor BigEndianCPU^3
        GPR[rt] ← (mem_{7+8*byte})^24 || mem_{7+8*byte∼8*byte}

64  T:  vAddr ← ((offset_15)^48 || offset_{15∼0}) + GPR[base]
        (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
        pAddr ← pAddr_{PSIZE-1∼3} || (pAddr_{2∼0} xor ReverseEndian^3)
        mem ← LoadMemory (uncached, BYTE, pAddr, vAddr, DATA)
        byte ← vAddr_{2∼0} xor BigEndianCPU^3
        GPR[rt] ← (mem_{7+8*byte})^56 || mem_{7+8*byte∼8*byte}

Exceptions:

TLB refill exception

TLB invalid exception

Bus error exception

Address error exception


LBU Load Byte Unsigned LBU

Bits     31-26    25-21   20-16   15-0
Field    LBU      base    rt      offset
Value    100100
Width    6        5       5       16

Format:

LBU rt, offset (base)

Description:

The 16-bit offset is sign-extended and added to the contents of general register base to form

a virtual address. The contents of the byte at the memory location specified by the effective

address are zero-extended and loaded into general register rt.

Operation:

32  T:  vAddr ← ((offset_15)^16 || offset_{15∼0}) + GPR[base]
        (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
        pAddr ← pAddr_{PSIZE-1∼3} || (pAddr_{2∼0} xor ReverseEndian^3)
        mem ← LoadMemory (uncached, BYTE, pAddr, vAddr, DATA)
        byte ← vAddr_{2∼0} xor BigEndianCPU^3
        GPR[rt] ← 0^24 || mem_{7+8*byte∼8*byte}

64  T:  vAddr ← ((offset_15)^48 || offset_{15∼0}) + GPR[base]
        (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
        pAddr ← pAddr_{PSIZE-1∼3} || (pAddr_{2∼0} xor ReverseEndian^3)
        mem ← LoadMemory (uncached, BYTE, pAddr, vAddr, DATA)
        byte ← vAddr_{2∼0} xor BigEndianCPU^3
        GPR[rt] ← 0^56 || mem_{7+8*byte∼8*byte}

Exceptions:

TLB refill exception

TLB invalid exception

Bus error exception

Address error exception
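
The byte-lane selection in the 64-bit operation (the low three address bits, inverted on a big-endian CPU) and the difference between zero extension and the sign extension used by LB can be sketched as follows. This Python model treats `mem` as the 64-bit doubleword returned by the memory access; the function name and the `signed` flag are assumptions of the example.

```python
MASK64 = (1 << 64) - 1

def load_byte(mem, vaddr, big_endian_cpu, signed):
    """Extract one byte from a 64-bit doubleword and extend it to 64 bits."""
    # byte lane = vAddr bits 2..0, XORed with 7 when the CPU is big-endian
    byte = (vaddr & 7) ^ (7 if big_endian_cpu else 0)
    value = (mem >> (8 * byte)) & 0xFF
    if signed and value & 0x80:
        value |= MASK64 & ~0xFF   # LB: replicate bit 7 into bits 63..8
    return value                  # LBU: upper bits stay zero
```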


LD Load Doubleword LD

Bits     31-26    25-21   20-16   15-0
Field    LD       base    rt      offset
Value    110111
Width    6        5       5       16

Format:

LD rt, offset (base)

Description:

The 16-bit offset is sign-extended and added to the contents of general register base to form

a virtual address. The contents of the 64-bit doubleword at the memory location specified by

the effective address are loaded into general register rt.

If any of the three least-significant bits of the effective address are non-zero, an address

error exception occurs.

Operation:

64  T:  vAddr ← ((offset_15)^48 || offset_{15∼0}) + GPR[base]
        (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
        mem ← LoadMemory (uncached, DOUBLEWORD, pAddr, vAddr, DATA)
        GPR[rt] ← mem

Note: The operation is the same in 32-bit kernel mode.

Exceptions:

TLB refill exception

TLB invalid exception

Bus error exception

Address error exception

Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)
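
The effective-address calculation and the doubleword alignment rule can be modeled in a few lines. In this Python sketch, `AddressError`, `sign_extend16`, and `ld_address` are hypothetical names standing in for the address error exception and the address computation described above.

```python
MASK64 = (1 << 64) - 1

class AddressError(Exception):
    """Stand-in for the address error exception."""

def sign_extend16(offset):
    """Replicate bit 15 of the 16-bit offset into bits 63..16."""
    offset &= 0xFFFF
    return (offset | ~0xFFFF) & MASK64 if offset & 0x8000 else offset

def ld_address(base, offset):
    """Return the virtual address for LD, or raise if it is not 8-byte aligned."""
    vaddr = (base + sign_extend16(offset)) & MASK64
    if vaddr & 7:                 # any of the three low bits set
        raise AddressError(hex(vaddr))
    return vaddr
```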


LDCz Load Doubleword To Coprocessor z LDCz

Bits     31-26     25-21   20-16   15-0
Field    LDCz      base    rt      offset
Value    1101xx∗
Width    6         5       5       16

Format:

LDCz rt, offset (base)

Description:

The 16-bit offset is sign-extended and added to the contents of general register base to form

a virtual address. The processor reads a doubleword from the addressed memory location

and makes the data available to coprocessor unit z. The manner in which each coprocessor

uses the data is defined by the individual coprocessor specifications.

If any of the three least-significant bits of the effective address are non-zero, an address

error exception takes place.

This instruction is not valid for use with CP0.

This instruction is undefined when the least-significant bit of the rt-field is non-zero.

∗See the table “Opcode Bit Encoding” below, or “CPU Instruction Opcode Bit

Encoding” at the end of Appendix A.


Operation:

32  T:  vAddr ← ((offset_15)^16 || offset_{15∼0}) + GPR[base]
        (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
        mem ← LoadMemory (uncached, DOUBLEWORD, pAddr, vAddr, DATA)
        COPzLD (rt, mem)

64  T:  vAddr ← ((offset_15)^48 || offset_{15∼0}) + GPR[base]
        (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
        mem ← LoadMemory (uncached, DOUBLEWORD, pAddr, vAddr, DATA)
        COPzLD (rt, mem)

Exceptions:

TLB refill exception

TLB invalid exception

Bus error exception

Address error exception

Coprocessor unusable exception

Opcode Bit Encoding:

                 Opcode   Coprocessor
                          Unit Number
LDCz   Bit #     31-28    27-26
       LDC1      1101     01
       LDC2      1101     10


LDL Load Doubleword Left LDL

Bits     31-26    25-21   20-16   15-0
Field    LDL      base    rt      offset
Value    011010
Width    6        5       5       16

Format:

LDL rt, offset (base)

Description:

This instruction can be used in combination with the LDR instruction to load a register

with eight consecutive bytes from memory, when the bytes cross a boundary between two

doublewords. LDL loads the left portion of the register from the appropriate part of the high-

order doubleword; LDR loads the right portion of the register from the appropriate part of

the low-order doubleword.

The LDL instruction adds its sign-extended 16-bit offset to the contents of general register

base to form a virtual address which can specify an arbitrary byte. It reads bytes only from

the doubleword in memory which contains the specified starting byte. From one to eight

bytes will be loaded, depending on the starting byte specified.

Conceptually, it starts at the specified byte in memory and loads that byte into the high-

order (left-most) byte of the register; then it proceeds toward the low-order byte of the

doubleword in memory and the low-order byte of the register, loading bytes from memory

into the register until it reaches the low-order byte of the doubleword in memory. The least-

significant (right-most) byte(s) of the register will not be changed.

LDL $24,3($0)

memory (big-endian)
address 0:     0   1   2   3   4   5   6   7
address 8:     8   9  10  11  12  13  14  15

$24 before:    A   B   C   D   E   F   G   H
$24 after:     3   4   5   6   7   F   G   H


The contents of general register rt are internally bypassed within the processor so that no

NOP is needed between an immediately preceding load instruction which specifies register rt

and a following LDL (or LDR) instruction which also specifies register rt.

No address exceptions due to alignment are possible.

Operation:

64 T:   vAddr ← ((offset[15])^48 || offset[15..0]) + GPR[base]
        (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
        pAddr ← pAddr[PSIZE-1..3] || (pAddr[2..0] xor ReverseEndian^3)
        if BigEndianMem = 0 then
            pAddr ← pAddr[PSIZE-1..3] || 0^3
        endif
        byte ← vAddr[2..0] xor BigEndianCPU^3
        mem ← LoadMemory (uncached, byte, pAddr, vAddr, DATA)
        GPR[rt] ← mem[7+8*byte..0] || GPR[rt][55-8*byte..0]

Note: The same operation applies in 32-bit kernel mode.


Given a doubleword in a register and a doubleword in memory, the operation of LDL is as
follows:

LDL
    Register:  A B C D E F G H
    Memory:    I J K L M N O P

              BigEndianCPU = 0                     BigEndianCPU = 1
                                  Offset                               Offset
vAddr[2..0]   Destination  Type   LEM  BEM         Destination  Type   LEM  BEM
    0         PBCDEFGH     0      0    7           IJKLMNOP     7      0    0
    1         OPCDEFGH     1      0    6           JKLMNOPH     6      0    1
    2         NOPDEFGH     2      0    5           KLMNOPGH     5      0    2
    3         MNOPEFGH     3      0    4           LMNOPFGH     4      0    3
    4         LMNOPFGH     4      0    3           MNOPEFGH     3      0    4
    5         KLMNOPGH     5      0    2           NOPDEFGH     2      0    5
    6         JKLMNOPH     6      0    1           OPCDEFGH     1      0    6
    7         IJKLMNOP     7      0    0           PBCDEFGH     0      0    7

LEM     BigEndianMem = 0
BEM     BigEndianMem = 1
Type    AccessType sent to memory
Offset  pAddr[2..0] sent to memory

Exceptions:

TLB refill exception

TLB invalid exception

Bus error exception

Address error exception

Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)


LDR Load Doubleword Right LDR

Bits:    31..26    25..21   20..16   15..0
Field:   LDR       base     rt       offset
Value:   011011
Width:   6         5        5        16

Format:

LDR rt, offset (base)

Description:

This instruction can be used in combination with the LDL instruction to load a register

with eight consecutive bytes from memory, when the bytes cross a boundary between two

doublewords. LDR loads the right portion of the register from the appropriate part of the

low-order doubleword; LDL loads the left portion of the register from the appropriate part of

the high-order doubleword.

The LDR instruction adds its sign-extended 16-bit offset to the contents of general register

base to form a virtual address which can specify an arbitrary byte. It reads bytes only from

the doubleword in memory which contains the specified starting byte. From one to eight

bytes will be loaded, depending on the starting byte specified.

Conceptually, it starts at the specified byte in memory and loads that byte into the low-

order (right-most) byte of the register; then it proceeds toward the high-order byte of the

doubleword in memory and the high-order byte of the register, loading bytes from memory

into the register until it reaches the high-order byte of the doubleword in memory. The most

significant (left-most) byte (s) of the register will not be changed.

LDR $24, 4($0)

memory (big-endian)
    address 0:    0  1  2  3  4  5  6  7
    address 8:    8  9 10 11 12 13 14 15

    $24 before:   A  B  C  D  E  F  G  H
    $24 after:    A  B  C  0  1  2  3  4


The contents of general register rt are internally bypassed within the processor so that no

NOP is needed between an immediately preceding load instruction which specifies register rt

and a following LDR (or LDL) instruction which also specifies register rt.

No address exceptions due to alignment are possible.

Operation:

64 T:   vAddr ← ((offset[15])^48 || offset[15..0]) + GPR[base]
        (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
        pAddr ← pAddr[PSIZE-1..3] || (pAddr[2..0] xor ReverseEndian^3)
        if BigEndianMem = 1 then
            pAddr ← pAddr[31..3] || 0^3
        endif
        byte ← vAddr[2..0] xor BigEndianCPU^3
        mem ← LoadMemory (uncached, byte, pAddr, vAddr, DATA)
        GPR[rt] ← GPR[rt][63..64-8*byte] || mem[63..8*byte]

Note: The same operation applies in 32-bit kernel mode.


Given a doubleword in a register and a doubleword in memory, the operation of LDR is as
follows:

LDR
    Register:  A B C D E F G H
    Memory:    I J K L M N O P

              BigEndianCPU = 0                     BigEndianCPU = 1
                                  Offset                               Offset
vAddr[2..0]   Destination  Type   LEM  BEM         Destination  Type   LEM  BEM
    0         IJKLMNOP     7      0    0           ABCDEFGI     0      7    0
    1         AIJKLMNO     6      1    0           ABCDEFIJ     1      6    0
    2         ABIJKLMN     5      2    0           ABCDEIJK     2      5    0
    3         ABCIJKLM     4      3    0           ABCDIJKL     3      4    0
    4         ABCDIJKL     3      4    0           ABCIJKLM     4      3    0
    5         ABCDEIJK     2      5    0           ABIJKLMN     5      2    0
    6         ABCDEFIJ     1      6    0           AIJKLMNO     6      1    0
    7         ABCDEFGI     0      7    0           IJKLMNOP     7      0    0

LEM     BigEndianMem = 0
BEM     BigEndianMem = 1
Type    AccessType sent to memory
Offset  pAddr[2..0] sent to memory

Exceptions:

TLB refill exception

TLB invalid exception

Bus error exception

Address error exception

Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)
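The LDL/LDR pairing described above can be illustrated with a small software model. The following Python sketch is not part of the TX49/H2 specification; it assumes a big-endian CPU and big-endian memory, and the helper names (ldl_be, ldr_be, load_unaligned_doubleword) are hypothetical. It reproduces the two worked examples (LDL $24, 3($0) and LDR $24, 4($0)) and then combines both instructions to read a doubleword that crosses a doubleword boundary.

```python
MASK64 = (1 << 64) - 1

def ldl_be(mem: bytes, vaddr: int, rt: int) -> int:
    """Model of LDL, assuming big-endian CPU and memory."""
    end = (vaddr & ~7) + 8                     # end of the doubleword containing vaddr
    loaded = int.from_bytes(mem[vaddr:end], "big")
    keep_bits = 8 * (vaddr & 7)                # low-order register bits left unchanged
    return ((loaded << keep_bits) | (rt & ((1 << keep_bits) - 1))) & MASK64

def ldr_be(mem: bytes, vaddr: int, rt: int) -> int:
    """Model of LDR, assuming big-endian CPU and memory."""
    start = vaddr & ~7                         # start of the doubleword containing vaddr
    loaded = int.from_bytes(mem[start:vaddr + 1], "big")
    loaded_bits = 8 * ((vaddr & 7) + 1)        # low-order register bits that are replaced
    return (rt & MASK64 & ~((1 << loaded_bits) - 1)) | loaded

def load_unaligned_doubleword(mem: bytes, addr: int) -> int:
    """LDL rt, 0(addr) followed by LDR rt, 7(addr)."""
    rt = ldl_be(mem, addr, 0)
    return ldr_be(mem, addr + 7, rt)

memory = bytes(range(16))                      # bytes 0..15, as in the figures
reg = int.from_bytes(b"ABCDEFGH", "big")       # register contents A..H
after_ldl = ldl_be(memory, 3, reg).to_bytes(8, "big")
after_ldr = ldr_be(memory, 4, reg).to_bytes(8, "big")
```

On a big-endian system the left part is loaded from the lower address and the right part from the address seven bytes higher; a little-endian configuration would swap the two offsets.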


LH Load Halfword LH

Bits:    31..26    25..21   20..16   15..0
Field:   LH        base     rt       offset
Value:   100001
Width:   6         5        5        16

Format:

LH rt, offset (base)

Description:

The 16-bit offset is sign-extended and added to the contents of general register base to form

a virtual address. The contents of the halfword at the memory location specified by the

effective address are sign-extended and loaded into general register rt.

If the least-significant bit of the effective address is non-zero, an address error exception

occurs.

Operation:

32 T:   vAddr ← ((offset[15])^16 || offset[15..0]) + GPR[base]
        (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
        pAddr ← pAddr[PSIZE-1..3] || (pAddr[2..0] xor (ReverseEndian^2 || 0))
        mem ← LoadMemory (uncached, HALFWORD, pAddr, vAddr, DATA)
        byte ← vAddr[2..0] xor (BigEndianCPU^2 || 0)
        GPR[rt] ← (mem[15+8*byte])^16 || mem[15+8*byte..8*byte]

64 T:   vAddr ← ((offset[15])^48 || offset[15..0]) + GPR[base]
        (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
        pAddr ← pAddr[PSIZE-1..3] || (pAddr[2..0] xor (ReverseEndian^2 || 0))
        mem ← LoadMemory (uncached, HALFWORD, pAddr, vAddr, DATA)
        byte ← vAddr[2..0] xor (BigEndianCPU^2 || 0)
        GPR[rt] ← (mem[15+8*byte])^48 || mem[15+8*byte..8*byte]

Exceptions:

TLB refill exception

TLB invalid exception

Bus error exception

Address error exception


LHU Load Halfword Unsigned LHU

Bits:    31..26    25..21   20..16   15..0
Field:   LHU       base     rt       offset
Value:   100101
Width:   6         5        5        16

Format:

LHU rt, offset (base)

Description:

The 16-bit offset is sign-extended and added to the contents of general register base to form

a virtual address. The contents of the halfword at the memory location specified by the

effective address are zero-extended and loaded into general register rt.

If the least-significant bit of the effective address is non-zero, an address error exception

occurs.

Operation:

32 T:   vAddr ← ((offset[15])^16 || offset[15..0]) + GPR[base]
        (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
        pAddr ← pAddr[PSIZE-1..3] || (pAddr[2..0] xor (ReverseEndian^2 || 0))
        mem ← LoadMemory (uncached, HALFWORD, pAddr, vAddr, DATA)
        byte ← vAddr[2..0] xor (BigEndianCPU^2 || 0)
        GPR[rt] ← 0^16 || mem[15+8*byte..8*byte]

64 T:   vAddr ← ((offset[15])^48 || offset[15..0]) + GPR[base]
        (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
        pAddr ← pAddr[PSIZE-1..3] || (pAddr[2..0] xor (ReverseEndian^2 || 0))
        mem ← LoadMemory (uncached, HALFWORD, pAddr, vAddr, DATA)
        byte ← vAddr[2..0] xor (BigEndianCPU^2 || 0)
        GPR[rt] ← 0^48 || mem[15+8*byte..8*byte]

Exceptions:

TLB refill exception

TLB invalid exception

Bus error exception

Address error exception
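The difference between LH (sign extension) and LHU (zero extension) can be shown with a short model. This Python sketch is illustrative only; it assumes big-endian byte order, and the names load_halfword and AddressError are hypothetical stand-ins for the instruction behavior and the address error exception.

```python
class AddressError(Exception):
    """Stand-in for the address error exception (hypothetical name)."""

def load_halfword(mem: bytes, vaddr: int, signed: bool, reg_bits: int = 64) -> int:
    """Model of LH (signed=True) and LHU (signed=False), big-endian assumed."""
    if vaddr & 1:
        # The least-significant address bit must be zero for a halfword access.
        raise AddressError(f"unaligned halfword address {vaddr:#x}")
    half = int.from_bytes(mem[vaddr:vaddr + 2], "big")
    if signed and (half & 0x8000):
        # Replicate bit 15 into all upper register bits.
        half |= ((1 << reg_bits) - 1) ^ 0xFFFF
    return half

memory = bytes([0xFF, 0xFE, 0x12, 0x34])
lh_result = load_halfword(memory, 0, signed=True)     # 64-bit mode
lhu_result = load_halfword(memory, 0, signed=False)
```

With reg_bits=32 the same function models the 32-bit operation, where only 16 upper bits are filled.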


LL Load Linked LL

Bits:    31..26    25..21   20..16   15..0
Field:   LL        base     rt       offset
Value:   110000
Width:   6         5        5        16

Format:

LL rt, offset (base)

Description:

The 16-bit offset is sign-extended and added to the contents of general register base to form

a virtual address. The contents of the word at the memory location specified by the effective

address are loaded into general register rt. In 64-bit mode, the loaded word is sign-extended.

Exceptions:

TLB refill exception

TLB invalid exception

Bus error exception

Address error exception


LLD Load Linked Doubleword LLD

Bits:    31..26    25..21   20..16   15..0
Field:   LLD       base     rt       offset
Value:   110100
Width:   6         5        5        16

Format:

LLD rt, offset (base)

Description:

The 16-bit offset is sign-extended and added to the contents of general register base to form

a virtual address. The contents of the doubleword at the memory location specified by the

effective address are loaded into general register rt.

The processor begins checking the accessed doubleword for modification by other

processors and devices.

Load Linked Doubleword and Store Conditional Doubleword can be used to atomically

update memory locations:

    L1:  LLD  T1, (T0)
         ADD  T2, T1, 1
         SCD  T2, (T0)
         BEQ  T2, 0, L1
         NOP

This atomically increments the doubleword addressed by T0. Changing the ADD to an OR

changes this to an atomic bit set.
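The retry loop above can be modeled in software to show why the branch back to L1 is needed. The following Python sketch is a toy model, not a description of the TX49/H2 hardware: the class LinkedDoubleword, its methods, and the rule that any other write clears the link bit are assumptions made for illustration.

```python
MASK64 = (1 << 64) - 1

class LinkedDoubleword:
    """Toy model of one memory doubleword plus the LLbit (hypothetical)."""

    def __init__(self, value: int = 0):
        self.value = value & MASK64
        self.llbit = 0

    def lld(self) -> int:
        self.llbit = 1                 # LLbit <- 1, as in the Operation section
        return self.value

    def scd(self, new_value: int) -> int:
        if self.llbit == 0:
            return 0                   # store fails; the register receives 0
        self.value = new_value & MASK64
        return 1                       # store succeeds; the register receives 1

    def other_write(self, new_value: int) -> None:
        # Assumption: a write by another processor or device breaks the link.
        self.value = new_value & MASK64
        self.llbit = 0

def atomic_update(mem, op, between=None) -> int:
    """Run the LLD / op / SCD / BEQ loop; return the number of attempts."""
    attempts = 0
    while True:
        attempts += 1
        t1 = mem.lld()                 # L1: LLD T1, (T0)
        t2 = op(t1)                    #     ADD (or OR) T2, T1, ...
        if between is not None:
            between(mem, attempts)     # hook to inject interference in the demo
        if mem.scd(t2):                #     SCD T2, (T0); BEQ T2, 0, L1
            return attempts

counter = LinkedDoubleword(41)
first = atomic_update(counter, lambda v: v + 1)

def interfere_once(mem, attempt):
    if attempt == 1:
        mem.other_write(100)

second = atomic_update(counter, lambda v: v | 0x1, between=interfere_once)
```

In the second call the first SCD fails because of the intervening write, so the loop repeats and the OR is applied to the new value.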


The operation of LLD is undefined if the addressed location is uncached and, for

synchronization between multiple processors, the operation of LLD is undefined if the

addressed location is noncoherent.

A cache miss that occurs between LLD and SCD may cause SCD to fail, so no load or store

instruction should occur between LLD and SCD. Exceptions also cause SCD to fail, so

persistent exceptions must be avoided.

This instruction is available in User mode, and it is not necessary for CP0 to be enabled.

If any of the three least-significant bits of the effective address are non-zero, an address

error exception takes place.

Operation:

64 T:   vAddr ← ((offset[15])^48 || offset[15..0]) + GPR[base]
        (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
        mem ← LoadMemory (uncached, DOUBLEWORD, pAddr, vAddr, DATA)
        GPR[rt] ← mem
        LLbit ← 1

Note: The same operation applies in 32-bit kernel mode.

Exceptions:

TLB refill exception

TLB invalid exception

Bus error exception

Address error exception

Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)


LUI Load Upper Immediate LUI

Bits:    31..26    25..21   20..16   15..0
Field:   LUI       0        rt       immediate
Value:   001111    00000
Width:   6         5        5        16

Format:

LUI rt, immediate

Description:

The 16-bit immediate is shifted left 16 bits and concatenated to 16 bits of zeros. The result
is placed into general register rt. In 64-bit mode, the loaded word is sign-extended.

Operation:

32 T:   GPR[rt] ← immediate || 0^16

64 T:   GPR[rt] ← (immediate[15])^32 || immediate || 0^16

Exceptions:

None
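As an illustration of the LUI operation, the following Python sketch computes the register value for both modes. The function name lui and the mode64 flag are hypothetical; the arithmetic follows the description above (shift left by 16, then sign-extend from bit 31 in 64-bit mode).

```python
def lui(immediate: int, mode64: bool = True) -> int:
    """Model of LUI: place the 16-bit immediate in bits 31..16 of the result."""
    word = (immediate & 0xFFFF) << 16          # immediate || 0^16
    if mode64 and (word & 0x8000_0000):
        word |= 0xFFFF_FFFF_0000_0000          # replicate bit 31 into bits 63..32
    return word

positive = lui(0x1234)
negative = lui(0x8000)
negative32 = lui(0x8000, mode64=False)
```

An immediate with bit 15 set therefore yields a value whose upper 32 bits are all ones in 64-bit mode.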


LW Load Word LW

Bits:    31..26    25..21   20..16   15..0
Field:   LW        base     rt       offset
Value:   100011
Width:   6         5        5        16

Format:

LW rt, offset (base)

Description:

The 16-bit offset is sign-extended and added to the contents of general register base to form

a virtual address. The contents of the word at the memory location specified by the effective

address are loaded into general register rt. In 64-bit mode, the loaded word is sign-extended.

If either of the two least-significant bits of the effective address is non-zero, an address

error exception occurs.

Operation:

32 T:   vAddr ← ((offset[15])^16 || offset[15..0]) + GPR[base]
        (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
        pAddr ← pAddr[PSIZE-1..3] || (pAddr[2..0] xor (ReverseEndian || 0^2))
        mem ← LoadMemory (uncached, WORD, pAddr, vAddr, DATA)
        byte ← vAddr[2..0] xor (BigEndianCPU || 0^2)
        GPR[rt] ← mem[31+8*byte..8*byte]

64 T:   vAddr ← ((offset[15])^48 || offset[15..0]) + GPR[base]
        (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
        pAddr ← pAddr[PSIZE-1..3] || (pAddr[2..0] xor (ReverseEndian || 0^2))
        mem ← LoadMemory (uncached, WORD, pAddr, vAddr, DATA)
        byte ← vAddr[2..0] xor (BigEndianCPU || 0^2)
        GPR[rt] ← (mem[31+8*byte])^32 || mem[31+8*byte..8*byte]

Exceptions:

TLB refill exception

TLB invalid exception

Bus error exception

Address error exception


LWCz Load Word To Coprocessor z LWCz

Bits:    31..26    25..21   20..16   15..0
Field:   LWCz      base     rt       offset
Value:   1100xx*
Width:   6         5        5        16

Format:

LWCz rt, offset (base)

Description:

The 16-bit offset is sign-extended and added to the contents of general register base to form

a virtual address. The processor reads a word from the addressed memory location, and

makes the data available to coprocessor unit z. The manner in which each coprocessor uses

the data is defined by the individual coprocessor specifications.

If either of the two least-significant bits of the effective address is non-zero, an address

error exception occurs.

This instruction is not valid for use with CP0.

*See the “Opcode Bit Encoding” table later in this section, or “CPU Instruction Opcode Bit
Encoding” at the end of Appendix A.


Operation:

32 T:   vAddr ← ((offset[15])^16 || offset[15..0]) + GPR[base]
        (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
        pAddr ← pAddr[PSIZE-1..3] || (pAddr[2..0] xor (ReverseEndian || 0^2))
        mem ← LoadMemory (uncached, WORD, pAddr, vAddr, DATA)
        byte ← vAddr[2..0] xor (BigEndianCPU || 0^2)
        COPzLW (byte, rt, mem)

64 T:   vAddr ← ((offset[15])^48 || offset[15..0]) + GPR[base]
        (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
        pAddr ← pAddr[PSIZE-1..3] || (pAddr[2..0] xor (ReverseEndian || 0^2))
        mem ← LoadMemory (uncached, WORD, pAddr, vAddr, DATA)
        byte ← vAddr[2..0] xor (BigEndianCPU || 0^2)
        COPzLW (byte, rt, mem)

Exceptions:

TLB refill exception

TLB invalid exception

Bus error exception

Address error exception

Coprocessor unusable exception

Opcode Bit Encoding:

LWCz    Bits 31..28 (opcode)    Bits 27..26 (coprocessor unit number)
LWC1    1100                    01
LWC2    1100                    10


LWL Load Word Left LWL

Bits:    31..26    25..21   20..16   15..0
Field:   LWL       base     rt       offset
Value:   100010
Width:   6         5        5        16

Format:

LWL rt, offset (base)

Description:

This instruction can be used in combination with the LWR instruction to load a register

with four consecutive bytes from memory, when the bytes cross a boundary between two

words. LWL loads the left portion of the register from the appropriate part of the high-order

word; LWR loads the right portion of the register from the appropriate part of the low-order

word.

The LWL instruction adds its sign-extended 16-bit offset to the contents of general register

base to form a virtual address which can specify an arbitrary byte. It reads bytes only from

the word in memory which contains the specified starting byte. From one to four bytes will

be loaded, depending on the starting byte specified. In 64-bit mode, the loaded word is sign-extended.

Conceptually, it starts at the specified byte in memory and loads that byte into the high-

order (left-most) byte of the register; then it proceeds toward the low-order byte of the word

in memory and the low-order byte of the register, loading bytes from memory into the
register until it reaches the low-order byte of the word in memory. The least-significant
(right-most) byte(s) of the register will not be changed.

LWL $24, 1($0)

memory (big-endian)
    address 0:    0  1  2  3
    address 4:    4  5  6  7

    $24 before:   A  B  C  D
    $24 after:    1  2  3  D


The contents of general register rt are internally bypassed within the processor so that no

NOP is needed between an immediately preceding load instruction which specifies register rt

and a following LWL (or LWR) instruction which also specifies register rt.

No address exceptions due to alignment are possible.

Operation:

32 T:   vAddr ← ((offset[15])^16 || offset[15..0]) + GPR[base]
        (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
        pAddr ← pAddr[PSIZE-1..3] || (pAddr[2..0] xor ReverseEndian^3)
        if BigEndianMem = 0 then
            pAddr ← pAddr[PSIZE-1..2] || 0^2
        endif
        byte ← vAddr[1..0] xor BigEndianCPU^2
        word ← vAddr[2] xor BigEndianCPU
        mem ← LoadMemory (uncached, 0 || byte, pAddr, vAddr, DATA)
        temp ← mem[32*word+8*byte+7..32*word] || GPR[rt][23-8*byte..0]
        GPR[rt] ← temp

64 T:   vAddr ← ((offset[15])^48 || offset[15..0]) + GPR[base]
        (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
        pAddr ← pAddr[PSIZE-1..3] || (pAddr[2..0] xor ReverseEndian^3)
        if BigEndianMem = 0 then
            pAddr ← pAddr[PSIZE-1..2] || 0^2
        endif
        byte ← vAddr[1..0] xor BigEndianCPU^2
        word ← vAddr[2] xor BigEndianCPU
        mem ← LoadMemory (uncached, 0 || byte, pAddr, vAddr, DATA)
        temp ← mem[32*word+8*byte+7..32*word] || GPR[rt][23-8*byte..0]
        GPR[rt] ← (temp[31])^32 || temp


Given a doubleword in a register and a doubleword in memory, the operation of LWL is as
follows:

LWL
    Register:  A B C D E F G H
    Memory:    I J K L M N O P

              BigEndianCPU = 0                     BigEndianCPU = 1
                                  Offset                               Offset
vAddr[2..0]   Destination  Type   LEM  BEM         Destination  Type   LEM  BEM
    0         SSSSPFGH     0      0    7           SSSSIJKL     3      4    0
    1         SSSSOPGH     1      0    6           SSSSJKLH     2      4    1
    2         SSSSNOPH     2      0    5           SSSSKLGH     1      4    2
    3         SSSSMNOP     3      0    4           SSSSLFGH     0      4    3
    4         SSSSLFGH     0      4    3           SSSSMNOP     3      0    4
    5         SSSSKLGH     1      4    2           SSSSNOPH     2      0    5
    6         SSSSJKLH     2      4    1           SSSSOPGH     1      0    6
    7         SSSSIJKL     3      4    0           SSSSPFGH     0      0    7

LEM     BigEndianMem = 0
BEM     BigEndianMem = 1
Type    AccessType (see Figure 2-2) sent to memory
Offset  pAddr[2..0] sent to memory
S       sign-extend of destination bit 31

Exceptions:

TLB refill exception

TLB invalid exception

Bus error exception

Address error exception


LWR Load Word Right LWR

Bits:    31..26    25..21   20..16   15..0
Field:   LWR       base     rt       offset
Value:   100110
Width:   6         5        5        16

Format:

LWR rt, offset (base)

Description:

This instruction can be used in combination with the LWL instruction to load a register

with four consecutive bytes from memory, when the bytes cross a boundary between two

words. LWR loads the right portion of the register from the appropriate part of the low-order

word; LWL loads the left portion of the register from the appropriate part of the high-order

word.

The LWR instruction adds its sign-extended 16-bit offset to the contents of general register

base to form a virtual address which can specify an arbitrary byte. It reads bytes only from

the word in memory which contains the specified starting byte. From one to four bytes will

be loaded, depending on the starting byte specified. In 64-bit mode, if bit 31 of the

destination register is loaded, then the loaded word is sign-extended.

Conceptually, it starts at the specified byte in memory and loads that byte into the low-order
(right-most) byte of the register; then it proceeds toward the high-order byte of the
word in memory and the high-order byte of the register, loading bytes from memory into the
register until it reaches the high-order byte of the word in memory. The most significant
(left-most) byte(s) of the register will not be changed.

LWR $24, 4($0)

memory (big-endian)
    address 0:    0  1  2  3
    address 4:    4  5  6  7

    $24 before:   A  B  C  D
    $24 after:    A  B  C  4


The contents of general register rt are internally bypassed within the processor so that no

NOP is needed between an immediately preceding load instruction which specifies register rt

and a following LWR (or LWL) instruction which also specifies register rt.

No address exceptions due to alignment are possible.

Operation:

32 T:   vAddr ← ((offset[15])^16 || offset[15..0]) + GPR[base]
        (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
        pAddr ← pAddr[PSIZE-1..3] || (pAddr[2..0] xor ReverseEndian^3)
        if BigEndianMem = 1 then
            pAddr ← pAddr[PSIZE-1..3] || 0^3
        endif
        byte ← vAddr[1..0] xor BigEndianCPU^2
        word ← vAddr[2] xor BigEndianCPU
        mem ← LoadMemory (uncached, 0 || byte, pAddr, vAddr, DATA)
        temp ← GPR[rt][31..32-8*byte] || mem[31+32*word..32*word+8*byte]
        GPR[rt] ← temp

64 T:   vAddr ← ((offset[15])^48 || offset[15..0]) + GPR[base]
        (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
        pAddr ← pAddr[PSIZE-1..3] || (pAddr[2..0] xor ReverseEndian^3)
        if BigEndianMem = 1 then
            pAddr ← pAddr[PSIZE-1..3] || 0^3
        endif
        byte ← vAddr[1..0] xor BigEndianCPU^2
        word ← vAddr[2] xor BigEndianCPU
        mem ← LoadMemory (uncached, 0 || byte, pAddr, vAddr, DATA)
        temp ← GPR[rt][31..32-8*byte] || mem[31+32*word..32*word+8*byte]
        GPR[rt] ← (temp[31])^32 || temp


Given a word in a register and a word in memory, the operation of LWR is as follows:

LWR
    Register:  A B C D E F G H
    Memory:    I J K L M N O P

              BigEndianCPU = 0                     BigEndianCPU = 1
                                  Offset                               Offset
vAddr[2..0]   Destination  Type   LEM  BEM         Destination  Type   LEM  BEM
    0         SSSSMNOP     0      0    4           XXXXEFGI     0      7    0
    1         XXXXEMNO     1      1    4           XXXXEFIJ     1      6    0
    2         XXXXEFMN     2      2    4           XXXXEIJK     2      5    0
    3         XXXXEFGM     3      3    4           SSSSIJKL     3      4    0
    4         SSSSIJKL     0      4    0           XXXXEFGM     4      3    4
    5         XXXXEIJK     1      5    0           XXXXEFMN     5      2    4
    6         XXXXEFIJ     2      6    0           XXXXEMNO     6      1    4
    7         XXXXEFGI     3      7    0           SSSSMNOP     7      0    4

LEM     BigEndianMem = 0
BEM     BigEndianMem = 1
Type    AccessType (see Figure 2-2) sent to memory
Offset  pAddr[2..0] sent to memory
S       sign-extend of destination bit 31

Exceptions:

TLB refill exception

TLB invalid exception

Bus error exception

Address error exception
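The LWL/LWR pairing can be modeled in the same way as the worked figures. The Python sketch below is illustrative only; it assumes a big-endian CPU and memory, models the 32-bit register merge, and then applies the 64-bit mode sign extension for a complete unaligned word. The helper names (lwl_be, lwr_be, sign_extend32, load_unaligned_word) are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def lwl_be(mem: bytes, vaddr: int, rt: int) -> int:
    """Model of LWL on the low 32 register bits, big-endian assumed."""
    end = (vaddr & ~3) + 4                     # end of the word containing vaddr
    loaded = int.from_bytes(mem[vaddr:end], "big")
    keep_bits = 8 * (vaddr & 3)                # low-order bits left unchanged
    return ((loaded << keep_bits) | (rt & ((1 << keep_bits) - 1))) & MASK32

def lwr_be(mem: bytes, vaddr: int, rt: int) -> int:
    """Model of LWR on the low 32 register bits, big-endian assumed."""
    start = vaddr & ~3                         # start of the word containing vaddr
    loaded = int.from_bytes(mem[start:vaddr + 1], "big")
    loaded_bits = 8 * ((vaddr & 3) + 1)        # low-order bits that are replaced
    return (rt & MASK32 & ~((1 << loaded_bits) - 1)) | loaded

def sign_extend32(word: int) -> int:
    """64-bit mode: replicate bit 31 into bits 63..32."""
    word &= MASK32
    return (word | 0xFFFFFFFF00000000) if (word & 0x80000000) else word

def load_unaligned_word(mem: bytes, addr: int) -> int:
    """LWL rt, 0(addr) followed by LWR rt, 3(addr), then sign-extend."""
    rt = lwl_be(mem, addr, 0)
    return sign_extend32(lwr_be(mem, addr + 3, rt))

memory = bytes(range(8))                       # bytes 0..7, as in the figures
reg = int.from_bytes(b"ABCD", "big")
after_lwl = lwl_be(memory, 1, reg).to_bytes(4, "big")
after_lwr = lwr_be(memory, 4, reg).to_bytes(4, "big")
negative = load_unaligned_word(bytes([0x00, 0x80, 0x01, 0x02, 0x03, 0, 0, 0]), 1)
```

The two figure examples are reproduced by after_lwl and after_lwr; the last line reads the word 0x80010203 from address 1 and shows the sign extension that applies once bit 31 has been loaded.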


LWU Load Word Unsigned LWU

Bits:    31..26    25..21   20..16   15..0
Field:   LWU       base     rt       offset
Value:   100111
Width:   6         5        5        16

Format:

LWU rt, offset (base)

Description:

The 16-bit offset is sign-extended and added to the contents of general register base to form
a virtual address. The contents of the word at the memory location specified by the effective
address are loaded into general register rt. The loaded word is zero-extended.

If either of the two least-significant bits of the effective address is non-zero, an address

error exception occurs.

Operation:

64 T:   vAddr ← ((offset[15])^48 || offset[15..0]) + GPR[base]
        (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
        pAddr ← pAddr[PSIZE-1..3] || (pAddr[2..0] xor (ReverseEndian || 0^2))
        mem ← LoadMemory (uncached, WORD, pAddr, vAddr, DATA)
        byte ← vAddr[2..0] xor (BigEndianCPU || 0^2)
        GPR[rt] ← 0^32 || mem[31+8*byte..8*byte]

Note: The same operation applies in 32-bit kernel mode.

Exceptions:

TLB refill exception

TLB invalid exception

Bus error exception

Address error exception

Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)


MADD Multiply/Add MADD

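MADD rd, rs, rt:
Bits:    31..26    25..21   20..16   15..11   10..6    5..0
Field:   MAC       rs       rt       rd       0        MADD
Value:   011100                               00000    000000
Width:   6         5        5        5        5        6

MADD rs, rt:
Bits:    31..26    25..21   20..16   15..6           5..0
Field:   MAC       rs       rt       0               MADD
Value:   011100                      00 0000 0000    000000
Width:   6         5        5        10              6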

Format:

• MADD rs, rt

• MADD rd, rs, rt

Description:

Multiplies the contents of general registers rs and rt, treating both values as two’s
complement, adds the product to the combined contents of special registers HI and LO, and
puts the doubleword result in special registers HI and LO. An overflow exception is never
raised. The low-order word of the result is put in general register rd and in special
register LO, whereas the high-order word of the result is put in special register HI.

If rd is omitted in assembly language, 0 is used as the default value. To guarantee correct
operation even if an interrupt occurs, neither of the two instructions following MADD should
be a DIV or DIVU instruction, which modify the HI and LO register contents.

Operation:

32, 64 T:   t ← (HI || LO) + GPR[rs] * GPR[rt]
            LO ← t[31..0]
            HI ← t[63..32]
            GPR[rd] ← t[31..0]

Exception:

None


MADDU Multiply/Add Unsigned MADDU

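MADDU rd, rs, rt:
Bits:    31..26    25..21   20..16   15..11   10..6    5..0
Field:   MAC       rs       rt       rd       0        MADDU
Value:   011100                               00000    000001
Width:   6         5        5        5        5        6

MADDU rs, rt:
Bits:    31..26    25..21   20..16   15..6           5..0
Field:   MAC       rs       rt       0               MADDU
Value:   011100                      00 0000 0000    000001
Width:   6         5        5        10              6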

Format:

MADDU rs, rt

MADDU rd, rs, rt

Description:

Multiplies the contents of general registers rs and rt, treating both values as unsigned,
adds the product to the combined contents of special registers HI and LO, and puts the
doubleword result in special registers HI and LO. An overflow exception is never raised.
The low-order word of the result is put in general register rd and in special register LO,
whereas the high-order word of the result is put in special register HI.

If rd is omitted in assembly language, 0 is used as the default value. To guarantee correct
operation even if an interrupt occurs, neither of the two instructions following MADDU should
be a DIV or DIVU instruction, which modify the HI and LO register contents.

Operation:

32, 64 T:   t ← (HI || LO) + (0 || GPR[rs]) * (0 || GPR[rt])
            LO ← t[31..0]
            HI ← t[63..32]
            GPR[rd] ← t[31..0]

Exception:

None
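The multiply-accumulate behavior of MADD and MADDU can be sketched as follows. This Python model is illustrative, not a definitive implementation: it treats HI, LO, rs and rt as 32-bit quantities only (the upper register halves in 64-bit mode are not modeled), and the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF
MASK64 = (1 << 64) - 1

def to_signed32(value: int) -> int:
    """Interpret the low 32 bits as a two's-complement number."""
    value &= MASK32
    return value - (1 << 32) if (value & 0x80000000) else value

def multiply_add(hi: int, lo: int, rs: int, rt: int, signed: bool = True):
    """Model of MADD (signed=True) and MADDU (signed=False).

    Returns (new HI, new LO, value written to rd).
    """
    a = to_signed32(rs) if signed else rs & MASK32
    b = to_signed32(rt) if signed else rt & MASK32
    acc = ((hi & MASK32) << 32) | (lo & MASK32)   # HI || LO
    t = (acc + a * b) & MASK64                    # no overflow exception is raised
    new_lo = t & MASK32
    new_hi = t >> 32
    return new_hi, new_lo, new_lo                 # rd receives the low-order word

accumulated = multiply_add(0, 10, 3, 4)
signed_case = multiply_add(0, 0, 0xFFFFFFFF, 2)               # (-1) * 2
unsigned_case = multiply_add(0, 0, 0xFFFFFFFF, 2, signed=False)
```

The last two calls use the same operand bit patterns and differ only in how 0xFFFFFFFF is interpreted, which changes the high-order word of the result.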


MFC0 Move From System

Control Coprocessor 0 MFC0

Bits:    31..26    25..21   20..16   15..11   10..0
Field:   COP0      MF       rt       rd       0
Value:   010000    00000                      000 0000 0000
Width:   6         5        5        5        11

Format:

MFC0 rt, rd

Description:

The contents of coprocessor register rd of the CP0 are loaded into general register rt.

May be used on both 32-bit and 64-bit CP0 registers.

Operation:

32 T:       data ← CPR[0,rd]
   T + 1:   GPR[rt] ← data

64 T:       data ← CPR[0,rd]
   T + 1:   GPR[rt] ← (data[31])^32 || data[31..0]

Exceptions:

Coprocessor unusable exception


MFCz Move From Coprocessor z MFCz

Bits:    31..26    25..21   20..16   15..11   10..0
Field:   COPz      MF       rt       rd       0
Value:   0100xx*   00000                      000 0000 0000
Width:   6         5        5        5        11

Format:

MFCz rt, rd

Description:

The contents of coprocessor register rd of coprocessor z are loaded into general register rt.

Execution of the instruction referencing coprocessor 3 causes a reserved instruction

exception, not a coprocessor unusable exception.

Operation:

32 T:       data ← CPR[z,rd]
   T + 1:   GPR[rt] ← data

64 T:       if rd[0] = 0 then
                data ← CPR[z, rd[4..1] || 0][31..0]
            else
                data ← CPR[z, rd[4..1] || 0][63..32]
            endif
   T + 1:   GPR[rt] ← (data[31])^32 || data

Exceptions:

Coprocessor unusable exception

Reserved instruction exception (coprocessor 3)

*See the “Opcode Bit Encoding” table later in this section, or “CPU Instruction Opcode Bit
Encoding” at the end of Appendix A.


Opcode Bit Encoding:

MFCz    Bits 31..28 (opcode)    Bits 27..26 (coprocessor unit number)    Bits 25..21 (coprocessor suboperation)
MFC0    0100                    00                                       00000
MFC1    0100                    01                                       00000
MFC2    0100                    10                                       00000


MFHI Move From HI MFHI

Bits:    31..26    25..16          15..11   10..6    5..0
Field:   SPECIAL   0               rd       0        MFHI
Value:   000000    00 0000 0000             00000    010000
Width:   6         10              5        5        6

Format:

MFHI rd

Description:

The contents of special register HI are loaded into general register rd.

To ensure proper operation in the event of interruptions, the two instructions which follow

a MFHI instruction may not be any of the instructions which modify the HI register: MULT,

MULTU, DIV, DIVU, MTHI, DMULT, DMULTU, DDIV, DDIVU, MADD, MADDU.

Operation:

32, 64 T:   GPR[rd] ← HI

Exceptions:

None


MFLO Move From LO MFLO

Bits:    31..26    25..16          15..11   10..6    5..0
Field:   SPECIAL   0               rd       0        MFLO
Value:   000000    00 0000 0000             00000    010010
Width:   6         10              5        5        6

Format:

MFLO rd

Description:

The contents of special register LO are loaded into general register rd.

To ensure proper operation in the event of interruptions, the two instructions which follow

a MFLO instruction may not be any of the instructions which modify the LO register: MULT,

MULTU, DIV, DIVU, MTLO, DMULT, DMULTU, DDIV, DDIVU, MADD, MADDU.

Operation:

32, 64 T:   GPR[rd] ← LO

Exceptions:

None


MTC0 Move To System Control

Coprocessor 0 MTC0

Bits:    31..26    25..21   20..16   15..11   10..0
Field:   COP0      MT       rt       rd       0
Value:   010000    00100                      000 0000 0000
Width:   6         5        5        5        11

Format:

MTC0 rt, rd

Description:

The contents of general register rt are loaded into coprocessor register rd of the CP0.

Because the state of the virtual address translation system may be altered by this
instruction, the operation of load and store instructions and TLB operations immediately
before and after this instruction is undefined.

Operation:

32, 64 T:       data ← GPR[rt]
       T + 1:   CPR[0,rd] ← data

Exceptions:

Coprocessor unusable exception


MTCz Move To Coprocessor z MTCz

Bits:    31..26    25..21   20..16   15..11   10..0
Field:   COPz      MT       rt       rd       0
Value:   0100xx*   00100                      000 0000 0000
Width:   6         5        5        5        11

Format:

MTCz rt, rd

Description:

The contents of general register rt are loaded into coprocessor register rd of coprocessor z.

Execution of the instruction referencing coprocessor 3 causes a reserved instruction

exception, not a coprocessor unusable exception.

Operation:

32 T:       data ← GPR[rt]
   T + 1:   CPR[z,rd] ← data

64 T:       data ← GPR[rt][31..0]
   T + 1:   if rd[0] = 0 then
                CPR[z, rd[4..1] || 0] ← CPR[z, rd[4..1] || 0][63..32] || data
            else
                CPR[z, rd[4..1] || 0] ← data || CPR[z, rd[4..1] || 0][31..0]
            endif

Exceptions:

Coprocessor unusable exception

Reserved instruction exception (coprocessor 3)

*Opcode Bit Encoding:

MTCz    Bits 31..28 (opcode)    Bits 27..26 (coprocessor unit number)    Bits 25..21 (coprocessor suboperation)
MTC0    0100                    00                                       00100
MTC1    0100                    01                                       00100
MTC2    0100                    10                                       00100


MTHI Move To HI MTHI

Bits:    31..26    25..21   20..6                  5..0
Field:   SPECIAL   rs       0                      MTHI
Value:   000000             000 0000 0000 0000     010001
Width:   6         5        15                     6

Format:

MTHI rs

Description:

The contents of general register rs are loaded into special register HI.

If a MTHI operation is executed following a MULT, MULTU, DIV, DIVU, DMULT,
DMULTU, DDIV, DDIVU, MADD, or MADDU instruction, but before any MFLO, MFHI,
MTLO, or MTHI instructions, the contents of special register LO are undefined.

Operation:

32, 64 T − 2: HI ← undefined

T − 1: HI ← undefined

T: HI ← GPR[rs]

Exceptions:

None


MTLO Move To LO MTLO

Bits:    31..26    25..21   20..6                  5..0
Field:   SPECIAL   rs       0                      MTLO
Value:   000000             000 0000 0000 0000     010011
Width:   6         5        15                     6

Format:

MTLO rs

Description:

The contents of general register rs are loaded into special register LO.

If a MTLO operation is executed following a MULT, MULTU, DIV, DIVU, DMULT, DMULTU,
DDIV, DDIVU, MADD, or MADDU instruction, but before any MFLO, MFHI, MTLO, or MTHI
instructions, the contents of special register HI are undefined.

Operation:

32, 64 T − 2: LO ← undefined

T − 1: LO ← undefined

T: LO ← GPR[rs]

Exceptions:

None


MULT Multiply MULT

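MULT rs, rt:
Bits:    31..26    25..21   20..16   15..6           5..0
Field:   SPECIAL   rs       rt       0               MULT
Value:   000000                      00 0000 0000    011000
Width:   6         5        5        10              6

MULT rd, rs, rt:
Bits:    31..26    25..21   20..16   15..11   10..6    5..0
Field:   SPECIAL   rs       rt       rd       0        MULT
Value:   000000                               0 0000   011000
Width:   6         5        5        5        5        6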

Format:

MULT rs, rt

MULT rd, rs, rt

Description:

The contents of general registers rs and rt are multiplied, treating both operands as 32-bit

2’s-complement values. No integer overflow exception occurs under any circumstances. In

64-bit mode, the operands must be valid 32-bit, sign-extended values.

When the operation completes, the low-order word of the double result is loaded into
special register LO, and the high-order word of the double result is loaded into special
register HI.

If either of the two preceding instructions is MFHI or MFLO, the results of these
instructions are undefined. Correct operation requires separating reads of HI or LO from
writes by a minimum of two other instructions.

Operation:

32 T − 2:   LO ← undefined
            HI ← undefined
   T − 1:   LO ← undefined
            HI ← undefined
   T:       t ← GPR[rs] * GPR[rt]
            LO ← t[31..0]
            HI ← t[63..32]
            GPR[rd] ← t[31..0]

64 T − 2:   LO ← undefined
            HI ← undefined
   T − 1:   LO ← undefined
            HI ← undefined
   T:       t ← GPR[rs][31..0] * GPR[rt][31..0]
            LO ← (t[31])^32 || t[31..0]
            HI ← (t[63])^32 || t[63..32]
            GPR[rd] ← (t[31])^32 || t[31..0]

Exceptions:

None


MULTU Multiply Unsigned MULTU

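MULTU rs, rt:
Bits:    31..26    25..21   20..16   15..6           5..0
Field:   SPECIAL   rs       rt       0               MULTU
Value:   000000                      00 0000 0000    011001
Width:   6         5        5        10              6

MULTU rd, rs, rt:
Bits:    31..26    25..21   20..16   15..11   10..6    5..0
Field:   SPECIAL   rs       rt       rd       0        MULTU
Value:   000000                               0 0000   011001
Width:   6         5        5        5        5        6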

Format:

MULTU rs, rt

MULTU rd, rs, rt

Description:

The contents of general register rs and the contents of general register rt are multiplied,

treating both operands as unsigned values. No overflow exception occurs under any

circumstances. In 64-bit mode, the operands must be valid 32-bit, sign-extended values.

When the operation completes, the low-order word of the double result is loaded into
special register LO, and the high-order word of the double result is loaded into special
register HI.

If either of the two preceding instructions is MFHI or MFLO, the results of these
instructions are undefined. Correct operation requires separating reads of HI or LO from
writes by a minimum of two instructions.

Operation:

32 T − 2:   LO ← undefined
            HI ← undefined
   T − 1:   LO ← undefined
            HI ← undefined
   T:       t ← (0 || GPR[rs]) * (0 || GPR[rt])
            LO ← t[31..0]
            HI ← t[63..32]
            GPR[rd] ← t[31..0]

64 T − 2:   LO ← undefined
            HI ← undefined
   T − 1:   LO ← undefined
            HI ← undefined
   T:       t ← (0 || GPR[rs][31..0]) * (0 || GPR[rt][31..0])
            LO ← (t[31])^32 || t[31..0]
            HI ← (t[63])^32 || t[63..32]
            GPR[rd] ← (t[31])^32 || t[31..0]

Exceptions:

None
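The 64-bit mode operation of MULT and MULTU, in which each 32-bit half of the product is sign-extended before being written to LO and HI, can be sketched as follows. This Python model is illustrative only; the function names are hypothetical, and the T − 2 and T − 1 "undefined" states are not modeled.

```python
MASK32 = 0xFFFFFFFF
MASK64 = (1 << 64) - 1

def to_signed32(value: int) -> int:
    """Interpret the low 32 bits as a two's-complement number."""
    value &= MASK32
    return value - (1 << 32) if (value & 0x80000000) else value

def sign_extend32(word: int) -> int:
    """Replicate bit 31 of the low word into bits 63..32."""
    word &= MASK32
    return (word | 0xFFFFFFFF00000000) if (word & 0x80000000) else word

def mult_64bit_mode(rs: int, rt: int, signed: bool = True):
    """Model of MULT (signed=True) and MULTU (signed=False) in 64-bit mode.

    Returns (HI, LO); rd, when specified, receives the same value as LO.
    """
    a = to_signed32(rs) if signed else rs & MASK32
    b = to_signed32(rt) if signed else rt & MASK32
    t = (a * b) & MASK64                      # 64-bit product, no overflow exception
    lo = sign_extend32(t)                     # (t[31])^32 || t[31..0]
    hi = sign_extend32(t >> 32)               # (t[63])^32 || t[63..32]
    return hi, lo

small = mult_64bit_mode(6, 7)
signed_case = mult_64bit_mode(MASK64, 2)                  # (-1) * 2
unsigned_case = mult_64bit_mode(0xFFFFFFFF, 2, signed=False)
```

Note that LO is sign-extended from bit 31 of the product even for MULTU, as the Operation section specifies.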


NOR Nor NOR

Bits:    31..26    25..21   20..16   15..11   10..6    5..0
Field:   SPECIAL   rs       rt       rd       0        NOR
Value:   000000                               00000    100111
Width:   6         5        5        5        5        6

Format:

NOR rd, rs, rt

Description:

The contents of general register rs are combined with the contents of general register rt in

a bit-wise logical NOR operation. The result is placed into general register rd.

Operation:

32, 64   T:  GPR[rd] ← GPR[rs] nor GPR[rt]

Exceptions:

None


OR Or OR

Bits:   31-26             25-21  20-16  15-11  10-6   5-0
Field:  SPECIAL (000000)  rs     rt     rd     00000  OR (100101)

Format:

OR rd, rs, rt

Description:

The contents of general register rs are combined with the contents of general register rt in

a bit-wise logical OR operation. The result is placed into general register rd.

Operation:

32, 64   T:  GPR[rd] ← GPR[rs] or GPR[rt]

Exceptions:

None


ORI Or Immediate ORI

Bits:   31-26         25-21  20-16  15-0
Field:  ORI (001101)  rs     rt     immediate

Format:

ORI rt, rs, immediate

Description:

The 16-bit immediate is zero-extended and combined with the contents of general register

rs in a bit-wise logical OR operation. The result is placed into general register rt.

Operation:

32   T:  GPR[rt] ← GPR[rs][31..16] || (immediate or GPR[rs][15..0])

64   T:  GPR[rt] ← GPR[rs][63..16] || (immediate or GPR[rs][15..0])

Exceptions:

None
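
As a small illustration of the zero-extension rule, ORI in 64-bit mode can be sketched in C as follows. The function name is an assumption made for this example. Because the immediate is zero-extended, an immediate with bit 15 set does not disturb bits 63..16 of rs.

```c
#include <stdint.h>

/* ORI: zero-extend the 16-bit immediate, then OR it into rs.
 * Bits 63..16 of rs pass through unchanged. */
static uint64_t ori64(uint64_t rs, uint16_t immediate)
{
    return rs | (uint64_t)immediate;
}
```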


PREF Prefetch PREF

Bits:   31-26          25-21  20-16  15-0
Field:  PREF (110011)  base   hint   offset

Format :

PREF hint, offset (base)

Description :

PREF adds the 16-bit signed offset to the contents of GPR base to form an effective byte

address. It advises that data at the effective address may be used in the near future.

If the hint field is 00000 (binary), this instruction prefetches a block of data from main memory

into cache.

PREF is an advisory instruction. It may change the performance of the program. For all

hint values and all effective addresses, it neither changes architecturally-visible state nor

alters the meaning of the program.

PREF does not cause addressing-related exceptions. If it raises an exception condition, the
exception condition is ignored. If an addressing-related exception condition is raised and
ignored, no data will be prefetched. Even if no data is prefetched in such a case, some action
that is not architecturally visible, such as writeback of a dirty cache line, might take place.

PREF will never generate a memory operation for a location with an uncached memory

access type.

The defined hint values are shown in the table below. The TX49 only supports hint = 0.

The hint table may be extended in future implementations.

hint field:

Value   Name       Data use and desired prefetch action
0       Load       Data is expected to be loaded (not modified). Fetch data as if for a load.
1-31    Reserved   Reserved


Programming Notes:

Prefetch cannot prefetch data from a mapped location unless the translation for that

location is present in the TLB. Locations in memory pages that have not been accessed

recently may not have translations in the TLB, so prefetch may not be effective for such

locations.

Prefetch does not cause addressing exceptions. Prefetching through an address pointer before
the validity of the pointer has been determined therefore does not cause an exception.

Operation :

32, 64   T:  vAddr ← GPR[base] + sign_extend (offset)
             (pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
             Prefetch (uncached, pAddr, vAddr, DATA, hint)

Exception :

None


SB Store Byte SB

Bits:   31-26        25-21  20-16  15-0
Field:  SB (101000)  base   rt     offset

Format:

SB rt, offset (base)

Description:

The 16-bit offset is sign-extended and added to the contents of general register base to form

a virtual address. The least-significant byte of register rt is stored at the effective address.

Operation:

32   T:  vAddr ← ((offset[15])^16 || offset[15..0]) + GPR[base]
         (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
         pAddr ← pAddr[PSIZE-1..3] || (pAddr[2..0] xor ReverseEndian^3)
         byte ← vAddr[2..0] xor BigEndianCPU^3
         data ← GPR[rt][63-8*byte..0] || 0^(8*byte)
         StoreMemory (uncached, BYTE, data, pAddr, vAddr, DATA)

64   T:  vAddr ← ((offset[15])^48 || offset[15..0]) + GPR[base]
         (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
         pAddr ← pAddr[PSIZE-1..3] || (pAddr[2..0] xor ReverseEndian^3)
         byte ← vAddr[2..0] xor BigEndianCPU^3
         data ← GPR[rt][63-8*byte..0] || 0^(8*byte)
         StoreMemory (uncached, BYTE, data, pAddr, vAddr, DATA)

Exceptions:

TLB refill exception

TLB invalid exception

TLB modification exception

Bus error exception

Address error exception
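
The address formation and byte-lane selection in the SB Operation can be sketched in C. This is a simplified model under stated assumptions: the function names are hypothetical, address translation is omitted, and the data bus is taken to be 64 bits wide as in the pseudocode.

```c
#include <stdint.h>

/* vAddr = sign_extend(offset) + GPR[base] (64-bit mode). */
static uint64_t effective_address(uint64_t base, int16_t offset)
{
    return base + (uint64_t)(int64_t)offset;
}

/* data = GPR[rt][63-8*byte..0] || 0^(8*byte):
 * shift the register left so its low byte lands in the selected lane. */
static uint64_t sb_bus_data(uint64_t rt, uint64_t vaddr, int big_endian_cpu)
{
    /* byte = vAddr[2..0] xor BigEndianCPU^3 */
    unsigned byte = (unsigned)(vaddr & 7u) ^ (big_endian_cpu ? 7u : 0u);
    return rt << (8u * byte);
}
```

With a little-endian CPU and vAddr ending in 3, the byte is placed in lane 3 (a shift of 24 bits); with a big-endian CPU the same address selects lane 4.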


SC Store Conditional SC

Bits:   31-26        25-21  20-16  15-0
Field:  SC (111000)  base   rt     offset

Format:

SC rt, offset (base)

Description:

The 16-bit offset is sign-extended and added to the contents of general register base to form

a virtual address. The contents of general register rt are conditionally stored at the memory

location specified by the effective address.

If an ERET instruction occurs between the Load Linked instruction and this store

instruction, the store fails and is inhibited from taking place.

The success or failure of the store operation (as defined above) is indicated by the contents

of general register rt after execution of the instruction. A successful store sets the contents of
general register rt to 1; an unsuccessful store sets it to 0.

The operation of Store Conditional is undefined when the address is different from the

address used in the last Load Linked.

This instruction is available in User mode; it is not necessary for CP0 to be enabled.

If either of the two least-significant bits of the effective address is non-zero, an address

error exception takes place.


If this instruction should both fail and take an exception, the exception takes precedence.

Operation:

32   T:  vAddr ← ((offset[15])^16 || offset[15..0]) + GPR[base]
         (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
         pAddr ← pAddr[PSIZE-1..3] || (pAddr[2..0] xor (ReverseEndian || 0^2))
         data ← GPR[rt][63-8*byte..0] || 0^(8*byte)
         if LLbit then
             StoreMemory (uncached, WORD, data, pAddr, vAddr, DATA)
         endif
         GPR[rt] ← 0^31 || LLbit

64   T:  vAddr ← ((offset[15])^48 || offset[15..0]) + GPR[base]
         (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
         pAddr ← pAddr[PSIZE-1..3] || (pAddr[2..0] xor (ReverseEndian || 0^2))
         data ← GPR[rt][63-8*byte..0] || 0^(8*byte)
         if LLbit then
             StoreMemory (uncached, WORD, data, pAddr, vAddr, DATA)
         endif
         GPR[rt] ← 0^63 || LLbit

Exceptions:

TLB refill exception

TLB invalid exception

TLB modification exception

Bus error exception

Address error exception
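
The interaction between Load Linked, ERET and Store Conditional can be illustrated with a toy software model. Everything here (cpu_model_t, ll, sc, eret, atomic_increment) is a hypothetical construction for illustration: it models only the LLbit behaviour stated in the description above, on a single word, and ignores address translation, alignment and other processors.

```c
#include <stdint.h>

/* Hypothetical single-CPU model: one link bit and a pointer to memory. */
typedef struct {
    int       llbit;  /* set by LL, cleared by ERET */
    uint32_t *mem;    /* word-indexed memory */
} cpu_model_t;

/* Load Linked: read the word and set LLbit. */
static uint32_t ll(cpu_model_t *cpu, uint32_t index)
{
    cpu->llbit = 1;
    return cpu->mem[index];
}

/* ERET between LL and SC makes the following SC fail. */
static void eret(cpu_model_t *cpu)
{
    cpu->llbit = 0;
}

/* Store Conditional: store only if LLbit is set; the return value
 * is what rt receives (1 on success, 0 on failure). */
static uint32_t sc(cpu_model_t *cpu, uint32_t index, uint32_t value)
{
    if (cpu->llbit)
        cpu->mem[index] = value;
    return (uint32_t)cpu->llbit;
}

/* Typical retry loop: repeat LL/SC until the store succeeds. */
static uint32_t atomic_increment(cpu_model_t *cpu, uint32_t index)
{
    uint32_t old;
    do {
        old = ll(cpu, index);
    } while (!sc(cpu, index, old + 1));
    return old;
}
```

In this model, an SC issued after eret() returns 0 and leaves memory unchanged, which is why software wraps the LL/SC pair in a retry loop.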


SCD Store Conditional

Doubleword SCD

Bits:   31-26         25-21  20-16  15-0
Field:  SCD (111100)  base   rt     offset

Format:

SCD rt, offset (base)

Description:

The 16-bit offset is sign-extended and added to the contents of general register base to form

a virtual address. The contents of general register rt are conditionally stored at the memory

location specified by the effective address.

If an ERET instruction occurs between the Load Linked Doubleword instruction and this

store instruction, the store fails and is inhibited from taking place.

The success or failure of the store operation (as defined above) is indicated by the contents

of general register rt after execution of the instruction. A successful store sets the contents of
general register rt to 1; an unsuccessful store sets it to 0.

The operation of Store Conditional Doubleword is undefined when the address is different

from the address used in the last Load Linked Doubleword.

This instruction is available in User mode; it is not necessary for CP0 to be enabled.

If any of the three least-significant bits of the effective address is non-zero, an address

error exception takes place.

If this instruction should both fail and take an exception, the exception takes precedence.

Operation:

64   T:  vAddr ← ((offset[15])^48 || offset[15..0]) + GPR[base]
         (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
         data ← GPR[rt]
         if LLbit then
             StoreMemory (uncached, DOUBLEWORD, data, pAddr, vAddr, DATA)
         endif
         GPR[rt] ← 0^63 || LLbit

Note: The operation is the same in 32-bit kernel mode.


Exceptions:

TLB refill exception

TLB invalid exception

TLB modification exception

Bus error exception

Address error exception

Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)


SD Store Doubleword SD

Bits:   31-26        25-21  20-16  15-0
Field:  SD (111111)  base   rt     offset

Format:

SD rt, offset (base)

Description:

The 16-bit offset is sign-extended and added to the contents of general register base to form

a virtual address. The contents of general register rt are stored at the memory location

specified by the effective address.

If any of the three least-significant bits of the effective address is non-zero, an address

error exception occurs.

Operation:

64   T:  vAddr ← ((offset[15])^48 || offset[15..0]) + GPR[base]
         (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
         data ← GPR[rt]
         StoreMemory (uncached, DOUBLEWORD, data, pAddr, vAddr, DATA)

Note: In 32-bit kernel mode the operation is the same, and the upper 32 bits are ignored
when the virtual address is formed.

Exceptions:

TLB refill exception

TLB invalid exception

TLB modification exception

Bus error exception

Address error exception

Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)
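
The alignment rule that leads to an address error exception can be expressed as a one-line check. The helper below is a hypothetical sketch, assuming the access size is a power of two (2 for SH, 4 for SW and SC, 8 for SD and SCD).

```c
#include <stdint.h>

/* Returns 1 if a naturally aligned access of size_bytes at vaddr
 * would raise an address error, 0 otherwise.
 * Assumption: size_bytes is a power of two. */
static int is_misaligned(uint64_t vaddr, unsigned size_bytes)
{
    return (vaddr & (uint64_t)(size_bytes - 1u)) != 0;
}
```

For SD this tests the three least-significant address bits; for SW, the two least-significant bits.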


SDBBP Software Debug Breakpoint SDBBP

Bits:   31-26             25-6   5-0
Field:  SPECIAL (000000)  code   SDBBP (001110)

Format:

SDBBP code

Description:

Raises a Debug Breakpoint exception, passing control to an exception handler. The code
field can be used for passing information to the exception handler, but the only way for the
exception handler to retrieve the code field is to load the contents of the memory word
containing this instruction, using the DEPC register.

Operation:

32, 64   T:  SoftwareDebugBreakpointException

Exception:

Debug Breakpoint exception


SDCz Store Doubleword From

Coprocessor z SDCz

Bits:   31-26           25-21  20-16  15-0
Field:  SDCz (1111xx*)  base   rt     offset

Format:

SDCz rt, offset (base)

Description:

The 16-bit offset is sign-extended and added to the contents of general register base to form
a virtual address. Coprocessor unit z sources a doubleword, which the processor writes to the

addressed memory location. The data to be stored is defined by individual coprocessor

specifications.

If any of the three least-significant bits of the effective address are non-zero, an address

error exception takes place.

This instruction is not valid for use with CP0.

This instruction is undefined when the least-significant bit of the rt-field is non-zero.

*See the table, “Opcode Bit Encoding” on next page, or “CPU Instruction Opcode Bit

Encoding” at the end of Appendix A.


Operation:

32   T:  vAddr ← ((offset[15])^16 || offset[15..0]) + GPR[base]
         (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
         data ← COPzSD (rt)
         StoreMemory (uncached, DOUBLEWORD, data, pAddr, vAddr, DATA)

64   T:  vAddr ← ((offset[15])^48 || offset[15..0]) + GPR[base]
         (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
         data ← COPzSD (rt)
         StoreMemory (uncached, DOUBLEWORD, data, pAddr, vAddr, DATA)

Exceptions:

TLB refill exception

TLB invalid exception

TLB modification exception

Bus error exception

Address error exception

Coprocessor unusable exception

Opcode Bit Encoding:

SDCz   Bits 31-28 (SD opcode)   Bits 27-26 (coprocessor unit number)
SDC1   1111                     01
SDC2   1111                     10


SDL Store Doubleword Left SDL

Bits:   31-26         25-21  20-16  15-0
Field:  SDL (101100)  base   rt     offset

Format:

SDL rt, offset (base)

Description:

This instruction can be used with the SDR instruction to store the contents of a register
into eight consecutive bytes of memory, when the bytes cross a boundary between two
doublewords. SDL stores the left portion of the register into the appropriate part of the
high-order doubleword of memory; SDR stores the right portion of the register into the
appropriate part of the low-order doubleword.

The SDL instruction adds its sign-extended 16-bit offset to the contents of general register
base to form a virtual address which may specify an arbitrary byte. It alters only the
doubleword in memory which contains that byte. From one to eight bytes will be stored,
depending on the starting byte specified.

Conceptually, it starts at the most-significant byte of the register and copies it to the
specified byte in memory; then it proceeds toward the low-order byte of the register and the
low-order byte of the doubleword in memory, copying bytes from register to memory until it
reaches the low-order byte of the doubleword in memory.

No address exceptions due to alignment are possible.

SDL $24,1($0)

                    memory (big-endian)
          address 0          address 8                  $24
before    0 1 2 3 4 5 6 7    8 9 10 11 12 13 14 15      A B C D E F G H
after     0 A B C D E F G    8 9 10 11 12 13 14 15


This operation is only defined for the TX4300 operating in 64-bit mode and 32-bit kernel
mode.

Execution of this instruction in 32-bit user or supervisor mode causes a reserved

instruction exception.

Operation:

64   T:  vAddr ← ((offset[15])^48 || offset[15..0]) + GPR[base]
         (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
         pAddr ← pAddr[PSIZE-1..3] || (pAddr[2..0] xor ReverseEndian^3)
         if BigEndianMem = 0 then
             pAddr ← pAddr[31..3] || 0^3
         endif
         byte ← vAddr[2..0] xor BigEndianCPU^3
         data ← 0^(56-8*byte) || GPR[rt][63..56-8*byte]
         StoreMemory (uncached, byte, data, pAddr, vAddr, DATA)

Note: In 32-bit kernel mode the operation is the same, and the upper 32 bits are ignored
when the virtual address is formed.


Given a doubleword in a register and a doubleword in memory, the operation of SDL is as
follows:

Register:  A B C D E F G H
Memory:    I J K L M N O P

              BigEndianCPU = 0                   BigEndianCPU = 1
                                  offset                             offset
vAddr[2..0]   destination  type   LEM  BEM       destination  type   LEM  BEM
0             IJKLMNOA     0      0    7         ABCDEFGH     7      0    0
1             IJKLMNAB     1      0    6         IABCDEFG     6      0    1
2             IJKLMABC     2      0    5         IJABCDEF     5      0    2
3             IJKLABCD     3      0    4         IJKABCDE     4      0    3
4             IJKABCDE     4      0    3         IJKLABCD     3      0    4
5             IJABCDEF     5      0    2         IJKLMABC     2      0    5
6             IABCDEFG     6      0    1         IJKLMNAB     1      0    6
7             ABCDEFGH     7      0    0         IJKLMNOA     0      0    7

LEM      BigEndianMem = 0
BEM      BigEndianMem = 1
Type     Access Type (see Figure 2-2) sent to memory
Offset   pAddr[2..0] sent to memory

Exceptions:

TLB refill exception

TLB invalid exception

TLB modification exception

Bus error exception

Address error exception

Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)


SDR Store Doubleword Right SDR

Bits:   31-26         25-21  20-16  15-0
Field:  SDR (101101)  base   rt     offset

Format:

SDR rt, offset (base)

Description:

This instruction can be used with the SDL instruction to store the contents of a register
into eight consecutive bytes of memory, when the bytes cross a boundary between two
doublewords. SDR stores the right portion of the register into the appropriate part of the
low-order doubleword; SDL stores the left portion of the register into the appropriate part of
the high-order doubleword of memory.

The SDR instruction adds its sign-extended 16-bit offset to the contents of general register
base to form a virtual address which may specify an arbitrary byte. It alters only the
doubleword in memory which contains that byte. From one to eight bytes will be stored,
depending on the starting byte specified.

Conceptually, it starts at the least-significant (rightmost) byte of the register and copies it
to the specified byte in memory; then it proceeds toward the high-order byte of the register
and the high-order byte of the doubleword in memory, copying bytes from register to memory
until it reaches the high-order byte of the doubleword in memory.

No address exceptions due to alignment are possible.

SDR $24,4($0)

                    memory (big-endian)
          address 0          address 8                  $24
before    0 1 2 3 4 5 6 7    8 9 10 11 12 13 14 15      A B C D E F G H
after     D E F G H 5 6 7    8 9 10 11 12 13 14 15


This operation is only defined for the TX4300 operating in 64-bit mode and 32-bit kernel
mode.

Execution of this instruction in 32-bit user or supervisor mode causes a reserved
instruction exception.

Operation:

64   T:  vAddr ← ((offset[15])^48 || offset[15..0]) + GPR[base]
         (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
         pAddr ← pAddr[PSIZE-1..3] || (pAddr[2..0] xor ReverseEndian^3)
         if BigEndianMem = 0 then
             pAddr ← pAddr[PSIZE-1..3] || 0^3
         endif
         byte ← vAddr[2..0] xor BigEndianCPU^3
         data ← GPR[rt][63-8*byte..0] || 0^(8*byte)
         StoreMemory (uncached, DOUBLEWORD-byte, data, pAddr, vAddr, DATA)

Note: In 32-bit kernel mode the operation is the same, and the upper 32 bits are ignored
when the virtual address is formed.


Given a doubleword in a register and a doubleword in memory, the operation of SDR is as
follows:

Register:  A B C D E F G H
Memory:    I J K L M N O P

              BigEndianCPU = 0                   BigEndianCPU = 1
                                  offset                             offset
vAddr[2..0]   destination  type   LEM  BEM       destination  type   LEM  BEM
0             ABCDEFGH     7      0    0         HJKLMNOP     0      7    0
1             BCDEFGHP     6      1    0         GHKLMNOP     1      6    0
2             CDEFGHOP     5      2    0         FGHLMNOP     2      5    0
3             DEFGHNOP     4      3    0         EFGHMNOP     3      4    0
4             EFGHMNOP     3      4    0         DEFGHNOP     4      3    0
5             FGHLMNOP     2      5    0         CDEFGHOP     5      2    0
6             GHKLMNOP     1      6    0         BCDEFGHP     6      1    0
7             HJKLMNOP     0      7    0         ABCDEFGH     7      0    0

LEM      BigEndianMem = 0
BEM      BigEndianMem = 1
Type     Access Type (see Figure 2-2) sent to memory
Offset   pAddr[2..0] sent to memory

Exceptions:

TLB refill exception

TLB invalid exception

TLB modification exception

Bus error exception

Address error exception

Reserved Instruction exception (in 32-bit user or 32-bit supervisor mode)
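
The way SDL and SDR cooperate to store an unaligned doubleword can be sketched with a byte-array model. This is an illustrative sketch for the big-endian case only (BigEndianCPU = 1, BigEndianMem = 1); the function names and the flat mem array are assumptions, and address translation and exceptions are not modeled.

```c
#include <stdint.h>

/* SDL, big-endian: copy register bytes, most-significant first,
 * from vaddr up to the end of the aligned doubleword. */
static void sdl_be(uint8_t *mem, uint64_t vaddr, uint64_t rt)
{
    unsigned b = (unsigned)(vaddr & 7u);
    for (unsigned i = 0; b + i < 8; i++)
        mem[vaddr + i] = (uint8_t)(rt >> (56 - 8 * i));
}

/* SDR, big-endian: copy register bytes, least-significant first,
 * from vaddr down to the start of the aligned doubleword. */
static void sdr_be(uint8_t *mem, uint64_t vaddr, uint64_t rt)
{
    unsigned b = (unsigned)(vaddr & 7u);
    for (unsigned i = 0; i <= b; i++)
        mem[vaddr - i] = (uint8_t)(rt >> (8 * i));
}

/* Unaligned store: SDL rt,0(addr) followed by SDR rt,7(addr). */
static void store_unaligned_be(uint8_t *mem, uint64_t vaddr, uint64_t rt)
{
    sdl_be(mem, vaddr, rt);
    sdr_be(mem, vaddr + 7, rt);
}
```

With register bytes A to H and vAddr[2..0] = 2, sdl_be writes A to F into bytes 2..7 and sdr_be writes F, G, H into bytes 0..2, matching the BigEndianCPU = 1 rows of the tables above.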


SH Store Halfword SH

Bits:   31-26        25-21  20-16  15-0
Field:  SH (101001)  base   rt     offset

Format:

SH rt, offset (base)

Description:

The 16-bit offset is sign-extended and added to the contents of general register base to form

an unsigned effective address. The least-significant halfword of register rt is stored at the

effective address. If the least-significant bit of the effective address is non-zero, an address

error exception occurs.

Operation:

32   T:  vAddr ← ((offset[15])^16 || offset[15..0]) + GPR[base]
         (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
         pAddr ← pAddr[PSIZE-1..3] || (pAddr[2..0] xor (ReverseEndian^2 || 0))
         byte ← vAddr[2..0] xor (BigEndianCPU^2 || 0)
         data ← GPR[rt][63-8*byte..0] || 0^(8*byte)
         StoreMemory (uncached, HALFWORD, data, pAddr, vAddr, DATA)

64   T:  vAddr ← ((offset[15])^48 || offset[15..0]) + GPR[base]
         (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
         pAddr ← pAddr[PSIZE-1..3] || (pAddr[2..0] xor (ReverseEndian^2 || 0))
         byte ← vAddr[2..0] xor (BigEndianCPU^2 || 0)
         data ← GPR[rt][63-8*byte..0] || 0^(8*byte)
         StoreMemory (uncached, HALFWORD, data, pAddr, vAddr, DATA)

Exceptions:

TLB refill exception

TLB invalid exception

TLB modification exception

Bus error exception

Address error exception


SLL Shift Left Logical SLL

Bits:   31-26             25-21  20-16  15-11  10-6  5-0
Field:  SPECIAL (000000)  00000  rt     rd     sa    SLL (000000)

Format:

SLL rd, rt, sa

Description:

The contents of general register rt are shifted left by sa bits, inserting zeros into the low-

order bits. The result is placed in register rd. In 64-bit mode, the 32-bit result is sign

extended when placed in the destination register. It is sign-extended for all shift amounts,

including zero; SLL with a zero shift amount truncates a 64-bit value to 32-bits and sign

extends this 32-bit value. SLL, unlike nearly all other word operations, does not require the
operand to be a properly sign-extended word value to produce a valid sign-extended word

result.

Note: SLL with a shift amount of zero may be treated as a NOP by some assemblers at

some optimization levels. If using SLL with zero shift to truncate 64-bit values, check the

assembler being used.

Operation:

32   T:  GPR[rd] ← GPR[rt][31-sa..0] || 0^sa

64   T:  s ← 0 || sa
         temp ← GPR[rt][31-s..0] || 0^s
         GPR[rd] ← (temp[31])^32 || temp

Exceptions:

None
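
The 64-bit behaviour of SLL, including the truncate-and-sign-extend effect of a zero shift amount, can be sketched in C. The function names are assumptions for illustration; the logic follows the Operation pseudocode above.

```c
#include <stdint.h>

/* Sign-extend a 32-bit word to 64 bits: (w[31])^32 || w[31..0]. */
static uint64_t sext32(uint32_t w)
{
    return (w & 0x80000000u) ? (0xFFFFFFFF00000000ull | w) : (uint64_t)w;
}

/* 64-bit mode SLL: shift the low word left, then sign-extend bit 31.
 * The upper 32 bits of rt are discarded for every shift amount. */
static uint64_t sll64(uint64_t rt, unsigned sa)
{
    uint32_t temp = (uint32_t)rt << (sa & 31u);
    return sext32(temp);
}
```

Calling sll64 with sa = 0 on 0x123456789ABCDEF0 yields 0xFFFFFFFF9ABCDEF0: the upper word is dropped and bit 31 of the low word is replicated upward.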


SLLV Shift Left Logical

Variable SLLV

Bits:   31-26             25-21  20-16  15-11  10-6   5-0
Field:  SPECIAL (000000)  rs     rt     rd     00000  SLLV (000100)

Format:

SLLV rd, rt, rs

Description:

The contents of general register rt are shifted left by the number of bits specified by the

low-order five bits of general register rs, inserting zeros into the low-order bits. The
result is placed in register rd. In 64-bit mode, the 32-bit result is sign

extended when placed in the destination register. It is sign-extended for all shift amounts,

including zero; SLLV with a zero shift amount truncates a 64-bit value to 32-bits and sign

extends this 32-bit value. SLLV, unlike nearly all other word operations, does not require

the operand to be a properly sign-extended word value to produce a valid sign-extended word

result.

Note: SLLV with a shift amount of zero may be treated as a NOP by some assemblers at

some optimization levels. If using SLLV with zero shift to truncate 64-bit values, check the

assembler being used.

Operation :

32   T:  s ← GPR[rs][4..0]
         GPR[rd] ← GPR[rt][(31-s)..0] || 0^s

64   T:  s ← 0 || GPR[rs][4..0]
         temp ← GPR[rt][(31-s)..0] || 0^s
         GPR[rd] ← (temp[31])^32 || temp

Exceptions:

None


SLT Set On Less Than SLT

Bits:   31-26             25-21  20-16  15-11  10-6   5-0
Field:  SPECIAL (000000)  rs     rt     rd     00000  SLT (101010)

Format:

SLT rd, rs, rt

Description:

The contents of general register rt are subtracted from the contents of general register rs.

Considering both quantities as signed integers, if the contents of general register rs are less

than the contents of general register rt, the result is set to one, otherwise the result is set to

zero.

The result is placed into general register rd.

No integer overflow exception occurs under any circumstances. The comparison is valid

even if the subtraction used during the comparison overflows.

Operation:

32   T:  if GPR[rs] < GPR[rt] then
             GPR[rd] ← 0^31 || 1
         else
             GPR[rd] ← 0^32
         endif

64   T:  if GPR[rs] < GPR[rt] then
             GPR[rd] ← 0^63 || 1
         else
             GPR[rd] ← 0^64
         endif

Exceptions:

None
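
The statement that the comparison stays valid even when the subtraction would overflow can be illustrated in C. The sketch below is one possible way to model it, not the hardware's method: it compares without subtracting, using the common trick of inverting the sign bit so that a signed order becomes an unsigned order. The function names are assumptions; sltu64 is included only for contrast.

```c
#include <stdint.h>

#define SIGN_BIT 0x8000000000000000ull

/* SLT (64-bit): signed less-than on register bit patterns.
 * Flipping the sign bit maps signed order onto unsigned order,
 * so no subtraction (and no overflow) is involved. */
static uint64_t slt64(uint64_t rs, uint64_t rt)
{
    return ((rs ^ SIGN_BIT) < (rt ^ SIGN_BIT)) ? 1u : 0u;
}

/* SLTU (64-bit): the same bit patterns compared as unsigned. */
static uint64_t sltu64(uint64_t rs, uint64_t rt)
{
    return (rs < rt) ? 1u : 0u;
}
```

With rs holding the most negative 64-bit value and rt holding 1, rs minus rt overflows, yet slt64 still returns 1; sltu64 returns 0 for the same bit patterns.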


SLTI Set On Less Than

Immediate SLTI

Bits:   31-26          25-21  20-16  15-0
Field:  SLTI (001010)  rs     rt     immediate

Format:

SLTI rt, rs, immediate

Description:

The 16-bit immediate is sign-extended and subtracted from the contents of general register

rs. Considering both quantities as signed integers, if rs is less than the sign-extended

immediate, the result is set to one, otherwise the result is set to zero. The result is placed

into general register rt.

No integer overflow exception occurs under any circumstances. The comparison is valid

even if the subtraction used during the comparison overflows.

Operation:

32   T:  if GPR[rs] < ((immediate[15])^16 || immediate[15..0]) then
             GPR[rt] ← 0^31 || 1
         else
             GPR[rt] ← 0^32
         endif

64   T:  if GPR[rs] < ((immediate[15])^48 || immediate[15..0]) then
             GPR[rt] ← 0^63 || 1
         else
             GPR[rt] ← 0^64
         endif

Exceptions:

None


SLTIU Set On Less Than

Immediate Unsigned SLTIU

Bits:   31-26           25-21  20-16  15-0
Field:  SLTIU (001011)  rs     rt     immediate

Format:

SLTIU rt, rs, immediate

Description:

The 16-bit immediate is sign-extended and subtracted from the contents of general register

rs. Considering both quantities as unsigned integers, if rs is less than the sign-extended

immediate, the result is set to one, otherwise the result is set to zero. The result is placed

into general register rt.

No integer overflow exception occurs under any circumstances. The comparison is valid

even if the subtraction used during the comparison overflows.

Operation:

32 T: if (0 GPR[rs]) < (immediate15)16 immediate15∼0 then

GPR[rt] ← 031 1

else

GPR[rt] ← 032

endif

64 T: if (0 GPR[rs]) < (immediate15)48 immediate15∼0 then

GPR[rt] ← 063 1

else

GPR[rt] ← 064

endif

Exceptions:

None
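
A point that is easy to miss in SLTIU is that the immediate is sign-extended first and only then compared as an unsigned quantity. A minimal C sketch for 64-bit mode follows; the function name is an assumption.

```c
#include <stdint.h>

/* SLTIU (64-bit): sign-extend the immediate, then compare unsigned. */
static uint64_t sltiu64(uint64_t rs, int16_t immediate)
{
    /* Converting a negative value to uint64_t wraps modulo 2^64,
     * which is exactly the sign-extended bit pattern. */
    uint64_t imm = (uint64_t)(int64_t)immediate;
    return (rs < imm) ? 1u : 0u;
}
```

An immediate of 0xFFFF (that is, -1) therefore becomes the largest unsigned 64-bit value, so the result is 1 for every rs except 0xFFFFFFFFFFFFFFFF.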


SLTU Set On Less Than Unsigned SLTU

Bits:   31-26             25-21  20-16  15-11  10-6   5-0
Field:  SPECIAL (000000)  rs     rt     rd     00000  SLTU (101011)

Format:

SLTU rd, rs, rt

Description:

The contents of general register rt are subtracted from the contents of general register rs.

Considering both quantities as unsigned integers, if the contents of general register rs are

less than the contents of general register rt, the result is set to one, otherwise the result is

set to zero.

The result is placed into general register rd.

No integer overflow exception occurs under any circumstances. The comparison is valid

even if the subtraction used during the comparison overflows.

Operation:

32 T: if (0 GPR[rs]) < 0 GPR[rt ] th e n

GPR[rd] ← 031 1

else

GPR[rd] ← 032

endif

64 T: if (0 GPR[rs]) < 0 GPR[rt ] th e n

GPR[rd] ← 063 1

else

GPR[rd] ← 064

endif

Exceptions:

None


SRA Shift Right Arithmetic SRA

Bits:   31-26             25-21  20-16  15-11  10-6  5-0
Field:  SPECIAL (000000)  00000  rt     rd     sa    SRA (000011)

Format:

SRA rd, rt, sa

Description:

The contents of general register rt are shifted right by sa bits, sign-extending the high-

order bits. The result is placed in register rd. In 64-bit mode, the operand must be a valid

sign-extended, 32-bit value.

Operation :

32   T:  GPR[rd] ← (GPR[rt][31])^sa || GPR[rt][31..sa]

64   T:  s ← 0 || sa
         temp ← (GPR[rt][31])^s || GPR[rt][31..s]
         GPR[rd] ← (temp[31])^32 || temp

Exceptions:

None
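
The 64-bit SRA operation can be modeled in C without relying on the implementation-defined behaviour of right-shifting negative signed values. The function names are illustrative assumptions; the fill mask reproduces the (GPR[rt][31])^s term of the pseudocode.

```c
#include <stdint.h>

/* Sign-extend a 32-bit word to 64 bits: (w[31])^32 || w[31..0]. */
static uint64_t sext32(uint32_t w)
{
    return (w & 0x80000000u) ? (0xFFFFFFFF00000000ull | w) : (uint64_t)w;
}

/* 64-bit mode SRA: arithmetic right shift of the low word. */
static uint64_t sra64(uint64_t rt, unsigned sa)
{
    uint32_t w = (uint32_t)rt;
    uint32_t s = sa & 31u;
    /* s copies of bit 31 in the vacated high-order positions */
    uint32_t fill = ((w & 0x80000000u) && s) ? (~0u << (32u - s)) : 0u;
    uint32_t temp = (w >> s) | fill;
    return sext32(temp);
}
```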


SRAV Shift Right Arithmetic

Variable SRAV

Bits:   31-26             25-21  20-16  15-11  10-6   5-0
Field:  SPECIAL (000000)  rs     rt     rd     00000  SRAV (000111)

Format:

SRAV rd, rt, rs

Description:

The contents of general register rt are shifted right by the number of bits specified by the

low-order five bits of general register rs, sign-extending the high-order bits. The result is

placed in register rd. In 64-bit mode, the operand must be a valid sign-extended, 32-bit value.

Operation:

32   T:  s ← GPR[rs][4..0]
         GPR[rd] ← (GPR[rt][31])^s || GPR[rt][31..s]

64   T:  s ← GPR[rs][4..0]
         temp ← (GPR[rt][31])^s || GPR[rt][31..s]
         GPR[rd] ← (temp[31])^32 || temp

Exceptions:

None


SRL Shift Right Logical SRL

Bits:   31-26             25-21  20-16  15-11  10-6  5-0
Field:  SPECIAL (000000)  00000  rt     rd     sa    SRL (000010)

Format:

SRL rd, rt, sa

Description:

The contents of general register rt are shifted right by sa bits, inserting zeros into the
high-order bits. The result is placed in register rd. In 64-bit mode, the operand must be a
valid sign-extended, 32-bit value.

Operation:

32   T:  GPR[rd] ← 0^sa || GPR[rt][31..sa]

64   T:  s ← 0 || sa
         temp ← 0^s || GPR[rt][31..s]
         GPR[rd] ← (temp[31])^32 || temp

Exceptions:

None
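
For comparison with the arithmetic shift, the 64-bit SRL operation can be sketched as follows. The function names are assumptions. Note that the 32-bit result is still sign-extended into the destination, so a shift amount of zero leaves bit 31 set and the upper word becomes all ones.

```c
#include <stdint.h>

/* Sign-extend a 32-bit word to 64 bits: (w[31])^32 || w[31..0]. */
static uint64_t sext32(uint32_t w)
{
    return (w & 0x80000000u) ? (0xFFFFFFFF00000000ull | w) : (uint64_t)w;
}

/* 64-bit mode SRL: logical right shift of the low word,
 * then sign-extension of the 32-bit result. */
static uint64_t srl64(uint64_t rt, unsigned sa)
{
    uint32_t temp = (uint32_t)rt >> (sa & 31u);
    return sext32(temp);
}
```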


SRLV Shift Right Logical Variable SRLV

Bits:   31-26             25-21  20-16  15-11  10-6   5-0
Field:  SPECIAL (000000)  rs     rt     rd     00000  SRLV (000110)

Format:

SRLV rd, rt, rs

Description:

The contents of general register rt are shifted right by the number of bits specified by the

low-order five bits of general register rs, inserting zeros into the high-order bits. The result

is placed in register rd. In 64-bit mode, the operand must be a valid sign-extended, 32-bit

value.

Operation:

32   T:  s ← GPR[rs][4..0]
         GPR[rd] ← 0^s || GPR[rt][31..s]

64   T:  s ← GPR[rs][4..0]
         temp ← 0^s || GPR[rt][31..s]
         GPR[rd] ← (temp[31])^32 || temp

Exceptions:

None


SUB Subtract SUB

Bits:   31-26             25-21  20-16  15-11  10-6   5-0
Field:  SPECIAL (000000)  rs     rt     rd     00000  SUB (100010)

Format:

SUB rd, rs, rt

Description:

The contents of general register rt are subtracted from the contents of general register rs to

form a result. The result is placed into general register rd. In 64-bit mode, the operands

must be valid sign-extended, 32-bit values.

The only difference between this instruction and the SUBU instruction is that SUBU never

traps on overflow.

An integer overflow exception takes place if the carries out of bits 30 and 31 differ (2’s-

complement overflow). The destination register rd is not modified when an integer overflow

exception occurs.

Operation:

32   T:  GPR[rd] ← GPR[rs] − GPR[rt]

64   T:  temp ← GPR[rs] − GPR[rt]
         GPR[rd] ← (temp[31])^32 || temp[31..0]

Exceptions:

Integer overflow exception
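
The overflow condition for SUB can be checked in software with a sign-bit test that is equivalent to comparing the carries out of bits 30 and 31. The sketch below is a hypothetical 32-bit model: sub32 and its return convention are assumptions, and a return value of 1 stands in for the integer overflow exception.

```c
#include <stdint.h>

/* 32-bit SUB model. Returns 1 (overflow, *rd left unmodified) or
 * 0 (success, *rd receives rs - rt). */
static int sub32(uint32_t rs, uint32_t rt, uint32_t *rd)
{
    uint32_t diff = rs - rt;  /* wraps modulo 2^32 */

    /* Overflow occurs when the operands have different signs and
     * the sign of the result differs from the sign of rs. */
    if (((rs ^ rt) & (rs ^ diff)) & 0x80000000u)
        return 1;

    *rd = diff;
    return 0;
}
```

SUBU corresponds to the same subtraction with the overflow test removed.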


SUBU Subtract Unsigned SUBU

Bits:   31-26             25-21  20-16  15-11  10-6   5-0
Field:  SPECIAL (000000)  rs     rt     rd     00000  SUBU (100011)

Format:

SUBU rd, rs, rt

Description:

The contents of general register rt are subtracted from the contents of general register rs to
form a result. The result is placed into general register rd. In 64-bit mode, the operands
must be valid sign-extended, 32-bit values.

The only difference between this instruction and the SUB instruction is that SUBU never

traps on overflow. No integer overflow exception occurs under any circumstances.

Operation:

32   T:  GPR[rd] ← GPR[rs] − GPR[rt]

64   T:  temp ← GPR[rs] − GPR[rt]
         GPR[rd] ← (temp[31])^32 || temp[31..0]

Exceptions:

None


SW Store Word SW

Bits:   31-26        25-21  20-16  15-0
Field:  SW (101011)  base   rt     offset

Format:

SW rt, offset (base)

Description:

The 16-bit offset is sign-extended and added to the contents of general register base to form

a virtual address. The contents of general register rt are stored at the memory location

specified by the effective address.

If either of the two least-significant bits of the effective address is non-zero, an address

error exception occurs.

Operation:

32   T:  vAddr ← ((offset[15])^16 || offset[15..0]) + GPR[base]
         (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
         pAddr ← pAddr[PSIZE-1..3] || (pAddr[2..0] xor (ReverseEndian || 0^2))
         byte ← vAddr[2..0] xor (BigEndianCPU || 0^2)
         data ← GPR[rt][63-8*byte..0] || 0^(8*byte)
         StoreMemory (uncached, WORD, data, pAddr, vAddr, DATA)

64   T:  vAddr ← ((offset[15])^48 || offset[15..0]) + GPR[base]
         (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
         pAddr ← pAddr[PSIZE-1..3] || (pAddr[2..0] xor (ReverseEndian || 0^2))
         byte ← vAddr[2..0] xor (BigEndianCPU || 0^2)
         data ← GPR[rt][63-8*byte..0] || 0^(8*byte)
         StoreMemory (uncached, WORD, data, pAddr, vAddr, DATA)

Exceptions:

TLB refill exception

TLB invalid exception

TLB modification exception

Bus error exception

Address error exception


SWCz Store Word From

Coprocessor z SWCz

Bits:   31-26           25-21  20-16  15-0
Field:  SWCz (1110xx*)  base   rt     offset

Format:

SWCz rt, offset (base)

Description:

The 16-bit offset is sign-extended and added to the contents of general register base to form
a virtual address. Coprocessor unit z sources a word, which the processor writes to the
addressed memory location.

The data to be stored is defined by individual coprocessor specifications. This instruction

is not valid for use with CP0. If either of the two least-significant bits of the effective address

is non-zero, an address error exception occurs.

Execution of the instruction referencing coprocessor 3 causes a reserved instruction

exception, not a coprocessor unusable exception.

Operation:

32   T:  vAddr ← ((offset[15])^16 || offset[15..0]) + GPR[base]
         (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
         pAddr ← pAddr[PSIZE-1..3] || (pAddr[2..0] xor (ReverseEndian || 0^2))
         byte ← vAddr[2..0] xor (BigEndianCPU || 0^2)
         data ← COPzSW (byte, rt)
         StoreMemory (uncached, WORD, data, pAddr, vAddr, DATA)

64   T:  vAddr ← ((offset[15])^48 || offset[15..0]) + GPR[base]
         (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
         pAddr ← pAddr[PSIZE-1..3] || (pAddr[2..0] xor (ReverseEndian || 0^2))
         byte ← vAddr[2..0] xor (BigEndianCPU || 0^2)
         data ← COPzSW (byte, rt)
         StoreMemory (uncached, WORD, data, pAddr, vAddr, DATA)

*See the table “Opcode Bit Encoding” on next page, or “CPU Instruction Opcode Bit

Encoding” at the end of Appendix A.


Exceptions:

TLB refill exception

TLB invalid exception

TLB modification exception

Bus error exception

Address error exception

Coprocessor unusable exception

Opcode Bit Encoding:

SWCz   Bits 31-28 (SW opcode)   Bits 27-26 (coprocessor unit number)
SWC1   1110                     01
SWC2   1110                     10


SWL Store Word Left SWL

offset

SWL

101010 base rt

1516

2021

2631 0

55 16

Format:

SWL rt, offset (base)

Description:

This instruction can be used with the SWR instruction to store the contents of a register

into four consecutive bytes of memory, when the bytes cross a boundary between two words.

SWL stores the left portion of the register into the appropriate part of the high-order word of

memory; SWR stores the right portion of the register into the appropriate part of the low-

order word.

The SWL instruction adds its sign-extended 16-bit offset to the contents of general register base to form a virtual address which may specify an arbitrary byte. It alters only the word in memory which contains that byte. From one to four bytes will be stored, depending on the starting byte specified.

Conceptually, it starts at the most-significant byte of the register and copies it to the

specified byte in memory; then it proceeds toward the low-order byte of the register and the

low-order byte of the word in memory, copying bytes from register to memory until it reaches

the low-order byte of the word in memory.

No address exceptions due to alignment are possible.

SWL $24,1($0)

            memory (big-endian)        register $24
before      address 0:  0 1 2 3        A B C D
            address 4:  4 5 6 7

after       address 0:  0 A B C
            address 4:  4 5 6 7
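The byte movement in the figure can be checked with a small model. The following is only a sketch, assuming a big-endian configuration and a single aligned 32-bit memory word; the function name `swl_big_endian` and its parameters are illustrative and are not part of the TX49 documentation.

```python
def swl_big_endian(word: bytes, reg: int, byte_offset: int) -> bytes:
    """Model SWL on one aligned 4-byte memory word (big-endian byte order).

    word        -- the 4 bytes currently stored at the aligned word address
    reg         -- low 32 bits of the source register
    byte_offset -- vAddr & 3, the byte within the word that SWL addresses
    """
    reg_bytes = (reg & 0xFFFFFFFF).to_bytes(4, "big")  # most-significant byte first
    count = 4 - byte_offset                            # one to four bytes are stored
    # Bytes in front of the addressed byte are left untouched; the remaining
    # bytes of the word receive the high-order bytes of the register.
    return word[:byte_offset] + reg_bytes[:count]


# Values from the figure: memory holds bytes 0,1,2,3 and $24 holds A,B,C,D.
reg24 = int.from_bytes(b"ABCD", "big")
print(swl_big_endian(b"0123", reg24, 1))  # b'0ABC'
```

With a byte offset of 0 the whole register is written, which is the aligned case.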


Operation:

32 T: vAddr ← ((offset15)^16 || offset15∼0) + GPR[base]
      (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
      pAddr ← pAddrPSIZE-1∼3 || (pAddr2∼0 xor ReverseEndian^3)
      if BigEndianMem = 0 then
          pAddr ← pAddr31∼2 || 0^2
      endif
      byte ← vAddr1∼0 xor BigEndianCPU^2
      if (vAddr2 xor BigEndianCPU) = 0 then
          data ← 0^32 || 0^(24-8*byte) || GPR[rt]31∼24-8*byte
      else
          data ← 0^(24-8*byte) || GPR[rt]31∼24-8*byte || 0^32
      endif
      StoreMemory (uncached, byte, data, pAddr, vAddr, DATA)

64 T: vAddr ← ((offset15)^48 || offset15∼0) + GPR[base]
      (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
      pAddr ← pAddrPSIZE-1∼3 || (pAddr2∼0 xor ReverseEndian^3)
      if BigEndianMem = 0 then
          pAddr ← pAddr31∼2 || 0^2
      endif
      byte ← vAddr1∼0 xor BigEndianCPU^2
      if (vAddr2 xor BigEndianCPU) = 0 then
          data ← 0^32 || 0^(24-8*byte) || GPR[rt]31∼24-8*byte
      else
          data ← 0^(24-8*byte) || GPR[rt]31∼24-8*byte || 0^32
      endif
      StoreMemory (uncached, byte, data, pAddr, vAddr, DATA)

Given a doubleword in a register and a doubleword in memory, the operation of SWL is as follows:

SWL
Register    A B C D E F G H
Memory      I J K L M N O P

            BigEndianCPU = 0                BigEndianCPU = 1
                               Offset                          Offset
vAddr2∼0    Destination  Type  LEM  BEM     Destination  Type  LEM  BEM
0           IJKLMNOE     0     0    7       EFGHMNOP     3     4    0
1           IJKLMNEF     1     0    6       IEFGMNOP     2     4    1
2           IJKLMEFG     2     0    5       IJEFMNOP     1     4    2
3           IJKLEFGH     3     0    4       IJKEMNOP     0     4    3
4           IJKEMNOP     0     4    3       IJKLEFGH     3     0    4
5           IJEFMNOP     1     4    2       IJKLMEFG     2     0    5
6           IEFGMNOP     2     4    1       IJKLMNEF     1     0    6
7           EFGHMNOP     3     4    0       IJKLMNOE     0     0    7

LEM     BigEndianMem = 0
BEM     BigEndianMem = 1
Type    AccessType (see Figure 2-2) sent to memory
Offset  pAddr2∼0 sent to memory

Exceptions:

TLB refill exception

TLB invalid exception

TLB modification exception

Bus error exception

Address error exception

SWR                       Store Word Right                       SWR

 31     26 25   21 20   16 15                  0
   SWR       base     rt          offset
  101110
     6        5       5             16

Format:

SWR rt, offset (base)

Description:

This instruction can be used with the SWL instruction to store the contents of a register

into four consecutive bytes of memory, when the bytes cross a boundary between two words.

SWR stores the right portion of the register into the appropriate part of the low-order word; SWL stores the left portion of the register into the appropriate part of the high-order word of memory.

The SWR instruction adds its sign-extended 16-bit offset to the contents of general register base to form a virtual address which may specify an arbitrary byte. It alters only the word in memory which contains that byte. From one to four bytes will be stored, depending on the starting byte specified.

Conceptually, it starts at the least-significant (rightmost) byte of the register and copies it to the specified byte in memory; then it proceeds toward the high-order byte of the register and the high-order byte of the word in memory, copying bytes from register to memory until it reaches the high-order byte of the word in memory.

No address exceptions due to alignment are possible.

SWR $24,4($0)

            memory (big-endian)        register $24
before      address 0:  0 1 2 3        A B C D
            address 4:  4 5 6 7

after       address 0:  0 1 2 3
            address 4:  D 5 6 7
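The same kind of model applies to SWR. This is a sketch under the same assumptions (big-endian configuration, one aligned 32-bit word); `swr_big_endian` is a hypothetical helper name, not something defined by the architecture.

```python
def swr_big_endian(word: bytes, reg: int, byte_offset: int) -> bytes:
    """Model SWR on one aligned 4-byte memory word (big-endian byte order).

    word        -- the 4 bytes currently stored at the aligned word address
    reg         -- low 32 bits of the source register
    byte_offset -- vAddr & 3, the byte within the word that SWR addresses
    """
    reg_bytes = (reg & 0xFFFFFFFF).to_bytes(4, "big")
    count = byte_offset + 1  # bytes 0..byte_offset of the word are overwritten
    # The low-order `count` bytes of the register end at the addressed byte;
    # bytes after the addressed byte are left untouched.
    return reg_bytes[4 - count:] + word[byte_offset + 1:]


# Values from the figure: the word at address 4 holds 4,5,6,7; $24 holds A,B,C,D.
reg24 = int.from_bytes(b"ABCD", "big")
print(swr_big_endian(b"4567", reg24, 0))  # b'D567'
```

In this model, an unaligned store of $24 to address 1 would be completed by an SWL at address 1 followed by an SWR at address 4, each touching only its own word.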


Operation:

32 T: vAddr ← ((offset15)^16 || offset15∼0) + GPR[base]
      (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
      pAddr ← pAddrPSIZE-1∼3 || (pAddr2∼0 xor ReverseEndian^3)
      if BigEndianMem = 0 then
          pAddr ← pAddr31∼2 || 0^2
      endif
      byte ← vAddr1∼0 xor BigEndianCPU^2
      if (vAddr2 xor BigEndianCPU) = 0 then
          data ← 0^32 || GPR[rt]31-8*byte∼0 || 0^(8*byte)
      else
          data ← GPR[rt]31-8*byte∼0 || 0^(8*byte) || 0^32
      endif
      StoreMemory (uncached, WORD-byte, data, pAddr, vAddr, DATA)

64 T: vAddr ← ((offset15)^48 || offset15∼0) + GPR[base]
      (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
      pAddr ← pAddrPSIZE-1∼3 || (pAddr2∼0 xor ReverseEndian^3)
      if BigEndianMem = 0 then
          pAddr ← pAddr31∼2 || 0^2
      endif
      byte ← vAddr1∼0 xor BigEndianCPU^2
      if (vAddr2 xor BigEndianCPU) = 0 then
          data ← 0^32 || GPR[rt]31-8*byte∼0 || 0^(8*byte)
      else
          data ← GPR[rt]31-8*byte∼0 || 0^(8*byte) || 0^32
      endif
      StoreMemory (uncached, WORD-byte, data, pAddr, vAddr, DATA)

Given a doubleword in a register and a doubleword in memory, the operation of SWR is as follows:

SWR
Register    A B C D E F G H
Memory      I J K L M N O P

            BigEndianCPU = 0                BigEndianCPU = 1
                               Offset                          Offset
vAddr2∼0    Destination  Type  LEM  BEM     Destination  Type  LEM  BEM
0           IJKLEFGH     3     0    4       HJKLMNOP     0     7    0
1           IJKLFGHP     2     1    4       GHKLMNOP     1     6    0
2           IJKLGHOP     1     2    4       FGHLMNOP     2     5    0
3           IJKLHNOP     0     3    4       EFGHMNOP     3     4    0
4           EFGHMNOP     3     4    0       IJKLHNOP     0     3    4
5           FGHLMNOP     2     5    0       IJKLGHOP     1     2    4
6           GHKLMNOP     1     6    0       IJKLFGHP     2     1    4
7           HJKLMNOP     0     7    0       IJKLEFGH     3     0    4

LEM     BigEndianMem = 0
BEM     BigEndianMem = 1
Type    AccessType (see Figure 2-2) sent to memory
Offset  pAddr2∼0 sent to memory

Exceptions:

TLB refill exception

TLB invalid exception

TLB modification exception

Bus error exception

Address error exception

SYNC                         Synchronize                         SYNC

 31     26 25                         6 5      0
  SPECIAL                0                SYNC
  000000    0000 0000 0000 0000 0000     001111
     6                  20                  6

Format:

SYNC

Description:

The SYNC instruction ensures that any loads and stores fetched prior to the present

instruction are completed before any loads or stores after this instruction are allowed to

start. Use of the SYNC instruction to serialize certain memory references may be required in

a multiprocessor environment for proper synchronization.

For example:

Processor A                  Processor B

    SW    R1, DATA        1: LW    R2, FLAG
    LI    R2, 1              BEQ   R2, R0, 1B
    SYNC                     NOP
    SW    R2, FLAG           SYNC
                             LW    R1, DATA

The SYNC in processor A prevents DATA being written after FLAG, which could cause

processor B to read stale data. The SYNC in processor B prevents DATA from being read

before FLAG, which could likewise result in reading stale data. For processors which only

execute loads and stores in order, with respect to shared memory, this instruction is a NOP.

LL and SC instructions implicitly perform a SYNC.

This instruction is allowed in User mode.

Operation:

32, 64 T: SyncOperation()

Exceptions:

None

SYSCALL                      System Call                      SYSCALL

 31     26 25                         6 5      0
  SPECIAL               Code             SYSCALL
  000000                                 001100
     6                  20                  6

Format:

SYSCALL

Description:

A system call exception occurs, immediately and unconditionally transferring control to the

exception handler.

The code field is available for use as software parameters, but is retrieved by the exception handler only by loading the contents of the memory word containing the instruction.
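As an illustration of how a handler might recover the code field once it has loaded the instruction word, here is a minimal sketch. It assumes the field layout shown in the encoding for this instruction (code in bits 25∼6); the function name `syscall_code` is hypothetical.

```python
SPECIAL = 0b000000        # primary opcode, bits 31..26
SYSCALL_FUNCT = 0b001100  # function field, bits 5..0


def syscall_code(instr_word: int) -> int:
    """Return the 20-bit code field (bits 25..6) of a SYSCALL instruction word."""
    opcode = (instr_word >> 26) & 0x3F
    funct = instr_word & 0x3F
    if opcode != SPECIAL or funct != SYSCALL_FUNCT:
        raise ValueError("not a SYSCALL instruction word")
    return (instr_word >> 6) & 0xFFFFF


# Build a SYSCALL word carrying code 0x12345 and read the code back.
word = (0x12345 << 6) | SYSCALL_FUNCT
print(hex(syscall_code(word)))  # 0x12345
```

The two-register trap instructions carry a shorter code field in bits 15∼6, so the same approach would use a 10-bit mask there.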

Operation:

32, 64 T: SystemCallException

Exceptions:

System Call exception

TEQ                         Trap If Equal                         TEQ

 31     26 25   21 20   16 15        6 5      0
  SPECIAL    rs      rt       code       TEQ
  000000                                110100
     6       5       5         10          6

Format:

TEQ rs, rt

Description:

The contents of general register rt are compared to general register rs.

If the contents of general register rs are equal to the contents of general register rt, a trap exception occurs.

The code field is available for use as software parameters, but is retrieved by the exception handler only by loading the contents of the memory word containing the instruction.

Operation:

32, 64 T: if GPR[rs] = GPR[rt] then

TrapException

endif

Exceptions:

Trap exception

TEQI                   Trap If Equal Immediate                   TEQI

 31     26 25   21 20   16 15                  0
  REGIMM     rs     TEQI        immediate
  000001            01100
     6       5       5             16

Format:

TEQI rs, immediate

Description:

The 16-bit immediate is sign-extended and compared to the contents of general register rs.

If the contents of general register rs are equal to the sign-extended immediate, a trap

exception occurs.

Operation:

32 T: if GPR[rs] = (immediate15)^16 || immediate15∼0 then
          TrapException
      endif

64 T: if GPR[rs] = (immediate15)^48 || immediate15∼0 then
          TrapException
      endif

Exceptions:

Trap exception

TGE                 Trap If Greater Than Or Equal                 TGE

 31     26 25   21 20   16 15        6 5      0
  SPECIAL    rs      rt       code       TGE
  000000                                110000
     6       5       5         10          6

Format:

TGE rs, rt

Description:

The contents of general register rt are compared to the contents of general register rs.

Considering both quantities as signed integers, if the contents of general register rs are

greater than or equal to the contents of general register rt, a trap exception occurs.

The code field is available for use as software parameters, but is retrieved by the exception handler only by loading the contents of the memory word containing the instruction.

Operation:

32, 64 T: if GPR[rs] ≥ GPR[rt] then

TrapException

endif

Exceptions:

Trap exception

TGEI           Trap If Greater Than Or Equal Immediate           TGEI

 31     26 25   21 20   16 15                  0
  REGIMM     rs     TGEI        immediate
  000001            01000
     6       5       5             16

Format:

TGEI rs, immediate

Description:

The 16-bit immediate is sign-extended and compared to the contents of general register rs.

Considering both quantities as signed integers, if the contents of general register rs are

greater than or equal to the sign-extended immediate, a trap exception occurs.

Operation:

32 T: if GPR[rs] ≥ (immediate15)^16 || immediate15∼0 then
          TrapException
      endif

64 T: if GPR[rs] ≥ (immediate15)^48 || immediate15∼0 then
          TrapException
      endif

Exceptions:

Trap exception

TGEIU      Trap If Greater Than Or Equal Immediate Unsigned      TGEIU

 31     26 25   21 20   16 15                  0
  REGIMM     rs     TGEIU       immediate
  000001            01001
     6       5       5             16

Format:

TGEIU rs, immediate

Description:

The 16-bit immediate is sign-extended and compared to the contents of general register rs.

Considering both quantities as unsigned integers, if the contents of general register rs are

greater than or equal to the sign-extended immediate, a trap exception occurs.
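Because the immediate is sign-extended first and only then treated as unsigned, a negative immediate becomes a very large comparison value. The sketch below models this in 32-bit mode by default; the helper names `sign_extend16` and `tgeiu_traps` are illustrative assumptions, not architectural names.

```python
def sign_extend16(imm: int, width: int = 32) -> int:
    """Sign-extend a 16-bit immediate to `width` bits, returned as an unsigned value."""
    imm &= 0xFFFF
    if imm & 0x8000:
        # Replicate the sign bit into bits width-1..16.
        imm |= ((1 << width) - 1) & ~0xFFFF
    return imm


def tgeiu_traps(rs_value: int, imm: int, width: int = 32) -> bool:
    """Return True if TGEIU would raise a trap exception for these operands."""
    mask = (1 << width) - 1
    return (rs_value & mask) >= sign_extend16(imm, width)


print(tgeiu_traps(5, 3))            # True: 5 >= 3
print(tgeiu_traps(5, 0xFFFF))       # False: immediate becomes 0xFFFFFFFF
print(tgeiu_traps(0xFFFFFFFF, 0xFFFF))  # True: both operands are 0xFFFFFFFF
```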

Operation:

32 T: if (0 GPR[rs]) ≥ (0 (immediate15)16 immediate15∼0) then

TrapException

endif

64 T: if (0 GPR[rs]) ≥ (0 (immediate15)48 immediate15∼0) then

TrapException

endif

Exceptions:

Trap exception

TGEU            Trap If Greater Than Or Equal Unsigned            TGEU

 31     26 25   21 20   16 15        6 5      0
  SPECIAL    rs      rt       code       TGEU
  000000                                110001
     6       5       5         10          6

Format:

TGEU rs, rt

Description:

The contents of general register rt are compared to the contents of general register rs.

Considering both quantities as unsigned integers, if the contents of general register rs are

greater than or equal to the contents of general register rt, a trap exception occurs.

The code field is available for use as software parameters, but is retrieved by the exception handler only by loading the contents of the memory word containing the instruction.

Operation:

32, 64 T: if (0 || GPR[rs]) ≥ (0 || GPR[rt]) then

TrapException

endif

Exceptions:

Trap exception

TLBP               Probe TLB For Matching Entry               TLBP

 31     26 25 24                      6 5      0
   COP0    CO             0               TLBP
  010000    1   000 0000 0000 0000 0000  001000
     6      1            19                 6

Format:

TLBP

Description:

The Index register is loaded with the address of the TLB entry whose contents match the contents of the EntryHi register. If no TLB entry matches, the high-order bit of the Index register is set.

The architecture does not specify the operation of memory references associated with the instruction immediately after a TLBP instruction, nor is the operation specified if more than one TLB entry matches.
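The probe can be pictured with a simplified software model. This is a sketch only: it assumes each TLB entry is reduced to a virtual page number, an ASID and a G bit, and it ignores the page mask that the 64-bit operation applies. The `TlbEntry` class and the `tlbp` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TlbEntry:
    vpn2: int  # virtual page number field of the entry
    asid: int  # address space identifier
    g: bool    # global bit: when set, the ASID comparison is skipped


def tlbp(tlb: List[TlbEntry], entry_hi_vpn2: int, entry_hi_asid: int) -> int:
    """Return the value TLBP would leave in Index for this simplified TLB."""
    index = 1 << 31  # high-order bit set means "no match"; low bits are undefined
    for i, entry in enumerate(tlb):
        if entry.vpn2 == entry_hi_vpn2 and (entry.g or entry.asid == entry_hi_asid):
            index = i & 0x3F  # high-order bit clear, entry number in bits 5..0
    return index


tlb = [TlbEntry(vpn2=0x10, asid=1, g=False), TlbEntry(vpn2=0x20, asid=2, g=True)]
print(tlbp(tlb, 0x20, 9))        # 1: global entry matches regardless of ASID
print(tlbp(tlb, 0x10, 2) >> 31)  # 1: ASID mismatch, high-order bit is set
```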

Operation:

32 T: Index ← 1 || 0^25 || Undefined^6
      for i in 0∼TLBEntries-1
          if (TLB[i]95∼77 = EntryHi31∼12) and (TLB[i]76 or
             (TLB[i]71∼64 = EntryHi7∼0)) then
              Index ← 0^26 || i5∼0
          endif
      endfor

64 T: Index ← 1 || 0^25 || Undefined^6
      for i in 0∼TLBEntries-1
          if (TLB[i]167∼141 and not (0^15 || TLB[i]216∼205))
             = (EntryHi39∼13 and not (0^15 || TLB[i]216∼205)) and
             (TLB[i]140 or (TLB[i]135∼128 = EntryHi7∼0)) then
              Index ← 0^26 || i5∼0
          endif
      endfor

Exceptions:

Coprocessor unusable exception

TLBR                  Read Indexed TLB Entry                  TLBR

 31     26 25 24                      6 5      0
   COP0    CO             0               TLBR
  010000    1   000 0000 0000 0000 0000  000001
     6      1            19                 6

Format:

TLBR

Description:

The G bit (controls ASID matching) read from the TLB is written into both EntryLo0 and

EntryLo1.

The EntryHi and EntryLo registers are loaded with the contents of the TLB entry pointed

at by the contents of the TLB Index register. The operation is invalid (and the results are

unspecified) if the contents of the TLB Index register are greater than the number of TLB

entries in the processor.

Operation:

32 T: PageMask ← TLB[Index5∼0]127∼96

EntryHi ← TLB[Index5∼0]95∼64 and not TLB[Index5∼0]127∼96

EntryLo1 ← TLB[Index5∼0]63∼32

EntryLo0 ← TLB[Index5∼0]31∼0

64 T: PageMask ← TLB[Index5∼0]255∼192

EntryHi ← TLB[Index5∼0]191∼128 and not TLB[Index5∼0]255∼192

EntryLo1 ← TLB[Index5∼0]127∼65 || TLB[Index5∼0]140

EntryLo0 ← TLB[Index5∼0]63∼1 || TLB[Index5∼0]140

Exceptions:

Coprocessor unusable exception

TLBWI                 Write Indexed TLB Entry                 TLBWI

 31     26 25 24                      6 5      0
   COP0    CO             0               TLBWI
  010000    1   000 0000 0000 0000 0000  000010
     6      1            19                 6

Format:

TLBWI

Description:

The G bit of the TLB is written with the logical AND of the G bits in EntryLo0 and

EntryLo1.

The TLB entry pointed at by the contents of the TLB Index register is loaded with the

contents of the EntryHi and EntryLo registers.

The operation is invalid (and the results are unspecified) if the contents of the TLB Index register are greater than the number of TLB entries in the processor.

Operation:

32, 64 T: TLB[Index5∼0] ←
          PageMask || (EntryHi and not PageMask) || EntryLo1 || EntryLo0

Exceptions:

Coprocessor unusable exception

TLBWR                 Write Random TLB Entry                 TLBWR

 31     26 25 24                      6 5      0
   COP0    CO             0               TLBWR
  010000    1   000 0000 0000 0000 0000  000110
     6      1            19                 6

Format:

TLBWR

Description:

The G bit of the TLB is written with the logical AND of the G bits in EntryLo0 and

EntryLo1.

The TLB entry pointed at by the contents of the TLB Random register is loaded with the

contents of the EntryHi and EntryLo registers.

Operation:

32, 64 T: TLB[Random5∼0] ←
          PageMask || (EntryHi and not PageMask) || EntryLo1 || EntryLo0

Exceptions:

Coprocessor unusable exception

TLT                       Trap If Less Than                       TLT

 31     26 25   21 20   16 15        6 5      0
  SPECIAL    rs      rt       code       TLT
  000000                                110010
     6       5       5         10          6

Format:

TLT rs, rt

Description:

The contents of general register rt are compared to general register rs.

Considering both quantities as signed integers, if the contents of general register rs are less than the contents of general register rt, a trap exception occurs.

The code field is available for use as software parameters, but is retrieved by the exception handler only by loading the contents of the memory word containing the instruction.

Operation:

32, 64 T: if GPR[rs] < GPR[rt] then

TrapException

endif

Exceptions:

Trap exception

TLTI                 Trap If Less Than Immediate                 TLTI

 31     26 25   21 20   16 15                  0
  REGIMM     rs     TLTI        immediate
  000001            01010
     6       5       5             16

Format:

TLTI rs, immediate

Description:

The 16-bit immediate is sign-extended and compared to the contents of general register rs.

Considering both quantities as signed integers, if the contents of general register rs are less

than the sign-extended immediate, a trap exception occurs.

Operation:

32 T: if GPR[rs] < (immediate15)^16 || immediate15∼0 then
          TrapException
      endif

64 T: if GPR[rs] < (immediate15)^48 || immediate15∼0 then
          TrapException
      endif

Exceptions:

Trap exception

TLTIU            Trap If Less Than Immediate Unsigned            TLTIU

 31     26 25   21 20   16 15                  0
  REGIMM     rs     TLTIU       immediate
  000001            01011
     6       5       5             16

Format:

TLTIU rs, immediate

Description:

The 16-bit immediate is sign-extended and compared to the contents of general register rs. Considering both quantities as unsigned integers, if the contents of general register rs are less than the sign-extended immediate, a trap exception occurs.

Operation:

32 T: if (0 GPR[rs]) < (0 (immediate15)16 immediate15∼0) then

TrapException

endif

64 T: if (0 GPR[rs]) < (0 (immediate15)48 immediate15∼0) then

TrapException

endif

Exceptions:

Trap exception

TLTU                 Trap If Less Than Unsigned                 TLTU

 31     26 25   21 20   16 15        6 5      0
  SPECIAL    rs      rt       code       TLTU
  000000                                110011
     6       5       5         10          6

Format:

TLTU rs, rt

Description:

The contents of general register rt are compared to general register rs. Considering both

quantities as unsigned integers, if the contents of general register rs are less than the

contents of general register rt, a trap exception occurs.

The code field is available for use as software parameters, but is retrieved by the exception handler only by loading the contents of the memory word containing the instruction.

Operation:

32, 64 T: if (0 || GPR[rs]) < (0 || GPR[rt]) then

TrapException

endif

Exceptions:

Trap exception

TNE                       Trap If Not Equal                       TNE

 31     26 25   21 20   16 15        6 5      0
  SPECIAL    rs      rt       code       TNE
  000000                                110110
     6       5       5         10          6

Format:

TNE rs, rt

Description:

The contents of general register rt are compared to general register rs. If the contents of general register rs are not equal to the contents of general register rt, a trap exception occurs.

The code field is available for use as software parameters, but is retrieved by the exception handler only by loading the contents of the memory word containing the instruction.

Operation:

32, 64 T: if GPR[rs] ≠ GPR[rt] then

TrapException

endif

Exceptions:

Trap exception

TNEI                 Trap If Not Equal Immediate                 TNEI

 31     26 25   21 20   16 15                  0
  REGIMM     rs     TNEI        immediate
  000001            01110
     6       5       5             16

Format:

TNEI rs, immediate

Description:

The 16-bit immediate is sign-extended and compared to the contents of general register rs.

If the contents of general register rs are not equal to the sign-extended immediate, a trap

exception occurs.

Operation:

32 T: if GPR[rs] ≠ (immediate15)^16 || immediate15∼0 then
          TrapException
      endif

64 T: if GPR[rs] ≠ (immediate15)^48 || immediate15∼0 then
          TrapException
      endif

Exceptions:

Trap exception

WAIT                             Wait                             WAIT

 31     26 25 24                      6 5      0
   COP0    CO             0               WAIT
  010000    1   000 0000 0000 0000 0000  100000
     6      1            19                 6

Format :

WAIT

Description :

The WAIT instruction is used to halt the internal pipeline and thus reduce the power

consumption of the CPU. See Chapter 16.

Operation :

32, 64 T: if G-bus is idle then

StopPipeline

endif

Exceptions :

Coprocessor unusable exception

XOR                         Exclusive Or                         XOR

 31     26 25   21 20   16 15   11 10    6 5      0
  SPECIAL    rs      rt      rd       0      XOR
  000000                            00000   100110
     6       5       5       5        5       6

Format:

XOR rd, rs, rt

Description:

The contents of general register rs are combined with the contents of general register rt in

a bit-wise logical exclusive OR operation. The result is placed into general register rd.

Operation:

32, 64 T: GPR[rd] ← GPR[rs] xor GPR[rt]

Exceptions:

None

XORI                  Exclusive OR Immediate                  XORI

 31     26 25   21 20   16 15                  0
   XORI      rs      rt         immediate
  001110
     6       5       5             16

Format:

XORI rt, rs, immediate

Description:

The 16-bit immediate is zero-extended and combined with the contents of general register rs in a bit-wise logical exclusive OR operation. The result is placed into general register rt.
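Unlike the trap-immediate instructions, XORI zero-extends its immediate, so only the low 16 bits of the register can change. A minimal sketch, with the hypothetical helper name `xori` and a register width parameter added for illustration:

```python
def xori(rs_value: int, imm: int, width: int = 32) -> int:
    """Model XORI: zero-extend the 16-bit immediate and XOR it with rs."""
    mask = (1 << width) - 1
    return (rs_value ^ (imm & 0xFFFF)) & mask


print(hex(xori(0xFFFF0000, 0xFFFF)))  # 0xffffffff
print(hex(xori(0x00001234, 0x1234)))  # 0x0
```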

Operation:

32 T: GPR[rt] ← GPR[rs] xor (0^16 || immediate)

64 T: GPR[rt] ← GPR[rs] xor (0^48 || immediate)

Exceptions:

None


A.7 Bit Encoding of CPU Instruction OPcodes

Table A-2 shows the bit codes for all TX49 CPU instructions (ISA and extended ISA).

Table A-4 CPU Operation Code Bit Encoding

 31     26 25                                0
   OPcode

OPcode
         28∼26
31∼29    0          1          2         3          4        5        6        7
0        SPECIAL λ  REGIMM λ   J         JAL        BEQ      BNE      BLEZ     BGTZ
1        ADDI       ADDIU      SLTI      SLTIU      ANDI     ORI      XORI     LUI
2        COP0 α     COP1 α     COP2 α    COP3 α θ   BEQL     BNEL     BLEZL    BGTZL
3        DADDI ε    DADDIU ε   LDL ε     LDR ε      MAC λ    *        *        *
4        LB         LH         LWL       LW         LBU      LHU      LWR      LWU ε
5        SB         SH         SWL       SW         SDL ε    SDR ε    SWR      CACHE
6        LL         LWC1 α     LWC2 α    PREF       LLD ε    LDC1 α   LDC2 α   LD ε
7        SC         SWC1 α     SWC2 α    *          SCD ε    SDC1 α   SDC2 α   SD ε

SPECIAL Function

 31     26 25                       6 5        0
  OPcode =                             SPECIAL
  SPECIAL                              Function

         2∼0
5∼3      0         1         2         3         4          5         6          7
0        SLL       *         SRL       SRA       SLLV       *         SRLV       SRAV
1        JR        JALR      *         *         SYSCALL    BREAK     SDBBP      SYNC
2        MFHI      MTHI      MFLO      MTLO      DSLLV ε    *         DSRLV ε    DSRAV ε
3        MULT      MULTU     DIV       DIVU      DMULT ε    DMULTU ε  DDIV ε     DDIVU ε
4        ADD       ADDU      SUB       SUBU      AND        OR        XOR        NOR
5        *         *         SLT       SLTU      DADD ε     DADDU ε   DSUB ε     DSUBU ε
6        TGE       TGEU      TLT       TLTU      TEQ        *         TNE        *
7        DSLL ε    *         DSRL ε    DSRA ε    DSLL32 ε   *         DSRL32 ε   DSRA32 ε

REGIMM rt

 31     26 25    21 20    16 15              0
  OPcode =           REGIMM
  REGIMM               rt

         18∼16
20∼19    0        1        2         3         4     5     6      7
0        BLTZ     BGEZ     BLTZL     BGEZL     *     *     *      *
1        TGEI     TGEIU    TLTI      TLTIU     TEQI  *     TNEI   *
2        BLTZAL   BGEZAL   BLTZALL   BGEZALL   *     *     *      *
3        *        *        *         *         *     *     *      *

COPz rs

 31     26 25    21 20                       0
  OPcode =   COPz
   COPz       rs

         23∼21
25∼24    0     1       2     3     4     5       6     7
0        MF    DMF ε   CF    γ     MT    DMT ε   CT    γ
1        BC    γ       γ     γ     γ     γ       γ     γ
2, 3     CO

COPz rt

 31     26 25    21 20    16 15              0
  OPcode =            COPz
   COPz                rt

         18∼16
20∼19    0      1      2       3       4     5     6     7
0        BCF    BCT    BCFL    BCTL    γ     γ     γ     γ
1        γ      γ      γ       γ       γ     γ     γ     γ
2        γ      γ      γ       γ       γ     γ     γ     γ
3        γ      γ      γ       γ       γ     γ     γ     γ

COP0 Function

 31     26 25                       6 5        0
  OPcode =                              COP0
   COP0                               Function

         2∼0
5∼3      0       1       2        3     4     5     6        7
0        φ       TLBR    TLBWI    φ     φ     φ     TLBWR    φ
1        TLBP    φ       φ        φ     φ     φ     φ        φ
2        φ       φ       φ        φ     φ     φ     φ        φ
3        ERET    φ       φ        φ     φ     φ     φ        DERET
4        WAIT    φ       φ        φ     φ     φ     φ        φ
5        φ       φ       φ        φ     φ     φ     φ        φ
6        φ       φ       φ        φ     φ     φ     φ        φ
7        φ       φ       φ        φ     φ     φ     φ        φ

MAC Function

 31     26 25                       6 5        0
  OPcode =                               MAC
    MAC                               Function

         2∼0
5∼3      0       1        2     3     4     5     6     7
0        MADD    MADDU    γ     γ     γ     γ     γ     γ
1        γ       γ        γ     γ     γ     γ     γ     γ
2        γ       γ        γ     γ     γ     γ     γ     γ
3        γ       γ        γ     γ     γ     γ     γ     γ
4        γ       γ        γ     γ     γ     γ     γ     γ
5        γ       γ        γ     γ     γ     γ     γ     γ
6        γ       γ        γ     γ     γ     γ     γ     γ
7        γ       γ        γ     γ     γ     γ     γ     γ

Key :

*: This opcode is reserved for future use. An attempt to execute it causes a Reserved Instruction exception.

γ: This opcode is reserved for future use. An attempt to execute it causes a Reserved Instruction exception.

λ: This opcode indicates an instruction class. The instruction word must be further decoded by examining additional tables that show the values for another instruction field.

α: This opcode is a coprocessor operation, not a CPU operation. If the processor state does not allow access to the specified coprocessor, the instruction causes a Coprocessor Unusable exception. It is included in the table because it uses a primary opcode in the instruction encoding map.

φ: This opcode is reserved for future use, but does not cause a Reserved Instruction exception in TX49 implementations. It is treated as “NOP”.

θ: This opcode is valid only when BC is selected in COPz rs; otherwise it causes a Reserved Instruction exception.

ε: This opcode is valid when the processor is operating either in Kernel mode or in 64-bit non-Kernel (User or Supervisor) mode; otherwise it causes a Reserved Instruction exception.
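The primary opcode table can be read mechanically: bits 31∼29 select the row and bits 28∼26 select the column. The sketch below transcribes the table into a Python list and looks up the mnemonic for an instruction word. The names `PRIMARY` and `primary_opcode` are illustrative, reserved entries (`*`) are represented as `None`, and the key symbols (λ, α, ε, θ) are omitted for brevity.

```python
# Rows are indexed by bits 31..29, columns by bits 28..26.
PRIMARY = [
    ["SPECIAL", "REGIMM", "J", "JAL", "BEQ", "BNE", "BLEZ", "BGTZ"],
    ["ADDI", "ADDIU", "SLTI", "SLTIU", "ANDI", "ORI", "XORI", "LUI"],
    ["COP0", "COP1", "COP2", "COP3", "BEQL", "BNEL", "BLEZL", "BGTZL"],
    ["DADDI", "DADDIU", "LDL", "LDR", "MAC", None, None, None],
    ["LB", "LH", "LWL", "LW", "LBU", "LHU", "LWR", "LWU"],
    ["SB", "SH", "SWL", "SW", "SDL", "SDR", "SWR", "CACHE"],
    ["LL", "LWC1", "LWC2", "PREF", "LLD", "LDC1", "LDC2", "LD"],
    ["SC", "SWC1", "SWC2", None, "SCD", "SDC1", "SDC2", "SD"],
]


def primary_opcode(instr_word: int):
    """Return the primary-opcode mnemonic of a 32-bit instruction word, or None if reserved."""
    op = (instr_word >> 26) & 0x3F
    return PRIMARY[op >> 3][op & 0x7]


print(primary_opcode(0b101010 << 26))  # SWL
print(primary_opcode(0b001110 << 26))  # XORI
print(primary_opcode(0x0000000F))      # SPECIAL (the SYNC encoding)
```

Entries that name an instruction class, such as SPECIAL or REGIMM, would need a second lookup in the corresponding function or rt table.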


Appendix B: FPU Instruction Set Details

This appendix provides a detailed description of the operation of each Floating-Point (FPU) instruction. The instructions are listed alphabetically. The exceptions that may occur due to the execution of each instruction are listed after the description of each instruction. The description of the immediate causes and the manner of handling exceptions is omitted from the instruction descriptions in this chapter. Refer to Chapter 6 for detailed descriptions of floating-point exceptions and handling.

Table B-5 lists the entire bit encoding for the constant fields of the Floating-Point instruction

set; the bit encoding for each instruction is included with that individual instruction.

B.1 Instruction Formats

There are three basic instruction format types:

• I-Type, or Immediate instructions, which include load and store operations, M-Type,

or Move instructions

• R-Type, or Register instructions, which include the two- and three-register Floating-Point operations.

• Branch instructions and Move instructions

The instruction description subsections that follow show how the three basic instruction

formats are used by:

Load and store instructions,

Move instructions, and

Floating-Point Computational instructions.


Floating-point instructions are mapped onto the MIPS coprocessor instructions, defining

coprocessor unit number one (CP1) as the floating-point unit.

Each operation is valid only for certain formats. Implementations may support some of these formats and operations only through emulation, but need only support the combinations that are valid, which are marked with a V in Table B-1 below. Combinations marked with an “R” are not currently specified by this architecture and cause an unimplemented instruction trap, in order to maintain compatibility with future architecture extensions.

Table B-1 Valid FPU Instruction Formats

                   Source Format
Operation    Single   Double   Word   Longword
ADD          V        V        R      R
SUB          V        V        R      R
MUL          V        V        R      R
DIV          V        V        R      R
SQRT         V        V        R      R
ABS          V        V        R      R
MOV          V        V
NEG          V        V        R      R
TRUNC.L      V        V
ROUND.L      V        V
CEIL.L       V        V
FLOOR.L      V        V
TRUNC.W      V        V
ROUND.W      V        V
CEIL.W       V        V
FLOOR.W      V        V
CVT.S                 V        V      V
CVT.D        V                 V      V
CVT.W        V        V
CVT.L        V        V
C            V        V        R      R

The coprocessor branch on condition true/false instructions can be used to logically negate

any predicate. Thus, the 32 possible conditions require only 16 distinct comparisons, as shown

in Table B-2 below.

Table B-2 Logical Negation of Predicates by Condition True/False

   Condition                       Relations                     Invalid Operation
   Mnemonic                Greater   Less                        exception if
True     False     Code    Than      Than    Equal   Unordered   unordered
F        T         0       F         F       F       F           No
UN       OR        1       F         F       F       T           No
EQ       NEQ       2       F         F       T       F           No
UEQ      OGL       3       F         F       T       T           No
OLT      UGE       4       F         T       F       F           No
ULT      OGE       5       F         T       F       T           No
OLE      UGT       6       F         T       T       F           No
ULE      OGT       7       F         T       T       T           No
SF       ST        8       F         F       F       F           Yes
NGLE     GLE       9       F         F       F       T           Yes
SEQ      SNE       10      F         F       T       F           Yes
NGL      GL        11      F         F       T       T           Yes
LT       NLT       12      F         T       F       F           Yes
NGE      GE        13      F         T       F       T           Yes
LE       NLE       14      F         T       T       F           Yes
NGT      GT        15      F         T       T       T           Yes
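Reading the table column by column, the low three bits of the condition code appear to select which relations make the "True" predicate hold (bit 2: less than, bit 1: equal, bit 0: unordered), and bit 3 selects whether an unordered result signals Invalid Operation. The following sketch encodes that reading; it is an interpretation of the table rather than a statement of the hardware implementation, and the function names are hypothetical.

```python
def true_predicate(code: int, less: bool, equal: bool, unordered: bool) -> bool:
    """Evaluate the 'True' mnemonic of Table B-2 for a condition code 0..15."""
    return bool(
        (code & 0b100 and less)
        or (code & 0b010 and equal)
        or (code & 0b001 and unordered)
    )


def false_predicate(code: int, less: bool, equal: bool, unordered: bool) -> bool:
    """The 'False' mnemonic is the logical negation of the 'True' one."""
    return not true_predicate(code, less, equal, unordered)


def invalid_if_unordered(code: int, unordered: bool) -> bool:
    """Codes 8..15 signal Invalid Operation when the operands are unordered."""
    return bool(code & 0b1000) and unordered


# EQ (code 2) holds when the operands compare equal.
print(true_predicate(2, less=False, equal=True, unordered=False))   # True
# OGT (the False side of code 7, ULE) holds when the left operand is greater.
print(false_predicate(7, less=False, equal=False, unordered=False))  # True
```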

B.1.1 Floating-Point Loads, Stores, and Moves

All movement of data between the floating-point coprocessor and memory is

accomplished by coprocessor load and store operations, which reference the floating-point

coprocessor’s General-Purpose Registers. These operations are unformatted; no format

conversions are performed and, therefore, no floating-point exceptions occur due to these

operations.

Data may also be directly moved between the floating-point coprocessor and the

processor by move to coprocessor and move from coprocessor instructions. Like the

floating-point load and store operations, move to/from operations perform no format

conversions and never cause floating-point exceptions.

An additional pair of coprocessor registers is available, called the Floating-Point Control registers, for which the only data movement operations supported are moves to and from processor General-Purpose Registers.

B.1.2 Floating-Point Operations

The floating-point unit’s operation set includes floating-point add, subtract, multiply,

divide, square root, convert between fixed-point and floating-point format, convert

between floating-point formats, and floating-point compare. These operations satisfy

IEEE Standard 754’s requirements for accuracy. Specifically, these operations obtain a

result which is identical to computing the result with infinite precision and then

rounding to the specified format, using the current rounding mode.

Instructions must specify the format of their operands. Except for conversion functions, mixed-format operations are not provided.

B.2 Instruction Notational Conventions

In this appendix, all variable subfields in an instruction format (such as fs, ft, immediate,

and so on) are shown with lower-case names. The instruction name (such as ADD, SUB, and

so on) is shown in upper-case.

For the sake of clarity, an alias is sometimes substituted for a variable subfield in the

formats of specific instructions. For example, we use rs = base in the format for load and store

instructions. Such an alias is always lower case, since it refers to a variable subfield.

In some instructions, however, the two instruction subfields op and function have constant

6-bit values. When reference is made to these instructions, upper-case mnemonics are used.

In the floating-point instruction, for example, we use op = COP1 and function = FADD. In

some cases, a single field has both fixed and variable subfields, so the name contains both

upper and lower case characters. Actual bit encoding for mnemonics is shown in Figure B-5 at

the end of this appendix, and is also included with each individual instruction.

In the instruction description examples that follow, the Operation section describes the

operation performed by each instruction using a high-level language notation.

B.2.1 Instruction Notation Examples

Example #1:

GPR[ft] ← immediate || 0^16

Sixteen zero bits are concatenated with an immediate value (typically 16 bits), and the 32-bit string (with the lower 16 bits set to zero) is assigned to GPR register ft.

Example #2:

(immediate15)^16 || immediate15∼0

Bit 15 (the sign bit) of an immediate value is extended for 16 bit positions, and the result is concatenated with bits 15 through 0 of the immediate value to form a 32-bit sign-extended value.
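For readers more used to shift-and-mask code, the two notation examples correspond to the following operations. This is a sketch for illustration only; `example1` and `example2` are hypothetical names, and the values are treated as unsigned 32-bit integers.

```python
def example1(immediate: int) -> int:
    """immediate followed by sixteen zero bits: the immediate lands in bits 31..16."""
    return ((immediate & 0xFFFF) << 16) & 0xFFFFFFFF


def example2(immediate: int) -> int:
    """Bit 15 replicated sixteen times, followed by bits 15..0: 32-bit sign extension."""
    immediate &= 0xFFFF
    sign = (immediate >> 15) & 1
    upper = 0xFFFF if sign else 0x0000  # sixteen copies of the sign bit
    return (upper << 16) | immediate


print(hex(example1(0x1234)))  # 0x12340000
print(hex(example2(0x8000)))  # 0xffff8000
print(hex(example2(0x7FFF)))  # 0x7fff
```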


B.3 Load and Store Instructions

In the MIPS ISA, all load operations have a delay of at least one instruction. That is,

the instruction immediately following a load cannot use the contents of the register that

will be loaded with the data being fetched from storage.

In the TX49, the instruction immediately following a load may use the contents of the register being loaded. In such cases the hardware interlocks, requiring additional real cycles, so scheduling load delay slots is still desirable, although not absolutely required for functional code.

When the FR bit in the Status register equals zero, the Floating-Point General Registers (FGR) are 32 bits wide. When the FR bit in the Status register equals one, the Floating-Point General Registers (FGR) are 64 bits wide. The behavior of the load and store instructions is dependent on the width of the FGRs.

In the load/store operation descriptions, the functions listed in Table B-3 are used to

summarize the handling of virtual addresses and physical memory.

Table B-3 Load/Store Common Functions

Function            Meaning

AddressTranslation  Uses the TLB to find the physical address given the virtual address. The function fails and an exception is taken if the required translation is not present in the TLB.

LoadMemory          Uses the cache and main memory to find the contents of the word containing the specified physical address. The low-order two bits of the address and the access type field indicate which of the four bytes within the data word need to be returned. If the cache is enabled for this access, the entire word is returned and loaded into the cache.

StoreMemory         Uses the cache, write buffer and main memory to store the word or part of word specified as data in the word containing the specified physical address. The low-order two bits of the address and the access type field indicate which of the four bytes within the data word should be stored.

Figure B-1 shows the I-Type instruction format used by load and store operations.

I-Type (Immediate)

Bits:    31∼26   25∼21   20∼16   15∼0
Field:   op      base    ft      offset
Width:   6       5       5       16

where:

op      is a 6-bit operation code
base    is the 5-bit base register specifier
ft      is a 5-bit source (for stores) or destination (for loads) FPA register specifier
offset  is the 16-bit signed immediate offset

Figure B-1 Load and Store Instruction Format
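The field layout of Figure B-1 can be illustrated with a short C sketch. It packs and unpacks an I-Type word using the LWC1 and LDC1 opcode values given later in this appendix. The function and constant names are hypothetical and chosen only for illustration; they are not part of any Toshiba toolchain.

```c
#include <assert.h>
#include <stdint.h>

/* Opcode values from the LWC1 (110001) and LDC1 (110101) entries. */
enum { OP_LWC1 = 0x31, OP_LDC1 = 0x35 };

/* Pack op | base | ft | offset into one 32-bit I-Type instruction word
   (hypothetical helper, not a documented API). */
static uint32_t encode_itype(uint32_t op, uint32_t base, uint32_t ft, uint16_t offset)
{
    assert(op < 64 && base < 32 && ft < 32);
    return (op << 26) | (base << 21) | (ft << 16) | (uint32_t)offset;
}

/* Recover the signed 16-bit offset (bits 15..0) from an I-Type word. */
static int32_t itype_offset(uint32_t insn)
{
    int32_t low = (int32_t)(insn & 0xFFFFu);
    return (low ^ 0x8000) - 0x8000;   /* portable 16-bit sign extension */
}
```

For example, a load of FPU register 2 from offset -8 relative to general register 4 passes the offset as the 16-bit pattern 0xFFF8, and `itype_offset` returns -8 again when decoding the word.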

All coprocessor loads and stores reference aligned word data items. Thus, for word loads

and stores, the access type field is always WORD, and the low-order two bits of the address

must always be zero.

For double word loads and stores, the access type field is always DOUBLEWORD, and the

low-order three bits of the address must always be zero.

Regardless of byte-numbering order (endianness), the address specifies that byte which has
the smallest byte-address of all of the bytes in the addressed field. For a Big-endian machine,
this is the leftmost byte; for a Little-endian machine, this is the rightmost byte.


B.4 Computational Instructions

Computational instructions include all of the arithmetic floating-point operations performed

by the FPU. Figure B-2 shows the R-Type instruction format used for computational

operations.

R-Type (Register)

Bits:    31∼26   25∼21   20∼16   15∼11   10∼6   5∼0
Field:   COP1    fmt     ft      fs      fd     function
Width:   6       5       5       5       5      6

where:

COP1      is a 6-bit major operation code
fmt       is a 5-bit format specifier
fs        is a 5-bit source 1 register
ft        is a 5-bit source 2 register
fd        is a 5-bit destination register
function  is a 6-bit function field

Figure B-2 Computational Instruction Format

Each floating-point instruction can be applied to a number of operand formats. The operand
format for an instruction is specified by the 5-bit Format field (fmt); decoding for this field
is shown in Table B-4.

Table B-4 Format Field Decoding

Code    Mnemonic  Size      Format
16      S         single    Binary floating-point
17      D         double    Binary floating-point
18      -         -         Reserved
19      -         -         Reserved
20      W         single    Binary fixed-point
21      L         longword  64-bit binary fixed-point
22∼31   -         -         Reserved
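As an illustration of Figure B-2 and Table B-4 together, the following C sketch assembles an R-Type word and decodes its fmt field. The helper names are hypothetical; only the field positions, the COP1 opcode 010001 and the format codes come from this appendix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { COP1 = 0x11 };                                     /* 010001 */
enum { FMT_S = 16, FMT_D = 17, FMT_W = 20, FMT_L = 21 };  /* Table B-4 */

/* Pack COP1 | fmt | ft | fs | fd | function (hypothetical helper). */
static uint32_t encode_rtype(uint32_t fmt, uint32_t ft, uint32_t fs,
                             uint32_t fd, uint32_t function)
{
    assert(fmt < 32 && ft < 32 && fs < 32 && fd < 32 && function < 64);
    return ((uint32_t)COP1 << 26) | (fmt << 21) | (ft << 16) |
           (fs << 11) | (fd << 6) | function;
}

/* Return the Table B-4 mnemonic for bits 25..21, or NULL for any code
   that Table B-4 does not assign a mnemonic. */
static const char *fmt_mnemonic(uint32_t insn)
{
    switch ((insn >> 21) & 0x1Fu) {
    case FMT_S: return "S";
    case FMT_D: return "D";
    case FMT_W: return "W";
    case FMT_L: return "L";
    default:    return NULL;
    }
}
```

With function code 0 (ADD, Table B-5), `encode_rtype(FMT_D, 6, 4, 2, 0)` builds the word for a double-precision add of registers 4 and 6 into register 2.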

The function field indicates which floating-point operation is to be performed. Table B-5
lists all floating-point instructions.


Table B-5 Floating-Point Instructions and Operations

Code (5∼0)  Mnemonic  Operation
0           ADD       Add
1           SUB       Subtract
2           MUL       Multiply
3           DIV       Divide
4           SQRT      Square root
5           ABS       Absolute value
6           MOV       Move
7           NEG       Negate
8           ROUND.L   Convert to long fixed-point, rounded to nearest/even
9           TRUNC.L   Convert to long fixed-point, rounded toward zero
10          CEIL.L    Convert to long fixed-point, rounded to +∞
11          FLOOR.L   Convert to long fixed-point, rounded to −∞
12          ROUND.W   Convert to single fixed-point, rounded to nearest/even
13          TRUNC.W   Convert to single fixed-point, rounded toward zero
14          CEIL.W    Convert to single fixed-point, rounded to +∞
15          FLOOR.W   Convert to single fixed-point, rounded to −∞
16∼31       -         Reserved
32          CVT.S     Convert to single floating-point
33          CVT.D     Convert to double floating-point
34          -         Reserved
35          -         Reserved
36          CVT.W     Convert to binary fixed-point
37          CVT.L     Convert to 64-bit binary fixed-point
38∼47       -         Reserved
48∼63       C         Floating-point compare


In the following pages, the notation FGR refers to the FPU’s 32 General-Purpose Registers
FGR0 through FGR31, and FPR refers to the FPU’s Floating-Point Registers. When the FR
bit in the Status register (SR26) equals zero, only the even Floating-Point Registers are valid
and the FPU’s 32 General-Purpose Registers are 32 bits wide. When the FR bit in the Status
register equals one, both odd and even Floating-Point Registers are valid and the
FPU’s 32 General-Purpose Registers are 64 bits wide.

The following routines are used in the description of the floating-point operations to get the

value of an FPR or to change the value of an FGR:

32 Bit Mode

value ← ValueFPR (fpr, fmt)
    /* undefined for odd fpr */
    case fmt of
        S, W:  value ← FGR[fpr + 0]
        D:     /* undefined for fpr not even */
               value ← FGR[fpr + 1] || FGR[fpr + 0]
    end

StoreFPR (fpr, fmt, value):
    /* undefined for odd fpr */
    case fmt of
        S, W:  FGR[fpr + 1] ← undefined
               FGR[fpr + 0] ← value
        D:     FGR[fpr + 1] ← value63∼32
               FGR[fpr + 0] ← value31∼0
    end

64 Bit Mode

value ← ValueFPR (fpr, fmt)
    case fmt of
        S:     value ← FGR[fpr]31∼0
        D, L:  value ← FGR[fpr]
        W:     value ← FGR[fpr]
    end

StoreFPR (fpr, fmt, value):
    case fmt of
        S, W:  FGR[fpr] ← undefined^32 || value
        D, L:  FGR[fpr] ← value
    end
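The two routines can be modelled in C roughly as follows. This is a minimal sketch, not the hardware behavior: the type and function names are hypothetical, cases the pseudocode marks as undefined are turned into assertions, and "undefined" upper bits are simply zeroed or left untouched.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { FMT_S = 16, FMT_D = 17, FMT_W = 20, FMT_L = 21 } fmt_t;

typedef struct {
    uint64_t fgr[32];  /* only the low 32 bits are used when fr == 0 */
    int fr;            /* FR bit of the Status register (SR26) */
} fpu_t;

static uint64_t value_fpr(const fpu_t *f, unsigned fpr, fmt_t fmt)
{
    if (f->fr)         /* 64 Bit Mode */
        return (fmt == FMT_S) ? (f->fgr[fpr] & 0xFFFFFFFFu) : f->fgr[fpr];
    /* 32 Bit Mode: odd fpr is undefined, and L is not listed. */
    assert((fpr & 1u) == 0 && fmt != FMT_L);
    if (fmt == FMT_D)  /* FGR[fpr + 1] || FGR[fpr + 0] */
        return ((f->fgr[fpr + 1] & 0xFFFFFFFFu) << 32) |
               (f->fgr[fpr] & 0xFFFFFFFFu);
    return f->fgr[fpr] & 0xFFFFFFFFu;
}

static void store_fpr(fpu_t *f, unsigned fpr, fmt_t fmt, uint64_t value)
{
    if (f->fr) {       /* 64 Bit Mode */
        /* For S and W the upper 32 bits are undefined; the sketch zeroes them. */
        f->fgr[fpr] = (fmt == FMT_S || fmt == FMT_W) ? (value & 0xFFFFFFFFu)
                                                     : value;
        return;
    }
    assert((fpr & 1u) == 0 && fmt != FMT_L);
    if (fmt == FMT_D)
        f->fgr[fpr + 1] = value >> 32;     /* value63..32 */
    /* For S and W, FGR[fpr + 1] becomes undefined; the sketch leaves it as is. */
    f->fgr[fpr] = value & 0xFFFFFFFFu;     /* value31..0 */
}
```

A double stored with FR = 0 therefore lands in an even-odd register pair, while with FR = 1 it occupies a single 64-bit register.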


ABS.fmt    Floating-Point Absolute Value

Bits:    31∼26    25∼21   20∼16   15∼11   10∼6   5∼0
Field:   COP1     fmt     0       fs      fd     ABS
Value:   010001   -       00000   -       -      000101
Width:   6        5       5       5       5      6

Format:

ABS.fmt fd, fs

Description:

The contents of the FPU register specified by fs are interpreted in the specified format and
the arithmetic absolute value is taken. The result is placed in the floating-point register
specified by fd.

The absolute value operation is arithmetic; a NaN operand signals invalid operation.

This instruction is valid only for single- and double-precision floating-point formats. The
operation is not defined if bit 0 of any register specification is set and the FR bit in the Status
register equals zero, since the register numbers specify an even-odd pair of adjacent
coprocessor general registers. When the FR bit in the Status register equals one, both even
and odd register numbers are valid.

Operation:

32, 64  T:  StoreFPR (fd, fmt, AbsoluteValue (ValueFPR (fs, fmt)))

Exceptions:

Coprocessor unusable exception

Coprocessor exception trap

Coprocessor Exceptions:

Unimplemented operation exception

Invalid operation exception


ADD.fmt    Floating-Point Add

Bits:    31∼26    25∼21   20∼16   15∼11   10∼6   5∼0
Field:   COP1     fmt     ft      fs      fd     ADD
Value:   010001   -       -       -       -      000000
Width:   6        5       5       5       5      6

Format:

ADD.fmt fd, fs, ft

Description:

The contents of the FPU registers specified by fs and ft are interpreted in the specified
format and arithmetically added. The result is rounded as if calculated to infinite precision
and then rounded to the specified format (fmt), according to the current rounding mode. The
result is placed in the floating-point register (FPR) specified by fd.

This instruction is valid only for single- and double-precision floating-point formats. The
operation is not defined if bit 0 of any register specification is set and the FR bit in the Status
register equals zero, since the register numbers specify an even-odd pair of adjacent
coprocessor general registers. When the FR bit in the Status register equals one, both even
and odd register numbers are valid.

Operation:

32, 64  T:  StoreFPR (fd, fmt, ValueFPR (fs, fmt) + ValueFPR (ft, fmt))

Exceptions:

Coprocessor unusable exception

Floating-Point exception

Coprocessor Exceptions:

Unimplemented operation exception

Invalid operation exception

Inexact exception

Overflow exception

Underflow exception


BC1F    Branch On FPU False (coprocessor 1)

Bits:    31∼26    25∼21   20∼16   15∼0
Field:   COP1     BC      BCF     offset
Value:   010001   01000   00000   -
Width:   6        5       5       16

Format:

BC1F offset

Description:

A branch target address is computed from the sum of the address of the instruction in the
delay slot and the 16-bit offset, shifted left two bits and sign-extended. If the result of the
last floating-point compare is false (zero), the program branches to the target address, with a
delay of one instruction. There must be at least one instruction between C.cond.fmt and
BC1F.

Operation:

32  T − 1:  condition ← not COC[1]
    T:      target ← (offset15)^14 || offset || 0^2
    T + 1:  if condition then
                PC ← PC + target
            endif

64  T − 1:  condition ← not COC[1]
    T:      target ← (offset15)^46 || offset || 0^2
    T + 1:  if condition then
                PC ← PC + target
            endif

Exceptions:

Coprocessor unusable exception
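The target computation shared by the FPU branch instructions can be sketched in C as below. The function names and the `branch_on_true` flag are hypothetical; the sketch only models the sign extension, the two-bit left shift and the addition to the delay-slot address, and does not model pipeline timing or the nullification performed by the "likely" variants.

```c
#include <assert.h>
#include <stdint.h>

/* target = 16-bit offset, sign-extended and shifted left two bits. */
static int64_t bc1_target(uint16_t offset)
{
    int64_t sext = (int64_t)(offset ^ 0x8000u) - 0x8000;  /* sign-extend */
    return sext * 4;                                      /* append 0^2  */
}

/* delay_slot_pc:  address of the instruction in the branch delay slot.
   coc1:           coprocessor 1 condition signal, COC[1].
   branch_on_true: 0 models BC1F, 1 models BC1T.
   Returns the address fetched after the delay slot instruction. */
static uint64_t bc1_next_pc(uint64_t delay_slot_pc, uint16_t offset,
                            int coc1, int branch_on_true)
{
    int condition = branch_on_true ? (coc1 != 0) : (coc1 == 0);
    if (condition)
        return delay_slot_pc + (uint64_t)bc1_target(offset);
    return delay_slot_pc + 4;
}
```

An offset field of 0xFFFF, for instance, yields a target of -4, so a taken branch continues at the branch instruction itself.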


BC1FL    Branch On FPU False Likely (coprocessor 1)

Bits:    31∼26    25∼21   20∼16   15∼0
Field:   COP1     BC      BCFL    offset
Value:   010001   01000   00010   -
Width:   6        5       5       16

Format:

BC1FL offset

Description:

A branch target address is computed from the sum of the address of the instruction in the

delay slot and the 16-bit offset, shifted left two bits and sign-extended.

If the result of the last floating-point compare is false (zero), the program branches to the
target address, with a delay of one instruction. If the conditional branch is not taken, the
instruction in the branch delay slot is nullified. There must be at least one instruction
between C.cond.fmt and BC1FL.

Operation:

32  T − 1:  condition ← not COC[1]
    T:      target ← (offset15)^14 || offset || 0^2
    T + 1:  if condition then
                PC ← PC + target
            else
                NullifyCurrentInstruction
            endif

64  T − 1:  condition ← not COC[1]
    T:      target ← (offset15)^46 || offset || 0^2
    T + 1:  if condition then
                PC ← PC + target
            else
                NullifyCurrentInstruction
            endif

Exceptions:

Coprocessor unusable exception


BC1T    Branch On FPU True (coprocessor 1)

Bits:    31∼26    25∼21   20∼16   15∼0
Field:   COP1     BC      BCT     offset
Value:   010001   01000   00001   -
Width:   6        5       5       16

Format:

BC1T offset

Description:

A branch target address is computed from the sum of the address of the instruction in the
delay slot and the 16-bit offset, shifted left two bits and sign-extended. If the result of the
last floating-point compare is true (one), the program branches to the target address, with a
delay of one instruction. There must be at least one instruction between C.cond.fmt and
BC1T.

Operation:

32  T − 1:  condition ← COC[1]
    T:      target ← (offset15)^14 || offset || 0^2
    T + 1:  if condition then
                PC ← PC + target
            endif

64  T − 1:  condition ← COC[1]
    T:      target ← (offset15)^46 || offset || 0^2
    T + 1:  if condition then
                PC ← PC + target
            endif

Exceptions:

Coprocessor unusable exception


BC1TL    Branch On FPU True Likely (coprocessor 1)

Bits:    31∼26    25∼21   20∼16   15∼0
Field:   COP1     BC      BCTL    offset
Value:   010001   01000   00011   -
Width:   6        5       5       16

Format:

BC1TL offset

Description:

A branch target address is computed from the sum of the address of the instruction in the

delay slot and the 16-bit offset, shifted left two bits and sign-extended.

If the result of the last floating-point compare is true (one), the program branches to the
target address, with a delay of one instruction. If the conditional branch is not taken, the
instruction in the branch delay slot is nullified. There must be at least one instruction
between C.cond.fmt and BC1TL.

Operation:

32  T − 1:  condition ← COC[1]
    T:      target ← (offset15)^14 || offset || 0^2
    T + 1:  if condition then
                PC ← PC + target
            else
                NullifyCurrentInstruction
            endif

64  T − 1:  condition ← COC[1]
    T:      target ← (offset15)^46 || offset || 0^2
    T + 1:  if condition then
                PC ← PC + target
            else
                NullifyCurrentInstruction
            endif

Exceptions:

Coprocessor unusable exception


C.cond.fmt    Floating-Point Compare

Bits:    31∼26    25∼21   20∼16   15∼11   10∼6    5∼4   3∼0
Field:   COP1     fmt     ft      fs      0       FC*   cond*
Value:   010001   -       -       -       00000   -     -
Width:   6        5       5       5       5       2     4

Format:

C.cond.fmt fs, ft

Description:

The contents of the floating-point registers specified by fs and ft are interpreted in the

specified format and arithmetically compared.

A result is determined based on the comparison and the conditions specified in the

instruction. If one of the values is a Not a Number (NaN), and the high-order bit of the

condition field is set, an invalid operation exception is taken. After a one-instruction delay,

the condition is available for testing with branch on floating-point coprocessor condition

instructions. There must be at least one instruction between the compare and the branch.

Comparisons are exact and can neither overflow nor underflow. Four mutually exclusive
relations are possible results: less than, equal, greater than, and unordered. The last case
arises when one or both of the operands are NaN; every NaN compares unordered with
everything, including itself. Comparisons ignore the sign of zero, so +0 = −0.

This instruction is valid only for single- and double-precision floating-point formats. The
operation is not defined if bit 0 of any register specification is set and the FR bit in the Status
register equals zero, since the register numbers specify an even-odd pair of adjacent
coprocessor general registers. When the FR bit in the Status register equals one, both even
and odd register numbers are valid.

*See “FPU Instruction Opcode Bit Encoding” at the end of Appendix B.


Operation:

32, 64  T:  if NaN (ValueFPR (fs, fmt)) or NaN (ValueFPR (ft, fmt)) then
                less ← false
                equal ← false
                unordered ← true
                if cond3 then
                    signal InvalidOperationException
                endif
            else
                less ← ValueFPR (fs, fmt) < ValueFPR (ft, fmt)
                equal ← ValueFPR (fs, fmt) = ValueFPR (ft, fmt)
                unordered ← false
            endif
            condition ← (cond2 and less) or (cond1 and equal) or
                        (cond0 and unordered)
            FCR[31]23 ← condition
            COC[1] ← condition
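The predicate evaluation above can be written as a small C sketch for double-precision operands. The struct and function names are hypothetical, and the sketch reports the invalid operation condition as a flag instead of raising an exception.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

typedef struct {
    bool condition;  /* value written to FCR[31] bit 23 and COC[1] */
    bool invalid;    /* true when the Invalid operation would be signaled */
} cmp_result_t;

/* cond is the 4-bit condition field: bit 3 requests the invalid
   operation signal on NaN; bits 2, 1, 0 select less, equal, unordered. */
static cmp_result_t fp_compare(double fs, double ft, unsigned cond)
{
    bool less, equal, unordered;
    cmp_result_t r = { false, false };

    if (isnan(fs) || isnan(ft)) {
        less = false;
        equal = false;
        unordered = true;
        r.invalid = (cond & 8u) != 0;
    } else {
        less = fs < ft;
        equal = fs == ft;
        unordered = false;
    }
    r.condition = ((cond & 4u) && less) ||
                  ((cond & 2u) && equal) ||
                  ((cond & 1u) && unordered);
    return r;
}
```

Because the host comparison `fs == ft` treats +0 and −0 as equal, the sketch matches the rule that comparisons ignore the sign of zero.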

Exceptions:

Coprocessor unusable

Floating-Point exception

Coprocessor Exceptions:

Unimplemented operation exception

Invalid operation exception


CEIL.L.fmt    Floating-Point Ceiling to Long Fixed-Point Format

Bits:    31∼26    25∼21   20∼16   15∼11   10∼6   5∼0
Field:   COP1     fmt     0       fs      fd     CEIL.L
Value:   010001   -       00000   -       -      001010
Width:   6        5       5       5       5      6

Format:

CEIL.L.fmt fd, fs

Description:

The contents of the floating-point register specified by fs are interpreted in the specified
source format, fmt, and arithmetically converted to the long fixed-point format. The result is
placed in the floating-point register specified by fd.

Regardless of the setting of the current rounding mode, the conversion is rounded as if the
current rounding mode is round to +∞ (RM = 2).

This instruction is valid only for conversion from single-, double-, extended- or quad-
precision floating-point formats. If extended- or quad-precision format is specified, the
operation is not defined if bit 0 of the source register specification is set, since the register
number specifies an aligned coprocessor general register. When the FR bit in the Status
register equals one, both even and odd register numbers are valid.

When the source operand is an Infinity or NaN, or the correctly rounded integer result is
outside of −2^63 to 2^63−1, the Invalid operation exception is raised. If the Invalid operation is
not enabled, then no exception is taken and 2^63−1 is returned.

This instruction is not implemented on MIPS I or MIPS II processors, and will cause an
unimplemented operation exception to occur.

Operation:

32, 64 T: StoreFPR (fd, L, ConvertFmt (ValueFPR (fs, fmt), fmt, L))

Exceptions:

Coprocessor unusable exception

Floating-Point exception

Reserved Instruction exception (in the 32 bit user or 32 bit supervisor mode)

Coprocessor Exceptions:

Invalid operation exception

Unimplemented operation exception

Inexact exception

Overflow exception


CEIL.W.fmt    Floating-Point Ceiling to Single Fixed-Point Format

Bits:    31∼26    25∼21   20∼16   15∼11   10∼6   5∼0
Field:   COP1     fmt     0       fs      fd     CEIL.W
Value:   010001   -       00000   -       -      001110
Width:   6        5       5       5       5      6

Format:

CEIL.W.fmt fd, fs

Description:

The contents of the floating-point register specified by fs are interpreted in the specified
source format, fmt, and arithmetically converted to the single fixed-point format. The result
is placed in the floating-point register specified by fd.

Regardless of the setting of the current rounding mode, the conversion is rounded as if the
current rounding mode is round to +∞ (RM = 2).

This instruction is valid only for conversion from single- or double-precision floating-
point formats. The operation is not defined if bit 0 of any register specification is set and the
FR bit in the Status register equals zero, since the register numbers specify an even-odd pair
of adjacent coprocessor general registers. When the FR bit in the Status register equals one,
both even and odd register numbers are valid.

When the source operand is an Infinity or NaN, or the correctly rounded integer result is
outside of −2^31 to 2^31−1, the Invalid operation exception is raised. If the Invalid operation is
not enabled, then no exception is taken and 2^31−1 is returned.

Operation:

32, 64 T: StoreFPR (fd, W, ConvertFmt (ValueFPR (fs, fmt), fmt, W))
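The range rule described for CEIL.W can be illustrated with the following C sketch for a double-precision source. The function name and the `invalid` output flag are hypothetical; the sketch returns 2^31−1 whenever the Invalid operation condition applies, which corresponds to the case where the exception is not enabled.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <stdint.h>

/* Round fs toward +infinity into a 32-bit fixed-point (W) result. */
static int32_t ceil_w(double fs, bool *invalid)
{
    *invalid = true;
    /* NaN, infinities and huge magnitudes cannot produce a valid result. */
    if (isnan(fs) || fs <= -9.0e18 || fs >= 9.0e18)
        return INT32_MAX;

    int64_t t = (int64_t)fs;   /* truncates toward zero */
    if ((double)t < fs)
        t += 1;                /* move up to the ceiling */

    if (t < INT32_MIN || t > INT32_MAX)
        return INT32_MAX;      /* outside -2^31 .. 2^31 - 1 */

    *invalid = false;
    return (int32_t)t;
}
```

Note that the check is applied to the rounded result, so an input of −2147483648.5 is still valid because its ceiling is exactly −2^31.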

Exceptions:

Coprocessor unusable exception

Floating-Point exception

Coprocessor Exceptions:

Invalid operation exception

Unimplemented operation exception

Inexact exception

Overflow exception


CFC1    Move Control Word From FPU (coprocessor 1)

Bits:    31∼26    25∼21   20∼16   15∼11   10∼0
Field:   COP1     CF      rt      fs      0
Value:   010001   00010   -       -       000 0000 0000
Width:   6        5       5       5       11

Format:

CFC1 rt, fs

Description:

The contents of the FPU’s control register fs are loaded into general register rt.

This operation is only defined when fs equals 0 or 31.

The contents of general register rt are undefined for the instruction immediately following

CFC1.

Operation:

32  T:      temp ← FCR[fs]
    T + 1:  GPR[rt] ← temp

64  T:      temp ← FCR[fs]
    T + 1:  GPR[rt] ← (temp31)^32 || temp

Exceptions:

Coprocessor unusable exception


CTC1    Move Control Word To FPU (coprocessor 1)

Bits:    31∼26    25∼21   20∼16   15∼11   10∼0
Field:   COP1     CT      rt      fs      0
Value:   010001   00110   -       -       000 0000 0000
Width:   6        5       5       5       11

Format:

CTC1 rt, fs

Description:

The contents of general register rt are loaded into the FPU’s control register fs. This
operation is only defined when fs equals 0 or 31. Writing to Control Register 31, the
floating-point Control/Status register, causes an interrupt or exception if any cause bit and its
corresponding enable bit are both set. The register will be written before the exception
occurs. The contents of floating-point control register fs are undefined for the instruction
immediately following CTC1.

Operation:

32  T:      temp ← GPR[rt]
    T + 1:  FCR[fs] ← temp
            COC[1] ← FCR[31]23

64  T:      temp ← GPR[rt]31∼0
    T + 1:  FCR[fs] ← temp
            COC[1] ← FCR[31]23

Exceptions:

Coprocessor unusable exception

Floating-Point exception

Coprocessor Exceptions:

Unimplemented operation exception

Invalid operation exception

Division by zero exception

Inexact exception

Overflow exception

Underflow exception


CVT.D.fmt    Floating-Point Convert to Double Floating-Point Format

Bits:    31∼26    25∼21   20∼16   15∼11   10∼6   5∼0
Field:   COP1     fmt     0       fs      fd     CVT.D
Value:   010001   -       00000   -       -      100001
Width:   6        5       5       5       5      6

Format:

CVT.D.fmt fd, fs

Description:

The contents of the floating-point register specified by fs are interpreted in the specified
source format, fmt, and arithmetically converted to the double binary floating-point format.
The result is placed in the floating-point register specified by fd.

This instruction is valid only for conversions from single floating-point format, or from
32-bit or 64-bit fixed-point format.

If the single floating-point or single fixed-point format is specified, the operation is exact.

The operation is not defined if bit 0 of any register specification is set and the FR bit in the

Status register equals zero, since the register numbers specify an even-odd pair of adjacent

coprocessor general registers. When the FR bit in the Status register equals one, both even

and odd register numbers are valid.

Operation:

32, 64  T:  StoreFPR (fd, D, ConvertFmt (ValueFPR (fs, fmt), fmt, D))

Exceptions:

Coprocessor unusable exception

Floating-Point exception

Coprocessor Exceptions:

Invalid operation exception

Unimplemented operation exception

Inexact exception

Overflow exception

Underflow exception


CVT.L.fmt    Floating-Point Convert to Long Fixed-Point Format

Bits:    31∼26    25∼21   20∼16   15∼11   10∼6   5∼0
Field:   COP1     fmt     0       fs      fd     CVT.L
Value:   010001   -       00000   -       -      100101
Width:   6        5       5       5       5      6

Format:

CVT.L.fmt fd, fs

Description:

The contents of the floating-point register specified by fs are interpreted in the specified
source format, fmt, and arithmetically converted to the long fixed-point format. The result is
placed in the floating-point register specified by fd.

This instruction is valid only for conversions from single-, double-, extended- or quad-
precision floating-point formats. If extended- or quad-precision format is specified, the
operation is not defined if bit 0 of the source register specification is set, since the register
number specifies an aligned coprocessor general register.

When the source operand is an Infinity or NaN, or the correctly rounded integer result is
outside of −2^63 to 2^63−1, the Invalid operation exception is raised. If the Invalid operation is
not enabled, then no exception is taken and 2^63−1 is returned.

This instruction is not implemented on MIPS I or MIPS II processors, and will cause an
unimplemented operation exception to occur.

The operation is not defined if bit 0 of any register specification is set and the FR bit in the
Status register equals zero.

Operation:

32, 64  T:  StoreFPR (fd, L, ConvertFmt (ValueFPR (fs, fmt), fmt, L))

Exceptions:

Coprocessor unusable exception

Floating-Point exception

Reserved Instruction exception (in the 32 bit user or 32 bit supervisor mode)

Coprocessor Exceptions:

Invalid operation exception

Unimplemented operation exception

Inexact exception

Overflow exception


CVT.S.fmt    Floating-Point Convert to Single Floating-Point Format

Bits:    31∼26    25∼21   20∼16   15∼11   10∼6   5∼0
Field:   COP1     fmt     0       fs      fd     CVT.S
Value:   010001   -       00000   -       -      100000
Width:   6        5       5       5       5      6

Format:

CVT.S.fmt fd, fs

Description:

The contents of the floating-point register specified by fs are interpreted in the specified

source format, fmt, and arithmetically converted to the single binary floating-point format.

The result is placed in the floating-point register specified by fd. Rounding occurs according

to the currently specified rounding mode.

This instruction is valid only for conversions from double floating-point format, or from 32-

bit or 64-bit fixed-point format. The operation is not defined if bit 0 of any register

specification is set and the FR bit in the Status register equals zero, since the register

numbers specify an even-odd pair of adjacent coprocessor general registers. When the FR bit

in the Status register equals one, both even and odd register numbers are valid.

Operation:

32, 64  T:  StoreFPR (fd, S, ConvertFmt (ValueFPR (fs, fmt), fmt, S))

Exceptions:

Coprocessor unusable exception

Floating-Point exception

Coprocessor Exceptions:

Invalid operation exception

Unimplemented operation exception

Inexact exception

Overflow exception

Underflow exception


CVT.W.fmt    Floating-Point Convert to Fixed-Point Format

Bits:    31∼26    25∼21   20∼16   15∼11   10∼6   5∼0
Field:   COP1     fmt     0       fs      fd     CVT.W
Value:   010001   -       00000   -       -      100100
Width:   6        5       5       5       5      6

Format:

CVT.W.fmt fd, fs

Description:

The contents of the floating-point register specified by fs are interpreted in the specified
source format, fmt, and arithmetically converted to the single fixed-point format. The result
is placed in the floating-point register specified by fd.

This instruction is valid only for conversion from single- or double-precision floating-
point formats. The operation is not defined if bit 0 of any register specification is set and the
FR bit in the Status register equals zero, since the register numbers specify an even-odd pair
of adjacent coprocessor general registers. When the FR bit in the Status register equals one,
both even and odd register numbers are valid.

When the source operand is an Infinity or NaN, or the correctly rounded integer result is
outside of −2^31 to 2^31−1, an Invalid operation exception is raised. If Invalid operation is not
enabled, then no exception is taken and 2^31−1 is returned.

Operation:

32, 64  T:  StoreFPR (fd, W, ConvertFmt (ValueFPR (fs, fmt), fmt, W))

Exceptions:

Coprocessor unusable exception

Floating-Point exception

Coprocessor Exceptions:

Invalid operation exception

Unimplemented operation exception

Inexact exception

Overflow exception


DIV.fmt    Floating-Point Divide

Bits:    31∼26    25∼21   20∼16   15∼11   10∼6   5∼0
Field:   COP1     fmt     ft      fs      fd     DIV
Value:   010001   -       -       -       -      000011
Width:   6        5       5       5       5      6

Format:

DIV.fmt fd, fs, ft

Description:

The contents of the floating-point registers specified by fs and ft are interpreted in the

specified format and the value in fs is divided by the value in ft. The result is rounded as if

calculated to infinite precision and then rounded to the specified format, according to the

current rounding mode. The result is placed in the floating-point register specified by fd.

This instruction is valid for only single or double precision floating-point formats.

The operation is not defined if bit 0 of any register specification is set and the FR bit in the

Status register equals zero, since the register numbers specify an even-odd pair of adjacent

coprocessor general registers. When the FR bit in the Status register equals one, both even

and odd register numbers are valid.

Operation:

32, 64 T: StoreFPR (fd, fmt, ValueFPR(fs, fmt)/ValueFPR(ft, fmt))

Exceptions:

Coprocessor unusable exception

Floating-Point exception

Coprocessor Exceptions:

Unimplemented operation exception

Invalid operation exception

Division-by-zero exception

Inexact exception

Overflow exception

Underflow exception


DMFC1    Doubleword Move From Floating-Point Coprocessor

Bits:    31∼26    25∼21   20∼16   15∼11   10∼0
Field:   COP1     DMF     rt      fs      0
Value:   010001   00001   -       -       000 0000 0000
Width:   6        5       5       5       11

Format:

DMFC1 rt, fs

Description:

The contents of register fs from the floating-point coprocessor are stored into processor
register rt.

The contents of general register rt are undefined for the instruction immediately following
DMFC1.

The FR bit in the Status register specifies whether all 32 registers of the TX49 are
addressable. When FR is clear, this instruction is not defined when the least significant bit
of fs is non-zero. When FR is set, fs may specify either odd or even registers.

Operation:

64  T:      if SR26 = 1 then  /* 64-bit wide FGRs */
                data ← FGR[fs]
            elseif fs0 = 0 then  /* valid specifier, 32-bit wide FGRs */
                data ← FGR[fs + 1] || FGR[fs]
            else  /* undefined for odd 32-bit reg #s */
                data ← undefined^64
            endif
    T + 1:  GPR[rt] ← data

Note: The operation is the same in the 32-bit kernel mode.

Exceptions:

Coprocessor unusable exception

Floating-Point exception

Reserved Instruction exception (in the 32 bit user or 32 bit supervisor mode)

Coprocessor Exceptions:

Unimplemented operation exception


DMTC1    Doubleword Move To Floating-Point Coprocessor

Bits:    31∼26    25∼21   20∼16   15∼11   10∼0
Field:   COP1     DMT     rt      fs      0
Value:   010001   00101   -       -       000 0000 0000
Width:   6        5       5       5       11

Format:

DMTC1 rt, fs

Description:

The contents of general register rt are loaded into coprocessor register fs of the CP1.

The contents of floating-point register fs are undefined for the instruction immediately
following DMTC1.

The FR bit in the Status register specifies whether all 32 registers of the TX49 are
addressable. When FR equals zero, this instruction is not defined when the least significant
bit of fs is non-zero. When FR equals one, fs may specify either odd or even registers.

Operation:

64  T:      data ← GPR[rt]
    T + 1:  if SR26 = 1 then  /* 64-bit wide FGRs */
                FGR[fs] ← data
            elseif fs0 = 0 then  /* valid specifier, 32-bit wide valid FGRs */
                FGR[fs + 1] ← data63∼32
                FGR[fs] ← data31∼0
            else  /* undefined result for odd 32-bit reg #s */
                undefined_result
            endif

Exceptions:

Coprocessor unusable exception

Floating-Point exception

Reserved Instruction exception (in the 32 bit user or 32 bit supervisor mode)

Coprocessor Exceptions:

Unimplemented operation exception


FLOOR.L.fmt    Floating-Point Floor to Long Fixed-Point Format

Bits:    31∼26    25∼21   20∼16   15∼11   10∼6   5∼0
Field:   COP1     fmt     0       fs      fd     FLOOR.L
Value:   010001   -       00000   -       -      001011
Width:   6        5       5       5       5      6

Format:

FLOOR.L.fmt fd, fs

Description:

The contents of the floating-point register specified by fs are interpreted in the specified
source format, fmt, and arithmetically converted to the long fixed-point format. The result is
placed in the floating-point register specified by fd.

Regardless of the setting of the current rounding mode, the conversion is rounded as if the
current rounding mode is round to −∞ (RM = 3).

This instruction is valid only for conversion from single-, double-, extended- or quad-
precision floating-point formats. If extended- or quad-precision format is specified, the
operation is not defined if bit 0 of the source register specification is set, since the register
number specifies an aligned coprocessor general register.

When the source operand is an Infinity or NaN, or the correctly rounded integer result is
outside of −2^63 to 2^63−1, the Invalid operation exception is raised. If the Invalid operation is
not enabled, then no exception is taken and 2^63−1 is returned.

This instruction is not implemented on MIPS I or MIPS II processors, and will cause an
unimplemented operation exception to occur.

Operation:

32, 64  T:  StoreFPR (fd, L, ConvertFmt (ValueFPR (fs, fmt), fmt, L))

Exceptions:

Coprocessor unusable exception

Floating-Point exception

Reserved Instruction exception (in the 32 bit user or 32 bit supervisor mode)

Coprocessor Exceptions:

Invalid operation exception

Unimplemented operation exception

Inexact exception

Overflow exception


FLOOR.W.fmt    Floating-Point Floor to Single Fixed-Point Format

Bits:    31∼26    25∼21   20∼16   15∼11   10∼6   5∼0
Field:   COP1     fmt     0       fs      fd     FLOOR.W
Value:   010001   -       00000   -       -      001111
Width:   6        5       5       5       5      6

Format:

FLOOR.W.fmt fd, fs

Description:

The contents of the floating-point register specified by fs are interpreted in the specified
source format, fmt, and arithmetically converted to the single fixed-point format. The result
is placed in the floating-point register specified by fd.

Regardless of the setting of the current rounding mode, the conversion is rounded as if the
current rounding mode is round to −∞ (RM = 3).

This instruction is valid only for conversion from single- or double-precision floating-
point formats. The operation is not defined if bit 0 of any register specification is set and the
FR bit in the Status register equals zero, since the register numbers specify an even-odd pair
of adjacent coprocessor general registers. When the FR bit in the Status register equals one,
both even and odd register numbers are valid.

When the source operand is an Infinity or NaN, or the correctly rounded integer result is
outside of −2^31 to 2^31−1, an Invalid operation exception is raised. If Invalid operation is not
enabled, then no exception is taken and 2^31−1 is returned.

Operation:

32, 64  T:  StoreFPR (fd, W, ConvertFmt (ValueFPR (fs, fmt), fmt, W))

Exceptions:

Coprocessor unusable exception

Floating-Point exception

Coprocessor Exceptions:

Invalid operation exception

Unimplemented operation exception

Inexact exception

Overflow exception


LDC1    Load Doubleword to FPU (coprocessor 1)

Bits:    31∼26    25∼21   20∼16   15∼0
Field:   LDC1     base    ft      offset
Value:   110101   -       -       -
Width:   6        5       5       16

Format:

LDC1 ft, offset (base)

Description:

The 16-bit offset is sign-extended and added to the contents of general register base to form
an unsigned effective address. In 32-bit mode, the contents of the doubleword at the memory
location specified by the effective address are loaded into registers ft and ft+1 of the
floating-point coprocessor. This instruction is not valid, and is undefined, when the least
significant bit of ft is non-zero. In 64-bit mode, the contents of the doubleword at the memory
location specified by the effective address are loaded into the 64-bit register ft of the
floating-point coprocessor. The FR bit of the Status register (SR26) specifies whether all 32
registers of the TX49 are addressable. When FR = 0, this instruction is not defined when the
least significant bit of ft is non-zero. When FR = 1, ft may specify either odd or even registers.

If any of the three least-significant bits of the effective address are non-zero, an address
error exception takes place.
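The address formation and the two validity rules for LDC1 can be sketched in C as follows. All three function names are hypothetical, and the sketch covers only the checks stated above, not address translation or the memory access itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Effective address = GPR[base] + sign-extended 16-bit offset. */
static uint64_t effective_address(uint64_t base, uint16_t offset)
{
    int64_t sext = (int64_t)(offset ^ 0x8000u) - 0x8000;  /* sign-extend */
    return base + (uint64_t)sext;
}

/* A doubleword access faults when any of the three low address bits is set. */
static bool ldc1_address_error(uint64_t vaddr)
{
    return (vaddr & 7u) != 0;
}

/* With FR = 0 only even ft values are defined; with FR = 1 any ft is. */
static bool ldc1_ft_defined(unsigned ft, int fr)
{
    return fr != 0 || (ft & 1u) == 0;
}
```

For example, a base of 0x1010 with an offset field of 0xFFF8 (−8) gives the doubleword-aligned address 0x1008, while 0x100C would raise an address error.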


Operation:

32  T:  vAddr ← ((offset15)^16 || offset15∼0) + GPR[base]
        (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
        data ← LoadMemory (uncached, DOUBLEWORD, pAddr, vAddr, DATA)
        if SR26 = 1 then  /* 64-bit wide FGRs */
            FGR[ft] ← data
        elseif ft0 = 0 then  /* valid specifier, 32-bit wide FGRs */
            FGR[ft + 1] ← data63∼32
            FGR[ft] ← data31∼0
        else  /* undefined result if odd */
            undefined_result
        endif

64  T:  vAddr ← ((offset15)^48 || offset15∼0) + GPR[base]
        (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
        data ← LoadMemory (uncached, DOUBLEWORD, pAddr, vAddr, DATA)
        if SR26 = 1 then  /* 64-bit wide FGRs */
            FGR[ft] ← data
        elseif ft0 = 0 then  /* valid specifier, 32-bit wide FGRs */
            FGR[ft + 1] ← data63∼32
            FGR[ft] ← data31∼0
        else  /* undefined result if odd */
            undefined_result
        endif

Exceptions:

Coprocessor unusable

TLB refill exception

TLB invalid exception

Bus error exception

Address error exception
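
The LDC1 operation above can be sketched in C as follows. This is only an illustrative model, not Toshiba's implementation: the `fpu_model` type and the function names are hypothetical, and address translation, caching and exception delivery are not modeled (the function simply returns false where the text describes an address error or an undefined result).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register-file model: 32 FGRs stored as 64-bit values.
 * With FR = 0 only the low 32 bits of each entry are meaningful. */
typedef struct {
    uint64_t fgr[32];
    bool fr;                /* models Status register bit 26 (FR) */
} fpu_model;

/* Sign-extend the 16-bit offset and add it to the base register value. */
uint64_t effective_address(uint64_t base, uint16_t offset)
{
    return base + (uint64_t)(int64_t)(int16_t)offset;
}

/* Write a loaded doubleword into the register file as LDC1 describes.
 * Returns false for a misaligned address or an undefined odd ft with FR = 0. */
bool ldc1_write(fpu_model *fpu, unsigned ft, uint64_t vaddr, uint64_t data)
{
    if (vaddr & 0x7u)
        return false;                           /* address error */
    if (fpu->fr) {
        fpu->fgr[ft] = data;                    /* 64-bit wide FGRs */
    } else if ((ft & 1u) == 0) {
        fpu->fgr[ft + 1] = data >> 32;          /* data63..32 */
        fpu->fgr[ft]     = data & 0xFFFFFFFFu;  /* data31..0 */
    } else {
        return false;                           /* undefined result */
    }
    return true;
}
```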

LWC1 Load Word to FPU (coprocessor 1) LWC1

Bits 31∼26 (6 bits): LWC1 (110001)
Bits 25∼21 (5 bits): base
Bits 20∼16 (5 bits): ft
Bits 15∼0 (16 bits): offset

Format:

LWC1 ft, offset (base)

Description:

The 16-bit offset is sign-extended and added to the contents of general register base to form an unsigned effective address. The contents of the word at the memory location specified by the effective address are loaded into register ft of the floating-point coprocessor.

The FR bit of the Status register specifies whether all 64-bit Floating-Point Registers are addressable. If FR equals zero, LWC1 loads either the high or low half of the 16 even Floating-Point Registers. If FR equals one, LWC1 loads the low 32 bits of both even and odd Floating-Point Registers.

If either of the two least-significant bits of the effective address is non-zero, an address error exception occurs.

Operation:

32 T: vAddr ← ((offset15)^16 || offset15∼0) + GPR[base]
      (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
      pAddr ← pAddrPSIZE−1∼3 || (pAddr2∼0 xor (ReverseEndian || 0^2))
      mem ← LoadMemory (uncached, WORD, pAddr, vAddr, DATA)
      byte ← vAddr2∼0 xor (BigEndianCPU || 0^2)
      /* "mem" is aligned 64 bits from memory. Pick out correct bytes. */
      if SR26 = 1 then /* 64-bit wide FGRs */
          FGR[ft] ← undefined^32 || mem31+8*byte∼8*byte
      else /* 32-bit wide FGRs */
          FGR[ft] ← mem31+8*byte∼8*byte
      endif

64 T: vAddr ← ((offset15)^48 || offset15∼0) + GPR[base]
      (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
      pAddr ← pAddrPSIZE−1∼3 || (pAddr2∼0 xor (ReverseEndian || 0^2))
      mem ← LoadMemory (uncached, WORD, pAddr, vAddr, DATA)
      byte ← vAddr2∼0 xor (BigEndianCPU || 0^2)
      /* "mem" is aligned 64 bits from memory. Pick out correct bytes. */
      if SR26 = 1 then /* 64-bit wide FGRs */
          FGR[ft] ← undefined^32 || mem31+8*byte∼8*byte
      else /* 32-bit wide FGRs */
          FGR[ft] ← mem31+8*byte∼8*byte
      endif

Exceptions:

Coprocessor unusable

TLB-refill exception

TLB invalid exception

Bus error exception

Address error exception
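
The byte-lane selection in the LWC1 operation can be illustrated with a short C sketch. It assumes `mem` is the aligned 64-bit doubleword returned by the memory access and that the address is already word-aligned; the function name `lwc1_select_word` is hypothetical and nothing here models translation or exceptions.

```c
#include <stdbool.h>
#include <stdint.h>

/* byte <- vAddr[2..0] xor (BigEndianCPU || 00); the loaded word is
 * mem[31 + 8*byte .. 8*byte]. For a word-aligned address, byte is 0 or 4. */
uint32_t lwc1_select_word(uint64_t mem, uint64_t vaddr, bool big_endian_cpu)
{
    unsigned byte = (unsigned)(vaddr & 0x7u) ^ (big_endian_cpu ? 4u : 0u);
    return (uint32_t)(mem >> (8u * byte));
}
```

With a little-endian CPU the word at offset 0 comes from the low half of the doubleword; with a big-endian CPU it comes from the high half.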

MFC1 Move From FPU (Coprocessor 1) MFC1

Bits 31∼26 (6 bits): COP1 (010001)
Bits 25∼21 (5 bits): MF (00000)
Bits 20∼16 (5 bits): rt
Bits 15∼11 (5 bits): fs
Bits 10∼0 (11 bits): 0 (000 0000 0000)

Format:

MFC1 rt, fs

Description:

The contents of register fs from the floating-point coprocessor are stored into processor register rt.

The contents of register rt are undefined for time T of the instruction immediately following this load instruction.

The FR bit of the Status register specifies whether all 32 registers of the TX49 are

addressable. If FR equals zero, MFC1 stores either the high or low half of the 16 even

Floating-Point Registers. If FR equals one, MFC1 stores the low 32-bits of both even and odd

Floating-Point Registers.

Operation:

32 T: data ← FGR[fs]31∼0
      T + 1: GPR[rt] ← data

64 T: data ← FGR[fs]31∼0
      T + 1: GPR[rt] ← (data31)^32 || data

Exceptions:

Coprocessor unusable exception
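
In 64-bit mode the MFC1 operation replicates bit 31 of the transferred word into the upper 32 bits of the destination, which is an ordinary sign extension. A minimal C sketch, with the hypothetical function name `mfc1_result_64`:

```c
#include <stdint.h>

/* GPR[rt] <- (data31)^32 || data, where data is FGR[fs] bits 31..0. */
uint64_t mfc1_result_64(uint64_t fgr_fs)
{
    uint32_t data = (uint32_t)fgr_fs;           /* keep bits 31..0 only */
    return (uint64_t)(int64_t)(int32_t)data;    /* sign-extend to 64 bits */
}
```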

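MOV.fmt Floating-Point Move MOV.fmt

Bits 31∼26 (6 bits): COP1 (010001)
Bits 25∼21 (5 bits): fmt
Bits 20∼16 (5 bits): 0 (00000)
Bits 15∼11 (5 bits): fs
Bits 10∼6 (5 bits): fd
Bits 5∼0 (6 bits): MOV (000110)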

Format:

MOV.fmt fd, fs

Description:

The contents of the FPU register specified by fs are interpreted in the specified format and are copied into the FPU register specified by fd. The move operation is non-arithmetic; no IEEE 754 exceptions occur as a result of the instruction.

This instruction is valid only for single- or double-precision floating-point formats.

The operation is not defined if bit 0 of any register specification is set and the FR bit in the

Status register equals zero, since the register numbers specify an even-odd pair of adjacent

coprocessor general registers. When the FR bit in the Status register equals one, both even

and odd register numbers are valid.

Operation:

32, 64 T: StoreFPR (fd, fmt, ValueFPR (fs, fmt))

Exceptions:

Coprocessor unusable exception

Floating-Point exception

Coprocessor Exceptions:

Unimplemented operation exception

MTC1 Move To FPU (Coprocessor 1) MTC1

Bits 31∼26 (6 bits): COP1 (010001)
Bits 25∼21 (5 bits): MT (00100)
Bits 20∼16 (5 bits): rt
Bits 15∼11 (5 bits): fs
Bits 10∼0 (11 bits): 0 (000 0000 0000)

Format:

MTC1 rt, fs

Description:

The contents of register rt are loaded into the FPU's general register at location fs.

The contents of floating-point register fs are undefined for the instruction immediately following MTC1.

The FR bit of the Status register specifies whether all 32 registers of the TX49 are

addressable. If FR equals zero, MTC1 loads either the high or low half of the 16 even

Floating-Point Registers. If FR equals one, MTC1 loads the low 32-bits of both even and odd

Floating-Point Registers.

Operation:

32, 64 T: data ← GPR[rt]31∼0
      T + 1: if SR26 = 1 then /* 64-bit wide FGRs */
                 FGR[fs] ← undefined^32 || data
             else /* 32-bit wide FGRs */
                 FGR[fs] ← data
             endif

Exceptions:

Coprocessor unusable exception

MUL.fmt Floating-Point Multiply MUL.fmt

Bits 31∼26 (6 bits): COP1 (010001)
Bits 25∼21 (5 bits): fmt
Bits 20∼16 (5 bits): ft
Bits 15∼11 (5 bits): fs
Bits 10∼6 (5 bits): fd
Bits 5∼0 (6 bits): MUL (000010)

Format:

MUL.fmt fd, fs, ft

Description:

The contents of the floating-point registers specified by fs and ft are interpreted in the

specified format and arithmetically multiplied. The result is rounded as if calculated to

infinite precision and then rounded to the specified format, according to the current rounding

mode. The result is placed in the floating-point register specified by fd.

This instruction is valid only for single- or double-precision floating-point formats.

The operation is not defined if bit 0 of any register specification is set and the FR bit in the

Status register equals zero, since the register numbers specify an even-odd pair of adjacent

coprocessor general registers. When the FR bit in the Status register equals one, both even

and odd register numbers are valid.

Operation:

32, 64 T: StoreFPR (fd, fmt, ValueFPR (fs, fmt) * ValueFPR (ft, fmt))

Exceptions:

Coprocessor unusable exception

Floating-Point exception

Coprocessor Exceptions:

Unimplemented operation exception

Invalid operation exception

Inexact exception

Overflow exception

Underflow exception

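NEG.fmt Floating-Point Negate NEG.fmt

Bits 31∼26 (6 bits): COP1 (010001)
Bits 25∼21 (5 bits): fmt
Bits 20∼16 (5 bits): 0 (00000)
Bits 15∼11 (5 bits): fs
Bits 10∼6 (5 bits): fd
Bits 5∼0 (6 bits): NEG (000111)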

Format:

NEG.fmt fd, fs

Description:

The contents of the FPU register specified by fs are interpreted in the specified format and the arithmetic negation is taken (the polarity of the sign bit is changed). The result is placed in the FPU register specified by fd.

The negate operation is arithmetic; an NaN operand signals invalid operation.

This instruction is valid only for single- or double-precision floating-point formats. The operation is not defined if bit 0 of any register specification is set and the FR bit in the Status register equals zero, since the register numbers specify an even-odd pair of adjacent coprocessor general registers. When the FR bit in the Status register equals one, both even and odd register numbers are valid.

Operation:

32, 64 T: StoreFPR (fd, fmt, Negate (ValueFPR (fs, fmt)))

Exceptions:

Coprocessor unusable exception

Floating-Point exception

Coprocessor Exceptions:

Unimplemented operation exception

Invalid operation exception

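ROUND.L.fmt Floating-Point Round to Long Fixed-Point Format ROUND.L.fmt

Bits 31∼26 (6 bits): COP1 (010001)
Bits 25∼21 (5 bits): fmt
Bits 20∼16 (5 bits): 0 (00000)
Bits 15∼11 (5 bits): fs
Bits 10∼6 (5 bits): fd
Bits 5∼0 (6 bits): ROUND.L (001000)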

Format:

ROUND.L.fmt fd, fs

Description:

The contents of the floating-point register specified by fs are interpreted in the specified source format, fmt, and arithmetically converted to the long fixed-point format. The result is placed in the floating-point register specified by fd.

Regardless of the setting of the current rounding mode, the conversion is rounded as if the current rounding mode is round to nearest/even (0).

This instruction is valid only for conversion from single-, double-, extended or quad-precision floating-point formats. If extended or quad-precision format is specified, the operation is not defined if bit 0 of the source register specification is set, since the register number specifies an aligned coprocessor general register.

When the source operand is an Infinity or NaN, or the correctly rounded integer result is outside of −2^63 to 2^63−1, the Invalid operation exception is raised. If the Invalid operation exception is not enabled, then no exception is taken and 2^63−1 is returned.

This instruction is not implemented on MIPS I or MIPS II processors, and will cause an unimplemented operation exception to occur.

Operation:

32, 64 T: StoreFPR (fd, L, ConvertFmt (ValueFPR (fs, fmt), fmt, L))

Exceptions:

Coprocessor unusable exception

Floating-Point exception

Reserved Instruction exception (in the 32 bit user or 32 bit supervisor mode)

Coprocessor Exceptions:

Invalid operation exception

Unimplemented operation exception

Inexact exception

Overflow exception

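ROUND.W.fmt Floating-Point Round to Single Fixed-Point Format ROUND.W.fmt

Bits 31∼26 (6 bits): COP1 (010001)
Bits 25∼21 (5 bits): fmt
Bits 20∼16 (5 bits): 0 (00000)
Bits 15∼11 (5 bits): fs
Bits 10∼6 (5 bits): fd
Bits 5∼0 (6 bits): ROUND.W (001100)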

Format:

ROUND.W.fmt fd, fs

Description:

The contents of the floating-point register specified by fs are interpreted in the specified source format, fmt, and arithmetically converted to the single fixed-point format. The result is placed in the floating-point register specified by fd.

Regardless of the setting of the current rounding mode, the conversion is rounded as if the current rounding mode is round to nearest/even (RM = 0).

This instruction is valid only for conversion from single- or double-precision floating-point formats. The operation is not defined if bit 0 of any register specification is set and the FR bit in the Status register equals zero, since the register numbers specify an even-odd pair of adjacent coprocessor general registers. When the FR bit in the Status register equals one, both even and odd register numbers are valid.

When the source operand is an Infinity or NaN, or the correctly rounded integer result is outside of −2^31 to 2^31−1, an Invalid operation exception is raised. If Invalid operation is not enabled, then no exception is taken and 2^31−1 is returned.

Operation:

32, 64 T: StoreFPR (fd, W, ConvertFmt (ValueFPR (fs, fmt), fmt, W))

Exceptions:

Coprocessor unusable exception

Floating-Point exception

Coprocessor Exceptions:

Invalid operation exception

Unimplemented operation exception

Inexact exception

Overflow exception
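
The ROUND.W behaviour described above (round to nearest/even regardless of the rounding mode, with 2^31−1 returned when the Invalid operation exception is not enabled) can be sketched for a double-precision source as follows. This is an illustrative software model, not the hardware algorithm; the function name `round_w_d` and the `invalid` out-parameter (standing in for the exception flag) are assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Convert x to a 32-bit integer, rounding to nearest with ties to even.
 * *invalid is set where the text says Invalid operation is raised. */
int32_t round_w_d(double x, bool *invalid)
{
    /* NaN fails both comparisons, so it is rejected here with infinities. */
    if (!(x > -2147483649.0 && x < 2147483648.0)) {
        *invalid = true;
        return INT32_MAX;               /* 2^31-1 when not enabled */
    }
    int64_t t = (int64_t)x;             /* truncates toward zero */
    double frac = x - (double)t;        /* exact, in (-1, 1) */
    if (frac > 0.5 || (frac == 0.5 && (t & 1)))
        t += 1;
    else if (frac < -0.5 || (frac == -0.5 && (t & 1)))
        t -= 1;
    if (t < INT32_MIN || t > INT32_MAX) {
        *invalid = true;                /* rounded result out of range */
        return INT32_MAX;
    }
    *invalid = false;
    return (int32_t)t;
}
```

The second range check matters for inputs such as 2147483647.5, which pass the first check but round up to 2^31.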

SDC1 Store Doubleword from FPU (coprocessor 1) SDC1

Bits 31∼26 (6 bits): SDC1 (111101)
Bits 25∼21 (5 bits): base
Bits 20∼16 (5 bits): ft
Bits 15∼0 (16 bits): offset

Format:

SDC1 ft, offset (base)

Description:

The 16-bit offset is sign-extended and added to the contents of general register base to form an unsigned effective address.

In 32-bit mode, the contents of registers ft and ft+1 from the floating-point coprocessor are stored at the memory location specified by the effective address. This instruction is not valid, and is undefined, when the least-significant bit of ft is non-zero.

In 64-bit mode, the contents of the 64-bit register ft are stored to the doubleword at the memory location specified by the effective address. The FR bit of the Status register (SR26) specifies whether all 32 registers of the TX49 are addressable. When FR = 0, this instruction is not defined if the least-significant bit of ft is non-zero. If FR = 1, ft may specify either odd or even registers.

If any of the three least-significant bits of the effective address are non-zero, an address error exception takes place.

Operation:

32 T: vAddr ← ((offset15)^16 || offset15∼0) + GPR[base]
      (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
      if SR26 = 1 then /* 64-bit wide FGRs */
          data ← FGR[ft]
      elseif ft0 = 0 then /* valid specifier, 32-bit wide FGRs */
          data ← FGR[ft + 1] || FGR[ft]
      else /* undefined for odd 32-bit reg #s */
          data ← undefined^64
      endif
      StoreMemory (uncached, DOUBLEWORD, data, pAddr, vAddr, DATA)

64 T: vAddr ← ((offset15)^48 || offset15∼0) + GPR[base]
      (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
      if SR26 = 1 then /* 64-bit wide FGRs */
          data ← FGR[ft]
      elseif ft0 = 0 then /* valid specifier, 32-bit wide FGRs */
          data ← FGR[ft + 1] || FGR[ft]
      else /* undefined for odd 32-bit reg #s */
          data ← undefined^64
      endif
      StoreMemory (uncached, DOUBLEWORD, data, pAddr, vAddr, DATA)

Exceptions:

Coprocessor unusable

TLB refill exception

TLB invalid exception

TLB modification exception

Bus error exception

Address error exception

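SQRT.fmt Floating-Point Square Root SQRT.fmt

Bits 31∼26 (6 bits): COP1 (010001)
Bits 25∼21 (5 bits): fmt
Bits 20∼16 (5 bits): 0 (00000)
Bits 15∼11 (5 bits): fs
Bits 10∼6 (5 bits): fd
Bits 5∼0 (6 bits): SQRT (000100)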

Format:

SQRT.fmt fd, fs

Description:

The contents of the floating-point register specified by fs are interpreted in the specified format and the positive arithmetic square root is taken. The result is rounded as if calculated to infinite precision and then rounded to the specified format, according to the current rounding mode. If the value of fs corresponds to −0, the result will be −0. The result is placed in the floating-point register specified by fd.

This instruction is valid only for single- or double-precision floating-point formats.

The operation is not defined if bit 0 of any register specification is set and the FR bit in the

Status register equals zero, since the register numbers specify an even-odd pair of adjacent

coprocessor general registers. When the FR bit in the Status register equals one, both even

and odd register numbers are valid.

Operation:

32, 64 T: StoreFPR (fd, fmt, SquareRoot (ValueFPR (fs, fmt)))

Exceptions:

Coprocessor unusable exception

Floating-Point exception

Coprocessor Exceptions:

Unimplemented operation exception

Invalid operation exception

Inexact exception

SUB.fmt Floating-Point Subtract SUB.fmt

Bits 31∼26 (6 bits): COP1 (010001)
Bits 25∼21 (5 bits): fmt
Bits 20∼16 (5 bits): ft
Bits 15∼11 (5 bits): fs
Bits 10∼6 (5 bits): fd
Bits 5∼0 (6 bits): SUB (000001)

Format:

SUB.fmt fd, fs, ft

Description:

The contents of the floating-point registers specified by fs and ft are interpreted in the specified format and the value in ft is subtracted from the value in fs. The result is rounded as if calculated to infinite precision and then rounded to the specified format, according to the current rounding mode. The result is placed in the floating-point register specified by fd.

This instruction is valid only for single- or double-precision floating-point formats.

The operation is not defined if bit 0 of any register specification is set and the FR bit in the

Status register equals zero, since the register numbers specify an even-odd pair of adjacent

coprocessor general registers. When the FR bit in the Status register equals one, both even

and odd register numbers are valid.

Operation:

32, 64 T: StoreFPR (fd, fmt, ValueFPR (fs, fmt) − ValueFPR (ft, fmt))

Exceptions:

Coprocessor unusable exception

Floating-Point exception

Coprocessor Exceptions:

Unimplemented operation exception

Invalid operation exception

Inexact exception

Overflow exception

Underflow exception

SWC1 Store Word from FPU (coprocessor 1) SWC1

Bits 31∼26 (6 bits): SWC1 (111001)
Bits 25∼21 (5 bits): base
Bits 20∼16 (5 bits): ft
Bits 15∼0 (16 bits): offset

Format:

SWC1 ft, offset (base)

Description:

The 16-bit offset is sign-extended and added to the contents of general register base to form an unsigned effective address. The contents of register ft from the floating-point coprocessor are stored at the memory location specified by the effective address.

The FR bit of the Status register specifies whether all 64-bit Floating-Point Registers are addressable. If FR equals zero, SWC1 stores either the high or low half of the 16 even Floating-Point Registers. If FR equals one, SWC1 stores the low 32 bits of both even and odd Floating-Point Registers.

If either of the two least-significant bits of the effective address is non-zero, an address error exception occurs.

Operation:

32 T: vAddr ← ((offset15)^16 || offset15∼0) + GPR[base]
      (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
      pAddr ← pAddrPSIZE−1∼3 || (pAddr2∼0 xor (ReverseEndian || 0^2))
      byte ← vAddr2∼0 xor (BigEndianCPU || 0^2)
      /* the bytes of the word are put in the correct byte lanes in
       * "data" for a 64-bit path to memory */
      if SR26 = 1 then /* 64-bit wide FGRs */
          data ← FGR[ft]63−8*byte∼0 || 0^(8*byte)
      else /* 32-bit wide FGRs */
          data ← 0^(32−8*byte) || FGR[ft] || 0^(8*byte)
      endif
      StoreMemory (uncached, WORD, data, pAddr, vAddr, DATA)

64 T: vAddr ← ((offset15)^48 || offset15∼0) + GPR[base]
      (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
      pAddr ← pAddrPSIZE−1∼3 || (pAddr2∼0 xor (ReverseEndian || 0^2))
      byte ← vAddr2∼0 xor (BigEndianCPU || 0^2)
      /* the bytes of the word are put in the correct byte lanes in
       * "data" for a 64-bit path to memory */
      if SR26 = 1 then /* 64-bit wide FGRs */
          data ← FGR[ft]63−8*byte∼0 || 0^(8*byte)
      else /* 32-bit wide FGRs */
          data ← 0^(32−8*byte) || FGR[ft] || 0^(8*byte)
      endif
      StoreMemory (uncached, WORD, data, pAddr, vAddr, DATA)

Exceptions:

Coprocessor unusable

TLB refill exception

TLB invalid exception

TLB modification exception

Bus error exception

Address error exception

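TRUNC.L.fmt Floating-Point Truncate to Long Fixed-Point Format TRUNC.L.fmt

Bits 31∼26 (6 bits): COP1 (010001)
Bits 25∼21 (5 bits): fmt
Bits 20∼16 (5 bits): 0 (00000)
Bits 15∼11 (5 bits): fs
Bits 10∼6 (5 bits): fd
Bits 5∼0 (6 bits): TRUNC.L (001001)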

Format:

TRUNC.L.fmt fd, fs

Description:

The contents of the floating-point register specified by fs are interpreted in the specified source format, fmt, and arithmetically converted to the long fixed-point format. The result is placed in the floating-point register specified by fd.

Regardless of the setting of the current rounding mode, the conversion is rounded as if the current rounding mode is round toward zero (1).

This instruction is valid only for conversion from single-, double-, extended or quad-precision floating-point formats. If extended or quad-precision format is specified, the operation is not defined if bit 0 of the source register specification is set, since the register number specifies an aligned coprocessor general register.

When the source operand is an Infinity or NaN, or the correctly rounded integer result is outside of −2^63 to 2^63−1, the Invalid operation exception is raised. If the Invalid operation exception is not enabled, then no exception is taken and 2^63−1 is returned.

This instruction is not implemented on MIPS I or MIPS II processors, and will cause an unimplemented operation exception to occur.

Operation:

32, 64 T: StoreFPR (fd, L, ConvertFmt (ValueFPR (fs, fmt), fmt, L))

Note: The operation is the same in 32-bit kernel mode.

Exceptions:

Coprocessor unusable exception

Floating-Point exception

Reserved Instruction exception (in the 32 bit user or 32 bit supervisor mode)

Coprocessor Exceptions:

Invalid operation exception

Unimplemented operation exception

Inexact exception

Overflow exception

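TRUNC.W.fmt Floating-Point Truncate to Single Fixed-Point Format TRUNC.W.fmt

Bits 31∼26 (6 bits): COP1 (010001)
Bits 25∼21 (5 bits): fmt
Bits 20∼16 (5 bits): 0 (00000)
Bits 15∼11 (5 bits): fs
Bits 10∼6 (5 bits): fd
Bits 5∼0 (6 bits): TRUNC.W (001101)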

Format:

TRUNC.W.fmt fd, fs

Description:

The contents of the FPU register specified by fs are interpreted in the specified source format, fmt, and arithmetically converted to the single fixed-point format. The result is placed in the FPU register specified by fd.

Regardless of the setting of the current rounding mode, the conversion is rounded as if the current rounding mode is round toward zero (RM = 1).

This instruction is valid only for conversion from single- or double-precision floating-point formats. The operation is not defined if bit 0 of any register specification is set and the FR bit in the Status register equals zero, since the register numbers specify an even-odd pair of adjacent coprocessor general registers. When the FR bit in the Status register equals one, both even and odd register numbers are valid.

When the source operand is an Infinity or NaN, or the correctly rounded integer result is outside of −2^31 to 2^31−1, an Invalid operation exception is raised. If Invalid operation is not enabled, then no exception is taken and −2^31 is returned.

Operation:

32, 64 T: StoreFPR (fd, W, ConvertFmt (ValueFPR (fs, fmt), fmt, W))

Exceptions:

Coprocessor unusable exception

Floating-Point exception

Coprocessor Exceptions:

Invalid operation exception

Unimplemented operation exception

Inexact exception

Overflow exception
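
For comparison with the rounding conversions, the TRUNC.W description above can be sketched for a double-precision source as below. The default result of −2^31 follows the text of this section; the function name `trunc_w_d` and the `invalid` out-parameter are hypothetical, and this is a software illustration rather than the hardware algorithm.

```c
#include <stdbool.h>
#include <stdint.h>

/* Convert x to a 32-bit integer, rounding toward zero.
 * *invalid is set where the text says Invalid operation is raised. */
int32_t trunc_w_d(double x, bool *invalid)
{
    /* The truncated value fits in 32 bits exactly when
     * -2^31 - 1 < x < 2^31; NaN fails both comparisons. */
    if (!(x > -2147483649.0 && x < 2147483648.0)) {
        *invalid = true;
        return INT32_MIN;       /* -2^31, as stated in this section */
    }
    *invalid = false;
    return (int32_t)x;          /* C conversion truncates toward zero */
}
```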

B.5 Bit Encoding of FPU Instruction Opcodes

Table B-6 shows the bit codes for all TX49 FPU instructions (ISA and extended ISA).

Table B-6 FPU Operation Code Bit Encoding

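Opcode field: instruction bits 31∼26 (rows: bits 31∼29, columns: bits 28∼26)

31∼29 | 0 | 1    | 2 | 3 | 4 | 5      | 6 | 7
2     |   | COP1 |   |   |   |        |   |
6     |   | LWC1 |   |   |   | LDC1 θ |   |
7     |   | SWC1 |   |   |   | SDC1   |   |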

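Sub field: instruction bits 25∼21 (rows: bits 25∼24, columns: bits 23∼21)

25∼24 | 0  | 1       | 2  | 3 | 4  | 5       | 6  | 7
0     | MF | DMF η θ | CF |   | MT | DMT η θ | CT | δ
1     | BC | δ       | δ  | δ | δ  | δ       | δ  | δ
2     | S  | D θ     | δ  | δ | W  | L η θ   | δ  | δ
3     | δ  | δ       | δ  | δ | δ  | δ       | δ  | δ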

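Br field: instruction bits 20∼16 (rows: bits 20∼19, columns: bits 18∼16)

20∼19 | 0   | 1   | 2    | 3    | 4 | 5 | 6 | 7
0     | BCF | BCT | BCFL | BCTL | γ | γ | γ | γ
1     | γ   | γ   | γ    | γ    | γ | γ | γ | γ
2     | γ   | γ   | γ    | γ    | γ | γ | γ | γ
3     | γ   | γ   | γ    | γ    | γ | γ | γ | γ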

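CP1 Function field: instruction bits 5∼0 (rows: bits 5∼3, columns: bits 2∼0)

5∼3 | 0           | 1           | 2          | 3           | 4       | 5         | 6      | 7
0   | ADD         | SUB         | MUL        | DIV         | SQRT    | ABS       | MOV    | NEG
1   | ROUND.L η θ | TRUNC.L η θ | CEIL.L η θ | FLOOR.L η θ | ROUND.W | TRUNC.W   | CEIL.W | FLOOR.W
2   | δ           | δ           | δ          | δ           | δ       | δ         | δ      | δ
3   | δ           | δ           | δ          | δ           | δ       | δ         | δ      | δ
4   | CVT.S       | CVT.D θ     | δ          | δ           | CVT.W   | CVT.L η θ | δ      | δ
5   | δ           | δ           | δ          | δ           | δ       | δ         | δ      | δ
6   | C.F         | C.UN        | C.EQ       | C.UEQ       | C.OLT   | C.ULT     | C.OLE  | C.ULE
7   | C.SF        | C.NGLE      | C.SEQ      | C.NGL       | C.LT    | C.NGE     | C.LE   | C.NGT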

Key:

γ: This opcode is reserved for future use. An attempt to execute it causes a Reserved

Instruction exception.

δ: This opcode is reserved for future use. An attempt to execute it causes an Unimplemented

operation exception in all current implementations.

η: This opcode is valid only when MIPS III instructions are enabled. An attempt to execute

these without MIPS III instructions enabled will cause an Unimplemented operation exception.

θ: This opcode is valid only when the TX49 has a double precision FPU in hardware. An

attempt to execute these without it will cause an Unimplemented operation exception.

Note:

FPU instructions are valid only when the TX49 has an FPU (CP1). Without an FPU, an attempt to execute these instructions causes a Coprocessor Unusable exception, independent of the value of C0_SR bit 29.
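
The field positions used by Table B-6 for register-format COP1 instructions can be illustrated with a small C sketch that splits an instruction word into its fields and reassembles it. The `cp1_fields` type and both function names are hypothetical; the bit positions are taken from the instruction formats in this appendix.

```c
#include <stdint.h>

/* Fields of a register-format COP1 instruction (hypothetical type). */
typedef struct {
    unsigned opcode;    /* bits 31..26, COP1 = 010001 */
    unsigned fmt;       /* bits 25..21, the "Sub" field */
    unsigned ft;        /* bits 20..16 */
    unsigned fs;        /* bits 15..11 */
    unsigned fd;        /* bits 10..6  */
    unsigned funct;     /* bits 5..0, the "CP1 Function" field */
} cp1_fields;

cp1_fields cp1_decode(uint32_t insn)
{
    cp1_fields f;
    f.opcode = (insn >> 26) & 0x3Fu;
    f.fmt    = (insn >> 21) & 0x1Fu;
    f.ft     = (insn >> 16) & 0x1Fu;
    f.fs     = (insn >> 11) & 0x1Fu;
    f.fd     = (insn >> 6)  & 0x1Fu;
    f.funct  = insn & 0x3Fu;
    return f;
}

uint32_t cp1_encode(cp1_fields f)
{
    return ((uint32_t)(f.opcode & 0x3Fu) << 26) |
           ((uint32_t)(f.fmt    & 0x1Fu) << 21) |
           ((uint32_t)(f.ft     & 0x1Fu) << 16) |
           ((uint32_t)(f.fs     & 0x1Fu) << 11) |
           ((uint32_t)(f.fd     & 0x1Fu) << 6)  |
           (uint32_t)(f.funct   & 0x3Fu);
}
```

For example, MUL.D with fd = 2, fs = 4, ft = 6 uses Sub = D (row 2, column 1, i.e. 10001) and Function = MUL (000010).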

Appendix C: Coprocessor 0 Hazards

C.1 Pipeline Interlock and Hazard in TX49

C.1.1 Interlock in Load Delay Slot

Pipeline control logic interlocks the pipeline when it detects a hazard condition, and the pipeline does not resume until the hazard is resolved.

An example is shown in Figure C-1. In this case, the instruction in the load delay slot tries to read the destination register of the load instruction, so the pipeline stalls until the data is read from the cache.

lw   $5, 0($26)      F  D  E  M  W
addu $8, $7, $5         F  D  ES E  M  W

Cache Read Finish

Figure C-1 Interlock in Load Delay Slot

The pipeline also interlocks when a cache miss occurs or when the data is loaded from an uncached area (Figure C-2).

lw   $5, 0($26)      F  D  E  M  –  –  FX W
                                 RD RD
addu $8, $7, $5         F  D  ES ES ES ES E  M  W

Read Bus Cycle by lw.

Cache Read Finish

Figure C-2 Interlock in Cache Miss or in the Data Load from Non-cached Area

In this example, where there is a register hazard between two consecutive instructions, ADDU stalls at the E stage until the destination register of LW is written back.

However, if there is no data dependency between LW and ADDU, ADDU completes without a stall before the destination register of LW is written back. The pipeline interlocks at the first instruction that has a data dependency on the preceding load instruction (Figure C-3).

lw   $5, 0($26)      F  D  E  M  –  –  –  FX W
                                 RD RD RD
addu $8, $7, $6         F  D  E  M  W
ori  $9, $0, 0x1f          F  D  E  M  W
addu $9, $8, $5               F  D  ES ES ES E  M  W

Figure C-3 Pipeline Interlock by Cache Miss

The pipeline also interlocks on a write-after-write hazard, as illustrated in Figure C-4. A write-after-write hazard is detected when one of the instructions following a load has the same destination register as the load instruction. In this example, the ADDU instruction stalls at its E stage until the destination register ($1) of the load is written back.

lw   $1, 0($26)      F  D  E  M  –  –  –  FX W
                                 RD RD RD
addu $8, $7, $6         F  D  E  M  W
ori  $9, $0, 0x1f          F  D  E  M  W
addu $1, $8, $5               F  D  ES ES ES E  M  W

Figure C-4 Write-after-write Hazard by Load Instruction
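
The rule illustrated by Figures C-1 to C-4 (the pipeline interlocks at the first instruction after a load that either reads or writes the load's destination register) can be sketched as a small C check. The `insn_regs` type and the function name are hypothetical, register operands are given as plain numbers with -1 for an unused slot, and stall lengths are not modeled.

```c
#include <stddef.h>

/* Register operands of one instruction (hypothetical type); -1 = unused. */
typedef struct {
    int dest;
    int src1;
    int src2;
} insn_regs;

/* Returns the index of the first instruction in seq that reads the load
 * destination (data dependency) or writes it (write-after-write hazard),
 * or -1 if no following instruction interlocks on the load. */
int first_interlock(int load_dest, const insn_regs *seq, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (seq[i].src1 == load_dest || seq[i].src2 == load_dest)
            return (int)i;      /* reads the register being loaded */
        if (seq[i].dest == load_dest)
            return (int)i;      /* writes the register being loaded */
    }
    return -1;
}
```

Applied to the sequences of Figure C-3 and Figure C-4, the check selects the third instruction after the load in both cases, matching the instruction shown stalling in its E stage.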

A SYNC instruction may be placed right after a load instruction. This causes the pipeline to stall until the bus cycle issued by the previous load instruction completes (Figure C-5). If the data is read from the cache, there is no bus cycle pending before the SYNC, so the pipeline does not stall.

lw   $5, 0($26)      F  D  E  M  –  –  FX W
                                 RD RD
sync                    F  D  E  MS MS MS M  W

Read Bus Cycle by lw.

Memory Read Finish

Figure C-5 SYNC Instruction After Load Instruction

C.1.2 Branch Delay Slot

Branch and jump instructions have a branch delay slot (Figure C-6). The DERET instruction also has a branch delay slot. Note that the result is undefined when a branch or jump instruction is placed in the branch delay slot (see Note 1).

    beq   $1, $4, L1                 F  D  E  M  W
    subu  $3, $5, $6  (delay slot)      F  D  E  M  W
L1: addiu $7, $7, 1   (target)             F  D  E  M  W

Figure C-6 Branch Delay Slot

Note 1: Instructions that cause an exception, such as SYSCALL, BREAK, and SDBBP, may be placed in the branch delay slot.

C.1.3 Multiply, Multiply/Add and Division Instructions

This subsection explains the pipeline hazards and interlocks caused by combinations of multiply, multiply/add, division, and MTHI/MTLO/MFHI/MFLO instructions (Figure C-7). They can be summarized as follows:

• The pipeline interlocks when a data dependency exists.

• The pipeline interlocks when the preceding 32-bit multiply or 32-bit multiply/add instruction has an <rd> field.

• The pipeline interlocks when a 32-bit instruction and a 64-bit instruction are executed in sequence.

• The HI/LO registers are in an undefined state within two instructions before a division instruction (DIV/DIVU/DDIV/DDIVU) (see Note 2).

SUCCEEDING INSTRUCTION (columns):
  (1) MULT/MULTU (2-operand)      (2) MULT/MULTU (3-operand)
  (3) MADD/MADDU (2-operand)      (4) MADD/MADDU (3-operand)
  (5) MTHI/MTLO                   (6) MFHI/MFLO
  (7) DIV/DIVU                    (8) DMULT/DMULTU (2-operand)
  (9) DMULT/DMULTU (3-operand)    (10) DDIV/DDIVU

PRECEDING INSTRUCTION    | (1)       | (2)       | (3)       | (4)       | (5)       | (6)       | (7)       | (8)       | (9)       | (10)
MULT/MULTU (2-operand)   | NO STALL  | NO STALL  | NO STALL  | NO STALL  | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK
MULT/MULTU (3-operand)   | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK
MADD/MADDU (2-operand)   | NO STALL  | NO STALL  | NO STALL  | NO STALL  | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK
MADD/MADDU (3-operand)   | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK
MTHI/MTLO                | NO STALL  | NO STALL  | NO STALL  | NO STALL  | NO STALL  | NO STALL  | NO STALL  | NO STALL  | NO STALL  | NO STALL
MFHI/MFLO                | NO STALL  | NO STALL  | NO STALL  | NO STALL  | NO STALL  | NO STALL  | *         | NO STALL  | NO STALL  | *
DIV/DIVU                 | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK
DMULT/DMULTU (2-operand) | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK
DMULT/DMULTU (3-operand) | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK
DDIV/DDIVU               | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK | INTERLOCK

*: HI/LO registers are in undefined state within two instructions before division instruction

Figure C-7 MAC pipeline hazard/interlock
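
The matrix of Figure C-7 lends itself to a lookup table. The following C sketch transcribes it, assuming the column order of the figure is the same as the row order; all identifiers are hypothetical, and the `HILO_UNDEFINED` value stands for the entries marked with an asterisk.

```c
/* Instruction classes in the order used by Figure C-7. */
enum mac_op {
    MULT2, MULT3, MADD2, MADD3, MTHILO, MFHILO,
    DIV32, DMULT2, DMULT3, DDIV64, MAC_OP_COUNT
};

enum mac_result { NO_STALL, INTERLOCK, HILO_UNDEFINED };

#define NS NO_STALL
#define IL INTERLOCK
#define UD HILO_UNDEFINED

/* Rows: preceding instruction. Columns: succeeding instruction. */
static const enum mac_result mac_table[MAC_OP_COUNT][MAC_OP_COUNT] = {
    /* MULT2  */ { NS, NS, NS, NS, IL, IL, IL, IL, IL, IL },
    /* MULT3  */ { IL, IL, IL, IL, IL, IL, IL, IL, IL, IL },
    /* MADD2  */ { NS, NS, NS, NS, IL, IL, IL, IL, IL, IL },
    /* MADD3  */ { IL, IL, IL, IL, IL, IL, IL, IL, IL, IL },
    /* MTHILO */ { NS, NS, NS, NS, NS, NS, NS, NS, NS, NS },
    /* MFHILO */ { NS, NS, NS, NS, NS, NS, UD, NS, NS, UD },
    /* DIV32  */ { IL, IL, IL, IL, IL, IL, IL, IL, IL, IL },
    /* DMULT2 */ { IL, IL, IL, IL, IL, IL, IL, IL, IL, IL },
    /* DMULT3 */ { IL, IL, IL, IL, IL, IL, IL, IL, IL, IL },
    /* DDIV64 */ { IL, IL, IL, IL, IL, IL, IL, IL, IL, IL },
};

#undef NS
#undef IL
#undef UD

/* Look up the behaviour for a preceding/succeeding instruction pair. */
enum mac_result mac_hazard(enum mac_op preceding, enum mac_op succeeding)
{
    return mac_table[preceding][succeeding];
}
```

A scheduler or assembler check could use such a table to flag MFHI/MFLO instructions that are followed too closely by a division.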

In the following sections, the pipeline hazards and interlocks caused by the possible combinations of instructions related to multiply, multiply/add, division and both 32-bit and 64-bit operations are illustrated in detail. The figures in the following sections classify the cases as follows:

A The preceding instruction is immediately followed by a 32-bit multiply or multiply/add instruction

B The preceding instruction is immediately followed by an MFHI or MFLO instruction

C The preceding instruction is immediately followed by an MTHI or MTLO instruction

D The preceding instruction is immediately followed by a 32-bit division instruction

E The preceding instruction is immediately followed by a 64-bit multiply instruction

F The preceding instruction is immediately followed by a 64-bit division instruction

Note 2: In the original R3000, this also applies to the MULT, MULTU, MTHI, and MTLO instructions.

Case 1: Preceding Instruction Is 32-bit Multiply or 32-bit Multiply/Add Instruction

A. 32-bit Multiply and Multiply/Add Instructions

The pipeline interlocks when a data dependency or a write-back of data into <rd> exists.

2-operand instruction is preceding:

MULT/MADD $3, $4        F  D  E1 E2 E3 M  W
MULT/MADD $6, $7, $8       F  D  E1 E2 E3 M  W

Multiply Stage 1   Multiply Stage 4

With data dependency:

MULT/MADD $3, $4, $5    F  D  E1 E2 E3 M  W
MULT/MADD $6, $3, $8       F  D  ES ES ES E1 E2 E3 M  W

B. MFHI/MFLO Instructions

The pipeline interlocks until the result of the MULT/MADD instruction is stored into <rd> and the HI/LO registers.

MULT/MADD $3, $4, $5    F  D  E1 E2 E3 M  W
MFHI/MFLO                  F  D  ES ES E  M  W

HI/LO read

C. MTHI/MTLO Instructions

The pipeline interlocks until the result of the MULT/MADD instruction is stored into <rd> and the HI/LO registers.

MULT/MADD $3, $4, $5    F  D  E1 E2 E3 M  W
MTHI/MTLO                  F  D  ES ES E  M  W

Update HI/LO

D. 32-bit Division Instruction

The result of the 3-operand multiply instruction is stored in <rd>, and the HI/LO registers are eventually updated by the division instruction.

MULT $3, $4, $5         F  D  E1 E2 E3 M  W
DIV $6, $7                 F  D  ES ES E  M  W
                           V1 V2 V3 V4 …V36

Division stage 1

E. 64-bit Multiply Instructions

The pipeline interlocks when a data dependency or a write-back of data into <rd> exists.

2-operand instruction is preceding:

MULT $6, $3             F  D  E1 E2 E3 M  W
DMULT $4, $7               F  D  ES ES E1 E2 …E6 M  W

With data dependency:

MULT $3, $4, $5         F  D  E1 E2 E3 M  W
DMULT $6, $3, $8           F  D  ES …ES E1 E2 …E6 M  W

F. 64-bit Division Instruction

The result of the 3-operand multiply instruction is stored in <rd>, and the HI/LO registers are eventually updated by the division instruction.

MULT $3, $4, $5         F  D  E1 E2 E3 M  W
DDIV $6, $7                F  D  ES ES E  M  W
                           V1 V2 V3 V4 …V68

Division stage 1

Figure C-8 Pipeline Hazard/Interlock by 32-bit Multiply or 32-bit Multiply/Add Instruction

Note that in category A of Figure C-8, the pipeline interlocks for any instruction immediately after the multiply or multiply/add instruction when it has a data dependency on the general-purpose registers. Thus, in category D, the DIV instruction stalls at the E stage for three cycles when the division instruction has a data dependency on the preceding multiply instruction.

Also note that in category D of Figure C-8, because the division instruction overwrites the HI/LO registers, the HI/LO result of a 2-operand multiply instruction is undefined. The result of the multiply instruction in this figure is correctly stored in the <rd> register. If the preceding multiply or multiply/add instruction has an <rd> field, the pipeline interlocks due to the resource conflict.


Case 2: Preceding Instruction Is MFHI/MFLO Instruction

A. 32-bit Multiply and Multiply/Add Instructions

MULT/MADD updates the HI/LO registers at the M stage, and the prior
MFHI/MFLO can read the HI/LO registers before the update.

MFHI/MFLO F D E M W
MULT/MADD $6, $7, $8 F D E1 E2 E3 M W
Update HI/LO
Read HI/LO

B. MFHI/MFLO Instructions

No hazard.

MFHI/MFLO F D E M W

C. MTHI/MTLO Instructions

No hazard because MTHI/MTLO updates the HI/LO registers at the M stage.

MFHI/MFLO F D E M W
MTHI/MTLO F D E M W
Update HI/LO
Read HI/LO

D. 32-bit Division Instruction

It is necessary to insert at least two instructions
between MFHI/MFLO and DIV.

MFHI/MFLO F D E M W
nop F D E M W
DIV F D E M W
V1 V2 V3 …V36
Update HI/LO

E. 64-bit Multiply Instructions

DMULT updates the HI/LO registers at the M stage, and the prior
MFHI/MFLO can read the HI/LO registers before the update.

MFHI/MFLO F D E M W
DMULT $6, $7, $8 F D E1 E2 …E6 M W
Update HI/LO
Read HI/LO

F. 64-bit Division Instruction

It is necessary to insert at least two instructions
between MFHI/MFLO and DDIV.

MFHI/MFLO F D E M W
nop F D E M W
DDIV F D E M W
V1 V2 V3 …V68
Update HI/LO

Figure C-9 Pipeline Hazard/Interlock by MFHI/MFLO Instructions
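The spacing rule in categories D and F can be checked mechanically. The following Python sketch is only an illustration: the helper names and the plain-string program representation are hypothetical, and it follows the text's statement ("at least two instructions") rather than any tool supplied with the core. It pads a sequence with NOPs wherever a DIV or DDIV follows an MFHI/MFLO too closely.

```python
# Hypothetical helper, not part of any Toshiba tool chain.
MIN_GAP = 2  # "at least two instructions" between MFHI/MFLO and DIV/DDIV


def mnemonic(line):
    """Return the lower-cased mnemonic of one assembly line ('' if blank)."""
    parts = line.split()
    return parts[0].lower() if parts else ""


def pad_mfhi_before_div(program):
    """Insert NOPs so that every DIV/DDIV has at least MIN_GAP
    instructions between it and the most recent MFHI/MFLO."""
    out = []
    since_read = None  # instructions emitted since the last MFHI/MFLO
    for line in program:
        op = mnemonic(line)
        if op in ("div", "ddiv") and since_read is not None:
            while since_read < MIN_GAP:
                out.append("nop")
                since_read += 1
        out.append(line)
        if op in ("mfhi", "mflo"):
            since_read = 0
        elif since_read is not None:
            since_read += 1
    return out


if __name__ == "__main__":
    for insn in pad_mfhi_before_div(["mflo $2", "div $6, $7"]):
        print(insn)
```

Only the DIV and DDIV mnemonics named in the figure are handled; whether other divide variants need the same spacing is not stated here, so they are left out of the sketch.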


Case 3: Preceding Instruction Is MTHI/MTLO Instruction

A. 32-bit Multiply and Multiply/Add Instructions

MULT/MADD updates the HI/LO registers at the M stage, and MADD can use
the HI/LO registers updated by the prior MTHI/MTLO.

MTHI/MTLO F D E M W
MULT/MADD $6, $7, $8 F D E1 E2 E3 M W
Update HI/LO

B. MFHI/MFLO Instructions

No hazard because MTHI/MTLO updates the HI/LO
registers before MFHI/MFLO reads them.

MTHI/MTLO F D E M W
MFHI/MFLO F D E M W
Read HI/LO
Update HI/LO

C. MTHI/MTLO Instructions

No hazard.

MTHI/MTLO F D E M W
Update HI/LO

D. 32-bit Division Instruction

The division instruction starts to update the HI/LO
registers at the E stage, so the prior MTHI/MTLO has
no effect.

MTHI/MTLO F D E M W
DIV F D E M W
V1 V2 V3 …V36
Update HI/LO

E. 64-bit Multiply Instructions

DMULT updates the HI/LO registers at the M stage.

MTHI/MTLO F D E M W
DMULT $6, $7, $8 F D E1 E2 E3 E4 E5 E6 M W
Update HI/LO

F. 64-bit Division Instruction

The division instruction starts to update the HI/LO
registers at the E stage, so the prior MTHI/MTLO has
no effect.

MTHI/MTLO F D E M W
DDIV F D E M W
V1 V2 V3 …V68
Update HI/LO

Figure C-10 Pipeline Hazard/Interlock by MTHI/MTLO Instructions


Case 4: Preceding Instruction Is 32-bit Division Instruction

A. 32-bit Multiply and Multiply/Add Instructions

The pipeline interlocks until the division instruction is completed.

DIV F D E M W
V1 V2 V3 …V36
MULT/MADD $6, $7, $8 F D ES ES ES …E1 …E3 M W

B. MFHI/MFLO Instructions

The pipeline interlocks because of the data dependency.

DIV F D E M W
V1 V2 V3 …V36
MFHI/MFLO F D ES ES ES … E M W

C. MTHI/MTLO Instructions

The pipeline interlocks until the division instruction is completed.

DIV F D E M W
V1 V2 V3 …V36
MTHI/MTLO F D ES ES ES … E M W

D. 32-bit Division Instruction

The pipeline interlocks until the division instruction is completed.

DIV F D E M W
V1 V2 V3 …V36
DIV F D ES ES ES … E M W
V1 V2 V3 …V36

E. 64-bit Multiply Instructions

The pipeline interlocks until the division instruction is completed.

DIV F D E M W
V1 V2 V3 …V36
DMULT $6, $7, $8 F D ES ES ES …E1 …E6 M W

F. 64-bit Division Instruction

The pipeline interlocks until the division instruction is completed.

DIV F D E M W
V1 V2 V3 …V36
DDIV F D ES ES ES … E M W
V1 V2 V3 …V68

Figure C-11 Pipeline Hazard/Interlock by Division Instructions


Case 5: Preceding Instruction Is 64-bit Multiply Instruction

A. 32-bit Multiply and Multiply/Add Instructions

The pipeline interlocks until the multiply instruction is completed.

DMULT $3, $4 F D E1 E2 E3 E4 E5 E6 M W
MULT/MADD $6, $7, $8 F D ES ES ES …ES E1 E2 E3 M W

B. MFHI/MFLO Instructions

The pipeline interlocks because of the data dependency.

DMULT F D E1 E2 E3 E4 E5 E6 M W
MFHI/MFLO F D ES ES ES … E M W

C. MTHI/MTLO Instructions

The pipeline interlocks until the multiply instruction is completed.

DMULT F D E1 E2 E3 E4 E5 E6 M W
MTHI/MTLO F D ES ES ES …ES E M W

D. 32-bit Division Instruction

The pipeline interlocks until the multiply instruction is completed.

DMULT $3, $4 F D E1 E2 E3 E4 E5 E6 M W
DIV $6, $7 F D ES ES ES …ES E M W
V1 V2 V3 …V36

E. 64-bit Multiply Instructions

The pipeline interlocks until the multiply instruction is completed.

DMULT $3, $4 F D E1 E2 E3 E4 E5 E6 M W
DMULT $6, $7, $8 F D ES ES ES …ES E1 …E6 M W

F. 64-bit Division Instruction

The pipeline interlocks until the multiply instruction is completed.

DMULT $3, $4 F D E1 E2 E3 E4 E5 E6 M W
DDIV $6, $7 F D ES ES ES …ES E M W
V1 V2 V3 …V68

Figure C-12 Pipeline Hazard/Interlock by 64-bit Multiply Instructions


Case 6: Preceding Instruction Is 64-bit Division Instruction

A. 32-bit Multiply and Multiply/Add Instructions

The pipeline interlocks until the division instruction is completed.

DDIV F D E M W
V1 V2 V3 …V68
MULT/MADD $6, $7, $8 F D ES ES ES …E1 …E3 M W

B. MFHI/MFLO Instructions

The pipeline interlocks because of the data dependency.

DDIV F D E M W
V1 V2 V3 …V68
MFHI/MFLO F D ES ES ES … E M W

C. MTHI/MTLO Instructions

The pipeline interlocks until the division instruction is completed.

DDIV F D E M W
V1 V2 V3 …V68
MTHI/MTLO F D ES ES ES … E M W

D. 32-bit Division Instruction

The pipeline interlocks until the division instruction is completed.

DDIV F D E M W
V1 V2 V3 …V68
DIV F D ES ES ES … E M W
V1 V2 V3 …V36

E. 64-bit Multiply Instructions

The pipeline interlocks until the division instruction is completed.

DDIV F D E M W
V1 V2 V3 …V68
DMULT $6, $7, $8 F D ES ES ES …E1 …E6 M W

F. 64-bit Division Instruction

The pipeline interlocks until the division instruction is completed.

DDIV F D E M W
V1 V2 V3 …V68
DDIV F D ES ES ES … E M W
V1 V2 V3 …V68

Figure C-13 Pipeline Hazard/Interlock by Division Instructions


C.1.4 Instructions regarding System Control Co-processor (CP0)

C.1.4.1 MFC0 and MTC0 Instructions

The pipeline interlocks when an MFC0 instruction is followed by an instruction that
reads the destination register of the MFC0 instruction (Figure C-14).

mfc0 $5, EPC F D E M W

addu $8, $7, $5 F D ES E M W

EPC Read

Stall

Figure C-14 Pipeline Interlock by MFC0 Instruction

No pipeline hazard occurs when an MTC0 instruction is followed by an MFC0
instruction, because MTC0 writes the destination register in the M stage and MFC0
also reads it in the M stage (Figure C-15).

mtc0 $5, DEPC F D E M W

mfc0 $8, DEPC F D E M W

DEPC Write

DEPC Read

Figure C-15 MTC0 Instruction Followed by MFC0 Instruction

C.1.4.2 ERET Instruction

Unlike a branch or jump instruction, ERET does not execute the next instruction.

The changed EPC becomes effective at the second instruction after the MTC0

instruction (Figure C-16).

mtc0 $5, EPC F D E M W

nop F D E M W

eret F D E M W

nop F D E M W

EPC Update

Figure C-16 MTC0 Instruction Followed by ERET Instruction


C.1.4.3 DERET Instruction

The DERET instruction has a branch delay slot, and the debug exception mode remains
in effect through the delay slot instruction (footnote 3). The instruction in the delay slot
of DERET must be a NOP instruction. The single step exception is disabled until the
instruction to which DERET returns control.

mtc0 $5, DEPC F D E M W

nop F D E M W

deret F D E M W

nop F D E M W

DEPC Update

Figure C-17 MTC0 Instruction Followed by DERET Instruction

3 That is, the DM bit stays at one (1), and interrupts and exceptions stay disabled.


C.1.5 Control Bits Change in CP0 Registers by MTC0 Instruction

The following sections describe when changes to the control bits made by the MTC0
instruction become effective.

C.1.5.1 Status Register

CU Bits: Because co-processor instructions refer to the CU bit in the D stage, if
either of the two instructions following the MTC0 instruction is a
co-processor instruction, its result is undefined because the CU bit is
undefined (Figure C-18).

mtc0 $5, STATUS F D E M W

nop F D E M W

copz F D E M W

CU Bit Update

CU Bit Read

Figure C-18 Hazard regarding the CU Bits

Note that even if the CU bit is changed by the MTC0 instruction during the
co-processor bus cycles of the preceding co-processor instruction, this has no effect on
the co-processor instruction currently being executed.

RE Bit: Because load/store instructions refer to the RE bit in the E stage, the
change becomes effective at the second instruction after the MTC0
instruction. The result of a load/store instruction immediately after
the MTC0 instruction is undefined (Figure C-19).

mtc0 $5, STATUS F D E M W

nop F D E M W

lw F D E M W

RE Bit Update

RE Bit Read

Figure C-19 Hazard regarding the RE Bits

Note that even if the RE bit is changed by the MTC0 instruction during the bus
cycles of the preceding load/store instruction, this has no effect on the load/store
instruction currently being executed.


BEV Bit: For exceptions that occur in the E stage, such as the address error
(AdEL) or TLB miss (TLBL) exceptions raised on instruction fetch, the
exception vector base address designated by the changed BEV bit becomes
effective at the second instruction after the MTC0 instruction. If these
exceptions occur in the instruction immediately after the MTC0
instruction, the referenced value of the BEV bit is undefined
(footnote 4) (Figure C-20).

mtc0 $5, STATUS F D E M W

nop F D E M W

lw F D E XXXX

BEV Bit Update

E Stage Exception Occurs

Figure C-20 Hazard regarding the BEV Bits (1)

For exceptions that occur in the M stage, such as IBE, DBE, NMI, CpU, Ov,
Sys, Bp, RI, AdEL (data), TLBL (data), TLBS, Mod, and Int, the exception vector
base address designated by the changed BEV bit becomes effective at the instruction
immediately after the MTC0 instruction (Figure C-21).

mtc0 $5, STATUS F D E M W

lw F D E M XXXX

BEV Bit Update

M Stage Exception Occurs

Figure C-21 Hazard regarding the BEV Bits (2)

Note that because interrupts and the Bus Error exception occur
asynchronously with instruction execution, the BEV bit value used for them is the
value held in the BEV bit when they occur.

IntMask Bits and IE Bit:

When the MTC0 instruction enables interrupts by changing these bits,
the corresponding interrupts become enabled at the second
instruction after the MTC0 instruction (footnote 5) (Figure C-22).

On the other hand, when the MTC0 instruction disables interrupts, the
corresponding interrupts become disabled at the instruction immediately after the
MTC0 instruction (Figure C-23).

FR Bit: Because the FR bit is changed in the M stage of the MTC0 instruction,
the new FR bit becomes effective at the third instruction after the MTC0
instruction (Figure C-24).

4 The new exception vector base address may already be effective because of a pipeline stall.

5 They may become enabled at the instruction immediately after the MTC0 instruction because of
a pipeline stall.


mtc0 $5, STATUS F D E M W

nop F D E M W

lw (Interrupt Enabled) F D E M W

IntMask/IE Update

(Interrupt Enable)

Interrupt Occurs

Figure C-22 Hazard regarding the IntMask Bits and IE Bit (1)

mtc0 $5, STATUS F D E M W

lw (Interrupt Disabled) F D E M W

IntMask/IE Update

(Interrupt Disable)

Figure C-23 Hazard regarding the IntMask Bits and IE Bit (2)

mtc0 $5, STATUS F D E M W

nop F D E M W

dmtc1 F D E M W

FR Bit Update

Reference FR Read

Figure C-24 Hazard regarding the FR Bit


EXL, ERL, KX, SX, UX, KSU Bit:

A modification of these bits becomes effective at the fourth instruction
after the MTC0 instruction. On the other hand, the new addressing mode for
a load/store instruction that accesses an address in Kernel/Supervisor
space, or that uses 64-bit addressing, is effective at the second
instruction after the MTC0 instruction. If either of the two instructions
after the MTC0 instruction is a co-processor instruction, the result of
that instruction is undefined (Figure C-25).

mtc0 $5, STATUS F D E M W

nop F D E M W

lw F D E M W

cpz F D E M W

sd F D E M W

Update

MIPS-III instruction

64-bit addressing

Kernel or

Supervisor mode

Figure C-25 EXL, ERL, KX, SX, UX, KSU Bit
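The Status register timings stated in this section can be collected into a small lookup, which may help when deciding how many filler instructions to place after an MTC0. The Python sketch below is only a summary of the text: the key names and the helper are hypothetical, position 1 means the instruction immediately after the MTC0, and the CU entry is derived from the statement that a co-processor instruction in either of the first two positions gives an undefined result. As footnotes 4 and 5 point out, a pipeline stall can make a change visible earlier than listed.

```python
# Position after MTC0 (1 = the instruction immediately after it) at which
# each Status register change is described as effective. Key names are
# hypothetical labels for this sketch only.
EFFECTIVE_AT = {
    "CU": 3,                    # co-processor instruction in slots 1-2 is undefined
    "RE": 2,                    # load/store in slot 1 is undefined
    "BEV_E_STAGE": 2,           # exceptions detected in the E stage
    "BEV_M_STAGE": 1,           # exceptions detected in the M stage
    "INT_ENABLE": 2,            # IntMask/IE change that enables interrupts
    "INT_DISABLE": 1,           # IntMask/IE change that disables interrupts
    "FR": 3,
    "MODE_BITS": 4,             # EXL, ERL, KX, SX, UX, KSU
    "MODE_BITS_LOAD_STORE": 2,  # addressing mode seen by load/store
}


def fillers_needed(change):
    """Number of unrelated instructions (for example NOPs) to place between
    the MTC0 and the first instruction that relies on the new value."""
    return EFFECTIVE_AT[change] - 1


if __name__ == "__main__":
    for name in sorted(EFFECTIVE_AT):
        print(f"{name}: {fillers_needed(name)} filler instruction(s)")
```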

C.1.5.2 Config Register

ICE# Bit: The MTC0 instruction may change the ICE# bit during instruction
cache streaming. In this case, the old ICE# bit value remains in effect for
the instructions executed during the streaming (Figure C-26).

mtc0 $5, Config ; update ICE# bit

nop

beq $0, $0, L1 ; stop instruction streaming

nop

L1: lw $2, 0 ($0) ; new ICE# bit is effective

Figure C-26 ICE# Bit update

DCE# Bit: The changed DCE# bit becomes effective at the second instruction after the
MTC0 instruction. The DCE# bit is undefined at the instruction
immediately after the MTC0 instruction. Note that the MTC0 instruction
may change the DCE# bit during a data cache refill. In this case, the
hardware interlock delays the update of the DCE# bit until the data cache
refill finishes.

K0 Bit: A modification of these bits becomes effective at the fourth instruction
after the MTC0 instruction. The result of an instruction in the Kseg0
address space is undefined if it is executed as the first, second or third
instruction after the MTC0 instruction. On the other hand, for a
load/store instruction accessing the Kseg0 address space, the
modification is effective at the third instruction after the MTC0
instruction. The new addressing mode for such a load/store instruction is
undefined if it is executed as the first or second instruction after the
MTC0 instruction.


C.2 Pipeline Behavior on Cache Miss

This section describes the pipeline behavior on cache miss.

C.2.1 Instruction Cache Miss

An instruction cache miss is detected in the F stage and is immediately followed by a
cache refill cycle (Figure C-27).

GRD

GDIN[31:0]

addu $5, $26, $7 F D E M W

addu $8, $7, $6 F D E M W

lw $2, 0 ($1) F D E M W

addu $9, $8, $5 F DS DS DS DS DS D E M W

subu $5, $3, $7 F D E M W

addu subu

Inst. Cache Miss

Instruction Cache Refill

Figure C-27 Streaming on Instruction Cache Refill Cycle in 32-bit GBus mode

On a cache miss, the fetched instructions are decoded and executed immediately,
before the refill cycle completes, so that the pipeline resumes execution of the instruction
stream as shown in Figure C-27. This is called streaming (footnote 6), and its refill cycle
is called a stream cycle.

When a branch or jump instruction is executed during the stream cycle, streaming
is terminated: the refill cycle will complete, but the fetched instructions after the branch
delay slot will not be executed. The pipeline stalls until the instruction at the branch or
jump target is fetched (Figure C-28).

6 No streaming in 64-bit GBus mode with 1:1 of GBus clock rate. TX49 executes one instruction per

clock cycle even if two instructions are fetched in one cycle. In this case, fetched instruction won't be

executed until the refill cycle completes.


GRD

GDIN[31:0]

addu $5, $26, $7 F DS DS DS DS DS D E M W

subu $9, $8, $5 F D E M W

jr $25 F D E M W

lw $2, 0 ($1) F D E M W

lw $3, 0 ($5) (target Instruction) F D E M W

addu subu

Instruction Cache Refill

Inst. Cache Miss

jr lw

Jump

Figure C-28 Branch/Jump Instruction during Stream Cycle in GBus 32-bit Mode

C.2.2 Data Cache Miss

A data cache miss is detected in the M stage of the load instruction and is immediately
followed by a cache refill cycle. The non-blocking load mechanism implemented in the
TX49 data cache allows the following instruction stream to be executed without waiting
for the completion of the data cache refill if there is no data dependency between the load
and the following instructions.

The pipeline stalls at the E stage of the instruction that uses the refilled data as its
source until the data is loaded (Figure C-29).

lw $5, 0 ($26) F D E M – – – FX W

RD RD RD

addu $8, $7, $6 F D E M W

ori $9, $0, 0x1f F D E M W

addu $9, $8, $5 F D ES ES ES E M W

Figure C-29 Pipeline Interlock by Cache Miss
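The stall count in Figure C-29 can be reproduced with a simple cycle count. The Python sketch below is a simplified reading of the figure, not a cycle-accurate model: it assumes one instruction is fetched per cycle, that the missing load passes through F, D, E and M in cycles 1 to 4, and that the dependent instruction can enter its E stage in the cycle after the load's FX stage. The function and parameter names are hypothetical.

```python
def dependent_stall_cycles(distance, wait_cycles):
    """Estimate the ES (E-stage stall) cycles of an instruction that uses
    the data of a load that missed in the data cache.

    distance    -- position of the consumer after the load (1 = next instruction)
    wait_cycles -- refill wait cycles between the load's M stage and its FX stage

    Assumption of this sketch: the consumer's E stage may run in the cycle
    after the load's FX stage.
    """
    load_m = 4                           # load: F=1, D=2, E=3, M=4
    load_fx = load_m + wait_cycles + 1   # FX follows the wait cycles
    natural_e = distance + 3             # consumer: F at 1 + distance, then D, E
    earliest_e = load_fx + 1
    return max(0, earliest_e - natural_e)


if __name__ == "__main__":
    # Third instruction after the load, three wait cycles
    print(dependent_stall_cycles(distance=3, wait_cycles=3))
```

With the consumer three instructions after the load and three wait cycles, the sketch gives three stall cycles, which matches the three ES stages drawn for the last addu in the figure.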

The pipeline also interlocks when a load/store instruction is issued during the data

cache refill cycle because of the resource (i.e. data cache) conflict (Figure C-30).


lw $5, 0 ($26) F D E M – – – FX W

RD RD RD

lw $7, 0 ($25) F D E MS MS MS MS M W

ori $9, $0, 0x1f F D ES ES ES ES E M W

addu $9, $8, $5 F DS DS DS DS D E M W

Reference FR Read

resource conflict

Figure C-30 Load Instruction during the Data Cache Refill Cycle

It is possible that a conflict at the W stage occurs between the load instruction and one
of the following instructions if the load instruction causes a cache refill cycle. This
situation is shown in Figure C-31.

In this case, the W stage of the load instruction takes precedence, resulting in a one-cycle
stall at the M stage of the addu instruction.

lw $5, 0 ($26) F D E M – – – FX W

RD RD RD

addu $4, $3, $7 F D E M W

ori $9, $0, 0x1f F D E M W

addu $9, $8, $7 F D E M W

addu $7, $6, $8 F D E MS M W

W stage Resource Conflict

Data Cache Miss

Figure C-31 W stage Pipeline Register Conflict

If the instruction fetch cycle is requested during the data cache refill cycle, the data

cache refill completes first followed by the instruction fetch cycle (Figure C-32).

lw $5, 0 ($26) F D E M – – – M W

RD RD RD

addu $7, $6, $8 F D E M W

addu $4, $3, $7 F D E M W

ori $9, $0, 0x1f F D E M W

addu $9, $8, $5 F DS DS DS DS DS DS DS D E M W

addu $7, $6, $5 F D E M W

Inst. Cache Miss

Data Cache Miss

Figure C-32 Instruction Cache Miss during the Data Cache Refill Cycle


C.3 Pipeline Behavior in Uncached Area

The pipeline behavior regarding the memory access to an uncached area is similar to that of

refill cycle sequence caused by the cache miss.

C.3.1 Data Read from Uncached Area

lw $5, 0 ($26) F D E M – – – – FX W

RD RD RD RD

addu $8, $7, $6 F D E M W

ori $9, $0, 0x1f F D E M W

addu $9, $8, $5 F D ES ES ES ES E M W

Figure C-33 Data Read from Uncached Area

C.3.2 Instruction Fetch from Uncached Area

addu $5, $3, $3 F DS DS DS DS DS D E M W

lw $2, 0 ($1) F DS D E M W

ori $9, $0, 0x1f F DS D E M W

addu $8, $9, $8 F DS D E M W

Figure C-34 Instruction Fetch from Uncached Area

C.3.3 Data Write to Uncached Area

sw $5, 0 ($26) F D E M W

WR

addu $8, $7, $6 F D E M W

ori $9, $0, 0x1f F D E M W

addu $9, $8, $5 F D E M W

Write to Write Buffer

Figure C-35 Data Write to Uncached Area


C.4 Timings on the Exception Handling

This section describes the detailed pipeline behavior on an exception. When an
exception takes place, the instruction on which the exception occurs is aborted. All
instructions immediately after that instruction are also aborted, and the processor passes
control to the exception handler.

The exceptions normally occur in the M stage, but some of the exceptions occur in the E

stage. The exceptions which occur in the E stage are:

• Debug Single Step (DSS)

• Debug Instruction Break (DIB)

• Address Error on Instruction Fetch (AdEL)

• TLB Refill/Invalid on Instruction Fetch (TLBL)

Note that the Reset/Soft Reset Exceptions occur in any stage.

C.4.1 Basic Pipeline Behavior When Exceptions Occur

The following figure illustrates the pipeline behavior when an exception occurs.

lw $5, 0 ($26) F D E M W

addu $7, $6, $8 F D E M Aborted

addu $4, $3, $7 F D E Aborted

ori $9, $0, 0x1f F D Aborted

addu $9, $8, $5 F Aborted

addu $7, $6, $5 F D E M W

Exception Detected

Exception Handler

(a) Exception Detected in the M Stage

lw $5, 0 ($26) F D E M W

addu $4, $3, $7 F D E Aborted

ori $9, $0, 0x1f F D Aborted

addu $9, $8, $5 F Aborted

addu $7, $6, $5 F D E M W

Exception Detected

Exception Handler

(b) Exception Detected in the E Stage

Figure C-36 Pipeline Behavior in Case of Exception


C.4.2 Exceptions during the Execution of Multi-cycle Instructions

As described in the section entitled Multiply, Multiply/Add and Division Instructions,
a multi-cycle instruction that has no destination register, such as DIV, and the
instructions that follow it are executed in parallel if they have no data dependency.

If an exception takes place at an instruction being executed in parallel with this type of
multi-cycle instruction, the preceding multi-cycle instruction is completed, while the
instructions after the exception are aborted and control is passed to the exception
handler.

div $8, $9 F D E M W

V1 V2 V3 V4 V5 V6 V7 … V35 V36

addu $7, $6, $5 F D E M Aborted

addu $4, $3, $7 F D E Aborted

ori $9, $0, 0x1f F D Aborted

addu $9, $8, $5 F Aborted

addu $7, $6, $5 F D E M W

Exception Detected

Exception Handler

Figure C-37 Exception during the Execution of Division Instruction

C.4.3 Exceptions during the Data Cache Refill Cycle

When one of the exceptions occurs at the instruction which is being executed in parallel

with data cache refill, the data cache refill cycle is completed while the instructions after

the exception are aborted and the control is passed to the exception handler.

lw $3, 0 ($1) F D E M – – – FX W

RD RD RD

addu $7, $6, $5 F D E M Aborted

addu $4, $3, $7 F D E Aborted

ori $9, $0, 0x1f F D Aborted

addu $9, $8, $5 F Aborted

addu $7, $6, $5 F D E M W

Exception Detected

Exception Handler

Figure C-38 Exceptions during the Data Cache Refill Cycle (1)


However, when one of the fatal exceptions, such as Bus Error or Reset occurs, the refill

cycle is also aborted and the control is passed to the exception handler.

lw $3, 0 ($1) F D E M – Aborted

RD Aborted

addu $7, $6, $5 F D E M Aborted

addu $4, $3, $7 F D E Aborted

ori $9, $0, 0x1f F D Aborted

addu $9, $8, $5 F Aborted

addu $7, $6, $5 F D E M W

Fatal Exception Detected

Exception Handler

Figure C-39 Exception during Data Cache Refill Cycle (2)


Appendix D: G-Bus Overview

D.1 G-Bus Operation

The G-Bus has a 36-bit address bus and a 64-bit data bus. Byte and halfword transfers can
occur in any byte lane, depending on how GBE[7:0]* are driven.

The G-Bus speed can be divided by 2, 2.5, 3 or 4 relative to the CPU full speed. Which
G-Bus speed to use is determined by the value of GCRATE[1:0] while GCOLDRESET is
asserted. Correct operation is not guaranteed if GCRATE[1:0] changes while the TX49 is
running.
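As a worked example of the divide ratios listed above, the following Python sketch computes the G-Bus clock from a CPU clock. It is an illustration only: the function name is hypothetical, the 200 MHz figure is an arbitrary example value, and the encoding of GCRATE[1:0] is not modelled because it is not given in this section.

```python
from fractions import Fraction

# Divide ratios named in the text (G-Bus speed relative to CPU full speed).
GBUS_DIVISORS = (Fraction(2), Fraction(5, 2), Fraction(3), Fraction(4))


def gbus_clock_hz(cpu_hz, divisor):
    """Return the G-Bus clock in Hz for a CPU clock and one of the
    supported divide ratios (2, 2.5, 3 or 4)."""
    ratio = Fraction(divisor)
    if ratio not in GBUS_DIVISORS:
        raise ValueError(f"unsupported G-Bus divide ratio: {divisor}")
    return Fraction(cpu_hz) / ratio


if __name__ == "__main__":
    # Example value only: a 200 MHz CPU clock divided by 2.5
    print(float(gbus_clock_hz(200_000_000, 2.5)))
```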

The TX49 supports four different types of bus transactions: single-read, burst-read,
single-write and burst-write. When a bus transaction starts, GBSTART* is asserted for one
GBUSCLK cycle, regardless of the type of the transaction. Peripheral logic must sample
GBSTART* to recognize the beginning of a bus cycle. Note that when multiple read or
write transactions occur back-to-back, GRD* or GWR* remains asserted until the last
transaction is completed; therefore, GRD* and GWR* cannot be used to detect the beginning
of a bus cycle.

During a read operation, the TX49 samples GACK* with the rising edge of GBUSCLK.

When it is detected as asserted, the TX49 captures the data on GDTM at the next rising edge

of GBUSCLK. If the bus transaction is a burst-read, the TX49 also automatically increments

the address value.

During a write operation, the TX49 samples GACK* with the rising edge of GBUSCLK.
When it is detected as asserted during a single-write, the TX49 terminates the current bus
transaction at the next rising edge of GBUSCLK. If the bus transaction is a burst-write, the
TX49 goes ahead with the next write, automatically incrementing the address value.

GLAST* indicates the completion of a bus cycle. Peripheral logic must sample GLAST* to
terminate a bus transaction.

D.2 Types of G-Bus Arbitration

One important feature of the TX49 is its enhanced bus arbitration flexibility. This section

introduces two types of bus arbitration: Snoop & Transfer (ST) concurrency and Execute &

Transfer (ET) concurrency. ST concurrency causes the TX49 to stall the processor pipeline

while allowing the internal data cache to be snooped during DMA transfers. In contrast, ET

concurrency allows the processor core to continue execution out of the internal cache during
external bus mastership; ET concurrency does not allow data cache snooping.

D.2.1 Snoop & Transfer (ST) Concurrency

In systems in which main memory is accessed by DMA, it must be ensured that the
internal data cache of the TX49 always has the most recent data and is not in possession
of stale data. In other words, if the data in main memory has been changed by DMA, the
matching cache entries in the TX49 must be marked as "modified" (i.e., invalidated). ST
concurrency allows the TX49 to "snoop" DMA accesses to main memory and check for a
matching data cache entry. Figure D-1 illustrates this feature. During an ST concurrency
operation, the TX49 stalls the processor pipeline.

An alternate bus master asserts either GHPSREQ* or GSREQ* to request bus

mastership for an ST concurrency operation. Once GHPSREQ* or GSREQ* is detected,

the TX49 will flush the internal write buffer before granting the bus to the requesting

master; GHPSGNT* or GSGNT* is asserted to indicate that the bus has been granted.


While GHPSGNT* or GSGNT* is asserted, the TX49 continually samples GSNOOP* with
the rising edge of GBUSCLK. When GSNOOP* is recognized as asserted, the TX49
captures the address on GATM[35:5] and compares it to the addresses of all data items
held in the data cache. If the snoop address hits in the data cache, the cache entry is
invalidated. GSNOOP* is valid only when either GHPSGNT* or GSGNT* is asserted.
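The snoop comparison described above works at cache-line granularity, since only address bits [35:5] are captured. The Python sketch below is a behavioral illustration under stated assumptions, not a description of the hardware: the class and method names are hypothetical, the cache is modelled as a plain set of valid line numbers, and associativity and timing are ignored.

```python
LINE_SHIFT = 5               # GATM[35:5]: bits [4:0] are the offset within a line
ADDR_MASK = (1 << 36) - 1    # 36-bit G-Bus address


class SnoopedDataCache:
    """Toy model of snoop invalidation in a write-through data cache."""

    def __init__(self):
        self.valid_lines = set()

    @staticmethod
    def line_of(addr):
        """Line number compared during a snoop (address bits [35:5])."""
        return (addr & ADDR_MASK) >> LINE_SHIFT

    def fill(self, addr):
        """Mark the line containing addr as valid (for example after a refill)."""
        self.valid_lines.add(self.line_of(addr))

    def holds(self, addr):
        return self.line_of(addr) in self.valid_lines

    def snoop(self, addr, bus_granted):
        """Invalidate the matching line; return True on a snoop hit.
        A snoop is honoured only while the bus grant (GHPSGNT* or GSGNT*)
        is asserted, modelled here by the bus_granted flag."""
        if not bus_granted:
            return False
        line = self.line_of(addr)
        if line in self.valid_lines:
            self.valid_lines.remove(line)
            return True
        return False


if __name__ == "__main__":
    cache = SnoopedDataCache()
    cache.fill(0x1000)
    print(cache.snoop(0x101F, bus_granted=True), cache.holds(0x1000))
```

Any address within the same 32-byte span maps to the same line number, so a DMA write to one byte of a line invalidates the whole line in this model.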

The internal data cache of the TX49 can employ either the write-through or the write-back
policy. The write-back data cache does not provide support for snooping. When the
write-back option is selected, GHPSREQ* and GSREQ* cannot be used.

Figure D-1 ST Concurrency

D.2.2 Execute & Transfer (ET) Concurrency

Figure D-2 illustrates ET concurrency. Whereas ST concurrency causes the TX49 to

stall the processor pipeline, ET concurrency allows the processor to continue execution out

of the internal cache during external bus mastership. However, it does stall when there is

a need for a cache refill. Also, if the write buffer is full, additional stores will stall until

there is room for them in the write buffer.

ET concurrency is recommended for the following cases:

• when the internal data cache is programmed for write-back mode

• when performing DMA transfers to an uncached address space even if the internal
data cache is programmed for write-through mode

An alternate bus master asserts either GHPGREQ* or GREQ* to request bus
mastership for an ET concurrency operation. Once GHPGREQ* or GREQ* is detected and
the bus is free, the TX49 will grant the bus to the requesting master. GHPGGNT* or GGNT*
is asserted to indicate that the bus has been granted to the master. If the bus is busy, the
TX49 will relinquish the bus after it completes the current bus cycle. GHPGREQ* and
GREQ* are sampled with the rising edge of GBUSCLK.

[Block diagram: TX49 (processor core, WBU, etc.), G-Bus, bus master, external device interface, external bus]


Figure D-2 ET Concurrency

Table D-1 summarizes the differences between ST and ET concurrency.

Table D-1 ST Concurrency vs. ET Concurrency

Handshake Signals
  ST Concurrency: Bus request signal: GHPSREQ*, bus grant signal: GHPSGNT*
                  Bus request signal: GSREQ*, bus grant signal: GSGNT*
  ET Concurrency: Bus request signal: GHPGREQ*, bus grant signal: GHPGGNT*
                  Bus request signal: GREQ*, bus grant signal: GGNT*

Data Cache Snooping
  ST Concurrency: Accepted by assertion of GSNOOP*
                  (Not supported in write-back mode)
  ET Concurrency: Not Supported

Stores to the Write Buffer
  ST Concurrency: Disabled
  ET Concurrency: Enabled

Usage Example
  ST Concurrency: When an external bus master performs store operations to a
                  memory space mapped to the data cache (i.e., when data cache
                  snooping is necessary)
  ET Concurrency: • When an external bus master transfers data over the G-Bus
                    without performing a snoop operation
                  • When the data cache employs the write-back policy

Maximum Bus Control (Request-to-Grant) Latency
  Remaining current bus cycle
  + write buffer flushing*
  + data read bus cycle already issued internally
  + instruction fetch bus cycle already issued internally

* During an ET concurrency operation, the write buffer is flushed only when the write buffer
contains uncached store data which has not yet been written to memory and the TX49 issues
an uncached read request to the target address of one of the write buffer entries.
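The selection guidance in Table D-1 and in sections D.2.1 and D.2.2 can be summarised as a small decision function. The Python sketch below is a hedged illustration: the function and parameter names are hypothetical, and it encodes only the recommendations stated in this appendix (ET for a write-back data cache or for DMA to an uncached address space, ST when snooping of cached memory is needed).

```python
def recommended_concurrency(write_back, dma_target_cached):
    """Return "ST" or "ET" following the guidance in this appendix.

    write_back        -- True if the data cache uses the write-back policy
    dma_target_cached -- True if the external master stores to memory that
                         is mapped to the data cache
    """
    if write_back:
        # Snooping is not supported in write-back mode, so the ST request
        # signals (GHPSREQ*/GSREQ*) cannot be used.
        return "ET"
    if not dma_target_cached:
        # No snoop is needed for transfers to an uncached address space.
        return "ET"
    # Write-through cache and cached DMA target: snooping is necessary.
    return "ST"


if __name__ == "__main__":
    print(recommended_concurrency(write_back=False, dma_target_cached=True))
```

Note that the write-back case does not by itself keep the cache coherent with DMA traffic; the sketch only reflects which arbitration type is usable.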


Appendix E: Differences From TX4955A, TX4300 and TX4600

Item               | TX4955A            | TX4300           | TX4600
Datapath           | 64                 | 64               | 64
ISA                | MIPS I, II, III    | MIPS I, II, III  | MIPS I, II, III
+MADD, +Debug
+PREF
Pipeline           | 5                  | 5                | 5
MMU                | TLB                | TLB              | TLB
Joint TLB          | 48 double          | 32 double        | 48 double
I-TLB              | 2 entry            | 2 entry          | 2 entry
D-TLB              | 4 entry            | No               | 4 entry
Page Size          | 4 K-16 MB          | 4 K-16 MB        | 4 K-16 MB
Shutdown           | No-TS              | Yes              | No-TS
V.A. Size          | 40                 | 40               | 40
P.A. Size          | 36                 | 32               | 36

I-cache
Size               | 32 KB              | 16 KB            | 16 KB
Associate.         | 4-way              | Dir.-map         | 2-way
Lock               | Yes                | No               | No
Snoop              | No                 | No               | No
Index              | V                  | V                | V
Tag                | P                  | P                | P
Line               | 32 B               | 32 B             | 32 B
Parity             | No                 | No               | Yes

D-cache
Size               | 32 KB              | 8 KB             | 16 KB
Associate.         | 4-way              | Dir.-map         | 2-way
Lock               | Yes                | No               | No
Write Policy       | W.-back/-through   | W.-back          | W.-back/-through
Snoop              | No                 | No               | No
Index              | V                  | V                | V
Tag                | P                  | P                | P
Line               | 32 B               | 16 B             | 32 B
Parity             | No                 | No               | Yes

WriteBuffer        | 4 A/D pairs        | 4 A/D pairs      | 4 A/D pairs
FPU (CP1)          | FPU Hard           | Shared w/ IU     | FPU Hard
Shared w/
I-mul/div
                   | Single             | Single           | Single
                   | Double             | Double           | Double
Debug Support Unit | Yes                | No               | No
MPU Bus I/F        | SysAD              | SysAD            | SysAD
                   | 32-bit             | 32-bit           | 64-bit
                   | A/D multiplexed    | A/D multiplexed  | A/D multiplexed

Sys.Clock Ratio:
1:1                | No                 | No               | No
2:1                | Yes                | Yes              | Yes
2.5:1              | Yes                | No               | No
3:1                | Yes                | Yes              | Yes
4:1                | Yes                | No               | Yes
5:1                | No                 | No               | Yes
6:1                | No                 | No               | Yes
7:1                | No                 | No               | Yes
8:1                | No                 | No               | Yes

JTAG               | Yes                | Yes (No func.)   | No
Power Sup.         | Internal: 1.5 V    | 3.3 V            | 3.3 V
                   | External: 3.3 V    |                  |
Power down Mode    | WAIT Inst.         | Status Reg.      | WAIT Inst.
                   | (Halt/Doze)        | (1/4 PClock)     | (Stand-by)
Package            | PQFP-160           | PQFP-120         | PGA-179
HSQFP-208