PNX1300EH/G,557 - Trident Microsystems

Preliminary Specification

Supersedes PNX1300 data of 2002 Feb 15

File under INTEGRATED CIRCUITS, TR1

2004 Aug 20

INTEGRATED CIRCUITS

PNX1300 Series

Media Processors

2002 Feb 15

Philips Semiconductors Preliminary Specification

Media Processors PNX1300 Series

PNX1300 Series Data Book

Foreword

Table of Contents

1 Pin List

2 Overview

3 DSPCPU Architecture

4 Custom Operations for Multimedia

5 Cache Architecture

6 Video In

7 Enhanced Video Out

8 Audio In

9 Audio Out

10 SPDIF Out

11 PCI Interface

12 SDRAM Memory System

13 System Boot

14 Image Coprocessor

15 Variable Length Decoder

16 I2C Interface

17 Synchronous Serial Interface

18 JTAG Functional Specification

19 On-Chip Semaphore Assist Device

20 Arbiter

21 Power Management

22 PCI-XIO Bus Functional Specification

A DSPCPU Operations

B MMIO Register Summary

C Endian-ness

Index


© 2001-2004 Philips Electronics North America Corporation

See Terms and Conditions on the next page.


TERMS AND CONDITIONS

Philips Semiconductors and Philips Electronics North America Corporation reserve the right to make changes,

without notice, in the products, including circuits, standard cells, and/or software, described or contained

herein in order to improve design and/or performance. Philips Semiconductors assumes no responsibility or

liability for the use of any of these products, conveys no license or title under any patent, copyright, or mask

work right to these products, and makes no representations or warranties that these products are free from

patent, copyright, or mask work right infringement, unless otherwise specified. Applications that are described

herein for any of these products are for illustrative purposes only. Philips Semiconductors makes no

representation or warranty that such applications will be suitable for the specified use without further testing

or modification.

LIFE SUPPORT APPLICATIONS

Philips Semiconductors and Philips Electronics North America Corporation products are not designed for use

in life support appliances, devices, or systems where malfunction of a Philips Semiconductors and Philips

Electronics North America Corporation product can reasonably be expected to result in a personal injury.

Philips Semiconductors and Philips Electronics North America Corporation customers using or selling Philips

Semiconductors and Philips Electronics North America Corporation products for use in such applications do

so at their own risk and agree to fully indemnify Philips Semiconductors and Philips Electronics North America

Corporation for any damages resulting from improper use or sale.

Philips Semiconductors and Philips Electronics North America Corporation register eligible circuits under the

Semiconductor Chips Protection Act.

 2001, 2002, 2003, 2004 Philips Electronics North America Corporation

Printed in U.S.A.

Business Line Media Processing, 811 E. Arques Avenue, Sunnyvale, CA 94088

DEFINITIONS

Data Sheet Identification: Objective Specification
Product Status: Formative or in Design
Definition: This data sheet contains the design target or goal specifications for product development. Specifications may change in any manner without notice.

Data Sheet Identification: Preliminary Specification
Product Status: Preproduction Product
Definition: This data sheet contains preliminary data, and supplementary data will be published at a later date. Philips Semiconductors reserves the right to make changes at any time without notice in order to improve design and supply the best possible product.

Data Sheet Identification: Product Specification
Product Status: Full Production
Definition: This data sheet contains Final Specifications. Philips Semiconductors reserves the right to make changes at any time without notice, in order to improve the design and supply the best possible product.


Foreword

The TriMedia PNX1300 Ser ies is an enhan ced version

of the TM-1300 family of media proce ssor .

The PNX1300 Series contains an ultra-high performance Very Long Instruction Word processor, as well as a complete intelligent video and audio input/output subsystem. The processor has an instruction set that is optimized for processing audio, video and graphics. It includes powerful SIMD multimedia operators for 8- and 16-bit signal datatypes as well as a full complement of 32-bit IEEE-compatible floating point operations.

The PNX1300 Series is intended as a multi-standard programmable video, audio and graphics processor. It can be used either standalone or as an accelerator to a general purpose processor.

The architecture of the TriMedia family came about as the result of many years of effort by many dedicated individuals. Going back in history, the origin of TriMedia was laid by the LIFE-1 VLIW processor, designed by Junien Labrousse and myself in 1987. Work continued afterwards in Philips Research Labs, Palo Alto. My special thanks go to the entire Palo Alto research team: Mike Ang, Uzi Bar-Gadda, Peter Donovan, Martin Freeman, Eino Jacobs, Beomsup Kim, Bob Law, Yen Lee, Vijay Mehra, Pieter van der Meulen, Ross Morley, Mariette Parekh, Bill Sommer, Artur Sorkin and Pierre Uszynski.

The Palo Alto period matured the architecture: we ported all video and audio algorithms that we could find to the compiler/simulator and refined the operation set. In addition, we learned how to give the architecture a market direction. In May 1994, Philips management, in particular Cees-Jan Koomen, Eddy Odijk, Theo Claasen and Doug Dunn, decided to develop TriMedia into a major Philips Semiconductors product line.

Under the guidance of Keith Flagler, the TriMedia team was built. All of its members helped take this from a set of interesting ideas to a reliable and competitive product in a short period of time. The initial TriMedia team included Fuad Abu Nofal, Karel Allen, Mike Ang, Robert Aquino, Manju Asthana, Patrick de Bakker, Shiv Balakrishnan, Jai Bannur, Marc Berger, Sunil Bhandari, Rusty Biesele, Ahmet Bindal, David Blakely, Hans Bouwmeester, Steve Bowden, Robert Bradfield, Nancy Breede, Shawn Brown, Sujay Chari, Catherine Chen, Howen Chen, Yan-ming Chen, Yong Cho, Scott Clapper, Matthew Clayson, Paul Coelho, Richard Dodds, Marc Duranton, Darcia Eding, Aaron Emigh, Li Chi Feng, Keith Flagler, Jean Gobert, Sergio Golombek, Mike Grimwood, Yudi Halim, Hari Hampapuram, Carl Hartshorn, Judy Heider, Laura Hrenko, Jim Hsu, Eino Jacobs, Marcel Janssens, Patricia Jones, Hann-Hwan Ju, Jayne Keith, Bhushan Kerur, Ayub Khan, Keith Knowles, Mike Kong, Ashok Krishnamurti, Yen Lee, Patrick Leong, Bill Lin, Laura Ling, Chialun Lu, Naeem Maan, Nahid Mansipur, Mike Maynard, Vijay Mehra, Jun Mejia, Derek Meyer, Prabir Mohanty, Saed Muhssin, Chris Nelson, Stephen Ness, Keith Ngo, Francis Nguyen, Kathleen Nguyen, Derek Noonburg, Ciaran O’Donnel, Sang-Ju Park, Charles Peplinski, Gene Pinkston, Maryam Pirayou, Pardha Potana, Bill Price, Victor Ramamoorthy, Babu Rao Kandamilla, Ehsan Rashid, Selliah Rathnam, Margaret Redmond, Donna Richardson, Alan Rodgers, Tilakray Roychoudhury, Hani Salloum, Chris Salzmann, Bob Seltzer, Ravi Selvaraj, Jim Shimandle, Deepak Singh, Bill Sommer, Juul van der Spek, Manoj Srivastava, Renga Sundararajan, Ken-Sue Tan, Ray Ton, Steve Tran, Cynthia Tripp, Ching-Yih Tseng, Allan Tzeng, Barbara Vendelin, John Vivit, Rudy Wang, Rogier Wester, Wayne Wonchoba, Anthony Wong, Sara Wu, David Wyland, Ken Xie, Vincent Xie, Bettina Yeung, Robert Yin, Charles Young, Grace Yun, Elena Zelayeta and Vivian Zhu.

Expert help and feedback were received from many. In particular, I’d like to mention Kees van Zon of Philips Eindhoven for his help with filtering-related issues, and Craig Clapp of PictureTel for excellent feedback on all aspects of the architecture.

My special thanks go to Joe Kostelec. He made me understand that my ambitions could better be realized in California than in Europe. Furthermore, his vision and his wisdom are credited with keeping this project alive and growing until the ‘investment decision.’

The vision of a universal media accelerator is credited to Jaap de Hoog. Jaap, I wish you were here to see it come to fruition.

–Gerrit Slavenburg

After the initial TM-1000 product, the TM-1100, TM-1300 and now PNX1300 Series chips have been successfully integrated in many video and audio products. It has been my pleasure to have been involved in these designs, and I would like to thank the people involved in the TM-1300 and PNX1300 Series projects under the guidance of Cees Hartgring and Simon Wegerif. The team included Karel Allen, Tien-Cheng Bau, Jim Campbell, Anitamk Chan, John Chang, Roel Coppoolse, Taufik Dakhil, Mitch Daniil, Nam Dao, Patrick Debaumarche, Thuy Duong, Torsten Fink, Jan Grotenbreg, Mohammad Hafeez, Feng Hao, Farah Jubran, Babu Rao Kandamalla, Aki Kaniel, Yan-Ling Li, Ying-Chao Liu, Naeem Maan, Don Marshal, Thomas Meyer, Javed Mukarram, Long Nguyen, Tu Nghiem, Elaine Outler, Charles Peplinski, Duc T. Pham, Thorwald Rabeler, Raquel Ruiz, Ensieh Saffari, Hani Salloum, Wenyi Song, Stephen Tomasello, Tran Tung, Maria F. Wangsahamidjaja, Chang-Ming Yang, Mohammed I. Yousuf, Hui Zhang and Gerrit Slavenburg.

–Luis Lucas

PNX1300/01/02/11 Data Book Philips Semiconductors


Table of Contents

Foreword

1 Pin List

1.1 PNX1300 Series versus TM-1300

1.2 Boundary Scan Notice

1.3 I/O Circuit Summary

1.4 Signal Pin List

1.5 Power Pin List

1.6 Pin Reference Voltage

1.7 Package

1.8 Ordering Information

1.8.1 Lead Parts: Last time buy for these parts is September 30, 2005

1.8.2 Lead-Free Parts: Available for ordering starting October 1, 2004

1.9 Parametric Characteristics

1.9.1 PNX1300/01/02/11 Absolute Maximum Ratings

1.9.2 PNX1300/01/02 Operating Range and Thermal Characteristics

1.9.3 PNX1311 Operating Range and Thermal Characteristics

1.9.4 PNX1300/01/02/11 Power Supply Sequencing

1.9.5 PNX1300/01/02 DC/AC Characteristics

1.9.6 PNX1311 DC/AC Characteristics

1.9.7 PNX1300 Series Power Consumption

1.9.7.1 Power Consumption for Applications on PNX1300 Series

1.9.7.2 PNX1300/01/02 DSPCPU Core Current and Power Consumption

1.9.7.3 PNX1311 DSPCPU Core Current and Power Consumption Details

1.9.7.4 PNX1300/01/02 Current Consumption For On-Chip Peripherals

1.9.7.5 PNX1311 Current Consumption For On-Chip Peripherals

1.9.7.6 STRG3, STRG5 type I/O circuit

1.9.7.7 NORM3 type I/O circuit

1.9.7.8 WEAK5 type I/O circuit

1.9.7.9 IICOD (I2C) type I/O circuit

1.9.7.10 SDRAM interface timing for PNX1300/01/02/11 speed grades

1.9.7.11 PCI Bus timing

1.9.7.12 JTAG I/O timing

1.9.7.13 I2C I/O timing

1.9.7.14 Video In I/O Timing

1.9.7.15 Video Out I/O Timing

1.9.7.16 Audio In I/O timing


1.9.7.17 Audio Out I/O timing

1.9.7.18 SSI I/O timing

2 Overview

2.1 Introduction

2.2 PNX1300 Fundamentals

2.3 PNX1300 Chip Overview

2.4 Brief Examples of Operation

2.4.1 Video Decompression in a PC

2.4.2 Video Compression

2.5 Introduction to PNX1300 Blocks

2.5.1 Internal ‘Data Highway’ Bus

2.5.2 VLIW Processor Core

2.5.3 Video In Unit

2.5.4 Enhanced Video Out Unit

2.5.5 Image Coprocessor

2.5.6 Variable-Length Decoder (VLD)

2.5.7 Audio In and Audio Out Units

2.5.8 S/PDIF Out Unit

2.5.9 Synchronous Serial Interface

2.5.10 I2C Interface

2.6 New In PNX1300 (Versus TM-1300)

2.7 New In PNX1300 (Versus TM-1100)

2.8 New In PNX1300 (Versus TM-1000)

3 DSPCPU Architecture

3.1 Basic Architecture Concepts

3.1.1 Register Model

3.1.2 Basic DSPCPU Execution Model

3.1.3 PCSW Overview

3.1.4 SPC and DPC—Source and Destination Program Counter

3.1.5 CCCOUNT—Clock Cycle Counter

3.1.6 Boolean Representation

3.1.7 Integer Representation

3.1.8 Floating Point Representation

3.1.9 Addressing Modes

3.1.10 Software Compatibility

3.2 Instruction Set Overview

3.2.1 Guarding (Conditional Execution)

3.2.2 Load and Store Operations

3.2.3 Compute Operations


3.2.4 Special-Register Operations

3.2.5 Control-Flow Operations

3.3 PNX1300 Instruction Issue Rules

3.4 Memory and MMIO

3.4.1 Memory Map

3.4.2 The Memory Hole

3.4.3 MMIO Memory Map

3.5 Special Event Handling

3.5.1 RESET

3.5.2 EXC (Exceptions)

3.5.3 INT and NMI (Maskable and Non-Maskable Interrupts)

3.5.3.1 Interrupt vectors

3.5.3.2 Interrupt modes

3.5.3.3 Device interrupt acknowledge

3.5.3.4 Interrupt priorities

3.5.3.5 Interrupt masking

3.5.3.6 Software interrupts and acknowledgment

3.5.3.7 NMI sequentialization

3.5.3.8 Interrupt source assignment

3.6 PNX1300 to Host Interrupts

3.7 Host to PNX1300 Interrupts

3.8 Timers

3.9 Debug Support

3.9.1 Instruction Breakpoints

3.9.2 Data Breakpoints

4 Custom Operations for Multimedia

4.1 Custom Operations Overview

4.1.1 Custom Operation Motivation

4.1.2 Introduction to Custom Operations

4.1.3 Example Uses of Custom Ops

4.2 Example 1: Byte-Matrix Transposition

4.3 Example 2: MPEG Image Reconstruction

4.4 Example 3: Motion-Estimation Kernel

4.4.1 A Simple Transformation

4.4.2 More Unrolling

5 Cache Architecture

5.1 Memory System Overview

5.2 DRAM Aperture

5.3 Data Cache


5.3.1 General Cache Parameters

5.3.2 Address Mapping

5.3.3 Miss Processing Order

5.3.4 Replacement Policies, Coherency

5.3.5 Alignment, Partial-Word Transfers, Endian-ness

5.3.6 Dual Ports

5.3.7 Cache Locking

5.3.8 Memory Hole and PCI Aperture Disable

5.3.9 Non-cacheable Region

5.3.10 Special Data Cache Operations

5.3.10.1 Copyback and invalidate operations

5.3.10.2 Data cache tag and status operations

5.3.10.3 Data cache allocation operation

5.3.10.4 Data cache prefetch operation

5.3.11 Memory Operation Ordering

5.3.12 Operation Latency

5.3.13 MMIO Register References

5.3.14 PCI Bus References

5.3.15 CPU Stall Conditions

5.3.16 Data Cache Initialization

5.4 Instruction Cache

5.4.1 General Cache Parameters

5.4.2 Address Mapping

5.4.3 Miss Processing Order

5.4.4 Replacement Policy

5.4.5 Location of Program Code

5.4.6 Branch Units

5.4.7 Coherency: Special iclr Operation

5.4.8 Reading Tags and Cache Status

5.4.9 Cache Locking

5.4.10 Instruction Cache Initialization and Boot Sequence

5.5 LRU Algorithm

5.5.1 Two-Way Algorithm

5.6 Cache Coherency

5.6.1 Example 1: Data-Cache/Input-Unit Coherency

5.6.2 Example 2: Data-Cache/Output-Unit Coherency

5.6.3 Example 3: Instruction-Cache/Data-Cache Coherency

5.6.4 Example 4: Instruction-Cache/Input-Unit Coherency

5.6.5 Four-Way Algorithm

5.6.6 LRU Initialization


5.6.7 LRU Bit Definitions

5.6.8 LRU for the Dual-Ported Cache

5.7 Performance Evaluation Support

5.8 MMIO Register Summary

6 Video In

6.1 Video In Overview

6.1.1 Interface

6.1.2 Diagnostic Mode

6.1.3 Power Down and Sleepless

6.1.4 Hardware and Software Reset

6.2 Clock Generator

6.3 Fullres Capture Mode

6.4 Halfres Capture Mode

6.5 Raw Capture Modes

6.6 Message-Passing Mode

6.6.1 VI_DVALID in Message Passing Mode

6.7 Highway Latency and HBE

7 Enhanced Video Out

7.1 Enhanced Video Out Summary

7.2 About This Document

7.3 Backward Compatibility

7.4 Function Summary

7.4.1 Detailed Feature Descriptions

7.4.2 Summary of Operation

7.5 Interface

7.6 Block Diagram

7.7 Clock System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-3

7.8 Image Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-4

7.8.1 CCIR 656 Pixel Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-4

7.8.2 CCIR 656 Line Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-4

7.8.3 SAV and EAV Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-5

7.8.4 Video Clipping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-6

7.8.5 CCIR 656 Frame Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-6

7.9 Enhanced Video Out Timing Generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-6

7.9.1 Active Video Area . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-6

7.9.2 SAV and EAV Overlap Period . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-7

7.9.3 Control of Frame and Image Counters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-7

7.9.4 Horizontal and Frame Timing Signals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-7

7.10 Genlock Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-8

7.11 Data Transfer Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-9

7.12 Image Data Memory Formats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-9

7.12.1 Video Image Formats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-9

7.12.2 Planar Storage of Video Image Data in Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-10

7.12.3 Graphics Overlay Image Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-10

7.13 Video Image Conversion Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-10

7.13.1 YUV 4:2:2 Interspersed to YUV 4:2:2 Co-sited Conversion . . . . . . . . . . . . . . . . . . . . . . . . . . 7-11

7.13.2 YUV 4:2:0 to YUV 4:2:2 Co-sited Conversion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-11

7.13.3 YUV-2x Upscaling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-11

7.13.4 Pixel Mirroring for Four-tap Filters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-11

7.14 EVO Operating Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-13

7.15 Video Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-13

7.15.1 Alpha Blending . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-13

7.15.2 Chroma Keying . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-14

7.15.3 Programmable Clipping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-14

7.16 MMIO Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-14

7.16.1 VO Status Register (VO_STATUS) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-16

7.16.2 VO Control Register (VO_CTL) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-17

7.16.3 VO-Related Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-18

7.16.4 EVO Control Register (EVO_CTL) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-20

7.16.5 EVO-Related Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-21

7.17 Enhanced Video Out Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-21

7.17.1 Video Refresh Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-21

7.18 Frame and field timing control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-23

7.18.1 Recommended values for timing registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-23

7.18.2 Data-transfer Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-23

7.18.3 Interrupts and Error Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-23

7.18.4 Latency and Bandwidth Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-24

7.18.5 Power Down and Sleepless . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-24

7.19 DDS and PLL Filter Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-25

8 Audio In

8.1 Audio In Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-1

8.2 External Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-1

8.3 Clock System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-2

8.3.1 PNX1300 Improved Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-2

8.3.2 TM-1000 Compatibility Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-2

8.4 Clock System Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-2

8.5 Serial Data Framing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-3

8.6 Memory Data Formats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-4

8.7 Audio In Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-6

8.8 Power Down and Sleepless . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-7

8.9 Highway Latency and HBE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-7

8.10 Error Behavior . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-7

8.11 Diagnostic Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-7

9 Audio Out

9.1 Audio Out Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-1

9.2 External Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-1

9.3 Summary of Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-2

9.4 Internal Clock Source . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-3

9.4.1 PNX1300 Standard Improved Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-3

9.4.2 TM-1000 Compatibility Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-4

9.5 Clock System Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-4

9.6 Serial Data Framing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-4

9.6.1 Serial Frame Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-5

9.6.2 I2S Serial Framing Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-6

9.7 Codec Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-6

9.8 Memory Data Formats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-7

9.9 Audio Out Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-8

9.10 Interrupts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-9

9.11 Timestamp . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-10

9.12 Power Down and Sleepless . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-10

9.13 Highway Latency and HBE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-10

9.14 Error Behavior . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-11

10 SPDIF Out

10.1 SPDIF Out Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-1

10.2 External Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-1

10.3 Summary of Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-1

10.3.1 SPDIF Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-1

10.3.2 Transparent DMA Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-1

10.4 IEC-958 Serial Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-2

10.5 IEC-958 Bit Cell and Pre-amble . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-2

10.6 IEC-958 Parity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-3

10.7 IEC-958 Memory Data Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-3

10.8 Sample Rate Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-3

10.9 Transparent Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-4

10.10 DMA Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-4

10.11 DMA Error Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-4

10.12 Interrupts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-4

10.13 Timestamps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-4

10.14 MMIO Register Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-5

10.15 RESET . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-6

10.16 Power Down and Sleepless . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-6

10.17 HBE and Highway Latency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-6

10.18 Literature References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-7

11 PCI Interface

11.1 PCI Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-1

11.2 PCI Interface as an Initiator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-2

11.2.1 DSPCPU Single-Word Loads/Stores . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-2

11.2.2 I/O Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-2

11.2.3 Configuration Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-2

11.2.4 DMA Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-2

11.3 PCI Interface as a Target . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-3

11.4 Transaction Concurrency, Priorities, and Ordering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-3

11.5 Registers Addressed in PCI Configuration Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-3

11.5.1 Vendor ID Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-3

11.5.2 Device ID Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-3

11.5.3 Command Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-3

11.5.4 Status Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-5

11.5.5 Revision ID Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-6

11.5.6 Class Code Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-6

11.5.7 Cache Line Size Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-7

11.5.8 Latency Timer Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-7

11.5.9 Header Type Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-7

11.5.10 Built-In Self Test Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-7

11.5.11 Base Address Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-7

11.5.12 Subsystem ID, Subsystem Vendor ID Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-9

11.5.13 Expansion ROM Base Address Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-9

11.5.14 Interrupt Line Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-9

11.5.15 Interrupt Pin Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-9

11.5.16 Max_Lat, Min_Gnt Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-9

11.6 Registers in MMIO Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-9

11.6.1 DRAM_BASE Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-9

11.6.2 MMIO_BASE Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-9

11.6.3 MMIO/DRAM_BASE updates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-10

11.6.4 BIU_STATUS Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-11

11.6.5 BIU_CTL Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-11

11.6.6 PCI_ADR Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-12

11.6.7 PCI_DATA Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-12

11.6.8 CONFIG_ADR Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-12

11.6.9 CONFIG_DATA Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-13

11.6.10 CONFIG_CTL Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-13

11.6.11 IO_ADR Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-13

11.6.12 IO_DATA Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-13

11.6.13 IO_CTL Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-13

11.6.14 SRC_ADR Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-14

11.6.15 DEST_ADR Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-14

11.6.16 DMA_CTL Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-14

11.6.17 INT_CTL Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-15

11.7 PCI Bus Protocol Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-15

11.7.1 Single-Data-Phase Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-16

11.7.2 Multi-Data-Phase Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-16

11.8 Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-17

11.8.1 Bus Locking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-17

11.8.2 No Expansion ROM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-17

11.8.3 No Cacheline Wrap Address Sequence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-17

11.8.4 No Burst for I/O or Configuration Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-17

11.8.5 Word-Only MMIO Register Access . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-17

12 SDRAM Memory System

12.1 New in PNX1300/01/02/11 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-1

12.2 PNX1300 Main Memory Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-1

12.3 Main-Memory Address Aperture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-1

12.4 Memory Devices Supported . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-2

12.4.1 SDRAM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-2

12.4.2 SGRAM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-2

12.5 Memory Granularity and Sizes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-2

12.6 Memory System Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-3

12.6.1 MM_CONFIG Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-3

12.6.2 PLL_RATIOS Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-4

12.7 Memory Interface Pin List . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-5

12.8 Address Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-5

12.8.1 Address Mapping in 32-bit mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-5

12.8.2 Address Mapping in 16-bit mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-6

12.9 Memory Interface and SDRAM Initialization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-6

12.10 On-Chip SDRAM Interleaving . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-6

12.11 Refresh . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-6

12.12 Power-Down Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-7

12.13 Output Driver Capacity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-7

12.14 Signal Propagation Delay Compensation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-7

12.15 Circuit Board Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-7

12.15.1 General Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-7

12.15.2 Specific Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-8

12.15.3 Termination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-8

12.16 Timing Budget . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-8

12.16.1 Main AC Parameter requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-9

12.17 Example Block Diagrams . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-9

12.17.1 Block Diagrams for a 32-bit interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-9

12.17.1.1 16-Mbit Devices or Less . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-9

12.17.1.2 64-Mbit Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-10

12.17.1.3 128-Mbit Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-13

12.17.1.4 256-Mbit Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-16

12.17.2 Block Diagrams for a 16-bit interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-17

13 System Boot

13.1 Boot Sequence Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-1

13.2 Boot Hardware Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-2

13.2.1 Boot Procedure Common to Both Autonomous and Host-Assisted Bootstrap . . . . . . . . . . . . 13-2

13.2.2 Initial DSPCPU Program Load for Autonomous Bootstrap . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-5

13.3 Host-Assisted Boot Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-6

13.3.1 Stage 1: PNX1300 System Boot Hardware . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-6

13.3.2 Stage 2: Host-System PCI Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-6

13.3.3 Stage 3: PNX1300 Driver Executing on the Host . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-6

13.4 Detailed EEPROM Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-7

13.5 EEPROM Access Protocols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-9

14 Image Coprocessor

14.1 Image Coprocessor Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-1

14.2 Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-1

14.2.1 Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-1

14.2.2 Bandwidth . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-1

14.2.3 Image Size and Scaling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-3

14.3 Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-3

14.4 Data Formats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-3

14.4.1 Image Input Formats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-3

14.4.1.1 YUV 4:2:2 Co-Sited . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-3

14.4.1.2 YUV 4:2:2 Interspersed . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-3

14.4.1.3 YUV 4:2:0 XY Interspersed . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-3

14.4.1.4 YUV 4:1:1 Co-Sited . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-3

14.4.2 Image Overlay Formats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-5

14.4.3 Alpha Blending Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-5

14.4.4 Output Formats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-5

14.5 Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-6

14.5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-6

14.5.2 Filtering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-6

14.5.3 Scaling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-6

14.5.4 YUV to RGB Conversion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-9

14.5.5 Overlay and Alpha Blending . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-9

14.5.6 Dithering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-10

14.5.7 Implementation Overview: Horizontal Scaling and Filtering . . . . . . . . . . . . . . . . . . . . . . . . . . 14-11

14.5.7.1 Loading the extra pixels in the filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-12

14.5.7.2 Mirroring pixels at the ends of a line . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-12

14.5.7.3 Horizontal filter SDRAM timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-12

14.5.8 Implementation Overview: Vertical Scaling and Filtering . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-13

14.5.8.1 Mirroring lines at the ends of an image . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-15

14.5.8.2 Vertical filter SDRAM block timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-15

14.5.9 Horizontal Scaling and Filtering for RGB Output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-15

14.5.9.1 YUV sequence counter in YUV 4:2:2 output Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-15

14.5.9.2 PCI output block timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-16

14.6 Operation and Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-16

14.6.1 ICP Register Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-17

14.6.2 Power Down . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-17

14.6.3 ICP Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-18

14.6.4 ICP Microprogram Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-18

14.6.5 ICP Processing Time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-18

14.6.6 Priority Delay and ICP Minimum Bus Bandwidth . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-21

14.6.7 ICP Parameter Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-22

14.6.8 Load Coefficients . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-22

14.6.9 Horizontal Filter - SDRAM to SDRAM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-22

14.6.9.1 Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-22

14.6.9.2 Parameter table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-22

14.6.9.3 Control word format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-23

14.6.10 Vertical Filter - SDRAM to SDRAM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-24

14.6.10.1 Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-24

14.6.10.2 Parameter table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-24

14.6.10.3 Control word format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-25

14.6.11 Horizontal Filter with RGB/YUV Conversion to PCI or SDRAM . . . . . . . . . . . . . . . . . . . . . . 14-25

14.6.11.1 Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-25

14.6.11.2 Parameter table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-26

14.6.11.3 Control word format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-27

15 Variable Length Decoder

15.1 VLD Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-1

15.2 VLD Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-1

15.3 Decoding up to a Slice . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-2

15.4 VLD Input . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-2

15.5 VLD Output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-3

15.5.1 Macroblock Header Output Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-3

15.5.2 Run-Level Output Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-4

15.6 VLD Time Sharing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-4

15.7 MMIO Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-4

15.7.1 VLD Status (VLD_STATUS) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-4

15.7.2 VLD Interrupt Enable (VLD_IMASK) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-4

15.7.3 VLD Control (VLD_CTL) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-5

15.8 VLD DMA Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-5

15.8.1 DMA Input . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-5

15.8.2 Macroblock Header Output DMA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-5

15.8.3 Run-Level Output DMA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-5

15.9 VLD Operational Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-7

15.9.1 VLD Command (VLD_COMMAND) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-7

15.9.2 VLD Shift Register (VLD_SR) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-7

15.9.3 VLD Quantizer Scale (VLD_QS) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-7

15.9.4 VLD Picture Info (VLD_PI) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-8

15.10 Error Handling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-8

15.11 Interrupt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-8

15.12 RESET . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-8

15.13 Endian-ness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-8

15.14 Power Down . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-8

15.15 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-8

16 I2C Interface

16.1 I2C Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-1

16.2 Compared to TM-1000 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-1

16.3 External Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-1

16.4 I2C Register Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-1

16.4.1 IIC_AR Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-1

16.4.2 IIC_DR Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-2

16.4.3 IIC_SR Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-3

16.4.4 IIC_CR Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-4

16.5 I2C Software Operation Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-5

16.6 I2C Hardware Operation Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-5

16.6.1 Slave NAK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-6

16.7 I2C Clock Rate Generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-7

17 Synchronous Serial Interface

17.1 Synchronous Serial Interface Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-1

17.2 Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-1

17.3 Block Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-1

17.3.1 General Purpose I/O . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-2

17.3.2 Frame Synchronization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-3

17.3.3 SSI Transmit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-3

17.3.4 SSI Receive . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-3

17.4 SSI Transmit operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-5

17.4.1 Setup SSI_CTL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-5

17.4.2 Operation Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-5

17.4.3 Interrupt and Status . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-5

17.5 SSI Receive Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-6

17.5.1 Setup SSI_CTL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-6

17.5.2 Operation Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-6

17.5.3 Interrupt and Status . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-6

17.6 Frame Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-6

17.7 Interrupt Generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-7

17.8 16-bit Endian-ness and Shift Direction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-7

17.9 SSI Test Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-8

17.9.1 Remote Loopback . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-8

17.9.2 Local Loopback . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-8

17.10 MMIO Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-8

17.10.1 SSI Control Register (SSI_CTL) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-9

17.10.2 SSI Control/Status Register (SSI_CSR) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-11

17.11 Timing Diagrams . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-12

17.12 Power Down . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-12

18 JTAG Functional Specification

18.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-1

18.2 Test Access Port (TAP) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-1

18.2.1 TAP Controller . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-1

18.2.2 PNX1300 JTAG Instruction Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-2

18.3 Using JTAG for PNX1300 Debug . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-3

18.3.1 JTAG Instruction and Data Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-4

18.3.2 JTAG Communication Protocol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-5

18.3.3 Example Data Transfer Via JTAG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-5

18.3.3.1 Transferring data to TriMedia via JTAG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-5

18.3.3.2 Transferring data from TriMedia via JTAG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-6

18.3.4 JTAG Interface Module . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-6

19 On-Chip Semaphore Assist Device

19.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-1

19.2 SEM Device Specification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-1

19.3 Constructing a 12-Bit ID . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-1

19.4 Which SEM to Use . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-1

19.5 Usage Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-1

20 Arbiter

20.1 Arbiter Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-1

20.2 Dual Priorities with Priority Raising Mechanism . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-1

20.3 Round Robin Arbitration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-2

20.3.1 Weighted Round Robin Arbitration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-2

20.3.2 Arbitration Levels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-3

20.4 Arbiter Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-4

20.5 Arbiter programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-5

20.5.1 Latency Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-5

20.5.2 Bandwidth Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-6

20.6 Extended Behavior Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-7

20.6.1 Extended Bandwidth Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-7

20.6.2 Extended Latency Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-7

20.6.3 Raising Priority . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-8

20.6.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-8

21 Power Management

21.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21-1

21.2 Entering and Exiting Global Power Down Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21-1

21.3 Effect Of Global Power Down On Peripherals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21-1

21.4 Detailed Sequence of Events For Global Power Down . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21-2

21.5 MMIO Register POWER_DOWN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21-2

21.6 Block Power Down . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21-2

22 PCI-XIO External I/O Bus

22.1 Summary Functionality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-1

22.1.1 Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-1

22.2 Block Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-3

22.3 Data Formats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-5

22.4 Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-5

22.4.1 PCI-XIO Bus Interface Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-5

22.4.1.1 Flash EEPROM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-6

22.4.1.2 68K Bus I/O device . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-6

22.4.1.3 x86/ISA Bus I/O device . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-6

22.4.1.4 Multiple Flash EEPROM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-6

22.5 XIO_CTL MMIO Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-7

22.5.1 PCI_CLK Bus Clock Frequency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-7

22.5.2 Wait State Generator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-8

22.6 PCI-XIO Bus Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-8

22.7 PCI-XIO Bus Controller Operation and Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-12

A DSPCPU Operations

A.1 Alphabetic Operation List . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-1

A.2 Operation List By Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-2

alloc . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-4

allocd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-5

allocr . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-6

allocx . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-7

asl . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-8

asli . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-9

asr . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-10

asri . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-11

bitand . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-12

bitandinv . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-13

bitinv . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-14

bitor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-15

bitxor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-16

borrow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-17

carry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-18

curcycles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-19

cycles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-20

dcb . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-21

dinvalid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-22

dspiabs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-23

dspiadd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-24

dspidualabs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-25

dspidualadd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-26

dspidualmul . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-27

dspidualsub . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-28

dspimul . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-29

dspisub . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-30

dspuadd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-31

dspumul . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-32

dspuquadaddui . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-33

dspusub . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-34

dualasr . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-35

dualiclipi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-36

dualuclipi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-37

fabsval . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-38

fabsvalflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-39

fadd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-40

faddflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-41

fdiv . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-42

fdivflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-43

feql . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-44

feqlflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-45

fgeq . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-46

fgeqflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-47

fgtr . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-48

fgtrflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-49

fleq . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-50

fleqflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-51

fles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-52

flesflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-53

fmul . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-54

fmulflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-55

fneq . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-56

fneqflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-57

fsign . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-58

fsignflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-59

fsqrt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-60

fsqrtflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-61

fsub . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-62

fsubflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-63

funshift1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-64

funshift2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-65

funshift3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-66

h_dspiabs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-67

h_dspidualabs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-68

h_iabs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-69

h_st16d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-70

h_st32d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-71

h_st8d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-72

hicycles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-73

iabs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-74

iadd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-75

iaddi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-76

iavgonep . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-77

ibytesel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-78

iclipi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-79

iclr . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-80

ident . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-81

ieql . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-82

ieqli . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-83

ifir16 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-84

ifir8ii . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-85

ifir8ui . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-86

ifixieee . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-87

ifixieeeflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-88

ifixrz . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-89

ifixrzflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-90

iflip . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-91

ifloat . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-92

ifloatflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-93

ifloatrz . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-94

ifloatrzflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-95

igeq . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-96

igeqi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-97

igtr . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-98

igtri . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-99

iimm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-100

ijmpf . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-101

ijmpi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-102

ijmpt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-103

ild16 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-104

ild16d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-105

ild16r . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-106

ild16x . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-107

ild8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-108

ild8d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-109

ild8r . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-110

ileq . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-111

ileqi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-112

iles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-113

ilesi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-114

imax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-115

imin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-116

imul . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-117

imulm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-118

ineg . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-119

ineq . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-120

ineqi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-121

inonzero . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-122

isub . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-123

isubi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-124

izero . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-125

jmpf . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-126

jmpi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-127

jmpt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-128

ld32 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-129

ld32d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-130

ld32r . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-131

ld32x . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-132

lsl . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-133

lsli . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-134

lsr . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-135

lsri . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-136

mergedual16lsb . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-137

mergelsb . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-138

mergemsb . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-139

nop . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-140

pack16lsb . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-141

pack16msb . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-142

packbytes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-143

pref . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-144

pref16x . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-145

pref32x . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-146

prefd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-147

prefr . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-148

quadavg . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-149

quadumax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-150

quadumin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-151

quadumulmsb . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-152

rdstatus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-153

rdtag . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-154

readdpc . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-155

readpcsw . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-156

readspc . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-157

rol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-158

roli . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-159

sex16 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-160

sex8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-161

st16 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-162

st16d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-163

st32 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-164

st32d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-165

st8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-166

st8d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-167

ubytesel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-168

uclipi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-169

uclipu . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-170

ueql . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-171

ueqli . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-172

ufir16 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-173

ufir8uu . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-174

ufixieee . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-175

ufixieeeflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-176

ufixrz . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-177

ufixrzflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-178

ufloat . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-179

ufloatflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-180

ufloatrz . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-181

ufloatrzflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-182

ugeq . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-183

ugeqi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-184

ugtr . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-185

ugtri . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-186

uimm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-187

uld16 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-188

uld16d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-189

uld16r . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-190

uld16x . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-191

uld8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-192

uld8d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-193

uld8r . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-194

uleq . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-195

uleqi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-196

ules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-197

ulesi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-198

ume8ii . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-199

ume8uu . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-200

umin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-201

umul . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-202

umulm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-203

uneq . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-204

uneqi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-205

writedpc . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-206

writepcsw . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-207

writespc . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-208

zex16 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-209

zex8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-210

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-212

B MMIO Register Summary

B.1 MMIO Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-1

C Endian-ness

C.1 Purpose . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-1

C.2 Little and Big Endian Addressing Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-1

C.3 Test to Verify the Correct Operation of PNX1300 in Big and Little Endian Systems . . . . . . . . . . . . . . C-2

C.4 Requirement for the PNX1300 to Operate in Either Little Endian or Big Endian Mode . . . . . . . . . . . . C-2

C.4.1 Data Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-2

C.4.2 Instruction Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-3

C.4.3 PNX1300 PCI Interface Unit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-3

C.4.4 Image Coprocessor (ICP) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-3

C.4.5 Video In (VI) and Video Out (VO) Units . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-7

C.4.6 Audio In (AI), Audio Out (AO), and SPDIF Out (SDO) Units . . . . . . . . . . . . . . . . . . . . . . . . . . C-7

C.4.7 Variable Length Decoder (VLD) Unit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-7

C.4.8 Synchronous Serial Interface (SSI) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-8

C.4.9 Compiler . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-9

C.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-9

C.6 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-9

Index

Pin List Chapter 1

by John Chang, Wenyi Song, Thorwald Rabeler, Luis Lucas

1.1 PNX1300 SERIES VERSUS TM-1300

The following summarizes differences between TM-1300 and PNX1300/01/02/11:

• Lower core voltage for PNX1311 (2.2V core voltage) and therefore lower power consumption.

• DSPCPU speed of up to 200 MHz.

• SDRAM speed of up to 183 MHz.

• Support for 256-Mbit SDRAM organized in x16. The REFRESH counter must be changed. Refer to Chapter 12, “SDRAM Memory System”, for details.

• Support for 16- and 32-bit Main Memory Interface.

• Simplified power supplies sequencing (see Section 1.9.4).

• Additional mode where VI_DATA[9:8] in message passing mode are not affected by the VI_DVALID signal.

• Bug fixed for PCI Special Cycles. PNX1300 Series discards PCI Special Cycles issued by some PCI chipsets.

• Autonomous boot bug in non 1:1 ratio is fixed, resulting in 2KB boot EEPROM size for all CPU:SDRAM ratios.

In this document, ‘PNX1300 Series’ is used interchangeably with ‘PNX1300/01/02/11’, and it always refers to

PNX1300, PNX1301, PNX1302 and PNX1311 products. Any exception will be noted.

1.2 BOUNDARY SCAN NOTICE

PNX1300 Series implements full IEEE 1149.1 boundary scan. Any PNX1300 Series pin designated “IN” only (from a

functionality point of view) can become an output during boundary scan.

1.3 I/O CIRCUIT SUMMARY

PNX1300 Series has a total of 169 functional pins, excluding VDDQ, VSSQ, VREF_PCI and VREF_PERIPH and digital

power/ground. PNX1300 Series uses the types of I/O circuits shown in the table below.

For the pins with 5-V input capability, the special pins VREF_PCI or VREF_PERIPH determine 3.3- or 5-V input tolerance, as per the table in Section 1.6. The pad types are used in the modes listed in the modes table.

Unused pins may remain floating, i.e. unconnected.

All pins that drive a clock should drive a series resistor.

Pad Type   Description
PCI        PCI 2.1 compliant I/O, capable of using 3.3-V or 5-V PCI signaling conventions.
PCIOD      PCI 2.1 compliant open-drain I/O, capable of using 3.3-V or 5-V PCI signaling conventions.
IICOD      Open-drain 3.3-V or 5-V I2C I/O (for I2C pins).
STRG3      3.3-V only low-impedance I/O. Requires a board-level 27-33 ohm series terminator resistor to match a 50-ohm PCB trace.
NORM3      3.3-V only I/O circuit with regular drive strength and board-trace-matched drive impedance.
STRG5      3.3-V low-impedance output, combined with 5-V tolerant input. If used as output, it requires a board-level 27-33 ohm series terminator resistor to match a 50-ohm PCB trace.
WEAK5      3.3-V regular-impedance output, with slow rise/fall, combined with 5-V tolerant input.

Mode   Description
IN     Input only, except during boundary scan
OUT    Output only, except during boundary scan
OD     Open drain output - active pull low, no active drive high, requires external pull-up
I/O    Output or input
I/OD   Open drain output with input - active pull low, no active drive high, requires external pull-up
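As an illustration of how the pad type and mode information above might be captured for a board-level design check, the following Python sketch encodes each pad type and derives two simple rules from the table text. The names `PadType`, `PAD_TYPES`, `needs_series_resistor` and `needs_external_pullup` are hypothetical helpers introduced here; they are not part of any Philips tool. The attribute values are transcribed from the tables, and treating STRG5 as needing a resistor in every mode other than IN is an interpretation of "if used as output".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PadType:
    name: str
    five_volt_input: bool   # table lists 5-V signaling or a 5-V tolerant input
    series_resistor: bool   # table calls for a 27-33 ohm series terminator


# Values transcribed from the pad type table (hypothetical encoding).
PAD_TYPES = {
    "PCI":   PadType("PCI",   True,  False),
    "PCIOD": PadType("PCIOD", True,  False),
    "IICOD": PadType("IICOD", True,  False),
    "STRG3": PadType("STRG3", False, True),
    "NORM3": PadType("NORM3", False, False),
    "STRG5": PadType("STRG5", True,  True),
    "WEAK5": PadType("WEAK5", True,  False),
}

# Modes whose description says "requires external pull-up".
OPEN_DRAIN_MODES = {"OD", "I/OD"}


def needs_external_pullup(mode: str) -> bool:
    """True for the open-drain modes, which have no active drive high."""
    return mode in OPEN_DRAIN_MODES


def needs_series_resistor(pad: str, mode: str) -> bool:
    """True if the pad type table asks for a series terminator resistor."""
    pad_type = PAD_TYPES[pad]
    if pad == "STRG5":
        # STRG5 only needs the resistor when the pin is used as an output.
        return mode != "IN"
    return pad_type.series_resistor
```

For example, `needs_series_resistor("STRG3", "OUT")` evaluates to true, while `needs_series_resistor("STRG5", "IN")` evaluates to false because an input-only STRG5 pin never drives the trace.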

1.4 SIGNAL PIN LIST

In the table below, a pin name ending in a ‘#’ designates an active-low signal (the active state of the signal is a low

voltage level). All other signals have active-high polarity.
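The naming convention can be expressed as a small helper. The sketch below is only an illustration in Python; the function names `is_active_low` and `is_asserted` are hypothetical and do not come from the data book or its software tools.

```python
def is_active_low(pin_name: str) -> bool:
    """A pin name ending in '#' designates an active-low signal."""
    return pin_name.endswith("#")


def is_asserted(pin_name: str, voltage_high: bool) -> bool:
    """Return True if the signal is in its active state.

    voltage_high is the electrical level observed on the pin:
    True for a high voltage level, False for a low one.
    """
    if is_active_low(pin_name):
        return not voltage_high
    return voltage_high
```

With this helper, `is_asserted("TRI_RESET#", False)` is true (reset is active while the pin is low), and `is_asserted("TRI_USERIRQ", True)` is true for an active-high pin driven high.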

Pin Name   BGA Ball   Pad Type   Mode   Description

Main Clock Interface

TRI_CLKIN L20 NORM3 IN Main input clock. The SDRAM clock outputs (MM_CLK0 and MM_CLK1) can be set to

2x or 3x this frequency. The on-chip DSPCPU clock (DSPCPU_CLK) can be set to 1x,

5/4, 4/3, 3/2 or 2x the SDRAM clock frequency. Maximum recommended ppm level is

+/- 100 ppm or lower to improve jitter on generated clocks. Duty cycle should not

exceed 30/70% asymmetry.

The operating limits of the internal PLLs are:

• 27 MHz < Output of the SDRAM PLL < 200 MHz

• 33 MHz < Output of the CPU PLL < 266 MHz

These are not the speed grades of the chips, just the PLL limits.

VDDQ K20 N/A PWR Quiet VDD for the PLL subsystem. This pin should be supplied from VDD through a

low-Q series inductor. It should be bypassed for AC to VSSQ, using a dual capacitor

bypass (hi and low frequency AC bypass).

VSSQ L19 N/A GND Quiet VSS for the PLL subsystem. Should be AC bypassed to VDDQ, but should

otherwise be left DC floating. It is connected on-chip to VSS. No external coil or

other connection to board ground is needed; such a connection would create a ground loop.
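The clock relationships given in the TRI_CLKIN entry above can be checked with a short script. The following Python sketch enumerates the multiplier and ratio combinations that keep both PLL outputs inside the stated operating limits. It assumes that the SDRAM PLL output equals the SDRAM clock and that the CPU PLL output equals DSPCPU_CLK; the data book text does not state this explicitly. The function name `pll_configs` and the constants are hypothetical. The sketch applies only the PLL limits, not the speed grade of a particular chip.

```python
from fractions import Fraction

# Options listed in the TRI_CLKIN description.
SDRAM_MULTIPLIERS = (2, 3)
CPU_RATIOS = (Fraction(1), Fraction(5, 4), Fraction(4, 3),
              Fraction(3, 2), Fraction(2))

# PLL operating limits in MHz (strict inequalities, as in the text).
SDRAM_PLL_MIN, SDRAM_PLL_MAX = 27, 200
CPU_PLL_MIN, CPU_PLL_MAX = 33, 266


def pll_configs(clkin_mhz):
    """List (sdram_mult, cpu_ratio, sdram_mhz, cpu_mhz) within PLL limits."""
    clkin = Fraction(clkin_mhz)
    valid = []
    for mult in SDRAM_MULTIPLIERS:
        sdram = clkin * mult
        if not (SDRAM_PLL_MIN < sdram < SDRAM_PLL_MAX):
            continue  # assumed: SDRAM PLL output is the SDRAM clock
        for ratio in CPU_RATIOS:
            cpu = sdram * ratio
            if CPU_PLL_MIN < cpu < CPU_PLL_MAX:
                valid.append((mult, ratio, float(sdram), float(cpu)))
    return valid


if __name__ == "__main__":
    for mult, ratio, sdram, cpu in pll_configs(50):
        print(f"SDRAM x{mult} = {sdram:.1f} MHz, "
              f"CPU x{ratio} = {cpu:.1f} MHz")
```

For a 50 MHz input, the sketch reports nine combinations; the 3x SDRAM setting with a 2x CPU ratio is rejected because 300 MHz is outside the CPU PLL limit. Some of the remaining combinations may still exceed the speed grade of a given part, so the result is a necessary condition only.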

Miscellaneous System Interface

TRI_RESET# G19 WEAK5 IN PNX1300/01/02/11 RESET input. This pin can be tied to the PCI RST# signal in PCI

bus systems. Upon releasing RESET, PNX1300/01/02/11 initiates its boot protocol.

BOOT_CLK T20 NORM3 IN Used for testing purposes. Must be connected to TRI_CLKIN for normal operation.

TESTMODE P19 NORM3 IN Used for testing purposes. Must be connected to VSS for normal operation.

SCANCPU D20 NORM3 IN Used for testing purposes. Must be connected to VSS for normal operation.

RESERVED1 E19 NORM3 I/O Reserved pin. Has to be left unconnected for normal operation.

RESERVED2 D19 STRG5 I/O Reserved pin. Has to be left unconnected for normal operation.

VREF_PCI F2 N/A PWR VREF_PCI determines the mode of operation of the PCI pins listed in Section 1.6.

VREF_PCI must be connected to 5V for use in a 5-V PCI signaling environment or to

VSS (0 V) for use in 3.3-V PCI signaling environment. The supply to this pin should be

AC bypassed and provide 40 mA of DC sink or source capability. Note that this pin

can not be directly connected to the PCI ‘I/O designated power pins’ in a dual

voltage PCI plug-in card. Board level conversion circuitry is required.

VREF_PERIPH C18 N/A PWR VREF_PERIPH determines the mode of operation of the I/O pins listed in Section 1.6.

VREF_PERIPH should be connected to 5V if any of the listed I/O pins must tolerate
5-V inputs. VREF_PERIPH should be connected to VSS (0 V) if all
listed I/O pins are 3.3-V only inputs. The supply to this pin should be AC bypassed and

provide 40 mA of DC sink or source capability.

TRI_USERIRQ G20 WEAK5 IN General purpose level/edge interrupt input. Vectored interrupt source number 4.

TRI_TIMER_CLK H19 WEAK5 IN External general purpose clock source for timers. Max. 40 MHz.

Philips Semiconductors Pin List

PRELIMINARY SPECIFICATION 1-3

Main Memory Interface

MM_CLK0, MM_CLK1 Y10, W10 STRG3 OUT SDRAM output clock at 2x or 3x TRI_CLKIN frequency. Two identical outputs are
provided to reliably drive several small memory configurations without external glue.
A series terminating resistor close to PNX1300/01/02/11 is required to reduce ringing.
For driving a 50-ohm trace, a resistor of 27 to 33 ohm is recommended. Using
higher-impedance traces for the SDRAM signals is not recommended.

MM_A00

MM_A01

MM_A02

MM_A03

MM_A04

MM_A05

MM_A06

MM_A07

MM_A08

MM_A09

MM_A10

MM_A11

MM_A12

MM_A13

W12

Y12

W11

Y11

V12

Y13

W13

Y14

NORM3 OUT Main memory address bus; used for row and column addresses

WARNING: MM_A[13:11] DO NOT CONNECT DIRECTLY TO SDRAM A[13:11] pins.

Refer to Chapter 12, “SDRAM Memory System” for accurate connection diagrams.

MM_DQ00

MM_DQ01

MM_DQ02

MM_DQ03

MM_DQ04

MM_DQ05

MM_DQ06

MM_DQ07

MM_DQ08

MM_DQ09

MM_DQ10

MM_DQ11

MM_DQ12

MM_DQ13

MM_DQ14

MM_DQ15

MM_DQ16

MM_DQ17

MM_DQ18

MM_DQ19

MM_DQ20

MM_DQ21

MM_DQ22

MM_DQ23

MM_DQ24

MM_DQ25

MM_DQ26

MM_DQ27

MM_DQ28

MM_DQ29

MM_DQ30

MM_DQ31

Y20

V18

W19

W20

U18

V19

V20

T18

W18

V17

Y18

W17

Y17

W16

Y16

V15

NORM3 I/O 32-bit data I/O bus.

The Main Memory Interface unit also supports a 16-bit I/O interface. Refer to Chapter

12, “SDRAM Memory System.”

MM_CKE0

MM_CKE1 Y19

U1 NORM3 OUT Clock enable output to SDRAMs. Two identical outputs are provided in order to
reliably drive several small memory configurations without external glue.

MM_CS0#

MM_CS1#

MM_CS2#

MM_CS3#

U20

U19

NORM3 OUT Chip select for DRAM rank n; active low

In PNX1300/01/02/11 the chip select pins may be used as address pins to support
the 256 Mbit SDRAM device organized in x16. Refer to Chapter 12, “SDRAM Memory
System.”

MM_RAS# W14 NORM3 OUT Row address strobe; active low

MM_CAS# Y15 NORM3 OUT Column address strobe; active low

MM_WE# W15 NORM3 OUT Write enable; active low


MM_DQM0

MM_DQM1

MM_DQM2

MM_DQM3

T19

R18

NORM3 OUT MM_DQ Mask Enable; these are byte enable signals for the 32-bit MM_DQ bus

PCI Interface (Note: current buffer design allows drive/receive from either 3.3 or 5V PCI bus)

PCI_CLK T2 PCI IN All PCI input signals are sampled with respect to the rising edge of this clock. All PCI

outputs are generated based on this clock. Clock is required for normal operation of

the PCI block.

PCI_AD00

PCI_AD01

PCI_AD02

PCI_AD03

PCI_AD04

PCI_AD05

PCI_AD06

PCI_AD07

PCI_AD08

PCI_AD09

PCI_AD10

PCI_AD11

PCI_AD12

PCI_AD13

PCI_AD14

PCI_AD15

PCI_AD16

PCI_AD17

PCI_AD18

PCI_AD19

PCI_AD20

PCI_AD21

PCI_AD22

PCI_AD23

PCI_AD24

PCI_AD25

PCI_AD26

PCI_AD27

PCI_AD28

PCI_AD29

PCI_AD30

PCI_AD31

PCI I/O Multiplexed address and data.

PCI_C/BE#0

PCI_C/BE#1

PCI_C/BE#2

PCI_C/BE#3

PCI I/O Multiplexed bus commands and byte enables. High for command, low for byte enable.

PCI_PAR H1 PCI I/O Even parity across AD and C/BE lines.

PCI_FRAME# E2 PCI I/O Sustained tri-state. Frame is driven by a master to indicate the beginning and duration

of an access.

PCI_IRDY# E1 PCI I/O Sustained tri-state. Initiator Ready indicates that the bus master is ready to complete

the current data phase.

PCI_TRDY# F3 PCI I/O Sustained tri-state. Target Ready indicates that the bus target is ready to complete the
current data phase.

PCI_STOP# G2 PCI I/O Sustained tri-state. Indicates that the target is requesting that the master stop the cur-

rent transaction.

PCI_IDSEL A2 PCI IN Used as chip select during configuration read/write cycles.

PCI_DEVSEL# F1 PCI I/O Sustained tri-state. Indicates whether any device on the bus has been selected.

PCI_REQ# B7 PCI I/O Driven by PNX1300/01/02/11 as PCI bus master to request use of the PCI bus.

PCI_GNT# B5 PCI IN Indicates to PNX1300/01/02/11 that access to the bus has been granted.

PCI_PERR# G1 PCI I/O Sustained tri-state. Parity error generated/received by PNX1300/01/02/11.

PCI_SERR# H2 PCI OD System error. This signal is asserted when operating as target and detecting an

address parity error.


PCI_INTA#

PCI_INTB#

PCI_INTC#

PCI_INTD#

PCIOD

PCI

PCIOD

I/OD

I/O/OD

I/OD

• Can operate as input (power up default) or output, as determined by direction con-

trol bits in PCI MMIO register INT_CTL.

• As input, a PCI_INT# pin can be used to receive PCI interrupt requests (normal PCI
use is active low, level sensitive mode, but the VIC can be set to treat these as
positive edge triggered mode). As input, a PCI_INT# pin can also be used as a general
interrupt request pin if not needed for PCI.

• As output, the value of a PCI_INT# can be programmed through PCI MMIO regis-

ters to generate interrupts for other PCI masters.

• Whenever XIO bus functionality is active, PCI_INTB# is a push-pull CMOS I/O pin.

When the XIO bus is not active and regular PCI bus functionality is activated, then

PCI_INTB# has a PCI compatible open drain output.

JTAG Interface (debug access port and 1149.1 boundary scan port)

JTAG_TDI F20 WEAK5 IN JTAG test data input

JTAG_TDO F18 WEAK5 I/O JTAG test data output. This pin can either drive active low, high or float.

JTAG_TCK F19 WEAK5 IN JTAG test clock input

JTAG_TMS E20 WEAK5 IN JTAG test mode select input

Video In

VI_CLK C20 STRG5 I/O • If configured as input (power up default): a positive transition on this incoming video

clock pin samples all other VI_DATA input signals below if VI_DVALID is HIGH. If

VI_DVALID is LOW, VI_DATA is ignored. Clock and data rates of up to 81 MHz are

supported. PNX1300 Series supports an additional mode where VI_DATA[9:8] in
message passing mode are not affected by the VI_DVALID signal; see Section 6.6.1 on
page 6-12.

• If configured as output: programmable output clock to drive an external video A/D

converter. Can be programmed to emit integral dividers of DSPCPU_CLK.

If used as output, a board level 27-33 ohm series resistor is recommended to reduce

ringing.

VI_DVALID A17 WEAK5 IN VI_DVALID indicates that valid data is present on the VI_DATA lines. If HIGH,

VI_DATA will be accepted on the next VI_CLK positive edge. If LOW, no VI_DATA will

be sampled. PNX1300 Series supports an additional mode where VI_DATA[9:8] in

message passing mode are not affected by the VI_DVALID signal, Section 6.6.1 on

page 6-12.

VI_DATA0

VI_DATA1

VI_DATA2

VI_DATA3

VI_DATA4

VI_DATA5

VI_DATA6

VI_DATA7

D18

C19

B20

B19

A20

A19

C17

B18

WEAK5 IN CCIR656 style YUV 4:2:2 data from a digital camera, or general purpose high speed

data input pins. Sampled on VI_CLK if VI_DVALID HIGH.

VI_DATA8

VI_DATA9 A18

B17 WEAK5 IN Extension high speed data input bits to allow use of 10 bit video A/D converters in

raw10 modes. VI_DATA[8] serves as START and VI_DATA[9] as END message input

in message passing mode. Sampled on positive transitions of VI_CLK if VI_DVALID

HIGH. PNX1300 Series supports an additional mode where VI_DATA[9:8] in message

passing mode are not affected by the VI_DVALID signal, Section 6.6.1 on page 6-12.

I2C Interface

IIC_SDA R19 IICOD I/OD I2C serial data

IIC_SCL R20 IICOD I/OD I2C clock

Video Out

VO_DATA0

VO_DATA1

VO_DATA2

VO_DATA3

VO_DATA4

VO_DATA5

VO_DATA6

VO_DATA7

P20

N19

N20

M18

M19

M20

K19

J20

WEAK5 OUT CCIR656 style YUV 4:2:2 digital output data, or general purpose high speed data
output channel. Output changes on positive edge of VO_CLK.


VO_IO1 J18 WEAK5 I/O This pin can function as HS output or as STMSG (Start Message) output.

• If set as HS output, it outputs the horizontal sync signal

• In message passing mode, this pin acts as STMSG output.

VO_IO2 H20 WEAK5 I/O This pin can function as FS (frame sync) input, FS output or as ENDMSG output.

• If set as FS input, it can be set to respond to positive or negative edge transitions.

• If the Video Out (VO) unit operates in external sync mode and the selected transition
occurs, the VO unit sends two fields of video data. Note: this works only once after a
reset.

• In message passing mode, this pin acts as ENDMSG output.

VO_CLK J19 STRG5 I/O The VO unit emits VO_DATA on a positive edge of VO_CLK. VO_CLK can be config-

ured as input (reset default) or output.

• If configured as input: VO_CLK is received from external display clock master cir-

cuitry.

• If configured as output, PNX1300/01/02/11 emits a programmable clock frequency.

The emitted frequency can be set between approx. 4 and 81 MHz with a sub-Hertz

resolution. The clock generated is frequency accurate and has low jitter properties

due to a combination of an on-chip DDS (Direct Digital Synthesizer) and VCO/PLL.

If used as output, a board level 27-33 ohm series resistor is recommended to reduce

ringing.

Audio In (always acts as receiver, but can be master or slave for A/D timing)

AI_OSCLK B15 STRG3 OUT Over-sampling clock. This output can be programmed to emit any frequency up to 40

MHz with a sub-Hertz resolution. It is intended for use as the 256fs or 384fs over sam-

pling clock by external A/D subsystem. A board level 27-33 ohm series resistor is rec-

ommended to reduce ringing.

AI_SCK A16 STRG5 I/O • When the Audio In (AI) unit is programmed as a serial-interface timing slave

(power-up default), AI_SCK is an input. AI_SCK receives the serial bit clock from

the external A/D subsystem. This clock is treated as fully asynchronous to the

PNX1300/01/02/11 main clock.

• When the AI unit is programmed as the serial-interface timing master, AI_SCK is an
output. AI_SCK drives the serial clock for the external A/D subsystem. The
frequency is a programmable integral divisor of the AI_OSCLK frequency.

AI_SCK is limited to 22 MHz. The sample rate of valid samples embedded within the

serial stream is variable. If used as output, a board level 27-33 ohm series resistor is

recommended to reduce ringing.

AI_SD C15 WEAK5 IN Serial data from external A/D subsystem. Data on this pin is sampled on positive or

negative edges of AI_SCK as determined by the CLOCK_EDGE bit in the AI_SERIAL

AI_WS B16 WEAK5 I/O • When the AI unit is programmed as the serial-interface timing slave (power-up

default), AI_WS acts as an input. AI_WS is sampled on the same edge as selected

for AI_SD.

• When Audio In is programmed as the serial-interface timing master, AI_WS acts as

an output. It is asserted on the opposite edge of the AI_SD sampling edge.

AI_WS is the word-select or frame-synchronization signal from/to the external A/D

subsystem.


Audio Out (always acts as sender, but can be master or slave for D/A timing)

AO_OSCLK B14 STRG3 OUT Over sampling clock. This output can be programmed to emit any frequency up to 40

MHz, with a sub-Hertz resolution. It is intended for use as the 256 or 384fs over sam-

pling clock by the external D/A conversion subsystem. A board level 27-33 ohm series

resistor is recommended to reduce ringing.

AO_SCK A14 STRG5 I/O • When the Audio Out (AO) unit is programmed to act as the serial interface timing
slave (power up default), AO_SCK acts as input. It receives the Serial Clock from
the external audio D/A subsystem. The clock is treated as fully asynchronous to the
PNX1300/01/02/11 main clock.

• When the AO unit is programmed to act as serial interface timing master, AO_SCK

acts as output. It drives the serial clock for the external audio D/A subsystem. The

clock frequency is a programmable integral divisor of the AO_OSCLK frequency.

AO_SCK is limited to 22 MHz. The sample rate of valid samples embedded within the

serial stream is variable. If used as output, a board level 27-33 ohm series resistor is

recommended to reduce ringing.

AO_SD1 B13 WEAK5 OUT Serial data to external stereo audio D/A subsystem for first 2 of 8 channels. The timing

of transitions on this output is determined by the CLOCK_EDGE bit in the AO_SERIAL

AO_SD2 A13 WEAK5 OUT Serial data.

AO_SD3 C12 WEAK5 OUT Serial data.

AO_SD4 B12 WEAK5 OUT Serial data.

AO_WS A15 WEAK5 I/O • When the AO unit is programmed as the serial-interface timing slave (power-up

default), AO_WS acts as an input. AO_WS is sampled on the opposite AO_SCK

edge at which AO_SDx are asserted.

• When the AO unit is programmed as serial-interface timing master, AO_WS acts as
an output. AO_WS is asserted on the same AO_SCK edge as AO_SDx.

AO_WS is the word-select or frame-synchronization signal from/to the external D/A

subsystem. Each audio channel receives 1 sample for every WS period.

S/PDIF Output (Output)

SPDO A12 STRG3 OUT Self clocking serial data stream as per IEC958, with 1937 extensions. Note that the

low impedance output buffer requires a 27 to 33 ohm series terminator close to

PNX1300/01/02/11 in order to match the board trace impedance. This series termina-

tor can be/must be part of the voltage divider needed to create the coaxial output

through the AC isolation transformer.

Synchronous Serial Interface (SSI) to an off-chip modem front-end

SSI_CLK B11 WEAK5 IN Clock signal of the synchronous serial interface to an off-chip modem analog frontend

or ISDN terminal adapter; provided by the receive channel of an external communica-

tion device.

SSI_RXFSX A11 WEAK5 IN Receive frame sync reference of the synchronous serial interface, provided by the

receive channel of an external communication device.

SSI_RXDATA A10 WEAK5 IN Receive serial data input; provided by the receive channel of an external communica-

tion device.

SSI_TXDATA B10 WEAK5 OUT Transmit serial data output; sent to the transmit channel of the external communica-

tion device.

SSI_IO1 A9 WEAK5 I/O General purpose programmable I/O. Set to input on power up.

SSI_IO2 B9 WEAK5 I/O General purpose programmable I/O. Set to input on power up. Can also be pro-

grammed to function as the transmit channel frame synchronization reference output.
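The clock relationships given in the TRI_CLKIN entry of the signal pin list can be checked with a few lines of arithmetic. The following is a minimal sketch, not a vendor tool: the function name `clock_plan` and its return layout are hypothetical, and only the multipliers, ratios and PLL limits quoted in that entry are used. As the entry notes, these are PLL limits, not speed grades.

```python
from fractions import Fraction

# Values quoted in the TRI_CLKIN entry of the signal pin list.
SDRAM_MULTIPLIERS = (2, 3)
CPU_RATIOS = (Fraction(1), Fraction(5, 4), Fraction(4, 3), Fraction(3, 2), Fraction(2))
SDRAM_PLL_LIMITS_MHZ = (27, 200)  # exclusive bounds on the SDRAM PLL output
CPU_PLL_LIMITS_MHZ = (33, 266)    # exclusive bounds on the CPU PLL output


def clock_plan(tri_clkin_mhz, sdram_mult, cpu_ratio):
    """Hypothetical helper: return (sdram_mhz, cpu_mhz, within_pll_limits)."""
    cpu_ratio = Fraction(cpu_ratio)
    if sdram_mult not in SDRAM_MULTIPLIERS:
        raise ValueError("SDRAM clock must be 2x or 3x TRI_CLKIN")
    if cpu_ratio not in CPU_RATIOS:
        raise ValueError("CPU:SDRAM ratio must be 1, 5/4, 4/3, 3/2 or 2")
    sdram = Fraction(tri_clkin_mhz) * sdram_mult
    cpu = sdram * cpu_ratio
    within = (SDRAM_PLL_LIMITS_MHZ[0] < sdram < SDRAM_PLL_LIMITS_MHZ[1]
              and CPU_PLL_LIMITS_MHZ[0] < cpu < CPU_PLL_LIMITS_MHZ[1])
    return float(sdram), float(cpu), within


if __name__ == "__main__":
    # 50 MHz input, SDRAM at 3x, CPU at 4/3 of the SDRAM clock.
    print(clock_plan(50, 3, Fraction(4, 3)))
```

A configuration that passes this check can still exceed the speed grade of a given part, so the result is only a first filter.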


1.5 POWER PIN LIST

VSS (ground) VCC (3.3V I/O supply) VDD (2.5V core supply)

C16

D16

D17

E17

E18

T17

U16

U17

V16

H10

H11

H12

H13

J10

J11

J12

J13

K10

K11

K12

K13

L10

L11

L12

L13

M10

M11

M12

M13

N10

N11

N12

N13

C10

C11

C14

D10

D11

D14

D15

F17

G17

G18

K17

K18

L17

L18

P17

P18

R17

U10

U11

U14

U15

V10

V11

V14

C13

D12

D13

H17

H18

J17

M17

N17

N18

U12

U13

V13


1.6 PIN REFERENCE VOLTAGE

With the exception of Open Drain mode outputs, outputs always drive to a level determined by the 3.3-V I/O voltage.

VREF_PERIPH and VREF_PCI purely determine input voltage clamping, not input signal thresholds or output levels.

VREF_PCI determined mode VREF_PERIPH determined mode SDRAM i/f (always 3.3-Volt mode)

PCI_AD00

PCI_AD01

PCI_AD02

PCI_AD03

PCI_AD04

PCI_AD05

PCI_AD06

PCI_AD07

PCI_AD08

PCI_AD09

PCI_AD10

PCI_AD11

PCI_AD12

PCI_AD13

PCI_AD14

PCI_AD15

PCI_AD16

PCI_AD17

PCI_AD18

PCI_AD19

PCI_AD20

PCI_AD21

PCI_AD22

PCI_AD23

PCI_AD24

PCI_AD25

PCI_AD26

PCI_AD27

PCI_AD28

PCI_AD29

PCI_AD30

PCI_AD31

PCI_CLK

PCI_C/BE#0

PCI_C/BE#1

PCI_C/BE#2

PCI_C/BE#3

PCI_PAR

PCI_FRAME#

PCI_IRDY#

PCI_TRDY#

PCI_STOP#

PCI_IDSEL

PCI_DEVSEL#

PCI_REQ#

PCI_GNT#

PCI_PERR#

PCI_SERR#

PCI_INTA#

PCI_INTB#

PCI_INTC#

PCI_INTD#

TRI_RESET#

TRI_USERIRQ

TRI_TIMER_CLK

JTAG_TDI

JTAG_TDO

JTAG_TCK

JTAG_TMS

VI_CLK

VI_DVALID

VI_DATA0

VI_DATA1

VI_DATA2

VI_DATA3

VI_DATA4

VI_DATA5

VI_DATA6

VI_DATA7

VI_DATA8

VI_DATA9

IIC_SDA

IIC_SCL

VO_IO1

VO_IO2

VO_CLK

AI_SCK

AI_SD

AI_WS

AO_SCK

AO_WS

SSI_CLK

SSI_RXFSX

SSI_RXDATA

SSI_IO1

SSI_IO2

RESERVED2

MM_CLK0

MM_CLK1

MM_A00

MM_A01

MM_A02

MM_A03

MM_A04

MM_A05

MM_A06

MM_A07

MM_A08

MM_A09

MM_A10

MM_A11

MM_A12

MM_A13

MM_DQ00

MM_DQ01

MM_DQ02

MM_DQ03

MM_DQ04

MM_DQ05

MM_DQ06

MM_DQ07

MM_DQ08

MM_DQ09

MM_DQ10

MM_DQ11

MM_DQ12

MM_DQM0

MM_DQM1

MM_DQM2

MM_DQM3

MM_DQ13

MM_DQ14

MM_DQ15

MM_DQ16

MM_DQ17

MM_DQ18

MM_DQ19

MM_DQ20

MM_DQ21

MM_DQ22

MM_DQ23

MM_DQ24

MM_DQ25

MM_DQ26

MM_DQ27

MM_DQ28

MM_DQ29

MM_DQ30

MM_DQ31

MM_CKE0

MM_CKE1

MM_CS0#

MM_CS1#

MM_CS2#

MM_CS3#

MM_RAS#

MM_CAS#

MM_WE#

Inputs always in 3.3-V mode Output only pins

TRI_CLKIN

BOOT_CLK

TESTMODE

SCANCPU

RESERVED1

VO_DATA0

VO_DATA1

VO_DATA2

VO_DATA3

VO_DATA4

VO_DATA5

VO_DATA6

VO_DATA7

AI_OSCLK

AO_OSCLK

AO_SD1

AO_SD2

AO_SD3

AO_SD4

SSI_TXDATA

SPDO


1.7 PACKAGE

BGA292: plastic, heatsink ball grid array package; 292 balls; body 27 x 27 x 1.75 mm (SOT553-)

[Package outline drawing: top view with ball A1 index area, balls numbered 1 to 20, and detail X. Dimensions in mm (mm are the original dimensions).]

1.8 ORDERING INFORMATION

1.8.1 Lead Parts: Last time buy for these parts is September 30, 2005:

To order 143-MHz/2.5V product, part number is ‘PNX1300EH’, 12 nc product code 9352 7097 6557. End of Life 09/30/08.

To order 180-MHz/2.5V product, part number is ‘PNX1301EH’, 12 nc product code 9352 7097 9557. End of Life 09/30/08.

To order 200-MHz/2.5V product, part number is ‘PNX1302EH’, 12 nc product code 9352 7098 2557. End of Life 09/30/08.

To order 166-MHz/2.2V product, part number is ‘PNX1311EH’, 12 nc product code 9352 7098 5557. End of Life 09/30/08.

1.8.2 Lead-Free Parts: Available for ordering starting October 1, 2004:

To order 143-MHz/2.5V product, part number is ‘PNX1300EH/G’, 12 nc product code 9352 7771 6557.

To order 180-MHz/2.5V product, part number is ‘PNX1301EH/G’, 12 nc product code 9352 7771 7557.

To order 200-MHz/2.5V product, part number is ‘PNX1302EH/G’, 12 nc product code 9352 7771 8557.

To order 166-MHz/2.2V product, part number is ‘PNX1311EH/G’, 12 nc product code 9352 7772 1557.


1.9 PARAMETRIC CHARACTERISTICS

1.9.1 PNX1300/01/02/11 Absolute Maximum Ratings

Permanent damage may occur if these conditions are exceeded.

Symbol Parameter Min. Max. Units Notes
VDDMAX 2.5-V core supply voltage (PNX1300/01/02/11) -0.5 3.5 V
VCCMAX 3.3-V I/O supply voltage -0.5 4.6 V
VI-5V DC input voltage on all 5-V pins -0.5 VX+0.5 V 1
VI-3.3V DC input voltage on all 3.3-V pins -0.5 VCC+0.3 V
Tstg Storage temperature range -65 150 Deg. C
Tcasemax Maximum case temperature range 0 120 Deg. C
HBMESD Human Body Model Electrostatic handling for all pins - - CLASS 1C 2
MMESD Machine Model Electrostatic handling for all pins - - CLASS A 3

Notes: 1. VX in the 5V mode pin is either VREF_PCI or VREF_PERIPH, see Section 1.6.
2. JEDEC Standard, June 2000
3. JEDEC Standard, October 1997

1.9.2 PNX1300/01/02 Operating Range and Thermal Characteristics

Functional operation, long-term reliability and AC/DC characteristics are guaranteed for the operating conditions below.

Symbol Parameter Minimum Typical Maximum Units
VDD PNX1300/01/02 Core supply voltage 2.375 2.50 2.625 V
VCC I/O supply voltage 3.135 3.30 3.465 V
Tcase Operating case temperature range 0 85 °C
jt junction to case thermal resistance 3.8 °C/W
ja junction to ambient thermal resistance (natural convection) 15 °C/W

1.9.3 PNX1311 Operating Range and Thermal Characteristics

Functional operation, long-term reliability and AC/DC characteristics are guaranteed for the operating conditions below.

Symbol Parameter Minimum Typical Maximum Units
VDD PNX1311 Core supply voltage 2.090 2.20 2.310 V
VCC I/O supply voltage 3.135 3.30 3.465 V
Tcase Operating case temperature range 0 85 °C
jt junction to case thermal resistance 3.8 °C/W
ja junction to ambient thermal resistance (natural convection) 15 °C/W

1.9.4 PNX1300/01/02/11 Power Supply Sequencing

Power application and power removal should obey the following rule:

VDD should never exceed VCC by more than 0.5 V

Permanent damage may occur if this rule is not observed.

Similarly, if the device is operated in 5V Input Tolerant mode, the 5V power supply must be present first:

VDD and VCC should never exceed the 5V reference voltage (VREF_PERIPH and VREF_PCI)

Permanent damage may occur if this rule is not observed.
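The two power supply sequencing rules of Section 1.9.4 can be expressed as a simple check on sampled rail voltages. This is a minimal sketch under stated assumptions: the function name `sequencing_violations` is hypothetical, voltages are in volts, and `vref_5v` is passed only when the device is operated in 5V Input Tolerant mode.

```python
def sequencing_violations(vdd, vcc, vref_5v=None):
    """Hypothetical helper: list which Section 1.9.4 rules a voltage sample breaks.

    vdd, vcc: core and I/O supply voltages in volts.
    vref_5v:  5V reference (VREF_PERIPH / VREF_PCI) in volts, or None when
              the device is not used in 5V Input Tolerant mode.
    """
    problems = []
    # Rule 1: VDD should never exceed VCC by more than 0.5 V.
    if vdd > vcc + 0.5:
        problems.append("VDD exceeds VCC by more than 0.5 V")
    # Rule 2 (5V tolerant mode only): neither supply may exceed the 5V reference.
    if vref_5v is not None:
        if vdd > vref_5v:
            problems.append("VDD exceeds the 5V reference")
        if vcc > vref_5v:
            problems.append("VCC exceeds the 5V reference")
    return problems


if __name__ == "__main__":
    # Samples of a power-up ramp as (VDD, VCC, VREF) tuples; values are illustrative.
    ramp = [(0.0, 0.0, 5.0), (1.0, 1.5, 5.0), (2.5, 3.3, 5.0)]
    for sample in ramp:
        print(sample, sequencing_violations(*sample))
```

Running the check over every sample of a measured power-up and power-down ramp gives a quick indication of whether a board design respects the ordering.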


1.9.5 PNX1300/01/02 DC/AC Characteristics

Symbol Parameter Condition/Notes Min. Max. Units
VDD Core supply voltage 2.375 2.625 V
VCC I/O supply voltage 3.135 3.465 V
IDD-typ Core supply current 200 MHz CPU operation (Max. application) 1400 mA
ICC-typ I/O supply current 183 MHz SDRAM operation (Max. application) 160 mA
IDD-pdn Core supply current CPU power down mode; 200 MHz 300 mA
ICC-pdn I/O supply current CPU power down mode; 183 MHz 50 mA
VIH-5v Input HIGH voltage for I/O-5 V Note 1. All I/Os except IICOD 2.0 VX + 0.5 V
VIH-3.3v Input HIGH voltage for I/O-3.3 V All I/Os except IICOD 2.0 VCC + 0.3 V
VIL-5v Input LOW voltage for I/O-5 V All I/Os except IICOD -0.5 0.8 V
VIL-3.3v Input LOW voltage for I/O-3.3 V All I/Os except IICOD -0.3 0.8 V
IIL-5v Input leakage current for I/O-5 V 0 < VIN < 2.7V -70 70 uA
IIL-3.3v Input leakage current for I/O-3.3 V 0 < VIN < 2.7V -0 10 uA
CIN Input pin capacitance 8 pF

Notes: 1. VX for a 5V mode pin is either VREF_PCI or VREF_PERIPH, see Section 1.6.

1.9.6 PNX1311 DC/AC Characteristics

Symbol Parameter Condition/Notes Min. Max. Units
VDD Core supply voltage 2.090 2.310 V
VCC I/O supply voltage 3.135 3.465 V
IDD-typ Core supply current 166 MHz CPU operation (Max. application) 1110 mA
ICC-typ I/O supply current 166 MHz SDRAM operation (Max. application) 145 mA
IDD-pdn Core supply current CPU power down mode; 166 MHz 215 mA
ICC-pdn I/O supply current CPU power down mode; 166 MHz 46 mA
VIH-5v Input HIGH voltage for I/O-5 V Note 1. All I/Os except IICOD 2.0 VX + 0.5 V
VIH-3.3v Input HIGH voltage for I/O-3.3 V All I/Os except IICOD 2.0 VCC + 0.3 V
VIL-5v Input LOW voltage for I/O-5 V All I/Os except IICOD -0.5 0.8 V
VIL-3.3v Input LOW voltage for I/O-3.3 V All I/Os except IICOD -0.3 0.8 V
IIL-5v Input leakage current for I/O-5 V 0 < VIN < 2.7V -70 70 uA
IIL-3.3v Input leakage current for I/O-3.3 V 0 < VIN < 2.7V -0 10 uA
CIN Input pin capacitance 8 pF

Notes: 1. VX for a 5V mode pin is either VREF_PCI or VREF_PERIPH, see Section 1.6.


1.9.7 PNX1300 Series Power Consumption

The power consumption of PNX1300 Series depends on the activity of the DSPCPU, the number of peripherals being used, the frequency at which the system is running, and the loads on the pins.

The first section presents the power consumption for known applications. The other power related sections present the maximum power consumption. These maximum values are obtained with a ‘fake’ application that turns on all the peripherals and runs compute-intensive code on the CPU.

1.9.7.1 Power Consumption for Applications on PNX1300 Series

Table 1-1 and Table 1-2 present the power consumption for typical applications:

• The DVD playback includes video display using the VO peripheral and audio streaming using the AO peripheral. The bitstream is brought into the TM-1300 system over the PCI peripheral. The VLD co-processor is used to perform the bitstream parsing. The bitstream is not scrambled, so the DVDD co-processor is not used and is turned off.

• The MPEG4 application includes video and audio playback of an encoded CIF stream. The bitstream is brought into the PNX1300 system over the PCI peripheral. The Video and Audio subsystems of the PNX1300 are used to render the video and sound from the decoded stream to the video monitor and speakers.

• The H263 video conferencing application includes the following steps. It captures a CCIR656 video stream at 30 frames/second using the VI peripheral. The incoming video stream is downscaled, on the fly, to SIF resolution by VI. The captured frames are then downscaled to a QSIF resolution using the ICP co-processor. The resulting QSIF image is sent over the PCI bus via the ICP co-processor to a SVGA card (PC monitor display) and encoded by the DSPCPU. The resulting bitstream is then decoded by the DSPCPU and displayed as a SIF image on the same PC monitor (also using the ICP co-processor). All encoding and decoding is done in the YUV color space. The display is in the RGB16 color space. Software is not optimized.

Three main techniques may be applied to reduce the ‘Out of the Box’ power consumption.

• Turn off the unused peripherals. Refer to Section 21.6 on page 21-2.

• Run the system at the required speed, i.e. some applications may not need to run at the full speed grade of the chip.

• Power down the system or the DSPCPU each time the DSPCPU reaches the Idle task.

A more detailed description can be found in the application note ‘TM-1300 Power Saving Features’ available at the following website:

http://www.semiconductors.philips.com/trimedia/

As previously mentioned, Table 1-1 and Table 1-2 show that the final power consumption for a realistic application may be lower than the values reported in the next section.

Based on these results and the following section, the power consumption of PNX1300 Series, using an artificial scenario depicting an extremely demanding application, for commonly used speeds, is as follows:

• PNX1300/01/02 is < 3.4 W @ 166:133 MHz

• PNX1311 is < 2.9 W @ 166:133 MHz

• PNX1302 is < 4.0 W @ 200:133 MHz

Table 1-1. Power Consumption of Example Applications for PNX1300/01/02 (Vdd = 2.5V)

APPLICATIONS       AFTER POWER     WITHOUT POWER     Optimizations
                   OPTIMIZATIONS   OPTIMIZATIONS     Unused Peripherals   System Speed      Idle task power
                                                     Turned Off           Adjustment        management
DVD Playback       2.2 W           3.0 W @ 180 MHz   2.6 W @ 180 MHz      2.6 W @ 180 MHz   2.2 W @ 180 MHz
H.263 Vconf        1.7 W           2.9 W @ 166 MHz   2.7 W @ 166 MHz      1.9 W @ 111 MHz   1.7 W @ 111 MHz

Table 1-2. Power Consumption of Example Applications for PNX1311 (Vdd = 2.2V)

APPLICATIONS       AFTER POWER     WITHOUT POWER     Optimizations
                   OPTIMIZATIONS   OPTIMIZATIONS     Unused Peripherals   System Speed      Idle task power
                                                     Turned Off           Adjustment        management
MPEG4 (CIF) A/V
Playback           1.2 W           2.5 W @ 166 MHz   2.1 W @ 166 MHz      1.3 W @ 70 MHz    1.2 W @ 70 MHz
H.263 Vconf        1.5 W           2.4 W @ 166 MHz   2.2 W @ 166 MHz      1.7 W @ 111 MHz   1.5 W @ 111 MHz
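To put the figures of Table 1-1 and Table 1-2 in perspective, the relative saving from the power optimizations can be computed directly from the "without" and "after" columns. The short sketch below only does that arithmetic; the function name `saving_percent` is hypothetical and the wattages are copied from the two tables.

```python
def saving_percent(before_w, after_w):
    """Relative power saving in percent, going from before_w to after_w."""
    return 100.0 * (before_w - after_w) / before_w


# (without optimizations, after optimizations) in watts, from Tables 1-1 and 1-2.
APPLICATIONS = {
    "DVD Playback, PNX1300/01/02": (3.0, 2.2),
    "H.263 Vconf, PNX1300/01/02": (2.9, 1.7),
    "MPEG4 (CIF) A/V Playback, PNX1311": (2.5, 1.2),
    "H.263 Vconf, PNX1311": (2.4, 1.5),
}

if __name__ == "__main__":
    for name, (before_w, after_w) in APPLICATIONS.items():
        print(f"{name}: {saving_percent(before_w, after_w):.1f} % saved")
```

The applications that can also lower the system clock (H.263 and MPEG4 in these tables) show the largest relative savings.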


1.9.7.2 PNX1300/01/02 DSPCPU Core Current and Power Consumption

Notes: 1. Consumption for PNX1300/01/02 is organized in several categories. The “Typ” column shows current consumption for a typ-

ical application with a CPI (Clocks Per Instruction) of 1.4. The “Max” column provides current consumption for an application

with a CPI of 1.1. The measurements were taken with all the peripheral units turned on (peripherals run on a random data

pattern at the specified frequencies, except for VO which runs at 27 MHz). This “Max” data represents an application that
heavily uses the DSPCPU and does not reflect a realistic application; it is used to determine peak currents. The “Typ”
measurements reflect real applications. The “Pwd” column shows current consumption when Global Powerdown mode is
activated. See Chapter 21, “Power Management.”

2. Standby rows indicate current consumption when DSPCPU is maintained under RESET (See Section 11.6.5, “BIU_CTL

3. Measurements accuracy is +/- 5%. Measurements are done with Vdd set to 2.5V and Vcc set to 3.3V.

4. Currents do not scale with frequency unless the CPU to SDRAM ratio is maintained. As an example, the data for CPU to

SDRAM ratio 1:1 for 183:183 MHz can be calculated by using the data from the 143:143 MHz column, and scaling the cur-

rents by a factor of 1.279.
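Note 4 can be illustrated numerically. The sketch below scales a set of currents measured at one operating point to another operating point with the same CPU to SDRAM ratio, reproducing the factor quoted in the note (183/143). The function name `scale_currents` and the dictionary keys are hypothetical; the 143:143 MHz currents are the IDD values from the PNX1300 column of the table in this section.

```python
def scale_currents(currents_ma, from_mhz, to_mhz):
    """Scale currents linearly with frequency.

    Per note 4 this is only meaningful when the CPU to SDRAM ratio is the
    same at both operating points (for example 143:143 and 183:183).
    """
    factor = to_mhz / from_mhz
    scaled = {name: value * factor for name, value in currents_ma.items()}
    return factor, scaled


if __name__ == "__main__":
    measured_143 = {"IDD_typ": 1125, "IDD_max": 1200}  # mA at 143:143 MHz
    factor, estimate_183 = scale_currents(measured_143, 143, 183)
    print(f"factor = {factor:.3f}")
    for name, value in estimate_183.items():
        print(f"{name} at 183:183 MHz is roughly {value:.0f} mA")
```

The result is an estimate only; the measurement accuracy stated in note 3 applies to the source values as well.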

1.9.7.3 PNX1311 DSPCPU Core Current and Power Consumption Details

Notes: 1. Consumption for PNX1311 is organized in several categories. The “Typ” column shows current consumption for a typical

application with a CPI (Clocks Per Instruction) of 1.4. The “Max” column provides current consumption for an application with

a CPI of 1.1. The measurements were taken with all the peripheral units turned on (peripherals run on a random data pattern

at the specified frequencies, except for VO which runs at 27 MHz). This “Max” data represents an application that heavily uses

the DSPCPU and does not reflect a realistic application; it is used to determine peak currents. The “Typ” measurements

reflect real applications. The “Pwd” column shows current consumption when Global Powerdown mode is activated. See

Chapter 21, “Power Management.”

2. Standby rows indicate current consumption when DSPCPU is maintained under RESET (See Section 11.6.5, “BIU_CTL

3. Measurements accuracy is +/- 5%. Measurements are done with Vdd set to 2.2V and Vcc set to 3.3V.

4. Currents do not scale with frequency unless the CPU to SDRAM ratio is maintained.

                                    PNX1300 143:143    PNX1301 166:133    PNX1302 192:144    PNX1302 200:133
Symbol      Current/Notes           Pwd   Typ   Max    Pwd   Typ   Max    Pwd   Typ   Max    Pwd   Typ   Max    Units
PNX130x     IDD                     225   1125  1200   250   1200  1300   300   1380  1475   300   1400  1525   mA
(note 1)    ICC                     40    125   135    40    120   135    40    130   135    36    125   130    mA
            Total Power Dissipation 0.8   3.2   3.5    0.8   3.4   3.7    0.9   3.9   4.1    0.9   4.0   4.2    W
            IDD, DSPCPU Only        -     820   920    -     900   1030   -     1030  1200   -     1050  1250   mA
            ICC, DSPCPU Only        -     55    45     -     50    45     -     55    45     -     55    45     mA
            Power DSPCPU Only       -     2.2   2.5    -     2.4   2.7    -     2.8   3.1    -     2.8   3.3    W
PNX130x     IDD, Standby            -     550   -      -     615   -      -     720   -      -     740   -      mA
(note 1,2)  Power Standby           -     1.5   -      -     1.7   -      -     1.9   -      -     2.0   -      W
            IDD, Standby + bpwd     -     405   -      -     450   -      -     525   -      -     540   -      mA
            Power Standby + bpwd    -     1.1   -      -     1.2   -      -     1.4   -      -     1.5   -      W

                               PNX1311 100:100   PNX1311 143:143   PNX1311 166:166   PNX1311 166:133
Symbol, Current/Notes          Pwd   Typ   Max   Pwd   Typ   Max   Pwd   Typ   Max   Pwd   Typ   Max  Units
PNX131x (note 1)
  IDD                          129   670   720   185   955  1025   215  1110  1200   200  1032  1100  mA
  ICC                           28    87   100    40   125   140    46   145   170    37   123   130  mA
  Total Power Dissipation      0.4   1.8   1.9   0.5   2.5   2.7   0.6   2.9   3.2   0.6   2.7   2.9  W
  IDD, DSPCPU Only               -   490   550     -   700   785     -   815   915     -   756   880  mA
  ICC, DSPCPU Only               -    38    31     -    55    45     -    65    55     -    50    45  mA
  Power DSPCPU Only              -   1.2   1.3     -   1.7   1.9     -   2.0   2.2     -   1.8   2.1  W
PNX131x (note 1,2)
  IDD, Standby                   -   325     -     -   460     -     -   535     -     -   518     -  mA
  Power Standby                  -   0.8     -     -   1.1     -     -   1.3     -     -   1.3     -  W
  IDD, Standby + bpwd            -   240     -     -   340     -     -   395     -     -   375     -  mA
  Power Standby + bpwd           -   0.6     -     -   0.9     -     -   1.0     -     -   0.9     -  W
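The power rows in the two tables above appear to follow from the current rows as P = IDD x Vdd + ICC x Vcc. The sketch below is only an illustration of that arithmetic, not a formula stated in the data book. The function name and the choice of core voltage are assumptions: 2.5 V is assumed for the PNX1300/01/02 entries and 2.2 V for the PNX1311 entries, because those values reproduce the tabulated “Typ” figures after rounding to one decimal place.

```python
def total_power_w(idd_ma, icc_ma, vdd_v, vcc_v=3.3):
    """Estimate total dissipation in watts from supply currents given in mA.

    idd_ma: core supply current, icc_ma: I/O supply current.
    vdd_v:  core supply voltage (assumed value, see the text above).
    vcc_v:  I/O supply voltage (3.3 V per the table notes).
    """
    return (idd_ma * vdd_v + icc_ma * vcc_v) / 1000.0


# "Typ" column, PNX1300 143:143, assuming a 2.5 V core supply
pnx1300_typ = round(total_power_w(1125, 125, 2.5), 1)

# "Typ" column, PNX1311 100:100, assuming a 2.2 V core supply
pnx1311_typ = round(total_power_w(670, 87, 2.2), 1)

print(pnx1300_typ, pnx1311_typ)
```

Not every entry matches exactly when recomputed this way (some “Max” entries differ in the last digit), so the tabulated values remain the reference.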


1.9.7.4 PNX1300/01/02 Current Consumption For On-Chip Peripherals

Notes: 1. Pwd. column for peripheral units indicates current savings when block powerdown is activated compared to when it is idle.

See Chapter 21, “Power Management” for block powerdown activation.

2. Typ. column for peripheral units indicates current required when data pattern is random. The Max. column indicates current

ratings when data is switching from high to low level each cycle. Again that Max. column is to show peak current and does

not represent a real application. For both columns the current reported is the current required by the peripheral as well as

the internal bus and MMI to transfer the data to/from the peripheral unit.

3. Some currents are not reported, either because they are difficult to measure or because they are not relevant. For example, SSI current is difficult to measure because the SSI heavily involves the DSPCPU, which makes it almost impossible to separate the current consumed by the SSI from that consumed by the DSPCPU.

4. Measurements accuracy is +/- 5%. Measurements are done with Vdd set to 2.5V and Vcc set to 3.3V.

5. Currents do not scale with frequency if the CPU:SDRAM ratios differ; the same ratio must be used.

                                           PNX1300 143:143   PNX1301 166:133   PNX1302 192:144   PNX1302 200:133
Symbol        Current/Notes                Pwd   Typ   Max   Pwd   Typ   Max   Pwd   Typ   Max   Pwd   Typ   Max  Units
27 MHz        IDD, running raw mode         50    28    39    55    29    38    65    16    26    72    27    36  mA
              ICC, running raw mode          -     9    17     -    12    17     -    12    17     -    12    17  mA
81 MHz        IDD, running raw mode          -    23    75     -    33    54     -    30    58     -    47    72  mA
              ICC, running raw mode          -    33    51     -    37    51     -    36    52     -    36    52  mA
27 MHz        IDD, running raw mode          6     8    18     6     6    18     7     8    18     7     6    18  mA
              ICC, running raw mode          -     7    14     -     6    14     -     8    15     -     9    15  mA
44 KHz        IDD, stereo 16-bit             2     3     1     1     3     1     1     3     4     5     3     3  mA
              ICC, stereo 16-bit             -     2     1     -     1     1     -     1     1     -     1     1  mA
44 KHz        IDD, stereo 16-bit             1     2     2     1     3     3     1     3     2     1     3     3  mA
              ICC, stereo 16-bit             -     1     1     -     1     1     -     1     1     -     1     1  mA
SPDIF 48 KHz  IDD running PCM audio          2     3     2     2     3     1     3     3     3     4     2     2  mA
              ICC running PCM audio          -     3     3     -     2     2     -     2     2     -     2     2  mA
ICP           IDD, mem. block move          61    95   176    67    95   170    80   105   188    86   106   184  mA
              ICC, mem. block move           -    28    28     -    27    54     -    30    61     -    29    59  mA
PCI 33 MHz    IDD, DMA transfer              -    37    83     -    34    80     -    32    83     -    40    53  mA
              ICC, DMA transfer              -    58   102     -    58   102     -    58   104     -    58    82  mA
VLD           IDD                            3     -     -     5     -     -     6     -     -     6     -     -  mA
              ICC                            -     -     -     -     -     -     -     -     -     -     -     -  mA
SSI 10 MHz    IDD                            4     -     -     5     -     -     6     -     -     6     -     -  mA
              ICC                            -     -     -     -     -     -     -     -     -     -     -     -  mA
DVDD          IDD                           18     -     -    21     -     -    24     -     -    24     -     -  mA
              ICC                            -     -     -     -     -     -     -     -     -     -     -     -  mA


1.9.7.5 PNX1311 Current Consumption For On-Chip Peripherals

Notes: 1. The “Pwd” column for peripheral units indicates current savings when block powerdown is activated, compared to when it is

idle. See Chapter 21, “Power Management” for block powerdown activation.

2. The “Typ” column for peripheral units indicates current required when data pattern is random. The “Max” column indicates

current ratings when data is switching from high to low level each cycle. Again that “Max” column is to show peak current

and does not represent a real application. For both columns the current reported is the current required by the peripheral as

well as the internal bus and MMI to transfer the data to/from the peripheral unit.

3. Some currents are not reported, either because they are difficult to measure or because they are not relevant. For example, SSI current is difficult to measure because the SSI heavily involves the DSPCPU, which makes it almost impossible to separate the current consumed by the SSI from that consumed by the DSPCPU.

4. Measurements accuracy is +/- 5%. Measurements are done with Vdd set to 2.2V and Vcc set to 3.3V.

5. Currents do not scale with frequency if the CPU:SDRAM ratios differ; the same ratio must be used.

                                           PNX1311-100:100   PNX1311-143:143   PNX1311-166:166   PNX1311-166:133
Symbol        Current/Notes                Pwd   Typ   Max   Pwd   Typ   Max   Pwd   Typ   Max   Pwd   Typ   Max  Units
27 MHz        IDDL, running raw mode        33    17    23    47    25    33    56    29    38    48    24    31  mA
              ICC, running raw mode          -     8    12     -    12    17     -    14    20     -    25    17  mA
81 MHz        IDDL, running raw mode         -    14    31     -    20    44     -    23    51     -    33    54  mA
              ICC, running raw mode          -    25    36     -    36    52     -    42    60     -    37    51  mA
27 MHz        IDDL, running raw mode         3     5     8     5     7    11     6     8    13     5     7    15  mA
              ICC, running raw mode          -     6    10     -     9    15     -    10    17     -     8    15  mA
44 KHz        IDDL, stereo 16-bit            4     2     1     6     3     2     7     3     2     1     2     2  mA
              ICC, stereo 16-bit             -     1     1     -     1     1     -     1     1     -     1     1  mA
44 KHz        IDDL, stereo 16-bit            1     1     1     1     2     2     1     2     2     1     2     3  mA
              ICC, stereo 16-bit             -     1     1     -     1     1     -     1     1     -     1     1  mA
SPDIF 48 KHz  IDDL running PCM audio         2     2     1     3     3     2     3     3     2     2     2     2  mA
              ICC running PCM audio          -     1     1     -     2     2     -     2     2     -     2     2  mA
ICP           IDDL, mem. block move         40    55   101    57    79   144    66    92   167    60    76   136  mA
              ICC, mem. block move           -    19    38     -    27    55     -    31    64     -    26    54  mA
PCI 33 MHz    IDDL, DMA transfer             -    17    36     -    25    51     -    29    59     -    20    50  mA
              ICC, DMA transfer              -    41    57     -    58    82     -    67    95     -    45    81  mA
VLD           IDDL                           3     -     -     4     -     -     5     -     -     4     -     -  mA
              ICC                            -     -     -     -     -     -     -     -     -     -     -     -  mA
SSI 10 MHz    IDDL                           2     -     -     3     -     -     3     -     -     4     -     -  mA
              ICC                            -     -     -     -     -     -     -     -     -     -     -     -  mA
DVDD          IDDL                          11     -     -    16     -     -    19     -     -    18     -     -  mA
              ICC                            -     -     -     -     -     -     -     -     -     -     -     -  mA


1.9.7.6 STRG3, STRG5 type I/O circuit

PNX1300/01/02/11
Symbol  Parameter  Condition/Notes  Min.  Nominal  Max  Units
VOH  Output HIGH voltage  IOUT = 16.0 mA  0.9VCC  V
VOL  Output LOW voltage  IOUT = -16.0 mA  0.1VCC  V
ZOH  Output AC impedance  HIGH level output state  11  ohm
ZOL  Output AC impedance  LOW level output state  11  ohm
tr  Output rise time  Test load of Figure 1-1.  2.0  ns
tf  Output fall time  Test load of Figure 1-1.  2.0  ns

1.9.7.7 NORM3 type I/O circuit

PNX1300/01/02/11
Symbol  Parameter  Condition/Notes  Min.  Nominal  Max.  Units
VOH  Output HIGH voltage  IOUT = 8.0 mA  0.9VCC  V
VOL  Output LOW voltage  IOUT = -8.0 mA  0.1VCC  V
ZOH  Output AC impedance  HIGH level output state  23  ohm
ZOL  Output AC impedance  LOW level output state  23  ohm
tr  Output rise time  Test load of Figure 1-2.  4.0  ns
tf  Output fall time  Test load of Figure 1-2.  4.0  ns

1.9.7.8 WEAK5 type I/O circuit

PNX1300/01/02/11
Symbol  Parameter  Condition/Notes  Min.  Nominal  Max.  Units
VOH  Output HIGH voltage  IOUT = 6.0 mA  0.9VCC  V
VOL  Output LOW voltage  IOUT = -6.0 mA  0.1VCC  V
ZOH  Output AC impedance  HIGH level output state  33  ohm
ZOL  Output AC impedance  LOW level output state  33  ohm
tr  Output rise time  Test load of Figure 1-3.  4.0  ns
tf  Output fall time  Test load of Figure 1-3.  4.0  ns

1.9.7.9 IICOD (I2C) type I/O circuit

Symbol  Parameter  Condition/Notes  Min.  Nominal  Max.  Units
VIL-IIC  Input LOW voltage  -0.5  1.0  V
VIH-IIC  Input HIGH voltage  VX is 3.3V or 5V depending on VREF_PERIPH value  2.3  VX+0.5  V
VHYS  Input Schmitt trigger hysteresis  0.25  V
VOL  Output LOW voltage  IOUT = -6.0 mA  0.6  V
tf  Output fall time  10 - 400 pF load  1.5  250  ns


1.9.7.10 SDRAM interface timing for PNX1300/01/02/11 speed grades

Speed grades (one Min/Max column pair each): PNX1300 143, PNX1301 166, PNX1301 180, PNX1311 166, PNX1302 200
Symbol  Parameter  Min Max  Min Max  Min Max  Min Max  Min Max  Units  Notes
fSDRAM  MM_CLK frequency  143  166  166  166  183  MHz  1
TCS  Skew between MM_CLK0, CLK1  0.05  0.05  0.05  0.05  0.05  ns  2
TPD  Propagation delay of data, address, control  4.7  4.2  4.2  4.2  3.7  ns  3
TOH  Output hold time of data, address and control  1.5  1.5  1.5  1.5  1.5  ns  3
TSU  Input data setup time  0  0  0  0  0  ns  4
TIH  Input data hold time  2.0  1.5  1.5  1.5  1.5  ns  4

Notes: 1. For best high speed SDRAM operation, 50-ohm matched PCB traces are recommended for all MM_xxx signals. Use 27-33 ohm series terminator resistors close to PNX1300/01/02/11 in the MM_CLK0 and MM_CLK1 lines only.

2. Equal load circuit. MM_CLK0 and MM_CLK1 are matched output buffers.

3. The center of the two rising edges on MM_CLK0, MM_CLK1 is used as the clock reference point. Propagation delay guarantee is defined from 50% point of clock edge to 50% level on D/A/C. Output hold time guarantee is defined from 50% point of clock edge to 50% level on D/A/C.

4. MM_CLK0 is used as a reference clock. Input setup time requirement is defined as data value 50% complete to 50% level on clock. Input hold time requirement is defined as minimum time from 50% level on clock to 50% change on data.

1.9.7.11 PCI Bus timing

The following specifications meet the PCI Specifications, Rev. 2.1 for 33-MHz bus operation.

Symbol  Parameter  Min.  Max  Units  Notes
Tval-PCI (Bus)  Clk to signal valid delay, bused signals  2  11  ns  1,2,3
Tval-PCI (ptp)  Clk to signal valid delay, point-to-point signals  2  12  ns  1,2,3
Ton-PCI  Float to active delay  2  ns  1
TOff-PCI  Active to float delay  28  ns  1,7
Tsu-PCI  Input setup time to CLK - bused signals  7  ns  3,4
Tsu-PCI (ptp)  Input setup time to CLK - point-to-point signals  12  ns  3,4
Th-PCI  Input hold time from CLK  0.2 (see footnote 1)  ns  4
Trst-PCI  Reset active time after power stable  1  ms  5
Trst-clk-PCI  Reset active time after CLK stable  100  µs  5
Trst-off-PCI  Reset active to output float delay  40  ns  5,6,7

Footnote 1. PCI clock skew between two PCI devices must be lower than 1.8 ns instead of the 2 ns specified in the PCI 2.1 specification.

Notes: 1. See the timing measurement conditions in Figure 1-4.

2. Minimum times are measured at the package pin with the load circuit shown in Figure 1-8. Maximum times are measured with the load circuit shown in Figure 1-6 and Figure 1-7.

3. REQ# and GNT# are point-to-point signals and have different input setup times. All other signals are bused.

4. See the timing measurement conditions in Figure 1-5.

5. RST# is asserted and de-asserted asynchronously with respect to CLK.

6. All output drivers are floated when RST# is active.

7. For the purpose of Active/Float timing measurements, the Hi-Z or ‘off’ state is defined to be when the total current delivered through the component pin is less than or equal to the leakage current specification.
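The tighter clock-skew limit in the PCI footnote can be motivated with a simple hold-time budget: data launched no earlier than Tval(min) after the driver's clock edge must still satisfy the receiver's hold time when the receiver's clock arrives late by the skew. The sketch below is an illustration only. It assumes zero minimum board propagation delay (a conservative simplification), and the function name is hypothetical. Values are in picoseconds to keep the arithmetic exact.

```python
def hold_margin_ps(tval_min_ps, clock_skew_ps, t_hold_ps):
    """Hold margin at the receiving device, in picoseconds.

    Assumes the signal needs no time to cross the board (worst case for hold).
    A negative result means the hold requirement could be violated.
    """
    return tval_min_ps - clock_skew_ps - t_hold_ps


# Tval(min) = 2 ns and Th = 0.2 ns from the table above.
with_reduced_skew = hold_margin_ps(2000, 1800, 200)   # skew limited to 1.8 ns
with_standard_skew = hold_margin_ps(2000, 2000, 200)  # skew of 2 ns

print(with_reduced_skew, with_standard_skew)
```

Under these assumptions the budget just closes at 1.8 ns of skew and falls short by 0.2 ns at 2 ns of skew, which is consistent with the footnote.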


1.9.7.12 JTAG I/O timing

Symbol  Parameter  Min.  Max  Units  Notes
fJTAG-CLK  JTAG clock frequency  20  MHz
Tclk-TDO  JTAG_TCK to JTAG_TDO valid delay  2  10  ns  1
Tsu-TCK  Input setup time to JTAG_TCK  3  ns  2
Th-TCK  Input hold time from JTAG_TCK  7  ns  2

Notes: 1. See the timing measurement conditions in Figure 1-10.

2. See the timing measurement conditions in Figure 1-9.

1.9.7.13 I2C I/O timing

Symbol  Parameter  Min.  Max  Units  Notes
fSCL  SCL clock frequency  400  kHz  1
TBUF  Bus free time  1  µs  2
Tsu-STA  Start condition set up time  1  µs  3
Th-STA  Start condition hold time  1  µs  3
TLOW  SCL LOW time  1  µs  1
THIGH  SCL HIGH time  1  µs  1
Tf  SCL and SDA fall time (Cb = 10-400 pF, from VIH-IIC to VIL-IIC)  20+0.1Cb  250  ns  1
Tsu-SDA  Data setup time  100  ns  4
Th-SDA  Data hold time  0  ns  4
Tdv-SDA  SCL LOW to data out valid  0.5  µs  5
Tdv-STO  SCL HIGH to data out  1  ns  5

Notes: 1. See the timing measurement conditions in Figure 1-11.

2. See the timing measurement conditions in Figure 1-12.

3. See the timing measurement conditions in Figure 1-13.

4. See the timing measurement conditions in Figure 1-14.

5. See the timing measurement conditions in Figure 1-15.

1.9.7.14 Video In I/O Timing

Symbol  Parameter  Min.  Max  Units  Notes
fVI-CLK  Video In clock frequency  81  MHz
Tsu-CLK  Input setup time to VI_CLK  2  ns  1
Th-CLK  Input hold time from VI_CLK  2  ns  1

Notes: 1. See the timing measurement conditions in Figure 1-16.

1.9.7.15 Video Out I/O Timing

Symbol  Parameter  Min.  Max  Units  Notes
fVO-CLK  Video Out clock frequency  81  MHz
TCLK-DV  VO_CLK to VO_DATA (or VO_IO*) out  3  7.5  ns  1,3
TCLK-DV  VO_CLK to VO_DATA (or VO_IO*) out  3  7.5  ns  1,4
Tsu-CLK  VO_IO* setup time to VO_CLK  10  ns  2
Th-CLK  VO_IO* hold time from VO_CLK  3  ns  2

Notes: 1. See the timing measurement conditions in Figure 1-17.

2. See the timing measurement conditions in Figure 1-18.

3. CLKOUT asserted, i.e. the VO unit is the source of VO_CLK

4. CLKOUT negated, i.e. the external world is the source of VO_CLK


1.9.7.16 Audio In I/O timing

Symbol  Parameter  Min.  Max  Units  Notes
fAI-SCK  Audio In AI_SCK clock frequency  22  MHz
Tsu-SCK  Input setup time to AI_SCK  3  ns  1,2
Th-SCK  Input hold time from AI_SCK  2  ns  1,2
TSCK-WS  AI_SCK to AI_WS  10  ns  3

Notes: 1. See the timing measurement conditions in Figure 1-19.

2. The timing measurements are done with respect to the clock edge according to CLOCK_EDGE

3. SER_MASTER asserted, i.e. Audio In is the source of AI_WS. See the timing measurement condition in Figure 1-20.

1.9.7.17 Audio Out I/O timing

Symbol  Parameter  Min.  Max  Units  Notes
fAO-SCK  Audio Out AO_SCK clock frequency  22  MHz
TSCK-DV  AO_SCK to AO_SDx valid  2  12  ns  1,3,4
TSCK-DV  AO_SCK to AO_SDx valid  2  12  ns  1,3,5
Tsu-SCK  Input setup time to AO_SCK  4  ns  2,3,5
Th-SCK  Input hold time from AO_SCK  2  ns  2,3,5
TSCK-WS  AO_SCK to AO_WS  10  ns  3,4,6

Notes: 1. See the timing measurement conditions in Figure 1-21.

2. See the timing measurement conditions in Figure 1-23.

3. The timing measurements are done with respect to the AO_SCK clock edge according to CLOCK_EDGE

4. PNX1300/01/02/11 is the serial interface master, i.e. AO_SCK, AO_WS are outputs

5. PNX1300/01/02/11 is serial interface slave, i.e. AO_SCK, AO_WS are inputs

6. See the timing measurement conditions in Figure 1-22.

1.9.7.18 SSI I/O timing

Symbol  Parameter  Min.  Max  Units  Notes
fSSI-CLK  SSI_CLK clock frequency  20  MHz  1
TCLK-DV  SSI_CLK to data valid  2  12  ns  2
Tsu-CLK  Input setup time to SSI_CLK  3  ns  3
Th-CLK  Input hold time from SSI_CLK  2  ns  3

Notes: 1. Interrupt latency limits SSI to a practical use at a bit rate of 1.5 Mbit/sec.

2. See the timing measurement conditions in Figure 1-24.

3. See the timing measurement conditions in Figure 1-25.


Figure 1-1. STRG3, STRG5 test load circuit. [Diagram labels: output buffer, PNX1300 pin, 30-ohm, 50-ohm, 2” true length, rise/fall test point, 12 pF.]

Figure 1-2. NORM3 test load circuit. [Diagram labels: output buffer, PNX1300 pin, 50-ohm, 2” true length, rise/fall test point, 30 pF.]

Figure 1-3. WEAK5 test load circuit. [Diagram labels: output buffer, PNX1300 pin, 50-ohm, 2” true length, rise/fall test point, 15 pF.]

Figure 1-4. PCI Output Timing Measurement Conditions. [Waveform labels: CLK, output delay, tri-state output; V_th, V_tl, V_test, V_trise, V_tfall, T_rval, T_fval, T_on, T_off.]

Figure 1-5. PCI Input Timing Measurement Conditions. [Waveform labels: CLK, input, inputs valid; V_th, V_tl, V_test, V_max, T_su, T_h.]

Figure 1-6. PCI Tval(max) Rising Edge. [Diagram labels: output buffer, 1/2 in. max, 25 ohm, 10 pF.]

Figure 1-7. PCI Tval(max) Falling Edge. [Diagram labels: output buffer, 1/2 in. max, 25 ohm, 10 pF, Vcc.]

Figure 1-8. PCI Tval(min) and Slew Rate. [Diagram labels: output buffer, 1/2 in. max, 1K ohm, 1K ohm, 10 pF, Vcc.]

Figure 1-9. JTAG Input Timing. [Waveform labels: TCK; TDI, TMS valid; Tsu_TCK, Th_TCK.]

Figure 1-10. JTAG Output Timing. [Waveform labels: TCK; TDO valid; Tclk_TDO.]

Figure 1-11. I2C I/O Timing. [Waveform labels: SCL; THIGH, TLOW.]

Figure 1-12. I2C I/O Timing. [Waveform labels: SCL, SDA; TBUF.]

Figure 1-13. I2C I/O Timing. [Waveform labels: SCL, SDA; Tsu_STA, Th_STA.]

Figure 1-14. I2C I/O Timing. [Waveform labels: SCL; SDA valid; Tsu_SDA, Th_SDA.]

Figure 1-15. I2C I/O Timing. [Waveform labels: SCL; SDA valid; Tdv_SDA, Tdv_STO.]

Figure 1-16. Video In I/O Timing. [Waveform labels: VI_CLK; VI_DATA, VI_IO valid; Tsu_CLK, Th_CLK.]

Figure 1-17. Video Out I/O Timing. [Waveform labels: VO_CLK; VO_DATA valid; TCLK_DV.]

Figure 1-18. Video Out I/O Timing. [Waveform labels: VO_CLK; VO_IO valid; Tsu_CLK, Th_CLK.]

Figure 1-19. Audio In I/O Timing. [Waveform labels: AI_SCK; AI_SD, AI_WS valid; Tsu_SCK, Th_SCK.]

Figure 1-20. Audio In I/O Timing. [Waveform labels: AI_SCK; AI_WS valid; TSCK_WS.]

Figure 1-21. Audio Out I/O Timing. [Waveform labels: AO_SCK; AO_SDx valid; TSCK_DV.]

Figure 1-22. Audio Out I/O Timing. [Waveform labels: AO_SCK; AO_WS valid; TSCK_WS.]

Figure 1-23. Audio Out I/O Timing. [Waveform labels: AO_SCK; AO_WS valid; Tsu_SCK, Th_SCK.]

Figure 1-24. SSI I/O Timing. [Waveform labels: SSI_CLK; SSI I/O valid; TCLK_DV.]

Figure 1-25. SSI I/O Timing. [Waveform labels: SSI_CLK; SSI_IO valid; Tsu_CLK, Th_CLK.]


Overview Chapter 2

by Gert Slavenburg

2.1 INTRODUCTION

In this document, the generic PNX1300 name refers

to the PNX1300 Series, or the PNX1300/01/02/11

products.

PNX1300 is a successor to the TM-1300, TM-1100 and

TM-1000 media processors. For those familiar with the

TM-1300, the new features specific to the PNX1300 are

summarized in Section 2.6. For those familiar with the

TM-1100, the new features specific to the PNX1300 are

summarized in Section 2.7. For those familiar with the

TM-1000, new features for the PNX1300 are summa-

rized in Section 2.8.

2.2 PNX1300 FUNDAMENTALS

PNX1300 is a media processor for high-performance multimedia applications that deal with high-quality video and audio. These applications can range from low-cost, dedicated systems such as video phones, video editing, digital television, security systems or set-top boxes to reprogrammable, multipurpose plug-in cards for personal computers. PNX1300 easily implements popular multimedia standards such as MPEG-1 and MPEG-2, but its orientation around a powerful general-purpose CPU (called the DSPCPU) makes it capable of implementing a variety of multimedia algorithms, both open and proprietary. PNX1300 is also easily configured in multiple-processor configurations for very high-end applications.

More than just an integrated microprocessor with unusual peripherals, the PNX1300 is a fluid computer system controlled by a small real-time OS kernel running on a very-long instruction word (VLIW) processor core. PNX1300 contains a DSPCPU, a high-bandwidth internal bus, and internal bus-mastering DMA peripherals.

Software compatibility between current and future Trimedia processor family members is at the source-code and library API level; binary compatibility between family members is not guaranteed.

Defining software compatibility at the source-code level gives Philips the freedom to strike the optimum balance between cost and performance for all chips in the family. A powerful compiler and software development environment ensure that programmers never need to resort to non-portable assembler programming. Programmers use the library APIs and multimedia operations from C and C++ source code.

PNX1300 is designed both for use as an accelerator in a PC environment and as the sole CPU in cost-effective standalone systems. In standalone system applications, the PNX1300 external bus allows for glueless connection of 8-bit wide ROM, EEPROM, or Flash memory for code storage. The external bus also allows intermixing of PCI 2.1 master/slave peripherals and simple 8-bit peripherals, such as UARTs and other 8-bit microprocessor peripherals. This powerful external bus architecture gives system designers a variety of options to configure low-cost, high-performance system solutions.

Because it is based on a general-purpose CPU, PNX1300 can also serve as a multifunctional PC enhancement vehicle. Typically, a PC must deal with multi-standard video and audio streams, and applications require both decompression and compression. While the CPU chips used in PCs are becoming capable of low-resolution, real-time video decompression, high-quality decompression, not to mention compression, of studio-resolution video is still out of reach. Further, users expect their systems to handle live video and audio without sacrificing system responsiveness.

PNX1300 enhances a PC system by providing real-time

multimedia with the advantages of a special-purpose,

embedded solution—low cost and chip count—and the

advantages of a general-purpose processor—repro-

grammability. For PC applications, PNX1300 far sur-

passes the capabilities of fixed-function multimedia

chips.

Future media processor family members will have differ-

ent sets of interfaces appropriate for their intended use.

2.3 PNX1300 CHIP OVERVIEW

Key features of PNX1300 include:

• A very powerful, general-purpose VLIW processor core (the DSPCPU) that coordinates all on-chip activities. In addition to implementing the non-trivial parts of multimedia algorithms, the DSPCPU runs a small real-time operating system driven by interrupts from the other units.

• Independent DMA-driven multimedia I/O units that

properly format data to make software media pro-

cessing efficient.

• DMA-driven multimedia coprocessors that operate

independently and in parallel with the DSPCPU to

perform operations specific to important multimedia

algorithms.


• A high-performance bus and memory system that

provide communication between PNX1300’s pro-

cessing units.

• A flexible external bus interface.

Figure 2-1 shows a PNX1300 block diagram. The bulk of a PNX1300 system consists of the PNX1300 microprocessor itself, external synchronous DRAM (SDRAM), and the external circuitry needed to interface to incoming and/or outgoing video and audio data streams and communication lines. PNX1300’s external peripheral bus can gluelessly interface to PCI 2.1 components and/or 8-bit microprocessor peripherals.

Figure 2-2 shows a possible minimally configured PNX1300 system. A video input stream might come directly from a CCIR 656-compliant video camera chip in YUV 4:2:2 format through a glueless interface in this case. An analog camera can be connected via a CCIR 656 interface chip (such as the Philips SAA7113H). PNX1300 outputs a CCIR656 video stream to drive a dedicated video monitor. Stereo audio input and up to 8-channel audio output require only low-cost external ADC and DAC. The operation of the video and audio interface units is highly customizable through programmable parameters.

The glueless PCI interface allows the PNX1300 to display video through a host PC’s video card. The Image Coprocessor (ICP) provides display support for live video in an arbitrary number of arbitrarily overlapped windows.

Figure 2-1. PNX1300 block diagram. [Blocks and interface labels: SDRAM Main Memory Interface (32-bit data, up to 572 MB/sec); VLIW CPU with 32K and 16K caches; Video In (CCIR656 digital video, YUV 4:2:2, up to 81 MHz (40 Mpix/sec)); Audio In (stereo digital audio, 8 and 16-bit data, I2S DC, up to 22 MHz AI_SCK); Audio Out (2/4/6/8 ch. digital audio, 16 and 32-bit data, I2S DC, up to 22 MHz AO_SCK); SPDIF Out (IEC958, up to 40 Mbit/sec); I2C Interface (I2C bus to camera, etc.); VLD Coprocessor (Huffman decoder, slice-at-a-time MPEG-1 & 2); Video Out (CCIR656 digital video, YUV 4:2:2, up to 81 MHz (40 Mpix/sec)); Timers; Synchronous Serial Interface (analog modem or ISDN front end); Image Coprocessor (down & up scaling, YUV to RGB, 50 Mpix/sec); DVDD; PCI-XIO Interface (external bus: PCI 2.1, 32 bits, 33 MHz, plus glueless 24A/8D slaves).]

Figure 2-2. PNX1300 system connections. A minimal PNX1300 requires few supporting components. [Diagram labels: PNX1300; 2Mx32 SDRAM; CCIR656 digital video in; ADC, stereo audio in; DAC, 2 - 8 ch audio out; CCIR656 dig. video out; JTAG; modem front end; PCI and 8-bit peripheral bus; ROM.]


Finally, the Synchronous Serial Interface (SSI) requires only an external ISDN or analog modem front-end chip and phone line interface to provide remote communication support. It can be used to connect PNX1300-based systems for video phone or videoconferencing applications, or it can be used for general-purpose data communication in PC systems.

The PNX1300 JTAG port allows a debugger on a host

system to access and control the state of a PNX1300 in

a target system. It also implements 1149.1 boundary

scan functionality.

2.4 BRIEF EXAMPLES OF OPERATION

The key to understanding PNX1300 operation is observing that the DSPCPU and peripherals are time-shared and that communication between units is through SDRAM memory. The DSPCPU switches from one task to the next; first it decompresses a video frame, then it decompresses a slice of the audio stream, then back to video, etc. As necessary, the DSPCPU issues commands to the peripheral function units to orchestrate their operation.

The DSPCPU can enlist the ICP and other coprocessors to help with some of the straightforward, tedious tasks associated with video processing. The ICP is very well suited for arbitrary size horizontal and vertical video resizing and color space conversion.

The DSPCPU can enlist the input/output peripherals to autonomously receive or transmit digital video and audio data with minimal CPU supervision. The I/O units have been designed to interface to the outside world through industry standard audio and video interfaces, while delivering or taking data in memory in formats suitable for software processing.

2.4.1 Video Decompression in a PC

An example PNX1300 implementation is as a video-decompression engine on a PCI card in a PC. In this case, the PC does not need to know the PNX1300 has a powerful, general-purpose CPU; rather, the PC just treats the hardware on the PCI card as a ‘black-box’ engine.

Video decompression begins when the PC operating system hands the PNX1300 a pointer to compressed video data in the PC’s memory (the details of the communication protocol are handled by the software driver installed in the PC’s operating system).

The DSPCPU fetches data from the compressed video

stream via the PCI bus, decompresses frames from the

video stream, and places them into local SDRAM. De-

compression may be aided by the VLD (variable-length

decoder) coprocessor unit, which implements Huffman

decoding and is controlled by the DSPCPU.

When a frame is ready for display, the DSPCPU gives the ICP a display command. The ICP then autonomously fetches the decompressed frame data from SDRAM and transfers it over the PCI bus to the frame buffer in the PC’s video display card. Alternatively, video can be sent to the graphics card using the VO unit.

2.4.2 Video Compression

Another typical application for PNX1300 is in video compression. In this case, uncompressed video is usually supplied directly to the PNX1300 system via the Video In (VI) unit. A camera chip connected directly to the VI unit supplies YUV data in 8-bit, 4:2:2 format. The VI unit samples the data from the camera chip and demultiplexes the raw video to SDRAM in three separate areas, one each for Y, U, and V.

When a complete video frame has been read from the

camera chip by the VI unit, it interrupts the DSPCPU. The

DSPCPU compresses the video data in software (using

a set of powerful data-parallel multimedia operations)

and writes the compressed data to a separate area of

SDRAM.

The compressed video data can now be transmitted or stored in any of several ways. It can be sent to a host system over the PCI bus for archival on local mass storage, or the host can transfer the compressed video over a network. The data can also be sent to a remote system using the modem/ISDN interface to create, for example, a video phone or videoconferencing system.

Because the powerful, general-purpose DSPCPU is available, the compressed data can be encrypted for security before it is transferred.

2.5 INTRODUCTION TO PNX1300 BLOCKS

The remainder of this chapter provides a brief introduc-

tion to the internal components of PNX1300.

2.5.1 Internal ‘Data Highway’ Bus

The internal bus (or data highway) connects all internal blocks together and provides access to internal control/status registers of each block, external SDRAM, and the external bus peripheral chips. The internal bus consists of separate 32-bit data and address buses. Transactions on the bus use a block-transfer protocol. On-chip peripheral units and coprocessors can be masters or slaves on the bus.

Access to the internal bus is controlled by a central arbiter, which has a request line from each potential bus master. The arbiter is programmable so that the arbitration algorithm can be tailored for different applications. Peripheral units make requests to the arbiter for bus access and, depending on the arbitration mode, bus bandwidth is allocated to the units in different amounts. Each mode allocates bandwidth differently, but each mode guarantees each unit a minimum bandwidth and maximum service latency. All unused bandwidth is allocated to the DSPCPU.

The bus allocation mechanism is one of the features of

PNX1300 that makes it a true real-time system instead of

just a highly integrated microprocessor with unusual pe-

ripherals.


2.5.2 VLIW Processor Core

The heart of PNX1300 is a powerful 32-bit DSPCPU

core. The DSPCPU implements a 32-bit linear address

space and 128, fully general-purpose 32-bit registers.

The registers are not separated into banks; any opera-

tion can use any register for any operand.

The PNX1300 core uses a VLIW instruction-set architec-

ture and is fully general-purpose. The VLIW instruction

length allows five simultaneous operations to be issued

every clock cycle. These operations can target any five

of the 27 functional units in the DSPCPU, including inte-

ger and floating-point arithmetic units and data-parallel

multimedia operation units.

Although the processor core runs a real-time operating system to coordinate all activities in the PNX1300 system, the core is not intended for true general-purpose computer use. For example, the PNX1300 processor core does not implement demand-paged virtual memory, memory address translation, or 64-bit floating point - all essential features in a general-purpose computer system.

PNX1300 uses a VLIW architecture to maximize processor throughput at the lowest possible cost. VLIW architectures have performance exceeding that of superscalar general-purpose CPUs without the cost and complexity of a superscalar CPU implementation. The hardware saved by eliminating superscalar logic reduces cost and allows the integration of multimedia-specific features that enhance the power of the processor core.

The PNX1300 operation set includes all traditional microprocessor operations. In addition, multimedia operations are included that dramatically accelerate standard video and audio compression and decompression algorithms. As just one of the five operations issued in a single PNX1300 instruction, a single ‘custom’ or ‘media’ operation can implement up to 11 traditional microprocessor operations. These multimedia operations combined with the VLIW architecture result in tremendous throughput for multimedia applications.

The DSPCPU core is supported by separate 16-KB data and 32-KB instruction caches. The data cache is dual-ported to allow two simultaneous accesses; both caches are 8-way set-associative with a 64-byte block size.
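The cache parameters above fix the number of sets in each cache. The sketch below works through the textbook arithmetic for a set-associative cache. It assumes power-of-two sizes and a plain tag/index/offset split of the 32-bit address; the actual organization of the PNX1300 caches is described in Chapter 5, and the function name here is hypothetical.

```python
def cache_geometry(size_bytes, ways, block_bytes, addr_bits=32):
    """Return (sets, offset_bits, index_bits, tag_bits) for a set-associative cache.

    Assumes size, associativity and block size are powers of two and that the
    address is split as tag | set index | block offset.
    """
    sets = size_bytes // (ways * block_bytes)
    offset_bits = block_bytes.bit_length() - 1   # log2(block size)
    index_bits = sets.bit_length() - 1           # log2(number of sets)
    tag_bits = addr_bits - index_bits - offset_bits
    return sets, offset_bits, index_bits, tag_bits


data_cache = cache_geometry(16 * 1024, ways=8, block_bytes=64)
instr_cache = cache_geometry(32 * 1024, ways=8, block_bytes=64)
print(data_cache, instr_cache)
```

Under these assumptions the 16-KB data cache has 32 sets and the 32-KB instruction cache has 64 sets, each set holding eight 64-byte blocks.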

2.5.3 Video In Unit

The Video In (VI) unit interfaces directly to any CCIR 601/

656-compliant device that outputs 8-bit parallel, 4:2:2

YUV time-multiplexed data. Such devices include direct

digital camera systems, which can connect gluelessly to

PNX1300 or through the standard CCIR 656 connector

with only the addition of ECL level converters. A single

chip external device can be used to convert to/from serial

D1 professional video. Non-CCIR-compliant devices can

use a digital video decoder chip, such as the Philips

SAA7113H, to interface to PNX1300.

The VI unit demultiplexes the captured YUV data before

writing it into local PNX1300 SDRAM. Separate planar

data structures are maintained for Y, U, and V.

The VI unit can be programmed to perform on-the-fly

horizontal resolution subsampling by a factor of two if

needed. Many camera systems capture a 640-pixel/line

or 720-pixel/line image. With subsampling, direct conver-

sion to a 320-pixel/line or a 360-pixel/line image can be

performed with no DSPCPU intervention. Performing this

function during video input reduces initial storage and

bus bandwidth requirements for applications requiring

reduced resolution.

2.5.4 Enhanced Video Out Unit

The Enhanced Video Out (EVO) unit essentially performs the inverse function of the VI unit. EVO generates an 8-bit, CCIR 656 digital video data stream that contains a composited video and graphics overlay image. The video image is taken from separate Y, U, and V planar data structures in SDRAM. The graphics overlay is taken from a pixel-packed YUV data structure in SDRAM. Compositing allows both alpha-blending and chroma keying.

The EVO unit can also upscale the video image horizontally by a factor of two to convert from CIF/SIF to CCIR

601 resolution. The overlay image, if enabled, is always

in full-pixel resolution.

The EVO unit is capable of pixel emission rates up to 40 Mpix/sec and allows full programming of a horizontal and

vertical frame/field structure. It is thus capable of refresh-

ing both interlaced and non-interlaced (‘two fh’) video dis-

plays with 4:3 or 16:9 or other aspect ratios.

The sample rate for EVO unit pixels is independently and

dynamically programmable. The high-quality, on-chip

sample clock generator circuit allows the programmer

subtle control over the sampling frequency so that audio

and video synchronization can be achieved in any sys-

tem configuration. When changing the sample frequen-

cy, the instantaneous phase does not change, which al-

lows sample frequency manipulation without introducing

audio or video distortion.

2.5.5 Image Coprocessor

The ICP off-loads common image scaling or filtering

tasks from the DSPCPU. Although these tasks can be

easily performed by the DSPCPU, they are a poor use of the relatively expensive CPU resource. When performed

in parallel by the ICP, these tasks are performed effi-

ciently by simple hardware, which allows the DSPCPU to

continue with more complex tasks.

The ICP can operate as either a memory-to-memory or a

memory-to-PCI coprocessor device.

In memory-to-memory mode, the ICP can perform either horizontal or vertical image filtering and resizing. A high

quality algorithm is used (5-tap polyphase filter in each

direction). Filtering or scaling is done in either the hori-

zontal or vertical direction in one pass. Two invocations

of the ICP are required to filter or resize in both direc-

tions.

In memory-to-PCI mode, the ICP can perform horizontal resizing followed by color-space conversion. For example, assume an n × m pixel array is to be displayed in a


window on the PC video screen while the PC is running

a graphical user interface. The first step (if necessary)

would use the ICP in memory-to-memory mode to per-

form a vertical resizing. The second step would use the

ICP in memory-to-PCI mode to perform horizontal resiz-

ing and optional colorspace conversion from YUV to

RGB.

While sending the final, resampled and converted pixels

over the PCI bus to the video frame buffer, the ICP uses a full, per-pixel occlusion bit mask (accessed in destination coordinates) to determine which pixels are actually

written to the graphics card frame buffer for display. Con-

ditioning the transfer with the bit mask allows PNX1300

to accommodate an arbitrary arrangement of overlap-

ping windows on the PC video screen.

Figure 2-3 illustrates a possible display situation and the

data structures in SDRAM that support ICP operation.

On the left, the PC video screen has four overlapping

windows. Two, Image 1 and Image 2, are being used to

display video generated by PNX1300. The right side

shows a conceptual view of SDRAM contents. Two data

structures are present, one for Image 1 and the other for

Image 2. Figure 2-3 represents a point in time during

which the ICP is displaying Image 2.

When the ICP is displaying an image (i.e., copying it from

SDRAM to a frame buffer), it maintains four pointers to

the SDRAM data structures. Three pointers locate the Y,

U, and V data arrays, the fourth locates the per-pixel oc-

clusion bit map. The Y, U, and V arrays are indexed by

source coordinates while the occlusion bit map is ac-

cessed with screen coordinates.

As the ICP generates pixels for display, it performs horizontal scaling and colorspace conversion. The final RGB

pixel value is then copied to the destination address in

the screen’s frame buffer only if the corresponding bit in

the occlusion bit map is a ‘1’.
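The masked write described above can be modeled in software roughly as follows. This is only an illustrative sketch: the function names, the 32-bit RGB pixel type, and the bit packing of the occlusion map (one bit per screen pixel, most significant bit first within each byte) are assumptions made for the example and are not taken from the ICP register specification.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read the occlusion bit for screen pixel (x, y).
 * Assumes one bit per pixel, MSB-first within each byte. */
static int occlusion_bit(const uint8_t *mask, size_t mask_stride_bytes,
                         size_t x, size_t y)
{
    const uint8_t byte = mask[y * mask_stride_bytes + (x >> 3)];
    return (byte >> (7u - (x & 7u))) & 1u;
}

/* Copy one row of already scaled and converted RGB pixels into the frame
 * buffer row, writing a pixel only where its occlusion bit (indexed in
 * screen coordinates) is 1. Returns the number of pixels written. */
static size_t masked_row_copy(uint32_t *frame_row, const uint32_t *rgb_row,
                              const uint8_t *mask, size_t mask_stride_bytes,
                              size_t screen_x0, size_t screen_y, size_t width)
{
    size_t written = 0;
    for (size_t i = 0; i < width; ++i) {
        if (occlusion_bit(mask, mask_stride_bytes, screen_x0 + i, screen_y)) {
            frame_row[screen_x0 + i] = rgb_row[i];
            ++written;
        }
    }
    return written;
}
```

Note that the source pixels are indexed from 0 (source coordinates) while the mask and frame buffer are indexed from screen_x0 (screen coordinates), mirroring the distinction drawn in the text.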

As shown in the conceptual diagram, the occlusion bit map has a pattern of 1s and 0s corresponding to the shape of the visible area of the destination window in the frame buffer. When the arrangement of windows on the PC screen changes, modifications to the occlusion bit map are performed by PNX1300 or host-resident software.

It is important to note that there is no preset limit on the

number and sizes of windows that can be handled by the

ICP. The only limit is the available bandwidth. Thus, the

ICP can handle a few large windows or many small win-

dows. The ICP can sustain a transfer rate of 50 megapix-

els per second, which is more than enough to saturate

PCI when transferring images to video frame buffers.

2.5.6 Variable-Length Decoder (VLD)

The variable-length decoder (VLD) relieves the DSPCPU

of decoding Huffman-encoded video data streams. It can

be used to help decode high bitrate MPEG-1 and MPEG-

2 video streams. The lower bitrate of videoconferencing

can be adequately handled by DSPCPU software without a coprocessor.

The VLD is a memory-to-memory coprocessor. The

DSPCPU hands the VLD a pointer to a Huffman-encoded bit stream, and the VLD produces a tokenized bit

stream that is very convenient for the PNX1300 image

decompression software to use. The format of the output

token stream is optimized for the MPEG-2 decompres-

sion software so that communication between the

DSPCPU and VLD is minimized.

[Figure 2-3 (diagram not reproduced): on the left, the PC screen with overlapping windows, including the live video windows Image 1 and Image 2; on the right, the SDRAM contents, with one data structure per image and a per-pixel occlusion bit map of 1s and 0s for each, read by the ICP.]

Figure 2-3. ICP - Windows on the PC screen and data structures in SDRAM for two live video windows.


2.5.7 Audio In and Audio Out Units

The Audio In (AI) and Audio Out (AO) units are similar to

the video units. They connect to most serial ADC and

DAC chips, and are programmable enough to handle

most serial bit protocols. These units can transfer MSB

or LSB first and left or right channel first.

The audio sampling clock is driven by PNX1300 and is

software programmable within a wide range. Like the EVO unit, AI and AO sample rates are separately and dynamically programmable. The high-quality on-chip sample clock generator circuit allows the programmer subtle

control over the sampling frequency so that audio and

video synchronization can be achieved in any system

configuration. When changing the sample frequency, the

instantaneous phase does not change, which allows

sample frequency manipulation without introducing au-

dio or video distortion.

As with the video units, the audio-in and audio-out units

buffer incoming and outgoing audio data in SDRAM. The

audio-in unit buffers samples in either 8- or 16-bit format, mono or stereo. The audio-out unit transfers 16- or 32-bit

sample data for mono, stereo or up to 8 audio channels

from memory to the external DACs. Any manipulation or

mixing of sound data is performed by the DSPCPU since

this processing will require only a small fraction of its pro-

cessing capacity.

2.5.8 S/PDIF Out Unit

The Sony/Philips Digital Interface Out (SPDO) unit allows output of a 1-bit high-speed serial data stream. The

primary application is output of digital audio data in Sony/

Philips Digital Interface (S/PDIF) format to an external

electrically isolated transformer. The SPDO unit can also

be used as a general purpose high-speed data stream

output device such as a UART.

The SPDO unit supports 2-channel PCM audio, one or

more Dolby Digital six-channel data streams, or one or

more MPEG-1 or MPEG-2 audio streams (embedded

per Project 1937). It supports arbitrary programmable

sample rates independent of and asynchronous to the

AO unit sample rate.

2.5.9 Synchronous Serial Interface

The on-chip synchronous serial interface (SSI) is spe-

cially designed to interface to high integration analog mo-

dem frontends or ISDN frontend devices. In the analog

modem case, all of the modem signal processing is per-

formed in the PNX1300 DSPCPU.

2.5.10 I2C Interface

The I2C bus is a 2-wire multi-master, multi-slave inter-

face capable of transmitting up to 400 kbit/sec. PNX1300

implements an I2C master for use in single-master environments only. This interface allows PNX1300 to configure and inspect the status of I2C peripheral devices, such

as video decoders, video encoders and some camera

types.

2.6 NEW IN PNX1300 (VERSUS TM-1300)

PNX1300/01/02/11 offers the following improvements

over the TM-1300:

• Lower core voltage for PNX1311 (2.2V core voltage)

and therefore lower power consumption.

• DSPCPU speed of up to 200 MHz for PNX1302.

• Support for 256 Mbit SDRAM organized in x16. The REFRESH counter must be changed. Refer to Section 12.11, “Refresh” in Chapter 12, “SDRAM Memory System” for details.

• Support for 16 and 32-bit Main Memory Interface.

• Bug fixes in VI message passing mode.

• Additional VI mode where VI_DATA[9:8] in message

passing mode are not affected by the VI_DVALID

signal.

• PCI bug fix on PCI Special Cycles.

• Autonomous boot in non 1:1 ratio is fixed.

2.7 NEW IN PNX1300 (VERSUS TM-1100)

In addition to the features described in Section 2.6, PNX1300 also offers the following improvements over the TM-1100:

• no external MATCHOUT to MATCHIN delay line.

• Video output speed improvement: up to 81 MHz.

• Video input speed improvement: up to 81 MHz.

• Prefetchable SDRAM aperture to increase performance. See Chapter 11, “PCI Interface.”

• Individual powerdown capability for each coproces-

sor (e.g. ICP, EVO, etc.).

• New AO coprocessor with four separate channels

and support of 16 or 32-bit samples. 8-bit samples

are no longer supported.

• New SPDO coprocessor (for output of SPDIF and

other 1-bit high-speed serial data streams)

2.8 NEW IN PNX1300 (VERSUS TM-1000)

In addition to the features described in Section 2.7, PNX1300 also offers the following improvements over the TM-1000:

• New DSPCPU instructions. See Appendix A,

“PNX1300/01/02/11 DSPCPU Operations.”

• Video Output unit improvements (8-bit alpha blend-

ing, chroma keying, genlock). See Chapter 7,

“Enhanced Video Out.”

• Capability to intermix PCI2.1 and 8-bit peripherals or

ROM/Flash memories on the external bus. See

Chapter 22, “PCI-XIO External I/O Bus.”

• An on-chip DVD authentication/descrambling copro-

cessor. Information available to DVD product devel-

opers on special request.

• Full 1149.1 boundary scan.

• Improved PCI DMA read performance. See Chapter

11, “PCI Interface.”

• Improved clock generation with new DDS blocks.


DSPCPU Architecture Chapter 3

by Gert Slavenburg, Marcel Janssens

3.1 BASIC ARCHITECTURE CONCEPTS

In this document, the generic PNX1300 product name refers to the PNX1300 Series, that is, the PNX1300/01/02/11 products.

This section documents the system programmer or

‘bare-machine’ view of the PNX1300 CPU (or DSPCPU).

3.1.1 Register Model

Figure 3-1 shows the DSPCPU’s 128 general purpose

registers, r0...r127. In addition to the hardware program

counter, PC, there are 4 user-accessible special purpose

registers, PCSW, DPC (destination program counter),

SPC (source program counter), and CCCOUNT.

Table 3-1 lists the registers and their purposes.

Register r0 always contains the integer value '0', corresponding to the boolean value 'FALSE' or the single-precision floating point value +0.0. Register r1 always contains the integer value '1' ('TRUE'). The programmer is NOT allowed to write to r0 or r1.

Note: Writing to r0 or r1 may cause reads from r0 or

r1 scheduled in adjacent clock cycles to return unpre-

dictable values. The standard assembler prevents/

forbids the use of r0 or r1 as a destination register.

Registers r2 through r127 are true general purpose registers; the hardware does not imply their use in any way, though compiler or programmer conventions may assign particular roles to particular registers. The DPC and SPC relate to interrupt and exception handling and are treated

in Section 3.1.4, “SPC and DPC—Source and Destina-

tion Program Counter.” The PCSW (Program Control

and Status Word) register is treated in Section 3.1.3,

“PCSW Overview.” CCCOUNT, the 64-bit clock cycle

counter is treated in Section 3.1.5, “CCCOUNT—Clock

Cycle Counter.”

[Figure 3-1 (diagram not reproduced): 128 general-purpose 32-bit registers, with r0 and r1 fixed (0 and 1) and r2–r127 variable, plus the system status and control registers PCSW, DPC and SPC (32 bits each) and CCCOUNT (64 bits).]

Figure 3-1. PNX1300 registers.

Table 3-1. DSPCPU registers

Register   Size      Details
r0         32 bits   Always reads as 0x0; must not be used as destination of operations
r1         32 bits   Always reads as 0x1; must not be used as destination of operations
r2–r127    32 bits   126 general-purpose registers
PC         32 bits   Program counter
PCSW       32 bits   Program control & status word
DPC        32 bits   Destination program counter; latches target of taken branch that is interrupted
SPC        32 bits   Source program counter; latches target of taken branch that is not interrupted
CCCOUNT    64 bits   Counts clock cycles since reset


3.1.2 Basic DSPCPU Execution Model

The DSPCPU issues one ‘long instruction’ every clock

cycle. Each instruction consists of several operations

(five operations for the PNX1300 microprocessor). Each

operation is comparable to a RISC machine instruction,

except that the execution of an operation is conditional

upon the content of a general purpose register. Exam-

ples of operations are:

IF r10 iadd r11 r12 → r13

(if r10 true, add r11 and r12 and write sum in r13)

IF r10 ld32d(4) r15 → r16

(if r10 true, load 32 bits from mem[r15+4] into r16)

IF r20 jmpf r21 r22

(if r20 true and r21 false, jump to address in r22)

Each operation has a specific, known execution latency in clock cycles. For example, iadd takes 1 cycle; thus the result of an iadd operation started in clock cycle i is available for use as an argument to operations issued in cycle i+1 or later. The other operations issued in cycle i cannot use the result of iadd. The ld32d operation has a latency of 3 cycles. The result of an ld32d operation started in cycle j is available for use by other operations issued in cycle j+3 or later. Branches, such as the jmpf example above, have three delay slots. This means that if a branch operation in cycle k is taken, all operations in the instructions in cycle k+1, k+2 and k+3 are still executed.

In the above examples, r10 and r20 control conditional execution of the operations. This is known as ‘guarding’; here r10 and r20 contain the operation ‘guard’. See Section 3.2.1, “Guarding (Conditional Execution).”

Certain restrictions exist in the choice of what operations can be packed into an instruction. For example, the DSPCPU in PNX1300 allows no more than two load/store class operations to be packed into a single instruction. Also, no more than five results (of previously started operations) can be written during any one cycle. The packing of operations is not normally done by the programmer. Instead, the instruction scheduler (see the Philips TriMedia SDE Reference Manual) takes care of converting the parallel intermediate format code into packed instructions ready for the assembler. The rules are formally described in the machine description file used by the instruction scheduler and other tools.

3.1.3 PCSW Overview

Figure 3-2 shows the PCSW register. The PNX1300 value of PCSW on reset is 0x800. For compatibility, any undefined PCSW fields should never be modified.

Note that the DSPCPU architecture has no condition

codes or integer arithmetic status flags. Integer opera-

tions that generate out-of-range results deliver an opera-

tion specific bit pattern. For examples, see dspiadd in

Appendix A, “PNX1300/01/02/11 DSPCPU Operations.”

Predicate operations exist that take the place of integer

status flags in a classical architecture. Multiword arith-

metic is supported by the ‘carry’ operation which gener-

ates a ‘0’ or ‘1’ depending on the carry that would be gen-

erated if its arguments were summed.
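As an illustration of how a carry predicate supports multiword arithmetic, the following C sketch models a 64-bit addition built from 32-bit halves. The names carry32, u64_pair and add64 are hypothetical and exist only for this example; the sketch models the described behavior (a result of 0 or 1 depending on the carry out of the sum) and is not the DSPCPU operation itself.

```c
#include <stdint.h>

/* Model of a 'carry' predicate: 1 if the 32-bit sum a + b produces a
 * carry out of bit 31, otherwise 0. */
static uint32_t carry32(uint32_t a, uint32_t b)
{
    return (uint32_t)((uint32_t)(a + b) < a);
}

/* A 64-bit value held as two 32-bit words (hypothetical helper type). */
typedef struct {
    uint32_t lo;
    uint32_t hi;
} u64_pair;

/* 64-bit add: the low words are summed with wrap-around, and the carry
 * predicate supplies the increment for the high words. */
static u64_pair add64(u64_pair x, u64_pair y)
{
    u64_pair r;
    r.lo = x.lo + y.lo;
    r.hi = x.hi + y.hi + carry32(x.lo, y.lo);
    return r;
}
```

No status flag is read or written anywhere in this sequence; the carry is an ordinary register value, which is the point made in the text above.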

FP-Related Fields. The IEEE mode field determines the IEEE rounding mode of all floating point operations, with the exception of a few floating point conversion operations that use a fixed rounding mode. For examples, see ifixrz and ifloatrz in Appendix A, “PNX1300/01/02/11 DSPCPU Operations.”

The FP exception flags are ‘sticky bits’ that are set as a

side effect of floating-point computations. Each floating

point operation can set one or more of the flags if it incurs the corresponding exception. The flags can only be reset

by direct software manipulation of the PCSW (using the

writepcsw operation). The bits have the meanings shown

in Table 3-2.

The FP exception trap enable bits determine which FP

exception flags invoke CPU exception handling. An ex-

ception is requested if the intersection of the exception

flags and trap enable flags is non-zero. The acceptance

and handling of exceptions is described in Section 3.5,

“Special Event Handling.”
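The sticky-flag and trap-enable rule can be summarized in a few lines of C. This is a minimal sketch under stated assumptions: the function names are hypothetical, and both arguments are assumed to have been shifted so that each flag and its trap-enable bit occupy the same bit position (the actual PCSW bit positions are given by Figure 3-2, not by this example).

```c
#include <stdint.h>

/* Sticky accumulation: flags raised by an operation are ORed into the
 * aggregate and stay set until software clears them. */
static uint32_t accumulate_fp_flags(uint32_t sticky_flags, uint32_t new_flags)
{
    return sticky_flags | new_flags;
}

/* An exception is requested when the intersection of the exception flags
 * and the trap-enable bits is non-zero. Both masks are assumed to use
 * the same bit position for a given exception type. */
static int fp_exception_requested(uint32_t flags, uint32_t trap_enables)
{
    return (flags & trap_enables) != 0;
}
```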

BSX (Bytesex). The DSPCPU has a switchable bytesex.

The BSX flag in the PCSW can be written by software.

Load/store operations observe little- or big-endian byte

ordering based on the current setting of BSX.
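The effect of a switchable byte order on a 32-bit load can be sketched as below. The function name and the flag argument are hypothetical; the sketch takes BSX = 1 to mean little-endian, as stated in the legend of Figure 3-2, and simply assembles the same four bytes in the two possible orders.

```c
#include <stdint.h>

/* Model of a 32-bit load from four consecutive bytes at p, with the byte
 * order selected by a BSX-like flag (1 = little-endian, 0 = big-endian). */
static uint32_t load32_bsx(const uint8_t *p, int bsx_little_endian)
{
    if (bsx_little_endian) {
        /* Lowest address holds the least significant byte. */
        return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
               ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    }
    /* Lowest address holds the most significant byte. */
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}
```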

IEN (Interrupt Enable). The IEN flag disables or enables

interrupt processing for most interrupt sources. Only NMI

(non-maskable interrupt) bypasses IEN. The acceptance

and handling of interrupts is described in Section 3.5.3,

“INT and NMI (Maskable and Non-Maskable Interrupts).”

[Figure 3-2 (bit-field diagram not reproduced). Recoverable content:

PCSW[15:0], status and control fields: MSE (misaligned store exception), CS (count stalls, 1 = yes), IEN (interrupt enable, 1 = allow interrupts), BSX (byte sex, 1 = little endian), IEEE MODE (IEEE rounding mode: 0 = to nearest, 1 = to zero, 2 = to positive, 3 = to negative), and the FP exception flags OFZ, IFZ, INV, OVF, UNF, INX, DBZ.

PCSW[31:16], trap-enable fields: TRPMSE (misaligned store exception trap enable), TFE (trap on first exit), and the FP exception trap-enable bits TRPOFZ, TRPIFZ, TRPINV, TRPOVF, TRPUNF, TRPINX, TRPDBZ.

The figure also marks WBE (write back error), RSE (reserved exception) and their trap enables TRPWBE and TRPRSE, as well as several undefined bits; their exact bit positions are not recoverable here.

PCSW = 0x800 after RESET.]

Figure 3-2. PNX1300 PCSW (Program Control and Status Word) register format.


CS (Count Stalls). The CS flag determines the mode of

CCCOUNT, the 64-bit clock cycle counter. If CS = ‘1’, the

cycle counter increments on all clock cycles. If CS = ‘0’,

the clock cycle counter only increments on non-stall cycles. See also Section 3.1.5, “CCCOUNT—Clock Cycle

Counter.” After RESET, CS is set to ‘1’.

MSE and TRPMSE (Misaligned-Store Exception). The MSE bit will be set when the processor detects a store operation to an address that is not aligned. For example, a 32-bit store executed with an address that is not a multiple of four will cause MSE to be set. The TRPMSE bit enables the DSPCPU to raise misaligned address exceptions. An exception is requested if the intersection of MSE and TRPMSE is non-zero. The acceptance and handling of exceptions is described in Section 3.5, “Special Event Handling.”

Unaligned load operations do not cause an exception, because load operations can be speculative (i.e. their result is thrown away).

When the DSPCPU generates an unaligned address, the

low order address bit(s) (one bit in the case of a 16-bit

load, two bits for a 32-bit load) are forced to zero and the

load/store is executed from this aligned address.
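The address forcing described above amounts to clearing the low-order address bits. A minimal C model follows; the function names are hypothetical, and the access size is assumed to be 1, 2 or 4 bytes.

```c
#include <stdint.h>

/* Clear the low-order address bits for an access of size_bytes
 * (1, 2 or 4): one bit for 16-bit accesses, two bits for 32-bit. */
static uint32_t force_aligned(uint32_t addr, uint32_t size_bytes)
{
    return addr & ~(size_bytes - 1u);
}

/* Non-zero if addr is not a multiple of size_bytes; for a store, this is
 * the condition that would cause MSE to be set. */
static int is_misaligned(uint32_t addr, uint32_t size_bytes)
{
    return (addr & (size_bytes - 1u)) != 0;
}
```

For example, under this model a 32-bit access to 0x1003 is carried out at 0x1000, while a 16-bit access to the same address is carried out at 0x1002.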

WBE and TRPWBE (Write Back Error). The WBE flag

will be set whenever a program attempts to write back

more than 5 results simultaneously. This is indicative of

a programming error, likely caused by the scheduler or

assembler. The TRPWBE bit enables the corresponding

exception.

RSE, TRPRSE (Reserved Exception). RSE and TR-

PRSE are reserved for diagnostic purposes and not de-

scribed here.

TFE (Trap on First Exit). The TFE bit is a support bit for

the debugger. The TFE bit is set by the debugger prior to

taking a (non-interruptible) jump to the application pro-

gram. On the next interruptible jump (the first interrupt-

ible jump in the application being debugged), an excep-

tion is requested because the TFE bit is set. The

acceptance and handling of exception processing is de-

scribed in Section 3.5, “Special Event Handling.” It is the

responsibility of the exception handler software to clear

the TFE bit. The hardware does not clear or set TFE.

Corner-case note: Whenever a hardware update (e.g. an

exception being raised) and a software update (through

writepcsw) of the PCSW coincide, the new value of the

PCSW will be the value that is written by the writepcsw

instruction, except for those bits that the hardware is cur-

rently updating (which will reflect the hardware value).

3.1.4 SPC and DPC—Source and

Destination Program Counter

The SPC and DPC registers are support registers for exception processing. The DPC is updated during every interruptible jump with the target address of that interruptible jump. If an exception is taken at an interruptible jump, the value in the DPC register can be used by the exception handling routine as the return address to resume the program at the place of interruption.

The SPC register is updated during every interruptible

jump that is not interrupted by an exception. Thus on an

interrupted interruptible jump, the SPC register is not updated. The SPC register allows the exception handling

routine to determine the start address of the decision tree

(a block of uninterruptible, scheduled PNX1300 code)

that was executing when the exception was taken (see

also Section 3.5, “Special Event Handling”).

Corner-case note: Whenever a hardware update (during an interruptible jump) and a software update (through writedpc or writespc) coincide, the software update takes precedence.

3.1.5 CCCOUNT—Clock Cycle Counter

CCCOUNT is a 64-bit counter that counts clock cycles

since RESET. Cycle counting can occur in two modes,

depending on PCSW.CS. If PCSW.CS = ‘1’, the cycle

count increments on every CPU clock cycle. If PCSW.CS

= ‘0’, the clock cycle count only increments on non-stall

CPU cycles.

CCCOUNT is implemented as a master counter/slave register pair. The master counter is updated continuously. The value of the CCCOUNT slave register is updated with the current master cycle count during successful interruptible jumps only. The cycles and hicycles DSPCPU operations return the content of the 32 LSBs and 32 MSBs, respectively, of the slave register. This ensures that the value returned by hicycles and cycles is coherent, as long as there is no intervening interruptible jump, which makes these operations suitable for 64-bit high resolution timing from C source code programs. The curcycles DSPCPU operation returns the 32 LSBs of the master counter. The latter operation can be used for instruction cycle precise timing. When used, it must be precisely placed, probably at the assembly code level.
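The arithmetic behind these timing operations can be sketched in C as follows. The two helper names are hypothetical and stand in for code that has already obtained the hicycles/cycles or curcycles values; how those operations are reached from C (for example through compiler-provided custom operations) is not shown here and is not assumed.

```c
#include <stdint.h>

/* Combine the 32 MSBs and 32 LSBs read from the slave register into one
 * 64-bit cycle count. The pair is coherent only if no interruptible jump
 * occurred between the two reads. */
static uint64_t combine_cycle_count(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | (uint64_t)lo;
}

/* Elapsed cycles between two 32-bit readings of the master counter.
 * Unsigned subtraction gives the right answer across a single
 * wrap-around of the 32-bit value. */
static uint32_t elapsed_cycles32(uint32_t start, uint32_t end)
{
    return end - start;
}
```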

3.1.6 Boolean Representation

The bit pattern generated by boolean valued operations

(ileq, fleq etc.) is '00...00' (FALSE) or '00...01' (TRUE).

When interpreting a bit pattern as a boolean value, only

the LSB is taken into account, i.e. 'xx..x0' is interpreted

as FALSE and 'xx..x1' is interpreted as TRUE. In partic-

ular, wherever a general purpose register is used as a

‘guard’, the LSB determines whether execution of the

guarded operation takes place.

Table 3-2. PCSW FP exception flag definitions

Flag   Function
INV    Standard IEEE invalid flag
OVF    Standard IEEE overflow flag
UNF    Standard IEEE underflow flag
INX    Standard IEEE inexact flag
DBZ    Standard IEEE divide-by-zero flag
OFZ    ‘Output flushed to zero’; set if an operation caused a denormalized result
IFZ    ‘Input flushed to zero’; set if an operation was applied to one or more denormalized operands


3.1.7 Integer Representation

The architecture supports the notion of 'unsigned integers' and 'signed integers.' Signed integers use the standard two’s-complement representation.

Arithmetic on integers does not generate traps. If a result is not representable, the bit pattern returned is operation

specific, as defined in the individual operation description

section. The typical cases are:

• Wrap around for regular add- and subtract-type operations.

• Clamping against the minimum or maximum repre-

sentable value for DSP-type operations.

• Returning the least significant 32-bit value of a 64-bit

result (e.g., integer/unsigned multiply).
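The three typical cases listed above can be modeled in portable C as shown below. These helpers are illustrative only: the names are hypothetical, and the exact bit pattern returned by any particular DSPCPU operation is defined in its own description in Appendix A, not by this sketch.

```c
#include <stdint.h>

/* Wrap-around: the sum is computed modulo 2^32. The unsigned addition
 * avoids signed overflow; the conversion back to int32_t assumes the
 * usual two's-complement behavior of the C implementation. */
static int32_t add_wrap(int32_t a, int32_t b)
{
    return (int32_t)((uint32_t)a + (uint32_t)b);
}

/* Clamping: the exact sum is limited to the representable 32-bit range. */
static int32_t add_clamp(int32_t a, int32_t b)
{
    const int64_t sum = (int64_t)a + (int64_t)b;
    if (sum > INT32_MAX) return INT32_MAX;
    if (sum < INT32_MIN) return INT32_MIN;
    return (int32_t)sum;
}

/* Multiply returning only the least significant 32 bits of the 64-bit
 * product. */
static uint32_t mul_low32(uint32_t a, uint32_t b)
{
    return (uint32_t)((uint64_t)a * (uint64_t)b);
}
```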

3.1.8 Floating Point Representation

The PNX1300 architecture supports single precision (32-

bit) IEEE-754 floating point arithmetic.

All arithmetic conforms to the IEEE-754 standard in flush-to-zero mode.

All floating point compute operations round according to

the current setting of the PCSW IEEE mode field. The

current setting of the field determines result rounding (to

nearest, to zero, to positive infinity, to negative infinity).

Conversions from float to integer/unsigned are available

in two forms: a PCSW rounding-mode-observing form

and an ANSI-C-specific-rounding form. The ANSI-C-

specific form forces round to zero regardless of the

PCSW IEEE rounding mode. Conversion from integer/

unsigned to float always observes the IEEE rounding

mode.

Floating point exceptions are supported with two mechanisms. Each individual floating point operation (e.g. fadd) has a counterpart operation (faddflags) that computes the exception flag values. These operations can be used for precise exception identification (see footnote 1). The second mechanism uses the ‘sticky’ exception bits in the PCSW that collect aggregate exception events. The PCSW exception bits can selectively invoke CPU exception handling. See Section 3.5.2, “EXC (Exceptions).”

Table 3-3 shows the representation choices that were

made in PNX1300’s floating point implementation.

3.1.9 Addressing Modes

The addressing modes shown in Table 3-4 are support-

ed by the DSPCPU architecture (store operations allow

only displacement mode).

In these addressing modes, R[i] indicates one of the general purpose registers. The scale factor applied (1/2/4) is equal to the size of the item loaded or stored, i.e. 1 for a byte operation, 2 for a 16-bit operation and 4 for a 32-bit operation. The range of valid 'i', 'j' and 'k' values may differ between implementations of the architecture; the minimum values for implementation-dependent characteristics are shown in Table 3-5.

Note that the assembly code specifies the true displace-

ment, and not the value to be scaled. For example,

‘ld32d(–8) r3’ loads a 32-bit value from address (r3 – 8).

This is encoded in the binary operation pattern as a –2 in

the seven-bit field by the assembler. At runtime, the

scale factor four is applied to reconstruct the intended

displacement of –8.
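The relationship between the assembly-level displacement and the encoded field can be sketched as follows. The function names are hypothetical, and the field range of -64..63 is the seven-bit signed minimum given in Table 3-5; the sketch only illustrates the divide-on-encode, multiply-on-decode arithmetic and is not the assembler's actual implementation.

```c
#include <stdint.h>

/* Encode a true byte displacement into the scaled field value.
 * scale is the access size in bytes (1, 2 or 4). Returns 0 on success,
 * or -1 if the displacement is not a multiple of the scale or the field
 * value does not fit in seven signed bits. */
static int encode_displacement(int32_t disp, int32_t scale, int32_t *field)
{
    if (disp % scale != 0) {
        return -1;
    }
    const int32_t value = disp / scale;
    if (value < -64 || value > 63) {
        return -1;
    }
    *field = value;
    return 0;
}

/* Reconstruct the displacement at run time by applying the scale. */
static int32_t decode_displacement(int32_t field, int32_t scale)
{
    return field * scale;
}
```

With a scale of four, the displacement –8 from the example above encodes as –2 and decodes back to –8; the largest encodable 32-bit displacement under this assumption is 252.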

3.1.10 Software Compatibility

The DSPCPU architecture expressly does not support

binary compatibility between family members. The ANSI

C compiler ensures that all family members are compatible at the source-code level.

1. This mechanism allows precise exception identification

in the context of our multi-issue microprocessor core—

where many floating point operations may issue simul-

taneously—at the expense of additional operations

generated by the compiler. It also allows the compiler to

issue compute operations speculatively and compute

exceptions precisely.

Table 3-3. Special Float Value Representation

Item                                       Representation
+inf                                       0x7f800000
-inf                                       0xff800000
self generated qNaN                        0xffffffff
result of operation on any NaN argument    argument | 0x00400000 (forcing the NaN to be quiet)
signalling NaN                             never generated by PNX1300, accepted as per IEEE-754
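The bit patterns in Table 3-3 can be exercised with a short C sketch. The constant and function names below are hypothetical; the values come from the table, and the NaN test uses the standard IEEE-754 single-precision layout (exponent all ones, non-zero fraction).

```c
#include <stdint.h>

/* Bit patterns taken from Table 3-3 (names are illustrative only). */
#define FLOAT_POS_INF_BITS   0x7f800000u
#define FLOAT_NEG_INF_BITS   0xff800000u
#define FLOAT_SELF_QNAN_BITS 0xffffffffu
#define FLOAT_QUIET_BIT      0x00400000u

/* Non-zero if the pattern is a NaN: all exponent bits set and a
 * non-zero fraction field. */
static int is_nan_bits(uint32_t bits)
{
    return (bits & 0x7f800000u) == 0x7f800000u &&
           (bits & 0x007fffffu) != 0u;
}

/* Result pattern for an operation applied to a NaN argument: the
 * argument with the quiet bit forced on. */
static uint32_t quiet_nan_bits(uint32_t nan_bits)
{
    return nan_bits | FLOAT_QUIET_BIT;
}
```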

Table 3-4. Addressing Modes

Mode                  Suffix   Applies to     Name
R[i] + scaled(#j)     d        Load & Store   Displacement
R[i] + R[k]           r        Load only      Index
R[i] + scaled(R[k])   x        Load only      Scaled index

Table 3-5. Minimum values for implementation-dependent addressing mode components

Parameter     Minimum Range
‘i’ and ‘k’   0..127 (i.e., each implementation has at least 128 registers)
‘j’           -64..63 (i.e., displacements will be at least 7 bits long and signed)


3.2 INSTRUCTION SET OVERVIEW

3.2.1 Guarding (Conditional Execution)

In the PNX1300 architecture, all operations can be op-

tionally 'guarded'. A guarded operation executes condi-

tionally, depending on the value in the ‘guard' register.

For example, a guarded add is written as:

IF R23 iadd R14 R10 → R13

This should be taken to mean

if R23 then R13 ← R14 + R10.

The 'if R23' clause controls the execution of the operation based on the LSB of R23. Hence, depending on the

LSB of R23, R13 is either unchanged or set to contain

the integer sum of R14 and R10.

Guarding applies to all DSPCPU operations, except iimm

and uimm (load-immediate). It controls the effect on all

programmer-visible states of the system, i.e. register val-

ues, memory content, exception raising and device state.
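A guarded operation can be modeled in C as an ordinary operation wrapped in a test of the guard's least significant bit. The sketch below is a software illustration only; the function name is hypothetical and registers are represented by plain 32-bit variables.

```c
#include <stdint.h>

/* Model of 'IF guard iadd src1 src2 -> dst': the destination is written
 * only when the LSB of the guard is 1; all other guard bits are ignored. */
static void guarded_iadd(uint32_t guard, uint32_t src1, uint32_t src2,
                         uint32_t *dst)
{
    if (guard & 1u) {
        *dst = src1 + src2;
    }
}
```

Because only the LSB is examined, a guard value of 2 suppresses the operation while a guard value of 3 lets it execute, consistent with the boolean interpretation in Section 3.1.6.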

3.2.2 Load and Store Operations

Memory is byte addressable. Loads and stores must be

‘naturally aligned’, i.e. a 16-bit load or store must target

an address that is a multiple of 2. A 32-bit load or store

must target an address that is a multiple of 4. The BSX

bit in the PCSW determines the byte order of loads and

stores. For example, see ld32 and st32 in Appendix A,

“PNX1300/01/02/11 DSPCPU Operations.”

Only 32-bit load and store operations are allowed to access MMIO registers in the MMIO address aperture (see Section 3.4, “Memory and MMIO”). The results are undefined for other loads and stores. A load from a non-existent MMIO register returns an undefined result. A store to a non-existent MMIO register times out and then does not happen. There are no other side effects of an access to a non-existent MMIO register. The state of the BSX bit has no effect on the result of MMIO accesses.

Loads are allowed to be issued speculatively. Loads outside the range of valid data memory addresses for the

active process return an implementation-dependent val-

ue and do not generate an exception. Misaligned loads

also return an implementation dependent value and do

not generate an exception.

If a pair of memory operations involves one or more common bytes in
memory, the effect on the common bytes is as defined in Table 3-6.

Table 3-4 shows the supported addressing modes. The minimum values of
implementation-dependent addressing-mode components are shown in
Table 3-5.

Note: The index and scaled-index modes are not allowed with store
opcodes, due to the hardware restriction that each operation have at
most 2 source operand registers and 1 condition register. Stores use
1 operand register for the value to be stored, leaving only 1 register
to form an address.

The scale factor applied (1/2/4) in the scaled addressing modes is
equal to the size of the item loaded or stored, i.e. 1 for a byte
operation, 2 for a 16-bit operation and 4 for a 32-bit operation.

Table 3-7 lists the available load and store mnemonics

for the three addressing modes.

Example usage of load and store operations:

IF r10 ild16d(12) r12 → r13

If the LSB of r10 is set, load 16 bits starting at address (r12+12)
using the byte ordering indicated in PCSW.BSX, sign-extend the value
to 32 bits and store the result in r13.

IF r10 st32d(40) r12 r13

If the LSB of r10 is set, store the 32-bit value from r13 to the
address (r12+40) using the byte ordering indicated in PCSW.BSX.
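The ild16d example can be sketched in C as follows. This is an illustrative model, not the hardware definition: the function name, the byte array standing in for data memory, and the little_endian parameter are assumptions. The parameter stands in for the byte order selected by PCSW.BSX; which BSX value selects which order is not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Model of "ild16d(disp) rbase -> rdest": 16-bit signed load in
 * displacement mode. `mem` stands in for data memory. */
static int32_t ild16d(const uint8_t *mem, size_t mem_size,
                      uint32_t base, int32_t disp, bool little_endian)
{
    uint32_t addr = base + (uint32_t)disp;
    assert(addr % 2 == 0);                 /* natural alignment, 16-bit access */
    assert((size_t)addr + 1 < mem_size);   /* stay inside the model memory */

    uint32_t raw = little_endian
        ? ((uint32_t)mem[addr] | ((uint32_t)mem[addr + 1] << 8))
        : (((uint32_t)mem[addr] << 8) | (uint32_t)mem[addr + 1]);

    /* Sign-extend the 16-bit value to 32 bits. */
    return (raw & 0x8000u) ? (int32_t)raw - 0x10000 : (int32_t)raw;
}
```

The sketch asserts on a misaligned or out-of-range address to catch mistakes early. The hardware behaves differently: such loads return an implementation-dependent value and raise no exception.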

Table 3-6. Behavior of loads and stores with coincident addresses

Condition           Behavior
Tstore < Tload      If a store is issued before a load, the value loaded contains the new bytes.
Tload < Tstore      If a load is issued before a store, the value loaded contains the old bytes.
Tstore1 < Tstore2   If store1 is issued before store2, the resulting value contains the bytes of store2.
Tstore = Tload      If a load and store are issued in the same clock cycle, the result is UNDEFINED.
Tstore1 = Tstore2   If two stores are issued in the same clock cycle, the resulting stored value is undefined.

Table 3-7. Load and store mnemonics

Operation              Displacement   Index    Scaled-Index
8-bit signed load      ild8d          ild8r    -
8-bit unsigned load    uld8d          uld8r    -
16-bit signed load     ild16d         ild16r   ild16x
16-bit unsigned load   uld16d         uld16r   uld16x
32-bit load            ld32d          ld32r    ld32x
8-bit store            st8d           -        -
16-bit store           st16d          -        -
32-bit store           st32d          -        -


3.2.3 Compute Operations

Compute operations are register-to-register operations.

The specified operation is performed on one or two

source registers and the result is written to the destina-

tion register.

Immediate Operations. Immediate operations load an

immediate constant (specified in the opcode) and pro-

duce a result in the destination register.

Floating-Point Compute Operations. Floating-point compute operations
are register-to-register operations. The specified operation is
performed on one or two source registers and the result is written to
the destination register. Unless otherwise mentioned, all
floating-point operations observe the rounding mode bits defined in
the PCSW register. All floating-point operations not ending in ‘flags’
update the PCSW exception flags. All operations ending in ‘flags’
compute the exception flags as if the operation were executed and
return the flag values (in the same format as in the PCSW); the
exception flags in the PCSW itself remain unchanged.

Multimedia Operations. These special compute operations are like
normal compute operations, but the specified operations are not
usually found in general-purpose CPUs. These operations provide
special support for multimedia applications.

3.2.4 Special-Register Operations

Special register operations operate on the special regis-

ters: PCSW, DPC, SPC and CCCOUNT.

3.2.5 Control-Flow Operations

Control-flow operations change the value of the program counter.
Conditional jumps test the value in a register and, based on this
value, change the program counter to the address contained in a second
register or continue execution with the next instruction.
Unconditional jumps always change the program counter to the specified
immediate address.

Control-flow operations can be interruptible or non-interruptible.
Execution of an interruptible jump is the only occasion where PNX1300
allows special event handling to take place (see Section 3.5, “Special
Event Handling”).

3.3 PNX1300 INSTRUCTION ISSUE RULES

The PNX1300 VLIW CPU allows issue of 5 operations in each clock cycle
according to a set of specific issue rules. The issue rules impose
issue time constraints and a result writeback constraint. Any set of
operations that meets all constraints constitutes a legal PNX1300
instruction. A more extensive description and a few special-case issue
rules and limitations can be found in the Philips TriMedia SDE
documentation.

Issue time constraints:

• an operation implies a need for a functional unit type (as
  documented in Appendix A, “PNX1300/01/02/11 DSPCPU Operations.”)
• each operation requires an issue slot that has an instance of the
  appropriate functional unit type attached
• functional units should be ‘recovered’ from any prior operation
  issues

Writeback constraint:

• No more than 5 results should be simultaneously written to the
  register file at any point in time (writeback occurs ‘latency’
  cycles after issue)

issue slot 1   issue slot 2   issue slot 3   issue slot 4   issue slot 5
FALU           DSPMUL         DSPMUL         FALU           DMEMSPEC
SHIFTER        SHIFTER        FCOMP          DMEM           DMEM
               BRANCH         BRANCH         BRANCH
               IFMUL          IFMUL
DSPALU         FTOUGH         DSPALU
ALU            ALU            ALU            ALU            ALU
CONST          CONST          CONST          CONST          CONST

FTOUGH: latency 17, recovery 16.

Figure 3-3. PNX1300 issue slots, functional units, and latency.

Figure 3-3 shows all functional units of PNX1300, including the
relation to issue slots, and each functional unit’s latency (e.g. 1
for CONST, 3 for FALU, etc.). With the exception of FTOUGH, each
functional unit can accept an operation every clock cycle, i.e. has a
recovery time of 1. The binding of operations to functional unit types
is summarized in Table 3-8. In Appendix A, “PNX1300/01/02/11 DSPCPU
Operations”, each operation lists the precise functional unit and unit
latency.
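The slot-assignment part of the issue time constraints can be checked mechanically. The following C sketch tests only that each of the five operations sits in an issue slot with a matching functional unit; it does not model recovery or the writeback constraint. The slot masks are a reading of Figure 3-3 and should be verified against Appendix A. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

enum unit { CONST, ALU, DSPALU, DSPMUL, DMEM, DMEMSPEC,
            SHIFTER, BRANCH, FALU, IFMUL, FCOMP, FTOUGH, UNIT_COUNT };

/* Bit (s - 1) set means issue slot s has an instance of the unit
 * (as read from Figure 3-3; treat as illustrative). */
static const unsigned char slot_mask[UNIT_COUNT] = {
    [CONST]    = 0x1F,  /* slots 1-5 */
    [ALU]      = 0x1F,  /* slots 1-5 */
    [DSPALU]   = 0x05,  /* slots 1, 3 */
    [DSPMUL]   = 0x06,  /* slots 2, 3 */
    [DMEM]     = 0x18,  /* slots 4, 5 */
    [DMEMSPEC] = 0x10,  /* slot 5 */
    [SHIFTER]  = 0x03,  /* slots 1, 2 */
    [BRANCH]   = 0x0E,  /* slots 2, 3, 4 */
    [FALU]     = 0x09,  /* slots 1, 4 */
    [IFMUL]    = 0x06,  /* slots 2, 3 */
    [FCOMP]    = 0x04,  /* slot 3 */
    [FTOUGH]   = 0x02,  /* slot 2 */
};

/* True if each of the five operations sits in a slot with a matching unit.
 * ops[s] is the unit type needed by the operation in issue slot s + 1. */
static bool slots_ok(const enum unit ops[5])
{
    for (int s = 0; s < 5; s++) {
        assert(ops[s] < UNIT_COUNT);
        if (!(slot_mask[ops[s]] & (1u << s)))
            return false;
    }
    return true;
}
```

For example, the assignment FALU, DSPMUL, FCOMP, DMEM, DMEMSPEC to slots 1 through 5 passes the check, whereas a DMEM operation placed in slot 1 fails it.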

3.4 MEMORY AND MMIO

PNX1300 defines four apertures in its 32-bit address space: the memory
hole, the DRAM aperture, the MMIO aperture and the PCI apertures (see
Figure 3-4). The memory hole covers addresses 0..0xff. The DRAM and
MMIO apertures are defined by the values in MMIO registers; the PCI
apertures consist of every address that does not fall in the other
three apertures.

3.4.1 Memory Map

DRAM is mapped into an aperture extending from the

address in DRAM_BASE to the address in

DRAM_LIMIT. The maximum DRAM aperture size is 64

MB.

The MMIO aperture is located at address MMIO_BASE

and is a fixed 2-MB size.

In the default operating mode, all memory accesses not

going to either the hole, DRAM or MMIO space are inter-

preted as PCI accesses. This behavior can be overrid-

den as described in Section 5.3.8, “Memory Hole and

PCI Aperture Disable.”

The MMIO aperture and the DRAM aperture can be at any naturally
aligned location, in any order, but should not overlap; if they do,
the consequences are undefined.

The values of DRAM_BASE, DRAM_LIMIT, and MMIO_BASE are set during the
boot process. In the case of a PCI host assisted boot, the values are
determined by the host BIOS. In case of standalone boot (i.e., PNX1300
is the PCI host), the values are taken from the boot ROM. Refer to
Chapter 13, “System Boot” for details. DSPCPU update of DRAM_BASE and
MMIO_BASE is possible, but not recommended; see Section 11.6.3,
“MMIO/DRAM_BASE updates.”

3.4.2 The Memory Hole

The memory hole from address 0 to 0xff serves to protect the system
from performance loss due to speculative loads. Due to the nature of C
program references, most speculative loads issued by the DSPCPU fall
in the range covered by the hole. Activated by default upon RESET, the
hole serves to ensure that these speculative loads do NOT cause PCI
read accesses and slow down the system. The value returned by any data
load from the hole is 0. The hole only protects loads. Store
operations in the hole do cause writes to PCI, SDRAM or MMIO as
determined by the aperture base address values. If the SDRAM aperture
overlaps the memory hole, the memory hole is ignored.

The hole can be temporarily disabled through the

DC_LOCK_CTL register. This is described in Section

5.3.8, “Memory Hole and PCI Aperture Disable.”
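The aperture rules of Sections 3.4.1 and 3.4.2 can be summarized as a small address classifier. The C sketch below is illustrative only. It assumes that DRAM_LIMIT is the first address past the DRAM aperture, and it does not model disabling of the hole or of the PCI aperture through DC_LOCK_CTL. The function and type names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum region { REGION_HOLE, REGION_DRAM, REGION_MMIO, REGION_PCI };

#define HOLE_SIZE  0x100u        /* addresses 0..0xff */
#define MMIO_SIZE  0x200000u     /* fixed 2-MB MMIO aperture */

/* Classify a data access. dram_limit is treated as the first address past
 * the DRAM aperture (an assumption of this sketch). */
static enum region classify(uint32_t addr, bool is_load,
                            uint32_t dram_base, uint32_t dram_limit,
                            uint32_t mmio_base)
{
    assert(dram_base < dram_limit);
    bool hole_overlapped = dram_base < HOLE_SIZE;  /* DRAM covers the hole */

    if (is_load && addr < HOLE_SIZE && !hole_overlapped)
        return REGION_HOLE;            /* load returns 0, no PCI read */
    if (addr >= dram_base && addr < dram_limit)
        return REGION_DRAM;
    if ((uint32_t)(addr - mmio_base) < MMIO_SIZE)  /* wraps if addr < base */
        return REGION_MMIO;
    return REGION_PCI;                 /* everything else */
}
```

With the MMIO aperture at its RESET location 0xEFE00000 and DRAM starting above the hole, a load from address 0x10 falls in the hole, while a store to the same address is classified as a PCI access.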

3.4.3 MMIO Memory Map

Devices are controlled through memory-mapped device registers,
referred to as MMIO registers. To ensure compatibility with future
devices, any undefined MMIO bits should be ignored when read, and
written as ‘0’s. Some devices can autonomously access data memory
(DMA) and most devices can cause CPU interrupts.

Table 3-8. Functional unit operations

unit type   operation category
const       immediate operations
alu         32-bit arithmetic, logical, pack/unpack
dspalu      dual 16-bit, quad 8-bit multimedia arithmetic
dspmul      dual 16-bit and quad 8-bit multimedia multiplies
dmem        loads/stores
dmemspec    cache coherency, cache control, prefetch
shifter     multi-bit shift
branch      control flow
falu        floating point arithmetic & conversions
ifmul       32-bit integer and floating point multiplies
fcomp       single cycle floating point compares
ftough      iterative floating point square root and division

[Figure: the 32-bit address space from 0x0000 0000 to 0xFFFF FFFF,
showing the 256-byte hole at address 0, the DRAM aperture (DRAM_BASE
to DRAM_LIMIT, 1 MB - 64 MB), the 2-MB MMIO aperture at MMIO_BASE, and
PCI space everywhere else.]

Figure 3-4. PNX1300 memory map.

The 2-MB MMIO aperture is initially located at address 0xEFE00000 on
RESET; it is relocated by the PCI BIOS for PC-hosted PNX1300 boards;
its final location is determined by the boot EEPROM for standalone
systems. See Chapter 13, “System Boot” for more information.

Figure 3-5 gives a detailed overview of the MMIO memory map (addresses
used are offsets with respect to the MMIO base). The operating system
on PNX1300 can change MMIO_BASE by writing to the MMIO_BASE MMIO
location. User programs should not attempt this. Refer to the TriMedia
SDE Reference Manual for the standard method to access the device
registers from C language device drivers.

Only 32-bit load and store operations are allowed to access MMIO
registers in the MMIO address aperture. The results are undefined for
other loads and stores. Reads from non-existent MMIO registers return
undefined values. Writes to non-existent MMIO registers time out.
There are no side effects of accesses to non-existent MMIO registers.
The state of the PCSW BSX bit has no effect on the result of MMIO
accesses.
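The access rules above (32-bit accesses only, undefined bits ignored on read and written as ‘0’) can be captured in small helper functions. The sketch below is not the standard access method of the TriMedia SDE; the helper names and the pointer to the start of the aperture are hypothetical, chosen only to illustrate the rules.

```c
#include <assert.h>
#include <stdint.h>

#define MMIO_APERTURE_SIZE 0x200000u   /* 2-MB aperture */

/* 32-bit MMIO read at a byte offset from MMIO_BASE.
 * `mmio` points at the start of the aperture (hypothetical mapping). */
static uint32_t mmio_read32(volatile uint32_t *mmio, uint32_t offset)
{
    assert(offset % 4 == 0 && offset < MMIO_APERTURE_SIZE);
    return mmio[offset / 4];
}

/* 32-bit MMIO write at a byte offset from MMIO_BASE. */
static void mmio_write32(volatile uint32_t *mmio, uint32_t offset, uint32_t v)
{
    assert(offset % 4 == 0 && offset < MMIO_APERTURE_SIZE);
    mmio[offset / 4] = v;
}

/* Read-modify-write that keeps only the defined bits: undefined bits are
 * ignored on read and written back as 0. */
static void mmio_update(volatile uint32_t *mmio, uint32_t offset,
                        uint32_t defined_mask, uint32_t set, uint32_t clear)
{
    uint32_t v = mmio_read32(mmio, offset) & defined_mask;
    v = (v & ~clear) | (set & defined_mask);
    mmio_write32(mmio, offset, v);
}
```

Passing the mask of defined bits explicitly keeps a driver from writing ‘1’s into bit positions that a future device may define.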

The Icache tag and LRU bit access aperture gives the DSPCPU read-only
access to the Icache status. Refer to Section 5.4.8, “Reading Tags and
Cache Status” for details.

The EXCVEC MMIO location is explained in Section

3.5.2, “EXC (Exceptions).” Section 3.5.3, “INT and NMI

(Maskable and Non-Maskable Interrupts),” describes

the locations that deal with the setup and handling of in-

terrupts: ISETTING, IPENDING, ICLEAR, IMASK and

the interrupt vectors. The timer MMIO locations are de-

scribed in Section 3.8, “Timers.” The instruction and

data breakpoint are described in Section 3.9, “Debug

Support.” The MMIO locations of each device are treat-

ed in the respective device chapters.

3.5 SPECIAL EVENT HANDLING

The PNX1300 microprocessor responds to the special

events shown in Table 3-9, ordered by priority.

With the exception of RESET, which is enabled at all times, the
architecture of the DSPCPU allows special event handling to begin only
during an interruptible jump operation (ijmpt, ijmpf or ijmpi) that
succeeds (i.e., is a taken jump). EXC, NMI and INT handling can be
initiated during handling of an EXC or an INT, but only during
successful interruptible jumps.

Offset      Block
0x00 0000   (start of MMIO aperture)
0x01 0000   Icache tags & LRU (r/o)
0x10 0000   Main memory, cache control
0x10 0400   MMIO base
0x10 0800   Vectored interrupt controller
0x10 0C00   Timers
0x10 1000   Debug support
0x10 1400   Video In
0x10 1800   Video Out
0x10 1C00   Audio In
0x10 2000   Audio Out
0x10 2400   Image coprocessor
0x10 2800   VLD coprocessor
0x10 2C00   SSI interface
0x10 3000   PCI interface
0x10 3400   I2C interface
0x10 3800   JTAG interface
0x1F FFFF   (end of MMIO aperture)
All other offsets: Reserved for Future Use

Offset      Register
0x10 0000   DRAM_BASE
0x10 0004   DRAM_LIMIT
0x10 0400   MMIO_BASE
0x10 0800   excvec
0x10 0810   isetting0
0x10 0814   isetting1
0x10 0818   isetting2
0x10 081C   isetting3
0x10 0820   ipending
0x10 0824   iclear
0x10 0828   imask
0x10 0880   intvec0
0x10 0884   intvec1
0x10 0888   intvec2
...
0x10 08F8   intvec30
0x10 08FC   intvec31
0x10 0C00   timer1
0x10 0C20   timer2
0x10 0C40   timer3
0x10 0C60   systimer
0x10 1000   instruction breakpoints
0x10 1200   data breakpoints

Figure 3-5. Memory map of MMIO address space (addresses are offset from MMIO_BASE).

Table 3-9. Special Events and Event Vectors

Event      Vector
RESET      (Highest priority) vector to DRAM_BASE
EXC        (All exceptions) vector to EXCVEC (programmable)
NMI, INT   (Non-maskable interrupt, maskable interrupt) use the programmed vector (one of 32 vectors depending on the interrupt source)


The instruction scheduler uses interruptible jumps exclusively for
inter-decision tree jumps. Hence, within a decision tree, no
special-event processing can be initiated. If a tree-to-tree jump is
taken, special-event processing is allowed. Since the only registers
live at this point (i.e., that contain useful data) are the global
registers allocated by the ANSI C compiler, only a subset of the
registers needs to be preserved by the event handlers. Refer to the
TriMedia SDE Reference Manual for details on which registers can be in
use. The DSPCPU register state can be described by the contents of
this subset of general purpose registers and the contents of the PCSW
and the DPC value (the target of the inter-tree jump).

The priority resolution mechanism built into the DSPCPU hardware
dispatches the highest-priority, non-masked special-event request at
the time of a successful interruptible jump operation. In view of the
simple, real-time-oriented nature of the mechanisms provided, only
limited nesting of events should be allowed.

3.5.1 RESET

RESET is the highest priority special event. It is asserted

by external hardware or by the host CPU. PNX1300 will

respond to it at any time.

External hardware reset through the TRI_RESET# pin initiates boot
protocol execution as described in Chapter 13, “System Boot.” This
causes the current PC value to be lost and instruction execution to
start from address DRAM_BASE.

A PCI host CPU can perform a PNX1300 DSPCPU-only reset by an MMIO
write to the BIU_CTL.SR and CR bits. Such a reset does not cause a
full boot; instead, the DSPCPU resumes execution from DRAM_BASE.

3.5.2 EXC (Exceptions)

The DSPCPU enters EXC special-event processing un-

der the following conditions:

1. RESET is de-asserted.

2. The intersection PCSW[15,6:0] & PCSW[31,22:16] is

non-empty or PCSW.TFE is set.

3. A successful interruptible jump is in the final jump ex-

ecution stage.

DSPCPU hardware takes the following actions on the ini-

tiation of EXC processing:

1. DPC is assigned the intended destination address of the successful
jump.

2. Instruction processing starts at EXCVEC.

All other actions are the responsibility of the EXC handler

software. Note that no other special event processing will

take place until the handler decides to execute an inter-

ruptible jump that succeeds.

3.5.3 INT and NMI (Maskable and Non-

Maskable Interrupts)

The on-chip Vectored Interrupt Controller (VIC) provides 32 INT
request input hardware lines. The interrupt controller prioritizes and
maps attention requests from several different peripherals onto
successive INT requests to the DSPCPU.

INT special event processing will occur under the follow-

ing conditions:

1. RESET is de-asserted.

2. The intersection PCSW[15,6:0] & PCSW[31,22:16] is

empty and PCSW.TFE is not set.

3. The intersection of IPENDING and IMASK is non-

empty.

4. The interrupt is at level NMI or PCSW.IEN = 1.

5. A successful interruptible jump is in the final jump ex-

ecution stage.

DSPCPU hardware takes the following actions on the ini-

tiation of NMI or INT processing:

1. DPC is assigned the intended destination address of the successful
jump.

2. Instruction processing starts at the appropriate inter-

rupt vector.

All other actions are the responsibility of the INT handler

software. Note that no other special event processing will

take place until the handler decides to execute an inter-

ruptible jump that succeeds.
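The entry conditions for EXC and for INT/NMI processing can be written as two predicates. The C sketch below is illustrative: PCSW.TFE and PCSW.IEN are passed as separate booleans because their bit positions are not given in this section, and the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define PCSW_LOW_MASK 0x807Fu   /* bits 15 and 6:0 */

/* Condition 2 of the EXC list: the intersection of PCSW[15,6:0] and
 * PCSW[31,22:16] is non-empty, or PCSW.TFE is set. */
static bool exc_requested(uint32_t pcsw, bool tfe)
{
    uint32_t low  = pcsw & PCSW_LOW_MASK;
    uint32_t high = (pcsw >> 16) & PCSW_LOW_MASK;   /* bits 31 and 22:16 */
    return (low & high) != 0 || tfe;
}

/* Conditions 1-5 of the INT/NMI list, evaluated for one candidate source. */
static bool int_dispatch_ok(bool reset_asserted, uint32_t pcsw, bool tfe,
                            uint32_t ipending, uint32_t imask,
                            bool source_is_nmi, bool ien,
                            bool taken_interruptible_jump)
{
    return !reset_asserted
        && !exc_requested(pcsw, tfe)
        && (ipending & imask) != 0
        && (source_is_nmi || ien)
        && taken_interruptible_jump;
}
```

The second predicate makes the priority order visible: a pending exception blocks interrupt dispatch, and a cleared PCSW.IEN blocks every source except one at NMI level.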

3.5.3.1 Interrupt vectors

Each of the 32 interrupt sources can be assigned an arbitrary
interrupt vector (the address of the first instruction of the
interrupt handler). A vector is set up by writing the address to one
of the MMIO locations shown in Figure 3-6. The state of the MMIO
vector locations is undefined after RESET. (Addresses of the MMIO
vector registers are offset with respect to MMIO_BASE.)

MMIO_BASE offset   Register (bits 31..0)   Contents
0x10 0880          INTVEC0 (r/w)           Source 0 vector
0x10 0884          INTVEC1 (r/w)           Source 1 vector
0x10 0888          INTVEC2 (r/w)           Source 2 vector
...
0x10 08F8          INTVEC30 (r/w)          Source 30 vector
0x10 08FC          INTVEC31 (r/w)          Source 31 vector

Figure 3-6. Interrupt vector locations in MMIO address space.
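The vector registers in Figure 3-6 are spaced 4 bytes apart, so the offset for a given source follows from simple arithmetic. A minimal C sketch (the macro and function names are hypothetical):

```c
#include <assert.h>
#include <stdint.h>

#define INTVEC0_OFFSET 0x100880u   /* INTVEC0, offset from MMIO_BASE */

/* Offset (from MMIO_BASE) of the vector register for interrupt source n. */
static uint32_t intvec_offset(unsigned source)
{
    assert(source < 32);
    return INTVEC0_OFFSET + 4u * source;
}
```

The result is an offset; the address to store the handler address to is MMIO_BASE plus this offset.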


Programmer’s note: See the Philips TriMedia Cookbook (Book 2 of
TriMedia SDE documentation) for information on writing interrupt
handlers.

3.5.3.2 Interrupt modes

DSPCPU interrupt sources can be programmed to operate in either
level-sensitive or edge-triggered mode. Operation in edge-triggered or
level-sensitive mode is determined by a bit in the ISETTING MMIO
locations corresponding to the source, as defined in Figure 3-7. On
RESET, all ISETTING registers are cleared.

In edge-triggered mode, the leading edge of the signal on the device
interrupt request line causes the VIC (Vectored Interrupt Controller)
to set the interrupt pending flag corresponding to the device source
number. Note that, for active high signals, the leading edge is the
positive edge, whereas for active low request signals (such as PCI
INTA#), the negative edge is the leading edge. The interrupt remains
pending until one of two events occurs:

• The VIC successfully dispatches the vector corre-

sponding to the source to the PNX1300 CPU, or

• PNX1300 CPU software clears the interrupt-pending

flag by a direct write to the ICLEAR location.

No interrupt acknowledge to ICLEAR is needed for devices operating in
edge-triggered mode, since the vector dispatch clears the IPENDING
request. The device itself may however need a device-specific
interrupt acknowledge to clear the requesting condition.
Edge-triggered mode is not recommended for devices that can signal
multiple simultaneous interrupt conditions. The on-chip timers must be
operated in edge-triggered mode.

In level-sensitive mode, the device requests an interrupt by asserting
the VIC source request line. The device holds the request until the
device interrupt handler performs a device interrupt acknowledge. It
is highly recommended that all off-chip and on-chip sources, with the
exception of the timers, operate in level-sensitive mode.

3.5.3.3 Device interrupt acknowledge

All devices capable of generating level-triggered interrupts have
interrupt acknowledge bits in their memory-mapped control registers
for this purpose. An interrupt acknowledge is performed by a store to
such a control register, with a ‘1’ in the bit position(s)
corresponding to the desired acknowledge flags.

Programmer’s note: the store operation that performs the interrupt
acknowledge should be issued at least 2 cycles before the
(interruptible) jump that ends an interrupt handler. This ensures that
the same interrupt is not dispatched twice due to request de-assertion
clock delays.

3.5.3.4 Interrupt priorities

Each interrupt source can be programmed to request one out of eight
levels of priorities. The highest priority level (level 7) corresponds
to requesting an NMI, an interrupt that cannot be masked by the DSPCPU
PCSW.IEN bit. The other levels request regular interrupts, which can
be masked as a group by the PCSW.IEN flag. Level six represents the
highest priority normal interrupt level and level zero represents the
lowest. Refer to Figure 3-7 for details of programming the priority
level.

The VIC arbitrates the highest-priority pending interrupt requestor.
Sources programmed to request at the same level are treated with a
fixed priority, from source number 0 (highest) to 31 (lowest). At such
time as the DSPCPU is willing to process special events, the vector of
the highest priority NMI source will be dispatched. If no NMI is
pending, and the DSPCPU allows regular interrupts (PCSW.IEN is
asserted), the vector of the highest priority regular source is
dispatched. Once a vector is dispatched, the corresponding interrupt
pending flag is de-asserted (edge-triggered mode sources only).
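The arbitration rule can be sketched as a loop over the 32 sources. The C code below is a software model for illustration, not a description of the VIC hardware. It assumes the ISETTING layout read from Figure 3-7: source n uses the 4-bit MP field at bits 4*(n mod 8)+3 down to 4*(n mod 8) of ISETTING(n / 8), with the priority level in the low three bits. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 4-bit MP field of source n, taken from the four ISETTING registers. */
static unsigned mp_field(const uint32_t isetting[4], unsigned source)
{
    assert(source < 32);
    return (isetting[source / 8] >> (4 * (source % 8))) & 0xFu;
}

/* Returns the source number to dispatch, or -1 if none qualifies.
 * pending: IPENDING & IMASK; ien: state of PCSW.IEN. */
static int vic_arbitrate(const uint32_t isetting[4], uint32_t pending, bool ien)
{
    int best = -1;
    unsigned best_prio = 0;
    for (unsigned s = 0; s < 32; s++) {
        if (!(pending & (UINT32_C(1) << s)))
            continue;
        unsigned prio = mp_field(isetting, s) & 0x7u;  /* 7 = NMI level */
        if (prio != 7 && !ien)
            continue;                  /* regular interrupts masked by IEN */
        /* Strictly greater: on a tie the lower source number wins. */
        if (best < 0 || prio > best_prio) {
            best = (int)s;
            best_prio = prio;
        }
    }
    return best;
}
```

Scanning from source 0 upward and replacing the candidate only on a strictly higher level reproduces the fixed tie-break order from source 0 (highest) to 31 (lowest).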

3.5.3.5 Interrupt masking

A single MMIO register (IMASK in Figure 3-8) allows masking of an
arbitrary subset of the interrupt sources. Masking applies to both
regular as well as NMI level requestors. Masking is used by software
to disable unused devices and/or to implement nested interrupt
handling. In the latter case, each interrupt handler can stack the old
IMASK content for later restoration and insert a new mask that only
allows the interrupts it is willing to handle. For level-triggered
device handlers, IMASK should also exclude the device itself to
prevent repeated handler activation.
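The stacking of IMASK for nested handling can be sketched as an enter/leave pair. The C code below is a hypothetical illustration: the function names are not from the SDE, and the pointer stands in for the IMASK MMIO location.

```c
#include <assert.h>
#include <stdint.h>

/* Enter a handler for `source`: save the old IMASK and install a mask that
 * admits only `allowed` sources, never the handler's own source. */
static uint32_t imask_enter(volatile uint32_t *imask, unsigned source,
                            uint32_t allowed)
{
    assert(source < 32);
    uint32_t old = *imask;
    *imask = allowed & ~(UINT32_C(1) << source);
    return old;                 /* caller keeps this on its stack */
}

/* Leave the handler: restore the mask saved by imask_enter(). */
static void imask_leave(volatile uint32_t *imask, uint32_t saved)
{
    *imask = saved;
}
```

Clearing the handler's own bit unconditionally covers the level-triggered case described above, where the device request line stays asserted until the handler acknowledges it.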

MMIO_BASE offset   Register          31:28  27:24  23:20  19:16  15:12  11:8   7:4    3:0
0x10 081C          ISETTING3 (r/w)   MP31   MP30   MP29   MP28   MP27   MP26   MP25   MP24
0x10 0818          ISETTING2 (r/w)   MP23   MP22   MP21   MP20   MP19   MP18   MP17   MP16
0x10 0814          ISETTING1 (r/w)   MP15   MP14   MP13   MP12   MP11   MP10   MP9    MP8
0x10 0810          ISETTING0 (r/w)   MP7    MP6    MP5    MP4    MP3    MP2    MP1    MP0

Each MP Field:
0xxx   source operates in edge-triggered mode
1xxx   source operates in level-sensitive mode

Each MP Field:
x111   NMI (highest) priority
x110   maskable level 6
...
x000   maskable level 0

Figure 3-7. Interrupt mode and priority MMIO locations and formats.

Each interrupt source device typically has its own interrupt enable
flag(s) that determine whether certain key device events lead to the
request of an interrupt. In addition, the PCSW.IEN flag determines
whether the DSPCPU is willing to handle regular interrupts.
Non-maskable interrupts ignore the state of this flag.

All three mechanisms are necessary: the PCSW.IEN flag

is used to implement critical sections of code during

which the RTOS (real-time operating system) is unable

to handle regular interrupts. The IMASK is used to allow

full control over interrupt handler nesting. The device in-

terrupt flags set the operational mode of the device.

When RESET is asserted, IPENDING, ICLEAR, and IMASK are set to all
zeroes. (MMIO register addresses shown in Figure 3-8 are offset
addresses with respect to MMIO_BASE.)

3.5.3.6 Software interrupts and

acknowledgment

The IPENDING register shown in Figure 3-8 can be read to observe the
currently pending interrupts. Each bit read depends on the mode of the
source:

• For a level-sensitive source, a bit value corresponds

to the current state of the device interrupt request

line.

• For an edge-triggered interrupt, a ‘1’ is read if and

only if an interrupt request occurred and the corre-

sponding vector has not yet been dispatched.

Software can request an interrupt for sources operating in
edge-triggered mode. Writes to the IPENDING register assert an
interrupt request for all sources where a 1 occurred in the bit
position of the written value. The state of sources where a 0 occurred
in the written value is unchanged. Writes have no effect on
level-sensitive mode sources. The interrupt request, if not masked,
will occur at the next successful interruptible jump. This differs
from the conventional software interrupt-like semantics of many
architectures. Any of the 32 sources can be requested in software. In
normal operation however, software-requested interrupts should be
limited to source vectors not allocated for hardware devices. Note
that another PCI master can request interrupts by manipulating the
IPENDING location in the MMIO aperture. This is useful for
inter-processor communication.

The ICLEAR register reads the same as the IPENDING pending flags for
edge-triggered mode sources. All IPENDING flags corresponding to bit
positions in which ‘1’s are written are cleared. IPENDING flags
corresponding to bit positions in which ‘0’s are written are not
affected. Writes have no effect on level-sensitive mode sources.

When a pending interrupt bit is being cleared through a

write to the ICLEAR register at the same time that the

hardware is trying to set that interrupt bit, the hardware

takes precedence.
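The read and write semantics of IPENDING and ICLEAR can be modeled in software as follows. This C sketch is a simplified model under stated assumptions: the structure, its field names, and the hw_set_now parameter (requests that hardware latches in the same cycle as the ICLEAR write) are hypothetical, and vector dispatch is not modeled.

```c
#include <stdint.h>

struct vic_model {
    uint32_t edge_pending;  /* latched requests of edge-triggered sources */
    uint32_t level_lines;   /* current request lines, level-sensitive sources */
    uint32_t level_mode;    /* bit i set: source i is level-sensitive */
};

/* Read of IPENDING: line state for level sources, latch for edge sources. */
static uint32_t ipending_read(const struct vic_model *v)
{
    return (v->level_lines & v->level_mode) |
           (v->edge_pending & ~v->level_mode);
}

/* Write to IPENDING: 1 bits request an interrupt on edge-triggered sources;
 * 0 bits and level-sensitive sources are unaffected. */
static void ipending_write(struct vic_model *v, uint32_t value)
{
    v->edge_pending |= value & ~v->level_mode;
}

/* Write to ICLEAR: 1 bits clear edge-triggered pending flags, except where
 * hardware sets the same flag in the same cycle (hardware takes precedence). */
static void iclear_write(struct vic_model *v, uint32_t value,
                         uint32_t hw_set_now)
{
    v->edge_pending &= ~(value & ~v->level_mode);
    v->edge_pending |= hw_set_now & ~v->level_mode;
}
```

In this model, a host that wants to signal the DSPCPU writes a value with bit 28 set through ipending_write(); a write that targets a level-sensitive source has no effect.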

3.5.3.7 NMI sequentialization

In most applications, it is desirable not to nest NMIs. The NMI
interrupt handler can accomplish this by saving the old IMASK content
and clearing IMASK before the first interruptible jump is executed by
the NMI handler.

3.5.3.8 Interrupt source assignment

Table 3-10 shows the assignment of devices to interrupt source
numbers, as well as the recommended operating mode (edge or level
triggered). Note that there are a total of 5 external pins available
to assert interrupt requests. The PCI INTA to INTD requests are
asserted by active low signal conventions, i.e. a zero level or a
negative edge asserts a request. The USERIRQ pin operates with active
high signalling conventions.

3.6 PNX1300 TO HOST INTERRUPTS

In systems where PNX1300 is operating in the presence of a host CPU on
PCI, PNX1300 can generate interrupts to the host, using any
combination of the four PCI INTA# to INTD# pins. In a typical host
system, only one of these pins needs to be wired to the PCI bus
interrupt request lines. Any unused pins of this group are then
available for use as software programmable I/O pins.

MMIO_BASE offset   Register (bits 31..0)
0x10 0828          IMASK (r/w)
0x10 0824          ICLEAR (r/w)
0x10 0820          IPENDING (r/w)

Each IMASK(i) bit:
On read or write, 0 → disallow source i interrupt request
On read or write, 1 → allow source i interrupt request

Each ICLEAR(i) bit:
On read, same as IPENDING(i)
On write, 1 → clear source i interrupt request

Each IPENDING(i) bit:
On read, 1 → source i interrupt request is pending
On write, 1 → software source i interrupt request

Figure 3-8. Interrupt controller request, clear, and mask MMIO registers.

The INT_CTL register (see Figure 3-9) IEx bits, when set, enable the
open collector driver of the four INTD#..INTA# pins. The INTx bits
determine the output value generated (if enabled). A ‘1’ in INTx
causes the corresponding PCI interrupt pin to be asserted (low INTx#
pin). The ISx bits are read-only and reflect the current actual state
of the pins. Note that the pins have negative logic (active low)
polarity and are of the open collector output type. Hence the pin
voltage is low (active) when the logical value set or seen in the
INT_CTL register is ‘1’.

The assertion and de-assertion of host interrupts is the

responsibility of PNX1300 software.

See also Section 11.6.17, “INT_CTL Register.”

3.7 HOST TO PNX1300 INTERRUPTS

A host CPU can generate an interrupt to PNX1300 in

several ways:

• by a PCI MMIO write to IPENDING to assert the

HOSTCOMM interrupt (bit 28)

• by a hardware circuit that asserts one of the interrupt

request pins TRI_USERIRQ, or INTA..INTD.

The first and most common method requires no circuitry

and leaves the interrupt pins available for other purposes.

3.8 TIMERS

The DSPCPU contains four programmable timer/counters, all with the
same function. The first three (TIMER1, TIMER2, TIMER3) are intended
for general use. The fourth timer/counter (SYSTIMER) is reserved for
use by the system software and should not be used by applications.

Each timer has three registers as shown in Figure 3-10. The MMIO
register addresses shown are offset addresses with respect to the
timer’s base address.

Each timer/counter can be set to count one of the event types
specified in Table 3-12. Note that the DATABREAK event is special, in
that the timer/counter may increment by zero, one or two in each clock
cycle. For all other event types, increments are by zero or one. The
CACHE1 and CACHE2 events serve as cache performance monitoring
support. The actual event selected for CACHE1 and CACHE2 is determined
by the MEM_EVENTS MMIO register, see Section 5.7, “Performance
Evaluation Support.” If a PNX1300 pin signal (VI_CLK, etc.) is
selected as an event, positive-going edges on the signal are counted.

Each timer increments its value until the modulus is

reached. On the clock cycle where the incremented val-

ue would equal or exceed the modulus, the value wraps

around to zero or one (in the case of an increment by

two), and an interrupt is generated as defined in

Table 3-10. The timer interrupt source mode should be

set as edge-sensitive. No software interrupt acknowl-

edge to the timer device is necessary.

Counting starts and continues as long as the run bit is

set.

Table 3-10. Interrupt source assignments

SOURCE NAME    SRC NUM   MODE     SOURCE DESCRIPTION
PCI INTA       0         level    PCI_INTA# pin signal
PCI INTB       1         level    PCI_INTB# pin signal
PCI INTC       2         level    PCI_INTC# pin signal
PCI INTD       3         level    PCI_INTD# pin signal
TRI_USERIRQ    4         either   external general-purpose
TIMER1         5         edge     general-purpose timer
TIMER2         6         edge     general-purpose timer
TIMER3         7         edge     general-purpose timer
SYSTIMER       8         edge     reserved for debugger
VIDEOIN        9         level    video in block
VIDEOOUT       10        level    video out block
AUDIOIN        11        level    audio in block
AUDIOOUT       12        level    audio out block
ICP            13        level    image coprocessor
VLD            14        level    VLD coprocessor
SSI            15        level    SSI interface
PCI            16        level    PCI BIU (DMA, etc.; see Table 11-14 for possible interrupt causes)
IIC            17        level    I2C interface
JTAG           18        level    JTAG interface
t.b.d.         19..24             reserved for future devices
SPDO           25        level    SPDO block
t.b.d.         26..27             reserved for future devices
HOSTCOM        28        edge     (software) host communication
APP            29        edge     (software) application
DEBUGGER       30        edge     (software) debugger
RTOS           31        edge     (software) RTOS

MMIO_BASE offset   Register (bits 31..0)   Fields
0x10 3038          INT_CTL (r/w)           IS[D:A]  IE[D:A]  INT[D:A]

Figure 3-9. Host interrupt control register

Loading a new modulus does not affect the contents of the value
register. If a store operation to either the modulus or value register
results in value and modulus being the same, no interrupt will be
generated. If the run bit is set, the next value will be modulus+1 or
modulus+2, and the counter will have to loop around before an
interrupt is generated.

A modulus value of zero causes a wrap-around as if the modulus value
were 2^32.

On RESET, the TCTL registers are cleared, and the values of the
TMODULUS and TVALUE registers are undefined.
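The counting and wrap-around behavior described in this section can be modeled as a single step function. The C sketch below reflects one reading of the text: a wrap (and interrupt) happens only when the value crosses the modulus from below, so a value stored equal to or above the modulus counts on until it overflows. The function name is hypothetical and the run bit is not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Advance a timer by `inc` events (0, 1, or 2) in one clock cycle.
 * Returns true when the value wraps and an interrupt is requested. */
static bool timer_step(uint32_t *value, uint32_t modulus, unsigned inc)
{
    assert(inc <= 2);
    /* A modulus of 0 behaves like 2^32. */
    uint64_t mod  = modulus ? modulus : (UINT64_C(1) << 32);
    uint64_t next = (uint64_t)*value + inc;

    if (*value < mod && next >= mod) {
        *value = (uint32_t)(next - mod);   /* wraps to 0, or 1 after a +2 step */
        return true;
    }
    *value = (uint32_t)next;               /* counts on, modulo 2^32 */
    return false;
}
```

With a modulus of 10, a value of 9 wraps to 0 on an increment of one and to 1 on an increment of two. A value stored as 10 moves on to 11 without an interrupt.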

3.9 DEBUG SUPPORT

This section describes the special debug support offered by the
DSPCPU. Instruction and data breakpoints can be defined through a set
of registers in the MMIO register space. When a breakpoint is matched,
an event is generated that can be used as a timer source (see Section
3.8, “Timers”). The timer TMODULUS has to be set to generate a DSPCPU
interrupt after the desired number of breakpoint matches.

3.9.1 Instruction Breakpoints

The instruction-breakpoint control register is shown in Figure 3-11.
On RESET, the BICTL register is cleared. (MMIO-register addresses
shown are offset with respect to MMIO_BASE.)

The instruction-breakpoint address-range registers are shown in
Figure 3-12. After RESET, the value of these registers is undefined.
(MMIO-register addresses shown are offset with respect to MMIO_BASE.)

When the IC bit in the breakpoint control register is set to ‘1’,
instruction breakpoints are activated. Any instruction address issued
by the PNX1300 chip is compared against the low and high address-range
values. The IAC bit in the breakpoint control register determines
whether the instruction address needs to be inside or outside of the
range defined by the low and high address-range registers. A
successful comparison takes place when either:

• IAC = ‘0’ and low ≤ iaddr ≤ high, or

• IAC = ‘1’ and iaddr < low or iaddr > high.

On a successful comparison, an instruction breakpoint event is
generated, which can be used as a clock input to a timer. After
counting the programmed number of instruction breakpoint events, the
timer will generate an interrupt request.
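The comparison rule above can be sketched as a small software model. This is only an illustration of the stated conditions, not driver code: the function name and its parameters are hypothetical stand-ins for the IC and IAC bits of BICTL and the contents of BINSTLOW and BINSTHIGH, and no MMIO access is performed because the bit positions inside BICTL are not given here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Software model of the instruction-breakpoint comparison.
 * ic, iac   : hypothetical copies of the IC and IAC bits of BICTL
 * low, high : hypothetical copies of BINSTLOW and BINSTHIGH
 * iaddr     : the instruction address being checked
 * Returns true when a breakpoint event would be generated. */
bool inst_breakpoint_match(bool ic, bool iac,
                           uint32_t low, uint32_t high, uint32_t iaddr)
{
    if (!ic)
        return false;                 /* instruction breakpoints disabled */

    bool inside = (low <= iaddr) && (iaddr <= high);

    /* IAC = 0: match inside the range; IAC = 1: match outside it. */
    return iac ? !inside : inside;
}
```

Both range bounds are inclusive when IAC = ‘0’, so an address equal to low or high counts as inside the range.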

Table 3-11. Timer base MMIO address

TIMER1     MMIO_BASE+0x10,0C00
TIMER2     MMIO_BASE+0x10,0C20
TIMER3     MMIO_BASE+0x10,0C40
SYSTIMER   MMIO_BASE+0x10,0C60

Table 3-12. Timer source selections

Source Name     Source Bits Value   Source Description
CLOCK           0                   CPU clock
PRESCALE        1                   prescaled CPU clock
TRI_TIMER_CLK   2                   external clock pin
DATABREAK       3                   data breakpoints
INSTBREAK       4                   instruction breakpoints
CACHE1          5                   cache event 1
CACHE2          6                   cache event 2
VI_CLK          7                   video in clock pin
VO_CLK          8                   video out clock pin
AI_WS           9                   audio in word strobe pin
AO_WS           10                  audio out word strobe pin
SSI_RXFSX       11                  SSI receive frame sync pin
SSI_IO2         12                  SSI transmit frame sync pin
—               13-15               undefined

[Register diagram, bits 31..0, offsets relative to the timer base:
  TMODULUS (r/w), offset 0: MODULUS
  TVALUE (r/w), offset 4: VALUE
  TCTL (r/w), offset 8: PRESCALE, SOURCE, and RUN fields
“PRESCALE”: prescale value is 2^PRESCALE, i.e., in the range [1..32768].
“SOURCE” select: see Table 3-12.
“RUN” bit: 0 = timer stopped, 1 = timer running.]

Figure 3-10. Timer register definitions.
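The PRESCALE encoding in Figure 3-10 and the zero-modulus rule stated for the timers can be expressed as two small helper functions. This is a sketch under stated assumptions: the function names are hypothetical, and PRESCALE is assumed to be a 4-bit field (0..15), which is inferred from the stated range [1..32768] rather than from a documented field width.

```c
#include <stdint.h>

/* Clock divider selected by the PRESCALE field: 2^PRESCALE.
 * Assumption: PRESCALE is a 4-bit field, so only the low four bits
 * of the argument are used (dividers 1 through 32768). */
uint32_t timer_prescale_divider(unsigned prescale)
{
    return (uint32_t)1u << (prescale & 0xFu);
}

/* Effective counting period for a given TMODULUS value.
 * A modulus of zero wraps around as if the modulus were 2^32,
 * which does not fit in 32 bits, hence the 64-bit result. */
uint64_t timer_effective_modulus(uint32_t tmodulus)
{
    return tmodulus != 0u ? (uint64_t)tmodulus : ((uint64_t)1 << 32);
}
```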


3.9.2 Data Breakpoints

The data-breakpoint address-range and compare-value registers are
shown in Figure 3-13. After RESET, the value of the data breakpoint
registers is undefined. (MMIO-register addresses shown are offset with
respect to MMIO_BASE.)

The data-breakpoint control register is shown in Figure 3-14. On
RESET, the BDCTL register is cleared. (The register address shown is
offset with respect to MMIO_BASE.)

When the DC bits in the data breakpoint control register are not set
to ‘0’, data breakpoints are activated. When the value of the DC bits
is ‘1’ or ‘3’, any data address from load operations (if the BL bit is
set) and/or store operations (if the BS bit is set) issued by the
DSPCPU is compared against the low and high address-range values. The
DAC bit in the breakpoint control register determines whether data
addresses need to be inside or outside of the range defined by the low
and high address-range registers. A successful comparison occurs when
either:

• DAC = ‘0’ and low ≤ daddr ≤ high, or

• DAC = ‘1’ and daddr < low or daddr > high.

[Register diagram: BICTL (r/w), MMIO_BASE offset 0x10 1000, bits 31..0.
‘IAC’ Instruction address control:
  0 Breakpoint if address inside range
  1 Breakpoint if address outside range
‘IC’ Instruction control bit:
  0 Disable instruction breakpoints
  1 Enable instruction breakpoints]

Figure 3-11. Instruction-breakpoint control register.

[Register diagram, bits 31..0, offsets relative to MMIO_BASE:
  BINSTLOW (r/w), offset 0x10 1004: Address Range Start
  BINSTHIGH (r/w), offset 0x10 1008: Address Range End]

Figure 3-12. Instruction-breakpoint address-range registers.

[Register diagram, bits 31..0, offsets relative to MMIO_BASE:
  BDATAALOW (r/w), offset 0x10 1030: Address Range Start
  BDATAAHIGH (r/w), offset 0x10 1034: Address Range End
  BDATAVAL (r/w), offset 0x10 1038: Data Breakpoint Value
  BDATAMASK (r/w), offset 0x10 103C: Data Breakpoint Value Mask]

Figure 3-13. Data-breakpoint address-range and value-compare registers.

[Register diagram: BDCTL (r/w), MMIO_BASE offset 0x10 1020, bits 31..0;
fields DVC, DAC, BS, BL, DC.
‘DVC’ Data Value Control:
  0 Breakpoint if data equal
  1 Breakpoint if data not equal
‘DAC’ Data Address Control:
  0 Breakpoint if address inside range
  1 Breakpoint if address outside range
‘BS’ Break on Store:
  0 Don’t check data stores
  1 Do check data stores
‘BL’ Break on Load:
  0 Don’t check data loads
  1 Do check data loads
‘DC’ Data Control:
  0 No checking
  1 Check data addresses
  2 Check data values
  3 Check data value and addresses]

Figure 3-14. Data-breakpoint control register.


Note that this comparison works for all addresses regardless of the
aperture to which they belong. When the value of the DC bits is ‘2’ or
‘3’, any data value from load operations (if the BL bit is set) and/or
store operations (if the BS bit is set) issued by the PNX1300 CPU is
compared against the value in the BDATAVAL register. Only the bits for
which the corresponding BDATAMASK register bits are set to ‘1’ will be
used in the comparison. The DVC bit in the breakpoint control register
determines whether the data value needs to be equal or not equal to
the comparison value. A successful comparison occurs when either of
the following is true:

• DVC = ‘0’ and (data & BDATAMASK) = (BDATAVAL & BDATAMASK).

• DVC = ‘1’ and (data & BDATAMASK) != (BDATAVAL & BDATAMASK).

Note: use a nonzero datamask or the result is undefined.

When a successful comparison has taken place, a data breakpoint event
is generated, which can be used as a clock input to a timer. After
counting the set number of data breakpoint events, the timer will
generate an interrupt request.

When the value of the DC bits is ‘3’, a data breakpoint event is
generated if and only if a successful comparison occurs on both
address and data simultaneously.

Note that up to two data breakpoint events can occur per clock cycle,
due to the dual load/store capability of the CPU and data cache.
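The address and value rules for data breakpoints combine as sketched below. This is a software model of the conditions described in this section, not hardware access code: the struct and function names are hypothetical, the fields simply mirror the BDCTL bits and the BDATAALOW, BDATAAHIGH, BDATAVAL, and BDATAMASK registers, and the case of a zero mask (stated to be undefined) is not given any special treatment.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software copy of the data-breakpoint configuration. */
struct data_bp {
    unsigned dc;             /* DC field: 0 none, 1 address, 2 value, 3 both */
    bool     dac, dvc;       /* DAC and DVC bits                            */
    bool     bl, bs;         /* break on load / break on store              */
    uint32_t alow, ahigh;    /* BDATAALOW, BDATAAHIGH                       */
    uint32_t val, mask;      /* BDATAVAL, BDATAMASK (mask must be nonzero)  */
};

/* Returns true when one load or store would generate a breakpoint event. */
bool data_breakpoint_match(const struct data_bp *bp, bool is_store,
                           uint32_t daddr, uint32_t data)
{
    if (bp->dc == 0)
        return false;                          /* no checking */
    if (is_store ? !bp->bs : !bp->bl)
        return false;                          /* access type not checked */

    bool addr_ok = true;
    bool val_ok = true;

    if (bp->dc & 1u) {                         /* DC = 1 or 3: check address */
        bool inside = (bp->alow <= daddr) && (daddr <= bp->ahigh);
        addr_ok = bp->dac ? !inside : inside;
    }
    if (bp->dc & 2u) {                         /* DC = 2 or 3: check value */
        bool equal = (data & bp->mask) == (bp->val & bp->mask);
        val_ok = bp->dvc ? !equal : equal;
    }
    return addr_ok && val_ok;                  /* DC = 3 requires both */
}
```

Because up to two loads or stores can issue per cycle, a model of the event counter would call this predicate once per memory operation rather than once per cycle.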


Custom Operations for Multimedia Chapter 4

by Gert Slavenburg, Pieter v.d. Meulen, Yong Cho, Sang-Ju Park

4.1 CUSTOM OPERATIONS OVERVIEW

In this document, the generic PNX1300 name refers

to the PNX1300 Series, or the PNX1300/01/02/11

products.

Custom operations in the PNX1300 DSPCPU architec-

ture are specialized, high-function operations designed

to dramatically improve performance in important multi-

media applications. When properly incorporated into ap-

plication source code, custom operations enable an ap-

plication to take advantage of the highly parallel

PNX1300 microprocessor implementation. Achieving a similar
performance increase through other means (e.g., executing a higher
number of traditional microprocessor instructions per cycle) would be
prohibitively expensive for PNX1300’s low-cost target applications.

Custom operations are simple to understand and consis-

tent in their definition, but their unusual functions make it

difficult for automatic code generation algorithms to use

them effectively. Consequently, custom operations are

inserted into source code by the programmer. To make

this process as painless as possible, custom operation

syntax is consistent with the C programming language,

and, just as with all other operations generated by the

compiler, the scheduler takes care of register allocation,

operation packing, and flow analysis.

4.1.1 Custom Operation Motivation

For both general-purpose and embedded microprocessor-based
applications, programming in a high-level language is desirable.
To effectively support optimizing

compilers and a simple programming model, certain mi-

croprocessor architecture features are needed, such as

a large, linear address space, general-purpose registers,

and register-to-register operations that directly support

the manipulation of linear address pointers. A common

choice in microprocessor architectures is 32-bit linear

addresses, 32-bit registers, and 32-bit integer opera-

tions. PNX1300 is such a microprocessor architecture.

For the data manipulation in many algorithms, however,

32-bit data and operations are wasteful of expensive sil-

icon resources. Important multimedia applications, such

as the decompression of MPEG video streams, spend

significant amounts of execution time dealing with eight-

bit data items. Using 32-bit operations to manipulate

small data items makes inefficient use of 32-bit execution

hardware in the implementation. If these 32-bit resources

could be used instead to operate on four eight-bit data

items simultaneously, performance would be improved

by a significant factor with only a tiny increase in imple-

mentation cost.

Getting the highest execution rate from standard microprocessor
resources is one of the motivations behind custom operations in
PNX1300. A range of custom operations is provided, each of which
processes four 8-bit or two 16-bit data items simultaneously. There is
little cost difference between a standard 32-bit ALU and one that can
process either one pair of 32-bit operands or four pairs of eight-bit
operands, but there is a big performance difference for PNX1300’s
target applications.

PNX1300’s custom operations go beyond simply making the best use of
standard resources. Some custom operations combine several simple
operations. These combinations are tailored specifically to the needs
of important multimedia applications. Some high-function custom
operations eliminate conditional branches, which helps the scheduler
make effective use of all five operation slots in each PNX1300
instruction. Filling up all five slots is especially important in the
inner loops of computationally intensive multimedia applications.

In short, custom operations help PNX1300 reach its

goals of extremely high multimedia performance at the

lowest possible cost.

4.1.2 Introduction to Custom Operations

Table 4-1 and Table 4-2 contain two listings of the custom operations
available in the PNX1300 architecture. Table 4-1 groups the custom
operations by type of function while Table 4-2 lists the operations by
operand size. For more detailed information about the custom
operations, see Appendix A, “PNX1300/01/02/11 DSPCPU Operations.”

Some operations exist in several versions that differ in the treatment
of their operands and results, and the mnemonics for these versions
make it easy to select the appropriate operation. For example, the sum
of products operations all have “fir” in their mnemonics; the prefix
and suffix of the mnemonic express the treatment of the operands and
result. The ifir8ii operation treats both of its operands as signed
(the suffix “ii”) and produces a signed result (the prefix “i”). The
ifir8iu operation treats its first operand as signed (the “i” of the
suffix “iu”), the second as unsigned (the “u” of the suffix), and
produces a signed result (the prefix “i”). The ume8ii operation
implements an eight-bit motion-estimation; it treats both operands as
signed but produces an unsigned result.

The operations beginning with “dsp” implement a clipping (sometimes
called saturating) function before storing the result(s) in the
destination register. Otherwise, their naming follows the rules given
above where appropriate. For example, the dspuquadaddui operation
implements four 8-bit additions; it treats the first operand of each
addition as unsigned, the second operand as signed, and produces an
unsigned result for each addition. Each result, which is computed with
no loss of precision, is clipped into the representable range of a
byte (0..255).
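The naming convention can be illustrated with portable reference models of three sum-of-products operations. These are sketches written from the one-line descriptions in Table 4-1, not the compiler's custom-operation intrinsics; the model_ names and the helper functions are hypothetical, and Appendix A remains the authoritative definition of each operation.

```c
#include <stdint.h>

/* Byte n of word x (n = 0 is the least-significant byte), unsigned. */
int ubyte(uint32_t x, int n)
{
    return (int)((x >> (8 * n)) & 0xFFu);
}

/* The same byte interpreted as a two's-complement signed value. */
int sbyte(uint32_t x, int n)
{
    int v = ubyte(x, n);
    return v >= 128 ? v - 256 : v;
}

/* ifir8ii: signed result, both operands treated as signed bytes. */
int32_t model_ifir8ii(uint32_t a, uint32_t b)
{
    int32_t sum = 0;
    for (int n = 0; n < 4; n++)
        sum += sbyte(a, n) * sbyte(b, n);
    return sum;
}

/* ifir8iu: signed result, first operand signed, second unsigned. */
int32_t model_ifir8iu(uint32_t a, uint32_t b)
{
    int32_t sum = 0;
    for (int n = 0; n < 4; n++)
        sum += sbyte(a, n) * ubyte(b, n);
    return sum;
}

/* ufir8uu: unsigned result, both operands treated as unsigned bytes. */
uint32_t model_ufir8uu(uint32_t a, uint32_t b)
{
    uint32_t sum = 0;
    for (int n = 0; n < 4; n++)
        sum += (uint32_t)(ubyte(a, n) * ubyte(b, n));
    return sum;
}
```

The same pair of input words gives different results under the three models whenever a byte has its top bit set, which is exactly the distinction the mnemonic prefix and suffix encode.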

Table 4-1. Key Multimedia Custom Operations Listed by Function Type

Function             Custom Op        Description
DSP absolute value   dspiabs          Clipped signed 32-bit absolute value
                     dspidualabs      Dual clipped absolute values of signed 16-bit halfwords
Shift                dualasr          dual-16 arithmetic shift right
Clip                 dualiclipi       dual-16 clip signed to signed
                     dualuclipi       dual-16 clip signed to unsigned
Min, max             quadumax         Unsigned bytewise quad max
                     quadumin         Unsigned bytewise quad min
DSP add              dspiadd          Clipped signed 32-bit add
                     dspuadd          Clipped unsigned 32-bit add
                     dspidualadd      Dual clipped add of signed 16-bit halfwords
                     dspuquadaddui    Quad clipped add of unsigned/signed bytes
DSP multiply         dspimul          Clipped signed 32-bit multiply
                     dspumul          Clipped unsigned 32-bit multiply
                     dspidualmul      Dual clipped multiply of signed 16-bit halfwords
DSP subtract         dspisub          Clipped signed 32-bit subtract
                     dspusub          Clipped unsigned 32-bit subtract
                     dspidualsub      Dual clipped subtract of signed 16-bit halfwords
Sum of products      ifir16           Signed sum of products of signed 16-bit halfwords
                     ifir8ii          Signed sum of products of signed bytes
                     ifir8iu          Signed sum of products of signed/unsigned bytes
                     ufir16           Unsigned sum of products of unsigned 16-bit halfwords
                     ufir8uu          Unsigned sum of products of unsigned bytes
Merge, pack          mergedual16lsb   Merge dual-16 least-significant bytes
                     mergelsb         Merge least-significant bytes
                     mergemsb         Merge most-significant bytes
                     pack16lsb        Pack least-significant 16-bit halfwords
                     pack16msb        Pack most-significant 16-bit halfwords
                     packbytes        Pack least-significant bytes
Byte averages        quadavg          Unsigned byte-wise quad average
Byte multiplies      quadumulmsb      Unsigned quad 8-bit multiply most significant
Motion estimation    ume8ii           Unsigned sum of absolute values of signed 8-bit differences
                     ume8uu           Unsigned sum of absolute values of unsigned 8-bit differences

Table 4-2. Key Multimedia Custom Operations Listed by Operand Size

Op. Size   Custom Op        Description
32-bit     dspiabs          Clipped signed 32-bit abs value
           dspiadd          Clipped signed 32-bit add
           dspuadd          Clipped unsigned 32-bit add
           dspimul          Clipped signed 32-bit multiply
           dspumul          Clipped unsigned 32-bit multiply
           dspisub          Clipped signed 32-bit subtract
           dspusub          Clipped unsigned 32-bit subtract
16-bit     mergedual16lsb   Merge dual-16 least-significant bytes
           dualasr          dual-16 arithmetic shift right
           dualiclipi       dual-16 clip signed to signed
           dualuclipi       dual-16 clip signed to unsigned
           dspidualmul      Dual clipped multiply of signed 16-bit halfwords
           dspidualabs      Dual clipped absolute values of signed 16-bit halfwords
           dspidualadd      Dual clipped add of signed 16-bit halfwords
           dspidualsub      Dual clipped subtract of signed 16-bit halfwords
           ifir16           Signed sum of products of signed 16-bit halfwords
           ufir16           Unsigned sum of products of unsigned 16-bit halfwords
           pack16lsb        Pack least-significant 16-bit halfwords
           pack16msb        Pack most-significant 16-bit halfwords


4.1.3 Example Uses of Custom Ops

The next three sections illustrate the advantages of using custom
operations. Also, the more complex examples illustrate how custom
operations can be integrated into application code by providing
listings of C-language program fragments. The examples progress in
complexity from simple to intricate; the most interesting examples are
taken from actual multimedia codes, such as MPEG decompression.

4.2 EXAMPLE 1: BYTE-MATRIX TRANSPOSITION

The goal of this example is to provide a simple, introductory
illustration of how custom operations can significantly increase
processing speed in small kernels of applications. As in most uses of
custom operations, the power of custom operations in this case comes
from their ability to operate on multiple data items in parallel.

Imagine that our task is to transpose a packed, 4-by-4

matrix of bytes in memory; the matrix might, for example,

contain 8-bit pixel values. Figure 4-1 illustrates both the

organization of the matrix in memory and the task to be

performed in standard mathematical notation.

Performing this operation with traditional microprocessor instructions
is straightforward but time consuming. One way to perform the
manipulation is to perform 12 load-byte instructions (since only 12 of
the 16 bytes need to be repositioned) and 12 store-byte instructions
that place the bytes back in memory in their new positions. Another
way would be to perform four load-word instructions, reposition the
bytes in registers, and then perform four store-word instructions.
Unfortunately, repositioning the bytes in registers would require a
large number of instructions to properly shift and mask the bytes.
Performing the 24 loads and stores makes implicit use of the shifting
and masking hardware in the load/store units and thus yields a shorter
instruction sequence.

The problem with performing 24 loads and stores is that loads and
stores are inherently slow operations because they must access at
least the cache and possibly slower layers in the memory hierarchy.
Further, performing byte loads and stores when 32-bit word-wide
accesses run just as fast wastes the power of the cache/memory
interface. We would prefer a fast algorithm that takes full advantage
of cache/memory bandwidth while not requiring an inordinate number of
byte-manipulation instructions.

PNX1300 has instructions that merge and pack bytes and 16-bit
halfwords directly and in parallel. Four of these instructions can be
applied in this case to speed up the manipulation of bytes that are
packed into words. Figure 4-2 shows the application of these
instructions to the byte-matrix transposition problem, and the left
side of Figure 4-3 shows a list of the operations needed to implement
the matrix transpose. When assembled into actual PNX1300 instructions,
these custom operations would be packed as tightly as dependencies
allow, up to five operations per instruction.

Note that a programmer would not need to program at

this level (PNX1300 assembler). The matrix transpose

would be expressed just as efficiently in C-language

source code, as shown on the right side of Figure 4-3.

The low-level code is shown here for illustration purpos-

es only.

The first sequence of four load-word operations in Figure 4-3 brings
the packed words of the input matrix into registers R10, R11, R12, and
R13. The next sequence of four merge operations produces intermediate
results into registers R14, R15, R16, and R17. The next sequence of
four pack operations could then replace the original operands or place
the transposed matrix in separate registers if the original matrix
operands were needed for further computations (the PNX1300 optimizing
C compiler performs this analysis automatically). In this example, the
transposed matrix is placed in registers R18, R19, R20, and R21. The
final four store-word operations put the transposed matrix back into
memory.

Table 4-2. Key Multimedia Custom Operations Listed by Operand Size (continued)

Op. Size   Custom Op        Description
8-bit      quadumax         Unsigned bytewise quad max
           quadumin         Unsigned bytewise quad min
           dspuquadaddui    Quad clipped add of unsigned/signed bytes
           ifir8ii          Signed sum of products of signed bytes
           ifir8iu          Signed sum of products of signed/unsigned bytes
           ufir8uu          Unsigned sum of products of unsigned bytes
           mergelsb         Merge least-significant bytes
           mergemsb         Merge most-significant bytes
           packbytes        Pack least-significant bytes
           quadavg          Unsigned byte-wise quad average
           quadumulmsb      Unsigned quad 8-bit multiply most significant
           ume8ii           Unsigned sum of absolute values of signed 8-bit differences
           ume8uu           Unsigned sum of absolute values of unsigned 8-bit differences

Memory      Row Major      Column Major
Location    (bits 31..0)   (bits 31..0)
n+0:        a b c d        a e i m
n+4:        e f g h        b f j n
n+8:        i j k l        c g k o
n+12:       m n o p        d h l p

Figure 4-1. Byte-matrix transposition. Top shows byte matrices packed
into memory words; bottom shows mathematical matrix representation.

Thus, using the PNX1300 custom operations, the byte-matrix
transposition requires four load-word operations and four store-word
operations (the minimum possible) and eight register-to-register
data-manipulation operations. The result is 16 operations, or
byte-matrix transposition at the rate of one operation per byte.
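The merge-and-pack sequence can be checked on any host with portable reference models of the four operations. This is a sketch only: the model_ functions and transpose4x4_words are hypothetical names, and the byte selections are inferred from the intermediate values shown in Figure 4-2 (with byte a in bits 31..24, as in Figure 4-1), not taken from the operation definitions in Appendix A.

```c
#include <stdint.h>

/* [x3 x2 x1 x0], [y3 y2 y1 y0] -> [x3 y3 x2 y2] (upper two bytes) */
uint32_t model_mergemsb(uint32_t x, uint32_t y)
{
    return (x & 0xFF000000u) | ((y & 0xFF000000u) >> 8) |
           ((x & 0x00FF0000u) >> 8) | ((y & 0x00FF0000u) >> 16);
}

/* [x3 x2 x1 x0], [y3 y2 y1 y0] -> [x1 y1 x0 y0] (lower two bytes) */
uint32_t model_mergelsb(uint32_t x, uint32_t y)
{
    return ((x & 0x0000FF00u) << 16) | ((y & 0x0000FF00u) << 8) |
           ((x & 0x000000FFu) << 8) | (y & 0x000000FFu);
}

/* Upper halfword of x, then upper halfword of y. */
uint32_t model_pack16msb(uint32_t x, uint32_t y)
{
    return (x & 0xFFFF0000u) | (y >> 16);
}

/* Lower halfword of x, then lower halfword of y. */
uint32_t model_pack16lsb(uint32_t x, uint32_t y)
{
    return (x << 16) | (y & 0x0000FFFFu);
}

/* Transpose a 4-by-4 byte matrix held as four words, one row per word. */
void transpose4x4_words(uint32_t m[4])
{
    uint32_t t0 = model_mergemsb(m[0], m[1]);   /* a e b f */
    uint32_t t1 = model_mergemsb(m[2], m[3]);   /* i m j n */
    uint32_t t2 = model_mergelsb(m[0], m[1]);   /* c g d h */
    uint32_t t3 = model_mergelsb(m[2], m[3]);   /* k o l p */

    m[0] = model_pack16msb(t0, t1);             /* a e i m */
    m[1] = model_pack16lsb(t0, t1);             /* b f j n */
    m[2] = model_pack16msb(t2, t3);             /* c g k o */
    m[3] = model_pack16lsb(t2, t3);             /* d h l p */
}
```

The model works on word values, so it is independent of host byte order; when a byte array is reinterpreted through a word pointer, as in the C fragment of Figure 4-3, which byte lands in bits 31..24 depends on the endianness in use.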

While the advantage of the custom-operation-based algorithm over the
brute-force code that uses 24 load- and store-byte instructions seems
to be only eight operations (a 33% reduction), the advantage is
actually much greater. First, using custom operations, the number of
memory references is reduced from 24 to eight (a factor of three).
Since memory references are slower than register-to-register
operations (such as the custom operations in this example), the
reduction in memory references is significant.

Further, the ability of the PNX1300 VLIW compilation system to exploit
the performance potential of the PNX1300 microprocessor hardware is
enhanced by the custom-operation-based code. This is because it is
easier for the compilation system to produce an optimal schedule
(arrangement) of the code when the number of memory references is in
balance with the number of register-to-register operations. The
PNX1300 CPU (like all high-performance microprocessors) has a limit on
the number of memory references that can be processed in a single
cycle (two is the current limit). A long sequence of code that
contains only memory references can result in empty operation slots in
the long PNX1300 instructions. Empty operation slots waste the
performance potential of the PNX1300 hardware.

As this example has shown, careful use of custom operations can not
only reduce the absolute number of operations needed to perform a
computation but also help the compilation system produce code that
fully exploits the performance potential of the PNX1300 CPU.

4.3 EXAMPLE 2: MPEG IMAGE RECONSTRUCTION

The complete MPEG video decoding algorithm is composed of many
different phases, each with computationally intensive kernels. One
important kernel deals with reconstructing a single image frame given
that the forward- and backward-predicted frames and the inverse
discrete cosine transform (IDCT) results have already been computed.
This kernel provides an excellent opportunity to illustrate the power
of PNX1300’s specialized custom operators.

In the code fragments that follow, the backward-predicted block is
assumed to have been computed into an array back[], the
forward-predicted block is assumed to have been computed into
forward[], and the IDCT results are assumed to have been computed into
idct[].

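Row Major    After merge           After pack (Column Major)
a b c d      mergemsb: a e b f     pack16msb: a e i m
e f g h      mergemsb: i m j n     pack16lsb: b f j n
i j k l      mergelsb: c g d h     pack16msb: c g k o
m n o p      mergelsb: k o l p     pack16lsb: d h l p

Figure 4-2. Application of merge and pack instructions to the byte-matrix transposition of Figure 4-1.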

(left) PNX1300 operations:

    ld32d(0)  r100 -> r10
    ld32d(4)  r100 -> r11
    ld32d(8)  r100 -> r12
    ld32d(12) r100 -> r13

    mergemsb r10 r11 -> r14
    mergemsb r12 r13 -> r15
    mergelsb r10 r11 -> r16
    mergelsb r12 r13 -> r17

    pack16msb r14 r15 -> r18
    pack16lsb r14 r15 -> r19
    pack16msb r16 r17 -> r20
    pack16lsb r16 r17 -> r21

    st32d(0)  r101 r18
    st32d(4)  r101 r19
    st32d(8)  r101 r20
    st32d(12) r101 r21

(right) Equivalent C-language fragment:

    char matrix[4][4];
    int *m = (int *) matrix;          /* access the rows as 32-bit words */
    int temp0, temp1, temp2, temp3;   /* intermediate merge results */

    temp0 = MERGEMSB(m[0], m[1]);
    temp1 = MERGEMSB(m[2], m[3]);
    temp2 = MERGELSB(m[0], m[1]);
    temp3 = MERGELSB(m[2], m[3]);

    m[0] = PACK16MSB(temp0, temp1);
    m[1] = PACK16LSB(temp0, temp1);
    m[2] = PACK16MSB(temp2, temp3);
    m[3] = PACK16LSB(temp2, temp3);

Figure 4-3. On the left is a complete list of operations to perform the byte-matrix transposition of Figure 4-1
and Figure 4-2. On the right is an equivalent C-language fragment.


A straightforward coding of the reconstruction algorithm might look as
shown in Figure 4-4. This implementation shares many of the
undesirable properties of the first example of byte-matrix
transposition. The code accesses memory a byte at a time instead of a
word at a time, which wastes 75% of the available bandwidth. Also, in
light of the many quad-byte-parallel operations introduced in Section
4.1.2, “Introduction to Custom Operations,” it seems inefficient to
spend three separate additions and one shift to process a single
eight-bit pixel. Perhaps even more unfortunate for a VLIW processor
like PNX1300 is the branch-intensive code that performs the saturation
testing; eliminating these branches could reap a significant
performance gain.

Since MPEG decoding is the kind of task for which PNX1300 was created,
there are two custom operations (quadavg and dspuquadaddui) that
exactly fit this important MPEG kernel (and other kernels). These
custom operations process four pairs of 8-bit pixel values in
parallel. In addition, dspuquadaddui performs saturation tests in
hardware, which eliminates any need to execute explicit tests and
branches.

For readers familiar with the details of MPEG algorithms, the use of
eight-bit IDCT values later in this example may be confusing. The
standard MPEG implementation calls for nine-bit IDCT values, but
extensive analysis has shown that values outside the range [–128..127]
occur so rarely that they can be considered unimportant. Pursuant to
this observation, the IDCT values are clipped into the eight-bit range
[–128..127] with saturating arithmetic before the frame reconstruction
code runs. The assumption that this saturation occurs permits some of
PNX1300’s custom operations to have clean, simple definitions.

The first step in seeing how custom operations can be of value in this
case is to unroll the loop by a factor of four. The unrolled code is
shown in Figure 4-5. This creates code that is parallel with respect
to the four pixel computations. As is easily seen in the code, the
four groups of computations (one group per pixel) do not depend on
each other.

After some experience is gained with custom operations, it is not
necessary to unroll loops to discover situations where custom
operations are useful. Often, a good programmer with knowledge of the
function of the custom operations can see by simple inspection
opportunities to exploit custom operations.

To understand how quadavg and dspuquadaddui can be used in this code,
we examine the function of these custom operations.

The quadavg custom operation performs pixel averaging on four pairs of
pixels in parallel. Formally, the operation of quadavg is as follows:

    quadavg rsrc1 rsrc2 -> rdest

takes arguments in registers rsrc1 and rsrc2, and it computes a result
into register rdest. rsrc1 = [abcd], rsrc2 = [wxyz], and rdest =
[pqrs] where a, b, c, d, w, x, y, z, p, q, r, and s are all unsigned
eight-bit values. Then, quadavg computes the output vector [pqrs] as
follows:

    p = (a + w + 1) >> 1
    q = (b + x + 1) >> 1
    r = (c + y + 1) >> 1
    s = (d + z + 1) >> 1
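The four averaging formulas can be written as a portable reference model, which is handy for checking results on a host machine. The name model_quadavg is hypothetical; the arithmetic is taken directly from the definition above, applied to each of the four byte lanes of the two source words.

```c
#include <stdint.h>

/* Reference model of quadavg: rounded average of four unsigned byte pairs.
 * Each lane computes (a + w + 1) >> 1; the largest possible lane result is
 * (255 + 255 + 1) >> 1 = 255, so no lane can overflow into its neighbor. */
uint32_t model_quadavg(uint32_t rsrc1, uint32_t rsrc2)
{
    uint32_t rdest = 0;

    for (int shift = 0; shift < 32; shift += 8) {
        uint32_t a = (rsrc1 >> shift) & 0xFFu;
        uint32_t w = (rsrc2 >> shift) & 0xFFu;
        rdest |= ((a + w + 1u) >> 1) << shift;
    }
    return rdest;
}
```

The added 1 makes the average round up when the sum is odd, matching the (back[i] + forward[i] + 1) >> 1 term of the reconstruction loop.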

The pixel averaging in Figure 4-5 is evident in the first statement of
each of the four groups of statements. The rest of the code (adding
the idct[i] value and performing the saturation test) can be performed
by the dspuquadaddui operation. Formally, its function is as follows:

    dspuquadaddui rsrc1 rsrc2 -> rdest

takes arguments in registers rsrc1 and rsrc2, and it computes a result
into register rdest. rsrc1 = [efgh], rsrc2 = [stuv], and rdest =
[ijkl] where e, f, g, h, i, j, k, and l are unsigned 8-bit values; s,
t, u, and v are signed 8-bit values. Then, dspuquadaddui computes the
output vector [ijkl] as follows:

    i = uclipi(e + s, 255)
    j = uclipi(f + t, 255)
    k = uclipi(g + u, 255)
    l = uclipi(h + v, 255)

    void reconstruct (unsigned char *back,
                      unsigned char *forward,
                      char *idct,               /* signed 8-bit IDCT values */
                      unsigned char *destination)
    {
        int i, temp;

        for (i = 0; i < 64; i += 1)
        {
            /* average the two predictions, then add the IDCT correction */
            temp = ((back[i] + forward[i] + 1) >> 1) + idct[i];

            /* saturate the result into the byte range 0..255 */
            if (temp > 255)
                temp = 255;
            else if (temp < 0)
                temp = 0;

            destination[i] = temp;
        }
    }

Figure 4-4. Straightforward code for MPEG frame reconstruction.

The uclipi operation is defined in this case as it is for the separate
PNX1300 operation of the same name described in Appendix A,
“PNX1300/01/02/11 DSPCPU Operations.” Its definition is as follows:

    int uclipi (int m, int n)
    {
        if (m < 0) return 0;
        else if (m > n) return n;
        else return m;
    }
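Combining the dspuquadaddui formulas with the uclipi definition gives a portable reference model for host-side checking. The model_ names are hypothetical; the lane arithmetic follows the definitions above, with each byte of rsrc1 taken as unsigned and each byte of rsrc2 taken as a two's-complement signed value.

```c
#include <stdint.h>

/* Clip m into the range 0..n, as in the uclipi definition. */
int model_uclipi(int m, int n)
{
    if (m < 0) return 0;
    else if (m > n) return n;
    else return m;
}

/* Reference model of dspuquadaddui: four clipped byte additions,
 * unsigned byte from rsrc1 plus signed byte from rsrc2. */
uint32_t model_dspuquadaddui(uint32_t rsrc1, uint32_t rsrc2)
{
    uint32_t rdest = 0;

    for (int shift = 0; shift < 32; shift += 8) {
        int e = (int)((rsrc1 >> shift) & 0xFFu);   /* unsigned lane */
        int s = (int)((rsrc2 >> shift) & 0xFFu);
        if (s >= 128)
            s -= 256;                              /* signed lane */
        rdest |= (uint32_t)model_uclipi(e + s, 255) << shift;
    }
    return rdest;
}
```

For example, with rsrc1 = 0xF0108000 and rsrc2 = 0x20F07FFF, the lane sums are 272, 0, 255, and -1, which clip to 0xFF, 0x00, 0xFF, and 0x00.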

To make it easier to see how these operations can subsume all the code
in Figure 4-5, Figure 4-6 shows the same code rearranged to group the
related functions. Now it should be clear that the quadavg operation
can replace the first four lines of the loop assuming that we can get
the individual 8-bit elements of the back[] and forward[] arrays
positioned correctly into the bytes of a 32-bit word. That, of course,
is easy: simply align the byte arrays on word boundaries and access
them with word (integer) pointers.

Similarly, it should now be clear that the dspuquadaddui

operation can replace the remaining code (except, of

course, for storing the result into the destination[] array)

assuming, as above, that the 8-bit elements are aligned

and packed into 32-bit words.

Figure 4-7 shows the new code. The arrays are now accessed in 32-bit
(int-sized) chunks, the loop iteration control has been modified to
reflect the ‘four-at-a-time’ operations, and the quadavg and
dspuquadaddui operations have replaced the bulk of the loop code.
Finally, Figure 4-8 shows a more compact expression of the loop code,
eliminating the temporary variable. Note that the PNX1300 C compiler
performs this optimization by itself.

Again, note that the code in Figure 4-7 and Figure 4-8

assumes that the character arrays are 32-bit word

aligned and padded if necessary to fill an integral number

of 32-bit words.

The original code required three additions, one shift, two tests,
three loads, and one store per pixel, i.e., ten operations per pixel
or 40 operations for four pixels. The new code using custom operations
requires only two custom operations, three loads, and one store for
four pixels (six operations), which is more than a factor of six
improvement. The actual performance improvement can be even greater
depending on how well the compiler is able to deal with the branches
in the original version of the code, which depends in part on the
surrounding code. Reducing the number of branches almost always
improves the chances of realizing maximum performance on the PNX1300
CPU.

The code in Figure 4-8 illustrates several aspects of using custom
operations in C-language source code. First, the custom operations
require no special declarations or syntax; they appear to be simple
function calls. Second, there is no need to explicitly specify
register assignments for sources, destinations, and intermediate
results; the compiler and scheduler assign registers for custom
operations just as they would for built-in language operations such as
integer addition. Third, the scheduler packs custom operations into
PNX1300 VLIW instructions as effectively as it packs operations
generated by the compiler for native language constructs.

Thus, although the burden of making effective use of

custom operations falls on the programmer, that burden

consists only of discovering the opportunities for exploit-

ing the operations and then coding them using standard

C-language notation. The compiler and scheduler take

care of the rest.
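When checking a conversion such as the one from Figure 4-5 to Figure 4-8 on a host machine, portable stand-ins for the two custom operations can be useful. The following is a minimal sketch under stated assumptions, not the PNX1300 compiler's definition of the operations: the names quadavg_ref and dspuquadaddui_ref are hypothetical, and the per-byte behavior (rounded unsigned average; unsigned byte plus signed byte, clipped to 0..255) is inferred from the scalar code in the figures.

```c
#include <stdint.h>

/* Hypothetical host-side model of quadavg: for each of the four byte
 * lanes, the rounded average (a + b + 1) >> 1 of two unsigned bytes. */
uint32_t quadavg_ref(uint32_t a, uint32_t b)
{
    uint32_t result = 0;
    for (int lane = 0; lane < 4; lane++) {
        uint32_t x = (a >> (8 * lane)) & 0xFFu;
        uint32_t y = (b >> (8 * lane)) & 0xFFu;
        result |= ((x + y + 1u) >> 1) << (8 * lane);
    }
    return result;
}

/* Hypothetical host-side model of dspuquadaddui: for each byte lane,
 * add an unsigned byte of u to a signed byte of s and clip the sum to
 * the range 0..255, as the if/else tests in Figure 4-5 do. */
uint32_t dspuquadaddui_ref(uint32_t u, uint32_t s)
{
    uint32_t result = 0;
    for (int lane = 0; lane < 4; lane++) {
        int x = (int)((u >> (8 * lane)) & 0xFFu);
        int y = (int)((s >> (8 * lane)) & 0xFFu);
        if (y > 127)
            y -= 256;               /* reinterpret the byte as signed */
        int sum = x + y;
        if (sum > 255)
            sum = 255;
        else if (sum < 0)
            sum = 0;
        result |= (uint32_t)sum << (8 * lane);
    }
    return result;
}
```

Because both models work lane by lane, the result does not depend on which byte of the word holds which pixel, so the same check applies in either endian mode.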

void reconstruct (unsigned char *back,
                  unsigned char *forward,
                  char *idct,
                  unsigned char *destination)
{
    int i, temp;

    for (i = 0; i < 64; i += 4)
    {
        temp = ((back[i+0] + forward[i+0] + 1) >> 1) + idct[i+0];
        if (temp > 255) temp = 255;
        else if (temp < 0) temp = 0;
        destination[i+0] = temp;

        temp = ((back[i+1] + forward[i+1] + 1) >> 1) + idct[i+1];
        if (temp > 255) temp = 255;
        else if (temp < 0) temp = 0;
        destination[i+1] = temp;

        temp = ((back[i+2] + forward[i+2] + 1) >> 1) + idct[i+2];
        if (temp > 255) temp = 255;
        else if (temp < 0) temp = 0;
        destination[i+2] = temp;

        temp = ((back[i+3] + forward[i+3] + 1) >> 1) + idct[i+3];
        if (temp > 255) temp = 255;
        else if (temp < 0) temp = 0;
        destination[i+3] = temp;
    }
}

Figure 4-5. MPEG frame reconstruction code with the loop unrolled by a factor of four; compare with Figure 4-4.


4.4 EXAMPLE 3: MOTION-ESTIMATION KERNEL

Another part of the MPEG coding algorithm is motion estimation. The purpose of motion estimation is to reduce the cost of storing a frame of video by expressing the contents of the frame in terms of adjacent frames. A given frame is reduced to small blocks, and a subsequent frame is represented by specifying how these small blocks change position and appearance; usually, storing the difference information is cheaper than storing a whole block. For example, in a video sequence where the camera pans across a static scene, some frames can be expressed simply as displaced versions of their predecessor frames. To create a subsequent frame, most blocks are simply displaced relative to the output screen.

The code in this example is for a match-cost calculation, a small kernel of the complete motion-estimation code. As with the previous example, this code provides an excellent example of how to transform source code to make the best use of PNX1300’s custom operations.

void reconstruct (unsigned char *back,
                  unsigned char *forward,
                  char *idct,
                  unsigned char *destination)
{
    int i, temp0, temp1, temp2, temp3;

    for (i = 0; i < 64; i += 4)
    {
        temp0 = ((back[i+0] + forward[i+0] + 1) >> 1);
        temp1 = ((back[i+1] + forward[i+1] + 1) >> 1);
        temp2 = ((back[i+2] + forward[i+2] + 1) >> 1);
        temp3 = ((back[i+3] + forward[i+3] + 1) >> 1);

        temp0 += idct[i+0];
        if (temp0 > 255) temp0 = 255;
        else if (temp0 < 0) temp0 = 0;

        temp1 += idct[i+1];
        if (temp1 > 255) temp1 = 255;
        else if (temp1 < 0) temp1 = 0;

        temp2 += idct[i+2];
        if (temp2 > 255) temp2 = 255;
        else if (temp2 < 0) temp2 = 0;

        temp3 += idct[i+3];
        if (temp3 > 255) temp3 = 255;
        else if (temp3 < 0) temp3 = 0;

        destination[i+0] = temp0;
        destination[i+1] = temp1;
        destination[i+2] = temp2;
        destination[i+3] = temp3;
    }
}

Figure 4-6. Re-grouped code of Figure 4-5.

void reconstruct (unsigned char *back,
                  unsigned char *forward,
                  char *idct,
                  unsigned char *destination)
{
    int i, temp;
    int *i_back = (int *) back;
    int *i_forward = (int *) forward;
    int *i_idct = (int *) idct;
    int *i_dest = (int *) destination;

    for (i = 0; i < 16; i += 1)
    {
        temp = QUADAVG(i_back[i], i_forward[i]);
        temp = DSPUQUADADDUI(temp, i_idct[i]);
        i_dest[i] = temp;
    }
}

Figure 4-7. Using the custom operation dspuquadaddui to speed up the loop of Figure 4-6.


Figure 4-9 shows the original source code for the match-

cost loop. Unlike the previous example, the code is not a

self-contained function. Somewhere early in the code,

the arrays A[][] and B[][] are declared; somewhere be-

tween those declarations and the loop of interest, the ar-

rays are filled with data.

4.4.1 A Simple Transformation

First, we will look at the simplest way to use a PNX1300

custom operation.

We start by noticing that the computation in the loop of Figure 4-9 involves the absolute value of the difference of two unsigned characters (bytes). By now, we are familiar with the fact that PNX1300 includes a number of operations that process all four bytes in a 32-bit word simultaneously. Since the match-cost calculation is fundamental to the MPEG algorithm, it is not surprising to find a custom operation, ume8uu, that implements this operation exactly.

To understand how ume8uu can be used in this case, we need to transform the code as in the previous example. Though the steps are presented here in detail, a programmer with even a little experience can often perform these transformations by visual inspection.

To use a custom operation that processes 4 pixel values simultaneously, we first need to create 4 parallel pixel computations. Figure 4-10 shows the loop of Figure 4-9 unrolled by a factor of 4. Unfortunately, the code in the unrolled loop is not parallel because each line depends on the one above it. Figure 4-11 shows a more parallel version of the code from Figure 4-10. By simply giving each computation its own cost variable and then summing the costs all at once, each cost computation is completely independent.

void reconstruct (unsigned char *back,
                  unsigned char *forward,
                  char *idct,
                  unsigned char *destination)
{
    int i;
    int *i_back = (int *) back;
    int *i_forward = (int *) forward;
    int *i_idct = (int *) idct;
    int *i_dest = (int *) destination;

    for (i = 0; i < 16; i += 1)
        i_dest[i] = DSPUQUADADDUI(QUADAVG(i_back[i], i_forward[i]), i_idct[i]);
}

Figure 4-8. Final version of the frame-reconstruction code.

unsigned char A[16][16];
unsigned char B[16][16];

for (row = 0; row < 16; row += 1)
{
    for (col = 0; col < 16; col += 1)
        cost += abs(A[row][col] - B[row][col]);
}

Figure 4-9. Match-cost loop for MPEG motion estimation.

unsigned char A[16][16];
unsigned char B[16][16];

for (row = 0; row < 16; row += 1)
{
    for (col = 0; col < 16; col += 4)
    {
        cost += abs(A[row][col+0] - B[row][col+0]);
        cost += abs(A[row][col+1] - B[row][col+1]);
        cost += abs(A[row][col+2] - B[row][col+2]);
        cost += abs(A[row][col+3] - B[row][col+3]);
    }
}

Figure 4-10. Unrolled, but not parallel, version of the loop from Figure 4-9.


Excluding the array accesses, the loop body in Figure 4-11 is now recognizable as the function performed by the ume8uu custom operation: the sum of 4 absolute values of 4 differences. To use the ume8uu operation, however, the code must access the arrays with 32-bit word pointers instead of with 8-bit byte pointers.

Figure 4-13 shows the loop recoded to access A[][] and

B[][] as one-dimensional instead of two-dimensional ar-

rays. We take advantage of our knowledge of C-lan-

guage array storage conventions to perform this code

transformation. Recoding to use one-dimensional arrays

prepares the code for transformation to 32-bit array ac-

cesses.

(From here on, until the final code is shown, the declarations of the A and B arrays will be omitted from the code fragments for the sake of brevity.)

Figure 4-14 shows the loop of Figure 4-13 recoded to use ume8uu. Once again taking advantage of our knowledge of the C-language array storage conventions, the one-dimensional byte array is now accessed as a one-dimensional 32-bit-word array. The declarations of the pointers IA and IB as pointers to integers are the key, but also notice that the multiplier in the expression for row offset has been scaled from 16 to 4 to account for the fact that there are 4 bytes in a 32-bit word.

Of course, since we are now using one-dimensional arrays to access the pixel data, it is natural to use a single for loop instead of two. Figure 4-12 shows this streamlined version of the code without the inner loop. Since C-language arrays are stored as a linear vector of values, we can simply increase the number of iterations of the outer loop from 16 to 64 to traverse the entire array.
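The equivalence between the byte-wise loop and the word-wise loop can be checked off-target with a small model. The sketch below is illustrative only: ume8uu_ref, match_cost_words, and match_cost_bytes are hypothetical names, and the behavior assumed for ume8uu (the sum of the absolute differences of four unsigned bytes) is taken from the description in this section rather than from the compiler's definition.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical host-side model of ume8uu: the sum of the absolute
 * differences of the four unsigned byte lanes of a and b. */
uint32_t ume8uu_ref(uint32_t a, uint32_t b)
{
    uint32_t sum = 0;
    for (int lane = 0; lane < 4; lane++) {
        int x = (int)((a >> (8 * lane)) & 0xFFu);
        int y = (int)((b >> (8 * lane)) & 0xFFu);
        sum += (uint32_t)(x > y ? x - y : y - x);
    }
    return sum;
}

/* Match cost of two 16x16 byte blocks (256 bytes each, stored row by
 * row), visited as 64 32-bit words. memcpy performs the word loads so
 * that this sketch does not depend on the alignment of a and b. */
uint32_t match_cost_words(const unsigned char *a, const unsigned char *b)
{
    uint32_t cost = 0;
    for (int i = 0; i < 64; i++) {
        uint32_t wa, wb;
        memcpy(&wa, a + 4 * i, sizeof wa);
        memcpy(&wb, b + 4 * i, sizeof wb);
        cost += ume8uu_ref(wa, wb);
    }
    return cost;
}

/* Byte-at-a-time reference in the style of Figure 4-9. */
uint32_t match_cost_bytes(const unsigned char *a, const unsigned char *b)
{
    uint32_t cost = 0;
    for (int i = 0; i < 256; i++)
        cost += (uint32_t)(a[i] > b[i] ? a[i] - b[i] : b[i] - a[i]);
    return cost;
}
```

A caller with arrays declared as in Figure 4-9 would pass &A[0][0] and &B[0][0]; both functions should return the same cost for any input.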

The recoding and use of the ume8uu operation has resulted in a substantial improvement in the performance of the match-cost loop. In the original version, the code executed 1280 operations (including loads, adds, subtracts, and absolute values); in the restructured version, there are only 256 operations: 128 loads, 64 ume8uu operations, and 64 additions. This is a factor of five reduction in the number of operations executed. Also, the overhead of the inner loop has been eliminated, further increasing the performance advantage.

unsigned char A[16][16];
unsigned char B[16][16];

for (row = 0; row < 16; row += 1)
{
    for (col = 0; col < 16; col += 4)
    {
        cost0 = abs(A[row][col+0] - B[row][col+0]);
        cost1 = abs(A[row][col+1] - B[row][col+1]);
        cost2 = abs(A[row][col+2] - B[row][col+2]);
        cost3 = abs(A[row][col+3] - B[row][col+3]);
        cost += cost0 + cost1 + cost2 + cost3;
    }
}

Figure 4-11. Parallel version of Figure 4-10.

unsigned int *IA = (unsigned int *) A;
unsigned int *IB = (unsigned int *) B;

for (i = 0; i < 64; i += 1)
    cost += UME8UU(IA[i], IB[i]);

Figure 4-12. The loop of Figure 4-14 with the inner loop eliminated.

unsigned char A[16][16];
unsigned char B[16][16];
unsigned char *CA = (unsigned char *) A;
unsigned char *CB = (unsigned char *) B;

for (row = 0; row < 16; row += 1)
{
    int rowoffset = row * 16;

    for (col = 0; col < 16; col += 4)
    {
        cost0 = abs(CA[rowoffset + col+0] - CB[rowoffset + col+0]);
        cost1 = abs(CA[rowoffset + col+1] - CB[rowoffset + col+1]);
        cost2 = abs(CA[rowoffset + col+2] - CB[rowoffset + col+2]);
        cost3 = abs(CA[rowoffset + col+3] - CB[rowoffset + col+3]);
        cost += cost0 + cost1 + cost2 + cost3;
    }
}

Figure 4-13. The loop of Figure 4-11 recoded with one-dimensional array accesses.

4.4.2 More Unrolling

The code transformations of the previous section

achieved impressive performance improvements, but

given the VLIW nature of the PNX1300 CPU, more can

be done to exploit PNX1300’s parallelism.

The code in Figure 4-12 has a loop containing only 4 operations (excluding loop overhead). Since PNX1300’s branches have a 3-instruction delay and each instruction can contain up to 5 operations, a fully utilized minimum-sized loop can contain 16 operations (20 minus loop overhead).

The PNX1300 compilation system performs a wide variety of powerful code transformation and scheduling optimizations to ensure that the VLIW capabilities of the CPU are exploited. It is still wise, however, to make program parallelism explicit in source code when possible. Explicit parallelism can only help the compiler produce a fast running program.

To this end, we can unroll the loop of Figure 4-12 some

number of times to create explicit parallelism and help

the compiler create a fast running loop. In this case,

where the number of iterations is a power-of-two, it

makes sense to unroll by a factor that is a power-of-two

to create clean code.

Figure 4-15 shows the loop unrolled by a factor of eight.

The compiler can apply common sub-expression elimi-

nation and other optimizations to eliminate extraneous

operations in the array indexing, but, again, improve-

ments in the source code can only help the compiler pro-

duce the best possible code and fastest-running pro-

gram.

Figure 4-16 shows one way to modify the code for sim-

pler array indexing.

unsigned int *IA = (unsigned int *) A;
unsigned int *IB = (unsigned int *) B;

for (row = 0; row < 16; row += 1)
{
    int rowoffset = row * 4;

    for (col4 = 0; col4 < 4; col4 += 1)
        cost += UME8UU(IA[rowoffset + col4], IB[rowoffset + col4]);
}

Figure 4-14. The loop of Figure 4-13 recoded with 32-bit array accesses and the ume8uu custom operation.

unsigned int *IA = (unsigned int *) A;
unsigned int *IB = (unsigned int *) B;

for (i = 0; i < 64; i += 8)
{
    cost0 = UME8UU(IA[i+0], IB[i+0]);
    cost1 = UME8UU(IA[i+1], IB[i+1]);
    cost2 = UME8UU(IA[i+2], IB[i+2]);
    cost3 = UME8UU(IA[i+3], IB[i+3]);
    cost4 = UME8UU(IA[i+4], IB[i+4]);
    cost5 = UME8UU(IA[i+5], IB[i+5]);
    cost6 = UME8UU(IA[i+6], IB[i+6]);
    cost7 = UME8UU(IA[i+7], IB[i+7]);
    cost += cost0 + cost1 + cost2 +
            cost3 + cost4 + cost5 +
            cost6 + cost7;
}

Figure 4-15. Unrolled version of Figure 4-12. This code makes good use of PNX1300’s VLIW capabilities.

unsigned char A[16][16];
unsigned char B[16][16];
unsigned int *IA = (unsigned int *) A;
unsigned int *IB = (unsigned int *) B;

for (i = 0; i < 64; i += 8, IA += 8, IB += 8)
{
    cost0 = UME8UU(IA[0], IB[0]);
    cost1 = UME8UU(IA[1], IB[1]);
    cost2 = UME8UU(IA[2], IB[2]);
    cost3 = UME8UU(IA[3], IB[3]);
    cost4 = UME8UU(IA[4], IB[4]);
    cost5 = UME8UU(IA[5], IB[5]);
    cost6 = UME8UU(IA[6], IB[6]);
    cost7 = UME8UU(IA[7], IB[7]);
    cost += cost0 + cost1 + cost2 +
            cost3 + cost4 + cost5 +
            cost6 + cost7;
}

Figure 4-16. Code from Figure 4-15 with simplified array index calculations.


Cache Architecture Chapter 5

by Eino Jacobs

5.1 MEMORY SYSTEM OVERVIEW

In this document, the generic PNX1300 name refers

to the PNX1300 Series, or the PNX1300/01/02/11

products.

The high-performance video and audio throughput of PNX1300 is implemented by its DSPCPU and autonomous I/O and co-processing units, but the foundation of this processing is the PNX1300 memory hierarchy. To get the full potential of the chip’s processing units, the memory hierarchy must read and write data (and DSPCPU instructions) fast enough to keep the units busy.

To meet the requirements of its target applications,

PNX1300’s memory hierarchy must satisfy the conflict-

ing goals of low cost, simple system design (e.g., low

parts count), and high performance. Since multimedia

video streams can require relatively large temporary

storage, a significant amount of external DRAM is re-

quired. Minimizing the cost of bulk memory is important.

PNX1300’s memory system achieves a good compromise between cost and performance by coupling substantial on-chip caches with a glueless interface to synchronous DRAM (SDRAM). SDRAM provides higher bandwidth than standard DRAM for only a small cost premium. A block diagram of the memory system is shown in Figure 5-1. SDRAM permits PNX1300 to use a narrower and simpler interface than would be required to achieve similar performance with standard DRAM.

The separate on-chip data and instruction caches serve only the DSPCPU since the data access patterns of the autonomous I/O and graphics units exhibit little or no locality of reference (they access each piece of the multimedia data stream only once in each operation).

Without the caches, the CPU would not be able to

achieve its performance potential. SDRAM has enough

bandwidth to handle serial streams of multimedia data,

but its bandwidth and latency are insufficient to satisfy

the CPU’s high rate of random data accesses and re-

peated instruction accesses.

Table 5-1 shows bandwidth parameters for the PNX1300 DSPCPU and the main-memory interface. Although 400 MB/s is a lot of bandwidth, it is clear that the SDRAM alone cannot keep up with the CPU’s maximum requirements for instructions and data. Luckily, multimedia algorithms resemble other computer programs in terms of locality of reference, so the on-chip caches typically supply the majority of instructions and data to the DSPCPU. The wide paths to the caches are matched to the bandwidth requirements of the DSPCPU.

[Figure 5-1: block diagram. The VLIW CPU contains three branch units (three sets of signals, each with address, opcode, condition, and guard) and two memory units (two sets of signals, each with a guard, opcode, data, and two address components). The branch units connect through the decompressor (224 bits of decompressed instruction) to the 32KB, 8-way instruction cache; the memory units connect to the 16KB, 8-way data cache. Both caches connect to the internal data highway (32-bit address, 32-bit data), which also leads to the on-chip peripherals and to the main-memory interface. The main-memory interface drives the main-memory bus (glueless, SDRAM control with 32-bit data) to the SDRAM main memory.]

Figure 5-1. The main components of the PNX1300 memory system.

Table 5-1. 100-MHz PNX1300 memory bandwidth parameters

Magnitude    Use
2800 MB/s    Instruction bandwidth (224 bits/instruction)
800 MB/s     Data bandwidth (two 32-bit memory ports)
400 MB/s     Main-memory bandwidth (one 32-bit port)
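The magnitudes in Table 5-1 are consistent with multiplying the width of each path by the 100-MHz clock. The helper below is a small worked check, assuming one full-width transfer per cycle and 1 MB = 10^6 bytes; the function name is hypothetical.

```c
/* Peak bandwidth in MB/s of a path that moves bits_per_cycle bits on
 * every cycle of a clock running at clock_mhz MHz. Assumes the width
 * is a whole number of bytes. */
unsigned peak_bandwidth_mb_per_s(unsigned bits_per_cycle, unsigned clock_mhz)
{
    return (bits_per_cycle / 8u) * clock_mhz;
}
```

With these assumptions, a 224-bit instruction per cycle gives 2800 MB/s, two 32-bit memory ports give 800 MB/s, and one 32-bit main-memory port gives 400 MB/s.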

To improve cache behavior and thus program perfor-

mance, the caches have a locking mechanism. In addi-

tion, the instruction cache is coupled with an instruction

decompression unit. The compressed instruction format

improves the cache hit rate and reduces the bus band-

width required between main memory and cache. In-

structions in main memory and cache use the com-

pressed format.

PNX1300’s processing units access the external SDRAM through the on-chip central “data highway” bus. The highway consists of separate 32-bit address and data buses, and use of the bus is mediated by the main-memory interface unit. The main-memory interface contains the SDRAM controller and a central arbiter that determines how much of the available SDRAM memory bandwidth is allocated to each unit. Unused bandwidth is always made available to the VLIW CPU for cache refill and memory accesses that bypass the caches.

Table 5-2 gives a summary description of each compo-

nent of PNX1300’s memory system.

5.2 DRAM APERTURE

PNX1300 implements a 32-bit linear address space of

bytes. Within that address space, PNX1300 supports

several different apertures for specific purposes. The

DRAM aperture describes the part of the address space

into which the external SDRAM is mapped. SDRAM

must consist of a single, contiguous region of memory,

which is the most practical configuration for PNX1300

systems.

The location and size of the DRAM aperture are defined by two registers, DRAM_BASE and DRAM_LIMIT. These registers are both readable and writeable as MMIO registers and as PCI configuration space registers. The view of the registers in MMIO space is shown in Figure 5-2. The view of the registers in PCI configuration space is described in Chapter 11, “PCI Interface.” In normal operation, the base address registers are assigned once during boot and are not changed while the DSPCPU is running. Refer to Chapter 11, “PCI Interface,” and Chapter 13, “System Boot,” for a description of this process.

DRAM_LIMIT must be set equal to DRAM_BASE plus

the actual size of SDRAM present. The amount of the

SDRAM is not required to be a power of 2, but it must be

a multiple of 64 KB. Note that the size of the aperture as

set in the PCI configuration space can be larger, be-

cause it must be a power of 2.

A memory operation will access SDRAM if its address

satisfies:

[DRAM_BASE] <= address < [DRAM_LIMIT]

Any address outside this range cannot access SDRAM.
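The aperture rules above can be expressed as a few small checks. This is a sketch for illustration only: the function names are hypothetical, and the functions operate on plain values rather than reading the MMIO registers.

```c
#include <stdbool.h>
#include <stdint.h>

#define DRAM_GRANULE 0x10000u   /* 64 KB */

/* The SDRAM size need not be a power of 2, but it must be a
 * multiple of 64 KB. */
bool dram_size_is_valid(uint32_t sdram_bytes)
{
    return sdram_bytes % DRAM_GRANULE == 0;
}

/* DRAM_LIMIT is DRAM_BASE plus the actual size of the SDRAM present. */
uint32_t dram_limit_for(uint32_t dram_base, uint32_t sdram_bytes)
{
    return dram_base + sdram_bytes;
}

/* A memory operation accesses SDRAM if base <= address < limit. */
bool in_dram_aperture(uint32_t address, uint32_t dram_base, uint32_t dram_limit)
{
    return address >= dram_base && address < dram_limit;
}
```

For the reset values quoted in this section (base 0x0, limit 0x0010 0000), address 0x000F FFFF is inside the aperture and address 0x0010 0000 is outside it.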

When PNX1300 is reset, DRAM_BASE_FIELD is set to 0x0 and DRAM_LIMIT is set to 0x0010 0000 (1-MB DRAM aperture starting at address 0x0). The boot process described in Chapter 13, “System Boot,” overrides these initial settings.

Table 5-2. Summary of memory system characteristics

Unit                    Description

Branch units            Branch units execute branch operations. Up to three
                        branch operations can be executed in parallel, but the
                        program must guarantee that only one branch is taken.

Decompression unit      Instructions are stored in memory and in the
                        instruction cache in a space-saving, compressed
                        format. The decompression unit expands instructions to
                        their full, 28-byte size before they are issued to the
                        CPU.

Instruction cache       The instruction cache holds 32 KB, is 8-way
                        set-associative, and has a 64-byte block size. A miss
                        in a block causes the entire block to be read from
                        SDRAM. The cache can sustain an issue rate of one
                        instruction per cycle on cache hits.

Memory units            Memory units execute load and store operations. The
                        data cache is dual ported to allow the memory units to
                        operate concurrently.

Data cache              The data cache holds 16 KB, is 8-way set-associative,
                        has a 64-byte block size, and implements a copyback,
                        allocate-on-write policy. A miss in a block causes the
                        entire block to be read from SDRAM. The cache supports
                        memory-mapped I/O through non-cacheable address
                        regions.

Data highway            The on-chip data highway bus serves all on-chip units.
                        The highway has separate 32-bit data and address
                        buses. Bus bandwidth is allocated by the highway
                        arbiter according to one of several modes.

Main-memory interface   The main-memory interface contains the data-highway
                        access arbiter, the SDRAM controller, and MMIO logic.

SDRAM main memory       External SDRAM connects gluelessly to PNX1300 over the
                        32-bit main-memory bus.

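[Figure 5-2: register layout diagram (bits 31..0), MMIO_BASE offsets.
DRAM_BASE (r/w), offset 0x10 0000: DRAM_BASE_FIELD, with the low-order bits shown as zeros.
DRAM_LIMIT (r/w), offset 0x10 0004: DRAM_LIMIT_FIELD, with the low-order bits shown as zeros.]

Figure 5-2. Formats of the DRAM_BASE and DRAM_LIMIT registers.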


5.3 DATA CACHE

The data cache serves only the DSPCPU and is con-

trolled by two memory units that execute the load and

store operations issued by the DSPCPU. The following

sections describe the data cache and its operation;

Table 5-3 summarizes the important characteristics for

easy reference.

5.3.1 General Cache Parameters

The PNX1300 data cache is 16 KB in size with a 64-byte block size. Thus, it contains 256 blocks each with its own address tag. The cache is 8-way set-associative, so there are 32 sets, each containing 8 tags. A single valid bit is associated with a block, so each block and associated address tag is either entirely valid in the cache or invalid. On a cache miss, 64 bytes are read from SDRAM to make the entire block valid.

Each block also contains a dirty bit, which is set whenever a write to the block occurs. Each set contains 10 bits to support the hierarchical LRU replacement policy.

The geometry of the data cache is available to software

by reading the MMIO register DC_PARAMS. Figure 5-3

shows the format of the DC_PARAMS register;

Table 5-4 lists its field values. The product of block size,

associativity, and number of sets gives the total cache

size (16 KB in this case).

5.3.2 Address Mapping

PNX1300 data addresses are mapped onto the data

cache storage structure as shown in Figure 5-4. A data

address is partitioned into four fields as described in

Table 5-5.
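The field boundaries in Table 5-5 translate directly into shifts and masks. The following sketch splits a data address into its four fields; the struct and function names are hypothetical and chosen only for this example.

```c
#include <stdint.h>

/* The four fields of a data address, per Table 5-5. */
struct dcache_addr {
    uint32_t tag;   /* bits 31..11 */
    uint32_t set;   /* bits 10..6: one of 32 sets */
    uint32_t word;  /* bits 5..2: one of 16 words */
    uint32_t byte;  /* bits 1..0: byte offset within a word */
};

struct dcache_addr dcache_split(uint32_t address)
{
    struct dcache_addr f;
    f.byte = address & 0x3u;
    f.word = (address >> 2) & 0xFu;
    f.set  = (address >> 6) & 0x1Fu;
    f.tag  = address >> 11;
    return f;
}
```

For example, address 0x12345678 splits into tag 0x2468A, set 25, word 14, and byte 0. Two addresses compete for the same set whenever their bits 10..6 match, that is, whenever they differ by a multiple of 2 KB.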

Table 5-3. Summary of data cache characteristics

Characteristic            PNX1300 Implementation

Cache size                16 KB
Cache associativity       8-way set-associative
Block size                64 bytes
Valid bits                One valid bit per 64-byte block
Dirty bits                One dirty bit per 64-byte block
Miss transfer order       Miss transfers begin with the critical word first
Replacement policies      Copyback, allocate on write, hierarchical LRU
Endianness                Either little- or big-endian, determined by PCSW bit
Ports                     The cache is quasi dual ported; two accesses can
                          proceed concurrently if they reference different
                          banks (determined by bits [4:2] of the computed
                          addresses)
Alignment                 Access must be naturally aligned (32-bit words on
                          32-bit boundaries, 16-bit half-words on 16-bit
                          boundaries); the appropriate number of LSBs of
                          un-naturally aligned addresses are set to zero.
                          For misaligned stores, PCSW.MSE is asserted to
                          generate an exception
Partial word operations   The cache implements 8-bit and 16-bit accesses with
                          the same performance as 32-bit accesses
Operation latency         Three cycles for both load and store operations
Coherency enforcement     Software uses special operations to enforce cache
                          coherency
Cache locking             Up to 1/2 (four out of 8 blocks of each set) of the
                          cache contents can be locked; granularity is 64-byte
Non-cacheable region      One non-cacheable aperture in the DRAM address space
                          is supported.

Table 5-4. DC_PARAMS field values

Field Name Value

BLOCK SIZE 64

ASSOCIATIVITY 8

NUMBER_OF_SETS 32

Table 5-5. Data address field partitioning

Field   Address Bits   Purpose

Byte    1..0           Byte offset within a word for byte or half-word
                       accesses
Word    5..2           Selects one of the words in a set (one of 16 words in
                       the case of PNX1300)
Set     10..6          Selects one of the sets in the cache (one of 32 in the
                       case of PNX1300)
Tag     31..11         Compared against address tags of set members

[Figure 5-3: register layout diagram (bits 31..0). DC_PARAMS (r/o), MMIO_BASE offset 0x10 001C, contains the fields BLOCKSIZE, ASSOCIATIVITY, and NUMBER_OF_SETS.]

Figure 5-3. Format of the DC_PARAMS register.

[Figure 5-4: data cache address layout. Tag: bits 31..11; Set: bits 10..6; Word: bits 5..2; Byte: bits 1..0.]

Figure 5-4. Data cache address partitioning.


5.3.3 Miss Processing Order

When a miss occurs, the data cache fills the block con-

taining the requested word from the critical word first.

The CPU is stalled until the first word is transferred. The

block is then filled up while the CPU keeps running.

5.3.4 Replacement Policies, Coherency

The cache implements a copyback replacement policy with one dirty bit per 64-byte block. Thus, when a miss occurs and the block selected for replacement has its dirty bit set, the dirty block must be written to main memory to preserve its modified contents. On PNX1300, the dirty block is written to memory before the needed block is fetched.

Coherency is not maintained in any way by hardware be-

tween the data cache, the instruction cache, and main

memory. Special operations are available to implement

cache coherency in software. See Section 5.6, “Cache

Coherency,” for a discussion of coherency issues.

Write misses are handled with an allocate-on-write poli-

cy—the write that caused the miss stores its data in the

cache after the missing block is fetched into the cache.

The cache implements a hierarchical LRU replacement algorithm to determine which of the eight elements (blocks) in a set is replaced. The algorithm partitions the eight set elements into four groups, each group with two elements. The hierarchical LRU replacement victim is determined by selecting the least-recently used group of two elements and then selecting the least-recently used element in that group. This hierarchical algorithm yields performance close to full LRU but is simpler to implement.

See Section 5.5, “LRU Algorithm,” for a full discussion of the LRU algorithm.
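The two-level selection can be modeled in a few lines. The sketch below is an illustration of the idea only: it is not the 10-bit hardware encoding, the pairing of elements into groups (0,1), (2,3), (4,5), (6,7) is an assumption, and all names are hypothetical.

```c
#include <stdint.h>

/* Illustrative two-level LRU state for one 8-way set. */
struct hlru_set {
    uint8_t group_order[4]; /* group numbers, most recently used first */
    uint8_t lru_elem[4];    /* per group: which of its two elements is LRU */
};

void hlru_init(struct hlru_set *s)
{
    for (int g = 0; g < 4; g++) {
        s->group_order[g] = (uint8_t)g;
        s->lru_elem[g] = 0;
    }
}

/* Record a use of element way (0..7). */
void hlru_touch(struct hlru_set *s, unsigned way)
{
    unsigned group = way / 2u;
    unsigned pos = 0;

    /* Within the group, the other element becomes least recently used. */
    s->lru_elem[group] = (uint8_t)((way % 2u) ^ 1u);

    /* Move the group to the front of the recency order. */
    while (s->group_order[pos] != group)
        pos++;
    for (; pos > 0; pos--)
        s->group_order[pos] = s->group_order[pos - 1];
    s->group_order[0] = (uint8_t)group;
}

/* Victim: the LRU element of the LRU group. */
unsigned hlru_victim(const struct hlru_set *s)
{
    unsigned group = s->group_order[3];
    return group * 2u + s->lru_elem[group];
}
```

In this model, after elements 0 through 7 are touched in order and element 0 is touched again, the victim is element 2, whereas full LRU would choose element 1. The example shows how the hierarchical scheme approximates, but does not always match, full LRU.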

5.3.5 Alignment, Partial-Word Transfers,

Endian-ness

The cache implements 32-bit word, 16-bit half-word, and 8-bit byte transfers. All transfers, however, must be to addresses that are naturally aligned; that is, 32-bit words must be aligned on 32-bit boundaries, and 16-bit half-words must be aligned on 16-bit boundaries.

Like other PNX1300 processing units, the CPU has the

capability to use either big- or little-endian byte order. It

is recommended that all units and the CPU run with the

same endian-ness. Detailed endian-ness description

can be found in Appendix C, “Endian-ness.”

5.3.6 Dual Ports

To allow two accesses to proceed in parallel, the data cache is quasi-dual ported. The cache is implemented as eight banks of single-ported memory, but the hardware allows each bank to operate independently. Thus, when the addresses of two simultaneous accesses select two different banks, both accesses can complete simultaneously. Bank selection is determined by address bits [4..2] of each address. Thus, the words in a 64-byte cache block are distributed among the eight banks, which prevents conflicts between two simultaneously issued accesses to adjacent words in a cache block. The PNX1300 compiling system attempts to avoid bank conflicts as much as possible.
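The bank-selection rule is easy to evaluate by hand or in code when laying out data that will be accessed in pairs. The following is a minimal sketch with hypothetical function names; it treats any two accesses that select the same bank as conflicting, which is a simplification of the text above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bank number (0..7) selected by address bits [4..2]. */
unsigned dcache_bank(uint32_t address)
{
    return (address >> 2) & 0x7u;
}

/* Two simultaneous accesses can both complete only if they select
 * different banks. */
bool dcache_bank_conflict(uint32_t addr_a, uint32_t addr_b)
{
    return dcache_bank(addr_a) == dcache_bank(addr_b);
}
```

Adjacent words such as 0x100 and 0x104 fall in banks 0 and 1 and do not conflict, while 0x100 and 0x120 are 32 bytes apart, both select bank 0, and therefore conflict.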

The dual-ported cache can execute the load and store

opcodes (ild8d, uld8d, ild16d, uld16d, ld32d, h_st8d,

h_st16d, h_st32d, ild8r, uld8r, ild16r, uld16r, ld32r,

ild16x, uld16x, ld32x) in either or both of the two ports.

The special opcodes alloc, dcb, dinvalid, pref, rdtag and rdstatus can only be executed in the second port, not in the first port. Whenever any of these special opcodes is issued in the second port, there should not be a concurrent load or store operation in the first. This is a special scheduling constraint.

5.3.7 Cache Locking

The data cache allows the contents of up to one-half of its blocks to be locked. Thus, on PNX1300, up to 8 KB of the cache can be used as a high-speed local data memory. Only four out of eight blocks in any set can be locked.

A locked block is never chosen as a victim by the re-

placement algorithm; its contents remain undisturbed un-

til either (1) the block’s locked status is changed explicitly

by software, or (2) a dinvalid operation is executed that

targets the locked block.

Cache locking occurs only for the data in the address range described by the MMIO registers DC_LOCK_ADDR and DC_LOCK_SIZE. The granularity of the address range is one 64-byte cache block. The MMIO register DC_LOCK_CTL contains the cache-locking enable bit DC_LOCK_ENABLE. Figure 5-5 shows the layout of the data-cache lock registers. Locking will occur for an address if locking is enabled and both of the following are true:

1. The address is greater than or equal to the value in DC_LOCK_ADDR.

2. The address is less than the sum of the values in DC_LOCK_ADDR and DC_LOCK_SIZE.
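The two conditions above, together with the 64-byte granularity and the 8-KB limit, can be captured as simple predicates. This is a sketch on plain values, not driver code: the names are hypothetical, nothing here touches the MMIO registers, and the size check is derived from the limits stated in this section rather than from a documented hardware check.

```c
#include <stdbool.h>
#include <stdint.h>

#define DC_BLOCK_BYTES     64u     /* lock granularity: one cache block */
#define DC_LOCK_MAX_BYTES  8192u   /* at most one-half of the 16-KB cache */

/* A lock range is plausible if it is block-aligned, a whole number of
 * blocks long, and no larger than 8 KB. */
bool dc_lock_range_is_valid(uint32_t lock_addr, uint32_t lock_size)
{
    return lock_addr % DC_BLOCK_BYTES == 0 &&
           lock_size % DC_BLOCK_BYTES == 0 &&
           lock_size <= DC_LOCK_MAX_BYTES;
}

/* Locking applies if it is enabled and
 * lock_addr <= address < lock_addr + lock_size. The subtraction form
 * avoids overflow when the range ends at the top of the address space. */
bool dc_lock_applies(bool lock_enabled, uint32_t address,
                     uint32_t lock_addr, uint32_t lock_size)
{
    return lock_enabled &&
           address >= lock_addr &&
           address - lock_addr < lock_size;
}
```

A contiguous, block-aligned range of at most 8 KB spans at most 128 consecutive blocks, so it places no more than four blocks in any of the 32 sets, which matches the four-out-of-eight limit.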

Programmers (or compilers) must combine all data that

needs to be locked into this single linear address range.

Setting DC_LOCK_ENABLE to ‘1’ causes the following

sequence of events:

1. All blocks that are in cache locations that will be used for locking are copied back to main memory (if they are dirty) and removed from the cache.

2. All blocks in the lock range are fetched from main memory into the cache. If any block in the lock range was already in the cache, it is first copied back into main memory (if it is dirty) and invalidated.

3. The LRU status of any set that contains locked blocks is set to the initialization value.

4. Cache locking is activated so that the locked blocks cannot be victims of the replacement algorithm.

This sequence of events is triggered by writing ‘1’ to DC_LOCK_ENABLE even if the enable is already set to ‘1’. Setting DC_LOCK_ENABLE to ‘0’ causes no action except to allow the previously locked blocks to be replacement victims.

To program a new lock range, the following sequence of operations is used:

1. Disable cache locking by writing ‘0’ to

DC_LOCK_ENABLE.

2. Define a new lock range by writing to

DC_LOCK_ADDR and DC_LOCK_SIZE.

3. Enable cache locking by writing ‘1’ to

DC_LOCK_ENABLE.

Dirty locked blocks can be written back to main memory

while locking is enabled by executing copyback opera-

tions in software.

Programmer’s note: Software should not execute din-

valid operations on a locked block. If it does, the block

will be removed from the cache, creating a ‘hole’ in the

lock range (and the data cache) that cannot be reused

until locking is deactivated.

Cache locking is disabled by default when PNX1300 is

reset.

The RESERVED field in DC_LOCK_CTL should be ig-

nored on reads and written as all zeroes.

Locking should not be enabled by PCI accesses to the

MMIO registers.

5.3.8 Memory Hole and PCI Aperture

Disable

Bits 6 and 5 in DC_LOCK_CTL comprise the

APERTURE_CONTROL field. This field can be used to

change the memory map as seen by the DSPCPU. The

hardware RESET value of the field corresponds to the

memory map as described in Section 3.4.1, “Memory

Map.”

5.3.9 Non-cacheable Region

The data cache supports one non-cacheable address region

within the DRAM address space aperture. The base

address of this region is determined by the value in the

DRAM_CACHEABLE_LIMIT MMIO register, which is

shown in Figure 5-6. Since uncached memory opera-

tions always incur many stall cycles, the non-cacheable

region should be used sparingly.

A memory operation is non-cacheable if its target ad-

dress satisfies:

[dram_cacheable_limit] <= address < [dram_limit]

Thus, the non-cacheable region is at the high end of the

DRAM aperture. The format of the

DRAM_CACHEABLE_LIMIT register forces the size of

the non-cacheable region to be a multiple of 64 KB.

When PNX1300 is reset, DRAM_CACHEABLE_LIMIT is

set equal to DRAM_LIMIT, which results in a zero-length

non-cacheable region.
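The address test above can be written as a small predicate. This is an illustrative sketch only: the function names are hypothetical, and the two limit values are assumed to have been read from the DRAM_CACHEABLE_LIMIT and DRAM_LIMIT MMIO registers beforehand.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when an access to `address` bypasses the data cache, i.e. when
 * [dram_cacheable_limit] <= address < [dram_limit]. */
bool is_noncacheable(uint32_t address, uint32_t dram_cacheable_limit, uint32_t dram_limit)
{
    return address >= dram_cacheable_limit && address < dram_limit;
}

/* Size in bytes of the non-cacheable region at the high end of the DRAM aperture. */
uint32_t noncacheable_size(uint32_t dram_cacheable_limit, uint32_t dram_limit)
{
    return dram_limit - dram_cacheable_limit;
}
```

With the reset values (both limits equal), the size is zero and the predicate is false for every address.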

Programmer’s note: When DRAM_CACHEABLE_LIMIT

is changed to enlarge the region that is non-cacheable,

software must ensure coherency. This is accomplished

by explicitly copying back dirty data (using dcb operations)

and invalidating (using dinvalid operations) the

cache blocks in the region that was previously cacheable.

Figure 5-5. Formats of the registers in charge of data-cache locking.

Register (access)     MMIO_BASE offset   Fields
DC_LOCK_CTL (r/w)     0x10 0010          APERTURE_CONTROL, DC_LOCK_ENABLE, reserved
DC_LOCK_ADDR (r/w)    0x10 0014          DC_LOCK_ADDRESS
DC_LOCK_SIZE (r/w)    0x10 0018          DC_LOCK_SIZE

Table 5-6. Aperture control field

Value        Memory map properties
00 (RESET)   Normal operation memory map (Section 3.4.1):
             • loads to 0..0xff always return 0 and cause no PCI read (memory hole is enabled)
             • PCI aperture(s) are enabled
01           • loads to address 0..0xff cause a PCI read, i.e. the memory hole is disabled
             • PCI aperture(s) are enabled
10           PCI apertures are disabled for loads
             • loads return a 0 and cause no PCI read
11           RESERVED for future extensions

Figure 5-6. Format of the DRAM_CACHEABLE_LIMIT register.

Register (access)            MMIO_BASE offset   Fields
DRAM_CACHEABLE_LIMIT (r/w)   0x10 0008          DRAM_CACHEABLE_LIMIT_FIELD (upper 16 bits); lower 16 bits are zero

5.3.10 Special Data Cache Operations

A program can exercise some control over the operation

of the data cache by executing special operations. The

special operations can cause the data cache to initiate

the copyback or invalidation of a block in the cache.

These operations are typically used by software to keep

the cache coherent with main memory.

In addition, there are special operations that allow a program

to read tag and status information from the data

cache.

Special data cache operations are always executed on

the memory port associated with issue slot 5.

5.3.10.1 Copyback and invalidate operations

The data cache controller recognizes a copyback and an

invalidate operation as shown in Table 5-7.

The dcb and dinvalid operations both compute a target

word address that is the sum of a register and seven-bit

offset. The offset can be in the range [–256..252] and

must be divisible by four.

dcb operation. The dcb operation computes the target

address, and if the block containing the address is found

in the data cache, its contents are written back to main

memory if the block is both valid and dirty. If the block is

not present, not valid, or not dirty, no action results from

the dcb operation. If the dcb causes a copyback to occur,

the CPU is stalled until the copyback completes. If the

block is not in cache, the operation causes no stall cy-

cles. If the block is in cache but not dirty, the operation

causes 4 stall cycles. If the block is dirty, the dcb operation

causes a writeback and takes at least 19 stall cycles.

The dcb operation clears the dirty bit but leaves a valid

copy of the written-back block in the cache.

dinvalid operation. The dinvalid operation computes

the target address, and if the block containing the ad-

dress is found in the data cache, its valid and dirty bits

are cleared. No copyback operation will occur even if the

block is valid and dirty prior to executing the dinvalid op-

eration. The CPU is stalled for 2 cycles, if the target block

is in the cache; otherwise, no stall cycles occur.

A dinvalid or dcb operation updates the LRU information

to least recently used in its set.

Programmer’s note: Software should not execute din-

valid operations on locked blocks; otherwise, a ‘hole’ is

created that cannot be reused until locking is deactivated.

5.3.10.2 Data cache tag and status

operations

The data cache controller recognizes two DSPCPU op-

erations for reading cache status as shown in Table 5-8.

The rdtag and rdstatus operations both compute a target

word address that is the sum of a register and scaled

seven-bit offset. The offset must be divisible by four and

in the range [–256..252].

rdtag operation. The target address computed by rdtag

selects the data cache block by specifying the cache set

and set element directly. Address bits [10..6] specify the

cache set (one of 32), and bits [13..11] specify the set element

(one of eight). All other target address bits are ignored.

This operation causes no CPU stall cycles.

The result of the rdtag operation is a full 32-bit word with

the format shown in Figure 5-7.

rdstatus operation. The target address computed by rd-

status selects the data cache set by specifying the set

number directly. Address bits [10..6] specify the cache

set (one of 32); all other target address bits are ignored.

This operation causes 1 CPU stall cycle.
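The way rdtag and rdstatus interpret their target address can be illustrated by two helpers that build such an address from a set number and a set element. The function names are hypothetical; the bit positions are the ones stated above. On the DSPCPU the resulting value would be supplied as the register operand of the operation with an offset of zero.

```c
#include <stdint.h>

/* Target address for rdtag: cache set in bits [10..6] (one of 32),
 * set element in bits [13..11] (one of eight). Other bits are ignored
 * by the operation and are left at zero here. */
uint32_t rdtag_target(uint32_t set, uint32_t element)
{
    return ((element & 0x7u) << 11) | ((set & 0x1Fu) << 6);
}

/* Target address for rdstatus: only the cache set in bits [10..6] matters. */
uint32_t rdstatus_target(uint32_t set)
{
    return (set & 0x1Fu) << 6;
}
```

Looping set over 0..31 and element over 0..7 visits every block of the data cache exactly once.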

The result of the rdstatus operation is a full 32-bit word

with the format shown in Figure 5-7. See Section 5.6.7,

“LRU Bit Definitions,” for a description of the LRU bits.

Table 5-7. Copyback and invalidate operations

Mnemonic                 Description
dcb(offset) rsrc1        Data-cache copyback block. Causes the block that contains the target address to be copied back to main memory if the block is valid and dirty.
dinvalid(offset) rsrc1   Data-cache invalidate block. Causes the block that contains the target address to be invalidated. No copyback occurs even if the block is dirty.

Table 5-8. Cache read-status operations

Mnemonic                 Description
rdtag(offset) rsrc1      Read data-cache tag. The target address selects a data-cache block directly; the operation returns a 32-bit result containing the 21-bit cache tag and the valid bit.
rdstatus(offset) rsrc1   Read data-cache status. The target address selects a data-cache set directly; the operation returns a 32-bit result containing the set’s eight dirty bits and ten LRU bits.

Figure 5-7. Result formats for rdtag and rdstatus operations.

Result word              Fields
rdtag result format      TAG, VALID; remaining bits are zero
rdstatus result format   DIRTY, LRU; remaining bits are zero

5.3.10.3 Data cache allocation operation

The data cache controller recognizes allocation opera-

tions as shown in Table 5-9. The allocation operations allocate

a block and set the status of this block to valid. No

data is fetched from main memory. The allocated block

is undefined after this operation. The programmer has to

fill it with valid data by store operations. Allocation oper-

ations to apertures other than cacheable DRAM will be

discarded. Allocation of a non-dirty block causes 3 stall

cycles. Allocation of a dirty block will cause writeback of

this block to the SDRAM and take at least 11 stall cycles.

5.3.10.4 Data cache prefetch operation

The data cache controller recognizes prefetch opera-

tions as shown in Table 5-10. The prefetch operations

load a full cache block from memory concurrently with

other computation. If the prefetched block is already in

cache, no data is fetched from main memory. Prefetch

operations to apertures other than cacheable DRAM are

discarded. This operation is not guaranteed to execute:

it will not execute if the cache is already occupied with

two cache misses when the operation is issued. The

prefetch operations cause 3 stall cycles if there is no

copyback of a dirty block. If a dirty block is the target of

the prefetch, the dirty block will be written back to

SDRAM, and at least 11 stall cycles are taken.

5.3.11 Memory Operation Ordering

The PNX1300 memory system implements traditional ordering

for memory operations that are issued in different

clock cycles. That is, the effects of a memory operation

issued in cycle j occur before the effects of a memory op-

eration issued in cycle j+1.

For memory operations issued in the same cycle, howev-

er, it is not possible to execute memory operations in a

traditional order. So long as the simultaneous memory

operations access different addresses (aliasing is not

possible in PNX1300), no problems can occur. If two si-

multaneous operations do access the same address,

however, PNX1300 behavior is undefined. Specifically,

two cases are possible:

1. When multiple values are written to the same address

in the same cycle, the resulting value in memory is un-

defined.

2. When a read and a write occur to the same address

in the same clock cycle, the value returned by the

read is undefined.

The behavior of simultaneous accesses to the same ad-

dress is undefined regardless of whether one or both

memory operations hit in the cache.

Hidden Memory System Concurrency. Some cache

operations may be overlapped with CPU execution. In

general, a program cannot determine in what order

cache misses will complete nor can a program determine

when and in what order copyback operations will com-

plete. A program can, however, enforce the completion

of copyback transactions to main memory because copy-

back and invalidate operations can complete only if

pending copyback transactions for the same block have

completed. Thus, a program can synchronize to the com-

pletion of a copyback operation by dirtying a block, issu-

ing a copyback operation for the block, and then issuing

an invalidate operation for the block.

Ordering Of Special Memory Operations. The following

are special memory operations:

1. Loads or stores to MMIO addresses.

2. Non-cached loads or stores.

3. Any copyback or invalidate operation.

4. Loads or stores that cause a PCI-bus access.

The CPU is stalled until these special memory operations

are completed; there is no overlap of CPU execution

with these special memory operations. Thus, a programmer

can assume that traditional memory operation

ordering applies to special memory operations. Note,

however, that ordering is undefined for two special mem-

ory operations issued in the same cycle.

Table 5-9. Data cache allocation operations

Mnemonic               Description
allocd(offset) rsrc1   Data-cache allocate block with displacement. Causes the block with address (rsrc1 + offset) & (~(cache_block_size - 1)) to be allocated and set valid.
allocr rsrc1 rsrc2     Data-cache allocate block with index. Causes the block with address (rsrc1 + rsrc2) & (~(cache_block_size - 1)) to be allocated and set valid.
allocx rsrc1 rsrc2     Data-cache allocate block with scaled index. Causes the block with address (rsrc1 + 4 * rsrc2) & (~(cache_block_size - 1)) to be allocated and set valid.

Table 5-10. Data cache prefetch operations

Mnemonic              Description
prefd(offset) rsrc1   Data-cache prefetch block with displacement. Causes the block with address (rsrc1 + offset) & (~(cache_block_size - 1)) to be prefetched.
prefr rsrc1 rsrc2     Data-cache prefetch block with index. Causes the block with address (rsrc1 + rsrc2) & (~(cache_block_size - 1)) to be prefetched.
pref16x rsrc1 rsrc2   Data-cache prefetch block with scaled 16-bit index. Causes the block with address (rsrc1 + 2 * rsrc2) & (~(cache_block_size - 1)) to be prefetched.
pref32x rsrc1 rsrc2   Data-cache prefetch block with scaled 32-bit index. Causes the block with address (rsrc1 + 4 * rsrc2) & (~(cache_block_size - 1)) to be prefetched.
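The block-address expressions used in Table 5-9 and Table 5-10 can be checked with the following sketch. The function names are hypothetical and only model the address arithmetic; they do not issue any cache operation. A cache_block_size of 64 bytes is assumed, consistent with address bits [10..6] selecting the data-cache set.

```c
#include <stdint.h>

#define CACHE_BLOCK_SIZE 64u  /* assumed data-cache block size in bytes */

/* Round an address down to the start of its cache block. */
static uint32_t block_base(uint32_t address)
{
    return address & ~(CACHE_BLOCK_SIZE - 1u);
}

/* allocd, prefd: register plus displacement. */
uint32_t target_disp(uint32_t rsrc1, int32_t offset)
{
    return block_base(rsrc1 + (uint32_t)offset);
}

/* allocr, prefr: register plus index. */
uint32_t target_index(uint32_t rsrc1, uint32_t rsrc2)
{
    return block_base(rsrc1 + rsrc2);
}

/* pref16x: register plus index scaled for 16-bit elements. */
uint32_t target_scaled16(uint32_t rsrc1, uint32_t rsrc2)
{
    return block_base(rsrc1 + 2u * rsrc2);
}

/* allocx, pref32x: register plus index scaled for 32-bit elements. */
uint32_t target_scaled32(uint32_t rsrc1, uint32_t rsrc2)
{
    return block_base(rsrc1 + 4u * rsrc2);
}
```

The scaled forms let a loop that walks an array of 16-bit or 32-bit elements prefetch ahead using the element index directly.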

5.3.12 Operation Latency

Load and store operations have an operation latency of

three cycles, regardless of the size of the data transfer.

5.3.13 MMIO Register References

Memory operations that reference MMIO registers are

not cached, and the CPU is stalled until the MMIO refer-

ence completes. A MMIO register reference occurs when

an address is in the range:

[MMIO_BASE] <= address < ([MMIO_BASE] + 0x200000)

The size of the MMIO aperture is hardwired at 2 MB.

5.3.14 PCI Bus References

Any CPU memory operation that references an address

outside the SDRAM and MMIO address apertures is assumed

to reference a device or memory on the PCI bus.

PCI-bus data transfers are not cached, and the CPU is

stalled until the PCI transfer completes.
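The aperture decoding described in Sections 5.3.13 and 5.3.14 can be sketched as a classification function. This is a simplified model with hypothetical names: it assumes the SDRAM aperture is the range from DRAM_BASE up to (but not including) DRAM_LIMIT, and it ignores the memory hole and the APERTURE_CONTROL settings of Section 5.3.8.

```c
#include <stdint.h>

typedef enum { APERTURE_SDRAM, APERTURE_MMIO, APERTURE_PCI } aperture_t;

#define MMIO_APERTURE_SIZE 0x200000u  /* hardwired at 2 MB */

/* Decide which aperture a CPU memory operation falls into. */
aperture_t classify_address(uint32_t address, uint32_t dram_base,
                            uint32_t dram_limit, uint32_t mmio_base)
{
    if (address >= dram_base && address < dram_limit) {
        return APERTURE_SDRAM;
    }
    /* Unsigned subtraction folds both bounds of the MMIO test into one compare. */
    if (address - mmio_base < MMIO_APERTURE_SIZE) {
        return APERTURE_MMIO;
    }
    /* Everything else is assumed to reference the PCI bus. */
    return APERTURE_PCI;
}
```

Only the first case can be cached; MMIO and PCI references stall the CPU until they complete.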

5.3.15 CPU Stall Conditions

The data cache causes the CPU to stall when:

1. Any cache miss occurs.

2. Two simultaneously issued, cacheable memory operations

need to access the same cache bank (bank

conflict).

3. An access that references an address in the MMIO

aperture is issued.

4. An access to the PCI bus is issued.

5. A non-trivial copyback or invalidate operation is is-

sued.

6. An access to the non-cacheable region in the DRAM

aperture is issued.

5.3.16 Data Cache Initialization

When PNX1300 is reset, the data cache executes an ini-

tialization sequence. The cache asserts the CPU stall

signal while it sequentially resets all valid and dirty bits.

The cache de-asserts the stall signal after completing the

initialization sequence.

5.4 INSTRUCTION CACHE

The instruction cache stores compressed CPU instructions;

instructions are decompressed before being delivered

to the CPU. The following sections describe the instruction

cache and its operation; Table 5-11

summarizes instruction-cache characteristics.

5.4.1 General Cache Parameters

The PNX1300 instruction cache is 32 KB in size with a

64-byte block size. Thus, the cache contains 512 blocks

each with its own address tag. The cache is 8-way set-

associative, so there are 64 sets, each containing 8 tags.

A single valid bit is associated with a block, so each block

and associated address tag is either entirely valid or in-

valid; on a cache miss, 64 bytes are read from SDRAM

to make the entire block valid.

The geometry of the instruction cache is available to soft-

ware by reading the MMIO register IC_PARAMS.

Figure 5-8 shows the format of the IC_PARAMS register;

Table 5-12 lists its field values.

The product of the block size, associativity, and number

of sets gives the total cache size (32 KB in this case).

5.4.2 Address Mapping

PNX1300 instruction addresses are mapped onto the

instruction cache storage structure as shown in Figure 5-9. An

instruction address is partitioned into three fields as described

in Table 5-13.
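The cache geometry and the address partitioning can be cross-checked with a few lines of C. The function names are hypothetical; the constants are the IC_PARAMS values of Table 5-12 and the field boundaries are those of Table 5-13.

```c
#include <stdint.h>

/* IC_PARAMS field values for PNX1300 (Table 5-12). */
enum { IC_BLOCKSIZE = 64, IC_ASSOCIATIVITY = 8, IC_NUMBER_OF_SETS = 64 };

/* Total cache size is the product of the three geometry parameters. */
uint32_t ic_total_size(void)
{
    return (uint32_t)IC_BLOCKSIZE * IC_ASSOCIATIVITY * IC_NUMBER_OF_SETS;
}

/* Offset field: address bits 5..0. */
uint32_t ic_offset(uint32_t address)
{
    return address & 0x3Fu;
}

/* Set field: address bits 11..6, selecting one of 64 sets. */
uint32_t ic_set(uint32_t address)
{
    return (address >> 6) & 0x3Fu;
}

/* Tag field: address bits 31..12, compared against the tags of the set members. */
uint32_t ic_tag(uint32_t address)
{
    return address >> 12;
}
```

For example, the instruction address 0x12345678 splits into tag 0x12345, set 0x19 and offset 0x38.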

Table 5-11. Instruction cache characteristics

Characteristic          PNX1300 Implementation
Cache size              32 KB
Cache associativity     8-way set-associative
Block size              64 bytes
Valid bits              One valid bit per 64-byte block
Replacement policy      Hierarchical LRU (least-recently used) among the eight blocks in a set
Operation latency       Branch delay is three cycles
Coherency enforcement   Software uses a special operation to enforce cache coherency
Cache locking           Up to 1/2 (four out of eight blocks of each set) of the cache contents can be locked; granularity is 64 bytes

Table 5-12. IC_PARAMS field values

Field Name        Value
BLOCKSIZE         64
ASSOCIATIVITY     8
NUMBER_OF_SETS    64

Figure 5-8. Format of the instruction-cache parameters register.

Register (access)   MMIO_BASE offset   Fields
IC_PARAMS (r/o)     0x10 0020          ASSOCIATIVITY, NUMBER_OF_SETS, BLOCKSIZE

5.4.3 Miss Processing Order

When a miss occurs, the instruction cache starts filling

the requested block from the beginning of the block. The

DSPCPU is stalled until the entire block is fetched and

stored in the cache.

5.4.4 Replacement Policy

The hierarchical LRU replacement policy implemented

by the instruction cache is identical to that implemented

by the data cache. See Section 5.3.4, “Replacement Pol-

icies, Coherency,” for a description of the hierarchical

LRU algorithm.

5.4.5 Location of Program Code

All program code must first be loaded into SDRAM. The

instruction cache cannot fetch instructions from other

memories or devices. In particular, the cache cannot

fetch code from on-chip devices or over the PCI bus.

5.4.6 Branch Units

The instruction cache is closely coupled to three branch

units. Each unit can accept a branch independently, so

three branches can be processed simultaneously in the

same cycle.

Branches in PNX1300 are called ‘delayed branches’ be-

cause the effect of a successful (taken) branch is not

seen in the flow of control until some number of cycles af-

ter the successful branch is executed. The number of cy-

cles of latency is called the branch delay. On PNX1300,

the branch delay is three cycles.

Although three branches can be executed simultaneous-

ly, correct operation of the DSPCPU requires that only

one branch be successful (taken) in any one cycle.

DSPCPU operation is undefined if more than one con-

current branch operation is successful.

Each branch unit takes four inputs from the DSPCPU:

the branch opcode, a guard bit, a branch condition, and

a branch target address. A branch is deemed successful

if and only if the opcode is a branch opcode, the guard bit

is TRUE (i.e., = 1), and the condition (determined by the

opcode) is satisfied.
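The success rule for a branch, and the requirement that at most one of the three concurrent branches is taken, can be expressed as a small model. The type and function names are hypothetical, and the branch target address is omitted because it does not affect the decision.

```c
#include <stdbool.h>

/* Inputs a branch unit receives from the DSPCPU (target address omitted). */
typedef struct {
    bool is_branch_opcode;  /* the opcode is a branch opcode */
    bool guard;             /* guard bit, TRUE = 1 */
    bool condition;         /* branch condition, as determined by the opcode */
} branch_input_t;

/* A branch is successful (taken) only if all three inputs hold. */
bool branch_taken(branch_input_t b)
{
    return b.is_branch_opcode && b.guard && b.condition;
}

/* Operation is defined only if at most one of the three units takes a branch. */
bool branches_legal(const branch_input_t units[3])
{
    int taken = 0;
    for (int i = 0; i < 3; i++) {
        if (branch_taken(units[i])) {
            taken++;
        }
    }
    return taken <= 1;
}
```

A scheduler or simulator could use a check of this kind to reject instructions in which two guarded branches might both succeed.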

5.4.7 Coherency: Special iclr Operation

A program can exercise some control over the operation

of the instruction cache by executing the special iclr op-

eration. This operation causes the instruction cache to

clear the valid bits for all blocks in the cache, including

locked blocks. The LRU replacement status of all blocks

is reset to its initial value. The CPU is stalled while iclr is

executing.

See Section 5.6, “Cache Coherency,” for further discussion

of coherency issues.

5.4.8 Reading Tags and Cache Status

The instruction cache supports read access to its tag and

status bits, but not through special operations as with the

data cache. Since the instruction cache and branch units

can execute only resultless operations, access to the in-

struction-cache tags and status bits is implemented us-

ing normal load operations executed by the DSPCPU

that reference a special region in the MMIO address ap-

erture. The region is 64 KB long and starts at

MMIO_BASE. Instruction cache tags and status bits are

read-only; store operations to this region have no effect.

MMIO operations to this special region are only allowed

by the DSPCPU, not by any other masters of the on-chip

data highway, such as external PCI initiators.

Programmer’s note: Tag and status information cannot

be read by PCI access, but only by DSPCPU access.

Tag and status read cannot be scheduled in the same cy-

cle with or one cycle after an iclr operation.

Reading A Tag And Valid Bit. To read the tag and valid

bit for a block in the instruction cache, a program can execute

a ld32 operation directed at the instruction-cache

region in the MMIO aperture. The top of Figure 5-10

shows the required format for the target address. The

most-significant 16 bits must be equal to MMIO_BASE,

the least-significant 15 bits select the block (by naming

the set and set member), and bit 15 must be set to zero

to perform a tag read. Note that in PNX1300, valid set

numbers range from 0 to 63. Space to encode set num-

bers 64 to 511 is provided for future extensions.

A ld32 with an address as specified above returns a 32-

bit result with the format shown at the top of Figure 5-11.

Bit 20 contains the state of the valid bit, and the least-significant

20 bits contain the tag for the block addressed by

the ld32.

Reading The LRU Bits. To read the LRU bits for a set in

the instruction cache, a program can execute a ld32 op-

eration as above but using the address format shown at

the bottom of Figure 5-10. In this format, bit 15 is set to

one to perform the read of the LRU bits, and the

tag_i_mux field is set to zeros because it is not needed.

Table 5-13. Instruction Address Field Partitioning

Field    Address Bits   Purpose
Offset   5..0           Byte offset into a set
Set      11..6          Selects one of the sets in the cache (one of 64 in the case of PNX1300)
Tag      31..12         Compared against address tags of set members

Figure 5-9. Instruction-cache address partitioning.

Instruction cache address: Tag (bits 31..12), Set (bits 11..6), Offset (bits 5..0)

Reading the LRU bits produces a 32-bit result with the

format shown at the bottom of Figure 5-11. The least-sig-

nificant ten bits contain the state of the LRU bits when the

ld32 was executed. See Section 5.6.7, “LRU Bit Defini-

tions,” for a description of the LRU bits.

Note that the tag_i_mux and set fields in the address formats

of Figure 5-10 are larger than necessary for the instruction

cache in PNX1300. These fields will allow future

implementations with larger instruction caches to

use a compatible mechanism for reading instruction

cache information. The tag_i_mux field can accommo-

date a cache of up to 16-way set-associativity, and the

set field can accommodate a cache with up to 512 sets.

For PNX1300, the following constraints on the values of

these fields must be observed:

1. 0 <= tag_i_mux <= 7

2. 0 <= set <= 63
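The address construction and result decoding for instruction-cache tag and LRU reads can be sketched as follows. The function names are hypothetical. The result decoding (valid in bit 20, tag in the low 20 bits, LRU in the low ten bits) and the role of bit 15 follow the text; the positions of the tag_i_mux and set fields inside the low 15 bits are an ASSUMPTION inferred from their widths (four bits and nine bits above a word-aligned address) and must be checked against Figure 5-10.

```c
#include <stdbool.h>
#include <stdint.h>

/* ASSUMED field positions within the least-significant 15 address bits. */
#define IC_TAG_I_MUX_SHIFT 11u          /* assumed: bits 14..11 */
#define IC_SET_SHIFT        2u          /* assumed: bits 10..2  */
#define IC_READ_LRU_BIT    (1u << 15)   /* bit 15: 0 = tag read, 1 = LRU read */

/* Address for a ld32 that reads the tag and valid bit of one block.
 * PNX1300 constraints: 0 <= tag_i_mux <= 7 and 0 <= set <= 63. */
uint32_t ic_tag_read_address(uint32_t mmio_base, uint32_t tag_i_mux, uint32_t set)
{
    return (mmio_base & 0xFFFF0000u)
         | ((tag_i_mux & 0x7u) << IC_TAG_I_MUX_SHIFT)
         | ((set & 0x3Fu) << IC_SET_SHIFT);
}

/* Address for a ld32 that reads the LRU bits of one set (tag_i_mux = 0). */
uint32_t ic_lru_read_address(uint32_t mmio_base, uint32_t set)
{
    return (mmio_base & 0xFFFF0000u) | IC_READ_LRU_BIT
         | ((set & 0x3Fu) << IC_SET_SHIFT);
}

/* Tag-read result: bit 20 is the valid bit, bits 19..0 are the tag. */
bool ic_result_valid(uint32_t result)
{
    return ((result >> 20) & 1u) != 0u;
}

uint32_t ic_result_tag(uint32_t result)
{
    return result & 0xFFFFFu;
}

/* Status-read result: the least-significant ten bits are the LRU bits. */
uint32_t ic_result_lru(uint32_t result)
{
    return result & 0x3FFu;
}
```

These reads are only possible from the DSPCPU and must not be scheduled in the same cycle as, or one cycle after, an iclr operation.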

5.4.9 Cache Locking

Like the data cache, the instruction cache allows up to

one-half of its blocks to be locked. A locked block is nev-

er chosen as a victim by the replacement algorithm; its

contents remain undisturbed until the locked status is

changed explicitly by software. Thus, on PNX1300, up to

16 KB of the cache can be used as a high-speed instruction

‘ROM.’ Only four out of eight blocks in any set can be

locked.

The MMIO registers IC_LOCK_ADDR, IC_LOCK_SIZE,

and IC_LOCK_CTL—shown in Figure 5-12—are used to

define and enable instruction locking in the same way

that the similarly named data-cache locking registers are

used. Section 5.3.7, “Cache Locking,” describes the details

of cache locking; they are not repeated here.

Setting the IC_LOCK_ENABLE bit (in IC_LOCK_CTL) to

‘1’ causes the following sequence of events:

1. The instruction cache invalidates all blocks in the

cache.

2. The instruction cache fetches all blocks in the lock

range (defined by IC_LOCK_ADDR and

IC_LOCK_SIZE) from main memory into the cache.

3. Cache locking is activated so that the locked blocks

cannot be victims of the replacement algorithm.

The only difference between this sequence and the ini-

tialization sequence for data-cache locking is that dirty

blocks (which cannot exist in the instruction cache) are

not written back first.

Programmer’s note: Programmers (or compilers) must

combine all instructions that need to be locked into the

single linear instruction-locking address range.

The special iclr operation also removes locked blocks

from the cache. If blocks are locked in the instruction

cache, then instruction cache locking should be disabled

in software (by writing ‘0’ to IC_LOCK_CTL) before an

iclr operation is issued.

Locking should not be enabled by PCI accesses to the

MMIO register.

5.4.10 Instruction Cache Initialization and

Boot Sequence

When PNX1300 is reset, the instruction cache executes

an initialization and processor boot sequence. While re-

set is asserted, the instruction cache forces NOP opera-

tion to the DSPCPU, and the program counter is set to

the default value reset_vector. When reset is deasserted,

the initialization and boot sequence is as follows.

Figure 5-10. Required address format for reading instruction-cache tags and status.

Access                      Address fields
To read tag and valid bit   MMIO_BASE (most-significant 16 bits), bit 15 = 0, TAG_I_MUX, SET
To read LRU bits            MMIO_BASE (most-significant 16 bits), bit 15 = 1, TAG_I_MUX = 0, SET

Figure 5-11. Result formats for reads from the instruction-cache region of the MMIO aperture.

Result word                         Fields
I-cache tag-read result format      VALID (bit 20), TAG (bits 19..0); remaining bits are zero
I-cache status-read result format   LRU (bits 9..0); remaining bits are zero

Figure 5-12. Formats of the registers that control instruction-cache locking.

Register (access)     MMIO_BASE offset   Fields
IC_LOCK_CTL (r/w)     0x10 0210          IC_LOCK_ENABLE, reserved
IC_LOCK_ADDR (r/w)    0x10 0214          IC_LOCK_ADDRESS
IC_LOCK_SIZE (r/w)    0x10 0218          IC_LOCK_SIZE

1. The stall signal is asserted to prevent activity in the

DSPCPU and data cache.

2. The valid bits for all blocks in the instruction cache are

reset.

3. At the completion of the block invalidation scan, the

stall signal to the DSPCPU and data cache is deasserted.

4. The DSPCPU begins normal operation with an in-

struction fetch from the address reset_vector.

The initialization process takes 512 clock cycles. Reset

sets reset_vector equal to DRAM_BASE so that program

execution starts at the initial value of DRAM_BASE. The

initial value of DRAM_BASE is determined as described

in Section 5.2, “DRAM Aperture.”

5.5 LRU ALGORITHM

When a cache miss occurs, the block containing the re-

quested data must be brought into the cache to replace

an existing cache block. The LRU algorithm is responsi-

ble for selecting the replacement victim by selecting the

least-recently-used block.

The 8-way set-associative caches implement a hierarchical

LRU replacement algorithm as follows. The eight elements of a set are

partitioned into four groups of two elements each. To select

the LRU element:

• First, the LRU pair is selected out of the four pairs

using a four-way LRU algorithm.

• Second, the LRU element of the pair is selected

using a two-way LRU algorithm.

5.5.1 Two-Way Algorithm

The two-way LRU requires an administration of one bit

per pair of elements. On every cache hit to one of the two

blocks, the cache writes once to this bit (just a write, not

a read-modify-write). If the even-numbered block is ac-

cessed, the LRU bit is set to ‘1’; if the odd-numbered

block is accessed, the LRU bit is set to ‘0’. On a miss, the

cache replaces the LRU element, i.e. if the LRU bit is ‘0’,

the even numbered element will be replaced; if the LRU

bit is ‘1’, the odd numbered element will be replaced.

5.6 CACHE COHERENCY

The PNX1300 hardware does not implement coherency

between the caches and main memory. Generalized co-

herency is the responsibility of software, which can use

the special operations dcb, dinvalid, and iclr to enforce

cache/memory synchronization.

5.6.1 Example 1: Data-Cache/Input-Unit

Coherency

Before the CPU commands the video-in unit to capture a

video frame, the CPU must be sure that the data cache

contains no blocks that are in the address region that the

video-in unit will use to store the input frame. If the video-

in unit performs its input function to an address region

and the data cache does hold one or more blocks from

that region, any of the following may happen:

• A miss in the data cache may cause a dirty block to

be copied back to the address region being used by

the video-in unit. If the video-in unit already stored

data in the block, the write-back will corrupt the frame

data.

• The CPU will read stale data from the cache instead

of from the block in main memory. Even though the

video-in unit stored new video data in the block in

main memory, the cache contents will be used

instead because it is still valid in the cache.

To prevent erroneous copybacks or the use of stale data,

the CPU must use dinvalid operations to invalidate all

blocks in the address region that will be used by the VI

unit.
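Invalidating (or copying back) a whole buffer amounts to applying one cache operation per 64-byte block that the buffer touches. The sketch below only models that iteration: for_each_block, block_op_t, op_log_t and log_block_op are hypothetical names, the callback stands in for the dinvalid or dcb operation (which this C code does not issue), and a 64-byte data-cache block size is assumed.

```c
#include <stddef.h>
#include <stdint.h>

#define DCACHE_BLOCK_SIZE 64u  /* assumed data-cache block size in bytes */

/* Stand-in for a per-block cache operation such as dinvalid or dcb. */
typedef void (*block_op_t)(uint32_t block_address, void *context);

/* Apply `op` once to every cache block overlapping [start, start + length).
 * Returns the number of blocks visited. The range must not wrap around
 * the top of the address space. */
size_t for_each_block(uint32_t start, uint32_t length, block_op_t op, void *context)
{
    size_t count = 0;
    if (length == 0u) {
        return 0;
    }
    uint32_t first = start & ~(DCACHE_BLOCK_SIZE - 1u);
    uint32_t last = (start + length - 1u) & ~(DCACHE_BLOCK_SIZE - 1u);
    for (uint32_t block = first; ; block += DCACHE_BLOCK_SIZE) {
        op(block, context);
        count++;
        if (block == last) {
            break;
        }
    }
    return count;
}

/* Example callback that records block addresses instead of touching the cache. */
typedef struct {
    uint32_t addresses[8];
    size_t used;
} op_log_t;

void log_block_op(uint32_t block_address, void *context)
{
    op_log_t *log = (op_log_t *)context;
    if (log->used < 8u) {
        log->addresses[log->used++] = block_address;
    }
}
```

If a buffer does not start and end on block boundaries, invalidating its edge blocks would also discard neighboring data that shares those blocks, so input buffers are best aligned to the block size.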

5.6.2 Example 2: Data-Cache/Output-Unit

Coherency

Before the CPU commands the video-out unit to send a

frame of video, the CPU must be sure that all the data for

the frame has been written from the data cache to the region

of main memory that the video-out unit will output.

Explicit action is necessary because the data cache—

with its copyback write policy—will hold an exclusive

copy of the data until it is either replaced by the LRU al-

gorithm or the CPU explicitly forces it to be copied back

to main memory.

Before an output command is issued to the video-out

unit, the CPU must execute dcb operations to force co-

herency between cache contents and main memory.

5.6.3 Example 3: Instruction-Cache/Data-

Cache Coherency

If code prepared by a program running on the CPU must

be subsequently executed, coherency between the in-

struction and data caches must be enforced. This is ac-

complished by a two-step process:

1. Coherency between the data cache and main memory

must be enforced since the instruction cache can

fetch instructions only from main memory.

2. Coherency between the instruction cache and main

memory is enforced by executing an iclr operation.

The CPU will now be able to fetch and execute the new

instructions.

5.6.4 Example 4: Instruction-Cache/Input-

Unit Coherency

When an input unit is used to load program code into

main memory, the iclr operation must be issued before

attempting to execute the new code.

5.6.5 Four-Way Algorithm

For administration of the four-way algorithm, the cache

maintains an upper-left triangular matrix ‘R’ of 1-bit elements

without the diagonal. R contains six bits (in general,

n(n–1)/2 bits for n-way LRU). If set element k is referenced,

the cache sets row k to ‘1’ and column k to ‘0’:

R[k, 0..n–1] ← 1,

R[0..n–1, k] ← 0

The LRU element is the one for which the entire row is ‘0’

(or empty) and the entire column is ‘1’ (or empty):

R[k, 0..n–1] = 0 and R[0..n–1, k] = 1

For a 4-way set-associative cache, this algorithm re-

quires six bits per set of four cache blocks. On every

cache hit, the LRU info is updated by setting three of the

six bits to ‘0’ or ‘1’, depending on the set element that

was accessed. The bits need only be written, no read-

modify-write is necessary. On a miss, the cache reads

the six LRU bits to determine the replacement block.

PNX1300 combines the two-way and four-way algo-

rithms into an 8-way hierarchical LRU algorithm. A total

of ten administration bits are required: six to maintain the

four-way LRU plus four bits maintain the four two-way

LRUs.

The hierarchical algorithm has performance close to full

eight-way LRU, but it requires far fewer bits—ten instead

of 28 bits—and is much simpler to implement.

To update the LRU bits on a cache hit to element j (with

0 <= j <= 7), the cache applies m = (j div 2) to the four-

way LRU administration and (j mod 2) is applied to the

two-way administration of pair m. To select a replace-

ment victim, the cache first determines the pair p from the

four-way LRU and then retrieves the LRU bit q of pair p.

The overall LRU element is then p × 2 + q.

5.6.6 LRU Initialization

Reset causes the LRU administration bits to be initialized to

a legal state:

R[1,0] ← R[2,0] ← R[3,0] ← 1

R[2,1] ← R[3,1] ← R[3,2] ← 0

2_way[3] ← 2_way[2] ← 2_way[1] ← 2_way[0] ← 0
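The hierarchical LRU described above (four-way matrix, four two-way bits, and the reset state) can be modeled in C as follows. This is an illustrative software model with hypothetical names, not a description of the hardware implementation. The six matrix bits are stored as the entries R[i,j] with i > j, matching the indices used in the reset equations; one byte is used per bit for readability.

```c
#include <stdint.h>

/* Ten administration bits per set: six four-way bits plus four two-way bits.
 * A full eight-way LRU matrix would need 8 * 7 / 2 = 28 bits. */
typedef struct {
    uint8_t r[4][4];     /* only r[i][j] with i > j is used */
    uint8_t two_way[4];  /* two_way[k]: LRU bit of pair k */
} lru_state_t;

/* Reset state: R[1,0] = R[2,0] = R[3,0] = 1, all other bits 0. */
void lru_reset(lru_state_t *s)
{
    for (int i = 0; i < 4; i++) {
        for (int j = 0; j < 4; j++) {
            s->r[i][j] = 0;
        }
        s->two_way[i] = 0;
    }
    s->r[1][0] = 1;
    s->r[2][0] = 1;
    s->r[3][0] = 1;
}

/* Four-way update: row k <- 1, column k <- 0 (three bits written in total). */
static void four_way_touch(lru_state_t *s, int k)
{
    for (int j = 0; j < k; j++) {
        s->r[k][j] = 1;
    }
    for (int i = k + 1; i < 4; i++) {
        s->r[i][k] = 0;
    }
}

/* Four-way victim: the pair whose row is all 0 and whose column is all 1. */
static int four_way_victim(const lru_state_t *s)
{
    for (int k = 0; k < 4; k++) {
        int is_lru = 1;
        for (int j = 0; j < k; j++) {
            if (s->r[k][j] != 0) is_lru = 0;
        }
        for (int i = k + 1; i < 4; i++) {
            if (s->r[i][k] != 1) is_lru = 0;
        }
        if (is_lru) return k;
    }
    return -1;  /* not reached from a legal state */
}

/* Cache hit on set element j (0 <= j <= 7). */
void lru_touch(lru_state_t *s, int j)
{
    int m = j / 2;
    four_way_touch(s, m);
    /* Even element accessed -> bit 1 (odd is LRU); odd accessed -> bit 0. */
    s->two_way[m] = (uint8_t)((j % 2 == 0) ? 1 : 0);
}

/* Replacement victim: pair p from the four-way LRU, then element p * 2 + q. */
int lru_victim(const lru_state_t *s)
{
    int p = four_way_victim(s);
    return p * 2 + s->two_way[p];
}
```

Starting from the reset state the model selects element 0 as the first victim; after touching elements 0 through 7 in order it again selects element 0, as a true LRU policy would.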

5.6.7 LRU Bit Definitions

The ten LRU bits per set are mapped as shown in

Figure 5-13. This is the format of the LRU field as re-

turned by the special operation rdstatus for the data

cache and a ld32 from MMIO space (see Section 5.4.8,

“Reading Tags and Cache Status”) for the instruction

cache.

5.6.8 LRU for the Dual-Ported Cache

For the PNX1300 dual-ported data cache, two memory

operations to the same set are possible in a single clock

cycle. To support this concurrency, two updates of the

LRU bits of a single set must be possible.

The following rules are used by PNX1300:

1. LRU bits that are changed by exactly one port receive

the value according to the algorithm described above.

2. LRU bits that are changed by both ports receive a value
as if the algorithm were first applied for the access
in port zero and then for the access in port one.

5.7 PERFORMANCE EVALUATION SUPPORT

The caches implement support for performance evalua-

tion. Several events that occur in the caches can be

counted using the PNX1300 timer/counters, by selecting
the source CACHE1 and/or CACHE2, as described in
Section 3.8, “Timers.” Two different events can be
tracked simultaneously by using two timers.

The MMIO register MEM_EVENTS determines which

events are counted. See Figure 5-14 for the format of

MEM_EVENTS. Table 5-14 lists the events that can be

tracked and the corresponding values for the

MEM_EVENTS fields. Event1 selects the actual source

[Figure 5-13: layout of the ten-bit LRU field, LRU bit 9 down to LRU bit 0, holding the four two-way bits 2_way[3..0] and the six four-way bits R[1,0], R[2,0], R[2,1], R[3,0], R[3,1], R[3,2].]

Figure 5-13. LRU bit definitions; 2_way[k] is the two-way LRU bit of pair k = (j div 2) for set element j.

[Figure 5-14: MEM_EVENTS (r/w), MMIO_BASE offset 0x10 000C. The register holds the Event2 and Event1 fields; all other bits are 0.]

Figure 5-14. Format of the memory_events MMIO register.


for the TIMER CACHE1 source. Event2 selects the

source for TIMER CACHE2.

If the memory bus is available:

• On a data-cache read miss, the minimum waiting time
is 12 SDRAM clock cycles if critical word first is
granted by the Main Memory Interface (MMI). If not,
the data cache waits from 12 to 18 SDRAM cycles
(16 SDRAM cycles are required to fetch 64 bytes
from SDRAM).

• On a data-cache write miss, the missing line needs to
be fetched, which takes the same number of SDRAM cycles
as a data-cache read miss. If the victimized cache
line is dirty, it is copied back to memory
after the read of the missing line is done and thus
does not add extra stall cycles.

• Prefetch delay is the same as for a data-cache read miss if
the memory bus is available. As a reminder, the prefetch
may be discarded if the data cache state machine is
“full”, and there is a 3-cycle stall penalty when the
prefetch is issued.

5.8 MMIO REGISTER SUMMARY

Table 5-15 lists the MMIO registers that pertain to the op-

eration of PNX1300’s instruction and data caches.

Table 5-14. Trackable cache-performance events

Encoding  Event
0         No event counted
1         Instruction-cache misses
2         Instruction-cache stall cycles (including data-cache stall cycles if both instruction-cache and data-cache are stalled simultaneously)
3         Data-cache bank conflicts
4         Data-cache read misses
5         Data-cache write misses
6         Data-cache stall cycles (that are not also instruction-cache stall cycles)
7         Data-cache copyback to SDRAM
8         Copyback buffer full
9         Data-cache write miss with all fetch units occupied
10        Data cache stream miss
11        Prefetch operation started and not discarded
12        Prefetch operation discarded (because it hits in the cache or there is no fetch unit available)
13        Prefetch operation discarded (because it hits in the cache)
14–15     Reserved

Table 5-15. MMIO register summary

Name                   Description
DRAM_BASE              Sets location of the DRAM aperture
DRAM_LIMIT             Sets size of the DRAM aperture
DRAM_CACHEABLE_LIMIT   Divides DRAM aperture into cacheable and non-cacheable portions
MEM_EVENTS             Selects which two events will be counted by timer/counters
DC_LOCK_CTL            Data-cache locking enable and aperture control
DC_LOCK_ADDR           Sets low address of the data-cache address lock aperture
DC_LOCK_SIZE           Sets size of the data-cache address lock aperture
DC_PARAMS              Read-only register with data-cache parameter information
IC_PARAMS              Read-only register with instruction-cache parameter information
IC_LOCK_CTL            Instruction-cache locking enable
IC_LOCK_ADDR           Sets low address of the instruction-cache address lock aperture
IC_LOCK_SIZE           Sets size of the instruction-cache address lock aperture
MMIO_BASE              Sets location of the MMIO aperture


Video In Chapter 6

by Gert Slavenburg

6.1 VIDEO IN OVERVIEW

In this document, the generic PNX1300 name refers

to the PNX1300 Series, or the PNX1300/01/02/11

products.

The Video In (VI) unit provides the following functions:

• Digital video input from a digital camera or analog

camera (using a video decoder).

• High-bandwidth (81 MB/sec) raw input data channel.

• Direct 8-10 bit interface for video A/D converters at

up to 81-MHz sample rate.

• Receiver port for PNX1300-to-PNX1300 unidirectional
message passing.

The VI unit operates in one of the modes per Table 6-1.

Digital video input is in YUV 4:2:2 with 8-bit resolution
multiplexed in CCIR656 format¹ from a digital camera or
CCIR656-capable video decoder (such as the Philips
SAA7111 or SAA7113), across an 8-bit-wide interface.
Resolutions up to CCIR601 are accepted at 50 or 60
fields per second. A programmable rectangular image is
captured from a video frame and written in planar format
to PNX1300 SDRAM. The video camera or decoder can
be programmed using the PNX1300 I2C bus. In fullres
capture mode, luminance (Y) and chrominance (U, V)
pass unmodified. In halfres capture mode, luminance
and chrominance are horizontally decimated by a factor
of two to convert to CIF-like resolution with YUV 4:2:2 or
MPEG sampling rules. If vertical subsampling on chrominance
is desired, it can be performed by software on the
DSPCPU or by the on-chip image coprocessor (ICP).

When operating as raw input data channel, VI accepts
8-bit-wide data. The operation mode is raw8 capture. No
data selection or data interpretation is done. Data is written
in packed form, four bytes to a word, to local SDRAM.
There is no hardware control over the rate at which the
source sends data. Instead, VI maintains two pointer/counter
registers to ensure that no data is lost when the
local SDRAM memory buffer fills. Data is accepted at the
clock of the sender. If desired, VI_CLK can be programmed
as an output to drive the data transfer at a programmable
rate.

VI can accept raw data from up to 10-bit A/D converters,
at sampling rates up to 81 MHz. VI can operate in raw8,
raw10u, or raw10s capture mode for eight-bit, unsigned
10-bit or signed 10-bit data. In the 10-bit modes, data is
zero- or sign-extended to 16 bits and stored in packed
form in local SDRAM. As with the raw8-capture mode, VI
maintains two pointer/counter registers to ensure that no
data is lost when the local SDRAM memory buffer fills.
Data is accepted at the externally set sampling rate. If
desired, VI_CLK can be programmed as an output to
serve as a programmable sampling clock.

VI can act as receiver from the Enhanced Video Out

(EVO) unit of another PNX1300. One EVO unit can

broadcast to multiple receiving VIs. In this message

passing mode, no data selection or data interpretation is

done. Each message of the sender is written as byte-

packed data to a separate local SDRAM memory buffer.

Message start and end is indicated by the sender. The

receiving VI will accept data until the sender indicates

message end or until the current memory buffer is full. If

the memory buffer fills before message end is encoun-

tered, the received data is truncated and an error condi-

tion is raised.

6.1.1 Interface

Besides the VI-specific pins in Table 6-2, the PNX1300

I2C interface is typically used to control the external cam-

era or video decoder.

Figure 6-1 through Figure 6-4 illustrate typical connec-

tions for commonly used external sources. Note that

VI_DVALID is only used in special circumstances, e.g.

when sending data through a channel that results in

clock periods both with and without data transfers.

Table 6-1. VI unit mode selection.

Mode        Function          Explanation
0000        fullres capture   YUV 4:2:2 capture, no decimation
0001        halfres capture   YUV 4:2:2 capture, decimate by 2
0010        raw8 capture      raw 8-bit data capture, pack 4 bytes to a word
0011        raw10s capture    raw 10-bit data capture, sign-extend to 16 bits, pack 2 to a word
0100        raw10u capture    raw 10-bit data capture, zero-extend to 16 bits, pack 2 to a word
0101        message passing   message reception from EVO
0110–1111   Reserved

1. Refer to CCIR recommendation 656: interfaces for dig-

ital component video signals in 525-line and 625-line

television systems. Recommendation 656 is included in

the Philips Desktop Video Data Handbook.


6.1.2 Diagnostic Mode

The VI logic can be set to operate in diagnostic mode,

which connects the inputs of VI to the outputs of the EVO

unit. This mode provides boot diagnostics with the ability

to verify major operational aspects of the chip before

handing control to an operating system.

Diagnostic mode is entered by writing a control word with

a ‘1’ in the DIAGMODE bit position to the VI_CTL register

(see Figure 6-11). The EVO unit has to be set up to provide
a clock before starting DIAGMODE. After a VI software
reset, the DIAGMODE bit has to be set back to ‘1’.

In diagnostic mode, the VI signals are exactly as shown

in Figure 6-2, except that the inputs come from the on-

chip EVO unit. Note that the inputs are truly taken from

the PNX1300 EVO external pins, i.e. if an external (board

level) source is driving EVO pins, diagnostic mode is not

capable of testing the EVO unit.

Note that the diagnostic mode only controls an input multiplexer.
VI can be programmed and operated in all usual
modes. The raw modes are particularly attractive for diagnostic
purposes, since they allow VI to operate almost
as an on-chip logic analyzer.

6.1.3 Power Down and Sleepless

The VI unit enters power down state whenever PNX1300

is put in global power down mode, except if the SLEEP-

LESS bit in VI_CTL is set. In the latter case, the block

continues DMA operation and will wake up the DSPCPU

whenever an interrupt is generated.

The VI block can be separately powered down by setting
a bit in the BLOCK_POWER_DOWN register. Refer
to Chapter 21, “Power Management.”

It is recommended that the VI unit be stopped (by negating
VI_CTL.CAPTURE_ENABLE) before block-level
power down is started, or that SLEEPLESS mode be
used when global power down is activated.

6.1.4 Hardware and Software Reset

Video In is reset by a PNX1300 hardware reset (pin

TRI_RESET#) or by a VI software reset. The latter is ac-

complished by writing a control word of 0x00080000 to

the VI_CTL register. After a software reset, allow for 5

video clock cycles delay before enabling VI capture.

Upon hardware or software reset, the VI_CTL,
VI_STATUS, and VI_CLOCK registers are set to all ‘0’s.
The state of the other registers after RESET is undefined.
Note that the VI clock has to be present while applying
the software reset.

Table 6-2. VI unit interface pins

Pin             Type    Description
VI_CLK          I/O-5   • If configured as input (power-up default): a positive
                          transition on this incoming video clock pin samples all
                          other VI_DATA input signals below if VI_DVALID is HIGH.
                          If VI_DVALID is LOW, VI_DATA is ignored. Clock and data
                          rates of up to 81 MHz are supported. PNX1300 supports an
                          additional mode where VI_DATA[9:8] in message passing
                          mode are not affected by the VI_DVALID signal; see
                          Section 6.6.1.
                        • If configured as output: programmable output clock to
                          drive an external video A/D converter. Can be programmed
                          to emit integral dividers of DSPCPU_CLK.
                        • See Section 6.2 for clock programming details.
VI_DVALID       IN-5    VI_DVALID indicates that valid data is present on the
                        VI_DATA lines. If HIGH, VI_DATA will be accepted on the
                        next VI_CLK positive edge. If LOW, no VI_DATA will be
                        sampled. PNX1300 supports an additional mode where
                        VI_DATA[9:8] in message passing mode are not affected by
                        the VI_DVALID signal; see Section 6.6.1.
VI_DATA[7:0]    IN-5    CCIR656-style YUV 4:2:2 data from a digital camera, or
                        general-purpose high-speed data input pins. Sampled on
                        positive transitions of VI_CLK if VI_DVALID is HIGH.
VI_DATA[9:8]    IN-5    Extension high-speed data input bits to allow use of
                        10-bit video A/D converters in raw10 modes. VI_DATA[8]
                        serves as START and VI_DATA[9] as END message input in
                        message passing mode. Sampled on positive transitions of
                        VI_CLK if VI_DVALID is HIGH. PNX1300 supports an
                        additional mode where VI_DATA[9:8] in message passing
                        mode are not affected by the VI_DVALID signal; see
                        Section 6.6.1.

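[Figure 6-1: connection diagram. The camera cable connector signals DATA[7:0], CLOCK, SDA/SCL and GND pass through termination and receivers to PNX1300 VI_DATA[7:0], VI_CLK, the I2C bus (SDA, SCL) and VSS. VI_DVALID is tied to logic ‘1’ and VI_DATA[9:8] are tied to GND.]

Figure 6-1. VI connected to an 8-bit CCIR656 digital camera.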

[Figure 6-2: connection diagram. PNX1300 1 (sender) drives VO_DATA[7:0], VO_CLK, VO_IO1 (STMSG) and VO_IO2 (ENDMSG) into VI_DATA[7:0], VI_CLK, VI_DATA[8] and VI_DATA[9] of PNX1300 2 (receiver). VI_DVALID is tied to logic ‘1’.]

Figure 6-2. VI unit connected to an EVO unit of another PNX1300.

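[Figure 6-3: connection diagram. An SAA7111 video decoder (24.576 MHz, analog video inputs: 1–2 S-VHS Y/C, 1–4 CVBS) drives VPO[15:8] and LLC into PNX1300 VI_DATA[7:0] and VI_CLK. SCL and SDA connect to IIC_SCL and IIC_SDA on the I2C bus, which also runs to other I2C devices. VI_DVALID is tied to logic ‘1’ and VI_DATA[9:8] are tied to GND.]

Figure 6-3. VI unit connected to a video decoder.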


6.2 CLOCK GENERATOR

The VI block can operate in two distinct clocking modes,

as controlled by the VI_CLOCK control register (see

Figure 6-11).

SELFCLOCK = 0: ‘External clocking mode’. This is the

most common mode of operation. In this mode, the

VI_CLK pin is an asynchronous clock input. All other in-

puts are sampled on positive edges of the VI_CLK clock

signal. On-chip synchronizers ensure reliable asynchro-

nous capture. This mode can be combined with DIAG-

MODE, in which case the EVO clock acts as the asyn-

chronous clock source. In external clocking mode, the

value of DIVIDER is ignored.

SELFCLOCK = 1: ‘Internal clocking mode’. This

mode is typically intended for use with external A/D con-

verters or other sources that require a clock. In this

mode, VI_CLK is an output pin. Positive edges of

VI_CLK are used to sample all other inputs. The gener-

ated clock frequency can be programmed using the DI-

VIDER field in the VI_CLOCK register.

On RESET, VI_CLOCK is set to zero, i.e. external clocking
mode is the default, with DIVIDER ignored.
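As a worked example of internal clocking mode, the sketch below applies the relation f_VI_CLK = f_DSPCPU / DIVIDER and picks the smallest integer DIVIDER that keeps VI_CLK at or below the 81 MHz limit quoted for the VI pins. The function names and the example DSPCPU frequencies are hypothetical, and the valid range of the DIVIDER field is not stated in this section, so the sketch only assumes a positive integer.

```python
VI_CLK_MAX_HZ = 81_000_000  # maximum VI clock rate quoted in Table 6-2


def vi_clk_hz(f_dspcpu_hz, divider):
    """Generated VI_CLK frequency for SELFCLOCK = 1."""
    if divider < 1:
        raise ValueError("DIVIDER is assumed to be a positive integer")
    return f_dspcpu_hz / divider


def smallest_divider(f_dspcpu_hz, max_hz=VI_CLK_MAX_HZ):
    """Smallest integer DIVIDER that keeps VI_CLK at or below max_hz."""
    # Integer ceiling division avoids floating-point rounding surprises.
    return max(1, -(-f_dspcpu_hz // max_hz))
```

For a hypothetical 180 MHz DSPCPU clock, smallest_divider() yields 3, which gives a 60 MHz VI_CLK.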

6.3 FULLRES CAPTURE MODE

In fullres capture mode, the VI unit receives all three vid-

eo components Y, U, and V, as well as synchronization

information (SAV and EAV codes) on the VI_DATA[7:0]

pins in CCIR656 format. See Figure 6-8. The three video

components Y, U, and V are separated into three differ-

ent streams. Each component is written in packed form

into separate Y, U, and V buffers in the SDRAM. This is

commonly called a planar format¹ (see Figure 6-10).

The CCIR656 standard specifies that the camera has to

obey the sampling rules illustrated in Figure 6-5. VI is ca-

pable of chrominance resampling, and can produce sam-

ples in memory in two ways:

VI_CTL.SC=0. ‘Co-sited sampling’ places luminance

and chrominance samples in memory without any modi-

fication. Hence, a planar format results with sampling po-

sitions as per co-sited luminance and chrominance YUV

4:2:2 convention.

[Figure 6-4: connection diagram. A 10-bit video A/D converter, fed by analog video, drives PNX1300 VI_DATA[9:0] and is clocked through VI_CLK. VI_DVALID is tied to logic ‘1’.]

Figure 6-4. VI connected to a 10-bit video A/D converter.

$$f_{\mathrm{VICLK}} = \frac{f_{\mathrm{DSPCPU}}}{\mathrm{DIVIDER}}$$

1. The planar format is most suitable as input to software

compression algorithms.

[Figure 6-5: sampling grid in which chrominance (U,V) samples coincide with every second luminance sample.]

Figure 6-5. Camera YUV 4:2:2 sampling (co-sited luminance/chrominance).


VI_CTL.SC=1: ‘Interspersed sampling’ serves to gen-

erate a sampling structure in memory where chromi-

nance samples are spatially midway between luminance

samples, as shown in Figure 6-6. This ‘interspersed’ for-

mat is suitable for use in MPEG-1 encoding.

The VI hardware applies a (–1 13 5 –1)/16 filter as illustrated
in Figure 6-6 to the chrominance samples before
writing them to memory. This filter computes chrominance
values at sample points midway between luminance
samples¹. Computed video data is clamped to
01h if the filter result is less than 01h and clamped to FFh
if greater than FFh. Interspersed data format is preferred
by some video compression standards. The MPEG-1
standard, for example, requires YUV 4:2:0 data with
chrominance sampling positions horizontally and vertically
midway between luminance samples. This can be
achieved from the horizontally interspersed sampling format
by vertical subsampling with a (1 1)/2 or more sophisticated
filter. Vertical filtering can be performed in
software using the DSPCPU’s efficient multimedia operations
or by hardware in the on-chip ICP.

[Figure 6-6: YUV 4:2:2 CCIR656 input samples a, b, c, ... l and the resampled sample values:]

$$Y_g' = Y_g$$

$$U_{ef} = \frac{-U_c + 13U_e + 5U_g - U_i}{16}$$

$$V_{ef} = \frac{-V_c + 13V_e + 5V_g - V_i}{16}$$

Figure 6-6. Chrominance re-sampling to achieve interspersed sampling.

[Figure 6-7: the active area runs from pixel a to pixel zz. At the left edge the image is extended as ... d c b | a b c d e f g h i j ..., and at the right edge as ... zs zt zu zv zw zx zy zz | zy zx zw ...]

Figure 6-7. Filtering at the edge of the active area.

[Figure 6-8: the timing reference code follows the preamble 11111111 00000000 00000000 and has the form 1FVHPPPP, where:
F = 0 during field 1, F = 1 during field 2
V = 1 during field blanking, V = 0 elsewhere
H = 0 for SAV, H = 1 for EAV
PPPP = protection bits (error correction)]

Figure 6-8. Format of CCIR656 SAV and EAV timing reference codes.

[Figure 6-9: the captured image, WIDTH pixels by HEIGHT lines, starts at (START_X, START_Y) within an incoming field of lines 0 to N–1, each with pixels 0 to M–1.]

Figure 6-9. VI capture parameters.

1. All filters perform full precision intermediate computations
and saturation upon generating the result bits.

The filtering process exercises special care at the left

and right edges of the active area of the CCIR656 data

stream, as defined by the SAV, EAV code positions. See

Figure 6-7. Since no pixels exist to the left of the first pixel
or to the right of the last pixel, filtering can result in artifacts.
To minimize artifacts, the image is extended by

mirroring pixels around the left-most and right-most pixel.

Note that the image is mirrored around pixel ‘a’, the first

pixel after the SAV code and around pixel ‘zz’, the last

pixel before the EAV¹ code. Pixel ‘a’ in Figure 6-7 is the

(chroma, luma) pair defined by the first three camera

bytes of the UYVYUYVY... stream after SAV.
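The interspersed chrominance filter and the edge mirroring described above can be modeled as follows. This is a minimal sketch, not the hardware data path: the function names are hypothetical, positions are counted in luminance pixels (pixel ‘a’ is 0), the line is assumed to be at least four pixels wide, and the round-to-nearest step before the division by 16 is an assumption, since the text only states that intermediate results are kept at full precision and then saturated.

```python
def mirror(x, width):
    """Reflect an out-of-range pixel position around the first or last pixel."""
    if x < 0:
        return -x                     # mirror around pixel 0 ('a')
    if x > width - 1:
        return 2 * (width - 1) - x    # mirror around the last pixel ('zz')
    return x


def interspersed_chroma(chroma, width):
    """Apply the (-1 13 5 -1)/16 filter to one line of co-sited chroma.

    chroma[k] is the U (or V) sample co-sited with luminance pixel 2*k;
    width is the even number of luminance pixels in the line.
    """
    def tap(x):
        return chroma[mirror(x, width) // 2]

    out = []
    for p in range(0, width, 2):
        # Output lies midway between luminance pixels p and p + 1.
        acc = -tap(p - 2) + 13 * tap(p) + 5 * tap(p + 2) - tap(p + 4)
        value = (acc + 8) >> 4        # assumed rounding, then divide by 16
        out.append(max(0x01, min(0xFF, value)))  # clamp to 01h..FFh
    return out
```

On a linear ramp the filter reproduces the expected quarter-step interpolation: with chroma samples 0, 16, 32, 48 the second output is 20.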

Refer to Figure 6-11 for an overview of the memory

mapped I/O (MMIO) registers that are used to control

and observe the operation of VI in fullres capture mode.

To ensure compatibility with future devices, any unde-

fined MMIO bits should be ignored when read and written

as ‘0’s.

Upon hardware or software reset (Section 6.1.4, “Hardware
and Software Reset”), the VI_CTL, VI_STATUS,
and VI_CLOCK registers are set to all zeros.

At any point in time, the VI_STATUS register fields (see

Figure 6-11) indicate the current camera status:

• CUR_X: The pixel index (0 to M–1) of the most

recently received camera pixel. CUR_X gets set to

zero for the first pixel following receipt of a SAV

code², and is incremented on every valid Y sample

received thereafter.

• CUR_Y: The line index (0 to N–1) within the current

field of the camera line that is currently being

received. CUR_Y gets set to zero upon receipt of a

negative edge of V, i.e., upon the first SAV code con-

taining V=0 after one or more SAV codes containing

V=1. This is equivalent to the first line after the end of

vertical retrace. CUR_Y gets incremented upon

every successive SAV code.

• FIELD2: Indicates whether the field currently being
received is field1 or field2. This flag gets updated based
on the F field of every received SAV code. Note that
field1 is the ‘top’ field, i.e. the field containing the topmost
visible line. Field1 contains lines 1, 3, 5, etc.
Field2 contains lines 2, 4, 6, 8, etc.
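The status updates listed above can be illustrated with a small model that decodes the 1FVHPPPP timing reference byte of Figure 6-8 and tracks CUR_Y and FIELD2. The names are hypothetical. The protection-bit equations are taken from the CCIR656 recommendation and are not given in this section, and the sketch simply ignores any code whose protection bits do not match, whereas the VI hardware is stated to correct single-bit errors.

```python
def decode_timing_code(byte):
    """Split a timing reference byte 1FVHPPPP into (F, V, H, protection_ok)."""
    if not byte & 0x80:
        raise ValueError("bit 7 of a timing reference code must be 1")
    f = (byte >> 6) & 1
    v = (byte >> 5) & 1
    h = (byte >> 4) & 1
    # Protection bits per CCIR656: P3 = V^H, P2 = F^H, P1 = F^V, P0 = F^V^H.
    expected = ((v ^ h) << 3) | ((f ^ h) << 2) | ((f ^ v) << 1) | (f ^ v ^ h)
    return f, v, h, (byte & 0x0F) == expected


class FieldTracker:
    """Tracks CUR_Y and FIELD2 from the sequence of received SAV codes."""

    def __init__(self):
        self.cur_y = 0
        self.field2 = 0
        self._prev_v = 0

    def on_timing_code(self, byte):
        f, v, h, ok = decode_timing_code(byte)
        if h != 0 or not ok:
            return                    # simplification: skip EAV and bad codes
        self.field2 = f               # FIELD2 follows F of every SAV code
        if self._prev_v == 1 and v == 0:
            self.cur_y = 0            # first V=0 SAV after vertical blanking
        else:
            self.cur_y += 1           # every successive SAV code
        self._prev_v = v
```

Feeding the tracker two blanking-line SAV codes (ABh) followed by three active-line SAV codes (80h) leaves CUR_Y at 2, the third active line of the field.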

Table 6-3 illustrates common digital camera standards

and the number of active pixels per line, lines per field,

and fields per second. Note that any source is accept-

able to VI, as long as the maximum VI_CLK rate is not

exceeded.

Figure 6-9 shows the details of an incoming field and the

captured image. The incoming field consists of N hori-

zontal lines, each line having M pixels labeled 0 through

M–1. Lines are numbered from 0 through N–1. The cap-

tured image is a subset of the incoming image. It is de-

fined by the capture parameters (START_X, START_Y,

WIDTH, HEIGHT) held in the VI_CAP_START and

VI_CAP_SIZE MMIO registers (see Figure 6-11).

• START_X: defines the starting pixel number (X-coor-

dinate of the starting pixel). START_X must be even,

and greater than or equal to ‘0’.

• START_Y: defines the starting line number (Y-coor-

dinate of the starting pixel). START_Y must be

greater than or equal to ‘0’.

• WIDTH: Defines the width of the captured image in

pixels. WIDTH must be even.

• HEIGHT: Defines the height of the captured image in

lines.

Image capture starts after the following conditions are

met:

• VI_CTL.CAPTURE ENABLE is asserted.

• VI_STATUS.CAPTURE COMPLETE is de-asserted,

indicating that any previously captured image has

been acknowledged.

• CUR_Y = START_Y occurs.

Once image capture is started, HEIGHT ‘lines’ are cap-

tured. Each line capture starts if:

• The previous line capture, if any, is completed.

• CUR_X = START_X

Once line capture starts, it continues for 2*WIDTH pixel
clocks³ in which VI_DVALID is asserted, irrespective of
the presence of one or more EAV codes.

Note that capture continues regardless of any horizontal

or vertical retrace and associated CUR_Y or CUR_X re-

set. This provides special applications with the ability to

capture information embedded inside the horizontal or

vertical blanking interval. If it is desirable to capture pixels
in the horizontal blanking interval, a minimum time
separation of 1 µs is required between the last pixel captured
on line y and the first pixel captured on line y+1. An
exception to this rule is allowed if and only if the storage
parameters below are chosen such that the last and first

1. EAV codes with multiple bit errors are accepted and en-

able the mirroring function.

2. Note that VI uses the SAV protection bits to implement

single error correction and double error detection. An

SAV code with double error is ignored.

Table 6-3. Common video source parameters.

Video Source                    M (# active pixels)   N (# active lines)   Field Rate (Hz)
CCIR601, 50 Hz/625 lines        720                   288                  50
CCIR601, 60 Hz/525 lines        720                   240                  60
square pixel, 50 Hz/625 lines   768                   288                  50
square pixel, 60 Hz/525 lines   640                   240                  60

3. Four clocks for each Cb,Y,Cr,Y group representing two

luminance pixels


pixel end up in adjacent memory locations. Note that

blanking information capture only makes sense in fullres

mode with co-sited sampling. All other modes apply filter-

ing, which will distort the numeric sample values.

The captured image is stored in SDRAM at a location defined
by the storage parameters in MMIO registers
(Y_BASE_ADR, Y_DELTA, U_BASE_ADR, U_DELTA,
V_BASE_ADR, V_DELTA). Note that the base-address
registers force alignment to 64-byte boundaries (six
LSBs are always zero). The default memory packing is
big-endian, although little-endian packing is also supported
by setting the LITTLE_ENDIAN bit in the VI_CTL register.

• Y_BASE_ADR: The desired starting (byte) address

in SDRAM memory where the first Y (luminance)

sample of the captured image will be stored. This

address is forced to be 64-byte aligned (six LSBs

always ‘0’).

• Y_DELTA: The desired address difference between

the last sample of a line and the address of the first

sample on the next line. Note that the value of

Y_DELTA must be chosen so that all line-start

addresses are 64-byte aligned.

• U_BASE_ADR, U_DELTA, V_BASE_ADR,

V_DELTA: Same functions and alignment restric-

tions as above, but for chrominance-component

samples.

Horizontally-adjacent samples are stored at successive

byte addresses, resulting in a packed form (four 8-bit

samples are packed into one 32-bit word). Upon horizon-

tal retrace, pixel storage addresses are incremented by

the corresponding DELTA to compute the starting byte

address for the next line. Note that DELTA is a 16-bit unsigned
quantity. This process continues until HEIGHT

lines of WIDTH samples have been stored in memory for

luminance (Y). For chrominance, HEIGHT lines of half

the WIDTH are stored¹. See Figure 6-10.

Modifications to Y_BASE_ADR, U_BASE_ADR and

V_BASE_ADR have no effect until the start of next cap-

ture, i.e. VI hardware maintains a separate pointer to

track the current address. Modifications to Y_DELTA,

U_DELTA and V_DELTA do affect the next horizontal re-

trace. Hence, under normal circumstances, the DELTA

variables should not be changed during capture.

When capture is complete, i.e. any internal VI buffers
have been flushed and the entire captured image is in local
SDRAM, VI raises the STATUS register flag CAPTURE
COMPLETE. If enabled in the VI_CTL register,
this event causes a DSPCPU interrupt to be requested.

The programmer can determine whether the captured

image is a field1 or field2 by inspection of the FIELD2 flag

in VI_STATUS. Note that the FIELD2 flag changes at the

start of the vertical blanking interval of the next field.

The CAPTURE COMPLETE flag is cleared by writing a

word to VI_CTL with a ‘1’ in the CAPTURE COMPLETE

ACK bit position. This action has the following effect:

• it tells the hardware that a new Y,U, and V DMA

buffer is available (or the old one has been copied)

• it clears the CAPTURE COMPLETE flag

• it tells VI to capture the next image

The user can program the Y_THRESHOLD field to gen-

erate pre-completion (or post-completion) interrupts.

Whenever CUR_Y reaches Y_THRESHOLD, the

THRESHOLD REACHED flag in the STATUS register is

set. If enabled in the VI_CTL register, this event causes

a DSPCPU interrupt request. The THRESHOLD

REACHED flag is cleared by writing a word to VI_CTL

with a ‘1’ in the THRESHOLD REACHED ACK bit position.
Note that, due to internal buffering in the VI unit, it is
NOT guaranteed that all samples from lines up to and including
CUR_Y have been written to local SDRAM upon
THRESHOLD REACHED. The implementation guarantees
a fixed maximum time of 2 µs between raising the
interrupt and completion of all writes to SDRAM. The
THRESHOLD interrupt mechanism works regardless of
CAPTURE ENABLE. Hence, it can also be used to skip
a desired number of fields without constant DSPCPU
polling of VI_STATUS.

1. Note that consecutive pixel components of each line
are stored in consecutive memory addresses, but consecutive
lines need not be in consecutive memory addresses.

[Figure 6-10: planar memory layout. The Y plane starts at Y_BASE_ADR and holds HEIGHT lines of WIDTH samples (pix0, pix1, pix2, ... pix W–1), with Y_DELTA between lines. The U plane starts at U_BASE_ADR and holds HEIGHT lines of WIDTH/2 samples (pix0, pix2, ...), with U_DELTA between lines. The same layout is repeated for V_BASE_ADR and V_DELTA.]

Figure 6-10. VI YUV 4:2:2 planar memory format.

If VI internal buffers overflow due to insufficient internal

data-highway bandwidth allocation, the HIGHWAY

BANDWIDTH ERROR condition is raised in the

VI_STATUS register. If enabled, this causes assertion of

a VI interrupt request. Capture continues at the correct

memory address as soon as the internal buffers can be

written to memory, but one or more pixels may have

been lost, and the corresponding memory locations are

not written. The HBE condition can be cleared by writing

a ‘1’ to the HIGHWAY BANDWIDTH ERROR ACK bit in

VI_CTL. Refer to Section 6.7, “Highway Latency and

HBE” for more information.

Any interrupt event of VI (CAPTURE COMPLETE,

THRESHOLD REACHED, HIGHWAY BANDWIDTH ER-

ROR) leads to the assertion of a single VI interrupt

(SOURCE 9) to the PNX1300 Vectored Interrupt Control-

ler. The interrupt handler routine should check the STA-

TUS register to determine the set of VI events associated

with the request. The vectored interrupt controller should

always be set to have VI (SOURCE 9) operate in level

sensitive mode. This ensures that each event is handled.

VI asserts the interrupt request line as long as one or

more enabled events are asserted. The interrupt handler

clears one or more selected events by writing a ‘1’ to the

corresponding ACK field in VI_CTL. The clearing of the

last event leads to immediate (next DSPCPU clock edge)

de-assertion of the interrupt request line to the Vectored

Interrupt Controller. See Section 3.5.3, “INT and NMI

(Maskable and Non-Maskable Interrupts),” for information
on how to program interrupt handler routines.

[Figure 6-11: register layout diagram, bits 31 down to 0. Registers (with MMIO_base offsets) and the fields shown:

VI_STATUS (r)        0x10 1400   CUR_Y (12 bits), CUR_X (12 bits), FIELD2, HBE (highway bandwidth error), Threshold reached, Capture complete
VI_CTL (r/w)         0x10 1404   Y_THRESHOLD, MODE, SLEEPLESS, DIAGMODE, software RESET, Capture enable, Little endian, SC (sampling conventions: 0 = co-sited, 1 = interspersed), HBE INT enable, Threshold reached INT enable, Capture complete INT enable, Highway bandwidth error ACK, Threshold reached ACK, Capture complete ACK (write ‘1’ to ACK)
VI_CLOCK (r/w)       0x10 1408   SELFCLOCK, DIVIDER
VI_CAP_START (r/w)   0x10 140C   START_Y, START_X
VI_CAP_SIZE (r/w)    0x10 1410   HEIGHT, WIDTH
VI_Y_BASE_ADR (r/w)  0x10 1414   Y_BASE_ADR (six LSBs 000000)
VI_U_BASE_ADR (r/w)  0x10 1418   U_BASE_ADR (six LSBs 000000)
VI_V_BASE_ADR (r/w)  0x10 141C   V_BASE_ADR (six LSBs 000000)
VI_UV_DELTA (r/w)    0x10 1420   U_DELTA (16 bits), V_DELTA (16 bits)
VI_Y_DELTA (r/w)     0x10 1424   Y_DELTA (16 bits)

Unlabeled bits are RESERVED.]

Figure 6-11. YUV capture view of VI MMIO registers.


6.4 HALFRES CAPTURE MODE

Halfres capture mode is identical in operation to fullres

capture mode except that horizontal resolution is re-

duced by a factor of two on both luminance and chromi-

nance data.

Referring to Figure 6-9 and Figure 6-11, if VI is pro-

grammed to capture HEIGHT lines of WIDTH pixels in

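[Figure 6-12: halfres planar memory layout. The Y plane starts at Y_BASE_ADR and holds HEIGHT lines of WIDTH/2 samples (pix0, pix1, pix2, ... pix W/2–1), with Y_DELTA between lines. The U plane starts at U_BASE_ADR and holds HEIGHT lines of WIDTH/4 samples, with U_DELTA between lines. The same layout is repeated for V_BASE_ADR and V_DELTA.]

Figure 6-12. VI halfres planar memory format.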

[Figure 6-13: YUV 4:2:2 CCIR656 input samples a, b, c, ... l and the halfres capture sample results:]

$$U_f' = \frac{-3U_c + 19U_e + 19U_g - 3U_i}{32}$$

$$V_f' = \frac{-3V_c + 19V_e + 19V_g - 3V_i}{32}$$

$$Y_h' = \frac{-3Y_e + 19Y_g + 32Y_h + 19Y_i - 3Y_k}{64}$$

Figure 6-13. Halfres co-sited sample capture.

[Figure 6-14: YUV 4:2:2 CCIR656 input samples a, b, c, ... l and the halfres capture sample results:]

$$Y_g' = \frac{-3Y_d + 19Y_f + 32Y_g + 19Y_h - 3Y_j}{64}$$

$$U_f' = \frac{-3U_c + 19U_e + 19U_g - 3U_i}{32}$$

$$V_f' = \frac{-3V_c + 19V_e + 19V_g - 3V_i}{32}$$

Figure 6-14. Halfres interspersed sample capture.


halfres mode, the resulting captured planar data is as

shown in Figure 6-12. Note that WIDTH/2 luminance and

WIDTH/4 chrominance samples are captured. In this

mode, START_X and WIDTH must be multiples of four.
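The horizontal capture constraints and the resulting per-line sample counts for fullres and halfres modes can be summarized in a short sketch. The function names are hypothetical and the checks cover only the rules stated in this chapter (START_X and WIDTH even in fullres mode, multiples of four in halfres mode); any limits imposed by register field widths are not modeled.

```python
def check_capture_window(start_x, width, halfres):
    """Return a list of rule violations for the horizontal capture window."""
    multiple = 4 if halfres else 2
    problems = []
    if start_x < 0 or start_x % multiple:
        problems.append(f"START_X must be a non-negative multiple of {multiple}")
    if width <= 0 or width % multiple:
        problems.append(f"WIDTH must be a positive multiple of {multiple}")
    return problems


def samples_per_line(width, halfres):
    """Samples stored per captured line in each of the Y, U and V planes."""
    y = width // 2 if halfres else width
    return {"Y": y, "U": y // 2, "V": y // 2}
```

For a 720-pixel CCIR601 line this gives 720/360/360 samples per line in fullres mode and 360/180/180 in halfres mode.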

Horizontal-resolution reduction is performed as shown in
Figure 6-13 or Figure 6-14. The spatial sampling convention
of the pixels in memory depends on the SC
(sampling convention) bit in the VI_CTL register. Assuming
that the camera sampling positions obey the conventions
shown in Figure 6-5, two possible spatial formats

are supported in memory:

• If SC=0, co-sited luminance and chrominance sam-

ples result as shown in Figure 6-13. This corre-

sponds to the standard YUV 4:2:2 sampling

conventions.

• If SC=1, interspersed chrominance samples result,

as shown in Figure 6-14. This form is (after vertical

subsampling of the chroma components) identical to

the MPEG-1 sampling conventions. If vertical sub-

sampling is desired, it can either be performed in

software on the DSPCPU or in hardware by the ICP.

The filtering process applies mirroring at the edge of the

active video area, as per Figure 6-7.

For both filters, computed video data is clamped to 01h if the result of the filter is less than 01h and clamped to FFh if it is greater than FFh.
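The filtering and clamping steps can be illustrated with a minimal Python sketch. It assumes the tap weights that Figure 6-13 and Figure 6-14 appear to show, (-3, 19, 32, 19, -3)/64 for luminance and (-3, 19, 19, -3)/32 for chrominance. The function names are hypothetical, and the rounding (floor division) is an assumption; the hardware rounding rule is not stated here. Edge mirroring is left out.

```python
def clamp_video(value):
    """Clamp a filter result to the 01h..FFh range described in the text."""
    return max(0x01, min(0xFF, value))


def halfres_luma(y, center):
    """5-tap luminance filter centred on y[center] (weights sum to 64).

    Assumption: floor division stands in for the unspecified hardware rounding.
    """
    acc = (-3 * y[center - 3] + 19 * y[center - 1] + 32 * y[center]
           + 19 * y[center + 1] - 3 * y[center + 3])
    return clamp_video(acc // 64)


def halfres_chroma(c, left):
    """4-tap chrominance filter between c[left] and c[left + 1] (weights sum to 32).

    c holds consecutive chroma samples of one component (U or V).
    """
    acc = -3 * c[left - 1] + 19 * c[left] + 19 * c[left + 1] - 3 * c[left + 2]
    return clamp_video(acc // 32)
```

A flat input passes through unchanged, while an all-zero input is raised to 01h by the clamp.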

6.5 RAW CAPTURE MODES

All raw capture modes (raw8, raw10s and raw10u) be-

have similarly. VI_DATA information is captured at the

rate of the sender’s clock, without any interpretation or

start/stop of capture on the basis of the data values. Any

clock cycle in which VI_DVALID is asserted leads to the

capture of one data sample. Samples are 8 or 10 bits

long (raw8 versus raw10 modes). For the 8-bit capture

mode, four samples are packed to a word. For the 10-bit

capture modes, two 16-bit samples are packed to a

word. The extension from 10 to 16 bits uses sign exten-

sion (raw10s) or zero extension (raw10u).
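The 10-to-16-bit extension can be sketched as follows. This is an illustration only; the function name `extend_raw10` is hypothetical and not part of any PNX1300 software interface.

```python
def extend_raw10(sample, signed):
    """Extend a 10-bit sample to 16 bits.

    signed=True models raw10s (sign extension of bit 9);
    signed=False models raw10u (zero extension).
    """
    sample &= 0x3FF                  # keep the 10 captured bits
    if signed and (sample & 0x200):  # bit 9 is the sign bit in raw10s
        sample |= 0xFC00             # replicate the sign into bits 15..10
    return sample
```

For example, the 10-bit value 3FFh becomes FFFFh in raw10s mode and 03FFh in raw10u mode.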

For 8-bit and 16-bit capture, successive captured values

are written to increasing memory addresses. For 16-bit

capture, the byte order with which the 16-bit data is writ-

ten to memory is governed by the LITTLE ENDIAN bit.

The VI LITTLE ENDIAN bit should be set the same as the

DSPCPU endianness (PCSW.BSX). This ensures that

the DSPCPU sees correct 16-bit data.

Figure 6-15 illustrates the ‘raw-mode’ view of the VI

MMIO registers. Figure 6-16 shows the major VI states

associated with raw-mode capture. The initial state is

reached on software or hardware reset as described in

Section 6.1.4, “Hardware and Software Reset”. Upon re-

set, all status and control bits are set to ‘0’. In particular,

CAPTURE_ENABLE is set to ‘0’ and no capture takes

place.

Once the software has programmed BASE1 and BASE2

(with the start addresses of two SDRAM buffer areas1)

[Figure 6-15 register map, MMIO_BASE offsets: VI_STATUS (r) 0x10 1400; VI_CTL (r/w) 0x10 1404; VI_CLOCK (r/w) 0x10 1408; VI_BASE1 (r/w) 0x10 1414; VI_BASE2 (r/w) 0x10 1418; VI_SIZE (r/w) 0x10 141C, SIZE (in samples). Fields shown: BUF1ACTIVE, BUF1FULL, BUF2FULL, OVERFLOW (message mode only), OVERRUN, highway bandwidth error, MODE, ACK1, ACK2, ACK_OVF, ACK_OVR, highway bandwidth error ACK, SLEEPLESS, interrupt enables (BUF1FULL, BUF2FULL, OVF, OVR, highway bandwidth error), little endian, capture enable, software RESET, DIAGMODE, SELFCLOCK, DIVIDER, VALID, BASE1, BASE2, RESERVED.]

Figure 6-15. Raw and message passing modes view of VI MMIO registers.


and SIZE (in number of samples), it is safe to enable cap-

ture by setting CAPTURE_ENABLE. Note that SIZE is in

samples and must be a multiple of 64, hence setting a

minimum buffer size of 64 bytes for raw8 mode and 128

bytes for raw10 modes. At this point, buffer1 is the active

capture buffer. Data is captured in buffer1 until capture is

disabled or until SIZE samples have been captured. After

every sample, a running address pointer is incremented

by the sample size (one or two bytes). If SIZE samples

have been captured, capture continues (without missing

a sample) in buffer2. At the same time, BUF1FULL is as-

serted. This causes an interrupt on the DSPCPU, if en-

abled by BUF1FULL INTERRUPT ENABLE.

Buffer2 is now the active capture buffer and behaves as

described above. In normal operation, the DSPCPU will

respond to the BUF1FULL event by assigning a new

BASE1 and (optionally) SIZE and performing an ACK1.

If the DSPCPU fails to assign a new buffer1 and perform an ACK1 before buffer2 also fills up, the OVERRUN condition is raised and capture stops. Capture continues upon receipt of an ACK1, ACK2, or both,

regardless of the OVERRUN state. The buffer in which

capture resumes is as indicated in Figure 6-16. The

OVERRUN condition is ‘sticky’ and can only be cleared

by software, by writing a ‘1’ to the ACK_OVR bit in the

VI_CTL register.

If insufficient bandwidth is allocated from the internal

data highway, the VI internal buffers may overflow. This

leads to assertion of the HIGHWAY BANDWIDTH ERROR condition. One or more data samples are lost. Capture resumes at the correct memory address as soon as

the internal buffer is written to memory. The HBE error

condition is sticky. It remains asserted until it is cleared

by writing a ‘1’ to HIGHWAY BANDWIDTH ERROR

ACK. Refer to Section 6.7, “Highway Latency and HBE.”

Note that VI hardware uses copies of the BASE and SIZE

registers once capture has started. Modifications of

BASE or SIZE, therefore, have no effect until the start of

the next use of the corresponding buffer.

Note also that the VI_BASE1 and VI_BASE2 addresses

must be 64-byte aligned (the six LSBs are always ‘0’).
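The setup constraints stated above (64-byte aligned BASE1 and BASE2, SIZE a multiple of 64 samples, one or two bytes per sample) can be checked before capture is enabled. The following is a minimal sketch; `check_raw_capture_setup` is a hypothetical helper, not a documented driver function.

```python
# Bytes written to memory per captured sample in each raw mode.
BYTES_PER_SAMPLE = {"raw8": 1, "raw10s": 2, "raw10u": 2}


def check_raw_capture_setup(base1, base2, size, mode):
    """Return (problems, buffer_bytes) for a proposed raw-capture setup.

    problems is a list of human-readable constraint violations;
    buffer_bytes is the memory one buffer occupies for SIZE samples.
    """
    problems = []
    for name, base in (("BASE1", base1), ("BASE2", base2)):
        if base % 64 != 0:
            problems.append(f"{name} is not 64-byte aligned")
    if size <= 0 or size % 64 != 0:
        problems.append("SIZE is not a positive multiple of 64 samples")
    return problems, size * BYTES_PER_SAMPLE[mode]
```

With the minimum SIZE of 64, this gives the 64-byte (raw8) and 128-byte (raw10) minimum buffer sizes quoted in the text.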

6.6 MESSAGE-PASSING MODE

In this mode, VI receives 8-bit message data over the

VI_DATA[7:0] pins. The message data is written in

packed form (four 8-bit message bytes per 32-bit word)

to SDRAM. Message data capture starts on receipt of a

START event on VI_DATA[8]. Message data is received

until EndOfMessage (EOM) is received on VI_DATA[9]

or the receive buffer is full. Note that the VI_SIZE MMIO register determines the buffer size, and hence the maximum message length. It should not be changed without a VI

(soft) reset.

Figure 6-17 illustrates an example of an 8-byte message

transfer. The first byte (D0) is sampled on the rising edge

of the VI_CLK clock after a valid START was sampled on

the preceding rising clock edge. The last byte (D7) is

1. SDRAM buffers must start on a 64-byte boundary.

[Figure 6-16 state diagram. State labels: ACTIVE = BUF1; ACTIVE = BUF2; BUF1FULL; BUF2FULL; raise OVERRUN*. Transition labels: RESET; Buffer1 Full; Buffer2 Full; ACK1; ACK2; ACK1 & ~ACK2; ~ACK1 & ACK2; ACK1 & ACK2.]

* OVERRUN is a sticky flag. It is set but does not affect operation. It can only be cleared by software, by writing a ‘1’ to ACK_OVR. (See text in Section 6.5)

Figure 6-16. VI raw mode major states.


sampled on the rising clock edge where EOM is sampled

asserted.

The message passing mode view of the VI MMIO registers is shown in Figure 6-15. The major states are shown

in Figure 6-18. The operation is almost identical to the

operation in raw-capture mode, except that transitions to

another active buffer occur upon receipt of EOM rather

than on buffer full. OVERRUN is raised if the second

buffer receives a complete message before a new buffer is assigned by the DSPCPU.

OVERFLOW is raised if a buffer is full and no EOM has

been received. If enabled, it causes a DSPCPU interrupt.

Since digital interconnection between devices is reliable,

overflow is indicative of a protocol error between the two

PNX1300s involved in the exchange (failure to agree on

message size). Detection of overflow leads to total halt of

capture of this message. Capture resumes in the next

buffer upon receipt of the next START event on

VI_DATA[8]. The OVERFLOW flag is sticky and can only

be cleared by writing a ‘1’ to ACK_OVF.

Highway bandwidth error behavior in message passing

mode is identical to that of raw mode.

6.6.1 VI_DVALID in Message Passing Mode

PNX1300 offers a new mode where the VI_DVALID pin

does not control the sampling of the VI_DATA[9:8] pins.

These pins are used for END and START of a message.

This new mode is controlled by a new field, VALID, in the

VI_CLOCK MMIO register. The default value after RE-

SET is ‘0’.

When VI_CLOCK.VALID is set to ‘0’ (the RESET value)

then PNX1300 behaves as in TM-1300. In this case the

START and END of messages are sampled only if the

VI_DVALID pin is HIGH.

When VI_CLOCK.VALID is set to ‘1’ then PNX1300 acti-

vates the new behavior. In this case the START and END

of messages are always sampled independently of the

state of the VI_DVALID pin.

VI_CLOCK.VALID cannot be read back; it always reads 0.

[Figure 6-17 timing diagram: VI_CLK; VI_DATA[7:0] carrying XX D0 D1 D2 D3 D4 D5 D6 D7 XX XX; VI_DATA[8] marking start of message; VI_DATA[9] marking end of message.]

Figure 6-17. VI message passing signal example.

[Figure 6-18 state diagram. State labels: ACTIVE = BUF1; ACTIVE = BUF2; BUF1FULL; BUF2FULL; raise OVERRUN*. Transition labels: RESET; EOM; ACK1; ACK2; ACK1 & ~ACK2; ~ACK1 & ACK2; ACK1 & ACK2; No EOM (raise OVERFLOW*, see text in Section 6.6).]

* OVERRUN and OVERFLOW are sticky flags. They are set, but do not affect operation. They can only be cleared by software, by writing a ‘1’ to ACK_OVR or ACK_OVF. (See text in Section 6.6)

Figure 6-18. VI message passing mode major states.


6.7 HIGHWAY LATENCY AND HBE

Refer to Chapter 20, “Arbiter,” for a description of the arbiter terminology used here. The VI unit uses internal

buffering before writing data to SDRAM. There are two

internal buffers, each 16 entries of 32 bits.

In fullres mode, each internal buffer is used for 128 Y

samples, 64 U samples, and 64 V samples. Once the first

internal buffer is filled, 4 highway transactions must oc-

cur before the second buffer fills completely. Hence, the

requirement for not losing samples is:

• 4 requests must be served within 256 VI clock

cycles.

For the typical CCIR601-resolution NTSC or PAL 27-MHz VI clock rate, the latency requirement is 4 requests in 9481 ns (256000/27). This can be used as one request

every 2370 ns or, with a PNX1300 SDRAM clock speed

of 100 MHz, every 237 SDRAM clock cycles. The one re-

quest latency is used to define the priority raising value

(see Section 20.6.3 on page 20-8).

In halfres mode, the Y, U, and V decimation by 2 takes

place before writing to the internal buffers. So, the requirement for not losing samples is:

• 4 requests served within 512 VI clock cycles.

For halfres subsampling, NTSC or PAL 27-MHz VI clock rate and PNX1300 SDRAM clock speed of 100 MHz, latency is 4 requests in 512000/27 = 18962 ns (1896 highway clock cycles) or one request every 4740 ns (474 SDRAM clock cycles).

For raw8 capture and message passing modes, each internal buffer stores 64 samples at the incoming VI clock

rate. The latency requirement is one request served ev-

ery 64 VI clock cycles.

For the raw10 capture modes, each internal buffer stores

32 samples. Hence, the requirement for not losing sam-

ples is one request served every 32 VI clock cycles.

For a 38-MHz data rate on the incoming 10-bit samples

and a PNX1300 SDRAM clock speed of 100 MHz, highway latency should be set to guarantee less than 32000/38 = 842 ns (84 SDRAM clock cycles) per request.

This cannot be met if any other peripherals are enabled.

Table 6-4 summarizes the maximum allowed highway la-

tency (in SDRAM clock cycles) needed to guarantee that

no samples are lost. The general formula uses ‘F’ to rep-

resent the VI clock frequency ( in MHz).

In fullres mode, bandwidth requirements (in bytes) per

video line with active image for VI is:

• B_fullr = ceil(WIDTH*2/256) * 4 * 64

The ceil(X) function returns the least integer greater than or equal to X.

In halfres mode, the bandwidth is:

• B_halfr = ceil(WIDTH*2/512) * 4 * 64

Raw8 mode and message passing mode bandwidth depends only on VI clock speed. For raw10 mode each 10-bit value counts as 2 bytes for bandwidth computations.
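As a worked illustration of the two per-line bandwidth formulas, the following sketch evaluates them for a given WIDTH. The function name is hypothetical; the arithmetic follows the formulas as printed.

```python
def vi_line_bandwidth_bytes(width, halfres=False):
    """Bytes of highway bandwidth per active video line.

    Implements ceil(WIDTH*2/256) * 4 * 64 for fullres and
    ceil(WIDTH*2/512) * 4 * 64 for halfres.
    """
    divisor = 512 if halfres else 256
    blocks = -(-(width * 2) // divisor)  # integer ceiling division
    return blocks * 4 * 64
```

For WIDTH = 720 this gives 6 * 256 = 1536 bytes per line in fullres mode and 3 * 256 = 768 bytes per line in halfres mode.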

Table 6-4. VI highway latency requirements (27-MHz data rate, 100-MHz PNX1300 highway clock)

Mode              Max latency setting (27 MHz, 100 MHz)   Formula
fullres capture   237                                     6,400/F
halfres capture   474                                     12,800/F
raw8              237                                     6,400/F
raw10s            118                                     3,200/F
raw10u            118                                     3,200/F
message passing   237                                     6,400/F
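The Table 6-4 entries follow from the number of VI clock cycles available per highway request in each mode (64, 128, or 32, as derived in the text) scaled by the ratio of the SDRAM clock to the VI clock. A minimal sketch, with hypothetical names and truncation toward zero assumed to match the table values:

```python
# VI clock cycles available per highway request, taken from the text of Section 6.7.
VI_CYCLES_PER_REQUEST = {
    "fullres": 64,    # 4 requests per 256 VI clocks
    "halfres": 128,   # 4 requests per 512 VI clocks
    "raw8": 64,
    "raw10s": 32,
    "raw10u": 32,
    "message": 64,
}


def max_latency_sdram_cycles(mode, vi_mhz, sdram_mhz=100):
    """Maximum allowed highway latency in SDRAM clock cycles (truncated)."""
    return int(VI_CYCLES_PER_REQUEST[mode] * sdram_mhz / vi_mhz)
```

With a 100-MHz SDRAM clock this reduces to 6,400/F, 12,800/F, or 3,200/F, the formulas given in the table.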


Enhanced Video Out Chapter 7

by Marc Duranton, Dave Wyland, Gert Slavenburg

7.1 ENHANCED VIDEO OUT SUMMARY

In this document, the generic PNX1300 name refers

to the PNX1300 Series, or the PNX1300/01/02/11

products.

The PNX1300 Enhanced Video Out (EVO) improves on

the design of the TM-1000 Video Out (VO) unit while

maintaining binary-compatibility. PNX1300 EVO is fully

backward compatible with TM-1100, and has been extended to support byte data rates up to 81 MHz and to improve the Genlock mode. New EVO features relative to the TM-1000 include:

• Internal clock generator (DDS) has reduced jitter

• Full alpha blending supports 129 levels

• Chroma keying

• Frame synchronization can be internally or externally

generated (Genlock mode)

• External frame sync. follows the field number gener-

ated in the EAV/SAV code

• Programmable YUV output clipping

• Data-valid signal generated in data-streaming mode

• In message passing mode, message length can

range from one word (4 bytes) up to 16 MB.

7.2 ABOUT THIS DOCUMENT

This chapter describes the PNX1300 EVO unit which extends and improves the design of the TM-1000 VO unit,

and consolidates the changes introduced in the TM-

1100. Please refer to the TM-1000 databook for a de-

scription of the VO unit’s functionality.

7.3 BACKWARD COMPATIBILITY

The EVO is functionally compatible with the TM-1000 VO

unit. All TM-1000 VO features are supported exactly in

the same fashion by the PNX1300 EVO. Software written

for the TM-1000 VO can control the PNX1300 EVO without modification (with the exception of the Genlock mode which now requires EVO_CTL.GENLOCK to be set to 1 in addition to VO_CTL.SYNC_MASTER = 0).

All new features (with respect to TM-1000) and improve-

ments are selectively enabled by setting bits in the

EVO_CTL MMIO register, described in Section 7.16.4. A

method to determine the existence of EVO registers is

given in Section 7.16.1.

The PNX1300 EVO features are disabled on hardware

reset in order to remain hardware-compatible with the

TM-1000 VO. So it is assumed throughout this chapter

that all new functions controlled by EVO_CTL are en-

abled by software. Any new software should use the new

EVO modes.

7.4 FUNCTION SUMMARY

The PNX1300 EVO generates and transmits continuous

digital video images. It can connect to an off-chip video

subsystem such as a digital video encoder chip (e.g., the Philips SAA7125 DENC digital encoder), a digital video recorder, or the video input of another PNX1300 through

a CCIR 656-compatible byte-parallel video interface.

See Figure 7-1, Figure 7-2, and Figure 7-3.

The EVO can either supply video pixel clock and synchronization signals to the external interface or synchronize to signals received from the external interface (Genlock mode).

PAL, NTSC, 16:9 and other video formats including dou-

ble pixel-rate, non-interlaced video formats are support-

ed through programmable registers which control pixel

clock frequency and video field or frame format.

The EVO can combine a background video image from

SDRAM with an optional foreground graphics overlay image from SDRAM using 129-level, per-pixel alpha blending. The composite result is sent out as continuous video. Video image data is taken from a planar memory

format, with separate Y, U and V planes in memory in

YUV 4:2:2 or 4:2:0 format. The optional graphics overlay

is taken from a pixel-packed YUV 4:2:2+ data structure

in memory.

The EVO can also be used to stream continuous data

(data-streaming mode) or send unidirectional messages

(message-passing mode) from one PNX1300 to another.

In data-streaming mode, the EVO generates a continu-

ous stream of arbitrary byte data using internal or exter-

nal clocking. Dual buffers allow continuous data stream-

ing in this mode by allowing the DSPCPU to set up a

buffer while another is being emptied by the EVO. Data-

valid signals are generated on VO_IO1 and VO_IO2 to

synchronize data streaming to other PNX1300 data re-

ceivers.

In message-passing mode, unidirectional messages can

be sent to the Video In (VI) port(s) of one or more

PNX1300s. Start and end-of-message signals are provided to synchronize message passing to other

PNX1300 message receivers.

7.4.1 Detailed Feature Descriptions

The EVO provides the following key functions.

• Continuous digital video output of PAL or NTSC format data according to CCIR 601.

• Transmission of YUV 4:2:2 co-sited pixel data across a standard 8-bit parallel CCIR 656 interface (see footnote 1).

Embedded SAV and EAV synchronization codes and

separate sync control signals compatible with Philips

DENC encoders are available.

• Supports the nominal PAL/NTSC data rate of 27

MB/sec. (13.5 Mpix/sec.), or any byte data rate up to

an 81-MHz EVO clock.

• Custom video formats can be programmed with

frames or fields of up to 4095 lines of up to 4095 pix-

els, subject only to the data rate limitation above.

• Support for video images in planar YUV 4:2:2 co-

sited, planar YUV 4:2:2 interspersed, or planar YUV

4:2:0 memory formats.

• Optional 129-level alpha blending. Graphics overlay

image is in pixel-packed YUV 4:2:2+ format, and is

alpha blended on top of the video image. Each pixel

has a 1-bit alpha, which selects one of two global 8-bit alpha values that provide 129 levels of transparency. With overlay enabled, the output byte data rate

is limited to 45% of the SDRAM clock, or up to an 81-

MHz EVO clock, whichever is smaller.

• Optional horizontal 2X upscaling of the video image

for display. The overlay is always in display format.

• In data-streaming mode, the EVO acts as a high

bandwidth continuous-output data channel. The byte

data rate is limited to an 81-MHz EVO clock.

• In message-passing mode, the EVO can send mes-

sages from 1 word (4 bytes) up to 16 MB. The byte

data rate is limited to an 81-MHz EVO clock.

• For diagnostic purposes, EVO output data can be

internally looped back to the VI port. This is con-

trolled by the VI DIAGMODE bit.

7.4.2 Summary of Operation

The EVO normally supplies continuous video data to its

outputs. The EVO is programmed and started by the

PNX1300 DSPCPU. The EVO issues an interrupt to the

DSPCPU at the end of each transmitted field, and/or at a programmable vertical position in the field. The DSPCPU

updates the EVO video image data pointers with pointers

to the next field during the vertical blanking interval so as

to maintain continuous video output. During video output,

the EVO supplies embedded CCIR 656 SAV (Start Ac-

tive Video) and EAV (End Active Video) sync codes and

optionally supplies horizontal and frame sync signals.

The EVO can either supply pixel clock and horizontal and

frame timing signals or it can lock to external timing sig-

nals such as those supplied by a Philips SAA7125 DENC

digital encoder or similar sync source.

7.5 INTERFACE

Table 7-1 lists the interface pins of the EVO unit.

Figure 7-1, Figure 7-2, and Figure 7-3 illustrate typical

connections for commonly-used external devices that interface to the EVO.

The most common way to generate analog video is

shown in Figure 7-1. In this setup, an SAA7125 Digital

Encoder (DENC) can be programmed to derive sync ei-

ther from the VO_DATA stream EAV/SAV codes, or from

its RCV1/2 pins.

Figure 7-2 illustrates how a byte-parallel ECL-level stan-

dard CCIR 656 interface can be created. In certain pro-

fessional applications, serial D1 video is also used. In

that case, the EVO can be connected to a Gennum

GS9022 Digital Video Serializer or similar part (not

shown).

Figure 7-3 shows the EVO unit of one PNX1300 con-

nected to the VI unit of a second PNX1300.

1. Refer to CCIR recommendation 656: Interfaces for dig-

ital component video signals in 525 line and 625 line

television systems. Recommendation 656 is included in

the Philips Desktop Video Data Handbook.

[Figure 7-1 connection diagram: PNX1300 pins VO_DATA[7:0], VO_IO1 (HS), VO_IO2 (FS), and VO_CLK connect to SAA7125 pins MP[7:0], RCV1, RCV2, and LLC, respectively.]

Figure 7-1. EVO connected to a digital video encoder (DENC).

[Figure 7-2 connection diagram: PNX1300 pins VO_DATA[7:0] and VO_CLK pass through TTL-to-ECL conversion to the Data A,B[7:0] and Clock A,B signals of a CCIR 656 subminiature “D” connector.]

Figure 7-2. EVO connected to a CCIR 656 video-output connector.


7.6 BLOCK DIAGRAM

Figure 7-4 shows a block diagram of the EVO unit. It consists of a clock generator, a video frame timing generator

and an image or data generator. The image generator

produces either a CCIR 656 digital video data stream

with optional YUV overlay or a continuous-data or mes-

sage-data stream. It also performs optional format con-

version and optional 2:1 horizontal scaling.

The frame timing generator provides programmable im-

age timing including horizontal and vertical blanking,

SAV and EAV code insertion, overlay start and end tim-

ing, and horizontal and frame timing pulses. It also sup-

plies data-valid timing signals in data-streaming mode

and start-of-message and end-of-message timing sig-

nals in message-passing mode. The sync timing pulses

can be generated by the frame timing unit, or the frame

timing unit can be driven by externally-supplied sync timing pulses, when VO_CTL.SYNC_MASTER = 0 and EVO_CTL.GENLOCK = 1.

The video clock generator produces a programmable

video clock. The video clock generator can supply the

video clock for the frame timing generator and external

devices, or it can be driven by an external clock signal.

7.7 CLOCK SYSTEM

Positive edges of VO_CLK drive all EVO output events.

A block diagram of the EVO clock system is shown in

Figure 7-5. The EVO clock is either supplied externally or

internally generated by the EVO, as controlled by the

VO_CTL.CLKOUT bit. When CLKOUT = 0, the EVO

clock is supplied by an external source through the

VO_CLK pin as an input. This is the default mode, en-

tered at hardware reset. When CLKOUT = 1, an internal

clock generator supplies the EVO clock and drives the

VO_CLK pin as an output.

The internal clock generator system is a square-wave Direct Digital Synthesizer (DDS) which can be programmed to emit frequencies from 1 Hz to 50 MHz. The

output of the DDS is sent to a phase-locked loop filter

(PLL) which removes clock jitter from the DDS output

Table 7-1. EVO unit interface pins

Signal Name    Type    Description
VO_DATA[7:0]   OUT     CCIR 656-style YUV 4:2:2 digital output data, or
                       general-purpose high speed data output channel.
                       Output changes on positive edge of VO_CLK.
VO_IO1         I/O-5   Horizontal Sync (HS) output or Start Message (STMSG)
                       output. See Figure 7-18.
VO_IO2         I/O-5   Frame Sync (FS) input, FS output or ENDMSG output.
                       • If set as FS input, it can be set to respond to
                         positive or negative edge transitions.
                       • If the EVO operates in Genlock mode and the selected
                         transition occurs, the EVO sends two fields of video
                         data.
                       • In message-passing mode, this pin acts as the ENDMSG
                         output. See Figure 7-18.
VO_CLK         I/O-5   The EVO unit emits VO_DATA on a positive edge of
                       VO_CLK. VO_CLK can be configured as an input (the
                       hardware reset default) or output.
                       • If configured as an input, VO_CLK is received from
                         external display-clock master circuitry.
                       • If configured as output, the PNX1300 emits a
                         low-jitter clock frequency programmable between
                         approx. 4 and 81 MHz.

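[Figure 7-3 connection diagram: PNX1300 A pins VO_DATA[7:0], VO_IO1 (STMSG), VO_IO2 (ENDMSG), and VO_CLK connect to PNX1300 B pins VI_DATA[7:0], VI_DATA[8], VI_DATA[9], and VI_CLK, respectively. VI_DVALID on PNX1300 B is tied to logic ‘1’.]

Figure 7-3. EVO unit connected to the VI unit of a second PNX1300.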

[Figure 7-4 block diagram. Blocks: Video Frame Timing Generator; Video Clock Generator; Image Generator; Overlay Generator; Message/Data Generator. Signals: VO_IO1 (HS, Start Msg, or valid data pulse); VO_IO2 (VS, End Msg, or valid data level); VO_CLK; VO_DATA[0:7]; SDRAM Highway.]

Figure 7-4. EVO unit block diagram.

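[Figure 7-5 block diagram: a square-wave DDS, clocked at 9 × CPU clock and programmed by the 32-bit FREQUENCY field (bits 31..0), feeds a PLL filter. The result drives VO_CLK (under control of CLKOUT) and the internal VO_CLK supplied to the frame timing generator.]

Figure 7-5. EVO clock system.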


signal. The PLL can also be used to divide or double the

DDS frequency. The PLL VCO operates from 8 MHz to

90 MHz. The PLL is enabled and programmed as de-

scribed in Section 7.19.

DDS clock rate is set by the VO_CLOCK.FREQUENCY

field according to the equation shown in Figure 7-6. The

VO_CLK frequency can be a divider or multiplier of fDDS,

as determined by the PLL subsystem settings.

Low-jitter clock mode is automatically entered whenever

FREQUENCY[31] = 1. If FREQUENCY[31] = 0, the DDS

operates at 1/3 the rate (for compatibility with TM-1000

code), and FREQUENCY must be set as shown in

Figure 7-7.

The DDS synthesizer maximum jitter can be computed

as follows:

Examples of jitter values can be found in Table 7-2.
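The FREQUENCY programming and jitter relations can be sketched in a few lines. This sketch reads the Figure 7-6 equation as FREQUENCY = 2^31 + fDDS * 2^32 / (9 * fDSPCPU), the Figure 7-7 equation as FREQUENCY = fDDS * 2^32 / (3 * fDSPCPU), and the jitter as 1 / (9 * fDSPCPU). The function names are hypothetical, and rounding to the nearest integer is an assumption.

```python
def dds_frequency_word(f_dds_hz, f_dspcpu_hz, low_jitter=True):
    """Value for VO_CLOCK.FREQUENCY that yields f_dds_hz from the DDS.

    low_jitter=True sets bit 31 (Figure 7-6 form);
    low_jitter=False uses the TM-1000-compatible form (Figure 7-7).
    Assumption: the quotient is rounded to the nearest integer.
    """
    if low_jitter:
        return (1 << 31) + round(f_dds_hz * 2**32 / (9 * f_dspcpu_hz))
    return round(f_dds_hz * 2**32 / (3 * f_dspcpu_hz))


def dds_max_jitter_ns(f_dspcpu_hz):
    """Maximum DDS jitter in nanoseconds: one period of 9 x the DSPCPU clock."""
    return 1e9 / (9 * f_dspcpu_hz)
```

For a 143-MHz DSPCPU clock, `dds_max_jitter_ns(143e6)` evaluates to about 0.777 ns, which agrees with the corresponding Table 7-2 entry.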

7.8 IMAGE TIMING

The EVO emits a serial byte-data stream used by

CCIR 656 devices to generate a displayed image.

Figure 7-9 shows an NTSC-compatible, 525-line inter-

laced image. The field and line numbers are shown for

reference.

Interlaced images are generated by the display hardware

by controlling the vertical retrace timing. For reference,

Figure 7-8 shows a timing diagram of NTSC-compatible

interlaced frame timing illustrating the analog vertical re-

trace signal. The vertical retrace signal for the second

field begins in the middle of the horizontal line that ends

the first field. This causes the fir st line of the second field

to begin halfway across the display screen and the lines

of the second field to be scanned between the lines of the

first field, resulting in an interlaced display.

The analog timing required to generate the interlaced

signal is supplied by the display device. The CCIR 656

digital video signals generated by the EVO use frame

synchronization timing and do not generate any vertical

retrace timing.

7.8.1 CCIR 656 Pixel Timing

The EVO generates pixels according to CCIR 656 timing

in YUV 4:2:2 co-sited format and outputs these pixels as

shown in Figure 7-10. Pixels are generated in groups of

two, with four bytes per two pixels. Each pair of pixels

has two luminance bytes (Y0, Y1) and one pair of chromi-

nance bytes (U0, V0) arranged in the sequence shown.

The chrominance samples U0 and V0 are sampled spa-

tially co-sited with luminance sample Y0. For PAL or

NTSC video, pixels are generated at a nominal rate of

13.5 Mpix/sec. (27 MB/sec.). Pixels are clocked out on

the positive edge of VO_CLK.
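The byte ordering within each pixel pair (U, Y, V, Y, as drawn in Figure 7-10) can be illustrated by interleaving planar samples in software. This is only a sketch of the output ordering, not of the EVO hardware; `ccir656_active_bytes` is a hypothetical name.

```python
def ccir656_active_bytes(y, u, v):
    """Interleave planar 4:2:2 samples into the U0 Y0 V0 Y1 U2 Y2 V2 Y3 ... order.

    y holds one luminance sample per pixel; u and v hold one chrominance
    sample per pixel pair, co-sited with the even-numbered pixel.
    """
    if len(y) % 2 != 0 or len(u) != len(y) // 2 or len(v) != len(y) // 2:
        raise ValueError("expected an even pixel count and one U, V pair per two pixels")
    out = bytearray()
    for pair in range(len(y) // 2):
        out += bytes((u[pair], y[2 * pair], v[pair], y[2 * pair + 1]))
    return bytes(out)
```

Each pixel pair occupies four bytes, so a 27-MB/sec. byte stream corresponds to 13.5 Mpix/sec.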

7.8.2 CCIR 656 Line Timing

The CCIR 656 line timing is shown in Figure 7-11. Each

line begins with an EAV code, a blanking interval and an

SAV code, followed by the line of active video. The EAV

code indicates end of active video for the previous line,

and the SAV code indicates start of active video for the

current line.

Table 7-2. Jitter values for common DSPCPU clock frequencies

fDSPCPU (MHz)   jitter (nSec)
143             0.777
166             0.669
180             0.617
200             0.555

FREQUENCY = 2^31 + (fDDS * 2^32) / (9 * fDSPCPU)

Figure 7-6. DDS low-jitter oscillator frequency.

FREQUENCY = (fDDS * 2^32) / (3 * fDSPCPU)

Figure 7-7. DDS slow speed oscillator frequency.

jitter = 1 / (9 * fDSPCPU)

[Figure 7-8 timing diagram: vertical sync and video lines over one frame. Line numbers marked: 1, 19, 20, 262, 263, 282, 525, 1. Field 1 and Field 2 each show a blanking interval followed by active video, with a 1/2-line interlace offset between the fields.]

Figure 7-8. Interlaced timing: NTSC analog sync signals.


7.8.3 SAV and EAV Codes

The End Active Video (EAV) and Start Active Video

(SAV) codes are issued at the start of each video line.

EAV and SAV codes have a fixed format: a 3-byte pre-

amble of 0xFF, 0x00, 0x00 followed by the SAV or EAV

code byte. The EAV and SAV code byte format is shown

in Figure 7-12 for reference. The EAV and SAV codes

define the start and end of the horizontal blanking inter-

val, and they also indicate the current field number and

the vertical blanking interval.

[Figure 7-9 diagram: displayed image with scan direction indicated. Labeled Field 1 lines: Line 20, Line 21, ... Line 262, Line 263. Labeled Field 2 lines: Line 282, Line 283, ... Line 524, Line 525.]

Figure 7-9. Interlaced display: 525-line, 60-Hz image.

[Figure 7-10 timing diagram: VO_DATA[0:7] carries the byte sequence U0 Y0 V0 Y1 U2 Y2 V2 Y3 U4 ..., starting at byte 0, one byte per VO_CLK cycle. Line scan at 27 MHz = 13.5 Mpix/sec.]

Figure 7-10. CCIR 656 pixel timing.

[Figure 7-11 diagram: Line i and Line i+1 each consist of an EAV code (E), blanking, an SAV code (S), and active video (YUV 4:2:2 pixels).]

Figure 7-11. CCIR 656 line timing.

Preamble: 11111111 00000000 00000000
Timing reference code: 1FVHPPPP
  F = 0 during Field 1; F = 1 during Field 2
  V = 1 during field blanking; V = 0 elsewhere
  H = 0 for SAV; H = 1 for EAV
  PPPP = protection bits (error correction)

Figure 7-12. Format of SAV and EAV timing codes.


The SAV and EAV codes have a 4-bit protection field to

ensure valid codes. The EVO generates these protection

bits as part of the SAV and EAV codes as defined by

CCIR 656. There are 8 possible valid SAV and EAV

codes shown with their correct protection bits in

Table 7-3. The EVO generates SAV and EAV sync

codes and inserts them into the video out data stream according to the CCIR 656 specification under all conditions, whether it is generating or receiving horizontal and

frame timing information.
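The eight valid code bytes can be derived from the F, V, and H bits. The sketch below uses the protection-bit equations of CCIR 656 (P3 = V xor H, P2 = F xor H, P1 = F xor V, P0 = F xor V xor H), which reproduce the Table 7-3 values; the function names are hypothetical.

```python
def timing_code(f, v, h):
    """Build the 1FVHPPPP code byte from the F, V and H bits (each 0 or 1)."""
    p3 = v ^ h
    p2 = f ^ h
    p1 = f ^ v
    p0 = f ^ v ^ h
    return 0x80 | (f << 6) | (v << 5) | (h << 4) | (p3 << 3) | (p2 << 2) | (p1 << 1) | p0


def timing_reference(f, v, h):
    """Return the full 4-byte sequence: preamble FF 00 00 plus the code byte."""
    return bytes((0xFF, 0x00, 0x00, timing_code(f, v, h)))
```

For example, an EAV code (H = 1) in Field 1 active video (F = 0, V = 0) is 1001 1101, and an SAV code (H = 0) in Field 2 vertical blanking (F = 1, V = 1) is 1110 1100.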

7.8.4 Video Clipping

SAV and EAV codes are identified by a 3-byte preamble

of 0xFF, 0x00 and 0x00. This combination must be

avoided in the video data output by the EVO to prevent

accidental generation of an invalid sync code. The EVO

provides programmable maximum and minimum value

clipping on the video data to prevent this possibility. If

clipping is enabled, the EVO automatically clips the resulting image data as described in Section 7.15.3.

7.8.5 CCIR 656 Frame Timing

The interlaced frame timing defined by CCIR 656 is

shown in Table 7-4. Lines are numbered from 1 to 525

for 525-line, 60-Hz systems and from 1 to 625 for 625-

line, 50-Hz systems. The Field and Vertical Blanking col-

umns indicate whether the field and vertical blanking bits,

respectively, are set in the SAV and EAV codes for the

indicated lines. The 525 and 625 formats have similar

timing but differ in their line numbering.
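The 525/60 column of Table 7-4 can be expressed as a small lookup that returns the F and V bits for a given line number. This is a sketch for illustration; the names are hypothetical and only the 525-line case is covered.

```python
# (first line, last line, F bit, V bit) for 525/60 systems, from Table 7-4.
FRAME_TIMING_525 = (
    (1, 3, 1, 1),
    (4, 19, 0, 1),
    (20, 263, 0, 0),
    (264, 265, 0, 1),
    (266, 282, 1, 1),
    (283, 525, 1, 0),
)


def fv_bits_525(line):
    """Return (F, V) for a 525/60 line number in the range 1..525."""
    for first, last, f_bit, v_bit in FRAME_TIMING_525:
        if first <= line <= last:
            return f_bit, v_bit
    raise ValueError("line number must be between 1 and 525")
```

Lines 1 to 3 return F = 1, which reflects the overlap in which the SAV/EAV codes still indicate Field 2 during the start of vertical blanking for Field 1.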

7.9 ENHANCED VIDEO OUT TIMING

GENERATION

The EVO generates timing for frames, active video areas

within frames, images within the active video area, and

overlays within the image area. The relationship between

these four is shown in Figure 7-13. The frame includes

the timing for both interlaced fields. Progressive scan, or

non-interlaced video, is accomplished by setting the timing parameters such that two identical successive fields

are generated.

7.9.1 Active Video Area

Shown in Figure 7-13, the active video area begins after

the horizontal and vertical blanking intervals and represents the pixels visible on the screen. The image area is

the actual displayed image within the active video area.

It can be slightly smaller than the active video area to

avoid edge effects at the top, bottom and sides of the image. The overlay area is within the image area.

The EVO uses counters to generate and control image

timing. The Frame Line Counter and Frame Pixel

Counter control the overall timing for the frame and de-

fine the total number of pixels per line, lines per frame,

and interlace timing, including horizontal and vertical

blanking intervals.

Note that the Frame Line Counter has a starting value of one, not zero, and it counts from 1 to 525 or 625, consistent with CCIR 656 line numbering. The Image Line

Counter and Image Pixel Counter define the visible im-

age within the field.

The geometry of the active video area is defined by the

contents of several MMIO registers shown in

Figure 7-29. The VO_FRAME.FIELD_2_START field

defines the start line of Field 2. Field 2 is active when the

Field Line Counter contents equal or exceed this value.

The active video area is defined by the F1_VIDEO_LINE

and F2_VIDEO_LINE fields of the VO_FIELD register for

each field of the frame, and by the

VIDEO_PIXEL_START field of the VO_LINE register for

each line of the frame. The active video area begins when the contents of the Frame Line Counter and Frame Pixel Counter equal or exceed these values.

Table 7-3. SAV and EAV codes

Code   Binary Value   Field   Vertical Blanking
SAV    1000 0000      1
EAV    1001 1101      1
SAV    1010 1011      1       X
EAV    1011 0110      1       X
SAV    1100 0111      2
EAV    1101 1010      2
SAV    1110 1100      2       X
EAV    1111 0001      2       X
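The eight values in Table 7-3 follow a regular bit pattern. The sketch below rebuilds them, assuming the bit layout and protection-bit equations of ITU-R BT.656 (bit 7 fixed at 1, then F, V, H and four protection bits); that layout comes from the standard rather than from this section, and the function name `timing_code` is illustrative only.

```python
def timing_code(field2: bool, vblank: bool, eav: bool) -> int:
    """Return the fourth byte of an SAV/EAV sequence (after 0xFF, 0x00, 0x00)."""
    f, v, h = int(field2), int(vblank), int(eav)
    # Protection bits as defined by ITU-R BT.656 (assumption: not stated in this section).
    p3 = v ^ h
    p2 = f ^ h
    p1 = f ^ v
    p0 = f ^ v ^ h
    return (0x80 | (f << 6) | (v << 5) | (h << 4)
            | (p3 << 3) | (p2 << 2) | (p1 << 1) | p0)


# Example: SAV for Field 1 during vertical blanking.
print(format(timing_code(field2=False, vblank=True, eav=False), "08b"))
```

Bit 6 of the result is the field indication that Section 7.9.4 refers to as the SAV/EAV field bit.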

Table 7-4. CCIR 656 frame timing

Line Number (525/60)   Line Number (625/50)   F bit   V bit   Comments
1–3                    624–625                1       1       Vertical blanking for Field 1, SAV/EAV code still indicates Field 2
4–19                   1–22                   0       1       Vertical blanking for Field 1, change SAV/EAV code to Field 1
20–263                 23–310                 0       0       Active video, Field 1
264–265                311–312                0       1       Vertical blanking for Field 2, SAV/EAV code still indicates Field 1
266–282                313–335                1       1       Vertical blanking for Field 2, change SAV/EAV code to Field 2
283–525                336–623                1       0       Active video, Field 2
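Table 7-4 can also be read as a lookup from line number to the F and V bits. A minimal sketch that transcribes the table rows; the names `TIMING_525`, `TIMING_625` and `field_and_vblank` are illustrative and are not part of the device documentation.

```python
# (first line, last line, F bit, V bit), transcribed from Table 7-4.
TIMING_525 = [(1, 3, 1, 1), (4, 19, 0, 1), (20, 263, 0, 0),
              (264, 265, 0, 1), (266, 282, 1, 1), (283, 525, 1, 0)]
TIMING_625 = [(624, 625, 1, 1), (1, 22, 0, 1), (23, 310, 0, 0),
              (311, 312, 0, 1), (313, 335, 1, 1), (336, 623, 1, 0)]


def field_and_vblank(line, table=TIMING_525):
    """Return (F, V) for a 1-based frame line number."""
    for first, last, f_bit, v_bit in table:
        if first <= line <= last:
            return f_bit, v_bit
    raise ValueError(f"line {line} is outside the frame")


print(field_and_vblank(264))              # first overlap line before Field 2 in 525/60
print(field_and_vblank(624, TIMING_625))  # overlap before Field 1 in 625/50
```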


7.9.2 SAV and EAV Overlap Period

The CCIR 656-compliant 525/60 and 625/50 timing

specifications define an overlap period where the field

number in the SAV and EAV codes from Field 1 persists

into the vertical blanking interval for Field 2, and the

codes for Field 2 persist into the vertical blanking interval

for Field 1. The F1_OLAP and F2_OLAP fields of the

VO_FIELD register define these overlap intervals.

F1_OLAP and F2_OLAP are small two’s complement

values in the range –8 to +7. A positive value indicates

that the overlap extends into the current field, while a

negative value indicates that it extends backward into the

previous field. See Figure 7-31 for the effect of negative

and positive values.
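Reading or writing the overlap fields means converting between a raw field value and a signed count. The sketch below assumes each field is 4 bits wide, which is inferred from the stated range of –8 to +7 and is not given explicitly in this section; the helper names are illustrative.

```python
def decode_olap(raw: int) -> int:
    """Interpret a 4-bit two's complement field value as a signed integer."""
    raw &= 0xF
    return raw - 16 if raw & 0x8 else raw


def encode_olap(value: int) -> int:
    """Convert a signed overlap in the range -8..+7 to its 4-bit field encoding."""
    if not -8 <= value <= 7:
        raise ValueError("overlap must be in the range -8..+7")
    return value & 0xF


print(decode_olap(0xE))  # 0b1110 is a negative overlap
print(encode_olap(3))
```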

During the overlap interval, the vertical blanking for the

next field has begun; however, the field number flag in

the SAV and EAV codes still shows the field number for

the previous field. The field number is updated to the correct field value at the end of the overlap interval.

F1_OLAP defines the overlap from Field 1 to Field 2.

This overlap occurs during the beginning of vertical

blanking for Field 2. The SAV and EAV codes continue

to show Field 1 during this overlap interval, and they

change to Field 2 at the end of the interval.

F2_OLAP defines the overlap from Field 2 to Field 1.

This overlap occurs during the beginning of vertical

blanking for Field 1. The SAV and EAV codes continue

to show Field 2 during this overlap interval, and they

change to Field 1 at the end of the interval.

7.9.3 Control of Frame and Image Counters

The frame and image counters have different start and stop points. The frame counters begin in the vertical blanking interval of the first field and the horizontal blanking interval of the first line. They stop counting when they reach the height and width values of the frame. When the EVO generates frame timing, the frame counters are reset to their start values when they reach their stop values. When the EVO receives frame timing signals, the frame counters continue counting until reset by the external signals.

The image area is defined by VO_YTHR register fields

IMAGE_VOFF and IMAGE_HOFF. These values are

added to the F1_VIDEO_LINE or F2_VIDEO_LINE and

VIDEO_PIXEL_START values to define the starting line

and pixel, respectively, of the image area. The image

area is active when the contents of the Frame Line

Counter and Frame Pixel Counter equal or exceed these

values.
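The offset arithmetic described above can be sketched as follows. The function names and the example register values are hypothetical, and the sketch only models the start condition (it ignores where the image area ends).

```python
def image_area_start(video_line, video_pixel_start, image_voff, image_hoff):
    """Return (start_line, start_pixel) of the image area in frame-counter units.

    video_line is F1_VIDEO_LINE or F2_VIDEO_LINE, depending on the field.
    """
    return video_line + image_voff, video_pixel_start + image_hoff


def image_area_active(frame_line, frame_pixel, start_line, start_pixel):
    """True once both frame counters equal or exceed the image start values."""
    return frame_line >= start_line and frame_pixel >= start_pixel


# Hypothetical values: active video starts at line 20, pixel 276;
# the image is inset by 2 lines and 8 pixels.
start_line, start_pixel = image_area_start(20, 276, 2, 8)
print(start_line, start_pixel)
```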

The Image Pixel Counter and Image Line Counter start counting at the first active pixel and the first active line in the image area, respectively. The image counters start at zero and stop counting when they reach their image height and width values. The image counters are reset by frame counter values indicating the start of the image pixel in a line and the start of the image line in a field.

The image counters define the active image area of the frame, the area of interest for image processing. This allows the overlay start address to be defined relative to the active image area, for example. When the EVO is not sending out active pixels from the image area, it sends out blanking codes. The blanking codes are 0x80, 0x10, 0x80, and 0x10 for each 2-pixel group in YUV 4:2:2 image data format, as defined by CCIR 656 and shown in Figure 7-10.

7.9.4 Horizontal and Frame Timing Signals

The EVO can supply horizontal and frame timing signals or receive a frame timing signal from an external source. When VO_CTL.SYNC_MASTER = 1, the EVO generates horizontal and frame timing for the external video device. When SYNC_MASTER = 0, the EVO operates in Genlock mode and an external device, such as a DENC, must provide frame sync. This section describes EVO operation when it is sync master. See Section 7.10 for a description of Genlock mode.

If SYNC_MASTER = 1, the VO_IO1 signal generates a

horizontal timing signal, and the VO_IO2 signal gener-

ates a frame timing signal. When EVO_ENABLE = 1 and

FIELD_SYNC = 1, the VO_IO2 signal indicates the field

number (low = Field 1, high = Field 2), according to the

SAV/EAV field indication (bit[6]) as shown in Figure 7-14.

The VO_IO2 signal toggles just before the first byte of the

preamble that protects the EAV code and after the SAV

code. Non-interlaced output can be simulated by pro-

gramming the EVO to generate fields equivalent to the

desired frames. In this case, VO_IO2 indicates odd or

even frames.

Figure 7-13. Active Video Area and Image Area in relation to vertical and horizontal blanking intervals.
(Diagram labels: Frame; Vertical Blanking, Field 1 and Field 2; Horizontal Blanking; Active Video Area with Start Line and Start Pixel; Image Area, Field 1 and Field 2, with Image V Offset, Image H Offset, Image Width and Image Height; Overlay.)


The horizontal timing signal VO_IO1, shown in

Figure 7-15, corresponds to the horizontal-blanking in-

terval. It is active low from the EAV code at the start of

the line to the SAV code at the start of active video for the

line.

7.10 GENLOCK MODE

In Genlock mode, the EVO is not synchronization master but receives frame timing signals on VO_IO2. The EVO operates in Genlock mode when SYNC_MASTER = 0, EVO_CTL.EVO_ENABLE = 1 and EVO_CTL.GENLOCK = 1.

The active edge can be programmed using the VO_CTL.VO_IO2_POS bit. The initial transition of the frame timing signal on VO_IO2 causes the Frame Line Counter to be set to the value in VO_FRAME.FRAME_PRESET. After reaching FRAME_LENGTH, the Frame Line Counter starts counting again from 1.

EVO_SLVDLY.SLAVE_DLY is typically used to compensate for any delay in the frame timing source or internal pipeline synchronization anywhere in a line. Internally, the active edge of VO_IO2 is delayed by SLAVE_DLY VO_CLK clock cycles. Typically, this allows FRAME_PRESET to be loaded at the beginning of a new line.

With correct values of SLAVE_DLY and

FRAME_PRESET loaded, the PNX1300 can generate

frames totally synchronized with the active edge of

VO_IO2. All the internal MMIO registers (except of

course VO_CTL) should be programmed with the same

values as for SYNC_MASTER mode. See Figure 7-16.

In Genlock mode, the EVO is free-running according to the values programmed in its internal registers before the initial VO_IO2 active edge. Just after receiving the active edge that will synchronize the EVO, output values may be erroneous for several VO_CLK cycles, but it is guaranteed that the next frame will be correct.

After the first synchronizing edge, if the next one hap-

pens according to the values programmed in the EVO

MMIO registers, no change will appear in the output tim-

ing of the EVO. If the active edge of VO_IO2 does not

match the programmed value, a new synchronization

phase is performed.

Typically, this is programmed as follows: SLAVE_DLY is loaded with the number of clock cycles for one video line minus the number of delay cycles used by the EVO to synchronize itself. FRAME_PRESET is programmed with the value 2. With this programming, the active edge of VO_IO2 will happen just before the first byte (preamble) of the first line.

The first active edge of VO_IO2 is delayed internally by SLAVE_DLY VO_CLK cycles so that it appears internally just before the start of the second line minus the internal EVO pipeline delay. After this internal pipeline delay, the line counter is loaded with FRAME_PRESET (‘2’), and the EVO starts sending data for line 2.

For the next frame, if the internal EVO programming matches the VO_IO2 timing, the EVO will appear to start the first byte of the first line just after the VO_IO2 active signal.

Figure 7-14. EVO VO_IO2 timing in FIELD_SYNC mode.
(Timing diagram: VO_IO2 level over one frame, Field 1 then Field 2, against the vertical blanking and active video line numbers for NTSC and PAL.)

Figure 7-15. EVO VO_IO1 timing in FIELD_SYNC mode.
(Timing diagram: VO_IO1 over one line, spanning EAV, blanking, SAV and image data.)

7.11 DATA TRANSFER TIMING

In data-streaming and message-passing modes, the EVO supplies a stream of 8-bit data. No data selection or data interpretation is done, and data is transferred at the rate of one byte per VO_CLK. Data is clocked out on the positive edge of VO_CLK.

When data-streaming mode is enabled and EVO_ENABLE = 1 and SYNC_STREAMING = 1, the VO_IO2 signal indicates a data-valid condition. This signal is asserted when the EVO starts outputting valid data (that is, data-streaming mode is enabled and video out is running), and is de-asserted when data-streaming mode is disabled. As shown in Figure 7-17, the data-valid signal on VO_IO2 is asserted just before the first valid byte is present on VO_DATA[7:0], and is de-asserted just after the last valid byte has been sent, or if an HBE error is signaled. All transitions of VO_IO2 occur on the rising edge of VO_CLK. The VO_IO1 signal generates a pulse one VO_CLK cycle before the first valid data is sent. The transitions of VO_IO1 occur on the rising edge of VO_CLK and last for one VO_CLK cycle.

In message-passing mode, the EVO issues signals on

VO_IO1 and VO_IO2 to indicate the start and end of

messages.

When message passing is started by setting VO_CTL.VO_ENABLE, the EVO sends a Start condition on VO_IO1. When the EVO has transferred the contents of the buffer, it sends an End condition on VO_IO2, sets BFR1_EMPTY, and interrupts the DSPCPU. The EVO stops, and no further operation takes place until the DSPCPU sets VO_ENABLE again to start another message, or until the DSPCPU initiates other EVO operation. The timing for these signals is shown in Figure 7-18.

7.12 IMAGE DATA MEMORY FORMATS

7.12.1 Video Image Formats

The EVO accepts memory-resident video image data in

three formats: YUV 4:2:2 co-sited, YUV 4:2:2 inter-

spersed, and YUV 4:2:0. These formats are shown in

Figure 7-19 through Figure 7-21.

Figure 7-16. Genlock mode.
(Timing diagram: the VO_IO2 active edge, delayed by SLAVE_DLY VO_CLK cycles, loads the line counter with FRAME_PRESET; lines then run through Line 525/625 and restart at Line 1.)

Figure 7-17. Data-streaming valid data signals.
(Timing diagram: VO_CLK, VO_IO1, VO_IO2 (DATA_VALID) and VO_DATA[7:0] carrying D0 through Dk.)

Figure 7-18. Message-passing START and END signals.
(Timing diagram: VO_CLK, VO_IO1 marking start of message, VO_IO2 marking end of message, and VO_DATA[7:0] carrying D0 through D7.)

7.12.2 Planar Storage of Video Image Data in Memory

Video image data is stored in memory with one table for each of the Y, U and V components. This is called planar format and is shown in Figure 7-22 for YUV 4:2:2 image data. The EVO merges bytes from each of the three tables to generate the CCIR 656-compatible output data. For YUV 4:2:2, the U and V tables have the same number of lines as the Y table but half the number of pixels per line. The transfer is the same for YUV 4:2:0 format, except that the U and V tables are each 1/4 the size of the Y table: they have half the number of lines and half the number of pixels per line of the Y table.
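The table dimensions for the two planar layouts can be worked out as follows. This is a sketch only: the function name, the format strings and the 720 x 480 example size are illustrative, and even image dimensions are assumed.

```python
def plane_dimensions(width, height, fmt):
    """Return {component: (pixels per line, lines)} for planar YUV storage."""
    if fmt == "4:2:2":
        chroma = (width // 2, height)        # half the pixels per line, same lines
    elif fmt == "4:2:0":
        chroma = (width // 2, height // 2)   # half the pixels per line, half the lines
    else:
        raise ValueError(f"unsupported format: {fmt}")
    return {"Y": (width, height), "U": chroma, "V": chroma}


for fmt in ("4:2:2", "4:2:0"):
    dims = plane_dimensions(720, 480, fmt)
    sizes = {name: w * h for name, (w, h) in dims.items()}
    print(fmt, dims, sizes)
```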

7.12.3 Graphics Overlay Image Format

Graphics overlay image data is stored in a pixel-packed format in SDRAM. Graphics images are stored in YUV 4:2:2+alpha format. Figure 7-23 shows this format. The YUV overlay area is always within the image output resolution. The EVO does not upscale the graphics overlay image. If the EVO is upscaling the video image by 2×, the graphics overlay must be provided in upscaled format. Pixel data is 16-bit data and follows endian-ness conventions based on 16-bit data. Refer to Appendix C, “Endian-ness” for details.

7.13 VIDEO IMAGE CONVERSION ALGORITHMS

The memory video image data formats are converted to

the output YUV 4:2:2 co-sited format and optionally upscaled 2× horizontally. The conversion algorithms are

detailed below.

Figure 7-19. YUV 4:2:2 co-sited format.

Figure 7-20. YUV 4:2:2 interspersed format.

Figure 7-21. YUV 4:2:0 format.

(Each diagram shows the positions of the chrominance (U,V) samples relative to the luminance samples.)

7.13.1 YUV 4:2:2 Interspersed to YUV 4:2:2 Co-sited Conversion

The EVO accepts data from SDRAM in either YUV 4:2:2

co-sited, YUV 4:2:2 interspersed, or YUV 4:2:0 inter-

spersed formats. If the input data is in YUV 4:2:2 or YUV

4:2:0 interspersed format, interspersed-to-co-sited con-

version is performed to generate co-sited output. The

EVO uses a 4-tap, (–1, 5, 13, –1)/16 filter to perform this

conversion on the U and V chroma data. Figure 7-24

shows an example of interspersed to co-sited conversion.
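As a worked example of the arithmetic, the sketch below applies the (–1, 5, 13, –1)/16 kernel to four neighboring chroma samples. Which four input samples feed a given output sample, and how the hardware rounds and saturates, are not specified in this section; here the result is truncated by integer division and clamped to 0..255 as an assumption.

```python
INTERSPERSED_TO_COSITED = (-1, 5, 13, -1)  # taps from the text; they sum to 16


def filter4(samples, taps=INTERSPERSED_TO_COSITED, divisor=16):
    """Apply a 4-tap kernel to four 8-bit samples and return an 8-bit result."""
    if len(samples) != 4 or len(taps) != 4:
        raise ValueError("a 4-tap filter needs exactly four samples and four taps")
    acc = sum(t * s for t, s in zip(taps, samples))
    # Truncation and clamping are assumptions, not documented behavior.
    return max(0, min(255, acc // divisor))


print(filter4((100, 100, 120, 120)))  # (-100 + 500 + 1560 - 120) / 16
print(filter4((128, 128, 128, 128)))  # a flat input is unchanged
```

Because the taps sum to 16, a constant chroma value passes through the filter unchanged.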

7.13.2 YUV 4:2:0 to YUV 4:2:2 Co-sited Conversion

YUV 4:2:0 to YUV 4:2:2 conversion is a variation of YUV 4:2:2 interspersed-to-co-sited conversion. The YUV 4:2:0 format has the U and V pixels positioned between lines as well as between pixels within each line. It also has half the number of U and V pixels compared to YUV 4:2:2 formats. The EVO converts YUV 4:2:0 to YUV 4:2:2 co-sited by using the U and V chrominance pixel values for both surrounding lines and converting the resulting U and V pixels from interspersed to co-sited format. This is shown in Figure 7-25. For true vertical re-sampling of U and V, the PNX1300 ICP unit can be invoked on U and V to convert from YUV 4:2:0 to YUV 4:2:2 interspersed.

7.13.3 YUV-2x Upscaling

In the YUV-2× modes, the EVO performs 2× horizontal upscaling of the YUV data from SDRAM. No vertical upscaling is performed. The width of the resulting image (IMAGE_WIDTH) should be an even number. Upscaling is performed by 4-tap filtering. For all three memory formats, Y luminance data is upscaled using a (–3,19,19,–3)/32 filter to generate the missing output pixels. Output pixels at the same location as the input pixels use the corresponding input pixel values, as shown in Figure 7-26.

The U and V chrominance values are generated in the same way as the Y luminance signal for 2× upscaling, assuming that both the input and output use YUV 4:2:2 co-sited chrominance coding. The U and V output pixels at the same location as the U and V input pixels use the corresponding input pixel values. The U and V output pixels between the U and V input pixels are generated using the (–3,19,19,–3)/32 filter, as shown in Figure 7-26.

If the input chroma is interspersed, a (–1,13,5,–1)/16 filter is used to generate the U and V output pixels that are displaced by half a Y pixel from the U and V input pixels, and a (–1,5,13,–1)/16 filter is used to generate the additional upscaled U and V output pixels that are displaced by 1.5 pixels from the U and V input pixels. This is shown in Figure 7-27.

Figure 7-22. Image storage in planar memory format for YUV 4:2:2.
(Diagram: the Y table is WIDTH pixels by HEIGHT lines starting at Y_BASE_ADR, with Y_OFFSET marked; the U table is WIDTH/2 pixels by HEIGHT lines starting at U_BASE_ADR, with U_OFFSET marked; the same layout is repeated for V_BASE_ADR and V_OFFSET.)

Figure 7-23. YUV 4:2:2+alpha overlay format.
(Diagram: the overlay is OVERLAY_WIDTH pixels by OVERLAY_HEIGHT lines starting at OL_BASE_ADR, with OL_OFFSET marked; pixel data is packed as Y0 U0 Y1 V0.)

Figure 7-24. YUV interspersed to co-sited conversion.
(Diagram: input pixels YUV, output pixels YU’V’; co-sited chrominance output U’,V’ = (–1,5,13,–1)/16 U,V.)

Figure 7-25. YUV 4:2:0 to YUV 4:2:2 co-sited conversion.
(Diagram: input pixels YUV 4:2:0, output pixels YU’V’ 4:2:2; co-sited chrominance output U’,V’ = (–1,5,13,–1)/16 U,V.)

Figure 7-26. 2x upscaling of Y pixels.
(Diagram: output at the same location as an input pixel: Y’U’V’ = YUV; upscaled luminance output between input pixels: Y’ = (–3,19,19,–3)/32 Y; upscaled chrominance output between input pixels: U’,V’ = (–3,19,19,–3)/32 U,V.)

Figure 7-27. 2x upscaling of U and V with interspersed to co-sited conversion.
(Diagram: co-sited chrominance outputs U’,V’ = (–1,13,5,–1)/16 U,V and U’,V’ = (–1,5,13,–1)/16 U,V; luminance output at an input pixel: Y’ = Y; luminance output between input pixels: Y’ = (–3,19,19,–3)/32 Y.)

7.13.4 Pixel Mirroring for Four-tap Filters

The EVO uses a 4-tap filter for upscaling and for converting from interspersed to co-sited format. One extra pixel is needed at the beginning and two at the end of each line processed by this filter. These pixels are supplied automatically by mirroring the first and last pixels of each line. For example:

• Output pixel 1 uses input pixel 1 to generate its value (same location, no filtering).
• Output pixel 2 uses pixels 1, 1, 2 and 3 to generate its value.
• Output pixel 3 uses pixel 2 to generate its value.
• Output pixel 4 uses pixels 1, 2, 3 and 4, etc.
• ...
• Output pixel 2N–2 uses pixels N–2, N–1, N, and N–1 to generate its value.
• Output pixel 2N–1 uses pixel N to generate its value.
• Output pixel 2N uses pixels N–1, N, N, and N–1 to generate its value.

Figure 7-28 shows an example of six pixels upscaled to 12 pixels.
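The mirroring rule and the (–3,19,19,–3)/32 luminance filter can be combined into a short model of 2× horizontal upscaling. This is a sketch under stated assumptions: the edge taps follow the tap lists printed with Figure 7-28 (for example Y’ = F(Y4,Y5,Y6,Y6) and Y’ = F(Y5,Y6,Y6,Y5) for six input pixels), which differ in one tap from the bullet for output pixel 2N–2; rounding to nearest and clamping to 0..255 are assumptions, and all function names are illustrative.

```python
def mirror(j, n):
    """Map an out-of-range 0-based index back into 0..n-1 by mirroring at the line ends."""
    if j < 0:
        return -j - 1
    if j >= n:
        return 2 * n - 1 - j
    return j


def taps_for(i, n):
    """0-based input indices used for the interpolated pixel that follows input i."""
    return tuple(mirror(j, n) for j in (i - 1, i, i + 1, i + 2))


def upscale_2x(line):
    """Upscale one line of 8-bit samples by 2 horizontally."""
    n = len(line)
    if n < 2:
        raise ValueError("need at least two input pixels")
    out = []
    for i in range(n):
        out.append(line[i])  # same location as an input pixel: copied unchanged
        a, b, c, d = (line[j] for j in taps_for(i, n))
        acc = -3 * a + 19 * b + 19 * c - 3 * d
        out.append(max(0, min(255, (acc + 16) >> 5)))  # assumed rounding and clamp
    return out


print(upscale_2x([16, 32, 64, 128, 200, 235]))
print([tuple(j + 1 for j in taps_for(i, 6)) for i in range(6)])  # 1-based tap lists
```

The second print statement lists the taps in the document's 1-based pixel numbering, which makes it easy to compare the model against the figure.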

7.14 EVO OPERATING MODES

EVO operating modes belong to two groups as follows:

• Video-refresh modes

• Data-transfer modes

Data-transfer modes are further broken down into data-

streaming mode and message-passing mode.

The operating mode is set by the VO_CTL.MODE field and the VO_CTL.OL_EN (overlay enable) control bit. The VO_CTL.MODE field determines video-refresh, message-passing or data-streaming mode. It further defines the video image format and whether or not 2× horizontal upscaling takes place. The OL_EN bit determines whether a video-refresh mode has a graphics overlay present. The modes are shown in Table 7-5.

7.15 VIDEO PROCESSING

If enabled, the PNX1300 implements functions for chro-

ma keying, alpha blending and programmable clipping,

as described in this section.

7.15.1 Alpha Blending

If enabled by setting EVO_ENABLE = 1 and FULL_BLENDING = 1, the EVO provides full 129-layer alpha blending of a background video image with a foreground graphics overlay image. If either bit is 0, the EVO implements the cruder 25% step alpha blending resolution of the TM-1000. Alpha blending can operate in conjunction with chroma keying, as described in Section 7.15.2.

Alpha blending combines a graphics overlay image with the video image according to an alpha value provided with each overlay pixel. The graphics overlay is taken from a pixel-packed YUV 4:2:2+alpha data structure in memory. In the YUV 4:2:2+alpha format, each pixel has a single alpha bit supplied as the LSB of the U and V pixels. The U byte LSB corresponds to the alpha for pixel Y0, and the V byte LSB to the alpha for pixel Y1. When the alpha bit is ‘0’, the ALPHA_ZERO register supplies the actual 8-bit alpha value. When the alpha bit is ‘1’, the ALPHA_ONE register supplies the 8-bit alpha value. In the YUV 4:2:2 format, only one set of U and V values is supplied for the two Y pixels, Y0 and Y1. In this case, the alpha bit in U0 determines the alpha value for U, Y0 and V. The alpha blend bit in V0 only sets the alpha value for Y1 and does not affect the U or V values.

The EVO uses the 8-bit content of the selected alpha blending register (ALPHA_ZERO or ALPHA_ONE) to determine the amount by which the overlay plane is merged with the image plane as follows. The least-significant 7 bits of the selected blending register encode 128 blending levels from 0 to 0x7F. The MSB is used to turn on blending (MSB = ‘0’) or to select the overlay plane as the only output (MSB = ‘1’), so all values between 0x80 and 0xFF select 100% overlay. Therefore, the total number of blending levels is 129: 128 variable blending values from 0 to 0x7F plus one ‘blending’ value from 0x80 to 0xFF for 100% overlay. An alpha value of 0 selects 100% image plane and 0% overlay. Similarly, a value of 0x40 selects 50% image and 50% overlay blending.

The equations for the blending are illustrated below.

Table 7-5. EVO Operating Modes

Mode      Function          Explanation
Video-refresh modes
0         YUV 4:2:2C-1×     YUV 4:2:2 co-sited, no scaling
1         YUV 4:2:2I-1×     YUV 4:2:2 interspersed, no scaling
2         YUV 4:2:0-1×      YUV 4:2:0, no scaling
3         Reserved
4         YUV 4:2:2C-2×     YUV 4:2:2 co-sited, horizontal 2× upscaling
5         YUV 4:2:2I-2×     YUV 4:2:2 interspersed, horizontal 2× upscaling
6         YUV 4:2:0-2×      YUV 4:2:0, horizontal 2× upscaling
7         Reserved
Data-transfer modes
8         data streaming    continuous transmission of raw 8-bit data with valid data pulse and level timing signals
9         message passing   transmission of raw 8-bit data with STMSG and ENDMSG timing signals
0xA–0xF   Reserved

Figure 7-28. Mirroring pixels in 2x upscaling.
(Diagram: six input pixels Y1 to Y6 upscaled to output pixels 1 to 12. Odd output pixels copy the inputs: Y’ = Y1, Y2, Y3, Y4, Y5 and, for output pixel 2N–1, Y6. Even output pixels 2 to 12 are Y’ = F(Y1,Y1,Y2,Y3), F(Y1,Y2,Y3,Y4), F(Y2,Y3,Y4,Y5), F(Y3,Y4,Y5,Y6), F(Y4,Y5,Y6,Y6) and, for output pixel 2N, F(Y5,Y6,Y6,Y5).)

7.15.2 Chroma Keying

If the EVO_ENABLE and KEY_ENABLE bits are set to ‘1’ in EVO_CTL, the PNX1300 activates chroma keying. The graphics overlay is taken from a pixel-packed YUV 4:2:2+alpha data structure in memory. The EVO_KEY register provides the value which signifies full transparency for the overlay. The overlay values (Y, U and V) are compared to the values stored in the KEY_Y, KEY_U and KEY_V bit-fields of the EVO_KEY register, which store the values to be compared to the Y, U, and V components, respectively, of the overlay for chroma keying. Bits that correspond to bits set in MASK_Y and MASK_UV are ignored for the comparison. When there is an exact match between the pixel value and the value in EVO_KEY (disregarding any bits masked by MASK_Y and MASK_UV), then the overlay value is not present in the output stream, resulting in full transparency.

The mask bits in EVO_MASK provide for varying degrees of precision in the chroma-key matching process. The EVO_MASK.MASK_Y field can mask from zero to four LSBs of the overlay Y component during the chroma key process. For example, setting MASK_Y = 1 eliminates the influence of the LSB of KEY_Y in the keying process. This can be used to widen the range of key matching to account for irregularities in the chroma-key video signal. Likewise, EVO_MASK.MASK_UV is used to mask from zero to four LSBs of the overlay U and V components during the chroma key process. For example, setting MASK_UV = 1 eliminates the influence of the LSB of KEY_U and KEY_V in the keying process.
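The masked comparison can be modeled as below. The sketch treats MASK_Y and MASK_UV as bit masks in which each set bit removes the corresponding bit from the comparison; that reading follows the statement that bits set in the masks are ignored, but the exact field encoding is an assumption, and the function name is illustrative.

```python
def chroma_key_match(y, u, v, key_y, key_u, key_v, mask_y=0, mask_uv=0):
    """Return True when an overlay pixel matches the key and is fully transparent."""
    keep_y = ~mask_y & 0xFF    # bits of Y that still take part in the comparison
    keep_uv = ~mask_uv & 0xFF  # bits of U and V that still take part
    return (((y ^ key_y) & keep_y) == 0
            and ((u ^ key_u) & keep_uv) == 0
            and ((v ^ key_v) & keep_uv) == 0)


# With MASK_Y = 1, an overlay Y that differs from KEY_Y only in its LSB still matches.
print(chroma_key_match(0x11, 0x80, 0x80, 0x10, 0x80, 0x80, mask_y=1))
```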

7.15.3 Programmable Clipping

If EVO_CTL.CLIPPING_ENABLE = 1, the EVO performs

fully-compliant programmable clipping. Clipping is per-

formed as the last step of the video pipeline, after chroma

keying and alpha blending. It is applied only on the image

areas (Field 1 and Field 2) defined by IMAGE_WIDTH,

IMAGE_HEIGHT, IMAGE_VOFF and IMAGE_HOFF in-

side the Active Video Area. Blanking values are not

clipped.

The EVO_CLIP MMIO register stores four 8-bit fields

used to clip output components. The Y output compo-

nent is clipped between the values stored in

LOWER_CLIPY and HIGHER_CLIPY. A value less than

or equal to LOWER_CLIPY is forced to LOWER_CLIPY

and a value greater than or equal to HIGHER_CLIPY is

forced to HIGHER_CLIPY.

The same behavior is implemented for U and V with the

values stored in the LOWER_CLIPUV and

HIGHER_CLIPUV fields.

This mode allows fully-compliant 16 to 235 Y clipping

and 16 to 240 Cb and Cr clipping to be programmed.

These are the default values of the EVO_CLIP register

after reset.

If CLIPPING_ENABLE = 0, the EVO clips Y, U and V be-

tween the default values 16 and 240, as it is implemented

in the TM-1000. When LOWER_CLIP{Y,UV} registers

are set to ‘0’ and HIGHER_CLIP{Y,UV} registers are set

to ‘255’, no clipping is performed.
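The clipping rule amounts to clamping each component between its lower and upper limit. A minimal sketch, using the reset defaults quoted above as default arguments; the function names are illustrative.

```python
def clip(value, lower, higher):
    """Force value into the range lower..higher."""
    return max(lower, min(higher, value))


def clip_yuv(y, u, v, lower_y=16, higher_y=235, lower_uv=16, higher_uv=240):
    """Clip one output pixel using the LOWER_/HIGHER_ CLIPY and CLIPUV limits."""
    return (clip(y, lower_y, higher_y),
            clip(u, lower_uv, higher_uv),
            clip(v, lower_uv, higher_uv))


print(clip_yuv(255, 0, 250))           # limited to the default ranges
print(clip_yuv(255, 0, 250, 0, 255, 0, 255))  # limits of 0 and 255: no clipping
```

With the default limits, the values 0x00 and 0xFF can never appear in image data, which is what keeps the output free of accidental SAV/EAV preambles.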

7.16 MMIO REGISTERS

The MMIO registers are in two groups:

• VO registers — control basic VO functions (those

shared with the TM-1000 VO unit)

• EVO registers — control new EVO unit functions

(those new in TM-1100/TM-1300/PNX1300)

VO MMIO registers are shown in Figure 7-29. VO MMIO functionality is unchanged except where noted in the text (see, for instance, Section 7.16.1). The register fields are described in Table 7-6, Table 7-7 and Table 7-8. They are discussed in sections 7.16.1 through 7.18.1.

EVO MMIO registers are shown in Figure 7-30. EVO

MMIO register names are prefixed with “EVO_”. The

EVO_CTL register selectively enables new TM-

1100/TM-1300/PNX1300 functions. The register fields

are described in Table 7-9 and Table 7-10. They are dis-

cussed in sections 7.16.4 and 7.16.5.

To ensure compatibility with future devices, any unde-

fined MMIO bits should be ignored when read, and writ-

ten as ‘0’s.

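Alpha blending equations (Section 7.15.1):

if alpha[7] = 1 then
    output[7:0] = overlay[7:0]
else
    output[7:0] = (alpha[6:0] · overlay[7:0] + (~alpha[6:0] + 1) · image[7:0]) >> 7
(or)
    output[7:0] = (alpha[6:0] · (overlay[7:0] – image[7:0]) >> 7) + image[7:0]

Here ~alpha[6:0] is the 7-bit complement of alpha[6:0], so (~alpha[6:0] + 1) equals 128 – alpha[6:0].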
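A small model of the blending equations, for checking the examples given in Section 7.15.1. It reads the image weight as 128 – alpha[6:0], the reading under which both printed forms agree and an alpha of 0x40 gives a 50% blend; the function name is illustrative and only the first form of the equation is implemented.

```python
def alpha_blend(alpha, overlay, image):
    """Blend one 8-bit overlay component with one 8-bit image component."""
    if alpha & 0x80:
        return overlay & 0xFF  # MSB set: 100% overlay
    a = alpha & 0x7F           # 128 variable blending levels
    return (a * overlay + (128 - a) * image) >> 7


print(alpha_blend(0x00, 200, 100))  # 100% image
print(alpha_blend(0x40, 200, 100))  # 50% image, 50% overlay
print(alpha_blend(0x80, 200, 100))  # 100% overlay
```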


Figure 7-29. EVO MMIO registers.
(Register map diagram; bit positions are not reproduced here.)

Register     Access   MMIO_BASE offset   Field named on the same row
VO_STATUS    r        0x10 1800
VO_CTL       r/w      0x10 1804          MODE
VO_CLOCK     r/w      0x10 1808
VO_FRAME     r/w      0x10 180C
VO_FIELD     r/w      0x10 1810
VO_LINE      r/w      0x10 1814          VIDEO_PIXEL_START
VO_IMAGE     r/w      0x10 1818          IMAGE_HEIGHT
VO_YTHR      r/w      0x10 181C          Y_THRESHOLD
VO_OLSTART   r/w      0x10 1820          OL_START_LINE
VO_OLHW      r/w      0x10 1824
VO_YADD      r/w      0x10 1828          Y_BASE_ADR or BFR1BASE_ADR
VO_UADD      r/w      0x10 182C          U_BASE_ADR or BFR2BASE_ADR
VO_VADD      r/w      0x10 1830          V_BASE_ADR or SIZE1
VO_OLADD     r/w      0x10 1834          OL_BASE_ADR or SIZE2
VO_VUF       r/w      0x10 1838          U_OFFSET(16)
VO_YOLF      r/w      0x10 183C          Y_OFFSET(16)

Other field labels shown in the figure: FREQUENCY, FRAME_PRESET, FIELD_2_START, FRAME_LENGTH, F1_OLAP, F2_OLAP, F1_VIDEO_LINE, F2_VIDEO_LINE, FRAME_WIDTH, IMAGE_WIDTH, IMAGE_VOFF, IMAGE_HOFF, OL_START_PIXEL, GLOBAL ALPHA 0, GLOBAL ALPHA 1, OVERLAY_HEIGHT, OVERLAY_WIDTH, V_OFFSET(16), OL_OFFSET(16), RESET, SLEEPLESS, CLKOUT, SYNC_MASTER, VO_IO1_POS, VO_IO2_POS, OL_EN, BFR1_ACK, BFR2_ACK, HBE_ACK, URUN_ACK, YTR_ACK, BFR1_INTEN, BFR2_INTEN, HBE_INTEN, URUN_INTEN, YTR_INTEN, LTL_END, VO_ENABLE, CLOCK_SELECT, PLL_S, PLL_T, reserved, CUR_Y(12), CUR_X(12), BFR1_EMPTY, BFR2_EMPTY, HBE, URUN, YTR, FIELD2, VBLANK.


7.16.1 VO Status Register (VO_STATUS)

The VO_STATUS register is a read-only register that shows the current status of the EVO. Its fields are shown in Figure 7-29 and Table 7-6.

VO_STATUS[4] is now hard-wired to ‘1’. This allows software to determine if the unit is an EVO unit (containing extra MMIO registers) or a TM-1000 VO unit, as follows. In the TM-1000, this bit is a copy of the HBE flag (VO_STATUS[5]). In the EVO unit, it is hard-wired to ‘1’. Software can use this bit to determine the type of (E)VO unit by clearing the HBE bit then reading VO_STATUS[4]. If the bit remains ‘1’, the unit is an EVO.
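The detection sequence can be sketched as below. The two callables stand in for whatever MMIO access the platform provides: `read_vo_status` is assumed to return the 32-bit VO_STATUS value and `ack_hbe` is assumed to clear the HBE flag (for example through HBE_ACK in VO_CTL). Both are hypothetical placeholders, not functions of any real library.

```python
from typing import Callable

VO_STATUS_BIT4 = 1 << 4  # hard-wired to 1 in the EVO, copy of HBE in the TM-1000


def is_evo_unit(read_vo_status: Callable[[], int], ack_hbe: Callable[[], None]) -> bool:
    """Clear HBE, then test VO_STATUS[4]; the bit stays 1 only on an EVO unit."""
    ack_hbe()
    return bool(read_vo_status() & VO_STATUS_BIT4)


# Usage with a stand-in for an EVO, where bit 4 always reads as 1.
print(is_evo_unit(lambda: VO_STATUS_BIT4, lambda: None))
```

Passing the register accessors in as parameters keeps the sketch independent of any particular driver or memory-mapping mechanism.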

Table 7-6. VO_STATUS — status register fields

Field Description

CUR_Y Current Y.

Image line index of the current line in the current field being output by the EVO. CUR_Y reflects the current state of

the Image Line Counter. CUR_X and CUR_Y form a single 24-bit output data byte counter (CUR_X is the counter

LSBs) when the EVO is in data-streaming or message-passing mode. This counter reflects the status of the SIZE

counter for the currently active buffer. The two LSBs of this counter are not valid for reading during transfers; only

the upper 22 bits (the word count) are valid.

CUR_X Current X.

Image pixel index of the most-recently-output pixel. CUR_X reflects the current state of the Image Pixel Counter.

BFR1_EMPTY

BFR2_EMPTY Buffers 1 and 2 Empty.

These bits are valid in video-refresh, data-streaming and message-passing modes.

• In video-refresh modes, only Buffer 1 is used. BFR1_EMPTY indicates that the last byte of a field has been

transferred. It is actually raised at the completion of the transmission of the Overlap area of the field, as shown in

Figure 7-31. At this point, software should assign a new field of imagery to {Y,U,V}_BASE_ADR and perform a

BFR1_ACK. If BFR1_EMPTY is not cleared by BFR1_ACK before the active video area of the next field starts to

be emitted, the EVO sets the URUN bit.

• In data-streaming mode, BFR1_EMPTY and BFR2_EMPTY indicate that the last byte in their corresponding

buffer has been transferred. When BFR1_EMPTY or BFR2_EMPTY is set, transfer stops from the correspond-

ing buffer.

• In message passing mode, BFR1_EMPTY signals completion of message transmission.

These bits cause an interrupt if their interrupt-enable bits are set. One interrupt per buffer is signaled.

HBE Highway Bandwidth Error.

HBE is set when the highway fails to respond in time to a highway read request and data was not ready in time to

be set on EVO data lines. HBE can be set in both image- and data-transfer modes. HBE indicates insufficient bandwidth was requested from the highway arbiter.

1 EVO unit indicator.

This bit allows software to determine if the unit is an EVO (containing extra MMIO registers) or a TM-1000 VO unit.

In the TM-1000, this bit is a copy of the HBE flag. In the EVO unit, it is hard-wired to ‘1’. Software can easily deter-

mine the type of video output unit by clearing the HBE bit then reading this bit.

YTR Y threshold.

In video-refresh modes, YTR indicates that the Image Line Counter value is equal to the Y_THRESHOLD value in

VO_YTHR. The Y_THRESHOLD value can be set to provide an interrupt on any line in the valid image area.

URUN Underrun.

In video-refresh and data-streaming mode, this bit indicates that the CPU did not perform an acknowledge to indi-

cate updated address pointers for the next field or buffer in time for continuous image or data transfer. URUN

causes an interrupt if the corresponding interrupt-enable condition is set.

• In video-refresh modes, URUN indicates that the SAV code marking beginning of active video has been gener-

ated without BFR1_ACK being set by the CPU. (Setting BFR1_ACK to ‘1’ clears BFR1_EMPTY). In this case,

video refresh continues with previous address pointers.

• In data-streaming mode, URUN indicates the last byte in the active buffer was transferred, and no BFR1_ACK or

BFR2_ACK occurred to enable the next buffer. In this case, transfer continues with previous address pointers.

FIELD2 Field 2 or Buffer 2 active.

• In data-streaming mode, FIELD2 = 0 when Buffer 1 is active; FIELD2 = 1 when Buffer 2 is active.

• In video-refresh modes, FIELD2 indicates that the EVO is actively sending out a video image for Field 2, as

defined by Figure 7-31.

VBLANK Vertical blanking.

Indicates that the EVO is in a vertical-blanking interval. VBLANK is asserted only in video-refresh modes.
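The CUR_X/CUR_Y byte counter described in Table 7-6 can be sketched as follows. This is a minimal illustration, assuming that CUR_X and CUR_Y are each 12 bits wide (the table only states that together they form a 24-bit counter with CUR_X as the LSBs); the function names are hypothetical.

```c
#include <stdint.h>

/* Assumption: CUR_X and CUR_Y are 12 bits each, giving the 24-bit counter
 * used in data-streaming and message-passing modes. */
#define EVO_CUR_FIELD_BITS 12u
#define EVO_CUR_FIELD_MASK ((1u << EVO_CUR_FIELD_BITS) - 1u)

/* Combine the two fields; CUR_X supplies the least-significant bits. */
static inline uint32_t evo_byte_counter(uint32_t cur_x, uint32_t cur_y)
{
    return ((cur_y & EVO_CUR_FIELD_MASK) << EVO_CUR_FIELD_BITS) |
           (cur_x & EVO_CUR_FIELD_MASK);
}

/* The two LSBs are not valid during transfers, so only the upper 22 bits
 * (the word count) are returned. */
static inline uint32_t evo_word_count(uint32_t cur_x, uint32_t cur_y)
{
    return evo_byte_counter(cur_x, cur_y) >> 2;
}
```

Software that polls transfer progress would normally use the word count, since the byte-level bits may be stale while a transfer is running.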


7.16.2 VO Control Register (VO_CTL)

The VO_CTL register sets the operating mode, enables
interrupts, clears interrupt flags, and initiates EVO operations.
Its fields are unchanged from the TM-1000, as shown in
Figure 7-29 and Table 7-7. However, the precise functionality
implemented by a field may change if PNX1300 functionality
is enabled by software. Its hardware reset value is 0x32400000,
which sets CLOCK_SELECT = 3, PLL_S = 1, and PLL_T = 1, and
all other bits to ‘0’. To ensure compatibility with future
devices, any undefined MMIO bits should be ignored when
read, and written as ‘0’s.
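The reset value can be decoded as a quick sanity check. The sketch below infers the field positions from the statement that 0x32400000 sets CLOCK_SELECT = 3, PLL_S = 1 and PLL_T = 1; the 3-bit width assumed for PLL_S and PLL_T is a guess from the spacing of the set bits, so Figure 7-29 remains the authority. All macro and function names are hypothetical.

```c
#include <stdint.h>

/* Positions inferred from the reset value 0x32400000. The 3-bit width of
 * PLL_S and PLL_T is an assumption, not taken from Figure 7-29. */
#define VO_CTL_CLOCK_SELECT_SHIFT 28u
#define VO_CTL_CLOCK_SELECT_MASK  0x3u
#define VO_CTL_PLL_S_SHIFT        25u
#define VO_CTL_PLL_T_SHIFT        22u
#define VO_CTL_PLL_DIV_MASK       0x7u

static inline uint32_t vo_ctl_clock_select(uint32_t vo_ctl)
{
    return (vo_ctl >> VO_CTL_CLOCK_SELECT_SHIFT) & VO_CTL_CLOCK_SELECT_MASK;
}

static inline uint32_t vo_ctl_pll_s(uint32_t vo_ctl)
{
    return (vo_ctl >> VO_CTL_PLL_S_SHIFT) & VO_CTL_PLL_DIV_MASK;
}

static inline uint32_t vo_ctl_pll_t(uint32_t vo_ctl)
{
    return (vo_ctl >> VO_CTL_PLL_T_SHIFT) & VO_CTL_PLL_DIV_MASK;
}

/* A divider field value k selects division by k + 1. */
static inline uint32_t vo_pll_division(uint32_t k)
{
    return k + 1u;
}
```

With the reset value, both dividers decode to 1, which corresponds to division by 2.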

Table 7-7. VO_CTL register fields

Field Description

RESET Software reset of the EVO.

The recommended software reset procedure is as follows.

• Write the desired VO_CTL state with the RESET bit set to ‘1’.

• Write the desired VO_CTL state word, this time with the RESET bit cleared to ‘0’. Both writes should have

VO_ENABLE set to 0.

• Finally, enable the newly selected mode by setting VO_ENABLE. This step should be done last, as a separate

transaction.

After a software reset, 5 VO_CLK clock cycles are required to stabilize the internal circuitry (before enabling EVO).

Note: A hardware reset clears the CLKOUT and SYNC_MASTER bits and puts VO_CLK, VO_IO1, and VO_IO2 in

the input state. This results in a VO_CTL value of 0x32400000. In contrast, a software reset does not change

device registers. So a software reset results in a state as specified by the VO_CTL word value written during the

above-described procedure.

SLEEPLESS Disable power management.

If SLEEPLESS = 1, power-down of the EVO is prevented during global PNX1300 power-down.

CLOCK_SELECT Clock select.

00 — Select PLL VCO output as the VO_CLK source.

01 — Select PLL feedback loop divider output as VO_CLK source.

10 — Select PLL input divider output as VO_CLK source.

11 — Select DDS output directly as VO_CLK source, bypassing the PLL altogether. (Hardware reset default.)

PLL_S PLL input divider division ratio.

A value of k selects division by k+1. The hardware reset default = 1, causing division by 2.

PLL_T PLL feedback loop divider division ratio.

A value of k selects division by k+1. The hardware reset default = 1, causing division by 2.

CLKOUT Clock output.

• When CLKOUT = 1, the EVO clock generator is enabled, and VO_CLK is an output.

• When CLKOUT = 0, VO_CLK is an input, and EVO clock is provided by the external device. (Hardware reset

default.)

SYNC_MASTER Sync master.

• When set, VO_IO1 and VO_IO2 are outputs. In video-refresh modes, the EVO generates horizontal and frame
timing signals on VO_IO1 and VO_IO2 respectively. In message-passing mode and data-streaming mode, this
bit should always be set so that VO_IO1 and VO_IO2 generate START and END message signals respectively.

• When zero, VO_IO2 is an input. (Hardware reset default.) In video-refresh modes, VO_IO2 serves as the frame
time reference. The active edge is selected by VO_IO2_POS.

VO_IO1_POS

VO_IO2_POS Polarity of VO_IOx.

VO_IO1_POS currently has no function.

VO_IO2_POS determines the input polarity of VO_IO2.

• When ‘0’, the corresponding input triggers on the negative (high-to-low) transition of the input signal.

• When ‘1’, the input triggers on the positive (low-to-high) transition.

OL_EN Overlay Enable.

Enables the YUV overlay function in video-refresh modes.

MODE Major operating mode.

Defines the video output major operating mode, as listed in Table 7-5 on page 7-13.

BFR1_ACK

BFR2_ACK Buffer 1 and Buffer 2 acknowledge.

When active in data-transfer modes, writing a ‘1’ to BFR1_ACK clears BFR1_EMPTY and enables Buffer 1 for
transfer until BFR1_EMPTY is set. Writing a ‘0’ to BFR1_ACK has no effect. BFR2_ACK operates similarly for
Buffer 2. Writing a ‘1’ to VO_ENABLE in data-streaming mode is the same as writing a ‘1’ to both BFR1_ACK and
BFR2_ACK, and enables both buffers 1 and 2 for transfer. Writing a ‘1’ to VO_ENABLE in message-passing mode
is the same as writing a ‘1’ to BFR1_ACK, and enables Buffer 1 for transfer. BFR2_ACK is not used in
message-passing mode, since only Buffer 1 is used.

HBE_ACK

URUN_ACK Acknowledge HBE or URUN.

Writing a ‘1’ to these bits clears the HBE or URUN flags and resets their corresponding interrupt conditions.

YTR_ACK Acknowledge Y threshold.

Writing a ‘1’ to this bit clears the YTR flag and resets its interrupt condition. YTR signals the CPU to set new
pointers for the next field. If YTR_ACK is not received by the time the active image area for the next field starts, the
URUN flag is set. Data transfer continues with the old pointer values.

BFR1_INTEN
BFR2_INTEN
HBE_INTEN
URUN_INTEN
YTR_INTEN Enable interrupt conditions.

Enable corresponding interrupts to be generated when the BFR1_EMPTY, BFR2_EMPTY, HBE, URUN
(underrun/end of transfer), and YTR (end of field/buffer) flags are set, respectively.

Note: BFR2_INTEN, URUN_INTEN, YTR_INTEN must be 0 in message-passing mode.

LTL_END Little-endian.

Specifies that data in SDRAM is stored in little-endian format. This only affects the overlay packed-image format
interpretation in video-refresh modes. Refer to Appendix C, “Endian-ness,” for details on byte ordering.

VO_ENABLE Enable the EVO to send image data or message data to its output.

Note: This bit should not be simultaneously asserted with the RESET bit. The correct sequence to reset and
enable the EVO is as follows.

• Set all VO_CTL control fields as desired, writing VO_CTL with RESET = 1, VO_ENABLE = 0.

• Retain all desired values of control fields, but rewrite VO_CTL with RESET = 0, VO_ENABLE = 0.

• Finally, still retaining all desired control fields, rewrite VO_CTL with RESET = 0, VO_ENABLE = 1.

Setting VO_ENABLE in video-refresh modes starts the EVO sending image data beginning with the first pixel in
the image. Setting VO_ENABLE in data-streaming and message-passing modes starts the EVO sending data
beginning with the first byte in Buffer 1. In video-refresh and data-streaming modes, VO_ENABLE remains set until
cleared by the CPU. In message-passing mode, VO_ENABLE is cleared when BFR1_EMPTY is set, indicating the
end of message transfer.

Note: De-asserting VO_ENABLE in video-refresh modes causes SDRAM reads to stop, but sync framing and
BFR1_EMPTY generation and interrupts remain fully operational. The transmitted active image data is undefined
in this case. To fully halt video output, a software reset is required.

7.16.3 VO-Related Registers

The VO-related registers and their fields are shown in
Table 7-8. Their fields are unchanged from the TM-1000;
however, their function may vary depending upon the
PNX1300 features that are selectively enabled by
EVO_CTL (see Section 7.16.4).

Table 7-8. VO register fields

VO_CLOCK FREQUENCY VO_CLK frequency. See DDS equation in Figure 7-6, and PLL description in Section 7.19.

VO_FRAME FRAME_LENGTH Total number of lines per frame; the ending value of the Frame Line Counter; typically 525

or 625. Note: the Frame Line Counter counts from 1 to 525 or 625, consistent with

CCIR 656 line numbering.

FIELD_2_START Start line number in the Frame Line Counter; where the second field of the frame begins.

If non-interlaced pictures are desired, then the same value is programmed for Field 1 and

Field 2. Field 1 becomes Frame 1 and Field 2 becomes Frame 2.

FRAME_PRESET Value loaded into the Frame Line Counter when frame timing edge is received on

VO_IO2.

VO_FIELD F1_VIDEO_LINE Line number in the Frame Line Counter of the first active video line of Field 1 of the frame.

F2_VIDEO_LINE Line number in the Frame Line Counter of the first active video line of Field 2 of the frame.

If non-interlaced pictures are desired, this is programmed to the same value as

F1_VIDEO_LINE

F1_OLAP Overlap of the SAV and EAV codes from Field 1 to Field 2. Overlap is defined as the delay

in lines from start of blanking for Field 2 until SAV and EAV codes for Field 2 are emitted.

Typical values are +2 for 525/60 and +2 for 625/50.

F2_OLAP Overlap in lines of the SAV and EAV code from Field 2 to Field 1. Overlap is defined as

the delay in lines from start of blanking for Field 1 until the SAV and EAV codes for Field 1

are emitted. Typical values are +3 for 525/60 and –2 for 625/50. The negative value

means Field 1 blanking actually starts two lines before end of Field 2 of previous frame.

This overlap is described in Table 7-4 on page 7-6, and illustrated in Figure 7-31.


VO_LINE FRAME_WIDTH Total line length in pixels including blanking. Also the ending value for the Frame Pixel

Counter. Lines always begin with a horizontal blanking interval, and the image starts after

the blanking interval and runs to the end of the line.

VIDEO_PIXEL_START Pixel number in Frame Pixel Counter of starting pixel of active video area within the line.

Note: Must be even.

VO_IMAGE IMAGE_HEIGHT Video Image height in lines.

IMAGE_WIDTH Video Image line (scaled) output width in pixels. Must be even for upscaling by 2.

VO_YTHR Y_THRESHOLD Threshold image line number in the Image Line Counter for the YTR interrupt.

Can be reprogrammed on a frame-by-frame basis.

IMAGE_VOFF Image vertical offset in lines from the top of the active video window.

IMAGE_HOFF Image horizontal offset in pixels from the start of the active video window.

VO_OLSTART OL_START_LINE Starting image line of YUV overlay within the image.

Zero indicates that the overlay starts at the same line as the image.

OL_START_PIXEL Starting image pixel of the YUV overlay within the image. ‘0’ indicates that the overlay

starts at same pixel as the image. Note: Must be even.

ALPHA_ONE Alpha blend value used for YUV 4:2:2+alpha format overlays when the alpha bit = 1.

VO_OLHW OVERLAY_HEIGHT Height of the YUV overlay image in lines. Note: The height of the overlay should be cho-

sen such that it does not extend beyond the image area.

OVERLAY_WIDTH Width of the YUV overlay image in pixels. Note: Must be even.

ALPHA_ZERO Alpha blend value used for YUV 4:2:2+alpha format overlays when the alpha bit = 0.

VO_YADD Y_BASE_ADR

BFR1BASE_ADR Y-component buffer address or Buffer 1 address.

• In video-refresh modes: Y-component starting byte address.

• In data-streaming and message-passing modes: Buffer 1 starting byte address. Note:

must be 64-byte aligned in data-streaming mode and 4-byte aligned in message pass-

ing mode.

VO_UADD U_BASE_ADR

BFR2BASE_ADR U-component buffer address or Buffer 2 address.

• In video-refresh modes: U-component starting byte address

• In data-streaming mode: Buffer 2 starting byte address; must be 64-byte aligned

• Not used in message-passing mode

VO_VADD V_BASE_ADR

SIZE1 V-component buffer address or Buffer 1 length.

• In video-refresh modes: V-component starting byte address

• In data-streaming and message-passing modes: Buffer 1 length in bytes. Note: must be

a multiple of 64 in data-streaming mode. SIZE1 is limited to 24 bits.

VO_OLADD OL_BASE_ADDR

SIZE2 Overlay-image buffer address or Buffer 2 length.

• In video-refresh modes: overlay-image starting byte address. OL_BASE can be repro-

grammed on a frame-by-frame basis.

• In data-streaming mode: Buffer 2 length in bytes. Note: Must be multiple of 64 in data-

streaming mode; Not used in message-passing mode.

VO_VUF U_OFFSET Offset in bytes from start of one line to start of next line (16-bits unsigned).

V_OFFSET Offset in bytes from start of one line to start of next line (16-bits unsigned).

VO_YOLF Y_OFFSET Offset in bytes from start of one line to start of next line (16-bits unsigned).

OL_OFFSET Offset in bytes from start of one line to start of next line (16-bits unsigned).


7.16.4 EVO Control Register (EVO_CTL)

PNX1300 EVO features are enabled by setting the ap-

propriate fields of the EVO_CTL register shown in

Figure 7-30. The register fields are described in

Table 7-9. If these features are enabled, the new PNX1300

functionality replaces the TM-1000 functions.

The hardware reset value of the EVO_CTL register is
0x10000000, which means that EVO functions are disabled
on reset and must be enabled by software. The four
most-significant bits indicate the EVO revision number.
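As a small illustration of the reset value, the revision number can be read from the top nibble of EVO_CTL. The function names are hypothetical; the only facts used are the reset value 0x10000000 and the statement that the four most-significant bits hold the revision.

```c
#include <stdbool.h>
#include <stdint.h>

/* The four most-significant bits of EVO_CTL hold the EVO revision number. */
static inline uint32_t evo_ctl_revision(uint32_t evo_ctl)
{
    return evo_ctl >> 28;
}

/* True when no bit below the revision nibble is set, as after a hardware
 * reset (all EVO features disabled). */
static inline bool evo_ctl_features_all_off(uint32_t evo_ctl)
{
    return (evo_ctl & 0x0FFFFFFFu) == 0u;
}
```

For the hardware reset value this yields revision 1 with every feature bit cleared.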

To ensure compatibility with future devices, any unde-

fined MMIO bits should be ignored when read, and writ-

ten as ‘0’s.

[Figure 7-30 is a register bit-layout diagram. Only the register names, MMIO_BASE offsets, and field names are recoverable here; bit positions are not.]

Register (r/w)   MMIO_BASE offset   Fields
EVO_CTL          0x10 1840          EVO revision (four MSBs), GENLOCK, FULL_BLENDING, KEY_ENABLE, FIELD_SYNC, SYNC_STREAMING, CLIPPING_ENABLE, EVO_ENABLE
EVO_MASK         0x10 1844          MASK_Y, MASK_UV
EVO_CLIP         0x10 1848          HIGHER_CLIPUV, LOWER_CLIPUV, HIGHER_CLIPY, LOWER_CLIPY
EVO_KEY          0x10 184C          KEY_Y, KEY_U, KEY_V
EVO_SLVDLY       0x10 1850          SLAVE_DLY

Figure 7-30. EVO MMIO registers.

Table 7-9. EVO_CTL Register Fields

EVO_CTL EVO_ENABLE When set to 1, EVO features are enabled. When set to 0 (the hardware reset value), the EVO

behaves exactly like a TM-1000 VO unit. Default: 0.

FULL_BLENDING Activates full 8-bit alpha blending when set to 1. When set to 0, only the original five TM-1000

blending levels are implemented (0%, 25%, 50%, 75%, 100%). Default: 0.

CLIPPING_ENABLE When set to 1, the values stored in EVO_CLIP are used for the clipping of output data. Otherwise,

TM-1000 default values (240 and 16 for Y, U and V) are used. Default: 0.

SYNC_STREAMING When set to 1 in data-streaming mode, VO_IO2 generates a DATA_VALID signal. See Section

7.18.2, “Data-transfer Modes”. Default: 0.

FIELD_SYNC When set, VO_IO2 will generate frame synchronization signal that follows the field number in

SAV/EAV codes (Field1 gives a low VO_IO2, Field2 gives a high VO_IO2). Default: 0.

GENLOCK Activates Genlock mode when set to 1 and VO_CTL.SYNC_MASTER = 0. Default: 0.

KEY_ENABLE When set, this bit activates chroma key. The overlay values (Y, U and V) are compared to the val-

ues stored in the EVO_KEY register. Bits that correspond to bits set in MASK_Y and MASK_UV

are ignored for the comparison. When there is an exact match between the pixel value and the

value in EVO_KEY register (less the bits selected by MASK_Y and MASK_UV), then the overlay

value is not present in the output stream, resulting in full transparency.

The key is 24 bits (Y, U and V are 8 bits each). Default: 0.


7.16.5 EVO-Related Registers

As shown in Figure 7-30, four additional registers are in-

troduced in the PNX1300, as follows.

• EVO_MASK and EVO_KEY — used in chroma key

(see Section 7.15.2).

• EVO_CLIP — provides programmable clipping (see

Section 7.15.3).

• EVO_SLVDLY — used in Genlock mode (see

Section 7.10).

These registers are shown in Figure 7-30, and their reg-

ister fields are shown in Table 7-10.

To ensure compatibility with future devices, any unde-

fined MMIO bits should be ignored when read, and writ-

ten as ‘0’s.

7.17 ENHANCED VIDEO OUT OPERATION

As described in Section 7.14, the EVO operates in either
video-refresh or data-transfer modes. The DSPCPU
starts the EVO by setting the appropriate VO MMIO registers
and the appropriate EVO MMIO registers.
VO_CTL.MODE must be set to the appropriate transfer
mode, and the appropriate addresses, address offsets,
image timing registers, and associated control bits in the
control register must be set. Lastly, software sets
VO_CTL.VO_ENABLE to begin EVO operation. The
EVO transfers the image, data, or message as commanded.
In video-refresh and data-streaming modes,
the EVO runs continuously. In message-passing mode,
the EVO runs only until the message has been transferred.

The EVO unit is reset by a PNX1300 hardware reset, or

by a software reset, as described in Table 7-7 for the RE-

SET bit.
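The software reset procedure from Table 7-7 can be sketched as three successive VO_CTL words. The bit positions of RESET and VO_ENABLE below are placeholders chosen for illustration only; the real positions must be taken from Figure 7-29. The function name is hypothetical as well.

```c
#include <stdint.h>

/* Placeholder bit positions (assumptions); see Figure 7-29 for real ones. */
#define VO_CTL_RESET     (1u << 31)
#define VO_CTL_VO_ENABLE (1u << 0)

/* Fill seq[0..2] with the three VO_CTL words of the reset procedure.
 * 'desired' is the wanted VO_CTL state; its RESET and VO_ENABLE bits are
 * ignored and driven by the sequence itself. */
static inline void evo_reset_sequence(uint32_t desired, uint32_t seq[3])
{
    uint32_t base = desired & ~(VO_CTL_RESET | VO_CTL_VO_ENABLE);

    seq[0] = base | VO_CTL_RESET;      /* RESET = 1, VO_ENABLE = 0 */
    seq[1] = base;                     /* RESET = 0, VO_ENABLE = 0 */
    seq[2] = base | VO_CTL_VO_ENABLE;  /* RESET = 0, VO_ENABLE = 1 */
}
```

A driver would write the three words to VO_CTL in order as separate transactions, waiting at least 5 VO_CLK cycles after the second write before issuing the third.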

The VO_CLK signal is normally set as an output to drive

the data transfer for all modes at a programmable rate.

The VO_CLK signal can be an input or output, as con-

trolled by the VO_CTL. CLKOUT bit. When

CLKOUT = 1, VO_CLK is an output, and its frequency is

set by the VO_CLOCK register value. When

CLKOUT = 0, VO_CLK is an input and the EVO gener-

ates data at the clock rate of the sender.

In video-refresh modes, the EVO receives or generates

horizontal and frame synchronization signals on the

VO_IO1 and VO_IO2 lines, as described in

Section 7.9.4.

7.17.1 Video Refresh Modes

In video-refresh mode, the EVO transfers an image from
SDRAM to the EVO port. The VO_CTL.MODE field defines
the video image memory data format and determines
whether the EVO is to perform horizontal upscaling
(see Table 7-5). The EVO accepts memory image
data in YUV 4:2:2 co-sited, YUV 4:2:2 interspersed and
YUV 4:2:0 formats, and generates a CCIR 656-compatible,
YUV 4:2:2 co-sited image output stream. Scaling is
identified by the YUV-1× and YUV-2× modes. In YUV-1×
modes, luminance and chrominance pass unmodified. In
YUV-2× modes, luminance and chrominance are horizontally
upscaled by a factor of two.

During video refresh, the VO_STATUS. YTR bit is set

when the Image Line Counter reaches the

Y_THRESHOLD value. When an image field has been

transferred, the VO_STATUS. BFR1_EMPTY bit is set.

The DSPCPU is interrupted when either the YTR or

BFR1_EMPTY flag is set and its corresponding interrupt

is enabled. To maintain continuous transfer of image

fields, the DSPCPU supplies new pointers for the next

field following each BFR1_EMPTY interrupt. If the

DSPCPU does not supply new pointers before the next

field, the URUN bit is set, and the EVO uses the same

pointer values until they are updated.

Table 7-10. EVO-Related MMIO Registers Fields

EVO_MASK MASK_Y This 4-bit value is used to mask the four lower bits of the overlay Y component during

the chroma key process. Example: Setting MASK_Y to ‘1’ will eliminate the influence of

the LSB of KEY_Y in the keying process.

MASK_UV This 4-bit value is used to mask the four lower bits of the overlay U and V components

during the chroma key process. Example: Setting MASK_UV to ‘1’ will eliminate the

influence of the LSB of KEY_U and KEY_V in the keying process.

EVO_CLIP LOWER_CLIPY A Y value lower than or equal to LOWER_CLIPY is forced to LOWER_CLIPY. Default: 16.

HIGHER_CLIPY A Y value higher than or equal to HIGHER_CLIPY is forced to HIGHER_CLIPY. Default: 235.

LOWER_CLIPUV A U or V value lower than or equal to LOWER_CLIPUV is forced to LOWER_CLIPUV. Default: 16.

HIGHER_CLIPUV A U or V value higher than or equal to HIGHER_CLIPUV is forced to HIGHER_CLIPUV. Default: 240.

EVO_KEY KEY_Y Value compared to the Y component of the overlay for chroma keying.

KEY_U Value compared to the U component of the overlay for chroma keying.

KEY_V Value compared to the V component of the overlay for chroma keying.

EVO_SLVDLY SLAVE_DLY Number of VO_CLK cycles of internal delay for VO_IO2 in Genlock mode.
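The masked chroma-key comparison and the programmable clipping described in Tables 7-9 and 7-10 can be modelled in software as below. This is a behavioural sketch only, with hypothetical function names; it assumes that each set bit of the 4-bit MASK_Y or MASK_UV value removes the same-numbered low-order bit from the comparison.

```c
#include <stdbool.h>
#include <stdint.h>

/* Compare one 8-bit overlay component against its key value. mask4 is the
 * 4-bit MASK_Y or MASK_UV value; set bits are ignored in the comparison. */
static inline bool evo_component_matches(uint8_t pixel, uint8_t key, uint8_t mask4)
{
    unsigned diff = (unsigned)(pixel ^ key) & 0xFFu;
    unsigned ignore = mask4 & 0x0Fu;

    return (diff & ~ignore) == 0u;
}

/* An overlay pixel is keyed out (fully transparent) when Y, U and V all
 * match EVO_KEY after masking. */
static inline bool evo_chroma_key_match(uint8_t y, uint8_t u, uint8_t v,
                                        uint8_t key_y, uint8_t key_u, uint8_t key_v,
                                        uint8_t mask_y, uint8_t mask_uv)
{
    return evo_component_matches(y, key_y, mask_y) &&
           evo_component_matches(u, key_u, mask_uv) &&
           evo_component_matches(v, key_v, mask_uv);
}

/* Clip a component to the [lower, higher] range held in EVO_CLIP. */
static inline uint8_t evo_clip(uint8_t value, uint8_t lower, uint8_t higher)
{
    if (value <= lower) {
        return lower;
    }
    if (value >= higher) {
        return higher;
    }
    return value;
}
```

For example, with MASK_UV = 1 an overlay U value of 0x81 still matches KEY_U = 0x80, because only the LSB differs.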


Graphics Overlay

The graphics overlay is enabled by the VO_CTL.OL_EN
bit. The graphics overlay is typically a software-generated
graphic overlaid onto the output video image stream.
The graphics overlay is either generated in YUV by the
DSPCPU or converted by the DSPCPU from an RGB to
a YUV overlay image. RGB-to-YUV conversion can lose
information, so it is left to the DSPCPU, which has the
most information about how best to perform it.

The overlay height should be chosen such that the overlay
does not vertically extend beyond the image area. A
height greater than this causes undefined results and
may result in vertical overlay wraparound.

Note: The emitted byte data rate is limited to 45% of the

SDRAM clock when overlays are enabled.

The YUV overlay logic assembles the U0, Y0, V0, Y1
bytes for a pair of YUV 4:2:2 pixels for both the main
image and the overlay image. The alpha bit for pixel 0 (the
LSB of the U0 byte of the overlay image) selects
ALPHA_ZERO or ALPHA_ONE as the alpha source,
and the alpha blend logic combines U0, Y0, and V0 from
the main and overlay images to generate the U0, Y0 and
V0 output values. The alpha bit for pixel 1 (the LSB of the
V0 byte of the overlay image) selects ALPHA_ZERO or
ALPHA_ONE as the alpha source for blending the Y1
pixels to generate the Y1 output value. The alpha-blended
U0, Y0, V0 and Y1 bytes are sent to the EVO output
port in the YUV 4:2:2 sequence. The overlay U and V
values used assume an LSB of zero.
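The alpha-source selection for a pixel pair can be sketched as follows. The struct and function names are hypothetical, and the sketch covers only the selection and the LSB handling stated above; the blend arithmetic itself is not specified in this section and is deliberately left out.

```c
#include <stdint.h>

/* One pair of YUV 4:2:2 overlay pixels, in output byte order. */
typedef struct {
    uint8_t u0;
    uint8_t y0;
    uint8_t v0;
    uint8_t y1;
} evo_pixel_pair;

/* Pixel 0: the alpha bit is the LSB of the overlay U0 byte. */
static inline uint8_t evo_alpha_pixel0(evo_pixel_pair ol, uint8_t alpha_zero, uint8_t alpha_one)
{
    return (ol.u0 & 1u) ? alpha_one : alpha_zero;
}

/* Pixel 1: the alpha bit is the LSB of the overlay V0 byte. */
static inline uint8_t evo_alpha_pixel1(evo_pixel_pair ol, uint8_t alpha_zero, uint8_t alpha_one)
{
    return (ol.v0 & 1u) ? alpha_one : alpha_zero;
}

/* The overlay U and V values are used with their LSB taken as zero. */
static inline uint8_t evo_overlay_chroma(uint8_t c)
{
    return (uint8_t)(c & 0xFEu);
}
```

Because the alpha bit shares a byte with chroma, overlay chroma effectively has 7 bits of precision.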

Video Image Addressing

The output image is read from SDRAM at a location defined
by Y_BASE_ADR, Y_OFFSET, U_BASE_ADR,
U_OFFSET, V_BASE_ADR, and V_OFFSET. The default
memory packing is big-endian, although little-endian
packing is also supported by setting the
VO_CTL.LTL_END bit.

Horizontally-adjacent samples are stored at successive
byte addresses, resulting in a packed form (four 8-bit
samples are packed into one 32-bit word). Upon horizontal
retrace, the starting byte address for the next line is
computed by adding the corresponding offset value to
the previous line’s starting byte address. Note that
{OL,Y,U,V}_OFFSET values are 16-bit unsigned quantities.
This process continues until the total image (height
in lines and width in pixels per line) has been read from
memory for luminance (Y). For chrominance, the same
number of lines are read, but half the number of pixels
per line are read in YUV 4:2:2 and YUV 4:2:0 formats (see footnote 1).
The YUV 4:2:0 format has half the number of U and V
lines in memory that the YUV 4:2:2 formats have, but
each line of U and V data is read and used twice. See
Figure 7-19 through Figure 7-22.
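The line-address rule above amounts to a base address plus a multiple of the line offset. The sketch below uses hypothetical function names, and for YUV 4:2:0 it assumes that two consecutive output lines share one stored chroma line, which is one plausible reading of "read and used twice".

```c
#include <stdint.h>

/* Starting byte address of output line 'line' (0-based) for a plane whose
 * first line starts at base_adr and whose lines are 'offset' bytes apart.
 * The offset is a 16-bit unsigned quantity. */
static inline uint32_t evo_line_addr(uint32_t base_adr, uint16_t offset, uint32_t line)
{
    return base_adr + line * (uint32_t)offset;
}

/* YUV 4:2:0 chroma: half as many U and V lines are stored, and each is
 * used for two output lines (assumed here to be consecutive lines). */
static inline uint32_t evo_chroma_line_addr_420(uint32_t base_adr, uint16_t offset, uint32_t line)
{
    return evo_line_addr(base_adr, offset, line / 2u);
}
```

Lines therefore do not have to be contiguous in memory; any offset that fits in 16 bits is addressable.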

[Figure 7-31 is a frame-timing diagram with two panels, 525 Line / 60 Hz and 625 Line / 50 Hz. Each frame is divided into the regions Blanking: Field 1, Video Image: Field 1, Blanking: Field 1 Overlap, Blanking: Field 2, Video Image: Field 2, and Blanking: Field 2 Overlap. Line numbers marked: 264, 266, 283, 525 (525/60) and 311, 313, 336, 623, 624, 625 (625/50).]

Figure 7-31. EVO frame timing.

1. Note that consecutive pixel components of each line
are stored in consecutive memory addresses, but consecutive
lines need not be in consecutive memory addresses.


7.18 FRAME AND FIELD TIMING CONTROL

The frame timing for 525/60 and 625/50 timing cases is

shown pictorially in Figure 7-31. CCIR 656 line defini-

tions are used.

7.18.1 Recommended values for timing registers

The recommended values for the various fields of the

timing registers are shown in Table 7-11 for 525/60 and

625/50 timing cases. The FREQUENCY field value

shown is for 27 MHz assuming a DSPCPU clock of

143 MHz.
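The recommended FREQUENCY value can be cross-checked numerically. The DDS equation itself is in Figure 7-6 and is not reproduced in this section, so the relation used below is an assumption: FREQUENCY = 2^31 + round(f_out × 2^32 / (9 × f_DSPCPU)). It is offered only because it reproduces the 0x855EE191 value listed in Table 7-11 for 27 MHz at a 143 MHz DSPCPU clock; the function name is hypothetical.

```c
#include <stdint.h>

/* Assumed DDS relation (verify against Figure 7-6):
 *   FREQUENCY = 2^31 + round(f_out * 2^32 / (9 * f_dspcpu))
 * Intended for f_out well below 9 * f_dspcpu, so the quotient fits. */
static inline uint32_t evo_dds_frequency(uint32_t f_out_hz, uint32_t f_dspcpu_hz)
{
    uint64_t den = 9ull * f_dspcpu_hz;
    uint64_t num = (uint64_t)f_out_hz << 32;

    /* Adding den / 2 before dividing rounds to the nearest integer. */
    return 0x80000000u + (uint32_t)((num + den / 2u) / den);
}
```

If the DSPCPU clock differs from 143 MHz, the FREQUENCY value has to be recomputed, since the DDS output scales with that clock.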

7.18.2 Data-transfer Modes

In data-streaming and message-passing modes, the

EVO supplies a stream of 8-bit data to the

VO_DATA[7:0] lines at rates up to 81 MHz.

Note: In the PNX1300, the data-rate is limited to an 81-

MHz EVO clock.

Data is read from SDRAM in packed form (four 8-bit

bytes per 32-bit word). No data selection or data interpre-

tation is done, and data is transferred at one byte per

VO_CLK from successive byte addresses.

Note: Unused bits of the EVO MMIO registers must be

set to 0 when operating in data transfer modes.

Data-Streaming Mode. In data-streaming mode, data is

stored in SDRAM in two buffers.

When the EVO has transferred out the contents of one

buffer, it interrupts the DSPCPU and begins transferring

out the contents of the second buffer. The DSPCPU sup-

plies pointers to both buffers. The EVO can provide a

continuous stream of data to the EVO output if the

DSPCPU updates the pointer to the next buffer before

the EVO starts transferring data from that buffer.

Note: In this mode, SYNC_MASTER must be set to en-

sure correct operation of VO_IO1 and VO_IO2 as out-

puts.

When each buffer has been transferred, the corresponding
buffer-empty bit is set in the status register, and the
DSPCPU is interrupted if the buffer-empty interrupt is
enabled. To maintain continuous transfer of data, the
DSPCPU supplies new pointers for the next data buffer
following each buffer-empty interrupt. If the DSPCPU
does not supply new pointers before the next buffer is
required, the URUN bit is set, and the EVO uses the same
pointer values until they are updated.

When data-streaming mode is enabled and
EVO_ENABLE = 1 and SYNC_STREAMING = 1, the
VO_IO2 signal indicates a data-valid condition. This signal
is asserted when the EVO starts outputting valid data
(that is, data-streaming mode is enabled and video output
is running) and is de-asserted when data-streaming
mode is disabled. The VO_IO1 signal generates a pulse
one VO_CLK cycle before the first valid data is sent. See
Section 7.11 for timing signal details.

Message-Passing Mode. In message-passing mode

data is stored in SDRAM in one buffer.

Note: In this mode, SYNC_MASTER must be set to en-

sure correct operation of VO_IO1 and VO_IO2 as out-

puts.

When message passing is started by setting
VO_CTL.VO_ENABLE, the EVO sends a Start condition on
VO_IO1. When the EVO has transferred the contents of
the buffer, it sends an End condition on VO_IO2 as
shown in Figure 7-18, sets BFR1_EMPTY, and interrupts
the DSPCPU. The EVO stops, and no further operation
takes place until the DSPCPU sets VO_ENABLE
again to start another message, or until the DSPCPU
initiates another EVO operation. See Section 7.11 for timing
signal details.

7.18.3 Interrupts and Error Conditions

The EVO has five interrupt conditions defined by bits in
the VO_STATUS register: BFR1_EMPTY,
BFR2_EMPTY, HBE, URUN, and YTR. Each of these
conditions has a corresponding interrupt enable flag and
interrupt acknowledge bit in the VO_CTL register.

The EVO asserts a SOURCE 10 interrupt request to the
PNX1300 vectored interrupt controller as long as one or
more enabled events is asserted.

Note: The interrupt controller should always be programmed
such that the EVO interrupt operates in level-triggered
mode. This ensures that no EVO events can be
lost to the interrupt handler. Refer to Section 3.5.3, “INT
and NMI (Maskable and Non-Maskable Interrupts),” for
a description of setting level-triggered mode, as well as
for recommendations on writing interrupt handlers.

The BFR1_EMPTY, BFR2_EMPTY and YTR status

flags indicate to the DSPCPU that a buffer has been

emptied or that the Y threshold has been reached.

Table 7-11. Timing register recommended values

Register   Field              525/60 Value   625/50 Value
VO_CLOCK   FREQUENCY          0x855E,E191    0x855E,E191
VO_FRAME   FRAME_LENGTH       525            625
           FIELD_2_START      264            311
           FRAME_PRESET       1              1
VO_FIELD   F1_VIDEO_LINE      20             23
           F2_VIDEO_LINE      283            336
           F1_OLAP            2              2
           F2_OLAP            3              –2 (0xE)
VO_LINE    FRAME_WIDTH        858            864
           VIDEO_PIXEL_START  138            144
VO_IMAGE   IMAGE_HEIGHT       240            288
           IMAGE_WIDTH        720            720 (704 visible)

The buffer-underrun (URUN) status flag indicates that
the DSPCPU did not acknowledge a BFR1_EMPTY or
BFR2_EMPTY interrupt before the EVO required the
next buffer. In this case, the EVO uses the old address
pointer value and continues image or data transfer.
When the DSPCPU updates the pointer, the new pointer
value will be used at the start of the next frame or buffer
transfer. Therefore, the URUN flag can be interpreted as
indicating to the DSPCPU that the EVO is using its old
pointer values because it did not receive the new ones in
time.

Note: The actual buffer pointer write operation to the

MMIO registers is not seen by the hardware—only writ-

ing a ’1’ to the appropriate BFR1_ACK or BFR2_ACK

bits signals buffer availability.
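The ordering implied by this note (update the pointer first, then acknowledge) can be sketched against a small software model of the registers. The struct is a stand-in for illustration and does not reflect the real MMIO layout; the BFR1_ACK bit position and the function name are likewise hypothetical. The alignment and size checks follow the data-streaming constraints in Table 7-8.

```c
#include <stdint.h>

/* Software model of the three registers involved; not the real MMIO map. */
typedef struct {
    uint32_t vo_ctl;   /* VO_CTL */
    uint32_t vo_yadd;  /* VO_YADD: BFR1BASE_ADR in data-streaming mode */
    uint32_t vo_vadd;  /* VO_VADD: SIZE1 in data-streaming mode */
} evo_regs_model;

/* Placeholder bit position (assumption); see Figure 7-29 for the real one. */
#define VO_CTL_BFR1_ACK (1u << 8)

/* Hand a new Buffer 1 to the EVO in data-streaming mode.
 * Returns 0 on success, -1 if the buffer violates the Table 7-8 limits. */
static inline int evo_refill_buffer1(volatile evo_regs_model *regs,
                                     uint32_t base_adr, uint32_t size_bytes)
{
    if ((base_adr % 64u) != 0u || (size_bytes % 64u) != 0u || size_bytes > 0xFFFFFFu) {
        return -1;  /* must be 64-byte aligned, multiple of 64, at most 24 bits */
    }

    regs->vo_yadd = base_adr;         /* pointer write alone is not seen */
    regs->vo_vadd = size_bytes;
    regs->vo_ctl |= VO_CTL_BFR1_ACK;  /* the ACK write signals availability */
    return 0;
}
```

Writing the acknowledge last means the hardware never sees the buffer marked available while its address and size are still stale.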

The Hardware Bandwidth Error (HBE) flag indicates that

the EVO did not get data from SDRAM via the

PNX1300’s internal data highway in time to continue

data transfer or video refresh. Data or video refresh will

continue using whatever data is in the EVO internal data

buffers. The address counter for the failing buffer(s) will

continue to count, and the EVO will continue to request

data from the SDRAM over the highway.

The EVO is a read-only device, transferring data from

SDRAM to the EVO output port. Unlike Video In, the EVO

does not modify SDRAM data. URUN and HBE are the

only EVO error conditions that can arise. In the case of

URUN or HBE, a scrambled image may be temporarily

displayed or incorrect data may be temporarily sent. The

EVO can cause no other system hardware error condi-

tions.

Even changing operating modes cannot cause system
hardware error conditions to arise. For example, changing
the MODE bits, the OL_EN and format bits, or the
LTL_END bit while the EVO is running may cause wrong
data to be displayed or transferred. However, the EVO
does not detect this or stop for it.

In normal operation, the user should not change the

mode or transfer-control bits while the EVO is enabled.

The EVO should be disabled before changing bits such

as the MODE bits, the OL_EN bit, or the LTL_END bit.

However if these bits are changed while the EVO is run-

ning, they will take effect at the beginning of the next field

or buffer.

7.18.4 Latency and Bandwidth Requirements

To avoid Hardware Bandwidth Error (HBE) conditions,
the internal highway bus arbiter (see Chapter 20,
“Arbiter”) must be programmed according to the latency
requirements of the EVO unit described in this section.
The following discussion assumes that data for video
lines (in Y, U, V and overlay planar memory format) is
stored in memory aligned on 64-byte boundaries; in other
words, the {OL,Y,U,V}_OFFSET fields are multiples of
64 bytes. Otherwise, internal EVO arbitration for OL, Y,
U and V requests differs from what is described here,
and the following latencies are not guaranteed. The EVO
uses internal 64-byte buffers.

1. Latency requirements for the EVO in image mode

4:2:2 or 4:2:0 co-sited or interspersed without upscaling

and with overlay disabled are expressed as follows.

During 128 EVO clock cycles, the EVO block must

have 2 requests acknowledged, that is, ([2 Ys, 1 U and

1 V] / 2). For example, if the EVO clock is 27 MHz,

then the EVO must get two requests (128 bytes) from

SDRAM in 128 / 0.027 = 4740 ns.

The byte bandwidth B1x per video line within the ac-

tive image for this case is:

where ceil(X) is a function returning the least integral

value greater than or equal to X, and W is the

IMAGE_WIDTH field value.

2. In the same modes but with overlay enabled, the la-

tency is as follows:

• During the first 64 EVO clock cycles at least one

request must be acknowledged for the OL data.

• During 128 EVO clock cycles, the EVO unit must

have 4 requests acknowledged ([4 OLs, 2 Ys, 1 V

and 1 U] / 2).

For example, if the EVO clock runs at 54 MHz then

the EVO must get the first request from SDRAM in

64 / 0.054 = 1185 ns and must average a bandwidth latency

of 4 requests in 128 / 0.054 = 2370 ns.

Byte bandwidth B1x,OL per video line within the active

image is then as follows:

3. When the EVO is set to image mode with 2× upscaling,

the latency requirements are multiplied by a factor

of 2. For example, if 1× mode called for one request

per 64 EVO clock cycles, the latency becomes

one request per 128 EVO clock cycles. Bandwidth is

roughly divided by 2:

4. Latency for data-streaming mode or message-pass-

ing mode is as follows:

During 64 EVO clock cycles, the EVO unit must get

one request from SDRAM. For example, if the EVO

clock runs at 38 MHz, then the latency is 64/.038 =

1684 ns and bandwidth is 38 MB/s.
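
The worked figures in items 1, 2 and 4 all follow from converting a number of EVO clock cycles into a time window. The sketch below shows that conversion in C; the function names are illustrative and do not come from any Philips library.

```c
/* Time window, in nanoseconds, spanned by `cycles` EVO clock cycles
 * when the EVO clock runs at `evo_clk_mhz` MHz. */
double evo_window_ns(unsigned cycles, double evo_clk_mhz)
{
    return (double)cycles * 1000.0 / evo_clk_mhz;
}

/* Data-streaming or message-passing mode: one 64-byte request per
 * 64 EVO clocks is one byte per clock, so the bandwidth in MB/s
 * equals the EVO clock in MHz (1 byte/ns = 1000 MB/s). */
double evo_stream_bandwidth_mbps(double evo_clk_mhz)
{
    return 64.0 / evo_window_ns(64, evo_clk_mhz) * 1000.0;
}
```

For example, `evo_window_ns(128, 27.0)` gives the 4740 ns window of item 1, and `evo_window_ns(64, 38.0)` gives the 1684 ns latency of item 4.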

7.18.5 Power Down and Sleepless

The EVO block enters in power down state whenever

PNX1300 is put in global power down mode, except if the

SLEEPLESS bit in VO_CTL is set. In the latter case, the

block continues DMA operation and will wake up the

DSPCPU whenever an interrupt is generated.

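Byte bandwidth per video line (equations for Section 7.18.4; the overlay denominator in the last equation is illegible in the source):

$$B_{1x} = \left( \left\lceil \frac{W}{64} \right\rceil + 2 \left\lceil \frac{W}{128} \right\rceil + 4 \right) \times 64$$

$$B_{1x,OL} = B_{1x} + \left( \left\lceil \frac{W}{32} \right\rceil + 4 \right) \times 64$$

$$B_{2x} = \left( \left\lceil \frac{W}{128} \right\rceil + 2 \left\lceil \frac{W}{256} \right\rceil + 4 \right) \times 64$$

$$B_{2x,OL} = B_{2x} + \left( \left\lceil \frac{W}{?} \right\rceil + 4 \right) \times 64$$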

The EVO block can be separately powered down by setting

a bit in the BLOCK_POWER_DOWN register. Refer

to Chapter 21, “Power Management.”

It is recommended that EVO be stopped (by negating

VO_CTL.ENABLE) before block level power down is

started, or that SLEEPLESS mode is used when global

power down is activated.

7.19 DDS AND PLL FILTER DETAILS

The PLL filter reduces the phase jitter of the DDS synthe-

sizer output. It can also be used to multiply the DDS out-

put frequency by 2. The DDS and PLL filter together

provide a high-quality, accurately-programmable output

video clock. The PLL filter block is shown in Figure 7-32.

At hardware reset, the output multiplexer is set to 0x3,

and the PLL system is disabled. To start the PLL system,

the following steps must be performed:

1. Assign a DDS frequency. This starts the DDS. Allow

for at least 31 DSPCPU cycles for the DDS frequency

setting to take effect.

2. Choose a value for PLL_S and PLL_T. For 8-40 MHz

operation, a value of 1 (which selects division by 2) is

recommended.

3. Choose a value for CLOCK_SELECT. For 8-81 MHz

operation, CLOCK_SELECT = 00 is recommended.

4. Assign values to the VO_CTL register containing the

above choices. The first assignment with

CLOCK_SELECT not equal to 0x3 enables the PLL

system. Allow for a maximum of 50 microseconds to

achieve lock.

Once the PLL is locked, small changes to the DDS fre-

quency are allowed, and the VO_CLK output will

smoothly track the frequency change.

Note: Most consumer electronics equipment imposes

very high precision requirements on the value of the color

burst frequency. A video encoder will derive the color

burst frequency from VO_CLK. When changing the

VO_CLK frequency in software to phase-lock the EVO to

a master reference, special care is required to keep the

color burst signal frequency within a tolerance of about

50 ppm. When using a Philips DENC (Digital Encoder),

the color burst frequency is derived from the master

DENC frequency by a programmable synthesizer on the

DENC chip. In this case, VO_CLK changes larger than

50 ppm are allowed by changing the DENC synthesizer

over its I2C interface to compensate for the VO_CLK

change.

Table 7-12 illustrates recommended settings.

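Figure 7-32. PLL filter block diagram.

(Blocks shown: square-wave DDS driven by FREQUENCY[31:0] and 9 × CPU clock; phase detector; loop filter; VCO, 8–90 MHz; dividers "div S+1" (PLL_S) and "div T+1" (PLL_T); output multiplexer controlled by CLOCK_SELECT; outputs VO_CLK, VO_CLK internal (to frame timing generator) and CLKOUT.)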

Table 7-12. DDS and PLL example settings

Desired frequency  DDS frequency  PLL_S            PLL_T            CLOCK_SELECT    Usage
4 – 10 MHz         8 – 20 MHz     1 (divide by 2)  1 (divide by 2)  01 (T divider)  Custom low speed video
8 – 45 MHz         8 – 45 MHz     1 (divide by 2)  1 (divide by 2)  00 (VCO)        Standard or 16:9 digital video
40 – 81 MHz        20 – 40.5 MHz  1 (divide by 2)  3 (divide by 4)  00 (VCO)        High pixel rate custom video
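
Table 7-12 can be read as a small lookup from the desired VO_CLK frequency to a set of field values. The C sketch below encodes the three rows. The struct, the function name and the choice to prefer the 8–45 MHz row where rows overlap are assumptions made for illustration; the data book does not state a preference.

```c
#include <stdbool.h>

/* One row of Table 7-12, expressed as field values. */
struct evo_clock_setting {
    double   dds_mhz;       /* DDS frequency to program, in MHz */
    unsigned pll_s;         /* PLL_S field value */
    unsigned pll_t;         /* PLL_T field value */
    unsigned clock_select;  /* CLOCK_SELECT field value (2 bits) */
};

/* Pick example settings for a desired VO_CLK frequency.
 * Returns false when the frequency lies outside the 4-81 MHz span
 * covered by the table. Overlapping ranges resolve to the
 * 8-45 MHz row (an assumption of this sketch). */
bool evo_pick_clock(double want_mhz, struct evo_clock_setting *out)
{
    if (want_mhz >= 8.0 && want_mhz <= 45.0) {
        *out = (struct evo_clock_setting){ want_mhz, 1, 1, 0x0 };
    } else if (want_mhz >= 4.0 && want_mhz < 8.0) {
        *out = (struct evo_clock_setting){ want_mhz * 2.0, 1, 1, 0x1 };
    } else if (want_mhz > 45.0 && want_mhz <= 81.0) {
        *out = (struct evo_clock_setting){ want_mhz / 2.0, 1, 3, 0x0 };
    } else {
        return false;
    }
    return true;
}
```

A 27 MHz request selects the standard row (DDS at 27 MHz, CLOCK_SELECT = 00), while a 6 MHz request doubles the DDS frequency and selects the T divider output.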

Audio In Chapter 8

by Gert Slavenburg

8.1 AUDIO IN OVERVIEW

In this document, the generic PNX1300 name refers

to the PNX1300 Series, or the PNX1300/01/02/11

products.

The PNX1300 Audio In (AI) unit connects to an off-chip

stereo A/D converter subsystem through a flexible bit-se-

rial connection. The AI unit provides all signals needed to

interface to high quality, low cost oversampling A/D converters,

including a generator for a precisely programmable

oversampling A/D system clock. Together, the AI unit

and external A/D provide the following capabilities:

• One or two channels of audio input.

• 8- or 16-bit samples per channel.

• Programmable sampling rate.

• Internal or external sampling clock source.

• Supports autonomous writes of sampled audio data

to memory using double buffering (DMA).

• Supports 8-bit mono and stereo as well as 16-bit

mono and stereo PC standard memory data formats.

• Supports little- and big-endian memory formats.

8.2 EXTERNAL INTERFACE

Four PNX1300 pins are associated with the AI unit. The

AI_OSCLK output is an accurately programmable clock

output intended to serve as the master system clock for

the external A/D subsystem. The other three pins

(AI_SCK, AI_WS and AI_SD) constitute a flexible serial

input interface. Using the AI unit’s MMIO registers, these

pins can be configured to operate in a variety of serial interface

framing modes, including but not limited to:

• Standard stereo I2S (MSB first, 1-bit delay from

AI_WS, left & right data in a frame).1

• LSB first with 1–16 bit data per channel.

• Complex serial frames of up to 512 bits/frame, with

‘valid sample’ qualifier bit.

The AI unit can be used with many serial A/D converter

devices, including the Philips SAA7366 (stereo A/D),

Crystal Semiconductor CS5331, CS5336 (stereo A/D’s),

CS4218 (codec), Analog Devices AD1847 (codec).

1. A definition of the Philips I2S serial interface protocol,

among others, can be found in the Philips IC01 da-

tabook.

Table 8-1. AI unit external signals

Signal Type Description

AI_OSCLK OUT Over-sampling clock. This output can be

programmed to emit any frequency up to

40-MHz with a sub Hertz resolution. It is

intended for use as the 256fs or 384fs

over sampling clock by external A/D sub-

system.

AI_SCK I/O-5 • When the AI unit is programmed as

serial-interface timing slave (power-up

default), AI_SCK is an input. AI_SCK

receives the serial bitclock from the

external A/D subsystem. This clock is

treated as fully asynchronous to

PNX1300 main clock.

• When the AI unit is programmed as the

serial-interface timing master, AI_SCK

is an output. AI_SCK drives the serial

clock for the external A/D subsystem.

The frequency is a programmable inte-

gral divide of the AI_OSCLK frequency.

AI_SCK is limited to 22 MHz. The sample

rate of valid samples embedded within

the serial stream is also limited by the

bandwidth.latency available in the system

(Section 8-10).

AI_SD IN-5 Serial data from external A/D subsystem.

Data on this pin is sampled on positive or

negative edges of AI_SCK as determined

by the CLOCK_EDGE bit in the

AI_SERIAL register.

AI_WS I/O-5 • When the AI unit is programmed as the

serial-interface timing slave (power-up

default), AI_WS acts as an input.

AI_WS is sampled on the same edge

as selected for AI_SD.

• When the AI unit is programmed as the

serial-interface timing master, AI_WS

acts as an output. It is asserted on the

opposite edge of the AI_SD sampling

edge.

AI_WS is the word-select or frame-syn-

chronization signal from/to the external A/

D subsystem.

8.3 CLOCK SYSTEM

Figure 8-1 illustrates the different clock capabilities of the

AI unit. At the heart of the clock system is a square wave

DDS (Direct Digital Synthesizer). The DDS can be pro-

grammed to emit frequencies from approx. 1 Hz to 40

MHz with a resolution of better than 0.3 Hz.

The output of the DDS is always sent on the AI_OSCLK

output pin. This output is intended to be used as the

256fs or 384fs system clock source instead of a fixed frequency

crystal for oversampling A/D converters, such as

the Philips SAA7366T, or Analog Devices AD1847.

The PNX1300 AI DDS frequency is set by writing to the

FREQUENCY MMIO register. The programmer can

change the FREQUENCY setting dynamically, so as to

adjust the input sampling rate to track an application de-

pendent master reference.

Depending on bit 31 (MSB), the DDS runs in one of two

modes:

• bit 31 = 1 (PNX1300 improved mode)

• bit 31 = 0 (TM-1000 compatibility mode)

8.3.1 PNX1300 Improved Mode

In improved mode, a high quality, low-jitter AI_OSCLK is

generated. The setting of the FREQUENCY register to

accomplish a given AI_OSCLK frequency is given by:

This mode, and the above formula, should be used for all

new software development on PNX1300. It is not avail-

able on TM-1000.

In the improved mode the DDS synthesizer maximum jit-

ter can be computed as follows:

Example of jitter values can be found in Table 8-2.
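
The improved-mode setting is FREQUENCY = 2^31 + f_OSCLK · 2^32 / (9 · f_DSPCPU), and the maximum jitter is 1 / (9 · f_DSPCPU). The C sketch below evaluates both. The function names are illustrative, and rounding to the nearest integer is an assumption; it reproduces the FREQUENCY values listed in Table 8-3 for a 133 MHz DSPCPU clock.

```c
#include <stdint.h>

/* FREQUENCY register value for PNX1300 improved mode (bit 31 set).
 * Both frequencies are in Hz. The quotient is rounded to nearest,
 * which is an assumption that matches Table 8-3. */
uint32_t ai_frequency_improved(uint32_t f_osclk_hz, uint32_t f_dspcpu_hz)
{
    uint64_t num = (uint64_t)f_osclk_hz << 32;   /* f_OSCLK * 2^32 */
    uint64_t den = 9ull * f_dspcpu_hz;           /* 9 * f_DSPCPU   */
    return (uint32_t)(0x80000000u + (num + den / 2) / den);
}

/* Maximum DDS jitter in nanoseconds for a DSPCPU clock given in MHz. */
double ai_dds_jitter_ns(double f_dspcpu_mhz)
{
    return 1000.0 / (9.0 * f_dspcpu_mhz);
}
```

With a 133 MHz DSPCPU clock, a 256fs clock for 44.1 kHz audio (11.2896 MHz) yields 2187991971, and a 143 MHz DSPCPU clock gives a jitter of about 0.777 ns.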

8.3.2 TM-1000 Compatibility Mode

TM-1000 compatibility mode is provided so that TM-1000

software runs without changes. It should NOT be used

for new PNX1300 software development. TM-1000

mode is automatically entered whenever FREQUEN-

CY[31] = 0. In TM-1000 mode, AI_OSCLK frequency is

set as follows:
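
In TM-1000 mode the setting is FREQUENCY = f_OSCLK · 2^32 / (3 · f_DSPCPU), with bit 31 left at 0. The C sketch below is for checking legacy values only; the function name and the round-to-nearest step are assumptions. If a 100 MHz DSPCPU clock is assumed (the clock is not stated next to Table 8-6), it reproduces the value 161628209 given there for 256fs at 44.1 kHz.

```c
#include <stdint.h>

/* FREQUENCY register value for TM-1000 compatibility mode.
 * Both frequencies are in Hz. The result keeps bit 31 clear as long
 * as f_osclk_hz is below 1.5 * f_dspcpu_hz. */
uint32_t ai_frequency_tm1000(uint32_t f_osclk_hz, uint32_t f_dspcpu_hz)
{
    uint64_t num = (uint64_t)f_osclk_hz << 32;   /* f_OSCLK * 2^32 */
    uint64_t den = 3ull * f_dspcpu_hz;           /* 3 * f_DSPCPU   */
    return (uint32_t)((num + den / 2) / den);    /* round to nearest */
}
```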

8.4 CLOCK SYSTEM OPERATION

AI_SCK and AI_WS can be configured as input or out-

put, as determined by the SER_MASTER control field.

As output, AI_SCK is a divider of the DDS output fre-

quency. Whether input or output, the AI_SCK pin signal

is used as the bit clock for serial-parallel conversion.

If set as output, AI_WS can similarly be programmed using

WSDIV to control the serial frame length from 1 to

512 bits.
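
Both dividers divide by the field value plus one, so the field is the desired ratio minus one. The C sketch below derives SCKDIV and WSDIV from clock ratios expressed in multiples of fs; the function names and the -1 error convention are illustrative assumptions.

```c
/* SCKDIV field value (0..255) for an AI_OSCLK of `osclk_per_fs` * fs
 * and an AI_SCK of `sck_per_fs` * fs. Returns -1 when the ratio is
 * not an integral division in the range 1..256. */
int ai_sckdiv(unsigned osclk_per_fs, unsigned sck_per_fs)
{
    if (sck_per_fs == 0 || osclk_per_fs % sck_per_fs != 0)
        return -1;
    unsigned ratio = osclk_per_fs / sck_per_fs;
    if (ratio < 1 || ratio > 256)
        return -1;
    return (int)ratio - 1;
}

/* WSDIV field value (0..511) for a serial frame of `frame_bits` bits.
 * Returns -1 when the frame length is outside 1..512. */
int ai_wsdiv(unsigned frame_bits)
{
    if (frame_bits < 1 || frame_bits > 512)
        return -1;
    return (int)frame_bits - 1;
}
```

A 256fs oversampling clock with a 64fs bit clock gives SCKDIV = 3, and a 64-bit frame gives WSDIV = 63, matching the values used in Table 8-3 and Table 8-6.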

The preferred application of the clock system options is

to use AI_OSCLK as A/D master clock, and let the A/D

converter be timing master over the serial interface

(SER_MASTER=0).

In case an external codec (e.g. the AD1847 or CS4218)

is used for common audio I/O, it may not be possible to

independently control the A/D and D/A system clocks. In

that case it is recommended that the Audio Out (AO) unit

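Figure 8-1. AI clock system and I/O interface.

(Blocks shown: square-wave DDS driven by FREQUENCY[31:0] and 9 × DSPCPUCLK, producing AI_OSCLK (e.g. 256fs); SCKDIV divider (div N+1) producing AI_SCK (e.g. 64fs); WSDIV divider (div N+1) producing AI_WS; SER_MASTER selecting the direction of AI_SCK and AI_WS; serial-to-parallel converter taking AI_SD and producing LEFT[15:0], RIGHT[15:0] and sample_clock.)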

PNX1300 improved mode (Section 8.3.1):

$$\mathrm{FREQUENCY} = 2^{31} + \frac{f_{OSCLK} \cdot 2^{32}}{9 \cdot f_{DSPCPU}}$$

$$\mathrm{jitter} = \frac{1}{9 \cdot f_{DSPCPU}}$$

Table 8-2. Jitter values for common DSPCPU clock frequencies

f_DSPCPU (MHz)  jitter (ns)
143             0.777
166             0.669
180             0.617
200             0.555

TM-1000 compatibility mode (Section 8.3.2):

$$\mathrm{FREQUENCY} = \frac{f_{OSCLK} \cdot 2^{32}}{3 \cdot f_{DSPCPU}}$$

AI_SCK divider (Section 8.4), with SCKDIV in [0, 255]:

$$f_{AISCK} = \frac{f_{AIOSCLK}}{\mathrm{SCKDIV} + 1}$$

clock system DDS is used to provide a single master A/

D and D/A clock. The AO unit, or the D/A converter, can

be used as serial interface timing master, and the AI unit

is set to be slave to the serial frame determined by AO

(AI SER_MASTER=0, AI_SCK and AI_WS externally

wired to the corresponding AO pins). In such systems, in-

dependent software control over A/D and D/A sampling

rate is not possible, but component count is minimized.

8.5 SERIAL DATA FRAMING

The AI unit can accept data in a wide variety of serial

data framing conventions. Figure 8-2 illustrates the no-

tion of a serial frame. If POLARITY=1 and

CLOCK_EDGE=0, a frame is defined with respect to the

positive transition of the AI_WS signal, as observed by a

positive clock transition on AI_SCK. Each data bit sam-

pled on positive AI_SCK transitions has a specific bit po-

sition: the data bit sampled on the clock edge after the

clock edge on which the AI_WS transition is seen has bit

position 0. Each subsequent clock edge defines a new

bit position. As defined in Table 8-5, other combinations

of POLARITY and CLOCK_EDGE can be used to define

a variety of serial frame bitposition definitions.

The capturing of samples is governed by FRAMEMODE.

If FRAMEMODE=00, every serial frame results in one

sample from the serial-parallel converter. A sample is de-

fined as a left/right pair in stereo modes or a single left

channel value in mono modes. If FRAMEMODE=1y, the

serial frame data bit in bit position VALIDPOS is exam-

ined. If it has value ‘y’, a sample is taken from the data

stream (the valid bit is allowed to precede or follow the

left or right channel data provided it is in the same serial

frame as the data).
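
The FRAMEMODE rule reduces to a small decision on the valid bit. The C sketch below models it; the function name is illustrative, and rejecting samples for the reserved value 01 is an assumption, since the data book leaves that encoding undefined.

```c
#include <stdbool.h>

/* Decide whether a serial frame yields a sample.
 * framemode: 2-bit FRAMEMODE field value.
 * valid_bit: the frame bit found at position VALIDPOS (0 or 1). */
bool ai_accept_sample(unsigned framemode, unsigned valid_bit)
{
    switch (framemode & 0x3u) {
    case 0x0: return true;              /* 00: every frame gives a sample */
    case 0x2: return valid_bit == 0;    /* 10: accept if valid bit = 0 */
    case 0x3: return valid_bit == 1;    /* 11: accept if valid bit = 1 */
    default:  return false;             /* 01: reserved (assumed: reject) */
    }
}
```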

The left and right sample data can be in a LSB-first or

MSB-first form, at an arbitrary bit position, and with an ar-

bitrary length.

Table 8-3. Sample rate settings (fDSPCPUCLK = 133 MHz, improved PNX1300 mode)

fs        OSCLK  SCK   FREQUENCY   SCKDIV
44.1 kHz  256fs  64fs  2187991971  3
48.0 kHz  256fs  64fs  2191574340  3
44.1 kHz  384fs  64fs  2208246133  5
48.0 kHz  384fs  64fs  2213619686  5

Table 8-4. AI MMIO clock & interface control bits

Field Name Description

SER_MASTER 0  (RESET default), the A/D converter

is the timing master over the serial inter-

face. AI_SCK and AI_WS are set to be

inputs.

1  PNX1300 is timing master over the

AI serial interface. The AI_SCK and

AI_WS pins are set to be outputs.

FREQUENCY Sets the clock frequency emitted by the

AI_OSCLK output. RESET default 0.

SCKDIV Sets the divider used to derive AI_SCK

from AI_OSCLK. Set to 0..255, for divi-

sion by 1..256. RESET default 0.

WSDIV Sets the divider used to derive AI_WS

from AI_SCK. Set to 0..511 for a serial

frame length of 1..512. RESET default 0.

Figure 8-2. AI serial frame and bit position definition (POLARITY=1, CLOCK_EDGE=0).

(Timing diagram of AI_SCK, AI_WS and AI_SD across frame n and frame n+1, with the bit positions of each frame numbered from 0.)

Table 8-5. AI MMIO serial framing control fields

Field Name Description

POLARITY 0  serial frame starts on AI_WS negedge

(RESET default)

1  serial frame starts on AI_WS posedge

FRAMEMODE 00  accept a sample every serial frame

(RESET default)

01  unused, reserved

10  accept sample if valid bit = 0

11  accept sample if valid bit = 1

VALIDPOS • Defines the bit position within a serial frame

where the valid bit is found.

• Default 0.

LEFTPOS • Defines the bit position within a serial frame

where the first data bit of the left channel is

found.

• Default 0.

RIGHTPOS • Defines the bit position within a serial frame

where the first data bit of the right channel

is found.

• Default 0.

DATAMODE 0  MSB first (RESET default)

1  LSB first

SSPOS • Start/Stop bit position. Default 0.

• If DATAMODE=MSB first, SSPOS deter-

mines the bit index (0..15) in the parallel

word of the last data bit. Bits 15 (MSB) up

to/including SSPOS are taken in order from

the serial frame data. All other bits are set

to ‘0’.

• If DATAMODE=LSB first, SSPOS deter-

mines the bit index (0..15) in the parallel

word of the first data bit. Bits SSPOS up to/

including 15 are taken in order from the

serial frame data. All other bits are set to ‘0’.

In MSB-first mode, the serial-to-parallel converter assigns

the value of the bit at LEFTPOS to LEFT[15]. Subsequent

bits are assigned, in order, to decreasing bit positions

in the LEFT data word, up to and including

LEFT[SSPOS]. Bits LEFT[SSPOS–1:0] are cleared.

Hence, in MSB-first mode, an arbitrary number of bits are

captured. They are left-adjusted in the 16-bit parallel out-

put of the converter.

In LSB-first mode, the serial to parallel converter assigns

the value of the bit at LEFTPOS to LEFT[SSPOS]. Sub-

sequent bits are assigned, in order, to increasing bit po-

sitions in the LEFT data word, up to and including

LEFT[15]. Bits LEFT[SSPOS–1:0] are cleared. Hence, in

LSB-first mode, an arbitrary number of bits are captured.

They are returned left-adjusted in the 16-bit parallel out-

put of the converter.

Refer to Figure 8-3 and Table 8-6 to see an example of

how the AI unit MMIO registers are set to collect 16-bit

samples using the Philips SAA7366 I2S 18-bit A/D converter.

This setup assumes the SAA7366 acts as the serial

master.

For example, if it were desirable to use only the 12 MSBs

of the A/D converter in Figure 8-3, use the settings of

Table 8-6 with SSPOS set to ‘4’. This results in

LEFT[15:4] being set with data bits 0..11, and LEFT[3:0]

being set to ’0’. RIGHT[15:4] is set with data bits 32..43

and RIGHT[3:0] is set to ’0’.
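
The MSB-first and LSB-first rules can be modelled in software. The C sketch below builds one 16-bit channel word from a serial frame stored as an array with one bit per element; this array representation and the function name are assumptions made for illustration, not a description of the hardware.

```c
#include <stdint.h>

/* Build a 16-bit channel word from a serial frame.
 * frame[i] holds the serial bit (0 or 1) at bit position i.
 * pos is LEFTPOS or RIGHTPOS, sspos is SSPOS (0..15), and
 * lsb_first is the DATAMODE value (0 = MSB first, 1 = LSB first). */
uint16_t ai_serial_to_parallel(const uint8_t *frame, unsigned pos,
                               unsigned sspos, int lsb_first)
{
    uint16_t word = 0;  /* bits below SSPOS stay cleared */

    if (lsb_first) {
        /* First serial bit lands in bit SSPOS, then upward to bit 15. */
        for (unsigned bit = sspos; bit <= 15; bit++)
            word |= (uint16_t)((frame[pos++] & 1u) << bit);
    } else {
        /* First serial bit lands in bit 15, then downward to bit SSPOS. */
        for (int bit = 15; bit >= (int)sspos; bit--)
            word |= (uint16_t)((frame[pos++] & 1u) << bit);
    }
    return word;
}
```

With MSB-first data and SSPOS = 4, only frame bits 0..11 contribute, which corresponds to the 12-MSB example above.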

8.6 MEMORY DATA FORMATS

The AI unit autonomously writes samples to memory in

mono and stereo 8- and 16-bits per sample formats, as

shown in Figure 8-4. Successive samples are always

stored at increasing memory address location s. The set-

CLOCK_EDGE • if ‘0’(RESET default) the AI_SD and AI_WS

pins are sampled on positive edges of the

AI_SCK pin. If SER_MASTER =1, AI_WS is

asserted on negative edges of AI_SCK.

• if 1, AI_SD and AI_WS are sampled on neg-

ative edges of AI_SCK. As output, AI_WS

is asserted on positive edges of AI_SCK.

Figure 8-3. Serial frame of the SAA7366 18 bit I2S A/D converter (format 2 SWS).

(Timing diagram of AI_SCK, AI_WS and AI_SD: 18-bit left sample n, followed by 18-bit right sample n, followed by 18-bit left sample n+1.)

Table 8-6. Example setup for SAA7366

Field        Value      Explanation
SER_MASTER   0          SAA7366 is serial master
FREQUENCY    161628209  256fs, 44.1 kHz
SCKDIV       3          AI_SCK set to AI_OSCLK/4 (not needed since SER_MASTER=0)
WSDIV        63         Serial frame length of 64 bits (not needed since SER_MASTER=0)
POLARITY     0          Frame starts with negative AI_WS edge
FRAMEMODE    00         Take a sample each serial frame
VALIDPOS     n/a        Don’t care
LEFTPOS      0          Bit position 0 is MSB of left channel and will go to LEFT[15]
RIGHTPOS     32         Bit position 32 is MSB of right channel and will go to RIGHT[15]
DATAMODE     0          MSB first
SSPOS        0          Stop with LEFT/RIGHT[0]
CLOCK_EDGE   0          Sample WS and SD on positive SCK edges for I2S

Figure 8-4. AI memory DMA formats.

8-bit mono:     adr: left n, adr+1: left n+1, adr+2: left n+2, adr+3: left n+3, adr+4: left n+4, adr+5: left n+5, adr+6: left n+6, adr+7: left n+7
8-bit stereo:   adr: left n, adr+1: right n, adr+2: left n+1, adr+3: right n+1, adr+4: left n+2, adr+5: right n+2, adr+6: left n+3, adr+7: right n+3
16-bit mono:    adr: left n, adr+2: left n+1, adr+4: left n+2, adr+6: left n+3
16-bit stereo:  adr: left n, adr+2: right n, adr+4: left n+1, adr+6: right n+1

ting of the LITTLE_ENDIAN bit in the AI_CTL register determines

how increasing memory addresses map to byte

positions within words. Refer to Appendix C, “Endian-ness,”

for details on byte ordering conventions.

The AI hardware implements a double buffering scheme

to ensure that no samples are lost, even if the DSPCPU

is highly loaded and slow to respond to interrupts. The

DSPCPU software assigns buffers by writing a base ad-

dress and size to the MMIO control fields described in

Table 8-7. Refer to Section 8.7 for details on hardware/

software synchronization.

In 8-bit capture modes, the eight MSBs of the serial par-

allel converter output data are written to memory. In 16-

bit capture modes, all bits of the parallel data are written

to memory. If SIGN_CONVERT is set to ’1’, the MSB of

the data is inverted, which is equivalent to translating

from two’s complement to offset binary representation.

This allows the use of an external two’s complement 16-bit

A/D converter to generate 8-bit unsigned samples,

which is often used in PC audio.
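
The effect of SIGN_CONVERT in 8-bit capture modes can be modelled in a few lines of C. The function name below is illustrative; the sketch inverts the MSB and then keeps the eight MSBs, as described above.

```c
#include <stdint.h>

/* Model of an 8-bit capture from a 16-bit two's complement sample.
 * With sign_convert set, the MSB is inverted first, which maps
 * two's complement onto offset binary (unsigned) values. */
uint8_t ai_capture8(int16_t sample, int sign_convert)
{
    uint16_t raw = (uint16_t)sample;
    if (sign_convert)
        raw ^= 0x8000u;          /* invert the MSB */
    return (uint8_t)(raw >> 8);  /* keep the eight MSBs */
}
```

With SIGN_CONVERT set, a two's complement zero maps to 0x80, the most negative input to 0x00 and the most positive input to 0xFF.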

Note that the AI hardware does not generate A-law or μ-law

8-bit data formats. If such formats are desired, the

DSPCPU can be used to convert from 16-bit linear data

to A-law or μ-law data.
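
As an illustration of such a software conversion, the C sketch below encodes one 16-bit linear sample as μ-law using the widely used G.711-style segment encoding with a bias of 0x84. It is not taken from the data book or from any Philips library, and the function name is an assumption.

```c
#include <stdint.h>

/* Encode one 16-bit linear PCM sample as an 8-bit mu-law byte.
 * Uses the common bias (0x84) and clip (32635) constants; the
 * result is bit-inverted, as is conventional for mu-law. */
uint8_t linear16_to_ulaw(int16_t pcm)
{
    const int BIAS = 0x84;
    const int CLIP = 32635;
    int sample = pcm;
    int sign = 0;

    if (sample < 0) {            /* work on the magnitude */
        sign = 0x80;
        sample = -sample;
    }
    if (sample > CLIP)
        sample = CLIP;
    sample += BIAS;

    /* Segment number = position of the highest set bit in bits 14..7. */
    int exponent = 7;
    for (int mask = 0x4000; exponent > 0 && (sample & mask) == 0; mask >>= 1)
        exponent--;

    int mantissa = (sample >> (exponent + 3)) & 0x0F;
    return (uint8_t)~(sign | (exponent << 4) | mantissa);
}
```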

Figure 8-5. AI status/control field MMIO layout.

Register (access)  Offset from MMIO_base
AI_STATUS (r/w)    0x10 1C00
AI_CTL (r/w)       0x10 1C04
AI_SERIAL (r/w)    0x10 1C08
AI_FRAMING (r/w)   0x10 1C0C
AI_FREQ (r/w)      0x10 1C10
AI_BASE1 (r/w)     0x10 1C14
AI_BASE2 (r/w)     0x10 1C18
AI_SIZE (r/w)      0x10 1C1C

Fields shown in the figure: BUF1_ACTIVE, BUF1_FULL, BUF2_FULL, HBE (Highway bandwidth error), OVERRUN, RESET, CAP_ENABLE, CAP_MODE, SIGN_CONVERT, LITTLE_ENDIAN, DIAGMODE, SLEEPLESS, OVR_INTEN, HBE_INTEN, BUF2_INTEN, BUF1_INTEN, ACK_OVR, ACK_HBE, ACK2, ACK1, SER_MASTER, DATAMODE, FRAMEMODE, CLOCK_EDGE, SCKDIV, WSDIV, POLARITY, VALIDPOS, LEFTPOS, RIGHTPOS, SSPOS, FREQUENCY, BASE1, BASE2, SIZE (in samples), and RESERVED bits.

Table 8-7. AI MMIO DMA control fields

Field Name Description

LITTLE_ENDIAN 0  capture in big endian memory format

(RESET default)

1  capture little endian

BASE1 Base address of buffer1; a 64-byte

aligned address in local SDRAM.

RESET default 0.

BASE2 Base address of buffer2; a 64-byte

aligned address in local SDRAM.

RESET default 0.

SIZE • Number of samples to be placed in

buffer before switching to other buffer

• Stereo modes: a pair of 8- or 16-bit data

is 1 sample

• Mono modes: a single value is 1 sample

• RESET default 0.

CAP_MODE 00  mono (left ADC only), 8 bits/sample.

(RESET default).

01  stereo, 2 times 8 bits/sample

10  mono (left ADC only), 16 bits/sample

11  stereo, 2 times 16 bits/sample

SIGN_CONVERT 0  leave MSB unchanged (RESET

default)

1  invert MSB

8.7 AUDIO IN OPERATION

Figure 8-5, Table 8-8 and Table 8-9 describe the func-

tion of the control and status fields of the AI unit. To en-

sure compatibility with future devices, undefined bits in

MMIO registers should be ignored when read, and writ-

ten as ’0’s.

The AI unit is reset by a PNX1300 hardware reset, or by

writing 0x80000000 to the AI_CTL register. Upon RE-

SET, capture is disabled (CAP_ENABLE = 0), and

buffer1 is the active buffer (BUF1_ACTIVE=1). A mini-

mum of 5 valid AI_SCK clock cycles is required to allow

internal AI circuitry to stabilize before enabling capture.

This can be accomplished by programming AI_FREQ

and AI_SERIAL and then delaying for the appropriate

time interval.

Programming of the AI_SERIAL MMIO register must

follow this sequence:

• set AI_FREQ to ensure that a valid clock is gener-

ated (Only when AI is the master of the audio clock

system)

• MMIO(AI_CTL) = 1 << 31; /* Software Reset */

• MMIO(AI_SERIAL) = 1 << 31; /* sets serial-master

mode, starts AI_SCK */

• MMIO(AI_SERIAL) = (1 << 31) | (SCKDIV value); /*

then set DIVIDER values */
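
The sequence above can be exercised on a host machine with a simulated register file. In the C sketch below, the register offsets are those listed in Figure 8-5; the write log, `mmio_write32` and `ai_init_serial_master` are hypothetical names introduced for illustration and stand in for real MMIO access. The divider bits are passed in already positioned within AI_SERIAL, because the field positions are not restated here. The unsigned constant `1u << 31` is used because shifting a signed 1 into bit 31 is not well defined in C.

```c
#include <stddef.h>
#include <stdint.h>

/* Offsets from MMIO_base, as listed in Figure 8-5. */
#define AI_CTL     0x101C04u
#define AI_SERIAL  0x101C08u
#define AI_FREQ    0x101C10u

/* Hypothetical write log standing in for real MMIO accesses. */
struct mmio_write { uint32_t offset; uint32_t value; };
struct mmio_write ai_log[8];
size_t ai_log_len = 0;

void mmio_write32(uint32_t offset, uint32_t value)
{
    if (ai_log_len < sizeof ai_log / sizeof ai_log[0])
        ai_log[ai_log_len++] = (struct mmio_write){ offset, value };
}

/* Program the AI unit as serial-interface timing master, in the
 * order required above. divider_bits must already be positioned
 * within AI_SERIAL. */
void ai_init_serial_master(uint32_t frequency, uint32_t divider_bits)
{
    mmio_write32(AI_FREQ, frequency);                   /* valid clock first */
    mmio_write32(AI_CTL, 1u << 31);                     /* software reset */
    mmio_write32(AI_SERIAL, 1u << 31);                  /* serial master, starts AI_SCK */
    mmio_write32(AI_SERIAL, (1u << 31) | divider_bits); /* then the dividers */
}
```

After the last write, a delay of at least 5 valid AI_SCK cycles is still required before capture is enabled.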

The DSPCPU initiates capture by providing two equal

size empty buffers and putting their base address and

size in the BASEn and SIZE registers. Once two valid (lo-

cal memory) buffers are assigned, capture can be en-

abled by writing a ‘1’ to CAP_ENABLE. The AI unit hard-

ware now proceeds to fill buffer 1 with input samples.

Once buffer 1 fills up, BUF1_FULL is asserted, and cap-

ture continues without interruption in buffer 2. If

BUF1_INTEN is enabled, a SOURCE 11 interrupt re-

quest is generated.

Table 8-8. AI MMIO control fields

Field Name Description

RESET The AI logic is reset by writing a 0x80000000

to AI_CTL. This bit always reads as a ‘0’.

See Section 8.7, “Audio In Operation” for

details on software reset.

DIAGMODE 0  normal operation (RESET default)

1  diagnostic mode (see Section 8.11,

“Diagnostic Mode”)

SLEEPLESS 0  participate in global power down

(RESET default)

1  refrain from participating in power down

CAP_ENABLE Capture Enable flag. If 1, AI unit captures

samples and acts as DMA master to write

samples to local SDRAM. If ’0’ (RESET

default), AI unit is inactive.

BUF1_INTEN Buffer 1 full Interrupt Enable. Default 0.

0  no interrupt

1  interrupt (SOURCE 11) if buffer 1 full

BUF2_INTEN Buffer 2 full interrupt enable. Default 0

0  no interrupt

1  interrupt (SOURCE 11) if buffer 2 full

HBE_INTEN HBE Interrupt Enable. Default 0.

0  no interrupt

1  interrupt (SOURCE 11) if a highway

bandwidth error occurs.

OVR_INTEN Overrun Interrupt Enable. Default 0

0  no interrupt

1  interrupt (SOURCE 11) if an overrun

error occurs

ACK1 Write a ’1’ to clear the BUF1_FULL flag and

remove any pending BUF1_FULL interrupt

request. This bit always reads as 0.

ACK2 Write a ’1’ to clear the BUF2_FULL flag and

remove any pending BUF2_FULL interrupt

request. This bit always reads as 0.

ACK_HBE Write a ’1’ to clear the HBE flag and

remove any pending HBE interrupt request.

This bit always reads as 0.

ACK_OVR Write a ’1’ to clear the OVERRUN flag and

remove any pending OVERRUN interrupt

request. This bit always reads as 0.

Table 8-9. AI MMIO status fields (read only)

Field Name Description

BUF1_ACTIVE • If ‘1’, buffer 1 will be used for the next

incoming sample. If ‘0’, buffer 2 will receive

the next sample.

• 1 after RESET.

BUF1_FULL • If ‘1’, buffer 1 is full. If BUF1_INTEN is also

‘1’, an interrupt request (source 11) is

pending. BUF1_FULL is cleared by writing

a ‘1’ to ACK1, at which point the AI hard-

ware will assume that BASE1 and SIZE

describe a new empty buffer.

• 0 after RESET.

BUF2_FULL • If ‘1’, buffer 2 is full. If BUF2_INTEN is also

‘1’, an interrupt request (source 11) is

pending. BUF2_FULL is cleared by writing

a ‘1’ to ACK2, at which point the AI hard-

ware will assume that BASE2 and SIZE

describe a new empty buffer.

• 0 after RESET.

HBE • Highway Bandwidth Error. Condition raised

when the 64-byte internal AI buffer is not

yet written to SDRAM when a new input

sample arrives. Indicates insufficient allocation

of PNX1300 highway bandwidth for

the audio sampling rate/mode. Refer to

Chapter 20, “Arbiter.”

• 0 after RESET.

OVERRUN • OVERRUN error occurred, i.e. the CPU did

not provide an empty buffer in time, and 1

or more samples were lost. If OVR_INTEN

is also 1, an interrupt request (source 11) is

pending. The OVERRUN flag can ONLY

be cleared by writing a ‘1’ to ACK_OVR.

• 0 after RESET.

Note that the buffers must be 64-byte aligned, and a mul-

tiple of 64 samples in size (the six LSBs of AI_BASE1,

AI_BASE2 and AI_SIZE are always ’0’).
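
A driver can check these constraints before handing a buffer to the AI unit. The C sketch below is a minimal example with illustrative function names; the bytes-per-sample values follow the CAP_MODE encodings in Table 8-7.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when the base address is 64-byte aligned and the size is a
 * multiple of 64 samples (six LSBs of each are zero). */
bool ai_buffer_ok(uint32_t base, uint32_t size_samples)
{
    return (base & 0x3Fu) == 0 && (size_samples & 0x3Fu) == 0;
}

/* Bytes occupied by one buffer for a 2-bit CAP_MODE value:
 * 00 = mono 8-bit, 01 = stereo 8-bit, 10 = mono 16-bit,
 * 11 = stereo 16-bit. */
uint32_t ai_buffer_bytes(unsigned cap_mode, uint32_t size_samples)
{
    static const uint32_t bytes_per_sample[4] = { 1, 2, 2, 4 };
    return size_samples * bytes_per_sample[cap_mode & 0x3u];
}
```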

The DSPCPU is required to assign a new, empty buffer

to BASE1 and perform an ACK1, before buffer 2 fills up.

Capture continues in buffer 2, until it fills up. At that time,

BUF2_FULL is asserted, and capture continues in the

new buffer 1, etc.

Upon receipt of an ACK, the AI hardware removes the re-

lated interrupt request line assertion at the next DSPCPU

clock edge. Refer to Section 3.5.3, “INT and NMI

(Maskable and Non-Maskable Interrupts),” for the rules

regarding ACK and interrupt re-enabling. The AI interrupt

should always be operated in level-sensitive mode, since

AI can signal multiple conditions that each need indepen-

dent ACKs over the single internal SOURCE 11 request

line.

In normal operation, the DSPCPU and AI hardware continuously

exchange buffers without ever losing a sample.

If the DSPCPU fails to provide a new buffer in time,

the OVERRUN error flag is raised. This flag is not affect-

ed by ACK1 or ACK2; it can only be cleared by an explicit

ACK_OVR.
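
The buffer exchange can be pictured with a small software model of the status flags. The C sketch below is a simplified model, not a description of the hardware: the struct and function names are illustrative, and it raises OVERRUN at the moment capture switches to a buffer that has not been acknowledged, whereas the hardware raises it when a sample cannot be stored.

```c
#include <stdbool.h>

/* Simplified model of the AI double-buffering status flags. */
struct ai_model {
    bool buf1_active;  /* BUF1_ACTIVE */
    bool buf1_full;    /* BUF1_FULL */
    bool buf2_full;    /* BUF2_FULL */
    bool overrun;      /* OVERRUN (sticky) */
};

/* State after RESET: buffer 1 active, all flags clear. */
void ai_model_reset(struct ai_model *m)
{
    m->buf1_active = true;
    m->buf1_full = false;
    m->buf2_full = false;
    m->overrun = false;
}

/* Hardware side: the active buffer has just been filled. */
void ai_model_buffer_filled(struct ai_model *m)
{
    if (m->buf1_active)
        m->buf1_full = true;
    else
        m->buf2_full = true;
    m->buf1_active = !m->buf1_active;

    /* Simplification: flag OVERRUN if the next buffer is not empty. */
    if (m->buf1_active ? m->buf1_full : m->buf2_full)
        m->overrun = true;
}

/* Software side: the ACK operations. ACK1/ACK2 never clear OVERRUN. */
void ai_model_ack1(struct ai_model *m)    { m->buf1_full = false; }
void ai_model_ack2(struct ai_model *m)    { m->buf2_full = false; }
void ai_model_ack_ovr(struct ai_model *m) { m->overrun = false; }
```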

8.8 POWER DOWN AND SLEEPLESS

The AI unit enters power down state whenever PNX1300

is put in global power down mode, except if the SLEEP-

LESS bit in AI_CTL is set. In the latter case, the unit con-

tinues DMA operation and will wake up the DSPCPU

whenever an interrupt is generated.

The AI unit can be separately powered down by setting a

bit in the BLOCK_POWER_DOWN register. Refer to

Chapter 21, “Power Management.”

It is recommended that AI be stopped (by negating

AI_CTL.CAP_ENABLE) before block level power down

is started, or that SLEEPLESS mode is used when global

power down is activated.

8.9 HIGHWAY LATENCY AND HBE

The AI unit uses internal buffering before writing data to

SDRAM. The internal buffer consists of one stereo sam-

ple input holding register and 64 bytes of internal buffer

memory. Under normal operation, the 64-byte buffer is

written to SDRAM while the input register receives another

sample. This normal operation is guaranteed to be

maintained as long as the highway arbiter is set to guarantee

a latency for the AI unit that matches the sampling

interval. Given a sample rate fs, and an associated sam-

ple interval T (in nsec), the arbiter should be set to have

a latency of at most T-20 nsec. Refer to Chapter 20, “Ar-

biter,” for information on arbiter programming. If the re-

quested latency is not adequate, the HBE (Highway

Bandwidth Error) condition may result. This error flag

gets set when the input register is full, the 64-byte buffer

has not yet been written to memory, and a new sample

arrives.

Table 8-10 shows the required arbiter latency settings for

a number of common operating modes. The rightmost

column illustrates the nature of the resulting 64-byte

highway requests. These figures are not necessary to compute arbiter

settings, but they may be used to compute bus availability

in a given interval.
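
The figures in Table 8-10 follow from the sample interval T = 1/fs, the T - 20 nsec latency rule, and the 64-byte request size. The C sketch below recomputes them; the function names are illustrative.

```c
/* Sample interval T in nanoseconds for a sample rate in Hz. */
double ai_sample_interval_ns(double fs_hz)
{
    return 1e9 / fs_hz;
}

/* Largest arbiter latency, in nanoseconds, that still meets the
 * "T - 20 nsec" rule. */
double ai_max_arbiter_latency_ns(double fs_hz)
{
    return ai_sample_interval_ns(fs_hz) - 20.0;
}

/* Interval between 64-byte highway requests, in nanoseconds.
 * bytes_per_sample is 1, 2 or 4 depending on the capture mode. */
double ai_request_interval_ns(double fs_hz, unsigned bytes_per_sample)
{
    return ai_sample_interval_ns(fs_hz) * (64.0 / bytes_per_sample);
}
```

For 16-bit stereo at 44.1 kHz (4 bytes per sample), this gives a maximum latency of about 22,656 ns and one request roughly every 362,812 ns.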

8.10 ERROR BEHAVIOR

If either an OVERRUN or HBE error occurs, input sam-

pling is temporarily halted, and samples will be lost. In

case of OVERRUN, sampling resumes as soon as the

DSPCPU makes one or more new buffers available

through an ACK1 or ACK2 operation. In the case of HBE,

sampling will resume as soon as the internal buffer is

written to SDRAM.

HBE and OVERRUN are ‘sticky’ error flags. They will re-

main set until an explicit ACK_HBE or ACK_OVR.

8.11 DIAGNOSTIC MODE

Diagnostic mode is entered by setting the DIAGMODE

bit in the AI_CTL register. In diagnostic mode, the

AI_SCK, AI_WS and AI_SD inputs of the serial-parallel

converter are taken from the output pins of the PNX1300

AO unit. This mode can be used during the diagnostic

phase of system boot to verify correct operation of most

of the AI unit and AO unit logic circuitry.

Note that the inputs are truly taken from the PNX1300

AO external pins, i.e. if an external (board level) source

is driving AO_SCK or AO_WS, diagnostic mode is not

capable of testing Audio Out.

Special care must be taken to enable diagnostic mode.

The recommended way of entering diagnostic mode is:

• set up the AO unit such that an AO_SCK is generated

• set the DIAGMODE bit, followed by a 5 (AI_SCK) cycle delay

• perform a software reset of the AI unit and immediately set the DIAGMODE bit back to ‘1’.

Table 8-10. AI highway arbiter latency requirement examples

CapMode                  fs (kHz)   T (nsec)   max arbiter latency (nsec)   access pattern
stereo, 16 bits/sample   44.1       22,676     22,656                       1 request every 362,812 nsec
stereo, 16 bits/sample   48.0       20,833     20,813                       1 request every 333,333 nsec
stereo, 16 bits/sample   96.0       10,417     10,397                       1 request every 166,667 nsec


Audio Out Chapter 9

by Gert Slavenburg, Santanu Dutta

9.1 AUDIO OUT OVERVIEW

In this document, the generic PNX1300 name refers

to the PNX1300 Series, or the PNX1300/01/02/11

products.

The PNX1300 Audio Out (AO) unit contains many fea-

tures not available in the TM-1000 and the TM-1100. It

has up to 8 channels, and drives up to 4 external stereo

D/A converters through a flexible bit-serial connection. It provides all signals to interface to high quality, low cost oversampling D/A converters, including a precisely programmable oversampling D/A system clock. The AO unit

and external D/A’s together provide the following capa-

bilities:

• Up to 8 channels of audio output.

• 16-bit or 32-bit samples per channel.

• Programmable sampling rate.

• Internal or external sampling clock source.

• Autonomously reads processed audio data from

memory using double buffering (DMA).

• Supports 16-bit mono and stereo PC standard mem-

ory data formats.

• Supports little- and big-endian memory formats.

• Provides control capability for highly integrated PC

codecs such as the AD1847, CS4218 or UAD1340.

• No support for connecting several D/As to one serial

data output.

9.2 EXTERNAL INTERFACE

Seven PNX1300 pins are associated with the AO unit.

The AO_OSCLK output is an accurately programmable

clock output intended to be used as the master system

clock for the external D/A subsystem. The other pins

(AO_SCK, AO_WS and AO_SDx) constitute a flexible

serial output interface. Using the AO MMIO registers,

these pins can be configured to operate in a variety of se-

rial interface framing modes, including but not limited to:

• Standard stereo I2S (MSB first, 1-bit delay from


AO_WS, left & right data in a frame).

• LSB first, with 1–16-bit data per channel.

• Complex serial frames of up to 512 bits/frame.

• Up to 8 channels of audio output.

9.3 SUMMARY OF OPERATION

The AO unit consists of three major subsystems, a pro-

grammable sample clock generator, a DMA engine and

a data serializer.

The DMA engine reads 16 or 32-bit samples from mem-

ory using a double buffered DMA approach. The

DSPCPU initially assigns two full sample buffers containing an integral number of samples for all active channels.

The DMA engine retrieves samples from the first buffer

until exhausted and continues from the second buffer,

while requesting a new first sample buffer from the

DSPCPU, etc.

The samples are given to the data serializer, which

sends them out in a MSB first or LSB first serial frame for-

mat that can also contain 1 or 2 codec control words of

up to 16 bits. The frame structure is highly programmable

by a series of MMIO fields.

Table 9-1. AO unit external signals

Signal     Type   Description

AO_OSCLK   OUT    Oversampling clock. Can be programmed to emit any
                  frequency up to 40 MHz, with sub-Hz resolution. Intended
                  for use as the 256fs or 384fs oversampling clock by the
                  external D/A conversion subsystem.

AO_SCK     IO     • When AO is programmed to act as a serial interface
                    timing slave (RESET default), AO_SCK acts as input. It
                    receives the serial clock from the external audio D/A
                    subsystem. The clock is treated as fully asynchronous
                    to the PNX1300 main clock.
                  • When AO is programmed to act as serial interface timing
                    master, AO_SCK acts as output. It drives the serial
                    clock for the external audio D/A subsystem. Clock
                    frequency is a programmable integral divide of the
                    AO_OSCLK frequency.
                  AO_SCK is limited to 22 MHz. The sample rate of valid
                  samples embedded within the serial stream is limited by
                  the AO_SCK maximum frequency and the available highway
                  bandwidth.

AO_WS      IO     • When AO is programmed as the serial-interface timing
                    slave (RESET default), AO_WS acts as an input. AO_WS is
                    sampled on the opposite AO_SCK edge at which AO_SDx are
                    asserted.
                  • When AO is programmed as serial-interface timing
                    master, AO_WS acts as an output. AO_WS is asserted on
                    the same AO_SCK edge as AO_SDx.
                  AO_WS is the word-select or frame-sync signal from/to the
                  external D/A subsystem. Each audio channel receives 1
                  sample for every WS period.
                  AO_WS can be set to change on AO_OSCLK positive or
                  negative edges by the CLOCK_EDGE bit.

AO_SD1     OUT    Serial data to stereo external audio D/A subsystem.
                  AO_SD1 can be set to change on AO_OSCLK positive or
                  negative edges by the CLOCK_EDGE bit.

AO_SD2     OUT    Serial data to stereo external audio D/A subsystem.
                  AO_SD2 can be set to change on AO_OSCLK positive or
                  negative edges by the CLOCK_EDGE bit.

AO_SD3     OUT    Serial data to stereo external audio D/A subsystem.
                  AO_SD3 can be set to change on AO_OSCLK positive or
                  negative edges by the CLOCK_EDGE bit.

AO_SD4     OUT    Serial data to stereo external audio D/A subsystem.
                  AO_SD4 can be set to change on AO_OSCLK positive or
                  negative edges by the CLOCK_EDGE bit.


9.4 INTERNAL CLOCK SOURCE

Figure 9-1 illustrates the different clock capabilities of the

AO unit. At the heart of the clock system is a square

wave DDS (Direct Digital Synthesizer). The DDS can be

programmed to emit frequencies from approx. 1 Hz to 80 MHz with sub-Hertz resolution.

The output of the DDS is always sent to the AO_OSCLK output pin. This output is intended to be used as the 256fs or 384fs system clock source for oversampling D/A

converters, such as the Philips SAA7322, or codecs

such as the AD1847, CS4218, or UAD1340.

The PNX1300 DDS frequency is set by writing to the

FREQUENCY MMIO register. The programmer is free to change the FREQUENCY setting dynamically, in order to adjust the outgoing audio sample rate. In ATSC transport stream decoding, this is the method by which the system software locks audio output sample rate to the original program provider sample rate.

Depending on bit 31 (MSB), the DDS runs in one of the

two following modes:

• bit 31 = 1 (standard improved mode)

• bit 31 = 0 (TM-1000 compatibility mode)

9.4.1 PNX1300 Standard Improved Mode

This mode was first available in the TM-1100. In this

mode, a high quality, low-jitter AO_OSCLK is generated.

The setting of the FREQUENCY register to accomplish a given AO_OSCLK frequency is given by the formula:

$$\mathrm{FREQUENCY} = 2^{31} + \frac{f_{OSCLK} \cdot 2^{32}}{9 \cdot f_{DSPCPU}}$$

This mode, and the above formula, should be used for all new software development on PNX1300.

In the improved mode, the DDS synthesizer maximum jitter can be computed as follows:

$$\mathrm{jitter} = \frac{1}{9 \cdot f_{DSPCPU}}$$

Examples of jitter values can be found in Table 9-3.

Figure 9-1. AO clock system and I/O interface

(Block diagram. Labels: Square Wave DDS clocked from 9 × DSPCPUCLK and set by FREQUENCY[31:0]; AO_OSCLK (e.g. 256fs); div N+1 set by SCKDIV, giving AO_SCK (e.g. 64fs); div N+1 set by WSDIV, giving AO_WS; SER_MASTER; Parallel to Serial Converter fed by LEFT[15:0], RIGHT[15:0] and AO_CC[31:0], driving AO_SDx.)

Table 9-2. Clock system setting (fDSPCPU = 133 MHz)

fs         OSCLK   SCK    FREQUENCY    SCKDIV
44.1 kHz   256fs   64fs   2187991971   3
48.0 kHz   256fs   64fs   2191574340   3
44.1 kHz   384fs   64fs   2208246133   5
48.0 kHz   384fs   64fs   2213619686   5

Table 9-3. Jitter values for common DSPCPU MHz

fDSPCPU (MHz)   jitter (nsec)
143             0.777
166             0.669
180             0.617
200             0.555
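The improved-mode formulas can be checked against Table 9-2 and Table 9-3 with a short sketch. The function names are hypothetical, and rounding FREQUENCY to the nearest integer is an assumption (it reproduces all four Table 9-2 entries); the formulas themselves are the ones given above.

```c
#include <stdint.h>

/* FREQUENCY = 2^31 + f_OSCLK * 2^32 / (9 * f_DSPCPU), rounded to nearest.
 * Bit 31 of the result is set, which selects the standard improved mode.
 * Assumes f_OSCLK is small enough that the result fits in 32 bits. */
uint32_t ao_frequency_improved(double f_osclk_hz, double f_dspcpu_hz)
{
    double step = f_osclk_hz * 4294967296.0 / (9.0 * f_dspcpu_hz);
    return (uint32_t)(2147483648.0 + step + 0.5);
}

/* Maximum DDS jitter in nsec: 1 / (9 * f_DSPCPU). */
double ao_jitter_ns(double f_dspcpu_hz)
{
    return 1.0e9 / (9.0 * f_dspcpu_hz);
}
```

For example, a 256fs clock at fs = 44.1 kHz is 11.2896 MHz, and `ao_frequency_improved(256 * 44100.0, 133e6)` yields the first FREQUENCY entry of Table 9-2.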


9.4.2 TM-1000 Compatibility Mode

TM-1000 clock compatibility mode is provided so that

TM-1000 audio software runs without changes. It should NOT be used for new software development, due to a 3x

higher jitter. TM-1000 mode is automatically entered

whenever FREQUENCY[31] = 0. In TM-1000 mode,

AO_OSCLK frequency is set as follows:

9.5 CLOCK SYSTEM OPERATION

The output of the DDS is always sent to the AO_OSCLK output pin. This output is typically used as the 256fs or 384fs system clock source for oversampling D/A converters, such as the Philips SAA7322, or codecs such as the AD1847, CS4218 or UAD1340.

AO_WS and AO_SCK are sent to each external D/A con-

verter in the master mode.

AO_WS, the word strobe, determines the sample rate:

each active channel receives one sample for each

AO_WS period.

AO_SCK is the data bit clock. The number of AO_SCK

clocks in an AO_WS period is the number of data bits in

a serial frame required by the attached D/A converter.

AO_WS is a divider of the bit clock and is set using WSDIV to control the serial frame length. The number of bits

per frame is equal to WSDIV+1. There are some mini-

mum length requirements for a serial frame, refer to

Section 9.6.1.

AO_SCK and AO_WS can be configured as input or output, as determined by the SER_MASTER control field. If

set as output, AO_SCK can be set to a divider of the DDS

output frequency.

Whether set as input or output, the AO_SCK pin signal is

always used as the bit clock for parallel-serial conver-

sion. The AO_WS pin always acts as the trigger to start

the generation of a serial frame. AO_WS can similarly be programmed using WSDIV to control the serial frame length. The number of bits per frame is equal to WSDIV+1.

The preferred use of the clock system options is to use

AO_OSCLK as D/A master clock, and let the D/A con-

verter be a timing slave of the serial interface

(SER_MASTER=1). This is important in view of compat-

ibility with future Trimedia devices, which may only sup-

port the AO unit as serial interface master.

Some D/A converters however, like the AD1847, provide

better SNR properties if they are configured as serial

master, with the AO unit as slave (SER_MASTER=0). As

illustrated by Figure 9-1, the internal parallel to serial

converter that constructs the serial frame is oblivious to

which component is timing master.

9.6 SERIAL DATA FRAMING

The AO unit can generate data in a wide variety of serial

data framing conventions. Figure 9-2 illustrates the no-

tion of a serial frame. If POLARITY=1, a frame starts with

a positive edge of the AO_WS signal. If POLARITY=0, a

serial frame starts with a negative edge on AO_WS. If

CLOCK_EDGE=0, the parallel to serial converter sam-

ples AO_WS on a positive clock edge transition, and out-

puts the first bit (bit 0) of a serial frame on the next falling

edge of AO_SCK.

If CLOCK_EDGE=1, the parallel to serial converter samples AO_WS on the negative edge of AO_SCK, while audio data is output on the positive edge, i.e. the AO_SCK

polarity would be reversed with respect to Figure 9-2.

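TM-1000 compatibility mode FREQUENCY setting (Section 9.4.2):

$$\mathrm{FREQUENCY} = \frac{f_{OSCLK} \cdot 2^{32}}{3 \cdot f_{DSPCPU}}$$

AO_SCK frequency when derived from AO_OSCLK (Section 9.5):

$$\mathrm{SCKDIV} \in [0, 255], \qquad f_{AOSCK} = \frac{f_{AOOSCLK}}{\mathrm{SCKDIV} + 1}$$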
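As a rough illustration of the TM-1000 compatibility formula and of the divider relations (AO_SCK = AO_OSCLK / (SCKDIV+1), bits per frame = WSDIV+1), the following sketch computes the register values. All function names are hypothetical, and rounding to the nearest integer is an assumption; the document does not state a rounding rule for this mode.

```c
#include <stdint.h>

/* TM-1000 compatibility mode: FREQUENCY = f_OSCLK * 2^32 / (3 * f_DSPCPU).
 * The result must keep bit 31 clear, otherwise improved mode is selected. */
uint32_t ao_frequency_tm1000(double f_osclk_hz, double f_dspcpu_hz)
{
    return (uint32_t)(f_osclk_hz * 4294967296.0 / (3.0 * f_dspcpu_hz) + 0.5);
}

/* SCKDIV for a given OSCLK and SCK rate, both expressed as multiples of fs
 * (e.g. 256 and 64). Returns -1 if the ratio is not an integer in 1..256. */
int ao_sckdiv(unsigned osclk_per_fs, unsigned sck_per_fs)
{
    unsigned ratio;

    if (sck_per_fs == 0u || osclk_per_fs % sck_per_fs != 0u)
        return -1;
    ratio = osclk_per_fs / sck_per_fs;
    if (ratio < 1u || ratio > 256u)
        return -1;
    return (int)(ratio - 1u);
}

/* WSDIV for a serial frame of frame_bits bits (1..512), or -1 if invalid. */
int ao_wsdiv(unsigned frame_bits)
{
    if (frame_bits < 1u || frame_bits > 512u)
        return -1;
    return (int)(frame_bits - 1u);
}
```

With a 256fs AO_OSCLK and a 64fs AO_SCK the divider is 4, so SCKDIV = 3, matching Table 9-2; a 64-bit frame needs WSDIV = 63.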

Table 9-4. AO MMIO Clock & Interface Control

Field Name   Description

SER_MASTER   0 = (RESET default) the D/A subsystem is the timing master
                 over the AO serial interface. AO_SCK and AO_WS act as
                 inputs.
             1 = PNX1300 is the timing master over the serial interface.
                 AO_SCK and AO_WS act as outputs. This mode is required
                 for 4, 6 or 8 channel operation.
             The SER_MASTER bit should only be changed while the AO unit
             is disabled, i.e. TRANS_ENABLE = 0.

FREQUENCY    Sets the clock frequency emitted by the AO_OSCLK output.
             RESET default 0.

SCKDIV       Sets the divider used to derive AO_SCK from AO_OSCLK. Set to
             0..255, for division by 1..256. RESET default 0.

WSDIV        Sets the divider used to derive AO_WS from AO_SCK. Set to
             0..511 for a serial frame length of 1..512. RESET default 0.

Figure 9-2. Definition of serial frame bit positions (POLARITY = 1, CLOCK_EDGE = 0)

(Timing diagram of AO_SCK, AO_WS and AO_SDx, showing the numbered bit positions of frame n between the end of frame n-1 and the start of frame n+1.)


Every serial frame transmits a single left and right channel sample, and optional codec control data to each D/A converter. The left and right sample data can be in an LSB first or MSB first form, at an arbitrary serial frame bit

position, and with an arbitrary length.

In MSB-first mode (DATAMODE = 0), the parallel to se-

rial converter sends the value of LEFT[MSB] in bit posi-

tion LEFTPOS in the serial frame. Subsequently, bits

from decreasing bit positions in the LEFT data word, up

to and including LEFT[SSPOS], are transmitted in order.

In LSB-first mode (DATAMODE = 1), the parallel-to-seri-

al converter sends the value of LEFT[SSPOS] in bit po-

sition LEFTPOS in the serial frame. Subsequent bits

from the LEFT data word, up to and including

LEFT[MSB], are transmitted in order. Table 9-6 shows

the transmitted bits in different modes.

Frame bits that do not belong to either LEFT[MSB:SSPOS] or RIGHT[MSB:SSPOS] or a codec control field (Section 9.7, “Codec Control”) are shifted out as zero. This zero extension ensures that PNX1300 can be used

in combination with D/A converters of higher precision

than the actual number of transmitted bits in the current

operating mode, e.g. 18-bit D/As operating with 16-bit

memory data.
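The placement rules above can be modelled on a host machine. The sketch below is a simplified software model, not a description of the hardware implementation: the type and function names are hypothetical, one array element stands for one frame bit, and it assumes the left and right fields do not overlap and that no codec control fields are enabled.

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    unsigned frame_bits;   /* WSDIV + 1, at most 512 */
    unsigned leftpos;      /* LEFTPOS */
    unsigned rightpos;     /* RIGHTPOS */
    unsigned sspos;        /* SSPOS */
    unsigned sample_bits;  /* 16 or 32 */
    int      lsb_first;    /* DATAMODE: 0 = MSB first, 1 = LSB first */
} ao_framing;

/* Copy bits [msb:sspos] of word into the frame, starting at position pos. */
static void place_field(uint8_t *frame, unsigned frame_bits, unsigned pos,
                        uint32_t word, unsigned msb, unsigned sspos,
                        int lsb_first)
{
    unsigned count, i;

    if (sspos > msb)
        return;
    count = msb - sspos + 1u;
    for (i = 0; i < count && pos + i < frame_bits; i++) {
        unsigned bit = lsb_first ? sspos + i : msb - i;
        frame[pos + i] = (uint8_t)((word >> bit) & 1u);
    }
}

/* Build one serial frame; every bit not covered by a field stays zero. */
void ao_build_frame(const ao_framing *f, uint32_t left, uint32_t right,
                    uint8_t *frame)
{
    unsigned msb = f->sample_bits - 1u;

    memset(frame, 0, f->frame_bits);
    place_field(frame, f->frame_bits, f->leftpos, left, msb, f->sspos,
                f->lsb_first);
    place_field(frame, f->frame_bits, f->rightpos, right, msb, f->sspos,
                f->lsb_first);
}
```

With a 64-bit frame, LEFTPOS=0, RIGHTPOS=32, SSPOS=0 and 16-bit samples, frame bits 16..31 and 48..63 come out as zero, which is the zero extension that lets an 18-bit D/A converter operate with 16-bit memory data.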

9.6.1 Serial Frame Limitations

Due to the implementation, the minimum required serial frame length depends on the operating mode, as shown in Table 9-7.

Table 9-5. AO Serial Framing Control Fields

Field Name     Description

POLARITY       0 = serial frame starts with an AO_WS negedge (RESET
                   default)
               1 = serial frame starts with an AO_WS posedge
               This bit should NOT be changed during operation of the AO
               unit, i.e. only update this bit when TRANS_ENABLE = 0.

LEFTPOS (9)    Defines the bit position within a serial frame where the
               first data bit of the left channel is placed. Reset
               default ‘0’.

RIGHTPOS (9)   Defines the bit position within a serial frame where the
               first data bit of the right channel is placed. Reset
               default ‘0’.

DATAMODE       0 = MSB first (RESET default)
               1 = LSB first

SSPOS          Start/Stop bit position. Reset default 0. Note that SSPOS
               is a 5-bit field, with SSPOS bit 4 not-adjacent. This is
               for backwards compatibility in 16 bits/sample modes with
               TM-1000/1100.
               • If DATAMODE=MSB first, transmission starts with the MSB
                 of the sample, i.e. bit 15 for 16 bits/sample modes or
                 bit 31 for 32 bits/sample modes. SSPOS determines the
                 bit index (0..31) in the parallel input word of the last
                 transmitted data bit.
               • If DATAMODE=LSB first, SSPOS determines the bit index
                 (0..31) in the parallel word of the first transmitted
                 data bit. Bits SSPOS up to/including the MSB are
                 transmitted, i.e. up to bit 15 in 16 bits/sample mode
                 and bit 31 in 32 bits/sample mode.
               See Table 9-6 for more information.

CLOCK_EDGE     0 = the parallel to serial converter samples AO_WS on
                   positive edges of AO_SCK and outputs data on the
                   negative edge of AO_SCK (RESET default).
               1 = the parallel to serial converter samples AO_WS on
                   negative edges of AO_SCK and outputs data on positive
                   edges of AO_SCK.

WS_PULSE       0 = emit 50% AO_WS (RESET default).
               1 = emit single AO_SCK cycle AO_WS

NR_CHAN        00 = Only AO_SD1 is active
               01 = AO_SD1 and 2 are active
               10 = AO_SD1, 2 and 3 are active
               11 = AO_SD1..SD4 are active
               Each SD output either receives 1 or 2 channels depending
               on TRANS_MODE mono resp. stereo. Non-active channels
               receive 0 value samples. In mono modes, each channel of a
               SD output receives identical left & right samples. See
               also Table 9-10.

Table 9-6. Bits transmitted for each memory data item S

operating mode              first bit   last bit    valid SSPOS values
16 bits/sample, MSB-first   S[15]       S[SSPOS]    0..15
16 bits/sample, LSB-first   S[SSPOS]    S[15]       0..15
32 bits/sample, MSB-first   S[31]       S[SSPOS]    0..31
32 bits/sample, LSB-first   S[SSPOS]    S[31]       0..31

Table 9-7. Minimum serial frame length in bits

operating mode           minimum serial frame length
16 bits/sample, mono     13 bits
32 bits/sample, mono     13 bits
16 bits/sample, stereo   13 bits
32 bits/sample, stereo   36 bits


9.6.2 I2S Serial Framing Example

Refer to Figure 9-3 and Table 9-8 to see how the AO unit

MMIO registers should be set to transmit 16 or 32 bits of

stereo data via an I2S serial standard to an 18-bit D/A

converter with a 64-bit serial frame.

9.7 CODEC CONTROL

In addition to the left and right data fields that are generated based on autonomous DMA action, a serial frame

generated by the AO unit can be set to contain 1 or 2

control fields up to 16 bits in length. Each control field can

be independently enabled/disabled by the CC1_EN,

CC2_EN bits in AO_CTL. The content shifted into the

frame is taken from the CC1 and CC2 fields in the AO_CC register. The CC1_POS and CC2_POS fields in the AO_CFC register determine the first bit position in the

frame where the control field is emitted. The field is emit-

ted observing the setting of DATAMODE, i.e. LSB or

MSB first.

The CC_BUSY bit in AO_STATUS indicates if the AO

unit is ready to receive another CC1, CC2 value pair.

Writing a new value pair to AO_CC writes the value into

a buffer register, and raises the CC_BUSY status. As

soon as both CC1 and CC2 values have been copied to

a shadow register in preparation for transmission,

CC_BUSY is negated, indicating that the AO logic is

ready to accept a new codec control pair. The old CC1/

CC2 data keeps being transmitted - i.e. software is not

required to provide new CC1 and CC2 data.

Software always needs to ensure that the CC_BUSY sta-

tus is negated before writing a new CC1, CC2 pair. By

polling CC_BUSY, the DSPCPU can emit a sequence of

individual audio frames with distinct control field values

reliably. This can, for example, be used during codec ini-

tialization. No provision is made for interrupt driven oper-

ation of such a sequence of control values; it is assumed

that after initialization, the values of the control fields determine slowly and asynchronously changing parameters such as

volume.
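The CC_BUSY handshake described above can be sketched as a bounded polling loop. This is an illustrative sketch only: the function name, the poll limit, and the way the registers are passed in are hypothetical, and the CC_BUSY bit mask and the packing of CC1/CC2 inside AO_CC are left as caller-supplied parameters because their bit positions are not given in the text.

```c
#include <stdint.h>

/* Write one CC1/CC2 pair once CC_BUSY is negated.
 *   ao_status    : pointer to the AO_STATUS register
 *   ao_cc        : pointer to the AO_CC register
 *   cc_busy_mask : mask selecting CC_BUSY (assumed, taken from the layout)
 *   cc_pair      : CC1 and CC2 already packed in the AO_CC layout
 *   max_polls    : give up after this many status reads
 * Returns 0 if the pair was written, -1 if CC_BUSY never cleared. */
int ao_write_codec_control(volatile const uint32_t *ao_status,
                           volatile uint32_t *ao_cc,
                           uint32_t cc_busy_mask,
                           uint32_t cc_pair,
                           unsigned max_polls)
{
    unsigned i;

    for (i = 0; i < max_polls; i++) {
        if ((*ao_status & cc_busy_mask) == 0u) {
            *ao_cc = cc_pair;  /* hardware raises CC_BUSY after this write */
            return 0;
        }
    }
    return -1;
}
```

Calling this once per control word during codec initialization gives each word its own frame, since the next write is not attempted until the previous pair has been copied to the shadow register.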

It is legal to program the control field positions within the frame such that CC1 and CC2 overlap each other and/or

left/right data fields. If two fields are defined to start at the

same bit position, the priority is left (highest), right, CC1

then CC2. The field with the highest priority will be emit-

ted starting at the conflicting bit position. If a field f2 is de-

fined to start at a bit position i that falls within a field f1

starting at a lower bit position, f2 will be emitted starting

from i and the rest of f1 will be lost. Any bit positions not

belonging to a data or control field will be emitted as ‘0’.

Table 9-8. Example setup for 64-bit I2S framing

Field        Value   Explanation
POLARITY     0       Frame starts with negedge AO_WS.
LEFTPOS      0       LEFT[msb] will go to serial frame position 0.
RIGHTPOS     32      RIGHT[msb] will go to serial frame position 32.
DATAMODE     0       MSB first.
SSPOS        0       Stop with LEFT/RIGHT[0], send 0’s after.
                     (for 32 bits/sample mode, this field could be set to
                     14 to ensure zeroes in all unused bit positions)
CLOCK_EDGE   0       AO_SDx change on negedge AO_SCK.
WSDIV        63      Serial frame length = 64.
WS_PULSE     0       Emit 50% duty cycle AO_WS.

Figure 9-3. Serial frame (64 bits) of an 18-bit precision I2S D/A converter.

(Timing diagram of AO_SCK, AO_WS and AO_SDx: left channel data n (18 bits) starts at frame bit 0, right channel data n (18 bits) starts at frame bit 32, and left channel data n+1 starts the next frame.)

Table 9-9. AO MMIO codec control/status fields

Field Name   Description

CC1 (16)     The 16-bit value of CC1 is shifted into each emitted serial
             frame starting at bit position CC1_POS, as long as CC1_EN is
             asserted.

CC1_POS      Defines the bit position within a serial frame where the
             first data bit of CC1 is placed. RESET default 0.

CC1_EN       0 = CC1 emission disabled (RESET default)
             1 = CC1 emission enabled.

CC2 (16)     The 16-bit value of CC2 is shifted into each emitted serial
             frame starting at bit position CC2_POS, as long as CC2_EN is
             asserted.

CC2_POS      Defines the bit position within a serial frame where the
             first data bit of CC2 is placed. RESET default 0.

CC2_EN       0 = CC2 emission disabled (RESET default)
             1 = CC2 emission enabled.

CC_BUSY      0 = AO is ready to receive a CC1, CC2 pair (RESET default).
             1 = AO is not ready to receive a CC1, CC2 pair. Try again in
                 a few SCK clock intervals.


Figure 9-4 shows a 64-bit frame suitable for use with the

CS4218 codec. It is obtained by setting POLARITY=1,

LEFTPOS=0, RIGHTPOS=32, DATAMODE=0, SS-

POS=0, CLOCK_EDGE=1, WS_PULSE=1, CC1_POS =

16, CC1_EN=1, CC2_POS=48, CC2_EN=1.

Note that frames are generated (externally or internally)

even when TRANS_ENABLE is de-asserted. Writes to

CC1 and CC2 should only be done after

TRANS_ENABLE is asserted. The ‘first’ CC values will

then go out on the next frame. For a summary of codec

control fields, see Table 9-9.

9.8 MEMORY DATA FORMATS

The AO unit autonomously reads samples from memory

in 16 or 32 bit-per-sample memory formats, as shown in

Figure 9-5 for some example modes. Memory samples

are retrieved and used as described in Table 9-10. Suc-

cessive samples are always read from increasing mem-

ory address locations. The setting of the

LITTLE_ENDIAN bit in the AO_CTL register determines

the byte order of retrieved 16 or 32-bit samples. Refer to

Appendix C, “Endian-ness,” for details on byte ordering conventions.

AO hardware implements a double buffering scheme to

ensure that there are always samples available to trans-

mit, even if the DSPCPU is highly loaded and slow to re-

spond to interrupts. The DSPCPU software assigns 2

equal size buffers by writing a base address and size to

the MMIO control fields described in Figure 9-6. Refer to

Section 9.9, “Audio Out Operation,” for details on hard-

ware/software synchronization.

If SIGN_CONVERT is set to one, the MSB of the memory

data is inverted, which is equivalent to translating from

offset binary representation to two’s complement. This

allows the use of an external two’s complement 16-bit D/A converter to generate audio from 16-bit unsigned samples. This MSB inversion also applies to the ‘0’ values

transmitted to non-active output channels.
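The equivalence between MSB inversion and offset-binary to two's complement conversion can be seen in a few lines of C. The function name is hypothetical and this is a host-side illustration of the arithmetic, not a model of the hardware data path.

```c
#include <stdint.h>

/* Invert the MSB of a 16-bit offset-binary sample and return the value
 * that the resulting bit pattern has when read as two's complement. */
int16_t ao_sign_convert16(uint16_t offset_binary)
{
    uint16_t flipped = (uint16_t)(offset_binary ^ 0x8000u);

    /* Reinterpret the 16-bit pattern as signed without relying on
     * implementation-defined narrowing conversions. */
    if (flipped >= 0x8000u)
        return (int16_t)((int32_t)flipped - 65536);
    return (int16_t)flipped;
}
```

The result always equals the unsigned input minus 32768: the smallest unsigned code 0x0000 maps to the most negative value and the mid-scale code 0x8000 maps to zero.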

Note that the AO hardware does not support A-law or μ-law eight-bit data formats. If such formats are desired, the DSPCPU should be used to convert from A-law or μ-law data to 16-bit linear data.
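A software μ-law expansion of the kind mentioned above might look like the following. This is the widely used G.711 μ-law decoding routine, shown as a generic sketch; it is not taken from this data book, and the function name is hypothetical.

```c
#include <stdint.h>

/* Expand one 8-bit G.711 mu-law code to a 16-bit linear sample. */
int16_t ulaw_to_linear(uint8_t code)
{
    unsigned u = (uint8_t)~code;            /* mu-law codes are stored inverted */
    int t = (int)((u & 0x0Fu) << 3) + 0x84; /* mantissa plus bias (0x84) */

    t <<= (u & 0x70u) >> 4;                 /* scale by the 3-bit segment */
    return (int16_t)((u & 0x80u) ? (0x84 - t) : (t - 0x84));
}
```

The resulting linear samples can be written to an AO buffer in a 16 bits/sample mode with SIGN_CONVERT left at 0, since they are already two's complement.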

Table 9-10. Operating modes and memory formats

NR_CHAN   MODE     destination of successive samples
00        mono     SD1.left
00        stereo   SD1.left, SD1.right
01        mono     SD1.left, SD2.left
01        stereo   SD1.left, SD1.right, SD2.left, SD2.right
10        mono     SD1.left, SD2.left, SD3.left
10        stereo   SD1.left, SD1.right, SD2.left, SD2.right, SD3.left,
                   SD3.right
11        mono     SD1.left, SD2.left, SD3.left, SD4.left
11        stereo   SD1.left, SD1.right, SD2.left, SD2.right, SD3.left,
                   SD3.right, SD4.left, SD4.right

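Figure 9-4. Example codec frame layout for a Crystal Semi, CS4218.

(Timing diagram of AO_SCK, AO_WS and AO_SDx over a 64-bit frame: left channel data n (16 bits) in bits 0..15, CC1 (16 bits) in bits 16..31, right channel data n (16 bits) in bits 32..47, CC2 (16 bits) in bits 48..63, each field ending with its lsb; left data n+1 starts the next frame.)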

Figure 9-5. AO memory DMA formats.

16-bit, stereo, NR_CHAN=00
  adr      SD1.left(n)
  adr+2    SD1.right(n)
  adr+4    SD1.left(n+1)
  adr+6    SD1.right(n+1)
  adr+8    SD1.left(n+2)
  adr+10   SD1.right(n+2)
  adr+12   SD1.left(n+3)
  adr+14   SD1.right(n+3)

32-bit, stereo, NR_CHAN=00
  adr      SD1.left(n)
  adr+4    SD1.right(n)
  adr+8    SD1.left(n+1)
  adr+12   SD1.right(n+1)

16-bit, stereo, NR_CHAN=10
  adr      SD1.left(n)
  adr+2    SD1.right(n)
  adr+4    SD2.left(n)
  adr+6    SD2.right(n)
  adr+8    SD3.left(n)
  adr+10   SD3.right(n)
  adr+12   SD1.left(n+1)
  adr+14   SD1.right(n+1)
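The interleaving shown in Table 9-10 and Figure 9-5 reduces to a simple index calculation, sketched below. The function name and parameter conventions are hypothetical: `sd_index` is zero-based (0 for SD1), and `active_outputs` is NR_CHAN + 1.

```c
#include <stddef.h>

/* Byte offset, relative to the buffer base, of one sample.
 *   frame            : sample period index n (0 for the first frame)
 *   sd_index         : 0 for SD1 .. 3 for SD4
 *   right            : nonzero for the right channel (ignored in mono)
 *   active_outputs   : number of active SD outputs, 1..4
 *   stereo           : nonzero for stereo modes
 *   bytes_per_sample : 2 for 16 bits/sample, 4 for 32 bits/sample */
size_t ao_sample_offset(size_t frame, unsigned sd_index, int right,
                        unsigned active_outputs, int stereo,
                        unsigned bytes_per_sample)
{
    unsigned per_output = stereo ? 2u : 1u;
    size_t per_frame = (size_t)active_outputs * per_output;
    size_t index = frame * per_frame
                 + (size_t)sd_index * per_output
                 + (stereo && right ? 1u : 0u);

    return index * bytes_per_sample;
}
```

For 16-bit stereo with NR_CHAN=10, the left sample of SD1 for frame n+1 lands at adr+12, as in the last layout of Figure 9-5.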


9.9 AUDIO OUT OPERATION

Figure 9-6, Table 9-11 and Table 9-12 describe the function of the control and status fields of the AO unit. To ensure compatibility with future devices, any undefined or

reserved MMIO bits should be ignored when read, and

written as zeroes.

The AO unit is reset by a PNX1300 hardware reset, or by

writing 0x80000000 to the AO_CTL register. The AO unit

is not affected by DSPCPU reset initiated through the

BIU_CTL register. Either reset method sets all MMIO

fields as indicated in the tables.

The timestamp counter is reset by TRI_RESET# or by

DSPCPU reset initiated through BIU_CTL. It is not affect-

ed by AO_CTL reset. This ensures that the timestamp

counter stays synchronous with the DSPCPU

CCCOUNT register.

After an AO reset, 5 AO_SCK clock cycles are required

to stabilize the internal circuitry before enabling Audio

Out. This can be accomplished by programming the

AO_FREQ and AO_SERIAL registers to start AO_SCK

generation then waiting for the appropriate 5 AO_SCK

cycle interval.

Programming of the AO_SERIAL MMIO register must follow this sequence:

• set AO_FREQ to ensure that a valid clock is generated (only when AO is the master of the audio clock system)

• MMIO(AO_CTL) = 1 << 31; /* Software Reset */

• MMIO(AO_SERIAL) = 1 << 31; /* sets serial-master mode, starts AO_SCK */

• MMIO(AO_SERIAL) = (1 << 31) | (SCKDIV value); /* then set DIVIDER values */

Figure 9-6. AO status/control field MMIO layout.

MMIO_base offset   Register     Access
0x10 2000          AO_STATUS    r/w
0x10 2004          AO_CTL       r/w
0x10 2008          AO_SERIAL    r/w
0x10 200C          AO_FRAMING   r/w
0x10 2010          AO_FREQ      r/w
0x10 2014          AO_BASE1     r/w
0x10 2018          AO_BASE2     r/w
0x10 201C          AO_SIZE      r/w
0x10 2020          AO_CC        r/w
0x10 2024          AO_CFC       r/w
0x10 2028          AO_TSTAMP    r/o

Field labels shown in the figure: UNDERRUN, HBE (Highway bandwidth error), BUF2_EMPTY, BUF1_EMPTY, BUF1_ACTIVE, CC_BUSY, RESET, TRANS_ENABLE, TRANS_MODE, SIGN_CONVERT, LITTLE_ENDIAN, UDR_INTEN, HBE_INTEN, BUF2_INTEN, BUF1_INTEN, ACK_UDR, ACK_HBE, ACK2, ACK1, SLEEPLESS, CC1_EN, CC2_EN, WS_PULSE, NR_CHAN, SER_MASTER, DATAMODE, CLOCK_EDGE, SSPOS[4], WSDIV, SCKDIV, POLARITY, LEFTPOS, RIGHTPOS, SSPOS, FREQUENCY, BASE1, BASE2, SIZE (in samples), CC1, CC2, CC1_POS, CC2_POS, TIMESTAMP, RESERVED.
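The sequence above can be wrapped in a small routine. The sketch below is an assumption-laden illustration: register offsets are taken from Figure 9-6, bit 31 of AO_CTL and AO_SERIAL follows the snippets above, and SCKDIV is placed in the low bits of AO_SERIAL as the last snippet implies. The write callback, the log structure and all function names are hypothetical, introduced so the ordering can be exercised on a host without hardware.

```c
#include <stdint.h>

#define AO_CTL               0x102004u  /* MMIO_base offsets, Figure 9-6 */
#define AO_SERIAL            0x102008u
#define AO_FREQ              0x102010u
#define AO_CTL_RESET         (UINT32_C(1) << 31)
#define AO_SERIAL_SER_MASTER (UINT32_C(1) << 31)

/* Hypothetical register-write hook: on target this would store to
 * MMIO_base + offset; on a host it can record the access instead. */
typedef void (*mmio_write_fn)(uint32_t offset, uint32_t value, void *ctx);

/* Host-side recorder used to check the order of register writes. */
typedef struct {
    uint32_t offset[8];
    uint32_t value[8];
    unsigned count;
} mmio_log;

void mmio_log_write(uint32_t offset, uint32_t value, void *ctx)
{
    mmio_log *log = (mmio_log *)ctx;

    if (log->count < 8u) {
        log->offset[log->count] = offset;
        log->value[log->count] = value;
        log->count++;
    }
}

/* Start AO_SCK with AO as serial master, in the documented order. */
void ao_start_sck_as_master(mmio_write_fn write, void *ctx,
                            uint32_t frequency, uint32_t sckdiv)
{
    write(AO_FREQ, frequency, ctx);               /* valid clock first */
    write(AO_CTL, AO_CTL_RESET, ctx);             /* software reset */
    write(AO_SERIAL, AO_SERIAL_SER_MASTER, ctx);  /* master mode, SCK on */
    write(AO_SERIAL, AO_SERIAL_SER_MASTER | (sckdiv & 0xFFu), ctx);
    /* The caller must still wait 5 AO_SCK cycles before enabling AO. */
}
```

Passing the write hook as a parameter keeps the ordering logic testable; a target build would supply a hook that performs the actual MMIO store.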

Upon reset, transmission is disabled (TRANS_ENABLE

= 0), and buffer 1 is the active buffer (BUF1_ACTIVE=1).

The DSPCPU initiates transmission by providing two full

equal size buffers and putting their base address and

size in the BASEn and SIZE registers. Once two valid

buffers are assigned, transmission can be enabled by

writing a ‘1’ to TRANS_ENABLE. The AO hardware now

proceeds to empty buffer 1 by transmission of output

samples. Once buffer 1 empties, BUF1_EMPTY is as-

serted, and transmission continues without interruption

from buffer 2. If BUF1_INTEN is enabled, a SOURCE 12

interrupt request is generated.

Note that buffers must be 64-byte aligned (the six LSBs

of AO_BASE1, AO_BASE2 are zero). Buffer sizes must

be a multiple of 64 samples (the 6 LSB’s of AO_SIZE are

zero).
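The alignment rules above, together with the buffer size formulas of Table 9-11, can be checked before a buffer is handed to the AO unit. The function names are hypothetical, and treating a SIZE of zero as invalid is an assumption not stated in the text.

```c
#include <stdint.h>

/* Nonzero if base is 64-byte aligned and size is a nonzero multiple of
 * 64 samples (six LSBs of both values are zero). */
int ao_buffer_params_valid(uint32_t base, uint32_t size_samples)
{
    return (base & 0x3Fu) == 0u
        && (size_samples & 0x3Fu) == 0u
        && size_samples != 0u;
}

/* Buffer size in bytes per Table 9-11: 2, 4, 4 or 8 times SIZE for
 * 16-bit mono, 32-bit mono, 16-bit stereo and 32-bit stereo. */
uint32_t ao_buffer_bytes(uint32_t size_samples, int stereo,
                         unsigned bits_per_sample)
{
    return size_samples * (bits_per_sample / 8u) * (stereo ? 2u : 1u);
}
```

For example, SIZE = 1024 in 16-bit stereo mode describes a 4096-byte buffer.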

The DSPCPU is required to assign a new, full buffer to

BASE1 and perform an ACK1 before buffer 2 empties.

Transmission continues from buffer 2 until it is empty. At

that time, BUF2_EMPTY is asserted and transmission

continues from the new buffer 1, etc. An ACK performs

two functions: it tells the AO unit that the corresponding

BASE register now points to a buffer filled with samples,

and it clears BUF_EMPTY. Upon receipt of an ACK, the

AO hardware removes the BUF_EMPTY related inter-

rupt request line assertion at the next DSPCPU clock

edge. Refer to the interrupt controller documentation for

details on interrupt handler programming. The AO inter-

rupt (SOURCE 12) should always be operated in level

sensitive mode.

9.10 INTERRUPTS

The AO unit has a private interrupt request line to the

DSPCPU vectored interrupt controller. It uses SRC# 12

(same as TM-1000/TM-1100/TM-1300 AO).

An interrupt is asserted as long as one or more of the

UNDERRUN, HBE, BUF1_EMPTY or BUF2_EMPTY

condition flags and the corresponding INTEN bit are as-

serted. Interrupts are sticky, i.e. an interrupt remains as-

serted until the software explicitly clears the condition

flag by an ACK_x action.
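The interrupt condition stated above is a plain OR of flag/enable pairs, which the following sketch expresses directly. The struct and function names are hypothetical, and booleans are used instead of register bit masks because the bit positions are not given in the text.

```c
#include <stdbool.h>

typedef struct {
    bool underrun, hbe, buf1_empty, buf2_empty;       /* condition flags */
} ao_flags;

typedef struct {
    bool udr_inten, hbe_inten, buf1_inten, buf2_inten; /* enable bits */
} ao_intens;

/* True while the AO interrupt request (SOURCE 12) is asserted. */
bool ao_irq_asserted(ao_flags f, ao_intens e)
{
    return (f.underrun   && e.udr_inten)
        || (f.hbe        && e.hbe_inten)
        || (f.buf1_empty && e.buf1_inten)
        || (f.buf2_empty && e.buf2_inten);
}
```

Because the request stays asserted until the flag is acknowledged, an interrupt handler would normally service every flag whose enable bit is set before returning.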

Table 9-11. AO MMIO DMA control fields

Field Name      Description

LITTLE_ENDIAN   0 = big endian memory format (RESET default)
                1 = little endian

BASE1           Base Address of buffer1. Must be a 64-byte aligned address
                in local SDRAM. RESET default 0.

BASE2           Base Address of buffer2. Must be a 64-byte aligned address
                in local SDRAM. RESET default 0.

SIZE            DMA buffer size, in samples. This number of mono samples
                or stereo sample pairs is read from a DMA buffer before
                switching to the other buffer. Buffer size in bytes is as
                follows:
                  16 bps, mono   : 2 * SIZE
                  32 bps, mono   : 4 * SIZE
                  16 bps, stereo : 4 * SIZE
                  32 bps, stereo : 8 * SIZE
                RESET default 0.

TRANS_MODE      00 = mono, 32 bits/sample (RESET default). Left data and
                     Right data sent to each active output are the same.
                01 = stereo, 32 bits/sample
                10 = mono, 16 bits/sample. Left data and Right data are
                     the same.
                11 = stereo, 16 bits/sample
                Refer to Table 9-10 for an explanation of how TRANS_MODE
                and NR_CHAN map to output behavior.

SIGN_CONVERT    0 = leave MSB unchanged (RESET default)
                1 = invert MSB
                (not applied to codec control fields)

Table 9-12. AO DMA status fields (read only)

Field Name    Description

BUF1_ACTIVE   • If 1, buffer 1 will be used for the next sample to be
                transmitted.
              • If 0, buffer 2 will contain the next sample.
              • 1 after RESET.

BUF1_EMPTY    • If 1, buffer 1 is empty.
              • If BUF1_INTEN is also 1, an interrupt request (source 12)
                is asserted.
              • BUF1_EMPTY is cleared by writing a ‘1’ to ACK1, at which
                point the AO hardware will assume that BASE1 and SIZE
                describe a new full buffer.
              • 0 after RESET.

BUF2_EMPTY    • If 1, buffer 2 is empty.
              • If BUF2_INTEN is also 1, an interrupt request (source 12)
                is asserted.
              • BUF2_EMPTY is cleared by writing a ‘1’ to ACK2, at which
                point the AO hardware will assume that BASE2 and SIZE
                describe a new full buffer.
              • 0 after RESET.

HBE           • Highway Bandwidth Error.
              • 0 after RESET.
              • Indicates that no data was transmitted due to inability to
                read the local AO buffer from SDRAM in time. This
                indicates an insufficient allocation of PNX1300 Highway
                bandwidth for the audio sampling rate/mode.

UNDERRUN      • An UNDERRUN error has occurred, i.e. the CPU failed to
                provide a full buffer in time, and no samples were
                transmitted, although requested by the D/A converter.
              • If UDR_INTEN is also 1, an interrupt request (source 12)
                is pending. The UNDERRUN flag can ONLY be cleared by
                writing a ‘1’ to ACK_UDR.
              • 0 after RESET.


9.11 TIMESTAMP

The AO_TSTAMP MMIO register provides a 32-bit

timestamp value that contains the CCCOUNT time value

at which the last sample of the last DMA buffer transmit-

ted was sent across the SD output pin. This value is

available for software inspection (read-only) in the inter-

rupt handler for BUFx_EMPTY.

The implementation involves an internal DSPCPU clock

cycle counter that is reset to have the same value as the

DSPCPU CCCOUNT register. It is guaranteed to be in

sync with the 32 LSBs of CCCOUNT provided that PCSW.CS=1.

9.12 POWERDOWN AND SLEEPLESS

The AO unit enters powerdown state whenever PNX1300 is put in global powerdown mode, except if the SLEEPLESS bit in AO_CTL is set. In the latter case, the block continues DMA operation and will wake up the DSPCPU whenever an interrupt is generated. The internal timestamp counter never powers down to ensure that it remains synchronous with CCCOUNT.

The AO unit can be separately powered down by setting a bit in the BLOCK_POWER_DOWN register. Refer to Chapter 21, “Power Management.”

If the block enters powerdown state, AO_SCK, AO_SDx, and AO_WS hold their value stable. AO_OSCLK continues to provide a D/A converter clock. Once the system wakes up, the signals resume their original transitions at the point where they were interrupted. The external D/A converter subsystem is most likely confused by this behavior; hence it is recommended that the AO unit be stopped (by negating TRANS_ENABLE) before block level powerdown is started, or that SLEEPLESS mode be used when global powerdown is activated.

9.13 HIGHWAY LATENCY AND HBE

The AO unit uses an internal 64-byte buffer as well as an output holding register that contains a single mono sample or single stereo sample pair. Under normal operation, the internal buffer is refreshed from SDRAM fast enough to avoid any missing samples, while data is being emitted from the holding register. If the highway arbiter is set up with an insufficient latency guarantee, the situation can arise that the 64-byte buffer is not refilled and the holding register is exhausted by the time a new output sample is due. In that case the HBE error is raised. The last sample for each channel will be repeated until the buffer is refreshed. The HBE condition is sticky, and can only be cleared by an explicit ACK_HBE. This condition indicates an incorrect setting of the highway bandwidth arbiter.

Given a sample rate fs, and an associated sample interval T (in ns), the arbiter should be set to have a latency of at most T-20 ns for all modes. The latency for 4, 6 and 8 channel modes can be computed as if the system is operating in stereo mode with a 2x, 3x and 4x sample rate, respectively.
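As an illustration of the rule above, the following C sketch computes the maximum arbiter latency for a given sample rate and channel count. The function names are hypothetical (they are not part of any PNX1300 software library), and rounding T to the nearest nanosecond is an assumption chosen because it reproduces the entries of Table 9-14.

```c
#include <stdint.h>

/* Sample interval T in ns for a rate in Hz, rounded to the nearest ns
 * (rounding mode is an assumption, see the lead-in). */
static uint32_t sample_interval_ns(uint32_t rate_hz)
{
    return (uint32_t)((1000000000ULL + rate_hz / 2) / rate_hz);
}

/*
 * Maximum AO arbiter latency in ns: T - 20 ns, where 4-, 6- and 8-channel
 * modes are treated as stereo at 2x, 3x and 4x the sample rate.
 * Returns 0 for an unsupported channel count.
 */
static uint32_t ao_max_arbiter_latency_ns(uint32_t fs_hz, unsigned channels)
{
    uint32_t multiplier;

    switch (channels) {
    case 1:
    case 2: multiplier = 1; break;
    case 4: multiplier = 2; break;
    case 6: multiplier = 3; break;
    case 8: multiplier = 4; break;
    default: return 0;
    }
    return sample_interval_ns(fs_hz * multiplier) - 20;
}
```

For example, ao_max_arbiter_latency_ns(48000, 6) evaluates the 6 channel, 48 kHz case as stereo at 144 kHz.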

Table 9-14 shows the required arbiter latency settings for a number of common operating modes. The rightmost column illustrates the nature of the resulting 64-byte highway requests. These access patterns are not needed to compute arbiter settings, but they may be used to compute bus availability in a given interval.

Refer to Chapter 20, “Arbiter,” for information on arbiter programming.

Table 9-13. AO MMIO Control Fields

Field Name | Description
RESET | Resets the audio-out logic. See Section 9.9, “Audio Out Operation” for a description of the recommended procedure.
TRANS_ENABLE | Transmission Enable flag. 0 = (RESET default) AO inactive. 1 = AO transmits samples and acts as DMA master to read samples from local SDRAM. Do NOT change the POLARITY bit while transmission is enabled.
SLEEPLESS | 0 = (power up default) AO goes into power-down mode if PNX1300 goes to global powerdown mode. 1 = AO continues operation when PNX1300 goes to global powerdown mode. Samples are read from memory as needed, and AO interrupts, when enabled, will wake up the DSPCPU.
BUF1_INTEN | Buffer 1 Empty Interrupt Enable. 0 = (default) no interrupt. 1 = interrupt (SOURCE 12) if buffer 1 empty.
BUF2_INTEN | Buffer 2 Empty Interrupt Enable. 0 = (default) no interrupt. 1 = interrupt (SOURCE 12) if buffer 2 empty.
HBE_INTEN | HBE Interrupt Enable. 0 = (default) no interrupt. 1 = interrupt (SOURCE 12) if a highway bandwidth error occurs.
UDR_INTEN | UNDERRUN Interrupt Enable. 0 = (default) no interrupt. 1 = interrupt (SOURCE 12) if an UNDERRUN error occurs.
ACK1 | Write a 1 to clear the BUF1_EMPTY flag and remove any pending BUF1_EMPTY interrupt request. ACK1 always reads 0.
ACK2 | Write a 1 to clear the BUF2_EMPTY flag and remove any pending BUF2_EMPTY interrupt request. ACK2 always reads 0.
ACK_HBE | Write a 1 to clear the HBE flag and remove any pending HBE interrupt request. ACK_HBE always reads 0.
ACK_UDR | Write a 1 to clear the UNDERRUN flag and remove any pending UNDERRUN interrupt request. ACK_UDR always reads 0.


9.14 ERROR BEHAVIOR

In normal operation, the DSPCPU and AO hardware

continuously exchange buffers without ever failing to

transmit a sample. If the DSPCPU fails to provide a new

buffer in time, the UNDERRUN error flag is raised, and

the last valid sample or sample pair is repeated until a

new buffer of data is assigned by an ACK1 or ACK2. The

UNDERRUN flag is not affected by ACK1 or ACK2; it can

only be cleared by an explicit ACK_UDR.

If an HBE error occurs, the last valid sample or sample pair is repeated until the AO hardware retrieves a new sample buffer across the highway.

Table 9-14. AO highway arbiter latency requirement examples

TransMode | fs (kHz) | T (ns) | max. arbiter latency (ns) | access pattern
stereo, 16 bits/sample | 44.1 | 22,676 | 22,656 | 1 request every 362,812 ns
stereo, 16 bits/sample | 48.0 | 20,833 | 20,813 | 1 request every 333,333 ns
stereo, 16 bits/sample | 96.0 | 10,417 | 10,397 | 1 request every 166,667 ns
6 channel, 16 bits/sample | 48.0 | 20,833 | 6,924 | 1 request every 111,111 ns
stereo, 32 bits/sample | 48.0 | 20,833 | 20,813 | 1 request every 166,667 ns
6 channel, 32 bits/sample | 48.0 | 20,833 | 6,924 | 1 request every 55,556 ns


SPDIF Out Chapter 10

by Gert Slavenburg, Santanu Dutta

10.1 SPDIF OUT OVERVIEW

In this document, the generic PNX1300 name refers

to the PNX1300 Series, or the PNX1300/01/02/11

products.

The PNX1300 SPDIF Output unit (SPDO) allows generation of a 1-bit high-speed serial data stream. The primary application is to make SPDIF (Sony/Philips Digital Interface) data available for use by external audio equipment.

The SPDO unit has the following features:

• fully compliant with IEC958, for both consumer and

professional applications

• supports 2-channel linear PCM audio, with 16 or 24

bits per sample

• supports one or more Dolby Digital(r) 6-channel data

streams embedded per Project 1937

• supports one or more MPEG-1 or MPEG-2 audio

streams embedded per Project 1937

• allows arbitrary, programmable, sample rates from 1

Hz to 300 kHz

• can output data with a sample rate independent of

and asynchronous to the sample rate of the Audio

Out (AO) unit

• hardware performs autonomous DMA of memory

resident IEC958 sub-frames

• hardware performs parity generation and bi-phase

mark encoding

• allows software to have full control over all data con-

tent, including user and channel data

Alternate use of the SPDO unit to generate a general-

purpose high-speed data stream is possible. Potential

applications include use as a high-speed UART or high

speed serial data channel. In this case features are:

• up to 40 Mbit/sec data rate

• full software control over each bit cell transmitted

• LSB first or MSB first data format

10.2 EXTERNAL INTERFACE

The external interface consists of only one pin, SPDO,

which is described in Table 10-1.

An external circuit (see Figure 10-1) is required to pro-

vide an electrically isolated output and convert the 3.3 V

output pin to a drive level of 0.5 V peak-peak into a 75-

ohm load, as required for consumer applications of IEC-

958.

10.3 SUMMARY OF OPERATION

In both SPDIF and transparent DMA modes, SPDO sends alternating memory data buffers out across the output pin. Software initially gives SPDO two memory data buffers and enables the SPDO unit. When the first buffer is sent, SPDO requests a new buffer from software while switching over to use the other buffer, etc. Transmission continues uninterrupted until the unit is disabled.

10.3.1 SPDIF Mode

SPDIF driver software assembles SPDIF data in each memory data buffer. Each memory data buffer consists of groups of 32-bit words in memory. Each word describes the data to be transmitted for a single IEC-958 sub-frame, including what type of preamble is to be included. Each sub-frame is transmitted in 64-clock cycle intervals of the SPDO clock, a programmable clock generated by the SPDO Direct Digital Synthesizer (DDS).

10.3.2 Transparent DMA Mode

In transparent DMA mode, software prepares each data bit exactly as it is to be transmitted, in a series of 32-bit words in each memory data buffer. Each 32-bit word is transmitted LSB first or MSB first in 32-clock cycle intervals of the SPDO clock, a programmable clock generated by the SPDO Direct Digital Synthesizer.

Table 10-1. SPDO external signals

Signal | Type | Description
SPDO | I/O | SPDIF output. Self clocking interface carrying either 2-channel PCM data with samples up to 24 bits, or encoded Dolby AC-3(r) or MPEG audio data for decoding by an external audio component.

Figure 10-1. External SPDIF interface circuitry (shown between the PNX1300 SPDO pin and the RCA phono connector: 10 uF capacitor, 240E and 110E resistors, 1:1 transformer rated 1.5 - 7 MHz)

10.4 IEC-958 SERIAL FORMAT

Figure 10-2 shows the serial format layout of an IEC-958

block. A block starts with a special ‘B’ pre-amble, and

consists of 192 frames. The sample-rate of all embedded

audio data is equal to the frame rate. Each frame con-

sists of 2 sub-frames. Sub-frame 1 always starts with a

‘M’ pre-amble, except for sub-frame 1 in frame 0, which

starts with a ‘B’. Sub-frame 2 always starts with a ‘W’ pre-

amble.

When IEC-958 data carries 2-channel PCM data, one audio sample is transmitted in each sub-frame, ‘left’ in sub-frame 1 and ‘right’ in sub-frame 2. Each sample can be 16 or 24 bits in length, where the MSB is always aligned with bit slot 28 of the sub-frame. In case of more than 20 bits/sample, the Aux field is used for the 4 LSBs.

When IEC-958 data carries non-PCM audio, such as 1 or

more streams of Dolby AC-3 encoded data and/or MPEG

audio, each sub-frame carries 16-bit data. The data of

successive frames adds up to a payload data-stream

which carries its own burst-data. This is described in [2].

Programmers should refer to the IEC-958 documents [1]

and Project 1937 document [2] for a precise description

of the required values in each field for different types of

consumer equipment. A complete discussion of this is-

sue is outside the scope of this document.

The SPDO block hardware only concerns itself with generating B, W and M preambles as well as generating the P (parity) bit. All other bits in the sub-frame are completely determined by software and copied verbatim from memory to output, subject only to bit-cell coding.

The programmer must construct valid IEC-958 blocks by constructing the right sequence of 32-bit words as described in Section 10.7, “IEC-958 Memory Data Format.”

10.5 IEC-958 BIT CELL AND PRE-AMBLE

Each data bit in IEC-958 is transmitted using bi-phase mark encoding. In bi-phase mark encoding, each data bit is transmitted as a cell consisting of two consecutive binary states. The first state of a cell is always inverted from the second state of the previous cell. The second state of a cell is identical to the first state if the data bit value is a “0”, and inverted if the data bit value is a “1”.

Pre-ambles are coded as bi-phase mark violations, where the first state of a cell is not the inverse of the last state of the previous cell.
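The cell rule for data bits can be expressed in a few lines of C. This is a minimal software model for illustration only; on PNX1300 the encoding is performed by the SPDO hardware, and the function name and array-based interface are assumptions. Pre-amble violations are not modeled.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Bi-phase mark encode 'nbits' data bits; bits[0] is transmitted first.
 * 'prev' is the second state (0 or 1) of the preceding cell.
 * Writes 2 * nbits states (0 or 1) to 'states' and returns the second
 * state of the last cell, ready to be passed as 'prev' for the next call.
 */
static int bmc_encode(const uint8_t *bits, size_t nbits, int prev,
                      uint8_t *states)
{
    for (size_t i = 0; i < nbits; i++) {
        int first = !prev;                       /* always invert at cell start */
        int second = bits[i] ? !first : first;   /* mid-cell inversion for a 1  */

        states[2 * i] = (uint8_t)first;
        states[2 * i + 1] = (uint8_t)second;
        prev = second;
    }
    return prev;
}
```

Each data bit produces two states, so a cell always occupies 2 UI's regardless of the bit value.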

The duration of each state in a cell is called a UI (Unit Interval), so that each cell is 2 UI’s long. In SPDO, the length of a UI is 1 SPDO clock cycle as determined by the settings of the DDS (see Section 10.8, “Sample Rate Programming”).

Figure 10-2. Serial format of an IEC958 block. A block starts at frame 0 (indicated by the unique B pre-amble) and runs through frame 191; each frame consists of sub-frame 1 (M pre-amble, or B in frame 0) followed by sub-frame 2 (W pre-amble). Sub-frame bit slots are numbered 0 to 31. A 2-channel PCM sub-frame contains, in order: B, W or M pre-amble; Aux.; sample data; and the V (validity flag), U (user data), C (channel status) and P (parity bit) bits. A non-PCM audio sub-frame contains, in order: B, W or M pre-amble; unused (0) bits; 16-bit data; and the V, U, C and P bits.

Figure 10-3 illustrates the transmission format of 8-bit

data value “10011000”, as well as the transmission for-

mat of the 3 pre-ambles. Note that each pre-amble al-

ways starts with a rising edge. This is made possible

thanks to the presence of the parity bit, which always

guarantees an even number of ‘1’ bits in each sub-frame.

10.6 IEC-958 PARITY

The parity bit, or P bit in Figure 10-2, is computed by the SPDO hardware. The P bit value is set such that bit cells 4 to 31 inclusive contain an even number of ‘1’s (and hence an even number of ‘0’s). The P bit is bi-phase mark encoded using the same method as for all other bits.
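For checking test vectors in software, the parity rule can be modeled as below. This is a sketch only: the SPDO hardware generates P itself, the function name is hypothetical, and the code assumes that bit cell n of the sub-frame is held in bit n of a 32-bit word (the same numbering used by the descriptor word in Table 10-2).

```c
#include <stdint.h>

/*
 * Return the P bit (0 or 1) that makes bit cells 4..31 of a sub-frame
 * contain an even number of 1s. Only bit cells 4..30 of 'subframe' are
 * examined; the pre-amble (bits 0..3) and bit 31 are ignored.
 */
static unsigned iec958_parity_bit(uint32_t subframe)
{
    uint32_t v = subframe & 0x7FFFFFF0u;   /* keep bit cells 4..30 */
    unsigned ones = 0;

    while (v != 0) {
        ones += v & 1u;
        v >>= 1;
    }
    return ones & 1u;   /* 1 if the count so far is odd */
}
```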

10.7 IEC-958 MEMORY DATA FORMAT

The DSPCPU software must prepare a memory data structure that instructs the SPDO hardware to generate correct IEC-958 blocks. This data structure consists of 32-bit words whose content is defined in Table 10-2.

The data structure for a block consists of 384 of these 32-bit descriptor words, one for each subframe of the block, with the correct B, M, W values. All data content, including the U, C and V flags, is fully under control of the software that builds each block.

A DMA buffer handed to the hardware is required to be a multiple of 64 bytes in length. It can contain 1 or more complete blocks, or a block may straddle DMA buffer boundaries. The 64-byte length will result in DMA buffers that contain a multiple of 16 sub-frames.
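A minimal sketch of building one block of descriptor words is shown below. The names are hypothetical, and the caller is assumed to supply, for each sub-frame, a payload word in which bits 4..30 are already positioned as they are to be transmitted (aux, sample data, V, U and C); the code only inserts the preamble code from Table 10-2 and forces bit 31 to ‘0’.

```c
#include <stddef.h>
#include <stdint.h>

/* Preamble codes for bits 3..0 of the descriptor word (Table 10-2). */
enum { SPDO_PRE_B = 0x0, SPDO_PRE_M = 0x1, SPDO_PRE_W = 0x2 };

#define SPDO_SUBFRAMES_PER_BLOCK 384u   /* 192 frames x 2 sub-frames */

/* Combine a preamble code with payload bits 4..30; bit 31 is forced to 0. */
static uint32_t spdo_descriptor(unsigned preamble, uint32_t payload)
{
    return (payload & 0x7FFFFFF0u) | (preamble & 0xFu);
}

/* B for the first sub-frame of the block, then W for sub-frame 2 and
 * M for sub-frame 1 of every following frame. */
static unsigned spdo_preamble_for(size_t subframe_index)
{
    if (subframe_index == 0)
        return SPDO_PRE_B;
    return (subframe_index & 1u) ? SPDO_PRE_W : SPDO_PRE_M;
}

/* Fill 'desc' (384 words) from 'payload' (384 words). */
static void spdo_fill_block(uint32_t *desc, const uint32_t *payload)
{
    for (size_t i = 0; i < SPDO_SUBFRAMES_PER_BLOCK; i++)
        desc[i] = spdo_descriptor(spdo_preamble_for(i), payload[i]);
}
```

One block occupies 384 x 4 = 1536 bytes, which is a multiple of 64, so a DMA buffer holding a whole number of blocks always satisfies the length requirement.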

Note that the descriptor structure is a 32-bit word memory data structure, and is hence subject to processor endian-ness. To allow software to be efficient in both little-endian and big-endian operation, the SPDO block SPDO_CTL register has an endian-ness bit ‘LITTLE_ENDIAN’. The SPDO block performs byte swapping when loading the SPDIF descriptors as follows.

• If LITTLE_ENDIAN = 1, 32-bit words at address ‘a’ will be assembled from bytes (a+3,a+2,a+1,a), with the byte at ‘a+3’ containing the MSB’s and the byte at ‘a’ the LSB’s.

• If LITTLE_ENDIAN = 0, 32-bit words at address ‘a’ will be assembled from bytes (a,a+1,a+2,a+3), with the byte at ‘a’ containing the MSB’s and the byte at ‘a+3’ the LSB’s.
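The two assembly rules can be modeled in C as follows. This is an illustrative software model of what the hardware does when it loads a word, with a hypothetical function name; it is useful for checking that a buffer prepared on a host of either endian-ness yields the intended descriptor values.

```c
#include <stdint.h>

/*
 * Assemble the 32-bit word the SPDO block would load from the four bytes
 * at a[0]..a[3] (memory addresses a .. a+3).
 * little_endian != 0 models LITTLE_ENDIAN = 1, otherwise LITTLE_ENDIAN = 0.
 */
static uint32_t spdo_assemble_word(const uint8_t a[4], int little_endian)
{
    if (little_endian) {
        /* byte at a+3 holds the MSBs, byte at a holds the LSBs */
        return ((uint32_t)a[3] << 24) | ((uint32_t)a[2] << 16) |
               ((uint32_t)a[1] << 8) | (uint32_t)a[0];
    }
    /* byte at a holds the MSBs, byte at a+3 holds the LSBs */
    return ((uint32_t)a[0] << 24) | ((uint32_t)a[1] << 16) |
           ((uint32_t)a[2] << 8) | (uint32_t)a[3];
}
```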

10.8 SAMPLE RATE PROGRAMMING

In the SPDO unit, the frame rate always equals fs, the sample rate of embedded audio. This relation holds for PCM as well as for Dolby AC-3 and MPEG encoded audio. Each frame consists of 128 Unit Intervals (UI’s). The length of a UI is determined by the frequency setting of the DDS (Direct Digital Synthesizer) in the SPDO block. The DDS can be programmed to emit frequencies from approx. 1 Hz to 80 MHz in steps of approx. 0.3 Hz, with a jitter of approx. 750 psec (at DSPCPU frequency of 143 MHz, see equations below).

Programming is accomplished through the FREQUENCY MMIO register. The relation between FREQUENCY and the DDS frequency is given by Eq. 2.

Putting equations 1 and 2 together yields the formula for setting FREQUENCY to accomplish a given sample rate (the unnumbered equation that follows Eq. 2).

The DDS synthesizer maximum jitter is given by the jitter equation that accompanies Table 10-3.

Table 10-2. SPDIF sub-frame descriptor word

bits | definition
31 (MSB) | This bit must be a ‘0’ for future compatibility.
30..4 | Data value for bits 4..30 of the subframe, exactly as they are to be transmitted. Hardware will perform the bi-phase mark encoding and parity generation.
3..0 (LSB) | 0000 - generate a B preamble; 0001 - generate a M preamble; 0010 - generate a W preamble; 0011 .. 1111 - reserved for future use.

Figure 10-3. Bi-phase mark data transmission (waveform of the cells for data bits “1” “0” “0” “1” “1” “0” “0” “0”, and of the pre-ambles with their bi-phase mark violations)

$$f_s = \frac{f_{DDS}}{128} \qquad \text{(Eq. 1)}$$

$$\text{FREQUENCY} = 2^{31} + \frac{f_{DDS} \cdot 2^{32}}{9 \cdot f_{DSPCPU}} \qquad \text{(Eq. 2)}$$

$$\text{FREQUENCY} = 2^{31} + \frac{f_s \cdot 2^{39}}{9 \cdot f_{DSPCPU}}$$
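The equations above translate directly into integer arithmetic. The sketch below uses hypothetical function names and assumes round-to-nearest for the fractional part; with that choice it reproduces the 143 MHz rows of Table 10-3. The same first function can be used for transparent mode by passing the desired bit rate as the DDS frequency.

```c
#include <stdint.h>

/*
 * FREQUENCY value for a DDS output frequency f_dds_hz (Eq. 2):
 *   2^31 + f_dds * 2^32 / (9 * f_cpu)
 * Rounding to nearest is an assumption. Valid for DDS frequencies up to
 * the 80 MHz stated in the text (the 64-bit product does not overflow).
 */
static uint32_t spdo_frequency_for_dds(uint64_t f_dds_hz, uint64_t f_cpu_hz)
{
    uint64_t den = 9u * f_cpu_hz;
    uint64_t inc = ((f_dds_hz << 32) + den / 2) / den;

    return (uint32_t)(0x80000000u + inc);
}

/* FREQUENCY value for an audio sample rate: f_dds = 128 * fs (Eq. 1). */
static uint32_t spdo_frequency_for_fs(uint32_t fs_hz, uint64_t f_cpu_hz)
{
    return spdo_frequency_for_dds(128u * (uint64_t)fs_hz, f_cpu_hz);
}
```

For long-term rate tracking, software would recompute the value for a slightly adjusted fs and write it to the FREQUENCY register.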


Table 10-3 shows settings for common sample rate and

DSPCPU clock combinations:

The programmer is free to change FREQUENCY, and hence the system sample rate, to perform long-term tracking of any absolute timing source and/or control software buffer fullness. Changes to FREQUENCY have an instantaneous effect at the clock level, i.e. the rate of phase progression is changed, not the phase.

10.9 TRANSPARENT MODE

When SPDO is set to operate in transparent mode, it

takes all 32 bits of the memory data and shifts them out

verbatim, without bi-phase mark encoding, parity gener-

ation, or preamble.

Two transparent modes are provided, as determined by

TRANS_MODE in SPDO_CTL: LSB first and MSB first.

One bit of memory data is transmitted for each DDS

clock, such that the FREQUENCY register value for a

desired bitrate is given by the following equation:

The 32-bit memory word is constructed according to the

same rules for LITTLE_ENDIAN as in Section 10.7,

“IEC-958 Memory Data Format.”

10.10 DMA OPERATION

Before enabling the SPDO block, software must assign

two buffers with data to SPDO_BASE1, SPDO_BASE2,

and SPDO_SIZE (buffer size in bytes). Each memory

buffer size must be a multiple of 64 bytes regardless of

the operating mode.

The SPDO block is enabled by writing a ‘1’ to

SPDO_CTL.TRANS_ENABLE. Once enabled, the first

DMA buffer is sent out at the programmed sample rate.

Once the first buffer is empty, BUF1_ACTIVE is negated,

a timestamp is generated (see Section 10.13, “Times-

tamps”) and the BUF1_EMPTY flag in SPDO_STATUS

is asserted. If BUF1_INTEN in SPDO_CTL is also as-

serted, an interrupt to the DSPCPU is generated. The

SPDO block continues emitting the data in DMA buffer 2.

In normal operation, the DSPCPU assigns a new buffer 1 full of data to SPDO and signals this by writing a ‘1’ to ACK_BUF1. The SPDO block immediately negates the BUF1_EMPTY condition and the related interrupt request. Once buffer 2 is empty, similar signaling occurs and the hardware switches back to using buffer 1.
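When choosing SPDO_SIZE it helps to know how long one buffer lasts, since that is the time software has to refill the other buffer. The sketch below (hypothetical names) checks the 64-byte rule and derives the duration for IEC-958 mode, where one 32-bit word is one sub-frame, two sub-frames make a frame, and the frame rate equals fs. Rejecting a size of 0 is an assumption; the text only requires a multiple of 64 bytes.

```c
#include <stdint.h>

/* A DMA buffer size must be a multiple of 64 bytes (0 rejected here
 * as an assumption). */
static int spdo_size_is_valid(uint32_t size_bytes)
{
    return size_bytes != 0 && size_bytes % 64u == 0;
}

/*
 * Time in microseconds (truncated) that one DMA buffer lasts in IEC-958
 * mode: 8 bytes per frame, fs frames per second.
 * Returns 0 for an invalid size or sample rate.
 */
static uint64_t spdo_buffer_duration_us(uint32_t size_bytes, uint32_t fs_hz)
{
    if (!spdo_size_is_valid(size_bytes) || fs_hz == 0)
        return 0;

    uint64_t frames = size_bytes / 8u;
    return frames * 1000000ULL / fs_hz;
}
```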

10.11 DMA ERROR CONDITIONS

Two types of error can occur during DMA operation.

If the software fails to provide a new buffer of data in

time, and both DMA buffers empty out, the SPDO hard-

ware raises the UNDERRUN flag in SPDO_STATUS.

Transmission switches over to the use of the next buffer,

but the data transmitted is incorrect. If UDR_INTEN is

asserted, an interrupt will be generated. The UNDER-

RUN flag is sticky, i.e. it will remain asserted until the

software clears it by writing a ‘1’ to ACK_UDR.

A lower level error can also occur when the limited size

internal buffer empties out before it can be refilled across

the highway. This situation can arise only if insufficient

bandwidth has been requested from the highway. In this

case, the HBE error flag is raised. Refer to Section 10.17,

“HBE and Highway Latency” for a description of how to

set the arbiter latency correctly.

10.12 INTERRUPTS

The SPDO block uses interrupt SRC NUM 25, with interrupt vector MMIO offset 0x1008E4.

It is highly recommended that the interrupt be operated

in level-sensitive mode only.

The SPDO block generates an interrupt if one of the following status bit flags and its corresponding xxx_INTEN flag are set: BUF1_EMPTY, BUF2_EMPTY, HBE, UNDERRUN.

All these status flags are sticky, i.e. they are asserted by hardware when a certain condition occurs, and remain set until the interrupt handler explicitly clears them by writing a ‘1’ to the corresponding ACK bit in SPDO_CTL. The SPDO hardware takes the flag away in the clock cycle after the ACK is received. This allows immediate return from interrupt after performing an ACK.
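The acknowledge step of an interrupt handler can be sketched as a pure function from the status flags to the ACK bits. The bit positions below are HYPOTHETICAL placeholders chosen only so the example compiles; the real positions must be taken from the SPDO_STATUS and SPDO_CTL layout in Figure 10-4. The function names are likewise assumptions.

```c
#include <stdint.h>

/* HYPOTHETICAL bit positions, for illustration only (see Figure 10-4
 * for the real SPDO_STATUS / SPDO_CTL layout). */
#define SPDO_ST_BUF1_EMPTY (1u << 0)
#define SPDO_ST_BUF2_EMPTY (1u << 1)
#define SPDO_ST_HBE        (1u << 2)
#define SPDO_ST_UNDERRUN   (1u << 3)

#define SPDO_CTL_ACK_BUF1  (1u << 0)
#define SPDO_CTL_ACK_BUF2  (1u << 1)
#define SPDO_CTL_ACK_HBE   (1u << 2)
#define SPDO_CTL_ACK_UDR   (1u << 3)

/* Map sticky flags read from SPDO_STATUS to the ACK bits that clear them. */
static uint32_t spdo_acks_for_status(uint32_t status)
{
    uint32_t acks = 0;

    if (status & SPDO_ST_BUF1_EMPTY) acks |= SPDO_CTL_ACK_BUF1;
    if (status & SPDO_ST_BUF2_EMPTY) acks |= SPDO_CTL_ACK_BUF2;
    if (status & SPDO_ST_HBE)        acks |= SPDO_CTL_ACK_HBE;
    if (status & SPDO_ST_UNDERRUN)   acks |= SPDO_CTL_ACK_UDR;
    return acks;
}

/* Value to write to SPDO_CTL: current r/w settings plus the ACK bits.
 * ACK bits always read as 0, so OR-ing them in is sufficient. */
static uint32_t spdo_ctl_with_acks(uint32_t ctl, uint32_t status)
{
    return ctl | spdo_acks_for_status(status);
}
```

A real handler would refill the corresponding DMA buffer before writing ACK_BUF1 or ACK_BUF2, since the ACK tells the hardware that the buffer is full again.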

10.13 TIMESTAMPS

Any outgoing DMA buffer is assigned a 32-bit ‘time of departure’ timestamp. The counter used to generate timestamps uses the DSPCPU clock and the same reset time as the DSPCPU CCCOUNT register, resulting in a value that corresponds to the 32 LSB’s of CCCOUNT, provided that PCSW.CS=1, i.e. the real CCCOUNT counter increments on every clock cycle.

Table 10-3. SPDIF sample rate setting

fs (kHz) | fDSPCPU (MHz) | FREQUENCY (hexadecimal) | UI (nSec) | jitter (nSec)
32.000 | 143 | 0x80D0,9316 | 244.14 | 0.777
32.000 | 166 | 0x80B3,ACF8 | 244.14 | 0.669
32.000 | 180 | 0x80A5,B36E | 244.14 | 0.617
44.100 | 143 | 0x811F,711B | 177.15 | 0.777
44.100 | 166 | 0x80F7,9D93 | 177.15 | 0.669
44.100 | 180 | 0x80E4,5B47 | 177.15 | 0.617
48.000 | 143 | 0x8138,DCA1 | 162.76 | 0.777
48.000 | 166 | 0x810D,8375 | 162.76 | 0.669
48.000 | 180 | 0x80F8,8D25 | 162.76 | 0.617

DDS maximum jitter (Section 10.8):

$$\text{jitter} = \frac{1}{9 \cdot f_{DSPCPU}}$$

Transparent mode FREQUENCY setting (Section 10.9):

$$\text{FREQUENCY} = 2^{31} + \frac{2^{32} \cdot \text{bitrate}}{9 \cdot f_{DSPCPU}}$$


The timestamp can be read in the DMA interrupt handler as MMIO register SPDO_TSTAMP. Its content corresponds to the (synchronized) clock edge at which the last bit in the DMA buffer was sent across the output signal pin.
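Because the timestamp is a free-running 32-bit cycle count, differences between two timestamps should be taken with unsigned wrap-around arithmetic. The helper names below are hypothetical; the sketch assumes the two timestamps are less than 2^32 DSPCPU cycles apart.

```c
#include <stdint.h>

/* DSPCPU cycles elapsed between two 32-bit timestamps; correct across a
 * single counter wrap because unsigned subtraction is modulo 2^32. */
static uint32_t tstamp_cycles_between(uint32_t earlier, uint32_t later)
{
    return later - earlier;
}

/* Convert a cycle count to nanoseconds (truncated) for a given DSPCPU
 * clock frequency in Hz. */
static uint64_t tstamp_cycles_to_ns(uint32_t cycles, uint32_t f_cpu_hz)
{
    return (uint64_t)cycles * 1000000000ULL / f_cpu_hz;
}
```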

10.14 MMIO REGISTER DESCRIPTION

Figure 10-4. SPDO unit status/control field MMIO layout.

Register | MMIO_base offset | Access | Fields
SPDO_STATUS | 0x10 4C00 | r/o | BUF1_EMPTY, BUF2_EMPTY, HBE (Highway bandwidth error), UNDERRUN, BUF1_ACTIVE
SPDO_CTL | 0x10 4C04 | r/w | RESET, TRANS_ENABLE, TRANS_MODE, LITTLE_ENDIAN, SLEEPLESS, UDR_INTEN, HBE_INTEN, BUF2_INTEN, BUF1_INTEN, ACK_UDR, ACK_HBE, ACK_BUF2, ACK_BUF1
SPDO_FREQ | 0x10 4C08 | r/w | FREQUENCY
SPDO_BASE1 | 0x10 4C0C | r/w | BASE1
SPDO_BASE2 | 0x10 4C10 | r/w | BASE2
SPDO_SIZE | 0x10 4C14 | r/w | SIZE (in bytes)
SPDO_TSTAMP | 0x10 4C18 | r/o | TIMESTAMP

Table 10-4. SPDO_STATUS MMIO register

field | type | description
BUF1_EMPTY | r/o | Sticky flag - set if DMA buffer 1 emptied by the SPDO hardware. Can only be cleared by software write to ACK_BUF1.
BUF2_EMPTY | r/o | Sticky flag - set if DMA buffer 2 emptied by the SPDO hardware. Can only be cleared by software write to ACK_BUF2.
HBE | r/o | Highway Bandwidth Error. Sticky flag - set if internal SPDO buffers emptied before new data brought from memory. Refer to Section 10.17, “HBE and Highway Latency.” Can be cleared only by a software write to ACK_HBE.
UNDERRUN | r/o | Sticky flag - set if both DMA buffers were emptied before a new full buffer was assigned by the DSPCPU. The hardware has performed a normal buffer switch over and is emitting old data. Can only be cleared by software write to ACK_UDR.
BUF1_ACTIVE | r/o | Flag - set if the hardware is currently emitting DMA buffer 1 data; negated when emitting DMA buffer 2 data.

Table 10-5. SPDO_CTL MMIO register

field | type | description
ACK_BUF1 | w/o | Always reads as ‘0’. Write a ‘1’ here to clear BUF1_EMPTY. This informs SPDO that DMA buffer 1 is now full. Writing a ‘0’ has no effect.
ACK_BUF2 | w/o | Always reads as ‘0’. Write a ‘1’ here to clear BUF2_EMPTY. This informs SPDO that DMA buffer 2 is now full. Writing a ‘0’ has no effect.
ACK_HBE | w/o | Always reads as ‘0’. Writing a ‘1’ here clears HBE.
ACK_UDR | w/o | Always reads as ‘0’. Writing a ‘1’ here clears UNDERRUN.
BUF1_INTEN | r/w | If BUF1_EMPTY asserted and this bit asserted, the SRC 25 interrupt line is asserted.


To ensure compatibility with future devices, any unde-

fined MMIO bits should be ignored when read, and writ-

ten as ’0’s.

The SPDO_FREQ register determines the frequency of operation of the DDS, and hence the sample rate of outgoing audio. Refer to Section 10.8, “Sample Rate Programming,” and Section 10.9, “Transparent Mode.”

SPDO_BASE1 contains the memory address of DMA buffer 1. SPDO_BASE2 contains the memory address of DMA buffer 2. SPDO_SIZE determines the size, in bytes, of both DMA buffers. Assignments to SPDO_BASE1, SPDO_BASE2 and SPDO_SIZE have no effect on the state of the SPDO_STATUS flags; the ACK_BUF1 and ACK_BUF2 bits signal the assignment of valid data to the DMA buffers. Any change to a BASE register should only be made for an inactive buffer and should precede the ACK to that buffer.

SPDO_TSTAMP is a read-only register containing the

cycle count at which the last bit from the last emptied

buffer was transmitted across the output pin. Refer to

Section 10.13, “Timestamps.”

10.15 RESET

The SPDO block is reset by global PNX1300 reset pin TRI_RESET# or by writing a ‘1’ to the RESET bit in SPDO_CTL. The SPDO block is not affected by DSPCPU reset initiated through the PCI block BIU_CTL register. After reset, the SPDO block is in the following state:

• SPDO_BASE1, SPDO_BASE2, SPDO_SIZE = 0

• SPDO_STATUS: all defined fields set to ’0’, except

BUF1_ACTIVE = 1

• SPDO_CTL all defined fields set to value 0

The SPDO block timestamp counter is reset by

TRI_RESET# or by DSPCPU reset initiated through

BIU_CTL, so as to ensure that it stays synchronous to

the CCCOUNT DSPCPU register.

10.16 POWER DOWN AND SLEEPLESS

The SPDO block enters powerdown state whenever

PNX1300 is put in global powerdown mode, except if the

SLEEPLESS bit in SPDO_CTL is set. In the latter case,

the block continues DMA operation and will wake up the

DSPCPU whenever an interrupt is generated.

SPDO can be separately powered down by setting a bit

in the BLOCK_POWER_DOWN register. For a descrip-

tion of powerdown, see Chapter 21, “Power Manage-

ment.”

The SPDO block should not be active when applying global powerdown (TRANS_ENABLE = 0), or if active, SLEEPLESS should be asserted. SPDO should not be active if powered down separately.

If the block enters power-down state while transmission is enabled, its operation continues from the interrupted clock cycle, but the output signal generated by the block has undergone a pause that is unacceptable to external equipment.

10.17 HBE AND HIGHWAY LATENCY

The SPDO unit uses one internal 64-byte buffer and two 32-bit holding registers. Under normal operation, the internal buffer is refilled from SDRAM fast enough to avoid missing any data, while data is being sent from the two 32-bit registers. If the highway arbiter is set up with an insufficient latency guarantee, the situation can arise in which the 64-byte buffer is not refilled in time. In that case the HBE error is raised, and some data has been irrevocably lost. The HBE condition is sticky, and can only be cleared by an explicit ACK_HBE.

Table 10-5. SPDO_CTL MMIO register (continued)

field | type | description
BUF2_INTEN | r/w | If BUF2_EMPTY asserted and this bit asserted, the SRC 25 interrupt line is asserted.
HBE_INTEN | r/w | If HBE asserted and this bit asserted, the SRC 25 interrupt line is asserted.
UDR_INTEN | r/w | If UNDERRUN asserted and this bit asserted, the SRC 25 interrupt line is asserted.
SLEEPLESS | r/w | If ‘1’, the SPDO block does not power down when PNX1300 goes into global power-down mode. If ‘0’, the block does power down.
LITTLE_ENDIAN | r/w | If asserted, the 32-bit data SPDIF descriptor word or transparent mode data word is assembled using little endian byte ordering, otherwise big-endian.
TRANS_MODE | r/w | 000 - IEC-958 mode. Hardware performs bi-phase mark encoding, preamble generation, and parity generation, and transmits one IEC-958 subframe for each data descriptor word. 010 - transparent mode, LSB first. The 32-bit data descriptor words are transmitted as is, LSB first. 011 - transparent mode, MSB first. The 32-bit data descriptor words are transmitted as is, MSB first. Any other code is reserved for future extensions. The transmission mode should only be changed while transmission is disabled.
TRANS_ENABLE | r/w | Writing a ‘1’ to this bit enables transmission per the selected mode. Writing a ‘0’ here stops any ongoing transmission after completing any actions related to the current data descriptor word.
RESET | w/o | Writing a ‘1’ to this bit resets the SPDO unit and should be used with extreme caution. Ongoing transmission will be interrupted, and receivers may be left in a strange state.


The highway arbiter needs to be programmed such that

the SPDO unit’s latency requirement can always be met.

Refer to Chapter 20, “Arbiter” for details. The required la-

tency can be computed as indicated below.

Given an output data rate fs in samples/sec, 2x 32 bits are required each sample interval. The arbiter should be set to have a latency so that the buffer is refilled before a sample interval expires. See Table 10-6 for example practical settings.
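The entries of Table 10-6 are simply the sample interval in nanoseconds. A one-line sketch (hypothetical function name) is given below; truncating to whole nanoseconds is an assumption that matches the table entries and errs on the safe side.

```c
#include <stdint.h>

/* Maximum SPDO highway latency in ns for sample rate fs_hz: one sample
 * interval, truncated to whole nanoseconds. fs_hz must be non-zero. */
static uint32_t spdo_max_latency_ns(uint32_t fs_hz)
{
    return (uint32_t)(1000000000ULL / fs_hz);
}
```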

10.18 LITERATURE REFERENCES

[1] IEC-958 Digital Audio Interface, Part 1: General; Part 2: Professional applications; Part 3: Consumer applications.

[2] ‘Interface for non-PCM encoded Audio bitstreams applying IEC958’, Philips Consumer Electronics, June 6 1997. IEC 100c/WG11 (project 1937)

Table 10-6. SPDO block highway latency requirements

fs (kHz) | Max. latency (nSec)
32.000 | 31250
44.100 | 22675
48.000 | 20833


PCI Interface Chapter 11

by Gert Slavenburg, Ken-Sue Tan, Babu Kandimalla

11.1 PCI OVERVIEW

In this document, the generic PNX1300 name refers

to the PNX1300 Series, or the PNX1300/01/02/11

products.

PNX1300 includes a PCI interface for easy integration into personal computer applications, where the PCI bus is the standard for high-speed peripherals. In embedded applications, with PNX1300 serving as the main CPU, the PCI bus can interface to peripheral devices that implement functions not provided by the on-chip peripherals. See Figure 11-1.

The main function of the PCI interface is to connect the PNX1300 on-chip highway and PCI buses. A bus cycle on the internal highway that targets an address mapped into PCI space will cause the PCI interface to create a PCI bus cycle. Similarly, a bus cycle on PCI that targets an address mapped into PNX1300 memory space will cause the PCI interface to create a highway bus cycle targeted at SDRAM. For some operations, the PCI interface is explicitly programmed by the DSPCPU.

From PNX1300, only the DSPCPU and the image coprocessor (ICP) unit can cause the PCI interface to create PCI bus cycles; the other on-chip peripherals cannot see external hardware through the PCI interface. From PCI, SDRAM and most of the registers in MMIO space can be accessed by external PCI initiators.

The PCI interface implements DMA (also called block or burst) and non-DMA transfers. DMA transfers are interruptible on 64-byte boundaries. The PCI interface can service outbound (PNX1300 → PCI) and inbound (PCI → PNX1300) data flows simultaneously.

Table 11-1 lists some of the features of the PCI interface.

PNX1300 DMA read transactions use efficient ‘memory read multiple’ PCI transactions, unless explicitly disabled (see Section 11.6.5).

PNX1300 contains an on-board PCI_CLK generator for low-cost configurations. It can be enabled/disabled at boot time. See Section 13.1 on page 13-1.

PNX1300 has a sideband control signal that allows glueless connection of simple slave peripherals directly to the PCI bus wires. This can be used to connect Flash, ROM, SRAM, UARTs, etc. with 8-bit data and demultiplexed addresses. Refer to Chapter 22, “PCI-XIO External I/O Bus.”

Figure 11-1. Two typical system implementations: (a) shows PNX1300 as a PCI peripheral in a desktop PC (host CPU, e.g. x86, with PCI bridge, interrupt controller and PCI bus arbiter; PNX1300 and other PCI agents on the PCI bus), (b) shows an embedded system with PNX1300 as the host CPU (PNX1300, PCI bus arbiter and PCI agents on the PCI bus).

Table 11-1. PCI interface characteristics

Characteristic                     Comments
PCI Compliance                     PCI Local Bus Specification Rev. 2.1
PCI Speed                          Up to 33 MHz
Data bus width                     32-bit only
Address space                      32 bits (4 GB)
Voltage levels                     Drive & receive at either 3.3 V or 5 V
Burst mode                         Yes, w/ double buffering so maximum transfer rate (132 MB/sec) is sustainable
Posted write                       Yes, can be disabled
PCI ‘special cycle’                Not recognized
PCI ‘memory write & invalidate’    Supported for PNX1300 as initiator
PCI ‘interrupt acknowledge’        Not generated
PCI ‘dual-address cycle’           Not generated

11.2 PCI INTERFACE AS AN INITIATOR

The following classes of operations invoked by PNX1300

cause the PCI interface to act as a PCI initiator:

• Transparent, single-word (or smaller) transactions

caused by DSPCPU loads and stores to the PCI

address aperture

• Explicitly programmed single-word I/O or configura-

tion read or write transactions

• Explicitly programmed multi-word DMA transactions.

• ICP DMA

11.2.1 DSPCPU Single-Word Loads/Stores

From the point of view of programs executed by

PNX1300’s DSPCPU, there are three apertures into

PNX1300’s 4-GB memory address space:

• SDRAM space (0.5 to 64 MB; programmable)

• MMIO space (2 MB)

• PCI space

MMIO registers control the positions of the address-

space apertures (see Chapter 3, “DSPCPU Architecture”).

The SDRAM aperture begins at the address specified

in the MMIO register DRAM_BASE and extends upward

to the address in the DRAM_LIMIT register. The

2-MB MMIO aperture begins at the address in

MMIO_BASE (defaults to 0xEFE00000 after power-up).

All addresses that fall outside these two apertures are

assumed to be part of the PCI address aperture. Refer-

ences by DSPCPU loads and stores to the PCI aperture

are reflected to external PCI devices by the coordinated

action of the data cache and PCI interface.
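The aperture rule above can be sketched in C as follows. This is only an illustrative model, not PNX1300 firmware: the function name `classify_address` and the enum are hypothetical, and the sketch assumes DRAM_LIMIT is an exclusive upper bound, which the text does not state.

```c
#include <stdint.h>

/* The three apertures seen by DSPCPU loads and stores. */
typedef enum { APERTURE_SDRAM, APERTURE_MMIO, APERTURE_PCI } aperture_t;

#define MMIO_APERTURE_SIZE 0x00200000u /* fixed 2-MB MMIO aperture */

/*
 * Hypothetical helper: decide which aperture an address falls into, given
 * the values held in DRAM_BASE, DRAM_LIMIT and MMIO_BASE.
 * Assumption: DRAM_LIMIT is treated as an exclusive upper bound.
 */
static aperture_t classify_address(uint32_t addr, uint32_t dram_base,
                                   uint32_t dram_limit, uint32_t mmio_base)
{
    if (addr >= dram_base && addr < dram_limit)
        return APERTURE_SDRAM;
    /* Unsigned subtraction wraps for addr < mmio_base, so one compare
     * covers both ends of the 2-MB window. */
    if ((uint32_t)(addr - mmio_base) < MMIO_APERTURE_SIZE)
        return APERTURE_MMIO;
    /* Everything else is reflected onto the PCI bus. */
    return APERTURE_PCI;
}
```

With the power-up MMIO_BASE of 0xEFE00000, an access to 0xEFE00010 would be classified as MMIO, while 0xF0000000 lies just past the 2-MB window and would be treated as a PCI access.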

When a DSPCPU load or store targets the PCI aperture

(i.e., neither of the other two apertures), the DSPCPU’s

data cache automatically carries out a special sequence

of events. The data cache writes to the PCI_ADR and (if

the DSPCPU operation was a store) PCI_DATA registers

in the PCI interface and asserts (load) or de-asserts

(store) the internal signal pci_read_operation (a direct

connection from the data cache to the PCI interface).

While the PCI interface executes the PCI bus transac-

tion, the DSPCPU is held in the stall state by the data

cache. When the PCI interface has completed the trans-

action, it asserts the internal signal pci_ready (a direct

connection from the PCI interface to the data cache).

When pci_ready is asserted, the data cache finishes the

original DSPCPU operation by reading data from the

PCI_DATA register (if the DSPCPU operation was a

load) and releasing the DSPCPU from the stall state.

Explicit Writes to PCI_ADR, PCI_DATA

The PCI_ADR and PCI_DATA registers are intended to

be used only by the data cache. Explicit writes are not al-

lowed and may cause undetermined results and/or data

corruption.

11.2.2 I/O Operations

Explicit programming by DSPCPU software is the only

way to perform transactions to PCI I/O space. DSPCPU

software writes three MMIO registers in the following

sequence:

1. The IO_ADR register.

2. The IO_DATA register (if PCI operation is a write).

3. The IO_CTL register (controls direction of data

movement and which bytes participate).

The PCI interface starts the PCI-bus I/O transaction

when software writes to IO_CTL. The interface can raise

a DSPCPU interrupt at the completion of the I/O transac-

tion (see BIU_CTL register definition in Section 11.6.5,

“BIU_CTL Register”) or the DSPCPU can poll the appro-

priate status bit (see BIU_STATUS register definition in

Section 11.6.4, “BIU_STATUS Register”). Note that PCI

I/O transactions should NOT be initiated if a PCI config-

uration transaction described below is pending. This is a

strict implementation limitation.

The fully detailed description of the steps needed can be

found in Section 11.6.13, “IO_CTL Register.”

11.2.3 Configuration Operations

As with I/O operations, explicit programming by

DSPCPU software is the only way to perform transac-

tions to PCI configuration space. DSPCPU software

writes three MMIO registers in the following sequence:

1. The CONFIG_ADR register.

2. The CONFIG_DATA register (if PCI operation is a

write).

3. The CONFIG_CTL register (controls direction of data

movement and which bytes participate).

The PCI interface starts the PCI-bus configuration trans-

action when software writes to CONFIG_CTL. As with

the I/O operations, the BIU_STATUS and BIU_CTL registers

monitor the status of the operation and control interrupt

signaling. Note that PCI configuration space transactions

should NOT be initiated if a PCI I/O transaction de-

scribed above is pending. This is a strict implementation

limitation.

The fully detailed description of the steps needed can be

found in Section 11.6.10, “CONFIG_CTL Register.”
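The write ordering for I/O and configuration transactions, and the rule that the two must never be pending together, can be sketched with a small software model. Everything here is hypothetical: the struct stands in for the real MMIO registers, the `ctl` word is passed through unchanged because its bit encoding is defined in Sections 11.6.10 and 11.6.13, and the two `pending` flags stand in for the status bits that software would poll in BIU_STATUS.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Software model of the registers named in Sections 11.2.2 and 11.2.3. */
typedef struct {
    uint32_t io_adr, io_data, io_ctl;
    uint32_t config_adr, config_data, config_ctl;
    bool io_pending, config_pending; /* stand-ins for BIU_STATUS bits */
} biu_model_t;

/* Start an I/O transaction; wdata is NULL for a read. Refuses to start
 * while any I/O or configuration transaction is still pending. */
static bool biu_start_io(biu_model_t *biu, uint32_t adr,
                         const uint32_t *wdata, uint32_t ctl)
{
    if (biu->io_pending || biu->config_pending)
        return false;
    biu->io_adr = adr;              /* 1. address */
    if (wdata != NULL)
        biu->io_data = *wdata;      /* 2. data, writes only */
    biu->io_ctl = ctl;              /* 3. control write starts the cycle */
    biu->io_pending = true;
    return true;
}

/* Same ordering for a configuration transaction. */
static bool biu_start_config(biu_model_t *biu, uint32_t adr,
                             const uint32_t *wdata, uint32_t ctl)
{
    if (biu->io_pending || biu->config_pending)
        return false;
    biu->config_adr = adr;
    if (wdata != NULL)
        biu->config_data = *wdata;
    biu->config_ctl = ctl;
    biu->config_pending = true;
    return true;
}

/* Model of transaction completion (interrupt or polled status bit). */
static void biu_complete(biu_model_t *biu)
{
    biu->io_pending = false;
    biu->config_pending = false;
}
```

In a real driver the `false` return would typically translate into waiting for the earlier transaction to finish before retrying, rather than dropping the request.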

11.2.4 DMA Operations

The PCI interface can operate as an autonomous DMA

engine, executing block-transfer operations at maximum

PCI bandwidth. As with I/O and configuration operations,

DSPCPU software explicitly programs DMA operations.

General-purpose DMA

For DMA between SDRAM and PCI, DSPCPU software

writes three MMIO registers in the following sequence:

1. The SRC_ADR and DEST_ADR registers.

2. The DMA_CTL register (controls direction of data

movement and amount of data transferred).

The PCI interface begins the PCI-bus transactions when

software writes to DMA_CTL. As with the I/O and configuration

operations, the BIU_STATUS and BIU_CTL registers

monitor the status of the operation and control

interrupt signaling.

The fully detailed description of the steps needed to start

a DMA transaction can be found in Section 11.6.16,

“DMA_CTL Register.”

Image-Coprocessor DMA

The PCI interface also executes DMA transactions for

the Image Coprocessor (ICP). The ICP performs rapid

post-processing of image data and writes it at PCI DMA

speed to a PCI graphics card frame buffer. The ICP cannot

perform PCI read transactions. BIU_CTL.IE (ICP

DMA Enable) should be asserted before attempting ICP

PCI operation. Programming of ICP DMA is described in

Section 14.6, “Operation and Programming.”

11.3 PCI INTERFACE AS A TARGET

The PNX1300 PCI interface responds as a target to ex-

ternal initiators for a limited set of PCI transaction types:

• Configuration read/write

• Memory read/write, read line, and read multiple to

the PNX1300 SDRAM or MMIO apertures. See Sec-

tion 11.8, “Limitations.”

PNX1300 ignores PCI transactions other than the above.

11.4 TRANSACTION CONCURRENCY,

PRIORITIES, AND ORDERING

The PCI interface can be processing more than one op-

eration at a given time. There are five distinct classes of

operations implemented by the PCI interface:

1. DSPCPU load/store to PCI space.

2. PCI I/O read/write and PCI configuration read/write.

3. General-purpose DMA read/write.

4. ICP DMA write.

5. External-PCI-agent-initiated read/write (to PNX1300

on-chip resource).

If the active general-purpose DMA transaction is a read,

up to five transactions, one from each, can be active si-

multaneously. If the active general-purpose DMA opera-

tion is a write, then only four transactions can be active

simultaneously because general-purpose DMA writes

force ICP DMA writes to wait until the general-purpose

DMA completes. When a general-purpose DMA write is

pending, an in-progress ICP DMA operation is suspended

at the next 64-byte block boundary and waits until the

completion of the DMA write operation. General-purpose

DMA reads are interleaved with ICP DMA writes, so both

can be active concurrently.

PCI single-data-phase transactions (DSPCPU load/

store, I/O read/write, and configuration read/write) are

executed in the order they are issued to the PCI interface.

Note the strict implementation limitation that PCI

I/O and PCI configuration transactions cannot be

simultaneously active.

11.5 REGISTERS ADDRESSED IN PCI

CONFIGURATION SPACE

Since it is a PCI device, PNX1300 has a set of configu-

ration registers to determine PCI behavior. PCI configu-

ration registers allow full relocation of interrupt binding

and address mapping by the system’s host processor.

This relocatability of PCI-space parameters eases instal-

lation, configuration, and system boot.

The PCI standard specifies a 64-byte PCI configuration

header region within a reserved 256-byte block. During

system initialization, host system software scans the PCI

bus, looking for PCI headers, to determine what PCI de-

vices are present in the system. The fields in the header

region uniquely identify the PCI device and allow the host

to control the device in a generic way. Figure 11-2 shows

the layout of the configuration header region.

Figure 11-2 also shows the initial values for the configu-

ration registers. Some registers, such as Device ID, have

hardwired values, while others are programmed by software.

Still others are set automatically from the external

boot ROM during PNX1300’s power-up initialization.

11.5.1 Vendor ID Register

For PNX1300, the value of the 16-bit Vendor ID field is

hardwired to 0x1131 (Philips). This value identifies the

manufacturer of a PCI device. Valid vendor identifiers

are assigned by the PCI special interest group (PCI SIG)

to ensure uniqueness. The value 0xFFFF is reserved

and must be returned by the host/PCI bridge when an attempt

is made to read a non-existent device’s Vendor ID

configuration register.

11.5.2 Device ID Register

For PNX1300, the value of the 16-bit Device ID field is

hardwired to 0x5402. The Device ID is assigned by the

manufacturer to uniquely identify each PCI device it

makes.

11.5.3 Command Register

The 16-bit command register provides basic control over

a PCI device’s ability to generate and/or respond to PCI

bus cycles. According to the PCI specification, after reset,

all bits in this register are cleared to ‘0’ (except for a

device that must be initially enabled). Clearing all bits to

’0’ logically disconnects the device from the PCI bus for

all accesses except configuration accesses.

The command register format is shown in Figure 11-3.

Table 11-2 summarizes the field values. Note that the

values listed as ‘normally taken’ are not necessarily the

reset values, i.e. the Command register is reset to all ‘0’s,

meaning the features are disconnected on reset.

Following are detailed descriptions of the command register

fields.

I/O (I/O access enable). This bit controls a device’s ability

to respond to I/O-space accesses. A value of ’0’ disables

PCI device response; a value of ’1’ enables response.

This bit is hardwired to ’0’ because all PNX1300

internal registers are memory mapped.

MA (Memory access enable). This bit controls re-

sponse to memory-space accesses. A value of ’0’ dis-

ables PNX1300 response; a value of ’1’ enables re-

sponse. This bit is set to ’0’ at power-up; software can set

this bit to ’1’ with a configuration write.

[Figure 11-2: bit-level map of the configuration header, listed here by configuration-space address offset.]

Offset        Registers (most-significant field first)
00            Device ID (0x5402), Vendor ID (0x1131)
04            Status, Command
08            Class Code (0x048000), Revision ID (see text)
0C            BIST (0x00), Header Type (0x00), Latency Timer, Cache Line Size
10            DRAM Base Address (includes the Prefetchable bit)
14            MMIO Base Address
18, 1C, 20, 24   Four other base address registers
28            Reserved register
2C            Subsystem ID, Subsystem Vendor ID
30            Expansion ROM Base Address
34, 38        Two reserved registers
3C            Max_Lat (0x01), Min_Gnt (0x03), Interrupt Pin (0x01), Interrupt Line

Key:
0    Normally ’0’ (plain) or hardwired to ground (shaded)
1    Normally one (plain) or hardwired to Vdd (shaded)
p    Set by software
sp   Set by software if aperture size allows
s    Set by hardware from boot EEPROM

Figure 11-2. PCI configuration header region register layout and initial values. (All values in hex.)

[Figure 11-3: bit diagram of the 16-bit Command register (bits 15 to 0) with the fields I/O, MA, EM, SC, MWI, VGA, PAR, Wait, SERR#, FB, and Reserved.]

Figure 11-3. Command Register format.

EM (Enable mastering). This bit controls the PNX1300

PCI interface’s ability to act as a PCI master. A value of

’0’ prevents the PCI interface from initiating PCI accesses;

a value of ’1’ allows the PCI interface to initiate PCI

accesses.

Note that the EM bit is automatically set to ’1’ whenever

the HE bit in the BIU_CTL register is set to ’1’ (see Section

11.6.5, “BIU_CTL Register”). Mastering must be enabled

for PNX1300 to serve as PCI host processor.

EM is set to ’0’ at power-up. Host system software can

set this bit to ’1’ with a configuration write.

SC (Special cycle). This bit controls PCI device recognition

of special-cycle operations. A value of ’0’ causes a

PCI device to ignore all special cycles; a value of ’1’ al-

lows a PCI device to monitor special cycle operations.

This bit is hardwired to ’0’ in PNX1300.

MWI (Memory write and invalidate). This bit deter-

mines a PCI device’s ability to generate memory-write-

and-invalidate commands. A value of ’1’ allows a PCI device

to generate memory-write-and-invalidate commands;

a value of ’0’ forces the PCI device to use memory-write

commands instead. PNX1300 implements this

bit. The conditions under which PNX1300 DMA transac-

tions generate memory-write-and-invalidate are de-

scribed in Section 11.6.16, “DMA_CTL Register.” De-

tails of operation can be found in Section 11.5.7, “Cache

Line Size Register.” Image Coprocessor DMA writes always

use regular memory-write transactions.

VGA (VGA palette snoop). This bit controls how VGA-

compatible PCI devices handle accesses to their palette

registers. This bit is hardwired to ’0’.

PAR (Parity error response). This bit controls signaling

of parity errors (data or address). A value of ’0’ causes

the PCI interface to ignore parity errors; a value of ’1’

causes the PCI interface to report parity errors on the

perr# PCI signal. This bit is set to ’0’ at power-up; since

the PCI interface checks parity, software can set this bit

to ’1’ with a configuration write.

Wait (Wait-cycle control). This bit controls whether or

not a PCI device does address/data stepping. PCI devices

that never do stepping must hardwire this bit to 0.

Since PNX1300 does not implement stepping, this bit is

hardwired to ’0’.

SERR# (serr# enable). This bit enables the driver of the

serr# pin (system error): a value of ’0’ disables it, a value

of ’1’ enables it. All PCI devices that have an serr# pin

must implement this bit. This bit is set to ’0’ after reset; it

can be set to ’1’ with a configuration write. SERR# and

PAR must both be set to ’1’ to allow signaling of address

parity errors on the serr# signal.

FB (Fast back-to-back enable). This bit controls whether

or not a PCI master can do fast back-to-back transactions

to different devices. A value of ’0’ means fast back-to-back

transactions are only allowed when the transactions

are to the same agent; a value of ’1’ means the

master is allowed to generate fast back-to-back transactions

to different agents. Initialization software will set

this bit if all targets are capable of fast back-to-back

transactions. In PNX1300, this bit is hardwired to ’0’.

Reserved. Reads from reserved bits return ’0’; writes to

reserved bits cause no action.
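The field descriptions above can be condensed into a mask of software-writable bits. The sketch below is a hedged illustration: the bit positions are those of the standard PCI Command register (bit 0 = I/O up to bit 9 = FB), which is assumed to match Figure 11-3, and the macro and function names are hypothetical.

```c
#include <stdint.h>

/* Command register bit positions per the PCI Local Bus Specification. */
#define PCI_CMD_IO   (1u << 0) /* I/O access enable       */
#define PCI_CMD_MA   (1u << 1) /* memory access enable    */
#define PCI_CMD_EM   (1u << 2) /* enable mastering        */
#define PCI_CMD_SC   (1u << 3) /* special cycle           */
#define PCI_CMD_MWI  (1u << 4) /* memory write/invalidate */
#define PCI_CMD_VGA  (1u << 5) /* VGA palette snoop       */
#define PCI_CMD_PAR  (1u << 6) /* parity error response   */
#define PCI_CMD_WAIT (1u << 7) /* wait-cycle control      */
#define PCI_CMD_SERR (1u << 8) /* serr# enable            */
#define PCI_CMD_FB   (1u << 9) /* fast back-to-back       */

/* Bits the text describes as settable on PNX1300; I/O, SC, VGA, Wait,
 * FB and the reserved bits are hardwired to 0. */
#define PNX1300_CMD_WRITABLE \
    (PCI_CMD_MA | PCI_CMD_EM | PCI_CMD_MWI | PCI_CMD_PAR | PCI_CMD_SERR)

/* Value a configuration read would return after writing `written`. */
static uint16_t pnx1300_cmd_readback(uint16_t written)
{
    return (uint16_t)(written & PNX1300_CMD_WRITABLE);
}

/* Address parity errors reach serr# only when SERR# and PAR are both 1. */
static int pnx1300_serr_on_address_parity(uint16_t cmd)
{
    const uint16_t both = (uint16_t)(PCI_CMD_SERR | PCI_CMD_PAR);
    return (cmd & both) == both;
}
```

Under these assumptions, writing all ones to the register would read back as 0x0156, which is one quick way for host software to discover which features a device implements.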

11.5.4 Status Register

The status register is used to record information about

PCI bus events. The status register format is shown in

Figure 11-4. Table 11-3 lists the Status register fields.

Reserved. Reads from reserved bits return ’0’; writes to

reserved bits cause no action.

66M (66-MHz capable). This bit is hardwired to ’0’ for

PNX1300 (PCI runs at 33-MHz maximum).

UDF (user-definable features). Since the PNX1300

PCI interface does not implement PCI user-definable

features, this bit is hardwired to ’0’.

FBC (Fast back-to-back capable). The PNX1300 PCI

interface does not support fast back-to-back capability,

so this bit is hardwired to ’0’.

DPD (Data parity detected). Since the PNX1300 PCI in-

terface can act as a PCI bus initiator, this bit is imple-

mented. DPD is set in the initiator’s status register when:

• The PAR (parity-error response) bit in the command

register is set.

Table 11-2. Field values for Command Register

Field      Value Explanation
I/O        Hardwired to 0 (ignore I/O space accesses)
MA         0 → no recognition of memory-space accesses
           1 → recognizes memory-space accesses
EM         0 → cannot act as PCI initiator
           1 → can act as PCI initiator
SC         Hardwired to 0 (ignore special cycle accesses)
MWI        0 → cannot generate memory write and invalidate
           1 → can generate memory write and invalidate
VGA        Hardwired to 0
PAR        0 → ignore parity errors
           1 → acknowledge parity errors
SERR#      0 → disable driver for serr# pin
           1 → enable driver for serr# pin
FB         0 → fast back-to-back only to same agent
           1 → fast back-to-back to different agents
Reserved   Write ignored; reads return 0

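[Figure 11-4: bit diagram of the 16-bit Status register, from bit 15 down to bit 0: DPE (15), SSE (14), RMA (13), RTA (12), STA (11), DEVSEL (10-9), DPD, FBC, UDF, 66M (5), Reserved (4-0).]

Figure 11-4. Status register format.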

• The initiator asserted perr# or detected it asserted by

the target (during a write cycle).

DEVSEL (Device select timing). This read-only field

defines the slowest timing that will be used for the

devsel# signal when PNX1300 is a target on the PCI bus.

Table 11-4 shows the allowable encodings and mean-

ings. These bits are hardwired to ‘01’ to indicate that

PNX1300 uses a ‘medium’ devsel# timing.

STA (Signaled target abort). PNX1300’s PCI interface

sets this bit when it is a target device and aborts a

transaction.

RTA (Receive target abort). PNX1300’s PCI interface

sets this bit when it is the initiating device and the transaction

is aborted by the target device. (All initiating devices

must implement this bit.)

RMA (Receive master abort). PNX1300’s PCI interface

sets this bit when it is the initiating device and aborts a

transaction (except when the transaction is a special cy-

cle). (All initiating devices must implement this bit.)

SSE (Signaled system error). PNX1300’s PCI interface

sets this bit when it asserts the serr# signal. (PNX1300

can generate serr#, so this bit is implemented; devices

incapable of generating serr# need not implement SSE.)

DPE (Detected parity error). PNX1300’s PCI interface

sets this bit when it detects a parity error, even if parity

error handling is disabled. (The PAR bit in the command

11.5.5 Revision ID Register

The value in the Revision ID register is a read only value

chosen by the manufacturer to indicate product revisions.

For the PNX1300 product family, the two MSBs of

the revision ID indicate the fab where the part was man-

ufactured. The next two bits indicate an all-layer revision

number, and the 4 LSBs indicate metal layer revisions.

Each all-layer revision adds 0x10 to the revision ID and

resets the 4 LSBs to ‘0’. TriMedia devices that are not

pin- or function-compatible will use the same Revision ID

convention, but with a revised Device ID.
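The Revision ID layout described above can be unpacked with a few shifts and masks. This is a minimal sketch; the struct and function names are hypothetical and not part of any Philips software.

```c
#include <stdint.h>

/* Decoded view of the 8-bit Revision ID, following the text above. */
typedef struct {
    uint8_t fab;       /* bits 7-6: fab where the part was manufactured */
    uint8_t all_layer; /* bits 5-4: all-layer revision number           */
    uint8_t metal;     /* bits 3-0: metal-layer revision                */
} pnx1300_revision_t;

static pnx1300_revision_t pnx1300_decode_revision(uint8_t rev_id)
{
    pnx1300_revision_t r;
    r.fab       = (uint8_t)((rev_id >> 6) & 0x3u);
    r.all_layer = (uint8_t)((rev_id >> 4) & 0x3u);
    r.metal     = (uint8_t)(rev_id & 0xFu);
    return r;
}
```

For example, the value 0x83 from Table 11-5 decodes to fab code 2, all-layer revision 0, and metal revision 3, matching the "tm1f-1.3" entry.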

11.5.6 Class Code Register

The value in the Class Code register is read-only. System

software uses the Class Code register to identify the

generic function of the device, and in some cases, the

Class Code can specify a register-level programming

interface.

Class Code consists of three 1-byte fields as shown in

Figure 11-5. The value of the upper byte, Base Class

Code, broadly classifies the function of the device. The

value of the middle byte, Subclass Code, identifies the

function more specifically. The value of the lower byte

specifies a register-level programming interface so that

device-independent software can interact with the de-

vice. The meanings of the Base Class byte values are

shown in Table 11-6.

The value of Base Class is hardwired to 0x04 since

PNX1300 is a multimedia device. Currently, there are no

specific register-level programming interfaces defined

for multimedia devices.

Table 11-7 lists the defined subclasses of multimedia de-

vices. PNX1300 is both a video and audio multimedia de-

vice, so its subclass value is hardwired to 0x80.
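As a small illustration of the three 1-byte fields, the sketch below splits a 24-bit Class Code value into its parts. The type and function names are hypothetical.

```c
#include <stdint.h>

/* The three 1-byte fields of the 24-bit Class Code register. */
typedef struct {
    uint8_t base_class; /* upper byte: broad device class        */
    uint8_t subclass;   /* middle byte: more specific function   */
    uint8_t prog_if;    /* lower byte: register-level interface  */
} pci_class_code_t;

static pci_class_code_t pci_decode_class_code(uint32_t class_code)
{
    pci_class_code_t c;
    c.base_class = (uint8_t)((class_code >> 16) & 0xFFu);
    c.subclass   = (uint8_t)((class_code >> 8) & 0xFFu);
    c.prog_if    = (uint8_t)(class_code & 0xFFu);
    return c;
}
```

Applied to the PNX1300 value 0x048000, this yields base class 0x04 (multimedia device), subclass 0x80 (other multimedia device), and programming interface 0x00.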

Table 11-3. Status register fields

Field      Characteristics
Reserved   Writes ignored; reads return 0
66M        PCI bus speed (hardwired to 0 → 33-MHz)
UDF        User-definable features (hardwired to 0 → none)
FBC        Fast back-to-back capable (hardwired to 0 → unsupported)
DPD        Data parity detected
DEVSEL     devsel# signal timing (hardwired to 1 → ‘medium’)
STA        Signaled target abort
RTA        Receive target abort
RMA        Receive master abort
SSE        Signaled system error
DPE        Detected parity error

Table 11-4. DEVSEL encodings

DEVSEL Meaning

00 Fast

01 Medium

10 Slow

11 Reserved

Table 11-5. Actual revision ID values

Value (hex) Product description

0x80 TM-1300 original mask - tm1f-1.0

0x81 TM-1300 1st metal revision - tm1f-1.1

0x82 TM-1300 2nd metal revision - tm1f-1.2

0x83 PNX1300/01/02/11 3rd metal revision - tm1f-1.3

[Figure 11-5: bit diagram of the 24-bit Class Code register: Base Class Code (bits 23-16), Subclass Code (bits 15-8), Programming Interface (bits 7-0).]

Figure 11-5. Class-code register format.

11.5.7 Cache Line Size Register

This field only matters when the MWI bit in configuration

space is set. The value of the Cache Line Size register

specifies the host system cache line size in units of 32-

bit words. Initiating devices, such as the PNX1300, that

can generate memory-write-and-invalidate commands

must implement this register. When implemented, the

cache line size allows initiators participating in the PCI

caching protocol to retry burst accesses at cache-line

boundaries.

This register is implemented in PNX1300. In the

PNX1300, PCI DMA performs write-and-invalidate cy-

cles as per the table below. ICP DMA and CPU PCI

writes are performed using normal memory-write cycles.
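The mapping in Table 11-8 from Cache Line Size value to write-and-invalidate chunk size can be expressed as a short lookup. This is a sketch of the table only; the function name is hypothetical, and the additional DMA_CTL conditions from Section 11.6.16 are not modeled.

```c
#include <stdint.h>

/*
 * Chunk size, in 32-bit DWORDs, that PCI DMA would use for
 * memory-write-and-invalidate given the Cache Line Size register value.
 * Returns 0 when only normal memory-write cycles are performed.
 */
static unsigned mwi_chunk_dwords(uint8_t cache_line_size)
{
    switch (cache_line_size) {
    case 4:  /* 0000,0100 */
    case 8:  /* 0000,1000 */
    case 16: /* 0001,0000 */
        return cache_line_size;
    default:
        return 0;
    }
}

/* Same chunk size expressed in bytes (4 bytes per DWORD). */
static unsigned mwi_chunk_bytes(uint8_t cache_line_size)
{
    return mwi_chunk_dwords(cache_line_size) * 4u;
}
```

A host that programs a 32-DWORD cache line, for instance, would fall into the default case, so PNX1300 DMA would use plain memory-write cycles.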

11.5.8 Latency Timer Register

The value of the Latency Timer register specifies the

minimum number of PCI clock cycles the PNX1300 BIU

(as initiator) is allowed to own the PCI bus. This register

is readable and writable in PCI configuration space.

This register must be writable in any PCI-initiating device

that can burst more than two data phases. In the

PNX1300 PCI interface, the least-significant three bits

are hardwired to ’0’ and software can program any value

into the most-significant five bits. This permits software

to specify the time slice with a minimum granularity of

eight PCI clocks. A value of ’0’ signifies maximum laten-

cy, i.e. 256 PCI clocks.
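The effect of the hardwired low bits and the special meaning of zero can be sketched as two small helpers. The function names are hypothetical; the behavior follows the paragraph above.

```c
#include <stdint.h>

/* Value actually held after software writes `written`: the three
 * least-significant bits are hardwired to 0. */
static uint8_t latency_timer_stored(uint8_t written)
{
    return (uint8_t)(written & 0xF8u);
}

/* Time slice in PCI clocks for a stored register value; a stored value
 * of 0 signifies the maximum of 256 clocks. */
static unsigned latency_timer_clocks(uint8_t stored)
{
    return stored != 0 ? stored : 256u;
}
```

Writing 0x47, for example, would store 0x40 and give a 64-clock time slice, while any written value below 8 collapses to 0 and therefore selects 256 clocks.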

11.5.9 Header Type Register

The value of the Header Type register defines the format

of words 16 through 63 in configuration space and

whether or not the device contains multiple functions.

Figure 11-6 shows the format of Header Type.

Bit 7 of Header Type is ’0’ for single-function devices, ’1’

for multi-function devices. PNX1300 is a single-function

device, so bit 7 is ’0’. Table 11-9 shows the encodings of

the Layout field.

11.5.10 Built-In Self Test Register

When implemented, the BIST register is used to control

the operation of a device’s built-in self testing capability.

PNX1300 does not implement BIST, so this register is

hardwired to return ’0’s when read.

11.5.11 Base Address Registers

The PNX1300 PCI interface implements two configura-

tion space memory Base Address registers:

DRAM_BASE and MMIO_BASE. DRAM_BASE relo-

cates PNX1300’s SDRAM within the system address

space; MMIO_BASE relocates the 2-MB memory-

mapped I/O address aperture.

The values in the Base Address registers determine the

address map as seen by both the DSPCPU and external

PCI masters. These values are normally set once, and

not changed dynamically once the DSPCPU operates.

Table 11-6. Base Class Encodings

Base Class (in hex)   Meaning
00                    Device was built before class code definitions were finalized
01                    Mass-storage controller
02                    Network controller
03                    Display controller
04                    Multimedia device
05                    Memory controller
06                    Bridge device
07                    Simple communications controller
08                    Base system peripheral
0A                    Docking station
0B                    Processor
0C                    Serial bus controller
0D–FE                 Reserved
FF                    Device does not fit any of the above classes

Table 11-7. Subclass & programming interface fields

Subclass (in hex)   Programming Interface (in hex)   Meaning
00                  00                               Video device
01                  00                               Audio device
80                  00                               Other multimedia device

Table 11-8. Cache line size values

Cache Line Size (binary)   Effect
0000,0100                  write-and-invalidates are done in 4-DWORD, i.e. 16-byte chunks
0000,1000                  write-and-invalidate in 8-DWORD chunks
0001,0000                  write-and-invalidate in 16-DWORD chunks
all other values           only normal ‘memory-write’ is performed

Table 11-9. Layout encodings

Layout (in hex) Meaning

00 Non-bridge PCI device

01 PCI-to-PCI bridge device

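[Figure 11-6: bit diagram of the 8-bit Header Type register: bit 7 is the single/multi-function flag and the remaining low-order bits hold the Layout field.]

Figure 11-6. Header type register format.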

Hardware RESET initializes DRAM_BASE to 0x0 and

MMIO_BASE to 0xefe0,0000, after which the PNX1300

boot protocol sets the final value.

In standalone systems, the autonomous boot sequence

is executed. In this case, the values of DRAM_BASE and

MMIO_BASE are copied from the content of the serial

boot EEPROM, as described in Section 13.2.2, “Initial

DSPCPU Program Load for Autonomous Bootstrap.”

In X86 or other host-assisted platforms, the PCI host as-

sisted boot sequence is executed. In this case, the base

registers are not set from the EEPROM. Instead, the host

BIOS executes a scan for devices on each PCI bus. During

this scan, memory apertures needed by each device

are determined, and a suitable base is assigned by the

host BIOS. The details of this process are described be-

low.

Figure 11-7 shows the formats for DRAM_BASE and

MMIO_BASE. Following are descriptions of the register

fields.

M (Memory). The value of the M bit indicates whether

the desired resource is a memory or PCI I/O aperture.

The M bit is hardwired to ’0’, indicating a memory type

aperture for both the DRAM_BASE and MMIO_BASE

registers.

T (Type). The value of the T field indicates the size of the

base address register and constraints on its relocatability.

Table 11-10 lists the encodings and meanings of the

T field.

PNX1300’s PCI-interface base registers are 32 bits wide

and can be relocated in the 32-bit address space; thus,

the value of the T field is ‘00’ for both DRAM_BASE and

MMIO_BASE.

P (Prefetchable). The value of the P bit indicates to other

devices whether or not the range is prefetchable.

The P bit in DRAM_BASE reflects the DRAM prefetch-

able attribute as set by the prefetchable bit in the boot

prom (Refer to Table 13-5 on page 13-7 for program-

ming).

MMIO is not prefetchable, so the P bit is hardwired to ’0’

for MMIO_BASE.

Being prefetchable means there are no side effects on

reads, the device returns all bytes on reads regardless of

the byte enables, and host bridges can merge processor

writes into this range without causing errors.

Note: the setting of the P bit does not change the behavior

of the cache or memory interface. It simply signals the

host if the range is assumed to be prefetchable.

DRAM/MMIO base address. In X86 or other host plat-

forms, the configuration space DRAM Base Address and

MMIO Base Address fields serve two purposes. First, the

host BIOS software can use them to determine the sizes

of the SDRAM and MMIO apertures. Second, the BIOS

can write to these fields to cause the apertures to be re-

located within the PCI memory address space.

To determine the size of an aperture, the BIOS first

writes all ‘1’s (0xFFFFFFFF) to the address field. When

the BIOS reads the field immediately after, the value re-

turned will have ’0’s in all don’t-care bits and ‘1’s in all re-

quired address bits. Required address bits form a left-

aligned (i.e., starting at the MSB) contiguous field of ‘1’s,

thus effectively specifying the size of the aperture.

For example, the MMIO aperture is a fixed 2-MB space.

After writing all ‘1’s to the MMIO Base Address field, a

subsequent read returns the value 0xFFE00000. The M,

T, and P fields are all ’0’ indicating the aperture is mem-

ory (not I/O), can be relocated anywhere in a 32-bit ad-

dress space, and is not prefetchable. Since the aperture

has 21 address bits (the position of the first ’1’ bit), MMIO

space is a 2-MB aperture (2^21 bytes). The host BIOS now

assigns a suitable 2-MB aligned base address by writing

to the MMIO_BASE register in configuration space.

The DRAM aperture can range in size from 1 MB to 64

MB (but the size must be a power of 2). Thus, the number

of required address bits can range from 20 to 26. The actual

amount of SDRAM present is determined by the content

of the first byte of the boot EEPROM, as described

in Section 13.4, “Detailed EEPROM Contents.” The PCI

BIU uses this size to determine which of the bits marked

‘sp’ in Figure 11-7 are writable and which are set to ‘0’.

This causes the BIOS to determine the correct actual

DRAM aperture size.
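The sizing procedure described above (write all ones, read back, inspect which bits stuck) reduces to a short computation on the read-back value. The sketch below shows only that arithmetic; the function name is hypothetical, and performing the actual configuration write and read is left to the host's PCI access routines.

```c
#include <stdint.h>

/*
 * Given the value read back from a memory Base Address register after
 * writing 0xFFFFFFFF, return the aperture size in bytes. The low four
 * bits (P, T, M) are attribute bits and are masked off first.
 * Returns 0 if no address bits are implemented.
 */
static uint32_t bar_aperture_size(uint32_t readback)
{
    uint32_t address_mask = readback & ~0xFu;
    if (address_mask == 0)
        return 0;
    /* The required bits form a left-aligned run of ones, so the two's
     * complement of the mask is the aperture size. */
    return ~address_mask + 1u;
}
```

For the MMIO example in the text, a read-back of 0xFFE00000 gives 0x00200000 bytes (2 MB). A DRAM read-back of 0xFC000000 would correspond to the largest 64-MB aperture.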

Table 11-10. Type field encodings

Type   Meaning
00     Base register is 32 bits wide; mapping can relocate anywhere in 32-bit memory space
01     Base register is 32 bits wide; mapping must relocate below 1 MB in memory space
10     Base register is 64 bits wide; mapping can relocate anywhere in 64-bit address space
11     Reserved

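[Figure 11-7: bit diagrams of the 32-bit DRAM_BASE and MMIO_BASE registers. Each holds its base address field in the upper bits, followed by the P (bit 3), T (bits 2-1), and M (bit 0) fields. In DRAM_BASE, the address bits marked ‘sp’ are writable only if the aperture size allows; in MMIO_BASE, the MMIO Base Address occupies bits 31-21 and bits 20-4 read as ’0’.]

Figure 11-7. Base address register format.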

11.5.12 Subsystem ID, Subsystem Vendor ID

The subsystem and subsystem vendor ID are new in PCI

Rev 2.1. These fields are optional, but their use is highly

recommended as a means to have software drivers identify

the board rather than the chip on the board.

This register is implemented starting with PNX1300 and

onwards, and replaces the ‘Personality’ register function-

ality in the TriMedia CTC chip.

The board manufacturer chooses the values of both 16-bit

fields by modifying the PNX1300 Boot EEPROM.

The location of these bits is described in Section 13.4,

“Detailed EEPROM Contents.” A legal Vendor ID must

be obtained from the PCI SIG. The vendor is free to as-

sign subsystem ID’s.

11.5.13 Expansion ROM Base Address

The Expansion ROM Base Address register is similar in

purpose to the SDRAM and MMIO Base Address regis-

ters. This register relocates a separate memory aperture

for PCI devices that wish to implement additional ROM.

PNX1300 does not implement expansion ROM; conse-

quently, the least-significant bit of this register—which in-

dicates whether or not PNX1300 responds to expansion

ROM accesses—is hardwired to ’0’. All other bits also

read as ’0’s.

11.5.14 Interrupt Line Register

The value of the Interrupt Line Register determines which input of the system interrupt controller is driven by PNX1300’s interrupt pin. As it configures the system and assigns resources, host system software writes this register to assign one of the system interrupt lines to PNX1300.

11.5.15 Interrupt Pin Register

The value of the Interrupt Pin Register determines which

interrupt pin PNX1300 uses. Table 11-11 lists the possi-

ble values for this register.

Since PNX1300 uses inta#, the value of this register is

hardwired to ‘1’.

11.5.16 Max_Lat, Min_Gnt Registers

The value in the Max_Lat register specifies how often the PNX1300 PCI interface needs access to the PCI bus. The value in the Min_Gnt register specifies the minimum length for a burst period on the PCI bus.

Both of these timer values are specified as multiples of 250 ns. A value of ‘0’ indicates that a device has no specific requirements for latency and burst length.

For PNX1300, Max_Lat is hardwired to 0x01 (250 ns),

and Min_Gnt is hardwired to 0x03 (750 ns).
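The unit conversion can be checked with a few lines of C. This is only a sketch of the arithmetic stated above; the helper name and the constant names are illustrative assumptions, not part of any PNX1300 software interface.

```c
#include <assert.h>
#include <stdint.h>

/* Max_Lat and Min_Gnt count in units of 250 ns (per the text above). */
#define PCI_TIMER_UNIT_NS 250u

/* Hardwired PNX1300 values quoted in the text. */
enum { PNX1300_MAX_LAT = 0x01, PNX1300_MIN_GNT = 0x03 };

/* Compile-time check of the two durations given in the text. */
static_assert(PNX1300_MAX_LAT * PCI_TIMER_UNIT_NS == 250u, "Max_Lat is 250 ns");
static_assert(PNX1300_MIN_GNT * PCI_TIMER_UNIT_NS == 750u, "Min_Gnt is 750 ns");

/* Hypothetical helper: convert a Max_Lat or Min_Gnt register value to
 * nanoseconds. A value of 0 ("no specific requirement") maps to 0. */
uint32_t pci_timer_to_ns(uint8_t reg_value)
{
    return (uint32_t)reg_value * PCI_TIMER_UNIT_NS;
}
```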

11.6 REGISTERS IN MMIO SPACE

The PNX1300 PCI interface contains 13 MMIO registers; most, except the status bits in BIU_STATUS, are usually written only by the DSPCPU. Table 11-12 lists the supported cycles sequenced by the PCI interface and the registers involved in each cycle. To ensure compatibility with future devices, all undefined MMIO bits should be ignored when read, and written as ‘0’s.

The MMIO registers are all accessible to DSPCPU software, and all but the PCI_ADR and PCI_DATA registers are accessible to external PCI initiators. The facilities of PNX1300’s PCI interface can be useful to external initiators in certain circumstances. For example:

• The PCI DMA engine might be useful during host-assisted boot.

• Host-resident diagnostics may want to test the PCI

interface during boot.

• The MMIO registers can be used to diagnose mal-

functioning parts.

Note, however, that external PCI initiators can access MMIO registers in only one way: as 32-bit words on naturally aligned, 32-bit addresses. If any other type of access is attempted, the results are undefined. Also, the byte order of the external initiator and the PCI interface must be the same; otherwise, the result of the access is undefined.
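The access rule for external initiators can be expressed as a small predicate. The sketch below assumes the offsets listed in Table 11-13; the function name is hypothetical, and it does not check that an offset actually corresponds to a register, nor does it model the byte-order requirement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Offsets from MMIO_BASE, taken from Table 11-13. */
#define PCI_ADR_OFFSET  0x10300Cu
#define PCI_DATA_OFFSET 0x103010u

/* Hypothetical helper: true if an external PCI initiator may issue this
 * MMIO access with defined results, i.e. a 32-bit word on a naturally
 * aligned address that is not one of the two DSPCPU-only registers. */
bool ext_mmio_access_defined(uint32_t offset, unsigned size_bytes)
{
    if (size_bytes != 4u)      return false;  /* only 32-bit words */
    if ((offset & 0x3u) != 0u) return false;  /* must be naturally aligned */
    if (offset == PCI_ADR_OFFSET || offset == PCI_DATA_OFFSET)
        return false;                         /* not accessible externally */
    return true;
}
```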

For easy reference, Table 11-13 lists the MMIO registers together with their offsets from MMIO_BASE and their accessibility by the DSPCPU and external PCI initiators.

Figure 11-8 shows the formats of the PCI interface

MMIO registers. The following are detailed descriptions

of the MMIO registers.

11.6.1 DRAM_BASE Register

The DRAM_BASE register in MMIO space is a shadow copy of the DRAM_BASE register in PCI Configuration space. See Section 11.5.11, “Base Address Registers,” for more details. This copy provides MMIO-space access to this register. The P, T, and M bitfields of this MMIO register are read-only.

11.6.2 MMIO_BASE Register

The MMIO_BASE register in MMIO space is a copy of the MMIO_BASE register in PCI Configuration space. See Section 11.5.11, “Base Address Registers,” for more details. This shadow copy provides MMIO-space access to this register. The P, T, and M bitfields of this MMIO register are read-only.

Table 11-11. Interrupt pin encodings

Interrupt Pin  Meaning
1              Use interrupt pin inta#
2              Use interrupt pin intb#
3              Use interrupt pin intc#
4              Use interrupt pin intd#
all others     Reserved

11.6.3 MMIO/DRAM_BASE updates

The DRAM_BASE and MMIO_BASE registers are not normally written through MMIO; their values are determined by the boot process. Though not recommended, the registers are writable in MMIO. Special care should be exercised when writing these registers:

• Writing to DRAM_BASE moves the origin of any executing DSPCPU program, which will cause it to fail.

• Writing to MMIO_BASE moves devices around, and moves MMIO_BASE and DRAM_BASE around.

• Writing to both registers in sequence requires a delay, due to the implementation. It is recommended to space such writes far apart, or to poll until the first write has taken effect before writing the second register.

[Figure 11-8 shows the bit layout of each PCI interface MMIO register listed in Table 11-13, together with its offset from MMIO_BASE. The named fields are: the Busy/Done bit pairs for config_cycle, io_cycle, dma_cycle, and PCI-to-SDRAM cycles, the two duplicate-request error bits (duplicate dma_cycle; duplicate io_cycle or config_cycle), and the RMA, RTA, and TTE flags in BIU_STATUS; SE, BO, IntE, IE, HE, CR, SR, and RMD in BIU_CTL; and the address, data, BE, RW, RN, FN, and INT fields of the remaining registers.]

Figure 11-8. PCI interface registers accessible in MMIO address space.


11.6.4 BIU_STATUS Register

The BIU_STATUS register holds bits that track the status of bus cycles initiated by the DSPCPU and bus cycles from external devices that write into SDRAM. Two bits of status are provided for each type of bus cycle: a busy bit and a done bit. The DSPCPU can read both bits; a done bit is cleared by writing a ‘1’ to it. The status register also holds two error-flag bits.

DSPCPU software must check the busy bits to avoid is-

suing a PCI interface bus cycle request while a request

of a similar type is in progress. If a bus cycle is issued

while a request of similar type is in progress, the PCI in-

terface ignores the second command and sets the ap-

propriate error bit in the status register.

When the DSPCPU issues either an io_cycle or config_cycle request while a previous request of either type is already in progress, the PCI interface sets bit 8 in BIU_STATUS. When the DSPCPU issues a dma_cycle while a previous one is already in progress, the PCI interface sets bit 9 in BIU_STATUS. To reset either of the error bits 8 or 9 in BIU_STATUS, write a ‘1’ to it.
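The write-1-to-clear behavior of the two error flags can be illustrated with a small software model. This is a sketch only: the macro and function names are assumptions, and only the two error bits whose positions the text gives (8 and 9) are modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Error-flag bit positions given in the text. */
#define BIU_STATUS_ERR_IO_CONFIG (1u << 8)  /* duplicate io_cycle/config_cycle */
#define BIU_STATUS_ERR_DMA       (1u << 9)  /* duplicate dma_cycle */

/* Software model of write-1-to-clear: each modelled bit written as 1 is
 * cleared; bits written as 0, and all unmodelled bits, are left alone. */
uint32_t biu_status_after_write(uint32_t status, uint32_t written)
{
    uint32_t w1c_mask = BIU_STATUS_ERR_IO_CONFIG | BIU_STATUS_ERR_DMA;
    return status & ~(written & w1c_mask);
}
```

Because a written ‘0’ leaves a flag untouched, software can clear one error bit without disturbing the other.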

RTA (Received target abort). This bit is set when PNX1300 initiated a transaction that was aborted by the target. To reset this bit, write a ‘1’ to this bit position. This bit is set simultaneously with the RTA bit in the configuration space status register, but is cleared independently.

RMA (Received master abort). This bit is set when PNX1300 initiated a transaction and aborts it. This usually signals a transaction to a nonexistent device. To reset this bit, write a ‘1’ to this bit position. This bit is set simultaneously with the RMA bit in the configuration space status register, but is cleared independently.

TTE (Target timer expired). In normal operation, a read of a PNX1300 data item is performed on a retry basis: PNX1300 tells the external master to retry while it fetches the data item across the highway. This bit is set if an external master did not retry a read of a PNX1300 data item within 32768 PCI clocks. The requested data is discarded. To reset this bit, write a ‘1’ to this bit position. This is purely a software information bit. No software action is required when this condition occurs, but it may indicate a non-compliant or defective master on the bus.

11.6.5 BIU_CTL Register

The BIU_CTL register contains bits that control miscella-

neous aspects of the PCI interface operation. Following

are descriptions of the fields.

SE (Swap bytes enable). This bit is initialized after reset to ‘0’, which causes the PCI interface to operate in its default big-endian mode. Writing a ‘1’ to SE causes accesses to MMIO registers over the PCI interface to be made in little-endian mode.

BO (Burst mode off). This bit is initialized to ’0’, which

allows the PCI interface to support burst-mode writes as

a target on the PCI bus. Setting this bit to ’1’ disables

burst-mode writes.

With burst mode enabled, the PCI interface buffers as

much data as possible into r_buffer before issuing a dis-

connect to the PCI initiator. With burst mode disabled,

the PCI interface buffers only one data phase before is-

suing a disconnect to the PCI initiator.

IntE (Interrupt enables). The bits in the IntE field control

the signaling of interrupts to the DSPCPU for PCI inter-

face events. These events raise DSPCPU interrupt 16 if

enabled. Interrupt 16 must be set up as a level triggered

interrupt. Table 11-14 lists the function of each IntE bit.

IntE is initially set to ‘0’s (interrupts disabled).
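As an illustration, the IntE bit positions from Table 11-14 can be captured as constants and combined into a BIU_CTL value. The names below are assumptions chosen for this sketch; only the bit numbers come from the table.

```c
#include <assert.h>
#include <stdint.h>

/* BIU_CTL IntE bit positions from Table 11-14. Names are illustrative. */
enum {
    BIU_INTE_CONFIG_DONE   = 1u << 2,  /* config_cycle done */
    BIU_INTE_IO_DONE       = 1u << 3,  /* io_cycle done */
    BIU_INTE_DMA_DONE      = 1u << 4,  /* dma_cycle done */
    BIU_INTE_PCI_DRAM_DONE = 1u << 5,  /* pci_dram write cycle done */
    BIU_INTE_DUP_CFG_IO    = 1u << 6,  /* second config_cycle or io_cycle */
    BIU_INTE_DUP_DMA       = 1u << 7,  /* second dma_cycle */
    BIU_INTE_MASK          = 0xFCu     /* bits 2..7 */
};

/* Return a new BIU_CTL value with exactly the requested IntE bits set;
 * all non-IntE bits of the old value are preserved. */
uint32_t biu_ctl_set_inte(uint32_t biu_ctl, uint32_t inte_bits)
{
    return (biu_ctl & ~(uint32_t)BIU_INTE_MASK) | (inte_bits & BIU_INTE_MASK);
}
```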

Note that the error condition masked by bit 6 (see Section 11.6.4, “BIU_STATUS Register”) occurs when either a config_cycle or an io_cycle is requested and a request of either type is already in progress. That is, the second request need not be of exactly the same type as the one already in progress.

Table 11-12. PCI MMIO registers and bus cycles

Internal Cycle                              Registers Involved
mmio_cycle (MMIO register R/W)              All registers accessible by external PCI devices
mem_cycle (PCI-space memory R/W)            PCI_ADR, PCI_DATA
dma_cycle (Block data transfer)             SRC_ADR, DEST_ADR, DMA_CTL
io_cycle (I/O register R/W)                 IO_ADR, IO_DATA, IO_CTL
config_cycle (Configuration register R/W)   CONFIG_ADR, CONFIG_DATA, CONFIG_CTL

Table 11-13. PCI MMIO register accessibility

Register      Offset      DSPCPU  External Initiator
DRAM_BASE     0x10 0000   R/W     R/W
MMIO_BASE     0x10 0400   R/W     R/W
BIU_STATUS    0x10 3004   R/W     R/W
BIU_CTL       0x10 3008   R/W     R/W
PCI_ADR       0x10 300C   R/W     –/–
PCI_DATA      0x10 3010   R/W     –/–
CONFIG_ADR    0x10 3014   R/W     R/W
CONFIG_DATA   0x10 3018   R/W     R/W
CONFIG_CTL    0x10 301C   R/W     R/W
IO_ADR        0x10 3020   R/W     R/W
IO_DATA       0x10 3024   R/W     R/W
IO_CTL        0x10 3028   R/W     R/W
SRC_ADR       0x10 302C   R/W     R/W
DEST_ADR      0x10 3030   R/W     R/W
DMA_CTL       0x10 3034   R/W     R/W
INT_CTL       0x10 3038   R/W     R/W

IE (ICP DMA enable). This bit must be set to ‘1’ to allow the ICP to write pixel data through the PCI interface. If this bit is cleared to ‘0’, the ICP is not allowed to use the PCI interface. Programming of ICP DMA is described in Section 14.6, “Operation and Programming.”

HE (Host enable). This bit is initialized to ‘0’, which prevents the DSPCPU from serving as the host CPU in the PCI system. If this bit is set to ‘1’, the Enable Mastering (EM) bit in the PCI Configuration register (see Section 11.5.3, “Command Register”) is also set to ‘1’ (since PNX1300 must be enabled to serve as a PCI bus initiator to perform PCI configuration).

CR (PCI clear reset). This bit releases the DSPCPU from its reset state. The PNX1300 device driver (executing on an external host CPU) sets this bit to ‘1’ after it completes PNX1300’s configuration. The DSPCPU then starts executing at the address pointed to by the DRAM_BASE MMIO register.

SR (PCI set reset). This bit forces the DSPCPU into its

reset state. Writing ’1’ to this bit resets the CPU; writing

’0’ causes no action. The PNX1300 device driver (exe-

cuting on an external host CPU) can set this bit to reset

the DSPCPU. This form of reset resets only CPU and In-

struction cache. The Dcache is NOT reset, nor are any

peripherals.

RMD (Read Multiple Disable). In default operating mode, the RMD bit should be set to ‘0’. In that case, the BIU uses ‘memory read multiple’ PCI transactions for BIU DMA, and ‘memory read’ PCI transactions for DSPCPU reads to PCI space. If the RMD bit is set, DMA transactions are forced to also use the less efficient ‘memory read’ transactions. Note that TM-1000 only used memory read transactions.

11.6.6 PCI_ADR Register

The 30-bit PCI_ADR register is intended to be written

only by the data cache. PCI_ADR participates in the spe-

cial two-cycle data-cache-to-PCI protocol. See Section

11.6.7, “PCI_DATA Register,” for more information.

Only the DSPCPU can write to PCI_ADR. External PCI

initiators can neither read nor write this register.

DSPCPU software should not write to this register (by

writing to PCI_ADR in MMIO space). This register is in-

tended only to support the special protocol between the

data cache and PCI bus. An unexpected write to

PCI_ADR via MMIO space will not be prevented by hard-

ware and may result in data corruption on the PCI bus.

11.6.7 PCI_DATA Register

The 32-bit PCI_DATA register is intended to be used

only by the data cache. PCI_DATA participates in the

special two-cycle data-cache-to-PCI protocol.

The PCI_DATA and PCI_ADR registers are used together by the data cache to perform a single-data-phase PCI memory-space read or write. A read operation is triggered when the data cache has written the transaction address into PCI_ADR and asserted the internal signal pci_read_operation (a direct internal connection between the data cache and PCI interface). A write operation is triggered when the data cache has written both PCI_ADR and PCI_DATA with the signal pci_read_operation deasserted.

While the PCI interface is performing the PCI read or write, the DSPCPU is stalled waiting for the completion of the PCI transaction. When the PCI transaction is complete, the PCI interface asserts pci_ready (a direct internal connection between the data cache and PCI interface). To finish a read operation, the data cache reads the PCI_DATA register, forwards the data to the DSPCPU, and then unlocks the DSPCPU. To finish a write, the data cache simply unlocks the DSPCPU.

Note that, if the DSPCPU attempts to access a non-exis-

tent PCI address, an RMA condition occurs. In this case,

the value in the PCI_DATA register is set to ‘0’. Hence,

the DSPCPU always reads non-existent PCI locations as

‘0’.

Normal MMIO write operations to PCI_DATA have no effect. Reads return the register’s current value. External PCI initiators can neither read nor write this register.

11.6.8 CONFIG_ADR Register

The CONFIG_ADR register is written by the DSPCPU to

set up for a configuration cycle. When PNX1300 is acting

as the host CPU, it must configure devices on the PCI

bus. The DSPCPU writes CONFIG_ADR to select a con-

figuration register within a specific PCI device. See Sec-

tion 11.6.10, “CONFIG_CTL Register,” for more infor-

mation on initiating configuration cycles.

Following are descriptions of the fields of CONFIG_ADR.

BN (PCI bus number). The BN field (the two least-significant bits of CONFIG_ADR) selects one of four possible PCI buses. A value of ‘0’ for BN means that the targeted device is on the PCI bus directly connected to PNX1300 and that any PCI-to-PCI bridges should ignore the configuration address. Any value for BN other than ‘0’ means that the targeted device is on a PCI bus connected to a PCI-to-PCI bridge and that all devices directly connected to PNX1300’s local PCI bus should ignore the configuration address.

Table 11-14. IntE bit functions

BIU_CTL Bit  If set to ‘1’, interrupt DSPCPU when...
2            config_cycle done
3            io_cycle done
4            dma_cycle done
5            pci_dram write cycle done
6            second config_cycle or io_cycle requested
7            second dma_cycle requested

RN (Register number). The RN field (bits 2..7 of CONFIG_ADR) is used to specify one of the 64 configuration words within the target device’s configuration space.

FN (Function number). The FN field (bits 8..10 of

CONFIG_ADR) is used to specify one of up to eight func-

tions of the addressed PCI device.

DN (Device number). The DN field (bits 11..31 of CONFIG_ADR) is used to select the targeted PCI device. Each bit corresponds to one of the 21 possible PCI devices on a single PCI bus, i.e., each bit corresponds to the idsel signal of one PCI device. Only one idsel signal, and therefore only one DN bit, can be asserted during a given configuration cycle.
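The field layout described above lends itself to a small packing helper. The following is a minimal sketch: the function name and the dev_index parameter (which picks the single DN bit to assert) are assumptions made for illustration; only the bit positions come from the text.

```c
#include <assert.h>
#include <stdint.h>

/* Pack CONFIG_ADR from the fields described above:
 * BN bits 0..1, RN bits 2..7, FN bits 8..10, DN bits 11..31 (one-hot).
 * dev_index (0..20) is a hypothetical convenience parameter selecting
 * which single DN bit, and hence which idsel line, is asserted. */
uint32_t config_adr_pack(unsigned bn, unsigned rn, unsigned fn,
                         unsigned dev_index)
{
    assert(bn < 4u && rn < 64u && fn < 8u && dev_index < 21u);
    return (uint32_t)bn
         | ((uint32_t)rn << 2)
         | ((uint32_t)fn << 8)
         | ((uint32_t)1u << (11u + dev_index));
}
```

For example, register word 1 of function 0 on the device wired to the lowest DN bit of the local bus packs to 0x804.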

11.6.9 CONFIG_DATA Register

The 32-bit CONFIG_DATA register is used by the

DSPCPU to buffer data for a configuration cycle. When

PNX1300 is acting as the host CPU, it must configure the

PCI bus and devices. The DSPCPU writes or reads

CONFIG_DATA depending on whether it is performing a

write or read to a PCI device’s configuration space. See

Section 11.6.10, “CONFIG_CTL Register,” for more in-

formation on initiating configuration cycles.

11.6.10 CONFIG_CTL Register

The DSPCPU writes to CONFIG_CTL to trigger a config-

uration read or write cycle on the PCI bus. A PCI config-

uration read or write should not be performed during an

ongoing PCI I/O read or write.

The steps involved in a DSPCPU PCI configuration ac-

cess are:

1. Wait until BIU_STATUS io_cycle.Busy and

config_cycle.Busy are both de-asserted

2. Write to CONFIG_ADR as described above, and (in

case of a write operation) write to CONFIG_DATA.

3. Write to CONFIG_CTL to start the read or write. This action sets config_cycle.Busy.

4. Wait (polling or interrupt based) until

config_cycle.Done is asserted by the hardware.

5. Retrieve the requested data in CONFIG_DATA (in

case of a read)

6. Clear config_cycle.Done by writing a ‘1’ to it.
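The six steps above can be sketched against a small software model of the registers. Everything here is an assumption made for illustration: the status bit positions are placeholders (the real ones are shown in Figure 11-8), the model completes a cycle instantly, and on real hardware the registers would be volatile MMIO locations rather than struct members.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: placeholder bit positions for this sketch only. */
#define ST_CONFIG_BUSY (1u << 0)
#define ST_CONFIG_DONE (1u << 1)
#define ST_IO_BUSY     (1u << 2)
#define CTL_RW_READ    (1u << 4)   /* RW field, bit 4: 1 = read */

/* Software stand-in for the BIU registers plus one target register. */
typedef struct {
    uint32_t biu_status, config_adr, config_data;
    uint32_t target_reg;           /* simulated device configuration word */
} biu_model;

/* Model of the hardware reacting to a CONFIG_CTL write. The cycle
 * completes immediately, so Busy is never observed as set here. */
static void model_write_config_ctl(biu_model *m, uint32_t ctl)
{
    if (ctl & CTL_RW_READ) m->config_data = m->target_reg;
    else                   m->target_reg  = m->config_data;
    m->biu_status |= ST_CONFIG_DONE;
}

/* Steps 1 to 6 for a configuration read. */
uint32_t config_read(biu_model *m, uint32_t config_adr)
{
    while (m->biu_status & (ST_IO_BUSY | ST_CONFIG_BUSY)) { }  /* step 1 */
    m->config_adr = config_adr;                                 /* step 2 */
    model_write_config_ctl(m, CTL_RW_READ);   /* step 3: read, BE = 0000 */
    while (!(m->biu_status & ST_CONFIG_DONE)) { }               /* step 4 */
    uint32_t value = m->config_data;                            /* step 5 */
    m->biu_status &= ~ST_CONFIG_DONE;         /* step 6: models the W1C */
    return value;
}

/* Small demonstration against the model. */
uint32_t config_read_demo(void)
{
    biu_model m = { 0u, 0u, 0u, 0x12345678u };
    uint32_t v = config_read(&m, 0x800u);
    assert((m.biu_status & ST_CONFIG_DONE) == 0u);
    return v;
}
```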

Following are descriptions of the fields of CONFIG_CTL

and a discussion of how a DSPCPU write to

CONFIG_CTL triggers configuration cycles.

BE (Byte enables). The BE field (the four LSBs of CONFIG_CTL) determines the state of PCI’s 4-line c/be# bus during the data phase of a configuration cycle. Since the c/be# bus signals are active low, a ‘0’ in a BE field bit means ‘byte participates;’ a ‘1’ in a BE field bit means ‘byte does not participate.’ Table 11-15 shows the correspondence between BE bits and bytes on the PCI bus assuming little-endian byte order.

RW (Read/Write). The RW field (bit 4 of CONFIG_CTL)

determines whether the configuration cycle will be a read

or a write. Table 11-16 shows the interpretation of RW.

A write by the DSPCPU to the CONFIG_CTL register

starts a configuration cycle on the PCI bus. The

CONFIG_DATA (for a write) and CONFIG_ADR regis-

ters must be set up before writing to CONFIG_CTL.
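Because the BE field is active low, it is easy to get the polarity wrong when building a control word. The sketch below packs the two fields described above; the function name and the byte_mask convention (bit n set means byte n participates) are assumptions for illustration. The text describes the same BE and RW layout for IO_CTL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Build a CONFIG_CTL (or IO_CTL) value.
 * byte_mask: hypothetical convenience parameter, bit n = 1 means byte n
 *            takes part in the transfer. The BE field is active low,
 *            so the mask is inverted before it is placed in bits 0..3.
 * is_read:   RW field (bit 4), 1 = read, 0 = write (Table 11-16). */
uint32_t cycle_ctl_pack(unsigned byte_mask, bool is_read)
{
    assert(byte_mask <= 0xFu);
    uint32_t be = (~(uint32_t)byte_mask) & 0xFu;
    return be | ((is_read ? 1u : 0u) << 4);
}
```

A full 32-bit read therefore uses the value 0x10, and a write of byte 0 only uses 0x0E.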

During a configuration read, the PCI interface drives the PCI bus with the address from CONFIG_ADR and the BE field from CONFIG_CTL. The returned data is buffered in CONFIG_DATA. When the data is returned, the PCI interface will generate a DSPCPU interrupt if the appropriate IntE bit is set in BIU_CTL. Alternatively, DSPCPU software can poll the appropriate ‘done’ status bit in BIU_STATUS. Finally, DSPCPU software reads the CONFIG_DATA register in MMIO space to access the data returned from the configuration cycle.

A write operation proceeds as for a read, except that PCI

data is driven from CONFIG_DATA during the transac-

tion and no data is returned in CONFIG_DATA.

11.6.11 IO_ADR Register

The 32-bit IO_ADR register is written by the DSPCPU to

set up for an access to a location in PCI I/O space. The

DSPCPU writes the address of the I/O register into

IO_ADR. See Section 11.6.13, “IO_CTL Register,” for

more information on initiating I/O cycles.

11.6.12 IO_DATA Register

The 32-bit IO_DATA register is used by the DSPCPU to

set up for an access to a location in PCI I/O space. The

DSPCPU writes or reads IO_DATA depending on wheth-

er it is performing a write or read from IO space. See

Section 11.6.13, “IO_CTL Register,” for more informa-

tion on initiating I/O cycles.

11.6.13 IO_CTL Register

The DSPCPU writes to IO_CTL to trigger a read or write access to PCI I/O space. The function of this register is similar to that of CONFIG_CTL, and the protocol for an I/O cycle is similar to the configuration cycle protocol. A PCI I/O read or write should not be performed during an ongoing PCI configuration read or write.

Table 11-15. BE field interpretation (assumes little-endian byte ordering)

BE Bit  Value  Interpretation
0       0      byte 0 (LSB) participates
0       1      byte 0 (LSB) does not participate
1       0      byte 1 participates
1       1      byte 1 does not participate
2       0      byte 2 participates
2       1      byte 2 does not participate
3       0      byte 3 (MSB) participates
3       1      byte 3 (MSB) does not participate

Table 11-16. RW interpretation

RW  Interpretation
0   Write
1   Read

The steps involved in a DSPCPU PCI I/O access are:

1. Wait until BIU_STATUS io_cycle.Busy and config_cycle.Busy are both de-asserted.

2. Write the I/O address to IO_ADR, and (in case of a write operation) write data to IO_DATA.

3. Write to IO_CTL to start the read or write. This action sets io_cycle.Busy.

4. Wait (polling or interrupt based) until io_cycle.Done is asserted by the hardware.

5. Retrieve the requested data in IO_DATA (in case of a read).

6. Clear io_cycle.Done by writing a ‘1’ to it.

Following are descriptions of the fields of IO_CTL and a discussion of how a DSPCPU write to IO_CTL triggers I/O cycles.

BE (Byte enables). The BE field (the four least-signifi-

cant bits of IO_CTL) determines the state of PCI’s 4-line

c/be# bus during the data phase of an I/O cycle. Since

the c/be# bus signals are active low, a ‘0’ in a BE field bit

means ‘byte participates;’ a ‘1’ in a BE field bit means

‘byte does not participate.’ Table 11-15 shows the corre-

spondence between BE bits and bytes on the PCI bus

assuming little-endian byte order.

RW (Read/Write). The RW field (bit 4 of IO_CTL) determines whether the I/O cycle will be a read or a write. Table 11-16 shows the interpretation of RW (0 = write, 1 = read).

A write by the DSPCPU to the IO_CTL register starts an I/O cycle on the PCI bus. The IO_DATA (for a write) and IO_ADR registers must be set up before writing to IO_CTL.

During an I/O read, the PCI interface drives the PCI bus

with the address from IO_ADR and the BE field from

IO_CTL. The returned data is buffered in IO_DATA.

When the data is returned, the PCI interface will gener-

ate a DSPCPU interrupt if the appropriate IntE bit is set

in BIU_CTL. Alternatively, DSPCPU software can poll

the appropriate ‘done’ status bit in BIU_STATUS. Finally,

DSPCPU software reads the IO_DATA register in MMIO space to access the data returned from the I/O cycle.

A write operation proceeds as for a read, except that PCI

data is driven from IO_DATA during the transaction and

no data is returned in IO_DATA.

11.6.14 SRC_ADR Register

The 32-bit SRC_ADR register is used to set the source address for a block transfer DMA operation. The address in SRC_ADR must be word (4-byte) aligned, i.e., the 2 LSBs must be ‘0’. The content of this register during or after DMA is not defined; hence it cannot be used to track progress or verify completion of a DMA transaction.

11.6.15 DEST_ADR Register

The 32-bit DEST_ADR register is used to set the destination address for a block transfer DMA operation. The address in DEST_ADR must be word (4-byte) aligned, i.e., the 2 LSBs must be ‘0’. The content of this register during or after DMA is not defined; hence it cannot be used to track progress or verify completion of a DMA transaction.

11.6.16 DMA_CTL Register

A write by the DSPCPU to the DMA_CTL register starts

a DMA block transfer on the PCI bus. The SRC_ADR

and DEST_ADR registers must be set up before writing

to DMA_CTL.

The steps involved in a DMA transfer are:

1. Wait until BIU_STATUS dma_cycle.Busy is de-as-

serted

2. Write to SRC_ADR and DEST_ADR as described

above

3. Write to DMA_CTL to start the DMA transaction. This action sets dma_cycle.Busy

4. Wait (polling or interrupt based) until dma_cycle.Done

is asserted by the hardware

5. Clear dma_cycle.Done by writing a ‘1’ to it

The fields of DMA_CTL are described below.

TL (Transfer length). The TL field (bits 0..25 of DMA_CTL) specifies the number of data bytes to be transferred during the DMA operation. It must be a multiple of 4 bytes. The maximum length of a DMA operation is limited to 64 MB, the maximum amount of SDRAM supported by PNX1300. The content of this field during or after a DMA transaction is not defined.

D (DMA direction). The D field (bit 26 of DMA_CTL) determines the direction of data movement during the block transfer. Table 11-17 shows the interpretation of the D field.

T (DMA Transaction type). The T field (bit 27 of

DMA_CTL) determines the transaction type of a write, as

described below.

Table 11-17. D interpretation

D  Data Movement Direction
0  SDRAM to PCI memory space (DMA write)
1  PCI memory space to SDRAM (DMA read)

Table 11-18. T interpretation

T  DMA Write Transaction Type
0  memory write
1  memory write-and-invalidate


PNX1300 generates memory write-and-invalidate PCI

transactions if all conditions below are satisfied, other-

wise it generates regular memory write transactions:

• The MWI bit in the Command Register is set.

• The Cache Line Size register is set to 4, 8, or 16 (in 32-bit words).

• The DMA source address is 64-byte aligned.

• The DMA destination address is aligned to the cache line size.

• The T bit is set.

PNX1300 generates ‘memory read multiple’ PCI transac-

tions for DMA reads, unless the RMD (Read Multiple Dis-

able) bit is set in BIU_CTL, in which case the less effi-

cient ‘memory read’ transactions are used.
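The DMA_CTL layout and the write-and-invalidate conditions can be sketched as two helpers. The names and parameters are assumptions for illustration. One further assumption is flagged in the code: a 26-bit TL field cannot hold the value 64 MB itself, so this sketch rejects any length that does not fit in bits 0..25.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DMA_CTL_TL_MASK 0x03FFFFFFu   /* TL: bits 0..25 */
#define DMA_CTL_D       (1u << 26)    /* 1 = PCI to SDRAM (DMA read) */
#define DMA_CTL_T       (1u << 27)    /* 1 = request write-and-invalidate */

/* Pack DMA_CTL into *out. Returns false if the length is not a multiple
 * of 4 or does not fit in the 26-bit TL field (ASSUMPTION: lengths that
 * overflow the field are treated as invalid). */
bool dma_ctl_pack(uint32_t length_bytes, bool pci_to_sdram, bool want_mwi,
                  uint32_t *out)
{
    assert(out != NULL);
    if ((length_bytes & 0x3u) != 0u || length_bytes > DMA_CTL_TL_MASK)
        return false;
    *out = length_bytes
         | (pci_to_sdram ? DMA_CTL_D : 0u)
         | (want_mwi ? DMA_CTL_T : 0u);
    return true;
}

/* Predicate for the five write-and-invalidate conditions listed above
 * (relevant to DMA writes only). mwi_enabled and cache_line_words stand
 * for the Command Register MWI bit and the Cache Line Size register. */
bool dma_uses_mwi(bool mwi_enabled, unsigned cache_line_words,
                  uint32_t src_adr, uint32_t dest_adr, bool t_bit)
{
    if (!mwi_enabled || !t_bit) return false;
    if (cache_line_words != 4u && cache_line_words != 8u &&
        cache_line_words != 16u) return false;
    if ((src_adr & 63u) != 0u) return false;       /* 64-byte aligned */
    uint32_t line_bytes = cache_line_words * 4u;
    return (dest_adr & (line_bytes - 1u)) == 0u;   /* cache-line aligned */
}
```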

During a PCI-to-SDRAM block transfer, the PCI interface drives the PCI bus with the address from SRC_ADR. The returned data is buffered in r_buffer. The PCI interface then drives the address from DEST_ADR and the data from r_buffer to the SDRAM controller. SRC_ADR and DEST_ADR are incremented, the TL field in DMA_CTL is decremented, and this sequence repeats until TL reaches ‘0’.

At the end of the PCI-to-SDRAM block transfer, the PCI interface will generate a DSPCPU interrupt if the appropriate IntE bit is set in BIU_CTL. Alternatively, DSPCPU software can poll the appropriate ‘done’ status bit in BIU_STATUS.

During an SDRAM-to-PCI block transfer, the PCI interface drives the address from SRC_ADR to the SDRAM controller. The returned data is buffered in w_buffer. The PCI interface then drives the address from DEST_ADR and the data from w_buffer to the PCI bus. SRC_ADR and DEST_ADR are incremented, the TL field in DMA_CTL is decremented, and this sequence repeats until TL reaches ‘0’.

At the end of the SDRAM-to-PCI block transfer, the PCI interface can generate a DSPCPU interrupt if the appropriate IntE bit is set in BIU_CTL. Alternatively, DSPCPU software can poll the appropriate ‘done’ status bit in BIU_STATUS.
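The increment-and-decrement sequence described above can be mimicked by a short behavioural model. This is a sketch under stated assumptions: plain byte arrays stand in for SDRAM and PCI memory space, the model moves one 32-bit word per iteration (the text does not state the granularity of the internal buffers), and all names are hypothetical. Software on the real device cannot observe these intermediate values, since the register contents during and after DMA are not defined.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Model of the internal transfer state. */
typedef struct {
    uint32_t src_adr, dest_adr, tl;   /* tl = remaining bytes */
} dma_state;

/* One word per iteration: both addresses step by 4 and the remaining
 * length drops by 4 until it reaches 0. */
void dma_model_run(dma_state *s, const uint8_t *src_mem, uint8_t *dest_mem)
{
    assert((s->src_adr & 3u) == 0u && (s->dest_adr & 3u) == 0u);
    assert((s->tl & 3u) == 0u);
    while (s->tl != 0u) {
        memcpy(dest_mem + s->dest_adr, src_mem + s->src_adr, 4u);
        s->src_adr  += 4u;
        s->dest_adr += 4u;
        s->tl       -= 4u;
    }
}

/* Demonstration: copy 8 bytes from offset 4 of src to offset 0 of dest.
 * Returns 1 if exactly those 8 bytes were copied. */
int dma_model_demo(void)
{
    const uint8_t src[16] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
    uint8_t dest[16] = {0};
    dma_state s = { 4u, 0u, 8u };
    dma_model_run(&s, src, dest);
    return s.tl == 0u && memcmp(dest, src + 4, 8u) == 0 && dest[8] == 0;
}
```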

11.6.17 INT_CTL Register

The INT_CTL register contains three fields for setting,

enabling, and sensing the four PCI interrupt lines.

Table 11-19 shows the interpretation of the fields in

INT_CTL.

INT (Interrupt bits). The INT field (bits 0..3 of INT_CTL) can force a PCI interrupt to be signalled.

IE (Interrupt enable). The IE field (bits 4..7 of INT_CTL) enables PNX1300 to drive PCI interrupt lines.

IS (Interrupt state). The IS field (bits 8..11 of INT_CTL) senses the state of the PCI interrupt lines.

Figure 11-9 shows a conceptual realization of the logic

used to implement the control of each intx# pin.

See also Section 3.6, “PNX1300 to Host Interrupts.”
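A minimal sketch of how the three INT_CTL fields might be manipulated follows. The macro and function names are assumptions for illustration; the bit positions are those given in the field descriptions above (line 0 is inta#, line 3 is intd#).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* line: 0 = inta#, 1 = intb#, 2 = intc#, 3 = intd#. */
#define INT_CTL_INT(line) (1u << (line))         /* bits 0..3  */
#define INT_CTL_IE(line)  (1u << (4u + (line)))  /* bits 4..7  */
#define INT_CTL_IS(line)  (1u << (8u + (line)))  /* bits 8..11 */

/* New INT_CTL value that enables the open-collector driver for one line
 * and asserts it. Other bits of the old value are preserved. */
uint32_t int_ctl_assert_line(uint32_t int_ctl, unsigned line)
{
    assert(line < 4u);
    return int_ctl | INT_CTL_IE(line) | INT_CTL_INT(line);
}

/* True if the IS field reports the line as asserted (pin low). */
bool int_ctl_line_is_asserted(uint32_t int_ctl, unsigned line)
{
    assert(line < 4u);
    return (int_ctl & INT_CTL_IS(line)) != 0u;
}
```

Since the pins are open-collector, the IS bit can read as asserted even when PNX1300 itself is not driving the line.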

11.7 PCI BUS PROTOCOL OVERVIEW

PNX1300’s PCI interface can generate and respond to several types of PCI bus commands. Table 11-20 lists the 12 possible commands and whether or not PNX1300 can generate them.

Table 11-21 lists the 12 possible commands and whether or not PNX1300 can respond to them.

The basic transfer mechanism on the PCI bus is a burst, which consists of an address phase followed by one or more data phases. In PNX1300, the DSPCPU and ICP are the only two units that can cause PNX1300 to become a PCI-bus initiator, i.e., only the DSPCPU and ICP can access external resources.

Table 11-19. INT_CTL bits

Field  Bits (inta#, intb#, intc#, intd#)  Programming
INT    0, 1, 2, 3                         0: Deassert intx#
                                          1: Assert intx# (if enabled), i.e., pull the intx# pin to a low logic level
IE     4, 5, 6, 7                         0: Disable open-collector output to intx#
                                          1: Enable open-collector output to intx#
IS     8, 9, 10, 11                       Reads state of intx# pin:
                                          0: No interrupt asserted (intx# is high)
                                          1: Interrupt is asserted (intx# is low)

Table 11-20. PNX1300 PCI commands as initiator

PNX1300 Generates             PNX1300 Cannot Generate
Configuration read            Interrupt acknowledge
Configuration write           Special cycle
Memory read                   Dual address
Memory read multiple          Memory read line
Memory write
Memory write and invalidate
I/O read
I/O write

[Figure 11-9 shows the INTx and IEx bits controlling an open-collector (oc) driver on the PCI intx# pin, with ISx reading back the state of the pin.]

Figure 11-9. Conceptual realization of intx# pin control logic.

11.7.1 Single-Data-Phase Operations

When the DSPCPU reads or writes PCI memory, the PCI transaction has only a single data phase. A typical single-data-phase read operation is illustrated in Figure 11-10. During the first clock period, the PNX1300 asserts the frame# signal to indicate that the transaction has begun and that an address and command are stable on ad and c/be#, respectively.

PNX1300 then releases the ad bus, deasserts frame#, asserts irdy#, asserts byte enables on c/be#, and waits for the target to claim the transaction by asserting devsel#. The target asserts trdy# to signal the master that the ad bus contains stable data. The assertion of trdy# causes the initiator (PNX1300 in this case) to sample the ad bus data and deassert irdy# to complete the single-data-phase read transaction.

Figure 11-11 shows a typical single-data-phase write operation. The operation begins like a read: PNX1300 asserts the frame# signal, drives the ad bus with the target address, and drives the command onto the c/be# bus. The operation continues when PNX1300 deasserts frame#, asserts irdy#, and drives the byte enables as before, but it also drives the data to be written on the ad bus. The target device asserts devsel# to claim the transaction. Eventually, the target asserts trdy# to signal that it is sampling the data on the ad bus. PNX1300 continues to drive the data on the ad bus until after the target deasserts trdy#, which completes the write operation.

11.7.2 Multi-Data-Phase Operations

As with the single-data-phase operations, DMA opera-

tions begin with the assertion of frame# and valid ad-

dress and command informa tion. See Figure 11-12. The

target knows a burst is requested because frame# re-

mains asserted when irdy# becomes asserted.

In the example timin g of Figure 11-12, a fast device is re-

ceiving the burst from PNX1300. The target asserts

devsel# and trdy# simultaneously. The trdy# signal re-

mains asserted while PNX1300 sends a new word of

data on each PCI clock cycle. The burst operation shown

is a 16-word burst transfer. Since only the starting ad-

dress is sent by the initiator, both initiator and target must

increment source and destination addresses during the

burst.

The initiator signals the end of the burst of data in

Figure 11-12 when it deasserts frame# in clock 17. The

last word (or partial word) of data is transferred in the

clock cycle after frame# is deasserted. Fin ally, the target

acknowledges the last data phase by deasserting trdy#

and devsel#.

Figure 11-13 illustrates back-to-back DMA burst data

transfers. The ICP is capable of exploiting the high bandwidth available with back-to-back DMA operations when

it is writing image data to a frame buffer on a PCI video

card.

The timing of Figure 11-13 assumes that the PCI bus is

granted to PNX1300 until at least the beginning of the

second DMA burst operation. For as long as bus owner-

ship is granted to PNX1300 and the ICP has queued re-

quests for data transfer, the PCI interface will perform

back-to-back DMA operations. If the target eventually

becomes unable to accept more data, it signals a disconnect on the PNX1300 PCI interface. The PCI interface remembers where the DMA burst was interrupted and attempts to restart from that point after two bus clocks.

Table 11-21. PNX1300 PCI commands as target

PNX1300 Responds To            PNX1300 Ignores
Configuration read             I/O read
Configuration write            I/O write
Memory read                    Interrupt acknowledge
Memory write                   Special cycle
Memory write and invalidate    Dual address
Memory read line
Memory read multiple

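Figure 11-10. Basic single-data-phase read operation.

[Timing diagram: pci_clk, frame#, c/be#, irdy#, trdy#, and devsel# over clocks 1 to 4. The ad bus carries Address, then Data; c/be# carries Command, then Byte Enables. Phases are labeled Wait (AD turnaround) and Data Transfer.]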

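Figure 11-11. Basic single-data-phase write operation.

[Timing diagram: pci_clk, frame#, c/be#, irdy#, trdy#, and devsel# over clocks 1, 2, 3, ... n. The ad bus carries Address, then Data; c/be# carries Command, then Byte Enables. Phases are labeled Wait and Data Transfer.]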


11.8 LIMITATIONS

11.8.1 Bus Locking

The PCI interface does not implement the lock#, sbo#, and sdone pins. Consequently, it is possible for both the

DSPCPU and external PCI initiators to write to a critical

memory section simultaneously. Software must imple-

ment policies to guarantee memory coherency.

11.8.2 No Expansion ROM

PNX1300 does not implement the PCI expansion ROM

capability.

11.8.3 No Cacheline Wrap Address Sequence

The PCI interface does not implement the PCI cacheline-

wrap address mode for external PCI initiators that ac-

cess PNX1300 SDRAM.

11.8.4 No Burst for I/O or Configuration Space

Only single-data-phase transactions to configuration and

I/O spaces are supported. The byte-enable signals se-

lect the byte(s) within the addressed word.

11.8.5 Word-Only MMIO Register Access

External initiators can access PNX1300 MMIO registers

only as full words. The byte-enable signals have no effect on the data transferred. External initiators must read

and write all four bytes of MMIO registers.

Figure 11-12. PCI burst write operation with 16 data phases.

[Timing diagram: pci_clk, frame#, c/be#, irdy#, trdy#, and devsel# over clocks 1 to 6 and 17. The ad bus carries Address, then Data 1, Data 2, Data 3, Data 4, ... Data 15, Data 16; c/be# carries Command, then Byte Enables. Each data phase is labeled Data Transfer.]

Figure 11-13. Back-to-back PCI burst write operations with 16 data phases, which might be generated by the ICP when writing image data to a PCI-resident video frame buffer.

[Timing diagram: pci_clk, frame#, c/be#, irdy#, trdy#, and devsel# over clocks 1, 2, 3, ... 18, 19, 20. The ad bus carries Address, then Data 1 ... Data 15, Data 16, Data 17 ... Data 31, Data 32; c/be# carries Command, then Byte Enables. Data phases are labeled Data Transfer.]


SDRAM Memory System Chapter 12

by Eino Jacobs, Chris Nelson, Thorwald Rabeler, Mohammed Yousuf, Luis Lucas

12.1 NEW IN PNX1300/01/02/11

• Support of 256-Mbit SDRAMs organized in x16. The

REFRESH counter must be changed. Refer to

Section 12.11 for more details.

• 16-bit memory interface support in addition to the 32-

bit mode of TM-1300.

12.2 PNX1300 MAIN MEMORY OVERVIEW

In this document, the generic PNX1300 name refers

to the PNX1300 Series, or the PNX1300/01/02/11

products.

PNX1300 connects to its local memory system with a

dedicated memory bus, shown in Figure 12-1. This bus

interfaces only with SDRAM or SGRAM (synchronous

graphics DRAM with its DSF pin tied low); PNX1300 is

the only master on this bus.

A variety of device types, speeds, and rank¹ sizes are supported, allowing a wide range of PNX1300 systems to be built. Table 12-1 summarizes the memory system features. The memory devices can have two or four banks.

The main memory interface provides all control and data

signals with sufficient drive capacity for a glueless con-

nection up to a 183-MHz memory system (for PNX1302,

166 MHz otherwise) with up to two memory devices. The

memory-system speed can be different from PNX1300

core speed; the ratio between the memory system clock

and PNX1300 core clock is programmable.

With current memory technology, PNX1300 supports a

glueless memory interface of up to 64 MB with two 4×4M×16 SDRAM chips (two devices, each with four banks of four million words, each word 16 bits wide).

PNX1300 also provides a 16-bit memory interface (the TM-1300 supported 32-bit only) for applications requiring lower cost and lower performance. The available bandwidth is then halved, and the latency on cache misses is increased by two for the instruction cache and by one SDRAM cycle for the data cache on critical-word-first demand.

The maximum amount of memory in the 16-bit mode is

32MBytes.
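The bandwidth figures quoted in this chapter follow from one transfer per memory clock across the full data bus. The sketch below is only an illustration of that arithmetic; the function name `peak_bandwidth_mb_s` is hypothetical and MB is taken as 10^6 bytes, which is what the 732 MB/s figure implies.

```python
def peak_bandwidth_mb_s(clock_mhz, bus_width_bits):
    """Peak transfer rate in MB/s (10^6 bytes), one bus word per clock.

    Illustrative helper, not part of any PNX1300 software interface.
    """
    if bus_width_bits not in (16, 32):
        raise ValueError("the memory interface is 16 or 32 bits wide")
    return clock_mhz * bus_width_bits // 8


# 32-bit bus at 183 MHz and 166 MHz, then the 16-bit mode at 183 MHz.
print(peak_bandwidth_mb_s(183, 32))
print(peak_bandwidth_mb_s(166, 32))
print(peak_bandwidth_mb_s(183, 16))
```

Halving the bus width halves the result, which matches the statement that the 16-bit mode reduces the available bandwidth by two.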

12.3 MAIN-MEMORY ADDRESS

APERTURE

PNX1300’s local main memory is just one of three apertures into the 4-GB address space of the DSPCPU:

• SDRAM (0.5 to 64 MB in size),

• MMIO (2 MB in size), and

• PCI (any address not in SDRAM or MMIO).

MMIO registers control the positions of the address-space apertures. The SDRAM aperture begins at the absolute address specified in the MMIO register

DRAM_BASE and extends upward to the address spec-

ified in the DRAM_LIMIT register. If the SDRAM aperture

overlaps the memory hole, the memory hole is ignored.

The MMIO aperture begins at the address in

MMIO_BASE, which defaults to 0xEFE00000 after pow-

er-up, and extends upwards 2 MB. (See Chapter 3,

“DSPCPU Architecture,” for a detailed discussion.) All

addresses that fall outside these two apertures are as-

sumed to be part of the PCI address aperture.
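The aperture rules above can be sketched as a small address classifier. This is a minimal illustration under stated assumptions: the function name `classify_address` is hypothetical, DRAM_LIMIT is treated as the first address beyond the SDRAM aperture (the text does not say whether the limit is inclusive), and the DRAM_BASE and DRAM_LIMIT values in the usage lines are made up for the example. Only the MMIO_BASE power-up default and the 2-MB MMIO size come from the text.

```python
MMIO_SIZE = 2 * 1024 * 1024  # the MMIO aperture is 2 MB


def classify_address(addr, dram_base, dram_limit, mmio_base):
    """Return which aperture a DSPCPU address falls into.

    Assumption: dram_limit is exclusive (first address past SDRAM).
    """
    if dram_base <= addr < dram_limit:
        return "SDRAM"
    if mmio_base <= addr < mmio_base + MMIO_SIZE:
        return "MMIO"
    # Anything outside the two apertures is treated as PCI.
    return "PCI"


# Example values: 8 MB of SDRAM at address 0 (assumed), default MMIO_BASE.
DRAM_BASE, DRAM_LIMIT, MMIO_BASE = 0x00000000, 0x00800000, 0xEFE00000
print(classify_address(0x00001000, DRAM_BASE, DRAM_LIMIT, MMIO_BASE))
print(classify_address(0xEFE00100, DRAM_BASE, DRAM_LIMIT, MMIO_BASE))
print(classify_address(0xF0000000, DRAM_BASE, DRAM_LIMIT, MMIO_BASE))
```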

1. In this document, the term ‘rank’ is used to refer to a

group of memory devices that are accessed together.

Historically, the term ‘bank’ has been used in this con-

text; to avoid confusion, this document uses bank to re-

fer to on-chip organization (SDRAM devices have two

or four internal banks) and rank to refer to off-chip, system-level organization.

Table 12-1. Memory System Features

Characteristic       Comments
Data width           16 and 32 bits
Number of ranks      Four chip-select signals support up to four ranks (can be used as addresses)
Memory size          From 512 KB to 64 MB
Devices supported    • Jedec SGRAM (DSF tied low)
                     • Jedec SDRAM (×4, ×8, ×16, ×32)
                     • PC100/133 and later
Clock rate           Up to 183 MHz SDRAM speed (programmable ratio between core clock and memory system clock)
Bandwidth            732 MB/s (at 183 MHz and 32-bit i/f)
Glueless interface   • Up to 2 chips at 183 MHz (e.g., 32 MB memory with 4×1M×32 SDRAM)
                     • Up to 4 chips at 166 MHz (e.g., 64 MB memory with 4×1M×32 SDRAM)
Signal levels        3.3-V LVTTL


12.4 MEMORY DEVICES SUPPORTED

All devices must have a LVTTL, 3.3-V interface.

Table 12-2 lists the devices and organizations supported

in a 32-bit memory interface.

Refer to Section 12.8, “Address Mapping,” in order to

evaluate the support of 2-bank, 64-Mbit devices. These

devices are not widely used. Hence they are not de-

scribed in this document.

Table 12-3 lists the devices and organizations supported

in a 16-bit memory interface.

12.4.1 SDRAM

PNX1300 supports synchronous DRAM chips directly.

SDRAM has a fast, synchronous interface that permits

burst transfers at 1 word per clock cycle. The memory inside an SDRAM device is divided into two or four banks; the SDRAM implements interleaved bank access to sustain maximum bandwidth.

SDRAM devices implement a power down mechanism

with self-refresh. PNX1300 power management takes

advantage of this mechanism.

PNX1300 supports only Jedec-compatible SDRAM with

two or four internal banks of memory per device.

12.4.2 SGRAM

Also supported in PNX1300 systems, SGRAM is essen-

tially an SDRAM with additional features for raster graph-

ics functions. The device type is standardized by Jedec

and offered by multiple DRAM vendors. Tying the DSF input of an SGRAM low makes the device operate like a standard 32-bit-wide SDRAM and thus compatible with the PNX1300 memory interface. PNX1300 does not support the newer types of SGRAM that have a DDR interface.

12.5 MEMORY GRANULARITY AND SIZES

PNX1300 supports a variety of memory sizes thanks to:

• Many possible configurations of SDRAM devices

• Support for up to four memory ranks

The minimum memory size is 4 MB using two 2×512K×16 SDRAM devices on the 32-bit data bus, or 2 MB with one of these devices on a 16-bit data bus. Up to two memory devices can be connected without any glue logic and without sacrificing performance. The maximum memory size with full performance is 64 MB using two 4×4M×16 SDRAM chips on a 32-bit data bus, and 32 MB using one 4×4M×16 SDRAM chip on a 16-bit data bus.
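The sizes quoted above can be checked with a little arithmetic: a device written as banks×words×width holds banks · words · width bits, and a rank needs enough devices side by side to fill the data bus. The sketch below illustrates this; the function name `rank_geometry` is hypothetical and exists only for the example.

```python
K = 1024
M = 1024 * 1024


def rank_geometry(banks, words_per_bank, width_bits, bus_width_bits):
    """Return (devices_per_rank, rank_size_in_MB) for one rank.

    Illustrative helper: devices are placed side by side until their
    combined data width equals the memory bus width.
    """
    if bus_width_bits % width_bits != 0:
        raise ValueError("device width must divide the bus width")
    devices = bus_width_bits // width_bits
    device_bits = banks * words_per_bank * width_bits
    rank_mb = devices * device_bits // (8 * M)
    return devices, rank_mb


# 2x512Kx16 devices on a 32-bit bus: the 4-MB minimum configuration.
print(rank_geometry(2, 512 * K, 16, 32))
# 4x4Mx16 devices on a 32-bit bus and on a 16-bit bus.
print(rank_geometry(4, 4 * M, 16, 32))
print(rank_geometry(4, 4 * M, 16, 16))
```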

Table 12-2. Supported Rank Configurations (32-bit)

Device Size (Mbit)   Device(s)           Rank Size
16                   2×512K×16 SDRAM     4 MB
16                   2×1M×8 SDRAM        8 MB
16                   2×2M×4 SDRAM        16 MB
64                   4×512K×32 SDRAM     8 MB
64                   4×1M×16 SDRAM       16 MB
64                   4×2M×8 SDRAM        32 MB
128                  4×1M×32 SDRAM       16 MB
128 (1)              4×2M×16 SDRAM       32 MB (2)
256 (3)              4×4M×16 SDRAM       64 MB (4)

1. Limited support for a 32-MB configuration only.
2. However MM_CONFIG.SIZE may be set to 16 MB (i.e. 6). Refer to Figure 12-10 and Figure 12-11 for the two possible connection details.
3. Limited support for a 64-MB configuration only.
4. However MM_CONFIG.SIZE is 32 MB (i.e. 7).

Table 12-3. Supported Rank Configurations (16-bit)

Device Size (Mbit)   Device(s)           Rank Size
16                   2×512K×16 SDRAM     2 MB
64                   4×1M×16 SDRAM       8 MB
128                  4×2M×16 SDRAM       16 MB (1)
256                  4×4M×16 SDRAM       32 MB (2)

1. However MM_CONFIG.SIZE is set to 8 MB (i.e. 5).
2. However MM_CONFIG.SIZE is set to 8 MB (i.e. 5).

Figure 12-1. PNX1300 internal highway bus to the external glueless SDRAM interface.

[Block diagram: the DSPCPU and the PNX1300 on-chip peripherals connect through the data highway to the PNX1300 memory interface. The memory interface drives the SDRAM memory array: Chip Selects# to CS#; Address, Clock Enables, RAS#, CAS#, WE# to Address, Control; Byte Enables[3:0] to DQM[3:0]; Clock to CLK through a 33-Ω series resistor; Data[31:0] to DQ[31:0].]

Several memory configurations can be constructed using more devices. To do so, the frequency of the memory interface must be lowered to account for extra propagation delay due to the excessive loading on the interface signals (see Section 12.13, “Output Driver Capacity”).

The following rules apply to memory rank design:

• All devices in a rank must be of the same type.

• All ranks must be a power of two in size.

• All ranks must be of equal size.

Table 12-4 lists some examples of 32-bit memory sys-

tem designs.

Refer to the TM-1100 Databook for smaller memory configurations.

Note:

• Some of these configurations may not be economi-

cally attractive due to the price premium.

• ‘Max. MHz’ refers to the memory interface/SDRAM

speed, not the PNX1300 core operating frequency.

The maximum MHz also depends on the device

being used, i.e. PNX1300, PNX1311 or PNX1302.

Refer to Section 1.9.7.10 on page 1-19 for maximum

operating speeds.


12.6 MEMORY SYSTEM PROGRAMMING

Memory system parameters are determined by the con-

tents of two configuration registers, MM_CONFIG and

PLL_RATIOS. Table 12-6 describes the function of

these registers, and Figure 12-2 shows their formats.

To ensure compatibility with future devices, any undefined MMIO bits should be ignored when read.

MM_CONFIG and PLL_RATIOS are loaded from the

boot EEPROM, as described in Section 13.4, “Detailed

EEPROM Contents.” During this boot process, the memory interface is held in reset state. After the memory interface is released from reset, the contents of these registers cannot be altered.

These registers are visible in MMIO space. They can be

read, but writes have no effect.

12.6.1 MM_CONFIG Register

The MM_CONFIG register tells the memory interface

how to use the local DRAM memory. The fields in this register specify the rank size, bus width, and refresh rate of the memory. Table 12-7 summarizes the field functions.

REFRESH (Refresh interval). The 16-bit REFRESH

field specifies the number of memory-system clock cy-

cles between refresh operations. The default value of

this field is 1000 (0x03E8). See Section 12.11, “Refresh,”

for more information.

BW (Bus Width). If set to ‘0’, the memory interface data bus width is 32 bits. If set to ‘1’, the memory interface data bus width is 16 bits.

SIZE (Rank size). The 3-bit SIZE field specifies the size

of each rank of DRAM. Each rank must be the size specified by SIZE. The default is a rank size of 4 MB. Refer to

Table 12-7 for the interpretation of this field.
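The SIZE encoding listed in Table 12-7 doubles with each step, starting from 512 KB at code 1. A minimal sketch of the decoding follows; the helper name `rank_size_bytes` is hypothetical, and the code takes the already-extracted 3-bit field value rather than the raw register, since the bit position of SIZE within MM_CONFIG is not assumed here.

```python
def rank_size_bytes(size_field):
    """Decode the 3-bit MM_CONFIG.SIZE field into a rank size in bytes.

    Codes 1..7 map to 512 KB .. 32 MB; code 0 is reserved.
    """
    if not 1 <= size_field <= 7:
        raise ValueError("SIZE must be 1..7 (0 is reserved)")
    return (256 * 1024) << size_field


for code in range(1, 8):
    print(code, rank_size_bytes(code) // 1024, "KB")
```

With this encoding the default rank size of 4 MB corresponds to code 4.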

Table 12-4. Examples of 32-bit Memory Configurations

Size (MB)   Ranks   Rank Configurations (per rank)   Max. MHz   Peak MB/s
8           1       four 2×1M×8 SDRAM                166        664
8           2       two 2×512K×16 SDRAM              166        664
8           1       one 4×512K×32 SDRAM              183        732
16          1       two 4×1M×16 SDRAM                183        732
16          1       one 4×1M×32 SDRAM                183        732
16          2       one 4×512K×32 SDRAM              183        732
24          3       one 4×512K×32 SDRAM              166        664
32          1 (1)   two 4×2M×16 SDRAM                183        732
32          1 (1)   four 4×2M×8 SDRAM                166        664
32          2       two 4×1M×16 SDRAM                166        664
32          2       one 4×1M×32 SDRAM                183        732
32          4       one 4×512K×32 SDRAM              166        664
48          3       one 4×1M×32 SDRAM                166        664
64          1 (2)   two 4×4M×16 SDRAM                183        732
64          4       one 4×1M×32 SDRAM                166        664

1. However MM_CONFIG.SIZE may be 16 MB (i.e. 6). Refer to Figure 12-10 and Figure 12-11 for the two possible connection details.
2. However MM_CONFIG.SIZE is 32 MB (i.e. 7).

Table 12-5. Supported 16-bit Memory Configurations

Size (MB)   Ranks   Rank Configurations    Max. MHz   Peak MB/s
8           1       one 4×1M×16 SDRAM      183        366
16 (1)      1       one 4×2M×16 SDRAM      183        366
32 (2)      1       one 4×4M×16 SDRAM      183        366

1. However MM_CONFIG.SIZE is set to 8 MB (i.e. 5).
2. However MM_CONFIG.SIZE is set to 8 MB (i.e. 5).


Table 12-6. Memory Configuration Registers

MM_CONFIG     Describes external memory configuration
PLL_RATIOS    Controls separate memory and CPU PLLs (phase-locked loops)

Table 12-7. MM_CONFIG Fields

Field      Function
REFRESH    Refresh interval in memory clock cycles. Default value 1000 (0x03E8).
SIZE       Memory rank size:
           0   Reserved
           1   512 KB
           2   1 MB
           3   2 MB
           4   4 MB
           5   8 MB
           6   16 MB
           7   32 MB

Figure 12-2. Memory interface configuration registers.

[Register layout diagram: MM_CONFIG (r/o, MMIO_base offset 0x10 0100) with the REFRESH field, the SIZE field, and the 16-bit memory interface bit; PLL_RATIOS (r/o, MMIO_base offset 0x10 0300) with the fields SB (SDRAM PLL Bypass), SD (SDRAM PLL Disable), CB (CPU PLL Bypass), CD (CPU PLL Disable), SR (SDRAM Ratio), and CR (CPU Ratio).]

12.6.2 PLL_RATIOS Register

The PLL_RATIOS register controls the operation of the separate memory-interface and CPU PLLs. Fields in this register select the input:output ratio each PLL should generate. Table 12-8 summarizes the field functions. Figure 12-3 shows how the PLLs are connected and how fields in the PLL_RATIOS register control them. For normal operation both PLLs must be activated, i.e. {CD,CB,SD,SB} must be equal to 0000 (binary value).

Table 12-8. PLL_RATIOS Fields

Field   Function                 Value   Meaning
CR      CPU:memory ratio         0       1:1
                                 1       2:1
                                 2       3:2
                                 3       4:3
                                 4       5:4
                                 5–7     Reserved
SR      Memory:external ratio    0       2:1
                                 1       3:1
CD      CPU PLL disable          0       CPU PLL on
                                 1       CPU PLL off
CB      CPU PLL bypass           0       CPU ← PLL
                                 1       CPU ← Memory
SD      SDRAM PLL disable        0       SDRAM PLL on
                                 1       SDRAM PLL off
SB      SDRAM PLL bypass         0       Memory ← PLL
                                 1       Memory ← external

Figure 12-3. PNX1300 memory and core PLL connections.

[Block diagram: the external clock input TRI_CLKIN feeds the memory system PLL, which supplies the memory system clocks (MM_CLK0, MM_CLK1) and the input of the DSPCPU PLL; the DSPCPU PLL supplies the PNX1300 core clock. The SD, SB, CD, CB, and SR fields of the PLL_RATIOS register control the PLLs. Further labels: “To DDSes and EVO PLL”, “×3, ×9”, and “PNX1300 Peripheral Clocks”.]

The operating limits of the internal PLLs are:

• 27 MHz < Output of the SDRAM PLL < 200 MHz

• 33 MHz < Output of the CPU PLL < 266 MHz

These are not the speed grades of the chips, just the PLL

limits.

CR (CPU-to-memory PLL ratio). The 3-bit CR field se-

lects one of five input-to-output clock ratios for the CPU

PLL. The input clock is the memory system clock; the

output clock determines the PNX1300 core operating fre-

quency. The default value is ‘0’, which implies a 1:1

CPU:memory ratio. See Table 12-8 for other encoding.

SR (Memory-to-external PLL ratio). The 1-bit SR field

selects one of two memory-to-external clock ratios for

the memory interface PLL. The PLL input is PNX1300’s

external input clock TRI_CLKIN; the PLL output deter-

mines the operating frequency of the memory interface

and SDRAM devices. The default value is ‘0’, which im-

plies a 2:1 memory:external ratio. A value of ‘1’ implies a

3:1 ratio.

CD (CPU PLL disable). The 1-bit CD field determines

whether or not the CPU PLL is turned on. The reset value

is ‘1’, which disables the CPU PLL; in this state it dissipates almost no power. For normal operation the value should be ‘0’, enabling the CPU PLL.

CB (CPU PLL bypass). The 1-bit CB field determines

whether the input or the output of the CPU PLL drives

PNX1300’s core logic. The default value is ‘1’, which

causes the PNX1300 core to be clocked by the input of

the CPU PLL (i.e., the memory interface clock). A value

of ‘0’ causes normal operation, and the core is clocked by

the output of the CPU PLL.

Note that if both CB and SB are set to ‘1’ (bypass the

CPU PLL and the SDRAM PLL), PNX1300’s core logic is

effectively clocked at the external input frequency.

Note: it is illegal to use the output of a disabled PLL. For

example, it is illegal to have CD set to ‘1’ while CB is set

to ‘0’.

SD (SDRAM PLL disable). The 1-bit SD field deter-

mines whether or not the SDRAM PLL is turned on. The

default value is ‘1’, which disables the SDRAM PLL. In

this state, it dissipates almost no power. For normal op-

eration the value should be ‘0’, enabling the SDRAM

PLL.

SB (SDRAM PLL bypass). The 1-bit SB field deter-

mines whether the input or the output of the SDRAM PLL

drives the memory interface and memory devices. The

default value is ‘1’, which causes the memory system to

be clocked by the input of the SDRAM PLL (PNX1300’s

external input clock). A value of ‘0’ causes normal operation, and the memory system is clocked by the output of

the SDRAM PLL.
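The field descriptions above can be combined into a small model of the clock tree. The following is a sketch under stated assumptions, not a driver: the function name `pll_clocks` and its argument order are hypothetical, the 50-MHz TRI_CLKIN in the usage lines is only an example value, and the PLL output limits (27 to 200 MHz for the SDRAM PLL, 33 to 266 MHz for the CPU PLL) are checked only when the corresponding PLL output is actually used.

```python
from fractions import Fraction

# CPU:memory ratios selected by CR, memory:external ratios selected by SR.
CR_RATIOS = {0: Fraction(1, 1), 1: Fraction(2, 1), 2: Fraction(3, 2),
             3: Fraction(4, 3), 4: Fraction(5, 4)}
SR_RATIOS = {0: 2, 1: 3}


def pll_clocks(tri_clkin_mhz, cr, sr, cd, cb, sd, sb):
    """Return (memory_clock_mhz, core_clock_mhz) for a PLL_RATIOS setting."""
    if cr not in CR_RATIOS:
        raise ValueError("CR values 5..7 are reserved")
    if sr not in SR_RATIOS:
        raise ValueError("SR is a 1-bit field")
    # It is illegal to use the output of a disabled PLL.
    if sd == 1 and sb == 0:
        raise ValueError("SDRAM PLL disabled but not bypassed")
    if cd == 1 and cb == 0:
        raise ValueError("CPU PLL disabled but not bypassed")

    ext = Fraction(tri_clkin_mhz)
    mem = ext if sb else ext * SR_RATIOS[sr]
    if not sb and not 27 < mem < 200:
        raise ValueError("SDRAM PLL output outside 27..200 MHz")
    core = mem if cb else mem * CR_RATIOS[cr]
    if not cb and not 33 < core < 266:
        raise ValueError("CPU PLL output outside 33..266 MHz")
    return mem, core


# Normal operation: both PLLs on, 3:1 memory ratio, 3:2 CPU ratio.
print(pll_clocks(50, cr=2, sr=1, cd=0, cb=0, sd=0, sb=0))
# Reset-like state: both PLLs disabled and bypassed.
print(pll_clocks(50, cr=0, sr=0, cd=1, cb=1, sd=1, sb=1))
```

`Fraction` is used so that ratios such as 4:3 stay exact; whether a resulting frequency is within the speed grade of a particular chip is a separate question from the PLL limits checked here.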

12.7 MEMORY INTERFACE PIN LIST

The memory interface consists of 61 signal pins includ-

ing clocks (but excluding power and ground pins).

Table 12-9 lists the interface signal pins.

12.8 ADDRESS MAPPING

The address mapping is determined by the state of the

rank-size bits and the bus width bit in the MM_CONFIG register.

12.8.1 Address Mapping in 32-bit mode

Table 12-10 shows how internal address bits from the

PNX1300 data highway bus are mapped to main-memory address-bus and chip select pins (MM_A[13:0],

MM_CS#[3:0]) in 32-bit data bus mode.

The column “Rank Addr./H.Way Bits” specifies which in-

ternal data-highway address bits select the preliminary

SDRAM rank. The actual rank used is subject to the lim-

itation implied by the relationship between SDRAM aper-

ture size (described in Section 13.2.1) and the rank size.

Table 12-9. Memory Interface Signal Pins

Name            Function                                             I/O   Active
MM_CLK[1:0]     Memory bus clock                                     O     High
MM_CS#[3:0]     Chip selects for the four memory ranks or Address    O     Low
MM_RAS#         Row-address strobe                                   O     Low
MM_CAS#         Column-address strobe                                O     Low
MM_WE#          Write enable                                         O     Low
MM_A[13:0]      Address                                              O     High
MM_CKE[1:0]     Clock enable                                         O     High
MM_DQM[3:0]     Byte enables for dq bus                              O     High
MM_DQ[31:0]     Bi-directional data bus                              I/O   High

Table 12-10. 32-bit Address Mapping

Rank    Rank Addr.   Row Address                Row Address            Column Address     Column Address           Bank
Size    H.Way Bits   Pins                       H.Way Bits             Pins               H.Way Bits               Address
4 MB    23–22        10–0                       21–11                  7–0                10–6, 4–2                11
8 MB    24–23        12, 10–0                   11, 22–12              12, 8–0            11, 11–6, 4–2            11
16 MB   25–24        13–12, 10–0                12–11, 23–13           12, 9–0            11, 12–6, 4–2            11
32 MB   –            CS#3, CS#2, 13–12, 10–0    25, 24, 12–11, 23–13   CS#3, CS#2, 9–0    25, 24, 11, 12–6, 4–2


The rank is selected via the chip select bits,

MM_CS#[3:0].

The column “Row Address/H.Way Bits” specifies which

internal data-highway address bits map to the SDRAM

row address. “Row Address/Pins” specifies which lines

of PNX1300’s MM_A address bus serve as the SDRAM

row address. For the 32 MB ranksize the chip selects

may be used as row address.

The column ‘Column Address/H.Way Bits’ specifies

which data-highway address bits map to the SDRAM col-

umn address. ‘Column Address/Pins’ specifies which

lines of PNX1300’s MM_A address bus serve as the

SDRAM column address. For the 32 MB ranksize the

chip selects may be used as column address.

MM_A[12] is only defined for a 8- or 16-MB rank size.

MM_A[12] contains H.Way bit 11 during the RAS and

CAS operations. MM_A[12] can be used as a bank select

(4-bank SDRAMs) or as a Row address (two bank

SDRAMs).

MM_A[13] is only defined for a 16-MB rank size.

MM_A[13] contains H.Way bit 12 during the RAS opera-

tion. MM_A[13] can only be used as a Row address.

For the 32 MB ranksize the chip selects MM_CS#[3:2]

pins are used as addresses. MM_CS#2 is used as a

bank select in addition to MM_A[11] and MM_CS#3 is

used as a row address.

Highway address bits 5–0 are the offset within a 64-byte block. They are all ‘0’ for an aligned block transfer. Table 12-10 lists the mapping of bits 5–2 to identify in which SDRAM positions the words of a block are located. Bit 5 is always mapped to (one of) the SDRAM internal bank selects; thus, each SDRAM bank receives half (32 bytes) of the block transfer.

Highway address bits 4–2 are the word offset in a cache block. Bits 1–0 are the byte offset within a 32-bit word.
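As an illustration of the 32-bit mapping, the sketch below splits a highway address for the 4-MB rank size, using the bit ranges listed for that row of Table 12-10 together with the statement that bit 5 drives a bank select. The function name `split_highway_address_4mb` is hypothetical, and the order in which the column bits land on MM_A[7:0] (bits 10–6 on the upper five pins, bits 4–2 on the lower three) is an assumption taken from the listing order in the table.

```python
def split_highway_address_4mb(addr):
    """Split a highway address into SDRAM fields for a 4-MB rank, 32-bit bus.

    Sketch only: the column pin ordering is assumed, not confirmed.
    """
    byte_in_word = addr & 0x3          # bits 1-0: byte within a 32-bit word
    col_low = (addr >> 2) & 0x7        # bits 4-2: word offset in the block
    bank = (addr >> 5) & 0x1           # bit 5: SDRAM internal bank select
    col_high = (addr >> 6) & 0x1F      # bits 10-6
    row = (addr >> 11) & 0x7FF         # bits 21-11 -> MM_A[10:0] at RAS
    rank = (addr >> 22) & 0x3          # bits 23-22 -> one of MM_CS#[3:0]
    column = (col_high << 3) | col_low # MM_A[7:0] at CAS (assumed order)
    return {"rank": rank, "row": row, "column": column,
            "bank": bank, "byte": byte_in_word}


example = (1 << 22) | (5 << 11) | (3 << 6) | (1 << 5) | (2 << 2) | 1
print(split_highway_address_4mb(example))
```

Because bit 5 selects the bank, the two 32-byte halves of an aligned 64-byte block fall into different internal banks, which is what allows the interleaving described in Section 12.10.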

12.8.2 Address Mapping in 16-bit mode

Table 12-11 shows how internal address bits from the

PNX1300 data highway bus are mapped to main-memory address-bus and chip select pins (MM_A[13:0],

MM_CS#[3:2]) in 16-bit data bus mode.

12.9 MEMORY INTERFACE AND SDRAM

INITIALIZATION

Immediately after reset, the main-memory interface is ini-

tialized by placing default values in the MM_CONFIG

and PLL_RATIOS registers (see Section 12.6, “Memory

System Programming”). During the subsequent hard-

ware boot process, when PNX1300 reads initial values

from an external ROM, these registers can be set to dif-

ferent values.

After PNX1300 is released from the reset state, the

memory interface automatically executes 10 refresh op-

erations, then initializes the mode register in each

SDRAM chip. Table 12-12 shows the settings in the

SDRAM mode register(s).

12.10 ON-CHIP SDRAM INTERLEAVING

The main-memory interface (MMI) takes advantage of

the on-chip interleaving of SDRAM devices. Interleaving

allows the precharge, RAS, and CAS commands needed

to access one internal bank to be performed while useful

data transfer is occurring with the other internal bank.

Thus, the overhead of preparing one bank is hidden during data movement to or from the other.

The benefit of on-chip interleaving is sustainable full-

bandwidth data transfer (1 word per clock cycle). The

transition from one internal bank to the other happens on

8-word boundaries; transferring 8 words gives the inac-

tive bank time to prepare (perform precharge, RAS, and

CAS) so that when the last word of the 8-word block in

the active bank has been transferred, the next word from

the just-precharged bank is ready on the next cycle.

The seamless transitions between the two on-chip banks can be sustained for a stream of contiguous addresses with the same direction (read or write). That is, a stream

of contiguous reads or contiguous writes can sustain full

bandwidth. If a write follows a read, then a small gap be-

tween transfers is needed.

Each bank access is terminated with a read or write with

automatic precharge, making a separate precharge com-

mand before the next RAS unnecessary.

For 4-bank SDRAM devices, the signals used as bank

addresses are interchangeable (i.e. it does not matter

which of the two signals is connected to Bank 1 or Bank

0 of the SDRAM device).

Table 12-11. 16-bit Address Mapping

Rank    Rank Addr.   Row Address                Row Address                Column Address         Column Address            Bank Address   Bank Address
Size    H.Way Bits   Pins                       H.Way Bits                 Pins                   H.Way Bits                Pin            H.Way Bit
2 MB    –            9–0                        20–11, 5                   7–0                    10–6, 3–1                 11             4
8 MB    –            CS#3, CS#2, 13–12, 10–0    24, 23, 12–11, 22–13, 5    CS#3, CS#2, 12, 8–0    24, 23, 11, 11–6, 3–1     11             4

Table 12-12. SDRAM Mode Register Settings

Parameter      Value
Burst length   4
Wrap type      Interleaved
CAS latency    3

12.11 REFRESH

The MMI performs SDRAM refresh cycles autonomously using the CAS-before-RAS (CBR) mechanism. SDRAMs have a 4K refresh interval: either 4096 rows must be refreshed every 64 ms or 2048 rows every 32 ms, i.e. one row every 15.62 µs. New SDRAM devices (i.e. the 256-Mbit generation) support an 8K refresh interval, therefore one row every 7.81 µs.

The MMI performs refresh at timed intervals: one CBR refresh command must be issued every 15.62 µs or every 7.81 µs. A counter in the MMI keeps track of the number of SDRAM clock cycles between refresh operations. This counter starts after the CBR operation has completed; this CBR operation takes 19 cycles. When the counter reaches a programmed limit, the next refresh operation is due, and the next-in-line data transfer request from the data highway is delayed until the CBR operation is executed.

All devices in the main-memory system are refreshed simultaneously. The REFRESH field in the MM_CONFIG register specifies the number of memory-system clock cycles (as distinguished from PNX1300 core clock cycles) between the CBR refresh operations.

Each CBR refresh operation takes 19 SDRAM clock cycles. Thus, at 100 MHz, refresh consumes about 1.2% of maximum available SDRAM bandwidth (19 cycles out of 1560). The bandwidth impact is slightly lower at higher

frequencies.
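The per-row refresh period and the bandwidth cost quoted above follow from simple division. The sketch below reproduces that arithmetic; the function names are hypothetical and the model ignores any extra delay a refresh may cause to a pending transfer.

```python
def refresh_period_us(retention_ms, rows):
    """Time budget per row so that all rows are refreshed within retention_ms."""
    return retention_ms * 1000 / rows


def refresh_overhead(cbr_cycles, period_us, clock_mhz):
    """Fraction of SDRAM clock cycles spent on CBR refresh."""
    cycles_per_period = period_us * clock_mhz
    return cbr_cycles / cycles_per_period


print(refresh_period_us(64, 4096))   # 4K refresh
print(refresh_period_us(64, 8192))   # 8K refresh (256-Mbit generation)
print(round(100 * refresh_overhead(19, 15.625, 100), 2), "%")
```

The exact periods are 15.625 µs and 7.8125 µs; the text rounds them to 15.62 µs and 7.81 µs.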

Table 12-13 lists the number of memory-system clocks for typical SDRAM operation speeds with a 15.62-µs refresh period. This number includes the worst-case scenario in order to guarantee the 15.62-µs refresh period.

Table 12-14 lists the number of memory-system clocks for typical SDRAM operation speeds with a 7.81-µs refresh period. This number includes the worst-case scenario in order to guarantee the 7.81-µs refresh period.
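One way to arrive at a REFRESH value is to convert the refresh period into memory clock cycles and subtract an allowance for the CBR operation and for a transfer that may already be in progress. The sketch below is an inference, not a documented formula: the 19-cycle CBR time comes from the text, while the additional 20-cycle worst-case allowance and the round-to-nearest step are assumptions chosen because they reproduce the decimal values listed in Table 12-13. The function name `refresh_field` is hypothetical.

```python
CBR_CYCLES = 19          # from the text: one CBR operation takes 19 cycles
WORST_CASE_WAIT = 20     # assumption inferred from Table 12-13, not documented


def refresh_field(clock_mhz, period_ns=15620):
    """Candidate REFRESH value for a given SDRAM clock (integer MHz).

    Integer arithmetic is used so that the rounding is exact.
    """
    cycles = (period_ns * clock_mhz + 500) // 1000   # round to nearest cycle
    return cycles - CBR_CYCLES - WORST_CASE_WAIT


for mhz in (100, 125, 133, 143, 166, 183):
    value = refresh_field(mhz)
    print(mhz, "MHz:", value, f"0x{value:04X}")
```

Any value derived this way should still be checked against the refresh requirement of the SDRAM device actually used.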

12.12 POWER-DOWN MODE

When PNX1300 is put into power-down mode to reduce

power consumption, the MMI responds by putting the

SDRAM devices into their power-down mode. In this

mode, the SDRAM devices retain their contents through

self-refresh.

12.13 OUTPUT DRIVER CAPACITY

PNX1300’s output driver circuits for the memory address and control signals (output signals in Table 12-9) can drive up to two memory devices when the memory interface is operating at 183 MHz. If more devices are connected, then a lower SDRAM clock frequency must be chosen.

Table 12-15 lists the clock frequency as a function of the

number of memory devices connected to unbuffered

memory interface signals.

Two identical outputs are provided for both the MM_CKE

(clock-enable) and MM_CLK signals. Each MM_CKE

and MM_CLK signal is capable of driving one SDRAM device at 183 MHz.

12.14 SIGNAL PROPAGATION DELAY

COMPENSATION

The PNX1300 MMI no longer has the two special pins,

MM_MATCHOUT and MM_MATCHIN, that were used in the TM-1100 and TM-1000. This loop helped the interface compensate for the propagation delay through circuit-board traces to and from the external SDRAM devices. It is now integrated into the MMI. Read timing is

internally derived.

To avoid excessive ringing of the clock signals, series

termination with a 33-ohm resistor is advised at the clock

outputs.

The delay of the memory clock with respect to the inter-

nal sending and receiving clocks is adjusted inside the

memory interface to achieve reliable communication and

guarantee correct setup and hold times.

Figure 12-4 shows a conceptual circuit board layout.

Two SDRAM devices share a single clock output. The

clock signals should have source-series termination.

12.15 CIRCUIT BOARD DESIGN

PNX1300 and its memory array form a high-speed digital

system. Even though only a small number of chips is in-

volved, this digital system operates at frequencies high

enough to make the analog characteristics of the con-

nections between the chips significant. Consequently,

the system designer must take care to ensure reliable

operation.

Table 12-13. REFRESH value for a 15.62-µs period

SDRAM Operation Speed   Value For REFRESH Field (decimal, hexadecimal)
100 MHz                 1523, 05F3
125 MHz                 1914, 0779
133 MHz                 2038, 07F6
143 MHz                 2195, 0892
166 MHz                 2554, 09F9
183 MHz                 2819, 0B03

Table 12-14. REFRESH value for a 7.81-µs period

SDRAM Operation Speed   Value For REFRESH Field (decimal, hexadecimal)
100 MHz                 742, 02E6
125 MHz                 936, 03A9
133 MHz                 992, 03E7
143 MHz                 1072, 0435
166 MHz                 1256, 04E9
183 MHz                 1384, 05E6

12.15.1 General Guidelines

• In general, PNX1300 and its memory chips must be as close together as possible to minimize parasitic capacitance. Close proximity is especially important for a 183-MHz memory system.

• Signal traces between PNX1300 and the memory

chips must be matched in length as closely as possible to minimize signal skew.

• The clock-signal trace(s) must be as short as possi-

ble.

• Address and control-signal traces should also be

short, but their length is less critical than the clock’s.

• Data-signal traces should also be short, but their

length is less critical than the clock’s, especially if

only one or two ranks are connected.

• Connections to several loads must follow a “T” con-

nection scheme in order to limit the reflections.

12.15.2 Specific Guidelines

• The maximum length for a signal trace should be

10cm. For 183-MHz operation, signal trace length

must not be longer than 7cm.

• The maximum capacitive load is 30 pF per trace,

including loads.

• The signal traces on the PNX1300 circuit board must

be designed as 50-ohm transmission lines.

• At most one SDRAM device may be connected to

each MM_CLK signal at 183 MHz.

12.15.3 Termination

No termination is required for address, data, and control

signals. Address and control signals are driven only by

PNX1300; the output impedance of the drivers is suffi-

ciently matched to prevent excessive ringing. PNX1300

design assumes that when driving data lines, the output

drivers of SDRAM chips are also sufficiently impedance

matched.

Series termination of the clock outputs with a 33-ohm re-

sistor is advised.

12.16 TIMING BUDGET

The glueless interface of the PNX1300 main-memory interface makes the memory system simple and straightforward from one point of view, but to ensure reliable operation at high clock rates, system designers must follow

the board design guidelines (see Section 12.15, “Circuit

Board Design”).

SDRAM devices must meet the critical specifications listed in Table 12-16 to ensure reliable operation of a 143-MHz (Tcycle = 7 ns) memory system.

For 166-MHz operation, SDRAM devices must meet the critical specifications listed in Table 12-17 to ensure

Table 12-15. Glueless interface limits for address/clocks

Memory Chips    Maximum Clock Frequency
2               183 MHz
4               166 MHz
8               133 MHz

Figure 12-4. Conceptual board layout.

[Diagram not reproduced. Blocks shown: DSPCPU and PNX1300 on-chip peripherals on the data highway; the PNX1300 memory interface; two SDRAM devices. Signals shown between the memory interface and each SDRAM device: Address, Control (clock enables, RAS#, CAS#, WE#), Clock (CLK, with a 33-ohm series resistor) and Data[31:0] (DQ[31:0]).]

Table 12-16. Critical 143-MHz SDRAM parameters

Timing Parameter         Symbol    Value
Max. output delay        tAC       6.4 ns
Min. output hold time    tOH       2.0 ns
Max. input setup time    tIS       2.0 ns
Max. input hold time     tIH       1.0 ns


reliable operation of a 166-MHz (Tcycle = 6 ns) memory system.

For 183-MHz operation, SDRAM devices must meet the critical specifications listed in Table 12-18 to ensure reliable operation of a 183-MHz (Tcycle = 5.4 ns) memory system.

These values leave virtually no margin for the critical timing parameters in a high-speed system. They assume a total worst-case board delay ranging from 0.6 ns at 143 MHz down to 0.4 ns at 183 MHz (as the operating frequency increases, the trace layout must be improved to reduce trace delay as well as skew) and a TSU for PNX1300 of 0 ns.

The maximum operating frequency is usually computed with the equation that relates Tcycle to tAC, Tboard, TCS and TSU (shown after Table 12-18).

Where TCS is the skew between MM_CLK0 and

MM_CLK1, and TSU the input data setup time as defined

in Section 1.9.7.10 on page 1-19, and Tboard includes

trace delay and trace skew.
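As a rough illustration of this timing budget, the sketch below assumes that the minimum cycle time is the plain sum tAC + Tboard + TCS + TSU and that the maximum frequency is its reciprocal. The function and parameter names are illustrative only and are not part of any PNX1300 tool. The tAC values come from Tables 12-16 and 12-18, and the board delays are the worst-case figures quoted in the text.

```python
def max_frequency_mhz(t_ac_ns, t_board_ns, t_cs_ns=0.0, t_su_ns=0.0):
    """Highest clock frequency (MHz) whose period covers the summed delays.

    Assumption: Tcycle(min) = tAC + Tboard + TCS + TSU, all in nanoseconds.
    """
    t_cycle_ns = t_ac_ns + t_board_ns + t_cs_ns + t_su_ns
    if t_cycle_ns <= 0:
        raise ValueError("total delay must be positive")
    return 1000.0 / t_cycle_ns


# 143-MHz case: tAC = 6.4 ns, board delay 0.6 ns, TSU = 0 ns -> 7.0 ns period
f_143 = max_frequency_mhz(6.4, 0.6)
# 183-MHz case: tAC = 5.0 ns, board delay 0.4 ns, TSU = 0 ns -> 5.4 ns period
f_183 = max_frequency_mhz(5.0, 0.4)
print(f"{f_143:.1f} MHz, {f_183:.1f} MHz")
```

With these inputs the budget closes with essentially no slack, which matches the statement that the tabulated values leave virtually no margin.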

12.16.1 Main AC Parameter requirements

The PNX1300 SDRAM interface was designed to support a wide range of SDRAM vendors. Table 12-19 describes some of the minimum SDRAM AC requirements for PNX1300 to operate correctly. The symbols and names are not fully standardized and may differ from one vendor to another. The table is not meant to be exhaustive and shows only the main parameters. Parameters are expressed in clock cycles rather than ns.
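Because vendor datasheets usually quote these parameters in nanoseconds, a small conversion helps when comparing a device against clock counts. The sketch below is one possible way to do that, assuming a device parameter is acceptable when, rounded up to whole clocks at the chosen memory frequency, it does not exceed the clock count in the table. The helper names and the 20-ns sample value are hypothetical and do not come from any particular SDRAM datasheet.

```python
import math


def ns_to_clocks(t_ns, f_mhz):
    """Smallest whole number of clock cycles that lasts at least t_ns at f_mhz."""
    t_cycle_ns = 1000.0 / f_mhz
    # Small tolerance so exact multiples are not pushed up by rounding error.
    return math.ceil(t_ns / t_cycle_ns - 1e-9)


def fits(vendor_ns, table_clocks, f_mhz):
    """True if the vendor's ns figure fits within the tabulated clock count."""
    return ns_to_clocks(vendor_ns, f_mhz) <= table_clocks


# Hypothetical device with tRCD = 20 ns, checked against a 3-clock tRCD at 143 MHz.
print(ns_to_clocks(20.0, 143.0), fits(20.0, 3, 143.0))
```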

12.17 EXAMPLE BLOCK DIAGRAMS

The following figures illustrate some of the memory configurations that can be built with PNX1300. In all of them, the signals used as bank addresses are interchangeable (i.e., it does not matter which of the two signals is connected to Bank 1 or Bank 0 of the SDRAM device).

12.17.1 Block Diagrams for a 32-bit interface

The following sections present examples of possible

connections with 16-, 64-, 128- and 256-Mbit SDRAMs.

MM_CONFIG.BW must be set to ‘0’ (refer to bw,

Section 12.6.1).

12.17.1.1 16-Mbit Devices or Less

These devices allow small memory configurations to be built. They are described in more detail in the TM-1000 and TM-1100 Databooks.

Table 12-17. Critical 166-MHz SDRAM parameters

Timing Parameter         Symbol    Value
Max. output delay        tAC       5.5 ns
Min. output hold time    tOH       2.0 ns
Max. input setup time    tIS       1.5 ns
Max. input hold time     tIH       1.0 ns

Table 12-18. Critical 183-MHz SDRAM parameters

Timing Parameter         Symbol    Value
Max. output delay        tAC       5.0 ns
Min. output hold time    tOH       2.0 ns
Max. input setup time    tIS       1.5 ns
Max. input hold time     tIH       1.0 ns

$$T_{cycle} = t_{AC} + T_{board} + T_{CS} + T_{SU}$$

Table 12-19. Minimum AC Parameters

Description                          Symbol    Clocks
ACTIVE command period                tRC       10
ACTIVE to PRECHARGE command          tRAS      7
PRECHARGE command period             tRP       3
ACTIVE Bank A to ACTIVE bank B       tRRD      3
ACTIVE to READ or WRITE command      tRCD      3
WRITE recovery time                  tWR       2


12.17.1.2 64-Mbit Devices

64-Mbit SDRAMS organized in x32 can be used to build

an 8-, 16-, 24-, or 32-MB memory system. Figure 12-5

shows an 8-MB memory system (one device only) and

Figure 12-6 details an extension of the block diagram in

order to build a 16-MB configuration.

Figure 12-5. Schematic of an 8-MB memory system consisting of one 4×512K×32 SDRAM (one rank).

[Schematic not reproduced. PNX1300 signals shown: MM_CS#[0], MM_RAS#, MM_CAS#, MM_WE#, MM_CKE, MM_A[10:0], MM_A[12,11], MM_CLK[1:0] (33-ohm series resistor), MM_DQ[31:0], MM_DQM[3:0]. SDRAM pins shown: CS#, Control, Address[10:0], BA[1:0], CLK, DQ[31:0], DQM[3:0].]

Figure 12-6. Schematic of a 16-MB memory system consisting of two ranks of 4×512K×32 SDRAM chips.

[Schematic not reproduced. PNX1300 signals shown: MM_CS#[1:0], MM_RAS#, MM_CAS#, MM_WE#, MM_CKE, MM_A[10:0], MM_A[12,11], MM_CLK[1:0] (33-ohm series resistor), MM_DQ[31:0], MM_DQM[3:0]. Each SDRAM shows CS#, Control, Address[10:0], BA[1:0], CLK, DQ[31:0] and DQM[3:0]; one device is selected by MM_CS#[0] and the other by MM_CS#[1].]


64-Mbit SDRAMs organized in x16 can be used to build 16-, 32-, 48- or 64-MB memory systems. Figure 12-7 details a 32-MB memory system. Removing the devices controlled by MM_CS#[1] makes a 16-MB system.

Figure 12-7. Schematic of a 32-MB memory system consisting of four 4×1M×16 SDRAM chips (two ranks).

[Schematic not reproduced. PNX1300 signals shown: MM_CS#[1:0], MM_RAS#, MM_CAS#, MM_WE#, MM_CKE, MM_A[13,10:0], MM_A[12,11], MM_CLK[1:0] (33-ohm series resistor), MM_DQ[31:0], MM_DQM[3:0]. Each SDRAM shows CS#, Control, Address[11:0], BA[1:0], CLK, DQ[15:0] and DQM[1:0]; within a rank, one device carries MM_DQ[31:16] with MM_DQM[3:2] and the other MM_DQ[15:0] with MM_DQM[1:0].]


64-Mbit SDRAMs organized in x8 can be used to build a 32-MB memory system as illustrated in Figure 12-8. Because of the unusual way the devices are used, this is the only supported configuration with x8 devices. MM_CONFIG.SIZE must be set to 6 (i.e. 16-MB rank size, Section 12.6.1).

Figure 12-8. Schematic of a 32-MB memory system consisting of four 4×2M×8 SDRAM chips (one rank).

[Schematic not reproduced. PNX1300 signals shown: MM_RAS#, MM_CAS#, MM_WE#, MM_CKE, MM_A[13,10:0], MM_A[11], MM_CS#[1], MM_CLK[1:0] (33-ohm series resistor), MM_DQ[31:0], MM_DQM[3:0]. Each SDRAM shows Control, Address[11:0], BA[1:0], CLK, DQ[7:0] and DQM, with CS# tied to GND; the four devices carry MM_DQ[31:24], MM_DQ[23:16], MM_DQ[15:8] and MM_DQ[7:0] with MM_DQM[3], MM_DQM[2], MM_DQM[1] and MM_DQM[0].]


12.17.1.3 128-Mbit Devices

128-Mbit SDRAMs organized in x16 are partially sup-

ported. The support is provided for a 32-MB memory sys-

tem. It can only contain one rank (i.e. it cannot be extend-

ed using the other MM_CS# pins). There are two

possible connection schemes.

Figure 12-9 is backward compatible with TM-1300.

MM_CONFIG.SIZE must be set to 6 (i.e. 16 MB rank

size, Section 12.6.1).

Figure 12-9. Schematic of a 32-MB memory system consisting of two 4×2M×16 SDRAM chips (one rank).

[Schematic not reproduced. PNX1300 signals shown: MM_RAS#, MM_CAS#, MM_WE#, MM_CKE, MM_A[13,10:0], MM_A[11], MM_CS#[1], MM_CLK[1:0] (33-ohm series resistor), MM_DQ[31:0], MM_DQM[3:0]. Each SDRAM shows Control, Address[11:0], BA[1:0], CLK, DQ[15:0] and DQM[1:0], with CS# tied to GND; one device carries MM_DQ[31:16] with MM_DQM[3:2] and the other MM_DQ[15:0] with MM_DQM[1:0].]


Figure 12-10 is not backward compatible with TM-1300. MM_CONFIG.SIZE must be set to 7 (i.e. 32 MB rank size, Section 12.6.1). This new scheme has the advantage of being compatible with Figure 12-12, which makes it possible to build a board that accepts either a 32-MB or a 64-MB memory system with the exact same footprint.

Figure 12-10. Schematic of a 32-MB memory system consisting of two 4×2M×16 SDRAM chips (one rank).

[Schematic not reproduced. PNX1300 signals shown: MM_RAS#, MM_CAS#, MM_WE#, MM_CKE, MM_A[13,10:0], MM_A[11], MM_CS#[2], MM_CLK[1:0] (33-ohm series resistor), MM_DQ[31:0], MM_DQM[3:0]. Each SDRAM shows Control, Address[11:0], BA[1:0], CLK, DQ[15:0] and DQM[1:0], with CS# tied to GND; one device carries MM_DQ[31:16] with MM_DQM[3:2] and the other MM_DQ[15:0] with MM_DQM[1:0].]


128-Mbit SDRAMs organized in x32 can be used to build

16-, 32-, 48- or 64-MB memory systems. A 32-MB sys-

tem is pictured in Figure 12-11. A 16-MB system can be

obtained by removing the device controlled by

MM_CS#[1]. Similarly it can be extended to 48- or 64-MB

by adding devices controlled by MM_CS#[3:2].

Figure 12-11. Schematic of a 32-MB memory system consisting of two ranks of 4×1M×32 SDRAM chips.

[Schematic not reproduced. PNX1300 signals shown: MM_CS#[1:0], MM_RAS#, MM_CAS#, MM_WE#, MM_CKE, MM_A[13,10:0], MM_A[12,11], MM_CLK[1:0] (33-ohm series resistor), MM_DQ[31:0], MM_DQM[3:0]. Each SDRAM shows CS#, Control, Address[11:0], BA[1:0], CLK, DQ[31:0] and DQM[3:0]; one device is selected by MM_CS#[0] and the other by MM_CS#[1].]


12.17.1.4 256-Mbit Devices

256-Mbit SDRAMs organized in x16 can be used to build a 64-MB memory system, as detailed in Figure 12-12. MM_CONFIG.SIZE must be set to 7 (i.e. 32-MB rank size, Section 12.6.1).

Note that the connections described in Figure 12-12 for 256-Mbit SDRAMs organized in x16 can also be used to connect 128-Mbit SDRAM devices organized in x16, allowing the same board footprint for two different memory sizes (i.e. 64 MB or 32 MB). Refer to Figure 12-10 for the detailed connection of the 32-MB case.

Figure 12-12. Schematic of a 64-MB memory system consisting of two 4×4M×16 SDRAM chips (one rank).

[Schematic not reproduced. PNX1300 signals shown: MM_RAS#, MM_CAS#, MM_WE#, MM_CKE, MM_CS#3 with MM_A[13,10:0], MM_A[11] with MM_CS#2, MM_CLK[1:0] (33-ohm series resistor), MM_DQ[31:0], MM_DQM[3:0]. Each SDRAM shows Control, Address[12:0], BA[1:0], CLK, DQ[15:0] and DQM[1:0], with CS# tied to GND; one device carries MM_DQ[31:16] with MM_DQM[3:2] and the other MM_DQ[15:0] with MM_DQM[1:0].]


12.17.2 Block Diagrams for a 16-bit interface

The following figures (Figure 12-13, Figure 12-14 and Figure 12-15) detail the SDRAM connections for 64-, 128- and 256-Mbit SDRAMs organized in x16. They build memory systems of 8, 16 and 32 MB, respectively.

MM_CONFIG.SIZE must be set to 5 (i.e. 8-MB rank size, Section 12.6.1) for all of the pictured configurations. MM_CONFIG.BW must be set to ‘1’ (refer to bw, Section 12.6.1).

Note that the connections described in Figure 12-15 for the 256-Mbit SDRAM device organized in x16 can also be used to connect a 128-Mbit SDRAM device organized in x16 (Figure 12-14), allowing the same board footprint for two different memory sizes (i.e. 32 MB or 16 MB).

Figure 12-13. Schematic of an 8-MB memory system consisting of one 4×1M×16 SDRAM chip (one rank).

[Schematic not reproduced. PNX1300 signals shown: MM_RAS#, MM_CAS#, MM_WE#, MM_CKE, MM_A[13,10:0], MM_A[12,11], MM_CLK[0] (33-ohm series resistor), MM_DQ[31:0], MM_DQM[3:0]. The SDRAM shows Control, Address[11:0], BA[1:0], CLK, DQ[15:0] and DQM[1:0], with CS# tied to GND; it carries MM_DQ[15:0] with MM_DQM[1:0].]

Figure 12-14. Schematic of a 16-MB memory system consisting of one 4×2M×16 SDRAM chip (one rank).

[Schematic not reproduced. PNX1300 signals shown: MM_RAS#, MM_CAS#, MM_WE#, MM_CKE, MM_A[13,10:0], MM_A[11] with MM_CS#2, MM_CLK[0] (33-ohm series resistor), MM_DQ[31:0], MM_DQM[3:0]. The SDRAM shows Control, Address[11:0], BA[1:0], CLK, DQ[15:0] and DQM[1:0], with CS# tied to GND; it carries MM_DQ[15:0] with MM_DQM[1:0].]


Figure 12-15. Schematic of a 32-MB memory system consisting of one 4×4M×16 SDRAM chip (one rank).

[Schematic not reproduced. PNX1300 signals shown: MM_RAS#, MM_CAS#, MM_WE#, MM_CKE, MM_CS#3 with MM_A[13,10:0], MM_A[11] with MM_CS#2, MM_CLK[0] (33-ohm series resistor), MM_DQ[31:0], MM_DQM[3:0]. The SDRAM shows Control, Address[12:0], BA[1:0], CLK, DQ[15:0] and DQM[1:0], with CS# tied to GND; it carries MM_DQ[15:0] with MM_DQM[1:0].]


System Boot Chapter 13

by Gert Slavenburg, Bob Bradfield, and Hani Salloum

13.1 BOOT SEQUENCE OVERVIEW

In this document, the generic PNX1300 name refers

to the PNX1300 Series, or the PNX1300/01/02/11

products.

Before a PNX1300 system can begin operating, the

main-memory interface (MMI) registers and on-chip

clock ratio register must be configured. Since the

DSPCPU cannot begin operating until after these regis-

ters and circuits are initialized, the DSPCPU cannot be

relied on to initialize these resources. Consequently,

PNX1300 needs an independent bootstrap facility for

low-level initialization.

PNX1300 implements low-level system initialization by

combining a small block of on-chip system boot logic with

a single external serial boot EEPROM connected to the

I2C interface. See Figure 13-1. Serial EEPROMs with an

I2C interface are slow but have the advantages of being

space-efficient and inexpensive. The amount of information needed for initial system boot is small, so speed is

not a concern.

The PNX1300 system boot block performs differently for each of two major types of PNX1300 system, distinguished by host-assisted and autonomous bootstrapping. The most significant bit of the tenth byte in the external EEPROM determines the system boot procedure and must match the system configuration.

In host-assisted bootstrapping, a PNX1300 device is integrated into a system where some other processor serves as the host. For example, a PNX1300 chip might be part of a PCI card in a standard personal computer (PC). In this case, the PNX1300 system boot only needs to load enough information from the serial EEPROM to configure the on-chip timing circuits and MMI; the host processor can perform all other PNX1300 setup chores.

In the second type of system, autonomous bootstrapping takes place. In this configuration, a PNX1300 device serves as the host (main) processor; consequently, the PNX1300 system boot must perform more work. In addition to configuring on-chip timing and the MMI, the system boot must set the base addresses of the main memory and MMIO address apertures and load into main memory a level 1 bootstrap program for the DSPCPU.

Only the first 10 bytes of the serial EEPROM are needed when PNX1300 is not the host PCI processor; thus, such systems can use a very low-cost 128-byte EEPROM device. When PNX1300 serves as the system’s host processor, the boot logic permits almost 2 KB of storage for the level 1 bootstrap DSPCPU program in a single eight-pin EEPROM device.

Figure 13-1. The system boot logic uses the I2C interface to access a serial EEPROM that contains main-memory and system timing information.

[Diagram not reproduced. The PNX1300 system boot block and I2C interface connect to the serial EEPROM over SCL and SDA; each line has a 4.7K resistor to Vdd.]

Table 13-1. System Boot Features

Characteristic: Boot Configurations Supported
  • Host assisted, e.g., PNX1300 is a PCI slave in a standard PC.
  • Autonomous, e.g., PNX1300 is the host PCI processor.

Characteristic: ROM Device Types Supported
  • Single standard I2C serial EEPROMs from 128 bytes to 2 KB in size.
  • EEPROMs connect via the PNX1300 built-in 2-wire I2C interface.
  • The use of EEPROMs with hardware Write Protect (WP) is recommended. A jumper on WP allows user control over in-system reprogramming using the I2C interface.
  • The EEPROM must respond to I2C device address 1010.

Characteristic: ROM device examples
  • Atmel 24C01A (128 bytes, WP)
  • Atmel 24C08 (1 KB, WP)
  • Atmel 24C16 (2 KB, WP)

Characteristic: ROM size
  • From 128 bytes to 2 KB (one device) for initial program load.


13.2 BOOT HARDWARE OPERATION

The PNX1300 boot sequence begins with the assertion

of the reset signal TRI_RESET#. After reset is de-assert-

ed, only the system boot block, I2C, and PCI interfaces

are allowed to operate. In particular, the DSPCPU and

the internal data highway bus will remain in the reset

state until they are explicitly released during the boot procedure. In autonomous boot, the system boot block is responsible for releasing the DSPCPU and highway from

reset. In host-assisted boot, the boot logic releases the

highway from reset and the PNX1300 software driver

(which runs on the host processor) releases the

DSPCPU from reset.

The system boot block operation is illustrated in a flow

chart shown in Figure 13-2.

13.2.1 Boot Procedure Common to Both

Autonomous and Host-Assisted

Bootstrap

There should be no other I2C master active from reset

until boot EEPROM load completes. The system boot

procedure begins by loading a few critical pieces of information from the serial EEPROM. This part of the procedure is common to both autonomous and host-assisted

bootstrapping. See Table 13-2 for a summary and

Table 13-5 for full bit-accurate EEPROM layout details.

The first byte of the EEPROM is read using a serial clock

equal to BOOT_CLK/1000, which is guaranteed to be

less than 100 kHz. After reading the first byte, which con-

tains the actual BOOT_CLK rate as well as the EEPROM

speed capability, the boot block proceeds to read subse-

quent bytes at the highest valid speed.

The ‘number of lines in EEPROM device’ bit should be ‘0’ for a 128-byte device and ‘1’ for larger devices.

The SDRAM aperture size should be set to the smallest

size that is larger than or equal to the actual size of

SDRAM connected to PNX1300. The SDRAM aperture

size information is forwarded to the PCI interface for use

in host BIOS configuration, as described in Section

13.3.2, “Stage 2: Host-System PCI Configuration.”
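The aperture rule above (smallest encoded size that is at least the installed SDRAM) can be sketched as a lookup over the 3-bit codes of Table 13-2. This is only an illustration; the function name is hypothetical, and because codes 000 and 001 both mean 1 MB, returning 000 for a 1-MB system is an arbitrary choice made here.

```python
# Index = 3-bit SDRAM aperture size code, value = aperture size in MB (Table 13-2).
APERTURE_MB = [1, 1, 2, 4, 8, 16, 32, 64]


def aperture_code(sdram_mb):
    """Return the first code whose aperture is >= the installed SDRAM size."""
    if sdram_mb <= 0:
        raise ValueError("SDRAM size must be positive")
    for code, size_mb in enumerate(APERTURE_MB):
        if size_mb >= sdram_mb:
            return code
    raise ValueError("no aperture code covers more than 64 MB")


# A 6-MB system rounds up to the 8-MB aperture, code 0b100.
print(format(aperture_code(6), "03b"))
```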

The BOOT_CLK speed bits should be set to the code for the closest frequency at or above that of the external clock circuit; for example, for an external clock of 40 MHz or 50 MHz the value should be 10. This field, together with the EEPROM maximum clock speed bit, is used to select the best possible divider ratio for generating the I2C clock, as shown in Table 13-3. In addition, the delay actions in Figure 13-2 are timed based on the specified BOOT_CLK value.

The EEPROM maximum clock speed bit is set to match

the speed grade of the serial EEPROM device.

The test mode bit should always be set to ‘0’. It is only set

to one for factory ATE testing.

The Subsystem ID and Subsystem Vendor ID data has

no meaning to the PNX1300 hardware; its meaning is

entirely software defined. The value is loaded by the sys-

tem boot block from the EEPROM and published in the

PCI configuration space register at offset 0x2C to pro-

vide the 16-bit Subsystem ID and Subsystem Vendor ID

values. These values are used by driver software to dis-

tinguish the board vendor and product revision informa-

tion for multiple board products based on the PNX1300

chip. Refer to Section 11.5.12, “Subsystem ID, Subsystem Vendor ID Register,” for more information on the choice of values.

Table 13-2. Information Loaded During First Part of Bootstrapping Procedure

Information                          Size      Interpretation
Number of lines in EEPROM device     1 bit     0 = 128 lines
                                               1 = 256 or more lines
SDRAM aperture size                  3 bits    000 = 1 MB
                                               001 = 1 MB
                                               010 = 2 MB
                                               011 = 4 MB
                                               100 = 8 MB
                                               101 = 16 MB
                                               110 = 32 MB
                                               111 = 64 MB
BOOT_CLK speed                       2 bits    00 = 100 MHz
                                               01 = 75 MHz
                                               10 = 50 MHz
                                               11 = 33 MHz
I2C clock speed                      1 bit     0 = 100 KHz
                                               1 = 400 KHz
Test mode                            1 bit     0 = normal operation
                                               1 = rapid ATE testing
Subsystem ID                         16 bits   Value is copied to Subsystem ID register in PCI configuration space.
Subsystem Vendor ID                  16 bits   Value is copied to Subsystem Vendor ID register in PCI config space.
MM_CONFIG register initialization    20 bits   Value is simply written to the MM_CONFIG register; see Section 12.6.1, “MM_CONFIG Register.”
PLL_RATIOS register initialization   8 bits    Value is simply written to the PLL_RATIOS register; see Section 12.6.2, “PLL_RATIOS Register.”
Autonomous/host-assisted boot        1 bit     0 = host-assisted
                                               1 = autonomous
Enable internal PCI_CLK              1 bit     0 = PCI_CLK taken from outside
                                               1 = use on-chip XIO PCI_CLK clock generator
                                               Note: MUST be set if no external PCI clock is supplied
SDRAM prefetchable                   1 bit     0 = not prefetchable
                                               1 = prefetchable

The MM_CONFIG and PLL_RATIOS registers control

the hardware of the MMI and PNX1300 on-chip clock cir-

cuits. These registers are described in detail in Section

12.6, “Memory System Programming.” The boot value

should be set to reflect the exact capabilities of the actual

SDRAM in the system.

The ‘enable internal PCI_CLK generator’ bit determines

the PCI_CLK pin operating mode. If this bit is ‘0’,

PCI_CLK behaves compatibly with TM-1000 and normal PCI

operation, i.e. it is an input pin that takes PCI clock from

the external world. If this bit is ‘1’, an on-chip clock divider

in the XIO logic becomes the source of PCI_CLK, and

the PCI_CLK pin is configured as an output. In the latter

case, the PCI_CLK frequency can be programmed to a

divider of the PNX1300 highway clock by setting the

XIO_CTL register ‘Clock Frequency’ divider value. Refer

to Chapter 22, “PCI-XIO External I/O Bus.” Note: This bit

must be set if no external PCI clock is supplied.

The ‘SDRAM prefetchable’ bit is copied to the PCI con-

figuration space register DRAM_BASE and only visible

as bit #3 (P bit) of DRAM_BASE in a PCI configuration

read, but not visible by MMIO access. Its purpose is to

tell the PCI host, that SDRAM reads will cause no side ef-

fects. The host may apply optimizations on PCI access,

if this bit is set.

The ‘autonomous/host-assisted boot’ bit determines

whether the system boot logic will continue reading more

information from the EEPROM or halt its operation so the

host can complete system initialization. After the infor-

mation listed in Table 13-2 has been loaded into

PNX1300 registers, an external PCI host processor can

finish the initialization of PNX1300. If no external PCI

host processor is present, the autonomous/host-assisted

boot bit should be set to ‘1’ to allow the system boot logic to load the information described in the next section.

Table 13-3. I2C speed as a function of EEPROM byte 0

BOOT_CLK bits    EEPROM speed bit    Divider value    Actual I2C speed
00 (100 MHz)     0 (100 KHz)         1008             99.2 KHz
00               1 (400 KHz)         256              390.6 KHz
01 (75 MHz)      0 (100 KHz)         752              99.7 KHz
01               1 (400 KHz)         192              390.6 KHz
10 (50 MHz)      0 (100 KHz)         512              97.6 KHz
10               1 (400 KHz)         128              390.6 KHz
11 (33 MHz)      0 (100 KHz)         336              98.2 KHz
11               1 (400 KHz)         96               343.8 KHz
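The speeds in Table 13-3 appear to follow from dividing the nominal BOOT_CLK frequency by the divider value. The sketch below reproduces that arithmetic under this assumption; the dictionary and function names are illustrative, and when the real external clock is slower than the nominal code (for example 40 MHz with code 10), the resulting I2C clock would be proportionally slower.

```python
# BOOT_CLK code -> nominal frequency in Hz (Table 13-2).
BOOT_CLK_HZ = {0b00: 100e6, 0b01: 75e6, 0b10: 50e6, 0b11: 33e6}

# (BOOT_CLK code, EEPROM speed bit) -> divider value (Table 13-3).
I2C_DIVIDER = {
    (0b00, 0): 1008, (0b00, 1): 256,
    (0b01, 0): 752,  (0b01, 1): 192,
    (0b10, 0): 512,  (0b10, 1): 128,
    (0b11, 0): 336,  (0b11, 1): 96,
}


def i2c_speed_khz(boot_clk_bits, eeprom_speed_bit, clock_hz=None):
    """I2C clock in kHz; clock_hz overrides the nominal BOOT_CLK if given."""
    if clock_hz is None:
        clock_hz = BOOT_CLK_HZ[boot_clk_bits]
    return clock_hz / I2C_DIVIDER[(boot_clk_bits, eeprom_speed_bit)] / 1e3


for key in sorted(I2C_DIVIDER):
    print(format(key[0], "02b"), key[1], round(i2c_speed_khz(*key), 2))
```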


Figure 13-2. Flow chart of system boot procedure for both host-assisted and autonomous configurations.

[Flow chart not reproduced. Box text, in the order recovered:]
1. TRI_RESET# de-asserted.
2. 8-bit serial read: 1 bit EEPROM capacity, 3 bits DRAM aperture size, 2 bits PNX1300 clock speed, 1 bit I2C clock rate, 1 bit test mode control.
3. Write to EEPROM size register; write aperture size to DRAM_ROUND_SIZE register in PCI BIU; write to PNX1300 clock speed register.
4. 32-bit serial read; write to SUBSYSTEM ID registers in PCI BIU.
5. Write 20 bits to MM_CONFIG; write to PLL_RATIOS.
6. Disable MMI_RESET to activate highway.
7. Autonomous boot? No: system boot halts (host driver will complete the boot procedure). Yes: continue.
8. Save 11-bit byte count.
9. Write to MMIO space: MMIO_BASE, then DRAM_BASE, then DRAM_CACHEABLE_LIMIT.
10. Byte count == 0? No: write to SDRAM (write 32 bits of code onto highway with all byte enables active, then execute 15 dummy writes on highway to meet MMI protocol), decrement byte count by four, and test again. Yes: write to MMIO space to disable CPU_RESET; the DSPCPU starts execution at DRAM_BASE in big-endian mode and system boot halts.
[Labels whose position in the chart is not recoverable: 24-bit, 8-bit, 64-bit and 32-bit serial reads; “Wait 400 usec for PLLs to lock”; “Wait ca. 0.6 msec for I2C to stabilize”.]


13.2.2 Initial DSPCPU Program Load for

Autonomous Bootstrap

In a system where PNX1300 serves as the host CPU, the system boot block performs an autonomous boot procedure. For an autonomous boot, the system boot block

reads all the information described in Section 13.2.1,

“Boot Procedure Common to Both Autonomous and

Host-Assisted Bootstrap,” and then—because the au-

tonomous boot bit is set—continues reading information

from the EEPROM. After this part of the system boot pro-

cedure is done, the DSPCPU starts executing. See

Table 13-4.

The DSPCPU bootstrap program byte count encodes the number of bytes of DSPCPU program code contained in the EEPROM. This 11-bit unsigned byte count can encode values up to 2047, which covers the 2048-byte maximum amount of EEPROM storage supported. The actual amount of EEPROM available for the DSPCPU bootstrap program is limited to 2000 bytes: other information consumes 47 bytes, and the DSPCPU code must be an integral number of 32-bit words.
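The 2000-byte figure follows from simple arithmetic: 2048 bytes less 47 header bytes leaves 2001 bytes, which rounds down to 2000 bytes (500 words) once the word-multiple rule is applied. The sketch below shows that calculation and a matching check for a candidate byte count; the function names are illustrative.

```python
EEPROM_MAX_BYTES = 2048   # largest supported EEPROM
HEADER_BYTES = 47         # bytes consumed by other boot information
WORD_BYTES = 4            # DSPCPU code is stored as 32-bit words


def max_program_bytes(eeprom_bytes=EEPROM_MAX_BYTES):
    """Largest bootstrap program (bytes) that fits, rounded down to whole words."""
    available = max(eeprom_bytes - HEADER_BYTES, 0)
    return available // WORD_BYTES * WORD_BYTES


def is_valid_byte_count(n, eeprom_bytes=EEPROM_MAX_BYTES):
    """True if n is a whole number of words that fits in the given EEPROM."""
    return n >= 0 and n % WORD_BYTES == 0 and n <= max_program_bytes(eeprom_bytes)


print(max_program_bytes(), max_program_bytes() // WORD_BYTES)
```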

Four pairs of 32-bit MMIO-register addresses and values

follow the bootstrap program byte count. Each address

tells the boot block where in the 32-bit DSPCPU address

space to store the corresponding 32-bit value.

The first pair initializes the MMIO_BASE. The

MMIO_BASE sets the base address of the 2-MB MMIO address space. All MMIO registers are addressed using an

offset that is relative to the value of MMIO_BASE. For

this pair, the address is required to be 0xEFF00400 be-

cause that is the default MMIO_BASE enforced when

PNX1300 is reset. The new value for MMIO_BASE is en-

coded in the corresponding value.

The DRAM_BASE address/value pair determines the base address of the SDRAM address aperture within the 32-bit DSPCPU address space. The address must be equal to 0x100000 plus the new value of MMIO_BASE set previously in the boot procedure. The DRAM_BASE value must be naturally aligned to the rounded DRAM aperture size; for example, a 6-MB DRAM aperture should start on an 8-MB address multiple.

The DRAM_LIMIT address/value pair determines the extent of the SDRAM address aperture. The address must be equal to 0x100004 plus the new value of

MMIO_BASE set previously in the boot procedure. The

value in DRAM_LIMIT should be 1 higher than the ad-

dress of the last valid byte of SDRAM memory, and must

be a 64 KB multiple.

The DRAM_CACHEABLE_LIMIT address/value pair determines the extent of the cacheable aperture of the

SDRAM address space. The address must be equal to

0x100008 plus the value of MMIO_BASE set previously

in the boot procedure. The cacheable aperture always

begins at the address value in DRAM_BASE; the value

in DRAM_CACHEABLE_LIMIT is one higher than the

address of the last byte of cacheable SDRAM memory,

and must be a 64 KB multiple. It is safe to initially set the

value of DRAM_CACHEABLE_LIMIT equal to

DRAM_LIMIT. The RTOS can, if desired, change the val-

ue later.
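The constraints on the four address/value pairs can be collected into a small builder that refuses inconsistent inputs. This is only a sketch: the function name and the sample addresses are hypothetical, and the two ordering checks (DRAM_LIMIT inside the rounded aperture, cacheable limit between DRAM_BASE and DRAM_LIMIT) are assumptions that go slightly beyond what the text states explicitly.

```python
MMIO_BASE_REG_ADDR = 0xEFF00400   # required address for the MMIO_BASE pair
KB64 = 64 * 1024


def boot_pairs(mmio_base, dram_base, dram_limit, cacheable_limit, aperture_bytes):
    """Return the four (address, value) pairs in EEPROM order, or raise ValueError."""
    if dram_base % aperture_bytes:
        raise ValueError("DRAM_BASE must be aligned to the rounded aperture size")
    if dram_limit % KB64 or cacheable_limit % KB64:
        raise ValueError("limits must be multiples of 64 KB")
    # Assumption: the limit lies inside the rounded aperture.
    if not dram_base < dram_limit <= dram_base + aperture_bytes:
        raise ValueError("DRAM_LIMIT must lie inside the aperture")
    # Assumption: the cacheable region never extends past DRAM_LIMIT.
    if not dram_base <= cacheable_limit <= dram_limit:
        raise ValueError("cacheable limit must lie between DRAM_BASE and DRAM_LIMIT")
    return [
        (MMIO_BASE_REG_ADDR, mmio_base),
        (mmio_base + 0x100000, dram_base),        # DRAM_BASE
        (mmio_base + 0x100004, dram_limit),       # DRAM_LIMIT
        (mmio_base + 0x100008, cacheable_limit),  # DRAM_CACHEABLE_LIMIT
    ]


# Hypothetical 6-MB system in an 8-MB aperture starting at 8 MB.
pairs = boot_pairs(0xEFE00000, 0x00800000, 0x00E00000, 0x00E00000, 8 << 20)
for address, value in pairs:
    print(f"{address:#010x} <- {value:#010x}")
```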

The next 32-bit value in boot EEPROM memory is a copy

of the DRAM_BASE value encoded previously. The sys-

tem boot hardware loads the DSPCPU bootstrap pro-

gram into SDRAM starting at DRAM_BASE.

The bytes of the DSPCPU bootstrap program follow the copy of the DRAM_BASE value. The bootstrap program can consist of up to 500 32-bit words of DSPCPU

Table 13-4. Information Loaded During Second Part of Bootstrapping Procedure for Autonomous Boot

Information                               Size      Interpretation
DSPCPU bootstrap program byte count n     11 bits   Up to 500 32-bit words (2048 bytes less 47 header bytes)
MMIO_BASE address                         32 bits   Value must be 0xEFF00400
MMIO_BASE value                           32 bits   Value is simply written to 0xEFF00400 to determine new base address of 2-MB MMIO register aperture within 32-bit DSPCPU address space
DRAM_BASE address                         32 bits   MMIO_BASE + 0x100000
DRAM_BASE value                           32 bits   Value is simply written to DRAM_BASE to determine base address of SDRAM aperture within 32-bit DSPCPU address space
DRAM_LIMIT address                        32 bits   MMIO_BASE + 0x100004
DRAM_LIMIT value                          32 bits   Value is simply written to DRAM_LIMIT to determine limit address of SDRAM aperture within 32-bit DSPCPU address space
DRAM_CACHEABLE_LIMIT address              32 bits   MMIO_BASE + 0x100008
DRAM_CACHEABLE_LIMIT value                32 bits   Value is simply written to DRAM_CACHEABLE_LIMIT to determine limit address of cacheable part of SDRAM aperture within 32-bit DSPCPU address space
DRAM_BASE value                           32 bits   Copy of the DRAM_BASE; must be equal to value specified above
SDRAM code word 0                         32 bits   First 32-bit word of initial DSPCPU bootstrap program
SDRAM code word 1                         32 bits   Second 32-bit word of initial DSPCPU bootstrap program
SDRAM code word n/4                       32 bits   Last 32-bit word of initial DSPCPU bootstrap program


instructions. The byte count must be a multiple of four.

Note that the bytes are stored in the EEPROM in a byte-swapped order per group of four compared to SDRAM, as detailed in Table 13-5.
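When preparing an EEPROM image from a bootstrap program laid out as it should appear in SDRAM, the swap can be applied one 32-bit group at a time. The sketch below assumes that "byte swapped per group of 4" means a full reversal of the four bytes in each group; Table 13-5 remains the authoritative definition of the order, and the function name is hypothetical.

```python
def swap_groups_of_four(data: bytes) -> bytes:
    """Reverse the byte order inside each consecutive 4-byte group.

    Assumption: the EEPROM order is the full reversal of the SDRAM order
    within every 32-bit word. The operation is its own inverse.
    """
    if len(data) % 4:
        raise ValueError("length must be a multiple of four bytes")
    return b"".join(data[i:i + 4][::-1] for i in range(0, len(data), 4))


sdram_image = bytes([0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08])
eeprom_image = swap_groups_of_four(sdram_image)
print(eeprom_image.hex())
```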

After the entire DSPCPU bootstrap program is loaded

into SDRAM at DRAM_BASE, the system boot logic re-

leases the DSPCPU from the reset state. At this point,

the DSPCPU begins executing the bootstrap program

starting at DRAM_BASE and PNX1300 is fully operation-

al. At the same time, the boot logic releases the I2C inter-

face.

13.3 HOST-ASSISTED BOOT

DESCRIPTION

For a host-assisted bootstrap, the complete bootstrap

process consists of three distinct stages, but the system

boot hardware performs only the first stage. The other

two stages are the responsibility of the host system.

13.3.1 Stage 1: PNX1300 System Boot

Hardware

In the first stage, the PNX1300 hardware must be initial-

ized enough to allow the host system to query and ma-

nipulate PNX1300 resources. The system boot hard-

ware, using the procedure described above in Section

13.2.1, “Boot Procedure Common to Both Autonomous

and Host-Assisted Bootstrap,” initializes the Subsystem

ID, Subsystem Vendor ID, MM_CONFIG, and

PLL_RATIOS registers, waits for the PLLs to lock, en-

ables the internal highway and MMI, but leaves the

DSPCPU in the reset state. After this minimal initializa-

tion, the host system can finish the bootstrap process.

At the completion of stage 1, the PNX1300 hardware is

ready to respond to PCI configuration space accesses,

and the boot block has released the I2C interface.

13.3.2 Stage 2: Host-System PCI

Configuration

Stage 2 is carried out either by the host-system PCI

BIOS or by a combination of the BIOS and the host operating system (e.g., Windows 95). During this stage, the

host system configures all PCI-bus clients.

The PCI-bus configuration consists of querying the bus

clients to determine the following:

• The number of PCI base-address registers imple-

mented by each client. For PNX1300, the number of

PCI base-address registers is always two

(MMIO_BASE and DRAM_BASE).

• The size of each aperture associated with the base-

address registers. For PNX1300, the size of the

MMIO aperture is always 2 MB. The size of the

SDRAM aperture can range from 1 MB to 64 MB,

and the size must be a power of two (seven distinct

sizes).

Using this information, the host system relocates each address aperture to eliminate overlaps in the PCI address space. The host system accomplishes the relocation by considering each aperture’s size and then writing an appropriate starting address to each base-address register. Both the MMIO and SDRAM apertures must be relocated in this way. Note that in the case of autonomous boot, this relocation is done statically by the system boot hardware when it simply copies the values of MMIO_BASE and DRAM_BASE from the serial EEPROM into these registers.

The steps of the PCI protocol for determining the size of an address aperture are as follows (see Section 11.5.11, “Base Address Registers,” for a more complete discussion):

• The host writes a 32-bit word of all ‘1’s (0xffffffff) to

the base-address register.

• The host reads the base-address register immedi-

ately after the write. The value returned will have ‘0’s

in all don’t-care bits and ‘1’s in all required address

bits. The required address bits form a left-aligned

(i.e., starting at the most-significant bit) contiguous

field of ‘1’s.

• This left-aligned field of ‘1’s effectively specifies the size of the address aperture by indicating the bits of the base-address register that are significant for relocation. That is, an address aperture of size 2^n bytes can only begin on a 2^n-byte-aligned boundary.

As an example, consider the case of the MMIO aperture. The host will perform the following steps during stage 2 of the bootstrap process:

• Write 0xffffffff to MMIO_BASE.

• Read from MMIO_BASE, which returns the value

0xffe00000. The host sees that this value has an 11-

bit left-aligned field of ‘1’s, which indicates that the

aperture can only be relocated on 2-MB boundaries;

thus, the aperture size is 2 MB.

• Write a new value to MMIO_BASE with the top 11 bits set to relocate the MMIO aperture to a 2-MB region of PCI address space that does not conflict with other PCI address apertures.
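The sizing steps above can be sketched in a few lines of Python. This is a minimal illustration, not driver code: the function names are hypothetical, and masking off the low four bits of the read-back value assumes the usual PCI convention that those bits of a memory base-address register carry type flags rather than address bits.

```python
def aperture_size(readback: int) -> int:
    """Return the aperture size in bytes implied by a base-address register read-back."""
    # Assumption: bits 3..0 of a memory BAR are flag bits, not address bits.
    address_mask = readback & 0xFFFFFFF0
    # The lowest '1' of the left-aligned field has the weight of the aperture size,
    # which is the two's complement of the mask (truncated to 32 bits).
    return (~address_mask + 1) & 0xFFFFFFFF


def relocated_base(readback: int, desired_base: int) -> int:
    """Return the base value to write back, truncated to the aperture alignment."""
    return desired_base & (readback & 0xFFFFFFF0)


if __name__ == "__main__":
    mmio_readback = 0xFFE00000  # value returned by MMIO_BASE in the example above
    print(hex(aperture_size(mmio_readback)))               # aperture size in bytes
    print(hex(relocated_base(mmio_readback, 0xE0234567)))  # aligned base address
```

For the MMIO read-back value 0xffe00000 the computed size is 0x200000 bytes, i.e. the 2 MB stated in the text.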

At the completion of stage 2, the PNX1300 hardware is

ready to respond to host configuration space accesses,

host MMIO accesses and host SDRAM aperture access-

es. The DSPCPU is still in RESET state.

13.3.3 Stage 3: PNX1300 Driver Executing on

the Host

During the final stage of the bootstrap process, the

PNX1300 software driver executing on the host system

will write to SDRAM a program for the DSPCPU, and ini-

tialize any MMIO registers. When the initial program load

is complete, the driver releases the DSPCPU from its re-

set state by a write to the BIU_CTL register with the CR

bit set. See Chapter 11, “PCI Interface.” Now, with the

DSPCPU and host both running, the PNX1300 bootstrap

process is complete.

13.4 DETAILED EEPROM CONTENTS

Table 13-5 shows the serial EEPROM contents needed

for an autonomous boot procedure. For the host-assisted

boot procedure, only the contents up to line nine are

needed.

Note that the 32-bit words in the serial EEPROM are not stored on 32-bit word-aligned addresses.

Table 13-5. Serial boot EEPROM contents

Line        Data byte (bit 7 ... bit 0)
0           bit 7: #lines (0: 128 lines; 1: 256 or more lines)
            bits 6-4: SDRAM size[2:0] (000: 1 MB; 001: 1 MB; 010: 2 MB; 011: 4 MB; 100: 8 MB; 101: 16 MB; 110: 32 MB; 111: 64 MB)
            bits 3-2: BOOT_CLK[1:0] (00: 100 MHz; 01: 75 MHz; 10: 50 MHz; 11: 33 MHz)
            bit 1: EEPROM clock (0: 100 kHz; 1: 400 kHz)
            bit 0: Test Mode (0: normal; 1: rapid ATE)
1           Subsystem ID, 8 msb
2           Subsystem ID, 8 lsb
3           Subsystem Vendor ID, 8 msb
4           Subsystem Vendor ID, 8 lsb
5           bits 7-4: -; bits 3-0: MM_CONFIG[19:16]
6           MM_CONFIG[15:8]
7           MM_CONFIG[7:0]
8           PLL_RATIOS[7:0]
            bit 7: sdram PLL bypass; bit 6: sdram PLL disable; bit 5: cpu PLL bypass; bit 4: cpu PLL disable; bit 3: sdram ratio; bits 2-0: cpu ratio[2:0]
9           bit 7: boot type (0: host assist.; 1: autonomous)
            bit 6: enable internal PCI_CLK
            bit 5: SDRAM prefetchable (0: no; 1: yes)
            bits 4-3: -
            bits 2-0: byte count [10:8]
10          byte count [7:0]
11          MMIO_BASE address [31:24] (must be 0xEF)
12          MMIO_BASE address [23:16] (must be 0xF0)
13          MMIO_BASE address [15:8] (must be 0x04)
14          MMIO_BASE address [7:0] (must be 0x00)
15          MMIO_BASE value [31:24]
16          MMIO_BASE value [23:16]
17          MMIO_BASE value [15:8]
18          MMIO_BASE value [7:0]
19          DRAM_BASE address [31:24] (must be byte 3 of MMIO_BASE + 0x100000)
20          DRAM_BASE address [23:16] (must be byte 2 of MMIO_BASE + 0x100000)
21          DRAM_BASE address [15:8] (must be byte 1 of MMIO_BASE + 0x100000)
22          DRAM_BASE address [7:0] (must be byte 0 of MMIO_BASE + 0x100000)
23          DRAM_BASE value [31:24]
24          DRAM_BASE value [23:16]
25          DRAM_BASE value [15:8]
26          DRAM_BASE value [7:0]
27          DRAM_LIMIT address [31:24] (must be byte 3 of MMIO_BASE + 0x100004)
28          DRAM_LIMIT address [23:16] (must be byte 2 of MMIO_BASE + 0x100004)
29          DRAM_LIMIT address [15:8] (must be byte 1 of MMIO_BASE + 0x100004)
30          DRAM_LIMIT address [7:0] (must be byte 0 of MMIO_BASE + 0x100004)
31          DRAM_LIMIT value [31:24]
32          DRAM_LIMIT value [23:16]
33          DRAM_LIMIT value [15:8]
34          DRAM_LIMIT value [7:0]
35          DRAM_CACHEABLE_LIMIT address [31:24] (must be byte 3 of MMIO_BASE + 0x100008)
36          DRAM_CACHEABLE_LIMIT address [23:16] (must be byte 2 of MMIO_BASE + 0x100008)
37          DRAM_CACHEABLE_LIMIT address [15:8] (must be byte 1 of MMIO_BASE + 0x100008)
38          DRAM_CACHEABLE_LIMIT address [7:0] (must be byte 0 of MMIO_BASE + 0x100008)
39          DRAM_CACHEABLE_LIMIT value [31:24]
40          DRAM_CACHEABLE_LIMIT value [23:16]
41          DRAM_CACHEABLE_LIMIT value [15:8]
42          DRAM_CACHEABLE_LIMIT value [7:0]
43          repeat of DRAM_BASE value [31:24]
44          repeat of DRAM_BASE value [23:16]
45          repeat of DRAM_BASE value [15:8]
46          repeat of DRAM_BASE value [7:0]
47          byte 0 of DSPCPU bootstrap program (stored at DRAM_BASE + 3)
48          byte 1 of DSPCPU bootstrap program (stored at DRAM_BASE + 2)
49          byte 2 of DSPCPU bootstrap program (stored at DRAM_BASE + 1)
50          byte 3 of DSPCPU bootstrap program (stored at DRAM_BASE + 0)
j+47        byte j of DSPCPU bootstrap program (stored at DRAM_BASE + (4 x (j div 4) + (3 – (j mod 4))))
(n–1)+47    last byte of DSPCPU bootstrap program (bits [7:0] of last 32-bit word, stored at DRAM_BASE + n – 4)
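As an illustration of how the first EEPROM byte and the bootstrap byte ordering in Table 13-5 could be interpreted, here is a small Python sketch. The function and dictionary key names are hypothetical. The bit positions assume that the fields of line 0 are listed from bit 7 down to bit 0, and the offset formula is the one implied by the explicit rows for bootstrap bytes 0 through 3 (each 32-bit word is stored most-significant byte first in the EEPROM).

```python
SDRAM_SIZE_MB = [1, 1, 2, 4, 8, 16, 32, 64]   # SDRAM size[2:0] encoding
BOOT_CLK_MHZ = [100, 75, 50, 33]              # BOOT_CLK[1:0] encoding


def decode_line0(byte: int) -> dict:
    """Decode EEPROM line 0, assuming fields run from bit 7 down to bit 0."""
    return {
        "lines_256_or_more": bool((byte >> 7) & 1),
        "sdram_size_mb": SDRAM_SIZE_MB[(byte >> 4) & 0x7],
        "boot_clk_mhz": BOOT_CLK_MHZ[(byte >> 2) & 0x3],
        "eeprom_clock_khz": 400 if (byte >> 1) & 1 else 100,
        "rapid_ate_test_mode": bool(byte & 1),
    }


def bootstrap_byte_offset(j: int) -> int:
    """Offset from DRAM_BASE at which bootstrap byte j (EEPROM line j + 47) is stored."""
    return 4 * (j // 4) + (3 - (j % 4))


if __name__ == "__main__":
    print(decode_line0(0x46))
    print([bootstrap_byte_offset(j) for j in range(8)])
```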

13.5 EEPROM ACCESS PROTOCOLS

Figure 13-3 shows the SDA (serial data) line protocols

for three types of read accesses supported by I2C serial

EEPROMs. A read from the address currently latched in-

side the EEPROM can be for either a single byte or for

an arbitrary series of sequential bytes. The master makes the choice by setting the ACK bit after a byte has been transferred.

A random-access read is accomplished by performing a

dummy write, which overwrites the latched address

stored inside the EEPROM. Once the internal address

latch is set to the desired value, one of the other two read

protocols can be used to read one or more bytes.

The boot logic inside PNX1300 uses a single random

read transaction to location 0 of device address 1010000

followed by a sequential read extension to read all re-

quired EEPROM bytes in a single pass.
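The boot read described above can be pictured as a sequence of bus events. The following Python sketch only builds that sequence as a list; the event names are hypothetical, and the single-byte word address is an assumption (the text states only that location 0 of device address 1010000 is read). Acknowledging every byte except the last follows the usual I2C convention for ending a sequential read.

```python
EEPROM_ADDR = 0b1010000  # 7-bit device address given in the text


def boot_read_sequence(byte_count: int) -> list:
    """Build the event list for a random read at location 0 extended sequentially."""
    events = [
        ("START", None),
        ("WRITE", (EEPROM_ADDR << 1) | 0),  # device address + write bit (dummy write)
        ("WRITE", 0x00),                    # word address 0 (assumed to be one byte)
        ("START", None),                    # repeated start
        ("WRITE", (EEPROM_ADDR << 1) | 1),  # device address + read bit
    ]
    for i in range(byte_count):
        is_last = i == byte_count - 1
        # The master acknowledges each byte to request the next one and
        # withholds the acknowledge after the final byte.
        events.append(("READ", "NACK" if is_last else "ACK"))
    events.append(("STOP", None))
    return events


if __name__ == "__main__":
    for event in boot_read_sequence(3):
        print(event)
```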

[Figure 13-3 diagram: SDA line protocols for a random read (dummy write followed by a read), a sequential read (Data n, Data n+1, Data n+2, Data n+3), and a current-address read (Data n), each beginning with the device address 1010.]

Figure 13-3. Protocols supported by the boot block for reading the EEPROM

Image Coprocessor Chapter 14

14.1 IMAGE COPROCESSOR OVERVIEW

In this document, the generic PNX1300 name refers

to the PNX1300 Series, or the PNX1300/01/02/11

products.

The Image Coprocessor (ICP) connects to the PNX1300 on-chip data highway to perform SDRAM block read and write actions. It also connects to the PCI interface to allow block write transactions across PCI.

The major functions of the ICP are:

• Filter an image by reading the image from SDRAM and writing the image back to SDRAM, while applying a user-defined polyphase filter with optional horizontal up- or down-scaling.

• Filter an image by reading the image from SDRAM and writing the image back to SDRAM, while applying a user-defined polyphase filter with optional vertical up- or down-scaling.

• Filter an image and convert it from planar to RGB or YUV composite by reading the image from SDRAM and writing the image out to PCI bus memory (graphics card) or SDRAM, while performing horizontal scaling and conversion to one of several RGB or YUV formats. The programmer can add optional bitmap masking to selectively enable/disable pixel writes to PCI (to refresh only the exposed part of a video window) and an optional image overlay with alpha blending and optional chroma keying (PCI output only).

• Move an image by reading the image from SDRAM

and writing it back to SDRAM.

All of the ICP functions move and transform data from

memory to memory or memory to the PCI bus. Hence,

the DSPCPU can use the ICP in a time-sharing fashion

to simultaneously achieve:

1. Vertical and horizontal resizing/subsampling on the image stream from the Video In (VI) unit.

2. Vertical and horizontal resizing/upsampling on the image stream sent to the Video Out (VO) unit.

3. Presentation of a collection of live video windows with programmable up and down scaling and arbitrary overlap configuration on PCI graphics cards (see footnote 1).

Full 2D scaling and filtering requires two passes over the data: one for horizontal scaling and filtering and one for vertical scaling and filtering.

Figure 14-1 shows a block diagram of the PNX1300 with the ICP. Figure 14-2 shows a block diagram of the internal structure of the ICP. The ICP contains a 5-tap filter, a YUV to RGB converter, an overlay and alpha blending unit, and an output formatter. These blocks communicate with each other through FIFOs that also buffer the block data to and from the PNX1300 Data Highway. The ICP uses a microprogram-controlled sequencer to control its internal timing. The program for this sequencer is in a table in SDRAM. The ICP reads the appropriate portion from the SDRAM each time the ICP is commanded to perform a function. Microprogram control simplifies and minimizes the ICP hardware and increases the flexibility of the ICP to perform additional tasks without adding hardware.

14.2 REQUIREMENTS

14.2.1 Functions

The major functions of the ICP include:

1. Read an image from SDRAM and write the image

back to SDRAM, while applying a user defined

polyphase filter with optional up or down scaling in

horizontal direction.

2. Read an image from SDRAM and write the image

back to SDRAM, while applying a user defined

polyphase filter with optional up or down scaling in

vertical direction.

3. Read an image from SDRAM and write the image out to PCI bus memory (graphics card) or SDRAM, while performing horizontal scaling and conversion to one of several RGB and YUV formats. The PCI output mode includes optional bitmap masking to selectively enable/disable pixel writes to PCI (to refresh only the exposed part of a video window) and optional RGB overlay with alpha blending and optional chroma keying.

14.2.2 Bandwidth

ICP bandwidth can be estimated from the worst-case image processing bandwidth. If the worst-case image is 1024 x 768 at 30 Hz in YUV 4:2:2 format, the pixel rate is 1024 x 768 x 30 = 23.59 Mpix/sec. For YUV 4:2:2 image coding at 2 bytes per pixel, this is approximately 47.19 MB/sec. The minimum bandwidth for the ICP function is therefore 47.19 MB/sec., or approximately 50 MB/sec.

Footnote 1: Note that functions 2 and 3 don’t normally occur simultaneously, and if an application attempts both simultaneously, some performance limitations are incurred.
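The worst-case bandwidth estimate in Section 14.2.2 is plain arithmetic and can be reproduced as follows. The function name is hypothetical, and MB is taken as 10^6 bytes, which is what the figures in the text imply.

```python
def icp_pass_bandwidth(width: int, height: int, frame_rate_hz: int,
                       bytes_per_pixel: int) -> int:
    """Bytes per second needed to stream one image pass at the given frame rate."""
    return width * height * frame_rate_hz * bytes_per_pixel


if __name__ == "__main__":
    pixel_rate = 1024 * 768 * 30
    bandwidth = icp_pass_bandwidth(1024, 768, 30, 2)  # YUV 4:2:2 at 2 bytes/pixel
    print(f"{pixel_rate / 1e6:.2f} Mpix/sec, {bandwidth / 1e6:.2f} MB/sec")
```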

[Figure 14-1 diagram: the PNX1300 chip, showing the DSPCPU, Image Coprocessor, VLD, Video DMA In, Video Out, Audio DMA In, Audio DMA Out, SSI, I2C Interface, JTAG, Clock, Memory Controller (SDRAM), and PCI Master/Slave Interface (PCI Local Bus), connected by the internal Highway.]

Figure 14-1. PNX1300 chip block diagram

[Figure 14-2 diagram: the Image Coprocessor, showing the FIFO bank, 5-tap filter, YUV => RGB conversion, overlay + alpha blending + chroma keying, output formatting + bit masking, and the microprogram control unit, with connections to the PNX1300 Data Highway (SDRAM, microcode, overlay, bit mask) and to PCI.]

Figure 14-2. Image coprocessor block diagram

Scaling and filtering of the two-dimensional image requires two passes of the image data through the filter, one for vertical and one for horizontal. Scaling an image and sending it to the PCI bus requires three transfers of the image over the SDRAM bus: one transfer to read the image for vertical filtering, one transfer to write the filtered data back, and one transfer to read the image for horizontal filtering and output to the PCI bus. This means an average SDRAM bus bandwidth of 3 x 50 = 150 MB/sec for the 1024 x 768 image case described above, assuming a scaling factor of 1.0. A larger or smaller scaling factor means that either the input or output image will be smaller than 1024 x 768. The bandwidths required are determined by the larger of the two images, input or output. This is because all input pixels must be scanned to generate all the output pixels.
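A rough way to apply the rule above is to charge each of the three SDRAM transfers at the size of the larger image. The sketch below does exactly that; it is an upper-bound estimate under that simplifying assumption, with a hypothetical function name, and it does not model the actual size of each intermediate image.

```python
def sdram_bandwidth_estimate(in_size, out_size, frame_rate_hz,
                             bytes_per_pixel=2, transfers=3):
    """Estimate SDRAM bytes/sec for scale-and-output-to-PCI.

    in_size and out_size are (width, height) tuples. Every transfer is
    charged at the larger of the two images, which is a simplification.
    """
    larger_pixels = max(in_size[0] * in_size[1], out_size[0] * out_size[1])
    return transfers * larger_pixels * frame_rate_hz * bytes_per_pixel


if __name__ == "__main__":
    same = sdram_bandwidth_estimate((1024, 768), (1024, 768), 30)
    up = sdram_bandwidth_estimate((512, 384), (1024, 768), 30)
    print(f"{same / 1e6:.1f} MB/sec, {up / 1e6:.1f} MB/sec")
```

With the unrounded 47.19 MB/sec per pass, the scaling-factor-1.0 case comes to about 141.6 MB/sec; the text's 150 MB/sec uses the rounded 50 MB/sec figure.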

14.2.3 Image Size and Scaling

Image sizes in the PNX1300 have a nominal range of 16 x 16 to 1024 x 768. Sizes smaller than 16 x 16 are possible, but are too small to be recognizable images. Images larger than 1024 x 768 (up to 64 K x 64 K) are possible, but they cannot be processed in real time and require larger SDRAM sizes. Scaling factors have a nominal range of 1/4 (down scaling by 4) to 4 (upscaling by 4). Larger up and down scaling factors are possible, up to 1000 and beyond; however, very large upscaling factors result in a large magnification of a few pixels, and very large down scaling factors give only a few pixels as a result.

14.3 INTERFACE

The ICP unit has no PNX1300 external pins. It interfaces internally to the Data Highway and the PCI Interface.

14.4 DATA FORMATS

The ICP unit accepts input and overlay image data to generate output image data. The ICP accommodates a variety of formats for the input, overlay and output data. These image data formats define the relationship between the Y, U, and V or R, G, and B components of the image as they are stored in memory. The ICP accepts input image data in planar format, where the Y, U and V components are in separate tables in SDRAM. The various input image data formats differ in the position of the U and V components relative to the Y component and the amount of U and V data relative to the Y data.

In all modes except the YUV to RGB conversion modes, each ICP operation processes one Y, U, or V image component. Three separate commands are required to process all three components of an image. Since each component is scaled and filtered separately, the software defines the image format and format conversion by how it scales each component.

For pixel format conversion for PCI or SDRAM output mode, each output pixel is a combination of RGB or YUV components as defined by the output format. The YUV input data and the RGB or YUV overlay data are combined by the ICP hardware pixel by pixel to form the RGB or YUV output pixels. Because all three YUV components are simultaneously woven together to create each output pixel, the ICP hardware must know the image data format in SDRAM, defined as how the components of the image data are to be found and combined.

In the YUV to RGB conversion mode, the ICP accepts the following input data formats: YUV 4:2:2 co-sited, YUV 4:2:2 interspersed, and YUV 4:2:0. In this mode, the ICP will also accept image overlay data when PCI output is specified. The ICP accepts image overlay data in several combined formats: RGB 24+α, RGB 15+α, and YUV 4:2:2+α. In this mode, the ICP generates output data in several RGB and YUV formats. These formats are compatible with a wide variety of PCI frame buffers.

14.4.1 Image Input Formats

The ICP image input formats define the relative positions of the Y component and the U and V components of the input image pixel data. There are three input formats to the ICP: 4:2:2 co-sited, 4:2:2 interspersed, and 4:2:0 interspersed. The 4:2:2 formats have 2 U and 2 V pixels for every 4 Y pixels, so the ratio of Y to U or V is 2:1. The 4:2:0 format has 1 U and 1 V pixel for every 4 Y pixels, so the ratio of Y to U or V is 4:1. The input formats are given below. The input formats have a significant impact on the 2-dimensional scaling operation.

14.4.1.1 YUV 4:2:2 Co-Sited

In the YUV 4:2:2 co-sited format, the U and V pixels co-

incide with the Y pixel on every other pixel, as shown in

Figure 14-3.

14.4.1.2 YUV 4:2:2 Interspersed

In the YUV 4:2:2 interspersed format, the U and V pixels lie between the Y pixels on every other pixel of the horizontal line, as shown in Figure 14-4.

14.4.1.3 YUV 4:2:0 XY Interspersed

In the YUV 4:2:0 interspersed format, the U and V pixels lie between the Y pixels on every other pixel of the horizontal line, as shown in Figure 14-5.

14.4.1.4 YUV 4:1:1 Co-Sited

In the YUV 4:1:1 co-sited format, the U and V pixels co-

incide with the Y pixel on every fourth pixel, as shown in

Figure 14-6.
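Because the ICP reads Y, U and V from separate planes, the plane dimensions follow directly from the subsampling ratios given above. The sketch below is illustrative only: the names are hypothetical, and the split of the 4:2:0 ratio into 2:1 horizontally and 2:1 vertically is the conventional interpretation of that format rather than something stated in this section.

```python
# (horizontal divisor, vertical divisor) for the U and V planes
SUBSAMPLING = {
    "4:2:2": (2, 1),  # one U,V sample per two Y samples along a line
    "4:2:0": (2, 2),  # assumed: halved horizontally and vertically
    "4:1:1": (4, 1),  # one U,V sample per four Y samples along a line
}


def plane_sizes(width: int, height: int, fmt: str) -> dict:
    """Return (width, height) of the Y, U and V planes of a planar image."""
    h_div, v_div = SUBSAMPLING[fmt]
    chroma = (width // h_div, height // v_div)
    return {"Y": (width, height), "U": chroma, "V": chroma}


if __name__ == "__main__":
    for fmt in SUBSAMPLING:
        print(fmt, plane_sizes(720, 480, fmt))
```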

Figure 14-3. 4:2:2 Co-sited input format

Figure 14-4. 4:2:2 Interspersed input format

Figure 14-5. 4:2:0 XY Interspersed input format

Figure 14-6. 525-60 YUV 4:1:1 Co-Sited input format

[Figures 14-3 through 14-6 each show the positions of the chrominance (U,V) samples relative to the luminance samples.]

14.4.2 Image Overlay Formats

The ICP accepts image overlay data in three formats, RGB 24+α, RGB 15+α, and YUV-4:2:2+α, as shown in Table 14-1. The overlay image format must be the same type as the output image format generated by the ICP for the main image. For example, if the output image is one of the RGB formats, the overlay must be one of the two RGB overlay formats, RGB 24+α or RGB 15+α. If the output image format is YUV, the overlay must be in YUV-4:2:2+α format. The formats must be of the same type because the ICP does no conversion on the overlay data.

In RGB 24+α, pixels are packed 1 pixel/word; a full byte of alpha information (stored in the most significant byte) is included with each pixel. In RGB 15+α, one bit of alpha is included for each pixel. The pixels in the overlay image are packed as 2 pixels per 32-bit word, and the alpha bit is the most significant bit of each half word. In the same manner, the YUV-4:2:2+α format packs two pixels into one 32-bit word, and has one bit of alpha for each pixel. The least significant bit of the U and V components supplies the alpha bit for the Y0 and Y1 pixels, respectively. The alpha bit in these formats selects between two alpha values stored in the ICP, alpha 1 and alpha 0. The alpha 1 and alpha 0 values are loaded from the parameter block when the ICP is started.
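To make the two packed one-bit-alpha overlay layouts concrete, the following Python sketch unpacks a 32-bit overlay word into its fields. The function names and dictionary keys are hypothetical. The field positions follow the description above and Table 14-1: pixel 1 occupies the upper half word, pixel 0 the lower, and in the YUV layout the low bit of the U and V bytes carries the alpha bit.

```python
def unpack_rgb15_alpha(word: int):
    """Split a 32-bit RGB 15 + alpha word into (pixel 0, pixel 1)."""
    def half(h):
        return {
            "alpha_bit": (h >> 15) & 1,   # most significant bit of the half word
            "r": (h >> 10) & 0x1F,
            "g": (h >> 5) & 0x1F,
            "b": h & 0x1F,
        }
    return half(word & 0xFFFF), half((word >> 16) & 0xFFFF)


def unpack_yuv422_alpha(word: int) -> dict:
    """Split a 32-bit YUV 4:2:2 + alpha word into its components."""
    u_byte = word & 0xFF
    v_byte = (word >> 16) & 0xFF
    return {
        "y0": (word >> 8) & 0xFF,
        "y1": (word >> 24) & 0xFF,
        "u": u_byte & 0xFE,           # u7..u1; the low bit is not chroma data
        "v": v_byte & 0xFE,           # v7..v1
        "alpha_bit_y0": u_byte & 1,   # LSB of U selects the alpha for Y0
        "alpha_bit_y1": v_byte & 1,   # LSB of V selects the alpha for Y1
    }


if __name__ == "__main__":
    print(unpack_rgb15_alpha(0x80007FFF))
    print(unpack_yuv422_alpha(0x5081407E))
```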

14.4.3 Alpha Blending Codes

Image overlay uses alpha blending, which combines the overlay image with the main image according to the alpha value. The alpha value is supplied by the alpha byte in RGB 24+α format and by the alpha registers, Alpha 0 and Alpha 1, in the other formats. The alpha code format is shown in Table 14-2.

14.4.4 Output Formats

The output formats are the RGB image formats sent to

the PCI interface or SDRAM. These formats are shown

in Table 14-3. Note: B1 = Byte 1 of blue = [b7...b0]1.

Table 14-1. Image Overlay Formats

Format         Bits 31-24    Bits 23-16     Bits 15-8    Bits 7-0
RGB 24+α       a7 - a0       r7 - r0        g7 - g0      b7 - b0
YUV-4:2:2+α    Y1            (v7-v1) + α    Y0           (u7-u1) + α

Format         Bits 31-16 (Pixel 1)                              Bits 15-0 (Pixel 0)
RGB 15+α       α r4 r3 r2 r1 r0 g4 g3 g2 g1 g0 b4 b3 b2 b1 b0    α r4 r3 r2 r1 r0 g4 g3 g2 g1 g0 b4 b3 b2 b1 b0

Table 14-2. Alpha Blending Codes

Alpha Code    Alpha Value    Image    Overlay
00h           0              100%     0%
20h           32             75%      25%
40h           64             50%      50%
60h           96             25%      75%
80h - FFh     128-255        0%       100%
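The codes in Table 14-2 are consistent with an overlay weight of alpha/128, saturating at 100% for codes of 80h and above. The sketch below applies that reading to a single 8-bit component. It is an assumption-laden illustration: the table lists only five code points, so the linear behaviour between them and the rounding are guesses, and the function names are hypothetical.

```python
def overlay_weight(alpha_code: int) -> float:
    """Fraction of the overlay in the result; saturates at 1.0 for codes >= 0x80."""
    return min(alpha_code, 128) / 128


def blend(image: int, overlay: int, alpha_code: int) -> int:
    """Blend one 8-bit component of the main image with the overlay."""
    w = overlay_weight(alpha_code)
    return round((1 - w) * image + w * overlay)


if __name__ == "__main__":
    for code in (0x00, 0x20, 0x40, 0x60, 0x80, 0xFF):
        print(f"{code:02X}h: overlay {overlay_weight(code):.0%}, "
              f"blend(200, 100) = {blend(200, 100, code)}")
```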

Table 14-3. Output Data Formats

4 Pixels/Word
Format           Word    Bits 31-24 (Pixel 3)       Bits 23-16 (Pixel 2)       Bits 15-8 (Pixel 1)        Bits 7-0 (Pixel 0)
RGB 8A: 233      1       r1 r0 g2 g1 g0 b2 b1 b0    r1 r0 g2 g1 g0 b2 b1 b0    r1 r0 g2 g1 g0 b2 b1 b0    r1 r0 g2 g1 g0 b2 b1 b0
RGB 8R: 332      1       r2 r1 r0 g2 g1 g0 b1 b0    r2 r1 r0 g2 g1 g0 b1 b0    r2 r1 r0 g2 g1 g0 b1 b0    r2 r1 r0 g2 g1 g0 b1 b0

2 Pixels/Word
Format           Word    Bits 31-16 (Pixel 1)                                 Bits 15-0 (Pixel 0)
RGB 15+α         1       α r4 r3 r2 r1 r0 g4 g3 g2 g1 g0 b4 b3 b2 b1 b0       α r4 r3 r2 r1 r0 g4 g3 g2 g1 g0 b4 b3 b2 b1 b0
RGB-16           1       r4 r3 r2 r1 r0 g5 g4 g3 g2 g1 g0 b4 b3 b2 b1 b0      r4 r3 r2 r1 r0 g5 g4 g3 g2 g1 g0 b4 b3 b2 b1 b0

1 Pixel/Word
Format           Word    Bits 31-24    Bits 23-16    Bits 15-8    Bits 7-0
RGB 24+α         1       a7 - a0       r7 - r0       g7 - g0      b7 - b0

Packed 4 Pixels/3 Words
Format           Word    Bits 31-24    Bits 23-16    Bits 15-8    Bits 7-0
RGB 24-packed    1       B1            R0            G0           B0
                 2       G2            B2            R1           G1
                 3       R3            G3            B3           R2

Packed 2 Pixels/Word
Format           Word    Bits 31-24    Bits 23-16    Bits 15-8    Bits 7-0
YUV-4:2:2        1       Y1            V0            Y0           U0
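Two of the layouts in Table 14-3 are easy to misread in flattened form, so here is a short Python sketch that packs pixels into the RGB 24-packed and RGB-16 word layouts. The function names are hypothetical and the code only mirrors the byte and bit positions listed in the table; it says nothing about how the ICP hardware produces them.

```python
def pack_rgb24_packed(pixels):
    """Pack four (r, g, b) 8-bit pixels into three 32-bit words."""
    (r0, g0, b0), (r1, g1, b1), (r2, g2, b2), (r3, g3, b3) = pixels

    def word(bits_31_24, bits_23_16, bits_15_8, bits_7_0):
        return (bits_31_24 << 24) | (bits_23_16 << 16) | (bits_15_8 << 8) | bits_7_0

    return [
        word(b1, r0, g0, b0),  # word 1: B1 R0 G0 B0
        word(g2, b2, r1, g1),  # word 2: G2 B2 R1 G1
        word(r3, g3, b3, r2),  # word 3: R3 G3 B3 R2
    ]


def pack_rgb16(pixel0, pixel1):
    """Pack two (r5, g6, b5) pixels into one word, pixel 1 in the upper half."""
    def half(p):
        r, g, b = p
        return ((r & 0x1F) << 11) | ((g & 0x3F) << 5) | (b & 0x1F)
    return (half(pixel1) << 16) | half(pixel0)


if __name__ == "__main__":
    px = [(0x10, 0x11, 0x12), (0x20, 0x21, 0x22),
          (0x30, 0x31, 0x32), (0x40, 0x41, 0x42)]
    print([hex(w) for w in pack_rgb24_packed(px)])
    print(hex(pack_rgb16((31, 0, 0), (0, 63, 0))))
```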

14.5 ALGORITHMS

14.5.1 Introduction

The ICP provides filtering, resizing (scaling) and YUV to

RGB conversion of the source image. Filtering provides

image enhancement. Scaling generates a new image

that is larger or smaller than the current image. YUV to

RGB conversion is used to generate an RGB version of

the image for output to an RGB format frame buffer

through the PCI interface or to SDRAM.

The filtering, scaling, and YUV to RGB conversion algo-

rithms are discussed separately. The ICP uses these al-

gorithms in two ways.

1. It provides one pass horizontal scaling with horizontal

5-tap filtering of Y, U, or V.

2. It provides one pass vertical scaling with vertical 5-tap

filtering of Y, U, or V.

14.5.2 Filtering

The ICP provides high quality, 5-tap polyphase filtering, both horizontal and vertical, of Y, U, or V data. Each filter type is performed as a separate one-dimensional filter pass. Two-dimensional filtering of the image requires two passes of the one-dimensional filters.

Multi-tap FIR filtering

In multi-tap FIR filtering of an image, the new filter output

(pixel) value is a weighted sum of adjacent pixels. The

weighting coefficients determine the type of filtering

used. A 5-tap filter generates the new pixel value as a

weighted sum of the current value and the two pixels on

either side (2 left and 2 right for horizontal filtering, 2

above and 2 below for vertical).

A multi-tap FIR filter can be used to generate values for new pixels that are displaced from the original (‘center’) pixel in the same way as linear interpolation. For example, assume the new pixel location is shifted slightly to the right of the center pixel of the input image. A horizontal filter can be used to estimate the new pixel value by weighting the right pixel filter coefficients more heavily than the left, proportional to the relative position offset of the new pixel. (In this sense, interpolation is a 2-tap filter.) This is shown in Figure 14-7. The ICP horizontal and vertical filter operations use this method to combine scaling with filtering.

Mirroring pixels at the start and end of a line or window

A line may start and/or end at the edge of the input image. In this case, the two start and/or end pixels needed for the first and last pixels of the line, respectively, are missing. The ICP uses pixel mirroring to solve this problem. In pixel mirroring, the two available pixels are used to substitute for the two missing pixels. The first pixel uses copies of the two pixels to the right as though they were the two pixels to the left. Specifically, P+2 substitutes for P-2, and P+1 substitutes for P-1. The last pixel uses copies of the two pixels to the left as though they were the two pixels to the right. Since the left and right pixels are now the same, this is called pixel mirroring.

There are five states of pixel mirroring: first output pixel, second output pixel, middle pixels, next-to-last output pixel, and last output pixel. The first output pixel uses pixels numbered (2, 1, 0, 1, 2). The second pixel uses (1, 0, 1, 2, 3). The middle pixels use (P-2, P-1, P, P+1, P+2). The next-to-last pixel uses (N-3, N-2, N-1, N, N-1), where N is the number of the last input pixel. The last pixel uses (N-2, N-1, N, N-1, N-2).
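The five mirroring states listed above all follow from one rule: reflect any tap index that falls outside the line back across the edge pixel. The Python sketch below shows that rule and uses it in a 5-tap weighted sum. The function names are hypothetical, the coefficients in the usage lines are arbitrary examples rather than ICP filter coefficients, and the line is assumed to hold at least three pixels.

```python
def mirrored_taps(p: int, last: int) -> list:
    """Return the five input pixel indices used for center pixel p.

    'last' is the index N of the last input pixel. Indices left of pixel 0
    or right of pixel N are reflected across the edge pixel.
    """
    taps = []
    for i in range(p - 2, p + 3):
        if i < 0:
            i = -i                # P-1 -> P+1, P-2 -> P+2 at the start of a line
        elif i > last:
            i = 2 * last - i      # N+1 -> N-1, N+2 -> N-2 at the end of a line
        taps.append(i)
    return taps


def filter_pixel(line, p, coeffs):
    """5-tap FIR output: weighted sum of the center pixel and two on either side."""
    return sum(c * line[i] for c, i in zip(coeffs, mirrored_taps(p, len(line) - 1)))


if __name__ == "__main__":
    n = 9
    for p in (0, 1, 4, n - 1, n):
        print(p, mirrored_taps(p, n))
    print(filter_pixel([10, 20, 30, 40, 50], 0, [1, 1, 1, 1, 1]))
```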

In some cases of upscaling, one more input pixel may be

needed at the end of the line. In these cases, the pixel

value(s) are not generated by the mirror logic. Instead,

the ICP uses a copy of the last output pixel as the best

estimate of the required output pixel.

14.5.3 Scaling

Scaling overview

Resizing, or scaling, the image means generating a new image that is larger or smaller than the original. The new image will have a larger or smaller number of pixels in the horizontal and/or vertical directions than the original image. A larger image is scaling up (more new pixels); a smaller image is scaling down (fewer new pixels). A simple case is a 2:1 increase or decrease in size. A 2:1 decrease could be done by throwing away every other pixel (although this simple method results in poor image quality). A 2:1 increase is more interesting. The new pixels can be generated in between the old ones by:

1. Duplicating the original pixels

2. Linear interpolation, where the new in-between pixels are the weighted average of the adjacent input pixels

3. Multi-tap filtering, where the new in-between pixels are a multi-pixel filtered version of the adjacent input pixels. This approach results in the best image.

[Figure 14-7 diagram: a row of input pixels and a row of output pixels, showing an output pixel generated by interpolation (uses 2 input pixels) and by filtering (uses 5 input pixels).]

Figure 14-7. Pixel generation by interpolation and filtering

The more general case is where the output image resolution is not an integral multiple or sub-multiple of the input image resolution, such as converting from 640 x 480 to 1024 x 768. In this case, the output pixels have differing positions relative to the input pixels in the horizontal or vertical dimensions. In converting from 640 to 1024, the first output pixel on a line corresponds to the first input pixel. The second output pixel is at 640/1024 of the distance between the first and second input pixels. The third output pixel is at (2*640)/1024 = 1280/1024 = 1 + 256/1024, that is, at 256/1024 of the distance between the second and third input pixels, etc. The output pixels shift with respect to the input pixel grid as you move along the line in the horizontal or vertical dimensions. This is shown in Figure 14-8.

New pixels are generated by interpolation or filtering of the original pixels. Interpolation is the weighted average of the input pixels adjacent to the output pixel. Filtering extends interpolation to include input pixels beyond the input pair adjacent to the output pixel. The number of pixels used to generate the output defines the filter type. Interpolation is a 2-tap filter. A 4-tap filter would use the two pixels to the left and the two pixels to the right of the output pixel. A 5-tap filter identifies the single pixel nearest the output as the center pixel, and uses this pixel plus two to the left and two to the right to generate the output.

If the ratio of the output pixel count per line (in H or V) to

input pixel count per line is the ratio of small integers,

there is a repeating pattern in these relative positions of

input to output pixel locations. For example, for 640 to

1024, the ratio is 8/5. The pattern repeats for every 8 out-

put and every 5 input pixels. If the ratio is not a ratio of

small integers, the pattern will take a long time to repeat.

The worst case would be 640 to 641, for example. There would be no exact repetition for the whole line.
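The repeating pattern described above can be checked with exact rational arithmetic. In this sketch (hypothetical function names), the position of output pixel k on the input grid is k x input length / output length, and the number of distinct fractional offsets equals the output length divided by the greatest common divisor of the two lengths.

```python
from fractions import Fraction
from math import gcd


def input_position(k: int, in_len: int, out_len: int) -> Fraction:
    """Exact position of output pixel k on the input pixel grid."""
    return Fraction(k * in_len, out_len)


def phase_count(in_len: int, out_len: int) -> int:
    """Number of output pixels after which the fractional offsets repeat."""
    return out_len // gcd(in_len, out_len)


if __name__ == "__main__":
    print(input_position(1, 640, 1024), input_position(2, 640, 1024))
    print([str(input_position(k, 640, 1024) % 1) for k in range(8)])
    print(phase_count(640, 1024), phase_count(640, 641))
```

For 640 to 1024 the offsets repeat every 8 output pixels (5 input pixels), while for 640 to 641 they do not repeat within a line.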

The interpolator or filter coefficients must be weighted according to the relative position of the new pixel relative to the old pixels. The weighting factor is between 0.0 and 1.0, corresponding to the relative position of the new pixel with respect to the old pixel grid. With a repeating pattern, fewer weighting factors are needed, and therefore fewer coefficients in the linear interpolator or filter generating the new pixels, since you can reuse them each time the pattern repeats. A filter with a repeating pattern is called polyphase, indicating a repeating pattern in the phase (offset position) of the output pixels relative to the input pixels.

Generating the output pixels: relating the output grid to the

input grid

Scaling is a pixel transformation in which an array of output pixels is generated from an array of input pixels. The value of each pixel on the output pixel grid is calculated from the values of its adjacent pixels on the input grid. To find these adjacent pixels, you overlay the output grid on the input grid and align the starting pixels, X0Y0, of the two grids. To identify the adjacent input pixels for a given output pixel, you divide the output pixel X (pixel number along the output line) and Y (pixel line number within window) by their corresponding scaling factors:

Xin = Xout / (horizontal scaling factor)

where: horizontal scaling factor =

output length / input length

Yin = Yout / (vertical scaling factor)

where: vertical scaling factor =

output height / input height

Note that the resulting Xin and Yin values will be real numbers because the output pixels will usually fall between the input pixels. The fractional portion indicates the fractional distance to the next pixel. To calculate the output pixel value, you use the value for the nearest pixel to the left and above and combine it with the value of the other adjacent pixel(s). For example, horizontal interpolation uses the starting pixel to the left interpolated with the next pixel to the right, with the fractional value used to determine the weighting for the interpolation.
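The mapping Xin = Xout / (horizontal scaling factor) combined with two-pixel interpolation can be sketched for a single line as follows. This illustrates the general method described above, not the ICP's 5-tap implementation; the function name is hypothetical, and repeating the last pixel when the right-hand neighbour would fall off the line is a simplifying assumption.

```python
def scale_line_linear(src, out_len):
    """Resize one line of pixel values to out_len pixels by linear interpolation."""
    in_len = len(src)
    out = []
    for x_out in range(out_len):
        # Xin = Xout / (out_len / in_len)
        x_in = x_out * in_len / out_len
        left = int(x_in)                     # nearest input pixel to the left
        frac = x_in - left                   # fractional distance to the next pixel
        right = min(left + 1, in_len - 1)    # assumed: repeat the last pixel at the edge
        out.append((1 - frac) * src[left] + frac * src[right])
    return out


if __name__ == "__main__":
    print(scale_line_linear([0, 10, 20, 30], 8))   # 2:1 upscaling
    print(scale_line_linear([0, 10, 20, 30], 2))   # 2:1 downscaling
```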

ICP scaling output resolution

In the ICP, scaling is forced to have a repeating pattern by limiting the resolution of the new pixel position to 1/32; the new position is forced to be at a location n/32 in H and V relative to the position of the original pixel grid. This results in a worst case error of approximately 1.5% in amplitude relative to calculations using exact output pixel positions. This is comparable to the errors caused by quantizing the amplitude of the pixels. The additional quantization noise can be avoided by choosing an appropriate scale factor which, when inverted, results in fractional values which are expressed in 32nds, such as the 8/5 scaling factor in the 640 to 1024 example above. A diagram of the input to output pixel relationship and the output fractional X and Y subpixel offset is shown in Figure 14-9.

[Figure 14-8 diagram: numbered input pixels and output pixels along one line, showing how the output pixel positions shift relative to the input pixel grid.]

Figure 14-8. 640 to 1024 upscaling example

Output scaling calculation method

The output pixel distance in H and V in the ICP is calcu-

lated to high precision (16-bit fraction) even though the

output resolution is fixed at 1/32 of the input grid. Each

output pixel’s location relative to the input pixel grid is giv-

en by:

X location of output pixel = X0 of input line + output pixel number / X scale factor

Y location of output pixel = Y0 of input window + output line number / Y scale factor

The X and Y locations may not be integer values, de-

pending on the scale factor. The resulting X and Y pixel

locations can be separa ted into an integer and a fraction-

al part. The integer part of the X and Y location selects

the pixel and line number closest to the output pixel, re-

spectively. The fractional part gives the fractional dis-

tance of the output pixel to the next X and Y input pixel

values. These fractional parts are the dX and dY values

shown in Figure 14-9.
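
The location formulas above can be sketched in a few lines. This is only an illustration: the function name `output_pixel_location` and the use of exact fractions are assumptions, and the split shown is a plain floor/remainder split rather than the centered range the ICP uses internally.

```python
from fractions import Fraction

def output_pixel_location(n, scale, origin=0):
    """Return (integer part, fractional part) of output pixel n on the input grid."""
    pos = Fraction(origin) + Fraction(n) / Fraction(scale)
    integer = pos.numerator // pos.denominator   # floor of the position
    return integer, pos - integer                # fraction plays the role of dX

# 640 -> 1024 upscaling corresponds to a scale factor of 8/5
locations = [output_pixel_location(n, Fraction(8, 5)) for n in range(9)]
for n, (integer, dx) in enumerate(locations):
    print(n, integer, dx)
```

With the 8/5 factor the fractional parts cycle through multiples of 1/8, and the pattern repeats every 8 output pixels (5 input pixels).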

The output pixel value can be calculated by interpolation

between the two input pixels or by 5-tap filtering using the

5 nearest pixels rather than the 2 nearest pixels. Interpolation

or filtering uses the fractional position values, dX

and dY, to select the appropriate filter coefficients. In the

ICP, these values are limited to 5 bits for a resolution of

1/32, even though the actual position value has much

higher resolution. The ICP uses fractional values cen-

tered around the center pixel with a range of -16/32 to

+15/32.

To perform scaling, the X and Y locations of the output

pixel relative to the input pixel grid must be generated.

This includes both the integer part to locate the adjacent

pixels and the fractional part to choose the filter coefficients

which generate the output value from the adjacent

pixels. This could be done by generating the output pixel

X and Y numbers and dividing each by its associated

scale factor. Since dividing is expensive in hardware and

time, the ICP effectively multiplies the X and Y pixel numbers

by the inverse of the X and Y scaling factors, respectively.

This is done by incrementing the X and Y input pixel

counters by X and Y increment values that are the inverses

of the X and Y scale factors, respectively. For output pixel

Xn, the inverse of the scale factor is added to the X input

location n times. This is equivalent to multiplying n by the

inverse of the scale factor.

The ICP uses a 16-bit integer and a 16-bit fractional value

for the X and Y increment values. This allows a fractional

value resolution of 1/64K. Since the increment value will

be added 1024 times in a 1024-pixel line, any error in an

individual calculation will be multiplied by 1024. The high

resolution of the calculation prevents an accumulation of

error as you increment along the line.
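
The effect of the 16-bit fraction can be checked numerically. The sketch below assumes the increment is the inverse scale factor truncated to 16 fractional bits; whether the hardware or driver truncates or rounds is not stated here, and the helper names are hypothetical.

```python
from fractions import Fraction

FRAC_BITS = 16
ONE = 1 << FRAC_BITS

def fixed_increment(scale):
    """Inverse of the scale factor, truncated to 16 fractional bits (assumed)."""
    return int(ONE / Fraction(scale))

def accumulated_error(scale, n_pixels):
    """Position error, in input pixels, after n_pixels increments."""
    exact = Fraction(n_pixels) / Fraction(scale)
    approx = Fraction(fixed_increment(scale) * n_pixels, ONE)
    return exact - approx

err_8_5 = accumulated_error(Fraction(8, 5), 1024)  # 5/8 is exact in 16 bits
err_3 = accumulated_error(3, 1024)                 # 1/3 is not
```

For a scale factor of 3 the error after 1024 outputs is 1/192 of an input pixel, well below the 1/32 step seen by the coefficient RAMs.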

Only the most significant 5 bits of the fractional value are

used by the filter coefficient RAMs. However, the X and

Y counters are incremented by the high-resolution X and

Y increment values. The result of this truncation is a

worst case error of approximately 1.5% in amplitude rel-

ative to arbitrary pixel output positions.

The error caused by discrete (1/32) resolution can be reduced

to exactly zero if the output image size is adjusted

to have a repeating pattern that fits on these 1/32 bound-

aries. For zero error, this implies that the scaling factor

must be of the form of B/A, where B (the output pixel

count factor) is a sub-multiple of 32 [i.e. 1, 2, 4, 8, 16, 32],

and A (the input pixel count factor) is an integer deter-

mined by the nearest acceptable scale factor for a given

B. In the 640 to 1024 conversion case, the B/A ratio was

8/5, meeting this requirement.
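
One way to test a candidate scale factor against this rule is to check that the inverse scale factor is a whole number of 32nds. The helper name `lands_on_32nds` is hypothetical.

```python
from fractions import Fraction

def lands_on_32nds(scale):
    """True if every output position n/scale falls on a multiple of 1/32."""
    step = 1 / Fraction(scale)            # input-grid distance between outputs
    return (step * 32).denominator == 1

ok_640_to_1024 = lands_on_32nds(Fraction(1024, 640))   # reduces to 8/5
ok_3_to_2 = lands_on_32nds(Fraction(3, 2))             # step 2/3: not on 32nds
```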

The integer values, if accumulated, would be equal to the

total number of input pixels when scaling is complete.

The integer values for each pixel define the number of

pixels to read from memory and shift in to generate the

next output pixel. For example, a scaling factor of 1.0 will

result in one pixel shifted in for each output pixel gener-

ated. Upscaling will have integer increment values of

less than one. This means that the integer value will be

‘0’ for some pixels and ‘1’ for others. For example, up-

scaling by 2.0 will result in integer values of ‘1’ half the

time and ‘0’ for the other half, depending on the carry out

from the fractional increment.
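
The relationship between the increment value and the number of pixels shifted in can be sketched as an accumulator with carry. The function below is an illustrative model, not the hardware sequence; the starting fraction of 0 and the name `shift_counts` are assumptions.

```python
FRAC_BITS = 16
MASK = (1 << FRAC_BITS) - 1

def shift_counts(inc_int, inc_frac, n_outputs):
    """Pixels to shift in per output: integer increment plus fraction carry."""
    lsb = 0                                # assumed starting fraction
    counts = []
    for _ in range(n_outputs):
        total = lsb + inc_frac
        carry = total >> FRAC_BITS         # 1 when the fraction wraps
        lsb = total & MASK
        counts.append(inc_int + carry)
    return counts

up2 = shift_counts(0, 0x8000, 8)    # upscale by 2.0, increment 0.5
unity = shift_counts(1, 0, 4)       # no scaling, increment 1.0
down = shift_counts(2, 0x8000, 4)   # downscale by 2.5, increment 2.5
```

The counts sum to the number of input pixels consumed, which matches the statement that the accumulated integer values equal the total input pixel count.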

Pixel shift bypassing for large down scaling

Down scaling will have integer increment values greater

than one. In this case, the integer value indicates the

number of pixels to read to obtain filter pixels for the next

output pixels. There are two ways to read and shift in the

pixels for down scaling: shift all and shift bypass. In the

shift all mode (the default mode) all five pixels are shifted

for each input value read and shifted in. Shift all mode

uses the five input pixels nearest the output pixel, inde-

pendent of scaling factor. In the shift bypass case, only

the last pixel is shifted in. For example, in a down scaling

of 10, nine pixels are read and the 10th pixel is shifted in

to the filter. Shift bypass mode is used for large down

scaling, i.e. down scaling factors of 2.0 or greater. The

shift bypass mode is selected by setting the GETB bit in

the parameter table. It uses input pixels that are nearest

the output pixel and those nearest each of the four output

pixels adjacent to the output pixel. The shift bypass

mode also forces the coefficient RAM inputs to ‘0’, since

interpolation between adjacent input pixels is no longer

being performed.

Figure 14-9. ICP 1/32 output resolution (input pixels and output pixels)

Using scaling to convert from YUV 4:2:0 to YUV 4:2:2

YUV information in the 4:2:0 format has the UV pixels offset

from the input grid in both X and Y. Also, the U and V

pixels are at 1/2 of the horizontal and 1/2 of the vertical

frequencies of the Y pixels. This means the UV pixels

must be filtered and additionally scaled in both X and Y

in order to line up with the output Y pixels even if no initial

scaling is done. To generate 4:2:2 interspersed data,

vertically up-scale U and V by a factor of 2 with a start offset

of -1/4 pixel. Up scaling by 2 generates the additional

lines required, and starting with a -1/4 pixel offset (relative

to U, V space) moves the output up to the same line

as the Y pixels. To generate 4:2:2 co-sited, then filter horizontally

with no scaling factor but with a start offset of

-1/4 pixel, moving the output left 1/4 pixel.

14.5.4 YUV to RGB Conversion

In the ICP, YUV to RGB conversion is done by sequentially

processing triplets of Y, U, and V pixel data to convert

the pixels to an internal YUV 4:4:4 format and applying

the YUV to RGB conversion algorithm on the YUV

4:4:4 pixels. The results of this conversion normally go to

the PCI bus but can also go back to SDRAM.

YUV to RGB conversion has two steps. First the Y, U and

V pixel data are used to generate an RGB pixel at the

output location. When the Y, U, and V pixels are ready,

YUV to RGB conversion is performed using the following

algorithms:

R = Y + 1.375(V) = Y + (1 + 3/8)(V)

G = Y - 0.34375(U) - 0.703125(V) = Y - (11/32)(U) - (45/64)(V)

B = Y + 1.734375(U) = Y + (1 + 47/64)(U)

In CCIR601, the U and V values are offset by +128 by in-

verting the most significant bit of the 8-bit byte. This is the

way the U and V values are stored in SDRAM. The above

algorithms assume that the U and V values are convert-

ed back to normal signed two’s complement values by in-

verting the MSB before being used.
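
A minimal sketch of the conversion, using the fractional coefficients above over a common denominator of 64. The clamping to 0..255 and the floor rounding are assumptions made for the example; the rounding and saturation behavior of the ICP is not described here.

```python
def to_signed(stored):
    """Invert the MSB of a stored U or V byte and read it as two's complement."""
    value = stored ^ 0x80
    return value - 256 if value & 0x80 else value

def clamp(x):
    """Limit a result to the 8-bit range (assumed behavior)."""
    return max(0, min(255, x))

def yuv_to_rgb(y, u_stored, v_stored):
    u = to_signed(u_stored)
    v = to_signed(v_stored)
    r = y + (v * 88) // 64                     # 1 + 3/8   = 88/64
    g = y - (u * 22) // 64 - (v * 45) // 64    # 11/32     = 22/64
    b = y + (u * 111) // 64                    # 1 + 47/64 = 111/64
    return clamp(r), clamp(g), clamp(b)

grey = yuv_to_rgb(100, 128, 128)      # stored 128 means U = V = 0
reddish = yuv_to_rgb(100, 128, 192)   # V = +64
```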

14.5.5 Overlay and Alpha Blending

The ICP can add an overlay image to the main image

when in the horizontal filter to RGB/YUV mode with PCI

output. The overlay image is a user-defined rectangle

within the main image. When the overlay is active, each

overlay pixel is combined with each main image pixel to

generate the resulting pixel to be displayed. Each pixel

combination is controlled by an alpha value which deter-

mines the proportions of overlay and main image that

contribute to the output pixel. The relation is given by:

Pout = (alpha) * Poverlay + (1 - alpha) * Pmain

     = (alpha) * (Poverlay - Pmain) + Pmain

where: alpha ranges from 0 to 1

In the ICP, the alpha value range is limited by the hard-

ware to five values: {0.0, 0.25, 0.50, 0.75, 1.0}.
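
With alpha restricted to quarters, the second form of the blend relation needs only a subtraction, a small multiply and a shift. The sketch below takes alpha as a count of quarters (0 to 4); how an 8-bit alpha value is mapped onto the five levels is not specified here, so that mapping is left out.

```python
def blend(p_overlay, p_main, alpha_quarters):
    """Pout = alpha * (Poverlay - Pmain) + Pmain, with alpha = alpha_quarters / 4."""
    if alpha_quarters not in (0, 1, 2, 3, 4):
        raise ValueError("alpha must be 0, 1/4, 1/2, 3/4 or 1")
    return p_main + (alpha_quarters * (p_overlay - p_main)) // 4

main_only = blend(200, 100, 0)
half = blend(200, 100, 2)
overlay_only = blend(200, 100, 4)
```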

An alpha value is supplied for each overlay pixel. In the

RGB 24+ overlay data format: an 8-bit alpha value is

contained within the overlay data.

In all other overlay data formats (RGB 15+, etc.), an al-

pha bit in the overlay data determines the alpha value.

The alpha bit selects between two 8-bit values, alpha 1

and alpha 0, supplied by a pair of internal ICP registers.

These registers are loaded from the parameter block

when the ICP is started. When the alpha bit is ‘1’, alpha

1 value is used as the alpha value; when the alpha bit is

‘0’, alpha 0 is used as the alpha value. The two alpha reg-

isters allow translucent images and backgrounds while

being restricted to one bit per pixel for alpha selection.

Alpha blending has several uses.

1. Alpha can be used to disable portions of the overlay,

called keying. When the alpha for a pixel is ‘0’, there

is no overlay. When the alpha is ‘1’, the overlay is

100%, replacing the image. This allows the user to put

an irregular shaped object in an image without showing

the bounding rectangle of the overlay.

2. Alpha blending allows translucent (smoky) backgrounds

and/or translucent (ghostly) overlay images.

3. Using alpha at the edges of small images such as font

characters increases their effective visual resolution.

Chroma keying

The ICP also optionally provides a restricted form of

chroma keying sometimes called color keying. When the

overlay Y value is ‘0’ (an illegal value in the YUV 4:2:2+

format) or the RGB values are all ‘0’ (RGB15+ format),

the alpha value is forced to ‘0’ and no overlay or blending

occurs. This provides three levels of overlay: none, alpha

zero, and alpha one. This combination can be used to

generate an irregularly shaped menu (an oval shape, for

example) which is translucent (e.g. an alpha value of

50%) that contains opaque (alpha = 100%) letters. In a

game, this could be a message written on a foggy back-

ground in an oval window. The chroma keying provides

the definition of the oval shape, the alpha zero value defines

the translucent foggy background and the alpha

one value defines the opaque characters on the foggy

background.
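
The three overlay levels for an RGB 15+ pixel can be summarized as a small selection function. The argument layout is hypothetical (the bit positions of the alpha bit and color fields are not given in this section); `alpha0` and `alpha1` stand for the two internal alpha register values.

```python
def effective_alpha(alpha_bit, rgb, alpha0, alpha1):
    """Alpha used for one RGB 15+ overlay pixel; rgb is an (r, g, b) tuple."""
    if rgb == (0, 0, 0):           # chroma key code: overlay turned off
        return 0
    return alpha1 if alpha_bit else alpha0

# Oval menu example: alpha0 = 50% background, alpha1 = 100% letters
outside_oval = effective_alpha(1, (0, 0, 0), 0.5, 1.0)
foggy_background = effective_alpha(0, (8, 8, 8), 0.5, 1.0)
opaque_letter = effective_alpha(1, (31, 31, 31), 0.5, 1.0)
```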

Chroma keying in the ICP is intended for computer generated

or modified overlays. Chroma keying turns off the

overlay process for selected pixels by forcing an alpha

value of ‘0’ for those pixels. Chroma keyed pixels use

special codes to identify them. These codes must be

computer generated in most cases. For example, the

DSPCPU or other CPU would process an overlay image

and convert the overlay pixels to be turned off into chro-

ma keyed pixels by changing the data for those pixels to

the chroma key code.

The ICP does not have full chroma keying. Full chroma

keying has adjustable threshold values for the pixel components.

Adjustable thresholds allow the user to automatically

select an overlay sub-image from a larger overlay

background, such as selecting an image of an actor

against a bright blue background while inhibiting the blue

background.

14.5.6 Dithering

Short output codes, such as RGB 8, have few bits for out-

put-value determination. RGB 8R has (2,3,3) bits for

(R,G,B). The result is a coarse, patchy image if nothing

is done to correct for the limited resolution. Dithering significantly

improves the effective resolution of these images.

For example, RGB 8 images with dithering look nearly as

good as RGB 16.

Dithering works by adding a random dithering value to

the pixel before it is truncated by the output formatter.

The dither is added to the portion which will be truncated.

The carry from this add will occasionally propagate into

the most significant portion of the pixel before truncation.

The carry from the add thus ‘dithers’ the displayed value.

In the example shown in Figure 14-10, a random dither

value is added to the original data before truncation.

The dither value should have a range of from approxi-

mately 0 to 1 LSB of the truncated value. The dither value

should be symmetrical around 1/2 the LSB of the quantizing

error of the truncation. In the example shown, the

dither signal has values of (1/8, 3/8, 5/8, 7/8). This set of

values has a range of approximately 0 to 1 LSB, and it is

symmetrical around 1/2 LSB.

In this example, the input signal has a value of 2.83.

Without dithering, this value would be truncated to an

output value of 2 in all cases. Averaging the un-dithered

signal over four pixels still gives you a value of 2. By adding

the dither signal, the output value is 2 or 3 depending

on the value of the added dither signal. Averaging over

four pixels, the average output value is 2.75, much closer

to the input value than without the dither signal. The dith-

er signal has significantly reduced the error when aver-

aged over four pixels.
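
The arithmetic of this example can be reproduced directly. The sketch uses exact fractions so the averages come out without floating-point noise; the function name is hypothetical.

```python
from fractions import Fraction
from math import floor

def dithered_outputs(value, dithers):
    """Add each dither value, then truncate to an integer output level."""
    return [floor(value + d) for d in dithers]

value = Fraction(283, 100)                          # input signal 2.83
dithers = [Fraction(k, 8) for k in (1, 3, 5, 7)]    # 1/8, 3/8, 5/8, 7/8
outputs = dithered_outputs(value, dithers)
average = Fraction(sum(outputs), len(outputs))
error = value - average
```

The error drops from 0.83 without dithering to 0.08 when averaged over the four dithered pixels.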

Two types of dithering are combined in the ICP: quad pix-

el and full image dithering. Quad pixel dithering, also

known as ordered dithering, adds one of four dithering

values to each pixel. The four dithering values correspond

to four-pixel quads in the output image. The pixels

in each quad have fixed positions in the input image, so

the dither values are chosen on the basis of odd or even

line number and odd or even pixel number in the line.

The dither values of (0/4, 3/4, 2/4, 1/4) are added by line

and pixel number: even line & even pixel, even line & odd

pixel, odd line & even pixel, odd line & odd pixel. This

gives a four value ordered function for four adjacent pix-

els in the image. The (0,3,2,1) pattern is chosen specifi-

cally to prevent pairs of high or low pixel values from

clustering. Spatial dithering provides a significant im-

provement in effective resolution.
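
The quad pixel selection amounts to a lookup on the low bit of the line and pixel numbers. A short sketch, with values expressed in quarters of an LSB:

```python
# (line parity, pixel parity) -> dither value in quarters
QUAD_DITHER = {(0, 0): 0, (0, 1): 3, (1, 0): 2, (1, 1): 1}

def quad_dither(line, pixel):
    """Ordered dither value (0..3) for a pixel, chosen by odd/even position."""
    return QUAD_DITHER[(line & 1, pixel & 1)]

first_quad = [[quad_dither(l, p) for p in range(2)] for l in range(2)]
```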

Full image dithering adds a single randomly generated

number to every pixel of the image. The result is that the

intensity and color accuracy increases as the size of the

sample is enlarged. The random number has a long bit

length to prevent repeating patterns in the image. The

random number can be static or dynamic. In the static

case, the random number generator starts with a fixed

seed at the start of the image. The random number spa-

tial pattern is fixed for the image even though the image

data may change from frame to frame. In the dynamic

case, the random number generator runs continuously,

and the dithering pattern changes from frame to frame.

The ICP combines quad pixel dithering with full image

dithering to provide the final dithering signal for each pixel.

The quad pixel dither provides the two most significant

bits of the dither signal, and the full image dither provides

the least significant 4 bits of the dither signal. The

combined dither signal is 6 bits.

From 1 to 6 bits of dither signal are used, depending on

the output format. If fewer than 6 bits are needed, only

the MSBs of the dither signal are used. For example in

the RGB 8R output format, the R output value is 3 bits in

size. The output uses the 3 MSBs of the R input value

and truncates the 5 LSBs. The dither unit adds 5 bits of

dither signal (the 5 MSBs) to the 5 LSBs of the R input

value before truncation, and the RGB formatter truncates

the result after adding.
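
The combination and truncation steps can be sketched as follows. The function names are hypothetical, and the saturation at 255 is an assumption: the text does not say how a carry out of the top bit is handled.

```python
def combine_dither(quad2, rand4):
    """6-bit dither: quad pixel value in the 2 MSBs, random value in the 4 LSBs."""
    return ((quad2 & 0x3) << 4) | (rand4 & 0xF)

def dither_and_truncate(value8, out_bits, dither6):
    """Add the MSBs of the dither to the bits being dropped, then truncate."""
    drop = 8 - out_bits                    # LSBs removed by the formatter
    if not 1 <= drop <= 6:
        raise ValueError("between 1 and 6 dither bits are used")
    d = dither6 >> (6 - drop)              # keep only the MSBs of the dither
    return min(value8 + d, 255) >> drop    # saturation is an assumption

full = combine_dither(3, 0xF)
r_low = dither_and_truncate(176, 3, 0)     # 176 = 5*32 + 16, no carry
r_high = dither_and_truncate(176, 3, full) # 16 + 31 carries into the MSBs
```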

Dither    Input + dither    Output

0         2.830             2

1/8       2.955             2

3/8       3.205             3

5/8       3.455             3

7/8       3.705             3

No dithering: Output = 2.0, Error = +0.830

1/4 LSB dithering: Output = (2+3+3+3)/4 = 11/4 = 2.750, Error = (2.830 - 2.750) = +0.080

Figure 14-10. Dithering

14.5.7 Implementation Overview: Horizontal Scaling and Filtering

Figure 14-11 shows a data flow block diagram of the ICP

horizontal scaling algorithm implementation. Blocks of

pixels are provided by the input block buffer. Each block

of pixels is transferred sequentially to the 5-tap filter. The

filter does scaling and filtering of the data and puts the re-

sulting pixels in the output buffer. Completed pixels in the

output buffer are written back to SDRAM or to the PCI

output. A bypass multiplexer allows the filter to be by-

passed for SDRAM to SDRAM block moves.

Input pixel access is controlled by the Y Counter. The Y

Counter selects the word and byte for the current pixel in

the Y FIFO buffer. The Y Increment register, Y LSB Register

and the Y MSB Counter control the increment of the

Y Counter. If the Y MSB Counter contents is not ‘0’, the

Y Counter is incremented and the Y MSB Counter is decremented

until the Y MSB Counter is ‘0’.

The Y MSB Counter is loaded with the integer portion of

the results of the Y Counter Increment operation. Y

Counter Increment involves adding the Y Increment frac-

tion and integer values to the Y LSB register and Y MSB

Counter, respectively. If there is no scaling (scaling fac-

tor = 1.0), the Y Increment integer value will be ‘1’, and

the Y Increment fractional value will be ‘0’. Each Y

Counter Increment operation will increment the Y

Counter by one in this case.

The Y Counter keeps track of horizontally indexed pixels

sent to the filter. The Y Counter is incremented once (1.0

for no scaling) for each pixel. For a line of pixels begin-

ning with Xa and ending with Xb, the Y Counter reads pix-

els from the block buffer beginning with Xa-2 and ending

with Xb+2. The extra pixels are required by the 5-tap filter,

which uses a total of 5 pixels to generate each output pix-

el, two pixels before and two pixels after each pixel. The

horizontal filter uses the current output from the block

buffer and four delayed versions of it to generate the filter

output as the weighted sum of the center pixel plus the

two on either side. (For the case where the scaling factor

= 1.0, the LSBs are always ‘0’.)

For up or down scaling, the Y Increment value is not 1.0;

it is the inverse of the scaling factor (See “ICP scaling

output resolution,” on page 14-7). For up scaling by a

factor of 2.0, the effective Y increment value is 0.5, for

example. This means two output pixels are generated for

each input pixel. The Y Counter effectively increments as

0.0, 0.5, 1.0, 1.5, 2.0, etc. The LSBs of the counter (i.e.

the fractional part less than 1) in the Y LSB register are

used by the filter to generate the intermediate values.

An LSB value of 0.5 indicates that the output pixel is half

way between Xn and Xn+1. The filter contains a set of 5

filter parameter RAMs, one for each coefficient. The 5

most significant LSBs from the counter select the filter

coefficients which will generate the correct value for the

output pixel at the relative offset from 0.0 indicated by the

LSBs.
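
The coefficient lookup can be illustrated with a toy model. The table contents below are hypothetical: they implement plain linear interpolation between Xn and Xn+1 with a fixed sum of 32, which is only one possible filter. The real coefficient format and values are loaded by software and are not described in this passage.

```python
def linear_coefficients():
    """Hypothetical RAM contents: 32 entries of 5 taps for Xn-2 .. Xn+2."""
    return [(0, 0, 32 - k, k, 0) for k in range(32)]

def filter_output(pixels, n, lsb16, table):
    """Weighted sum of the 5 pixels around n, taps chosen by the 5 upper LSBs."""
    taps = table[(lsb16 >> 11) & 0x1F]     # upper 5 bits of the 16-bit fraction
    window = pixels[n - 2:n + 3]
    return sum(c * p for c, p in zip(taps, window)) // 32

table = linear_coefficients()
pixels = [0, 0, 100, 200, 0, 0]
on_grid = filter_output(pixels, 2, 0x0000, table)    # offset 0.0
half_way = filter_output(pixels, 2, 0x8000, table)   # offset 0.5
```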

Figure 14-11. ICP horizontal scaling data flow block diagram (blocks shown: SDRAM block address, Block FIFO input buffers 0 and 1, Y Counter, Y Incr Integer, Y Incr Fraction, Y LSB Reg, Y MSB Cntr, N Byte Incr, Filter Source Select, 5-tap filter with a-2 to a+2 coefficient RAMs, 5-stage multiplier-accumulator, YUV Code Delay, Z Counter, bypass mux, Block FIFO output buffers 6 and 7, output to SDRAM via highway or PCI)

The Y Counter indicates the next pixel from the input

buffer. A new pixel is clocked into the filter registers only

when the Y Counter contents change, which happens

when the Y MSB Counter is loaded with a value greater

than ‘0’. Note that for Y increment values less than 1.0

(up scaling), the change will be caused by carry incre-

ment from the Y LSBs, and a new pixel will not be

clocked into the filter shift register on every Y clock.

For increment values of 2.0 or for values of 1.0 or greater

with carry in (down scaling), multiple new pixels will be

clocked into the filter shift register before the filter inputs

are ready. The number of new bytes needed for the next

pixel is the sum of the Y Increment Integer value and the

carry out of the Y LSB adder. This result is loaded into

the Y MSB Counter. The filter clock is stalled until the in-

puts are ready. The integer value of the increment -- in-

cluding carry -- defines the number of new pixels to be

clocked through the shift register before the filter inputs

are ready for use.

In this discussion, the Y Counter LSBs form a 16-bit binary

number. The upper 5 bits of this 16-bit number form

a 5-bit binary number between 0 and 31 representing a

fractional distance between Y pixels between 0/32 and

31/32. If the new pixel relative distance is 31/32, it is

nearest the right pixel of the two pixels it is between, and

the right 2 pixels will be more heavily weighted than the

left 3.

The horizontal filter shown in Figure 14-11 is pipelined to

generate a pixel for every integer increment of the Y

Counter. The filter input is always 5 clocks ahead of its

output. The first stage generates the filter term an+2 * Xn+2

using the data from the input block and the an+2 coefficient

from the coefficient RAM driven by the Y LSBs. The

second stage registers hold the data for Xn+1 and its corresponding

Y LSBs and generate an+1 * Xn+1. The last

stage registers hold the data for Xn-2 and the Xn-2 LSBs

and generate an-2 * Xn-2.

The LSB Register contents can change on every clock.

In the 2:1 scaling example, the LSBs alternate between

0.0 and 0.5. The LSB Counter represents each output

pixel’s x offset value from the input pixel grid. The LSB

Increment value is 16 bits long. The 5 upper bits go to the

coefficient RAMs, and the 11 lower bits give the LSB

Counter the precision needed to represent

the scaling factor. The 11 lower bits of the LSB Increment

value added to the 11 lower bits of the LSB Counter

determine when to increment the 5 LSBs that drive the

coefficient RAMs and when to clock a new Y pixel into

the filter.

14.5.7.1 Loading the extra pixels in the filter

For a 5-tap filter, 4 more pixel inputs are needed to the

filter than are generated at the filter output, two before

the first pixel and two after the last pixel. In the worst

case of a window that is exactly N blocks wide and starts

at the first pixel of the first block, two extra blocks must

be read - one at each end of the window - in order to get

these 4 pixels! This is an unavoidable problem with a

multi-tap filter. For an n-tap filter, n-1 extra pixels are

needed. There are two techniques that avoid this efficiency

hit of fetching extra blocks.

1. Move the window edges so they are not within 2 pix-

els of a 64 input pixel boundary.

2. Simulate the edge pixels, such as by mirroring the

pair of pixels you have on the other side. This is the

only solution to the problem of starting (or ending) at

the edge of the image, where there are no pixels to

the left (or right) of the image window.

The ICP uses automatic mirroring to supply these pixels.

Mirroring is used in both horizontal and vertical filter

modes.

14.5.7.2 Mirroring pixels at the ends of a line

A line may start and/or end at the edge of the input image.

In this case, the two start and/or end pixels needed

for the first and last pixels of the line, respectively, are

missing. The start mirror uses the two pixels to the right

of the first pixel, and the end mirror uses the two pixels to

the left of the last pixel. These pixels are supplied by

controlling the Y counter.
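
The mirroring pattern of Figure 14-12 can be expressed as an index reflection about the first and last pixel. This is a software model of the resulting tap selection, not of the mirror multiplexer itself; indices are 0-based and the function names are hypothetical.

```python
def mirror(i, length):
    """Reflect an out-of-range index about the first or last pixel of the line."""
    if i < 0:
        return -i
    if i >= length:
        return 2 * (length - 1) - i
    return i

def filter_taps(n, length):
    """Input pixel indices used by the 5-tap filter for output pixel n."""
    return [mirror(n + k, length) for k in (-2, -1, 0, 1, 2)]

line_of_six = [filter_taps(n, 6) for n in range(6)]
```

For a six-pixel line, the first entry corresponds to F(Y3,Y2,Y1,Y2,Y3) and the last to F(Y4,Y5,Y6,Y5,Y4) in the figure's 1-based numbering.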

A mirror multiplexer in the 5-tap filter provides mirroring

of one or two pixels at the filter inputs. This mirror multiplexer

is used for both horizontal and vertical filtering. In

horizontal filtering, the first and last two pixels in the line

are mirrored. The mirror multiplexer is set to the appro-

priate mirror code for the first and last two pixels in the

line. The first two pixels are mirrored for the first two clock

pulses, and the last two pixels are detected using the pix-

el counter for the line.

Mirroring is optional, depending on whether the start or

end of the line is on a window boundary. The DSPCPU

or microprogram must detect this and enable start and/or

end mirroring as required.

14.5.7.3 Horizontal filter SDRAM timing

Figure 14-13 shows a timing diagram for block data flow

between the SDRAM and the filter for a scaling factor of

1.0. The bus block reads and writes are one fourth of the

filter processing time because the filter processes data at

100 Mpix/sec, and the SDRAM reads and writes blocks

of pixels at 400 Mpix/sec. The SDRAM logic reads the

next block while the current block is being processed.

This also provides the two pixels from the next block required

to finish filtering the current block.

If the scaling factor is greater or less than 1.0, the

SDRAM bus activity will be different. For scaling factors

greater than 1.0, there will be fewer SDRAM reads for the

same number of writes generated by the filter. For example,

a scale factor of 2.0 means that it is necessary to

read only half as many blocks to generate the same number

of output blocks. For a scale factor less than one,

there will be more reads for the same number of writes.

For a scale factor of 0.5, two blocks must be read for every

block of output. If the scale factor is less than 1/3,

more time will be spent reading and writing SDRAM than

filtering.
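
The 1/3 threshold follows from the two rates quoted above if the filter time is counted per output block. That accounting is an assumption of this sketch; it ignores bus arbitration and any filter stalls while extra input pixels are shifted in.

```python
from fractions import Fraction

def bus_to_filter_ratio(scale, bus_rate=400, filter_rate=100):
    """SDRAM bus time divided by filter time, per output block (rates in Mpix/sec)."""
    scale = Fraction(scale)
    blocks_moved = 1 / scale + 1           # input blocks read plus one block written
    return blocks_moved * Fraction(filter_rate, bus_rate)

unity = bus_to_filter_ratio(1)                   # one read and one write
threshold = bus_to_filter_ratio(Fraction(1, 3))  # three reads and one write
smaller = bus_to_filter_ratio(Fraction(1, 4))    # four reads and one write
```

At a scale factor of exactly 1/3 the bus time equals the filter time; below it the bus dominates.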

14.5.8 Implementation Overview: Vertical Scaling and Filtering

Figure 14-14 shows a data flow block diagram of the ICP

vertical scaling algorithm implementation. Blocks of pix-

els are loaded sequentially into five input block buffers,

one for each of the 5 terms of the 5-tap filter. Each block

of pixels is transferred sequentially to the 5-tap filter. The

filter does scaling and filtering of the data and puts the re-

sulting pixels in the output buffer. Completed pixels in the

output buffer are written back to SDRAM.

In vertical scaling, five separate blocks of pixels, one for

each line, are required because the pixels are stored in

horizontal sequence in the SDRAM. The Y Counter steps

through the 64 horizontal pixels of the five input blocks

and writes the resulting pixels into the output block. Four

of the five blocks are used on the next pass, so that one

block of pixels in generates one block of pixels out except

for end conditions. The image is processed in 64-pixel

columns. Since the image to be filtered will not generally

start or end on a block boundary, the number of horizontal

pixels for the first and last columns will be less than 64

in these cases. Also, the data in the columns must be

aligned vertically. This results in the requirement that the

line-to-line address offset value must be a multiple of 64

bytes. Note that only the address offset value is modulo

64; the image to be filtered can start and stop anywhere.

Block alignment is not required.

Vertical scaling and filtering processes five 64-pixel input

line segments to generate one 64-pixel output segment.

When input lines Yn-2 to Yn+2 have been processed to

generate one 64-pixel output segment for output line Yn,

five input segments are needed for the next output

line segment in the 64-pixel column, Yn+1. If the vertical

scale factor is 1.0 (no scaling), line segments Yn-1 to

Yn+2 are reused, a new block for Yn+3 is loaded and the

block for line Yn-2 is discarded.

To load Yn+3, the MCU adds the Y offset value to the

block address (upper 26 bits) of the Y Counter, and the

Y Counter selects the next Y block to be read from

SDRAM. The Y Counter points to the line block address

for last Y block loaded, and the Y offset value is the ad-

dress difference between the start of one line and the

start of the next, X0Y0 to X0Y1. The line offset is always

an integral number of SDRAM blocks. The line offset value

must be added to the current line address to get the

next line address.

Up and down scaling use the U Counter and U Increment

value. The U Counter is used to detect how many lines

must be read (0 to 5) to generate the next output line and

to generate the vertical offset fraction for the 5-tap filter

for output lines that fall between the input lines. The U

Counter is set to its starting value (typically ‘0’) at the

start of the column, and the U Increment value is added

to the U Counter for each output line segment generated

in the column. For a scaling factor of 1.0, the U Increment

value is 1.0, and each line processed will generate a re-

quest for one block. If the scaling factor is 1/2, the incre-

ment value will be two, corresponding to moving down

two lines. In this case, twice the line offset is added to the

Y Counter value.

For up scaling by a factor of 2.0, the U increment value is

0.5. This means two output lines are generated for each

input line. The U Counter increments as 0.0, 0.5, 1.0, 1.5,

2.0, etc. The LSBs of the U Counter (i.e. the fractional

part less than 1) are passed along to the filter to generate

the intermediate values. An LSB value of 0.5 means that

the output line is half way between Yn and Yn+1. The filter

contains a set of 5 filter parameter RAMs, one for each

coefficient. The 5 most significant LSBs from the counter

select the filter coefficients which will generate the correct

value for the output pixel at the relative offset from

0.0 indicated by the LSBs.

Input Pixels: Y (1 to 6); Output Pixels: Y’

Y’=F(Y3,Y2,Y1,Y2,Y3)

Y’=F(Y2,Y1,Y2,Y3,Y4)

Y’=F(Y1,Y2,Y3,Y4,Y5)

Y’=F(Y2,Y3,Y4,Y5,Y6)

Y’=F(Y3,Y4,Y5,Y6,Y5)

2N: Y’=F(Y4,Y5,Y6,Y5,Y4)

Mirrored Pixels: (3) (2) and (5) (4)

Figure 14-12. Horizontal Pixel Mirroring

SDRAM Bus: Read X0, Read X1, Read X2, Read X3; Write Xa, Write Xb

Filter Action: Filter X0 => Xa, Filter X1 => Xb, Filter X2 => Xc

Figure 14-13. SDRAM and horizontal filter block timing

For down scaling, the increment factor will be greater

than one. If the increment factor is 2.0, two new blocks

will have to be loaded before starting the next vertical fil-

ter pass. If the increment factor is 5 or greater, all five

blocks must be loaded. The number of blocks to be load-

ed for the next line is equal to the integer increment value

plus carry out from the LSB portion of the U Counter in-

crement.

Note that the LSB adder carry out is available before the

U Counter has been updated. This allows the current U

Counter value LSB bits to be used for the filter coefficients

while using the carry out for the next value to predict

how many blocks to fetch. The integer value from the

U increment value plus the carry in from the LSB portion

of the Increment adder is the number of blocks to be

loaded. These blocks must be sequentially loaded (and

not skipped) so that the filter has the necessary 5 adjacent

lines to perform the filtering. The contents of the integer

portion of the U Counter (updated after the add) are

not used.

Only one new block can be loaded while the current line

is being processed. If two or more blocks are needed to

process the next line, load one in overlap. Wait until the

current line is done, then load the rest of the blocks. The

microprogram only has to make two decisions for the

next line: is the increment value ‘0’ or greater than ‘0’,

and if greater than ‘0’, is it five or greater. If it is ‘0’, do

nothing: you will reuse all five blocks. If it is 1-4, load the

next block. If it is five or more, calculate the address of

the first block -- by adding N times the address offset to

the Y counter -- and fetch it.
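
The decision described above can be written as a three-way branch. This is a sketch of the logic only, not microprogram code; the action strings are descriptive labels chosen for the example, and the value of N in the address calculation is left unspecified as in the text.

```python
def plan_block_loads(increment):
    """Return (action, blocks to load) for the next output line.

    increment is the U Increment integer value plus the LSB adder carry.
    """
    if increment == 0:
        return "reuse all five blocks", 0
    if increment < 5:
        return "load next block(s) in sequence", increment
    return "recompute start address and reload all five blocks", 5

reuse = plan_block_loads(0)
partial = plan_block_loads(3)
reload_all = plan_block_loads(7)
```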

When a new block is loaded and it is time to process the

next line, the block which was Yn+2 becomes Yn+1. The

Y blocks, in effect, shift up one line as you scan down the

image. This shifting action is implemented by shifting the

block select codes in the Filter Source Select Register

(FSSR). The FSSR contains six 3-bit register fields.

These 3-bit fields are rotated by a shift command to the

FSSR. The output of five of the FSSR fields go to the in-

put multiplexer, which selects the next block combination

and sends it to the filter. The output of the sixth field is the

free block to be filled for the next line while the current

line is being processed. The select code is also the block

code (0 to 5), so the free block is identified by its block

code in the FSSR. The FSSR codes for the six cases of

vertical filtering are shown in Table 14-4.
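The rotation of the FSSR fields can be modeled with a small C sketch. The type and function names are hypothetical, and the array order (five filter inputs followed by the free block) simply mirrors the column order of Table 14-4; it is not a statement about the actual register bit layout.

```c
#include <stdint.h>

#define FSSR_FIELDS 6

/* Illustrative model of the FSSR. Field order follows the columns of
 * Table 14-4: [0]=Pn-2, [1]=Pn-1, [2]=Pn+0, [3]=Pn+1, [4]=Pn+2,
 * [5]=IO (free) block. The real bit layout is not modeled here. */
typedef struct {
    uint8_t field[FSSR_FIELDS];
} fssr_model;

/* Case 1 of Table 14-4. */
static fssr_model fssr_case1(void)
{
    fssr_model f = { { 5, 4, 3, 2, 1, 0 } };
    return f;
}

/* One shift command: rotate the six 3-bit codes by one position. */
static void fssr_shift(fssr_model *f)
{
    uint8_t last = f->field[FSSR_FIELDS - 1];
    for (int i = FSSR_FIELDS - 1; i > 0; i--)
        f->field[i] = f->field[i - 1];
    f->field[0] = last;
}

/* Block that may be refilled while the current line is processed. */
static uint8_t fssr_free_block(const fssr_model *f)
{
    return f->field[FSSR_FIELDS - 1];
}
```

Starting from case 1, each call to `fssr_shift` steps to the next case in the table, and six shifts return to case 1.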

Figure 14-14. ICP vertical scaling data flow block diagram


14.5.8.1 Mirroring lines at the ends of an image

A window may start and/or end at the edge of the input
image. In this case, the two start and/or end lines needed
for the first and last lines of the window, respectively, are
missing. These pixels are supplied by the mirror multiplexer
at the 5-tap filter, which mirrors the input lines. The
mirror multiplexer is controlled by the mirror counter and
mirror end register in the same manner as in horizontal
filtering. The mirror register in vertical filtering is
incremented by the output line counter. Mirroring is performed
on the first two and last two lines of the column. Mirroring
is optional, depending on whether the start or end of the
line is on a window boundary. The DSPCPU or microprogram
must detect this and enable start and/or end mirroring
as required.

14.5.8.2 Vertical filter SDRAM block timing

Figure 14-15 shows a timing diagram for block data flow
between the SDRAM and the filter for a scaling factor of
1.0. The bus block reads and writes require one fourth of
the filter processing time because the filter processes
data at 100 Mpix/sec, and the SDRAM reads and writes
blocks of pixels at 400 Mpix/sec (peak). The vertical filter
starts by reading in the five blocks necessary to generate
the next output block. While the current block is being
processed, the next block is read from SDRAM to prepare
for the next output block.

14.5.9 Horizontal Scaling and Filtering for RGB Output

Figure 14-16 shows a data flow block diagram of the ICP

horizontal scaling to RGB output algorithm implementa-

tion. The six input block buffers are arranged as three

block FIFOs, one each for Y, U and V pixel streams.

These three streams are sequentially filtered, pixel by

pixel by the 5-tap filter to generate a scaled output se-

quence of Y, U, V, Y, U, V, etc. This YUV stream is fed

to the YUV to RGB converter where it is converted to one

of several RGB output formats, blended with RGB over-

lay pixels supplied by the Overlay FIFO and masked by

bit mask pixels from the bit mask block. The resulting

scaled, converted, overlay blended and masked RGB

stream is sent to the PCI interface -- typically to an RGB

format frame buffer on the PCI bus -- or to SDRAM.

The input pixel streams from the input FIFOs are transferred
sequentially to the 5-tap filter. Each stream has its

own set of four-stage delay registers used to perform

horizontal filtering on the stream. A pair of 3-way multi-

plexers switch the five filter data inputs and the 5-bit filter

coefficient select codes to the 5-tap filter. This set of mul-

tiplexers is driven by the YUV Sequence counter, a 2-bit

counter that provides the YUV processing sequence.

In horizontal scaling and filtering from SDRAM to

SDRAM, each Y, U and V component is filtered sepa-

rately as a complete image. In RGB output horizontal

scaling and filtering, the image is processed as three in-

terwoven streams of all three YUV components.

In the RGB output mode, the ICP normally generates
RGB data and writes it into a frame buffer memory on the
PCI bus or to the SDRAM. The frame buffer memory format
is RGB with one R, one G and one B value per pixel.
This could be called RGB 4:4:4. To generate this image,
the ICP generates a YUV 4:4:4 image and converts it to
RGB. This process is done one RGB output pixel at a
time. The ICP generates a U pixel and saves it in a register,
generates a V pixel and saves it in a register, then
generates a Y pixel for output. The YUV to RGB converter
combines each Y pixel as it is generated with the previously
stored U and V pixels to generate the RGB output
data. This process is repeated until the whole image has
been converted and sent to the PCI bus or SDRAM.

14.5.9.1 YUV sequence counter in YUV 4:2:2 output Mode

For RGB output formats, the YUV data must be scaled to

YUV 4:4:4 format before conversion to RGB. The YUV

data in SDRAM is typically stored in YUV 4:2:2. This

means that the U and V data must be upscaled by 2 relative
to the Y data to generate the internal YUV 4:4:4 format
required for RGB conversion.

For the YUV 4:2:2 output formats, the U and V data do

not need to be up scaled to 4:4:4. The YUV 4:4:4 data

would be upscaled only to be decimated back to YUV

4:2:2. For YUV 4:2:2 output, the U and V pixels are used

twice. This is done by having a half-speed mode for the

YUV Sequence Counter. In this mode, the sequence is

U0, V0, Y0, Y1, U2, V2, Y2, Y3, etc. The U and V are not
up scaled by 2 relative to the Y component for YUV 4:4:4
output, although they could be up scaled as part of general
up scaling of the image.

Table 14-4. FSSR codes for vertical filtering.

Case  Pn-2  Pn-1  Pn+0  Pn+1  Pn+2  IO Block
1     5     4     3     2     1     0
2     0     5     4     3     2     1
3     1     0     5     4     3     2
4     2     1     0     5     4     3
5     3     2     1     0     5     4
6     4     3     2     1     0     5

Figure 14-15. SDRAM and vertical filter block timing
(timing diagram with two rows. SDRAM Bus: Read Y5, Write Ya,
Read Y6, Read Y7, Write Yb, Read Y8. Filter Action:
Filter Y2-5 => Ya, Filter Y3-6 => Yb, Filter Y4-7 => Yc.)
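The two YUV Sequence Counter modes can be illustrated with a short C sketch that lists the order in which components are generated. The names `seq_step` and `yuv_sequence` are hypothetical, and the sketch models only the ordering described in the text, not the counter hardware.

```c
#include <stddef.h>

/* One step of the processing sequence: which component, for which pixel. */
typedef struct {
    char component; /* 'Y', 'U' or 'V' */
    int  pixel;     /* output pixel index the component belongs to */
} seq_step;

/* Fills steps[] with the component order for n_pixels output pixels.
 * half_speed == 0: U, V, Y for every pixel (RGB / YUV 4:4:4 path).
 * half_speed != 0: U and V only on even pixels (YUV 4:2:2 output).
 * Returns the number of steps written (never more than max_steps). */
static size_t yuv_sequence(int n_pixels, int half_speed,
                           seq_step *steps, size_t max_steps)
{
    size_t n = 0;
    for (int p = 0; p < n_pixels; p++) {
        int fetch_chroma = !half_speed || (p % 2 == 0);
        if (fetch_chroma) {
            if (n + 2 > max_steps)
                return n;
            steps[n++] = (seq_step){ 'U', p };
            steps[n++] = (seq_step){ 'V', p };
        }
        if (n + 1 > max_steps)
            return n;
        steps[n++] = (seq_step){ 'Y', p };
    }
    return n;
}
```

For four output pixels the half-speed mode yields eight steps (U0, V0, Y0, Y1, U2, V2, Y2, Y3) against twelve in full-speed mode, that is, two filter passes per pixel instead of three.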

The YUV 4:2:2 output mode also provides higher processing
bandwidth relative to YUV 4:4:4 up scaling. Half
as many U and V pixels are processed. The output pixel
rate is one pixel per 20 nanoseconds for the YUV 4:2:2
output mode versus one pixel per 30 nanoseconds for
conversion to YUV 4:4:4. This can be used to provide some
processing performance improvement for very large images
at the expense of some chroma quality.

14.5.9.2 PCI output block timing

The ICP outputs pixels to the PCI interface at a peak rate
of 33 Mpix/sec in RGB mode and 50 Mpix/sec in the
YUV mode using YUV sequencing. For one word per pixel
output codes, such as RGB-24, this is a peak rate of
33 Mwords/sec or 132 MB/sec in the RGB sequencing
mode. This is the same speed as the 132 MB/sec peak
rate of the PCI interface. (At 50 Mpix/sec, the result
would be 200 MB/sec.) The BIU control for the PCI interface
has a FIFO for buffering data from the ICP, but this
buffer is only 16 words deep. Therefore, the ICP will
occasionally have to wait for the PCI to accept more data.
In the PCI output mode, this stalls the ICP clock.

14.6 OPERATION AND PROGRAMMING

The ICP uses a combination of hardware and a Microprogram
Control Unit (MCU) to implement its scaling, filtering
and conversion functions. The microprogram is a
factory-supplied state machine that resides in SDRAM. It
is read each time the ICP executes an operation. Using
an SDRAM-resident microprogram-controlled state machine
minimizes hardware and provides flexibility in handling
special conditions without additional hardware.

Figure 14-16. ICP horizontal scaling for RGB output data flow block diagram

Important Note: You must set the ICP DMA Enable bit

(IE) in the BIU_CTL register of the PCI interface for RGB

output to PCI. This bit must be set before initiating RGB

to PCI operations, or the ICP will stall waiting for the PCI

to become ready. Refer to Section 11.6.5, “BIU_CTL

14.6.1 ICP Register Model

The ICP is controlled by the DSPCPU through five MMIO
registers: the MicroProgram Counter (MPC), the Micro
Instruction Register (MIR), the Data Pointer (DP), the
Data Register (DR) and the ICP Status register (SR), as
shown in Figure 14-17. The MPC, DP and SR are used
in normal operations, and the MIR and DR are used in
test and debug. Note that the MMIO registers should
never be written while the ICP is executing microcode, i.e.,
test the Busy bit in the SR register before writing any ICP
MMIO register.

The MPC is the MCU instruction counter. It points to the

next microinstruction to be executed. The entry point in

the microprogram defines which ICP operation is to be

executed. The DP points to the location in SDRAM of a

table of parameters used by the ICP to process the im-

age data, such as the image input and output start ad-

dresses, scaling factor, etc.

The SR has 13 active bits: Busy (B), Done (D), done In-

terrupt Enable (IE), ACK_DONE (A), Little Endian (L),

Step (S), Diagnostic (DG), Reset (R), Priority Delay (PD,

4 bits). Bits 12 .. 30 are reserved.

• (B)usy indicates the ICP is busy executing micro-

code.

• (D)one indicates that the previous requested function

is complete, and that the ICP clock is stopped.

• (D)one causes an interrupt to the DSPCPU when

Interrupt Enable is set.

• (A)CK_DONE clears (D)one and the corresponding

interrupt.

• (L)ittle Endian sets the highway endian swap multi-

plexer to little endian mode for data on the SDRAM

bus.

• (S)tep causes the MCU to execute one microinstruction.
Step is used for diagnostics to step the ICP
through its microinstructions one clock step at a time.
Writing a ‘1’ to Step sets Busy, which is reset at the
end of execution of the next microinstruction.

• (DG) allows SDRAM operations in step mode.

• (R) is a write-only bit that resets ICP internal regis-

ters.

• (PD) sets a timer for bus activity that defines the
minimum bus bandwidth available to the ICP.

The ICP Status Register contains 20 read-only status
bits. The upper 16 bits of the Status Register can contain
a 16-bit code returned by the microprogram upon completion.
Bits 15 through 12 are reserved for error flags.


14.6.2 Power Down

The ICP block enters the power-down state whenever
the PNX1300 is put in global power-down mode.

Figure 14-17. ICP MMIO Registers
(register diagram showing the MicroProgram Counter (MPC, ICP_MPC),
Data Pointer (DP, ICP_DP), ICP Status (ICP_SR), MicroInstruction
Register (MIR, ICP_MIR) and Data Register (DR, ICP_DR); MMIO offsets
shown: 0x10 2400, 0x10 2404, 0x10 2408, 0x10 2410, 0x10 2414; ICP_SR
fields shown: B, D, IE, A, L, S, DG, R, Priority Delay)


The ICP block can be separately powered down by setting
a bit in the BLOCK_POWER_DOWN register. Refer
to Chapter 21, “Power Management.”

It is recommended that the ICP be in an idle state before
block level power down is activated.

14.6.3 ICP Operation

The DSPCPU commands the ICP to perform an operation
by loading the DP with a pointer to a parameter
block, loading the MPC with a microprogram start address
and setting Busy in the SR. For example, to cause
the ICP to scale and filter an image, set up a block of
SDRAM with the image and filter parameters, load the
MPC with the starting address of the appropriate microprogram
entry point in SDRAM, load the DP with the address
of the parameter block, and set Busy in the SR by
writing a ‘1’ to it. When the filter operation is complete,
the ICP will set Done and issue an interrupt. The
DSPCPU clears the interrupt by writing a ‘1’ to
ACK_DONE. Note: The interrupt should be set up as
‘level triggered.’
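The command sequence above might look as follows in C. This is a minimal sketch under stated assumptions: the `icp_regs` structure, its field names and both functions are hypothetical, the register pointers and the Busy and ACK_DONE bit masks must be supplied by the caller from the register description, and `sr_config` stands for whatever IE, Little Endian and Priority Delay settings the application wants to keep in the SR.

```c
#include <stdint.h>

/* Hypothetical handle for the three ICP registers used in normal operation.
 * Pointers and bit masks are supplied by the caller; nothing here encodes
 * actual MMIO addresses or bit positions. */
typedef struct {
    volatile uint32_t *mpc;      /* MicroProgram Counter                     */
    volatile uint32_t *dp;       /* Data Pointer (parameter table address)   */
    volatile uint32_t *sr;       /* ICP Status register                      */
    uint32_t busy_mask;          /* mask of the Busy bit in SR               */
    uint32_t ack_done_mask;      /* mask of the ACK_DONE bit in SR           */
    uint32_t sr_config;          /* IE / Little Endian / PD bits to preserve */
} icp_regs;

/* Starts one ICP operation. Returns 0 on success, -1 if the ICP is still
 * executing microcode (the MMIO registers must not be written then). */
static int icp_start(const icp_regs *r, uint32_t entry_addr, uint32_t param_addr)
{
    if (*r->sr & r->busy_mask)
        return -1;
    *r->dp  = param_addr;                 /* pointer to the parameter block */
    *r->mpc = entry_addr;                 /* microprogram entry point       */
    *r->sr  = r->sr_config | r->busy_mask; /* writing '1' to Busy starts it  */
    return 0;
}

/* Called from the Done interrupt handler: writing '1' to ACK_DONE clears
 * Done and the corresponding interrupt. */
static void icp_ack_done(const icp_regs *r)
{
    *r->sr = r->sr_config | r->ack_done_mask;
}
```

Passing the masks in rather than hard-coding them keeps the sketch independent of the exact SR bit layout; a real driver would take these values from the device header files.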

When the DSPCPU sets Busy, the MCU begins reading

the microprogram from SDRAM. The microinstructions

are read in from SDRAM as required by the ICP, and internal
pre-fetching is used to eliminate delays. Setting

Busy enables the MCU clock, the first block of microin-

structions is automatically read in, and the MCU begins

instruction execution at the current address in the MPC.

Clearing Busy stops the MCU clock. Busy can be cleared

by hardware reset, by the MCU, or by the DSPCPU.

Hardware reset clears the Status register, including Busy

and Done, and internal registers, such as the TCR.

When the MCU completes a microprogram operation,

the microprogram typically clears Busy and sets Done,

causing an interrupt if IE is enabled.

The DSPCPU performs a software reset by clearing

(writing a ‘0’ to) Busy and by writing a ‘1’ to Reset. The

DSPCPU can also set Done to force a hardware inter-

rupt, if desired.

14.6.4 ICP Microprogram Set

The ICP comes with a factory-generated microprogram

set which implements the functions of the ICP. The mi-

croprogram set includes the following functions:

1. Loading the filter coefficient RAMs.

2. Horizontal scaling and filtering from SDRAM to
SDRAM of an input image to an output image. The input
and output images can be of any size and position
that fits in SDRAM. The scaling factors are, in general,
limited only by input and output image sizes.

3. Vertical scaling and filtering from SDRAM to SDRAM
of an input image to an output image. The input and
output images can be of any size and position that fits
in SDRAM. The scaling factors are, in general, limited
only by input and output image sizes.

4. Horizontal scaling, filtering and YUV to RGB conversion
of an input image from SDRAM to an output image
to PCI or SDRAM, with an alpha-blended and
chroma-keyed RGB overlay and a bit mask. The input
and output images can be of any size and position
that fit in SDRAM and can be output to the PCI bus or
SDRAM. In general, scaling factors are limited only by
input and output image sizes.

The microprogram is supplied with the ICP as part of the
device driver. The entry point in the microprogram defines
which ICP operation is to be done. The entry points
are given below in terms of word offsets from the beginning
of the microprogram:

Offset  Function
0       Load coefficients
1       Horizontal scaling and filtering
2       Vertical scaling and filtering
3       Horizontal scaling, filtering, YUV to RGB
        conversion, bit masking (PCI) and overlay (PCI)
        with alpha blending and chroma keying
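As an illustration, the entry point offsets can be turned into an MPC value with a helper like the one below. The enumerator and function names are hypothetical, and the sketch assumes that a "word" is 32 bits and that the MPC is loaded with a byte address; neither is stated in this section.

```c
#include <stdint.h>

/* Entry point word offsets from the table above (names are illustrative). */
typedef enum {
    ICP_ENTRY_LOAD_COEFF   = 0, /* load coefficients                     */
    ICP_ENTRY_HORIZ_FILTER = 1, /* horizontal scaling and filtering      */
    ICP_ENTRY_VERT_FILTER  = 2, /* vertical scaling and filtering        */
    ICP_ENTRY_HORIZ_RGB    = 3  /* horizontal scaling + YUV to RGB, etc. */
} icp_entry;

/* Assumes 32-bit words and a byte-addressed MPC (both assumptions). */
static uint32_t icp_entry_address(uint32_t microprogram_base, icp_entry op)
{
    return microprogram_base + 4u * (uint32_t)op;
}
```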

14.6.5 ICP Processing Time

The processing time for typical operations on typical pic-

ture sizes has been measured.

Measurements were performed with the following configuration:

• CPU clock and SDRAM clock set to 100 MHz
• PCI clock set to 33 MHz
• All measurements with PCI as pixel destination were
done with an Imagine 128 Series II graphics card,
which never caused a slowdown of the ICP operation.
• TRITON2 mother-board with SB82437UX and
SB82371SB based Intel Pentium chipset.
• PNX1300 arbiter set to default settings
• PNX1300 latency timer set to maximum value = 0xf8.
• Overlay sizes were the same as picture sizes.

Results are tabulated below for three different cases of

available memory bandwidth:

1. No other load to SDRAM, i.e. full SDRAM bandwidth
available for ICP. See Table 14-5.

2. SDRAM memory loaded to 95% of its bandwidth by
DCACHE traffic from DSPCPU. Priority delay = 1, i.e.
the ICP waited one block time before competing for memory.
See Table 14-6.

3. SDRAM memory loaded to 95% of its bandwidth by
DCACHE traffic from DSPCPU. Priority delay = 16, i.e.
the ICP waited 16 block times before competing for memory.
See Table 14-7.

Note: A load of 95% of the memory bandwidth is very

rarely found in a real system. So the results in these ta-

bles may be useful to estimate upper bounds for the

computation time in a loaded system.

The priority delays were set to the minimum and maxi-

mum possible values, so the computation time for other

priority delay values should be somewhere in between.


A simple linear model of computation time has been fit-

ted to the tabular data and to corresponding measure-

ments with half the number of pixels per line.

It was assumed that

processing time = (time per line start) * (number of lines)
                + (time per pixel) * (number of pixels)

Table 14-8, Table 14-9 and Table 14-10 give the time

per line start and the time per pixel in this equation for the

three memory bandwidth cases.
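The linear model is easy to evaluate in code. The sketch below is only an illustration: the function name `icp_time_ms` is hypothetical, and it assumes that the time per line start is given in microseconds and the time per pixel in nanoseconds, with the result in milliseconds to match the measured tables.

```c
/* Linear processing-time model.
 * t_line_us:  time per line start, assumed to be in microseconds
 * t_pixel_ns: time per pixel, in nanoseconds
 * Returns the estimated processing time in milliseconds. */
static double icp_time_ms(double t_line_us, double t_pixel_ns,
                          unsigned width, unsigned height)
{
    double lines  = (double)height;
    double pixels = (double)width * (double)height;
    return t_line_us * lines * 1e-3 + t_pixel_ns * pixels * 1e-6;
}
```

For a 720 x 480 single-component horizontal filter with the Table 14-8 values (1.1 per line start, 11 per pixel), the model gives about 4.33 ms, against 4.43 ms measured in Table 14-5.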

The maximum deviation between measured time and fitted
model is on the order of 10% in the range W = 180 ...

1024, H = 240 ...768. The deviation is much less in most

cases. The values were found by least squares fit to the

measured data.

In some cases the cumulative time for line starts contrib-

uted so little to the total computation time that the value

per line start could only be determined relatively inaccu-

rately. In other words the pixel time portion dominated

the equation so much that the line time portion was negligible,
given the inaccuracies of the model.

Therefore the simple model is only thought to allow inter-

polation for other picture sizes within the range W = 180

...1024, H = 240 ... 768. Extrapolation to picture sizes

much outside this range should not be attempted using

this data.

In some cases the real ICP performance may be much

better than that predicted by the model, due to irregular

behavior of the ICP.

For horizontal and vertical up/down-scaling operations,
use the larger W or H value occurring at input/output with
the H/V filter times table or model.

This will lead to overestimation of processing time by up
to 20%.

Table 14-5. Measured processing time in ms - no other load to SDRAM

W in pixels 360 640 720 720 800 800 1024

H in pixels 240 480 480 768 480 600 768

horizontal filter, 1 component 1.22 3.82 4.43 7.08 4.78 5.98 9.27

horizontal filter, 3 components YUV 4:2:2 2.68 8.18 9.29 14.86 10.08 12.60 19.35

vertical filter, 1 component 2.57 8.73 10.24 16.36 11.19 13.97 22.30

vertical filter, 3 components YUV 4:2:2 5.15 17.47 20.48 32.72 22.95 28.65 44.60

yuv to rgb8a, pci output 3.36 10.74 11.93 19.08 13.04 16.30 26.02

yuv to rgb15a, pci output 3.39 10.79 11.96 19.12 13.10 16.41 26.15

yuv to rgb24, pci output 3.72 12.24 13.52 21.62 14.85 18.59 29.98

yuv to rgb24a, pci output 4.34 14.52 16.04 25.02 17.58 21.63 35.01

yuv to rgb8a, sdram output 3.39 10.78 11.95 19.09 13.13 16.40 26.08

yuv to rgb15a, sdram output 3.46 11.04 12.26 19.60 13.46 16.82 26.87

yuv to rgb24, sdram output 3.62 11.69 13.06 20.88 14.43 18.03 28.71

yuv to rgb24a, sdram output 3.90 12.69 14.11 22.57 15.65 19.56 31.07

yuv to rgb8a, bitmask, pci output 3.37 11.42 12.49 19.97 13.61 17.01 27.83

yuv to rgb8a, RGB 15a overlay, pci output 3.67 11.72 12.92 20.67 14.23 17.79 28.23

yuv to rgb8a, RGB 24a overlay, pci output 4.23 13.57 15.32 24.51 16.93 21.15 33.15

yuv to rgb8a, yuv 422a overlay, pci output 3.67 11.72 12.92 20.67 14.23 17.79 28.23

yuv to rgb8a, 422 sequencing, pci output 2.52 7.77 8.57 13.70 9.32 11.65 18.40

Table 14-6. Measured processing time in ms - SDRAM loaded 95%, priority delay = 1

W in pixels 360 640 720 720 800 800 1024

H in pixels 240 480 480 768 480 600 768

horizontal filter, 1 component 2.01 6.37 7.60 12.16 8.02 10.02 16.02

horizontal filter, 3 components YUV 4:2:2 4.11 13.69 15.62 24.96 16.56 20.68 32.65

vertical filter, 1 component 2.60 8.79 10.34 16.50 11.25 14.05 22.43

vertical filter, 3 components YUV 4:2:2 5.20 17.59 20.66 32.96 23.15 28.89 44.87

yuv to rgb8a, pci output 3.51 11.08 12.17 19.46 13.51 16.88 26.56

yuv to rgb15a, pci output 3.52 11.11 12.22 19.51 13.47 16.82 26.65

yuv to rgb24, pci output 3.88 12.51 13.79 22.08 15.21 18.99 30.26


yuv to rgb24a, pci output 4.39 14.29 15.84 25.30 17.72 22.00 34.83

yuv to rgb8a, sdram output 3.69 11.67 12.75 20.39 14.20 17.80 27.95

yuv to rgb15a, sdram output 4.25 13.15 14.64 23.41 16.79 20.98 31.49

yuv to rgb24, sdram output 5.17 16.56 18.71 29.90 20.85 26.06 40.82

yuv to rgb24a, sdram output 5.82 18.64 21.02 33.62 23.23 29.03 45.34

yuv to rgb8a, bitmask, pci output 3.65 12.37 13.45 21.50 14.68 18.34 30.13

yuv to rgb8a, rgbl15a overlay, pci output 4.94 15.30 17.23 27.51 19.06 23.78 36.70

yuv to rgb8a, rgbl24a overlay, pci output 6.77 21.93 24.85 39.73 27.44 34.31 53.67

yuv to rgb8a, yuv422a overlay, pci output 4.95 15.30 17.22 27.51 19.06 23.80 36.70

yuv to rgb8a, 422sequencing, pci output 3.04 8.92 9.63 15.39 10.53 13.16 20.37


Table 14-7. Measured processing time in ms, SDRAM loaded 95%, priority delay = 16

W in pixels 360 640 720 720 800 800 1024

H in pixels 240 480 480 768 480 600 768

horizontal filter, one component 7.70 24.28 29.32 46.90 30.05 37.56 60.39

horizontal filter, 3 components YUV 4:2:2 15.28 52.00 60.08 96.10 63.13 78.90 123.29

vertical filter, one component 7.50 26.71 30.92 49.31 33.57 41.93 68.18

vertical filter, 3 components YUV 4:2:2 14.48 53.45 60.70 96.83 68.69 85.79 136.40

yuv to rgb8a, pci output 10.55 31.61 34.95 55.84 37.18 46.47 74.29

yuv to rgb15a, pci output 10.55 31.61 34.93 55.84 37.17 46.45 74.29

yuv to rgb24, pci output 10.39 31.71 34.93 55.84 37.25 46.54 73.58

yuv to rgb24a, pci output 10.49 31.95 35.06 55.98 37.15 46.46 74.10

yuv to rgb8a, sdram output 13.83 41.93 48.10 76.94 51.57 64.42 99.33

yuv to rgb15a, sdram output 17.58 55.55 60.95 97.49 65.82 82.24 137.71

yuv to rgb24, sdram output 20.25 65.46 74.67 119.44 81.74 102.12 158.43

yuv to rgb24a, sdram output 24.05 78.51 88.98 142.21 98.69 125.67 196.99

yuv to rgb8a, bitmask, pci output 11.05 35.04 37.75 60.37 40.15 50.19 85.13

yuv to rgb8a, rgbl15a overlay, pci output 18.19 57.11 62.60 100.04 70.84 88.26 136.03

yuv to rgb8a, rgbl24a overlay, pci output 24.81 80.19 91.86 145.57 100.72 125.00 198.15

yuv to rgb8a, yuv422a overlay, pci output 18.20 57.11 62.60 100.04 70.00 88.28 135.98

yuv to rgb8a, 422sequencing, pci output 10.56 31.09 34.79 55.63 36.27 45.33 74.43


14.6.6 Priority Delay and ICP Minimum Bus Bandwidth

The Priority Delay field in the Status register sets the time
the ICP will wait for SDRAM service before changing
from a low-priority bus request to a high-priority request.
The ICP normally requests SDRAM bus service at the
lowest-priority level, since it is a background processing
device. In some cases, service to the ICP could be
continuously delayed by other background devices, such as
the VLD processor, or by high-priority requests from the
DSPCPU.

The PD field sets a timer on the currently active bus re-

quest. The timer is loaded with the PD value and started

each time a bus request is submitted. The timer is incremented
once each block time, the time required to load

one block of 64 bytes. If the timer reaches 16 before the

request is serviced, the ICP changes its bus request pri-

ority from low to high.

The resulting time delay until the ICP changes to high pri-

ority is:

timer delay = (16 - PD)*(block time)

One block time is 16 clock cycles.
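The delay formula can be written out as a short C sketch; the function and macro names are hypothetical, and only the 4-bit PD field and the 16-clock block time stated above are used.

```c
/* One block time is 16 clock cycles (from the text above). */
#define ICP_BLOCK_TIME_CLOCKS 16u

/* Delay, in block times, before the ICP raises its request priority.
 * pd is the 4-bit Priority Delay field value (0..15). */
static unsigned icp_priority_delay_blocks(unsigned pd)
{
    return 16u - (pd & 0xFu);
}

/* The same delay expressed in clock cycles. */
static unsigned icp_priority_delay_clocks(unsigned pd)
{
    return icp_priority_delay_blocks(pd) * ICP_BLOCK_TIME_CLOCKS;
}
```

A PD code of 1111 therefore gives the shortest delay (1 block time, 16 clocks) and 0000 the longest (16 block times, 256 clocks), in agreement with Table 14-11.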

Table 14-8. Line start and pixel time for linear model,
no other load on SDRAM

function t/linestart (µs) t/pixel (ns)

horizontal filter, 1 component 1.1 11

horizontal filter, 3 components YUV 4:2:2 3.2 22

vertical filter, 1 component 0.2 29

vertical filter, 3 components YUV 4:2:2 0.7 58

yuv to rgb8a, pci output 3.2 30

yuv to rgb15a, pci output 3.3 30

yuv to rgb24, pci output 3.7 34

yuv to rgb24a, pci output 5.3 40

yuv to rgb8a, sdram output 3.4 30

yuv to rgb15a, sdram output 3.3 31

yuv to rgb24, sdram output 3.1 33

yuv to rgb24a, sdram output 3.4 36

yuv to rgb8a, bitmask, pci output 2.5 32

yuv to rgb8a, rgbl15a overlay, pci output 3.8 32

yuv to rgb8a, rgbl24a overlay, pci output 4.0 39

yuv to rgb8a, yuv422a overlay, pci output 3.8 32

yuv to rgb8a, 422sequencing, pci output 3.2 20

Table 14-9. Line start and pixel time for linear model,
SDRAM loaded 95%, priority delay = 1

function t/linestart (µs) t/pixel (ns)

horizontal filter, 1 component 0.9 20

horizontal filter, 3 components YUV 4:2:2 2.8 40

vertical filter, 1 component 0.2 29

vertical filter, 3 components YUV 4:2:2 0.7 58

yuv to rgb8a, pci output 3.8 30

yuv to rgb15a, pci output 3.8 30

yuv to rgb24, pci output 4.5 34

yuv to rgb24a, pci output 6.0 39

yuv to rgb8a, sdram output 4.3 31

yuv to rgb15a, sdram output 4.9 36

yuv to rgb24, sdram output 4.6 47

yuv to rgb24a, sdram output 5.0 53

yuv to rgb8a, bitmask, pci output 3.2 34

yuv to rgb8a, rgbl15a overlay, pci output 5.5 42

yuv to rgb8a, rgbl24a overlay, pci output 5.8 63

yuv to rgb8a, yuv422a overlay, pci output 5.5 42

yuv to rgb8a, 422sequencing, pci output 4.9 21

Table 14-10. Line start and pixel time for linear model,
SDRAM loaded 95%, priority delay = 16

function t/linestart (µs) t/pixel (ns)

horizontal filter, 1 component 2.9 77

horizontal filter, 3 components YUV422 8.7 154

vertical filter, 1 component 0.4 87

vertical filter, 3 components YUV 4:2:2 1.2 174

yuv to rgb8a, pci output 13.9 82

yuv to rgb15a, pci output 13.8 82

yuv to rgb24, pci output 13.7 82

yuv to rgb24a, pci output 14.0 82

yuv to rgb8a, sdram output 15.8 115

yuv to rgb15a, sdram output 18.5 151

yuv to rgb24, sdram output 17.5 187

yuv to rgb24a, sdram output 16.6 233

yuv to rgb8a, bitmask, pci output 14.3 91

yuv to rgb8a, rgbl15a overlay, pci output 20.7 153

yuv to rgb8a, rgbl24a overlay, pci output 21.6 232

yuv to rgb8a, yuv422a overlay, pci output 20.8 153

yuv to rgb8a, 422sequencing, pci output 14.0 80


Table 14-11 gives the delay in block times as a function

of the PD field.

The priority delay mechanism, in interaction with the arbiter
mechanism, allows the user to allocate enough bandwidth
for the ICP to do its processing in the required

frame time. For details of the arbiter mechanism see

Chapter 20, “Arbiter.”

14.6.7 ICP Parameter Tables

Each microprogram in the microprogram set has an as-

sociated parameter table used by the ICP to process the

image data, such as the image input and output start addresses,
scaling factor, etc. The DP points to the location

in SDRAM of the first word of the parameter table. The

parameter table address must be word aligned. The pa-

rameter table can be more than one SDRAM block (16

32-bit words) long.

Note: In packed RGB24 to PCI operation the output ad-

dress offset from the start of video memory must be a

multiple of 6 bytes, i.e. on an even pixel boundary.

14.6.8 Load Coefficients

This routine loads the filter coefficient RAMs with coefficient
data in the parameter table. A total of 32 sets of five
10-bit coefficients are loaded. Each set of five coefficients
forms a 50-bit coefficient word. Two coefficients
are stored in each 32-bit word in SDRAM, so three 32-bit
words are used for each set of five coefficients that form
a coefficient word. The parameter table is 96 words (6
SDRAM blocks) long. Each coefficient is stored as the 10
LSBs of each 16-bit half word of the 32-bit word.

The parameter table for the coefficient load function contains
the coefficient data directly, as shown in Table 14-12.
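The packing of one coefficient set into three 32-bit parameter words can be sketched as follows. The function name is hypothetical, the half-word placement follows Table 14-12 (a+2 and a+1 in the first word, a+0 and a-1 in the second, a-2 and zero in the third), and treating the coefficients as 10-bit two's complement values is an assumption.

```c
#include <stdint.h>

/* Packs one set of five 10-bit coefficients into three 32-bit words.
 * coeff[0..4] hold a-2, a-1, a+0, a+1, a+2 in that order.
 * Each coefficient occupies the 10 LSBs of a 16-bit half word.
 * Masking to 10 bits assumes two's complement coefficient values. */
static void icp_pack_coeff_set(const int16_t coeff[5], uint32_t out[3])
{
    uint32_t c[5];
    for (int i = 0; i < 5; i++)
        c[i] = (uint32_t)coeff[i] & 0x3FFu;

    out[0] = (c[4] << 16) | c[3]; /* upper: a+2, lower: a+1 */
    out[1] = (c[2] << 16) | c[1]; /* upper: a+0, lower: a-1 */
    out[2] = (c[0] << 16);        /* upper: a-2, lower: 0   */
}
```

Calling this for each of the 32 coefficient sets fills the 96-word parameter table.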

14.6.9 Horizontal Filter - SDRAM to SDRAM

This routine performs horizontal scaling and filtering of

one component (Y, U or V) of an N x M image from one

location in SDRAM to another.

14.6.9.1 Algorithms

The routine reads image data from SDRAM using the Y

address counter, then scales and filters the data in the

horizontal direction and writes it back to the SDRAM us-

ing the Z address counter. The 5-tap filter scales and fil-

ters the data. The LSB Increment value supplied by the

parameter table determines the scaling. The routine

reads and writes a line at a time until the full image is

transferred. The filter mirrors the ends of each line to pro-

vide the extra pixels needed by the filter at the ends of

each line.

14.6.9.2 Parameter table

The parameter table, shown in Table 14-13, supplies the

input and output starting addresses and offsets, the im-

age height in lines and width in pixels, and the increment

value, which is derived from the scale factor.

The input and output addresses are the byte addresses

of their respective tables. They do not need to be word-

or block-aligned.

The input and output line offsets define the difference in
bytes from the address of the first pixel in the first line to
the address of the first pixel in the second line for their
respective blocks. The line offset must be constant for all
lines in each table. The line offset allows some space between
the end of one line and the start of the next line. It
also allows the ICP to scale and filter a subset of an existing
image, such as magnifying a portion of an image.
There are no restrictions on line offset values other than
they must be 16-bit, two’s complement integer values.
(Note that this allows negative offsets. You can use this
to flip an image vertically.)

The input and output image height and width values are
the height in lines and width in pixels per line for their
respective images. The height and width are 16-bit positive
binary numbers between 0 and 64K-1.

Table 14-11. ICP priority delay vs. PD code

Code  Delay (block times)
1111  1
1110  2
1101  3
1100  4
1011  5
1010  6
1001  7
1000  8
0111  9
0110  10
0101  11
0100  12
0011  13
0010  14
0001  15
0000  16

Table 14-12. Load coefficients parameter table

Parameter Word
Upper 2 bytes  Lower 2 bytes  Description
a+2            a+1            RAM Coefficient word 0
a+0            a-1
a-2            0
a+2            a+1            RAM Coefficient word 1
a+0            a-1
a-2            0
a+2            a+1            RAM Coefficient word 31
a+0            a-1
a-2            0

The Integer increment and Fraction increment values are

the scaling parameters. The Integer value is a 16-bit in-

teger, and the Fraction value is a positive binary fraction

between 0 and 0.99999+. For up scaling (output image

bigger), the increment value is the inverse of the scaling

value. If you are upscaling by a factor of 2.5, the incre-

ment value will be the inverse of 2.50 = 0.40. The Integer

increment value will be 0 and the Fraction increment val-

ue will be 0.40. For down scaling, the increment value is

equal to the scaling value. If you are down scaling by 2.5

(output image smaller), the Integer increment value will

be 2, and the Fraction increment value will be 0.500.

To perform scaling, the Integer and Fractional increment
values must be generated and placed in the parameter
table. The simplest way to generate these values in common
computer languages such as C is as follows:

1. Generate the Increment Value as a floating point

number = Input Width / Output Width

2. Multiply the Increment Value by 65536

3. Convert the result to a Lo ng Integer (32 bit s). The up -

per 16 bits of the Long integer will be the Integer in-

crement value, and the lower 16 bits will be the Frac-

tional value.

4. Store the 32-bit Long integer in the parameter table as

the combined Integer and Fractional increment val-

ues.
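The four steps above can be sketched in C roughly as follows. This is a minimal illustration, not code from the ICP software: the function name icp_increment is hypothetical, and the caller is assumed to pass a non-zero output size.

```c
#include <stdint.h>

/* Hypothetical helper (name not from the data book).
 * Returns the combined increment word: Integer increment in bits 31..16,
 * Fraction increment in bits 15..0, as described in steps 1 to 4.
 * output_size must be non-zero. */
static uint32_t icp_increment(uint16_t input_size, uint16_t output_size)
{
    /* Step 1: increment as a floating point ratio. */
    double increment = (double)input_size / (double)output_size;

    /* Steps 2 and 3: scale by 65536 and truncate to a 32-bit integer. */
    return (uint32_t)(increment * 65536.0);
}
```

For example, icp_increment(250, 100) (down scaling by 2.5) gives 0x00028000, that is Integer = 2 and Fraction = 0x8000 (0.5), while icp_increment(100, 250) (up scaling by 2.5) gives 0x00006666, that is Integer = 0 and Fraction close to 0.40.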

The Start Fraction defines the starting value in the scaling counter for
each line. It is a 16-bit, two’s complement fractional value between
-0.500 and +0.49999. The Start Fraction allows the input data to be
offset by up to half a pixel, referred to the input pixel grid. It is ‘0’
for Y and for UV co-sited data, and set to ‘-0.25’ (C000h) for
interspersed to co-sited conversion of U and V data. The ‘-0.25’ value
effectively shifts the U and V data toward the start of the line by 1/4
pixel, the amount required for conversion.
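The encoding implied by the -0.25 = C000h example can be sketched as below. The helper name is hypothetical, and the scale (1.0 corresponds to 65536) is inferred from that single example and the stated range rather than given explicitly in the text.

```c
#include <stdint.h>

/* Hypothetical helper: encode a start fraction in [-0.5, +0.5) as a
 * 16-bit two's complement fraction. Assumes 1.0 corresponds to 65536,
 * which reproduces the -0.25 -> C000h example in the text. */
static uint16_t icp_start_fraction(double fraction)
{
    int32_t scaled = (int32_t)(fraction * 65536.0);

    /* Conversion to uint16_t keeps the low 16 bits (two's complement). */
    return (uint16_t)scaled;
}
```

With this reading, icp_start_fraction(-0.25) yields 0xC000 and icp_start_fraction(0.0) yields 0x0000.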

14.6.9.3 Control word format

The Control word provides bit fields which affect the hor-

izontal filtering operation. The format of the Control word

is as follows.

Bit   Name     Function
15    Bypass   Bypass filter. Picks nearest input pixel and passes it to
               output unfiltered. When Bypass is set and the scale factor
               is 1.0, this results in an image block move.
9     GETB     Large down-scaling bit. Picks nearest input pixels and
               passes them to filter. Equivalent to bypass + 5-tap filter
               of output pixels. LSB value = 0 for filtering.

The Bypass bit causes the data to bypass the 5-tap filter.

The scaling operation selects the center pixel, and this

pixel is passed to the filter output. No filtering or interpo-

lation is provided. If the scaling factor is ‘1.0’, the result is

an image block move where the image is moved from

one part of SDRAM to another without modification. If the

scaling factor is other than ‘1.0’, the effective algorithm is

pixel picking, where the input pixel nearest the output

pixel location is used as the output pixel.

The GETB bit is an optional bit for large (> 4) down scaling. When GETB
is ‘0’ (normal operation), the 5-tap filter receives the pixel nearest
the output pixel as its center pixel plus the two adjacent input pixels
on either side of this pixel to form the five filter inputs. When GETB is
set, the filter receives the pixel nearest the output pixel as its center
pixel plus the two pixels nearest the adjacent output pixels on either
side of this pixel to form the five filter inputs. The effective
algorithm is pixel picking plus 5-tap filtering of the result. GETB also
forces the scaling LSB value to ‘0’, since output pixels are being
filtered and no interpolation is used. (See Section 14.5.2, “Filtering”.)
This is shown in Figure 14-18.

Table 14-13. Horizontal filter parameter table

Parameter Word
Upper 2 bytes            Lower 2 bytes            Description
Input image start address (32 bits)               Start address of X0Y0 (byte address)
Y counter start          Input image line         Starting value: may be 0.5, etc. for
fraction                 offset                   interspersed convert; line offset from
                                                  X0Y0 to X0Y1
Fraction increment       Integer increment        Increment value for Y = 1/scale factor
Input image height       Input image width        Height and width in input lines and pixels
Output image start address (32 bits)              Start address of X0Y0 (byte address)
Control                  Output image line        Control bits; line offset from X0Y0 to
                         offset                   X0Y1
Output image height      Output image width       Height and width in output lines and pixels
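The difference between normal and GETB tap selection for the horizontal filter can be sketched as follows. This is an illustration only: the function name is hypothetical, the step between output pixels is assumed to be a whole number of input pixels, and the mirroring applied at the ends of a line is not modeled.

```c
#include <stdint.h>

/* Hypothetical sketch of 5-tap input selection.
 * center: index of the input pixel nearest the output pixel
 * step:   distance in input pixels between adjacent output pixels
 *         (assumed to be an integer for this illustration)
 * getb:   0 = normal operation, non-zero = large down scaling
 * taps:   receives the five input pixel indices fed to the filter */
static void icp_filter_taps(int32_t center, int32_t step, int getb,
                            int32_t taps[5])
{
    /* Normal: adjacent input pixels. GETB: pixels nearest the
     * neighbouring output pixels, i.e. spaced one step apart. */
    int32_t spacing = getb ? step : 1;

    for (int k = 0; k < 5; k++) {
        taps[k] = center + (k - 2) * spacing;
    }
}
```

For a scale factor of 5.0 and a center pixel of 12, normal operation selects pixels 10 to 14 and GETB selects pixels 2, 7, 12, 17 and 22, which matches the P2N and P2L examples given with Figure 14-18.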

14.6.10 Vertical Filter - SDRAM to SDRAM

This routine performs vertical scaling and filtering of one component (Y,
U or V) of an N x M image from one location in SDRAM to another.

14.6.10.1 Algorithms

The routine reads image data from SDRAM using the Y address counter,
scales and filters the data in the vertical direction, and writes it back
to the SDRAM using the Z address counter. The 5-tap filter scales and
filters the data. The U LSB register is used as the scaling coefficient,
and the increment value supplied by the parameter table determines the
scaling. Lines at the top and bottom of the image are mirrored to provide
the extra line data needed by the 5-tap filter.

The routine reads and writes data in 64-byte (one SDRAM block) columns of
pixels until the entire image is transferred. For each column, line
segments of 64 pixels are processed until the entire column has been
processed. Each 64-pixel line segment generated requires five vertically
adjacent 64-pixel line segments as input to the 5-tap filter. The routine
processes the image in pixel columns to eliminate redundant reads of
input pixel data: each new line segment typically requires reading only
one new 64-byte line segment.

The routine processes data in 64-pixel blocks, corre-

sponding to the input block buffer sizes. Five buffers are

used in processing the current line segment, while the

sixth buffer reads in the next line segment in overlap with

current processing.

14.6.10.2 Parameter table

The parameter table, as shown in Figure 14-19, supplies the input and
output starting addresses and offsets, the image height in lines and
width in pixels, and the scale factor.

[Figure 14-18. Normal vs. Large down scaling for scale factor = 5.0.
Each panel shows a row of input pixels above a row of output pixels.
Normal down scaling: P2N = F(10, 11, 12, 13, 14).
Large down scaling: P2L = F(2, 7, 12, 17, 22).]

Figure 14-19. Vertical filter parameter table

Parameter Word
Upper 2 bytes            Lower 2 bytes            Description
Input image start address (32 bits)               Start address of X0Y0 (byte address)
U counter start          Input image line         Starting value: may be 0.5, etc. for
fraction                 offset                   interspersed convert; line offset from
                                                  X0Y0 to X0Y1
Fraction increment       Integer increment        Increment value for U = 1/scale factor
Input image height       Input image width        Height and width in input lines and pixels
Output image start address (32 bits)              Start address of X0Y0 (byte address)
Control                  Output image line        Control word; line offset from X0Y0 to
                         offset                   X0Y1
Output image height      Output image width       Height and width in output lines and pixels


The input and output addresses are the byte addresses

of their respective tables. The input and the output ad-

dress need to be 64-byte aligned.

The input and output line offsets define the difference in bytes from the
address of the first pixel in the first line to the address of the first
pixel in the second line for their respective blocks. The line offset
must be constant for all lines in each table. It allows some space
between the end of one line and the start of the next line. It also
allows the ICP to scale and filter a subset of an existing image, such as
magnifying a portion of an image. Offset values are 16-bit, two’s
complement integer values.

Vertical filtering has a restriction on input and output line offset
values: they must be positive, and they must be multiples of 64. Note
that this only applies to the line-to-line spacing. Even with this
restriction, input images may be any height and any width and may start
at any byte address. Also, image subsets of arbitrary height and width
can be used. As long as the original image has a line offset which is a
multiple of 64, all subsets of that image will also automatically have a
line offset which is a multiple of 64, the same as the original image.
All images should have line offsets which are multiples of 64 as good
programming practice, even though this restriction only applies to
vertical filtering. If an image does not have a multiple of 64 line
offset, it can be converted to that by using horizontal filtering in the
image block move mode with the output offset value being a multiple of
64.
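The line offset restriction for vertical filtering can be checked, and a suitable offset chosen, with helpers along these lines. Both function names are hypothetical and serve only to illustrate the rule stated above.

```c
#include <stdint.h>

/* Hypothetical check: returns 1 if a 16-bit line offset satisfies the
 * vertical filter restriction (positive and a multiple of 64 bytes). */
static int icp_vertical_offset_ok(int16_t line_offset)
{
    return line_offset > 0 && (line_offset % 64) == 0;
}

/* Hypothetical helper: round a line width in bytes up to the next
 * multiple of 64, to pick a line offset when allocating an image. */
static uint32_t icp_pad_line_offset(uint32_t width_bytes)
{
    return (width_bytes + 63u) & ~63u;
}
```

For instance, a 720-byte line does not satisfy the restriction as is (720 is not a multiple of 64), whereas icp_pad_line_offset(720) returns 768, which does. Because the offset is a positive 16-bit two's complement value, the result must also stay below 32768.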

The input and output image height and width values are the height in
lines and width in pixels per line for their respective images. The
height and width are 16-bit positive binary numbers between 0 and 64K-1.

The Integer increment and Fraction increment values are the scaling
parameters. The Integer value is a 16-bit integer, and the Fraction value
is a positive binary fraction between 0 and 0.99999+. For up scaling
(output image bigger), the increment value is the inverse of the scaling
value. If you are upscaling by a factor of 2.5, the increment value will
be the inverse of 2.50 = 0.40. The Integer increment value will be 0 and
the Fraction increment value will be 0.40. For down scaling, the
increment value is equal to the scaling value. If you are down scaling by
2.5 (output image smaller), the Integer increment value will be 2, and
the Fraction increment value will be 0.500.

To perform scaling, the Integer and Fractional increment values must be
generated and placed in the parameter table. The simplest way to generate
these values in common computer languages such as C is as follows:

1. Generate the Increment Value as a floating point number = Input
   Height / Output Height
2. Multiply the Increment Value by 65536
3. Convert the result to a Long Integer (32 bits). The upper 16 bits of
   the Long integer will be the Integer increment value, and the lower 16
   bits will be the Fractional value.
4. Store the 32-bit Long integer in the parameter table as the combined
   Integer and Fractional increment values.

The Start Fraction defines the starting value in the scaling counter for
each line. It is a 16-bit, two’s complement fractional value between
-0.500 and 0.49999+. The Start Fraction allows the input data to be
offset by up to half a line, referred to the input pixel grid. It is set
to ‘0’ for all conventional YUV input data.

14.6.10.3 Control word format

The Control word provides bit fields which affect the vertical filtering
operation. The format of the Control word is as follows.

Bit   Name     Function
15    Bypass   Bypass filter. Picks nearest input line and passes it to
               output unfiltered. When Bypass is set and the scale factor
               is 1.0, this results in an image block move.

The Bypass bit causes the data to bypass the 5-tap filter.

The scaling operation selects the center line, and this

line is passed to the filter output. No filtering or interpola-

tion is provided. If the scaling factor is 1.0, the result is an

image block move where the image is moved from one

part of SDRAM to another without modification. If the

scaling factor is other than 1.0, the effective algorithm is

line picking, where the input line nearest the output line

location is used as the output line.

14.6.11 Horizontal Filter with RGB/YUV Conversion to PCI or SDRAM

This routine moves an N x M image in YUV 4:2:2, YUV

4:2:0 or YUV 4:1:1 format from SDRAM to the PCI bus or

to SDRAM. The image is scaled and filtered in the hori-

zontal direction during the move. Optional bit masking

and/or RGB overlay can be used during the move when

PCI output is specified.

14.6.11.1 Algorithms

The routine reads image data from SDRAM using the Y,

U, and V address counters, scales and filters the data in

the horizontal direction and writes it to the PCI interface

or SDRAM. The 5-tap filter scales and filters the data.

The LSB Increment value for each of the Y, U and V components supplied by
the parameter table determines the scaling. Separate scaling factors
allow YUV 4:2:2 interspersed to co-sited transformation as the data is
being filtered. The scaled and filtered data is converted to RGB or YUV
format before being sent to the PCI interface or to SDRAM. In the PCI
output case, overlay data with alpha blending and chroma keying can be
added to the output image, and the output image can be gated by a bit
mask before it is sent to the PCI interface.

The routine reads and writes a line at a time until the full

image is transferred. The filter mirrors the ends of each

line to provide the extra pixels needed by the filter at the

ends of each line.


14.6.11.2 Parameter table

The parameter table, shown in Table 14-14, supplies the

input and output starting addresses and offsets for Y, U,

V, OL, B and Z, the image height in lines and width in pix-

els, and the scale factors for each component.

The input and output addresses are the byte addresses

of their respective tables. They do not need to be word or

block aligned. Note the following restriction: in packed

RGB24 to PCI operation the output address offset from

the start of video memory must be a multiple of 6 bytes,

i.e. on an even pixel boundary.

The input and output line offsets define the difference in bytes from the
address of the first pixel in the first line to the address of the first
pixel in the second line for their respective blocks. The line offset
must be constant for all lines in each table. The line offset allows some
space between the end of one line and the start of the next line. It also
allows the ICP to scale and filter a subset of an existing image, such as
magnifying a portion of an image. There are no restrictions on line
offset values other than they must be 16-bit, two’s complement integer
values. (Note that this allows negative offsets. You can use this to flip
an image vertically.)

The input and output image height and width values are the height in
lines and width in pixels per line for their respective images. The
height and width are 16-bit positive binary numbers between 0 and 64K-1.

The Integer increment and Fraction increment values are

the scaling parameters. There is a separate scaling pa-

rameter for each of the Y, U and V input components.

The Integer value is a 16-bit integer, and the Fraction val-

ue is a positive binary fraction between 0 and 0.99999+.

For up scaling (output image bigger), the increment val-

ue is the inverse of the scaling value. If upscaling by a

factor of 2.5, the increment value will be the inverse of

2.50 = 0.40. The Integer increment value will be ‘0’ and

the Fraction increment value will be ‘0.40’. For down

scaling, the increment value is equal to the scaling value.

If you are down scaling by 2.5 (output image smaller), the

Integer increment value will be ‘2’, and the Fraction incre-

ment value will be ‘0.500’.

To perform scaling, the Integer and Fractional increment values must be
generated and placed in the parameter table. The simplest way to generate
these values in common computer languages such as C is as follows:

1. Generate the Increment Value as a floating point number = Input Width
   / Output Width
2. Multiply the Increment Value by 65536
3. Convert the result to a Long Integer (32 bits). The upper 16 bits of
   the Long integer will be the Integer increment value, and the lower 16
   bits will be the Fractional value.
4. Store the 32-bit Long integer in the parameter table as the combined
   Integer and Fractional increment values.

Table 14-14. Horizontal filter to RGB output parameter table

Parameter Word
Upper 2 bytes            Lower 2 bytes            Description
Input image Y start address (32 bits)             Y start address of X0Y0 (byte address)
Y counter start          Input image Y line       Starting value: may be 0.5, etc. for
fraction                 offset                   interspersed convert; Y line offset
                                                  from X0Y0 to X0Y1
Y fraction increment     Y integer increment      Increment value for Y = 1/scale factor
Y input image height     Y input image width      Y height and width in pixels
Input image U start address (32 bits)             U start address of X0Y0 (byte address)
U counter start          Input image U line       Starting value: may be 0.5, etc. for
fraction                 offset                   interspersed convert; U line offset
                                                  from X0Y0 to X0Y1
U fraction increment     U integer increment      Increment value for U = 1/scale factor
U input image height     U input image width      U height and width in pixels
Input image V start address (32 bits)             V start address of X0Y0 (byte address)
V counter start          Input image V line       Starting value: may be 0.5, etc. for
fraction                 offset                   interspersed convert; V line offset
                                                  from X0Y0 to X0Y1
V fraction increment     V integer increment      Increment value for V = 1/scale factor
V input image height     V input image width      V height and width in pixels
Output image start address (32 bits)              Start address of X0Y0 (byte address)
Control                  Output image line        Input & output formats & control bits;
                         offset                   line offset from X0Y0 to X0Y1
Output image height      Output image width       Height and width in output pixels
Bit map image start address (32 bits)             Start address of X0Y0 (byte address)
0                        Bit map image line       Line offset from X0Y0 to X0Y1
                         offset
RGB overlay start address (32 bits)               Start address of X0Y0 (byte address)
Alpha 1 & Alpha 0        Overlay line offset      Alpha 1 & Alpha 0 blend code for RGB15+,
                                                  etc.; line offset from X0Y0 to X0Y1
Overlay end pixel        Overlay start pixel      Start and end pixels along line
Overlay end line         Overlay start line       Start and end lines in frame

For YUV 4:2:2 or YUV 4:2:0 input data and RGB output

data, the scaling factor for U and V must be twice the

scaling factor for Y, unless YUV4:2:2 sequencing is used

for speed. In YUV 4:2:2 or YUV 4:2:0 data, the horizontal

components of U and V are half those of Y. The U and V

must be upscaled by 2 to generate a YUV 4:4:4 format

internally for YUV to RGB conversion. For YUV 4:1:1 in-

put data, the U and V components must be upscaled by

a factor of 4 to generate the required internal YUV 4:4:4

format.
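The relationship between the Y and U/V increments for RGB output (without 422SEQ sequencing) can be sketched as below. The enum, the function names and the idea of deriving the chroma increment from the chroma line width are illustrative assumptions; only the factors of 2 and 4 come from the text.

```c
#include <stdint.h>

/* Hypothetical input format tags, for illustration only. */
typedef enum { ICP_YUV422, ICP_YUV420, ICP_YUV411 } icp_yuv_format;

/* Combined increment word (Integer in bits 31..16, Fraction in 15..0).
 * output_width must be non-zero. */
static uint32_t icp_increment(uint32_t input_width, uint32_t output_width)
{
    return (uint32_t)(((double)input_width / (double)output_width) * 65536.0);
}

/* Width of a U or V line relative to Y: half for YUV 4:2:2 and 4:2:0,
 * one quarter for YUV 4:1:1. */
static uint32_t icp_chroma_width(uint32_t y_width, icp_yuv_format fmt)
{
    return (fmt == ICP_YUV411) ? y_width / 4u : y_width / 2u;
}
```

With a 720-pixel Y line and a 720-pixel output line, the Y increment is icp_increment(720, 720) = 0x00010000 (scale factor 1.0). The U and V increment for YUV 4:2:2 is icp_increment(icp_chroma_width(720, ICP_YUV422), 720) = 0x00008000, an upscale by 2, and for YUV 4:1:1 it is 0x00004000, an upscale by 4.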

The Start Fraction defines the starting value in the scaling counter for
each line. It is a 16-bit, two’s complement fractional value between
-0.500 and 0.49999+. The Start Fraction allows the input data to be
offset by up to half a pixel, referred to the input pixel grid. It is ‘0’
for Y and for UV co-sited data, and is set to ‘-0.25’ (C000h) for
interspersed to co-sited conversion of U and V data. The ‘-0.25’ value
effectively shifts the U and V data toward the start of the line by 1/4
pixel, the amount required for conversion.

The Alpha 1 and Alpha 0 values are 8-bit fields within the 16-bit Alpha
field. These values are loaded into the Alpha 1 and Alpha 0 registers,
respectively, for use by RGB 15+ and YUV 4:2:2+ overlay formats in alpha
blending.

The Overlay start and end pixels and lines define the start and end
pixels and lines within the output image for the overlay. The first pixel
of the overlay image will be blended with the pixel at the Overlay Start
Pixel and Overlay Start Line in the output image.

14.6.11.3 Control word format

The Control word provides bit fields which affect the hor-

izontal filtering operation. The format of the Control word

is as follows.

Bits  Name     Function
15    Bypass   Normally set to 0 to enable filtering. Can be set to 1 to
               accomplish data move without filtering.
14    422SEQ   4:2:2 Sequence bit. Used with YUV 4:2:2 output
13    YUV420   YUV 4:2:0 input format
12    OEN      Overlay enable. Valid only for PCI output
11    PCI      PCI output enable. Otherwise SDRAM output
10    BEN      Bit mask enable. Valid only for PCI output
9     GETB     Large down scaling bit. Picks five input pixels nearest 5
               output pixels and passes to filter. Equivalent to filter
               bypass + 5-tap filter of output pixels. LSB value = 0 for
               filtering.
8     OLLE     Overlay little endian enable
7-6   OFRM     Overlay format
               0 = RGB 24+
               1 = RGB 15+
               2 = YUV 4:2:2+
5     CHK      Chroma keying enable
4     LE       RGB output little endian enable
3-0   RGB      RGB Output Code
               0 = YUV 4:2:2+
               1 = YUV 4:2:2
               2 = RGB 24+
               3 = RGB 24 packed
               4 = RGB 8A (RGB 233)
               5 = RGB 8R (RGB 332)
               6 = RGB 15+
               7 = RGB 16
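The bit assignments listed above can be captured in C as follows. The macro and function names are hypothetical (they are not taken from any Philips header file); only the bit positions and codes come from the Control word list.

```c
#include <stdint.h>

/* Hypothetical names for the Control word bits listed above. */
#define ICP_CW_BYPASS   (1u << 15)
#define ICP_CW_422SEQ   (1u << 14)
#define ICP_CW_YUV420   (1u << 13)
#define ICP_CW_OEN      (1u << 12)
#define ICP_CW_PCI      (1u << 11)
#define ICP_CW_BEN      (1u << 10)
#define ICP_CW_GETB     (1u << 9)
#define ICP_CW_OLLE     (1u << 8)
#define ICP_CW_OFRM(x)  (((x) & 0x3u) << 6)   /* overlay format, bits 7-6 */
#define ICP_CW_CHK      (1u << 5)
#define ICP_CW_LE       (1u << 4)
#define ICP_CW_RGB(x)   ((x) & 0xFu)          /* output code, bits 3-0 */

/* Output codes from the RGB field. */
enum {
    ICP_OUT_YUV422_PLUS = 0, ICP_OUT_YUV422 = 1, ICP_OUT_RGB24_PLUS = 2,
    ICP_OUT_RGB24_PACKED = 3, ICP_OUT_RGB8A = 4, ICP_OUT_RGB8R = 5,
    ICP_OUT_RGB15_PLUS = 6, ICP_OUT_RGB16 = 7
};

/* Assemble a 16-bit Control word from flag bits and the two fields. */
static uint16_t icp_control_word(unsigned flags, unsigned overlay_format,
                                 unsigned output_code)
{
    return (uint16_t)(flags | ICP_CW_OFRM(overlay_format)
                            | ICP_CW_RGB(output_code));
}
```

As an example, little-endian RGB 16 output to PCI without overlay would be icp_control_word(ICP_CW_PCI | ICP_CW_LE, 0, ICP_OUT_RGB16), which evaluates to 0x0817. YUV 4:2:2 output to SDRAM with 4:2:2 sequencing would be icp_control_word(ICP_CW_422SEQ, 0, ICP_OUT_YUV422) = 0x4001.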

The 422SEQ bit controls the internal sequencing of the

YUV to RGB operation. It is set to ‘1’ when YUV 4:2:2

output is selected. When 422SEQ is ‘0’, normal RGB out-

put is assumed. In this mode, the input is YUV 4:2:2 or

YUV 4:2:0, and the output is RGB. To generate the RGB

output, the YUV 4:2:2 or YUV 4:2:0 input must be up-

scaled to YUV 4:4:4 before conversion to RGB. This

means the scaling factor for U and V must be twice the

scaling factor for Y. The internal sequencing of the filter

in this case is UVY, UVY, UVY to generate RGB, RGB,

RGB. For YUV 4:2:2 output formats, no upscaling of U

and V is required. In this case, the 422SEQ bit is set to

one, and the filter sequence is UVYY, UVYY, UVYY.

The 422SEQ bit can be set in RGB output mode to de-

crease the processing time for the image at the expense

of color bandwidth and some corresponding decrease in

picture quality. If the 422SEQ bit is set for RGB output,

the filter will perform the UVYY sequence. In this case,

the U and V components are not upscaled by 2, and the

YUV to RGB converter updates its U and V components

every other pixel. In the normal case (422SEQ=0), it

takes 6 clock cycles to generate two RGB pixels. In the

422SEQ=1 case, it takes 4 clock cycles to generate two

RGB pixels, reducing processing time by 33%.
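The cycle counts quoted above translate into a rough per-line estimate such as the one below. This is only a sketch: the function name is hypothetical, and the estimate counts filter cycles alone, ignoring SDRAM and PCI transfer time and any per-line overhead.

```c
#include <stdint.h>

/* Hypothetical estimate of filter clock cycles for one output line:
 * 6 cycles per two pixels with UVY sequencing (422SEQ = 0),
 * 4 cycles per two pixels with UVYY sequencing (422SEQ = 1). */
static uint32_t icp_line_filter_cycles(uint32_t output_pixels, int seq422)
{
    /* Assumption: an odd pixel count is rounded up to a whole pair. */
    uint32_t pairs = (output_pixels + 1u) / 2u;

    return pairs * (seq422 ? 4u : 6u);
}
```

For a 720-pixel line this gives 2160 cycles with 422SEQ = 0 and 1440 cycles with 422SEQ = 1, the one-third reduction stated in the text.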

The YUV420 bit indicates that the input data is in YUV

4:2:0 format. In YUV 4:2:0 format, the U and V compo-

nents are half the width and half the height of the Y data.

YUV 4:2:0 data is normally converted to YUV 4:2:2 data

by a separate vertical upscaling by a factor of 2.0 for best

quality. The YUV420 bit allows using YUV 4:2:0 data di-

rectly but with some quality degradation. When YUV420

is set, the ICP up scales the data vertically by line duplication. Each U
and V input line is used twice. The separate vertical scaling step is
eliminated at the expense of

some quality since the lines are simply duplicated rather

than being fully scaled and filtered.

The OEN bit enables overlay. Set it to ‘1’ if an overlay is used, ‘0’ if
not. Overlays are only valid for PCI output.

The PCI bit selects PCI as the output port for the ICP data. A ‘1’
selects PCI output; a ‘0’ selects SDRAM output.

The BEN bit enables bit masking. Set it to ‘1’ if bit masking is used,
‘0’ if not. Bit masking is only valid for PCI output.

The GETB bit is an optional bit for large (> 4) down scaling. When GETB
is ‘0’ (normal operation), the 5-tap filter receives the pixel nearest
the output pixel as its center pixel plus the two adjacent input pixels
on either side of this pixel to form the five filter inputs. When GETB is
set, the filter receives the pixel nearest the output pixel as its center
pixel plus the two adjacent output pixels on either side of this pixel to
form the five filter inputs. The effective algorithm is pixel picking
plus 5-tap filtering of the result. GETB also forces the scaling LSB
value to ‘0’, since output pixels are being filtered and no interpolation
is used.

The OFRM bit field selects the overlay data format, as shown in the
Control word format list.

The CHK bit enables chroma keying. Set it to ‘1’ if chroma keying is
used, ‘0’ if not.

The OLLE bit sets the endian-ness of the overlay data input. Set it to
‘1’ if the overlay data is little-endian, ‘0’ if big endian. This bit is
normally set to the same value as the LE bit in the Status register.

The LE bit sets the endian-ness of the RGB/YUV output

data. Set it to ‘1’ if the output data is little-endian, ‘0’ if big

endian. The LE bit is normally set to the same value as

the LE bit in the Status register.

The RGB field defines the output data format, as shown

in the Control word format list.

Important Note: The ICP DMA Enable bit (IE) in the BIU_CTL register of
the PCI interface must be set for RGB output to PCI. This bit must be set
before initiating RGB to PCI operations, or the ICP will stall waiting
for the PCI to become ready.


Variable Length Decoder Chapter 15

by Gene Pinkston and Selliah Rathnam

15.1 VLD OVERVIEW

In this document, the generic PNX1300 name refers

to the PNX1300 Series, or the PNX1300/01/02/11

products.

The variable length decoder (VLD) unit Huffman-decodes MPEG-1 and MPEG-2
(Main Profile) video bitstreams[1-3]. This chapter describes a
programmer’s view of the VLD.

The VLD reads an MPEG stream from SDRAM, decodes the bitstream under the
control of the DSPCPU and outputs two data streams. The output data
streams contain macroblock header information and the run-length encoded
DCT coefficients. The output data streams are stored in SDRAM buffers.

The VLD unit operates independently during the slice decoding process.
The remaining decoding of the MPEG stream is carried out by the DSPCPU.

15.2 VLD OPERATION

Enabled by the DSPCPU, the VLD unit can be initialized

by hardware or software reset operations. Hardware re-

set is provided by the external TRI_RESET# pin. Soft-

ware reset is provided by one of the VLD commands.

The DSPCPU controls the VLD through the VLD com-

mand register. There are five commands supported by

the VLD:

• Shift the bitstream by some number of bits (a maxi-

mum of 15-bit shift)

• Search for the next start code

• Reset the VLD

• Parse some number of macroblocks

• Flush VLD output buffers to SDRAM

[Figure 15-1. VLD block diagram. Blocks shown: HWY_BUS, DMA engine, RD
buffer (64 bytes), MMIO & CONF REGs, control and status, shifter,
start_code_detector, mb_addr, mb_type, cbp, dmv & motion, dct_lum,
dct_chr, dctcoef (0), dctcoef (1), escape_codes, VLD flow control,
interrupt, Run-Level WR FIFO, Macroblock Hdr WR FIFO.]

The normal mode of operation will be for the DSPCPU to request that the
VLD parse some number of macroblocks. Once the VLD has begun parsing
macroblocks, it may stop for any one of the following reasons:

• The command was completed with no exceptions

• A start code was detected

• An error was encountered in the bitstream

• The VLD input DMA completed, and the VLD is

stalled waiting for more data

• One of the VLD output DMAs has completed and the

VLD is stalled because the output FIFO is full

The DSPCPU can be interrupted whenever the VLD halts.

Consider the case in which the VLD has encountered a start code. At this
point, the VLD will halt and set the status flag to indicate that a start
code has been detected. This event will generate an interrupt to the
DSPCPU (if the corresponding interrupt is enabled). Upon entering the
interrupt routine, the DSPCPU will read the VLD status. Once it has
determined that a start code was encountered, the DSPCPU will read 8 bits
from the VLD shift register to determine the type of start code
encountered. If it is a ‘slice’ start code, the DSPCPU reads from the
shift register the slice quantization scale and any extra slice
information. The slice quantization scale is then written back to the VLD
quantizer-scale register. Before exiting the interrupt routine, the
DSPCPU will clear the start code detected status bit in the status
register and issue a new command to process the remaining macroblocks.

15.3 DECODING UP TO A SLICE

MPEG decoding up to the slice layer is carried out by the DSPCPU and the
VLD. The VLD is controlled by the DSPCPU for the decoding of all
parameters up to the slice-start code. During this process, the DSPCPU
reads from the VLD_SR register which contains the next 16 bits of the
bitstream. The DSPCPU also issues shift commands to the VLD in order to
advance the contents of the shift register by the specified number of
bits. The DSPCPU may also command the VLD to advance to the next start
code. Refer to Table 15-6 for a complete list of VLD commands and their
functions. Once at the slice layer, the VLD operates independently for
the entire slice decoding. The slice decoding starts once the DSPCPU
issues a parse command.

15.4 VLD INPUT

Input to the VLD is controlled by the VLD input DMA engine. The input DMA
engine is programmed by the DSPCPU to read from SDRAM. The DSPCPU
programs this DMA engine by writing the address and the length of the
SDRAM buffer containing the MPEG stream. The address of the buffer is
written to the VLD_BIT_ADR register. The length, in bytes, of the buffer
is written to the VLD_BIT_CNT register.

[Figure 15-2. MPEG-2 macroblock header output format. Fields shown: Esc
Count, MBA Inc, MB Type, Mot Type, DCT Type, MV count, MV Format, DMV;
MV Field Sel, Motion Code and Motion Residual for the first forward,
second forward (MPEG-2 only), first backward and second backward (MPEG-2
only) motion vectors; quant scale, CBP, dmvector[0] and dmvector[1].]

The VLD reads data from SDRAM into an internal 64-byte FIFO. The VLD
processing engine then reads data from the FIFO as needed. Once this
internal FIFO is empty the VLD reads more data from SDRAM. The
VLD_BIT_ADR and VLD_BIT_CNT registers are updated after each read from
main memory. The content of the VLD_BIT_ADR register reflects the next
address from which the bitstream data will be fetched. The content of the
VLD_BIT_CNT register reflects the number of bytes remaining to be read
before the current transfer is complete. When the number of bytes
remaining to be read from SDRAM is zero, a status flag is set and an
interrupt can be generated to the DSPCPU. The DSPCPU will provide the new
bitstream buffer address and the number of bytes in the bitstream buffer
to the VLD.

15.5 VLD OUTPUT

The VLD outputs two data streams which are written back to main memory by
two output DMA engines. These DMA engines are programmed by the DSPCPU.
One of the output streams contains macroblock header information and the
other contains run-length encoded DCT coefficients. Each DMA engine
contains a 64-byte FIFO which is transferred to main memory once it is
full. The main memory address and count for the macroblock header output
are contained in the VLD_MBH_ADR and VLD_MBH_CNT registers respectively.
The main memory address and count for the DCT coefficient output are
contained in the VLD_RL_ADR and VLD_RL_CNT registers respectively. The
counts for both the macroblock header and coefficient data are expressed
in terms of 32-bit (4-byte) words.

15.5.1 Macroblock Header Output Data

For each MPEG-2 macroblock parsed by the VLD, six 32-bit words of
macroblock header information will be output from the VLD. Figure 15-2
pictures the layout of the VLD output, while the fields are described in
Table 15-1. Note that these fields may or may not be valid depending upon
the MPEG-2 video standard[2]. For example, motion vectors are not valid
for intra coded macroblocks. Similarly, ‘DCT Type’ is not valid for field
pictures.

For each MPEG-1 macroblock parsed by the VLD, four 32-bit words of
macroblock header information will be output from the VLD. Figure 15-3
pictures the layout of the VLD output, while the fields are described in
Table 15-2. Note that these fields may or may not be valid depending upon
the MPEG-1 video standard[1].
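Since the header output is six words per MPEG-2 macroblock and four per MPEG-1 macroblock, and the output counts are expressed in 32-bit words, the size of a macroblock header buffer can be estimated as in this sketch. The constant and function names are hypothetical.

```c
#include <stdint.h>

/* Words of macroblock header output per macroblock, from the text. */
#define VLD_MBH_WORDS_MPEG2 6u
#define VLD_MBH_WORDS_MPEG1 4u

/* Hypothetical helper: number of 32-bit words of header output
 * produced for a given number of macroblocks. */
static uint32_t vld_mbh_word_count(uint32_t macroblocks, int is_mpeg2)
{
    return macroblocks * (is_mpeg2 ? VLD_MBH_WORDS_MPEG2
                                   : VLD_MBH_WORDS_MPEG1);
}
```

For example, 45 MPEG-2 macroblocks (one row of a 720-pixel-wide picture) produce vld_mbh_word_count(45, 1) = 270 words, or 1080 bytes, of header data.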

Table 15-1. References for the MPEG-2 macroblock header data

Item                        Default     References from MPEG-2 Video Standard,
                            value       IS 13818-2 document
Esc count                   0           Section 6.2.5
MBA inc                     -           Section 6.2.5 and Table B-1
MB type                     undefined   Section 6.2.5.1 and Tables B-2, B-3, and
                                        B-4; only the 5 most significant bits
                                        from the tables are used
Mot type                    undefined   Section 6.2.5.1; Field or Frame motion
                                        type will be decided by the user
DCT type                    undefined   Section 6.2.5.1
MV count                    undefined   Tables 6-17 and 6-18. The MV Count value
                                        is one less than the value from the
                                        tables.
MV format                   undefined   Tables 6-17 and 6-18
DMV                         undefined   Tables 6-17 and 6-18
MV field Sel[0][0] to       undefined   Section 6.2.5 and 6.2.5.2
MV field Sel[1][1]
Motion code[0][0][0] to     undefined   Section 6.2.5.2.1 and Table B-10
Motion code[1][1][1]
Motion Residual[0][0][0]    undefined   Section 6.2.5.2.1; the corresponding
to Motion                               rsize bits are extracted from the
Residual[1][1][1]                       bitstream and stored as left justified;
                                        to get the final value shift the given
                                        number by (8 - corresponding rsize). The
                                        rsize values are stored in VLD_PI
dmvector[1] and             undefined   Section 6.2.5.2.1 and Table B-11; signed
dmvector[0]                             2-bit integer from Table B-11.
CBP                         -           Section 6.2.5, 6.2.5.3 and Table B-9
Quant scale                 -           Section 6.2.5; 5-bit from bitstream and
                                        use Table 7-6 to compute the quant scale
                                        value.

Table 15-2. References for the MPEG-1 macroblock header data

| Item | Default value | References from IS 11172-2 document |
|---|---|---|
| Esc count | 0 | Section 2.4.3.6 |
| MBA inc | - | Section 2.4.3.6 |
| MB type | undefined | Section 2.4.3.6 and Tables B-2a to B-2d |
| Motion code[0][0][0] to Motion code[0][1][1] | undefined | Section 2.4.2.7 and Table B-4 |
| Motion residual[0][0][0] to Motion residual[0][1][1] | undefined | Section 2.4.2.7; the corresponding rsize bits are extracted from the bitstream and stored left justified; to get the final value, shift the given number by (8 - corresponding rsize). The rsize values are stored in VLD_PI |
| CBP | - | Section 2.4.3.6 and Table B-3 |
| Quant scale | - | Section 2.4.2.7 |
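The motion residual entries in Tables 15-1 and 15-2 say the value is stored left justified and must be shifted by (8 - rsize). A minimal sketch of that recovery follows. It assumes the residual occupies an 8-bit field and that the shift is a logical right shift; the function name is illustrative and not part of any PNX1300 library.

```c
#include <stdint.h>

/* Recover a motion residual from its left-justified form.
 * Assumptions (not stated explicitly in the tables): the residual
 * sits in an 8-bit field and "shift" means a logical right shift
 * by (8 - rsize). rsize is the matching HFRS/VFRS/HBRS/VBRS value
 * held in VLD_PI. */
static inline uint32_t vld_motion_residual(uint32_t stored, unsigned rsize)
{
    if (rsize == 0u || rsize > 8u)
        return 0u;                  /* no residual bits, or out of range */
    return (stored & 0xFFu) >> (8u - rsize);
}
```

For example, a stored value of 0xA0 with rsize = 3 yields the 3-bit residual 5.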

PNX1300/01/02/11 Data Book Philips Semiconductors

15-4 PRELIMINARY SPECIFICATION

15.5.2 Run-Level Output Data

The DCT coefficients associated with the macroblock are

output to a separate memory area and each DCT coefficient

is represented as one 32-bit quantity (16 bits of run

and 16 bits of level). For intra blocks, the DC term is ex-

pressed as 16 bits of DC size and a 16-bit value whose

most significant bits (the number of bits used for DC level

is determined by DC size) represent the DC level. Each

block of DCT coefficients is terminated by a run value of

‘0xff’.
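The run-level layout described above can be walked in software as sketched below. The text does not say which half of the 32-bit word holds the run, so placing the run in bits 31:16 and a signed level in bits 15:0 is an assumption, as are all helper names. The special DC word of intra blocks is not decoded here.

```c
#include <stddef.h>
#include <stdint.h>

#define VLD_RL_END_OF_BLOCK 0xFFu   /* run value that terminates a block */

/* Assumed packing: run in bits 31:16, level (signed) in bits 15:0. */
static inline unsigned vld_rl_run(uint32_t w)
{
    return (unsigned)((w >> 16) & 0xFFFFu);
}

static inline int vld_rl_level(uint32_t w)
{
    return (int)(int16_t)(uint16_t)(w & 0xFFFFu);
}

/* Scan one block of run/level words. Stores the number of words that
 * precede the terminator in *pairs and returns the number of words
 * consumed including the terminator, or 0 if none is found in n words. */
static size_t vld_rl_block_length(const uint32_t *buf, size_t n, size_t *pairs)
{
    for (size_t i = 0; i < n; i++) {
        if (vld_rl_run(buf[i]) == VLD_RL_END_OF_BLOCK) {
            *pairs = i;
            return i + 1;
        }
    }
    *pairs = 0;
    return 0;
}
```

A caller would advance its read pointer by the returned length to reach the next block.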

15.6 VLD TIME SHARING

The PNX1300 VLD is targeted for a single bitstream de-

code and there is no provision to decode more than one

bitstream at a time by using the VLD in time multiplexed

mode. However, internal development has shown that up

to 4 simultaneous MPEG1 bitstreams can be decoded.

This procedure is beyond the scope of this databook but

can be discussed further by contacting customer sup-

port.

15.7 MMIO REGISTERS

To ensure compatibility with future devices, any unde-

fined MMIO bits should be ignored when read, and writ-

ten as ‘0’s.

15.7.1 VLD Status (VLD_STATUS)

This register contains the current status information most

pertinent to the normal operation of an MPEG video de-

code application. VLD status description is detailed in

Table 15-3 and pictured in Figure 15-4. Default value (after

hardware reset) is ‘0’.

Interrupts can be enabled for any of the defined status

bits (see following VLD_IMASK description). Acknowl-

edgment of the interrupt is done by writing a ‘1’ to the cor-

responding bit in VLD_STATUS register. Writing a one to

the bits one through five clears the corresponding bits.

However bit 0 (COMMAND_DONE) is cleared only by is-

suing a new command. Writing a ‘0’ to bit 0 of the status

that several status bits may be asserted simultaneously.

Thus it is recommended to use level triggered interrupts

(see Section 3.5.3.6 on page 3-11) and carefully ac-

knowledge the interrupt.
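The acknowledge rule above (write ‘1’ to clear bits one through five, while bit 0 clears only when a new command is issued) can be sketched as a mask computation. Only the position of COMMAND_DONE at bit 0 is given in the text; the remaining bit positions are an assumption taken from the order of Table 15-3, and the names are illustrative.

```c
#include <stdint.h>

/* Assumed bit positions, following the order of Table 15-3. */
#define VLD_ST_COMMAND_DONE  (1u << 0)
#define VLD_ST_STARTCODE     (1u << 1)
#define VLD_ST_ERROR         (1u << 2)
#define VLD_ST_DMA_IN_DONE   (1u << 3)
#define VLD_ST_MBH_OUT_DONE  (1u << 4)
#define VLD_ST_RL_OUT_DONE   (1u << 5)

/* Bits 1..5 are cleared by writing '1'; bit 0 is not. */
#define VLD_ST_W1C_MASK      0x3Eu

/* Value to write back to VLD_STATUS to acknowledge every
 * write-1-to-clear flag that was observed in 'status'. */
static inline uint32_t vld_status_ack_value(uint32_t status)
{
    return status & VLD_ST_W1C_MASK;
}
```

Because several flags may be set at once, a handler would typically read VLD_STATUS once, service each flag it saw, and write back only the value computed from that same read.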

15.7.2 VLD Interrupt Enable (VLD_IMASK)

This register allows the DSPCPU to control the initiation

of the interrupt for the corresponding bits in the VLD Status

Register. Writing a ‘1’ into any of the defined

VLD_IMASK bits enables the interrupt for the corre-

sponding bit in the status register (VLD_STATUS). De-

fault value (after hardware reset) is ‘0’.

Figure 15-3. MPEG1 Macroblock Header Output Format (fields shown: Esc Count, MBA Inc, MB Type; first forward motion vector: Motion Code [0][0][0], Motion Residual [0][0][0], Motion Code [0][0][1], Motion Residual [0][0][1]; first backward motion vector: Motion Code [0][1][0], Motion Residual [0][1][0], Motion Code [0][1][1], Motion Residual [0][1][1]; CBP; quant scale)


15.7.3 VLD Control (VLD_CTL)

The VLD_CTL register has one bit indicating the endianness

of the VLD unit. Little-Endian = ‘1’, Big-Endian = ‘0’.

Default value (after hardware reset) is ‘0’.

15.8 VLD DMA REGISTERS

There are one input DMA engine and two output DMA

engines in the VLD block. Each of the three DMA en-

gines (or channels) for the VLD is controlled by two

MMIO registers. The address register always contains

the address of the next SDRAM transaction. The count

register contains the amount of data remaining to be trans-

ferred to or from main memory. A DMA completes when

its count reaches zero. Once a DMA count register be-

comes zero, a bit is set in the status register and the

DSPCPU can be interrupted. The DSPCPU sets a non-

zero value to a DMA count register to initiate a new DMA

transaction. The input count register always contains the

number of bytes to be fetched from the main memory.

The output count registers always contain the number of

words (4 bytes) to be written to the main memory.

Note that both of the DMA output engines write only to

64-byte aligned addresses and they always write 64

bytes. When flushing the DMA output FIFOs there may

not be 64 bytes of valid data at the time the flush com-

mand is received. In this case, 64 bytes are still written to

the main memory. The valid bytes can be determined

from the count register value before issuing the flush

command. The valid data always resides in the first N

bytes while the last 64-N bytes will contain random data

and should be ignored.
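One way to obtain N for the last 64-byte block is sketched below. It relies on the flush description in Table 15-6, which states that only the valid words written are subtracted from the output count register; reading the count once before and once after the flush is an assumed procedure, and the function name is hypothetical.

```c
#include <stdint.h>

#define VLD_DMA_BLOCK_BYTES 64u

/* Valid bytes in the final 64-byte block written by a flush.
 * Assumption: cnt_before and cnt_after are the output count register
 * (in 32-bit words) sampled before and after the flush command, and
 * the register drops only by the number of valid words written. */
static inline uint32_t vld_flush_valid_bytes(uint32_t cnt_before,
                                             uint32_t cnt_after)
{
    uint32_t bytes = (cnt_before - cnt_after) * 4u;
    return bytes > VLD_DMA_BLOCK_BYTES ? VLD_DMA_BLOCK_BYTES : bytes;
}
```

The remaining (64 - N) bytes of the block should be ignored, as noted above.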

15.8.1 DMA Input

The bitstream input to the VLD is controlled by

VLD_BIT_ADR and VLD_BIT_CNT MMIO registers.

VLD_BIT_ADR contains the main memory address for

the next read from the main memory to the VLD input

FIFO. VLD_BIT_CNT register contains the number of

bytes remaining to be read before the current DMA is

completed.

The VLD input address is byte aligned.

15.8.2 Macroblock Header Output DMA

The macroblock header output of the VLD is controlled

by VLD_MBH_ADR and VLD_MBH_CNT registers.

VLD_MBH_ADR contains the address of the next write

of macroblock header data to the main memory.

VLD_MBH_CNT contains the remaining number of

words (4 bytes) to write before the current DMA expires.

The macroblock header output address is 64-byte

aligned.

15.8.3 Run-Level Output DMA

The run-level output of the VLD is controlled by

VLD_RL_ADR and VLD_RL_CNT. VLD_RL_ADR contains

the address of the next write of run-level

data to the main memory. VLD_RL_CNT contains the

number of 4-byte writes remaining before the current

DMA expires.

The run-level buffer address is 64-byte aligned.

Table 15-3. VLD_STATUS register

| Name | Size (bits) | Description |
|---|---|---|
| COMMAND_DONE | 1 | Indicates successful completion of current command |
| STARTCODE | 1 | VLD encountered 0x000001 while executing parse or next start code command |
| ERROR | 1 | VLD encountered an illegal Huffman code or an unexpected start code |
| DMA_IN_DONE | 1 | DMA transfer of given SDRAM buffer has completed and VLD is stalled waiting on more main memory input data; DSPCPU is responsible to provide the new SDRAM buffer to VLD |
| MBH_OUT_DONE | 1 | Macroblock Header DMA transfer has completed |
| RL_OUT_DONE | 1 | Run-level DMA transfer complete |

Table 15-4. VLD control (R/W)

| Name | Size (bits) | Description |
|---|---|---|
| Reserved | 1 | |
| Little Endian | 1 | Forces VLD to operate in Little Endian Mode when set to 1. |


Figure 15-4. VLD MMIO Registers Layout.

| MMIO_base offset | Register | Access | Fields shown |
|---|---|---|---|
| 0x10 2800 | VLD_COMMAND | r/w | COMMAND, COUNT |
| 0x10 2804 | VLD_SR | r | VALUE |
| 0x10 2808 | VLD_QS | r/w | |
| 0x10 280C | VLD_PI | r/w | VBRS, HBRS, VFRS, HFRS, MPEG2, CONCEAL_MV, INTRA_VLC, FPFD, PICT_STRUC, PICT_TYPE |
| 0x10 2810 | VLD_STATUS | r | RL_OUT_DONE, MBH_OUT_DONE, DMA_IN_DONE, ERROR, STARTCODE, COMMAND_DONE |
| 0x10 2814 | VLD_IMASK | r/w | Int. Enables |
| 0x10 2818 | VLD_CTL | r/w | LITTLE_ENDIAN |
| 0x10 281C | VLD_BIT_ADR | r/w | BIT_ADR |
| 0x10 2820 | VLD_BIT_CNT | r/w | BIT_CNT |
| 0x10 2824 | VLD_MBH_ADR | r/w | MBH_ADR |
| 0x10 2828 | VLD_MBH_CNT | r/w | MBH_CNT |
| 0x10 282C | VLD_RL_ADR | r/w | RL_ADR |
| 0x10 2830 | VLD_RL_CNT | r/w | RL_CNT |


15.9 VLD OPERATIONAL REGISTERS

15.9.1 VLD Command (VLD_COMMAND)

This register indicates the next action to be taken by the

VLD. Some commands have an associated count which

resides in the least significant 8 bits of this register. There

are currently five commands recognized by the VLD

block:

• Shift the bitstream by ‘count’ bits (‘count’ must be

less than or equal to 15)

• Parse ‘count’ un-skipped macroblocks

• Search for the next start code

• Reset the VLD

• Flush the VLD output buffers

The DSPCPU must wait for the VLD to halt before the

next command can be issued. Note that there are several

ways in which a command may complete. Only a suc-

cessful completion is indicated by the

COMMAND_DONE bit in the status register. A command

may complete unsuccessfully if a start code or an error is

encountered before the requested number of items has

been processed. Note also that expiration of a DMA

count does not constitute completion of a command.

When a DMA count expires the VLD is stalled as it waits

for a new DMA to be initiated. It is not halted. Default val-

ue (after hardware reset) is ‘0’. VLD_COMMAND fields

are described in Table 15-5 and the different commands

are explained in Table 15-6.
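A command word can be assembled as sketched below. The command codes come from Table 15-6 and COUNT occupies the least significant 8 bits as stated above; placing the 4-bit COMMAND field in bits 11:8 is an assumption based on the field order in Figure 15-4, and the identifiers are illustrative.

```c
#include <stdint.h>

/* Command codes from Table 15-6. */
enum vld_cmd {
    VLD_CMD_SHIFT  = 1,   /* shift the bitstream by 'count' bits   */
    VLD_CMD_PARSE  = 2,   /* parse 'count' un-skipped macroblocks  */
    VLD_CMD_SEARCH = 3,   /* search for the next start code        */
    VLD_CMD_RESET  = 4,   /* reset the VLD                         */
    VLD_CMD_FLUSH  = 8    /* flush the VLD output buffers          */
};

/* COUNT is in bits 7:0 (from the text). COMMAND in bits 11:8 is an
 * assumption, not a documented bit position. */
static inline uint32_t vld_command_word(enum vld_cmd cmd, unsigned count)
{
    return (((uint32_t)cmd & 0xFu) << 8) | ((uint32_t)count & 0xFFu);
}

/* The shift command accepts a count of at most 15 bits. */
static inline int vld_shift_count_valid(unsigned count)
{
    return count <= 15u;
}
```

Software would write the returned word to VLD_COMMAND only after the previous command has halted.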

15.9.2 VLD Shift Register (VLD_SR)

This read only register is a shadow of the VLD’s opera-

tional shift register. It allows the DSPCPU to access the

bitstream through the VLD. Bits 0 through 15 are the current

contents of the VLD shift register. Bits 16 to 31 are

RESERVED and should be treated as undefined by the

programmer.

15.9.3 VLD Quantizer Scale (VLD_QS)

This 5-bit register contains the quantization scale code

(from the slice header) to be output by the VLD until it is

overridden by a macroblock quantizer scale code. The

quantizer scale code is part of the macroblock header

output.

Table 15-5. VLD Command Register

| Name | Size (bits) | Description |
|---|---|---|
| COUNT | 8 | Count for current command |
| COMMAND | 4 | VLD command to be executed |

Table 15-6. VLD Commands

| Command | Field coding | Flags set after completion of the command | Description |
|---|---|---|---|
| Shift the bitstream by ‘count’ bits | 1 | COMMAND_DONE, DMA_IN_DONE | VLD shifts its internal shift register by the given number of bits. The shift register value is available in the VLD_SR register. The DMA_IN_DONE flag will be set when VLD runs out of data from the input FIFO. The flag is reset by issuing the new command. |
| Search for the next start code | 3 | STARTCODE, COMMAND_DONE, DMA_IN_DONE | VLD searches for a start code. The start code has a 0x000001 prefix and an additional 8-bit value. The DMA_IN_DONE flag will be set when VLD runs out of data from the input FIFO. The STARTCODE detected flag is reset by writing a ‘1’ value to the flag. The COMMAND_DONE flag is reset by issuing the new command. |
| Reset the VLD | 4 | None | Refer to Section 15.12 for more details. |
| Parse for a given number of macroblocks | 2 | COMMAND_DONE, STARTCODE, ERROR, DMA_IN_DONE | VLD parses for a given number of un-skipped macroblocks and the associated run-level values. COUNT will indicate the remaining macroblocks to parse. Note that this number is slightly inaccurate since a parsed macroblock can still be in the internal 64-byte FIFO. If VLD encounters a start code, the parsing action will be terminated and VLD sets only the STARTCODE detected flag. If VLD parses the given number of un-skipped macroblocks without encountering a start code, VLD will set the COMMAND_DONE flag. The ERROR flag will be set when VLD encounters an error while parsing the bitstream. The DMA_IN_DONE flag will be set when VLD runs out of data from the input FIFO. The STARTCODE detected flag is reset by writing a ‘1’ value to the flag. The COMMAND_DONE flag is reset by issuing the new command. |
| Flush the VLD output buffer | 8 | COMMAND_DONE | VLD flushes the remaining macroblock header data and the remaining run-level data to SDRAM. The highway byte-enables will be used in order to write only the valid data to SDRAM. Only the valid word count values written to SDRAM will be subtracted from the VLD_MBH_CNT and the VLD_RL_CNT registers. |


15.9.4 VLD Picture Info (VLD_PI)

This 32-bit register contains the picture layer information

necessary for the VLD to parse the macroblocks within

that picture. Again, the values for each of these fields are

determined by the appropriate standard (MPEG [1-3]).

15.10 ERROR HANDLING

Upon encountering a bitstream error, the VLD will set the

bitstream-error flag (ERROR) in the VLD_STATUS reg-

ister and interrupt the DSPCPU, if the interrupt is enabled.

Note that if a start code is present (in the VLD shift

code and the error bits will be set. A separate flush com-

mand is required to flush any valid data in the run-level

and macroblock header output buffers.

The DSPCPU de-asserts the ERROR flag by writing a

‘1’ to it.

15.11 INTERRUPT

The interrupt source number for the VLD is 14 and it

should be set in level sensitive mode (see Section

3.5.3.6 on page 3-11).

15.12 RESET

The VLD block is reset by a hardware reset or a software

reset. The hardware reset signal is generated from the

external pin TRI_RESET#. The software reset is initiated

by writing a ‘Reset VLD’ command in the

VLD_COMMAND register. Refer to Table 15-8 for the

details on the software reset procedure.

15.13 ENDIAN-NESS

VLD supports little-endian and big-endian modes of

operation. Refer to Appendix C for the endian-ness

specification of the VLD input and output data.

15.14 POWER DOWN

The VLD block can be separately powered down by setting

a bit in the BLOCK_POWER_DOWN register. For a

description of powerdown, see Chapter 21, “Power

Management.”

The VLD block should not be active when applying block

powerdown.

If the block enters power-down state while it is enabled,

its behavior upon power-up is undefined.

15.15 REFERENCES

[1] ISO/IEC IS 13818-2, International Standard (1994),

MPEG-2 Video.

[2] ISO/IEC IS 11172-2, International Standard (1992),

MPEG-1 Video.

[3] MPEG Video Compression Standard, by Joan L.

Mitchell, William B. Pennebaker, Chad E. Fogg, Didier J.

LeGall; ITP publication.

Table 15-7. VLD picture info register (r/w)

| Name | Size (bits) | Description |
|---|---|---|
| PICT_TYPE (picture type) | 2 | I, P, or B picture |
| PICT_STRUC (picture structure) | 2 | Field or frame picture |
| FPFD (frame prediction frame dct) | 1 | Specifies that this picture uses only frame prediction and frame dct |
| INTRA_VLC | 1 | Use DCT table zero or one |
| CONCEAL_MV | 1 | Concealment vectors present in the bitstream |
| reserved | 6 | Reserved for future expansion |
| MPEG2 mode | 1 | Switches VLD between MPEG-1 and MPEG-2 decoding. Value ‘1’ = MPEG-2 mode |
| reserved | 2 | Reserved |
| HFRS (horizontal forward rsize) | 4 | Size of residual motion vector |
| VFRS (vertical forward rsize) | 4 | Size of residual motion vector |
| HBRS (horizontal backward rsize) | 4 | Size of residual motion vector |
| VBRS (vertical backward rsize) | 4 | Size of residual motion vector |

Table 15-8. Software reset procedure

| Cycle no. | Action | Remarks |
|---|---|---|
| i | DSPCPU issues the ‘Reset the VLD’ command by writing the required value in the VLD_COMMAND register. | |
| i to j | VLD will complete the pending, if any, highway transactions. | Any highway transactions, once started, will not be aborted in the middle. |
| j+1 | VLD will perform the full reset. All status and control registers are reset and all the buffers are made empty. | MMIO registers initialized to zero include VLD_STATUS. |


I2C Interface Chapter 16

by Essam Abu-ghoush, Robert Nichols

16.1 I2C OVERVIEW

In this document, the generic PNX1300 name refers

to the PNX1300 Series, or the PNX1300/01/02/11

products.

PNX1300 includes an I2C interface which can be used to

control many different multimedia devices such as:

• DMSDs - Digital multi-standard decoders

• DENCs - Digital encoders

• Digital cameras

• I2C - Parallel I/O expanders

The key features of the I2C interface are:

• Supports I2C single master mode

• I2C data rate up to 400 kbits/sec

• Support for the 7-bit addressing option of the I2C

specification

• Provisions for full software use of I2C interface pins

for implementing software I2C or similar protocols

Note that the I2C pins are also used to load the initial boot

parameters and/or code from a serial EEPROM as de-

scribed in Section 13, “System Boot”. The boot logic is

only active upon PNX1300 hardware reset and quiescent

afterwards.

A typical system using the I2C interface is presented in

Figure 16-1. The PNX1300 is connected as a master to

a series of slave devices through SCL and SDA. Note

that the bus has one pullup resistor for each of the clock

and data lines. The pullup should be set to a voltage no

higher than VREF_PERIPH.

16.2 COMPARED TO TM-1000

The following are the main I2C differences from TM-

1000:

• The SEX bit is removed. Endian-ness is fixed.

• The I2C clock rate is closer to 100/400 kHz

• The GDI bit now correctly indicates write-completion

• Clock stretching is always enabled.

16.3 EXTERNAL INTERFACE

The I2C external interface is composed of two signals as

shown in Table 16-1.

16.4 I2C REGISTER SET

The I2C user interface consists of four registers visible to

the programmer. The registers are mapped into the

MMIO address space and are fully accessible to the

programmer. Figure 16-2 shows the I2C register set. To

ensure compatibility with future devices, any undefined

MMIO bits should be ignored when read, and written as

‘0’s.

16.4.1 IIC_AR Register

The IIC_AR is the I2C address register and is used in both

master receive and transmit modes. This register is written

with the address(es) of the I2C slave device and the

bytecount for transmit/receive. Table 16-2 lists the bit-

field definitions for the IIC_AR register.

Figure 16-1. Typical I2C system implementation (PNX1300 master and I2C slave devices sharing SCL and SDA; each line is pulled up through a resistor Rp to +VREF_PERIPH)

Table 16-1. I2C External interface

| Signal | Type | Description |
|---|---|---|
| IIC_SDA | I/O | I2C serial data |
| IIC_SCL | O | I2C clock |

Table 16-2. IIC_AR Register

| Bits | Field Name | Definition |
|---|---|---|
| 31:25 | ADDRESS | 7-bit slave device address. |
| 24 | DIRECTION | Read/Write control bit |
| 23:16 | reserved | must be written to ‘0’ |
| 15:8 | COUNT | Byte count of requested transfer |
| 7:0 | reserved | Read as ‘0’ |


ADDRESS must be programmed to contain the 7 bits of

the desired slave address.

The DIRECTION bitfield controls read/write operation on

the I2C interface. The bit definition is:

• DIRECTION = 0 –> I2C write

• DIRECTION = 1 –> I2C read

The COUNT field must contain the desired bytecount for

the current transfer. The COUNT field will decrement by

one for each data byte transferred across I2C. The re-

maining bytecount for the current transfer can be read

from the COUNT field at any time. Note that the

DSPCPU must refrain from rewriting the IIC_AR register

until the current transfer completes to avoid corrupting

the bytecount or address fields.

Note: For writes, the byte count decrements before the

byte is actually transferred over the I2C bus. However,

the last byte is saved in an internal register and the

DSPCPU can write a new word when COUNT = 0.
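The IIC_AR layout in Table 16-2 can be packed as sketched below. The bit positions follow the table; the macro and function names are illustrative and do not come from any PNX1300 header.

```c
#include <stdint.h>

#define IIC_DIR_WRITE 0u   /* DIRECTION = 0: I2C write */
#define IIC_DIR_READ  1u   /* DIRECTION = 1: I2C read  */

/* Build an IIC_AR value: ADDRESS in bits 31:25, DIRECTION in bit 24,
 * COUNT in bits 15:8. Reserved bits are left at zero. */
static inline uint32_t iic_ar_pack(unsigned addr7, unsigned direction,
                                   unsigned count)
{
    return ((uint32_t)(addr7 & 0x7Fu) << 25)
         | ((uint32_t)(direction & 0x1u) << 24)
         | ((uint32_t)(count & 0xFFu) << 8);
}

/* Remaining byte count as read back from IIC_AR. */
static inline unsigned iic_ar_count(uint32_t ar)
{
    return (unsigned)((ar >> 8) & 0xFFu);
}
```

For instance, a 4-byte read from a slave at 7-bit address 0x50 gives the register value 0xA1000400.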

16.4.2 IIC_DR Register

The IIC_DR register contains the actual data transferred

during I2C operation. For a master transmit operation,

data transfer will be initiated when data is written to this

address byte in the IIC_AR register followed by the data

bytes that were written to the IIC_DR register, byte3 first

and byte0 last. The I2C interface will interrupt for more

transmit data to be written to the IIC_DR until the transfer

bytecount COUNT in the IIC_AR register is reached.

In master receive operation, one or more data bytes received

are placed in the IIC_DR register by the I2C interface.

Data bytes received are loaded into the IIC_DR

The number of bytes the DSPCPU requests for a transfer

is written into the COUNT bitfield of the IIC_AR register.

The transfer completes when the I2C interface receives

the number of bytes indicated by the COUNT bitfield of

the IIC_AR register.
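Because bytes go out byte3 first and byte0 last, a transmit word can be assembled as sketched below. Placing BYTE3 in bits 31:24 is an assumption taken from the field order in Figure 16-2; the function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Pack up to four bytes into an IIC_DR word so that the first byte of
 * the message lands in BYTE3 (assumed to be bits 31:24) and is
 * therefore transmitted first. Unused byte lanes are left at zero. */
static inline uint32_t iic_dr_pack(const uint8_t *src, size_t n)
{
    uint32_t w = 0;
    for (size_t i = 0; i < n && i < 4; i++)
        w |= (uint32_t)src[i] << (24u - 8u * (unsigned)i);
    return w;
}
```

A message longer than four bytes would be sent as successive IIC_DR writes, one per data service request.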

Figure 16-2. I2C registers

| MMIO_base offset | Register | Access | Fields shown |
|---|---|---|---|
| 0x10 3400 | IIC_AR | r/w | ADDRESS, DIRECTION, reserved, COUNT |
| 0x10 3404 | IIC_DR | r/w | BYTE3, BYTE2, BYTE1, BYTE0 |
| 0x10 3408 | IIC_SR | r/o | GDI, SANACKI, SDNACKI, SDA_STAT, SCL_STAT, STATE, DIRECTION, reserved, RBC |
| 0x10 340C | IIC_CR | r/w | GD_IEN, F_IEN, SANACK_IEN, SDNACK_IEN, CLRGDI, CLRFI, CLRSANACKI, CLRSDNACKI, SW_MODE_EN, SDA_OUT, SCL_OUT, ENABLE |


16.4.3 IIC_SR Register

The I2C status register contains status information re-

garding the transfer in progress and the nature of inter-

rupts associated with I2C operation.

The IIC_SR register is read only and is intended as the

primary source of status regarding current I2C operation.

The IIC_SR register must be used in conjunction with the

IIC_CR register. The interrupt sources of the IIC_SR reg-

ister are individually enabled by writing to the appropriate

enable bit in the IIC_CR register. The bitfield definitions

of the IIC_SR register are presented in Table 16-3. The

IIC_SR provides four sources of interrupts. Note: the in-

terrupt should be set up as level triggered interrupt.

• GDI interrupt: The GDI and FI bits together

provide status about I2C transfer completion. The

interpretation of GDI/FI bit combinations is different

depending on whether the I2C interface is in master

transmit or master receive mode. Refer to Table 16-4

and Table 16-6 for GDI/FI interpretation.

• FI interrupt: See GDI bit definition and GDI/FI

transmit and receive definitions in Table 16-4 and

Table 16-6.

• SANACKI interrupt: This interrupt flag bit indicates

that a slave address was transmitted but no slave on

the I2C bus acknowledges the address to claim the

transaction. This is an error condition. Once the I2C

interface has set this interrupt flag, the interface is

idle. The DSPCPU should clear this interrupt flag by

writing a ‘1’ to IIC_CR.CLRSANACKI before re-

attempting this transfer or starting another I2C trans-

fer.

• SDNACKI interrupt: This interrupt flag bit indicates

that an addressed slave receiver device has refused

to acknowledge the current byte of data for an ongo-

ing transfer. This is an error condition. Once the I2C

interface has set this interrupt flag, the interface is

idle. The DSPCPU should clear this interrupt flag by

writing a ‘1’ to IIC_CR.CLRSDNACKI before retrying

this transfer or starting another.

Table 16-3. IIC_SR register

| Bits | Field Name | Definition |
|---|---|---|
| 31 | GDI | Good Data Interrupt. This is the normal transfer complete interrupt flag. This interrupt may be asserted without the IIC_SR.FI interrupt bit at the end of an I2C transfer or after master abort of an I2C transfer. |
| 30 | FI | Full Interrupt. This interrupt indicates the condition of the IIC_DR register dependent upon whether the I2C interface is in receive or transmit mode. |
| 29 | SANACKI | Slave Address No Acknowledge Interrupt. |
| 28 | SDNACKI | Slave Data No Acknowledge Interrupt. |
| 27 | SDA_STAT | This bit is used to examine the state of the external I2C SDA data pin. Bit polarity is: 1 = SDA pad is low; 0 = SDA pad floated high |
| 26 | SCL_STAT | This bit is used to examine the state of the external I2C SCL clock pin. Bit polarity is: 1 = SCL pad is low; 0 = SCL pad floated high |
| 25:23 | STATE | The STATE field indicates the microactivity of the I2C bus. |
| 22 | DIRECTION | Direction of current data transfer. |
| 21 | Reserved | Read as ‘0’ |
| 15:8 | RBC | Remaining Byte Count. |
| 7:0 | Reserved | Read as ‘0’ |

Table 16-4. Master transmit mode GDI/FI status

| GDI | FI | Description |
|---|---|---|
| 0 | 0 | Message is not complete. The IIC_DR is not empty. No interrupt. |
| 0 | 1 | Message is not complete. The IIC_DR is empty and the requested transmit byte count is not equal to 0. The DSPCPU must write additional bytes of the current transfer to the IIC_DR register. |
| 1 | X | Message transmission has completed. The IIC_DR is empty. The byte transmit count = 0. |

Table 16-5. STATE field values

| STATE | Meaning |
|---|---|
| 000 | I2C Interface is idle. |
| 001 | RESERVED FOR FUTURE USE |
| 010 | IDLE (MSG is done, awaiting clear GDI to go to 000 state) |
| 011 | Address phase is being processed |
| 100 | BYTE3 (first byte) is being processed |
| 101 | BYTE2 is being processed |
| 110 | BYTE1 is being processed |
| 111 | BYTE0 (last) is being processed |

Table 16-6. Master receive GDI/FI conditions

| GDI | FI | Description |
|---|---|---|
| 0 | 0 | Message is not complete. IIC_DR is not full. No interrupt. |
| 0 | 1 | IIC_DR contains received data and needs to be read serviced. More data bytes are expected since the receive byte count is not equal to 0. |
| 1 | X | The transfer has been completed and the receive byte count is equal to 0. 0 to 4 valid bytes are in the IIC_DR register awaiting read servicing by the DSPCPU. |

The SDA_STAT and SCL_STAT bits indicate the current

state of the SDA and SCL signals. The STATE field indicates

the microactivity of the I2C interface. The field values

and their meanings are presented in Table 16-5. The

DIRECTION status bit indicates if the I2C interface is in

transmit or receive mode.

• if DIRECTION = 0 then I2C is a transmitter.

• if DIRECTION = 1 then I2C is a receiver.

The RBC bitfield indicates the remaining bytecount for an

I2C transfer in progress. The IIC_SR.RBC bitfield serves

as a read-only ‘shadow register’ for the IIC_AR.COUNT

bitfield. During I2C transfer, the RBC bitfield will reflect

the remaining bytecount. To avoid corrupting an I2C

transfer, the DSPCPU must refrain from writing to the

IIC_AR.COUNT bitfield until a message is complete.

Completion is indicated by the RBC bitfield decrementing

to zero.
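A status word read from IIC_SR can be decoded as sketched below, using the bit positions of Table 16-3 and the GDI/FI interpretation of Tables 16-4 and 16-6. The event names and the priority given to the error flags are assumptions made for illustration.

```c
#include <stdint.h>

/* Bit positions from Table 16-3. */
#define IIC_SR_GDI      (1u << 31)
#define IIC_SR_FI       (1u << 30)
#define IIC_SR_SANACKI  (1u << 29)
#define IIC_SR_SDNACKI  (1u << 28)

static inline unsigned iic_sr_state(uint32_t sr)   /* bits 25:23 */
{
    return (unsigned)((sr >> 23) & 0x7u);
}

static inline unsigned iic_sr_rbc(uint32_t sr)     /* bits 15:8 */
{
    return (unsigned)((sr >> 8) & 0xFFu);
}

enum iic_event {
    IIC_EV_NONE,        /* GDI = 0, FI = 0: nothing to service      */
    IIC_EV_ERROR,       /* address or data byte not acknowledged    */
    IIC_EV_DONE,        /* GDI = 1: transfer complete               */
    IIC_EV_SERVICE_DR   /* GDI = 0, FI = 1: read or refill IIC_DR   */
};

/* Classify a status word. Checking the error flags first is a design
 * choice of this sketch, not a documented requirement. */
static inline enum iic_event iic_sr_classify(uint32_t sr)
{
    if (sr & (IIC_SR_SANACKI | IIC_SR_SDNACKI))
        return IIC_EV_ERROR;
    if (sr & IIC_SR_GDI)
        return IIC_EV_DONE;
    if (sr & IIC_SR_FI)
        return IIC_EV_SERVICE_DR;
    return IIC_EV_NONE;
}
```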

16.4.4 IIC_CR Register

The I2C control register contains control information re-

quired for enabling I2C transfers. This register is used to

enable and clear interrupt sources which normally occur

during I2C operation. The four interrupt sources de-

scribed in the section on the IIC_SR register are enabled

and cleared through the IIC_CR register. The enable

bitfields are:

• GD_IEN: Enable for normal transfer complete

interrupt.

• F_IEN: Enable for IIC_DR data service request

interrupt.

• SANACK_IEN: Enable for slave address not

acknowledged interrupt. This is an error interrupt.

• SDNACK_IEN: Enable for slave data not acknowl-

edged interrupt. An addressed slave receiver has

refused to accept the last byte transmitted to it. This

is handled as an error interrupt.

In addition to the interrupt enable bits, the IIC_CR con-

tains interrupt clear bits associated with each of the inter-

rupt sources in the IIC_SR register. These IIC_CR inter-

rupt clear bits are defined as:

• CLRGDI: Clear bit for the GDI interrupt in the

IIC_SR register. Writing a ‘1’ to this bit clears the GDI

interrupt.

• CLRFI: Clear bit for the FI interrupt in the IIC_SR

register. Writing a ‘1’ to this bit clears the FI interrupt.

• CLRSANACKI: Clear bit for the SANACKI inter-

rupt in the IIC_SR register. Writing a ‘1’ to this bit

clears the SANACKI interrupt.

• CLRSDNACKI: Clear bit for the SDNACKI inter-

rupt in the IIC_SR register. Writing a ‘1’ to this bit

clears the SDNACKI interrupt.

The remaining bitfield of the IIC_CR register is:

• ENABLE: Master enable for I2C serial interface.

ENABLE must be set equal to ‘1’ to transfer any bits

from the I2C interface block. Writing a ‘0’ to the

ENABLE bit effectively resets the entire I2C interface,

including all status and interrupt flag bits. A transfer

in progress is aborted and the byte currently trans-

ferred is lost.

Note: For writes, Reserved1, 2, 3 and 4 bitfields

MUST always be written with ‘0’s.

Table 16-7. IIC_CR Register

| Bits | Field Name | Definition |
|---|---|---|
| 31 | GD_IEN | Enable for normal transfer complete interrupt |
| 30 | F_IEN | Enable for IIC_DR data service request interrupt |
| 29 | SANACK_IEN | Enable for slave address not acknowledged interrupt |
| 28 | SDNACK_IEN | Enable for slave data not acknowledged interrupt. An addressed slave receiver has refused to accept the last byte transmitted to it |
| 27:26 | Reserved1 | Always write ‘0’s to these bits. (See Note1) |
| 25 | CLRGDI | Clear bit for the GDI interrupt in the IIC_SR register. Writing a ‘1’ to this bit clears the GDI interrupt |
| 24 | CLRFI | Clear bit for the FI interrupt in the IIC_SR register. Writing a ‘1’ to this bit clears the FI interrupt |
| 23 | CLRSANACKI | Clear bit for the SANACKI interrupt in the IIC_SR register. Writing a ‘1’ to this bit clears the SANACKI interrupt. |
| 22 | CLRSDNACKI | Clear bit for the SDNACKI interrupt in the IIC_SR register. Writing a ‘1’ to this bit clears the SDNACKI interrupt. |
| 21:6 | Reserved2 | Always write ‘0’s to these bits. (See Note1) |
| 10 | SW_MODE_EN | 0 (power-on/reset default) - Normal I2C hardware operating mode. 1 - Enable software operating mode. The I2C pins are entirely controlled by user writes to the ‘sda_out’ and ‘scl_out’ register bits. |
| 7 | SDA_OUT | Enabled by sw_mode_en. This bit is used by software to manually control the external I2C SDA data pin. Bit polarity is: 1 = SDA pad pulled low; 0 = SDA pad left open drain |
| 6 | SCL_OUT | Enabled by sw_mode_en. This bit is used by software to manually control the external I2C SCL clock pin. Bit polarity is: 1 = SCL pad pulled low; 0 = SCL pad left open drain |
| 5:2 | Reserved3 | Always write ‘0’s to these bits. (See Note1) |
| 1 | Reserved4 | Always write ‘0’s to these bits. (See Note1) |
| 0 | ENABLE | I2C serial interface enable |


16.5 I2C SOFTWARE OPERATION MODE

I2C software operation mode is intended for use by soft-

ware I2C or similar algorithm implementations. In this

case, the SCL and SDA pins are fully controlled and ob-

served by software, and the hardware I2C interface is

disconnected from the SCL and SDA pins. Refer to

Figure 16-3 for a clarification of the principles involved.

Software mode is by default disabled after boot. Soft-

ware mode is enabled by writing a ‘1’ to

IIC_CR.SW_MODE_EN. At that point, the SCL and SDA

pins can be controlled by the IIC_CR SDA_OUT and

SCL_OUT bits. Writing a ‘1’ to either bit causes the cor-

responding pin to become active, i.e. be pulled low. The

SDA and SCL lines are open-collector outputs, and can

hence also be pulled low by external devices. The actual

pin state can be observed by software by examining

IIC_SR SDA_STAT and SCL_STAT bits. A 1 in these

MMIO bits indicates that the corresponding pin is cur-

rently pulled low.

By appropriate software, possibly using a timer interrupt,

full I2C functionality can be implemented using this

mechanism.
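As a minimal sketch of the control side of software mode, the helper below computes the IIC_CR value needed to drive or release the two pins. The bit positions (SW_MODE_EN = bit 10, SDA_OUT = bit 7, SCL_OUT = bit 6) are taken from Table 16-7; the function name and its arguments are hypothetical and not part of any Philips library. Reading back the pins through IIC_SR is left out because the SDA_STAT and SCL_STAT bit positions are not given in this section.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions as listed in Table 16-7 (IIC_CR). */
#define IIC_CR_SW_MODE_EN (1u << 10)
#define IIC_CR_SDA_OUT    (1u << 7)
#define IIC_CR_SCL_OUT    (1u << 6)

/* Hypothetical helper: returns the IIC_CR value that enables software
 * mode and either releases a line (argument 1) or pulls it low
 * (argument 0). A '1' in SDA_OUT/SCL_OUT pulls the pad low, so the
 * logic is inverted with respect to the bus level. */
static inline uint32_t iic_cr_sw_drive(uint32_t cr, int sda_release, int scl_release)
{
    assert(sda_release == 0 || sda_release == 1);
    assert(scl_release == 0 || scl_release == 1);

    cr |= IIC_CR_SW_MODE_EN;                     /* enter software mode   */
    cr &= ~(IIC_CR_SDA_OUT | IIC_CR_SCL_OUT);    /* release both lines    */
    if (!sda_release)
        cr |= IIC_CR_SDA_OUT;                    /* pull SDA low          */
    if (!scl_release)
        cr |= IIC_CR_SCL_OUT;                    /* pull SCL low          */
    return cr;
}
```

A software START condition, for example, could be produced by writing the value for (SDA released, SCL released) followed by the value for (SDA low, SCL released), with suitable delays between the writes.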

16.6 I2C HARDWARE OPERATION MODE

Hardware operation of I2C is the default mode after boot.

The PNX1300 I2C hardware interface operates in one of

two modes:

1. Master-transmitter (to write data to a slave)

2. Master-receiver (to read data from a slave)

As a master, the I2C logic will generate all the serial clock

pulses and the START and STOP bus conditions. The

START and STOP bus conditions are shown in

Figure 16-4. A transfer is ended with a STOP condition

or a repeated START condition. Since a repeated

START condition is also the beginning of the next serial

transfer, the I2C bus will not be released.

Note: The I2C interface on PNX1300 will operate as a

master ONLY!

The number of bytes transferred between the START

and STOP conditions from transmitter to receiver is not

limited. Each 8-bit data byte is followed by one acknowl-

edge bit. The transmitter releases the SDA line which will

pull-up to a HIGH level during the acknowledge bit time.

The receiver acknowledges by pulling the data line LOW

during this acknowledge period. The master must always

generate the SCL transitions for the acknowledge bit

time.

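Figure 16-3. I2C software mode only logic
(Diagram: when sw_mode_en is set, scl_out and sda_out drive the open-drain SCL and SDA pads, and scl_stat and sda_stat return the buffered pad state to the data highway; otherwise the I2C hardware drives the pads.)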


Two types of data transfers are supported by the

PNX1300 I2C interface:

• Data transfer from a master transmitter to a slave

receiver, also called a WRITE operation. The master

first transmits a 1-byte slave address, then the

desired number of data bytes. The slave receiver

returns an acknowledge bit after each byte. The master

terminates the transaction by a STOP after the

last byte.

• Data transfer from slave transmitter to master

receiver, also called a READ operation. The first byte

(the slave address) is transmitted by the master and

acknowledged by the slave. The selected slave

transmits successive data bytes which are each

acknowledged by the master, except the last byte

desired by the master, for which the master gener-

ates a ‘notack’ condition. This causes the slave to

terminate byte transmission. The slave transmitter

then must release the bus so that the master may

generate a STOP condition.

The type of transaction is indicated by the LSbit of the ad-

dress byte. Data transfer from a master transmitter to a

slave receiver is called a WRITE. It is signified by a ‘0’ in

the LSbit of the address byte. Data transfer from a slave

transmitter to a master receiver is called a READ. It is

signified by a ‘1’ in the LSbit of the address byte.
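The address byte layout described above can be sketched as follows. The function name is hypothetical; it only illustrates how the 7-bit slave address and the READ/WRITE flag combine into the first byte on the bus.

```c
#include <assert.h>
#include <stdint.h>

/* Builds the first byte of an I2C transfer: the 7-bit slave address
 * in bits 7:1 and the direction in the LSbit (0 = WRITE, 1 = READ). */
static inline uint8_t i2c_address_byte(uint8_t addr7, int is_read)
{
    assert(addr7 < 0x80u);   /* only 7-bit addresses are valid here */
    return (uint8_t)((addr7 << 1) | (is_read ? 1u : 0u));
}
```

For a slave at 7-bit address 0x50, this yields 0xA0 for a WRITE and 0xA1 for a READ.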

Example steps for successful programming of the I2C interface

on PNX1300 are outlined as follows for both

reads and writes. Enable the I2C interface prior to at-

tempting any accesses to external I2C devices.

To enable the interface:

• Set bit IIC_CR.ENABLE (0x10340c) = 1

For write addressing mode:

1. On entry, clear any possible I2C interrupt sources by writing IIC_CR bits [25:22] = ‘1111’. (Note that programmers must mask and enable high-level interrupt sources through the VIC facility in the DSPCPU. See the appropriate PNX1300 databook chapter.)

2. Enable desired I2C interrupt sources by setting IIC_CR[31:28] bits appropriately.

3. Simultaneously load IIC_AR[31:25] with the 7-bit slave address, IIC_AR.DIRECTION = 0 and IIC_AR[15:8] with the appropriate bytecount for the transfer.

4. Load IIC_DR[31:0] with data for the write. Note that writing this register triggers the transfer across the I2C bus. Up to 4 bytes will be transferred after writing, dependent on the bytecount in IIC_AR[15:8]. Transfers of more than 4 bytes have to be done by breaking them down into a sequence of 4-byte transfers and a last transfer which may be less than 4 bytes. This is done by repeatedly reloading the register until the bytecount is fulfilled. Transfer is done high byte first, proceeding to low byte.

5. Detect the resulting I2C condition code in IIC_SR[31:28] and respond, or detect the I2C high-level interrupt and respond. (Note that this last step is dependent upon system software requirements.)

6. If the transfer count is not yet fulfilled, clear the GDI and FI bits and proceed with step 4 until all data is written.
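The register values used in steps 3 and 4 can be sketched as below. The field positions IIC_AR[31:25] and IIC_AR[15:8] come from the steps above. The position of IIC_AR.DIRECTION is not stated in this section, so bit 24 is used purely as an assumption; likewise, left-aligning a final transfer of fewer than 4 bytes in IIC_DR is an assumption based on the "high byte first" rule. All function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION: DIRECTION is taken to be bit 24 of IIC_AR. The text
 * names the field but does not give its bit position. */
#define IIC_AR_DIRECTION_SHIFT 24u

/* Composes the IIC_AR value: slave address in [31:25], direction
 * (0 = write, 1 = read) and bytecount in [15:8]. */
static inline uint32_t iic_ar_value(uint8_t addr7, int is_read, uint8_t bytecount)
{
    assert(addr7 < 0x80u);
    return ((uint32_t)addr7 << 25)
         | ((uint32_t)(is_read ? 1u : 0u) << IIC_AR_DIRECTION_SHIFT)
         | ((uint32_t)bytecount << 8);
}

/* Packs 1..4 bytes into one IIC_DR word, first byte in bits 31:24,
 * so that the bytes leave the chip high byte first. */
static inline uint32_t iic_dr_pack(const uint8_t *buf, size_t n)
{
    uint32_t word = 0u;
    size_t i;

    assert(buf != NULL && n >= 1u && n <= 4u);
    for (i = 0u; i < n; i++)
        word |= (uint32_t)buf[i] << (24u - 8u * (unsigned)i);
    return word;
}
```

A 6-byte write would then use one IIC_AR load with bytecount 6, followed by two IIC_DR loads: one packed from 4 bytes and one packed from the remaining 2, with the status check of step 5 between them.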

For read addressing mode:

1. On entry, clear any possible I2C interrupt sources by writing IIC_CR bits [25:22] = ‘1111’. (Note that programmers must mask and enable high-level interrupt sources through the VIC facility in the DSPCPU. See the appropriate databook chapter.)

2. Enable desired I2C interrupt sources by setting IIC_CR[31:28] bits appropriately.

3. Simultaneously load IIC_AR[31:25] with the 7-bit slave address, IIC_AR.DIRECTION = 1 and IIC_AR[15:8] with the appropriate bytecount for the transfer. Note that writing this register triggers the read across the I2C bus.

4. Detect the resulting I2C condition in IIC_SR[31:28] and respond, or detect the I2C interrupt and respond. (Note that this last step is dependent upon system software requirements.)

5. Clear the GDI and FI bits and read the contents of IIC_DR. Up to 4 bytes will be available in IIC_DR, fewer if the remaining bytecount was less than 4. Bytes are stored high byte first, proceeding to low byte.

6. Proceed with step 4 until all data is read, i.e. the bytecount is fulfilled.
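Step 5 of the read sequence can be illustrated with a small unpacking helper. This is only a sketch: the function names are hypothetical, and it assumes that a final read of fewer than 4 bytes is stored starting at the high byte of IIC_DR, which follows from "high byte first" but is not spelled out above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of IIC_DR reads needed for a transfer of 'bytecount' bytes,
 * given that each read delivers at most 4 bytes. */
static inline unsigned iic_dr_words(unsigned bytecount)
{
    return (bytecount + 3u) / 4u;
}

/* Extracts 1..4 received bytes from one IIC_DR word. The first byte
 * received is taken from bits 31:24, the next from bits 23:16, etc. */
static inline void iic_dr_unpack(uint32_t word, uint8_t *buf, size_t n)
{
    size_t i;

    assert(buf != NULL && n >= 1u && n <= 4u);
    for (i = 0u; i < n; i++)
        buf[i] = (uint8_t)(word >> (24u - 8u * (unsigned)i));
}
```

For a 6-byte read, the loop in step 6 would run twice: the first IIC_DR value is unpacked with n = 4 and the second with n = 2.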

16.6.1 Slave NAK

If a slave device does not generate an ACK where re-

quired, this is considered a NAK. Upon receipt of a NAK

after transmitting a device address or data byte, the mas-

ter takes the following actions:

• the I2C state becomes IDLE (STATE = 000)

• a STOP condition is issued on the bus

• no more data is sent

Figure 16-4. START and STOP Conditions on I2C
(Waveform of SDA and SCL marking the START (S) and STOP (P) conditions.)


16.7 I2C CLOCK RATE GENERATION

The I2C hardware block diagram is shown in Figure 16-5 below. In hardware operating mode, the IIC_SCL external clock is derived by division from the BOOT_CLK pin on PNX1300. The BOOT_CLK pin is normally connected to TRI_CLKIN. The IIC_SCL clock divider value is determined at boot time and cannot be changed thereafter. The value chosen depends on the first byte read from the EEPROM, as described in Section 13.2.1, “Boot Procedure Common to Both Autonomous and Host-Assisted Bootstrap.”

The PNX1300 I2C block is able to ‘stretch’ the SCL clock

in response to slaves that need to slow down byte trans-

fer. This mechanism of slowing SCL in response to a

slave is called ‘clock stretching.’ This clock stretching is

accomplished by the slave by holding the SCL line ‘low’ after completion of a byte transfer and acknowledge se-

quence. Clock stretching is always enabled.

Table 16-8. I2C speed and EEPROM byte 0

BOOT_CLK bits   EEPROM speed bit   Divider value   Actual I2C speed
00 (100 MHz)    0 (100 kHz)        1008            99.2 kHz
00              1 (400 kHz)        256             390.6 kHz
01 (75 MHz)     0 (100 kHz)        752             99.7 kHz
01              1 (400 kHz)        192             390.6 kHz
10 (50 MHz)     0 (100 kHz)        512             97.6 kHz
10              1 (400 kHz)        128             390.6 kHz
11 (33 MHz)     0 (100 kHz)        336             98.2 kHz
11              1 (400 kHz)        96              343.8 kHz
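The "actual I2C speed" column of Table 16-8 is simply the nominal BOOT_CLK frequency divided by the divider value. The sketch below reproduces that arithmetic; the table contents are copied from Table 16-8, while the function and array names are hypothetical.

```c
#include <assert.h>

/* Divider values copied from Table 16-8, indexed by
 * [BOOT_CLK bits][EEPROM speed bit]. */
static const unsigned i2c_divider_table[4][2] = {
    { 1008u, 256u },   /* 00: 100 MHz */
    {  752u, 192u },   /* 01:  75 MHz */
    {  512u, 128u },   /* 10:  50 MHz */
    {  336u,  96u },   /* 11:  33 MHz */
};

/* Looks up the divider selected at boot time. */
static inline unsigned i2c_divider(unsigned boot_clk_bits, unsigned speed_bit)
{
    assert(boot_clk_bits < 4u && speed_bit < 2u);
    return i2c_divider_table[boot_clk_bits][speed_bit];
}

/* Resulting SCL frequency in Hz for a given BOOT_CLK frequency. */
static inline double i2c_scl_hz(double boot_clk_hz, unsigned divider)
{
    assert(divider != 0u);
    return boot_clk_hz / (double)divider;
}
```

For example, a 100 MHz BOOT_CLK with the 400 kHz speed bit set gives 100 MHz / 256 = 390.625 kHz, which the table rounds to 390.6 kHz.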

Figure 16-5. I2C block diagram
(Blocks shown: boot state machine and logic, reset logic (TRI_RESET#, cpu-arst), programmable I2C clock generator fed from BOOTCLKIN and selected by EEPROM image byte 0, bit 0, I2C interface state machine, I2C low-level state machine, serializer/deserializer, IIC_AR and IIC_DR registers, data highway, and the IIC_SCL and IIC_SDA pads.)


Synchronous Serial Interface Chapter 17

17.1 SYNCHRONOUS SERIAL INTERFACE OVERVIEW

In this document, the generic PNX1300 name refers

to the PNX1300 Series, or the PNX1300/01/02/11

products.

The PNX1300 synchronous serial interface (SSI) unit interfaces to an off-chip modem analog front end (MAFE) subsystem, network terminator, ADC/DAC or codec through a flexible bit-serial connection. The hardware performs full-duplex serialization/deserialization of a bit stream from any of these devices. Any such front end device connected must support transmitting, receiving of data, and initialization via a synchronous serial interface.

Since the communication algorithm is implemented in software by the PNX1300 DSPCPU and the analog interface is off chip, a wide variety of modem, network and/or FAX protocols may be supported.

The SSI hardware includes:

• A 16-bit receive shift register (RxSR), synchronized

by an external receive frame synchronization pulse

(SSI_RxFSX) and clocked by an external clock

(RxCLK)

• A 32-bit MMIO receive data register (SSI_RxDR) to

provide data access from the DSPCPU

• 32-entry deep, 16-bit wide receive buffer (RxFIFO), to

buffer between the receive shift register (RxSR) and

MMIO receive data register (SSI_RxDR)

• A 16-bit transmit shift register (TxSR), synchronized

by an external or internal transmit frame synchroni-

zation pulse and clocked by an external clock (either

SSI_IO1 or SSI_RxCLK)

• A 32-bit MMIO transmit data register (SSI_TxDR) to

transmit data from the DSPCPU.

• 30-entry deep, 16-bit wide transmit buffer (TxFIFO),

to buffer between the MMIO transmit data register

(SSI_TxDR) and transmit shift register (TxSR)

• Transmit frame sync pulse generation logic

• Control and status logic

• Interrupt generation logic

The SSI unit is not a hiway bus master. All I/O is complet-

ed through DSPCPU MMIO cycles. FIFOs are used to in-

crease allowable interrupt response time and decrease

interrupt rate.

17.2 INTERFACE

The external interface consists of the 6 pins described in

Table 17-1.

17.3 BLOCK DIAGRAM

The main block diagram of the SSI unit is illustrated in

Figure 17-1.

The I/O block is used for control of the I/O pins and for

selecting the transmit clock and transmit frame synchro-

nization signals.

The frame synchronization block can be used for gener-

ating an internal synchronization signal derived from re-

ceive clock input (SSI_RxCLK) or from an I/O pin

(SSI_IO1).

The SSI transmit block buffers and transmits the bits us-

ing the generated frame synchronization signal (TxFSX)

and the transmit clock. The transmit clock is either the re-

ceive clock or the clock present on SSI_IO1.

The SSI receive block receives and buffers the bits on

the SSI_RxDATA line, using the receive clock

(SSI_RxCLK) and the rece ive frame synchronization sig-

nal (SSI_RxFSX).

Each of the blocks will be described in detail in the next

subsections.

Table 17-1. Synchronous serial interface pins

Name         Type    Description
SSI_RxCLK    IN-5    Serial interface clock signal; provided by an external communication device.
SSI_RxFSX    IN-5    Frame synchronization reference signal; provided by an external communication device.
SSI_RxDATA   IN-5    Receive serial data signal; provided by the receive channel of an external communication device.
SSI_TxDATA   OUT     Transmit serial data signal output.
SSI_IO1      I/O-5   Transmit clock input or general purpose I/O pin.
SSI_IO2      I/O-5   Transmit frame synchronization signal input or output or general purpose I/O pin.


17.3.1 General Purpose I/O

Figure 17-2 illustrates the functionality of the general purpose I/O pins. The SSI_IO1 and SSI_IO2 external pins may be used as general purpose I/O by proper configuration of the SSI_CTL register, or they may be used as transmit clock input and as transmit framing signal input or output. The SSI_CTL.IO1 and SSI_CTL.IO2 Mode Select fields control the direction and functionality of these two pins.

A hardware reset or a software reset of the transmitter through the SSI_CTL.TXR command sets the SSI_CTL.IO1 and SSI_CTL.IO2 fields to 11b, a conflict-free initial pin state. Table 17-2 shows the effect of SSI_CTL.IO1 on pin SSI_IO1; Table 17-3 shows the effect of SSI_CTL.IO2 on SSI_IO2.

Note: If SSI_IO1 is not selected as transmit clock input, the transmit clock is taken from the receive clock signal instead. If SSI_IO2 is not selected as transmit framing signal input or output, the transmit framing signal is taken from the receive framing signal instead.

Figure 17-1. The SSI interface block diagram
(Blocks shown: I/O Control Block with pins SSI_IO1 and SSI_IO2, Frame Synchronization Block producing TxFSX, SSI Transmit Block clocked by TxCLK and driving SSI_TxDATA, SSI Receive Block fed by SSI_RxCLK, SSI_RxFSX and SSI_RxDATA.)

Figure 17-2. I/O block diagram
(Diagram: SSI_IO1 is driven from WIO1 when IO1[1:0] = 00, read through RIO1 when IO1[1:0] = 01, and selected as TxCLK instead of SSI_RxCLK when IO1[1:0] = 10. SSI_IO2 is driven from WIO2 when IO2[1:0] = 00, read through RIO2, driven with the internal TxFSX when IO2[1:0] = 10, and selected as TxFSX input when IO2[1:0] = 11; otherwise TxFSX follows SSI_RxFSX.)


17.3.2 Frame Synchronization

The internal frame synchronization logic is illustrated in

Figure 17-3. An internal Frame Synchronization signal

(TxFSX) is being generated from the transmit or receive

clock selected by SSI_CTL.IO1. The Clock is divided by

the word length (16) and a Frame Rate Divider which is

controlled by the FSS[3:0] bits in the SSI_CTL register.

FMS determines the Frame Mode operation, whether the

frame sync pulse is word-length or bit-length. The trans-

mit framing signal is selected depending on

SSI_CTL.IO2, as shown in Table 17-4.

17.3.3 SSI Transmit

The transmitter control block diagram is illustrated in

Figure 17-4. The transmitter clock can be selected from

two sources, i.e. SSI_IO1 or SSI_RxCLK by program-

ming IO1[1:0] bits in the SSI_CTL register (see

Figure 17-2). A transfer takes place on either the rising or

falling edge of the clock, which can be configured with

SSI_CTL.TCP.

The transmitter has a 30-entry deep, 16-bit transmit

buffer that buffers the data between the 32-bit

SSI_TxDR register and the 16-bit transmit shift register

(TxSR).

The TxSR is a 16-bit transmit shift register. It can be con-

figured to shift out MSB or LSB first with SSI_CTL.TSD.

A detailed description of the configuration of the transmit-

ter can be found in the SSI_CTL and SSI_CSR register

description (17.10.1 and 17.10.2)

SSI_TxDR is a 32-bit MMIO transmit register.

17.3.4 SSI Receive

The receiver control block diagram is illustrated in

Figure 17-5. The receiver clock, frame synchronization

and data signal are always taken from the external pins.

The receiver has a 32-entry deep, 16-bit receive buffer

that buffers the data between the 16-bit receive shift register

(RxSR) and the 32-bit SSI_RxDR register.

The input pin SSI_RxDATA provides serial shift in data

to the RxSR. The RxSR is a 16-bit receive shift register.

RxSR can be configured to shift in from MSB or LSB first

using SSI_CTL.RSD. A transfer takes place on either the

rising or falling edge of the receiver clock, which can be

configured with the SSI_CTL.RCP.

Table 17-2. Effect of SSI_CTL.IO1 on SSI_IO1

IO1[0:1]   Function of SSI_IO1
00         General purpose output with positive logic polarity, reflecting the value in SSI_CTL.WIO1.
01         General purpose input, with optional change detector function. The input state can be read from SSI_CSR.RIO1. The change detector is clocked by the highway bus. The change detector may optionally generate an interrupt, under the control of the CDE bit of SSI_CTL.
10         Transmit clock (TxCLK) input.
11         Tri-state, input signal value ignored.

Table 17-3. Effect of SSI_CTL.IO2 on SSI_IO2

IO2[0:1]   Function of SSI_IO2
00         General purpose output with positive logic polarity, reflecting the value in SSI_CTL.WIO2.
01         General purpose input. The input state can be read in from SSI_CSR.RIO2. No change detector is provided for this pin.
10         Internal transmit framing signal (TxFSX) output.
11         Transmit framing signal (TxFSX) input.

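Figure 17-3. Frame synchronization generation block diagram
(Diagram: a 2:1 MUX selects SSI_IO1 when IO1[1:0] = 10, otherwise SSI_RxCLK, as TxCLK; TxCLK passes through the Word Length Divider, the Frame Rate Divider controlled by FSS[3:0], and the Frame Sync Mode block controlled by FMS to produce the internal TxFSX.)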

Table 17-4. Effect of SSI_CTL.IO2 on transmit framing signal

IO2[0:1]   Source of transmit framing signal
00         taken from RxFSX
01         taken from RxFSX
10         internally generated
11         taken from SSI_IO2 pin


A detailed description of the configuration of the receiver

can be found in the SSI_CTL and SSI_CSR register de-

scription (17.10.1 and 17.10.2)

SSI_RxDR is a 32-bit MMIO receive data register.

Due to the possibility of speculative reading of the

SSI_RxDR, the read itself can not be implemented to ac-

knowledge the data as a side effect. For this reason an

explicit acknowledge mechanism is provided by the

SSI_RxACK register.

The SSI_RxACK is a 1-bit MMIO register that is used to

signal the SSI receiver state machine that a word has

been successfully read from the SSI_RxDR.

Writing a ‘1’ to this register initiates updating of the inter-

nal state. Writing a ‘0’ has no effect.

The register cannot be read, its effect may be observed

in the WAR field of the SSI_CSR.

The status fields of the SSI_CSR will update within 1

highway clock cycle after writing to the SSI_RXACK reg-

ister.
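The read-then-acknowledge protocol described above can be sketched as follows. The function takes pointers to the two registers instead of hard-coding addresses, so it can be exercised against ordinary variables; on the target, the caller would pass pointers derived from MMIO_BASE and the SSI_RXDR and SSI_RXACK offsets. The function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reads one 32-bit word from SSI_RxDR and then explicitly acknowledges
 * it by writing '1' to SSI_RxACK. The read alone does not advance the
 * receiver state, because the read may be issued speculatively. */
static inline uint32_t ssi_read_rx_word(volatile const uint32_t *rxdr,
                                        volatile uint32_t *rxack)
{
    uint32_t data;

    assert(rxdr != NULL && rxack != NULL);
    data = *rxdr;    /* fetch the received data                      */
    *rxack = 1u;     /* tell the receive state machine it was read   */
    return data;
}
```

After the acknowledge, software would typically re-read SSI_CSR to see the updated WAR field before deciding whether another word is available.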

Figure 17-4. The Sync Serial Interface Transmit Block Diagram
(Blocks shown: Transmit Data Reg (SSI_TXDR), 64-byte Transmit Buffer, Transmit Shift Reg (TxSR) driving SSI_TxDATA, Transmit Control Logic clocked by TxCLK and TxFSX, Transmit Control Reg, Transmit Status Reg.)

Figure 17-5. The SSI receive block diagram
(Blocks shown: Receive Shift Reg (RxSR) fed by SSI_RxDATA, 64-byte Receive Buffer, Receive Data Reg (SSI_RXDR), Receive Control Logic clocked by SSI_RxCLK and SSI_RxFSX, Receive Control Reg, Receive Status Reg.)


17.4 SSI TRANSMIT OPERATION

17.4.1 Setup SSI_CTL

Write the SSI_CTL to reset and enable the transmitter. Both the transmitter and receiver must be reset simultaneously. This will set all registers and internal logic to the same state as after a power-up reset. The recommended procedure is to set up all transmitter-related control bits before performing a TXE assert. In particular, fields TCP, RSD, IO1, IO2, FMS, FSP, MOD and TMS should NOT be changed after enabling the transmitter until after the next transmitter reset.

The TxCLK is taken from the SSI_IO1 pin or from the receive clock, dependent on SSI_CTL.IO1. The direction of shift in the TxSR and the clock edge on which to shift must also be configured in SSI_CTL. If the DSPCPU does not poll the SSI status registers, it should enable the transmitter interrupt and set the ILS field by writing to the SSI_CTL to allow interrupt driven servicing of the SSI. Note that both transmit and receive use the same ILS field. Set the framing controls, slot size, and mode required according to the external communication circuit’s requirements by writing the SSI_CTL. Finally, set the interrupt level to respond to empty levels in the TxFIFO.

Note that the Rx and Tx machines share the framing and clock divide controls. They cannot be set to different values for Rx and Tx.

If the RxCLK used to derive the TxCLK needs a divide by

two, this is done by setting SSI_CSR.CD2.
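The reset, configure, enable order recommended above can be sketched with the SSI_CTL bit positions listed in Table 17-5 (TXR = bit 31, RXR = bit 30, TXE = bit 29, RXE = bit 28, TCP = bit 27, RCP = bit 26, TSD = bit 25). The helper names are hypothetical, and the remaining SSI_CTL fields are deliberately left out because their bit positions are not given in this part of the chapter.

```c
#include <assert.h>
#include <stdint.h>

/* SSI_CTL bit positions as listed in Table 17-5. */
#define SSI_CTL_TXR (1u << 31)   /* transmitter software reset (action bit) */
#define SSI_CTL_RXR (1u << 30)   /* receiver software reset (action bit)    */
#define SSI_CTL_TXE (1u << 29)   /* transmitter enable                      */
#define SSI_CTL_RXE (1u << 28)   /* receiver enable                         */
#define SSI_CTL_TCP (1u << 27)   /* transmit clock polarity                 */
#define SSI_CTL_RCP (1u << 26)   /* receive clock polarity                  */
#define SSI_CTL_TSD (1u << 25)   /* transmit shift direction                */

/* Step 1: value that resets the SSI. TXR and RXR are always written
 * together, since no separate reset is implemented. */
static inline uint32_t ssi_ctl_reset_value(void)
{
    return SSI_CTL_TXR | SSI_CTL_RXR;
}

/* Step 3: adds the enable bits to a configuration that was already
 * written in step 2. The configuration itself must not contain reset
 * or enable bits. */
static inline uint32_t ssi_ctl_enable(uint32_t config, int tx, int rx)
{
    assert((config & (SSI_CTL_TXR | SSI_CTL_RXR |
                      SSI_CTL_TXE | SSI_CTL_RXE)) == 0u);
    return config | (tx ? SSI_CTL_TXE : 0u) | (rx ? SSI_CTL_RXE : 0u);
}
```

A driver following this sketch would write ssi_ctl_reset_value() to SSI_CTL, then the configuration word, and finally the result of ssi_ctl_enable(), leaving the configuration fields unchanged until the next reset.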

17.4.2 Operation Details

The transmit state machine will wait for transmit data to be written to the SSI_TxDR register (see also Figure 17-6). As soon as SSI_TxDR is written, its value will be propagated through two entries of the TxFIFO (TxFIFO is 16-bit and SSI_TxDR is 32-bit) and transferred to TxSR, synchronized to TxFSX. The order of transferring the two 16-bit parts in the 32-bit SSI_TxDR can be configured by the endian bit SSI_CTL.EMS. Data will begin shifting out of TxSR, one bit for each active edge of the TxCLK, from either bit 15 (MSB first SSI_CTL setting) or from bit 0 (LSB first) until TxSR is empty. For endian control and shift direction see also subsection 17.8. When the shift register is empty, the transmit state machine will load the value from the next available TxFIFO location and begin shifting out that data. The transmission continues until the transmit state machine is disabled or reset.

If the last available TxFIFO has not been updated at the appropriate time to reload TxSR, the last transmitted frame is retransmitted and a transmit underrun error is indicated in the transmitter status SSI_CSR.TUE.

17.4.3 Interrupt and Status

The refill status of the SSI_TxDR register is stored in

SSI_CSR. As the transmit state machine loads a TxFIFO

The SSI will generate an internal interrupt when the num-

ber of empty words in the TxFIFO rises above the level

set by SSI_CTL.ILS. If the transmit state machine attempts

to read a TxFIFO while the last available TxFIFO

has not been updated, it will set the transmit underrun bit.

This can cause a protocol error in the transmission.

The number of available word buffers (SSI_CSR.WAW)

and transmitter data register empty (SSI_CSR.TDE) in-

formation is updated automatically by the SSI block.

Figure 17-6. The transmit buffer operation
(Diagram: the 32-bit MMIO register SSI_TxDR is written from the highway at wr_ptr into the 30-deep, 16-bit buffer, entries 0 to 29; rd_ptr feeds the 16-bit TxSR, which drives SSI_TxDATA.)


17.5 SSI RECEIVE OPERATION

17.5.1 Setup SSI_CTL

Write the SSI_CTL to reset and enable the receiver. Both the transmitter and receiver must be reset simultaneously. This will set all registers and internal logic to the same state as after a power-up reset. The recommended procedure is to set up all receiver-related control bits before performing a RXE assert. In particular, fields TCP, RSD, IO1, IO2, FMS, FSP, MOD and TMS should NOT be changed after enabling the receiver until after the next receiver reset.

The direction of shift in the RxSR, mode, and the clock edge polarity must also be configured in SSI_CTL. Set the framing controls according to the external communication circuit’s requirements. Note that the Rx and Tx machines share the framing and clock divide controls.

If the DSPCPU does not poll the SSI status registers, it should enable the receiver interrupt and set the ILS field by writing to the SSI_CTL to allow interrupt driven servicing of the SSI receiver. Note that both transmit and receive use the same ILS field.

If the RxCLK is double the frequency of the data rate on the SSI bus, SSI_CSR.CD2 can be used to divide the receive clock by two.

17.5.2 Operation Details

The receive state machine will begin shifting

SSI_RxDATA into the RxSR on the first active edge of

SSI_RxCLK received after the receiver is enabled (see

also Figure 17-7). When full, the RxSR is parallel trans-

ferred to the first available RxFIFO entry and possibly

SSI_RxDR. Reception continues and when RxSR is full

again, a parallel load of the next available RxFIFO entry

from RxSR is accomplished. This continues until the re-

ceiver is disabled or reset. If the receive state machine

must transfer RxSR into one of the RxFIFO entries and

none of the RxFIFO entries is available, the value will be

lost and the receive overrun bit will be set.

17.5.3 Interrupt and Status

The status of the RxFIFO is visible in SSI_CSR. WAR is the number of 32-bit words available for read; RDF indicates when it is more than ILS. As the receive state machine loads RxFIFO from the RxSR, it sets the associated status bit.

The SSI will generate an internal interrupt when the number of full entries in RxFIFO is more than SSI_CTL.ILS. If the receive state machine attempts to load RxFIFO while none of the RxFIFO entries is available, it will set the receive overrun bit and generate an interrupt.

Due to the possibility of speculative reading of the SSI_RxDR, the DSPCPU must explicitly indicate a successful read of SSI_RxDR by writing a ‘1’ in the LSB to the SSI_RxACK register. The status fields of the SSI_CSR will update within 1 highway clock cycle after completion of writing to the SSI_RxACK register.

17.6 FRAME TIMING

The frame timing can be controlled by the FSS and VSS fields in the SSI_CTL register.

The FSS[3:0] bits control the divide ratio for the programmable frame rate divider used to generate the frame sync pulses. The valid value ranges from 1 to 16 slots of 16 bits each, e.g. a value of 5 indicates that a frame contains 5 slots of 16 bits each. Note: the value ‘16’ is accomplished by storing a ‘0’ in this field. If a codec is connected which generates 6 slots and the SSI block is programmed to 5 slots, a framing error is indicated in SSI_CSR.FES; and if TIE or RIE is enabled, an interrupt is generated.

For an example of a frame timing diagram see Figure 17-11 and Figure 17-12.

The VSS[3:0] bits control the number of valid slots in the frame, starting from slot 1. For example, if the VSS[3:0] bits are set to 4 and FSS is set to 5, slots 1, 2, 3 and 4 in the frame contain valid data from the transmitter FIFO and slot 5 will contain non-valid data. The receiver will only accept data in slots 1, 2, 3 and 4.
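The FSS and VSS rules above can be sketched as small decoding helpers. The function names are hypothetical. The sketch treats VSS as a plain slot count; whether VSS = 0 carries a special meaning (as FSS = 0 does) is not stated above, so that case is an open assumption.

```c
#include <assert.h>

/* Decodes FSS[3:0] into the number of 16-bit slots per frame.
 * A stored value of 0 means 16 slots. */
static inline unsigned ssi_slots_per_frame(unsigned fss)
{
    assert(fss <= 15u);
    return (fss == 0u) ? 16u : fss;
}

/* Number of clock periods per frame: 16 bits per slot. */
static inline unsigned ssi_bits_per_frame(unsigned fss)
{
    return 16u * ssi_slots_per_frame(fss);
}

/* Returns 1 if 'slot' (numbered from 1) carries valid data.
 * ASSUMPTION: vss is interpreted literally as the count of valid
 * slots, starting from slot 1. */
static inline int ssi_slot_is_valid(unsigned slot, unsigned fss, unsigned vss)
{
    assert(vss <= 15u);
    return slot >= 1u && slot <= vss && slot <= ssi_slots_per_frame(fss);
}
```

With FSS = 5 and VSS = 4, as in the example above, a frame spans 80 clock periods, slots 1 to 4 are valid, and slot 5 is not.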

Figure 17-7. The receive buffer operation
(Diagram: the 16-bit RxSR, fed by SSI_RxDATA, writes at wr_ptr into the 32-deep, 16-bit buffer, entries 0 to 31; rd_ptr feeds the 32-bit MMIO register SSI_RxDR, which is read over the highway.)


17.7 INTERRUPT GENERATION

Depending on the settings of the TIE, RIE and CDE bits in the SSI_CTL register, the SSI unit can generate interrupts. This is best illustrated by Figure 17-8. Note: RXFES and TXFES are the internal receive and transmit framing error conditions. When an SSI interrupt is detected, the interrupt service routine should check all status bits. The interrupts should be set up as level-triggered interrupts.

17.8 16-BIT ENDIAN-NESS AND SHIFT DIRECTION

The SSI unit supports both access orders for the 16-bit

halves of a machine word. In addition, the shift direction

can be controlled to select MSB or LSB shifting first. The

SSI_CTL.EMS bit controls the 16-bit endian mode, and

the TSD and RSD bits control transmit and receive shift

direction.

When EMS is set, the first data word received in a frame

will be transferred to bit 15-0 of the SSI_RxDR, the sec-

ond word will be transferred to bits 31-16 of the

SSI_RxDR. EMS = ‘0’ reverses the order of the halves of

SSI_RxDR. Likewise in the transmitter, when EMS is set,

the first data word transmitted in a frame will be bits 15-

0 of SSI_TxDR, the second word transferred will be bits

31-16 of SSI_TxDR.

TSD and RSD control the shift direction of transmit and

receive shift registers (TxSR and RxSR). Transmit data

is transmitted MSB first when TSD is ‘0’ or LSB first oth-

erwise. Receive data is received MSB first when RSD

equals ‘0’, LSB first otherwise.

For an example of the transmit operation see

Figure 17-9. Receive works the same, only that data is

shifted in.
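The effect of EMS and TSD on the transmit order can be sketched as follows. The helper names are hypothetical; the behavior follows the description above: with EMS set, bits 15-0 of SSI_TxDR go out first, otherwise bits 31-16, and TSD selects whether each 16-bit word starts with its MSB or its LSB.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Splits a 32-bit SSI_TxDR value into the two 16-bit words in the
 * order they are transmitted. out[0] is sent first. */
static inline void ssi_tx_word_order(uint32_t txdr, int ems, uint16_t out[2])
{
    uint16_t low  = (uint16_t)(txdr & 0xFFFFu);   /* bits 15-0  */
    uint16_t high = (uint16_t)(txdr >> 16);       /* bits 31-16 */

    assert(out != NULL);
    out[0] = ems ? low : high;
    out[1] = ems ? high : low;
}

/* Returns the first bit that appears on the wire for one 16-bit word:
 * bit 15 when TSD = 0 (MSB first), bit 0 when TSD = 1 (LSB first). */
static inline unsigned ssi_first_bit(uint16_t word, int tsd)
{
    return tsd ? (unsigned)(word & 1u) : (unsigned)((word >> 15) & 1u);
}
```

For SSI_TxDR = 0xAAAA5555, EMS = 1 sends 0x5555 followed by 0xAAAA, while EMS = 0 sends 0xAAAA followed by 0x5555. The receive side mirrors this when assembling SSI_RxDR.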

Figure 17-8. Interrupt generation logic.
(Inputs shown: TUE, TDE and TXFES gated by TIE; ROE, RDF and RXFES gated by RIE; CDS gated by CDE; the gated terms are ORed to form the SSI interrupt.)

Figure 17-9. 16-bit endian and shift direction operation.
(Bit order on SSI_TXDATA for one SSI_TXDR value, 1st word then 2nd word:
EMS = 1, TSD = 0: D15 ... D0, then D31 ... D16
EMS = 1, TSD = 1: D0 ... D15, then D16 ... D31
EMS = 0, TSD = 0: D31 ... D16, then D15 ... D0
EMS = 0, TSD = 1: D16 ... D31, then D0 ... D15)


17.9 SSI TEST MODES

The SSI unit has two test modes which can be controlled

by setting SSI_CSR.TMS. Remote and local loopback

test modes are supported (see also Table 17-9).

17.9.1 Remote Loopback

This test mode allows a remote transmitter to test itself,

the intervening transmission media, and its associated

receiver. In this mode, the data received on the

SSI_RxDATA pin is buffered and transmitted on the

SSI_TxDATA pin. The data is not transferred to

SSI_TxDR/TxFIFO and the DSPCPU is never interrupt-

ed. The transmitter is clocked by the SSI_RxCLK pin with

a combinatorial clock delay.

17.9.2 Local Loopback

This test mode allows the DSPCPU to run local checks

of the SSI. Data written to the TxFIFO is serialized and

passed to the receiver via an internal serial connection.

The receiver deserializes the data and passes it to the

RxFIFO register. Interrupts will be generated if enabled.

During local loop back mode, the data on the

SSI_RxDATA pin is ignored and the SSI_TxDATA pin is

tristated. An external CLK must be provided during local

loop back mode or no transmission or reception will oc-

cur.

17.10 MMIO REGISTERS

The MMIO Control and Status registers are shown in

Figure 17-10. The register fields are described in

Table 17-5, Table 17-6, Table 17-7, Table 17-8, and

Table 17-9. To ensure compatibility with future devices,

any undefined MMIO bits should be ignored when read,

and written as ‘0’s.

Figure 17-10. SSI MMIO registers.
(Registers shown, with MMIO_BASE offsets:
SSI_CTL (r/w) 0x10 2C00
SSI_CSR (r/w) 0x10 2C04
SSI_TXDR (w/o) 0x10 2C10, field TXDATA
SSI_RXDR (r/o) 0x10 2C20, field RXDATA
SSI_RXACK (w/o) 0x10 2C24, field RX_ACK
Field names shown: TXR, RXR, TXE, RXE, TCP, RCP, TSD, RSD, IO1, IO2, WIO1, WIO2, TIE, RIE, FSS, VSS, ILS, FMS, FSP, MOD, EMS, TMS, CDE, CD2, SLP, CTUE, SROE, CFES, CCDS, WAW, WAR, TDE, RDF, TUE, ROE, FES, CDS, RIO1, RIO2.
Reset values shown: 0x00f00000 and 0x0000f000.)


17.10.1 SSI Control Register (SSI_CTL)

SSI_CTL is a 32-bit read/write control register used to direct the operation of the SSI. The value of this register after a hardware reset is 0x00F00000.

Table 17-5. SSI control register (SSI_C TL ) fiel d s.

Field Description

TXR T ransmitter Software Reset (Bit 31). Setting TXR performs the same functions as a hardware reset. Resets all

transmitter functions. A transmission in progress is interrupted and the data remaining in the TxSR is lost. The

TxFIFO pointers are reset and the data contained will not be transmitted, but the data in the SSI_TxDR and/or

TxFIFO are not explicitly deleted. The transmitter status and interrupts are all cleared. This is an action bit. This bit

always reads ‘0’. Wr iting a ‘1’ in combination with writing a ‘1‘ in the RXR field will initiate a reset for the SSI module.

Note: this bit is always set together with RXR because a separate transmitter or receiver reset is not implemented.

RXR Receiver Software Reset (Bit 30). Setting RXR performs the same functions as a hardware reset. Resets all

receiver functions. A reception in progress is interrupted and the data collected in the RxSR is lost. The RxFIFO

pointers are reset, and the SSI will not generate an interrupt to DSPCPU to retrieve data in the SSI_RxDR and/or

RxFIFO. The data in the SSI_RxDR and/or RxFIFO is not explicitly deleted. The receiver status and interrupts are

all cleared.This is an action bit.This bit always reads ‘0’. Writing a ‘1’ in combination with writing a ‘1‘ in the TXR field

will initiate a reset for the SSI module. Note: this bit is always set together with TXR, because a separate transmitter

or receiver reset is not implemented.

TXE Transmitter Enable (Bit 29). TXE enables the operation of the transmit shift register state machine. When TXE is set

and a frame sync is detected, the transmit state machine of the SSI begins transmission of the frame. When TXE

is cleared, the transmitter will be disabled after completing transmission of data currently in the TxSR. The serial out-

put (SSI_TxDATA) is three-stated, and any data present in SSI_TxDR and/or TxFIFO will not be transmitted (i.e.,

data can be written to SSI_TxDR with TXE cleared; TDE can be cleared, but data will not be transferred to the TxSR).

Status fields updated by the Transmit state machine are not updated or reset when an active transmitter is disabled.

RXE Receive Enable (Bit 28). When RXE is set, the receive state machine of the SSI is enabled. When this bit is cleared,

the receiver will be disabled by inhibiting data transfer into SSI_RxDR and/or RxFIFO. If data is being received while

this bit is cleared, the remainder of that 16-bit word will be shifted in and transferred to the SSI RxFIFO and/or

SSI_RxDR.

Status fields updated by the Receive state machine are not updated or reset when an active receiver is disabled.

TCP Transmit Clock Polarity (Bit 27). The TCP bit value should only be changed when the transmitter is disabled. TCP

controls on which edge of TxCLK data is output. TCP=0 causes data to be output at rising edge of TxCLK, TCP=1

causes data to be output at falling edge of TxCLK.

RCP Receive Clock Polarity (Bit 26). RCP controls which edge of RxCLK samples data. The data is sampled at rising edge

when RCP = ‘1’ or falling edge when RCP = ‘0’.

TSD Transmit Shift Direction (Bit 25). TSD controls the shift direction of transmit shift register (TxSR). Transmit data is

transmitted MSB first when TSD = ‘0’ or LSB first otherwise. The operation of this bit is explained in more detail in

section 17.8.

RSD Receive Shift Direction (Bit 24). The RSD bit value should only be changed when the receiver is disabled. RSD controls

the shift direction of receive shift register (RxSR). Receive data is received MSB first when RSD = ‘0’, LSB first

otherwise. The operation of this bit is explained in more detail in section 17.8.

IO1 Mode Select SSI_IO1 pin (Bit 23-22). The IO1 field value should only be changed when the transmitter and receiver

are disabled. The IO1[1:0] bits are used to select the function of SSI_IO1 pin. The function may be selected as listed

in Table 17-6.

IO2 Mode Select SSI_IO2 pin (Bit 21-20). The IO2 field value should only be changed when the transmitter and receiver

are disabled. The IO2[1:0] bits are used to select the function of SSI_IO2 pin. The function may be selected according

to Table 17-7.

WIO1 Write IO1 (Bit 19). Value written here appears on the SSI_IO1 pin when the pin is configured to be a general purpose

output.

WIO2 Write IO2 (Bit 18). Value written here appears on the SSI_IO2 pin when this pin is configured to be a general purpose

output.

TIE Transmit Interrupt Enable (Bit 17). Enables interrupt by the TDE flag in the SSI status register (transmit needs refill).

Also enables interrupt of the TUE (transmitter underrun error) and TXFES (transmit framing error).

RIE Receive Interrupt Enable (Bit 16). When RIE is set, the DSPCPU will be interrupted when RDF in the SSI status reg-

ister is set (receive complete). It will also be interrupted on ROE (receiver overrun error) and on RXFES (receive

framing error).

FSS Frame Size Select (Bits 15-12). The FSS[3:0] bits control the divide ratio for the programmable frame rate divider

used to generate the frame sync pulses. The valid setup value ranges from 1 to 16 slot(s). The value ‘16’ is accom-

plished by storing a 0 in this field.


VSS Valid Slot Size (Bit 11-8). The VSS[3:0] bits control the valid slot size (starting from slot 1) for different modem analog

front end devices. The valid setup value ranges from 1 to 16 slot(s). The value ‘16’ is accomplished by storing a ‘0’ in

this field.

FMS Frame Sync Mode Select (Bit 7). The FMS bit value should only be changed when the transmitter and receiver are

disabled. FMS selects the type of frame sync to be recognized by both Rx and Tx. When FMS = ‘1’, frame sync is

word-length bit clock. When this bit = ‘0’, frame sync is a 1-bit clock.

FSP Frame Sync Polarity (Bit 6). The FSP bit value should only be changed when the transmitter and receiver are disabled.

FSP controls which edge of frame sync is the active edge for both Rx and Tx. This bit causes frame signal to

be active at rising edge when FSP = ‘0’ , or falling edge when FSP = ‘1’.

MOD Mode Select (Bit 5). The MOD bit value should only be changed when the transmitter and receiver are disabled. MOD

selects the operational mode of the SSI for ISDN functionality. When MOD is set, the SSI is configured as a U-inter-

face for ISDN NT. Otherwise, set to ‘0’. Setting MOD bit and CD2 supports the MC145574 and MC145572 ISDN in-

terface transceivers.

EMS Endian Mode Select (Bit 4). Selects the big- or little-endian mode operation. See Section 17.8 for more detail.

ILS Interrupt Level Select (Bit 3-0). Sets the point where an interrupt is generated for normal data buffer servicing. The

number ranges from 1 to 15. This field controls interrupt level of both transmit and receive functions.


Table 17-6. IO1 mode select

Bit Mode

00 General Purpose Output: Configures the SSI_IO1 pin for general purpose output. The pin follows the state of the WIO1

field of the SSI_CTL.

01 General Purpose Input: Change detector may be used. Value can be read in from the RIO1 field of the SSI_CSR.

10 Enable External TxCLK: Allows for use of an externally generated TxCLK. The clock is provided via the TxCLK pin. All

general purpose I/O functions are unavailable.

11 Disable: Pin is not used. Output buffer is tristated and the input is ignored. (RESET default)

Table 17-7. IO2 mode select

Bit Mode

00 General Purpose Output: Configures the SSI_IO2 pin as a general purpose output. The pin follows the state of the WIO2

field of the SSI_CTL.

01 General Purpose Input: Value can be read in from RIO2 field of the SSI_CSR.

10 Frame Signal TxFSX (Output): Outputs the frame signal generated by the internal frame signal generation logic.

11 Frame Signal TxFSX (Input): Allows for use of an externally generated TxFSX. The frame sync signal is provided via

TxFSX pin. All general purpose I/O functions are unavailable. (RESET default)
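The field positions in Table 17-5 and the mode encodings in Tables 17-6 and 17-7 can be collected into bit masks. The following is a minimal sketch, not vendor code: the macro and function names are hypothetical, and only the bit positions, the "16 is stored as 0" rule for FSS/VSS, and the reset value come from the tables above.

```c
#include <stdint.h>

/* Bit positions follow Table 17-5; all names here are illustrative. */
#define SSI_CTL_TXR     (1u << 31)   /* action bit, always written with RXR */
#define SSI_CTL_RXR     (1u << 30)
#define SSI_CTL_TXE     (1u << 29)
#define SSI_CTL_RXE     (1u << 28)
#define SSI_CTL_IO1(m)  (((uint32_t)(m) & 0x3u) << 22)  /* Table 17-6 mode */
#define SSI_CTL_IO2(m)  (((uint32_t)(m) & 0x3u) << 20)  /* Table 17-7 mode */
#define SSI_CTL_TIE     (1u << 17)
#define SSI_CTL_RIE     (1u << 16)
#define SSI_CTL_FSS(f)  (((uint32_t)(f) & 0xFu) << 12)
#define SSI_CTL_VSS(f)  (((uint32_t)(f) & 0xFu) << 8)
#define SSI_CTL_ILS(n)  ((uint32_t)(n) & 0xFu)

/* Encode a slot count of 1..16 into a 4-bit field; 16 is stored as 0. */
static uint32_t ssi_slots_to_field(unsigned slots)
{
    return (uint32_t)(slots & 0xFu);
}

/* Compose an SSI_CTL value with transmitter and receiver still disabled,
 * since most of these fields may only change while both are disabled. */
static uint32_t ssi_ctl_value(unsigned frame_slots, unsigned valid_slots,
                              unsigned ils, unsigned io1_mode,
                              unsigned io2_mode)
{
    return SSI_CTL_IO1(io1_mode)
         | SSI_CTL_IO2(io2_mode)
         | SSI_CTL_FSS(ssi_slots_to_field(frame_slots))
         | SSI_CTL_VSS(ssi_slots_to_field(valid_slots))
         | SSI_CTL_ILS(ils);
}
```

With IO1 = ‘11’, IO2 = ‘11’ and all other fields zero, the sketch reproduces the documented reset value 0x00F00000. A driver would typically write such a value first and set TXE/RXE in a later write.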


17.10.2 SSI Control/Status Register (SSI_CSR)

SSI_CSR is a 32-bit read/write register that controls the SSI unit and shows the current status of the SSI module. The

default value after hardware reset is 0x0000F000.

Table 17-8. SSI control/status register (SSI_CSR) fields

Field Description

TMS Test Mode Select (Bit 31-30). Value should only be changed when the transmitter and receiver are disabled. See

Table 17-9.

CDE Change Detector Enable (Bit 29). CDE enables the change detector function on the SSI_IO1 pin. When CDE is set,

the DSPCPU will be interrupted when CDS in the SSI status register is set. When CDE is cleared, this interrupt is

disabled. However, the CDS bit will always indicate the change detector condition.

When the change detector is enabled, the CLK samples SSI_IO1. The CDS bit will be set for either a ‘0’ –> ‘1’ or a ‘1’

–> ‘0’ change between the current value and the stored value.

CD2 RXCLK Divider (Bit 28). When CD2 = ‘1’, the internal RxCLK is divided by two. In the divide by 2 mode, the clock edge

that samples the asserted Frame Sync Pulse will resync the RxCLK divider to be a data capture edge. Data samples

will occur every other clock thereafter until the end of the valid slots in the frame.

SLP Sleepless (Bit 27). When set, this bit allows the SSI to ignore the global power down signal. If cleared, assertion of the

global power down signal will cause the SSI transmitter to finish transmission of the current 16-bit word, then enter a

state similar to transmitter disabled, (SSI_CTL.TXE = ’0’).

In the receiver, a 16-bit word currently being transmitted to RxSR will complete reception and be transferred to the

RxFIFO. The receiver will then enter a state similar to receiver disabled, (SSI_CTL.RXE = ‘0’).

CTUE Clear Transmitter Underrun Error (Bit 21). A control bit written by the DSPCPU to indicate that the transmitter underrun

error flag should be cleared. This is an action bit. Writing a ‘1’ clears SSI_CSR.TUE. The bit always reads ‘0’.

CROE Clear Receiver Overrun Error (Bit 20). A control bit written by the DSPCPU to indicate that the receiver overrun error

flag should be cleared. This is an action bit. Writing a ‘1’ clears SSI_CSR.ROE. The bit always reads ‘0’.

CFES Clear Framing Error Status (Bit 19). A control bit written by the DSPCPU to indicate that the receiver’s framing error

flag should be cleared. This is an action bit. Writing a ‘1’ clears SSI_CSR.FES. The bit always reads ‘0’.

CCDS Clear Change Detector Status (Bit 18). A control bit written by the DSPCPU to indicate that the change detector status

on IO1 flag should be cleared. This is an action bit. Writing a ‘1’ clears SSI_CSR.CDS. The bit always reads ‘0’.

WAW Word buffers Available for Write (Bit 15-12). The WAW[3:0] bits provide the number of 32-bit words available for write

in the transmit buffer (TxFIFO). The SSI can store 15 words in the transmit FIFO. When the FIFO is empty, WAW =

‘15’. When the FIFO is full, WAW = ‘0’ and the SSI will ignore any further attempts to add words to the FIFO. Note:

The fill routine should check that WAW is nonzero, before writing data.

WAR Word buffers Available for Read (Bit 11-8). The WAR[3:0] bits provide the number of 32-bit words available for read in

the receive buffer (RxFIFO). The SSI can store 16 words in the receive FIFO. However, the maximum value indicated

by the WAR register = ‘15’ (because it’s a 4-bit register field). When the FIFO is empty, WAR = ‘0’. When the FIFO is

full, WAR = ‘15’ and the SSI will generate an overrun error if more data is received.

TDE Transmit Data register Empty (Bit 7). In normal operation, this bit will be set when the number of empty words in the

TxFIFO is greater than the Interrupt Level Select value, SSI_CTL.ILS. If SSI_CTL.TIE is set, the SSI will generate an

interrupt. When set, it indicates that the SSI_TxDR/TxFIFO registers require DSPCPU service for refilling after normal

transmission. As the DSPCPU refills the TxFIFO during the interrupt service routine, this bit will be cleared by the SSI

when the number of empty slots drops below the value of SSI_CTL.ILS.

RDF Receive Data register Full (Bit 6). In normal operation, this bit will be set when the number of words in the RxFIFO is

greater than SSI_CTL.ILS. If SSI_CTL.RIE is set, the SSI will generate an interrupt. When set, this bit indicates that

normal received data resides in SSI_RxDR register and RxFIFO buffer for reading. DSPCPU must service the RxFIFO

before a receiver overrun occurs.

TUE Transmitter Underrun Error (Bit 5). No current data was available from the TxFIFO when a load of the TxSR was

scheduled. The transmitted message may have been corrupted. Generates interrupt if enabled by TIE.

ROE Receive Overrun Error (Bit 4). No RxFIFO slot in which to store received data. These bits have been lost and the

message stream is incomplete. Generates an interrupt if enabled by RIE.

FES Frame Error (Bit 3). A frame sync pulse has been detected where not expected or did not occur as expected during

transmit or receive. Received data may be invalid. Transmit data have been sent out of sync. Receive frame error

RXFES generates an interrupt if enabled by RIE. Transmit frame error TXFES generates an interrupt if enabled by TIE.

CDS Change Detector Status (Bit 2). The input change detector on SSI_IO1 pin has detected a change in state.

RIO1 Read IO1 (Bit 1). RIO1 reflects the value on the SSI_IO1 pin.

RIO2 Read IO2 (Bit 0). RIO2 reflects the value on the SSI_IO2 pin.
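The WAW note above (check that WAW is nonzero before writing) and the pairing of each error flag with its clear bit can be sketched as follows. This is an illustrative sketch that operates on a register value passed in as an argument; the names are hypothetical and no MMIO access is shown. Bit positions are those of Table 17-8.

```c
#include <stdint.h>

/* Status flags and their matching action (clear) bits, per Table 17-8. */
#define SSI_CSR_CTUE  (1u << 21)
#define SSI_CSR_CROE  (1u << 20)
#define SSI_CSR_CFES  (1u << 19)
#define SSI_CSR_CCDS  (1u << 18)
#define SSI_CSR_TUE   (1u << 5)
#define SSI_CSR_ROE   (1u << 4)
#define SSI_CSR_FES   (1u << 3)
#define SSI_CSR_CDS   (1u << 2)

/* Number of 32-bit words that can still be written to the TxFIFO. */
static unsigned ssi_csr_waw(uint32_t csr) { return (csr >> 12) & 0xFu; }

/* Number of 32-bit words waiting in the RxFIFO (saturates at 15). */
static unsigned ssi_csr_war(uint32_t csr) { return (csr >> 8) & 0xFu; }

/* How many of `pending` words a fill routine may write right now. */
static unsigned ssi_tx_words_to_write(uint32_t csr, unsigned pending)
{
    unsigned room = ssi_csr_waw(csr);
    return pending < room ? pending : room;
}

/* Action bits that would clear every error/status flag set in `csr`. */
static uint32_t ssi_csr_clear_bits(uint32_t csr)
{
    uint32_t clr = 0;
    if (csr & SSI_CSR_TUE) clr |= SSI_CSR_CTUE;
    if (csr & SSI_CSR_ROE) clr |= SSI_CSR_CROE;
    if (csr & SSI_CSR_FES) clr |= SSI_CSR_CFES;
    if (csr & SSI_CSR_CDS) clr |= SSI_CSR_CCDS;
    return clr;
}
```

Because SSI_CSR also holds the TMS, CDE, CD2 and SLP control fields, a real driver would need to combine the returned clear bits with the current control settings before writing the register back.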


17.11 TIMING DIAGRAMS

Figure 17-11 and Figure 17-12 illustrate the timing of the

data signals and the frame timing.

17.12 POWER DOWN

The SSI block can be separately powered down by setting a bit in the BLOCK_POWER_DOWN register. For a description of powerdown, see Chapter 21, “Power Management.” The SSI block should not be active when applying block powerdown.

If the block enters power-down state while transmission is enabled, behavior upon power-up is undefined.

Table 17-9. Test mode select

Bit Mode

0X Normal Operation.

10 Remote Loopback Test: Direct connection of receiver serial data to transmitter serial data. Transmitter is

clocked with RxCLK. No data loaded to the SSI_RxDR register or RxFIFO buffer and no CPU interrupt is gener-

ated. Useful to allow remote device to test the communication medium and the Rx and Tx front ends.

11 Local Loopback Test: Feedback is after SSI_TxDR and SSI_RxDR register and serializer/deserializer. Allows

DSPCPU to test the bulk of the Rx and Tx circuits. During Local Loopback Test, an external clock on

SSI_RXCLK should be present to clock the SSI unit.

Figure 17-11. SSI Serial timing. (FSP = 0, RSD = 0, TSD = 0, TCP = 0, RCP = 0, FMS = 0)

(Waveform not reproduced. Signals shown: SSI_RXCLK, SSI_RXFSX, SSI_RXDATA and SSI_TXDATA, with data bits D15 down to D0 in sequence.)

Figure 17-12. SSI Serial timing. (FSP = 0, RSD = 0, TSD = 0, TCP = 0, RCP = 0, FMS = 0, FSS = 5, VSS = 4)

(Waveform not reproduced. Signals shown: SSI_RXCLK, SSI_RXFSX, SSI_RXDATA and SSI_TXDATA. The first frame carries the 1st through 4th data words; the second frame begins with its 1st data word.)


JTAG Functional Specification Chapter 18

by Renga Sundararajan, Hans Bouwmeester and Frank Bouwman

18.1 OVERVIEW

In this document, the generic PNX1300 name refers

to the PNX1300 Series, or the PNX1300/01/02/11

products.

The IEEE 1149.1 (JTAG) standard can be used for vari-

ous purposes including testing connections between in-

tegrated circuits on board level, controlling the testing of

the internal structures of the integrated circuits, and mon-

itoring and communicating with a running system.

The JTAG standard defines on-chip test logic, four or five

dedicated pins collectively called the Test Access Port

(TAP) and a TAP controller.

The JTAG standard defines instructions that must al-

ways be implemented by a TAP controller in order to

guarantee correct behavior on board level. Apart from

mandatory instructions, the standard also allows user-

defined and private instructions. In PNX1300, user de-

fined and private instructions exist for debug purposes

and for production test. For debug there is communica-

tion between a debug monitor running on the PNX1300

DSPCPU and a debugger front-end running on a host

computer. This will be explained in Section 18.3.

18.2 TEST ACCESS PORT (TAP)

The Test Access Port includes three or four dedicated in-

put pins and one output pin:

• TCK (Test Clock)

• TMS (Test Mode Select)

• TDI (Test Data In)

• TRST (Test Reset, optional!)

• TDO (Test Data Out)

TRST is not present on PNX1300.

TCK provides the clock for test logic required by the stan-

dard. TCK is asynchronous to the system clock. Stored

state devices in JTAG controller must retain their state

indefinitely when TCK is stopped at 0 or 1.

The signal received at TMS is decoded by the TAP con-

troller to control test functions. The test logic is required

to sample TMS at the rising edge of TCK.

Serial test instructions and test data are received at TDI.

The TDI signal is required to be sampled at the rising

edge of TCK. When test data is shifted from TDI to TDO,

the data must appear without inversion at TDO after a

number of rising and falling edges of TCK determined by

the length of the instruction or test data register selected.

TDO is the serial output for test instructions and data

from the TAP controller. Changes in the state of TDO

must occur at the falling edge of TCK. This is because

devices connected to TDO are required to sample TDO

at the rising edge of TCK. The TDO driver must be in an

inactive state (i.e., TDO line High-Z) except when data

scanning is in progress.

18.2.1 TAP Controller

The TAP controller is a finite state machine; it synchro-

nously responds to changes in TCK and TMS signals.

The TAP instructions and data are serially scanned into

the TAP controller’s instruction and data registers via the

common input line TDI. The TMS signal tells the TAP

controller to select either the TAP instruction register or

a TAP data register as the destination for serial input

from the common line TDI. An instruction scanned into

the instruction register selects a data register to be connected between TDI and TDO and hence to be the destination for serial data input.

TAP controller state changes are determined by the TMS

signal. The states are used for scanning in/out TAP in-

struction and data, updating instruction and data regis-

ters, and for executing instructions.

The controller state diagram (Figure 18-1) shows sepa-

rate states for ‘capture’, ‘shift’ and ‘update’ of data and in-

structions. The reason for separate states is to leave the

contents of a data register or an instruction register un-

disturbed until serial scan-in is finished and the update

state is entered. By separating the shift and update

states, the contents of a register (the parallel stage) is not

affected during scan in/out.

The TAP controller must be in Test Logic Reset state after power-up. It remains in that state as long as TMS is held at ‘1’. It transitions to Run-Test/Idle state when TMS = ‘0’. The Run-Test/Idle state is an idle state of the controller in between scanning in/out an instruction/data register. The ‘Run-Test’ part of the name refers to start of built-in tests. The ‘Idle’ part of the name refers to all other cases. Note that there are two similar sub-structures in the state diagram, one for scanning in an instruction and another for scanning in data. To scan in/out a data register, one has to scan in an instruction first.
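The TAP state transitions described above can be written as a small table-driven state machine. The sketch below encodes the standard IEEE 1149.1 sixteen-state controller; the type and function names are illustrative and not part of any PNX1300 software.

```c
typedef enum {
    TAP_TEST_LOGIC_RESET, TAP_RUN_TEST_IDLE,
    TAP_SELECT_DR, TAP_CAPTURE_DR, TAP_SHIFT_DR, TAP_EXIT1_DR,
    TAP_PAUSE_DR, TAP_EXIT2_DR, TAP_UPDATE_DR,
    TAP_SELECT_IR, TAP_CAPTURE_IR, TAP_SHIFT_IR, TAP_EXIT1_IR,
    TAP_PAUSE_IR, TAP_EXIT2_IR, TAP_UPDATE_IR
} tap_state;

/* tap_next[state][tms]: state entered on the next rising edge of TCK. */
static const tap_state tap_next[16][2] = {
    /* Test Logic Reset */ { TAP_RUN_TEST_IDLE, TAP_TEST_LOGIC_RESET },
    /* Run-Test/Idle    */ { TAP_RUN_TEST_IDLE, TAP_SELECT_DR },
    /* Select DR Scan   */ { TAP_CAPTURE_DR,    TAP_SELECT_IR },
    /* Capture DR       */ { TAP_SHIFT_DR,      TAP_EXIT1_DR },
    /* Shift DR         */ { TAP_SHIFT_DR,      TAP_EXIT1_DR },
    /* Exit1 DR         */ { TAP_PAUSE_DR,      TAP_UPDATE_DR },
    /* Pause DR         */ { TAP_PAUSE_DR,      TAP_EXIT2_DR },
    /* Exit2 DR         */ { TAP_SHIFT_DR,      TAP_UPDATE_DR },
    /* Update DR        */ { TAP_RUN_TEST_IDLE, TAP_SELECT_DR },
    /* Select IR Scan   */ { TAP_CAPTURE_IR,    TAP_TEST_LOGIC_RESET },
    /* Capture IR       */ { TAP_SHIFT_IR,      TAP_EXIT1_IR },
    /* Shift IR         */ { TAP_SHIFT_IR,      TAP_EXIT1_IR },
    /* Exit1 IR         */ { TAP_PAUSE_IR,      TAP_UPDATE_IR },
    /* Pause IR         */ { TAP_PAUSE_IR,      TAP_EXIT2_IR },
    /* Exit2 IR         */ { TAP_SHIFT_IR,      TAP_UPDATE_IR },
    /* Update IR        */ { TAP_RUN_TEST_IDLE, TAP_SELECT_DR },
};

/* Advance one TCK cycle with the given TMS level. */
static tap_state tap_step(tap_state s, int tms)
{
    return tap_next[s][tms ? 1 : 0];
}

/* Apply n TMS values taken from tms_bits, least significant bit first. */
static tap_state tap_walk(tap_state s, unsigned tms_bits, unsigned n)
{
    for (unsigned i = 0; i < n; i++)
        s = tap_step(s, (int)((tms_bits >> i) & 1u));
    return s;
}
```

In this model, holding TMS at ‘1’ for five TCK cycles reaches Test Logic Reset from any state, and the TMS sequence 1, 1, 0, 0 from Run-Test/Idle reaches Shift-IR in four cycles.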

An instruction or data register must have at least two

stages, a shift register stage and a parallel input/output

stage. When an n-bit data register is to be ‘read’, the register

is selected by an instruction. The register’s contents

are ‘captured’ first (loaded in parallel into shift register

stage), n bits are shifted in and at the same time n bits


are shifted out. Finally the register is ‘updated’ with the

new n bits shifted in.

Note: when a register is scanned, its old value is shifted out of TDO. The new value shifted in via TDI is written to the register at the update state. Hence, scan in/out involve the same steps. This also means that reading a register via JTAG is destructive unless otherwise stated. We can specify some registers as read-only via JTAG so that when the controller transitions to update state for the read-only register, the update has no effect.

Sometimes, read-write registers are needed (for exam-

ple, control registers used for handshake) which can be

read non-destructively. In such cases, the value shifted

in determines whether the old value is ‘remembered’ or

something else happens.
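The capture, shift and update steps, and the effect of marking a register read-only via JTAG, can be modeled in a few lines. This is a behavioral sketch only: the structure and function names are hypothetical, and shifting the least significant bit out first is an assumption of the model.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t parallel;   /* parallel stage: the value the system sees */
    uint32_t shift;      /* shift-register stage between TDI and TDO  */
    unsigned nbits;      /* register length, 1..32                    */
    bool     read_only;  /* if true, the update step has no effect    */
} scan_reg;

/* Scan `in` into the register and return the old value shifted out. */
static uint32_t scan_reg_scan(scan_reg *r, uint32_t in)
{
    uint32_t mask = (r->nbits >= 32u) ? 0xFFFFFFFFu
                                      : ((1u << r->nbits) - 1u);
    uint32_t out = 0;

    r->shift = r->parallel & mask;              /* capture */
    for (unsigned i = 0; i < r->nbits; i++) {   /* shift, LSB first */
        out |= (r->shift & 1u) << i;            /* bit leaving on TDO */
        r->shift = (r->shift >> 1)
                 | (((in >> i) & 1u) << (r->nbits - 1u)); /* bit from TDI */
    }
    if (!r->read_only)
        r->parallel = r->shift;                 /* update */
    return out;
}
```

Scanning a read/write register returns its old contents and replaces them with the value shifted in, whereas scanning a read-only register returns the contents and leaves them unchanged.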

18.2.2 PNX1300 JTAG Instruction Set

PNX1300 uses a 5-bit instruction register. The unspeci-

fied opcodes are private and their effects are undefined.

Table 18-1 lists the JTAG instructions.

Figure 18-1. State diagram of TAP controller. (Diagram not reproduced: Test Logic Reset and Run-Test/Idle lead into two similar columns, DR scan and IR scan, each with the states Select, Capture, Shift, Exit1, Pause, Exit2 and Update.)

Table 18-1. JTAG instruction encoding

Encoding  Instruction name  Action
00000     EXTEST            Select (dummy) boundary scan register
00001     SAMPLE/PRELOAD    Select (dummy) boundary scan register
11111     BYPASS            Select bypass register
10000     RESET             Reset TriMedia to power on state
10001     SEL_DATA_IN       Select DATA_IN register


The JTAG instructions EXTEST, SAMPLE/PRELOAD,

and BYPASS are standard instructions and are not dis-

cussed here. The MACRO, BURNIN, and PASS_C_S in-

structions are used during hardware test mode, and are

also not discussed here. All other instructions are discussed in Section 18.3.
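For host-side tooling, the encodings of Table 18-1 can be captured in an enumeration. In the sketch below the identifier names are illustrative; the opcodes come from Table 18-1, the 32-bit and 8-bit register widths from Section 18.3.1, the 33-bit widths of the virtual registers from Tables 18-3 and 18-4, and the 1-bit bypass register from IEEE 1149.1. Lengths not stated in this chapter are reported as 0.

```c
/* 5-bit JTAG instruction opcodes from Table 18-1. */
enum pnx1300_jtag_instr {
    JTAG_EXTEST         = 0x00,  /* 00000 */
    JTAG_SAMPLE_PRELOAD = 0x01,  /* 00001 */
    JTAG_BURNIN         = 0x0A,  /* 01010, private */
    JTAG_PASS_C_S       = 0x0E,  /* 01110, private */
    JTAG_RESET          = 0x10,  /* 10000 */
    JTAG_SEL_DATA_IN    = 0x11,  /* 10001 */
    JTAG_SEL_DATA_OUT   = 0x12,  /* 10010 */
    JTAG_SEL_IFULL_IN   = 0x13,  /* 10011 */
    JTAG_SEL_OFULL_OUT  = 0x14,  /* 10100 */
    JTAG_SEL_JTAG_CTRL  = 0x15,  /* 10101 */
    JTAG_MACRO          = 0x1E,  /* 11110 */
    JTAG_BYPASS         = 0x1F   /* 11111 */
};

/* Length in bits of the data register selected by an instruction,
 * or 0 when this chapter does not state it. */
static unsigned pnx1300_dr_length(enum pnx1300_jtag_instr instr)
{
    switch (instr) {
    case JTAG_SEL_DATA_IN:
    case JTAG_SEL_DATA_OUT:  return 32u;
    case JTAG_SEL_IFULL_IN:
    case JTAG_SEL_OFULL_OUT: return 33u;  /* control bit plus 32 data bits */
    case JTAG_SEL_JTAG_CTRL: return 8u;
    case JTAG_BYPASS:        return 1u;
    default:                 return 0u;
    }
}
```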

18.3 USING JTAG FOR PNX1300 DEBUG

Figure 18-2 shows an overview of the JTAG access path from a host machine to a target TriMedia system and a simplified block diagram of the TriMedia processor. The JTAG Interface Module shown separately in the diagram may be a PC add-on card such as PC-1149.1/100F Boundary Scan Controller Board from Corelis Inc. or a similar module connected to a PC serial or parallel port. The JTAG interface module is necessary only for TriMedia systems that are not plugged into a PC. For PC-hosted TriMedia systems, the host-based debugger front-end can communicate with the target-resident debug monitor via the PCI bus.

The enhancements to the standard functionality of JTAG test logic provide a handshake mechanism for transferring data to and from a TriMedia processor’s MMIO registers reserved for this purpose, for posting an interrupt,

and for resetting processor state. The actual interpreta-

tion of the contents of the MMIO registers is determined

by a software protocol used by the debug monitor run-

ning on the TriMedia processor and the debug front-end

running on a host machine.

The communication between a host computer and a target TriMedia system via JTAG requires, at a high level of abstraction, the following components.

• A host computer with a serial or parallel inter-

face.

The host computer transfers data to and from the

JTAG interface module, preferably in word-parallel

fashion. A JTAG interface device driver is also

needed to access and modify the registers of the

JTAG interface module.

Table 18-1. JTAG instruction encoding (continued)

Encoding  Instruction name  Action
10010     SEL_DATA_OUT      Select DATA_OUT register
10011     SEL_IFULL_IN      Select IFULL_IN register
10100     SEL_OFULL_OUT     Select OFULL_OUT register
10101     SEL_JTAG_CTRL     Select JTAG_CTRL register
11110     MACRO             Hardware test mode select
01010     BURNIN            Private
01110     PASS_C_S          Private

Figure 18-2. TriMedia system with JTAG test access. (Block diagram not reproduced: a host machine, such as a PC, connects over a serial or parallel connection to the JTAG interface module, which may be a PC plug-in board. The module drives the JTAG TAP (TCK, TMS, TDI, TDO) through the JTAG connector on the TriMedia board. On the chip, the JTAG controller, MMIO, DSPCPU and main memory (SDRAM) are attached to the data highway. The scan chain may connect other chips on the board.)

• A JTAG interface module (hardware) that asynchronously transfers data to and from the host computer.

The interface module synchronously transfers data to and from the JTAG TAP on a TriMedia processor, and supplies the test clock, TCK, and other signals to the TriMedia JTAG controller. The interface module may be a PC plug-in board.

This module may transfer data from and to the host

computer in bit-serial or word-parallel fashion. It

transfers data from and to the JTAG registers on a

TriMedia processor in bit-serial fashion in accor-

dance with the IEEE 1149.1 standard. The JTAG

interface module connects to a 4-pin JTAG connec-

tor on a TriMedia board which provides a path to the

JTAG pins on a TriMedia processor. It is the respon-

sibility of the interface module to scan data in and out

of the TriMedia processor into its internal buffers and

make them available to the host computer.

• A JTAG controller on the TriMedia processor

which provides a bridge between the external

JTAG TAP and the internal system.

The controller transfers data from/to the TAP to/from

its scannable registers asynchronous to the internal

system clock. A monitor running on a TriMedia pro-

cessor and the debugger front-end running on a host

computer exchange data via JTAG by reading/writing

the MMIO registers reserved for this purpose, includ-

ing a control register used for the handshake.

18.3.1 JTAG Instruction and Data Registers

PNX1300 has two JTAG data registers and one JTAG control register (see Figure 18-3) in MMIO space and a number of JTAG instructions to manipulate those registers. Table 18-2 lists the MMIO addresses of the JTAG data and control registers. The addresses are offsets from MMIO_BASE. All references to instruction and data registers below are to JTAG instructions and data registers and not TriMedia instruction or data registers.

• Two 32-bit data registers, JTAG_DATA_IN and

JTAG_DATA_OUT in MMIO space. Both registers

can be connected in between TDI and TDO like the

standard Bypass and Boundary Scan registers of

JTAG (not shown in Figure 18-3).

The JTAG_DATA_IN register can be read or written

to via the JTAG port. The JTAG_DATA_OUT register

is read-only via the JTAG port, so that scanning out

JTAG_DATA_OUT is non-destructive.

The JTAG_DATA_IN and JTAG_DATA_OUT are

readable/writable from the TriMedia processor via

the usual load/store operations.

• An 8-bit control register JTAG_CTRL in MMIO

space. The JTAG_CTRL register is used for hand-

shake between a debug monitor running on a TriMe-

dia and a debugger front-end running on a host.

JTAG_CTRL.ofull = ‘1’ means that

JTAG_DATA_OUT has valid data to be scanned out.

On power-on reset of the TriMedia processor,

JTAG_CTRL.ofull = ‘0’. JTAG_CTRL.ofull is both

readable and writable via JTAG tap. Writing 0 to

JTAG_CTRL.ofull via JTAG is a ‘remember’ opera-

tion, i.e., JTAG_CTRL.ofull retains its previous state.

Writing a ‘1’ to JTAG_CTRL.ofull via JTAG is a ‘clear’

operation, i.e., JTAG_CTRL.ofull becomes ‘0’.

JTAG_CTRL.ifull = ‘0’ means that the

JTAG_DATA_IN register is empty. JTAG_CTRL.ifull

= 1 means that JTAG_DATA_IN has valid data and

the debug monitor has not yet copied it to its private

area. On power-on reset of the TriMedia processor,

JTAG_CTRL.ifull = 0. JTAG_CTRL.ifull is readable

and writable via JTAG. Writing a ‘0’ to

JTAG_CTRL.ifull via JTAG is a remember operation,

i.e., JTAG_CTRL.ifull retains its previous state. Writing

a ‘1’ to JTAG_CTRL.ifull posts an interrupt on

hardware line 18.

The peripheral blocks on a TriMedia processor may

enter a ‘power down’ state to reduce power con-

sumption. The JTAG_CTRL.sleepless bit determines

if the JTAG block participates in a power down state.

In the power-on RESET state, JTAG_CTRL.sleep-

less bit is ‘1’ meaning the JTAG block does not

power down. It can be read and written to by the Tri-

Media processor via load/store operations and by the

debugger front-end running on a host by scan in/out.

Table 18-2. MMIO Register Assignments

MMIO Offset  JTAG Register
0x10 3800    JTAG_DATA_IN
0x10 3804    JTAG_DATA_OUT
0x10 3808    JTAG_CTRL

Figure 18-3. Additional JTAG data registers and control register. (Diagram not reproduced: the 32-bit JTAG_DATA_IN and JTAG_DATA_OUT registers and the JTAG_CTRL register, with its ifull, ofull and sleepless bits and the remaining bits unused, on the path from TDI to TDO.)

• Two virtual registers, JTAG_IFULL_IN and JTAG_OFULL_OUT. The first virtual register JTAG_IFULL_IN connects the registers JTAG_CTRL.ifull and JTAG_DATA_IN in series. Likewise, the virtual register JTAG_OFULL_OUT connects JTAG_CTRL.ofull and JTAG_DATA_OUT in series.

The reason for the virtual registers is to shorten the

time for scanning the JTAG_DATA_IN and

JTAG_DATA_OUT registers. Without virtual regis-

ters, we must scan in an instruction to select

JTAG_DATA_IN, scan in data, scan an instruction to

select JTAG_CTRL register and finally scan in the

control register. With virtual register, we can scan in

an instruction to select JTAG_IFULL_IN and then

scan in both control and data bits. Similar savings

can be achieved for scan out using virtual registers.

• Six JTAG instructions:

  • 5 instructions, SEL_DATA_IN, SEL_DATA_OUT, SEL_IFULL_IN, SEL_OFULL_OUT, and SEL_JTAG_CTRL, for selecting the registers to be connected between TDI and TDO for serial input/output.

  • An instruction RESET for resetting the TriMedia processor to power on state.

• In the Capture-IR state of the TAP controller, the 2 least significant bits (bits 0 and 1) of the shift register stage must be loaded with ‘01’ as required by the

standard. The standard allows the remaining bits of

the IR shift stage to be loaded with design specific

data. The bits 2, 3 and 4 of the IR shift stage are

loaded with bits 0, 1 and 2 of the JTAG_CTRL regis-

ter. This means that shifting in any instruction allows

the 3 least significant bits of the JTAG_CTRL register

to be inspected. This reduces the polling overhead

for data transfer.
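The Capture-IR behavior in the last bullet can be expressed directly: the value shifted out during any instruction scan carries ‘01’ in bits 1:0 and bits 0 to 2 of JTAG_CTRL in bits 2 to 4. The helper names below are hypothetical, and the sketch deliberately does not assume which JTAG_CTRL bit position holds ifull, ofull or sleepless.

```c
#include <stdint.h>

/* Value loaded into the 5-bit IR shift stage in the Capture-IR state. */
static uint8_t capture_ir_value(uint8_t jtag_ctrl)
{
    return (uint8_t)(0x01u | ((jtag_ctrl & 0x07u) << 2));
}

/* Recover JTAG_CTRL bits 2..0 from the 5 bits shifted out of TDO. */
static uint8_t ctrl_bits_from_capture(uint8_t ir_out)
{
    return (uint8_t)((ir_out >> 2) & 0x07u);
}

/* Check the mandatory '01' pattern in the two least significant bits. */
static int capture_ir_is_valid(uint8_t ir_out)
{
    return (ir_out & 0x03u) == 0x01u;
}
```

A front-end can therefore poll the handshake bits by repeatedly scanning in an instruction and decoding the bits that come out, without selecting JTAG_CTRL as a data register.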

Race Conditions

Since the JTAG data registers live in MMIO space and

are accessible by both the TriMedia processor and the

JTAG controller at the same time, race conditions must

not exist either in hardware or in software. The following

communication protocol uses a handshake mechanism

to avoid software race conditions.

18.3.2 JTAG Communication Protocol

The following describes the handshake mechanism for

transferring data via JTAG.

•Transfer from debug front-end to debug monitor

The debugger front-end running on a host transfers

data to a debug monitor via JTAG_DATA_IN regis-

ter. It must poll JTAG_CTRL.ifull bit to check if

JTAG_DATA_IN register can be written to. If the

JTAG_CTRL.ifull bit is clear, the front-end may scan

data into the JTAG_IFULL_IN register. Note that

data and control bits may be shifted in with

SEL_IFULL_IN instruction and the bit shifted into

JTAG_CTRL.ifull register must be ‘1’. This action

triggers an interrupt. The debug monitor must copy

the data from JTAG_DATA_IN register into its private

area when servicing the interrupt and then clear

JTAG_CTRL.ifull bit thus allowing JTAG interface

module to write to JTAG_DATA_IN register the next

piece of data.

•Transfer from monitor to front-end

The monitor running on TriMedia must check if

JTAG_CTRL.ofull is clear and if so, it can write data

to JTAG_DATA_OUT. After that, the monitor must

set the JTAG_CTRL.ofull bit. The debugger front-end

polls the JTAG_CTRL.ofull bit. When that bit is set, it

can scan out JTAG_DATA_OUT register and clear

JTAG_CTRL.ofull bit. Since JTAG_DATA_OUT is

read-only via JTAG, the update action at the end of

scan out has no effect on JTAG_DATA_OUT. The

JTAG_CTRL.ofull bit, however, must be cleared by

shifting in the value ‘1’.

• Controller States

In the power-on reset state, JTAG_CTRL.ifull and

JTAG_CTRL.ofull must be cleared by the JTAG con-

troller.
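The two transfer directions of the handshake can be simulated with a small software model. This sketch is illustrative only: the structure stands in for the MMIO registers, the function names are hypothetical, and each function collapses a complete JTAG scan or load/store sequence into a single step.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t data_in;      /* JTAG_DATA_IN  */
    uint32_t data_out;     /* JTAG_DATA_OUT */
    bool     ifull;        /* JTAG_CTRL.ifull */
    bool     ofull;        /* JTAG_CTRL.ofull */
    bool     irq_pending;  /* interrupt posted when '1' is written to ifull */
} jtag_regs;

/* Front-end: scan data and ifull = '1' into JTAG_IFULL_IN if the buffer
 * is free. Returns false when the monitor has not consumed the last word. */
static bool frontend_send(jtag_regs *r, uint32_t word)
{
    if (r->ifull)
        return false;
    r->data_in = word;
    r->ifull = true;
    r->irq_pending = true;
    return true;
}

/* Monitor: interrupt service; copy the word, then clear ifull. */
static uint32_t monitor_service_irq(jtag_regs *r)
{
    uint32_t word = r->data_in;
    r->ifull = false;
    r->irq_pending = false;
    return word;
}

/* Monitor: write JTAG_DATA_OUT and set ofull if the buffer is free. */
static bool monitor_send(jtag_regs *r, uint32_t word)
{
    if (r->ofull)
        return false;
    r->data_out = word;
    r->ofull = true;
    return true;
}

/* Front-end: scan out JTAG_OFULL_OUT; shifting '1' into ofull clears it. */
static bool frontend_receive(jtag_regs *r, uint32_t *word)
{
    if (!r->ofull)
        return false;
    *word = r->data_out;
    r->ofull = false;
    return true;
}
```

In this model each side writes a data register only when the matching full bit is clear, and the reader clears that bit only after taking the data, which is the property the protocol relies on to avoid software races.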

18.3.3 Example Data Transfer Via JTAG

Scanning in a 5-bit instruction will take 12 TCK cycles from the Run-Test/Idle state: 4 cycles to reach Shift-IR state, 5 cycles for actual shifting in, 1 cycle to Exit1-IR state, 1 cycle to Update-IR state, and 1 cycle back to Run-Test/Idle state. Likewise, scanning in a 32-bit data register will take 38 TCK cycles, and scanning in the 8-bit JTAG_CTRL data register will take 14 TCK cycles, from Idle state. However, if a data transfer follows an instruction transfer, then the transition to the DR scan stage can be done without going through Idle state, saving 1 cycle.

18.3.3.1 Transferring data to TriMedia via

JTAG

Poll control register to check if input buffer is empty. Scan in data when it is empty and set the ifull control bit to ‘1’, triggering an interrupt. Note that scanning in any instruction automatically scans out the 3 least significant bits (including ifull and ofull bits) of the JTAG_CTRL register.

Table 18-3. Transfer of Data in via JTAG

Action                                                         Number of TCK cycles
IR shift in SEL_IFULL_IN instruction                           12
While JTAG_CTRL.ifull = 1, scan in SEL_IFULL_IN instruction    11+
DR scan 33 bits of register JTAG_IFULL_IN                      38
TOTAL                                                          61+ cycles


18.3.3.2 Transferring data from TriMedia via JTAG

Poll the control register to check if the output buffer is full. Scan
out data when it is full and clear the ofull control bit. Note that
scanning in any instruction automatically scans out the 3 least
significant bits (including the ifull and ofull bits) of the JTAG_CTRL
register.

Note that the above timings do not include the over-

heads of the JTAG software driver for JTAG interface

module plugged into a PC.

18.3.4 JTAG Interface Module

It is expected that the interface module will be a program-

mable JTAG interface module. One end of the module

should be connected to a JTAG tap and the other end to

a host computer via a serial or parallel line or plugged

into a PC. It is up to the JTAG driver software on a host

computer to program the JTAG interface module via the

serial/parallel interface for transferring data to/from the

target. The transfer rates will depend on the interface

module.

Table 18-4. Transfer of Data out via JTAG

Action                                                          Number of TCK cycles
IR shift in SEL_OFULL_OUT instruction                           12
While JTAG_CTRL.ofull = 0, scan in SEL_OFULL_OUT instruction    11+
DR scan 33 bits of register JTAG_OFULL_OUT                      38
TOTAL                                                           61+ cycles


On-Chip Semaphore Assist Device Chapter 19

19.1 OVERVIEW

In this document, the generic PNX1300 name refers

to the PNX1300 Series, or the PNX1300/01/02/11

products.

PNX1300 has a simple MP semaphore-assist device. It is a 32-bit
register, accessible through MMIO by either the local PNX1300 CPU or by
any other CPU on PCI through the aperture made available on PCI. The
semaphore, SEM, is located at MMIO offset 0x10 0500.

SEM operation is as follows: each master in the system constructs a
personal nonzero 12-bit ID (see below). To obtain the global semaphore,
a master performs the following actions:

write ID to SEM (use 32-bit store, with ID in the 12 LSBs)

retrieve SEM (use 32-bit load; it returns 0x00000nnn)

if (SEM == ID) {

    “perform a short critical section action”

    write 0 to SEM

}

else “try again later, or loop back to write”

19.2 SEM DEVICE SPECIFICATION

SEM is a 32-bit MMIO location. The 12 LSBs consist of storage
flip-flops with surrounding logic; the 20 MSBs always return ‘0’ when
read.

SEM is RESET to ‘0’ by power up reset.

When SEM is written to, the storage flip-flops behave as follows:

if (cur_content == 0) new_content = write_value;

else if (write_value == 0) new_content = 0;

/* ELSE NO ACTION ! */
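The write rule above, together with the acquire sequence of Section 19.1, can be exercised with a small behavioural model. This is a sketch under stated assumptions: the class `Sem` and the helpers `try_acquire` and `release` are hypothetical names, and masking the written value to 12 bits before applying the rule is an assumption about how the upper bits of a store are treated.

```python
class Sem:
    """Behavioural model of the SEM register (12 storage bits)."""

    MASK = 0xFFF

    def __init__(self):
        self.content = 0  # power-up reset value

    def write(self, value):
        value &= self.MASK  # assumption: only the 12 LSBs are considered
        if self.content == 0:
            self.content = value      # free: the writer's ID is latched
        elif value == 0:
            self.content = 0          # any write of 0 releases SEM
        # otherwise: no action, the current owner keeps SEM

    def read(self):
        return self.content           # the 20 MSBs always read as 0


def try_acquire(sem, my_id):
    """Write the ID, read back, and report whether this master owns SEM."""
    sem.write(my_id)
    return sem.read() == my_id


def release(sem):
    sem.write(0)
```

Note that the device does not check who writes the 0, so in this model (as in the specification text) only the current owner should call `release`.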

19.3 CONSTRUCTING A 12-BIT ID

A PNX1300 processor can construct a personal, nonzero

12-bit ID in a variety of ways. Below are some sugges-

tions.

PCI configspace PERSONALITY entry. Each PNX1300 receives a 16-bit
PERSONALITY value from the EEPROM during boot. This PERSONALITY
register is located at offset 0x40 in configuration space. In an MP
system, some of the bits of PERSONALITY can be individualized for each
CPU involved, giving it a unique 2/3/4-bit ID, as needed given the
maximum number of CPUs in the design.

In the case of a host-assisted PNX1300 boot, the PCI BIOS assigns a
unique MMIO_BASE and DRAM_BASE to every PNX1300. In particular, the 11
MSBs of each MMIO_BASE are unique, since each MMIO aperture is 2 MB in
size. These bits can be used as a personality ID. Set bit 11 (MSB) to
'1' to guarantee a nonzero ID#.
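The MMIO_BASE suggestion amounts to a shift and an OR. The sketch below illustrates it in Python; the function name `sem_id_from_mmio_base` and the example base address are hypothetical and chosen only for illustration.

```python
def sem_id_from_mmio_base(mmio_base):
    """Build a nonzero 12-bit SEM ID from a 32-bit MMIO_BASE value."""
    # A 2 MB aperture is 2**21 bytes, so bits 31:21 (11 bits) identify it.
    aperture_bits = (mmio_base >> 21) & 0x7FF
    # Setting bit 11 guarantees the ID is nonzero even for base 0.
    return 0x800 | aperture_bits


# Hypothetical base address assigned by the PCI BIOS.
example_id = sem_id_from_mmio_base(0xEFE00000)
```

Two devices with different 2 MB apertures always receive different IDs, because the 11 aperture bits differ and bit 11 is common to both.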

19.4 WHICH SEM TO USE

Each PNX1300 in the system adds a SEM device to the

mix. The intended use is to treat one of these SEM de-

vices as THE master semaphore in the system. Many

methods can be used to determine which SEM is master

SEM. Some examples below:

Each DSPCPU can use PCI configuration space accesses to determine
which other PNX1300s are present in the system. Then, the PNX1300 with
the lowest PERSONALITY number, or the lowest MMIO_BASE, is chosen as
the PNX1300 containing the master semaphore.

19.5 USAGE NOTES

To avoid contention on the master SEM device, it should

only be used for inter-processor semaphores. Processes

running on a single CPU can use regular memory to im-

plement synchronization primitives.

The critical section associated with SEM should be kept

as short as possible. Preferably, SEM should only be

used as the basis to make multiple memory-resident sim-

ple semaphores. In this case, the non-cacheable DRAM

area of each PNX1300 can be used to implement the

semaphore data structures efficiently.

As described here, SEM does not guarantee starvation-free access to
critical resources. Claiming of SEM is purely stochastic. This should
work fine as long as SEM is not overloaded. Utmost care should be taken
in SEM access frequency and duration of the basic critical sections to
keep the load conditions reasonable.

SEM register layout (MMIO offset 0x10 0500): bits 31:12 always read as
0; bits 11:0 hold the SEM value.


Arbiter Chapter 20

by Eino Jacobs, Luis Lucas, Chris Nelson, Allan Tzeng, Gert Slavenburg

20.1 ARBITER FEATURES

In this document, the generic PNX1300 name refers

to the PNX1300 Series, or the PNX1300/01/02/11

products.

The PNX1300 internal highway bus conveys all the

memory and MMIO traffic. The on-chip peripheral units

described in this databook are connected to this internal

highway bus. Accesses to the bus are controlled by a

central arbiter. Figure 2-1 on page 2-2 shows the whole

system where the arbiter is embedded in the main mem-

ory interface (MMI) block. The traffic includes the memo-

ry requests issued by most of the on-chip units as well as

the MMIO transactions issued by the DSPCPU or PCI

block and responded to by the peripherals.

The arbiter was designed to make PNX1300 a true real-

time system by providing a highly programmable bus

bandwidth allocation scheme. The primary characteris-

tics are:

• round robin arbitration

• hierarchical organization

• programmable allocation of highway bandwidth

• dual priorities with priority raising mechanism

These features are explained in the next sections of this

chapter. The arbiter is programmed through two MMIO

registers:

• ARB_RAISE

•ARB_BW_CTL

The default values (after hardware RESET) stored in these two MMIO
registers are suitable for most applications. If these default
settings introduce violations of real-time constraints in units like
Video In (VI), Video Out (VO), Audio In (AI) and Audio Out (AO) (each
of these units has a Highway Bandwidth Error detection mechanism), the
ARB_BW_CTL register should be programmed to 0x090A9. This setting gives
almost maximum priority to real-time units but may slow down the CPU.

Fine tuning of the arbiter settings is described in the fol-

lowing sections.

20.2 DUAL PRIORITIES WITH PRIORITY RAISING MECHANISM

The best CPU performance is obtained if cache misses

can take priority over peripheral requests on the high-

way. However, peripherals need to have a maximum

guaranteed latency low enough to satisfy the real-time

constraints of I/O units.

PNX1300 provides this feature with the following priority-raising
mechanism.

Peripheral unit requests can have 2 priorities: low and high. Within
each class there is fair, round-robin arbitration (Section 20.3).
Requests with high priority take precedence over requests with low
priority.

Units can indicate the priority of their requests to be low

or high.

A unit may initially post a request with low priority. If the request
is not serviced within a particular waiting time, the unit can raise
the priority of the request to high. This can be done when the worst
case latency at high priority approaches the real-time constraint of
the unit. Thus, the unit uses only spare bandwidth without slowing down
the CPU unless real-time constraints require it to claim high priority.

In PNX1300, only the ICP unit has its own priority raising

logic (i.e. it controls the low to high transition of the re-

quest). Refer to Chapter 14, “Image Coprocessor,” for

more information.

Priority raising for the VLD, PCI, VI and VO units is handled by the
arbiter central priority raising mechanism. The central priority
raising mechanism settings are controlled from the DSPCPU with the
ARB_RAISE MMIO register (Table 20-1). For each of these units, the
register holds a delay that defines the time for which the arbiter
handles the request at low priority.

The delay is defined by a 5-bit field (dedicated per unit)

and is counted in CPU clock cycles. The granularity of

the delay is 16 cycles, so the maximum time spent at low

priority for each request can be programmed from 0 to

496 cycles, inclusive, in increments of 16 cycles.
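As an illustration of the delay encoding, the following Python sketch packs per-unit delays, given in CPU clock cycles, into an ARB_RAISE value using the field positions of Table 20-1. The helper name `arb_raise_value` and the dictionary-based interface are assumptions made for this example; it is not vendor code.

```python
# Least significant bit of each 5-bit delay field (Table 20-1).
ARB_RAISE_SHIFT = {"VLD": 15, "PCI": 10, "VI": 5, "VO": 0}


def arb_raise_value(delays_in_cpu_cycles):
    """Pack delays (multiples of 16 cycles, 0..496) into ARB_RAISE."""
    value = 0
    for unit, cycles in delays_in_cpu_cycles.items():
        if unit not in ARB_RAISE_SHIFT:
            raise ValueError(f"no ARB_RAISE field for unit {unit!r}")
        if cycles % 16 != 0 or not 0 <= cycles <= 496:
            raise ValueError("delay must be a multiple of 16 in 0..496")
        # The 5-bit field counts units of 16 CPU clock cycles.
        value |= (cycles // 16) << ARB_RAISE_SHIFT[unit]
    return value
```

For example, `arb_raise_value({"VO": 32})` stores 2 in the VO_delay field, so VO requests would stay at low priority for at most 32 CPU cycles before being raised.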

The default value for the entire ARB_RAISE register is ‘0’. This
causes all requests from VLD, PCI, VI and VO to be handled as
high-priority requests until the ARB_RAISE register contents have been
changed for the application requirements.

Table 20-1. ARB_RAISE register layout

Offset     Name        Bits    Fields
0x10010C   ARB_RAISE   19:15   VLD_delay[4:0]
                       14:10   PCI_delay[4:0]
                       9:5     VI_delay[4:0]
                       4:0     VO_delay[4:0]

Corner-case note: There is some risk in setting the delay high, then
lowering it, as the last request submitted with the high delay might
violate the latency constraints of the new real-time domain. However,
this should not happen since this register should be set before the
application starts.

The other units (AI, AO and BTI (boot block)) and the

CPU will always have their requests considered as high

priority. High priority for the CPU will give maximum pos-

sible performance.

AO and AI issue requests at a very low rate. Hence, the probability
that they take time away from the CPU is negligible.

20.3 ROUND ROBIN ARBITRATION

In addition to the dual priority mechanism, round-robin arbitration is
used to schedule requests with the same priority. The purpose is to
ensure, for every unit with a high-priority request, a maximum latency
for gaining access to the highway and/or a minimum share of the
available bandwidth.

Round-robin arbitration ensures that no starvation of re-

quests can occur and therefore requests with real-time

constraints can be handled in time.

The round robin arbitration algorithm is as follows.

Requests are granted according to a dynamic priority list.

Whenever a unit request is granted, it will be moved to

the last position in the priority list and another unit will be

moved to the first position in the priority list. Priorities are

rotated. A unit with a waiting request will eventually reach

the first place in the priority list.
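The dynamic priority list can be sketched in a few lines of Python. This is a minimal illustration, not the hardware implementation: the function name `round_robin_grants` is hypothetical, and rotating the list so that the unit following the winner moves to the front is one reading of "priorities are rotated".

```python
def round_robin_grants(units, requests_per_slot):
    """Return the unit granted in each arbitration slot (or None)."""
    priority = list(units)  # index 0 has the highest priority
    grants = []
    for requesting in requests_per_slot:
        # Grant the first unit in the priority list that has a request.
        winner = next((u for u in priority if u in requesting), None)
        grants.append(winner)
        if winner is not None:
            # Rotate: the winner becomes last, its successor becomes first.
            i = priority.index(winner)
            priority = priority[i + 1:] + priority[:i + 1]
    return grants


# Three units that all request continuously share the highway equally.
always = [{"A", "B", "C"}] * 6
pattern = round_robin_grants("ABC", always)
```

Because a granted unit always drops to the end of the list, a unit with a waiting request moves forward at every grant and cannot be starved.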

As an example, Figure 20-1 shows a state diagram of an arbitration
state machine with 2 requesters. The nodes A and B indicate states A
and B. In state A, requester A has ownership of the highway; in state
B, requester B has ownership. The arc from state A to state B indicates
that if the current state is state A and a request from requester B is
asserted, then a transition to state B occurs, i.e. ownership of the
highway passes from requester A to requester B.

When, in a particular state, none of the arcs leaving from that node
has its condition fulfilled, the state machine remains in the same
state.

When both requester A and B have requests asserted, then ownership of
the highway switches between A and B, creating fair allocation of
ownership.

Figure 20-2 pictures a state diagram that allocates fair

arbitration with 3 requesters.

20.3.1 Weighted Round Robin Arbitration

Not all units need to have equal latency and bandwidth.

It is preferred to allocate bandwidth to units according to

their needs. This is achieved with weighted round-robin

and can be illustrated in the following examples.

Figure 20-3 pictures a state machine with two requesters

A and B with double weight given to requester A. There

are now 2 states, A1 and A2, where requester A has ownership
of the highway. When both A and B requests are

asserted, requester A will have ownership of the highway

twice as often as requester B.
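The effect of the doubled state can be illustrated with a slot ring in which requester A owns two consecutive slots. This Python sketch is an assumed equivalent of the idea behind Figure 20-3, not a transcription of its arcs; the name `weighted_grants` is hypothetical.

```python
def weighted_grants(weights, requesting, slots):
    """Grant sequence of a weighted round robin built from a slot ring."""
    # Each unit appears in the ring as many times as its weight,
    # e.g. {"A": 2, "B": 1} gives the ring [A, A, B] (states A1, A2, B).
    ring = [unit for unit, weight in weights.items() for _ in range(weight)]
    position = 0
    grants = []
    for _ in range(slots):
        granted = None
        for step in range(len(ring)):
            index = (position + step) % len(ring)
            if ring[index] in requesting:
                granted = ring[index]
                position = (index + 1) % len(ring)  # continue after it
                break
        grants.append(granted)
    return grants


both = weighted_grants({"A": 2, "B": 1}, {"A", "B"}, 6)
```

When only B requests, every slot goes to B, which reflects the point made elsewhere in this chapter that unused bandwidth is not lost.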

Figure 20-1. State diagram of round robin arbitrator with 2 requesters.

Figure 20-2. State diagram of round robin arbitrator with 3 requesters.

Figure 20-3. State diagram of round robin arbitrator with 2 requesters; A has double weight.


Figure 20-4 shows a state machine with 3 requesters in which double
weight is given to requester A.

Such state machines can become very complex and cannot be implemented
for a large system like PNX1300 with 9 requesters. Hierarchy or
arbitration levels are used to overcome this problem.

20.3.2 Arbitration Levels

The arbitration is split into multiple levels of hierarchy. Each level
of hierarchy has an independent arbitration state machine. At the
bottom of the hierarchy, the arbitration is performed between a group
of units. Whichever of these units ‘wins’ is passed to the next level
of hierarchy, where the selected unit competes with other units at that
level for highway access. This is continued until the highest level of
arbitration.

By splitting arbitration into multiple levels it is easy to

support a large number of highway units while the com-

plexity of the arbitration state machines at each level of

hierarchy remains modest.

Figure 20-4. State diagram of round robin arbitrator with 3 requesters; A has double weight.

Figure 20-5. Arbitration architecture. (Diagram: six weighted
round-robin arbitration levels, L1 to L6, with a cache priority-based
arbitration stage feeding L1. Request inputs shown: vo_req, icp_reqh,
icp_reql, vi_req, pci_req, vld_req, ai_req, ao_req, dvdd_req, spdo_req,
bti_req, bti_mmio_req, pci_mmio_req, ic_req, dc_req, dc_mmio_req,
dc_req_pref.)


Hierarchy also makes it easy and natural to allocate bus bandwidth or
latency to a group of units. The most bandwidth- or latency-demanding
units are located at the top of the hierarchy, while the less demanding
ones are at the bottom and get a small amount of overall bandwidth.

20.4 ARBITER ARCHITECTURE

In addition to the dual priority mechanism described in Section 20.2,
PNX1300 supports an arbitration architecture made of 6 fixed levels of
hierarchy. This is combined with a programmable weighted round robin
algorithm per level, as pictured in Figure 20-5.

The weights can be adjusted by software to allocate bandwidth and
latency depending on application requirements. Within a level of
hierarchy the units can have equal weights, giving them an equal share
of bandwidth. Alternatively, they can have different weights, giving
them an unequal share of the bandwidth for that level.

The arbitration weights at each level are described in Table 20-3 and
illustrated in Figure 20-5.

Table 20-2 presents the minimum bandwidth allocation at Level 1
between the DSPCPU and the peripherals (level 2) according to the
different weight values that can be programmed. Note that programming
a weight of 3/3 or 2/2 instead of 1/1 is legal and results in the same
allocation.

Note: The different types of requests from the DSPCPU caches are
arbitrated locally before sending a single CPU request to the arbiter.
The PCI bus also performs local arbitration before sending a system
request to the arbiter.

The weight programming is done by setting the MMIO register
ARB_BW_CTL. The field description and coding is provided in Table 20-4.

The hardware RESET value of ARB_BW_CTL is 0, re-

sulting in a weight of 1 for all requests.

Note that each media processor application needs to

carefully review its arbiter settings.
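The minimum shares listed in Table 20-2 follow directly from the ratio of the two level 1 weights, and the same ratio can be applied one level further down. The Python sketch below illustrates this for the first two levels; the function name `minimum_shares` is hypothetical and the model ignores refresh and priority raising.

```python
def minimum_shares(cpu_weight, l2_weight, vo_weight=1, l3_weight=1):
    """Minimum bandwidth fractions for CPU, VO and level 3."""
    level1_total = cpu_weight + l2_weight
    cpu = cpu_weight / level1_total
    level2 = l2_weight / level1_total
    # Level 2 splits its share between VO and everything below (level 3).
    level2_total = vo_weight + l3_weight
    return {
        "CPU": cpu,
        "VO": level2 * vo_weight / level2_total,
        "L3": level2 * l3_weight / level2_total,
    }


default = minimum_shares(1, 1)      # all weights at their RESET value
cpu_heavy = minimum_shares(3, 1)    # first row of Table 20-2
```

With the RESET weights this gives 50% to the CPU and 25% each to VO and level 3, and weights of 3/3 or 2/2 give the same result as 1/1 because only the ratio matters.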

Table 20-2. Minimum bandwidth allocation between CPU caches and peripheral units.

Weight of CPU    Weight of    Bandwidth     Bandwidth
and caches       level 2      at level 1    at level 2
3                1            75%           25%
2                1            67%           33%
3                2            60%           40%
1                1            50%           50%
2                3            40%           60%
1                2            33%           67%
1                3            25%           75%

Table 20-3. Arbitration weights at each level

Level     Arbitration weights
level 1   CPU MMIO, Dcache and Icache are arbitrated with fixed
          priorities between each other and together have a
          programmable weight of 1, 2 or 3. Level 2 has a programmable
          weight of 1, 2 or 3.
level 2   The VO unit has a programmable weight of 1, 3 or 5. Level 3
          has a programmable weight of 1, 3, 5 or 7.
level 3   The ICP unit has a programmable weight of 1, 3, 5 or 7.
          Level 4 has a programmable weight of 1, 3 or 5.
level 4   The VI unit has a programmable weight of 1 or 2. Level 5 has
          a programmable weight of 1, 3 or 5.
level 5   The PCI unit has a programmable weight of 1, 3 or 5. Level 6
          has a programmable weight of 1 or 2.
level 6   Level 6 contains several lower bandwidth and/or
          latency-tolerant units. The VLD has a weight of 2. AI, AO,
          DVDD and the boot block (only active during booting) have a
          weight of 1.

Table 20-4. ARB_BW_CTL MMIO register (offset 0x100104)

Level of
arbitration   Field        Bits    Allowed values
n/a           RESERVED     25:18
level 1       CPU weight   17:16   00 = weight 1, 01 = weight 2, 10 = weight 3
level 1       L2 weight    15:14   00 = weight 1, 01 = weight 2, 10 = weight 3
level 2       VO weight    13:12   00 = weight 1, 01 = weight 3, 10 = weight 5
level 2       L3 weight    11:10   00 = weight 1, 01 = weight 3, 10 = weight 5, 11 = weight 7
level 3       ICP weight   9:8     00 = weight 1, 01 = weight 3, 10 = weight 5, 11 = weight 7
level 3       L4 weight    7:6     00 = weight 1, 01 = weight 3, 10 = weight 5
level 4       VI weight    5       0 = weight 1, 1 = weight 2
level 4       L5 weight    4:3     00 = weight 1, 01 = weight 3, 10 = weight 5
level 5       PCI weight   2:1     00 = weight 1, 01 = weight 3, 10 = weight 5
level 5       L6 weight    0       0 = weight 1, 1 = weight 2
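The field coding of Table 20-4 can be checked with a small encoder and decoder. The following Python sketch is illustrative only: the names `FIELDS`, `encode_arb_bw_ctl` and `decode_arb_bw_ctl` are assumptions, and the short field names (CPU, L2, VO, and so on) abbreviate the "weight" fields of the table.

```python
# (field, least significant bit, weights indexed by field code)
FIELDS = [
    ("CPU", 16, (1, 2, 3)),
    ("L2", 14, (1, 2, 3)),
    ("VO", 12, (1, 3, 5)),
    ("L3", 10, (1, 3, 5, 7)),
    ("ICP", 8, (1, 3, 5, 7)),
    ("L4", 6, (1, 3, 5)),
    ("VI", 5, (1, 2)),
    ("L5", 3, (1, 3, 5)),
    ("PCI", 1, (1, 3, 5)),
    ("L6", 0, (1, 2)),
]


def encode_arb_bw_ctl(**weights):
    """Build an ARB_BW_CTL value; unspecified fields keep weight 1."""
    value = 0
    for name, lsb, allowed in FIELDS:
        weight = weights.get(name, 1)
        if weight not in allowed:
            raise ValueError(f"{name} weight must be one of {allowed}")
        value |= allowed.index(weight) << lsb
    return value


def decode_arb_bw_ctl(value):
    """Return the weight selected by each field of an ARB_BW_CTL value."""
    weights = {}
    for name, lsb, allowed in FIELDS:
        width = 1 if len(allowed) == 2 else 2
        code = (value >> lsb) & ((1 << width) - 1)
        if code >= len(allowed):
            raise ValueError(f"{name}: code {code:02b} is not listed")
        weights[name] = allowed[code]
    return weights


recommended = decode_arb_bw_ctl(0x090A9)
```

Decoding the value 0x090A9 recommended in Section 20.1 yields L2 = 3, VO = 3, L4 = 5, VI = 2, L5 = 3 and L6 = 2 with all other weights at 1, which is consistent with the statement that this setting favors the real-time units over the CPU.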


20.5 ARBITER PROGRAMMING

The PNX1300 arbiter accepts programmable bandwidth

weights to directly control the percentage of bandwidth

allocated to each unit. In the worst case all bandwidth is

used. If not all of the bandwidth is used, then all units

eventually get their desired bandwidth (as the bus be-

comes free) regardless of the weights. However, the

weights still indirectly guarantee each unit a worst-case

latency, which is important for the real-time behavior.

There are two basic types of PNX1300 coprocessor and

peripheral units. The first type is units which have hard

real-time constraints, i.e. VO, VI, AO and AI. To ensure

multimedia functionality, these units must be able to ac-

quire the bus within a fixed amount of time in order to fill

or empty a buffer before it over- or underflows.

The second type, the CPU, PCI, ICP, VLD and DVDD

units, can absorb long latencies but performance is en-

hanced (there are fewer stall cycles or waiting cycles) if

latency is short. The bandwidth requirement is usually known and
depends on the application. In particular, the ICP and the VLD or DVDD
have fixed bandwidth requirements in multimedia applications.

For the PNX1300 DSPCPU, latency is of prime importance. CPU
performance reduces as average latency increases. The design of the
arbiter guarantees that the DSPCPU gets all unused bus bandwidth with
the lowest possible latency. Optimal operation is achieved if the
arbiter is set in such a way that the DSPCPU has the best possible
latency given the required latency and bandwidth of the units active in
the application.

To pick programmable weights and priority raising de-

lays, the following procedure is recommended:

1. Try to keep CPU weight as high as possible through

the remaining steps.

2. Pick weights sufficient to guarantee latency to hard

real-time peripherals (see Section 20.5.1).

3. Pick weights for remaining peripherals in order to give enough
bandwidth to each (see Section 20.5.2). Step 2 above has priority,
because bandwidth can be acquired as the bus becomes free and because
the hard real-time units use a known amount of bandwidth.

4. If latency and bandwidth slack remains, increase pri-

ority raise delays in order to improve average CPU la-

tency.

20.5.1 Latency Analysis

In the following, ceil(X) is the least integral value greater

than or equal to X.

Latency is defined in each real-time unit chapter through

this databook. Refer to the related sections to find out the

latency requirement according to the mode and clock

speed at which the unit is operating.

This latency value has to be larger than the maximum la-

tency Lx (in nanoseconds) guaranteed by the arbiter.

For a unit x the arbiter guarantees a latency of:

Lx = Lx,sc * (SDRAM cycle time in ns)

where

Lx,sc = (Dx * T) + E + ceil(Dx * T / Kd) * K + ceil(16*Rx/C)

is the latency in SDRAM clock cycles.

Latency in CPU clock cycles is defined by:

Lx,cc = ceil(Lx,sc * C)

The symbols are defined as follows:

T = 20 cycles (transaction length, assuming the worst case pattern of
alternating reads and writes).

E = 10 cycles (extra delay in case the first transaction made by the
CPU requires a different bank order to satisfy the critical word
first).

K = 19 cycles (refresh transaction length).

Kd is the programmed refresh interval (see Section 12.11 on page 12-6).

C is the CPU/SDRAM ratio (i.e. 5/4, 4/3, 3/2, 2/1 or 1 as explained in
Section 12.6.2 on page 12-4).

Rx is the priority raise delay of unit x as stored in the ARB_RAISE
MMIO register (Table 20-1). Rx = 0 for units other than VO, VI, PCI or
VLD.

Dx is the worst case number of requests that the arbiter

allows before the request from unit x goes through.

Dx includes the transaction from unit x (the unit which

needs the data) as well as the internal implementation

delays that occur in the transaction.

Dx is derived from the arbiter settings as follows:

$$D_{CPU} = \left\lceil \frac{\mathrm{CPU}_{weight} + \mathrm{L2}_{weight}}{\mathrm{CPU}_{weight}} \right\rceil$$

$$D_{VO} = \left\lceil \frac{\mathrm{VO}_{weight} + \mathrm{L3}_{weight}}{\mathrm{VO}_{weight}} \right\rceil \cdot D_2 + 1$$

$$D_{ICP} = \left\lceil \frac{\mathrm{ICP}_{weight} + \mathrm{L4}_{weight}}{\mathrm{ICP}_{weight}} \right\rceil \cdot D_3 + 1$$

$$D_{VI} = \left\lceil \frac{\mathrm{VI}_{weight} + \mathrm{L5}_{weight}}{\mathrm{VI}_{weight}} \right\rceil \cdot D_4 + 1$$

$$D_{PCI} = \left\lceil \frac{\mathrm{PCI}_{weight} + \mathrm{L6}_{weight}}{\mathrm{PCI}_{weight}} \right\rceil \cdot D_5 + 1$$

$$D_{VLD} = \left\lceil \frac{2+1+1+0+1+1}{2} \right\rceil \cdot D_6 + 1$$

$$D_{AI} = D_{AO} = D_{DVDD} = D_{SPDO} = \left\lceil \frac{2+1+1+0+1+1}{1} \right\rceil \cdot D_6 + 1$$

In the level 6 equations the divisor is the fixed weight of the unit
itself (2 for VLD, 1 for the other units, see Table 20-3).

Where the level factors are:

$$D_2 = \left\lceil \frac{\mathrm{CPU}_{weight} + \mathrm{L2}_{weight}}{\mathrm{L2}_{weight}} \right\rceil$$

$$D_3 = \left\lceil \frac{\mathrm{VO}_{weight} + \mathrm{L3}_{weight}}{\mathrm{L3}_{weight}} \right\rceil \cdot D_2$$

$$D_4 = \left\lceil \frac{\mathrm{ICP}_{weight} + \mathrm{L4}_{weight}}{\mathrm{L4}_{weight}} \right\rceil \cdot D_3$$

$$D_5 = \left\lceil \frac{\mathrm{VI}_{weight} + \mathrm{L5}_{weight}}{\mathrm{L5}_{weight}} \right\rceil \cdot D_4$$

$$D_6 = \left\lceil \frac{\mathrm{PCI}_{weight} + \mathrm{L6}_{weight}}{\mathrm{L6}_{weight}} \right\rceil \cdot D_5$$

As an example, if CPUweight is 3, L2weight is 2, VOweight is 3 and
L3weight is 7, then

• D2 is ceil[(3 + 2) / 2] = 3,

• DVO is ceil[(3 + 7) / 3] * 3 + 1 = 13.

If the CPU/SDRAM ratio is 5/4 (for example, memory frequency is 80 MHz
and CPU frequency is 100 MHz), the refresh interval Kd is 1220 cycles,
and Rx is 2, then the maximum latency for VO is:

• LVO,sc = 13 * 20 + 10 + ceil[13 * 20 / 1220] * 19 + ceil[16 * 2 / (5 / 4)] = 315 SDRAM cycles

• LVO = LVO,sc * 12.5 = 3937.5 ns

Note: Average latency is normally much lower than worst case latency
because only on rare occasions will many units issue requests at
exactly the same time (this is assumed when evaluating the maximum
latency).

Note: All real-time units have a special exception notification flag
that is raised if an overflow or underflow occurs while operating.

Note: To compute the latency Lx when a unit is not enabled, its weight
has to be set to ‘0’ in the D{2,3,4,5,6} equations and in D{AI,AO,VLD}
for AI, AO or VLD.

These equations are not accurate for all the weights, but give an upper
bound of the worst case (which is usually too pessimistic).

A much more accurate number could be found by simulating the arbiter,
e.g. if the settings are: CPUweight=1, L2weight=2, VOweight=1 and
L3weight=1, then

DVO = ceil[(1 + 1) / 1] * ceil[(1 + 2) / 2]

giving 4 requests. But actually the worst case grant requests order is:
CPU, L3, VO, resulting in 3 requests only.

20.5.2 Bandwidth Analysis

In the following, ceil(x) means the least integral value greater than
or equal to x.

Minimum allocated bandwidth, Bx for a unit x, by the arbiter is defined
as follows:

Bx = (Mcycles - Kk) * S / [T * Ex + (16 * Rx / C)]

Where:

Mcycles is the total amount of SDRAM cycles available in a period P in
which the bandwidth is computed. For example, if the period is 1 second
and SDRAM runs at 80 MHz then Mcycles is 80,000,000.

Kk is the amount of SDRAM cycles used by the refresh during the same
period P. If P is in seconds it could be expressed as:

Kk = ceil(4096 * P / .064) * K

For example, if P is 1 second then Kk is
ceil(4096 * 1 / .064) * 19 = 1216000 SDRAM cycles.

S is the size of the transaction on the bus. For PNX1300, S is equal to
64 (bytes).

Ex is the ratio of requests available for a unit x according to the
arbiter settings. It means the unit x will get 1 / Ex out of the total
requests.

Ex is derived from the arbiter settings as follows:

$$E_{CPU} = \frac{\mathrm{CPU}_{weight} + \mathrm{L2}_{weight}}{\mathrm{CPU}_{weight}}$$

$$E_{VO} = \frac{\mathrm{VO}_{weight} + \mathrm{L3}_{weight}}{\mathrm{VO}_{weight}} \cdot E_2$$

$$E_{ICP} = \frac{\mathrm{ICP}_{weight} + \mathrm{L4}_{weight}}{\mathrm{ICP}_{weight}} \cdot E_3$$

$$E_{VI} = \frac{\mathrm{VI}_{weight} + \mathrm{L5}_{weight}}{\mathrm{VI}_{weight}} \cdot E_4$$

$$E_{PCI} = \frac{\mathrm{PCI}_{weight} + \mathrm{L6}_{weight}}{\mathrm{PCI}_{weight}} \cdot E_5$$

$$E_{VLD} = \frac{2+1+1+0+1+1}{2} \cdot E_6$$

$$E_{AI} = E_{AO} = E_{DVDD} = E_{SPDO} = \frac{2+1+1+0+1+1}{1} \cdot E_6$$

In the level 6 equations the divisor is the fixed weight of the unit
itself (2 for VLD, 1 for the other units, see Table 20-3).

Where:

$$E_2 = \frac{\mathrm{CPU}_{weight} + \mathrm{L2}_{weight}}{\mathrm{L2}_{weight}}$$

$$E_3 = \frac{\mathrm{VO}_{weight} + \mathrm{L3}_{weight}}{\mathrm{L3}_{weight}} \cdot E_2$$

$$E_4 = \frac{\mathrm{ICP}_{weight} + \mathrm{L4}_{weight}}{\mathrm{L4}_{weight}} \cdot E_3$$


For example, with the same settings as in the example of Section
20.5.1:

• E2 is (3 + 2) / 2 = 2.5

• EVO is (3 + 7) / 3 * 2.5 = 8.33,

which gives (with Mcycles and Kk expressed in millions of cycles)

• BVO = (80 - 1.216) * 64 / [ 20*8.33 + 16*2 / (5/4) ]

resulting in 26.23 million B/sec, corresponding to 25.01 MB/sec.

Note: In order to compute the bandwidth Bx when a unit is not enabled,
its weight has to be considered as ‘0’ in the E{2,3,4,5,6} equations
and in E{AI,AO,VLD} for AI, AO or VLD.

The maximum amount of requests, Ax, for unit x allowed

during Mcycles period is:

Ax = floor(Bx / S)

Where floor(X) is the greatest integral value less than or

equal to X.

Note: This number does not take into account the worst

case pattern for request acknowledgment. Thus if the pe-

riod is too small Ax is not accurate.
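The worked VO examples of Sections 20.5.1 and 20.5.2 can be reproduced with a short script. The following Python sketch is an illustration under stated assumptions: the function names are hypothetical, the period is passed in milliseconds so that the refresh count (4096 refreshes per 64 ms) is computed without rounding error, and only the VO path of the hierarchy is evaluated.

```python
from math import ceil, floor

T = 20  # transaction length in SDRAM cycles
E = 10  # extra delay for the first CPU transaction
K = 19  # refresh transaction length


def latency_sdram_cycles(d_x, r_x, c, kd):
    """Lx,sc for request count d_x, raise delay field r_x and ratio c."""
    return d_x * T + E + ceil(d_x * T / kd) * K + ceil(16 * r_x / c)


def min_bandwidth(mcycles, period_ms, e_x, r_x, c, s=64):
    """Bx in bytes per period."""
    kk = ceil(4096 * period_ms / 64) * K  # refresh cycles in the period
    return (mcycles - kk) * s / (T * e_x + 16 * r_x / c)


# Settings of the worked example: CPU=3, L2=2, VO=3, L3=7, C=5/4, Rx=2.
cpu_w, l2_w, vo_w, l3_w = 3, 2, 3, 7
c, r_vo = 5 / 4, 2

d2 = ceil((cpu_w + l2_w) / l2_w)              # 3
d_vo = ceil((vo_w + l3_w) / vo_w) * d2 + 1    # 13
l_vo_sc = latency_sdram_cycles(d_vo, r_vo, c, kd=1220)  # 315
l_vo_ns = l_vo_sc * 12.5                      # 12.5 ns per cycle at 80 MHz

e2 = (cpu_w + l2_w) / l2_w                    # 2.5
e_vo = (vo_w + l3_w) / vo_w * e2              # 8.33... (not rounded here)
b_vo = min_bandwidth(80_000_000, 1000, e_vo, r_vo, c)
a_vo = floor(b_vo / 64)                       # Ax: requests per period
```

The latency figures match the text exactly. The bandwidth evaluates to about 26.2 million bytes per second; the small difference from the 26.23 quoted in the text comes from the text rounding EVO to 8.33 before dividing.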

20.6 EXTENDED BEHAVIOR ANALYSIS

The following sections describe the behavior of the PNX1300
arbitration system more accurately.

20.6.1 Extended Bandwidth Analysis

The minimum bandwidth allocation derived from the arbiter settings is
accurate if one of the two following conditions is true:

• The units emit requests all the time (i.e. do back-to-

back requests)

• After a request has been acknowledged, the unit

emits a new request before the new arbitration point.

The arbitration is decided around every 16 cycles.

This time depends on the direction of the transac-

tions (read/write).

In PNX1300, the only unit almost able to sustain back-to-back requests
is the data cache. The other units will post a request and wait for the
data before the next request is posted. This behavior makes the
bandwidth computation:

• almost accurate if the unit is down in the arbiter hier-

archy (true if the units placed above are enabled).

• rather inaccurate if large weights are used for a unit.

Since no back-to-back requests are implemented, the

worst case is that a unit can only get one request out of

three if all the others are asking. This limits the use of

large weights for other units than data cache.

However, some units might be able to catch one request out of two.
This depends on the way requests interleave, since the arbitration
point is dependent on the type of the request (read or write) as well
as on the CPU ratio.

This makes it almost impossible to describe the behavior precisely.

The exact bandwidth necessary for units like VO, VI, AO

or AI are well known (see dedicated sections in each cor-

responding chapter). If the arbiter settings allocate more

bandwidth for these units than they can use, the extra

bandwidth can be used by units that are located below

these units (VO, VI) or at the same level as (AO and AI)

in the arbiter hierarchy.

As an example, with the default settings, VO gets 25% of

the available bandwidth and the CPU gets 50%. If the

SDRAM clock speed is 100 MHz, then 100 MB/sec are

allocated to VO. If VO runs at 27 MHz (NTSC or PAL

mode), then VO will not use all this allocated bandwidth.

Thus any of the units that are below VO in the arbiter hi-

erarchy can potentially use the remaining allocated

bandwidth.

In other words - even if only 10% are allocated to one unit

like the CPU, PCI or the ICP, it may use more.

20.6.2 Extended Latency Analysis

Some units (VO and VI) have a latency/bandwidth re-

quirement and their behavior needs to be simulated in or-

der to find out the correct settings. For example the re-

quirement for VO (in image mode 4:2:2 or 4:2:0 without

upscaling, overlay disabled) is:

• During 128 VO clock cycles, VO block needs to

have 2 requests acked ([2 Ys, one U and one V]/2).

The default value ‘0’ for ARB_BW_CTL leads to a bus al-

location of 50% for CPU, 25% for VO and 25% for L3

blocks.

The worst case arbitration for VO is then: CPU L3 CPU

VO, CPU L3 CPU VO to which the refresh (K), internal

delays (T) and E for the first CPU request need to be

added.

The first VO request will require 129 SDRAM cycles (DVO

= 5 or from the worst case pattern 19 + 10 + 20 + 4 * 20).

The arbitration pattern shows that the following request

will require (in the worst case) an extra 4 * 20 SDRAM cy-

cles. Thus VO clock speed cannot be greater than

61.24% (128 / [129 + 80]) of the SDRAM clock speed.

By changing the settings to 33% for the CPU, 33% for VO and 33% for L3
blocks (i.e. CPUweight = ‘1’, L2weight = ‘2’, VOweight = ‘1’, L3weight
= ‘1’), the limit on the VO/SDRAM clock ratio becomes 75.74%
(128 / [109 + 60]), corresponding to a worst case arbitration pattern
of CPU L3 VO, CPU L3 VO.

Before changing the settings the minimum SDRAM

speed required to run VO at 74.25 MHz (high definition

speed) was 122 MHz. After the new allocation 100 MHz

is fine. Note that here DVO remains equal to ‘5’.
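The two VO/SDRAM ratios and the resulting minimum SDRAM clocks can be checked with a short calculation. This is a sketch that only re-evaluates the numbers quoted in this section; the function names are hypothetical, and the cycle counts are copied from the text rather than derived from the arbiter model.

```python
import math


def vo_sdram_ratio(vo_cycles, first_request_cycles, next_request_cycles):
    """Maximum VO clock as a fraction of the SDRAM clock."""
    return vo_cycles / (first_request_cycles + next_request_cycles)


def min_sdram_mhz(vo_mhz, ratio):
    """Smallest whole-MHz SDRAM clock that satisfies the ratio."""
    return math.ceil(vo_mhz / ratio)


# Default weights: first request 19 + 10 + 20 + 4 * 20 = 129, next one 4 * 20.
default_ratio = vo_sdram_ratio(128, 19 + 10 + 20 + 4 * 20, 4 * 20)
# 33% / 33% / 33% setting: 109 and 60 cycles, as quoted in the text.
tuned_ratio = vo_sdram_ratio(128, 109, 60)

print(round(default_ratio * 100, 2), min_sdram_mhz(74.25, default_ratio))
print(round(tuned_ratio * 100, 2), min_sdram_mhz(74.25, tuned_ratio))
```

With the tuned setting the computed minimum falls just below 100 MHz, which is why a 100 MHz SDRAM is sufficient for the 74.25 MHz VO clock.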

E5 = ceil[(VIweight + L5weight) / L5weight] * E4

E6 = ceil[(PCIweight + L6weight) / L6weight] * E5


When VO is running in image mode 4:2:2 or 4:2:0 without

upscaling and overlay enabled, the requirements be-

come:

• During the first 64 VO clock cycles at least one

request must be acked (the OL (overlay) data).

• During 128 VO clock cycles, VO block requires that

4 requests be acked ([4 OLs, two Ys one V and one

U]/2).

If the settings are 33% for the CPU, 33% for VO and 33%

for L3 blocks then the worst case arbitration pattern is

CPU L3 VO, CPU L3 VO, etc.

The first requirement limits the VO/SDRAM ratio to

(64 / [19 + 10 + 20 + 3 * 20]) = 58.7%.

The second requirement gives a VO/SDRAM ratio of

44.29% (128 / [19 + 10 + 20 + 3 * 20 + 3 * 20 * 3]).

Thus if VO clock speed is supposed to be 54 MHz (pro-

gressive scan) the SDRAM must run at least at 122 MHz.
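The overlay case has two constraints, and the tighter one sets the SDRAM clock. The sketch below re-evaluates the two ratios quoted above; variable names are hypothetical and the cycle terms are copied from the text.

```python
import math

SETUP_CYCLES = 19 + 10 + 20   # leading terms of the worst case pattern
ROUND_CYCLES = 3 * 20         # one CPU / L3 / VO arbitration round

# Requirement 1: one request acked within the first 64 VO clock cycles.
first_ratio = 64 / (SETUP_CYCLES + ROUND_CYCLES)
# Requirement 2: four requests acked within 128 VO clock cycles.
second_ratio = 128 / (SETUP_CYCLES + ROUND_CYCLES + 3 * ROUND_CYCLES)

# The smaller ratio is the binding constraint.
limiting_ratio = min(first_ratio, second_ratio)
min_sdram_for_54mhz = math.ceil(54 / limiting_ratio)

print(round(first_ratio * 100, 1), round(second_ratio * 100, 2), min_sdram_for_54mhz)
```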

By setting the arbiter to 25% for the CPU, 37.5% for VO

and 37.5% for VI (CPUweight = 1, L2weight = 3, VOweight =

1, L3weight = 1, assuming only VO and VI are enabled)

the arbitration pattern becomes CPU VI VO VI CPU VO

VI VO CPU VI VO.

Now both VI and VO are able to catch one request out of

two, thanks to the read / write overlap. This leads to a

VO/SDRAM ratio of 47.5% or a 113 MHz SDRAM.

20.6.3 Raising Priority

If VO is running at 27 MHz (NTSC or PAL) without over-

lay and CPUweight is set to ‘3’ while all the other weights

are set to ‘1’, then the worst case latency derived from

20.5.1 for VO is:

LVO,sc = (ceil[(1 + 1) / 1] + ceil[(3 + 1) / 1] + 1) * 20 + 10 + 19 = 169 SDRAM cycles (assumes RVO = ‘0’).

The latency for VO is 1 request in 64 VO clock cycles. If

SDRAM is running at 80 MHz, then the maximum latency

tolerated by VO is floor(64 / (27 / 80)) = 189 SDRAM cy-

cles.

This means that VO requests can remain at low priority

for 189 - 169 = 20 SDRAM cycles.

If the CPU clock speed is 100 MHz (ratio is 5 / 4) then the

ARB_RAISE register can be programmed to:

floor(20 * (5 / 4) / 16) = 1.

VO requests will stay at low priority for 16 cycles allowing

slightly better average CPU performance.
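The steps of this example can be reproduced numerically. The sketch below simply re-evaluates the expressions given in this section; it does not model the arbiter, and all variable names are illustrative.

```python
import math

# Worst case VO latency for CPUweight = 3 and all other weights = 1,
# using the expression quoted in the text (SDRAM cycles).
latency = (math.ceil((1 + 1) / 1) + math.ceil((3 + 1) / 1) + 1) * 20 + 10 + 19

# VO needs 1 request per 64 VO clock cycles; VO at 27 MHz, SDRAM at 80 MHz.
tolerated = math.floor(64 / (27 / 80))

# SDRAM cycles during which a VO request may stay at low priority.
slack = tolerated - latency

# ARB_RAISE value: convert to CPU cycles (CPU/SDRAM ratio 5/4) and
# divide by 16, as in the text.
arb_raise = math.floor(slack * (5 / 4) / 16)

print(latency, tolerated, slack, arb_raise)
```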

20.6.4 Conclusion

There is no obvious way to set the best weights for laten-

cy or bandwidth allocation since the behavior of each

block cannot be easily described with equations. Practi-

cal results obtained by running applications showed that

once the arbiter is weighted to meet latencies the re-

maining weight settings do not allow much improvement.

The best way to tune the weights is by experiment, run-

ning the application.

The only accurate computation is the maximum worst

case latency, which ensures that the hard real-time units

work properly. This computation gives an upper bound

and can be too pessimistic - but it still gives the right or-

der of magnitude. Refer to Table 20-5 for the recom-

mended allocation method.

Table 20-5. Recommended Allocation Method

Unit         Allocation method
Video In     allocate required latency
Video Out    allocate required latency
Audio In     allocate required latency
Audio Out    allocate required latency
SPDIF Out    allocate required latency
ICP          allocate bandwidth
PCI          allocate bandwidth
VLD          allocate bandwidth/latency
DVDD         allocate bandwidth/latency


Power Management Chapter 21

by Eino Jacobs and Hani Salloum

21.1 OVERVIEW

In this document, the generic PNX1300 name refers

to the PNX1300 Series, or the PNX1300/01/02/11

products.

PNX1300 supports power management in two ways:

• In global power-down mode, most clocks on the chip are shut down and the SDRAM main memory is brought into low-power self-refresh mode. The power

of all on-chip peripheral blocks except for BTI (boot

and I2C blocks), Dcache, Icache, PCI, timers and

VIC blocks is shut off. Some peripherals can be

selectively prevented from participating in the global

power down.

• A block power down mechanism allows power down

of selected peripheral blocks.

21.2 ENTERING AND EXITING GLOBAL POWER DOWN MODE

Power management is software controlled and is initiat-

ed by writing to the MMIO register POWER_DOWN. Dur-

ing execution of this MMIO operation, the system is pow-

ered down without completing the MMIO operation.

When the system wakes up from power down mode, the

MMIO operation is completed.

This means that during program execution on the

DSPCPU the moment of power down is defined exactly:

any instruction before the instruction that contains the

MMIO operation is completed before entering power

down mode. The instruction containing the MMIO operation and all subsequent instructions are completed after

wake up from power down mode.

Wake-up from power down mode is effected by receiving

an interrupt (any interrupt) that passes the acceptance

criteria of the interrupt controller.

There is also wake-up from power down if a peripheral

unit asserts a memory request signal on the highway.

During power down mode the whole chip is powered

down, except the PLLs, the interrupt logic, the timers, the wake-up logic in the MMI, and any logic in the peripheral

units and PCI bus interface that is not participating in the

power down.

Note: Writing to the global POWER_DOWN register (at

offset 0x100108) has no effect on the contents of the

BLOCK_POWER_DOWN register (at offset 0x103428),

and vice versa.

21.3 EFFECT OF GLOBAL POWER DOWN ON PERIPHERALS

The on-chip peripheral units participate in global power down. For selected peripherals this is a programmable option: each of them has a programmable MMIO control bit, the SLEEPLESS bit, that can be used to prevent the unit from participating in the global power down mode. By default every peripheral unit participates in power down.

The following peripheral units have the SLEEPLESS bit:

Video In, Video Out, Audio In, Audio Out, SPDO, SSI,

and JTAG.

The following peripherals do not have the SLEEPLESS

bit and always participate in power down: VLD, boot/I2C

and ICP.

The following peripherals do not participate in global

power down, although they must power themselves

down when they are inactive: VIC, PCI.

When a peripheral does not participate in global power

down, it can still do regular main memory traffic. Every

time a peripheral unit asserts the highway request signal,

the MMI will initiate a wake-up sequence. The CPU must

execute software that initiates a new power down of the

system. This software can be the wait-loop of the RTOS.

Programmer’s note: Since the system is awakened each time there is a transaction on the highway, it can be useful to activate POWER_DOWN mode from a software loop. The activation is then conditional, typically on a global variable that is set by a handler. In that case it is mandatory that there are no interruptible jumps between the time the DSPCPU fetches and compares the value of the global variable and the time it performs the conditional write to the MMIO register (this is the classical semaphore or test-and-set issue). It is therefore recommended to use a separate function that takes the address of the variable as a parameter. This function must then be compiled specifically without interruptible jumps.

The wake-up from power down mode takes approxi-

mately 20 SDRAM clock cycles. This amount of time is

added to the worst case latency for memory requests

compared to the situation when the system is not in pow-

er down mode.


21.4 DETAILED SEQUENCE OF EVENTS FOR GLOBAL POWER DOWN

The sequence of events to power down PNX1300 is as

follows:

• Issue an MMIO write to the POWER_DOWN register.

• The main memory interface (MMI) waits until the completion of the current SDRAM transfer, if there is one still busy.

• The MMI brings SDRAM into the self refresh state,

goes into a wait state, and asserts the global signal

global_power_down.

• All units that participate in the power down, respond

to the global_power_down signal by disabling their

clocks.

• Only the PLL, interrupt controller, timers, wake-up

logic, the PCI bus interface, and any peripherals that

have their SLEEPLESS control bit set continue to

be clocked. The SDRAM clock continues.

• An interrupt is detected by the interrupt controller or a

unit that didn’t participate in the power down requests

a memory transfer.

• The MMI de-asserts the global_power_down signal,

activating all blocks on the chip.

• The MMI recovers SDRAM from self-refresh.

• The MMI causes completion of the MMIO operation

that initiated the power down sequence.

• When software takes an interruptible branch opera-

tion, the interrupt that caused the wake-up will be

serviced (if the wake-up was initiated by an interrupt).

21.5 MMIO REGISTER POWER_DOWN

The register POWER_DOWN has an offset 0x100108 in the MMIO aperture and has no content. Writing to this register initiates global power down mode. Reading from this register returns an undefined value and has no side-effect.

21.6 BLOCK POWER DOWN

This feature is new in PNX1300. It selectively shuts off a

particular block or a set of blocks based on software pro-

gramming.

This type of power down can be used in applications

where certain blocks will never participate in the opera-

tion of the chip. The objective of having this type of power

down is saving on power consumption.

Each peripheral unit which can participate in the global

power down can be selectively powered down.

This is done by setting a control bit in MMIO register

BLOCK_POWER_DOWN specifically for the block. The

BLOCK_POWER_DOWN register is located at MMIO

offset 0x103428. See Figure 21-1 below.

Setting a particular bit to ’1’ in this register has the effect

of shutting off the corresponding block. Writing ’0’ to this

bit enables the power for the block again.

A block should not be powered down while it is active. The block’s enable bit should be set to ‘0’ before the block is powered down.

Note: The unassigned bits of this register have to be written to ‘0’ and read as ‘0’.

Note: Writing to the global POWER_DOWN register (at

offset 0x100108) has no effect on the contents of the

BLOCK_POWER_DOWN register (at offset 0x103428),

and vice versa.

Figure 21-1. Power down register BLOCK_POWER_DOWN

[Register diagram: BLOCK_POWER_DOWN (r/w), MMIO_base offset 0x10 3428. One power down control bit each for SPDO, DVDD, AO, AI, EVO, VI, SSI, VLD and ICP.]


PCI-XIO External I/O Bus Chapter 22

By David Wyland

22.1 SUMMARY FUNCTIONALITY

In this document, the generic PNX1300 name refers

to the PNX1300 Series, or the PNX1300/01/02/11

products.

The PNX1300 PCI-XIO bus allows glueless connection

to PCI peripherals, 8-bit microprocessor peripherals and

8-bit memory devices. All these device types can be in-

termixed in a single PNX1300 system.

The PCI-XIO bus provides the following features:

• All PCI 2.1 features (32-bit, 33 MHz)

• Simple, non-multiplexed, 8-bit data, 24-bit address

XIO bus with control signals for 68K and x86 style

devices

• Glueless connection to ROM, EPROM, flash

EEPROM, UARTs, SRAM, etc.

• Programmable internal or external bus clock source

• 0-7 programmable wait states for XIO devices

• Support for single byte read, single byte write, DMA

read or DMA write

• The 16 MB of XIO device space is visible as 16

MWords (64 MBytes) in the DSPCPU memory map

22.1.1 Description

The XIO logic that implements the protocol for 8-bit devices appears as an on-chip PCI target device to the rest of the PNX1300. It only responds when it is addressed by the PNX1300 as initiator and never responds to external

PCI masters. When it is addressed by the PNX1300 as

an initiator, it responds to the PNX1300 PCI BIU as a nor-

mal slave device, activating PCI_DEVSEL#.

The XIO logic serves as a bridge between the PCI bus

and XIO devices such as ROMs, flash EPROMs and I/O

device chips. The PNX1300 addresses XIO devices on

the PCI-XIO bus in the same way as registers or memory

in any other PCI slave device. The XIO logic supplies the

PCI_TRDY# signals to the PCI bus and also supplies the

chip-select, read, write and data-strobe signals to XIO

devices attached to the PCI-XIO Bus. A conceptual-only

block diagram of the PCI-XIO Bus is shown in

Figure 22-2. The real hardware uses the PCI_AD[0:31]

signals and PCI_C/BE#[0:3] signals for both PCI and

XIO devices, as shown in Figure 22-3.

The XIO logic is activated when the Enable bit in the

XIO_CTL register is asserted and whenever the

PNX1300 (as initiator) addresses the PCI-XIO bus ad-

dress range, as defined by a 6-bit address field in the XIO

Bus Control Register. This 6-bit field defines the 6 most

significant bits of the XIO Bus address space. When the

PNX1300 sends out an address as an initiator, the upper

6 bits of the address are compared with this field. If they

match, the PCI-XIO bus logic is activated. The

PCI_INTB# output is asserted to indicate that the PCI-

XIO Bus is active. It becomes active at PCI data phase

time. When XIO is enabled, the PCI_INTB# signal be-

comes dedicated as XIO bus chip-select, and turns from

an open-drain output into a normal logic output.

PCI_INTB# serves as a global chip select for all XIO Bus

chips. When XIO is disabled, PCI_INTB# is available for

PCI-specific use or as a general purpose software I/O pin

with open-drain behavior as in TM-1000.

The Address field bits in the XIO Bus Control register

serve as a base address register in PCI terms. The XIO

Bus Control register is not a PCI configuration register. It

does not need to be a PCI configuration register because

the PCI-XIO Bus can only be addressed by the

PNX1300. It will not respond to requests by any other ex-

ternal PCI device.

When the XIO-PCI Bus controller logic is activated, it

generates PCI_DEVSEL# as a response to the PCI bus.

When PCI_IRDY# has been received from the BIU, it asserts an external PCI_INTB# signal as the global chip select. It also reconfigures the PCI address/data pins for 8-bit byte transfers. When the PCI-XIO Bus is active, the

lower 24 bits of the external 32-bit PCI bus are used to

output a 24-bit address for all transfers, read or write.

The upper 8 bits of the external PCI bus are unchanged

and transfer data normally. This is shown in Figure 22-3.

The 24-bit address on the XIO Bus pins is the word ad-

dress for the PCI transfer, which is the lower 26 bits of

the PCI transfer address with the two least significant bits

ignored. One word is transferred to or from the PCI bus

for each byte read or written on the XIO bus. In writes to

the XIO bus, a 32-bit word is transferred from the PCI

BIU to the XIO Bus controller, but the lower 24 bits and

the PCI byte enables are ignored. In reads from the XIO bus, a 32-bit word is transferred from the XIO Bus controller to the PCI BIU with the data in the upper 8 bits and

the 24-bit address in the lower 24 bits. Note that the 24-

bit address returned in a read is the lower 26 bits of the

PCI transfer address with the two least significant bits

truncated. For example, a PCI transfer address of 44

hexadecimal would return a value of 11 hexadecimal as

the lower 24 bits of the 32-bit data in a read. The 24-bit

XIO Bus address is generated by an address counter in

the XIO Bus controller. This counter is loaded with the

PCI word address at PCI frame time at the start of the


PCI transfer and is incremented for each PCI word transferred.
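The word format described above can be illustrated with a few bit operations. This is a sketch of the data layout only, not of any driver API; the function names are hypothetical.

```python
def xio_address_from_pci(pci_addr):
    """24-bit XIO address: lower 26 bits of the PCI transfer address
    with the two least significant bits dropped."""
    return (pci_addr & 0x03FFFFFF) >> 2


def xio_read_word(data_byte, pci_addr):
    """32-bit word seen on a read: data in bits 31:24, XIO address in 23:0."""
    return ((data_byte & 0xFF) << 24) | xio_address_from_pci(pci_addr)


def data_byte_from_word(word):
    """Extract the XIO data byte from the upper 8 bits of a read word."""
    return (word >> 24) & 0xFF


# Example from the text: PCI transfer address 0x44 gives 0x11 in the lower 24 bits.
word = xio_read_word(0xAB, 0x44)   # 0xAB is an arbitrary example data byte
print(hex(xio_address_from_pci(0x44)), hex(word), hex(data_byte_from_word(word)))
```

Because only the upper byte carries data, software that reads a block over XIO has to pack four such words into one 32-bit word of real data.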

The XIO Bus does not generate parity during XIO Bus

write transfers or check parity during XIO Bus read transfers. This allows the XIO Bus to interface to standard 8-bit devices without having to add parity-generation and

check logic. While the XIO Bus is active, the XIO Bus log-

ic inhibits parity checking and drives the PCI Parity and

Parity Error pins so that they do not float.

Word transfer is used to transfer the bytes to and from

the PCI bus for hardware simplicity. The primary intend-

ed use of the PCI-XIO Bus is for slow devices, ROMs,

flash EPROMs and I/O. Because the PCI-XIO bus is so

much slower than the PNX1300, there is time available

for the PNX1300 to pack and unpack the words. In the

case of ROMs and flash EPROMs, the data is typically

compressed, requiring the PNX1300 CPU to both un-

pack and decompress the data.

The PCI-XIO Bus Controller logic reconfigures the byte

enables as control signals for the attached XIO Bus chips

during XIO Bus transfers. It also drives the PCI_TRDY#

signal to the PCI Bus for each transfer. The PCI Bus byte

enables are reconfigured to generate XIO Bus timing sig-

nals: Read (IORD), Write (IOWR) and Data Strobe (DS).

These signals allow ROM, flash EPROM, 68K and x86

devices to be gluelessly interfaced to the XIO Bus. For a single device, the PCI_INTB# line is used as the global chip enable. If more than one device is to be added, an external decoder, such as a 74FCT138, can be used to decode the upper bits of the 24-bit transfer address, with the PCI_INTB# line used as a global chip enable to the decoder.

Figure 22-1. Partial PNX1300 chip block diagram

[Block diagram: DSPCPU (400 MIPS, 2.5 GOPS), Video In, Video Out, Audio In, Audio Out, I2C Interface, Synchronous Serial I/F, JTAG, Image Coprocessor, VLD Assist, MMI and the PCI and External I/O (PCI-XIO) Bus Interface on the SDRAM highway. The MMI connects to SDRAM (32-bit data). The PCI-XIO Bus (AD[31:0]) connects to PCI I/O devices, XIO I/O devices and a glueless flash EPROM interface.]

The PCI-XIO Bus controller has a wait state generator to provide timing for slow devices. The wait state generator allows the addition of up to 7 wait states for slow chip access and write times. The wait state generator logic generates the PCI_TRDY# signal to the PCI bus.

The XIO Bus controller contains a clock generator for standalone systems. The PCI-XIO Bus uses the PCI clock. This clock is normally supplied by a PCI Bus central resource outside the PNX1300 chip. In standalone or low-cost systems, the internal clock generator can be used. The internal clock generator divides the PNX1300 highway clock by a 5-bit number in a prescaler. This allows setting bus clocks from 4 MHz to 66 MHz in a 133 MHz system. The internal clock generator programming is described in Section 22.5, “XIO_CTL MMIO Register.”

22.2 BLOCK DIAGRAM

Figure 22-2 shows a conceptual block diagram of the

PCI-XIO Bus as a slave device on the PCI Bus. The XIO

Bus Controller generates an XIO Bus, which is an 8-bit

bus with a 24-bit address. Devices attached to the XIO

Bus appear as memory locations in the 16 MB address

space of the XIO Bus.

Figure 22-3 shows an implementation block diagram of

the PCI_XIO Bus. To conserve pins, the XIO Bus Con-

troller uses the PCI I/O pins as XIO Bus pins during XIO

Bus data transfers. It reconfigures the 32 PCI address/

data pins as 8 XIO Bus data pins and 24 XIO Bus ad-

dress pins, and it reconfigures the byte enable pins as

XIO Bus timing signals. By changing the functions of the

pins during the transfer, 36 pins are saved which would

otherwise be required to drive the XIO Bus devices. By

reconfiguring the PCI pins only during the data phase of

the XIO Bus transfers, the PCI-XIO bus retains its PCI

Bus compatibility.

Figure 22-4 shows a more detailed block diagram of the

PCI-XIO Bus controller.

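Figure 22-2. PCI-XIO bus device CONCEPTUAL block diagram

[Block diagram: inside the PNX1300, the PCI Bus Interface Unit (BIU) connects the SDRAM data highway to the PCI Bus, which serves the PCI devices and the PCI host. The XIO Bus Controller sits on the PCI Bus and drives the XIO Bus (8-bit data + 24-bit addresses) to devices such as a ROM and an x86 device. For address and data, the PCI Bus and the XIO Bus use the same pins/wires.]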


Figure 22-3. PCI-XIO Bus device implementation block diagram

[Block diagram: inside the PNX1300, the PCI Bus Interface Unit (BIU), the XIO Bus Controller and a mux sit between the SDRAM data highway and the pins. PCI_AD[31:0] goes to the PCI devices and the PCI host. On the XIO Bus, PCI_AD[23:0] and PCI_AD[31:24] go to the ROM, x86 device, etc. PCI_INTB# = XIO Bus Active As Target.]

Figure 22-4. PCI-XIO Bus interface controller block diagram

[Block diagram: the PCI Bus Interface Unit (BIU, PNX1300 initiator) and the PCI-XIO Bus Controller between the SDRAM data highway and the PCI-XIO Bus. Controller blocks: XIO Config Reg, Clock, Bus Timing, XIO Controls + Wait States, and a mux on the address/data path (Data In [31:0], Data Out [31:24], Data Out [23:0], Address [31:24], Address [23:0]). Pins: PCI_AD[31:24], PCI_AD[23:00], PCI_CLK, PCI_C/BE0#: IORD#, PCI_C/BE1#: IOWR#, PCI_C/BE2#: DS#, PCI_C/BE3#, PCI_INTB# = Chip Enable, PCI_INTA#, INTC#, INTD#, PCI_TRDY# and PCI_DEVSEL# (each ORed between BIU and controller), PCI Controls (Frame, etc.), PCI_REQ#, PCI_GNT#. Note: tie REQ to GNT for the stand alone (no host) case.]


22.3 DATA FORMATS

The data transfer formats for the PCI-XIO bus are shown in Figure 22-5. The 8-bit data field is the data transferred

to or from the PCI-XIO Bus. The read address is the 24-

bit address on the PCI-XIO Bus address lines when the

read transfer takes place.

22.4 INTERFACE

22.4.1 PCI-XIO Bus Interface Design

The PCI-XIO Bus can accommodate a variety of different

devices and bus protocols. The following are examples

of devices interfaced to the PCI-XIO Bus.

                         Bits 31:24   Bits 23:0
Read: XIO Bus to PCI     Data         Read Address
Write: PCI to XIO Bus    Data         Unused

Figure 22-5. PCI-XIO Bus data formats

Table 22-1. PCI-XIO Bus signal definitions

PNX1300 PCI Signal | Pins | I/O | PCI Function | XIO Function
PCI_INTB# | 1 | O | | PCI-XIO Bus Enable = XIO Bus Active As Target Device
PCI_AD[23:0] | 24 | I/O | PCI Address/Data | Address bus: 16 MB
PCI_AD[31:24] | 8 | I/O | PCI Address/Data | Data bus: 8 bits
PCI_PAR | 1 | O | Even Parity for AD & C/BE |
PCI_C/BE0# | 1 | | Command/Byte Enables | IORD# = Read Enable
PCI_C/BE1# | 1 | | Command/Byte Enables | IOWR# = Write Enable
PCI_C/BE2# | 1 | | Command/Byte Enables | DS# = Data Strobe
PCI_C/BE3# | 1 | | Command/Byte Enables | unused
PCI_CLK | 1 | I/O | 33 MHz PCI Clock: can optionally be generated by PNX1300 on board osc |
PCI_FRAME# | 1 | I/O | PCI Address/Command Strobe + Transfer In Progress |
PCI_DEVSEL# | 1 | I/O | Device Select Valid | Asserted by PNX1300 = XIO Active
PCI_IRDY# | 1 | I/O | Initiator Ready = Transfer In Progress |
PCI_TRDY# | 1 | I/O | Target Ready | Asserted by PNX1300 = XIO Transfer Timing
PCI_STOP# | 1 | I/O | Target Requests Stop of Transaction |
PCI_IDSEL# | 1 | I | Chip Select for PCI Config Writes |
PCI_REQ# | 1 | O | PNX1300 Requesting PCI Bus |
PCI_GNT# | 1 | I | PNX1300 Is Granted PCI Bus |
PCI_PERR# | 1 | I | Parity Error to PNX1300 |
PCI_SERR | 1 | O | System Error from PNX1300 |
PCI_INTA# | 1 | I/O | General Purpose I/O |
PCI_INTB# | 1 | I/O | General Purpose I/O | XIO Bus Active = Global Chip Select
PCI_INTC# | 1 | I/O | General Purpose I/O |
PCI_INTD# | 1 | I/O | General Purpose I/O |

Command/Byte Enables: on XIO read, BE[3:0] = 0110b’4; on XIO write, BE[3:0] = 0111b’4.


22.4.1.1 Flash EEPROM

Figure 22-6 shows an 8-bit flash EEPROM interfaced to

the PCI-XIO Bus. Examples of these devices are the Mi-

cron MT28F200C1 and the AMD 29LV400.

22.4.1.2 68K Bus I/O device

Figure 22-7 shows a 68K bus I/O device interfaced to the

PCI-XIO Bus. Example devices are the Motorola

MC68HC681 DUART and the MC68HC901 Multi-Func-

tion Peripheral.

22.4.1.3 x86/ISA Bus I/O device

Figure 22-8 shows an x86 or ISA bus I/O device inter-

faced to the PCI-XIO Bus. An example device is the Intel

82091 Advanced Integrated Peripheral (AIP).

22.4.1.4 Multiple Flash EEPROM

Figure 22-9 shows two 8-bit flash EEPROMs interfaced

to the PCI-XIO Bus. A 74FCT138 logic chip decodes upper bits PCI_AD[19-17] of the XIO bus address to generate the chip selects for the two EEPROMs. These bits

decode the address space into blocks of 128 KB. The ad-

dress range of each enable is shown on the enable lines.

Six spare chip selects are available for attaching up to six

more EEPROMs or to attach other devices. The

74FCT138 provides both decode of the address bits and the AND function for the PCI_INTB# global chip enable signal so that only one EEPROM chip enable signal is active at global chip enable time.

Figure 22-6. 8-bit Flash EEPROM Interface

[Connection diagram, PNX1300 signal to 128Kx8 EEPROM pin:
PCI_AD[16:0] to Address
PCI_C/BE1#: IOWR# to Write Enable
PCI_C/BE0#: IORD# to Output Enable
PCI_INTB# to Chip Select
PCI_AD[31:24] to Data]

Figure 22-7. 8-bit 68K Bus Device Interface

[Connection diagram, PNX1300 signal to 68K Bus Device pin:
PCI_AD[23:0] to Address
PCI_C/BE1#: IOWR# to R/W#
PCI_C/BE2: DS# to DS#
PCI_INTB# to Chip Select
PCI_AD[31:24] to Data
PCI_CLK to CLK]

Figure 22-8. 8-bit x86 / ISA Bus Device Interface

[Connection diagram, PNX1300 signal to x86 or ISA Bus Device pin:
PCI_AD[23:0] to Address
PCI_C/BE0#: IORD# to I/O Read Enable
PCI_C/BE1#: IOWR# to I/O Write Enable
PCI_INTB# to Chip Select
PCI_AD[31:24] to Data
PCI_CLK to BALE]
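The 74FCT138 decode described in Section 22.4.1.4 amounts to selecting a 128 KB block with three address bits, gated by the global chip enable. The following sketch models that selection in software for illustration; the function names are hypothetical and the model ignores signal polarity and timing.

```python
BLOCK_BYTES = 128 * 1024  # each decoder output covers 128 KB


def decoder_output(xio_addr, intb_active=True):
    """Index (0..7) of the decoder output selected by PCI_AD[19:17].

    Returns None when the PCI_INTB# global chip enable is inactive,
    modelling the AND function provided by the decoder enable.
    """
    if not intb_active:
        return None
    return (xio_addr >> 17) & 0x7


def block_range(index):
    """First and last XIO byte address covered by a decoder output."""
    start = index * BLOCK_BYTES
    return start, start + BLOCK_BYTES - 1


print(decoder_output(0x00000), decoder_output(0x20000), decoder_output(0xFFFFF))
print([hex(a) for a in block_range(1)])
```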

22.5 XIO_CTL MMIO REGISTER

The PCI-XIO Bus Controller has one programmer visible

MMIO register: XIO_CTL. Its format is shown in

Table 22-2. To ensure compatibility with future devices,

any undefined MMIO bits should be ignored when read,

and written as ‘0’s.

22.5.1 PCI_CLK Bus Clock Frequency

PCI_CLK, the clock for the PCI and PCI-XIO bus, can be supplied externally or internally. This is determined at boot time by the ‘enable internal PCI_CLK generator’ bit, bit 6 of byte 9 in the boot EEPROM. Refer to Section 13.2 on page 13-2. If this bit = ‘0’, PCI_CLK is compatible with TM-1000 and normal PCI operation, i.e. PCI_CLK is an input pin that takes the PCI clock from the external world. If this bit = ‘1’, an on-chip clock divider in the XIO logic becomes the source of PCI_CLK, and the PCI_CLK pin is configured as an output. In the latter case, the PCI_CLK frequency can be programmed to a fraction of the PNX1300 highway clock by setting the XIO_CTL register ‘Clock Frequency’ divider value.

Table 22-2. XIO_CTL Register Fields: MMIO Address 0x10 3060

Field            Bits    Function                  Reset Value
Address          31:26   XIO address space         undefined
                 25:11   unused                    0
Wait States      10:8    Wait states               0
Enable           7       Enable XIO Bus operation  0 = disabled
                 6:5     unused
Clock Frequency  4:0     Clock divider             0x1f
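The field layout of Table 22-2 can be expressed as a pair of pack/unpack helpers. This is a sketch of the bit arithmetic only: the function names are hypothetical, the example field values are arbitrary, and nothing here performs an actual MMIO access. Unused bits are written as ‘0’, as this section requires.

```python
def pack_xio_ctl(address_msbs, wait_states, enable, clock_field):
    """Pack the XIO_CTL fields into a 32-bit register value."""
    if not 0 <= address_msbs <= 0x3F:
        raise ValueError("Address field is 6 bits (31:26)")
    if not 0 <= wait_states <= 7:
        raise ValueError("Wait States field is 3 bits (10:8)")
    if not 1 <= clock_field <= 0x1F:
        # Table 22-3 lists a Clock Frequency value of 0 as illegal.
        raise ValueError("Clock Frequency field must be 1..31")
    return ((address_msbs << 26) | (wait_states << 8)
            | (int(bool(enable)) << 7) | clock_field)


def unpack_xio_ctl(value):
    """Split a 32-bit XIO_CTL value into its fields, ignoring unused bits."""
    return {
        "address": (value >> 26) & 0x3F,
        "wait_states": (value >> 8) & 0x7,
        "enable": (value >> 7) & 0x1,
        "clock_field": value & 0x1F,
    }


# Arbitrary example: XIO space at the top of the address map, 7 wait states.
example = pack_xio_ctl(address_msbs=0x3F, wait_states=7, enable=True, clock_field=3)
print(hex(example), unpack_xio_ctl(example))
```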

Figure 22-9. Multiple 8-bit Flash EEPROM Interface

[Connection diagram: two 128Kx8 EEPROMs share Address (PCI_AD[16:0]), Write Enable (PCI_C/BE1#: IOWR#), Output Enable (PCI_C/BE0#: IORD#) and Data (PCI_AD[31:24]). A 74FCT138, with inputs A[2-0] driven by PCI_AD[19-17] and PCI_INTB# as chip select, generates the EEPROM chip selects. Its eight outputs, starting at O0, cover the address ranges 0-128K, 128-256K, 256-384K, 384-512K, 512-640K, 640-768K, 768-896K and 896-1024K.]

Table 22-3. PCI_CLK frequencies for 133.0 MHz PNX1300 highway clock

Clock Frequency    PNX1300   PCI-XIO Clock   PCI-XIO Clock
(use odd values)   Clocks    Period, ns      Frequency, MHz
0                  illegal   illegal         illegal
1                  2         15              66.5
2                  3         22.5            44.33
3                  4         30              33.25
...                ...       ...             ...
30                 31        233             4.29
31                 32        241             4.16


A table of PCI-XIO Bus Clock frequencies versus Clock

field values is shown in Table 22-3. Note that the

PCI_CLK operating frequency should be set to observe

the frequency limits given in the AC/DC timing character-

ization data for PNX1300. Odd values of ‘Clock Frequen-

cy’ are recommended, resulting in an even divider, which

generates a 50% duty cycle PCI_CLK.
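The relationship in Table 22-3 is that a ‘Clock Frequency’ value N divides the highway clock by N + 1. A minimal sketch of that calculation follows; the function names are hypothetical.

```python
def pci_clk_mhz(highway_mhz, clock_field):
    """PCI_CLK frequency for a 'Clock Frequency' field value (1..31)."""
    if not 1 <= clock_field <= 31:
        raise ValueError("Clock Frequency must be 1..31 (0 is illegal)")
    return highway_mhz / (clock_field + 1)


def gives_even_divider(clock_field):
    """Odd field values give an even divider, hence a 50% duty cycle."""
    return (clock_field + 1) % 2 == 0


for n in (1, 2, 3, 30, 31):
    print(n, n + 1, round(pci_clk_mhz(133.0, n), 2), gives_even_divider(n))
```

For a 133.0 MHz highway clock, a field value of 3 yields the usual 33.25 MHz PCI clock with an even divider.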

22.5.2 Wait State Generator

The XIO Bus controller has an automatic wait state gen-

erator to allow for read and write cycle times of devices

on the XIO bus.

22.6 PCI-XIO BUS TIMING

The timing for the PCI-XIO bus is shown below: Note that

the ‘fat’ lines indicate active drive by PNX1300. Thin lines

indicate areas where the PNX1300 is not actively driving.

(In these areas, pull-up resistors retain the signal high for

control signals, PCI_AD lines are left floating.)

Figure 22-10 shows the timing for a single byte read

transfer. Figure 22-11 shows the timing for a single byte

read transfer with wait states. Figure 22-14 shows the

timing for a DMA burst read transfer of 2 bytes, and

Figure 22-16 shows the timing for a DMA burst write

transfer of 2 bytes. The DMA burst transfers are shown

at maximum rate, with zero wait states. DMA burst trans-

fers with wait states insert wait states between the trans-

fers. In the read case, the IORD# enable and DS# are ex-

tended by the wait states. In the write case, the IOWR#

enable and DS# are delayed by the wait states.

Table 22-4. Wait state generator codes

Code Wait States

... ...

Figure 22-10. PCI-XIO Bus timing: single byte read, 0 wait states

[Timing diagram. Signals: PCI_CLK, PCI_FRAME#, PCI_IRDY#, PCI_TRDY#, PCI_DEVSEL#, PCI_AD[23:0] (ADDR: PCI Address, then XIO Addrs), PCI_AD[31:24] (DATA: PCI Address, then Read Data), PCI_INTB#/CE#, PCI_C/BE2#/DS#, PCI_C/BE1#/IOWR#, PCI_C/BE0#/IORD# (PCI Command during frame time). Phases: Frame Time, Bus Turnaround & Address Setup, XIO Transfer, Bus Idle. The read sample point is marked.]


Figure 22-11. PCI-XIO Bus timing: single byte read, 1 or more wait states

[Timing diagram. Same signals as Figure 22-10. Phases: Frame Time, Bus Turnaround & Address Setup, Wait (k times), XIO transfer. The read sample point is marked.]

Figure 22-12. PCI-XIO Bus timing: single byte write, 0 wait states

[Timing diagram. Same signals as Figure 22-10, with PCI_AD[31:24] carrying the PCI Address and then the XIO Data. Phases: Frame Time, Write Cycle, Data hold time, Bus Idle.]


Figure 22-13. PCI-XIO Bus timing: single byte write, 1 or more wait states

[Timing diagram. Same signals as Figure 22-10, with PCI_AD[31:24] carrying the PCI Address and then the XIO Data. Phases: Frame Time, Wait (k), Write cycle, Data Hold time, Bus Idle.]

Figure 22-14. PCI-XIO Bus timing: DMA burst read, 2 bytes, 0 wait states

[Timing diagram. Same signals as Figure 22-10, with PCI_AD[23:0] carrying the PCI Address, XIO Addrs 1 and XIO Addrs 2, and PCI_AD[31:24] carrying the PCI Address, Read Data 1 and Read Data 2. Phases: Frame Time, Bus Turnaround & Address Setup, XIO Data 1, XIO Data 2, Bus Idle. The read sample points are marked.]


Figure 22-15. PCI-XIO Bus timing: DMA burst read, 2 bytes, 1 or more wait states

[Timing diagram. Signals: PCI_CLK, PCI_FRAME#, PCI_IRDY#, PCI_TRDY#, PCI_DEVSEL#, PCI_AD[23:0] (ADDR: PCI Address, XIO Address 1, XIO Address 2), PCI_AD[31:24] (DATA: PCI Address, Read Data 1, Read Data 2), PCI_INTB#/CE#, PCI_C/BE2#/DS#, PCI_C/BE1#/IOWR#, PCI_C/BE0#/IORD# (each carrying the PCI Command during the address phase). Phases: Frame, Turn, wait(k), data 1, wait(k), data 2. Annotation: Read Sample Points.]

Figure 22-16. PCI-XIO Bus timing: DMA burst write, 2 bytes, 1 or more wait states

[Timing diagram. Signals: PCI_CLK, PCI_FRAME#, PCI_IRDY#, PCI_TRDY#, PCI_DEVSEL#, PCI_AD[23:0] (ADDR: PCI Address, XIO Address 1, XIO Address 2), PCI_AD[31:24] (DATA: PCI Address, XIO Data 1, XIO Data 2), PCI_INTB#/CE#, PCI_C/BE2#/DS#, PCI_C/BE1#/IOWR#, PCI_C/BE0#/IORD# (each carrying the PCI Command during the address phase). Phases: Frame, data 1, wait(k), hold, data 2, wait(k), hold, idle.]


22.7 PCI-XIO BUS CONTROLLER OPERATION AND PROGRAMMING

The PCI-XIO Bus is a PCI target device. All valid PCI transfers with PNX1300 as the initiator are allowed, including single word and DMA transfers. When data is read from the PCI-XIO Bus, it reads as a 32-bit word with the 8 bits of data as the most significant byte and the 24-bit XIO Bus transfer address as the least significant bytes. When data is written to the PCI-XIO Bus, it is written as a word, but only the most significant byte of the data is transferred to the bus. The lower 24 bits are ignored as they are replaced by the lower 24 bits of the transfer address before being placed on the bus.
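The word layout described above can be sketched in C as follows. This is a minimal illustration only; the helper names (xio_word_data, xio_word_addr, xio_make_write_word) are hypothetical and are not part of any PNX1300 software library.

```c
#include <stdint.h>

/* Data byte: bits 31..24 of a word read from the PCI-XIO Bus. */
static inline uint8_t xio_word_data(uint32_t word)
{
    return (uint8_t)(word >> 24);
}

/* 24-bit XIO Bus transfer address: bits 23..0 of a word read. */
static inline uint32_t xio_word_addr(uint32_t word)
{
    return word & 0x00FFFFFFu;
}

/* Word to write: the data goes in the most significant byte.
 * The low 24 bits are left at zero here because the text states
 * that they are ignored and replaced by the transfer address. */
static inline uint32_t xio_make_write_word(uint8_t data)
{
    return (uint32_t)data << 24;
}
```

For instance, a word read back as 0xA5000123 would carry the data byte 0xA5 from XIO address 0x000123.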

Before the PCI-XIO Bus can be used, the PCI-XIO Bus Control Register must be set up. This register must be loaded with the base address for the PCI-XIO Bus and the control fields for clock frequency, wait states per transfer and PCI-XIO Bus enable.

To read a single byte from a PCI-XIO Bus device, first define the 24-bit address for the device. This might be the address in an EPROM for the desired byte. Multiply this device address by four to convert it to a word address and add the XIO Bus base address. The combined address is the PCI transfer address. Use this address as the transfer address for a single word DSPCPU load. Table 22-5 shows examples of this address conversion. At the completion of the load, the data received will consist of 8 bits of data and the 24-bit device address. To write a byte, use the same transfer address and write a word to this address with the desired data as the most significant byte of the word written.
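As a sketch of the address conversion and the single-byte access described above, the following C fragment may help. All function names are hypothetical, and the volatile pointer accesses assume the PCI-XIO aperture is directly addressable by the DSPCPU at the computed transfer address; only xio_transfer_address is pure arithmetic.

```c
#include <stdint.h>

/* PCI transfer address = XIO base address + 4 * (24-bit device address). */
uint32_t xio_transfer_address(uint32_t xio_base, uint32_t xio_addr)
{
    return xio_base + ((xio_addr & 0x00FFFFFFu) << 2);
}

/* Single-byte read: one word load, data is the most significant byte.
 * Assumption: the transfer address can be dereferenced directly. */
static inline uint8_t xio_read_byte(uint32_t xio_base, uint32_t xio_addr)
{
    volatile uint32_t *p =
        (volatile uint32_t *)(uintptr_t)xio_transfer_address(xio_base, xio_addr);
    return (uint8_t)(*p >> 24);
}

/* Single-byte write: one word store with the data in the top byte. */
static inline void xio_write_byte(uint32_t xio_base, uint32_t xio_addr,
                                  uint8_t data)
{
    volatile uint32_t *p =
        (volatile uint32_t *)(uintptr_t)xio_transfer_address(xio_base, xio_addr);
    *p = (uint32_t)data << 24;
}
```

With a base address of 0x58000000, the three rows of Table 22-5 correspond to xio_transfer_address(0x58000000, 0x11), (0x58000000, 0x0123) and (0x58000000, 0x110012).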

To transfer data between the PCI-XIO Bus and the SDRAM using the PCI DMA capability, set the SRC_ADR or the DEST_ADR register to the PCI-XIO Bus transfer address, depending on the direction of the transfer. The PCI-XIO Bus transfer address is four times the starting address as seen on the PCI-XIO Bus address pins plus the PCI-XIO Bus controller base address. This is the starting address for the PCI-XIO Bus transfer. Set the other address, destination or source, to the desired starting address in SDRAM. Set the PCI_DMA_CTL register for the desired direction and set the transfer count to four times the number of PCI-XIO Bus bytes to be transferred. The transfer count is four times the number of PCI-XIO Bus bytes because the PCI-XIO Bus transfers one word to or from the PCI bus for each byte transferred to or from devices on the PCI-XIO Bus.
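The DMA parameter calculation above can be sketched as a small C helper. This is an illustration under stated assumptions: the struct xio_dma_params and the function xio_dma_setup are hypothetical, the encoding of the direction in PCI_DMA_CTL is not modeled, and writing the values to the actual MMIO registers is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values to be loaded into the DMA registers (hypothetical container). */
typedef struct {
    uint32_t src_adr;    /* value for SRC_ADR */
    uint32_t dest_adr;   /* value for DEST_ADR */
    uint32_t count;      /* transfer count: 4 per PCI-XIO Bus byte */
} xio_dma_params;

xio_dma_params xio_dma_setup(uint32_t xio_base, uint32_t xio_start,
                             uint32_t sdram_addr, uint32_t n_xio_bytes,
                             bool xio_to_sdram)
{
    /* PCI-XIO transfer address: base + 4 * starting XIO address. */
    uint32_t xio_pci = xio_base + ((xio_start & 0x00FFFFFFu) << 2);
    xio_dma_params p;

    p.src_adr  = xio_to_sdram ? xio_pci : sdram_addr;
    p.dest_adr = xio_to_sdram ? sdram_addr : xio_pci;
    p.count    = 4u * n_xio_bytes;   /* one word moved per XIO byte */
    return p;
}
```

Reading 256 bytes starting at XIO address 0x0123 into SDRAM would thus use a transfer count of 1024 and a source address of 0x5800048C when the base address is 0x58000000.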

Figure 22-17. PCI-XIO Bus timing: DMA burst write, 2 bytes, 0 wait states

[Timing diagram. Signals: PCI_CLK, PCI_FRAME#, PCI_IRDY#, PCI_TRDY#, PCI_DEVSEL#, PCI_AD[23:0] (ADDR: PCI Address, XIO Address 1, XIO Address 2), PCI_AD[31:24] (DATA: PCI Address, XIO Data 1, XIO Data 2), PCI_INTB#/CE#, PCI_C/BE2#/DS#, PCI_C/BE1#/IOWR#, PCI_C/BE0#/IORD# (each carrying the PCI Command during the address phase). Phases: Frame, data 1, hold, data 2, hold, bus idle.]

Table 22-5. PCI to XIO Bus address conversion examples

XIO Bus Address   PCI Word Address   XIO-PCI Base Address   PCI Transfer Address
(hex)             (hex)              (hex)                  (hex)
11                44                 5800 0000              5800 0044
0123              048C               5800 0000              5800 048C
11 0012           44 0048            5800 0000              5844 0048

Word transfer is used to transfer the bytes to and from the PCI bus for hardware simplicity. Additional hardware could be added to pack and unpack bytes, but this is an unnecessary complication given the speed of the PCI-XIO Bus relative to the speed of the PNX1300 bus and CPU. The primary intended use of the PCI-XIO Bus is for ROMs, flash EPROMs and I/O devices. Because the PCI-XIO Bus is so much slower than the PNX1300, there is time available for the PNX1300 to pack and unpack the words. At three PCI-XIO Bus wait states, at least 120 nanoseconds are required for each byte transferred. This corresponds to 12 CPU instructions at 100 MHz. The CPU may need to process each byte of data anyway. In the case of ROMs and flash EPROMs, the data is typically compressed, requiring the PNX1300 CPU to both unpack and decompress the data.
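The software unpacking step and the cycle budget mentioned above might look like the following in C. This is a sketch only: the function names are hypothetical, and the budget calculation simply restates the arithmetic in the text (time per byte multiplied by the CPU clock rate).

```c
#include <stddef.h>
#include <stdint.h>

/* Unpack n words received from the PCI-XIO Bus (for example by DMA)
 * into n bytes: each word carries one data byte in bits 31..24. */
void xio_unpack_bytes(const uint32_t *words, size_t n, uint8_t *out)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (uint8_t)(words[i] >> 24);
}

/* CPU cycles available per transferred byte:
 * nanoseconds per byte times clock in MHz, divided by 1000. */
unsigned xio_cycles_per_byte(unsigned ns_per_byte, unsigned cpu_mhz)
{
    return (ns_per_byte * cpu_mhz) / 1000u;
}
```

With 120 ns per byte and a 100 MHz clock, xio_cycles_per_byte returns 12, which matches the figure quoted in the text.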


PNX1300/01/02/11 DSPCPU Operations Appendix A

by Gert Slavenburg, Marcel Janssens

A.1 ALPHABETIC OPERATION LIST

The following table lists the complete operation set of PNX1300’s DSPCPU. Note that this is not an instruction list; a DSPCPU instruction contains from one to five of these operations.

A: alloc, allocd, allocr, allocx, asl, asli, asr, asri

B: bitand, bitandinv, bitinv, bitor, bitxor, borrow

C: carry, curcycles, cycles

D: dcb, dinvalid, dspiabs, dspiadd, dspidualabs, dspidualadd, dspidualmul, dspidualsub, dspimul, dspisub, dspuadd, dspumul, dspuquadaddui, dspusub, dualasr, dualiclipi, dualuclipi

F: fabsval, fabsvalflags, fadd, faddflags, fdiv, fdivflags, feql, feqlflags, fgeq, fgeqflags, fgtr, fgtrflags, fleq, fleqflags, fles, flesflags, fmul, fmulflags, fneq, fneqflags, fsign, fsignflags, fsqrt, fsqrtflags, fsub, fsubflags, funshift1, funshift2, funshift3

H: h_dspiabs, h_dspidualabs, h_iabs, h_st16d, h_st32d, h_st8d, hicycles

I: iabs, iadd, iaddi, iavgonep, ibytesel, iclipi, iclr, ident, ieql, ieqli, ifir16, ifir8ii, ifir8ui, ifixieee, ifixieeeflags, ifixrz, ifixrzflags, iflip, ifloat, ifloatflags, ifloatrz, ifloatrzflags, igeq, igeqi, igtr, igtri, iimm, ijmpf, ijmpi, ijmpt, ild16, ild16d, ild16r, ild16x, ild8, ild8d, ild8r, ileq, ileqi, iles, ilesi, imax, imin, imul, imulm, ineg, ineq, ineqi, inonzero, isub, isubi, izero

J: jmpf, jmpi, jmpt

L: ld32, ld32d, ld32r, ld32x, lsl, lsli, lsr, lsri

M: mergedual16lsb, mergelsb, mergemsb

N: nop

P: pack16lsb, pack16msb, packbytes, pref, pref16x, pref32x, prefd, prefr

Q: quadavg, quadumax, quadumin, quadumulmsb

R: rdstatus, rdtag, readdpc, readpcsw, readspc, rol, roli

S: sex16, sex8, st16, st16d, st32, st32d, st8, st8d

U: ubytesel, uclipi, uclipu, ueql, ueqli, ufir16, ufir8uu, ufixieee, ufixieeeflags, ufixrz, ufixrzflags, ufloat, ufloatflags, ufloatrz, ufloatrzflags, ugeq, ugeqi, ugtr, ugtri, uimm, uld16, uld16d, uld16r, uld16x, uld8, uld8d, uld8r, uleq, uleqi, ules, ulesi, ume8ii, ume8uu, umin, umul, umulm, uneq, uneqi

W: writedpc, writepcsw, writespc

Z: zex16, zex8


A.2 OPERATION LIST BY FUNCTION

Load/Store Operations

alloc, allocd, allocr, allocx, h_st16d, h_st32d, h_st8d, ild16, ild16d, ild16r, ild16x, ild8, ild8d, ild8r, ld32, ld32d, ld32r, ld32x, pref, pref16x, pref32x, prefd, prefr, st16, st16d, st32, st32d, st8, st8d, uld16, uld16d, uld16r, uld16x, uld8, uld8d, uld8r

Shift Operations

asl, asli, asr, asri, funshift1, funshift2, funshift3, lsl, lsli, lsr, lsri, rol, roli

Logical Operations

bitand, bitandinv, bitinv, bitor, bitxor

DSP Operations

dspiabs, dspiadd, dspidualabs, dspidualadd, dspidualmul, dspidualsub, dspimul, dspisub, dspuadd, dspumul, dspuquadaddui, dspusub, dualasr, dualiclipi, dualuclipi, h_dspiabs, h_dspidualabs, iclipi, ifir16, ifir8ii, ifir8ui, iflip, imax, imin, quadavg, quadumax, quadumin, quadumulmsb, uclipi, uclipu, ufir16, ufir8uu, ume8ii, ume8uu, umin

Floating-Point Arithmetic

fabsval, fabsvalflags, fadd, faddflags, fdiv, fdivflags, fmul, fmulflags, fsign, fsignflags, fsqrt, fsqrtflags, fsub, fsubflags

Floating-Point Conversion

ifixieee, ifixieeeflags, ifixrz, ifixrzflags, ifloat, ifloatflags, ifloatrz, ifloatrzflags, ufixieee, ufixieeeflags, ufixrz, ufixrzflags, ufloat, ufloatflags, ufloatrz, ufloatrzflags

Floating-Point Relationals

feql, feqlflags, fgeq, fgeqflags, fgtr, fgtrflags, fleq, fleqflags, fles, flesflags, fneq, fneqflags

Integer Arithmetic

borrow, carry, h_iabs, iabs, iadd, iaddi, iavgonep, ident, imul, imulm, ineg, inonzero, isub, isubi, izero, umul, umulm

Immediate Operations

iimm, uimm

Sign/Zero Extend Ops

sex16, sex8, zex16, zex8

Integer Relationals

ieql, ieqli, igeq, igeqi, igtr, igtri, ileq, ileqi, iles, ilesi, ineq, ineqi, ueql, ueqli, ugeq, ugeqi, ugtr, ugtri, uleq, uleqi, ules, ulesi, uneq, uneqi

Control-Flow Operations

ijmpf, ijmpi, ijmpt, jmpf, jmpi, jmpt

Special-Register Ops

cycles, curcycles, hicycles, nop, readdpc, readpcsw, readspc, writedpc, writepcsw, writespc

Cache Operations

dcb, dinvalid, iclr, rdstatus, rdtag

Pack/Merge/Select Ops

ibytesel, mergedual16lsb, mergelsb, mergemsb, pack16lsb, pack16msb, packbytes, ubytesel


alloc Allocate a cache block

pseudo-op for allocd(0)

SYNTAX

[ IF rguard ] alloc(d) rsrc1

FUNCTION

if rguard then {
    cache_block_mask = ~(cache_block_size - 1)
    allocate a data cache block with [(rsrc1 + 0) & cache_block_mask] address
}

ATTRIBUTES

Function unit dmemspec

Operation code 213

Number of operands 1

Modifier -

Modifier range -

Latency -

Issue slots 5

DESCRIPTION

The alloc operation is a pseudo operation transformed by the scheduler into an allocd(0) with the same arguments.

(Note: pseudo operations cannot be used in assembly files.)

The alloc operation allocates a cache block with the address computed from [(rsrc1 + 0) & cache_block_mask] and sets the status of this cache block as valid. No data is fetched from main memory for this operation. The allocated cache block data is undefined after this operation. It is the responsibility of the programmer to update the allocated cache block by store operations.

Refer to the ‘cache architecture’ section for details on the cache block size.

The alloc operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the execution of the alloc operation. If the LSB of rguard is 1, the alloc operation is executed; otherwise, it is not executed.

EXAMPLES

Initial Values                  Operation            Result

r10 = 0xabcd,                   alloc r10            Allocates a cache block for the address space from
cache_block_size = 0x40                              0xabc0 to 0xabff without fetching the data from main
                                                     memory; the data in this address space is undefined.

r10 = 0xabcd, r11 = 0,          IF r11 alloc r10     Since the guard is false, the alloc operation is not
cache_block_size = 0x40                              executed.

r10 = 0xac0f, r11 = 1,          IF r11 alloc r10     Allocates a cache block for the address space from
cache_block_size = 0x40                              0xac00 to 0xac3f without fetching the data from main
                                                     memory; the data in this address space is undefined.

SEE ALSO

allocd allocr allocx


allocd Allocate a cache block with displacement

SYNTAX

[ IF rguard ] allocd(d) rsrc1

FUNCTION

if rguard then {
    cache_block_mask = ~(cache_block_size - 1)
    allocate a data cache block with [(rsrc1 + d) & cache_block_mask] address
}

ATTRIBUTES

Function unit dmemspec

Operation code 213

Number of operands 1

Modifier 7 bits

Modifier range -256..252 by 4

Latency -

Issue slots 5

DESCRIPTION

The allocd operation allocates a cache block with the address computed from [(rsrc1 + d) & cache_block_mask] and sets the status of this cache block as valid. No data is fetched from main memory for this operation. The allocated cache block data is undefined after this operation. It is the responsibility of the programmer to update the allocated cache block by store operations.

Refer to the ‘cache architecture’ section for details on the cache block size.

The allocd operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the execution of the allocd operation. If the LSB of rguard is 1, the allocd operation is executed; otherwise, it is not executed.

EXAMPLES

Initial Values                  Operation                   Result

r10 = 0xabcd,                   allocd(0x32) r10            Allocates a cache block for the address space from
cache_block_size = 0x40                                     0xabc0 to 0xabff without fetching the data from main
                                                            memory; the data in this address space is undefined.

r10 = 0xabcd, r11 = 0,          IF r11 allocd(0x32) r10     Since the guard is false, the allocd operation is not
cache_block_size = 0x40                                     executed.

r10 = 0xabff, r11 = 1,          IF r11 allocd(0x4) r10      Allocates a cache block for the address space from
cache_block_size = 0x40                                     0xac00 to 0xac3f without fetching the data from main
                                                            memory; the data in this address space is undefined.
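The block address computation in the FUNCTION section can be checked against the examples above with a small C model. This is an illustrative sketch, not a description of the hardware: block_range and allocd_block are hypothetical names, and the model only computes which address range the allocated block covers.

```c
#include <stdint.h>

/* First and last byte address of the allocated cache block. */
typedef struct {
    uint32_t first;
    uint32_t last;
} block_range;

/* Models [(rsrc1 + d) & cache_block_mask] for a power-of-two block size.
 * Passing d = 0 gives the alloc pseudo-operation. */
block_range allocd_block(uint32_t rsrc1, int32_t d, uint32_t cache_block_size)
{
    uint32_t cache_block_mask = ~(cache_block_size - 1u);
    block_range r;

    r.first = (rsrc1 + (uint32_t)d) & cache_block_mask;
    r.last  = r.first + cache_block_size - 1u;
    return r;
}
```

For r10 = 0xabcd, d = 0x32 and a block size of 0x40, the sum is 0xabff, so the block spans 0xabc0 to 0xabff; for r10 = 0xabff and d = 0x4 the sum crosses into the next block, 0xac00 to 0xac3f.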

SEE ALSO

allocr allocx


allocr Allocate a cache block with index

SYNTAX

[ IF rguard ] allocr rsrc1 rsrc2

FUNCTION

if rguard then {
    cache_block_mask = ~(cache_block_size - 1)
    allocate a data cache block with [(rsrc1 + rsrc2) & cache_block_mask] address
}

ATTRIBUTES

Function unit dmemspec

Operation code 214

Number of operands 2

Modifier No

Modifier range -

Latency -

Issue slots 5

DESCRIPTION

The allocr operation allocates a cache block with the address computed from [(rsrc1 + rsrc2) & cache_block_mask] and sets the status of this cache block as valid. No data is fetched from main memory for this operation. The allocated cache block data is undefined after this operation. It is the responsibility of the programmer to update the allocated cache block by store operations.

Refer to the ‘cache architecture’ section for details on the cache block size.

The allocr operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the execution of the allocr operation. If the LSB of rguard is 1, the allocr operation is executed; otherwise, it is not executed.

EXAMPLES

Initial Values                      Operation                 Result

r10 = 0xabcd, r12 = 0x32,           allocr r10 r12            Allocates a cache block for the address space from
cache_block_size = 0x40                                       0xabc0 to 0xabff without fetching the data from main
                                                              memory; the data in this address space is undefined.

r10 = 0xabcd, r11 = 0, r12 = 0x32,  IF r11 allocr r10 r12     Since the guard is false, the allocr operation is not
cache_block_size = 0x40                                       executed.

r10 = 0xabff, r11 = 1, r12 = 0x4,   IF r11 allocr r10 r12     Allocates a cache block for the address space from
cache_block_size = 0x40                                       0xac00 to 0xac3f without fetching the data from main
                                                              memory; the data in this address space is undefined.

SEE ALSO

allocd allocx


allocx Allocate a cache block with scaled index

SYNTAX

[ IF rguard ] allocx rsrc1 rsrc2

FUNCTION

if rguard then {
    cache_block_mask = ~(cache_block_size - 1)
    allocate a data cache block with [(rsrc1 + 4 x rsrc2) & cache_block_mask] address
}

ATTRIBUTES

Function unit dmemspec

Operation code 215

Number of operands 2

Modifier No

Modifier range -

Latency -

Issue slots 5

DESCRIPTION

The allocx operation allocates a cache block with the address computed from [(rsrc1 + 4 x rsrc2) & cache_block_mask] and sets the status of this cache block as valid. No data is fetched from main memory for this operation. The allocated cache block data is undefined after this operation. It is the responsibility of the programmer to update the allocated cache block by store operations.

Refer to the ‘cache architecture’ section for details on the cache block size.

The allocx operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the execution of the allocx operation. If the LSB of rguard is 1, the allocx operation is executed; otherwise, it is not executed.

EXAMPLES

Initial Values Operation Result

r10 = 0xabcd, r12 = 0xc,            allocx r10 r12            Allocates a cache block for the address space from
cache_block_size = 0x40                                       0xabc0 to 0xabff without fetching the data from main
                                                              memory; the data in this address space is undefined.

r10 = 0xabcd, r11 = 0, r12 = 0xc,   IF r11 allocx r10 r12     Since the guard is false, the allocx operation is not
cache_block_size = 0x40                                       executed.

r10 = 0xabff, r11 = 1, r12 = 0x4,   IF r11 allocx r10 r12     Allocates a cache block for the address space from
cache_block_size = 0x40                                       0xac00 to 0xac3f without fetching the data from main
                                                              memory; the data in this address space is undefined.
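The difference between the index form (allocr) and the scaled-index form (allocx) can be illustrated with two small C functions. These are hypothetical models of the address arithmetic in the FUNCTION sections only; the names allocr_addr and allocx_addr are not taken from the data book.

```c
#include <stdint.h>

/* allocr: block address from base plus byte index. */
uint32_t allocr_addr(uint32_t rsrc1, uint32_t rsrc2, uint32_t cache_block_size)
{
    return (rsrc1 + rsrc2) & ~(cache_block_size - 1u);
}

/* allocx: block address from base plus index scaled by 4,
 * which suits indexing an array of 32-bit words. */
uint32_t allocx_addr(uint32_t rsrc1, uint32_t rsrc2, uint32_t cache_block_size)
{
    return (rsrc1 + 4u * rsrc2) & ~(cache_block_size - 1u);
}
```

With rsrc1 = 0xabc0 and rsrc2 = 0x10, allocr_addr stays in the block at 0xabc0, while allocx_addr reaches 0xabc0 + 0x40 and therefore selects the next block at 0xac00.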

SEE ALSO

allocd allocr


asl Arithmetic shift left

SYNTAX

[ IF rguard ] asl rsrc1 rsrc2 -> rdest

FUNCTION

if rguard then {
    n <- rsrc2<4:0>
    rdest<31:n> <- rsrc1<31–n:0>
    rdest<n–1:0> <- 0
    if rsrc2<31:5> != 0 {
        rdest <- 0
    }
}

ATTRIBUTES

Function unit shifter

Operation code 19

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2

DESCRIPTION

As shown below, the asl operation takes two arguments, rsrc1 and rsrc2. Rsrc2 specifies an unsigned shift amount, and rdest is set to rsrc1 arithmetically shifted left by this amount. If the rsrc2<31:5> value is not zero, this is treated as a shift by 32 or more bits. Zeros are shifted into the LSBs of rdest while the MSBs shifted out of rsrc1 are lost.

The asl operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is unchanged.

EXAMPLES

Initial Values                        Operation                     Result

r60 = 0x20, r30 = 3                   asl r60 r30 -> r90            r90 <- 0x100
r10 = 0, r60 = 0x20, r30 = 3          IF r10 asl r60 r30 -> r100    no change, since guard is false
r20 = 1, r60 = 0x20, r30 = 3          IF r20 asl r60 r30 -> r110    r110 <- 0x100
r70 = 0xfffffffc, r40 = 2             asl r70 r40 -> r120           r120 <- 0xfffffff0
r80 = 0xe, r50 = 0xfffffffe           asl r80 r50 -> r125           r125 <- 0x00000000 (shift by more than 32)
r30 = 0x7008000f, r60 = 0x20          asl r30 r60 -> r111           r111 <- 0x00000000
r30 = 0x8008000f, r45 = 0x80000000    asl r30 r45 -> r100           r100 <- 0x00000000
r30 = 0x8008000f, r45 = 0x23          asl r30 r45 -> r100           r100 <- 0x00000000

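[Figure: left shifter (example: n = 3). The 32 bits from rsrc1 (bits 31..0) enter the left shifter, with the shift amount taken from rsrc2. The intermediate result has zeros in its three LSBs; bits 31..0 of the result are written to rdest.]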
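The asl semantics can be modeled in portable C as sketched below. The function names are hypothetical. The explicit test on the upper bits of the shift amount is needed in C as well, because shifting a 32-bit value by 32 or more is undefined behavior in the C language, whereas asl is defined to produce zero in that case.

```c
#include <stdint.h>

/* Model of asl: shift left by rsrc2<4:0>, or produce zero when
 * rsrc2<31:5> is non-zero (a shift by 32 or more bits). */
uint32_t asl_model(uint32_t rsrc1, uint32_t rsrc2)
{
    if ((rsrc2 >> 5) != 0u)
        return 0u;
    return rsrc1 << (rsrc2 & 31u);
}

/* Guarded form: rdest is written only when the LSB of rguard is 1;
 * otherwise the previous rdest value is kept. */
uint32_t asl_guarded(uint32_t rguard, uint32_t rsrc1, uint32_t rsrc2,
                     uint32_t rdest_old)
{
    return (rguard & 1u) ? asl_model(rsrc1, rsrc2) : rdest_old;
}
```

The same model covers asli when the shift amount is the immediate modifier n in the range 0..31.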

SEE ALSO

asli asr asri lsl lsli lsr

lsri rol roli


asli Arithmetic shift left immediate

SYNTAX

[ IF rguard ] asli(n) rsrc1 -> rdest

FUNCTION

if rguard then {
    rdest<31:n> <- rsrc1<31–n:0>
    rdest<n–1:0> <- 0
}

ATTRIBUTES

Function unit shifter

Operation code 11

Number of operands 1

Modifier 7 bits

Modifier range 0..31

Latency 1

Issue slots 1, 2

DESCRIPTION

As shown below, the asli operation takes a single argument in rsrc1 and an immediate modifier n and produces a

result in rdest equal to rsrc1 arithmetically shifted left by n bits. The value of n must be between 0 and 31, inclusive.

Zeros are shifted into the LSBs of rdest while the MSBs shifted out of rsrc1 are lost.

The asli operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is unchanged.

EXAMPLES

Initial Values              Operation                     Result

r60 = 0x20                  asli(3) r60 -> r90            r90 <- 0x100
r10 = 0, r60 = 0x20         IF r10 asli(3) r60 -> r100    no change, since guard is false
r20 = 1, r60 = 0x20         IF r20 asli(3) r60 -> r110    r110 <- 0x100
r70 = 0xfffffffc            asli(2) r70 -> r120           r120 <- 0xfffffff0
r80 = 0xe                   asli(30) r80 -> r125          r125 <- 0x80000000

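[Figure: left shifter (example: n = 3). The 32 bits from rsrc1 (bits 31..0) enter the left shifter, with the shift amount n taken from the operation modifier. The intermediate result has zeros in its three LSBs; bits 31..0 of the result are written to rdest.]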

SEE ALSO

asl asr asri lsl lsli lsr

lsri rol roli


asr Arithmetic shift right

SYNTAX

[ IF rguard ] asr rsrc1 rsrc2 -> rdest

FUNCTION

if rguard then {
    n <- rsrc2<4:0>
    rdest<31:31–n> <- rsrc1<31>
    rdest<30–n:0> <- rsrc1<30:n>
    if rsrc2<31:5> != 0 {
        rdest <- rsrc1<31>
    }
}

ATTRIBUTES

Function unit shifter

Operation code 18

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2

DESCRIPTION

As shown below, the asr operation takes two arguments, rsrc1 and rsrc2. Rsrc2 specifies an unsigned shift amount, and rsrc1 is arithmetically shifted right by this amount. If the rsrc2<31:5> value is not zero, this is treated as a shift by 32 or more bits. The MSB (sign bit) of rsrc1 is replicated as needed to fill vacated bits from the left.

The asr operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is unchanged.

EXAMPLES

Initial Values                        Operation                    Result

r30 = 0x7008000f, r20 = 1             asr r30 r20 -> r50           r50 <- 0x38040007
r30 = 0x7008000f, r42 = 2             asr r30 r42 -> r60           r60 <- 0x1c020003
r10 = 0, r30 = 0x7008000f, r44 = 4    IF r10 asr r30 r44 -> r70    no change, since guard is false
r20 = 1, r30 = 0x7008000f, r44 = 4    IF r20 asr r30 r44 -> r80    r80 <- 0x07008000
r40 = 0x80030007, r44 = 4             asr r40 r44 -> r90           r90 <- 0xf8003000
r30 = 0x7008000f, r45 = 0x1f          asr r30 r45 -> r100          r100 <- 0x00000000
r30 = 0x8008000f, r45 = 0x1f          asr r30 r45 -> r100          r100 <- 0xffffffff
r30 = 0x7008000f, r45 = 0x20          asr r30 r45 -> r100          r100 <- 0x00000000
r30 = 0x8008000f, r45 = 0x20          asr r30 r45 -> r100          r100 <- 0xffffffff
r30 = 0x8008000f, r45 = 0x23          asr r30 r45 -> r100          r100 <- 0xffffffff

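[Figure: right shifter (example: n = 3). The 32 bits from rsrc1 (bits 31..0) enter the right shifter, with the shift amount taken from rsrc2. The sign bit S is replicated into the three MSBs of the intermediate result (bits 31..28 all equal S), which is written to rdest.]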

SEE ALSO

asl asli asri lsl lsli lsr

lsri rol roli


asri Arithmetic shift right by immediate amount

SYNTAX

[ IF rguard ] asri(n) rsrc1 → rdest

FUNCTION

if rguard then {
    rdest<31:31–n> ← rsrc1<31>
    rdest<30–n:0> ← rsrc1<30:n>
}

ATTRIBUTES

Function unit shifter

Operation code 10

Number of operands 1

Modifier 7 bits

Modifier range 0..31

Latency 1

Issue slots 1, 2

DESCRIPTION

As shown below, the asri operation takes a single argument in rsrc1 and an immediate modifier n and produces a

result in rdest that is equal to rsrc1 arithmetically shifted right by n bits. The value of n must be between 0 and 31,

inclusive. The MSB (sign bit) of rsrc1 is replicated as needed to fill vacated bits from the left.

The asri operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is unchanged.

EXAMPLES

Initial Values | Operation | Result
r30 = 0x7008000f | asri(1) r30 → r50 | r50 ← 0x38040007
r30 = 0x7008000f | asri(2) r30 → r60 | r60 ← 0x1c020003
r10 = 0, r30 = 0x7008000f | IF r10 asri(4) r30 → r70 | no change, since guard is false
r20 = 1, r30 = 0x7008000f | IF r20 asri(4) r30 → r80 | r80 ← 0x07008000
r40 = 0x80030007 | asri(4) r40 → r90 | r90 ← 0xf8003000
r30 = 0x7008000f | asri(31) r30 → r100 | r100 ← 0x00000000
r40 = 0x80030007 | asri(31) r40 → r110 | r110 ← 0xffffffff
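The guard behavior described for asri can be sketched in C as well. The model below is illustrative only: asri_guarded and its rdest_old parameter (standing for the previous contents of rdest) are hypothetical names, and n is assumed to lie in 0..31 as the description requires.

```c
#include <stdint.h>

/* Model of "IF rguard asri(n) rsrc1 -> rdest"; returns the new rdest. */
uint32_t asri_guarded(uint32_t rguard, unsigned n,
                      uint32_t rsrc1, uint32_t rdest_old)
{
    /* Only the LSB of the guard is examined. */
    if ((rguard & 1u) == 0u)
        return rdest_old;                /* guard false: rdest unchanged */

    uint32_t sign_fill = (rsrc1 & 0x80000000u) ? 0xffffffffu : 0u;
    n &= 31u;                            /* assumed modifier range 0..31 */
    if (n == 0u)
        return rsrc1;

    return (rsrc1 >> n) | (sign_fill << (32u - n));
}
```

With a guard value of 0 the function returns rdest_old untouched, which mirrors the "no change, since guard is false" row of the table.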

Figure: asri operation. The 32 bits of rsrc1 pass through a right shifter; the shift amount n comes from the operation modifier, and the vacated high bits of rdest are filled with copies (S) of the sign bit of rsrc1 (example: n = 3).

SEE ALSO

asl asli asr lsl lsli lsr

lsri rol roli


bitand Bitwise logical AND

SYNTAX

[ IF rguard ] bitand rsrc1 rsrc2 → rdest

FUNCTION

if rguard then
    rdest ← rsrc1 & rsrc2

ATTRIBUTES

Function unit alu

Operation code 16

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The bitand operation computes the bitwise, logical AND of the first and second arguments, rsrc1 and rsrc2. The

result is stored in the destination register, rdest.

The bitand operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values | Operation | Result
r30 = 0xf310ffff, r40 = 0xffff0000 | bitand r30 r40 → r90 | r90 ← 0xf3100000
r10 = 0, r50 = 0x88888888 | IF r10 bitand r30 r50 → r80 | no change, since guard is false
r20 = 1, r30 = 0xf310ffff, r50 = 0x88888888 | IF r20 bitand r30 r50 → r100 | r100 ← 0x80008888
r60 = 0x11119999, r50 = 0x88888888 | bitand r60 r50 → r110 | r110 ← 0x00008888
r70 = 0x55555555, r30 = 0xf310ffff | bitand r70 r30 → r120 | r120 ← 0x51105555

SEE ALSO

bitor bitxor bitandinv


bitandinv Bitwise logical AND NOT

SYNTAX

[ IF rguard ] bitandinv rsrc1 rsrc2 → rdest

FUNCTION

if rguard then
    rdest ← rsrc1 & ~rsrc2

ATTRIBUTES

Function unit alu

Operation code 49

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The bitandinv operation computes the bitwise, logical AND of the first argument, rsrc1, with the 1’s complement of the second argument, rsrc2. The result is stored in the destination register, rdest.

The bitandinv operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values | Operation | Result
r30 = 0xf310ffff, r40 = 0xffff0000 | bitandinv r30 r40 → r90 | r90 ← 0x0000ffff
r10 = 0, r50 = 0x88888888 | IF r10 bitandinv r30 r50 → r80 | no change, since guard is false
r20 = 1, r30 = 0xf310ffff, r50 = 0x88888888 | IF r20 bitandinv r30 r50 → r100 | r100 ← 0x73107777
r60 = 0x11119999, r50 = 0x88888888 | bitandinv r60 r50 → r110 | r110 ← 0x11111111
r70 = 0x55555555, r30 = 0xf310ffff | bitandinv r70 r30 → r120 | r120 ← 0x04450000

SEE ALSO

bitand bitor bitxor


bitinv Bitwise logical NOT

SYNTAX

[ IF rguard ] bitinv rsrc1 → rdest

FUNCTION

if rguard then
    rdest ← ~rsrc1

ATTRIBUTES

Function unit alu

Operation code 50

Number of operands 1

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The bitinv operation computes the bitwise, logical NOT of the argument rsrc1 and writes the result into rdest.

The bitinv operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values | Operation | Result
r30 = 0xf310ffff | bitinv r30 → r60 | r60 ← 0x0cef0000
r10 = 0, r40 = 0xffff0000 | IF r10 bitinv r40 → r70 | no change, since guard is false
r20 = 1, r40 = 0xffff0000 | IF r20 bitinv r40 → r100 | r100 ← 0x0000ffff
r50 = 0x88888888 | bitinv r50 → r110 | r110 ← 0x77777777

SEE ALSO

bitand bitandinv bitor

bitxor


bitor Bitwise logical OR

SYNTAX

[ IF rguard ] bitor rsrc1 rsrc2 → rdest

FUNCTION

if rguard then
    rdest ← rsrc1 | rsrc2

ATTRIBUTES

Function unit alu

Operation code 17

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The bitor operation computes the bitwise, logical OR of the first and second arguments, rsrc1 and rsrc2. The

result is stored in the destination register, rdest.

The bitor operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values | Operation | Result
r30 = 0xf310ffff, r40 = 0xffff0000 | bitor r30 r40 → r90 | r90 ← 0xffffffff
r10 = 0, r50 = 0x88888888 | IF r10 bitor r30 r50 → r80 | no change, since guard is false
r20 = 1, r30 = 0xf310ffff, r50 = 0x88888888 | IF r20 bitor r30 r50 → r100 | r100 ← 0xfb98ffff
r60 = 0x11119999, r50 = 0x88888888 | bitor r60 r50 → r110 | r110 ← 0x99999999
r70 = 0x55555555, r30 = 0xf310ffff | bitor r70 r30 → r120 | r120 ← 0xf755ffff

SEE ALSO

bitand bitandinv bitinv

bitxor


bitxor Bitwise logical exclusive-OR

SYNTAX

[ IF rguard ] bitxor rsrc1 rsrc2 → rdest

FUNCTION

if rguard then
    rdest ← rsrc1 ⊕ rsrc2

ATTRIBUTES

Function unit alu

Operation code 48

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The bitxor operation computes the bitwise, logical exclusive-OR of the first and second arguments, rsrc1 and

rsrc2. The result is stored in the destination register, rdest.

The bitxor operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values | Operation | Result
r30 = 0xf310ffff, r40 = 0xffff0000 | bitxor r30 r40 → r90 | r90 ← 0x0cefffff
r10 = 0, r50 = 0x88888888 | IF r10 bitxor r30 r50 → r80 | no change, since guard is false
r20 = 1, r30 = 0xf310ffff, r50 = 0x88888888 | IF r20 bitxor r30 r50 → r100 | r100 ← 0x7b987777
r60 = 0x11119999, r50 = 0x88888888 | bitxor r60 r50 → r110 | r110 ← 0x99991111
r70 = 0x55555555, r30 = 0xf310ffff | bitxor r70 r30 → r120 | r120 ← 0xa645aaaa

SEE ALSO

bitand bitandinv bitinv

bitor


borrow Compute borrow bit from unsigned subtract

pseudo-op for ugtr

SYNTAX

[ IF rguard ] borrow rsrc1 rsrc2 → rdest

FUNCTION

if rguard then {
    if rsrc1 < rsrc2 then
        rdest ← 1
    else
        rdest ← 0
}

ATTRIBUTES

Function unit alu

Operation code 33

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The borrow operation is a pseudo operation transformed by the scheduler into a ugtr with reversed arguments. (Note: pseudo operations cannot be used in assembly source files.)

The borrow operation computes the unsigned difference of the first and second arguments, rsrc1–rsrc2. If the difference generates a borrow (if rsrc2 > rsrc1), 1 is stored in the destination register, rdest; otherwise, rdest is set to 0.

The borrow operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values | Operation | Result
r70 = 2, r30 = 0xfffffffc | borrow r70 r30 → r80 | r80 ← 1
r10 = 0, r70 = 2, r30 = 0xfffffffc | IF r10 borrow r70 r30 → r90 | no change, since guard is false
r20 = 1, r70 = 2, r30 = 0xfffffffc | IF r20 borrow r70 r30 → r100 | r100 ← 1
r60 = 4, r30 = 0xfffffffc | borrow r60 r30 → r110 | r110 ← 1
r30 = 0xfffffffc | borrow r30 r30 → r120 | r120 ← 0
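One use of a borrow bit is multiword subtraction. The sketch below models borrow in C and uses it to subtract two 64-bit values held as pairs of 32-bit words. Both borrow_model and sub64_from_words are hypothetical helper names chosen for this illustration; they are not part of the DSPCPU tool chain.

```c
#include <stdint.h>

/* Model of borrow: 1 if the unsigned subtract rsrc1 - rsrc2 borrows. */
uint32_t borrow_model(uint32_t rsrc1, uint32_t rsrc2)
{
    return (rsrc1 < rsrc2) ? 1u : 0u;
}

/* 64-bit subtract (a - b) built from 32-bit words and the borrow bit. */
uint64_t sub64_from_words(uint32_t a_hi, uint32_t a_lo,
                          uint32_t b_hi, uint32_t b_lo)
{
    uint32_t lo = a_lo - b_lo;                       /* wraps modulo 2^32 */
    uint32_t hi = a_hi - b_hi - borrow_model(a_lo, b_lo);
    return ((uint64_t)hi << 32) | lo;
}
```

For example, subtracting 1 from the word pair (1, 0) borrows out of the low word and leaves (0, 0xffffffff).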

SEE ALSO

ugtr carry


carry Compute carry bit from unsigned add

SYNTAX

[ IF rguard ] carry rsrc1 rsrc2 → rdest

FUNCTION

if rguard then {
    if (rsrc1+rsrc2) < 2^32 then
        rdest ← 0
    else
        rdest ← 1
}

ATTRIBUTES

Function unit alu

Operation code 45

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The carry operation computes the unsigned sum of the first and second arguments, rsrc1+rsrc2. If the sum generates a carry (if the sum is greater than 2^32–1), 1 is stored in the destination register, rdest; otherwise, rdest is set to 0.

The carry operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values | Operation | Result
r70 = 2, r30 = 0xfffffffc | carry r70 r30 → r80 | r80 ← 0
r10 = 0, r70 = 2, r30 = 0xfffffffc | IF r10 carry r70 r30 → r90 | no change, since guard is false
r20 = 1, r70 = 2, r30 = 0xfffffffc | IF r20 carry r70 r30 → r100 | r100 ← 0
r60 = 4, r30 = 0xfffffffc | carry r60 r30 → r110 | r110 ← 1
r30 = 0xfffffffc | carry r30 r30 → r120 | r120 ← 1
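The carry bit is typically what links the low and high words of a multiword add. The C sketch below models carry and builds a 64-bit add from 32-bit words; carry_model and add64_from_words are hypothetical names used only for illustration.

```c
#include <stdint.h>

/* Model of carry: 1 if the unsigned sum rsrc1 + rsrc2 reaches 2^32. */
uint32_t carry_model(uint32_t rsrc1, uint32_t rsrc2)
{
    uint64_t sum = (uint64_t)rsrc1 + (uint64_t)rsrc2;
    return (uint32_t)(sum >> 32);        /* bit 32 of the 33-bit sum */
}

/* 64-bit add (a + b) built from 32-bit words and the carry bit. */
uint64_t add64_from_words(uint32_t a_hi, uint32_t a_lo,
                          uint32_t b_hi, uint32_t b_lo)
{
    uint32_t lo = a_lo + b_lo;                       /* wraps modulo 2^32 */
    uint32_t hi = a_hi + b_hi + carry_model(a_lo, b_lo);
    return ((uint64_t)hi << 32) | lo;
}
```

Adding 1 to the word pair (0, 0xffffffff) carries into the high word and gives (1, 0).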

SEE ALSO

borrow


curcycles Read current clock cycle counter, least-significant word

SYNTAX

[ IF rguard ] curcycles → rdest

FUNCTION

if rguard then
    rdest ← CCCOUNT<31:0>

ATTRIBUTES

Function unit fcomp

Operation code 162

Number of operands 0

Modifier No

Modifier range —

Latency 1

Issue slots 3

DESCRIPTION

Refer to Section 3.1.5, “CCCOUNT—Clock Cycle Counter” for a description of the CCCOUNT operation. The

curcycles operation copies the current low 32 bits of the master Clock Cycle Counter (CCCOUNT) to the

destination register, rdest. The master CCCOUNT increments on all cycles (processor-stall and non-stall) if

PCSW.CS = 1; otherwise, the counter increments only on non-stall cycles.

The curcycles operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values | Operation | Result
CCCOUNT_HR = 0xabcdefff12345678 | curcycles → r60 | r60 ← 0x12345678
r10 = 0, CCCOUNT_HR = 0xabcdefff12345678 | IF r10 curcycles → r70 | no change, since guard is false
r20 = 1, CCCOUNT_HR = 0xabcdefff12345678 | IF r20 curcycles → r100 | r100 ← 0x12345678

SEE ALSO

cycles hicycles writepcsw


cycles Read clock cycle counter, least-significant word

SYNTAX

[ IF rguard ] cycles → rdest

FUNCTION

if rguard then
    rdest ← CCCOUNT<31:0>

ATTRIBUTES

Function unit fcomp

Operation code 154

Number of operands 0

Modifier No

Modifier range —

Latency 1

Issue slots 3

DESCRIPTION

Refer to Section 3.1.5, “CCCOUNT—Clock Cycle Counter” for a description of the CCCOUNT operation. The cycles operation copies the low 32 bits of the slave register of the Clock Cycle Counter (CCCOUNT) to the destination register, rdest. The slave register is updated from the master counter only on a successful interruptible jump and on processor reset. Thus, if cycles and hicycles are executed without intervening interruptible jumps, the operation pair is guaranteed to be a coherent sample of the master clock-cycle counter. The master counter increments on all cycles (processor-stall and non-stall) if PCSW.CS = 1; otherwise, the counter increments only on non-stall cycles.

The cycles operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values | Operation | Result
CCCOUNT_HR = 0xabcdefff12345678 | cycles → r60 | r60 ← 0x12345678
r10 = 0, CCCOUNT_HR = 0xabcdefff12345678 | IF r10 cycles → r70 | no change, since guard is false
r20 = 1, CCCOUNT_HR = 0xabcdefff12345678 | IF r20 cycles → r100 | r100 ← 0x12345678

SEE ALSO

hicycles curcycles

writepcsw


dcb Data cache copy back

SYNTAX

[ IF rguard ] dcb(d) rsrc1

FUNCTION

if rguard then {
    addr ← rsrc1 + d
    if dcache_valid_addr(addr) && dcache_dirty_addr(addr) then {
        dcache_copyback_addr(addr)
        dcache_reset_dirty_addr(addr)
    }
}

ATTRIBUTES

Function unit dmemspec

Operation code 205

Number of operands 1

Modifier 7 bits

Modifier range –256..252 by 4

Latency 3

Issue slots 5

DESCRIPTION

The dcb operation causes a block in the data cache to be copied back to main memory if the block is marked dirty and valid, and the block’s dirty bit is reset. The target block of dcb is the block in the data cache that contains the byte addressed by rsrc1 + d. The d value is an opcode modifier, must be in the range –256 to 252 inclusive, and must be a multiple of 4.

A valid copy of the target block remains in the cache. Stall cycles are taken as necessary to complete the copy-back operation. If the target block is not dirty or if the block is not in the cache, dcb has no effect and no stall cycles are taken.

dcb has no effect on blocks that are in the non-cacheable SDRAM aperture. dcb does not change the replacement status of data-cache blocks.

dcb ensures coherency between caches and main memory by discarding all pending prefetch operations and by causing all non-empty copyback buffers to be emptied to main memory.

The dcb operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls whether the operation is carried out. If the LSB of rguard is 1, the operation is carried out; otherwise, it is not carried out.

EXAMPLES

Initial Values | Operation | Result
 | dcb(0) r30 |
r10 = 0 | IF r10 dcb(4) r40 | no change and no stall cycles, since guard is false
r20 = 1 | IF r20 dcb(8) r50 |

SEE ALSO

dinvalid


dinvalid Invalidate data cache block

SYNTAX

[ IF rguard ] dinvalid(d) rsrc1

FUNCTION

if rguard then {
    addr ← rsrc1 + d
    if dcache_valid_addr(addr) then {
        dcache_reset_valid_addr(addr)
        dcache_reset_dirty_addr(addr)
    }
}

ATTRIBUTES

Function unit dmemspec

Operation code 206

Number of operands 1

Modifier 7 bits

Modifier range –256..252 by 4

Latency 3

Issue slots 5

DESCRIPTION

The dinvalid operation resets the valid and dirty bit of a block in the data cache. Regardless of the block’s dirty

bit, the block is not written back to main memory. The target block of dinvalid is the block in the data cache that

contains the byte addressed by rsrc1 + d. The d value is an opcode modifier, must be in the range –256 to 252

inclusive, and must be a multiple of 4.

Stall cycles are taken as necessary to complete the invalidate operation. If the target block is not in the cache,

dinvalid has no effect and no stall cycles are taken.

dinvalid has no effect on blocks that are in the non-cacheable SDRAM aperture. dinvalid does clear the

valid bits of locked blocks. dinvalid does not change the replacement status of data-cache blocks.

dinvalid ensures coherency between caches and main memory by discarding all pending prefetch operations

and by causing all non-empty copyback buffers to be emptied to main memory.

The dinvalid operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls whether the operation is carried out. If the LSB of rguard is 1, the operation is carried out; otherwise, it is not carried out.

EXAMPLES

Initial Values | Operation | Result
 | dinvalid(0) r30 |
r10 = 0 | IF r10 dinvalid(4) r40 | no change and no stall cycles, since guard is false
r20 = 1 | IF r20 dinvalid(8) r50 |

SEE ALSO

dcb


dspiabs Clipped signed absolute value

pseudo-op for h_dspiabs

SYNTAX

[ IF rguard ] dspiabs rsrc1 → rdest

FUNCTION

if rguard then {
    if rsrc1 >= 0 then
        rdest ← rsrc1
    else if rsrc1 = 0x80000000 then
        rdest ← 0x7fffffff
    else
        rdest ← –rsrc1
}

ATTRIBUTES

Function unit dspalu

Operation code 65

Number of operands 1

Modifier No

Modifier range —

Latency 2

Issue slots 1, 3

DESCRIPTION

The dspiabs operation is a pseudo operation transformed by the scheduler into an h_dspiabs with a constant first argument zero and second argument equal to the dspiabs argument. (Note: pseudo operations cannot be used in assembly source files.)

The dspiabs operation computes the absolute value of rsrc1, clips the result into the range [2^31–1..0] (or [0x7fffffff..0]), and stores the clipped value into rdest. All values are signed integers.

The dspiabs operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values | Operation | Result
r30 = 0xffffffff | dspiabs r30 → r60 | r60 ← 0x00000001
r10 = 0, r40 = 0x80000001 | IF r10 dspiabs r40 → r70 | no change, since guard is false
r20 = 1, r40 = 0x80000001 | IF r20 dspiabs r40 → r100 | r100 ← 0x7fffffff
r50 = 0x80000000 | dspiabs r50 → r80 | r80 ← 0x7fffffff
r90 = 0x7fffffff | dspiabs r90 → r110 | r110 ← 0x7fffffff
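The clipping rule for the most negative input can be checked with a small C model. This is a sketch under the assumption that register contents are carried as raw 32-bit patterns; dspiabs_model is a hypothetical name, and the guard is omitted.

```c
#include <stdint.h>

/* Model of dspiabs on a raw 32-bit register pattern. */
uint32_t dspiabs_model(uint32_t rsrc1)
{
    if ((rsrc1 & 0x80000000u) == 0u)
        return rsrc1;                    /* rsrc1 >= 0: pass through */

    if (rsrc1 == 0x80000000u)
        return 0x7fffffffu;              /* |-2^31| clips to 2^31 - 1 */

    /* Two's-complement negate, done in unsigned arithmetic. */
    return 0u - rsrc1;
}
```

Working on the unsigned pattern avoids the signed overflow that a plain C negation of the most negative int32_t value would cause.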

SEE ALSO

h_dspiabs h_dspidualabs

dspiadd dspimul dspisub

dspuadd dspumul dspusub


dspiadd Clipped signed add

SYNTAX

[ IF rguard ] dspiadd rsrc1 rsrc2 → rdest

FUNCTION

if rguard then {
    temp ← sign_ext32to64(rsrc1) + sign_ext32to64(rsrc2)
    if temp < 0xffffffff80000000 then
        rdest ← 0x80000000
    else if temp > 0x000000007fffffff then
        rdest ← 0x7fffffff
    else
        rdest ← temp<31:0>
}

ATTRIBUTES

Function unit dspalu

Operation code 66

Number of operands 2

Modifier No

Modifier range —

Latency 2

Issue slots 1, 3

DESCRIPTION

As shown below, the dspiadd operation computes the sum rsrc1+rsrc2, clips the result into the 32-bit signed range [2^31–1..–2^31] (or [0x7fffffff..0x80000000]), and stores the clipped value into rdest. All values are signed integers.

The dspiadd operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values | Operation | Result
r30 = 0x1200, r40 = 0xff | dspiadd r30 r40 → r60 | r60 ← 0x12ff
r10 = 0, r30 = 0x1200, r40 = 0xff | IF r10 dspiadd r30 r40 → r80 | no change, since guard is false
r20 = 1, r30 = 0x1200, r40 = 0xff | IF r20 dspiadd r30 r40 → r100 | r100 ← 0x12ff
r50 = 0x7fffffff, r90 = 1 | dspiadd r50 r90 → r110 | r110 ← 0x7fffffff
r70 = 0x80000000, r80 = 0xffffffff | dspiadd r70 r80 → r120 | r120 ← 0x80000000
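The widen, add, and clip sequence in the FUNCTION section maps directly onto 64-bit C arithmetic. The following is a minimal sketch, with the guard omitted; sign_ext32to64 mirrors the name used in the pseudocode, while dspiadd_model is a hypothetical name introduced here.

```c
#include <stdint.h>

/* Sign-extend a raw 32-bit pattern to a 64-bit signed value. */
static int64_t sign_ext32to64(uint32_t x)
{
    return (x & 0x80000000u) ? (int64_t)x - INT64_C(4294967296)
                             : (int64_t)x;
}

/* Model of dspiadd: full-precision sum, clipped to the int32 range. */
uint32_t dspiadd_model(uint32_t rsrc1, uint32_t rsrc2)
{
    int64_t temp = sign_ext32to64(rsrc1) + sign_ext32to64(rsrc2);

    if (temp < INT32_MIN)
        return 0x80000000u;
    if (temp > INT32_MAX)
        return 0x7fffffffu;
    return (uint32_t)temp;               /* temp<31:0> */
}
```

Because the sum is formed in 64 bits, the model never overflows before the clip is applied, which is the role of the 33-bit intermediate result in the operation.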

Figure: dspiadd operation. rsrc1 (signed) and rsrc2 (signed) are added into a full-precision 33-bit signed result, which is clipped to [2^31–1..–2^31] and written to rdest (signed).

SEE ALSO

dspiabs dspimul dspisub

dspuadd dspumul dspusub


dspidualabs Dual clipped absolute value of signed 16-bit halfwords

pseudo-op for h_dspidualabs

SYNTAX

[ IF rguard ] dspidualabs rsrc1 → rdest

FUNCTION

if rguard then {
    temp1 ← sign_ext16to32(rsrc1<15:0>)
    temp2 ← sign_ext16to32(rsrc1<31:16>)
    if temp1 = 0xffff8000 then temp1 ← 0x7fff
    if temp2 = 0xffff8000 then temp2 ← 0x7fff
    if temp1 < 0 then temp1 ← –temp1
    if temp2 < 0 then temp2 ← –temp2
    rdest<31:16> ← temp2<15:0>
    rdest<15:0> ← temp1<15:0>
}

ATTRIBUTES

Function unit dspalu

Operation code 72

Number of operands 1

Modifier No

Modifier range —

Latency 2

Issue slots 1, 3

DESCRIPTION

The dspidualabs operation is a pseudo operation transformed by the scheduler into an h_dspidualabs with

a constant zero as first argument and the dspidualabs argument as second argument. (Note: pseudo operations

cannot be used in assembly source files.)

The dspidualabs operation performs two 16-bit clipped, signed absolute value computations separately on the high and low 16-bit halfwords of rsrc1. Both absolute values are clipped into the range [0x0..0x7fff] and written into the corresponding halfwords of rdest. All values are signed 16-bit integers.

The dspidualabs operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls

the modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values | Operation | Result
r30 = 0xffff0032 | dspidualabs r30 → r60 | r60 ← 0x00010032
r10 = 0, r40 = 0x80008001 | IF r10 dspidualabs r40 → r70 | no change, since guard is false
r20 = 1, r40 = 0x80008001 | IF r20 dspidualabs r40 → r100 | r100 ← 0x7fff7fff
r50 = 0x0032ffff | dspidualabs r50 → r80 | r80 ← 0x00320001
r90 = 0x7fffffff | dspidualabs r90 → r110 | r110 ← 0x7fff0001
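The per-halfword behavior can be sketched in C by applying one 16-bit helper to each half of the word. The names abs16_clipped and dspidualabs_model are assumptions made for this illustration, and the guard is left out.

```c
#include <stdint.h>

/* Clipped absolute value of the signed 16-bit value in bits 15..0. */
static uint32_t abs16_clipped(uint32_t half)
{
    half &= 0xffffu;
    if (half == 0x8000u)
        return 0x7fffu;                  /* |-2^15| clips to 2^15 - 1 */
    if (half & 0x8000u)
        return (0x10000u - half) & 0xffffu;   /* 16-bit negate */
    return half;
}

/* Model of dspidualabs: the two halfwords are processed independently. */
uint32_t dspidualabs_model(uint32_t rsrc1)
{
    uint32_t hi = abs16_clipped(rsrc1 >> 16);
    uint32_t lo = abs16_clipped(rsrc1);
    return (hi << 16) | lo;
}
```

For the input 0x80008001 both halves end up as 0x7fff: the high half by clipping, the low half because the absolute value of -32767 is 32767.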

SEE ALSO

h_dspidualabs dspiabs

dspidualadd dspidualmul

dspidualsub


dspidualadd Dual clipped add of signed 16-bit halfwords

SYNTAX

[ IF rguard ] dspidualadd rsrc1 rsrc2 → rdest

FUNCTION

if rguard then {
    temp1 ← sign_ext16to32(rsrc1<15:0>) + sign_ext16to32(rsrc2<15:0>)
    temp2 ← sign_ext16to32(rsrc1<31:16>) + sign_ext16to32(rsrc2<31:16>)
    if temp1 < 0xffff8000 then temp1 ← 0x8000
    if temp2 < 0xffff8000 then temp2 ← 0x8000
    if temp1 > 0x7fff then temp1 ← 0x7fff
    if temp2 > 0x7fff then temp2 ← 0x7fff
    rdest<31:16> ← temp2<15:0>
    rdest<15:0> ← temp1<15:0>
}

ATTRIBUTES

Function unit dspalu

Operation code 70

Number of operands 2

Modifier No

Modifier range —

Latency 2

Issue slots 1, 3

DESCRIPTION

As shown below, the dspidualadd operation computes two 16-bit clipped, signed sums separately on the two pairs of high and low 16-bit halfwords of rsrc1 and rsrc2. Both sums are clipped into the range [2^15–1..–2^15] (or [0x7fff..0x8000]) and written into the corresponding halfwords of rdest. All values are signed 16-bit integers.

The dspidualadd operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls

the modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values | Operation | Result
r30 = 0x12340032, r40 = 0x00010002 | dspidualadd r30 r40 → r60 | r60 ← 0x12350034
r10 = 0, r30 = 0x12340032, r40 = 0x00010002 | IF r10 dspidualadd r30 r40 → r70 | no change, since guard is false
r20 = 1, r30 = 0x12340032, r40 = 0x00010002 | IF r20 dspidualadd r30 r40 → r100 | r100 ← 0x12350034
r50 = 0x80000001, r80 = 0xffff7fff | dspidualadd r50 r80 → r90 | r90 ← 0x80007fff
r110 = 0x00017fff, r120 = 0x7fff7fff | dspidualadd r110 r120 → r125 | r125 ← 0x7fff7fff
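A C model of the two independent 16-bit lanes is sketched below. sign_ext16to32 follows the name in the FUNCTION pseudocode; clip16 and dspidualadd_model are hypothetical names, and the guard is omitted.

```c
#include <stdint.h>

/* Sign-extend the 16-bit value in bits 15..0 to a 32-bit signed value. */
static int32_t sign_ext16to32(uint32_t h)
{
    h &= 0xffffu;
    return (h & 0x8000u) ? (int32_t)h - 65536 : (int32_t)h;
}

/* Clip a signed value to [-2^15 .. 2^15 - 1]; return its 16-bit pattern. */
static uint32_t clip16(int32_t t)
{
    if (t < -32768) return 0x8000u;
    if (t > 32767)  return 0x7fffu;
    return (uint32_t)t & 0xffffu;
}

/* Model of dspidualadd: one clipped sum per halfword lane. */
uint32_t dspidualadd_model(uint32_t rsrc1, uint32_t rsrc2)
{
    uint32_t lo = clip16(sign_ext16to32(rsrc1) + sign_ext16to32(rsrc2));
    uint32_t hi = clip16(sign_ext16to32(rsrc1 >> 16) +
                         sign_ext16to32(rsrc2 >> 16));
    return (hi << 16) | lo;
}
```

No carry crosses between the lanes: in the 0x80000001 + 0xffff7fff row, the low lane saturates at 0x7fff while the high lane independently saturates at 0x8000.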

Figure: dspidualadd operation. The high and low 16-bit halfwords of rsrc1 and rsrc2 (all signed) are added pairwise into two full-precision 17-bit signed sums; each sum is clipped to [2^15–1..–2^15] and written to the corresponding signed halfword of rdest.

SEE ALSO

dspidualabs dspidualmul

dspidualsub dspiabs


dspidualmul Dual clipped multiply of signed 16-bit halfwords

SYNTAX

[ IF rguard ] dspidualmul rsrc1 rsrc2 → rdest

FUNCTION

if rguard then {
    temp1 ← sign_ext16to32(rsrc1<15:0>) × sign_ext16to32(rsrc2<15:0>)
    temp2 ← sign_ext16to32(rsrc1<31:16>) × sign_ext16to32(rsrc2<31:16>)
    if temp1 < 0xffff8000 then temp1 ← 0x8000
    if temp2 < 0xffff8000 then temp2 ← 0x8000
    if temp1 > 0x7fff then temp1 ← 0x7fff
    if temp2 > 0x7fff then temp2 ← 0x7fff
    rdest<31:16> ← temp2<15:0>
    rdest<15:0> ← temp1<15:0>
}

ATTRIBUTES

Function unit dspmul

Operation code 95

Number of operands 2

Modifier No

Modifier range —

Latency 3

Issue slots 2, 3

DESCRIPTION

As shown below, the dspidualmul operation computes two 16-bit clipped, signed products separately on the two pairs of high and low 16-bit halfwords of rsrc1 and rsrc2. Both products are clipped into the range [2^15–1..–2^15] (or [0x7fff..0x8000]) and written into the corresponding halfwords of rdest. All values are signed 16-bit integers.

The dspidualmul operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls

the modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values | Operation | Result
r30 = 0x00020010, r40 = 0x00030020 | dspidualmul r30 r40 → r60 | r60 ← 0x00060200
r10 = 0, r30 = 0x00020010, r40 = 0x00030020 | IF r10 dspidualmul r30 r40 → r70 | no change, since guard is false
r20 = 1, r30 = 0x00020010, r40 = 0x00030020 | IF r20 dspidualmul r30 r40 → r100 | r100 ← 0x00060200
r50 = 0x80000002, r80 = 0x00024000 | dspidualmul r50 r80 → r90 | r90 ← 0x80007fff
r110 = 0x08000003, r120 = 0x00108001 | dspidualmul r110 r120 → r125 | r125 ← 0x7fff8000
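The dual multiply follows the same lane structure as the dual add, with a 32-bit product per lane before clipping. The sketch below is illustrative; clip16 and dspidualmul_model are hypothetical names, and the guard is not modeled.

```c
#include <stdint.h>

/* Sign-extend the 16-bit value in bits 15..0 to a 32-bit signed value. */
static int32_t sign_ext16to32(uint32_t h)
{
    h &= 0xffffu;
    return (h & 0x8000u) ? (int32_t)h - 65536 : (int32_t)h;
}

/* Clip a signed value to [-2^15 .. 2^15 - 1]; return its 16-bit pattern. */
static uint32_t clip16(int32_t t)
{
    if (t < -32768) return 0x8000u;
    if (t > 32767)  return 0x7fffu;
    return (uint32_t)t & 0xffffu;
}

/* Model of dspidualmul: one clipped 16x16 product per halfword lane.
   Each product has magnitude at most 2^30, so it fits in int32_t. */
uint32_t dspidualmul_model(uint32_t rsrc1, uint32_t rsrc2)
{
    uint32_t lo = clip16(sign_ext16to32(rsrc1) * sign_ext16to32(rsrc2));
    uint32_t hi = clip16(sign_ext16to32(rsrc1 >> 16) *
                         sign_ext16to32(rsrc2 >> 16));
    return (hi << 16) | lo;
}
```

Products clip quickly: 2 × 0x4000 is already 32768, one more than the largest representable value, so the low lane of the fourth table row saturates to 0x7fff.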

Figure: dspidualmul operation. The high and low 16-bit halfwords of rsrc1 and rsrc2 (all signed) are multiplied pairwise into two full-precision 32-bit signed products; each product is clipped to [2^15–1..–2^15] and written to the corresponding signed halfword of rdest.

SEE ALSO

dspidualabs dspidualadd

dspidualsub dspiabs


dspidualsub Dual clipped subtract of signed 16-bit halfwords

SYNTAX

[ IF rguard ] dspidualsub rsrc1 rsrc2 → rdest

FUNCTION

if rguard then {
    temp1 ← sign_ext16to32(rsrc1<15:0>) – sign_ext16to32(rsrc2<15:0>)
    temp2 ← sign_ext16to32(rsrc1<31:16>) – sign_ext16to32(rsrc2<31:16>)
    if temp1 < 0xffff8000 then temp1 ← 0x8000
    if temp2 < 0xffff8000 then temp2 ← 0x8000
    if temp1 > 0x7fff then temp1 ← 0x7fff
    if temp2 > 0x7fff then temp2 ← 0x7fff
    rdest<31:16> ← temp2<15:0>
    rdest<15:0> ← temp1<15:0>
}

ATTRIBUTES

Function unit dspalu

Operation code 71

Number of operands 2

Modifier No

Modifier range —

Latency 2

Issue slots 1, 3

DESCRIPTION

As shown below, the dspidualsub operation computes two 16-bit clipped, signed differences separately on the two pairs of high and low 16-bit halfwords of rsrc1 and rsrc2. Both differences are clipped into the range [2^15–1..–2^15] (or [0x7fff..0x8000]) and written into the corresponding halfwords of rdest. All values are signed 16-bit integers.

The dspidualsub operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls

the modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values | Operation | Result
r30 = 0x12340032, r40 = 0x00010002 | dspidualsub r30 r40 → r60 | r60 ← 0x12330030
r10 = 0, r30 = 0x12340032, r40 = 0x00010002 | IF r10 dspidualsub r30 r40 → r70 | no change, since guard is false
r20 = 1, r30 = 0x12340032, r40 = 0x00010002 | IF r20 dspidualsub r30 r40 → r100 | r100 ← 0x12330030
r50 = 0x80000001, r80 = 0x00018001 | dspidualsub r50 r80 → r90 | r90 ← 0x80007fff
r110 = 0x00018001, r120 = 0x80010002 | dspidualsub r110 r120 → r125 | r125 ← 0x7fff8000

Figure: dspidualsub operation. The high and low 16-bit halfwords of rsrc2 are subtracted from those of rsrc1 (all signed), giving two full-precision 17-bit signed differences; each difference is clipped to [2^15–1..–2^15] and written to the corresponding signed halfword of rdest.

SEE ALSO

dspidualabs dspidualadd

dspidualmul dspiabs


dspimul Clipped signed multiply

SYNTAX

[ IF rguard ] dspimul rsrc1 rsrc2 → rdest

FUNCTION

if rguard then {
    temp ← sign_ext32to64(rsrc1) × sign_ext32to64(rsrc2)
    if temp < 0xffffffff80000000 then
        rdest ← 0x80000000
    else if temp > 0x000000007fffffff then
        rdest ← 0x7fffffff
    else
        rdest ← temp<31:0>
}

ATTRIBUTES

Function unit ifmul

Operation code 141

Number of operands 2

Modifier No

Modifier range —

Latency 3

Issue slots 2, 3

DESCRIPTION

As shown below, the dspimul operation computes the product rsrc1×rsrc2, clips the result into the 32-bit range [2^31–1..–2^31] (or [0x7fffffff..0x80000000]), and stores the clipped value into rdest. All values are signed integers.

The dspimul operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values | Operation | Result
r30 = 0x10, r40 = 0x20 | dspimul r30 r40 → r60 | r60 ← 0x200
r10 = 0, r30 = 0x10, r40 = 0x20 | IF r10 dspimul r30 r40 → r80 | no change, since guard is false
r20 = 1, r30 = 0x10, r40 = 0x20 | IF r20 dspimul r30 r40 → r100 | r100 ← 0x200
r50 = 0x40000000, r90 = 2 | dspimul r50 r90 → r110 | r110 ← 0x7fffffff
r80 = 0xffffffff | dspimul r80 r80 → r120 | r120 ← 0x1
r70 = 0x80000000, r90 = 2 | dspimul r70 r90 → r120 | r120 ← 0x80000000
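The 64-bit intermediate product and the final clip can be modeled in C as follows. This is a sketch only: sign_ext32to64 mirrors the pseudocode name, dspimul_model is a hypothetical name, and the guard is omitted.

```c
#include <stdint.h>

/* Sign-extend a raw 32-bit pattern to a 64-bit signed value. */
static int64_t sign_ext32to64(uint32_t x)
{
    return (x & 0x80000000u) ? (int64_t)x - INT64_C(4294967296)
                             : (int64_t)x;
}

/* Model of dspimul: full 64-bit product, clipped to the int32 range.
   The product magnitude is at most 2^62, so int64_t cannot overflow. */
uint32_t dspimul_model(uint32_t rsrc1, uint32_t rsrc2)
{
    int64_t temp = sign_ext32to64(rsrc1) * sign_ext32to64(rsrc2);

    if (temp < INT32_MIN)
        return 0x80000000u;
    if (temp > INT32_MAX)
        return 0x7fffffffu;
    return (uint32_t)temp;               /* temp<31:0> */
}
```

The row 0x40000000 × 2 shows why the clip matters: the exact product is 2^31, one more than the largest signed 32-bit value, so the result saturates to 0x7fffffff instead of wrapping to a negative number.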

Figure: dspimul operation. rsrc1 (signed) and rsrc2 (signed) are multiplied into a full-precision 64-bit signed result, which is clipped to [2^31–1..–2^31] and written to rdest (signed).

SEE ALSO

dspiabs dspiadd dspisub

dspuadd dspumul dspusub


dspisub Clipped signed subtract

SYNTAX

[ IF rguard ] dspisub rsrc1 rsrc2 → rdest

FUNCTION

if rguard then {
    temp ← sign_ext32to64(rsrc1) – sign_ext32to64(rsrc2)
    if temp < 0xffffffff80000000 then
        rdest ← 0x80000000
    else if temp > 0x000000007fffffff then
        rdest ← 0x7fffffff
    else
        rdest ← temp<31:0>
}

ATTRIBUTES

Function unit dspalu

Operation code 68

Number of operands 2

Modifier No

Modifier range —

Latency 2

Issue slots 1, 3

DESCRIPTION

As shown below, the dspisub operation computes the difference rsrc1–rsrc2, clips the result into the 32-bit range [2^31–1..–2^31] (or [0x7fffffff..0x80000000]), and stores the clipped value into rdest. All values are signed integers.

The dspisub operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values | Operation | Result
r30 = 0x1200, r40 = 0xff | dspisub r30 r40 → r60 | r60 ← 0x1101
r10 = 0, r30 = 0x1200, r40 = 0xff | IF r10 dspisub r30 r40 → r80 | no change, since guard is false
r20 = 1, r30 = 0x1200, r40 = 0xff | IF r20 dspisub r30 r40 → r100 | r100 ← 0x1101
r50 = 0x7fffffff, r90 = 0xffffffff | dspisub r50 r90 → r110 | r110 ← 0x7fffffff
r70 = 0x80000000, r80 = 1 | dspisub r70 r80 → r120 | r120 ← 0x80000000

Figure: dspisub operation. rsrc2 (signed) is subtracted from rsrc1 (signed) to give a full-precision 33-bit signed result, which is clipped to [2^31–1..–2^31] and written to rdest (signed).

SEE ALSO

dspiabs dspiadd dspimul

dspuadd dspumul dspusub


PNX1300/01/02/11 Data Book Philips Semiconductors

A-31 PRELIMINARY SPECIFICATION

dspuadd Clipped unsigned add

SYNTAX

[ IF rguard ] dspuadd rsrc1 rsrc2 -> rdest

FUNCTION

if rguard then {
    temp <- zero_ext32to64(rsrc1) + zero_ext32to64(rsrc2)
    if (unsigned)temp > 0x00000000ffffffff then
        rdest <- 0xffffffff
    else
        rdest <- temp<31:0>
}

ATTRIBUTES

Function unit dspalu

Operation code 67

Number of operands 2

Modifier No

Modifier range —

Latency 2

Issue slots 1, 3

DESCRIPTION

As shown below, the dspuadd operation computes the unsigned sum rsrc1+rsrc2, clips the result into the unsigned range [2^32-1..0] (or [0xffffffff..0]), and stores the clipped value into rdest.

The dspuadd operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.
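A minimal C sketch of the saturating unsigned add follows. The name dspuadd_model is hypothetical and the guard is not modelled; the sketch only mirrors the FUNCTION pseudocode above.

```c
#include <stdint.h>

/* Hypothetical model of dspuadd on raw 32-bit register values. */
static uint32_t dspuadd_model(uint32_t rsrc1, uint32_t rsrc2)
{
    /* Zero-extend to 64 bits so the carry out of bit 31 is preserved. */
    uint64_t temp = (uint64_t)rsrc1 + (uint64_t)rsrc2;

    /* Any sum above 2^32-1 saturates to 0xffffffff. */
    return temp > 0xffffffffu ? 0xffffffffu : (uint32_t)temp;
}
```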

EXAMPLES

Initial Values Operation Result

r30 = 0x1200, r40 = 0xff dspuadd r30 r40 -> r60 r60 <- 0x12ff

r10 = 0, r30 = 0x1200, r40 = 0xff IF r10 dspuadd r30 r40 -> r80 no change, since guard is false

r20 = 1, r30 = 0x1200, r40 = 0xff IF r20 dspuadd r30 r40 -> r100 r100 <- 0x12ff

r50 = 0xffffffff, r90 = 1 dspuadd r50 r90 -> r110 r110 <- 0xffffffff

r70 = 0x80000001, r80 = 0x7fffffff dspuadd r70 r80 -> r120 r120 <- 0xffffffff

[Figure: rsrc1 and rsrc2 (both unsigned, bits 31..0) are added to give a full-precision 33-bit unsigned result (bits 32..0), which is clipped to [2^32-1..0] and written to rdest (unsigned, bits 31..0).]

SEE ALSO

dspiabs dspiadd dspimul

dspisub dspumul dspusub


dspumul Clipped unsigned multiply

SYNTAX

[ IF rguard ] dspumul rsrc1 rsrc2 -> rdest

FUNCTION

if rguard then {
    temp <- zero_ext32to64(rsrc1) × zero_ext32to64(rsrc2)
    if (unsigned)temp > 0x00000000ffffffff then
        rdest <- 0xffffffff
    else
        rdest <- temp<31:0>
}

ATTRIBUTES

Function unit ifmul

Operation code 142

Number of operands 2

Modifier No

Modifier range —

Latency 3

Issue slots 2, 3

DESCRIPTION

As shown below, the dspumul operation computes the unsigned product rsrc1×rsrc2, clips the result into the unsigned range [2^32-1..0] (or [0xffffffff..0]), and stores the clipped value into rdest.

The dspumul operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.
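The same saturation pattern applies to the product. The sketch below is illustrative only: dspumul_model is a hypothetical name, the guard is omitted, and the 64-bit intermediate stands in for the full-precision result.

```c
#include <stdint.h>

/* Hypothetical model of dspumul on raw 32-bit register values. */
static uint32_t dspumul_model(uint32_t rsrc1, uint32_t rsrc2)
{
    /* A 32x32 unsigned product always fits in 64 bits. */
    uint64_t temp = (uint64_t)rsrc1 * (uint64_t)rsrc2;

    /* Products above 2^32-1 saturate to 0xffffffff. */
    return temp > 0xffffffffu ? 0xffffffffu : (uint32_t)temp;
}
```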

EXAMPLES

Initial Values Operation Result

r30 = 0x10, r40 = 0x20 dspumul r30 r40 -> r60 r60 <- 0x200

r10 = 0, r30 = 0x10, r40 = 0x20 IF r10 dspumul r30 r40 -> r80 no change, since guard is false

r20 = 1, r30 = 0x10, r40 = 0x20 IF r20 dspumul r30 r40 -> r100 r100 <- 0x200

r50 = 0x40000000, r90 = 2 dspumul r50 r90 -> r110 r110 <- 0x80000000

r80 = 0xffffffff dspumul r80 r80 -> r120 r120 <- 0xffffffff

r70 = 0x80000000, r90 = 2 dspumul r70 r90 -> r120 r120 <- 0xffffffff

[Figure: rsrc1 and rsrc2 (both unsigned, bits 31..0) are multiplied to a full-precision 64-bit unsigned result (bits 63..0), which is clipped to [2^32-1..0] and written to rdest (unsigned, bits 31..0).]

SEE ALSO

dspiabs dspiadd dspimul

dspisub dspuadd dspusub


dspuquadaddui Quad clipped add of unsigned/signed bytes

SYNTAX

[ IF rguard ] dspuquadaddui rsrc1 rsrc2 -> rdest

FUNCTION

if rguard then {
    for (i <- 0, m <- 31, n <- 24; i < 4; i <- i + 1, m <- m - 8, n <- n - 8) {
        temp <- zero_ext8to32(rsrc1<m:n>) + sign_ext8to32(rsrc2<m:n>)
        if temp < 0 then
            rdest<m:n> <- 0
        else if temp > 0xff then
            rdest<m:n> <- 0xff
        else rdest<m:n> <- temp<7:0>
    }
}

ATTRIBUTES

Function unit dspalu

Operation code 78

Number of operands 2

Modifier No

Modifier range —

Latency 2

Issue slots 1, 3

DESCRIPTION

As shown below, the dspuquadaddui operation computes four separate sums of the four pairs of corresponding

8-bit bytes of rsrc1 and rsrc2. The bytes in rsrc1 are considered unsigned values; the bytes in rsrc2 are considered

signed. The four sums are clipped into the unsigned range [255..0] (or [0xff..0]); thus, the final byte sums are

unsigned. All computations are performed without loss of precision.

The dspuquadaddui operation optionally takes a guard, specified in rguard. If a guard is present, its LSB

controls the modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not

changed.
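The per-byte mixed-sign addition can be illustrated with a short C model. The name dspuquadaddui_model is hypothetical and the guard is omitted; the loop walks the four byte lanes from most to least significant, as the FUNCTION pseudocode does.

```c
#include <stdint.h>

/* Hypothetical model of dspuquadaddui on raw 32-bit register values. */
static uint32_t dspuquadaddui_model(uint32_t rsrc1, uint32_t rsrc2)
{
    uint32_t rdest = 0;

    for (int shift = 24; shift >= 0; shift -= 8) {
        /* rsrc1 byte is unsigned (0..255). */
        int32_t u = (int32_t)((rsrc1 >> shift) & 0xffu);
        /* rsrc2 byte is signed (-128..127): sign-extend by hand. */
        int32_t s = (int32_t)((rsrc2 >> shift) & 0xffu);
        if (s >= 0x80)
            s -= 0x100;

        int32_t temp = u + s;
        if (temp < 0)
            temp = 0;            /* clip low */
        else if (temp > 0xff)
            temp = 0xff;         /* clip high */

        rdest |= (uint32_t)temp << shift;
    }
    return rdest;
}
```

With the operands from the examples table, 0x9c9c6464 and 0x649c649c, the four lanes evaluate to 256 (clipped to 0xff), 56, 200, and 0, giving 0xff38c800.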

EXAMPLES

Initial Values Operation Result

r30 = 0x02010001, r40 = 0xffffff01 dspuquadaddui r30 r40 -> r50 r50 <- 0x01000002

r10 = 0, r60 = 0x9c9c6464, r70 = 0x649c649c IF r10 dspuquadaddui r60 r70 -> r80 no change, since guard is false

r20 = 1, r60 = 0x9c9c6464, r70 = 0x649c649c IF r20 dspuquadaddui r60 r70 -> r90 r90 <- 0xff38c800

[Figure: the four unsigned bytes of rsrc1 are added to the four corresponding signed bytes of rsrc2, giving four full-precision 10-bit signed sums; each sum is clipped to [255..0] and written to the corresponding unsigned byte of rdest.]

SEE ALSO

dspidualadd


dspusub Clipped unsigned subtract

SYNTAX

[ IF rguard ] dspusub rsrc1 rsrc2 -> rdest

FUNCTION

if rguard then {
    temp <- zero_ext32to64(rsrc1) - zero_ext32to64(rsrc2)
    if (signed)temp < 0 then
        rdest <- 0
    else
        rdest <- temp<31:0>
}

ATTRIBUTES

Function unit dspalu

Operation code 69

Number of operands 2

Modifier No

Modifier range —

Latency 2

Issue slots 1, 3

DESCRIPTION

As shown below, the dspusub operation computes the unsigned difference rsrc1-rsrc2, clips the result into the unsigned range [2^32-1..0] (or [0xffffffff..0]), and stores the clipped value into rdest.

The dspusub operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.
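For the unsigned subtract only the lower bound can be violated, so the model needs a single test. The following C sketch uses the hypothetical name dspusub_model and does not model the guard.

```c
#include <stdint.h>

/* Hypothetical model of dspusub on raw 32-bit register values. */
static uint32_t dspusub_model(uint32_t rsrc1, uint32_t rsrc2)
{
    /* Zero-extended operands fit in int64_t, so the signed difference is exact. */
    int64_t temp = (int64_t)rsrc1 - (int64_t)rsrc2;

    /* A negative difference clips to 0; the upper bound cannot be exceeded. */
    return temp < 0 ? 0u : (uint32_t)temp;
}
```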

EXAMPLES

Initial Values Operation Result

r30 = 0x1200, r40 = 0xff dspusub r30 r40 -> r60 r60 <- 0x1101

r10 = 0, r30 = 0x1200, r40 = 0xff IF r10 dspusub r30 r40 -> r80 no change, since guard is false

r20 = 1, r30 = 0x1200, r40 = 0xff IF r20 dspusub r30 r40 -> r100 r100 <- 0x1101

r50 = 0, r90 = 1 dspusub r50 r90 -> r110 r110 <- 0

r70 = 0x80000001, r80 = 0xffffffff dspusub r70 r80 -> r120 r120 <- 0

[Figure: rsrc2 is subtracted from rsrc1 (both unsigned, bits 31..0) to give a full-precision 33-bit signed result (bits 32..0), which is clipped to [2^32-1..0] and written to rdest (unsigned, bits 31..0).]

SEE ALSO

dspiabs dspiadd dspimul

dspisub dspuadd dspumul


dualasr Dual-16 arithmetic shift right

SYNTAX

[ IF rguard ] dualasr rsrc1 rsrc2 -> rdest

FUNCTION

if rguard then {
    n <- rsrc2<3:0>
    rdest<31:31-n> <- rsrc1<31>
    rdest<30-n:16> <- rsrc1<30:16+n>
    rdest<15:15-n> <- rsrc1<15>
    rdest<14-n:0> <- rsrc1<14:n>
    if rsrc2<31:4> != 0 {
        rdest<31:16> <- rsrc1<31>
        rdest<15:0> <- rsrc1<15>
    }
}

ATTRIBUTES

Function unit shifter

Operation code 102

Number of operands 2

Modifier No

Modifier range -

Latency 1

Issue slots 1,2

DESCRIPTION

The argument rsrc1 contains two 16-bit signed integers, rsrc1<31:16> and rsrc1<15:0>. rsrc2 specifies an unsigned shift amount, and the two 16-bit integers are shifted right by this amount. The sign bits rsrc1<31> and rsrc1<15> are replicated as needed within each 16-bit value from the left. If the rsrc2<31:4> value is not zero, this is treated as a shift by 16 or more, i.e., the sign bit is extended across the whole of each 16-bit result.

The dualasr operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.
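The dual shift can be modelled in C without relying on the compiler's behavior for right shifts of negative values. In this sketch the name dualasr_model is hypothetical and the guard is omitted. It also relies on the observation that an arithmetic shift of a 16-bit value by 15 already leaves only copies of the sign bit, so any shift amount with bits set in rsrc2<31:4> is handled as a shift by 15.

```c
#include <stdint.h>

/* Hypothetical model of dualasr on raw 32-bit register values. */
static uint32_t dualasr_model(uint32_t rsrc1, uint32_t rsrc2)
{
    /* Shifts of 16 or more give the same result as a shift by 15. */
    unsigned n = (rsrc2 >> 4) != 0 ? 15u : (unsigned)(rsrc2 & 0xfu);
    uint32_t rdest = 0;

    for (int half = 0; half < 2; half++) {
        uint32_t v = (rsrc1 >> (16 * half)) & 0xffffu;
        /* For a negative half, set the n vacated upper bits to 1. */
        uint32_t fill = (v & 0x8000u) ? ((0xffffu << (16u - n)) & 0xffffu) : 0u;

        rdest |= ((v >> n) | fill) << (16 * half);
    }
    return rdest;
}
```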

EXAMPLES

Initial Values Operation Result

r30 = 0x70087008, r40 = 0x1 dualasr r30 r40 -> r50 r50 <- 0x38043804

r30 = 0x70087008, r40 = 0x2 dualasr r30 r40 -> r50 r50 <- 0x1c021c02

r10 = 0, r30 = 0x70087008, r40 = 0x2 IF r10 dualasr r30 r40 -> r50 no change, since guard is false

r10 = 1, r30 = 0x70084008, r40 = 0x4 IF r10 dualasr r30 r40 -> r50 r50 <- 0x07000400

r10 = 1, r30 = 0x800c800c, r40 = 0x4 IF r10 dualasr r30 r40 -> r50 r50 <- 0xf800f800

r10 = 1, r30 = 0x700c700c, r40 = 0xf IF r10 dualasr r30 r40 -> r50 r50 <- 0x00000000

r10 = 1, r30 = 0x700c800c, r40 = 0xf IF r10 dualasr r30 r40 -> r50 r50 <- 0x0000ffff

r10 = 1, r30 = 0x800c700c, r40 = 0xf IF r10 dualasr r30 r40 -> r50 r50 <- 0xffff0000

r10 = 1, r30 = 0x800c700c, r40 = 0x10000000 IF r10 dualasr r30 r40 -> r50 r50 <- 0xffff0000

r10 = 1, r30 = 0x800c700c, r40 = 0x10 IF r10 dualasr r30 r40 -> r50 r50 <- 0xffff0000

[Figure: each 16-bit half of rsrc1 passes through a right shifter controlled by n, the four LSBs of rsrc2. In the example shown (n = 3), the top three bits of each intermediate result are copies of that half's sign bit S and the lower 13 bits hold the shifted value; the two halves are combined into rdest.]

SEE ALSO

asl asli asri lsl lsli lsr

lsri rol roli


dualiclipi Dual-16 clip signed to signed

SYNTAX

[ IF rguard ] dualiclipi rsrc1 rsrc2 -> rdest

FUNCTION

if rguard then {
    rdest<31:16> <- min(max(rsrc1<31:16>, -rsrc2<15:0>-1), rsrc2<15:0>)
    rdest<15:0> <- min(max(rsrc1<15:0>, -rsrc2<15:0>-1), rsrc2<15:0>)
}

ATTRIBUTES

Function unit dspalu

Operation code 82

Number of operands 2

Modifier No

Modifier range -

Latency 2

Issue slots 1,3

DESCRIPTION

The argument rsrc1 contains two signed 16-bit integers, rsrc1<31:16> and rsrc1<15:0>. Each integer value is clipped into the signed integer range (-rsrc2-1) to rsrc2. The value in rsrc2 is an unsigned integer and must be between 0 and 0x7fff inclusive.

The dualiclipi operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.
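The symmetric clip on both halves can be sketched in C as follows. The name dualiclipi_model is hypothetical, the guard is omitted, and rsrc2 is assumed to lie in the documented range 0..0x7fff; behavior outside that range is not modelled.

```c
#include <stdint.h>

/* Hypothetical model of dualiclipi on raw 32-bit register values. */
static uint32_t dualiclipi_model(uint32_t rsrc1, uint32_t rsrc2)
{
    int32_t hi_limit = (int32_t)(rsrc2 & 0xffffu);  /* assumed 0..0x7fff */
    int32_t lo_limit = -hi_limit - 1;
    uint32_t rdest = 0;

    for (int half = 0; half < 2; half++) {
        /* Extract one 16-bit half and sign-extend it by hand. */
        int32_t v = (int32_t)((rsrc1 >> (16 * half)) & 0xffffu);
        if (v >= 0x8000)
            v -= 0x10000;

        if (v < lo_limit)
            v = lo_limit;
        if (v > hi_limit)
            v = hi_limit;

        rdest |= ((uint32_t)v & 0xffffu) << (16 * half);
    }
    return rdest;
}
```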

EXAMPLES

Initial Values Operation Result

r30 = 0x00800080, r40 = 0x7f dualiclipi r30 r40 -> r50 r50 <- 0x007f007f

r30 = 0x7fff7fff, r40 = 0x7ffe dualiclipi r30 r40 -> r50 r50 <- 0x7ffe7ffe

r10 = 0, r30 = 0x7fff7fff, r40 = 0x7ffe IF r10 dualiclipi r30 r40 -> r50 no change, since guard is false

r10 = 1, r30 = 0x12345678, r40 = 0xabc IF r10 dualiclipi r30 r40 -> r50 r50 <- 0x0abc0abc

r10 = 1, r30 = 0x80008000, r40 = 0x03ff IF r10 dualiclipi r30 r40 -> r50 r50 <- 0xfc00fc00

r10 = 1, r30 = 0x800003fe, r40 = 0x03ff IF r10 dualiclipi r30 r40 -> r50 r50 <- 0xfc0003fe

r10 = 1, r30 = 0x000f03fe, r40 = 0x03ff IF r10 dualiclipi r30 r40 -> r50 r50 <- 0x000f03fe

SEE ALSO

iclipi uclipi dualuclipi

imin imax quadumax

quadumin


dualuclipi Dual-16 clip signed to unsigned

SYNTAX

[ IF rguard ] dualuclipi rsrc1 rsrc2 -> rdest

FUNCTION

if rguard then {
    rdest<31:16> <- min(max(rsrc1<31:16>, 0), rsrc2<15:0>)
    rdest<15:0> <- min(max(rsrc1<15:0>, 0), rsrc2<15:0>)
}

ATTRIBUTES

Function unit dspalu

Operation code 83

Number of operands 2

Modifier No

Modifier range -

Latency 2

Issue slots 1,3

DESCRIPTION

The argument rsrc1 contains two 16-bit signed integers, rsrc1<31:16> and rsrc1<15:0>. Each integer value is

clipped into the unsigned integer range 0 to rsrc2. The value in rsrc2 contains an unsigned integer and must have the

value between 0 and 0xffff inclusive.

The dualuclipi operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.
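A C sketch of the signed-to-unsigned clip follows. The name dualuclipi_model is hypothetical and the guard is omitted. Each half of rsrc1 is read as a signed 16-bit value, negative values clip to 0, and values above rsrc2<15:0> clip to that limit.

```c
#include <stdint.h>

/* Hypothetical model of dualuclipi on raw 32-bit register values. */
static uint32_t dualuclipi_model(uint32_t rsrc1, uint32_t rsrc2)
{
    int32_t limit = (int32_t)(rsrc2 & 0xffffu);     /* 0..0xffff */
    uint32_t rdest = 0;

    for (int half = 0; half < 2; half++) {
        /* Extract one 16-bit half and sign-extend it by hand. */
        int32_t v = (int32_t)((rsrc1 >> (16 * half)) & 0xffffu);
        if (v >= 0x8000)
            v -= 0x10000;

        if (v < 0)
            v = 0;
        if (v > limit)
            v = limit;

        rdest |= (uint32_t)v << (16 * half);
    }
    return rdest;
}
```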

EXAMPLES

Initial Values Operation Result

r30 = 0x00800080, r40 = 0x7f dualuclipi r30 r40 -> r50 r50 <- 0x007f007f

r30 = 0x7fff7fff, r40 = 0x7ffe dualuclipi r30 r40 -> r50 r50 <- 0x7ffe7ffe

r10 = 0, r30 = 0x7fff7fff, r40 = 0x7ffe IF r10 dualuclipi r30 r40 -> r50 no change, since guard is false

r10 = 1, r30 = 0x12345678, r40 = 0xabc IF r10 dualuclipi r30 r40 -> r50 r50 <- 0x0abc0abc

r10 = 1, r30 = 0x80008000, r40 = 0x03ff IF r10 dualuclipi r30 r40 -> r50 r50 <- 0x00000000

r10 = 1, r30 = 0x800003fe, r40 = 0x03ff IF r10 dualuclipi r30 r40 -> r50 r50 <- 0x000003fe

r10 = 1, r30 = 0x000f03fe, r40 = 0x03ff IF r10 dualuclipi r30 r40 -> r50 r50 <- 0x000f03fe

SEE ALSO

iclipi uclipi dualiclipi

imin imax quadumax

quadumin


fabsval Floating-point absolute value

SYNTAX

[ IF rguard ] fabsval rsrc1 -> rdest

FUNCTION

if rguard then {
    if (float)rsrc1 < 0 then
        rdest <- -(float)rsrc1
    else
        rdest <- (float)rsrc1
}

ATTRIBUTES

Function unit falu

Operation code 115

Number of operands 1

Modifier No

Modifier range —

Latency 3

Issue slots 1, 4

DESCRIPTION

The fabsval operation computes the absolute value of the argument rsrc1 and stores the result into rdest. All

values are in IEEE single-precision floating-point format. If an argument is denormalized, zero is substituted for the

argument before computing the absolute value, and the IFZ flag in the PCSW is set. If fabsval causes an IEEE

exception, the corresponding exception flags in the PCSW are set. The PCSW exception flags are sticky: the flags

can be set as a side-effect of any floating-point operation but can only be reset by an explicit writepcsw operation.

The update of the PCSW exception flags occurs at the same time as rdest is written. If any other floating-point compute operations update the PCSW at the same time, the net result in each exception flag is the logical OR of all simultaneous updates ORed with the existing PCSW value for that exception flag.

The fabsvalflags operation computes the exception flags that would result from an individual fabsval.

The fabsval operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest and the exception flags in PCSW are written;

otherwise, rdest is not changed and the operation does not affect the exception flags in PCSW.

EXAMPLES

Initial Values Operation Result

r30 = 0x40400000 (3.0) fabsval r30 -> r90 r90 <- 0x40400000 (3.0)

r35 = 0xbf800000 (-1.0) fabsval r35 -> r95 r95 <- 0x3f800000 (1.0)

r40 = 0x00400000 (5.877471754e-39) fabsval r40 -> r100 r100 <- 0x0 (+0.0), IFZ set

r45 = 0xffffffff (QNaN) fabsval r45 -> r105 r105 <- 0xffffffff (QNaN)

r50 = 0xffbfffff (SNaN) fabsval r50 -> r110 r110 <- 0xffffffff (QNaN), INV set

r10 = 0, r55 = 0xff7fffff (-3.402823466e+38) IF r10 fabsval r55 -> r115 no change, since guard is false

r20 = 1, r55 = 0xff7fffff (-3.402823466e+38) IF r20 fabsval r55 -> r120 r120 <- 0x7f7fffff (3.402823466e+38)

SEE ALSO

iabs dspiabs dspidualabs

fabsvalflags readpcsw

writepcsw


fabsvalflags IEEE status flags from floating-point absolute value

SYNTAX

[ IF rguard ] fabsvalflags rsrc1 -> rdest

FUNCTION

if rguard then
    rdest <- ieee_flags(abs_val((float)rsrc1))

ATTRIBUTES

Function unit falu

Operation code 116

Number of operands 1

Modifier No

Modifier range —

Latency 3

Issue slots 1, 4

DESCRIPTION

The fabsvalflags operation computes the IEEE exceptions that would result from computing the absolute value of rsrc1 and writes a bit vector representing the exception flags into rdest. The argument value is in IEEE single-precision floating-point format; the result is an integer bit vector. The bit vector stored in rdest has the same format as the IEEE exception bits in the PCSW. The exception flags in PCSW are left unchanged by this operation. If rsrc1 is denormalized, the IFZ bit in the result is set.

The fabsvalflags operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls

the modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values Operation Result

r30 = 0x40400000 (3.0) fabsvalflags r30 -> r90 r90 <- 0x0

r35 = 0xbf800000 (-1.0) fabsvalflags r35 -> r95 r95 <- 0x0

r40 = 0x00400000 (5.877471754e-39) fabsvalflags r40 -> r100 r100 <- 0x20 (IFZ)

r45 = 0xffffffff (QNaN) fabsvalflags r45 -> r105 r105 <- 0x0

r50 = 0xffbfffff (SNaN) fabsvalflags r50 -> r110 r110 <- 0x10 (INV)

r10 = 0, r55 = 0xff7fffff (-3.402823466e+38) IF r10 fabsvalflags r55 -> r115 no change, since guard is false

r20 = 1, r55 = 0xff7fffff (-3.402823466e+38) IF r20 fabsvalflags r55 -> r120 r120 <- 0x0

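[Figure: flag bit vector, bits 31..0. Bits 31:7 are not labelled; bit 6 = OFZ, bit 5 = IFZ, bit 4 = INV, bit 3 = OVF, bit 2 = UNF, bit 1 = INX, bit 0 = DBZ.]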
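The flag vector returned by the flags operations can be decoded with a small table lookup. The sketch below is illustrative: the function name ieee_flag_names and the table are hypothetical, and the bit positions are taken from the layout above (they agree with the example results, e.g. 0x46 for OFZ UNF INX). It uses a static buffer for brevity, so it is not reentrant.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Flag masks, most significant first, per the documented bit layout. */
static const struct { uint32_t mask; const char *name; } k_ieee_flags[] = {
    { 0x40u, "OFZ" }, { 0x20u, "IFZ" }, { 0x10u, "INV" }, { 0x08u, "OVF" },
    { 0x04u, "UNF" }, { 0x02u, "INX" }, { 0x01u, "DBZ" },
};

/* Hypothetical helper: returns the set flags as a space-separated string. */
static const char *ieee_flag_names(uint32_t flags)
{
    static char buf[32];         /* 7 names * 3 chars + 6 spaces + NUL = 28 */

    buf[0] = '\0';
    for (size_t i = 0; i < sizeof k_ieee_flags / sizeof k_ieee_flags[0]; i++) {
        if (flags & k_ieee_flags[i].mask) {
            if (buf[0] != '\0')
                strcat(buf, " ");
            strcat(buf, k_ieee_flags[i].name);
        }
    }
    return buf;
}
```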

SEE ALSO

fabsval faddflags readpcsw


fadd Floating-point add

SYNTAX

[ IF rguard ] fadd rsrc1 rsrc2 -> rdest

FUNCTION

if rguard then
    rdest <- (float)rsrc1 + (float)rsrc2

ATTRIBUTES

Function unit falu

Operation code 22

Number of operands 2

Modifier No

Modifier range —

Latency 3

Issue slots 1, 4

DESCRIPTION

The fadd operation computes the sum rsrc1+rsrc2 and stores the result into rdest. All values are in IEEE single-

precision floating-point format. Rounding is according to the IEEE rounding mode bits in PCSW. If an argument is

denormalized, zero is substituted for the argument before computing the sum, and the IFZ flag in the PCSW is set. If

the result is denormalized, the result is set to zero instead, and the OFZ flag in the PCSW is set. If fadd causes an

IEEE exception, the corresponding exception flags in the PCSW are set. The PCSW exception flags are sticky: the

flags can be set as a side-effect of any floating-point operation but can only be reset by an explicit writepcsw

operation. The update of the PCSW exception flags occurs at the same time as rdest is written. If any other floating-point compute operations update the PCSW at the same time, the net result in each exception flag is the logical OR of all simultaneous updates ORed with the existing PCSW value for that exception flag.

The faddflags operation computes the exception flags that would result from an individual fadd.

The fadd operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest and the exception flags in PCSW are written;

otherwise, rdest is not changed and the operation does not affect the exception flags in PCSW.
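The flush-to-zero handling of denormalized arguments and results can be approximated on a host with IEEE binary32 floats. The sketch below is a partial model under several assumptions: the names fadd_flush_model and struct fadd_result are hypothetical, the host is assumed to use round-to-nearest with denormals enabled, a flushed value is assumed to keep its sign bit, only the IFZ and OFZ flags are reported (OVF, UNF, INX, and INV are not modelled), and NaN bit patterns produced by the host may differ from those of the DSPCPU.

```c
#include <stdint.h>
#include <string.h>

#define FLAG_IFZ 0x20u           /* an input was flushed to zero */
#define FLAG_OFZ 0x40u           /* the output was flushed to zero */

struct fadd_result {
    uint32_t value;              /* bit pattern that would go to rdest */
    uint32_t flags;              /* only IFZ and OFZ are modelled */
};

/* Denormal: exponent field zero, mantissa field nonzero. */
static int is_denormal(uint32_t bits)
{
    return (bits & 0x7f800000u) == 0 && (bits & 0x007fffffu) != 0;
}

/* Hypothetical partial model of fadd on raw 32-bit register values. */
static struct fadd_result fadd_flush_model(uint32_t rsrc1, uint32_t rsrc2)
{
    struct fadd_result r = { 0, 0 };
    float a, b, sum;

    /* Substitute zero for denormalized arguments (sign kept: assumption). */
    if (is_denormal(rsrc1)) { rsrc1 &= 0x80000000u; r.flags |= FLAG_IFZ; }
    if (is_denormal(rsrc2)) { rsrc2 &= 0x80000000u; r.flags |= FLAG_IFZ; }

    memcpy(&a, &rsrc1, sizeof a);
    memcpy(&b, &rsrc2, sizeof b);
    sum = a + b;                 /* host rounding mode applies */
    memcpy(&r.value, &sum, sizeof sum);

    /* Replace a denormalized result by zero. */
    if (is_denormal(r.value)) { r.value &= 0x80000000u; r.flags |= FLAG_OFZ; }
    return r;
}
```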

EXAMPLES

Initial Values Operation Result

r60 = 0xc0400000 (-3.0), r30 = 0x3f800000 (1.0) fadd r60 r30 -> r90 r90 <- 0xc0000000 (-2.0)

r40 = 0x40400000 (3.0), r60 = 0xc0400000 (-3.0) fadd r40 r60 -> r95 r95 <- 0x00000000 (0.0)

r10 = 0, r40 = 0x40400000 (3.0), r80 = 0x00800000 (1.17549435e-38) IF r10 fadd r40 r80 -> r100 no change, since guard is false

r20 = 1, r40 = 0x40400000 (3.0), r80 = 0x00800000 (1.17549435e-38) IF r20 fadd r40 r80 -> r110 r110 <- 0x40400000 (3.0), INX flag set

r40 = 0x40400000 (3.0), r81 = 0x00400000 (5.877471754e-39) fadd r40 r81 -> r111 r111 <- 0x40400000 (3.0), IFZ flag set

r82 = 0x00c00000 (1.763241526e-38), r83 = 0x80800000 (-1.175494351e-38) fadd r82 r83 -> r112 r112 <- 0x00000000 (0.0), OFZ, UNF, INX flags set

r84 = 0x7f800000 (+INF), r85 = 0xff800000 (-INF) fadd r84 r85 -> r113 r113 <- 0xffffffff (QNaN), INV flag set

r70 = 0x7f7fffff (3.402823466e+38) fadd r70 r70 -> r120 r120 <- 0x7f800000 (+INF), OVF, INX flags set

r80 = 0x00800000 (1.175494351e-38) fadd r80 r80 -> r125 r125 <- 0x01000000 (2.350988702e-38)

SEE ALSO

faddflags iadd dspiadd

dspidualadd readpcsw

writepcsw


faddflags IEEE status flags from floating-point add

SYNTAX

[ IF rguard ] faddflags rsrc1 rsrc2 -> rdest

FUNCTION

if rguard then
    rdest <- ieee_flags((float)rsrc1 + (float)rsrc2)

ATTRIBUTES

Function unit falu

Operation code 112

Number of operands 2

Modifier No

Modifier range —

Latency 3

Issue slots 1, 4

DESCRIPTION

The faddflags operation computes the IEEE exceptions that would result from computing the sum rsrc1+rsrc2

and stores a bit vector representing the exception flags into rdest. The argument values are in IEEE single-precision

floating-point format; the result is an integer bit vector. The bit vector stored in rdest has the same format as the IEEE

exception bits in the PCSW. The exception flags in PCSW are left unchanged by this operation. Rounding is

according to the IEEE rounding mode bits in PCSW. If an argument is denormalized, zero is substituted before

computing the sum, and the IFZ bit in the result is set. If the sum would be denormalized, the OFZ bit in the result is

set.

The faddflags operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values Operation Result

r10 = 0x7f7fffff (3.402823466e+38), r20 = 0x3f800000 (1.0) faddflags r10 r20 -> r60 r60 <- 0x2 (INX)

r30 = 0, r10 = 0x7f7fffff (3.402823466e+38) IF r30 faddflags r10 r10 -> r50 no change, since guard is false

r40 = 1, r10 = 0x7f7fffff (3.402823466e+38) IF r40 faddflags r10 r10 -> r70 r70 <- 0xa (OVF INX)

r80 = 0x00a00000 (1.469367939e-38), r81 = 0x80800000 (-1.17549435e-38) faddflags r80 r81 -> r100 r100 <- 0x46 (OFZ UNF INX)

r95 = 0x7f800000 (+INF), r96 = 0xff800000 (-INF) faddflags r95 r96 -> r105 r105 <- 0x10 (INV)

r98 = 0x40400000 (3.0), r99 = 0x00400000 (5.877471754e-39) faddflags r98 r99 -> r111 r111 <- 0x20 (IFZ)

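[Figure: flag bit vector, bits 31..0. Bits 31:7 are not labelled; bit 6 = OFZ, bit 5 = IFZ, bit 4 = INV, bit 3 = OVF, bit 2 = UNF, bit 1 = INX, bit 0 = DBZ.]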

SEE ALSO

fadd fsubflags readpcsw


fdiv Floating-point divide

SYNTAX

[ IF rguard ] fdiv rsrc1 rsrc2 -> rdest

FUNCTION

if rguard then
    rdest <- (float)rsrc1 / (float)rsrc2

ATTRIBUTES

Function unit ftough

Operation code 108

Number of operands 2

Modifier No

Modifier range —

Latency 17

Recovery 16

Issue slots 2

DESCRIPTION

The fdiv operation computes the quotient rsrc1/rsrc2 and stores the result into rdest. All values are in IEEE single-precision floating-point format. Rounding is according to the IEEE rounding mode bits in PCSW. If an argument is denormalized, zero is substituted for the argument before computing the quotient, and the IFZ flag in the PCSW is set. If the result is denormalized, the result is set to zero instead, and the OFZ flag in the PCSW is set. If fdiv causes an IEEE exception, the corresponding exception flags in the PCSW are set. The PCSW exception flags are sticky: the flags can be set as a side-effect of any floating-point operation but can only be reset by an explicit writepcsw operation. The update of the PCSW exception flags occurs at the same time as rdest is written. If any other floating-point compute operations update the PCSW at the same time, the net result in each exception flag is the logical OR of all simultaneous updates ORed with the existing PCSW value for that exception flag.

The fdivflags operation computes the exception flags that would result from an individual fdiv.

The fdiv operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest and the exception flags in PCSW are written;

otherwise, rdest is not changed and the operation does not affect the exception flags in PCSW.

EXAMPLES

Initial Values Operation Result

r60 = 0xc0400000 (-3.0), r30 = 0x3f800000 (1.0) fdiv r60 r30 -> r90 r90 <- 0xc0400000 (-3.0)

r40 = 0x40400000 (3.0), r60 = 0xc0400000 (-3.0) fdiv r40 r60 -> r95 r95 <- 0xbf800000 (-1.0)

r10 = 0, r40 = 0x40400000 (3.0), r80 = 0x00800000 (1.17549435e-38) IF r10 fdiv r40 r80 -> r100 no change, since guard is false

r20 = 1, r40 = 0x40400000 (3.0), r80 = 0x00800000 (1.17549435e-38) IF r20 fdiv r40 r80 -> r110 r110 <- 0x7f400000 (2.552117754e38)

r40 = 0x40400000 (3.0), r81 = 0x00400000 (5.877471754e-39) fdiv r40 r81 -> r111 r111 <- 0x7f800000 (+INF), IFZ, DBZ flags set

r82 = 0x00c00000 (1.763241526e-38), r83 = 0x80800000 (-1.175494351e-38) fdiv r82 r83 -> r112 r112 <- 0xbfc00000 (-1.5)

r84 = 0x7f800000 (+INF), r85 = 0xff800000 (-INF) fdiv r84 r85 -> r113 r113 <- 0xffffffff (QNaN), INV flag set

r70 = 0x7f7fffff (3.402823466e+38) fdiv r70 r70 -> r120 r120 <- 0x3f800000 (1.0)

r80 = 0x00800000 (1.175494351e-38) fdiv r80 r80 -> r125 r125 <- 0x3f800000 (1.0)

r75 = 0x40400000 (3.0), r76 = 0x0 (0.0) fdiv r75 r76 -> r126 r126 <- 0x7f800000 (+INF), DBZ flag set

SEE ALSO

fdivflags readpcsw

writepcsw


fdivflags IEEE status flags from floating-point divide

SYNTAX

[ IF rguard ] fdivflags rsrc1 rsrc2 -> rdest

FUNCTION

if rguard then
    rdest <- ieee_flags((float)rsrc1 / (float)rsrc2)

ATTRIBUTES

Function unit ftough

Operation code 109

Number of operands 2

Modifier No

Modifier range —

Latency 17

Recovery 16

Issue slots 2

DESCRIPTION

The fdivflags operation computes the IEEE exceptions that would result from computing the quotient rsrc1/rsrc2 and stores a bit vector representing the exception flags into rdest. The argument values are in IEEE single-precision floating-point format; the result is an integer bit vector. The bit vector stored in rdest has the same format as the IEEE exception bits in the PCSW. The exception flags in PCSW are left unchanged by this operation. Rounding is according to the IEEE rounding mode bits in PCSW. If an argument is denormalized, zero is substituted before computing the quotient, and the IFZ bit in the result is set. If the quotient would be denormalized, the OFZ bit in the result is set.

The fdivflags operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values Operation Result

r30 = 0x7f7fffff (3.402823466e+38), r40 = 0x3f800000 (1.0) fdivflags r30 r40 -> r100 r100 <- 0

r10 = 0, r50 = 0x7f7fffff (3.402823466e+38), r60 = 0x3e000000 (0.125) IF r10 fdivflags r50 r60 -> r110 no change, since guard is false

r20 = 1, r50 = 0x7f7fffff (3.402823466e+38), r60 = 0x3e000000 (0.125) IF r20 fdivflags r50 r60 -> r111 r111 <- 0xa (OVF INX)

r70 = 0x40400000 (3.0), r80 = 0x00400000 (5.877471754e-39) fdivflags r70 r80 -> r112 r112 <- 0x21 (IFZ DBZ)

r85 = 0x7f800000 (+INF), r86 = 0xff800000 (-INF) fdivflags r85 r86 -> r113 r113 <- 0x10 (INV)

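[Figure: flag bit vector, bits 31..0. Bits 31:7 are not labelled; bit 6 = OFZ, bit 5 = IFZ, bit 4 = INV, bit 3 = OVF, bit 2 = UNF, bit 1 = INX, bit 0 = DBZ.]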

SEE ALSO

fdiv faddflags readpcsw


feql Floating-point compare equal

SYNTAX

[ IF rguard ] feql rsrc1 rsrc2 -> rdest

FUNCTION

if rguard then {
    if (float)rsrc1 = (float)rsrc2 then
        rdest <- 1
    else
        rdest <- 0
}

ATTRIBUTES

Function unit fcomp

Operation code 148

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 3

DESCRIPTION

The feql operation sets the destination register, rdest, to 1 if the first argument, rsrc1, is equal to the second argument, rsrc2; otherwise, rdest is set to 0. The arguments are treated as IEEE single-precision floating-point values; the result is an integer. If an argument is denormalized, zero is substituted for the argument before computing the comparison, and the IFZ flag in the PCSW is set. If feql causes an IEEE exception, the corresponding exception flags in the PCSW are set. The PCSW exception flags are sticky: the flags can be set as a side-effect of any floating-point operation but can only be reset by an explicit writepcsw operation. The update of the PCSW exception flags occurs at the same time as rdest is written. If any other floating-point compute operations update the PCSW at the same time, the net result in each exception flag is the logical OR of all simultaneous updates ORed with the existing PCSW value for that exception flag.

The feqlflags operation computes the exception flags that would result from an individual feql.

The feql operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest and the exception flags in PCSW are written;

otherwise, rdest is not changed and the operation does not affect the exception flags in PCSW.
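The comparison with denormal substitution can be sketched on a host whose float type is IEEE binary32. The names feql_model, feql_sets_ifz, and flush_denormal are hypothetical, the guard is omitted, and a flushed argument is assumed to keep its sign bit (which does not affect equality, since +0 and -0 compare equal). Only the IFZ flag is modelled.

```c
#include <stdint.h>
#include <string.h>

/* Denormal: exponent field zero, mantissa field nonzero. */
static int is_denormal(uint32_t bits)
{
    return (bits & 0x7f800000u) == 0 && (bits & 0x007fffffu) != 0;
}

/* Substitute a (signed) zero for a denormalized argument. */
static uint32_t flush_denormal(uint32_t bits)
{
    return is_denormal(bits) ? (bits & 0x80000000u) : bits;
}

/* Hypothetical model of the value feql writes to rdest. */
static uint32_t feql_model(uint32_t rsrc1, uint32_t rsrc2)
{
    float a, b;

    rsrc1 = flush_denormal(rsrc1);
    rsrc2 = flush_denormal(rsrc2);
    memcpy(&a, &rsrc1, sizeof a);
    memcpy(&b, &rsrc2, sizeof b);

    /* IEEE equality: any comparison involving a NaN is false. */
    return a == b ? 1u : 0u;
}

/* Hypothetical helper: 1 if feql would set IFZ for these arguments. */
static int feql_sets_ifz(uint32_t rsrc1, uint32_t rsrc2)
{
    return is_denormal(rsrc1) || is_denormal(rsrc2);
}
```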

EXAMPLES

Initial Values Operation Result

r30 = 0x40400000 (3.0), r40 = 0 (0.0) feql r30 r40 -> r80 r80 <- 0

r30 = 0x40400000 (3.0) feql r30 r30 -> r90 r90 <- 1

r10 = 0, r60 = 0x3f800000 (1.0), r30 = 0x40400000 (3.0) IF r10 feql r60 r30 -> r100 no change, since guard is false

r20 = 1, r60 = 0x3f800000 (1.0), r30 = 0x40400000 (3.0) IF r20 feql r60 r30 -> r110 r110 <- 0

r30 = 0x40400000 (3.0), r60 = 0x3f800000 (1.0) feql r30 r60 -> r120 r120 <- 0

r30 = 0x40400000 (3.0), r61 = 0xffffffff (QNaN) feql r30 r61 -> r121 r121 <- 0

r50 = 0x7f800000 (+INF), r55 = 0xff800000 (-INF) feql r50 r55 -> r125 r125 <- 0

r60 = 0x3f800000 (1.0), r65 = 0x00400000 (5.877471754e-39) feql r60 r65 -> r126 r126 <- 0, IFZ flag set

r50 = 0x7f800000 (+INF) feql r50 r50 -> r127 r127 <- 1

SEE ALSO

ieql feqlflags fneq

readpcsw writepcsw


feqlflags IEEE status flags from floating-point compare equal

SYNTAX

[ IF rguard ] feqlflags rsrc1 rsrc2 -> rdest

FUNCTION

if rguard then
    rdest <- ieee_flags((float)rsrc1 = (float)rsrc2)

ATTRIBUTES

Function unit fcomp

Operation code 149

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 3

DESCRIPTION

The feqlflags operation computes the IEEE exceptions that would result from computing the comparison rsrc1=rsrc2 and stores a bit vector representing the exception flags into rdest. The argument values are in IEEE single-precision floating-point format; the result is an integer bit vector. The bit vector stored in rdest has the same format as the IEEE exception bits in the PCSW. The exception flags in PCSW are left unchanged by this operation. If an argument is denormalized, zero is substituted before computing the comparison, and the IFZ bit in the result is set.

The feqlflags operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values Operation Result

r30 = 0x40400000 (3.0), r40 = 0 (0.0) feqlflags r30 r40 -> r80 r80 <- 0

r30 = 0x40400000 (3.0) feqlflags r30 r30 -> r90 r90 <- 0

r10 = 0, r60 = 0x3f800000 (1.0), r30 = 0x40400000 (3.0) IF r10 feqlflags r60 r30 -> r100 no change, since guard is false

r20 = 1, r60 = 0x3f800000 (1.0), r30 = 0x40400000 (3.0) IF r20 feqlflags r60 r30 -> r110 r110 <- 0

r30 = 0x40400000 (3.0), r60 = 0x3f800000 (1.0) feqlflags r30 r60 -> r120 r120 <- 0

r30 = 0x40400000 (3.0), r61 = 0xffffffff (QNaN) feqlflags r30 r61 -> r121 r121 <- 0

r50 = 0x7f800000 (+INF), r55 = 0xff800000 (-INF) feqlflags r50 r55 -> r125 r125 <- 0

r60 = 0x3f800000 (1.0), r65 = 0x00400000 (5.877471754e-39) feqlflags r60 r65 -> r126 r126 <- 0x20 (IFZ)

r50 = 0x7f800000 (+INF) feqlflags r50 r50 -> r127 r127 <- 0

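[Figure: flag bit vector, bits 31..0. Bits 31:7 are not labelled; bit 6 = OFZ, bit 5 = IFZ, bit 4 = INV, bit 3 = OVF, bit 2 = UNF, bit 1 = INX, bit 0 = DBZ.]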

SEE ALSO

feql ieql fgtrflags

readpcsw


fgeq Floating-point compare greater or equal

SYNTAX

[ IF rguard ] fgeq rsrc1 rsrc2 -> rdest

FUNCTION

if rguard then {
    if (float)rsrc1 >= (float)rsrc2 then
        rdest <- 1
    else
        rdest <- 0
}

ATTRIBUTES

Function unit fcomp

Operation code 146

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 3

DESCRIPTION

The fgeq operation sets the destination register, rdest, to 1 if the first argument, rsrc1, is greater than or equal to

the second argument, rsrc2; otherwise, rdest is set to 0. The arguments are treated as IEEE single-precision floating-

point values; the result is an integer. If an argument is denormalized, zero is substituted for the argument before

computing the comparison, and the IFZ flag in the PCSW is set. If fgeq causes an IEEE exception, the

corresponding exception flags in the PCSW are set. The PCSW exception flags are sticky: the flags can be set as a

side-effect of any floating-point operation but can only be reset by an explicit writepcsw operation. The update of

the PCSW exception flags occurs at the same time as rdest is written. If any other floating-point compute operations

update the PCSW at the same time, the net result in each exception flag is the logical OR of all simultaneous updates

ORed with the existing PCSW value for that exception flag.

The fgeqflags operation computes the exception flags that would result from an individual fgeq.

The fgeq operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest and the exception flags in PCSW are written;

otherwise, rdest is not changed and the operation does not affect the exception flags in PCSW.
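The behavior described above (guard check, denormal substitution, sticky flag update) can be sketched as a small host-side C model. This is an illustrative sketch only: the names `fgeq_model` and `pcsw_flags` are hypothetical, the flag masks are inferred from the fgeqflags examples, and the host is assumed to use IEEE single precision for `float`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { FLAG_INV = 0x10, FLAG_IFZ = 0x20 };   /* masks inferred from the flags examples */

static uint32_t pcsw_flags;   /* stand-in for the sticky PCSW exception bits */

static int is_nan(uint32_t w)      { return (w & 0x7fffffffu) > 0x7f800000u; }
static int is_denormal(uint32_t w) { return (w & 0x7f800000u) == 0 && (w & 0x007fffffu) != 0; }
static float as_float(uint32_t w)  { float f; memcpy(&f, &w, sizeof f); return f; }

/* Returns the new value of rdest; `rdest` is its value before the operation. */
static uint32_t fgeq_model(uint32_t rguard, uint32_t rsrc1, uint32_t rsrc2, uint32_t rdest)
{
    uint32_t flags = 0, result;

    if ((rguard & 1u) == 0)
        return rdest;                        /* guard false: no register or flag update */

    /* Denormal arguments are replaced by zero and reported through IFZ. */
    if (is_denormal(rsrc1)) { rsrc1 = 0; flags |= FLAG_IFZ; }
    if (is_denormal(rsrc2)) { rsrc2 = 0; flags |= FLAG_IFZ; }

    if (is_nan(rsrc1) || is_nan(rsrc2)) {
        flags |= FLAG_INV;                   /* matches the QNaN example row */
        result = 0;
    } else {
        result = as_float(rsrc1) >= as_float(rsrc2);
    }

    pcsw_flags |= flags;                     /* sticky: bits are only ever ORed in */
    return result;
}
```

Because the model only ORs into `pcsw_flags`, a flag raised by one call stays visible after later calls, which mirrors the sticky behavior stated in the description.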

EXAMPLES

Initial Values Operation Result

r30 = 0x40400000 (3.0), r40 = 0 (0.0) fgeq r30 r40  r80 r80  1

r30 = 0x40400000 (3.0) fgeq r30 r30  r90 r90  1

r10 = 0, r60 = 0x3f800000 (1.0),

r30 = 0x40400000 (3.0) IF r10 fgeq r60 r30  r100 no change, since guard is false

r20 = 1, r60 = 0x3f800000 (1.0),

r30 = 0x40400000 (3.0) IF r20 fgeq r60 r30  r110 r110  0

r30 = 0x40400000 (3.0),

r60 = 0x3f800000 (1.0) fgeq r30 r60  r120 r120  1

r30 = 0x40400000 (3.0),

r61 = 0xffffffff (QNaN) fgeq r30 r61  r121 r121  0, INV flag set

r50 = 0x7f800000 (+INF)

r55 = 0xff800000 (-INF) fgeq r50 r55  r125 r125  1

r60 = 0x3f800000 (1.0),

r65 = 0x00400000 (5.877471754e-39) fgeq r60 r65  r126 r126  1, IFZ flag set

r50 = 0x7f800000 (+INF) fgeq r50 r50  r127 r127  1

SEE ALSO

igeq fgeqflags fgtr

readpcsw writepcsw

fgeq


IEEE status flags from floating-point compare

greater or equal

SYNTAX

[ IF rguard ] fgeqflags rsrc1 rsrc2  rdest

FUNCTION

if rguard then

rdest  ieee_flags((float)rsrc1 >= (float)rsrc2)

ATTRIBUTES

Function unit fcomp

Operation code 147

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 3

DESCRIPTION

The fgeqflags operation computes the IEEE exceptions that would result from computing the comparison

rsrc1>=rsrc2 and stores a bit vector representing the exception flags into rdest. The argument values are in IEEE

single-precision floating-point format; the result is an integer bit vector. The bit vector stored in rdest has the same

format as the IEEE exception bits in the PCSW. The exception flags in PCSW are left unchanged by this operation. If

an argument is denormalized, zero is substituted before computing the comparison, and the IFZ bit in the result is set.

The fgeqflags operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values Operation Result

r30 = 0x40400000 (3.0), r40 = 0 (0.0) fgeqflags r30 r40  r80 r80  0

r30 = 0x40400000 (3.0) fgeqflags r30 r30  r90 r90  0

r10 = 0, r60 = 0x3f800000 (1.0),

r30 = 0x40400000 (3.0) IF r10 fgeqflags r60 r30  r100 no change, since guard is false

r20 = 1, r60 = 0x3f800000 (1.0),

r30 = 0x40400000 (3.0) IF r20 fgeqflags r60 r30  r110 r110  0

r30 = 0x40400000 (3.0),

r60 = 0x3f800000 (1.0) fgeqflags r30 r60  r120 r120  0

r30 = 0x40400000 (3.0),

r61 = 0xffffffff (QNaN) fgeqflags r30 r61  r121 r121  0x10 (INV)

r50 = 0x7f800000 (+INF)

r55 = 0xff800000 (-INF) fgeqflags r50 r55  r125 r125  0

r60 = 0x3f800000 (1.0),

r65 = 0x00400000 (5.877471754e-39) fgeqflags r60 r65  r126 r126  0x20 (IFZ)

r50 = 0x7f800000 (+INF) fgeqflags r50 r50  r127 r127  0

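[Figure: result bit vector, bit 31 = MSB. Bit 6 = OFZ, bit 5 = IFZ, bit 4 = INV, bit 3 = OVF, bit 2 = UNF, bit 1 = INX, bit 0 = DBZ. Bits 31 to 7 hold no exception flags.]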

SEE ALSO

fgeq igeq fgtrflags

readpcsw

fgeqflags


Floating-point compare greater

SYNTAX

[ IF rguard ] fgtr rsrc1 rsrc2  rdest

FUNCTION

if rguard then {

if (float)rsrc1 > (float)rsrc2 then

rdest  1

else

rdest  0

}

ATTRIBUTES

Function unit fcomp

Operation code 144

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 3

DESCRIPTION

The fgtr operation sets the destination register, rdest, to 1 if the first argument, rsrc1, is greater than the second

argument, rsrc2; otherwise, rdest is set to 0. The arguments are treated as IEEE single-precision floating-point values;

the result is an integer. If an argument is denormalized, zero is substituted for the argument before computing the

comparison, and the IFZ flag in the PCSW is set. If fgtr causes an IEEE exception, the corresponding exception

flags in the PCSW are set. The PCSW exception flags are sticky: the flags can be set as a side-effect of any floating-point

operation but can only be reset by an explicit writepcsw operation. The update of the PCSW exception flags

occurs at the same time as rdest is written. If any other floating-point compute operations update the PCSW at the

same time, the net result in each exception flag is the logical OR of all simultaneous updates ORed with the existing

PCSW value for that exception flag.

The fgtrflags operation computes the exception flags that would result from an individual fgtr.

The fgtr operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest and the exception flags in PCSW are written;

otherwise, rdest is not changed and the operation does not affect the exception flags in PCSW.
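The rule for simultaneous updates (each flag becomes the OR of all updates and the existing PCSW value) can be written as a short C sketch. The function name `merge_exception_flags` is hypothetical, and the flag words are plain integers in the layout used by the `...flags` operations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* New sticky-flag state = old state ORed with every flag word raised
   by operations that update the PCSW at the same time. */
static uint32_t merge_exception_flags(uint32_t pcsw_flags,
                                      const uint32_t *raised, size_t count)
{
    for (size_t i = 0; i < count; i++)
        pcsw_flags |= raised[i];
    return pcsw_flags;
}
```

For example, if IFZ (0x20) is already set and two operations raise INV (0x10) and INX (0x02) together, the sketch yields 0x32; no path in it clears a bit, which corresponds to the statement that only an explicit writepcsw resets the flags.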

EXAMPLES

Initial Values Operation Result

r30 = 0x40400000 (3.0), r40 = 0 (0.0) fgtr r30 r40  r80 r80  1

r30 = 0x40400000 (3.0) fgtr r30 r30  r90 r90  0

r10 = 0, r60 = 0x3f800000 (1.0),

r30 = 0x40400000 (3.0) IF r10 fgtr r60 r30  r100 no change, since guard is false

r20 = 1, r60 = 0x3f800000 (1.0),

r30 = 0x40400000 (3.0) IF r20 fgtr r60 r30  r110 r110  0

r30 = 0x40400000 (3.0),

r60 = 0x3f800000 (1.0) fgtr r30 r60  r120 r120  1

r30 = 0x40400000 (3.0),

r61 = 0xffffffff (QNaN) fgtr r30 r61  r121 r121  0, INV flag set

r50 = 0x7f800000 (+INF)

r55 = 0xff800000 (-INF) fgtr r50 r55  r125 r125  1

r60 = 0x3f800000 (1.0),

r65 = 0x00400000 (5.877471754e-39) fgtr r60 r65  r126 r126  1, IFZ flag set

r50 = 0x7f800000 (+INF) fgtr r50 r50  r127 r127  0

SEE ALSO

igtr fgtrflags fgeq

readpcsw writepcsw

fgtr


IEEE status flags from floating-point compare

greater

SYNTAX

[ IF rguard ] fgtrflags rsrc1 rsrc2  rdest

FUNCTION

if rguard then

rdest  ieee_flags((float)rsrc1 > (float)rsrc2)

ATTRIBUTES

Function unit fcomp

Operation code 145

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 3

DESCRIPTION

The fgtrflags operation computes the IEEE exceptions that would result from computing the comparison

rsrc1>rsrc2 and stores a bit vector representing the exception flags into rdest. The argument values are in IEEE

single-precision floating-point format; the result is an integer bit vector. The bit vector stored in rdest has the same

format as the IEEE exception bits in the PCSW. The exception flags in PCSW are left unchanged by this operation. If

an argument is denormalized, zero is substituted before computing the comparison, and the IFZ bit in the result is set.

The fgtrflags operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values Operation Result

r30 = 0x40400000 (3.0), r40 = 0 (0.0) fgtrflags r30 r40  r80 r80  0

r30 = 0x40400000 (3.0) fgtrflags r30 r30  r90 r90  0

r10 = 0, r60 = 0x3f800000 (1.0),

r30 = 0x40400000 (3.0) IF r10 fgtrflags r60 r30  r100 no change, since guard is false

r20 = 1, r60 = 0x3f800000 (1.0),

r30 = 0x40400000 (3.0) IF r20 fgtrflags r60 r30  r110 r110  0

r30 = 0x40400000 (3.0),

r60 = 0x3f800000 (1.0) fgtrflags r30 r60  r120 r120  0

r30 = 0x40400000 (3.0),

r61 = 0xffffffff (QNaN) fgtrflags r30 r61  r121 r121  0x10 (INV)

r50 = 0x7f800000 (+INF)

r55 = 0xff800000 (-INF) fgtrflags r50 r55  r125 r125  0

r60 = 0x3f800000 (1.0),

r65 = 0x00400000 (5.877471754e-39) fgtrflags r60 r65  r126 r126  0x20 (IFZ)

r50 = 0x7f800000 (+INF) fgtrflags r50 r50  r127 r127  0

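[Figure: result bit vector, bit 31 = MSB. Bit 6 = OFZ, bit 5 = IFZ, bit 4 = INV, bit 3 = OVF, bit 2 = UNF, bit 1 = INX, bit 0 = DBZ. Bits 31 to 7 hold no exception flags.]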

SEE ALSO

fgtr igtr fgeqflags

readpcsw

fgtrflags


Floating-point compare less-than or equal

pseudo-op for fgeq

SYNTAX

[ IF rguard ] fleq rsrc1 rsrc2  rdest

FUNCTION

if rguard then {

if (float)rsrc1 <= (float)rsrc2 then

rdest  1

else

rdest  0

}

ATTRIBUTES

Function unit fcomp

Operation code 146

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 3

DESCRIPTION

The fleq operation is a pseudo operation transformed by the scheduler into an fgeq with the arguments

exchanged (fleq’s rsrc1 is fgeq’s rsrc2 and vice versa). (Note: pseudo operations cannot be used in assembly

source files.)
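The argument exchange can be illustrated with a result-only C sketch: a model of fgeq, and fleq expressed as a call with the arguments swapped. The function names are hypothetical, exception flags are left out for brevity, and the host is assumed to use IEEE single precision for `float`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static int is_nan(uint32_t w)      { return (w & 0x7fffffffu) > 0x7f800000u; }
static int is_denormal(uint32_t w) { return (w & 0x7f800000u) == 0 && (w & 0x007fffffu) != 0; }
static float as_float(uint32_t w)  { float f; memcpy(&f, &w, sizeof f); return f; }

/* Result-only model of fgeq: denormals are replaced by zero, NaN compares false. */
static uint32_t fgeq_result(uint32_t rsrc1, uint32_t rsrc2)
{
    if (is_denormal(rsrc1)) rsrc1 = 0;
    if (is_denormal(rsrc2)) rsrc2 = 0;
    if (is_nan(rsrc1) || is_nan(rsrc2))
        return 0;
    return as_float(rsrc1) >= as_float(rsrc2);
}

/* fleq a b is handled as fgeq b a. */
static uint32_t fleq_result(uint32_t rsrc1, uint32_t rsrc2)
{
    return fgeq_result(rsrc2, rsrc1);
}
```

With the register values from the examples, `fleq_result(0x3f800000, 0x40400000)` (1.0 against 3.0) gives 1, and `fleq_result(0x40400000, 0x3f800000)` gives 0.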

The fleq operation sets the destination register, rdest, to 1 if the first argument, rsrc1, is less than or equal to the

second argument, rsrc2; otherwise, rdest is set to 0. The arguments are treated as IEEE single-precision floating-point

values; the result is an integer. If an argument is denormalized, zero is substituted for the argument before computing

the comparison, and the IFZ flag in the PCSW is set. If fleq causes an IEEE exception, the corresponding exception

flags in the PCSW are set. The PCSW exception flags are sticky: the flags can be set as a side-effect of any floating-point

operation but can only be reset by an explicit writepcsw operation. The update of the PCSW exception flags

occurs at the same time as rdest is written. If any other floating-point compute operations update the PCSW at the

same time, the net result in each exception flag is the logical OR of all simultaneous updates ORed with the existing

PCSW value for that exception flag.

The fleqflags operation computes the exception flags that would result from an individual fleq.

The fleq operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest and the exception flags in PCSW are written;

otherwise, rdest is not changed and the operation does not affect the exception flags in PCSW.

EXAMPLES

Initial Values Operation Result

r30 = 0x40400000 (3.0), r40 = 0 (0.0) fleq r30 r40  r80 r80  0

r30 = 0x40400000 (3.0) fleq r30 r30  r90 r90  1

r10 = 0, r60 = 0x3f800000 (1.0),

r30 = 0x40400000 (3.0) IF r10 fleq r60 r30  r100 no change, since guard is false

r20 = 1, r60 = 0x3f800000 (1.0),

r30 = 0x40400000 (3.0) IF r20 fleq r60 r30  r110 r110  1

r30 = 0x40400000 (3.0),

r60 = 0x3f800000 (1.0) fleq r30 r60  r120 r120  0

r30 = 0x40400000 (3.0),

r61 = 0xffffffff (QNaN) fleq r30 r61  r121 r121  0, INV flag set

r50 = 0x7f800000 (+INF)

r55 = 0xff800000 (-INF) fleq r50 r55  r125 r125  0

r60 = 0x3f800000 (1.0),

r65 = 0x00400000 (5.877471754e-39) fleq r60 r65  r126 r126  0, IFZ flag set

r50 = 0x7f800000 (+INF) fleq r50 r50  r127 r127  1

SEE ALSO

ileq fgeq fleqflags

readpcsw writepcsw

fleq


IEEE status flags from floating-point compare

less-than or equal

pseudo-op for fgeqflags

SYNTAX

[ IF rguard ] fleqflags rsrc1 rsrc2  rdest

FUNCTION

if rguard then

rdest  ieee_flags((float)rsrc1 <= (float)rsrc2)

ATTRIBUTES

Function unit fcomp

Operation code 147

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 3

DESCRIPTION

The fleqflags operation is a pseudo operation transformed by the scheduler into an fgeqflags with the

arguments exchanged (fleqflags’s rsrc1 is fgeqflags’s rsrc2 and vice versa). (Note: pseudo operations

cannot be used in assembly source files.)

The fleqflags operation computes the IEEE exceptions that would result from computing the comparison

rsrc1<=rsrc2 and stores a bit vector representing the exception flags into rdest. The argument values are in IEEE

single-precision floating-point format; the result is an integer bit vector. The bit vector stored in rdest has the same

format as the IEEE exception bits in the PCSW. The exception flags in PCSW are left unchanged by this operation. If

an argument is denormalized, zero is substituted before computing the comparison, and the IFZ bit in the result is set.

The fleqflags operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values Operation Result

r30 = 0x40400000 (3.0), r40 = 0 (0.0) fleqflags r30 r40  r80 r80  0

r30 = 0x40400000 (3.0) fleqflags r30 r30  r90 r90  0

r10 = 0, r60 = 0x3f800000 (1.0),

r30 = 0x40400000 (3.0) IF r10 fleqflags r60 r30  r100 no change, since guard is false

r20 = 1, r60 = 0x3f800000 (1.0),

r30 = 0x40400000 (3.0) IF r20 fleqflags r60 r30  r110 r110  0

r30 = 0x40400000 (3.0),

r60 = 0x3f800000 (1.0) fleqflags r30 r60  r120 r120  0

r30 = 0x40400000 (3.0),

r61 = 0xffffffff (QNaN) fleqflags r30 r61  r121 r121  0x10 (INV)

r50 = 0x7f800000 (+INF)

r55 = 0xff800000 (-INF) fleqflags r50 r55  r125 r125  0

r60 = 0x3f800000 (1.0),

r65 = 0x00400000 (5.877471754e-39) fleqflags r60 r65  r126 r126  0x20 (IFZ)

r50 = 0x7f800000 (+INF) fleqflags r50 r50  r127 r127  0

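[Figure: result bit vector, bit 31 = MSB. Bit 6 = OFZ, bit 5 = IFZ, bit 4 = INV, bit 3 = OVF, bit 2 = UNF, bit 1 = INX, bit 0 = DBZ. Bits 31 to 7 hold no exception flags.]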

SEE ALSO

fleq ileq fgeqflags

readpcsw

fleqflags


Floating-point compare less-than

pseudo-op for fgtr

SYNTAX

[ IF rguard ] fles rsrc1 rsrc2  rdest

FUNCTION

if rguard then {

if (float)rsrc1 < (float)rsrc2 then

rdest  1

else

rdest  0

}

ATTRIBUTES

Function unit fcomp

Operation code 144

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 3

DESCRIPTION

The fles operation is a pseudo operation transformed by the scheduler into an fgtr with the arguments

exchanged (fles’s rsrc1 is fgtr’s rsrc2 and vice versa). (Note: pseudo operations cannot be used in assembly

source files.)

The fles operation sets the destination register, rdest, to 1 if the first argument, rsrc1, is less than the second

argument, rsrc2; otherwise, rdest is set to 0. The arguments are treated as IEEE single-precision floating-point values;

the result is an integer. If an argument is denormalized, zero is substituted for the argument before computing the

comparison, and the IFZ flag in the PCSW is set. If fles causes an IEEE exception, the corresponding exception

flags in the PCSW are set. The PCSW exception flags are sticky: the flags can be set as a side-effect of any floating-point

operation but can only be reset by an explicit writepcsw operation. The update of the PCSW exception flags

occurs at the same time as rdest is written. If any other floating-point compute operations update the PCSW at the

same time, the net result in each exception flag is the logical OR of all simultaneous updates ORed with the existing

PCSW value for that exception flag.

The flesflags operation computes the exception flags that would result from an individual fles.

The fles operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest and the exception flags in PCSW are written;

otherwise, rdest is not changed and the operation does not affect the exception flags in PCSW.

EXAMPLES

Initial Values Operation Result

r30 = 0x40400000 (3.0), r40 = 0 (0.0) fles r30 r40  r80 r80  0

r30 = 0x40400000 (3.0) fles r30 r30  r90 r90  0

r10 = 0, r60 = 0x3f800000 (1.0),

r30 = 0x40400000 (3.0) IF r10 fles r60 r30  r100 no change, since guard is false

r20 = 1, r60 = 0x3f800000 (1.0),

r30 = 0x40400000 (3.0) IF r20 fles r60 r30  r110 r110  1

r30 = 0x40400000 (3.0),

r60 = 0x3f800000 (1.0) fles r30 r60  r120 r120  0

r30 = 0x40400000 (3.0),

r61 = 0xffffffff (QNaN) fles r30 r61  r121 r121  0, INV flag set

r50 = 0x7f800000 (+INF)

r55 = 0xff800000 (-INF) fles r50 r55  r125 r125  0

r60 = 0x3f800000 (1.0),

r65 = 0x00400000 (5.877471754e-39) fles r60 r65  r126 r126  0, IFZ flag set

r50 = 0x7f800000 (+INF) fles r50 r50  r127 r127  0

SEE ALSO

iles fgtr flesflags

readpcsw writepcsw

fles


IEEE status flags from floating-point compare

less-than

pseudo-op for fgtrflags

SYNTAX

[ IF rguard ] flesflags rsrc1 rsrc2  rdest

FUNCTION

if rguard then

rdest  ieee_flags((float)rsrc1 < (float)rsrc2)

ATTRIBUTES

Function unit fcomp

Operation code 145

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 3

DESCRIPTION

The flesflags operation is a pseudo operation transformed by the scheduler into an fgtrflags with the

arguments exchanged (flesflags’s rsrc1 is fgtrflags’s rsrc2 and vice versa). (Note: pseudo operations

cannot be used in assembly source files.)

The flesflags operation computes the IEEE exceptions that would result from computing the comparison

rsrc1<rsrc2 and stores a bit vector representing the exception flags into rdest. The argument values are in IEEE

single-precision floating-point format; the result is an integer bit vector. The bit vector stored in rdest has the same

format as the IEEE exception bits in the PCSW. The exception flags in PCSW are left unchanged by this operation. If

an argument is denormalized, zero is substituted before computing the comparison, and the IFZ bit in the result is set.

The flesflags operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values Operation Result

r30 = 0x40400000 (3.0), r40 = 0 (0.0) flesflags r30 r40  r80 r80  0

r30 = 0x40400000 (3.0) flesflags r30 r30  r90 r90  0

r10 = 0, r60 = 0x3f800000 (1.0),

r30 = 0x40400000 (3.0) IF r10 flesflags r60 r30  r100 no change, since guard is false

r20 = 1, r60 = 0x3f800000 (1.0),

r30 = 0x40400000 (3.0) IF r20 flesflags r60 r30  r110 r110  0

r30 = 0x40400000 (3.0),

r60 = 0x3f800000 (1.0) flesflags r30 r60  r120 r120  0

r30 = 0x40400000 (3.0),

r61 = 0xffffffff (QNaN) flesflags r30 r61  r121 r121  0x10 (INV)

r50 = 0x7f800000 (+INF)

r55 = 0xff800000 (-INF) flesflags r50 r55  r125 r125  0

r60 = 0x3f800000 (1.0),

r65 = 0x00400000 (5.877471754e-39) flesflags r60 r65  r126 r126  0x20 (IFZ)

r50 = 0x7f800000 (+INF) flesflags r50 r50  r127 r127  0

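[Figure: result bit vector, bit 31 = MSB. Bit 6 = OFZ, bit 5 = IFZ, bit 4 = INV, bit 3 = OVF, bit 2 = UNF, bit 1 = INX, bit 0 = DBZ. Bits 31 to 7 hold no exception flags.]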

SEE ALSO

fles iles fleqflags

readpcsw

flesflags


Floating-point multiply

SYNTAX

[ IF rguard ] fmul rsrc1 rsrc2  rdest

FUNCTION

if rguard then

rdest  (float)rsrc1 × (float)rsrc2

ATTRIBUTES

Function unit ifmul

Operation code 28

Number of operands 2

Modifier No

Modifier range —

Latency 3

Issue slots 2, 3

DESCRIPTION

The fmul operation computes the product rsrc1 × rsrc2 and stores the result into rdest. All values are in IEEE single-precision

floating-point format. Rounding is according to the IEEE rounding mode bits in PCSW. If an argument is

denormalized, zero is substituted for the argument before computing the product, and the IFZ flag in the PCSW is set.

If the result is denormalized, the result is set to zero instead, and the OFZ flag in the PCSW is set. If fmul causes an

IEEE exception, the corresponding exception flags in the PCSW are set. The PCSW exception flags are sticky: the

flags can be set as a side-effect of any floating-point operation but can only be reset by an explicit writepcsw

operation. The update of the PCSW exception flags occurs at the same time as rdest is written. If any other floating-point

compute operations update the PCSW at the same time, the net result in each exception flag is the logical OR of

all simultaneous updates ORed with the existing PCSW value for that exception flag.

The fmulflags operation computes the exception flags that would result from an individual fmul.

The fmul operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest and the exception flags in PCSW are written;

otherwise, rdest is not changed and the operation does not affect the exception flags in PCSW.
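The substitution rules above (denormal inputs replaced by zero with IFZ, denormal results replaced by zero with OFZ) can be explored with a host-side C sketch. It is a model written to reproduce the example rows of fmul and fmulflags, not a description of the hardware: the names are hypothetical, the flag masks are inferred from the fmulflags results, round-to-nearest is assumed, NaN operands are not covered by the examples and are passed through without flags, and an underflowed result is returned as positive zero because the examples print it as 0.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Flag masks inferred from the fmulflags example rows (0x46, 0x10, 0x20, 0x06, 0x0a). */
enum { FLAG_INX = 0x02, FLAG_UNF = 0x04, FLAG_OVF = 0x08,
       FLAG_INV = 0x10, FLAG_IFZ = 0x20, FLAG_OFZ = 0x40 };

static float    as_float(uint32_t w) { float f; memcpy(&f, &w, sizeof f); return f; }
static uint32_t as_bits(float f)     { uint32_t w; memcpy(&w, &f, sizeof w); return w; }
static int is_denormal(uint32_t w)   { return (w & 0x7f800000u) == 0 && (w & 0x007fffffu) != 0; }

/* Returns the result word and ORs the raised flags into *flags.
   Assumes a host with IEEE single and double precision arithmetic. */
static uint32_t fmul_model(uint32_t rsrc1, uint32_t rsrc2, uint32_t *flags)
{
    /* Denormal arguments become (signed) zero and raise IFZ. */
    if (is_denormal(rsrc1)) { rsrc1 &= 0x80000000u; *flags |= FLAG_IFZ; }
    if (is_denormal(rsrc2)) { rsrc2 &= 0x80000000u; *flags |= FLAG_IFZ; }

    float a = as_float(rsrc1), b = as_float(rsrc2);
    if (isnan(a) || isnan(b))
        return 0xffffffffu;                  /* assumption: NaN in, QNaN out, no flags */
    if ((isinf(a) && b == 0.0f) || (a == 0.0f && isinf(b))) {
        *flags |= FLAG_INV;                  /* INF x 0 */
        return 0xffffffffu;
    }
    if (isinf(a) || isinf(b))
        return as_bits(a * b);               /* INF x nonzero is exact */

    double exact = (double)a * (double)b;    /* 24 x 24 bit product is exact in double */
    float  r     = a * b;                    /* host-rounded single-precision product */

    if (isinf(r)) {                          /* finite operands, infinite result */
        *flags |= FLAG_OVF | FLAG_INX;
        return as_bits(r);
    }
    if (fpclassify(r) == FP_SUBNORMAL) {     /* denormal result: replaced by zero */
        *flags |= FLAG_OFZ | FLAG_UNF | FLAG_INX;
        return 0;
    }
    if (r == 0.0f && exact != 0.0) {         /* too small even for a denormal */
        *flags |= FLAG_UNF | FLAG_INX;
        return 0;
    }
    if ((double)r != exact)
        *flags |= FLAG_INX;
    return as_bits(r);
}
```

The distinction between the OFZ case and the plain UNF case follows the two underflow rows in the examples: 0.5 × 1.17549435e–38 lands in the denormal range and reports OFZ, while the square of 1.17549435e–38 is too small for any denormal and reports only UNF and INX.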

EXAMPLES

Initial Values Operation Result

r60 = 0xc0400000 (–3.0),

r30 = 0x3f800000 (1.0) fmul r60 r30  r90 r90  0xc0400000 (-3.0)

r40 = 0x40400000 (3.0),

r60 = 0xc0400000 (–3.0) fmul r40 r60  r95 r95  0xc1100000 (-9.0)

r10 = 0, r40 = 0x40400000 (3.0),

r80 = 0x00800000 (1.17549435e–38) IF r10 fmul r40 r80  r100 no change, since guard is false

r20 = 1, r40 = 0x40400000 (3.0),

r80 = 0x00800000 (1.17549435e–38) IF r20 fmul r40 r80  r105 r105  0x1400000 (3.52648305e-38)

r41 = 0x3f000000 (0.5),

r80 = 0x00800000 (1.17549435e–38) fmul r41 r80  r110 r110  0x0, OFZ, UNF, INX flags set

r42 = 0x7f800000 (+INF),

r43 = 0x0 (0.0) fmul r42 r43  r106 r106  0xffffffff (QNaN), INV flag set

r40 = 0x40400000 (3.0),

r81 = 0x00400000 (5.877471754e–39) fmul r40 r81  r111 r111  0, IFZ flag set

r82 = 0x00c00000 (1.763241526e–38),

r83 = 0x80800000 (–1.175494351e–38) fmul r82 r83  r112 r112  0, UNF, INX flags set

r84 = 0x7f800000 (+INF),

r85 = 0xff800000 (–INF) fmul r84 r85  r113 r113  0xff800000 (-INF)

r70 = 0x7f7fffff (3.402823466e+38) fmul r70 r70  r120 r120  0x7f800000, OVF, INX flags set

r80 = 0x00800000 (1.175494351e–38) fmul r80 r80  r125 r125  0, UNF, INX flags set

SEE ALSO

imul umul dspimul

dspidualmul fmulflags

readpcsw writepcsw

fmul


IEEE status flags from floating-point multiply

SYNTAX

[ IF rguard ] fmulflags rsrc1 rsrc2  rdest

FUNCTION

if rguard then

rdest  ieee_flags((float)rsrc1 × (float)rsrc2)

ATTRIBUTES

Function unit ifmul

Operation code 143

Number of operands 2

Modifier No

Modifier range —

Latency 3

Issue slots 2, 3

DESCRIPTION

The fmulflags operation computes the IEEE exceptions that would result from computing the product

rsrc1 × rsrc2 and stores a bit vector representing the exception flags into rdest. The argument values are in IEEE

single-precision floating-point format; the result is an integer bit vector. The bit vector stored in rdest has the same

format as the IEEE exception bits in the PCSW. The exception flags in PCSW are left unchanged by this operation.

Rounding is according to the IEEE rounding mode bits in PCSW. If an argument is denormalized, zero is substituted

before computing the product, and the IFZ bit in the result is set. If the product would be denormalized, the OFZ bit in

the result is set.

The fmulflags operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values Operation Result

r60 = 0xc0400000 (–3.0),

r30 = 0x3f800000 (1.0) fmulflags r60 r30  r90 r90  0

r40 = 0x40400000 (3.0),

r60 = 0xc0400000 (–3.0) fmulflags r40 r60  r95 r95  0

r10 = 0, r40 = 0x40400000 (3.0),

r80 = 0x00800000 (1.17549435e–38) IF r10 fmulflags r40 r80  r100 no change, since guard is false

r20 = 1, r40 = 0x40400000 (3.0),

r80 = 0x00800000 (1.17549435e–38) IF r20 fmulflags r40 r80  r105 r105  0

r41 = 0x3f000000 (0.5),

r80 = 0x00800000 (1.17549435e–38) fmulflags r41 r80  r110 r110  0x46 (OFZ UNF INX)

r42 = 0x7f800000 (+INF),

r43 = 0x0 (0.0) fmulflags r42 r43  r106 r106  0x10 (INV)

r40 = 0x40400000 (3.0),

r81 = 0x00400000 (5.877471754e–39) fmulflags r40 r81  r111 r111  0x20 (IFZ)

r82 = 0x00c00000 (1.763241526e–38),

r83 = 0x80800000 (–1.175494351e–38) fmulflags r82 r83  r112 r112  0x06 (UNF INX)

r84 = 0x7f800000 (+INF),

r85 = 0xff800000 (–INF) fmulflags r84 r85  r113 r113  0

r70 = 0x7f7fffff (3.402823466e+38) fmulflags r70 r70  r120 r120  0x0a (OVF INX)

r80 = 0x00800000 (1.175494351e–38) fmulflags r80 r80  r125 r125  0x06 (UNF INX)

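[Figure: result bit vector, bit 31 = MSB. Bit 6 = OFZ, bit 5 = IFZ, bit 4 = INV, bit 3 = OVF, bit 2 = UNF, bit 1 = INX, bit 0 = DBZ. Bits 31 to 7 hold no exception flags.]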

SEE ALSO

fmul faddflags readpcsw

fmulflags


Floating-point compare not equal

SYNTAX

[ IF rguard ] fneq rsrc1 rsrc2  rdest

FUNCTION

if rguard then {

if (float)rsrc1 != (float)rsrc2 then

rdest  1

else

rdest  0

}

ATTRIBUTES

Function unit fcomp

Operation code 150

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 3

DESCRIPTION

The fneq operation sets the destination register, rdest, to 1 if the first argument, rsrc1, is not equal to the second

argument, rsrc2; otherwise, rdest is set to 0. The arguments are treated as IEEE single-precision floating-point values;

the result is an integer. If an argument is denormalized, zero is substituted for the argument before computing the

comparison, and the IFZ flag in the PCSW is set. If fneq causes an IEEE exception, the corresponding exception

flags in the PCSW are set. The PCSW exception flags are sticky: the flags can be set as a side-effect of any floating-point

operation but can only be reset by an explicit writepcsw operation. The update of the PCSW exception flags

occurs at the same time as rdest is written. If any other floating-point compute operations update the PCSW at the

same time, the net result in each exception flag is the logical OR of all simultaneous updates ORed with the existing

PCSW value for that exception flag.

The fneqflags operation computes the exception flags that would result from an individual fneq.

The fneq operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest and the exception flags in PCSW are written;

otherwise, rdest is not changed and the operation does not affect the exception flags in PCSW.
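A host-side C sketch of the comparison is given here mainly to highlight one point: the QNaN row in the examples for fneq lists a result of 0 with no flag, whereas the C `!=` operator yields 1 when either operand is a NaN. The sketch follows the example row and should be checked against hardware before relying on it. The name `fneq_model` is hypothetical, the IFZ mask is inferred from the fneqflags examples, and signaling NaNs are not modeled.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { FLAG_IFZ = 0x20 };   /* mask inferred from the fneqflags examples */

static int is_nan(uint32_t w)      { return (w & 0x7fffffffu) > 0x7f800000u; }
static int is_denormal(uint32_t w) { return (w & 0x7f800000u) == 0 && (w & 0x007fffffu) != 0; }
static float as_float(uint32_t w)  { float f; memcpy(&f, &w, sizeof f); return f; }

/* Returns the comparison result and ORs raised flags into *flags. */
static uint32_t fneq_model(uint32_t rsrc1, uint32_t rsrc2, uint32_t *flags)
{
    if (is_denormal(rsrc1)) { rsrc1 = 0; *flags |= FLAG_IFZ; }
    if (is_denormal(rsrc2)) { rsrc2 = 0; *flags |= FLAG_IFZ; }

    if (is_nan(rsrc1) || is_nan(rsrc2))
        return 0;   /* follows the QNaN example row; a host != would yield 1 */

    return as_float(rsrc1) != as_float(rsrc2);
}
```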

EXAMPLES

Initial Values Operation Result

r30 = 0x40400000 (3.0), r40 = 0 (0.0) fneq r30 r40  r80 r80  1

r30 = 0x40400000 (3.0) fneq r30 r30  r90 r90  0

r10 = 0, r60 = 0x3f800000 (1.0),

r30 = 0x40400000 (3.0) IF r10 fneq r60 r30  r100 no change, since guard is false

r20 = 1, r60 = 0x3f800000 (1.0),

r30 = 0x40400000 (3.0) IF r20 fneq r60 r30  r110 r110  1

r30 = 0x40400000 (3.0),

r60 = 0x3f800000 (1.0) fneq r30 r60  r120 r120  1

r30 = 0x40400000 (3.0),

r61 = 0xffffffff (QNaN) fneq r30 r61  r121 r121  0

r50 = 0x7f800000 (+INF)

r55 = 0xff800000 (-INF) fneq r50 r55  r125 r125  1

r60 = 0x3f800000 (1.0),

r65 = 0x00400000 (5.877471754e-39) fneq r60 r65  r126 r126  1, IFZ flag set

r50 = 0x7f800000 (+INF) fneq r50 r50  r127 r127  0

SEE ALSO

ineq feql fneqflags

readpcsw writepcsw

fneq


IEEE status flags from floating-point compare

not equal

SYNTAX

[ IF rguard ] fneqflags rsrc1 rsrc2  rdest

FUNCTION

if rguard then

rdest  ieee_flags((float)rsrc1 != (float)rsrc2)

ATTRIBUTES

Function unit fcomp

Operation code 151

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 3

DESCRIPTION

The fneqflags operation computes the IEEE exceptions that would result from computing the comparison

rsrc1!=rsrc2 and stores a bit vector representing the exception flags into rdest. The argument values are in IEEE

single-precision floating-point format; the result is an integer bit vector. The bit vector stored in rdest has the same

format as the IEEE exception bits in the PCSW. The exception flags in PCSW are left unchanged by this operation. If

an argument is denormalized, zero is substituted before computing the comparison, and the IFZ bit in the result is set.

The fneqflags operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values Operation Result

r30 = 0x40400000 (3.0), r40 = 0 (0.0) fneqflags r30 r40  r80 r80  0

r30 = 0x40400000 (3.0) fneqflags r30 r30  r90 r90  0

r10 = 0, r60 = 0x3f800000 (1.0),

r30 = 0x40400000 (3.0) IF r10 fneqflags r60 r30  r100 no change, since guard is false

r20 = 1, r60 = 0x3f800000 (1.0),

r30 = 0x40400000 (3.0) IF r20 fneqflags r60 r30  r110 r110  0

r30 = 0x40400000 (3.0),

r60 = 0x3f800000 (1.0) fneqflags r30 r60  r120 r120  0

r30 = 0x40400000 (3.0),

r61 = 0xffffffff (QNaN) fneqflags r30 r61  r121 r121  0

r50 = 0x7f800000 (+INF)

r55 = 0xff800000 (-INF) fneqflags r50 r55  r125 r125  0

r60 = 0x3f800000 (1.0),

r65 = 0x00400000 (5.877471754e-39) fneqflags r60 r65  r126 r126  0x20 (IFZ)

r50 = 0x7f800000 (+INF) fneqflags r50 r50  r127 r127  0

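[Figure: result bit vector, bit 31 = MSB. Bit 6 = OFZ, bit 5 = IFZ, bit 4 = INV, bit 3 = OVF, bit 2 = UNF, bit 1 = INX, bit 0 = DBZ. Bits 31 to 7 hold no exception flags.]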

SEE ALSO

fneq ineq fleqflags

readpcsw

fneqflags


Sign of floating-point value

SYNTAX

[ IF rguard ] fsign rsrc1  rdest

FUNCTION

if rguard then {

if (float)rsrc1 = 0.0 then

rdest  0

else if (float)rsrc1 < 0.0 then

rdest  0xffffffff

else

rdest  1

}

ATTRIBUTES

Function unit fcomp

Operation code 152

Number of operands 1

Modifier No

Modifier range —

Latency 1

Issue slots 3

DESCRIPTION

The fsign operation sets the destination register, rdest, to either 0, 1, or –1 depending on the sign of the argument

in rsrc1. rdest is set to 0 if rsrc1 is equal to zero, to 1 if rsrc1 is positive, or to –1 if rsrc1 is negative. The argument is

treated as an IEEE single-precision floating-point value; the result is an integer. If the argument is denormalized, zero

is substituted before computing the comparison, and the IFZ flag in the PCSW is set; thus, the result of fsign for a

denormalized argument is 0. If fsign causes an IEEE exception, the corresponding exception flags in the PCSW

are set. The PCSW exception flags are sticky: the flags can be set as a side-effect of any floating-point operation but

can only be reset by an explicit writepcsw operation. The update of the PCSW exception flags occurs at the same

time as rdest is written. If any other floating-point compute operations update the PCSW at the same time, the net

result in each exception flag is the logical OR of all simultaneous updates ORed with the existing PCSW value for that

exception flag.

The fsignflags operation computes the exception flags that would result from an individual fsign.

The fsign operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest and the exception flags in PCSW are written;

otherwise, rdest is not changed and the operation does not affect the exception flags in PCSW.

EXAMPLES

Initial Values Operation Result

r30 = 0x40400000 (3.0) fsign r30 → r100 r100 ← 1

r40 = 0xbf800000 (-1.0) fsign r40 → r105 r105 ← 0xffffffff (-1)

r50 = 0x80800000 (-1.175494351e-38) fsign r50 → r110 r110 ← 0xffffffff (-1)

r60 = 0x80400000 (-5.877471754e-39) fsign r60 → r115 r115 ← 0, IFZ flag set

r10 = 0, r70 = 0xffffffff (QNaN) IF r10 fsign r70 → r116 no change, since guard is false

r20 = 1, r70 = 0xffffffff (QNaN) IF r20 fsign r70 → r117 r117 ← 0, INV flag set

r80 = 0xff800000 (-INF) fsign r80 → r120 r120 ← 0xffffffff (-1)
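
The following C fragment is a minimal behavioral sketch of fsign, not the hardware implementation. It works on the raw 32-bit pattern so that denormals and NaNs can be classified explicitly. The name fsign_model and the FLAG_ constants are hypothetical; the flag values are taken from the fsignflags examples (IFZ = 0x20, INV = 0x10), and the NaN handling (result 0, INV raised) is inferred from the example rows above rather than from the FUNCTION pseudocode.

```c
#include <stdint.h>

#define FLAG_IFZ 0x20u  /* denormalized input flushed to zero */
#define FLAG_INV 0x10u  /* invalid operation (NaN argument)    */

/* Returns the fsign result for an IEEE single-precision bit pattern
 * and ORs any raised exception flags into *flags (sticky behavior). */
static int32_t fsign_model(uint32_t bits, uint32_t *flags)
{
    uint32_t exponent = (bits >> 23) & 0xffu;
    uint32_t mantissa = bits & 0x007fffffu;

    if (exponent == 0xffu && mantissa != 0u) {  /* NaN: assumed from examples */
        *flags |= FLAG_INV;
        return 0;
    }
    if (exponent == 0u) {                        /* zero or denormal */
        if (mantissa != 0u)
            *flags |= FLAG_IFZ;                  /* denormal treated as zero */
        return 0;
    }
    return (bits & 0x80000000u) ? -1 : 1;        /* normal number or infinity */
}
```

Passing the same flags variable to several calls accumulates the flags, which mirrors the sticky PCSW behavior described above.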

SEE ALSO

fsignflags readpcsw

writepcsw

fsignflags

IEEE status flags from floating-point sign

SYNTAX

[ IF rguard ] fsignflags rsrc1 → rdest

FUNCTION

if rguard then
    rdest ← ieee_flags(sign((float)rsrc1))

ATTRIBUTES

Function unit fcomp

Operation code 153

Number of operands 1

Modifier No

Modifier range —

Latency 1

Issue slots 3

DESCRIPTION

The fsignflags operation computes the IEEE exceptions that would result from computing the sign of rsrc1 and

stores a bit vector representing the exception flags into rdest. The argument value is in IEEE single-precision floating-point

format; the result is an integer bit vector. The bit vector stored in rdest has the same format as the IEEE

exception bits in the PCSW. The exception flags in PCSW are left unchanged by this operation. If the argument is

denormalized, zero is substituted before computing the sign, and the IFZ bit in the result is set.

The fsignflags operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values Operation Result

r30 = 0x40400000 (3.0) fsignflags r30 → r100 r100 ← 0

r40 = 0xbf800000 (-1.0) fsignflags r40 → r105 r105 ← 0

r50 = 0x80800000 (-1.175494351e-38) fsignflags r50 → r110 r110 ← 0

r60 = 0x80400000 (-5.877471754e-39) fsignflags r60 → r115 r115 ← 0x20 (IFZ)

r10 = 0, r70 = 0xffffffff (QNaN) IF r10 fsignflags r70 → r116 no change, since guard is false

r20 = 1, r70 = 0xffffffff (QNaN) IF r20 fsignflags r70 → r117 r117 ← 0x10 (INV)

r80 = 0xff800000 (-INF) fsignflags r80 → r120 r120 ← 0

Exception flag bit vector: bit 6 = OFZ, bit 5 = IFZ, bit 4 = INV, bit 3 = OVF, bit 2 = UNF, bit 1 = INX, bit 0 = DBZ.
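
As an illustration of how a bit vector returned by one of the ...flags operations might be decoded on a host, the sketch below turns it into a list of flag names. The function name ieee_flag_names is hypothetical. The values for OFZ, IFZ, INV, OVF, UNF and INX follow the example results in this appendix (0x40, 0x20, 0x10, 0x8, 0x4, 0x2); placing DBZ at bit 0 is an assumption based on the order of the flag names in the figure.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Writes a comma-separated list of the flags set in v into buf
 * (capacity n bytes) and returns buf. Output is truncated if buf
 * is too small. */
static const char *ieee_flag_names(uint32_t v, char *buf, size_t n)
{
    /* Index = bit position; DBZ at bit 0 is an assumption. */
    static const char *const names[7] = {
        "DBZ", "INX", "UNF", "OVF", "INV", "IFZ", "OFZ"
    };
    size_t used = 0;

    if (n == 0)
        return buf;
    buf[0] = '\0';
    for (int bit = 6; bit >= 0; --bit) {
        if (v & (1u << bit)) {
            int w = snprintf(buf + used, n - used, "%s%s",
                             used ? "," : "", names[bit]);
            if (w < 0 || (size_t)w >= n - used)
                break;                      /* truncated: stop here */
            used += (size_t)w;
        }
    }
    return buf;
}
```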

SEE ALSO

fsign readpcsw

fsqrt

Floating-point square root

SYNTAX

[ IF rguard ] fsqrt rsrc1 → rdest

FUNCTION

if rguard then
    rdest ← square_root(rsrc1)

ATTRIBUTES

Function unit ftough

Operation code 110

Number of operands 1

Modifier No

Modifier range —

Latency 17

Recovery 16

Issue slots 2

DESCRIPTION

The fsqrt operation computes the square root of rsrc1 and stores the result into rdest. All values are in IEEE

single-precision floating-point format. Rounding is according to the IEEE rounding mode bits in PCSW. If an argument

is denormalized, zero is substituted for the argument before computing the square root, and the IFZ flag in the PCSW

is set. If the result is denormalized, the result is set to zero instead, and the OFZ flag in the PCSW is set. If fsqrt

causes an IEEE exception, the corresponding exception flags in the PCSW are set. The PCSW exception flags are

sticky: the flags can be set as a side-effect of any floating-point operation but can only be reset by an explicit

writepcsw operation. The update of the PCSW exception flags occurs at the same time as rdest is written. If any

other floating-point compute operations update the PCSW at the same time, the net result in each exception flag is the

logical OR of all simultaneous updates ORed with the existing PCSW value for that exception flag.

The fsqrtflags operation computes the exception flags that would result from an individual fsqrt.

The fsqrt operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest and the exception flags in PCSW are written;

otherwise, rdest is not changed and the operation does not affect the exception flags in PCSW.
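
The sticky update rule described above can be sketched as a simple OR-reduction. This is an illustrative model only: merge_exception_flags and IEEE_FLAG_MASK are hypothetical names, and the function operates on the 7-bit flag vector in the format returned by the ...flags operations, not on the full PCSW layout (which is not described in this section).

```c
#include <stdint.h>

#define IEEE_FLAG_MASK 0x7fu  /* seven exception flags, bits 6..0 */

/* Combines the existing flag state with the flag vectors produced by
 * all floating-point operations that complete in the same cycle.
 * A flag, once set, stays set; only an explicit write clears it. */
static uint32_t merge_exception_flags(uint32_t existing,
                                      const uint32_t *updates, int count)
{
    uint32_t merged = existing & IEEE_FLAG_MASK;
    for (int i = 0; i < count; ++i)
        merged |= updates[i] & IEEE_FLAG_MASK;
    return merged;
}
```

For example, if IFZ (0x20) is already set and two simultaneous operations raise INX (0x2) and INV (0x10), the merged value is 0x32.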

EXAMPLES

Initial Values Operation Result

r60 = 0xc0400000 (-3.0) fsqrt r60 → r90 r90 ← 0xffffffff (QNaN), INV flag set

r40 = 0x40400000 (3.0) fsqrt r40 → r95 r95 ← 0x3fddb3d7 (1.732051), INX flag set

r10 = 0, r40 = 0x40400000 (3.0) IF r10 fsqrt r40 → r100 no change, since guard is false

r20 = 1, r40 = 0x40400000 (3.0) IF r20 fsqrt r40 → r110 r110 ← 0x3fddb3d7 (1.732051), INX flag set

r82 = 0x00c00000 (1.763241526e-38) fsqrt r82 → r112 r112 ← 0x201cc471 (1.32787105e-19), INX flag set

r84 = 0x7f800000 (+INF) fsqrt r84 → r113 r113 ← 0x7f800000 (+INF)

r70 = 0x7f7fffff (3.402823466e+38) fsqrt r70 → r120 r120 ← 0x5f7fffff (1.8446743e19), INX flag set

r80 = 0x00400000 (5.877471754e-39) fsqrt r80 → r125 r125 ← 0, IFZ flag set

SEE ALSO

fsqrtflags readpcsw

writepcsw

fsqrtflags

IEEE status flags from floating-point square root

SYNTAX

[ IF rguard ] fsqrtflags rsrc1 → rdest

FUNCTION

if rguard then
    rdest ← ieee_flags(square_root((float)rsrc1))

ATTRIBUTES

Function unit ftough

Operation code 111

Number of operands 1

Modifier No

Modifier range —

Latency 17

Recovery 16

Issue slots 2

DESCRIPTION

The fsqrtflags operation computes the IEEE exceptions that would result from computing the square root of

rsrc1 and stores a bit vector representing the exception flags into rdest. The argument value is in IEEE single-

precision floating-point format; the result is an integer bit vector. The bit vector stored in rdest has the same format as

the IEEE exception bits in the PCSW. The exception flags in PCSW are left unchanged by this operation. Rounding is

according to the IEEE rounding mode bits in PCSW. If the argument is denormalized, zero is substituted before

computing the square root, and the IFZ bit in the result is set. If the result would be denormalized, the OFZ bit in the

result is set.

The fsqrtflags operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values Operation Result

r60 = 0xc0400000 (-3.0) fsqrtflags r60 → r90 r90 ← 0x10 (INV)

r40 = 0x40400000 (3.0) fsqrtflags r40 → r95 r95 ← 0x2 (INX)

r10 = 0, r40 = 0x40400000 (3.0) IF r10 fsqrtflags r40 → r100 no change, since guard is false

r20 = 1, r40 = 0x40400000 (3.0) IF r20 fsqrtflags r40 → r110 r110 ← 0x2 (INX)

r82 = 0x00c00000 (1.763241526e-38) fsqrtflags r82 → r112 r112 ← 0x2 (INX)

r84 = 0x7f800000 (+INF) fsqrtflags r84 → r113 r113 ← 0

r70 = 0x7f7fffff (3.402823466e+38) fsqrtflags r70 → r120 r120 ← 0x2 (INX)

r80 = 0x00400000 (5.877471754e-39) fsqrtflags r80 → r125 r125 ← 0x20 (IFZ)

Exception flag bit vector: bit 6 = OFZ, bit 5 = IFZ, bit 4 = INV, bit 3 = OVF, bit 2 = UNF, bit 1 = INX, bit 0 = DBZ.

SEE ALSO

fsqrt readpcsw

fsub

Floating-point subtract

SYNTAX

[ IF rguard ] fsub rsrc1 rsrc2 → rdest

FUNCTION

if rguard then
    rdest ← (float)rsrc1 – (float)rsrc2

ATTRIBUTES

Function unit falu

Operation code 113

Number of operands 2

Modifier No

Modifier range —

Latency 3

Issue slots 1, 4

DESCRIPTION

The fsub operation computes the difference rsrc1–rsrc2 and writes the result into rdest. All values are in IEEE

single-precision floating-point format. Rounding is according to the IEEE rounding mode bits in PCSW. If an argument

is denormalized, zero is substituted for the argument before computing the difference, and the IFZ flag in the PCSW is

set. If the result is denormalized, the result is set to zero instead, and the OFZ flag in the PCSW is set. If fsub causes

an IEEE exception, the corresponding exception flags in the PCSW are set. The PCSW exception flags are sticky: the

flags can be set as a side-effect of any floating-point operation but can only be reset by an explicit writepcsw

operation. The update of the PCSW exception flags occurs at the same time as rdest is written. If any other floating-point

compute operations update the PCSW at the same time, the net result in each exception flag is the logical OR of

all simultaneous updates ORed with the existing PCSW value for that exception flag.

The fsubflags operation computes the exception flags that would result from an individual fsub.

The fsub operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest and the exception flags in PCSW are written;

otherwise, rdest is not changed and the operation does not affect the exception flags in PCSW.

EXAMPLES

Initial Values Operation Result

r60 = 0xc0400000 (-3.0), r30 = 0x3f800000 (1.0) fsub r60 r30 → r90 r90 ← 0xc0800000 (-4.0)

r40 = 0x40400000 (3.0), r60 = 0xc0400000 (-3.0) fsub r40 r60 → r95 r95 ← 0x40c00000 (6.0)

r10 = 0, r40 = 0x40400000 (3.0), r80 = 0x00800000 (1.17549435e-38) IF r10 fsub r40 r80 → r100 no change, since guard is false

r20 = 1, r40 = 0x40400000 (3.0), r80 = 0x00800000 (1.17549435e-38) IF r20 fsub r40 r80 → r110 r110 ← 0x40400000 (3.0), INX flag set

r40 = 0x40400000 (3.0), r81 = 0x00400000 (5.877471754e-39) fsub r40 r81 → r111 r111 ← 0x40400000 (3.0), IFZ flag set

r82 = 0x00c00000 (1.763241526e-38), r83 = 0x00800000 (1.175494351e-38) fsub r82 r83 → r112 r112 ← 0x0, OFZ, UNF and INX flags set

r84 = 0x7f800000 (+INF), r85 = 0x7f800000 (+INF) fsub r84 r85 → r113 r113 ← 0xffffffff (QNaN), INV flag set

r70 = 0x7f7fffff (3.402823466e+38), r86 = 0xff7fffff (-3.402823466e+38) fsub r70 r86 → r120 r120 ← 0x7f800000 (+INF), OVF and INX flags set

r87 = 0xffffffff (QNaN), r30 = 0x3f800000 (1.0) fsub r87 r30 → r125 r125 ← 0xffffffff (QNaN)

r87 = 0xffbfffff (SNaN), r30 = 0x3f800000 (1.0) fsub r87 r30 → r125 r125 ← 0xffffffff (QNaN), INV flag set

r83 = 0x00800001 (1.175494421e-38), r89 = 0x00800000 (1.175494351e-38) fsub r83 r89 → r126 r126 ← 0x0, OFZ, UNF and INX flags set

SEE ALSO

fsubflags isub dspisub

dspidualsub readpcsw

writepcsw

fsubflags

IEEE status flags from floating-point subtract

SYNTAX

[ IF rguard ] fsubflags rsrc1 rsrc2 → rdest

FUNCTION

if rguard then
    rdest ← ieee_flags((float)rsrc1 – (float)rsrc2)

ATTRIBUTES

Function unit falu

Operation code 114

Number of operands 2

Modifier No

Modifier range —

Latency 3

Issue slots 1, 4

DESCRIPTION

The fsubflags operation computes the IEEE exceptions that would result from computing the difference rsrc1–

rsrc2 and writes a bit vector representing the exception flags into rdest. The argument values are in IEEE single-

precision floating-point format; the result is an integer bit vector. The bit vector stored in rdest has the same format as

the IEEE exception bits in the PCSW. The exception flags in PCSW are left unchanged by this operation. Rounding is

according to the IEEE rounding mode bits in PCSW. If an argument is denormalized, zero is substituted before

computing the difference, and the IFZ bit in the result is set. If the difference would be denormalized, the OFZ bit in the

result is set.

The fsubflags operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values Operation Result

r60 = 0xc0400000 (-3.0), r30 = 0x3f800000 (1.0) fsubflags r60 r30 → r90 r90 ← 0

r40 = 0x40400000 (3.0), r60 = 0xc0400000 (-3.0) fsubflags r40 r60 → r95 r95 ← 0

r10 = 0, r40 = 0x40400000 (3.0), r80 = 0x00800000 (1.17549435e-38) IF r10 fsubflags r40 r80 → r100 no change, since guard is false

r20 = 1, r40 = 0x40400000 (3.0), r80 = 0x00800000 (1.17549435e-38) IF r20 fsubflags r40 r80 → r110 r110 ← 0x2 (INX)

r40 = 0x40400000 (3.0), r81 = 0x00400000 (5.877471754e-39) fsubflags r40 r81 → r111 r111 ← 0x20 (IFZ)

r82 = 0x00c00000 (1.763241526e-38), r83 = 0x00800000 (1.175494351e-38) fsubflags r82 r83 → r112 r112 ← 0x40 (OFZ)

r84 = 0x7f800000 (+INF), r85 = 0x7f800000 (+INF) fsubflags r84 r85 → r113 r113 ← 0x10 (INV)

r70 = 0x7f7fffff (3.402823466e+38), r86 = 0xff7fffff (-3.402823466e+38) fsubflags r70 r86 → r120 r120 ← 0xA (OVF, INX)

r87 = 0xffffffff (QNaN), r30 = 0x3f800000 (1.0) fsubflags r87 r30 → r125 r125 ← 0x0

r87 = 0xffbfffff (SNaN), r30 = 0x3f800000 (1.0) fsubflags r87 r30 → r125 r125 ← 0x10 (INV)

r83 = 0x00800001 (1.175494421e-38), r89 = 0x00800000 (1.175494351e-38) fsubflags r83 r89 → r126 r126 ← 0x4 (UNF)

Exception flag bit vector: bit 6 = OFZ, bit 5 = IFZ, bit 4 = INV, bit 3 = OVF, bit 2 = UNF, bit 1 = INX, bit 0 = DBZ.

SEE ALSO

fsub faddflags readpcsw

funshift1

Funnel-shift 1 byte

SYNTAX

[ IF rguard ] funshift1 rsrc1 rsrc2 → rdest

FUNCTION

if rguard then
    rdest<31:8> ← rsrc1<23:0>
    rdest<7:0> ← rsrc2<31:24>

ATTRIBUTES

Function unit shifter

Operation code 99

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2

DESCRIPTION

As shown below, the funshift1 operation effectively shifts left by one byte the 64-bit concatenation of rsrc1 and

rsrc2 and writes the most-significant 32 bits of the shifted result to rdest.

The funshift1 operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values Operation Result

r30 = 0xaabbccdd, r40 = 0x11223344 funshift1 r30 r40 → r50 r50 ← 0xbbccdd11

r10 = 0, r40 = 0x11223344, r30 = 0xaabbccdd IF r10 funshift1 r40 r30 → r60 no change, since guard is false

r20 = 1, r40 = 0x11223344, r30 = 0xaabbccdd IF r20 funshift1 r40 r30 → r70 r70 ← 0x223344aa

[Figure: byte lanes of rsrc1 and rsrc2 (bit positions 31, 23, 15, 7, 0) and the rdest word they form.]
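
A minimal C sketch of the funnel-shift family, written directly from the description of a 64-bit concatenation shifted left by a whole number of bytes. The function name funshift and its nbytes parameter are hypothetical; the hardware provides three separate operations (funshift1, funshift2, funshift3) rather than a variable shift count.

```c
#include <stdint.h>

/* Models funshift1/2/3: concatenate rsrc1:rsrc2 into 64 bits, shift
 * left by nbytes bytes (1, 2 or 3), and return the upper 32 bits. */
static uint32_t funshift(uint32_t rsrc1, uint32_t rsrc2, unsigned nbytes)
{
    uint64_t pair = ((uint64_t)rsrc1 << 32) | rsrc2;
    return (uint32_t)((pair << (8u * nbytes)) >> 32);
}
```

With rsrc1 = 0xaabbccdd and rsrc2 = 0x11223344, a count of 1 reproduces the first example row (0xbbccdd11).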

SEE ALSO

funshift2 funshift3 rol

funshift2

Funnel-shift 2 bytes

SYNTAX

[ IF rguard ] funshift2 rsrc1 rsrc2 → rdest

FUNCTION

if rguard then
    rdest<31:16> ← rsrc1<15:0>
    rdest<15:0> ← rsrc2<31:16>

ATTRIBUTES

Function unit shifter

Operation code 100

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2

DESCRIPTION

As shown below, the funshift2 operation effectively shifts left by two bytes the 64-bit concatenation of rsrc1 and

rsrc2 and writes the most-significant 32 bits of the shifted result to rdest.

The funshift2 operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values Operation Result

r30 = 0xaabbccdd, r40 = 0x11223344 funshift2 r30 r40 → r50 r50 ← 0xccdd1122

r10 = 0, r40 = 0x11223344, r30 = 0xaabbccdd IF r10 funshift2 r40 r30 → r60 no change, since guard is false

r20 = 1, r40 = 0x11223344, r30 = 0xaabbccdd IF r20 funshift2 r40 r30 → r70 r70 ← 0x3344aabb

[Figure: byte lanes of rsrc1 and rsrc2 (bit positions 31, 23, 15, 7, 0) and the rdest word they form.]

SEE ALSO

funshift1 funshift3 rol

funshift3

Funnel-shift 3 bytes

SYNTAX

[ IF rguard ] funshift3 rsrc1 rsrc2 → rdest

FUNCTION

if rguard then
    rdest<31:24> ← rsrc1<7:0>
    rdest<23:0> ← rsrc2<31:8>

ATTRIBUTES

Function unit shifter

Operation code 101

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2

DESCRIPTION

As shown below, the funshift3 operation effectively shifts left by three bytes the 64-bit concatenation of rsrc1

and rsrc2 and writes the most-significant 32 bits of the shifted result to rdest.

The funshift3 operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values Operation Result

r30 = 0xaabbccdd, r40 = 0x11223344 funshift3 r30 r40 → r50 r50 ← 0xdd112233

r10 = 0, r40 = 0x11223344, r30 = 0xaabbccdd IF r10 funshift3 r40 r30 → r60 no change, since guard is false

r20 = 1, r40 = 0x11223344, r30 = 0xaabbccdd IF r20 funshift3 r40 r30 → r70 r70 ← 0x44aabbcc

[Figure: byte lanes of rsrc1 and rsrc2 (bit positions 31, 23, 15, 7, 0) and the rdest word they form.]

SEE ALSO

funshift1 funshift2 rol

h_dspiabs

Clipped signed absolute value

SYNTAX

[ IF rguard ] h_dspiabs r0 rsrc2 → rdest

FUNCTION

if rguard then {
    if rsrc2 >= 0 then
        rdest ← rsrc2
    else if rsrc2 = 0x80000000 then
        rdest ← 0x7fffffff
    else
        rdest ← –rsrc2
}

ATTRIBUTES

Function unit dspalu

Operation code 65

Number of operands 2

Modifier No

Modifier range —

Latency 2

Issue slots 1, 3

DESCRIPTION

The h_dspiabs operation computes the absolute value of rsrc2, clips the result into the range [0x0..0x7fffffff], and

stores the clipped value into rdest. All values are signed integers. This operation requires a zero as first argument.

The programmer is advised to use the unary pseudo operation dspiabs instead.

The h_dspiabs operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values Operation Result

r30 = 0xffffffff h_dspiabs r0 r30 → r60 r60 ← 0x00000001

r10 = 0, r40 = 0x80000001 IF r10 h_dspiabs r0 r40 → r70 no change, since guard is false

r20 = 1, r40 = 0x80000001 IF r20 h_dspiabs r0 r40 → r100 r100 ← 0x7fffffff

r50 = 0x80000000 h_dspiabs r0 r50 → r80 r80 ← 0x7fffffff

r90 = 0x7fffffff h_dspiabs r0 r90 → r110 r110 ← 0x7fffffff
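
The clipping rule in the FUNCTION section can be expressed as the following C sketch. The function name dspiabs_model is hypothetical; the logic is a direct transcription of the three cases above and is offered as an illustration, not as the hardware implementation.

```c
#include <stdint.h>

/* Clipped absolute value: the one input whose magnitude does not fit
 * in a signed 32-bit integer (INT32_MIN) saturates to INT32_MAX. */
static int32_t dspiabs_model(int32_t x)
{
    if (x >= 0)
        return x;
    if (x == INT32_MIN)
        return INT32_MAX;   /* clip instead of wrapping */
    return -x;
}
```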

SEE ALSO

h_dspiabs dspidualabs

dspiadd dspimul dspisub

dspuadd dspumul dspusub

h_dspidualabs

Dual clipped absolute value of signed 16-bit halfwords

SYNTAX

[ IF rguard ] h_dspidualabs r0 rsrc2 → rdest

FUNCTION

if rguard then {
    temp1 ← sign_ext16to32(rsrc2<15:0>)
    temp2 ← sign_ext16to32(rsrc2<31:16>)
    if temp1 = 0xffff8000 then temp1 ← 0x7fff
    if temp2 = 0xffff8000 then temp2 ← 0x7fff
    if temp1 < 0 then temp1 ← –temp1
    if temp2 < 0 then temp2 ← –temp2
    rdest<31:16> ← temp2<15:0>
    rdest<15:0> ← temp1<15:0>
}

ATTRIBUTES

Function unit dspalu

Operation code 72

Number of operands 2

Modifier No

Modifier range —

Latency 2

Issue slots 1, 3

DESCRIPTION

The h_dspidualabs operation performs two 16-bit clipped, signed absolute value computations separately on

the high and low 16-bit halfwords of rsrc2. Both absolute values are clipped into the range [0x0..0x7fff] and written into

the corresponding halfwords of rdest. All values are signed 16-bit integers. This operation requires a zero as first

argument. The programmer is advised to use the dspidualabs pseudo operation instead.

The h_dspidualabs operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls

the modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values Operation Result

r30 = 0xffff0032 h_dspidualabs r0 r30 → r60 r60 ← 0x00010032

r10 = 0, r40 = 0x80008001 IF r10 h_dspidualabs r0 r40 → r70 no change, since guard is false

r20 = 1, r40 = 0x80008001 IF r20 h_dspidualabs r0 r40 → r100 r100 ← 0x7fff7fff

r50 = 0x0032ffff h_dspidualabs r0 r50 → r80 r80 ← 0x00320001

r90 = 0x7fffffff h_dspidualabs r0 r90 → r110 r110 ← 0x7fff0001
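
The per-halfword behavior can be sketched in C as follows. The names clipped_abs16 and dspidualabs_model are hypothetical, and the sign extension is done arithmetically to avoid relying on implementation-defined integer conversions.

```c
#include <stdint.h>

/* Clipped absolute value of one signed 16-bit halfword. */
static uint16_t clipped_abs16(uint16_t h)
{
    /* Sign-extend to 32 bits: low 15 bits minus the weight of bit 15. */
    int32_t v = (int32_t)(h & 0x7fffu) - (int32_t)(h & 0x8000u);
    if (v == -32768)
        return 0x7fffu;                  /* clip the unrepresentable case */
    return (uint16_t)(v < 0 ? -v : v);
}

/* Applies clipped_abs16 independently to the high and low halfwords. */
static uint32_t dspidualabs_model(uint32_t x)
{
    uint32_t hi = clipped_abs16((uint16_t)(x >> 16));
    uint32_t lo = clipped_abs16((uint16_t)(x & 0xffffu));
    return (hi << 16) | lo;
}
```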

SEE ALSO

dspidualabs dspiabs

dspidualadd dspidualmul

dspidualsub dspiabs

h_iabs

Hardware absolute value

SYNTAX

[ IF rguard ] h_iabs r0 rsrc2 → rdest

FUNCTION

if rguard then {
    if rsrc2 < 0 then
        rdest ← –rsrc2
    else
        rdest ← rsrc2
}

ATTRIBUTES

Function unit alu

Operation code 44

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The h_iabs operation computes the absolute value of rsrc2 and stores the result into rdest. The argument is a

signed integer; the result is an unsigned integer. This operation requires a zero as first argument. The programmer is

advised to use the iabs pseudo operation instead.

The h_iabs operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values Operation Result

r30 = 0xffffffff h_iabs r0 r30 → r60 r60 ← 0x00000001

r10 = 0, r40 = 0xfffffff4 IF r10 h_iabs r0 r40 → r80 no change, since guard is false

r20 = 1, r40 = 0xfffffff4 IF r20 h_iabs r0 r40 → r90 r90 ← 0xc

r50 = 0x80000001 h_iabs r0 r50 → r100 r100 ← 0x7fffffff

r60 = 0x80000000 h_iabs r0 r60 → r110 r110 ← 0x80000000

r20 = 1 h_iabs r0 r20 → r120 r120 ← 1
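
Unlike h_dspiabs, this operation does not clip: the input 0x80000000 maps to 0x80000000, which is the correct magnitude only when the result is read as unsigned. A small C sketch of that behavior, using unsigned arithmetic so the negation is well defined (iabs_model is a hypothetical name):

```c
#include <stdint.h>

/* Unclipped absolute value. The argument is interpreted as a signed
 * two's-complement integer; the result is an unsigned magnitude. */
static uint32_t iabs_model(uint32_t x)
{
    return (x & 0x80000000u) ? (0u - x) : x;
}
```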

SEE ALSO

iabs fabsval

h_st16d

Hardware 16-bit store with displacement

SYNTAX

[ IF rguard ] h_st16d(d) rsrc1 rsrc2

FUNCTION

if rguard then {
    if PCSW.bytesex = LITTLE_ENDIAN then
        bs ← 1
    else
        bs ← 0
    mem[rsrc2 + d + (1 ⊕ bs)] ← rsrc1<7:0>
    mem[rsrc2 + d + (0 ⊕ bs)] ← rsrc1<15:8>
}

ATTRIBUTES

Function unit dmem

Operation code 30

Number of operands 2

Modifier 7 bits

Modifier range –128..126 by 2

Latency n/a

Issue slots 4, 5

DESCRIPTION

The h_st16d operation stores the least-significant 16-bit halfword of rsrc1 into the memory locations pointed to by

the address in rsrc2 + d. The d value is an opcode modifier, must be in the range –128 to 126 inclusive, and must be

a multiple of 2. This store operation is performed as little-endian or big-endian depending on the current setting of the

bytesex bit in the PCSW.

If h_st16d is misaligned (the memory address computed by rsrc2 + d is not a multiple of 2), the result of

h_st16d is undefined, and the MSE (Misaligned Store Exception) bit in the PCSW register is set to 1. Additionally, if

the TRPMSE (TRaP on Misaligned Store Exception) bit in PCSW is 1, exception processing will be requested on the

next interruptible jump.

The h_st16d operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the addressed memory locations (and the modification of cache if the locations are cacheable). If the

LSB of rguard is 1, the store takes effect. If the LSB of rguard is 0, h_st16d has no side effects whatever; in particular,

the LRU and other status bits in the data cache are not affected.

EXAMPLES

Initial Values Operation Result

r10 = 0xcfe, r80 = 0x44332211 h_st16d(2) r80 r10 [0xd00] ← 0x22, [0xd01] ← 0x11

r50 = 0, r20 = 0xd05, r70 = 0xaabbccdd IF r50 h_st16d(–4) r70 r20 no change, since guard is false

r60 = 1, r30 = 0xd06, r70 = 0xaabbccdd IF r60 h_st16d(–4) r70 r30 [0xd02] ← 0xcc, [0xd03] ← 0xdd

SEE ALSO

st16 st16d st8 st8d st32

st32d readpcsw ijmpf

h_st32d

Hardware 32-bit store with displacement

SYNTAX

[ IF rguard ] h_st32d(d) rsrc1 rsrc2

FUNCTION

if rguard then {
    if PCSW.bytesex = LITTLE_ENDIAN then
        bs ← 3
    else
        bs ← 0
    mem[rsrc2 + d + (3 ⊕ bs)] ← rsrc1<7:0>
    mem[rsrc2 + d + (2 ⊕ bs)] ← rsrc1<15:8>
    mem[rsrc2 + d + (1 ⊕ bs)] ← rsrc1<23:16>
    mem[rsrc2 + d + (0 ⊕ bs)] ← rsrc1<31:24>
}

ATTRIBUTES

Function unit dmem

Operation code 31

Number of operands 2

Modifier 7 bits

Modifier range –256..252 by 4

Latency n/a

Issue slots 4, 5

DESCRIPTION

The h_st32d operation stores all 32 bits of rsrc1 into the memory locations pointed to by the address in rsrc2 + d.

The d value is an opcode modifier, must be in the range –256 and 252 inclusive, and must be a multiple of 4. This

store operation is performed as little-endian or big-endian depending on the current setting of the bytesex bit in the

PCSW.

If h_st32d is misaligned (the memory address computed by rsrc2 + d is not a multiple of 4), the result of

h_st32d is undefined, and the MSE (Misaligned Store Exception) bit in the PCSW register is set to 1. Additionally, if

the TRPMSE (TRaP on Misaligned Store Exception) bit in PCSW is 1, exception processing will be requested on the

next interruptible jump.

The h_st32d operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the addressed memory locations (and the modification of cache if the locations are cacheable). If the

LSB of rguard is 1, the store takes effect. If the LSB of rguard is 0, h_st32d has no side effects whatever; in

particular, the LRU and other status bits in the data cache are not affected.

EXAMPLES

Initial Values Operation Result

r10 = 0xcfc, r80 = 0x44332211 h_st32d(4) r80 r10 [0xd00] ← 0x44, [0xd01] ← 0x33, [0xd02] ← 0x22, [0xd03] ← 0x11

r50 = 0, r20 = 0xd0b, r70 = 0xaabbccdd IF r50 h_st32d(–8) r70 r20 no change, since guard is false

r60 = 1, r30 = 0xd0c, r70 = 0xaabbccdd IF r60 h_st32d(–8) r70 r30 [0xd04] ← 0xaa, [0xd05] ← 0xbb, [0xd06] ← 0xcc, [0xd07] ← 0xdd
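
The byte-lane selection can be modeled on a host as shown below. This is a sketch under two assumptions: the byte index is combined with bs by exclusive OR (the operator is not legible in this copy of the FUNCTION section, but XOR is the choice that yields the two byte orders), and a misaligned store is simply rejected, whereas the hardware leaves the result undefined and sets MSE. The names st32_model, mem and little_endian are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stores value at mem[addr..addr+3] in the selected byte order.
 * The caller must ensure addr + 3 is inside mem. Returns false
 * without storing if addr is not a multiple of 4. */
static bool st32_model(uint8_t *mem, uint32_t addr, uint32_t value,
                       bool little_endian)
{
    if (addr % 4u != 0u)
        return false;                    /* hardware would set MSE here */

    uint32_t bs = little_endian ? 3u : 0u;
    mem[addr + (3u ^ bs)] = (uint8_t)(value);        /* rsrc1<7:0>   */
    mem[addr + (2u ^ bs)] = (uint8_t)(value >> 8);   /* rsrc1<15:8>  */
    mem[addr + (1u ^ bs)] = (uint8_t)(value >> 16);  /* rsrc1<23:16> */
    mem[addr + (0u ^ bs)] = (uint8_t)(value >> 24);  /* rsrc1<31:24> */
    return true;
}
```

The examples above correspond to the big-endian case (little_endian false), where the most-significant byte lands at the lowest address.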

SEE ALSO

st32 st32d st16 st16d st8

st8d readpcsw ijmpf

h_st8d

Hardware 8-bit store with displacement

SYNTAX

[ IF rguard ] h_st8d(d) rsrc1 rsrc2

FUNCTION

if rguard then
    mem[rsrc2 + d] ← rsrc1<7:0>

ATTRIBUTES

Function unit dmem

Operation code 29

Number of operands 2

Modifier 7 bits

Modifier range –64..63

Latency n/a

Issue slots 4, 5

DESCRIPTION

The h_st8d operation stores the least-significant 8-bit byte of rsrc1 into the memory location pointed to by the

address formed from the sum rsrc2 + d. The value of the opcode modifier d must be in the range –64 to 63 inclusive.

This operation does not depend on the bytesex bit in the PCSW since only a single byte is stored.

The h_st8d operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the addressed memory location (and the modification of cache if the location is cacheable). If the LSB

of rguard is 1, the store takes effect. If the LSB of rguard is 0, h_st8d has no side effects whatever; in particular, the

LRU and other status bits in the data cache are not affected.

EXAMPLES

Initial Values Operation Result

r10 = 0xd00, r80 = 0x44332211 h_st8d(3) r80 r10 [0xd03] ← 0x11

r50 = 0, r20 = 0xd01, r70 = 0xaabbccdd IF r50 h_st8d(–4) r70 r20 no change, since guard is false

r60 = 1, r30 = 0xd02, r70 = 0xaabbccdd IF r60 h_st8d(–4) r70 r30 [0xcfe] ← 0xdd
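
The modifier ranges listed for h_st8d (–64..63), h_st16d (–128..126 by 2) and h_st32d (–256..252 by 4) all fit one pattern: a multiple of the access size whose quotient lies in –64..63. The check below captures that pattern. It is an observation drawn from the three ATTRIBUTES tables, not a statement about the instruction encoding, and displacement_ok is a hypothetical helper name.

```c
#include <stdbool.h>

/* Returns true if d is a legal displacement for a store of the given
 * size in bytes (1 for h_st8d, 2 for h_st16d, 4 for h_st32d). */
static bool displacement_ok(int d, int size)
{
    return d % size == 0 && d >= -64 * size && d <= 63 * size;
}
```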

SEE ALSO

st8 st8d st16 st16d st32

st32d

hicycles

Read clock cycle counter, most-significant word

SYNTAX

[ IF rguard ] hicycles → rdest

FUNCTION

if rguard then
    rdest ← CCCOUNT<63:32>

ATTRIBUTES

Function unit fcomp

Operation code 155

Number of operands 0

Modifier No

Modifier range —

Latency 1

Issue slots 3

DESCRIPTION

Refer to Section 3.1.5, “CCCOUNT—Clock Cycle Counter” for a description of the CCCOUNT operation. The

hicycles operation copies the high 32 bits of the slave register Clock Cycle Counter (CCCOUNT) to the

destination register, rdest. The contents of the master counter are transferred to the slave CCCOUNT register only on

a successful interruptible jump and on processor reset. Thus, if cycles and hicycles are executed without

intervening interruptible jumps, the operation pair is guaranteed to be a coherent sample of the master clock-cycle

counter. The master counter increments on all cycles (processor-stall and non-stall) if PCSW.CS = 1; otherwise, the

counter increments only on non-stall cycles.

The hicycles operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values Operation Result

CCCOUNT_HR = 0xabcdefff12345678 hicycles → r60 r60 ← 0xabcdefff

r10 = 0, CCCOUNT_HR = 0xabcdefff12345678 IF r10 hicycles → r70 no change, since guard is false

r20 = 1, CCCOUNT_HR = 0xabcdefff12345678 IF r20 hicycles → r100 r100 ← 0xabcdefff
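
Because a cycles/hicycles pair executed without an intervening interruptible jump is a coherent sample, the two halves can be joined into one 64-bit count. A minimal sketch, assuming the two register values have already been read into C variables (cccount_from_halves is a hypothetical helper, not a library function):

```c
#include <stdint.h>

/* Joins the result of hicycles (hi) and cycles (lo) into the full
 * 64-bit clock-cycle count. */
static uint64_t cccount_from_halves(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}
```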

SEE ALSO

cycles curcycles writepcsw

iabs

Absolute value

pseudo-op for h_iabs

SYNTAX

[ IF rguard ] iabs rsrc1 → rdest

FUNCTION

if rguard then {
    if rsrc1 < 0 then
        rdest ← –rsrc1
    else
        rdest ← rsrc1
}

ATTRIBUTES

Function unit alu

Operation code 44

Number of operands 1

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The iabs operation is a pseudo operation transformed by the scheduler into an h_iabs with zero as the first

argument and a second argument equal to the iabs argument. (Note: pseudo operations cannot be used in

assembly source files.)

The iabs operation computes the absolute value of rsrc1 and stores the result into rdest. The argument is a signed

integer; the result is an unsigned integer.

The iabs operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values Operation Result

r30 = 0xffffffff iabs r30 → r60 r60 ← 0x00000001

r10 = 0, r40 = 0xfffffff4 IF r10 iabs r40 → r80 no change, since guard is false

r20 = 1, r40 = 0xfffffff4 IF r20 iabs r40 → r90 r90 ← 0xc

r50 = 0x80000001 iabs r50 → r100 r100 ← 0x7fffffff

r60 = 0x80000000 iabs r60 → r110 r110 ← 0x80000000

r20 = 1 iabs r20 → r120 r120 ← 1

SEE ALSO

h_iabs dspiabs dspidualabs

fabsval

iadd

Signed add

SYNTAX

[ IF rguard ] iadd rsrc1 rsrc2 → rdest

FUNCTION

if rguard then
    rdest ← rsrc1 + rsrc2

ATTRIBUTES

Function unit alu

Operation code 12

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The iadd operation computes the sum rsrc1+rsrc2 and stores the result into rdest. The operands can be either

both signed or unsigned integers. No overflow or underflow detection is performed.

The iadd operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values Operation Result

r60 = 0x100 iadd r60 r60 → r80 r80 ← 0x200

r10 = 0, r60 = 0x100, r30 = 0xf11 IF r10 iadd r60 r30 → r50 no change, since guard is false

r20 = 1, r60 = 0x100, r30 = 0xf11 IF r20 iadd r60 r30 → r90 r90 ← 0x1011

r70 = 0xffffff00, r40 = 0xffffff9c iadd r70 r40 → r100 r100 ← 0xfffffe9c
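
Since no overflow or underflow detection is performed, the result is the 32-bit sum modulo 2^32, and the same bit pattern is correct for both signed (two's-complement) and unsigned operands. A short C sketch of this, using unsigned arithmetic because it wraps in a well-defined way in C (iadd_model is a hypothetical name):

```c
#include <stdint.h>

/* 32-bit add with silent wraparound, as performed by iadd. */
static uint32_t iadd_model(uint32_t a, uint32_t b)
{
    return a + b;   /* unsigned addition wraps modulo 2^32 */
}
```

In the last example row, 0xffffff00 + 0xffffff9c gives 0xfffffe9c, which reads as (–256) + (–100) = –356 when the operands are taken as signed.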

SEE ALSO

iaddi carry dspiadd

dspidualadd fadd

iaddi

Add with immediate

SYNTAX

[ IF rguard ] iaddi(n) rsrc1 → rdest

FUNCTION

if rguard then
    rdest ← rsrc1 + n

ATTRIBUTES

Function unit alu

Operation code 5

Number of operands 1

Modifier 7 bits

Modifier range 0..127

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The iaddi operation sums a single argument in rsrc1 and an immediate modifier n and stores the result in rdest.

The value of n must be between 0 and 127, inclusive.

The iaddi operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is unchanged.

EXAMPLES

Initial Values    Operation    Result
r30 = 0xf11    iaddi(127) r30 → r70    r70 ← 0xf90
r10 = 0, r40 = 0xffffff9c    IF r10 iaddi(1) r40 → r80    no change, since guard is false
r20 = 1, r40 = 0xffffff9c    IF r20 iaddi(1) r40 → r90    r90 ← 0xffffff9d
r50 = 0x1000    iaddi(15) r50 → r120    r120 ← 0x100f
r60 = 0xfffffff0    iaddi(2) r60 → r110    r110 ← 0xfffffff2
r60 = 0xfffffff0    iaddi(17) r60 → r120    r120 ← 1

SEE ALSO

iadd carry

iaddi

Signed average

SYNTAX

[ IF rguard ] iavgonep rsrc1 rsrc2 → rdest

FUNCTION

if rguard then

rdest ← (sign_ext32to64(rsrc1) + sign_ext32to64(rsrc2) + 1) >> 1;

ATTRIBUTES

Function unit dspalu

Operation code 25

Number of operands 2

Modifier No

Modifier range —

Latency 2

Issue slots 1, 3

DESCRIPTION

As shown below, the iavgonep operation returns the average of the two arguments. This operation computes the

sum rsrc1+rsrc2+1, shifts the sum right by 1 bit, and stores the result into rdest. The operands are signed integers.

The iavgonep operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values    Operation    Result
r60 = 0x10, r70 = 0x20    iavgonep r60 r70 → r80    r80 ← 0x18
r10 = 0, r60 = 0x10, r30 = 0x20    IF r10 iavgonep r60 r30 → r50    no change, since guard is false
r20 = 1, r60 = 0x9, r30 = 0x20    IF r20 iavgonep r60 r30 → r90    r90 ← 0x15
r70 = 0xfffffff7, r40 = 0x2    iavgonep r70 r40 → r100    r100 ← 0xfffffffd
r70 = 0xfffffff7, r40 = 0x3    iavgonep r70 r40 → r100    r100 ← 0xfffffffd

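[Figure: rsrc1 (signed, bits 31..0) and rsrc2 (signed, bits 31..0) are added together with 1 at full precision, giving a 33-bit signed result (bits 32..0); the result is shifted down one bit and written to rdest (signed, bits 31..0).]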
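The full-precision sum followed by a one-bit shift can be sketched in Python, whose integers never overflow and whose right shift on negative values is arithmetic. The names iavgonep and to_signed32 are hypothetical helpers for this illustration, and the guard is omitted for brevity.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(x):
    """Interpret a 32-bit register pattern as a signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def iavgonep(rsrc1, rsrc2):
    """Signed average with +1 rounding term, returned as a 32-bit pattern."""
    total = to_signed32(rsrc1) + to_signed32(rsrc2) + 1  # may need 33 bits
    return (total >> 1) & MASK32                         # arithmetic shift, then wrap

print(hex(iavgonep(0x10, 0x20)))        # 0x18
print(hex(iavgonep(0xfffffff7, 0x3)))   # 0xfffffffd
```

Since the shift happens before the result is narrowed to 32 bits, averaging two large values of the same sign does not overflow in this model.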

SEE ALSO

quadavg iadd

iavgonep

Signed select byte

SYNTAX

[ IF rguard ] ibytesel rsrc1 rsrc2 → rdest

FUNCTION

if rguard then {

if rsrc2 = 0 then

rdest ← sign_ext8to32(rsrc1<7:0>)

else if rsrc2 = 1 then

rdest ← sign_ext8to32(rsrc1<15:8>)

else if rsrc2 = 2 then

rdest ← sign_ext8to32(rsrc1<23:16>)

else if rsrc2 = 3 then

rdest ← sign_ext8to32(rsrc1<31:24>)

}

ATTRIBUTES

Function unit alu

Operation code 56

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

As shown below, the ibytesel operation selects one byte from the argument, rsrc1, sign-extends the byte to 32

bits, and stores the result in rdest. The value of rsrc2 determines which byte is selected, with rsrc2=0 selecting the

least-significant byte of rsrc1 and rsrc2=3 selecting the most-significant byte of rsrc1. If rsrc2 is not between 0 and 3 inclusive, the result of

ibytesel is undefined.

The ibytesel operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values    Operation    Result
r30 = 0x44332211, r40 = 1    ibytesel r30 r40 → r50    r50 ← 0x00000022
r10 = 0, r60 = 0xddccbbaa, r70 = 2    IF r10 ibytesel r60 r70 → r80    no change, since guard is false
r20 = 1, r60 = 0xddccbbaa, r70 = 2    IF r20 ibytesel r60 r70 → r90    r90 ← 0xffffffcc
r100 = 0xffffff7f, r110 = 0    ibytesel r100 r110 → r120    r120 ← 0x0000007f

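[Figure: rsrc1 holds four signed bytes, numbered 3 (bits 31..24) down to 0 (bits 7..0); rsrc2 selects one of them; the selected byte is written to bits 7..0 of rdest and its sign bit S is replicated into bits 31..8.]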
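Byte selection with sign extension can be sketched as follows. The function name ibytesel and the choice to raise ValueError for an out-of-range selector are assumptions of this example; the text only states that the hardware result is undefined in that case.

```python
MASK32 = 0xFFFFFFFF

def ibytesel(rsrc1, rsrc2):
    """Select byte rsrc2 (0 = least significant) of rsrc1 and sign-extend it."""
    if not 0 <= rsrc2 <= 3:
        # The data book leaves this case undefined; this model rejects it.
        raise ValueError("rsrc2 must be between 0 and 3")
    byte = (rsrc1 >> (8 * rsrc2)) & 0xFF
    if byte & 0x80:          # bit 7 set: the byte is negative
        byte -= 0x100
    return byte & MASK32     # back to a 32-bit register pattern

print(hex(ibytesel(0x44332211, 1)))  # 0x22
print(hex(ibytesel(0xddccbbaa, 2)))  # 0xffffffcc
```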

SEE ALSO

ubytesel sex8 packbytes

ibytesel

Clip signed to signed

SYNTAX

[ IF rguard ] iclipi rsrc1 rsrc2 → rdest

FUNCTION

if rguard then

rdest ← min(max(rsrc1, –rsrc2–1), rsrc2)

ATTRIBUTES

Function unit dspalu

Operation code 74

Number of operands 2

Modifier No

Modifier range —

Latency 2

Issue slots 1, 3

DESCRIPTION

The iclipi operation returns the value of rsrc1 clipped into the signed integer range (–rsrc2–1) to rsrc2,

inclusive. The argument rsrc1 is considered a signed integer; rsrc2 is considered an unsigned integer and must have

a value between 0 and 0x7fffffff inclusive.

The iclipi operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values    Operation    Result
r30 = 0x80, r40 = 0x7f    iclipi r30 r40 → r50    r50 ← 0x7f
r10 = 0, r60 = 0x12345678, r70 = 0xabc    IF r10 iclipi r60 r70 → r80    no change, since guard is false
r20 = 1, r60 = 0x12345678, r70 = 0xabc    IF r20 iclipi r60 r70 → r90    r90 ← 0xabc
r100 = 0x80000000, r110 = 0x3fffff    iclipi r100 r110 → r120    r120 ← 0xffc00000
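The min/max formulation of the clip can be checked with a short Python model. The helper names are hypothetical, and rejecting an rsrc2 outside 0..0x7fffffff with ValueError is a choice made for this sketch rather than documented hardware behavior.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(x):
    """Interpret a 32-bit register pattern as a signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def iclipi(rsrc1, rsrc2):
    """Clip signed rsrc1 into the range [-rsrc2 - 1, rsrc2]."""
    if not 0 <= rsrc2 <= 0x7FFFFFFF:
        raise ValueError("rsrc2 must be between 0 and 0x7fffffff")
    clipped = min(max(to_signed32(rsrc1), -rsrc2 - 1), rsrc2)
    return clipped & MASK32

print(hex(iclipi(0x80, 0x7f)))              # 0x7f
print(hex(iclipi(0x80000000, 0x3fffff)))    # 0xffc00000
```

With rsrc2 = 0x7f the range is -128..127, so the operation saturates a 32-bit value to a signed byte.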

SEE ALSO

uclipi uclipu imin imax

iclipi

Invalidate all instruction cache blocks

SYNTAX

[ IF rguard ] iclr

FUNCTION

if rguard then {

block ← 0

for all blocks in instruction cache {

icache_reset_valid_block(block)

block ← block + 1

}

}

ATTRIBUTES

Function unit branch

Operation code 184

Number of operands 0

Modifier No

Modifier range —

Latency n/a

Issue slots 2, 3, 4

DESCRIPTION

The iclr operation resets the valid bits of all blocks in the instruction cache.

iclr does clear the valid bits of locked blocks. iclr does not change the replacement status of instruction-cache

blocks.

iclr ensures coherency between caches and main memory by discarding all pending prefetch operations.

The side effect time behavior of iclr is such that if instruction i performs an iclr, instructions i, i+1, i+2 will be

included in the discard from the instruction cache, but i+3 will be retained.

The iclr operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls whether the

operation is performed. If the LSB of rguard is 1, the instruction cache is invalidated; otherwise, iclr has no effect.

EXAMPLES

Initial Values    Operation    Result
    iclr
r10 = 0    IF r10 iclr    no change and no stall cycles, since guard is false
r20 = 1    IF r20 iclr

SEE ALSO

dcb dinvalid

iclr

Identity

pseudo-op for iadd

SYNTAX

[ IF rguard ] ident rsrc1 → rdest

FUNCTION

if rguard then

rdest ← rsrc1

ATTRIBUTES

Function unit alu

Operation code 12

Number of operands 1

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The ident operation is a pseudo operation transformed by the scheduler into an iadd with r0 (always contains 0)

as the first argument and rsrc1 as the second. (Note: pseudo operations cannot be used in assembly source files.)

The ident operation copies the argument rsrc1 to rdest. It is used by the instruction scheduler to implement register-to-register copying.

The ident operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values    Operation    Result
r30 = 0x100    ident r30 → r40    r40 ← 0x100
r10 = 0, r50 = 0x12345678    IF r10 ident r50 → r60    no change, since guard is false
r20 = 1, r50 = 0x12345678    IF r20 ident r50 → r70    r70 ← 0x12345678

SEE ALSO

iadd

ident

Signed compare equal

SYNTAX

[ IF rguard ] ieql rsrc1 rsrc2 → rdest

FUNCTION

if rguard then {

if rsrc1 = rsrc2 then

rdest ← 1

else

rdest ← 0

}

ATTRIBUTES

Function unit alu

Operation code 37

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The ieql operation sets the destination register, rdest, to 1 if the first argument, rsrc1, is equal to the second

argument, rsrc2; otherwise, rdest is set to 0. The arguments are treated as signed integers.

The ieql operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values    Operation    Result
r30 = 3, r40 = 4    ieql r30 r40 → r80    r80 ← 0
r10 = 0, r60 = 0x100, r30 = 3    IF r10 ieql r60 r30 → r50    no change, since guard is false
r20 = 1, r50 = 0x1000, r60 = 0x1000    IF r20 ieql r50 r60 → r90    r90 ← 1
r70 = 0x80000000, r40 = 4    ieql r70 r40 → r100    r100 ← 0
r70 = 0x80000000    ieql r70 r70 → r110    r110 ← 1

SEE ALSO

igeq ueql ieqli ineq

ieql

Signed compare equal with immediate

SYNTAX

[ IF rguard ] ieqli(n) rsrc1 → rdest

FUNCTION

if rguard then {

if rsrc1 = n then

rdest ← 1

else

rdest ← 0

}

ATTRIBUTES

Function unit alu

Operation code 4

Number of operands 1

Modifier 7 bits

Modifier range –64..63

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The ieqli operation sets the destination register, rdest, to 1 if the first argument, rsrc1, is equal to the opcode

modifier, n; otherwise, rdest is set to 0. The arguments are treated as signed integers.

The ieqli operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values    Operation    Result
r30 = 3    ieqli(2) r30 → r80    r80 ← 0
r30 = 3    ieqli(3) r30 → r90    r90 ← 1
r30 = 3    ieqli(4) r30 → r100    r100 ← 0
r10 = 0, r40 = 0x100    IF r10 ieqli(63) r40 → r50    no change, since guard is false
r20 = 1, r40 = 0x100    IF r20 ieqli(63) r40 → r100    r100 ← 0
r60 = 0xffffffc0    ieqli(-64) r60 → r120    r120 ← 1
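The comparison against a signed 7-bit modifier can be sketched in Python. The function name ieqli, its argument order, and the ValueError for an out-of-range modifier are assumptions of this illustration; in a real toolchain the range check would presumably be done by the assembler.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(x):
    """Interpret a 32-bit register pattern as a signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def ieqli(n, rsrc1):
    """Return 1 if signed rsrc1 equals the 7-bit signed modifier n, else 0."""
    if not -64 <= n <= 63:
        raise ValueError("modifier n must be in the range -64..63")
    return 1 if to_signed32(rsrc1) == n else 0

print(ieqli(3, 3))             # 1
print(ieqli(-64, 0xffffffc0))  # 1
```

The last call shows why the signed interpretation matters: 0xffffffc0 is the 32-bit two's-complement pattern for -64.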

SEE ALSO

ieql igeqi ueqli ineqi

ieqli

Sum of products of signed 16-bit halfwords

SYNTAX

[ IF rguard ] ifir16 rsrc1 rsrc2 → rdest

FUNCTION

if rguard then

rdest ← sign_ext16to32(rsrc1<31:16>) × sign_ext16to32(rsrc2<31:16>) +

sign_ext16to32(rsrc1<15:0>) × sign_ext16to32(rsrc2<15:0>)

ATTRIBUTES

Function unit dspmul

Operation code 93

Number of operands 2

Modifier No

Modifier range —

Latency 3

Issue slots 2, 3

DESCRIPTION

As shown below, the ifir16 operation computes two separate products of the two pairs of corresponding 16-bit

halfwords of rsrc1 and rsrc2; the two products are summed, and the result is written to rdest. All values are considered

signed; thus, the intermediate products and the final sum of products are signed. All intermediate computations are

performed without loss of precision; the final sum of products is clipped into the range [0x80000000..0x7fffffff] before

being written into rdest.

The ifir16 operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values    Operation    Result
r30 = 0x00020003, r40 = 0x00010002    ifir16 r30 r40 → r50    r50 ← 0x8
r10 = 0, r60 = 0xff9c0064, r70 = 0x0064ff9c    IF r10 ifir16 r60 r70 → r80    no change, since guard is false
r20 = 1, r60 = 0xff9c0064, r70 = 0x0064ff9c    IF r20 ifir16 r60 r70 → r90    r90 ← 0xffffb1e0
r30 = 0x00020003, r70 = 0x0064ff9c    ifir16 r30 r70 → r100    r100 ← 0xffffff9c

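[Figure: the signed halfwords in bits 31..16 and bits 15..0 of rsrc1 and rsrc2 are multiplied pairwise; the two signed products are summed at full precision into a 33-bit signed result, which is clipped to [2^31–1..–2^31] and written to rdest (signed).]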
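The two-tap sum of products and the final clip can be modeled as below. The names ifir16 and s16 are hypothetical, and the guard is left out to keep the sketch short.

```python
MASK32 = 0xFFFFFFFF

def s16(x):
    """Sign-extend the low 16 bits of x."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x

def ifir16(rsrc1, rsrc2):
    """Sum of the two signed 16x16 products, clipped to the signed 32-bit range."""
    total = s16(rsrc1 >> 16) * s16(rsrc2 >> 16) + s16(rsrc1) * s16(rsrc2)
    total = max(-2**31, min(2**31 - 1, total))   # clip the full-precision sum
    return total & MASK32

print(hex(ifir16(0x00020003, 0x00010002)))  # 0x8
print(hex(ifir16(0xff9c0064, 0x0064ff9c)))  # 0xffffb1e0
```

In this model the clip only takes effect when all four halfwords are 0x8000, because that is the only input for which the sum reaches 2^31.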

SEE ALSO

ifir8ii ifir8ui ufir8uu

ifir16

Signed sum of products of signed bytes

SYNTAX

[ IF rguard ] ifir8ii rsrc1 rsrc2 → rdest

FUNCTION

if rguard then

rdest ← sign_ext8to32(rsrc1<31:24>) × sign_ext8to32(rsrc2<31:24>) +

sign_ext8to32(rsrc1<23:16>) × sign_ext8to32(rsrc2<23:16>) +

sign_ext8to32(rsrc1<15:8>) × sign_ext8to32(rsrc2<15:8>) +

sign_ext8to32(rsrc1<7:0>) × sign_ext8to32(rsrc2<7:0>)

ATTRIBUTES

Function unit dspmul

Operation code 92

Number of operands 2

Modifier No

Modifier range —

Latency 3

Issue slots 2, 3

DESCRIPTION

As shown below, the ifir8ii operation computes four separate products of the four pairs of corresponding 8-bit

bytes of rsrc1 and rsrc2; the four products are summed, and the result is written to rdest. All values are considered

signed; thus, the intermediate products and the final sum of products are signed. All computations are performed

without loss of precision.

The ifir8ii operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values    Operation    Result
r70 = 0x0afb14f6, r30 = 0x0a0a1414    ifir8ii r70 r30 → r90    r90 ← 0xfa
r10 = 0, r70 = 0x0afb14f6, r30 = 0x0a0a1414    IF r10 ifir8ii r70 r30 → r100    no change, since guard is false
r20 = 1, r80 = 0x649c649c, r40 = 0x9c649c64    IF r20 ifir8ii r80 r40 → r110    r110 ← 0xffff63c0
r50 = 0x80808080, r60 = 0xffffffff    ifir8ii r50 r60 → r120    r120 ← 0x200

[Figure: the four signed bytes of rsrc1 are multiplied pairwise with the four signed bytes of rsrc2; the four products are summed and the signed result is written to rdest.]
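A four-tap signed byte dot product along these lines can be sketched in Python. The names ifir8ii and s8 are hypothetical helpers for this example, and the guard is omitted.

```python
MASK32 = 0xFFFFFFFF

def s8(x):
    """Sign-extend the low 8 bits of x."""
    x &= 0xFF
    return x - 0x100 if x & 0x80 else x

def ifir8ii(rsrc1, rsrc2):
    """Sum of the four signed byte-by-byte products of rsrc1 and rsrc2."""
    total = sum(s8(rsrc1 >> shift) * s8(rsrc2 >> shift) for shift in (24, 16, 8, 0))
    return total & MASK32   # the sum always fits, so no clipping is needed

print(hex(ifir8ii(0x0afb14f6, 0x0a0a1414)))  # 0xfa
print(hex(ifir8ii(0x80808080, 0xffffffff)))  # 0x200
```

The largest possible magnitude is 4 × 128 × 128 = 65536, which is why the operation can be exact without a clip stage.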

SEE ALSO

ifir8ui ufir8uu ifir16

ufir16

ifir8ii

Signed sum of products of unsigned/signed bytes

SYNTAX

[ IF rguard ] ifir8ui rsrc1 rsrc2 → rdest

FUNCTION

if rguard then

rdest ← zero_ext8to32(rsrc1<31:24>) × sign_ext8to32(rsrc2<31:24>) +

zero_ext8to32(rsrc1<23:16>) × sign_ext8to32(rsrc2<23:16>) +

zero_ext8to32(rsrc1<15:8>) × sign_ext8to32(rsrc2<15:8>) +

zero_ext8to32(rsrc1<7:0>) × sign_ext8to32(rsrc2<7:0>)

ATTRIBUTES

Function unit dspmul

Operation code 91

Number of operands 2

Modifier No

Modifier range —

Latency 3

Issue slots 2, 3

DESCRIPTION

As shown below, the ifir8ui operation computes four separate products of the four pairs of corresponding 8-bit

bytes of rsrc1 and rsrc2; the four products are summed, and the result is written to rdest. The bytes from rsrc1 are

considered unsigned, but the bytes from rsrc2 are considered signed; thus, the intermediate products and the final

sum of products are signed. All computations are performed without loss of precision.

The ifir8ui operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values    Operation    Result
r70 = 0x0afb14f6, r30 = 0x0a0a1414    ifir8ui r30 r70 → r90    r90 ← 0xfa
r10 = 0, r70 = 0x0afb14f6, r30 = 0x0a0a1414    IF r10 ifir8ui r30 r70 → r100    no change, since guard is false
r20 = 1, r80 = 0x649c649c, r40 = 0x9c649c64    IF r20 ifir8ui r40 r80 → r110    r110 ← 0x2bc0
r50 = 0x80808080, r60 = 0xffffffff    ifir8ui r60 r50 → r120    r120 ← 0xfffe0200

[Figure: the four unsigned bytes of rsrc1 are multiplied pairwise with the four signed bytes of rsrc2; the four products are summed and the signed result is written to rdest.]
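The mixed unsigned/signed variant differs from the all-signed one only in how the bytes of rsrc1 are extended. A minimal Python sketch, with hypothetical helper names and the guard omitted:

```python
MASK32 = 0xFFFFFFFF

def u8(x):
    """Zero-extend the low 8 bits of x."""
    return x & 0xFF

def s8(x):
    """Sign-extend the low 8 bits of x."""
    x &= 0xFF
    return x - 0x100 if x & 0x80 else x

def ifir8ui(rsrc1, rsrc2):
    """Sum of products of unsigned bytes of rsrc1 and signed bytes of rsrc2."""
    total = sum(u8(rsrc1 >> shift) * s8(rsrc2 >> shift) for shift in (24, 16, 8, 0))
    return total & MASK32

print(hex(ifir8ui(0x9c649c64, 0x649c649c)))  # 0x2bc0
print(hex(ifir8ui(0xffffffff, 0x80808080)))  # 0xfffe0200
```

This pairing suits filters whose data samples are unsigned (for example pixel values) while the coefficients are signed; that usage is an inference from the operand types, not a statement from the data book.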

SEE ALSO

ifir8ii ufir8uu ifir16

ufir16

ifir8ui

Convert floating-point to integer using PCSW rounding mode

SYNTAX

[ IF rguard ] ifixieee rsrc1 → rdest

FUNCTION

if rguard then {

rdest ← (long) ((float)rsrc1)

}

ATTRIBUTES

Function unit falu

Operation code 121

Number of operands 1

Modifier No

Modifier range —

Latency 3

Issue slots 1, 4

DESCRIPTION

The ifixieee operation converts the single-precision IEEE floating-point value in rsrc1 to a signed integer and

writes the result into rdest. Rounding is according to the IEEE rounding mode bits in PCSW. If rsrc1 is denormalized,

zero is substituted before conversion, and the IFZ flag in the PCSW is set. If ifixieee causes an IEEE exception,

such as overflow or underflow, the corresponding exception flags in the PCSW are set. The PCSW exception flags

are sticky: the flags can be set as a side-effect of any floating-point operation but can only be reset by an explicit

writepcsw operation. The update of the PCSW exception flags occurs at the same time as rdest is written. If any

other floating-point compute operations update the PCSW at the same time, the net result in each exception flag is the

logical OR of all simultaneous updates ORed with the existing PCSW value for that exception flag.

The ifixieeeflags operation computes the exception flags that would result from an individual ifixieee.

The ifixieee operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest and the exception flags in PCSW are written;

otherwise, rdest is not changed and the operation does not affect the exception flags in PCSW.

EXAMPLES

Initial Values    Operation    Result
r30 = 0x40400000 (3.0)    ifixieee r30 → r100    r100 ← 3
r35 = 0x40247ae1 (2.57)    ifixieee r35 → r102    r102 ← 3, INX flag set
r10 = 0, r40 = 0xff7fffff (–3.402823466e+38)    IF r10 ifixieee r40 → r105    no change, since guard is false
r20 = 1, r40 = 0xff7fffff (–3.402823466e+38)    IF r20 ifixieee r40 → r110    r110 ← 0x80000000 (-2^31), INV flag set
r45 = 0x7f800000 (+INF)    ifixieee r45 → r112    r112 ← 0x7fffffff (2^31-1), INV flag set
r50 = 0xbfc147ae (-1.51)    ifixieee r50 → r115    r115 ← -2, INX flag set
r60 = 0x00400000 (5.877471754e-39)    ifixieee r60 → r117    r117 ← 0, IFZ set
r70 = 0xffffffff (QNaN)    ifixieee r70 → r120    r120 ← 0, INV flag set
r80 = 0xffbfffff (SNaN)    ifixieee r80 → r122    r122 ← 0, INV flag set

SEE ALSO

ufixieee ifixrz ufixrz

ifixieee

IEEE status flags from convert floating-point to integer using PCSW rounding mode

SYNTAX

[ IF rguard ] ifixieeeflags rsrc1 → rdest

FUNCTION

if rguard then

rdest ← ieee_flags((long) ((float)rsrc1))

ATTRIBUTES

Function unit falu

Operation code 122

Number of operands 1

Modifier No

Modifier range —

Latency 3

Issue slots 1, 4

DESCRIPTION

The ifixieeeflags operation computes the IEEE exceptions that would result from converting the single-

precision IEEE floating-point value in rsrc1 to a signed integer, and an integer bit vector representing the computed

exception flags is written into rdest. The bit vector stored in rdest has the same format as the IEEE exception bits in

the PCSW. The exception flags in PCSW are left unchanged by this operation. Rounding is according to the IEEE

rounding mode bits in PCSW. If rsrc1 is denormalized, zero is substituted before computing the conversion, and the

IFZ bit in the result is set.

The ifixieeeflags operation optionally takes a guard, specified in rguard. If a guard is present, its LSB

controls the modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not

changed.

EXAMPLES

Initial Values    Operation    Result
r30 = 0x40400000 (3.0)    ifixieeeflags r30 → r100    r100 ← 0
r35 = 0x40247ae1 (2.57)    ifixieeeflags r35 → r102    r102 ← 0x02 (INX)
r10 = 0, r40 = 0xff7fffff (–3.402823466e+38)    IF r10 ifixieeeflags r40 → r105    no change, since guard is false
r20 = 1, r40 = 0xff7fffff (–3.402823466e+38)    IF r20 ifixieeeflags r40 → r110    r110 ← 0x10 (INV)
r45 = 0x7f800000 (+INF)    ifixieeeflags r45 → r112    r112 ← 0x10 (INV)
r50 = 0xbfc147ae (-1.51)    ifixieeeflags r50 → r115    r115 ← 0x02 (INX)
r60 = 0x00400000 (5.877471754e-39)    ifixieeeflags r60 → r117    r117 ← 0x20 (IFZ)
r70 = 0xffffffff (QNaN)    ifixieeeflags r70 → r120    r120 ← 0x10 (INV)
r80 = 0xffbfffff (SNaN)    ifixieeeflags r80 → r122    r122 ← 0x10 (INV)

[Figure: exception-flag bit vector, bits 31..0. Bit 6: OFZ, bit 5: IFZ, bit 4: INV, bit 3: OVF, bit 2: UNF, bit 1: INX, bit 0: DBZ.]
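The conversion results and flag values listed in the ifixieee and ifixieeeflags examples can be reproduced with a small Python model. It is a sketch under stated assumptions: the PCSW rounding mode is taken to be round-to-nearest (the examples are consistent with that setting), the function name ifixieee_nearest is hypothetical, and only the INX, INV and IFZ cases that appear in the examples are modeled.

```python
import math
import struct

INX, INV, IFZ = 0x02, 0x10, 0x20   # flag bit values taken from the examples

def bits_to_float(bits):
    """Reinterpret a 32-bit pattern as a single-precision value."""
    return struct.unpack(">f", struct.pack(">I", bits & 0xFFFFFFFF))[0]

def ifixieee_nearest(bits):
    """Return (rdest, flags), assuming round-to-nearest is selected in PCSW."""
    exponent, mantissa = (bits >> 23) & 0xFF, bits & 0x7FFFFF
    if exponent == 0 and mantissa != 0:
        return 0, IFZ                      # denormal input is replaced by zero
    x = bits_to_float(bits)
    if math.isnan(x):
        return 0, INV
    if math.isinf(x):
        return (0x7FFFFFFF if x > 0 else 0x80000000), INV
    n = round(x)                           # Python rounds halves to even
    if n > 2**31 - 1:
        return 0x7FFFFFFF, INV             # saturate on positive overflow
    if n < -2**31:
        return 0x80000000, INV             # saturate on negative overflow
    return n & 0xFFFFFFFF, (INX if n != x else 0)

print(ifixieee_nearest(0x40247ae1))   # (3, 2): 2.57 rounds to 3, INX
```

The second element of the returned pair corresponds to what ifixieeeflags would write to rdest for the same input.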

SEE ALSO

ifixieee ufixieeeflags

ifixrzflags ufixrzflags

ifixieeeflags

Convert floating-point to integer with round toward zero

SYNTAX

[ IF rguard ] ifixrz rsrc1 → rdest

FUNCTION

if rguard then {

rdest ← (long) ((float)rsrc1)

}

ATTRIBUTES

Function unit falu

Operation code 21

Number of operands 1

Modifier No

Modifier range —

Latency 3

Issue slots 1, 4

DESCRIPTION

The ifixrz operation converts the single-precision IEEE floating-point value in rsrc1 to a signed integer and

writes the result into rdest. Rounding toward zero is performed; the IEEE rounding mode bits in PCSW are ignored.

This is the preferred rounding for ANSI C. If rsrc1 is denormalized, zero is substituted before conversion, and the IFZ

flag in the PCSW is set. If ifixrz causes an IEEE exception, such as overflow or underflow, the corresponding

exception flags in the PCSW are set. The PCSW exception flags are sticky: the flags can be set as a side-effect of any

floating-point operation but can only be reset by an explicit writepcsw operation. The update of the PCSW

exception flags occurs at the same time as rdest is written. If any other floating-point compute operations update the

PCSW at the same time, the net result in each exception flag is the logical OR of all simultaneous updates ORed with

the existing PCSW value for that exception flag.

The ifixrzflags operation computes the exception flags that would result from an individual ifixrz.

The ifixrz operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest and the exception flags in PCSW are written;

otherwise, rdest is not changed and the operation does not affect the exception flags in PCSW.

EXAMPLES

Initial Values    Operation    Result
r30 = 0x40400000 (3.0)    ifixrz r30 → r100    r100 ← 3
r35 = 0x40247ae1 (2.57)    ifixrz r35 → r102    r102 ← 2, INX flag set
r10 = 0, r40 = 0xff7fffff (–3.402823466e+38)    IF r10 ifixrz r40 → r105    no change, since guard is false
r20 = 1, r40 = 0xff7fffff (–3.402823466e+38)    IF r20 ifixrz r40 → r110    r110 ← 0x80000000 (-2^31), INV flag set
r45 = 0x7f800000 (+INF)    ifixrz r45 → r112    r112 ← 0x7fffffff (2^31-1), INV flag set
r50 = 0xbfc147ae (-1.51)    ifixrz r50 → r115    r115 ← -1, INX flag set
r60 = 0x00400000 (5.877471754e-39)    ifixrz r60 → r117    r117 ← 0, IFZ set
r70 = 0xffffffff (QNaN)    ifixrz r70 → r120    r120 ← 0, INV flag set
r80 = 0xffbfffff (SNaN)    ifixrz r80 → r122    r122 ← 0, INV flag set
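Round-toward-zero conversion matches the behavior of a C cast from float to a signed integer for in-range values, with saturation and flags added for the special cases shown in the examples. The sketch below models that in Python; the function name ifixrz and the flag constants (copied from the ifixrzflags examples) are the only names introduced, and flag cases not shown in the examples are not modeled.

```python
import math
import struct

INX, INV, IFZ = 0x02, 0x10, 0x20   # flag bit values taken from the examples

def ifixrz(bits):
    """Return (rdest, flags) for a round-toward-zero float-to-int conversion."""
    exponent, mantissa = (bits >> 23) & 0xFF, bits & 0x7FFFFF
    if exponent == 0 and mantissa != 0:
        return 0, IFZ                          # denormal input is replaced by zero
    x = struct.unpack(">f", struct.pack(">I", bits & 0xFFFFFFFF))[0]
    if math.isnan(x):
        return 0, INV
    if math.isinf(x):
        return (0x7FFFFFFF if x > 0 else 0x80000000), INV
    n = math.trunc(x)                          # discard the fraction: toward zero
    if n > 2**31 - 1:
        return 0x7FFFFFFF, INV
    if n < -2**31:
        return 0x80000000, INV
    return n & 0xFFFFFFFF, (INX if n != x else 0)

print(ifixrz(0x40247ae1))   # (2, 2): 2.57 truncates to 2, INX
```

Compared with the PCSW-mode conversion, only the rounding step changes: 2.57 becomes 2 instead of 3, and -1.51 becomes -1 instead of -2.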

SEE ALSO

ifixieee ufixieee ufixrz

ifixrz

IEEE status flags from convert floating-point to integer with round toward zero

SYNTAX

[ IF rguard ] ifixrzflags rsrc1 → rdest

FUNCTION

if rguard then

rdest ← ieee_flags((long) ((float)rsrc1))

ATTRIBUTES

Function unit falu

Operation code 129

Number of operands 1

Modifier No

Modifier range —

Latency 3

Issue slots 1, 4

DESCRIPTION

The ifixrzflags operation computes the IEEE exceptions that would result from converting the single-precision

IEEE floating-point value in rsrc1 to a signed integer, and an integer bit vector representing the computed exception

flags is written into rdest. The bit vector stored in rdest has the same format as the IEEE exception bits in the PCSW.

The exception flags in PCSW are left unchanged by this operation. Rounding toward zero is performed; the IEEE

rounding mode bits in PCSW are ignored. If rsrc1 is denormalized, zero is substituted before computing the

conversion, and the IFZ bit in the result is set.

The ifixrzflags operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls

the modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values    Operation    Result
r30 = 0x40400000 (3.0)    ifixrzflags r30 → r100    r100 ← 0
r35 = 0x40247ae1 (2.57)    ifixrzflags r35 → r102    r102 ← 0x02 (INX)
r10 = 0, r40 = 0xff7fffff (–3.402823466e+38)    IF r10 ifixrzflags r40 → r105    no change, since guard is false
r20 = 1, r40 = 0xff7fffff (–3.402823466e+38)    IF r20 ifixrzflags r40 → r110    r110 ← 0x10 (INV)
r45 = 0x7f800000 (+INF)    ifixrzflags r45 → r112    r112 ← 0x10 (INV)
r50 = 0xbfc147ae (-1.51)    ifixrzflags r50 → r115    r115 ← 0x02 (INX)
r60 = 0x00400000 (5.877471754e-39)    ifixrzflags r60 → r117    r117 ← 0x20 (IFZ)
r70 = 0xffffffff (QNaN)    ifixrzflags r70 → r120    r120 ← 0x10 (INV)
r80 = 0xffbfffff (SNaN)    ifixrzflags r80 → r122    r122 ← 0x10 (INV)

[Figure: exception-flag bit vector, bits 31..0. Bit 6: OFZ, bit 5: IFZ, bit 4: INV, bit 3: OVF, bit 2: UNF, bit 1: INX, bit 0: DBZ.]

SEE ALSO

ifixrz ufixrzflags

ifixieeeflags

ufixieeeflags

ifixrzflags

If non-zero negate

SYNTAX

[ IF rguard ] iflip rsrc1 rsrc2 → rdest

FUNCTION

if rguard then {

if rsrc1 = 0 then

rdest ← rsrc2

else

rdest ← –rsrc2

}

ATTRIBUTES

Function unit dspalu

Operation code 77

Number of operands 2

Modifier No

Modifier range —

Latency 2

Issue slots 1, 3

DESCRIPTION

The iflip operation copies rsrc2 to rdest if rsrc1 = 0; otherwise (if rsrc1 != 0), rdest is set to the two’s-complement

of rsrc2. All values are signed integers.

The iflip operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values    Operation    Result
r30 = 0, r40 = 1    iflip r30 r40 → r50    r50 ← 0x1
r10 = 0, r60 = 0xffff0000, r70 = 0xabc    IF r10 iflip r60 r70 → r80    no change, since guard is false
r20 = 1, r60 = 0xffff0000, r70 = 0xabc    IF r20 iflip r60 r70 → r90    r90 ← 0xfffff544
r30 = 0, r100 = 0xffffff9c    iflip r30 r100 → r110    r110 ← 0xffffff9c
r40 = 1, r110 = 0xffffffff    iflip r40 r110 → r120    r120 ← 0x1
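The conditional negate can be modeled in one line of arithmetic. The function name iflip and the omission of the guard are choices made for this sketch.

```python
MASK32 = 0xFFFFFFFF

def iflip(rsrc1, rsrc2):
    """Return rsrc2 if rsrc1 is zero, otherwise the two's-complement of rsrc2."""
    if rsrc1 & MASK32 == 0:
        return rsrc2 & MASK32
    return (-rsrc2) & MASK32     # negate modulo 2**32

print(hex(iflip(0xffff0000, 0xabc)))   # 0xfffff544
print(hex(iflip(1, 0xffffffff)))       # 0x1
```

Note that in this model negating 0x80000000 yields 0x80000000 again, since +2^31 is not representable in 32 signed bits; the text does not say whether the hardware behaves the same way.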

SEE ALSO

inonzero izero

iflip

Convert signed integer to floating-point

SYNTAX

[ IF rguard ] ifloat rsrc1 → rdest

FUNCTION

if rguard then {

rdest ← (float) ((long)rsrc1)

}

ATTRIBUTES

Function unit falu

Operation code 20

Number of operands 1

Modifier No

Modifier range —

Latency 3

Issue slots 1, 4

DESCRIPTION

The ifloat operation converts the signed integer value in rsrc1 to single-precision IEEE floating-point format and

writes the result into rdest. Rounding is according to the IEEE rounding mode bits in PCSW. If ifloat causes an

IEEE exception, such as inexact, the corresponding exception flags in the PCSW are set. The PCSW exception flags

are sticky: the flags can be set as a side-effect of any floating-point operation but can only be reset by an explicit

writepcsw operation. The update of the PCSW exception flags occurs at the same time as rdest is written. If any

other floating-point compute operations update the PCSW at the same time, the net result in each exception flag is the

logical OR of all simultaneous updates ORed with the existing PCSW value for that exception flag.

The ifloatflags operation computes the exception flags that would result from an individual ifloat.

The ifloat operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest and the exception flags in PCSW are written;

otherwise, rdest is not changed and the operation does not affect the exception flags in PCSW.

EXAMPLES

Initial Values    Operation    Result
r30 = 3    ifloat r30 → r100    r100 ← 0x40400000 (3.0)
r40 = 0xffffffff (-1)    ifloat r40 → r105    r105 ← 0xbf800000 (-1.0)
r10 = 0, r50 = 0xfffffffd    IF r10 ifloat r50 → r110    no change, since guard is false
r20 = 1, r50 = 0xfffffffd    IF r20 ifloat r50 → r115    r115 ← 0xc0400000 (–3.0)
r60 = 0x7fffffff (2147483647)    ifloat r60 → r117    r117 ← 0x4f000000 (2.147483648e+9), INX flag set
r70 = 0x80000000 (-2147483648)    ifloat r70 → r120    r120 ← 0xcf000000 (-2.147483648e+9)
r80 = 0x7ffffff1 (2147483633)    ifloat r80 → r122    r122 ← 0x4f000000 (2.147483648e+9), INX flag set
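The inexact cases in these examples arise because a single-precision value has only a 24-bit significand. The Python sketch below reproduces them, assuming the PCSW selects round-to-nearest; it relies on struct.pack with the "f" format narrowing a double to single precision with round-to-nearest, which is how CPython behaves on IEEE platforms. The function name ifloat_nearest is hypothetical.

```python
import struct

def to_signed32(x):
    """Interpret a 32-bit register pattern as a signed integer."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x

def ifloat_nearest(rsrc1):
    """Return (bit pattern of the float result, inexact?) for round-to-nearest."""
    value = to_signed32(rsrc1)
    packed = struct.pack(">f", float(value))       # narrow to single precision
    bits = struct.unpack(">I", packed)[0]
    inexact = struct.unpack(">f", packed)[0] != value
    return bits, inexact

print(ifloat_nearest(0x7fffffff))   # (1325400064, True), i.e. 0x4f000000 with INX
```

Any integer whose magnitude needs more than 24 significant bits is a candidate for the INX flag; 0x80000000 is exact because it is a power of two.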

SEE ALSO

ufloat ifloatrz ufloatrz

ifixieee ifloatflags

ifloat

IEEE status flags from convert signed integer to floating-point

SYNTAX

[ IF rguard ] ifloatflags rsrc1 → rdest

FUNCTION

if rguard then

rdest ← ieee_flags((float) ((long)rsrc1))

ATTRIBUTES

Function unit falu

Operation code 130

Number of operands 1

Modifier No

Modifier range —

Latency 3

Issue slots 1, 4

DESCRIPTION

The ifloatflags operation computes the IEEE exceptions that would result from converting the signed integer

in rsrc1 to a single-precision IEEE floating-point value, and an integer bit vector representing the computed exception

flags is written into rdest. The bit vector stored in rdest has the same format as the IEEE exception bits in the PCSW.

The exception flags in PCSW are left unchanged by this operation. Rounding is according to the IEEE rounding mode

bits in PCSW.

The ifloatflags operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls

the modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values    Operation    Result
r30 = 3    ifloatflags r30 → r100    r100 ← 0
r40 = 0xffffffff (-1)    ifloatflags r40 → r105    r105 ← 0
r10 = 0, r50 = 0xfffffffd    IF r10 ifloatflags r50 → r110    no change, since guard is false
r20 = 1, r50 = 0xfffffffd    IF r20 ifloatflags r50 → r115    r115 ← 0
r60 = 0x7fffffff (2147483647)    ifloatflags r60 → r117    r117 ← 0x02 (INX)
r70 = 0x80000000 (-2147483648)    ifloatflags r70 → r120    r120 ← 0
r80 = 0x7ffffff1 (2147483633)    ifloatflags r80 → r122    r122 ← 0x02 (INX)

[Figure: exception-flag bit vector, bits 31..0. Bit 6: OFZ, bit 5: IFZ, bit 4: INV, bit 3: OVF, bit 2: UNF, bit 1: INX, bit 0: DBZ.]

SEE ALSO

ifloat ifloatrzflags

ufloatflags ufloatrzflags

ifloatflags

Convert signed integer to floating-point with rounding toward zero

SYNTAX

[ IF rguard ] ifloatrz rsrc1 → rdest

FUNCTION

if rguard then {

rdest ← (float) ((long)rsrc1)

}

ATTRIBUTES

Function unit falu

Operation code 117

Number of operands 1

Modifier No

Modifier range —

Latency 3

Issue slots 1, 4

DESCRIPTION

The ifloatrz operation converts the signed integer value in rsrc1 to single-precision IEEE floating-point format

and writes the result into rdest. Rounding is performed toward zero; the IEEE rounding mode bits in PCSW are

ignored. This is the preferred rounding mode for ANSI C. If ifloatrz causes an IEEE exception, such as inexact,

the corresponding exception flags in the PCSW are set. The PCSW exception flags are sticky: the flags can be set as

a side-effect of any floating-point operation but can only be reset by an explicit writepcsw operation. The update of

the PCSW exception flags occurs at the same time as rdest is written. If any other floating-point compute operations

update the PCSW at the same time, the net result in each exception flag is the logical OR of all simultaneous updates

ORed with the existing PCSW value for that exception flag.

The ifloatrzflags operation computes the exception flags that would result from an individual ifloatrz.

The ifloatrz operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest and the exception flags in PCSW are written;

otherwise, rdest is not changed and the operation does not affect the exception flags in PCSW.

EXAMPLES

Initial Values Operation Result

r30 = 3 ifloatrz r30  r100 r100  0x40400000 (3.0)

r40 = 0xffffffff (-1) ifloatrz r40  r105 r105  0xbf800000 (-1.0)

r10 = 0, r50 = 0xfffffffd IF r10 ifloatrz r50  r110 no change, since guard is false

r20 = 1, r50 = 0xfffffffd IF r20 ifloatrz r50  r115 r115  0xc0400000 (–3.0)

r60 = 0x7fffffff (2147483647) ifloatrz r60  r117 r117  0x4effffff (2.147483520e+9), INX flag set

r70 = 0x80000000 (-2147483648) ifloatrz r70  r120 r120  0xcf000000 (-2.147483648e+9)

r80 = 0x7ffffff1 (2147483633) ifloatrz r80  r122 r122  0x4effffff (2.147483520e+9), INX flag set
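The conversion rule and the INX cases in the table can be checked with a small Python model. This is only a sketch, not the hardware algorithm: the function name mirrors the operation, the guard is omitted, and using Python's struct module to obtain the IEEE single-precision bit pattern is an assumption made for illustration.

```python
import struct


def ifloatrz(rsrc1: int):
    """Model of ifloatrz: return (float32 bit pattern, inexact flag)."""
    # Interpret the 32-bit register value as a signed integer.
    x = rsrc1 & 0xFFFFFFFF
    if x & 0x80000000:
        x -= 1 << 32
    magnitude = abs(x)
    # A single-precision significand holds 24 bits; rounding toward zero
    # discards any lower-order bits of the magnitude.
    shift = max(magnitude.bit_length() - 24, 0)
    truncated = (magnitude >> shift) << shift
    inexact = truncated != magnitude
    value = float(-truncated if x < 0 else truncated)
    # The truncated value is exactly representable, so packing cannot round.
    bits = struct.unpack(">I", struct.pack(">f", value))[0]
    return bits, inexact
```

Calling ifloatrz(0x7fffffff) in this model drops the low seven bits of the magnitude, which is why the result is inexact while ifloatrz(0x80000000) is exact.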

SEE ALSO

ifloat ufloatrz ifixieee

ifloatflags

ifloatrz


IEEE status flags from convert signed integer to

floating-point with rounding toward zero

SYNTAX

[ IF rguard ] ifloatrzflags rsrc1  rdest

FUNCTION

if rguard then

rdest  ieee_flags((float) ((long)rsrc1))

ATTRIBUTES

Function unit falu

Operation code 118

Number of operands 1

Modifier No

Modifier range —

Latency 3

Issue slots 1, 4

DESCRIPTION

The ifloatrzflags operation computes the IEEE exceptions that would result from converting the signed

integer in rsrc1 to a single-precision IEEE floating-point value, and an integer bit vector representing the computed

exception flags is written into rdest. The bit vector stored in rdest has the same format as the IEEE exception bits in

the PCSW. The exception flags in PCSW are left unchanged by this operation. Rounding is performed toward zero;

the IEEE rounding mode bits in PCSW are ignored.

The ifloatrzflags operation optionally takes a guard, specified in rguard. If a guard is present, its LSB

controls the modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not

changed.

EXAMPLES

Initial Values Operation Result

r30 = 3 ifloatrzflags r30  r100 r100  0

r40 = 0xffffffff (-1) ifloatrzflags r40  r105 r105  0

r10 = 0, r50 = 0xfffffffd IF r10 ifloatrzflags r50  r110 no change, since guard is false

r20 = 1, r50 = 0xfffffffd IF r20 ifloatrzflags r50  r115 r115  0

r60 = 0x7fffffff (2147483647) ifloatrzflags r60  r117 r117  0x02 (INX)

r70 = 0x80000000 (-2147483648) ifloatrzflags r70  r120 r120  0

r80 = 0x7ffffff1 (2147483633) ifloatrzflags r80  r122 r122  0x02 (INX)

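Exception flag bit positions in rdest: OFZ = bit 6, IFZ = bit 5, INV = bit 4, OVF = bit 3, UNF = bit 2, INX = bit 1, DBZ = bit 0; bits 31..7 hold no exception flags.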
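The flag vector can be modelled in a few lines of Python. The sketch below assumes the bit order implied by the flag names listed above together with the 0x02 (INX) results in the examples, namely DBZ in bit 0 up to OFZ in bit 6. The names FLAG_BITS and decode_flags are hypothetical helpers. Because every 32-bit signed integer lies within single-precision range, the model only ever reports INX.

```python
# Assumed layout: DBZ = bit 0 ... OFZ = bit 6 (inferred, see lead-in).
FLAG_BITS = {"DBZ": 0, "INX": 1, "UNF": 2, "OVF": 3, "INV": 4, "IFZ": 5, "OFZ": 6}


def decode_flags(vector: int) -> list:
    """Return the names of the flags set in a flag bit vector, LSB first."""
    ordered = sorted(FLAG_BITS.items(), key=lambda item: item[1])
    return [name for name, bit in ordered if (vector >> bit) & 1]


def ifloatrzflags(rsrc1: int) -> int:
    """Model of ifloatrzflags: flag vector for a signed int to float32 conversion."""
    x = rsrc1 & 0xFFFFFFFF
    if x & 0x80000000:
        x -= 1 << 32
    magnitude = abs(x)
    # Bits below the 24-bit significand are lost, which makes the result inexact.
    shift = max(magnitude.bit_length() - 24, 0)
    inexact = ((magnitude >> shift) << shift) != magnitude
    return (1 << FLAG_BITS["INX"]) if inexact else 0
```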

SEE ALSO

ifloatrz ifloatflags

ufloatflags ufloatrzflags

ifloatrzflags


Signed compare greater or equal

SYNTAX

[ IF rguard ] igeq rsrc1 rsrc2  rdest

FUNCTION

if rguard then {

if rsrc1 >= rsrc2 then

rdest  1

else

rdest  0

}

ATTRIBUTES

Function unit alu

Operation code 14

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The igeq operation sets the destination register, rdest, to 1 if the first argument, rsrc1, is greater than or equal to

the second argument, rsrc2; otherwise, rdest is set to 0. The arguments are treated as signed integers.

The igeq operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values Operation Result

r30 = 3, r40 = 4 igeq r30 r40  r80 r80  0

r10 = 0, r60 = 0x100, r30 = 3 IF r10 igeq r60 r30  r50 no change, since guard is false

r20 = 1, r50 = 0x1000, r60 = 0x100 IF r20 igeq r50 r60  r90 r90  1

r70 = 0x80000000, r40 = 4 igeq r70 r40  r100 r100  0

r70 = 0x80000000 igeq r70 r70  r110 r110  1
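A guarded signed compare can be sketched in Python as follows. The helper to_signed32 and the rdest_old parameter (the previous content of the destination register) are illustrative names, not part of the instruction set; the sketch only mirrors the FUNCTION pseudocode and the guard rule described above.

```python
def to_signed32(value: int) -> int:
    """Interpret a 32-bit register value as a two's-complement integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def igeq(rsrc1: int, rsrc2: int, rdest_old: int = 0, rguard: int = 1) -> int:
    """Return the new rdest value; only the LSB of rguard is examined."""
    if not (rguard & 1):
        return rdest_old  # guard false: destination register is unchanged
    return 1 if to_signed32(rsrc1) >= to_signed32(rsrc2) else 0
```

Note that 0x80000000 compares as the most negative value, so igeq(0x80000000, 4) yields 0 even though the bit pattern is large when read as unsigned.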

SEE ALSO

ileq igeqi

igeq


Signed compare greater or equal with immediate

SYNTAX

[ IF rguard ] igeqi(n) rsrc1  rdest

FUNCTION

if rguard then {

if rsrc1 >= n then

rdest  1

else

rdest  0

}

ATTRIBUTES

Function unit alu

Operation code 1

Number of operands 1

Modifier 7 bits

Modifier range –64..63

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The igeqi operation sets the destination register, rdest, to 1 if the first argument, rsrc1, is greater than or equal to

the opcode modifier, n; otherwise, rdest is set to 0. The arguments are treated as signed integers.

The igeqi operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values Operation Result

r30 = 3 igeqi(2) r30  r80 r80  1

r30 = 3 igeqi(3) r30  r90 r90  1

r30 = 3 igeqi(4) r30  r100 r100  0

r10 = 0, r40 = 0x100 IF r10 igeqi(63) r40  r50 no change, since guard is false

r20 = 1, r40 = 0x100 IF r20 igeqi(63) r40  r100 r100  1

r60 = 0x80000000 igeqi(-64) r60  r120 r120  0
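The 7-bit modifier limits the immediate to the range –64..63. A minimal Python sketch of the compare with a range check is shown below; modifier_in_range and to_signed32 are hypothetical helper names, the guard is left out for brevity, and raising an error for an out-of-range immediate is a choice made for this model (the data book does not say how tools report it).

```python
def to_signed32(value: int) -> int:
    """Interpret a 32-bit register value as a two's-complement integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def modifier_in_range(n: int) -> bool:
    """True if n fits the 7-bit signed modifier range -64..63."""
    return -64 <= n <= 63


def igeqi(n: int, rsrc1: int) -> int:
    """Unguarded model of igeqi(n) rsrc1: 1 if rsrc1 >= n, else 0."""
    if not modifier_in_range(n):
        raise ValueError("modifier %d is outside -64..63" % n)
    return 1 if to_signed32(rsrc1) >= n else 0
```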

SEE ALSO

igeq iles ieqli

igeqi


Signed compare greater

SYNTAX

[ IF rguard ] igtr rsrc1 rsrc2  rdest

FUNCTION

if rguard then {

if rsrc1 > rsrc2 then

rdest  1

else

rdest  0

}

ATTRIBUTES

Function unit alu

Operation code 15

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The igtr operation sets the destination register, rdest, to 1 if the first argument, rsrc1, is greater than the second

argument, rsrc2; otherwise, rdest is set to 0. The arguments are treated as signed integers.

The igtr operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values Operation Result

r30 = 3, r40 = 4 igtr r30 r40  r80 r80  0

r10 = 0, r60 = 0x100, r30 = 3 IF r10 igtr r60 r30  r50 no change, since guard is false

r20 = 1, r50 = 0x1000, r60 = 0x100 IF r20 igtr r50 r60  r90 r90  1

r70 = 0x80000000, r40 = 4 igtr r70 r40  r100 r100  0

r70 = 0x80000000 igtr r70 r70  r110 r110  0

SEE ALSO

iles igtri

igtr


Signed compare greater with immediate

SYNTAX

[ IF rguard ] igtri(n) rsrc1  rdest

FUNCTION

if rguard then {

if rsrc1 > n then

rdest  1

else

rdest  0

}

ATTRIBUTES

Function unit alu

Operation code 0

Number of operands 1

Modifier 7 bits

Modifier range –64..63

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The igtri operation sets the destination register, rdest, to 1 if the first argument, rsrc1, is greater than the opcode

modifier, n; otherwise, rdest is set to 0. The arguments are treated as signed integers.

The igtri operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values Operation Result

r30 = 3 igtri(2) r30  r80 r80  1

r30 = 3 igtri(3) r30  r90 r90  0

r30 = 3 igtri(4) r30  r100 r100  0

r10 = 0, r40 = 0x100 IF r10 igtri(63) r40  r50 no change, since guard is false

r20 = 1, r40 = 0x100 IF r20 igtri(63) r40  r100 r100  1

r60 = 0x80000000 igtri(-64) r60  r120 r120  0

SEE ALSO

igtr igeqi

igtri


Signed immediate

SYNTAX

iimm(n)  rdest

FUNCTION

rdest  n

ATTRIBUTES

Function unit const

Operation code 191

Number of operands 0

Modifier 32 bits

Modifier range 0x80000000..0x7fffffff

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The iimm operation stores the signed 32-bit opcode modifier n into rdest. Note: this operation is not guarded.

EXAMPLES

Initial Values Operation Result

iimm(2)  r10 r10  2

iimm(0x100)  r20 r20  0x100

iimm(0xfffc0000)  r30 r30  0xfffc0000

SEE ALSO

uimm

iimm


Interruptible indirect jump on false

SYNTAX

[ IF rguard ] ijmpf rsrc1 rsrc2

FUNCTION

if rguard then {

if (rsrc1 & 1) = 0 then {

DPC  rsrc2

if exception is pending then

service exception

elseif interrupt is pending then

service interrupts

else

PC, SPC  rsrc2

}

}

ATTRIBUTES

Function unit branch

Operation code 181

Number of operands 2

Modifier no

Modifier range —

Delay 3

Issue slots 2, 3, 4

DESCRIPTION

The ijmpf operation conditionally changes the program flow and allows pending interrupts or exceptions to be

serviced. If neither interrupts nor exceptions are pending and the LSB of rsrc1 is 0, the DPC, PC, and SPC registers are

set equal to rsrc2. If an interrupt or exception is pending and the LSB of rsrc1 is 0, DPC is set equal to rsrc2 and the

service routine is invoked, where exceptions have priority over interrupts. If the LSB of rsrc1 is 1, program execution

continues with the next sequential instruction.

The ijmpf operation optionally takes a guard, specified in rguard. If a guard is present, its LSB adds another

condition to the jump. If the LSB of rguard is 1, the instruction executes as previously described; otherwise, the jump

will not be taken and PC, DPC, and SPC are not modified regardless of the value of rsrc1.

EXAMPLES

Initial Values Operation Result

r50 = 0, r70 = 0x330 ijmpf r50 r70 program execution continues at 0x330 after first servicing pending interrupts

r20 = 1, r70 = 0x330 ijmpf r20 r70 since r20 is true, program execution continues with next sequential instruction

r30 = 0, r50 = 0, r60 = 0x8000 IF r30 ijmpf r50 r60 since guard is false, program execution continues with next sequential instruction

r40 = 1, r50 = 0, r60 = 0x8000 IF r40 ijmpf r50 r60 program execution continues at 0x8000 after first servicing pending interrupts
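The decision order in the FUNCTION block (guard, condition bit, exception, interrupt, plain jump) can be traced with a small Python model. This is a sketch under simplifying assumptions: the state dictionary, the returned action strings, and the two pending flags are invented for illustration, and the three delay slots listed under ATTRIBUTES are not modelled.

```python
def ijmpf(rsrc1, rsrc2, state, rguard=1,
          exception_pending=False, interrupt_pending=False):
    """Update state['PC'/'DPC'/'SPC'] and return a label for the action taken."""
    # Guard false, or LSB of rsrc1 set: no jump and no register is modified.
    if not (rguard & 1) or (rsrc1 & 1) != 0:
        return "fall through"
    # DPC is written as soon as the jump is taken.
    state["DPC"] = rsrc2
    # Exceptions have priority over interrupts.
    if exception_pending:
        return "service exception"
    if interrupt_pending:
        return "service interrupts"
    state["PC"] = state["SPC"] = rsrc2
    return "jump"
```

In this model only DPC changes when a handler is entered, which matches the description that DPC is set to rsrc2 before the service routine is invoked.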

SEE ALSO

jmpf jmpt jmpi ijmpt ijmpi

ijmpf


Interruptible jump immediate

SYNTAX

[ IF rguard ] ijmpi(address)

FUNCTION

if rguard then {

DPC  address

if exception is pending then

service exception

else if interrupt is pending then

service interrupts

else

PC, SPC  address

}

ATTRIBUTES

Function unit branch

Operation code 179

Number of operands 0

Modifier 32 bits

Modifier range 0..0xffffffff

Delay 3

Issue slots 2, 3, 4

DESCRIPTION

The ijmpi operation changes the program flow and allows pending interrupts or exceptions to be serviced. If no

interrupts or exceptions are pending, the DPC, PC, and SPC registers are set equal to address. If an exception or

interrupt is pending, DPC is set equal to address and a service routine is invoked, where exceptions have priority

over interrupts. address is an immediate opcode modifier.

The ijmpi operation optionally takes a guard, specified in rguard. If a guard is present, its LSB adds a condition to

the jump. If the LSB of rguard is 1, the instruction executes as previously described; otherwise, the jump will not be

taken and PC, DPC, and SPC are not modified.

EXAMPLES

Initial Values Operation Result

ijmpi(0x330) program execution continues at 0x330

r30 = 0 IF r30 ijmpi(0x8000) since guard is false, program execution continues with next sequential instruction

r40 = 1 IF r40 ijmpi(0x8000) program execution continues at 0x8000

SEE ALSO

jmpf jmpt jmpi ijmpf ijmpt

ijmpi


Interruptible indirect jump on true

SYNTAX

[ IF rguard ] ijmpt rsrc1 rsrc2

FUNCTION

if rguard then {

if (rsrc1 & 1) = 1 then {

DPC  rsrc2

if exception is pending then

service exception

elseif interrupt is pending then

service interrupts

else

PC, SPC  rsrc2

}

}

ATTRIBUTES

Function unit branch

Operation code 177

Number of operands 2

Modifier no

Modifier range —

Delay 3

Issue slots 2, 3, 4

DESCRIPTION

The ijmpt operation conditionally changes the program flow and allows pending interrupts or exceptions to be

serviced. If no interrupts or exceptions are pending and the LSB of rsrc1 is 1, the DPC, PC, and SPC registers are set

equal to rsrc2. If an exception or interrupt is pending and the LSB of rsrc1 is 1, DPC is set equal to rsrc2 and a service

routine is invoked, where exceptions have priority over interrupts. If the LSB of rsrc1 is 0, program execution

continues with the next sequential instruction.

The ijmpt operation optionally takes a guard, specified in rguard. If a guard is present, its LSB adds another

condition to the jump. If the LSB of rguard is 1, the instruction executes as previously described; otherwise, the jump

will not be taken and PC, DPC, and SPC are not modified regardless of the value of rsrc1.

EXAMPLES

Initial Values Operation Result

r50 = 1, r70 = 0x330 ijmpt r50 r70 program execution continues at 0x330 after first servicing pending interrupts

r20 = 0, r70 = 0x330 ijmpt r20 r70 since r20 is false, program execution continues with next sequential instruction

r30 = 0, r50 = 1, r60 = 0x8000 IF r30 ijmpt r50 r60 since guard is false, program execution continues with next sequential instruction

r40 = 1, r50 = 1, r60 = 0x8000 IF r40 ijmpt r50 r60 program execution continues at 0x8000 after first servicing pending interrupts

SEE ALSO

jmpf jmpt jmpi ijmpf ijmpi

ijmpt


Signed 16-bit load

pseudo-op for ild16d(0)

SYNTAX

[ IF rguard ] ild16 rsrc1  rdest

FUNCTION

if rguard then {

if PCSW.bytesex = LITTLE_ENDIAN then

bs  1

else

bs  0

temp<7:0>  mem[(rsrc1 +(1  bs)]

temp<15:8>  mem[(rsrc1 + (0  bs)]

rdest  sign_ext16to32(temp<15:0>)

}

ATTRIBUTES

Function unit dmem

Operation code 6

Number of operands 1

Modifier No

Modifier range —

Latency 3

Issue slots 4, 5

DESCRIPTION

The ild16 operation is a pseudo operation transformed by the scheduler into an ild16d(0) with the same

argument. (Note: pseudo operations cannot be used in assembly source files.)

The ild16 operation loads the 16-bit memory value from the address contained in rsrc1, sign extends it to 32 bits,

and stores the result in rdest. If the memory address contained in rsrc1 is not a multiple of 2, the result of ild16 is

undefined but no exception will be raised. This load operation is performed as little-endian or big-endian depending on

the current setting of the bytesex bit in the PCSW.

The result of an access by ild16 to the MMIO address aperture is undefined; access to the MMIO aperture is

defined only for 32-bit loads and stores.

The ild16 operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register and the occurrence of side effects. If the LSB of rguard is 1, rdest is written and

the data cache status bits are updated if the addressed locations are cacheable. If the LSB of rguard is 0, rdest is not

changed and ild16 has no side effects whatever.

EXAMPLES

Initial Values Operation Result

r10 = 0xd00, [0xd00] = 0x22, [0xd01] = 0x11 ild16 r10  r60 r60  0x00002211

r30 = 0, r20 = 0xd04, [0xd04] = 0x84, [0xd05] = 0x33 IF r30 ild16 r20  r70 no change, since guard is false

r40 = 1, r20 = 0xd04, [0xd04] = 0x84, [0xd05] = 0x33 IF r40 ild16 r20  r80 r80  0xffff8433

r50 = 0xd01 ild16 r50  r90 r90 undefined, since 0xd01 is not a multiple of 2
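The byte selection and sign extension can be sketched in Python. The model below XORs each byte offset with bs, which is an interpretation of the FUNCTION block that reproduces the big-endian results in the examples; representing memory as a dictionary from byte address to byte value and the little_endian parameter are assumptions made for illustration. Guards, cache effects, and misaligned addresses are not modelled.

```python
def sign_ext16to32(halfword: int) -> int:
    """Sign-extend a 16-bit value to a 32-bit register value."""
    halfword &= 0xFFFF
    return (halfword | 0xFFFF0000) if (halfword & 0x8000) else halfword


def ild16(mem: dict, rsrc1: int, little_endian: bool = False) -> int:
    """Model of ild16 for an aligned address; mem maps byte address to byte."""
    bs = 1 if little_endian else 0
    low = mem[rsrc1 + (1 ^ bs)]   # temp<7:0>
    high = mem[rsrc1 + (0 ^ bs)]  # temp<15:8>
    return sign_ext16to32((high << 8) | low)
```

With the same two bytes in memory, switching little_endian to True swaps which address supplies the high byte, so the first example row would load 0x1122 instead of 0x2211.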

SEE ALSO

ild16d ild16r ild16x

ild16


Signed 16-bit load with displacement

SYNTAX

[ IF rguard ] ild16d(d) rsrc1  rdest

FUNCTION

if rguard then {

if PCSW.bytesex = LITTLE_ENDIAN then

bs  1

else

bs  0

temp<7:0>  mem[(rsrc1 + d + (1  bs)]

temp<15:8>  mem[(rsrc1 + d + (0  bs)]

rdest  sign_ext16to32(temp<15:0>)

}

ATTRIBUTES

Function unit dmem

Operation code 6

Number of operands 1

Modifier 7 bits

Modifier range –128..126 by 2

Latency 3

Issue slots 4, 5

DESCRIPTION

The ild16d operation loads the 16-bit memory value from the address computed by rsrc1 + d, sign extends it to

32 bits, and stores the result in rdest. The d value is an opcode modifier, must be in the range –128 to 126 inclusive,

and must be a multiple of 2. If the memory address computed by rsrc1 + d is not a multiple of 2, the result of ild16d

is undefined but no exception will be raised. This load operation is performed as little-endian or big-endian depending

on the current setting of the bytesex bit in the PCSW.

The result of an access by ild16d to the MMIO address aperture is undefined; access to the MMIO aperture is

defined only for 32-bit loads and stores.

The ild16d operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register and the occurrence of side effects. If the LSB of rguard is 1, rdest is written and

the data cache status bits are updated if the addressed locations are cacheable. If the LSB of rguard is 0, rdest is not

changed and ild16d has no side effects whatever.

EXAMPLES

Initial Values Operation Result

r10 = 0xd00, [0xd02] = 0x22, [0xd03] = 0x11 ild16d(2) r10  r60 r60  0x00002211

r30 = 0, r20 = 0xd04, [0xd00] = 0x84, [0xd01] = 0x33 IF r30 ild16d(-4) r20  r70 no change, since guard is false

r40 = 1, r20 = 0xd04, [0xd00] = 0x84, [0xd01] = 0x33 IF r40 ild16d(-4) r20  r80 r80  0xffff8433

r50 = 0xd01 ild16d(-4) r50  r90 r90 undefined, since 0xd01 + (–4) is not a multiple of 2
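The two constraints on ild16d, a displacement in –128..126 that is a multiple of 2 and an effective address that is a multiple of 2, can be checked with a short Python sketch. The function name ild16d_address and its return shape are hypothetical, and wrapping the address to 32 bits is an assumption about the address arithmetic.

```python
def ild16d_address(rsrc1: int, d: int):
    """Return (effective address, aligned?) for ild16d(d) rsrc1."""
    # The modifier must be even and lie in -128..126.
    if not -128 <= d <= 126 or d % 2 != 0:
        raise ValueError("displacement %d is not an even value in -128..126" % d)
    address = (rsrc1 + d) & 0xFFFFFFFF  # assumed 32-bit wrap-around
    # The load result is defined only when the address is a multiple of 2.
    return address, address % 2 == 0
```

For the last example row, ild16d_address(0xd01, -4) reports address 0xcfd as unaligned, which corresponds to the undefined result.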

SEE ALSO

ild16 uld16 uld16d ild16r

uld16r ild16x uld16x

ild16d


Signed 16-bit load with index

SYNTAX

[ IF rguard ] ild16r rsrc1 rsrc2  rdest

FUNCTION

if rguard then {

if PCSW.bytesex = LITTLE_ENDIAN then

bs  1

else

bs  0

temp<7:0>  mem[(rsrc1 + rsrc2 +(1  bs)]

temp<15:8>  mem[(rsrc1 + rsrc2 + (0  bs)]

rdest  sign_ext16to32(temp<15:0>)

}

ATTRIBUTES

Function unit dmem

Operation code 195

Number of operands 2

Modifier No

Modifier range —

Latency 3

Issue slots 4, 5

DESCRIPTION

The ild16r operation loads the 16-bit memory value from the address computed by rsrc1 + rsrc2, sign extends it

to 32 bits, and stores the result in rdest. If the memory address computed by rsrc1 + rsrc2 is not a multiple of 2, the

result of ild16r is undefined but no exception will be raised. This load operation is performed as little-endian or big-

endian depending on the current setting of the bytesex bit in the PCSW.

The result of an access by ild16r to the MMIO address aperture is undefined; access to the MMIO aperture is

defined only for 32-bit loads and stores.

The ild16r operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register and the occurrence of side effects. If the LSB of rguard is 1, rdest is written and

the data cache status bits are updated if the addressed locations are cacheable. If the LSB of rguard is 0, rdest is not

changed and ild16r has no side effects whatever.

EXAMPLES

Initial Values Operation Result

r10 = 0xd00, r20 = 2, [0xd02] = 0x22, [0xd03] = 0x11 ild16r r10 r20  r80 r80  0x00002211

r50 = 0, r40 = 0xd04, r30 = 0xfffffffc, [0xd00] = 0x84, [0xd01] = 0x33 IF r50 ild16r r40 r30  r90 no change, since guard is false

r60 = 1, r40 = 0xd04, r30 = 0xfffffffc, [0xd00] = 0x84, [0xd01] = 0x33 IF r60 ild16r r40 r30  r100 r100  0xffff8433

r70 = 0xd01, r30 = 0xfffffffc ild16r r70 r30  r110 r110 undefined, since 0xd01 + (–4) is not a multiple of 2

SEE ALSO

ild16 uld16 ild16d uld16d

uld16r ild16x uld16x

ild16r


Signed 16-bit load with scaled index

SYNTAX

[ IF rguard ] ild16x rsrc1 rsrc2  rdest

FUNCTION

if rguard then {

if PCSW.bytesex = LITTLE_ENDIAN then

bs  1

else

bs  0

temp<7:0>  mem[(rsrc1 + (2  rsrc2) + (1  bs)]

temp<15:8>  mem[(rsrc1 + (2  rsrc2) + (0  bs)]

rdest  sign_ext16to32(temp<15:0>)

}

ATTRIBUTES

Function unit dmem

Operation code 196

Number of operands 2

Modifier No

Modifier range —

Latency 3

Issue slots 4, 5

DESCRIPTION

The ild16x operation loads the 16-bit memory value from the address computed by rsrc1 + 2 × rsrc2, sign extends

it to 32 bits, and stores the result in rdest. If the memory address computed by rsrc1 + 2 × rsrc2 is not a multiple of 2,

the result of ild16x is undefined but no exception will be raised. This load operation is performed as little-endian or

big-endian depending on the current setting of the bytesex bit in the PCSW.

The result of an access by ild16x to the MMIO address aperture is undefined; access to the MMIO aperture is

defined only for 32-bit loads and stores.

The ild16x operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register and the occurrence of side effects. If the LSB of rguard is 1, rdest is written and

the data cache status bits are updated if the addressed locations are cacheable. If the LSB of rguard is 0, rdest is not

changed and ild16x has no side effects whatever.

EXAMPLES

Initial Values Operation Result

r10 = 0xd00, r30 = 1, [0xd02] = 0x22, [0xd03] = 0x11 ild16x r10 r30  r100 r100  0x00002211

r50 = 0, r40 = 0xd04, r20 = 0xfffffffe, [0xd00] = 0x84, [0xd01] = 0x33 IF r50 ild16x r40 r20  r80 no change, since guard is false

r60 = 1, r40 = 0xd04, r20 = 0xfffffffe, [0xd00] = 0x84, [0xd01] = 0x33 IF r60 ild16x r40 r20  r90 r90  0xffff8433

r70 = 0xd01, r30 = 1 ild16x r70 r30  r110 r110 undefined, since 0xd01 + 2 × 1 is not a multiple of 2
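The scaled-index address arithmetic can be illustrated in Python. The sketch assumes that the sum rsrc1 + 2 × rsrc2 wraps modulo 2^32, which is what makes an index of 0xfffffffe behave as –2 in the examples; ild16x_address is a hypothetical helper name.

```python
def ild16x_address(rsrc1: int, rsrc2: int) -> int:
    """Effective address of ild16x: base plus index scaled by the 2-byte element size."""
    # Masking to 32 bits lets a large unsigned index act as a negative offset.
    return (rsrc1 + 2 * rsrc2) & 0xFFFFFFFF
```

For instance, ild16x_address(0xd04, 0xfffffffe) evaluates to 0xd00, the address read in the second and third example rows.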

SEE ALSO

ild16 uld16 ild16d uld16d

ild16r uld16r uld16x

ild16x


Signed 8-bit load

pseudo-op for ild8d(0)

SYNTAX

[ IF rguard ] ild8 rsrc1  rdest

FUNCTION

if rguard then

rdest  sign_ext8to32(mem[rsrc1])

ATTRIBUTES

Function unit dmem

Operation code 192

Number of operands 1

Modifier No

Modifier range —

Latency 3

Issue slots 4, 5

DESCRIPTION

The ild8 operation is a pseudo operation transformed by the scheduler into an ild8d(0) with the same

argument. (Note: pseudo operations cannot be used in assembly source files.)

The ild8 operation loads the 8-bit memory value from the address contained in rsrc1, sign extends it to 32 bits,

and stores the result in rdest. This operation does not depend on the bytesex bit in the PCSW since only a single byte

is loaded.

The result of an access by ild8 to the MMIO address aperture is undefined; access to the MMIO aperture is

defined only for 32-bit loads and stores.

The ild8 operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register and the occurrence of side effects. If the LSB of rguard is 1, rdest is written and

the data cache status bits are updated if the addressed location is cacheable. If the LSB of rguard is 0, rdest is not

changed and ild8 has no side effects whatever.

EXAMPLES

Initial Values Operation Result

r10 = 0xd00, [0xd00] = 0x22 ild8 r10  r60 r60  0x00000022

r30 = 0, r20 = 0xd04, [0xd04] = 0x84 IF r30 ild8 r20  r70 no change, since guard is false

r40 = 1, r20 = 0xd04, [0xd04] = 0x84 IF r40 ild8 r20  r80 r80  0xffffff84

r50 = 0xd01, [0xd01] = 0x33 ild8 r50  r90 r90  0x00000033
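A guarded 8-bit load with sign extension can be sketched in Python as follows. Memory is modelled as a dictionary from byte address to byte value and rdest_old stands for the previous content of the destination register; both are assumptions made for illustration, and cache side effects are not modelled.

```python
def sign_ext8to32(byte: int) -> int:
    """Sign-extend an 8-bit value to a 32-bit register value."""
    byte &= 0xFF
    return (byte | 0xFFFFFF00) if (byte & 0x80) else byte


def ild8(mem: dict, rsrc1: int, rdest_old: int = 0, rguard: int = 1) -> int:
    """Return the new rdest value; mem is not read when the guard is false."""
    if not (rguard & 1):
        return rdest_old
    return sign_ext8to32(mem[rsrc1])
```

Because a single byte is read, the model needs no bytesex parameter, and odd addresses such as 0xd01 are handled like any other.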

SEE ALSO

uld8 ild8d uld8d ild8r

uld8r

ild8


Signed 8-bit load with displacement

SYNTAX

[ IF rguard ] ild8d(d) rsrc1  rdest

FUNCTION

if rguard then

rdest  sign_ext8to32(mem[rsrc1 + d])

ATTRIBUTES

Function unit dmem

Operation code 192

Number of operands 1

Modifier 7 bits

Modifier range –64..63

Latency 3

Issue slots 4, 5

DESCRIPTION

The ild8d operation loads the 8-bit memory value from the address computed by rsrc1 + d, sign extends it to 32

bits, and stores the result in rdest. The d value is an opcode modifier in the range -64 to 63, inclusive. This operation

does not depend on the bytesex bit in the PCSW since only a single byte is loaded.

The result of an access by ild8d to the MMIO address aperture is undefined; access to the MMIO aperture is

defined only for 32-bit loads and stores.

The ild8d operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register and the occurrence of side effects. If the LSB of rguard is 1, rdest is written and

the data cache status bits are updated if the addressed location is cacheable. If the LSB of rguard is 0, rdest is not

changed and ild8d has no side effects whatever.

EXAMPLES

Initial Values Operation Result

r10 = 0xd00, [0xd02] = 0x22 ild8d(2) r10  r60 r60  0x00000022

r30 = 0, r20 = 0xd04, [0xd00] = 0x84 IF r30 ild8d(-4) r20  r70 no change, since guard is false

r40 = 1, r20 = 0xd04, [0xd00] = 0x84 IF r40 ild8d(-4) r20  r80 r80  0xffffff84

r50 = 0xd05, [0xd01] = 0x33 ild8d(-4) r50  r90 r90  0x00000033

SEE ALSO

ild8 uld8 uld8d ild8r

uld8r

ild8d


Signed 8-bit load with index

SYNTAX

[ IF rguard ] ild8r rsrc1 rsrc2  rdest

FUNCTION

if rguard then

rdest  sign_ext8to32(mem[rsrc1 + rsrc2])

ATTRIBUTES

Function unit dmem

Operation code 193

Number of operands 2

Modifier No

Modifier range —

Latency 3

Issue slots 4, 5

DESCRIPTION

The ild8r operation loads the 8-bit memory value from the address computed by rsrc1 + rsrc2, sign extends it to

32 bits, and stores the result in rdest. This operation does not depend on the bytesex bit in the PCSW since only a

single byte is loaded.

The result of an access by ild8r to the MMIO address aperture is undefined; access to the MMIO aperture is

defined only for 32-bit loads and stores.

The ild8r operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register and the occurrence of side effects. If the LSB of rguard is 1, rdest is written and

the data cache status bits are updated if the addressed location is cacheable. If the LSB of rguard is 0, rdest is not

changed and ild8r has no side effects whatever.

EXAMPLES

Initial Values Operation Result

r10 = 0xd00, r20 = 2, [0xd02] = 0x22 ild8r r10 r20  r80 r80  0x00000022

r50 = 0, r40 = 0xd04, r30 = 0xfffffffc, [0xd00] = 0x84 IF r50 ild8r r40 r30  r90 no change, since guard is false

r60 = 1, r40 = 0xd04, r30 = 0xfffffffc, [0xd00] = 0x84 IF r60 ild8r r40 r30  r100 r100  0xffffff84

r70 = 0xd05, r30 = 0xfffffffc, [0xd01] = 0x33 ild8r r70 r30  r110 r110  0x00000033

SEE ALSO

ild8 uld8 ild8d uld8d

uld8r

ild8r


Signed compare less or equal

pseudo-op for igeq

SYNTAX

[ IF rguard ] ileq rsrc1 rsrc2  rdest

FUNCTION

if rguard then {

if rsrc1 <= rsrc2 then

rdest  1

else

rdest  0

}

ATTRIBUTES

Function unit alu

Operation code 14

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The ileq operation is a pseudo operation transformed by the scheduler into an igeq with the arguments

exchanged (ileq’s rsrc1 is igeq’s rsrc2 and vice versa). (Note: pseudo operations cannot be used in assembly

source files.)

The ileq operation sets the destination register, rdest, to 1 if the first argument, rsrc1, is less than or equal to the

second argument, rsrc2; otherwise, rdest is set to 0. The arguments are treated as signed integers.

The ileq operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values Operation Result

r30 = 3, r40 = 4 ileq r30 r40  r80 r80  1

r10 = 0, r60 = 0x100, r30 = 3 IF r10 ileq r60 r30  r50 no change, since guard is false

r20 = 1, r50 = 0x1000, r60 = 0x100 IF r20 ileq r50 r60  r90 r90  0

r70 = 0x80000000, r40 = 4 ileq r70 r40  r100 r100  1

r70 = 0x80000000 ileq r70 r70  r110 r110  1
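The pseudo-op mapping described above, where ileq becomes igeq with the source operands exchanged, can be checked with a small Python sketch. The function names mirror the operations, to_signed32 is a hypothetical helper, and guards are omitted for brevity.

```python
def to_signed32(value: int) -> int:
    """Interpret a 32-bit register value as a two's-complement integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def igeq(rsrc1: int, rsrc2: int) -> int:
    """Signed compare greater or equal."""
    return 1 if to_signed32(rsrc1) >= to_signed32(rsrc2) else 0


def ileq(rsrc1: int, rsrc2: int) -> int:
    """ileq expressed the way the scheduler is described to emit it."""
    return igeq(rsrc2, rsrc1)  # operands exchanged
```

The exchange works because a <= b holds exactly when b >= a, so no separate opcode is needed; the same reasoning applies to iles and igtr.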

SEE ALSO

igeq ileqi

ileq


Signed compare less or equal with immediate

SYNTAX

[ IF rguard ] ileqi(n) rsrc1  rdest

FUNCTION

if rguard then {

if rsrc1 <= n then

rdest  1

else

rdest  0

}

ATTRIBUTES

Function unit alu

Operation code 42

Number of operands 1

Modifier 7 bits

Modifier range –64..63

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The ileqi operation sets the destination register, rdest, to 1 if the first argument, rsrc1, is less than or equal to the

opcode modifier, n; otherwise, rdest is set to 0. The arguments are treated as signed integers.

The ileqi operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values Operation Result

r30 = 3 ileqi(2) r30  r80 r80  0

r30 = 3 ileqi(3) r30  r90 r90  1

r30 = 3 ileqi(4) r30  r100 r100  1

r10 = 0, r40 = 0x100 IF r10 ileqi(63) r40  r50 no change, since guard is false

r20 = 1, r40 = 0x100 IF r20 ileqi(63) r40  r100 r100  0

r60 = 0x80000000 ileqi(-64) r60  r120 r120  1

SEE ALSO

ileq igeqi

ileqi


Signed compare less

pseudo-op for igtr

SYNTAX

[ IF rguard ] iles rsrc1 rsrc2  rdest

FUNCTION

if rguard then {

if rsrc1 < rsrc2 then

rdest  1

else

rdest  0

}

ATTRIBUTES

Function unit alu

Operation code 15

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The iles operation is a pseudo operation transformed by the scheduler into an igtr with the arguments

exchanged (iles’s rsrc1 is igtr’s rsrc2 and vice versa). (Note: pseudo operations cannot be used in assembly

source files.)

The iles operation sets the destination register, rdest, to 1 if the first argument, rsrc1, is less than the second

argument, rsrc2; otherwise, rdest is set to 0. The arguments are treated as signed integers.

The iles operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values Operation Result

r30 = 3, r40 = 4 iles r30 r40  r80 r80  1

r10 = 0, r60 = 0x100, r30 = 3 IF r10 iles r60 r30  r50 no change, since guard is false

r20 = 1, r50 = 0x1000, r60 = 0x100 IF r20 iles r50 r60  r90 r90  0

r70 = 0x80000000, r40 = 4 iles r70 r40  r100 r100  1

r70 = 0x80000000 iles r70 r70  r110 r110  0

SEE ALSO

igtr ilesi

iles


Signed compare less with immediate

SYNTAX

[ IF rguard ] ilesi(n) rsrc1  rdest

FUNCTION

if rguard then {

if rsrc1 < n then

rdest  1

else

rdest  0

}

ATTRIBUTES

Function unit alu

Operation code 2

Number of operands 1

Modifier 7 bits

Modifier range –64..63

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The ilesi operation sets the destination register, rdest, to 1 if the first argument, rsrc1, is less than the opcode

modifier, n; otherwise, rdest is set to 0. The arguments are treated as signed integers.

The ilesi operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values Operation Result

r30 = 3 ilesi(2) r30  r80 r80  0

r30 = 3 ilesi(3) r30  r90 r90  0

r30 = 3 ilesi(4) r30  r100 r100  1

r10 = 0, r40 = 0x100 IF r10 ilesi(63) r40  r50 no change, since guard is false

r20 = 1, r40 = 0x100 IF r20 ilesi(63) r40  r100 r100  0

r60 = 0x80000000 ilesi(-64) r60  r120 r120  1

SEE ALSO

iles ileqi

ilesi


Signed maximum

SYNTAX

[ IF rguard ] imax rsrc1 rsrc2  rdest

FUNCTION

if rguard then {

if rsrc1 > rsrc2 then

rdest  rsrc1

else

rdest  rsrc2

}

ATTRIBUTES

Function unit dspalu

Operation code 24

Number of operands 2

Modifier No

Modifier range —

Latency 2

Issue slots 1, 3

DESCRIPTION

The imax operation sets the destination register, rdest, to the contents of rsrc1 if rsrc1>rsrc2; otherwise, rdest is set

to the contents of rsrc2. The arguments are treated as signed integers.

The imax operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values Operation Result

r30 = 2, r20 = 1 imax r30 r20  r80 r80  2

r10 = 0, r60 = 0x100, r30 = 2 IF r10 imax r60 r30  r50 no change, since guard is false

r20 = 1, r60 = 0x100, r40 = 0xffffff9c IF r20 imax r60 r40  r90 r90  0x100

r70 = 0xffffff00, r40 = 0xffffff9c imax r70 r40  r100 r100  0xffffff9c

SEE ALSO

imin

imax


Signed minimum

SYNTAX

[ IF rguard ] imin rsrc1 rsrc2  rdest

FUNCTION

if rguard then {

if rsrc1 > rsrc2 then

rdest  rsrc2

else

rdest  rsrc1

}

ATTRIBUTES

Function unit dspalu

Operation code 23

Number of operands 2

Modifier No

Modifier range —

Latency 2

Issue slots 1, 3

DESCRIPTION

The imin operation sets the destination register, rdest, to the contents of rsrc2 if rsrc1>rsrc2; otherwise, rdest is set

to the contents of rsrc1. The arguments are treated as signed integers.

The imin operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values Operation Result

r30 = 2, r20 = 1 imin r30 r20  r80 r80  1

r10 = 0, r60 = 0x100, r30 = 2 IF r10 imin r60 r30  r50 no change, since guard is false

r20 = 1, r60 = 0x100, r40 = 0xffffff9c IF r20 imin r60 r40  r90 r90  0xffffff9c

r70 = 0xffffff00, r40 = 0xffffff9c imin r70 r40  r100 r100  0xffffff00

SEE ALSO

imax

imin


Signed multiply

SYNTAX

[ IF rguard ] imul rsrc1 rsrc2 → rdest

FUNCTION

if rguard then {

temp ← (sign_ext32to64(rsrc1) × sign_ext32to64(rsrc2))

rdest ← temp<31:0>

}

ATTRIBUTES

Function unit ifmul

Operation code 27

Number of operands 2

Modifier No

Modifier range —

Latency 3

Issue slots 2, 3

DESCRIPTION

As shown below, the imul operation computes the product rsrc1 × rsrc2 and writes the least-significant 32 bits of the

full 64-bit product into rdest. The operands are considered signed integers. No overflow or underflow detection is

performed.

The imul operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.
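A minimal C sketch of the multiply described above, assuming registers are modelled as `uint32_t` values and that converting an out-of-range `uint32_t` to `int32_t` yields the two's-complement value (true of common compilers, but implementation-defined in C). The function name `imul` is a hypothetical helper, not a vendor API.

```c
#include <stdint.h>

/* Sign-extend both 32-bit operands to 64 bits, multiply,
 * and keep the least-significant 32 bits of the product. */
static uint32_t imul(uint32_t rsrc1, uint32_t rsrc2)
{
    int64_t temp = (int64_t)(int32_t)rsrc1 * (int64_t)(int32_t)rsrc2;
    return (uint32_t)((uint64_t)temp & 0xffffffffu);
}
```

The product of two 32-bit signed values always fits in 64 bits, so the intermediate cannot overflow; only the truncation to 32 bits discards information.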

EXAMPLES

Initial Values Operation Result

r60 = 0x100 imul r60 r60  r80 r80  0x10000

r10 = 0, r60 = 0x100, r30 = 0xf11 IF r10 imul r60 r30  r50 no change, since guard is false

r20 = 1, r60 = 0x100, r30 = 0xf11 IF r20 imul r60 r30  r90 r90  0xf1100

r70 = 0xffffff00, r40 = 0xffffff9c imul r70 r40  r100 r100  0x6400

[Figure: rsrc1 (signed, bits 31..0) × rsrc2 (signed, bits 31..0) gives a signed 64-bit result (bits 63..0); rdest receives bits 31..0.]

SEE ALSO

umul imulm umulm dspimul

dspumul dspidualmul

quadumulmsb fmul

imul


Signed multiply, return most-significant 32 bits

SYNTAX

[ IF rguard ] imulm rsrc1 rsrc2 → rdest

FUNCTION

if rguard then {

temp ← (sign_ext32to64(rsrc1) × sign_ext32to64(rsrc2))

rdest ← temp<63:32>

}

ATTRIBUTES

Function unit ifmul

Operation code 139

Number of operands 2

Modifier No

Modifier range —

Latency 3

Issue slots 2, 3

DESCRIPTION

As shown below, the imulm operation computes the product rsrc1 × rsrc2 and writes the most-significant 32 bits of

the full 64-bit product into rdest. The operands are considered signed integers.

The imulm operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.
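The high-half multiply can be sketched in C in the same way. This is an illustrative model under the assumption that registers are `uint32_t` values and that `uint32_t` to `int32_t` conversion is two's-complement; `imulm` here is a hypothetical helper name.

```c
#include <stdint.h>

/* Sign-extend both operands to 64 bits, multiply,
 * and return bits 63..32 of the product. */
static uint32_t imulm(uint32_t rsrc1, uint32_t rsrc2)
{
    int64_t temp = (int64_t)(int32_t)rsrc1 * (int64_t)(int32_t)rsrc2;
    return (uint32_t)((uint64_t)temp >> 32);
}
```

For a small negative product such as 0xffffff00 × 0x64 (that is, -256 × 100), the upper word is all ones, which is the sign extension of the 64-bit result.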

EXAMPLES

Initial Values Operation Result

r60 = 0x10000 imulm r60 r60  r80 r80  0x00000001

r10 = 0, r60 = 0x100, r30 = 0xf11 IF r10 imulm r60 r30  r50 no change, since guard is false

r20 = 1, r60 = 0x10001000, r30 = 0xf1100000 IF r20 imulm r60 r30  r90 r90  0xff10ff11

r70 = 0xffffff00, r40 = 0x64 imulm r70 r40  r100 r100  0xffffffff

[Figure: rsrc1 (signed, bits 31..0) × rsrc2 (signed, bits 31..0) gives a signed 64-bit result (bits 63..0); rdest receives bits 63..32.]

SEE ALSO

umulm dspimul dspumul

dspidualmul quadumulmsb

fmul

imulm


Signed negate

pseudo-op for isub

SYNTAX

[ IF rguard ] ineg rsrc1  rdest

FUNCTION

if rguard then

rdest  –rsrc1

ATTRIBUTES

Function unit alu

Operation code 13

Number of operands 1

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The ineg operation is a pseudo operation transformed by the scheduler into an isub with r0 (always contains 0)

as the first argument and rsrc1 as the second argument. (Note: pseudo operations cannot be used in assembly

source files.)

The ineg operation computes the negative of rsrc1 and writes the result into rdest. The argument is a signed

integer; the result is an unsigned integer. If rsrc1 = 0x80000000, then ineg returns 0x80000000 since the positive

value is not representable.

The ineg operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.
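The transformation into a subtraction from zero can be sketched in C. This is a model under the assumption that registers are `uint32_t` values with modulo-2^32 arithmetic; `ineg` is a hypothetical helper name.

```c
#include <stdint.h>

/* ineg modelled as isub with a zero first operand: 0 - rsrc1, modulo 2^32. */
static uint32_t ineg(uint32_t rsrc1)
{
    return (uint32_t)(0u - rsrc1);
}
```

Unsigned wraparound reproduces the corner case in the description: negating 0x80000000 yields 0x80000000 again, because +2^31 is not representable in 32-bit two's complement.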

EXAMPLES

Initial Values Operation Result

r30 = 0xffffffff ineg r30  r60 r60  0x00000001

r10 = 0, r40 = 0xfffffff4 IF r10 ineg r40  r80 no change, since guard is false

r20 = 1, r40 = 0xfffffff4 IF r20 ineg r40  r90 r90  0xc

r50 = 0x80000001 ineg r50  r100 r100  0x7fffffff

r60 = 0x80000000 ineg r60  r110 r110  0x80000000

r20 = 1 ineg r20  r120 r120  0xffffffff

SEE ALSO

isub

ineg


Signed compare not equal

SYNTAX

[ IF rguard ] ineq rsrc1 rsrc2  rdest

FUNCTION

if rguard then {

if rsrc1 != rsrc2 then

rdest  1

else

rdest  0

}

ATTRIBUTES

Function unit alu

Operation code 39

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The ineq operation sets the destination register, rdest, to 1 if the two arguments, rsrc1 and rsrc2, are not equal;

otherwise, rdest is set to 0.

The ineq operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values Operation Result

r30 = 3, r40 = 4 ineq r30 r40  r80 r80  1

r10 = 0, r60 = 0x1000, r30 = 3 IF r10 ineq r60 r30  r50 no change, since guard is false

r20 = 1, r50 = 0x1000, r60 = 0x1000 IF r20 ineq r50 r60  r90 r90  0

r70 = 0x80000000, r40 = 4 ineq r70 r40  r100 r100  1

r70 = 0x80000000 ineq r70 r70  r110 r110  0

SEE ALSO

ieql igtr ineqi

ineq


Signed compare not equal with immediate

SYNTAX

[ IF rguard ] ineqi(n) rsrc1  rdest

FUNCTION

if rguard then {

if rsrc1 != n then

rdest  1

else

rdest  0

}

ATTRIBUTES

Function unit alu

Operation code 3

Number of operands 1

Modifier 7 bits

Modifier range –64..63

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The ineqi operation sets the destination register, rdest, to 1 if the first argument, rsrc1, is not equal to the opcode

modifier, n; otherwise, rdest is set to 0. The arguments are treated as signed integers.

The ineqi operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values Operation Result

r30 = 3 ineqi(2) r30  r80 r80  1

r30 = 3 ineqi(3) r30  r90 r90  0

r30 = 3 ineqi(4) r30  r100 r100  1

r10 = 0, r40 = 0x100 IF r10 ineqi(63) r40  r50 no change, since guard is false

r20 = 1, r40 = 0x100 IF r20 ineqi(63) r40  r100 r100  1

r60 = 0xffffffc0 ineqi(-64) r60  r120 r120  0

SEE ALSO

ineq igeqi ieqli

ineqi


If nonzero select zero

SYNTAX

[ IF rguard ] inonzero rsrc1 rsrc2  rdest

FUNCTION

if rguard then {

if rsrc1 != 0 then

rdest  0

else

rdest  rsrc2

}

ATTRIBUTES

Function unit alu

Operation code 47

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The inonzero operation writes 0 into rdest if the value of rsrc1 is not zero; otherwise, rsrc2 is copied to rdest. The

operands are considered signed integers.

The inonzero operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.
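The select behaviour can be sketched in C alongside its complement izero (listed under SEE ALSO). Both functions are hypothetical helpers that model registers as `uint32_t` values; they are a sketch of the FUNCTION pseudocode, not vendor code.

```c
#include <stdint.h>

/* inonzero: result is 0 when rsrc1 is nonzero, otherwise rsrc2. */
static uint32_t inonzero(uint32_t rsrc1, uint32_t rsrc2)
{
    return (rsrc1 != 0u) ? 0u : rsrc2;
}

/* izero: result is 0 when rsrc1 is zero, otherwise rsrc2. */
static uint32_t izero(uint32_t rsrc1, uint32_t rsrc2)
{
    return (rsrc1 == 0u) ? 0u : rsrc2;
}
```

For any pair of inputs, exactly one of the two functions passes rsrc2 through and the other returns 0.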

EXAMPLES

Initial Values Operation Result

r30 = 2, r20 = 1 inonzero r30 r20  r80 r80  0

r10 = 0, r60 = 0x100, r30 = 2 IF r10 inonzero r60 r30  r50 no change, since guard is false

r20 = 1, r60 = 0x100, r40 = 0xffffff9c IF r20 inonzero r60 r40  r90 r90  0

r10 = 0, r40 = 0xffffff9c inonzero r10 r40  r100 r100  0xffffff9c

r20 = 1, r60 = 0x100 inonzero r20 r60  r110 r110  0

r10 = 0, r70 = 0x456789 inonzero r10 r70  r120 r120  0x456789

SEE ALSO

izero iflip

inonzero


Subtract

SYNTAX

[ IF rguard ] isub rsrc1 rsrc2  rdest

FUNCTION

if rguard then

rdest  rsrc1 – rsrc2

ATTRIBUTES

Function unit alu

Operation code 13

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The isub operation computes the difference rsrc1–rsrc2 and writes the result into rdest. The operands can be

either both signed or unsigned integers. No overflow or underflow detection is performed.

The isub operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values Operation Result

r30 = 3, r40 = 4 isub r30 r40  r80 r80  0xffffffff

r10 = 0, r60 = 0x100, r30 = 3 IF r10 isub r60 r30  r50 no change, since guard is false

r20 = 1, r50 = 0x1000, r60 = 0x100 IF r20 isub r50 r60  r90 r90  0xf00

r70 = 0x80000000, r40 = 4 isub r70 r40  r100 r100  0x7ffffffc

SEE ALSO

isubi borrow dspisub

dspidualsub fsub

isub


Subtract with immediate

SYNTAX

[ IF rguard ] isubi(n) rsrc1  rdest

FUNCTION

if rguard then

rdest  rsrc1 – n

ATTRIBUTES

Function unit alu

Operation code 32

Number of operands 1

Modifier 7 bits

Modifier range 0..127

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The isubi operation computes the difference of a single argument in rsrc1 and an immediate modifier n and stores

the result in rdest. The value of n must be between 0 and 127, inclusive.

The isubi operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is unchanged.

EXAMPLES

Initial Values Operation Result

r30 = 0xf11 isubi(127) r30  r70 r70  0xe92

r10 = 0, r40 = 0xffffff9c IF r10 isubi(1) r40  r80 no change, since guard is false

r20 = 1, r40 = 0xffffff9c IF r20 isubi(1) r40  r90 r90  0xffffff9b

r50 = 0x1000 isubi(15) r50  r120 r120  0x0ff1

r60 = 0xfffffff0 isubi(2) r60  r110 r110  0xffffffee

r20 = 1 isubi(17) r20  r120 r120  0xfffffff0

SEE ALSO

isub borrow

isubi


If zero select zero

SYNTAX

[ IF rguard ] izero rsrc1 rsrc2  rdest

FUNCTION

if rguard then {

if rsrc1 = 0 then

rdest  0

else

rdest  rsrc2

}

ATTRIBUTES

Function unit alu

Operation code 46

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The izero operation writes 0 into rdest if the value of rsrc1 is equal to zero; otherwise, rsrc2 is copied to rdest. The

operands are considered signed integers.

The izero operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values Operation Result

r30 = 2, r20 = 1 izero r30 r20  r80 r80  1

r10 = 0, r60 = 0x100, r30 = 2 IF r10 izero r60 r30  r50 no change, since guard is false

r20 = 1, r60 = 0x100, r40 = 0xffffff9c IF r20 izero r60 r40  r90 r90  0xffffff9c

r10 = 0, r40 = 0xffffff9c izero r10 r40  r100 r100  0

r20 = 1, r60 = 0x100 izero r20 r60  r110 r110  0x100

r20 = 1, r70 = 0x456789 izero r20 r70  r120 r120  0x456789

SEE ALSO

inonzero iflip

izero


Indirect jump on false

SYNTAX

[ IF rguard ] jmpf rsrc1 rsrc2

FUNCTION

if rguard then {

if (rsrc1 & 1) = 0 then

PC  rsrc2

}

ATTRIBUTES

Function unit branch

Operation code 180

Number of operands 2

Modifier No

Modifier range —

Delay 3

Issue slots 2, 3, 4

DESCRIPTION

The jmpf operation conditionally changes the program flow. If the LSB of rsrc1 is 0, the PC register is set equal to

rsrc2; otherwise, program execution continues with the next sequential instruction.

The jmpf operation optionally takes a guard, specified in rguard. If a guard is present, its LSB adds another

condition to the jump. If the LSB of rguard is 1, the instruction executes as previously described; otherwise, the jump

will not be taken regardless of the value of rsrc1.
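The two conditions on the jump (guard LSB and rsrc1 LSB) can be sketched in C as a next-PC selection. The function `jmpf_next_pc` and its `fallthrough_pc` parameter are hypothetical, introduced only for illustration; the sketch does not model the 3-cycle branch delay listed under ATTRIBUTES, and an unguarded jmpf corresponds to passing a guard value of 1.

```c
#include <stdint.h>

/* Returns the PC that execution continues at after a jmpf.
 * The jump is taken only if the guard LSB is 1 and the rsrc1 LSB is 0. */
static uint32_t jmpf_next_pc(uint32_t rguard, uint32_t rsrc1,
                             uint32_t rsrc2, uint32_t fallthrough_pc)
{
    if ((rguard & 1u) != 0u && (rsrc1 & 1u) == 0u) {
        return rsrc2;          /* jump taken: PC <- rsrc2 */
    }
    return fallthrough_pc;     /* continue with the next sequential instruction */
}
```

Only the least-significant bit of each condition register matters, so any even value of rsrc1 counts as false.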

EXAMPLES

Initial Values Operation Result

r50 = 0, r70 = 0x330 jmpf r50 r70 program execution continues at 0x330

r20 = 1, r70 = 0x330 jmpf r20 r70 since r20 is true, program execution continues with next sequential instruction

r30 = 0, r50 = 0, r60 = 0x8000 IF r30 jmpf r50 r60 since guard is false, program execution continues with next sequential instruction

r40 = 1, r50 = 0, r60 = 0x8000 IF r40 jmpf r50 r60 program execution continues at 0x8000

SEE ALSO

jmpt jmpi ijmpf ijmpt

ijmpi

jmpf


Jump immediate

SYNTAX

[ IF rguard ] jmpi(address)

FUNCTION

if rguard then

PC  address

ATTRIBUTES

Function unit branch

Operation code 178

Number of operands 0

Modifier 32 bits

Modifier range 0..0xffffffff

Delay 3

Issue slots 2, 3, 4

DESCRIPTION

The jmpi operation changes the program flow by setting the PC register equal to the immediate opcode modifier

address.

The jmpi operation optionally takes a guard, specified in rguard. If a guard is present, its LSB adds a condition to

the jump. If the LSB of rguard is 1, the instruction executes as previously described; otherwise, the jump will not be

taken.

EXAMPLES

Initial Values Operation Result

jmpi(0x330) program execution continues at 0x330

r30 = 0 IF r30 jmpi(0x8000) since guard is false, program execution continues with next sequential instruction

r40 = 1 IF r40 jmpi(0x8000) program execution continues at 0x8000

SEE ALSO

jmpf jmpt ijmpf ijmpt

ijmpi

jmpi


Indirect jump on true

SYNTAX

[ IF rguard ] jmpt rsrc1 rsrc2

FUNCTION

if rguard then {

if (rsrc1 & 1) = 1 then

PC  rsrc2

}

ATTRIBUTES

Function unit branch

Operation code 176

Number of operands 2

Modifier No

Modifier range —

Delay 3

Issue slots 2, 3, 4

DESCRIPTION

The jmpt operation conditionally changes the program flow. If the LSB of rsrc1 is 1, the PC register is set equal to

rsrc2; otherwise, program execution continues with the next sequential instruction.

The jmpt operation optionally takes a guard, specified in rguard. If a guard is present, its LSB adds another

condition to the jump. If the LSB of rguard is 1, the instruction executes as previously described; otherwise, the jump

will not be taken regardless of the value of rsrc1.

EXAMPLES

Initial Values Operation Result

r50 = 1, r70 = 0x330 jmpt r50 r70 program execution continues at 0x330

r20 = 0, r70 = 0x330 jmpt r20 r70 since r20 is false, program execution continues with next sequential instruction

r30 = 0, r50 = 1, r60 = 0x8000 IF r30 jmpt r50 r60 since guard is false, program execution continues with next sequential instruction

r40 = 1, r50 = 1, r60 = 0x8000 IF r40 jmpt r50 r60 program execution continues at 0x8000

SEE ALSO

jmpf jmpi ijmpf ijmpt

ijmpi

jmpt


32-bit load

pseudo-op for ld32d(0)

SYNTAX

[ IF rguard ] ld32 rsrc1 → rdest

FUNCTION

if rguard then {

if PCSW.bytesex = LITTLE_ENDIAN then

bs ← 3

else

bs ← 0

rdest<7:0> ← mem[rsrc1 + (3 ⊕ bs)]

rdest<15:8> ← mem[rsrc1 + (2 ⊕ bs)]

rdest<23:16> ← mem[rsrc1 + (1 ⊕ bs)]

rdest<31:24> ← mem[rsrc1 + (0 ⊕ bs)]

}

ATTRIBUTES

Function unit dmem

Operation code 7

Number of operands 1

Modifier No

Modifier range —

Latency 3

Issue slots 4, 5

DESCRIPTION

The ld32 operation is a pseudo operation transformed by the scheduler into an ld32d(0) with the same

argument. (Note: pseudo operations cannot be used in assembly source files.)

The ld32 operation loads the 32-bit memory value from the address contained in rsrc1 and stores the result in

rdest. If the memory address contained in rsrc1 is not a multiple of 4, the result of ld32 is undefined but no exception

will be raised. This load operation is performed as little-endian or big-endian depending on the current setting of the

bytesex bit in the PCSW.

The ld32 operation can be used to access the MMIO address aperture (the result of MMIO access by 8- or 16-bit

memory operations is undefined). The state of the BSX bit in the PCSW has no effect on MMIO access by ld32.

The ld32 operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register and the occurrence of side effects. If the LSB of rguard is 1, rdest is written and

the data cache status bits are updated if the addressed locations are cacheable. If the LSB of rguard is 0, rdest is not

changed and ld32 has no side effects whatever.
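The byte-lane selection in FUNCTION can be sketched in C. This sketch assumes the byte index is combined with bs by exclusive-or (an interpretation that agrees with the big-endian results in the examples below), models memory as a plain byte array, and ignores caching, MMIO, and misaligned addresses. The names `ld32`, `BYTESEX_BIG`, and `BYTESEX_LITTLE` are hypothetical.

```c
#include <stdint.h>

enum bytesex { BYTESEX_BIG = 0, BYTESEX_LITTLE = 1 };

/* Assemble a 32-bit value from four bytes at addr (assumed word-aligned).
 * bs = 3 flips the byte order for little-endian; bs = 0 leaves it big-endian. */
static uint32_t ld32(const uint8_t *mem, uint32_t addr, enum bytesex sex)
{
    uint32_t bs = (sex == BYTESEX_LITTLE) ? 3u : 0u;

    return  (uint32_t)mem[addr + (3u ^ bs)]            /* rdest<7:0>   */
         | ((uint32_t)mem[addr + (2u ^ bs)] << 8)      /* rdest<15:8>  */
         | ((uint32_t)mem[addr + (1u ^ bs)] << 16)     /* rdest<23:16> */
         | ((uint32_t)mem[addr + (0u ^ bs)] << 24);    /* rdest<31:24> */
}
```

With the bytes 0x84, 0x33, 0x22, 0x11 at ascending addresses, the big-endian setting yields 0x84332211, while the little-endian setting would yield 0x11223384.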

EXAMPLES

Initial Values Operation Result

r10 = 0xd00,

[0xd00] = 0x84, [0xd01] = 0x33,

[0xd02] = 0x22, [0xd03] = 0x11

ld32 r10  r60 r60  0x84332211

r30 = 0, r20 = 0xd04,

[0xd04] = 0x48, [0xd05] = 0x66,

[0xd06] = 0x55, [0xd07] = 0x44

IF r30 ld32 r20  r70 no change, since guard is false

r40 = 1, r20 = 0xd04,

[0xd04] = 0x48, [0xd05] = 0x66,

[0xd06] = 0x55, [0xd07] = 0x44

IF r40 ld32 r20  r80 r80  0x48665544

r50 = 0xd01 ld32 r50  r90 r90 undefined, since 0xd01 is not a multiple of 4

SEE ALSO

ld32d ld32r ld32x st32

st32d h_st32d

ld32


32-bit load with displacement

SYNTAX

[ IF rguard ] ld32d(d) rsrc1 → rdest

FUNCTION

if rguard then {

if PCSW.bytesex = LITTLE_ENDIAN then

bs ← 3

else

bs ← 0

rdest<7:0> ← mem[rsrc1 + d + (3 ⊕ bs)]

rdest<15:8> ← mem[rsrc1 + d + (2 ⊕ bs)]

rdest<23:16> ← mem[rsrc1 + d + (1 ⊕ bs)]

rdest<31:24> ← mem[rsrc1 + d + (0 ⊕ bs)]

}

ATTRIBUTES

Function unit dmem

Operation code 7

Number of operands 1

Modifier 7 bits

Modifier range –256..252 by 4

Latency 3

Issue slots 4, 5

DESCRIPTION

The ld32d operation loads the 32-bit memory value from the address computed by rsrc1 + d and stores the result

in rdest. The d value is an opcode modifier, must be in the range –256 to 252 inclusive, and must be a multiple of 4. If

the memory address computed by rsrc1 + d is not a multiple of 4, the result of ld32d is undefined but no exception

will be raised. This load operation is performed as little-endian or big-endian depending on the current setting of the

bytesex bit in the PCSW.

The ld32d operation can be used to access the MMIO address aperture (the result of MMIO access by 8- or 16-bit

memory operations is undefined). The state of the BSX bit in the PCSW has no effect on MMIO access by ld32d.

The ld32d operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register and the occurrence of side effects. If the LSB of rguard is 1, rdest is written and

the data cache status bits are updated if the addressed locations are cacheable. If the LSB of rguard is 0, rdest is not

changed and ld32d has no side effects whatever.

EXAMPLES

Initial Values Operation Result

r10 = 0xcfc,

[0xd00] = 0x84, [0xd01] = 0x33,

[0xd02] = 0x22, [0xd03] = 0x11

ld32d(4) r10  r60 r60  0x84332211

r30 = 0, r20 = 0xd0c,

[0xd04] = 0x48, [0xd05] = 0x66,

[0xd06] = 0x55, [0xd07] = 0x44

IF r30 ld32d(-8) r20  r70 no change, since guard is false

r40 = 1, r20 = 0xd0c,

[0xd04] = 0x48, [0xd05] = 0x66,

[0xd06] = 0x55, [0xd07] = 0x44

IF r40 ld32d(-8) r20  r80 r80  0x48665544

r50 = 0xd01 ld32d(-8) r50  r90 r90 undefined, since 0xd01 +(–8) is not a multiple of 4

SEE ALSO

ld32 ld32r ld32x st32

st32d h_st32d

ld32d


32-bit load with index

SYNTAX

[ IF rguard ] ld32r rsrc1 rsrc2 → rdest

FUNCTION

if rguard then {

if PCSW.bytesex = LITTLE_ENDIAN then

bs ← 3

else

bs ← 0

rdest<7:0> ← mem[rsrc1 + rsrc2 + (3 ⊕ bs)]

rdest<15:8> ← mem[rsrc1 + rsrc2 + (2 ⊕ bs)]

rdest<23:16> ← mem[rsrc1 + rsrc2 + (1 ⊕ bs)]

rdest<31:24> ← mem[rsrc1 + rsrc2 + (0 ⊕ bs)]

}

ATTRIBUTES

Function unit dmem

Operation code 200

Number of operands 2

Modifier No

Modifier range —

Latency 3

Issue slots 4, 5

DESCRIPTION

The ld32r operation loads the 32-bit memory value from the address computed by rsrc1 + rsrc2 and stores the

result in rdest. If the memory address computed by rsrc1 + rsrc2 is not a multiple of 4, the result of ld32r is

undefined but no exception will be raised. This load operation is performed as little-endian or big-endian depending on

the current setting of the bytesex bit in the PCSW.

The ld32r operation can be used to access the MMIO address aperture (the result of MMIO access by 8- or 16-bit

memory operations is undefined). The state of the BSX bit in the PCSW has no effect on MMIO access by ld32r.

The ld32r operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register and the occurrence of side effects. If the LSB of rguard is 1, rdest is written and

the data cache status bits are updated if the addressed locations are cacheable. If the LSB of rguard is 0, rdest is not

changed and ld32r has no side effects whatever.

EXAMPLES

Initial Values Operation Result

r10 = 0xcfc, r20 = 0x4,

[0xd00] = 0x84, [0xd01] = 0x33,

[0xd02] = 0x22, [0xd03] = 0x11

ld32r r10 r20  r80 r80  0x84332211

r50 = 0, r40 = 0xd0c, r30 = 0xfffffff8,

[0xd04] = 0x48, [0xd05] = 0x66,

[0xd06] = 0x55, [0xd07] = 0x44

IF r50 ld32r r40 r30  r90 no change, since guard is false

r60 = 1, r40 = 0xd0c, r30 = 0xfffffff8,

[0xd04] = 0x48, [0xd05] = 0x66,

[0xd06] = 0x55, [0xd07] = 0x44

IF r60 ld32r r40 r30  r100 r100  0x48665544

r70 = 0xd01, r30 = 0xfffffff8 ld32r r70 r30  r110 r110 undefined, since 0xd01 +(–8) is not a multiple of 4

SEE ALSO

ld32 ld32d ld32x st32

st32d h_st32d

ld32r


32-bit load with scaled index

SYNTAX

[ IF rguard ] ld32x rsrc1 rsrc2 → rdest

FUNCTION

if rguard then {

if PCSW.bytesex = LITTLE_ENDIAN then

bs ← 3

else

bs ← 0

rdest<7:0> ← mem[rsrc1 + (4 × rsrc2) + (3 ⊕ bs)]

rdest<15:8> ← mem[rsrc1 + (4 × rsrc2) + (2 ⊕ bs)]

rdest<23:16> ← mem[rsrc1 + (4 × rsrc2) + (1 ⊕ bs)]

rdest<31:24> ← mem[rsrc1 + (4 × rsrc2) + (0 ⊕ bs)]

}

ATTRIBUTES

Function unit dmem

Operation code 201

Number of operands 2

Modifier No

Modifier range —

Latency 3

Issue slots 4, 5

DESCRIPTION

The ld32x operation loads the 32-bit memory value from the address computed by rsrc1 + 4 × rsrc2 and stores the

result in rdest. If the memory address computed by rsrc1 + 4 × rsrc2 is not a multiple of 4, the result of ld32x is

undefined but no exception will be raised. This load operation is performed as little-endian or big-endian depending on

the current setting of the bytesex bit in the PCSW.

The ld32x operation can be used to access the MMIO address aperture (the result of MMIO access by 8- or 16-bit

memory operations is undefined). The state of the BSX bit in the PCSW has no effect on MMIO access by ld32x.

The ld32x operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register and the occurrence of side effects. If the LSB of rguard is 1, rdest is written and

the data cache status bits are updated if the addressed locations are cacheable. If the LSB of rguard is 0, rdest is not

changed and ld32x has no side effects whatever.
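The three 32-bit load addressing forms (displacement, index, and scaled index) differ only in how the effective address is formed. The C sketch below compares them, assuming address arithmetic wraps modulo 2^32 (which the examples with 0xfffffff8 and 0xfffffffe imply). All function names are hypothetical helpers for illustration.

```c
#include <stdint.h>

/* ld32d: base plus signed displacement d (a multiple of 4 in -256..252). */
static uint32_t ea_ld32d(uint32_t rsrc1, int32_t d)
{
    return (uint32_t)(rsrc1 + (uint32_t)d);
}

/* ld32r: base plus index. */
static uint32_t ea_ld32r(uint32_t rsrc1, uint32_t rsrc2)
{
    return (uint32_t)(rsrc1 + rsrc2);
}

/* ld32x: base plus index scaled by 4. */
static uint32_t ea_ld32x(uint32_t rsrc1, uint32_t rsrc2)
{
    return (uint32_t)(rsrc1 + 4u * rsrc2);
}

/* A 32-bit load result is defined only for addresses that are multiples of 4. */
static int is_word_aligned(uint32_t addr)
{
    return (addr & 3u) == 0u;
}
```

Because the scaled index only ever adds a multiple of 4, ld32x reaches an aligned address exactly when rsrc1 itself is aligned.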

EXAMPLES

Initial Values Operation Result

r10 = 0xcfc, r30 = 0x1,

[0xd00] = 0x84, [0xd01] = 0x33,

[0xd02] = 0x22, [0xd03] = 0x11

ld32x r10 r30  r100 r100  0x84332211

r50 = 0, r40 = 0xd0c, r20 = 0xfffffffe,

[0xd04] = 0x48, [0xd05] = 0x66,

[0xd06] = 0x55, [0xd07] = 0x44

IF r50 ld32x r40 r20  r80 no change, since guard is false

r60 = 1, r40 = 0xd0c, r20 = 0xfffffffe,

[0xd04] = 0x48, [0xd05] = 0x66,

[0xd06] = 0x55, [0xd07] = 0x44

IF r60 ld32x r40 r20  r90 r90  0x48665544

r70 = 0xd01, r30 = 0x1 ld32x r70 r30  r110 r110 undefined, since 0xd01 + 4 × 1 is not a multiple of 4

SEE ALSO

ld32 ld32d ld32r st32

st32d h_st32d

ld32x


Logical shift left

pseudo-op for asl

SYNTAX

[ IF rguard ] lsl rsrc1 rsrc2 → rdest

FUNCTION

if rguard then {

n ← rsrc2<4:0>

rdest<31:n> ← rsrc1<31–n:0>

rdest<n–1:0> ← 0

if rsrc2<31:5> != 0 {

rdest ← 0

}

}

ATTRIBUTES

Function unit shifter

Operation code 19

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2

DESCRIPTION

The lsl operation is a pseudo operation that is transformed by the scheduler into an asl with the same

arguments. (Note: pseudo operations cannot be used in assembly source files.)

As shown below, the lsl operation takes two arguments, rsrc1 and rsrc2. Rsrc2 specifies an unsigned shift amount,

and rdest is set to rsrc1 logically shifted left by this amount. If the rsrc2<31:5> value is not zero, then take this as a

shift by 32 or more bits. Zeros are shifted into the LSBs of rdest while the MSBs shifted out of rsrc1 are lost.

The lsl operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is unchanged.
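The handling of shift amounts of 32 or more can be sketched in C, where shifting a 32-bit value by 32 or more is undefined and so needs an explicit check. This is an illustrative model with registers as `uint32_t` values; `lsl` is a hypothetical helper name.

```c
#include <stdint.h>

/* Logical shift left; any nonzero bit in rsrc2<31:5> means a shift
 * by 32 or more, which clears the result. */
static uint32_t lsl(uint32_t rsrc1, uint32_t rsrc2)
{
    if ((rsrc2 >> 5) != 0u) {
        return 0u;
    }
    return (uint32_t)(rsrc1 << (rsrc2 & 31u));
}
```

Note that the result is forced to zero rather than shifting by rsrc2 modulo 32, so a shift amount of 0x23 gives 0, not a shift by 3.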

EXAMPLES

Initial Values Operation Result

r60 = 0x20, r30 = 3 lsl r60 r30  r90 r90  0x100

r10 = 0, r60 = 0x20, r30 = 3 IF r10 lsl r60 r30  r100 no change, since guard is false

r20 = 1, r60 = 0x20, r30 = 3 IF r20 lsl r60 r30  r110 r110  0x100

r70 = 0xfffffffc, r40 = 2 lsl r70 r40  r120 r120  0xfffffff0

r80 = 0xe, r50 = 0xfffffffe lsl r80 r50  r125 r125  0x00000000 (shift by more than 32)

r30 = 0x7008000f, r45 = 0x20 lsl r30 r45  r100 r100  0x00000000

r30 = 0x8008000f, r45 = 0x80000000 lsl r30 r45  r100 r100  0x00000000

r30 = 0x8008000f, r45 = 0x23 lsl r30 r45  r100 r100  0x00000000

[Figure: left shifter. The 32 bits from rsrc1 are shifted left by the amount in rsrc2; zeros fill the vacated LSBs of rdest (example: n = 3).]

SEE ALSO

asl asli asr asri lsli lsr

lsri rol roli

lsl


Logical shift left immediate

pseudo-op for asli

SYNTAX

[ IF rguard ] lsli(n) rsrc1  rdest

FUNCTION

if rguard then {

rdest<31:n>  rsrc1<31–n:0>

rdest<n–1:0>  0

}

ATTRIBUTES

Function unit shifter

Operation code 11

Number of operands 1

Modifier 7 bits

Modifier range 0..31

Latency 1

Issue slots 1, 2

DESCRIPTION

The lsli operation is a pseudo operation that is transformed by the scheduler into an asli with the same

argument and opcode modifier. (Note: pseudo operations cannot be used in assembly source files.)

As shown below, the lsli operation takes a single argument in rsrc1 and an immediate modifier n and produces a

result in rdest equal to rsrc1 logically shifted left by n bits. The value of n must be between 0 and 31, inclusive. Zeros

are shifted into the LSBs of rdest while the MSBs shifted out of rsrc1 are lost.

The lsli operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is unchanged.

EXAMPLES

Initial Values Operation Result

r60 = 0x20 lsli(3) r60  r90 r90  0x100

r10 = 0, r60 = 0x20 IF r10 lsli(3) r60  r100 no change, since guard is false

r20 = 1, r60 = 0x20 IF r20 lsli(3) r60  r110 r110  0x100

r70 = 0xfffffffc lsli(2) r70  r120 r120  0xfffffff0

r80 = 0xe lsli(30) r80  r125 r125  0x80000000

[Figure: left shifter. The 32 bits from rsrc1 are shifted left by the shift amount n from the operation modifier; zeros fill the vacated LSBs of rdest (example: n = 3).]

SEE ALSO

asl asli asr asri lsl lsr

lsri rol roli

lsli


Logical shift right

SYNTAX

[ IF rguard ] lsr rsrc1 rsrc2 → rdest

FUNCTION

if rguard then {

n ← rsrc2<4:0>

rdest<31:32–n> ← 0

rdest<31–n:0> ← rsrc1<31:n>

if rsrc2<31:5> != 0 {

rdest ← 0

}

}

ATTRIBUTES

Function unit shifter

Operation code 96

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2

DESCRIPTION

As shown below, the lsr operation takes two arguments, rsrc1 and rsrc2. Rsrc2 specifies an unsigned shift

amount, and rsrc1 is logically shifted right by this amount. If rsrc2<31:5> is nonzero, the shift is treated as a shift

by 32 or more bits, and rdest is set to zero. Zeros fill vacated bits from the left.

The lsr operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is unchanged.
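As a hedged illustration of the rule for large shift amounts, the following Python model mirrors the FUNCTION pseudocode. The function name `lsr` and the `rdest_old` parameter (the previous contents of rdest) are assumptions made for this sketch.

```python
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide

def lsr(rsrc1, rsrc2, rdest_old=0, guard=1):
    """Model of lsr: logical shift right by the unsigned amount in rsrc2."""
    if not guard & 1:                  # guard false: rdest is left unchanged
        return rdest_old
    if (rsrc2 & MASK32) >> 5:          # rsrc2<31:5> != 0: shift by 32 or more
        return 0
    n = rsrc2 & 0x1F                   # n <- rsrc2<4:0>
    return (rsrc1 & MASK32) >> n       # zeros fill the vacated MSBs
```

Note that a shift amount such as 0x23 does not wrap around to a shift by 3 in this model; any nonzero bit above bit 4 forces a zero result, as the description states.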

EXAMPLES

Initial Values | Operation | Result
r30 = 0x7008000f, r20 = 1 | lsr r30 r20 -> r50 | r50 <- 0x38040007
r30 = 0x7008000f, r42 = 2 | lsr r30 r42 -> r60 | r60 <- 0x1c020003
r10 = 0, r30 = 0x7008000f, r44 = 4 | IF r10 lsr r30 r44 -> r70 | no change, since guard is false
r20 = 1, r30 = 0x7008000f, r44 = 4 | IF r20 lsr r30 r44 -> r80 | r80 <- 0x07008000
r40 = 0x80030007, r44 = 4 | lsr r40 r44 -> r90 | r90 <- 0x08003000
r30 = 0x7008000f, r45 = 0x1f | lsr r30 r45 -> r100 | r100 <- 0x00000000
r30 = 0x8008000f, r45 = 0x1f | lsr r30 r45 -> r100 | r100 <- 0x00000001
r30 = 0x7008000f, r45 = 0x20 | lsr r30 r45 -> r100 | r100 <- 0x00000000
r30 = 0x8008000f, r45 = 0x80000000 | lsr r30 r45 -> r100 | r100 <- 0x00000000
r30 = 0x8008000f, r45 = 0x23 | lsr r30 r45 -> r100 | r100 <- 0x00000000

[Figure: right shifter. The 32 bits from rsrc1 are shifted right by the amount given in rsrc2; zeros fill the vacated MSBs of rdest (example: n = 3).]

SEE ALSO

asl asli asr asri lsl lsli

lsri rol roli


lsri Logical shift right immediate

SYNTAX

[ IF rguard ] lsri(n) rsrc1 -> rdest

FUNCTION

if rguard then {

rdest<31:32–n> <- 0

rdest<31–n:0> <- rsrc1<31:n>

}

ATTRIBUTES

Function unit shifter

Operation code 9

Number of operands 1

Modifier 7 bits

Modifier range 0..31

Latency 1

Issue slots 1, 2

DESCRIPTION

As shown below, the lsri operation takes a single argument in rsrc1 and an immediate modifier n and produces a

result in rdest that is equal to rsrc1 logically shifted right by n bits. The value of n must be between 0 and 31, inclusive.

Zeros fill vacated bits from the left.

The lsri operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is unchanged.

EXAMPLES

Initial Values | Operation | Result
r30 = 0x7008000f | lsri(1) r30 -> r50 | r50 <- 0x38040007
r30 = 0x7008000f | lsri(2) r30 -> r60 | r60 <- 0x1c020003
r10 = 0, r30 = 0x7008000f | IF r10 lsri(4) r30 -> r70 | no change, since guard is false
r20 = 1, r30 = 0x7008000f | IF r20 lsri(4) r30 -> r80 | r80 <- 0x07008000
r40 = 0x80030007 | lsri(4) r40 -> r90 | r90 <- 0x08003000
r30 = 0x7008000f | lsri(31) r30 -> r100 | r100 <- 0x00000000
r40 = 0x80030007 | lsri(31) r40 -> r110 | r110 <- 0x00000001

[Figure: right shifter. The 32 bits from rsrc1 are shifted right by the shift amount n taken from the operation modifier; zeros fill the vacated MSBs of rdest (example: n = 3).]

SEE ALSO

asl asli asr asri lsl lsli

lsr rol roli


mergedual16lsb Merge dual 16-bit lsb bytes

SYNTAX

[ IF rguard ] mergedual16lsb rsrc1 rsrc2 -> rdest

FUNCTION

if rguard then {

rdest<31:24> <- rsrc1<23:16>

rdest<23:16> <- rsrc1<7:0>

rdest<15:8> <- rsrc2<23:16>

rdest<7:0> <- rsrc2<7:0>

}

ATTRIBUTES

Function unit shifter

Operation code 103

Number of operands 2

Modifier No

Modifier range -

Latency 1

Issue slots 1,2

DESCRIPTION

The arguments rsrc1 and rsrc2 are each vectors of two 16-bit values. The mergedual16lsb operation merges the

least-significant byte of each 16-bit value in rsrc1 and rsrc2 into one 32-bit word in rdest, converting dual 16-bit data to quad 8-bit data.

The mergedual16lsb operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls

the modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.
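The byte routing in the FUNCTION section can be sketched in Python as follows. The helper `byte` and the function name are assumptions made for this illustration; guard handling is omitted for brevity.

```python
def byte(x, lo):
    """Return the 8-bit field x<lo+7:lo>."""
    return (x >> lo) & 0xFF

def mergedual16lsb(rsrc1, rsrc2):
    """Keep the low byte of each 16-bit half: dual 16-bit -> quad 8-bit."""
    return (byte(rsrc1, 16) << 24) | (byte(rsrc1, 0) << 16) \
         | (byte(rsrc2, 16) << 8) | byte(rsrc2, 0)
```

A typical use, under these assumptions, is narrowing two registers of 16-bit intermediate results back to four 8-bit values when the upper byte of each halfword is known to carry no information.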

EXAMPLES

Initial Values | Operation | Result
r30 = 0x12345678, r40 = 0xaabbccdd | mergedual16lsb r30 r40 -> r50 | r50 <- 0x3478bbdd
r10 = 0, r30 = 0x12345678, r40 = 0xaabbccdd | IF r10 mergedual16lsb r30 r40 -> r50 | no change, since guard is false
r10 = 1, r30 = 0x01020304, r40 = 0x0a0b0c0d | IF r10 mergedual16lsb r30 r40 -> r50 | r50 <- 0x02040b0d

[Figure: byte routing from rsrc1 and rsrc2 into rdest.]

SEE ALSO

mergelsb mergemsb

pack16lsb pack16msb


mergelsb Merge least-significant byte

SYNTAX

[ IF rguard ] mergelsb rsrc1 rsrc2 -> rdest

FUNCTION

if rguard then {

rdest<7:0> <- rsrc2<7:0>

rdest<15:8> <- rsrc1<7:0>

rdest<23:16> <- rsrc2<15:8>

rdest<31:24> <- rsrc1<15:8>

}

ATTRIBUTES

Function unit alu

Operation code 57

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

As shown below, the mergelsb operation interleaves the two pairs of least-significant bytes from the arguments

rsrc1 and rsrc2 into rdest. The least-significant byte from rsrc2 is packed into the least-significant byte of rdest; the

least-significant byte from rsrc1 is packed into the second-least-significant byte of rdest; the second-least-significant

byte from rsrc2 is packed into the second-most-significant byte of rdest; and the second-least-significant byte from

rsrc1 is packed into the most-significant byte of rdest.

The mergelsb operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is unchanged.
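The interleaving described above can be modeled with a short Python sketch. The helper `byte` and the function name are assumptions made for this illustration; guard handling is omitted.

```python
def byte(x, lo):
    """Return the 8-bit field x<lo+7:lo>."""
    return (x >> lo) & 0xFF

def mergelsb(rsrc1, rsrc2):
    """Interleave the two low bytes of rsrc1 with the two low bytes of rsrc2."""
    return (byte(rsrc1, 8) << 24) | (byte(rsrc2, 8) << 16) \
         | (byte(rsrc1, 0) << 8) | byte(rsrc2, 0)
```

Swapping the two arguments swaps which source supplies the odd-numbered bytes of the result, as the examples that follow show.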

EXAMPLES

Initial Values | Operation | Result
r30 = 0x12345678, r40 = 0xaabbccdd | mergelsb r30 r40 -> r50 | r50 <- 0x56cc78dd
r10 = 0, r40 = 0xaabbccdd, r30 = 0x12345678 | IF r10 mergelsb r40 r30 -> r60 | no change, since guard is false
r20 = 1, r40 = 0xaabbccdd, r30 = 0x12345678 | IF r20 mergelsb r40 r30 -> r70 | r70 <- 0xcc56dd78

[Figure: byte routing from the two least-significant bytes of rsrc1 and rsrc2 into rdest.]

SEE ALSO

pack16lsb pack16msb

packbytes mergemsb


mergemsb Merge most-significant byte

SYNTAX

[ IF rguard ] mergemsb rsrc1 rsrc2 -> rdest

FUNCTION

if rguard then {

rdest<7:0> <- rsrc2<23:16>

rdest<15:8> <- rsrc1<23:16>

rdest<23:16> <- rsrc2<31:24>

rdest<31:24> <- rsrc1<31:24>

}

ATTRIBUTES

Function unit alu

Operation code 58

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

As shown below, the mergemsb operation interleaves the two pairs of most-significant bytes from the arguments

rsrc1 and rsrc2 into rdest. The second-most-significant byte from rsrc2 is packed into the least-significant byte of

rdest; the second-most-significant byte from rsrc1 is packed into the second-least-significant byte of rdest; the most-

significant byte from rsrc2 is packed into the second-most-significant byte of rdest; and the most-significant byte from

rsrc1 is packed into the most-significant byte of rdest.

The mergemsb operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is unchanged.
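A minimal Python sketch of the interleaving, assuming each source field is a full 8-bit byte (bits 23..16 and 31..24), which is what the description and the examples imply. The helper `byte` and the function name are assumptions for this illustration; guard handling is omitted.

```python
def byte(x, lo):
    """Return the 8-bit field x<lo+7:lo>."""
    return (x >> lo) & 0xFF

def mergemsb(rsrc1, rsrc2):
    """Interleave the two high bytes of rsrc1 with the two high bytes of rsrc2."""
    return (byte(rsrc1, 24) << 24) | (byte(rsrc2, 24) << 16) \
         | (byte(rsrc1, 16) << 8) | byte(rsrc2, 16)
```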

EXAMPLES

Initial Values | Operation | Result
r30 = 0x12345678, r40 = 0xaabbccdd | mergemsb r30 r40 -> r50 | r50 <- 0x12aa34bb
r10 = 0, r40 = 0xaabbccdd, r30 = 0x12345678 | IF r10 mergemsb r40 r30 -> r60 | no change, since guard is false
r20 = 1, r40 = 0xaabbccdd, r30 = 0x12345678 | IF r20 mergemsb r40 r30 -> r70 | r70 <- 0xaa12bb34

[Figure: byte routing from the two most-significant bytes of rsrc1 and rsrc2 into rdest.]

SEE ALSO

pack16lsb pack16msb

packbytes mergelsb


nop No operation

SYNTAX

nop

FUNCTION

No operation

ATTRIBUTES

Function unit -

Operation code -

Number of operands -

Modifier -

Modifier range -

Latency 1

Issue slots 1-5

DESCRIPTION

The NOP operation does not change any DSPCPU state. It is mainly used to fill empty issue slots. Only two

bits are used to encode the NOP operation.

EXAMPLES

Initial Values | Operation | Result
r30 = 0x12345678, r40 = 0xaabbccdd | nop | No change in any registers

SEE ALSO


pack16lsb Pack least-significant 16-bit halfwords

SYNTAX

[ IF rguard ] pack16lsb rsrc1 rsrc2 -> rdest

FUNCTION

if rguard then {

rdest<15:0> <- rsrc2<15:0>

rdest<31:16> <- rsrc1<15:0>

}

ATTRIBUTES

Function unit alu

Operation code 53

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

As shown below, the pack16lsb operation packs the two least-significant halfwords from the arguments rsrc1

and rsrc2 into rdest. The halfword from rsrc1 is packed into the most-significant halfword of rdest; the halfword from

rsrc2 is packed into the least-significant halfword of rdest.

The pack16lsb operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is unchanged.
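The packing can be sketched in Python as shown here. The function name is an assumption made for this illustration, and guard handling is omitted.

```python
def pack16lsb(rsrc1, rsrc2):
    """Pack the low halfwords: rsrc1<15:0> goes high, rsrc2<15:0> goes low."""
    return ((rsrc1 & 0xFFFF) << 16) | (rsrc2 & 0xFFFF)
```

Under these assumptions, the operation is a natural way to build a dual 16-bit vector from two scalar registers whose upper halves are to be discarded.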

EXAMPLES

Initial Values | Operation | Result
r30 = 0x12345678, r40 = 0xaabbccdd | pack16lsb r30 r40 -> r50 | r50 <- 0x5678ccdd
r10 = 0, r40 = 0xaabbccdd, r30 = 0x12345678 | IF r10 pack16lsb r40 r30 -> r60 | no change, since guard is false
r20 = 1, r40 = 0xaabbccdd, r30 = 0x12345678 | IF r20 pack16lsb r40 r30 -> r70 | r70 <- 0xccdd5678

[Figure: halfword routing from the least-significant halfwords of rsrc1 and rsrc2 into rdest.]

SEE ALSO

pack16msb packbytes

mergelsb mergemsb


pack16msb Pack most-significant 16 bits

SYNTAX

[ IF rguard ] pack16msb rsrc1 rsrc2 -> rdest

FUNCTION

if rguard then {

rdest<15:0> <- rsrc2<31:16>

rdest<31:16> <- rsrc1<31:16>

}

ATTRIBUTES

Function unit alu

Operation code 54

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

As shown below, the pack16msb operation packs the two most-significant halfwords from the arguments rsrc1

and rsrc2 into rdest. The halfword from rsrc1 is packed into the most-significant halfword of rdest; the halfword from

rsrc2 is packed into the least-significant halfword of rdest.

The pack16msb operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is unchanged.

EXAMPLES

Initial Values | Operation | Result
r30 = 0x12345678, r40 = 0xaabbccdd | pack16msb r30 r40 -> r50 | r50 <- 0x1234aabb
r10 = 0, r40 = 0xaabbccdd, r30 = 0x12345678 | IF r10 pack16msb r40 r30 -> r60 | no change, since guard is false
r20 = 1, r40 = 0xaabbccdd, r30 = 0x12345678 | IF r20 pack16msb r40 r30 -> r70 | r70 <- 0xaabb1234

[Figure: halfword routing from the most-significant halfwords of rsrc1 and rsrc2 into rdest.]

SEE ALSO

pack16lsb packbytes

mergelsb mergemsb


packbytes Pack least-significant byte

SYNTAX

[ IF rguard ] packbytes rsrc1 rsrc2 -> rdest

FUNCTION

if rguard then {

rdest<7:0> <- rsrc2<7:0>

rdest<15:8> <- rsrc1<7:0>

rdest<31:16> <- 0

}

ATTRIBUTES

Function unit alu

Operation code 52

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

As shown below, the packbytes operation packs the two least-significant bytes from the arguments rsrc1 and

rsrc2 into rdest. The byte from rsrc1 is packed into the second-least-significant byte of rdest; the byte from rsrc2 is

packed into the least-significant byte of rdest. The two most-significant bytes of rdest are filled with zeros.

The packbytes operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is unchanged.
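A minimal Python sketch of the packing, including the zero fill of the upper halfword stated in the description. The function name is an assumption made for this illustration; guard handling is omitted.

```python
def packbytes(rsrc1, rsrc2):
    """Pack rsrc1<7:0> into bits 15..8 and rsrc2<7:0> into bits 7..0.

    Bits 31..16 of the result are zero because nothing is OR-ed into them.
    """
    return ((rsrc1 & 0xFF) << 8) | (rsrc2 & 0xFF)
```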

EXAMPLES

Initial Values | Operation | Result
r30 = 0x12345678, r40 = 0xaabbccdd | packbytes r30 r40 -> r50 | r50 <- 0x000078dd
r10 = 0, r40 = 0xaabbccdd, r30 = 0x12345678 | IF r10 packbytes r40 r30 -> r60 | no change, since guard is false
r20 = 1, r40 = 0xaabbccdd, r30 = 0x12345678 | IF r20 packbytes r40 r30 -> r70 | r70 <- 0x0000dd78

[Figure: byte routing from the least-significant bytes of rsrc1 and rsrc2 into the two low bytes of rdest; the upper 16 bits of rdest are zero.]

SEE ALSO

pack16lsb pack16msb

mergelsb mergemsb


pref prefetch (pseudo-op for prefd(0))

SYNTAX

[ IF rguard ] pref rsrc1

FUNCTION

if rguard then {

cache_block_mask = ~(cache_block_size - 1)

data_cache <- mem[(rsrc1 + 0) & cache_block_mask]

}

ATTRIBUTES

Function unit dmemspec

Operation code 209

Number of operands 1

Modifier -

Modifier range -

Latency -

Issue slots 5

DESCRIPTION

The pref operation is a pseudo operation transformed by the scheduler into a prefd(0) with the same arguments.

(Note: pseudo operations cannot be used in assembly files.)

The pref operation loads one full cache block from main memory at the address computed by ((rsrc1 + 0) &

cache_block_mask) and stores the data in the data cache. This operation is not guaranteed to be executed. The

prefetch unit does not execute it when the data to be prefetched is already in the data cache, or when the cache is

already occupied with two cache misses at the time the operation is issued.

The pref operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the execution

of the prefetch operation. If the LSB of rguard is 1, the prefetch operation is executed; otherwise, it is not executed.
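The address masking in the FUNCTION section can be illustrated with a short Python sketch that computes which block a prefetch would target. The function name `prefetch_block` and the default block size of 0x40 (taken from the examples) are assumptions; the sketch models only the address arithmetic, not the cache itself.

```python
MASK32 = 0xFFFFFFFF  # addresses are 32 bits wide

def prefetch_block(rsrc1, cache_block_size=0x40):
    """Return (first, last) byte address of the cache block containing rsrc1."""
    cache_block_mask = ~(cache_block_size - 1) & MASK32
    first = (rsrc1 + 0) & cache_block_mask
    return first, first + cache_block_size - 1
```

Any address inside a block maps to the same block, so `prefetch_block(0xabcd)` and `prefetch_block(0xabff)` both give the range 0xabc0 to 0xabff.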

EXAMPLES

NOTE: This operation may only be supported in TM-1000, TM-1100, TM-1300 and

PNX1300/01/02/11. It is not guaranteed to be available in future generations of Trimedia

products.

Initial Values | Operation | Result
r10 = 0xabcd, cache_block_size = 0x40 | pref r10 | Loads a cache line for the address space from 0xabc0 to 0xabff from the main memory. If the data is already in the cache, the operation is not executed.
r10 = 0xabcd, r11 = 0, cache_block_size = 0x40 | IF r11 pref r10 | since guard is false, pref operation is not executed
r10 = 0xabff, r11 = 1, cache_block_size = 0x40 | IF r11 pref r10 | Loads a cache line for the address space from 0xabc0 to 0xabff from the main memory. If the data is already in the cache, the operation is not executed.

SEE ALSO

pref16x pref32x prefd

prefr allocd allocr allocx


pref16x prefetch with 16-bit scaled index

SYNTAX

[ IF rguard ] pref16x rsrc1 rsrc2

FUNCTION

if rguard then {

cache_block_mask = ~(cache_block_size - 1)

data_cache <- mem[(rsrc1 + (2 x rsrc2)) & cache_block_mask]

}

ATTRIBUTES

Function unit dmemspec

Operation code 211

Number of operands 2

Modifier No

Modifier range -

Latency -

Issue slots 5

DESCRIPTION

The pref16x operation loads one full cache block from the main memory at the address computed by ((rsrc1 + (2 x

rsrc2)) & cache_block_mask) and stores the data in the data cache. This operation is not guaranteed to be

executed. The prefetch unit does not execute it when the data to be prefetched is already in the data cache.

The data cache has hardware to simultaneously sustain two cache misses or prefetches. A pref16x operation is not

executed when the cache is already occupied with two cache misses at the time the operation is issued.

The pref16x operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

execution of the prefetch operation. If the LSB of rguard is 1, the prefetch operation is executed; otherwise, it is not

executed.
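As a hedged sketch of the scaled-index addressing, the following Python function computes the block targeted by a pref16x. The function name and the default block size of 0x40 (from the examples) are assumptions; only the address arithmetic is modeled.

```python
MASK32 = 0xFFFFFFFF  # addresses are 32 bits wide

def pref16x_block(rsrc1, rsrc2, cache_block_size=0x40):
    """Return (first, last) byte address of the block holding rsrc1 + 2*rsrc2."""
    addr = (rsrc1 + 2 * rsrc2) & MASK32        # index scaled by the 16-bit element size
    first = addr & ~(cache_block_size - 1) & MASK32
    return first, first + cache_block_size - 1
```

The scale factor of 2 means rsrc2 can be used directly as an element index into an array of 16-bit values whose base address is in rsrc1.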

EXAMPLES

NOTE: This operation may only be supported in TM-1000, TM-1100, TM-1300 and

PNX1300/01/02/11. It is not guaranteed to be available in future generations of Trimedia

products.

Initial Values | Operation | Result
r10 = 0xabcd, r12 = 0xc, cache_block_size = 0x40 | pref16x r10 r12 | Loads a cache line for the address space from 0xabc0 to 0xabff from the main memory. If the data is already in the cache, the operation is not executed.
r10 = 0xabcd, r11 = 0, r12 = 0xc, cache_block_size = 0x40 | IF r11 pref16x r10 r12 | since guard is false, pref16x operation is not executed
r10 = 0xabff, r11 = 1, r12 = 0x1, cache_block_size = 0x40 | IF r11 pref16x r10 r12 | Loads a cache line for the address space from 0xac00 to 0xac3f from the main memory. If the data is already in the cache, the operation is not executed.

SEE ALSO

pref32x prefd prefr allocd

allocr allocx


pref32x prefetch with 32-bit scaled index

SYNTAX

[ IF rguard ] pref32x rsrc1 rsrc2

FUNCTION

if rguard then {

cache_block_mask = ~(cache_block_size - 1)

data_cache <- mem[(rsrc1 + (4 x rsrc2)) & cache_block_mask]

}

ATTRIBUTES

Function unit dmemspec

Operation code 212

Number of operands 2

Modifier No

Modifier range -

Latency -

Issue slots 5

DESCRIPTION

The pref32x operation loads one full cache block from main memory at the address computed by ((rsrc1 +

(4 x rsrc2)) & cache_block_mask) and stores the data in the data cache. This operation is not guaranteed to be

executed. The prefetch unit does not execute it when the data to be prefetched is already in the data cache, or

when the cache is already occupied with two cache misses at the time the operation is issued.

The pref32x operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

execution of the prefetch operation. If the LSB of rguard is 1, the prefetch operation is executed; otherwise, it is not

executed.

EXAMPLES

NOTE: This operation may only be supported in TM-1000, TM-1100, TM-1300 and

PNX1300/01/02/11. It is not guaranteed to be available in future generations of Trimedia

products.

Initial Values | Operation | Result
r10 = 0xabcd, r12 = 0xd, cache_block_size = 0x40 | pref32x r10 r12 | Loads a cache line for the address space from 0xac00 to 0xac3f from the main memory. If the data is already in the cache, the operation is not executed.
r10 = 0xabcd, r11 = 0, r12 = 0xd, cache_block_size = 0x40 | IF r11 pref32x r10 r12 | since guard is false, pref32x operation is not executed
r10 = 0xabff, r11 = 1, r12 = 0x1, cache_block_size = 0x40 | IF r11 pref32x r10 r12 | Loads a cache line for the address space from 0xac00 to 0xac3f from the main memory. If the data is already in the cache, the operation is not executed.

SEE ALSO

pref16x prefd prefr allocd

allocr allocx


prefd prefetch with displacement

SYNTAX

[ IF rguard ] prefd(d) rsrc1

FUNCTION

if rguard then {

cache_block_mask = ~(cache_block_size - 1)

data_cache <- mem[(rsrc1 + d) & cache_block_mask]

}

ATTRIBUTES

Function unit dmemspec

Operation code 209

Number of operands 1

Modifier 7 bits

Modifier range –256..252 by 4

Latency -

Issue slots 5

DESCRIPTION

The prefd operation loads one full cache block from main memory at the address computed by ((rsrc1 + d) &

cache_block_mask) and stores the data in the data cache. This operation is not guaranteed to be executed. The

prefetch unit does not execute it when the data to be prefetched is already in the data cache, or when the cache is

already occupied with two cache misses at the time the operation is issued.

The prefd operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the execution

of the prefetch operation. If the LSB of rguard is 1, the prefetch operation is executed; otherwise, it is not executed.

EXAMPLES

NOTE: This operation may only be supported in TM-1000, TM-1100, TM-1300 and

PNX1300/01/02/11. It is not guaranteed to be available in future generations of Trimedia

products.

Initial Values | Operation | Result
r10 = 0xabcd, cache_block_size = 0x40 | prefd(0xd) r10 | Loads a cache line for the address space from 0xabc0 to 0xabff from the main memory. If the data is already in the cache, the operation is not executed.
r10 = 0xabcd, r11 = 0, cache_block_size = 0x40 | IF r11 prefd(0xd) r10 | since guard is false, prefd operation is not executed
r10 = 0xabff, r11 = 1, cache_block_size = 0x40 | IF r11 prefd(0x1) r10 | Loads a cache line for the address space from 0xac00 to 0xac3f from the main memory. If the data is already in the cache, the operation is not executed.

SEE ALSO

pref16x pref32x prefr

allocd allocr allocx


prefr prefetch with index

SYNTAX

[ IF rguard ] prefr rsrc1 rsrc2

FUNCTION

if rguard then {

cache_block_mask = ~(cache_block_size - 1)

data_cache <- mem[(rsrc1 + rsrc2) & cache_block_mask]

}

ATTRIBUTES

Function unit dmemspec

Operation code 210

Number of operands 2

Modifier No

Modifier range -

Latency -

Issue slots 5

DESCRIPTION

The prefr operation loads one full cache block from main memory at the address computed by

((rsrc1 + rsrc2) & cache_block_mask) and stores the data in the data cache. This operation is not guaranteed to be

executed. The prefetch unit does not execute it when the data to be prefetched is already in the data cache, or

when the cache is already occupied with two cache misses at the time the operation is issued.

The prefr operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

execution of the prefetch operation. If the LSB of rguard is 1, the prefetch operation is executed; otherwise, it is not

executed.

EXAMPLES

NOTE: This operation may only be supported in TM-1000, TM-1100, TM-1300 and

PNX1300/01/02/11. It is not guaranteed to be available in future generations of Trimedia

products.

Initial Values | Operation | Result
r10 = 0xabcd, r12 = 0xd, cache_block_size = 0x40 | prefr r10 r12 | Loads a cache line for the address space from 0xabc0 to 0xabff from the main memory. If the data is already in the cache, the operation is not executed.
r10 = 0xabcd, r11 = 0, r12 = 0xd, cache_block_size = 0x40 | IF r11 prefr r10 r12 | since guard is false, prefr operation is not executed
r10 = 0xabff, r11 = 1, r12 = 0x1, cache_block_size = 0x40 | IF r11 prefr r10 r12 | Loads a cache line for the address space from 0xac00 to 0xac3f from the main memory. If the data is already in the cache, the operation is not executed.

SEE ALSO

pref16x pref32x prefd

allocd allocr allocx


quadavg Unsigned byte-wise quad average

SYNTAX

[ IF rguard ] quadavg rsrc1 rsrc2 -> rdest

FUNCTION

if rguard then {

temp <- (zero_ext8to32(rsrc1<7:0>) + zero_ext8to32(rsrc2<7:0>) + 1) / 2

rdest<7:0> <- temp<7:0>

temp <- (zero_ext8to32(rsrc1<15:8>) + zero_ext8to32(rsrc2<15:8>) + 1) / 2

rdest<15:8> <- temp<7:0>

temp <- (zero_ext8to32(rsrc1<23:16>) + zero_ext8to32(rsrc2<23:16>) + 1) / 2

rdest<23:16> <- temp<7:0>

temp <- (zero_ext8to32(rsrc1<31:24>) + zero_ext8to32(rsrc2<31:24>) + 1) / 2

rdest<31:24> <- temp<7:0>

}

ATTRIBUTES

Function unit dspalu

Operation code 73

Number of operands 2

Modifier No

Modifier range —

Latency 2

Issue slots 1, 3

DESCRIPTION

As shown below, the quadavg operation computes four separate averages of the four pairs of corresponding 8-bit

bytes of rsrc1 and rsrc2. All bytes are considered unsigned. The least-significant 8 bits of each average are written to

the corresponding byte in rdest. No overflow or underflow detection is performed.

The quadavg operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.
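The rounding behavior of the per-byte average can be seen in a small Python model of the FUNCTION pseudocode. The function name is an assumption made for this illustration; guard handling is omitted.

```python
def quadavg(rsrc1, rsrc2):
    """Four independent unsigned byte averages, each rounded up on a tie."""
    result = 0
    for lo in (0, 8, 16, 24):
        a = (rsrc1 >> lo) & 0xFF       # zero-extended byte of rsrc1
        b = (rsrc2 >> lo) & 0xFF       # zero-extended byte of rsrc2
        temp = (a + b + 1) // 2        # the +1 rounds half values upward
        result |= (temp & 0xFF) << lo  # keep temp<7:0>
    return result
```

Because the largest possible intermediate value is (255 + 255 + 1) / 2 = 255, each average always fits in 8 bits in this model.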

EXAMPLES

Initial Values | Operation | Result
r30 = 0x0201000e, r40 = 0xffffff02 | quadavg r30 r40 -> r50 | r50 <- 0x81808008
r10 = 0, r60 = 0x9c9c6464, r70 = 0x649c649c | IF r10 quadavg r60 r70 -> r80 | no change, since guard is false
r20 = 1, r60 = 0x9c9c6464, r70 = 0x649c649c | IF r20 quadavg r60 r70 -> r90 | r90 <- 0x809c6480

[Figure: the four pairs of unsigned bytes from rsrc1 and rsrc2 form four full-precision 9-bit sums, which are reduced to the four unsigned bytes of rdest.]

SEE ALSO

iavgonep dspuquadaddui

ifir8ii


quadumax Unsigned byte-wise quad maximum

SYNTAX

[ IF rguard ] quadumax rsrc1 rsrc2 -> rdest

FUNCTION

if rguard then {

rdest<7:0> <- if rsrc1<7:0> > rsrc2<7:0> then rsrc1<7:0> else rsrc2<7:0>

rdest<15:8> <- if rsrc1<15:8> > rsrc2<15:8> then rsrc1<15:8> else rsrc2<15:8>

rdest<23:16> <- if rsrc1<23:16> > rsrc2<23:16> then rsrc1<23:16> else rsrc2<23:16>

rdest<31:24> <- if rsrc1<31:24> > rsrc2<31:24> then rsrc1<31:24> else rsrc2<31:24>

}

ATTRIBUTES

Function unit dspalu

Operation code 81

Number of operands 2

Modifier No

Modifier range —

Latency 2

Issue slots 1,3

DESCRIPTION

The quadumax operation computes four separate maximum values of the four pairs of corresponding 8-bit bytes of

rsrc1 and rsrc2. All bytes are considered unsigned. The quadumax operation is particularly suited to implement

median computation on packed pixel data structures:

MEDIAN_Q(a,b,c) (QUADUMIN(QUADUMAX(QUADUMIN((a),(b)), (c)), QUADUMAX((a),(b))))

The quadumax operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.
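The median construction quoted above can be checked with a Python model of the two byte-wise operations. The function names and the `_bytewise` helper are assumptions made for this sketch; guard handling is omitted.

```python
def _bytewise(pick, rsrc1, rsrc2):
    """Apply pick() to each pair of corresponding unsigned bytes."""
    result = 0
    for lo in (0, 8, 16, 24):
        a = (rsrc1 >> lo) & 0xFF
        b = (rsrc2 >> lo) & 0xFF
        result |= pick(a, b) << lo
    return result

def quadumax(rsrc1, rsrc2):
    return _bytewise(max, rsrc1, rsrc2)

def quadumin(rsrc1, rsrc2):
    return _bytewise(min, rsrc1, rsrc2)

def median_q(a, b, c):
    """Byte-wise median of three packed words, following MEDIAN_Q."""
    return quadumin(quadumax(quadumin(a, b), c), quadumax(a, b))
```

The median of three values is the smaller of max(a, b) and max(min(a, b), c), which is why four operations are enough to process four pixels at once.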

EXAMPLES

Initial Values | Operation | Result
r30 = 0x0201000e, r40 = 0xff00ff02 | quadumax r30 r40 -> r50 | r50 <- 0xff01ff0e
r10 = 0, r60 = 0x9c9c6464, r70 = 0x649d649c | IF r10 quadumax r60 r70 -> r80 | no change, since guard is false
r20 = 1, r60 = 0x9c9c6464, r70 = 0x649d649c | IF r20 quadumax r60 r70 -> r90 | r90 <- 0x9c9d649c

SEE ALSO

imax imin quadumin


quadumin Unsigned bytewise quad minimum

SYNTAX

[ IF rguard ] quadumin rsrc1 rsrc2 -> rdest

FUNCTION

if rguard then {

rdest<7:0> <- if rsrc1<7:0> < rsrc2<7:0> then rsrc1<7:0> else rsrc2<7:0>

rdest<15:8> <- if rsrc1<15:8> < rsrc2<15:8> then rsrc1<15:8> else rsrc2<15:8>

rdest<23:16> <- if rsrc1<23:16> < rsrc2<23:16> then rsrc1<23:16> else rsrc2<23:16>

rdest<31:24> <- if rsrc1<31:24> < rsrc2<31:24> then rsrc1<31:24> else rsrc2<31:24>

}

ATTRIBUTES

Function unit dspalu

Operation code 80

Number of operands 2

Modifier No

Modifier range —

Latency 2

Issue slots 1,3

DESCRIPTION

The quadumin operation computes four separate minimum values of the four pairs of corresponding 8-bit bytes of

rsrc1 and rsrc2. All bytes are considered unsigned. The quadumin operation is particularly suited to implement

median computation on packed pixel data structures:

MEDIAN_Q(a,b,c) (QUADUMIN(QUADUMAX(QUADUMIN((a),(b)), (c)), QUADUMAX((a),(b))))

The quadumin operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values | Operation | Result
r30 = 0x0201000e, r40 = 0xff00ff02 | quadumin r30 r40 -> r50 | r50 <- 0x02000002
r10 = 0, r60 = 0x9c9c6464, r70 = 0x649d649c | IF r10 quadumin r60 r70 -> r80 | no change, since guard is false
r20 = 1, r60 = 0x9c9c6464, r70 = 0x649d649c | IF r20 quadumin r60 r70 -> r90 | r90 <- 0x649c6464

SEE ALSO

imin imax quadumax


quadumulmsb Unsigned quad 8-bit multiply most significant

SYNTAX

[ IF rguard ] quadumulmsb rsrc1 rsrc2 -> rdest

FUNCTION

if rguard then {

temp <- (zero_ext8to32(rsrc1<7:0>) x zero_ext8to32(rsrc2<7:0>))

rdest<7:0> <- temp<15:8>

temp <- (zero_ext8to32(rsrc1<15:8>) x zero_ext8to32(rsrc2<15:8>))

rdest<15:8> <- temp<15:8>

temp <- (zero_ext8to32(rsrc1<23:16>) x zero_ext8to32(rsrc2<23:16>))

rdest<23:16> <- temp<15:8>

temp <- (zero_ext8to32(rsrc1<31:24>) x zero_ext8to32(rsrc2<31:24>))

rdest<31:24> <- temp<15:8>

}

ATTRIBUTES

Function unit dspmul

Operation code 89

Number of operands 2

Modifier No

Modifier range —

Latency 3

Issue slots 2, 3

DESCRIPTION

As shown below, the quadumulmsb operation computes four separate products of the four pairs of corresponding

8-bit bytes of rsrc1 and rsrc2. All bytes are considered unsigned. The most-significant 8 bits of each 16-bit product are

written to the corresponding byte in rdest.

The quadumulmsb operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls

the modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.
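A minimal Python sketch of the per-byte multiply, following the FUNCTION pseudocode. The function name is an assumption made for this illustration; guard handling is omitted.

```python
def quadumulmsb(rsrc1, rsrc2):
    """Four unsigned 8x8 multiplies; keep bits 15..8 of each 16-bit product."""
    result = 0
    for lo in (0, 8, 16, 24):
        a = (rsrc1 >> lo) & 0xFF
        b = (rsrc2 >> lo) & 0xFF
        temp = a * b                          # full-precision 16-bit product
        result |= ((temp >> 8) & 0xFF) << lo  # temp<15:8>
    return result
```

Keeping only the upper byte is equivalent to computing floor(a * b / 256), so under these assumptions one operand can act as an 8-bit fractional scale factor for four pixels at once.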

EXAMPLES

Initial Values | Operation | Result
r30 = 0x0210800e, r40 = 0xffffff02 | quadumulmsb r30 r40 -> r50 | r50 <- 0x010f7f00
r10 = 0, r60 = 0x80ff1010, r70 = 0x80ff100f | IF r10 quadumulmsb r60 r70 -> r80 | no change, since guard is false
r20 = 1, r60 = 0x80ff1010, r70 = 0x80ff100f | IF r20 quadumulmsb r60 r70 -> r90 | r90 <- 0x40fe0100

[Figure: the four pairs of unsigned bytes from rsrc1 and rsrc2 form four full-precision 16-bit products; bits 15..8 of each product are written to the corresponding unsigned byte of rdest.]

SEE ALSO

quadavg dspuquadaddui

ifir8ii


rdstatus Read data cache status bits

SYNTAX

[ IF rguard ] rdstatus(d) rsrc1 -> rdest

FUNCTION

if rguard then {

set_addr <- rsrc1 + d

/* set_addr<10:6> selects set */

rdest<9:0> <- dcache_LRU_set(set_addr)

rdest<17:10> <- dcache_dirty_set(set_addr)

rdest<31:18> <- 0

}

ATTRIBUTES

Function unit dmemspec

Operation code 203

Number of operands 1

Modifier 7 bits

Modifier range –256..252 by 4

Latency 3

Issue slots 5

DESCRIPTION

The rdstatus operation reads the LRU and dirty bits associated with a set in the data cache and writes these bits

into the destination register rdest. The target set in the data cache is determined by bits 10..6 of the result of rsrc1 + d.

The d value is an opcode modifier, must be in the range –256 to 252 inclusive, and must be a multiple of 4.

The result of rdstatus contains LRU information in bits 9..0 and dirty-bit information in bits 17..10. All other bits of

rdest are set to zero.

rdstatus requires two stall cycles to complete.

The dual-ported data cache uses two separate copies of tag and status information. A rdstatus operation returns

the LRU and dirty information stored in the cache port that corresponds to the operation slot in which the rdstatus

operation is issued.

The rdstatus operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.
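The field layout described above can be unpacked as in the following Python sketch. It models only the bit positions given in the text; the function names and the dictionary keys are assumptions, and the meaning of the individual LRU bits is not specified here.

```python
def rdstatus_set_index(rsrc1, d):
    """Cache set selected by rdstatus(d): bits 10..6 of rsrc1 + d."""
    return ((rsrc1 + d) >> 6) & 0x1F

def decode_rdstatus(rdest):
    """Split an rdstatus result into its LRU (9..0) and dirty (17..10) fields."""
    return {"lru": rdest & 0x3FF, "dirty": (rdest >> 10) & 0xFF}
```

Five set-select bits give 32 sets, and the 8-bit dirty field is consistent with one dirty bit for each of the eight elements of a set.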

EXAMPLES

Initial Values | Operation | Result
 | rdstatus(0) r30 -> r60 | 
r10 = 0 | IF r10 rdstatus(4) r40 -> r70 | no change, since guard is false
r20 = 1 | IF r20 rdstatus(8) r50 -> r80 | 

SEE ALSO

rdtag


rdtag Read data cache address tag

SYNTAX

[ IF rguard ] rdtag(d) rsrc1 -> rdest

FUNCTION

if rguard then {

block_addr <- rsrc1 + d

/* block_addr<13:11> selects element, block_addr<10:6> selects set */

rdest<21:0> <- dcache_tag_block(block_addr)

rdest<31:22> <- 0

}

ATTRIBUTES

Function unit dmemspec

Operation code 202

Number of operands 1

Modifier 7 bits

Modifier range –256..252 by 4

Latency 3

Issue slots 5

DESCRIPTION

The rdtag operation reads the address tag associated with a block in the data cache and writes these bits into the

destination register rdest. The target block in the data cache is determined by bits 13..6 of the result of rsrc1 + d. Bits

10..6 of rsrc1 + d select the cache set and 13..11 of rsrc1 + d select the element within that set. The d value is an

opcode modifier, must be in the range –256 to 252 inclusive, and must be a multiple of 4.

rdtag writes the address tag for the selected block in bits 21..0 of rdest. All other bits of rdest are set to zero.

rdtag requires no stall cycles to complete.

The dual-ported data cache uses two separate copies of tag and status information. A rdtag operation returns the

address tag information stored in the cache port that corresponds to the operation slot in which the rdtag operation

is issued.

The rdtag operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.
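
As a rough illustration of the addressing described above, the following C sketch splits rsrc1 + d into the set and element selectors and masks the 22-bit tag out of a returned value. The function names are hypothetical; only the bit ranges are taken from the description.

```c
#include <stdint.h>

/* Cache set selected by rdtag(d) rsrc1: bits 10..6 of rsrc1 + d. */
static uint32_t rdtag_set(uint32_t rsrc1, int32_t d)
{
    return ((rsrc1 + (uint32_t)d) >> 6) & 0x1fu;
}

/* Element within the set: bits 13..11 of rsrc1 + d. */
static uint32_t rdtag_element(uint32_t rsrc1, int32_t d)
{
    return ((rsrc1 + (uint32_t)d) >> 11) & 0x7u;
}

/* Address tag field of the result: rdest<21:0>. */
static uint32_t rdtag_tag(uint32_t rdest)
{
    return rdest & 0x3fffffu;
}
```

With these ranges, stepping rsrc1 + d by 64 walks through the sets, and stepping it by 2048 walks through the elements of one set.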

EXAMPLES

Initial Values Operation Result

rdtag(0) r30 → r60

r10 = 0 IF r10 rdtag(4) r40 → r70 no change, since guard is false

r20 = 1 IF r20 rdtag(8) r50 → r80

SEE ALSO

rdstatus

rdtag

Read destination program counter

SYNTAX

[ IF rguard ] readdpc → rdest

FUNCTION

if rguard then {
    rdest ← DPC
}

ATTRIBUTES

Function unit fcomp

Operation code 156

Number of operands 0

Modifier No

Modifier range —

Latency 1

Issue slots 3

DESCRIPTION

The readdpc operation writes the current value of the DPC (Destination Program Counter) processor register to rdest.

Interruptible jumps write their target address to the DPC. If an interrupt or exception is taken at an interruptible jump,

execution of the interrupted program can be resumed by jumping to the value contained in DPC. This operation can

be used to save state before idling a task in a multi-tasking environment.

The readdpc operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is unchanged.

EXAMPLES

Initial Values Operation Result

DPC = 0xbeebee readdpc → r100 r100 ← 0xbeebee

r20 = 0, DPC = 0xabba IF r20 readdpc → r101 no change, since guard is false

r21 = 1, DPC = 0xabba IF r21 readdpc → r102 r102 ← 0xabba

SEE ALSO

writedpc readspc ijmpf

ijmpi ijmpt

readdpc

Read program control and status word

SYNTAX

[ IF rguard ] readpcsw → rdest

FUNCTION

if rguard then {
    rdest ← PCSW
}

ATTRIBUTES

Function unit fcomp

Operation code 158

Number of operands 0

Modifier No

Modifier range —

Latency 1

Issue slots 3

DESCRIPTION

The readpcsw writes the current value of the PCSW (Program Control and Status Word) processor register to

rdest. The layout of PCSW is shown below.

Fields in the PCSW have two chief purposes: to control aspects of processor operation and to record events that

occur during program execution. Thus, readpcsw can be used to determine current processor operating modes and

what events have occurred; this operation can also be used to save state before idling a task in a multi-tasking

environment.

The readpcsw operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is unchanged.
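
A value read with readpcsw is typically tested against individual field masks. The C sketch below defines only the bits that can be inferred from the worked example in this section (PCSW = 0x80110642); the macro and function names are hypothetical, and the bit positions should be treated as assumptions to be checked against the PCSW layout.

```c
#include <stdint.h>

/* Bit positions inferred from the example value 0x80110642 (assumptions). */
#define PCSW_INX     (1u << 1)   /* inexact result flag            */
#define PCSW_OFZ     (1u << 6)   /* denormalized result flag       */
#define PCSW_BSX     (1u << 9)   /* byte sex, 1 = little endian    */
#define PCSW_IEN     (1u << 10)  /* interrupt enable               */
#define PCSW_TRPDBZ  (1u << 16)  /* trap on DBZ enabled            */
#define PCSW_TRPINV  (1u << 20)  /* trap on INV enabled            */
#define PCSW_TRPMSE  (1u << 31)  /* trap on MSE enabled            */

/* Nonzero if the PCSW value selects little-endian operation. */
static int pcsw_is_little_endian(uint32_t pcsw)
{
    return (pcsw & PCSW_BSX) != 0;
}

/* Nonzero if interrupts are enabled in the PCSW value. */
static int pcsw_interrupts_enabled(uint32_t pcsw)
{
    return (pcsw & PCSW_IEN) != 0;
}
```

Under these assumptions, OR-ing the seven masks together reproduces the example value 0x80110642.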

EXAMPLES

Initial Values Operation Result

PCSW = 0x80110642 readpcsw → r100 r100 ← 0x80110642 (trap on MSE, INV and DBZ enabled; IEN=1, interrupts enabled; BSX=1, little-endian mode of operation; OFZ=1, a denormalized result was produced somewhere; INX=1, an inexact result was produced somewhere)

r20 = 0, PCSW = 0x80000000 IF r20 readpcsw → r101 no change, since guard is false

r21 = 1, PCSW = 0x80000000 IF r21 readpcsw → r102 r102 ← 0x80000000 (trap on MSE enabled)

Figure: PCSW layout. Bits not named below are UNDEF (undefined).

PCSW<31:16> (trap enables): TRPMSE (misaligned store exception trap enable), TRPWBE (write back error trap enable),
TRPRSE (reserved exception trap enable), TFE (trap on first exit), and the FP exception trap-enable bits TRPOFZ,
TRPIFZ, TRPINV, TRPOVF, TRPUNF, TRPINX, TRPDBZ.

PCSW<15:0> (status and control): MSE (misaligned store exception), WBE (write back error), RSE (reserved exception),
CS (count stalls, 1 ⇒ yes), IEN (interrupt enable, 1 ⇒ allow interrupts), BSX (byte sex, 1 ⇒ little endian),
IEEE MODE (IEEE rounding mode: 0 ⇒ to nearest, 1 ⇒ to zero, 2 ⇒ to positive, 3 ⇒ to negative), and the FP
exception bits OFZ, IFZ, INV, OVF, UNF, INX, DBZ.

SEE ALSO

writepcsw

readpcsw

Read source program counter

SYNTAX

[ IF rguard ] readspc → rdest

FUNCTION

if rguard then {
    rdest ← SPC
}

ATTRIBUTES

Function unit fcomp

Operation code 157

Number of operands 0

Modifier No

Modifier range —

Latency 1

Issue slots 3

DESCRIPTION

The readspc writes the current value of the SPC (Source Program Counter) processor register to rdest.

An interruptible jump that is not interrupted (no NMI, INT, or EXC event was pending when the jump was executed)

writes its target address to SPC. The value of SPC allows an exception-handling routine to determine the start

address of the block of scheduled code (called a decision tree) that was executing before the exception was

taken. This operation can be used to save state before idling a task in a multi-tasking environment.

The readspc operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is unchanged.

EXAMPLES

Initial Values Operation Result

SPC = 0xbeebee readspc → r100 r100 ← 0xbeebee

r20 = 0, SPC = 0xabba IF r20 readspc → r101 no change, since guard is false

r21 = 1, SPC = 0xabba IF r21 readspc → r102 r102 ← 0xabba

SEE ALSO

writespc readdpc ijmpf

ijmpi ijmpt

readspc

Rotate left

SYNTAX

[ IF rguard ] rol rsrc1 rsrc2 → rdest

FUNCTION

if rguard then {
    n ← rsrc2<4:0>
    rdest<31:n> ← rsrc1<31–n:0>
    rdest<n–1:0> ← rsrc1<31:32–n>
}

ATTRIBUTES

Function unit shifter

Operation code 97

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2

DESCRIPTION

As shown below, the rol operation takes two arguments, rsrc1 and rsrc2. The least-significant five bits of rsrc2

specify an unsigned rotate amount, and rdest is set to rsrc1 rotated left by this amount. The most-significant n bits of

rsrc1, where n is the rotate amount, appear as the least-significant n bits in rdest.

The rol operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is unchanged.
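
The rotate described above can be modeled in C as shown below. This is a sketch of the unguarded data path only; the function name rol32 is hypothetical. The explicit test for n = 0 is a C detail (a 32-bit shift by 32 is undefined in C), not a property of the operation.

```c
#include <stdint.h>

/* Model of rol: rotate rsrc1 left by the five LSBs of rsrc2. */
static uint32_t rol32(uint32_t rsrc1, uint32_t rsrc2)
{
    uint32_t n = rsrc2 & 31u;      /* only rsrc2<4:0> is used */
    if (n == 0) {
        return rsrc1;              /* avoid an undefined shift by 32 in C */
    }
    return (rsrc1 << n) | (rsrc1 >> (32u - n));
}
```

For instance, rol32(0xe, 0xfffffffe) uses a rotate amount of 0x1e and yields 0x80000003, matching the last row of the examples.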

EXAMPLES

Initial Values Operation Result

r60 = 0x20, r30 = 3 rol r60 r30 → r90 r90 ← 0x100

r10 = 0, r60 = 0x20, r30 = 3 IF r10 rol r60 r30 → r100 no change, since guard is false

r20 = 1, r60 = 0x20, r30 = 3 IF r20 rol r60 r30 → r110 r110 ← 0x100

r70 = 0xfffffffc, r40 = 2 rol r70 r40 → r120 r120 ← 0xfffffff3

r80 = 0xe, r50 = 0xfffffffe rol r80 r50 → r125 r125 ← 0x80000003 (r50 is effectively equal to 0x1e)

Figure: rol data flow. The five LSBs of rsrc2 give the rotate amount n; the left rotator takes the 32 bits of rsrc1 and
produces rdest. In the example shown (n = 9), rsrc1<22:0> appears in rdest<31:9> and rsrc1<31:23> appears in rdest<8:0>.

SEE ALSO

roli asr asri lsl lsli lsr

lsri

rol

Rotate left by immediate

SYNTAX

[ IF rguard ] roli(n) rsrc1 → rdest

FUNCTION

if rguard then {
    rdest<31:n> ← rsrc1<31–n:0>
    rdest<n–1:0> ← rsrc1<31:32–n>
}

ATTRIBUTES

Function unit shifter

Operation code 98

Number of operands 1

Modifier 7 bits

Modifier range 0..31

Latency 1

Issue slots 1, 2

DESCRIPTION

As shown below, the roli operation takes a single argument in rsrc1 and an immediate modifier n and produces a

result in rdest equal to rsrc1 rotated left by n bits. The value of n must be between 0 and 31, inclusive. The most-

significant n bits of rsrc1 appear as the least-significant n bits in rdest.

The roli operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is unchanged.

EXAMPLES

Initial Values Operation Result

r60 = 0x20 roli(3) r60 → r90 r90 ← 0x100

r10 = 0, r60 = 0x20 IF r10 roli(3) r60 → r100 no change, since guard is false

r20 = 1, r60 = 0x20 IF r20 roli(3) r60 → r110 r110 ← 0x100

r70 = 0xfffffffc roli(2) r70 → r120 r120 ← 0xfffffff3

r80 = 0xe roli(30) r80 → r125 r125 ← 0x80000003

Figure: roli data flow. The rotate amount n comes from the operation modifier; the left rotator takes the 32 bits of
rsrc1 and produces rdest. In the example shown (n = 9), rsrc1<22:0> appears in rdest<31:9> and rsrc1<31:23> appears in
rdest<8:0>.

SEE ALSO

rol asl asli asr asri lsl

lsli lsr lsri

roli

Sign extend 16 bits

SYNTAX

[ IF rguard ] sex16 rsrc1 → rdest

FUNCTION

if rguard then
    rdest ← sign_ext16to32(rsrc1<15:0>)

ATTRIBUTES

Function unit alu

Operation code 51

Number of operands 1

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

As shown below, the sex16 operation sign extends the least-significant 16-bit halfword of the argument, rsrc1, to 32

bits and stores the result in rdest.

The sex16 operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of the guard is 1, rdest is written; otherwise, rdest is not changed.
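
A minimal C model of the sign extension described above is given here for illustration. The function name sex16_model is hypothetical, and the guard is not modeled.

```c
#include <stdint.h>

/* Model of sex16: sign extend rsrc1<15:0> to 32 bits. */
static uint32_t sex16_model(uint32_t rsrc1)
{
    uint32_t low = rsrc1 & 0xffffu;    /* keep the low halfword only */
    /* If bit 15 (the sign bit) is set, fill bits 31..16 with ones. */
    return (low & 0x8000u) ? (low | 0xffff0000u) : low;
}
```

The upper halfword of rsrc1 never influences the result, which is why 0xffff0040 extends to 0x00000040 in the examples.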

EXAMPLES

Initial Values Operation Result

r30 = 0xffff0040 sex16 r30 → r60 r60 ← 0x00000040

r10 = 0, r40 = 0xff0fff91 IF r10 sex16 r40 → r70 no change, since guard is false

r20 = 1, r40 = 0xff0fff91 IF r20 sex16 r40 → r100 r100 ← 0xffffff91

r50 = 0x00000091 sex16 r50 → r110 r110 ← 0x00000091

Figure: sex16 data flow. rsrc1<15:0> (signed) is copied to rdest<15:0>; the sign bit S (rsrc1<15>) is replicated into
rdest<31:16>.

SEE ALSO

zex16 sex8 zex8

sex16

Sign extend 8 bits

pseudo-op for ibytesel

SYNTAX

[ IF rguard ] sex8 rsrc1 → rdest

FUNCTION

if rguard then
    rdest ← sign_ext8to32(rsrc1<7:0>)

ATTRIBUTES

Function unit alu

Operation code 56

Number of operands 1

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The sex8 operation is a pseudo operation transformed by the scheduler into a ibytesel with rsrc1 as the first

argument and r0 (always contains 0) as the second. (Note: pseudo operations cannot be used in assembly source

files.)

As shown below, the sex8 operation sign extends the least-significant byte of the argument, rsrc1, to 32 bits

and writes the result in rdest.

The sex8 operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values Operation Result

r30 = 0xffff0040 sex8 r30 → r60 r60 ← 0x00000040

r10 = 0, r40 = 0xff0fff91 IF r10 sex8 r40 → r70 no change, since guard is false

r20 = 1, r40 = 0xff0fff91 IF r20 sex8 r40 → r100 r100 ← 0xffffff91

r50 = 0x00000091 sex8 r50 → r110 r110 ← 0xffffff91

Figure: sex8 data flow. rsrc1<7:0> (signed) is copied to rdest<7:0>; the sign bit S (rsrc1<7>) is replicated into
rdest<31:8>.

SEE ALSO

ibytesel sex16 zex8 zex16

sex8

16-bit store

pseudo-op for h_st16d(0)

SYNTAX

[ IF rguard ] st16 rsrc1 rsrc2

FUNCTION

if rguard then {
    if PCSW.bytesex = LITTLE_ENDIAN then
        bs ← 1
    else
        bs ← 0
    mem[rsrc1 + (1 ⊕ bs)] ← rsrc2<7:0>
    mem[rsrc1 + (0 ⊕ bs)] ← rsrc2<15:8>
}

ATTRIBUTES

Function unit dmem

Operation code 30

Number of operands 2

Modifier No

Modifier range —

Latency n/a

Issue slots 4, 5

DESCRIPTION

The st16 operation is a pseudo operation transformed by the scheduler into an h_st16d(0) with the same

arguments. (Note: pseudo operations cannot be used in assembly files.)

The st16 operation stores the least-significant 16-bit halfword of rsrc2 into the memory locations pointed to by the

address in rsrc1. This store operation is performed as little-endian or big-endian depending on the current setting of

the bytesex bit in the PCSW.

If st16 is misaligned (the memory address in rsrc1 is not a multiple of 2), the result of st16 is undefined, and the

MSE (Misaligned Store Exception) bit in the PCSW register is set to 1. Additionally, if the TRPMSE (TRaP on

Misaligned Store Exception) bit in PCSW is 1, exception processing will be requested on the next interruptible jump.

The result of an access by st16 to the MMIO address aperture is undefined; access to the MMIO aperture is

defined only for 32-bit loads and stores.

The st16 operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the addressed memory locations (and the modification of cache if the locations are cacheable). If the

LSB of rguard is 1, the store takes effect. If the LSB of rguard is 0, st16 has no side effects whatever; in particular, the

LRU and other status bits in the data cache are not affected.
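
The byte placement performed by st16 can be sketched in C as follows. The sketch reads the index expressions in the FUNCTION block as an exclusive-OR of the byte offset with bs, and it models memory as a flat byte array; the function name st16_model, the mem array, and the little_endian flag (standing in for the PCSW bytesex bit) are all hypothetical. Alignment checking, MSE, and the guard are not modeled.

```c
#include <stdint.h>

/* Model of st16: store rsrc2<15:0> at mem[addr..addr+1] in the selected byte order. */
static void st16_model(uint8_t *mem, uint32_t addr, uint32_t rsrc2, int little_endian)
{
    uint32_t bs = little_endian ? 1u : 0u;

    /* The offset of each byte is XORed with bs, which swaps the two
       locations when little-endian operation is selected. */
    mem[addr + (1u ^ bs)] = (uint8_t)(rsrc2 & 0xffu);         /* rsrc2<7:0>  */
    mem[addr + (0u ^ bs)] = (uint8_t)((rsrc2 >> 8) & 0xffu);  /* rsrc2<15:8> */
}
```

With little_endian equal to 0 the high byte lands at the lower address, which is the byte order shown in the examples of this section.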

EXAMPLES

Initial Values Operation Result

r10 = 0xd00, r80 = 0x44332211 st16 r10 r80 [0xd00] ← 0x22, [0xd01] ← 0x11

r50 = 0, r20 = 0xd01, r70 = 0xaabbccdd IF r50 st16 r20 r70 no change, since guard is false

r60 = 1, r30 = 0xd02, r70 = 0xaabbccdd IF r60 st16 r30 r70 [0xd02] ← 0xcc, [0xd03] ← 0xdd

SEE ALSO

st16d h_st16d st8 st8d

st32 st32d

st16

16-bit store with displacement

pseudo-op for h_st16d

SYNTAX

[ IF rguard ] st16d(d) rsrc1 rsrc2

FUNCTION

if rguard then {
    if PCSW.bytesex = LITTLE_ENDIAN then
        bs ← 1
    else
        bs ← 0
    mem[rsrc1 + d + (1 ⊕ bs)] ← rsrc2<7:0>
    mem[rsrc1 + d + (0 ⊕ bs)] ← rsrc2<15:8>
}

ATTRIBUTES

Function unit dmem

Operation code 30

Number of operands 2

Modifier 7 bits

Modifier range –128..126 by 2

Latency n/a

Issue slots 4, 5

DESCRIPTION

The st16d operation is a pseudo operation transformed by the scheduler into an h_st16d with the same

arguments. (Note: pseudo operations cannot be used in assembly files.)

The st16d operation stores the least-significant 16-bit halfword of rsrc2 into the memory locations pointed to by the

address in rsrc1 + d. The d value is an opcode modifier, must be in the range –128 to 126 inclusive, and must be a

multiple of 2. This store operation is performed as little-endian or big-endian depending on the current setting of the

bytesex bit in the PCSW.

If st16d is misaligned (the memory address computed by rsrc1 + d is not a multiple of 2), the result of st16d is

undefined, and the MSE (Misaligned Store Exception) bit in the PCSW register is set to 1. Additionally, if the TRPMSE

(TRaP on Misaligned Store Exception) bit in PCSW is 1, exception processing will be requested on the next

interruptible jump.

The result of an access by st16d to the MMIO address aperture is undefined; access to the MMIO aperture is

defined only for 32-bit loads and stores.

The st16d operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the addressed memory locations (and the modification of cache if the locations are cacheable). If the

LSB of rguard is 1, the store takes effect. If the LSB of rguard is 0, st16d has no side effects whatever; in particular,

the LRU and other status bits in the data cache are not affected.

EXAMPLES

Initial Values Operation Result

r10 = 0xcfe, r80 = 0x44332211 st16d(2) r10 r80 [0xd00] ← 0x22, [0xd01] ← 0x11

r50 = 0, r20 = 0xd05, r70 = 0xaabbccdd IF r50 st16d(–4) r20 r70 no change, since guard is false

r60 = 1, r30 = 0xd06, r70 = 0xaabbccdd IF r60 st16d(–4) r30 r70 [0xd02] ← 0xcc, [0xd03] ← 0xdd

SEE ALSO

st16 h_st16d st8 st8d st32

st32d

st16d

32-bit store

pseudo-op for h_st32d(0)

SYNTAX

[ IF rguard ] st32 rsrc1 rsrc2

FUNCTION

if rguard then {
    if PCSW.bytesex = LITTLE_ENDIAN then
        bs ← 3
    else
        bs ← 0
    mem[rsrc1 + (3 ⊕ bs)] ← rsrc2<7:0>
    mem[rsrc1 + (2 ⊕ bs)] ← rsrc2<15:8>
    mem[rsrc1 + (1 ⊕ bs)] ← rsrc2<23:16>
    mem[rsrc1 + (0 ⊕ bs)] ← rsrc2<31:24>
}

ATTRIBUTES

Function unit dmem

Operation code 31

Number of operands 2

Modifier No

Modifier range —

Latency n/a

Issue slots 4, 5

DESCRIPTION

The st32 operation is a pseudo operation transformed by the scheduler into an h_st32d(0) with the same

arguments. (Note: pseudo operations cannot be used in assembly files.)

The st32 operation stores all 32 bits of rsrc2 into the memory locations pointed to by the address in rsrc1. This store

operation is performed as little-endian or big-endian depending on the current setting of the bytesex bit in the PCSW.

If st32 is misaligned (the memory address in rsrc1 is not a multiple of 4), the result of st32 is undefined, and the

MSE (Misaligned Store Exception) bit in the PCSW register is set to 1. Additionally, if the TRPMSE (TRaP on

Misaligned Store Exception) bit in PCSW is 1, exception processing will be requested on the next interruptible jump.

The st32 operation can be used to access the MMIO address aperture (the result of MMIO access by 8- or 16-bit

memory operations is undefined). The state of the BSX bit in the PCSW has no effect on MMIO access by st32.

The st32 operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the addressed memory locations (and the modification of cache if the locations are cacheable). If the

LSB of rguard is 1, the store takes effect. If the LSB of rguard is 0, st32 has no side effects whatever; in particular, the

LRU and other status bits in the data cache are not affected.

EXAMPLES

Initial Values Operation Result

r10 = 0xd00, r80 = 0x44332211 st32 r10 r80 [0xd00] ← 0x44, [0xd01] ← 0x33, [0xd02] ← 0x22, [0xd03] ← 0x11

r50 = 0, r20 = 0xd01, r70 = 0xaabbccdd IF r50 st32 r20 r70 no change, since guard is false

r60 = 1, r30 = 0xd04, r70 = 0xaabbccdd IF r60 st32 r30 r70 [0xd04] ← 0xaa, [0xd05] ← 0xbb, [0xd06] ← 0xcc, [0xd07] ← 0xdd

SEE ALSO

h_st32d st32d st16 st16d

st8 st8d

st32

32-bit store with displacement

pseudo-op for h_st32d

SYNTAX

[ IF rguard ] st32d(d) rsrc1 rsrc2

FUNCTION

if rguard then {
    if PCSW.bytesex = LITTLE_ENDIAN then
        bs ← 3
    else
        bs ← 0
    mem[rsrc1 + d + (3 ⊕ bs)] ← rsrc2<7:0>
    mem[rsrc1 + d + (2 ⊕ bs)] ← rsrc2<15:8>
    mem[rsrc1 + d + (1 ⊕ bs)] ← rsrc2<23:16>
    mem[rsrc1 + d + (0 ⊕ bs)] ← rsrc2<31:24>
}

ATTRIBUTES

Function unit dmem

Operation code 31

Number of operands 2

Modifier 7 bits

Modifier range –256..252 by 4

Latency n/a

Issue slots 4, 5

DESCRIPTION

The st32d operation is a pseudo operation transformed by the scheduler into an h_st32d with the same

arguments. (Note: pseudo operations cannot be used in assembly files.)

The st32d operation stores all 32 bits of rsrc2 into the memory locations pointed to by the address in rsrc1 + d.

The d value is an opcode modifier, must be in the range –256 to 252 inclusive, and must be a multiple of 4. This

store operation is performed as little-endian or big-endian depending on the current setting of the bytesex bit in the

PCSW.

If st32d is misaligned (the memory address computed by rsrc1 + d is not a multiple of 4), the result of st32d is

undefined, and the MSE (Misaligned Store Exception) bit in the PCSW register is set to 1. Additionally, if the TRPMSE

(TRaP on Misaligned Store Exception) bit in PCSW is 1, exception processing will be requested on the next

interruptible jump.

The st32d operation can be used to access the MMIO address aperture (the result of MMIO access by 8- or 16-bit

memory operations is undefined). The state of the BSX bit in the PCSW has no effect on MMIO access by st32d.

The st32d operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the addressed memory locations (and the modification of cache if the locations are cacheable). If the

LSB of rguard is 1, the store takes effect. If the LSB of rguard is 0, st32d has no side effects whatever; in particular,

the LRU and other status bits in the data cache are not affected.
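
The constraints on d and on alignment, together with the byte ordering, can be sketched in C as below. The sketch reads the index expressions in the FUNCTION block as an exclusive-OR of the byte offset with bs. The function names, the flat mem array, and the little_endian flag (standing in for the PCSW bytesex bit) are hypothetical. Returning 0 on misalignment is a choice made for this sketch only; on the processor the result is undefined and MSE is set. MMIO and the guard are not modeled.

```c
#include <stdint.h>

/* Nonzero if d is a legal st32d modifier: -256..252 and a multiple of 4. */
static int st32d_disp_ok(int32_t d)
{
    return d >= -256 && d <= 252 && (d % 4) == 0;
}

/* Model of st32d on a flat byte array. Returns 1 if the store was performed,
   0 if rsrc1 + d is not a multiple of 4 (nothing is written in that case). */
static int st32d_model(uint8_t *mem, uint32_t rsrc1, int32_t d,
                       uint32_t rsrc2, int little_endian)
{
    uint32_t addr = rsrc1 + (uint32_t)d;
    uint32_t bs = little_endian ? 3u : 0u;

    if (addr & 3u) {
        return 0;                       /* misaligned */
    }
    for (uint32_t i = 0; i < 4; i++) {
        /* Byte i, counted from the most-significant end of rsrc2,
           goes to offset i XOR bs. */
        mem[addr + (i ^ bs)] = (uint8_t)(rsrc2 >> (24u - 8u * i));
    }
    return 1;
}
```

Calling st32d_model(mem, 0xc, -8, 0xaabbccdd, 0) writes 0xaa, 0xbb, 0xcc, 0xdd to offsets 4 through 7 of mem, the same pattern as the last example row relative to its base address.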

EXAMPLES

Initial Values Operation Result

r10 = 0xcfc, r80 = 0x44332211 st32d(4) r10 r80 [0xd00] ← 0x44, [0xd01] ← 0x33, [0xd02] ← 0x22, [0xd03] ← 0x11

r50 = 0, r20 = 0xd0b, r70 = 0xaabbccdd IF r50 st32d(–8) r20 r70 no change, since guard is false

r60 = 1, r30 = 0xd0c, r70 = 0xaabbccdd IF r60 st32d(–8) r30 r70 [0xd04] ← 0xaa, [0xd05] ← 0xbb, [0xd06] ← 0xcc, [0xd07] ← 0xdd

SEE ALSO

h_st32d st32 st16 st16d

st8 st8d

st32d

8-bit store

pseudo-op for h_st8d(0)

SYNTAX

[ IF rguard ] st8 rsrc1 rsrc2

FUNCTION

if rguard then
    mem[rsrc1] ← rsrc2<7:0>

ATTRIBUTES

Function unit dmem

Operation code 29

Number of operands 2

Modifier No

Modifier range —

Latency n/a

Issue slots 4, 5

DESCRIPTION

The st8 operation is a pseudo operation transformed by the scheduler into an h_st8d(0) with the same

arguments. (Note: pseudo operations cannot be used in assembly files.)

The st8 operation stores the least-significant 8-bit byte of rsrc2 into the memory location pointed to by the address

in rsrc1. This operation does not depend on the bytesex bit in the PCSW since only a single byte is stored.

The result of an access by st8 to the MMIO address aperture is undefined; access to the MMIO aperture is defined

only for 32-bit loads and stores.

The st8 operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the addressed memory location (and the modification of cache if the location is cacheable). If the LSB

of rguard is 1, the store takes effect. If the LSB of rguard is 0, st8 has no side effects whatever; in particular, the LRU

and other status bits in the data cache are not affected.

EXAMPLES

Initial Values Operation Result

r10 = 0xd00, r80 = 0x44332211 st8 r10 r80 [0xd00] ← 0x11

r50 = 0, r20 = 0xd01, r70 = 0xaabbccdd IF r50 st8 r20 r70 no change, since guard is false

r60 = 1, r30 = 0xd02, r70 = 0xaabbccdd IF r60 st8 r30 r70 [0xd02] ← 0xdd

SEE ALSO

h_st8d st8d st16 st16d

st32 st32d

st8

8-bit store with displacement

pseudo-op for h_st8d

SYNTAX

[ IF rguard ] st8d(d) rsrc1 rsrc2

FUNCTION

if rguard then
    mem[rsrc1 + d] ← rsrc2<7:0>

ATTRIBUTES

Function unit dmem

Operation code 29

Number of operands 2

Modifier 7 bits

Modifier range –64..63

Latency n/a

Issue slots 4, 5

DESCRIPTION

The st8d operation is a pseudo operation transformed by the scheduler into an h_st8d with the same

arguments. (Note: pseudo operations cannot be used in assembly files.)

The st8d operation stores the least-significant 8-bit byte of rsrc2 into the memory location pointed to by the

address formed from the sum rsrc1 + d. The value of the opcode modifier d must be in the range –64 to 63 inclusive.

This operation does not depend on the bytesex bit in the PCSW since only a single byte is stored.

The result of an access by st8d to the MMIO address aperture is undefined; access to the MMIO aperture is

defined only for 32-bit loads and stores.

The st8d operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the addressed memory location (and the modification of cache if the location is cacheable). If the LSB

of rguard is 1, the store takes effect. If the LSB of rguard is 0, st8d has no side effects whatever; in particular, the LRU

and other status bits in the data cache are not affected.

EXAMPLES

Initial Values Operation Result

r10 = 0xd00, r80 = 0x44332211 st8d(3) r10 r80 [0xd03] ← 0x11

r50 = 0, r20 = 0xd01, r70 = 0xaabbccdd IF r50 st8d(–4) r20 r70 no change, since guard is false

r60 = 1, r30 = 0xd02, r70 = 0xaabbccdd IF r60 st8d(–4) r30 r70 [0xcfe] ← 0xdd

SEE ALSO

h_st8d st8 st16 st16d st32

st32d

st8d

Select unsigned byte

SYNTAX

[ IF rguard ] ubytesel rsrc1 rsrc2 → rdest

FUNCTION

if rguard then {
    if rsrc2 = 0 then
        rdest ← zero_ext8to32(rsrc1<7:0>)
    else if rsrc2 = 1 then
        rdest ← zero_ext8to32(rsrc1<15:8>)
    else if rsrc2 = 2 then
        rdest ← zero_ext8to32(rsrc1<23:16>)
    else if rsrc2 = 3 then
        rdest ← zero_ext8to32(rsrc1<31:24>)
}

ATTRIBUTES

Function unit alu

Operation code 55

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

As shown below, the ubytesel operation selects one byte from the argument, rsrc1, zero-extends the byte to 32

bits, and stores the result in rdest. The value of rsrc2 determines which byte is selected, with rsrc2=0 selecting the

LSB of rsrc1 and rsrc2=3 selecting the MSB of rsrc1. If rsrc2 is not between 0 and 3 inclusive, the result of

ubytesel is undefined.

The ubytesel operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.
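
The byte selection can be modeled in C with a shift and a mask, as in the sketch below. The function name ubytesel_model is hypothetical. For rsrc2 outside 0..3 the operation's result is undefined; the sketch masks rsrc2 to two bits only to keep the C shift well defined, which is an arbitrary choice and not documented behavior.

```c
#include <stdint.h>

/* Model of ubytesel: select byte rsrc2 (0 = least significant) of rsrc1
   and zero-extend it to 32 bits. */
static uint32_t ubytesel_model(uint32_t rsrc1, uint32_t rsrc2)
{
    uint32_t shift = 8u * (rsrc2 & 3u);   /* 0, 8, 16 or 24 */
    return (rsrc1 >> shift) & 0xffu;
}
```

Because the result is masked to eight bits, a selected byte with its top bit set (for example 0xcc) is never sign extended.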

EXAMPLES

Initial Values Operation Result

r30 = 0x44332211, r40 = 1 ubytesel r30 r40 → r50 r50 ← 0x00000022

r10 = 0, r60 = 0xddccbbaa, r70 = 2 IF r10 ubytesel r60 r70 → r80 no change, since guard is false

r20 = 1, r60 = 0xddccbbaa, r70 = 2 IF r20 ubytesel r60 r70 → r90 r90 ← 0x000000cc

r100 = 0xffffff7f, r110 = 0 ubytesel r100 r110 → r120 r120 ← 0x0000007f

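Figure: ubytesel data flow. rsrc2 (0..3) selects one of the four unsigned bytes of rsrc1 (byte 3 is rsrc1<31:24>, byte 0
is rsrc1<7:0>); the selected unsigned byte is written to rdest<7:0> and rdest<31:8> is filled with zeros.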

SEE ALSO

ibytesel sex8 packbytes

ubytesel

Clip signed to unsigned

SYNTAX

[ IF rguard ] uclipi rsrc1 rsrc2 → rdest

FUNCTION

if rguard then
    rdest ← min(max(rsrc1, 0), rsrc2)

ATTRIBUTES

Function unit dspalu

Operation code 75

Number of operands 2

Modifier No

Modifier range —

Latency 2

Issue slots 1, 3

DESCRIPTION

The uclipi operation returns the value of rsrc1 clipped into the unsigned integer range 0 to rsrc2, inclusive. The

argument rsrc1 is considered a signed integer; rsrc2 is considered an unsigned integer.

The uclipi operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.
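
The mixed signedness of the two arguments is the part of uclipi that is easiest to get wrong, so a small C model may help. The function name uclipi_model is hypothetical, and the guard is not modeled.

```c
#include <stdint.h>

/* Model of uclipi: clip signed rsrc1 into the unsigned range 0..rsrc2. */
static uint32_t uclipi_model(int32_t rsrc1, uint32_t rsrc2)
{
    uint32_t v;

    if (rsrc1 < 0) {
        return 0;                 /* negative inputs clip to the lower bound */
    }
    v = (uint32_t)rsrc1;          /* safe: rsrc1 is non-negative here */
    return v > rsrc2 ? rsrc2 : v;
}
```

An input of 0x80000000 is the most negative signed value and therefore clips to 0, as in the last row of the examples.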

EXAMPLES

Initial Values Operation Result

r30 = 0x80, r40 = 0x7f uclipi r30 r40 → r50 r50 ← 0x7f

r10 = 0, r60 = 0x12345678, r70 = 0xabc IF r10 uclipi r60 r70 → r80 no change, since guard is false

r20 = 1, r60 = 0x12345678, r70 = 0xabc IF r20 uclipi r60 r70 → r90 r90 ← 0xabc

r100 = 0x80000000, r110 = 0x3fffff uclipi r100 r110 → r120 r120 ← 0

SEE ALSO

iclipi uclipu imin imax

uclipi

Clip unsigned to unsigned

SYNTAX

[ IF rguard ] uclipu rsrc1 rsrc2 → rdest

FUNCTION

if rguard then {
    if rsrc1 > rsrc2 then
        rdest ← rsrc2
    else
        rdest ← rsrc1
}

ATTRIBUTES

Function unit dspalu

Operation code 76

Number of operands 2

Modifier No

Modifier range —

Latency 2

Issue slots 1, 3

DESCRIPTION

The uclipu operation returns the value of rsrc1 clipped into the unsigned integer range 0 to rsrc2, inclusive. The

arguments rsrc1 and rsrc2 are considered unsigned integers.

The uclipu operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.
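
For comparison with the signed-input variant, uclipu reduces to an unsigned minimum. The C sketch below is illustrative only; the function name uclipu_model is hypothetical.

```c
#include <stdint.h>

/* Model of uclipu: clip unsigned rsrc1 into the unsigned range 0..rsrc2. */
static uint32_t uclipu_model(uint32_t rsrc1, uint32_t rsrc2)
{
    /* Both arguments are unsigned, so only the upper bound can clip. */
    return rsrc1 > rsrc2 ? rsrc2 : rsrc1;
}
```

Here 0x80000000 is a large positive value, so it clips to the upper bound rsrc2 rather than to 0.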

EXAMPLES

Initial Values Operation Result

r30 = 0x80, r40 = 0x7f uclipu r30 r40 → r50 r50 ← 0x7f

r10 = 0, r60 = 0x12345678, r70 = 0xabc IF r10 uclipu r60 r70 → r80 no change, since guard is false

r20 = 1, r60 = 0x12345678, r70 = 0xabc IF r20 uclipu r60 r70 → r90 r90 ← 0xabc

r100 = 0x80000000, r110 = 0x3fffff uclipu r100 r110 → r120 r120 ← 0x3fffff

SEE ALSO

iclipi uclipi imin imax

uclipu

Unsigned compare equal

pseudo-op for ieql

SYNTAX

[ IF rguard ] ueql rsrc1 rsrc2 → rdest

FUNCTION

if rguard then {
    if rsrc1 = rsrc2 then
        rdest ← 1
    else
        rdest ← 0
}

ATTRIBUTES

Function unit alu

Operation code 37

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The ueql operation is a pseudo operation transformed by the scheduler into an ieql with the same arguments.

(Note: pseudo operations cannot be used in assembly files.)

The ueql operation sets the destination register, rdest, to 1 if the first argument, rsrc1, is equal to the second

argument, rsrc2; otherwise, rdest is set to 0. The arguments are treated as unsigned integers.

The ueql operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values Operation Result

r30 = 3, r40 = 4 ueql r30 r40 → r80 r80 ← 0

r10 = 0, r60 = 0x100, r30 = 3 IF r10 ueql r60 r30 → r50 no change, since guard is false

r20 = 1, r50 = 0x1000, r60 = 0x1000 IF r20 ueql r50 r60 → r90 r90 ← 1

r70 = 0x80000000, r40 = 4 ueql r70 r40 → r100 r100 ← 0

r70 = 0x80000000 ueql r70 r70 → r110 r110 ← 1

SEE ALSO

ieql ueqli igeq uneq

ueql

Unsigned compare equal with immediate

SYNTAX

[ IF rguard ] ueqli(n) rsrc1 → rdest

FUNCTION

if rguard then {
    if rsrc1 = n then
        rdest ← 1
    else
        rdest ← 0
}

ATTRIBUTES

Function unit alu

Operation code 38

Number of operands 1

Modifier 7 bits

Modifier range 0..127

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The ueqli operation sets the destination register, rdest, to 1 if the first argument, rsrc1, is equal to the opcode

modifier, n; otherwise, rdest is set to 0. The arguments are treated as unsigned integers.

The ueqli operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values Operation Result

r30 = 3 ueqli(2) r30 → r80 r80 ← 0

r30 = 3 ueqli(3) r30 → r90 r90 ← 1

r30 = 3 ueqli(4) r30 → r100 r100 ← 0

r10 = 0, r40 = 0x100 IF r10 ueqli(63) r40 → r50 no change, since guard is false

r20 = 1, r40 = 0x100 IF r20 ueqli(63) r40 → r100 r100 ← 0

r60 = 0x07f ueqli(127) r60 → r120 r120 ← 1

SEE ALSO

ieqli ueql igeqi uneqi

ueqli

Sum of products of unsigned 16-bit halfwords

SYNTAX

[ IF rguard ] ufir16 rsrc1 rsrc2 → rdest

FUNCTION

if rguard then
    rdest ← zero_ext16to32(rsrc1<31:16>) × zero_ext16to32(rsrc2<31:16>) +
            zero_ext16to32(rsrc1<15:0>) × zero_ext16to32(rsrc2<15:0>)

ATTRIBUTES

Function unit dspmul

Operation code 94

Number of operands 2

Modifier No

Modifier range —

Latency 3

Issue slots 2, 3

DESCRIPTION

As shown below, the ufir16 operation computes two separate products of the two pairs of corresponding 16-bit

halfwords of rsrc1 and rsrc2; the two products are summed, and the result is written to rdest. All halfwords are

considered unsigned; thus, the intermediate products and the final sum of products are unsigned. All intermediate

computations are performed without loss of precision; the final sum of products is clipped into the range [0..0xffffffff]

before being written into rdest.

The ufir16 operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.
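The sum of products and the final clipping step can be sketched in C as follows. This is an illustrative model only; the function names are hypothetical, and a 64-bit accumulator stands in for the full-precision 33-bit intermediate result.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of ufir16: two unsigned 16x16 products, summed at
 * full precision, then clipped to the unsigned 32-bit range. */
uint32_t ufir16_model(uint32_t rsrc1, uint32_t rsrc2)
{
    uint64_t hi  = (uint64_t)(rsrc1 >> 16) * (rsrc2 >> 16);
    uint64_t lo  = (uint64_t)(rsrc1 & 0xffffu) * (rsrc2 & 0xffffu);
    uint64_t sum = hi + lo;                 /* needs at most 33 bits */

    return (sum > 0xffffffffu) ? 0xffffffffu : (uint32_t)sum;
}

/* Replays rows of the EXAMPLES table for ufir16. */
void ufir16_check_examples(void)
{
    assert(ufir16_model(0x00020003u, 0x00010002u) == 8u);
    assert(ufir16_model(0x80000064u, 0x00648000u) == 0x00640000u);
    assert(ufir16_model(0x00020003u, 0x00648000u) == 0x000180c8u);
}
```

Clipping only takes effect when both products are large; for example, two inputs of 0xffffffff give a 33-bit sum that saturates to 0xffffffff.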

EXAMPLES

Initial Values Operation Result

r30 = 0x00020003, r40 = 0x00010002 ufir16 r30 r40 → r50 r50 ← 8

r10 = 0, r60 = 0x80000064, r70 = 0x00648000 IF r10 ufir16 r60 r70 → r80 no change, since guard is false

r20 = 1, r60 = 0x80000064, r70 = 0x00648000 IF r20 ufir16 r60 r70 → r90 r90 ← 0x00640000

r30 = 0x00020003, r70 = 0x00648000 ufir16 r30 r70 → r100 r100 ← 0x000180c8

[Figure: the upper (bits 31..16) and lower (bits 15..0) unsigned halfwords of rsrc1 and rsrc2 are multiplied pairwise; the full-precision 33-bit unsigned sum is clipped to [0..2^32–1] and written to rdest.]

SEE ALSO

ifir16 ifir8ii ifir8ui

ufir8uu

ufir16

Unsigned sum of products of unsigned bytes

SYNTAX

[ IF rguard ] ufir8uu rsrc1 rsrc2 → rdest

FUNCTION

if rguard then

rdest ← zero_ext8to32(rsrc1<31:24>) × zero_ext8to32(rsrc2<31:24>) +

zero_ext8to32(rsrc1<23:16>) × zero_ext8to32(rsrc2<23:16>) +

zero_ext8to32(rsrc1<15:8>) × zero_ext8to32(rsrc2<15:8>) +

zero_ext8to32(rsrc1<7:0>) × zero_ext8to32(rsrc2<7:0>)

ATTRIBUTES

Function unit dspmul

Operation code 90

Number of operands 2

Modifier No

Modifier range —

Latency 3

Issue slots 2, 3

DESCRIPTION

As shown below, the ufir8uu operation computes four separate products of the four pairs of corresponding 8-bit

bytes of rsrc1 and rsrc2; the four products are summed, and the result is written to rdest. All values are considered

unsigned. All computations are performed without loss of precision.

The ufir8uu operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.
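A minimal C sketch of the four-way byte sum of products follows. The function names are hypothetical. No clipping is needed because the largest possible sum, 4 × 255 × 255, fits easily in 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of ufir8uu: multiply corresponding unsigned bytes
 * of the two source words and sum the four products. */
uint32_t ufir8uu_model(uint32_t rsrc1, uint32_t rsrc2)
{
    uint32_t sum = 0u;

    for (int shift = 0; shift < 32; shift += 8) {
        uint32_t a = (rsrc1 >> shift) & 0xffu;
        uint32_t b = (rsrc2 >> shift) & 0xffu;
        sum += a * b;
    }
    return sum;
}

/* Replays rows of the EXAMPLES table for ufir8uu. */
void ufir8uu_check_examples(void)
{
    assert(ufir8uu_model(0x0afb14f6u, 0x0a0a1414u) == 0x1efau);
    assert(ufir8uu_model(0x649c649cu, 0x9c649c64u) == 0xf3c0u);
    assert(ufir8uu_model(0x80808080u, 0xffffffffu) == 0x1fe00u);
}
```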

EXAMPLES

Initial Values Operation Result

r70 = 0x0afb14f6, r30 = 0x0a0a1414 ufir8uu r70 r30 → r90 r90 ← 0x1efa

r10 = 0, r70 = 0x0afb14f6, r30 = 0x0a0a1414 IF r10 ufir8uu r70 r30 → r100 no change, since guard is false

r20 = 1, r80 = 0x649c649c, r40 = 0x9c649c64 IF r20 ufir8uu r80 r40 → r110 r110 ← 0xf3c0

r50 = 0x80808080, r60 = 0xffffffff ufir8uu r50 r60 → r120 r120 ← 0x1fe00

[Figure: the four unsigned bytes of rsrc1 and rsrc2 are multiplied pairwise; the four products are summed into an unsigned 32-bit result written to rdest.]

SEE ALSO

ifir8ui ifir8ii ifir16

ufir16

ufir8uu

Convert floating-point to unsigned integer using

PCSW rounding mode

SYNTAX

[ IF rguard ] ufixieee rsrc1 → rdest

FUNCTION

if rguard then {

rdest ← (unsigned long) ((float)rsrc1)

}

ATTRIBUTES

Function unit falu

Operation code 123

Number of operands 1

Modifier No

Modifier range —

Latency 3

Issue slots 1, 4

DESCRIPTION

The ufixieee operation converts the single-precision IEEE floating-point value in rsrc1 to an unsigned integer

and writes the result into rdest. Rounding is according to the IEEE rounding mode bits in PCSW. If rsrc1 is

denormalized, zero is substituted before conversion, and the IFZ flag in the PCSW is set. If ufixieee causes an

IEEE exception, such as overflow or underflow, the corresponding exception flags in the PCSW are set. The PCSW

exception flags are sticky: the flags can be set as a side-effect of any floating-point operation but can only be reset by

an explicit writepcsw operation. The update of the PCSW exception flags occurs at the same time as rdest is

written. If any other floating-point compute operations update the PCSW at the same time, the net result in each

exception flag is the logical OR of all simultaneous updates ORed with the existing PCSW value for that exception

flag.

The ufixieeeflags operation computes the exception flags that would result from an individual ufixieee.

The ufixieee operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest and the exception flags in PCSW are written;

otherwise, rdest is not changed and the operation does not affect the exception flags in PCSW.

EXAMPLES

Initial Values Operation Result

r30 = 0x40400000 (3.0) ufixieee r30 → r100 r100 ← 3

r35 = 0x40247ae1 (2.57) ufixieee r35 → r102 r102 ← 3, INX flag set

r10 = 0, r40 = 0xff4fffff (–3.402823466e+38) IF r10 ufixieee r40 → r105 no change, since guard is false

r20 = 1, r40 = 0xff4fffff (–3.402823466e+38) IF r20 ufixieee r40 → r110 r110 ← 0x0, INV flag set

r45 = 0x7f800000 (+INF) ufixieee r45 → r112 r112 ← 0xffffffff (2^32–1), INV flag set

r50 = 0xbfc147ae (-1.51) ufixieee r50 → r115 r115 ← 0, INV flag set

r60 = 0x00400000 (5.877471754e-39) ufixieee r60 → r117 r117 ← 0, IFZ set

r70 = 0xffffffff (QNaN) ufixieee r70 → r120 r120 ← 0, INV flag set

r80 = 0xffbfffff (SNaN) ufixieee r80 → r122 r122 ← 0, INV flag set

SEE ALSO

ifixieee ifixrz ufixrz

ufixieee

IEEE status flags from convert floating-point to

unsigned integer using PCSW rounding mode

SYNTAX

[ IF rguard ] ufixieeeflags rsrc1 → rdest

FUNCTION

if rguard then

rdest ← ieee_flags((unsigned long) ((float)rsrc1))

ATTRIBUTES

Function unit falu

Operation code 124

Number of operands 1

Modifier No

Modifier range —

Latency 3

Issue slots 1, 4

DESCRIPTION

The ufixieeeflags operation computes the IEEE exceptions that would result from converting the single-

precision IEEE floating-point value in rsrc1 to an unsigned integer, and an integer bit vector representing the

computed exception flags is written into rdest. The bit vector stored in rdest has the same format as the IEEE

exception bits in the PCSW. The exception flags in PCSW are left unchanged by this operation. Rounding is

according to the IEEE rounding mode bits in PCSW. If an argument is denormalized, zero is substituted before

computing the conversion, and the IFZ bit in the result is set.

The ufixieeeflags operation optionally takes a guard, specified in rguard. If a guard is present, its LSB

controls the modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not

changed.

EXAMPLES

Initial Values Operation Result

r30 = 0x40400000 (3.0) ufixieeeflags r30 → r100 r100 ← 0

r35 = 0x40247ae1 (2.57) ufixieeeflags r35 → r102 r102 ← 0x02 (INX)

r10 = 0, r40 = 0xff4fffff (–3.402823466e+38) IF r10 ufixieeeflags r40 → r105 no change, since guard is false

r20 = 1, r40 = 0xff4fffff (–3.402823466e+38) IF r20 ufixieeeflags r40 → r110 r110 ← 0x10 (INV)

r45 = 0x7f800000 (+INF) ufixieeeflags r45 → r112 r112 ← 0x10 (INV)

r50 = 0xbfc147ae (-1.51) ufixieeeflags r50 → r115 r115 ← 0x10 (INV)

r60 = 0x00400000 (5.877471754e-39) ufixieeeflags r60 → r117 r117 ← 0x20 (IFZ)

r70 = 0xffffffff (QNaN) ufixieeeflags r70 → r120 r120 ← 0x10 (INV)

r80 = 0xffbfffff (SNaN) ufixieeeflags r80 → r122 r122 ← 0x10 (INV)

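[Figure: flag positions in the result bit vector: OFZ = bit 6, IFZ = bit 5, INV = bit 4, OVF = bit 3, UNF = bit 2, INX = bit 1, DBZ = bit 0; bits 31..7 carry no flag label.]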
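The flag values in the examples (INX = 0x02, INV = 0x10, IFZ = 0x20), together with the order of the flag names in the figure, suggest the bit assignment sketched below in C. The constant and function names are hypothetical, and the positions of OFZ, OVF, UNF and DBZ are inferred from the figure rather than confirmed by an example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed bit positions of the IEEE exception flags in rdest. */
enum {
    FLAG_DBZ = 1u << 0,
    FLAG_INX = 1u << 1,
    FLAG_UNF = 1u << 2,
    FLAG_OVF = 1u << 3,
    FLAG_INV = 1u << 4,
    FLAG_IFZ = 1u << 5,
    FLAG_OFZ = 1u << 6
};

static const char *const flag_names[7] = {
    "DBZ", "INX", "UNF", "OVF", "INV", "IFZ", "OFZ"
};

/* Returns the name of the lowest-numbered flag set in flags, or "none". */
const char *lowest_flag_name(uint32_t flags)
{
    for (unsigned bit = 0; bit < 7; bit++) {
        if (flags & (1u << bit)) {
            return flag_names[bit];
        }
    }
    return "none";
}

/* Checks the three flag values that appear in the EXAMPLES tables. */
void flags_check_examples(void)
{
    assert(FLAG_INX == 0x02);
    assert(FLAG_INV == 0x10);
    assert(FLAG_IFZ == 0x20);
    assert(strcmp(lowest_flag_name(0x10u), "INV") == 0);
}
```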

SEE ALSO

ufixieee ifixieeeflags

ifixrzflags ufixrzflags

ufixieeeflags

Convert floating-point to unsigned integer with

round toward zero

SYNTAX

[ IF rguard ] ufixrz rsrc1 → rdest

FUNCTION

if rguard then {

rdest ← (unsigned long) ((float)rsrc1)

}

ATTRIBUTES

Function unit falu

Operation code 125

Number of operands 1

Modifier No

Modifier range —

Latency 3

Issue slots 1, 4

DESCRIPTION

The ufixrz operation converts the single-precision IEEE floating-point value in rsrc1 to an unsigned integer and

writes the result into rdest. Rounding toward zero is performed; the IEEE rounding mode bits in PCSW are ignored.

This is the preferred rounding mode for ANSI C. If rsrc1 is denormalized, zero is substituted before conversion, and

the IFZ flag in the PCSW is set. If ufixrz causes an IEEE exception, such as overflow or underflow, the

corresponding exception flags in the PCSW are set. The PCSW exception flags are sticky: the flags can be set as a

side-effect of any floating-point operation but can only be reset by an explicit writepcsw operation. The update of

the PCSW exception flags occurs at the same time as rdest is written. If any other floating-point compute operations

update the PCSW at the same time, the net result in each exception flag is the logical OR of all simultaneous updates

ORed with the existing PCSW value for that exception flag.

The ufixrzflags operation computes the exception flags that would result from an individual ufixrz.

The ufixrz operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest and the exception flags in PCSW are written;

otherwise, rdest is not changed and the operation does not affect the exception flags in PCSW.
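The conversion rules can be collected into a small C model, given here as a sketch rather than a definitive reference. The type and function names are hypothetical. The flag values follow the ufixrzflags examples. Two behaviours are assumptions that no example covers: finite inputs of 2^32 or more are treated like +INF, and inputs strictly between –1 and 0 are truncated to 0 with only INX set.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define F_INX 0x02u   /* inexact result     */
#define F_INV 0x10u   /* invalid operation  */
#define F_IFZ 0x20u   /* denormal input replaced by zero */

typedef struct {
    uint32_t value;   /* what would be written to rdest    */
    uint32_t flags;   /* exception flags that would be set */
} fix_result;

/* Hypothetical model of ufixrz acting on the raw bits of rsrc1. */
fix_result ufixrz_model(uint32_t bits)
{
    fix_result r = { 0u, 0u };
    uint32_t exp = (bits >> 23) & 0xffu;
    uint32_t man = bits & 0x7fffffu;
    int negative = (bits >> 31) != 0u;
    float f;

    if (exp == 0xffu) {                     /* infinity or NaN */
        r.flags = F_INV;
        r.value = (man == 0u && !negative) ? 0xffffffffu : 0u;
        return r;
    }
    if (exp == 0u) {                        /* zero or denormal */
        if (man != 0u) {
            r.flags = F_IFZ;
        }
        return r;
    }

    memcpy(&f, &bits, sizeof f);            /* reinterpret bits as float */
    if (f <= -1.0f) {                       /* negative: not representable */
        r.flags = F_INV;
        return r;
    }
    if (f >= 4294967296.0f) {               /* assumption: same as +INF */
        r.flags = F_INV;
        r.value = 0xffffffffu;
        return r;
    }
    r.value = (uint32_t)f;                  /* C conversion truncates toward zero */
    if ((float)r.value != f) {
        r.flags = F_INX;
    }
    return r;
}

/* Replays rows of the EXAMPLES tables for ufixrz and ufixrzflags. */
void ufixrz_check_examples(void)
{
    assert(ufixrz_model(0x40400000u).value == 3u);
    assert(ufixrz_model(0x40400000u).flags == 0u);
    assert(ufixrz_model(0x40247ae1u).flags == F_INX);
    assert(ufixrz_model(0xff4fffffu).flags == F_INV);
    assert(ufixrz_model(0x7f800000u).value == 0xffffffffu);
    assert(ufixrz_model(0xbfc147aeu).flags == F_INV);
    assert(ufixrz_model(0x00400000u).flags == F_IFZ);
    assert(ufixrz_model(0xffffffffu).flags == F_INV);
    assert(ufixrz_model(0xffbfffffu).value == 0u);
}
```

The model returns the flags alongside the value instead of updating a sticky status word, so it corresponds to one ufixrz plus one ufixrzflags on the same input.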

EXAMPLES

Initial Values Operation Result

r30 = 0x40400000 (3.0) ufixrz r30 → r100 r100 ← 3

r35 = 0x40247ae1 (2.57) ufixrz r35 → r102 r102 ← 2, INX flag set

r10 = 0, r40 = 0xff4fffff (–3.402823466e+38) IF r10 ufixrz r40 → r105 no change, since guard is false

r20 = 1, r40 = 0xff4fffff (–3.402823466e+38) IF r20 ufixrz r40 → r110 r110 ← 0x0, INV flag set

r45 = 0x7f800000 (+INF) ufixrz r45 → r112 r112 ← 0xffffffff (2^32–1), INV flag set

r50 = 0xbfc147ae (-1.51) ufixrz r50 → r115 r115 ← 0, INV flag set

r60 = 0x00400000 (5.877471754e-39) ufixrz r60 → r117 r117 ← 0, IFZ set

r70 = 0xffffffff (QNaN) ufixrz r70 → r120 r120 ← 0, INV flag set

r80 = 0xffbfffff (SNaN) ufixrz r80 → r122 r122 ← 0, INV flag set

SEE ALSO

ifixieee ufixieee ifixrz

ufixrz

IEEE status flags from convert floating-point to

unsigned integer with round toward zero

SYNTAX

[ IF rguard ] ufixrzflags rsrc1 → rdest

FUNCTION

if rguard then

rdest ← ieee_flags((unsigned long) ((float)rsrc1))

ATTRIBUTES

Function unit falu

Operation code 126

Number of operands 1

Modifier No

Modifier range —

Latency 3

Issue slots 1, 4

DESCRIPTION

The ufixrzflags operation computes the IEEE exceptions that would result from converting the single-precision

IEEE floating-point value in rsrc1 to an unsigned integer, and an integer bit vector representing the computed

exception flags is written into rdest. The bit vector stored in rdest has the same format as the IEEE exception bits in

the PCSW. The exception flags in PCSW are left unchanged by this operation. Rounding toward zero is performed;

the IEEE rounding mode bits in PCSW are ignored. If an argument is denormalized, zero is substituted before

computing the conversion, and the IFZ bit in the result is set.

The ufixrzflags operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls

the modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values Operation Result

r30 = 0x40400000 (3.0) ufixrzflags r30 → r100 r100 ← 0

r35 = 0x40247ae1 (2.57) ufixrzflags r35 → r102 r102 ← 0x02 (INX)

r10 = 0, r40 = 0xff4fffff (–3.402823466e+38) IF r10 ufixrzflags r40 → r105 no change, since guard is false

r20 = 1, r40 = 0xff4fffff (–3.402823466e+38) IF r20 ufixrzflags r40 → r110 r110 ← 0x10 (INV)

r45 = 0x7f800000 (+INF) ufixrzflags r45 → r112 r112 ← 0x10 (INV)

r50 = 0xbfc147ae (-1.51) ufixrzflags r50 → r115 r115 ← 0x10 (INV)

r60 = 0x00400000 (5.877471754e-39) ufixrzflags r60 → r117 r117 ← 0x20 (IFZ)

r70 = 0xffffffff (QNaN) ufixrzflags r70 → r120 r120 ← 0x10 (INV)

r80 = 0xffbfffff (SNaN) ufixrzflags r80 → r122 r122 ← 0x10 (INV)

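[Figure: flag positions in the result bit vector: OFZ = bit 6, IFZ = bit 5, INV = bit 4, OVF = bit 3, UNF = bit 2, INX = bit 1, DBZ = bit 0; bits 31..7 carry no flag label.]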

SEE ALSO

ufixrz ifixrzflags

ifixieeeflags

ufixieeeflags

ufixrzflags

Convert unsigned integer to floating-point

SYNTAX

[ IF rguard ] ufloat rsrc1 → rdest

FUNCTION

if rguard then {

rdest ← (float) ((unsigned long)rsrc1)

}

ATTRIBUTES

Function unit falu

Operation code 127

Number of operands 1

Modifier No

Modifier range —

Latency 3

Issue slots 1, 4

DESCRIPTION

The ufloat operation converts the unsigned integer value in rsrc1 to single-precision IEEE floating-point format

and writes the result into rdest. Rounding is according to the IEEE rounding mode bits in PCSW. If ufloat causes

an IEEE exception, such as inexact, the corresponding exception flags in the PCSW are set. The PCSW exception

flags are sticky: the flags can be set as a side-effect of any floating-point operation but can only be reset by an explicit

writepcsw operation. The update of the PCSW exception flags occurs at the same time as rdest is written. If any

other floating-point compute operations update the PCSW at the same time, the net result in each exception flag is the

logical OR of all simultaneous updates ORed with the existing PCSW value for that exception flag.

The ufloatflags operation computes the exception flags that would result from an individual ufloat.

The ufloat operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest and the exception flags in PCSW are written;

otherwise, rdest is not changed and the operation does not affect the exception flags in PCSW.

EXAMPLES

Initial Values Operation Result

r30 = 3 ufloat r30 → r100 r100 ← 0x40400000 (3.0)

r40 = 0xffffffff (4294967295) ufloat r40 → r105 r105 ← 0x4f800000 (4.294967296e+9), INX flag set

r10 = 0, r50 = 0xfffffffd IF r10 ufloat r50 → r110 no change, since guard is false

r20 = 1, r50 = 0xfffffffd IF r20 ufloat r50 → r115 r115 ← 0x4f800000 (4.294967296e+9), INX flag set

r60 = 0x7fffffff (2147483647) ufloat r60 → r117 r117 ← 0x4f000000 (2.147483648e+9), INX flag set

r70 = 0x80000000 (2147483648) ufloat r70 → r120 r120 ← 0x4f000000 (2.147483648e+9)

r80 = 0x7ffffff1 (2147483633) ufloat r80 → r122 r122 ← 0x4f000000 (2.147483648e+9), INX flag set

SEE ALSO

ifloat ifloatrz ufloatrz

ifixieee ufloatflags

ufloat

IEEE status flags from convert unsigned integer

to floating-point

SYNTAX

[ IF rguard ] ufloatflags rsrc1 → rdest

FUNCTION

if rguard then

rdest ← ieee_flags((float) ((unsigned long)rsrc1))

ATTRIBUTES

Function unit falu

Operation code 128

Number of operands 1

Modifier No

Modifier range —

Latency 3

Issue slots 1, 4

DESCRIPTION

The ufloatflags operation computes the IEEE exceptions that would result from converting the unsigned

integer in rsrc1 to a single-precision IEEE floating-point value, and an integer bit vector representing the computed

exception flags is written into rdest. The bit vector stored in rdest has the same format as the IEEE exception bits in

the PCSW. The exception flags in PCSW are left unchanged by this operation. Rounding is according to the IEEE

rounding mode bits in PCSW.

The ufloatflags operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls

the modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values Operation Result

r30 = 3 ufloatflags r30 → r100 r100 ← 0

r40 = 0xffffffff (4294967295) ufloatflags r40 → r105 r105 ← 0x02 (INX)

r10 = 0, r50 = 0xfffffffd IF r10 ufloatflags r50 → r110 no change, since guard is false

r20 = 1, r50 = 0xfffffffd IF r20 ufloatflags r50 → r115 r115 ← 0x02 (INX)

r60 = 0x7fffffff (2147483647) ufloatflags r60 → r117 r117 ← 0x02 (INX)

r70 = 0x80000000 (2147483648) ufloatflags r70 → r120 r120 ← 0

r80 = 0x7ffffff1 (2147483633) ufloatflags r80 → r122 r122 ← 0x02 (INX)

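[Figure: flag positions in the result bit vector: OFZ = bit 6, IFZ = bit 5, INV = bit 4, OVF = bit 3, UNF = bit 2, INX = bit 1, DBZ = bit 0; bits 31..7 carry no flag label.]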

SEE ALSO

ufloat ifloatflags

ifloatrzflags

ufloatrzflags

ufloatflags

Convert unsigned integer to floating-point with

rounding toward zero

SYNTAX

[ IF rguard ] ufloatrz rsrc1 → rdest

FUNCTION

if rguard then {

rdest ← (float) ((unsigned long)rsrc1)

}

ATTRIBUTES

Function unit falu

Operation code 119

Number of operands 1

Modifier No

Modifier range —

Latency 3

Issue slots 1, 4

DESCRIPTION

The ufloatrz operation converts the unsigned integer value in rsrc1 to single-precision IEEE floating-point

format and writes the result into rdest. Rounding is performed toward zero; the IEEE rounding mode bits in PCSW are

ignored. This is the preferred rounding mode for ANSI C. If ufloatrz causes an IEEE exception, such as inexact,

the corresponding exception flags in the PCSW are set. The PCSW exception flags are sticky: the flags can be set as

a side-effect of any floating-point operation but can only be reset by an explicit writepcsw operation. The update of

the PCSW exception flags occurs at the same time as rdest is written. If any other floating-point compute operations

update the PCSW at the same time, the net result in each exception flag is the logical OR of all simultaneous updates

ORed with the existing PCSW value for that exception flag.

The ufloatrzflags operation computes the exception flags that would result from an individual ufloatrz.

The ufloatrz operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest and the exception flags in PCSW are written;

otherwise, rdest is not changed and the operation does not affect the exception flags in PCSW.
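Round-toward-zero conversion can be illustrated with integer arithmetic alone, which avoids any dependence on the host's floating-point rounding mode. The sketch below is an assumption-labeled model, not the hardware algorithm; the function names and the inexact output parameter are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of ufloatrz: returns the IEEE single-precision bit
 * pattern for x, truncating any bits that do not fit in the 24-bit
 * significand. *inexact is set to 1 when non-zero bits were dropped. */
uint32_t ufloatrz_model(uint32_t x, int *inexact)
{
    int msb = 31;
    uint32_t mant;

    *inexact = 0;
    if (x == 0u) {
        return 0u;                               /* +0.0 */
    }
    while (((x >> msb) & 1u) == 0u) {            /* locate the leading one */
        msb--;
    }
    if (msb > 23) {
        uint32_t dropped = x & ((1u << (msb - 23)) - 1u);
        *inexact = (dropped != 0u);
        mant = x >> (msb - 23);                  /* truncate low bits */
    } else {
        mant = x << (23 - msb);                  /* exact: shift into place */
    }
    /* biased exponent, then the 23 stored fraction bits */
    return ((uint32_t)(msb + 127) << 23) | (mant & 0x7fffffu);
}

/* Replays rows of the EXAMPLES tables for ufloatrz and ufloatrzflags. */
void ufloatrz_check_examples(void)
{
    int inx;

    assert(ufloatrz_model(3u, &inx) == 0x40400000u && inx == 0);
    assert(ufloatrz_model(0xffffffffu, &inx) == 0x4f7fffffu && inx == 1);
    assert(ufloatrz_model(0xfffffffdu, &inx) == 0x4f7fffffu && inx == 1);
    assert(ufloatrz_model(0x7fffffffu, &inx) == 0x4effffffu && inx == 1);
    assert(ufloatrz_model(0x80000000u, &inx) == 0x4f000000u && inx == 0);
    assert(ufloatrz_model(0x7ffffff1u, &inx) == 0x4effffffu && inx == 1);
}
```

Because every unsigned 32-bit integer is within the range of single precision, inexact is the only exception this conversion can raise in the model.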

EXAMPLES

Initial Values Operation Result

r30 = 3 ufloatrz r30 → r100 r100 ← 0x40400000 (3.0)

r40 = 0xffffffff (4294967295) ufloatrz r40 → r105 r105 ← 0x4f7fffff (4.294967040e+9), INX flag set

r10 = 0, r50 = 0xfffffffd IF r10 ufloatrz r50 → r110 no change, since guard is false

r20 = 1, r50 = 0xfffffffd IF r20 ufloatrz r50 → r115 r115 ← 0x4f7fffff (4.294967040e+9), INX flag set

r60 = 0x7fffffff (2147483647) ufloatrz r60 → r117 r117 ← 0x4effffff (2.147483520e+9), INX flag set

r70 = 0x80000000 (2147483648) ufloatrz r70 → r120 r120 ← 0x4f000000 (2.147483648e+9)

r80 = 0x7ffffff1 (2147483633) ufloatrz r80 → r122 r122 ← 0x4effffff (2.147483520e+9), INX flag set

SEE ALSO

ifloatrz ifloat ufloat

ifixieee ufloatflags

ufloatrz

IEEE status flags from convert unsigned integer

to floating-point with rounding toward zero

SYNTAX

[ IF rguard ] ufloatrzflags rsrc1 → rdest

FUNCTION

if rguard then

rdest ← ieee_flags((float) ((unsigned long)rsrc1))

ATTRIBUTES

Function unit falu

Operation code 120

Number of operands 1

Modifier No

Modifier range —

Latency 3

Issue slots 1, 4

DESCRIPTION

The ufloatrzflags operation computes the IEEE exceptions that would result from converting the unsigned

integer in rsrc1 to a single-precision IEEE floating-point value, and an integer bit vector representing the computed

exception flags is written into rdest. The bit vector stored in rdest has the same format as the IEEE exception bits in

the PCSW. The exception flags in PCSW are left unchanged by this operation. Rounding is performed toward zero;

the IEEE rounding mode bits in PCSW are ignored.

The ufloatrzflags operation optionally takes a guard, specified in rguard. If a guard is present, its LSB

controls the modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not

changed.

EXAMPLES

Initial Values Operation Result

r30 = 3 ufloatrzflags r30 → r100 r100 ← 0

r40 = 0xffffffff (4294967295) ufloatrzflags r40 → r105 r105 ← 0x02 (INX)

r10 = 0, r50 = 0xfffffffd IF r10 ufloatrzflags r50 → r110 no change, since guard is false

r20 = 1, r50 = 0xfffffffd IF r20 ufloatrzflags r50 → r115 r115 ← 0x02 (INX)

r60 = 0x7fffffff (2147483647) ufloatrzflags r60 → r117 r117 ← 0x02 (INX)

r70 = 0x80000000 (2147483648) ufloatrzflags r70 → r120 r120 ← 0

r80 = 0x7ffffff1 (2147483633) ufloatrzflags r80 → r122 r122 ← 0x02 (INX)

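[Figure: flag positions in the result bit vector: OFZ = bit 6, IFZ = bit 5, INV = bit 4, OVF = bit 3, UNF = bit 2, INX = bit 1, DBZ = bit 0; bits 31..7 carry no flag label.]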

SEE ALSO

ufloatrz ifloatflags

ufloatflags ifloatrzflags

ufloatrzflags

Unsigned compare greater or equal

SYNTAX

[ IF rguard ] ugeq rsrc1 rsrc2 → rdest

FUNCTION

if rguard then {

if (unsigned)rsrc1 >= (unsigned)rsrc2 then

rdest ← 1

else

rdest ← 0

}

ATTRIBUTES

Function unit alu

Operation code 35

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The ugeq operation sets the destination register, rdest, to 1 if the first argument, rsrc1, is greater than or equal to

the second argument, rsrc2; otherwise, rdest is set to 0. The arguments are treated as unsigned integers.

The ugeq operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.
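The effect of treating the arguments as unsigned shows up when bit 31 is set. The C sketch below contrasts an unsigned comparison with a signed one on the same bit patterns. The function names are hypothetical; the signed variant is included only for contrast and relies on the usual two's-complement conversion to int32_t.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of ugeq: compare the operands as unsigned values. */
uint32_t ugeq_model(uint32_t rsrc1, uint32_t rsrc2)
{
    return (rsrc1 >= rsrc2) ? 1u : 0u;
}

/* Signed comparison of the same bit patterns, for contrast only. */
uint32_t signed_geq_contrast(uint32_t rsrc1, uint32_t rsrc2)
{
    return ((int32_t)rsrc1 >= (int32_t)rsrc2) ? 1u : 0u;
}

/* Replays rows of the EXAMPLES table for ugeq. */
void ugeq_check_examples(void)
{
    assert(ugeq_model(3u, 4u) == 0u);
    assert(ugeq_model(0x1000u, 0x100u) == 1u);
    assert(ugeq_model(0x80000000u, 4u) == 1u);          /* large unsigned value */
    assert(ugeq_model(0x80000000u, 0x80000000u) == 1u);

    /* Read as signed, 0x80000000 is negative and compares below 4. */
    assert(signed_geq_contrast(0x80000000u, 4u) == 0u);
}
```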

EXAMPLES

Initial Values Operation Result

r30 = 3, r40 = 4 ugeq r30 r40 → r80 r80 ← 0

r10 = 0, r60 = 0x100, r30 = 3 IF r10 ugeq r60 r30 → r50 no change, since guard is false

r20 = 1, r50 = 0x1000, r60 = 0x100 IF r20 ugeq r50 r60 → r90 r90 ← 1

r70 = 0x80000000, r40 = 4 ugeq r70 r40 → r100 r100 ← 1

r70 = 0x80000000 ugeq r70 r70 → r110 r110 ← 1

SEE ALSO

igeq ugeqi

ugeq

Unsigned compare greater or equal with

immediate

SYNTAX

[ IF rguard ] ugeqi(n) rsrc1 → rdest

FUNCTION

if rguard then {

if (unsigned)rsrc1 >= (unsigned)n then

rdest ← 1

else

rdest ← 0

}

ATTRIBUTES

Function unit alu

Operation code 36

Number of operands 1

Modifier 7 bits

Modifier range 0..127

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The ugeqi operation sets the destination register, rdest, to 1 if the first argument, rsrc1, is greater than or equal to

the opcode modifier, n; otherwise, rdest is set to 0. The arguments are treated as unsigned integers.

The ugeqi operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values Operation Result

r30 = 3 ugeqi(2) r30 → r80 r80 ← 1

r30 = 3 ugeqi(3) r30 → r90 r90 ← 1

r30 = 3 ugeqi(4) r30 → r100 r100 ← 0

r10 = 0, r40 = 0x100 IF r10 ugeqi(63) r40 → r50 no change, since guard is false

r20 = 1, r40 = 0x100 IF r20 ugeqi(63) r40 → r100 r100 ← 1

r60 = 0x80000000 ugeqi(127) r60 → r120 r120 ← 1

SEE ALSO

ugeq igeqi

ugeqi

Unsigned compare greater

SYNTAX

[ IF rguard ] ugtr rsrc1 rsrc2 → rdest

FUNCTION

if rguard then {

if (unsigned)rsrc1 > (unsigned)rsrc2 then

rdest ← 1

else

rdest ← 0

}

ATTRIBUTES

Function unit alu

Operation code 33

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The ugtr operation sets the destination register, rdest, to 1 if the first argument, rsrc1, is greater than the second

argument, rsrc2; otherwise, rdest is set to 0. The arguments are treated as unsigned integers.

The ugtr operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values Operation Result

r30 = 3, r40 = 4 ugtr r30 r40 → r80 r80 ← 0

r10 = 0, r60 = 0x100, r30 = 3 IF r10 ugtr r60 r30 → r50 no change, since guard is false

r20 = 1, r50 = 0x1000, r60 = 0x100 IF r20 ugtr r50 r60 → r90 r90 ← 1

r70 = 0x80000000, r40 = 4 ugtr r70 r40 → r100 r100 ← 1

r70 = 0x80000000 ugtr r70 r70 → r110 r110 ← 0

SEE ALSO

igtr ugtri

ugtr

Unsigned compare greater with immediate

SYNTAX

[ IF rguard ] ugtri(n) rsrc1 → rdest

FUNCTION

if rguard then {

if (unsigned)rsrc1 > (unsigned)n then

rdest ← 1

else

rdest ← 0

}

ATTRIBUTES

Function unit alu

Operation code 34

Number of operands 1

Modifier 7 bits

Modifier range 0..127

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The ugtri operation sets the destination register, rdest, to 1 if the first argument, rsrc1, is greater than the opcode

modifier, n; otherwise, rdest is set to 0. The arguments are treated as unsigned integers.

The ugtri operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values Operation Result

r30 = 3 ugtri(2) r30 → r80 r80 ← 1

r30 = 3 ugtri(3) r30 → r90 r90 ← 0

r30 = 3 ugtri(4) r30 → r100 r100 ← 0

r10 = 0, r40 = 0x100 IF r10 ugtri(63) r40 → r50 no change, since guard is false

r20 = 1, r40 = 0x100 IF r20 ugtri(63) r40 → r100 r100 ← 1

r60 = 0x80000000 ugtri(127) r60 → r120 r120 ← 1

SEE ALSO

igtri ugtr

ugtri

Unsigned immediate

SYNTAX

uimm(n) → rdest

FUNCTION

rdest ← n

ATTRIBUTES

Function unit const

Operation code 191

Number of operands 0

Modifier 32 bits

Modifier range 0..0xffffffff

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The uimm operation writes the unsigned 32-bit opcode modifier n into rdest. Note: this operation is not guarded.

EXAMPLES

Initial Values Operation Result

uimm(2) → r10 r10 ← 2

uimm(0x100) → r20 r20 ← 0x100

uimm(0xfffc0000) → r30 r30 ← 0xfffc0000

SEE ALSO

iimm

uimm

Unsigned 16-bit load

pseudo-op for uld16d(0)

SYNTAX

[ IF rguard ] uld16 rsrc1 → rdest

FUNCTION

if rguard then {

if PCSW.bytesex = LITTLE_ENDIAN then

bs ← 1

else

bs ← 0

temp<7:0> ← mem[rsrc1 + (1 ⊕ bs)]

temp<15:8> ← mem[rsrc1 + (0 ⊕ bs)]

rdest ← zero_ext16to32(temp<15:0>)

}

ATTRIBUTES

Function unit dmem

Operation code 197

Number of operands 1

Modifier No

Modifier range —

Latency 3

Issue slots 4, 5

DESCRIPTION

The uld16 operation is a pseudo operation transformed by the scheduler into an uld16d(0) with the same

argument. (Note: pseudo operations cannot be used in assembly source files.)

The uld16 operation loads the 16-bit memory value from the address contained in rsrc1, zero extends it to 32 bits,

and writes the result in rdest. If the memory address contained in rsrc1 is not a multiple of 2, the result of uld16 is

undefined but no exception will be raised. This load operation is performed as little-endian or big-endian depending on

the current setting of the bytesex bit in the PCSW.

The result of an access by uld16 to the MMIO address aperture is undefined; access to the MMIO aperture is

defined only for 32-bit loads and stores.

The uld16 operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register and the occurrence of side effects. If the LSB of rguard is 1, rdest is written and

the data cache status bits are updated if the addressed locations are cacheable. If the LSB of rguard is 0, rdest is not

changed and uld16 has no side effects whatever.
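The byte selection controlled by the bytesex bit can be sketched in C as below. This is an illustrative model under stated assumptions: memory is a plain byte array, the function names are hypothetical, and the operator combining the byte offset with bs in the FUNCTION block is taken to be exclusive OR, which is the reading consistent with the examples.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of uld16 on a byte-addressed memory image.
 * little_endian mirrors PCSW.bytesex; addr is assumed to be even. */
uint32_t uld16_model(const uint8_t *mem, uint32_t addr, int little_endian)
{
    uint32_t bs = little_endian ? 1u : 0u;
    uint32_t lo = mem[addr + (1u ^ bs)];   /* temp<7:0>  */
    uint32_t hi = mem[addr + (0u ^ bs)];   /* temp<15:8> */

    return (hi << 8) | lo;                 /* zero-extended to 32 bits */
}

/* Replays rows of the EXAMPLES table, which assume big-endian mode. */
void uld16_check_examples(void)
{
    static uint8_t mem[0xd08];

    mem[0xd00] = 0x22; mem[0xd01] = 0x11;
    mem[0xd04] = 0x84; mem[0xd05] = 0x33;

    assert(uld16_model(mem, 0xd00u, 0) == 0x00002211u);
    assert(uld16_model(mem, 0xd04u, 0) == 0x00008433u);

    /* The same bytes read in little-endian mode swap places. */
    assert(uld16_model(mem, 0xd00u, 1) == 0x00001122u);
}
```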

EXAMPLES

Initial Values Operation Result

r10 = 0xd00, [0xd00] = 0x22, [0xd01] = 0x11 uld16 r10 → r60 r60 ← 0x00002211

r30 = 0, r20 = 0xd04, [0xd04] = 0x84, [0xd05] = 0x33 IF r30 uld16 r20 → r70 no change, since guard is false

r40 = 1, r20 = 0xd04, [0xd04] = 0x84, [0xd05] = 0x33 IF r40 uld16 r20 → r80 r80 ← 0x00008433

r50 = 0xd01 uld16 r50 → r90 r90 undefined (0xd01 is not a multiple of 2)

SEE ALSO

uld16d ild16 ild16d uld16r

ild16r uld16x ild16x

uld16

Unsigned 16-bit load with displacement

SYNTAX

[ IF rguard ] uld16d(d) rsrc1 → rdest

FUNCTION

if rguard then {

if PCSW.bytesex = LITTLE_ENDIAN then

bs ← 1

else

bs ← 0

temp<7:0> ← mem[rsrc1 + d + (1 ⊕ bs)]

temp<15:8> ← mem[rsrc1 + d + (0 ⊕ bs)]

rdest ← zero_ext16to32(temp<15:0>)

}

ATTRIBUTES

Function unit dmem

Operation code 197

Number of operands 1

Modifier 7 bits

Modifier range –128..126 by 2

Latency 3

Issue slots 4, 5

DESCRIPTION

The uld16d operation loads the 16-bit memory value from the address computed by rsrc1 + d, zero extends it to

32 bits, and writes the result in rdest. The d value is an opcode modifier, must be in the range –128 to 126 inclusive,

and must be a multiple of 2. If the memory address computed by rsrc1 + d is not a multiple of 2, the result of uld16d

is undefined but no exception will be raised. This load operation is performed as little-endian or big-endian depending

on the current setting of the bytesex bit in the PCSW.

The result of an access by uld16d to the MMIO address aperture is undefined; access to the MMIO aperture is

defined only for 32-bit loads and stores.

The uld16d operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register and the occurrence of side effects. If the LSB of rguard is 1, rdest is written and

the data cache status bits are updated if the addressed locations are cacheable. If the LSB of rguard is 0, rdest is not

changed and uld16d has no side effects whatever.
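The constraints on the displacement and the resulting effective address can be expressed as a short C sketch. The helper names are hypothetical, and the 32-bit wrap-around of the address sum is an assumption of this model.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if d is a legal uld16d displacement: -128..126 and even. */
int uld16d_disp_ok(int d)
{
    return d >= -128 && d <= 126 && (d % 2) == 0;
}

/* Effective address rsrc1 + d, computed modulo 2^32 (assumed). */
uint32_t uld16d_address(uint32_t rsrc1, int d)
{
    return rsrc1 + (uint32_t)d;
}

/* Displacements and addresses taken from the EXAMPLES table. */
void uld16d_check_examples(void)
{
    assert(uld16d_disp_ok(2));
    assert(uld16d_disp_ok(-4));
    assert(uld16d_address(0xd00u, 2) == 0xd02u);
    assert(uld16d_address(0xd04u, -4) == 0xd00u);

    /* 0xd01 + (-4) is odd, so the load result would be undefined. */
    assert((uld16d_address(0xd01u, -4) & 1u) == 1u);
}
```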

EXAMPLES

Initial Values Operation Result

r10 = 0xd00, [0xd02] = 0x22, [0xd03] = 0x11 uld16d(2) r10 → r60 r60 ← 0x00002211

r30 = 0, r20 = 0xd04, [0xd00] = 0x84, [0xd01] = 0x33 IF r30 uld16d(-4) r20 → r70 no change, since guard is false

r40 = 1, r20 = 0xd04, [0xd00] = 0x84, [0xd01] = 0x33 IF r40 uld16d(-4) r20 → r80 r80 ← 0x00008433

r50 = 0xd01 uld16d(-4) r50 → r90 r90 undefined (0xd01 + (–4) is not a multiple of 2)

SEE ALSO

uld16 ild16 ild16d uld16r

ild16r uld16x ild16x

uld16d

Unsigned 16-bit load with index

SYNTAX

[ IF rguard ] uld16r rsrc1 rsrc2 → rdest

FUNCTION

if rguard then {

if PCSW.bytesex = LITTLE_ENDIAN then

bs ← 1

else

bs ← 0

temp<7:0> ← mem[rsrc1 + rsrc2 + (1 ⊕ bs)]

temp<15:8> ← mem[rsrc1 + rsrc2 + (0 ⊕ bs)]

rdest ← zero_ext16to32(temp<15:0>)

}

ATTRIBUTES

Function unit dmem

Operation code 198

Number of operands 2

Modifier No

Modifier range —

Latency 3

Issue slots 4, 5

DESCRIPTION

The uld16r operation loads the 16-bit memory value from the address computed by rsrc1 + rsrc2, zero extends it

to 32 bits, and writes the result in rdest. If the memory address computed by rsrc1 + rsrc2 is not a multiple of 2, the

result of uld16r is undefined but no exception will be raised. This load operation is performed as little-endian or big-

endian depending on the current setting of the bytesex bit in the PCSW.

The result of an access by uld16r to the MMIO address aperture is undefined; access to the MMIO aperture is

defined only for 32-bit loads and stores.

The uld16r operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register and the occurrence of side effects. If the LSB of rguard is 1, rdest is written and

the data cache status bits are updated if the addressed locations are cacheable. If the LSB of rguard is 0, rdest is not

changed and uld16r has no side effects whatever.
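The behavior described above can be modeled in a few lines of Python. This is a minimal sketch and not vendor code: the dict-based mem, the bytesex argument, the 32-bit wraparound of the address sum, and the use of None to stand for an undefined result are all assumptions made for illustration.

```python
# Hypothetical software model of uld16r; names and memory layout are illustrative.
BIG_ENDIAN, LITTLE_ENDIAN = 0, 1

def uld16r(mem, rsrc1, rsrc2, bytesex=BIG_ENDIAN, rguard=1, rdest=0):
    """Return the value rdest holds after the operation.

    mem maps byte addresses to byte values. None marks the undefined
    result of a misaligned access (the hardware raises no exception).
    """
    if not (rguard & 1):                  # guard false: rdest unchanged
        return rdest
    addr = (rsrc1 + rsrc2) & 0xFFFFFFFF   # assumed 32-bit address arithmetic
    if addr % 2:
        return None
    bs = 1 if bytesex == LITTLE_ENDIAN else 0
    low = mem[addr + (1 ^ bs)]            # temp<7:0>
    high = mem[addr + (0 ^ bs)]           # temp<15:8>
    return (high << 8) | low              # upper 16 bits stay zero

mem = {0xd00: 0x84, 0xd01: 0x33, 0xd02: 0x22, 0xd03: 0x11}
print(hex(uld16r(mem, 0xd00, 2)))         # big-endian load from 0xd02
```

With bytesex set to LITTLE_ENDIAN the same call returns 0x1122, because the byte at the lower address then supplies temp<7:0>.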

EXAMPLES

Initial Values | Operation | Result
r10 = 0xd00, r20 = 2, [0xd02] = 0x22, [0xd03] = 0x11 | uld16r r10 r20 → r80 | r80 ← 0x00002211
r50 = 0, r40 = 0xd04, r30 = 0xfffffffc, [0xd00] = 0x84, [0xd01] = 0x33 | IF r50 uld16r r40 r30 → r90 | no change, since guard is false
r60 = 1, r40 = 0xd04, r30 = 0xfffffffc, [0xd00] = 0x84, [0xd01] = 0x33 | IF r60 uld16r r40 r30 → r100 | r100 ← 0x00008433
r70 = 0xd01, r30 = 0xfffffffc | uld16r r70 r30 → r110 | r110 undefined (0xd01 + (–4) is not a multiple of 2)

SEE ALSO

uld16 ild16 uld16d ild16d

ild16r uld16x ild16x


PNX1300/01/02/11 Data Book Philips Semiconductors

A-191 PRELIMINARY SPECIFICATION

uld16x Unsigned 16-bit load with scaled index

SYNTAX

[ IF rguard ] uld16x rsrc1 rsrc2 → rdest

FUNCTION

if rguard then {
    if PCSW.bytesex = LITTLE_ENDIAN then
        bs ← 1
    else
        bs ← 0
    temp<7:0> ← mem[rsrc1 + (2 × rsrc2) + (1 ⊕ bs)]
    temp<15:8> ← mem[rsrc1 + (2 × rsrc2) + (0 ⊕ bs)]
    rdest ← zero_ext16to32(temp<15:0>)
}

ATTRIBUTES

Function unit dmem

Operation code 199

Number of operands 2

Modifier No

Modifier range —

Latency 3

Issue slots 4, 5

DESCRIPTION

The uld16x operation loads the 16-bit memory value from the address computed by rsrc1 + 2×rsrc2, zero extends

it to 32 bits, and writes the result in rdest. If the memory address computed by rsrc1 + 2×rsrc2 is not a multiple of 2,

the result of uld16x is undefined but no exception will be raised. This load operation is performed as little-endian or

big-endian depending on the current setting of the bytesex bit in the PCSW.

The result of an access by uld16x to the MMIO address aperture is undefined; access to the MMIO aperture is

defined only for 32-bit loads and stores.

The uld16x operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register and the occurrence of side effects. If the LSB of rguard is 1, rdest is written and

the data cache status bits are updated if the addressed locations are cacheable. If the LSB of rguard is 0, rdest is not

changed and uld16x has no side effects whatever.
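The scaled-index address computation can be checked with a short Python sketch. The helper names are hypothetical, and the 32-bit wraparound is an assumption inferred from the examples for this operation, where an index of 0xfffffffe behaves as -2.

```python
# Illustrative address arithmetic for uld16x; not vendor code.
def uld16x_address(rsrc1, rsrc2):
    """Effective byte address rsrc1 + 2*rsrc2, truncated to 32 bits (assumed)."""
    return (rsrc1 + 2 * rsrc2) & 0xFFFFFFFF

def is_defined(addr):
    """The load result is defined only when the address is a multiple of 2."""
    return addr % 2 == 0

print(hex(uld16x_address(0xd04, 0xfffffffe)))   # index of -2 halfwords
```

Because 2×rsrc2 is always even, the computed address is misaligned exactly when the base in rsrc1 is odd.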

EXAMPLES

Initial Values | Operation | Result
r10 = 0xd00, r30 = 1, [0xd02] = 0x22, [0xd03] = 0x11 | uld16x r10 r30 → r100 | r100 ← 0x00002211
r50 = 0, r40 = 0xd04, r20 = 0xfffffffe, [0xd00] = 0x84, [0xd01] = 0x33 | IF r50 uld16x r40 r20 → r80 | no change, since guard is false
r60 = 1, r40 = 0xd04, r20 = 0xfffffffe, [0xd00] = 0x84, [0xd01] = 0x33 | IF r60 uld16x r40 r20 → r90 | r90 ← 0x00008433
r70 = 0xd01, r30 = 1 | uld16x r70 r30 → r110 | r110 undefined (0xd01 + 2×1 is not a multiple of 2)

SEE ALSO

uld16 ild16 uld16d ild16d

uld16r ild16r ild16x


uld8 Unsigned 8-bit load

pseudo-op for uld8d(0)

SYNTAX

[ IF rguard ] uld8 rsrc1 → rdest

FUNCTION

if rguard then
    rdest ← zero_ext8to32(mem[rsrc1])

ATTRIBUTES

Function unit dmem

Operation code 8

Number of operands 1

Modifier No

Modifier range —

Latency 3

Issue slots 4, 5

DESCRIPTION

The uld8 operation is a pseudo operation transformed by the scheduler into an uld8d(0) with the same

argument. (Note: pseudo operations cannot be used in assembly source files.)

The uld8 operation loads the 8-bit memory value from the address contained in rsrc1, zero extends it to 32 bits,

and writes the result in rdest. This operation does not depend on the bytesex bit in the PCSW since only a single byte

is loaded.

The result of an access by uld8 to the MMIO address aperture is undefined; access to the MMIO aperture is

defined only for 32-bit loads and stores.

The uld8 operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register and the occurrence of side effects. If the LSB of rguard is 1, rdest is written and

the data cache status bits are updated if the addressed location is cacheable. If the LSB of rguard is 0, rdest is not

changed and uld8 has no side effects whatever.

EXAMPLES

Initial Values | Operation | Result
r10 = 0xd00, [0xd00] = 0x22 | uld8 r10 → r60 | r60 ← 0x00000022
r30 = 0, r20 = 0xd04, [0xd04] = 0x84 | IF r30 uld8 r20 → r70 | no change, since guard is false
r40 = 1, r20 = 0xd04, [0xd04] = 0x84 | IF r40 uld8 r20 → r80 | r80 ← 0x00000084
r50 = 0xd01, [0xd01] = 0x33 | uld8 r50 → r90 | r90 ← 0x00000033

SEE ALSO

ild8 uld8d ild8d uld8r

ild8r


uld8d Unsigned 8-bit load with displacement

SYNTAX

[ IF rguard ] uld8d(d) rsrc1 → rdest

FUNCTION

if rguard then
    rdest ← zero_ext8to32(mem[rsrc1 + d])

ATTRIBUTES

Function unit dmem

Operation code 8

Number of operands 1

Modifier 7 bits

Modifier range –64..63

Latency 3

Issue slots 4, 5

DESCRIPTION

The uld8d operation loads the 8-bit memory value from the address computed by rsrc1 + d, zero extends it to 32

bits, and writes the result in rdest. The d value is an opcode modifier in the range –64 to 63 inclusive. This operation

does not depend on the bytesex bit in the PCSW since only a single byte is loaded.

The result of an access by uld8d to the MMIO address aperture is undefined; access to the MMIO aperture is

defined only for 32-bit loads and stores.

The uld8d operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register and the occurrence of side effects. If the LSB of rguard is 1, rdest is written and

the data cache status bits are updated if the addressed location is cacheable. If the LSB of rguard is 0, rdest is not

changed and uld8d has no side effects whatever.
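A minimal Python sketch of the displacement load, assuming a dict-based byte memory and 32-bit address wraparound (both illustrative choices, not part of the specification). It also checks that d fits the 7-bit modifier range given above.

```python
# Hypothetical model of uld8d; mem maps byte addresses to byte values.
def uld8d(mem, d, rsrc1, rguard=1, rdest=0):
    """Return the value rdest holds after uld8d(d) rsrc1."""
    if not -64 <= d <= 63:
        raise ValueError("d must fit the 7-bit modifier range -64..63")
    if not (rguard & 1):          # guard false: rdest unchanged
        return rdest
    addr = (rsrc1 + d) & 0xFFFFFFFF
    return mem[addr] & 0xFF       # a single byte, so no byte-sex dependence

mem = {0xd00: 0x84, 0xd01: 0x33, 0xd02: 0x22}
print(hex(uld8d(mem, -4, 0xd05)))
```

An offset outside -64..63 cannot be encoded in the modifier; in that case an indexed form such as uld8r, which takes the offset in a register, is the alternative.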

EXAMPLES

Initial Values | Operation | Result
r10 = 0xd00, [0xd02] = 0x22 | uld8d(2) r10 → r60 | r60 ← 0x00000022
r30 = 0, r20 = 0xd04, [0xd00] = 0x84 | IF r30 uld8d(-4) r20 → r70 | no change, since guard is false
r40 = 1, r20 = 0xd04, [0xd00] = 0x84 | IF r40 uld8d(-4) r20 → r80 | r80 ← 0x00000084
r50 = 0xd05, [0xd01] = 0x33 | uld8d(-4) r50 → r90 | r90 ← 0x00000033

SEE ALSO

uld8 ild8 ild8d uld8r

ild8r


uld8r Unsigned 8-bit load with index

SYNTAX

[ IF rguard ] uld8r rsrc1 rsrc2 → rdest

FUNCTION

if rguard then
    rdest ← zero_ext8to32(mem[rsrc1 + rsrc2])

ATTRIBUTES

Function unit dmem

Operation code 194

Number of operands 2

Modifier No

Modifier range —

Latency 3

Issue slots 4, 5

DESCRIPTION

The uld8r operation loads the 8-bit memory value from the address computed by rsrc1 + rsrc2, zero extends it to

32 bits, and writes the result in rdest. This operation does not depend on the bytesex bit in the PCSW since only a

single byte is loaded.

The result of an access by uld8r to the MMIO address aperture is undefined; access to the MMIO aperture is

defined only for 32-bit loads and stores.

The uld8r operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register and the occurrence of side effects. If the LSB of rguard is 1, rdest is written and

the data cache status bits are updated if the addressed location is cacheable. If the LSB of rguard is 0, rdest is not

changed and uld8r has no side effects whatever.

EXAMPLES

Initial Values | Operation | Result
r10 = 0xd00, r20 = 2, [0xd02] = 0x22 | uld8r r10 r20 → r80 | r80 ← 0x00000022
r50 = 0, r40 = 0xd04, r30 = 0xfffffffc, [0xd00] = 0x84 | IF r50 uld8r r40 r30 → r90 | no change, since guard is false
r60 = 1, r40 = 0xd04, r30 = 0xfffffffc, [0xd00] = 0x84 | IF r60 uld8r r40 r30 → r100 | r100 ← 0x00000084
r70 = 0xd05, r30 = 0xfffffffc, [0xd01] = 0x33 | uld8r r70 r30 → r110 | r110 ← 0x00000033

SEE ALSO

uld8 ild8 uld8d ild8d

ild8r


uleq Unsigned compare less or equal

pseudo-op for ugeq

SYNTAX

[ IF rguard ] uleq rsrc1 rsrc2 → rdest

FUNCTION

if rguard then {
    if (unsigned)rsrc1 <= (unsigned)rsrc2 then
        rdest ← 1
    else
        rdest ← 0
}

ATTRIBUTES

Function unit alu

Operation code 35

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The uleq operation is a pseudo operation transformed by the scheduler into an ugeq with the arguments

exchanged (uleq’s rsrc1 is ugeq’s rsrc2 and vice versa). (Note: pseudo operations cannot be used in assembly

source files.)

The uleq operation sets the destination register, rdest, to 1 if the first argument, rsrc1, is less than or equal to the

second argument, rsrc2; otherwise, rdest is set to 0. The arguments are treated as unsigned integers.

The uleq operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.
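The argument exchange performed by the scheduler can be illustrated in Python. This is a sketch only: the ugeq model below assumes that ugeq means "unsigned compare greater or equal", and register values are treated as 32-bit patterns.

```python
# Illustrative model of the uleq pseudo-op; not vendor code.
MASK32 = 0xFFFFFFFF

def ugeq(rsrc1, rsrc2):
    """Assumed semantics: 1 if rsrc1 >= rsrc2 as unsigned 32-bit values, else 0."""
    return 1 if (rsrc1 & MASK32) >= (rsrc2 & MASK32) else 0

def uleq(rsrc1, rsrc2):
    """uleq as emitted by the scheduler: ugeq with the arguments exchanged."""
    return ugeq(rsrc2, rsrc1)

print(uleq(3, 4), uleq(0x80000000, 4))
```

Note that 0x80000000 is a large unsigned value here, so it is not less than or equal to 4.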

EXAMPLES

Initial Values | Operation | Result
r30 = 3, r40 = 4 | uleq r30 r40 → r80 | r80 ← 1
r10 = 0, r60 = 0x100, r30 = 3 | IF r10 uleq r60 r30 → r50 | no change, since guard is false
r20 = 1, r50 = 0x1000, r60 = 0x100 | IF r20 uleq r50 r60 → r90 | r90 ← 0
r70 = 0x80000000, r40 = 4 | uleq r70 r40 → r100 | r100 ← 0
r70 = 0x80000000 | uleq r70 r70 → r110 | r110 ← 1

SEE ALSO

ileq uleqi


uleqi Unsigned compare less or equal with immediate

SYNTAX

[ IF rguard ] uleqi(n) rsrc1 → rdest

FUNCTION

if rguard then {
    if (unsigned)rsrc1 <= (unsigned)n then
        rdest ← 1
    else
        rdest ← 0
}

ATTRIBUTES

Function unit alu

Operation code 43

Number of operands 1

Modifier 7 bits

Modifier range 0..127

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The uleqi operation sets the destination register, rdest, to 1 if the first argument, rsrc1, is less than or equal to the

opcode modifier, n; otherwise, rdest is set to 0. The arguments are treated as unsigned integers.

The uleqi operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.
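A small Python sketch of the immediate compare, including the guard. The function signature is hypothetical; the range check reflects the 7-bit modifier range 0..127 listed under ATTRIBUTES.

```python
# Hypothetical model of uleqi; returns the value rdest holds afterwards.
def uleqi(n, rsrc1, rguard=1, rdest=0):
    if not 0 <= n <= 127:
        raise ValueError("n must fit the 7-bit modifier range 0..127")
    if not (rguard & 1):                 # guard false: rdest unchanged
        return rdest
    return 1 if (rsrc1 & 0xFFFFFFFF) <= n else 0

print(uleqi(3, 3), uleqi(127, 0x80000000))
```

Since n never exceeds 127, any register value with a bit set above bit 6 compares as greater than the immediate.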

EXAMPLES

Initial Values | Operation | Result
r30 = 3 | uleqi(2) r30 → r80 | r80 ← 0
r30 = 3 | uleqi(3) r30 → r90 | r90 ← 1
r30 = 3 | uleqi(4) r30 → r100 | r100 ← 1
r10 = 0, r40 = 0x100 | IF r10 uleqi(63) r40 → r50 | no change, since guard is false
r20 = 1, r40 = 0x100 | IF r20 uleqi(63) r40 → r100 | r100 ← 0
r60 = 0x80000000 | uleqi(127) r60 → r120 | r120 ← 0

SEE ALSO

uleq ileqi


ules Unsigned compare less

pseudo-op for ugtr

SYNTAX

[ IF rguard ] ules rsrc1 rsrc2 → rdest

FUNCTION

if rguard then {
    if (unsigned)rsrc1 < (unsigned)rsrc2 then
        rdest ← 1
    else
        rdest ← 0
}

ATTRIBUTES

Function unit alu

Operation code 33

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The ules operation is a pseudo operation transformed by the scheduler into an ugtr with the arguments

exchanged (ules’s rsrc1 is ugtr’s rsrc2 and vice versa). (Note: pseudo operations cannot be used in assembly

source files.)

The ules operation sets the destination register, rdest, to 1 if the first argument, rsrc1, is less than the second

argument, rsrc2; otherwise, rdest is set to 0. The arguments are treated as unsigned integers.

The ules operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values | Operation | Result
r30 = 3, r40 = 4 | ules r30 r40 → r80 | r80 ← 1
r10 = 0, r60 = 0x100, r30 = 3 | IF r10 ules r60 r30 → r50 | no change, since guard is false
r20 = 1, r50 = 0x1000, r60 = 0x100 | IF r20 ules r50 r60 → r90 | r90 ← 0
r70 = 0x80000000, r40 = 4 | ules r70 r40 → r100 | r100 ← 0
r70 = 0x80000000 | ules r70 r70 → r110 | r110 ← 0

SEE ALSO

iles ugtr


ulesi Unsigned compare less with immediate

SYNTAX

[ IF rguard ] ulesi(n) rsrc1 → rdest

FUNCTION

if rguard then {
    if (unsigned)rsrc1 < (unsigned)n then
        rdest ← 1
    else
        rdest ← 0
}

ATTRIBUTES

Function unit alu

Operation code 41

Number of operands 1

Modifier 7 bits

Modifier range 0..127

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The ulesi operation sets the destination register, rdest, to 1 if the first argument, rsrc1, is less than the opcode

modifier, n; otherwise, rdest is set to 0. The arguments are treated as unsigned integers.

The ulesi operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values | Operation | Result
r30 = 3 | ulesi(2) r30 → r80 | r80 ← 0
r30 = 3 | ulesi(3) r30 → r90 | r90 ← 0
r30 = 3 | ulesi(4) r30 → r100 | r100 ← 1
r10 = 0, r40 = 0x100 | IF r10 ulesi(63) r40 → r50 | no change, since guard is false
r20 = 1, r40 = 0x100 | IF r20 ulesi(63) r40 → r100 | r100 ← 0
r60 = 0x80000000 | ulesi(127) r60 → r120 | r120 ← 0

SEE ALSO

ules ilesi


ume8ii Unsigned sum of absolute values of signed 8-bit differences

SYNTAX

[ IF rguard ] ume8ii rsrc1 rsrc2 → rdest

FUNCTION

if rguard then
    rdest ← abs_val(sign_ext8to32(rsrc1<31:24>) – sign_ext8to32(rsrc2<31:24>)) +
            abs_val(sign_ext8to32(rsrc1<23:16>) – sign_ext8to32(rsrc2<23:16>)) +
            abs_val(sign_ext8to32(rsrc1<15:8>) – sign_ext8to32(rsrc2<15:8>)) +
            abs_val(sign_ext8to32(rsrc1<7:0>) – sign_ext8to32(rsrc2<7:0>))

ATTRIBUTES

Function unit dspalu

Operation code 64

Number of operands 2

Modifier No

Modifier range —

Latency 2

Issue slots 1, 3

DESCRIPTION

As shown below, the ume8ii operation computes four separate differences of the four pairs of corresponding

signed 8-bit bytes of rsrc1 and rsrc2; the absolute values of the four differences are summed, and the sum is written to

rdest. All computations are performed without loss of precision.

The ume8ii operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.
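The per-byte computation can be reproduced with a short Python sketch. The helper name signed_bytes is hypothetical, and the guard is omitted for brevity.

```python
# Illustrative model of ume8ii; not vendor code.
def signed_bytes(word):
    """Split a 32-bit word into four signed 8-bit values, most significant first."""
    raw = (word & 0xFFFFFFFF).to_bytes(4, "big")
    return [b - 256 if b >= 128 else b for b in raw]

def ume8ii(rsrc1, rsrc2):
    """Sum of absolute differences of corresponding signed bytes."""
    return sum(abs(a - b) for a, b in zip(signed_bytes(rsrc1), signed_bytes(rsrc2)))

print(hex(ume8ii(0x80808080, 0x7f7f7f7f)))   # largest possible result
```

Each signed byte difference has magnitude at most 255, so the sum never exceeds 4 × 255 = 0x3fc, which is why no precision is lost.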

EXAMPLES

Initial Values | Operation | Result
r80 = 0x0a14f6f6, r30 = 0x1414ecf6 | ume8ii r80 r30 → r100 | r100 ← 0x14
r10 = 0, r80 = 0x0a14f6f6, r30 = 0x1414ecf6 | IF r10 ume8ii r80 r30 → r70 | no change, since guard is false
r20 = 1, r90 = 0x64649c9c, r40 = 0x649c649c | IF r20 ume8ii r90 r40 → r110 | r110 ← 0x190
r40 = 0x649c649c, r90 = 0x64649c9c | ume8ii r40 r90 → r120 | r120 ← 0x190
r50 = 0x80808080, r60 = 0x7f7f7f7f | ume8ii r50 r60 → r125 | r125 ← 0x3fc

[Figure: rsrc1 and rsrc2 are each divided into four signed 8-bit fields (bits 31:24, 23:16, 15:8, and 7:0); the absolute differences of corresponding fields are summed into the unsigned 32-bit rdest.]

SEE ALSO

ume8uu


ume8uu Sum of absolute values of unsigned 8-bit differences

SYNTAX

[ IF rguard ] ume8uu rsrc1 rsrc2 → rdest

FUNCTION

if rguard then
    rdest ← abs_val(zero_ext8to32(rsrc1<31:24>) – zero_ext8to32(rsrc2<31:24>)) +
            abs_val(zero_ext8to32(rsrc1<23:16>) – zero_ext8to32(rsrc2<23:16>)) +
            abs_val(zero_ext8to32(rsrc1<15:8>) – zero_ext8to32(rsrc2<15:8>)) +
            abs_val(zero_ext8to32(rsrc1<7:0>) – zero_ext8to32(rsrc2<7:0>))

ATTRIBUTES

Function unit dspalu

Operation code 26

Number of operands 2

Modifier No

Modifier range —

Latency 2

Issue slots 1, 3

DESCRIPTION

As shown below, the ume8uu operation computes four separate differences of the four pairs of corresponding

unsigned 8-bit bytes of rsrc1 and rsrc2. The absolute values of the four differences are summed and the result is

written to rdest. All computations are performed without loss of precision.

The ume8uu operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.
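A Python sketch of the unsigned variant follows. The sad helper, which accumulates the per-word results over two equal-length sequences of 32-bit words, is a hypothetical usage pattern added for illustration; the guard is omitted.

```python
# Illustrative model of ume8uu; not vendor code.
def ume8uu(rsrc1, rsrc2):
    """Sum of absolute differences of corresponding unsigned bytes."""
    a = (rsrc1 & 0xFFFFFFFF).to_bytes(4, "big")
    b = (rsrc2 & 0xFFFFFFFF).to_bytes(4, "big")
    return sum(abs(x - y) for x, y in zip(a, b))

def sad(words_a, words_b):
    """Hypothetical helper: accumulate ume8uu over two word sequences."""
    return sum(ume8uu(x, y) for x, y in zip(words_a, words_b))

print(hex(ume8uu(0x64649c9c, 0x649c649c)))
print(sad([0x0a14f6f6, 0x80808080], [0x1414ecf6, 0x7f7f7f7f]))
```

The same bit patterns give different sums under ume8ii and ume8uu because bytes such as 0x9c are read as 156 here rather than as -100.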

EXAMPLES

Initial Values | Operation | Result
r80 = 0x0a14f6f6, r30 = 0x1414ecf6 | ume8uu r80 r30 → r100 | r100 ← 0x14
r10 = 0, r80 = 0x0a14f6f6, r30 = 0x1414ecf6 | IF r10 ume8uu r80 r30 → r70 | no change, since guard is false
r20 = 1, r90 = 0x64649c9c, r40 = 0x649c649c | IF r20 ume8uu r90 r40 → r110 | r110 ← 0x70
r40 = 0x649c649c, r90 = 0x64649c9c | ume8uu r40 r90 → r120 | r120 ← 0x70
r50 = 0x80808080, r60 = 0x7f7f7f7f | ume8uu r50 r60 → r125 | r125 ← 0x4

[Figure: rsrc1 and rsrc2 are each divided into four unsigned 8-bit fields (bits 31:24, 23:16, 15:8, and 7:0); the absolute differences of corresponding fields are summed into the unsigned 32-bit rdest.]

SEE ALSO

ume8ii


umin Minimum of unsigned values

pseudo-op for uclipu

SYNTAX

[ IF rguard ] umin rsrc1 rsrc2 → rdest

FUNCTION

if rguard then {
    if rsrc1 > rsrc2 then
        rdest ← rsrc2
    else
        rdest ← rsrc1
}

ATTRIBUTES

Function unit dspalu

Operation code 76

Number of operands 2

Modifier No

Modifier range —

Latency 2

Issue slots 1, 3

DESCRIPTION

The umin operation returns the minimum value of rsrc1 and rsrc2. The arguments rsrc1 and rsrc2 are considered

unsigned integers.

The umin operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.
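The effect of the unsigned interpretation is easiest to see next to a signed one. The following Python sketch is illustrative only; as_signed is a hypothetical helper used for contrast and is not part of the operation.

```python
# Illustrative model of umin; not vendor code.
MASK32 = 0xFFFFFFFF

def umin(rsrc1, rsrc2):
    """Minimum of two 32-bit register values read as unsigned integers."""
    a, b = rsrc1 & MASK32, rsrc2 & MASK32
    return b if a > b else a

def as_signed(word):
    """Reinterpret a 32-bit pattern as a two's-complement integer."""
    return word - (1 << 32) if word & 0x80000000 else word

a, b = 0x80000000, 0x3fffff
print(hex(umin(a, b)))                   # unsigned minimum
print(hex(min(a, b, key=as_signed)))     # a signed reading picks the other operand
```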

EXAMPLES

Initial Values | Operation | Result
r30 = 0x80, r40 = 0x7f | umin r30 r40 → r50 | r50 ← 0x7f
r10 = 0, r60 = 0x12345678, r70 = 0xabc | IF r10 umin r60 r70 → r80 | no change, since guard is false
r20 = 1, r60 = 0x12345678, r70 = 0xabc | IF r20 umin r60 r70 → r90 | r90 ← 0xabc
r100 = 0x80000000, r110 = 0x3fffff | umin r100 r110 → r120 | r120 ← 0x3fffff

SEE ALSO

iclipi uclipi imin imax


umul Unsigned multiply

SYNTAX

[ IF rguard ] umul rsrc1 rsrc2 → rdest

FUNCTION

if rguard then
    temp ← zero_ext32to64(rsrc1) × zero_ext32to64(rsrc2)
    rdest ← temp<31:0>

ATTRIBUTES

Function unit ifmul

Operation code 138

Number of operands 2

Modifier No

Modifier range —

Latency 3

Issue slots 2, 3

DESCRIPTION

As shown below, the umul operation computes the product rsrc1×rsrc2 and writes the least-significant 32 bits of the

full 64-bit product into rdest. The operands are considered unsigned integers. No overflow or underflow detection is

performed.

The umul operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.
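A minimal Python sketch of the truncating multiply, with the guard omitted. Python integers are unbounded, so the full product is formed first and then masked to its low 32 bits.

```python
# Illustrative model of umul; not vendor code.
MASK32 = 0xFFFFFFFF

def umul(rsrc1, rsrc2):
    """Low 32 bits of the unsigned 64-bit product."""
    product = (rsrc1 & MASK32) * (rsrc2 & MASK32)
    return product & MASK32

print(hex(umul(0x100, 0xffffff9c)))   # upper product bits are silently dropped
```

Because no overflow detection is performed, a caller that needs the discarded upper half has to obtain it separately, for example with umulm.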

EXAMPLES

Initial Values | Operation | Result
r60 = 0x100 | umul r60 r60 → r80 | r80 ← 0x10000
r10 = 0, r60 = 0x100, r30 = 0xf11 | IF r10 umul r60 r30 → r50 | no change, since guard is false
r20 = 1, r60 = 0x100, r30 = 0xf11 | IF r20 umul r60 r30 → r90 | r90 ← 0xf1100
r70 = 0x100, r40 = 0xffffff9c | umul r70 r40 → r100 | r100 ← 0xffff9c00

[Figure: the unsigned 32-bit operands rsrc1 and rsrc2 are multiplied into a 64-bit result (bits 63:0); bits 31:0 of that result are written to the unsigned 32-bit rdest.]

SEE ALSO

imul imulm umulm dspimul

dspumul dspidualmul

quadumulmsb fmul


umulm Unsigned multiply, return most-significant 32 bits

SYNTAX

[ IF rguard ] umulm rsrc1 rsrc2 → rdest

FUNCTION

if rguard then
    temp ← zero_ext32to64(rsrc1) × zero_ext32to64(rsrc2)
    rdest ← temp<63:32>

ATTRIBUTES

Function unit ifmul

Operation code 140

Number of operands 2

Modifier No

Modifier range —

Latency 3

Issue slots 2, 3

DESCRIPTION

As shown below, the umulm operation computes the product rsrc1×rsrc2 and writes the most-significant 32 bits of

the 64-bit product into rdest. The operands are considered unsigned integers.

The umulm operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.
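The sketch below models umulm in Python and shows how its result combines with the low half of the product to recover all 64 bits. The function names mirror the operation names; the pairing itself is an illustrative usage pattern rather than something prescribed here.

```python
# Illustrative models of umulm and umul; not vendor code.
MASK32 = 0xFFFFFFFF

def umulm(rsrc1, rsrc2):
    """High 32 bits of the unsigned 64-bit product."""
    return ((rsrc1 & MASK32) * (rsrc2 & MASK32)) >> 32

def umul(rsrc1, rsrc2):
    """Low 32 bits of the same product."""
    return ((rsrc1 & MASK32) * (rsrc2 & MASK32)) & MASK32

a, b = 0x10001000, 0xf1100000
full = (umulm(a, b) << 32) | umul(a, b)
print(hex(umulm(a, b)), full == a * b)
```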

EXAMPLES

Initial Values | Operation | Result
r60 = 0x10000 | umulm r60 r60 → r80 | r80 ← 0x00000001
r10 = 0, r60 = 0x100, r30 = 0xf11 | IF r10 umulm r60 r30 → r50 | no change, since guard is false
r20 = 1, r60 = 0x10001000, r30 = 0xf1100000 | IF r20 umulm r60 r30 → r90 | r90 ← 0xf110f11
r70 = 0xffffff00, r40 = 0x100 | umulm r70 r40 → r100 | r100 ← 0xff

[Figure: the unsigned 32-bit operands rsrc1 and rsrc2 are multiplied into a 64-bit result (bits 63:0); bits 63:32 of that result are written to the unsigned 32-bit rdest.]

SEE ALSO

umul dspimul dspumul

dspidualmul quadumulmsb

fmul


uneq Unsigned compare not equal

pseudo-op for ineq

SYNTAX

[ IF rguard ] uneq rsrc1 rsrc2 → rdest

FUNCTION

if rguard then {
    if rsrc1 != rsrc2 then
        rdest ← 1
    else
        rdest ← 0
}

ATTRIBUTES

Function unit alu

Operation code 39

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The uneq operation is a pseudo operation transformed by the scheduler into an ineq. (Note: pseudo operations

cannot be used in assembly source files.)

The uneq operation sets the destination register, rdest, to 1 if the two arguments, rsrc1 and rsrc2, are not equal;

otherwise, rdest is set to 0.

The uneq operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values | Operation | Result
r30 = 3, r40 = 4 | uneq r30 r40 → r80 | r80 ← 1
r10 = 0, r60 = 0x1000, r30 = 3 | IF r10 uneq r60 r30 → r50 | no change, since guard is false
r20 = 1, r50 = 0x1000, r60 = 0x1000 | IF r20 uneq r50 r60 → r90 | r90 ← 0
r70 = 0x80000000, r40 = 4 | uneq r70 r40 → r100 | r100 ← 1
r70 = 0x80000000 | uneq r70 r70 → r110 | r110 ← 0

SEE ALSO

ineq igtr uneqi


uneqi Unsigned compare not equal with immediate

SYNTAX

[ IF rguard ] uneqi(n) rsrc1 → rdest

FUNCTION

if rguard then {
    if (unsigned)rsrc1 != (unsigned)n then
        rdest ← 1
    else
        rdest ← 0
}

ATTRIBUTES

Function unit alu

Operation code 40

Number of operands 1

Modifier 7 bits

Modifier range 0..127

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The uneqi operation sets the destination register, rdest, to 1 if the first argument, rsrc1, is not equal to the opcode

modifier, n; otherwise, rdest is set to 0. The arguments are treated as unsigned integers.

The uneqi operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values | Operation | Result
r30 = 3 | uneqi(2) r30 → r80 | r80 ← 1
r30 = 3 | uneqi(3) r30 → r90 | r90 ← 0
r30 = 3 | uneqi(4) r30 → r100 | r100 ← 1
r10 = 0, r40 = 0x100 | IF r10 uneqi(63) r40 → r50 | no change, since guard is false
r20 = 1, r40 = 0x100 | IF r20 uneqi(63) r40 → r100 | r100 ← 1
r60 = 0x80000000 | uneqi(127) r60 → r120 | r120 ← 1

SEE ALSO

uneq ineqi ueqli


writedpc Write destination program counter

SYNTAX

[ IF rguard ] writedpc rsrc1

FUNCTION

if rguard then {
    DPC ← rsrc1
}

ATTRIBUTES

Function unit fcomp

Operation code 160

Number of operands 1

Modifier No

Modifier range —

Latency 1

Issue slots 3

DESCRIPTION

The writedpc copies the value of rsrc1 to the DPC (Destination Program Counter) processor register. Whenever

a hardware update (during an interruptible jump) and a software update (through a writedpc) coincide, the

software update takes precedence.

Interruptible jumps write their target address to the DPC. The value of DPC is intended to be used by an

exception-handling routine as a jump address to resume execution of the program that was running before the exception was

taken.

The writedpc operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of DPC. If the LSB of rguard is 1, DPC is written; otherwise, DPC is unchanged.

EXAMPLES

Initial Values | Operation | Result
r30 = 0xbeebee | writedpc r30 | DPC ← 0xbeebee
r20 = 0, r31 = 0xabba | IF r20 writedpc r31 | no change, since guard is false
r21 = 1, r31 = 0xabba | IF r21 writedpc r31 | DPC ← 0xabba

SEE ALSO

readdpc writespc ijmpf

ijmpi ijmpt


writepcsw Write program control and status word

SYNTAX

[ IF rguard ] writepcsw rsrc1 rsrc2

FUNCTION

if rguard then {
    PCSW ← (PCSW & ~rsrc2) | (rsrc1 & rsrc2)
}

ATTRIBUTES

Function unit fcomp

Operation code 161

Number of operands 2

Modifier No

Modifier range —

Latency 1

Issue slots 3

DESCRIPTION

The writepcsw copies the value of rsrc1 to the PCSW (Program Control and Status Word) processor register

using rsrc2 as a mask. A bit in PCSW is affected by writepcsw only if the corresponding bit in rsrc2 is set to 1; the

value of any bit in PCSW with a corresponding 0-bit in rsrc2 will not be changed by writepcsw. Whenever a

hardware update (e.g., when a floating-point exception is raised) and a software update (through a writepcsw)

coincide, the PCSW bits currently being updated by hardware will reflect the hardware-determined value while the bits

not being affected by hardware will reflect the value in the writepcsw operand. The layout of PCSW is shown

below. The programmer should take care not to alter UNDEF fields in the PCSW.

Fields in the PCSW have two chief purposes: to control aspects of processor operation and to record events that

occur during program execution. Thus, writepcsw can be used to effect changes in some aspects of processor

operation and to clear fields that record events; this operation can also be used to restore state before resuming an

idled task in a multi-tasking environment. Note: The latency of writepcsw is 1, i.e., the PCSW reflects the new value in

the next cycle. However, updates to the exception flags and exception enable bits take an additional 3 cycles to take

effect in the hardware. Therefore, 3 delay slots (nops) must be inserted between writepcsw and the next interruptible

jump if exception flags or enable bits are changed. This guarantees that the new state is recognized in the interrupt

logic during execution of the ijump.

The writepcsw operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of PCSW. If the LSB of rguard is 1, PCSW is written; otherwise, PCSW is unchanged.
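The masked update in the FUNCTION section can be tried out with a short Python sketch. The constants IEN_MASK and IEEE_MODE_MASK are hypothetical names; their values (0x400 and 0x180) are taken from the mask operands used in the examples for this operation. The model covers only the software update and ignores coinciding hardware updates.

```python
# Illustrative model of the writepcsw masked update; not vendor code.
IEN_MASK = 0x400          # hypothetical name; mask used to clear IEN in the examples
IEEE_MODE_MASK = 0x180    # hypothetical name; mask used to set the rounding mode

def writepcsw(pcsw, rsrc1, rsrc2, rguard=1):
    """Return the new PCSW: bits selected by rsrc2 come from rsrc1, others are kept."""
    if not (rguard & 1):              # guard false: PCSW unchanged
        return pcsw
    return ((pcsw & ~rsrc2) | (rsrc1 & rsrc2)) & 0xFFFFFFFF

pcsw = 0x00000480                              # arbitrary starting value for the demo
pcsw = writepcsw(pcsw, 0x100, IEEE_MODE_MASK)  # change only the rounding-mode field
pcsw = writepcsw(pcsw, 0x0, IEN_MASK)          # then clear only IEN
print(hex(pcsw))
```

Passing a mask that covers only the field being changed is what keeps UNDEF fields and unrelated flags untouched.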

EXAMPLES

Initial Values | Operation | Result
r30 = 0x100, r40 = 0x180 | writepcsw r30 r40 | PCSW.IEEE MODE = to positive infinity
r20 = 0, r50 = 0x0, r60 = 0x400 | IF r20 writepcsw r50 r60 | no change, since guard is false
r21 = 1, r50 = 0x0, r60 = 0x400 | IF r21 writepcsw r50 r60 | PCSW.IEN = 0 (disable interrupts)
r70 = 0x80110000, r80 = 0xffff0000 | writepcsw r70 r80 | enable trap on MSE, INV and DBZ exclusively

[Figure: PCSW layout.
PCSW<15:0> fields: MSE (misaligned store exception), WBE (write back error), RSE (reserved exception), UNDEF, CS (count stalls, 1 = yes), IEN (interrupt enable, 1 = allow interrupts), BSX (byte sex, 1 = little endian), IEEE MODE (IEEE rounding mode: 0 = to nearest, 1 = to zero, 2 = to positive, 3 = to negative), and the FP exception flags OFZ, IFZ, INV, OVF, UNF, INX, DBZ.
PCSW<31:16> fields: TRP MSE (misaligned store exception trap enable), TRP WBE (write back error trap enable), TRP RSE (reserved exception trap enable), TFE (trap on first exit), UNDEF, and the FP exception trap-enable bits TRP OFZ, TRP IFZ, TRP INV, TRP OVF, TRP UNF, TRP INX, TRP DBZ.]

SEE ALSO

readpcsw fadd faddflags

ijmpf cycles hicycles


writespc Write source program counter

SYNTAX

[ IF rguard ] writespc rsrc1

FUNCTION

if rguard then
    SPC ← rsrc1

ATTRIBUTES

Function unit fcomp

Operation code 159

Number of operands 1

Modifier No

Modifier range —

Latency 1

Issue slots 3

DESCRIPTION

The writespc copies the value of rsrc1 to the SPC (Source Program Counter) processor register. Whenever a

hardware update (during an interruptible jump) and a software update (through a writespc) coincide, the software

update takes precedence.

An interruptible jump that is not interrupted (no NMI, INT, or EXC event was pending when the jump was executed)

writes its target address to SPC. The value of SPC is intended to allow an exception-handling routine to determine the

start address of the block of scheduled code (called a decision tree) that was executing before the exception was

taken.

The writespc operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of SPC. If the LSB of rguard is 1, SPC is written; otherwise, SPC is unchanged.

EXAMPLES

Initial Values | Operation | Result
r30 = 0xbeebee | writespc r30 | SPC ← 0xbeebee
r20 = 0, r31 = 0xabba | IF r20 writespc r31 | no change, since guard is false
r21 = 1, r31 = 0xabba | IF r21 writespc r31 | SPC ← 0xabba

SEE ALSO

readspc writedpc ijmpf

ijmpi ijmpt


zex16 Zero extend 16 bits

pseudo-op for pack16lsb

SYNTAX

[ IF rguard ] zex16 rsrc1 → rdest

FUNCTION

if rguard then
    rdest ← zero_ext16to32(rsrc1<15:0>)

ATTRIBUTES

Function unit alu

Operation code 53

Number of operands 1

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The zex16 operation is a pseudo operation transformed by the scheduler into a pack16lsb with 0 as the first

argument and rsrc1 as the second. (Note: pseudo operations cannot be used in assembly source files.)

As shown below, the zex16 operation zero extends the least-significant 16-bit halfword of the argument, rsrc1, to

32 bits and writes the result in rdest.

The zex16 operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.
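The pseudo-op transformation can be sketched in Python. The pack16lsb model below is an assumption based on the operation's name (the low halfword of the first argument goes to the upper half of the result, the low halfword of the second to the lower half); consult the pack16lsb page for its authoritative definition.

```python
# Illustrative model of zex16 via pack16lsb; not vendor code.
def pack16lsb(rsrc1, rsrc2):
    """Assumed semantics: result<31:16> = rsrc1<15:0>, result<15:0> = rsrc2<15:0>."""
    return ((rsrc1 & 0xFFFF) << 16) | (rsrc2 & 0xFFFF)

def zex16(rsrc1):
    """zex16 as emitted by the scheduler: pack16lsb with 0 as the first argument."""
    return pack16lsb(0, rsrc1)

print(hex(zex16(0xff0fff91)))
```

Under that assumption the zero first argument supplies the sixteen zero bits in the upper half of rdest.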

EXAMPLES

Initial Values | Operation | Result
r30 = 0xffff0040 | zex16 r30 → r60 | r60 ← 0x00000040
r10 = 0, r40 = 0xff0fff91 | IF r10 zex16 r40 → r70 | no change, since guard is false
r20 = 1, r40 = 0xff0fff91 | IF r20 zex16 r40 → r100 | r100 ← 0x0000ff91
r50 = 0x00000091 | zex16 r50 → r110 | r110 ← 0x00000091

[Figure: bits 15:0 of rsrc1 are copied to bits 15:0 of rdest; bits 31:16 of rdest are filled with zeros. The result is unsigned.]

SEE ALSO

sex16 sex8 zex8


zex8 Zero extend 8 bits

pseudo-op for ubytesel

SYNTAX

[ IF rguard ] zex8 rsrc1 → rdest

FUNCTION

if rguard then
    rdest ← zero_ext8to32(rsrc1<7:0>)

ATTRIBUTES

Function unit alu

Operation code 55

Number of operands 1

Modifier No

Modifier range —

Latency 1

Issue slots 1, 2, 3, 4, 5

DESCRIPTION

The zex8 operation is a pseudo operation transformed by the scheduler into a ubytesel with r0 (always

contains 0) as the first argument and rsrc1 as the second. (Note: pseudo operations cannot be used in assembly

source files.)

As shown below, the zex8 operation zero extends the least-significant byte of the argument, rsrc1, to 32 bits and

writes the result in rdest.

The zex8 operation optionally takes a guard, specified in rguard. If a guard is present, its LSB controls the

modification of the destination register. If the LSB of rguard is 1, rdest is written; otherwise, rdest is not changed.

EXAMPLES

Initial Values               Operation                 Result

r30 = 0xffff0040             zex8 r30 → r60            r60 ← 0x00000040

r10 = 0, r40 = 0xff0fff91    IF r10 zex8 r40 → r70     no change, since guard is false

r20 = 1, r40 = 0xff0fff91    IF r20 zex8 r40 → r100    r100 ← 0x00000091

r50 = 0x00000091             zex8 r50 → r110           r110 ← 0x00000091
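In C, the same zero extension is what an unsigned 8-bit cast followed by widening produces. The sketch below uses that equivalence; zex8_model and zex8_check_examples are hypothetical names, and the old_dest parameter is an assumption used only to express the guarded case.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical software model of zex8 (illustration only). */
uint32_t zex8_model(uint32_t guard, uint32_t src, uint32_t old_dest)
{
    if ((guard & 1u) == 0u) {
        return old_dest;            /* guard false: rdest is not changed */
    }
    /* Truncate to bits 7:0, then widen without sign extension. */
    return (uint32_t)(uint8_t)src;
}

/* Replays the rows of the EXAMPLES table.
 * 0xdeadbeef is an arbitrary "previous rdest" value. */
void zex8_check_examples(void)
{
    assert(zex8_model(1u, 0xffff0040u, 0u) == 0x00000040u);
    assert(zex8_model(0u, 0xff0fff91u, 0xdeadbeefu) == 0xdeadbeefu);
    assert(zex8_model(1u, 0xff0fff91u, 0u) == 0x00000091u);
    assert(zex8_model(1u, 0x00000091u, 0u) == 0x00000091u);
}
```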

[Figure: bits 7:0 of rsrc1 are copied to bits 7:0 of rdest; bits 31:8 of rdest are filled with zeros (unsigned).]

SEE ALSO

ubytesel sex16 sex8 zex16

zex8


PRELIMINARY SPECIFICATION B-1

MMIO Register Summary Chapter B

by Gert Slavenburg and Selliah Rathnam

B.1 MMIO REGISTERS

The following table lists all the MMIO registers implemented in PNX1300/01/02/11. The registers are grouped according to the unit to which they belong. For compatibility with future devices, any undefined MMIO bits should be ignored when read, and written as zeroes.
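As an illustration of how the table that follows might be used from software, the sketch below forms a register address from an offset and applies the rule for undefined bits. It is a minimal sketch under stated assumptions: the offsets are assumed to be relative to the start of the MMIO aperture, the helper names are hypothetical, and the per-register defined_mask must come from the individual register descriptions (it is not given in this summary).

```c
#include <assert.h>
#include <stdint.h>

/* Offsets copied from the register table (hex). */
#define MMIO_OFF_DRAM_BASE  0x100000u
#define MMIO_OFF_MMIO_BASE  0x100400u
#define MMIO_OFF_IMASK      0x100828u

/* The MMIO_BASE description calls the aperture 2 MB wide. */
#define MMIO_APERTURE_SIZE  0x200000u

/* Byte address of a register, ASSUMING offsets are added to the
 * start address of the MMIO aperture. */
uint32_t mmio_reg_address(uint32_t aperture_start, uint32_t offset)
{
    assert(offset < MMIO_APERTURE_SIZE);
    assert((offset & 3u) == 0u);    /* every listed offset is word aligned */
    return aperture_start + offset;
}

/* Keeps only the defined bits of a register value. Applied to a value
 * that was read, it ignores undefined bits; applied to a value about to
 * be written, it writes undefined bits as zeroes. */
uint32_t mmio_defined_bits(uint32_t value, uint32_t defined_mask)
{
    return value & defined_mask;
}
```

For example, with a hypothetical aperture start of 0xefe00000, mmio_reg_address(0xefe00000u, MMIO_OFF_IMASK) evaluates to 0xeff00828.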

MMIO Register Name   Offset (in hex)   Accessibility (DSPCPU)   Accessibility (External PCI Initiators)   Description

DSPCPU Registers

DRAM_BASE 10 0000 R/W R/W Start of DRAM address aperture

DRAM_LIMIT 10 0004 R/W R/W End of DRAM address aperture

MMIO_BASE 10 0400 R/W R/W Start of 2-MB MMIO-register address aperture

EXCVEC 10 0800 R/W R/W Interrupt vector (handler start address) for exceptions

ISETTING0 10 0810 R/W R/W Interrupt mode & priority settings for sources 0-7

ISETTING1 10 0814 R/W R/W Interrupt mode & priority settings for sources 8-15

ISETTING2 10 0818 R/W R/W Interrupt mode & priority settings for sources 16-23

ISETTING3 10 081c R/W R/W Interrupt mode & priority settings for sources 24-31

IPENDING 10 0820 R/W R/W Interrupt-pending status bit for all 32 sources

ICLEAR 10 0824 R/W R/W Interrupt-clear bit for all 32 sources

IMASK 10 0828 R/W R/W Interrupt-mask bit for all 32 sources

INTVEC0 10 0880 R/W R/W Interrupt vector (handler start address) for source 0

INTVEC1 10 0884 R/W R/W Interrupt vector (handler start address) for source 1

INTVEC2 10 0888 R/W R/W Interrupt vector (handler start address) for source 2

INTVEC3 10 088c R/W R/W Interrupt vector (handler start address) for source 3

INTVEC4 10 0890 R/W R/W Interrupt vector (handler start address) for source 4

INTVEC5 10 0894 R/W R/W Interrupt vector (handler start address) for source 5

INTVEC6 10 0898 R/W R/W Interrupt vector (handler start address) for source 6

INTVEC7 10 089c R/W R/W Interrupt vector (handler start address) for source 7

INTVEC8 10 08a0 R/W R/W Interrupt vector (handler start address) for source 8

INTVEC9 10 08a4 R/W R/W Interrupt vector (handler start address) for source 9

INTVEC10 10 08a8 R/W R/W Interrupt vector (handler start address) for source 10

INTVEC11 10 08ac R/W R/W Interrupt vector (handler start address) for source 11

INTVEC12 10 08b0 R/W R/W Interrupt vector (handler start address) for source 12

INTVEC13 10 08b4 R/W R/W Interrupt vector (handler start address) for source 13

INTVEC14 10 08b8 R/W R/W Interrupt vector (handler start address) for source 14

INTVEC15 10 08bc R/W R/W Interrupt vector (handler start address) for source 15

INTVEC16 10 08c0 R/W R/W Interrupt vector (handler start address) for source 16

INTVEC17 10 08c4 R/W R/W Interrupt vector (handler start address) for source 17

INTVEC18 10 08c8 R/W R/W Interrupt vector (handler start address) for source 18

INTVEC19 10 08cc R/W R/W Interrupt vector (handler start address) for source 19


INTVEC20 10 08d0 R/W R/W Interrupt vector (handler start address) for source 20

INTVEC21 10 08d4 R/W R/W Interrupt vector (handler start address) for source 21

INTVEC22 10 08d8 R/W R/W Interrupt vector (handler start address) for source 22

INTVEC23 10 08dc R/W R/W Interrupt vector (handler start address) for source 23

INTVEC24 10 08e0 R/W R/W Interrupt vector (handler start address) for source 24

INTVEC25 10 08e4 R/W R/W Interrupt vector (handler start address) for source 25

INTVEC26 10 08e8 R/W R/W Interrupt vector (handler start address) for source 26

INTVEC27 10 08ec R/W R/W Interrupt vector (handler start address) for source 27

INTVEC28 10 08f0 R/W R/W Interrupt vector (handler start address) for source 28

INTVEC29 10 08f4 R/W R/W Interrupt vector (handler start address) for source 29

INTVEC30 10 08f8 R/W R/W Interrupt vector (handler start address) for source 30

INTVEC31 10 08fc R/W R/W Interrupt vector (handler start address) for source 31

TIMER1_TMODULUS 10 0c00 R/W R/W Contains: (maximum count value for timer 1) + 1

TIMER1_TVALUE 10 0c04 R/W R/W Current value of timer 1 counter

TIMER1_TCTL 10 0c08 R/W R/W Timer 1 control (prescale value, source select, run bit)

TIMER2_TMODULUS 10 0c20 R/W R/W Contains: (maximum count value for timer 2) + 1

TIMER2_TVALUE 10 0c24 R/W R/W Current value of timer 2 counter

TIMER2_TCTL 10 0c28 R/W R/W Timer 2 control (prescale value, source select, run bit)

TIMER3_TMODULUS 10 0c40 R/W R/W Contains: (maximum count value for timer 3) + 1

TIMER3_TVALUE 10 0c44 R/W R/W Current value of timer 3 counter

TIMER3_TCTL 10 0c48 R/W R/W Timer 3 control (prescale value, source select, run bit)

SYSTIMER_TMODULUS 10 0c60 R/W R/W Contains: (maximum count value for system timer) + 1

SYSTIMER_TVALUE 10 0c64 R/W R/W Current value of system timer/counter

SYSTIMER_TCTL 10 0c68 R/W R/W System timer control (prescale value, source select, run bit)

BICTL 10 1000 R/W R/W Instruction breakpoint control

BINSTLOW 10 1004 R/W R/W Start of address range that causes instruction breakpoints

BINSTHIGH 10 1008 R/W R/W End of address range that causes instruction breakpoints

BDCTL 10 1020 R/W R/W Data breakpoint control

BDATAALOW 10 1030 R/W R/W Start of address range that causes data breakpoints

BDATAAHIGH 10 1034 R/W R/W End of address range that causes data breakpoints

BDATAVAL 10 1038 R/W R/W Compare value for data breakpoints

BDATAMASK 10 103c R/W R/W Compare mask for compare value for data breakpoints

Cache And Memory System

DRAM_CACHEABLE_LIMIT 10 0008 R/W R/W Start of non-cacheable region in DRAM

MEM_EVENTS 10 000c R/W R/W Selects two cache-related events for counting

DC_LOCK_CTL 10 0010 R/W R/W Enable bit for data-cache locking, also PCI hole disable

DC_LOCK_ADDR 10 0014 R/W R/W Start of address range that will be locked into the data cache

DC_LOCK_SIZE 10 0018 R/W R/W Size of address range that will be locked into the data cache

DC_PARAMS 10 001c R/— R/— Data-cache geometry (blocksize, associativity, # of sets)

IC_PARAMS 10 0020 R/— R/— Instruction-cache geometry (blocksize, assoc., # of sets)

MM_CONFIG 10 0100 R/— R/— DRAM settings (rank size, bus width, refresh interval)

ARB_BW_CTL 10 0104 R/W R/W Internal bus arbitration control (bandwidth/latency allocation)

ARB_RAISE 10 010C R/W R/W Arbiter Priority Raising timer

POWER_DOWN 10 0108 R/W R/W Write to this register to initiate power down

IC_LOCK_CTL 10 0210 R/W R/W Enable bit for instruction-cache locking

IC_LOCK_ADDR 10 0214 R/W R/W Start of address range that will be locked into the instruction cache

IC_LOCK_SIZE 10 0218 R/W R/W Size of address range that will be locked into the instruction cache

PLL_RATIOS 10 0300 R/— R/— Sets ratios of external and internal clock frequencies

BLOCK_POWER_DOWN 10 3428 R/W R/W Powers up and down individual blocks

Video In

VI_STATUS 10 1400 R/— R/— Status of video-in unit

VI_CTL 10 1404 R/W R/W Sets operation and interrupt modes for video in

VI_CLOCK 10 1408 R/W R/W Sets clock source (internal/external), frequency

VI_CAP_START 10 140c R/W R/W Sets capture start x and y offsets

VI_CAP_SIZE 10 1410 R/W R/W Sets capture size width and height

VI_Y_BASE_ADR / VI_BASE1 10 1414 R/W R/W Capture modes: sets base address of Y-value array; Message/raw modes: sets base address of buffer 1

VI_U_BASE_ADR / VI_BASE2 10 1418 R/W R/W Capture modes: sets base address of U-value array; Message/raw modes: sets base address of buffer 2

VI_V_BASE_ADR / VI_SIZE 10 141c R/W R/W Capture modes: sets base address of V-value array; Message/raw modes: sets size of buffers

VI_UV_DELTA 10 1420 R/W R/W Capture modes: address delta for adjacent U, V lines

VI_Y_DELTA 10 1424 R/W R/W Capture modes: address delta for adjacent Y lines

Video Out

VO_STATUS 10 1800 R/— R/— Status of video-out unit

VO_CTL 10 1804 R/W R/W Sets operation and interrupt modes for video out

VO_CLOCK 10 1808 R/W R/W Sets video-out clock frequency

VO_FRAME 10 180c R/W R/W Sets frame parameters (preset, start, length)

VO_FIELD 10 1810 R/W R/W Sets field parameters (overlap, field-1 line, field-2 line)

VO_LINE 10 1814 R/W R/W Sets field parameters (starting pixel, frame width)

VO_IMAGE 10 1818 R/W R/W Sets image parameters (height, width)

VO_YTHR 10 181c R/W R/W Sets threshold for YTR interrupt, image v/h offsets

VO_OLSTART 10 1820 R/W R/W Sets overlay image parameters (start line/pixel, alpha)

VO_OLHW 10 1824 R/W R/W Sets overlay image parameters (height, width)

VO_YADD 10 1828 R/W R/W Sets Y-component/buffer-1 starting address

VO_UADD 10 182c R/W R/W Sets U-component/buffer-2 starting address

VO_VADD 10 1830 R/W R/W Sets V-component address/buffer-1 length

VO_OLADD 10 1834 R/W R/W Sets overlay image address/buffer-2 length

VO_VUF 10 1838 R/W R/W Sets start-of-line-to-start-of-line address offsets (U, V)

VO_YOLF 10 183c R/W R/W Sets start-of-line-to-start-of-line addr. offsets (Y, overlay)

EVO_CTL 10 1840 R/W R/W Sets operations for enhanced video out

EVO_MASK 10 1844 R/W R/W Sets YUV mask values for the chroma-key process

EVO_CLIP 10 1848 R/W R/W Sets output clip values

EVO_KEY 10 184c R/W R/W Sets YUV chroma-key values

EVO_SLVDLY 10 1850 R/W R/W Sets delay cycles for genlock mode

Audio In

AI_STATUS 10 1c00 R/— R/— Status of audio-in unit

AI_CTL 10 1c04 R/W R/W Sets operation and interrupt modes for audio in

AI_SERIAL 10 1c08 R/W R/W Sets clock ratios and internal/external clock generation

AI_FRAMING 10 1c0c R/W R/W Sets format of serial data stream


AI_FREQ 10 1c10 R/W R/W Sets AI_OSCLK frequency

AI_BASE1 10 1c14 R/W R/W Sets base address of buffer 1

AI_BASE2 10 1c18 R/W R/W Sets base address of buffer 2

AI_SIZE 10 1c1c R/W R/W Sets number of samples in buffers

Audio Out

AO_STATUS 10 2000 R/— R/— Status of audio-out unit

AO_CTL 10 2004 R/W R/W Sets operation and interrupt modes for audio out

AO_SERIAL 10 2008 R/W R/W Sets clock ratios and internal/external clock generation

AO_FRAMING 10 200c R/W R/W Sets format of serial data stream

AO_FREQ 10 2010 R/W R/W Sets AO_OSCLK frequency

AO_BASE1 10 2014 R/W R/W Sets base address of buffer 1

AO_BASE2 10 2018 R/W R/W Sets base address of buffer 2

AO_SIZE 10 201c R/W R/W Sets number of samples in buffers

AO_CC 10 2020 R/W R/W Codec control field values

AO_CFC 10 2024 R/W R/W Codec Frame Control

AO_TSTAMP 10 2028 R/— R/W Timestamp of the last buffer

SPDIF Out

SDO_STATUS 10 4C00 R/— R/— Status register

SDO_CTL 10 4C04 R/W R/W Control register

SDO_FREQ 10 4C08 R/W R/W Frequency register

SDO_BASE1 10 4C0C R/W R/W Base address of buffer 1

SDO_BASE2 10 4C10 R/W R/W Base address of buffer 2

SDO_SIZE 10 4C14 R/W R/W Number of samples in buffers

SDO_TSTAMP 10 4C18 R/— R/— Timestamp of the last buffer

PCI Interface

BIU_STATUS 10 3004 R/— R/— Status of PCI interface (done/busy bits, error bits)

BIU_CTL 10 3008 R/W R/W Sets operation and interrupt modes for PCI

PCI_ADR 10 300c R/W —/— Holds address for DSPCPU PCI access

PCI_DATA 10 3010 R/W —/— Holds data for DSPCPU PCI access

CONFIG_ADR 10 3014 R/W R/W Holds address for configuration access

CONFIG_DATA 10 3018 R/W R/W Holds data for configuration access

CONFIG_CTL 10 301c R/W R/W Sets read/write, bus number for configuration access

IO_ADR 10 3020 R/W R/W Holds address for I/O access

IO_DATA 10 3024 R/W R/W Holds data for I/O access

IO_CTL 10 3028 R/W R/W Sets read/write, byte-enable for I/O access

SRC_ADR 10 302c R/W R/W Holds source address for DMA operation

DEST_ADR 10 3030 R/W R/W Holds destination address for DMA operation

DMA_CTL 10 3034 R/W R/W Sets read/write, transfer length for DMA operation

INT_CTL 10 3038 R/W R/W Controls interrupt system

XIO_CTL 10 3060 R/W R/W XIO control register

JTAG

JTAG_DATA_IN 10 3800 R/W R/W JTAG data input buffer

JTAG_DATA_OUT 10 3804 R/W R/W JTAG data output buffer

JTAG_CTL 10 3808 R/W R/W JTAG control

Image Co-Processor


ICP_MPC 10 2400 R/W R/W MicroProgram Counter

ICP_MIR 10 2404 R/W R/W Micro Instruction Register

ICP_DP 10 2408 R/W R/W Data Pointer

ICP_DR 10 2410 R/W R/W Data Register

ICP_SR 10 2414 R/W R/W Status Register

VLD Co-Processor

VLD_COMMAND 10 2800 R/W R/W Next action to be taken by VLD

VLD_SR 10 2804 R/— R/— Bitstream shift register

VLD_QS 10 2808 R/W R/W Quantization Scale Code

VLD_PI 10 280C R/W R/W Picture layer Information

VLD_STATUS 10 2810 R/W R/W Status Register

VLD_IMASK 10 2814 R/W R/W Controls which status bits cause VLD interrupts

VLD_CTL 10 2818 R/W R/W Control Register

VLD_BIT_ADR 10 281C R/W R/W Current Bitstream Read Address

VLD_BIT_CNT 10 2820 R/W R/W Bitstream remaining byte count

VLD_MBH_ADR 10 2824 R/W R/W Macro Block Header output address

VLD_MBH_CNT 10 2828 R/W R/W Macro Block Header output remaining count

VLD_RL_ADR 10 282C R/W R/W Run/Length output address

VLD_RL_CNT 10 2830 R/W R/W Run/Length output remaining count

I2C Interface

IIC_AR 10 3400 R/W R/W Address, Byte count and Direction

IIC_DR 10 3404 R/W R/W Data Register

IIC_STATUS 10 3408 R/— R/— Status Register

IIC_CTL 10 340C R/W R/W Control Register

Synchronous Serial Interface

SSI_CTL 10 2C00 R/W R/W Control Register

SSI_CSR 10 2C04 R/W R/W Additional Control and Status register

SSI_TXDR 10 2C10 —/W —/W Transmit Data Register

SSI_RXDR 10 2C20 R/— R/— Receive Data Register

SSI_RXACK 10 2C24 —/W —/W Write a ‘1’ here to ACK read of Receive Data Register

SEM Device

SEM 10 0500 R/W R/W Simple multi-processor semaphore


PRELIMINARY SPECIFICATION C-1

Endian-ness Appendix C

by Selliah Rathnam, Luis Lucas

C.1 PURPOSE

In this document, the generic PNX1300 name refers

to the PNX1300 Series, or the PNX1300/01/02/11

products.

PNX1300 was designed to support both Little and Big

Endian systems. The PCI system bus (controlled by the

PCI Interface Unit (BIU)) operates in Little Endian mode

in both systems. This document describes how the dual

endian-ness feature is handled in PNX1300.

C.2 LITTLE AND BIG ENDIAN

ADDRESSING CONVENTIONS

In Big Endian mode, a given word address (32-bit) base

corresponds to the most significant byte (MSB) of the

word. Increasing the byte address generally means de-

creasing the significance of the byte being accessed. In

Little Endian mode, the same word address base refers

to the least significant byte (LSB) of that word. Increasing

the byte address generally means increasing the signifi-

cance of the byte being accessed. This addressing con-

vention is shown in Figure C-1.

Figure C-1 includes two lines of ‘C’ code that assign a 32-bit constant in hex format to the variable ‘w’ (assuming ‘int’ is 32 bits) and copy its address into the byte (character) pointer variable ‘cp’. The byte referenced by ‘cp’ has the value ‘0x04’ on a Big Endian machine and the value ‘0x07’ on a Little Endian machine.

It is possible to convert from one endian-ness to the other just by swapping the bytes within a word, as shown in Figure C-2.

int w = 0x04050607;

char *cp = (char *)&w;

Figure C-1. Big and Little Endian address references

Bit:                  31                      0

Big Endian Mode:      04     05     06     07

                      cp+0   cp+1   cp+2   cp+3

Little Endian Mode:   04     05     06     07

                      cp+3   cp+2   cp+1   cp+0

Figure C-2. Data conversion from Big Endian to Little Endian (BSW)

int w = 0x04050607;

char *cp = (char *)&w;

Bit:                  31                      0

Big Endian Mode:      04     05     06     07

                      cp+0   cp+1   cp+2   cp+3

Little Endian Mode:   07     06     05     04

                      cp+3   cp+2   cp+1   cp+0
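The two figures can be reproduced on any host with a few lines of C. This is a minimal sketch: the function names bsw, first_byte_in_memory and host_is_big_endian are chosen here for illustration (BSW is the document's name for the swap, the C function is not part of any PNX1300 library).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* BSW: reverse the four bytes of a 32-bit word (Figure C-2). */
uint32_t bsw(uint32_t w)
{
    return (w >> 24) |
           ((w >> 8) & 0x0000ff00u) |
           ((w << 8) & 0x00ff0000u) |
           (w << 24);
}

/* Byte stored at the lowest address of w on the machine running this
 * code; this is the value of *cp in Figure C-1. */
uint8_t first_byte_in_memory(uint32_t w)
{
    uint8_t bytes[sizeof w];

    memcpy(bytes, &w, sizeof w);
    return bytes[0];
}

/* Returns 1 on a Big Endian machine and 0 on a Little Endian one. */
int host_is_big_endian(void)
{
    return first_byte_in_memory(0x04050607u) == 0x04u;
}
```

Applying bsw to 0x04050607 gives 0x07060504, and applying it twice returns the original word.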


C.3 TEST TO VERIFY THE CORRECT

OPERATION OF PNX1300 IN BIG AND

LITTLE ENDIAN SYSTEMS

The following test can be used to verify the correct oper-

ation of PNX1300 in Little Endian and Big Endian sys-

tems.

1. Store the 32-bit constant ‘0x04050607’ from the host CPU to the PNX1300 SDRAM through the PCI interface. Load the word from the same address into one of the PNX1300’s global registers and check for the same value.

2. Store the 32-bit constant ‘0x04050607’ from the host CPU to the PNX1300 SDRAM through the PCI interface. Load a byte from the same address into one of the PNX1300 global registers. Check for the value ‘0x04’ in Big Endian systems, and check for the value ‘0x07’ in Little Endian systems.

C.4 REQUIREMENT FOR THE PNX1300 TO

OPERATE IN EITHER LITTLE ENDIAN

OR BIG ENDIAN MODE

The endian-ness handling in each PNX1300 unit is described in the following sections. Most units use the highway/PCI bus to transfer data. The highway/PCI bus has four byte lanes. The bit assignment of the highway/PCI bus lanes is shown in Table C-2.

The PCI bus and PNX1300 highway buses are address-invariant buses, i.e. the data corresponding to address offset ‘0’ uses the byte-0 lane of the highway/PCI bus, the data corresponding to address offset ‘1’ uses the byte-1 lane of the highway/PCI bus, etc.
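The address-invariant rule amounts to selecting the lane from the two low-order address bits. The sketch below expresses that rule in C; the function names are hypothetical and the code models only the lane selection, not any bus protocol.

```c
#include <assert.h>
#include <stdint.h>

/* Lane number (0..3) used by the byte at byte address addr. */
unsigned byte_lane(uint32_t addr)
{
    return (unsigned)(addr & 3u);
}

/* Lowest bus bit of a lane: lane 0 -> bit 0, lane 3 -> bit 24,
 * matching the bit assignment in Table C-2. */
unsigned lane_low_bit(unsigned lane)
{
    assert(lane < 4u);
    return lane * 8u;
}

/* 32-bit bus image with one byte placed on the lane selected by addr. */
uint32_t place_on_lane(uint8_t byte, uint32_t addr)
{
    return (uint32_t)byte << lane_low_bit(byte_lane(addr));
}
```

For instance, place_on_lane(0x04, 0x00001001) yields 0x00000400, which corresponds to the pattern xxxx04xx used for byte accesses at address 00001001 in Table C-1 and Table C-3.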

C.4.1 Data Cache

The PNX1300 PCSW register has a byte-sex (BSX) bit

to configure the PNX1300 in Big Endian or Little Endian

mode. This bit must be set to ‘1’ for the Little Endian

mode as defined in Chapter 3, “DSPCPU Architecture.”

This BSX bit is used by the PNX1300 data cache unit for store/load operations. The data cache performs three categories of data transactions:

• Read/write data from/to DSPCPU registers to/from

data cache or SDRAM

• Read/write of MMIO data from/to DSPCPU registers

to/from MMIO registers

• Read/write data from/to DSPCPU registers to/from

PCI address space through special registers in the

BIU unit.

The DSPCPU endian-ness is determined by the value of the BSX bit in the PCSW register. Table C-1 and Table C-3 describe the data translation format used by the data cache to transfer data between a DSPCPU register and the data cache or SDRAM. Table C-1 and Table C-3 are restricted to addresses that fall in the DRAM_BASE to DRAM_LIMIT range.
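The translation listed in Table C-1 and Table C-3 can be summarised as: each byte goes to the lane selected by its address, and BSX decides which end of the register value goes to the lowest address. The following C sketch models that rule; dcache_bus_image is a hypothetical name, and lanes that the tables mark ‘x’ (not driven) are returned as zero here purely as a modelling choice.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the highway/Dcache/SDRAM image of a store.
 * reg:  value in the DSPCPU register (low `size` bytes are used)
 * size: access size in bytes: 1, 2 or 4
 * addr: byte address, aligned to size
 * bsx:  PCSW.BSX value, 1 = Little Endian, 0 = Big Endian */
uint32_t dcache_bus_image(uint32_t reg, unsigned size, uint32_t addr, int bsx)
{
    uint32_t image = 0u;
    unsigned i;

    assert(size == 1u || size == 2u || size == 4u);
    assert(addr % size == 0u);

    for (i = 0u; i < size; i++) {
        /* Little Endian: least-significant byte at the lowest address.
         * Big Endian: most-significant byte at the lowest address. */
        unsigned shift = bsx ? 8u * i : 8u * (size - 1u - i);
        uint32_t byte = (reg >> shift) & 0xffu;
        unsigned lane = (unsigned)((addr + i) & 3u);

        image |= byte << (8u * lane);
    }
    return image;
}
```

With bsx = 0, a word store of 0x01020304 to address 0x00001000 gives 0x04030201, and a half-word store of 0x0304 to address 0x00001002 gives 0x04030000, in agreement with the corresponding rows of Table C-3.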

No byte-swap is required for MMIO data transactions between DSPCPU registers and the MMIO registers. However, one of the special registers, PCI_DATA, does not follow the normal MMIO transactions. For the memory cycle, the data cache byte-swaps the data to/from the PCI_DATA register using the data translation format defined in Table C-1 and Table C-3.

For the PCI configuration cycle and I/O cycle transactions from the DSPCPU, a programmer can byte-swap the data in the DSPCPU registers and write to the PCI_DATA register using MMIO write operations. There is no byte-swap from the PCI_DATA register in the BIU unit to the PCI bus. Software uses the Table C-1 or Table C-3 data to byte-swap the data within the CPU register before writing the data to the PCI_DATA register for the configuration and I/O cycle transactions.

Table C-1. Little Endian data format in PNX1300 DSPCPU register, highway, SDRAM memory, PCI bus, host

memory, host CPU register

PCSW.BSX  Endian  Data transaction  Address   Data in   Data in highway/  Data in host  Data in host
value     mode    type                        DSPCPU    Dcache/SDRAM/     CPU register  memory
                                                        PCI-bus

1         Little  Word r/w          00001000  01020304  01020304          01020304      01020304

1         Little  Half-Word r/w     00001000  xxxx0304  xxxx0304          xxxx0304      xxxx0304

1         Little  Half-Word r/w     00001002  xxxx0304  0304xxxx          xxxx0304      0304xxxx

1         Little  Byte read/write   00001000  xxxxxx04  xxxxxx04          xxxxxx04      xxxxxx04

1         Little  Byte read/write   00001001  xxxxxx04  xxxx04xx          xxxxxx04      xxxx04xx

1         Little  Byte read/write   00001002  xxxxxx04  xx04xxxx          xxxxxx04      xx04xxxx

1         Little  Byte read/write   00001003  xxxxxx04  04xxxxxx          xxxxxx04      04xxxxxx

Column notation: DSPCPU and host CPU register data are shown msb to lsb. Highway/Dcache/SDRAM/PCI-bus data and host memory data are shown byte3 [31:24] to byte0 [7:0].

Table C-2. Bit assignment of the highway/PCI bus lanes

        byte 3   byte 2   byte 1   byte 0

Bits    31:24    23:16    15:8     7:0


C.4.2 Instruction Cache

It is assumed that the instruction cache always operates

in Little Endian regardless of the host and PNX1300 en-

dian-ness. Instruction cache does not use the PCSW’s

byte sex bit (BSX). The compiler supports the loading of

instructions in memory differently for Big Endian and Lit-

tle Endian modes.

C.4.3 PNX1300 PCI Interface Unit

The PNX1300 highway bus and the PCI bus are address-invariant buses, i.e. the data corresponding to address zero is always transferred through the byte-zero lane regardless of the endian-ness. The address-invariant nature of the PCI and the highway buses allows data to be transferred between the PCI bus and SDRAM without byte swapping in either Big or Little Endian mode. The byte swapping of data for Big Endian mode is performed by the data cache unit. However, MMIO data does not go through the byte swapper in the data cache. As a result, a byte-swapper in the BIU is used to byte-swap the MMIO data in Big Endian mode.

The PNX1300 BIU has a separate byte sex (SE, Swap Enabled) flag defined in its control register (BIU_CTL). This byte-sex flag must be set by software, i.e. by an MMIO write operation from the host CPU. The flag is used only for MMIO data accesses; none of the non-MMIO data accesses is affected by the SE flag. Table C-4 shows the byte-swap logic that handles the MMIO accesses from the DSPCPU and host CPU and the non-MMIO data accesses from any source.

The BIU has several special registers to handle memory, PCI configuration, I/O and DMA accesses. It does not byte-swap the I/O data from the special registers. The data cache and software perform the necessary byte swapping for this data.

When using PNX1300 in Little Endian-based systems,

the first transaction to the PNX1300 is to set the SE bit in

the BIU configuration register to avoid unnecessary soft-

ware byte-swapping in the host CPU for the subsequent

MMIO read/write accesses. The SE bit in the BIU_CTL

coming data from PCI bus. The default value of SE is ‘0’,

i.e. the BIU byte-swaps the MMIO data, including the write

operation to the BIU_CTL register. Software is required

to byte swap the BIU_CTL register value within the host

CPU before storing the value in the BIU_CTL register. Once

the BIU.SE bit has been set, no additional software byte-

swapping is required for further read/write operations to

any MMIO registers.
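The start-up sequence described above can be sketched from the point of view of a Little Endian host. This is a model of the swap behaviour listed in Table C-4 only: the function names are hypothetical, and the position of the SE bit within BIU_CTL is deliberately not encoded here because this appendix does not state it.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the four bytes of a 32-bit word (BSW, Figure C-2). */
uint32_t bsw(uint32_t w)
{
    return (w >> 24) |
           ((w >> 8) & 0x0000ff00u) |
           ((w << 8) & 0x00ff0000u) |
           (w << 24);
}

/* MMIO data as received by a register for an access from the PCI side:
 * byte-swapped while SE = 0, passed through once SE = 1 (Table C-4). */
uint32_t biu_mmio_from_pci(uint32_t bus_value, int se)
{
    return se ? bus_value : bsw(bus_value);
}

/* Value the host must drive so that the register receives `wanted`.
 * While SE = 0 the host pre-swaps in software; afterwards it does not. */
uint32_t host_value_for_mmio_write(uint32_t wanted, int se)
{
    return se ? wanted : bsw(wanted);
}
```

The first write to BIU_CTL is therefore issued with se = 0 (the pre-swapped value), and all later MMIO writes with se = 1.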

C.4.4 Image Coprocessor (ICP)

The input source data for the ICP unit might come from different units such as Video In, the DSPCPU, PCI bus, etc. via SDRAM. Data consistency needs to be maintained when the PNX1300 operates in Little or Big Endian systems/mode. The ICP needs the capability to operate on the SDRAM as source data and SDRAM or PCI as destination data in either Little or Big Endian mode.

Figure C-3, Figure C-4, Figure C-5 and Figure C-6 illus-

trate the Big and Little Endian memory image format for

the image input format (Figure C-3) and the three sup-

ported image overlay formats.

The ICP can output the data to either the SDRAM or PCI bus. RGB 8R and RGB 8A pixel formats are byte streams and therefore do not require any byte swapping. Figure C-9 pictures the data format. RGB-24, RGB-15, RGB-16 and YUV-4:2:2 pixel formats can be used to output the pixels to PCI or SDRAM in both Endian modes. Output formats are shown, respectively, in Figure C-4, Figure C-5, Figure C-8, and Figure C-7. Packed RGB-24 cannot be used in Big Endian mode. Its Little Endian data format is shown in Figure C-11.

Table C-3. Big Endian data format in the PNX1300 DSPCPU register, highway, SDRAM memory, PCI bus, host

memory, and host CPU register

PCSW.BSX  Endian  Data transaction  Address   Data in   Data in highway/  Data in host  Data in host
value     mode    type                        DSPCPU    Dcache/SDRAM/     CPU register  memory
                                                        PCI-bus

0         Big     Word r/w          00001000  01020304  04030201          01020304      01020304

0         Big     Half-word r/w     00001000  xxxx0304  xxxx0403          xxxx0304      0304xxxx

0         Big     Half-word r/w     00001002  xxxx0304  0403xxxx          xxxx0304      xxxx0304

0         Big     Byte read/write   00001000  xxxxxx04  xxxxxx04          xxxxxx04      04xxxxxx

0         Big     Byte read/write   00001001  xxxxxx04  xxxx04xx          xxxxxx04      xx04xxxx

0         Big     Byte read/write   00001002  xxxxxx04  xx04xxxx          xxxxxx04      xxxx04xx

0         Big     Byte read/write   00001003  xxxxxx04  04xxxxxx          xxxxxx04      xxxxxx04

Column notation: DSPCPU and host CPU register data are shown msb to lsb. Highway/Dcache/SDRAM/PCI-bus data are shown byte3 [31:24] to byte0 [7:0]. Host memory data are shown byte0 [31:24] to byte3 [7:0].

Table C-4. BIU.SE bit usage in processing data in BIU unit

BIU.SE   Endian   MMIO access     MMIO access      Non MMIO
value    Mode     from DSPCPU     from PCI side    data

0        Big      No byte-swap    byte-swap        No byte-swap

1        Little   No byte-swap    No byte-swap     No byte-swap


Figure C-3. Byte mask, planar YUV 4:2:0 and YUV 4:2:2 for ICP, VO or VI memory data in Little and Big Endian modes

[Figure: Y pixel byte data in memory (same for U, V, B). In both Big Endian and Little Endian modes, one 32-bit word holds Y3 Y2 Y1 Y0 and the next word holds Y7 Y6 Y5 Y4, at byte addresses A+3 A+2 A+1 A+0 (bits 31 to 0).]

Note: A+0 corresponds to the byte-0 lane of SDRAM/Hwy and A+3 corresponds to the byte-3 lane of SDRAM/Hwy.

Figure C-4. RGB-24+ data format for ICP in Little and Big Endian modes

[Figure: pixel word data in memory or PCI. Each 32-bit word holds one pixel, made up of R, G and B bytes plus a fourth byte; pixels 0 and 1 are shown across byte addresses A+3 A+2 A+1 A+0 (bits 31 to 0) for Big Endian Mode and Little Endian Mode.]

Note: A+0 corresponds to the byte-0 lane of SDRAM/Hwy/PCI and A+3 corresponds to the byte-3 lane of SDRAM/Hwy/PCI.

Figure C-5. RGB-15+ data format for ICP in Little and Big Endian modes

[Figure: pixel half-word data in memory or PCI. Each 32-bit word holds two pixels, Pn+1 and Pn; each pixel consists of the two bytes labelled RnG’n and GnBn. Pixels 0 to 3 are shown across byte addresses A+3 A+2 A+1 A+0 (bits 31 to 0) for Big Endian Mode and Little Endian Mode.]

Note: A+0 corresponds to the byte-0 lane of SDRAM/Hwy/PCI and A+3 corresponds to the byte-3 lane of SDRAM/Hwy/PCI.


Figure C-6. Packed YUV 4:2:2+ data format for the ICP or VO in Little and Big Endian modes

[Figure: pixel half-word data in memory or PCI. Each 32-bit word holds two half-words, Pn+1 and Pn; the byte sequence U0 Y0 V0 Y1 U1 Y2 V1 Y3 is shown across byte addresses A+3 A+2 A+1 A+0 (bits 31 to 0) for Big Endian Mode and Little Endian Mode.]

Note: A+0 corresponds to the byte-0 lane of SDRAM/Hwy/PCI and A+3 corresponds to the byte-3 lane of SDRAM/Hwy/PCI.

Figure C-7. Packed YUV 4:2:2 data format for ICP in Little and Big Endian modes

[Figure: pixel half-word data in memory or PCI. Each 32-bit word holds two half-words, Pn+1 and Pn; the byte sequence U0 Y0 V0 Y1 U1 Y2 V1 Y3 is shown across byte addresses A+3 A+2 A+1 A+0 (bits 31 to 0) for Big Endian Mode and Little Endian Mode.]

Note: A+0 corresponds to the byte-0 lane of SDRAM/Hwy/PCI and A+3 corresponds to the byte-3 lane of SDRAM/Hwy/PCI.

Figure C-8. RGB-16 data format for ICP in Little and Big Endian modes

[Figure: pixel half-word data in memory or PCI. Each 32-bit word holds two pixels, Pn+1 and Pn; each pixel consists of the two bytes labelled RnG’n and GnBn. Pixels 0 to 3 are shown across byte addresses A+3 A+2 A+1 A+0 (bits 31 to 0) for Big Endian Mode and Little Endian Mode.]

Note: A+0 corresponds to the byte-0 lane of SDRAM/Hwy/PCI and A+3 corresponds to the byte-3 lane of SDRAM/Hwy/PCI.


Figure C-9. RGB8A and RGB8R data format for ICP in Little and Big Endian modes

[Figure: RGB 8A or 8R data in memory or PCI. In both Big Endian and Little Endian modes, one 32-bit word holds P3 P2 P1 P0 and the next word holds P7 P6 P5 P4, at byte addresses A+3 A+2 A+1 A+0 (bits 31 to 0).]

Note: A+0 corresponds to the byte-zero lane of SDRAM/Hwy/PCI and A+3 corresponds to the byte-three lane of SDRAM/Hwy/PCI.

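Figure C-10. Byte swap within a half-word (BSH)

[Figure: a 32-bit word, bits 31 to 0, shown before and after the swap. The two bytes of each half-word are exchanged, so the byte sequences 04 05 06 07 and 05 04 07 06 map onto each other.]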

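Figure C-11. Packed RGB-24 data format for ICP in Little Endian mode only

[Figure: pixel word data in memory or PCI. Little Endian Mode: the R, G and B bytes of successive pixels are packed contiguously; the first 32-bit word holds B1 R0 G0 B0 at byte addresses A+3 A+2 A+1 A+0 (bits 31 to 0), and the following words continue the stream through R3 G3 B3. Big Endian Mode: NOT SUPPORTED.]

Note: A+0 corresponds to the byte-zero lane of SDRAM/Hwy/PCI and A+3 corresponds to the byte-three lane of SDRAM/Hwy/PCI.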


Table C-5 shows the byte-swap implementation of various pixel formats used in the ICP unit. Refer to Figure C-2 and Figure C-10 for the byte-swap codes used in Table C-5 and Table C-6. Byte-swapping is performed only in Big Endian mode. No swapping is done in Little Endian mode.
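The two swap codes and the rule that swapping happens only in Big Endian mode can be sketched in C as follows. The enum and function names are hypothetical; the sketch models the selection logic on a single 32-bit word and is not a description of the ICP hardware implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Swap codes used in the ICP byte-swapping tables. */
enum icp_swap { ICP_NO_SWAP, ICP_BSW, ICP_BSH };

/* BSW: reverse all four bytes of the word (Figure C-2). */
uint32_t bsw(uint32_t w)
{
    return (w >> 24) |
           ((w >> 8) & 0x0000ff00u) |
           ((w << 8) & 0x00ff0000u) |
           (w << 24);
}

/* BSH: exchange the two bytes inside each half-word (Figure C-10). */
uint32_t bsh(uint32_t w)
{
    return ((w >> 8) & 0x00ff00ffu) | ((w << 8) & 0xff00ff00u);
}

/* Apply the swap for one pixel type. l_bit is the ICP byte sex bit:
 * 0 = Big Endian (swap as listed), 1 = Little Endian (never swap). */
uint32_t icp_apply_swap(uint32_t w, enum icp_swap type, int l_bit)
{
    if (l_bit) {
        return w;
    }
    switch (type) {
    case ICP_BSW:
        return bsw(w);
    case ICP_BSH:
        return bsh(w);
    default:
        return w;
    }
}
```

For the word 0x04050607, BSH gives 0x05040706 and BSW gives 0x07060504; both operations are their own inverse.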

The ICP has a byte sex bit (L) defined in its MMIO-based configuration register. The setting of this bit and the BSX bit in the PCSW register should be the same. The L bit must be set by the software.

C.4.5 Video In (VI) and Video Out (VO) Units

The VI unit stores the YUV pixels in planar 4:2:2 or 4:2:0 image format as shown in Figure C-3 and stores the raw 8- and 10-bit data as shown in Figure C-12.

The VO unit uses YUV-4:2:2 planar, YUV-4:2:0 planar, and YUV-4:2:2+ packed as input pixel formats. The planar memory image formats of YUV-4:2:2 and YUV-4:2:0 are shown in Figure C-3. The YUV-4:2:2+ memory image format for overlay is pictured in Figure C-6.

The VI and VO units have a byte-sex bit (Little Endian and LTL_END) defined in the control MMIO registers, VI_CONTROL and VO_CONTROL. The setting of these byte-sex bits and the BSX bit in the PCSW register should be the same. The Little Endian and LTL_END bits must be set by software.

C.4.6 Audio In (AI), Audio Out (AO), and

SPDIF Out (SDO) Units

The AI unit uses 8-bit mono, 8-bit stereo, 16-bit mono and 16-bit stereo data. The AO unit uses 16-bit mono, 16-bit stereo, 32-bit mono and 32-bit stereo data. The SDO unit uses 32-bit word data. The memory image format of these data is presented in Figure C-13.

Swapping takes place at the byte level and the bits within a byte are never disturbed. Both the AI and AO units have a byte sex bit (LITTLE_ENDIAN) defined in each unit's MMIO-based configuration register. The setting of these bits and the BSX bit in the PCSW register should be the same. This byte sex bit must be set by the software.

C.4.7 Variable Length Decoder (VLD) Unit

The VLD inputs data from SDRAM in the form of a bit-

stream with a byte-aligned starting address and outputs

a header stream and a ‘run-level’ data stream. The VLD

unit has a byte sex bit (LITTLE_ENDIAN) defined in its

MMIO-based configuration register. The definition of this

Table C-5. ICP byte swapping type for input data

Endian-ness   L bit   Pixel Type     Swap Type (see Figure C-2 & Figure C-10)
Big Endian    0       Y,U,V planar   No swap
Big Endian    0       RGB 24+        BSW
Big Endian    0       YUV-4:2:2+     BSH
Big Endian    0       RGB 15+        BSH

Table C-6. ICP byte swapping type for output data

Endian-ness   L bit   Pixel Type         Swap Type (see Figure C-2 & Figure C-10)
Big Endian    0       RGB 8A: 233        No swap
Big Endian    0       RGB 8R: 332        No swap
Big Endian    0       RGB 15+            BSH
Big Endian    0       RGB 16             BSH
Big Endian    0       RGB 24+            BSW
Big Endian    0       RGB24 packed       No support for Big Endian
Big Endian    0       YUV-4:2:2 packed   BSH

Figure C-12. Memory image format for raw 8-bit and 10-bit data

[Figure: byte-lane layout (A+0 to A+3) of raw 8-bit data (Dn to Dn+3) and raw 10-bit data (Dn and Dn+1, each with an lsb and an msb) in memory, shown for Big Endian Mode and Little Endian Mode.]

Note: A+0 corresponds to byte-0 lane of SDRAM/Hwy and A+3 corresponds to byte-3 lane of SDRAM/Hwy. lsb is the Least Significant Byte; msb is the Most Significant Byte.

PNX1300/01/02/11 Data Book Philips Semiconductors

C-8 PRELIMINARY SPECIFICATION

bit and the BSX bit in the PCSW register should be the

same. This byte sex bit must be set by the software.

Figure C-14 describes the VLD input and output data format as seen in the SDRAM and highway bus. The input data is byte oriented and no swapping is required in the VLD unit. However, the output data is read by the DSPCPU in words, thus the VLD needs to swap the output bytes within a word (shown in Figure C-14) to compensate for the CPU swap.
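Because the VLD compensates for the CPU swap, software that loads an output word sees the same value in either endian mode. The C sketch below splits such a word into its run and level fields. The type run_level_t and the function unpack_run_level are hypothetical names, and the placement of the run value in the upper half-word is an assumption taken from the example values in Figure C-14 (run 0x1234, level 0x5678 at word address A), not a statement of the full VLD output format.

```c
#include <stdint.h>

/* Hypothetical container for one run-level pair. */
typedef struct {
    uint16_t run;    /* assumed to occupy the upper half-word */
    uint16_t level;  /* assumed to occupy the lower half-word */
} run_level_t;

/* Split a 32-bit word, as loaded by the CPU, into its two 16-bit fields.
   Field placement follows the example values of Figure C-14 and is an
   assumption of this sketch. */
static inline run_level_t unpack_run_level(uint32_t word)
{
    run_level_t rl;
    rl.run   = (uint16_t)(word >> 16);
    rl.level = (uint16_t)(word & 0xFFFFu);
    return rl;
}
```

With the example word from Figure C-14, unpack_run_level(0x12345678) yields run 0x1234 and level 0x5678.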

C.4.8 Synchronous Serial Interface (SSI)

The SSI unit has I/O connections through the external serial pins and also to the internal 32-bit data highway via MMIO transactions. The minimum quantity of data to be analyzed by the CPU is 16 bits (i.e., one half-word). The SSI uses a 16-bit or 1-bit endian-ness, which is detailed in Section 17.8 on page 17-7. The 32-bit quantity contained in the CPU register is written to or read from the SSI MMIO register ‘as is’. The EMS bit in SSI_CTL determines which half-word (16 bits) is sent first, as pictured in Figure C-15.
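The half-word ordering can be modeled in software as follows. This is a minimal sketch: the function ssi_halfword_order and its upper_first flag are hypothetical, and the sketch deliberately does not state which value of SSI_CTL.EMS selects which order; Figure C-15 and Section 17.8 remain the reference for that mapping.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: list the two half-words of a 32-bit SSI word in
   the order they would be sent. upper_first is an illustrative flag;
   its mapping to an SSI_CTL.EMS value is not asserted here. */
static inline void ssi_halfword_order(uint32_t word, bool upper_first,
                                      uint16_t out[2])
{
    uint16_t upper = (uint16_t)(word >> 16);
    uint16_t lower = (uint16_t)(word & 0xFFFFu);

    out[0] = upper_first ? upper : lower;  /* sent first  */
    out[1] = upper_first ? lower : upper;  /* sent second */
}
```

The 32-bit word itself is never modified, which mirrors the ‘as is’ transfer between the CPU register and the SSI MMIO register; only the order in which the two halves are taken changes.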

Figure C-13. Memory image format for audio data

[Figure: byte-lane layout (A+0 to A+3) in memory, shown for Big Endian Mode and Little Endian Mode, of 8-bit mono data (Ln to Ln+3), 16-bit mono data (Ln and Ln+1, each with an lsb and an msb), 8-bit stereo data (Ln, Rn, Ln+1, Rn+1), 16-bit stereo data (Ln and Rn, each with an lsb and an msb), and 32-bit data (lsb to msb).]

Note: A+0 corresponds to byte-zero lane of SDRAM/Hwy and A+3 corresponds to byte-three lane of SDRAM/Hwy. lsb is the least significant byte; msb is the most significant byte.

Figure C-15. SSI data format as seen in highway

[Figure: byte-lane layout (A+0 to A+3) of 16-bit half-word data (Dn and Dn+1, each with an lsb and an msb) in CPU/MMIOs, shown for SSI_CTL.EMS = 0 and SSI_CTL.EMS = 1.]

Note: A+0 corresponds to byte-0 lane of CPU/Hwy and A+3 corresponds to byte-3 lane of CPU/Hwy. lsb is the least significant byte; msb is the most significant byte.

C.4.9 Compiler

The TCS compiler supports loading instructions into memory differently for Big Endian and Little Endian modes.

C.5 SUMMARY

PNX1300 is required to operate in the same endian-ness as the host CPU. At reset, PNX1300 operates in Big Endian mode; no special steps are required to set the Endian bits in that mode. When using PNX1300 in Little Endian systems, the first transaction is to set the SE bit in the BIU_CTL register (see Section 11.6.5 on page 11-11).

C.6 REFERENCES

1. PCI Multimedia Design Guide, revision 1.0, dated March 29, 1994.

2. Designing PCI Cards and Drivers for Power Macintosh Computers, by Apple Computer, Inc.; Reference: R0650LL/A; Phone: 1-800-282-2732.

Figure C-14. VLD input and output data format

[Figure: byte-lane layout (A+0 to A+3), shown for Big Endian Mode and Little Endian Mode, of the VLD input data (Byte n to Byte n+3), the header output (bytes 12 34 56 78, Header = 0x12345678), and the run-level output at word address A (bytes 12 34 56 78, Run value = 0x1234, Level value = 0x5678).]

Note: A+0 corresponds to byte-0 lane of SDRAM/Hwy and A+3 corresponds to byte-3 lane of SDRAM/Hwy.

ABC HEDFGIJKLMNOPQRSTUVWXYZ

PRELIMINARY SPECIFICATION Index-1

Index

Numerics

12nc 1-10

A/D converter 8-1

Absolute maximum ratings 1-12

AC characteristics 1-12

address fields,instruction cache 5-8

address lines

driving capacity 12-7

address mapping

based on rank size 12-5, 12-6

DRAM memory system 12-5

instruction cache 5-8

picture 5-9

addressing modes 3-4

AI_BASE1

picture 8-5

AI_BASE2

picture 8-5

AI_CONTROL

field description table 8-6

AI_CTL

picture 8-5

AI_FRAMING

picture 8-5

AI_FREQ

picture 8-5

AI_OSCLK

description table 8-1

AI_SCK

description table 8-1

AI_SD

description table 8-1

AI_SERIAL

picture 8-5

AI_SIZE

picture 8-5

AI_STATUS

field description table 8-6

picture 8-5

AI_WS

description table 8-1

algorithms

image processing 14-6

of Enhanced Video Out Unit 7-10

algorithms, ICP 14-6

alignment 5-4

alloc A-4

allocate on write 5-4

allocd A-5

allocr A-6

allocx A-7

alpha

blending codes 14-5

byte for alpha blending 14-5

keying 14-9

registers 14-5

alpha blending 7-13, 14-1, 14-9

alpha blending codes 14-5

table 14-5

alpha value

for overlay pixel 14-9

AO_BASE1

picture 9-8

AO_BASE2

picture 9-8

AO_CC

picture 9-8

AO_CFC

picture 9-8

AO_CONTROL

field description table 9-9, 9-10

AO_CTL

picture 9-8

AO_FRAMING

picture 9-8

AO_FREQ

picture 9-8

AO_OSCLK

description table 9-2

AO_SCK

description table 9-2

AO_SERIAL

picture 9-8

AO_SIZE

picture 9-8

AO_STATUS

field description table 9-9

picture 9-8, 16-2

aperture

DRAM 5-2

memory 12-1

PCI 11-2

aperture,PCI 5-5

APERTURE_CONTROL field 5-5

asi A-8

asli A-9

asr A-10

asri A-11

audio capture 8-5

audio codec 8-1, 8-3

audio in unit

diagnostic mode 8-7

memory data formats 8-4

audio input 8-1

audio memory format 8-4

audio out unit

memory data formats 9-7

audio sample rate 8-2

audio test 8-7

bandwidth

requirements of ICP 14-1

base address

PCI interface registers 11-7

BDATAAHIGH

picture 3-14

BDATAALOW

picture 3-14

BDATAMASK

picture 3-14

BDATAVAL

picture 3-14

BDCTL

picture 3-14

BICTL

picture 3-14

binary compatibility 3-4

BINSTHIGH

picture 3-14

BINSTLOW

picture 3-14

bit masking 14-28

bitand A-12

bitandinv A-13

bitinv A-14

bitmap

masking 14-1

bitor A-15

bitxor A-16

BIU_CTL

PCI interface MMIO register 11-11

picture 11-10

BIU_STATUS

PCI interface MMIO register 11-11

picture 11-10

blending

alpha 14-1

blending codes

alpha blending 14-5

block timing

PCI output 14-16

boolean representation 3-3

borrow A-17

boundary scan 1-1

breakpoints 3-13

built-in self test

PCI interface register 11-7

byte ordering

DSPCPU 3-2

bytesex 3-2

cache

address mapping,instruction cache 5-8

alignment 5-3, 5-4

associativity 5-3

bandwidth requirements 5-1

block size 5-3

blocksize 5-3

byte in word 5-3

coherency 5-3, 5-4, 5-11

copyback 5-4

copyback operation 5-6

CPU stall 5-8

data cache characteristics,table 5-3

data cache initialization 5-8

data cache,description 5-3

dcb opcode 5-6

dinvalid opcode 5-6

dirty bit 5-4

dirty bits 5-3

dual port 5-4

endian-ness 5-3, 5-4

hidden concurrency 5-7

iclr operation 5-9

initialization 5-8

instruction cache 5-8

instruction cache coherency 5-9

instruction cache initialization and boot 5-10

instruction cache parameters 5-8

instruction cache summary 5-8

instruction cache tag 5-8

invalidate operation 5-6

latency 5-8

locking 5-3, 5-4

locking registers 5-5

LRU replacement 5-11

memory hole 5-5

miss processing order 5-4, 5-9

miss transfer order 5-3

MMIO registers summary 5-13

noncachable region 5-3

non-cacheable region 5-5

number of sets 5-3

operation ordering 5-7

overview 5-1

overview,memory system 5-1

parameters 5-3

partial word transfers 5-4

partial words 5-3

performance evaluation support 5-12

performance events

table 5-13

ports 5-3

rdstatus result format 5-6

rdtag result format 5-6

replacement policies 5-3, 5-4

replacement policy 5-9

scheduling constraint 5-4

set 5-3

size 5-3

special data cache operations 5-6

special opcodes 5-4

special operation ordering 5-7

status operations 5-6, 5-7

summary of characteristics 5-2

tag field of address 5-3

tag operations 5-6, 5-7

valid bits 5-3

word in set 5-3

write misses 5-4

cache line size

PCI interface register 11-7

carry A-18

CCCOUNT

definition 3-3

CCIR 656

line timing

description 7-4

pixel timing

description 7-4

video connector on Enhanced Video Out

Unit,picture 7-2

CCIR 656 frame timing

description 7-6

description table 7-6

CCIR 656 line timing

picture 7-5

CCIR 656 pixel timing

picture 7-5

CCIR656 serial D1 7-2

chroma

keying 14-1

Chroma keying 7-14

chroma keying 14-1, 14-9

circuit board design

guidelines 12-7

class code

PCI interface register 11-6

Clipping 7-14

codec 8-1

coherency 5-4

coherency,instruction cache 5-9

command ID

PCI interface register 11-3

compatibility

software 3-4

concurrency

PCI interface 11-3

concurrency,hidden 5-7

CONFIG_ADR

PCI interface MMIO register 11-12

picture 11-10

CONFIG_CTL

PCI interface MMIO register 11-13

picture 11-10

CONFIG_DATA

PCI interface MMIO register 11-13

configuration header 11-3

configuration operations

PCI interface 11-2

control word

ICP vertical filter 14-25

of ICP 14-23

conversion

interspersed to co-sited 7-11

to RGB 14-1

to YUV composite 14-1

YUV to RGB 14-3, 14-9

copyback 5-4

co-sited sampling 6-4

counter 3-12

CPU stall 5-8

curcycles A-19

cycles A-20

D1 serial 7-2

data address fields 5-3

data breakpoint 3-13

data cache

coherency 5-11

dcb operation 5-6

dinvalid operation 5-6

initialization 5-8

LRU replacement 5-11

performance evaluation support 5-12

rdstatus operation 5-6

rdtag operation 5-6

data cache locking registers 5-5

data format

planar 14-3

DC/AC Characteristics 1-12

DC_LOCK_ADDR

description table 5-13

DC_LOCK_CTL

description table 5-13

DC_LOCK_SIZE

description table 5-13

DC_PARAMS

description table 5-13

fields 5-3

picture 5-3

DC_PARAMS register 5-3

dcb 5-6, A-21

dcb operation 5-6

DDS 7-3, 8-2

debug frontend 18-3

debug support 3-13

DEST_ADR

PCI interface MMIO register 11-14

picture 11-10

device control 3-7

device ID

PCI interface register 11-3

device interrupts 3-11

diagnostic mode 8-7

audio in unit 8-7

dimensions 1-10

dinvalid 5-6, A-22

dinvalid operation 5-6

direct digital synthesizer 7-3, 8-2

dirty bit 5-4

dithering 14-10

algorithm 14-10

method 14-10

DMA operations

PCI interface 11-2

DMA_CTL

PCI interface MMIO register 11-14

picture 11-10

downscaling 14-1

DPC

definition 3-3

DRAM aperture 5-2

DRAM base 5-2

DRAM limit 5-2

DRAM memory system

address aperture 12-1

address mapping 12-5

circuit board design 12-7

example block diagrams 12-9

example configurations table 12-3

features 12-1

granularity and sizes 12-2

initialization 12-6

mode register setting 12-6

on-chip interleaving 12-6

output driver capacity 12-7

power down mode 12-7

programming 12-3

refresh 12-6

signal pins 12-5

supported devices 12-2

supported rank configurations 12-2

DRAM_BASE

description table 5-13

PCI interface MMIO register 11-9

PCI interface register 11-7

picture 5-2, 11-10

DRAM_BASE updates 11-10

DRAM_CACHEABLE_LIMIT

description table 5-13

picture 5-5

DRAM_LIMIT

description table 5-13

picture 5-2

DSPCPU

addressing modes 3-4

byte ordering 3-2

software compatibility 3-4

DSPCPU operations

listed alphabetically A-1

listed by function A-2

dspiabs A-23

dspiadd A-24

dspidualabs A-25

dspidualadd A-26

dspidualmul A-27

dspidualsub A-28

dspimul A-29

dspisub A-30

dspuadd A-31

dspumul A-32

dspuquadaddui A-33

dspusub A-34

dual port 5-4

EAV and SAV codes

description 7-5

EAV format 6-5

edge sensitive interrupts 3-10

endian-ness 5-4

endianness 3-2

Enhanced Video Out 7-1

Enhanced Video Out Unit

active video definition

picture 7-7

algorithms,overview 7-10

alpha blending 7-13

block diagram 7-3

CCIR 656 frame timing

description 7-6

description table 7-6

CCIR 656 line timing

description 7-4

picture 7-5

CCIR 656 pixel timing

description 7-4

picture 7-5

clock system 7-25

picture 7-3

connection to video encoder,picture 7-2

connection to video in unit,picture 7-3

connection,CCIR656,picture 7-2

data streaming 7-23

data transfer timing 7-9

dds 7-25

DDS and PLL setting,examples 7-25

error conditions 7-23

field definition

picture 7-7

frame definition

picture 7-7

frame timing signals 7-7

functions,summary 7-1

graphics overlay 7-22

graphics overlay formats 7-10

horizontal timing signals 7-7

image addressing 7-22

image definition

picture 7-7

image timing 7-4

interrupts 7-23

message passing 7-23

MMIO registers 7-14

NTSC 7-23

operating modes 7-13

operation,description 7-21

overlay definition

picture 7-7

PAL 7-23

pixel mirroring 7-11

PLL filter

block diagram 7-25

pll filter 7-25

progressive scan 7-6

summary of functions 7-1

timing generation

description 7-6

timing register

recommended values 7-23

video image data formats 7-9

YUV image format 7-9

YUV planar format 7-10

YUV upscaling 7-11

Enhanced Video Out unit

block diagram 7-3

clock system 7-3

interface pins 7-2

EVO

Enhanced Video Out Unit 7-1

EVO_CLIP

field description table 7-21

picture 7-20

EVO_CTL

field description table 7-20

picture 7-20

EVO_KEY

field description table 7-21

picture 7-20

EVO_MASK

field description table 7-21

picture 7-20

EVO_SLVDLY

field description table 7-21

picture 7-20

exceptions

definition 3-9

expansion ROM base address

PCI interface register 11-9

fabsval A-38

fabsvalflags A-39

fadd A-40

faddflags A-41

fdiv A-42

fdivflags A-43

feql A-44

feqlflags A-45

fgeq A-46

fgeqflags A-47

fgtr A-48

fgtrflags A-49

filter

5-tap 14-1

algorithm,ICP horizontal 14-22

algorithm,ICP vertical 14-24

coefficient,loading 14-22

horizontal 14-22

horizontal,parameter table 14-23

ICP vertical 14-24

ICP vertical,parameter table 14-24

parameter table,vertical 14-24

polyphase 14-1

SDRAM to SDRAM 14-24

SDRAM to SDRAM,horizontal 14-22

vertical 14-24

with RGB/YUV conversion 14-25

filtering

horizontal 14-1, 14-12, 14-15

horizontal,ICP 14-6

horizontal,method 14-11

ICP 14-6

ICP,5-tap 14-6

method 14-11

multi-tap 14-6

two dimensional 14-1

vertical 14-1

fleq A-50

fleqflags A-51

fles A-52

flesflags A-53

floating point

exception flags 3-2

IEEE rounding mode 3-2

representation 3-4

fmul A-54

fmulflags A-55

fneq A-56

fneqflags A-57

four-way LRU 5-11

frame timing signals 7-7

fsign A-58

fsignflags A-59

fsqrt A-60

fsqrtflags A-61

fsub A-62

fsubflags A-63

fullres capture mode

video in unit 6-1

description 6-4

funshift1 A-64

funshift2 A-65

funshift3 A-66

general purpose registers 3-1

general purpose timer/counter 3-12

Genlock 7-7

Genlock mode 7-8

granularity

memory 12-2

graphics overlay 7-10, 7-22

graphics overlay formats 7-10

grid

input 14-7

output 14-7

guarding

definition 3-5

h_dspiabs A-67

h_dspidualabs A-68

h_iabs A-69

h_st16d A-70

h_st32d A-71

h_st8d A-72

halfres capture mode

video in unit 6-1

description 6-9

handshake mechanism

JTAG 18-5

HBE 8-7

header type

PCI interface register 11-7

hicycles A-73

hidden concurrency 5-7

hierarchical LRU 5-4

highway latency

audio 8-7

horizontal

filtering 14-12

scaling 14-11, 14-15

horizontal filter 14-22

parameter,table 14-23

timing 14-12

horizontal filter to RGB parameter table 14-26

horizontal filtering 14-1, 14-15

horizontal scaling 14-1, 14-15

horizontal timing signals 7-7

huffman code 15-1

I/O buffer circuits 1-1

I/O operations

PCI interface 11-2

i2s 8-1

iabs A-74

iadd A-75

iaddi A-76

iavgonep A-77

ibytesel A-78

IC_LOCK_ADDR

description table 5-13

picture 5-10

IC_LOCK_CTL

description table 5-13

picture 5-10

IC_LOCK_SIZE

description table 5-13

picture 5-10

IC_PARAMS

description table 5-13

picture 5-8

IC_PARAMS fields 5-8

ICLEAR

picture 3-11

iclipi A-79

iclr 5-9, A-80

ICP

algorithms 14-6

alpha blending 14-9

bandwidth requirements 14-1

block diagram 14-1

chroma keying 14-9

coefficients,table 14-22

color keying 14-9

control word format 14-23

dithering 14-10

filter coefficient, loading 14-22

filter SDRAM to SDRAM 14-22

horizontal filter control word 14-27

horizontal filter parameter table 14-22

horizontal filter to RGB parameter table 14-26

horizontal filter with conversion 14-25

horizontal filter,algorithm 14-22, 14-25

horizontal filter,table 14-23

horizontal filtering 14-6, 14-15

horizontal scaling 14-15

image formats 14-3

image overlay formats 14-5

image overlay formats table 14-5

image resizing 14-6

image scaling 14-6

internal structure 14-1

lines mirroring 14-15

microprogram 14-16

missing pixels,filtering 14-6

move image 14-1

operation 14-16

output formats 14-5

output scaling,calculation method 14-8

overlay 14-9

parameter tables 14-22

PCI block timing 14-16

pixel mirroring 14-6

priority delay 14-20

programming 14-16

registers 14-17

scaling output resolution 14-7

SDRAM timing 14-15

status register,PD field 14-20

upscaling example 14-7

vertical filter 14-24

vertical filter algorithm 14-24

vertical filter control word 14-25

vertical filter parameter table 14-24

vertical filtering 14-6

YUV formats 14-3

YUV sequence counter 14-15

YUV to RGB conversion 14-9

ICP (image co-processor) 14-1

ICP_DP, MMIO register 14-17

ICP_DR, MMIO register 14-17

ICP_MIR, MMIO register 14-17

ICP_MPC, MMIO register 14-17

ICP_SR, MMIO register 14-17

ident A-81

IEEE 1149.1 1-1

IEEE rounding mode 3-2

ieql A-82

ieqli A-83

ifir16 A-84

ifir8ii A-85

ifir8ui A-86

ifixieee A-87

ifixieeeflags A-88

ifixrz A-89

ifixrzflags A-90

iflip A-91

ifloat A-92

ifloatflags A-93

ifloatrz A-94

ifloatrzflags A-95

igeq A-96

igeqi A-97

igtr A-98

igtri A-99

iimm A-100

iis 8-1

ijmpf A-101

ijmpi A-102

ijmpt A-103

ild16 A-104

ild16d A-105

ild16r A-106

ild16x A-107

ild8 A-108

ild8d A-109

ild8r A-110

ileq A-111

ileqi A-112

iles A-113

ilesi A-114

image

ICP input format 14-3

processing algorithms 14-6

resizing 14-6

scaling 14-6

scaling factor range 14-3

size range 14-3

Image co-processor

block diagram 14-1

image co-processor 14-1

block diagram 14-2

image formats 14-3

image overlay 14-1, 14-5, 14-9

image overlay formats

of ICP,table 14-5

image processing

bandwidth 14-1

IMASK

picture 3-11

imax A-115

imin A-116

imul A-117

imulm A-118

ineg A-119

ineq A-120

ineqi A-121

initialization

DRAM memory system 12-6

instruction cache 5-10

initialization,cache 5-8

inonzero A-122

input format

ICP 14-3

input grid

relating to output grid 14-7

instruction breakpoint 3-13

instruction cache 5-8

address mapping 5-8

picture 5-9

coherency 5-11

initialization and boot 5-10

LRU replacement 5-11

performance evaluation support 5-12

instruction cache parameters 5-8

instruction cache set 5-8

instruction cache tag 5-8

instruction cache,summary 5-8

INT_CTL

PCI interface MMIO register 11-15

picture 3-12, 11-10

integer representation 3-4

interleaving

of SDRAM 12-6

interrupt line

PCI interface register 11-9

interrupt mask 3-10

interrupt mode 3-10

interrupt pin

PCI interface register 11-9

interrupt priority 3-10

interrupt vectors 3-9

interrupts 3-9

definition 3-9

DSPCPU enable bit 3-2

interspersed sampling 6-5

intervals

refresh 12-6

INTVEC[31:0]

picture 3-9

IO_ADR

PCI interface MMIO register 11-13

picture 11-10

IO_CTL

PCI interface MMIO register 11-13

picture 11-10

IO_DATA

PCI interface MMIO register 11-13

picture 11-10

IPENDING

picture 3-11

IS 11172-2 references 15-3

IS 13818-2 references

table 15-3

ISETTING0

picture 3-10

ISETTING1

picture 3-10

ISETTING2

picture 3-10

ISETTING3

picture 3-10

isub A-123

isubi A-124

izero A-125

jmpf A-126

jmpi A-127

jmpt A-128

JTAG

additional registers

picture 18-4

BYPASS instruction 18-2

communication protocol 18-5

example data transfer 18-5

EXTEST instruction 18-2

instruction encodings

table 18-2

instructions

SEL_DATA_IN 18-5

SEL_DATA_OUT 18-5

SEL_IFULL_IN 18-5

SEL_JTAG_CTRL 18-5

SEL_OFULL_OUT 18-5

MACRO instruction 18-3

MMIO registers

table 18-4

overview 18-1

race condition,avoid 18-5

RESET instruction 18-2

SAMPLE/PRELOAD instruction 18-2

SEL_DATA_IN instruction 18-2

SEL_DATA_OUT instruction 18-3

SEL_IFULL_IN instruction 18-3

SEL_JTAG_CTRL instruction 18-3

SEL_OFULL_OUT instruction 18-3

system components 18-3

TAP controller description 18-1

TAP controller state diagram,picture 18-2

test access port 18-1

test clock 18-1, 18-3

test data in 18-1

test data out 18-1

test mode select 18-1

virtual registers 18-4

JTAG_CTRL

JTAG_DATA_IN

JTAG_DATA_OUT

JTAG_IFULL_IN 18-4

JTAG_OFULL_OUT 18-4

keying

chroma 14-9

color 14-9

latency timer

PCI interface register 11-7

latency,memory operation 5-8

ld32 A-129

ld32d A-130

ld32r A-131

ld32x A-132

level sensitive interrupts 3-10

lines

mirroring 14-15

load coefficients parameter table 14-22

load store ordering 3-3, 3-5, 3-7, 5-5, 17-4, 17-6

locking conditions 5-4

locking range 5-4

LRU bit definition 5-12

LRU bit definitions,picture 5-12

LRU bit update ordering 5-12

LRU initialization 5-12

LRU replacement,cache 5-11

LRU, hierarchical 5-4

LRU,four-way 5-11

LRU,two-way 5-11

lsl A-133

lsli A-134

lsr A-135

lsri A-136

macro block header 15-1

macroblock header, standard references 15-3

main image 14-9

max_lat

PCI interface register 11-9

Maximum Ratings 1-12

MEM_EVENTS

description table 5-13

picture 5-12

memory

operation ordering 5-7

memory data formats

audio in unit 8-4

audio out unit 9-7

memory format

audio 8-4

memory hole 5-5

memory map 3-7

picture 3-7

memory mapped devices 3-7

mergelsb A-138

mergemsb A-139

message passing mode

video in unit

description 6-11

message-passing mode

video in unit 6-1

description 6-11

min_gnt

PCI interface register 11-9

mirroring

lines 14-15

pixels 14-12

misaligned

store 3-3

miss processing, order 5-9

MM_A[11:0]

description table 12-5

MM_CAS#

description table 12-5

MM_CKE[3:0]

description table 12-5

MM_CLK[1:0]

description table 12-5

MM_CS#[3:0]

description table 12-5

MM_DQ[31:0]

description table 12-5

MM_DQM

description table 12-5

MM_RAS#

description table 12-5

MM_WE#

description table 12-5

mmio 3-7

MMIO aperture

picture 3-8

MMIO references,non-cached 5-8

MMIO registers

AI_BASE1

picture 8-5

AI_BASE2

picture 8-5

AI_CONTROL

field description table 8-6

AI_CTL

picture 8-5

AI_FRAMING

picture 8-5

AI_FREQ

picture 8-5

AI_SERIAL

picture 8-5

AI_SIZE

picture 8-5

AI_STATUS

field description table 8-6

picture 8-5

AO_BASE1

picture 9-8

AO_BASE2

picture 9-8

AO_CC

picture 9-8

AO_CFC

picture 9-8

AO_CONTROL

field description table 9-9, 9-10

AO_CTL

picture 9-8

AO_FRAMING

picture 9-8

AO_FREQ

picture 9-8

AO_SERIAL

picture 9-8

AO_SIZE

picture 9-8

AO_STATUS

field description table 9-9

picture 9-8, 16-2

BDATAAHIGH

picture 3-14

BDATAALOW

picture 3-14

BDATAMASK

picture 3-14

BDATAVAL

picture 3-14

BDCTL

picture 3-14

BICTL

picture 3-14

BINSTHIGH

picture 3-14

BINSTLOW

picture 3-14

BIU_CTL 11-11

picture 11-10

BIU_STATUS 11-11

picture 11-10

cache registers summary 5-13

CONFIG_ADR 11-12

picture 11-10

CONFIG_CTL 11-13

picture 11-10

CONFIG_DATA 11-13

DC_LOCK_ADDR

description table 5-13

picture 5-5

DC_LOCK_CTL

description table 5-13

picture 5-5

DC_LOCK_SIZE

description table 5-13

picture 5-5

DC_PARAMS 5-3

description table 5-13

fields 5-3

picture 5-3

DEST_ADR 11-14

picture 11-10

DMA_CTL 11-14

picture 11-10

DRAM_BASE 11-9

description table 5-13

picture 5-2, 11-10

DRAM_CACHEABLE_LIMIT

description table 5-13

picture 5-5

DRAM_LIMIT

description table 5-13

picture 5-2

EVO_CLIP

picture 7-20

EVO_CTL

picture 7-20

EVO_KEY

picture 7-20

EVO_MASK

picture 7-20

EVO_SLVDLY

picture 7-20

for VLD 15-4

IC_LOCK_ADDR

description table 5-13

picture 5-10

IC_LOCK_CTL

description table 5-13

picture 5-10

IC_LOCK_SIZE

description table 5-13

picture 5-10

IC_PARAMS

description table 5-13

fields 5-8

picture 5-8

ICLEAR

picture 3-11

ICP_DP 14-17

ICP_DR 14-17

ICP_MIR 14-17

ICP_MPC 14-17

ICP_SR 14-17

IMASK

picture 3-11

INT_CTL 11-15

picture 3-12, 11-10

INTVEC[31:0]

picture 3-9

IO_ADR 11-13

picture 11-10

IO_CTL 11-13

picture 11-10

IO_DATA 11-13

picture 11-10

IPENDING

picture 3-11

ISETTING0

picture 3-10

ISETTING1

picture 3-10

ISETTING2

picture 3-10

ISETTING3

picture 3-10

JTAG registers 18-4

JTAG_CTRL 18-4

JTAG_DATA_IN 18-4

JTAG_DATA_OUT 18-4

MEM_EVENTS

description table 5-13

picture 5-12

MM_CONFIG

picture 12-4

MMIO_BASE 11-9

description table 5-13

picture 11-10

of Enhanced Video Out Unit 7-14

of ICP 14-17

PCI interface

accessibility 11-11

PCI_ADR 11-12

picture 11-10

PCI_DATA 11-12

picture 11-10

PLL_RATIOS

picture 12-4

SCR_ADR

picture 11-10

setup of SSI_CTL 17-6

SPDO_BASE1

picture 10-5

SPDO_BASE2

picture 10-5

SPDO_CTL

picture 10-5

SPDO_FREQ

picture 10-5

SPDO_SIZE

picture 10-5

SPDO_STATUS

picture 10-5

SPDO_TSTAMP

picture 10-5

SRC_ADR 11-14

SSI_CSR

fields description 17-11

SSI_CTL

fields description 17-9

summary table B-1

TCTL

picture 3-13

TMODULUS

picture 3-13

TVALUE

picture 3-13

VI_BASE1

alignment 6-11

picture 6-10

VI_BASE2

alignment 6-11

picture 6-10

VI_CAP_SIZE

picture 6-8

VI_CAP_START

picture 6-8

VI_CLOCK

picture 6-8, 6-10

VI_CTL

picture 6-8, 6-10

VI_SIZE

picture 6-10

VI_STATUS

picture 6-8, 6-10

VI_U_BASE_ADR

picture 6-8

VI_UV_DELTA

picture 6-8

VI_V_BASE_ADR

picture 6-8

VI_Y_BASE_ADR

picture 6-8

VI_Y_DELTA

picture 6-8

video in, view in raw and message passing mode

picture 6-10

video in,YUV capture 6-8

VLD unit,picture 15-6

VO_CLOCK

common values 7-23

picture 7-15

VO_CTL

fields description table 7-17

picture 7-15

VO_FIELD

default values 7-23

picture 7-15

VO_FRAME

default values 7-23

picture 7-15

VO_IMAGE

default values 7-23

picture 7-15

VO_LINE

default values 7-23

picture 7-15

VO_OLADD

field description table 7-19

picture 7-15

VO_OLHW

picture 7-15

VO_OLSTART

picture 7-15

VO_STATUS

picture 7-15

VO_UADD

field description table 7-19

picture 7-15

VO_VADD

field description table 7-19

picture 7-15

VO_VUF

picture 7-15

VO_YADD

picture 7-15

VO_YOLF

field description table 7-19

picture 7-15

VO_YTHR

picture 7-15

VO_YUF

field description table 7-19

MMIO_BASE

description table 5-13

PCI interface MMIO register 11-9

PCI interface register 11-7

picture 11-10

MMIO_BASE updates 11-10

MPEG bitstream 15-1

MPEG-1 macroblock header 15-3

MPEG-1 macroblock header,output format 15-4

MPEG-1 standard references 15-3

MPEG-2 macroblock header 15-3

MPEG-2 macroblock header,output format 15-2

MPEG-2 standard

references

table 15-3

multi-tap FIR filtering 14-6

New features 1-1

non cacheable region 5-5

noncachable region 5-3

non-interlaced scan 7-6

non-maskable interrupt 3-10

nop A-140

NTSC 7-23

offset byte in set 5-8

operation ordering,special 5-7

operations

DSPCPU A-1, A-2

order,miss processing 5-9

ordering

memory operations 5-7

Ordering Information 1-10

ordering,special operation 5-7

output formats

ICP 14-5

output grid

relating to input grid 14-7

output scaling

calculation 14-8

overlap configuration of windows 14-1

overlay

blending 14-9

of image 14-1

overlay formats

of ICP 14-5

overlay image 14-9

overlay, image 14-5, 14-9

overlays

computer generated 14-9

oversampling A/D converter 8-2

pack16lsb A-141

pack16msb A-142

package outline 1-10

package,BGA package 1-10

packbytes A-143

PAL 7-23

parameter table

ICP horizontal filter 14-23

parameter tables

horizontal filter to RGB 14-26

ICP 14-22

vertical filter 14-24

Part Number 1-10

partial words 5-4

PCI

aperture 11-2

output block timing 14-16

space 11-2

PCI aperture 5-5

PCI configuration space 11-3

PCI header 11-3

PCI interface

characteristics overview 11-1

concurrency 11-3

configuration header 11-3

configuration operations 11-2

configuration registers 11-3

DMA operations 11-2

I/O operations 11-2

initiator 11-2

limitations 11-17

ordering 11-3

priorities 11-3

registers

base addresses 11-7

built-in self test 11-7

cache line size 11-7

class code 11-6

command

fields 11-5

command ID 11-3

device ID 11-3

DRAM_BASE 11-7

expansion ROM base address 11-9

header type 11-7

interrupt line 11-9

interrupt pin 11-9

latency timer 11-7

max_lat 11-9

min_gnt 11-9

MMIO_BASE 11-7

revision ID 11-6

status 11-5

fields 11-6

vendor ID 11-3

single word load/store 11-2

target of operations 11-3

PCI references,non-cached 5-8

PCI_ADR

PCI interface MMIO register 11-12

picture 11-10

PCI_DATA

PCI interface MMIO register 11-12

picture 11-10

PCSW

definition 3-2

performance events,cache 5-13

Philips Part Number 1-10

pins

AI_OSCLK

description table 8-1

AI_SCK

description table 8-1

AI_SD

description table 8-1

AI_WS

description table 8-1

AO_OSCLK

description table 9-2

AO_SCK

description table 9-2

complete list 1-2

DC/AC Characteristics 1-12

I/O circuit summary 1-1

MM_CAS#

description table 12-5

MM_CLK[1:0]

description table 12-5

MM_CS#[3:0]

description table 12-5

MM_DQ[31:0]

description table 12-5

MM_DQM

description table 12-5

MM_RAS#

description table 12-5

MM_WE#

description table 12-5

package 1-10

SPDO

description table 10-1

timing 1-19, 1-20, 1-21

VI_CLK

description table 6-2

VI_DATA[7:0]

description table 6-2

VI_DATA[8] 6-11

VI_DATA[9:8]

description table 6-2

VI_DATA[9] 6-11

VI_DVALID

description table 6-2

VO_CLK

description table 7-3

VO_DATA[7:0]

description table 7-3

VO_IO1

description table 7-3

VO_IO2

description table 7-3

pixel

mirroring 14-6

missing 14-6

shift bypassing for downscaling 14-8

transformation,scaling 14-7

pixel mirroring 7-11

pixels

mirroring 14-12

planar

data format 14-3

PLL filter

of video out 7-25

polyphase filter 14-1

power down mode

DRAM memory system 12-7

of SDRAM 12-7

pref A-144

pref16x A-145

pref32x A-146

prefd A-147

prefr A-148

priority delay 14-20

Progressive scan 7-6

quadavg A-149, A-150

quadumulmsb A-151, A-152

quasi-dual 5-4

rank size

vs. address mapping 12-5, 12-6

raw capture modes

video in unit

description 6-10

raw10s capture mode

video in unit 6-1

raw10u capture mode

video in unit 6-1

raw8 capture mode

video in unit 6-1

rdstatus A-153

result format 5-6

rdstatus operation 5-6

result format picture 5-6

rdtag A-154

result format 5-6

rdtag operation 5-6

result format picture 5-6

readdpc A-155

readpcsw A-156

readspc A-157

refresh

DRAM memory system 12-6

intervals 12-6

region

noncacheable 5-3

region,non-cacheable 5-5

replacement 5-4

representation

boolean 3-3

floating point 3-4

integer 3-4

rescaling of images 14-1

resizing

horizontal 14-1

in ICP 14-6

vertical 14-1

revision ID

PCI register 11-6

RGB conversion 14-1

rol A-158

roli A-159

run-level output data 15-1

sample rate 8-1, 8-2

SAV and EAV codes

description 7-5

description table 7-6

format

picture 7-5

SAV format 6-5

scaling 14-6

algorithm 14-8

horizontal 14-1, 14-11, 14-15

horizontal,method 14-11

method 14-11

range 14-3

shift bypassing 14-8

two dimensional 14-1

vertical 14-1, 14-13

SDRAM 12-2

supported devices 12-2, 13-7

SDRAM memory system

timing budget 12-8

sequence counter

YUV 14-15

serial CCIR656 7-2

serial frame 8-1, 8-3

Serial Interface 17-1

sex16 A-160

sex8 A-161

SGRAM 12-2

supported devices 12-2, 13-7

size of image,range 14-3

software compatibility 3-4

software interrupt 3-11

SPC definition 3-3

SPDO

description table 10-1

SPDO_BASE1

picture 10-5

SPDO_BASE2

picture 10-5

SPDO_CTL

picture 10-5

SPDO_FREQ

picture 10-5

SPDO_SIZE

picture 10-5

SPDO_STATUS

picture 10-5

SPDO_TSTAMP

picture 10-5

speculative loads 3-3, 3-5, 3-7, 5-5, 17-4, 17-6

SRC_ADR

PCI interface MMIO register 11-14

picture 11-10

SSI_CTL

field description 17-9

st16 A-162

st16d A-163

st32 A-164

st32d A-165

st8 A-166

st8d A-167

stall,CPU 5-8

status

PCI interface register 11-5

status operations,cache 5-6, 5-7

stereo 8-1

stereo A/D converter 8-1

store misaligned 3-3

subsampling

horizontal 14-1

vertical 14-1

Synchronous Serial Interface 17-1

synthesizer 8-2

synthesizer,digital 7-3

tag operations 5-6, 5-7

TAP controller 18-1

description 18-1

TAP,test access port 18-1

TCTL

picture 3-13

termination

guidelines 12-7

test access port 18-1

TFE definition 3-3

timer 3-12

timing 1-19

SDRAM block 14-15

vertical filter 14-15

timing reference codes 6-5

TMODULUS

picture 3-13

translucent

background 14-9

foreground 14-9

TVALUE

picture 3-13

two-way LRU 5-11

ubytesel A-168

uclipi A-169

uclipu A-170

ueql A-171

ueqli A-172

ufir16 A-173

ufir8uu A-174

ufixieee A-175

ufixieeeflags A-176

ufixrz A-177

ufixrzflags A-178

ufloat A-179

ufloatflags A-180

ufloatrz A-181

ufloatrzflags A-182

ugeq A-183

ugeqi A-184, A-186

ugtr A-185

uimm A-187

uld16 A-188

uld16d A-189

uld16r A-190

uld16x A-191

uld8 A-192

uld8d A-193

uld8r A-194

uleq A-195

uleqi A-196

ules A-197

ulesi A-198

ume8ii A-199

ume8uu A-200

umul A-202

umulm A-203

uneq A-204

uneqi A-205

upsampling

horizontal 14-1

vertical 14-1

upscaling 7-11, 14-1

V.34 interface

block diagram 17-2, 17-3, 17-4

external pins,table 17-1

programming model 17-8

setup of SSI_CTL register 17-5

test modes 17-8

transmitter logic model 17-5

used as general purpose I/O 17-1, 17-2, 17-3

V.34 modem 17-1

vectored interrupts 3-9

vendor ID

PCI interface register 11-3

vertical filter

ICP 14-24

vertical filter parameter table 14-24

vertical filtering 14-1

vertical scaling 14-1, 14-13

VI_BASE1

alignment 6-11

picture 6-10

VI_BASE2

alignment 6-11

picture 6-10

VI_CAP_SIZE

picture 6-8

VI_CAP_START

picture 6-8

VI_CLK

description table 6-2

VI_CLOCK

picture 6-8, 6-10

VI_CTL

picture 6-8, 6-10

VI_DATA

VI_DATA[8] 6-11

VI_DATA[9] 6-11

VI_DATA[7:0]

description table 6-2

VI_DATA[9:8]

description table 6-2

VI_DVALID

description table 6-2

VI_SIZE

picture 6-10

VI_STATUS

picture 6-8, 6-10

VI_U_BASE_ADR

picture 6-8

VI_UV_DELTA

picture 6-8

VI_V_BASE_ADR

picture 6-8

VI_Y_BASE_ADR

picture 6-8

VI_Y_DELTA

picture 6-8

victim of replacement 5-4

video image data formats 7-9

video in unit

capture parameters

explanation 6-6

picture 6-5

clock generator 6-4

clocking modes 6-4

common source parameters 6-6

connected to 10bit A/D converter

picture 6-4

connected to 8bit CCIR656 camera

picture 6-3

connected to video out

picture 6-3

connected to video recorder

picture 6-3

co-sited sampling 6-4

diagnostic mode 6-2

format of SAV and EAV codes 6-5

fullres capture mode 6-1

description 6-4

halfres capture mode 6-1

description 6-9

halfres co-sited sample capture

picture 6-9

halfres interspersed sample capture

picture 6-9

halfres planar memory format

picture 6-9

highway latency requirements 6-13

highway latency,HBE description 6-13

interface pins

description table 6-2

interspersed sampling 6-5

message passing

major states diagram 6-12

message passing mode

description 6-11

example signal diagram 6-12

message-passing mode 6-1

description 6-11

power down 6-2

raw and message passing modes

MMIO register view, picture 6-10

raw capture modes

description 6-10

raw mode,major states,diagram 6-11

raw10s capture mode 6-1

raw10u capture mode 6-1

raw8 capture mode 6-1

reset 6-2

YUV 4:2:2 planar memory format

picture 6-7

YUV capture view of MMIO registers 6-8

virtual registers 18-4

VLD

command register 15-1

command register,description 15-7

commands 15-1

CPU interaction 15-2

error handling,description 15-8

flush output command 15-1

input,description 15-2

interrupt description 15-8

introduction 15-1

MMIO registers 15-4

picture 15-6

operational registers,description 15-7

output,description 15-3

parse command 15-1

parsing action 15-2

picture info register,description 15-8

quantizer scale register,description 15-7

reset command 15-1

reset description 15-8

search command 15-1

shift command 15-1

shift register,description 15-7

software reset procedure 15-8

stop reasons 15-1

VO Video Out Unit 7-1

VO_CLK

description table 7-3

VO_CLOCK

common values 7-23

field description table 7-18

picture 7-15

VO_CTL

fields 7-17

picture 7-15

VO_DATA[7:0]

description table 7-3

VO_FIELD

default values 7-23

field description table 7-18

picture 7-15

VO_FRAME

default values 7-23

field description table 7-18

picture 7-15

VO_IMAGE

default values 7-23

field description table 7-19

picture 7-15

VO_IO1

description table 7-3

VO_IO2

description table 7-3

VO_LINE

default values 7-23

field description table 7-19

picture 7-15

VO_OLADD

field description table 7-19

picture 7-15

VO_OLHW

field description table 7-19

picture 7-15

VO_OLSTART

field description table 7-19

picture 7-15

VO_STATUS

field description table 7-16

picture 7-15

VO_UADD

field description table 7-19

picture 7-15

VO_VADD

field description table 7-19

picture 7-15

VO_VUF

picture 7-15

VO_YADD

field description table 7-19

picture 7-15

VO_YOLF

field description table 7-19

picture 7-15

VO_YTHR

field description table 7-7, 7-19

picture 7-15

VO_YUF

field description table 7-19

write misses 5-4

writedpc A-206

writepcsw A-207

writespc A-208

YUV

formats of ICP 14-3

sequence counter 14-15

YUV capture

view of video in MMIO registers 6-8

YUV conversion 14-1

YUV image format 7-9

YUV planar format 7-10

YUV to RGB conversion 14-9

YUV to RGB converter 14-1

YUV upscaling 7-11

zex16 A-209

zex8 A-210

All rights are reserved. Reproduction in whole or in part is prohibited without the prior written consent of the copyright owner.

The information presented in this document does not form part of any quotation or contract, is believed to be accurate and reliable and may be changed

without notice. No liability will be accepted by the publisher for any consequence of its use. Publication thereof does not convey nor imply any license

under patent- or other industrial or intellectual property rights.

Internet: http://www.semiconductors.philips.com

2004 69

Printed in the United States of America

Philips Semiconductors – a worldwide company

For all other countries apply to: Philips Semiconductors ,

International Marketing & Sales Communications, Building BE-p, P.O. Box 218,

5600 MD EINDHOVEN, The Netherlands, Fax. +31 40 27 24825

Argentina: see South America

Australia: 3 Figtree Drive, HOMEBUSH, NSW 2140,

Tel. +61 2 9704 8141, Fax. +61 2 9704 8139

Austria: Computerstr. 6, A-1101 WIEN, P.O. Box 213,

Tel. +43 1 60 101 1248, Fax. +43 1 60 101 1210

Belarus: Hotel Minsk Business Center, Bld. 3, r. 1211, Volodarski Str. 6,

220050 MINSK, Tel. +375 172 20 0733, Fax. +375 172 20 0773

Belgium: see The Netherlands

Brazil: see South America

Bulgaria: Philips Bulgaria Ltd., Energoproject, 15th floor,

51 James Bourchier Blvd., 1407 SOFIA,

Tel. +359 2 68 9211, Fax. +359 2 68 9102

Canada: PHILIPS SEMICONDUCTORS/COMPONENTS,

Tel. +1 800 234 7381, Fax. +1 800 943 0087

China/Hong Kong: 501 Hong Kong Industrial Technology Centre,

72 Tat Chee Avenue, Kowloon Tong, HONG KONG,

Tel. +852 2319 7888, Fax. +852 2319 7700

Colombia: see South America

Czech Republic: see Austria

Denmark: Sydhavnsgade 23, 1780 COPENHAGEN V,

Tel. +45 33 29 3333, Fax. +45 33 29 3905

Finland: Sinikalliontie 3, FIN-02630 ESPOO,

Tel. +358 9 615 800, Fax. +358 9 6158 0920

France: 51 Rue Carnot, BP317, 92156 SURESNES Cedex,

Tel. +33 1 4099 6161, Fax. +33 1 4099 6427

Germany: Hammerbrookstraße 69, D-20097 HAMBURG,

Tel. +49 40 2353 60, Fax. +49 40 2353 6300

Hungary: see Austria

India: Philips INDIA Ltd, Band Box Building, 2nd floor,

254-D, Dr. Annie Besant Road, Worli, MUMBAI 400 025,

Tel. +91 22 493 8541, Fax. +91 22 493 0966

Indonesia: PT Philips Development Corporation, Semiconductors Division,

Gedung Philips, Jl. Buncit Raya Kav.99-100, JAKARTA 12510,

Tel. +62 21 794 0040 ext. 2501, Fax. +62 21 794 0080

Ireland: Newstead, Clonskeagh, DUBLIN 14,

Tel. +353 1 7640 000, Fax. +353 1 7640 200

Israel: RAPAC Electronics, 7 Kehilat Saloniki St, PO Box 18053,

TEL AVIV 61180, Tel. +972 3 645 0444, Fax. +972 3 649 1007

Italy: PHILIPS SEMICONDUCTORS, Via Casati, 23 - 20052 MONZA (MI),

Tel. +39 039 203 6838, Fax +39 039 203 6800

Japan: Philips Bldg 13-37, Kohnan 2-chome, Minato-ku, TOKYO 108-8507,

Tel. +81 3 3740 5130, Fax. +81 3 3740 5057

Korea: Philips House, 260-199 Itaewon-dong, Yongsan-ku, SEOUL,

Tel. +82 2 709 1412, Fax. +82 2 709 1415

Malaysia: No. 76 Jalan Universiti, 46200 PETALING JAYA, SELANGOR,

Tel. +60 3 750 5214, Fax. +60 3 757 4880

Mexico: 5900 Gateway East, Suite 200, EL PASO, TEXAS 79905,

Tel. +9-5 800 234 7381, Fax +9-5 800 943 0087

Middle East: see Italy

Netherlands: Postbus 90050, 5600 PB EINDHOVEN, Bldg. VB,

Tel. +31 40 27 82785, Fax. +31 40 27 88399

New Zealand: 2 Wagener Place, C.P.O. Box 1041, AUCKLAND,

Tel. +64 9 849 4160, Fax. +64 9 849 7811

Norway: Box 1, Manglerud 0612, OSLO,

Tel. +47 22 74 8000, Fax. +47 22 74 8341

Pakistan: see Singapore

Philippines: Philips Semiconductors Philippines Inc.,

106 Valero St. Salcedo Village, P.O. Box 2108 MCC, MAKATI,

Metro MANILA, Tel. +63 2 816 6380, Fax. +63 2 817 3474

Poland: Al. Jerozolimskie 195 B, 02-222 WARSAW,

Tel. +48 22 5710 000, Fax. +48 22 5710 001

Portugal: see Spain

Romania: see Italy

Russia: Philips Russia, Ul. Usatcheva 35A, 119048 MOSCOW,

Tel. +7 095 755 6918, Fax. +7 095 755 6919

Singapore: Lorong 1, Toa Payoh, SINGAPORE 319762,

Tel. +65 350 2538, Fax. +65 251 6500

Slovakia: see Austria

Slovenia: see Italy

South Africa: S.A. PHILIPS Pty Ltd., 195-215 Main Road Martindale,

2092 JOHANNESBURG, P.O. Box 58088 Newville 2114,

Tel. +27 11 471 5401, Fax. +27 11 471 5398

South America: Al. Vicente Pinzon, 173, 6th floor,

04547-130 SÃO PAULO, SP, Brazil, Tel. +55 11 821 2333, Fax. +55 11 821 2382

Spain: Balmes 22, 08007 BARCELONA,

Tel. +34 93 301 6312, Fax. +34 93 301 4107

Sweden: Kottbygatan 7, Akalla, S-16485 STOCKHOLM,

Tel. +46 8 5985 2000, Fax. +46 8 5985 2745

Switzerland: Allmendstrasse 140, CH-8027 ZÜRICH,

Tel. +41 1 488 2741, Fax. +41 1 488 3263

Taiwan: Philips Semiconductors, 6F, No. 96, Chien Kuo N. Rd., Sec. 1,

TAIPEI, Taiwan, Tel. +886 2 2134 2886, Fax. +886 2 2134 2874

Thailand: PHILIPS ELECTRONICS (THAILAND) Ltd.,

209/2 Sanpavuth-Bangna Road Prakanong, BANGKOK 10260,

Tel. +66 2 745 4090, Fax. +66 2 398 0793

Turkey: Yukari Dudullu, Org. San. Blg., 2.Cad. Nr. 28 81260 Umraniye,

ISTANBUL, Tel. +90 216 522 1500, Fax. +90 216 522 1813

Ukraine: PHILIPS UKRAINE, 4 Patrice Lumumba str., Building B, Floor 7,

252042 KIEV, Tel. +380 44 264 2776, Fax. +380 44 268 0461

United Kingdom: Philips Semiconductors Ltd., 276 Bath Road, Hayes,

MIDDLESEX UB3 5BX, Tel. +44 208 730 5000, Fax. +44 208 754 8421

United States: 811 East Arques Avenue, SUNNYVALE, CA 94088-3409,

Tel. +1 800 234 7381, Fax. +1 800 943 0087

Uruguay: see South America

Vietnam: see Singapore

Yugoslavia: PHILIPS, Trg N. Pasica 5/v, 11000 BEOGRAD,

Tel. +381 11 3341 299, Fax. +381 11 3342 553

Date of release: 2004 Aug 20 Document order number: xxxx xxx xxxxx

2004 Aug 20

Philips Semiconductors Product Specification

Media Processor PNX1300/01/02/11