# ATLAS Level-1 Calorimeter Trigger Jet / Energy Processor Module

**Project Specification (PDR)** 

Version 1.0 Date: 14 November 2002

Carsten Nöding Uli Schäfer Jürgen Thomas *Universität Mainz* 

Samuel Silverstein *Stockholm University*

# **Table of Contents**

| Section | Title                                                    |
|---------|----------------------------------------------------------|
| 1       | Introduction                                             |
| 1.1     | Related projects                                         |
| 1.2     | Overview                                                 |
| 1.2.1   | Real-time data path                                      |
| 1.2.2   | Level-2 interface, monitoring, diagnostics, and control  |
| 2       | Functional requirements                                  |
| 2.1     | Jet input data reception                                 |
| 2.2     | Jet input data conditioning                              |
| 2.3     | Jet element processing                                   |
| 2.3.1   | Total transverse energy algorithm                        |
| 2.3.2   | Missing energy algorithm                                 |
| 2.3.3   | Jet Algorithm                                            |
| 2.4     | DAQ and RoI                                              |
| 2.5     | Diagnostics and control                                  |
| 2.5.1   | Playback and spy functionality                           |
| 2.5.2   | VME interface                                            |
| 2.5.3   | DCS interface                                            |
| 2.5.4   | JTAG port                                                |
| 2.5.5   | Configuration                                            |
| 2.5.6   | TTC interface                                            |
| 2.6     | Board level issues: power supplies and line impedances   |
| 3       | Implementation                                           |
| 3.1     | Jet input data reception                                 |
| 3.2     | Jet input data conditioning                              |
| 3.3     | Jet element processing                                   |
| 3.3.1   | Total transverse energy                                  |
| 3.3.2   | Missing energy                                           |
| 3.3.3   | Jet algorithm                                            |
| 3.3.4   | FCAL and endcap calorimeter treatment                    |
| 3.4     | DAQ and RoI                                              |
| 3.4.1   | DAQ read-out                                             |
| 3.4.2   | RoI read-out                                             |
| 3.5     | Diagnostics and control                                  |
| 3.5.1   | Playback and spy                                         |
| 3.5.2   | VME control                                              |
| 3.5.3   | DCS interface                                            |
| 3.5.4   | JTAG                                                     |
| 3.5.5   | FPGA configuration                                       |
| 3.5.6   | Timing                                                   |
| 3.6     | Signal levels and supply voltages                        |
| 4       | Interfaces: connectors, pinouts, signal levels, data formats |
| 4.1     | Backplane connector layout                               |
| 5       | Programming Model                                        |
| 5.1.1   | Memory Map                                               |
| 5.1.2   | Sub Base Address                                         |
| 5.2     | Input FPGA                                               |
| 5.2.1   | CONTROL_REG                                              |
| 5.2.2   | STATUS_REG                                               |
| 5.2.3   | MASK_REG                                                 |
| 5.2.4   | CLK_PHASE_EM                                             |
| 5.2.5   | CLK_PHASE_HAD                                            |
| 5.2.6   | DELAY_REG                                                |
| 5.3     | Main Processor                                           |
| 5.3.1   | Memory Map                                               |
| 5.3.2   | CONTROL_REG                                              |
| 5.3.3   | STATUS_REG                                               |
| 5.4     | Control FPGA                                             |
| 5.4.1   | Memory Map                                               |
| 5.4.2   | CONTROL_REG                                              |
| 5.4.3   | STATUS_REG                                               |
| 5.4.4   | TTC_REG                                                  |
| 5.4.5   | TTCrx_REG                                                |
| 5.5     | VME CPLD                                                 |
| 5.5.1   | Memory Map                                               |
| 5.5.2   | MOD_ID_B                                                 |
| 5.5.3   | STATUS_REG                                               |
| 5.5.4   | CFG_MASK_REG                                             |
| 5.5.5   | FPGA_RESET_REG                                           |
| 5.5.6   | CFG_REG                                                  |
| 5.6     | ROC FPGA                                                 |
| 5.6.1   | Memory Map                                               |
| 5.6.2   | CONTROL_REG                                              |
| 5.6.3   | GLINK_STATUS_REG                                         |
| 5.6.4   | LATENCY_REG                                              |
| 5.6.5   | SLICE_REG                                                |
| 5.6.6   | ROI_REG                                                  |
| 5.6.7   | BC_OFFSET_REG                                            |
| 5.6.8   | GLINK_CONTROL_REG                                        |
| 6       | Glossary                                                 |

# 1 Introduction

This document describes the specification for the "Module-0" prototype Jet / Energy Module (JEM). It is a full-scale prototype designed to be functionally identical to the final production modules. Section 1 of this document gives an overview of the Jet / Energy processor (JEP) and the JEM. Section 2 describes the functional requirements of the JEM. Section 3 details the implementation of the prototype module presently under design. The functionality of the prototype JEM is intended to be nearly identical to that of the production modules. However, some revisions may be required based on results from prototype tests.

# 1.1 Related projects

The JEM is the main processor module within the Jet / Energy Processor (JEP) of the ATLAS Level-1 calorimeter trigger. The JEP, along with the electromagnetic and hadronic Cluster Processor (CP), receives event information from the Pre-Processor (PPr) system. The JEM receives and processes this information and transmits results to two merger modules (CMMs) in the crate, as well as information about selected events to Read-Out Drivers (RODs). Trigger, timing, and slow control are provided by a Timing and Control Module (TCM), and configuration is carried out over a reduced VME bus. Module-0 specifications for most Level-1 trigger modules (for acronyms see the glossary in section 6) are available, or will be shortly. Documentation may be found at:

| Document                | URL                                                              |
|-------------------------|------------------------------------------------------------------|
| TDR                     | http://atlasinfo.cern.ch/Atlas/GROUPS/DAQTRIG/TDR/tdr.html       |
| URD                     | http://atlasinfo.cern.ch/Atlas/GROUPS/DAQTRIG/LEVEL1/L1CalURD.ps |
| TTC                     | http://www.cern.ch/TTC/intro.html                                |
| CPM, CMM, ROD, PPM, TCM | http://hepwww.rl.ac.uk/Atlas-L1/Modules/Modules.html             |

# 1.2 Overview

The Jet / Energy Processor (JEP) is a major component of the ATLAS calorimeter trigger. It extracts jet and total and missing transverse energy information from data provided by the PPr. The JEP covers a pseudorapidity range of $|\eta| < 4.9$<sup>1</sup>. The granularity of the jet data is approximately $0.2 \times 0.2$ in $\phi \times \eta$ in the central $\eta$-range. Jet multiplicities and thresholded energy data are sent to the Central Trigger Processor (CTP). The JEP is organised in two JEP crates, each of them processing two quadrants of trigger space (Figure 1) in 16 JEMs. Each JEM receives most of its data from two electromagnetic and two hadronic PPMs<sup>2</sup>, covering a jet trigger space of 8 $\phi$-bins × 4 $\eta$-bins (Figure 2, channels A..H, 1..4). Overlap data from an additional four PPMs in each of the two neighbouring quadrants (V, W and Z) are required for the jet algorithm. The total number of input channels per JEM is 88. The JEM forwards results to merger modules in the JEP crates, which compile information from the entire event for transmission to the CTP.

<sup>1</sup> Transverse energy is measured in the full $\eta$ range. Jets are detectable up to $|\eta| < 3.2$ only unless a dedicated forward trigger algorithm is implemented.

<sup>2</sup> For the 6 central JEMs two PPMs are mapped to a single JEM as shown in Figure 2. For JEMs 0, 7, 8, and 15 the mapping is different for barrel and FCAL channels (see PPr specifications and sect. 3.3.4).



Figure 1 : JEP channel map



Figure 2 : JEM channel map, including quadrant overlaps (Z,W) and V

## 1.2.1 Real-time data path

Data from the Pre-Processor are received on serial links at a rate of 400 Mb/s (Figure 3). On the JEM the data are deserialised to 10-bit words (9-bit energy protected by one odd-parity bit) at the LHC bunch clock rate of 40 MHz. These data are presented to a first processor stage consisting of a bank of 11 input FPGAs (Figure 4). The data are subjected first to phase correction and error detection. Then they are summed in pairs of electromagnetic (em) and hadronic (had) channels to build jet elements. A threshold cut is applied to protect the trigger from excessive background and noise. Data are multiplexed to twice the bunch clock rate and sent to the main processor FPGA. The jet algorithm requires data from a 4×4 bin neighbourhood around each processed jet element (see sect. 2). Therefore, neighbouring JEMs must share large amounts of data via fan in/out (FIO) links on the backplane. To accomplish this, the input FPGAs send copies of shared jet elements to neighbouring modules. Three out of four jet elements are duplicated in this manner.



Figure 3: JEM block diagram



Figure 4 : JEM input processor

The processor FPGA carries out both the jet and energy algorithms (Figure 5). The jet algorithm uses jet elements received from the local input FPGAs as well as overlap data from the neighbouring boards on FIO links. To process a core of $8\times4$ elements, an environment of $11\times7$ jet elements is required. Within this environment the processor FPGA identifies clusters of $2\times2$, $3\times3$, and/or $4\times4$ jet elements that exceed one of 8 programmable thresholds, and reports the results to the jet CMM in the form of eight 3-bit multiplicities<sup>3</sup>, plus 1 bit of odd parity.

<sup>3</sup> The data format might be different on JEMs processing FCAL data. See sect. 3.3.4.



Figure 5 : JEM main processor FPGA

The energy algorithm operates on the 32 core jet elements. Total transverse energy is determined from a scalar sum over all core jet elements, while missing transverse energy is determined from a vector sum. A separate threshold is applied to data entering the total transverse energy adder tree. There is no separate threshold on the missing $E_t$ data. To save on logic resources, all four jet elements at the same $\phi$ (Figure 2) are summed together in a first stage. The resulting 8 data words are then converted to x and y projections by multiplication with cosine and sine of $\phi$, respectively. The vector results are summed and transmitted along with the scalar transverse energy to the energy summation CMM as a 25-bit (odd) parity-protected data word.

Energy summation is performed with a precision of 1 GeV per LSB, in a range of 4.095 TeV. Signals exceeding full scale are saturated to 4.095 TeV. This requires 12-bit encoding for all energy data. Due to limitations on backplane bandwidth only 8 bits of data are available for each of the energy data words. Therefore a compression scheme is employed that ensures full resolution for small signals while accommodating the full 4.095 TeV range.
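As an illustration of how a 12-bit sum can fit into 8 bits while keeping 1 GeV resolution for small signals, the following is a minimal sketch of a four-range linear code. The layout used here (a 2-bit range code plus a 6-bit mantissa, with step sizes of 1, 4, 16 and 64 GeV) and the function names `compress` and `expand` are assumptions made for this example only; the encoding actually used is defined in sect. 3.3.

```python
# Hypothetical four-range ("quad-linear") compression of a 12-bit energy sum.
# ASSUMPTION: 8-bit code = 2-bit range code + 6-bit mantissa, with step sizes
# of 1, 4, 16 and 64 GeV. This layout is illustrative, not taken from the spec.

FULL_SCALE = 4095        # 12-bit full scale at 1 GeV per LSB
STEPS = (1, 4, 16, 64)   # assumed step size (GeV) of each of the four ranges


def compress(energy: int) -> int:
    """Map an energy in GeV to an 8-bit code (range << 6) | mantissa."""
    energy = max(0, min(energy, FULL_SCALE))   # saturate at full scale
    for rng, step in enumerate(STEPS):
        if energy < 64 * step:                 # mantissa fits into 6 bits
            return (rng << 6) | (energy // step)
    raise AssertionError("unreachable: FULL_SCALE < 64 * 64")


def expand(code: int) -> int:
    """Recover the (truncated) energy in GeV from an 8-bit code."""
    rng, mantissa = code >> 6, code & 0x3F
    return mantissa * STEPS[rng]


if __name__ == "__main__":
    for e in (0, 63, 64, 100, 1000, 4095):
        print(e, compress(e), expand(compress(e)))
```

Under these assumptions energies below 64 GeV are reproduced exactly, while the coarsest range truncates to multiples of 64 GeV, so a saturated sum expands to 4032 GeV.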

## 1.2.2 Level-2 interface, monitoring, diagnostics, and control

To help the Level-2 trigger efficiently evaluate events accepted by Level-1, the Level-1 trigger provides information on regions of interest (RoI) identified by the real-time data path. In the JEM, the RoI information comprises the coordinates and types of jets identified in the processor FPGA (Figure 5). The readout of RoI information is initiated by a Level-1 accept signal, which is asserted by the CTP several bunch crossings after the event has been processed in the real-time data path of the JEM. This latency is compensated by a shift register (latency buffer) that delays the RoI data and brings them in phase with the Level-1 accept signal. The latency-corrected RoI data are written to a FIFO (derandomiser buffer) upon receipt of a Level-1 accept signal. The RoI data are then serialised to a 40 Mb/s bit stream and sent via a read-out controller (ROC) and on-board serial links (HP G-link) to R-RODs, which forward the data to the Level-2 trigger (Figure 6).



Figure 6: RoI and DAQ slice data path



Figure 7 : Input processor : readout sequencer

Provision is made for extensive monitoring via the ATLAS data acquisition system. All real-time data entering or leaving the processor modules via cables or long backplane links are stored in latency buffers (Figure 7) for eventual readout to the DAQ for all Level-1 accepted events. The derandomiser buffer captures data from 1 to 5 bunch crossings for each accepted event, and serialises the data to a 40 Mb/s bit stream.

On each JEM there are 13 sources of DAQ data. Each input FPGA contributes a 1-bit wide serial data stream containing the data received from the de-serialiser chips. The processor FPGA contributes one bit stream of data, describing the jet and energy sum signals sent to the merger modules. A stream of bunch count information is added by the read-out controller (ROC).

For diagnostic purposes, additional playback and spy memories are provided. The playback memories can be filled with test patterns under VME control. Upon receipt of a synchronous command broadcast by the TTC system, the input FPGAs begin to insert the playback data into the processing chain (Figure 4). Data downstream of the playback memories are captured into spy memories (Figure 5) that can be read out via VME for analysis.

The main control path for the JEM is via a bus that carries a reduced set of VME commands. The physical interface is an A24D16 bus with a very small number of control lines. Signal levels are standard 5V LS-TTL. Programmable logic is used to interface the reduced VME bus to the JEM processor chips.

Environmental data are monitored by the DCS system. The JEM communicates with DCS using a local microcontroller-based CAN node. The controller collects temperature and supply voltage data, and forwards them to a crate-level master device via a 2-pin CAN link on the processor backplane.

The JEM processor chips are programmable devices with volatile firmware storage. Therefore, hardware is required to download configuration code to the devices upon power-up. Configuration data is supplied by non-volatile on-board memories, and multiple configurations for each device are stored in separate, selectable memories. The data sequencing hardware is implemented in non-volatile CPLDs. Special configurations can be loaded after power-up to selected devices via VME, if needed. All local data, programme, and configuration storage is ISP-flashable via JTAG/Boundary-scan.

System timing is derived from a TTC signal distributed to every module in the system. A single TTCrx chip on each JEM provides bunch clock signals, trigger information and synchronous and asynchronous control commands.

# 2 Functional requirements

The role of the JEP is to derive total transverse energy, missing transverse energy, and jet multiplicities from data provided by the PPr. The JEM covers the first stages of the jet and energy algorithms within a trigger space of 8 bins in $\varphi$ × 4 bins in $\eta$. The full real-time jet and energy processing chain includes the JEM and two stages of CMMs. All data processing functions are implemented in FPGAs, using a fully synchronous design with I/O flip-flops used throughout the algorithms and an optimised partitioning of the algorithms between different devices. The JEM is linked to DAQ and LVL-2 via front panel coaxial connectors. All other signals are routed through the backplane on 2mm pitch PCI-type connectors.

This section outlines the requirements for all functions of the JEM. Implementation details are located in Section 3 of this document. For information on interfaces with other system components, see section 4.

# 2.1 Jet input data reception

The JEM receives energy data from the PPr (see Figure 8) via 88 serial links at a data rate of 400 Mb/s. The input signals enter the JEM through the backplane. The serial data stream is converted to a 10-bit parallel stream by LVDS deserialisers (**DS92LV1224**). The data word is accompanied by a rising-edge strobe signal and a link status bit. For data format details, see section 4.

The requirements on the JEM with respect to jet data reception are:

- Receive differential 400 Mb/s LVDS signals from the backplane on matched-impedance lines
- Feed the serial lines to the deserialisers
- Provide the deserialisers with a reference clock
- Read out 10 data bits, link status, and a strobe signal from each deserialiser to an input FPGA at the bunch clock rate.
- Route four electromagnetic and four corresponding hadronic data words to each input FPGA.

# 2.2 Jet input data conditioning

Both the jet and the energy algorithms operate on jet elements of  $0.2 \times 0.2$  ( $\varphi \times \eta$ ) bin size. They require information on the total energy deposit per bin. Therefore it is convenient to send the raw data words through an initial processing stage that is common to both jet and energy algorithms. This includes data re-timing, bit error handling and an initial summation of the electromagnetic and hadronic data in each jet element. Faulty or unused<sup>4</sup> channels are suppressed, as well as energy sums below a programmable threshold. If at least one of the electromagnetic or hadronic energy values is saturated (at the maximum value), the jet element energy sum is set to its maximum value. The jet element data are multiplexed to 5-bit wide data streams at twice the bunch clock rate, for transmission to the main processing stage.

<sup>4</sup> In JEMs 0, 7, 8 and 15 the outermost $\eta$-bin (FCAL channels) is only half-populated with input cables due to the coarser granularity of the FCAL in $\varphi$.

The requirements for jet element formation are:

- Receive 10 bits of data (9 bits plus odd parity) and a link status bit from each deserialiser
- Latch the data and status into input flip-flops.
- Synchronise the data on to the global bunch clock<sup>5</sup>
  - Derive four clock phases with 0, 90, 180, and 270 degree offset from the bunch clock
  - Latch the data into flip-flops at a suitable phase
  - Latch the result into a 2<sup>nd</sup> column of flip-flops at 0 degree offset from the bunch clock
  - Provide a programmable delay of one full bunch clock tick, using a 3<sup>rd</sup> set of flip-flops.
- Mask out faulty or unused data channels using a VME-controlled register.
- On a link error, zero the corresponding data word and record the error for monitoring purposes.
- Calculate odd parity of the 9 data bits from each deserialiser, and compare it with the parity bit.
- On a parity error, zero the corresponding data word and record the error for monitoring purposes
- Add electromagnetic and corresponding hadronic energy into a 10-bit wide jet element.
- Set the jet element to maximum value if at least one of the deserialiser data values is at maximum.
- Subject jet elements to a VME-programmable threshold. Jet elements below this threshold are zeroed.
- Identify FCAL elements by the module geographic address and channel number, and divide their energy by two<sup>6</sup>
- Multiplex jet elements to 5-bit data at twice the bunch clock rate (least significant bits first)
- Duplicate three out of four jet elements to send copies to neighbouring JEMs (see sect.2.3.3)
- Add pipeline flip-flops in the algorithm at appropriate depths to guarantee reliable operation at the required clock rate
- Latch all output data on output flip-flops
- Transmit four 5-bit streams to the main processor, two to the left hand neighbour JEM, one to the right hand side.
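The per-channel conditioning steps listed above (masking, link and parity checks, em + had summation, saturation, threshold and 5-bit multiplexing) can be sketched in software as follows. This is a behavioural illustration only, not the FPGA implementation: the function names are hypothetical, and the position of the parity bit within the 10-bit word (bit 9 here) is an assumption, since the data format is defined in section 4.

```python
MAX_IN = 0x1FF   # saturated 9-bit energy from one deserialiser
MAX_JE = 0x3FF   # saturated 10-bit jet element


def with_odd_parity(energy9: int) -> int:
    """Build a 10-bit link word. ASSUMPTION: the parity bit is bit 9."""
    parity = 1 - bin(energy9 & MAX_IN).count("1") % 2
    return (parity << 9) | (energy9 & MAX_IN)


def odd_parity_ok(word10: int) -> bool:
    """True if the 10-bit word contains an odd number of ones."""
    return bin(word10 & 0x3FF).count("1") % 2 == 1


def clean_channel(word10: int, link_ok: bool = True, masked: bool = False) -> int:
    """Return the 9-bit energy, zeroed for masked channels and link/parity errors."""
    if masked or not link_ok or not odd_parity_ok(word10):
        return 0
    return word10 & MAX_IN


def jet_element(em: int, had: int, threshold: int = 0) -> int:
    """Sum cleaned em and had energies into a 10-bit jet element."""
    if em == MAX_IN or had == MAX_IN:      # a saturated input saturates the sum
        return MAX_JE
    total = em + had                       # at most 1020, fits into 10 bits
    return total if total >= threshold else 0


def multiplex(je: int) -> tuple:
    """Split a 10-bit jet element into two 5-bit words, least significant first."""
    return (je & 0x1F, (je >> 5) & 0x1F)


if __name__ == "__main__":
    em = clean_channel(with_odd_parity(100))
    had = clean_channel(with_odd_parity(50))
    print(multiplex(jet_element(em, had, threshold=10)))
```

The sketch treats "below threshold" as a strict comparison, so an element equal to the threshold is kept; the source does not state which way this boundary case falls.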

# 2.3 Jet element processing

The jet and energy algorithms operate on a common data set of jet elements. The baseline JEM design will implement both algorithms in a common FPGA. Each JEM processes a core area of 8  $\varphi$ -bins by 4 $\eta$ -bins of the jet trigger space, within an environment of 11×7 jet elements required by the jet algorithm. Since jet elements are transmitted as 5-bit data at twice the bunch clock rate, the main processor FPGA requires a total of 385 input data lines. The energy algorithms use 160 of these signal lines.

The jet processor produces 24 bits of result data, representing 3-bit jet multiplicities for 8 thresholds. The energy processors deliver three 12-bit energy sums that must be compressed to a 24-bit data word. The baseline algorithm utilises a quad-linear encoding scheme (see sect. 3.3). Both jet and energy data words are driven down the JEP backplane, each accompanied by one (odd) parity bit.

The requirements for jet element and results handling are:

- Provide a synchronous local clock at twice the bunch clock rate
- Receive 44 5-bit jet elements from local input processors at twice the bunch clock rate, with least significant bits first
- Receive 22 jet elements from the right hand neighbour, 11 jet elements from the left
- Latch the jet elements in input flip-flops at twice the bunch clock rate
- Identify FCAL elements by the module geographic address and channel number, copy to neighbouring cell<sup>6</sup>
- Supply the jet algorithm with 77 jet elements on 385 signal lines at twice the bunch clock rate
- Supply the energy algorithm with 32 jet elements, de-multiplexed to 320 signal lines
- Receive 24 bits of jet count data at the bunch clock rate from the jet algorithm
- Encode them to a 25-bit odd parity protected data word
- Receive 12 bits of data from the total transverse energy algorithm
- Receive two data words of 12 bits each from the missing energy algorithm

- Compress the energy sums and encode them to a 25-bit odd parity-protected data word
- Latch all output data on output flip-flops before sending them across the backplane

<sup>5</sup> This synchronisation method was devised for the cluster processor. For a low-latency alternative see sect. 3.2.

<sup>6</sup> For FCAL handling see sect. 3.3.4.
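The encoding of 24 bits of jet count data into a 25-bit odd parity-protected word can be illustrated with a short sketch. The bit ordering used here (threshold 0 in bits 2..0, parity in bit 24) and the function name `pack_jet_counts` are assumptions for illustration; the actual backplane data format is given in section 4.

```python
def pack_jet_counts(mults):
    """Pack eight 3-bit multiplicities and one odd-parity bit into 25 bits.

    ASSUMPTION: threshold i occupies bits 3*i+2 .. 3*i, parity sits in bit 24.
    """
    if len(mults) != 8:
        raise ValueError("expected 8 multiplicities")
    word = 0
    for i, m in enumerate(mults):
        if not 0 <= m <= 7:
            raise ValueError("multiplicity does not fit into 3 bits")
        word |= m << (3 * i)
    # Odd parity: choose the parity bit so that the total number of ones is odd.
    parity = 1 - bin(word).count("1") % 2
    return word | (parity << 24)


def parity_ok(word25: int) -> bool:
    """Receiver-side check: a valid word has an odd number of ones."""
    return bin(word25 & 0x1FFFFFF).count("1") % 2 == 1


if __name__ == "__main__":
    w = pack_jet_counts([3, 0, 0, 0, 0, 0, 0, 0])
    print(hex(w), parity_ok(w))
```

With odd parity an all-zero payload still produces a non-zero word, which lets the receiver distinguish "no jets" from a dead link.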

## 2.3.1 Total transverse energy algorithm

The total transverse energy algorithm performs a scalar summation of jet elements over the full jet trigger space up to  $|\eta| = 4.9$ . On each JEM it calculates a sub-sum of all local jet elements. Duplicated data from neighbouring JEMs or quadrants are not included in the sum. Therefore the total number of jet elements entering the sum is 32 (8  $\varphi$ -bins ×4 $\eta$ -bins). To protect the algorithm against noise and background a separate programmable threshold is applied to jet elements on the total transverse energy path only.

The requirements for total transverse energy summation are:

- Receive 32 10-bit wide jet elements
- Apply a common threshold to all jet elements
- Add all 32 jet elements in a 5-deep adder tree.
  - Perform summation at 12-bit range
  - Saturate sums at 4095 GeV
- Add pipeline flip-flops in the algorithm at appropriate depths to guarantee reliable operation at the required clock rate
- Output a 12-bit wide scalar energy sum at the bunch clock rate
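The following is a minimal behavioural sketch of the summation described above: a threshold followed by a 5-level pairwise adder tree with saturation at 4095 GeV. The function name and the convention that elements below the threshold are zeroed (taken over from the jet element threshold in sect. 2.2) are assumptions, and pipelining is not modelled.

```python
FULL_SCALE = 4095   # 12-bit saturation value, 1 GeV per LSB


def total_et(jet_elements, threshold=0):
    """Scalar sum of 32 jet elements in a 5-deep saturating adder tree."""
    if len(jet_elements) != 32:
        raise ValueError("expected 32 jet elements")
    # ASSUMPTION: elements below the threshold are set to zero.
    level = [je if je >= threshold else 0 for je in jet_elements]
    depth = 0
    while len(level) > 1:
        # Each tree level adds neighbouring pairs and saturates the result.
        level = [min(a + b, FULL_SCALE) for a, b in zip(level[0::2], level[1::2])]
        depth += 1
    assert depth == 5   # 32 -> 16 -> 8 -> 4 -> 2 -> 1
    return level[0]


if __name__ == "__main__":
    print(total_et([10] * 32, threshold=5))
```

Because all inputs are non-negative, saturating at every tree level gives the same result as saturating once at the output.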

## 2.3.2 Missing energy algorithm

The missing energy algorithm performs a vector summation of jet elements over the full jet trigger space up to $|\eta| = 4.9$. On each JEM it calculates a sub-sum of all 32 jet elements in the 8×4 space covered by the JEM. The energy vector components $E_x$ and $E_y$ are calculated by multiplying the transverse energy by the cosine and sine of $\varphi$, respectively.

The requirements for missing energy summation are:

- Receive 32 10-bit jet elements
- Calculate 8 sums over 4 η-bins each
- Multiply the 8 sums by cosine and sine of $\varphi$ to produce the $E_x$ and $E_y$ vector elements, respectively, in 12-bit range with 1 GeV resolution<sup>7</sup>
- Add the $E_x$ and $E_y$ sums for all elements in a 3-level adder tree, with summation performed in 12-bit range
- Saturate sum at (4095,4095) GeV
- Add pipeline flip-flops in the algorithm at appropriate depths to guarantee reliable operation at the required clock rate
- Output a 2×12-bit wide energy vector sum at the bunch clock rate
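A behavioural sketch of these steps is shown below. Several details are assumptions made for the example and are not taken from the specification: the JEM is placed in the first quadrant with $\varphi$-bin centres at $(k + 0.5)\,\pi/16$, so that both projections are non-negative; the products are truncated to integers; and the 3-level adder tree is replaced by a running sum, which gives the same result for non-negative inputs.

```python
import math

N_PHI, N_ETA = 8, 4
FULL_SCALE = 4095   # 12-bit saturation value, 1 GeV per LSB


def missing_et_components(elements):
    """Return (Ex, Ey) magnitudes for one JEM; elements[phi_bin][eta_bin] in GeV."""
    if len(elements) != N_PHI or any(len(row) != N_ETA for row in elements):
        raise ValueError("expected 8 phi-bins x 4 eta-bins")
    ex = ey = 0
    for phi_bin, row in enumerate(elements):
        et_phi = sum(row)   # first stage: 4 eta-bins at the same phi, max 4092
        # ASSUMPTION: first-quadrant JEM, bin centres at (k + 0.5) * pi / 16.
        phi = (phi_bin + 0.5) * (math.pi / 2) / N_PHI
        ex += int(et_phi * math.cos(phi))   # ASSUMPTION: truncate to 1 GeV
        ey += int(et_phi * math.sin(phi))
    return min(ex, FULL_SCALE), min(ey, FULL_SCALE)


if __name__ == "__main__":
    grid = [[0] * N_ETA for _ in range(N_PHI)]
    grid[0][0] = 100
    print(missing_et_components(grid))   # (99, 9)
```

Summing over $\eta$ before multiplying means only 8 multiplications by cosine and 8 by sine are needed per JEM instead of 32 each, which is the resource saving mentioned in sect. 1.2.1.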

## 2.3.3 Jet Algorithm

The Jet algorithm identifies and counts clusters of jet elements centered around a local maximum that exceed one or more programmable thresholds. The jet algorithm supports eight independent jet definitions, each consisting of a programmable threshold associated with a selectable cluster size. Jet elements are first passed through a summation algorithm, which produces energy sums for 60 (6x10) 2x2 clusters, 45 (5x9) 3x3 clusters, and 32 (4x8) 4x4 clusters. Cluster sums that contain saturated jet elements, or that saturate themselves, are flagged. The central 32 (4x8) 2x2 clusters are compared with their nearest neighbors to determine whether they are local maxima, and therefore possible jet candidates. Saturated 2x2 clusters are automatically identified as local maxima; if two or more neighboring 2x2 clusters saturate, the local maximum position is defined to be the one that is at lowest $\eta$, then at lowest $\varphi$.

The central 4x8 region processed by each Jet FPGA is divided into eight 2x2 subregions, each of which can contain no more than one local maximum. If, due to saturation, more than one local maximum is found, the local maximum is chosen that has a position index at the lowest value of $\eta$, and then the lowest value of $\varphi$. When a local maximum is identified in a subregion, the 2x2, 3x3, and 4x4 clusters associated with it are selected and compared with appropriate thresholds. If no local maximum is found, the output of the subregion is zeroed. Saturated clusters automatically pass all thresholds. Three-bit multiplicities of jet clusters satisfying each of the eight jet definitions are produced.

<sup>7</sup> The algorithm operates on 12-bit data with LSB = 1 GeV. So as to minimise latency and consumption of FPGA resources a multiplier with an actual accuracy worse than 1 LSB might be chosen. See sect. 3.3.2.

The location of the jet cluster within each 2x2 subregion and its associated threshold bits are stored in a fixed-length pipeline for readout to the ROI builder upon a Level-1 accept.

The requirements for the jet algorithm are:

- Receive 77 jet elements multiplexed to 5-bit wide data at twice the bunch clock rate, FCAL energies divided evenly between two neighboring jet elements in $\phi$
- Produce sums of 2x2, 3x3, and 4x4 jet elements.
- Identify 2x2 ROI candidates that are a local maximum
- Select ROI candidate coordinates within each 2x2 subregion
- Compare 2x2, 3x3, and 4x4 candidates against 8 programmable size/energy thresholds
- Produce 8 3-bit jet multiplicities corresponding to ROI candidates satisfying each of the 8 programmable thresholds
- Output these multiplicities at the bunch clock rate
- Output ROI coordinate and threshold bits at the bunch clock rate to be entered in the ROI readout pipeline.
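The first requirement after data reception, producing sums of 2x2, 3x3 and 4x4 jet elements, can be illustrated with a sliding-window sum over the 11×7 environment. The sketch below covers only this summation step (local-maximum finding and threshold comparison are not modelled), and the function name `window_sums` is hypothetical. It also shows how many windows of each size fit into the environment: 10×6, 9×5 and 8×4.

```python
def window_sums(env, size):
    """Sum every size x size window of env, given as a list of equal-length rows."""
    rows, cols = len(env), len(env[0])
    return [
        [
            sum(env[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(cols - size + 1)
        ]
        for r in range(rows - size + 1)
    ]


def count(sums):
    """Number of windows in a result returned by window_sums."""
    return len(sums) * len(sums[0])


if __name__ == "__main__":
    # 11 x 7 environment of jet elements, all set to 1 GeV for illustration.
    env = [[1] * 7 for _ in range(11)]
    for size in (2, 3, 4):
        sums = window_sums(env, size)
        print(size, count(sums), sums[0][0])
```

In hardware the larger sums would typically be built from the smaller ones rather than recomputed, but the direct form above makes the window counts easy to verify.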

# 2.4 DAQ and RoI

All real-time data (slice data) crossing board boundaries on long backplane links or cables are captured on the inputs or outputs of the processor chips and stored there in latency buffers to await a Level-1 accept decision. Duplicated data shared between neighbouring JEMs is not recorded in this way, due to the large number of bits involved. Bunch crossing identifiers are recorded along with the slice data. Upon a positive L1A decision, up to 5 consecutive bunch clock ticks worth of data are transferred to a derandomising FIFO buffer, from which they are serialised to one-bit wide data streams at the bunch clock rate. Data from accepted events are sent to the DAQ for subsequent data analysis and validation.

Jet coordinates indicating regions of interest (RoI) are captured and read out by the main processor FPGA by a readout structure similar to the one above. Upon a L1A, a single time slice of data is derandomised and forwarded to the Level-2 trigger.

The readout pipelines and derandomisers are implemented locally on the processor chips. The buffers are controlled and read out by local readout sequencers (ROSs), that communicate with a readout controller FPGA (ROC) that collects the single-bit data streams. The ROC FPGA formats the data (see Figure 12) and sends it off-board to readout drivers (D-RODs, R-RODs) over serial links (G-links).

The requirements with respect to DAQ and RoI data processing are:

- Sample all critical real-time JEM input and output slice data, including:
  - incoming LVDS-link data including link status bit (8×11-bit per input processor FPGA)
  - a 12-bit bunch count identifier
  - outgoing trigger data
    - 24 bits of jet count information plus 1 (odd) parity bit
    - 24 bits of energy sum information plus 1 parity bit
- Record the jet coordinates found by the jet processor (8 jets, each with 2 position bits, 1 saturation bit and 8 threshold bits)
- Pipeline the slice and RoI data for a fixed number (approximately 48)<sup>8</sup> of bunch clock ticks corresponding to the maximum possible downstream latency before L1A reception
- Derive a ReadRequest signal from the LVL-1 accept signal with a programmable offset of up to 16 bunch clock ticks to compensate for possible varying downstream latencies
- Assert the ReadRequest signal for 1 to 5 bunch clock ticks
- Read out the pipeline data to 256-deep FIFOs on ReadRequest:
  - 1 to 5 consecutive data words for slice data
  - 1 data word for RoI data
- Serialise the DAQ data to 13 single-bit data streams:
  - one data stream per input processor FPGA slice data (11 total)
  - one data stream for main processor jet count and energy sum data
  - one data stream for bunch count identifier data contributed by the ROC
- Serialise the RoI data to a two-bit data stream
- Begin transmission of data streams immediately on ReadRequest if the FIFOs are empty
- Continue data transmission until FIFOs are empty
- Separate data from consecutive events by one empty separator bit
  - slice data packets are 1-5 × 89 bits long (including parity)
  - RoI data packets are 49 bits long (including parity)
- Collect the data streams in the ROC
- Present slice data and bunch count identifier to the DAQ G-link (13-bit wide)
- Present RoI data to the RoI G-link (2-bit wide)
- Negate the G-link DAV signal on the separator bit. The DAV signal acts as event separator

<sup>8</sup> The JEP is a fixed latency pipelined processor. Slice and RoI data are collected at different pipeline depths. This results in different downstream latencies for the individual slice and RoI data sources. Since the JEP is built in programmable logic, the exact latency will be known upon final implementation of the algorithms only. Both processor pipeline and latency buffer pipeline are defined by FPGA configuration and will therefore match in any given firmware version. The programmable offset will allow for an over-all correction only to cope with latency variations outside the JEP.

# 2.5 Diagnostics and control

## 2.5.1 Playback and spy functionality

The JEM allows stand-alone diagnostics of the data path using play-back memories to feed test patterns into the input processors. The patterns are processed in the data path, and the results are captured by spy memories further down the chain. Both types of diagnostic memories are accessible via VME. The playback and spy memories are started concurrently<sup>9</sup> by TTC broadcast commands.

The requirements with respect to diagnostics are:

- provide 256-deep playback buffer memories in the input processor FPGAs, 10-bits wide for each input channel
- load test patterns into the play-back memories via VME play-back registers. Memory write addresses are incremented on each VME write operation
- upon receipt of a broadcast command from the TTC system, increment the memory addresses 256 times<sup>10</sup> and inject the data into the real-time data path (see Figure 4)
- provide 256-deep 4×10-bit wide spy buffer memories in the input processor FPGAs
- provide 256-deep 2×25-bit wide spy buffer memories in the main processor FPGA
- on receipt of a broadcast command from the TTC system capture 256 slices of data from the realtime data path and write them to the memories (Figure 4, Figure 5)
- read out spy memories to VME through spy registers. Memory read addresses are incremented on each VME read operation

## 2.5.2 VME interface

Main board control is performed through a reduced width, reduced functionality VME bus running on the JEP backplane (Table 4, for pin-out see sect. 4.1). The 5V TTL backplane signals are interfaced to the processors via buffers and a control FPGA (for details see sect.3). An additional non-volatile CPLD allows restricted VME access to the module when the FPGAs are unconfigured. Module addresses are read from geographic addresses encoded on backplane pins GEOADD0 ... GEOADD5. The requirements with respect to VME control are:

<sup>9</sup> Since both play-back and spy operations are started concurrently, the diagnostics software will have to account for an offset of a few bunch ticks due to the latency in the data path from play-back to spy memory.

<sup>10</sup> Normal playback operation will allow for the transmission of 256 data slices only so as to avoid spy memory overrun. Alternatively the playback circuitry can be started and cycled until it is stopped by a broadcast command. No valid data can be read from the spy memory in this case.

- provide the module with limited D16/A24 VME access
- buffer the data bus signals in compliant transceivers
- implement a set of basic functions in the non-volatile control CPLD:
  - derive the module base address from geographic addressing
  - decode the module sub-address range
  - respond with DTACK to all D16/A24 commands referring to the module sub-address
  - provide VME registers for FPGA configuration download and JTAG access
- connect the VME bus to a control FPGA
- control the processor FPGAs through the control FPGA, mapping the processor control and status registers to the module VME sub-address space

## 2.5.3 DCS interface

Environmental variables are monitored through the DCS system. The JEM is linked to DCS through a single CAN signal pair serving all modules on the processor backplane.

The requirements with respect to DCS control are:

- provide CAN access via the backplane
- run the CAN signal pair to a local CAN controller on high-impedance stubs
- supply the CAN controller with a node address derived from the geographic addresses in the control CPLD
- use a CAN controller compliant with ATLAS DCS (CANopen protocol)
- have the CAN controller monitor all module supply voltages and temperatures on selected FPGAs, and local voltage regulators.

## 2.5.4 JTAG port

JTAG access is required for board and chip test and for downloading CPLD and FPGA configurations. The requirements with respect to JTAG are:

- link all programmable devices (CPLD, FPGA, configuration memory) and the TTCrx to a single JTAG chain
- use only 2.5V and 3.3V signals in the JTAG chain
- interface the JTAG chain to the VME CPLD to allow for in-system programming of configuration memories
- provide access to the chain with a JTAG header via level translators (3.3V and 5V compatible) to allow for stand-alone tests with standard adapters

## 2.5.5 Configuration

All FPGAs must be supplied with a configuration data stream upon power-up of the JEM. CPLDs are configured by non-volatile on-chip memories, and require a configuration data stream only for initial setup and later firmware updates. Because all programmable devices are included in the JTAG chain, configuration download is possible via JTAG during the development phase. Non-volatile memories on the JEM store FPGA configurations for fast loading of standard configurations, and automatic configuration during power-up. The JEM Module-0 prototype allows both Xilinx serial flash ROMs and parallel flash ROMs to be used. A CPLD is used to read the configuration memories and route the configuration data streams to the FPGAs. The input processors share an identical configuration and are therefore loaded in parallel from a single data stream. Main processor, control and ROC FPGAs receive separate streams. Configuration data streams are typically of the order of one or more Mbits each (see Table 2).

The requirements with respect to configuration are:

- provide on-board serial configuration memories for all FPGAs
- provide on-board parallel configuration memories for all FPGAs
- provide JTAG access to the serial configuration memory for configuration updates
- provide parallel VME access to the parallel configuration memory for configuration updates
- provide a CPLD based sequencer to generate serial data streams and accompanying clocks
- control FPGA mode pins to enable slave serial configuration
- feed FPGA CCLK and DIN pins with configuration clocks and serial data
- feed identical copies of configuration stream to the 11 input processors

- feed individual data streams to main processor, control, and ROC FPGAs
- monitor successful configuration and report status to VME
- allow configurations to be reloaded under VME control
- allow (re)configuration from local ROM within a few seconds

## 2.5.6 TTC interface

Timing and control information is received through the TTC system. A 160 Mbit/s TTC data stream is received on a single signal pair from the processor backplane. Bunch clock, bunch crossing and event number as well as Level-1 Accept and other broadcast signals are decoded by an on-board TTCrx chip and are supplied to the processor chips through the control FPGA<sup>11</sup>.

The requirements with respect to TTC signal processing are:

- receive TTC signals from the backplane on a differential pair
- route them to a TTCrx chip on a terminated differential line
- supply the TTCrx with a reset signal from the control FPGA
- supply the TTCrx with parallel configuration data from the control FPGA upon reset
- supply the TTCrx with a configuration data stream (I2C) from the control FPGA
- decode the TTC signals in the TTCrx
- supply the control FPGA with the Clock40Des1 bunch clock signal
- supply the control FPGA with bunch crossing number, LVL1Accept, and playback/spy signals
- forward a bunch clock signal and control signals to the input processors
- forward two bunch clock signals and control signals to the main processor
- forward a bunch clock signal and control signals to the ROC FPGA

# 2.6 Board level issues : Power supplies and line impedances

The JEM is a large, high-density module, carrying large numbers of high-speed signal lines of various signal levels. The module relies on TTL, PECL, current mode (LVDS) and CMOS high- and low-swing signalling. System noise and signal integrity are crucial factors for successful operation of the module. Noise on operating voltages has to be tightly controlled. To that end, large numbers of decoupling capacitors are required near all active components. Virtex and Spartan FPGAs are particularly prone to generating noise on the supply lines. The LVDS deserialiser chips are highly susceptible to noisy operating voltages, which tend to corrupt their high-speed input signals and compromise their operation. To suppress the supply noise, a combination of distributed capacitance (power planes) and discrete capacitors in the range of nF to hundreds of $\mu$F is required.

The JEM operates mainly at data rates of 40 and 80 Mb/s. Serial links of 400 to 800 Mb/s data rate are used on the real-time data path and DAQ/LVL2 interfaces. Those signals are routed over long tracks on-board, across the backplane, or over cables. This is only possible on matched-impedance lines. Depending on the signal type, differential and single-ended sink-terminated and single-ended source-terminated signal lines are used. Microstrip and stripline technologies are used for signal tracks. All lines carrying bunch clock signals must be treated with particular care.

The requirements with respect to signal integrity are:

- use low-noise local step-down regulators on the module, placed far from susceptible components
- use external 3.3V analog supply voltage to supply the LVDS links.
- run all supply voltages on power planes, each facing at least one ground plane to provide sufficient distributed capacitance
- provide at least one local decoupling capacitor for each active component

<sup>11</sup> At present it is not known what TTCrx output signals might be required in addition to bunch clocks, LVL1Accept, start playback/spy. The module under development will therefore feed basically all TTCrx I/O lines into the control FPGA, which can be configured to forward all required controls (up to 8) to the processors.

- for de-serialiser chips and for Spartan/Virtex-E, follow the manufacturer's guidelines on staged decoupling capacitors in a range of nF to $\mu$F
- minimise the number of different V<sub>CCINT</sub> voltages per FPGA to avoid fragmentation of power planes
- avoid large numbers of vias perforating power and ground planes near critical components
- route all long-distance high-speed signals on properly terminated, controlled-impedance lines:
  - route 400 Mb/s input signals to de-serialiser chips on short micro-strip lines of 100 $\Omega$ differential mode impedance with 100 $\Omega$ sink termination
  - route TTC input signal pairs on 100 $\Omega$ micro-strip lines with 100 $\Omega$ sink termination
  - route DAQ and RoI serialiser signals on 100 $\Omega$ micro-strip lines
  - route 80 Mb/s JEM-JEM FIO lines on source-terminated 60 $\Omega$ striplines or micro-strip lines
  - route 40 Mb/s merger lines on source-terminated 60 $\Omega$ striplines or micro-strip lines
  - route long-distance bunch clock signals on terminated single-ended lines
  - route the de-serialiser reference clock network on differential terminated lines with local signal translators
- have all micro-striplines facing one ground plane, and all striplines facing two ground planes
- avoid sharply bending signal tracks
- minimise the lengths of all non-terminated high-speed signal lines
- use DLLs in conjunction with low-skew clock drivers to generate symmetric, low-noise, low-skew bunch clock signals
- route CAN signals on short, unterminated stubs with low capacitance.

# 3 Implementation

The JEM is a largely FPGA-based processor module designed to minimise latency of the real-time path of data flowing from the pre-processor to the central trigger processor. Data processing is performed at data rates of 40 and 80 Mb/s in a pipelined processor. Input data are de-serialised from 400 Mb/s streams to 10-bit parallel words upon entrance into the module. Output to the CTP is sent via merger modules at the bunch clock rate. Jet element data are shared with neighbouring modules at twice the bunch clock rate.

Most signals enter and leave the JEM through rear edge connectors via a backplane (common processor backplane, PB). High-speed input data are received on cable links connected to the backplane and brought directly to the module. The JEM is connected to the backplane through 820 signal and ground pins, plus three high-current power pins.

The JEM is a 9U (366mm) high and 40cm deep module that fits a standard 9U 21-slot crate with IEEE 1101.10 mechanics. The module will be built on a multi-layer PCB with ground and power planes and impedance-controlled stripline and microstrip layers. For prototype tests a minimum of four modules must be built, of which only one or two need to be fully populated with all components. The serial input link chips in particular may not be available in sufficient quantity to populate all of the modules. Simulated input signals in the input processors may be a convenient way to generate test vectors for a wider trigger input space.

The module under review will be built to extensively test the technologies required on the final modules with full functionality, full channel count and full-size algorithms. However, the implementation for the final modules will differ from the one being designed now. On the production modules FPGAs will be chosen from cost-effective high performance devices available by that time. Board control might be modified as a result of joint system tests. The large degree of coherence with the CPM design will be maintained as far as is possible.

# 3.1 Jet input data reception



Figure 8 : JEM input map (barrel)

The serial link input signals are brought to the JEM by shielded parallel pair cable assemblies. The baseline cables are high density halogen free 4-pair cable assemblies, of type AMP(TYCO) 1370754-1 (Figure 8, pin numbering according to PPM specs). The cable assembly is designed to connect to standard 2mm PCI style backplane connectors. The signals are passed through the backplane and received on the JEM rear edge connector. For pinout see sect. 4.1. The mapping of PPM channels to JEMs is shown in Figure 8 (for FCAL JEM map see 3.3.4). The grounds of all serial cable assemblies entering each crate are connected to a common chassis ground on the backplane. The signal pairs are routed via 100 $\Omega$  differential micro striplines on the top and bottom layers of the JEM, and are terminated with a single resistor near the LVDS deserialisers. The deserialisers are supplied with separate analog and digital +3.3V supply voltages, according to the manufacturer's recommendations. The analog and the digital supply voltages are supplied on power planes with all of the devices connected to the two planes. The analog supply voltage is connected to the digital supply via a single inductor per JEM. The deserialisers are supplied with a reference clock, which is distributed across the PCB as a differential LVDS signal and converted locally to a differential CMOS signal. The reference clock tree can be driven with either the bunch clock or an on-board 40 MHz crystal clock. Static control signals (PWRDN, REN) are sent to the deserialisers from the processor FPGAs via CMOS buffer chips. These precautionary measures were adopted after measurements on a technology demonstrator board revealed bit errors on the serial link due to high levels of system noise on Virtex output lines.

The LVDS link output signals are fed into the nearest input processor FPGA. Each input processor handles 4 electromagnetic and 4 hadronic links corresponding to a single  $\varphi$ -bin of trigger space. Each of the channels is composed of a data strobe and 10 data bits (9 bit energy + odd parity). An additional /LOCK signal flags link errors. All signals are run on tracks of only a few cm length.

# 3.2 Jet input data conditioning

Data read out from the deserialiser chips are clocked into input flip-flops (IFD) on the input processor FPGAs. The use of input flip-flops guarantees predictable board level timing. The data have an arbitrary phase offset to the system clock and need to be synchronised.

Due to device and cable skew all serial data links operate on different phases. It is assumed that the preprocessors run at a skew of only a couple of nanoseconds, and the JEM bunch clock signals will be deskewed in the TTCrx to an accuracy of the order of ns. Cable length tolerances will be +/- 5 cm.

From the link chip set (DS92LV1023 / DS92LV1224) data sheet, we expect a maximum skew of 7ns. This suggests that the combined effects of clock skew and jitter will be well below one clock tick.

The baseline synchronisation method is adapted from the cluster processor. The 8 input data words per input processor are individually strobed into input registers. They are then latched on the global bunch clock at a programmable phase offset of 0°, 90°, 180°, or 270°. One of the clock phases will always be near the centre of the data window, where the data are valid and stable long enough to satisfy the setup time of the flip-flops. To allow an operation of the synchroniser at an arbitrary phase, the latency can be increased by an additional full bunch tick. Phase selection and latency registers are controlled by VME.

Dedicated calibration runs are required to determine the optimum phase settings. To this end the preprocessor sends a data pattern of alternating all-0 and all-1 (1023) data words. An FPGA-resident algorithm determines the optimum phase individually for each bit in each data word. Eventually the clock phase is chosen to suit all 10 individual bit paths per input channel. This somewhat complicated scheme is required due to the considerable bit-to-bit skew in FPGAs. Once the optimum clock phase is found, the contents of the phase registers are updated. They are read back via VME and the clock phase data are entered into a database. The synchronisation scheme is described in detail in the CPM and CP serialiser specifications.

An alternative synchronisation scheme is under study. So as to reduce latency, the incoming 10-bit data from the de-serialisers are not latched on the deserialiser strobes. They are directly latched into the input flip-flops of the input FPGAs on one of two clock phases derived from the global LHC bunch clock (2 and 4 in Figure 9). Due to the wide data window, one of the two sampling points will always yield reliable data. One of the 10 data bits is additionally sampled in phases one and three. During run-in of the system the alternating pattern on this bit allows the detection of 0-1 transitions. They will occur either from sample three to one, or from one to three. In the latter case, shown in the figure, sample four will be selected and forwarded to the algorithms, after further delay by a full tick, if required. Due to the limitation to just two samples per bunch tick, the data can be sampled on the input flip-flops at 80 MHz with inherently low bit-to-bit skew. While the choice of phase (two or four) will be made by an automatic procedure using a sync pattern, any phase offset by a full tick will have to be measured by spying on data in the spy memories and analysing them in software. Programmable length pipelines will have to be adjusted accordingly via VME control.

The synchronisation stage covers only the 10 data bits. The link error information (/LOCK) is latched into a single flip-flop on the rising edge of the bunch clock. Once all input signals are in phase with the global clock, they are copied to the slice readout circuitry (see sect.3.4). In the real-time data path the parity of 9 data bits is checked against the odd parity bit transmitted along with the data. The data bits are zeroed if a parity error occurs, or upon a loss of lock. Parity errors are counted in saturating 8-bit counters. A loss of lock count is determined from the leading edge of the /LOCK signal. 4-bit saturating counters are used for this purpose. The lock loss count information is complemented by the current link status information which is read into VME along with the counter information (see sect. 4). The error counters are reset on readout. Faulty channels can be zeroed under VME control via an 8-bit wide mask register.
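The per-channel error handling described above can be illustrated with a small behavioural model. This is a sketch only, not the firmware: the bit layout (parity in bit 9, energy in bits 8..0), the class and method names, and the choice to count parity errors independently of the lock state are all assumptions made for illustration.

```python
def odd_parity_ok(word10):
    """True if the 10-bit word (9 data bits + parity bit) contains an odd number of 1s."""
    return bin(word10 & 0x3FF).count("1") % 2 == 1


class InputChannel:
    """Behavioural model of one link input after synchronisation (hypothetical names)."""

    PARITY_MAX = 0xFF  # saturating 8-bit parity error counter
    LOCK_MAX = 0xF     # saturating 4-bit loss-of-lock counter

    def __init__(self):
        self.parity_errors = 0
        self.lock_losses = 0
        self.masked = False         # one bit of the 8-bit channel mask register
        self._was_unlocked = False  # previous link state, used for edge detection

    def process(self, word10, unlocked):
        """Return the 9-bit energy, zeroed on parity error, loss of lock or mask."""
        # Count a loss of lock only on the transition into the unlocked state.
        if unlocked and not self._was_unlocked:
            self.lock_losses = min(self.lock_losses + 1, self.LOCK_MAX)
        self._was_unlocked = unlocked

        parity_error = not odd_parity_ok(word10)
        if parity_error:
            self.parity_errors = min(self.parity_errors + 1, self.PARITY_MAX)

        if unlocked or parity_error or self.masked:
            return 0
        return word10 & 0x1FF  # assumed layout: energy in the lower 9 bits

    def read_counters(self):
        """Return (parity errors, lock losses, current unlocked status); reset the counters."""
        result = (self.parity_errors, self.lock_losses, self._was_unlocked)
        self.parity_errors = 0
        self.lock_losses = 0
        return result
```

The reset inside `read_counters` mirrors the statement that the error counters are reset on readout, while the current link status is reported without being cleared.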

Jet elements are built by adding corresponding electromagnetic and hadronic energies. The summation yields a 10-bit energy (for FCAL handling see sect. 3.3.4). Throughout the JEP, overflows are indicated by saturation of the data field to its largest binary representation. Therefore a value of 511 in either an electromagnetic or hadronic channel will saturate the trigger element to 1023.
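As a worked illustration of the saturation rule, the following minimal sketch forms one jet element from a pair of 9-bit inputs. The function and constant names are hypothetical, and FCAL handling is deliberately left out.

```python
INPUT_FULL_SCALE = 511         # largest 9-bit input value, treated as saturated
JET_ELEMENT_FULL_SCALE = 1023  # largest 10-bit jet element value


def jet_element(em, had):
    """Sum one electromagnetic and one hadronic 9-bit energy into a 10-bit jet element."""
    if not (0 <= em <= INPUT_FULL_SCALE and 0 <= had <= INPUT_FULL_SCALE):
        raise ValueError("inputs must be 9-bit values")
    # A saturated input forces the jet element to full scale.
    if em == INPUT_FULL_SCALE or had == INPUT_FULL_SCALE:
        return JET_ELEMENT_FULL_SCALE
    # Two unsaturated inputs sum to at most 1020, which fits into 10 bits.
    return em + had
```

Because the largest unsaturated sum is 510 + 510 = 1020, a jet element of 1023 can only arise from a saturated input in this model.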

The jet elements are multiplexed to a 5-bit wide word at twice the bunch clock rate, with the least significant bits sent first. The data are driven to board level via output flip-flops clocked at twice the bunch clock rate. All four jet elements are sent to the main processor, while copies of three of the data words (the two leftmost elements to the JEM on the left (- $\eta$ ), the rightmost element to the JEM on the right) are sent across the backplane to adjacent modules.
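The multiplexing of a 10-bit jet element onto a 5-bit wide path can be sketched as follows. The function names are hypothetical; the only property taken from the text is that the least significant bits are sent first.

```python
def multiplex(jet_element):
    """Split a 10-bit jet element into two 5-bit words, least significant word first."""
    low = jet_element & 0x1F
    high = (jet_element >> 5) & 0x1F
    return (low, high)


def demultiplex(first, second):
    """Rebuild the 10-bit jet element from the two 5-bit words in order of arrival."""
    return ((second & 0x1F) << 5) | (first & 0x1F)
```

For example, the value 545 (binary 10001 00001) is sent as the word 1 followed by the word 17.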

# 3.3 Jet element processing

The main processor FPGA receives 44 jet elements from the input processors, multiplexed at twice the bunch clock rate. JEMs adjacent in $\eta$ contribute up to 33 additional jet elements at the same data rate. A total of 385 signals are supplied to the main processor, organised as 5-bit wide data. The jet elements are transmitted least significant bits first to facilitate serial arithmetics employed in the jet processor. Wide data windows on the incoming data are guaranteed by the use of input flip-flops on the data lines. The input clock is derived from the bunch clock in an on-chip DLL. Due to the low skew on the data paths into the main processor (corresponding to 50 cm of track) there is no need to individually resynchronise input data bits. All signals are latched on a common clock edge. The latched input data are routed to their destination, depending on the JEM flavour. Particulars of FCAL treatment are explained in sect. 3.3.4.

Data are copied to two parallel processing paths that feed the jet and energy processors, respectively. The jet processor results are collected as eight 3-bit words at the bunch clock rate, with a 25<sup>th</sup> odd parity bit. The energy processor yields three 12-bit energy words. Overflows in the energy summation trees are saturated to 12-bit full scale (4095 GeV). Saturated input channels enforce a 12-bit saturation, as well.

The 12-bit energy data are each encoded to an 8-bit word with parity. It is envisaged to implement the encoding and parity generation in a single-step lookup table if it is found to be beneficial with respect to latency. This implementation will allow for an arbitrary transfer function to be used in the encoder. The present implementation is, however, based on CLB resources and uses a fixed quad-linear code. Depending on its magnitude, the incoming 12-bit number is divided by a scale factor (Table 1). The resulting 6-bit word is prepended with 2 scale bits.

| Range     | Scale Factor | Scale bits |
|-----------|--------------|------------|
| 0-63      | 1            | 00         |
| 64-255    | 4            | 01         |
| 256-1023  | 16           | 10         |
| 1024-4095 | 64           | 11         |

Table 1

The energy sums are encoded to a single 25-bit word including an overall odd parity bit. Both the jet and energy results are latched on output flip-flops at the bunch clock rate, and driven directly down the backplane at 2.5V signal level. The signals are source-terminated to the line impedance of 60 Ohms.
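The quad-linear code of Table 1 can be written out as a short sketch. It assumes that the division truncates and that the two scale bits occupy the upper two bits of the 8-bit code; the decoder is not part of the specification and is included only to show the resulting resolution. Parity generation is omitted and all names are hypothetical.

```python
# (upper edge of range, scale factor, scale bits), taken from Table 1
QUAD_LINEAR_RANGES = (
    (63, 1, 0b00),
    (255, 4, 0b01),
    (1023, 16, 0b10),
    (4095, 64, 0b11),
)


def encode_energy(energy12):
    """Encode a 12-bit energy into 2 scale bits followed by a 6-bit mantissa."""
    if not 0 <= energy12 <= 4095:
        raise ValueError("energy must be a 12-bit value")
    for upper, scale, scale_bits in QUAD_LINEAR_RANGES:
        if energy12 <= upper:
            # Truncating division is an assumption of this sketch.
            return (scale_bits << 6) | (energy12 // scale)


def decode_energy(code8):
    """Return the lower edge of the energy bin represented by an 8-bit code."""
    scale = 4 ** (code8 >> 6)  # scale factors 1, 4, 16, 64
    return (code8 & 0x3F) * scale
```

Under these assumptions an energy of 1000 is encoded with scale factor 16 and decodes to 992, so the resolution in that range is 16 counts.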

## 3.3.1 Total transverse energy

The total transverse energy path is implemented in parallel arithmetics. The input signals are demultiplexed to the bunch clock rate, and set to zero if they do not exceed a VME-programmed common threshold. The total energy per JEM is calculated in a 32-input 5-stage adder. The result is saturated to 4095 GeV if any of the input signals is saturated to full scale, or if an overflow occurs in the adder tree.
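A behavioural sketch of the total transverse energy sum follows. It is not a model of the adder tree itself; it only reproduces the threshold and saturation rules. Whether a saturated input is flagged before or after the threshold comparison is not stated in the text, so flagging it first is an assumption, as are the names.

```python
JET_ELEMENT_FULL_SCALE = 1023  # saturated 10-bit jet element
SUM_FULL_SCALE = 4095          # saturated 12-bit energy sum


def total_et(jet_elements, threshold):
    """Sum all jet elements exceeding the common threshold, with 12-bit saturation."""
    total = 0
    saturated_input = False
    for energy in jet_elements:
        if energy == JET_ELEMENT_FULL_SCALE:
            saturated_input = True  # assumed to be flagged regardless of the threshold
        if energy > threshold:      # elements not exceeding the threshold are zeroed
            total += energy
    if saturated_input or total > SUM_FULL_SCALE:
        return SUM_FULL_SCALE
    return total
```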

## 3.3.2 Missing energy

The missing energy is calculated from the x- and y-projections of the transverse energies. To minimise the size of the algorithm all same- $\varphi$  energy words are summed first. Saturation is handled as described above. The eight energy sub-sums are converted separately and in parallel to the vectors ( $E_x,E_y$ ). To save resources and latency, the algorithm employs two fixed-coefficient multipliers of 6 input bits each per projection; most significant and least-significant words are dealt with separately. In the baseline design the conversion is performed using block memory resources. In a future hardware revision hardware multipliers will allow for a decrease in latency. A pair of saturating 8-input, 3-stage adders determines the final 12-bit wide board-level vector sum.
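The splitting of a 12-bit energy into two 6-bit words, each handled by its own fixed-coefficient lookup, can be illustrated as below. The fixed-point precision, the example angle and all names are assumptions of this sketch; the actual coefficients depend on the $\varphi$ bin and are not given here.

```python
import math

FRACTION_BITS = 8  # assumed fixed-point precision of the coefficient


def make_luts(coefficient):
    """Build two 64-entry tables for the low and high 6-bit words of a 12-bit energy."""
    c = round(coefficient * (1 << FRACTION_BITS))
    lut_low = [lo * c for lo in range(64)]
    lut_high = [(hi << 6) * c for hi in range(64)]
    return lut_low, lut_high


def project(energy12, lut_low, lut_high):
    """Multiply a 12-bit energy by the fixed coefficient using the two lookup tables."""
    lo = energy12 & 0x3F
    hi = (energy12 >> 6) & 0x3F
    # The two partial products add up to the full product energy12 * c.
    return (lut_low[lo] + lut_high[hi]) >> FRACTION_BITS


# Hypothetical example: x-projection for a phi bin centred at pi/16
COS_LOW, COS_HIGH = make_luts(math.cos(math.pi / 16))
```

Since `lo + 64 * hi` equals the original energy, the sum of the two table outputs is identical to a direct 12-bit multiplication, while each table needs only 64 entries.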

## 3.3.3 Jet algorithm

The jet algorithm uses all 77 jet elements received by the main processor. To conserve logic and latency, most operations are carried out using serial arithmetics on the multiplexed, 5-bit wide data.

A series of interconnected adder trees produces 60 sums of 2x2 jet elements, which are compared with next neighbours to identify local maxima. In parallel with the local maximum identification, 3x3 and 4x4 sums of jet elements are also produced. For each square 2x2 subregion of the JEM's central 4x8 jet elements, an appropriate ROI position is determined, and 2x2, 3x3, and 4x4 cluster sums associated with that position are compared with threshold energies. Eight sets of jet cluster size and energy threshold criteria can be programmed via VME. If desired, the capability to have two different sets of threshold definitions in a JEM for different eta slices may be added. In the case of 2x2 or 4x4 clusters tested against a threshold, only one comparison per ROI is necessary, while four 3x3 clusters are associated with any ROI position and the jet cluster criterion is met if at least one associated cluster exceeds the threshold.

Jet clusters that overflow during summation are flagged as saturated. Clusters with saturated jet elements are not automatically flagged as saturated, but their sums are virtually certain to overflow when added with neighbouring elements. 2x2 clusters that saturate are automatically identified as local maxima, and saturated clusters automatically pass all jet thresholds. If a 2x2 subregion contains more than one local maximum because of saturation, the ROI position at the lowest eta, and then the lowest phi, is selected.

Eight 3-bit jet multiplicities are produced for transmission to the CMM. If a jet multiplicity exceeds 7 in a single JEM, the reported multiplicity is set at 7. For each 2x2 subregion, the 2-bit ROI position is sent to the buffer for ROI readout, as well as 8 bits corresponding to the different threshold criteria passed.
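The formation of the eight saturating 3-bit multiplicities, and their packing into a 25-bit word with an odd parity bit in the 25th position, can be sketched as follows. The bit ordering (threshold 0 in bits 2..0, parity in bit 24) and the function names are assumptions.

```python
def jet_multiplicities(roi_threshold_bits):
    """Count jets per threshold from one 8-bit threshold mask per ROI, saturating at 7."""
    counts = [0] * 8
    for mask in roi_threshold_bits:
        for threshold in range(8):
            if (mask >> threshold) & 1:
                counts[threshold] = min(counts[threshold] + 1, 7)
    return counts


def pack_multiplicities(counts):
    """Pack eight 3-bit counts into bits 23..0 and add an odd parity bit in bit 24."""
    word = 0
    for threshold, count in enumerate(counts):
        word |= (count & 0x7) << (3 * threshold)
    ones = bin(word).count("1")
    parity = 0 if ones % 2 == 1 else 1  # make the total number of 1s odd
    return word | (parity << 24)
```

With eight ROIs per JEM, a threshold passed by all eight would give a count of 8, which this model reports as 7.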



## 3.3.4 FCAL and endcap calorimeter treatment

Figure 10 : Barrel vs. FCAL (JEM0) map

In each quadrant the left hand and right hand JEMs (JEMs 0,7,8,15) process endcap and FCAL signals (from PPMs 8 and 9) on their two outermost  $\eta$ -bins. While barrel JEMs are fed with cables that cover 2×2 bins in the  $\varphi \times \eta$  trigger space, those two  $\eta$ -bins are fed with cables carrying a single  $\eta$  bin only. The FCAL signals correspond to a double-width bin in  $\varphi$ , so a single 4-channel cable covers a full  $\varphi$ -quadrant. To process the FCAL and endcap channels together with jet elements in the barrel, these channels must be rearranged. Double-width  $\varphi$  bins are divided equally over two neighbouring jet elements. For convenience, the division by two of FCAL signals is carried out in the input processor. In the main processor, FCAL signals are copied to the neighbouring channel in  $\varphi$ , which is not connected to an external signal on FCAL JEMs. FCAL and endcap signals are routed to their proper locations and sent to the standard jet and energy algorithms. It is believed that the modifications to the algorithms do not require dedicated FPGA configurations. Instead, multiplexers will be employed to re-route the signals. The geographic address of the JEM will be used to identify FPGAs that require this alternate signal routing.

The use of FCAL signals (cables b,c,d) allows for an energy trigger space of $|\eta| < 4.9$. Cables a and f carry endcap fanout signals required by the jet algorithm. An extension of the jet algorithm up to $|\eta| < 4.9$ is possible if FCAL fan-in (cable d) is used. Since the jet elements are rearranged as described above, there is no special forward trigger code required in the jet algorithm. Special forward trigger algorithms different from the barrel code are possible only within the limits of FPGA resources and connectivity on the real-time data path. A different data format might be chosen on the merger links; however, the bit count is limited to 25, including parity.

# 3.4 DAQ and RoI

The JEM readout to DAQ and Level-2 is handled by two functionally identical logic blocks, implemented in a common Read Out Controller (ROC) FPGA. The ROC controls and reads out local readout sequencers (ROSs) located in the processor FPGAs. An additional ROS implemented in the ROC captures and pipelines bunch crossing information for local readout. Since all readout FIFOs on the JEM are the same depth, the ROC can determine FIFO fill status locally for monitoring by VME if required.





The control logic for the Input FPGA readout is shown in Figure 11. DAQ read-out of the main processor FPGA is performed similarly. The RoI read-out follows the same principles, with the exception that only one time slice is read out for each L1A. For efficient resource use, the latency pipeline is implemented using SelectRAM 16-bit shift registers. Total shift register depth is selected in the FPGA configuration file and cannot be modified at runtime. Therefore, the shift register depth must be tailored to cover the maximum possible downstream latency. Differences in readout latency for the various sampling points in the jet / energy algorithm are taken into account in the design of the buffers. Correct timing of the readout is therefore guaranteed by design rather than by adjustment. A VME-programmable register in the ROC sets the delay of the ReadRequest signal, which initiates the transfers of data at the end of the pipeline into the 256 slice deep FIFO. To read out n slices per BC the ReadRequest signal is cycled n times.

Under the control of the local Read-Out Sequencer (ROS) logic on the processor FPGAs, data are moved from the end of the latency pipeline to the derandomiser FIFO, built from block memory. When the derandomiser buffer is not empty, serial readout is initiated through the FIFO read port. The main processor FPGA uses four such FIFO blocks in parallel for DAQ readout, while the Input FPGAs each use six. The parallel data streams from the FIFOs are serialized to a single bit by a shift register. An odd parity bit is generated and appended to the end of the resulting bit stream. The ROS concatenates all of the slices from each event into one large packet. Data from different events are separated by at least one bit of invalid data. Slice data from the processor FPGAs are appended with fill bits to make their bit streams the same length as the Input FPGA data packets.
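The packet building described above can be sketched in a few lines. The sketch assumes 88 data bits per slice (89 bits including parity, as quoted for DAQ slice data), models the DAV flag as a boolean accompanying each bit, and uses hypothetical function names.

```python
def serialise_slice(data_bits):
    """Return the slice bits followed by one odd parity bit."""
    parity = 1 - (sum(data_bits) % 2)  # total number of 1s becomes odd
    return list(data_bits) + [parity]


def build_packet(slices):
    """Concatenate all slices of one event into a single packet."""
    packet = []
    for data_bits in slices:
        packet.extend(serialise_slice(data_bits))
    return packet


def frame_events(packets):
    """Return (bit, dav) pairs with one invalid separator bit between events."""
    stream = []
    for index, packet in enumerate(packets):
        if index > 0:
            stream.append((0, False))  # separator bit, DAV negated
        stream.extend((bit, True) for bit in packet)
    return stream
```

A five-slice event with 88 data bits per slice yields a packet of 5 × 89 = 445 bits in this model.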

The 11 Input FPGA data streams and the main processor bit stream are received in parallel by the ROC data port, and then passed to the DAQ G-link transmitter, along with a further bit generated within the ROC containing the BC number. The BC number is generated by a local counter driven by the 40 MHz system clock, and reset by a derivative of the TTCRx BcntRes strobe. For multi-slice readout the BC number must be latched to ensure that all slices are tagged with the same BC number corresponding to the triggered event. The readout of Jet RoIs is carried out in a similar manner.

The DAQ and RoI output links use two HP G-link HDMP-1022 transmitter devices, which have a 20-bit wide data input. Unused input bits are forced to ground. The links are driven by the 40 MHz bunch clock. The availability of data is signalled to the receiving end RODs using the G-link DAV signal, indicating valid data.

The G-link PECL level output signals are driven into coaxial cables at the front panel of the module. The G-link chips use a well-decoupled 5V supply. Since each transmitter dissipates 2 Watts, heat sinks will be used.

## 3.4.1 DAQ read-out

A single slice of DAQ data is 89 bits in length, including an appended odd parity bit (see sect. 4). Up to 5 slices are typically read out for each L1A during data taking, although the system may be upgraded to read out a larger number (up to the full length of the derandomiser FIFOs) if required for special diagnostic procedures. The number of slices to be read out is under the control of the ROC, and determined by a VME-accessible register. For multi-slice read-out, the ROC generates a correctly timed sequence of (consecutive) ReadReq commands and passes the incoming slice data to the G-link. Since the bunch crossing number information is unavailable to the local pipeline and readout logic, the correct selection of slices depends on proper timing of the ReadReq signal in conjunction with the fixed-latency pipeline.

## 3.4.2 ROI read-out

A single slice of ROI data is 12 bits in length per ROI (see sect. 4), including an appended odd parity bit. Only one slice is typically read out per event. Due to the limited pin count on the JEM0 main processor, all eight ROIs per JEM are serialised into a two-bit wide data stream of 49 bits of total length. As for the DAQ readout, the ROC passes the incoming ROI data along with the bunch crossing number to the G-link.

# 3.5 Diagnostics and control

## 3.5.1 Playback and spy

Playback and spy memory will be available on JEM0. Due to limitations in FPGA resources the memories might not fully comply with the requirements outlined in section 2.5.1. All block memory not needed otherwise will be used for storage of playback data. The memory will be organised in buffers 256 slices deep $\times$ 16 bits wide. For efficient use of resources the 10-bit wide playback data need to be packed into 16-bit format. It is envisaged to do this locally on the input processors, if possible. However, the baseline implementation will not include a bit packing algorithm. It will require the software to assemble byte-wide data packets. The playback buffers are filled through a single VME data port. Memory addresses are incremented automatically upon VME write operation. A control register allows access to the address counter. Upon reception of a broadcast command from the TTC a StartSpy signal is asserted by the control FPGA and 256 bunch crossings worth of data are injected. The data are fed into the real-time data path immediately after the synchroniser (see Figure 4).
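
Since the baseline leaves the packing to software, a minimal sketch of such a packing step is given here. The bit ordering is an assumption (10-bit words laid out contiguously, least significant bit first); the actual layout expected by the input processors is not defined in this section, and the function names are hypothetical.

```python
def pack_10_to_16(words10):
    """Pack 10-bit values contiguously, LSB first, into 16-bit words."""
    acc = 0      # bit accumulator
    nbits = 0    # number of valid bits currently held in acc
    out = []
    for w in words10:
        if not 0 <= w < (1 << 10):
            raise ValueError("playback word does not fit in 10 bits")
        acc |= w << nbits
        nbits += 10
        while nbits >= 16:
            out.append(acc & 0xFFFF)
            acc >>= 16
            nbits -= 16
    if nbits:
        out.append(acc & 0xFFFF)  # final partial word, zero-padded
    return out


def unpack_16_to_10(words16, count):
    """Inverse of pack_10_to_16: recover `count` 10-bit values."""
    acc = 0
    nbits = 0
    out = []
    for w in words16:
        acc |= (w & 0xFFFF) << nbits
        nbits += 16
        while nbits >= 10 and len(out) < count:
            out.append(acc & 0x3FF)
            acc >>= 10
            nbits -= 10
    return out
```

Under this layout, 256 ten-bit samples occupy 160 sixteen-bit words instead of 256, which is the saving that motivates the packing.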

Due to lack of resources it might not be possible to sample all real-time signals on the outputs of the input processor into spy memories. It will, however, be possible to sample the data further downstream on the outputs of the main processor FPGA. Here the data are written into 256×16 block memories and can be accessed from VME through a single data register and a control register as described above.

## 3.5.2 VME control

The VME-- interface implements a subset of standard VME-64 signals and commands. It allows for D16/A24 access to the JEM. 5V-LSTTL data bus drivers are used to comply with VME specifications. The relatively low slew rate of LSTTL helps to keep electromagnetic interference under control. The VME protocol is implemented in a CPLD. The CPLD reads the geographic address lines of the JEM to determine the VME address range. Data and sub-addresses are then fed into the control FPGA.

On JEM0 the VME signals are not fed to the processor FPGAs on a VME-style bus. Instead, the signals are multiplexed and sent through the input FPGAs on a unidirectional 11-bit wide ring. Each input FPGA receives the signals and, if it is itself addressed by means of an individual enable signal, asserts its own signals; otherwise it forwards its input signals to the next chip in the chain. The ring structure starts and ends in the control FPGA. A simple 1-bit serialised VME access port is considered as an alternative solution if it turns out that a bandwidth of 5 Mbyte/s is sufficient to handle the very limited data volume typically exchanged with the input FPGAs. Both access methods can be tried out on JEM0 to find a suitable design for the production version. A full-blown on-board VME bus will not be considered for the production modules unless a strong case can be made. Because the routing density is already high, a true bus-like structure would invariably lead to a higher board layer count and therefore considerably higher cost. DAQ and main processor FPGAs are connected to the control FPGA through 16-bit wide bidirectional ports. Even if it were required, a full VME port would not be viable on JEM0 due to insufficient pin count on the main processor.

## 3.5.3 DCS interface

Environmental conditions of the JEM are monitored by an on-board CAN node. It accurately measures supply voltages and board temperatures and forwards the data to DCS via a 2-pin CAN link. The monitoring system is based on the ATLAS ELMB board. It is a credit card sized daughter module providing 64 analog input channels of programmable range (up to 5V) and 16-bit resolution<sup>12</sup>. The supply voltages and temperature probes (NTCs) are interfaced to the analog inputs via resistor networks. The dual-processor CAN controller is equipped with ISPable on-chip firmware storage.

<sup>12</sup> The ELMB will support additional interfaces like parallel digital I/O, SPI, JTAG, and an analog output port. These features will not be used on the JEM.

Software updates are accomplished via CAN. The firmware implements the CANopen protocol. Low level functions like voltage measurement and alert on error conditions are available. Therefore no low-level programming is required. All JEM specific configuration is done via high level commands within the SCADA programming environment.

The ELMB connects to the CAN bus via a transceiver located on the ELMB itself. Therefore the CAN link consists of a terminated bus along the processor backplane with the individual CAN nodes being attached via drop lines or stubs. According to CAN specifications drop lines should be kept as short as possible. A maximum of 30cm at 1Mbit/s is allowed per node. The JEP stays well below individual and cumulative stub length limits for any data rate supported by DCS.

## 3.5.4 JTAG

JTAG is implemented as a single daisy chain passing all components bearing JTAG ports. The JTAG chain can be exercised from either a JTAG header or a port on the control FPGA. For basic board tests and a first download of FPGA configurations the JTAG header will be used. At a later stage configuration memory download via a VME/JTAG bridge to be implemented in the control FPGA could be envisaged (see sect. 3.5.5). The JTAG chain is a mixed 2.5V/3.3V chain. 5V-compatible 3.3V buffers on the header make it compliant with standard JTAG devices.

## 3.5.5 FPGA configuration

The JEM is built mainly from programmable logic of CPLD and FPGA type. The CPLDs are configured from an on-chip flash memory on power-up automatically. The CPLD configuration is loaded to the flash memory once through a JTAG header after board assembly. On very rare occasions a firmware upgrade might be required. A configuration download will require each module to be connected to a JTAG test system. A configuration update cannot be performed during normal running.

FPGAs are static RAM based devices and are configured after power-up through a serial configuration port. On JEM0 there are serial and parallel memories available for configuration storage. Table 2 shows the amount of configuration memory required for the FPGAs to be used on JEM0 and JEM. Virtex-E is used for the main processor; all other FPGAs are Spartan-II.

| Family     | FPGA     | Configuration size | Serial ROM   |
|------------|----------|--------------------|--------------|
| Spartan-II | XC2S150  | 1,040,128 bit      | XC18V01      |
| Spartan-II | XC2S200  | 1,335,840 bit      | XC18V02      |
| Virtex-E   | XCV600E  | 3,961,632 bit      | XC18V04      |
| Virtex-E   | XCV1000E | 6,587,520 bit      | 2 of XC18V04 |
| Virtex-E   | XCV1600E | 8,308,992 bit      | 2 of XC18V04 |

#### Table 2 : Xilinx configuration memory

The processor FPGA will require up to two serial ROMs per configuration. Since the storage of two alternative configurations (production and test version) is required, a total of 4 serial ROMs might be required on the board. The input processors are configured from far smaller memories. All input processors will be configured identically and a single data stream can therefore be copied to all of the devices. Hence a total of 3 different Spartan configurations is required on board: input processors, control and DAQ FPGAs. The 3 configurations might be concatenated in a single memory chip. For JEM0 a total of 6 serial ROMs is provided. A parallel 64-Mbit flash ROM is provided as an optional storage medium. On JEM0 it is mapped into the VME address space by a CPLD. Due to its large size a single parallel ROM should suffice for storage of alternative configurations for all FPGAs.
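
The serial ROM counts in Table 2 follow from simple arithmetic, sketched below. The configuration sizes are copied from Table 2; the ROM capacities are the nominal 1, 2 and 4 Mbit of the XC18V01, XC18V02 and XC18V04 parts. The function name and the selection rule (smallest single part that fits, otherwise several XC18V04) are illustrative assumptions.

```python
import math

# Nominal serial ROM capacities in bits (1, 2 and 4 Mbit parts)
ROM_BITS = {"XC18V01": 1 << 20, "XC18V02": 2 << 20, "XC18V04": 4 << 20}

# Configuration sizes in bits, as listed in Table 2
CONFIG_BITS = {
    "XC2S150": 1_040_128,
    "XC2S200": 1_335_840,
    "XCV600E": 3_961_632,
    "XCV1000E": 6_587_520,
    "XCV1600E": 8_308_992,
}


def smallest_rom(config_bits):
    """Return (count, part): smallest single part that fits, else several XC18V04."""
    for part in ("XC18V01", "XC18V02", "XC18V04"):
        if config_bits <= ROM_BITS[part]:
            return 1, part
    return math.ceil(config_bits / ROM_BITS["XC18V04"]), "XC18V04"


def roms_for_main_processor(fpga, n_configurations=2):
    """Serial ROMs needed to hold several alternative configurations of one FPGA."""
    count, _ = smallest_rom(CONFIG_BITS[fpga])
    return count * n_configurations
```

For an XCV1000E or XCV1600E main processor with a production and a test configuration this gives the 4 serial ROMs quoted above.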

VME access is required to store configuration data into the flash ROMs. The serial memories are programmed via JTAG at a maximum data rate of 10Mbit/s. A VME-JTAG interface would have to be built so as to allow VME-based configuration download. Software will be based on the Xilinx J Drive engine. VME access to the parallel configuration memory does not require any special interface. However, the programming algorithm would have to be observed by the configuration software. The JEM0 supports both the parallel and the serial memory type so as to allow for evaluation of suitability. On the production modules the more appropriate memory type will be chosen. Coherence of the JEM and CP designs, as well as availability of software, will also be considered.

## 3.5.6 Timing

System timing is based on the low skew, low jitter clock and data distribution by the TTC system. Each JEM has its own TTCrx chip on board and can derive all required board level timing signals from the three bunch clock signals, two of them deskewed, and accompanying command and data words. On JEM0 TTC signals are handled by a daughter board (Tilecal TTCrx module). It can be mounted on two connectors provided on the JEM. The two connectors carry all relevant 40 Mbit/s signals. The serial input signal will be supplied by a short patch cable soldered to the boards. +5V will be supplied on a separate cable connection. An additional spare connector is available to route additional supply voltages and signals to a revised daughter board which will replace the Tilecal module once the final TTCrx chips are available in sufficient quantities.

All TTC signals are processed in the control FPGA. One DLL of the control FPGA is used to generate multiple zero-skew copies of one of the deskewed bunch clocks (Clk40Des1). The clock replication is performed by driving the incoming clock through a low-skew clock buffer and feeding back one of its outputs into the DLL. The replicated Clk40Des1 signal is supplied to all of the FPGAs and the CPLDs. Several auxiliary clocks are routed to some of the devices. The main processor FPGA is supplied with the deskewed clock Clk40Des2 in addition to Clk40Des1, to allow proper timing-in of FIO signals from neighbouring JEMs. A 40 MHz crystal clock is provided to the deserialiser chips. It is used as a reference clock to run the deserialiser VCO to the approximate nominal frequency so as to enable the PLL to lock to the input data pattern. The VME related circuitry can also be driven off the crystal clock.

There are no multiples of the bunch clock routed on the JEM. Twice the bunch clock frequency is required on the input and main processors. The clocks are derived internally from the on-chip DLLs.

The JEM is expected to use only a small fraction of the data and broadcast lines on the TTCrx. L1A is used to initiate readout into DAQ and level-2 trigger. BcntRes is used to reset the bunch counter located in the ROC. The playback and spy operation will be initiated by a broadcast over the TTC system. Since all TTC signal lines are fed to the control FPGA there is no limitation on the use of broadcasts. However, only a subset of 8 TTC signals will be routed to the processor FPGAs. Therefore the control FPGA will have to detect broadcast commands and encode them in a suitable way before sending them to the processor FPGAs.

Upon a reset signal the TTCrx chip initialises an internal register with the chip ID and the I2C base address. The parameters are read in from the SubAddr and Data lines. The control FPGA needs to set the relevant lines to the desired pattern and issue a reset signal to the TTCrx. The TTCrx I2C lines will connect to a VME port in CP-style to allow for a common software interface.

As described above all clocks of 40 MHz and its multiples will be run through DLLs so as to guarantee precise low-skew timing. This requires the input clock to exhibit a maximum period variation of 1ns. A maximum cycle-to-cycle variation of 300ps is acceptable. The TTC clocks are operating well inside these limits. However, upon loss of input signal the TTCrx acts as a free running oscillator with its frequency variations far outside the operating conditions of the DLLs. Therefore no valid data can be read out of any of the processor FPGAs when a TTC link loss occurs.

It is important for proper operation of the DLLs that they are reset after an extended loss of input clock. Therefore clock monitoring circuitry will be provided on the JEM. The TTCrx asserts its TTCReady line when it is locked onto a valid input data stream and the output signals are stable. This signal will be used to reset all DLLs whenever the TTC clock is lost. After a stable TTC clock has been re-established, the DLLs will take some time to tune the delay loop. They assert the LOCKED line as soon as the output clock is stable. A wired-AND over all LOCKED lines will report the JEM clock status to VME and DCS (?). The readout software will have to make sure that data are read out of the JEM only when the bunch clock is stable. Otherwise meaningless data might be read from the processor FPGAs into VME.
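
The monitoring logic described above reduces to a few boolean relations, sketched here as a behavioural model. The function names are hypothetical, and the model ignores the time the DLLs need to re-lock.

```python
def dll_reset(ttc_ready):
    """DLLs are held in reset whenever the TTCrx does not report a stable clock."""
    return not ttc_ready


def jem_clock_ok(locked_lines):
    """Wired-AND of all DLL LOCKED outputs: one unlocked DLL pulls the status low."""
    return all(locked_lines)


def safe_to_read(ttc_ready, locked_lines):
    """Readout software should only trust data while the bunch clock is stable."""
    return ttc_ready and jem_clock_ok(locked_lines)
```

A readout routine would poll the clock status before accessing the processor FPGAs and discard any data taken while `safe_to_read` was false.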

On JEM0 a crystal clock is provided to allow standalone operation of a module. Choice of either bunch clock or crystal clock is possible by jumpers only. The crystal clock is meant to be used in initial module setup only. Clock monitoring circuitry does not cover the crystal clock.

# 3.6 Signal levels and supply voltages

The JEM is a mixed signal level environment. Particular consideration of signal level compatibility and system noise issues is required so as to make the system work reliably. Differential signalling is employed for the reference clock tree (LVDS, 40 MHz), trigger input signals (LVDS, 400 Mb/s), TTC input signal distribution (PECL, AC-coupled, 160 Mb/s) and CAN (~100 kb/s). All differential signals are routed on differential matched impedance microstrip lines only.

Single ended signals are TTL 5V (VME--), CMOS 5V (CAN and G-links), 3.3V (de-serialisers, CPLDs and buffers) and 2.5V (all FPGAs). There are no single ended level translators employed. It is assumed here that +5V logic levels are fully compatible with 3.3V devices. 3.3V devices in turn are compatible with the 2.5V signal definition. There is no direct connection of 5V devices to 2.5V signal lines on the JEM. External control connections (JTAG) are fed to the 2.5V I/O FPGAs via 3.3V buffers. Any level shifting accomplished by direct connection of compatible but not identical logic families tends to reduce the noise margins. Therefore such implicit level shifting is applied on board-level connections only. Backplane connections are designed for source and sink devices of the same logic family. All real-time data connections to the processor backplane are source terminated 2.5V CMOS lines. Please note that in a further design iteration the real-time backplane signals might be migrated to 1.8V or 1.5V logic levels. Full compatibility along with maximum noise margins can be assured if the FPGAs on the receiving end are Virtex-E, operated off a +2.5V I/O supply. In this case a 1.8V internal reference derived from the core voltage is used to properly threshold 1.8V signals.

The LVDS de-serialisers are supplied from the 3.3V high current connector on the rear edge of the module. The 5V high current pin supplies the VME-- data buffers as well as the G-links and the CAN module. The 3.3V and 2.5V supply voltages are generated from +5V by a step-down regulator. The main processor FPGA requires a core voltage of 1.8V. Due to the quite limited supply current sunk by a single chip, the supply voltage can be generated by a local low-drop linear regulator.

# 4 Interfaces : connectors, pinouts, signal levels, data formats



Figure 12 : slice data bit stream format (a-c) and G-link bit map (d)



Figure 13 : RoI data bit stream format (a-b) and G-link bit map (c)

| LVDS de-serialiser Rout | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
|-------------------------|---|---|---|---|---|---|---|---|---|---|
| Trigger data bit        | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | P |

Table 3 : serial input data bits

| Signal   | Number of lines |
|----------|-----------------|
| SYSRESET | 1               |
| A[23..1] | 23              |
| D[15..0] | 16              |
| DS0*     | 1               |
| Write*   | 1               |
| DTACK*   | 1               |
| Total    | 43              |

Table 4 : VME signals

# 4.1 Backplane connector layout

| Pos. | A | B | C | D | E |
|------|---|---|---|---|---|

Guide Pin (0-8mm) (AMP parts 223956-1, 223957-1, or equivalent)

Connector 1 (8-58mm) Type B-25 connector (short through-pins)

| Pos. | A     | B         | C         | D       | E       |
|------|-------|-----------|-----------|---------|---------|
| 1    | SMM0  | <g></g>   | VMED00    | VMED08  | VMED09  |
| 2  | SMM1  | VMED01    | VMED02    | VMED10  | VMED11  |
| 3  | SMM2  | <g></g>   | VMED03    | VMED12  | VMED13  |
| 4  | SMM3  | VMED04    | VMED05    | VMED14  | VMED15  |
| 5  | SMM4  | <g></g>   | VMED06    | VMEA23  | VMEA22  |
| 6  | SMM5  | VMED07    | <g></g>   | VMEA21  | VMEA20  |
| 7  | SMM6  | <g></g>   | VMEDS0*   | <g></g> | <g></g> |
| 8  | SMM7  | VMEWRITE* | <g></g>   | VMEA18  | VMEA19  |
| 9  | SMM8  | <g></g>   | VMEDTACK* | VMEA16  | VMEA17  |
| 10 | SMM9  | VMEA07    | VMEA06    | VMEA14  | VMEA15  |
| 11 | SMM10 | <g></g>   | VMEA05    | VMEA12  | VMEA13  |
| 12 | SMM11 | VMEA04    | VMEA03    | VMEA10  | VMEA11  |
| 13 | SMM12 | <g></g>   | VMEA02    | VMEA08  | VMEA09  |
| 14 | SMM13 | VMERESET* | VMEA01    | <g></g> | <g></g> |
| 15 | SMM14 | <g></g>   | <g></g>   | FL0     | FR0     |
| 16 | SMM15 | FL1       | FL2       | <g></g> | FR1     |
| 17 | SMM16 | <g></g>   | FL3       | FR2     | FR3     |
| 18 | SMM17 | FL4       | FL5       | <g></g> | FR4     |
| 19 | SMM18 | <g></g>   | FL6       | FR5     | FR6     |
| 20 | SMM19 | FL7       | FL8       | <g></g> | FR7     |
| 21 | SMM20 | <g></g>   | FL9       | FR8     | FR9     |
| 22 | SMM21 | FL10      | FL11      | <g></g> | FR10    |
| 23 | SMM22 | <g></g>   | FL12      | FR11    | FR12    |
| 24 | SMM23 | FL13      | FL14      | <g></g> | FR13    |
| 25 | SMM24 | <g></g>   | FL15      | FR14    | FR15    |

Connector 2 (58-96mm) Type B-19 connector (mixed short/long through pins)

| Pos. | A          | B       | C         | D       | E       |
|------|------------|---------|-----------|---------|---------|
| 1    | FL16       | FL17    | FR16      | <g></g> | FR17    |
| 2  | FL18       | <g></g> | FL19      | FR18    | FR19    |
| 3  | FL20       | FL21    | FR20      | <g></g> | FR21    |
| 4  | FL22       | <g></g> | FL23      | FR22    | FR23    |
| 5  | FL24       | FL25    | FR24      | <g></g> | FR25    |
| 6  | FL26       | <g></g> | FL27      | FR26    | FR27    |
| 7  | FL28       | FL29    | FR28      | <g></g> | FR29    |
| 8  | FL30       | <g></g> | FL31      | FR30    | FR31    |
| 9  | FL32       | FL33    | FR32      | <g></g> | FR33    |
| 10 | GEOADD5    | <g></g> | GEOADD4   | <g></g> | GEOADD3 |
| 11 | 1+ Cable 1 | 1-      | <sg></sg> | 2+      | 2-      |
| 12 | 3+         | 3-      | <sg></sg> | 4+      | 4-      |
| 13 | 1+ Cable 2 | 1-      | <sg></sg> | 2+      | 2-      |
| 14 | 3+         | 3-      | <sg></sg> | 4+      | 4-      |
| 15 | 1+ Cable 3 | 1-      | <sg></sg> | 2+      | 2-      |
| 16 | 3+         | 3-      | <sg></sg> | 4+      | 4-      |
| 17 | 1+ Cable 4 | 1-      | <sg></sg> | 2+      | 2-      |
| 18 | 3+         | 3-      | <sg></sg> | 4+      | 4-      |
| 19 | GEOADD2    | <g></g> | GEOADD1   | <g></g> | GEOADD0 |

Connector 3 (96-134mm) Type B-19 connector (mixed short/long through pins)

| Pos. | A          | B       | C         | D       | E       |
|------|------------|---------|-----------|---------|---------|
| 1    | FL34       | FL35    | FR34      | <g></g> | FR35    |
| 2    | FL36       | <g></g> | FL37      | FR36    | FR37    |
| 3    | FL38       | FL39    | FR38      | <g></g> | FR39    |
| 4    | FL40       | <g></g> | FL41      | FR40    | FR41    |
| 5    | FL42       | FL43    | FR42      | <g></g> | FR43    |
| 6    | FL44       | <g></g> | FL45      | FR44    | FR45    |
| 7    | FL46       | FL47    | FR46      | <g></g> | FR47    |
| 8    | FL48       | <g></g> | FL49      | FR48    | FR49    |
| 9    | FL50       | FL51    | FR50      | <g></g> | FR51    |
| 10   | <g></g>    | <g></g> | <g></g>   | <g></g> | <g></g> |
| 11   | 1+ Cable 5 | 1-      | <sg></sg> | 2+      | 2-      |
| 12   | 3+         | 3-      | <sg></sg> | 4+      | 4-      |
| 13   | 1+ Cable 6 | 1-      | <sg></sg> | 2+      | 2-      |
| 14   | 3+         | 3-      | <sg></sg> | 4+      | 4-      |
| 15   | 1+ Cable 7 | 1-      | <sg></sg> | 2+      | 2-      |
| 16   | 3+         | 3-      | <sg></sg> | 4+      | 4-      |
| 17   | 1+ Cable 8 | 1-      | <sg></sg> | 2+      | 2-      |
| 18   | 3+         | 3-      | <sg></sg> | 4+      | 4-      |
| 19   | <g></g>    | <g></g> | <g></g>   | <g></g> | <g></g> |

Connector 4 (134-172mm) Type B-19 connector (short through pins)

| Pos. | A           | B       | C         | D       | E       |
|------|-------------|---------|-----------|---------|---------|
| 1    | FL52        | FL53    | FR52      | <g></g> | FR53    |
| 2  | FL54        | <g></g> | FL55      | FR54    | FR55    |
| 3  | FL56        | FL57    | FR56      | <g></g> | FR57    |
| 4  | FL58        | <g></g> | FL59      | FR58    | FR59    |
| 5  | FL60        | FL61    | FR60      | <g></g> | FR61    |
| 6  | FL62        | <g></g> | FL63      | FR62    | FR63    |
| 7  | FL64        | FL65    | FR64      | <g></g> | FR65    |
| 8  | FL66        | <g></g> | FL67      | FR66    | FR67    |
| 9  | FL68        | FL69    | FR68      | <g></g> | FR69    |
| 10 | <g></g>     | <g></g> | <g></g>   | <g></g> | <g></g> |
| 11 | 1+ Cable 9  | 1-      | <sg></sg> | 2+      | 2-      |
| 12 | 3+          | 3-      | <sg></sg> | 4+      | 4-      |
| 13 | 1+ Cable 10 | 1-      | <sg></sg> | 2+      | 2-      |
| 14 | 3+          | 3-      | <sg></sg> | 4+      | 4-      |
| 15 | 1+ Cable 11 | 1-      | <sg></sg> | 2+      | 2-      |
| 16 | 3+          | 3-      | <sg></sg> | 4+      | 4-      |
| 17 | 1+ Cable 12 | 1-      | <sg></sg> | 2+      | 2-      |
| 18 | 3+          | 3-      | <sg></sg> | 4+      | 4-      |
| 19 | <g></g>     | <g></g> | <g></g>   | <g></g> | <g></g> |

Connector 5 (172-210mm) Type B-19 connector (mixed short/long through pins)

| Pos. | A           | B       | C         | D       | E       |
|------|-------------|---------|-----------|---------|---------|
| 1    | FL70        | FL71    | FR70      | <g></g> | FR71    |
| 2    | FL72        | <g></g> | FL73      | FR72    | FR73    |
| 3    | FL74        | FL75    | FR74      | <g></g> | FR75    |
| 4    | FL76        | <g></g> | FL77      | FR76    | FR77    |
| 5    | FL78        | FL79    | FR78      | <g></g> | FR79    |
| 6    | FL80        | <g></g> | FL81      | FR80    | FR81    |
| 7    | FL82        | FL83    | FR82      | <g></g> | FR83    |
| 8    | FL84        | <g></g> | FL85      | FR84    | FR85    |
| 9    | FL86        | FL87    | FR86      | <g></g> | FR87    |
| 10   | <g></g>     | <g></g> | <g></g>   | <g></g> | <g></g> |
| 11   | 1+ Cable 13 | 1-      | <sg></sg> | 2+      | 2-      |
| 12   | 3+          | 3-      | <sg></sg> | 4+      | 4-      |
| 13   | 1+ Cable 14 | 1-      | <sg></sg> | 2+      | 2-      |
| 14   | 3+          | 3-      | <sg></sg> | 4+      | 4-      |
| 15   | 1+ Cable 15 | 1-      | <sg></sg> | 2+      | 2-      |
| 16   | 3+          | 3-      | <sg></sg> | 4+      | 4-      |
| 17   | 1+ Cable 16 | 1-      | <sg></sg> | 2+      | 2-      |
| 18   | 3+          | 3-      | <sg></sg> | 4+      | 4-      |
| 19   | <g></g>     | <g></g> | <g></g>   | <g></g> | <g></g> |

Connector 6 (210-248mm) Type B-19 connector (mixed short/long through pins)

| Pos. | A           | B       | C         | D       | E       |
|------|-------------|---------|-----------|---------|---------|
| 1    | FL88        | FL89    | FR88      | <g></g> | FR89    |
| 2  | FL90        | <g></g> | FL91      | FR90    | FR91    |
| 3  | FL92        | FL93    | FR92      | <g></g> | FR93    |
| 4  | FL94        | <g></g> | FL95      | FR94    | FR95    |
| 5  | FL96        | FL97    | FR96      | <g></g> | FR97    |
| 6  | FL98        | <g></g> | FL99      | FR98    | FR99    |
| 7  | FL100       | FL101   | FR100     | <g></g> | FR101   |
| 8  | FL102       | <g></g> | FL103     | FR102   | FR103   |
| 9  | FL104       | FL105   | FR104     | <g></g> | FR105   |
| 10 | <g></g>     | <g></g> | <g></g>   | <g></g> | <g></g> |
| 11 | 1+ Cable 17 | 1-      | <sg></sg> | 2+      | 2-      |
| 12 | 3+          | 3-      | <sg></sg> | 4+      | 4-      |
| 13 | 1+ Cable 18 | 1-      | <sg></sg> | 2+      | 2-      |
| 14 | 3+          | 3-      | <sg></sg> | 4+      | 4-      |
| 15 | 1+ Cable 19 | 1-      | <sg></sg> | 2+      | 2-      |
| 16 | 3+          | 3-      | <sg></sg> | 4+      | 4-      |
| 17 | 1+ Cable 20 | 1-      | <sg></sg> | 2+      | 2-      |
| 18 | 3+          | 3-      | <sg></sg> | 4+      | 4-      |
| 19 | <g></g>     | <g></g> | <g></g>   | <g></g> | <g></g> |

Connector 7 (248-286mm) Type B-19 connector (mixed short/long through pins)

| Pos. | A           | B       | C         | D       | E     |
|------|-------------|---------|-----------|---------|-------|
| 1    | FL106       | FL107   | FR106     | <g></g> | FR107 |
| 2  | FL108       | <g></g> | FL109     | FR108   | FR109 |
| 3  | FL110       | FL111   | FR110     | <g></g> | FR111 |
| 4  | FL112       | <g></g> | FL113     | FR112   | FR113 |
| 5  | FL114       | FL115   | FR114     | <g></g> | FR115 |
| 6  | FL116       | <g></g> | FL117     | FR116   | FR117 |
| 7  | FL118       | FL119   | FR118     | <g></g> | FR119 |
| 8  | FL120       | <g></g> | FL121     | FR120   | FR121 |
| 9  | FL122       | FL123   | FR122     | <g></g> | FR123 |
| 10 | <g></g>     | <g></g> | <g></g>   | FL124   | FR124 |
| 11 | 1+ Cable 21 | 1-      | <sg></sg> | <g></g> | FR125 |
| 12 | 3+          | 3-      | <sg></sg> | FL125   | FL126 |
| 13 | 1+ Cable 22 | 1-      | <sg></sg> | <g></g> | FR126 |
| 14 | 3+          | 3-      | <sg></sg> | FL127   | FR127 |
| 15 | 1+ Cable 23 | 1-      | <sg></sg> | <g></g> | FR128 |
| 16 | 3+          | 3-      | <sg></sg> | FL128   | FL129 |
| 17 | 1+ Cable 24 | 1-      | <sg></sg> | <g></g> | FR129 |
| 18 | 3+          | 3-      | <sg></sg> | FL130   | FR130 |
| 19 | <g></g>     | <g></g> | <g></g>   | FL131   | FR131 |

Connector 8 (286-336mm) Type B-25 connector (short through pins)

| Pos. | A       | B       | C       | D       | E     |
|------|---------|---------|---------|---------|-------|
| 1    | FL132   | FR132   | FR133   | <g></g> | JMM0  |
| 2    | FL133   | <g></g> | FL134   | FR134   | JMM1  |
| 3    | FL135   | FR135   | FR136   | <g></g> | JMM2  |
| 4    | FL136   | <g></g> | FL137   | FR137   | JMM3  |
| 5    | FL138   | FR138   | FR139   | <g></g> | JMM4  |
| 6    | FL139   | <g></g> | FL140   | FR140   | JMM5  |
| 7    | FL141   | FR141   | FR142   | <g></g> | JMM6  |
| 8    | FL142   | <g></g> | FL143   | FR143   | JMM7  |
| 9    | FL144   | FR144   | FR145   | <g></g> | JMM8  |
| 10   | FL145   | <g></g> | FL146   | FR146   | JMM9  |
| 11   | FL147   | FR147   | FR148   | <g></g> | JMM10 |
| 12   | FL148   | <g></g> | FL149   | FR149   | JMM11 |
| 13   | FL150   | FR150   | FR151   | <g></g> | JMM12 |
| 14   | FL151   | <g></g> | FL152   | FR152   | JMM13 |
| 15   | FL153   | FR153   | FR154   | <g></g> | JMM14 |
| 16   | FL154   | <g></g> | FL155   | FR155   | JMM15 |
| 17   | FL156   | FR156   | FR157   | <g></g> | JMM16 |
| 18   | FL157   | <g></g> | FL158   | FR158   | JMM17 |
| 19   | FL159   | FR159   | FR160   | <g></g> | JMM18 |
| 20   | FL160   | <g></g> | FL161   | FR161   | JMM19 |
| 21   | FL162   | FR162   | FR163   | <g></g> | JMM20 |
| 22   | FL163   | <g></g> | FL164   | FR164   | JMM21 |
| 23   | <g></g> | <g></g> | <g></g> | <g></g> | JMM22 |
| 24   | CAN+    | <g></g> | TTC+    | <g></g> | JMM23 |
| 25   | CAN-    | <g></g> | TTC-    | <g></g> | JMM24 |

Connector 9 (336-361mm) Type D (N) connector

| Pos. | Supply    |
|------|-----------|
| 2    | +3.3V     |
| 6  | Power GND |
| 10 | +5.0V     |

# 5 Programming Model

#### (24.06.2001)

In the following, a programming model for VME access to registers and memories of the JEM and its components is described. Since the description set out here is indicative only, the details of the provisional memory location and organisation are very likely to change. Each JEP crate will have a local CPU directly accessing the VME part of the crate backplane and thus an entirely self-contained VME address space. Suitable documentation of the programming model will be provided to accompany the prototype and final modules.

Guidelines

- The standard access is VME A24/D16.
- There are no write-only registers except for pulse registers. All registers can be read by the crate CPU via VME.
- The register bits generally have the same meaning for reads as for writes.
  - All Status Registers are read-only registers
  - All Control Registers are read/write or pulse registers
  - Reading back a register will generally return the last value written
- Attempts to write to read-only registers or undefined portions of registers will not change the contents of the non-modifiable fields.

- Fully synchronous design protects all VME registers from being read via crate CPU while being altered by the JEM itself at the same time.
- When the address space occupied by the JEM is accessed, it will always respond with a handshake (DTACK\*) to avoid a bus error.
- The power-up condition of all registers and BlockRAMs will be all zeros, unless otherwise stated.
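
The guidelines above can be summarised in a small software model of a single register, given here as a sketch for test or simulation code. The class name, the `writable_mask` parameter and the 16-bit width (taken from the D16 access) are illustrative assumptions, not part of the JEM design; pulse registers are not modelled.

```python
class Register:
    """16-bit register following the access guidelines listed above."""

    def __init__(self, mode, writable_mask=0xFFFF):
        if mode not in ("RO", "RW"):
            raise ValueError("mode must be 'RO' or 'RW'")
        self.mode = mode
        self.writable_mask = writable_mask & 0xFFFF
        self.value = 0  # power-up condition: all zeros

    def read(self):
        """Every register can be read back by the crate CPU."""
        return self.value

    def write(self, data):
        """Writes never alter read-only registers or undefined bit fields."""
        if self.mode == "RO":
            return
        keep = self.value & ~self.writable_mask
        self.value = (keep | (data & self.writable_mask)) & 0xFFFF
```

For a read/write register, reading back returns the last value written, restricted to the defined bits.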

Notation

- *RO* means that the computer can only read the value of this register; writing has no effect either to the value or the state of the module.
- *RW* means that the computer can affect the state of the module by writing to this register.
- *WO* means that the computer can write to a single memory location, which is then copied to multiple locations, which can be read back individually.

## 5.1.1 Memory Map

In the following an overview of the address scheme for a JEP crate and its subsystems is given. Register descriptions for most of the devices from the sub base address are given below.

| Address line | Use                                                                                       |
|--------------|-------------------------------------------------------------------------------------------|
| 23           | 1                                                                                         |
| 22-19        | Base address for JEM within crate (derived from geographical address pins)                |
| 18           | 0                                                                                         |
| 17-13        | Sub base address: 11x Input FPGA, 1x ROC FPGA, 1x Main FPGA, 1x VME CPLD, 1x Control FPGA |
| 12-1         | Register address space for each FPGA                                                      |
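
As an illustration of the address scheme, the following sketch composes a 24-bit VME address from its fields. It assumes the field boundaries read from the map above (line 23 fixed to 1, lines 22-19 JEM base address, line 18 fixed to 0, lines 17-13 sub base address, lines 12-1 register address) and treats the hexadecimal register offsets of the later tables as even byte offsets. The function name `jem_vme_address` is hypothetical.

```python
def jem_vme_address(jem_base, sub_base, reg_offset):
    """Compose a 24-bit VME address for one JEM register (illustrative only)."""
    if not 0 <= jem_base < 16:
        raise ValueError("JEM base address must fit in 4 bits (lines 22-19)")
    if not 0 <= sub_base < 32:
        raise ValueError("sub base address must fit in 5 bits (lines 17-13)")
    if not 0 <= reg_offset < (1 << 13) or reg_offset % 2:
        raise ValueError("register offset must be even and below 0x2000")
    # Line 23 is fixed to 1 and line 18 is fixed to 0.
    return (1 << 23) | (jem_base << 19) | (sub_base << 13) | reg_offset


# JEM with base address 3, sub base address 0x04, register offset 0x06
print(hex(jem_vme_address(3, 0x04, 0x06)))  # 0x988006
```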

## 5.1.2 Sub Base Address

| VME Address (Hex) | Device              |
|-------------------|---------------------|
| 00                | VME CPLD            |
| 01                | Control FPGA        |
| 02                | ROC FPGA            |
| 03                | Main Processor FPGA |
| 04                | Input FPGA V        |
| 05                | Input FPGA A        |
| 06                | Input FPGA B        |
| 07                | Input FPGA C        |
| 08                | Input FPGA D        |
| 09                | Input FPGA E        |
| 0A                | Input FPGA F        |
| 0B                | Input FPGA G        |
| 0C                | Input FPGA H        |
| 0D                | Input FPGA W        |
| 0E                | Input FPGA Z        |

# 5.2 Input FPGA

The following provides a description of registers provisionally allocated from the InputFPGA memory map. This is liable to change in some areas as the JEM control logic is fully specified and implemented.

| VME Address (Hex) | Type | Size (bits) | Name           | Description                                                                |
|-------------------|------|-------------|----------------|----------------------------------------------------------------------------|
| 00                | RO   | 10          | VERSION_REG    | Firmware version number                                                    |
| 02                | RW   | 10          | CONTROL_REG    | Control register, see below                                                |
| 04                | RO   | 10          | STATUS_REG     | Status register, see below                                                 |
| 06                | RW   | 9           | THRESHOLD_REG  | Low threshold register                                                     |
| 08                | RW   | 8           | MASK_REG       | Switches individual channels off, see below                                |
| 10                | RO   | 8           | CLK_PHASE_EM   | Clock Phase Register em, see below                                         |
| 12                | RO   | 8           | CLK_PHASE_HAD  | Clock Phase Register had, see below                                        |
| 14                | RW   | 8           | DELAY_REG      | Delay register which enables addition of extra delay (1BC) to each channel |
| 20                | RO   | 8           | LL_REG1        | Lock Loss Register 1: indicates which LVDS channel has lost lock           |
| 22                | RO   | 8           | LL_REG2        | Lock Loss Register 2: current link status                                  |
| 24                | RO   | 8           | PARITY_ERR_REG | Parity Error Register: indicates on which channel a parity error occurred  |
| 30                | RO   | 4           | LLC_1e         | Lock Loss Counter for LVDS channel 1e                                      |
| 32                | RO   | 4           | LLC_2e         | Lock Loss Counter for LVDS channel 2e                                      |
| 34                | RO   | 4           | LLC_3e         | Lock Loss Counter for LVDS channel 3e                                      |
| 36                | RO   | 4           | LLC_4e         | Lock Loss Counter for LVDS channel 4e                                      |
| 38                | RO   | 4           | LLC_1h         | Lock Loss Counter for LVDS channel 1h                                      |
| 3A                | RO   | 4           | LLC_2h         | Lock Loss Counter for LVDS channel 2h                                      |
| 3C                | RO   | 4           | LLC_3h         | Lock Loss Counter for LVDS channel 3h                                      |
| 3E                | RO   | 4           | LLC_4h         | Lock Loss Counter for LVDS channel 4h                                      |
| 40                | RO   | 8           | PEC_1e         | Parity Error Counter for LVDS channel 1e                                   |
| 42                | RO   | 8           | PEC_2e         | Parity Error Counter for LVDS channel 2e                                   |
| 44                | RO   | 8           | PEC_3e         | Parity Error Counter for LVDS channel 3e                                   |
| 46                | RO   | 8           | PEC_4e         | Parity Error Counter for LVDS channel 4e                                   |
| 48                | RO   | 8           | PEC_1h         | Parity Error Counter for LVDS channel 1h                                   |
| 4A                | RO   | 8           | PEC_2h         | Parity Error Counter for LVDS channel 2h                                   |
| 4C                | RO   | 8           | PEC_3h         | Parity Error Counter for LVDS channel 3h                                   |
| 4E                | RO   | 8           | PEC_4h         | Parity Error Counter for LVDS channel 4h                                   |
| 50                | RW   | 10          | PLAY_MEM       | Playback memory <sup>13</sup> (256x8x10 bit)                               |
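
The per-channel counters in this map are laid out regularly: the Lock Loss Counters start at offset 0x30 and the Parity Error Counters at 0x40, each advancing by 2 per channel in the order 1e to 4e, then 1h to 4h. A minimal sketch of that lookup follows; the helper name `counter_offsets` is hypothetical and the offsets are as provisional as the table itself.

```python
CHANNELS = ("1e", "2e", "3e", "4e", "1h", "2h", "3h", "4h")
LLC_BASE = 0x30  # first Lock Loss Counter (LLC_1e)
PEC_BASE = 0x40  # first Parity Error Counter (PEC_1e)


def counter_offsets(channel):
    """Return (LLC offset, PEC offset) for one LVDS channel name."""
    index = CHANNELS.index(channel)  # raises ValueError for unknown names
    return LLC_BASE + 2 * index, PEC_BASE + 2 * index


llc, pec = counter_offsets("3h")
print(hex(llc), hex(pec))  # 0x3c 0x4c
```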

# 5.2.1 CONTROL\_REG

A register containing pulsed module controls. Writing zero has no effect.

| Bit | Description                                         |
|-----|-----------------------------------------------------|
| 0   | Reset Lock Loss counters & Lock Loss Register       |
| 1   | Reset Parity Error counters & Parity Error Register |
| 2   | Reset counter for Playback Memory                   |

# 5.2.2 STATUS\_REG

A register containing module status information.

| Bit | Description                     |
|-----|---------------------------------|
| 0   | Lock status of DLL              |
| 1   | Status of synchronisation logic |

# 5.2.3 MASK\_REG

A mask register for switching LVDS channels on (=0) or off (=1). When a channel is switched off, its data are forced to zero.

| Bit | Description                |
|-----|----------------------------|
| 0   | Switches channel 1e on/off |
| 1   | Switches channel 2e on/off |
| 2   | Switches channel 3e on/off |
| 3   | Switches channel 4e on/off |
| 4   | Switches channel 1h on/off |
| 5   | Switches channel 2h on/off |
| 6   | Switches channel 3h on/off |
| 7   | Switches channel 4h on/off |

<sup>13</sup> Consecutive write accesses are used to write playback memory. Right now it is not possible to read back the playback data.
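
As a worked example of the MASK\_REG bit assignment, the sketch below builds the register value for a set of channels to be switched off. The helper name `mask_value` is hypothetical. DELAY\_REG uses the same channel-to-bit order, so the same helper would apply there.

```python
# Bit position of each LVDS channel in MASK_REG (bit 0 = 1e ... bit 7 = 4h)
CHANNEL_BITS = {"1e": 0, "2e": 1, "3e": 2, "4e": 3,
                "1h": 4, "2h": 5, "3h": 6, "4h": 7}


def mask_value(channels_off):
    """Return the MASK_REG value that switches the given channels off (=1)."""
    value = 0
    for channel in channels_off:
        value |= 1 << CHANNEL_BITS[channel]
    return value


print(hex(mask_value(["2e", "4h"])))  # 0x82
```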

# 5.2.4 CLK\_PHASE\_EM

A register for storing the clock phase for electromagnetic LVDS channels. The phase offset is encoded as follows:

0°="00", 90°="01", 180°="10", 270°="11"

| Bit | Description                                        |
|-----|----------------------------------------------------|
| 0-1 | Phase of synchronisation clock for LVDS channel 1e |
| 2-3 | Phase of synchronisation clock for LVDS channel 2e |
| 4-5 | Phase of synchronisation clock for LVDS channel 3e |
| 6-7 | Phase of synchronisation clock for LVDS channel 4e |

# 5.2.5 CLK\_PHASE\_HAD

A register for storing the clock phase for hadronic LVDS channels. The phase offset is encoded as follows:

0°="00", 90°="01", 180°="10", 270°="11"

| Bit | Description                                        |
|-----|----------------------------------------------------|
| 0-1 | Phase of synchronisation clock for LVDS channel 1h |
| 2-3 | Phase of synchronisation clock for LVDS channel 2h |
| 4-5 | Phase of synchronisation clock for LVDS channel 3h |
| 6-7 | Phase of synchronisation clock for LVDS channel 4h |
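
The two clock phase registers share one layout, so a single decoder covers both. The following sketch assumes that the lower-numbered bit of each pair is the least significant bit of the two-bit code; this ordering is not stated in the text. The function name `decode_clk_phase` is hypothetical.

```python
PHASE_DEGREES = (0, 90, 180, 270)  # codes "00", "01", "10", "11"


def decode_clk_phase(value):
    """Return the phase in degrees for channels 1 to 4 of one phase register."""
    phases = []
    for channel in range(4):
        code = (value >> (2 * channel)) & 0b11  # bits 0-1, 2-3, 4-5, 6-7
        phases.append(PHASE_DEGREES[code])
    return phases


print(decode_clk_phase(0b11100100))  # [0, 90, 180, 270]
```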

## 5.2.6 DELAY\_REG

An optional delay of one clock cycle (1BC) per channel ensures that data from the same BC are clocked into the same LHC clock cycle.

| Bit | Description                                              |
|-----|----------------------------------------------------------|
| 0   | Enables addition of extra delay (1BC) to LVDS channel 1e |
| 1   | Enables addition of extra delay (1BC) to LVDS channel 2e |
| 2   | Enables addition of extra delay (1BC) to LVDS channel 3e |
| 3   | Enables addition of extra delay (1BC) to LVDS channel 4e |
| 4   | Enables addition of extra delay (1BC) to LVDS channel 1h |
| 5   | Enables addition of extra delay (1BC) to LVDS channel 2h |
| 6   | Enables addition of extra delay (1BC) to LVDS channel 3h |
| 7   | Enables addition of extra delay (1BC) to LVDS channel 4h |

# 5.3 Main Processor

The following provides a description of registers provisionally allocated from the Main Processor FPGA memory map. This is liable to change in some areas as the JEM control logic is fully specified and implemented.

The Main Processor FPGA is addressed by the Control-FPGA via a chip select (CS) and a read/write (WR) signal. These two point-to-point signals are responsible for device addressing. An additional 1 bit point-to-point connection named WR\_STROBE (derived from VME DS0\*) latches the data into the Main Processor FPGA. The internal register addressing and data transport is done on a 20 bit wide bidirectional bus (RDO\_JEN), of which 7 bits will be used as address bits and 13 bits as data bits.
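
The split of the 20 bit bus into 7 address bits and 13 data bits can be sketched as follows. The text does not say which bus lines carry the address, so placing the address in the upper 7 bits is purely an assumption, and the names `pack_bus_word` and `unpack_bus_word` are hypothetical.

```python
ADDR_BITS = 7   # register address within the Main Processor FPGA
DATA_BITS = 13  # widest register in the map (J2_PORT) is 13 bits


def pack_bus_word(address, data):
    """Combine address and data into one 20-bit word (address in upper bits)."""
    if not 0 <= address < (1 << ADDR_BITS):
        raise ValueError("address must fit in 7 bits")
    if not 0 <= data < (1 << DATA_BITS):
        raise ValueError("data must fit in 13 bits")
    return (address << DATA_BITS) | data


def unpack_bus_word(word):
    """Split a 20-bit word back into (address, data)."""
    return word >> DATA_BITS, word & ((1 << DATA_BITS) - 1)


word = pack_bus_word(0x20, 0x5FF)
print(hex(word), unpack_bus_word(word))
```

Seven address bits cover offsets 0x00 to 0x7F, which is sufficient for the highest offset listed for this device (0x78).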

## 5.3.1 Memory Map

| VME Address (Hex) | Type | Size (bits) | Name        | Description                                                            |
|-------------------|------|-------------|-------------|------------------------------------------------------------------------|
| 00                | RO   | 10          | VERSION_REG | Firmware version number                                                |
| 02                | RW   | 10          | CONTROL_REG | Control Register, see below                                            |
| 04                | RO   | 10          | STATUS_REG  | Status Register, see below                                             |
| 10                | RW   | 10          | ET_THR_REG  | Threshold for Sum E<sub>T</sub> tree                                   |
| 12                | RW   | 10          | JET_THR_REG | Threshold for Jet tree                                                 |
| 20                | RW   | 12          | JET_DEF_0   | Jet definition register (2 bit cluster size + 10 bit energy threshold) |
| 22                | RW   | 12          | JET_DEF_1   | Jet definition register                                                |
| 24                | RW   | 12          | JET_DEF_2   | Jet definition register                                                |
| 26                | RW   | 12          | JET_DEF_3   | Jet definition register                                                |
| 28                | RW   | 12          | JET_DEF_4   | Jet definition register                                                |
| 2A                | RW   | 12          | JET_DEF_5   | Jet definition register                                                |
| 2C                | RW   | 12          | JET_DEF_6   | Jet definition register                                                |
| 2E                | RW   | 12          | JET_DEF_7   | Jet definition register                                                |
| 50                | RW   | 12          | MULT_A_X    | E<sub>T,miss</sub> multiplication register <sup>14</sup>               |
| 52                | RW   | 12          | MULT_A_Y    | E<sub>T,miss</sub> multiplication register                             |
| 54                | RW   | 12          | MULT_B_X    | E<sub>T,miss</sub> multiplication register                             |
| 56                | RW   | 12          | MULT_B_Y    | E<sub>T,miss</sub> multiplication register                             |
| 58                | RW   | 12          | MULT_C_X    | E<sub>T,miss</sub> multiplication register                             |
| 5A                | RW   | 12          | MULT_C_Y    | E<sub>T,miss</sub> multiplication register                             |
| 5C                | RW   | 12          | MULT_D_X    | E<sub>T,miss</sub> multiplication register                             |
| 5E                | RW   | 12          | MULT_D_Y    | E<sub>T,miss</sub> multiplication register                             |
| 60                | RW   | 12          | MULT_E_X    | E<sub>T,miss</sub> multiplication register                             |
| 62                | RW   | 12          | MULT_E_Y    | E<sub>T,miss</sub> multiplication register                             |
| 64                | RW   | 12          | MULT_F_X    | E<sub>T,miss</sub> multiplication register                             |
| 66                | RW   | 12          | MULT_F_Y    | E<sub>T,miss</sub> multiplication register                             |
| 68                | RW   | 12          | MULT_G_X    | E<sub>T,miss</sub> multiplication register                             |
| 6A                | RW   | 12          | MULT_G_Y    | E<sub>T,miss</sub> multiplication register                             |
| 6C                | RW   | 12          | MULT_H_X    | E<sub>T,miss</sub> multiplication register                             |
| 6E                | RW   | 12          | MULT_H_Y    | E<sub>T,miss</sub> multiplication register                             |
| 70                | RO   | 8           | EX_PORT     | Spy memory for E<sub>x</sub> <sup>15</sup> (256x8 bit)                 |
| 72                | RO   | 8           | EY_PORT     | Spy memory for E<sub>y</sub> (256x8 bit)                               |
| 74                | RO   | 9           | ET_PORT     | Spy memory for E<sub>T</sub> and parity bit (256x8 bit + parity bit)   |
| 76                | RO   | 12          | J1_PORT     | Spy memory for jet multiplicities (256x4x3 bit)                        |
| 78                | RO   | 13          | J2_PORT     | Spy memory for jet multiplicities (256x4x3 bit + parity bit)           |
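
Each jet definition register combines a 2 bit cluster size with a 10 bit energy threshold in a 12 bit word. The sketch below shows one possible packing; the position of the two fields within the word is not given in this document, so placing the cluster size code in the upper two bits is an assumption, as are the names `pack_jet_def` and `unpack_jet_def`.

```python
THRESHOLD_BITS = 10
CLUSTER_BITS = 2


def pack_jet_def(cluster_code, threshold):
    """Pack cluster size code and energy threshold into a 12-bit value."""
    if not 0 <= cluster_code < (1 << CLUSTER_BITS):
        raise ValueError("cluster size code must fit in 2 bits")
    if not 0 <= threshold < (1 << THRESHOLD_BITS):
        raise ValueError("energy threshold must fit in 10 bits")
    # Assumed layout: bits 11-10 cluster size code, bits 9-0 threshold.
    return (cluster_code << THRESHOLD_BITS) | threshold


def unpack_jet_def(value):
    """Return (cluster size code, energy threshold) from a 12-bit value."""
    return value >> THRESHOLD_BITS, value & ((1 << THRESHOLD_BITS) - 1)


print(hex(pack_jet_def(2, 0x155)))  # 0x955
```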

# 5.3.2 CONTROL\_REG

A register containing pulsed module controls. Writing zero has no effect.

| Bit | Description                                                   |
|-----|---------------------------------------------------------------|
| 0   | Enable Spy                                                    |
| 1   | Reset counter for E <sub>T,miss</sub> multiplication register |
| 2   | Reset read counter for Spy Memory                             |

<sup>14</sup> Consecutive read/write accesses are used to read and write the multiplication register.

<sup>15</sup> Consecutive read accesses are used to read spy memory.

# 5.3.3 STATUS\_REG

A register containing module status information.

| Bit | Description        |
|-----|--------------------|
| 0   | Lock status of DLL |

# 5.4 Control FPGA

## 5.4.1 Memory Map

| VME Address (Hex) | Type | Size (bits) | Name        | Description                 |
|-------------------|------|-------------|-------------|-----------------------------|
| 00                | RO   | 16          | VERSION_REG | Firmware version number     |
| 02                | RW   | 16          | CONTROL_REG | Control Register, see below |
| 04                | RO   | 16          | STATUS_REG  | Status Register, see below  |
| 10                | RW   | 16          | TTC_REG     | TTC Register, see below     |
| 20                | RW   | 8           | TTCrx       | TTCrx Register, see below   |

# 5.4.2 CONTROL\_REG

A register containing pulsed module controls. Writing zero has no effect.

| Bit | Description                            |
|-----|----------------------------------------|
| 0   | Global Reset (fan-out on TTC_DATA bus) |

# 5.4.3 STATUS\_REG

A register containing module status information.

| Bit | Description                           |
|-----|---------------------------------------|
| 0   | Lock status of quartz clock DLL       |
| 1   | Lock status of board level deskew DLL |

# 5.4.4 TTC\_REG

A register containing pulsed module controls. Writing zero has no effect.

| Bit | Description                |
|-----|----------------------------|
| 0-7 | Encoded Broadcast commands |

For encoded Broadcast commands refer to "Use of TTC-System and BUSY Network".

# 5.4.5 TTCrx\_REG

This register provides access to the TTCrx chip. More details follow.

# 5.5 VME CPLD

#### 5.5.1 Memory Map

| VME Address (Hex) | Type | Size (bits) | Name           | Description                            |
|-------------------|------|-------------|----------------|----------------------------------------|
| 00                | RO   | 16          | MOD_ID_A       | Module ID A: module type               |
| 02                | RO   | 16          | MOD_ID_B       | Serial number & Module Revision number |
| 04                | RO   | 16          | VERSION_REG    | Firmware version number                |
| 06                | RO   | 8           | STATUS_REG     | Status register, see below             |
| 10                | RW   | 15          | CFG_MASK_REG   | Mask FPGAs for configuration           |
| 12                | RW   | 14          | FPGA_RESET_REG | Reset FPGA                             |
| 14                | RW   | 8           | CFG_REG        | Configuration Register                 |

## 5.5.2 MOD\_ID\_B

A register containing module identifiers.

| Bit   | Description     |
|-------|-----------------|
| 0-7   | Serial number   |
| 8-15  | Revision number |

## 5.5.3 STATUS\_REG

A register containing module status information.

| Bit | Description         |
|-----|---------------------|
| 0   | TTC clock available |

# 5.5.4 CFG\_MASK\_REG

A register for masking FPGAs which will be configured via CFG\_REG.

| Bit | Description              |
|-----|--------------------------|
| 0   | Configure Control-FPGA   |
| 1   | Configure ROC FPGA       |
| 2   | Configure Main Processor |
| 3   | Configure Input-FPGA V   |
| 4   | Configure Input-FPGA A   |
| 5   | Configure Input-FPGA B   |
| 6   | Configure Input-FPGA C   |
| 7   | Configure Input-FPGA D   |
| 8   | Configure Input-FPGA E   |
| 9   | Configure Input-FPGA F   |
| 10  | Configure Input-FPGA G   |
| 11  | Configure Input-FPGA H   |
| 12  | Configure Input-FPGA W   |
| 13  | Configure Input-FPGA Z   |
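
A short sketch of how a CFG\_MASK\_REG value could be assembled from device names is given below. The tuple `FPGA_BITS` and the helper `fpga_mask` are hypothetical names; only the bit order is taken from the table above. FPGA\_RESET\_REG (section 5.5.5) lists the devices in the same bit order, so the helper would serve for both registers.

```python
# Device assigned to each bit, from bit 0 (Control-FPGA) to bit 13 (Input-FPGA Z)
FPGA_BITS = ("Control", "ROC", "Main",
             "Input V", "Input A", "Input B", "Input C", "Input D",
             "Input E", "Input F", "Input G", "Input H", "Input W", "Input Z")


def fpga_mask(names):
    """Return the mask value that selects the named FPGAs."""
    value = 0
    for name in names:
        value |= 1 << FPGA_BITS.index(name)  # ValueError for unknown names
    return value


all_inputs = [name for name in FPGA_BITS if name.startswith("Input")]
print(hex(fpga_mask(["Control", "Input A"])), hex(fpga_mask(all_inputs)))
```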

# 5.5.5 FPGA\_RESET\_REG

A register for resetting FPGAs.

| Bit | Description          |
|-----|----------------------|
| 0   | Reset Control-FPGA   |
| 1   | Reset ROC FPGA       |
| 2   | Reset Main Processor |
| 3   | Reset Input-FPGA V   |
| 4   | Reset Input-FPGA A   |
| 5   | Reset Input-FPGA B   |
| 6   | Reset Input-FPGA C   |
| 7   | Reset Input-FPGA D   |
| 8   | Reset Input-FPGA E   |
| 9   | Reset Input-FPGA F   |
| 10  | Reset Input-FPGA G   |
| 11  | Reset Input-FPGA H   |
| 12  | Reset Input-FPGA W   |
| 13  | Reset Input-FPGA Z   |

### 5.5.6 CFG\_REG

A register which sends the bits to DIN-port of the FPGAs selected by CFG\_MASK\_REG.

| Bit  | Description                             |
|------|-----------------------------------------|
| 0-15 | Data for configuration of selected FPGA |

# 5.6 ROC FPGA

#### 5.6.1 Memory Map

| VME Address (Hex) | Type | Size (bits) | Name              | Description                     |
|-------------------|------|-------------|-------------------|---------------------------------|
| 00                | RO   | 8           | VERSION_REG       | Firmware version number         |
| 02                | RW   | 8           | CONTROL_REG       | Control Register, see below     |
| 04                | RO   | 8           | GLINK_STATUS_REG  | Status Register, see below      |
| 10                | RW   | 8           | LATENCY_REG       | Latency register, see below     |
| 12                | RW   | 3           | SLICE_REG         | Slice count register, see below |
| 14                | RW   | 8           | ROI_REG           | ROI register, see below         |
| 16                | RW   | 8           | BC_OFFSET_REG     | BC counter offset, see below    |
| 18                | RW   | 8           | GLINK_CONTROL_REG | G-link settings, see below      |

## 5.6.2 CONTROL\_REG

A register containing (non-pulsed) module controls.

| Bit | Description                    |
|-----|--------------------------------|
| 0   | Chip Reset for G-links (Slice) |
| 1   | Chip Reset for G-links (ROI)   |

## 5.6.3 GLINK\_STATUS\_REG

A register containing module status information.

| Bit | Description                   |
|-----|-------------------------------|
| 0   | Lock status of G-link (Slice) |
| 1   | Lock status of G-link (ROI)   |
| 2   | INV status of G-link (Slice)  |
| 3   | INV status of G-link (ROI)    |
| 4   | RFD status of G-link (Slice)  |
| 5   | RFD status of G-link (ROI)    |
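
For monitoring software, the status word can be unpacked into named flags. The sketch below follows the bit order of the table above; the field names and the function `decode_glink_status` are hypothetical, and the polarity of each flag (whether 1 means asserted) is assumed.

```python
# Flag assigned to each bit of GLINK_STATUS_REG, from bit 0 to bit 5
GLINK_STATUS_FIELDS = ("lock_slice", "lock_roi",
                       "inv_slice", "inv_roi",
                       "rfd_slice", "rfd_roi")


def decode_glink_status(value):
    """Return a dict of flag name to bool for one status register value."""
    return {name: bool((value >> bit) & 1)
            for bit, name in enumerate(GLINK_STATUS_FIELDS)}


status = decode_glink_status(0b010011)
print(status["lock_slice"], status["lock_roi"], status["rfd_slice"])
```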

# 5.6.4 LATENCY\_REG

A register containing the latency correction for the READ\_REQUEST signal. The READ\_REQUEST signal initiates DAQ readout on all Input-FPGAs and on the Main Processor.

| Bit | Description        |
|-----|--------------------|
| 0-4 | Latency correction |

# 5.6.5 SLICE\_REG

The slice count register contains the number of consecutive slices which will be read out after a L1A signal. The maximum number is five.

| Bit | Description |
|-----|-------------|
| 0-2 | Slice count |

## 5.6.6 ROI\_REG

A register containing the latency correction for ROI readout.

| Bit | Description            |
|-----|------------------------|
| 0-4 | ROI latency correction |

## 5.6.7 BC\_OFFSET\_REG

A register containing the timing correction for the local bunch crossing counter.

| Bit | Description                             |
|-----|-----------------------------------------|
| 0-4 | Bunch crossing offset timing correction |
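
The readout timing registers described in sections 5.6.4 to 5.6.7 are narrow fields with simple range limits. A minimal sketch of a range check prior to writing them follows. It assumes that each field holds the plain binary value, and the names `encode_field` and `roc_timing_settings` are hypothetical. The lower bound of the slice count is not stated in the text, so zero is accepted here.

```python
def encode_field(value, width, maximum=None):
    """Check that value fits the field and return it unchanged."""
    limit = (1 << width) - 1 if maximum is None else maximum
    if not 0 <= value <= limit:
        raise ValueError(f"value {value} outside range 0..{limit}")
    return value


def roc_timing_settings(latency, slices, roi_latency, bc_offset):
    """Map ROC FPGA register offsets to the values to be written."""
    return {
        0x10: encode_field(latency, 5),            # LATENCY_REG, bits 0-4
        0x12: encode_field(slices, 3, maximum=5),  # SLICE_REG, at most five
        0x14: encode_field(roi_latency, 5),        # ROI_REG, bits 0-4
        0x16: encode_field(bc_offset, 5),          # BC_OFFSET_REG, bits 0-4
    }


settings = roc_timing_settings(latency=12, slices=5, roi_latency=3, bc_offset=31)
print({hex(offset): value for offset, value in settings.items()})
```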

## 5.6.8 GLINK\_CONTROL\_REG

A register containing static (non-pulsed) module controls.

| Bit | Description                             |
|-----|-----------------------------------------|
| 0   | Enable data (G-link Slice)              |
| 1   | Enable data (G-link ROI)                |
| 2   | Select Double Frame Mode (G-link Slice) |
| 3   | Select Double Frame Mode (G-link ROI)   |

# 6 Glossary

The following lists some of the terms and acronyms used in the text.

| Term        | Meaning                                                                  |
|-------------|--------------------------------------------------------------------------|
| Backplane   | Multi-purpose high-speed common backplane within JEP and CP crate        |
| CAN         | Controller Area Network                                                  |
| CMM         | Common Merger Module                                                     |
| CP          | Cluster Processor sub-system of the Calorimeter Trigger                  |
| CPM         | Cluster Processor Module                                                 |
| CTP         | Central Trigger Processor                                                |
| DCS         | Detector Control System                                                  |
| DLL         | Delay-Locked Loop                                                        |
| FIO         | Fan in / fan out signal lines on the backplane                           |
| FPGA        | Field Programmable Gate Array                                            |
| G-link      | Hewlett-Packard Gigabit serial link                                      |
| JEM         | Jet/Energy processor module                                              |
| JEP         | Jet/Energy Processor sub-system of the Calorimeter Trigger               |
| JEP crate   | Electronics crate processing two quadrants of trigger space with 16 JEMs |
| jet element | Electromagnetic + hadronic energy sum in a 0.2×0.2 cell                  |
| LVDS        | Low-Voltage Differential Signalling                                      |
| PPr         | Pre-processor                                                            |
| ROB         | Read-out Buffer                                                          |
| ROC         | Read-out Controller logic on the CPM                                     |
| ROD         | Read-out Driver module                                                   |
| TCM         | Timing and Control Module                                                |
| TTC         | Trigger Timing and Control                                               |

# Figures

- Figure 1: JEP channel map
- Figure 2: JEM channel map, including quadrant overlaps (Z,W) and V
- Figure 3: JEM block diagram
- Figure 4: JEM input processor
- Figure 5: JEM main processor FPGA
- Figure 6: RoI and DAQ slice data path
- Figure 7: Input processor: readout sequencer
- Figure 8: JEM input map (barrel)
- Figure 10: Barrel vs. FCAL (JEM0) map
- Figure 11: Overview of the read-out logic
- Figure 12: slice data bit stream format (a-c) and G-link bit map (d)
- Figure 13: RoI data bit stream format (a-b) and G-link bit map (c)

Change log:

10/02:

- cleanup and minor corrections
- add section on new input synchronisation
- correct for new DAQ/ROI data formats
- remove description of serialising block memories, no more start/stop bits to ROC
- correct for current VME address allocation

10/30: Modifications (SBS)

- 3.0 Crate mechanics according to IEEE 1101.10
- 3.3.3 Jet algorithm specification:
  - Add possibility of multiple sets of threshold definitions
  - Saturated jet elements no longer automatically set OF condition
- 3.5.6 Deskew2 clock added to main processor FPGA to time in FIO from neighboring JEMs