ATLAS Level-1 Calorimeter Trigger
GOLD Module V1.0
Project Specification
Version 0.1
Date: 23 November 2010
Uli, Bruno, Eduard
Universität Mainz
1.2.2 Level-2 interface, monitoring, diagnostics, and control
2 Functional requirements and design issues
2.3 Jet input data conditioning and energy pre-summation
2.7.1 Playback and spy functionality
2.8 Board level issues: Power supplies and line impedances
3.2 Jet input data conditioning
3.5 FCAL and endcap calorimeter treatment
3.8 Signal levels and supply voltages
3.10 Mezzanine module iterations
3.11 Summary of post-FDR design changes
4 Interfaces: connectors, pinouts, data formats
4.2 Backplane connector layout
This document describes the specification for the initial version 1.0 of the Generic Opto Link Demonstrator (GOLD). The specifications cover the processor board as well as mezzanine modules carrying clock circuitry and opto-electrical receivers. Section 1 of this document gives an overview of the module. Section 2 describes the functional requirements. Section 3 details the implementation.
The GOLD is a demonstrator for the Topological Processor module and is therefore a major processor module within the Cluster and Jet processor scheme of the ATLAS Level-1 calorimeter trigger. The Topological Processor will be located between the CMM++ and the CTP in the L1Calo architecture. A demonstrator for the CMM++ is the BLT. The future CTP is assumed to be electrically identical to its current incarnation, though the firmware and the data formats on its interfaces might differ.
TTC http://www.cern.ch/TTC/intro.html
Current L1Calo modules http://hepwww.rl.ac.uk/Atlas-L1/Modules/Modules.html
TTCDec http://hepwww.rl.ac.uk/Atlas-L1/Modules/Components.html
CTP
The Topological Processor within the L1Calo architecture is a single processor crate equipped with one or more processor modules. The processor modules will either be identical copies with firmware adapted to the specific topologies they operate on, or modules tailored in hardware to a specific topology. The GOLD is a first technology demonstrator for the Topological Processor and other future L1Calo modules. These specifications describe its use as a Topological Processor demonstrator only. It is meant to be operated on output signals of a first prototype of the CMM++, or initially on the BLT.
The GOLD will receive the topological output data of the sliding-window processors. The details of the data formats are not currently known. However, the information transmitted into the GOLD will essentially comprise the ROI data that are currently transmitted to the 2nd level trigger. The data will consist of a description of the position of an object (jet, e/m cluster, or tau) along with some qualifying information, essentially the energy sum within the object. Preliminary data formats have been devised. Data are going to be transmitted on optical fibres. After conversion to electrical representation, data are received and processed in FPGAs. Results are sent to the Central Trigger Processor.
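Since the data formats are not yet fixed, the following Python sketch merely illustrates the kind of ROI word described above, packing an object position (eta, phi) and an energy sum into a single word; all field names and bit widths are assumptions made for this example only and are not part of this specification.

# Hypothetical ROI word layout; the text above only states that an ROI carries
# an object position plus a qualifying energy sum.  The field widths chosen
# here are illustrative assumptions, not the final format.
ETA_BITS, PHI_BITS, ET_BITS = 6, 6, 13

def pack_roi(eta, phi, et):
    # Pack the fields as [et | phi | eta], eta in the least significant bits.
    assert 0 <= eta < 2**ETA_BITS and 0 <= phi < 2**PHI_BITS and 0 <= et < 2**ET_BITS
    return (et << (ETA_BITS + PHI_BITS)) | (phi << ETA_BITS) | eta

def unpack_roi(word):
    eta = word & (2**ETA_BITS - 1)
    phi = (word >> ETA_BITS) & (2**PHI_BITS - 1)
    et = word >> (ETA_BITS + PHI_BITS)
    return eta, phi, et

print(unpack_roi(pack_roi(eta=12, phi=33, et=250)))   # -> (12, 33, 250)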
The GOLD, along with its mezzanine modules, will populate three adjacent slots of a standard ATCA backplane. Backplane connectivity is made to a single ATCA slot only.
The GOLD is a module in ATCA form factor. Backplane zones 2 and 3 are used for real-time data transmission. The module incorporates two distinct processing schemes. One is based on currently available FPGAs (XC6VLXT series); the other makes use of FPGAs (XC6VHXT) that are expected to become available very soon. The baseline operation of the GOLD relies on the LXT scheme. The HXT FPGAs will not be mounted initially.
The CMM++ data enter the GOLD optically through the backplane. The fibres are fed via five blind-mate backplane connectors that can carry up to 72 fibres each. That limits the maximum aggregate bandwidth to 3.6 Tb/s line rate, if 10 Gb/s transmission is used throughout. Initially, however, the interfaces will be used at lower rates, and with just 12 fibres per connector. The optical signals are converted to electrical signals in 12-fibre receivers. Data reception is handled on a mezzanine module, stacked on the component side of the GOLD. The GOLD main board is fed with electrical signals, which are routed through a total of four industry-standard FMC connectors with differential signal layout. The data are routed to four FPGAs, where they are de-serialised in MGT receivers; the parallel data are presented to the FPGA fabric. Part of the algorithms run in the four input processors; the actual topological algorithms, which require access to data from the full eta/phi space of L1Calo, run on the main processor chip. The final results are transmitted towards the CTP on optical fibres. The signals are converted to electrical form and de-serialised with low-latency receivers on an adapter module located near the CTP crate.
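For orientation, the aggregate input bandwidth quoted above follows directly from the connector and fibre counts; the short sketch below reproduces the figure (the initial line rate used in the second calculation is an assumed example value, since only the reduced fibre count is specified).

# Maximum aggregate optical input bandwidth, using the figures quoted above.
connectors = 5                 # blind-mate backplane connectors
fibres_per_connector = 72      # maximum fibres per connector
line_rate_gbps = 10.0          # Gb/s per fibre, if used throughout

aggregate_gbps = connectors * fibres_per_connector * line_rate_gbps
print(aggregate_gbps / 1000)   # 3.6 Tb/s maximum aggregate line rate

# Initial configuration: only 12 fibres per connector and a lower line rate.
# The 6.4 Gb/s figure is an assumed example value.
print(connectors * 12 * 6.4)   # 384 Gb/s in this example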
The operation of the real-time data path requires low-jitter clocks throughout the system. For synchronous operation, data transmitters will have to be operated with clean multiples of the LHC bunch clock. Receiver reference clocks may also be derived from local crystal oscillators, though tight limits on the frequency range will have to be observed. The module will be designed for 40.08 MHz operation only.
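As a small illustration of what clean multiples of the LHC bunch clock mean in practice, the sketch below lists a few candidate reference-clock frequencies; the particular multiples shown are assumptions, not design choices fixed by this document.

# Candidate MGT reference clocks derived from the 40.08 MHz LHC bunch clock.
# The multiples listed are examples only.
bunch_clock_mhz = 40.08
for multiple in (4, 6, 8):
    print(multiple, "x bunch clock =", round(bunch_clock_mhz * multiple, 2), "MHz")
# 4 x -> 160.32 MHz, 6 x -> 240.48 MHz, 8 x -> 320.64 MHz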
The optical data arrive on the input mezzanine module on standard 12-fibre bundles. Since the backplane connectors support multiples of 12 fibres, the signals might in a future version of the board be supplied by “octopus” cable sets, splitting 24, 36, 48, or 72 fibres into groups of 12. It should be noted that unarmoured, bare fibre bundles are very flexible and can easily be routed to their destination, even in high-density designs. However, they need to be protected from mechanical stress. The opto-electrical conversion will be performed in AVAGO or compatible 12-channel devices. Industry-standard SNAP12 devices are not considered an option, since so far no high-bandwidth devices are available from any supplier. The opto receivers provide programmable pre-emphasis, allowing the signal integrity to be improved for a given track length.
Since the number of input channels will initially be higher than the number of optical inputs, the electrical signals are routed from the o/e converters into CML fan-outs, and then onto the mezzanine connectors. While the mezzanine module concept with a rather inexpensive and almost passive (except for the fan-outs) carrier for the opto receivers is particularly useful in the demonstrator and prototyping phase, it is envisaged for the production modules as well. Such a concept allows several instances of a topological processor to be run with different signal routing, optimised for the respective algorithms. The CML circuits allow for neither pre-emphasis nor signal equalisation. Therefore signal conditioning on the data sources and sinks is required, if longer tracks are to be used.
On the main board the multi-gigabit links are de-serialised on the input processors, which allow for programmable signal equalisation on their inputs. The exact data formats are as yet unknown, though standard 8b/10b encoding is envisaged for the purposes of run-length limitation and DC balance. The processors are supplied with the required bias and termination voltages, as well as a suitable reference clock.
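Since 8b/10b transmits ten line bits for every eight payload bits, the usable payload bandwidth is 80% of the line rate; the sketch below shows the overhead for two example line rates (the rates themselves are illustrative, not fixed by this specification).

# Payload bandwidth after 8b/10b encoding for example line rates.
for line_rate_gbps in (6.4, 10.0):
    payload_gbps = line_rate_gbps * 8 / 10
    print(line_rate_gbps, "Gb/s line rate ->", payload_gbps, "Gb/s payload")
# 6.4 Gb/s -> 5.12 Gb/s payload, 10.0 Gb/s -> 8.0 Gb/s payload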
The input processors are assumed to do some local pre-processing on the incoming data. The extent of local processing is determined by the respective algorithms.
The main processor has access to data from the full solid angle in eta and phi. Data volumes are limited by the general-purpose I/O bandwidth of the FPGA. In the first iteration of the module an XC6VLX240T is used. With a line rate of 1 Gb/s and a total of 240 differential lanes into the main processor, the aggregate bandwidth is limited to 240 Gb/s. This theoretical limit probably cannot be reached, since a fraction of the lanes might be required for control purposes, as the main processor doubles as a module controller. Bandwidth can be increased when larger, footprint-compatible FPGAs are mounted on the PCB. The use of multi-gigabit links on this path is not effective due to their higher latencies. 600-lane FPGAs of the XC6V series have been announced, but are not yet available. Due to the currently limited aggregate bandwidth of parallel I/O, the results of the topological algorithms are sent on to the Central Trigger Processor on multi-gigabit links. A single fibre-optical ribbon connection through the front panel of the module is provided for this purpose. An adapter board will be required to interface the GOLD to the LVDS inputs of the current CTP. This module will be equipped with CML-to-LVDS de-serialisers (DS32EL0124 or similar).
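The 240 Gb/s figure quoted above is simply the lane count times the per-lane rate; the sketch below repeats the calculation and shows how reserving a fraction of lanes for control reduces the usable figure (the 10% control fraction is an assumption for illustration only).

# Aggregate general-purpose I/O bandwidth into the main processor FPGA.
lanes = 240              # differential lanes into the main processor
lane_rate_gbps = 1.0     # Gb/s per lane

aggregate_gbps = lanes * lane_rate_gbps
print(aggregate_gbps)    # 240 Gb/s theoretical limit

# Some lanes will be needed for control, since the main processor doubles as
# module controller; the 10% fraction below is an assumed example value.
control_fraction = 0.10
print(aggregate_gbps * (1 - control_fraction))   # 216 Gb/s usable in this example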
Both the FPGA fabric and the multi-Gigabit transceivers (MGT) need to be supplied with low latency clock signals of appropriate signal levels and signal quality. The fabric clocks are multiples of the LHC bunch clock at LVDS levels. The MGT clocks are CML level signals, AC-coupled into the MGT clock pads of the FPGAs. The jitter of the MGT clocks has to be tightly controlled. Clock generation and jitter clean-up are performed on a clock and control mezzanine module, located towards the bottom of the GOLD. The clock distribution trees are located on the GOLD main board. The clock fan-out chips chosen are devices with multiple signal level input compatibility, and output levels suitable for their respective use, either LVDS or CML. The devices are manufactured by Micrel.
There are two independent clock trees for the fabric clocks into all FPGAs. There is one common MGT clock tree into all FPGAs. Another MGT clock tree supplies all XC6VLXT devices. There is yet another clock tree available to supply the XC6VHXT devices (see below).
While the GOLD is basically meant to be a demonstrator for the Multi-Gigabit real-time data paths of future L1Calo modules, an attempt is made to prototype various module controls for future ATCA based modules as well.
The GOLD is a purely FPGA-based ATCA module. There are no dedicated communications processors available on the module. Therefore there is only a single predefined control path that could possibly allow for access to a module with unconfigured FPGA devices. This is the I2C port available on all ATCA modules in zone 1 (see below). CAN (see below) could also be an option for pre-configuration access. Due to the very limited bit rates on these two networks, neither of them is considered suitable for initial access in a fairly experimental setup, where there is quite some probability of a flash-card based FPGA configuration failing. FPGA configuration with up to the order of 100 MB of configuration data per GOLD would be impossible to achieve through a service network with a bandwidth of just hundreds of kilobits per second.
For this reason the GOLD will be equipped with a USB microcontroller that supports the 480 Mb/s USB high-speed mode. The microcontroller is interfaced to the FPGAs via the JTAG protocol. The microcontroller design is fully compatible with the Xilinx platform USB solution. Therefore the crucial operation of FPGA configuration is available from both Linux and Windows without the need for any software to be written. Either production firmware or service code can be loaded to the FPGAs as required. Debug with the Xilinx ChipScope tool set will be supported out of the box. JTAG user access to the FPGAs will be possible with a sustained rate of several megabits per second. However, some application software would have to be written for this purpose.
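To make the bandwidth argument concrete, the sketch below compares rough configuration times over a slow service network and over the USB/JTAG path; both effective rates are assumed order-of-magnitude values, not measured figures.

# Rough FPGA configuration-time comparison motivating the USB access path.
config_data_bits = 100e6 * 8        # ~100 MB of configuration data per GOLD

service_network_bps = 400e3         # assumed: a few hundred kb/s (I2C or CAN)
usb_jtag_bps = 20e6                 # assumed: effective JTAG-over-USB throughput

print(config_data_bits / service_network_bps / 60)   # ~33 minutes via I2C/CAN
print(config_data_bits / usb_jtag_bps)                # ~40 seconds via USB/JTAG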
On the GOLD the USB sub-system will be built on a mezzanine module, though it is envisaged to make this pre-configuration and debug access port available as an integral part of future L1Calo ATCA production modules. Pre-configuration access beyond the use case described above would be possible, since the USB microcontroller can at any time be loaded with non-Xilinx hex code via the USB. However, initially on the GOLD there will be no additional electrical connectivity available beyond the Xilinx standard wiring to the JTAG port. A hardware upgrade might be possible at a later stage if required (spare connector pins to be confirmed).
The baseline FPGA configuration scheme on the GOLD is via a CompactFlash card and the Xilinx SystemACE chip. This configuration scheme has been successfully employed on several L1Calo modules so far, including the JEM. On the JEM the required software and firmware have been devised to allow for in-situ update of the CompactFlash images. The required board-level connectivity will be available on the GOLD to write the flash cards through some network connection to the FPGAs, once they are configured (the available bandwidth through the CPLD remains to be checked). The inevitable bootstrap problem is solved through the USB based pre-configuration access scheme described above.
Board-level temperature and voltage monitoring will be a requirement on future L1Calo modules. The current L1Calo modules are monitored via CAN bus. The default ATCA monitoring and control path is via I2C links in zone 1. For the GOLD both monitoring paths will be available. The backplane I2C link is run via an IPMB-compatible driver into the control CPLD. This CPLD is basically used for a further stage of routing only. The I2C controller can be located either in the main processor FPGA or inside a microcontroller located on a mezzanine module on the GOLD. It will be possible to mount a CAN microcontroller on a future version of the clock/control mezzanine. The required connectivity is available on the mezzanine socket.
On standard ATCA modules, IP connectivity is required for module control. This is generally provided by two 10/100/1000 Ethernet ports on the backplane in zone 2 (redundant base interface). This is a requirement for an ATCA compliant backplane, but not for an ATCA compliant module: the module is free to ignore the Ethernet ports and provide IP connectivity by any suitable means. On future L1Calo ATCA crates one would probably avoid wiring two slots exclusively to the two base interface ports. On the GOLD a few differential signal pairs towards the bottom of the module are reserved for module access via Ethernet, PCIe, or other connectivity. The respective lines are wired to the clock/control mezzanine, where the required connectivity can be made to a local microcontroller, an SGMII Phy device, or straight into an MGT link on the main processor.
Since the GOLD will initially be used in the existing phase-0 environment, the scheme for transmitting event data into the ATLAS data acquisition and 2nd level trigger is made compatible with the existing modules. A single optical channel into each of the DAQ and ROI RODs is provided on the clock mezzanine module. The optical interface is made via SFP sockets. Since it is not yet known whether it will be necessary and possible to implement the G-Link protocol on Xilinx MGT circuits, the SFPs can alternatively be driven from the FPGA fabric, if a new clock mezzanine module is made. There are enough resources available on the clock mezzanine socket to allow for either MGT or fabric based connectivity into DAQ and 2nd level trigger.
The opto module is a mezzanine module on the GOLD main board. It is stacked on the component side of the GOLD with four standard FMC connectors. All FMC connectors have the same signal-type-specific pin allocation. Thus four identical mezzanine modules can be mounted on a GOLD, though for reasons of module mechanics and keep-out zones on the main board, a mezzanine module would normally be connected via two or four FMC connectors. A four-connector mezzanine module will require special tools to remove it from its sockets, due to the retention forces.
Clock reception and conditioning circuitry is implemented on a mezzanine board. The mezzanine connector carries all required clock signals (multiple MGT and fabric clocks for the FPGAs), as well as general control signals. Due to the limited real estate of the clock mezzanine it will most likely not be possible to drive all clock and control lines concurrently. To allow full main board functionality to be tested, there will be more than one version of the clock module. However, later versions of the clock mezzanine module are expected to drive all lines that are actually required for the GOLD to perform in the L1Calo phase-1 environment.
The GOLD, as described above, is the module version that will be assembled initially. It makes use of FPGA processors that are currently available on the market. The five FPGAs of type XC6VLXT are mounted in the top half of the GOLD. They cover about a third of the real estate of the module. Another third is taken by auxiliary circuitry and connectors. This leaves a third of the module space for another four FPGAs of type XC6VHXT. Those FPGAs should appear on the market rather soon. They provide higher MGT link counts and speeds, as well as higher processing capacity. HXT processors might be used on future L1Calo modules, though the properties of the MGT circuitry compare rather unfavourably with the upcoming Virtex-7 family. The HXT devices suffer from a split bit-rate range: up to 6.5 Gb/s for the slower MGTs, and a non-overlapping 10 Gb/s only for the faster transceivers on the same chip. That is very inconvenient and imposes considerable limitations on an architecture based on such devices.
The GOLD allows for mounting one pair of HXT devices connected to the lower pair of FMC connectors. Another pair of FPGAs is wired to zone 2 of the ATCA backplane connector. The HXT subsystem is not within the scope of this specification document; however, an appendix giving a few details on the HXT design will be added to a future version of this document. It should be noted that the power distribution system of the GOLD was designed to allow for concurrent operation of the LXT and the HXT subsystems. However, the power dissipation of the module would require special consideration when operating all of the FPGAs at full performance concurrently.