Timing Calibration Scheme

The JEP and CP systems employ several stages of similar, but not identical, circuitry to align clock and data, so that data are captured while valid and the data of all channels stay in the bunch tick to which they belong. To this end, both phase (a fraction of a clock tick) and delay (full clock ticks) must be aligned. This is generally done by selecting one of several sampling phases (rising/falling edge or four-phase clock) individually for each channel and re-registering the data to a common clock phase before sending them into the algorithmic parts of the FPGA code. Re-registering adds latency if the latest incoming data happen to be registered on a phase that differs from the common clock phase.

The optimum sampling phase can be determined either by the firmware, which counts parity errors for the individual phases, or by a delay scan at one common phase. The firmware approach deals with single channels only; channel-to-channel alignment must then be done in software. The delay scan is a fully software-driven scheme. It determines both the optimum sampling phase and the skew between channels in one go.

Phases and delays need to be measured again whenever modules or cables have been swapped. The results are stored in the database and loaded into the modules at the start of a run.

JEM

The JEM timing calibration scheme relies on delay scans to determine the optimum sampling phase for each channel. The input processors latch the incoming LVDS parallel data on both the rising and the falling clock edge. The delay scan results are used to select whichever of the two samples is expected to be error free. The data eyes are nominally 25 ns wide. Because the track length is only a couple of cm, almost the full nominal width is available at the receiving end. Since the full data eye is measured in the scan, the choice of one out of two clock phases is sufficient to sample the data far away from the edges.
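The choice between the two samples can be illustrated with a short Python sketch. It is not the calibration software itself: it assumes, for illustration only, that the scan yields the position of the data transition in ns relative to the nominal rising clock edge, and that the clock has a 50% duty cycle so that the falling edge sits half a tick later. The function names are hypothetical.

```python
BUNCH_TICK_NS = 25.0  # nominal width of the data eye


def circular_distance(a_ns, b_ns, period_ns=BUNCH_TICK_NS):
    """Shortest distance between two points in time on a periodic clock."""
    d = abs(a_ns - b_ns) % period_ns
    return min(d, period_ns - d)


def choose_edge(data_transition_ns):
    """Pick the sampling edge that lies further from the data transition.

    data_transition_ns: measured position of the data edge relative to the
    nominal rising clock edge (assumed to come from the delay scan).
    """
    rising_at = 0.0
    falling_at = BUNCH_TICK_NS / 2  # assumption: 50% duty cycle
    d_rise = circular_distance(rising_at, data_transition_ns)
    d_fall = circular_distance(falling_at, data_transition_ns)
    return "rising" if d_rise >= d_fall else "falling"
```

With a transition measured 1 ns after the rising edge, the sketch selects the falling edge, which then samples 11.5 ns away from the transition.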

The delay scan scheme was introduced with JEM1.0, though software development is under way only now. It requires the PPr to transmit a linear ramp from its playback memories. On the JEMs, Clock40Des1 is stepped in units of 1.04 ns. After each step the playback/spy pointers are re-adjusted with the help of a TTC short broadcast (NB: this scheme requires multi-step runs to generate the broadcast signal). The incoming LVDS data are then captured in the playback memories. It is sufficient to read out a single tick's worth of data, for the rising clock edge and a delay of zero bunch ticks. Clock40Des1 is then set back to its nominal value and the timing calibration software calculates the optimum sampling phase for each of the channels. The latency correction is calculated so as to align the channels in terms of full bunch ticks. Phase and delay values are written to the database.
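The full-tick latency correction can be sketched in Python as follows. The sketch assumes, hypothetically, that the ramp increments by one per bunch tick, that all channels are fed the same ramp in step, and that the skew between channels is well below half the ramp length. The ramp length of 256 and the function name are placeholders, not documented values.

```python
def tick_delays(captured, ramp_len=256):
    """Return the full-tick delay to apply to each channel.

    captured: ramp value seen in each channel at the same spy-memory
    address. A larger value means the channel's data arrived earlier,
    so that channel needs more delay to line up with the latest one.
    ramp_len: assumed wrap-around length of the ramp (placeholder).
    """
    ref = captured[0]
    half = ramp_len // 2
    # Signed offset of each channel relative to channel 0, in ticks,
    # folded into [-half, half) so that ramp wrap-around is tolerated.
    rel = [((v - ref + half) % ramp_len) - half for v in captured]
    latest = min(rel)  # the latest-arriving channel gets zero delay
    return [r - latest for r in rel]
```

For captured values 10, 11, 10 and 12 the sketch returns delays of 0, 1, 0 and 2 ticks, so that all channels are aligned to the latest-arriving data.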

FIO lines are timed in similarly, though the sampling phase cannot be adjusted on a per-channel basis. A delay scan is performed on Clock40Des2 with a step size of 104 ps. The input processors transmit linear ramps synchronous to Clock40Des1, and the jet processor latches the data on Clock40Des2 and checks for pattern errors. The timing calibration software reads the error counters and optimizes both Clock40Des1 and Clock40Des2 so as to maximize the width of the data eye.
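One ingredient of such an optimization, locating the data eye in a single scan, can be sketched in Python. The sketch assumes that the scan covers one full clock period, so that the list of error counts is treated as circular, and that one error count is recorded per delay step. It does not attempt the joint optimization of Clock40Des1 and Clock40Des2, and the function names are hypothetical.

```python
def widest_error_free_window(errors):
    """Return (start, length) of the longest circular run of zero counts.

    errors: pattern-error count per delay step, one full period (assumed).
    """
    n = len(errors)
    if n == 0 or all(e > 0 for e in errors):
        raise ValueError("no error-free step in scan")
    if all(e == 0 for e in errors):
        return 0, n
    # Start just after an errored step so that a run wrapping around the
    # end of the list is counted as a single window.
    first_bad = next(i for i, e in enumerate(errors) if e > 0)
    best_start, best_len = 0, 0
    run_start, run_len = 0, 0
    for k in range(1, n + 1):
        i = (first_bad + k) % n
        if errors[i] == 0:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0
    return best_start, best_len


def eye_centre(errors):
    """Delay step at the centre of the widest error-free window."""
    start, length = widest_error_free_window(errors)
    return (start + length // 2) % len(errors)
```

Sampling at the centre of the widest window keeps the latch point as far as possible from both edges of the eye.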

While the sum processor runs off Clock40Des1 only, the jet processor runs its real-time data path off Clock40Des2, so as to avoid an increase in latency due to re-registering the data. Therefore data arriving at the mergers suffer from an additional slot-to-slot skew on Clock40Des1 and Clock40Des2 introduced by the FIO phase adjustment mechanism.

CPM

The CPM latches the LVDS input data on one of four programmable phases. The sampling phase is determined in firmware. The FIO timing is measured with delay scans, and programmable delays or firmware delays are used to control the sampling phases.

CMM

The CMM latches the incoming data on one of four programmable phases. Parity errors are expected on exactly one of the four phases. The optimum sampling phase is the one offset by 180 degrees from the errored one. There is no programmable full-tick delay available for latency correction. Therefore, setting all channels to their optimum clock phase might move early data into a different bunch tick than late data, if the incoming data are spread around the 0 degree phase. One might counter this by moving to one of the two neighbouring phases, which are likely to be error free as well. However, those sampling points are closer to the edge and there is a risk of low-level errors. Also, with a large slot-to-slot skew, as expected on the JEP, this method will fail. This issue can be avoided only by adjusting the TTCrx clock phase on the CMM. The JEP phases are not available for this global phase correction if the latency-optimized timing scheme is employed (see below).
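The 180 degree rule can be written down as a minimal Python sketch. It assumes that the four phases are indexed 0 to 3 at a spacing of 90 degrees and that one parity-error count is available per phase. The function name and data layout are illustrative only.

```python
def optimum_phase(parity_errors):
    """Return the phase index opposite to the single errored phase.

    parity_errors: four error counts, index = phase number
    (assumed to correspond to 0, 90, 180 and 270 degrees).
    """
    if len(parity_errors) != 4:
        raise ValueError("expected one error count per phase (4 phases)")
    errored = [i for i, count in enumerate(parity_errors) if count > 0]
    if len(errored) != 1:
        raise ValueError("expected parity errors on exactly one phase")
    # Two steps of 90 degrees give the 180 degree offset.
    return (errored[0] + 2) % 4
```

If errors are seen only on phase 1, the sketch selects phase 3. It deliberately refuses to guess when errors appear on none or on several phases, since the rule above does not cover those cases.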

Latency

On the data path from the JEMs/CPMs through the CMMs there are several stages of re-synchronisation of data. In a first step, data are latched on a clock phase that captures them error free. The phase is chosen on a per-channel basis. In a second step, all data are latched on a common clock before they are processed. Latency can be minimized if, by an adequate choice of the TTCrx deskew values, the latest incoming data are allowed to be latched on the global clock right away. This procedure requires a full delay scan to determine the absolute phase for all channels.
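The idea of placing the global clock just behind the latest data can be sketched in Python. The sketch assumes that the delay scan has produced an absolute arrival time in ns for every channel, that a safety margin is wanted between the latest arrival and the latching edge, and that the 25 ns period is divided into 240 deskew steps of roughly 104 ps. The margin, the step count and the function name are assumptions for illustration, not values taken from the modules.

```python
BUNCH_TICK_NS = 25.0
DESKEW_STEPS = 240  # assumption: 25 ns / ~104 ps per step


def deskew_setting(arrival_ns, margin_ns):
    """Deskew step that latches the latest data as early as possible.

    arrival_ns: absolute arrival time per channel from the delay scan.
    margin_ns: hypothetical safety margin after the latest arrival.
    """
    latest = max(arrival_ns)
    # Phase of the latching edge within the bunch tick.
    phase_ns = (latest + margin_ns) % BUNCH_TICK_NS
    return round(phase_ns * DESKEW_STEPS / BUNCH_TICK_NS) % DESKEW_STEPS
```

For arrival times of 3.0, 7.5 and 5.0 ns and a margin of 2.5 ns, the latching edge lands at 10 ns, which corresponds to step 96 under the assumptions above.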