The JEP and
CP systems employ several stages of similar, but not identical, circuitry to
align clock and data, such that data are captured when valid, and such that data
of all channels are kept in the bunch tick to which they belong. To this end,
alignment of both phase (fraction of a clock tick) and delay (full clock ticks)
is required. This is generally done by selecting one of several sampling phases
(rising/falling edge or four-phase clock) individually for each channel and
re-registering the data to a common clock phase before sending them into the
algorithmic parts of the FPGA code. Re-registering the data adds to the
latency if the latest incoming data happen to be registered on a phase that
differs from the common clock phase.
The optimum
sampling phase can be determined either with the firmware counting parity
errors for the individual phases or with a delay scan at one common phase. The
firmware approach deals with single channels only; channel-to-channel alignment
must then be done in software. The delay scan is a fully software-driven
scheme. It determines both the optimum sampling phase and the skew
between channels in one go.
Phases and
delays need to be measured whenever any modules or cables have been swapped. The
results are stored in the database and loaded into the modules at the start of
a run.
The JEM
timing calibration scheme relies on delay scans determining the optimum
sampling phases for each channel. The input processors latch the incoming LVDS
parallel data at both the rising and the falling clock edge. The delay scan
results are used to determine which of the two samples is expected to be
error free. The data eyes are nominally 25ns wide. Because the tracks are only a
couple of cm long, almost the full nominal width is available at the receiving
end. Since the full data eye is measured in the scan, the choice of one out of
two clock phases is sufficient to sample the data far away from the edges.
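As an illustration only, the choice between the two clock edges could be
sketched in Python as follows. The function name and the shape of the scan
result are hypothetical. The sketch assumes that the scan covers exactly one
clock period in equal steps and that the falling edge samples half a period
after the rising edge.

```python
from typing import List


def pick_sampling_edge(ok: List[bool], nominal_step: int) -> str:
    """Return 'rising' or 'falling' for one channel.

    ok[i] is True if the rising-edge sample was error free at scan step i.
    The list is assumed to span exactly one clock period (hypothetical layout).
    """
    n = len(ok)
    bad = [i for i, good in enumerate(ok) if not good]
    if not bad:
        # No transition region was seen: keep the default edge.
        return "rising"

    def distance_to_bad(step: int) -> int:
        # Circular distance, in scan steps, to the nearest errored step.
        return min(min((step - b) % n, (b - step) % n) for b in bad)

    rising = nominal_step % n
    # Assumption: the falling edge samples half a period after the rising edge.
    falling = (nominal_step + n // 2) % n
    if distance_to_bad(rising) >= distance_to_bad(falling):
        return "rising"
    return "falling"
```

The edge that lies further from the errored scan steps is selected, which
corresponds to sampling far away from the edges of the data eye.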
The delay
scan scheme was introduced with JEM1.0, though software development has only
now got under way. It requires the PPr to transmit a
linear ramp from its playback memories. On the JEMs,
Clock40Des1 is stepped in units of 1.04ns. After each step the playback/spy
pointers are re-adjusted with the help of a TTC short broadcast (NB: this scheme
requires multi-step runs to generate the broadcast signal). The incoming LVDS
data are then captured in the playback memories. It is sufficient to read out a
single tick's worth of data, for the rising clock edge and a delay of zero bunch
ticks. Clock40Des1 is then set back to its nominal value and the timing
calibration software calculates the optimum sampling phase for each of the
channels. The latency correction is calculated so as to align the channels in
terms of full bunch ticks. Phase and delay values are written to the database.
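The full-tick latency correction could be derived from the captured ramp
values along the following lines. This is a sketch under stated assumptions,
not the calibration software itself: the function name, the channel labels
and the ramp modulus of 256 are hypothetical, the input is assumed to be
non-empty, and the channel-to-channel skew is assumed to be well below half
the ramp length.

```python
from typing import Dict


def tick_delays(ramp_values: Dict[str, int], modulus: int = 256) -> Dict[str, int]:
    """Full-tick delay per channel that lines all channels up with the latest one.

    ramp_values[ch] is the ramp value seen on channel ch in the same captured
    bunch tick. A larger value means that the data arrived earlier.
    The modulus (ramp length) is an assumed value.
    """
    # Express every channel as a signed offset from an arbitrary reference
    # channel, so that a wrap-around of the ramp does not distort the result.
    ref = next(iter(ramp_values.values()))
    half = modulus // 2
    lead = {
        ch: (value - ref + half) % modulus - half
        for ch, value in ramp_values.items()
    }
    # The latest channel has the smallest lead and gets zero extra delay.
    latest = min(lead.values())
    return {ch: offset - latest for ch, offset in lead.items()}
```

For example, ramp values of 10, 11 and 9 on three channels give delays of
1, 2 and 0 bunch ticks respectively.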
FIO lines
are timed in in a similar way, though the sampling phase cannot be adjusted on
a per-channel basis. A delay scan is performed on Clock40Des2 with a step size
of 104ps. The input processors transmit linear ramps synchronous to Clock40Des1
and the jet processor latches the data on Clock40Des2 and checks for pattern
errors. The timing calibration software reads the error counters and adjusts
both Clock40Des1 and Clock40Des2 so as to maximize the width of the data eye.
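One way to locate the data eye from the error counters of such a scan is to
look for the longest error-free run of scan steps, treating the scan as
circular over one clock period. The following Python sketch illustrates this.
The function names and the list-of-counters input are hypothetical, and the
actual optimization of both clocks is not reproduced here.

```python
from typing import List, Optional, Tuple


def widest_clean_window(errors: List[int]) -> Optional[Tuple[int, int]]:
    """Return (start, length) of the longest circular run of zero error counts.

    errors[i] is the pattern error count at scan step i. Returns None if
    every step shows errors.
    """
    n = len(errors)
    if all(e == 0 for e in errors):
        return (0, n)
    best_start, best_len = 0, 0
    run_start, run_len = 0, 0
    # Walk the list twice so that a run wrapping past the last step is found.
    for i in range(2 * n):
        if errors[i % n] == 0:
            if run_len == 0:
                run_start = i % n
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0
    return (best_start, best_len) if best_len else None


def window_centre(start: int, length: int, n_steps: int) -> int:
    """Scan step at (or just below) the centre of the clean window."""
    return (start + (length - 1) // 2) % n_steps
```

Placing the sampling point at the centre of the widest clean window keeps it
as far as possible from both edges of the eye.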
While the
sum processor runs off Clock40Des1 only, the jet processor runs its real-time
data path off Clock40Des2, so as to avoid an increase in latency due to
re-registering the data. Therefore data arriving at the mergers suffer from an
additional slot-to-slot skew on Clock40Des1 and Clock40Des2 introduced by the
FIO phase adjustment mechanism.
The
CPM latches the LVDS input data on one of four programmable phases. The
sampling phase is determined in firmware. The FIO timing is measured with delay
scans; programmable delays / firmware delays are used to control the
sampling phases.
The CMM
latches the incoming data on one of four programmable phases. Parity errors
can be expected on exactly one of the four phases.
The optimum sampling phase is the one offset by 180 degrees from
the errored one. There is no programmable full-tick
delay available for latency correction. Therefore, by setting all channels to
the optimum clock phase, one might move early data into a different bunch tick
than late data, if the incoming data are spread around the 0 degree phase. One
might counter this by moving to one of the two neighbouring phases, which are
likely to be error free as well. However, those sampling points are closer to
the edge of the data eye, so there is a risk of errors at a low rate. Also, with
a large slot-to-slot skew, as expected on the JEP, this method will fail. This
issue can be avoided only by adjusting the TTCrx clock
phase on the CMM. The JEP phases are not available for this global phase
correction if the latency-optimized timing scheme is employed (see below).
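The phase selection rule for the CMM can be written down compactly. The
sketch below is illustrative only: the function names are hypothetical, and
the four phases are assumed to be numbered 0 to 3 in steps of 90 degrees.

```python
from typing import Optional, Sequence, Tuple


def cmm_optimum_phase(parity_errors: Sequence[int]) -> Optional[int]:
    """Return the phase 180 degrees away from the single errored phase.

    parity_errors[p] is the parity error count seen on phase p (0..3).
    Returns None unless exactly one phase shows errors.
    """
    if len(parity_errors) != 4:
        raise ValueError("expected error counts for exactly four phases")
    errored = [p for p, count in enumerate(parity_errors) if count > 0]
    if len(errored) != 1:
        return None
    # Two steps of 90 degrees give the 180 degree offset.
    return (errored[0] + 2) % 4


def neighbouring_phases(optimum: int) -> Tuple[int, int]:
    """The two phases adjacent to the optimum one (closer to the data edge)."""
    return ((optimum - 1) % 4, (optimum + 1) % 4)
```

With errors on phase 1 only, the optimum phase is 3 and the neighbouring
phases are 2 and 0. Both neighbours are also adjacent to the errored phase,
which is why they carry a higher risk of errors.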
On the data
path from JEMs/CPMs through the CMMs
there are several stages of re-synchronisation of data. In a first step, data
are latched on a clock phase that captures them error free. The phase is
chosen on a per-channel basis. In a second step, all data are latched on a common
clock before they are processed. Latency can be minimized if, by an adequate
choice of the TTCrx deskew
values, the latest incoming data are allowed to be latched on the global clock
right away. This procedure requires a full delay scan to determine the absolute
phase for all channels.
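The idea of the latency-optimized scheme can be illustrated with a short
Python sketch. It assumes that the delay scan yields an absolute capture time
in ns for every channel. The function name, the channel labels and the input
format are hypothetical, and the conversion of the resulting phase into TTCrx
deskew settings is not shown.

```python
from typing import Dict, Tuple

BUNCH_TICK_NS = 25.0  # nominal clock period


def latency_optimised_phase(capture_ns: Dict[str, float]) -> Tuple[float, Dict[str, float]]:
    """Common clock phase that coincides with the latest channel's capture time.

    capture_ns[ch] is the absolute time (ns) at which channel ch is captured
    error free, as measured in a full delay scan (assumed input format).
    Returns the common clock phase within one bunch tick and, per channel,
    the time the data wait before being re-registered.
    """
    latest = max(capture_ns.values())
    # The common clock is placed on the latest capture time, so that channel
    # is latched on the global clock right away.
    common_phase = latest % BUNCH_TICK_NS
    wait = {ch: latest - t for ch, t in capture_ns.items()}
    return common_phase, wait
```

The latest channel then waits 0ns, and all earlier channels wait for the
difference between their own capture time and that of the latest channel.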