The ATLAS Level-1 Topological Processor: Phase-I upgrade and Phase-II adaptation

Outline:

- Introduction to the ATLAS hardware trigger system
- New Topo for Run 3
- Challenges for Run 4

Emanuel Meuser on behalf of the ATLAS TDAQ Collaboration

TIPP2023 | Cape Town | 04.09.2023 - 08.09.2023





# **Overview - The ATLAS detector**






The hardware trigger works on information of reduced granularity from the calorimeters and the muon spectrometer at 40 MHz

> The inner detector cannot be read out in full at the LHC bunch-crossing (BX) frequency of 40 MHz!



# ATLAS Level-1 Trigger System - Run 1 (2011 - 2012)





# **Concept of a topological trigger**

Introduce additional topological criteria:

- Δη
- Δφ
- ΔR²
- Invariant mass
- Hardness $H_{T}$ ($\sum p_{T}^{\mathrm{jet}}$)

Reduction of rate without change of thresholds => no bias!
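As a rough illustration, the topological quantities listed above can be computed from trigger objects as in the following sketch. The `Tob` class and its field names are hypothetical, objects are treated as massless, and floating-point arithmetic is used for readability; it is not a description of the firmware implementation.

```python
import math
from dataclasses import dataclass


@dataclass
class Tob:
    """Hypothetical trigger object: transverse energy (GeV), eta, phi (rad)."""
    et: float
    eta: float
    phi: float


def delta_eta(a, b):
    return abs(a.eta - b.eta)


def delta_phi(a, b):
    # Wrap the azimuthal difference into the range [0, pi].
    d = abs(a.phi - b.phi) % (2 * math.pi)
    return 2 * math.pi - d if d > math.pi else d


def delta_r_sq(a, b):
    return delta_eta(a, b) ** 2 + delta_phi(a, b) ** 2


def inv_mass_sq(a, b):
    # Massless approximation: m^2 = 2 * ET1 * ET2 * (cosh(d_eta) - cos(d_phi)).
    return 2 * a.et * b.et * (math.cosh(a.eta - b.eta) - math.cos(a.phi - b.phi))


def h_t(jets):
    # Scalar sum of the jet transverse energies.
    return sum(j.et for j in jets)


mu1 = Tob(et=10.0, eta=0.5, phi=0.0)
mu2 = Tob(et=10.0, eta=0.5, phi=math.pi)
print(delta_r_sq(mu1, mu2), math.sqrt(inv_mass_sq(mu1, mu2)))
```

A topological trigger item would then cut on one or more of these values, for example an invariant-mass window on a muon pair, instead of raising the per-object thresholds.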






# ATLAS Level-1 Trigger System - Run 2 (2015 - 2018)





# ATLAS Level-1 Trigger System - Run 3 (2022 - 2025)





# **Run 3 L1Topo - Hardware Overview**

- 3 dual-width, **custom-designed** ATCA boards
- 2 processor FPGAs (VU9P) per board
- 12 Minipod opto/electrical transceivers per FPGA
  - 10 receivers and 2 transmitters per FPGA
  - running at 11.2 Gbps per channel
  - system's total receiving bandwidth of <u>8 Tbit/s</u>
- Zynq SOM as module controller
  - FPGA + ARM
  - Provides control signals to FPGAs
  - Programs processor FPGAs on power up
- Last board installed on 28 January 2022
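The quoted total receiving bandwidth can be cross-checked from the numbers in the list above. The sketch below assumes 12 optical channels per Minipod receiver, a figure that is not stated on the slide, and counts the raw line rate without any encoding overhead.

```python
BOARDS = 3
FPGAS_PER_BOARD = 2
RECEIVERS_PER_FPGA = 10
CHANNELS_PER_MINIPOD = 12  # assumption: not given on the slide
LINE_RATE_GBPS = 11.2


def total_rx_bandwidth_tbps():
    """Raw receiving bandwidth of the whole system in Tbit/s."""
    channels = BOARDS * FPGAS_PER_BOARD * RECEIVERS_PER_FPGA * CHANNELS_PER_MINIPOD
    return channels * LINE_RATE_GBPS / 1000.0


print(f"{total_rx_bandwidth_tbps():.2f} Tbit/s")
```

Under that assumption, 720 input channels at 11.2 Gbps give roughly 8 Tbit/s, consistent with the value on the slide.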





Run 3 (Ph1) L1Topo board



### Run 3 L1Topo - LHC BX synchronous firmware (25 ns = 1 tick)



# Run 3 L1Topo - algorithmic firmware overview

Run 3 L1Topo algorithmic firmware:

Two-stage approach:

1. Sort/Select stage:
   - 2 × 25 ns latency budget
   - reduces the number of TOBs within 2 ticks
2. Topological algorithms:
   - 25 ns latency budget
   - calculations performed in a single tick
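The two-stage structure can be mimicked in software. In this sketch, `select` keeps TOBs above a threshold and `sort` keeps the leading ones by transverse energy. Both function names, the `(et, eta, phi)` tuple layout and the cut values are illustrative assumptions rather than the firmware's interface; the latency constants restate the tick budgets from the list above.

```python
TICK_NS = 25            # one LHC bunch crossing
SORT_SELECT_TICKS = 2   # latency budget of stage 1
ALGO_TICKS = 1          # latency budget of stage 2


def select(tobs, et_min, max_out):
    """Keep at most max_out TOBs with ET above et_min, in input order."""
    return [t for t in tobs if t[0] > et_min][:max_out]


def sort(tobs, max_out):
    """Keep the max_out TOBs with the highest ET, leading one first."""
    return sorted(tobs, key=lambda t: t[0], reverse=True)[:max_out]


def algorithmic_latency_ns():
    """Total latency of both stages in nanoseconds."""
    return (SORT_SELECT_TICKS + ALGO_TICKS) * TICK_NS


# Hypothetical input list of (et, eta, phi) tuples.
tobs = [(12, 0.1, 0.2), (45, -1.0, 2.0), (7, 0.3, -0.5), (30, 1.2, 1.0)]
print(sort(tobs, 2))
print(select(tobs, 10, 2))
print(algorithmic_latency_ns())
```

Reducing the list first is what keeps the number of object combinations in the second stage small enough to evaluate within a single tick.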








# Run 3 L1Topo - example topological algorithm



- Number of TOBs reduced in the Sort/Select stage
- Implementation of the example algorithm uses 31'732 LUTs
- Algorithmic firmware occupies 2.5M LUTs across 6 FPGAs (VU9P)



## **Run 3 L1Topo - First performance results from B-physics**



Run 3 L1Topo chains provide ~70 % of the unique rate for J/ψ and Υ candidates!



# LHC schedule - Towards Higher Luminosities



The High-Luminosity LHC (HL-LHC) brings challenges for the trigger:

- Luminosity of up to  $7.5 \cdot 10^{34}$  cm<sup>-2</sup>s<sup>-1</sup>
- Pileup of up to 200 (60 in Run 3)

#### => Adapt Trigger System for High-Lumi



# ATLAS Level-0 Trigger System - Run 4 and beyond



Changes for first level trigger for Run 4:

- Overall latency increases from 2.5 µs to 10 µs
- Full cell-level granularity of the whole detector combined on a single FPGA of L0Global
  - TOBs from L0Calo and L0Muon
  - Runs its own e, j, tau, XE, TE algorithms to improve TOB efficiencies
  - Absorbs the Topological Trigger
- => Combines one event onto a single FPGA at full granularity using <u>time multiplexing</u>



# **L0Global - Time Multiplexing**








48 Event Processors => 48 \* 25 ns = 1.2 µs until next event on same Event Processor
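The timing above can be reproduced with a small scheduling sketch. It assumes a plain round-robin assignment of bunch crossings to Event Processors, which the slide does not spell out; the function names are illustrative.

```python
BX_PERIOD_NS = 25
N_EVENT_PROCESSORS = 48


def processor_for(bx):
    """Round-robin assignment (assumed): BX number -> Event Processor index."""
    return bx % N_EVENT_PROCESSORS


def revisit_interval_ns():
    """Time between two consecutive events on the same Event Processor."""
    return N_EVENT_PROCESSORS * BX_PERIOD_NS


def schedule(n_bx):
    """Map each Event Processor to the arrival times (ns) of its events."""
    arrivals = {p: [] for p in range(N_EVENT_PROCESSORS)}
    for bx in range(n_bx):
        arrivals[processor_for(bx)].append(bx * BX_PERIOD_NS)
    return arrivals


print(revisit_interval_ns())   # 1200 ns = 1.2 µs
print(schedule(100)[0])        # arrival times on Event Processor 0
```

Each Event Processor therefore has 1.2 µs per event instead of 25 ns, which is the time that serialized algorithms can spend.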



# L0Global - time multiplexed topological firmware



#### L0Global ≠ L1Topo:

- Variety of algorithms running on L0Global
- Data moves serially through L0Global
- Long time until next event
- Tight resource budget: 100k LUTs allocated for the topological part (3.3M LUTs on VP1802)
- => <u>Topological algorithms need to adapt</u>



# **L0Global - Minimization of Resources**



Resource minimization through serialization:

- trade resources for time
- process one combination per clock tick
- requires additional logic to provide all combinations

#### For this example algorithm serialization reduces resource costs from 31'732 to 636 LUTs

- 95 % already serialized - fits into the 100k LUT budget



Serialized example (figure blocks: Seq., Or): 60 combinations are processed sequentially in 60 sub-ticks (60 × 3.125 ns)

=> implementation uses 636 LUTs
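The trade-off can be sketched in software. A parallel version evaluates every object pair at once, which in hardware costs one comparator per pair; a serialized version walks through the pairs one per sub-tick, reuses a single comparator and ORs the results into one accept bit. The 6 × 10 input multiplicities below are chosen only so that the pair count matches the 60 combinations quoted above, and the pass condition is hypothetical.

```python
from itertools import product

SUB_TICK_NS = 3.125


def parallel_decision(list_a, list_b, passes):
    """All combinations evaluated at once (one comparator per pair in hardware)."""
    return any(passes(a, b) for a, b in product(list_a, list_b))


def serial_decision(list_a, list_b, passes):
    """One combination per sub-tick, results ORed into a single accept bit."""
    accept = False
    sub_ticks = 0
    for a, b in product(list_a, list_b):
        accept = passes(a, b) or accept   # single shared comparator
        sub_ticks += 1
    return accept, sub_ticks, sub_ticks * SUB_TICK_NS


# Hypothetical inputs: integer eta bins of 6 and 10 objects, cut on |d_eta|.
list_a = list(range(6))
list_b = list(range(10))
cut = lambda a, b: abs(a - b) >= 9

print(parallel_decision(list_a, list_b, cut))
print(serial_decision(list_a, list_b, cut))
```

Both versions reach the same decision; the serialized one takes 60 sub-ticks instead of one tick. In the slide's example this kind of reuse reduces the LUT count by roughly a factor of 50 (31'732 versus 636).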

## Conclusion

- Low-$p_{T}$ physics data taking benefits immensely from the Topological Trigger
- Run 3 Topo system was installed and is nearly fully commissioned
- First physics results from Run 3 show excellent performance!
- Run 4: Completely different boundary conditions for topological firmware
  - Serialized topological algorithms fit into 100k LUT budget

