Description
In searches for new physics at the LHC, machine learning classifiers can be used to extract signal events from background processes. Semi-supervised machine learning classifiers can be used to extract unlabeled signals from labeled background events, reducing biases caused by preconceived assumptions about the signal. When training machine learning classifiers, overfitting can cause background events to be misclassified as signals. The rate of these fake signals, i.e. the error introduced by over-training, must therefore be quantified before the classifier can be used to discover new physics.
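As a rough illustration of what quantifying this fake-signal rate involves (the event counts and the simple Gaussian approximation below are illustrative assumptions, not the analysis used in the study):

```python
import numpy as np

def naive_local_significance(n_obs, n_exp):
    """Naive Gaussian estimate of how significant a fake-signal excess is,
    given n_obs background events in the signal-like region of the classifier
    response and n_exp expected from the background model alone.
    A real analysis would use a likelihood-based significance instead."""
    return (n_obs - n_exp) / np.sqrt(n_exp)

# Hypothetical numbers: 10,200 background events pass the signal-like
# selection where 10,000 are expected, i.e. a 2-sigma fake-signal excess.
print(naive_local_significance(10_200, 10_000))
```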
This study presents the methodology and results of a frequentist approach to quantifying the fake signals produced by over-training of a semi-supervised deep neural network (DNN) classifier. To this end, Zγ final-state background data is used to evaluate an optimized semi-supervised DNN classifier. The frequentist approach accounts for the probability of observing local excesses elsewhere within the mass range (the look-elsewhere effect). This is achieved by repeating a pseudo-experiment a sufficient number of times to fully characterize the significance of such excesses. Each pseudo-experiment consists of a distinct Zγ dataset, generated using a Wasserstein generative adversarial network (WGAN), used to train the DNN, and a fixed-mass background-rejection scan to expose fake signals in the DNN response output.
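A minimal sketch of the frequentist pseudo-experiment loop is given below, assuming toy stand-ins for the WGAN-generated datasets and the DNN response (the beta-distributed scores, function names, and working points are illustrative, not the analysis code used in the study):

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy stand-in: in the study, each pseudo-experiment samples a new Zgamma
# background dataset from the WGAN, trains the semi-supervised DNN on it,
# and scans the DNN response.  Here random scores replace the trained DNN
# output so that the skeleton of the frequentist procedure stays runnable.
def generate_pseudo_dataset(n_events=50_000):
    return rng.beta(2.0, 5.0, size=n_events)

def local_significance(scores, threshold, n_exp):
    """Naive Gaussian significance of the excess above a fixed score
    threshold, relative to the background-only expectation n_exp."""
    n_obs = np.sum(scores > threshold)
    return (n_obs - n_exp) / np.sqrt(n_exp)

# Fixed working points of the background-rejection scan, with thresholds
# derived once from a large reference background sample.
rejections = np.array([0.90, 0.95, 0.99])
reference = generate_pseudo_dataset(2_000_000)
thresholds = np.quantile(reference, rejections)

def run_pseudo_experiments(n_experiments=500, n_events=50_000):
    """Repeat the pseudo-experiment and keep the largest local significance
    found in each scan: the background-only (fake-signal) distribution."""
    n_exp = (1.0 - rejections) * n_events
    return np.array([
        max(local_significance(generate_pseudo_dataset(n_events), t, e)
            for t, e in zip(thresholds, n_exp))
        for _ in range(n_experiments)
    ])

def global_p_value(max_local_z, observed_z):
    """Fraction of pseudo-experiments whose largest fake-signal excess is at
    least as significant as the observed one (look-elsewhere corrected)."""
    return float(np.mean(max_local_z >= observed_z))

z_dist = run_pseudo_experiments()
print("Global p-value for a 2-sigma local excess:", global_p_value(z_dist, 2.0))
```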