Classification of Transient Event Candidates in LSST Without Human Labels

From data pipeline processing to model evaluation

Bruno Sanchez1, Raphael Bonnet-Guerrini1,2, Dominique Fouchez1, Benjamin Racine1
22/05/2025


1Centre de Physique des Particules de Marseille; 2Computer Science Department, University of Milan

Context of the presentation.

Context:

  • Our main interest is to do cosmology with Type Ia Supernovae.
  • We discover them using difference imaging (DIA) pipeline.
  • DIA techniques are highly contaminated by bogus detections — noise or artifacts — that can result from imperfect image subtraction, cosmic rays, bad pixels, or atmospheric effects.


  • Today's presentation: investigating the potential of an ML-based transient/bogus classifier trained on fake injections, and addressing the noisy-label problem this approach introduces.

    Intuitions behind this project

    Key Intuitions:

  • In real data: High rate of bogus, very low rate of transients.
  • Possibility to simulate transients using fake supernova source injections.
  • Assuming real data are nearly all bogus and injections are all transients, we have a (noisy) labeled dataset!


  • ⇨ Possible Machine Learning-Based Classification Task

    We train the model to classify between injections and real data, and in reality, it classifies between bogus and transient.

    False Positive Predictions Are the Potential Real Transients


    After training, real transients will be classified as transients, together with the injected data.
    They appear as false positives only because we assumed all detections from real data to be bogus.
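This candidate-selection logic can be sketched as follows (the threshold value and names are illustrative, not the pipeline's):

```python
import numpy as np

def transient_candidates(p_real: np.ndarray, threshold: float = 0.9) -> np.ndarray:
    """Real-data detections that the classifier scores like injections are the
    transient candidates: 'false positives' under the all-real-is-bogus labels."""
    return np.flatnonzero(p_real > threshold)

# toy classifier outputs for five real-data detections
p_real = np.array([0.05, 0.20, 0.95, 0.50, 0.99])
candidates = transient_candidates(p_real)
```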

    Galaxy-Based Injection: Catalog creation and pipeline run


  • We generated galaxy-hosted transient source injections on Hyper Suprime-Cam (HSC) data.

  • Developed catalog creation with physically motivated injections rather than random ones.

  • Using the LSST Pipeline Stack Gen3 to ingest the injection catalog and process the dataset.
  • Galaxy sample selection

  • Selecting galaxy-type sources using the 'extendedness' criterion from the pipeline source catalog.

  • Retrieving the shape (semi-major, semi-minor axes) and magnitude to create a database of galaxies with their properties.

  • Figure: extendedness measurement vs. the extendedness criterion (galaxies circled in red).
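A minimal sketch of this selection, assuming a pandas table of sources; the column names here are placeholders, not the exact pipeline schema:

```python
import pandas as pd

def select_galaxies(sources: pd.DataFrame, ext_threshold: float = 0.5) -> pd.DataFrame:
    """Keep extended (galaxy-like) sources and record the shape and magnitude
    columns needed to build the host-galaxy database."""
    galaxies = sources[sources["extendedness"] > ext_threshold]
    return galaxies[["ra", "dec", "semi_major", "semi_minor", "mag"]]

# toy catalog: two point sources (extendedness 0) and one galaxy (extendedness 1)
cat = pd.DataFrame({
    "ra": [10.0, 10.1, 10.2], "dec": [-5.0, -5.1, -5.2],
    "semi_major": [0.1, 0.1, 1.8], "semi_minor": [0.1, 0.1, 1.2],
    "mag": [21.0, 22.5, 20.3], "extendedness": [0.0, 0.0, 1.0],
})
hosts = select_galaxies(cat)
```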

    Magnitude and Positions of the Injections

  • Magnitude and distance to the host are sampled from the host galaxy properties. We are injecting in ~3% of the galaxies: $$ d_{\text{inj}} \sim \mathcal{N}(0, \text{SemiMajor}_{\text{host}}) $$ $$ m_{\text{inj}} \sim \text{Uniform}(m_{\text{host}} - 1, m_{\text{host}} + 3)$$

  • From the host's reference frame, the positions are converted to x, y, and RA/DEC.

  • We then add 5% of host-less injections.
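The sampling recipe above can be sketched as follows (the position-angle draw and function name are illustrative additions):

```python
import numpy as np

rng = np.random.default_rng(42)

def sample_injection(host_mag: float, host_semi_major: float):
    """Sample one injection following the catalog recipe:
    offset ~ N(0, SemiMajor_host), magnitude ~ U(m_host - 1, m_host + 3)."""
    d_inj = rng.normal(0.0, host_semi_major)            # radial offset from host centre
    m_inj = rng.uniform(host_mag - 1.0, host_mag + 3.0)
    theta = rng.uniform(0.0, 2.0 * np.pi)               # random position angle
    return d_inj * np.cos(theta), d_inj * np.sin(theta), m_inj

dx, dy, m = sample_injection(host_mag=20.3, host_semi_major=1.8)
```

The (dx, dy) offsets are then converted to x, y and RA/Dec in the host's reference frame, as described above.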
  • Catalog Creation, Butler Ingestion and pipeline processing:


  • The injection catalogs are created for each specific visit. They are uncorrelated across visits, so no light curve is simulated.

  • Ingestion is done band by band.
    ingest_injection_catalog \
        -b $BUTLER_REPO \
        -i $CATALOG_REPO/g_band_catalog.csv g \
        -o u/rbonnetguerrini/inject_input_g
    Adding an inject_visit task to the pipeline at step 3, we build a DIA object table.

    Data Presentation

    The HSC RC2 subset is composed of 6 detectors, with 8 visits per filter.
    Running on the full UDEEP COSMOS field is our next target.


    Producing the Cutouts:

  • Cutout coordinates are extracted from the DIA source tables and produced from the Calexps.
  • Final format: 30×30 pixels, normalized grayscale.
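A minimal cutout routine under these assumptions (plain NumPy with min-max normalization; the real pipeline extracts stamps from calexp objects):

```python
import numpy as np

def make_cutout(image: np.ndarray, x: int, y: int, size: int = 30) -> np.ndarray:
    """Extract a size x size stamp centred on (x, y) and min-max normalise it."""
    half = size // 2
    stamp = image[y - half : y - half + size, x - half : x - half + size].astype(float)
    lo, hi = stamp.min(), stamp.max()
    return (stamp - lo) / (hi - lo) if hi > lo else np.zeros_like(stamp)

# toy image standing in for a difference image
img = np.arange(100 * 100, dtype=float).reshape(100, 100)
cut = make_cutout(img, x=50, y=50)
```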


  • Classes and Labels:

    Inferring the injection class from the cross-matched catalog.
  • Injection: Our simulated sources.
  • Real: any other detection.

  • Model Architecture

    A standard Convolutional Neural Network architecture:
  • Two 2D-convolutional layers with ReLU activation and max-pooling.
  • Two fully connected (FC) layers for high-level abstractions and the classification output.
  • Dropout layers included to avoid overfitting.
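A PyTorch sketch of such an architecture; filter counts, dropout rate, and layer sizes are assumptions, not the exact model:

```python
import torch
import torch.nn as nn

class RealBogusCNN(nn.Module):
    """Two conv+ReLU+maxpool blocks, dropout, and two FC layers ending
    in a single transient/bogus logit, for 30x30 grayscale cutouts."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 30 -> 15
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 15 -> 7
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Dropout(0.5),                 # dropout to avoid overfitting
            nn.Linear(32 * 7 * 7, 64), nn.ReLU(),
            nn.Linear(64, 1),                # sigmoid applied in the loss / at inference
        )

    def forward(self, x):
        return self.classifier(self.features(x))

logits = RealBogusCNN()(torch.zeros(4, 1, 30, 30))  # batch of 4 cutouts
```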
  • Confusion Matrix

    Output Classes and Their Interpretation



    Focus:

    • Minimize False Negatives (FN) and maximize Sensitivity.

    • Monitor False Positives (FP) and Precision.

    Confusion Matrix

    Trained on all different available visits:


    Focus:

    • Minimize False Negatives (FN) and maximize Sensitivity.
    • Monitor False Positives (FP) and Precision.

    Analysis:

    • Need more insight on how the network is predicting these classes.

    Confusion Matrix

    Trained on all different available visits, evaluating only high SNR:


    $$\text{SNR} \notin [0, 8]$$

    Focus:

    • Minimize False Negatives (FN) and maximize Sensitivity.
    • Monitor False Positives (FP) and Precision.

    Analysis:

    • Reduction of False Negatives (FN).
    • Stable False Positive (FP) detections.

    Injected and Real Data Output Probability Comparison

    All Data Probability vs Injection Probability Prediction

    Using standard metrics and evaluation tools does not work in our situation. There is a need for a deeper understanding of the data.

    Injected and Real Data Output Probability Comparison

    Trained on all different available visits:



    We are looking at the output probability comparison between the full dataset and the injection.

  • Injections should be predicted around 1.

  • The additional data predicted around 1 are our potential transients.

  • We wish to see a clear split with fewer 'in-between' predictions.

  • We also want to reduce or explain the injections predicted around 0.
  • Injected and Real Data Output Probability Comparison

    Trained on all different available visits, evaluating only high SNR:

    $$\text{SNR} \notin [0, 8]$$ We are looking at the output probability comparison between the full dataset and the injection.

  • Injections should be predicted around 1.

  • The additional data predicted around 1 are our potential transients.

  • Reduction of the 'in-between' predictions.

  • By removing low SNR, we target the uncertain classifications of the network.
  • Using UMAP for NN Latent Space visualization.

    UMAP Overview:

    • Preserves both local and global structure using non-linear dimensionality reduction.
    • Helps visualize high-dimensional data in 2D.
    • Can be used to build a nearest-neighbors model.

    UMAP in Our Case:

    • Applied to the output layer of the network.
    • Provides a visual tool for better understanding network classifications.

    UMAP with Data Class Predictions

    Trained on all available visits:

    UMAP with Data Class Predictions

    Trained on all available visits, evaluated only for high SNR:

    UMAP with Data Class Predictions and SNR

    Trained on all available visits:

    UMAP with Data Class Predictions and SNR

    Trained on all available visits, evaluated only for high SNR:

    What Are the Potential Improvements for the Model?

    Co-Teaching: A Sample-Selection Strategy for Weakly Supervised Learning

    Two models are trained simultaneously with different views on the same dataset.


    In each batch, each model selects the samples with the smallest losses (its most confident predictions) and passes them to its peer for the weight update.

    Avoid training on the wrong labels.
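The small-loss selection step can be sketched as follows (the remember rate and loss values are illustrative):

```python
import numpy as np

def small_loss_indices(losses: np.ndarray, remember_rate: float) -> np.ndarray:
    """Co-teaching selection: keep the fraction of the batch with the smallest
    per-sample loss; each network hands its selection to its peer."""
    n_keep = int(remember_rate * len(losses))
    return np.argsort(losses)[:n_keep]

# per-sample losses from model A; model B is updated on the selected indices
losses_a = np.array([0.1, 2.3, 0.05, 1.7, 0.4, 0.2])
keep_for_b = small_loss_indices(losses_a, remember_rate=0.5)
```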

    Pros:
    • Effective for noisy datasets.

    Cons:
    • Increases computational cost.
    • Assumes symmetrical noise.

    Asymmetrical Co-Teaching

    Key Changes:

    • Implements different remembering rates for each class.
    • Better fits the needs of our asymmetrically weakly supervised dataset.
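Per-class remember rates can be sketched as below; the class encoding and rate values are illustrative (e.g. a lower rate for the noisier real/bogus class):

```python
import numpy as np

def asymmetric_small_loss(losses: np.ndarray, labels: np.ndarray, rates: dict) -> np.ndarray:
    """Asymmetric co-teaching selection: class c keeps only the rates[c]
    fraction of its samples with the smallest per-sample loss."""
    keep = []
    for c, rate in rates.items():
        idx = np.flatnonzero(labels == c)
        n_keep = int(rate * len(idx))
        keep.extend(idx[np.argsort(losses[idx])[:n_keep]])
    return np.sort(np.array(keep))

losses = np.array([0.1, 0.9, 0.2, 0.8, 0.05, 0.7])
labels = np.array([0, 0, 0, 1, 1, 1])  # 0 = real (noisier labels), 1 = injection
kept = asymmetric_small_loss(losses, labels, rates={0: 0.34, 1: 0.67})
```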

    Confusion Matrix

    Trained on all available visits:


    Focus:

    • Minimize false negatives and improve sensitivity.
    • Monitor false positives and precision.

    Analysis:

    • More insight needed into how the network predicts these classes.

    Confusion Matrix

    Trained on all available visits using the co-teaching method:


    Focus:

    • Minimize false negatives and improve sensitivity.
    • Monitor false positives and precision.

    Analysis:

    • Further reduction in false negatives.
    • Maintained constant false positives.

    Injected and Real Data Output Probability Comparison

    Trained on all different available visits:

    We are looking at the output probability comparison between the full dataset and the injection.

  • Injections should be predicted around 1.

  • The additional data predicted around 1 are our potential transients.

  • We wish to see a clear split with fewer 'in-between' predictions.

  • We also want to reduce or explain the injections predicted around 0.
  • Injected and Real Data Output Probability Comparison

    Trained on all different available visits, using the co-teaching method:

    We are looking at the output probability comparison between the full dataset and the injection.

  • Injections should be predicted around 1.

  • The additional data predicted around 1 are our potential transients.

  • Clear split with nearly no 'in-between' predictions.

  • We also want to reduce or explain the injections predicted around 0.
  • UMAP with Data Class Predictions

    Trained on all different available visits:

    UMAP with Data Class Predictions

    Trained on all different available visits, using the co-teaching method:

    Light Curve Confirmation

    Using the DIA Object table, we are able to build the light curves (LC):
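A minimal sketch of building one light curve from a DIA source table; the column names loosely follow the DIA schema and are assumptions here:

```python
import pandas as pd

def build_light_curve(dia_sources: pd.DataFrame, object_id: int) -> pd.DataFrame:
    """Collect the DIA sources associated with one DIA object and order them
    in time, giving a (mjd, band, flux) light curve."""
    lc = dia_sources[dia_sources["diaObjectId"] == object_id]
    return lc.sort_values("midpointMjdTai")[["midpointMjdTai", "band", "psfFlux"]]

# toy DIA source table: three detections of object 1, one of object 2
srcs = pd.DataFrame({
    "diaObjectId": [1, 2, 1, 1],
    "midpointMjdTai": [60010.2, 60010.2, 60003.1, 60017.4],
    "band": ["g", "r", "g", "g"],
    "psfFlux": [1200.0, 300.0, 800.0, 950.0],
})
lc = build_light_curve(srcs, object_id=1)
```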

    Light Curve Confirmation

    We also developed a tool to inspect these LCs on the UMAP projection.
    This work has received funding from the European Union’s Horizon 2020 research and innovation programme under a Marie Skłodowska-Curie grant agreement.

    Conclusions and future work


  • We implemented a Real/Bogus solution with a conventional CNN
  • We attempted to explain the results using UMAP projection of last layer space
  • We find that the network is separating the injections but we have noisy labels
  • Asymmetrical co-teaching training schema improves the Real/Bogus solution
  • The UMAP projection of the co-teaching model separates the samples even further.
  • We found a way to identify true astrophysical transients among the false positives, as a lower bound on the FP rate.
  • Reminder: we did not use any human-tagged data!
  • Thank you !

    Contact : bsanchez@cppm.in2p3.fr

    Contact : raphael.bonnet-guerrini@unimi.it






