Advanced Machine Learning for Physics (PhD 2023)
Course Information and Syllabus
Contacts: Stefano Giagu (stefano.giagu [at] uniroma1.it) and Andrea Ciardiello (andrea.ciardiello [at] gmail.com)
Program:
The general objective of the course is to become familiar with advanced deep learning techniques based on differentiable neural network models with different learning paradigms, to acquire skills in modeling complex problems with these techniques, and to be able to apply them in different contexts in physics and in basic and applied scientific research.
Topics covered include: general overview of differentiable artificial neural networks and use of the pytorch library for ANN design, training and testing. Basic architectures: MLP, Convolutional neural networks, neural networks for sequence analysis (RNN, LSTM/GRU). Bayesian-NN. Attention, Self-Attention, Transformers and Vision Transformers. Models for object detection and semantic segmentation and applications. Graph Neural Networks and Geometric Deep Learning. Generative models based on VAE, GAN, autoregressive models, invertible networks, diffusion models, normalising flows, and generative GNNs. Advanced learning techniques: transfer learning, domain adaptation, adversarial learning, self-supervised and contrastive learning, model distillation. Explainable and interpretable AI. Quantum Machine Learning on near-term quantum devices.
Approximately 50% of the lectures are frontal lessons supplemented by slides, aimed at providing advanced knowledge of Deep Learning techniques. The remaining 50% is based on hands-on computational practical experiences that provide some of the application skills necessary to autonomously develop and implement advanced Deep Learning models for solving various problems in physics and scientific research in general.
Indispensable prerequisites: basic concepts in machine learning, python language programming, standard python libraries (numpy, pandas, matplotlib, torch/pytorch)
a basic python course on YT (many others available on web): https://youtu.be/_uQrJ0TkZlc
tutorial on numpy, matplotlib, pandas: https://jakevdp.github.io/PythonDataScienceHandbook/
basic concepts of ML: Introduction + Part I (sec. 5: ML basics) of the book I. Goodfellow et al.: https://www.deeplearningbook.org/
tutorials on pytorch web site: https://pytorch.org/
an introductory course on pytorch on YT (many others available on web): https://youtu.be/c36lUUr864M
Depending on the requirements of your specific PhD course, each student can decide how many lectures/hands-on sessions to attend to fulfil the required hours: 20h, 40h, or 60h (60h corresponds to the whole course).
Discussion group (google group):
Calendar:
Lectures in Aula 6 - Dep. of Physics (E. Fermi) and at this google meet link
Lectures in Aula 3 - Dep. of Physics (E. Fermi) and at this google meet link
Hands-on sessions in LabSS - Dep. of Physics (G. Marconi, 1st floor) and at this google meet link
28.2 - 17:00-18:30: course syllabus and introduction
1.3 - 16:00-18:00: recall on ANN
7.3 - 14:00-16:00: recall on ANN
8.3 - 16:00-18:00: hands-on on use of pytorch
14.3 - 14:30-16:00: learning paradigms
15.3 - 16:00-18:00: hands-on on residual learning
21.3 - 14:00-16:00: hands-on on contrastive learning
22.3 - 16:00-18:00: hands-on on contrastive learning
28.3 - 14:00-16:00: attention and transformers
29.3 - 16:00-18:00: hands-on on transformers
4.4 - 14:00-16:00: hands-on on transformers
5.4 - 16:00-18:00: segmentation and object detection models
12.4 - 16:00-18:00: hands-on on semantic segmentation
18.4 - 14:00-16:00: Graph Neural Networks
19.4 - 16:00-18:00: hands-on on object detection
26.4 - 16:00-18:00: hands-on on object detection
2.5 - 14:00-16:00: AI explainability and interpretability
3.5 - 16:00-18:00: hands-on on GNNs
9.5 - 14:00-16:00: hands-on on GNNs
10.5 - 16:00-18:00: hands-on on GNNs
16.5 - 14:00-16:00: hands-on on xAI
17.5 - 16:00-18:00: hands-on on xAI/knowledge-transfer
23.5 - 14:00-16:00: anomaly detection
24.5 - 16:00-18:00: hands-on on anomaly detection
30.5 - 14:00-16:00: generative DL
31.5 - 16:00-18:00: generative DL (online only at this google meet link)
6.6 - 14:00-16:00: hands-on on generative DL (normalizing flow models)
7.6 - 16:00-18:00: hands-on on generative DL (deep diffusion probabilistic models)
15.6 - 16:00-18:00: introduction to Quantum computation and Quantum ML and hands-on on Quantum ML
Bibliography/References and detailed topics treated during lectures, slides, notebooks, etc.
Given the highly dynamic nature of the topics covered in the course, there is no single reference text. During the course the sources will be indicated and provided from time to time in the form of scientific and technical articles and book chapters.
Some classic readings on Deep Learning based on differentiable neural networks:
DL: I. Goodfellow, Y. Bengio, A. Courville: Deep Learning, MIT Press (https://www.deeplearningbook.org/)
PB: P. Baldi, Deep Learning in Science, Cambridge University Press
GRL: W. L. Hamilton, Graph Representation Learning Book, McGill Uni press (https://www.cs.mcgill.ca/~wlh/grl_book/files/GRL_Book.pdf)
Lecture 1 - 28.2.2023 (slides, recording) h17:00-19:00
course information
google colab and aws sagemaker studio lab clouds
artificial neural networks 101: (DL ch 6 (6.1,6.2,6.3, 6.4))
artificial neuron model and XOR gate example
MLPs and implementation of a Linear layer in pytorch
activation functions for hidden and output layers
weight initialisation
universal approximation theorem for ANN
a simple example of a MLP implemented in pytorch
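As an illustration of the artificial neuron model and the XOR gate example listed above, here is a minimal sketch in plain Python (no pytorch, so it runs without any installation). The hand-picked weights and the names `step`, `neuron` and `xor_mlp` are illustrative assumptions and are not taken from the lecture slides.

```python
def step(z):
    """Heaviside step activation: 1 if z > 0, else 0."""
    return 1 if z > 0 else 0


def neuron(inputs, weights, bias):
    """Artificial neuron: weighted sum of the inputs plus a bias, then activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return step(z)


def xor_mlp(x1, x2):
    """2-2-1 MLP with hand-set (not learned) weights that computes XOR."""
    h_or = neuron((x1, x2), (1.0, 1.0), -0.5)    # fires if x1 OR x2
    h_and = neuron((x1, x2), (1.0, 1.0), -1.5)   # fires if x1 AND x2
    # Output neuron fires for "OR and not AND", which is XOR
    return neuron((h_or, h_and), (1.0, -1.0), -0.5)


truth_table = {(a, b): xor_mlp(a, b) for a in (0, 1) for b in (0, 1)}
print(truth_table)
```

A single neuron cannot represent XOR because the two classes are not linearly separable; the hidden layer maps the inputs to (OR, AND), where they are.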
Lecture 2 - 1.3.2023 (slides, recording) h16:00-18:00
ANN 101: (DL ch 6 (6.1,6.2,6.3, 6.4, 6.5), BIS ch 5 (5.1, 5.2,5.3, 5.5), DUDA ch 6 (6.1, 6.2, 6.3, 6.24, 6.5, 6.8))
training of an ANN
loss functions
SGD, momentum, learning rate, variable lr and optimizers
learning curves, bias-variance tradeoff and double descent in DNN
regularisation
dropout
early stopping
noise injection
data augmentation
weight regularisation L1, L2, L1+L2
a simple example of a MLP implemented in pytorch
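The momentum and L2 weight regularisation items above can be sketched on a one-parameter toy problem. This is a minimal illustration in plain Python under stated assumptions: the loss L(w) = (w - 3)^2, the hyperparameter values and the function name `sgd_momentum` are all hypothetical, and the gradient is exact rather than stochastic. The update rule follows the convention of `torch.optim.SGD` with zero dampening.

```python
def sgd_momentum(grad_fn, w0, lr=0.1, momentum=0.9, weight_decay=0.0, steps=300):
    """Minimise a function of one parameter with gradient descent + momentum.

    Update rule:
        g = grad(w) + weight_decay * w
        v = momentum * v + g
        w = w - lr * v
    """
    w, v = w0, 0.0
    history = [w]
    for _ in range(steps):
        # An L2 penalty (weight_decay / 2) * w^2 adds weight_decay * w to the gradient
        g = grad_fn(w) + weight_decay * w
        v = momentum * v + g
        w = w - lr * v
        history.append(w)
    return w, history


def grad(w):
    """Gradient of the toy loss L(w) = (w - 3)^2, whose minimum is at w = 3."""
    return 2.0 * (w - 3.0)


w_plain, _ = sgd_momentum(grad, w0=0.0)
w_decay, _ = sgd_momentum(grad, w0=0.0, weight_decay=0.5)
print(w_plain, w_decay)
```

With the penalty switched on, the minimiser moves from w = 3 to w = 2.4 (the zero of 2(w - 3) + 0.5 w), which shows how L2 regularisation shrinks weights towards zero.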
Lecture 3 - 7.3.2023 (slides, recording) h14:00-16:00
Convolutional-NN (DL ch 9 (9.1, 9.2, 9.3, 9.4, 9.7, 9.8, 9.9, 9.10, 9.11))
image representation and input properties of a CNN (symmetry, translation invariance, self-similarity, compositionality, locality) and learned convolutional filters (DL ch 9 (9.1,9.2, 9.4))
local receptive field
convolution and shared weights
pooling layers
CNN architectures: LeNet, AlexNet, VGG, Inception, ResNet, DenseNet, ...
Sequence analysis: task definition and problems (DL ch 10 (intro, 10.2))
Elementary RNN cell: structure and operating principle
StackedRNN, Bidirectional RNN, Encoder-Decoder RNN (seq2seq)
Back-propagation through time
Long-term correlation and gradient vanishing and exploding problems: Gated Cell and Long Short Term Memory RNN
LSTM: description of operations (note by C.Olah and for details DL ch 10 (10.10))
Hands-on 1 - 8.3.2023 (notebook, recording) h16:00-18:00
use of pytorch to design and train a ConvNet for identification of particles in a RICH detector
Hands-on 2 - 15.3.2023 (notebook, recording) h16:00-18:00
denoising CNN based on residual learning for refining simulated fast Raman spectra
Hands-on 3 - 21.3.2023 (notebook, recording) h14:00-16:00
self supervised contrastive learning SimCLR model for auroral identification and classification (part 1)
Hands-on 3bis - 22.3.2023 (notebook, recording) h16:00-18:00
self supervised contrastive learning SimCLR model for auroral identification and classification (part 2)
Lecture 5 - 28.3.2023 (slides, recording) h14:00-16:00
attention mechanism
the RNNSearch encoder-decoder model (arXiv:1409.0473)
attention and the Nadaraya-Watson kernel estimator
attention layers vs fully connected layers
transformer architecture (arXiv:1706.03762)
(masked) multi head (self) attention based on scaled dot product
layer normalization
positional embedding
modern evolutions GPT/GPT2/GPT3/BERT
vision transformer (arXiv:2010.11929)
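The (masked) scaled dot-product attention listed above computes softmax(Q K^T / sqrt(d_k)) V. Below is a minimal single-head sketch in plain Python on lists of row vectors, written without pytorch so that it runs anywhere. The function names and the toy input `X` are illustrative assumptions; there are no learned projections, so Q = K = V = X.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(Q, K, V, causal=False):
    """Compute softmax(Q K^T / sqrt(d_k)) V for lists of row vectors.

    Q: n_q x d_k, K: n_k x d_k, V: n_k x d_v.
    With causal=True, query i may only attend to keys j <= i (masked attention).
    Returns (outputs, attention_weights).
    """
    d_k = len(K[0])
    d_v = len(V[0])
    outputs, weights = [], []
    for i, q in enumerate(Q):
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in K]
        if causal:
            # Masked positions get -inf, so their softmax weight is exactly 0
            scores = [s if j <= i else float("-inf") for j, s in enumerate(scores)]
        w = softmax(scores)
        weights.append(w)
        outputs.append([sum(w_j * v[c] for w_j, v in zip(w, V)) for c in range(d_v)])
    return outputs, weights


# Toy self-attention over three 2-dimensional tokens
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out, attn = scaled_dot_product_attention(X, X, X, causal=True)
print(attn)
```

Each row of `attn` sums to 1, and with the causal mask the first token can only attend to itself, so its output equals its own value vector.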
Hands-on 4 - 29.3.2023 (notebook, recording) h16:00-18:00
a transformer encoder architecture for jet tagging
Hands-on 4bis - 4.4.2023 (notebook, recording) h14:00-16:00
a transformer encoder architecture for jet tagging + a ViT architecture trained for the same task
Lecture 6 - 5.4.2023 (slides, recording) h16:00-18:00
neural architectures for object detection and segmentation:
semantic segmentation, downsampling-upsampling (arXiv:1411.4038, arXiv:1505.04366)
object detection, IoU, anchor boxes, non-max suppression
region proposals, R-CNN, Fast and Faster R-CNN (arXiv:1311.2524, arXiv:1506.01497)
YOLO model
Instance segmentation, Mask R-CNN (arXiv:1703.06870)
Pose
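IoU and non-max suppression, listed above, can be sketched in a few lines. This is a minimal plain-Python illustration, not the implementation used in the hands-on notebooks: the `(x1, y1, x2, y2)` box format, the 0.5 threshold and the toy boxes are assumptions made for the example.

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-max suppression; returns indices of kept boxes, best score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with a better one
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
print(kept)
```

The first two boxes overlap with IoU = 81/119, above the threshold, so only the higher-scoring one survives; the third box is disjoint from both and is kept.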
Hands-on 5 - 12.4.2023 (notebook, recording) h16:00-18:00
segmentation of MRI images using the MONAI framework for medical imaging
Lecture 7 - 18.4.2023 (slides, recording) h14:00-16:00
Graph Neural Networks (GRL section 5 and 6, PyTorch geometric web site)
introduction
graphs and representation
permutation equivariance
graph convolutions and message passing
basic GCN layer
self-loop GCN
graph attention networks
solutions to the over-smoothing problem in GNNs
normalization
GNN python libraries
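The basic GCN layer with self-loops and symmetric normalization, together with permutation equivariance, can be illustrated with a short sketch. It is written in plain Python on nested lists (PyTorch Geometric would be used in practice); the propagation rule H' = D^{-1/2} (A + I) D^{-1/2} H W is the standard GCN form, while the toy graph, features and function names are assumptions made for the example. The nonlinearity is omitted for brevity.

```python
import math


def matmul(X, Y):
    """Plain matrix product of two nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]


def gcn_layer(A, H, W):
    """One GCN propagation step: D^{-1/2} (A + I) D^{-1/2} H W (no nonlinearity)."""
    n = len(A)
    # Add self-loops so each node also keeps its own features
    A_hat = [[A[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in A_hat]
    # Symmetric degree normalization
    norm = [[A_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    return matmul(norm, matmul(H, W))


# Toy path graph 0 - 1 - 2 with 2-dimensional node features
A = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W = [[1.0, 0.0], [0.0, 1.0]]  # identity weights, for readability
out = gcn_layer(A, H, W)

# Permutation equivariance: relabelling the nodes relabels the output rows
perm = [2, 0, 1]
A_p = [[A[perm[i]][perm[j]] for j in range(3)] for i in range(3)]
H_p = [H[perm[i]] for i in range(3)]
out_p = gcn_layer(A_p, H_p, W)
equivariant = all(math.isclose(out_p[i][c], out[perm[i]][c], abs_tol=1e-12)
                  for i in range(3) for c in range(2))
print(equivariant)
```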
Hands-on 6 - 19.4.2023 h16:00-18:00
object detection with the YOLO V3 model - part 1
Hands-on 6bis - 26.4.2023 (notebook, recording) h14:00-16:00
object detection with the YOLO V3 model - part 2
Lecture 8 - 2.5.2023 (slides, recording) h14:00-16:00
AI explainability and interpretability
Hands-on 7 - 3.5.2023 (notebook, recording) h16:00-18:00
graph neural network with pytorch geometric
Hands-on 7bis - 9.5.2023 (notebook, recording) h14:00-16:00
GNN for point cloud classification (part 1)
Hands-on 7tris - 10.5.2023 (notebook, recording) h16:00-18:00
GNN for point cloud classification (part 2)
Hands-on session 8 - 16.5.2023 (notebook, recording) h14:00-16:00
xAI methods
Hands-on session 8bis - 17.5.2023 (notebook, recording) h16:00-18:00
xAI methods (part 2)
Lecture 9 - 23.5.2023 (slides, recording) h14:00-16:00
AutoEncoders (DL ch 14)
under-complete auto-encoders
linear AE and PCA
over-complete AEs
denoising AE
sparse AE
contractive AE
AE for self-supervised anomaly detection
examples of implementation in pytorch
Hands-on session 9 - 24.5.2023 (notebook, recording) h16:00-18:00
anomaly detection
Lecture 10 - 30.5.2023 (slides, recording) h14:00-16:00
generative DL (DL ch. 20)
autoregressive models
latent variable models
VAE, ELBO theorem
Generative Adversarial Networks
Lecture 11 - 31.5.2023 (slides, recording) h16:00-18:00
generative DL part 2 (DL ch. 20)
flow model: normalising flow models
deep diffusion probabilistic models
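For the deep diffusion probabilistic models listed above, the forward (noising) process has the closed form q(x_t | x_0) = N(sqrt(abar_t) x_0, (1 - abar_t) I), with abar_t the cumulative product of (1 - beta_s). The sketch below illustrates only this forward step in plain Python. The linear schedule from 1e-4 to 0.02 over 1000 steps is a commonly used default, taken here as an assumption, and the function names are hypothetical; no denoising network is trained.

```python
import math
import random


def linear_beta_schedule(T=1000, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1 .. beta_T (assumed default values)."""
    return [beta_start + (beta_end - beta_start) * t / (T - 1) for t in range(T)]


def alpha_bar(betas):
    """Cumulative products abar_t = prod_{s <= t} (1 - beta_s)."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out


def q_sample(x0, t, abar, rng):
    """Draw x_t ~ N(sqrt(abar_t) x0, (1 - abar_t) I) directly, without iterating."""
    scale = math.sqrt(abar[t])
    sigma = math.sqrt(1.0 - abar[t])
    return [scale * x + sigma * rng.gauss(0.0, 1.0) for x in x0]


betas = linear_beta_schedule()
abar = alpha_bar(betas)
rng = random.Random(0)        # fixed seed for reproducibility
x0 = [1.0, -1.0, 0.5]         # toy "data" vector
x_early = q_sample(x0, 10, abar, rng)
x_late = q_sample(x0, 999, abar, rng)
print(abar[0], abar[-1])
```

At small t the sample stays close to `x0`, while at the last step `abar` is nearly zero, so `x_late` is almost pure Gaussian noise; this is the starting point of the reverse (generative) process.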
Hands-on session 10 - 6.6.2023 (notebook, recording) h14:00-16:00
normalising flow model implementation
Hands-on session 10bis - 7.6.2023 (notebook) h16:00-18:00
deep diffusion probabilistic model implementation
Lecture 12 and hands-on session 11 - 15.6.2023 (slides, notebook, recording) h16:00-18:00
basic introduction to Quantum Machine Learning (SP ch. 1, 3, 5)
implementation of simple models with the pennylane library
Special seminar from Sergio Orlandini on HPC DL - 23.6.2023 (slides)