Advanced Machine Learning for Physics (PhD 2025)
Course Information and Syllabus
Contacts: Stefano Giagu (stefano.giagu [at] uniroma1.it) and Andrea Ciardiello (andrea.ciardiello [at] uniroma1.it)
Program:
The general objective of the course is to familiarise students with advanced deep learning techniques based on differentiable neural network models and different learning paradigms; to develop skills in modelling complex problems with deep learning techniques; and to understand how to apply these techniques in different contexts in physics and in basic and applied scientific research.
Topics covered include: a recap of differentiable artificial neural networks and of the pytorch library for ANN design; learning paradigms; ANNs for vision (segmentation and object detection); generative AI (autoregressive models, invertible models, diffusion models); uncertainty quantification in ANNs; Graph Neural Networks; Attention and Transformers; Reinforcement Learning; Energy-Based Models; AI explainability; and Quantum Machine Learning on near-term quantum devices.
Approximately 50% of the course consists of classroom lectures supported by slides, aimed at providing advanced knowledge of Deep Learning techniques. The remaining 50% consists of hands-on computational sessions that build the practical skills needed to autonomously develop and implement advanced Deep Learning models for solving various problems in physics and in scientific research in general.
Indispensable prerequisites: basic concepts of machine learning, Python programming, and the standard Python libraries (numpy, pandas, matplotlib, torch/pytorch).
a basic Python course on YouTube (many others are available on the web): https://youtu.be/_uQrJ0TkZlc
tutorials on numpy, matplotlib, pandas: https://jakevdp.github.io/PythonDataScienceHandbook/
basic concepts of ML: Introduction + Part I (ch. 5: ML basics) of the book by I. Goodfellow et al.: https://www.deeplearningbook.org/
tutorials on the pytorch website: https://pytorch.org/
an introductory course on pytorch on YouTube (many others are available on the web): https://youtu.be/c36lUUr864M
Depending on the requirements of your specific PhD programme, each student can decide how many lectures/hands-on sessions to attend to reach the required CFUs: 20h, 40h, or 60h (60h corresponds to the entire course).
Forum group (telegram group):
Calendar:
Aula 7: lecture room 7, E. Fermi building, Physics Department
labSS: Signals and Systems laboratory (laboratorio segnali e sistemi), first floor, G. Marconi building, Physics Department
google meet link for classroom lectures and hands-on sessions: https://meet.google.com/xnj-bkjo-afm
Bibliography/References and detailed topics covered during lectures, slides, notebooks, etc.
Given the highly dynamic nature of the topics covered in the course, there is no single reference text. During the course the sources will be indicated and provided from time to time in the form of scientific and technical articles and book chapters.
Some classic readings on Deep Learning based on differentiable neural networks:
DL: I. Goodfellow, Y. Bengio, A. Courville: Deep Learning, MIT Press (https://www.deeplearningbook.org/)
PB: P. Baldi, Deep Learning in Science, Cambridge University Press
DL2: C. Bishop, Deep Learning, Springer
SCR: S. Scardapane, Alice's Adventures in a Differentiable Wonderland, https://arxiv.org/abs/2404.17625 (book: https://www.amazon.it/dp/B0D9QHS5NG)
GRL: W. L. Hamilton, Graph Representation Learning, Morgan & Claypool (PDF: https://www.cs.mcgill.ca/~wlh/grl_book/files/GRL_Book.pdf)
recordings of the lectures: link
lecture L1 - 4.3.2025 (slides) h15:00-17:00
introduction to HPC and parallel acceleration for AI, and the Leonardo supercomputer (with Sergio Orlandini, Cineca)
lecture L2 - 11.3.2025 (slides) h15:00-17:00
course information and synopsis
overview (recap) of artificial neural networks, CNNs, RNNs (DL ch 6 (6.1-6.5), DL2 ch 6-10, SCR ch 5-7)
artificial neuron model and MLPs
activation functions for hidden and output layers
training of an ANN (a minimal pytorch sketch follows the regularisation items below)
loss functions
SGD, momentum, learning rate, variable lr and optimizers
learning curves, bias-variance tradeoff and double descent in DNNs
regularisation
dropout
early stopping
noise injection
weight regularisation L1, L2, L1+L2
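A minimal pytorch sketch of the training ingredients above (illustrative only, not course material): an MLP with dropout, SGD with momentum and L2 weight regularisation via weight_decay, and a variable learning rate through a scheduler. All layer sizes and hyperparameters are made-up examples.

import torch
import torch.nn as nn

model = nn.Sequential(           # small MLP for a 10-class problem
    nn.Linear(28 * 28, 256),
    nn.ReLU(),                   # hidden activation
    nn.Dropout(p=0.5),           # dropout regularisation
    nn.Linear(256, 10),          # output layer: raw logits
)
loss_fn = nn.CrossEntropyLoss()  # applies log-softmax internally
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2,
                            momentum=0.9,       # momentum term
                            weight_decay=1e-4)  # L2 penalty on the weights
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)

x = torch.randn(32, 28 * 28)     # dummy input batch
y = torch.randint(0, 10, (32,))  # dummy targets

for epoch in range(5):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)  # forward pass + loss
    loss.backward()              # back-propagation
    optimizer.step()             # SGD update
    scheduler.step()             # variable learning rate
# early stopping would monitor a validation loss here and halt when it stops improving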
Convolutional NNs (a minimal CNN sketch follows this sub-list)
image representation and input properties of a CNN (symmetry, translation invariance, self-similarity, compositionality, locality) and learned convolutional filters (DL ch 9 (9.1,9.2, 9.4))
local receptive field
convolution and shared weights
pooling layers
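A minimal sketch of the CNN building blocks just listed (illustrative, with made-up sizes): 3x3 convolutions give each unit a local receptive field with weights shared across the whole image, and pooling layers downsample the feature maps.

import torch
import torch.nn as nn

cnn = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),  # 3x3 local receptive field,
                                                 # 16 filters shared over the image
    nn.ReLU(),
    nn.MaxPool2d(2),                             # pooling: downsample by 2
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 7 * 7, 10),                   # classifier head for 28x28 inputs
)

out = cnn(torch.randn(8, 1, 28, 28))             # batch of 8 grayscale images
print(out.shape)                                 # torch.Size([8, 10])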
Analysis of sequences: task definition and problems (DL ch 10 (intro, 10.2))
Vanilla RNN cell: structure and operating principle
Stacked RNN, Bidirectional RNN, Encoder-Decoder RNN (seq2seq)
Back-propagation through time
Long-term correlations and the vanishing/exploding gradient problems: gated cells and the Long Short-Term Memory RNN
LSTM: description of its operations (blog note by C. Olah; for details, DL ch 10 (10.10))
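A minimal pytorch sketch of the RNN variants above (shapes and sizes are made-up examples): a stacked, bidirectional LSTM processing a batch of sequences.

import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=8, hidden_size=32,
               num_layers=2,        # stacked RNN
               bidirectional=True,  # bidirectional RNN
               batch_first=True)

x = torch.randn(4, 50, 8)   # (batch, time steps, features)
out, (h_n, c_n) = lstm(x)   # out holds the hidden state at every time step
print(out.shape)            # torch.Size([4, 50, 64]): 2 directions x 32 hidden units
print(h_n.shape)            # torch.Size([4, 4, 32]): (layers x directions, batch, hidden)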
hands-on E* (optional, meant for people without prior experience with pytorch) - 13.3.2025 (slides, notebook, recording) h9:00-12:00
pytorch framework recall
hands-on E2 - 17.3.2025 (slides1, slides2) h8:00-11:00
Leonardo HPC hands-on with Sergio Orlandini
lecture L3 - 18.3.2025 (slides) h15:00-17:00
learning methods
learning paradigms recall (supervised, unsupervised, reinforcement learning)
semi-supervised learning
self-supervised learning
contrastive learning, SimCLR (DL2 6.3.5)
non-contrastive learning, Barlow Twins (arXiv:2103.03230)
deep residual learning, denoisers based on residual learning
adversarial learning
transfer learning and domain adaptation (DL2 6.3.4)
knowledge transfer based on knowledge distillation (arXiv:1503.02531); a loss sketch follows this list
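A hedged sketch of the knowledge-distillation loss of arXiv:1503.02531 (function name, temperature, and mixing weight are illustrative choices, not the course's): the student matches temperature-softened teacher probabilities through a KL term, mixed with the usual cross-entropy on hard labels.

import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    # soft-target term: KL between temperature-softened distributions,
    # scaled by T^2 to keep gradient magnitudes comparable (arXiv:1503.02531)
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * T * T
    hard = F.cross_entropy(student_logits, labels)  # hard-label term
    return alpha * soft + (1.0 - alpha) * hard

s = torch.randn(16, 10)          # student logits (dummy)
t = torch.randn(16, 10)          # frozen-teacher logits (dummy)
y = torch.randint(0, 10, (16,))
print(distillation_loss(s, t, y))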
hands-on E3 - 20.3.2025 (notebook) h9:00-12:00
implementation of the Barlow Twins model in pytorch, applied to half-moon recognition
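A minimal sketch of the Barlow Twins objective (arXiv:2103.03230) behind this hands-on, not the actual E3 notebook; embedding sizes and the lambda value are illustrative. The loss pushes the cross-correlation matrix of the embeddings of two augmented views towards the identity: diagonal terms enforce invariance, off-diagonal terms reduce redundancy.

import torch

def barlow_twins_loss(z1, z2, lambd=5e-3):
    # z1, z2: embeddings of two augmented views of the same batch, shape (N, D)
    N, D = z1.shape
    z1 = (z1 - z1.mean(0)) / z1.std(0)   # standardise each dimension over the batch
    z2 = (z2 - z2.mean(0)) / z2.std(0)
    c = (z1.T @ z2) / N                  # D x D cross-correlation matrix
    on_diag = (torch.diagonal(c) - 1).pow(2).sum()               # invariance term
    off_diag = (c - torch.diag(torch.diagonal(c))).pow(2).sum()  # redundancy reduction
    return on_diag + lambd * off_diag

z1 = torch.randn(64, 128)   # dummy view-1 embeddings
z2 = torch.randn(64, 128)   # dummy view-2 embeddings
print(barlow_twins_loss(z1, z2))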