Advanced Machine Learning for Physics (PhD 2025)
Course Information and Syllabus
Contacts: Stefano Giagu (stefano.giagu [at] uniroma1.it) and Andrea Ciardiello (andrea.ciardiello [at] uniroma1.it)
Program:
The general objective of the course is to familiarise students with advanced deep learning techniques based on differentiable neural network models and different learning paradigms; to develop skills in modelling complex problems with deep learning techniques; and to understand how to apply these techniques in different contexts in physics and in basic and applied scientific research.
Topics covered include: a recap of differentiable artificial neural networks and of the pytorch library for ANN design; learning paradigms; ANNs for vision (segmentation and object detection); generative AI (autoregressive models, invertible models, diffusion models); uncertainty quantification in ANNs; Graph Neural Networks; Attention and Transformers; Reinforcement Learning; Energy-Based Models; AI explainability; and Quantum Machine Learning on near-term quantum devices.
Approximately 50% of the course consists of classroom lectures supported by slides, aimed at providing advanced knowledge of Deep Learning techniques. The remaining 50% consists of hands-on computational sessions that build the practical skills needed to autonomously develop and implement advanced Deep Learning models for solving various problems in physics and in scientific research in general.
Indispensable prerequisites: basic concepts of machine learning, Python programming, and the standard Python libraries (numpy, pandas, matplotlib, torch/pytorch).
a basic Python course on YouTube (many others are available on the web): https://youtu.be/_uQrJ0TkZlc
tutorials on numpy, matplotlib, pandas: https://jakevdp.github.io/PythonDataScienceHandbook/
basic concepts of ML: Introduction + Part I (ch. 5: ML basics) of the book by I. Goodfellow et al.: https://www.deeplearningbook.org/
tutorials on the pytorch website: https://pytorch.org/
an introductory course on pytorch on YouTube (many others are available on the web): https://youtu.be/c36lUUr864M
Depending on the requirements of your specific PhD programme, each student can decide how many lectures/hands-on sessions to attend to reach the required CFUs: 20h, 40h, or 60h (60h corresponds to the entire course).
Forum group (telegram group):
Calendar:
Aula 7: lecture room 7, E. Fermi building, Physics Department
labSS: Signals and Systems laboratory (laboratorio segnali e sistemi), first floor, G. Marconi building, Physics Department
google meet link for classroom lectures and hands-on sessions: https://meet.google.com/xnj-bkjo-afm
Bibliography/References and detailed topics covered during lectures, slides, notebooks, etc.
Given the highly dynamic nature of the topics covered in the course, there is no single reference text. During the course the sources will be indicated and provided from time to time in the form of scientific and technical articles and book chapters.
Some classic readings on Deep Learning based on differentiable neural networks:
DL: I. Goodfellow, Y. Bengio, A. Courville: Deep Learning, MIT Press (https://www.deeplearningbook.org/)
PB: P. Baldi, Deep Learning in Science, Cambridge University Press
DL2: C. Bishop, Deep Learning, Springer
SCR: S. Scardapane, Alice's Adventures in a Differentiable Wonderland, https://arxiv.org/abs/2404.17625 (book: https://www.amazon.it/dp/B0D9QHS5NG)
GRL: W. L. Hamilton, Graph Representation Learning, Morgan & Claypool (PDF: https://www.cs.mcgill.ca/~wlh/grl_book/files/GRL_Book.pdf)
recordings of the lectures: link
lecture L1 - 4.3.2025 (slides) h15:00-17:00
introduction to HPC and parallel acceleration for AI, and the Leonardo supercomputer (with Sergio Orlandini, Cineca)
lecture L2 - 11.3.2025 (slides) h15:00-17:00
course information and synopsis
overview (recap) of artificial neural networks, CNNs, RNNs (DL ch 6 (6.1-6.5), DL2 ch 6-10, SCR ch 5-7)
artificial neuron model and MLPs
activation functions for hidden and output layers
training of an ANN (a minimal pytorch sketch follows the regularisation items below)
loss functions
SGD, momentum, learning rate, variable lr and optimizers
learning curves, bias-variance tradeoff and double descent in DNNs
regularisation
dropout
early stopping
noise injection
weight regularisation L1, L2, L1+L2
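A minimal pytorch sketch of the training ingredients above (illustrative only, not course material): an MLP with dropout, SGD with momentum and L2 weight regularisation via weight_decay, and a variable learning rate through a scheduler. All layer sizes and hyperparameters are made-up examples.

import torch
import torch.nn as nn

model = nn.Sequential(           # small MLP for a 10-class problem
    nn.Linear(28 * 28, 256),
    nn.ReLU(),                   # hidden activation
    nn.Dropout(p=0.5),           # dropout regularisation
    nn.Linear(256, 10),          # output layer: raw logits
)
loss_fn = nn.CrossEntropyLoss()  # applies log-softmax internally
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2,
                            momentum=0.9,       # momentum term
                            weight_decay=1e-4)  # L2 penalty on the weights
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)

x = torch.randn(32, 28 * 28)     # dummy input batch
y = torch.randint(0, 10, (32,))  # dummy targets

for epoch in range(5):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)  # forward pass + loss
    loss.backward()              # back-propagation
    optimizer.step()             # SGD update
    scheduler.step()             # variable learning rate
# early stopping would monitor a validation loss here and halt when it stops improving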
Convolutional NNs (a minimal CNN sketch follows this sub-list)
image representation and input properties of a CNN (symmetry, translation invariance, self-similarity, compositionality, locality) and learned convolutional filters (DL ch 9 (9.1,9.2, 9.4))
local receptive field
convolution and shared weights
pooling layers
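A minimal sketch of the CNN building blocks just listed (illustrative, with made-up sizes): 3x3 convolutions give each unit a local receptive field with weights shared across the whole image, and pooling layers downsample the feature maps.

import torch
import torch.nn as nn

cnn = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),  # 3x3 local receptive field,
                                                 # 16 filters shared over the image
    nn.ReLU(),
    nn.MaxPool2d(2),                             # pooling: downsample by 2
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 7 * 7, 10),                   # classifier head for 28x28 inputs
)

out = cnn(torch.randn(8, 1, 28, 28))             # batch of 8 grayscale images
print(out.shape)                                 # torch.Size([8, 10])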
Analysis of sequences: task definition and problems (DL ch 10 (intro, 10.2))
Vanilla RNN cell: structure and operating principle
Stacked RNN, Bidirectional RNN, Encoder-Decoder RNN (seq2seq)
Back-propagation through time
Long-term correlations and the vanishing/exploding gradient problems: gated cells and the Long Short-Term Memory RNN
LSTM: description of its operations (blog note by C. Olah; for details, DL ch 10 (10.10))
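A minimal pytorch sketch of the RNN variants above (shapes and sizes are made-up examples): a stacked, bidirectional LSTM processing a batch of sequences.

import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=8, hidden_size=32,
               num_layers=2,        # stacked RNN
               bidirectional=True,  # bidirectional RNN
               batch_first=True)

x = torch.randn(4, 50, 8)   # (batch, time steps, features)
out, (h_n, c_n) = lstm(x)   # out holds the hidden state at every time step
print(out.shape)            # torch.Size([4, 50, 64]): 2 directions x 32 hidden units
print(h_n.shape)            # torch.Size([4, 4, 32]): (layers x directions, batch, hidden)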
hands-on E* (optional, meant for people without prior experience with pytorch) - 13.3.2025 (slides, notebook, recording) h9:00-12:00
pytorch framework recall
hands-on E2 - 17.3.2025 (slides1, slides2) h8:00-11:00
Leonardo HPC hands-on with Sergio Orlandini
lecture L3 - 18.3.2025 (slides) h15:00-17:00
learning methods
learning paradigms recall (supervised, unsupervised, reinforcement learning)
semi-supervised learning
self-supervised learning
contrastive learning, SimCLR (DL2 6.3.5)
non-contrastive learning, Barlow Twins (arXiv:2103.03230)
deep residual learning, denoisers based on residual learning
adversarial learning
transfer learning and domain adaptation (DL2 6.3.4)
knowledge transfer based on knowledge distillation (arXiv:1503.02531); a loss sketch follows this list
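A hedged sketch of the knowledge-distillation loss of arXiv:1503.02531 (function name, temperature, and mixing weight are illustrative choices, not the course's): the student matches temperature-softened teacher probabilities through a KL term, mixed with the usual cross-entropy on hard labels.

import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    # soft-target term: KL between temperature-softened distributions,
    # scaled by T^2 to keep gradient magnitudes comparable (arXiv:1503.02531)
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * T * T
    hard = F.cross_entropy(student_logits, labels)  # hard-label term
    return alpha * soft + (1.0 - alpha) * hard

s = torch.randn(16, 10)          # student logits (dummy)
t = torch.randn(16, 10)          # frozen-teacher logits (dummy)
y = torch.randint(0, 10, (16,))
print(distillation_loss(s, t, y))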
hands-on E3 - 20.3.2025 (notebook) h9:00-12:00
implementation of the Barlow Twins model in pytorch, applied to half-moon recognition
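A minimal sketch of the Barlow Twins objective (arXiv:2103.03230) behind this hands-on, not the actual E3 notebook; embedding sizes and the lambda value are illustrative. The loss pushes the cross-correlation matrix of the embeddings of two augmented views towards the identity: diagonal terms enforce invariance, off-diagonal terms reduce redundancy.

import torch

def barlow_twins_loss(z1, z2, lambd=5e-3):
    # z1, z2: embeddings of two augmented views of the same batch, shape (N, D)
    N, D = z1.shape
    z1 = (z1 - z1.mean(0)) / z1.std(0)   # standardise each dimension over the batch
    z2 = (z2 - z2.mean(0)) / z2.std(0)
    c = (z1.T @ z2) / N                  # D x D cross-correlation matrix
    on_diag = (torch.diagonal(c) - 1).pow(2).sum()               # invariance term
    off_diag = (c - torch.diag(torch.diagonal(c))).pow(2).sum()  # redundancy reduction
    return on_diag + lambd * off_diag

z1 = torch.randn(64, 128)   # dummy view-1 embeddings
z2 = torch.randn(64, 128)   # dummy view-2 embeddings
print(barlow_twins_loss(z1, z2))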