If these were independent, we could approximate: \left\langle\mathcal{O}\right\rangle \simeq \frac{1}{N}\sum^{N}_{n=1}\mathcal{O}(x_{n}) \sigma_{\mathcal{O}}^{2} = \frac{1}{N}\mathrm{Var}{\left[\mathcal{O} (x)
\right]}\Longrightarrow \sigma_{\mathcal{O}} \propto \frac{1}{\sqrt{N}}
Markov Chain Monte Carlo (MCMC)
🎯 Goal
Generate independent samples \{x_{i}\}, such that2\{x_{i}\} \sim p(x) \propto e^{-S(x)} where S(x) is the action (or potential energy)
Instead, nearby configs are correlated, and we incur a factor of \textcolor{#FF5252}{\tau^{\mathcal{O}}_{\mathrm{int}}}: \sigma_{\mathcal{O}}^{2} =
\frac{\textcolor{#FF5252}{\tau^{\mathcal{O}}_{\mathrm{int}}}}{N}\mathrm{Var}{\left[\mathcal{O}
(x) \right]}
Hamiltonian Monte Carlo (HMC)
Want to (sequentially) construct a chain of states: x_{0} \rightarrow x_{1} \rightarrow x_{i} \rightarrow \cdots \rightarrow x_{N}\hspace{10pt}
such that, as N \rightarrow \infty: \left\{x_{i}, x_{i+1}, x_{i+2}, \cdots, x_{N} \right\} \xrightarrow[]{N\rightarrow\infty} p(x)
\propto e^{-S(x)}
🪄 Trick
Introduce fictitious momentum v \sim \mathcal{N}(0, \mathbb{1})
Resample v_{0} \sim \mathcal{N}(0, \mathbb{1})
at the beginning of each trajectory
Note: \partial_{x} S(x) is the force
HMC Update
We build a trajectory of N_{\mathrm{LF}}leapfrog steps3\begin{equation*}
(x_{0}, v_{0})%
\rightarrow (x_{1}, v_{1})\rightarrow \cdots%
\rightarrow (x', v')
\end{equation*}
And propose x' as the next state in our chain
\begin{align*}
\textcolor{#F06292}{\Gamma}: (x, v) \textcolor{#F06292}{\rightarrow} v' &:= v - \frac{\varepsilon}{2} \partial_{x} S(x) \\
\textcolor{#FD971F}{\Lambda}: (x, v) \textcolor{#FD971F}{\rightarrow} x' &:= x + \varepsilon v
\end{align*}
We then accept / reject x' using Metropolis-Hastings criteria, A(x'|x) = \min\left\{1, \frac{p(x')}{p(x)}\left|\frac{\partial x'}{\partial x}\right|\right\}
HMC Demo
Issues with HMC
What do we want in a good sampler?
Fast mixing (small autocorrelations)
Fast burn-in (quick convergence)
Problems with HMC:
Energy levels selected randomly \rightarrowslow mixing
Cannot easily traverse low-density zones \rightarrowslow convergence
We can measure the performance by comparing \tau_{\mathrm{int}} for the trained model vs. HMC.
Note: lower is better
Interpretation
Deviation in x_{P}
Topological charge mixing
Artificial influx of energy
Figure 8: Illustration of how different observables evolve over a single L2HMC trajectory.
Interpretation
Average plaquette: \langle x_{P}\rangle vs LF step
Average energy: H - \sum\log|\mathcal{J}|
Figure 9: The trained model artifically increases the energy towards the middle of the trajectory, allowing the sampler to tunnel between isolated sectors.
4D SU(3) Results
Distribution of \log|\mathcal{J}| over all chains, at each leapfrog step, N_{\mathrm{LF}} (= 0, 1, \ldots, 8) during training:
Figure 10: 100 train iters
Figure 11: 500 train iters
Figure 12: 1000 train iters
4D SU(3) Results: \delta U_{\mu\nu}
Figure 13: The difference in the average plaquette \left|\delta U_{\mu\nu}\right|^{2} between the trained model and HMC
4D SU(3) Results: \delta U_{\mu\nu}
Figure 14: The difference in the average plaquette \left|\delta U_{\mu\nu}\right|^{2} between the trained model and HMC
This research used resources of the Argonne Leadership Computing Facility,
which is a DOE Office of Science User Facility supported under Contract DE-AC02-06CH11357.
Boyda, Denis et al. 2022. “Applications of Machine Learning to Lattice Quantum Field Theory.” In Snowmass 2021. https://arxiv.org/abs/2202.05838.
Foreman, Sam, Taku Izubuchi, Luchang Jin, Xiao-Yong Jin, James C. Osborn, and Akio Tomiya. 2022. “HMC with Normalizing Flows.”PoS LATTICE2021: 073. https://doi.org/10.22323/1.396.0073.
Foreman, Sam, Xiao-Yong Jin, and James C. Osborn. 2021. “Deep Learning Hamiltonian Monte Carlo.” In 9th International Conference on Learning Representations. https://arxiv.org/abs/2105.03418.
———. 2022. “LeapfrogLayers: A Trainable Framework for Effective Topological Sampling.”PoS LATTICE2021 (May): 508. https://doi.org/10.22323/1.396.0508.
Shanahan, Phiala et al. 2022. “Snowmass 2021 Computational Frontier CompF03 Topical Group Report: Machine Learning,” September. https://arxiv.org/abs/2209.07559.
Extras
Integrated Autocorrelation Time
Figure 15: Plot of the integrated autocorrelation time for both the trained model (colored) and HMC (greyscale).
Comparison
(a) Trained model
(b) Generic HMC
Figure 16: Comparison of \langle \delta Q\rangle = \frac{1}{N}\sum_{i=k}^{N} \delta Q_{i} for the trained model Figure 16 (a) vs. HMC Figure 16 (b)
Plaquette analysis: x_{P}
Deviation from V\rightarrow\infty limit, x_{P}^{\ast}
Average \langle x_{P}\rangle, with x_{P}^{\ast} (dotted-lines)
Figure 17: Plot showing how average plaquette, \left\langle x_{P}\right\rangle varies over a single trajectory for models trained at different \beta, with varying trajectory lengths N_{\mathrm{LF}}
Loss Function
Want to maximize the expected squared charge difference9: \begin{equation*}
\mathcal{L}_{\theta}\left(\xi^{\ast}, \xi\right) =
{\mathbb{E}_{p(\xi)}}\big[-\textcolor{#FA5252}{{\delta Q}}^{2}
\left(\xi^{\ast}, \xi \right)\cdot A(\xi^{\ast}|\xi)\big]
\end{equation*}
Where:
\delta Q is the tunneling rate: \begin{equation*}
\textcolor{#FA5252}{\delta Q}(\xi^{\ast},\xi)=\left|Q^{\ast} - Q\right|
\end{equation*}
A(\xi^{\ast}|\xi) is the probability10 of accepting the proposal \xi^{\ast}: \begin{equation*}
A(\xi^{\ast}|\xi) = \mathrm{min}\left( 1,
\frac{p(\xi^{\ast})}{p(\xi)}\left|\frac{\partial \xi^{\ast}}{\partial
\xi^{T}}\right|\right)
\end{equation*}
Networks 2D U(1)
Stack gauge links as shape\left(U_{\mu}\right)=[Nb, 2, Nt, Nx]\in \mathbb{C}
x_{\mu}(n) ≔ \left[\cos(x), \sin(x)\right]
with shape\left(x_{\mu}\right)= [Nb, 2, Nt, Nx, 2]\in \mathbb{R}
Calculate physical observables at increasing spatial resolution
Perform extrapolation to continuum limit
Figure 20: Increasing the physical resolution (a \rightarrow 0) allows us to make predictions about numerical values of physical quantities in the continuum limit.
Note that \left(\Gamma^{+}\right)^{-1} = \Gamma^{-}, i.e. \Gamma^{+}\left[\Gamma^{-}(U, P)\right] = \Gamma^{-}\left[\Gamma^{+}(U, P)\right] = (U, P)↩︎
Where \xi^{\ast} is the proposed configuration (prior to Accept / Reject)↩︎
And \left|\frac{\partial \xi^{\ast}}{\partial \xi^{T}}\right| is the Jacobian of the transformation from \xi \rightarrow \xi^{\ast}↩︎
Note that \left(\Gamma^{+}\right)^{-1} = \Gamma^{-}, i.e. \Gamma^{+}\left[\Gamma^{-}(x, v)\right] = \Gamma^{-}\left[\Gamma^{+}(x, v)\right] = (x, v)↩︎
Citation
For attribution, please cite this work as:
Foreman, Sam. 2023. “MLMC: Machine Learning Monte Carlo for
Lattice Gauge Theory.”Https://Indico.fnal.gov/Event/57249/Contributions/271305/.
Presentation at the 2023 International Symposium on Lattice Field
Theory, July 31. https://saforem2.github.io/lattice23.
Source Code
---title: "MLMC: Machine Learning Monte Carlo"location: Lattice 2023 (Fermilab)location-url: "https://indico.fnal.gov/event/57249/contributions/271305/"# location: "[Lattice 2023](https://indico.fnal.gov/event/57249/contributions/271305/)"image: "https://github.com/saforem2/lattice23/blob/main/assets/thumbnail.png?raw=true"date: 2023-07-31categories: - ai4science - MCMC - Lattice QCD# date-modified: last-modifiedtitle-block-categories: falsenumber-sections: falsebibliography: references.bibappendix-cite-as: displayfavicon: "./assets/favicon.svg"callout-style: simpletwitter-card: image: "https://github.com/saforem2/lattice23/blob/main/assets/thumbnail.png?raw=true" site: "saforem2" creator: "saforem2"citation: author: Sam Foreman type: speech genre: "Presentation at the 2023 International Symposium on Lattice Field Theory" container-title: https://indico.fnal.gov/event/57249/contributions/271305/ title: "MLMC: Machine Learning Monte Carlo for Lattice Gauge Theory" url: https://saforem2.github.io/lattice23 abstract: | We present a trainable framework for efficiently generating gauge configurations, and discuss ongoing work in this direction. In particular, we consider the problem of sampling configurations from a 4D 𝑆𝑈(3) lattice gauge theory, and consider a generalized leapfrog integrator in the molecular dynamics update that can be trained to improve sampling efficiency.format: html: # reference-location: section # toc-location: right page-layout: full # grid: # body-width: 800px revealjs: slides-url: https://samforeman.me/talks/lattice23/slides.html # template-partials: # - ./title-slide.html # - ./title-fancy/title-slide.html # - ./title_slide_template.html # - ../../title-slide.html title-slide-attributes: data-background-iframe: https://saforem2.github.io/grid-worms-animation/ data-background-size: contain data-background-color: "#1c1c1c" background-color: "#1c1c1c" title-block-style: none slide-number: c title-slide-style: default chalkboard: buttons: false auto-animate: true reference-location: section touch: true pause: false footnotes-hover: true citations-hover: true preview-links: true controls-tutorial: true controls: false logo: "https://raw.githubusercontent.com/saforem2/anl-job-talk/main/docs/assets/anl.svg" history: false # theme: [css/dark.scss] callout-style: simple # css: [css/default.css, css/callouts.css] fig-align: center css: # - css/default.css - ../../css/custom.css theme: # - light: - white - ../../css/title-slide-template.scss - ../../css/reveal/reveal.scss - ../../css/common.scss - ../../css/dark.scss - ../../css/syntax-dark.scss - ../../css/callout-cards.scss # callout-style: simple # css: # # - css/default.css # - ../../css/custom.css # theme: # # - light: # # - css/dark.scss # - ../../css/title-slide-template.scss # - ../../css/reveal/reveal.scss # - ../../css/common.scss # - ../../css/dark.scss # - ../../css/syntax-dark.scss # - ../../css/callout-cards.scss # css: # # - css/default.css # - ../../css/custom.css # theme: # # - black # # - ./title-fancy/title-slide-template.scss # - ../../css/reveal/reveal.scss # - ../../css/common.scss # - ../../css/dark.scss # - ../../css/syntax-dark.scss # - ../../css/callout-cards.scss # - css/dark.scss self-contained: false embed-resources: false self-contained-math: false center: true highlight-style: "atom-one" default-image-extension: svg code-line-numbers: true data-background-color: "#1c1c1c" background-color: "#1c1c1c" code-overflow: scroll html-math-method: katex output-file: "slides.html" mermaid: theme: neutral # gfm: # output-file: "lattice23.md"---# {.title-slide background-color="#1c1c1c" background-iframe="https://saforem2.github.io/grid-worms-animation/" loading="lazy"}::: {style="background-color: rgba(22,22,22,0.75); border-radius: 10px; text-align:center; padding: 0px; padding-left: 1.5em; padding-right: 1.5em; max-width: min-content; min-width: max-content; margin-left: auto; margin-right: auto; padding-top: 0.2em; padding-bottom: 0.2em; line-height: 1.5em!important;"}<span style="color:#939393; font-size:1.5em; font-weight: bold;">MLMC: Machine Learning Monte Carlo</span><span style="color:#777777; font-size:1.2em; font-weight: bold;">for Lattice Gauge Theory</span>[<br> ]{style="padding-bottom: 0.5rem;"} [{{< fa solid home >}}](https://samforeman.me) Sam Foreman [Xiao-Yong Jin, James C. Osborn]{.dim-text style="font-size:0.8em;"} [[[{{< fa brands github >}} `saforem2/`](https://github.com/saforem2/)]{style="border-bottom: 0.5px solid #00ccff;"}`{`[[`lattice23`](https://github.com/saforem2/lattice23)]{style="border-bottom: 0.5px solid #00ccff;"}, [[`l2hmc-qcd`](https://github.com/saforem2/l2hmc-qcd)]{style="border-bottom: 0.5px solid #00ccff;"}`}`]{style="font-size:0.8em;"}:::::: footer[2023-07-31 @ [Lattice 2023](https://indico.fnal.gov/event/57249/contributions/271305/)]{.dim-text style="text-align:left;'}:::# Overview {background-color="#1c1c1c"}1. [Background: `{MCMC,HMC}`](#markov-chain-monte-carlo-mcmc-centeredslide) - [Leapfrog Integrator](#leapfrog-integrator-hmc-centeredslide) - [Issues with HMC](#sec-issues-with-hmc) - [Can we do better?](#sec-can-we-do-better)2. [L2HMC: Generalizing MD](#sec-l2hmc) - [4D $SU(3)$ Model](#sec-su3) - [Results](#sec-results)3. [References](#sec-references)4. [Extras](#sec-extras)# Markov Chain Monte Carlo (MCMC) {.centeredslide background-color="#1c1c1c"}:::: {.columns}::: {.column width="50%"}::: {.callout-note collapse=false icon=false title="🎯 Goal" style="text-align:left;!important; width: 100%!important;"}Generate **independent** samples $\{x_{i}\}$, such that[^notation1]$$\{x_{i}\} \sim p(x) \propto e^{-S(x)}$$where $S(x)$ is the _action_ (or potential energy):::- Want to calculate observables $\mathcal{O}$: $\left\langle \mathcal{O}\right\rangle \propto \int \left[\mathcal{D}x\right]\hspace{4pt} {\mathcal{O}(x)\, p(x)}$:::::: {.column width="49%"}![](https://raw.githubusercontent.com/saforem2/deep-fridays/main/assets/normal_distribution.dark.svg):::::::If these were <span style="color:#00CCFF;">independent</span>, we could approximate: $\left\langle\mathcal{O}\right\rangle \simeq \frac{1}{N}\sum^{N}_{n=1}\mathcal{O}(x_{n})$ $$\sigma_{\mathcal{O}}^{2} = \frac{1}{N}\mathrm{Var}{\left[\mathcal{O} (x) \right]}\Longrightarrow \sigma_{\mathcal{O}} \propto \frac{1}{\sqrt{N}}$$[^notation1]: Here, $\sim$ means "is distributed according to"::: footer[{{< fa brands github >}} `saforem2/lattice23`](https://saforem2.github.io/lattice23):::# Markov Chain Monte Carlo (MCMC) {.centeredslide background-color="#1c1c1c"}:::: {.columns}::: {.column width="50%"}::: {.callout-note collapse=false icon=false title="🎯 Goal" style="text-align:left;!important; width: 100%!important;"}Generate **independent** samples $\{x_{i}\}$, such that[^notation2]$$\{x_{i}\} \sim p(x) \propto e^{-S(x)}$$where $S(x)$ is the _action_ (or potential energy):::- Want to calculate observables $\mathcal{O}$: $\left\langle \mathcal{O}\right\rangle \propto \int \left[\mathcal{D}x\right]\hspace{4pt} {\mathcal{O}(x)\, p(x)}$:::::: {.column width="49%"}![](https://raw.githubusercontent.com/saforem2/deep-fridays/main/assets/normal_distribution.dark.svg):::::::Instead, nearby configs are [correlated]{.red-text}, and we incur a factor of$\textcolor{#FF5252}{\tau^{\mathcal{O}}_{\mathrm{int}}}$: $$\sigma_{\mathcal{O}}^{2} = \frac{\textcolor{#FF5252}{\tau^{\mathcal{O}}_{\mathrm{int}}}}{N}\mathrm{Var}{\left[\mathcal{O} (x) \right]}$$[^notation2]: Here, $\sim$ means "is distributed according to"::: footer[{{< fa brands github >}} `saforem2/lattice23`](https://github.com/saforem2/lattice23):::# Hamiltonian Monte Carlo (HMC) {.center background-color="#1c1c1c"}- Want to (sequentially) construct a chain of states: $$x_{0} \rightarrow x_{1} \rightarrow x_{i} \rightarrow \cdots \rightarrow x_{N}\hspace{10pt}$$ such that, as $N \rightarrow \infty$: $$\left\{x_{i}, x_{i+1}, x_{i+2}, \cdots, x_{N} \right\} \xrightarrow[]{N\rightarrow\infty} p(x) \propto e^{-S(x)}$$::: {.callout-tip icon=false collapse=false title="🪄 Trick" style="display:inline!important; width: 100%!important;"} - Introduce [fictitious]{.green-text} momentum $v \sim \mathcal{N}(0, \mathbb{1})$ - Normally distributed **independent** of $x$, i.e. $$\begin{align*} p(x, v) &\textcolor{#02b875}{=} p(x)\,p(v) \propto e^{-S{(x)}} e^{-\frac{1}{2} v^{T}v} = e^{-\left[S(x) + \frac{1}{2} v^{T}{v}\right]} \textcolor{#02b875}{=} e^{-H(x, v)} \end{align*}$$:::## Hamiltonian Monte Carlo (HMC) {.centeredslide background-color="#1c1c1c"}:::: {.columns}::: {.column width="55%"}- [**Idea**]{.green-text}: Evolve the $(\dot{x}, \dot{v})$ system to get new states $\{x_{i}\}$❗- Write the **joint distribution** $p(x, v)$: $$ p(x, v) \propto e^{-S[x]} e^{-\frac{1}{2}v^{T} v} = e^{-H(x, v)} $$:::::: {.column width="45%"}::: {.callout-tip collapse=false icon=false title="🔋 Hamiltonian Dynamics" style="width:100%!important;"}$H = S[x] + \frac{1}{2} v^{T} v \Longrightarrow$$$\dot{x} = +\partial_{v} H,\,\,\dot{v} = -\partial_{x} H$$::::::::::::: {#fig-hmc-traj}![](https://raw.githubusercontent.com/saforem2/deep-fridays/main/assets/hmc1.svg){.r-stretch}Overview of HMC algorithm:::## Leapfrog Integrator (HMC) {#sec-leapfrog background-color="#1c1c1c"}:::: {.columns style="font-size: 0.9em;" height="100%"}::: {.column width="40%"}::: {.callout-tip collapse=false icon=false title="🔋 Hamiltonian Dynamics" style="width:100%!important;"}$\left(\dot{x}, \dot{v}\right) = \left(\partial_{v} H, -\partial_{x} H\right)$:::::: {.callout-note collapse=false icon=false title="🐸 Leapfrog Step" style="width:100%!important;"}`input` $\,\left(x, v\right) \rightarrow \left(x', v'\right)\,$ `output`$$\begin{align*}\tilde{v} &:= \textcolor{#F06292}{\Gamma}(x, v)\hspace{2.2pt} = v - \frac{\varepsilon}{2} \partial_{x} S(x) \\x' &:= \textcolor{#FD971F}{\Lambda}(x, \tilde{v}) \, = x + \varepsilon \, \tilde{v} \\v' &:= \textcolor{#F06292}{\Gamma}(x', \tilde{v}) = \tilde{v} - \frac{\varepsilon}{2} \partial_{x} S(x')\end{align*}$$:::::: {.callout-warning collapse=false icon=false title="⚠️ Warning!" style="width:100%!important;"}- Resample $v_{0} \sim \mathcal{N}(0, \mathbb{1})$ at the [beginning]{.yellow-text} of each trajectory:::::: {style="font-size:0.8em; margin-left:13%;"}[**Note**: $\partial_{x} S(x)$ is the _force_]{.dim-text}::::::::: {.column width="55%" style="text-align:left;"}![](./assets/hmc1/hmc-update-light.svg){width=60%}:::::::## HMC Update {background-color="#1c1c1c"}:::: {.columns}::: {.column width="65%"}- We build a trajectory of $N_{\mathrm{LF}}$ **leapfrog steps**[^v0] $$\begin{equation*} (x_{0}, v_{0})% \rightarrow (x_{1}, v_{1})\rightarrow \cdots% \rightarrow (x', v') \end{equation*}$$- And propose $x'$ as the next state in our chain$$\begin{align*} \textcolor{#F06292}{\Gamma}: (x, v) \textcolor{#F06292}{\rightarrow} v' &:= v - \frac{\varepsilon}{2} \partial_{x} S(x) \\ \textcolor{#FD971F}{\Lambda}: (x, v) \textcolor{#FD971F}{\rightarrow} x' &:= x + \varepsilon v\end{align*}$$- We then accept / reject $x'$ using Metropolis-Hastings criteria, $A(x'|x) = \min\left\{1, \frac{p(x')}{p(x)}\left|\frac{\partial x'}{\partial x}\right|\right\}$:::::: {.column width="30%"}![](./assets/hmc1/hmc-update-light.svg):::::::[^v0]: We **always** start by resampling the momentum, $v_{0} \sim\mathcal{N}(0, \mathbb{1})$## HMC Demo {.centeredslide background-color="#1c1c1c"}::: {#fig-hmc-demo}<iframe data-src="https://chi-feng.github.io/mcmc-demo/app.html" width="90%" height="500" title="l2hmc-qcd"></iframe>HMC Demo:::# Issues with HMC {#sec-issues style="font-size:0.9em;" background-color="#1c1c1c"}- What do we want in a good sampler? - **Fast mixing** (small autocorrelations) - **Fast burn-in** (quick convergence)- Problems with HMC: - Energy levels selected randomly $\rightarrow$ **slow mixing** - Cannot easily traverse low-density zones $\rightarrow$ **slow convergence**::: {#fig-hmc-issues layout-ncol=2}![HMC Samples with $\varepsilon=0.25$](https://raw.githubusercontent.com/saforem2/l2hmc-dwq25/main/docs/assets/hmc_traj_eps025.svg)![HMC Samples with $\varepsilon=0.5$](https://raw.githubusercontent.com/saforem2/l2hmc-dwq25/main/docs/assets/hmc_traj_eps05.svg)HMC Samples generated with varying step sizes $\varepsilon$:::# Topological Freezing {.center background-color="#1c1c1c"}:::: {.flex-container style="text-align: center; width: 100%"}::: {.col1 width="45%" style="text-align: left; font-size: 0.9em;"}**Topological Charge**:$$Q = \frac{1}{2\pi}\sum_{P}\left\lfloor x_{P}\right\rfloor \in \mathbb{Z}$$[**note:** $\left\lfloor x_{P} \right\rfloor = x_{P} - 2\pi\left\lfloor\frac{x_{P} + \pi}{2\pi}\right\rfloor$]{.dim-text style="font-size:0.8em;"}::: {.callout-important collapse=false icon=false title="⏳ Critical Slowing Down" style="text-align:left; width:100%!important;"}- $Q$ gets stuck! - as $\beta\longrightarrow \infty$: - $Q \longrightarrow \text{const.}$ - $\delta Q = \left(Q^{\ast} - Q\right) \rightarrow 0 \textcolor{#FF5252}{\Longrightarrow}$ - \# configs required to estimate errors **grows exponentially**:[$\tau_{\mathrm{int}}^{Q} \longrightarrow \infty$]{.red-text}::::::::: {.flex-container width="45%"}![Note $\delta Q \rightarrow 0$ at increasing$\beta$](https://raw.githubusercontent.com/saforem2/l2hmc-dwq25/main/docs/assets/critical_slowing_down.svg){width="80%"}:::::::# Can we do better? {#sec-can-we-do-better background-color="#1c1c1c"}:::: {.columns}::: {.column width="50%"}- Introduce two (**invertible NNs**) `vNet` and `xNet`[^l2hmc]: - [`vNet: ` $(x, F) \longrightarrow \left(s_{v},\, t_{v},\, q_{v}\right)$]{style="font-size:0.9em;"} - [`xNet: ` $(x, v) \longrightarrow \left(s_{x},\, t_{x},\, q_{x}\right)$]{style="font-size:0.9em;"} - Use these $(s, t, q)$ in the _generalized_ MD update: - [[$\Gamma_{\theta}^{\pm}$]{.pink-text} $: ({x}, \textcolor{#07B875}{v}) \xrightarrow[]{\textcolor{#F06292}{s_{v}, t_{v}, q_{v}}} (x, \textcolor{#07B875}{v'})$]{} - [[$\Lambda_{\theta}^{\pm}$]{.orange-text} $: (\textcolor{#AE81FF}{x}, v) \xrightarrow[]{\textcolor{#FD971F}{s_{x}, t_{x}, q_{x}}} (\textcolor{#AE81FF}{x'}, v)$]{}:::::: {.column width="48%"}::: {#fig-mdupdate}![](./assets/leapfrog-layer-2D-U1-vertical.light.svg){style="width:85%; text-align:center;"}Generalized MD update where [$\Lambda_{\theta}^{\pm}$]{.orange-text},[$\Gamma_{\theta}^{\pm}$]{.pink-text} are **invertible NNs**::::::::::[^l2hmc]: [L2HMC: ](https://github.com/saforem2/l2hmc-qcd) {{< fa solid book >}}[@Foreman:2021ixr; @Foreman:2021rhs]# L2HMC: Generalizing the MD Update {#sec-l2hmc .smaller .centeredslide background-color="#1c1c1c"}:::: {.flex-container style="text-align: left; width: 100%"}::: {.col1 style="width:45%; font-size: 0.8em;"}::: {style="border:1px solid red;"}- Introduce $d \sim \mathcal{U}(\pm)$ to determine the direction of our update 1. [$\textcolor{#07B875}{v'} =$ [$\Gamma^{\pm}$]{.pink-text}$({x}, \textcolor{#07B875}{v})$]{} [$\hspace{46pt}$ update $v$]{.dim-text style="font-size:0.9em;"} 2. [$\textcolor{#AE81FF}{x'} =$ [$x_{B}$]{.blue-text}$\,+\,$[$\Lambda^{\pm}$]{.orange-text}$($[$x_{A}$]{.red-text}$, {v'})$]{} [$\hspace{10pt}$ update first **half**: $x_{A}$]{.dim-text style="font-size:0.9em;"} 3. [$\textcolor{#AE81FF}{x''} =$ [$x'_{A}$]{.red-text}$\,+\,$[$\Lambda^{\pm}$]{.orange-text}$($[$x'_{B}$]{.blue-text}$, {v'})$]{} [$\hspace{8pt}$ update other half: $x_{B}$]{.dim-text style="font-size:0.9em;"} 4. [$\textcolor{#07B875}{v''} =$ [$\Gamma^{\pm}$]{.pink-text}$({x''}, \textcolor{#07B875}{v'})$]{} [$\hspace{36pt}$ update $v$]{.dim-text style="font-size:0.9em;"}:::::: {style="border:1px solid red;"}- Resample both $v\sim \mathcal{N}(0, 1)$, and $d \sim \mathcal{U}(\pm)$ at thebeginning of each trajectory - To ensure ergodicity + reversibility, we split the [$x$]{.purple-text} update into sequential (complementary) updates- Introduce directional variable $d \sim \mathcal{U}(\pm)$, resampled at thebeginning of each trajectory: - Note that $\left(\Gamma^{+}\right)^{-1} = \Gamma^{-}$, i.e. $$\Gamma^{+}\left[\Gamma^{-}(x, v)\right] = \Gamma^{-}\left[\Gamma^{+}(x, v)\right] = (x, v)$$::::::::: {.col2 style="width:55%;"}::: {#fig-mdupdate}![](./assets/leapfrog-layer-2D-U1-vertical.light.svg){style="width:85%; text-align:center;"}Generalized MD update with [$\Lambda_{\theta}^{\pm}$]{.orange-text}, [$\Gamma_{\theta}^{\pm}$]{.pink-text} **invertible NNs**::::::::::## L2HMC: Leapfrog Layer {.center width="100%" background-color="#1c1c1c"}:::: {.flex-container}::: {.column style="width: 35%;"}![](https://raw.githubusercontent.com/saforem2/l2hmc-dwq25/main/docs/assets/drawio/update_steps.svg){.absolute top="30" width="40%"}:::::: {.column style="width:65%;"}![](https://raw.githubusercontent.com/saforem2/l2hmc-dwq25/main/docs/assets/drawio/leapfrog_layer_dark2.svg){width="100%"}:::![](https://raw.githubusercontent.com/saforem2/l2hmc-dwq25/main/docs/assets/drawio/network_functions.svg){.absolute top=440 width="100%"}::::## L2HMC Update {style="font-size: 0.775em;" background-color="#1c1c1c"}:::: {.columns}::: {.column width="65%" style="font-size:0.7em;"}::: {.callout-important collapse=false icon=false title="👨💻 Algorithm" style="text-align:left; width: 100%!important;"}1. `input`: [$x$]{.purple-text} - Resample: $\textcolor{#07B875}{v} \sim \mathcal{N}(0, \mathbb{1})$; $\,\,{d\sim\mathcal{U}(\pm)}$ - Construct initial state: $\textcolor{#939393}{\xi} =(\textcolor{#AE81FF}{x}, \textcolor{#07B875}{v}, {\pm})$2. `forward`: Generate [proposal $\xi'$]{style="color:#f8f8f8"} by passing [initial $\xi$]{style="color:#939393"} through $N_{\mathrm{LF}}$ leapfrog layers $$\textcolor{#939393} \xi \hspace{1pt}\xrightarrow[]{\tiny{\mathrm{LF} \text{ layer}}}\xi_{1} \longrightarrow\cdots \longrightarrow \xi_{N_{\mathrm{LF}}} = \textcolor{#f8f8f8}{\xi'} := (\textcolor{#AE81FF}{x''}, \textcolor{#07B875}{v''})$$ - Accept / Reject: $$\begin{equation*} A({\textcolor{#f8f8f8}{\xi'}}|{\textcolor{#939393}{\xi}})= \mathrm{min}\left\{1, \frac{\pi(\textcolor{#f8f8f8}{\xi'})}{\pi(\textcolor{#939393}{\xi})} \left| \mathcal{J}\left(\textcolor{#f8f8f8}{\xi'},\textcolor{#939393}{\xi}\right)\right| \right\} \end{equation*}$$5. `backward` (if training): - Evaluate the **loss function**[^loss] $\mathcal{L}\gets \mathcal{L}_{\theta}(\textcolor{#f8f8f8}{\xi'}, \textcolor{#939393}{\xi})$ and backprop6. `return`: $\textcolor{#AE81FF}{x}_{i+1}$ Evaluate MH criteria $(1)$ and return accepted config, $$\textcolor{#AE81FF}{{x}_{i+1}}\gets \begin{cases} \textcolor{#f8f8f8}{\textcolor{#AE81FF}{x''}} \small{\text{ w/ prob }} A(\textcolor{#f8f8f8}{\xi''}|\textcolor{#939393}{\xi}) \hspace{26pt} ✅ \\ \textcolor{#939393}{\textcolor{#AE81FF}{x}} \hspace{5pt}\small{\text{ w/ prob }} 1 - A(\textcolor{#f8f8f8}{\xi''}|{\textcolor{#939393}{\xi}}) \hspace{10pt} 🚫 \end{cases}$$::::::::: {.column width="35%"}::: {#fig-mdupdate}![](./assets/leapfrog-layer-2D-U1-vertical.light.svg){style="width:75%; text-align:center;"}**Leapfrog Layer** used in generalized MD update::::::::::[^loss]: For simple $\mathbf{x} \in \mathbb{R}^{2}$ example, $\mathcal{L}_{\theta} = A(\xi^{\ast}|\xi)\cdot \left(\mathbf{x}^{\ast} - \mathbf{x}\right)^{2}$# 4D $SU(3)$ Model {#sec-su3 .centeredslide background-color="#1c1c1c"}:::: {.columns}::: {.column width="50%"}::: {.callout-note collapse=false icon=false title="🔗 Link Variables" style="text-align:left; width: 100%!important;"}- Write link variables $U_{\mu}(x) \in SU(3)$: $$ \begin{align*} U_{\mu}(x) &= \mathrm{exp}\left[{i\, \textcolor{#AE81FF}{\omega^{k}_{\mu}(x)} \lambda^{k}}\right]\\ &= e^{i \textcolor{#AE81FF}{Q}},\quad \text{with} \quad \textcolor{#AE81FF}{Q} \in \mathfrak{su}(3) \end{align*}$$ [where [$\omega^{k}_{\mu}(x)$]{.purple-text} $\in \mathbb{R}$, and $\lambda^{k}$ are the generators of $SU(3)$]{style="font-size:0.9em;"}:::::: {.callout-tip collapse=false icon=false title="🏃♂️➡️ Conjugate Momenta" style="text-align:left; width:100%!important;"}- Introduce [$P_{\mu}(x) = P^{k}_{\mu}(x) \lambda^{k}$]{.green-text} conjugate to[$\omega^{k}_{\mu}(x)$]{.purple-text}:::::: {.callout-important collapse=false icon=false title="🟥 Wilson Action" style="text-align:left; width:100%!important;"}$$ S_{G} = -\frac{\beta}{6} \sum\mathrm{Tr}\left[U_{\mu\nu}(x)+ U^{\dagger}_{\mu\nu}(x)\right] $$where $U_{\mu\nu}(x) = U_{\mu}(x) U_{\nu}(x+\hat{\mu})U^{\dagger}_{\mu}(x+\hat{\nu}) U^{\dagger}_{\nu}(x)$::::::::: {.column width="45%"}::: {#fig-4dlattice}![](./assets/u1lattice.dark.svg){width="90%"}Illustration of the lattice::::::::::## HMC: 4D $SU(3)$ {#sec-hmcsu3 background-color="#1c1c1c"}Hamiltonian: $H[P, U] = \frac{1}{2} P^{2} + S[U] \Longrightarrow$:::: {.columns}::: {.column style="font-size:0.9em; text-align: center;"}::: {.callout collapse=false style="text-align:left; width: 100%!important;"}- [$U$ update]{style="border-bottom: 2px solid #AE81FF;"}:[$\frac{d\omega^{k}}{dt} = \frac{\partial H}{\partial P^{k}}$]{.purple-text style="font-size:1.5em;"}$$\frac{d\omega^{k}}{dt}\lambda^{k} = P^{k}\lambda^{k} \Longrightarrow \frac{dQ}{dt} = P$$$$\begin{align*}Q(\textcolor{#FFEE58}{\varepsilon}) &= Q(0) + \textcolor{#FFEE58}{\varepsilon} P(0)\Longrightarrow\\-i\, \log U(\textcolor{#FFEE58}{\varepsilon}) &= -i\, \log U(0) + \textcolor{#FFEE58}{\varepsilon} P(0) \\U(\textcolor{#FFEE58}{\varepsilon}) &= e^{i\,\textcolor{#FFEE58}{\varepsilon} P(0)} U(0)\Longrightarrow \\&\hspace{1pt}\\\textcolor{#FD971F}{\Lambda}:\,\, U \longrightarrow U' &:= e^{i\varepsilon P'} U\end{align*}$$:::::: aside[$\textcolor{#FFEE58}{\varepsilon}$ is the step size]{.dim-text style="font-size:0.8em;"}::::::::: {.column style="font-size:0.9em; text-align: center;"}::: {.callout collapse=false style="text-align:left; width: 100%!important;"}- [$P$ update]{style="border-bottom: 2px solid #07B875;"}:[$\frac{dP^{k}}{dt} = - \frac{\partial H}{\partial \omega^{k}}$]{.green-text style="font-size:1.5em;"} $$\frac{dP^{k}}{dt} = - \frac{\partial H}{\partial \omega^{k}}= -\frac{\partial H}{\partial Q} = -\frac{dS}{dQ}\Longrightarrow$$$$\begin{align*}P(\textcolor{#FFEE58}{\varepsilon}) &= P(0) - \textcolor{#FFEE58}{\varepsilon} \left.\frac{dS}{dQ}\right|_{t=0} \\&= P(0) - \textcolor{#FFEE58}{\varepsilon} \,\textcolor{#E599F7}{F[U]} \\&\hspace{1pt}\\\textcolor{#F06292}{\Gamma}:\,\, P \longrightarrow P' &:= P - \frac{\varepsilon}{2} F[U]\end{align*}$$:::::: aside[$\textcolor{#E599F7}{F[U]}$ is the force term]{.dim-text style="font-size:0.8em;"}::::::::::## HMC: 4D $SU(3)$ {.centeredslide background-color="#1c1c1c"}:::: {.columns}::: {.column width="47%" style="text-align:left;"}- [Momentum Update]{style="border-bottom: 2px solid #F06292;"}: $$\textcolor{#F06292}{\Gamma}: P \longrightarrow P' := P - \frac{\varepsilon}{2} F[U]$$- [Link Update]{style="border-bottom: 2px solid #FD971F;"}: $$\textcolor{#FD971F}{\Lambda}: U \longrightarrow U' := e^{i\varepsilon P'} U\quad\quad$$- We maintain a batch of `Nb` lattices, all updated in parallel - $U$`.dtype = complex128` - $U$`.shape`[`= [Nb, 4, Nt, Nx, Ny, Nz, 3, 3]`]{style="font-size: 0.95em;"}:::::: {.column width="47%" style="text-align:right;"}![](./assets/hmc/hmc-update-light.svg){width=60%}:::::::# Networks 4D $SU(3)$ {#sec-su3networks .centeredslide auto-animate="true" background-color="#1c1c1c"}:::: {.columns}::: {.column width="54%" style="font-size:0.9em;"} <br> <br>[$U$]{.purple-text}-Network:[`UNet: ` $(U, P) \longrightarrow \left(s_{U},\, t_{U},\, q_{U}\right)$]{style="font-size:0.9em;"} <br>::: {style="border: 1px solid #1c1c1c; border-radius: 6px; padding:1%;"}[$P$]{.green-text}-Network:[`PNet: ` $(U, P) \longrightarrow \left(s_{P},\, t_{P},\, q_{P}\right)$]{style="font-size:0.9em;"} ::::::::: {.column width="45%" style="text-align:right;"}![](./assets/leapfrog-layer-4D-SU3-vertical.light.svg){width="80%"}:::::::# Networks 4D $SU(3)$ {.centeredslide auto-animate="true" background-color="#1c1c1c"}:::: {.columns}::: {.column width="54%" style="font-size:0.9em;"} <br> <br>[$U$]{.purple-text}-Network:[`UNet: ` $(U, P) \longrightarrow \left(s_{U},\, t_{U},\, q_{U}\right)$]{style="font-size:0.9em;"} <br>::: {style="border: 1px solid #07B875; border-radius: 6px; padding:1%;"}[$P$]{.green-text}-Network:[`PNet: ` $(U, P) \longrightarrow \left(s_{P},\, t_{P},\, q_{P}\right)$]{style="font-size:0.9em;"} :::[$\uparrow$]{.dim-text} [let's look at this]{.dim-text style="padding-top: 0.5em!important;"}:::::: {.column width="45%" style="text-align:right;"}![](./assets/leapfrog-layer-4D-SU3-vertical.light.svg){width="80%"}:::::::## $P$-`Network` (pt. 1) {style="font-size:0.95em;" background-color="#1c1c1c"}::: {style="text-align:center;"}![](./assets/SU3/PNetwork.light.svg)::::::: {.columns}::: {.column width="50%"}- [`input`[^sigma]: $\hspace{7pt}\left(U, F\right) := (e^{i Q}, F)$]{style="border-bottom: 2px solid rgba(131, 131, 131, 0.493);"} $$\begin{align*} h_{0} &= \sigma\left( w_{Q} Q + w_{F} F + b \right) \\ h_{1} &= \sigma\left( w_{1} h_{0} + b_{1} \right) \\ &\vdots \\ h_{n} &= \sigma\left(w_{n-1} h_{n-2} + b_{n}\right) \\ \textcolor{#FF5252}{z} & := \sigma\left(w_{n} h_{n-1} + b_{n}\right) \longrightarrow \\ \end{align*}$$:::::: {.column width="50%"}- [`output`[^lambda1]: $\hspace{7pt} (s_{P}, t_{P}, q_{P})$]{style="border-bottom: 2px solid rgba(131, 131, 131, 0.5);"} - $s_{P} = \lambda_{s} \tanh(w_s \textcolor{#FF5252}{z} + b_s)$ - $t_{P} = w_{t} \textcolor{#FF5252}{z} + b_{t}$ - $q_{P} = \lambda_{q} \tanh(w_{q} \textcolor{#FF5252}{z} + b_{q})$:::::::[^sigma]: $\sigma(\cdot)$ denotes an activation function[^lambda1]: $\lambda_{s}, \lambda_{q} \in \mathbb{R}$, trainable parameters## $P$-`Network` (pt. 2) {style="font-size:0.9em;" background-color="#1c1c1c"}::: {style="text-align:center;"}![](./assets/SU3/PNetwork.light.svg):::- Use $(s_{P}, t_{P}, q_{P})$ to update $\Gamma^{\pm}: (U, P) \rightarrow\left(U, P_{\pm}\right)$[^inverse]: - [forward]{style="color:#FF5252"} $(d = \textcolor{#FF5252}{+})$: $$\Gamma^{\textcolor{#FF5252}{+}}(U, P) := P_{\textcolor{#FF5252}{+}} = P \cdot e^{\frac{\varepsilon}{2} s_{P}} - \frac{\varepsilon}{2}\left[ F \cdot e^{\varepsilon q_{P}} + t_{P} \right]$$ - [backward]{style="color:#1A8FFF;"} $(d = \textcolor{#1A8FFF}{-})$: $$\Gamma^{\textcolor{#1A8FFF}{-}}(U, P) := P_{\textcolor{#1A8FFF}{-}} = e^{-\frac{\varepsilon}{2} s_{P}} \left\{P + \frac{\varepsilon}{2}\left[ F \cdot e^{\varepsilon q_{P}} + t_{P} \right]\right\}$$[^inverse]: Note that $\left(\Gamma^{+}\right)^{-1} = \Gamma^{-}$, i.e. $\Gamma^{+}\left[\Gamma^{-}(U, P)\right] = \Gamma^{-}\left[\Gamma^{+}(U, P)\right] = (U, P)$# Results: 2D $U(1)$ {#sec-results .centeredslide background-color="#1c1c1c"}:::: {.columns}::: {.column width=50% style="align:top;"}![](https://raw.githubusercontent.com/saforem2/physicsSeminar/main/assets/autocorr_new.svg){width="90%"}:::::: {.column width="33%" style="text-align:left; padding-top: 5%;"}::: {.callout-important icon=false collapse=false title="📈 Improvement" style="text-align:left!important; width: 100%!important;"}We can measure the performance by comparing $\tau_{\mathrm{int}}$ for the[**trained model**]{style="color:#FF2052;"} vs.[**HMC**]{style="color:#9F9F9F;"}. **Note**: [lower]{style="color:#FF2052;"} is better::::::::::![](https://raw.githubusercontent.com/saforem2/physicsSeminar/main/assets/charge_histories.svg){.absolute top=400 left=0 width="100%" style="margin-bottom: 1em;margin-top: 1em;"}## Interpretation {#sec-interpretation .centeredslide background-color="#1c1c1c"}:::: {.columns style="margin-left:1pt;"}::: {.column width="36%"}[Deviation in $x_{P}$]{.dim-text style="text-align:center; font-size:0.8em"}:::::: {.column width="30%"}[Topological charge mixing]{.dim-text style="text-align:right; font-size:0.8em"}:::::: {.column width="32%"}[Artificial influx of energy]{.dim-text style="text-align:right!important; font-size:0.8em;"}:::::::::: {#fig-interpretation}![](https://raw.githubusercontent.com/saforem2/physicsSeminar/main/assets/ridgeplots.svg){width="100%"}Illustration of how different observables evolve over a single L2HMCtrajectory.:::## Interpretation {.centeredslide background-color="#1c1c1c"}::: {#fig-energy-ridgeplot layout-ncol=2 layout-valign="top"}![Average plaquette: $\langle x_{P}\rangle$ vs LF step](https://raw.githubusercontent.com/saforem2/physicsSeminar/main/assets/plaqsf_ridgeplot.svg)![Average energy: $H - \sum\log|\mathcal{J}|$](https://raw.githubusercontent.com/saforem2/physicsSeminar/main/assets/Hf_ridgeplot.svg)The trained model artifically increases the energy towardsthe middle of the trajectory, allowing the sampler to tunnel between isolatedsectors.:::# 4D $SU(3)$ Results {#sec-su3results background-color="#1c1c1c"}- Distribution of $\log|\mathcal{J}|$ over all chains, at each _leapfrog step_, $N_{\mathrm{LF}}$($= 0, 1, \ldots, 8$)during training:::: {layout="[ [30, 30, 30] ]" layout-valign="center" style="display: flex; flex-direction: row; margin-top: -0.0em; align-items: center;"}![`100` train iters](./assets/SU3/logdet_ridgeplot1.svg){#fig-ridgeplot1}![`500` train iters](./assets/SU3/logdet_ridgeplot2.svg){#fig-ridgeplot2}![`1000` train iters](./assets/SU3/logdet_ridgeplot3.svg){#fig-ridgeplot3}:::## 4D $SU(3)$ Results: $\delta U_{\mu\nu}$ {background-color="#1c1c1c"}::: {#fig-pdiff}![](./assets/SU3/pdiff.svg)The difference in the average plaquette $\left|\delta U_{\mu\nu}\right|^{2}$between the trained model and HMC:::## 4D $SU(3)$ Results: $\delta U_{\mu\nu}$ {background-color="#1c1c1c"}::: {#fig-pdiff-robust}![](./assets/SU3/pdiff-robust.svg)The difference in the average plaquette $\left|\delta U_{\mu\nu}\right|^{2}$between the trained model and HMC:::# Next Steps {#sec-next-steps background-color="#1c1c1c"}- Further code development - {{< fa brands github >}} [`saforem2/l2hmc-qcd`](https://github.com/saforem2/l2hmc-qcd)- Continue to use / test different network architectures - Gauge equivariant NNs for $U_{\mu}(x)$ update- Continue to test different loss functions for training- Scaling: - Lattice volume - Network size - Batch size - \# of GPUs## Thank you! {#sec-thank-you background-color="#1c1c1c"} <br>::: {layout-ncol=4 style="text-align:left; font-size:0.8em;"}[[{{< fa solid home >}} `samforeman.me`](https://samforeman.me)]{style="font-size:0.8em;"}[[{{< fa brands github >}} `saforem2`](https://github.com/saforem2)]{style="font-size:0.8em;"}[[{{< fa brands twitter >}} `@saforem2`](https://www.twitter.com/saforem2)]{style="font-size:0.8em;"}[[{{< fa regular paper-plane >}} `foremans@anl.gov`](mailto:///foremans@anl.gov)]{style="font-size:0.8em;"}:::::: {.callout-note icon=false collapse=false title="🙏 Acknowledgements" style="width: 100%!important;"}This research used resources of the Argonne Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC02-06CH11357.:::## {#sec-l2hmc-gh background-color="#1c1c1c"}::: {style="text-align:center;"}[![](https://raw.githubusercontent.com/saforem2/l2hmc-qcd/main/assets/logo-small.svg)](https://github.com/saforem2/l2hmc-qcd)<a href="https://hits.seeyoufarm.com"><img alt="hits" src="https://hits.seeyoufarm.com/api/count/incr/badge.svg?url=https%3A%2F%2Fgithub.com%2Fsaforem2%2Fl2hmc-qcd&count_bg=%2300CCFF&title_bg=%23555555&icon=&icon_color=%23111111&title=👋&edge_flat=false"></a><a href="https://github.com/saforem2/l2hmc-qcd/"><img alt="l2hmc-qcd" src="https://img.shields.io/badge/-l2hmc--qcd-252525?style=flat&logo=github&labelColor=gray"></a><a href="https://www.codefactor.io/repository/github/saforem2/l2hmc-qcd"><img alt="codefactor" src="https://www.codefactor.io/repository/github/saforem2/l2hmc-qcd/badge"></a><a href="https://arxiv.org/abs/2112.01582"><img alt="arxiv" src="http://img.shields.io/badge/arXiv-2112.01582-B31B1B.svg"></a><a href="https://arxiv.org/abs/2105.03418"><img alt="arxiv" src="http://img.shields.io/badge/arXiv-2105.03418-B31B1B.svg"></a><a href="https://hydra.cc"><img alt="hydra" src="https://img.shields.io/badge/Config-Hydra-89b8cd"></a><a href="https://pytorch.org/get-started/locally/"><img alt="pyTorch" src="https://img.shields.io/badge/PyTorch-ee4c2c?logo=pytorch&logoColor=white"></a><a href="https://www.tensorflow.org"><img alt="tensorflow" src="https://img.shields.io/badge/TensorFlow-%23FF6F00.svg?&logo=TensorFlow&logoColor=white"></a>[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Weights & Biases monitoring" height=20>](https://wandb.ai/l2hmc-qcd/l2hmc-qcd):::## Acknowledgements {#sec-acknowledgements background-color="#1c1c1c"}:::: {.columns}::: {.column width="50%"}- **Links**: - [{{< fa brands github >}} Link to github](https://github.com/saforem2/l2hmc-qcd) - [{{< fa solid paper-plane >}} reach out!](mailto:foremans@anl.gov)- **References**: - [Link to slides](https://saforem2.github.io/lattice23/) - [{{< fa brands github >}} link to github with slides](https://github.com/saforem2/lattice23) - {{< fa solid book >}} [@Foreman:2021ljl; @Foreman:2021rhs; @Foreman:2021ixr] - {{< fa solid book >}} [@Boyda:2022nmh; @Shanahan:2022ifi]:::::: {.column width="50%"}- Huge thank you to: - Yannick Meurice - Norman Christ - Akio Tomiya - Nobuyuki Matsumoto - Richard Brower - Luchang Jin - Chulwoo Jung - Peter Boyle - Taku Izubuchi - Denis Boyda - Dan Hackett - ECP-CSD group - [**ALCF Staff + Datascience Group**]{.red-text}:::::::## {#sec-references background-color="#1c1c1c"}:::: {.columns}::: {.column width="50%"}### Links {background-color="#1c1c1c"}- [{{< fa brands github >}} `saforem2/l2hmc-qcd`](https://github.com/saforem2/l2hmc-qcd)- [📊 slides](https://saforem2.github.io/lattice23) (Github: [{{< fa brands github >}} `saforem2/lattice23`](https://github.com/saforem2/lattice23)):::::: {.column width="50%"}### References {background-color="#1c1c1c"}- [Title Slide Background (worms) animation](https://saforem2.github.io/grid-worms-animation/) - Github: [{{< fa brands github >}} `saforem2/grid-worms-animation`](https://github.com/saforem2/grid-worms-animation)- [Link to HMC demo](https://chi-feng.github.io/mcmc-demo/app.html):::::::## References {style="line-height:1.2em;" background-color="#1c1c1c"}(I don't know why this is broken 🤷🏻♂️ )::: {#refs}:::# Extras {#sec-extras background-color="#1c1c1c"}## Integrated Autocorrelation Time {.centeredslide background-color="#1c1c1c"}::: {#fig-iat}![](https://raw.githubusercontent.com/saforem2/physicsSeminar/main/assets/tint1.svg){width="100%"}Plot of the integrated autocorrelation time for both the trained model(colored) and HMC (greyscale).:::## Comparison {background-color="#1c1c1c"}::: {#fig-comparison layout-ncol=2}![Trained model](https://saforem2.github.io/anl-job-talk/assets/dQint_eval.svg){#fig-eval}![Generic HMC](https://saforem2.github.io/anl-job-talk/assets/dQint_hmc.svg){#fig-hmc}Comparison of $\langle \delta Q\rangle = \frac{1}{N}\sum_{i=k}^{N} \delta Q_{i}$ for thetrained model [@fig-eval] vs. HMC [@fig-hmc]:::## Plaquette analysis: $x_{P}$ {.centeredslide background-color="#1c1c1c"}:::: {.columns}::: {.column width="55%"}[Deviation from $V\rightarrow\infty$ limit, $x_{P}^{\ast}$]{.dim-text style="text-align:center; font-size:0.9em;"}:::::: {.column width="45%"}[Average $\langle x_{P}\rangle$, with $x_{P}^{\ast}$ (dotted-lines)]{.dim-text style="text-align:right!important; font-size:0.9em;"}:::::::::: {#fig-avg-plaq}![](https://raw.githubusercontent.com/saforem2/physicsSeminar/main/assets/plaqsf_vs_lf_step1.svg){width="100%"}Plot showing how **average plaquette**, $\left\langle x_{P}\right\rangle$varies over a single trajectory for models trained at different $\beta$, withvarying trajectory lengths $N_{\mathrm{LF}}$:::## Loss Function {background-color="#1c1c1c"}- Want to maximize the _expected_ squared charge difference[^charge-diff]: $$\begin{equation*} \mathcal{L}_{\theta}\left(\xi^{\ast}, \xi\right) = {\mathbb{E}_{p(\xi)}}\big[-\textcolor{#FA5252}{{\delta Q}}^{2} \left(\xi^{\ast}, \xi \right)\cdot A(\xi^{\ast}|\xi)\big] \end{equation*}$$- Where: - $\delta Q$ is the _tunneling rate_: $$\begin{equation*} \textcolor{#FA5252}{\delta Q}(\xi^{\ast},\xi)=\left|Q^{\ast} - Q\right| \end{equation*}$$ - $A(\xi^{\ast}|\xi)$ is the probability[^jacobian] of accepting the proposal $\xi^{\ast}$: $$\begin{equation*} A(\xi^{\ast}|\xi) = \mathrm{min}\left( 1, \frac{p(\xi^{\ast})}{p(\xi)}\left|\frac{\partial \xi^{\ast}}{\partial \xi^{T}}\right|\right) \end{equation*}$$[^charge-diff]: Where $\xi^{\ast}$ is the _proposed_ configuration (prior toAccept / Reject)[^jacobian]: And $\left|\frac{\partial \xi^{\ast}}{\partial \xi^{T}}\right|$ is theJacobian of the transformation from $\xi \rightarrow \xi^{\ast}$## Networks 2D $U(1)$ {auto-animate=true background-color="#1c1c1c"}- Stack gauge links as `shape`$\left(U_{\mu}\right)$` =[Nb, 2, Nt, Nx]` $\in \mathbb{C}$ $$ x_{\mu}(n) ≔ \left[\cos(x), \sin(x)\right]$$ with `shape`$\left(x_{\mu}\right)$` = [Nb, 2, Nt, Nx, 2]` $\in \mathbb{R}$- $x$-Network: - [$\psi_{\theta}: (x, v) \longrightarrow \left(s_{x},\, t_{x},\, q_{x}\right)$]{.purple-text}- $v$-Network: - [$\varphi_{\theta}: (x, v) \longrightarrow \left(s_{v},\, t_{v},\, q_{v}\right)$]{.green-text} [$\hspace{2pt}\longleftarrow$ lets look at this]{.dim-text}## $v$-Update[^reverse] {background-color="#1c1c1c"}- [forward]{style="color:#FF5252"} $(d = \textcolor{#FF5252}{+})$: $$\Gamma^{\textcolor{#FF5252}{+}}: (x, v) \rightarrow v' := v \cdot e^{\frac{\varepsilon}{2} s_{v}} - \frac{\varepsilon}{2}\left[ F \cdot e^{\varepsilon q_{v}} + t_{v} \right]$$- [backward]{style="color:#1A8FFF;"} $(d = \textcolor{#1A8FFF}{-})$:$$\Gamma^{\textcolor{#1A8FFF}{-}}: (x, v) \rightarrow v' := e^{-\frac{\varepsilon}{2} s_{v}} \left\{v + \frac{\varepsilon}{2}\left[ F \cdot e^{\varepsilon q_{v}} + t_{v} \right]\right\}$$[^reverse]: [Note that $\left(\Gamma^{+}\right)^{-1} = \Gamma^{-}$, i.e. $\Gamma^{+}\left[\Gamma^{-}(x, v)\right] = \Gamma^{-}\left[\Gamma^{+}(x, v)\right] = (x, v)$]{style="font-size:0.8em;"}## $x$-Update {background-color="#1c1c1c"}- [forward]{style="color:#FF5252"} $(d = \textcolor{#FF5252}{+})$:$$\Lambda^{\textcolor{#FF5252}{+}}(x, v) = x \cdot e^{\frac{\varepsilon}{2} s_{x}} - \frac{\varepsilon}{2}\left[ v \cdot e^{\varepsilon q_{x}} + t_{x} \right]$$- [backward]{style="color:#1A8FFF;"} $(d = \textcolor{#1A8FFF}{-})$:$$\Lambda^{\textcolor{#1A8FFF}{-}}(x, v) = e^{-\frac{\varepsilon}{2} s_{x}} \left\{x + \frac{\varepsilon}{2}\left[ v \cdot e^{\varepsilon q_{x}} + t_{x} \right]\right\}$$## Lattice Gauge Theory (2D $U(1)$) {.centeredslide background-color="#1c1c1c"}:::: {.columns layout-valign="top"}::: {.column width="50%"}::: {style="text-align:center;"}::: {.callout-note icon=false collapse=false title="🔗 Link Variables" style="width:100%!important; text-align:left;"}$$U_{\mu}(n) = e^{i x_{\mu}(n)}\in \mathbb{C},\quad \text{where}\quad$$$$x_{\mu}(n) \in [-\pi,\pi)$$:::::: {}::: {.callout-important icon=false collapse=false title="🫸 Wilson Action" style="width:100%!important; text-align:left;"}$$S_{\beta}(x) = \beta\sum_{P} \cos \textcolor{#00CCFF}{x_{P}},$$$$\textcolor{#00CCFF}{x_{P}} = \left[x_{\mu}(n) + x_{\nu}(n+\hat{\mu})- x_{\mu}(n+\hat{\nu})-x_{\nu}(n)\right]$$:::[**Note**: $\textcolor{#00CCFF}{x_{P}}$ is the product oflinks around $1\times 1$ square, called a ["plaquette"]{.blue-text}]{.dim-text style=font-size:0.8em;}:::::::::::: {.column width="50%"}![2D Lattice](https://raw.githubusercontent.com/saforem2/deep-fridays/main/assets/u1lattice.dark.svg){width="80%"}:::::::## {background-color="white"}::: {#fig-notebook}<iframe data-src="https://nbviewer.org/github/saforem2/l2hmc-qcd/blob/SU3/src/l2hmc/notebooks/l2hmc-2dU1.ipynb" width="100%" height="650" title="l2hmc-qcd"></iframe>Jupyter Notebook:::## Annealing Schedule {#sec-annealing-schedule background-color="#1c1c1c"}- Introduce an _annealing schedule_ during the training phase: $$\left\{ \gamma_{t} \right\}_{t=0}^{N} = \left\{\gamma_{0}, \gamma_{1}, \ldots, \gamma_{N-1}, \gamma_{N} \right\}$$ where $\gamma_{0} < \gamma_{1} < \cdots < \gamma_{N} \equiv 1$, and $\left|\gamma_{t+1} - \gamma_{t}\right| \ll 1$ - [**Note**]{.green-text}: - for $\left|\gamma_{t}\right| < 1$, this rescaling helps to reduce the height of the energy barriers $\Longrightarrow$ - easier for our sampler to explore previously inaccessible regions of the phase space## Networks 2D $U(1)$ {#sec-networks-2dU1 background-color="#1c1c1c"}- Stack gauge links as `shape`$\left(U_{\mu}\right)$` =[Nb, 2, Nt, Nx]` $\in \mathbb{C}$ $$ x_{\mu}(n) ≔ \left[\cos(x), \sin(x)\right]$$ with `shape`$\left(x_{\mu}\right)$` = [Nb, 2, Nt, Nx, 2]` $\in \mathbb{R}$- $x$-Network: - [$\psi_{\theta}: (x, v) \longrightarrow \left(s_{x},\, t_{x},\, q_{x}\right)$]{.purple-text}- $v$-Network: - [$\varphi_{\theta}: (x, v) \longrightarrow \left(s_{v},\, t_{v},\, q_{v}\right)$]{.green-text}## Toy Example: GMM $\in \mathbb{R}^{2}$ {.centeredslide background-color="#1c1c1c"}![](https://raw.githubusercontent.com/saforem2/l2hmc-dwq25/main/docs/assets/iso_gmm_chains.svg){#fig-gmm .r-stretch}## Physical Quantities {background-color="#1c1c1c"}- To estimate physical quantities, we: - Calculate physical observables at **increasing** spatial resolution - Perform extrapolation to continuum limit::: {#fig-continuum}![](https://raw.githubusercontent.com/saforem2/physicsSeminar/main/assets/static/continuum.svg)Increasing the physical resolution ($a \rightarrow 0$) allows us to makepredictions about numerical values of physical quantities in the continuumlimit.:::## Extra {#sec-extra background-color="#1c1c1c"}[![](./assets/thumbnail.png)]{.preview-image style="text-align:center; margin-left:auto; margin-right: auto;"}