---
title: "MLMC: Machine Learning Monte Carlo"
location: Lattice 2023 (Fermilab)
location-url: "https://indico.fnal.gov/event/57249/contributions/271305/"
# location: "[Lattice 2023](https://indico.fnal.gov/event/57249/contributions/271305/)"
image: "https://github.com/saforem2/lattice23/blob/main/assets/thumbnail.png?raw=true"
date: 2023-07-31
categories:
- ai4science
- MCMC
- Lattice QCD
# date-modified: last-modified
title-block-categories: false
number-sections: false
bibliography: references.bib
appendix-cite-as: display
favicon: "./assets/favicon.svg"
callout-style: simple
twitter-card:
image: "https://github.com/saforem2/lattice23/blob/main/assets/thumbnail.png?raw=true"
site: "saforem2"
creator: "saforem2"
citation:
author: Sam Foreman
type: speech
genre: "Presentation at the 2023 International Symposium on Lattice Field Theory"
container-title: https://indico.fnal.gov/event/57249/contributions/271305/
title: "MLMC: Machine Learning Monte Carlo for Lattice Gauge Theory"
url: https://saforem2.github.io/lattice23
abstract: |
We present a trainable framework for efficiently generating gauge
configurations, and discuss ongoing work in this direction. In particular, we
consider the problem of sampling configurations from a 4D 𝑆𝑈(3) lattice gauge
theory, and consider a generalized leapfrog integrator in the molecular
dynamics update that can be trained to improve sampling efficiency.
format:
html:
# reference-location: section
# toc-location: right
page-layout: full
# grid:
# body-width: 800px
revealjs:
slides-url: https://samforeman.me/talks/lattice23/slides.html
# template-partials:
# - ./title-slide.html
# - ./title-fancy/title-slide.html
# - ./title_slide_template.html
# - ../../title-slide.html
title-slide-attributes:
data-background-iframe: https://saforem2.github.io/grid-worms-animation/
data-background-size: contain
data-background-color: "#1c1c1c"
background-color: "#1c1c1c"
title-block-style: none
slide-number: c
title-slide-style: default
chalkboard:
buttons: false
auto-animate: true
reference-location: section
touch: true
pause: false
footnotes-hover: true
citations-hover: true
preview-links: true
controls-tutorial: true
controls: false
logo: "https://raw.githubusercontent.com/saforem2/anl-job-talk/main/docs/assets/anl.svg"
history: false
# theme: [css/dark.scss]
callout-style: simple
# css: [css/default.css, css/callouts.css]
fig-align: center
css:
# - css/default.css
- ../../css/custom.css
theme:
# - light:
- white
- ../../css/title-slide-template.scss
- ../../css/reveal/reveal.scss
- ../../css/common.scss
- ../../css/dark.scss
- ../../css/syntax-dark.scss
- ../../css/callout-cards.scss
self-contained: false
embed-resources: false
self-contained-math: false
center: true
highlight-style: "atom-one"
default-image-extension: svg
code-line-numbers: true
data-background-color: "#1c1c1c"
background-color: "#1c1c1c"
code-overflow: scroll
html-math-method: katex
output-file: "slides.html"
mermaid:
theme: neutral
# gfm:
# output-file: "lattice23.md"
---
# {.title-slide background-color="#1c1c1c" background-iframe="https://saforem2.github.io/grid-worms-animation/" loading="lazy"}
::: {style="background-color: rgba(22,22,22,0.75); border-radius: 10px; text-align:center; padding: 0px; padding-left: 1.5em; padding-right: 1.5em; max-width: min-content; min-width: max-content; margin-left: auto; margin-right: auto; padding-top: 0.2em; padding-bottom: 0.2em; line-height: 1.5em!important;"}
<span style="color:#939393; font-size:1.5em; font-weight: bold;">MLMC: Machine Learning Monte Carlo</span>
<span style="color:#777777; font-size:1.2em; font-weight: bold;">for Lattice Gauge Theory</span>
[<br>]{style="padding-bottom: 0.5rem;"}
[ {{< fa solid home >}} ](https://samforeman.me) Sam Foreman
[Xiao-Yong Jin, James C. Osborn]{.dim-text style="font-size:0.8em;"}
[[[ {{< fa brands github >}} `saforem2/` ](https://github.com/saforem2/) ]{style="border-bottom: 0.5px solid #00ccff;"}`{` [[ `lattice23` ](https://github.com/saforem2/lattice23) ]{style="border-bottom: 0.5px solid #00ccff;"}, [[ `l2hmc-qcd` ](https://github.com/saforem2/l2hmc-qcd) ]{style="border-bottom: 0.5px solid #00ccff;"}`}` ]{style="font-size:0.8em;"}
:::
::: footer
[2023-07-31 @ [Lattice 2023](https://indico.fnal.gov/event/57249/contributions/271305/)]{.dim-text style="text-align:left;"}
:::
# Overview {background-color="#1c1c1c"}
1. [Background: `{MCMC,HMC}`](#markov-chain-monte-carlo-mcmc)
   - [Leapfrog Integrator](#sec-leapfrog)
   - [Issues with HMC](#sec-issues)
   - [Can we do better?](#sec-can-we-do-better)
2. [ L2HMC: Generalizing MD ](#sec-l2hmc)
- [ 4D $SU(3)$ Model ](#sec-su3)
- [ Results ](#sec-results)
3. [ References ](#sec-references)
4. [ Extras ](#sec-extras)
# Markov Chain Monte Carlo (MCMC) {.centeredslide background-color="#1c1c1c"}
:::: {.columns}
::: {.column width="50%"}
::: {.callout-note collapse=false icon=false title="🎯 Goal" style="text-align: left!important; width: 100%!important;"}
Generate **independent** samples $\{x_{i}\}$, such that[^notation1]
$$\{x_{i}\} \sim p(x) \propto e^{-S(x)}$$
where $S(x)$ is the _action_ (or potential energy)
:::
- Want to calculate observables $\mathcal{O}$:
$\left\langle \mathcal{O}\right\rangle \propto \int \left[ \mathcal{D}x\right ] \hspace{4pt} {\mathcal{O}(x)\, p(x)}$
:::
::: {.column width="49%"}

:::
::::
If these were <span style="color:#00CCFF;">independent</span>, we could approximate:
$\left\langle\mathcal{O}\right\rangle \simeq \frac{1}{N}\sum^{N}_{n=1}\mathcal{O}(x_{n})$
$$\sigma_{\mathcal{O}}^{2} = \frac{1}{N}\mathrm{Var}{\left[\mathcal{O} (x)
\right]}\Longrightarrow \sigma_{\mathcal{O}} \propto \frac{1}{\sqrt{N}}$$
[^notation1]: Here, $\sim$ means "is distributed according to"
::: footer
[ {{< fa brands github >}} `saforem2/lattice23` ](https://saforem2.github.io/lattice23)
:::
# Markov Chain Monte Carlo (MCMC) {.centeredslide background-color="#1c1c1c"}
:::: {.columns}
::: {.column width="50%"}
::: {.callout-note collapse=false icon=false title="🎯 Goal" style="text-align: left!important; width: 100%!important;"}
Generate **independent** samples $\{x_{i}\}$, such that[^notation2]
$$\{x_{i}\} \sim p(x) \propto e^{-S(x)}$$
where $S(x)$ is the _action_ (or potential energy)
:::
- Want to calculate observables $\mathcal{O}$:
$\left\langle \mathcal{O}\right\rangle \propto \int \left[ \mathcal{D}x\right ] \hspace{4pt} {\mathcal{O}(x)\, p(x)}$
:::
::: {.column width="49%"}

:::
::::
Instead, nearby configs are [correlated]{.red-text}, and we incur a factor of
$\textcolor{#FF5252}{\tau^{\mathcal{O}}_{\mathrm{int}}}$:
$$\sigma_{\mathcal{O}}^{2} =
\frac{\textcolor{#FF5252}{\tau^{\mathcal{O}}_{\mathrm{int}}}}{N}\mathrm{Var}{\left[\mathcal{O}
(x) \right]}$$
[^notation2]: Here, $\sim$ means "is distributed according to"
::: footer
[ {{< fa brands github >}} `saforem2/lattice23` ](https://github.com/saforem2/lattice23)
:::
# Hamiltonian Monte Carlo (HMC) {.center background-color="#1c1c1c"}
- Want to (sequentially) construct a chain of states:
$$x_{0} \rightarrow x_{1} \rightarrow x_{i} \rightarrow \cdots \rightarrow x_{N}\hspace{10pt}$$
such that, as $N \rightarrow \infty$:
$$\left\{ x_{i}, x_{i+1}, x_{i+2}, \cdots, x_{N} \right\} \xrightarrow[]{N\rightarrow\infty} p(x)
\propto e^{-S(x)}$$
::: {.callout-tip icon=false collapse=false title="🪄 Trick" style="display:inline!important; width: 100%!important;"}
- Introduce [fictitious]{.green-text} momentum $v \sim \mathcal{N}(0, \mathbb{1})$
- Normally distributed, **independent** of $x$, i.e.
$$\begin{align*}
p(x, v) &\textcolor{#02b875}{=} p(x)\,p(v) \propto e^{-S{(x)}} e^{-\frac{1}{2} v^{T}v}
= e^{-\left[ S(x) + \frac{1}{2} v^{T}{v}\right ] }
\textcolor{#02b875}{=} e^{-H(x, v)}
\end{align*}$$
:::
## Hamiltonian Monte Carlo (HMC) {.centeredslide background-color="#1c1c1c"}
:::: {.columns}
::: {.column width="55%"}
- [**Idea**]{.green-text}: Evolve the $(\dot{x}, \dot{v})$ system to get new states $\{x_{i}\}$❗
- Write the **joint distribution** $p(x, v)$:
$$
p(x, v) \propto e^{-S[ x ] } e^{-\frac{1}{2}v^{T} v} = e^{-H(x, v)}
$$
:::
::: {.column width="45%"}
::: {.callout-tip collapse=false icon=false title="🔋 Hamiltonian Dynamics" style="width:100%!important;"}
$H = S[ x ] + \frac{1}{2} v^{T} v \Longrightarrow$
$$\dot{x} = +\partial_{v} H,
\,\,\dot{v} = -\partial_{x} H$$
:::
:::
::::
::: {#fig-hmc-traj}
 {.r-stretch}
Overview of HMC algorithm
:::
## Leapfrog Integrator (HMC) {#sec-leapfrog background-color="#1c1c1c"}
:::: {.columns style="font-size: 0.9em;" height="100%"}
::: {.column width="40%"}
::: {.callout-tip collapse=false icon=false title="🔋 Hamiltonian Dynamics" style="width:100%!important;"}
$\left(\dot{x}, \dot{v}\right) = \left(\partial_{v} H, -\partial_{x} H\right)$
:::
::: {.callout-note collapse=false icon=false title="🐸 Leapfrog Step" style="width:100%!important;"}
`input` $\,\left(x, v\right) \rightarrow \left(x', v'\right)\,$ `output`
$$\begin{align*}
\tilde{v} &:= \textcolor{#F06292}{\Gamma}(x, v)\hspace{2.2pt} = v - \frac{\varepsilon}{2} \partial_{x} S(x) \\
x' &:= \textcolor{#FD971F}{\Lambda}(x, \tilde{v}) \, = x + \varepsilon \, \tilde{v} \\
v' &:= \textcolor{#F06292}{\Gamma}(x', \tilde{v}) = \tilde{v} - \frac{\varepsilon}{2} \partial_{x} S(x')
\end{align*}$$
:::
::: {.callout-warning collapse=false icon=false title="⚠️ Warning!" style="width:100%!important;"}
- Resample $v_{0} \sim \mathcal{N}(0, \mathbb{1})$
at the [beginning]{.yellow-text} of each trajectory
:::
::: {style="font-size:0.8em; margin-left:13%;"}
[**Note**: $\partial_{x} S(x)$ is the _force_]{.dim-text}
:::
:::
::: {.column width="55%" style="text-align:left;"}
 {width=60%}
:::
::::
## HMC Update {background-color="#1c1c1c"}
:::: {.columns}
::: {.column width="65%"}
- We build a trajectory of $N_{\mathrm{LF}}$ **leapfrog steps**[^v0]
$$\begin{equation*}
(x_{0}, v_{0})%
\rightarrow (x_{1}, v_{1})\rightarrow \cdots%
\rightarrow (x', v')
\end{equation*}$$
- And propose $x'$ as the next state in our chain
$$\begin{align*}
\textcolor{#F06292}{\Gamma}: (x, v) \textcolor{#F06292}{\rightarrow} v' &:= v - \frac{\varepsilon}{2} \partial_{x} S(x) \\
\textcolor{#FD971F}{\Lambda}: (x, v) \textcolor{#FD971F}{\rightarrow} x' &:= x + \varepsilon v
\end{align*}$$
- We then accept / reject $x'$ using the Metropolis-Hastings criterion,
$A(x'|x) = \min\left\{ 1, \frac{p(x')}{p(x)}\left|\frac{\partial x'}{\partial x}\right|\right\} $
:::
::: {.column width="30%"}

:::
::::
[^v0]: We **always** start by resampling the momentum, $v_{0} \sim
\mathcal{N}(0, \mathbb{1})$
## HMC Demo {.centeredslide background-color="#1c1c1c"}
::: {#fig-hmc-demo}
<iframe data-src="https://chi-feng.github.io/mcmc-demo/app.html" width="90%" height="500" title="Interactive MCMC demo"></iframe>
HMC Demo
:::
# Issues with HMC {#sec-issues style="font-size:0.9em;" background-color="#1c1c1c"}
- What do we want in a good sampler?
- **Fast mixing** (small autocorrelations)
- **Fast burn-in** (quick convergence)
- Problems with HMC:
- Energy levels selected randomly $\rightarrow$ **slow mixing**
- Cannot easily traverse low-density zones $\rightarrow$ **slow convergence**
::: {#fig-hmc-issues layout-ncol=2}


HMC Samples generated with varying step sizes $\varepsilon$
:::
# Topological Freezing {.center background-color="#1c1c1c"}
:::: {.flex-container style="text-align: center; width: 100%"}
::: {.col1 width="45%" style="text-align: left; font-size: 0.9em;"}
**Topological Charge**:
$$Q = \frac{1}{2\pi}\sum_{P}\left\lfloor x_{P}\right\rfloor \in \mathbb{Z}$$
[**note:** $\left\lfloor x_{P} \right\rfloor = x_{P} - 2\pi
\left\lfloor\frac{x_{P} + \pi}{2\pi}\right\rfloor$]{.dim-text style="font-size:0.8em;"}
::: {.callout-important collapse=false icon=false title="⏳ Critical Slowing Down" style="text-align:left; width:100%!important;"}
- $Q$ gets stuck!
- as $\beta\longrightarrow \infty$:
- $Q \longrightarrow \text{const.}$
- $\delta Q = \left(Q^{\ast} - Q\right) \rightarrow 0 \textcolor{#FF5252}{\Longrightarrow}$
- \# configs required to estimate errors
**grows exponentially**:
[$\tau_{\mathrm{int}}^{Q} \longrightarrow \infty$]{.red-text}
:::
:::
::: {.flex-container width="45%"}
{width="80%"}
:::
::::
# Can we do better? {#sec-can-we-do-better background-color="#1c1c1c"}
:::: {.columns}
::: {.column width="50%"}
- Introduce two **invertible NNs**, `vNet` and `xNet`[^l2hmc]:
  - [`vNet:` $(x, F) \longrightarrow \left(s_{v},\, t_{v},\, q_{v}\right)$]{style="font-size:0.9em;"}
  - [`xNet:` $(x, v) \longrightarrow \left(s_{x},\, t_{x},\, q_{x}\right)$]{style="font-size:0.9em;"}
- Use these $(s, t, q)$ in the _generalized_ MD update:
  - [$\Gamma_{\theta}^{\pm}$]{.pink-text}$: ({x}, \textcolor{#07B875}{v}) \xrightarrow[]{\textcolor{#F06292}{s_{v}, t_{v}, q_{v}}} (x, \textcolor{#07B875}{v'})$
  - [$\Lambda_{\theta}^{\pm}$]{.orange-text}$: (\textcolor{#AE81FF}{x}, v) \xrightarrow[]{\textcolor{#FD971F}{s_{x}, t_{x}, q_{x}}} (\textcolor{#AE81FF}{x'}, v)$
:::
::: {.column width="48%"}
::: {#fig-mdupdate}
 {style="width:85%; text-align:center;"}
Generalized MD update where [ $\Lambda_{\theta}^{\pm}$ ] {.orange-text},
[ $\Gamma_{\theta}^{\pm}$ ] {.pink-text} are **invertible NNs**
:::
:::
::::
[^l2hmc]: [ L2HMC: ](https://github.com/saforem2/l2hmc-qcd) {{< fa solid book >}}
[ @Foreman:2021ixr; @Foreman:2021rhs ]
# L2HMC: Generalizing the MD Update {#sec-l2hmc .smaller .centeredslide background-color="#1c1c1c"}
:::: {.flex-container style="text-align: left; width: 100%"}
::: {.col1 style="width:45%; font-size: 0.8em;"}
::: {style="border:1px solid red;"}
- Introduce $d \sim \mathcal{U}(\pm)$ to determine the direction of our update
1. [$\textcolor{#07B875}{v'} =$ [$\Gamma^{\pm}$]{.pink-text}$({x}, \textcolor{#07B875}{v})$]{} [$\hspace{46pt}$ update $v$]{.dim-text style="font-size:0.9em;"}
2. [$\textcolor{#AE81FF}{x'} =$ [$x_{B}$]{.blue-text}$\,+\,$[$\Lambda^{\pm}$]{.orange-text}$($[$x_{A}$]{.red-text}$, {v'})$]{} [$\hspace{10pt}$ update first **half**: $x_{A}$]{.dim-text style="font-size:0.9em;"}
3. [$\textcolor{#AE81FF}{x''} =$ [$x'_{A}$]{.red-text}$\,+\,$[$\Lambda^{\pm}$]{.orange-text}$($[$x'_{B}$]{.blue-text}$, {v'})$]{} [$\hspace{8pt}$ update other half: $x_{B}$]{.dim-text style="font-size:0.9em;"}
4. [$\textcolor{#07B875}{v''} =$ [$\Gamma^{\pm}$]{.pink-text}$({x''}, \textcolor{#07B875}{v'})$]{} [$\hspace{36pt}$ update $v$]{.dim-text style="font-size:0.9em;"}
:::
::: {style="border:1px solid red;"}
- Resample both $v \sim \mathcal{N}(0, \mathbb{1})$ and the directional variable
$d \sim \mathcal{U}(\pm)$ at the beginning of each trajectory
- To ensure ergodicity + reversibility, we split the [$x$]{.purple-text}
update into two sequential (complementary) updates
- Note that $\left(\Gamma^{+}\right)^{-1} = \Gamma^{-}$, i.e.
$$\Gamma^{+}\left[ \Gamma^{-}(x, v)\right ] = \Gamma^{-}\left[\Gamma^{+}(x,
v)\right] = (x, v)$$
:::
:::
::: {.col2 style="width:55%;"}
::: {#fig-mdupdate}
 {style="width:85%; text-align:center;"}
Generalized MD update with [ $\Lambda_{\theta}^{\pm}$ ] {.orange-text}, [ $\Gamma_{\theta}^{\pm}$ ] {.pink-text} **invertible NNs**
:::
:::
::::
## L2HMC: Leapfrog Layer {.center width="100%" background-color="#1c1c1c"}
:::: {.flex-container}
::: {.column style="width: 35%;"}
 {.absolute top="30" width="40%"}
:::
::: {.column style="width:65%;"}
 {width="100%"}
:::
 {.absolute top=440 width="100%"}
::::
## L2HMC Update {style="font-size: 0.775em;" background-color="#1c1c1c"}
:::: {.columns}
::: {.column width="65%" style="font-size:0.7em;"}
::: {.callout-important collapse=false icon=false title="👨‍💻 Algorithm" style="text-align:left; width: 100%!important;"}
1. `input`: [$x$]{.purple-text}
   - Resample: $\textcolor{#07B875}{v} \sim \mathcal{N}(0, \mathbb{1})$; $\,\,{d\sim\mathcal{U}(\pm)}$
   - Construct initial state:
     $\textcolor{#939393}{\xi} =(\textcolor{#AE81FF}{x}, \textcolor{#07B875}{v}, {\pm})$
2. `forward`: Generate [proposal $\xi'$]{style="color:#f8f8f8"} by passing [initial $\xi$]{style="color:#939393"} through $N_{\mathrm{LF}}$ leapfrog layers
   $$\textcolor{#939393} \xi \hspace{1pt}\xrightarrow[]{\tiny{\mathrm{LF} \text{ layer}}}\xi_{1} \longrightarrow\cdots \longrightarrow \xi_{N_{\mathrm{LF}}} = \textcolor{#f8f8f8}{\xi'} := (\textcolor{#AE81FF}{x''}, \textcolor{#07B875}{v''})$$
   - Accept / Reject:
   $$\begin{equation*}
   A({\textcolor{#f8f8f8}{\xi'}}|{\textcolor{#939393}{\xi}})=
   \mathrm{min}\left\{ 1,
   \frac{\pi(\textcolor{#f8f8f8}{\xi'})}{\pi(\textcolor{#939393}{\xi})} \left| \mathcal{J}\left(\textcolor{#f8f8f8}{\xi'},\textcolor{#939393}{\xi}\right)\right| \right\}
   \end{equation*}$$
3. `backward` (if training):
   - Evaluate the **loss function**[^loss] $\mathcal{L}\gets \mathcal{L}_{\theta}(\textcolor{#f8f8f8}{\xi'}, \textcolor{#939393}{\xi})$ and backprop
4. `return`: $\textcolor{#AE81FF}{x}_{i+1}$, i.e. evaluate the MH criterion above and return the accepted config,
   $$\textcolor{#AE81FF}{{x}_{i+1}}\gets
   \begin{cases}
   \textcolor{#AE81FF}{x''} \,\small{\text{ w/ prob }} A(\textcolor{#f8f8f8}{\xi'}|\textcolor{#939393}{\xi}) \hspace{26pt} ✅ \\
   \textcolor{#AE81FF}{x} \hspace{5pt}\small{\text{ w/ prob }} 1 - A(\textcolor{#f8f8f8}{\xi'}|{\textcolor{#939393}{\xi}}) \hspace{10pt} 🚫
   \end{cases}$$
:::
:::
::: {.column width="35%"}
::: {#fig-mdupdate}
 {style="width:75%; text-align:center;"}
**Leapfrog Layer** used in generalized MD update
:::
:::
::::
[^loss]: For a simple $\mathbf{x} \in \mathbb{R}^{2}$ example, $\mathcal{L}_{\theta} =
A(\xi^{\ast}|\xi)\cdot \left(\mathbf{x}^{\ast} - \mathbf{x}\right)^{2}$
# 4D $SU(3)$ Model {#sec-su3 .centeredslide background-color="#1c1c1c"}
:::: {.columns}
::: {.column width="50%"}
::: {.callout-note collapse=false icon=false title="🔗 Link Variables" style="text-align:left; width: 100%!important;"}
- Write link variables $U_{\mu}(x) \in SU(3)$:
$$ \begin{align*}
U_{\mu}(x) &= \mathrm{exp}\left[{i\, \textcolor{#AE81FF}{\omega^{k}_{\mu}(x)} \lambda^{k}}\right]\\
&= e^{i \textcolor{#AE81FF}{Q}},\quad \text{with} \quad \textcolor{#AE81FF}{Q} \in \mathfrak{su}(3)
\end{align*}$$
[where [$\omega^{k}_{\mu}(x)$]{.purple-text} $\in \mathbb{R}$, and $\lambda^{k}$ are
the generators of $SU(3)$]{style="font-size:0.9em;"}
:::
::: {.callout-tip collapse=false icon=false title="🏃‍♂️➡️ Conjugate Momenta" style="text-align:left; width:100%!important;"}
- Introduce [$P_{\mu}(x) = P^{k}_{\mu}(x) \lambda^{k}$]{.green-text} conjugate to
[$\omega^{k}_{\mu}(x)$]{.purple-text}
:::
::: {.callout-important collapse=false icon=false title="🟥 Wilson Action" style="text-align:left; width:100%!important;"}
$$ S_{G} = -\frac{\beta}{6} \sum_{x,\, \mu<\nu}
\mathrm{Tr}\left[U_{\mu\nu}(x)
+ U^{\dagger}_{\mu\nu}(x)\right] $$
where $U_{\mu\nu}(x) = U_{\mu}(x) U_{\nu}(x+\hat{\mu})
U^{\dagger}_{\mu}(x+\hat{\nu}) U^{\dagger}_{\nu}(x)$
:::
:::
::: {.column width="45%"}
::: {#fig-4dlattice}
 {width="90%"}
Illustration of the lattice
:::
:::
::::
## HMC: 4D $SU(3)$ {#sec-hmcsu3 background-color="#1c1c1c"}
Hamiltonian: $H[ P, U ] = \frac{1}{2} P^{2} + S[ U ] \Longrightarrow$
:::: {.columns}
::: {.column style="font-size:0.9em; text-align: center;"}
::: {.callout collapse=false style="text-align:left; width: 100%!important;"}
- [$U$ update]{style="border-bottom: 2px solid #AE81FF;"}:
[$\frac{d\omega^{k}}{dt} = \frac{\partial H}{\partial P^{k}}$]{.purple-text style="font-size:1.5em;"}
$$\frac{d\omega^{k}}{dt}\lambda^{k} = P^{k}\lambda^{k} \Longrightarrow \frac{dQ}{dt} = P$$
$$\begin{align*}
Q(\textcolor{#FFEE58}{\varepsilon}) &= Q(0) + \textcolor{#FFEE58}{\varepsilon} P(0)\Longrightarrow\\
-i\, \log U(\textcolor{#FFEE58}{\varepsilon}) &= -i\, \log U(0) + \textcolor{#FFEE58}{\varepsilon} P(0) \\
U(\textcolor{#FFEE58}{\varepsilon}) &= e^{i\,\textcolor{#FFEE58}{\varepsilon} P(0)} U(0)\Longrightarrow \\
&\hspace{1pt}\\
\textcolor{#FD971F}{\Lambda}:\,\, U \longrightarrow U' &:= e^{i\varepsilon P'} U
\end{align*}$$
:::
::: aside
[$\textcolor{#FFEE58}{\varepsilon}$ is the step size]{.dim-text style="font-size:0.8em;"}
:::
:::
::: {.column style="font-size:0.9em; text-align: center;"}
::: {.callout collapse=false style="text-align:left; width: 100%!important;"}
- [$P$ update]{style="border-bottom: 2px solid #07B875;"}:
[$\frac{dP^{k}}{dt} = - \frac{\partial H}{\partial \omega^{k}}$]{.green-text style="font-size:1.5em;"}
$$\frac{dP^{k}}{dt} = - \frac{\partial H}{\partial \omega^{k}}
= -\frac{\partial H}{\partial Q} = -\frac{dS}{dQ}\Longrightarrow$$
$$\begin{align*}
P(\textcolor{#FFEE58}{\varepsilon}) &= P(0) - \textcolor{#FFEE58}{\varepsilon} \left.\frac{dS}{dQ}\right|_{t=0} \\
&= P(0) - \textcolor{#FFEE58}{\varepsilon} \,\textcolor{#E599F7}{F[ U ] } \\
&\hspace{1pt}\\
\textcolor{#F06292}{\Gamma}:\,\, P \longrightarrow P' &:= P - \frac{\varepsilon}{2} F[ U ]
\end{align*}$$
:::
::: aside
[$\textcolor{#E599F7}{F[U]}$ is the force term]{.dim-text style="font-size:0.8em;"}
:::
:::
::::
## HMC: 4D $SU(3)$ {.centeredslide background-color="#1c1c1c"}
:::: {.columns}
::: {.column width="47%" style="text-align:left;"}
- [Momentum Update]{style="border-bottom: 2px solid #F06292;"}:
$$\textcolor{#F06292}{\Gamma}: P \longrightarrow P' := P - \frac{\varepsilon}{2} F[ U ] $$
- [Link Update]{style="border-bottom: 2px solid #FD971F;"}:
$$\textcolor{#FD971F}{\Lambda}: U \longrightarrow U' := e^{i\varepsilon P'} U\quad\quad$$
- We maintain a batch of `Nb` lattices, all updated in parallel
  - $U$`.dtype = complex128`
  - $U$`.shape`
    [`= [Nb, 4, Nt, Nx, Ny, Nz, 3, 3]`]{style="font-size: 0.95em;"}
:::
::: {.column width="47%" style="text-align:right;"}
 {width=60%}
:::
::::
# Networks 4D $SU(3)$ {#sec-su3networks .centeredslide auto-animate="true" background-color="#1c1c1c"}
:::: {.columns}
::: {.column width="54%" style="font-size:0.9em;"}
<br>
<br>
[$U$]{.purple-text}-Network:
[`UNet:` $(U, P) \longrightarrow \left(s_{U},\, t_{U},\, q_{U}\right)$]{style="font-size:0.9em;"}
<br>
::: {style="border: 1px solid #1c1c1c; border-radius: 6px; padding:1%;"}
[$P$]{.green-text}-Network:
[`PNet:` $(U, P) \longrightarrow \left(s_{P},\, t_{P},\, q_{P}\right)$]{style="font-size:0.9em;"}
:::
:::
::: {.column width="45%" style="text-align:right;"}
 {width="80%"}
:::
::::
# Networks 4D $SU(3)$ {.centeredslide auto-animate="true" background-color="#1c1c1c"}
:::: {.columns}
::: {.column width="54%" style="font-size:0.9em;"}
<br>
<br>
[$U$]{.purple-text}-Network:
[`UNet:` $(U, P) \longrightarrow \left(s_{U},\, t_{U},\, q_{U}\right)$]{style="font-size:0.9em;"}
<br>
::: {style="border: 1px solid #07B875; border-radius: 6px; padding:1%;"}
[$P$]{.green-text}-Network:
[`PNet:` $(U, P) \longrightarrow \left(s_{P},\, t_{P},\, q_{P}\right)$]{style="font-size:0.9em;"}
:::
[$\uparrow$]{.dim-text}
[let's look at this]{.dim-text style="padding-top: 0.5em!important;"}
:::
::: {.column width="45%" style="text-align:right;"}
 {width="80%"}
:::
::::
## $P$-`Network` (pt. 1) {style="font-size:0.95em;" background-color="#1c1c1c"}
::: {style="text-align:center;"}

:::
:::: {.columns}
::: {.column width="50%"}
- [`input`[^sigma]: $\hspace{7pt}\left(U, F\right) := (e^{i Q}, F)$]{style="border-bottom: 2px solid rgba(131, 131, 131, 0.493);"}
$$\begin{align*}
h_{0} &= \sigma\left( w_{Q} Q + w_{F} F + b \right) \\
h_{1} &= \sigma\left( w_{1} h_{0} + b_{1} \right) \\
&\vdots \\
h_{n} &= \sigma\left(w_{n} h_{n-1} + b_{n}\right) \\
\textcolor{#FF5252}{z} & := \sigma\left(w_{n+1} h_{n} + b_{n+1}\right) \longrightarrow \\
\end{align*}$$
:::
::: {.column width="50%"}
- [`output`[^lambda1]: $\hspace{7pt} (s_{P}, t_{P}, q_{P})$]{style="border-bottom: 2px solid rgba(131, 131, 131, 0.5);"}
- $s_{P} = \lambda_{s} \tanh(w_s \textcolor{#FF5252}{z} + b_s)$
- $t_{P} = w_{t} \textcolor{#FF5252}{z} + b_{t}$
- $q_{P} = \lambda_{q} \tanh(w_{q} \textcolor{#FF5252}{z} + b_{q})$
:::
::::
[^sigma]: $\sigma(\cdot)$ denotes an activation function
[^lambda1]: $\lambda_{s}, \lambda_{q} \in \mathbb{R}$, trainable parameters
## $P$-`Network` (pt. 2) {style="font-size:0.9em;" background-color="#1c1c1c"}
::: {style="text-align:center;"}

:::
- Use $(s_{P}, t_{P}, q_{P})$ to update $\Gamma^{\pm}: (U, P) \rightarrow
\left(U, P_{\pm}\right)$[^inverse]:
  - [forward]{style="color:#FF5252"} $(d = \textcolor{#FF5252}{+})$:
    $$\Gamma^{\textcolor{#FF5252}{+}}(U, P) := P_{\textcolor{#FF5252}{+}} = P \cdot e^{\frac{\varepsilon}{2} s_{P}} - \frac{\varepsilon}{2}\left[ F \cdot e^{\varepsilon q_{P}} + t_{P} \right]$$
  - [backward]{style="color:#1A8FFF;"} $(d = \textcolor{#1A8FFF}{-})$:
    $$\Gamma^{\textcolor{#1A8FFF}{-}}(U, P) := P_{\textcolor{#1A8FFF}{-}} = e^{-\frac{\varepsilon}{2} s_{P}} \left\{P + \frac{\varepsilon}{2}\left[ F \cdot e^{\varepsilon q_{P}} + t_{P} \right]\right\} $$
[^inverse]: Note that $\left(\Gamma^{+}\right)^{-1} = \Gamma^{-}$, i.e. $\Gamma^{+}\left[ \Gamma^{-}(U, P)\right ] = \Gamma^{-}\left[ \Gamma^{+}(U, P)\right ] = (U, P)$
# Results: 2D $U(1)$ {#sec-results .centeredslide background-color="#1c1c1c"}
:::: {.columns}
::: {.column width=50% style="align:top;"}
 {width="90%"}
:::
::: {.column width="33%" style="text-align:left; padding-top: 5%;"}
::: {.callout-important icon=false collapse=false title="📈 Improvement" style="text-align:left!important; width: 100%!important;"}
We can measure the performance by comparing $\tau_{\mathrm{int}}$ for the
[**trained model**]{style="color:#FF2052;"} vs.
[**HMC**]{style="color:#9F9F9F;"}.
**Note**: [lower]{style="color:#FF2052;"} is better
:::
:::
::::
 {.absolute top=400 left=0 width="100%" style="margin-bottom: 1em;margin-top: 1em;"}
## Interpretation {#sec-interpretation .centeredslide background-color="#1c1c1c"}
:::: {.columns style="margin-left:1pt;"}
::: {.column width="36%"}
[Deviation in $x_{P}$]{.dim-text style="text-align:center; font-size:0.8em"}
:::
::: {.column width="30%"}
[Topological charge mixing]{.dim-text style="text-align:right; font-size:0.8em"}
:::
::: {.column width="32%"}
[Artificial influx of energy]{.dim-text style="text-align:right!important; font-size:0.8em;"}
:::
::::
::: {#fig-interpretation}
 {width="100%"}
Illustration of how different observables evolve over a single L2HMC
trajectory.
:::
## Interpretation {.centeredslide background-color="#1c1c1c"}
::: {#fig-energy-ridgeplot layout-ncol=2 layout-valign="top"}


The trained model artificially increases the energy towards
the middle of the trajectory, allowing the sampler to tunnel between isolated
sectors.
:::
# 4D $SU(3)$ Results {#sec-su3results background-color="#1c1c1c"}
- Distribution of $\log|\mathcal{J}|$ over all chains, at each _leapfrog step_
($N_{\mathrm{LF}} = 0, 1, \ldots, 8$) during training:
::: {layout="[ [ 30, 30, 30 ] ]" layout-valign="center" style="display: flex; flex-direction: row; margin-top: -0.0em; align-items: center;"}
 {#fig-ridgeplot1}
 {#fig-ridgeplot2}
 {#fig-ridgeplot3}
:::
## 4D $SU(3)$ Results: $\delta U_{\mu\nu}$ {background-color="#1c1c1c"}
::: {#fig-pdiff}

The difference in the average plaquette $\left|\delta U_{\mu\nu}\right|^{2}$
between the trained model and HMC
:::
## 4D $SU(3)$ Results: $\delta U_{\mu\nu}$ {background-color="#1c1c1c"}
::: {#fig-pdiff-robust}

The difference in the average plaquette $\left|\delta U_{\mu\nu}\right|^{2}$
between the trained model and HMC
:::
# Next Steps {#sec-next-steps background-color="#1c1c1c"}
- Further code development
- {{< fa brands github >}} [ `saforem2/l2hmc-qcd` ](https://github.com/saforem2/l2hmc-qcd)
- Continue to use / test different network architectures
- Gauge equivariant NNs for $U_{\mu}(x)$ update
- Continue to test different loss functions for training
- Scaling:
- Lattice volume
- Network size
- Batch size
- \# of GPUs
## Thank you! {#sec-thank-you background-color="#1c1c1c"}
<br>
::: {layout-ncol=4 style="text-align:left; font-size:0.8em;"}
[[ {{< fa solid home >}} `samforeman.me` ](https://samforeman.me) ]{style="font-size:0.8em;"}
[[ {{< fa brands github >}} `saforem2` ](https://github.com/saforem2) ]{style="font-size:0.8em;"}
[[ {{< fa brands twitter >}} `@saforem2` ](https://www.twitter.com/saforem2) ]{style="font-size:0.8em;"}
[[ {{< fa regular paper-plane >}} `foremans@anl.gov` ](mailto:foremans@anl.gov) ]{style="font-size:0.8em;"}
:::
::: {.callout-note icon=false collapse=false title="🙏 Acknowledgements" style="width: 100%!important;"}
This research used resources of the Argonne Leadership Computing Facility,
which is a DOE Office of Science User Facility supported under Contract DE-AC02-06CH11357.
:::
## {#sec-l2hmc-gh background-color="#1c1c1c"}
::: {style="text-align:center;"}
[  ](https://github.com/saforem2/l2hmc-qcd)
<a href="https://hits.seeyoufarm.com"><img alt="hits" src="https://hits.seeyoufarm.com/api/count/incr/badge.svg?url=https%3A%2F%2Fgithub.com%2Fsaforem2%2Fl2hmc-qcd&count_bg=%2300CCFF&title_bg=%23555555&icon=&icon_color=%23111111&title=👋&edge_flat=false"></a>
<a href="https://github.com/saforem2/l2hmc-qcd/"><img alt="l2hmc-qcd" src="https://img.shields.io/badge/-l2hmc--qcd-252525?style=flat&logo=github&labelColor=gray"></a> <a href="https://www.codefactor.io/repository/github/saforem2/l2hmc-qcd"><img alt="codefactor" src="https://www.codefactor.io/repository/github/saforem2/l2hmc-qcd/badge"></a>
<a href="https://arxiv.org/abs/2112.01582"><img alt="arxiv" src="http://img.shields.io/badge/arXiv-2112.01582-B31B1B.svg"></a> <a href="https://arxiv.org/abs/2105.03418"><img alt="arxiv" src="http://img.shields.io/badge/arXiv-2105.03418-B31B1B.svg"></a>
<a href="https://hydra.cc"><img alt="hydra" src="https://img.shields.io/badge/Config-Hydra-89b8cd"></a> <a href="https://pytorch.org/get-started/locally/"><img alt="pyTorch" src="https://img.shields.io/badge/PyTorch-ee4c2c?logo=pytorch&logoColor=white"></a> <a href="https://www.tensorflow.org"><img alt="tensorflow" src="https://img.shields.io/badge/TensorFlow-%23FF6F00.svg?&logo=TensorFlow&logoColor=white"></a>
[ <img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Weights & Biases monitoring" height=20> ](https://wandb.ai/l2hmc-qcd/l2hmc-qcd)
:::
## Acknowledgements {#sec-acknowledgements background-color="#1c1c1c"}
:::: {.columns}
::: {.column width="50%"}
- **Links**:
- [ {{< fa brands github >}} Link to github ](https://github.com/saforem2/l2hmc-qcd)
- [ {{< fa solid paper-plane >}} reach out! ](mailto:foremans@anl.gov)
- **References**:
- [ Link to slides ](https://saforem2.github.io/lattice23/)
- [ {{< fa brands github >}} link to github with slides ](https://github.com/saforem2/lattice23)
- {{< fa solid book >}} [ @Foreman:2021ljl; @Foreman:2021rhs; @Foreman:2021ixr ]
- {{< fa solid book >}} [ @Boyda:2022nmh; @Shanahan:2022ifi ]
:::
::: {.column width="50%"}
- Huge thank you to:
- Yannick Meurice
- Norman Christ
- Akio Tomiya
- Nobuyuki Matsumoto
- Richard Brower
- Luchang Jin
- Chulwoo Jung
- Peter Boyle
- Taku Izubuchi
- Denis Boyda
- Dan Hackett
- ECP-CSD group
- [**ALCF Staff + Datascience Group**]{.red-text}
:::
::::
## {#sec-references background-color="#1c1c1c"}
:::: {.columns}
::: {.column width="50%"}
### Links {background-color="#1c1c1c"}
- [ {{< fa brands github >}} `saforem2/l2hmc-qcd` ](https://github.com/saforem2/l2hmc-qcd)
- [ 📊 slides ](https://saforem2.github.io/lattice23) (Github: [ {{< fa brands github >}} `saforem2/lattice23` ](https://github.com/saforem2/lattice23) )
:::
::: {.column width="50%"}
### References {background-color="#1c1c1c"}
- [ Title Slide Background (worms) animation ](https://saforem2.github.io/grid-worms-animation/)
- Github: [ {{< fa brands github >}} `saforem2/grid-worms-animation` ](https://github.com/saforem2/grid-worms-animation)
- [ Link to HMC demo ](https://chi-feng.github.io/mcmc-demo/app.html)
:::
::::
## References {style="line-height:1.2em;" background-color="#1c1c1c"}
(I don't know why this is broken 🤷🏻‍♂️)
::: {#refs}
:::
# Extras {#sec-extras background-color="#1c1c1c"}
## Integrated Autocorrelation Time {.centeredslide background-color="#1c1c1c"}
::: {#fig-iat}
 {width="100%"}
Plot of the integrated autocorrelation time for both the trained model
(colored) and HMC (greyscale).
:::
## Comparison {background-color="#1c1c1c"}
::: {#fig-comparison layout-ncol=2}
 {#fig-eval}
 {#fig-hmc}
Comparison of $\langle \delta Q\rangle = \frac{1}{N}\sum_{i=k}^{N} \delta Q_{i}$ for the
trained model [ @fig-eval ] vs. HMC [ @fig-hmc ]
:::
## Plaquette analysis: $x_{P}$ {.centeredslide background-color="#1c1c1c"}
:::: {.columns}
::: {.column width="55%"}
[Deviation from $V\rightarrow\infty$ limit, $x_{P}^{\ast}$]{.dim-text style="text-align:center; font-size:0.9em;"}
:::
::: {.column width="45%"}
[Average $\langle x_{P}\rangle$, with $x_{P}^{\ast}$ (dotted-lines)]{.dim-text style="text-align:right!important; font-size:0.9em;"}
:::
::::
::: {#fig-avg-plaq}
 {width="100%"}
Plot showing how **average plaquette**, $\left\langle x_{P}\right\rangle$
varies over a single trajectory for models trained at different $\beta$, with
varying trajectory lengths $N_{\mathrm{LF}}$
:::
## Loss Function {background-color="#1c1c1c"}
- Want to maximize the _expected_ squared charge difference[^charge-diff]:
$$\begin{equation*}
\mathcal{L}_{\theta}\left(\xi^{\ast}, \xi\right) =
{\mathbb{E}_{p(\xi)}}\big[-\textcolor{#FA5252}{{\delta Q}}^{2}
\left(\xi^{\ast}, \xi \right)\cdot A(\xi^{\ast}|\xi)\big]
\end{equation*}$$
- Where:
- $\delta Q$ is the _tunneling rate_:
$$\begin{equation*}
\textcolor{#FA5252}{\delta Q}(\xi^{\ast},\xi)=\left|Q^{\ast} - Q\right|
\end{equation*}$$
- $A(\xi^{\ast}|\xi)$ is the probability[^jacobian] of accepting the proposal $\xi^{\ast}$:
$$\begin{equation*}
A(\xi^{\ast}|\xi) = \mathrm{min}\left( 1,
\frac{p(\xi^{\ast})}{p(\xi)}\left|\frac{\partial \xi^{\ast}}{\partial
\xi^{T}}\right|\right)
\end{equation*}$$
[^charge-diff]: Where $\xi^{\ast}$ is the _proposed_ configuration (prior to
Accept / Reject)
[^jacobian]: And $\left|\frac{\partial \xi^{\ast}}{\partial \xi^{T}}\right|$ is the
Jacobian of the transformation from $\xi \rightarrow \xi^{\ast}$
## Networks 2D $U(1)$ {auto-animate=true background-color="#1c1c1c"}
- Stack gauge links as `shape`$\left(U_{\mu}\right)$ `= [Nb, 2, Nt, Nx]` $\in \mathbb{C}$
$$ x_{\mu}(n) := \left[ \cos(x), \sin(x)\right ] $$
with `shape`$\left(x_{\mu}\right)$ `= [Nb, 2, Nt, Nx, 2]` $\in \mathbb{R}$
- $x$-Network:
  - [$\psi_{\theta}: (x, v) \longrightarrow \left(s_{x},\, t_{x},\, q_{x}\right)$]{.purple-text}
- $v$-Network:
  - [$\varphi_{\theta}: (x, v) \longrightarrow \left(s_{v},\, t_{v},\, q_{v}\right)$]{.green-text} [$\hspace{2pt}\longleftarrow$ let's look at this]{.dim-text}
## $v$-Update[^reverse] {background-color="#1c1c1c"}
- [forward]{style="color:#FF5252"} $(d = \textcolor{#FF5252}{+})$:
$$\Gamma^{\textcolor{#FF5252}{+}}: (x, v) \rightarrow v' := v \cdot e^{\frac{\varepsilon}{2} s_{v}} - \frac{\varepsilon}{2}\left[ F \cdot e^{\varepsilon q_{v}} + t_{v} \right ] $$
- [backward]{style="color:#1A8FFF;"} $(d = \textcolor{#1A8FFF}{-})$:
$$\Gamma^{\textcolor{#1A8FFF}{-}}: (x, v) \rightarrow v' := e^{-\frac{\varepsilon}{2} s_{v}} \left\{ v + \frac{\varepsilon}{2}\left[ F \cdot e^{\varepsilon q_{v}} + t_{v} \right ] \right\} $$
[^reverse]: [Note that $\left(\Gamma^{+}\right)^{-1} = \Gamma^{-}$, i.e. $\Gamma^{+}\left[\Gamma^{-}(x, v)\right] = \Gamma^{-}\left[\Gamma^{+}(x, v)\right] = (x, v)$]{style="font-size:0.8em;"}
## $x$-Update {background-color="#1c1c1c"}
- [forward]{style="color:#FF5252"} $(d = \textcolor{#FF5252}{+})$:
$$\Lambda^{\textcolor{#FF5252}{+}}(x, v) = x \cdot e^{\frac{\varepsilon}{2} s_{x}} - \frac{\varepsilon}{2}\left[ v \cdot e^{\varepsilon q_{x}} + t_{x} \right ] $$
- [backward]{style="color:#1A8FFF;"} $(d = \textcolor{#1A8FFF}{-})$:
$$\Lambda^{\textcolor{#1A8FFF}{-}}(x, v) = e^{-\frac{\varepsilon}{2} s_{x}} \left\{ x + \frac{\varepsilon}{2}\left[ v \cdot e^{\varepsilon q_{x}} + t_{x} \right ] \right\} $$
## Lattice Gauge Theory (2D $U(1)$) {.centeredslide background-color="#1c1c1c"}
:::: {.columns layout-valign="top"}
::: {.column width="50%"}
::: {style="text-align:center;"}
::: {.callout-note icon=false collapse=false title="🔗 Link Variables" style="width:100%!important; text-align:left;"}
$$U_{\mu}(n) = e^{i x_{\mu}(n)}\in \mathbb{C},\quad \text{where}\quad$$
$$x_{\mu}(n) \in [-\pi,\pi)$$
:::
::: {}
::: {.callout-important icon=false collapse=false title="🫸 Wilson Action" style="width:100%!important; text-align:left;"}
$$S_{\beta}(x) = \beta\sum_{P} \left(1 - \cos \textcolor{#00CCFF}{x_{P}}\right),$$
$$\textcolor{#00CCFF}{x_{P}} = \left[x_{\mu}(n) + x_{\nu}(n+\hat{\mu})
- x_{\mu}(n+\hat{\nu})-x_{\nu}(n)\right]$$
:::
[**Note**: $\textcolor{#00CCFF}{x_{P}}$ is the phase of the product of
links around a $1\times 1$ square, called a ["plaquette"]{.blue-text}]{.dim-text style="font-size:0.8em;"}
:::
:::
:::
::: {.column width="50%"}
 {width="80%"}
:::
::::
## {background-color="white"}
::: {#fig-notebook}
<iframe data-src="https://nbviewer.org/github/saforem2/l2hmc-qcd/blob/SU3/src/l2hmc/notebooks/l2hmc-2dU1.ipynb" width="100%" height="650" title="l2hmc-qcd"></iframe>
Jupyter Notebook
:::
## Annealing Schedule {#sec-annealing-schedule background-color="#1c1c1c"}
- Introduce an _annealing schedule_ during the training phase:
$$\left\{ \gamma_{t} \right\}_{t=0}^{N} = \left\{\gamma_{0}, \gamma_{1},
\ldots, \gamma_{N-1}, \gamma_{N} \right\} $$
where $\gamma_{0} < \gamma_{1} < \cdots < \gamma_{N} \equiv 1$, and $\left|\gamma_{t+1} - \gamma_{t}\right| \ll 1$
- [**Note**]{.green-text}:
- for $\left|\gamma_{t}\right| < 1$, this rescaling helps to reduce the
height of the energy barriers $\Longrightarrow$
- easier for our sampler to explore previously inaccessible regions of the phase space
## Networks 2D $U(1)$ {#sec-networks-2dU1 background-color="#1c1c1c"}
- Stack gauge links as `shape`$\left(U_{\mu}\right)$ `= [Nb, 2, Nt, Nx]` $\in \mathbb{C}$
$$ x_{\mu}(n) := \left[ \cos(x), \sin(x)\right ] $$
with `shape`$\left(x_{\mu}\right)$ `= [Nb, 2, Nt, Nx, 2]` $\in \mathbb{R}$
- $x$-Network:
  - [$\psi_{\theta}: (x, v) \longrightarrow \left(s_{x},\, t_{x},\, q_{x}\right)$]{.purple-text}
- $v$-Network:
  - [$\varphi_{\theta}: (x, v) \longrightarrow \left(s_{v},\, t_{v},\, q_{v}\right)$]{.green-text}
## Toy Example: GMM $\in \mathbb{R}^{2}$ {.centeredslide background-color="#1c1c1c"}
 {#fig-gmm .r-stretch}
## Physical Quantities {background-color="#1c1c1c"}
- To estimate physical quantities, we:
- Calculate physical observables at **increasing** spatial resolution
- Perform extrapolation to continuum limit
::: {#fig-continuum}

Increasing the physical resolution ($a \rightarrow 0$) allows us to make
predictions about numerical values of physical quantities in the continuum
limit.
:::
## Extra {#sec-extra background-color="#1c1c1c"}
[]{.preview-image style="text-align:center; margin-left:auto; margin-right: auto;"}