I’m a Computational Scientist in the AI / ML group at the Argonne Leadership Computing Facility (ALCF).
I’m generally interested in the large-scale distributed training of AI models for scientific applications, and I co-lead the Models / Pre-Training group for the AuroraGPT project.
Prior to this, I received my PhD in Physics from the University of Iowa in 2019, where I used ML to build better Markov Chain Monte Carlo sampling techniques for Lattice Quantum Chromodynamics (l2hmc-qcd).
Convert any page on this site from HTML to its slideshow version by appending /slides to the end of its URL, as in the sketch below.
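For instance, a minimal sketch of this convention in Python (the page path used here is a hypothetical example):

```python
# Build the slideshow URL for a page on this site.
# NOTE: the talk path below is hypothetical, for illustration only.
page = "https://samforeman.me/talks/example-talk"
slides = page.rstrip("/") + "/slides"  # append /slides to get the slideshow view
print(slides)  # -> https://samforeman.me/talks/example-talk/slides
```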
📝 Publications
- 🌎 AERIS: Argonne Earth Systems Model for Reliable and Skillful Predictions (Hatanpää et al. (2025))
- Aurora: Architecting Argonne’s First Exascale Supercomputer for Accelerated Scientific Discovery (Allen et al. (2025))
- HiPerRAG: High-Performance Retrieval Augmented Generation for Scientific Insights (Gokdemir et al. (2025))
- Automated Tuning for HMC Mass Ratios (Torsiello et al. (2025))
- MOFA: Discovering Materials for Carbon Capture with a GenAI and Simulation-Based Workflow (Yan et al. (2025))
- 🧪 MProt-DPO: Breaking the ExaFLOPS Barrier for Multimodal Protein Design with DPO (Dharuman et al. (2024))
- Intro to HPC Bootcamp: Engaging New Communities Through Energy Justice Projects (Leung et al. (2024))
- Thorough Characterization and Analysis of Large Transformer Model Training At-Scale (Cheng et al. (2024))
- MLMC: Machine Learning Monte Carlo for Lattice Gauge Theory (Sam Foreman, Jin, and Osborn (2023))
- Protein Generation via Genome-scale Language Models with Bio-physical Scoring (Dharuman et al. (2023))
- DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery (Song et al. (2023))
- Comprehensive Performance Study of LLMs on Novel AI Accelerators (Emani et al. (2023))
- Exploratory Analysis of Climate Data with ClimRR, Intro to HPC Bootcamp @ NERSC (Sam Foreman (2023))
- 🧬 GenSLMs: Genome-scale language models reveal SARS-CoV-2 evolutionary dynamics (Zvyagin et al. (2023))
- Lattice QCD and Particle Physics (Kronfeld et al. (2022))
- Applications of ML to Lattice QFT (Boyda et al. (2022))
- LeapFrogLayers: Trainable Framework for Effective Sampling (Sam Foreman et al. (2021))
- HMC with Normalizing Flows [slides] (Sam Foreman et al. (2021))
- Deep Learning Hamiltonian Monte Carlo [+ poster] (Sam Foreman, Jin, and C. (2021))
- Machine Learning and Neural Networks for Field Theory (Sam Foreman, Jin, and Osborn (2020))
- Examples of renormalization group transformations for image sets (Samuel Foreman et al. (2018))
- RG inspired Machine Learning for lattice field theory (Sam Foreman et al. (2018))
- Large Energy Density in Three-Plate Nanocapacitors due to Coulomb Blockade (Hubler et al. (2018))
- Superconductivity of In and Sn Samples (Deamont and Foreman (2014))
saforem2’s GitHub Repositories
- ezpz: Train across all your devices, ezpz 🍋
- github-stats: GitHub Stats
- awesome-stars: A curated list of my GitHub stars!
- saforem2: Profile README
- personal_site: My personal website
- sam.onl
- pbs-tui: TUI for the PBS Pro scheduler
- parallel-training-slides: Modern parallelism techniques for training LLMs
- alcf-mlops: Example repo demonstrating MLOps use cases on ALCF systems
- amsc (Python)
- kitty-config: Configuration for Kitty
- ambivalent: Minimal, beautiful (+ highly-customizable) styles for Matplotlib
- diagrams: Repo of various `draw.io` diagrams
- saforem2.github.io: Personal website (using Quarto)
- ezpz-ai: PyPI alias for https://github.com/saforem2/ezpz
- dotfiles-old: dotfiles
- chunkwm: Tiling window manager for macOS based on a plugin architecture
- lattice_gauge_theory: Monte Carlo simulation of Z(N) models in lattice gauge theory
- lattice23: Slides for Lattice 2023
- mmm: Multi-Modal Modeling
- mccl: Collective communications using mpi4py
- m: monorepo
- l2hmc-qcd: Application of the L2HMC algorithm to simulations in lattice QCD
- intro-hpc-bootcamp-2025: Intro to HPC Bootcamp 2025
- wordplay: Playing with words
- hpc-bootcamp-2025
- quarto-codespaces: Quarto codespaces
- worm_algorithm: Worm algorithm implementation for the 2D Ising model
- Notes-Demo: Demo Obsidian vault
- sf: So Fast
- blog-old: New domain, new blog
- orkz: 🎶 `orkz`: your devices, your symphony. Library for large-scale orchestration of accelerators
- orchestron: `pyorch`: PyTorch orchestration. Your devices, your symphony 🎶
- orch: 🎶 Orchestrator: your devices, your symphony
- samforeman.dev: Personal website (development version)
- glam: Neovim colorscheme that pops 💅
- Slides from Statistical Learning Talk @ ATPESC 2022
- aoc24: Advent of Code 2024
- lazy-vim: LazyVim starter config
- large-vision-models: Playing with large vision models and ViTs
- lazy-vim-template
- llm-workshop-talk: Simple tutorial on creating Small(-ish) LLMs (pt. 2 🎉!!)
- llm-lunch-talk: LLMs at ALCF
- yap: Learning to yap
- LLM-tutorial: Simple tutorial on creating Small(-ish) LLMs
🎓 Education
- Ph.D., Physics
  University of Iowa | 2015–2019
- B.S. in Engineering Physics
  University of Illinois at Urbana-Champaign | 2010–2015
- B.S. in Applied Mathematics
  University of Illinois at Urbana-Champaign | 2010–2015
👔 Professional Experience
- Assistant Computational Scientist
  Argonne National Laboratory, Leadership Computing Facility (ALCF)
  Lemont, IL | 2022–Present
  - Research lead on scaling large language models (LLMs) and generative AI for science on supercomputers (Aurora, Frontier, LUMI, Leonardo, …).
  - Co-lead the Models and Pretraining team of the AuroraGPT project.
  - Optimize large-scale training of foundation models and language models for scientific applications.
  - Collaborate with interdisciplinary teams to enhance simulation efficiency and scalability.
  - Focus on AI and HPC for scientific applications, including:
    - Training large language models on supercomputers
    - Genome-scale language models (GenSLMs) for studying SARS-CoV-2 evolutionary dynamics
    - Direct Preference Optimization (DPO) for multimodal protein design workflows
    - Climate modeling and weather forecasting using foundation models
    - Improved sampling algorithms for lattice quantum chromodynamics (QCD)
  - https://www.alcf.anl.gov/about/people/sam-foreman
- Postdoctoral Researcher
  Argonne National Laboratory, Leadership Computing Facility (ALCF)
  Lemont, IL | 2019–2022
  - Applied deep learning to lattice gauge theory and quantum field simulations.
  - Developed ML-enhanced Monte Carlo methods for QCD (l2hmc-qcd).
  - Engaged in AI-for-Science collaborations with national labs and university partners.
- Graduate Researcher (DOE SCGSR Fellowship)
  Argonne National Laboratory, Mathematics and Computer Science Division (MCS)
  Lemont, IL | 2018–2019
  - Developed l2hmc-qcd in collaboration with ALCF for my PhD thesis research.
🏆 Awards and Honors
Member of the DeepSpeed Technical Steering Committee, 2025–Present
- Contributing to the development and direction of the DeepSpeed library for large-scale model training.
Nominated to serve on the US Coordinating Panel for Software and Computing by the Division of Particles and Fields of the American Physical Society (APS).
Finalist, ACM Gordon Bell Prize in Climate Modeling, 2025
- Recognized for our work on 🌎 AERIS (Hatanpää et al. (2025)): the first billion-parameter pixel-level diffusion model for global weather and subseasonal-to-seasonal forecasting. Trained efficiently at scales from 1.3–80B parameters with our sequence-window parallelism (SWiPe) strategy, we achieve a sustained mixed-precision performance of 10.21 ExaFLOPS and a peak performance of 11.21 ExaFLOPS, scaling to 10,080 nodes (120,960 GPUs) on the Aurora supercomputer.
Finalist, ACM Gordon Bell Prize, 2024
- Recognized for the MProt-DPO (Dharuman et al. (2024)) project, which achieved over 4 ExaFLOPS of sustained performance in multimodal protein design workflows using Direct Preference Optimization.
ACM Gordon Bell Special Prize for High Performance Computing-Based COVID-19 Research, 2022
- Recognized for contributions to the GenSLMs (Zvyagin et al. (2023)) project, which developed genome-scale language models to study SARS-CoV-2 evolutionary dynamics.
DOE Office of Science Graduate Student Research (SCGSR) Fellow, 2018
- Awarded by the Department of Energy for outstanding research contributions during graduate studies.
🎪 Events
- Organizer for:
- SC25 Workshop: High Performance Python for Science at Scale (HPPSS), November 2025
- SC25 Tutorial: Accelerating and Scaling Python for HPC
- SC24 Workshop: High Performance Python for Science at Scale (HPPSS), November 2024
- SC23 Workshop: High Performance Python for Science at Scale (HPPSS), November 2023
- Machine Learning and Quantum Computing for Earth Sciences at the 17th U.S. National Congress on Computational Mechanics, July 2023
From https://sf.status.lol:

👋 Hello from Obsidian!
Temporarily disabled while guestbooks gets their Azure issues worked out :(
Footnotes
🏅 Finalist for the ACM Gordon Bell Prize in Climate Modeling at SC25!
Citation
```bibtex
@online{foreman2026,
  author = {Foreman, Sam},
  date = {2026-01-09},
  url = {https://samforeman.me/},
  langid = {en}
}
```
