---
license: cc-by-nc-sa-4.0
extra_gated_prompt: "The AQCat25 models are for non-commercial use. If you want to work with us or discuss obtaining a commercial license, please contact us directly at aqcat25@sandboxaq.com."
extra_gated_fields:
  First Name: text
  Last Name: text
  Company Name or Affiliation: text
  Role or Job Title: text
  I want to use the AQCat25 models for: text
  I agree to use the AQCat25 models for non-commercial use ONLY: checkbox
tags:
- chemistry
- materials-science
- machine-learning
- catalysis
- computational-chemistry
- dft
configs:
- config_name: default
  data_files:
  - split: train_id
    path: parquet/train_id.parquet
  - split: val_id
    path: parquet/val_id.parquet
  - split: test_id
    path: parquet/test_id.parquet
  - split: val_ood_ads
    path: parquet/val_ood_ads.parquet
  - split: val_ood_mat
    path: parquet/val_ood_mat.parquet
  - split: val_ood_both
    path: parquet/val_ood_both.parquet
  - split: test_ood_ads
    path: parquet/test_ood_ads.parquet
  - split: test_ood_mat
    path: parquet/test_ood_mat.parquet
  - split: test_ood_both
    path: parquet/test_ood_both.parquet
  - split: id_slabs
    path: parquet/id_slabs.parquet
  - split: val_ood_slabs
    path: parquet/val_ood_slabs.parquet
  - split: test_ood_slabs
    path: parquet/test_ood_slabs.parquet
---
# AQCat25 Models: Unlocking spin-aware, high-fidelity machine learning potentials for heterogeneous catalysis

This repository contains **AQCat25-EV2** model checkpoints. The currently released models are based on the EquiformerV2 (EV2) architecture, in which scalar activations are additively modulated by a Feature-wise Linear Modulation (FiLM) network that explicitly conditions on the underlying DFT settings (i.e., "is this calculation high-fidelity?" and "is spin on?").
Stay tuned for additional models that will be released in the near future.
Please see our [website](https://www.sandboxaq.com/aqcat25) and [paper](https://cdn.prod.website-files.com/622a3cfaa89636b753810f04/68ffc1e7c907b6088573ba8c_AQCat25.pdf) for more details about the models and their impact, and the accompanying [dataset](https://huggingface.co/datasets/SandboxAQ/aqcat25) for the underlying data.
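As a rough mental model of this conditioning, the sketch below shows FiLM-style additive modulation in PyTorch. It is an illustration only, not the actual EV2-FiLM layer: the module name `AdditiveFiLM`, the hidden size, and the two-flag condition vector are assumptions for this sketch.
```python
import torch
import torch.nn as nn


class AdditiveFiLM(nn.Module):
    """Illustrative only: map DFT-setting flags to a per-channel shift that is
    added to scalar features, in the spirit of the EV2-FiLM conditioning."""

    def __init__(self, num_channels: int, num_conditions: int = 2, hidden: int = 32):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(num_conditions, hidden),
            nn.SiLU(),
            nn.Linear(hidden, num_channels),
        )

    def forward(self, scalar_features: torch.Tensor, conditions: torch.Tensor) -> torch.Tensor:
        # conditions holds the flags, e.g. [is_high_fidelity, is_spin_on], per atom
        return scalar_features + self.mlp(conditions)


# Toy usage: 8 atoms, 128 scalar channels, "high-fidelity, spin on" context.
features = torch.randn(8, 128)
flags = torch.tensor([[1.0, 1.0]]).expand(8, -1)
modulated = AdditiveFiLM(num_channels=128)(features, flags)
print(modulated.shape)  # torch.Size([8, 128])
```
In the released models this conditioning sits inside EquiformerV2; the checkpoint names in Step 1.6 distinguish where it is applied (input-only FiLM vs. input plus mid-network FiLM). See the paper for the exact placement.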
## 1. Model Installation and Usage (EV2-FiLM)
This section details how to install and run the EquiformerV2-FiLM (EV2-FiLM) model.
### Step 1.1: Create Environment
First, create and activate a new micromamba (or conda) environment with Python 3.10.
```bash
micromamba create -n aqcat-ev2 python=3.10
micromamba activate aqcat-ev2
```
### Step 1.2: Install Dependencies
Before installing `fairchem`, install PyTorch and all other required libraries.
```bash
pip install torch==2.4.0 --index-url https://download.pytorch.org/whl/cu121 --no-input
pip install torch_geometric --no-input
pip install torch_sparse -f https://data.pyg.org/whl/torch-2.4.0+cu121.html --no-input
pip install torch_scatter -f https://data.pyg.org/whl/torch-2.4.0+cu121.html --no-input
pip install e3nn submitit torchtnt hydra-core pymatgen ase orjson wandb tensorboard lmdb huggingface-hub numba datasets pandas tqdm requests --no-input
```
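Before moving on, it can be worth confirming that the CUDA-enabled PyTorch build resolved correctly. The quick check below is optional; the exact version string depends on the pins above and CUDA availability depends on your drivers.
```python
import torch

# With the pins above this should report 2.4.0+cu121.
print("torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```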
### Step 1.3: Log in to Hugging Face
To download the model files and dataset, you must log in to your Hugging Face account.
**Create an Access Token:**
Navigate to your **Settings -> Access Tokens** page or click [here](https://huggingface.co/settings/tokens). Create a new token with at least `read` permissions. Copy this token.
**Log in via the Command Line:**
Open your terminal, run the following command, and paste your token when prompted.
```bash
hf auth login
```
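Alternatively, if you prefer to stay in Python, the same login can be performed with `huggingface_hub.login` (the token string below is a placeholder):
```python
from huggingface_hub import login

# Placeholder token; use your own read-scoped token from the settings page.
login(token="hf_YOUR_TOKEN_HERE")
```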
### Step 1.4: Download the AQCat Model Files
Next, download the necessary model checkpoints and scripts from this Hugging Face repository.
Save the following code as `download_files.py`. You will need to edit it to add your Hugging Face token.
```python
from huggingface_hub import snapshot_download

MY_TOKEN = "hf_YOUR_TOKEN_HERE"

print("Downloading model and code files...")
snapshot_download(
    repo_id="SandboxAQ/aqcat25-ev2",
    repo_type="model",
    allow_patterns=[
        "checkpoints_aqcat_ev2/*",
        "ev2_film/*",
        "patched_code/*",
    ],
    local_dir="./aqcat25-ev2",
    token=MY_TOKEN,
)
print("Download complete.")
```
Now, run the script. This will create a new folder named `aqcat25-ev2` containing all the necessary files.
```bash
python download_files.py
```
### Step 1.5: Clone, Patch, and Install `fairchem`
Finally, we will clone the `fairchem` repo, check out the correct V1 code, copy our custom files into it, and install the modified version.
```bash
git clone git@github.com:facebookresearch/fairchem.git
cd fairchem
git fetch --all --tags
git checkout -b aqcat-ev2 tags/fairchem_core-1.10.0
cp ../aqcat25-ev2/ev2_film/equiformer_v2_film.py packages/fairchem-core/src/fairchem/core/models/equiformer_v2/
cp ../aqcat25-ev2/patched_code/ase_utils.py packages/fairchem-core/src/fairchem/core/common/relaxation/ase_utils.py
pip install -e packages/fairchem-core --no-deps --no-input
```
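As an optional sanity check that the patched install is active, the import below should succeed and resolve to the copied `ase_utils.py` (it simply imports `patched_calc`, the entry point used in Step 1.7):
```python
# If this import works, the patched fairchem-core editable install is on your path.
from fairchem.core.common.relaxation.ase_utils import patched_calc

print("Patched fairchem import OK:", patched_calc)
```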
### Step 1.6: Model Checkpoints
This repository provides the following checkpoints.
* **Trained-from-scratch (EV2-in+midFiLM):** A generally well-rounded model that provides the best performance on practical catalysis discovery tasks. It was jointly trained from scratch on both AQCat25 and 20M examples from OC20. This model particularly excels on the most challenging material subclasses, such as non-metals and organics. See the paper for more details.
* **Cotuned (EV2-inFiLM):** This model uses the pre-trained EV2-31M (OC20 All+MD) checkpoint as its starting point and is then fine-tuned on AQCat25 while replaying 20M examples from OC20. It offers strong performance particularly on metal-only systems.
* **Directly Tuned (Baselines):** These are the pre-trained EV2-31M and EV2-153M (OC20 All+MD) checkpoints that have been directly fine-tuned on AQCat25 with *no* OC20 replay. They serve as baselines for comparison to show the effects of co-tuning.
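The checkpoint files themselves land under `aqcat25-ev2/checkpoints_aqcat_ev2/` after the download in Step 1.4. A short listing like the sketch below (the directory path comes from the download step; the filenames are whatever was downloaded) makes it easy to copy the exact path needed in Step 1.7:
```python
from pathlib import Path

# Print every downloaded checkpoint; pass one of these paths to patched_calc in Step 1.7.
for ckpt in sorted(Path("aqcat25-ev2/checkpoints_aqcat_ev2").glob("*.pt")):
    print(ckpt)
```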
### Step 1.7: Model Usage Example
Setup is complete. Your `aqcat-ev2` environment now has the patched `fairchem` v1.10.0 installed.
Here is a full example of relaxing a carbon monoxide molecule on a cobalt slab with two spin conditions.
The `patched_calc` provided in the installation guide defaults to the high-fidelity context and will detect if spin polarization is needed based on the elements in your system (e.g., Co, Fe, Ni), though you can also toggle these flags manually as shown below.
```python
import numpy as np
from ase.build import hcp0001, molecule, add_adsorbate
from ase.constraints import FixAtoms
from ase.optimize import LBFGS
from fairchem.core.common.relaxation.ase_utils import patched_calc

CHECKPOINT_PATH = "aqcat25-ev2/checkpoints_aqcat_ev2/ev2-in+midFiLM-AQCat25+OC20-20M_20251008_223220.pt"

# Build a Co(0001) slab with a CO molecule adsorbed on an ontop site.
slab = hcp0001('Co', size=(3, 4, 4), orthogonal=True)
co_molecule = molecule('CO')
add_adsorbate(slab, co_molecule, 3.0, 'ontop')
slab.center(vacuum=10.0, axis=2)
slab.set_pbc(True)

# Fix every slab layer below the topmost Co layer; the adsorbate stays free.
slab_symbols = slab.get_chemical_symbols()
is_slab_atom = np.array([sym == 'Co' for sym in slab_symbols])
slab_z_coords = slab.get_positions()[is_slab_atom][:, 2]
top_layer_z = np.unique(slab_z_coords).max()
mask = slab.get_positions()[:, 2] < top_layer_z - 0.1
slab.set_constraint(FixAtoms(mask=mask))

# Run with spin polarization context (a calculator is created per run).
slab_spin_on = slab.copy()
slab_spin_on.info['is_spin_off'] = False
slab_spin_on.calc = patched_calc(checkpoint_path=CHECKPOINT_PATH)
optimizer_on = LBFGS(slab_spin_on, trajectory='co_co_spin_on.traj')
optimizer_on.run(fmax=0.05)
final_energy_spin_on = slab_spin_on.get_potential_energy()

# Run without spin polarization context
slab_spin_off = slab.copy()
slab_spin_off.info['is_spin_off'] = True
slab_spin_off.calc = patched_calc(checkpoint_path=CHECKPOINT_PATH)
optimizer_off = LBFGS(slab_spin_off, trajectory='co_co_spin_off.traj')
optimizer_off.run(fmax=0.05)
final_energy_spin_off = slab_spin_off.get_potential_energy()

print("\n--- Final Comparison ---")
print(f"Spin-On Energy: {final_energy_spin_on:.4f} eV")
print(f"Spin-Off Energy: {final_energy_spin_off:.4f} eV")
print(f"Energy Difference (Off - On): {(final_energy_spin_off - final_energy_spin_on):.4f} eV")
```
### Understanding the example
You should observe a difference between the two final adsorption energies. The spin-unpolarized run will likely have a lower (more negative) final energy, which indicates a stronger, more stable binding.
This is the correct behavior for the model, as it has learned the underlying physics of magnetic systems from the AQCat25 and OC20 datasets.
As explained in the paper "Spin Effects in Chemisorption and Catalysis" (ACS Catal. 2023, 13, 3456-3462), for 3d magnetic metals like Co, Fe, and Ni, the true spin-polarized ground state (`is_spin_off=False`) results in weaker adsorbate bonding than a hypothetical non-spin-polarized state (`is_spin_off=True`). This is because the energetic stabilization from the spin-down (minority-spin) d-states does not fully compensate for the destabilization from the spin-up (majority-spin) d-states.
By toggling the `is_spin_off` flag, you tell the model to apply a different physical context, and it predicts a different, more stable energy for the (hypothetical) non-spin-polarized system.
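Since both relaxations write ASE trajectory files (`co_co_spin_on.traj` and `co_co_spin_off.traj`), the comparison can also be reproduced after the fact from the stored final frames; a small sketch:
```python
from ase.io import read

# The last frame of each trajectory carries the relaxed structure and its stored energy.
e_on = read("co_co_spin_on.traj", index=-1).get_potential_energy()
e_off = read("co_co_spin_off.traj", index=-1).get_potential_energy()

print(f"Spin-On Energy:  {e_on:.4f} eV")
print(f"Spin-Off Energy: {e_off:.4f} eV")
print(f"Energy Difference (Off - On): {e_off - e_on:.4f} eV")
```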
---
## 2. How to Cite
If you use the AQCat25 dataset or the models in your research, please cite the following paper:
```
Omar Allam, Brook Wander, & Aayush R. Singh. (2025). AQCat25: Unlocking spin-aware, high-fidelity machine learning potentials for heterogeneous catalysis. arXiv preprint arXiv:2510.22938.
```
### BibTeX Entry
```bibtex
@article{allam2025aqcat25,
  title={{AQCat25: Unlocking spin-aware, high-fidelity machine learning potentials for heterogeneous catalysis}},
  author={Allam, Omar and Wander, Brook and Singh, Aayush R},
  journal={arXiv preprint arXiv:2510.22938},
  year={2025},
  eprint={2510.22938},
  archivePrefix={arXiv},
  primaryClass={cond-mat.mtrl-sci}
}
```