---
license: cc-by-nc-sa-4.0
extra_gated_prompt: "The AQCat25 models are for non-commercial use. If you want to work with us or discuss obtaining a commercial license, please contact us directly at aqcat25@sandboxaq.com."
extra_gated_fields:
  First Name: text
  Last Name: text
  Company Name or Affiliation: text
  Role or Job Title: text
  I want to use the AQCat25 models for: text
  I agree to use the AQCat25 models are for non-commercial use ONLY: checkbox
tags:
- chemistry
- materials-science
- machine-learning
- catalysis
- computational-chemistry
- dft
configs:
- config_name: default
  data_files:
  - split: train_id
    path: parquet/train_id.parquet
  - split: val_id
    path: parquet/val_id.parquet
  - split: test_id
    path: parquet/test_id.parquet
  - split: val_ood_ads
    path: parquet/val_ood_ads.parquet
  - split: val_ood_mat
    path: parquet/val_ood_mat.parquet
  - split: val_ood_both
    path: parquet/val_ood_both.parquet
  - split: test_ood_ads
    path: parquet/test_ood_ads.parquet
  - split: test_ood_mat
    path: parquet/test_ood_mat.parquet
  - split: test_ood_both
    path: parquet/test_ood_both.parquet
  - split: id_slabs
    path: parquet/id_slabs.parquet
  - split: val_ood_slabs
    path: parquet/val_ood_slabs.parquet
  - split: test_ood_slabs
    path: parquet/test_ood_slabs.parquet
---

# AQCat25 Models: Unlocking spin-aware, high-fidelity machine learning potentials for heterogeneous catalysis

![model_figure](https://cdn-uploads.huggingface.co/production/uploads/67256b7931376d3bacb18de0/KKtSqC60leVPuBoL-xi1l.jpeg)

This repository contains **AQCat25-EV2** model checkpoints. The currently released models are based on the EquiformerV2 (EV2) architecture, in which scalar activations are additively modulated by a Feature-wise Linear Modulation (FiLM) network that explicitly conditions on the underlying DFT settings (i.e., "is this calculation high-fidelity?" and "is spin on?"). Stay tuned for additional models that will be released in the near future.

Please see our [website](https://www.sandboxaq.com/aqcat25) and [paper](https://cdn.prod.website-files.com/622a3cfaa89636b753810f04/68ffc1e7c907b6088573ba8c_AQCat25.pdf) for more details about the impact of the models and [dataset](https://huggingface.co/datasets/SandboxAQ/aqcat25).

## 1. Model Installation and Usage (EV2-FiLM)

This section details how to install and run the EquiformerV2-FiLM (EV2-FiLM) model.

### Step 1.1: Create Environment

First, create and activate a new micromamba (or conda) environment with Python 3.10.

```bash
micromamba create -n aqcat-ev2 python=3.10
micromamba activate aqcat-ev2
```

### Step 1.2: Install Dependencies

Before installing `fairchem`, install PyTorch and all other required libraries.

```bash
pip install torch==2.4.0 --index-url https://download.pytorch.org/whl/cu121 --no-input
pip install torch_geometric --no-input
pip install torch_sparse -f https://data.pyg.org/whl/torch-2.4.0+cu121.html --no-input
pip install torch_scatter -f https://data.pyg.org/whl/torch-2.4.0+cu121.html --no-input
pip install e3nn submitit torchtnt hydra-core pymatgen ase orjson wandb tensorboard lmdb huggingface-hub numba datasets pandas tqdm requests --no-input
```

### Step 1.3: Log in to Hugging Face

To download the model files and dataset, you must log in to your Hugging Face account.

**Create an Access Token:** Navigate to your **Settings -> Access Tokens** page or click [here](https://huggingface.co/settings/tokens). Create a new token with at least `read` permissions. Copy this token.

**Log in via the Command Line:** Open your terminal, run the following command, and paste your token when prompted.

```bash
hf auth login
```

### Step 1.4: Download the AQCat Model Files

Next, download the necessary model checkpoints and scripts from this Hugging Face repository. Save the following code as `download_files.py`. You will need to edit it to add your Hugging Face token.

```python
from huggingface_hub import snapshot_download

MY_TOKEN = "hf_YOUR_TOKEN_HERE"

print("Downloading model and code files...")
snapshot_download(
    repo_id="SandboxAQ/aqcat25-ev2",
    repo_type="model",
    allow_patterns=[
        "checkpoints_aqcat_ev2/*",
        "ev2_film/*",
        "patched_code/*",
    ],
    local_dir="./aqcat25-ev2",
    token=MY_TOKEN
)
print("Download complete.")
```

Now, run the script. This will create a new folder named `aqcat25-ev2` containing all the necessary files.

```bash
python download_files.py
```

### Step 1.5: Clone, Patch, and Install `fairchem`

Finally, we will clone the `fairchem` repo, check out the correct V1 code, copy our custom files into it, and install the modified version.

```bash
git clone git@github.com:facebookresearch/fairchem.git
cd fairchem
git fetch --all --tags
git checkout -b aqcat-ev2 tags/fairchem_core-1.10.0
cp ../aqcat25-ev2/ev2_film/equiformer_v2_film.py packages/fairchem-core/src/fairchem/core/models/equiformer_v2/
cp ../aqcat25-ev2/patched_code/ase_utils.py packages/fairchem-core/src/fairchem/core/common/relaxation/ase_utils.py
pip install -e packages/fairchem-core --no-deps --no-input
```

### Step 1.6: Model Checkpoints

This repository provides the following checkpoints.

* **Trained-from-scratch (EV2-in+midFiLM):** A generally well-rounded model that provides the best performance on practical catalysis discovery tasks.
It was jointly trained from scratch on both AQCat25 and 20M examples from OC20. This model particularly excels on the most challenging material subclasses, such as non-metals and organics. See the paper for more details.
* **Cotuned (EV2-inFiLM):** This model uses the pre-trained EV2-31M (OC20 All+MD) checkpoint as its starting point and is then fine-tuned on AQCat25 while replaying 20M examples from OC20. It offers strong performance, particularly on metal-only systems.
* **Directly Tuned (Baselines):** These are the pre-trained EV2-31M and EV2-153M (OC20 All+MD) checkpoints that have been directly fine-tuned on AQCat25 with *no* OC20 replay. They serve as baselines for comparison to show the effects of co-tuning.

### Step 1.7: Model Usage Example

Setup is complete. Your `aqcat-ev2` environment now has the patched `fairchem` v1.10.0 installed.

Here is a full example of relaxing a carbon monoxide molecule on a cobalt slab under two spin conditions. The `patched_calc` calculator, which comes from the patched `ase_utils.py` installed in Step 1.5, defaults to the high-fidelity context and will detect if spin polarization is needed based on the elements in your system (e.g., Co, Fe, Ni), though you can also toggle these flags manually as shown below.
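
The element-based spin detection described above can be pictured with a minimal sketch. The real rule inside `patched_calc` is not shown in this README, so the helper name `guess_is_spin_off` and the element set (taken only from the "Co, Fe, Ni" examples in the text) are assumptions for illustration, not the library's implementation.

```python
# Hypothetical illustration only: the actual detection logic in `patched_calc`
# may use a different element list or different criteria.
ASSUMED_MAGNETIC_ELEMENTS = frozenset({"Co", "Fe", "Ni"})


def guess_is_spin_off(symbols, magnetic_elements=ASSUMED_MAGNETIC_ELEMENTS):
    """Return True (spin off) when no assumed-magnetic element is present."""
    return not any(sym in magnetic_elements for sym in symbols)


# Chemical symbols for CO on a small cobalt cluster and on a copper one.
co_on_cobalt = ["Co"] * 4 + ["C", "O"]
co_on_copper = ["Cu"] * 4 + ["C", "O"]

print(guess_is_spin_off(co_on_cobalt))  # False: Co present, keep spin on
print(guess_is_spin_off(co_on_copper))  # True: no assumed-magnetic element
```

With ASE, the result of such a helper could be stored as `atoms.info['is_spin_off']`, computed from `atoms.get_chemical_symbols()`. The full relaxation example sets this flag explicitly for each run instead.
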
```python
import numpy as np
from ase.build import hcp0001, molecule, add_adsorbate
from ase.constraints import FixAtoms
from ase.optimize import LBFGS
from fairchem.core.common.relaxation.ase_utils import patched_calc

CHECKPOINT_PATH = "aqcat25-ev2/checkpoints_aqcat_ev2/ev2-in+midFiLM-AQCat25+OC20-20M_20251008_223220.pt"

# Build a 4-layer Co(0001) slab with a CO molecule adsorbed on top.
slab = hcp0001('Co', size=(3, 4, 4), orthogonal=True)
co_molecule = molecule('CO')
add_adsorbate(slab, co_molecule, 3.0, 'ontop')
slab.center(vacuum=10.0, axis=2)
slab.set_pbc(True)

# Fix every atom below the top Co layer; the top layer and CO stay free.
slab_symbols = slab.get_chemical_symbols()
is_slab_atom = np.array([sym == 'Co' for sym in slab_symbols])
slab_z_coords = slab.get_positions()[is_slab_atom][:, 2]
unique_slab_z = np.unique(slab_z_coords)  # np.unique returns sorted values
top_layer_z = unique_slab_z[-1]
mask = slab.get_positions()[:, 2] < top_layer_z - 0.1
slab.set_constraint(FixAtoms(mask=mask))

# Run with spin polarization context
slab_spin_on = slab.copy()
slab_spin_on.info['is_spin_off'] = False
slab_spin_on.calc = patched_calc(checkpoint_path=CHECKPOINT_PATH)
optimizer_on = LBFGS(slab_spin_on, trajectory='co_co_spin_on.traj')
optimizer_on.run(fmax=0.05)
final_energy_spin_on = slab_spin_on.get_potential_energy()

# Run without spin polarization context
slab_spin_off = slab.copy()
slab_spin_off.info['is_spin_off'] = True
slab_spin_off.calc = patched_calc(checkpoint_path=CHECKPOINT_PATH)
optimizer_off = LBFGS(slab_spin_off, trajectory='co_co_spin_off.traj')
optimizer_off.run(fmax=0.05)
final_energy_spin_off = slab_spin_off.get_potential_energy()

print("\n--- Final Comparison ---")
print(f"Spin-On Energy: {final_energy_spin_on:.4f} eV")
print(f"Spin-Off Energy: {final_energy_spin_off:.4f} eV")
print(f"Energy Difference (Off - On): {(final_energy_spin_off - final_energy_spin_on):.4f} eV")
```

### Understanding the example

You should observe a difference between the two final adsorption energies.
The spin-unpolarized run will likely have a lower (more negative) final energy, which indicates stronger, more stable binding. This is the expected behavior for the model, as it has learned the underlying physics of magnetic systems from the AQCat25 and OC20 datasets.

As explained in the paper "Spin Effects in Chemisorption and Catalysis" (ACS Catal. 2023, 13, 3456-3462), for 3d magnetic metals like Co, Fe, and Ni, the true spin-polarized ground state (`is_spin_off=False`) actually results in weaker adsorbate bonding compared to a hypothetical non-spin-polarized state (`is_spin_off=True`). This is because the energetic stabilization from the spin-down (minority-spin) d-states does not fully compensate for the destabilization from the spin-up (majority-spin) d-states.

By toggling the `is_spin_off` flag, you are telling the model to apply a different physical context, and the model predicts a different, more stable energy for the (hypothetical) non-spin-polarized system.

---

## 2. How to Cite

If you use the AQCat25 dataset or the models in your research, please cite the following paper:

```
Omar Allam, Brook Wander, & Aayush R. Singh. (2025). AQCat25: Unlocking spin-aware, high-fidelity machine learning potentials for heterogeneous catalysis. arXiv preprint arXiv:2510.22938.
```

### BibTeX Entry

```bibtex
@article{allam2025aqcat25,
  title={{AQCat25: Unlocking spin-aware, high-fidelity machine learning potentials for heterogeneous catalysis}},
  author={Allam, Omar and Wander, Brook and Singh, Aayush R},
  journal={arXiv preprint arXiv:2510.22938},
  year={2025},
  eprint={2510.22938},
  archivePrefix={arXiv},
  primaryClass={cond-mat.mtrl-sci}
}
```