Shane Caldwell PRO

SJCaldwell

https://hackbot.dad/

AI & ML interests

cybersecurity + ml

Recent Activity

liked a dataset 4 months ago

authored 2 papers 5 days ago

AIRTBench: Measuring Autonomous AI Red Teaming Capabilities in Language Models

Paper • 2506.14682 • Published Jun 17, 2025

PentestJudge: Judging Agent Behavior Against Operational Requirements

Paper • 2508.02921 • Published Aug 4, 2025