Cycle Consistency as Reward: Learning Image-Text Alignment without Human Preferences
Paper: arXiv:2506.02095
A port of CycleReward to the imscore library.
There are three variants of the CycleReward model: CycleReward-Combo, CycleReward-T2I, and CycleReward-I2T.
This model has been pushed to the Hub using the PyTorchModelHubMixin integration.
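The mixin adds save_pretrained, from_pretrained, and push_to_hub to any nn.Module, so the checkpoint ships with a config.json and its weights. A minimal sketch of the pattern (the TinyRewardModel class and hidden_dim argument below are illustrative, not the actual CycleReward definition):

import torch.nn as nn
from huggingface_hub import PyTorchModelHubMixin

class TinyRewardModel(nn.Module, PyTorchModelHubMixin):
    # Constructor kwargs are serialized to config.json by the mixin.
    def __init__(self, hidden_dim: int = 16):
        super().__init__()
        self.head = nn.Linear(hidden_dim, 1)

    def forward(self, x):
        return self.head(x)

# model = TinyRewardModel()
# model.push_to_hub("your-username/tiny-reward-model")   # upload weights + config
# model = TinyRewardModel.from_pretrained("your-username/tiny-reward-model")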
To use this model:
pip install git+https://github.com/Abhinay1997/imscore@cyclereward
import numpy as np
import torch
from einops import rearrange
from PIL import Image

from imscore.cyclereward.model import CycleReward
available_models = ["NagaSaiAbhinay/CycleReward-Combo", "NagaSaiAbhinay/CycleReward-T2I", "NagaSaiAbhinay/CycleReward-I2T"]
model_id = available_models[0]
model = CycleReward.from_pretrained(model_id)
prompts = "a photo of a cat"
pixels = Image.open("cat.jpg")
pixels = np.array(pixels)
pixels = rearrange(torch.tensor(pixels), "h w c -> 1 c h w") / 255.0
# prompts and pixels should have the same batch dimension
# pixels should be in the range [0, 1]
# score == logits
score = model.score(pixels, prompts)  # fully differentiable reward
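Because the reward is differentiable with respect to the input pixels, it can serve directly as an optimization signal. A minimal sketch, assuming model.score returns a tensor that keeps its autograd graph and reusing the pixels and prompts from above (the step count and learning rate are arbitrary):

# Illustrative only: gradient ascent on the reward to nudge the image toward the prompt.
pixels = pixels.clone().requires_grad_(True)
optimizer = torch.optim.Adam([pixels], lr=1e-2)

for step in range(10):
    optimizer.zero_grad()
    reward = model.score(pixels.clamp(0, 1), prompts)  # higher score = better image-text alignment
    loss = -reward.mean()                              # minimize the negative reward
    loss.backward()
    optimizer.step()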