extra_gated_heading: You need to share contact information with Alchemab to access this model
extra_gated_prompt: >-

  ### FAbCon Terms of Use

  FAbCon models follow a [modified Apache 2.0
  license](https://huggingface.co/alchemab/fabcon-large/blob/main/LICENSE.md)
extra_gated_fields:
  First Name: text
  Last Name: text
  Email: text
  Organization: text
  By clicking 'Submit' below, I accept the terms of the license, agree to share contact information with Alchemab: checkbox
  I agree to being contacted about future products, services, and/or partnership opportunities: checkbox
extra_gated_description: >-
  The information you provide will be collected, stored, processed, and shared
  in accordance with the [Alchemab Privacy
  Notice](https://www.alchemab.com/privacy-policy/).
extra_gated_button_content: Submit
license: other
widget:
  - text: ḢQVQLE
tags:
  - biology

FAbCon-medium 🦅🧬

FAbCon is a generative, antibody-specific language model based on the Falcon architecture. It is pre-trained with causal language modelling and can be used for tasks such as antibody sequence generation and sequence property prediction. FAbCon-small, FAbCon-medium, and FAbCon-large are available for non-commercial use under a modified Apache 2.0 license. For commercial use of the models, or a license for antibodies generated by any FAbCon model, please contact us.

| Model variant | Parameters | Config (layers L, heads H, hidden size d) | License |
|---|---|---|---|
| FAbCon-small | 144M | 24L, 12H, 768d | Modified Apache 2.0 |
| FAbCon-medium | 297M | 28L, 16H, 1024d | Modified Apache 2.0 |
| FAbCon-large | 2.4B | 56L, 32H, 2048d | Modified Apache 2.0 |
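
As a rough sanity check, the parameter counts in the table can be approximately reproduced from the Config column. The sketch below assumes that L, H, and d stand for layers, attention heads, and hidden size, and that each layer follows a Falcon-style layout: multi-query attention with one shared key/value head, a 4x feed-forward expansion, and no bias terms. Embeddings and layer norms are ignored. None of these assumptions is stated in this README, and `estimate_params` is a hypothetical helper, so treat the result as an estimate only.

```python
def estimate_params(n_layers: int, n_heads: int, d_model: int) -> int:
    """Rough parameter estimate for a Falcon-style decoder (assumptions above)."""
    head_dim = d_model // n_heads
    # Multi-query attention (assumed): full-width queries plus one shared
    # key head and one shared value head.
    qkv = d_model * (d_model + 2 * head_dim)
    attn_out = d_model * d_model
    # Feed-forward block (assumed): d -> 4d -> d, no biases.
    mlp = 2 * d_model * (4 * d_model)
    return n_layers * (qkv + attn_out + mlp)


# (layers, heads, hidden size) taken from the Config column
configs = {
    "FAbCon-small": (24, 12, 768),
    "FAbCon-medium": (28, 16, 1024),
    "FAbCon-large": (56, 32, 2048),
}

for name, (layers, heads, d_model) in configs.items():
    estimate = estimate_params(layers, heads, d_model)
    print(f"{name}: ~{estimate / 1e6:.0f}M parameters")
```

Under these assumptions the estimates come out at roughly 144M, 297M, and 2.36B, close to the listed figures.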

Usage example - generation

Sequences can be generated with the model.generate method built into Hugging Face transformers:

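from transformers import (
    PreTrainedTokenizerFast,
    FalconForCausalLM
)

tokenizer = PreTrainedTokenizerFast.from_pretrained("alchemab/fabcon-medium")
model = FalconForCausalLM.from_pretrained("alchemab/fabcon-medium")

# Tokenize the prompt "Ḣ"; [:, :-1] drops the last token id (presumably an
# end token appended by the tokenizer) so that generation continues the prompt
input_ids = tokenizer("Ḣ", return_tensors='pt')['input_ids'][:, :-1]

# Replace each ... with a value of your choice
o = model.generate(
    input_ids,
    do_sample=True,  # top_k and temperature only apply when sampling
    max_new_tokens=...,
    top_k=...,
    temperature=...
)
decoded_seq = tokenizer.batch_decode(o)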
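
To make the roles of top_k and temperature in the call above more concrete, here is a minimal, library-free sketch of top-k sampling with temperature over a toy set of logits. It is a simplified illustration of the general technique, not the transformers implementation; `top_k_probs`, `sample_token`, and the example logits are hypothetical.

```python
import math
import random


def top_k_probs(logits, top_k, temperature):
    """Return sampling probabilities after temperature scaling and top-k filtering."""
    if top_k < 1 or temperature <= 0:
        raise ValueError("top_k must be >= 1 and temperature must be > 0")
    scaled = [x / temperature for x in logits]
    # Keep the indices of the top_k largest scaled logits; all others get probability 0.
    keep = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:top_k]
    peak = max(scaled[i] for i in keep)  # subtract the max for numerical stability
    weights = {i: math.exp(scaled[i] - peak) for i in keep}
    total = sum(weights.values())
    return [weights.get(i, 0.0) / total for i in range(len(logits))]


def sample_token(logits, top_k, temperature, rng=random):
    """Draw one token index from the filtered distribution."""
    probs = top_k_probs(logits, top_k, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


toy_logits = [2.0, 1.0, 0.5, -1.0]  # made-up scores for four tokens
print(top_k_probs(toy_logits, top_k=2, temperature=1.0))
print(top_k_probs(toy_logits, top_k=2, temperature=0.5))
print(sample_token(toy_logits, top_k=2, temperature=1.0))
```

Lower temperatures concentrate probability on the highest-scoring tokens, and a smaller top_k restricts sampling to fewer candidates.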

Usage example - sequence property prediction

For sequence property prediction, use the built-in SequenceClassification classes in transformers:

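from transformers import (
    PreTrainedTokenizerFast,
    FalconForSequenceClassification
)

tokenizer = PreTrainedTokenizerFast.from_pretrained("alchemab/fabcon-medium")
model = FalconForSequenceClassification.from_pretrained("alchemab/fabcon-medium")
# Note: if the checkpoint has no trained classification head, transformers
# initializes one randomly, so fine-tune before relying on predictions

# Tokenize once and reuse both tensors
inputs = tokenizer("Ḣ", return_tensors='pt')
o = model(input_ids=inputs['input_ids'],
          attention_mask=inputs['attention_mask'])
# o.logits holds the raw prediction scores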