gbyuvd committed on
Commit
6cbfeb0
·
verified ·
1 Parent(s): b463a40

Update README.md

  1. README.md +4 -1
README.md CHANGED
@@ -14,7 +14,7 @@ library_name: sentence-transformers

  # miniChembed-prototype

- This is a **self-supervised molecular embedding** model trained using the **Barlow Twins** objective on approximately **24K unlabeled SMILES strings**. If validated as effective, it will be scaled to 2.1M molecules. The training data were compiled from public sources including:

  - **ChEMBL34** (Zdrazil et al., 2023)
  - **COCONUTDB** (Sorokina et al., 2021)
@@ -25,6 +25,9 @@ The model maps SMILES strings to a **320-dimensional dense vector space**, optim
  Unlike fixed fingerprints (e.g., ECFP4), this model learns representations directly from **stochastic SMILES augmentations**, encouraging invariance to syntactic variation while potentially maximizing representational diversity across molecules.
  The Barlow Twins objective explicitly minimizes redundancy between embedding dimensions, promoting structured, non-collapsed representations.

  ---

  ## Model Details
 

  # miniChembed-prototype

+ This is an experimental **self-supervised molecular embedding** model trained using the **Barlow Twins** objective on approximately **24K unlabeled SMILES strings**. If validated as effective, it will be scaled to 2.1M molecules. The training data were compiled from public sources including:

  - **ChEMBL34** (Zdrazil et al., 2023)
  - **COCONUTDB** (Sorokina et al., 2021)
 
  Unlike fixed fingerprints (e.g., ECFP4), this model learns representations directly from **stochastic SMILES augmentations**, encouraging invariance to syntactic variation while potentially maximizing representational diversity across molecules.
  The Barlow Twins objective explicitly minimizes redundancy between embedding dimensions, promoting structured, non-collapsed representations.
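
As a rough illustration of the redundancy-reduction idea described above, the sketch below computes a Barlow Twins style loss for two batches of embeddings, one per augmented view of the same molecules. It is not taken from this repository's training script: the function name `barlow_twins_loss`, the parameters `lambda_offdiag` and `eps`, and the default weight of `5e-3` are assumptions chosen for illustration, and NumPy is used instead of a deep learning framework only to keep the example self-contained.

```python
import numpy as np


def barlow_twins_loss(z_a, z_b, lambda_offdiag=5e-3, eps=1e-8):
    """Barlow Twins style loss for two views of the same batch.

    z_a, z_b: arrays of shape (batch, dim), embeddings of two augmented
    views (e.g. two randomized SMILES of each molecule).
    lambda_offdiag: hypothetical weight for the redundancy term; the value
    used by the actual training script is not known here.
    """
    z_a = np.asarray(z_a, dtype=np.float64)
    z_b = np.asarray(z_b, dtype=np.float64)
    n = z_a.shape[0]

    # Standardize every embedding dimension across the batch.
    z_a = (z_a - z_a.mean(axis=0)) / (z_a.std(axis=0) + eps)
    z_b = (z_b - z_b.mean(axis=0)) / (z_b.std(axis=0) + eps)

    # Cross-correlation matrix between the two views, shape (dim, dim).
    c = z_a.T @ z_b / n

    # Invariance term: diagonal entries should be close to 1.
    on_diag = np.sum((np.diag(c) - 1.0) ** 2)
    # Redundancy term: off-diagonal entries should be close to 0.
    off_diag = np.sum(c ** 2) - np.sum(np.diag(c) ** 2)

    return float(on_diag + lambda_offdiag * off_diag)


if __name__ == "__main__":
    # Two decorrelated dimensions, identical views: loss is close to zero.
    decorrelated = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]])
    print(barlow_twins_loss(decorrelated, decorrelated))

    # Two redundant (identical) dimensions: the off-diagonal term is penalized.
    redundant = np.array([[1.0, 1.0], [-1.0, -1.0], [1.0, 1.0], [-1.0, -1.0]])
    print(barlow_twins_loss(redundant, redundant))
```

In this sketch the diagonal term rewards embeddings that agree across the two views, while the off-diagonal term penalizes dimensions that carry the same information, which is the sense in which the objective discourages redundant or collapsed representations.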

+ > Note: This is an experimental prototype.
+ > Feel free to experiment with and edit the training script as you wish!
+ > Correcting my mistake(s), tweaking augmentations, loss weights, optimizer settings, or network architecture could lead to even better representations.
  ---

  ## Model Details