SentenceTransformer

This is a sentence-transformers model trained on the cornstack_python, cornstack_python_pairs, codesearchnet, codesearchnet_pairs, and solyanka_qa datasets. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Maximum Sequence Length: 8192 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Training Datasets: cornstack_python, cornstack_python_pairs, codesearchnet, codesearchnet_pairs, solyanka_qa
  • Languages: ru, en

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False, 'architecture': 'ModernBertModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
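The Pooling block above does masked mean pooling (pooling_mode_mean_tokens: True): it averages the Transformer's token embeddings, ignoring padding. A minimal sketch of that step with hypothetical toy values (not the model's actual tensors):

```python
import numpy as np

# Hypothetical Transformer output for one sentence:
# 4 token embeddings of dimension 768, the last token being padding.
rng = np.random.default_rng(0)
token_embeddings = rng.random((4, 768))
attention_mask = np.array([1, 1, 1, 0])

# pooling_mode_mean_tokens: sum the embeddings of real tokens
# and divide by the number of real tokens.
masked = token_embeddings * attention_mask[:, None]
sentence_embedding = masked.sum(axis=0) / attention_mask.sum()

print(sentence_embedding.shape)  # (768,)
```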

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("fyaronskiy/code_retriever_ru_en")
# Run inference
sentences = [
    "Method which returns a dictionary of field statistics received from the\n    input source.\n\n    Returns:\n\n      fieldStats: dict of dicts where the first level is the field name and\n        the second level is the statistic. ie. fieldStats['pounds']['min']",
    'def _getFieldStats(self):\n    """\n    Method which returns a dictionary of field statistics received from the\n    input source.\n\n    Returns:\n\n      fieldStats: dict of dicts where the first level is the field name and\n        the second level is the statistic. ie. fieldStats[\'pounds\'][\'min\']\n\n    """\n\n    fieldStats = dict()\n    fieldNames = self._inputSource.getFieldNames()\n    for field in fieldNames:\n      curStats = dict()\n      curStats[\'min\'] = self._inputSource.getFieldMin(field)\n      curStats[\'max\'] = self._inputSource.getFieldMax(field)\n      fieldStats[field] = curStats\n    return fieldStats',
    'def customize(func):\n    """\n    Decorator to set plotting context and axes style during function call.\n    """\n    @wraps(func)\n    def call_w_context(*args, **kwargs):\n        set_context = kwargs.pop(\'set_context\', True)\n        if set_context:\n            with plotting_context(), axes_style():\n                return func(*args, **kwargs)\n        else:\n            return func(*args, **kwargs)\n    return call_w_context',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.9048, 0.0377],
#         [0.9048, 1.0000, 0.0953],
#         [0.0377, 0.0953, 1.0000]])
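Because the Training Details section below shows the model was trained with MatryoshkaLoss over dimensions 768/512/256/128/64, it is plausible to truncate the embeddings to one of those prefixes and re-normalize them for cheaper storage and search (recent sentence-transformers releases also expose a `truncate_dim` argument on the SentenceTransformer constructor; check your installed version). A hedged numpy sketch of the truncation itself, assuming the embeddings are already computed:

```python
import numpy as np

def truncate_and_normalize(embeddings: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` components of each row and L2-normalize,
    so dot products on the truncated vectors are cosine similarities."""
    truncated = embeddings[:, :dim]
    norms = np.linalg.norm(truncated, axis=1, keepdims=True)
    return truncated / norms

# Hypothetical full-size embeddings for three sentences.
rng = np.random.default_rng(0)
full = rng.random((3, 768))
small = truncate_and_normalize(full, 256)

print(small.shape)                                      # (3, 256)
print(np.allclose(np.linalg.norm(small, axis=1), 1.0))  # True
```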

Evaluation

Metrics

Information Retrieval

Metric Value
cosine_accuracy@1 0.8684
cosine_accuracy@3 0.9439
cosine_accuracy@5 0.9566
cosine_accuracy@10 0.9668
cosine_precision@1 0.8684
cosine_precision@3 0.3146
cosine_precision@5 0.1913
cosine_precision@10 0.0967
cosine_recall@1 0.8684
cosine_recall@3 0.9439
cosine_recall@5 0.9566
cosine_recall@10 0.9668
cosine_ndcg@10 0.9224
cosine_mrr@10 0.9076
cosine_map@100 0.9083

Information Retrieval

Metric Value
cosine_accuracy@1 0.8742
cosine_accuracy@3 0.9425
cosine_accuracy@5 0.9549
cosine_accuracy@10 0.9644
cosine_precision@1 0.8742
cosine_precision@3 0.3142
cosine_precision@5 0.191
cosine_precision@10 0.0964
cosine_recall@1 0.8742
cosine_recall@3 0.9425
cosine_recall@5 0.9549
cosine_recall@10 0.9644
cosine_ndcg@10 0.9234
cosine_mrr@10 0.9098
cosine_map@100 0.9105
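For reference, the accuracy@k and MRR@10 figures above follow the usual definitions: accuracy@k is the fraction of queries whose first relevant document appears in the top k results, and MRR@10 averages the reciprocal rank of that first hit (counting 0 when it falls outside the top 10). A toy sketch with hypothetical ranks, not the actual evaluator:

```python
import numpy as np

def ir_metrics(ranks, k=10):
    """`ranks` holds, per query, the 1-based rank of the first relevant
    document (np.inf if it was never retrieved)."""
    ranks = np.asarray(ranks, dtype=float)
    accuracy_at_k = float(np.mean(ranks <= k))
    # MRR@k: reciprocal rank, counting 0 when the hit is below rank k.
    rr = np.where(ranks <= k, 1.0 / ranks, 0.0)
    return accuracy_at_k, float(rr.mean())

# Toy example: 4 queries whose first relevant hit lands at these ranks.
acc1, _ = ir_metrics([1, 1, 2, np.inf], k=1)
acc10, mrr10 = ir_metrics([1, 1, 2, np.inf], k=10)
print(acc1, acc10, mrr10)  # 0.5 0.75 0.625
```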

Training Details

Training Datasets

cornstack_python


  • Dataset: cornstack_python
  • Size: 2,869,969 training samples
  • Columns: ru_query, document, negative_0, negative_1, negative_2, negative_3, negative_4, negative_5, negative_6, negative_7, negative_8, negative_9, negative_10, negative_11, negative_12, negative_13, negative_14, and negative_15
  • Approximate statistics based on the first 1000 samples:
    column       type    min       mean           max
    ru_query     string  7 tokens  27.46 tokens   162 tokens
    document     string  6 tokens  304.38 tokens  5574 tokens
    negative_0   string  6 tokens  237.08 tokens  3627 tokens
    negative_1   string  6 tokens  229.94 tokens  6691 tokens
    negative_2   string  6 tokens  230.06 tokens  6229 tokens
    negative_3   string  7 tokens  230.7 tokens   4876 tokens
    negative_4   string  8 tokens  220.57 tokens  4876 tokens
    negative_5   string  7 tokens  236.08 tokens  5880 tokens
    negative_6   string  6 tokens  247.91 tokens  6621 tokens
    negative_7   string  6 tokens  207.62 tokens  3350 tokens
    negative_8   string  6 tokens  222.54 tokens  6863 tokens
    negative_9   string  6 tokens  221.53 tokens  4976 tokens
    negative_10  string  7 tokens  216.06 tokens  4876 tokens
    negative_11  string  7 tokens  197.03 tokens  4763 tokens
    negative_12  string  6 tokens  200.83 tokens  8192 tokens
    negative_13  string  6 tokens  204.94 tokens  3210 tokens
    negative_14  string  6 tokens  188.51 tokens  2754 tokens
    negative_15  string  6 tokens  188.27 tokens  4876 tokens
  • Samples:
    Sample 1
      ru_query: установите значение business_id сообщения данных в конкретное значение
      document:
        def step_impl_the_ru_is_set_to(context, business_id):
            context.bdd_helper.message_data["business_id"] = business_id
      negative_0:
        def business_id(self, business_id):
            self._business_id = business_id
      negative_1:
        def business_phone(self, business_phone):
            self._business_phone = business_phone
      negative_2:
        def business_phone_number(self, business_phone_number):
            self._business_phone_number = business_phone_number
      negative_3:
        def bus_ob_id(self, bus_ob_id):
            self._bus_ob_id = bus_ob_id
      negative_4:
        def bus_ob_id(self, bus_ob_id):
            self._bus_ob_id = bus_ob_id
      negative_5:
        def _set_id(self, value):
            pass
      negative_6:
        def business_email(self, business_email):
            self._business_email = business_email
      negative_7:
        def mailing_id(self, val: str):
            self._mailing_id = val
      negative_8:
        def message_id(self, val: str):
            self._message_id = val
      negative_9:
        def business_model(self, business_model):
            self._business_model = business_model
      negative_10:
        def business_account(self, business_account):
            self._business_account = business_account
      negative_11:
        def update_business(current_user, businessId):
            business = Business.query.get(int(businessId))

            if not business:
                return make_json_reply('message', 'Business id does not exist'), 404

            if business.user_id != current_user.id:
                return make_json_reply('message', 'Cannot update business'), 400

            data = request.get_json(force=True)
            name = location = category = description = None

            if 'name' in data.keys():
                name = data['name']

            if 'location' in data.keys():
                location = data['location']

            if 'category' in data.keys():
                category = data['category']

            if 'description' in data.keys():
                description = data['description']

            if check_validity_of_input(name=name):
                business.name = name

            if check_validity_of_input(location=location):
                business.location = location

            if check_validity_of_input(category=category):
                business.category = category

            if check_validity_of_input(description=description):
            ...
      negative_12:
        def set_company_id_value(self, company_id_value):
            self.company_id_value = company_id_value
      negative_13:
        def id(self, value):
            self._id = value
      negative_14:
        def set_bribe(self, bribe_amount):
            self.bribe = bribe_amount
      negative_15:
        def business_owner(self, business_owner):
            self._business_owner = business_owner
    Sample 2
      ru_query: Установить состояние правил sid
      document:
        def set_state_sid_request(ruleset_name, sid):
            message = json.loads(request.stream.read().decode('utf-8'))
            message['sid'] = sid
            result = host.patch_state(ruleset_name, message)
            return jsonify(result)
      negative_0:
        def sid(self, sid):
            self._sid = sid
      negative_1:
        def set_state(self,s):
            self.state = s
      negative_2:
        def set_state(self, state: int):
      negative_3:
        def setstate(self, state):
            self.set(DER = state)
      negative_4:
        def set_rule(self, rule):
            self.rule.load_state_dict(rule, strict=True)
      negative_5:
        def _set_state(self, state):
            #print("** set state from %d to %d" % (self.state, state))
            self.state = state
      negative_6:
        def set_state( self ):
      negative_7:
        def set_ident(self, new_ident: int):
            if not isinstance(new_ident, int):
                raise TypeError("Spectrum set identifiers may ONLY be positive integers")
            self._set_ident = new_ident
      negative_8:
        def set_state(self, state):
            #print("ComponentBase.set_state")
            for k,v in state.items():
                #print(" Set {:14s} to {:s}".format(k,str(v)))
                if k == "connectors":
                    for con_state in v:
                        self.add_connector()
                        self.connectors[-1].set_state(con_state)
                else:
                    setattr(self, k, v)
      negative_9:
        def setstate(self, state):
            self.list = state
      negative_10:
        def setstate(self, state):
            self.list = state
      negative_11:
        def state_id(self, state_id):
            self._state_id = state_id
      negative_12:
        def set_state(self, state: int):
            self.state = state
      negative_13:
        def set_domain_sid(self, sid):
            dsdb._samdb_set_domain_sid(self, sid)
      negative_14:
        def set_state(self,state):
            self.__state = state
      negative_15:
        def set_srid(self, srid: ir.IntegerValue) -> GeoSpatialValue:
            return ops.GeoSetSRID(self, srid=srid).to_expr()
    Sample 3
      ru_query: Отправить события sid в ruleset
      document:
        def post_sid_events(ruleset_name, sid):
            message = json.loads(request.stream.read().decode('utf-8'))
            message['sid'] = sid
            result = host.post(ruleset_name, message)
            return jsonify(result)
      negative_0:
        def post_events(ruleset_name):
            message = json.loads(request.stream.read().decode('utf-8'))
            result = host.post(ruleset_name, message)
            return jsonify(result)
      negative_1:
        def set_state_sid_request(ruleset_name, sid):
            message = json.loads(request.stream.read().decode('utf-8'))
            message['sid'] = sid
            result = host.patch_state(ruleset_name, message)
            return jsonify(result)
      negative_2:
        def sid(self, sid):
            self._sid = sid
      negative_3:
        def post(self, request, *args, **kwargs):
            id = args[0] if args else list(kwargs.values())[0]
            try:
                ssn = Subscription.objects.get(id=id)
            except Subscription.DoesNotExist:
                logger.error(
                    f'Received unwanted subscription {id} POST request! Sending status '
                    '410 back to hub.'
                )
                return Response('Unwanted subscription', status=410)

            ssn.update(time_last_event_received=now())
            self.handler_task.delay(request.data)
            return Response('') # TODO
      negative_4:
        def informed_consent_on_post_save(sender, instance, raw, created, **kwargs):
            if not raw:
                if created:
                    pass
                    # instance.registration_update_or_create()
                    # update_model_fields(instance=instance,
                    # model_cls=['subject_identifier', instance.subject_identifier])
                try:
                    OnSchedule.objects.get(
                        subject_identifier=instance.subject_identifier, )
                except OnSchedule.DoesNotExist:
                    onschedule_model = 'training_subject.onschedule'
                    put_on_schedule(schedule_name='training_subject_visit_schedule', instance=instance, onschedule_model=onschedule_model)
      negative_5:
        def post_event(self, event):
            from evennia.scripts.models import ScriptDB

            if event.public_event:
                event_manager = ScriptDB.objects.get(db_key="Event Manager")
                event_manager.post_event(event, self.owner.player, event.display())
      negative_6:
        def post(self, event, *args, **kwargs):
            self.inq.Signal((event, args, kwargs))
      negative_7:
        def post(self, request):
            return self.serviceHandler.addEvent(request.data)
      negative_8:
        def register_to_event(request):
            pass
      negative_9:
        def setFilterOnRule(request):
            logger = logging.getLogger(name)

            # Get some initial post values for processing.
            ruleIds = request.POST.getlist('id')
            sensors = request.POST.getlist('sensors')
            commentString = request.POST['comment']
            force = request.POST['force']
            response = []

            # If the ruleIds list is empty, it means a SID has been entered manually.
            if len(ruleIds) == 0:
                # Grab the value from the POST.
                ruleSID = request.POST['sid']

                # Match the GID:SID pattern, if its not there, throw exception.
                try:
                    matchPattern = r"(\d+):(\d+)"
                    pattern = re.compile(matchPattern)
                    result = pattern.match(ruleSID)

                    ruleGID = result.group(1)
                    ruleSID = result.group(2)
                except:
                    response.append({'response': 'invalidGIDSIDFormat', 'text': 'Please format in the GID:SID syntax.'})
                    logger.warning("Invalid GID:SID syntax provided: "+str(ruleSID)+".")
                    return HttpResponse(json.dumps(response))

                # Try to find a generator object with the GID supplied, if it does...
      negative_10:
        def store_event(self, violations):
            current_time = datetime.now().strftime("%Y/%m/%d %H:%M:%S")
            insert_query = """INSERT INTO social_distancing (Location, Local_Time, Violations) VALUES ('{}', '{}', {})""".format(self.location, current_time, violations)
            self.off_chain.insert(insert_query)

            event_id = self.off_chain.select("""SELECT LAST_INSERT_ID() FROM social_distancing""")[0][0]
            self.on_chain.store_hash(event_id, self.location, current_time, violations)
      negative_11:
        def test_post_event_on_schedule_page(self):
            json_data = {
                'title': 'Test Event',
                'start': '2017-8-8T12:00:00',
                'end': '2017-8-8T12:00:00',
                'group': '3'
            }

            response = self.app.post("/saveEvent", data=json.dumps(json_data),
                                     content_type='application/json')
            self.assertTrue(response.status_code, 200)
      negative_12:
        def _push(self, server):
            defns = [self.get_id(ident) for ident in list(self.ids)]
            #for ident in list(self.ids):
            #    defn = self.get_id(ident)
            if len(defns) == 0:
                return
            self.app.logger.info(f"Updating {server} with {len(defns)} records")
            url = f"{server}/add_record"
            try:
                resp = requests.post(url, json=defns)
            except Exception as e:
                self.app.logger.error(str(e))
                return
            if not resp.ok:
                self.app.logger.error(f"{resp.reason} {resp.content}")
                return
            self._server_updated[server] = True
      negative_13:
        def post(self, slug = None, eid = None):
            uid = self.request.form.get("uid")
            status = self.request.form.get("status") # can be join, maybe, notgoubg
            event = self.barcamp.get_event(eid)

            user = self.app.module_map.userbase.get_user_by_id(uid)

            reg = RegistrationService(self, user)
            try:
                status = reg.set_status(eid, status, force=True)
            except RegistrationError, e:
                print "a registration error occurred", e
                raise ProcessingError(str(e))
                return

            return {'status' : 'success', 'reload' : True}
      negative_14:
        def events(self):
      negative_15:
        def post(self):
            # we need a unique tx number so we can look these back up again
            # as well as for logging
            # FIXME: how can we guarantee uniqueness here?
            tx = int(time.time() * 100000) + random.randrange(10000, 99999)

            log.info("EVENTS [{}]: Creating events".format(tx))

            try:
                user = self.jbody["user"]
                if not EMAIL_REGEX.match(user):
                    user += "@" + self.domain
                event_type_id = self.jbody.get("eventTypeId", None)
                category = self.jbody.get("category", None)
                state = self.jbody.get("state", None)
                note = self.jbody.get("note", None)
            except KeyError as err:
                raise exc.BadRequest(
                    "Missing Required Argument: {}".format(err.message)
                )
            except ValueError as err:
                raise exc.BadRequest(err.message)

            if not event_type_id and (not category and not state):
                raise exc.BadRequest(
                ...
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "CachedMultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
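MatryoshkaLoss applies its inner loss at every listed prefix dimensionality and sums the weighted results, which is what makes the truncated embeddings usable. A toy numpy sketch of that mechanism around a plain in-batch multiple-negatives ranking loss (the real cached implementation lives in sentence_transformers.losses and differs in detail):

```python
import numpy as np

def mnr_loss(q, d, scale=20.0):
    """In-batch multiple negatives ranking loss: each query should score
    its own document (the diagonal) higher than everyone else's."""
    q = q / np.linalg.norm(q, axis=1, keepdims=True)
    d = d / np.linalg.norm(d, axis=1, keepdims=True)
    logits = scale * (q @ d.T)  # (batch, batch) cosine scores
    # cross-entropy with the diagonal as the target class
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

def matryoshka_loss(q, d, dims=(768, 512, 256, 128, 64), weights=None):
    """Weighted sum of the inner loss over each prefix of the embeddings."""
    weights = weights or [1.0] * len(dims)
    return sum(w * mnr_loss(q[:, :k], d[:, :k]) for k, w in zip(dims, weights))

rng = np.random.default_rng(0)
q = rng.random((8, 768))  # toy query embeddings
d = rng.random((8, 768))  # toy document embeddings
print(matryoshka_loss(q, d))
```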
    
cornstack_python_pairs


  • Dataset: cornstack_python_pairs
  • Size: 1,434,984 training samples
  • Columns: en_query, ru_query, and label
  • Approximate statistics based on the first 1000 samples:
    column    type    min       mean          max
    en_query  string  7 tokens  26.96 tokens  150 tokens
    ru_query  string  7 tokens  27.46 tokens  162 tokens
    label     float   1.0       1.0           1.0
  • Samples:
    en_query ru_query label
    set the message data business_id to a specific value установите значение business_id сообщения данных в конкретное значение 1.0
    Set ruleset state sid Установить состояние правил sid 1.0
    Post sid events to the ruleset Отправить события sid в ruleset 1.0
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "CoSENTLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
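CoSENTLoss, used here for the translation-pair dataset, penalizes every pair ordering that disagrees with the labels: loss = log(1 + Σ exp(scale · (cos_j − cos_i))) over all (i, j) with label_i > label_j, where scale 20 is the library default. A toy sketch of that formula; the labels below are varied only to show the mechanism:

```python
import numpy as np

def cosent_loss(cos_sims, labels, scale=20.0):
    """CoSENT: for every pair (i, j) where labels[i] > labels[j], the
    cosine similarity of pair i should exceed that of pair j.
    loss = log(1 + sum exp(scale * (cos_j - cos_i)))."""
    cos_sims = np.asarray(cos_sims, dtype=float)
    labels = np.asarray(labels, dtype=float)
    terms = []
    for i in range(len(labels)):
        for j in range(len(labels)):
            if labels[i] > labels[j]:
                terms.append(np.exp(scale * (cos_sims[j] - cos_sims[i])))
    return float(np.log1p(np.sum(terms)))

# Toy batch: similarities consistent with the labels give a small loss,
# an inverted ordering gives a much larger one.
print(cosent_loss([0.9, 0.8, 0.3], [1.0, 1.0, 0.0]))  # near 0
print(cosent_loss([0.3, 0.8, 0.9], [1.0, 1.0, 0.0]))  # much larger
```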
    
codesearchnet


  • Dataset: codesearchnet at 3f90200
  • Size: 1,880,853 training samples
  • Columns: ru_func_documentation_string and func_code_string
  • Approximate statistics based on the first 1000 samples:
    column                        type    min        mean           max
    ru_func_documentation_string  string  5 tokens   95.0 tokens    619 tokens
    func_code_string              string  62 tokens  522.56 tokens  8192 tokens
  • Samples:
    Sample 1
      ru_func_documentation_string: Мультипроцессинг-целевой объект для устройства очереди zmq
      func_code_string:
        def zmq_device(self):
            '''
            Multiprocessing target for the zmq queue device
            '''
            self.__setup_signals()
            salt.utils.process.appendproctitle('MWorkerQueue')
            self.context = zmq.Context(self.opts['worker_threads'])
            # Prepare the zeromq sockets
            self.uri = 'tcp://{interface}:{ret_port}'.format(**self.opts)
            self.clients = self.context.socket(zmq.ROUTER)
            if self.opts['ipv6'] is True and hasattr(zmq, 'IPV4ONLY'):
                # IPv6 sockets work for both IPv6 and IPv4 addresses
                self.clients.setsockopt(zmq.IPV4ONLY, 0)
            self.clients.setsockopt(zmq.BACKLOG, self.opts.get('zmq_backlog', 1000))
            self._start_zmq_monitor()
            self.workers = self.context.socket(zmq.DEALER)

            if self.opts.get('ipc_mode', '') == 'tcp':
                self.w_uri = 'tcp://127.0.0.1:{0}'.format(
                    self.opts.get('tcp_master_workers', 4515)
                )
            else:
                self.w_uri = 'ipc:...
    Sample 2
      ru_func_documentation_string: Чисто завершите работу сокета роутера
      func_code_string:
        def close(self):
            '''
            Cleanly shutdown the router socket
            '''
            if self._closing:
                return
            log.info('MWorkerQueue under PID %s is closing', os.getpid())
            self._closing = True
            # pylint: disable=E0203
            if getattr(self, '_monitor', None) is not None:
                self._monitor.stop()
                self._monitor = None
            if getattr(self, '_w_monitor', None) is not None:
                self._w_monitor.stop()
                self._w_monitor = None
            if hasattr(self, 'clients') and self.clients.closed is False:
                self.clients.close()
            if hasattr(self, 'workers') and self.workers.closed is False:
                self.workers.close()
            if hasattr(self, 'stream'):
                self.stream.close()
            if hasattr(self, '_socket') and self._socket.closed is False:
                self._socket.close()
            if hasattr(self, 'context') and self.context.closed is False:
                self.context.term()
    Sample 3
      ru_func_documentation_string:
        До форка нам нужно создать устройство zmq роутера

        :param func process_manager: Экземпляр класса salt.utils.process.ProcessManager
      func_code_string:
        def pre_fork(self, process_manager):
            '''
            Pre-fork we need to create the zmq router device

            :param func process_manager: An instance of salt.utils.process.ProcessManager
            '''
            salt.transport.mixins.auth.AESReqServerMixin.pre_fork(self, process_manager)
            process_manager.add_process(self.zmq_device)
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "CachedMultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
    
codesearchnet_pairs


  • Dataset: codesearchnet_pairs at 3f90200
  • Size: 940,426 training samples
  • Columns: en_func_documentation_string, ru_func_documentation_string, and label
  • Approximate statistics based on the first 1000 samples:
    column                        type    min       mean           max
    en_func_documentation_string  string  5 tokens  102.69 tokens  1485 tokens
    ru_func_documentation_string  string  5 tokens  95.0 tokens    619 tokens
    label                         float   1.0       1.0            1.0
  • Samples:
    Sample 1
      en_func_documentation_string: Multiprocessing target for the zmq queue device
      ru_func_documentation_string: Мультипроцессинг-целевой объект для устройства очереди zmq
      label: 1.0
    Sample 2
      en_func_documentation_string: Cleanly shutdown the router socket
      ru_func_documentation_string: Чисто завершите работу сокета роутера
      label: 1.0
    Sample 3
      en_func_documentation_string:
        Pre-fork we need to create the zmq router device

        :param func process_manager: An instance of salt.utils.process.ProcessManager
      ru_func_documentation_string:
        До форка нам нужно создать устройство zmq роутера

        :param func process_manager: Экземпляр класса salt.utils.process.ProcessManager
      label: 1.0
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "CoSENTLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
    
solyanka_qa


  • Dataset: solyanka_qa at deeac62
  • Size: 85,523 training samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
    column    type    min        mean           max
    anchor    string  19 tokens  202.49 tokens  518 tokens
    positive  string  16 tokens  196.36 tokens  524 tokens
  • Samples:
    Sample 1
      anchor:
        Как происходит взаимодействие нескольких языков программирования? Понятно, что большинство (если не все) крупные энтерпрайз сервисы, приложения и тд. (не только веб) написаны с использованием не одного языка программирования, а нескольких. И эти составные части, написанные на разных языках, как-то взаимодействуют между собой (фронт, бизнес-логика, еще что-то).
        Опыта разработки подобных систем у меня нет, поэтому не совсем могу представить, как это происходит. Подозреваю, что взаимодействие идет через независимые от языков средства. Например, нечто написанное на одном языке, шлет через TCP-IP пакет, который ловится и обрабатывается чем-то написанным на другом языке. Либо через HTTP запросы. Либо через запись/чтение из БД. Либо через файловый обмен, XML например.
        Хотелось бы, чтобы знающие люди привели пару примеров, как это обычно происходит. Не просто в двух словах, мол "фронт на яваскрипте, бэк на яве", а с техническими нюансами. Заранее спасибо.
      positive:
        Несколько языков могут сосуществовать как в рамках одного процесса, так и в рамках нескольких.
        Проще всего сосуществовать в рамках нескольких процессов: если процессы обмениваются данными, то совершенно всё равно (ну, в известных рамках), на каком языке эти данные были созданы, и какой язык их читает. Например, вы можете генерировать данные в виде HTML сервером на ASP.NET, а читать браузером, написанным на C++. (Да, пара из сервера и клиента — тоже взаимодействие языков.)
        Теперь, если мы хотим взаимодействие в рамках одного процесса, нам нужно уметь вызывать друг друга. Для этого нужен общий стандарт вызова. Часто таким общим стандартом являются бинарные соглашения C (extern "C", экспорт из DLL в Windows).
        Ещё пример общего стандарта — COM: COM-объекты можно писать на многих языках, так что если в языке есть часть, реализующая стандарт COM, он может вполне пользоваться им.
        Отдельная возможность, популярная сейчас — языки, компилирующиеся в общий промежуточный код. Например, Java и Sc...
    Sample 2
      anchor:
        Слэши и ковычки после использования stringify Есть подобный скрипт:
        [code]
        var output = {
            lol: [
                {name: "hahaha"}
            ]
        };
        console.log(output);
        output = JSON.stringify(output);
        console.log(output);
        [/code]
        в итоге получаем
        почему он вставил слэши и кавычки там, где не надо?
      positive:
        Может сразу сделать валидный JSON
        [code]
        var output = {
            lol: {name: "hahaha"}
        };
        console.log(output);
        output = JSON.stringify(output);
        console.log(output);
        [/code]
        Правда я незнаю что за переменная name
    Sample 3
      anchor:
        Оптимизация поиска числа в списке Есть функция. Она принимает число от 1 до 9 (мы ищем, есть ли оно в списке), и список, в котором мы его ищем)
        [code]
        def is_number_already_in(number, line):
            equality = False
            for i in line:
                if i == number:
                    equality = True
            if equality:
                return True
            else:
                return False
        [/code]
        Как можно этот код оптимизировать и как называется способ (тема) оптимизации, чтобы я мог загуглить
        Только не через лямбду, пожалуйста)
      positive:
        >
        [code]
        > if equality:
        >     return True
        > else:
        >     return False
        >
        [/code]
        [code]
        return equality
        [/code]
        >
        [code]
        > equality = False
        > for i in line:
        >     if i == number:
        >         equality = True
        >
        [/code]
        [code]
        equality = any(i == number for i in line)
        [/code]
        Всё целиком:
        [code]
        def is_number_already_in(number, line):
            return any(i == number for i in line)
        [/code]
        Хотя на самом деле вроде бы можно гораздо проще
        [code]
        def is_number_already_in(number, line):
            return number in line
        [/code]
        PS: Не проверял, но в любом случае идея должна быть понятна.
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "CachedMultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
    

Evaluation Datasets

codesearchnet


  • Dataset: codesearchnet at 3f90200
  • Size: 30,000 evaluation samples
  • Columns: ru_func_documentation_string and func_code_string
  • Approximate statistics based on the first 1000 samples:
    column                        type    min        mean           max
    ru_func_documentation_string  string  6 tokens   194.76 tokens  1278 tokens
    func_code_string              string  58 tokens  580.66 tokens  8192 tokens
  • Samples:
    Sample 1
      ru_func_documentation_string:
        Обучить модель deepq.

        Параметры
        -------
        env: gym.Env
            среда для обучения
        network: строка или функция
            нейронная сеть, используемая в качестве аппроксиматора функции Q. Если строка, она должна быть одной из имен зарегистрированных моделей в baselines.common.models (mlp, cnn, conv_only). Если функция, она должна принимать тензор наблюдения и возвращать тензор скрытой переменной, которая будет отображена в головы функции Q (см. build_q_func в baselines.deepq.models для деталей по этому поводу)
        seed: int или None
            seed генератора случайных чисел. Запуски с одинаковым seed "должны" давать одинаковые результаты. Если None, используется отсутствие семени.
        lr: float
            скорость обучения для оптимизатора Adam
        total_timesteps: int
            количество шагов среды для оптимизации
        buffer_size: int
            размер буфера воспроизведения
        exploration_fraction: float
            доля всего периода обучения, в течение которого прои...
      func_code_string:
        def learn(env,
                  network,
                  seed=None,
                  lr=5e-4,
                  total_timesteps=100000,
                  buffer_size=50000,
                  exploration_fraction=0.1,
                  exploration_final_eps=0.02,
                  train_freq=1,
                  batch_size=32,
                  print_freq=100,
                  checkpoint_freq=10000,
                  checkpoint_path=None,
                  learning_starts=1000,
                  gamma=1.0,
                  target_network_update_freq=500,
                  prioritized_replay=False,
                  prioritized_replay_alpha=0.6,
                  prioritized_replay_beta0=0.4,
                  prioritized_replay_beta_iters=None,
                  prioritized_replay_eps=1e-6,
                  param_noise=False,
                  callback=None,
                  load_path=None,
                  **network_kwargs
                  ):
            """Train a deepq model.

            Parameters
            -------
            env: gym.Env
                environment to train on
            network: string or a function
                neural network to use as a q function approximator. If string, has to be one of the ...
    Sample 2
      ru_func_documentation_string: Сохранить модель в pickle, расположенный по пути path
      func_code_string:
        def save_act(self, path=None):
            """Save model to a pickle located at path"""
            if path is None:
                path = os.path.join(logger.get_dir(), "model.pkl")

            with tempfile.TemporaryDirectory() as td:
                save_variables(os.path.join(td, "model"))
                arc_name = os.path.join(td, "packed.zip")
                with zipfile.ZipFile(arc_name, 'w') as zipf:
                    for root, dirs, files in os.walk(td):
                        for fname in files:
                            file_path = os.path.join(root, fname)
                            if file_path != arc_name:
                                zipf.write(file_path, os.path.relpath(file_path, td))
                with open(arc_name, "rb") as f:
                    model_data = f.read()
                with open(path, "wb") as f:
                    cloudpickle.dump((model_data, self._act_params), f)
    Sample 3
      ru_func_documentation_string: CNN из статьи Nature.
      func_code_string:
        def nature_cnn(unscaled_images, **conv_kwargs):
            """
            CNN from Nature paper.
            """
            scaled_images = tf.cast(unscaled_images, tf.float32) / 255.
            activ = tf.nn.relu
            h = activ(conv(scaled_images, 'c1', nf=32, rf=8, stride=4, init_scale=np.sqrt(2),
                           **conv_kwargs))
            h2 = activ(conv(h, 'c2', nf=64, rf=4, stride=2, init_scale=np.sqrt(2), **conv_kwargs))
            h3 = activ(conv(h2, 'c3', nf=64, rf=3, stride=1, init_scale=np.sqrt(2), **conv_kwargs))
            h3 = conv_to_fc(h3)
            return activ(fc(h3, 'fc1', nh=512, init_scale=np.sqrt(2)))
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "CachedMultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
    
codesearchnet_en


  • Dataset: codesearchnet_en at 3f90200
  • Size: 30,000 evaluation samples
  • Columns: en_func_documentation_string and func_code_string
  • Approximate statistics based on the first 1000 samples:
    column                        type    min        mean           max
    en_func_documentation_string  string  6 tokens   200.33 tokens  2498 tokens
    func_code_string              string  58 tokens  580.66 tokens  8192 tokens
  • Samples:
    en_func_documentation_string func_code_string
    Train a deepq model.

    Parameters
    -------
    env: gym.Env
    environment to train on
    network: string or a function
    neural network to use as a q function approximator. If string, has to be one of the names of registered models in baselines.common.models
    (mlp, cnn, conv_only). If a function, should take an observation tensor and return a latent variable tensor, which
    will be mapped to the Q function heads (see build_q_func in baselines.deepq.models for details on that)
    seed: int or None
    prng seed. The runs with the same seed "should" give the same results. If None, no seeding is used.
    lr: float
    learning rate for adam optimizer
    total_timesteps: int
    number of env steps to optimizer for
    buffer_size: int
    size of the replay buffer
    exploration_fraction: float
    fraction of entire training period over which the exploration rate is annealed
    exploration_final_eps: float
    final value of ra...
    def learn(env,
              network,
              seed=None,
              lr=5e-4,
              total_timesteps=100000,
              buffer_size=50000,
              exploration_fraction=0.1,
              exploration_final_eps=0.02,
              train_freq=1,
              batch_size=32,
              print_freq=100,
              checkpoint_freq=10000,
              checkpoint_path=None,
              learning_starts=1000,
              gamma=1.0,
              target_network_update_freq=500,
              prioritized_replay=False,
              prioritized_replay_alpha=0.6,
              prioritized_replay_beta0=0.4,
              prioritized_replay_beta_iters=None,
              prioritized_replay_eps=1e-6,
              param_noise=False,
              callback=None,
              load_path=None,
              **network_kwargs
              ):
        """Train a deepq model.

        Parameters
        -------
        env: gym.Env
            environment to train on
        network: string or a function
            neural network to use as a q function approximator. If string, has to be one of the ...
    Save model to a pickle located at path

    def save_act(self, path=None):
        """Save model to a pickle located at path"""
        if path is None:
            path = os.path.join(logger.get_dir(), "model.pkl")

        with tempfile.TemporaryDirectory() as td:
            save_variables(os.path.join(td, "model"))
            arc_name = os.path.join(td, "packed.zip")
            with zipfile.ZipFile(arc_name, 'w') as zipf:
                for root, dirs, files in os.walk(td):
                    for fname in files:
                        file_path = os.path.join(root, fname)
                        if file_path != arc_name:
                            zipf.write(file_path, os.path.relpath(file_path, td))
            with open(arc_name, "rb") as f:
                model_data = f.read()
            with open(path, "wb") as f:
                cloudpickle.dump((model_data, self._act_params), f)
    CNN from Nature paper.

    def nature_cnn(unscaled_images, **conv_kwargs):
        """
        CNN from Nature paper.
        """
        scaled_images = tf.cast(unscaled_images, tf.float32) / 255.
        activ = tf.nn.relu
        h = activ(conv(scaled_images, 'c1', nf=32, rf=8, stride=4, init_scale=np.sqrt(2),
                       **conv_kwargs))
        h2 = activ(conv(h, 'c2', nf=64, rf=4, stride=2, init_scale=np.sqrt(2), **conv_kwargs))
        h3 = activ(conv(h2, 'c3', nf=64, rf=3, stride=1, init_scale=np.sqrt(2), **conv_kwargs))
        h3 = conv_to_fc(h3)
        return activ(fc(h3, 'fc1', nh=512, init_scale=np.sqrt(2)))
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "CachedMultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
    
codesearchnet_pairs

  • Dataset: codesearchnet_pairs at 3f90200
  • Size: 30,000 evaluation samples
  • Columns: en_func_documentation_string, ru_func_documentation_string, and label
  • Approximate statistics based on the first 1000 samples:
                en_func_documentation_string   ru_func_documentation_string   label
    type        string                         string                         float
    details     min: 6 tokens                  min: 6 tokens                  min: 1.0
                mean: 200.33 tokens            mean: 194.76 tokens            mean: 1.0
                max: 2498 tokens               max: 1278 tokens               max: 1.0
  • Samples:
    en_func_documentation_string ru_func_documentation_string label
    Train a deepq model.

    Parameters
    -------
    env: gym.Env
    environment to train on
    network: string or a function
    neural network to use as a q function approximator. If string, has to be one of the names of registered models in baselines.common.models
    (mlp, cnn, conv_only). If a function, should take an observation tensor and return a latent variable tensor, which
    will be mapped to the Q function heads (see build_q_func in baselines.deepq.models for details on that)
    seed: int or None
    prng seed. The runs with the same seed "should" give the same results. If None, no seeding is used.
    lr: float
    learning rate for adam optimizer
    total_timesteps: int
    number of env steps to optimizer for
    buffer_size: int
    size of the replay buffer
    exploration_fraction: float
    fraction of entire training period over which the exploration rate is annealed
    exploration_final_eps: float
    final value of ra...
    Обучить модель deepq.

    Параметры
    -------
    env: gym.Env
    среда для обучения
    network: строка или функция
    нейронная сеть, используемая в качестве аппроксиматора функции Q. Если строка, она должна быть одной из имен зарегистрированных моделей в baselines.common.models
    (mlp, cnn, conv_only). Если функция, она должна принимать тензор наблюдения и возвращать тензор скрытой переменной, которая
    будет отображена в головы функции Q (см. build_q_func в baselines.deepq.models для деталей по этому поводу)
    seed: int или None
    seed генератора случайных чисел. Запуски с одинаковым seed "должны" давать одинаковые результаты. Если None, используется отсутствие семени.
    lr: float
    скорость обучения для оптимизатора Adam
    total_timesteps: int
    количество шагов среды для оптимизации
    buffer_size: int
    размер буфера воспроизведения
    exploration_fraction: float
    доля всего периода обучения, в течение которого прои...
    1.0
    Save model to a pickle located at path · Сохранить модель в pickle, расположенный по пути path · 1.0
    CNN from Nature paper. · CNN из статьи Nature. · 1.0
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "CoSENTLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
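CoSENTLoss ranks pairs rather than scoring them independently: whenever one pair's label is higher than another's, its cosine similarity is pushed to be higher too. A minimal pure-Python sketch of that objective (scale 20, the sentence-transformers default; note the evaluation split above happens to contain only label 1.0, so the toy batch below uses mixed labels to show the ranking behaviour):

```python
import math

def cosent_loss(cos_sims, labels, scale=20.0):
    """log(1 + sum over pairs with label_i > label_j of exp(scale * (sim_j - sim_i)))."""
    terms = []
    for si, yi in zip(cos_sims, labels):
        for sj, yj in zip(cos_sims, labels):
            if yi > yj:
                terms.append(math.exp(scale * (sj - si)))
    return math.log(1.0 + sum(terms))

# A correctly ranked batch (higher label -> higher cosine) yields a near-zero loss...
good = cosent_loss([0.9, 0.2], [1.0, 0.0])
# ...while an inverted ranking is penalized heavily.
bad = cosent_loss([0.2, 0.9], [1.0, 0.0])
```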
    
solyanka_qa

  • Dataset: solyanka_qa at deeac62
  • Size: 5,000 evaluation samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
                anchor                positive
    type        string                string
    details     min: 17 tokens        min: 19 tokens
                mean: 200.35 tokens   mean: 202.53 tokens
                max: 533 tokens       max: 525 tokens
  • Samples:
    anchor positive
    Atom IDE произвольное изменение строк Пользуюсь Atom IDE, установлены плагины для GIT'а, использую тему Material theme (может быть кому то это что то даст), в общем проблема такая, что в php файлах при сохранении файла, даже если я изменил всего один символ, он добавляет изменения очень странные,берет 2-3 строки (хз как выбирает) и удаляет их, а потом вставялет их же, без каких то либо изменений. При этом GIT фиксирует это изменение...
    Вот скрин в blob формате: "blob:https://web.telegram.org/04094604-204d-47b0-a083-f8cd090bdfa0"
    Проблема заключалась в том, что все IDE испльзуют свой символ перехода на следующую строку, если в команде разработчики используют разные IDE, у которых разный перенос строки, то при сохранении файла чужие переносы строк будут заменяться на свои :)
    print() с частью текста и форматированием как переменная Python3 Есть повторяющаяся функция print('\n' + f'{" ЗАПУСКАЕМ ТЕСТ ":=^120}' + '\n')
    на выходе получаем чтото типа
    ================ ЗАПУСКАЕМ ТЕСТ ================
    или с другим текстом
    ================= КОНЕЦ ТЕСТА ==================
    Текст внутри может меняться, форматирование - нет.
    Как обернуть print('\n' + f'{"":=^120}' + '\n') в переменную, с возможностью подставлять нужный текст, типа print_var('ПРИМЕР ТЕКСТА')?
    [code]
    def print_var(str):
    print(f'\n{" " + str + " ":=^120}\n')
    [/code]
    В результате:
    [code]
    >>> print_var('КАКОЙ_ТО ТЕКСТ')
    ===================================================== КАКОЙ_ТО ТЕКСТ =====================================================
    [/code]
    Не получается перегрузить оператор присваивания в шаблонном классе Нужно перегрузить оператор присваивания в шаблонном классе, не могу понять, почему не работает стандартный синтаксис, при реализации выдает эту ошибку (/home/anton/Programming/tree/tree.h:96: ошибка: overloaded 'operator=' must be a binary operator (has 1 parameter)). Объявление и реализация в одном .h файле.
    Объявление:
    [code]
    tree& operator = (tree &other);
    [/code]
    реалицация:
    [code]
    template
    tree& operator = (tree &other)
    {
    }
    [/code]
    Ну надо указать, какому классу он принадлежит... А так вы пытались реализовать унарный оператор =...
    [code]
    template
    tree& tree::operator = (tree &other)
    {
    }
    [/code]
    И еще - вы точно планируете при присваивании менять присваиваемое? Может, лучше
    [code]
    template
    tree& tree::operator = (const tree &other)
    {
    }
    [/code]
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "CachedMultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
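Because every loss above is wrapped in MatryoshkaLoss over dims [768, 512, 256, 128, 64], the leading components of an embedding remain useful on their own: at inference you can keep only the first k dimensions and re-normalize. The sketch below shows the mechanics with hypothetical 8-dimensional vectors standing in for the model's 768-dimensional outputs:

```python
import math

def truncate_normalize(vec, dim):
    """Keep the first `dim` Matryoshka components and re-normalize to unit length."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

def cos(a, b):
    """Dot product; equals cosine similarity for unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))

# Toy 8-dim "full" embeddings standing in for 768-dim model outputs.
full_a = [0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05, 0.05]
full_b = [0.5, 0.6, 0.3, 0.4, 0.1, 0.2, 0.05, 0.05]

# Truncate to the first 4 Matryoshka dimensions and compare.
a4, b4 = truncate_normalize(full_a, 4), truncate_normalize(full_b, 4)
sim = cos(a4, b4)
```

With the real model, sentence-transformers exposes the same idea via the `truncate_dim` argument (e.g. `SentenceTransformer("fyaronskiy/code_retriever_ru_en", truncate_dim=256)`), trading a small amount of retrieval quality for smaller, faster vectors.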
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 4
  • per_device_eval_batch_size: 16
  • gradient_accumulation_steps: 32
  • learning_rate: 2e-05
  • num_train_epochs: 2
  • warmup_ratio: 0.1
  • bf16: True
  • resume_from_checkpoint: ../models/RuModernBERT-base_bs128_lr_2e-05_2nd_epoch/checkpoint-27400
  • auto_find_batch_size: True
  • batch_sampler: no_duplicates
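The effective batch size follows from per_device_train_batch_size × gradient_accumulation_steps, which is consistent with the `bs128` in the checkpoint path above. A quick sanity check (assuming a single-GPU run, as suggested by local_rank 0 and no DDP backend being set):

```python
per_device_train_batch_size = 4
gradient_accumulation_steps = 32
n_gpus = 1  # assumption: single-GPU training run

# Optimizer steps see this many examples per update.
effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps * n_gpus
```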

All Hyperparameters

Click to expand
  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 4
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 32
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 2
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: ../models/RuModernBERT-base_bs128_lr_2e-05_2nd_epoch/checkpoint-27400
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: True
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Click to expand
Epoch   Step    Training Loss   codesearchnet loss   codesearchnet_en loss   codesearchnet_pairs loss   solyanka_qa loss   cosine_ndcg@10
0.4899 27600 50.3428 - - - - -
0.4934 27800 50.6395 - - - - -
0.4970 28000 49.216 0.1459 0.1558 0.0 0.2914 0.9234
0.5005 28200 49.4888 - - - - -
0.5041 28400 49.2362 - - - - -
0.5076 28600 50.2319 - - - - -
0.5112 28800 48.3359 - - - - -
0.5147 29000 49.5835 - - - - -
0.5183 29200 50.2161 - - - - -
0.5218 29400 49.6727 - - - - -
0.5254 29600 48.9899 - - - - -
0.5289 29800 48.823 - - - - -
0.5325 30000 49.1113 0.1460 0.1557 0.0 0.2914 0.9234
0.5360 30200 48.4824 - - - - -
0.5396 30400 48.8933 - - - - -
0.5431 30600 49.2607 - - - - -
0.5467 30800 49.1046 - - - - -
0.5502 31000 48.6113 - - - - -
0.5538 31200 49.6846 - - - - -
0.5573 31400 50.1739 - - - - -
0.5609 31600 49.4535 - - - - -
0.5644 31800 48.6802 - - - - -
0.5680 32000 49.9103 0.1458 0.1554 0.0 0.2909 0.9234
0.5715 32200 50.5902 - - - - -
0.5751 32400 49.3202 - - - - -
0.5786 32600 49.0049 - - - - -
0.5822 32800 49.7297 - - - - -
0.5857 33000 49.4004 - - - - -
0.5893 33200 48.5135 - - - - -
0.5928 33400 48.2331 - - - - -
0.5964 33600 50.3588 - - - - -
0.5999 33800 48.3158 - - - - -
0.6035 34000 49.6962 0.1456 0.1553 0.0 0.2908 0.9234
0.6070 34200 49.8121 - - - - -
0.6106 34400 50.8481 - - - - -
0.6141 34600 50.0363 - - - - -
0.6177 34800 49.5676 - - - - -
0.6212 35000 47.5664 - - - - -
0.6248 35200 48.5752 - - - - -
0.6283 35400 49.4174 - - - - -
0.6319 35600 48.8215 - - - - -
0.6354 35800 49.9745 - - - - -
0.6390 36000 47.8552 0.1456 0.1551 0.0 0.2906 0.9234
0.6425 36200 50.2583 - - - - -
0.6461 36400 48.5441 - - - - -
0.6496 36600 48.7192 - - - - -
0.6532 36800 49.947 - - - - -
0.6567 37000 48.6255 - - - - -
0.6603 37200 48.0433 - - - - -
0.6638 37400 49.5333 - - - - -
0.6674 37600 48.8394 - - - - -
0.6709 37800 48.6463 - - - - -
0.6745 38000 49.3688 0.1456 0.1551 0.0 0.2913 0.9234
0.6780 38200 49.4758 - - - - -
0.6816 38400 50.0071 - - - - -
0.6851 38600 49.9054 - - - - -
0.6887 38800 49.9274 - - - - -
0.6922 39000 47.5942 - - - - -
0.6958 39200 49.409 - - - - -
0.6993 39400 49.6438 - - - - -
0.7029 39600 49.4253 - - - - -
0.7064 39800 49.1187 - - - - -
0.7100 40000 49.2283 0.1455 0.1551 0.0 0.2910 0.9235
0.7135 40200 51.0079 - - - - -
0.7171 40400 48.4275 - - - - -
0.7206 40600 48.6685 - - - - -
0.7242 40800 48.7769 - - - - -
0.7277 41000 49.712 - - - - -
0.7312 41200 49.0523 - - - - -
0.7348 41400 49.6381 - - - - -
0.7383 41600 49.7758 - - - - -
0.7419 41800 51.02 - - - - -
0.7454 42000 49.738 0.1454 0.1550 0.0 0.2914 0.9235
0.7490 42200 48.4278 - - - - -
0.7525 42400 48.2776 - - - - -
0.7561 42600 50.1085 - - - - -
0.7596 42800 49.6109 - - - - -
0.7632 43000 50.1112 - - - - -
0.7667 43200 48.3173 - - - - -
0.7703 43400 49.4717 - - - - -
0.7738 43600 50.4256 - - - - -
0.7774 43800 51.3672 - - - - -
0.7809 44000 49.5019 0.1455 0.1550 0.0 0.2913 0.9234
0.7845 44200 49.9114 - - - - -
0.7880 44400 48.8164 - - - - -
0.7916 44600 48.4947 - - - - -
0.7951 44800 48.6371 - - - - -
0.7987 45000 49.3439 - - - - -
0.8022 45200 48.8964 - - - - -
0.8058 45400 48.7946 - - - - -
0.8093 45600 48.6259 - - - - -
0.8129 45800 49.4873 - - - - -
0.8164 46000 49.6979 0.1454 0.1550 0.0 0.2914 0.9234
0.8200 46200 48.246 - - - - -
0.8235 46400 49.1022 - - - - -
0.8271 46600 49.18 - - - - -
0.8306 46800 48.8027 - - - - -
0.8342 47000 48.7197 - - - - -
0.8377 47200 47.9643 - - - - -
0.8413 47400 50.829 - - - - -
0.8448 47600 50.3984 - - - - -
0.8484 47800 48.848 - - - - -
0.8519 48000 50.6701 0.1453 0.1548 0.0 0.2908 0.9235
0.8555 48200 49.9972 - - - - -
0.8590 48400 48.1245 - - - - -
0.8626 48600 49.4942 - - - - -
0.8661 48800 48.1227 - - - - -
0.8697 49000 48.9811 - - - - -
0.8732 49200 49.4753 - - - - -
0.8768 49400 49.2714 - - - - -
0.8803 49600 49.166 - - - - -
0.8839 49800 49.0925 - - - - -
0.8874 50000 48.4746 0.1453 0.1549 0.0 0.2910 0.9234
0.8910 50200 49.0912 - - - - -
0.8945 50400 49.6571 - - - - -
0.8981 50600 50.9175 - - - - -
0.9016 50800 51.2218 - - - - -
0.9052 51000 47.8553 - - - - -
0.9087 51200 48.6819 - - - - -
0.9123 51400 49.6197 - - - - -
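The cosine_ndcg@10 column is a retrieval metric: candidates are ranked by cosine similarity, each relevant hit is discounted by its rank position, and the sum is normalized by the ideal ranking. A minimal sketch for binary relevance (one relevant code snippet per query, as in code search):

```python
import math

def ndcg_at_k(ranked_relevance, k=10):
    """NDCG@k for binary relevance: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(rel / math.log2(pos + 2) for pos, rel in enumerate(ranked_relevance[:k]))
    ideal = sorted(ranked_relevance, reverse=True)
    idcg = sum(rel / math.log2(pos + 2) for pos, rel in enumerate(ideal[:k]))
    return dcg / idcg if idcg > 0 else 0.0

# Retrieving the single relevant snippet at rank 1 scores 1.0;
# at rank 2 it is discounted to 1/log2(3).
perfect = ndcg_at_k([1, 0, 0, 0])
second = ndcg_at_k([0, 1, 0, 0])
```

The ~0.923 values logged above therefore mean the correct document usually lands at or very near the top of the ten retrieved candidates.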

Framework Versions

  • Python: 3.10.11
  • Sentence Transformers: 5.1.2
  • Transformers: 4.52.3
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.12.0
  • Datasets: 4.0.0
  • Tokenizers: 0.21.4

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

CachedMultipleNegativesRankingLoss

@misc{gao2021scaling,
    title={Scaling Deep Contrastive Learning Batch Size under Memory Limited Setup},
    author={Luyu Gao and Yunyi Zhang and Jiawei Han and Jamie Callan},
    year={2021},
    eprint={2101.06983},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

CoSENTLoss

@article{10531646,
    author={Huang, Xiang and Peng, Hao and Zou, Dongcheng and Liu, Zhiwei and Li, Jianxin and Liu, Kay and Wu, Jia and Su, Jianlin and Yu, Philip S.},
    journal={IEEE/ACM Transactions on Audio, Speech, and Language Processing},
    title={CoSENT: Consistent Sentence Embedding via Similarity Ranking},
    year={2024},
    doi={10.1109/TASLP.2024.3402087}
}
Downloads last month: 16
Model size: 0.1B params (Safetensors, BF16)