miriad/miriad-4.4M
How to use yasserrmd/pharma-gemma-300m-emb with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("yasserrmd/pharma-gemma-300m-emb")
sentences = [
"How can training influence nurses' awareness of medication errors?\n",
"Treatment with obeticholic acid resulted in significant reductions in serum alanine aminotransferase (ALT) and aspartate aminotransferase (AST) concentrations over the first 36 weeks of treatment, and these reductions were sustained for the duration of treatment. However, serum alkaline phosphatase concentrations increased with obeticholic acid treatment, although γ-glutamyl transpeptidase concentrations (another indicator of cholestasis) decreased. These changes in liver enzyme concentrations reversed after obeticholic acid was stopped, and at 24 weeks after treatment discontinuation, there were no significant differences between the obeticholic acid group and the placebo group.",
"Training can increase nurses' awareness of medication errors by providing them with the knowledge and skills necessary to identify and prevent errors. Through training, nurses can learn about medication safety protocols, proper medication administration techniques, and the importance of error reporting. This increased awareness can help nurses recognize potential errors and take appropriate actions to prevent harm to patients.",
"ML171, also known as 2-acetylphenothiazine, has been identified as a specific NOX1 oxidase inhibitor at nanomolar concentrations. It shows minimal activity on other cellular ROS-producing sources, including xanthine oxidase and other NADPH oxidases. ML171 targets the NOX1 catalytic subunit without affecting its cytosolic regulators, such as the NOXO1, NOXA1, or RAC1 subunits. It effectively blocks NOX1 oxidase-dependent ROS-mediated formation of extracellular matrix-degrading invadopodia in colon cancer cells."
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]

This is a sentence-transformers model finetuned from google/embeddinggemma-300m. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
  (0): Transformer({'max_seq_length': 2048, 'do_lower_case': False, 'architecture': 'Gemma3TextModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Dense({'in_features': 768, 'out_features': 3072, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
  (3): Dense({'in_features': 3072, 'out_features': 768, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
  (4): Normalize()
)
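The module list above can be read as a pipeline: token vectors are mean-pooled, passed through two bias-free linear layers with identity activation, and L2-normalized. The following is a minimal pure-Python sketch of that data flow, not the library's implementation. The toy sizes (2 -> 4 -> 2 instead of 768 -> 3072 -> 768), the weight values, and the helper names `mean_pool`, `dense_no_bias`, and `l2_normalize` are all illustrative assumptions.

```python
import math

def mean_pool(token_vectors, attention_mask):
    """Average only the token vectors whose mask entry is 1 (mean-token pooling)."""
    kept = [v for v, m in zip(token_vectors, attention_mask) if m]
    dim = len(token_vectors[0])
    return [sum(v[i] for v in kept) / len(kept) for i in range(dim)]

def dense_no_bias(weights, x):
    """Bias-free linear layer with identity activation: y = W x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def l2_normalize(x):
    """Scale the vector to unit Euclidean length."""
    norm = math.sqrt(sum(xi * xi for xi in x))
    return [xi / norm for xi in x]

# Hypothetical toy inputs: three token vectors, the last one is padding.
token_vectors = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
attention_mask = [1, 1, 0]

# Hypothetical weights standing in for the 768 -> 3072 and 3072 -> 768 layers.
w_up = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
w_down = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]

pooled = mean_pool(token_vectors, attention_mask)   # padding is ignored
hidden = dense_no_bias(w_up, pooled)                 # expand
projected = dense_no_bias(w_down, hidden)            # project back
embedding = l2_normalize(projected)                  # unit-length output
```

Because the final module normalizes every embedding to unit length, the dot product of two embeddings equals their cosine similarity.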
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("yasserrmd/pharma-gemma-300m-emb")
# Run inference
queries = [
"What is the purpose of dabigatran and why was it prescribed to the 71-year-old female?",
]
documents = [
'Dabigatran is prescribed for stroke prevention in patients with atrial fibrillation. Atrial fibrillation increases the risk of blood clots forming in the heart, which can then travel to the brain and cause a stroke. Dabigatran is an anticoagulant that helps prevent the formation of blood clots, reducing the risk of stroke in patients with atrial fibrillation.',
'G-protein coupled receptors, like OXTR and MOR, can form homo- or hetero-dimers, which means they can associate with another molecule of the same receptor or with receptors from other families. This physical association has been shown to modulate receptor binding and function. For example, in MOR-alpha2A-adrenergic receptor dimers, the activation of MOR by morphine inhibits the adjacent alpha2A-receptor by blocking its ability to activate the G-proteins, even in the presence of noradrenaline.',
'Melatonin agonists may have side effects such as nausea, headache, elevated liver enzyme levels, rebound insomnia, withdrawal symptoms, and addiction. Contraindications include liver failure, renal failure, alcohol addiction, and high lipid levels.',
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# [1, 768] [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[ 0.4430, 0.0378, -0.0539]])
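The similarity scores above can be turned into a ranking by sorting documents from highest to lowest score. Here is a small sketch of that step, assuming the scores are cosine similarities (the Sentence Transformers default; this model card does not state the configured function). The toy vectors and the helper names `cosine` and `rank_documents` are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_documents(query_vec, doc_vecs):
    """Return (document index, score) pairs sorted by descending similarity."""
    scores = [cosine(query_vec, d) for d in doc_vecs]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in order]

# Hypothetical 2-dimensional stand-ins for real 768-dimensional embeddings.
query_vec = [1.0, 0.0]
doc_vecs = [[0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
ranking = rank_documents(query_vec, doc_vecs)

# Applying the same idea to the scores printed in the example above:
reported_scores = [0.4430, 0.0378, -0.0539]
best_index = max(range(len(reported_scores)), key=lambda i: reported_scores[i])
```

With the reported scores, `best_index` points at the first document (the dabigatran passage), which is the one that answers the query.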
Columns: sentence_0 and sentence_1

| | sentence_0 | sentence_1 |
|---|---|---|
| type | string | string |

Samples:

| sentence_0 | sentence_1 |
|---|---|
| How does ticlopidine differ from clopidogrel in terms of side effects and precautions? | Unlike clopidogrel, ticlopidine can lead to neutropenia in up to 1% of patients, which limits its widespread use. Regular blood count checks are necessary in the initial weeks of ticlopidine treatment. Additionally, neuraxial regional anesthesia should not be performed until 10 days have elapsed since the last ingestion of ticlopidine. |
| What are the different types of ligands that can bind to GPCRs? | GPCRs can bind a wide variety of endogenous ligands, including neuropeptides, amino acids, ions, hormones, chemokines, lipid-derived mediators, and ions. Some GPCRs are considered orphan receptors because their exact ligands have not been identified yet. |
| How does etomidate function as an adrenostatic agent and what are its effects on cortisol secretion? | Etomidate acts as an adrenostatic agent by blocking the cytochrome P450-dependent adrenal enzymes 11β-hydroxylase and cholesterol-side-chain cleavage enzyme. This inhibition leads to a decrease in cortisol secretion. In dispersed guinea-pig adrenal cells, etomidate has been shown to be the most potent adrenostatic drug available, with a mean concentration of 97 nmol/l required for 50% inhibition of cortisol secretion. This concentration is considerably lower than the plasma concentration needed to induce sedation. After a single induction dose of etomidate, the adrenocortical blockade lasts several hours while the hypnotic action of etomidate rapidly fades. |
MultipleNegativesRankingLoss with these parameters:
{
    "scale": 20.0,
    "similarity_fct": "cos_sim",
    "gather_across_devices": false
}
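To illustrate what these parameters mean, the sketch below computes an in-batch-negatives ranking loss in pure Python: each anchor (`sentence_0`) is scored against every positive (`sentence_1`) in the batch with cosine similarity, the scores are multiplied by `scale`, and cross-entropy is taken with the matching pair as the correct class. It is a simplified illustration under those assumptions, not the library's code; the function names and toy vectors are hypothetical.

```python
import math

def cos_sim(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def in_batch_ranking_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the correct match for anchors[i]
    and every other positive in the batch acts as a negative."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        # Numerically stable log-sum-exp.
        top = max(logits)
        log_sum_exp = top + math.log(sum(math.exp(l - top) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

# Hypothetical toy batch of two pairs.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned_positives = [[1.0, 0.0], [0.0, 1.0]]
swapped_positives = [[0.0, 1.0], [1.0, 0.0]]

loss_aligned = in_batch_ranking_loss(anchors, aligned_positives)
loss_swapped = in_batch_ranking_loss(anchors, swapped_positives)
```

The aligned batch yields a loss close to zero and the swapped batch a loss close to the scale value, which shows how a larger `scale` sharpens the penalty for ranking a wrong passage above the right one.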
Non-default hyperparameters:

- per_device_train_batch_size: 4
- per_device_eval_batch_size: 4
- num_train_epochs: 1
- multi_dataset_batch_sampler: round_robin

All hyperparameters:

- overwrite_output_dir: False
- do_predict: False
- eval_strategy: no
- prediction_loss_only: True
- per_device_train_batch_size: 4
- per_device_eval_batch_size: 4
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
- router_mapping: {}
- learning_rate_mapping: {}

| Epoch | Step | Training Loss |
|---|---|---|
| 0.1 | 500 | 0.0134 |
| 0.2 | 1000 | 0.009 |
| 0.3 | 1500 | 0.0138 |
| 0.4 | 2000 | 0.0052 |
| 0.5 | 2500 | 0.0154 |
| 0.6 | 3000 | 0.0076 |
| 0.7 | 3500 | 0.0062 |
| 0.8 | 4000 | 0.0021 |
| 0.9 | 4500 | 0.0028 |
| 1.0 | 5000 | 0.0015 |
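The training log and the listed hyperparameters allow a rough back-of-the-envelope check of the run. The sketch below assumes a single training device (the card does not state the device count) and takes the total step count of 5000 from the last log row. The `linear_lr` helper is a hypothetical re-statement of a linear decay schedule with optional warmup, written only for illustration.

```python
def linear_lr(step, base_lr=5e-05, total_steps=5000, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# Values taken from the hyperparameter list and the training log.
per_device_train_batch_size = 4
gradient_accumulation_steps = 1
total_steps = 5000          # step at which the log reports epoch 1.0
assumed_num_devices = 1     # assumption: not stated in the card

pairs_per_step = (per_device_train_batch_size
                  * gradient_accumulation_steps
                  * assumed_num_devices)
pairs_seen = total_steps * pairs_per_step

lr_at_start = linear_lr(0)
lr_at_midpoint = linear_lr(2500)
lr_at_end = linear_lr(5000)
```

Under the single-device assumption, one epoch corresponds to 20,000 sentence pairs, and the learning rate falls linearly from 5e-05 at step 0 to zero at step 5000.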
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
Base model
google/embeddinggemma-300m