Instructions to use KedarPN/GrantsLLM with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use KedarPN/GrantsLLM with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="KedarPN/GrantsLLM")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("KedarPN/GrantsLLM")
model = AutoModelForCausalLM.from_pretrained("KedarPN/GrantsLLM")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

llama-cpp-python

How to use KedarPN/GrantsLLM with llama-cpp-python:

# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="KedarPN/GrantsLLM",
	filename="unsloth.Q4_K_M.gguf",
)

llm.create_chat_completion(
	messages = [
		{
			"role": "user",
			"content": "What is the capital of France?"
		}
	]
)

Local Apps

llama.cpp

How to use KedarPN/GrantsLLM with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf KedarPN/GrantsLLM:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf KedarPN/GrantsLLM:Q4_K_M

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf KedarPN/GrantsLLM:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf KedarPN/GrantsLLM:Q4_K_M

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf KedarPN/GrantsLLM:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf KedarPN/GrantsLLM:Q4_K_M

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf KedarPN/GrantsLLM:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf KedarPN/GrantsLLM:Q4_K_M

Use Docker

docker model run hf.co/KedarPN/GrantsLLM:Q4_K_M


vLLM

How to use KedarPN/GrantsLLM with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "KedarPN/GrantsLLM"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "KedarPN/GrantsLLM",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
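The same request can be issued from Python. The following is a minimal sketch using only the standard library; the helper name `build_chat_request` is illustrative, and the URL and model name are taken from the curl example above. It assumes the vLLM server is already running on port 8000.

```python
import json
import urllib.request


def build_chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a POST request for an OpenAI-compatible /v1/chat/completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_chat_request(
    "http://localhost:8000", "KedarPN/GrantsLLM", "What is the capital of France?"
)

# Sending the request needs a running server, so it is left commented out:
# with urllib.request.urlopen(request) as response:
#     reply = json.loads(response.read())
#     print(reply["choices"][0]["message"]["content"])
```

Changing the base URL to `http://localhost:30000` or `http://localhost:8080` should target the SGLang or llama.cpp servers described elsewhere on this page, since they expose the same OpenAI-compatible route.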

Use Docker

docker model run hf.co/KedarPN/GrantsLLM:Q4_K_M

SGLang

How to use KedarPN/GrantsLLM with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "KedarPN/GrantsLLM" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "KedarPN/GrantsLLM",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "KedarPN/GrantsLLM" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "KedarPN/GrantsLLM",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Ollama
How to use KedarPN/GrantsLLM with Ollama:
```
ollama run hf.co/KedarPN/GrantsLLM:Q4_K_M
```

Unsloth Studio

How to use KedarPN/GrantsLLM with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for KedarPN/GrantsLLM to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for KedarPN/GrantsLLM to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for KedarPN/GrantsLLM to start chatting

Pi

How to use KedarPN/GrantsLLM with Pi:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf KedarPN/GrantsLLM:Q4_K_M

Configure the model in Pi

# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {
          "id": "KedarPN/GrantsLLM:Q4_K_M"
        }
      ]
    }
  }
}

Run Pi

# Start Pi in your project directory:
pi

Hermes Agent

How to use KedarPN/GrantsLLM with Hermes Agent:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf KedarPN/GrantsLLM:Q4_K_M

Configure Hermes

# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default KedarPN/GrantsLLM:Q4_K_M

Run Hermes

hermes

Docker Model Runner
How to use KedarPN/GrantsLLM with Docker Model Runner:
```
docker model run hf.co/KedarPN/GrantsLLM:Q4_K_M
```

Lemonade

How to use KedarPN/GrantsLLM with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull KedarPN/GrantsLLM:Q4_K_M

Run and chat with the model

lemonade run user.GrantsLLM-Q4_K_M

List all available models

lemonade list



	---
	license: cc-by-4.0
	language:
	- en
	library_name: transformers
	tags:
	- grant-writing
	- research
	- STEM
	- biotech
	- fine-tuned
	- Qwen
	- text-generation
	- academic-writing
	- proposal-writing
	base_model:
	- Qwen/Qwen3-4B
	datasets:
	- custom
	pipeline_tag: text-generation
	widget:
	- text: >-
	    Write a Specific Aims section for an NIH R03 grant on developing
	    CRISPR-based therapeutics for rare genetic disorders. Include 2 aims.
	  example_title: Generate Specific Aims
	- text: >-
	    Draft a Significance and Innovation section for an NSF grant on machine
	    learning applications in protein structure prediction.
	  example_title: Generate Significance
	- text: >-
	    Review the following grant aims and provide feedback: Aim 1: Develop a
	    novel CRISPR delivery system. Aim 2: Test efficacy in animal models.
	  example_title: Review Grant Section
	model-index:
	- name: GrantsLLM
	  results: []
	---
	# GrantsLLM

	[![License: CC BY 4.0](https://img.shields.io/badge/License-CC%20BY%204.0-lightgrey.svg)](https://creativecommons.org/licenses/by/4.0/)
	[![Base Model](https://img.shields.io/badge/Base-Qwen3%204B-blue)](https://huggingface.co/Qwen/Qwen3-4B)

	A specialized language model for STEM research grant writing and review

	Developed by [Evionex](https://evionex.com) | Created by Kedar P. Navsariwala

	---

	## Model Description

	GrantsLLM is a domain-specialized language model fine-tuned on 78 STEM research grant applications to assist researchers in drafting, refining, and reviewing grant proposals. Built on Qwen3-4B, this model has been trained to understand the structure, terminology, and writing style of successful research grants across NIH, NSF, and similar funding mechanisms.

	- Developed by: Kedar P. Navsariwala, CTO & Co-Founder at Evionex
	- Model type: Causal Language Model (Decoder-only Transformer)
	- Language(s): English
	- License: CC BY 4.0 (requires attribution)
	- Finetuned from: Qwen/Qwen3-4B

	---

	## 🎯 Use Cases

	### What GrantsLLM Can Do

	- ✅ Generate complete grant proposals (NIH R03/R01/R21, NSF, etc.)
	- ✅ Draft specific sections: Specific Aims, Significance, Innovation, Approach, Research Strategy
	- ✅ Improve existing text for clarity, structure, and persuasiveness
	- ✅ Provide review feedback on grant coherence and alignment
	- ✅ Expand bullet points into full narrative sections
	- ✅ Adapt tone to academic/scientific writing standards

	### Intended Users

	- Principal Investigators (PIs) and research scientists
	- Postdoctoral researchers and graduate students
	- University grant support offices
	- Biotech and research startups
	- Academic research administrators

	### Out of Scope

	- ❌ Automated funding decisions or grant scoring
	- ❌ Legal, regulatory, or IRB compliance review
	- ❌ Generating fabricated data or citations
	- ❌ Non-STEM grants (humanities, arts, social sciences may have reduced quality)
	- ❌ Non-English grant applications

	---

	## 🚀 Quick Start

	### Installation

	```bash
	pip install transformers torch accelerate
	```

	### Basic Usage

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer

	model_name = "KedarPN/GrantsLLM"
	tokenizer = AutoTokenizer.from_pretrained(model_name)
	model = AutoModelForCausalLM.from_pretrained(
	    model_name,
	    torch_dtype="auto",
	    device_map="auto"
	)

	prompt = """Write a Specific Aims section for an NIH R03 grant on developing novel CRISPR-based gene editing tools for treating sickle cell disease. Include 2-3 specific aims with clear objectives and expected outcomes."""

	inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
	outputs = model.generate(
	    **inputs,
	    max_new_tokens=512,
	    temperature=0.7,
	    top_p=0.9,
	    do_sample=True
	)
	print(tokenizer.decode(outputs[0], skip_special_tokens=True))
	```

	### Using with Pipeline

	```python
	from transformers import pipeline

	generator = pipeline(
	    "text-generation",
	    model="KedarPN/GrantsLLM",
	    device_map="auto"
	)

	prompt = "Draft a Research Significance statement for a computational biology grant on protein folding prediction using deep learning."
	# do_sample=True makes the temperature/top_p settings take effect explicitly
	output = generator(prompt, max_new_tokens=400, do_sample=True, temperature=0.7, top_p=0.9)
	print(output[0]['generated_text'])
	```

	### Prompt Templates

	For Section Generation:
	```
	Write a [Section] for a [Funder] [Mechanism] grant on [Topic].
	Requirements: [Specific elements needed]
	Word limit: [Number] words
	```

	For Review/Feedback:
	```
	Review the following [Section] and provide feedback on clarity, structure, and alignment with [Funder] guidelines:
	[Paste text here]
	```

	Examples:
	- `"Write Specific Aims for an NIH R01 grant on cancer immunotherapy"`
	- `"Draft Innovation section for NSF CAREER award on quantum computing"`
	- `"Review this Research Strategy for logical flow and hypothesis clarity"`
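The templates above can be filled programmatically. The following is a small sketch; the function names `build_section_prompt` and `build_review_prompt` are illustrative and not part of any GrantsLLM API.

```python
def build_section_prompt(section: str, funder: str, mechanism: str, topic: str,
                         requirements: str, word_limit: int) -> str:
    """Fill the section-generation template from the model card."""
    return (
        f"Write a {section} for a {funder} {mechanism} grant on {topic}.\n"
        f"Requirements: {requirements}\n"
        f"Word limit: {word_limit} words"
    )


def build_review_prompt(section: str, funder: str, text: str) -> str:
    """Fill the review/feedback template from the model card."""
    return (
        f"Review the following {section} and provide feedback on clarity, "
        f"structure, and alignment with {funder} guidelines:\n{text}"
    )


section_prompt = build_section_prompt(
    section="Specific Aims section",
    funder="NIH",
    mechanism="R01",
    topic="cancer immunotherapy",
    requirements="2 aims with clear objectives",
    word_limit=500,
)
print(section_prompt)
```

The resulting string can be passed as the `prompt` in the usage examples above.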

	---

	## 📊 Training Data

	### Dataset Composition

	- Size: 78 research grant applications
	- Domains: Biotechnology, Molecular Biology, Computational Biology, Chemistry, Biomedical Sciences
	- Formats: NIH (R01, R03, R21), NSF, and similar federal/institutional grant formats
	- Sources: Publicly available grant examples, institutional repositories, and NIH RePORTER
	- Language: English

	### Data Processing

	Stage 1: Continued Pretraining (CPT)
	- Raw grant text extracted and cleaned from PDFs/documents
	- Structured into single-column `text` format (JSONL/Parquet)
	- Preserves section structure and domain terminology
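As a rough illustration of the Stage 1 format, the sketch below normalizes whitespace in extracted text while keeping paragraph breaks, then emits one JSON object per document with a single `text` field. The cleaning rules and the helper names `clean_text` and `to_jsonl` are assumptions for illustration; the actual preprocessing used for GrantsLLM is not published here.

```python
import json
import re


def clean_text(raw: str) -> str:
    """Collapse runs of whitespace but keep blank-line paragraph breaks."""
    raw = raw.replace("\r\n", "\n")
    paragraphs = re.split(r"\n\s*\n", raw)
    cleaned = [re.sub(r"\s+", " ", p).strip() for p in paragraphs]
    return "\n\n".join(p for p in cleaned if p)


def to_jsonl(documents: list[str]) -> str:
    """Serialize documents as JSONL with a single `text` column."""
    return "\n".join(
        json.dumps({"text": clean_text(doc)}, ensure_ascii=False) for doc in documents
    )


# Placeholder text standing in for a grant extracted from a PDF.
sample = "SPECIFIC AIMS\r\n\r\nAim 1:   Develop a   novel\ndelivery system.\n\n\n"
jsonl = to_jsonl([sample])
print(jsonl)
```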

	Stage 2: Supervised Fine-Tuning (SFT)
	- Chat-style instruction pairs using ChatML template
	- Tasks include: section generation, expansion, refinement, review
	- Format: `{"messages": [{"role": "user", "content": "..."}, {"role": "assistant", "content": "..."}]}`
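A Stage 2 record in this format, and its ChatML rendering, might look like the sketch below. The helper names are illustrative, the sample response is a placeholder, and `render_chatml` shows only the basic ChatML layout; the tokenizer's own chat template may add a system prompt or other tokens, so `tokenizer.apply_chat_template` remains the reliable path in practice.

```python
import json


def make_sft_record(instruction: str, response: str) -> dict:
    """Build one instruction pair in the `messages` format described above."""
    return {
        "messages": [
            {"role": "user", "content": instruction},
            {"role": "assistant", "content": response},
        ]
    }


def render_chatml(messages: list[dict]) -> str:
    """Render messages in the basic ChatML layout (no system prompt)."""
    return "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    )


record = make_sft_record(
    "Write Specific Aims for an NIH R01 grant on cancer immunotherapy",
    "Aim 1: (placeholder response text)",
)
line = json.dumps(record)  # one line of the SFT JSONL file
rendered = render_chatml(record["messages"])
print(rendered)
```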

	---

	## 🔧 Training Procedure

	### Training Hyperparameters

	- Base Model: Qwen/Qwen3-4B (~4B parameters)
	- Training Framework: Unsloth + PyTorch
	- Hardware: Google Colab (single GPU, T4/V100)
	- Fine-tuning Method: LoRA/QLoRA (Parameter-Efficient Fine-Tuning)
	- Training Stages:
	  1. Continued Pretraining on grant corpus
	  2. Supervised Instruction Fine-Tuning on Q&A pairs
	- Optimizer: AdamW
	- Learning Rate: Low rate to prevent catastrophic forgetting
	- Training monitored for: Overfitting, repetition, coherence

	### Training Details

	```yaml
	Training Type: Parameter-efficient fine-tuning with LoRA adapters
	Epochs: [Adjusted based on validation performance]
	Batch Size: Optimized for 4B model on single GPU
	Context Length: 262,144 tokens (256K)
	Loss Function: Causal Language Modeling (CLM) loss
	Validation Strategy: Qualitative evaluation on held-out grant examples
	```

	---

	## 📈 Performance & Evaluation

	### Evaluation Methodology

	Qualitative Assessment:
	- Human expert review of generated grant sections
	- Evaluation criteria: coherence, structure, domain accuracy, persuasiveness
	- Practical testing on mock NIH/NSF grant prompts

	### Known Strengths

	- ✅ Strong grasp of STEM grant structure (Aims, Significance, Innovation, Approach)
	- ✅ Effective expansion of bullet points to narrative
	- ✅ Appropriate academic/scientific tone
	- ✅ Good understanding of NIH/NSF terminology and conventions
	- ✅ Maintains logical flow between sections

	### Known Limitations

	- ⚠️ Hallucination Risk: May generate plausible but incorrect citations, grant numbers, or policies
	- ⚠️ Format Bias: Optimized for NIH/NSF; other formats (European, private foundations) may be weaker
	- ⚠️ Domain Bias: Best for biotech/life sciences; physics/engineering grants may be less polished
	- ⚠️ Repetition: Can produce repetitive text if prompt lacks detail or structure
	- ⚠️ Recency: Training data may not reflect latest funder guidelines (post-2025)

	---

	## ⚠️ Bias, Risks, and Limitations

	### Bias Sources

	Domain Bias: Model is optimized for STEM fields represented in training data (biotech, molecular biology, computational biology). Grants in underrepresented fields may receive lower quality outputs.

	Institutional Bias: Writing style may reflect patterns from R1 research universities and well-funded institutions present in training examples.

	Funding Mechanism Bias: Strongest performance on NIH R-series and NSF standard grants; less reliable for fellowships, training grants, or international formats.

	Historical Bias: May reinforce language patterns from historically funded research areas, potentially disadvantaging emerging or interdisciplinary fields.

	### Risks

	Fabrication: Model may generate convincing but false information including:
	- Non-existent citations and references
	- Incorrect grant mechanism details
	- Fabricated preliminary data or results
	- Inaccurate funder policies

	Over-reliance: Users may trust outputs without verification, risking submission of flawed proposals.

	Privacy: Users may inadvertently input confidential research ideas or unpublished data.

	### Recommendations

	1. Always verify: Check all factual claims, citations, and funder guidelines
	2. Human review required: Never submit AI-generated grants without expert review
	3. Iterative refinement: Use as drafting assistant, not final author
	4. Protect IP: Don't input confidential or proprietary information
	5. Disclose usage: Be transparent with collaborators and (when appropriate) funders about AI assistance
	6. Update manually: Cross-reference current funder guidelines and requirements
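The "always verify" step can be partly supported by flagging citation-like strings in a draft so a human knows what to check. The sketch below is a rough heuristic only: the regular expressions and the name `flag_claims_for_review` are assumptions for illustration, they will miss many citation styles, and a match says nothing about whether the reference is real.

```python
import re

# Rough patterns for items that should be verified by hand.
PATTERNS = {
    "doi": re.compile(r"\b10\.\d{4,9}/\S+"),
    "pmid": re.compile(r"\bPMID:?\s*\d+"),
    "author_citation": re.compile(r"\b[A-Z][A-Za-z-]+ et al\."),
    "grant_number": re.compile(r"\b[A-Z]\d{2}\s?[A-Z]{2}\d{6}\b"),
}


def flag_claims_for_review(text: str) -> list[tuple[str, str]]:
    """Return (label, matched_text) pairs for a human reviewer to verify."""
    flags = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            found = match.group(0)
            if label == "doi":
                found = found.rstrip(".,;)")  # drop trailing sentence punctuation
            flags.append((label, found))
    return flags


# Made-up draft text; none of these identifiers refer to real works.
draft = (
    "Prior work (Smith et al., 2021) showed efficacy; see PMID: 12345678 "
    "and doi 10.1000/xyz123. Funded by R01 CA123456."
)
for label, found in flag_claims_for_review(draft):
    print(f"{label}: {found}")
```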

	---

	## 🔐 Ethical Considerations

	### Responsible Use

	- Transparency: Disclose AI assistance to co-authors and collaborators
	- Human oversight: Keep domain experts in the loop for all submissions
	- Academic integrity: Ensure outputs align with your institution's policies on AI use
	- Verification: Validate all scientific claims and citations independently
	- Privacy: Avoid inputting sensitive, unpublished, or identifiable information

	### Funder Policies

	As of February 2026, grant-writing AI policies vary by funder:
	- NIH: Generally permits AI assistance for writing, but PIs remain responsible for all content
	- NSF: Similar stance; emphasizes researcher accountability
	- Check specific RFAs for any AI-related restrictions or disclosure requirements

	When in doubt: Contact your program officer or sponsored research office.

	---

	## 📜 Licensing & Attribution

	### License: CC BY 4.0

	This model is licensed under [Creative Commons Attribution 4.0 International](https://creativecommons.org/licenses/by/4.0/).

	### You Must:

	- ✅ Give appropriate credit to Evionex and Kedar P. Navsariwala
	- ✅ Provide a link to the license
	- ✅ Indicate if changes were made to the model
	- ✅ Retain attribution in any derivative works or applications

	### Citation

	If you use GrantsLLM in your research or projects, please cite:

	```bibtex
	@software{grantsllm2026,
	author = {Navsariwala, Kedar P.},
	title = {GrantsLLM: A Fine-Tuned Language Model for STEM Grant Writing},
	year = {2026},
	publisher = {Hugging Face},
	organization = {Evionex},
	howpublished = {\url{https://huggingface.co/KedarPN/GrantsLLM}},
	license = {CC-BY-4.0}
	}
	```

	### Attribution Example

	```
	Grant drafting assistance provided by GrantsLLM (Navsariwala, 2026), developed by Evionex.
	Available at https://huggingface.co/KedarPN/GrantsLLM
	```

	---

	## 🛠️ Technical Specifications

	### Model Architecture

	- Architecture: Qwen3 (Decoder-only Transformer)
	- Parameters: ~4 billion
	- Layers: 36
	- Hidden Size: 2560
	- Attention Heads: 32
	- Vocabulary Size: 151,936
	- Context Window: 262,144 tokens (256K)

	### Software Stack

	- Training: Unsloth, PyTorch, Hugging Face Transformers
	- Fine-tuning: LoRA/QLoRA with PEFT
	- Environment: Google Colab (GPU)
	- Export Formats:
	  - Hugging Face Transformers checkpoint (BF16 + BNB NF4 4-bit)
	  - GGUF (Q4_K_M, Q5_K_M, Q8_0)

	### Hardware Requirements

	Inference:
	- Minimum: 8GB VRAM (with GGUF quantization) or 16GB RAM (CPU)
	- Recommended: 16GB+ VRAM for full precision
	- CPU inference: Supported via GGUF quantized versions
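These figures can be sanity-checked with a back-of-the-envelope estimate of weight memory: parameter count times bits per weight. The sketch below assumes exactly 4 billion parameters (the card says "~4 billion") and counts weights only; the KV cache, activations, and runtime overhead add to the total, and real quantized files are somewhat larger than the nominal bit width suggests.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


PARAMS = 4e9  # approximation of "~4 billion" parameters

for label, bits in [("BF16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: {weight_memory_gb(PARAMS, bits):.1f} GB")
# BF16: 8.0 GB
# 8-bit: 4.0 GB
# 4-bit: 2.0 GB
```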

	---

	## 📦 Model Variants

	| Variant | File | Size | Use Case | Hardware |
	|---------|------|------|----------|----------|
	| Full precision (BF16) | `model-0000[1-2]-of-00002.safetensors` | ~8.05 GB | Maximum quality | 16GB+ VRAM |
	| BNB NF4 4-bit | `model.safetensors` | ~3.51 GB | Memory-efficient fine-tuning checkpoint | 8GB+ VRAM |
	| GGUF Q8_0 | `unsloth.Q8_0.gguf` | ~4.28 GB | Balanced quality/speed | 8GB+ VRAM or CPU |
	| GGUF Q5_K_M | `unsloth.Q5_K_M.gguf` | ~2.89 GB | Good quality, reduced size | 6GB+ VRAM or CPU |
	| GGUF Q4_K_M | `unsloth.Q4_K_M.gguf` | ~2.5 GB | Fast inference, minimal VRAM | 4GB+ VRAM or CPU |
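One way to apply the table is to pick the largest GGUF variant whose stated VRAM requirement fits the available hardware. The sketch below encodes the GGUF rows from the table; the selection policy and the name `pick_gguf_variant` are illustrative assumptions, including the choice to fall back to the smallest file for CPU-only or low-VRAM machines.

```python
# (variant, filename, minimum VRAM in GB) taken from the table above,
# ordered from highest to lowest quality.
GGUF_VARIANTS = [
    ("Q8_0", "unsloth.Q8_0.gguf", 8),
    ("Q5_K_M", "unsloth.Q5_K_M.gguf", 6),
    ("Q4_K_M", "unsloth.Q4_K_M.gguf", 4),
]


def pick_gguf_variant(vram_gb: float) -> str:
    """Return the filename of the highest-quality GGUF variant that fits."""
    for _name, filename, min_vram in GGUF_VARIANTS:
        if vram_gb >= min_vram:
            return filename
    # Below 4 GB (or CPU only): fall back to the smallest file.
    return GGUF_VARIANTS[-1][1]


print(pick_gguf_variant(6))
```

The returned filename can be passed as the `filename` argument in the llama-cpp-python example.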

	---

	## 🤝 Acknowledgments

	### Built With

	- Base Model: [Qwen3-4B](https://huggingface.co/Qwen/Qwen3-4B) by Alibaba/Qwen Team
	- Training Framework: [Unsloth](https://github.com/unslothai/unsloth) for efficient fine-tuning
	- ML Libraries: PyTorch, Hugging Face Transformers
	- Infrastructure: Google Colab

	### Special Thanks

	- Open-source grant examples from NIH RePORTER and NSF Award Search
	- Academic institutions sharing grant templates and examples
	- Unsloth team for efficient fine-tuning tools
	- Hugging Face for model hosting and inference infrastructure

	---

	## 📞 Contact & Support

	- Developer: Kedar P. Navsariwala
	- Organization: Evionex
	- Website: [www.evionex.com](https://evionex.com)
	- Model Repository: [KedarPN/GrantsLLM](https://huggingface.co/KedarPN/GrantsLLM)

	### Issues & Feedback

	- Report bugs or issues in the [Discussion tab](https://huggingface.co/KedarPN/GrantsLLM/discussions)
	- Share use cases and success stories
	- Request features or improvements
	- Contribute to model evaluation

	---

	## 📌 Disclaimer

	GrantsLLM is an assistive tool designed to support the grant writing process. It does not:
	- Guarantee grant success or funding approval
	- Replace domain expertise or scientific judgment
	- Ensure compliance with all funder requirements
	- Eliminate the need for human review and verification

	Always consult official funder guidelines and domain experts before grant submission.

	---

	## 🔄 Version History

	v1.0 (February 2026)
	- Initial release
	- Trained on 78 STEM grant applications
	- Base model: Qwen/Qwen3-4B
	- Supports NIH and NSF formats

	---

	© 2026 Evionex | Licensed under CC BY 4.0
	Made with ❤️ for the research community

	This Qwen3 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.