vllm.model_executor.models.jina
JinaEmbeddingsV5Model
Bases: Qwen3ForCausalLM, VllmModelForPooling
Jina Embeddings V5 with task-specific LoRA adapters merged at load time.
Extends Qwen3ForCausalLM (the underlying architecture) and declares itself as a pooling model so that as_embedding_model() does not wrap it.
Source code in vllm/model_executor/models/jina.py
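As a rough illustration of what "merged at load time" can mean for a LoRA adapter, the sketch below folds a low-rank update into a base weight matrix using the conventional LoRA formula W' = W + (alpha / r) * B A. This is not taken from jina.py: the helper names `matmul` and `merge_lora`, the nested-list matrices, and the alpha / r scaling are assumptions chosen to keep the example dependency-free.

```python
def matmul(x, y):
    """Multiply two matrices given as nested lists (rows of floats)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def merge_lora(weight, lora_a, lora_b, alpha, rank):
    """Return weight + (alpha / rank) * (lora_b @ lora_a) as a new matrix.

    Hypothetical helper: assumes the conventional LoRA scaling alpha / rank.
    """
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)  # shape: (out_features, in_features)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy example: 2x2 base weight with a rank-1 adapter.
weight = [[1.0, 0.0], [0.0, 1.0]]
lora_a = [[1.0, 2.0]]    # (rank, in_features) = (1, 2)
lora_b = [[0.5], [1.0]]  # (out_features, rank) = (2, 1)
merged = merge_lora(weight, lora_a, lora_b, alpha=2.0, rank=1)
print(merged)
```

Once the update is folded into the weight, inference needs no separate adapter path, which is the usual motivation for merging rather than applying LoRA dynamically.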
_build_lora_pairs
Group raw adapter tensors into {base_key: {"A": tensor, "B": tensor}} pairs.
Transforms adapter keys like:
base_model.model.layers.0.self_attn.q_proj.lora_A.weight
into base keys like:
layers.0.self_attn.q_proj.weight
Source code in vllm/model_executor/models/jina.py
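The key transformation described above can be sketched as follows. This is a minimal approximation, not the code in jina.py: the function name `build_lora_pairs`, the `PREFIX` constant, and the `.lora_A.` / `.lora_B.` markers are inferred from the single example key in the docstring, and plain strings stand in for tensors.

```python
from collections import defaultdict

# Assumed from the example key in the docstring; the real prefix handling may differ.
PREFIX = "base_model.model."


def build_lora_pairs(adapter_weights):
    """Group adapter entries into {base_key: {"A": ..., "B": ...}}."""
    pairs = defaultdict(dict)
    for key, tensor in adapter_weights.items():
        for marker, slot in ((".lora_A.", "A"), (".lora_B.", "B")):
            if marker in key:
                # Drop the LoRA marker, then the adapter-specific prefix.
                base_key = key.replace(marker, ".")
                if base_key.startswith(PREFIX):
                    base_key = base_key[len(PREFIX):]
                pairs[base_key][slot] = tensor
                break
    return dict(pairs)


raw = {
    "base_model.model.layers.0.self_attn.q_proj.lora_A.weight": "A0",
    "base_model.model.layers.0.self_attn.q_proj.lora_B.weight": "B0",
}
pairs = build_lora_pairs(raw)
print(pairs)
```

Keying the result by the base weight name means each pair can be looked up directly when the corresponding base weight is loaded.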
_load_adapter
_load_adapter(
model: str, task: str, revision: str | None
) -> tuple[dict, dict[str, Tensor]] | None
Load the adapter config and weights from a local path or an HF repo.
Returns (adapter_config, adapter_weights), or None if the adapter is not found.