Router DC (Dual-Contrastive)
Overview
router_dc is a semantic selection algorithm that matches queries to models using embedding similarity via dual-contrastive learning.
It aligns to config/algorithm/selection/router-dc.yaml.
Paper: Query-Based Router by Dual Contrastive Learning
Key Advantages
- Learns a semantic mapping between query embeddings and model capability embeddings.
- No explicit ranking rules needed — selection is driven by learned similarity.
- Supports both query-side and model-side contrastive learning.
- Useful when prompt semantics matter more than static priority or cost.
Algorithm Principle
Router-DC learns two embedding spaces — one for queries, one for models — and brings matching query-model pairs closer together using dual contrastive loss:
- Query Embedding: Each user query is encoded into a dense vector via the configured embedding provider.
- Model Embedding: Each model is represented by an embedding derived from its description and optional capability tags.
- Contrastive Learning: Positive pairs (query, correct model) are pulled together; negative pairs (query, wrong model) are pushed apart.
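- Matching: At inference time, the query embedding is compared to all model embeddings using cosine similarity with a temperature-scaled softmax:

$$P(m \mid q) = \frac{\exp(\mathrm{sim}(q, m) / \tau)}{\sum_{m'} \exp(\mathrm{sim}(q, m') / \tau)}$$

Where $\mathrm{sim}(q, m)$ is the cosine similarity between the query embedding and the embedding of model $m$, and $\tau$ is the temperature (temperature, default 0.07).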
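The matching step can be sketched in a few lines of Python. This is a minimal illustration, not the router's actual implementation: the function names and the toy 3-dimensional vectors are assumptions made for the example, and real embeddings come from the configured embedding provider.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def selection_probabilities(query_vec, model_vecs, temperature=0.07):
    """Temperature-scaled softmax over query-model cosine similarities."""
    logits = {
        name: cosine_similarity(query_vec, vec) / temperature
        for name, vec in model_vecs.items()
    }
    # Subtract the largest logit before exponentiating for numerical stability.
    max_logit = max(logits.values())
    exps = {name: math.exp(l - max_logit) for name, l in logits.items()}
    total = sum(exps.values())
    return {name: e / total for name, e in exps.items()}


# Hypothetical toy embeddings, for illustration only.
model_vecs = {
    "llama-3.2-1b": [0.9, 0.1, 0.0],
    "codellama-7b": [0.1, 0.9, 0.2],
}
query_vec = [0.2, 0.8, 0.1]

probs = selection_probabilities(query_vec, model_vecs)
best = max(probs, key=probs.get)
print(best, probs)
```

With the default temperature of 0.07, even a moderate gap in cosine similarity turns into a strongly peaked distribution, which is why the model closest to the query dominates the selection.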
Select Flow
Model Embedding Initialization
Models need descriptions for embedding-based matching. Configure descriptions in modelCards:
routing:
  modelCards:
    - name: llama-3.2-1b
      description: "Fast small model for simple tasks, low cost"
      capabilities: ["summarization", "simple_qa"]
    - name: codellama-7b
      description: "Code generation specialist, good at programming tasks"
      capabilities: ["code_generation", "debugging"]
When use_capabilities: true, capability tags are concatenated with descriptions to enrich embeddings.
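As a rough sketch of how the text to embed might be assembled from a model card: the helper name build_embedding_text and the "Capabilities:" separator are assumptions made for illustration, since the exact concatenation format is not specified here.

```python
def build_embedding_text(card, use_capabilities=True, require_descriptions=False):
    """Build the text that would be embedded for one model card.

    The "Capabilities: ..." suffix is an assumed format, not the router's own.
    """
    description = (card.get("description") or "").strip()
    if require_descriptions and not description:
        raise ValueError(f"model {card.get('name')!r} has no description")

    parts = [description] if description else []
    if use_capabilities and card.get("capabilities"):
        parts.append("Capabilities: " + ", ".join(card["capabilities"]))
    return " ".join(parts)


card = {
    "name": "codellama-7b",
    "description": "Code generation specialist, good at programming tasks",
    "capabilities": ["code_generation", "debugging"],
}

with_caps = build_embedding_text(card)
without_caps = build_embedding_text(card, use_capabilities=False)
print(with_caps)
print(without_caps)
```

A card with neither a description nor capability tags yields an empty string, which is the situation require_descriptions: true is meant to reject up front.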
What Problem Does It Solve?
Some workloads are primarily semantic matching problems where the best model depends on the query meaning more than explicit heuristics. router_dc learns a query-to-model embedding space so routing follows semantic fit instead of only static priority or cost rules.
When to Use
- The best candidate depends on semantic similarity between prompt and model profile.
- You want a learned selector without full online exploration.
- A route should select by semantic fit rather than only by cost or latency.
- Models have descriptive profiles or capability tags.
Known Limitations
- Requires model descriptions: If models lack descriptions, embedding quality degrades.
- Cold query problem: Rare query types may not match well with any model embedding.
- Affinity matrix: The query-model affinity matrix (affinityMatrix in code) is currently initialized but not actively updated online; it serves as a future extension point for online contrastive learning.
- Temperature sensitivity: Very low temperature makes the selector near-greedy; very high temperature makes it near-uniform.
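The temperature sensitivity can be seen directly by applying a softmax to the same similarity scores at different temperatures. The scores and the helper name below are hypothetical, chosen only to make the effect visible.

```python
import math


def softmax_with_temperature(similarities, temperature):
    """Softmax over similarity scores divided by the temperature."""
    logits = [s / temperature for s in similarities]
    max_logit = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - max_logit) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical cosine similarities for three candidate models.
similarities = [0.62, 0.55, 0.40]

sharp = softmax_with_temperature(similarities, 0.01)    # near-greedy
default = softmax_with_temperature(similarities, 0.07)  # documented default
flat = softmax_with_temperature(similarities, 10.0)     # near-uniform

for label, dist in (("0.01", sharp), ("0.07", default), ("10.0", flat)):
    print(label, [round(p, 3) for p in dist])
```

At 0.01 almost all probability mass lands on the top-scoring model, while at 10.0 the three models are nearly indistinguishable, so the similarity scores stop influencing the choice.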
Configuration
algorithm:
  type: router_dc
  router_dc:
    temperature: 0.07            # Softmax temperature (lower = sharper)
    dimension_size: 768          # Embedding dimension
    min_similarity: 0.3          # Minimum similarity threshold
    use_query_contrastive: true  # Enable query-side contrastive learning
    use_model_contrastive: true  # Enable model-side contrastive learning
    require_descriptions: false  # Fail if models lack descriptions
    use_capabilities: true       # Include capability tags in embeddings
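For intuition about the contrastive settings, the sketch below computes a generic InfoNCE-style loss for a single query. It is an assumption-laden illustration: the exact loss terms enabled by use_query_contrastive and use_model_contrastive are not specified on this page, and the function name and similarity values are hypothetical.

```python
import math


def info_nce_loss(similarities, positive, temperature=0.07):
    """Negative log-probability of the positive model under a temperature softmax.

    similarities maps model name -> cosine similarity with the query.
    This is a generic InfoNCE form, not confirmed as Router-DC's exact loss.
    """
    logits = {name: s / temperature for name, s in similarities.items()}
    max_logit = max(logits.values())
    # log-sum-exp, computed stably
    log_denominator = max_logit + math.log(
        sum(math.exp(l - max_logit) for l in logits.values())
    )
    return log_denominator - logits[positive]


# The correct model is already the most similar: loss is close to zero.
good = info_nce_loss(
    {"codellama-7b": 0.9, "llama-3.2-1b": 0.2}, positive="codellama-7b"
)
# The wrong model is the most similar: loss is large.
bad = info_nce_loss(
    {"codellama-7b": 0.2, "llama-3.2-1b": 0.9}, positive="codellama-7b"
)
print(good, bad)
```

Minimizing a loss of this shape raises the similarity of the positive pair relative to the negatives, which is the "pull together, push apart" behavior described under Algorithm Principle.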
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| temperature | float | 0.07 | Softmax temperature (lower = more confident selection) |
| dimension_size | int | 768 | Embedding dimension size |
| min_similarity | float | 0.3 | Minimum similarity threshold for valid matches (0–1) |
| use_query_contrastive | bool | true | Enable query-side contrastive learning |
| use_model_contrastive | bool | true | Enable model-side contrastive learning |
| require_descriptions | bool | false | Require all models to have descriptions |
| use_capabilities | bool | true | Include capability tags in embedding text |
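One plausible reading of min_similarity is sketched below. It assumes that candidates scoring below the threshold are discarded and that no selection is made when none qualify; the actual fallback behavior is not described here, and select_model is a hypothetical helper.

```python
def select_model(similarities, min_similarity=0.3):
    """Return the most similar model at or above the threshold, else None.

    Returning None when nothing qualifies is an assumed fallback.
    """
    eligible = {
        name: score
        for name, score in similarities.items()
        if score >= min_similarity
    }
    if not eligible:
        return None
    return max(eligible, key=eligible.get)


matched = select_model({"llama-3.2-1b": 0.21, "codellama-7b": 0.74})
unmatched = select_model({"llama-3.2-1b": 0.21, "codellama-7b": 0.18})
print(matched, unmatched)
```

Raising the threshold makes the selector more conservative: fewer queries get a semantic match, and more fall through to whatever default the caller provides.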
Feedback
Router-DC supports UpdateFeedback() for online affinity updates. When feedback arrives, the query-model affinity matrix is updated to reflect observed preferences:
curl -X POST http://localhost:8000/api/v1/feedback \
-H "Content-Type: application/json" \
-d '{
"query": "Write a Python function to sort a list",
"winner_model": "codellama-7b",
"decision_name": "coding"
}'
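The same request can be prepared from Python with the standard library. This sketch reuses the URL and field names from the curl example; build_feedback_request is a hypothetical helper, and the request is only built here, not sent.

```python
import json
import urllib.request

# Endpoint taken from the curl example; adjust host and port for your deployment.
FEEDBACK_URL = "http://localhost:8000/api/v1/feedback"


def build_feedback_request(query, winner_model, decision_name, url=FEEDBACK_URL):
    """Build (but do not send) a POST request carrying one feedback record."""
    payload = {
        "query": query,
        "winner_model": winner_model,
        "decision_name": decision_name,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_feedback_request(
    query="Write a Python function to sort a list",
    winner_model="codellama-7b",
    decision_name="coding",
)
print(request.get_method(), request.full_url)

# To actually send it against a running router:
# with urllib.request.urlopen(request, timeout=5) as response:
#     print(response.status)
```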