Welcome to the Pulze.ai KNN Router

The KNN Router is an open-source, intent-tuned LLM router that selects the best large language model (LLM) for a user’s query. This documentation explains how to deploy and use the KNN Router and describes the technology behind it.

Key Features of KNN Router

  1. Semantic Query Routing:

    • The KNN Router uses k-nearest neighbors (KNN) to find semantically similar queries, allowing for precise model selection based on user input.
  2. Weighted Scoring:

    • Each LLM associated with the nearest neighbors is scored based on a weighted average of distances, ensuring the most appropriate model is selected for each query.
  3. Minimal Latency:

    • Built in Go, the KNN Router is optimized for low-latency responses, making it suitable for real-time applications.
  4. Integration Flexibility:

    • The router can be integrated with various systems, including information retrieval systems, agents, and other LLMs.
  5. Open Source:

    • Being open-sourced, the KNN Router is available for customization and enhancement by the community.
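
The semantic routing and weighted scoring features above can be sketched in a few lines of Python. This is only an illustration of the general idea, not the router's actual Go implementation. The function names (`nearest_neighbors`, `score_targets`), the cosine distance, the inverse-distance weighting, and the toy embeddings and per-model scores are all assumptions made for this example.

```python
import math
from collections import defaultdict


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest_neighbors(query, points, k):
    """Return the k (distance, point) pairs closest to the query embedding."""
    ranked = sorted(
        ((cosine_distance(query, p["embedding"]), p) for p in points),
        key=lambda pair: pair[0],
    )
    return ranked[:k]


def score_targets(neighbors, eps=1e-9):
    """Average each model's per-point score, weighting closer points more."""
    totals = defaultdict(float)
    weight_sum = 0.0
    for distance, point in neighbors:
        weight = 1.0 / (distance + eps)  # assumed inverse-distance weighting
        weight_sum += weight
        for model, score in point["scores"].items():
            totals[model] += weight * score
    return {model: total / weight_sum for model, total in totals.items()}


# Hypothetical reference points: 2-D embeddings with made-up model scores.
POINTS = [
    {"embedding": [1.0, 0.0], "scores": {"model-a": 0.9, "model-b": 0.4}},
    {"embedding": [0.9, 0.1], "scores": {"model-a": 0.8, "model-b": 0.5}},
    {"embedding": [0.0, 1.0], "scores": {"model-a": 0.2, "model-b": 0.9}},
]

neighbors = nearest_neighbors([1.0, 0.05], POINTS, k=2)
scores = score_targets(neighbors)
best = max(scores, key=scores.get)
print(best, scores)
```

With this toy data the two nearest points both favor `model-a`, so it ends up with the highest weighted score. A real deployment would use the embeddings and scores shipped in the router's artifacts.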

Model and Data Overview

The KNN Router uses the pulze-intent-v0.1 model, which is trained to select the most appropriate LLM for a user query. The model and dataset are published on Hugging Face; the Local Deployment steps download the artifacts from the pulze/intent-v0.1 repository.

Supported Models

The KNN Router can work with a variety of models, including:

  • claude-3-haiku-20240307
  • claude-3-opus-20240229
  • claude-3-sonnet-20240229
  • command-r
  • command-r-plus
  • dbrx-instruct
  • gpt-3.5-turbo-0125
  • gpt-4-turbo-2024-04-09
  • llama-3-70b-instruct
  • mistral-large
  • mistral-medium
  • mistral-small
  • mixtral-8x7b-instruct

Getting Started with KNN Router

Local Deployment

To deploy the KNN Router locally, follow these steps:

  1. Fetch Artifacts from Hugging Face:

    huggingface-cli download pulze/intent-v0.1 --local-dir .dist --local-dir-use-symlinks=False
    
  2. Start the Services:

    docker compose up -d --build
    
  3. Query the Router: Use curl to send a query:

    curl -s 127.0.0.1:8888/ \
        -X POST \
        -d '{"query":"give me instructions for making ramen at home"}' \
        -H 'Content-Type: application/json' | jq .
    

The expected output will include a ranked list of hits and scores for each target, similar to the following:

{
  "hits": [ ... ],
  "scores": [ ... ]
}
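
The same request can be sent from Python using only the standard library. The sketch below assumes the router is listening on 127.0.0.1:8888 as in the curl command. The helper names `build_request` and `route` are hypothetical, and nothing is assumed about the response beyond it being JSON with the `hits` and `scores` keys shown above.

```python
import json
import urllib.request

ROUTER_URL = "http://127.0.0.1:8888/"  # address used in the curl example


def build_request(query, url=ROUTER_URL):
    """Build the same POST request that the curl command sends."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def route(query, url=ROUTER_URL, timeout=5.0):
    """Send the query to the router and return the decoded JSON response."""
    request = build_request(query, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With the services running, `route("give me instructions for making ramen at home")` should return a dictionary containing the `hits` and `scores` entries.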

Kubernetes Deployment

For Kubernetes deployment, refer to the provided example configuration.

Generating Deployment Artifacts

Before deploying, you need to generate the required artifacts:

  1. Prepare Input Files:

    • points.jsonl: Contains points and their respective categories and embeddings.
    • targets.jsonl: Contains targets and their respective scores for each point.
  2. Generate Artifacts: Use the following script to generate the required artifacts:

    scripts/gen-artifacts.sh --points-data-path points.jsonl --scores-data-path targets.jsonl --output-dir ./dist
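
Before running the script, it can help to confirm that both input files are well-formed JSONL. The sketch below is a generic pre-flight check, not part of the KNN Router tooling. It assumes only that each non-blank line holds one JSON object and makes no assumption about field names. `check_jsonl` and `check_file` are hypothetical helper names.

```python
import json


def check_jsonl(lines):
    """Return (record_count, errors) for an iterable of JSONL lines."""
    count, errors = 0, []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            errors.append(f"line {number}: {exc.msg}")
            continue
        if not isinstance(record, dict):
            # Assumption: every record is a JSON object.
            errors.append(f"line {number}: expected a JSON object")
            continue
        count += 1
    return count, errors


def check_file(path):
    """Run check_jsonl over a file on disk."""
    with open(path, encoding="utf-8") as handle:
        return check_jsonl(handle)
```

For example, `check_file("points.jsonl")` returns the number of valid records and a list of messages for any lines that failed to parse.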
    

Join the Community

We encourage developers and enthusiasts to contribute to the KNN Router project. Your feedback and contributions help us improve the tool for everyone.

For more information and to stay updated, visit the KNN Router repository on GitHub.

Future Development

We have plans for additional features and enhancements, including:

  • A Helm chart for easier Kubernetes deployments.
  • A gRPC endpoint for improved performance and flexibility.

Stay tuned for updates!

Contact Us

For questions, feedback, or support, please reach out through our community channels or directly on GitHub.