DS2API: Deepseek-to-API Middleware with Multi-Account Rotation

DS2API is a powerful open-source middleware that converts DeepSeek Web chat capabilities into fully compatible OpenAI, Claude, and Gemini APIs. Built as a pure Go implementation with a React WebUI admin panel, DS2API provides enterprise-grade features including multi-account rotation, automatic token refresh, Proof-of-Work solving, and intelligent concurrency control - all through a single unified service.

Whether you need to integrate DeepSeek models into existing OpenAI SDK workflows, connect Claude Code to DeepSeek’s reasoning capabilities, or use the Gemini SDK with DeepSeek as a backend, DS2API handles the protocol translation seamlessly while managing account pools and request queuing behind the scenes.

DS2API Architecture Overview

How DS2API Works

The architecture diagram above illustrates the complete request flow through DS2API. When a client sends a request - whether using OpenAI, Claude, or Gemini SDK - it enters through the chi router with full middleware support including request ID tracking, real IP detection, logging, recovery, and CORS handling.

Understanding the Architecture

HTTP API Surface Layer

The HTTP API surface is where protocol-specific handling occurs. DS2API exposes five distinct API surfaces:

OpenAI API: Handles /v1/chat/completions, /v1/responses, /v1/embeddings, /v1/files, and /v1/models endpoints. This is the most feature-complete surface, supporting both chat completions and the newer Responses API with streaming.
Claude API: Provides /anthropic/v1/messages and /anthropic/v1/messages/count_tokens endpoints, plus shortcut paths /v1/messages and /messages. This enables direct compatibility with the Anthropic SDK.
Gemini API: Supports /v1beta/models/{model}:generateContent and /v1beta/models/{model}:streamGenerateContent, plus /v1/models/{model}:* paths. Full Tool Calling support is included with functionDeclarations to functionCall translation.
Admin API: Manages configuration, runtime settings hot-reload, proxy management, account testing, session cleanup, import/export, and Vercel sync.
WebUI: A single-page application at /admin with bilingual Chinese/English support and dark mode.

Runtime Core

The runtime core is where the real processing happens:

PromptCompat: The compatibility kernel that translates API requests from any protocol into DeepSeek web-chat plain text context. This is the critical translation layer that makes multi-protocol support possible.
Chat/Responses Runtime: Provides unified tool calling and streaming semantics across all protocols.
Auth Resolver: Handles multiple authentication formats - Bearer tokens, x-api-key headers, and x-goog-api-key for Gemini compatibility.
Account Pool + Queue: Manages multi-account rotation with in-flight slot limits and waiting queues to prevent 429 errors.
DeepSeek Client: Handles session management, authentication, completion requests, and file operations with the upstream DeepSeek API.
PoW Solver: A pure Go implementation of DeepSeekHashV1 that solves Proof-of-Work challenges in milliseconds.
Tool Sieve: Provides Go/Node semantic parity for tool call anti-leak handling and structured incremental output.
History Split: Automatically converts long conversation histories into file attachments to avoid prompt length limits.

Multi-Account Rotation and Concurrency

Multi-Account Rotation

Understanding Account Rotation

The multi-account rotation system is one of DS2API’s most powerful features. When you have multiple DeepSeek accounts configured, DS2API automatically distributes requests across them to maximize throughput and avoid rate limiting.

Authentication Flow

Every incoming request goes through the Auth Resolver first. If the provided API key matches one in config.keys, DS2API enters Managed Account Mode where it automatically selects an available account from the pool. If the key does not match, it is treated as a direct DeepSeek token and passed through without rotation.

Account Selection Strategy

DS2API uses a round-robin / least-loaded strategy to select accounts. Each account has a configurable in-flight limit (default: 2 concurrent requests per account). When a request arrives:

The system checks if any account has available in-flight slots
If slots are available, the request is assigned to that account
If all slots are full, the request enters a FIFO waiting queue
Only when the total capacity (in-flight + queue) is exceeded does DS2API return a 429 error

This means you can scale linearly: with 5 accounts and 2 in-flight slots each, you get a recommended concurrency of 10 simultaneous requests, with a queue capacity of 10 more before any 429 errors.

Auto Token Refresh

Each managed account automatically refreshes its authentication token on a configurable interval (default: 6 hours). This ensures uninterrupted service without manual intervention. Both email and mobile phone login methods are supported.

Concurrency Model

      
        Per-account inflight = DS2API_ACCOUNT_MAX_INFLIGHT (default 2)
Recommended concurrency = account_count x per_account_inflight
Queue limit = DS2API_ACCOUNT_MAX_QUEUE (default = recommended concurrency)
429 threshold = inflight + queue = account_count x 4

The GET /admin/queue/status endpoint provides real-time visibility into current concurrency state, making it easy to monitor and tune your deployment.

API Compatibility Matrix

API Compatibility

Understanding API Compatibility

DS2API provides three complete API compatibility layers, each translating protocol-specific requests into DeepSeek web chat interactions and then translating responses back into the client’s expected format.

OpenAI-Compatible Endpoints

The OpenAI surface is the most comprehensive, supporting:

Endpoint	Method	Description
`/v1/chat/completions`	POST	Chat completions with streaming support
`/v1/responses`	POST	Responses API (newer OpenAI format)
`/v1/responses/{response_id}`	GET	Retrieve stored responses
`/v1/models`	GET	List available models
`/v1/models/{id}`	GET	Get model details
`/v1/embeddings`	POST	Generate embeddings
`/v1/files`	POST	File upload for context

Claude-Compatible Endpoints

Endpoint	Method	Description
`/anthropic/v1/messages`	POST	Send messages (Anthropic SDK compatible)
`/anthropic/v1/messages/count_tokens`	POST	Count tokens for a message
`/anthropic/v1/models`	GET	List available Claude-compatible models
`/v1/messages`	POST	Shortcut path for messages
`/messages`	POST	Shortcut path for messages

Gemini-Compatible Endpoints

Endpoint	Method	Description
`/v1beta/models/{model}:generateContent`	POST	Generate content (Gemini SDK compatible)
`/v1beta/models/{model}:streamGenerateContent`	POST	Stream content generation
`/v1/models/{model}:*`	Various	Model operations

Platform Compatibility

Tier	Platform	Status
P0	Codex CLI/SDK	Supported
P0	OpenAI SDK (JS/Python)	Supported
P0	Vercel AI SDK	Supported
P0	Anthropic SDK	Supported
P0	Google Gemini SDK	Supported
P1	LangChain / LlamaIndex / OpenWebUI	Supported

Model Support

DeepSeek Native Models

Family	Model ID	Thinking	Search
default	`deepseek-v4-flash`	Enabled by default	No
expert	`deepseek-v4-pro`	Enabled by default	No
default	`deepseek-v4-flash-search`	Enabled by default	Yes
expert	`deepseek-v4-pro-search`	Enabled by default	Yes
vision	`deepseek-v4-vision`	Enabled by default	No
vision	`deepseek-v4-vision-search`	Enabled by default	Yes

Model Aliases

DS2API also accepts common aliases as input. For example, gpt-4o maps to deepseek-v4-flash, o3 maps to deepseek-v4-pro, and Claude model names map to their DeepSeek equivalents. The /v1/models endpoint always returns normalized DeepSeek native model IDs.

For Claude compatibility, the default mappings are:

Claude Model	DeepSeek Mapping
`claude-sonnet-4-6`	`deepseek-v4-flash`
`claude-haiku-4-5`	`deepseek-v4-flash`
`claude-opus-4-6`	`deepseek-v4-pro`

Installation and Setup

Option 1: Download Release Binaries (Recommended)

The easiest way to get started is downloading pre-built binaries:

      
        # Download the archive for your platform from GitHub Releases
tar -xzf ds2api_<tag>_linux_amd64.tar.gz
cd ds2api_<tag>_linux_amd64

# Create configuration
cp config.example.json config.json
# Edit config.json with your DeepSeek accounts and API keys

# Start the server
./ds2api

The server starts at http://127.0.0.1:5001 by default.

Option 2: Docker Deployment

      
        # Clone and configure
git clone https://github.com/CJackHwang/ds2api.git
cd ds2api
cp .env.example .env
cp config.example.json config.json

# Edit .env - at minimum set DS2API_ADMIN_KEY
# DS2API_ADMIN_KEY=your-strong-password

# Start with Docker Compose
docker-compose up -d

# View logs
docker-compose logs -f

The default docker-compose.yml maps host port 6011 to container port 5001. To expose 5001 directly, set DS2API_HOST_PORT=5001.

Option 3: Vercel Serverless

      
        # Fork the repository to your GitHub account
# Import the project on Vercel
# Set environment variables:
#   DS2API_ADMIN_KEY=your-admin-key
#   DS2API_CONFIG_JSON=<base64 encoded config.json>

# Generate Base64 config:
base64 < config.json | tr -d '\n'

Note: On Vercel, /v1/chat/completions uses a Node.js runtime for real-time SSE streaming, while all other routes use the Go runtime.

Option 4: Local Source Build

      
        # Prerequisites: Go 1.26+, Node.js 20.19+ or 22.12+ (for WebUI)

git clone https://github.com/CJackHwang/ds2api.git
cd ds2api
cp config.example.json config.json
# Edit config.json

go run ./cmd/ds2api

Configuration

DS2API uses a single config.json file as the source of truth. Here is a minimal configuration:

      
    
      
        {
  "keys": [
    "your-api-key-1",
    "your-api-key-2"
  ],
  "accounts": [
    {
      "name": "primary-account",
      "email": "your-email@example.com",
      "password": "your-password"
    },
    {
      "name": "secondary-account",
      "email": "another@example.com",
      "password": "another-password"
    }
  ],
  "model_aliases": {
    "gpt-4o": "deepseek-v4-flash",
    "o3": "deepseek-v4-pro"
  },
  "runtime": {
    "account_max_inflight": 2,
    "account_max_queue": 0,
    "token_refresh_interval_hours": 6
  },
  "auto_delete": {
    "mode": "none"
  }
}

      
      
        

Key Configuration Fields

Field	Description
`keys` / `api_keys`	Client API keys for authentication. `api_keys` adds `name` and `remark` metadata.
`accounts`	DeepSeek managed accounts with email or mobile login, proxy, name, and remark.
`model_aliases`	Shared alias map for OpenAI, Claude, and Gemini model name translation.
`runtime`	Account concurrency, queue, and token refresh settings. Hot-reloadable via Admin Settings.
`auto_delete.mode`	Remote session cleanup after requests: `none`, `single`, or `all`.
`history_split`	Multi-turn history split policy. Tunable trigger threshold.

Authentication Modes

DS2API supports two authentication modes for business endpoints:

Mode	Description
Managed Account	Use a key from `config.keys` via `Authorization: Bearer ...` or `x-api-key`. DS2API auto-selects an account from the pool.
Direct Token	If the token is not in `config.keys`, it is treated as a DeepSeek token directly.

You can also pin a specific managed account using the X-Ds2-Target-Account header with an email or mobile number.

Tool Calling Support

When tools is present in a request, DS2API performs sophisticated anti-leak handling:

Context-aware detection: Tool call recognition is only enabled in non-code-block contexts, preventing false positives from code examples.
Canonical XML parsing: Only the canonical XML format <tool_calls><invoke name="..."><parameter name="..."> is treated as executable tool calls. Legacy formats are handled as plain text.
Protocol-native output: The output follows the client’s request protocol - OpenAI, Claude, or Gemini native structures are generated appropriately.
Streaming support: The responses streaming mode uses official item lifecycle events (response.output_item.*, response.content_part.*, response.function_call_arguments.*).
Tool choice enforcement: Supports auto, none, required, and forced function modes. Violations return 422 for non-streaming and response.failed for streaming.

Usage Examples

OpenAI SDK (Python)

      
    
      
        from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:5001/v1",
    api_key="your-api-key-1"
)

response = client.chat.completions.create(
    model="deepseek-v4-flash",
    messages=[
        {"role": "user", "content": "Explain quantum computing in simple terms"}
    ],
    stream=True
)

for chunk in response:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

      
      
        

Anthropic SDK (Python)

      
        import anthropic

client = anthropic.Anthropic(
    base_url="http://127.0.0.1:5001",
    api_key="your-api-key-1"
)

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Write a haiku about programming"}
    ]
)

print(message.content[0].text)

Google Gemini SDK (Python)

      
        import google.generativeai as genai

genai.configure(
    api_key="your-api-key-1",
    transport="rest",
    client_options={"api_endpoint": "http://127.0.0.1:5001"}
)

model = genai.GenerativeModel("deepseek-v4-flash")
response = model.generate_content("What is machine learning?")
print(response.text)

cURL Test

      
        curl http://127.0.0.1:5001/v1/chat/completions \
  -H "Authorization: Bearer your-api-key-1" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-v4-flash",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": false
  }'

Admin and Monitoring

DS2API includes a comprehensive admin interface:

WebUI: Accessible at /admin with bilingual support and dark mode
Health Probes: GET /healthz (liveness) and GET /readyz (readiness)
Queue Status: GET /admin/queue/status for real-time concurrency monitoring
Config Hot-Reload: Runtime settings can be updated without restart via Admin API
Account Testing: Individual and batch account testing endpoints
Session Cleanup: Automatic or manual remote session management
Import/Export: Full configuration backup and restore

Key Features Summary

Feature	Description
Triple Protocol Support	OpenAI, Claude, and Gemini API compatibility from a single service
Multi-Account Rotation	Automatic load distribution across DeepSeek accounts
Concurrency Control	Per-account in-flight limits with FIFO waiting queue
Auto Token Refresh	Configurable interval (default 6 hours) for uninterrupted service
PoW Solving	Pure Go DeepSeekHashV1 implementation with millisecond response
Tool Calling	Anti-leak handling with canonical XML parsing and protocol-native output
History Split	Automatic long-history file attachment to avoid prompt limits
Admin WebUI	Bilingual SPA with real-time monitoring and configuration
Multiple Deployment	Local binary, Docker, Vercel serverless, or systemd
Health Probes	Kubernetes-ready liveness and readiness endpoints

Conclusion

DS2API stands out as a comprehensive DeepSeek-to-API middleware that solves the practical challenges of using DeepSeek models through standard API interfaces. Its triple-protocol support (OpenAI, Claude, and Gemini) means you can integrate DeepSeek into virtually any existing AI application without code changes. The multi-account rotation system with intelligent concurrency control ensures reliable, high-throughput operation, while the pure Go backend delivers excellent performance with minimal resource usage.

Whether you are running a single account for personal use or managing a pool of accounts for production workloads, DS2API provides the tools you need - from automatic token refresh and PoW solving to real-time monitoring and hot-reloadable configuration.

DS2API: Deepseek-to-API Middleware with Multi-Account Rotation and Triple Protocol Support

DS2API: Deepseek-to-API Middleware with Multi-Account Rotation

How DS2API Works

Understanding the Architecture

Multi-Account Rotation and Concurrency

Understanding Account Rotation

API Compatibility Matrix

Understanding API Compatibility

Model Support

DeepSeek Native Models

Model Aliases

Installation and Setup

Option 1: Download Release Binaries (Recommended)

Option 2: Docker Deployment

Option 3: Vercel Serverless

Option 4: Local Source Build

Configuration

Key Configuration Fields

Authentication Modes

Tool Calling Support

Usage Examples

OpenAI SDK (Python)

Anthropic SDK (Python)

Google Gemini SDK (Python)

cURL Test

Admin and Monitoring

Key Features Summary

Conclusion

Links

DS2API: Deepseek-to-API Middleware with Multi-Account Rotation and Triple Protocol Support

DS2API: Deepseek-to-API Middleware with Multi-Account Rotation

How DS2API Works

Understanding the Architecture

Multi-Account Rotation and Concurrency

Understanding Account Rotation

API Compatibility Matrix

Understanding API Compatibility

Model Support

DeepSeek Native Models

Model Aliases

Installation and Setup

Option 1: Download Release Binaries (Recommended)

Option 2: Docker Deployment

Option 3: Vercel Serverless

Option 4: Local Source Build

Configuration

Key Configuration Fields

Authentication Modes

Tool Calling Support

Usage Examples

OpenAI SDK (Python)

Anthropic SDK (Python)

Google Gemini SDK (Python)

cURL Test

Admin and Monitoring

Key Features Summary

Conclusion

Links

Related Posts

Deep Reinforcement Learning Series - Complete Roadmap and...

Part 7: Proximal Policy Optimization (PPO) - State-of-the...

Build Your Own X: Master Programming by Recreating Your F...

InferSim: LLM Inference Simulation by Alibaba

Universal Commerce Protocol: The Open Standard Breaking D...

Warp: The Agentic Development Environment Born Out of the...

Contents