How we optimized MCP servers for serverless (or: From SSE to WebSockets)

12 min read

Building AI agents that can seamlessly interact with external systems is a complex challenge. At Blaxel, we've been working on a platform that fast-tracks the development of AI agents for developers. Along the way, we've encountered numerous technical hurdles with Model Context Protocol (MCP) servers, particularly around connection stability and scalability in cloud environments.

This article details our journey from using Server-Sent Events (SSE) to WebSockets, and the significant improvements we've seen as a result.

Understanding MCP: the foundation of modern AI agents

Before diving into our implementation challenges, let's briefly review what MCP is and why it matters. The Model Context Protocol (MCP) is an open standard developed by Anthropic that enables AI assistants to connect with external systems where data lives - including content repositories, business tools, and development environments.

MCP consists of three main components:

MCP Servers: These act as bridges connecting APIs, databases, or code to AI models, exposing data sources as tools. At Blaxel, we wanted to provide prebuilt and custom MCP servers for users.

MCP Clients: These use the protocol to interact with MCP servers and can be developed using SDKs in Python or TypeScript.

MCP Hosts: These systems manage communication between servers and clients, ensuring smooth data exchange.

The beauty of MCP is that tools provided by an MCP server can be accessed via any MCP host, allowing developers to connect AI agents to new tools without custom integration code. This standardization is what makes MCP so powerful for building extensible AI systems.

Our initial approach: HTTP handlers on Cloudflare

When we first started integrating MCP servers into Blaxel, we took what seemed like the most straightforward approach: implementing HTTP handlers on Cloudflare (which was already part of our stack) and mixing MCP with traditional APIs. This worked... sort of.

The problem? Adding a new MCP server was tedious and time-consuming. Each integration required hours of work, which is unsustainable when you're trying to build a batteries-included platform that needs to scale to support thousands of tools and millions of running agents. Furthermore, HTTP isn't a standard transport in MCP, so there was no official support for it.

Standardizing MCP integration

To address the tedium of adding new MCP servers, we began looking for standardized ways to register them. Our search surfaced two key resources:

Smithery: A registry of MCP servers that provides a standardized packaging format.

MCP Hub: Our own open-source catalog of MCP servers designed to accelerate integration.

This infrastructure helped us organize our growing collection of MCP servers, but we still faced significant technical challenges with the underlying communication protocol.

SSE: a promising start with disappointing results

During our search for better solutions, we discovered Supergateway, a tool that wraps stdio-based MCP servers with Server-Sent Events (SSE). On paper, this looked like an elegant solution.

For those unfamiliar with SSE, it's a technology that establishes a one-way communication channel from server to client over HTTP. Unlike WebSockets, which provide full-duplex communication, SSE is designed specifically for server-to-client updates. This makes it seemingly ideal for scenarios where clients primarily need to receive updates from servers.
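
To make that concrete, here's a minimal, illustrative SSE endpoint in Node.js — a sketch for readers unfamiliar with the format, not our production code. The response is just a long-lived `text/event-stream` body made of `data:` lines separated by blank lines:

```typescript
import { createServer } from 'http';

// Format one SSE frame: optional "event:" line, then "data:", then a blank line
function formatSSEEvent(data: unknown, event?: string): string {
  const lines: string[] = [];
  if (event) lines.push(`event: ${event}`);
  lines.push(`data: ${JSON.stringify(data)}`);
  return lines.join('\n') + '\n\n';
}

const server = createServer((req, res) => {
  res.writeHead(200, {
    'Content-Type': 'text/event-stream',
    'Cache-Control': 'no-cache',
    // Note: no Content-Length -- the stream is open-ended by design
    Connection: 'keep-alive',
  });
  // One-way flow: the server pushes, the client only listens
  const timer = setInterval(() => {
    res.write(formatSSEEvent({ ts: Date.now() }, 'tick'));
  }, 3000);
  req.on('close', () => clearInterval(timer));
});
// server.listen(3000) would start it; omitted in this sketch
```

In a browser, `new EventSource(url)` would consume this stream — and each open tab holds one of those scarce per-domain connections.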

We implemented Supergateway with SSE, but quickly ran into significant issues:

The problems with SSE in serverless environments

Connection Instability: In serverless environments, SSE connections dropped randomly and frequently. This is particularly problematic for AI agents that need reliable, persistent connections to function properly.

Scaling Challenges: As we tried to scale our platform, the limitations of SSE became increasingly apparent. The protocol wasn't designed with cloud-native architectures in mind.

Browser Connection Limits: Browsers cap the number of concurrent open connections per domain at a very low number (six under HTTP/1.1), and every SSE stream permanently occupies one. This became problematic when users opened multiple tabs.

Proxy and Firewall Issues: Some proxies and firewalls block SSE connections because they don't have a Content-Length header, creating deployment challenges in enterprise environments.

After extensive testing, we concluded that while SSE might work well for simpler use cases or controlled environments, it wasn't robust enough for our cloud-based AI agent platform.

WebSockets: a game-changer for MCP

Facing these challenges, we made the decision to switch from SSE to WebSockets for all our MCP server communications. Although the MCP documentation doesn't discuss WebSockets extensively, they are supported - and as we discovered, they work significantly better in cloud environments.

Why WebSockets outperform SSE for MCP servers

WebSockets establish a persistent, full-duplex TCP connection between client and server, allowing for bidirectional communication. This architecture offers several advantages over SSE for MCP servers:

Connection Stability: WebSockets maintain more stable connections, with built-in mechanisms for handling disconnections and reconnections.

Bidirectional Communication: While MCP often doesn't require extensive client-to-server communication, having the capability for bidirectional data flow eliminates the need for separate HTTP requests for client-initiated actions.

Binary Data Support: WebSockets can transmit both binary data and UTF-8 text, whereas SSE is limited to UTF-8. This provides more flexibility for different types of data exchange.

Better Performance: WebSockets typically offer lower latency and overhead compared to SSE, especially for frequent communications.

No Connection Limits: WebSockets don't suffer from the same browser connection limits as SSE, making them more suitable for applications where users might have multiple tabs open.

Forking Supergateway

To implement our WebSocket solution, we forked the Supergateway project and modified it to use WebSockets instead of SSE. The core changes involved:

Protocol Adaptation: Modifying the communication layer to use WebSocket protocol instead of HTTP streaming.

Connection Management: Implementing robust connection handling with automatic reconnection logic.

Error Handling: Enhancing error detection and recovery mechanisms to ensure reliable operation in cloud environments.

Scaling Optimizations: Adding features to better support horizontal scaling across multiple instances.

Our modified version of Supergateway is available on GitHub as Blaxel's Supergateway, and we welcome contributions and feedback from the community!

Technical implementation: WebSockets for MCP

For those interested in the technical details, here's how we implemented WebSockets for our MCP servers. The entire code is open source on GitHub, in Blaxel's Supergateway and Blaxel's SDK repositories.

Protocol bridging without code changes

The cornerstone of our solution is a protocol bridge that allows any stdio-based MCP server to communicate over WebSockets without modification:

typescript
import { spawn } from 'child_process';

const stdioToWebSocket = async (stdioCmd: string, port: number) => {
  // Spawn the stdio-based MCP server as a child process
  const child = spawn(stdioCmd, { shell: true });

  // Create WebSocket server transport (our custom transport class)
  const wsTransport = new WebSocketServerTransport(port);

  // WebSocket -> stdio: forward messages as newline-delimited JSON
  wsTransport.onmessage = (msg) => {
    child.stdin.write(JSON.stringify(msg) + '\n');
  };

  // stdio -> WebSocket: relay the child's output to connected clients
  child.stdout.on('data', (chunk) => {
    // Parse and forward JSON messages to WebSocket clients
  });
};

This architecture allows us to leverage the existing ecosystem of MCP servers while gaining all the benefits of WebSockets.

Solving the connection stability problem

One of the most frustrating issues with SSE was random connection drops in cloud environments. Our WebSocket implementation addresses this with several key mechanisms:

Persistent Connection Management: Unlike SSE connections that would frequently drop, our WebSocket implementation maintains long-lived connections:

typescript
// Server tracks all client connections
private clients: Map<string, WebSocket> = new Map();

// Client implements reconnection logic
async start(): Promise<void> {
  let attempts = 0;
  while (attempts < MAX_RETRIES) {
    try {
      await this._connect();
      return;
    } catch (error) {
      attempts++;
      await delay(RETRY_DELAY_MS);
    }
  }
  // Surface a hard failure once retries are exhausted
  throw new Error(`Failed to connect after ${MAX_RETRIES} attempts`);
}
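
The fixed RETRY_DELAY_MS above works, but exponential backoff with jitter is a common hardening step for reconnection loops so that many clients don't retry in lockstep after an outage. A minimal sketch (our illustration, not necessarily Blaxel's exact strategy):

```typescript
// Delay grows exponentially with the attempt number, capped at maxMs,
// with random jitter so simultaneous reconnects spread out over time.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 30_000): number {
  const exp = Math.min(maxMs, baseMs * 2 ** attempt);
  return exp / 2 + Math.random() * (exp / 2);
}
```

Swapping `RETRY_DELAY_MS` for `backoffDelayMs(attempts)` in the loop above is all it takes.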


Proactive Dead Connection Handling: Rather than waiting for timeouts, we actively detect and clean up dead connections:

typescript
// During message sending, we check and remove dead connections
for (const [id, client] of this.clients.entries()) {
  if (client.readyState !== WebSocket.OPEN) {
    deadClients.push(id);
  }
}
deadClients.forEach((id) => this.clients.delete(id));


This approach dramatically reduced the "connection dropped randomly" issues that plagued our SSE implementation, resulting in far fewer interruptions to AI agent operations.

Addressing cloud scaling challenges

The SSE implementation hit severe scaling limitations in cloud environments. Our WebSocket solution includes architectural decisions specifically designed for cloud scalability:

Allow for multi-tenancy: We handle a specific ID converted from the message ID to allow for multi-tenancy. Messages are routed back only to the specific client that requested them.

typescript
async send(msg: JSONRPCMessage, clientId?: string): Promise<void> {
  // clientId encodes "<connectionId>:<originalMessageId>"
  const [cId, msgId] = clientId?.split(":") ?? [];
  // Restore the original message id and serialize once
  const data = JSON.stringify(msgId ? { ...msg, id: msgId } : msg);
  // Route the message back only to the client that issued the request
  if (cId) {
    const client = this.clients.get(cId);
    if (client?.readyState === WebSocket.OPEN) {
      client.send(data);
    }
  }
}
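
For completeness, here's a hedged sketch of the inbound half of that scheme: before forwarding a client's request to the shared child process, the server namespaces the JSON-RPC message id with the connection id, so the response can be routed back using the split shown in `send`. The helper names below are ours for illustration, not Blaxel's actual identifiers:

```typescript
type JSONRPCMessage = { jsonrpc: '2.0'; id?: string | number; [key: string]: unknown };

// Tag the outgoing request id with the client's connection id
function namespaceId(msg: JSONRPCMessage, clientId: string): JSONRPCMessage {
  if (msg.id === undefined) return msg; // notifications carry no id to tag
  return { ...msg, id: `${clientId}:${msg.id}` };
}

// Recover both pieces on the way back (mirrors clientId?.split(":"))
function splitId(namespaced: string): [clientId: string, msgId: string] {
  const i = namespaced.indexOf(':');
  return [namespaced.slice(0, i), namespaced.slice(i + 1)];
}
```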


Health Monitoring for Cloud Orchestration: We implemented a dedicated health check endpoint that cloud orchestration systems can use to monitor service health:

typescript
app.get("/health", (req, res) => {
  if (child?.killed) {
    res.status(500).send("Child process has been killed");
    return; // don't fall through and send a second response
  }
  if (!isReady) {
    res.status(500).send("Server is not ready");
  } else {
    res.send("OK");
  }
});


Seamless integration with AI frameworks

To make our WebSocket-based MCP servers easily usable by AI agents, we created a clean integration layer that converts MCP tools into formats compatible with popular AI frameworks:

typescript
export function getMCPTool(
  client: MCPClient,
  name: string,
  description: string,
  schema: z.ZodType
) {
  // Wrap the MCP call in the agent framework's `tool` helper
  return tool(
    async (args: any) => {
      const result = await client.callTool(name, args);
      return JSON.stringify(result.content);
    },
    {
      name,
      description,
      schema,
    }
  );
}


This integration allows AI agents to use MCP tools as if they were native tools, regardless of whether they're communicating over WebSockets or another protocol. The transport layer becomes completely transparent to the agent implementation.
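
As a framework-agnostic illustration of the same idea: an MCP tool description (name, description, JSON-Schema input) maps almost directly onto the function-calling format most chat APIs accept. The names below (`MCPToolInfo`, `toFunctionSpec`) are ours for illustration, not part of the MCP SDK:

```typescript
// Shape of a tool description as an MCP server advertises it (simplified)
interface MCPToolInfo {
  name: string;
  description: string;
  inputSchema: Record<string, unknown>; // JSON Schema, as MCP returns it
}

// Convert it into the { type: "function", function: {...} } shape
// that OpenAI-style chat APIs expect for tool definitions
function toFunctionSpec(tool: MCPToolInfo) {
  return {
    type: 'function' as const,
    function: {
      name: tool.name,
      description: tool.description,
      parameters: tool.inputSchema,
    },
  };
}
```

Because both sides speak JSON Schema, the conversion is nearly a field-by-field copy — which is exactly why the transport layer can stay invisible to the agent.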

Lessons learned (and best practices)

Through our journey from SSE to WebSockets for MCP servers, we've gathered several key insights that might benefit others working with MCP:

Protocol selection matters: Choose your communication protocol based on your deployment environment. While SSE might work well for simple, controlled environments, WebSockets offer superior performance and reliability in cloud deployments.

Connection management is critical: Implement robust connection management with automatic reconnection logic. AI agents need stable, persistent connections to function properly.

Standardize your integrations: Use tools like Smithery (or check out Blaxel MCP Hub for a good source of inspiration!) to standardize how you register and manage MCP servers. This reduces integration time and maintenance overhead.

Monitor connection health: Implement comprehensive monitoring for your MCP server connections. Track metrics like connection stability, latency, and error rates to identify issues early.

Consider scale from the start: Design your MCP server architecture with scale in mind. WebSockets provide better scaling characteristics for cloud deployments, but they still require careful implementation to handle large numbers of concurrent connections.
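
To make the "monitor connection health" advice concrete, here's a minimal rolling-window tracker one could attach to a WebSocket transport — a sketch with invented names, not part of our shipped code:

```typescript
// Track the last `windowSize` operations (e.g. tool calls over a WebSocket)
// and expose error rate and average latency for dashboards or alerts.
function createHealthTracker(windowSize = 100) {
  const samples: { ok: boolean; latencyMs: number }[] = [];
  return {
    record(ok: boolean, latencyMs: number) {
      samples.push({ ok, latencyMs });
      if (samples.length > windowSize) samples.shift(); // evict oldest
    },
    errorRate(): number {
      if (samples.length === 0) return 0;
      return samples.filter((s) => !s.ok).length / samples.length;
    },
    avgLatencyMs(): number {
      if (samples.length === 0) return 0;
      return samples.reduce((sum, s) => sum + s.latencyMs, 0) / samples.length;
    },
  };
}
```

Calling `record()` from the transport's send/receive path is enough to spot degrading connections before users do.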

Future directions

While our switch to WebSockets has significantly improved our MCP server performance, we're continuing to explore ways to enhance our implementation:

Load balancing: We're developing more sophisticated load balancing strategies for distributing MCP server connections across multiple instances.

Protocol optimizations: We're exploring ways to optimize the MCP protocol over WebSockets, including message batching and compression techniques to further reduce latency and bandwidth usage.

Standardized authentication: We're working on standardized authentication mechanisms for MCP servers over WebSockets to ensure secure access to sensitive tools and data.
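
On the protocol-optimization point above: JSON-RPC already permits batch requests (an array of messages), so a first step is simply chunking the outbound queue before serialization. This is a purely illustrative sketch of the direction, not shipped code:

```typescript
// Group pending messages into batches of at most `maxBatch`, so a single
// WebSocket frame can carry several JSON-RPC messages at once.
function batchMessages<T>(msgs: T[], maxBatch = 10): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < msgs.length; i += maxBatch) {
    batches.push(msgs.slice(i, i + maxBatch));
  }
  return batches;
}
```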

We're excited to continue refining our WebSocket-based MCP server implementation and contributing to the broader MCP ecosystem. If you're interested in learning more or contributing to our open-source efforts, check out our MCP Hub and Supergateway fork.

This article is based on our real-world experience building Blaxel, a developer platform to create and deploy AI agents. The technical details and performance metrics reflect our actual implementation. We hope our experiences help others in the AI agent development community build more robust MCP server implementations.