Serverless MCP framework

May 4, 2025

A specific protocol for standardizing how context and functionality are served to LLMs has recently been published: the Model Context Protocol (MCP). The protocol describes how to serve and implement resources, pre-configured prompts, and tools.

Convenient SDKs have since been published for implementing servers and clients in many programming languages: Python, Java, C#, Kotlin and finally also TypeScript (usable as pure JavaScript in Node.js). This piqued my curiosity and I started trying to implement it with AWS services, serverless of course!

SSE and the stream pain

The MCP protocol is based on receiving and sending messages in JSON-RPC format over a stream: this allows responses to be transmitted in real time and a connection to be kept open for new messages.
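For reference, MCP messages use the JSON-RPC 2.0 envelope; a tools/call request, for example, looks like this (the tool name and arguments here are illustrative):

```javascript
// A JSON-RPC 2.0 request as used by MCP (illustrative payload).
const request = {
  jsonrpc: '2.0',
  id: 1,
  method: 'tools/call',
  params: {
    name: 'echo',
    arguments: { message: 'hello' }
  }
}

// Messages are serialized as JSON on the wire.
const wire = JSON.stringify(request)
console.log(wire)
```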

Unfortunately this is very complicated to implement with serverless services: API Gateway does not support streamed responses, and Lambda buffers the response before delivering it to the client. You can instead implement a streamed response using a Function URL, but you need wrapper libraries for that, such as aws-lambda-stream in combination with another wrapper for Express.js, which is what the SDK is based on.

Resources:
  McpFunction:
    Type: AWS::Serverless::Function
    Properties:
      FunctionName: !Sub "${Environment}-${AppName}-mcp"
      MemorySize: 512
      Timeout: 900
      FunctionUrlConfig:
        InvokeMode: RESPONSE_STREAM
        AuthType: NONE
        Cors:
          AllowOrigins:
            - "*"
          AllowMethods:
            - GET
            - POST
          AllowHeaders:
            - Content-Type

As if this were not enough, when a client opens such a connection it occupies the Lambda execution for the whole duration of the connection, creating a disproportionate number of concurrent executions.

Another peculiarity of this transport is that it is initialized with a request on the /sse path, which then waits for server-to-client messages, while a second request on /messages carries client-to-server messages. This becomes even more complicated in a serverless setup: since different function executions cannot share memory, an external system is needed to store the sessions (a DynamoDB table, for example).
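A sketch of such a session store, using an in-memory Map here purely for illustration (in a real deployment each get/put would be a DynamoDB call, precisely because separate Lambda executions share no memory):

```javascript
// Minimal session store interface; swap the Map for a DynamoDB
// table in production, keyed by the MCP session id.
const sessions = new Map()

function putSession (sessionId, data) {
  sessions.set(sessionId, { ...data, updatedAt: Date.now() })
}

function getSession (sessionId) {
  return sessions.get(sessionId) ?? null
}

putSession('abc-123', { initialized: true })
console.log(getSession('abc-123').initialized) // true
```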

Streamable HTTP

The MCP protocol has recently (2025-03-26) been updated with Streamable HTTP support, which works on a single /mcp path. This certainly simplifies the previous implementation, even if the stream is still complicated to implement.

However, there is a very nice setting in the transport initialization: enableJsonResponse.

const transport = new StreamableHTTPServerTransport({
  sessionIdGenerator: undefined,
  enableJsonResponse: true
})

This setting makes the transport return an immediate JSON response to the client, without using a response stream!
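Conceptually, with this flag the reply to a POST on /mcp becomes a single JSON-RPC response body (Content-Type: application/json) instead of a series of SSE events; an illustrative payload:

```javascript
// With enableJsonResponse the whole result is returned as one
// JSON body rather than as "data: ..." events on text/event-stream.
const response = {
  jsonrpc: '2.0',
  id: 1,
  result: {
    content: [{ type: 'text', text: 'hello' }]
  }
}

const body = JSON.stringify(response)
console.log(body)
```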

Serverless implementations

We can now simply use API Gateway to serve the MCP server:

Resources:
  RestApi:
    Type: AWS::Serverless::Api
    Properties:
      Name: !Sub "${Project}-${Environment}-mcp"
      StageName: !Ref Environment

  McpFunction:
    Type: AWS::Serverless::Function
    Properties:
      FunctionName: !Sub "${Project}-${Environment}-mcp"
      Handler: index.handler
      MemorySize: 512
      Timeout: 90
      Events:
        ApiEvent:
          Type: Api
          Properties:
            Path: /mcp
            Method: ANY
            RestApiId: !Ref RestApi

Just a small wrapper is needed to run Express.js inside a Lambda function; this can be done with the @codegenie/serverless-express module:

import app from './app.mjs'
import serverlessExpress from '@codegenie/serverless-express'
export const handler = serverlessExpress({ app })

You can now complete the configuration and initialization of the MCP server as per the SDK documentation.
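Conceptually, the wrapper translates the API Gateway proxy event into the HTTP request shape that Express expects. A much-simplified sketch of that mapping (the real @codegenie/serverless-express handles far more; this is illustration only):

```javascript
// Simplified sketch: map an API Gateway proxy event onto the
// fields an Express-style handler reads. Illustrative only.
function eventToRequest (event) {
  return {
    method: event.httpMethod,
    path: event.path,
    headers: event.headers ?? {},
    body: event.body ? JSON.parse(event.body) : null
  }
}

const req = eventToRequest({
  httpMethod: 'POST',
  path: '/mcp',
  headers: { 'content-type': 'application/json' },
  body: '{"jsonrpc":"2.0","id":1,"method":"tools/list"}'
})
console.log(req.method, req.path)
```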

Deploy and test

As with other solutions, once the code has been put together and the two resources (API Gateway and Lambda) have been described, everything is ready to deploy.

To test the MCP server we need some tools. Both of the clients mentioned below have recently been updated to support the Streamable HTTP transport; it is worth verifying this support when testing, otherwise most clients will fall back to a GET request, trying to use the SSE transport.

MCP Inspector

The MCP Inspector is a graphical tool that lets you test the MCP server's responses and trigger tool executions:

npx @modelcontextprotocol/inspector

Configure the inspector with:

  • Transport Type: Streamable HTTP
  • URL: http://localhost:3000/mcp

Connect to the MCP server with the "Connect" button.

MCP Inspector

Claude Desktop

Claude Desktop can be configured to use the MCP server through mcp-remote, which proxies HTTP to STDIO:

{
  "mcpServers": {
    "example": {
      "command": "npx",
      "args": [
        "-y",
        "mcp-remote",
        "https://xxxxxxxxxx.execute-api.eu-west-1.amazonaws.com/dev/mcp",
        "--transport",
        "http-only"
      ]
    }
  }
}

Once Claude Desktop is restarted, you will see the tools appear in its configuration:

Claude Desktop

Making the solution reusable

In another solution I have already experimented with connecting Lambda functions via SSM parameters to implement the tools of an LLM-based solution; I'd like to reuse that approach here to decouple the server instance from the tool implementations.

Infrastructure schema

To declare resources, prompts and tools, I create SSM parameters and Lambda functions in a separate template. Every SSM parameter begins with a /<project name>/<environment name>/ prefix, for example /mcp/dev/tools/echo. The MCP server loads all the parameters when a request is made and declares them using the MCP SDK:

const parameters = await listParameters()

const resources = await listResources(parameters)
for (const resource of resources) {
  server.resource(
    resource.name,
    resource.uri,
    async (uri) => ({
      contents: [{
        uri: uri.href,
        text: resource.content
      }]
    })
  )
}

const prompts = await listPrompts(parameters)
for (const prompt of prompts) {
  server.prompt(
    prompt.name,
    prompt.description,
    prompt.inputSchema ? jsonSchemaObjectToZodRawShape(prompt.inputSchema) : {},
    async (input) => ({
      messages: [{
        role: 'user',
        content: {
          type: 'text',
          text: buildPrompt(prompt.content, input)
        }
      }]
    })
  )
}

const tools = await listTools(parameters)
for (const tool of tools) {
  server.tool(
    tool.name,
    tool.description,
    tool.inputSchema ? jsonSchemaObjectToZodRawShape(tool.inputSchema) : {},
    async (input) => await invokeTool(tool.name, input, context)
  )
}
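The prefix convention can be sketched as a small grouping step; the parameter names and the groupParameters helper below are assumptions following the /<project>/<environment>/<kind>/<name> convention described above:

```javascript
// Group SSM parameter names by kind, given the
// /<project>/<environment>/<kind>/<name> naming convention.
function groupParameters (names, project, environment) {
  const prefix = `/${project}/${environment}/`
  const groups = { resources: [], prompts: [], tools: [] }
  for (const name of names) {
    if (!name.startsWith(prefix)) continue
    const [kind, item] = name.slice(prefix.length).split('/')
    if (groups[kind]) groups[kind].push(item)
  }
  return groups
}

const groups = groupParameters(
  ['/mcp/dev/tools/echo', '/mcp/dev/prompts/echo', '/mcp/dev/resources/example'],
  'mcp', 'dev'
)
console.log(groups.tools) // [ 'echo' ]
```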

The MCP SDK uses zod for input schema declarations; since I wanted to declare schemas in JSON Schema instead, I used the zod-from-json-schema module:

import { jsonSchemaObjectToZodRawShape } from 'zod-from-json-schema'

server.tool(
  tool.name,
  tool.description,
  tool.inputSchema ? jsonSchemaObjectToZodRawShape(tool.inputSchema) : {},
  async (input) => await invokeTool(tool.name, input, context)
)
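What the conversion does, conceptually: each JSON Schema property becomes a validator in a raw shape keyed by field name. A dependency-free sketch of that idea, which is not the actual zod-from-json-schema implementation (zod produces real schema objects, not plain functions):

```javascript
// Conceptual sketch only: turn a JSON Schema object's properties
// into a map of per-field validator functions.
function schemaToShape (schema) {
  const validators = {
    string: (v) => typeof v === 'string',
    number: (v) => typeof v === 'number',
    boolean: (v) => typeof v === 'boolean'
  }
  const shape = {}
  for (const [key, prop] of Object.entries(schema.properties ?? {})) {
    shape[key] = validators[prop.type] ?? (() => true)
  }
  return shape
}

const shape = schemaToShape({
  type: 'object',
  properties: { message: { type: 'string' } },
  required: ['message']
})
console.log(shape.message('hello')) // true
```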

Resources

Since resources are a simple GET of information, I've just declared them as follows:

ExampleResourceParameter:
  Type: AWS::SSM::Parameter
  Properties:
    Name: !Sub "/${Project}/${Environment}/resources/example"
    Type: String
    Value: |
      {
        "name": "example",
        "uri": "config://example",
        "content": "Just an example resource"
      }

Simple as that: the content is returned as the resource's text.

server.resource(
  resource.name,
  resource.uri,
  async (uri) => ({
    contents: [{
      uri: uri.href,
      text: resource.content
    }]
  })
)

Testing with MCP Inspector:

MCP Inspector test resource

Prompts

The prompts take a slightly more dynamic approach: since I wanted to interpolate the inputs into the response to the client, I first tried handlebars, but I ran into a limitation of SSM on the parameter value: it cannot contain double curly brackets.

I then used ejs since it has a different (and configurable) syntax:

EchoPromptParameter:
  Type: AWS::SSM::Parameter
  Properties:
    Name: !Sub "/${Project}/${Environment}/prompts/echo"
    Type: String
    Value: |
      {
        "name": "echo",
        "description": "Execute the example tool",
        "inputSchema": {
          "json": {
            "type": "object",
            "properties": {
              "message": {
                "type": "string"
              }
            },
            "required": ["message"]
          }
        },
        "content": "Execute the tool 'echo' with the message '<%= message %>'"
      }

This way the input parameters are interpolated, creating a dynamic prompt:

import ejs from 'ejs'

server.prompt(
  prompt.name,
  prompt.description,
  prompt.inputSchema ? jsonSchemaObjectToZodRawShape(prompt.inputSchema) : {},
  async (input) => ({
    messages: [{
      role: 'user',
      content: {
        type: 'text',
        text: ejs.render(prompt.content, input)
      }
    }]
  })
)
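To show why the `<%= %>` syntax sidesteps the SSM double-curly-brace limitation, here is a tiny stand-in for ejs.render covering only this simple interpolation case (ejs itself does far more; this is illustration only):

```javascript
// Tiny stand-in for ejs.render: replaces <%= name %> markers with
// values from the input object. Illustration only; use ejs itself.
function render (template, input) {
  return template.replace(/<%=\s*(\w+)\s*%>/g, (_, name) => input[name] ?? '')
}

const text = render(
  "Execute the tool 'echo' with the message '<%= message %>'",
  { message: 'hello' }
)
console.log(text)
```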

Testing with MCP Inspector:

MCP Inspector test prompt

Tools

The tools are more complex, since they require greater flexibility in processing the response, perhaps contacting external APIs, databases, files and other resources.

EchoToolFunction:
  Type: AWS::Serverless::Function
  Properties:
    FunctionName: !Sub "${Project}-${Environment}-tools-echo"
    Handler: index.handler
    InlineCode: |
      exports.handler = async ({ toolUseId, name, input }, context) => {
        console.log(JSON.stringify({ input, context }, null, 2))
        return {
          name,
          toolUseId,
          status: 'success',
          content: [{
            text: input.message
          }]
        }
      }

EchoToolParameter:
  Type: AWS::SSM::Parameter
  Properties:
    Name: !Sub "/${Project}/${Environment}/tools/echo"
    Type: String
    Value: |
      {
        "name": "echo",
        "description": "Print the message provided in input",
        "inputSchema": {
          "json": {
            "type": "object",
            "properties": {
              "message": {
                "type": "string"
              }
            },
            "required": ["message"]
          }
        }
      }

The function names must follow a certain naming convention so that they can be connected to the tools; in my case I use the pattern <project name>-<environment name>-tools-<tool name>.
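That mapping can be made explicit with a small helper (toolFunctionName is a hypothetical name, following the convention above):

```javascript
// Derive the Lambda function name for a tool, following the
// <project>-<environment>-tools-<tool name> convention.
function toolFunctionName (project, environment, toolName) {
  return `${project}-${environment}-tools-${toolName}`
}

console.log(toolFunctionName('mcp', 'dev', 'echo')) // mcp-dev-tools-echo
```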

In addition to the input, the Lambda function also receives a ClientContext: the endpoint query parameters (for example with http://localhost:3000/mcp?param=value), the client headers, and, if authentication is enabled, the token are passed along to the function execution:

{
  "input": {
    "message": "test"
  },
  "context": {
    "clientContext": {
      "query": {
        "param": "value"
      },
      "userAgent": "node",
      "token": "xxxxxxxxxxxxxxxx"
    }
  }
}
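The echo handler from the template can be exercised locally with a payload of this shape (the handler body is copied from the InlineCode above; the context value mimics what the server passes along):

```javascript
// Same handler as in the template's InlineCode, invoked locally
// with a payload shaped like the example above.
const handler = async ({ toolUseId, name, input }, context) => ({
  name,
  toolUseId,
  status: 'success',
  content: [{ text: input.message }]
})

handler(
  { toolUseId: 't-1', name: 'echo', input: { message: 'test' } },
  { clientContext: { query: { param: 'value' } } }
).then((result) => console.log(result.content[0].text)) // prints "test"
```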

Testing with MCP Inspector:

MCP Inspector test tool


Fabio Gollinucci