Self-Hosted, Open Source MCP Infrastructure

Octelium provides a complete, secure, free and open source, self-hosted infrastructure for building your MCP-based architectures and gateways. Beyond secure access to your MCP servers wherever they run, it also provides deployment and scaling for your streamable HTTP MCP servers, unified and scalable authentication and identity management for all your MCP clients, L7-aware per-request authorization, and OpenTelemetry-ready visibility.

MCP Gateway

You can develop your MCP servers over the streamable HTTP/SSE transport and run them anywhere, even behind NAT. Here is a simple example of a my-mcp Service whose upstream runs at http://localhost:8787 and is served remotely by a connected octelium client used by the User mcp-01 (read more about serving Services via connected Users here):

kind: Service
metadata:
  name: my-mcp
spec:
  port: 8080
  mode: HTTP
  isPublic: true
  config:
    upstream:
      url: http://localhost:8787
      user: mcp-01

To serve the MCP server, the User mcp-01 needs to explicitly serve that Service via the --serve flag (read more here) as follows:

octelium connect --serve my-mcp

You can also deploy and scale your containerized streamable HTTP-based MCP server and serve it as a Service by reusing the underlying Kubernetes infrastructure that runs the Octelium Cluster (read more about managed containers here), including images pulled from private container registries.

This is a very simple example of an MCP server developed in Python with FastMCP that is serving over streamable HTTP transport:

import asyncio
import os

from fastmcp import FastMCP

mcp = FastMCP("Demo Octelium MCP Server")


@mcp.tool()
def add(a: int, b: int) -> int:
    """Return the sum of a and b."""
    return a + b


@mcp.tool()
def subtract(a: int, b: int) -> int:
    """Return a minus b."""
    return a - b


if __name__ == "__main__":
    asyncio.run(
        mcp.run_async(
            transport="streamable-http",
            host="0.0.0.0",
            # Environment variables are strings, so convert the port to an int.
            port=int(os.getenv("PORT", "8080")),
        )
    )

Such an MCP server can easily be Dockerized, pushed to a public or private container registry of your choice (e.g. Docker Registry, GitHub's ghcr, etc.) and automatically deployed and scaled via Octelium as follows:

kind: Service
metadata:
  name: my-mcp
spec:
  mode: HTTP
  isPublic: true
  config:
    upstream:
      container:
        port: 8080
        image: <MY_REGISTRY>/<MY_ORG>/<MY_MCP_IMAGE:version>
        credentials:
          usernamePassword:
            username: <USER>
            password:
              fromSecret: registry-token
        resourceLimit:
          cpu:
            millicores: 1000
          memory:
            megabytes: 2000

You can now create the Service by applying it as follows (read more here):

octeliumctl apply /PATH/TO/SERVICE.YAML
NOTE

You might also want to look at Namespaces (read more here), which let you organize your MCP server Services, affect their hostnames and control access to a whole set of Services that share a certain purpose or functionality.

Octelium also supports secret-less access for Users to public MCP servers that are protected by standard bearer access tokens, basic authentication, API keys set in custom headers as well as OAuth2 client credential flows. You can read more here.

Now we move on to the client side of MCP. The main value of using Octelium here is unified and scalable identity management for all your clients: a single standard OAuth2 client credential or bearer access token lets an MCP client access all the MCP servers it is authorized for, without any special SDKs or clients on the client's side and without the client even having to be aware of the Octelium Cluster's existence. You can develop your MCP clients in any language and connect to the MCP server Service privately over the octelium client (read more here). Alternatively, MCP clients can use the standard OAuth2 client credentials flow to obtain a bearer access token and access the MCP server Service publicly (read more about the public clientless access mode here). You can read more about OAuth2-based clientless access here and here. You can also read more about access token Credentials here.
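As a rough illustration of the clientless flow described above, here is a minimal sketch that uses only the Python standard library. The URLs, the `/oauth2/token` path, the `/mcp` path and the helper names are hypothetical placeholders and not taken from the Octelium documentation. Sending the client ID and secret in the form body is also an assumption; some OAuth2 servers expect HTTP Basic authentication instead.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical placeholders: substitute the values for your own Cluster.
TOKEN_URL = "https://example.com/oauth2/token"
MCP_URL = "https://my-mcp.example.com/mcp"


def build_token_request(token_url: str, client_id: str, client_secret: str) -> urllib.request.Request:
    """Build an OAuth2 client credentials token request (RFC 6749, section 4.4)."""
    form = urllib.parse.urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
        }
    ).encode()
    return urllib.request.Request(
        token_url,
        data=form,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def fetch_access_token(request: urllib.request.Request) -> str:
    """Send the token request and return the access_token field of the JSON response."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)["access_token"]


def build_mcp_request(mcp_url: str, access_token: str, payload: dict) -> urllib.request.Request:
    """Build a JSON-RPC POST to the MCP server, authenticated with a bearer token."""
    return urllib.request.Request(
        mcp_url,
        data=json.dumps(payload).encode(),
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
            # Streamable HTTP servers may answer with JSON or an SSE stream.
            "Accept": "application/json, text/event-stream",
        },
    )
```

A client would call `fetch_access_token(build_token_request(TOKEN_URL, client_id, client_secret))` once and then reuse the returned token in `build_mcp_request` until it expires. In practice an MCP SDK client configured with the same `Authorization` header would handle session initialization and streaming for you; the sketch only shows that nothing Octelium-specific is needed on the client.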

Now we move on to access control. Octelium's application-layer (L7) awareness enables you to control access at the HTTP layer based on request paths, methods and, more importantly for MCP, the JSON body of the requests. Here is an example:

kind: Service
metadata:
  name: my-mcp
spec:
  port: 8080
  mode: HTTP
  isPublic: true
  config:
    upstream:
      url: https://mcp.example.com
    http:
      enableRequestBuffering: true
      body:
        mode: JSON
        maxRequestSize: 100000
  authorization:
    inlinePolicies:
      - spec:
          rules:
            - effect: ALLOW
              condition:
                all:
                  of:
                    - match: ctx.request.http.bodyMap.jsonrpc == "2.0"
                    - match: ctx.request.http.bodyMap.method == "tools/call"
                    - match: ctx.request.http.bodyMap.params.name in ["add", "subtract"]
                    - match: ctx.request.http.bodyMap.params.arguments.a < 1000
                    - match: ctx.request.http.bodyMap.params.arguments.b > 1000
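To see which request bodies this rule would match, the five conditions can be mirrored locally in plain Python. This is only an illustrative sketch for reasoning about the rule, not how Octelium evaluates policies; `make_tool_call` and `is_allowed` are hypothetical helper names. The request shape follows the JSON-RPC 2.0 `tools/call` format used by MCP.

```python
from typing import Any

ALLOWED_TOOLS = {"add", "subtract"}


def make_tool_call(name: str, arguments: dict[str, Any], request_id: int = 1) -> dict[str, Any]:
    """Build the JSON-RPC 2.0 body an MCP client sends to invoke a tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def is_allowed(body: dict[str, Any]) -> bool:
    """Local mirror of the rule's five match conditions; all of them must hold."""
    params = body.get("params") or {}
    arguments = params.get("arguments") or {}
    a, b = arguments.get("a"), arguments.get("b")
    return (
        body.get("jsonrpc") == "2.0"
        and body.get("method") == "tools/call"
        and params.get("name") in ALLOWED_TOOLS
        and isinstance(a, (int, float)) and a < 1000
        and isinstance(b, (int, float)) and b > 1000
    )
```

For instance, `is_allowed(make_tool_call("add", {"a": 5, "b": 2000}))` evaluates to `True`, while the same call with `b` set to `7`, or a call to a tool named `multiply`, evaluates to `False`. A body with a different method such as `tools/list` does not match this rule either, so other rules would be needed for such requests to be allowed.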

Octelium also provides OpenTelemetry-ready, application-layer L7 aware visibility and access logging in real time (see an example for HTTP here). You can read more about visibility here.

Here are a few more features that you might be interested in:

  • Request/response header manipulation (read more here).
  • Application layer-aware ABAC access control via policy-as-code using CEL and Open Policy Agent (read more here).
  • Exposing the API publicly for anonymous access (read more here).
  • OpenTelemetry-ready, application-layer L7 aware auditing and visibility (read more here).
© 2025 octelium.com. Octelium Labs, LLC. All rights reserved.
Octelium and Octelium logo are trademarks of Octelium Labs, LLC.
WireGuard is a registered trademark of Jason A. Donenfeld