Wrapping HOOPS AI as a REST API with FastAPI
Most manufacturing software I’ve worked on — apps, services, plugins — was C/C++, C#, or JavaScript/TypeScript, with npm handling dependencies, built in Visual Studio or VS Code. HOOPS AI (launched November 2025) flips that: it’s Python, dependencies come from pip (or uv) into a venv instead of npm into node_modules, and the tutorials live in Jupyter notebooks — the world AI engineers and data scientists work in, not the one most of us came from.
I installed it, ran through the tutorials, swapped in my own CAD data, got inference working. Then came the question I genuinely didn’t have an answer for: not “how do I improve the model,” but how do you take code that only runs when you personally execute notebook cells in your own SDK install, and put it where an end user, a domain application, or another service can just call it?
Short answer: FastAPI made this far less painful than I expected. Here’s a minimal sandbox that wraps two HOOPS AI tutorial workflows as REST endpoints, and what I learned building it.
Repo: hoops_ai_webapi_sandbox. It’s small enough to keep open in another tab while reading — below, each point links back to the specific file it comes from:
- main.py: app setup and the lifespan startup hook
- core.py: license init, MFR model loading/caching, the json_safe() helper
- routers/cad.py, routers/mfr.py: the two endpoint definitions
Why a WebAPI in the first place
Running the HOOPS AI tutorials proved the ML side works — manufacturing feature recognition, similar-shape search, both came back with real, usable predictions. The piece that clarified the “how do you expose this” question for me was realizing HOOPS AI doesn’t stand on its own as a product — it’s a capability that domain applications (cost estimation, CAD, CAM, CAE, …) need to call into when they need it, and increasingly, that a general-purpose AI agent should be able to call into via MCP as well.
Once I had that picture — multiple domain consumers, plus an MCP-fronted AI agent, all needing to call the same capability — the requirement became “expose this as a REST API,” not “build one more notebook.” FastAPI, which lets you stand up a REST API directly in Python, was the natural thing to try.
What I built
Two endpoints, each mirroring a HOOPS AI tutorial notebook 1:1:
| Endpoint | What it does |
|---|---|
| POST /cad/load | Upload a CAD file (STEP, IGES, …) → loads it via HOOPSLoader + BrepEncoder, returns face/edge B-Rep attributes as JSON |
| POST /mfr/inference | Upload a CAD file → runs a pre-trained GraphNodeClassification model, returns per-face manufacturing feature predictions (holes, slots, pockets, fillets, …) |
Here’s /mfr/inference actually called against a test part, with the response trimmed to one face:
I picked FastAPI specifically because it turns a plain Python function into a documented REST endpoint with nothing more than a decorator and type hints — no separate schema or routing config to maintain. Another real advantage: FastAPI auto-generates an interactive Swagger UI (/docs), so you can test both endpoints either from there or from the console — details in the repo’s README.md.
What actually mattered (the non-obvious parts)
Initialize the license once at startup, not per request
See the lifespan function in main.py and init_hoops_license() in core.py. HOOPS AI license validation happens in FastAPI’s lifespan hook, which runs exactly once when the server boots — not inside the request handler. License validation sets up process-wide state, not per-request or per-client state, so this holds regardless of how many clients end up calling the API: one process, one validation, shared by everyone who connects to it. Notebooks don’t have a concept of “per request” — you just run the init cell once per session. A service does, so “run once at startup” vs. “run per call” becomes a real decision you have to make.
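As a rough sketch of the run-once shape (not the repo's actual main.py): a lifespan hook is just an async context manager, so the pattern can be exercised with only the standard library. Here init_license() is a hypothetical stand-in for init_hoops_license(), and the state dict and simulate_server() are made up for illustration. With FastAPI installed, the same function would be passed as FastAPI(lifespan=lifespan).

```python
import asyncio
from contextlib import asynccontextmanager

# Process-wide state, filled in once at startup.
state = {"license_ok": False, "init_calls": 0}


def init_license():
    """Hypothetical stand-in for init_hoops_license() in core.py."""
    state["init_calls"] += 1
    state["license_ok"] = True


@asynccontextmanager
async def lifespan(app):
    # Code before `yield` runs once, before any request is served.
    init_license()
    yield
    # Code after `yield` runs once, at shutdown.
    state["license_ok"] = False


def handle_request():
    # A handler only reads the shared state; it never re-validates.
    return {"license_ok": state["license_ok"]}


async def simulate_server():
    # Stand-in for the server's lifetime. With FastAPI you would write
    # `app = FastAPI(lifespan=lifespan)` and let the server drive this.
    async with lifespan(app=None):
        return [handle_request() for _ in range(3)]


responses = asyncio.run(simulate_server())
```

Three simulated requests share one initialization, which is the whole point: the cost is paid per process, not per call.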
Cache the trained model in memory — don’t reload the checkpoint per call
See _get_mfr_inference_model() and _create_mfr_inference_model() in core.py. The MFR inference model is loaded lazily on first use and then kept in a module-level variable for the life of the process:
```python
_mfr_inference_model = None


def _get_mfr_inference_model():
    global _mfr_inference_model
    if _mfr_inference_model is None:
        _mfr_inference_model = _create_mfr_inference_model()
    return _mfr_inference_model
```
In a notebook you load the checkpoint once in a cell and reuse the variable for the rest of the session without thinking about it. In a service, “reuse the variable” isn’t automatic — each request is its own call, so without this caching the checkpoint would reload from disk on every single inference request.
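One thing lazy loading leaves open: if two requests arrive before the first load finishes, both could see None and load the checkpoint twice. Whether that matters for this sandbox is untested. A lock-guarded variant might look like the sketch below, where _create_model() is a hypothetical stand-in for _create_mfr_inference_model() and load_count exists only to make the behavior visible.

```python
import threading

_model = None
_model_lock = threading.Lock()
load_count = 0  # only here to show how often the loader runs


def _create_model():
    """Hypothetical stand-in for _create_mfr_inference_model()."""
    global load_count
    load_count += 1
    return object()


def get_model():
    global _model
    if _model is None:          # fast path once the model is cached
        with _model_lock:
            if _model is None:  # re-check: another thread may have loaded it
                _model = _create_model()
    return _model


# Eight threads ask for the model at roughly the same time.
results = []
threads = [
    threading.Thread(target=lambda: results.append(get_model()))
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

All eight callers end up with the same object, and the loader runs once. The lock is only contended during the first load, so the steady-state cost is a single None check.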
numpy → JSON is one small recursive helper, not a serialization project
See json_safe() in core.py. BRep face/edge attributes and inference probabilities come back as numpy arrays, which aren’t directly JSON-serializable. The fix is one small recursive function:
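```python
def json_safe(value):
    # Containers: recurse, and force dict keys to strings (JSON requires it).
    if isinstance(value, dict):
        return {str(json_safe(k)): json_safe(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [json_safe(item) for item in value]
    # numpy arrays (and numpy scalars) expose tolist().
    if hasattr(value, "tolist"):
        return value.tolist()
    # Fallback for scalar-like objects that only expose item().
    if hasattr(value, "item"):
        return value.item()
    return value
```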
This is the kind of conversion that would have meant real plumbing work in C++. Here it’s ~12 lines, and combined with FastAPI’s automatic request parsing for multipart file uploads (UploadFile = File(...)), the amount of boilerplate that disappears is the biggest reason “I don’t know how to service-ify this” turned out to be a non-problem.
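To see the helper in action without installing anything: json_safe() is duck-typed, so any object with a tolist() method takes the same path a numpy array would. In this sketch the standard library's array.array stands in for numpy arrays, and the keys and values in raw are invented for illustration, not real HOOPS AI output.

```python
import json
from array import array


def json_safe(value):
    if isinstance(value, dict):
        return {str(json_safe(k)): json_safe(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [json_safe(item) for item in value]
    if hasattr(value, "tolist"):
        return value.tolist()
    if hasattr(value, "item"):
        return value.item()
    return value


# Invented sample data; array.array stands in for numpy arrays here.
raw = {
    "face_areas": array("d", [1.5, 2.0]),
    "face_types": ("plane", "cylinder"),
    7: array("i", [1, 0]),  # non-string key, as can happen with face indices
}

# json.dumps(raw) would raise TypeError; the converted copy serializes fine.
payload = json.dumps(json_safe(raw))
print(payload)
# {"face_areas": [1.5, 2.0], "face_types": ["plane", "cylinder"], "7": [1, 0]}
```

Note that the integer key 7 comes out as the string "7", so a client reading the response has to treat keys as strings.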
You have to run it with the HOOPS AI venv’s Python — not your system Python
See the “Running the server” section of the repo’s README.md. hoops_ai only resolves inside the virtual environment it was installed into. Forgetting this gets you import errors that look unrelated to the actual cause. In VS Code, selecting that venv as the interpreter (Ctrl+Shift+P → Python: Select Interpreter) makes the integrated terminal pick it up automatically; for a systemd service or any non-interactive launch, you need the full path to that venv’s python.exe/python binary in the start command, not a bare python main.py.
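A small startup check can make the wrong-interpreter case obvious instead of leaving you with a confusing import error. The sketch below is not part of the repo; check_environment() is a hypothetical helper, and it only uses standard-library calls (sys.prefix, sys.base_prefix, importlib.util.find_spec).

```python
import importlib.util
import sys


def in_virtualenv():
    # Inside a venv, sys.prefix points at the venv, while sys.base_prefix
    # still points at the interpreter the venv was created from.
    return sys.prefix != sys.base_prefix


def check_environment(package="hoops_ai"):
    """Report which Python is running and whether the package resolves."""
    return {
        "python": sys.executable,
        "in_virtualenv": in_virtualenv(),
        "package_found": importlib.util.find_spec(package) is not None,
    }


report = check_environment()
print(report)
```

If package_found is False, the python path in the report shows which interpreter actually launched the server, which is usually enough to spot a bare python main.py picking up the system install.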
What I haven’t verified yet
Tested so far: a single client, one endpoint at a time, same machine as the server. This is a proof of concept, not production code — everything below is still open:
- Load-test concurrent requests against both endpoints (multiple clients hitting the API at once, not yet tested)
- Confirm plain def (not async def) in routers/cad.py / routers/mfr.py actually holds up under load: FastAPI offloads def handlers to a worker thread pool, which should keep one slow CAD load/inference call from stalling the whole server, but this is untested in practice
- The license key is currently just a plain string in an env var, with no secrets management
- The server still runs directly inside the installer’s venv rather than a pinned, reproducible one
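On the def vs. async def item, the mechanism itself is easy to see in isolation. This sketch uses asyncio.to_thread (Python 3.9+) as a stand-in for the thread pool FastAPI uses for def handlers; slow_cad_load() and heartbeat() are hypothetical, and none of this says anything about how the real endpoints behave under load.

```python
import asyncio
import time


def slow_cad_load():
    """Hypothetical blocking call, standing in for a CAD load or inference."""
    time.sleep(0.3)
    return "done"


async def heartbeat(stop):
    # Counts how often the event loop gets a turn while the slow call runs.
    ticks = 0
    while not stop.is_set():
        ticks += 1
        await asyncio.sleep(0.01)
    return ticks


async def main():
    stop = asyncio.Event()
    beat = asyncio.create_task(heartbeat(stop))
    # Offload the blocking call to a worker thread, roughly what happens
    # to a plain `def` handler; the event loop stays free in the meantime.
    result = await asyncio.to_thread(slow_cad_load)
    stop.set()
    return result, await beat


result, ticks = asyncio.run(main())
print(result, ticks)
```

The heartbeat keeps ticking while the blocking call sleeps. Calling slow_cad_load() directly inside the coroutine would freeze the loop for the full 0.3 seconds, which is what a blocking call inside an async def handler would do.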
Decisions at a glance
| Decision | Why |
|---|---|
| FastAPI over writing routing/serialization by hand | Decorator + type hints → working REST endpoint + auto-generated docs, with almost no extra code |
| License validation in lifespan, once at startup | Avoids re-validating on every request |
| MFR model loaded lazily, cached in a global | Avoids reloading the checkpoint from disk on every inference call |
| Only 2 endpoints, no auth/queueing/scaling | This is a sandbox to prove the notebook→service path is open, not a deployment template |
Verdict
It works for what I actually tested: a single client, calling one endpoint at a time, on the same machine as the server. The mental hurdle was bigger than the technical one — once I accepted that FastAPI just wraps existing Python functions, most of the “how do I even start” anxiety went away. See “What I haven’t verified yet” above for the open items — treat this as a starting point, not something to deploy as-is.
The real payoff of WebAPI-izing HOOPS AI is what it opens up: once it’s a REST API, anything that speaks HTTP can use it. I’ve since taken this further and wrapped a more complete version (many more endpoints, a 3D viewer, shape similarity search) as an MCP server so it can be called from Claude Desktop in natural language — that’s a separate project I’ll cover in its own article.
Source code
Same repo as linked at the top: hoops_ai_webapi_sandbox — README has setup/run instructions. As noted above, this is a starting point, not production code.
If you’re planning to deploy HOOPS AI on a headless Linux box, see “I tried running HOOPS AI V1.1 headless on Ubuntu 24.04 (EC2)”, where I wrote up that setup separately (using this same sandbox repo to verify it).
Questions, comments, or “here’s what I’d have done differently” — drop them in this thread.


